Altera_Forum
Honored Contributor
20 years ago
clock cycles on + - * /
Hi, I am trying to figure out how many clock cycles are needed to do + - * / on both int and float.
I have used these two example designs:
nios2_51\examples\vhdl\niosII_cycloneII_2c35\full_featured
nios2_51\examples\vhdl\niosII_cycloneII_2c35\standard
Both show the same result. I made a counter in hardware:
library ieee, std;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use std.textio.all;
entity counter is
  port
  (
    clk        : in  std_logic;
    reset_n    : in  std_logic;
    chipselect : in  std_logic;
    address    : in  std_logic_vector(0 downto 0);
    write      : in  std_logic;
    writedata  : in  std_logic_vector(31 downto 0);
    read       : in  std_logic;
    readdata   : out std_logic_vector(31 downto 0);
    irq        : out std_logic
  );
end counter;
-------------------------------------------------------------------------------
-- Architecture: behavior
-------------------------------------------------------------------------------
architecture behavior of counter is
  signal counter : unsigned(31 downto 0);
begin
  readdata <= std_logic_vector(counter);
  pro_clock : process (clk, reset_n)
  begin
    if reset_n = '0' then
      counter <= (others => '0');
      irq <= '0';
    elsif rising_edge(clk) then
      if chipselect = '1' and write = '1' and address = "0" then
        counter <= unsigned(writedata);
      else
        counter <= counter + 1;
      end if;
      if chipselect = '1' and write = '1' and address = "1" then
        irq <= writedata(0);
      end if;
    end if;
  end process pro_clock;
end architecture behavior;
And my code in C:
#include "sys/alt_irq.h"
#include "system.h"
#include "stdio.h"
#include <io.h>
/* Counter register: word offset 0 (address = "0" in the VHDL above) */
#define IOADDR_PRE_COUNTER(base) __IO_CALC_ADDRESS_NATIVE(base, 0)
#define IORD_PRE_AVALON_COUNTER(base) IORD(base, 0)
#define IOWR_PRE_AVALON_COUNTER(base, data) IOWR(base, 0, data)
/* IRQ register: word offset 1 (address = "1" in the VHDL above) */
#define IOADDR_PRE_IRQ(base) __IO_CALC_ADDRESS_NATIVE(base, 1)
#define IORD_PRE_AVALON_IRQ(base) IORD(base, 1)
#define IOWR_PRE_AVALON_IRQ(base, data) IOWR(base, 1, data)
int main(void){
    register volatile float a, b = 1, c = 2;
    volatile int data0, data1, data2, i;
    alt_irq_context context;
    for(i = 0, a = 0, b = 0; i < 100; i++, b = b + a, a = a + 1){
        /* Keep interrupts out of the measured region */
        context = alt_irq_disable_all();
        /* Reset the hardware counter, then read it before and after the operation */
        IOWR_PRE_AVALON_COUNTER(0x01211160, 0);
        data0 = IORD_PRE_AVALON_COUNTER(0x01211160);
        c = b * a;   /* operation under test */
        data1 = IORD_PRE_AVALON_COUNTER(0x01211160);
        alt_irq_enable_all(context);
        data2 = data1 - data0;   /* elapsed clock cycles, including one counter read */
        printf("%d \n", data2);
    }
    return 0;
}
I have tested it by running on hardware. I get:
30 clock cycles on * int
30 clock cycles on + int
200 clock cycles on / int
600 clock cycles on * float
300 clock cycles on + float
600 clock cycles on / float
Can this be right? In http://www.altera.com/literature/hb/nios2/n2cpu_nii51015.pdf it says that the standard core uses 5 clock cycles for the embedded multiplier. Accessing the counter took around 3 clock cycles.
I have tried to disable interrupts, and I think I have done it right. I run 100 operations of each kind and they all show almost the same result. I don't believe an interrupt managed to jump in during those 30 clock cycles on all hundred runs. I also compiled with the release optimization setting, not the debug one. I also remember someone else on this forum saying that they measured 87 clock cycles for float.
Another thing while I am writing: I once tried to enable the hardware divide in the full_featured example design, but the divide didn't work. I didn't check the timing in Quartus that time, but the core was running at 85 MHz, as it was set up in the design. Is it the case that the hardware divide can't manage 85 MHz?
Does anyone have information on the floating point instruction that can be downloaded from the IP section of this forum? How fast is it, and how much space does it take? I looked into the file, but that information was there for only some of the operations.