Forum Discussion

Altera_Forum
7 years ago

Reduce logic utilization

Hi,

I have this part in my kernel where it takes too much logic


if (relu == 1) {
    if (out < 0)
        conv_in = 0.1f * out;
    else
        conv_in = out;
}

out is a float. report.html shows this function using about 4K ALUTs and 8K FFs, which is too much for my DE1-SoC to handle. Any idea how to reduce it?

Btw, the function is a leaky activation function where negative data is multiplied by 0.1.

Thanks in advance.

EDIT:

What are the ups and downs of using these two compiler flags?

1) -fp-relaxed

2) -fpc

6 Replies

  • Altera_Forum

    Since floating-point operations are not natively supported by the DSPs in Cyclone V, a floating-point multiplication maps only the mantissa multiply to DSPs; all other operations, including shifting (with barrel shifters) and rounding, use logic and FFs. This is expected behavior and cannot be avoided unless you give up IEEE-754 compliance.

    --fp-relaxed allows parallelizing chained floating-point operations into a tree, which requires reordering the operations. This can slightly reduce the logic/FF overhead at the cost of small changes in the output. However, it will not necessarily make any difference in your kernel unless you have chained floating-point operations.

    --fpc can significantly reduce logic and FF overhead of floating-point operations by reducing the area spent on rounding functions, at the cost of losing compliance with the IEEE-754 standard; i.e. if you use that switch, you could get very different (i.e. inaccurate) results compared to running the same code on a CPU/GPU.

    Another option you have is to use fixed-point numbers. Altera's documents outline how you can use bit masking to convert floating-point numbers to fixed-point in an OpenCL kernel.
  • Altera_Forum

    jack12, try replacing "conv_in = 0.1*out" with "conv_in = 0.125*out", or with "conv_in = 0.125*out - 0.03125*out" for more precision -- these expressions are cheaper to implement.

  • Altera_Forum

    The kernel mainly performs floating-point convolutions repeatedly. Anyway, I will try to verify my results and compare them with the compiler flags on. Thanks HRZ

  • Altera_Forum

    Hi WitFed,

    I am trying to reduce the logic utilization, as the design cannot fit into the FPGA. I am confused why conv_in = 0.125*out - 0.03125*out would reduce logic utilization. Shouldn't the subtractor use more logic?
  • Altera_Forum

    Because there is no exact 0.1 in hardware.

    If you use 0.1, which has no finite binary representation, the compiler will use a lot of hardware to implement a multiplication by a number as close as possible to 0.1.

    However, 0.125 and 0.03125 are powers of two (2^-3 and 2^-5), so (0.125 - 0.03125)*out is like (out >> 3) - (out >> 5): just shifts and a subtract.