Forum Discussion
Altera_Forum
Honored Contributor
8 years ago
As I expected, your design is too big to fit on the FPGA, and that is why it is failing to compile. This is the area estimation from the log:
+--------------------------------------------------------------------+
; Estimated Resource Usage Summary                                   ;
+----------------------------------------+---------------------------+
; Resource                               + Usage                     ;
+----------------------------------------+---------------------------+
; Logic utilization                      ; 191%                      ;
; ALUTs                                  ; 69%                       ;
; Dedicated logic registers              ; 121%                      ;
; Memory blocks                          ; 14%                       ;
; DSP blocks                             ; 136%                      ;
+----------------------------------------+---------------------------+

Your kernel will probably need at least 499 double-precision multipliers and a huge amount of memory and logic to support them. If you write the kernel as follows, you will have the exact same dependency but with a very modest area usage:

__kernel void Test51(__global double *data, __global double *rands, int index, int rand_max){
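    double2 temp;
    int gid = get_global_id(0);
    temp = data[gid];                     // [gid] index assumed; the scalar is widened to both lanes
    for (int i = 1; i < 500; i++)
    {
        temp = (double) rands[i] * temp;  // [i] index assumed; each iteration depends on the previous temp
    }
    data[gid] = temp.s0;                  // write one lane back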
}

It is worth mentioning that loops in NDRange kernels are not pipelined; instead, they are shared by multiple threads to keep the pipeline busy. Because of this, loop-carried dependencies do not have much of a negative effect in NDRange kernels. However, if you write the same kernel as a single work-item kernel, the loop-carried dependency on the temp variable will give you an initiation interval greater than one and very bad performance (which can be fixed by inferring a shift register, as outlined in Intel's documents).
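
For reference, the shift-register idea for the single work-item case can be sketched roughly as below. This is only a sketch and not taken from Intel's documents verbatim: it is written as plain C so it can be compiled and checked on a host, the function name product_shift_register and the DEPTH value of 8 are placeholders I made up, and in a real kernel DEPTH would have to be at least the latency of the double-precision multiplier reported by the compiler. In an actual OpenCL kernel the same body would sit inside a __kernel function with __global pointers, and the two inner loops would be fully unrolled.

```c
/* Assumed depth of the shift register (hypothetical value). On the FPGA it
 * should be no smaller than the multiplier latency in cycles. */
#define DEPTH 8

/* Multiplies init by rands[1] .. rands[n-1], like the loop in the kernel,
 * but spreads the running product over DEPTH independent partial products
 * so that each multiply depends on a value computed DEPTH iterations ago
 * rather than on the immediately preceding iteration. */
double product_shift_register(double init, const double *rands, int n)
{
    double shift_reg[DEPTH + 1];

    /* In OpenCL: #pragma unroll */
    for (int j = 0; j <= DEPTH; j++)
        shift_reg[j] = 1.0;              /* neutral element for multiplication */

    for (int i = 1; i < n; i++) {
        /* Oldest partial product is updated and pushed in at the tail. */
        shift_reg[DEPTH] = shift_reg[0] * rands[i];

        /* In OpenCL: #pragma unroll */
        for (int j = 0; j < DEPTH; j++)
            shift_reg[j] = shift_reg[j + 1];
    }

    /* Final reduction of the DEPTH partial products. */
    double result = init;
    /* In OpenCL: #pragma unroll */
    for (int j = 0; j < DEPTH; j++)
        result *= shift_reg[j];

    return result;
}
```

Note that this reorders the floating-point multiplications, so the result can differ from the sequential loop in the last few bits.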