cache data on chip ram

Honored Contributor

8 years ago

Some of your __local buffers seem unnecessary to me. For the first kernel, I think you can remove the dependency by swapping the i and j loops, converting the weight_ocr buffer to a single scoped variable, and moving the load from external memory to sit between the j and i loops, as follows:

            for (a = 0; a < depth; ++a) {
                for (j = 0; j < col; ++j) {
                    // load from external memory once per j, into a single scoped variable
                    lane_data weight_ocr = weights;
                    for (i = 0; i < row; ++i) {
                        #pragma unroll
                        for (k = 0; k < LANE_NUM; ++k) {
                            // lc = lane_num*col can be passed as a kernel parameter because both are constant; this uses 8 DSPs
                            data_ch_vec.lane = input;
                            //printf("Lane:%d %f %f %f \n",k,data_ch_vec.lane.data,data_ch_vec.lane.data,data_ch_vec.lane.data);
                        }

                        // load weights
                        weight_buffer = weight_ocr; // 0,1,2,3,4,5,6 repeat until new filter 7,8,9,10,11,12,13
                        write_channel_altera(weight_ch, weight_buffer);
                        write_channel_altera(data_ch, data_ch_vec);
                    }
                }
            }

This removes the memory dependency. However, because it changes the loop order, it might break your kernel's functionality, so verify that the output is still correct before using it.
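To see why the reordering needs checking, here is a minimal host-side C sketch, not your actual kernel. The names `emit_i_outer` and `emit_j_outer`, the sizes `ROW` and `COL`, and the plain `float` weights are all hypothetical stand-ins; the `out` array plays the role of the channel. It assumes the original loop nest had i outside and j inside, which the post does not show.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration. */
#define ROW 2
#define COL 3

/* Assumed original order: i outer, j inner.
 * Every inner iteration reads weights[j] again. */
size_t emit_i_outer(const float weights[COL], float out[ROW * COL]) {
    assert(weights != NULL && out != NULL);
    size_t n = 0;
    for (int i = 0; i < ROW; ++i) {
        for (int j = 0; j < COL; ++j) {
            out[n++] = weights[j];   /* stand-in for a channel write */
        }
    }
    return n;
}

/* Reordered: j outer, i inner.
 * The weight is loaded once per j into a scoped scalar, so no buffer
 * has to be carried across iterations of the inner loop. */
size_t emit_j_outer(const float weights[COL], float out[ROW * COL]) {
    assert(weights != NULL && out != NULL);
    size_t n = 0;
    for (int j = 0; j < COL; ++j) {
        const float w = weights[j];  /* single load, hoisted out of the i loop */
        for (int i = 0; i < ROW; ++i) {
            out[n++] = w;            /* stand-in for a channel write */
        }
    }
    return n;
}
```

Both versions write the same number of values, but in a different order (1,2,3,1,2,3 versus 1,1,2,2,3,3 for weights 1,2,3). If the kernel on the other end of the channel expects the original order, it has to be adjusted to match.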

For the second kernel, a similar thing can be done. The conv_out buffer does not need to be a __local buffer; you can just replace it with a single scoped variable as follows:

            #pragma unroll
            for (unsigned char ll = 0; ll < LANE_NUM; ll++) {
                float conv_out = 0; // single scoped variable instead of a __local buffer
                #pragma unroll
                for (unsigned i = 0; i < PIPE_DEPTH; i++) {
                    conv_out += accum_piped;
                }
                conv_ch_in.data = conv_out;
            }

This removes the dependency in your second kernel.
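The same idea as a small host-side C sketch, again only an illustration. The function name `reduce_lanes` and the values of `LANE_NUM` and `PIPE_DEPTH` are hypothetical, and the indexing `accum_piped[ll][i]` and `conv_ch_in[ll]` is my assumption, since the indices are not visible in the snippet above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values, chosen only for illustration. */
#define LANE_NUM 2
#define PIPE_DEPTH 4

/* Sums the PIPE_DEPTH partial results of each lane into a scalar that is
 * declared inside the lane loop. Because conv_out is re-created for each
 * lane, nothing is shared between lanes and no buffer is needed. */
void reduce_lanes(const float accum_piped[LANE_NUM][PIPE_DEPTH],
                  float conv_ch_in[LANE_NUM]) {
    assert(accum_piped != NULL && conv_ch_in != NULL);
    for (int ll = 0; ll < LANE_NUM; ++ll) {
        float conv_out = 0.0f;               /* scoped per lane */
        for (int i = 0; i < PIPE_DEPTH; ++i) {
            conv_out += accum_piped[ll][i];  /* assumed indexing */
        }
        conv_ch_in[ll] = conv_out;
    }
}
```

In the OpenCL kernel both loops are fully unrolled, so each lane's scalar should end up as its own accumulator rather than a memory location, which is why the dependency goes away.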
