Altera_Forum
Honored Contributor
8 years ago
Optimize Memory Accesses FPGA
Hi everyone. I am developing code that has to evaluate some floating-point values based on some data. My current approach is that the evaluation for each cell of an int vector is done by a work-item.
This int vector contains the indexes used to access some other data. That other data has to be read from global memory and mainly consists of structures. The structures are used to evaluate the floating-point measure I need, which is then saved to a vector of floats, one entry per work-item id. I'll provide a snippet of code. The problem is that this code scales very well on my GPU but not on the FPGA: as I double the number of work-items, the run time increases quite badly. So I am very confused about how I should optimize the memory accesses. I can't find much online, and I am struggling a lot with this problem because it seems to be present in all the code I write. Thanks to anyone who can clarify or explain the proper approach to optimizing the memory accesses. Thanks

int tid = get_global_id(0); //WORK-ITEM FOR EACH SOLUTION
int groupIndex = tid * b_sizes;
int indexWU = tid * b_sizes;
for (int i = 0; i < b_sizes; i++)
{
    float totalC = 0;
    int rIndex = b_solutions;
    if (rIndex != -1)
    {
        struct SO o= b_o;
        struct SR r= b_r;
        float freC = 0;
        if (r.sL != CRF)
        {
            float tW = b_wU;
            freC = getWBC(tW, r, b_wB) / tW*o.w;
        }
        float whC = r.whC*o.d;
        totalC += freC + whC;
    }
    b_solPerf += totalC;
}
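
To make the access pattern described above easier to see in isolation, here is a minimal plain-C model of it. This is only a sketch under assumptions, not the kernel from the post: the array subscripts are not visible in the snippet, so the indexing used here (`tid * n + i` into the index vector, and the index value used to pick the structure) is a guess. The names `Order`, `Resource`, `eval_item` and `eval_all` are hypothetical, and the `getWBC` term is left out because its definition is not shown.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the structures in the post (struct SO, struct SR).
   Only the fields needed for the whC * d term are modelled. */
struct Order    { float d; };
struct Resource { float whC; };

/* Model of one work-item: walks its slice of the index vector and
   gathers the structures those indexes point to. */
static float eval_item(int tid, int n, const int *idx,
                       const struct Order *orders,
                       const struct Resource *res)
{
    float total = 0.0f;
    int base = tid * n;                 /* assumed layout: n cells per work-item */
    for (int i = 0; i < n; i++) {
        int r = idx[base + i];          /* sequential read of the index vector */
        if (r != -1) {
            /* data-dependent read: the address depends on the value just loaded */
            total += res[r].whC * orders[base + i].d;
        }
    }
    return total;
}

/* Model of the whole NDRange: one result per work-item id. */
static void eval_all(int items, int n, const int *idx,
                     const struct Order *orders,
                     const struct Resource *res, float *out)
{
    for (int tid = 0; tid < items; tid++) {
        out[tid] = eval_item(tid, n, idx, orders, res);
    }
}
```

In this model the reads of `idx` and `orders` walk memory in order, while the read of `res[r]` jumps to an address that is only known once the index has been loaded. That second kind of access is the one the question is about.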