Forum Discussion
Altera_Forum
8 years ago

The host will divide the sparse matrix into 256 strips, and the size of each strip may be different. It then passes the matrix to the kernel along with the 256 offset values. The kernel is implemented in RTL and wrapped in an OpenCL library.
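As a rough illustration of the strip layout described above, the offsets can be computed on the host as a running sum of the strip sizes, so that a single array of offsets is handed to the kernel. This is only a sketch: the function name `stripOffsets` and the use of `std::vector` are assumptions for illustration, not part of the actual host code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: given the number of elements in each strip,
// return the index at which each strip starts in the packed array.
// offsets[i] is the sum of the sizes of all strips before strip i.
std::vector<std::size_t> stripOffsets(const std::vector<std::size_t>& stripSizes) {
    std::vector<std::size_t> offsets(stripSizes.size(), 0);
    std::size_t running = 0;
    for (std::size_t i = 0; i < stripSizes.size(); ++i) {
        offsets[i] = running;      // strip i starts where the previous ones end
        running += stripSizes[i];  // advance past strip i
    }
    return offsets;
}
```

With 256 strips this yields 256 offsets, which can live in one buffer rather than being passed as separate kernel arguments.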
The code that triggers the error tries to allocate a large matrix:

    cl_mem bufferMA;
    sizeMA = 1024*1024*512;
    bufferMA = clCreateBuffer(context, CL_MEM_READ_WRITE, sizeof(cl_float) * sizeMA, NULL, &status);
    status = clEnqueueWriteBuffer(queue, bufferMA, CL_FALSE, 0, sizeof(cl_float) * sizeMA, MA, 0, NULL, NULL);

Here MA is defined as

    void *MA = (void*)aocl_utils::alignedMalloc(sizeof(cl_float) * sizeMA);

and is initialized elsewhere. The code compiles, but I get the error at runtime. I think this is because the RTL kernel takes the problem as a whole, so the dataset exceeds some limit?

--- Quote Start ---
Please post the part of your host code that is generating that error message, along with the actual values for all parameters passed to that function. You very likely have a mistake somewhere in your host code. Also, you should never write your code in a way that requires you to split your buffers on the host, or pass 100 parameters to the kernel. This is certainly not the correct way to write OpenCL code. If you are not familiar with OpenCL, I strongly recommend looking at some basic non-FPGA examples and writing some basic OpenCL code on CPUs and GPUs first, and then moving on to FPGAs. Altera also has a lot of examples here (https://www.altera.com/products/design-software/embedded-software-developers/opencl/developer-zone.html) which you can look at.
--- Quote End ---
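For scale, the buffer in the snippet above is sizeof(cl_float) * 1024*1024*512 bytes, which is 2 GiB (cl_float is a 32-bit float). The sketch below shows one way to check such a size against a device limit before calling clCreateBuffer. The helper names are hypothetical, and the limit is passed in as a plain number so the example needs no OpenCL headers; in a real host program it would presumably be queried with clGetDeviceInfo and CL_DEVICE_MAX_MEM_ALLOC_SIZE. Whether this limit is what actually causes the runtime error here is not established.

```cpp
#include <cstdint>

// Hypothetical helper: bytes needed for `count` 32-bit floats.
// Uses 64-bit arithmetic so that large counts do not overflow.
std::uint64_t floatBufferBytes(std::uint64_t count) {
    return count * 4u;  // 4 bytes per cl_float
}

// Hypothetical helper: true if a buffer of `bytes` fits within
// `maxAllocBytes`. The limit is an input here; on a real device it
// would be read via clGetDeviceInfo(CL_DEVICE_MAX_MEM_ALLOC_SIZE).
bool fitsSingleAllocation(std::uint64_t bytes, std::uint64_t maxAllocBytes) {
    return bytes <= maxAllocBytes;
}
```

Note that 2 GiB is 2147483648 bytes, one more than the largest value a signed 32-bit integer can hold, so any 32-bit signed size variable along the way would also overflow at exactly this problem size.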