Altera_Forum
7 years ago

Small mistake in clCreateBuffer crashes PCIe

I made a really simple mistake: I created a 16-byte buffer but transferred 24 bytes to the FPGA global memory. This crashed the PCIe link, and I had to restart the entire system because of this small error. The runtime drivers of the OpenCL SDK are really unstable.

//Assign 16 bytes of memory in FPGA DDR

sample_buf = clCreateBuffer(context, CL_MEM_READ_ONLY, 2* sizeof(char), NULL, &status);

checkError(status, "Failed to create buffer for sample in FPGA");

//Transfer 24 bytes of memory to FPGA DDR

status = clEnqueueWriteBuffer(que_sample, sample_buf, CL_TRUE, 0, 3*sizeof(char), sample_buf_cpu, 0, NULL, NULL);

checkError(status, "Failed to transfer sample_buf_cpu");
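
A minimal sketch of a host-side guard against this kind of mismatch, assuming the buffer size is known to the host (it can also be queried with clGetMemObjectInfo and CL_MEM_SIZE). The names write_fits and post_transfer_fits are hypothetical helpers, not part of the OpenCL API. Note that with sizeof(char) equal to 1, the expressions as posted evaluate to 2 and 3 bytes, not 16 and 24; the original program may have used a wider element type. Per the OpenCL specification, clEnqueueWriteBuffer should return CL_INVALID_VALUE for an out-of-bounds region, so a check like this is only a defensive layer for runtimes that do not catch it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the OpenCL API: returns 1 when a transfer
 * of `nbytes` starting at `offset` stays inside a buffer of `buf_size` bytes,
 * and 0 otherwise. Written so that offset + nbytes cannot overflow. */
static int write_fits(size_t buf_size, size_t offset, size_t nbytes)
{
    if (offset > buf_size)
        return 0;
    return nbytes <= buf_size - offset;
}

/* The sizes exactly as they appear in the posted code. */
static int post_transfer_fits(void)
{
    size_t buf_size = 2 * sizeof(char);  /* size given to clCreateBuffer */
    size_t xfer     = 3 * sizeof(char);  /* size given to clEnqueueWriteBuffer */
    return write_fits(buf_size, 0, xfer);
}

/* Sanity checks using the byte counts named in the post's comments. */
static void self_check(void)
{
    assert(write_fits(16, 0, 16));   /* exact fit is allowed */
    assert(!write_fits(16, 0, 24));  /* 24 bytes into 16 is rejected */
    assert(!post_transfer_fits());   /* the posted sizes are rejected too */
}
```

In a host program, the call to clEnqueueWriteBuffer could be skipped (and an error reported) whenever write_fits returns 0, so an oversized transfer never reaches the driver.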

3 Replies

  • Altera_Forum

    Are you sure the crash is happening because of the clEnqueueWriteBuffer() and not the FPGA reconfiguration? If you are using Arria 10, run-time reconfiguration has a relatively high chance of failure due to partial reconfiguration, especially on the older versions of Quartus. However, if you are sure your issue is caused by clEnqueueWriteBuffer() and you can reliably reproduce it, I recommend opening a ticket with Altera directly and reporting the issue.

  • Altera_Forum

    What do you mean by FPGA reconfiguration in this scenario? Do you mean a mismatch between the aocx file and the host program?

  • Altera_Forum

    --- Quote Start ---

    What do you mean by FPGA reconfiguration in this scenario? Do you mean a mismatch between the aocx file and the host program?

    --- Quote End ---

    No. Even without any mismatch or incorrect settings, FPGA reconfiguration on Arria 10 can hang or crash at run-time due to timing issues associated with partial reconfiguration. If your OS hangs or crashes right after "Reprogramming device [0] with handle X" is printed, then the reconfiguration is the cause. This is a random issue and is not necessarily reproducible even with the same binary.