Forum Discussion

Altera_Forum
Honored Contributor
8 years ago

The result is correct in emulator mode but wrong on the FPGA

My code produces the correct result in emulator mode, but when I compile it and run it on the FPGA, the result is wrong.

What could cause this problem, and how can I debug the code in this situation? The code takes 13 hours to compile.

Thank you very much!

8 Replies

  • Altera_Forum

    This could be anything from a race condition that is not properly emulated by the emulator to a bug in the compiler. Pretty much the only way to debug code on the FPGA is to use printf from inside the kernel, and that is only if the kernel does not deadlock.

  • Altera_Forum

    --- Quote Start ---

    This could be anything from a race condition that is not properly emulated by the emulator to a bug in the compiler. Pretty much the only way to debug code on the FPGA is to use printf from inside the kernel, and that is only if the kernel does not deadlock.

    --- Quote End ---

    Thanks for your reply!

    My kernels are all task kernels. Do you mean a case where multiple kernels read/write the same global variable?
  • Altera_Forum

    --- Quote Start ---

    My kernels are all task kernels. Do you mean a case where multiple kernels read/write the same global variable?

    --- Quote End ---

    Yes, if you have multiple kernels running in parallel which read from or write to the same global memory buffer, and read/write order will affect your output, you will likely get incorrect results. Trying to synchronize such kernels using channels will NOT work.
  • Altera_Forum

    --- Quote Start ---

    Yes, if you have multiple kernels running in parallel which read from or write to the same global memory buffer, and read/write order will affect your output, you will likely get incorrect results. Trying to synchronize such kernels using channels will NOT work.

    --- Quote End ---

    But my code has no global variable that multiple kernels write to. Could the problem be the channels? In my code, many channel reads/writes are conditional. Or is it a problem with local memory access? My local buffer is double-buffered; within a loop it is both read and written, but at different addresses.
  • Altera_Forum

    As I said, it could be something in your code or just a bug in the compiler; it is impossible to know without a full analysis of the code and the resulting HDL. You can try opening a support request with Altera and sending them your source code to see what they say.

  • Altera_Forum

    HRZ,

    In response to your comment:

    --- Quote Start ---

    Yes, if you have multiple kernels running in parallel which read from or write to the same global memory buffer, and read/write order will affect your output, you will likely get incorrect results. Trying to synchronize such kernels using channels will NOT work.

    --- Quote End ---

    I have not explored this design pattern yet; but is this due to a lack of cache coherence (in the LSUs that are automatically inferred)? Did you ever bring this up with Altera or get a response?
  • Altera_Forum

    --- Quote Start ---

    I have not explored this design pattern yet; but is this due to a lack of cache coherence (in the LSUs that are automatically inferred)? Did you ever bring this up with Altera or get a response?

    --- Quote End ---

    This is not a bug; it is part of the OpenCL specification. Global memory consistency is only guaranteed at the end of kernel execution, so one should not expect consistency when kernels running in parallel read from or write to the same global address. Channels are meant to be used for passing data between kernels, not for synchronizing global memory loads/stores.
  • Altera_Forum

    --- Quote Start ---

    But my code has no global variable that multiple kernels write to. Could the problem be the channels? In my code, many channel reads/writes are conditional. Or is it a problem with local memory access? My local buffer is double-buffered; within a loop it is both read and written, but at different addresses.

    --- Quote End ---

    Channel ordering could be the problem; try printing it out. In my case, the emulator had the wrong ordering but the FPGA was correct.
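
For anyone landing on this thread later, the shared-global-buffer race discussed in the replies can be illustrated with a small host-side simulation in plain C. This is only a sketch, not OpenCL and not anyone's actual kernel code: `producer_step`, `consumer_step`, `fifo_t` and the three `run_*` functions are hypothetical names made up for the example, and the two schedules are assumptions chosen to show how one order of execution can hide a race that another order exposes. The thread does not say how the emulator actually schedules kernels.

```c
#include <stddef.h>

/* Hypothetical stand-ins for two task kernels sharing one global buffer.
   None of these names belong to any OpenCL or vendor API. */
enum { N = 4 };

/* One step of the "producer" kernel: writes element i of the shared buffer. */
static void producer_step(int *global_buf, size_t i) {
    global_buf[i] = (int)(i + 1) * 10;
}

/* One step of the "consumer" kernel: reads element i and accumulates it. */
static void consumer_step(const int *global_buf, size_t i, int *sum) {
    *sum += global_buf[i];
}

/* Assumed "emulator-like" schedule: the producer finishes before the
   consumer starts, so every read sees the value that was written. */
int run_sequential(void) {
    int buf[N] = {0};
    int sum = 0;
    for (size_t i = 0; i < N; ++i) producer_step(buf, i);
    for (size_t i = 0; i < N; ++i) consumer_step(buf, i, &sum);
    return sum;
}

/* Assumed "hardware-like" schedule: both kernels run at the same time and
   the consumer happens to be one step ahead, so it reads stale zeros. */
int run_interleaved(void) {
    int buf[N] = {0};
    int sum = 0;
    for (size_t i = 0; i < N; ++i) {
        consumer_step(buf, i, &sum);  /* element i has not been written yet */
        producer_step(buf, i);
    }
    return sum;
}

/* Minimal FIFO standing in for a channel that carries the data itself. */
typedef struct {
    int data[N];
    size_t head, tail;
} fifo_t;

/* Returns 1 and stores a value in *out if the FIFO is non-empty, else 0. */
static int fifo_try_read(fifo_t *f, int *out) {
    if (f->head == f->tail) return 0;
    *out = f->data[f->head++];
    return 1;
}

static void fifo_write(fifo_t *f, int v) {
    f->data[f->tail++] = v;
}

/* Same "consumer first" schedule, but the values travel through the FIFO.
   A read on an empty FIFO simply stalls, so the result no longer depends
   on which side runs first. */
int run_with_fifo(void) {
    fifo_t f = { {0}, 0, 0 };
    int sum = 0;
    size_t produced = 0, consumed = 0;
    while (consumed < N) {
        int v;
        if (fifo_try_read(&f, &v)) { sum += v; ++consumed; }
        if (produced < N) { fifo_write(&f, (int)(produced + 1) * 10); ++produced; }
    }
    return sum;
}
```

Calling the three functions from a small `main` shows the point: `run_sequential()` and `run_with_fifo()` agree, while `run_interleaved()` gives a different sum even though the "kernel" code is identical. That matches the advice above to pass data between concurrently running kernels through channels rather than through a shared global buffer.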