Is it possible to access nearby CLBs efficiently when using OpenCL?
Hi,
I am working on a research project to deepen my knowledge of parallel computing, and I have only just scratched the surface of FPGA technology.
To the best of my knowledge, OpenCL is compatible with FPGAs. I would like my OpenCL kernels to read data that is stored in other, nearby CLBs. The problem is that I don't see any way to do that iteratively.
Correct me if I'm wrong, but the only option OpenCL gives for sharing kernel-related data locally is the local memory space. The catch is that local memory access is restricted to user-defined work-groups. As I understand it, work-groups let the user define sections of the board, each of which shares a common local memory allocation.
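To make the limitation concrete, here is a minimal sketch in plain C (not real OpenCL code) of how I understand the work-group model. The names `GROUP_SIZE`, `tile` and `neighbour_sum` are made up for illustration. `tile` stands in for a `__local` buffer, and the two loops stand in for the work-items of one group before and after a `barrier(CLK_LOCAL_MEM_FENCE)`.

```c
#include <assert.h>
#include <stddef.h>

#define GROUP_SIZE 4 /* hypothetical work-group size */

/* Emulates one work-group computing left + self + right for each element.
 * Neighbours are only visible if they belong to the same group. */
static void neighbour_sum_group(const int *in, int *out, size_t group_start)
{
    int tile[GROUP_SIZE]; /* stands in for a __local buffer */

    /* Phase 1: every work-item copies its own element into local memory. */
    for (size_t lid = 0; lid < GROUP_SIZE; ++lid)
        tile[lid] = in[group_start + lid];

    /* In OpenCL, barrier(CLK_LOCAL_MEM_FENCE) would separate the phases. */

    /* Phase 2: every work-item reads its neighbours from local memory.
     * At the edges of the group there is nothing to read, so 0 is used. */
    for (size_t lid = 0; lid < GROUP_SIZE; ++lid) {
        int left  = (lid > 0) ? tile[lid - 1] : 0;
        int right = (lid + 1 < GROUP_SIZE) ? tile[lid + 1] : 0;
        out[group_start + lid] = left + tile[lid] + right;
    }
}

/* Runs the groups one after another; n must be a multiple of GROUP_SIZE. */
void neighbour_sum(const int *in, int *out, size_t n)
{
    assert(n % GROUP_SIZE == 0);
    for (size_t g = 0; g < n; g += GROUP_SIZE)
        neighbour_sum_group(in, out, g);
}
```

With `GROUP_SIZE` set to 4 and the input 1..8, the element holding 4 only sees 3 and itself. Its right-hand neighbour 5 belongs to the next work-group, so it cannot be reached through local memory even though it is adjacent.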
That kind of logic would not fit the purpose of my project. In FPGA architecture, however, there is such a thing as local addressing (neighbouring CLBs, rows and columns). A CLB can take its input from a nearby CLB efficiently through this addressing system, because the device is wired in a way that makes this possible. Is there any way for OpenCL to use this capability of FPGAs?
Thank you for any help in advance!