Channel problem

Honored Contributor

10 years ago

In one sense, that seems about right. The kernel takes one thread at a time, meaning the 1,024,000 work-items are passed into the kernel one by one in a pipelined manner. 40 ms doesn't seem to be around the expected time with those kernels. There are some things that could be changed to improve efficiency, but in my opinion the main limiting factor right now is the access to global memory rather than the computation in the kernels.
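To make the reasoning above concrete, here is a rough back-of-the-envelope sketch in plain C (host-side arithmetic only, not kernel code). It assumes an ideal pipeline that accepts one work-item every `ii` clock cycles, and a global memory with a fixed sustained bandwidth. The function names, the clock frequency, the bytes per work-item and the bandwidth are all hypothetical placeholders; the real values would have to come from your compilation report and board specification.

```c
#include <stdint.h>

/* Ideal compute-side time in milliseconds for a pipelined kernel:
   one work-item enters every `ii` cycles, plus `depth` cycles of
   pipeline latency. All parameters are assumptions supplied by the caller. */
double pipeline_time_ms(uint64_t work_items, uint64_t ii,
                        uint64_t depth, double clock_hz)
{
    double cycles = (double)work_items * (double)ii + (double)depth;
    return cycles / clock_hz * 1e3;
}

/* Lower bound in milliseconds imposed by global memory traffic:
   total bytes moved divided by the sustained bandwidth (bytes/second). */
double memory_time_ms(uint64_t work_items, uint64_t bytes_per_item,
                      double bandwidth_bytes_per_s)
{
    double total_bytes = (double)work_items * (double)bytes_per_item;
    return total_bytes / bandwidth_bytes_per_s * 1e3;
}

/* The kernel cannot finish faster than the slower of the two bounds. */
double estimated_time_ms(double compute_ms, double memory_ms)
{
    return compute_ms > memory_ms ? compute_ms : memory_ms;
}
```

For example, with 1,024,000 work-items, `ii = 1`, no pipeline latency and a hypothetical 200 MHz kernel clock, `pipeline_time_ms` gives about 5.12 ms. If the memory bound computed from your actual traffic and bandwidth is larger than the compute bound, that would be consistent with global memory access being the limiting factor.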

Which Altera reduction example are you referring to?

In terms of private memory, my take is that there is no real per-work-item limit; the limit on private memory is the limit of the device itself. In other words, you can use as much private memory as you want, provided the amount does not exceed the on-chip memory of your FPGA.
