Forum Discussion
Having to instantiate a significantly more complex LSU isn't a helpful solution. Doing the math, we won't be able to complete our design with HLS if we have to use burst-coalesced LSUs. Our design would use none of the features of the burst-coalesced unit. It seems like there should be a way to tell HLS that I want a more deeply pipelined LSU without wasting copious amounts of resources.
We are starting to switch to Verilog at this point, since there doesn't seem to be an answer to this question. If anyone has a solution, we would greatly appreciate it, as having to rewrite our design in Verilog is significantly affecting our timeline.
Hi @AUT,
Unfortunately, at this time there is no way to increase the capacity of the FIFO associated with the pipelined LSU without directly modifying the generated Verilog (i.e. updating KERNEL_SIDE_MEM_LATENCY so that the instantiated FIFO is larger). What is the desired number of dispatch requests that you are trying to achieve?
- AUT, 1 year ago
New Contributor
Thank you for letting me know; we will continue with development of our Verilog-based design. Given the high latency of HBM, I currently need around 128 outstanding requests. The best I can get, even after increasing KERNEL_SIDE_MEM_LATENCY, is 41 requests, so I think other, more involved changes to the generated Verilog would be needed to exploit the full depth.
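As a rough illustration of why the request depth matters, the relationship can be sketched with Little's law (requests in flight = latency × issue rate). The function names below are hypothetical, and the 128-cycle latency used in the usage note is an assumption chosen only so that the required depth matches the 128 figure above; it is not a measured HBM latency.

```cpp
#include <cassert>
#include <cmath>

// Little's law: requests in flight = latency (cycles) * issue rate (requests/cycle).
// Returns the outstanding-request depth needed to sustain `requests_per_cycle`.
// Hypothetical helper, not part of any HLS tool.
int required_outstanding(int latency_cycles, double requests_per_cycle) {
    assert(latency_cycles > 0 && requests_per_cycle > 0.0);
    return static_cast<int>(std::ceil(latency_cycles * requests_per_cycle));
}

// Upper bound on the sustained issue rate (requests/cycle) when at most
// `depth` requests may be in flight, capped at one request per cycle.
double sustained_rate(int depth, int latency_cycles) {
    assert(depth >= 0 && latency_cycles > 0);
    const double rate = static_cast<double>(depth) / latency_cycles;
    return rate < 1.0 ? rate : 1.0;
}
```

Under the assumed 128-cycle round trip, required_outstanding(128, 1.0) gives 128, while sustained_rate(41, 128) is about 0.32, i.e. a depth of 41 could keep the memory interface busy roughly a third of the time at best.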
If possible, in a future release I think a feature similar to Xilinx's num_read_outstanding would be useful, especially for HBM/DDR designs.
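For readers unfamiliar with it, num_read_outstanding is an option of the m_axi INTERFACE pragma in Vitis HLS. A minimal sketch of how it is typically written follows; the kernel name sum_words, the bundle name gmem0, and the value 128 are illustrative assumptions, and this is Xilinx tooling, not something the Intel HLS compiler accepts.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernel: sums n words read from external memory.
int sum_words(const int *in, std::size_t n) {
// Vitis HLS directive (ordinary C++ compilers ignore unknown pragmas).
// num_read_outstanding asks for up to 128 read requests in flight on this
// AXI master before the design stalls; 128 is an illustrative value.
#pragma HLS INTERFACE m_axi port=in offset=slave bundle=gmem0 num_read_outstanding=128
    assert(in != nullptr || n == 0);
    int acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
#pragma HLS PIPELINE II=1
        acc += in[i];
    }
    return acc;
}
```

The point of the sketch is only that the outstanding-request depth is a per-interface setting in the source code, so no edits to generated RTL are needed.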
Best,
Austin