Intel HLS streaming problem
To test Intel HLS, we developed a producer/consumer project. The producer is a software module running under Linux on the Intel HPS, and the consumer is a hardware module connected to the h2f_axi and f2h_axi (DDR) buses.
The consumer module needs to be connected to two FIFOs (in and out) to communicate with the producer and the memory subsystem.
To do that, we used three interfaces: one ihc::stream_in, one ihc::stream_out, and one memory-mapped master interface, ihc::mm_master.
The component is compiled with i++ (Intel HLS) using the hls_stall_free_return and hls_always_run_component attributes. Here is the component signature:
hls_stall_free_return hls_always_run_component
component int consumerRAM0(
    ihc::mm_master<word4_t, ihc::aspace<8>, ihc::awidth<32>, ihc::dwidth<32>, ihc::latency<0>, ihc::waitrequest<true> >& device_6,
    ihc::stream_in<word4_t>& module1_read,
    ihc::stream_out<word4_t>& module1_write)
You may find the C++ sources in the build archive attached to this post.
Originally the component was declared with a void return value. We added the int return type to see if it would change the component's behavior. It didn't change anything.
Here is the command used to compile the component:
i++ -O2 -march=5CSEMA5F31C6 --simulator none --clock 10ns --component consumerRAM0 consumerRAM0_ihls.cpp -o consumerRAM0
We target a Cyclone V (DE1-SoC), and the component was tested under both Intel HLS 17.1 and 18.0 with the same results.
The consumer executes its operations in the following order:
- The component retrieves a value from a FIFO through the stream_in interface. In this case, the value is a memory address.
- The component then uses the address to perform a 32-bit read through the memory-mapped interface.
- Finally, the value read from memory is sent through the stream_out interface to another FIFO (not the input FIFO from the first step).
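The three steps above can be sketched as a plain C++ software model. This is not the actual component source (that is in the attached archive): MockStream and MockMemory are hypothetical stand-ins for the ihc:: interfaces, word4_t is assumed to be a 32-bit unsigned word, and the address is treated as a word index, which may differ from the real design.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <vector>

// ASSUMPTION: word4_t is a 32-bit word (matches dwidth<32> in the signature).
using word4_t = std::uint32_t;

// Hypothetical stand-in for ihc::stream_in / ihc::stream_out (NOT the HLS class).
struct MockStream {
    std::queue<word4_t> fifo;

    // The real stream read blocks when empty; this model just asserts instead.
    word4_t read() {
        assert(!fifo.empty());
        word4_t v = fifo.front();
        fifo.pop();
        return v;
    }

    void write(word4_t v) { fifo.push(v); }
};

// Hypothetical stand-in for the ihc::mm_master port (NOT the HLS class).
struct MockMemory {
    std::vector<word4_t> words;

    // ASSUMPTION: the address is used as a word index into the memory.
    word4_t load(word4_t index) const {
        assert(index < words.size());
        return words[index];
    }
};

// One invocation performs exactly one read, one memory load, and one write.
int consumerModel(MockMemory& device_6, MockStream& module1_read, MockStream& module1_write) {
    word4_t address = module1_read.read();   // step 1: get the address
    word4_t value = device_6.load(address);  // step 2: 32-bit memory read
    module1_write.write(value);              // step 3: forward the value
    return 0;
}
```

In this model, one value pushed into the input stream yields exactly one value in the output stream per call, which is the behavior we expected from the hardware.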
When we implement the component on the FPGA, we get the following behavior:
- The initial input data is correctly read.
- The memory access done through the Avalon MM interface is also valid.
- The stream_out operation, on the other hand, starts at the correct time, but its signals stay asserted for too long.
After further investigation, it seems that the HW module loops and closes the stream_out operation only when one of the stream_in (module1_read) inputs changes. Otherwise, all the stream_out (module1_write) signals stay active. This has the side effect of writing the same value multiple times into the FIFO connected to the stream_out interface.
To better demonstrate the problem, we ran the generated HDL files through a custom testbench developed with ModelSim Intel FPGA Starter Edition.
For the first test, we force a 10-cycle delay between new input values, so that the component has time to process each request. In the following waveform we clearly see that the write signal (module1_write_valid) issued by the component stays asserted for more than one cycle.
If we change the input of the stream_in interface on every clock cycle, we see that the valid signal stays asserted for only one cycle, as specified.
Also, if there is no more data in the input FIFO connected to the stream_in interface, the component stalls at the write operation, again writing the same value multiple times. The write signal eventually drops after 180 ns, or 18 clock cycles.
This suggests that the read and write operations are linked together rather than treated as atomic operations. According to the Intel documentation for the Avalon Interface Specifications (mnl_avalon_spec.pdf version 18.0):
The valid signal qualifies valid data on any cycle where data is being transferred from the source to the sink. On each valid cycle the data signal and other source to sink signals are sampled by the sink.
In other words, the component should assert the valid signal for one cycle for every new value. Here we see that the valid signal stays asserted for more than one cycle.
At this point, we are wondering whether it is possible to ensure the atomicity of each operation.
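To illustrate why a valid signal held for several cycles produces duplicate FIFO entries, here is a minimal cycle-level sketch in plain C++. It assumes the sink FIFO is always ready and captures the data on every cycle where valid is high; CycleSignals, heldValid and runSink are hypothetical names and are not part of Intel HLS or the generated HDL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Source-to-sink signals observed on one clock cycle (hypothetical model).
struct CycleSignals {
    bool valid;
    std::uint32_t data;
};

// Build a waveform where valid is held high for holdCycles with the same data,
// then deasserted for idleCycles.
std::vector<CycleSignals> heldValid(std::uint32_t value, std::size_t holdCycles,
                                    std::size_t idleCycles) {
    assert(holdCycles >= 1);  // a transfer needs at least one valid cycle
    std::vector<CycleSignals> wave(holdCycles, CycleSignals{true, value});
    wave.insert(wave.end(), idleCycles, CycleSignals{false, 0});
    return wave;
}

// ASSUMPTION: the sink is always ready, so it samples data on every valid cycle.
std::vector<std::uint32_t> runSink(const std::vector<CycleSignals>& cycles) {
    std::vector<std::uint32_t> fifo;
    for (const CycleSignals& c : cycles) {
        if (c.valid) {
            fifo.push_back(c.data);  // one FIFO entry per valid cycle
        }
    }
    return fifo;
}
```

Under these assumptions, runSink(heldValid(v, 1, 3)) stores a single entry, while runSink(heldValid(v, 18, 2)) stores 18 copies of the same value, which matches the duplicated writes we observe when valid stays asserted for 18 clock cycles.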