Hello! I am currently using the HPS and the FPGA together. I am communicating between them via the lightweight AXI bridge. On the HPS I also have LXDE Linux running. In order to test the maximum s...
I included the Qsys file that I am using again. In the Qsys File I declared four Parallel IOs. Those are connected to the HPS lightweight AXI bridge via a memory bridge.
The so-called "data_..." PIOs simply hold data, which is set either by the FPGA or by the HPS. The "ctrl_...." PIOs handle the communication between the HPS and the FPGA. It works like this (simplified):
1. The HPS toggles the ctrl_hps_... signal.
2. Meanwhile, the FPGA waits for a state change on this ctrl_hps_... signal.
3. When the FPGA registers the state change by the HPS, it reads data from the data PIO and writes its own data to the other data PIO.
4. After this, it toggles its own ctrl_fpga_... signal.
5. The HPS (after toggling its own PIO) also waits for this toggle by the FPGA.
6. When it registers the state change, it starts the same communication flow again.
It is somewhat like this:
This is the communication that is happening. This is working totally like intended! So everything is working without any error.
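For reference, the HPS side of the handshake described above can be sketched in C roughly as follows. This is only a minimal sketch, not the uploaded code: the names `link_t` and `hps_exchange`, the use of bit 0 as the toggle bit, and the `max_polls` timeout are assumptions made for illustration. The four register pointers are assumed to come from mapping the lightweight bridge region into user space (for example with mmap on /dev/mem), with offsets taken from the Qsys address map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handle for the four PIOs plus software copies of both flags. */
typedef struct {
    volatile uint32_t *ctrl_hps;   /* written by the HPS, watched by the FPGA */
    volatile uint32_t *ctrl_fpga;  /* written by the FPGA, watched by the HPS */
    volatile uint32_t *data_hps;   /* data travelling HPS -> FPGA */
    volatile uint32_t *data_fpga;  /* data travelling FPGA -> HPS */
    uint32_t hps_state;            /* last value written to ctrl_hps */
    uint32_t fpga_state;           /* last value seen on ctrl_fpga */
} link_t;

/* One handshake round from the HPS side.
 * Returns 0 once the FPGA has toggled, or -1 if no toggle was seen
 * within max_polls reads (the timeout is an addition for illustration). */
static int hps_exchange(link_t *l, uint32_t tx, uint32_t *rx, uint32_t max_polls)
{
    assert(l != NULL && rx != NULL);

    *l->data_hps = tx;             /* publish the data first */
    l->hps_state ^= 1u;            /* flip the shadow copy (no read-back needed) */
    *l->ctrl_hps = l->hps_state;   /* then toggle the HPS flag */

    for (uint32_t i = 0; i < max_polls; i++) {
        uint32_t now = *l->ctrl_fpga & 1u;
        if (now != l->fpga_state) {    /* FPGA has toggled */
            l->fpga_state = now;
            *rx = *l->data_fpga;       /* fetch the FPGA's answer */
            return 0;
        }
    }
    return -1;                         /* FPGA did not respond in time */
}
```

Keeping the last written HPS flag in `hps_state` means each round needs one data write, one flag write, at least one flag read and one data read across the bridge, and nothing else.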
Now to my problem: The FPGA is running at 50MHz. As far as I know, the HPS processor is running at almost 1GHz. (Is this the case? I cannot find any information about it.)
The FPGA is currently only waiting for a state change and toggles its signal accordingly. I can see on the oscilloscope that this is done correctly, because the FPGA toggles 20ns (one clock period at 50MHz) after the HPS toggles.
But the HPS is not working the way it should. I am running the LXDE Linux OS on it. It is slightly faster when controlling it via UART instead of the desktop GUI, but the problem occurs in both cases. THE HPS CPU IS WAY TOO SLOW. When I simply toggle the signals (no other calculations in the C code that I am executing on Linux) I only reach a frequency of around 300kHz; when transmitting information and doing some normal C operations (bit shifts, ...) the frequency drops to around 200kHz.
The FPGA is always toggling correctly after 20ns. However the HPS takes WAY longer. It takes about 3us. I know that doing this with while(1) loops is not the most efficient way of waiting in the C Code. But the code is like
while (fpga has not toggled) {
    wait...
}
// fpga has toggled
toggle hps signal
So it is really not that resource demanding. If the HPS really runs with almost 1GHz why is it taking so long? Is the CPU just not capable of running this code faster? Is there any other restriction? Is there a way that is more efficient?
If the HPS really runs internally at 1GHz, does it really take almost 3000 clock cycles to check one 32Bit address and write to another 32Bit address? It is not somehow stuck in the while loop or waiting for too long; I checked this.
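To put a number on the cost of a single register access, one possible approach is to time a large batch of reads and convert the average to CPU cycles. The following is a minimal sketch under assumptions: `avg_read_ns` and `ns_to_cycles` are hypothetical helper names, the 1GHz figure is only the clock rate I assume above, and `reg` would have to point at one of the mapped PIO registers to measure the bridge rather than ordinary memory.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime under strict C modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Average time in nanoseconds for one 32-bit read of *reg, over n reads. */
static double avg_read_ns(volatile uint32_t *reg, uint32_t n)
{
    struct timespec t0, t1;
    uint32_t sink = 0;

    assert(n > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (uint32_t i = 0; i < n; i++)
        sink ^= *reg;              /* volatile read cannot be optimized away */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)n;
}

/* Number of clock cycles that pass in `ns` nanoseconds at `clock_mhz`. */
static double ns_to_cycles(double ns, double clock_mhz)
{
    return ns * clock_mhz / 1000.0;
}
```

With the figures from this post, `ns_to_cycles(3000.0, 1000.0)` gives 3000 cycles for the 3us case and `ns_to_cycles(20.0, 50.0)` gives exactly 1 FPGA clock cycle for the 20ns response.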
More examples with pictures from the oscilloscope:
The highest frequency I can achieve is around 1.25MHz. But this is only when toggling and not doing anything else, which is not applicable to other projects.
1.25 MHz Toggle Signal
You can see that I achieve 1.25 MHz for the signal toggles. (I routed them to a GPIO pin, which is why there are spikes, besides the oscilloscope quality of course.)
FPGA toggles 20ns after the HPS Processor
It can clearly be seen that the FPGA is toggling its own signal (the blue one) 20ns after the HPS has toggled (yellow signal). According to my C Code the HPS should toggle its signal now as soon as it has detected that the FPGA toggled its own signal. But this takes WAY longer than expected. It takes about 400ns for the processor to do so. But the CPU clock is ~1GHz ??
The processor takes way too long
According to htop when running the code one of the two CPU cores is maxed out at 100% load.
HOWEVER when checking for the toggle and also do data manipulation (only bit shifting nothing complex) and setting/reading some 32Bit addresses the data rate drops. I transmitted a data frame and according to the CPU time it took 760us to transmit the 1024 values.
This is the transmission of the data frame.
Like before, the blue signal (FPGA) only takes 20ns to register the toggle of the yellow signal (HPS). The HPS processor, however, takes "years" to do so. I do not calculate anything too complex. The CPU is only running at around 10% when executing this program. If the CPU is really running at 900MHz to 1GHz, it should be way faster than this.
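To make the frame timing easier to compare with the toggle measurements, the 760us for 1024 values can be converted into a per-value time and a rate. This is just the arithmetic written out as a small sketch; the helper names `ns_per_value` and `values_per_second` are made up for illustration.

```c
#include <assert.h>

/* Average time per transmitted value in nanoseconds. */
static double ns_per_value(double total_us, unsigned count)
{
    assert(count > 0);
    return total_us * 1000.0 / (double)count;
}

/* Average number of values transmitted per second. */
static double values_per_second(double total_us, unsigned count)
{
    assert(total_us > 0.0);
    return (double)count * 1e6 / total_us;
}

/* For the frame above:
 *   ns_per_value(760.0, 1024)      -> 742.1875 ns per value
 *   values_per_second(760.0, 1024) -> about 1.35 million values per second
 */
```

At the assumed ~1GHz CPU clock, 742ns per value still corresponds to roughly 740 CPU cycles for each value exchanged.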
Why is the CPU limiting the speed of the C code so drastically? I also uploaded the C code with which I achieved the 1.25MHz transmission speed. There is nothing too complex going on, I think. Where is my problem?