My test setup is:
32-bit data-width TX on-chip memory -> Modular SG-DMA controller (max. burst of 1024 words) -> 32-bit data-width RX on-chip memory.
I get that on-chip memory doesn't support burst transfers on its own, but how does the transfer work in that case?
So I have two separate on-chip memories, but I can only transfer 32k at a time, so for each 32k I have to set up a new transfer.
The reason I can only transfer up to 32k is that the RapidIO core, which is going to be the actual target after the tests, has a maximum TX buffer of 32k. So I have to transfer from address 0x0 up to 0x8000, then start again at 0x0 and go up to 0x8000, and so on.
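To make the chunking concrete, here is a minimal sketch in C of how a larger payload splits into transfers of at most 32k, each one starting again at offset 0x0. The names `WINDOW_BYTES`, `chunk_count` and `chunk_length` are made up for illustration; nothing here touches the actual mSGDMA registers or driver.

```c
#include <assert.h>
#include <stdint.h>

/* 32k window, as imposed by the RapidIO TX buffer described above. */
#define WINDOW_BYTES 0x8000u

/* Number of separate transfers needed to move total_bytes (rounded up). */
static uint64_t chunk_count(uint64_t total_bytes)
{
    return (total_bytes + WINDOW_BYTES - 1) / WINDOW_BYTES;
}

/* Value for the length register of transfer number `index` (0-based).
 * Every transfer starts at offset 0x0 of the window; only the last
 * one can be shorter than WINDOW_BYTES. */
static uint32_t chunk_length(uint64_t total_bytes, uint64_t index)
{
    assert(index < chunk_count(total_bytes));
    uint64_t remaining = total_bytes - index * WINDOW_BYTES;
    return remaining < WINDOW_BYTES ? (uint32_t)remaining : WINDOW_BYTES;
}
```

In a test loop you would call `chunk_length(total, i)` for `i` from 0 to `chunk_count(total) - 1` and program one transfer per iteration.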
So the value of the length register is 32768, and the number of transfers was, for example, 100000. That means I have a lot of overhead for the amount of data I transfer. The modular SG-DMA doesn't have pre-fetching, so I guess this causes a big loss in speed due to the per-transfer overhead?
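A rough way to reason about that overhead, as a sketch only: assume the 32-bit path moves one word per clock cycle in the ideal case, and that every transfer costs some fixed number of extra cycles for setup. Both the one-word-per-cycle model and the `overhead_cycles` parameter are assumptions for illustration, not measured or documented values; the real number would have to come from a measurement on the hardware.

```c
#include <assert.h>
#include <stdint.h>

#define BYTES_PER_WORD 4u  /* 32-bit data width */

/* Estimated bus efficiency in percent (rounded down) for one transfer,
 * assuming one word per cycle plus a fixed setup cost per transfer.
 * overhead_cycles is a hypothetical figure you would have to measure. */
static uint32_t efficiency_percent(uint32_t chunk_bytes, uint32_t overhead_cycles)
{
    assert(chunk_bytes > 0 && chunk_bytes % BYTES_PER_WORD == 0);
    uint64_t data_cycles = chunk_bytes / BYTES_PER_WORD;
    return (uint32_t)((100u * data_cycles) / (data_cycles + overhead_cycles));
}
```

Under this model a 32768-byte transfer is 8192 data cycles, so the setup cost only hurts badly when it is of a similar order: `efficiency_percent(32768, 8192)` gives 50, while `efficiency_percent(32768, 100)` gives 98.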
So I think the normal SG-DMA might be more suitable for this purpose, since it has pre-fetching?
(If they actually are going to use the RapidIO core for production purposes, they will probably write their own DMA controller, so this is mostly to test the speed of the RapidIO core and to see how much speed is lost compared to a memory-to-memory copy, etc.)