pipiwau,
I think the answer lies in the DMA master signals. The DMA can only write to your memory continuously if it can read its source data continuously. If you look at the DMA's read_waitrequest, it goes high after several reads. Once that happens, the DMA has to stall until more read data arrives, so it also has to stall its writes to your memory.
Perhaps the memory you are reading from is shared with another master that is accessing it at the same time? If so, you might consider using the Avalon arbitration settings in SOPC Builder to give the DMA, for example, '20' arbitration shares on the read memory while the other master gets '1'. This ensures that once the DMA is granted access to that memory, it gets up to 20 continuous accesses before the other master gets a turn. Of course, this may not be the problem if the memory or peripheral you're reading from has some sort of intrinsic delay after the first several accesses.
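
To illustrate what the share settings do, here is a rough toy model in Python. It is only a sketch, not the actual Avalon fabric logic: the function names and the master names "dma" and "cpu" are made up for illustration, and it assumes both masters request on every cycle and every transfer takes one cycle.

```python
def simulate_arbitration(shares, cycles):
    """Toy share-based round-robin arbiter (illustrative only).

    shares: dict of master name -> consecutive transfers allowed per grant.
    Assumes every master requests on every cycle.
    Returns a list naming the master that owns the slave on each cycle.
    """
    masters = list(shares)
    grants = []
    current = 0
    remaining = shares[masters[current]]
    for _ in range(cycles):
        grants.append(masters[current])
        remaining -= 1
        if remaining == 0:
            # Shares used up: hand the slave to the next master.
            current = (current + 1) % len(masters)
            remaining = shares[masters[current]]
    return grants


def longest_run(grants, master):
    """Longest streak of back-to-back cycles granted to one master."""
    best = run = 0
    for g in grants:
        run = run + 1 if g == master else 0
        best = max(best, run)
    return best


# Hypothetical setups: equal shares vs. 20:1 in favour of the DMA.
equal = simulate_arbitration({"dma": 1, "cpu": 1}, 42)
weighted = simulate_arbitration({"dma": 20, "cpu": 1}, 42)

print("equal shares, longest DMA burst:", longest_run(equal, "dma"))
print("20:1 shares,  longest DMA burst:", longest_run(weighted, "dma"))
```

With equal shares the DMA never gets two accesses in a row in this model, so its reads are interrupted every other cycle. With 20:1 it keeps the memory for 20 transfers at a time and only gives up one cycle in every 21.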