It's unfortunate, but it's true. The HAL DMA interface is much more complicated than the old Nios I example functions. There are two reasons for this:
1. The HAL uses a callback interface rather than blocking until the DMA completes. This allows the processor to continue execution and do something useful while a DMA is underway - this is, after all, the reason for using a DMA rather than memcpy().
2. A DMA process is described in terms of two half DMAs. While this is a little unnatural when looking at the altera_avalon_dma component, it provides a much more general infrastructure which should be able to support future DMA controllers.
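To make the two points above concrete, here is a minimal host-side sketch of the idea. It is not the HAL API: every name in it (mock_dma, mock_rx_prepare, mock_tx_send, mock_dma_service, mark_done) is made up for illustration, and the "hardware" is just a memcpy. It only shows that each half is posted separately, that posting returns straight away, and that completion is reported through a callback rather than by blocking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mock, NOT the HAL API: one DMA modelled as two halves. */
typedef void (*mock_done_fn)(void *handle);

typedef struct {
    const void  *tx_src;  /* transmit half: where the data is read from  */
    void        *rx_dst;  /* receive half: where the data is written to  */
    size_t       len;     /* transfer length in bytes                    */
    mock_done_fn done;    /* callback invoked when the transfer finishes */
    void        *handle;  /* passed back to the callback unchanged       */
} mock_dma;

/* Post the receive half. Returns immediately; nothing is copied yet. */
void mock_rx_prepare(mock_dma *d, void *dst, size_t len,
                     mock_done_fn done, void *handle)
{
    assert(d != NULL && dst != NULL);
    d->rx_dst = dst;
    d->len    = len;
    d->done   = done;
    d->handle = handle;
}

/* Post the transmit half. Also returns immediately. */
void mock_tx_send(mock_dma *d, const void *src)
{
    assert(d != NULL && src != NULL);
    d->tx_src = src;
}

/* Stands in for the hardware plus its interrupt handler: once both
 * halves are posted, move the data and fire the callback.
 * Returns 1 if a transfer completed, 0 if a half is still missing. */
int mock_dma_service(mock_dma *d)
{
    if (d->tx_src == NULL || d->rx_dst == NULL) {
        return 0;
    }
    memcpy(d->rx_dst, d->tx_src, d->len);
    d->tx_src = NULL;
    d->rx_dst = NULL;
    if (d->done != NULL) {
        d->done(d->handle);
    }
    return 1;
}

/* Example callback: the handle points at a flag the main code can poll. */
void mark_done(void *handle)
{
    *(volatile int *)handle = 1;
}
```

In use, the main code would post both halves, carry on with other work, and check the flag set by mark_done when it actually needs the data.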
To solve your particular problem, I'd recommend providing a separate DMA device for each peripheral you wish to communicate with - so in your case you'd have three channels. That way you only call the ioctl at most once for each channel, and you avoid this problem.
Even if the bug were not there, I believe you'd find that this becomes necessary in any case once you start putting together a real system. Given that all the transfers are being processed in parallel with your main code execution, you'll find that it becomes difficult to synchronise the DMA activity so that you call the ioctls at the right time.
You'll also have the problem of figuring out where incoming packets have come from. With one channel per peripheral this is easy, because there's a one-to-one correspondence between DMA channel and source.
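As a rough illustration of that one-to-one correspondence, the sketch below keeps one context per channel and passes its address as the callback handle, so the callback knows the source without inspecting the data. The names (channel_ctx, channels, channels_init, on_rx_done) are hypothetical, and the callback takes a handle and a data pointer only because that is the general shape of a receive-done callback as I understand it; check the HAL headers for the real prototype before relying on it.

```c
#include <assert.h>
#include <stddef.h>

/* Three peripherals, so three channels, as in the case discussed here. */
enum { NUM_SOURCES = 3 };

/* Hypothetical per-channel context: one instance per DMA channel. */
typedef struct {
    int    source_id;  /* which peripheral this channel serves */
    size_t packets;    /* completed receives seen on this channel */
} channel_ctx;

channel_ctx channels[NUM_SOURCES];

/* Give every channel its own identity before any transfer is posted. */
void channels_init(void)
{
    for (int i = 0; i < NUM_SOURCES; i++) {
        channels[i].source_id = i;
        channels[i].packets   = 0;
    }
}

/* Receive-done callback. The handle is the context registered for the
 * channel that completed, so the source is known immediately. */
void on_rx_done(void *handle, void *data)
{
    channel_ctx *ctx = handle;
    assert(ctx != NULL);
    (void)data;        /* the buffer contents are not needed to find the source */
    ctx->packets++;
}
```

When posting a receive on channel i you would pass &channels[i] as the handle, and nothing ever has to be reconfigured between transfers.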
I hope that makes sense, and solves your problem.