Well, it's been a while since I used the Altera DMA core. I assume you've read this:
http://www.altera.com/literature/hb/nios2/n2cpu_nii51006.pdf
and the “Using DMA Devices” section on page 6–24 of this:
http://www.altera.com/literature/hb/nios2/n2sw_nii52004.pdf
When I used the core, I chose to access its registers directly rather than use the higher-level software functions. However, what you have looks good to me, except:
In step 3, you'll want to give the address of the PIO's data register. It so happens that this register is located at the base address of the PIO peripheral. If it weren't, however, you would obtain it using the following macro found in "altera_avalon_pio_regs.h":
IOADDR_ALTERA_AVALON_PIO_DATA(base).
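For illustration, here is a minimal sketch of how that macro could be used to get the address you would hand to the DMA as its read address. The function name pio_dma_read_address is made up for this example, and the #else branch is only a host-side stand-in (an assumption, not the HAL definition) so the snippet compiles off-target. On the Nios II the real header is used.

```c
#include <stdint.h>

#ifdef __NIOS2__
#include "altera_avalon_pio_regs.h"   /* real HAL macro on the target */
#else
/* Host-only stand-in (assumption): the data register sits at offset 0. */
#define IOADDR_ALTERA_AVALON_PIO_DATA(base) \
    ((void *)((unsigned char *)(base) + 0))
#endif

/* Hypothetical helper: returns the address of the PIO data register,
 * i.e. the source address to give the DMA for this PIO. */
void *pio_dma_read_address(uintptr_t pio_base)
{
    return IOADDR_ALTERA_AVALON_PIO_DATA(pio_base);
}
```

You would call it with the PIO's base address from system.h and pass the result wherever your DMA setup expects the read address.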
Now, what you want is for the DMA controller to perform a read from the base address of each PIO in turn. That means it has to be set up with a list of addresses to read from, reading two bytes from each one. You can't have it read from consecutive addresses, because the base addresses of your PIOs are not consecutive. So you can't give it the base address of the first PIO, tell it to do 22 consecutive address reads, and expect to get the data. Does that make sense? That's why I think you need to use the SGDMA core for what you are trying to accomplish. The SGDMA core lets you specify a list of addresses to read from.
I should also add that in your case it would be faster to just read the values directly with the Nios II and not use DMA at all. There is a certain amount of overhead associated with DMA, and it is only efficient when you are moving large chunks of data. So if you are just going to read each PIO once, do some processing in the Nios II, then repeat, drop the DMA, as it's not going to buy you anything. You'll spend way more time setting up the DMA, servicing the IRQ, and waiting for the DMA core to cycle through descriptors than you would just reading the PIO ports directly.
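Reading the ports directly could look something like the sketch below. The function name read_pios and the idea of keeping the base addresses in a table are my own assumptions for the example. IORD_ALTERA_AVALON_PIO_DATA is the HAL macro from "altera_avalon_pio_regs.h", and the #else branch is only a host-side stand-in so the snippet can be compiled and tried off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __NIOS2__
#include "altera_avalon_pio_regs.h"   /* real HAL macro on the target */
#else
/* Host-only stand-in (assumption): plain 32-bit read of the data register. */
#define IORD_ALTERA_AVALON_PIO_DATA(base) (*(volatile uint32_t *)(base))
#endif

/* Hypothetical helper: reads the data register of each PIO in the table
 * and keeps the low 16 bits (two bytes per PIO, as in the question). */
void read_pios(const uintptr_t *pio_bases, size_t count, uint16_t *out)
{
    assert(pio_bases != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = (uint16_t)IORD_ALTERA_AVALON_PIO_DATA(pio_bases[i]);
    }
}
```

On the target you would fill the table with the PIO base addresses from system.h. There is no setup, no descriptors, and no IRQ, just one read per port.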
Jake