Forum Discussion
Altera_Forum
Honored Contributor
14 years ago

No matter which hardware-based approach you take, there will always be some overhead, but with a bit of planning and an algorithm that pipelines well you can hide it. If your algorithm has few or no data dependencies you can typically pipeline it, which gives you potentially high throughput at the expense of computation latency. To overcome the latency, just send data to the data engine as fast as possible; if your input data can be placed in a memory buffer, then streaming it with a DMA engine (or building the master directly into your data engine) is the most efficient way to do this.
Some of what I just mentioned is covered in this document: http://www.altera.com/literature/hb/nios2/edh_ed5v1_03.pdf