--- Quote Start ---
Thanks for responding.
I didn't mention that this was an adaptive filter.
--- Quote End ---
I took a quick look at Wikipedia, the maths are above me :)
--- Quote Start ---
I'm doing convolutions with a circular input buffer and a linear coefficient buffer. The state machine grabs the topmost input sample and the lowest coefficient and, along with the error output, calculates the new coefficient. Then it uses this new coefficient to calculate the next FIR MAC.
When you implement pipelining, it takes so many cycles for a coefficient calculation to complete.
The next FIR MAC calculation requires the latest coefficient to be calculated first. That is the problem I'm facing. If I implement a 5 stage pipeline, for example, I will have to wait 5 cycles before I can perform the MAC with the newly calculated coefficient.
I still don't understand how I can get around this.
--- Quote End ---
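To check that I read the description correctly, here is a rough software sketch of one output sample. It assumes an LMS-style update (new coefficient = old coefficient + step size * error * sample), which your post does not state, so treat that formula as a guess. The names `mu`, `prev_error` and `adaptive_fir_step` are made up for illustration, and the circular buffer is simplified to a plain list.

```python
def adaptive_fir_step(coeffs, samples, prev_error, mu):
    """One output sample: update each coefficient, then use it in the MAC.

    Assumption: LMS-style update. coeffs is modified in place.
    """
    acc = 0.0
    for k in range(len(coeffs)):
        # Update for tap k: uses the old coeffs[k], samples[k] and the error
        # from the previous output sample. It does not use the result of tap k-1.
        coeffs[k] += mu * prev_error * samples[k]
        # MAC with the freshly updated coefficient.
        acc += coeffs[k] * samples[k]
    return acc
```

If your update really has this shape, the update for one tap does not wait on the update for the previous tap, so the taps could be fed through a pipeline back to back and the pipeline latency would be paid once per output sample rather than once per tap. If your update does depend on the previous tap's result, this does not apply.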
Now if you need the results of the previous sample to calculate the next one, pipelining is not going to be much help.
If I understand correctly, you do one coefficient calculation and then run a FIR of 1024 taps, or does every FIR stage need a newly calculated coefficient? In the first case you could pipeline the FIR for as many stages as you see fit and repeat the FIR operation (L / stages) times, with L the filter length. This way you can keep the required clock frequency in the 60 MHz region, which is easy to meet.
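A small software model of that idea, as I picture it: a block of `stages` MAC units is reused (L / stages) times per output sample. This is only a sketch of one possible reading; the function names and the choice of 8 stages in the usage line are hypothetical, and the inner loop stands in for what would run in parallel in hardware.

```python
def fir_direct(coeffs, samples):
    """Reference: plain dot product over all taps."""
    return sum(c * x for c, x in zip(coeffs, samples))


def fir_folded(coeffs, samples, stages):
    """Same dot product, computed with `stages` MAC lanes reused over several passes."""
    taps = len(coeffs)
    if taps % stages != 0:
        raise ValueError("number of taps must be a multiple of stages")
    passes = taps // stages          # L / stages
    lanes = [0] * stages             # one accumulator per MAC lane
    for p in range(passes):          # one pass per clock cycle in hardware
        for s in range(stages):      # these MACs would run in parallel
            k = p * stages + s
            lanes[s] += coeffs[k] * samples[k]
    return sum(lanes)                # final adder combining the lanes
```

For example, `fir_folded(coeffs, samples, 8)` on 1024 taps takes 128 passes and gives the same result as `fir_direct(coeffs, samples)`. More stages means fewer passes per sample and therefore a lower clock rate, at the cost of more multipliers.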