The purpose of the project is to cancel echo. Picture two conference rooms. You are in one room and you talk; that signal is x[n]. It reaches the second room, comes out of the speaker, bounces off the walls, and returns to the mic. The returned signal is d[n]. The signal d[n] coming back from the other room contains not only the other person's voice but also some echo of your own.
The goal is to create an adaptive FIR filter that takes your voice, x[n], as an input and produces an output y[n]. This output is subtracted from the returned signal d[n] to yield the error signal e[n] = d[n] - y[n]. The error, together with the input x[n] and a step size, is used to estimate the gradient of the error power. Scaling that gradient and adding it to the current coefficients yields new coefficients that move in the direction of minimum error; this is the standard LMS update.
After enough iterations the filter converges, and its output, subtracted from the returned signal, will largely cancel the echo.
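For reference, the adaptation loop described above is a textbook LMS echo canceller. Here is a minimal software sketch of it, using an assumed 32-tap echo path and white noise standing in for speech (all sizes and the step size are made-up example values, not the project's 1024-tap spec):

```python
import numpy as np

rng = np.random.default_rng(0)

num_taps = 32
mu = 0.01                              # step size (assumed small enough for stability)
h = rng.normal(size=num_taps) * 0.5    # unknown "room" echo path
w = np.zeros(num_taps)                 # adaptive filter coefficients

x = rng.normal(size=5000)              # far-end signal stand-in (white noise)
d = np.convolve(x, h)[:len(x)]         # echo returned to the mic (no near-end talker here)

errs = []
for n in range(num_taps, len(x)):
    x_vec = x[n - num_taps + 1:n + 1][::-1]  # delay line: x[n], x[n-1], ..., x[n-31]
    y = w @ x_vec                            # filter output
    e = d[n] - y                             # error = returned signal minus estimate
    w = w + mu * e * x_vec                   # LMS update: step along the negative gradient
    errs.append(e * e)

# Error power drops sharply once the filter converges on the echo path.
print(np.mean(errs[:200]), np.mean(errs[-200:]))
```

With no near-end talker and matched filter lengths, the residual error decays toward zero; in practice convergence is only approximate, which is why "completely cancel" is optimistic.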
This could be done easily on a DSP processor, but we are to implement it on an FPGA to take advantage of parallelism.
A CPU will grab x[n] and d[n] from the mic inputs on the FPGA board and send them to the logic. The logic will then calculate e[n], and each e[n] value will be sent back through the CPU to the output port.
Due to the length of the return path, a 1024-tap filter is required. However, the professor insisted that we break the filter into 8 blocks and calculate new coefficients for each block separately.
My initial idea was to use muxes and connect the 8 FIR filters in parallel through them. The outputs would be summed together to produce the final value. Each time, 128 values would be sent to one filter, and then the muxes would switch to the next, and so forth.
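One sanity check on that structure: as long as each 128-tap block sees its own 128-sample slice of the delay line, the eight partial outputs summed together are exactly the full 1024-tap output, so the partitioning changes the datapath but not the result. A quick sketch with random stand-in data:

```python
import numpy as np

rng = np.random.default_rng(2)
w = rng.normal(size=1024)        # full coefficient set
x_hist = rng.normal(size=1024)   # delay line: x[n], x[n-1], ..., x[n-1023]

# Full 1024-tap FIR output for this sample
y_full = w @ x_hist

# Eight 128-tap blocks, each dotted with its own slice of the delay line,
# then summed — the adder tree at the block outputs.
y_blocks = sum(w[b * 128:(b + 1) * 128] @ x_hist[b * 128:(b + 1) * 128]
               for b in range(8))

print(np.isclose(y_full, y_blocks))  # → True
```

This suggests the blocks do not need to be time-multiplexed through muxes at all: each block can hold a fixed 128-sample segment of one long shift register and run every cycle, with a fixed adder tree combining the eight outputs.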
But this idea seems clunky, and it feels like there should be a better way. I tried reading IEEE papers, but they aren't very helpful and are way too complex for this project.
I understand Verilog really well; I'm just stuck on the architecture.