Your code is too verbose, and I don't think you are doing it the easy way.
Just use the Altera multipliers and adders.
Say you have 16 taps: design a delay line of 16 input stages, then use 16 Altera multipliers (MegaWizard or inference) to multiply each stage by its coefficient. Then add up the results as a tree: add the 16 products in 8 pairs, then the resulting 8 sums in 4 pairs, then the 4 sums in 2 pairs, then the last pair, inserting a pipeline register at each adder result. Finally, truncate the final sum.
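The structure above can be checked with a small integer model before writing any HDL. This is only a sketch in Python, not FPGA code: the names `adder_tree` and `fir_tree` and the `shift` parameter (number of LSBs dropped from the final sum) are made up for illustration, and the pipeline registers are not modelled, so it shows the arithmetic but not the latency.

```python
def adder_tree(values):
    """Sum values pairwise, stage by stage (16 -> 8 -> 4 -> 2 -> 1)."""
    stage = list(values)
    while len(stage) > 1:
        if len(stage) % 2:      # odd count: pad with zero so pairs line up
            stage.append(0)
        # One adder per pair; in hardware each result would be registered.
        stage = [stage[i] + stage[i + 1] for i in range(0, len(stage), 2)]
    return stage[0]


def fir_tree(samples, coeffs, shift):
    """Integer FIR: delay line, one multiplier per tap, adder tree, truncate."""
    delay = [0] * len(coeffs)           # delay line, one stage per tap
    out = []
    for x in samples:
        delay = [x] + delay[:-1]        # shift the new sample in
        products = [d * c for d, c in zip(delay, coeffs)]
        total = adder_tree(products)
        out.append(total >> shift)      # drop LSBs of the final sum only
    return out


if __name__ == "__main__":
    coeffs = list(range(1, 17))         # 16 arbitrary integer coefficients
    impulse = [1] + [0] * 15
    # An impulse in should read the coefficients back out in order.
    print(fir_tree(impulse, coeffs, 0))
```

Feeding an impulse through the model is a quick way to confirm the tap order matches what you expect from the hardware simulation.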
Use an Altera adder (not XOR gates).
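You can truncate after each multiplier result, or anywhere you wish up to the final sum. The earlier you truncate, the narrower the adders become, but the more truncation error accumulates in the output.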
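To see what the truncation point costs, here is a rough Python comparison for a single output sample. The function names and the `shift` parameter are made up for illustration, and truncation is modelled as simply dropping LSBs (an arithmetic right shift), which is an assumption about how you truncate.

```python
def sum_truncate_late(delay, coeffs, shift):
    """Keep full precision through the adders, truncate the final sum."""
    return sum(d * c for d, c in zip(delay, coeffs)) >> shift


def sum_truncate_early(delay, coeffs, shift):
    """Truncate every multiplier result, then add the narrower values."""
    return sum((d * c) >> shift for d, c in zip(delay, coeffs))


if __name__ == "__main__":
    delay = [3, 3, 3, 3]      # contents of a 4-stage delay line
    coeffs = [1, 1, 1, 1]
    print(sum_truncate_late(delay, coeffs, 2))   # 12 >> 2 = 3
    print(sum_truncate_early(delay, coeffs, 2))  # four times (3 >> 2) = 0
```

With this kind of truncation the early result is never larger than the late one, and the two differ by at most the number of taps minus one (in output LSBs), so it is a trade between adder width and accuracy.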
If your coefficients are symmetrical, you can use half as many multipliers by adding each pair of corresponding input stages first and then multiplying the sum by the shared coefficient.
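The symmetric trick can be sanity-checked the same way. The sketch below is Python, not HDL, and assumes an even number of taps; `fir_direct`, `fir_symmetric`, `half_coeffs` and `shift` are hypothetical names used only here. Note that in hardware each pre-adder sum needs one more bit than the input samples.

```python
def fir_direct(samples, coeffs, shift):
    """Reference: one multiplier per tap."""
    delay = [0] * len(coeffs)
    out = []
    for x in samples:
        delay = [x] + delay[:-1]
        out.append(sum(d * c for d, c in zip(delay, coeffs)) >> shift)
    return out


def fir_symmetric(samples, half_coeffs, shift):
    """Symmetric FIR: pre-add mirrored stages, then one multiply per pair."""
    taps = 2 * len(half_coeffs)
    delay = [0] * taps
    out = []
    for x in samples:
        delay = [x] + delay[:-1]
        # Stage i and stage taps-1-i share coefficient i, so add them first.
        presums = [delay[i] + delay[taps - 1 - i]
                   for i in range(len(half_coeffs))]
        total = sum(p * c for p, c in zip(presums, half_coeffs))
        out.append(total >> shift)
    return out


if __name__ == "__main__":
    half = [1, 2, 3, 4, 5, 6, 7, 8]     # first half of 16 symmetric taps
    full = half + half[::-1]
    data = [4, -2, 7, 0, 3, -5, 9, 1, 6, -8, 2, 5, -1, 3, 7, -4, 8, 0]
    # 8 multipliers should give the same output as 16.
    print(fir_symmetric(data, half, 0) == fir_direct(data, full, 0))
```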