your input statement is correct:
When using 2's complement (signed), your 12 bit input can swing between 0 ± 2047
your coeff statement is incomplete:
dc gain of your filter(strictly speaking your convolution) in hardware is the sum of all coefficients as scaled for hardware. Your figure of 0.98707 suggests a normalised dc gain.
What happens to this gain in hardware depends on how you scale there: your convolution gain goes up by the same factor used to scale the coeffs.
for example if your scale factor = 1022 (511/.5, assuming peak normalised coeff = .5) then convolution gain becomes .98707 x 1022 = 1009 in hardware.
then max convolution output = 1009 x ±2047 = ±2064984 (worked with the unrounded gain of 1008.79)
this requires 22 bits including the sign bit (2^21 = 2097152)
Hence you need to keep all Msbs and truncate 8 Lsb
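If it helps, the arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name bits_for_signed_peak is mine, and it follows the example in taking the dc gain as the bound on output growth and a peak normalised coeff of 0.5.

```python
import math

def bits_for_signed_peak(peak):
    """Smallest two's complement width that can hold +/- peak (peak is a positive int)."""
    magnitude_bits = math.ceil(math.log2(peak + 1))
    return magnitude_bits + 1  # one extra bit for the sign

normalised_dc_gain = 0.98707   # sum of normalised coeffs, figure from the question
scale_factor = 511 / 0.5       # = 1022, assuming peak normalised coeff = 0.5
input_peak = 2047              # 12 bit signed input

hardware_dc_gain = normalised_dc_gain * scale_factor   # about 1009
max_output = hardware_dc_gain * input_peak             # about 2064984
width = bits_for_signed_peak(math.ceil(max_output))

print(round(hardware_dc_gain), round(max_output), width)
```

Changing scale_factor here shows directly how the required accumulator width moves with the coefficient scaling.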
You can fine tune the convolution sum under your control through this scale factor.
A practical scale factor is the one that keeps unity filter dc gain after Lsb truncation i.e. a scale factor that equals your division(Lsb truncation).
In my example I removed 8 Lsb (this means division by 2^8 = 256). Hence the filter, which was scaled up by 1022 during convolution but then divided down by 256 after truncation, gets a dc gain of 1022/256 = nearly 4 in hardware (more precisely .98707 x 1022/256 = 3.94). In short: scale the normalised coeffs by a value such that their sum = 2^n (512 if 10 bits used) rather than scaling to the peak.
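To compare the two scaling choices with the numbers from this thread, here is a rough sketch. The variable names are mine, and the choice of 9 truncated bits for the sum = 512 case is an assumption that follows from the rule that the scale factor should equal the truncation.

```python
normalised_dc_gain = 0.98707   # sum of normalised coeffs, figure from the question

# Peak-based scaling as in the example: factor 1022, then drop 8 LSBs.
peak_factor = 1022
gain_peak_scaled = normalised_dc_gain * peak_factor / 2**8

# Sum-based scaling: choose the factor so the scaled coeffs sum to 2**9 = 512,
# then drop 9 LSBs so the division cancels the scaling.
shift_bits = 9
sum_factor = 2**shift_bits / normalised_dc_gain
gain_sum_scaled = normalised_dc_gain * sum_factor / 2**shift_bits

print(round(gain_peak_scaled, 2), round(sum_factor, 1), round(gain_sum_scaled, 2))
```

In real hardware the coeffs are rounded to integers after scaling, so the sum may land a count or two off 2^n and the dc gain will be very close to, but not exactly, unity.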
It will help if you break the above into three stages:
normalisation(at modelling)
scaling then convolution(mult-add in hardware)
truncation(division by lsb truncation) to match signal resolution
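The three stages can be modelled in software before going to hardware. The sketch below is only an illustration: the 5-tap coefficient set is hypothetical (not your filter), the function names are mine, and plain Python integers stand in for the hardware accumulator.

```python
def normalise(coeffs):
    """Stage 1 (modelling): divide by the coefficient sum so dc gain is 1."""
    total = sum(coeffs)
    return [c / total for c in coeffs]

def scale(coeffs, shift_bits):
    """Stage 2a: integer coeffs whose sum is about 2**shift_bits."""
    return [round(c * (1 << shift_bits)) for c in coeffs]

def convolve(samples, int_coeffs):
    """Stage 2b: integer multiply-add with a full-precision accumulator."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(int_coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out

def truncate(accs, shift_bits):
    """Stage 3: drop LSBs; >> on Python ints is an arithmetic shift (floor division by 2**shift_bits)."""
    return [a >> shift_bits for a in accs]

raw = [1, 4, 10, 4, 1]     # hypothetical low-pass design, peak normalised coeff = 0.5
shift_bits = 9             # target coefficient sum of 512
taps = scale(normalise(raw), shift_bits)

step = [2047] * 16         # full-scale 12 bit dc input
settled = truncate(convolve(step, taps), shift_bits)[-1]

print(taps, sum(taps), settled)
```

Once the step has filled all taps, the truncated output should sit at or very near the input level, which is the unity dc gain check.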
Details of gain control described here:
http://www.digital-filters.co.uk/static/filter_gain