For an IIR you could try the leaky integrator:
y(n) = alpha.x(n) + (1-alpha).y(n-1) (requires two mults), or its one-mult equivalent:
y(n) = alpha.[x(n) - y(n-1)] + y(n-1)
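As a quick sanity check that the two forms really are the same filter, here is a minimal floating-point sketch in Python. The function names and the zero initial state are my own choices for illustration, not anything from your design.

```python
def leaky_two_mult(x, alpha, y0=0.0):
    """y(n) = alpha*x(n) + (1-alpha)*y(n-1): two multiplies per sample."""
    y, out = y0, []
    for xn in x:
        y = alpha * xn + (1.0 - alpha) * y
        out.append(y)
    return out


def leaky_one_mult(x, alpha, y0=0.0):
    """y(n) = alpha*(x(n) - y(n-1)) + y(n-1): one multiply per sample."""
    y, out = y0, []
    for xn in x:
        y = alpha * (xn - y) + y
        out.append(y)
    return out


if __name__ == "__main__":
    step = [1.0] * 200                  # unit step input
    a = leaky_two_mult(step, 0.1)
    b = leaky_one_mult(step, 0.1)
    # Both outputs should agree to rounding and settle towards 1.0 (unity dc gain).
    print(max(abs(p - q) for p, q in zip(a, b)), a[-1])
```

In floating point the two forms differ only by rounding; in fixed point the one-mult form also needs one extra bit for the x(n) - y(n-1) difference.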
alpha gives you control of the cutoff while the dc gain stays at unity (alpha normalised to 0~1).
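To put numbers on the cutoff vs. alpha relation, here is a small sketch based on the transfer function H(z) = alpha / (1 - (1-alpha).z^-1), which follows directly from the difference equation. The function names and the sample rate argument fs are assumptions for illustration. The -3dB formula comes from setting |H|^2 = 1/2 and only has a solution for alpha up to about 0.83.

```python
import cmath
import math


def leaky_response(alpha, f, fs):
    """Complex frequency response of y(n) = alpha*x(n) + (1-alpha)*y(n-1) at f Hz."""
    w = 2.0 * math.pi * f / fs
    return alpha / (1.0 - (1.0 - alpha) * cmath.exp(-1j * w))


def leaky_cutoff_hz(alpha, fs):
    """-3 dB frequency from cos(wc) = 1 - alpha^2 / (2*(1 - alpha))."""
    c = 1.0 - alpha * alpha / (2.0 * (1.0 - alpha))
    if c < -1.0:
        raise ValueError("no -3 dB point below fs/2 for this alpha")
    return fs * math.acos(c) / (2.0 * math.pi)


if __name__ == "__main__":
    fs = 48000.0                         # assumed sample rate, illustration only
    for alpha in (0.5, 0.1, 0.01):
        fc = leaky_cutoff_hz(alpha, fs)
        dc_gain = abs(leaky_response(alpha, 0.0, fs))
        print(alpha, fc, dc_gain)        # dc gain stays at 1 for every alpha
```

The same leaky_response function can be evaluated at any other frequency to see how much attenuation a given alpha actually buys there.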
You can also use a power-of-2 value for alpha (alpha = 2^-k) to replace that single mult with a shift, if that cutoff resolution is enough.
If required you can also increase/decrease the dc gain (independently of the cutoff) by scaling the input as per your diagram, or by truncating LSBs off the final output (i.e. dividing it by a power of 2).
For filtering the f1+f2 term you need to attenuate it enough, e.g. by 40dB. 200 taps x 2 sounds like way too much. You don't need to cut off that sharply, since the sum term will be at some distance, assuming you don't have much noise in your signal or that it can be filtered by the rrc. I personally believe you'd be better off using the rrc inside the loop for this purpose.
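A rough integer-only sketch of the power-of-2 case, with alpha = 2^-k so the multiply becomes a right shift. One detail that is my assumption rather than part of the suggestion above: the accumulator keeps k extra fractional bits (it holds y scaled by 2^k), because shifting the difference x(n) - y(n-1) directly leaves the output stuck slightly short of the input once the difference drops below 2^k. The name leaky_shift is hypothetical.

```python
def leaky_shift(x, k):
    """Integer leaky integrator with alpha = 2**-k and no multiplier.

    acc holds y scaled by 2**k, so the update acc += x - (acc >> k)
    is y(n) = alpha*(x(n) - y(n-1)) + y(n-1) in scaled form.
    """
    acc = 0
    out = []
    for xn in x:
        acc += xn - (acc >> k)   # Python's >> on ints is an arithmetic (floor) shift
        out.append(acc >> k)     # drop the k fractional bits to get y(n)
    return out


if __name__ == "__main__":
    y = leaky_shift([1000] * 1000, 4)    # alpha = 1/16, constant input
    print(y[0], y[-1])                   # output climbs from a small value to the input level
```

In hardware the accumulator then needs k more bits than the input, which is the price for the exact unity dc gain.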