This makes sense.
If you configure a FIFO-buffered rx, the FIFO is implemented even if the rx input line is not connected, since the FIFO can still be accessed from the Avalon MM slave port; that's why you must explicitly disable it.
Please note that a fixed resource cost comes from the uart clock and its synchronization with the Avalon slave clock, and I believe this accounts for a large part of those LEs.
So I expect the total resource usage would have been similar even with a dual speed uart, since this clock-related part would have to be replicated for the rx and tx clocks.
The resources that are really wasted are those needed to implement the slave port registers which are no longer used (e.g. the rx registers on the tx-only uart); they cannot be optimized away since you can still read/write them.
For example, let's suppose that a complete uart requires 500 LE, divided this way:
uart clock control and synchronization: 200 LE
rx stage: 150 LE
tx stage: 50 LE
decoder and registers: 100 LE
With a dual speed ip you'd have to replicate the first block, so you'd get a total of 700 LE.
With two single speed uarts you'd start from 2 x 500 = 1000 LE, but you save one rx stage (on the tx-only uart) and one tx stage (on the rx-only uart), so you need a total of 800 LE.
Clearly, the benefit of a single dual speed ip depends on the actual resource division, which I don't know.
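
To make the arithmetic above easier to replay with other numbers, here is a small Python sketch. The LE figures are only the made-up values from my example, not measurements, and the function names are just mine for illustration; the model also assumes that the unused rx/tx stage is fully removed while the decoder and registers are always kept.

```python
# Rough LE budget model for the example above.
# All default figures are the hypothetical ones from the post, not measured values.

def full_uart(clock_sync=200, rx=150, tx=50, regs=100):
    """One complete single speed uart (rx + tx)."""
    return clock_sync + rx + tx + regs

def dual_speed_ip(clock_sync=200, rx=150, tx=50, regs=100):
    """Hypothetical single ip with separate rx and tx clocks:
    the clock/synchronization block is replicated, everything else is shared."""
    return 2 * clock_sync + rx + tx + regs

def two_single_speed_uarts(clock_sync=200, rx=150, tx=50, regs=100):
    """One rx-only uart plus one tx-only uart.
    Each keeps its own clock block and its own decoder/registers,
    since the registers cannot be optimized away."""
    rx_only = clock_sync + rx + regs
    tx_only = clock_sync + tx + regs
    return rx_only + tx_only

if __name__ == "__main__":
    print("complete uart:         ", full_uart(), "LE")
    print("dual speed ip:         ", dual_speed_ip(), "LE")
    print("two single speed uarts:", two_single_speed_uarts(), "LE")
```

Under these assumptions the two options differ exactly by one decoder/register block, so the saving from a dual speed ip gets smaller the more the clock and synchronization logic dominates the total.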