Refer to Altera's "AN 408: DDR2 Memory Interface Termination, Drive Strength, Loading, and Design Layout Guidelines" (https://www.altera.com/ja_jp/pdfs/literature/an/an408.pdf). A balanced tree topology is not very common for DDR layout. Fly-by routing is recommended for the control bus and clock.
There is a further issue: following the clock routing layout guidelines outlined in AN 408. To branch out into a balanced tree topology serving four devices, I suspect you would end up compromising the matched-length and spacing guidelines. I may be wrong, but it certainly won't be as easy as the recommended fly-by routing.
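As a rough illustration of the matched-length concern, here is a minimal Python sketch for sanity-checking branch lengths exported from a layout tool. The function names, the branch lengths and the 1.0 mm tolerance are all made-up placeholders, not values from AN 408; substitute the figures from the app note and your own board.

```python
def length_mismatch(lengths_mm):
    """Return the spread between the longest and shortest trace, in mm."""
    return max(lengths_mm) - min(lengths_mm)


def is_matched(lengths_mm, tolerance_mm):
    """True if every trace is within tolerance_mm of every other trace."""
    return length_mismatch(lengths_mm) <= tolerance_mm


# Hypothetical clock branch lengths (mm) to four DDR2 devices in a tree.
tree_branches_mm = [42.0, 42.5, 44.0, 41.0]

# Placeholder tolerance; NOT taken from AN 408.
TOLERANCE_MM = 1.0

print("mismatch (mm):", length_mismatch(tree_branches_mm))
print("within tolerance:", is_matched(tree_branches_mm, TOLERANCE_MM))
```

With a tree to four devices, every branch has to pass a check like this at once, which is where I'd expect the routing to get awkward.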
As to your second question: yes, you should be able to spread the DDR2 I/O across two banks. This is quite common, especially for wide memory interfaces. However, as for the VREF pins: no, I doubt you'll be able to use them for general-purpose I/O. The SSTL-18 I/O standard that DDR2 requires will automatically cause Quartus to make use of the VREF pins, telling you to connect them to 0.9 V. Remember, you can always use Quartus to qualify a pinout for you. I've just tried reusing the VREF pins for other purposes, albeit for DDR3, but Quartus isn't having any of it.
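For reference, the 0.9 V figure is simply half the 1.8 V I/O supply, since SSTL-18 uses a reference at nominally VDDQ/2. A tiny Python sketch of that arithmetic (the function name is my own, and it ignores the tolerance band the standard allows):

```python
def sstl_vref_nominal(vddq_volts):
    """Nominal SSTL reference voltage, taken as half the I/O supply."""
    return vddq_volts / 2.0


# DDR2 / SSTL-18 runs from a 1.8 V I/O supply.
print(sstl_vref_nominal(1.8))
```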
Cheers,
Alex