I've found that the BFM provided by Macnica, one of Altera's distributors, is the easiest to use; it's designed specifically for Altera FPGAs. The free version always downtrains the link to Gen1x1. The little "installer" that comes with it uses the demo/example designs, so a Gen2x2 design will still work, but you'll only get Gen1x1 rates until you pay. You do have to use Quartus 12.0 or newer, though (they have versions that work with older Quartus tools, but you'd have to contact them for those).
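To put the Gen1x1 limit in perspective, here is a rough back-of-the-envelope calculation of the theoretical link bandwidth per direction. It uses only the standard PCIe line rates (2.5 GT/s for Gen1, 5 GT/s for Gen2) and 8b/10b encoding; the function name is just for illustration, and the numbers ignore TLP/DLLP protocol overhead, so real throughput will be lower.

```python
# Per-lane line rate in megatransfers per second for PCIe Gen1 and Gen2.
LINE_RATE_MTPS = {1: 2500, 2: 5000}


def link_bandwidth_mbytes(gen, lanes):
    """Theoretical bandwidth in MB/s per direction, before protocol overhead."""
    # 8b/10b encoding: only 8 of every 10 bits on the wire carry data.
    data_mbits_per_lane = LINE_RATE_MTPS[gen] * 8 // 10
    # Multiply by lane count, then convert megabits to megabytes.
    return data_mbits_per_lane * lanes // 8


if __name__ == "__main__":
    print("Gen1 x1:", link_bandwidth_mbytes(1, 1), "MB/s")
    print("Gen2 x2:", link_bandwidth_mbytes(2, 2), "MB/s")
```

So a design built for Gen2x2 that gets downtrained to Gen1x1 has roughly a quarter of its theoretical link bandwidth available in simulation.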
You can also get a "simple design example" from them, which I found much more useful than the chaining DMA stuff because it's intuitive: they separate the design from the testbench and do simple reads/writes, which is good for getting started. Everything is script driven, so you can do complex things or just do simple read/write stuff like "read <address>" / "write <address>, <data>."
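To give a feel for what "script driven" means here, below is a toy Python sketch of a command interpreter that handles the two commands quoted above. This is NOT DrivExpress's actual script syntax or API, which I'm not reproducing here; the function name `run_script`, the `#` comment handling, and the dict standing in for the device's address space are all made up for illustration.

```python
def run_script(script, memory):
    """Run a toy read/write script against a dict used as fake device memory.

    Hypothetical syntax, one command per line:
        write <address>, <data>
        read <address>
    Returns the list of values produced by the read commands.
    """
    results = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        command, _, args = line.partition(" ")
        fields = [f.strip() for f in args.split(",")]
        if command == "write" and len(fields) == 2:
            # int(x, 0) accepts both decimal and 0x-prefixed hex.
            memory[int(fields[0], 0)] = int(fields[1], 0)
        elif command == "read" and len(fields) == 1 and fields[0]:
            # Unwritten addresses read back as 0 in this toy model.
            results.append(memory.get(int(fields[0], 0), 0))
        else:
            raise ValueError("unrecognized command: %r" % line)
    return results


if __name__ == "__main__":
    mem = {}
    print(run_script("write 0x10, 0xDEADBEEF\nread 0x10", mem))
```

In the real tool the commands turn into PCIe transactions against your design rather than dict lookups, but the workflow is the same idea: a plain-text list of reads and writes instead of hand-written testbench code.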
They call it "DrivExpress". The free version often does what you want, but they'll work with you on the price if you need to simulate lots of data or higher throughput. Google "DrivExpress" and it should show up near the top.