I don't know what a Ling adder is, but be careful about optimizing the adder "in general" rather than using the dedicated FPGA hardware. FPGAs have dedicated carry chains, which are the fastest paths available for adders, counters, etc. (That doesn't mean there aren't ways to make things faster. A super-long carry chain will run slower than a short one that then feeds a secondary adder.)
As for optimizations, are you constraining your design with an .sdc file? That's how you tell Quartus what the clock requirements are. Without that, it's just taking a guess. Also, are you using a PLL to make your derived clocks? I would strongly suggest doing that rather than generating them in logic (plenty of posts on ripple clocks here...).
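
In case it helps, here is a minimal sketch of what a starting .sdc might look like. The port name clk_in and the 50 MHz (20 ns) period are assumptions for illustration only, so substitute your own top-level clock port and frequency. Note also that derive_pll_clocks applies to many but not all device families and Quartus editions, so check what your target supports.

```tcl
# Assumed: a 50 MHz input clock on a top-level port named clk_in (hypothetical name)
create_clock -name clk_in -period 20.000 [get_ports clk_in]

# Have the tool create generated-clock constraints for the PLL outputs
# (assumes the derived clocks come from a PLL, as suggested above)
derive_pll_clocks

# Apply the tool's default clock uncertainty values
derive_clock_uncertainty
```

With something like this in the project, the timing report shows slack against the clock you actually asked for instead of a guessed default.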