--- Quote Start ---
I am also having some trouble with the design meeting some of the timing. It's as though I can't add enough timing constraints. When I let the compiler run on its own it fails timing big time. I had been using DSE but sometimes I get better results without it.
--- Quote End ---
Even if you've been using Quartus a long time, it might be helpful to check whether you've overlooked some basic settings for performance.
Start with "Tools --> Advisors --> Timing Optimization Advisor --> Maximum Frequency (fmax)" in Quartus. The Advisor doesn't cover everything, but I often use it to make sure I haven't forgotten one of the usual settings appropriate to get the best possible push-button performance.
If none of the recommendations in the Advisor solves your performance problem, then refer to Volume 2, Section III, Chapter 8, "Area and Timing Optimization", of the Quartus handbook.
I use DSE often for seed sweeps but no longer use it for its other kinds of exploration. I start by turning on all the usual settings, like physical synthesis with extra effort, to get the best possible push-button performance. Then I run a seed sweep to measure the random variation for the design. That gives me a baseline for judging whether a different result after I change something is clearly caused by the change or is possibly just a random difference.
If a seed sweep consistently has timing violations with the usual settings for maximizing push-button performance, most likely a design change, like adding a pipeline stage, is needed. As I said in my previous post, most designs don't benefit from manual placement optimization with LogicLock to improve performance. If I consistently get positive margin, then I might try backing off on the settings that cost significant compile time.
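To make the seed-sweep reasoning concrete, here is a rough Python sketch of how the comparison could be done once you have the worst-case slack from each seed. It is not something DSE produces for you. The function names, the slack numbers, and the "2 standard deviations" threshold are all assumptions made up for illustration, so pick whatever threshold you are comfortable with.

```python
from statistics import mean, stdev

def summarize(slacks_ns):
    """Return (mean, sample standard deviation, worst) of per-seed worst slack."""
    return mean(slacks_ns), stdev(slacks_ns), min(slacks_ns)

def consistently_fails(slacks_ns):
    """True if every seed in the sweep has negative worst-case slack."""
    return all(s < 0 for s in slacks_ns)

def change_is_clear(baseline_ns, changed_ns, k=2.0):
    """True if the mean slack moved by more than k times the baseline
    seed-to-seed standard deviation. k=2.0 is an arbitrary assumption."""
    base_mean, base_sd, _ = summarize(baseline_ns)
    return abs(mean(changed_ns) - base_mean) > k * base_sd

# Hypothetical worst-case slack values in ns, one per seed (not real results).
baseline = [-0.31, -0.12, -0.45, -0.20, -0.27]       # usual settings, 5 seeds
with_pipeline = [0.10, 0.22, 0.05, 0.18, 0.15]       # after a design change
with_minor_tweak = [-0.20, -0.30, -0.15, -0.35, -0.25]

if __name__ == "__main__":
    print("baseline consistently fails:", consistently_fails(baseline))
    print("pipeline change is clear:", change_is_clear(baseline, with_pipeline))
    print("minor tweak is clear:", change_is_clear(baseline, with_minor_tweak))
```

The point is only that a single compile before and after a change tells you very little. The spread across seeds is what lets you tell a real improvement from noise.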
A benefit of either the old bottom-up flow or incremental compilation is that the compile-time-intensive optimizations, especially physical synthesis, don't have to be used on the entire design just because a portion of the design needs them.
Floorplanning with LogicLock (with or without incremental compilation) might help performance on your design because your utilization of logic resources is low. The Fitter's algorithms are optimized for designs that are typically full. As a result, a design with light logic resource utilization sometimes gets spread out too much across the device. In that case, a LogicLock region for a block of hierarchy can be helpful for performance.