How large is the final design after synthesis? When I've seen long runtimes like this, it's usually code that is extremely "large" going in, and synthesis is expected to reduce everything. For example, I've seen code that probably amounts to over a million LUTs (by whatever method you use to count a LUT) and then gets reduced by a factor of 20 after synthesis. That usually means there is a ton of "support logic" that isn't really necessary. Note that synthesis builds everything into gates first and then optimizes at that level. So, for example, let's say you had a table with hundreds of thousands of locations, but you only used a small subset of them. Synthesis will build tons of gates to make the table, may spend hours doing so, and then at the end realize you only use a few and rip everything else out. This is pretty much the way all synthesizers work, so you need to be cognizant of what you're building. I usually see this with really high-level code.
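As a rough sketch of the table example, here is one way to describe only the locations that are actually used instead of declaring the full table. The entity name, port names, addresses, and data values are all made up for illustration, and whether this actually shortens runtime depends on the tool and on the rest of the design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sparse lookup: 2**18 possible addresses, only three used.
entity sparse_lookup is
  port (
    addr : in  unsigned(17 downto 0);
    data : out std_logic_vector(7 downto 0)
  );
end entity sparse_lookup;

architecture rtl of sparse_lookup is
begin
  -- Only the used entries are listed, so the tool never has to build
  -- (and later strip out) a full 262144-entry table.
  process (addr)
  begin
    case to_integer(addr) is
      when 16#00010# => data <= x"A5";  -- illustrative values only
      when 16#01F00# => data <= x"3C";
      when 16#3FFFF# => data <= x"FF";
      when others    => data <= x"00";  -- every unused location
    end case;
  end process;
end architecture rtl;
```

The alternative, a constant array covering every address with most entries never read, is the kind of description that can get built out in full before being optimized away.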
Another quick thing I've seen make a difference is when people don't bound their types. For example, an integer in VHDL is treated as a 32-bit value, so if you don't bound its size, we'll make it 32 bits wide, and then later on realize you only use 5 of those bits and rip the rest out.
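A minimal sketch of what bounding a type looks like, assuming a counter that only ever needs the values 0 to 31 (5 bits). The entity and signal names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity bounded_counter is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;
    count : out integer range 0 to 31
  );
end entity bounded_counter;

architecture rtl of bounded_counter is
  -- Unbounded version: "signal cnt : integer;" would start out 32 bits wide.
  -- Bounded version: the range tells the tool up front that 5 bits are enough.
  signal cnt : integer range 0 to 31 := 0;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        cnt <= 0;
      elsif cnt = 31 then
        cnt <= 0;           -- wrap explicitly to stay inside the range
      else
        cnt <= cnt + 1;
      end if;
    end if;
  end process;

  count <= cnt;
end architecture rtl;
```

The same idea applies to generics, constants, and loop variables: give them the smallest range that covers the values you actually use.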
Finally, if you have a lot of hierarchy, start synthesizing individual levels on their own and look for problem spots.