i believe Tricky mentioned this in one of your other threads.
you can pipeline a design by placing registers between blocks of combinational logic. as you add more registers along a given logic path, the amount of combinational logic between any two registers decreases. less combinational logic between registers means a shorter propagation delay on the slowest register-to-register path, which means you can clock the registers faster and hit a higher fmax.
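to put rough numbers on that, here is a small python sketch of the usual timing relationship: the minimum clock period is roughly the register clock-to-q delay plus the worst logic delay between two registers plus the setup time. the constants and the function name `fmax_mhz` are made up for illustration and ignore routing delay and clock skew, so treat this as a back-of-the-envelope model and not what a timing analyzer reports.

```python
# rough timing model; all delays are hypothetical values in nanoseconds
T_CLK_TO_Q = 0.5  # assumed register clock-to-output delay
T_SETUP = 0.3     # assumed register setup time


def fmax_mhz(stage_logic_delays_ns):
    """Estimate fmax from the logic delay of each register-to-register stage.

    The slowest stage sets the minimum clock period.
    """
    worst_logic = max(stage_logic_delays_ns)
    min_period_ns = T_CLK_TO_Q + worst_logic + T_SETUP
    return 1000.0 / min_period_ns  # a period in ns converts to MHz as 1000 / period


# one long block of logic between two registers
unpipelined = fmax_mhz([9.2])

# the same logic split by one extra register into two shorter stages
two_stage = fmax_mhz([4.2, 5.0])

print(f"unpipelined: {unpipelined:.1f} MHz")
print(f"two stages:  {two_stage:.1f} MHz")
```

in this model only the longest stage matters, so an uneven split (4.2 ns and 5.0 ns here) is limited by the 5.0 ns half.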
this is highly design dependent. stacking input registers, for example, does not break the combinational logic into smaller pieces, so it will not increase fmax. the logic itself has to be segmented, with registers placed between the segments.
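a quick way to see the difference is to model a path as a list of registers and logic delays and look at the longest run of logic between two registers. this is only a sketch: the `REG` marker, the function `worst_logic_delay`, and the delay values are hypothetical, and it assumes the synthesis tool leaves registers exactly where you put them (no register retiming).

```python
REG = "reg"  # hypothetical marker for a register in the path


def worst_logic_delay(path):
    """Return the longest total logic delay (ns) between consecutive registers.

    `path` is a list whose items are either REG or a logic delay in ns.
    """
    worst = 0.0
    current = 0.0
    for item in path:
        if item == REG:
            # a register ends the current run of combinational logic
            worst = max(worst, current)
            current = 0.0
        else:
            current += item
    return max(worst, current)


# made-up delays: three logic blocks of 3, 4 and 2 ns
original = [REG, 3.0, 4.0, 2.0, REG]
stacked_inputs = [REG, REG, REG, 3.0, 4.0, 2.0, REG]  # extra registers at the input
segmented = [REG, 3.0, REG, 4.0, REG, 2.0, REG]       # registers between the blocks

for name, path in [("original", original),
                   ("stacked inputs", stacked_inputs),
                   ("segmented", segmented)]:
    print(f"{name}: worst logic delay = {worst_logic_delay(path)} ns")
```

the stacked version has the same 9 ns worst path as the original, so it gains latency but no fmax, while the segmented version drops the worst path to 4 ns.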