No images above or below.
Anyway, if you are asking why you would pipeline a multiplier, the answer is to break long combinatorial paths and so achieve a better fmax. One, two or more pipeline stages are an option in any combinatorial path, not just multipliers.
edit: to show the mathematical muscles:
fmax = 1/(
<register to register delay> -
<clock skew delay> +
<micro setup delay> +
<micro clock to output delay>)
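Hence fmax can be increased if the data (register-to-register) delay is lowered. Tsu (micro setup), Tco (micro clock-to-output) and the clock delay are to do with the silicon fabrication of the registers, so the designer has little control over them.
The clock delay can, however, be made worse by the designer through clock gating.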
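
To put rough numbers on the formula, here is a minimal Python sketch. The function name `fmax_mhz` and all the delay values are made up for illustration and do not come from any real device or timing report. The sign of the skew term simply follows the formula as written above.

```python
def fmax_mhz(reg_to_reg_ns, clock_skew_ns, t_su_ns, t_co_ns):
    """Return fmax in MHz for delays given in ns, per the formula above."""
    # Minimum clock period: data delay - clock skew + Tsu + Tco
    period_ns = reg_to_reg_ns - clock_skew_ns + t_su_ns + t_co_ns
    if period_ns <= 0:
        raise ValueError("minimum clock period must be positive")
    return 1000.0 / period_ns  # 1 / ns = GHz, so scale by 1000 for MHz


# Hypothetical delays (ns): an 8 ns multiplier path, no skew, Tsu = Tco = 0.5 ns.
unpipelined = fmax_mhz(8.0, 0.0, 0.5, 0.5)

# Same path split by one pipeline register into two (assumed equal) 4 ns halves.
pipelined = fmax_mhz(4.0, 0.0, 0.5, 0.5)

print(f"unpipelined: {unpipelined:.1f} MHz")
print(f"pipelined:   {pipelined:.1f} MHz")
```

With these assumed values, halving the data delay does not quite double fmax, because Tsu and Tco stay the same and take up a bigger share of the shorter period.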