--- Quote Start ---
WRT Tricky's comment, I generally try to stay away from lpm functions as they don't port over to different architectures. Also, I don't see anything 'clever' in my code. The reason I use q_next is that it allows me to calculate the negative / zero flags for an operation before the operation is clocked, thereby saving myself the hassle of trying to figure out which instruction modified the flags. There's no magic here, really.
--- Quote End ---
But did you realize that exposing q_next is exactly what blows up the design? Without it, you arrive at the lpm result.
--- Quote Start ---
And, I am lucky! Still only 17 LE. I must admit this starts looking obscure, so I will not get any brownie points.
--- Quote End ---
Maybe I'm overlooking something, but your third design seems functionally equivalent to the code in post #1, so it's not completely clear where the different resource utilization comes from. In any case, removing the q_next output brings you back to 9 LEs for all design variants.
Without q_next, I prefer a single always block description.
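always @(posedge clk or negedge reset_n)
    if (!reset_n)
        q <= 8'd0;                // asynchronous, active-low reset
    else if (clken)
    begin
        if (load)
            q <= data;            // load has priority over counting
        else if (cnt_en)
        begin
            if (updown)
                q <= q - 1;       // updown = 1: count down
            else
                q <= q + 1;       // updown = 0: count up
        end
    end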
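
For anyone who wants to drop this into a project and check the LE count themselves, here is a minimal sketch of the same always block wrapped in a complete module. The module name and the port declarations are my assumptions; the 8-bit width is inferred from the 8'd0 reset value, and nothing else is changed from the block above.

```verilog
// Hypothetical wrapper: the module name and port list are assumptions,
// only the always block comes from the post above.
module updown_counter (
    input  wire       clk,
    input  wire       reset_n,  // asynchronous, active-low reset
    input  wire       clken,    // clock enable
    input  wire       load,     // load has priority over counting
    input  wire       cnt_en,   // count enable
    input  wire       updown,   // 1 = count down, 0 = count up
    input  wire [7:0] data,     // value loaded when load is high
    output reg  [7:0] q
);

    always @(posedge clk or negedge reset_n)
        if (!reset_n)
            q <= 8'd0;
        else if (clken) begin
            if (load)
                q <= data;
            else if (cnt_en) begin
                if (updown)
                    q <= q - 8'd1;
                else
                    q <= q + 8'd1;
            end
        end

endmodule
```

Note that q_next does not appear anywhere as a port here, which is the point: the next-state logic stays inside the module, so the synthesis tool is free to merge it with the register.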