What device are you targeting? My concern is that the books are talking about "number of gates", i.e. an ASIC architecture, in which case they are absolutely correct, because the building block there is so granular. An FPGA, on the other hand, has fixed, prebuilt building blocks. For example, if the device uses 4-input LUTs and you have a 4-input function, you don't save anything by recoding it down to 3 or 2 inputs; it still takes up the whole LUT. I just wanted to make sure you don't spend time on it only to get the same results. (If this is a CPLD, this may not apply, since I don't think the counter logic is as "dedicated" as in the FPGAs. Also, the ALM in Stratix II/III is adaptive, precisely to get around this inefficiency.) It sounds like your number of cells is going down though, so you should be on the right track.
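
To put rough numbers on the LUT point, here is a small Python sketch. It is only a back-of-the-envelope model, not what any synthesis tool actually does: the function name `luts_for_wide_gate` is made up for illustration, and it assumes a simple associative function (a wide AND/OR/XOR) built as a tree of plain k-input LUTs, ignoring carry chains, dedicated muxes, and adaptive cells like the ALM.

```python
def luts_for_wide_gate(n_inputs: int, lut_size: int = 4) -> int:
    """Estimate LUTs needed for an n-input AND/OR/XOR-style function.

    Assumption (illustrative model only): the function is associative, so it
    can be built as a tree in which each k-input LUT merges up to k signals
    into one. Each LUT therefore removes at most (k - 1) signals, and we must
    go from n signals down to 1.
    """
    if n_inputs < 1:
        raise ValueError("n_inputs must be at least 1")
    if lut_size < 2:
        raise ValueError("lut_size must be at least 2")
    if n_inputs <= lut_size:
        # Anything that fits in one LUT costs one whole LUT,
        # no matter how few of its inputs are actually used.
        return 1
    # Integer ceiling of (n - 1) / (k - 1).
    return (n_inputs - 1 + lut_size - 2) // (lut_size - 1)


if __name__ == "__main__":
    for n in (2, 3, 4, 5, 7, 8):
        print(f"{n}-input function -> {luts_for_wide_gate(n)} LUT(s)")
```

With 4-input LUTs this gives 1 LUT for 2, 3, or 4 inputs, 2 LUTs for 5 through 7 inputs, and 3 LUTs for 8. So in this model, trimming a function from 4 inputs to 3 changes nothing, while trimming from 5 to 4 saves a cell.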