--- Quote Start ---
The post-map netlist of your "design" shows that the complete GF_mult can be implemented in two Stratix LUTs (one for each output bit). It also works with Cyclone 4-input LUTs. I don't see a reasonable purpose for preventing this optimization in a real design.
If you do want to disable the FPGA feature of implementing complex logic expressions in a single LUT, though, I fear that keeping the intermediate nodes as logic cells doesn't work in a function, because functions involve a higher level of behavioural description that abstracts away from logic cells. It should be possible by using a component instead.
--- Quote End ---
The reason for preventing this optimization is that I want to see the real total number of gates used in the designs.
Thanks for the advice. I think I should switch from a function to a component.
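
To illustrate the difference between the gate count and the LUT count, here is a small Python sketch. It assumes GF_mult is a GF(2^2) multiplier with reduction polynomial x^2 + x + 1; the thread does not state the field, so this is inferred from the "one LUT per bit, 4-input LUT" remark. The function name and the gate constants are made up for illustration only.

```python
from itertools import product

# Hypothetical gate-level model of a GF(2^2) multiplier.
# Assumption: reduction polynomial x^2 + x + 1 (not stated in the thread).
AND_GATES = 4  # the four partial products below
XOR_GATES = 3  # two XORs for c1, one XOR for c0


def gf4_mult_gates(a1, a0, b1, b0):
    """Multiply (a1*x + a0) by (b1*x + b0) in GF(2^2), bit by bit."""
    p11 = a1 & b1  # coefficient of x^2, reduced using x^2 = x + 1
    p10 = a1 & b0
    p01 = a0 & b1
    p00 = a0 & b0
    c1 = p10 ^ p01 ^ p11  # coefficient of x
    c0 = p00 ^ p11        # constant coefficient
    return c1, c0


def truth_tables():
    """Return the 16-entry truth table of each output bit.

    Each output bit depends on only four inputs (a1, a0, b1, b0),
    so each table fits in a single 4-input LUT.
    """
    c1_table, c0_table = [], []
    for a1, a0, b1, b0 in product((0, 1), repeat=4):
        c1, c0 = gf4_mult_gates(a1, a0, b1, b0)
        c1_table.append(c1)
        c0_table.append(c0)
    return c1_table, c0_table


if __name__ == "__main__":
    c1_table, c0_table = truth_tables()
    print("gates in the structural form:", AND_GATES + XOR_GATES)
    print("LUT mask for c1:", c1_table)
    print("LUT mask for c0:", c0_table)
```

Under that assumption the structural description uses seven two-input gates, while the mapper only needs the two 16-entry truth tables, which is why the post-map netlist reports two LUTs.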