Forum Discussion
Altera_Forum
10 years ago

--- Quote Start ---
That would also explain why the same algorithm with the boolean operator exploded in size. Would it be better to optimize fixed point kernels by loading/storing them as 32-bit integers (as 4 chars packed together) and then separating them only for the internal arithmetic of the kernel to keep the alignment at 4 bytes? Or would a char4 vector data type accomplish the same task?
--- Quote End ---

The boolean operator case is different. Because of the logical dependence, the second load operation has a control dependence on the first one. This uses a different (and more expensive) type of load/store unit.

Yes, loading/storing larger types (int, or char4) would solve the alignment problem at the expense of wasted memory.
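To make the packing idea from the quoted question concrete, here is a rough sketch in plain host-side C, not actual kernel code. The helper names (pack4, lane, add4, add_packed) are made up for illustration, and unsigned 8-bit lanes with wraparound arithmetic are an assumption chosen to keep the example simple. It only shows the access pattern: one 4-byte-aligned load and store per four 8-bit values, with the lanes separated only for the arithmetic. In an OpenCL kernel a char4 or uchar4 value would presumably play the role of the packed word.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack four 8-bit values into one 32-bit word; lane 0 is the lowest byte. */
static uint32_t pack4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint32_t)a | ((uint32_t)b << 8) |
           ((uint32_t)c << 16) | ((uint32_t)d << 24);
}

/* Extract one 8-bit lane (index 0..3) from a packed word. */
static uint8_t lane(uint32_t w, unsigned i)
{
    return (uint8_t)(w >> (8u * i));
}

/* Add two packed words lane by lane; each lane wraps modulo 256. */
static uint32_t add4(uint32_t x, uint32_t y)
{
    return pack4((uint8_t)(lane(x, 0) + lane(y, 0)),
                 (uint8_t)(lane(x, 1) + lane(y, 1)),
                 (uint8_t)(lane(x, 2) + lane(y, 2)),
                 (uint8_t)(lane(x, 3) + lane(y, 3)));
}

/* Hypothetical kernel body: every memory access is a whole 32-bit word,
 * so loads and stores stay 4-byte aligned. */
static void add_packed(const uint32_t *in_a, const uint32_t *in_b,
                       uint32_t *out, size_t n_words)
{
    for (size_t i = 0; i < n_words; ++i) {
        out[i] = add4(in_a[i], in_b[i]);
    }
}
```

Calling add_packed on arrays of n_words packed words processes 4 * n_words 8-bit elements. If the element count is not a multiple of four, the last word has to be padded, which is one place the wasted memory mentioned above can come from.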