Forum Discussion

MAsla5
6 years ago

What does Performance saturation mean when we Increase SIMD size?

Hi,

I am accelerating my application on an Altera FPGA. When I go with SIMD 32, the resource usage drops instead of increasing. I read somewhere that this is performance saturation. My question is: how can I prove it? Where can I find the answer to this question? Could I find it somewhere in the report?

Thank you.

10 Replies

  • MEIYAN_L_Intel

    Hi,

    May I clarify what you mean by "when I go with SIMD 32 the resources drop instead of increasing" — do you mean that you are increasing the number of SIMD work items to 32, am I right?

    Thanks

  • MAsla5

    Hi,

    Yes, you're right. Usually, if resource usage exceeds one hundred percent, the offline compilation is terminated, but what happens in the case of SIMD 32?

    Thanks!

  • MAsla5

    Hi,

    When I look at the reports, only two memory banks are created. In the case of SIMD 16, I can see 16 memory banks in the reports.

    Is there a memory-bound issue? If so, what is it? Please guide me on this matter.

    Thanks!

  • MEIYAN_L_Intel

    Hi,

    May I have the kernel code and report file for further investigation?

    Thanks

    • MAsla5

      Hi,

      You can see the same behavior with Intel's matrix multiplication design example. I have checked with that as well.

      Thanks!

    • HRZ

      That is because the compiler does not support SIMD sizes above 16; if you choose such a SIMD size, it will automatically revert to a SIMD size of 1, and hence resource utilization will decrease. There should be a warning about this in the compilation log, or at least there was one before. A lot of the important warnings have been removed in newer versions of the compiler; I hope this one is still there.

      Of course, there is zero logical reason to have any restriction on SIMD size for FPGAs since, unlike GPUs, FPGAs do not have a fixed architecture; however, it has been like this since the very first version of the compiler and will probably never change.

      • HRZ

        Not really; this has nothing to do with memory bandwidth. It is an artificial compiler limitation. The following compiler warning is generated when compiling your kernel:

        Compiler Warning: Kernel Vectorization: requested number of SIMD work items is larger than  ... cannot vectorize efficiently beyond OpenCL widest vector type.

        If you write the kernel using the single work-item model and use an unroll factor of 32, which has a similar effect to a SIMD size of 32 in an NDRange kernel, the kernel will compile just fine, and the area usage will keep increasing as you increase the unroll factor. Depending on your kernel and FPGA size, you might not be able to fit the design with any SIMD size at all (even 1), or you might still be able to fit it with a hypothetical SIMD size of 32 or more. The compiler cannot know whether your design will fit without placing and routing it; hence, it will not terminate the compilation even if some resource is expected to be overutilized. Note that the area utilization numbers you get from the "-report" switch are estimates, and the final area utilization could be higher or lower.
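        As a minimal sketch of the two styles (using a hypothetical vector-add kernel, not the original poster's code), the NDRange version requests SIMD vectorization through a kernel attribute, which the compiler caps at 16, while the single work-item version uses an unroll pragma, which has no such cap:

        ```c
        // NDRange kernel: SIMD width is requested via an attribute.
        // num_simd_work_items requires reqd_work_group_size, and the
        // work-group size must be divisible by the SIMD width; values
        // above 16 trigger the warning quoted above and revert to 1.
        __attribute__((num_simd_work_items(16)))
        __attribute__((reqd_work_group_size(64, 1, 1)))
        __kernel void vec_add_ndrange(__global const float *restrict a,
                                      __global const float *restrict b,
                                      __global float *restrict c)
        {
            size_t i = get_global_id(0);
            c[i] = a[i] + b[i];
        }

        // Single work-item kernel: the unroll factor is not capped, so
        // 32 (or more) compiles fine, and area scales with the factor.
        __kernel void vec_add_swi(__global const float *restrict a,
                                  __global const float *restrict b,
                                  __global float *restrict c,
                                  const int n)
        {
            #pragma unroll 32
            for (int i = 0; i < n; i++)
                c[i] = a[i] + b[i];
        }
        ```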

        Memory bandwidth depends on a lot of factors, only one of which is SIMD/unroll size. You can find a comprehensive analysis of memory performance on Intel FPGAs in the following document:

        https://arxiv.org/abs/1910.06726