Forum Discussion

Altera_Forum
11 years ago

aoc crashes when compiling fft2d.cl example with -O3

Hi,

When I try to compile the fft2d example with -O3, I get the error message "Error: System integrator FAILED". This is with version 13.1.3 Build 178 on Windows. Without the -O3 option, compilation succeeds. The fft2d.log file contains the following:

Channel Pairing Type Does not match

Channel chanin0 needs one chan_read_altera and one chan_write_altera.

Multiple chan_read_altera/chan_write_altera with the same channel ID may cause this problem.

Assertion failed: 0, file custom_ic_impl.cpp, line 3280

This application has requested the Runtime to terminate it in an unusual way.

Please contact the application's support team for more information.

Any comments or suggestions on how to fix this? Thanks.

Dominic

3 Replies

  • Altera_Forum

    Apparently this kernel should be compiled with the options "-fpc=true --sw-dimm-partition" rather than with "-O3". This is in the README. I guess I should have read the README more carefully.

    Cheers,

    Dominic
  • Altera_Forum

    That's correct. All of the example designs on the Altera site have README files, and most of them require additional flags to improve the kernel that is generated. In the case of the 2D FFT, the memory is partitioned instead of using the default interleaved behavior (--sw-dimm-partition), and the floating-point math hardware is optimized to avoid intermediate rounding operations (often called "fused math") that consume additional hardware (--fpc).

    Was -O3 the only flag you passed in? I suspect I know what happened, but we should have issued a more user-friendly message in that case, so I would like to reproduce this on my end. That example uses a mix of NDRange and single work-item (task) kernels, so I think that when -O3 was passed in, the compiler attempted to optimize the task as if it were an NDRange kernel. In general I would avoid -O3 when a task kernel is involved: a task operates on a single work-item, so there is no opportunity for the compiler to throw more hardware at the kernel to improve performance.
  • Altera_Forum

    Thanks for the feedback. Yes, -O3 was the only flag passed to aoc (besides the board choice).

    Cheers,

    Dominic
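
For anyone hitting the same assertion: the log in the original post says each channel needs exactly one read and one write call site. Below is a minimal Python sketch of a textual check for that rule, under some loud assumptions. The function names `chan_read_altera` and `chan_write_altera` are copied from the log message and may not match the names used in the kernel source, so they are parameters. The helper names and the sample snippet are hypothetical and not taken from fft2d.cl. The check only counts call sites in the text: it does not see indexed channel arrays, macros, or anything the compiler replicates during optimization, so a clean result does not rule out the failure.

```python
import re
from collections import Counter


def channel_endpoint_counts(source, read_fn="chan_read_altera",
                            write_fn="chan_write_altera"):
    """Count textual read/write call sites per channel name in kernel source."""
    # Capture the first argument of each call, which is taken to be the channel name.
    reads = Counter(re.findall(r"\b%s\s*\(\s*(\w+)" % re.escape(read_fn), source))
    writes = Counter(re.findall(r"\b%s\s*\(\s*(\w+)" % re.escape(write_fn), source))
    names = sorted(set(reads) | set(writes))
    return {name: (reads[name], writes[name]) for name in names}


def unpaired_channels(source, **kwargs):
    """Return channels that do not have exactly one read and one write call site."""
    counts = channel_endpoint_counts(source, **kwargs)
    return {name: rw for name, rw in counts.items() if rw != (1, 1)}


# Hypothetical kernel snippet, for illustration only (not from fft2d.cl).
SAMPLE = """
__kernel void producer() { chan_write_altera(chanin0, 1.0f); }
__kernel void consumer_a() { float x = chan_read_altera(chanin0); }
__kernel void consumer_b() { float y = chan_read_altera(chanin0); }
"""

# Each value is (number of reads, number of writes) for that channel.
print(unpaired_channels(SAMPLE))
```

In the sample, `chanin0` is read in two kernels and written in one, so it is reported as unpaired. To run the check on a real file, pass the file's text as `source` and set `read_fn` and `write_fn` to whatever the kernel source actually calls.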