Forum Discussion

Altera_Forum
Honored Contributor
8 years ago

Using the restrict keyword

Hi All,

We have implemented four kernels in one .cl file and are trying to optimize them. The AOCL Best Practices Guide suggests using the restrict keyword on pointer arguments whenever possible, so we added it to all four kernels. However, resource utilization increased from 58% to 135%. If we instead apply restrict to only one kernel, performance improves: kernel execution time drops from 98 ms to 50 ms. Is there an alternative to the restrict keyword?

Thanks

3 Replies

  • Altera_Forum
    Honored Contributor

    Restrict should not increase resource usage that much, unless without restrict you get fully sequential operation, while with restrict you get pipelined operation with a high II (initiation interval), which then requires extra resources to buffer data and accommodate the high II. Can you post your kernel area report from before and after adding restrict?
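
    For readers unfamiliar with the keyword, here is a minimal sketch in plain C99 of what restrict promises. It is not the poster's kernel code; the function names (vec_add_may_alias, vec_add_restrict, ranges_overlap) are made up for illustration. In an OpenCL kernel the same qualifier would go on the pointer arguments, for example `__global const float *restrict a`.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical helper: returns 1 if the byte ranges [p, p+bytes) and
       [q, q+bytes) overlap, 0 otherwise. Addresses are compared as integers. */
    int ranges_overlap(const void *p, const void *q, size_t bytes) {
        uintptr_t a = (uintptr_t)p;
        uintptr_t b = (uintptr_t)q;
        return a < b + bytes && b < a + bytes;
    }

    /* Without restrict: the compiler has to assume that out may alias a or b,
       so a store to out[i] could change a value loaded in a later iteration.
       That assumption can prevent it from overlapping (pipelining) iterations. */
    void vec_add_may_alias(const float *a, const float *b, float *out, size_t n) {
        for (size_t i = 0; i < n; ++i)
            out[i] = a[i] + b[i];
    }

    /* With restrict: the caller promises that out does not overlap a or b.
       The compiler does not check this promise; the asserts below only
       check it at run time in debug builds. */
    void vec_add_restrict(const float *restrict a, const float *restrict b,
                          float *restrict out, size_t n) {
        assert(!ranges_overlap(out, a, n * sizeof(float)));
        assert(!ranges_overlap(out, b, n * sizeof(float)));
        for (size_t i = 0; i < n; ++i)
            out[i] = a[i] + b[i];
    }
    ```

    Both functions compute the same result when the buffers are distinct. The only difference is the no-aliasing promise, which is what lets a compiler schedule loads and stores more aggressively; how that translates into pipelining and area on the FPGA depends on the kernel.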

  • Altera_Forum's avatar
    Altera_Forum
    Icon for Honored Contributor rankHonored Contributor

    Please find below the resource utilization numbers with and without restrict.

    With Restrict

    Family : Arria 10

    Device : 10AX115H3F34E2SG

    Timing Models : Final

    Logic utilization (in ALMs) : 258,918 / 427,200 ( 61 % )

    Total registers : 456008

    Total pins : 155 / 618 ( 25 % )

    Total virtual pins : 0

    Total block memory bits : 19,882,664 / 55,562,240 ( 36 % )

    Total RAM Blocks : 2,312 / 2,713 ( 85 % )

    Total DSP Blocks : 192 / 1,518 ( 13 % )

    Total HSSI RX channels : 4 / 24 ( 17 % )

    Total HSSI TX channels : 4 / 24 ( 17 % )

    Total PLLs : 30 / 80 ( 38 % )

    Without Restrict

    Family : Arria 10

    Device : 10AX115H3F34E2SG

    Timing Models : Final

    Logic utilization (in ALMs) : 249,039 / 427,200 ( 58 % )

    Total registers : 453705

    Total pins : 155 / 618 ( 25 % )

    Total virtual pins : 0

    Total block memory bits : 12,301,448 / 55,562,240 ( 22 % )

    Total RAM Blocks : 1,755 / 2,713 ( 65 % )

    Total DSP Blocks : 192 / 1,518 ( 13 % )

    Total HSSI RX channels : 4 / 24 ( 17 % )

    Total HSSI TX channels : 4 / 24 ( 17 % )

    Total PLLs : 30 / 80 ( 38 % )
  • Altera_Forum
    Honored Contributor

    I was talking about the compiler's resource estimation report, which includes the loop analysis and line-by-line estimated area usage, and which shows the estimated area usage increasing from 58% to 135%.

    The post-place-and-route area utilization is not much different in your case anyway.