Forum Discussion

Altera_Forum
Honored Contributor
10 years ago

constant memory usage

Hi,

__kernel void A(__global int * restrict inputA);

__kernel void B(__constant int * restrict inputB);

There are two kernels (A and B).

Why is kernel B's RAM / LE / FF utilization smaller than kernel A's?

Since the constant data is stored in ROM on the FPGA, shouldn't kernel B need more FPGA resources rather than fewer?

I also have another question:

Could anyone explain what the following report entries mean (and how they map to OpenCL code)?

Logic utilization

Dedicated logic registers

memory block

DSP blocks

9 Replies

  • Altera_Forum

    This may be because, with the constant cache, the kernel does not need its own load/store unit (LSU), since the data is on chip. The ROM is not implemented inside the kernel: the constant cache is visible to all kernels, so it is not reflected in the per-kernel utilization report.
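
To make the comparison concrete, here is a minimal sketch of the two kernels from the question. The `out` argument, the doubling operation, and the host-side shim (`fake_gid` and the empty qualifier macros) are assumptions added for illustration, not part of the original post. The shim only exists so the same source can be checked as plain C; an OpenCL compiler defines `__OPENCL_VERSION__` and skips it.

```c
#ifndef __OPENCL_VERSION__
/* Host-side shim (assumption, illustration only): lets the kernel source
   below compile as plain C so its logic can be checked without an FPGA. */
#include <assert.h>
#include <stddef.h>
#define __kernel
#define __global
#define __constant const
static size_t fake_gid; /* hypothetical: index of the work-item being emulated */
static size_t get_global_id(unsigned dim)
{
    assert(dim == 0); /* this sketch only models a 1-D range */
    return fake_gid;
}
#endif

/* Kernel A: the input lives in global memory (typically off-chip DDR),
   so the kernel needs its own load unit to fetch inputA[i]. */
__kernel void A(__global const int *restrict inputA,
                __global int *restrict out)
{
    size_t i = get_global_id(0);
    out[i] = inputA[i] * 2;
}

/* Kernel B: same computation, but the input is in the constant address
   space, which is served by the shared on-chip constant cache. */
__kernel void B(__constant int *restrict inputB,
                __global int *restrict out)
{
    size_t i = get_global_id(0);
    out[i] = inputB[i] * 2;
}
```

Both kernels compute the same result, so any difference between their rows in the utilization report should come from how the input is accessed rather than from the arithmetic. The exact numbers depend on the device and compiler version.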

  • Altera_Forum

    Thanks for your answer!

    I found some documentation, but I still do not understand what the following entries mean

    (and how they map to OpenCL code):

    Logic utilization

    Dedicated logic registers

    memory block

    DSP blocks
  • Altera_Forum

    What do you mean by "map to OpenCL code"? Those entries are the resource utilization of your design on the FPGA: essentially, how many logic elements, registers, memory blocks, and DSP blocks would be used on the FPGA if you synthesized the design down to hardware.

  • Altera_Forum

    In other words, if I use an arithmetic operation (+, -, *, /), will it map to DSP blocks or to other blocks?

  • Altera_Forum

    Utilization depends on what you are trying to implement and which resources it needs. It also depends on the resources available on your specific FPGA. If your operation is complex enough that a DSP block can be inferred, the tools will use a DSP block. Otherwise, they will implement it using logic blocks. For example, take a = b + 1: a simple incrementer like this does not need a DSP block and would probably be more efficient implemented in logic. Essentially, all of this mapping is done by the tools.
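
As a rough illustration of the point above, the sketch below contrasts the a = b + 1 incrementer with a multiply of two variables. The kernel names `add_one` and `scale`, the argument lists, and the host-side shim are hypothetical and added for illustration. Whether the tools actually infer a DSP block for the multiply depends on the device and compiler version, so treat the comments as expectations, not guarantees.

```c
#ifndef __OPENCL_VERSION__
/* Host-side shim (assumption, illustration only): lets the kernel source
   below compile as plain C so its logic can be checked without an FPGA. */
#include <assert.h>
#include <stddef.h>
#define __kernel
#define __global
static size_t fake_gid; /* hypothetical: index of the work-item being emulated */
static size_t get_global_id(unsigned dim)
{
    assert(dim == 0); /* this sketch only models a 1-D range */
    return fake_gid;
}
#endif

/* a = b + 1: a simple incrementer, which the tools would most likely
   build from ordinary logic rather than a DSP block. */
__kernel void add_one(__global const int *restrict b,
                      __global int *restrict a)
{
    size_t i = get_global_id(0);
    a[i] = b[i] + 1;
}

/* a = b * c with two variable operands: a typical candidate for DSP
   inference, though the final choice is made by the tools. */
__kernel void scale(__global const int *restrict b,
                    __global const int *restrict c,
                    __global int *restrict a)
{
    size_t i = get_global_id(0);
    a[i] = b[i] * c[i];
}
```

Compiling the two kernels separately and comparing the "DSP blocks" entry of their reports is one way to see which operations the tools mapped to DSP blocks on a given device.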

  • Altera_Forum

    Thanks!

    I have another question: the report showed DSP blocks > 100%, yet the program still ran successfully.
  • Altera_Forum

    Did the emulation run successfully, or the hardware synthesis? The emulator does not take utilization into account, so the emulation will still run even if your design is too large to fit on the FPGA. If the design ran after hardware synthesis, I think the reason might be that logic resources were still available, so the tools implemented the operations in logic instead of DSP blocks. Otherwise, I would expect the design to fail.

  • Altera_Forum

    But global memory is also visible to all kernels. Will it be reflected in the kernel utilization report?

  • Altera_Forum

    No. Again, the kernel utilization report only covers the resources of that specific kernel. What will probably show up in the report is the LSU that each kernel needs in order to access global memory. For example, if you have

    __kernel void k1(__global int *A) and __kernel void k1_noarg()

    the utilization report will show k1 using more resources, because of the additional hardware needed to access global memory. Global memory is typically DDR, so it is off chip: the FPGA does not use any of its own resources to implement it and has no control over it beyond the hardware needed to access it.