Forum Discussion

Altera_Forum
10 years ago

Why is the early resource estimate feature of aoc so off?

Hi everyone,

I'm trying to compile a kernel that has multiple loops, some of which are nested, and the early resource estimate feature gives me estimates that are quite off. Can somebody explain why, or has anyone experienced the same?

Moreover, in some cases I saw that using #pragma unroll n reduced the estimated usage. Could the estimates be off in this case as well? Or is implementing the control logic of a loop actually more demanding than unrolling it? In these cases the unrolling factor is never bigger than 80, and the loop body is at most 2 instructions.

Early resource estimate: 80% logic, 36% ALUTs, 44% registers, 21% RAMs, 12% DSPs
   Kernel 'process': throughput: 4.92e+05 / resources: 64% logic, 26% ALUTs, 38% registers, 8% RAMs, 12% DSPs)
----------------------------------
Using attributes:
   Kernel 'process':
      max_unroll_loops(80)
      num_simd_work_items(1)
      num_compute_units(1)
      num_share_resources(1)
      max_share_resources(8)
aoc: Compiling....
Early resource estimate: 195% logic, 121% ALUTs, 85% registers, 22% RAMs, 6% DSPs
   Kernel 'process': throughput: 2.34e+05 / resources: 180% logic, 111% ALUTs, 78% registers, 9% RAMs, 6% DSPs)
----------------------------------
Using attributes:
   Kernel 'process':
      max_unroll_loops(1)
      num_simd_work_items(2)
      num_compute_units(1)
      num_share_resources(1)
      max_share_resources(8)
aoc: Compiling....
Early resource estimate: 80% logic, 36% ALUTs, 44% registers, 21% RAMs, 12% DSPs
   Kernel 'process': throughput: 4.92e+05 / resources: 64% logic, 26% ALUTs, 38% registers, 8% RAMs, 12% DSPs)
----------------------------------
Using attributes:
   Kernel 'process':
      max_unroll_loops(1)
      num_simd_work_items(1)
      num_compute_units(2)
      num_share_resources(1)
      max_share_resources(8)
aoc: Compiling....
Early resource estimate: 146% logic, 64% ALUTs, 83% registers, 31% RAMs, 25% DSPs
   Kernel 'process': throughput: 9.84e+05 / resources: 129% logic, 53% ALUTs, 76% registers, 16% RAMs, 25% DSPs)
----------------------------------
Using attributes:
   Kernel 'process':
      max_unroll_loops(1)
      num_simd_work_items(1)
      num_compute_units(1)
      num_share_resources(1)
      max_share_resources(8)
aoc: Compiling....
Early resource estimate: 80% logic, 36% ALUTs, 44% registers, 21% RAMs, 12% DSPs
   Kernel 'process': throughput: 4.92e+05 / resources: 64% logic, 26% ALUTs, 38% registers, 8% RAMs, 12% DSPs)
aoc: Compiling....
aoc: Linking with IP library ...
+--------------------------------------------------------------------+
; Estimated Resource Usage Summary                                   ;
+----------------------------------------+---------------------------+
; Resource                               + Usage                     ;
+----------------------------------------+---------------------------+
; Logic utilization                      ;   80%                     ;
; Dedicated logic registers              ;   45%                     ;
; Memory blocks                          ;   21%                     ;
; DSP blocks                             ;   13%                     ;
+----------------------------------------+---------------------------;
aoc: First stage compilation completed successfully.
Error: Cannot fit kernel(s) on device
real    200m31.304s
user    254m36.344s
sys    5m58.536s

4 Replies

  • Altera_Forum

    It's possible that without the #pragma unroll the compiler unrolls the loop on its own. Have you tried forcibly disabling unrolling with "#pragma unroll 1"?

    But yeah, I've also had odd cases with the estimates. I'm currently using a kernel that was estimated to take 190%.

    It ended up still fitting, but the compilation took a lot of time. I've never encountered an estimate below 100% that didn't work though.
  • Altera_Forum

    --- Quote Start ---

    It's possible that without the #pragma unroll the compiler unrolls the loop on its own. Have you tried forcibly disabling unrolling with "#pragma unroll 1"?

    --- Quote End ---

    I didn't think about it, will for sure look into it!

    Thanks.
  • Altera_Forum

    Since you're looking into logic utilization: did you find a way to get the actual logic utilization after the compilation completes, not just the initial estimate? I haven't been able to find that number anywhere.

  • Altera_Forum

    --- Quote Start ---

    Since you're looking into logic utilization: did you find a way to get the actual logic utilization after the compilation completes, not just the initial estimate? I haven't been able to find that number anywhere.

    --- Quote End ---

    You can get the actual values after compilation by going into the created folder for the hardware and opening the project "top.qpf" in Quartus.

    Then go to Processing -> Compilation Report -> Analysis & Synthesis -> Resource Usage Summary
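
For anyone trying the "#pragma unroll 1" suggestion from the first reply, here is a minimal sketch of where the pragma goes. Only the kernel name 'process' comes from the log above; the arguments, the TAPS constant, and the loop body are hypothetical stand-ins for a short inner loop like the one described in the question. The small macro guard at the top is an assumption added so the same file also builds as plain C on a host for a quick functional check.

```c
/* Lets the same source build as plain C on a host for a functional check.
   __OPENCL_VERSION__ is predefined by OpenCL C compilers, so the guard
   is skipped when the file is compiled as a kernel. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#endif

/* Hypothetical trip count, chosen to match the "never bigger than 80" bound. */
#define TAPS 80

/* Hypothetical single work-item kernel; only the name comes from the log. */
__kernel void process(__global const int *in, __global int *out, int n)
{
    for (int i = 0; i < n; i++) {
        int acc = 0;

        /* An unroll factor of 1 asks the compiler to keep this loop rolled,
           which rules out automatic unrolling as the cause of a high estimate. */
        #pragma unroll 1
        for (int k = 0; k < TAPS; k++) {
            acc += in[i] * k;   /* short body, as in the question */
        }
        out[i] = acc;
    }
}
```

Rebuilding with "#pragma unroll 1", with an explicit factor, and with no pragma at all, then comparing the three early estimates, should show whether the compiler was unrolling the loop on its own.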