Forum Discussion

Altera_Forum
Honored Contributor
13 years ago

Improve Build Time on dual processor machine

I have a dual-CPU machine (two Xeon E5-2403 @ 1.80GHz, 10M cache) with 16GB of DDR3 memory (1067 MHz) dedicated to each CPU. Each CPU has 4 cores. I am using Quartus 11.1 SP2.

When I run Quartus with the "Use all available processors" option selected, it uses 8 cores. The performance isn't that great. I think I could get better performance by limiting it to the 4 cores within one CPU, so that I avoid the penalty of transfers between the 2 CPUs. When I set "Maximum processors allowed" to 4, I see that 4 cores are used, but they are not within the same CPU! Performance with 4 cores is identical to performance with 8 cores. How can I force Quartus to use cores within the same CPU?

On another note, I am getting about 1.5x worse performance on this dual-CPU machine than on my single-CPU, 4-core i5-3450S @ 2.80GHz machine, which has less memory (8GB @ 1333MHz) and less cache (6M)! I wonder whether the i5's higher clock is the reason for the better performance, or whether the dual-CPU setup is to blame.

Thanks,

Amol

4 Replies

  • Altera_Forum
    Honored Contributor

    Not sure if it will work, but try setting the process's processor affinity via Task Manager.
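
    For a scripted version of the same idea, here is a minimal sketch in Python. It builds the hexadecimal mask that the Windows `start /affinity` switch expects and the CPU list that Linux `taskset -c` expects. The helper names are made up for illustration, and the assumption that logical CPUs 0-3 sit on the first physical CPU is hypothetical: check the numbering on your own machine first.

```python
def affinity_mask(cores):
    """Return an integer bitmask with one bit set per logical CPU index."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core indices must be non-negative")
        mask |= 1 << core
    return mask


def windows_start_prefix(cores):
    """Prefix for cmd.exe: `start /affinity` takes the mask in hex, without 0x."""
    return "start /affinity {:X}".format(affinity_mask(cores))


def linux_taskset_prefix(cores):
    """Prefix for Linux: `taskset -c` takes a comma-separated CPU list."""
    return "taskset -c " + ",".join(str(c) for c in sorted(set(cores)))


# Assumption: logical CPUs 0-3 belong to the first physical CPU.
first_socket = [0, 1, 2, 3]
print(windows_start_prefix(first_socket))  # start /affinity F
print(linux_taskset_prefix(first_socket))  # taskset -c 0,1,2,3
```

    The printed prefix goes in front of whatever command launches the build. Whether the tool's worker processes inherit the affinity is something to verify by watching per-core load.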

    Quartus's ability to take advantage of multiple cores is limited: synthesis and especially place & route are not easy to parallelize, so Amdahl's law applies.

    So it makes sense that a faster core will outperform multiple slower cores.
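
    To put rough numbers on the Amdahl's law argument, here is a small Python sketch. The parallel fraction of 0.3 is a hypothetical value chosen for illustration, not a measured Quartus figure.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


# Hypothetical parallel fraction; not a measured Quartus figure.
p = 0.3
for n in (1, 4, 8):
    print(n, "cores:", round(amdahl_speedup(p, n), 2))
```

    With that assumed fraction, going from 4 to 8 cores improves the speedup only from about 1.29 to about 1.36, which would be consistent with the 4-core and 8-core runs looking nearly identical. By contrast, the base clock ratio from the original post (2.80 / 1.80, about 1.56) applies to the serial part as well, ignoring turbo and per-clock differences.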
  • Altera_Forum
    Honored Contributor

    Yes, a faster core will outperform multiple slower ones if you don't use design partitions. Create partitions for your modules so that Quartus can synthesize them in parallel.

  • Altera_Forum
    Honored Contributor

    The i5 is almost certainly a much faster CPU than your Xeon.

    I believe the Xeon E5-2403 is Sandy Bridge and the i5-3450S is Ivy Bridge, so the i5 is a later version of the same family and will also perform slightly better per clock (instructions per clock).

    Until the OS starts swapping, the very large memory makes little difference. The larger cache helps, but only while the code and data fit in the cache. Once the working set exceeds the cache size, random accesses (and sequential ones too) will almost always be cache misses!
  • Altera_Forum
    Honored Contributor

    Thanks for the replies, guys.

    Given all the information here and elsewhere, I think the guidelines for a good build machine for FPGA designs would be something like:

    - Fastest CPU you can get

    - Largest cache you can get

    - Fastest memory bus support you can get

    - 16GB of RAM for Stratix V (but memory should be expandable for future larger FPGAs)

    - ECC memory is probably preferable

    - HT turned off if present

    - Single CPU is probably sufficient (if not better)

    In terms of project settings, design partitions help parallelize synthesis, and it's probably worth using all available processors, but one may want to experiment with that.
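
    One way to run that experiment is a small timing harness. The following is a minimal Python sketch, not a Quartus-specific tool: the function names are made up for illustration, and the commands are placeholders that you would replace with your real build invocation, changing the "Maximum processors allowed" setting between runs.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start


def compare_builds(commands):
    """Time each labelled command once; return (label, seconds) fastest first."""
    results = [(label, time_command(argv)) for label, argv in commands.items()]
    return sorted(results, key=lambda item: item[1])


if __name__ == "__main__":
    # Placeholder commands so the sketch runs anywhere;
    # substitute the real build invocation for each configuration.
    placeholder = [sys.executable, "-c", "pass"]
    commands = {"4 processors": placeholder, "all processors": placeholder}
    for label, seconds in compare_builds(commands):
        print(f"{label}: {seconds:.2f} s")
```

    A single run per configuration is noisy, so repeating each build a few times and comparing the medians gives a more trustworthy answer.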