Forum Discussion

Wrdlwrmpft
New Contributor
5 years ago

Which CPUs are the fastest for compiling Quartus projects?


Hi,

(I hope that it is not a problem to talk about CPUs from AMD and Intel in an Intel forum... ).

We need to buy a few new workstations exclusively for developing and compiling Quartus projects. The problem is that there are several CPUs which I suspect to be good at this, each one with pros and cons. Which one to buy?

Requirements: Minimal Quartus compilation time, ECC RAM (at least 64GB, better 128GB), budget for one CPU+mainboard+RAM of at most 2000 dollars. Must be bought by the end of this year (so unfortunately no waiting for Intel Gen11 and probably not for Zen3). OS: Latest Linux kernel. No Windows, no gaming, no overclocking.

All clock frequencies below are turbo frequencies, because Quartus uses only a single core for most of the compilation, so I think this is the frequency that matters most. For the same reason, 6 CPU cores should be sufficient. All speed factors (*...) are relative to our Xeon E5-1650 v3 machines.


Current systems:
================

Intel Xeon E5-1650 v3 (Gen4)
6*3,8GHz, L2+L3: 15MB, 64GB DDR4-2133 4-channel, 68GB/s.
Singlethread-Passmark: 1762, Multithread: 8184
Quartus-Compile of a project: 1h 18min

Intel i7-8700 (Gen8)
6*4,6GHz, L2+L3: 12MB, 64GB DDR4-2666 2-channel, 41,6GB/s
Singlethread-Passmark: 2685 (*1.52), Multithread: 13129 (*1.6)
Quartus-Compile of the same project: 1h 01min


CPUs under discussion:
======================

AMD Threadripper 3960X (Zen2)
24*4,5GHz, L3:128MB (8*16M), L2:12MB(24*512k), DDR4-3200, 102,4GB/s, 72xPCIe4
TDP 280 Watt. Singlethread-Passmark: 2706 (*1.54). Multithread: 55672 (*6.8)
Advantages: Very large total cache, fast RAM transfers
Disadvantages: Price, not the fastest clock, power consumption (280W: power supply, cooling).

AMD Ryzen 9 3900XT (Zen2)
12*4,7GHz, L3:64MB(4*16M), L2:6MB(12*512k), DDR4-3200, 51,2GB/s, 28xPCIe4
TDP 105 Watt. Singlethread-passmark: 2798 (*1.59). Multithread: 33266 (*4.06)
Advantages: Large total cache, price, easy upgrade e.g. to 4900X next year
Disadvantage: Slow RAM transfers

Intel Xeon W-1290P (Gen10)
10*5,3GHz, L3+L2:20MB(shared? “smartcache”), DDR4-2933, 45,8GB/s, 20xPCIe3
TDP 125 Watt. Singlethread-Passmark: 3140 (*1.78, 3900XT: *1.12). Multithread: 21882 (*2.67)
Advantage: Highest CPU clock, ?shared-cache (?fast L2?)
Disadvantages: Slowest RAM transfers, small total cache

Intel Xeon W-2235 (Gen10)
6*4,6GHz, L3+L2:8,25MB(shared? “smartcache”), DDR4-2933 4-channel, 93,85GB/s, 52xPCIe3
TDP 130 Watt. Singlethread-Passmark: 2709, Multithread: 14890
Advantages: Fast RAM transfers, ?shared-cache (?fast L2?)
Disadvantage: not the fastest clock, very small total cache

AMD EPYC 7262 (Zen2)
8*3,4GHz, L3:128MB(8*16M), L2: 4MB(8*512k), DDR4-3200 8-channel 204,8GB/s, 128*PCIe4
TDP 155 Watt. Singlethread-Passmark: 1939 (*1.10), Multithread: 23771 (*2.83)
Advantages: Very large total cache, very fast RAM transfers
Disadvantage: Very slow clock

(Of course, the best would probably be a CPU with few cores but a very fast turbo frequency, large caches per core, and a really fast RAM interface, but such a thing does not seem to exist. Which one offers the best blend?)
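The bandwidth figures listed for the AMD parts match the usual peak estimate: transfer rate in MT/s, times 8 bytes per 64-bit channel, times the number of channels. A minimal Python sketch of that arithmetic follows. The function name is made up for illustration, and the channel counts for the Ryzen and Threadripper are not stated in the post, so treat them as assumptions. Vendor-listed figures, such as the ones quoted for the Intel parts, can differ slightly from this estimate.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s: int, channels: int) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (10^9 bytes per second).

    Assumes a 64-bit (8-byte) data bus per channel, as for standard DDR4.
    """
    bytes_per_transfer = 8
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000


# (DDR4 transfer rate in MT/s, channel count); channel counts for the
# Ryzen and Threadripper are assumptions, not taken from the post.
systems = {
    "Xeon E5-1650 v3": (2133, 4),
    "Ryzen 9 3900XT": (3200, 2),
    "Threadripper 3960X": (3200, 4),
    "EPYC 7262": (3200, 8),
}

for name, (rate, channels) in systems.items():
    print(f"{name}: {peak_bandwidth_gb_s(rate, channels):.1f} GB/s")
```

These are theoretical peaks only; sustained bandwidth and latency under a real Quartus compile will be lower and may rank the systems differently.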


I did some research on the web, but did not find much quantitative data on the influence of CPU clock, CPU architecture, cache architecture and size, and RAM speed and latency on the compilation time. In Altera/Intel documentation I essentially only found that faster CPUs and large CPU-caches are helpful. Since the i7-8700 is about 50% faster in Passmark than the Xeon E5-1650, but only 26% faster in Quartus, I suspect that the RAM data transport speed, which is higher for the Xeon, is also quite important for Quartus.
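The mismatch between benchmark gain and compile-time gain can be made explicit with a few lines of Python. This is only a sketch using the numbers from this post, and the variable names are made up. Because the compile times are rounded to the minute, the compile gain can come out a point or two away from the 26% quoted above.

```python
# Figures from the post; compile times are rounded to the minute.
xeon = {"passmark_st": 1762, "compile_min": 78, "bandwidth_gb_s": 68.0}
i7 = {"passmark_st": 2685, "compile_min": 61, "bandwidth_gb_s": 41.6}

# Gain in single-thread benchmark score (higher score = faster).
passmark_gain = i7["passmark_st"] / xeon["passmark_st"] - 1

# Gain in compile speed (shorter time = faster, so the ratio is inverted).
compile_gain = xeon["compile_min"] / i7["compile_min"] - 1

# Memory bandwidth of the i7 relative to the Xeon.
bandwidth_ratio = i7["bandwidth_gb_s"] / xeon["bandwidth_gb_s"]

print(f"Single-thread Passmark gain:   {passmark_gain:.0%}")
print(f"Quartus compile speed gain:    {compile_gain:.0%}")
print(f"i7 bandwidth relative to Xeon: {bandwidth_ratio:.0%}")
```

Two machines are far too few to separate the effect of bandwidth from cache size or architecture, so this only illustrates why the suspicion arises; it does not confirm it.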

We would be really grateful if the ranking of the above CPUs in terms of Quartus compilation time could be clarified. For that, any specific comparison of the compilation time of one project on two current CPUs (particularly Zen-2 vs. Intel Gen-10, and dual-channel vs. quad-channel RAM) is helpful.


Many thanks in advance!

11 Replies

  • SyafieqS
    Super Contributor

    Hi Fred,


    Thanks for using Intel Community,

    There is no exact CPU or benchmark that is the fastest for compiling projects. You can purchase any computer that has a high processor count (excluding Intel HyperThreaded cores), depending on your application and needs. Quartus Prime has a setting for the maximum number of processors allowed. You can choose up to 16 processors for the compilation.

    Systems with all of the following characteristics may require additional configuration to achieve the lowest possible compilation time:

    • The system has more than two processor cores (excluding Intel HyperThreaded cores)

    • The system is running Microsoft Windows

    • The processor cores share a level of cache. For example, Intel Core 2 Duo and Intel Core 2 Quad processors share 2nd level cache. In addition, the AMD Barcelona processor and some recent Sun UltraSPARC processors are also affected. But AMD Opteron processors (as of May 2007) do not share a level of cache.


    The Quartus Prime software requires significant processor and memory resources. CPU speed is the main factor that affects compile-time performance. Consider multi-core processors and multi-CPU configurations to take advantage of multi-threaded compilation. Design partitioning and incremental compilation can then take full advantage of the available cores. Refer to Reducing Compilation Time in volume 2 of the Quartus II Handbook.

    Different computers have different parallel processing capabilities. In general, newer computers perform better than older ones. This is because they were designed more specifically for parallel processing. Some very old multiprocessor computers might show a decrease in parallel compilation performance due to very low communication bandwidth between the processors.

    You may also need to refer to the kdb listed: https://www.altera.com/support/support-resources/knowledge-base/solutions/rd05082012_510.html


    Thanks,

    Regards



    • Wrdlwrmpft
      New Contributor

      > There is no exact CPU or benchmark that is the fastest for compiling projects.

      The compilation times of the same big Quartus project on a specified Zen2 system and on a Gen10 Intel system could very well give excellent starting points for conclusions. Of course there will be slight variations depending on the FPGA design, the FPGA type used, compilation parameters and so on, but they are really small. We tested different projects on our machines. There were almost no relative deviations from the given example as long as there is enough RAM, except for very small projects where the compilation time does not matter anyway. So quite exact quantification is very much possible.

      > You can purchase any computer that has a high processor count (excluding Intel HyperThreaded cores), depending on your application and needs. Quartus Prime has a setting for the maximum number of processors allowed. You can choose up to 16 processors for the compilation.

      So if we follow your advice, we would for example buy an 8-core AMD Ryzen 7 over a 2-core Intel Pentium G5400. According to https://community.intel.com/t5/Intel-Quartus-Prime-Software/Why-is-AMD-Ryzen-7-CPU-so-slow-in-Fitter-in-Quartus-Prime-Lite/td-p/198060 this might be very bad advice.

      > You can choose up to 16 processors for the compilation.

      As I wrote, according to our findings, using 8 or 16 CPUs makes almost no difference for Quartus. Even going from 4 to 6 cores does not matter much. I just compiled a Quartus project on one of the Xeon E5-1650 v3 machines we intend to replace. 6 cores: 0:24:49 (h:mm:ss), 4 cores: 0:25:16, 2 cores: 0:28:49, 1 core: 0:35:29. That is a 40% difference between 1 and 4 cores, 23% between 1 and 2 cores, 14% between 2 and 4 cores, but only 1.8% between 4 and 6 cores.
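The percentages above can be reproduced from the four timings. The sketch below also adds something the post does not do: it backs out a rough serial share using Amdahl's law, T(n)/T(1) = s + (1 - s)/n. That model is an assumption brought in for illustration, not a statement about how Quartus schedules its work, and all function names are made up.

```python
def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss string to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


# Timings from the post, keyed by number of cores used.
times = {1: "0:35:29", 2: "0:28:49", 4: "0:25:16", 6: "0:24:49"}
secs = {cores: to_seconds(t) for cores, t in times.items()}


def speedup_percent(slow_cores: int, fast_cores: int) -> float:
    """How much faster (in percent) the run with fast_cores finished."""
    return (secs[slow_cores] / secs[fast_cores] - 1) * 100


def serial_fraction(cores: int) -> float:
    """Amdahl's law solved for the serial share s (an assumed model)."""
    ratio = secs[cores] / secs[1]
    return (cores * ratio - 1) / (cores - 1)


for slow, fast in [(1, 4), (1, 2), (2, 4), (4, 6)]:
    print(f"{slow} -> {fast} cores: {speedup_percent(slow, fast):.1f}% faster")

for n in (2, 4, 6):
    print(f"Implied serial share from {n} cores: {serial_fraction(n):.0%}")
```

If the implied serial share stays roughly constant across core counts, that is consistent with single-core speed dominating the compile time for this project, though one project on one machine is thin evidence.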

      > The system has more than two processor cores (excluding Intel HyperThreaded cores)

      Which one of the CPUs we consider buying does not have more than two cores?...

      > The system is running Microsoft Windows

      We tested both Linux and Windows on several machines. No significant difference. And we have used Linux exclusively on our workstations since 2015. So why do you advise that we should use Windows?

      > The processor cores share a level of cache. For example, Intel Core 2 Duo and Intel Core 2 Quad processors share 2nd level cache. In addition, the AMD Barcelona processor and some recent Sun UltraSPARC processors are also affected. But AMD Opteron processors (as of May 2007) do not share a level of cache.

      I am asking about the very latest top-notch hardware, and you refer to 13-year-old museum pieces? Are you kidding me?

      > The processor cores share a level of cache.

      I showed in my first posting which of the CPUs have shared caches, and stated that cache architecture matters. So the question clearly is not whether it matters, but how much, with current (and not 13-year-old) CPU designs.

      > You may also need to refer to the kdb listed

      This is one of the generic old sources I meant by "In Altera/Intel documentation I essentially only found that faster CPUs and large CPU-caches are helpful.".

      > In general, newer computers perform better than older ones

      You don't say...

      • skyjuice
        Occasional Contributor

        Hi,

        Not all algorithms in the compiler flow are independent/parallel. Assuming you're compiling on a local machine and not on a compute grid, you can try the following:

        For fastest elapsed time 16 CPUs are recommended

        QSF setting: set_global_assignment -name NUM_PARALLEL_PROCESSORS 16

        For maximized throughput 4 CPUs are recommended

        QSF setting: set_global_assignment -name NUM_PARALLEL_PROCESSORS 4

        The compile time might depend on the design, so I'd also like to hear your feedback after you try the above.
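To try several values of the setting above without editing the project file by hand, a small helper can rewrite the assignment in the QSF text. This is a minimal Python sketch; the function name and the sample QSF content are hypothetical, and only the NUM_PARALLEL_PROCESSORS assignment itself is taken from this thread.

```python
import re

ASSIGNMENT = "set_global_assignment -name NUM_PARALLEL_PROCESSORS"


def with_parallel_processors(qsf_text: str, count: int) -> str:
    """Return QSF text with NUM_PARALLEL_PROCESSORS set to `count`.

    Replaces an existing assignment line, or appends one if none exists.
    """
    new_line = f"{ASSIGNMENT} {count}"
    pattern = re.compile(
        rf"^[ \t]*{re.escape(ASSIGNMENT)}[ \t]+\S+[ \t]*$", re.MULTILINE
    )
    if pattern.search(qsf_text):
        return pattern.sub(new_line, qsf_text)
    return qsf_text.rstrip("\n") + "\n" + new_line + "\n"


# Hypothetical QSF content, for illustration only.
sample_qsf = (
    "set_global_assignment -name TOP_LEVEL_ENTITY top\n"
    "set_global_assignment -name NUM_PARALLEL_PROCESSORS 16\n"
)

for cores in (1, 2, 4, 6):
    variant = with_parallel_processors(sample_qsf, cores)
    print(variant.splitlines()[-1])
```

Each variant would then be written back to the project's .qsf file and compiled with the usual flow while recording the wall-clock time, which gives the same kind of core-count comparison as the 1/2/4/6-core timings reported earlier in the thread.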