Forum Discussion

Altera_Forum
Honored Contributor
21 years ago

1S10 dev. board and MAX Freq.

Hello,

I would like to know whether we can increase the Fmax of the board. I would like to reach the maximum frequency the board can support. How can I do it?

The reference design is implemented at 50 MHz. Is it possible to increase that?

Best regards

Christian

17 Replies

  • Altera_Forum
    Honored Contributor

    Sounds like you're progressing along just fine. If you are seeing > 100 MHz with the /s core in Stratix (1s10es) and the critical path reports are in the ALU, you're pretty close to achieving maximum performance. The /f core should give you some additional speed.

    Also, try the following Quartus options to improve performance slightly (at the cost of longer compile times). In the Quartus fitter settings, turn on the following:

    - Perform physical synthesis for combinational logic

    - Perform register duplication

    - Perform register retiming

    Also make sure you're not using the "fast fit" option (great for speedy compiles when you don't need max f-max, but should not be used for best clock speed).

    Finally, to extract the last few percent of performance, try the Quartus "Design Space Explorer" tool. This is an automated way of recompiling a design over and over using different fitting tweaks to see which one is the best.
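The "recompile with different tweaks and keep the best" idea from the reply above can be sketched in a few lines of Python. This is only an illustration of the search loop, not how Design Space Explorer is implemented: `explore`, `fake_compile`, and the "seed" setting are hypothetical names, and the Fmax values returned by `fake_compile` are arbitrary placeholders, not measurements.

```python
from typing import Callable, Dict, Iterable, Tuple


def explore(compile_and_measure: Callable[[Dict[str, str]], float],
            candidates: Iterable[Dict[str, str]]) -> Tuple[Dict[str, str], float]:
    """Run every candidate settings dict and return the one with the highest Fmax."""
    best_settings, best_fmax = None, float("-inf")
    for settings in candidates:
        fmax = compile_and_measure(settings)  # one full compile per candidate
        if fmax > best_fmax:
            best_settings, best_fmax = settings, fmax
    if best_settings is None:
        raise ValueError("no candidate settings supplied")
    return best_settings, best_fmax


def fake_compile(settings: Dict[str, str]) -> float:
    """Hypothetical stand-in for a real compile; returns an arbitrary Fmax in MHz."""
    return 100.0 + (int(settings["seed"]) * 7) % 5


if __name__ == "__main__":
    seeds = [{"seed": str(s)} for s in range(1, 6)]
    print(explore(fake_compile, seeds))
```

In a real flow, `compile_and_measure` would launch a compile with the given settings and read Fmax back from the timing report; that part is deliberately left out here.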
  • Altera_Forum
    Honored Contributor

    I removed the debug hardware and I see an increase. Now the rest of the optimization can be done to the surrounding hardware. The Fmax I'm seeing now is what I'm used to seeing with NIOS I (we are evaluating NIOS II to see if we can shrink our logic and retain the previous performance).

    So things are looking up for me since my savings using the 's' core (or to a lesser extent the 'f' core) are quite good.

    Thank you for the help, Jesse and Kerri.
  • Altera_Forum
    Honored Contributor

    My experience is that the bad paths tend to be those to and from the hardware multiplier.

    The multiplier is also used for shifts and rotates. As an experiment, you can remove the hardware multiplier and see the effect on your Fmax. Just hand-edit your system PTF file and change remove_hardware_multipler from "0" to "1", then regenerate your system in SOPC Builder. Make sure you aren't running SOPC Builder while you hand-edit the system PTF. You should also recompile your application so that the compiler won't try to use multiply instructions. Your shift and rotate instructions will take more cycles now because they no longer have the hardware multiplier to accelerate them.

    Here's another trick to reduce LE usage and sometimes increase Fmax. It uses a little-known setting in Quartus. Go into the "More Settings" tab of the Fitter options and look for the register packing option. It defaults to auto. Change it to something like "Minimize with chains". I'm doing this at home from memory, so I might not have the names exact.

    What this option does is make Quartus more aggressive about combining unrelated registers and lookup tables into the same LE. They don't turn it on by default because it tends to use more routing resources. It is very rare for Stratix parts to run out of routing resources, so you should be okay.

    BTW, my experiments during development suggested that the Nios II/s and Nios II/f have similar Fmax characteristics. Sounds like you are finding contrary results. I'd like to know about your configuration if this is true.
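For anyone who would rather script the PTF hand-edit described in the reply above, here is a minimal Python sketch. It assumes the setting appears in the PTF as a plain `name = "value";` assignment, and the sample fragment is hypothetical, not copied from a real system PTF. The setting name is passed in as a parameter and is spelled here exactly as in the post. `set_ptf_value` is a made-up helper, not part of any Altera tool.

```python
import re


def set_ptf_value(ptf_text: str, name: str, value: str) -> str:
    """Return ptf_text with every `name = "..."` assignment changed to `value`."""
    pattern = re.compile(r'(\b' + re.escape(name) + r'\s*=\s*")[^"]*(")')
    # A function replacement avoids backreference problems when value is a digit.
    new_text, count = pattern.subn(
        lambda m: m.group(1) + value + m.group(2), ptf_text)
    if count == 0:
        raise KeyError(f"{name} not found in PTF text")
    return new_text


# Hypothetical fragment; the layout of a real system PTF may differ.
SAMPLE = 'WIZARD_SCRIPT_ARGUMENTS\n{\n   remove_hardware_multipler = "0";\n}\n'

if __name__ == "__main__":
    print(set_ptf_value(SAMPLE, "remove_hardware_multipler", "1"))
```

As with the manual edit, close SOPC Builder first, keep a backup of the original file, and regenerate the system afterwards.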
  • Altera_Forum
    Honored Contributor

    --- Quote Start ---

    originally posted by badomen@Aug 11 2004, 05:02 PM

    i thought increasing fmax was a typo

    --- Quote End ---

    It was a typo. It increases f_max sometimes but occasionally decreases it, which is why it's disabled by default.

    Also, before people spend ages looking for this option, it is new in Quartus II 4.1.
  • Altera_Forum
    Honored Contributor

    Ya, it helped my NIOS a bit, but hindered the surrounding hardware so I'm back to square one. Thanks to James, I had no idea that the multiplier could be removed (I kinda need it though so maybe it could become a custom instruction). What you said is pretty much what I'm seeing in terms of critical paths. Unfortunately some of the Quartus tweaks are unavailable to me since I'm just using the webpack for now (we got the NIOS II in the mail, but have not renewed yet since I'm finding out if it's worth it or not).

    My configuration is some surrounding hardware attached to the custom interface, NIOS II/s, 512B of cache, and I added a hardware divider that I created (I probably re-invented the wheel of the one that comes with the NIOS). I've done a build without my divider to see if that helps and it's not much of a change in fmax at all (probably just resources got moved around and the fmax was affected by that).

    When I go with the II/f core and ditch my divider (and just use the one supplied by Altera), my fmax goes up around 20%. The results that you were seeing are what I would have expected as well. I'm sure my fmax would have gone up more than 20%, but some of my surrounding hardware limits the system to its own fmax, as expected.

    So I'll give your suggestion of removing the multiplier a try (still not sure why it would be used for shift/roll functionality, I guess since it has it built in and doesn't require extra LEs).

    Cheers
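On why a multiplier can stand in for a shifter: shifting left by n is the same as multiplying by 2^n, and the high half of the 64-bit product holds exactly the bits that "fell off" the top. The Python sketch below shows only that arithmetic for 32-bit values with n in 0..31. It makes no claim about how the Nios II core actually wires this up, and the function names are made up for the illustration.

```python
MASK32 = 0xFFFFFFFF


def shift_left(x: int, n: int) -> int:
    """Logical left shift: low 32 bits of x * 2**n."""
    return ((x & MASK32) * (1 << n)) & MASK32


def shift_right_logical(x: int, n: int) -> int:
    """Logical right shift: high 32 bits of x * 2**(32 - n)."""
    return ((x & MASK32) * (1 << (32 - n))) >> 32


def rotate_left(x: int, n: int) -> int:
    """Rotate left: OR the low and high 32-bit halves of x * 2**n."""
    product = (x & MASK32) * (1 << n)
    return (product & MASK32) | (product >> 32)


if __name__ == "__main__":
    print(hex(shift_left(0x80000001, 4)))            # 0x10
    print(hex(shift_right_logical(0x80000000, 31)))  # 0x1
    print(hex(rotate_left(0x80000001, 1)))           # 0x3
```

One multiply plus a little selection logic therefore covers shift left, shift right, and rotate, which fits the guess in the post above that reusing the multiplier saves LEs.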
  • Altera_Forum
    Honored Contributor

    shift_rotate_in_submodule is currently turned off. Does anyone know the effects of turning it on?

    Cheers
  • Altera_Forum
    Honored Contributor

    Turning off the hardware multiplier seems to help things fairly well. I still see a few slow paths, which are localized to a few result bits, shift_rot, and av_ld_done. The destination of these is mostly the pipeline flush.