Forum Discussion

Altera_Forum
Honored Contributor
21 years ago

ALTPLL Problem

After changing the ALTCLKLOCKs in the standard VHDL example design to ALTPLLs, everything still worked fine. But when I changed the ratio on the e0 output of the connector_pll from 1/1 to 4/5, my program stopped working. It never even reaches the first breakpoint. I’m running the lwip_web_server demo program from the LWIP standalone software on a Nios II processor. (The hello_world program doesn’t run either.)

The clock is running from the onboard 50MHz crystal. The rest of the example hardware is exactly as it is provided, except that I removed the LCD hardware in SOPC Builder.

Does anyone have any suggestions as to why the program would stop responding? Everything compiles and the debugger shows that the thread is running, but it never gets anywhere (the longest I’ve run it was 10 min). I intermittently get errors (undocumented error -1) about the SDRAM not being readable. I’m not sure why the SDRAM would be influenced by the connector_pll, and it might be some other error, but I mention it for completeness’ sake.

I've been struggling for about a week with this one and I'm completely out of ideas.

Thanks a lot.

Jan Hendrik

19 Replies

  • Altera_Forum

    Jesse,

    What exactly do you measure on the scope? Our custom board uses the same SDRAM as the Cyclone dev kit, and I had to remove the delay to get it to work. (Nice, since I only need one PLL.)

    I tested it to 120 MHz with no read/write errors, but I'm wondering if I could optimize it further if I measured and added just the right delay.

    Thanks,

    Ken
  • Altera_Forum

    What he meant is that the delay needed is proportional to the length of the traces between the FPGA and the RAM. So if your custom board has shorter traces, your delay is smaller, and if they are longer, the delay is longer (it's the electrical signal propagation delay). In short, I would call it a propagation delay rather than a phase shift, because it does not depend on the clock.
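
    To put rough numbers on this, here is a minimal sketch that estimates one-way trace delay from trace length. The 6.7 ps/mm figure (roughly 170 ps per inch) is an assumed ballpark for FR-4 board material, not a measurement from any board in this thread, and the function name and trace lengths are made up for illustration.

```python
# Assumed ballpark propagation delay for a signal on an FR-4 PCB trace.
# The real value depends on the board stack-up; treat it as a rough figure.
PS_PER_MM = 6.7


def trace_delay_ns(length_mm: float, ps_per_mm: float = PS_PER_MM) -> float:
    """Return the estimated one-way propagation delay in nanoseconds."""
    if length_mm < 0:
        raise ValueError("trace length must be non-negative")
    return length_mm * ps_per_mm / 1000.0


if __name__ == "__main__":
    for length in (20, 50, 100):  # hypothetical trace lengths in mm
        print(f"{length:4d} mm -> {trace_delay_ns(length):.3f} ns")
```

    Note that the clock frequency appears nowhere in the calculation: the delay is a fixed offset set by the board, which is why it is better described as a propagation delay than as a phase shift.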

    So to measure this delay on the oscilloscope, you want to put one probe as close as possible to the FPGA pin and another as close as possible to the RAM pin, then look at the time between the two signals (if you are running at high clock rates you will need a pretty fast scope to do this accurately). Also keep in mind that this is just the trace delay; if you wanted even more accuracy, you would need the delay from the Nios to that trace (but that's not going to stay consistent between hardware compiles, so I wouldn't bother).

    But in the end, if it were me, I would probably just use trial and error, since getting at those traces would require either scraping the solder mask or soldering directly to vias (and if you know what those are, you know to avoid doing that at all costs, hehe). If you have no errors, then you have pretty much found the required delay, and since that's a fixed parameter in your design you will not be able to get any more out of it. If you were able to get the Cyclone up to 120MHz then I would stay there (I can barely get my Stratix 1S10 much further than 125MHz, and I use a completely on-chip design, so you are in pretty good shape).

    Cheers
  • Altera_Forum

    A quick update: I’ve managed to get it working by compiling and running the Nios II at 120MHz. It seems that using an integer multiple of the lowest clock frequency works better. For some reason I can’t get it to work with a 50MHz/40MHz combination, but it works perfectly with the 80MHz/40MHz and 120MHz/40MHz combinations. I suspect that any multiple of 20MHz would work. I have no idea why the 50MHz/40MHz combination doesn’t work, but I think the two frequencies might be too close together. That is just a guess.

    Thank you for everyone’s input. It gave me the necessary info and confidence in my solution-guesses. I hope this post will benefit others also struggling with multiple frequency designs.
  • Altera_Forum

    To Ken, BadOmen, and janhendrik:

    BadOmen summed up what I was referring to better than I did. He is right that we are concerned with trace delay. However, there is more to it: what we're really concerned with is the difference in propagation delay between the various signals that arrive at the SDRAM and make their way back to the FPGA.

    to ken land:

    The first thing to worry about is clock skew between the clock that arrives at the SDRAM and the various other control signals (address, dqm, cas, ras). These signals are synchronous to the FPGA clock driving your processor/bus, but nothing is free or instantaneous: it takes time for our internal clock to traverse the chip, go through logic, activate the relevant SDRAM control signal, and be driven out to an IO. At the same time, we are driving a clock out to the SDRAM which comes into the FPGA from an oscillator, passes through a PLL, and exits via a high-speed IO (zero delay through logic).

    In summary, without any PLL compensation the clock signal would arrive too soon relative to the control signals. So the simplest measurement you can perform is right at the SDRAM: scope the SDRAM clock input alongside several of the control signals (one at a time if you wish). Do this while you have a test program running on Nios that does SDRAM access or just executes code. Without any PLL adjustment you should notice that the clock arrives too soon; check against the SDRAM datasheet to confirm you meet the spec. The other danger here is "over-tuning" the PLL by shifting the clock too far back.

    I could write much more, but it's really just basic timing analysis: look at the waveforms in the back of the datasheet, correlate to the speed grade of chip you have, make your measurements, etc. Also remember to check the FPGA timing: in the Quartus timing analyzer you'll get *worst case* Tco and Tsu data for all of your external IO, giving the remaining pieces of the puzzle.
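
    As a rough illustration of that timing analysis, the sketch below computes setup and hold margin at the SDRAM for one control signal as the SDRAM clock is moved relative to the internal clock. Every number in it is a placeholder, not a value from the Quartus timing analyzer or from any SDRAM datasheet, and the `WritePath` class and its field names are hypothetical. Trace delay is assumed to be folded into the Tco and clock-offset figures.

```python
from dataclasses import dataclass


@dataclass
class WritePath:
    """Timing of one FPGA-to-SDRAM control signal. All values in ns."""
    period: float      # clock period
    tco_max: float     # slowest FPGA clock-to-output, incl. trace delay
    tco_min: float     # fastest FPGA clock-to-output, incl. trace delay
    clk_offset: float  # SDRAM clock arrival vs. internal clock (+ = later)
    tsu: float         # SDRAM setup requirement
    th: float          # SDRAM hold requirement

    def setup_margin(self) -> float:
        # The signal must be stable tsu before the NEXT clock edge at the SDRAM.
        return (self.period + self.clk_offset) - self.tco_max - self.tsu

    def hold_margin(self) -> float:
        # The signal must not change until th after the SAME edge at the SDRAM.
        return self.tco_min - (self.clk_offset + self.th)


def report(path: WritePath) -> str:
    return (f"offset {path.clk_offset:+.1f} ns: "
            f"setup {path.setup_margin():+.1f} ns, "
            f"hold {path.hold_margin():+.1f} ns")


if __name__ == "__main__":
    # Placeholder numbers only: 50 MHz clock, made-up Tco/Tsu/Th values.
    for offset in (-3.0, 0.0, 3.0):
        print(report(WritePath(20.0, 7.0, 2.0, offset, 2.0, 1.0)))
```

    With these made-up numbers, moving the SDRAM clock later buys setup margin but eventually turns the hold margin negative, and moving it earlier does the opposite. A negative margin on either side is a violation, which is the over-tuning risk in a nutshell.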

    Please let me know if I'm too vague on the above; it's hard to gauge what level of detail to go into here, as some people know this stuff a lot better than I do!

    to janhendrik:

    I am happy to hear of your success, but at the same time I have to say: your system is working by accident, and something is wrong! The fact that it works at multiples of the original clock speed indicates that there is some crazy timing problem that still exists, but is being masked by the clock edge showing up at the "same" place in time (because the frequency is doubled). Please save yourself a lot of trouble later on and concentrate on this before moving on with your design; I guarantee that proving out the memory first will lead to far fewer problems down the road.

    A final thought: even with an existing timing problem you're likely to get a data interface to SDRAM working; if a subtle timing problem exists, you'll only see errant behavior when you're "stressing" the SDRAM. This is because when a processor performs data accesses (ld/st instructions), they are typically not back-to-back, allowing extra time for any problems to resolve themselves. This can be a real red herring. You can adequately test SDRAM in two easy-to-test cases:

    1. Executing code from SDRAM. Filling the cache means back-to-back accesses. If something's wrong, the execution of code will likely fail.

    2. Performing DMA transfers or other back-to-back transfers with your own custom peripheral.
  • Altera_Forum

    Hi Everyone,

    It seems Jesse is right, and I spoke too soon. When I said that it was working, the only evidence was that a program actually managed to run from the SDRAM, but it is not performing as I expected it to.

    Since I’m not working entirely with original code that I wrote myself, it took longer to figure out what was wrong, and that something is in fact wrong. My project is to try to make an FPGA-based picture frame, as discussed in the JPEG on NIOS II thread on this forum.

    I am using the Lancelot hardware expansion card and VHDL (available at www.fpga.nl). The code runs beautifully on the hardware system that is provided, but then the CPU runs at 50 MHz alongside the VGA VHDL, which requires a 40MHz clock to drive the 800 x 600 x 24bit VGA display circuitry.

    The only problem with the original Lancelot design is that it doesn’t have a way to get the images from my PC. I have to upload them to the flash, which I suppose is OK. (Un)fortunately I have a lot of pictures, and choosing a few favorites is unfair :)

    To solve this problem I decided to add networking support. But that introduced a new problem. The original Lancelot code runs from SRAM and with the added LWIP there is not enough free space to store the images in the SRAM.

    So what I want to do is run the new software from SDRAM with all the variables there except for the “frame buffers”, which would reside in the SRAM. I would also like to run the CPU (and SDRAM) at speeds above 40MHz to give the LWIP software enough processing power to do a decent job of receiving the images from the network, and then I still want to get JPEG decompression to work :)

    This has proven to be more of a challenge than I initially expected when I decided to start working on this project, but it has at least fulfilled its purpose by teaching me a lot about embedded design and development.

    If anyone has any suggestions on how I can accommodate the 40MHz restriction without slowing everything else down to a crawl that would be great.

    Thanks again for bearing and sharing with me.
  • Altera_Forum

    Hi janhendrik,

    Thanks for that description of your project. I had not read the JPEG thread; that is quite interesting and some people here in our office have expressed interest in doing such a project should the free time ever present itself.

    As far as network speed goes, I guess it depends on how big your photos are and how often you want to refresh them. Even at modest clock speeds and full TCP/IP, I think LWIP will be fine for you at 40 or 50MHz. We run the example web server at 50MHz, and LWIP isn't terribly fast, but a 150KByte JPEG is downloaded in a couple hundred milliseconds (if I remember correctly). The speed shouldn't be any different for uploading. It will probably be slightly faster, as you're going into an SDRAM buffer instead of fetching a file from flash as we do in the web server, and there is less work to do when receiving things via a networking stack than when composing packets to send. However, LWIP at that speed will probably be too slow if you want to do any kind of video.

    As an alternative to networking, you might consider CompactFlash (the 1GB cards are quite inexpensive these days) and using a file system (we currently have a new and improved CompactFlash component, but not a free file system). Micrium, who makes MicroC/OS-II, has a filesystem that (I think) is FAT compatible and bolts on to uCOS. I don't know if they offer any deals to universities or not, but it might be worth considering.

    Back to the PLL: I just now read this entire thread (I should have done that in the first place) and now understand your PLL problem. From your last report I think you have it right: connector_pll e0 multiplies by 4/5 to get your desired 40MHz output clock, and then the sdram_pll multiplies by 5/4 (along with the "shift") to get 50MHz again. You should be able to perform the same operation to get other Nios/SDRAM clock speeds to work; fundamentally I think this is correct. So the next question is: why doesn't it work?
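
    The clock arithmetic can be checked with a few lines of exact fraction math. This is only a sketch of the ratios described in this thread; the helper names `pll_out_mhz` and `required_ratio` are made up, and the code says nothing about which multiply/divide values a given device's PLL can actually realize.

```python
from fractions import Fraction


def pll_out_mhz(f_in_mhz, mult: int, div: int) -> Fraction:
    """Output frequency of a PLL that scales its input clock by mult/div."""
    return Fraction(f_in_mhz) * Fraction(mult, div)


def required_ratio(f_in_mhz, f_target_mhz) -> Fraction:
    """Multiply/divide ratio needed to turn f_in into f_target."""
    return Fraction(f_target_mhz) / Fraction(f_in_mhz)


if __name__ == "__main__":
    board_clock = 50                          # onboard oscillator, MHz
    e0 = pll_out_mhz(board_clock, 4, 5)       # connector_pll e0 output
    sdram = pll_out_mhz(e0, 5, 4)             # sdram_pll fed from e0
    print(f"connector_pll e0: {float(e0)} MHz")
    print(f"sdram_pll out:    {float(sdram)} MHz")
    for target in (80, 100, 120):
        ratio = required_ratio(e0, target)
        print(f"{target} MHz from e0 needs {ratio.numerator}/{ratio.denominator}")
```

    Because sdram_pll is fed from the 40MHz e0 clock rather than from the 50MHz oscillator, its ratio has to be worked out against 40MHz whenever the target Nios/SDRAM speed changes.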

    You might, as a debug measure, take your Nios system clock and drive it out to one of the header pins on the dev board (the expansion headers should each have one pin tied to a high-speed I/O suitable for a clock; check the schematic to see which one is labeled clock and trace that back to the proper FPGA I/O number). Then measure this against the SDRAM clock we're already driving out of the FPGA and see if they're the same frequency and in phase (well, phase-shifted just a bit, as we've already discussed). Hopefully this shouldn't be too much trouble: just add a pin and re-compile one more time :)
  • Altera_Forum

    Hi everyone,

    For some reason, if you compile something enough it either breaks itself or fixes itself :) I can now run the processor and SDRAM at 50MHz while still having a 40MHz clock for the VGA VHDL. AND the software works too :) I would love to run the CPU and SDRAM at even higher frequencies (like 80MHz or even 100MHz), but as soon as I try to access the SRAM at frequencies above 50MHz it fails to respond. (I doubt that the 40MHz clock has anything to do with this, but you (well, I) never know.) Is this normal?

    As far as I could gather, the SRAM should be able to work at frequencies up to 100MHz, as the access time is 10ns. On the Cyclone dev board the SRAM modules are IDT71V416 S10PH Z0051P, which gives an access time of 10ns, if I understood the datasheet correctly :)
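
    One general point about that reasoning: a 10ns access time only fits in a 10ns clock period if the FPGA's own output and input delays are zero. The sketch below shows how the number of clock cycles an access occupies grows once those delays are included. The FPGA Tco and Tsu defaults are invented placeholders, not figures from Quartus or from the datasheet, and `cycles_needed` is a hypothetical helper.

```python
import math


def cycles_needed(f_mhz: float,
                  sram_access_ns: float = 10.0,
                  fpga_tco_ns: float = 5.0,    # placeholder, not a real figure
                  fpga_tsu_ns: float = 3.0) -> int:  # placeholder as well
    """Whole clock cycles one asynchronous SRAM read occupies.

    The address leaves the FPGA (Tco), the SRAM responds (access time),
    and the data must then meet the FPGA's input setup time (Tsu).
    """
    period_ns = 1000.0 / f_mhz
    path_ns = fpga_tco_ns + sram_access_ns + fpga_tsu_ns
    return math.ceil(path_ns / period_ns)


if __name__ == "__main__":
    for f in (50, 80, 100):
        print(f"{f:3d} MHz: {cycles_needed(f)} cycle(s) per read")
```

    With these placeholder delays a read fits in one cycle at 50MHz but needs two at 80MHz and 100MHz, so an interface that assumes single-cycle access could stop working somewhere above 50MHz even though the SRAM itself is rated at 10ns.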

    I know this question should probably start a new thread but: Is there a reason why the SRAM would stop responding at higher frequencies? The SDRAM seems to work fine at frequencies up to about 100MHz – which would be explained by it being PC100 compatible SDRAM.

    Any SRAM expert out there?
  • Altera_Forum

    Hi janhendrik,

    What version are you using? We found a slight SRAM timing bug in the SRAM interface shipped with Nios II 1.0; it's been corrected in 1.01 (v1.01 is somewhere between our releasing it and the CDs arriving in customers' hands).

    To be precise, our SRAM interface wasn't meeting setup time on writes above 50MHz. Prior to Nios II 1.0 we did test SRAM in this region of clock speeds but observed no problems until the new Stratix II boards came out (we start supporting Stratix II in Nios II 1.01). At this point, due to device timing, we started seeing occasional problems and traced them to a Tsu violation for writes.

    You can patch the SRAM timing very easily if you don't yet have Nios II 1.01:

    1. Close SOPC Builder

    2. Open altera/kits/nios2/components/altera_nios_dev_kit_stratix_edition_sram2/mk_sram.pl (do this for the "stratix_edition_sram" folder as well)

    3. In these files locate the following line: "if ($system_frequency.....", and just drop in this text to replace it:

    # Above 100 MHz
    if ($system_frequency > 100E6)
    {
        $SLAVE_SBI->{Read_Wait_States} = '20ns';
        $SLAVE_SBI->{Write_Wait_States} = '10ns';
        $SLAVE_SBI->{Hold_Time} = '10ns';
        $SLAVE_SBI->{Setup_Time} = '5ns';
    }
    # Above 50 MHz, up to 100 MHz
    elsif ($system_frequency > 50E6)
    {
        $SLAVE_SBI->{Read_Wait_States} = 1;
        $SLAVE_SBI->{Write_Wait_States} = 1;
        $SLAVE_SBI->{Hold_Time} = 1;
        $SLAVE_SBI->{Setup_Time} = 1;
    }

    The "else" condition, for speeds of 50MHz or lower, is fine with 0 Tsu because of the effect of our "half-clock" hold time: the outgoing write signal gets gated, which adds sufficient delay.

    4. After you make this change, open SOPC Builder, remove the SRAM(s) from your design and re-add them before re-generating and compiling. This will ensure that the changes take effect.

    If there is some other problem please advise!
  • Altera_Forum

    Hi All,

    After modifying the SRAM interface to fix the timing bug in NIOS II v1.0, everything seems to be working.

    Here is a short description of what I had to do to get everything working:

    - First I changed the ALTCLKLOCKs to ALTPLLs. This shouldn’t be a prerequisite, but since I am using a Cyclone and not an Apex, FLEX10 or Mercury device I decided to go for the ALTPLLs.

    - The next thing to remember is that the PLD_CLKFB input pin is connected to a buffered output of the PLD_CLKOUT pin.

    - For the Standard and Full_featured designs this means that the sdram_pll’s inclk0 frequency should be exactly the same as the connector_pll’s e0 output clock frequency. The connector_pll’s inclk0 frequency should be equal to the system’s input clock frequency (50MHz in the case of the ALTERA development boards, Cyclone and Stratix, when using the default crystal oscillator).

    - Remember to add a phase delay of -3.5 ns at the sdram_pll e0 output for designs that run on the development kits from ALTERA. (I don’t know the delays for other development boards, but you should be able to find the delay by following the advice in this thread.)

    - If you are using a Nios II version prior to 1.01, fix the SRAM timing bug if you want to work at clock speeds exceeding 50MHz.
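
    One thing to watch when changing clock speeds: the -3.5 ns figure is a time, so the equivalent phase angle changes with frequency. If the delay is entered in degrees rather than nanoseconds, it has to be recomputed for each clock speed. A small sketch of the conversion (the function name is made up for illustration):

```python
def delay_to_degrees(delay_ns: float, f_mhz: float) -> float:
    """Convert a fixed time delay into a phase angle at a given clock frequency."""
    period_ns = 1000.0 / f_mhz
    return 360.0 * delay_ns / period_ns


if __name__ == "__main__":
    for f in (40, 50, 80, 100):
        print(f"-3.5 ns at {f:3d} MHz = {delay_to_degrees(-3.5, f):.1f} degrees")
```

    The same -3.5 ns is a much larger fraction of the period at 100MHz than at 50MHz, so keeping the delay in nanoseconds is the safer habit when experimenting with different Nios/SDRAM clock speeds.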

    Anyway that’s about it. Thanks to everybody that helped me get through my PLL nightmare.