Forum Discussion

Altera_Forum
10 years ago

RAM Bit errors after the two boot stages SoCKit Part1

Hello,

I have a very unusual problem with Linux on the HPS (Cyclone V) and/or with the two preceding boot stages. I have run out of ideas beyond the ones listed below, so I will briefly describe my problem.

Aim:

The aim is to boot the HPS with Linux from non-standard devices (no NAND, QSPI, ... devices), where the FPGA is configured first and the preloader resides within the FPGA image. I’m using the SoCKit. At the moment, read/write access is done over a JTAG connection, which transfers the data (U-Boot, DTB, uImage) into the RAM.

Parameters: Quartus 14.1 with the included U-Boot source; Buildroot; Linux kernel 3.10.37-ltsi (socfpga) from rocketboards -> up to date

Implementation/bootflow:

  1. After it has calibrated the SDRAM, the preloader sets a bit in RAM (Avalon bus) to show that the data can be transferred to the SDRAM. My own code sits at the end of void spl_board_init(void) in spl.c (/arch/arm/cpu/armv7/socfpga/spl.c) -> it then polls another bit that indicates whether the data has been transferred (handshaking)

  2. JTAG transfers the data and sets the corresponding bit

  3. Preloader calls U-Boot

  4. U-Boot jumps to uImage
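
A minimal sketch of the handshake in steps 1 and 2, assuming a single 32-bit mailbox word shared between the preloader and the JTAG host. The bit positions, the names HS_SDRAM_READY, HS_XFER_DONE and wait_for_jtag_transfer, and the poll limit are hypothetical; none of them are taken from the actual spl.c change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout of the handshake word (not given in the post). */
#define HS_SDRAM_READY (1u << 0) /* set by the preloader after calibration */
#define HS_XFER_DONE   (1u << 1) /* set by the JTAG host after the upload  */

/*
 * Announce that the SDRAM is usable, then poll until the JTAG host reports
 * that the images are in place. Returns true once HS_XFER_DONE is seen,
 * false if max_polls reads went by without it.
 */
static bool wait_for_jtag_transfer(volatile uint32_t *mailbox,
                                   uint32_t max_polls)
{
    assert(mailbox != 0);

    *mailbox |= HS_SDRAM_READY;

    for (uint32_t i = 0; i < max_polls; i++) {
        if (*mailbox & HS_XFER_DONE)
            return true;
    }
    return false;
}
```

The pointer is volatile so that every poll is a real read of the shared word. On the target, mailbox would point at the handshake location behind the bridge; in a host test it can simply point at a local variable.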

Result:

  • Preloader runs successfully; RAM calibration passed, memory section scrubbed

  • Data is transferred successfully via JTAG

  • U-Boot starts and jumps to the kernel successfully -> Linux boots

3 Replies

  • Altera_Forum

    Continued...

    Problem:

    - Sometimes the system hangs, with different trace logs depending on which data was destroyed. The corruption comes from bit flips (more than two bit errors) within the RAM, which would normally mean that the calibration should not have succeeded -> can be seen with memtester (userspace)

    - If the same kernel + DTB is copied to the existing SD-Card (replacing the uImage + DTB on the SD-Card) and used with the existing preloader and U-Boot (the old ones from Quartus 13.01), everything works fine

    Finding the bug:

    - I’ve tried many things, but none of them helped:

    - Set all addresses (TEXT_BASE, FDT, …) to the values used with the SD-Card on the SoCKit

    Addresses:

    FDT -> 0x100

    uImage -> 0x7FC0

    kernel entry point -> 0x8000

    U-Boot -> 0x01000040

    - I also used some other addresses (suitable addresses) -> did not solve the problem
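
    Whether a set of load addresses is "suitable" can be checked mechanically. The sketch below is only an illustration: struct region, regions_overlap and count_overlaps are hypothetical names, and the image sizes have to be supplied by the caller because the post does not state them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct region {
    const char *name;
    uint32_t start; /* load address in SDRAM */
    uint32_t size;  /* image size in bytes, assumed non-zero */
};

/* Half-open ranges [start, start + size) overlap if and only if each one
 * starts before the other one ends. 64-bit ends avoid wrap-around. */
static bool regions_overlap(const struct region *a, const struct region *b)
{
    uint64_t a_end = (uint64_t)a->start + a->size;
    uint64_t b_end = (uint64_t)b->start + b->size;

    return a->start < b_end && b->start < a_end;
}

/* Returns the number of overlapping pairs among the n regions in r. */
static size_t count_overlaps(const struct region *r, size_t n)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (regions_overlap(&r[i], &r[j]))
                hits++;
    return hits;
}
```

    With the addresses listed above, the FDT at 0x100 has 0x7FC0 - 0x100 = 0x7EC0 (32448) bytes before the uImage starts, so a larger DTB would run into the uImage.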

    - Preloader RAM test with SDRAM_TEST_LONG (spl.c) -> no errors

    - Own function within the preloader which writes 0xFFFFFFFF and 0x00000000 to the full SDRAM address range + data check --> no errors

    - Test in the next boot stage (U-Boot) -> split some memory areas into two equal parts, filled both with the same data, and compared them against each other to find at least one corrupt address with a flipped bit

    - No errors, even with high memory and more than 500MB

    - Checked the data written over the JTAG interface by comparing MD5 sums: (1) after writing the images into RAM during the preloader stage; (2) after transferring the same data during the U-Boot stage (manually halted) and checking it against the original data, in case U-Boot does something wrong and reconfigures the RAM controller, which should not happen --> no errors --> the images do not overlap within the RAM

    - Checked the relevant U-Boot settings and compared them against the SD-Card U-Boot settings

    - Compared the RAM controller register values that the preloader configures with the same registers read from Linux -> no differences -> configuration is still valid

    - Compared the old preloader settings (RAM configuration in sdram_config.h) -> not the same as the current handoff files, even though the old project was the base for the current one

    - Tried replacing the settings with the old ones -> preloader RAM check passed, but the errors are still there

    - Tried a different kernel from rocketboards (socfpga), version 3.16 -> same problem
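
    The two hand-written RAM checks in the list above (the 0xFFFFFFFF / 0x00000000 fill in the preloader and the split-and-compare test in U-Boot) can be sketched as follows. This is not the code that was actually run; pattern_test and mirror_test are hypothetical names, and the caller is assumed to pass a word-aligned region that holds nothing still in use.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill the region with one pattern, then read it back.
 * Returns the number of words that did not read back as written. */
static size_t pattern_test(volatile uint32_t *base, size_t words,
                           uint32_t pattern)
{
    size_t errors = 0;

    for (size_t i = 0; i < words; i++)
        base[i] = pattern;
    for (size_t i = 0; i < words; i++)
        if (base[i] != pattern)
            errors++;
    return errors;
}

/* Split the region into two equal halves, write the same index-derived
 * data into both, then compare the halves word by word.
 * Returns the index of the first differing word, or words / 2 if the
 * halves are identical. */
static size_t mirror_test(volatile uint32_t *base, size_t words)
{
    size_t half = words / 2;

    for (size_t i = 0; i < half; i++) {
        uint32_t v = (uint32_t)i * 2654435761u; /* cheap data scrambling */

        base[i] = v;
        base[half + i] = v;
    }
    for (size_t i = 0; i < half; i++)
        if (base[i] != base[half + i])
            return i;
    return half;
}
```

    Tests of this kind write and read back in a simple, regular order, so they may pass even when the RAM only fails under other access patterns or timing conditions. With caches enabled, the read-back may also be served from the cache rather than from the SDRAM.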

    Conclusion: the kernel does not seem to be faulty, because it works with the SD-Card

    - Maybe there is a problem with U-Boot?

    Has anyone seen such a problem?

    Does anyone have an idea what the problem could be?

    Thanks!
  • Altera_Forum

    Solved:

    The problem was the DDR3 RAM configuration taken from the 13.0 SoCKit materials. The new 14.0 settings differ from it. The values need to be changed from those shown in DDR3_RAM_HPS_Q130.png to those shown in DDR3_RAM_HPSQ140_GHRD.png

    Result: 500MB test --> passed (see last picture)