Forum Discussion

Altera_Forum
15 years ago

TLB Miss Exception

Hello,

I am trying to build a working MMU-enabled uClinux for Nios II.

I downloaded nios2-linux-20100621.tar.gz.

After creating a custom FPGA project for a custom board and importing the board into the kernel tree (as described in the wiki), I managed to compile a Linux image.

The problem I am observing is that the system hangs on an unhandled exception:

Linux version 2.6.34-00692-g5bc7853-dirty (imagos@woody) (gcc version 4.1.2)# 7 Wed Feb 23 11:30:18 CET 2011
bootconsole  enabled
early_console initialized at 0xe4000420
Linux/Nios II-MMU
init_bootmem_node(?,0x3b9, 0x0, 0x4000)
free_bootmem(0x3b9000, 0x3c47000)
reserve_bootmem(0x3b9000, 0x800)
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 16256
Kernel command line: 
PID hash table entries: 256 (order: -2, 1024 bytes)
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
We have 16384 pages of RAM
Memory available: 61152k/3811k RAM, 0k/0k ROM (873k kernel code, 2938k data)
Hierarchical RCU implementation.
NR_IRQS:32
Calibrating delay loop... 39.11 BogoMIPS (lpj=195584)
Mount-cache hash table entries: 512
init_BSP(): registering device resources
Switching to clocksource timer
msgmni has been set to 119
ttyJ0 at MMIO 0x4000420 (irq = 1) is a Altera JTAG UART
console  enabled, bootconsole disabled
console  enabled, bootconsole disabled
Freeing unused kernel memory: 2692k freed (0xc00dc000 - 0xc037c000)
Unhandled exception# 12 , fp 0xc3c18cfc
r1:  c0000c20 r2:  fffffff2 r3:  00000000 r4:  00004b98
r5:  00000468 r6:  00004b99 r7:  00000000 r8:  fffff000
r9:  fffffff2 r10: c03b5e60 r11: 00000000 r12: 000fffff
r13: 00000010 r14: 5a827999 r15: 6ed9eba1
ra:  c0093040 fp:  c3c18ea0 sp:  c3c18d58 gp:  00000000
ea:  c0006324 estatus:  00000001

By adding another print to the unhandled_exception function in arch/nios2/mm/tlb.c, I found that the CPU's STATUS register is zero. Tracing back to entry.S, I see that the first thing the exception handler does is clear the status.EH flag, which makes it impossible to distinguish a fast TLB miss exception from a double TLB miss exception. In any case, exception cause 12 (control register 7) always leads to unhandled_exception in the exception table.

This may not be the point, but it is as far as my limited debugging skills have brought me.

For your information, according to the git log my version is quite dated:
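
As a reading aid, the estatus value in the register dump can be decoded with a short sketch. It assumes the low-bit layout of the Nios II status/estatus register as given in the Nios II Processor Reference Handbook (PIE in bit 0, U in bit 1, EH in bit 2); the function name decode_status is purely illustrative and does not exist in the kernel sources.

```python
# Assumed layout of the low bits of the Nios II status/estatus register.
STATUS_PIE = 1 << 0  # processor interrupt enable
STATUS_U = 1 << 1    # user mode
STATUS_EH = 1 << 2   # exception handler mode (MMU configurations)


def decode_status(value):
    """Return the three low flag bits of a status/estatus value as booleans."""
    return {
        "PIE": bool(value & STATUS_PIE),
        "U": bool(value & STATUS_U),
        "EH": bool(value & STATUS_EH),
    }


if __name__ == "__main__":
    # estatus value copied from the register dump in the boot log
    print(decode_status(0x00000001))
```

Under that assumed layout, estatus 0x00000001 means interrupts were enabled, the CPU was in supervisor mode, and EH was clear when the exception was taken.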

commit 5bc785348a8df706cecd7d318829b6476018a990
Author: Thomas Chou <thomas@wytron.com.tw>
Date:   Mon Jun 21 11:04:09 2010 +0800

Do you think that an update can help me with this?

The header file (.h) of the SOPC system is attached, so you can check the memory map.

I tried the same kernel sources on the NEEK board (with the proper board selection and the .sof from the wiki, obviously) and it boots to the shell. I would like to reach the shell on the custom design too.

Any suggestion is appreciated. Thanks.

Regards,

Gabriele Gobbin

6 Replies

  • Altera_Forum

    As an update, I downloaded the newest available kernel snapshot (I ran the update script this morning). The problem is still present.

  • Altera_Forum

    The only weird thing I see in your configuration is the EXCEPTION_ADDR 0xc0000060. I've always seen it at offset 0x20.

  • Altera_Forum

    As you can see from the .ptf (which is the authoritative SOPC Builder description), the vectors were placed at physical addresses at the bottom of memory.

            
             reset_slave = "altmemddr_0/s1";
             break_slave = "cpu_0/jtag_debug_module";
             exc_slave = "altmemddr_0/s1";
             reset_offset = "0x00000020";
             break_offset = "0x00000020";
             exc_offset = "0x00000060";

    The ptf file is attached, together with a screenshot of the CPU configuration screen.

    This is a custom design on a custom board. I placed the reset and exception vectors at these addresses following a suggestion in the ALTMEMPHY manual: on page 21 of emi_ddr_ug.pdf, I found that the calibration process uses addresses between 0x0 and 0x1F. This can be avoided once I load my system from flash, because it is in a safe place at reset.

    The DDR2 controller I am using is based on ALTMEMPHY, not UniPHY, which is why I chose these addresses. I did make a mistake, though: the addresses MUST be calculated by multiplying the range by the bus width in bytes. On the other hand, the kernel is not loaded after a reset (it is downloaded via JTAG), so it should not be corrupted.

    In any case, the only constraint I remember on the placement of the vectors is that they must lie in the 0x0-0x1FFFFFFF memory range; obviously there must also be enough space for the code and variables.

    Then the script sopc-create-header-files --single imcpu_fpga.h generates the map, translating all addresses into virtual addresses. I am not sure the problem lies in this translation; I have traced a lot of system calls and other events (as far as I know, exceptions, interrupts, and traps are all treated as exceptions).

    I have found that the instruction the ea register points to is the stb instruction in the macro put_user_asm, which is called from the macro __put_user_common, which is called from the macro __put_user, which is called by the function __clear_user (arch/nios2/mm/uaccess.c).

    I'll try to recompile with a different reset and break address, as you suggested.
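
The address translation discussed in the reply above can be illustrated with a minimal sketch. It assumes the usual Nios II MMU partitioning, in which physical addresses 0x0-0x1FFFFFFF are visible through a cached kernel window at 0xC0000000 and an uncached I/O window at 0xE0000000. The helper names are hypothetical and are not what sopc-create-header-files uses internally.

```python
KERNEL_REGION_BASE = 0xC0000000  # assumed cached kernel window
IO_REGION_BASE = 0xE0000000      # assumed uncached I/O window
PHYS_LIMIT = 0x1FFFFFFF          # highest physical address reachable through either window


def _check_phys(phys):
    """Reject physical addresses outside the 0x0-0x1FFFFFFF range."""
    if not 0 <= phys <= PHYS_LIMIT:
        raise ValueError("physical address outside 0x0-0x1FFFFFFF: " + hex(phys))


def phys_to_kernel_virt(phys):
    """Map a physical address into the (assumed) kernel window."""
    _check_phys(phys)
    return KERNEL_REGION_BASE | phys


def phys_to_io_virt(phys):
    """Map a physical address into the (assumed) I/O window."""
    _check_phys(phys)
    return IO_REGION_BASE | phys


if __name__ == "__main__":
    # exc_offset from the .ptf, with altmemddr_0 at physical address 0x0
    print(hex(phys_to_kernel_virt(0x00000060)))
    # JTAG UART MMIO address from the boot log
    print(hex(phys_to_io_virt(0x04000420)))
```

With these assumptions, the physical exception offset 0x60 corresponds to the EXCEPTION_ADDR 0xc0000060 quoted earlier, and the JTAG UART at 0x4000420 corresponds to the early console address 0xe4000420 in the boot log.
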
  • Altera_Forum

    Hello!

    You are right. Thanks a lot!

    With the exception vector placed at 0x20 it boots to the shell. It even lets me run some commands. Great!!

    I do not mean to offend anyone or to say what is right and what is not, but I cannot find an explanation for this. I mean: I did not find this information anywhere in the Nios II datasheets or in the wiki. I found suggestions and examples, but not constraints.

    I may have skipped a few lines in the PDF or on the website and missed it.

    However, I'll try to track down where this difference in addresses changes the software's behaviour. I'll keep the community informed if I find something. Please let me know if I missed something in the documentation. Thanks,

    Regards,

    Gabriele Gobbin
  • Altera_Forum

    I also remember you having the same problem I had some time ago with the kernel timer above 100 Hz. At that time I was using the no-MMU kernel (and processor) and gave up while searching for the cause of the problem (my hypothesis was that the scheduler was too heavy for an 80 MHz Nios II)... I hope this time it is only something I skipped while reading, or something I'll manage to find.

  • Altera_Forum

    Hello!

    Just for everyone's information, in the hope that this saves other users some trouble.

    There is a constraint on the Reset Vector Address that seems to be undocumented. The constraint does not depend on the Nios II architecture, so it is not found in the Altera documentation. It depends on the kernel image generation process and, in my personal opinion, should be stated in the wiki documentation.

    The kernel image construction does not depend on the FPGA memory map; it depends on the Nios II MMU memory partitioning.

    The assembly code in arch/nios2/kernel/head.S copies exc_hook to the exception address specified in SOPC Builder. Then it copies the fast TLB miss handler to its specified address. The problem in my configuration is that the exception address configured in SOPC Builder is where the fast TLB miss handler code resides at that moment, so the exception handler is copied inside the TLB miss handler.

    At the moment a FIXUP comment exists in the file. A possible fix is to add some instructions that check which label has to be copied first, but that patch might be valid only for a small set of cases. I suggest that the wiki be updated; whom do I need to talk to?

    Thanks,

    Gabriele Gobbin
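
The overlap described in the last reply comes down to simple interval arithmetic, sketched below. This is only an illustration: the two exception offsets come from this thread, but the handler size and the location of the fast TLB miss handler code are made-up placeholders (the thread does not state them), and regions_overlap and copy_would_clobber are hypothetical helpers, not code from head.S.

```python
def regions_overlap(start_a, size_a, start_b, size_b):
    """Return True if [start_a, start_a + size_a) intersects [start_b, start_b + size_b)."""
    return start_a < start_b + size_b and start_b < start_a + size_a


# Exception vector offsets taken from this thread.
EXC_OFFSET_FAILING = 0x60
EXC_OFFSET_WORKING = 0x20

# Placeholders only: the thread does not give these values.
EXC_HOOK_SIZE = 0x20        # assumed size of the copied exception hook
TLB_HANDLER_START = 0x70    # assumed location of the fast TLB miss handler code
TLB_HANDLER_SIZE = 0x100    # assumed size of the fast TLB miss handler code


def copy_would_clobber(exc_offset):
    """True if copying the exception hook to exc_offset would overwrite handler code."""
    return regions_overlap(exc_offset, EXC_HOOK_SIZE,
                           TLB_HANDLER_START, TLB_HANDLER_SIZE)


if __name__ == "__main__":
    for offset in (EXC_OFFSET_FAILING, EXC_OFFSET_WORKING):
        print(hex(offset), copy_would_clobber(offset))
```

A check of this kind could run at build time to reject a vector placement that collides with code that has not been copied yet, instead of relying on the copy order in head.S.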