Altera_Forum
15 years ago

Gcc4 Frame Pointer Issue

I've been trying to write some code in an MPU exception handler to trap and dump the call stack when an intermittent exception occurs in our system, and I'm pretty sure I've found a bug in the latest Gcc4 compiler that ships with Quartus 10.0. I haven't tried it on 10.1 yet.

It seems that under gcc4 the frame pointer is not calculated correctly in all cases.

Here's the generated asm code from gcc3 (10.0):

addi sp,sp,-120
stw ra,116(sp)
stw fp,112(sp)
stw r16,108(sp)
stw r17,104(sp)
stw r18,100(sp)
stw r19,96(sp)
stw r20,92(sp)
addi fp,sp,112

And the corresponding code from Gcc4:

addi sp,sp,-116
stw ra,112(sp)
stw fp,108(sp)
stw r19,104(sp)
stw r18,100(sp)
stw r17,96(sp)
stw r16,92(sp)
addi fp,sp,92

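Under Gcc4 the frame pointer does not point to the location documented in the Nios II Processor Reference Handbook, Section II-7 Stacks. You can see the difference in the addi fp,sp,xx instruction: Gcc3 sets fp to the slot holding the saved fp (sp+112), whereas Gcc4 sets it to the lowest saved register (sp+92) instead of the saved fp at sp+108. Gcc3 does seem to do it right.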
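To make the offset arithmetic concrete, here is a minimal sketch, assuming the Gcc4 layout in the listing above holds in general (fp at the lowest callee-saved register, the saved fp directly above the saved registers, and ra one word above that). The helper names are hypothetical, and the number of saved registers is not available at run time unless you decode the function prologue, so this only illustrates why a plain fp-chain walk breaks.

```c
#include <stdint.h>

#define NIOS2_WORD 4u  /* size in bytes of one saved register slot */

/* Byte offset from the Gcc4 fp to the slot holding the caller's fp,
   given how many callee-saved registers were pushed below it.
   Assumption: layout matches the Gcc4 prologue shown above. */
unsigned saved_fp_offset(unsigned n_callee_saved)
{
    return n_callee_saved * NIOS2_WORD;
}

/* Byte offset from the Gcc4 fp to the saved return address,
   which sits one word above the saved fp. */
unsigned saved_ra_offset(unsigned n_callee_saved)
{
    return saved_fp_offset(n_callee_saved) + NIOS2_WORD;
}
```

For the Gcc4 listing, four registers (r16 to r19) are saved, so the saved fp is at fp+16 (sp+108) and ra at fp+20 (sp+112). With zero saved registers the offsets reduce to 0 and 4, which is the layout the Gcc3 code produces.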

Also I've found that the __builtin_return_address() function does not work for Gcc4. Any argument greater than 0 returns 0 for the address under Gcc4.

Rebuild under Gcc3 and it works fine.
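For reference, the GCC documentation only treats level 0 of __builtin_return_address() as dependable; for nonzero levels it warns that the result may be 0 or a random value on some machines. A minimal sketch of the level 0 use follows; the wrapper name is hypothetical, and the noinline attribute is there so the wrapper stays a real call.

```c
#include <stddef.h>

/* Returns the address this function will return to, i.e. a location
   inside its caller. Level 0 is the only level GCC documents as
   reliable on every target. */
__attribute__((noinline)) void *current_return_address(void)
{
    return __builtin_return_address(0);
}
```

Calling current_return_address() from a handler gives one code address inside that handler, but it does not replace a full stack walk.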

Can anyone shed some light on this? If it's not a bug, how should the call stack be walked now? The following code works under gcc3 but fails under gcc4.

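struct stack_frame {
    struct stack_frame* next;   /* caller's saved frame pointer */
    void* ret;                  /* saved return address (ra) */
};

struct stack_frame* fp;
__asm ("mov %0, fp" : "=r" (fp) );   /* read the current frame pointer */

alt_u32 cnt = 0;
while(fp && (cnt < 10)) {            /* walk at most 10 frames */
    isr_printf("%x: fp(%x) pfp(%x) ra(%x)\n", cnt, (unsigned)fp, (unsigned)fp->next, (unsigned)fp->ret);
    fp = fp->next;
    cnt++;
}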

:cry:

18 Replies

  • Altera_Forum

    We ended up walking the stack with our own code, examining every word that looked like an address in the code space.

    If the address followed a call op code we printed the address. It was a bit of effort but works with gcc3 / gcc4.

    We are building with 13.1 now, but I never bothered looking at whether this was fixed as our code still worked correctly.
  • Altera_Forum

    I've just reread the recent posts and noticed that Altera are no longer including gcc3.

    Have they fixed their gcc4 build so that it can generate 'pure code'?

    The earlier gcc4 builds would always place switch statement jump tables directly in the code segment. Something that you definitely don't want if you are using tightly coupled instruction and data memories.

    If they haven't, anyone who is generating anything small will need to rebuild gcc anyway.
  • Altera_Forum

    Update: Quartus 13.1 with GCC 4.7.3 has exactly the same issue.

    It seems that the error occurs whenever the callee saved registers (r16-r23) get pushed onto the stack upon a function call. Then the frame pointer misses the correct position by <numOfCalleeRegs>*4 Bytes.

    Simple code example:

    class MyClass {
    public:
    	MyClass() {};
    	virtual ~MyClass() {};
    };
    void FramePointerIssue( void ){
    	MyClass* foo = new MyClass();
    	delete foo;
    }
    int main(void){
    	FramePointerIssue();
    	return -1;
    }

    Question: Is there an easy way to find out whether an address lies within the code space or not? I know that the information can be found in the *.objdump file, but is there also a programmatic way of accessing it?
  • Altera_Forum

    Just a dumb thought on this one: Can you rename the gcc3 executable to gcc4 and get the IDE to invoke the renamed gcc3? Or are the compilers that incompatible?

    BillA
  • Altera_Forum

    The linker script could well add global symbols at the ends of the code space.

    It should then be possible to define C symbols with the same names and take their addresses.

    Actually it probably isn't that difficult to fix gcc to output the correct prologue and epilogue.

    It is also worth checking the epilogue. There was a bug in gcc for arm that caused it to adjust the stack pointer and then load the return value from an address below the sp - which an interrupt could have overwritten.
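
    The linker-symbol idea at the start of this reply could look roughly like the sketch below. The range check itself is plain C; the symbol names in the trailing comment (_start and __etext) are the ones used in a later post in this thread and are an assumption about your linker script, so check the generated script before relying on them.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* True if addr lies in the half-open range [start, end). */
    bool addr_in_region(const void *addr, const void *start, const void *end)
    {
        uintptr_t a = (uintptr_t)addr;
        return a >= (uintptr_t)start && a < (uintptr_t)end;
    }

    /* On target, assuming the linker script exports these symbols:
     *   extern char _start;
     *   extern char __etext;
     *   bool in_text = addr_in_region(p, &_start, &__etext);
     */
    ```

    Only the addresses of such symbols are meaningful; their contents should never be read as variables.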
  • Altera_Forum

    To those who are interested in Altera's official statement:

    --- Quote Start ---

    ...

    the GCC is not setting the frame pointer as documented in the ABI. Both the text and diagrams in the ABI document say the FP points at the saved FP on the stack. Instead, GCC is pointing it at the last saved register (i.e., at the low end of the frame), and consistently applying the wrong constant offset to all fp-based stack references. This bug was observed on the GCC 4.7.

    ...

    We have already filed this issue to Mentor (tracking number 16009) and the fixes is planned to be included in our Nios II GNU toolchain (upgrade to GCC4.8) in a future version of the Quartus II software.

    --- Quote End ---

    ... So until this future becomes present, we do have to live with workarounds.
  • Altera_Forum

    Unless Mentor are willing to post the change somewhere and allow you to rebuild the older gcc with that specific fix applied.

  • Altera_Forum

    If anyone's interested, we use the following function to dump the stack of a UCOS thread. It should be fairly straightforward to modify for non-UCOS systems:

    // these symbols are generated by the linker, using the standard linker script.
    extern char _start;
    extern char __etext;  
    extern char __ram_exceptions_start;
    extern char __ram_exceptions_end;
    void DumpStackTcb(OS_TCB* a_pTCB)
    {
      void* sp = 0;
      //lint -e{522}
      __asm ("mov %0, sp" : "=r" (sp) );
      void*   pTextStart   = &_start;
      void*   pTextEnd     = &__etext;
      void*   pExceptStart = &__ram_exceptions_start;
      void*   pExceptEnd   = &__ram_exceptions_end;
      OS_STK* pStackBottom = a_pTCB->OSTCBStkBottom;
      OS_STK* pStackTop    = a_pTCB->OSTCBStkBottom + a_pTCB->OSTCBStkSize ;    // OSTCBStkSize is num of elements, not bytes
      OS_STK* pStackCur    = a_pTCB->OSTCBStkPtr;
      
      isr_printf("TS(0x%x) TE(0x%x) ES(0x%x) EE(0x%x)\n", (unsigned)pTextStart, (unsigned)pTextEnd, (unsigned)pExceptStart, (unsigned)pExceptEnd);
      isr_printf("id(0x%x) sb(0x%x) st(0x%x) ss(0x%x) cur(0x%x) sp(0x%x)\n"
        , a_pTCB->OSTCBPrio 
        , (unsigned)pStackBottom
        , (unsigned)pStackTop
        , a_pTCB->OSTCBStkSize * sizeof(OS_STK)
        , (unsigned)pStackCur
        , (unsigned)sp
      );
      
      u32 Cnt = 0;
      for(OS_STK* pStack = pStackCur; (pStack >= pStackBottom) && (pStack < pStackTop) ; pStack++) {
        void* pPossibleAddr = (void*)*pStack;
        if(((((unsigned)pPossibleAddr) & 0x3) == 0) &&
            (((pPossibleAddr > pTextStart) && (pPossibleAddr <= pTextEnd)) ||
            ((pPossibleAddr > pExceptStart) && (pPossibleAddr <= pExceptEnd)))) {
          u32 PrevInstr = *((OS_STK*)pPossibleAddr-1);                                                // check if instruction before poss. return address is a call
          if( ((PrevInstr & 0x3f) == 0) || ((PrevInstr & 0x07ffffff) == 0x003ee83a) ) {               // check call / callr
            isr_printf("0x%x: ra(0x%x) pi(0x%x)\n", Cnt, (unsigned)pPossibleAddr, PrevInstr);
            Cnt++;
          }
        }
      }
    }
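
    The call / callr test inside the loop can be pulled out and checked on a host machine. This is a minimal sketch that reuses the two masks from the function above; the helper names are hypothetical. Note that it is only a heuristic: any word whose low six bits are zero, including a data word of 0, passes the call test.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Nios II "call": the opcode is in bits 5..0 and equals 0x00. */
    bool is_call(uint32_t instr)
    {
        return (instr & 0x3Fu) == 0x00u;
    }

    /* Nios II "callr rA": the mask drops the 5-bit rA field (bits 31..27)
       and compares the rest against the constant used in the post above. */
    bool is_callr(uint32_t instr)
    {
        return (instr & 0x07FFFFFFu) == 0x003EE83Au;
    }

    /* A stack word is a plausible return address if the instruction
       just before it is a call or callr. */
    bool follows_call(uint32_t prev_instr)
    {
        return is_call(prev_instr) || is_callr(prev_instr);
    }
    ```

    For example, 0x103ee83a (callr r2) and 0x00048d00 (a call with a nonzero target field) both pass, while 0xf800283a (ret) does not.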