Hi,
I think I found the cause of my problem.
Looking just at the code that computes the high word of variable t1, I see only one major difference between the two versions (register numbers refer to the alt_64 t2 version, which is listed first):
The version where the error happens loads ACCELERATION twice from Dual Port memory, first to r8, then to r4. The other version loads it once only and then merely copies the register content around.
If the other CPU in my system changed ACCELERATION in DPRAM between these two accesses, r8 could hold +0xDBE while r4 holds -0xDBE (or vice versa).
The sign extension in r5 is then derived from the value loaded into r8, but the carry (overflow) bit computed at PC=0x11C is derived from the value loaded into r4.
Do you agree that this might be causing my problems? If so, at least I know the cause, can implement proper workarounds, and do not have to worry about wrong 64-bit results in situations where no other CPU accesses the operands. I did not expect gcc to produce code that fetches the same volatile operand twice for a single computation.
Erratic alt_64 t2 version
ec: ldw r8,0(r6) /* r6 = &ACCELERATION */
f0: ldw r4,0(r6)
f4: ldw r2,0(r17) /* r17 = &VELOCITY */
f8: srai r5,r8,31
fc: ldw r3,4(r17)
104: add r6,r2,r4
11c: cmpltu r8,r6,r2
138: add r7,r3,r5
13c: add r8,r8,r7
144: mov r4,r8 /* => new VELOCITY_HI, sometimes wrong */
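To check my reasoning, here is a small C model of the sequence above. It is only a sketch under my own assumptions: the function name velocity_hi_model and its parameters are made up for illustration, and the two separate loads of ACCELERATION are passed in as load_r8 and load_r4 so that they can differ.

```c
#include <stdint.h>

/* Hypothetical model of the erratic sequence; this is not compiler output.
 * load_r8 and load_r4 stand for the two separate reads of ACCELERATION. */
uint32_t velocity_hi_model(uint32_t vel_lo, uint32_t vel_hi,
                           int32_t load_r8, int32_t load_r4)
{
    uint32_t sign  = (load_r8 < 0) ? 0xFFFFFFFFu : 0u; /* srai r5,r8,31   */
    uint32_t lo    = vel_lo + (uint32_t)load_r4;       /* add r6,r2,r4    */
    uint32_t carry = (lo < vel_lo) ? 1u : 0u;          /* cmpltu r8,r6,r2 */
    uint32_t hi    = vel_hi + sign;                    /* add r7,r3,r5    */
    return carry + hi;                                 /* add r8,r8,r7    */
}
```

With vel_lo = 0x1000 and vel_hi = 5, both loads equal to +0xDBE or both equal to -0xDBE give a high word of 5. With +0xDBE in load_r8 and -0xDBE in load_r4 the model returns 6, and with the values swapped it returns 4, so the high word ends up off by one in either direction.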
Working alt_32 t2 version
ec: ldw r9,0(r4) /* r4 = &ACCELERATION */
f0: ldw r2,0(r17) /* r17 = &VELOCITY */
f4: ldw r3,4(r17)
fc: mov r6,r9
100: srai r7,r9,31
110: add r4,r2,r6
114: cmpltu r8,r4,r2
138: add r5,r3,r7
13c: add r8,r8,r5
144: mov r7,r8 /* => new VELOCITY_HI, always correct */
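The working version reads ACCELERATION exactly once and then only copies registers. A workaround along those lines might look like the sketch below. The helper name add_acceleration_once is hypothetical, and I am assuming that alt_32 and alt_64 are plain signed 32- and 64-bit integers and that ACCELERATION is a 32-bit value, as the single ldw plus srai suggests.

```c
#include <stdint.h>

/* Assumption: alt_32 / alt_64 are the signed 32- and 64-bit HAL types. */
typedef int32_t alt_32;
typedef int64_t alt_64;

/* Hypothetical helper: take one snapshot of the shared operand, then do
 * all 64-bit arithmetic on the private copy. */
alt_64 add_acceleration_once(volatile alt_32 *acceleration, alt_64 velocity)
{
    alt_32 snapshot = *acceleration;    /* the only access to shared memory */
    return velocity + (alt_64)snapshot; /* sign extension and carry both use
                                           the same snapshot */
}
```

Because the local copy is not volatile, the compiler has no reason to go back to DPRAM for it, so the sign extension and the low-word addition should see the same value.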
Or did I put the "volatile" at the wrong place and should've defined something like
#define VELOCITY (*(alt_64 *volatile)((void *)DPRAM_BASE+8))
instead of
#define VELOCITY (*(volatile alt_64 *)((void *)DPRAM_BASE+8))
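As far as I understand the C declarator rules, a qualifier written before the * applies to the pointed-to object, while one written after the * applies to the pointer itself. So "volatile alt_64 *" is a pointer to volatile data, whereas "alt_64 *volatile" is a volatile pointer to ordinary data, and in a cast the qualifier on the pointer itself has no effect on the resulting value. The sketch below shows the two forms side by side; fake_dpram is a hypothetical stand-in for the real DPRAM, and the function names are mine.

```c
#include <stdint.h>

typedef int64_t alt_64;        /* assumption: HAL 64-bit signed type */

static alt_64 fake_dpram[2];   /* hypothetical stand-in for DPRAM */

/* Pointer to volatile data: the compiler must not drop or cache
 * accesses made through *p. */
alt_64 read_volatile_data(void)
{
    volatile alt_64 *p = &fake_dpram[1];
    return *p;
}

/* Volatile pointer to plain data: only p itself is volatile, so the
 * compiler may still cache or merge accesses to *p. */
alt_64 read_plain_data(void)
{
    alt_64 *volatile p = &fake_dpram[1];
    return *p;
}
```

Both functions return the same value here; the difference is only in what the compiler is allowed to optimize, which is why the first form is the one that marks the memory itself as volatile.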