Hi mendonca,
Just a guess: the Nios II/f has a data cache.
If another Avalon master changes the value of your flag in memory, the Nios II/f processor doesn't notice, because it keeps loading the value from its cache. If you load the value with the 'ldwio' opcode, the read bypasses the data cache and goes directly to memory.
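A rough sketch of what an uncached read could look like from C, assuming the Nios II HAL is available: as far as I know, the IORD_32DIRECT macro from io.h compiles to ldwio. The names READ_UNCACHED32 and read_flag are made up for illustration, and the #else branch is only a host fallback so the snippet builds off-target (it does not bypass any cache).

```c
#include <stdint.h>

#ifdef __NIOS2__
#include <io.h>   /* Nios II HAL header; IORD_32DIRECT is expected to emit ldwio */
#define READ_UNCACHED32(addr) ((uint32_t)IORD_32DIRECT((addr), 0))
#else
/* Host fallback (assumption for off-target builds): a plain volatile load. */
#define READ_UNCACHED32(addr) (*(volatile const uint32_t *)(addr))
#endif

/* Hypothetical helper: returns the current 32-bit value stored at flag_addr.
 * On Nios II the load goes around the data cache; on a host it is a normal load. */
uint32_t read_flag(uintptr_t flag_addr)
{
    return READ_UNCACHED32(flag_addr);
}
```

You would call it with the address of your shared flag, e.g. read_flag((uintptr_t)&my_flag), where my_flag stands for whatever your flag variable is called.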
And depending on your optimization level, the value may not be read from memory at all; the compiler may only check a copy held in a processor register. You can circumvent this by adding 'volatile' to your variable declaration.
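A minimal sketch of the 'volatile' part, with made-up names (ready_flag, wait_for_flag). The qualifier only forces the compiler to emit a real load on every pass through the loop; as far as I know it does not by itself get around the data cache with default compiler settings, so the cache issue still has to be handled separately.

```c
#include <stdint.h>

/* Hypothetical flag that another master writes; volatile stops the compiler
 * from keeping a copy in a register across loop iterations. */
volatile uint32_t ready_flag = 0;

/* Polls *flag up to max_polls times.
 * Returns 1 as soon as the flag is nonzero, 0 if it never changed. */
int wait_for_flag(volatile const uint32_t *flag, uint32_t max_polls)
{
    while (max_polls-- > 0) {
        if (*flag != 0) {
            return 1;
        }
    }
    return 0;
}
```

Without the volatile qualifier, an optimizing compiler would be allowed to load the flag once and then spin on the register copy.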
Also, isn't there a hardware mutex component that synchronizes access to memory between different processors?
Wolfgang