Hi,
Thank you, Michael. I read your post on the mailing list and learned a lot. Here is my humble opinion.
On the 'futex':
On a no-MMU, non-SMP Nios uClinux system, I think the 'futex' is an easy problem, because userland can control the Nios interrupt-enable bit directly. In that configuration, interrupts and task switches are the only things that can break atomicity.
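To illustrate the no-MMU, non-SMP case, here is a minimal sketch of a compare-and-swap that is atomic only because interrupts are masked around it. The names irq_save_disable(), irq_restore() and cas_irq_masked() are hypothetical, and the interrupt-enable bit is simulated by a plain variable so the sketch can be compiled on any host; on real hardware those two helpers would read and write the Nios status register instead.

```c
#include <stdbool.h>

/* Simulated stand-in for the Nios interrupt-enable bit (assumption:
 * on real hardware this would be a control-register access). */
static bool sim_irq_enabled = true;

/* Hypothetical helper: mask interrupts and return the previous state. */
static bool irq_save_disable(void)
{
    bool old = sim_irq_enabled;
    sim_irq_enabled = false;
    return old;
}

/* Hypothetical helper: restore the previously saved interrupt state. */
static void irq_restore(bool old)
{
    sim_irq_enabled = old;
}

/* Compare-and-swap that relies on masked interrupts for atomicity.
 * Valid only on a single CPU: no interrupt means no task switch,
 * so nothing else can touch *addr between the test and the store. */
static bool cas_irq_masked(volatile int *addr, int expected, int desired)
{
    bool saved = irq_save_disable();
    bool ok = (*addr == expected);
    if (ok)
        *addr = desired;
    irq_restore(saved);
    return ok;
}
```

This only works because nothing else can run while interrupts are off, which is exactly why it does not carry over to the MMU or SMP cases.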
In the MMU, non-SMP case, your plan to add the custom instructions 'lock1' and 'lock2' is dangerous for the OS's security, because it would let userland control interrupts. Moreover, we cannot observe how code flows through the Nios pipeline, so we cannot know the exact moment at which an interrupt is accepted.
It seems to me that the 'Common Area for atomic functions' idea takes away the flexibility of the 'futex'. From userland's point of view, a 'futex' is just an int-type counter variable located in shared memory. We can access it from anywhere (in userland, of course), and the kernel knows nothing about it until contention occurs.
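To make the "kernel doesn't know until contention" point concrete, here is a rough sketch of a futex-style lock in C11. It is only an illustration under assumptions: kernel_wait() and kernel_wake() are hypothetical stand-ins for the kernel's wait and wake paths (here they just count calls and never sleep), and the 0/1/2 state encoding is one common convention, not something taken from the Nios port.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the kernel's futex wait/wake paths.
 * They only count how often userland had to enter the "kernel". */
static int slow_path_calls = 0;
static void kernel_wait(atomic_int *f, int val) { (void)f; (void)val; slow_path_calls++; }
static void kernel_wake(atomic_int *f) { (void)f; slow_path_calls++; }

/* Futex word states: 0 = unlocked, 1 = locked, 2 = locked with waiters. */

/* Try to take the lock without ever involving the kernel. */
static int futex_trylock(atomic_int *f)
{
    int expected = 0;
    return atomic_compare_exchange_strong(f, &expected, 1);
}

static void futex_lock(atomic_int *f)
{
    int c = 0;
    /* Fast path: uncontended, pure userland, kernel never sees it. */
    if (atomic_compare_exchange_strong(f, &c, 1))
        return;
    /* Slow path: mark the word as contended and wait.
     * A real kernel_wait() would sleep; this stub returns at once,
     * so the loop degenerates into a spin. */
    if (c != 2)
        c = atomic_exchange(f, 2);
    while (c != 0) {
        kernel_wait(f, 2);
        c = atomic_exchange(f, 2);
    }
}

static void futex_unlock(atomic_int *f)
{
    int prev = atomic_exchange(f, 0);
    assert(prev != 0);      /* unlocking an unlocked futex is a bug */
    if (prev == 2)          /* only wake through the kernel if someone waits */
        kernel_wake(f);
}
```

In the uncontended lock/unlock sequence slow_path_calls stays at zero, which is the property a fixed 'Common Area' would have to preserve for any int in any shared mapping.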
Fundamentally, we should not modify the kernel's machine-independent code or add new code to it, and we should preserve the 'look & feel' of the userland interfaces (not only for the 'futex').
So I think emulation with a 'hardware mutex', as in Mr. Ben Twijnstra's post
http://sopc.et.ntust.edu.tw/pipermail/nios2-dev/2009-february/002439.html , is the best method, though we must change the futex's integer variable to a pointer to a 'hardware mutex'. I'm not sure how this will affect the kernel code, but perhaps we can hide everything in the machine-dependent code and its macros.
As for the possibility of SMP with Nios CPUs, I will write about that later.
Kazu