It's not possible to run an SMP Linux system on Altera NIOS (there have been several discussions on this). Altera would need to implement SMP-safe atomic instructions first. (See e.g. "load locked" / "store conditional" style instructions, such as ARM's LDREX/STREX or MIPS's LL/SC, for a decent way to do this.)
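To make it concrete what SMP code needs from the CPU, here is a rough sketch in portable C11. The `demo_*` names are made up for illustration and are not taken from NIOS or Linux sources. On a CPU with load-locked/store-conditional, a compiler would typically turn the compare-and-exchange calls into such a retry loop; without an atomic primitive of this kind there is nothing correct to turn them into on a multi-core system.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names for illustration only. */
typedef struct { atomic_int locked; } demo_spinlock;

/* Atomically add delta to *p and return the old value.
 * Written as an explicit retry loop to mirror the LL/SC pattern:
 * read, compute, try to store, retry if another CPU got in between. */
static int demo_fetch_add(atomic_int *p, int delta)
{
    int old = atomic_load_explicit(p, memory_order_relaxed);
    /* On failure, old is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak_explicit(
               p, &old, old + delta,
               memory_order_acq_rel, memory_order_relaxed)) {
        /* nothing to do, just try again */
    }
    return old;
}

/* Try to take the lock once; true means we own it now. */
static bool demo_trylock(demo_spinlock *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Release the lock; the caller must currently hold it. */
static void demo_unlock(demo_spinlock *l)
{
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

A kernel spinlock would spin on `demo_trylock()` until it succeeds. The point is only that both helpers depend on a read-modify-write that no other CPU can interrupt.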
I understand that you created a NIOS-compatible CPU yourself. There you can of course implement such instructions, but I suspect that a NIOS clone (done in Verilog or whatever) will be much slower than Altera's own core, which uses low-level optimizations that Verilog and friends don't provide.
I agree that implementing a MIPS clone seems more appropriate than implementing a NIOS clone. There are some free 32-bit CPUs available as Verilog code on the net.
Of course, cache synchronization between the CPUs is a huge task.
-Michael