You can remove the instruction cache, but you have to be careful. You'll lose the instruction master, which forces you to build a system as follows:
1) All instructions must be placed in an on-chip RAM that's 32 bits wide and has a read latency of 1
2) The OCI core is not present (the OCI core hooks up to the instruction master, but without a cache you don't have an instruction master).
3) You don't have any other memory that needs an instruction master (you can have multiple tightly coupled instruction masters, but they can only connect to on-chip memories)
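
If it helps to treat the three conditions above as a checklist, here is a rough Python sketch that checks a hand-written description of a system against them. Everything in it (`Memory`, `SystemConfig`, `check_no_icache`, and the field names) is made up for illustration and is not part of any Altera tool or file format; it only encodes the rules listed above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Memory:
    # Hypothetical description of one memory in the system.
    name: str
    onchip: bool
    width_bits: int
    read_latency: int
    holds_code: bool = False  # True if the CPU fetches instructions from it


@dataclass
class SystemConfig:
    # Hypothetical description of the whole system.
    memories: List[Memory] = field(default_factory=list)
    has_oci_core: bool = False


def check_no_icache(cfg: SystemConfig) -> List[str]:
    """Return a list of reasons the instruction cache cannot be removed."""
    problems = []
    for m in cfg.memories:
        if not m.holds_code:
            continue
        if not m.onchip:
            # Rules 1 and 3: code can only live in on-chip memory, since
            # nothing else can be reached without an instruction master.
            problems.append(f"{m.name}: code must live in on-chip RAM")
        elif m.width_bits != 32 or m.read_latency != 1:
            # Rule 1: width and latency requirements.
            problems.append(
                f"{m.name}: on-chip RAM must be 32 bits wide with read latency 1"
            )
    if cfg.has_oci_core:
        # Rule 2: the OCI core needs the instruction master.
        problems.append("OCI core must be removed (no instruction master)")
    return problems


if __name__ == "__main__":
    cfg = SystemConfig(
        memories=[
            Memory("tcm", onchip=True, width_bits=32, read_latency=1, holds_code=True),
            Memory("sdram", onchip=False, width_bits=16, read_latency=3),
        ],
        has_oci_core=True,
    )
    for problem in check_no_icache(cfg):
        print(problem)
```

An empty list from `check_no_icache` only means the description passes these three rules; it says nothing about whether the real hardware will behave.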
So in short, you need to have a system that's well tested before jumping to this method. By the way, if you need to prototype a system like this and only wanted the OCI core to download your code, you can always use the GERMS monitor as an alternative (it's posted somewhere in this forum).