We've managed to hack the files generated by sopc to remove the second clock. The image builds, but there is still something not quite right somewhere.
The TCM is on s2; we could switch to s1 if that might make a difference.
The problem with any mutex is that the cpu with the TCM doesn't have any spare clock cycles to waste waiting for the mutex to be available - or really to test it either.
The cpu is doing HDLC transmit (entirely in software) and has about 190 clocks to process a receive and transmit byte. The tx side only needs to look at shared data when looking for a new frame, so it can send an extra flag instead of waiting. The rx side has bigger problems, since it can't stall, although it doesn't need to look at any sequence numbers.
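
For what it's worth, the "send an extra flag" idea on the tx side could look roughly like the sketch below. This is only an illustration, not our actual code: the C11 `atomic_flag` stands in for whatever lock ends up protecting the shared data, and `struct tx_frame`, `pending`, `try_lock()`, `unlock()` and `tx_claim_next_frame()` are all made-up names. The only fixed fact is the HDLC flag byte, 0x7E. The point is that the tx side makes a single attempt at the lock and never spins.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define HDLC_FLAG 0x7Eu /* standard HDLC frame delimiter */

/* Hypothetical stand-in for the lock guarding the shared data. */
static atomic_flag shared_lock = ATOMIC_FLAG_INIT;

/* Hypothetical descriptor for a frame queued by the other cpu. */
struct tx_frame {
    const uint8_t *data;
    uint16_t len;
};

/* Shared state, only touched while shared_lock is held. */
static struct tx_frame pending;
static bool pending_valid;

/* One attempt only: returns true if we got the lock, never waits. */
static bool try_lock(void)
{
    return !atomic_flag_test_and_set_explicit(&shared_lock,
                                              memory_order_acquire);
}

static void unlock(void)
{
    atomic_flag_clear_explicit(&shared_lock, memory_order_release);
}

/*
 * Called by the tx side between frames. Returns true and fills *out
 * when a new frame was claimed. Returns false when the lock is busy
 * or nothing is queued; the caller then sends HDLC_FLAG and tries
 * again on the next byte time.
 */
static bool tx_claim_next_frame(struct tx_frame *out)
{
    if (!try_lock())
        return false; /* busy: send a flag rather than stall */

    bool got = pending_valid;
    if (got) {
        *out = pending;
        pending_valid = false;
    }
    unlock();
    return got;
}
```

The byte loop would then do something like `if (!tx_claim_next_frame(&cur)) send(HDLC_FLAG);`, where `send()` is again hypothetical. This doesn't help the rx side, which has no equivalent of an idle flag to fall back on.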