Running MCU code directly from the flash will be slow. The 5 cycles are not even the full truth:
I have a design that uses the UFM as a lookup table, and it turned out that the UFM has an even longer access cycle. Despite what the MAX1000 UFM guide says, the memory raises its busy pin for some cycles after each access, so you cannot simply read the next word after 5 cycles. Ignoring the busy bit gives you invalid read data. In experiments I found that a completely random access (no burst mode) takes 12 clock cycles.
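To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. The 5 and 12 cycle figures are the ones from above; the 50 MHz clock is purely an assumption for illustration, so plug in whatever your design actually runs at. The function names are made up for this example.

```python
def reads_per_second(clock_hz: float, cycles_per_read: int) -> float:
    """Random-access reads per second when each read costs a fixed number of cycles."""
    return clock_hz / cycles_per_read


def slowdown(nominal_cycles: int, measured_cycles: int) -> float:
    """How many times slower the measured access is compared to the nominal one."""
    return measured_cycles / nominal_cycles


NOMINAL_CYCLES = 5       # figure quoted for a read
MEASURED_CYCLES = 12     # what I measured for random access, busy time included
CLOCK_HZ = 50_000_000    # ASSUMPTION: example clock, not taken from any datasheet

if __name__ == "__main__":
    print("nominal reads/s: ", reads_per_second(CLOCK_HZ, NOMINAL_CYCLES))
    print("measured reads/s:", reads_per_second(CLOCK_HZ, MEASURED_CYCLES))
    print("slowdown factor: ", slowdown(NOMINAL_CYCLES, MEASURED_CYCLES))
```

So for random access you get well under half of the throughput the 5-cycle figure would suggest, before the MCU has done any actual work with the fetched word.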