You're so totally right, yes. And I know, I have to implement the valid and ready signals in my logic. But I'm currently switching from MM to ST. I can't change my logic at the moment because it is tested and reviewed and stuff. It's company policy stuff (or something like that).
Besides, I have already managed to write and read the correct data. It was a software failure: in my case I had to change the order of the bytes to write correct data to the fifos. After that everything worked fine.
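For anyone hitting the same thing, here is a minimal sketch of the kind of byte reordering I mean. It assumes the fix is a plain reversal of all eight bytes of a 64-bit word before it is written to the fifo; your mapping may be different. The name `swap_bytes64` is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of one 64-bit word.
 * Assumption: a full 8-byte reversal is the reordering that is needed. */
static uint64_t swap_bytes64(uint64_t w)
{
    uint64_t out = 0;
    for (int i = 0; i < 8; i++) {
        out = (out << 8) | (w & 0xFFu); /* append the lowest byte of w */
        w >>= 8;                        /* move on to the next byte */
    }
    return out;
}
```

You would call this on each word right before the write to the fifo, and again on each word read back if the read path needs the same correction.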
You should also know that I work with a kind of multiplexer/demultiplexer on the fifos. This means I take the 64-bit data from the fifo and make an 8-bit wide stream from it. The 8-bit stream runs on a slower clock than the fifo. So when the fifo output is valid, I now and then generate a one-clock-cycle ready signal to request new data. After the request, I even wait for a few clocks to get a valid output from the fifo.
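To make the read side a bit more concrete, here is a rough behavioral model in C (not my actual HDL). It assumes the most significant byte goes out first and it ignores the extra wait clocks after the request; `serializer_t` and `serializer_tick` are names invented for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the 64-bit to 8-bit stage. */
typedef struct {
    uint64_t word;      /* word currently being sent, next byte in bits 63..56 */
    unsigned remaining; /* bytes of the current word still to send */
} serializer_t;

/* One tick of the slow byte clock.
 * *ready is set for exactly one tick, when a new word is taken from the fifo.
 * Returns true if *byte_out holds a byte for this tick. */
static bool serializer_tick(serializer_t *s, bool fifo_valid, uint64_t fifo_data,
                            bool *ready, uint8_t *byte_out)
{
    *ready = false;
    if (s->remaining == 0) {
        if (!fifo_valid)
            return false;      /* nothing to send, no request issued */
        s->word = fifo_data;   /* take the word the fifo is presenting */
        s->remaining = 8;
        *ready = true;         /* single-tick request for the next word */
    }
    *byte_out = (uint8_t)(s->word >> 56); /* assumption: MSB first */
    s->word <<= 8;
    s->remaining--;
    return true;
}
```

The point is that ready pulses only once per eight byte ticks, which is why the fifo has plenty of time to present the next word.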
And on the input fifo I handle things similarly. I collect the data, write it to the output, generate a single valid pulse, and don't change the output for a while until I have new data.
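The write side can be modeled the same way. Again this is only a sketch in C under my own assumptions (first byte received ends up in the most significant position, eight bytes per word); `packer_t` and `packer_push` are made-up names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the 8-bit to 64-bit stage. */
typedef struct {
    uint64_t acc;   /* bytes collected so far */
    unsigned count; /* number of bytes currently in acc */
} packer_t;

/* Feed one byte. Returns true (the "valid" pulse) only when the eighth
 * byte completes a word; *word_out is written only in that case. */
static bool packer_push(packer_t *p, uint8_t byte, uint64_t *word_out)
{
    p->acc = (p->acc << 8) | byte; /* assumption: first byte ends up as MSB */
    p->count++;
    if (p->count < 8)
        return false;              /* still collecting, output unchanged */
    *word_out = p->acc;
    p->acc = 0;
    p->count = 0;
    return true;
}
```

Because the output word only changes on the eighth byte, it stays stable for a long time after the valid pulse, which matches what I described above.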
So my design works although I'm not following the spec, because I'm slow as hell!!! ;-)
Once I can finally implement the ready and valid signals properly, I will also be able to communicate better with the fifos and speed up the reading of data. (The speed of writing the data depends on other factors.)