Hi,
SPI interfaces usually have a clock polarity bit and a clock phase bit (CPOL and CPHA). You have to set these bits to the same values on both sides. In my experience, the most common setting is CPOL=CPHA=0. With CPOL low, the clock idles low; with CPOL high, the clock idles high. With CPHA low, the receiver samples data on the first (leading) edge of each clock pulse. With CPHA high, it samples on the trailing edge. In each case, it is best for the master to output data on the edge that the slave does not use for sampling.
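To make the CPOL/CPHA combinations concrete, here is a small sketch in C. The function names are my own and not from any vendor library. It assumes the usual convention of numbering the modes as (CPOL << 1) | CPHA, and works out which clock edge the receiver samples on.

```c
#include <stdbool.h>

/* Conventional SPI mode number (0..3), assuming mode = (CPOL << 1) | CPHA. */
static unsigned spi_mode(bool cpol, bool cpha)
{
    return ((unsigned)cpol << 1) | (unsigned)cpha;
}

/* Returns true if the receiver samples on the rising edge of the clock.
 * CPOL=0: leading edge is rising, trailing edge is falling.
 * CPOL=1: leading edge is falling, trailing edge is rising.
 * CPHA=0 samples on the leading edge, CPHA=1 on the trailing edge,
 * so sampling happens on a rising edge exactly when CPOL == CPHA. */
static bool spi_samples_on_rising(bool cpol, bool cpha)
{
    return cpol == cpha;
}
```

For the common CPOL=CPHA=0 case this gives mode 0 with sampling on the rising edge, so the master should change its output on the falling edge.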
You should also set one side as the master and the other side as the slave. Master-Master and Slave-Slave cannot work. The master generates the clock. On the slave side, you may have an nSS signal (active-low slave select). If enabled, nSS must go active (low) before the transfer starts and must remain low for the duration of the byte being transferred. On the Atmel AVRs, the nSS signal is optional. I would recommend using it, however, since it allows you to synchronize the transfer. Without it, your transfer may lose byte synchronization if you have a noise glitch on the clock line. One other thing to check is whether the clock frequency generated by the master falls within the limits of the slave. Refer to the datasheets.
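Here is a rough software model of why nSS helps with synchronization. It is only a sketch: the struct and function names are made up for illustration, it assumes CPOL=CPHA=0 with the most significant bit sent first, and it assumes the slave throws away a partial byte whenever nSS is high. Check your slave's datasheet for its real behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a mode 0 SPI slave receiver (not a real driver). */
typedef struct {
    bool     sck;        /* clock level seen on the previous call */
    uint8_t  shift;      /* receive shift register */
    uint8_t  nbits;      /* bits shifted in for the current byte */
    uint8_t  last_byte;  /* most recent complete byte */
    unsigned bytes;      /* number of complete bytes received */
} spi_slave_model;

/* Present one snapshot of the nSS, SCK and MOSI pins to the slave model. */
static void slave_pins(spi_slave_model *s, bool nss, bool sck, bool mosi)
{
    if (nss) {
        /* Deselected: discard any partial byte. This is the resync step. */
        s->nbits = 0;
        s->shift = 0;
    } else if (sck && !s->sck) {
        /* Rising edge while selected: sample MOSI. */
        s->shift = (uint8_t)((s->shift << 1) | (mosi ? 1u : 0u));
        if (++s->nbits == 8) {
            s->last_byte = s->shift;
            s->bytes++;
            s->nbits = 0;
        }
    }
    s->sck = sck;
}

/* Master side: frame one byte with nSS and clock it out MSB first. */
static void master_send(spi_slave_model *s, uint8_t byte)
{
    slave_pins(s, false, false, false);      /* nSS low, clock idle low */
    for (int i = 7; i >= 0; i--) {
        bool bit = (byte >> i) & 1u;
        slave_pins(s, false, false, bit);    /* change data while clock is low */
        slave_pins(s, false, true, bit);     /* rising edge: slave samples */
    }
    slave_pins(s, false, false, false);      /* clock back to idle */
    slave_pins(s, true, false, false);       /* nSS high ends the frame */
}
```

If a stray clock pulse arrives between frames, the model ends up with one bit already counted. The next rising edge of nSS clears that count, so the following byte is received correctly. Without nSS, every later byte would stay shifted by one bit.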
A last thing to check: if you send more than one byte at a time, make sure you do not send them faster than the slave can read each received byte from its receive buffer. One option is to choose a clock speed low enough to guarantee this (taking into account interrupt latency etc.).
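As a rough way to pick that clock speed, here is a back-of-the-envelope sketch. The function name is made up. It assumes the bytes are sent back to back with no gap, and that the slave only has to read a received byte before the next one has been completely shifted in, i.e. within 8 clock periods. If your slave has a different buffer arrangement, the numbers change.

```c
#include <stdint.h>

/* Highest SCK frequency in Hz at which back-to-back bytes are safe,
 * given the slave's worst-case time (in microseconds) to read a byte,
 * including interrupt latency. Assumes the slave has one full byte time
 * (8 clock periods) to read:  8 / f_sck >= t_service. */
static uint32_t max_sck_hz(uint32_t worst_case_service_us)
{
    if (worst_case_service_us == 0) {
        return UINT32_MAX;   /* no limit from the service time */
    }
    return (uint32_t)(8000000ULL / worst_case_service_us);
}
```

For example, with a worst-case service time of 20 us this gives 400000, so the master clock should stay at or below 400 kHz. The alternative is to keep a faster clock and have the master pause between bytes for at least the service time.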
The problem with SPI is that it is very much like a UART: there is no flow control or feedback from the slave to stop the master if the slave needs time to process data. In that respect I2C is better (but more complex).
Regards,
Niki