--- Quote Start ---
Originally posted by chmike, Nov 6 2006, 12:03 PM
Hello,
--- Quote Start ---
Originally posted by dave, Nov 5 2006, 04:01 AM
I too tried the web server example and had similar problems...
--- Quote End ---
I am investigating the problem. The web server driver manages to initialize itself with DHCP, but it hangs after less than a minute of repeated pinging.
Cstc1 very kindly sent me his driver, which I tested on Thursday and Friday. Unfortunately I didn't manage to make it initialize itself with DHCP, but it keeps capturing packets on the network for many minutes without hanging. I couldn't test the ping yet. I will try with a 'hard wired' IP address today.
The most important difference between the two codes is that cstc1 has made packet sending and receiving a critical section with a semaphore. Another difference is that cstc1's interrupt handling code simply queues a request to process incoming data, whereas the web server driver reads in the packet while in the interrupt code. These two differences may explain why cstc1's driver is more stable.
Though I have some doubts about its performance, since it adds a 20 usec delay after each data read from the driver. Is that required? The reason why it can't initialize with DHCP is also still unclear.
cstc1's driver also gives very terse feedback on the link up/down state and stats. But for now we are looking for a working solution.
--- Quote End ---
Hi,
I did a little poking around. It seems that the board stops responding because it stops correctly receiving packets. I don't think this is a driver issue (although making changes to the driver seems to affect the frequency of the errors, not sure why). My reason for saying this:
The Ethernet controller writes received packets into a circular buffer. The driver reads this buffer through a special register (0xf2) that slides along the buffer every time it is read, plus another register (0xf0) that just reads the value at the current position without sliding. When a packet is available in the buffer, the data from the current position should look something like:
01 <status> <packet length> <packet data> <crc> [next packet...]
Or when a packet is not available:
00
So after an interrupt from the Ethernet controller, the driver reads the first byte (without sliding the position) to check that it is 01 before continuing. Having put some printfs in the driver code, I found that after a short while the first byte is neither 00 nor 01 but some seemingly random number. This would seem to indicate that the driver had gotten "out-of-sync" with the Ethernet controller in terms of the position of the sliding register, but I've compared the packet receiving code with the datasheets and "application notes" (which basically contain example driver code) and it appears to match perfectly.
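To make the check described above concrete, here is a minimal sketch in C. It does not touch real hardware: `rx_fifo`, `fifo_peek`, `fifo_read`, `rx_check` and `rx_read_header` are hypothetical names, and a plain byte array stands in for the controller's buffer. The header field sizes (one status byte, two-byte little-endian length) are an assumption, since only the order of the fields is given above.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for the controller's receive buffer. A real driver
   would access it through registers 0xf0 (peek) and 0xf2 (read + slide). */
typedef struct {
    const uint8_t *data;
    size_t len;
    size_t pos;
} rx_fifo;

/* Like reading 0xf0: return the current byte without moving the position.
   Past the end of the simulated data we return 00 ("no packet"). */
static uint8_t fifo_peek(const rx_fifo *f) {
    return f->pos < f->len ? f->data[f->pos] : 0x00;
}

/* Like reading 0xf2: return the current byte and slide to the next one. */
static uint8_t fifo_read(rx_fifo *f) {
    return f->pos < f->len ? f->data[f->pos++] : 0x00;
}

typedef enum { RX_EMPTY, RX_PACKET, RX_OUT_OF_SYNC } rx_state;

/* Classify the first byte: 00 = nothing waiting, 01 = packet waiting,
   anything else = driver and controller disagree about the position. */
static rx_state rx_check(const rx_fifo *f) {
    uint8_t ready = fifo_peek(f);
    if (ready == 0x00) return RX_EMPTY;
    if (ready == 0x01) return RX_PACKET;
    return RX_OUT_OF_SYNC;
}

/* Consume the header and return the packet length.
   Assumed layout: 01, status byte, length low byte, length high byte. */
static uint16_t rx_read_header(rx_fifo *f, uint8_t *status) {
    (void)fifo_read(f);              /* the 01 byte, already checked */
    *status = fifo_read(f);
    uint16_t lo = fifo_read(f);
    uint16_t hi = fifo_read(f);
    return (uint16_t)(lo | (hi << 8));
}
```

The point of the sketch is that everything depends on the position staying in step: if even one read is lost or duplicated, the next `rx_check` lands on packet data instead of a 00/01 byte and reports `RX_OUT_OF_SYNC`.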
However, I added some more printfs to display the 01/00 byte; if the byte was 01, the following status and packet length; and if the byte was neither 01 nor 00 (i.e. some random number), the next 200 bytes in the buffer. I then pinged the board (monitoring with Ethereal); the output was:
Use static IP configuration, IP =# 192.168.0.128
01 status: 0001 length: 004a
00
01 status: 0001 length: 004a
00
01 status: 0001 length: 004a
00
01 status: 0001 length: 004a
00
01 status: 0001 length: 004a
00
01 status: 0001 length: 0042
66
===========
6766 6968 fb57 2bf4 1100 08 45 3c00 456b 00 180 284e a8c0 100 a8c0 200 08 5c39 0
4 10 6261 6463 6665 6867 6a69 6c6b 6e6d 706f 7271 7473 7675 6177 6362 6564 6766
6968 45d4 f887 01 4e 9000 ae00 00 1100 cd09 1172 08 45 3c00 836c 00 180 ea4c a8c
0 100 a8c0 200 08 5c38 04 11 6261 6463 6665 6867 6a69 6c6b 6e6d 706f 7271 7473 7
675 6177 6362 6564 6766 6968 e546 270a 01 4e 9000 ae00 00 1100 cd09 1172 08 45 3
c00 dd6c 00 180 904c a8c0 100 a8c0 200 08 5c37
===========
According to Ethereal, all packets sent were of length 74 (4a hex), but the length of the last packet as read from the Ethernet controller was 42 hex (66 decimal, 8 bytes less). The first 4 bytes in the buffer match those reported in Ethereal as the last in the packet (with the 4-byte checksum following). The next byte is then 00 as expected (to indicate no more packets). Sorry about the format of the data: it's all little endian but printed as 16-bit ints.
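For anyone who wants to line the dump up against a capture, the 16-bit words have to be split back into bytes, low byte first. A minimal sketch in C (the helper name `words_to_bytes` is made up for illustration):

```c
#include <stdint.h>
#include <stddef.h>

/* Split 16-bit words, as printed in the dump, back into byte order.
   Little endian: the low byte of each word comes first.
   'out' must have room for 2 * n bytes. */
static void words_to_bytes(const uint16_t *words, size_t n, uint8_t *out) {
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = (uint8_t)(words[i] & 0xff);  /* low byte  */
        out[2 * i + 1] = (uint8_t)(words[i] >> 8);    /* high byte */
    }
}
```

Applied to the first two words of the dump, 6766 6968, this gives the bytes 66 67 68 69 (ASCII "fghi"), followed by 57 fb f4 2b from the next two words, which would be the 4 checksum bytes mentioned above.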
I have tried inserting delays (up to 50us) between reads and writes to the ethernet controller, no difference. I can only assume this means that the data is getting corrupted either in the RAM on the ethernet controller or when it is read from the controller. I have tried lowering the NIOS system clock from 100MHz to 50MHz and playing with the settings for the PLL controlling the clock to the controller, no difference.
My board may be faulty but as you seem to be experiencing similar problems I'm not so sure...
I don't really have any experience with this kind of thing so I'm not sure what to try next...