Hi, I'm Danny's coworker, I'll answer a few of those questions.
First of all, thanks for the suggestion that the PC is offloading the checksum. This kind of makes sense, since we are using a secondary interface (a PCI card) for the Nios, but an onboard LAN connection for the corporate LAN.
As for the configuration, here goes:
- Standalone lwIP web server converted to UDP, compiler optimizations set to -O3 (I think) for a large memory footprint but fast code, lwIP checksumming disabled
- A Nios II/f core, clocked at 135 MHz (the highest we could go in an ES Stratix II 2S60 with the SDRAM synced), no on-chip RAM or TCM, all unused devices removed (no button PIOs etc., just the bare essentials for what it needs to do), with a 16 KB instruction cache, a 64 KB data cache, and all hardware acceleration modes enabled
Then we use a buffer of almost 64 KB, fill it up with data and send it out in one go (lwIP auto-fragments it, and the PC can handle this; the other way around is a no-go, though), and just put the send routine in a while(1) loop.
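To give an idea of what the auto-fragmenting means on the wire, here is a rough sketch (not taken from the project code) of how many IP fragments one big UDP datagram turns into. It assumes plain IPv4 with a 20-byte header and no IP options, and the caller passes in the link MTU (1500 for standard Ethernet). The name `udp_fragment_count` is made up for this example and is not an lwIP function.

```c
#include <stddef.h>

#define IPV4_HEADER_LEN  20u     /* assumption: no IP options */
#define UDP_HEADER_LEN    8u
#define UDP_MAX_PAYLOAD   65507u /* 65535 - 20 (IP) - 8 (UDP) */

/*
 * Hypothetical helper: number of IPv4 packets needed to carry one UDP
 * datagram with payload_len bytes of data over a link with the given MTU.
 * Returns 0 if the payload cannot fit in a single UDP datagram or the
 * MTU is too small to carry any fragment data.
 */
unsigned udp_fragment_count(unsigned payload_len, unsigned mtu)
{
    if (payload_len > UDP_MAX_PAYLOAD || mtu < IPV4_HEADER_LEN + 8u)
        return 0;

    /* The UDP header travels inside the first fragment's IP payload. */
    unsigned ip_payload = payload_len + UDP_HEADER_LEN;

    /* Fits in one packet: no fragmentation needed. */
    if (ip_payload <= mtu - IPV4_HEADER_LEN)
        return 1;

    /* Every fragment except the last carries a multiple of 8 bytes. */
    unsigned per_fragment = ((mtu - IPV4_HEADER_LEN) / 8u) * 8u;

    /* Round up to whole fragments. */
    return (ip_payload + per_fragment - 1u) / per_fragment;
}
```

With a 60 KB (61440-byte) payload and an MTU of 1500, `udp_fragment_count(61440, 1500)` gives 42 fragments per send. If the receiver loses any one of them, it has to drop the whole datagram.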
This has yielded about 77/78 Mbit/s raw speed, and 73/74 Mbit/s of pure data transfer (without the UDP headers etc.). We have gotten to the point where our desktop computers can just barely keep up with receiving the data :)
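As a rough cross-check of the raw versus pure-data figures, the header overhead of one fragmented datagram can be estimated like this. This is only a sketch under assumptions: IPv4 without options, standard Ethernet framing (14-byte header, 4-byte FCS, optionally 8 bytes of preamble plus a 12-byte inter-frame gap), and an MTU passed in by the caller. The function names are invented for the example, and exactly which overhead was counted in the "raw" number above is not known.

```c
#include <stddef.h>

#define ETH_HEADER_LEN        14u
#define ETH_FCS_LEN            4u
#define ETH_PREAMBLE_IFG_LEN  20u /* 8 preamble/SFD + 12 inter-frame gap */
#define IPV4_HEADER_LEN       20u /* assumption: no IP options */
#define UDP_HEADER_LEN         8u

/*
 * Hypothetical helper: bytes occupied on the wire by one UDP datagram
 * after IP fragmentation. Set count_preamble_ifg to nonzero to include
 * preamble and inter-frame gap in the per-frame cost. Assumes mtu >= 28
 * and ignores minimum-frame padding.
 */
unsigned wire_bytes_per_datagram(unsigned payload_len, unsigned mtu,
                                 int count_preamble_ifg)
{
    unsigned ip_payload = payload_len + UDP_HEADER_LEN;
    unsigned per_fragment = ((mtu - IPV4_HEADER_LEN) / 8u) * 8u;
    unsigned frames = (ip_payload + per_fragment - 1u) / per_fragment;

    unsigned per_frame = ETH_HEADER_LEN + IPV4_HEADER_LEN + ETH_FCS_LEN;
    if (count_preamble_ifg)
        per_frame += ETH_PREAMBLE_IFG_LEN;

    return ip_payload + frames * per_frame;
}

/* Payload rate implied by a given raw line rate (both in Mbit/s). */
double payload_rate_mbit(double raw_mbit, unsigned payload_len,
                         unsigned mtu, int count_preamble_ifg)
{
    unsigned wire = wire_bytes_per_datagram(payload_len, mtu,
                                            count_preamble_ifg);
    return raw_mbit * (double)payload_len / (double)wire;
}
```

For 60 KB payloads and a 1500-byte MTU this puts the header overhead at roughly 2.5 to 4 percent, depending on whether preamble and inter-frame gap are counted. That is in the same ballpark as the gap between the two measured numbers, though a bit smaller.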
As you read, we are constantly reusing the same packet data; we have yet to do an actual test with constantly changing data being read from memory. As it is, I think our entire packet buffer (60 KB) fits into the 64 KB data cache and just stays there.
Oh, by the way, our MAC/PHY is an SMSC LAN91C111, and we are using an Altera Nios II dev board.