Altera_Forum
20 years ago

lwIP UDP throughput w/o an RTOS

I want to receive 1500-byte payloads from a 100 Mbit/s Ethernet MAC on a Cyclone II device, using lwIP UDP without an RTOS.

Can the UDP receive path sustain 80 Mbit/s? UDP transmit traffic is minimal and is used only for flow control.

The lwIP UDP receive path would just strip off the headers and pass the data directly to a DMA.

Any advice?
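For scale, the numbers in the question can be put into a rough budget. This is only a back-of-the-envelope sketch: it assumes standard Ethernet framing without a VLAN tag (38 bytes of preamble, header, FCS and inter-frame gap per frame) and IPv4 and UDP headers without options (28 bytes). The function names are made up for illustration. Note that with a standard 1500-byte MTU the largest UDP payload that fits in one unfragmented frame is 1472 bytes, so a 1500-byte payload would be split into two IP fragments.

```c
#include <stdint.h>

/* Per-frame overhead on the wire, in bytes (standard Ethernet, no VLAN tag):
 * 8 preamble/SFD + 14 header + 4 FCS + 12 inter-frame gap. */
#define ETH_WIRE_OVERHEAD 38u
/* IPv4 header without options (20) plus UDP header (8). */
#define IP_UDP_HEADERS 28u

/* Upper bound on the UDP payload rate (bit/s) for back-to-back frames. */
uint64_t max_udp_goodput_bps(uint64_t link_bps, uint32_t payload_bytes)
{
    uint64_t wire_bytes = (uint64_t)payload_bytes + IP_UDP_HEADERS + ETH_WIRE_OVERHEAD;
    return link_bps * payload_bytes / wire_bytes;
}

/* Datagrams per second needed to carry payload_bps of UDP payload. */
uint64_t packets_per_second(uint64_t payload_bps, uint32_t payload_bytes)
{
    return payload_bps / ((uint64_t)payload_bytes * 8u);
}

/* CPU cycles available per datagram at the given packet rate. */
uint64_t cycles_per_packet(uint64_t cpu_hz, uint64_t pps)
{
    return pps ? cpu_hz / pps : 0;
}
```

With a 100 Mbit/s link and 1472-byte payloads the wire-level ceiling is about 95.7 Mbit/s, so 80 Mbit/s of payload is possible in principle. It needs roughly 6,800 datagrams per second, which would leave about 7,360 cycles per datagram on a CPU clocked at 50 MHz, for example.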

14 Replies

  • Altera_Forum

    --- Quote Start ---

    Originally posted by niosiiuser @ Jul 22 2005, 02:21 AM

    Hello provintan,

    Yes, it is possible to receive UDP packets at over 80 Mbit/s. It seems that I'm working on a similar application to yours. I got about 82 Mbit/s UDP RX.

    My system consists of:

    1. Cyclone EP1C12 @ 50 MHz CPU clock

    2. 32 MB SDRAM

    3. 4 MB flash

    4. Davicom DM9000 Ethernet MAC/PHY (using the 16-bit bus)

    5. Some other stuff …

    For the first experiments I was using lwIP, but unfortunately lwIP is very slow at handling large UDP packets because it copies fragmented packets around several times. To reach this speed, some changes were needed in my system:

    1. Because lwIP is not very fast, it is better to write your own little UDP stack. In my system, lwIP sits alongside a little UDP stack. For this I'm filtering incoming UDP packets directly in the MAC driver (DM9000) and storing the payload in the SDRAM via DMA. lwIP handles ARP and the other network traffic such as TCP.

    2. It is necessary to get the data out of the SRAM of the MAC as fast as possible. If you are too slow, the SRAM will overflow! Pay attention to the timing of the Ethernet MAC when using DMA (you have to slow down the DMA to match the timing settings for the MAC).

    3. Unfortunately I didn't have enough I/Os (240-pin PQFP), so I had to connect the DM9000 to the Cyclone via a 16-bit bus. This results in a lower bandwidth. To compensate, a custom bridge connects the external tri-state bus (DM9000, flash, CompactFlash) to the Avalon bus. The bridge controls the timing and performs dynamic bus sizing (32 bit to 16 bit) for read and write cycles.
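The driver-level filtering described in point 1 could look roughly like the sketch below. It is not the poster's code: the function name, the port parameter, and the policy (IPv4 only, no VLAN tag, fragments and everything else left to the normal stack, checksums not verified) are assumptions made for illustration. The function only inspects a frame that is already in memory and reports where the UDP payload starts, so a driver could hand that region to a DMA transfer.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN    14u
#define ETHERTYPE_IPV4 0x0800u
#define IP_PROTO_UDP   17u

/* Read a 16-bit big-endian (network order) value. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * If `frame` is an unfragmented IPv4/UDP datagram addressed to `port`,
 * return the offset of the UDP payload and store its length in *payload_len.
 * Return 0 otherwise, meaning "pass the frame to the normal stack".
 */
size_t udp_fast_path_offset(const uint8_t *frame, size_t frame_len,
                            uint16_t port, size_t *payload_len)
{
    if (frame_len < ETH_HDR_LEN + 20u + 8u)
        return 0;
    if (rd16(frame + 12) != ETHERTYPE_IPV4)
        return 0;                        /* ARP and other protocols */

    const uint8_t *ip = frame + ETH_HDR_LEN;
    size_t ihl = (size_t)(ip[0] & 0x0Fu) * 4u;   /* IP header length */
    if ((ip[0] >> 4) != 4 || ihl < 20u)
        return 0;
    if (ip[9] != IP_PROTO_UDP)
        return 0;                        /* TCP, ICMP, ... */
    if (rd16(ip + 6) & 0x3FFFu)
        return 0;                        /* MF flag or fragment offset set */
    if (frame_len < ETH_HDR_LEN + ihl + 8u)
        return 0;

    const uint8_t *udp = ip + ihl;
    uint16_t udp_len = rd16(udp + 4);    /* UDP header + payload */
    if (rd16(udp + 2) != port || udp_len < 8u)
        return 0;
    if (ETH_HDR_LEN + ihl + udp_len > frame_len)
        return 0;                        /* truncated frame */

    *payload_len = (size_t)udp_len - 8u;
    return ETH_HDR_LEN + ihl + 8u;
}
```

A return value of 0 is unambiguous because a valid payload can never start before byte 42 of the frame.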

    So you see it is possible. I hope my explanation helps you get the desired speed out of your system.

    Good luck,

    niosiiuser

    --- Quote End ---

    Hi,

    I am very interested in what you wrote. I think you are right that lwIP is slow. Recently I started using DMA to speed up the transfer. A DMA test from SRAM to the LAN91C111 works, but when I debug my modified altera_avalon_lan91c111.c, it does not work. I think the Nios II IDE may have a bug.

    How can I solve this problem? Could I see the code of your modified lwIP?

    Thank you!

    The code is:

    int alt_avalon_lan91c111_output(alt_avalon_lan91c111_if *dev, void *buffer, alt_u32 length)
    {
        alt_u8 irq_value;
        alt_u16 mmu_status;
        alt_u32 remainder;
        alt_u32 len = length;
        alt_u8 buffer1[200];
        void *buf = (void *) buffer;
        void *test;
        alt_dma_txchan txchan;
        int tx, i, k;

        buffer1[0] = 0x10;
        buffer1[1] = 0x11;
        buffer1[2] = 0x12;
        buffer1[3] = 0x13;
        buffer1[4] = 0x14;
        buffer1[5] = 0x15;

        IOWR_32DIRECT(0x1055248, 0, 0x87654321);

        /* Wait for the last Tx to complete */
        do
        {
            irq_value = IORD_ALTERA_AVALON_LAN91C111_IST(dev->base_addr);
        } while (!(irq_value & ALTERA_AVALON_LAN91C111_INT_TX_EMPTY_INT_MSK));

        /* Clear the interrupt */
        IOWR_ALTERA_AVALON_LAN91C111_ACK(dev->base_addr,
                                         ALTERA_AVALON_LAN91C111_INT_TX_EMPTY_INT_MSK);

        /* Always re-use the same packet */
        IOWR_ALTERA_AVALON_LAN91C111_PNR(dev->base_addr,
                                         dev->tx_packet_no);
        IOWR_ALTERA_AVALON_LAN91C111_PTR(dev->base_addr,
                                         ALTERA_AVALON_LAN91C111_PTR_AUTO_INCR_MSK);

        /* The status word */
        IOWR_ALTERA_AVALON_LAN91C111_DATA_HW(dev->base_addr, 0);

        /*
         * The byte count including the 6 control bytes
         *
         * Bit odd this, but the length is always written as even, if the frame is an odd length
         * then the byte is written as one of the control words and an appropriate bit is set
         */
        IOWR_ALTERA_AVALON_LAN91C111_DATA_HW(dev->base_addr, ((length & ~1) + 6));

        /*
         * Write buffer of data to the device
         *
         * Nios requires accesses to be aligned on the correct boundary
         */
        while ((int)buf & 3)
        {
            IOWR_ALTERA_AVALON_LAN91C111_DATA_BYTE(dev->base_addr, *((alt_u8*)buf)++);
            len--;
        }

        remainder = len & 3;

        /* Write out the 32 bit values */
        len >>= 2;

        if (len < 186) {
            while (len & ~7) /* Write 8-tuples of 32 bit values */
            {
                IOWR_ALTERA_AVALON_LAN91C111_DATA_WORD(dev->base_addr, *((alt_u32*)buf)++);
                IOWR_ALTERA_AVALON_LAN91C111_DATA_WORD(dev->base_addr, *((alt_u32*)buf)++);
                IOWR_ALTERA_AVALON_LAN91C111_DATA_WORD(dev->base_addr, *((alt_u32*)buf)++);
                IOWR_ALTERA_AVALON_LAN91C111_DATA_WORD(dev->base_addr, *((alt_u32*)buf)++);
                IOWR_ALTERA_AVALON_LAN91C111_DATA_WORD(dev->base_addr, *((alt_u32*)buf)++);
                IOWR_ALTERA_AVALON_LAN91C111_DATA_WORD(dev->base_addr, *((alt_u32*)buf)++);
                IOWR_ALTERA_AVALON_LAN91C111_DATA_WORD(dev->base_addr, *((alt_u32*)buf)++);
                IOWR_ALTERA_AVALON_LAN91C111_DATA_WORD(dev->base_addr, *((alt_u32*)buf)++);
                len -= 8;
            }

            while (len)
            {
                IOWR_ALTERA_AVALON_LAN91C111_DATA_WORD(dev->base_addr, *((alt_u32*)buf)++);
                len--;
            }
        } else {
            test = buf;

            for (i = 0; i <= 200; i += 1)
            {
                printf("00:%x,%x\n", ((alt_u32*)test) + i, *((alt_u32*)test + i));
            }

            if ((txchan = alt_dma_txchan_open("/dev/dma")) == NULL)
            {
                exit (1);
            }

            //printf ("0: %d,%d,%d\n",remainder,(alt_u8*)buf,*(alt_u8*)buf);
            printf ("0: %d,%d,%d\n", len*4, (alt_u8*)buf, *(alt_u8*)buf);

            alt_dma_rxchan_ioctl (txchan, 0x3, dev->base_addr+8);
            alt_dma_rxchan_ioctl (txchan, 0x7, dev->base_addr+8);

            if ((tx = alt_dma_txchan_send (txchan, buf, len*4, dma_tx, NULL)) < 0)
            {
                printf ("1: %d\n", tx);
                exit(1);
            }

            while (!tx_done);

            //printf ("2: %d\n", tx);
            (alt_u8*)buf = (alt_u8*)buf + len*4;

            IOWR_ALTERA_AVALON_LAN91C111_PTR(dev->base_addr, 0x6000);

            for (i = 0; i <= 800; i += 4)
            {
                k = IORD_32DIRECT(dev->base_addr, 8);
                printf("%x\n", k);
            }

            //printf ("2: %d,%d,%d\n",remainder,(alt_u8*)buf,*(alt_u8*)buf);
            printf ("2: %d,%d,%d\n", len*4, (alt_u8*)buf, *(alt_u8*)buf);

            alt_dma_txchan_close (txchan);
            len = 0;
            tx_done = 0;
        }

        while (remainder)
        {
            IOWR_ALTERA_AVALON_LAN91C111_DATA_BYTE(dev->base_addr, *((alt_u8*)buf)++);
            remainder--;
        }

        if (length & 1)
        {
            IOWR_ALTERA_AVALON_LAN91C111_DATA_BYTE(dev->base_addr,
                                                   ALTERA_AVALON_LAN91C111_CONTROL_ODD_MSK);
        }
        else
        {
            IOWR_ALTERA_AVALON_LAN91C111_DATA_HW(dev->base_addr, 0);
        }

        /*
         * Accesses to the MMUCR have to be protected with a semaphore as it's possible
         * that we could wait for this register not to be busy and in between our read
         * and the write an interrupt could occur which causes the scheduler to run and
         * the ethernet_rx thread could be run!
         */
    #if 0
        ALT_SEM_PEND(dev->semaphore, 1);
    #endif

        /* Wait for any pending commands to complete */
        do
        {
            mmu_status = IORD_ALTERA_AVALON_LAN91C111_MMUCR(dev->base_addr);
        } while (mmu_status & ALTERA_AVALON_LAN91C111_MMUCR_BUSY_MSK);

        /* Queue the packet */
        IOWR_ALTERA_AVALON_LAN91C111_MMUCR(dev->base_addr,
                                           ALTERA_AVALON_LAN91C111_MMUCR_ENQUEUE_MSK);
    #if 0
        ALT_SEM_POST(dev->semaphore);
    #endif

        return 0;
    }
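To make the control flow of the function above easier to follow, the way it partitions the buffer can be modelled on its own, without any hardware access. The sketch below is an illustration only: `tx_plan`, `plan_tx` and `byte_count_field` are made-up names, the 186-word threshold is taken from the posted code, and the clamp on the head bytes is an added guard that the posted loop does not have.

```c
#include <stdint.h>

#define DMA_MIN_WORDS 186u   /* threshold used in the posted code */

typedef struct {
    uint32_t head_bytes;  /* byte writes needed to reach 4-byte alignment */
    uint32_t words;       /* aligned 32-bit writes (programmed I/O or DMA) */
    uint32_t tail_bytes;  /* leftover byte writes after the word block */
    int      use_dma;     /* nonzero when the word block goes through DMA */
} tx_plan;

/* Split a transmit buffer the same way the posted function does. */
tx_plan plan_tx(uintptr_t addr, uint32_t length)
{
    tx_plan p;
    p.head_bytes = (uint32_t)((4u - (addr & 3u)) & 3u);
    if (p.head_bytes > length)
        p.head_bytes = length;   /* guard that the posted loop does not have */
    length -= p.head_bytes;
    p.words = length >> 2;
    p.tail_bytes = length & 3u;
    p.use_dma = p.words >= DMA_MIN_WORDS;
    return p;
}

/* Value the posted code writes to the byte-count field: the even part of
 * the frame length plus the 6 control bytes. */
uint32_t byte_count_field(uint32_t length)
{
    return (length & ~1u) + 6u;
}
```

For a full-size 1514-byte frame starting at an address that is 2 bytes past a word boundary, this gives 2 head bytes, 378 words sent through the DMA branch, and no tail bytes.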
  • Altera_Forum

    Hi all,

    Well, it seems that I've lost my code, but it is not that hard to accomplish. Mostly what I did (because it was possible) was to clock the Nios at 135 MHz and use the maximum data cache and instruction cache sizes. I also used the UDP feature of the lwIP stack.

    Good luck.

    Cheers,

    Danny
  • Altera_Forum

    Hello niosiiuser,

    I am very interested in what you said. I would like to know how you wrote your own little UDP stack and TCP stack. I hope to receive your reply and your code.

    Thanks for your help!
  • Altera_Forum

    --- Quote Start ---

    Originally posted by wwycoolboy @ Sep 24 2006, 06:37 AM

    Hello niosiiuser,

    I am very interested in what you said. I would like to know how you wrote your own little UDP stack and TCP stack. I hope to receive your reply and your code.

    Thanks for your help!

    --- Quote End ---

    Hello wwycoolboy,

    The only thing I did was to write a little UDP stack which runs alongside lwIP. lwIP manages ARP and TCP. My code handles the incoming UDP packets only. Decoding UDP packets is easy. Take a look at http://en.wikipedia.org/wiki/user_datagram_protocol to get more information about UDP. My code won't help you because it is adapted to my very special needs.
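One part of UDP decoding that a hand-written receive path has to decide on is the checksum. As a minimal sketch, assuming IPv4 and a datagram that is already contiguous in memory, verification per the UDP specification could look like this. The function names are made up, and whether a real driver should spend cycles on this check is a design choice the thread does not settle.

```c
#include <stddef.h>
#include <stdint.h>

/* Add `len` bytes to a running one's-complement sum as 16-bit big-endian words. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t sum)
{
    while (len > 1) {
        sum += (uint32_t)((p[0] << 8) | p[1]);
        p += 2;
        len -= 2;
    }
    if (len)
        sum += (uint32_t)(p[0] << 8);   /* odd trailing byte, zero padded */
    return sum;
}

/*
 * Return 1 if the UDP datagram (header + payload, udp_len bytes) matches its
 * checksum field, 0 otherwise. src and dst are the IPv4 addresses in network
 * byte order. A checksum field of zero means the sender did not compute one.
 */
int udp_checksum_ok(const uint8_t src[4], const uint8_t dst[4],
                    const uint8_t *udp, size_t udp_len)
{
    uint32_t sum = 0;

    if (udp_len < 8 || udp_len > 0xFFFF)
        return 0;
    if (udp[6] == 0 && udp[7] == 0)
        return 1;

    /* Pseudo-header: source, destination, protocol (17), UDP length. */
    sum = sum16(src, 4, sum);
    sum = sum16(dst, 4, sum);
    sum += 17u;
    sum += (uint32_t)udp_len;

    /* UDP header (including the checksum field) and payload. */
    sum = sum16(udp, udp_len, sum);

    /* Fold the carries back in; a valid datagram sums to 0xFFFF. */
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return sum == 0xFFFFu;
}
```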

    If I were you I would do the following things:

    1. Set up a small lwIP Nios application which handles ARP and UDP.

    2. Write a small PC application which sends UDP packets to your Nios system.

    3. Examine the packets with Ethereal and compare the information with the UDP protocol to learn more about UDP.

    4. Modify your Nios Ethernet driver to intercept the incoming UDP packets.

    5. Analyse the intercepted UDP packets for your application.
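The PC-side sender from step 2 could be sketched as follows. It assumes a POSIX system (BSD sockets on Linux or macOS; Windows would need Winsock instead), and the function names, the sequence-number layout and the 1472-byte buffer limit are choices made here for illustration, not anything from the thread.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill a test payload: a byte ramp, with a 32-bit big-endian sequence number
 * in the first four bytes so the receiver can detect loss and corruption. */
void fill_payload(uint8_t *buf, size_t len, uint32_t seq)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (uint8_t)i;
    if (len >= 4) {
        buf[0] = (uint8_t)(seq >> 24);
        buf[1] = (uint8_t)(seq >> 16);
        buf[2] = (uint8_t)(seq >> 8);
        buf[3] = (uint8_t)seq;
    }
}

/* Send `count` datagrams of `len` bytes to ip:port. Returns the number of
 * datagrams handed to the kernel, or -1 if the arguments or socket are bad. */
long send_udp_burst(const char *ip, uint16_t port, size_t len, uint32_t count)
{
    uint8_t buf[1472];              /* largest payload in one 1500-byte MTU frame */
    struct sockaddr_in dst;
    long sent = 0;

    if (len > sizeof buf)
        return -1;
    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    for (uint32_t seq = 0; seq < count; seq++) {
        fill_payload(buf, len, seq);
        if (sendto(fd, buf, len, 0, (struct sockaddr *)&dst, sizeof dst) == (ssize_t)len)
            sent++;
    }
    close(fd);
    return sent;
}
```

A `main` that calls, say, `send_udp_burst("192.168.0.10", 5000, 1472, 10000)` (address and port are placeholders) gives the board a steady stream to capture and compare against the protocol description in step 3.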

    Good luck,

    niosIIuser