I took these functions from the NicheStack port for uC/OS-II, removed all the OS-dependent code, and modified them to run on bare metal. Internally they all call do_async_transfer and the other basic APIs.

My actual job is to take raw Ethernet frames from the USB port and send them to the TSE, and likewise to forward every packet received from the TSE to USB. From USB I receive a large number of raw frames in a single USB transfer, so I separate them and send them one by one to the TSE using the tse_mac_raw_send() API. In the SGDMA callback I receive a single packet (raw frame) at a time and send it to USB.

Tested with iperf, I get 4.72 MBPS from USB to TSE but only 300-315 KBPS from TSE to USB. I now want to increase the speed in the TSE-to-USB direction. Can you tell me what the possible bottleneck in the system could be? And yes, the Nios II instruction cache size is 32 KB.
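
For reference, the USB-to-TSE splitting described above looks roughly like the sketch below. This is a minimal illustration, not the actual code: it assumes each frame in the USB buffer is preceded by a 2-byte little-endian length, which may not match the real framing, and the names split_usb_transfer, frame_sink_fn, sink_stats and counting_sink are hypothetical. In the real system the sink would be a thin wrapper around tse_mac_raw_send().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-frame sink. On the target this would wrap
 * tse_mac_raw_send(); a negative return value means "stop". */
typedef int (*frame_sink_fn)(const uint8_t *frame, size_t len, void *ctx);

/* Walk one USB transfer and hand each frame to the sink.
 * ASSUMED framing (not from the original post): every frame is
 * preceded by a 2-byte little-endian length field.
 * Returns the number of frames passed to the sink successfully. */
size_t split_usb_transfer(const uint8_t *buf, size_t buf_len,
                          frame_sink_fn sink, void *ctx)
{
    size_t off = 0;
    size_t sent = 0;

    while (buf_len - off >= 2) {
        size_t len = (size_t)buf[off] | ((size_t)buf[off + 1] << 8);
        off += 2;

        /* Stop on an empty or truncated frame rather than read past the end. */
        if (len == 0 || len > buf_len - off)
            break;

        if (sink(buf + off, len, ctx) < 0)
            break;

        off += len;
        sent++;
    }
    return sent;
}

/* Stand-in sink for host-side testing: only counts frames and bytes. */
struct sink_stats {
    size_t frames;
    size_t bytes;
};

int counting_sink(const uint8_t *frame, size_t len, void *ctx)
{
    struct sink_stats *st = (struct sink_stats *)ctx;
    (void)frame; /* payload is not inspected here */
    st->frames++;
    st->bytes += len;
    return 0;
}
```

The point of showing it is the asymmetry: in this direction one USB transfer carries many frames, whereas in the TSE-to-USB direction each SGDMA callback results in one USB transfer per frame.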