So you believe the delay is occurring between when the app hands the packet to the Nichestack and when it finally goes out on the wire.
Well, there are only a few places where that delay can occur.
The Nichestack will hold onto the packet until the OS signals that it can run its routines. So if your application is busy performing higher-priority tasks, the Nichestack may not get an opportunity to process the packet.
Eventually the Nichestack will deliver the packet to the MAC driver's transmit function, which initiates a DMA transfer to move the packet from memory to the MAC. It's been a while since I examined that driver, but I believe it performs a synchronous send (meaning it will transmit the packet and wait for it to finish transmitting before it moves on). If you want to debug further, you could put some code into the tse_mac_raw_send function in the driver file "ins_tse_mac.c" to record the time on entry and exit.
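Here is a rough sketch of what that timing code could look like. It is not taken from the driver: the probe_* names and the three probe points are hypothetical, and the timer is passed in as a function pointer because I don't know which timer your system has. On a Nios II system with a timestamp timer configured, a small wrapper around the HAL's alt_timestamp() would probably be the natural choice, but treat that as an assumption. The host_now() counter is only a stand-in so the code can be tried on a PC.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical probe points; the names are illustrative, not from the driver. */
enum probe_point {
    PROBE_APP_SEND  = 0,  /* just before the application's send call */
    PROBE_DRV_ENTER = 1,  /* top of tse_mac_raw_send                 */
    PROBE_DRV_EXIT  = 2   /* just before tse_mac_raw_send returns    */
};

struct probe_event {
    uint32_t ticks;  /* raw timer reading when the probe fired */
    uint8_t  point;  /* which probe point fired                */
};

#define PROBE_LOG_SIZE 64u  /* must be a power of two */

static struct probe_event probe_log[PROBE_LOG_SIZE];
static uint32_t probe_count;          /* total events recorded so far */
static uint32_t (*probe_now)(void);   /* timer source chosen by the caller */

/* Select the timer source and clear the log. */
static void probe_init(uint32_t (*now)(void))
{
    probe_now = now;
    probe_count = 0;
}

/* Record one event in a ring buffer. No printing here, so the probe
 * itself adds very little delay to the path being measured. */
static void probe_mark(enum probe_point point)
{
    struct probe_event *e = &probe_log[probe_count & (PROBE_LOG_SIZE - 1u)];
    e->ticks = probe_now();
    e->point = (uint8_t)point;
    probe_count++;
}

/* Find the most recent `to` event and the closest `from` event before it.
 * On success stores the elapsed ticks in *delta and returns 1;
 * returns 0 if no such pair is in the log. */
static int probe_delta(enum probe_point from, enum probe_point to,
                       uint32_t *delta)
{
    uint32_t n = probe_count < PROBE_LOG_SIZE ? probe_count : PROBE_LOG_SIZE;
    const struct probe_event *end = NULL;
    uint32_t i;

    for (i = 0; i < n; i++) {  /* walk from newest to oldest */
        const struct probe_event *e =
            &probe_log[(probe_count - 1u - i) & (PROBE_LOG_SIZE - 1u)];
        if (end == NULL) {
            if (e->point == (uint8_t)to)
                end = e;
        } else if (e->point == (uint8_t)from) {
            *delta = end->ticks - e->ticks;  /* unsigned math survives one timer wrap */
            return 1;
        }
    }
    return 0;
}

/* Stand-in clock for trying this on a PC; replace it on the target. */
static uint32_t host_ticks;
static uint32_t host_now(void) { return host_ticks; }
```

The idea is to call probe_mark(PROBE_APP_SEND) right before your send call, probe_mark(PROBE_DRV_ENTER) at the top of tse_mac_raw_send, and probe_mark(PROBE_DRV_EXIT) just before it returns. Afterwards, probe_delta(PROBE_APP_SEND, PROBE_DRV_ENTER, &d) tells you how long the packet sat in the stack, and probe_delta(PROBE_DRV_ENTER, PROBE_DRV_EXIT, &d) how long the driver took. I would read the results out from a low-priority task or the debugger rather than printing inside the driver, since a printf there would itself distort the timing.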
After this it's all hardware. The only other place where delay can occur is in the TX FIFO inside the MAC. It's unlikely that this is occurring.
Jake