Yes, you'll probably have to dig into the InterNiche stack code to see how it is done. I think you will need to allocate an mbuf and put your hardware-generated data in it, leaving enough room before the data for the stack to add the headers. Some of the internal functions will then probably handle the rest.
But in my opinion it is more complicated to go through the Nios core just for that. Adding all the headers in hardware and sending the frame directly to the TSE core should be simpler.
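For the UDP checksum, you need to add up some (not all!) of the IP header fields as 16-bit words (source and destination addresses and the protocol, which together with the UDP length form the "pseudo header"), plus the UDP header and the data. Then take the one's complement of the sum and adjust if the result is zero (send 0xFFFF instead, because a zero checksum field means "no checksum"). The Wikipedia page (http://en.wikipedia.org/wiki/user_datagram_protocol#checksum_computation) explains it well.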
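
To make the checksum steps concrete, here is a minimal software sketch in C for IPv4. The names sum16 and udp_checksum_ipv4 are made up for this example and are not part of the InterNiche stack or the TSE driver. It assumes the buffer holds the UDP header followed by the payload, in network byte order, with the checksum field set to zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Add a run of bytes to a running sum as big-endian 16-bit words.
 * An odd trailing byte is treated as the high byte of a word padded with zero. */
static uint32_t sum16(uint32_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len)
        sum += (uint32_t)p[0] << 8;
    return sum;
}

/* Hypothetical helper: compute the UDP checksum over IPv4.
 * udp     points to the UDP header followed by the payload,
 *         with the checksum field (bytes 6 and 7) set to zero
 * udp_len is the UDP header plus payload length in bytes */
uint16_t udp_checksum_ipv4(const uint8_t src_ip[4], const uint8_t dst_ip[4],
                           const uint8_t *udp, uint16_t udp_len)
{
    uint32_t sum = 0;
    uint16_t csum;

    /* Pseudo header: only these IP fields take part in the sum. */
    sum = sum16(sum, src_ip, 4);
    sum = sum16(sum, dst_ip, 4);
    sum += 17;        /* zero byte followed by the protocol number (UDP = 17) */
    sum += udp_len;   /* UDP length */

    /* UDP header and data. */
    sum = sum16(sum, udp, udp_len);

    /* Fold the carries back in (one's complement addition). */
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);

    /* One's complement of the result; zero is sent as 0xFFFF. */
    csum = (uint16_t)~sum;
    return csum ? csum : 0xFFFFu;
}
```

In hardware the same thing can be done with an accumulator that adds each 16-bit word as it streams past, folding the carry back in on every addition. The catch is that the checksum sits in the header, before the data, so the payload has to be buffered (or the sum known in advance) before the frame is handed to the TSE core.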