I would not call trigonometric functions 'arithmetic' :D
Anyway:
1. The choice between a LUT and a megafunction depends on your precision and latency requirements. A raw LUT is faster, but its precision depends strongly on the table size. The Altera megafunction is somewhat more versatile, since you can easily tune its parameters to get optimal performance for your design, but the result arrives with some latency. However, the megafunction too is fundamentally based on a LUT.
2. If you actually need the tangent, and possibly not sin and/or cos, there's no point in calculating them and then performing a division. Use a tan function or a tan LUT directly. The sin/cos quotient is convenient as a definition of tan, but both in VHDL and in software a direct calculation of tan (or of any other function) is generally more convenient, i.e. faster.
3. I don't know whether a 'ready' signal is available. I rather think these megafunctions have a fixed latency, specified as a number of clock cycles.
4. Usually x^y is calculated as e^(y*log x), where log is the natural logarithm. This only works for x > 0.
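
To put some numbers on points 1 and 4, here is a rough Python sketch. It is a software model only, not VHDL, and it does not claim to show how the Altera megafunction works internally. The function names, the [0, pi/4] input range and the nearest-entry lookup (no interpolation) are all my own assumptions, chosen just to show how the worst-case error of a raw tan LUT shrinks as the table grows, and that e^(y*log x) reproduces x^y for positive x.

```python
import math

# Assumption: inputs are limited to [0, pi/4]; all names here are illustrative.
X_MAX = math.pi / 4


def build_tan_lut(n_entries, x_max=X_MAX):
    """Sample tan(x) at n_entries evenly spaced points on [0, x_max]."""
    step = x_max / (n_entries - 1)
    table = [math.tan(i * step) for i in range(n_entries)]
    return table, step


def lut_tan(x, table, step):
    """Nearest-entry lookup without interpolation, like a raw LUT."""
    index = min(int(round(x / step)), len(table) - 1)
    return table[index]


def max_lut_error(n_entries, n_probes=10_000, x_max=X_MAX):
    """Worst absolute error of the LUT over a dense grid of probe points."""
    table, step = build_tan_lut(n_entries, x_max)
    worst = 0.0
    for k in range(n_probes + 1):
        x = x_max * k / n_probes
        worst = max(worst, abs(lut_tan(x, table, step) - math.tan(x)))
    return worst


def pow_via_exp_log(x, y):
    """Compute x**y as exp(y * ln(x)); only valid for x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    return math.exp(y * math.log(x))


if __name__ == "__main__":
    # Larger tables give a smaller worst-case error.
    for n in (16, 64, 256, 1024):
        print(n, max_lut_error(n))
    # Should be very close to 2**10.
    print(pow_via_exp_log(2.0, 10.0))
```

In hardware the same trade-off shows up as memory blocks versus output bits: each time you multiply the table size by four, the worst-case error of a non-interpolated table drops by roughly a factor of four.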