You can instantiate them manually. The primitive is named lcell, and I believe its ports are in and out. (In VHDL I think the ports are a_in and a_out, since in and out are reserved words.) Naturally, unless you hand-place the cells, the resulting delay can vary a lot.

If your timing constraints require a positive hold, check under Assignments -> Settings -> Fitter that Optimize Hold Timing is set to All Paths and that Optimize Multi-Corner Timing is checked. This tells the router to add routing delay to try to meet the hold requirements.

As a test, I took a FF->FF path in a Cyclone III and added a 20ns hold requirement. The router met it by routing around the chip almost twice. (Note that it failed at first because I also had a 30ns setup requirement. A 20ns delay in the fast corner is ~40ns in the slow corner, so I had given it impossible constraints. Once I relaxed the setup relationship to 50ns or so (I don't remember exactly), the hold could be met.)
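To make the fast-corner/slow-corner arithmetic concrete, here is a rough Python sketch of the sanity check I should have done before compiling. The 2x slow-to-fast ratio is only the ballpark figure from my test above, not a datasheet number, and the function names are made up for illustration. It also ignores clock skew, tco, and the register setup/hold times, so treat it as a first-order check only.

```python
def slow_corner_delay(fast_delay_ns, slow_to_fast_ratio=2.0):
    """Estimate the slow-corner delay of a path from its fast-corner delay.

    slow_to_fast_ratio is an assumed ballpark (~2x), not a device spec.
    """
    return fast_delay_ns * slow_to_fast_ratio


def constraints_feasible(hold_ns, setup_ns, slow_to_fast_ratio=2.0):
    """First-order check that a hold and a setup requirement can coexist.

    Hold is checked in the fast corner, so the path needs at least hold_ns
    of delay there. That same path then takes roughly ratio * hold_ns in
    the slow corner, which must still fit inside the setup requirement.
    """
    return slow_corner_delay(hold_ns, slow_to_fast_ratio) <= setup_ns


if __name__ == "__main__":
    # The original constraints: 20ns hold with a 30ns setup relationship.
    print(constraints_feasible(20, 30))
    # The relaxed constraints: 20ns hold with a ~50ns setup relationship.
    print(constraints_feasible(20, 50))
```

With the assumed 2x ratio, the 20ns/30ns pair comes out infeasible and the 20ns/50ns pair feasible, which matches what the fitter did.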