1. Assignments -> Settings -> Analysis & Synthesis -> More Settings -> Auto Shift Register Replacement = Off
(You can also add specific hierarchies to the Assignment Editor and enable or disable the setting at the hierarchy level.)
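If you'd rather edit the project's .qsf file directly, the equivalent assignments look roughly like the sketch below. I believe the QSF name behind Auto Shift Register Replacement is AUTO_SHIFT_REGISTER_RECOGNITION, but compare it against what the GUI writes into your own .qsf. The instance path in the second assignment is a made-up placeholder.

```tcl
# Global: turn off shift register replacement for the whole design
set_global_assignment -name AUTO_SHIFT_REGISTER_RECOGNITION OFF

# Per hierarchy: hypothetical instance path, replace with your own
set_instance_assignment -name AUTO_SHIFT_REGISTER_RECOGNITION OFF -to "my_block:u_my_block"
```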
2. With no changes to the project, nothing forces I/O register usage unless you instantiate a primitive like altddio_in/altddio_out. Otherwise, the fitter has the option of using the I/O registers to improve timing, so if you have I/O timing constraints, it tends to pull the registers into the I/O cells. If you have loose I/O constraints but tight internal constraints on these long combinational paths (I assume from I/O to I/O), then it should be able to pull the registers into the fabric. I've seen this work before. Is your internal logic failing timing?
In the absence of any timing constraints, I think it will use the I/O registers just to save space, since you're not really telling it what to do. It also tends to use the I/O registers when the internal constraints aren't tight.
If you really don't want the registers in the I/O, you can go into the Assignment Editor and set Fast Input Register = Off on the specific I/O port. There's a matching Fast Output Register assignment for outputs.
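In the .qsf file, those Assignment Editor entries would look something like the following. This is only a sketch: data_in and data_out are hypothetical port names, so substitute the ports from your own design.

```tcl
# Keep the input register out of the I/O cell (data_in is a placeholder port name)
set_instance_assignment -name FAST_INPUT_REGISTER OFF -to data_in

# Same for the output register (data_out is a placeholder port name)
set_instance_assignment -name FAST_OUTPUT_REGISTER OFF -to data_out
```

Setting either assignment to ON instead asks the fitter to place the register in the I/O cell.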