What about some kind of brute-force approach?
Implement a shift register as long as the total number of registers you have, and connect the output of the shift register to its input.
The content you shift is an alternating 1010101010... stream, so all registers must toggle on every clock.
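To make the toggling claim concrete, here is a rough software model of the ring, just a sketch and not HDL for any particular device. The names `step` and `toggles_per_clock` are made up for illustration, and it assumes the registers are preloaded with the alternating pattern before the ring is closed.

```python
def step(ring):
    """Advance the ring by one clock: the last bit feeds back into the first."""
    return [ring[-1]] + ring[:-1]


def toggles_per_clock(length, clocks=4):
    """Count how many bits change value on each clock edge."""
    # Assumed preload: alternating 0101... pattern across all registers.
    ring = [i % 2 for i in range(length)]
    counts = []
    for _ in range(clocks):
        nxt = step(ring)
        # A bit toggles when its new value differs from its old value.
        counts.append(sum(old != new for old, new in zip(ring, nxt)))
        ring = nxt
    return counts


print(toggles_per_clock(8))  # even length: all 8 bits flip on every clock
print(toggles_per_clock(7))  # odd length: one bit per clock does not flip
```

In this model an even ring length gives a toggle on every register every clock. With an odd length the pattern cannot close cleanly at the feedback point, so one register per clock keeps its value. If you want true 100% toggling, an even total length looks like the safer choice.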
Also connect all available IOs to bits of the register. Maybe let some of the bits go through the embedded multipliers and connect those outputs to your IO pins to get more stress.
This of course means you will draw the maximum core and IO current, so you will need a very well designed power supply and thermal cooling for your device.
I know that other FPGA companies, like the one with an X in its name, forbid such testing, but I know of a design where this works very well (and the PCB from the manufacturer of the device does not work). If your PCB is not very well done, then this will of course unsolder your FPGA and damage it (as with the design kit from that manufacturer).
But this is maximum stress, and it shows in the extreme what can be done when everything is done perfectly.