__mulsi3 does not run in constant time: it is a software multiply routine built around a while loop, so a large 'a' input takes more iterations to finish than a small one. I suspect the multiply is needed because each struct element is now 12 bytes instead of the 8 bytes it was before 'c' was added. To index into the array the compiler therefore has to compute i*12 for the byte offset rather than i*8, and i*8 is just i<<3. So as 'i' increases, the value passed into the multiply routine increases, and the while loop iterates more.
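To illustrate why the run time depends on the input, here is a minimal sketch of a shift-and-add multiply of the kind __mulsi3 typically uses. The name soft_mul and the iteration counter are my own additions for illustration, and the routine your toolchain actually links may differ in detail.

```c
#include <stdint.h>

/* Hypothetical stand-in for a software multiply such as __mulsi3.
   The loop runs once per bit position up to the highest set bit of 'a',
   so a larger 'a' means more iterations. */
uint32_t soft_mul(uint32_t a, uint32_t b, unsigned *iterations)
{
    uint32_t result = 0;
    unsigned count = 0;

    while (a != 0) {
        if (a & 1u)        /* low bit set: add the shifted 'b' */
            result += b;
        a >>= 1;           /* move on to the next bit of 'a' */
        b <<= 1;           /* 'b' doubles for each bit position */
        count++;
    }

    if (iterations)
        *iterations = count;   /* illustration only: how long the loop ran */
    return result;
}
```

With this sketch, soft_mul(3, 12, &n) finishes in 2 iterations, while soft_mul(200, 12, &n) needs 8, because 200 occupies 8 bits.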
I'm not sure why the compiler isn't just using an adder. If you are compiling with -O0, try -O2; that may remove the call to the multiply routine and replace it with an add instead.
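If the optimizer does not do it for you, the multiply can also be avoided by hand. The sketch below assumes a struct of three 32-bit fields (the names item, a, b and c are guesses at your layout, not taken from your code) and shows two options: computing i*12 with shifts and an add, and walking the array with a pointer that is bumped by one element per iteration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 12-byte element; field names are assumptions. */
struct item {
    uint32_t a;
    uint32_t b;
    uint32_t c;
};

/* Byte offset of element i without a multiply: i*12 == i*8 + i*4. */
size_t offset_shift_add(size_t i)
{
    return (i << 3) + (i << 2);
}

/* Walk the array by advancing a pointer one element at a time, so each
   step is a constant add of sizeof(struct item) and never needs i*12. */
uint32_t sum_c(const struct item *items, size_t n)
{
    uint32_t total = 0;
    const struct item *p = items;

    for (size_t i = 0; i < n; i++, p++)
        total += p->c;
    return total;
}
```

Either form keeps the per-iteration cost constant, which is what you want if the loop timing matters.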