Altera_Forum
Honored Contributor
12 years ago
Ho hum....
If you have to 'check them out' you might as well apply them yourself!

Anyway, patches 9 and 10 are trivial to verify; 9 fixed a complete stupidity. Patch 2 (memory access costs) is straight from the gcc docs. Patch 5 (use high and lo_sum rtx) is also from the gcc docs, which say 'if you have these instructions, do this'. I may well have copied the code from one of the other cpus.

That just leaves patches 3 and 4. Patch 3 is easy to test. The case for patch 4 just appeared, NFI why.

If you allocate a small structure in the 'small data' region you want the compiler to generate gp-relative addresses for it. Structure references generate 'symbol + const-offset' rtx; without these patches the compiler generates multiple instructions and can end up keeping an extra register pointing to the structure member, significantly increasing register pressure. If you look at the gcc code, they are very localised changes, inside a lot of conditionals that restrict when they might apply. It is reasonable for the compiler to assume that if the start of a data item is accessible via a valid offset from gp then all of that item is accessible from gp [1].

I didn't look at fixing indexing of arrays in 'small data'. Again the 16-bit offset gets added early, instead of being folded into the final memory access. I fixed my small data arrays by defining a structure and using gp as a register variable pointing to its base (this is a very controlled memory map).

Not that the code we have would be too slow without these changes.

[1] I have a plan to have an array that extends beyond 32k from gp...
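
For anyone who wants to see the small-structure case concretely, here is a rough sketch of the kind of C that produces 'symbol + const-offset' references. The structure, its field names and the functions are made up for illustration, and whether the object really lands in small data depends on your toolchain settings, so treat the assembly mentioned in the comments as the intended outcome rather than guaranteed output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical small structure; all names here are illustrative only. */
struct status {
    uint32_t flags;
    uint32_t count;
    uint16_t last_error;
};

/* Assumed to be placed in the 'small data' region, so that its start is
 * reachable with a 16-bit signed offset from gp. */
static struct status status;

/* The constant part of each address is just the field offset. */
static_assert(offsetof(struct status, count) == 4, "count follows flags");

/* Each field access below is 'symbol + constant offset'. The aim of the
 * patches is one gp-relative load or store per access, something like
 *     stw rX, %gprel(status + 4)(gp)
 * instead of first building the field address in a spare register. */
void status_record(uint16_t err)
{
    status.count += 1;
    status.last_error = err;
    status.flags |= 1u;
}

uint32_t status_count(void)      { return status.count; }
uint16_t status_last_error(void) { return status.last_error; }
size_t status_count_offset(void) { return offsetof(struct status, count); }
```

Compile it with and without the patches and compare the generated assembly for status_record(); the difference to look for is whether a register gets tied up holding the address of a field.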