It does work great. It lets you write much better code that is easier to maintain and understand. We write a separate package per *interface*, i.e. for both directions of a block-to-block connection. Each package holds the two records (one per direction) and usually some supporting constants, ranges and type definitions. Where it makes sense, the interface package also defines IDLE (default) values for these records.
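As a rough sketch of what such an interface package could look like: the package name, record names, fields and the 8-bit width below are all hypothetical and chosen only for illustration, not taken from the code described above.

```vhdl
-- Hypothetical interface package for one block-to-block connection.
-- All identifiers here are illustrative assumptions.
package stream_if_pkg is

  -- Supporting constant and type definitions for the interface
  constant DATA_WIDTH : positive := 8;
  subtype data_t is bit_vector(DATA_WIDTH - 1 downto 0);

  -- Forward direction: producer to consumer
  type stream_fwd_t is record
    valid : boolean;
    data  : data_t;
  end record;

  -- Backward direction: consumer to producer
  type stream_bwd_t is record
    ready : boolean;
  end record;

  -- IDLE (default) values, handy for resets and unconnected ports
  constant STREAM_FWD_IDLE : stream_fwd_t :=
    (valid => false, data => (others => '0'));
  constant STREAM_BWD_IDLE : stream_bwd_t :=
    (ready => false);

end package stream_if_pkg;
```

A block on either side of the connection would then need only two ports, one of each record type, and adding a field to the interface would not touch any port list.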
Our architectures use a single record for all registers, with notable exceptions like inferred RAM blocks (or some selected registers without reset) or multi-clock-domain interfacing. Local variables inside the functional process are also collected in a single record type.
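The register and variable records might look roughly like the sketch below. It assumes a two-process layout (one combinational "functional" process, one clocked process), which the description above suggests but does not spell out; the entity, its ports and all record fields are hypothetical.

```vhdl
-- Hypothetical example block: a wrapping counter with a synchronous reset.
entity pulse_counter is
  port (
    clk   : in  bit;
    rst   : in  boolean;
    tick  : in  boolean;
    count : out natural range 0 to 255;
    wrap  : out boolean
  );
end entity pulse_counter;

architecture rtl of pulse_counter is

  -- Every register of the block lives in this one record.
  type reg_t is record
    count : natural range 0 to 255;
    wrap  : boolean;
  end record;

  constant REG_RESET : reg_t := (count => 0, wrap => false);

  -- Local variables of the functional process, also collected in a record.
  type var_t is record
    at_max : boolean;
  end record;

  signal r, rin : reg_t := REG_RESET;

begin

  -- Functional process: computes the next register state from the
  -- inputs and the current state.
  comb : process (r, rst, tick)
    variable v   : reg_t;
    variable tmp : var_t;
  begin
    v          := r;                 -- default: hold the current state
    tmp.at_max := (r.count = 255);
    v.wrap     := false;

    if tick then
      if tmp.at_max then
        v.count := 0;
        v.wrap  := true;
      else
        v.count := r.count + 1;
      end if;
    end if;

    if rst then
      v := REG_RESET;                -- one assignment resets every register
    end if;

    rin   <= v;
    count <= r.count;
    wrap  <= r.wrap;
  end process comb;

  -- Register process: the only clocked statement in the architecture.
  regs : process (clk)
  begin
    if clk'event and clk = '1' then
      r <= rin;
    end if;
  end process regs;

end architecture rtl;
```

Adding a register then means adding one field to `reg_t` and one entry to `REG_RESET`; the clocked process and the sensitivity list stay unchanged.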
What we never did was find out whether this coding style carries any speed or area penalty because of defensive actions taken by the synthesis tool. If we ever see such a penalty in practice, we will go as far as possible with our coding style and apply standard/structural coding only to the smallest part necessary to achieve our goal.
In addition to using records, we drastically reduce the use of std_logic(_vector) in favor of boolean, ranged integers and bit_vector. Way better code, but some ‘copper sniffers’ – that’s what I call them – have a hard time getting used to it.
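To illustrate the difference in types, here is a small sketch using a ranged integer and booleans where a std_logic_vector address and std_logic flags would traditionally appear. The package, the depth of 16 and the function names are made up for this example.

```vhdl
-- Hypothetical helper package; no IEEE libraries are needed.
package addr_sketch_pkg is

  constant DEPTH : positive := 16;

  -- A ranged integer instead of a std_logic_vector address
  subtype addr_t is natural range 0 to DEPTH - 1;

  -- Returns the following address, wrapping at the end of the range
  function next_addr (a : addr_t) return addr_t;

  -- Returns a boolean instead of a std_logic flag
  function is_last (a : addr_t) return boolean;

end package addr_sketch_pkg;

package body addr_sketch_pkg is

  function is_last (a : addr_t) return boolean is
  begin
    return a = addr_t'high;
  end function is_last;

  function next_addr (a : addr_t) return addr_t is
  begin
    if is_last(a) then
      return 0;
    else
      return a + 1;
    end if;
  end function next_addr;

end package body addr_sketch_pkg;
```

With these types a condition reads `if is_last(addr) then` rather than a comparison against `'1'`, and an out-of-range address shows up as a simulation error instead of silently wrapping.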
This coding style does not stop you from doing nasty things regarding speed/area issues, but at least those look prettier. ;)
– Matthias