One thing worth remembering is that the maximum stack use is likely to be in an error path somewhere, possibly inside printf() or the logging functions. So looking at how much of the stack area has been written to isn't necessarily that useful: unless testing happened to exercise that path, the high-water mark will understate the worst case.
For code that has no recursive or indirect calls and doesn't use alloca(), it is possible to do a static analysis, since compiler-generated code (and most hand-written assembler) will have a fixed stack size/offset for each function and call site. To get those numbers, parse the compiler asm listing (gcc -S -fverbose-asm ...) or a disassembly of a fully fixed-up program image.
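Once the per-function and per-call-site numbers have been extracted, the worst case is just the deepest path through the call graph. The following is only a rough sketch of that last step, not a complete tool: the struct layout, the function names and all the byte counts are made up for illustration, and it assumes the call graph really is acyclic (no recursion, no indirect calls, no alloca()).

```c
#include <stddef.h>

#define MAX_CALLS 4  /* arbitrary limit for this sketch */

/* One call site inside a function (hypothetical layout). */
struct call {
    int callee;       /* index into the function table, -1 ends the list */
    unsigned offset;  /* bytes of stack in use at the call, incl. return address */
};

/* One function, as it might be filled in from the asm listing. */
struct func {
    const char *name;
    unsigned frame;                /* peak stack use of the function body itself */
    struct call calls[MAX_CALLS];  /* terminated by callee == -1 */
    long worst;                    /* memoised result, -1 = not computed yet */
};

/* Worst-case stack depth when execution enters function i.
 * Only valid for an acyclic call graph; recursion would loop forever. */
long worst_stack(struct func *tab, int i)
{
    struct func *f = &tab[i];

    if (f->worst >= 0)
        return f->worst;

    long deepest = (long)f->frame;
    for (size_t k = 0; k < MAX_CALLS && f->calls[k].callee >= 0; k++) {
        const struct call *c = &f->calls[k];
        long d = (long)c->offset + worst_stack(tab, c->callee);
        if (d > deepest)
            deepest = d;
    }
    f->worst = deepest;
    return deepest;
}

/* Made-up example data: the deepest path goes through the error logger. */
struct func example[] = {
    { "main",      32,  { {1, 32}, {2, 24}, {-1, 0} }, -1 },
    { "process",   64,  { {2, 64}, {-1, 0} },          -1 },
    { "log_error", 256, { {-1, 0} },                   -1 },
};
```

Calling worst_stack(example, 0) walks main -> process -> log_error and adds up 32 + 64 + 256 bytes, so with these invented numbers the logging function dominates the total even though it would rarely run in normal testing.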