FWIW if your board has two 16bit flash chips to generate a 32bit memory device, you must wait for both chips to signal that the erase (or write) has completed - that was the first bug I had to fix in the vendor supplied software for an evaluation board! (vendor and board will remain nameless!!).
I don't beleive there is any reason to do any kind of delay between the reads of the status register. Once the erase/write is complete I think you then write to the device to terminate the erase/write - so this would have to be done inside the erase/write function.
(It is over 10 years since I was doing this.)
Just look at the source for alt_epcs_erase_block().