Forum Discussion
Altera_Forum
Honored Contributor
18 years ago
- I think you've got the most important issue covered, which is that many users want safe state machines when their device isn't even susceptible. You recognize that you're in an environment where this can happen.
- Safe state machines are not always safe just because they return to a known state. You already get this, but take the example where the SM is just a sequencer and it goes from s3 -> unknown -> s0. The inputs might be at a point that tells it to go to s4, and they will sit at those values while the SM never advances because it's in s0. I think what you're doing is the best, whereby you're purposefully looking at the error and deciding what to do. This is also the most difficult.
- Polling (majority voting) is often the most robust, whereby the SM is built in triplicate, and the outputs look at all three states and compare them. If they ever don't match, the output is taken from the two that hopefully agree, and the third SM is reset or otherwise brought back in line with the other two. I've seen FPGAs put down on a board this way (three in parallel). Naturally, this is extremely costly and often not feasible, but it's really one of the more foolproof, easy-to-understand methods of handling these errors. (Plus, the outputs should never go to an unprepared state, i.e. they get their data from the two circuits that are working while the third one is fixed.)
- I don't know your system, but how much of a failure it can handle plays a lot into this. For example, lots of systems require the error to be recognized, but can handle being off-line for a little bit. For these, the user just puts in a lot of flags and checks for error conditions all over the place (state machine going to states it shouldn't, counters at count values they shouldn't reach, data that looks corrupted like CRC failures, etc.). These flags can be software interrupts or something. Of course, if you're controlling the ejector seat on a jet, that might not be feasible.
- I hope there's good reading material on this subject, but I've never gone down that path. The bottom line is that you're trading area/cost and performance to gain reliability under your conditions. It really requires looking at as much of the control logic in your system as you can, trying to figure out how it would deal with random changes, and coding in how it would recover. This usually isn't simulatable (not because you can't flip a SM to another state, but because you can't do every permutation at every time in your sim).
- One last thing is you might want to use a family that has the internal configuration bit checker: http://www.altera.com/literature/wp/wp-01012.pdf
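
For what it's worth, here is a rough behavioral sketch of the triplicate-and-vote idea from above. It is plain Python rather than HDL, purely to show the logic, and everything in it is made up for illustration: the five-state sequencer, the names `TriplicatedSequencer`, `vote` and `upset`, and the choice to resynchronize the odd copy by loading the voted state. A real design would do this in registers with a combinational voter, and the recovery policy would depend on your system.

```python
from collections import Counter

NUM_STATES = 5  # hypothetical sequencer with states s0..s4


def next_state(state: int) -> int:
    """Advance the hypothetical sequencer: s0 -> s1 -> ... -> s4 -> s0."""
    return (state + 1) % NUM_STATES


def vote(states):
    """Return (majority_state, indices_of_copies_that_disagree).

    Raises RuntimeError if no two copies agree, since voting cannot
    recover from more than one corrupted copy.
    """
    winner, count = Counter(states).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no majority: more than one copy is corrupted")
    bad = [i for i, s in enumerate(states) if s != winner]
    return winner, bad


class TriplicatedSequencer:
    """Three copies of the same sequencer with a majority voter (assumed design)."""

    def __init__(self):
        self.copies = [0, 0, 0]  # all three copies start in s0
        self.error_count = 0     # error flag/counter a supervisor could poll

    def step(self) -> int:
        """Clock all three copies once, then vote and resynchronize."""
        self.copies = [next_state(s) for s in self.copies]
        voted, bad = vote(self.copies)
        for i in bad:
            self.copies[i] = voted  # bring the odd copy back in line
            self.error_count += 1   # record that an upset was seen
        return voted

    def upset(self, index: int, value: int) -> None:
        """Simulate a random upset by overwriting one copy's state register."""
        self.copies[index] = value


if __name__ == "__main__":
    sm = TriplicatedSequencer()
    for _ in range(3):
        sm.step()              # all copies are now in s3
    sm.upset(1, 7)             # copy 1 jumps to an illegal state
    print(sm.step())           # voted output is still s4
    print(sm.error_count)      # one upset was detected and repaired
```

The voted output never leaves the legal sequence while only one copy is upset, which is the "outputs never go to an unprepared state" point. The error counter is there because silently repairing the copy would hide the event from the rest of the system.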