What could cause a microcontroller to unexpectedly reset?
On PIC and dsPIC chips, I have observed the following causes of unexpected reset.
Hardware:
- Reset pin driven low or floating. Check the obvious stuff first!
- ESD coupling into the reset pin. I've seen this happen when completely unrelated equipment is switched on at the same desk. Make sure there's enough capacitance on the reset pin, possibly as much as 1 uF.
- ESD coupling into other pins of the processor. Scope probes in particular can act as antennae, couple noise into the chip and cause odd resets. I've heard reports of "invalid opcode" reset codes.
- Bad solder joint/intermittent bridge. Could be losing or shorting a power rail, either on the processor or somewhere else on the board.
- Power rail glitch/noise. Could be caused by any number of external problems, including a damaged regulator or a dip in the upstream supply. Make sure the power rails feeding the processor are stable. May require more cap somewhere, perhaps decoupling cap directly on the processor.
- Some microcontrollers have a Vcap pin, which must not be connected to VDD and must have its own capacitor to common. Failure to connect this pin properly may have unpredictable results.
- Driving an analog input negative past a certain limit causes a reset that reports in RCON like a brownout. The same may be true of digital inputs.
- Very high dV/dt in a nearby power converter can cause a brownout reset. (See this question.) I have seen this in two cases, and in one I was able to track it to capacitive coupling. An IGBT was switching 100-200 amps, and at turn-off some feedback circuits were seeing a few microseconds of noise, going from 2V to over 8V on a 3.3V processor. Increasing the filter cap on that feedback rail made the resets stop. One could imagine that adding a dV/dt filter across the transistor might have had a similar effect.
Software:
- Watchdog timer. Make sure the watchdog timer is cleared often enough, especially in branches of your code that may take a long time to execute, like EEPROM writes. Test for this by disabling the watchdog to see if the problem goes away.
- Divide-by-zero. If you're performing any divide operation in your code, make sure the divisor can never be equal to zero. Add a bounds check before the division. Don't forget that this also applies to modulo operations.
- Stack overflow. Too many nested function calls can cause the system to run out of the RAM available for the stack, which can lead to crashes at unusual points in code execution.
- Stack underflow. If you are programming in assembler, you can accidentally execute more RETURNs than you executed CALLs.
- Non-existent interrupt routine. If an interrupt is enabled, but no interrupt routine is defined, the processor may reset.
- Non-existent trap routine. Similar to an interrupt routine, but different enough I'm listing it separately. I've seen two separate projects using dsPIC 30F4013 which reset randomly, and the cause was tracked to a trap that was called but undefined. Of course, now you have the question of why a trap is called in the first place, which could be any number of things, including silicon error. But defining all trap handlers should probably be a good early step in diagnosing unexplained resets.
- Function pointer failure. If a function pointer does not point to a valid location, dereferencing the pointer and calling the function pointed to can cause a reset. One amusing cause of this was when I was initializing a structure, with successive values of NULL (for a function pointer) and -1 (for an int). The comma got typoed, so the function pointer actually got initialized to NULL-1. So don't assume that just because it's a CONST it must contain a valid value!
- Invalid/negative array index. Make sure you perform bounds checking on all array indices, both upper and lower bounds, if applicable.
- Creating a data array in program memory that's larger than the largest section of program memory. This may not even throw a compilation error.
- Casting the address of a struct to a pointer to another type, dereferencing that pointer, and using the dereferenced pointer as the LVALUE in a statement can cause a crash. See this question. Presumably, this also applies to other undefined behaviors.
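Several of the software causes above (divide-by-zero, bad array indices, bad function pointers) can be guarded with small checks. The following is only a minimal sketch in portable C: every name in it (safe_div, table_get, call_if_known, demo_handler) is hypothetical and not taken from any vendor library, and returning false on failure is just one possible policy. The same divisor check applies to modulo.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; nothing comes from a vendor library. */

typedef void (*handler_fn)(void);

/* Divide only when the result is defined: rejects a zero divisor and the
 * INT16_MIN / -1 case, whose result does not fit in int16_t. */
bool safe_div(int16_t num, int16_t den, int16_t *out)
{
    if (out == NULL || den == 0) {
        return false;
    }
    if (num == INT16_MIN && den == -1) {
        return false;
    }
    *out = (int16_t)(num / den);
    return true;
}

/* Read table[index] only when index is within [0, len). */
bool table_get(const int16_t *table, size_t len, int16_t index, int16_t *out)
{
    if (table == NULL || out == NULL) {
        return false;
    }
    if (index < 0 || (size_t)index >= len) {
        return false;
    }
    *out = table[index];
    return true;
}

/* Call fn only if it matches an entry in a table of known-good handlers.
 * A plain NULL check would not catch a value such as NULL-1. */
bool call_if_known(handler_fn fn, const handler_fn *known, size_t n_known)
{
    if (fn == NULL || known == NULL) {
        return false;
    }
    for (size_t i = 0; i < n_known; i++) {
        if (fn == known[i]) {
            fn();
            return true;
        }
    }
    return false;
}

/* Example handler used only to demonstrate call_if_known(). */
unsigned demo_calls = 0;

void demo_handler(void)
{
    demo_calls++;
}
```

Comparing against a table of known handlers costs a short loop per call, but it is the only one of these checks that would have caught the NULL-1 initializer described above.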
On some dsPICs, the RCON register stores bits indicating the cause of the reset. This can be very helpful when debugging.
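As a rough illustration of using RCON, here is a sketch that classifies a saved copy of the register. The bit positions are an assumption based on the dsPIC30F/33F layout, so check them against the datasheet for your part. The enum and the function name classify_reset are made up for this example, and the order of the tests is a judgment call (power-on is tested first because a power-on reset typically sets the brown-out flag as well).

```c
#include <stdint.h>

/* Bit positions assumed from the dsPIC30F/33F RCON layout; verify against
 * the datasheet for your part. */
#define RCON_POR    (1u << 0)   /* power-on reset */
#define RCON_BOR    (1u << 1)   /* brown-out reset */
#define RCON_WDTO   (1u << 4)   /* watchdog time-out */
#define RCON_SWR    (1u << 6)   /* software RESET instruction */
#define RCON_EXTR   (1u << 7)   /* external reset pin */
#define RCON_IOPUWR (1u << 14)  /* illegal opcode / uninitialized W access */
#define RCON_TRAPR  (1u << 15)  /* trap conflict */

/* Hypothetical classification used only in this sketch. */
enum reset_cause {
    RESET_UNKNOWN,
    RESET_POWER_ON,
    RESET_BROWN_OUT,
    RESET_TRAP_CONFLICT,
    RESET_ILLEGAL_OPCODE,
    RESET_WATCHDOG,
    RESET_SOFTWARE,
    RESET_EXTERNAL_PIN
};

/* Classify a snapshot of RCON. Takes the value as a parameter so the
 * logic can be exercised off-target as well. */
enum reset_cause classify_reset(uint16_t rcon)
{
    if (rcon & RCON_POR)    return RESET_POWER_ON;
    if (rcon & RCON_BOR)    return RESET_BROWN_OUT;
    if (rcon & RCON_TRAPR)  return RESET_TRAP_CONFLICT;
    if (rcon & RCON_IOPUWR) return RESET_ILLEGAL_OPCODE;
    if (rcon & RCON_WDTO)   return RESET_WATCHDOG;
    if (rcon & RCON_SWR)    return RESET_SOFTWARE;
    if (rcon & RCON_EXTR)   return RESET_EXTERNAL_PIN;
    return RESET_UNKNOWN;
}
```

On the target, one would copy RCON into a uint16_t at the very start of main, pass that copy to classify_reset, log or store the result, and then clear the flags so that stale bits do not confuse the reading after the next reset.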
The RESET pin must be properly driven by a reset circuit that monitors over/undervoltage and generates a long enough reset signal. With that in mind, my experience with uncontrolled hardware resets comes from:
- Crosstalk from switching lines into the RESET pin/line (keep it short)
- Ground shifts/loops caused by switching an external high-current load on or off
- Voltage spikes that are not filtered by the power supply and are too short to activate a proper RESET
- External loads switched by the microcontroller itself, which cause the problems above (mainly inductive loads such as motors and relays, or old lamps with their inrush current)
- Voltage/current spikes on any of the microcontroller pins (the oscillator is the worst case) can cause reverse current and may flip internal registers (the same goes for voltage spikes on the supply line). In general, caution is needed when interfacing to an industrial environment (for more, see: http://www.ichaus.biz/wp1_mcu_interface ). Level shifting on I/Os, input filtering, and soft-switched outputs all need to be considered. Clean supply lines are the first priority on the hardware side, then the RESET and oscillator pins, then the I/O lines. -mm
One additional possibility I did not see in this list is a device that supports ICSP. If the pull-ups on the lines that trigger in-circuit serial programming mode are insufficient, the device can sometimes enter that mode at random. This leads to a reset a short time later, when no program update arrives on the designated serial receive lines. I suspect an internal watchdog timer forces a reset if ICSP is started and no programming data is sent. This is a mistake I have made, and I spent a great deal of time tracking it down on a 16F876.