Design Considerations for Electronics in Space
This is what I do! Many, many excellent books have been written on the subject, but here's a brief overview, focused particularly on embedded systems for space use:
In general, we apply many of the high-reliability design practices distilled from decades of hard-learned lessons in defense, aviation, and even automotive (brake controllers, ABS). This includes methods of fault tolerance (n-redundancy, fail-safe behavior, etc.), rigorous analysis and quality control of software and hardware, and observance of the many standards written on the subject (especially critical if you work for a traditional space customer).
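As a toy illustration of one of those fault-tolerance methods (a sketch I'm adding here, not anyone's flight code -- all the names are made up), software triple-modular redundancy keeps three copies of a critical value and majority-votes on every read, so a single upset copy is out-voted and scrubbed away:

```c
#include <stdint.h>

/* Sketch of software triple-modular redundancy (TMR): three copies of a
 * critical value, bitwise majority vote on read, rewrite on every read
 * so a single corrupted copy is repaired before a second upset lands. */
typedef struct {
    uint32_t copy[3];
} tmr_u32_t;

static void tmr_write(tmr_u32_t *v, uint32_t value)
{
    v->copy[0] = value;
    v->copy[1] = value;
    v->copy[2] = value;
}

static uint32_t tmr_read(tmr_u32_t *v)
{
    /* Each output bit is set iff it is set in at least two copies. */
    uint32_t voted = (v->copy[0] & v->copy[1]) |
                     (v->copy[1] & v->copy[2]) |
                     (v->copy[0] & v->copy[2]);
    tmr_write(v, voted); /* scrub: refresh all copies with the voted value */
    return voted;
}
```

Real flight software pairs this kind of thing with watchdogs and periodic scrubbing so an upset never sits around waiting for a second one to join it.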
For electronics specifically, ionizing radiation -- once you leave the shelter of Earth's magnetosphere and atmosphere -- is the big one. As a gross oversimplification, we can split radiation effects into two classes: total ionizing dose (TID) and single-event effects (SEE). Both have mitigations that range from throwing lots of money at specialized hardware to clever software/design solutions that can mitigate the effects well enough at much lower cost.
TID is exactly what it sounds like -- over time, you accumulate damage from ionizing radiation and eventually your semiconductors cease to behave like semiconductors. The effects vary hugely based on process size, device makeup, and many other device-level factors, but one you may see is MOSFET threshold-voltage shift -- picture an N-channel MOSFET whose Vt slowly drifts downward until it is always on. Some incredibly hardened processes have been developed to survive very high total doses -- the Jupiter-bound Juno mission carries some incredible hardware inside a massive, literal vault.
A side note on TID: since radiation effects are of course also of interest for terrestrial applications such as nuclear weapons, testing is often done at both high and low dose rates. Some semiconductor devices respond differently to each -- for example, a paper I read subjected an LDO to both high and low dose rates. One degraded the Brokaw band-gap reference, drooping the output voltage over time; the other degraded the beta of the output transistor, reducing the available output current over time.
Single-event effects can also be observed on Earth -- most people are familiar with ECC DDR memory for critical applications, for example. Additionally, most commercial aircraft must factor this in, since their operating altitude is high enough that high-energy neutrons can cause electronic malfunctions. This is popularly referred to as 'bit flips' -- an energetic particle travels through a circuit, depositing energy (characterized by its linear energy transfer, LET) that may be sufficient to cause a bit upset (SEU), a latch-up condition (SEL) that leads to high current draw due to parasitic BJT behavior, MOSFET gate rupture (SEGR), or burn-out (SEB). You could broadly class any such event that results in a system failure as a SEFI -- a single-event functional interrupt.
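To make 'bit flip' concrete, here's a minimal sketch (my own illustration, with hypothetical names) of a classic detection trick: store a critical value alongside its bitwise complement, so a single upset in either copy makes the pair inconsistent. Note this only *detects* corruption -- correction needs real ECC or voting:

```c
#include <stdbool.h>
#include <stdint.h>

/* A critical value stored with its bitwise complement. A single bit
 * flip in either word breaks the value/inverse relationship. */
typedef struct {
    uint32_t value;
    uint32_t inverse;
} guarded_u32_t;

static void guarded_write(guarded_u32_t *g, uint32_t v)
{
    g->value   = v;
    g->inverse = ~v;
}

/* Returns true and fills *out if the pair is consistent; false means
 * at least one bit was upset and the caller must recover. */
static bool guarded_read(const guarded_u32_t *g, uint32_t *out)
{
    if ((g->value ^ g->inverse) != UINT32_MAX)
        return false; /* detected corruption */
    *out = g->value;
    return true;
}
```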
I'll call out latch-up specifically. There are terrestrial specifications for latch-up under JESD78, but those are not designed for radiation-induced latch-up. The mechanism is similar in both cases -- a parasitic thyristor (coupled NPN/PNP) structure inherent in conventional CMOS construction can be triggered, creating a low-impedance path from power to ground. This of course results in large amounts of current flowing through a part of the chip that was never designed for it. Remembering the current densities that bond wires and various portions of the die are designed for, if this situation is not remedied, that chip will die a fiery death. A common mitigation is an upstream current sensor that reacts by cutting the power supply to clear the latch-up.
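A minimal sketch of that mitigation might look like the following. To be clear, the sensor and switch functions, thresholds, and timing here are all assumptions standing in for whatever current monitor (shunt + ADC, say) and load switch a real board would have:

```c
#include <stdbool.h>
#include <stdint.h>

#define TRIP_MA     500u  /* assumed over-current trip threshold */
#define OFF_TIME_MS 100u  /* assumed power-off dwell to let the latch clear */

extern uint32_t read_load_current_ma(void); /* hypothetical current monitor */
extern void load_switch(bool on);           /* hypothetical rail switch */
extern void delay_ms(uint32_t ms);

/* Call periodically from a supervisor loop. On over-current, assume
 * latch-up and power-cycle the rail: removing power collapses the
 * parasitic thyristor, after which the load can come back up. */
void latchup_monitor_step(void)
{
    if (read_load_current_ma() > TRIP_MA) {
        load_switch(false);
        delay_ms(OFF_TIME_MS);
        load_switch(true);
    }
}
```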
In terms of software and processors, I distill it down to two major issues. One is protecting volatile memory -- register files, RAM (SRAM/DRAM), etc. It would be unfortunate if your program counter took an SEU and execution suddenly skipped somewhere else. The second is protecting non-volatile memory -- your software is useless if it gets corrupted and cannot execute. The usual volatile-memory protection is ECC (usually SECDED) plus continuous scrubbing for errors. Non-volatile memory is much harder -- large quantities of hardened memory are incredibly expensive to purchase, much to the detriment of NASA/ESA science missions. Some folks use n-redundancy, others use natively hardened technologies like MRAM or FRAM (to a degree, for COTS work), and others pay vendors upwards of six figures for high-reliability, mission-critical storage.
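For flavor, here's what a naive software scrub pass can look like. The base address and region size are placeholders I made up; on hardware with SECDED ECC, reading a word lets the controller correct a single-bit error, and writing it back commits the fix before a second upset in the same word becomes uncorrectable:

```c
#include <stddef.h>
#include <stdint.h>

#define SCRUB_BASE  ((volatile uint32_t *)0x20000000u) /* assumed SRAM base */
#define SCRUB_WORDS (64u * 1024u / 4u)                 /* assumed 64 KiB region */

/* Walk the region, letting ECC hardware correct single-bit errors on
 * read, then write back the corrected word to repair it in memory. */
void scrub_pass(void)
{
    for (size_t i = 0; i < SCRUB_WORDS; i++) {
        uint32_t w = SCRUB_BASE[i];
        SCRUB_BASE[i] = w;
    }
}
```

In practice this is usually a dedicated hardware scrubber or a carefully interlocked background task; the point is simply to sweep memory faster than upsets accumulate, so single-bit errors never get the chance to pair up into uncorrectable double-bit errors.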
Mechanically, at least in LEO, you're thermal cycling between sun and darkness roughly every 45 minutes. This is in addition to needing to survive the rigors of launch -- my mechanical colleagues have a set of requirements they design to (I believe part of it is GEVS) to make sure we survive the high-G launch of a rocket. They do an impressive amount of analysis and pre-launch testing to make sure we don't become pieces of flotsam on the way up. In assembly, we avoid lead-free solders (tin whiskers are a real hazard) and conformal coat all electrical assemblies.
Thermally, there's no convection in space. For high-power ICs, the only paths for heat transfer are radiation and conduction, so interesting heat-sink designs must be considered to effectively remove heat from a device using only those two methods. Additionally, testing on the ground becomes harder, because not only do you need a thermal chamber, you need a vacuum chamber as well. Here are some pictures of JPL's TVAC chambers.
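Back-of-the-envelope, you can size a radiator from the Stefan-Boltzmann law, P = epsilon * sigma * A * (T^4 - Tsink^4). Here's a quick sketch; every number in it is an assumption I picked for illustration (and it ignores solar/albedo loading), not from any real design:

```c
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double sigma  = 5.670374419e-8; /* Stefan-Boltzmann, W/(m^2 K^4) */
    const double eps    = 0.85;           /* assumed radiator emissivity */
    const double T_rad  = 320.0;          /* K, assumed radiator temperature */
    const double T_sink = 4.0;            /* K, deep-space sink temperature */
    const double P_diss = 50.0;           /* W to reject, assumed */

    /* Net radiated flux per unit area, then the area needed for P_diss. */
    double flux = eps * sigma * (pow(T_rad, 4.0) - pow(T_sink, 4.0));
    printf("Required radiator area: %.3f m^2\n", P_diss / flux);
    return 0;
}
```

With those made-up numbers you get roughly 0.1 m^2 -- which is why radiator area, not the heat sink itself, often drives the mechanical design.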
Working in "new space", where folks aren't building massive GEO/MEO birds that support critical national security or commercial needs, often COTS parts are flown after undergoing testing / analysis on the ground to see how they fare. While one can purchase a flight-ready, several hundred-krad tolerant 74xx00 quad-NAND gate for a few hundred bucks, some folks may test lots of 74LVC00 or similar parts to see how they fare as well. It's all in the amount of risk you're willing to tolerate.
My background is in designing automotive, consumer, and industrial electronics before I entered space work. So my thought process often starts as "man, I'm going to use that awesome monolithic, low-power, state-of-the-art part! Oh, wait -- space." That usually gives way to thinking about how far I can discretize and minimize that solution using a stable of radiation-tolerant or radiation-hardened components whose radiation performance is known (either from testing or from predictions based on process technology).
Some good books / resources to read:
- Space Mission Analysis and Design
- Spacecraft-Environment Interactions
- IEEE Xplore (in general)
- NSREC proceedings (in general)
- Various NASA slides / presentations on the subject (GSFC and others do a lot of this work)
If this answer picks up more interest, I'll likely swing back around to fill it out / edit it to be cleaner.
In short: thermal considerations; mechanical considerations and outgassing when operating in a vacuum; radiation and the resulting upsets and damage; vibration and shock during launch; export controls on devices and documentation; and a limited or nonexistent ability to effect repairs or physical upgrades.