Amazing article, thanks! However, I think when you said "...heat was transmitted by convection through the metal plate inside each page...", you probably meant conduction, right?
Convective heat transfer in metal would be a worrying event on the Space Shuttle!
I don't know if these boards were flown. They were coated with conformal coating (which I hate for reverse-engineering), which is usually omitted from prototypes. I believe that bodge wires are okay for flight if they are done properly.
Interestingly, the earliest capacitors were glass jars (called Leyden jars) [0]. I was taught that early inventors thought the charge accumulated within the jar itself, and it wasn't until much later that people realised the shape was irrelevant: only the area of the conductors and the distance between them matter.
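To put rough numbers on the "only area and distance" point, here is a minimal sketch of the ideal parallel-plate formula, C = ε0 · εr · A / d. The plate size, gap, and the relative permittivity of 5 for glass are my own illustrative assumptions, not measurements of any real Leyden jar, and the formula ignores edge effects.

```python
# Ideal parallel-plate capacitance: C = eps0 * eps_r * area / distance.
# All example values below are illustrative assumptions.

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m


def plate_capacitance(area_m2, distance_m, eps_r=1.0):
    """Return the capacitance in farads of an ideal parallel-plate capacitor."""
    if area_m2 <= 0 or distance_m <= 0:
        raise ValueError("area and distance must be positive")
    return EPSILON_0 * eps_r * area_m2 / distance_m


if __name__ == "__main__":
    # Hypothetical jar: 0.01 m^2 of foil, 2 mm of glass (eps_r assumed to be 5).
    c = plate_capacitance(0.01, 0.002, eps_r=5.0)
    print(f"{c * 1e12:.0f} pF")
```

Doubling the area doubles the result and doubling the gap halves it. The shape of the conductors never appears in the formula.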
oh that is absolutely fascinating to see in detail
I wonder if the very low density (relatively speaking to today) makes them more robust against gamma-rays and other radiation problems once outside the atmosphere?
if I remember correctly, and it's been decades of course
four of the computers ran in parallel with the exact same instructions in case one failed or came up with a wrong answer
and the fifth computer was the "decider"
is that understanding correct?
ah I see now you mention
> Eight networks were assigned to flight-critical systems,
> with each CRT display and engine controller connected to four networks for redundancy.
Yes, the low density and TTL chips (instead of MOS) helped against radiation. When the Shuttle computers moved to semiconductor RAM, they needed extensive error correction, as well as a process that constantly fixed bit errors, as the memory would get multiple errors per flight due to cosmic rays.
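For anyone who hasn't seen error correction plus scrubbing, here is a toy sketch. It uses a Hamming(7,4) code, which is my choice for illustration only. The thread doesn't say which code the Shuttle memory actually used, and a real system would protect much wider words. The function names are hypothetical.

```python
# Toy single-error-correcting memory with a background "scrubber".
# Hamming(7,4) is an illustrative stand-in, not the Shuttle's actual code.

def encode(nibble):
    """Encode 4 data bits into a 7-bit codeword (bit p-1 holds position p)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    bits = [0] * 8                      # positions 1..7 are used
    bits[3], bits[5], bits[6], bits[7] = d
    bits[1] = bits[3] ^ bits[5] ^ bits[7]   # parity over positions 1,3,5,7
    bits[2] = bits[3] ^ bits[6] ^ bits[7]   # parity over positions 2,3,6,7
    bits[4] = bits[5] ^ bits[6] ^ bits[7]   # parity over positions 4,5,6,7
    return sum(bits[p] << (p - 1) for p in range(1, 8))


def syndrome(word):
    """XOR of the positions of all set bits; 0 means no detectable error."""
    s = 0
    for p in range(1, 8):
        if (word >> (p - 1)) & 1:
            s ^= p
    return s


def correct(word):
    """Flip the single bit named by the syndrome, if there is one."""
    s = syndrome(word)
    return word ^ (1 << (s - 1)) if s else word


def decode(word):
    """Correct the word, then extract the 4 data bits."""
    w = correct(word)
    return sum(((w >> (p - 1)) & 1) << i for i, p in enumerate((3, 5, 6, 7)))


def scrub(memory):
    """Walk memory, rewrite any word with a single-bit error, return the count."""
    fixed = 0
    for addr, word in enumerate(memory):
        repaired = correct(word)
        if repaired != word:
            memory[addr] = repaired
            fixed += 1
    return fixed
```

The point of scrubbing is timing: a code like this fixes only one flipped bit per word, so the scrubber has to repair each single error before a second hit lands in the same word.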
As far as redundancy, it's complicated. During critical flight phases, four computers would run the main software (PASS, Primary Avionics Software System), while the fifth computer was ready with the Backup Flight Software (BFS). The backup software was written by a completely different team to ensure that a software bug couldn't crash all the computers at once. In orbit, they used fewer redundant computers to free up computers for payload operations and stuff.
The four computers constantly checked the results from each other and would vote out a faulty system. Voting ensured that a bad computer couldn't vote out the good ones (Byzantine failure). Moreover, the actuators hydraulically voted on the results from the computers: if one computer tried to push a valve in a different direction, the three good computers would physically overpower the bad computer's action at the level of the hydraulic pistons.
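A rough sketch of the two layers described above: a software-style majority vote over the computers' outputs, and "force summing" at the actuator where opposing commands simply add up. The names (`vote`, `force_sum`, the `GPC1`..`GPC4` labels) and the integer outputs are hypothetical; the real comparison logic is certainly more involved.

```python
# Illustrative majority vote plus force-summed actuator (not the flight logic).
from collections import Counter


def vote(outputs):
    """Return (majority_value, dissenters) for a dict of computer -> output.

    If no value has a strict majority, return (None, []).
    """
    counts = Counter(outputs.values())
    value, n = counts.most_common(1)[0]
    if n * 2 <= len(outputs):
        return None, []
    dissenters = sorted(name for name, v in outputs.items() if v != value)
    return value, dissenters


def force_sum(commands):
    """Net push on the actuator when each channel contributes one command."""
    return sum(commands)


if __name__ == "__main__":
    outputs = {"GPC1": 10, "GPC2": 10, "GPC3": 10, "GPC4": 7}
    print(vote(outputs))               # majority value and who disagreed
    print(force_sum([+1, +1, +1, -1]))  # three channels overpower one
```

With four channels, a single bad command of -1 against three of +1 still leaves a net push in the correct direction, which is the hydraulic version of being outvoted.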
Thanks so much for the information. I am familiar with the voting logic (I've worked on systems that implemented the same thing: an odd number of processor cores, and the majority wins).
One question, were any "misbehaving" processor or actuation requests ever logged? As in, were there examples where one actuator or CPU didn't agree in the Shuttle flights?
There have been a fair number of GPC failures [1], and computers have been voted out. I haven't looked closely enough to see how many were "disagreements" versus hard failures or self-check failures.
It's unlikely that you'd get a simultaneous tie; you'd expect one computer to go bad before the other. But I think in that case, the astronauts switch to the Backup Flight System, the fifth computer.
Mission STS-9 had two computer failures, causing landing to be delayed by 7 3/4 hours. They carried a sixth computer as a backup for following missions.
As far as how the voting works, each computer has a signal indicating what it thinks the status is of each computer, including itself. (Computers can detect many failures from self-checking, such as parity errors.) Each IOP uses these votes to determine the "redundant set", calculating the votes in hardware. The status is also displayed to the astronauts in a 5×5 grid. Astronauts can power down a computer or reboot it.
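Here is a sketch of how a redundant set could be derived from that grid of status signals. The rule is my assumption for illustration: a computer is dropped if it reports its own failure or if a strict majority of the other members accuse it. The actual IOP hardware rule isn't spelled out here, and `redundant_set` is a hypothetical name.

```python
# Illustrative redundant-set calculation from a grid of fail votes.
# The majority-of-peers rule is an assumption, not the documented IOP logic.

def redundant_set(fail_votes, members):
    """fail_votes[i][j] is True when computer i believes computer j has failed.

    Returns the members that stay in the redundant set.
    """
    kept = []
    for j in members:
        if fail_votes[j][j]:
            continue                      # self-check failure: drop itself
        others = [i for i in members if i != j]
        against = sum(1 for i in others if fail_votes[i][j])
        if against * 2 > len(others):
            continue                      # majority of peers voted it out
        kept.append(j)
    return kept


if __name__ == "__main__":
    n = 5
    grid = [[False] * n for _ in range(n)]
    for good in (0, 1, 2):
        grid[good][3] = True              # good computers accuse computer 3
        grid[3][good] = True              # faulty computer 3 accuses them back
    print(redundant_set(grid, [0, 1, 2, 3]))
```

In this example computer 3 accuses everyone, but one accusation out of three peers isn't a majority, so the good computers stay in the set and only computer 3 is removed.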
Favorite story I heard about voting was an anecdote relating to the flight computers on one of the Boeing 7xx jets (probably the 757, but I don't know).
The story was that they were planning to fly with 3 computers, and that they would "vote" on important decisions.
The real trick was that they intended to build those computers with 3 separate teams, using clean room implementation (no coordinating with the other teams), and that they were going to use 3 separate CPU architectures, and even 3 different implementation languages.
As I understand it, they conceded on the language choice and all used the same language, but I don't know about the rest.
The goal was to avoid some catastrophic "unknown unknown" that might have crept into the implementation if they simply rolled out 3 copies of the same system.
> very low density (relatively speaking to today) make them more robust against gamma-rays and other radiation problems once outside the atmosphere?
Yes. Large transistors (and other IC components) are less impacted by the radiation problems that exist outside the relative security of the atmosphere. Most radiation-hardened IC circuitry is many process sizes larger than whatever the current state-of-the-art process size happens to be at any given time.
But note I said "less impacted". Given sufficient radiation, things will have issues, which is why items like the Shuttle carried the redundant computers, to cover for the possible lucky-strike impacts.
FYI - The link for Peter Kogge is broken and should probably link to https://en.wikipedia.org/wiki/Peter_Kogge
[0] https://en.wikipedia.org/wiki/Leyden_jar
[1] Search for "GPC" in the Mission Summary report: https://newspaceeconomy.ca/wp-content/uploads/2023/05/space-...
do you know anything about the military's secret space-shuttle still in operation?
I'm sure it's either been very modernized or runs on completely different design since it's supposedly remote-control