

### Computing in Spaaaaaaace: An Exploration of Extraterrestrial Architecture

Mark Gallagher EECS 573 – Fall 2019





### Outline

1. Introduction
2. Background & Challenges in Designing for Space
3. ???
4. Parting Thoughts on Space Design



# Space Missions

- Why focus on space exploration?
  - Because it's cool.
  - And sure something about useful technology being built

ARTEMIS

- Some Focuses:
  - Location Service
  - Communication
  - Planetary Science
  - Resources

- Meaning of the Universe (42)
- Find Aliens(???)

### **Computers in Space**

- Space systems make use of computers
  - Surprise

- Run calculations, collect data, transmit information
- Autonomous control, orbit stabilization, guidance, propulsion
- Keep astronauts company?







#### Recent Use Case: SpaceX & Starlink

- Starlink 1 just launched Nov. 11, 2019!
  - 60 satellites launched into orbit
- Starlink promises high-speed satellite internet
  - 1 Gbps with latencies of 25-35ms
  - Peer-to-peer with end-to-end encryption
- Remember: they want *global* coverage, planning for 12,000+ satellites
  - High-speed, high-bandwidth, secure, power-constrained, radiation-tolerant, ×12,000?!?
  - Sounds pricey \$\$\$...

What challenges in computer architecture exist for these space missions?


# Designing for Space:

Just throw an iPhone in it!

# What Makes Space Design Difficult?

Think about all the challenges & assumptions of designing on Earth first:

- Your processor uses power
  - Trying to be as energy-efficient as possible
  - Power source usually consistent
- Power use = dissipating heat
  - Use a fan or use liquid cooling
- Bugs?
  - Recreate the bug and try to deploy a fix
  - Hardware issues mean fixing it in the field



# The Space Environment

- Power is low & intermittent
  - Solar cells don't always get sun
  - Unless you have nuclear or other power sources
  - High power = high heat, and you can't use fans without air!
- Tolerate as much risk as you can
  - Bugs are super costly
  - Difficult to service most of these platforms after launch
- Security
  - Governments launch some valuable hardware up there (ahem, spy stuff)
  - Chips could be maliciously modified without you knowing (A2 attacks, trojans)



## **Biggest Danger of Space on Modern Electronics**

- Radiation: *it's bad*
- Remember section 1 of this class? "Resilient System Design"?





- Yeah, think of that, but worse.
  - Likelihood of radiation events on Earth is fairly low
    - You can take a performance hit to recover
  - In space...





# What Makes Radiation Effects More Severe?

- More transistors on chip, closer together
  - More likely to get hit by particles
- High clock rates
  - Circuit less likely to stabilize before an error is propagated
- Low threshold voltages
  - Spikes caused by radiation more likely to be treated as a "1" instead of "0"

#### **Everything Modern Processors Do!**



[Figure: Moore's Law – The number of transistors on integrated circuit chips (1971-2018). Source: Our World in Data, licensed under CC-BY-SA by the author Max Roser.]



# Review: Mitigating Soft Errors Caused by Radiation

- Process technology change (\$\$\$\$)
  - Silicon on Insulator (SOI): Sapphire or gallium arsenide wafers
  - Retooling of processor designs
  - Larger startup costs, can't reuse existing, commercial manufacturing process techniques
- Going backwards:
  - Use older process nodes $\rightarrow$ transistors more spread apart
  - Use slower clock rates $\rightarrow$ less chance of latching an error
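
Why a slower clock helps can be sketched with a common first-order approximation: a radiation-induced transient is only captured if it overlaps a clock edge, so the chance of latching it scales roughly with pulse width divided by clock period. The function name, the 200 ps pulse width, and the 3 GHz comparison clock below are illustrative assumptions, not figures from the slides. The model ignores setup/hold windows and electrical or logical masking.

```python
def latch_probability(pulse_width_s: float, clock_hz: float) -> float:
    """First-order chance that a transient pulse overlaps a capturing clock edge.

    Approximation: probability ~= pulse_width / clock_period, capped at 1.
    """
    if pulse_width_s < 0 or clock_hz <= 0:
        raise ValueError("pulse width must be >= 0 and clock rate must be > 0")
    return min(1.0, pulse_width_s * clock_hz)


# Hypothetical transient width, chosen only for illustration.
ASSUMED_PULSE_WIDTH_S = 200e-12

for label, clock_hz in [
    ("200 MHz (rad-hard class clock)", 200e6),
    ("3 GHz (hypothetical commercial clock)", 3e9),
]:
    p = latch_probability(ASSUMED_PULSE_WIDTH_S, clock_hz)
    print(f"{label}: ~{p:.0%} chance of latching the pulse")
```

Under this simple model, and as long as the probability has not saturated at 1, the chance of latching a given pulse grows linearly with clock rate.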

# **Other Radiation Mitigations**

- Triple Modular Redundancy (TMR)
  - Discussed in section 1 of this class
- Basic Procedure:
  - Do something 3 times
  - Check 3 results to ensure they agree
  - If they don't, vote for the most common result
  - If there is no common result (highly unlikely), something really bad happened & you need to do it again
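
The basic procedure above can be sketched in software as a majority voter. This is a minimal illustration, not how any particular flight processor implements TMR: hardware TMR triplicates logic and votes in circuitry, whereas this sketch repeats a computation in time. The names `tmr_vote`, `run_with_tmr`, and `NoMajorityError` are hypothetical, and results are assumed to be hashable values.

```python
from collections import Counter
from typing import Callable, Hashable, TypeVar

T = TypeVar("T", bound=Hashable)


class NoMajorityError(Exception):
    """Raised when all three redundant results disagree."""


def tmr_vote(a: T, b: T, c: T) -> T:
    """Return the value that at least two of the three results agree on."""
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise NoMajorityError(f"no two results agree: {a!r}, {b!r}, {c!r}")
    return value


def run_with_tmr(operation: Callable[[], T], max_attempts: int = 3) -> T:
    """Run `operation` three times per attempt and vote; redo on a three-way split."""
    for _ in range(max_attempts):
        results = [operation() for _ in range(3)]
        try:
            return tmr_vote(*results)
        except NoMajorityError:
            continue  # all three disagree: do the whole thing again
    raise NoMajorityError(f"no majority after {max_attempts} attempts")


print(tmr_vote(42, 42, 17))  # → 42 (the single corrupted copy is outvoted)
```

A single upset corrupts at most one of the three results, so the vote masks it. Only a three-way disagreement forces a retry.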



### Sounds Great!

- Nope.
- We lower the clock rate, increase the voltage, use larger transistors...
- All of this contributes to *lower overall performance*
  - Mars Curiosity Rover:
    - BAE RAD750 Single-Core Processor
    - 200 MHz, 256 MB RAM, 2 GB SSD
- Comparing a rad-hardened chip with the COTS chip:
  - RAD750 incurs ~3x overhead compared to its commercial counterpart, the PowerPC 750 [Lovelly 2017]
  - Almost 5x for multi-core systems (RAD5545 vs. QorIQ P5040)



# Space Flight Legacy

- NASA prefers to not change much about something that has been flight-proven
  - Fewer changes mean less risk
- Architectures are constantly being updated or changed
- Significant financial investments made in:
  - Licensing ISAs to use (Arm, x86, PowerPC...)
  - Verifying software (remember, bugs here are VERY BAD)
- Changes to these architectural parameters can lead to high costs

# Summary of Challenges in Space Systems





# Resilient System Design in Space

# Radiation Hardening by Design

- Radiation hardening by process (i.e., using sapphire) is too expensive
- Instead: equip standard CMOS chips to tolerate radiation
  - Existing manufacturing processes for commercial chips
  - Space industry can catch up to using modern techniques  $\rightarrow$  faster processors!
- Triple Modular Redundancy (TMR) is one of the ways to achieve this

# ESA: Fault-Tolerant GR740 by Cobham

- Europe's latest space-grade processor
- 65nm process technology
- Quad-Core LEON4 SPARC32
  - SPARC is an open ISA





- 2 Main Benefits:
  - Higher performance and capabilities for general processing
  - Fault-Tolerance: built-in Block TMR scheme!



### How the GR740's Block TMR Scheme Works



# Evaluating the GR740's Fault-Tolerance

- Tested by using a particle accelerator to fire heavy ions at it
  - Proved survivability against high radiation levels (LET of 125 MeV·cm<sup>2</sup>/mg)
  - Normal off-the-shelf parts would latch up quickly
- They expect it to face *9 radiation events* (SEUs) a day in geostationary orbit!
  - Theoretically, the fault tolerance is strong enough that an error gets through only once every 350 years
  - Even then, if the error is detected, the processor needs to reset to fix it
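
To put those two figures side by side, the back-of-the-envelope calculation below estimates how many upsets the chip would have to absorb for each error that gets through. It takes the 9 SEUs per day and the 350-year interval at face value and assumes a constant upset rate and a 365.25-day year. The function and constant names are illustrative.

```python
SEUS_PER_DAY = 9            # expected upsets per day in geostationary orbit (from the slide)
YEARS_BETWEEN_ERRORS = 350  # theoretical interval between errors that get through (from the slide)
DAYS_PER_YEAR = 365.25      # assumption: average year length


def upsets_per_error(seus_per_day: float, years_between_errors: float,
                     days_per_year: float = DAYS_PER_YEAR) -> float:
    """Number of upsets expected during one interval between errors."""
    return seus_per_day * days_per_year * years_between_errors


absorbed = upsets_per_error(SEUS_PER_DAY, YEARS_BETWEEN_ERRORS)
print(f"Upsets per error that gets through: {absorbed:,.0f}")
print(f"Fraction of upsets that get through: {1 / absorbed:.2e}")
```

Under these assumptions the chip masks on the order of a million upsets for every one that becomes an error.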







Evgeny Chereshnev (@cheresh), Feb 25, 2018, replying to @elonmusk and @andrestaltz: "Encryption?"

Elon Musk (@elonmusk), Feb 25, 2018: "End-to-end encryption encoded at firmware level. Unlikely to be hacked w current computing tech. If it is (and we learn about it), a crypto fix will go out immediately via network-wide firmware update."



# Supply Chain Security of Space Cores

- US space systems are "controlled goods":
  - ITAR/EAR → export controls (aka, "don't export it")
- US "Trusted Foundries" for manufacturing components
  - Big research area for DARPA
- Everything we talked about in class is relevant here:
  - Physical Unclonable Functions (PUFs)
  - Malicious RTL injection
  - Hardware trojans a threat to cores (A2 attack)



## In Addition to Chip Security...

- Remember Starlink? Promises a *secure* communication network
- Onboard space systems need to handle security operations too
  - Security
  - Authentication

#### Systems need to be capable enough to do this!







# Secure & Bug-Free Systems in Space



# Application-Specific Architectures in Space

# Space Cores Demand More Capabilities

- Space systems lag consumer electronics
- NASA wants...
  - Cloud Services
  - Advanced Vehicle Health Monitoring
  - Improved Displays & Controls
  - Augmented Reality Views
  - Tele-Presence
  - Radar
  - Imaging Spectrometers
  - Automated Guidance, Navigation, Control
  - Science Event Detection & Response
  - Just to name a few...

#### Key Themes: AI, Autonomous Systems, Image Processing, Advanced Sensors



#### Accelerator research needed here too! Power Efficiency + Radiation Tolerance


# NASA: Increasing Performance in Space

- High-Performance Spaceflight Computing (HPSC) Program
  - Boeing & University of Michigan! (Dreslinski, Mudge)
- Multi-Chiplet, SoC architecture
  - Arm-based, multi-core
  - High-performance subsystem
  - Real-time subsystem
  - Redundancies/Fault-Tolerance (TMR)
  - SIMD engines
  - Scalable: can expand to multiple chiplets and other accelerators!



# **Performance Comparisons**

- Europe's GR740: 1.5W
  - Quad-core chiplets, SPARC
  - 250 MHz
- BAE Systems RAD5545: 20W
  - Quad-core, IBM PowerPC
  - 466 MHz
- America's HPSC: 10W
  - Octo-core chiplets, Arm
  - Very capable, SIMD helps
  - Estimates at 500 MHz
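
The listed specs can be tabulated and reduced to a very crude efficiency proxy: aggregate core clock per watt. This is not the MOPS/W metric used in the original chart, whose values are not reproduced in the text. It ignores instructions per cycle, SIMD width, and process node, so it may rank the chips differently than a real benchmark would. The class and method names are illustrative, and a single eight-core HPSC chiplet is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpaceProcessor:
    name: str
    cores: int
    clock_mhz: float
    power_w: float

    def core_mhz_per_watt(self) -> float:
        """Crude proxy only: aggregate clock per watt, NOT measured MOPS/W."""
        return self.cores * self.clock_mhz / self.power_w


PROCESSORS = [
    SpaceProcessor("GR740", cores=4, clock_mhz=250, power_w=1.5),
    SpaceProcessor("RAD5545", cores=4, clock_mhz=466, power_w=20),
    # HPSC clock is an estimate per the slide; one 8-core chiplet is assumed.
    SpaceProcessor("HPSC", cores=8, clock_mhz=500, power_w=10),
]

for cpu in PROCESSORS:
    print(f"{cpu.name:8s} {cpu.core_mhz_per_watt():7.1f} core-MHz/W")
```

Because the proxy gives no credit for SIMD engines or architectural improvements, it should be read only as a sanity check on the raw clock and power figures.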



[Figure: Performance and Power Efficiency (MOPS/W) comparison]

# **Encouraging Innovation with Open Source**

- Microsemi is supporting RISC-V with their Mi-V cores for Rad-Hard FPGAs
- 3 Main Benefits:
  - Trusted IP: open-source cores, viewable RTL code
  - Royalty-Free: RISC-V ISA is open for anyone to use
    - Encourages development in multiple spaces
    - Flexible, can be extended
  - Longevity: standard is *frozen*, code written today can still run in the future
- Preserve Space Flight Heritage



# Conclusions



# Parting Thoughts

- EECS 573 showed 3 focus areas in computer architecture research
- Space systems design is a challenging space that must leverage all 3 of these areas
- Future space systems need:
  - Extreme fault-tolerance capabilities
  - Security guarantees
  - Stronger compute power



### References

Krywko, J. 2019. *Space-grade CPUs: How do you send more computing power into space?* Ars Technica.

Brodkin, J. 2018. *SpaceX hits two milestones in plan for low-latency satellite broadband*. Ars Technica.

Roser, M., Ritchie, H. 2018. *Technological Progress*. Our World in Data.

Lovelly, T.M. 2017. *Comparative Analysis of Space-Grade Processors*. Dissertation at the University of Florida.

Cobham Gaisler. 2019. *GR740 Radiation Summary*. Test Report, Doc. No: GR740-RADS-1-1-1.

Powell, W. 2018. *High-Performance Spaceflight Computing (HPSC) Project Overview*. Presentation at the Radiation Hardened Electronics Technology Conference.

Cudmore, A. 2019. *High-Performance Spaceflight Computing (HPSC) Middleware Overview*. Presentation at NASA GSFC.

Marena, T. 2018. *Improving Space Systems Designs Using FPGAs with RISC-V Cores—A Microsemi Tech Focus*. SatMagazine article in April 2018.

