



Arizona State University, Georgia Tech, Harvard


Jeffrey S. Vetter, ORNL (PI); Alec Talin, Sandia NL; David Brooks, Harvard; Yu Cao, ASU; Sung Kyu Lim, Georgia Tech

ORNL is managed by UT-Battelle, LLC for the US Department of Energy

Sandia National Laboratories

ACSR Program Manager: Robinson Pino



NICE Workshop 30 Mar 2022

### **Overview**

- Many factors are driving improved design of future computer systems
  - Electronics scaling, power, domain specific computing, business models, etc.
  - Massive demand for next-generation HPC systems (e.g., ModSim, AI, Data, Omniverse)
- DOE and others have embraced codesign as a path forward
  - Enable integrated design and implementation of end-to-end solutions, then iterate!
  - Reimagining Codesign focuses on new computational paradigms, workloads, agility
- Abisko is a new codesign project with ambitious goals
  - Design Spiking Neural Network chiplet based on resistive switching materials that can be integrated with contemporary computer architectures
  - Develop portable software stack for neuromorphic algorithms across a range of platforms
  - Develop codesign framework for deep codesign into devices and materials
- Abisko is an interdisciplinary project including scientists from applications, algorithms, software, architectures, devices and circuits, and materials!



# **Basic Research Needs for Microelectronics (2018 Workshop)**

- Five Priority Research Directions
  - Flip the current paradigm
  - Revolutionize memory and data storage
  - Reimagine information flow unconstrained by interconnects
  - Redefine computing by leveraging unexploited physical phenomena
  - Reinvent the electricity grid through new materials, devices, and architectures




Report of the Office of Science Workshop on Basic Research Needs for Microelectronics October 23 – 25, 2018

### **Recent DOE Program on Microelectronics Codesign**

#### Department of Energy

#### DOE Announces \$54 Million for Microelectronics Research to Power Next-Generation Technologies

MARCH 24, 2021


National Labs Will Lead Transformation of Smart Devices, Clean Energy Technologies, and Semiconductor Manufacturing

WASHINGTON, D.C. – The U.S. Department of Energy (DOE) today announced up to \$54 million in new funding for the agency's National Laboratories to advance basic research in microelectronics. Microelectronics are a fundamental building block of modern devices such as laptops, smartphones, and home appliances, and hold the potential to power innovative solutions to challenges like the climate crisis and national security.

"Thanks to microelectronics, transformational technologies that used to swallow up entire buildings now fit in the palms of our hands—and it's time to take this work to the next level," said Secretary of Energy Jennifer M. Granholm. "Microelectronics are the key to the technologies of tomorrow, and with DOE's world-class scientists leading the charge, they can help bring our clean energy future to life and put America a step ahead of our economic competitors."

| Principal<br>Investigator   | Institution                                        | City, State        | Proposal Title                                                                                                                           |
|-----------------------------|----------------------------------------------------|--------------------|------------------------------------------------------------------------------------------------------------------------------------------|
| Guha, Supratik              | Argonne National<br>Laboratory (ANL)               | Lemont, IL         | Ultra-Dense, Near-Perfect, Atomic and Synaptic<br>Memory                                                                                 |
| Taylor, Valerie             | Argonne National<br>Laboratory (ANL)               | Lemont, IL         | Threadwork: A Transformative Co-Design<br>Approach to Materials and Computer<br>Architecture Research                                    |
| Braga, Davide               | Fermi National<br>Accelerator<br>Laboratory (FNAL) | Batavia, IL        | Hybrid Cryogenic Detector Architectures for<br>Sensing and Edge Computing enabled by new<br>Fabrication Processes                        |
| Garcia-Sciveres,<br>Maurice | Lawrence Berkeley<br>National Laboratory<br>(LBNL) | Berkeley, CA       | Co-Design and Integration of nano-sensors on CMOS                                                                                        |
| Ramesh,<br>Ramamoorthy      | Lawrence Berkeley<br>National Laboratory<br>(LBNL) | Berkeley, CA       | Codesign of Ultra-Low-Voltage Beyond CMOS<br>Microelectronics                                                                            |
| Haegel, Nancy               | National Renewable<br>Energy Laboratory<br>(NREL)  | Golden, CO         | Nitride materials and interfaces for radiation-<br>hard integrated neutron detection                                                     |
| Vetter, Jeffrey             | Oak Ridge National<br>Laboratory (ORNL)            | Oak Ridge, TN      | Abisko: Codesign in the Wild: Designing<br>Neuromorphic Hardware, Software, and<br>Applications Concurrently using AI-enabled<br>Methods |
| Graves, David               | Princeton Plasma<br>Physics Laboratory<br>(PPPL)   | Princeton, NJ      | Diamond co-doping for quantum sensor<br>applications                                                                                     |
| Aimone, James               | Sandia National<br>Laboratories (SNL)              | Albuquerque,<br>NM | COINFLIPS: CO-designed Improved Neural<br>Foundations Leveraging Inherent Physics<br>Stochasticity                                       |
| McIntyre, Paul              | SLAC National<br>Accelerator<br>Laboratory         | Menlo Park, CA     | Atoms-to-Systems Co-Design: Transforming Data<br>Flow to Accelerate Scientific Discovery                                                 |















- Talin, Albert Alec (Devices)
- Kevin Cao
- David Brooks
- Lim, Sung-Kyu
- Comish, John
- Catherine "Katie" Schuman (Algo)
- Date, Prasanna
- Tripathy, D
- Farah Fahim
- Ghawaly, James
- Tallada, Marc Gonzalez (Software)
- Gu-Yeon Wei
- Holland Hysmith
- Hornick II, Michael

- Huber, Joseph
- Ievlev, Anton

- Kulkarni, Shruti
- Frank Liu (Arch)
- Maksymovych, Petro (Materials)
- Marinella, Matthew
- Flynn, Michael
- Miniskar, Narasinga Rao
- Nhan Tran
- Ovchinnikova, Olga S.
- Sumpter, Bobby
- Aaron Young























#### **Abisko Microelectronics Codesign Overview**

- **Applications** (Motivation)
  - Transportation, CMS, Sensors
  - Interface to the layer below: Motifs, Composition
- **Algorithms**
  - ML: SLAYER, Whetstone, EONS, eProp, STDP
  - Non-ML: Graph algorithms, CSP
  - Simulators: NEST, Brian2
  - Interface to the layer below: API, Motifs
- **Software**
  - DSL and API for neuromorphic co-processing
  - Built on LLVM and MLIR
  - Portable across Abisko chiplet, GPUs, etc.
  - Interface to the layer below: ISA, IR
- **Architecture**
  - Design neuromorphic chiplet
  - RISC-V neuromorphic extensions
  - Heterogeneous integration with contemporary technologies
  - Interface to the layer below: Circuit scale up, Interconnects, PDK
- **Devices and Circuits**
  - Ion insertion (reversible doping) sets analog states
  - mRaman captures transition
  - Linear, non-linear switching
  - Will extend to 36x36 x-bar array
  - Electronic and other optical spectroscopies
  - Interface to the layer below: Compact models
- **Materials**
  - Non-equilibrium probes to few nm
  - Data-driven modeling
  - On-demand neuromorphism




# Pixel Detector: Proposed ML implementation

#### **Digital neuromorphic implementation**



Analog – Mixed Signal implementation using floating gates or memristive cross-bar arrays



- Ability to work in the latent space (downstream resources)
- Reconfigurability vs. pruning?
- On-chip inference vs. on-chip training?
- Lightweight models?
- Can this lead to self-calibrating detectors?

# NeuroRad Project at ORNL

- 1: Develop a neuromorphic-capable radiation anomaly detection algorithm and evaluate it on both simulated and real-world data.
- 2: Integrate the neuromorphic algorithm on the $\mu$Caspian board and integrate the board with a low-power radiation detection system.

Datasets:

| DOE Urban Search Challenge [1] | HFIR/REDC Static Monitors [2] |
|--------------------------------|-------------------------------|
| Single 2"x4"x16" NaI(Tl) detector moving through an urban street<br>9700 training runs, 15840 testing runs<br>3 locations of a source | Multiple static sensor "nodes", each with a single 2"x4"x16" NaI(Tl) detector, placed around the ORNL HFIR/REDC facility<br>&lt;200 source encounters |


# **Algorithms**

- Evaluate the best algorithms for specific problems
  - Include comparison against state-of-the-art (SOA) techniques
- Evaluate algorithmic options for each specific application
  - Input vector encoding
  - Evaluate different configurations with simulation
- Training, Inference, Online
- Interact with software and architecture teams
- Tools
  - EONS (Evolutionary optimization) for training
  - Deffe for Hyperparameter optimization

#### EONS

- Generates relatively sparse networks
- Evolves the structure of the network
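To make "evolves the structure of the network" concrete, here is a minimal sketch of a generic evolutionary loop over a sparse synapse list. It is not the EONS implementation: the toy fitness function, the constants, and the names `fitness`, `mutate`, and `evolve` are all assumptions chosen for illustration. The per-synapse penalty is one simple way to favor sparse networks.

```python
import random

NUM_NEURONS = 6          # illustrative network size (assumption)
TARGET = 2.5             # toy target for the summed synaptic weight (assumption)
SPARSITY_PENALTY = 0.05  # cost per synapse, which favors sparse networks

def fitness(net):
    """Toy objective: match a target total weight while using few synapses."""
    total = sum(net.values())
    return -(total - TARGET) ** 2 - SPARSITY_PENALTY * len(net)

def mutate(net, rng):
    """Return a copy of `net` with one structural or weight mutation."""
    child = dict(net)
    op = rng.choice(["add", "remove", "perturb"])
    if op == "add" or not child:
        # Structural mutation: add (or overwrite) a synapse src -> dst.
        src, dst = rng.randrange(NUM_NEURONS), rng.randrange(NUM_NEURONS)
        child[(src, dst)] = rng.uniform(-1.0, 1.0)
    elif op == "remove":
        # Structural mutation: delete an existing synapse.
        del child[rng.choice(sorted(child))]
    else:
        # Parameter mutation: nudge the weight of an existing synapse.
        key = rng.choice(sorted(child))
        child[key] += rng.gauss(0.0, 0.2)
    return child

def evolve(generations=200, population_size=20, seed=0):
    """Evolve synapse dictionaries; return the best one and the fitness history."""
    rng = random.Random(seed)
    population = [{} for _ in range(population_size)]  # start with empty networks
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Elitism: the top quarter survives unchanged into the next generation.
        elite = population[: population_size // 4]
        children = [mutate(rng.choice(elite), rng)
                    for _ in range(population_size - len(elite))]
        population = elite + children
    return max(population, key=fitness), history

if __name__ == "__main__":
    best, history = evolve()
    print(len(best), "synapses, fitness", round(fitness(best), 4))
```

Because the elite is carried over unchanged, the best fitness never decreases from one generation to the next. In a real setting the fitness would come from simulating the spiking network on the task.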





# **Neuromorphic Approach for Smart Pixel Detection**

- Dataset
  - Charge values from the LHC at 250 ps timesteps
- Goal
  - Data compression: send only particle track information ($x$, $y$, $\alpha$, $\beta$)
  - In-sensor pixel detection, hence the detection model needs to be small
- First approach
  - Apply neuromorphic algorithm EONS
  - Explore spike encoding of charge values
- Other approaches
  - Regression, Spiking convolution NN, unsupervised learning (STDP), Spike-based Object detection algorithms



### How to encode numbers on a neuromorphic computer?



Encoding 0.625, i.e. 0101

Encoding -3.5, i.e. 0111
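Both captions are consistent with a fixed-point reading that the slide does not spell out: 0.625 = 5/8 becomes `0101` when three of the four bits are fractional, and 3.5 = 7/2 becomes `0111` when one bit is fractional, with the sign carried separately. A minimal sketch under that assumption follows. The function names, the 4-bit width, and the separate sign flag are illustrative choices, not taken from the talk.

```python
def encode_fixed_point(value, total_bits=4, frac_bits=3):
    """Return (sign, bit string) for a fixed-point magnitude encoding.

    Assumption: the sign is carried separately from the magnitude bits
    (for example by a distinct group of neurons); the slide does not say so.
    """
    sign = -1 if value < 0 else 1
    scaled = abs(value) * (1 << frac_bits)  # shift the binary point
    magnitude = int(scaled)
    if magnitude != scaled:
        raise ValueError("value is not exactly representable with frac_bits")
    if magnitude >= (1 << total_bits):
        raise ValueError("value does not fit in total_bits")
    return sign, format(magnitude, f"0{total_bits}b")

def to_spikes(bits):
    """Map each bit to one input neuron: 1 means the neuron spikes, 0 means silent."""
    return [int(b) for b in bits]

if __name__ == "__main__":
    print(encode_fixed_point(0.625, total_bits=4, frac_bits=3))
    print(encode_fixed_point(-3.5, total_bits=4, frac_bits=1))
```

Each bit maps to one neuron, so the number of input neurons grows with the precision rather than with the magnitude of the value.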











### **The Virtual Neuron**

- Current encoding methods are inadequate
  - Rate-based encoding does not preserve addition
  - Binning loses information
- Virtual neuron uses binary encoding, preserves addition
- Takes two 2-bit numbers as inputs: x and y
- Returns a 3-bit number as output: z
- Implemented in NEST simulator

| $x_1$ | $x_0$ | $y_1$ | $y_0$ | $z_2$ | $z_1$ | $z_0$ | Sum   |
|-------|-------|-------|-------|-------|-------|-------|-------|
| 0     | 1                     | 0          | 1                     | 0          | 1                     | 0                     | 1+1=2 |
| 0     | 1                     | 1          | 1                     | 1          | 0                     | 0                     | 1+3=4 |
| 1     | 0                     | 1          | 1                     | 1          | 0                     | 1                     | 2+3=5 |
| 1     | 1                     | 1          | 1                     | 1          | 1                     | 0                     | 3+3=6 |
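The truth table can be reproduced with plain threshold logic. The sketch below is an assumption-laden illustration, not the NEST implementation of the virtual neuron: it models each neuron as a unit that spikes when its summed input reaches a threshold, and uses an inhibitory weight of -2 from each carry unit to cancel the inputs that produced the carry. The names `fires` and `add_2bit` are hypothetical.

```python
def fires(total_input, threshold):
    """Threshold unit: emit a spike (1) when the summed input reaches the threshold."""
    return 1 if total_input >= threshold else 0

def add_2bit(x1, x0, y1, y0):
    """Add two 2-bit numbers given as spike bits; return the 3-bit sum (z2, z1, z0)."""
    # Bit 0: the carry unit fires when both inputs spike.
    c0 = fires(x0 + y0, 2)
    # The sum unit is inhibited with weight -2 by the carry unit.
    z0 = fires(x0 + y0 - 2 * c0, 1)
    # Bit 1: same pattern, with the carry from bit 0 as a third input.
    s1 = x1 + y1 + c0
    c1 = fires(s1, 2)
    z1 = fires(s1 - 2 * c1, 1)
    # The most significant output bit is the final carry.
    z2 = c1
    return z2, z1, z0

if __name__ == "__main__":
    # The four rows shown in the table above.
    for x1, x0, y1, y0 in [(0, 1, 0, 1), (0, 1, 1, 1), (1, 0, 1, 1), (1, 1, 1, 1)]:
        print((x1, x0), (y1, y0), "->", add_2bit(x1, x0, y1, y0))
```

Since the output is again a binary spike pattern, sums can be chained into further additions, which is the sense in which binary encoding preserves addition.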




### Software

- Develop a holistic software stack for neuromorphic coprocessing in a heterogeneous architecture
  - Programming model
  - Backend code generation
  - Runtime
- Portable to GPU, FPGA, SoC, and Abisko chiplet simulator

- Based on successful experiences with quantum computing at ORNL (XACC, QCOR)
- Building embedded DSL (Domain Specific Language) with LLVM and MLIR





# XACC/QCOR Approach (as an analogue)



A program call to the `bell` function is in fact a call to another internal function, which instantiates a temporary instance of the new `QuantumKernel` sub-type.

### Architecture



- Design chiplet for SNN that can be easily integrated with contemporary technologies
  - Heterogeneous integration
  - Compatible with existing processes
- Use RISC-V interface to chiplet
- Simulate/emulate with existing simulators like Gem5 and Aladdin









# Abisko Architecture: Technology Landscape

- Advanced packaging is clearly one of the main technology drivers of semiconductor scaling in the near future



- The underlying technology is the main uncertainty for the neuromorphic accelerator

|         | SPIKING                                                 | NON-SPIKING                             |
|---------|---------------------------------------------------------|-----------------------------------------|
| DIGITAL | CMOS-friendly (Loihi); latency and<br>energy constraints | Traditional GPU/FPGA/NN<br>accelerators |
| ANALOG  | Interface to the rest of the world,<br>repeatability     | Interface, repeatability                |
# Abisko Architecture: Smart Pixel Driver

- CMS Experiment from Fermilab: Farah Fahim
  - 40 MHz collision rate (25 ns latency)
  - ~1B detector channels
- Active ongoing effort to design customized ASIC for data acquisition and compression
- Active ongoing effort to establish POR ML method on particle trajectory reconstruction





- On-going effort:
  - Establish baseline specs in computing intensity required using POR ML method
  - Explore techniques to better meet other constraints (quantization with fewer bits, spiking neuromorphic models)
  - Investigate and define the interface between ML accelerator cores and von Neumann cores


# **Devices and Circuits Overview**

#### Goals

- Harness the interplay between mobile defects (ions and vacancies) and electronic properties to realize functional elements for spiking and non-spiking analog neuromorphic networks
- Create and validate small network models; generate device and network data for co-design
- Understand and mitigate radiation-induced degradation mechanisms at the device and circuit level



### **Experimental TaOx ReRAM Conductance Distributions**



# ECRAM Synapse Based on Ru Prussian Blue Analog

#### Background

- Prussian blue analogs are highly stable and can be patterned using photolithography
- The open crystal structure is ideal for fast ion motion, but most PBAs are poor electrical conductors

#### Our work

- We fabricate Ru PBA ECRAM synapses that switch with Li<sup>+</sup> or H<sup>+</sup> ions.
- The synapses display linear, symmetrical characteristics with excellent endurance and good retention.
- Scaling experiments indicate $\Delta t_{sw} \sim 1$ ns and $\Delta E_{sw} \sim 0.7$ fJ for a 100 nm channel device.







# Kelvin Probe Force Microscopy (KPFM) on PB thin films



The principles of the measurement procedure in the KPFM technique using two-pass mode

M. Checa et al., APL, 2021





Next step: nanoscale ionic effects from dielectric spectroscopy

### **Devices and Circuits: Next steps and Recent Publications**

- Investigate RuPBA fabrication compatible with Si integration
- Relate device characteristics to SPM measurements
- Develop compact model for ECRAM
- Construct small networks using TaOx memristors and Ru PBA elements
- Test radiation hardness, effects on accuracy, noise, and retention

- X. Xu, E. J. Cho, L. Bekker, **A. A. Talin**, E. Lee, A. J. Pascall, M. A. Worsley, J. Zhou, C. C. Cook, J. D. Kuntz, S. Cho and C. A. Orme, *A Bioinspired Artificial Injury Response System Based on a Robust Polymer Memristor to Mimic a Sense of Pain, Sign of Injury and Healing*, Adv. Science 2200629, 2022
- Su-in Yi, A. A. Talin, M. J. Marinella, R. S. Williams, *Physical Compact Model for Three-Terminal SONOS Synaptic Circuit Element*, submitted

### Summary

- Abisko is a new microelectronics codesign project with the following goals
  - Design Spiking Neural Network chiplet based on resistive switching materials that can be integrated with contemporary computer architectures
  - Develop portable software stack for neuromorphic algorithms across a range of platforms
  - Develop codesign framework for deep codesign into devices and materials
- Truly interdisciplinary team working across the stack



- More information
  - vetter@computer.org
  - https://vetter.github.io
- We are hiring!
  - See jobs.ornl.gov
  - Send an email to me.

Thanks!





### **Bonus Material**

