Tuesday, 23 April 2024 | |||
08:00 | NICE 2024 -- Day I. Registration: please see the registration options and the link to the registration form. Venue
| ||
08:00 | NICE 2024 agenda. Times listed are Pacific Daylight Time (PDT). The agenda can also be shown with other time zones listed, relevant for: Africa, America, Australia, China, Europe, India, Japan | ||
08:00‑08:30 (30 min) | Registration and breakfast | ||
08:30‑08:45 (15+5 min) | Welcome and opening | Gert Cauwenberghs, Duygu Kuzum, Tajana Rosing (Institute for Neural Computation and Jacobs School of Engineering, UC San Diego) Miroslav Krstić (Associate Vice Chancellor for Research, UC San Diego) | |
08:50‑09:35 (45+5 min) | Organisers round | Members of the organising committees | |
09:40‑10:25 (45+5 min) | Keynote: Brains and AI show presentation.pdf (public accessible) show talk video | Terry Sejnowski (Salk Institute for Biological Studies) | |
10:30‑11:00 (30 min) | Break | ||
11:00‑11:25 (25+5 min) | Biological Dynamics Enabling Training of Binary Recurrent Networks show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10549632 Neuromorphic computing systems have been used to process spatiotemporal, video-like data, requiring recurrent networks, while minimizing power consumption through binary activation functions. However, previous work on binary activation networks has focused primarily on training feed-forward networks, owing to the difficulty of training recurrent binary networks. Spiking neural networks, however, have been trained successfully as recurrent networks, despite operating with binary communication. Intrigued by this discrepancy, we design a generalized leaky integrate-and-fire neuron that can be deconstructed into a binary activation unit, allowing us to investigate the minimal dynamics from a spiking network required for binary activation networks to be trainable. We find that a subthreshold integrative membrane potential is the only requirement for an otherwise standard binary activation unit to be trained in a recurrent network. Investigating the trained networks further, we find that these stateful binary networks learn a soft reset mechanism through their recurrent weights, allowing them to approximate the explicit reset of spiking networks. | William Chapman (Sandia National Laboratories) | |
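The minimal mechanism this abstract identifies can be sketched in a few lines. This is an illustrative toy, not the authors' code; `beta` and `theta` are assumed values:

```python
def stateful_binary_unit(inputs, beta=0.9, theta=1.0):
    """Binary activation unit with a subthreshold integrative membrane.

    Minimal sketch of the mechanism described in the talk: leaky
    integration is kept between steps, the output is binary, and there is
    no explicit reset (the paper reports that a soft reset can be learned
    through recurrent weights).
    """
    v, spikes = 0.0, []
    for x in inputs:
        v = beta * v + x                          # leaky subthreshold integration
        spikes.append(1.0 if v > theta else 0.0)  # binary output; no reset
    return spikes
```

With a constant subthreshold drive the unit stays silent until the integrated membrane crosses the threshold, which is the statefulness the work isolates.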
11:30‑11:40 (10+5 min) | Towards Convergence Intelligence – neuromorphic engineering and engineered organoids for neurotechnology show presentation.pdf (public accessible) show talk video Computing demands are vastly outpacing the improvements made through Moore's-law scaling; transistors are reaching their physical limits. Modern computing is on a trajectory to consume far too much energy while requiring ever more data. Yet the current paradigm of artificial intelligence and machine learning casts intelligence as a problem of representation learning and weak compute. These drawbacks have ignited interest in non-von Neumann architectures for compute and in new types of brain-informed/inspired learning. This talk will explore emerging synergies between neuromorphic engineering and engineered organoid intelligence, a nascent field that we refer to as convergence intelligence. Relevant federal funding opportunities and strategies will be presented, along with the presenter's personal outlook for applying convergence intelligence to brain/body interface technologies for improving health. | Dr. Grace Hwang (Program Director, National Institutes of Health (NIH)/National Institute of Neurological Disorders and Stroke (NINDS)/ U.S. BRAIN Initiative) | |
11:45‑12:10 (25+5 min) | Invited talk: Learning algorithms for spiking and physical neural networks show presentation.pdf (public accessible) show talk video Neuro-inspired hardware and spiking neural networks promise energy-efficient artificial intelligence for mobile and edge applications. However, training such systems requires dedicated lightweight learning algorithms that cope with specific architectural constraints and hardware imperfections and work online, on-chip, and with minimal or no supervision. I will give an overview of recent algorithmic advances for training spiking and physical neural networks toward these goals and sketch a path toward more efficient local learning rules. | Friedemann Zenke (Friedrich Miescher Institute for Biomedical Research & University of Basel) | |
12:15‑12:30 (15 min) | Poster teasers 1-min "this is my poster content" teasers
| ||
12:30‑14:00 (90 min) | Poster-lunch (posters + finger food)
Posters
Talk-Posters
| ||
14:00‑14:25 (25+5 min) | SQUAT: Stateful Quantization-Aware Training in Recurrent Spiking Neural Networks show talk video Publication DOI: 10.1109/NICE61972.2024.10549198 Parameter quantization is needed for deploying high-performance deep learning models on resource-limited hardware, enabling the use of low-precision integers for storage and computation. Spiking neural networks (SNNs) share the goal of enhancing deep learning efficiency, but adopt an 'event-driven' approach to reduce the power consumption of neural network inference. While extensive research has focused on weight quantization, quantization-aware training (QAT), and their application to SNNs, the precision reduction of state variables during training has been largely overlooked, potentially diminishing inference performance. We introduce two QAT schemes for stateful neurons: (i) a uniform quantization strategy, an established method for weight quantization, and (ii) threshold-centered quantization, which allocates exponentially more quantization levels near the firing threshold. Our results reveal that increasing the density of quantization levels around the firing threshold improves accuracy across several benchmark datasets. We provide an ablation analysis of the effects of weight and state quantization, both individually and combined, and how they impact models. Our comprehensive empirical evaluation includes full-precision, 8-bit, 4-bit, and 2-bit quantized SNNs, using QAT, stateful QAT (SQUAT), and post-training quantization methods. The findings indicate that the combination of QAT and SQUAT enhances performance the most, but given the choice of one or the other, QAT improves performance by the larger degree. These trends are consistent across all datasets. | Sreyes Venkatesh (UC Santa Cruz) | |
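The threshold-centered idea can be illustrated with a small level-construction sketch. This is an assumed construction for intuition only, not the paper's exact scheme; `theta`, `v_rng`, and `bits` are illustrative:

```python
def threshold_centered_levels(theta=1.0, v_rng=2.0, bits=4):
    """Build quantization levels whose spacing grows exponentially with
    distance from the firing threshold `theta`, so the membrane state is
    represented most finely exactly where the spike/no-spike decision is
    made. Illustrative sketch under assumed parameters."""
    half = 2 ** bits // 2
    # offsets 0 ... v_rng with exponentially increasing gaps
    offsets = [v_rng * (2 ** i - 1) / (2 ** half - 1) for i in range(half + 1)]
    # mirror the offsets around the threshold
    return sorted({theta - o for o in offsets} | {theta + o for o in offsets})
```

The gap between levels adjacent to the threshold is far smaller than the gap at the extremes, matching the paper's observation that accuracy benefits from density near the firing threshold.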
14:30‑14:40 (10+5 min) | Expressive Dendrites in Spiking Networks show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10548485 As deep learning networks increase in size and performance, so do the computational costs of running these large-scale models. Dendrites offer powerful nonlinear "on-the-wire" computational capabilities, increasing the expressivity of point neurons while preserving many of the advantages of SNNs. This talk will discuss our work implementing a library that adds dendritic computation to SNNs within the PyTorch framework, enabling complex deep learning networks that retain the low-power advantages of SNNs. Our library leverages a dendrite CMOS hardware model to inform the software model, which enables nonlinear computation integrated with snnTorch in the context of current state-of-the-art deep learning methods as well as energy-efficient neuromorphic hardware. | Mark Plagge (Sandia National Laboratories) | |
14:45‑15:10 (25+5 min) | Text-to-Events: Synthetic Event Camera Streams from Conditional Text Input show presentation.pdf (public accessible) video (restricted access) Publication DOI: 10.1109/NICE61972.2024.10549580 Event cameras have proven advantageous for tasks that require low-latency and sparse sensor responses. However, the development of deep network algorithms using this sensor has been slow because of the lack of large labelled event camera datasets for network training. This paper describes a method for creating new labelled event camera datasets by using a text-to-X model, where X is one or multiple modalities that allow synthetic data generation. This model is the first reported text-to-events model that produces synthetic event frames directly from text prompts. It uses an autoencoder trained to produce sparse event frames representative of event camera outputs. By combining the autoencoder with a diffusion model architecture, the new text-to-events model is able to generate smooth synthetic event streams from moving objects. By training the autoencoder on a camera event dataset of diverse scenes and then the diffusion model on a human gesture event dataset, we demonstrate that the model can generate realistic event sequences of human gestures prompted by different text statements. The classification accuracy of the generated sequences ranges from 42% to 92%, depending on the gesture group, demonstrating the capability of this method for synthesizing smooth event datasets. | Shih-Chii Liu (Institute of Neuroinformatics, University of Zurich and ETH Zurich) | |
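For readers unfamiliar with the target representation, an event frame can be sketched as a signed map of threshold-crossing log-intensity changes. This is a common textbook simplification of event-camera output, assumed here for illustration, not the paper's model:

```python
import numpy as np

def to_event_frame(prev, curr, thresh=0.1):
    """Toy event-frame sketch: +1 (ON) where log intensity rises beyond a
    contrast threshold, -1 (OFF) where it falls, 0 elsewhere. An assumed
    simplification of real event-camera behaviour."""
    d = np.log(curr + 1e-6) - np.log(prev + 1e-6)   # log-intensity change
    return (d > thresh).astype(np.int8) - (d < -thresh).astype(np.int8)
```

Frames of this sparse, signed form are what the autoencoder in the paper is trained to reproduce.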
15:15‑15:45 (30 min) | Break | ||
15:45‑16:10 (25+5 min) | Embracing the Hairball: An Investigation of Recurrence in Spiking Neural Networks for Control show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10548512 Recurrent, sparse spiking neural networks have been explored in contexts such as reservoir computing and winner-take-all networks. However, we believe there is the opportunity to leverage recurrence in spiking neural networks for other tasks, particularly for control. In this work, we show that evolved recurrent neural networks perform significantly better than feed-forward counterparts. We give two examples of the types of recurrent networks that are evolved and demonstrate that they are highly recurrent and unlike traditional, more structured recurrent neural networks that are used in deep learning literature. | Katie Schuman (University of Tennessee) | |
16:15‑17:00 (45 min) | Open mic / discussion -- day I speakers | ||
17:00‑17:30 (30 min) | Misha Mahowald Prizes show talk video
| Tobi Delbruck (Inst. for Neuroinformatics, UZH-ETH Zurich) | |
17:30‑18:00 (30 min) | Misha Mahowald Prizes show talk video
| Carver Mead (California Institute of Technology) | |
18:00‑19:00 (60 min) | Reception and celebration of 35+ years of neuromorphic engineering |
Wednesday, 24 April 2024 | |||
08:00 | NICE 2024 - Day II | ||
08:00‑08:30 (30 min) | Breakfast | ||
08:30‑09:15 (45+5 min) | Keynote: Hearing with Silicon Cochleas | Shih-Chii Liu (Institute of Neuroinformatics, University of Zurich and ETH Zurich) | |
09:20‑09:30 (10+5 min) | Explaining Neural Spike Activity for Simulated Bio-plausible Network through Deep Sequence Learning show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10549689 With significant improvements in large simulations of brain models, there is a growing need for tools to rapidly analyze and interpret the simulation results. In this work, we explore the potential of sequential deep learning models to understand and explain the network dynamics among neurons extracted from a large-scale neural simulation in STACS (Simulation Tool for Asynchronous Cortical Stream). Our method employs a representative neuroscience model that abstracts cortical dynamics with a reservoir of randomly connected spiking neurons with a low, stable spike firing rate throughout the simulation. We subsequently analyze the spike dynamics of the simulated spiking neural network through an autoencoder model and an attention-based mechanism. | Shruti R Kulkarni (Oak Ridge National Laboratory) | |
09:35‑10:00 (25+5 min) | Invited talk: Hardware Accelerators for Brain-Inspired Computing show talk video NorthPole is a neural inference accelerator architecture and chip. Inspired by the brain's efficiency and performance, the NorthPole Architecture uses low-precision weights and activations as well as local memory within a parallel, distributed core array, linked by networks-on-chip supporting spatial computing. Its memory access, computation, and communication are orchestrated by prescheduled, distributed control. The NorthPole Inference Chip includes 256 cores and 224 MB of distributed SRAM, using 22 billion transistors in 795 square millimeters of silicon in a 12nm process. | John Arthur (IBM Research) | |
10:05‑10:30 (25+5 min) | Hardware-aware Few-shot Learning on a Memristor-based Small-world Architecture show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10548824 Learning from few examples (few-shot learning) is one of the hallmarks of mammalian intelligence. In the presented work, we demonstrate, using simulations, on-chip few-shot learning on a recently proposed Spiking Neural Network (SNN) hardware architecture, the Mosaic. The Mosaic exhibits a small-world property similar to that of the mammalian cortex by virtue of its physical layout. By taking advantage of in-memory computing and routing along with local connectivity, the Mosaic is a highly efficient solution for routing information, which is the main source of energy consumption in neural network accelerators, and specifically in neuromorphic hardware. We propose to meta-learn a small-world SNN resembling the Mosaic architecture for keyword-spotting tasks using the Model-Agnostic Meta-Learning (MAML) algorithm for adaptation on the edge, and report the final accuracy on the Spiking Heidelberg Digits dataset. Using simulations of the hardware environment, we demonstrate 49.09 ± 8.17% accuracy on five unseen classes with 5-shot data and a single gradient update. Increasing this to 10 gradient steps, we achieve an accuracy of 67.97 ± 1.99% in the same configuration. Our results show the applicability of MAML to analog substrates on the edge and highlight several factors that impact the learning performance of such meta-learning models on neuromorphic substrates. | Karthik Charan Raghunathan (Institute of Neuroinformatics, ETH and UZH Zurich) | |
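The MAML inner/outer-loop structure the abstract relies on can be sketched on a deliberately tiny regression problem. This is an assumed toy setup (a one-parameter linear model), not the paper's SNN or dataset:

```python
import numpy as np

def grad(w, x, y):
    # gradient of mean squared error for the 1-parameter model y ≈ w * x
    return (2 * x * (x * w - y)).mean()

def maml_step(w, tasks, inner_lr=0.1, outer_lr=0.05):
    """One first-order MAML meta-update: adapt per task with a few inner
    gradient steps, then update the meta-parameters using the loss of the
    adapted model. Sketch only; the actual work meta-learns a small-world
    SNN for keyword spotting."""
    meta_grad = 0.0
    for (x_tr, y_tr), (x_te, y_te) in tasks:
        w_adapted = w - inner_lr * grad(w, x_tr, y_tr)  # inner loop: task adaptation
        meta_grad += grad(w_adapted, x_te, y_te)        # outer loss at adapted weights
    return w - outer_lr * meta_grad / len(tasks)

x = np.array([1.0, 2.0, 3.0])
tasks = [((x, 2 * x), (x, 2 * x))]   # a single toy task whose optimum is w = 2
w = 0.0
for _ in range(300):
    w = maml_step(w, tasks)
```

The 5-shot / 10-step results in the abstract correspond to varying the number of inner-loop updates performed at deployment time.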
10:35‑11:05 (30 min) | Break | ||
11:05‑11:30 (25+5 min) | Spiking Physics-Informed Neural Networks on Loihi-2 Publication DOI: 10.1109/NICE61972.2024.10548180 | Brad Theilman (Sandia National Laboratories) | |
11:35‑11:45 (10+5 min) | jaxsnn: Event-driven Gradient Estimation for Analog Neuromorphic Hardware show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10548709 'jaxsnn' is a novel library based on JAX. | Eric Müller (BrainScaleS, Heidelberg University) | |
11:50‑12:00 (10+5 min) | Late-breaking-news: Distributed Neural State Machines on Loihi 2 show presentation.pdf (public accessible) Programming recurrent spiking neural networks (RSNNs) to robustly perform multi-timescale tasks is difficult due to the conflicting demands of the neuron activity to be both dynamic and stable. Conventional gradient descent methods are computationally expensive and do not necessarily yield robust results when deployed to neuromorphic hardware. To address these issues, we show how using high-dimensional random vectors as the smallest units of representation can be leveraged to embed robust multi-timescale dynamics into RSNNs. We embed finite state machines into the RSNN dynamics via an attractor-based weight construction scheme inspired by hyperdimensional computing, wherein the current state is represented by a high-dimensional distributed pattern of neural activity. We embed multiple large state machines into RSNNs on Intel's neuromorphic research chip Loihi 2, demonstrating the effectiveness of this approach for programming neuromorphic hardware. | Alpha Renner (Forschungszentrum Jülich, Germany) | |
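The attractor-based, hyperdimensional construction this talk describes can be illustrated with a Hopfield-style clean-up sketch. The dimensionality and two-state machine here are assumptions for illustration, not the Loihi 2 implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 2048   # illustrative hypervector dimensionality (an assumption)

# Each machine state is a random bipolar hypervector; an outer-product
# weight matrix makes the stored patterns attractors of the dynamics.
states = {name: rng.choice([-1.0, 1.0], size=D) for name in ("A", "B")}
W = sum(np.outer(v, v) for v in states.values()) / D

noisy = states["A"].copy()
noisy[:400] *= -1                 # corrupt ~20% of the state pattern
cleaned = np.sign(W @ noisy)      # one attractor (clean-up) update
```

Because the states are high-dimensional random vectors, a single update restores the distributed activity pattern almost exactly, which is the robustness property the abstract leverages for multi-timescale dynamics.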
12:05‑12:20 (15 min) | Poster teasers 1-min "this is my poster content" teasers
| ||
12:20‑12:30 (10 min) | Group photo | ||
12:30‑14:00 (90 min) | Poster-lunch (posters + finger food)
Posters
Talk-Posters
| ||
14:00‑14:10 (10+5 min) | Quantized Context Based LIF Neurons for Recurrent Spiking Neural Networks in 45nm show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10548306 In this study, we propose the first hardware implementation of a context-based recurrent spiking neural network (RSNN), emphasizing the integration of dual information streams within neocortical pyramidal neurons, specifically the Context-Dependent Leaky Integrate-and-Fire (CLIF) neuron model, an essential element of the RSNN. We present a quantized version of the CLIF neuron (qCLIF), developed through a hardware-software co-design approach that utilizes the sparse activity of RSNNs. Implemented (in simulation) in a 45 nm technology node, the qCLIF is compact (900 µm²) and achieves a high accuracy of 90% despite 8-bit quantization. Our work includes a detailed analysis of a network comprising 10 qCLIF neurons with 8-bit computation, and extends to a scalable network of 200 neurons supporting 82k synapses within a 1.86 mm² footprint. | Sai Sukruth Bezugam (Department of Electrical and Computer Engineering, University of California Santa Barbara) | |
14:15‑14:40 (25+5 min) | Compositional Factorization of Visual Scenes with Convolutional Sparse Coding and Resonator Networks show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10549719 We propose a system for visual scene analysis and recognition based on encoding the sparse, latent feature-representation of an image into a high-dimensional vector that is subsequently factorized to parse scene content. The sparse feature representation is learned from image statistics via convolutional sparse coding, while scene parsing is performed by a resonator network. The integration of sparse coding with a resonator network increases the capacity of distributed representations and reduces collisions in the combinatorial search space of factorization. We find that in this context the resonator network is capable of fast and accurate vector factorization, and we develop a confidence-based metric that assists in tracking convergence. | Chris Kymn (UC Berkeley), Sonia Mazelet (UC Berkeley) | |
14:45‑14:55 (10+5 min) | Towards Chip-in-the-loop Spiking Neural Network Training via Metropolis-Hastings Sampling show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10548355 This talk studies the use of Metropolis-Hastings sampling for training Spiking Neural Network (SNN) hardware subject to strong, unknown non-idealities, and compares the proposed approach to the common use of the backpropagation-of-error (backprop) algorithm with surrogate gradients, widely used to train SNNs in the literature. Simulations are conducted in a chip-in-the-loop training context, where an SNN subject to unknown distortion must be trained to detect cancer from measurements, within a biomedical application context. Our results show that the proposed approach strongly outperforms backprop in terms of accuracy when subject to strong hardware non-idealities. Furthermore, our results also show that the proposed approach outperforms backprop in terms of SNN generalization, needing over 10× less training data to achieve effective accuracy. These findings make the proposed training approach well suited for SNN implementations in analog subthreshold circuits and other emerging technologies where unknown hardware non-idealities can jeopardize backprop. | Ali Safa (imec and KU Leuven) | |
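The appeal of Metropolis-Hastings here is that it only needs loss evaluations, which a non-ideal chip can provide, and no gradients. A minimal gradient-free sketch, with a toy loss standing in for the hardware measurement and all hyperparameters assumed:

```python
import math
import random

random.seed(0)

def mh_train(loss_fn, w0, steps=300, sigma=0.2, temp=0.05):
    """Metropolis-Hastings weight search: propose Gaussian perturbations,
    always accept improvements, and accept worse moves with Boltzmann
    probability. `loss_fn` stands in for a loss measured on non-ideal SNN
    hardware in the chip-in-the-loop setting. Illustrative sketch only."""
    w, loss = list(w0), loss_fn(w0)
    best_w, best_loss = w, loss
    for _ in range(steps):
        cand = [wi + random.gauss(0.0, sigma) for wi in w]  # propose a perturbation
        cand_loss = loss_fn(cand)
        if cand_loss < loss or random.random() < math.exp((loss - cand_loss) / temp):
            w, loss = cand, cand_loss
            if loss < best_loss:
                best_w, best_loss = w, loss
    return best_w, best_loss

# toy stand-in for the hardware loss: squared distance to an unknown optimum
target = [0.3, -0.7]
w, loss = mh_train(lambda ws: sum((a - b) ** 2 for a, b in zip(ws, target)), [0.0, 0.0])
```

Because acceptance depends only on measured losses, hardware non-idealities are absorbed into the objective rather than corrupting a gradient estimate.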
15:00‑15:10 (10+5 min) | Leveraging Sparsity of SRNNs for Reconfigurable and Resource-Efficient Network-on-Chip show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10548940 Establishing fully-reconfigurable connectivity in a reconfigurable Spiking Recurrent Neural Network (SRNN) hardware poses a significant challenge, particularly for resource-constrained implementations. While Address Event Representation (AER) packet-based spike communication methods are popular for large-scale SRNNs, they become inefficient for smaller-scale SRNNs due to the additional overhead of packet handling circuits, especially in sparse networks. To address this, we introduce SpiCS-Net (pronounced "spikes-net"), a Spike-routing Circuit-Switched Network based on Clos topology tailored for sparse SRNNs. SpiCS-Net capitalizes on the varying degrees of sparseness inherent in SRNNs to enable resource-efficient implementations. Additionally, we propose the concept of Concurrent Connectivity, which determines the total density of connections that can be concurrently established via the Network-on-Chip architecture. This feature allows the proposed NoC architecture to be tuned to meet specific synaptic density requirements, resulting in significant savings in hardware resources, area, and power consumption. | Manu Rathore (TENNLab - Neuromorphic Architectures, Learning, Applications (The University of Tennessee, Knoxville)) | |
15:15‑15:45 (30 min) | Break | ||
15:45‑16:10 (25+5 min) | Invited talk: Towards fractional order dynamics neuromorphic elements show presentation.pdf (public accessible) show talk video There is an increasing need to implement neuromorphic systems that are both energetically and computationally efficient by using memelements, electric elements with memory. A feature not widely incorporated in neuromorphic systems is history-dependent action potential time adaptation, which is widely seen in real cells. Our theoretical work shows that power-law, history-dependent spike time adaptation can be modeled with fractional order differential equations. In the physical context of neuromorphic systems, a fractional order derivative corresponds to a fractional order capacitor, an electric element that has been elusive to fabricate. Here, we show that fractional order spiking neurons can be implemented using super-capacitors. The super-capacitors have fractional order derivative and memcapacitive properties. I will describe a leaky integrate-and-fire circuit and a Hodgkin-Huxley circuit. Both circuits show power-law spike time adaptation and optimal coding properties. The activity of the Hodgkin-Huxley circuit replicated recordings from neurons in the weakly electric fish, which perform a fractional order differentiation of their sensory input. We also show that the circuit can be used to predict neuronal responses to long-lasting stimuli. Our combined work shows that fractional order memcapacitors can be used to implement computationally and energetically efficient neuromorphic devices. | Fidel Santamaria (Department of Neuroscience, Developmental and Regenerative Biology, University of Texas at San Antonio) | |
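A fractional-order LIF neuron can be simulated numerically with the Grünwald-Letnikov approximation of the fractional derivative, one standard way to obtain the power-law memory described in this talk. The model and parameter values below are illustrative assumptions, not the talk's circuits:

```python
def fractional_lif(inputs, alpha=0.8, theta=1.0):
    """Fractional-order LIF sketch: discretize D^alpha v = -v + I (dt = 1)
    with Grünwald-Letnikov coefficients, so the membrane update depends on
    the whole voltage history through a power-law kernel."""
    # Grünwald-Letnikov coefficients: c_0 = 1, c_k = c_{k-1} * (1 - (1+alpha)/k)
    c = [1.0]
    for k in range(1, len(inputs) + 1):
        c.append(c[-1] * (1 - (1 + alpha) / k))
    vs, spikes = [0.0], []
    for x in inputs:
        # power-law memory term: weighted sum over the entire voltage history
        memory = sum(c[k] * vs[-k] for k in range(1, len(vs) + 1))
        v = (x - vs[-1]) - memory
        if v > theta:
            spikes.append(1)
            v -= theta          # soft reset keeps the adaptation history intact
        else:
            spikes.append(0)
        vs.append(v)
    return spikes
```

Setting alpha = 1 recovers ordinary first-order leaky dynamics; alpha < 1 introduces the long, power-law-weighted history that produces spike-time adaptation.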
16:15‑16:25 (10+5 min) | Compute-in-Memory with 6T-RRAM Memristive Circuit for Next-Gen Neuromorphic Hardware Publication DOI: 10.1109/NICE61972.2024.10548860 Compute-in-memory (CIM) is primarily built into neuromorphic hardware to radically subvert the modern computing bottleneck for a range of applications, in particular for today's artificial intelligence (AI) workloads. Emerging computational memory technologies, such as resistive random-access memory (RRAM), offer clear advantages for CIM by performing tasks in place in the memory itself, providing significant improvements in latency and energy efficiency. In this article, we showcase an innovative memristive circuit in a 6-transistor-1-RRAM (6T1R) configuration to enable faster yet more efficient CIM for AI applications. In practice, our 6T1R cell leverages a series of pulse-width-modulated (PWM) pulses as computing variables for energy efficiency. To support AI models with simplicity and robustness, each 6T1R cell can be programmed to encode either positive or negative weight values by measuring the direction of current through the RRAM, in addition to multi-bit computation capability. For computational accuracy, faulty RRAM cells can be quarantined from the network regardless of their resistance values by setting the particular cell into the high-impedance state. A proof-of-concept validation was conducted using a custom 65 nm CMOS/RRAM technology node. The parallel inter-tile communication achieves up to 1.6 tera-operations per second (TOPS) in data throughput, with a computational efficiency of 1.07 TOPS/W. | Kang Jun Bai (Air Force Research Laboratory) | |
16:30‑17:30 (60 min) | Open mic / discussion - day II speakers | ||
17:30 | End of the second day |
Thursday, 25 April 2024 | |||
08:00 | Day III | ||
08:00‑08:30 (30 min) | Breakfast | ||
08:30‑09:15 (45+5 min) | Keynote: NeuroBench: A Framework for Benchmarking Neuromorphic Computing Algorithms and Systems show presentation.pdf (public accessible) show talk video Benchmarks are necessary to accurately measure technological advancements, compare performance with conventional methods, and identify promising future research directions. In this talk, I will present NeuroBench: a benchmark framework for neuromorphic computing algorithms and systems. NeuroBench is a collaboratively designed effort from an open community of nearly 100 co-authors across over 50 institutions in industry and academia, aiming to provide a representative structure for standardizing the evaluation of neuromorphic approaches. The NeuroBench framework introduces a common set of tools and a systematic methodology for inclusive benchmark measurement, delivering an objective reference framework for quantifying neuromorphic approaches in both hardware-independent and hardware-dependent settings. NeuroBench will continually expand its benchmarks and features to foster and track the progress made by the research community. | Jason Yik (School of Engineering and Applied Sciences, Harvard) | |
09:20 | IEEE EMBS Forum on Intelligent Healthcare | ||
09:20‑09:25 (5 min) | Introduction to the IEEE EMBS Forum on Intelligent Healthcare show talk video | Gert Cauwenberghs (UC San Diego) | |
09:25‑09:35 (10 min) | AI for healthcare show talk video | Margot Wagner (Salk Institute) | |
09:35‑09:45 (10 min) | Boxes are pre-digital -- Discovering physiological diversity with AI show talk video | Benjamin Smarr (UC San Diego) | |
09:45‑09:55 (10 min) | Organoid intelligence show talk video | Francesca Puppo (UC San Diego) | |
09:55‑10:05 (10 min) | Health & AI: Flexible Engineered Learning Systems show talk video | Dr. Grace Hwang (NIH / NINDS) | |
10:05‑10:15 (10 min) | EMBS Strategic Plan: from Vision to Implementation show talk video | Metin Akay (IEEE EMBS) | |
10:20‑10:50 (30 min) | Break | ||
10:50‑11:00 (10+5 min) | One-Shot Auditory Blind Source Separation via Local Learning in a Neuromorphic Network show talk video Unsupervised, monosource blind source separation (BSS) continues to be a challenging problem for computational algorithms and machine learning approaches. In contrast, even developing nervous systems are capable of one-shot recognition of auditory signals, including in noisy environments. We present Density Networks (DNs), a novel class of recurrent neural network, heavily inspired by the functioning of the auditory system, that demonstrates one-shot BSS of auditory signals. In addition, DNs improved sound-separation quality by at least 40% over standard methods in unsupervised auditory BSS. | Patrick Abbs (Cambrya, LLC) | |
11:05‑11:30 (25+5 min) | Invited talk: TENN: A highly efficient transformer replacement for edge and event processing. show presentation.pdf (public accessible) | M Anthony Lewis (BrainChip) | |
11:35‑11:45 (10+5 min) | A Recurrent Dynamic Model for Efficient Bayesian Optimization show talk video Publication DOI: 10.1109/NICE61972.2024.10548051 Bayesian optimization is an important black-box optimization method used in active learning. An implementation of Bayesian optimization using vector embeddings from Vector Symbolic Architectures – a family of cognitive modelling tools – was proposed as an efficient, neuromorphic approach to solving these implementation problems. However, a clear path to neural implementation has not been explicated. We will present an implementation of this algorithm expressed as recurrent dynamics that can be easily translated to neural populations, as implemented in Intel's Lava programming framework for neuromorphic computers. We compare the performance of the algorithm using different resolution representations of real-valued data and demonstrate that the ability to find optima is preserved. This work provides a path forward to the implementation of Bayesian optimization on low-power neuromorphic computers and deploying active learning techniques in resource-constrained computing applications. | P. Michael Furlong (Centre for Theoretical Neuroscience/Systems Design Engineering, University of Waterloo) | |
11:50‑12:15 (25+5 min) | Invited talk: Strategic & Large-Scale Considerations of Neuromorphic Computing show presentation.pdf (public accessible) show talk video Algorithmic advances are regularly showcasing exciting capabilities of neural networks. Computer architectures are offering a range of optimizations tailored towards neural network computations. In neuromorphic computing, algorithms and architectures are entwined such that design choices are not independent. Given the interdependence of neural algorithm and architecture interplay, I will first discuss strategic considerations for neuromorphic innovation from a game theoretic perspective. Then I will introduce the world class neuromorphic system developed under Sandia Labs and Intel’s large scale neuromorphic partnership. And finally, I will provide initial performance characterization of the large-scale system. | Craig Vineyard (Sandia National Laboratory) | |
12:20‑12:30 (10+5 min) | GPU-RANC: A CUDA Accelerated Simulation Framework for Neuromorphic Architectures show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10548776 Open-source simulation tools play a crucial role for neuromorphic application engineers and hardware architects to investigate performance bottlenecks and explore design optimizations before committing to silicon. Reconfigurable Architecture for Neuromorphic Computing (RANC) is one such tool that offers the ability to execute pre-trained Spiking Neural Network (SNN) models within a unified ecosystem through both software-based simulation and FPGA-based emulation. RANC has been utilized by the community, with its flexible and highly parameterized design, to study implementation bottlenecks, tune architectural parameters, or modify neuron behavior based on application insights, and to study the trade space of hardware performance and network accuracy. In designing architectures for neuromorphic computing, there are a large number of configuration parameters, such as the number and precision of weights per neuron, neuron and axon counts per core, network topology, and neuron behavior. To accelerate such studies and provide users with streamlined, productive design space exploration, in this talk we introduce a GPU-based implementation of the RANC simulator. We summarize our parallelization approach and quantify the speedup gains achieved with GPU-based tick-accurate simulations across various use cases. | Joshua Mack (Department of Electrical & Computer Engineering, University of Arizona) | |
12:35‑13:35 (60 min) | Lunch | ||
13:35‑13:45 (10+5 min) | Late-breaking-news: Neuromodulated mixture of experts: A prefrontal cortex inspired architecture for lifelong learning show presentation.pdf (public accessible) show talk video Lifelong learning is the ability of a system to learn and retain knowledge of multiple tasks without catastrophic forgetting, switch between them seamlessly, and use old knowledge to facilitate more efficient learning of new tasks. Despite recent advances in artificial intelligence, this problem still hasn’t been solved efficiently, with most solutions focused on network expansion (Rusu et al. 2016, Vecoven et al. 2020) – a costly mechanism with limited connection to biology. In this study, we have devised a novel modular deep learning network architecture called Neuromodulated Mixture of Experts (NeMoE), inspired by the prefrontal cortex (PFC), that utilizes the distributed learning framework of the classical Mixture of Experts model and the context-dependent signal-to-noise ratio mediating capabilities of neuromodulators like dopamine (Vander Weele et al. 2018). To test the model, we developed a novel multi-context “seasonal” foraging task where mice are presented with different environmental contexts indicated by ambient lighting (green/UV) – each context is paired with a high/low cost (shock) and a high/low reward (Ensure). We found that mice learn context-specific behavioral strategies and that context is predictable from behavioral features. Further, we used deep reinforcement learning simulations to train NeMoE on the task and found that the model converges to the same policy as real mice for each context – displaying bio-mimicking capabilities. Lastly, we recorded neural activity from the medial prefrontal cortex (mPFC) while the mice were performing the task. We were able to identify ensembles of neurons that have context-specific signal profiles and found that context is decodable from neural activity. Together these findings suggest that neuromodulation-driven flexibility can enable models to perform lifelong learning and that such “experts” can be found in mPFC ensembles. | Clara N Yi (Salk Institute for Biological Studies) | |
13:50‑14:15 (25+5 min) | Energy Efficient Implementation of MVM Operations Using Filament-free Bulk RRAM Arrays show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10549369 This work presents a hardware implementation of a spiking neural network model using trilayer bulk RRAM crossbar arrays for an autonomous navigation/racing task. We demonstrate multi-level bulk switching operation in the MΩ regime with trilayer bulk RRAM devices (Al2O3/TiO2/TiO2−x) without needing a compliance current. We present a neuromorphic compute-in-memory architecture based on trilayer bulk RRAM crossbars, combining energy-efficient voltage sensing at the bitlines with the differential encoding of weights to experimentally implement high-accuracy matrix-vector multiplication (MVM). Our results suggest that use of our bulk RRAM can reduce energy consumption by more than 100x compared to conventional filamentary RRAM technologies. This work paves the way for neuromorphic computing at the edge under strict area and energy constraints. | Dr. Ashwani Kumar (Electrical and Computer Engineering, UC San Diego) | |
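The differential encoding of weights mentioned in the abstract can be sketched numerically. This is a minimal illustration under a simple clipped-conductance assumption, not the actual crossbar circuit: a signed weight cannot be stored in a single non-negative conductance, so each weight is split across a positive/negative column pair and the two bitline currents are subtracted.

```python
import numpy as np

def differential_mvm(weights, x, g_max):
    """MVM with differential weight encoding, crossbar-style.

    weights : signed weight matrix, assumed pre-scaled into [-g_max, g_max]
    x       : input vector (applied voltages)
    g_max   : maximum device conductance (illustrative parameter)

    Each weight w is represented as G_plus - G_minus with both
    conductances non-negative, so y = G_plus @ x - G_minus @ x.
    """
    g_plus = np.clip(weights, 0, g_max)    # positive part of each weight
    g_minus = np.clip(-weights, 0, g_max)  # magnitude of the negative part
    return g_plus @ x - g_minus @ x
```

The sketch recovers the exact signed MVM; in hardware, the subtraction happens in the sensing circuitry, and device non-idealities would perturb both terms.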
14:20‑14:30 (10+5 min) | TickTockTokens: a minimal building block for event-driven systems show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10549408 | Johannes Leugering (Institute for Neural Computation, Bioengineering Dept., UC San Diego) | |
14:35‑14:45 (10+5 min) | Late-breaking-news: Brain-Inspired Hypervector Processing at the Edge of Large Language Models show presentation.pdf (public accessible) show talk video | Goktug Ayar (University of Louisiana at Lafayette) | |
14:50‑15:00 (10+5 min) | Spiking Neural Network-based Flight Controller show presentation.pdf (public accessible) show talk video Publication DOI: 10.1109/NICE61972.2024.10548609 Spiking Neural Network (SNN) control systems have demonstrated advantages over conventional Artificial Neural Networks in energy efficiency and in learning from scarce data. In this study, we introduce an SNN-based controller designed within the Neural Engineering Framework (NEF) for the stabilization and trajectory tracking of a quadrotor Unmanned Aircraft System (UAS). The controller's effectiveness is evaluated using a Blade Element Theory simulator, successfully achieving tasks such as take-off, climbing, and circular trajectory tracking. Our results reveal precise control and stability, providing insights into the practical viability and potential of this neuromorphic approach in real-world UAS applications. | Diego Chavez Arana (New Mexico State University) | |
15:05‑15:35 (30 min) | Coffee break | ||
15:35‑16:00 (25+5 min) | PETNet – Coincident Particle Event Detection using Spiking Neural Networks show talk video Publication DOI: 10.1109/NICE61972.2024.10549584 Spiking neural networks (SNN) hold the promise of being a more biologically plausible, low-energy alternative to conventional artificial neural networks. Their time-variant nature makes them particularly suitable for processing time-resolved, sparse binary data. In this paper, we investigate the potential of leveraging SNNs for the detection of photon coincidences in positron emission tomography (PET) data. PET is a medical imaging technique based on injecting a patient with a radioactive tracer and detecting the emitted photons. One central post-processing task for inferring an image of the tracer distribution is the filtering of invalid hits occurring due to e.g. absorption or scattering processes. Our approach, coined PETNet, interprets the detector hits as a binary-valued spike train and learns to identify photon coincidence pairs in a supervised manner. We introduce a dedicated multi-objective loss function and demonstrate the effects of explicitly modeling the detector geometry on simulation data for two use-cases. Our results show that PETNet can outperform the state-of-the-art classical algorithm with a maximal coincidence detection F1 of 95.2%. At the same time, PETNet is able to predict photon coincidences up to 36 times faster than the classical approach, highlighting the great potential of SNNs in particle physics applications. | Jan Debus (ETH Zurich) | |
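The abstract's interpretation of detector hits as a binary-valued spike train can be sketched as follows. The function name and the time-binning scheme are illustrative assumptions, not PETNet's actual preprocessing: each hit sets one entry of a (time bin, detector channel) binary matrix, which is exactly the shape of input an SNN consumes.

```python
import numpy as np

def hits_to_spike_train(hits, n_detectors, n_bins, bin_width):
    """Encode detector hits as a binary spike train.

    hits : iterable of (timestamp, detector_id) pairs
    Returns a (n_bins, n_detectors) uint8 matrix with one "spike"
    per detector channel per time bin in which that detector fired.
    """
    train = np.zeros((n_bins, n_detectors), dtype=np.uint8)
    for t, d in hits:
        b = int(t // bin_width)  # discretize the timestamp into a bin
        if 0 <= b < n_bins:
            train[b, d] = 1
    return train
```

Coincidence detection then becomes a supervised labeling problem over this spike-train matrix; hits outside the simulated window are simply dropped.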
16:05‑16:15 (10+5 min) | NeRTCAM: CAM-Based CMOS Implementation of Reference Frames for Neuromorphic Processors show talk video Publication DOI: 10.1109/NICE61972.2024.10548603 Neuromorphic architectures mimicking biological neural networks have been proposed as a much more efficient alternative to conventional von Neumann architectures for the exploding compute demands of AI workloads. Recent neuroscience theory on intelligence suggests that cortical columns (CCs) are the fundamental compute units in the neocortex and intelligence arises from CCs’ ability to store, predict and infer information via structured Reference Frames (RFs). Based on this theory, recent works have demonstrated brain-like visual object recognition using software simulation. Our work is the first attempt towards direct CMOS implementation of Reference Frames for building CC-based neuromorphic processors. We propose NeRTCAM (Neuromorphic Reverse Ternary Content Addressable Memory), a CAM-based building block that supports the key operations (store, predict, infer) required to perform inference using RFs. The NeRTCAM architecture is presented in detail, including its key components. All designs are implemented in SystemVerilog and synthesized in 7nm CMOS, and hardware complexity scaling is evaluated for varying storage sizes. A NeRTCAM system for MNIST inference with a storage size of 1024 entries incurs just 0.15 mm² area, 400 mW power and 9.18 μs critical path latency, demonstrating the feasibility of direct CMOS implementation of CAM-based Reference Frames. | Harideep Nair (Carnegie Mellon University) | |
16:20‑17:20 (60 min) | Open mic / discussion - day III speakers | ||
17:20‑17:30 (10 min) | Final words -- invitation to NICE 2025 | ||
17:30 | End of day 3 and of the talk-days of NICE 2024 |
Friday, 26 April 2024 | |||||||||
08:30 | Tutorial day
Venue: The tutorial day takes place at the San Diego Supercomputer Center SDSC (SDSC visitor information page). There is no reserved parking available for the Friday tutorial.
Tutorials are in three different rooms, all located at San Diego Supercomputer Center, East Expansion; Level B1
Tutorials: The following nine tutorials have been selected.
| ||||||||
08:30‑10:30 (120 min) | Tutorial slot I: three tutorials in parallel
Simulation Tool for Asynchronous Cortical Streams (STACS)
In this tutorial, we will explore how to define networks and take advantage of the parallel capabilities of the spiking neural network (SNN) simulator STACS (Simulation Tool for Asynchronous Cortical Streams), https://github.com/sandialabs/STACS. We will primarily focus on the stand-alone simulator functionality of STACS, which is written in C++ and may be loosely interfaced through Python. A common use case of defining a neural network from preconstructed neuron and synapse models will be covered, with an optional hands-on exercise (through a text editor and the command line directly, and from a Jupyter Notebook). STACS was developed in part to provide a scalable simulation backend that may support large-scale SNN experiments. Built to be parallel from the ground up, STACS leverages the highly portable Charm++ parallel programming framework, https://charm.cs.illinois.edu, which expresses a paradigm of asynchronous message-driven parallel objects. STACS takes advantage of the multicast communication pattern supported by Charm++ to match the irregular communication workload of biological-scale models. In addition to the parallel runtime, STACS implements a memory-efficient distributed network data structure for network construction, simulation, and serialization. This allows for both large-scale and long-running simulations (e.g. through checkpoint/restart) on high performance computing (HPC) systems. For network serialization, STACS uses SNN-dCSR, an SNN extension to the popular distributed compressed sparse row (dCSR) format used in graph partitioners. This supports partitioning a network model along its connectivity structure or spatial layout to facilitate more efficient communication by reducing the volume of spikes exchanged across compute resources. It also serves as a portable intermediate representation for interoperability between tools within the neural computing ecosystem and for different neuromorphic backends. Preparation: attendees are encouraged to install Charm++ and STACS beforehand (from their respective repositories, https://github.com/UIUC-PPL/charm and https://github.com/sandialabs/STACS; supported on Linux and macOS); the speaker will also walk through the installation steps briefly at the tutorial, since they can be somewhat involved.
Hands-on tutorial: BrainScaleS neuromorphic compute system
A hands-on tutorial for online interactive use of the BrainScaleS neuromorphic compute system: from the first log-in via the EBRAINS Collaboratory to interactive emulation of small spiking neural networks. This hands-on tutorial is especially suitable for beginners (more advanced attendants are welcome as well). Information about the EBRAINS NMC systems (SpiNNaker and BrainScaleS) is available at https://ebrains.eu/nmc. For using the BrainScaleS system during the tutorial (and also independently of the tutorial for your own research, free of charge for evaluation), an EBRAINS account (also free of charge) is needed: https://ebrains.eu/register. The attendants of the tutorial will use a web browser on their own laptops to execute and change the provided tutorials and explore on their own. Attendants will be able to continue accessing the systems with a generous test quota after the event.
An Integrated Toolbox for Creating Neuromorphic Edge Applications
Spiking Neural Networks (SNNs) and neuromorphic models are more efficient and have more biological realism than the activation functions typically used in deep neural networks, transformer models, and generative AI. SNNs have local learning rules, can learn from small data sets, and are adaptive through neuromodulation. However, although research has discovered their advantages, there are still few compelling practical applications, especially at the edge, where sensors and actuators need to be processed in a timely fashion. One reason for this might be that SNNs are much more challenging to understand, build, and operate due to their intrinsic properties. For instance, the mathematical foundation involves differential equations rather than the basic activation functions of the neurons. To address these challenges, we have developed CARLsim++, an integrated toolbox that enables fast and easy creation of neuromorphic applications for simulation or edge processing. It encapsulates the mathematical intrinsics and low-level C++ programming by providing a graphical user interface for users who do not have a background in software engineering but still want to create neuromorphic models. In this tutorial, we will demonstrate how one can easily configure inputs and outputs to a physical robot using CARLsim++.
| ||||||||
10:30‑11:00 (30 min) | Coffee break | ||||||||
11:00‑13:00 (120 min) | Tutorial slot II: three tutorials in parallel
Neuromorphic Intermediate Representation
The Neuromorphic Intermediate Representation (NIR) is a universal format for defining and exchanging neuromorphic models. Since its creation during the 2023 Telluride Neuromorphic workshop, NIR has matured and is available as open-source software (https://github.com/neuromorphs/NIR). NIR currently supports 7 SNN software frameworks and 4 neuromorphic hardware platforms. In this tutorial we will present the concepts behind NIR and show its versatility in live demonstrations. The hands-on projects target both users of neuromorphic technology, who will learn to convert SNN models from one framework to another, and developers who want to extend NIR or interface it with their own tools. The hands-on examples will be provided as Jupyter Notebooks which can be run on Google Colab. Please bring your own laptop. Further information on the tutorial will be published here: https://github.com/bvogginger/NIR_Tutorial_at_NICE_2024
An Introduction to Design and Simulation using SNS-Toolbox and SNSTorch
SNS-Toolbox is a software package for designing and simulating networks of dynamical spiking and non-spiking neurons on CPUs and GPUs, and for interfacing them with external systems. In this tutorial, we will begin with an introduction to design within the SNS framework, as well as the architecture of SNS-Toolbox and SNSTorch (a companion package for larger networks). Following this introduction, the tutorial will consist of live coding sessions where participants will write code for SNS-Toolbox and SNSTorch with guidance and assistance from the lead developer. By the end of this tutorial, participants will have designed and simulated a heterogeneous network for locomotion control in MuJoCo, as well as trained a large-scale network using SNSTorch. All live coding will be performed using Jupyter Notebooks, which we will provide using Google Colab. These tutorials will be variations of those within the documentation for SNS-Toolbox, available at https://sns-toolbox.readthedocs.io/en/latest/. No prior experience is necessary; however, all participants will need to bring their own personal computer.
Building Scalable, Composable Spiking Neural Algorithms with Fugu (An Introduction)
Fugu is an open-source Python library for building composable, scalable, and platform-agnostic spiking neural network algorithms. In many ways, Fugu re-thinks the methodology of how to design spiking algorithms and implement them on hardware. Unlike many existing approaches, Fugu algorithms are built using a procedural approach: you write code that describes how to build a network, rather than a network itself. This is done in an entirely platform-agnostic way for ‘write once, run anywhere’ spiking algorithms. This approach has several key benefits. In this tutorial, we will begin with an overview of the basic design of Fugu and typical workflows.
| ||||||||
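The procedural approach described for Fugu can be illustrated with a toy builder. This is a hypothetical sketch in the spirit of the description, not Fugu's actual API: the function returns a recipe for constructing a network, and a separate, backend-agnostic build step turns that recipe into concrete neurons and synapses.

```python
class Graph:
    """A trivial backend-agnostic network container (illustrative only)."""
    def __init__(self):
        self.neurons, self.synapses = [], []
    def add_neuron(self, nid):
        self.neurons.append(nid)
    def add_synapse(self, pre, post):
        self.synapses.append((pre, post))

def chain_brick(n_layers, width):
    """Describe a fully connected feed-forward chain.

    Returns a build function (a recipe), not a network: the same
    recipe could be replayed against any backend's graph object.
    """
    def build(graph):
        for layer in range(n_layers):
            for i in range(width):
                graph.add_neuron((layer, i))
        for layer in range(n_layers - 1):
            for i in range(width):
                for j in range(width):
                    graph.add_synapse((layer, i), (layer + 1, j))
        return graph
    return build

# Replaying the recipe against a concrete container builds the network.
g = chain_brick(n_layers=3, width=2)(Graph())
```

Because the recipe only calls an abstract construction interface, the same description can target simulation or hardware backends, which is the ‘write once, run anywhere’ property the tutorial highlights.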
13:00‑14:00 (60 min) | Lunch
| ||||||||
14:00‑16:00 (120 min) | Tutorial slot III: three tutorials in parallel
CrossSim: A Hardware/Software Co-Design Tool for Analog In-Memory Computing
We present CrossSim (https://cross-sim.sandia.gov), an open-source, GPU-accelerated, and experimentally validated simulation tool for the deployment of algorithms on analog in-memory computing (IMC) accelerators. CrossSim simulates how analog errors originating from devices and circuits affect the accuracy of the algorithm. It can be used as a hardware/software co-design tool for analog IMC, enabling a comprehensive exploration of the design space, from device technology, circuits, and system architecture to the application algorithm, to ensure accurate analog computation. This tutorial will feature live code examples to emulate analog IMC using different device technologies on several exemplar applications, such as Fourier transforms and spiking neural network inference.
N2A -- neural programming language and workbench
Neuromorphic device makers are moving away from simple LIF dynamics toward programmable neuron models. The challenge is to support this while maintaining cross-platform portability. Remarkably, these are complementary goals. With an appropriate level of abstraction, it is possible to “write once, run anywhere”. N2A allows the user to specify the dynamics for each class of neuron by simply listing its equations. The tool then compiles these for a given target platform. The structure of the network and interactions between neurons are specified in the same equation language. Network structures can be arbitrarily deep and complex. The language supports component creation, extension, and reuse. Components can be shared via built-in Git integration. This tutorial will introduce the user to the N2A programming language and its associated IDE. Upon completion, the user will be able to create new neuron types and new applications, and run them on their local machine. Preparations: This will be a hands-on tutorial. N2A may be downloaded from https://github.com/sandialabs/n2a and run on your personal laptop.
SANA-FE: Simulating Advanced Neuromorphic Architectures for Fast Exploration
Architecting new neuromorphic chips involves several design decisions that can affect power and performance. Performance models can be used to estimate the impact of different approaches and inform these decisions. SANA-FE (Simulating Advanced Neuromorphic Architectures for Fast Exploration) is an open-source tool developed in a collaboration between UT Austin and Sandia National Laboratories to rapidly and accurately model and simulate the energy and performance of different neuromorphic hardware platforms. The simulator takes a description of a hardware platform and a spiking neural network (SNN) mapped onto the hardware to model execution of the SNN and predict power and performance. SANA-FE’s rapid and accurate predictions enable hardware-aware algorithm development and design-space exploration for algorithm/architecture codesign. This tutorial will demonstrate SANA-FE and its capabilities to the neuromorphic community, and serve as a hands-on introduction to SANA-FE. We will show how to represent different architectures in SANA-FE, and specify SNNs for different neuromorphic applications. Then, we will demonstrate how to use SANA-FE to run a DVS gesture application on a real-world architecture (Loihi v1), and finish with a unique SNN mapping challenge. Before attending this tutorial, we recommend installing Docker Desktop and downloading the SANA-FE Docker image (jamesaboyle/sana-fe), which includes all required binaries, files, and scripts for this session. Docker Desktop can be downloaded at https://www.docker.com/products/docker-desktop/ and in-depth tutorial instructions will be available online at https://github.com/SLAM-Lab/SANA-FE/blob/main/tutorial/TUTORIAL.md | ||||||||
16:00 | End of NICE 2024 |