Workshops and Tutorials
Monday 19 September 2022

Computing Architecture for AI

(Room - Aula 1A)

Chair

Bill McColl (Huawei Research, CH)

Bill McColl is the Director of the HiSilicon Future Computing Systems Research Lab at Huawei's Zurich Research Center, where he leads fundamental research on future computing architecture, software and algorithms. He is also a Fellow of Wadham College, Oxford University. Previously he was Professor of Computer Science, Head of Research in Parallel Computing, and Chairman of the Faculty of Computer Science at Oxford. He established and led Oxford Parallel, a major center for research on industrial and business applications of HPC at the university. Much of his research has focused on the Bulk Synchronous Parallel (BSP) approach to parallel architecture, software and algorithms. BSP is now used throughout industry for massively parallel graph databases, graph analytics, machine learning and other areas of AI.

Overview

AI is revolutionizing the computing industry, and driving new research and innovation on chip and system architectures for the cloud, the edge, datacenters and supercomputers. New AI accelerators, new infrastructure processors, and new disaggregated system architectures are emerging in response to these major new challenges and opportunities. In this workshop we will hear about the latest developments in these areas, and about future directions in research on AI computing architectures.

08:30 - 09:15

Extreme energy efficiency for extreme edge AI acceleration - An open platform perspective

The pervasive AI megatrend is pushing machine learning (ML) acceleration toward TinyML, i.e. the extreme edge, with mW power budgets, while at the same time raising the bar in terms of accuracy and capabilities. To succeed in this balancing act, we need principled ways to walk the line between flexible and highly specialized TinyML acceleration architectures and circuits. In this talk I will detail how to walk that line, drawing on the experience of the open PULP (Parallel Ultra-Low Power) platform, based on ML-enhanced RISC-V processors coupled with domain-specific acceleration engines.

Speaker

Luca Benini (ETH Zurich, CH)

Luca Benini holds the Chair of Digital Circuits and Systems at ETHZ and is Full Professor at the Università di Bologna. Dr. Benini's research interests are in energy-efficient parallel computing systems, smart sensing micro-systems and machine learning hardware. He is a Fellow of the IEEE and of the ACM, and a member of the Academia Europaea. He has received various awards, including the IEEE CAS Mac Van Valkenburg Award (2016) and the ACM/IEEE A. Richard Newton Award (2020).

09:15 - 10:00

In-memory computing for deep-learning acceleration

In-memory computing (IMC) is a novel computing paradigm in which certain computational tasks are performed in the memory itself using analog or mixed-signal computation techniques. Naturally, memory plays a central role in this computing approach, for which both CMOS charge-based and emerging post-CMOS resistance-based information storage are well suited. The co-existence of computation and storage at the nanometer scale could be the enabler for ultra-dense, highly energy-efficient and high-throughput hardware acceleration of deep neural networks. This talk will provide a broad overview of recent progress in IMC for accelerating deep learning workloads, highlighting the strengths and weaknesses of the various approaches.

Speaker

Evangelos Eleftheriou (Axelera, NL)

Evangelos Eleftheriou is the Chief Technology Officer (CTO) and Co-Founder of Axelera AI. His research interests focus on AI and machine learning, including emerging computing paradigms such as neuromorphic and in-memory computing. Dr. Eleftheriou was appointed an IBM Fellow in 2005, and was inducted into the IBM Academy of Technology in the same year. He was a co-recipient of the IEEE ComSoc Leonard G. Abraham Prize Paper Award in 2003 and of the 2005 Technology Award of the Eduard Rhein Foundation. In 2009, he was a co-recipient of the IEEE Control Systems Technology Award and the IEEE Transactions on Control Systems Technology Outstanding Paper Award. In 2016, he received an honoris causa professorship from the University of Patras, Patras, Greece. In 2018, he was inducted into the U.S. National Academy of Engineering as a Foreign Member.

10:00 - 10:30

Break

10:30 - 11:15

tinyML through heterogeneous digital-analog multi-core AI processing

tinyML strives for powerful machine inference in resource-scarce distributed devices. To enable intelligent applications at ultra-low energy and low area, one needs (1) compact compute and memory structures (2) that are used at very high utilization. This is possible through the creation of heterogeneous co-processor fabrics, which allow every workload to run on the most compatible accelerator. Moreover, by using multiple cores in parallel and streaming data between the cores, the required amount of on-chip memory and I/O bandwidth can be reduced, leading to area, energy and latency savings. This talk will cover the modeling, optimization and implementation of such heterogeneous ML systems.

Speaker

Marian Verhelst (KU Leuven, BE)

Marian Verhelst is a full professor at the MICAS laboratories of the EE Department of KU Leuven. Her research focuses on embedded machine learning, hardware accelerators, HW-algorithm co-design and low-power edge processing. Before that, she received a PhD from KU Leuven in 2008 and worked as a research scientist at Intel Labs, Hillsboro, OR, from 2008 to 2011. Marian is a topic chair of the DATE and ISSCC executive committees, a TPC member of VLSI and ESSCIRC, and was the chair of tinyML2021 and TPC co-chair of AICAS2020. Marian is an IEEE SSCS Distinguished Lecturer, was a member of the Young Academy of Belgium, an associate editor for TVLSI, TCAS-II and JSSC and a member of the STEM advisory committee to the Flemish Government. Marian received the André Mischke YAE Prize for Science and Policy in 2021, and was awarded the InspiringFifty Deep Tech BeneLux 2021 prize.

11:15 - 12:00

Neuromorphic Intelligence. Electronic circuits for emulating neural processing systems and their application to pattern recognition

Artificial Intelligence (AI) and deep learning algorithms have demonstrated impressive results in a wide range of applications. However, they still have serious shortcomings for use cases that require real-time processing of sensory data and closed-loop interactions with the real world in uncontrolled environments.
Neuromorphic Intelligence (NI) aims to mitigate these shortcomings by developing ultra-low-power electronic circuits and radically different brain-inspired in-memory computing architectures. In this presentation I will show examples of NI circuits that exploit the physics of their devices to directly emulate the biophysics of real neurons, and I will demonstrate applications of NI processing systems to use cases that require low-power, local processing of the sensed data and that cannot afford to connect to the cloud for running AI algorithms.

Speaker

Giacomo Indiveri (University of Zurich, CH)

Giacomo Indiveri is a dual Professor at the Faculty of Science of the University of Zurich and at the Department of Information Technology and Electrical Engineering of ETH Zurich, Switzerland. He is the director of the Institute of Neuroinformatics of the University of Zurich and ETH Zurich. He obtained an M.Sc. degree in electrical engineering in 1992 and a Ph.D. degree in computer science from the University of Genoa, Italy in 2004. His latest research interests lie in the study of spike-based learning mechanisms and recurrent networks of biologically plausible neurons, and in their integration in real-time closed-loop sensory-motor systems designed using analog/digital circuits and emerging memory technologies. His group uses these neuromorphic circuits to validate brain-inspired computational paradigms in real-world scenarios, and to develop a new generation of fault-tolerant event-based neuromorphic computing technologies. Indiveri is a Senior Member of the IEEE and a recipient of the 2021 IEEE Biomedical Circuits and Systems Best Paper Award. He is also an ERC fellow, recipient of three European Research Council grants.

12:00 - 13:30

Lunch

13:30 - 14:15

Hardware/Software Co-Design of Nanoelectronics-Based 3D Edge AI Architectures

Edge artificial intelligence (AI) has been hailed as the next frontier of innovation in the Internet of Things (IoT), enabling our everyday objects to be connected and work together to improve our lives and transform industries. However, major challenges remain in achieving this potential because of the inherent difficulty of designing energy-efficient edge AI architectures that can run complex variations of convolutional neural networks (CNNs) on edge AI accelerators with limited processing capabilities. In this talk, Prof. Atienza will discuss the benefits of designing nanoelectronics-based 3D edge AI architectures operating at sub-nominal conditions to reduce power and obtain ultra-low-power IoT systems, while highlighting the challenge of errors that can appear in such systems when executing complex CNN designs. These errors can affect the stored values of CNN weights and activations, compromising the networks' accuracy. A new architectural co-design methodology for edge AI systems can then conceive ensembles of CNNs for 3D edge AI architectures with improved robustness against memory errors compared to single-instance CNNs. At the software level, a new optimization methodology for edge AI systems, called E2CNNs, provides compression methods and heuristics to produce an ensemble of CNNs for edge AI devices with the same memory requirements as the original architecture but improved error robustness. At the hardware level, we propose a triple combination of emerging technologies for the fine interweaving of versatile logic functionality and memory for reconfigurable in-memory computing: vertical junctionless gate-all-around nanowire transistors for ultimate downscaling, ambipolar functionality enhancement for fine-grain flexibility, and ferroelectric oxides for non-volatile logic operation.
Through the use of 3D compute cubes naturally suited to the hardware acceleration of computation-intensive kernels, designed robustly with E2CNNs, we can create the next generation of nanoelectronics-based 3D edge AI architectures that trade energy consumption against computation precision, using Transformer and Conformer networks as case studies.

Speaker

David Atienza (Embedded Systems Laboratory EPFL, CH)

David Atienza is a Professor of Electrical and Computer Engineering and heads the Embedded Systems Laboratory (ESL) at EPFL. He received his MSc and Ph.D. degrees in Computer Science and Engineering from UCM and IMEC. His research interests focus on system-level design methodologies for energy-efficient computing systems, particularly multi-processor system-on-chip architectures (MPSoC) for servers and next-generation edge AI architectures. He is a co-author of more than 350 publications, 14 patents, and has received several best paper awards in top conferences in these fields. Dr. Atienza received, among other recognitions, the ICCAD 2020 10-Year Retrospective Most Influential Paper Award, the DAC Under-40 Innovators Award in 2018, the IEEE CEDA Early Career Award in 2013, and the ACM SIGDA Outstanding New Faculty Award in 2012. He is an IEEE Fellow, an ACM Distinguished Member, and was the President (2018-2019) of IEEE CEDA.

14:15 - 15:00

Machine Learning Anywhere/Anytime: Purpose-Built Computing Hardware

We will review hardware and hardware/software methods for improving machine learning capabilities for training and inference. The defining characteristic of our methods is that they require no help from the machine learning developers. They rely on "naturally" occurring properties of neural networks. The methods reduce memory and computation costs for out-of-the-box neural networks while, at the same time, rewarding user-side optimizations such as quantization and pruning.

Speaker

Andreas Moshovos (University of Toronto, CA)

Andreas Moshovos teaches digital system design and optimization at the University of Toronto. He has also taught at Northwestern University, USA, École Polytechnique Fédérale de Lausanne, Switzerland, and the University of Athens, Greece. He has over 25 years of experience in application characterization and hardware optimizations for computing hardware. He received the NSF early CAREER award in 2000, a Semiconductor Research Corporation Inventor Recognition Award, IBM Faculty Partnership awards in 2008 and 2009, the 2010 ACM Maurice Wilkes Award, and the 2021 IEEE Canada C.C. Gottlieb Computer Award. He is a Fellow of the IEEE and ACM, the Director of the NSERC COHESA Strategic Network on Machine Learning Acceleration, and a faculty affiliate of the Vector Institute.

15:00 - 15:30

Break

15:30 - 16:15

SpiNNaker2 and Beyond: Bio-inspired Processing from Edge to Cloud

AI is having an increasingly large impact on our daily lives. However, current AI hardware and algorithms are still only partially inspired by the major blueprint for AI, i.e. the human brain. In particular, even the best AI hardware is still far away from the 20 W power consumption, the low latency and the unprecedentedly large-scale, high-throughput processing offered by the human brain. In this talk, I will describe our bio-inspired AI hardware, from our award-winning edge systems up to SpiNNaker2, which is the largest cloud system for real-time AI worldwide. With our unique sparsity-optimized hybrid AI framework, this enables a real-time distributed AI system with unprecedentedly low latency, low energy and high robustness. In this process, we also endeavor to marry three different AI concepts: (1) DNNs for handling real-world, noisy data, (2) bio-inspiration for efficiency and for making representations sparse, closing the gap to (3) the symbolic AI layer, which handles the abstracted, robust world interaction. Thus, we aim to close the huge gap between current AI based on pure computing power and true brain-like AI.

Speaker

Christian Mayr (Technical University of Dresden, DE)

Christian Mayr is a Professor of Electrical Engineering at TU Dresden, heading the Chair of Highly-Parallel VLSI-Systems and Neuromorphic Circuits. His career encompasses postings at Infineon, Philips, the University of Zurich, TU Dresden and Johns Hopkins University, Baltimore. He is author/co-author of over 80 publications and holds 4 patents. He is a PI in the EU flagship Human Brain Project as well as in the German excellence clusters CETI and cfaed.

16:15 - 17:00

Hardware and software models for hyperscale AI on HPC and cloud systems

I will talk about the challenges of developing future heterogeneous "superclouds" that can continuously run many large-scale AI applications on a single shared heterogeneous architecture (cloud, datacenter or supercomputer). The applications might involve HPC, Machine Learning, Big Data Analytics, Graph Computing or Knowledge Computing, but all will need to be delivered with the high performance, utilization and resilience required by commercial hyperscale services. Some emerging applications will additionally need to perform HPC, AI and Knowledge Computing together in a tightly integrated way within a single application.

Speaker

Bill McColl (Huawei Research, CH)

17:00 - 17:1

Closing