Tutorial T6

Title: Manycore processing-in-memory systems for accelerating deep learning applications

Abstract: Training machine learning (ML) models at the edge (on-chip training on end-user devices) can address many pressing challenges: it preserves data privacy/security, increases the accessibility of ML applications to different parts of the world by reducing the dependence on the communication fabric and the cloud infrastructure, and meets the real-time requirements of AR/VR applications. However, existing edge platforms do not have sufficient computing capability to support complex ML tasks such as training large CNNs. This tutorial will consider solutions based on Resistive Random-Access Memories (ReRAMs) to address these challenges and answer the following questions: (1) How can ReRAMs be used as a processing-in-memory (PIM) architecture? (2) How can machine learning techniques be mapped to ReRAM-based manycore systems to improve performance and energy efficiency? (3) What are the challenges associated with different types of deep learning applications (such as CNNs and GNNs) when mapped to PIM-based manycore architectures? (4) How can we ensure reliability in these architectures? Addressing these outstanding challenges requires out-of-the-box approaches. By integrating machine learning algorithms, data analytics, statistical modeling, and the design of advanced computing systems, this tutorial will engage a broad section of ESWEEK conference attendees. It also highlights how machine learning and embedded-systems researchers can join hands to design energy-efficient and reliable miniaturized computing systems. This tutorial is targeted at university researchers/professors, MS/Ph.D. students, industry professionals, and IC designers who want to learn how to use ReRAMs for ML applications, as well as experienced researchers looking for exciting new directions in PIM. We expect 30-40 attendees. We will announce the tutorial through our regular networks, as well as the mailing lists of CEDA, the IEEE CAS Society, the IEEE Computer Society, and ACM SIGDA (including SIGDA E-News, which reaches thousands of EDA professionals and which we will leverage to publicize our tutorial). The target audience matches the typical ESWEEK participant very well. We also hope that this tutorial will allow ESWEEK to reach out to a wider audience and help boost attendance.

List of topics to be covered

The main objective of the tutorial is to help attendees understand the potential of emerging ReRAM-based computing systems, both as memory and as a processing-in-memory architecture. We will elaborate on how ReRAMs can implement complex ML applications in an energy-efficient manner. Specific topics are as follows (a short illustrative sketch of the underlying crossbar computation appears after the list):

  • Application of ReRAMs as a processing-in-memory system
  • Mapping of ML algorithms to ReRAMs for inferencing and training
  • Addressing computation and communication challenges of PIM-based manycore systems
  • Ensuring high performance and accuracy by addressing reliability challenges in ReRAMs
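
To make these ideas concrete, here is a minimal Python sketch (our own illustration for this proposal, not code from the tutorial materials) of the core PIM operation: a weight matrix is quantized onto discrete ReRAM conductance levels, and a matrix-vector product is computed in a single analog step via Ohm's and Kirchhoff's laws. The optional multiplicative noise term loosely stands in for the stochastic conductance variation discussed under the reliability topic; the function names, the 16-level assumption, and the noise model are all hypothetical.

```python
import numpy as np

def quantize_conductance(weights, g_min=1e-6, g_max=1e-4, levels=16):
    """Map weights (pre-scaled to [0, 1]) onto discrete conductance levels.

    Real ReRAM cells offer a limited number of stable states, so each
    weight is clipped and rounded to the nearest programmable level.
    """
    w = np.clip(weights, 0.0, 1.0)
    step = (g_max - g_min) / (levels - 1)
    return g_min + np.round(w * (levels - 1)) * step

def crossbar_mvm(g, v_in, sigma=0.0, rng=None):
    """One analog matrix-vector multiply on an idealized crossbar tile.

    Input voltages v_in drive the rows; by Ohm's and Kirchhoff's laws the
    column currents are i_out[j] = sum_i v_in[i] * g[i, j]. A nonzero
    sigma adds multiplicative noise modeling device-to-device variation.
    """
    rng = rng or np.random.default_rng()
    g_eff = g * (1.0 + sigma * rng.standard_normal(g.shape)) if sigma > 0 else g
    return v_in @ g_eff

# Toy example: one 64x32 crossbar tile holding a slice of a layer's weights.
rng = np.random.default_rng(0)
g = quantize_conductance(rng.random((64, 32)))
v = rng.random(64)
i_clean = crossbar_mvm(g, v)
i_noisy = crossbar_mvm(g, v, sigma=0.1, rng=rng)
rel_err = np.linalg.norm(i_noisy - i_clean) / np.linalg.norm(i_clean)
print(f"Relative output error under 10% conductance noise: {rel_err:.3f}")
```

The sketch omits the DAC/ADC conversion stages that dominate the area and energy of each crossbar tile in fabricated designs such as ISAAC [7], as well as the bit-slicing needed to spread full-precision weights across multiple cells.
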
Length and format of the tutorial

The tutorial will be delivered online as a one-hour lecture. Lecture notes will be provided to attendees as a PDF handout; the notes will include a comprehensive bibliography along with a few relevant papers on the topic. Basic knowledge of algorithms, EDA concepts, statistics, memory design, and manycore architectures is preferred. Prior knowledge of machine learning will be useful, but the speakers will try to make the tutorial as self-contained as possible.

Has the same tutorial (or a similar one) been presented to other events?

Two related special sessions, titled “Harnessing the Power of Machine Learning: EDA to Accelerator Design” and “Cross-Layer Design Solutions for Energy-Efficient and Secure Edge AI”, were presented at ICCAD 2021. Jana Doppa was involved with the ICCAD 2021 special sessions.

Organizers

Janardhan Rao Doppa, Washington State University, jana.doppa@wsu.edu.

Jana is currently the George and Joan Berry Chair Associate Professor at Washington State University (WSU), Pullman, WA, USA. His current research interests lie at the intersection of machine learning and computing-systems design. He received an NSF CAREER Award (2019), an Outstanding Paper Award at the AAAI conference (2013), a Google Faculty Research Award (2015), and the Outstanding Innovation in Technology Award from Oregon State University (2015). He also received the Reid-Miller Teaching Excellence Award (2018) and the Outstanding Junior Faculty in Research Award (2020) from the Voiland College of Engineering and Architecture at WSU. He was among the 15 outstanding young researchers selected to give an Early Career Spotlight talk at the International Joint Conference on Artificial Intelligence (2021).

Biresh Kumar Joardar, Duke University, bireshkumar.joardar@duke.edu.

Biresh received his Ph.D. from Washington State University in 2020. He is currently a Postdoctoral Computing Innovation Fellow (CI Fellow) in the Department of Electrical and Computer Engineering at Duke University. His current research interests include machine learning, manycore architectures, accelerators for deep learning, and hardware reliability and security. He received the Outstanding Graduate Student Researcher Award at Washington State University in 2019, and his work has been nominated for Best Paper Awards at prestigious conferences such as DATE and NOCS.

References
  1. B. K. Joardar, J. R. Doppa, P. P. Pande, H. Li, and K. Chakrabarty, “AccuReD: High Accuracy Training of CNNs on ReRAM/GPU Heterogeneous 3-D Architecture,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 40, no. 5, pp. 971-984, May 2021, doi: 10.1109/TCAD.2020.3013194.
  2. B. K. Joardar, J. R. Doppa, H. Li, K. Chakrabarty, and P. P. Pande, “Learning to Train CNNs on Faulty ReRAM-based Manycore Accelerators,” ACM Transactions on Embedded Computing Systems, vol. 20, no. 5s, Article 55, October 2021, 23 pages, doi: 10.1145/3476986.
  3. A. I. Arka, B. K. Joardar, J. R. Doppa, P. P. Pande, and K. Chakrabarty, “DARe: DropLayer-Aware Manycore ReRAM Architecture for Training Graph Neural Networks,” 2021 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2021, pp. 1-9, doi: 10.1109/ICCAD51958.2021.9643511.
  4. A. I. Arka, J. R. Doppa, P. P. Pande, B. K. Joardar, and K. Chakrabarty, “ReGraphX: NoC-enabled 3D Heterogeneous ReRAM Architecture for Training Graph Neural Networks,” 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2021, pp. 1667-1672, doi: 10.23919/DATE51398.2021.9473949.
  5. X. Yang et al., “Multi-Objective Optimization of ReRAM Crossbars for Robust DNN Inferencing under Stochastic Noise,” 2021 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2021, pp. 1-9, doi: 10.1109/ICCAD51958.2021.9643444.
  6. L. Song, X. Qian, H. Li, and Y. Chen, “PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning,” 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), 2017, pp. 541-552.
  7. A. Shafiee et al., “ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars,” 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), 2016, pp. 14-26.