Title: Waferscale Computing Systems: Are We There Yet?
Organizers: Puneet Gupta (UCLA), Saptadeep Pal (Auradine Inc.)
Abstract: Fueled by the tremendous growth of new applications in the domains of big-data computing, deep learning, and scientific computing, the demand for increasing system performance is far outpacing the capability of conventional methods for system performance scaling. Waferscale computing, where an entire 300 mm wafer’s worth of compute and memory can be extremely tightly integrated, promises to provide orders-of-magnitude improvement in performance and energy efficiency compared to today’s systems built using traditional packaging technologies.
In this panel, we will discuss the “Promised Land” of waferscale computing. Back in the 1980s, waferscale systems were attempted by a few companies, notable among them Trilogy Systems and Tandem Computers. The yield and cost challenges of building waferscale systems led to their early demise, but the promise remained. After more than 30 years, recent academic (e.g., UCLA/UIUC) and industrial (Cerebras, Tesla) efforts have taken up the challenge again. So, are we in a waferscale technology renaissance and nearing the days when waferscale technologies will be widely adopted? Many questions remain, which we will discuss in this panel.
First and foremost, is the overall technology there yet? Do manufacturing difficulties, more so in the advanced nodes, limit the choice of waferscale architectures? Would waferscale integration of heterogeneous chiplets open up more architectural choices than monolithic waferscale technologies? Waferscale computing comes with very high power density, which means tens of kilowatts of power need to be supplied to the wafer and the resulting heat needs to be extracted from it. Is the data center infrastructure ready to accommodate such waferscale systems at scale? Are there lower-power use cases for waferscale? Moreover, the design infrastructure, such as EDA and simulation tools, is not yet truly ready for larger-than-a-reticle design. The overall design challenges of a waferscale system are therefore enormous, so would development be confined to a niche group? What are the applications that need waferscale computing and would benefit massively from such systems? Does the cost of waferscale systems justify adoption of these systems at volume? Are there previously untenable applications and business cases that now become feasible with waferscale computing, and if so, what are they? Are there edge compute use cases where the volumetric compute density would lead to adoption of waferscale systems?
Rakesh Kumar is a Professor in the Electrical and Computer Engineering Department at the University of Illinois at Urbana-Champaign with research and teaching interests in computer architecture and system-level design automation. His research has been recognized through one ISCA Influential Paper Award, one MICRO Test-of-Time Award, one ASPDAC 10 Year Retrospective Most Influential Paper (MIP) Award, several best paper awards and best paper award nominations (IEEE MICRO Top Picks, ASPLOS, HPCA, CASES, SELSE, IEEE CAL), an ARO Young Investigator Award, and a UCSD CSE Best Dissertation Award. His teaching and advising have been recognized through the Stanley H Pierce Faculty Award and the Ronald W Pratt Faculty Outstanding Teaching Award. He often writes about issues at the intersection of technology, policy, and society; he is the author of the book Reluctant Technophiles (Sage Select: Dec 2021), one of “GQ’s Best Indian Non-fiction Books of 2021”. Rakesh has a BS from IIT Kharagpur and a PhD from the University of California at San Diego.
Gabriel H. Loh is a Senior Fellow in AMD Research, the research and advanced development lab for Advanced Micro Devices, Inc. Gabe received his Ph.D. and M.S. in computer science from Yale University in 2002 and 1999, respectively, and his B.Eng. in electrical engineering from the Cooper Union in 1998. Gabe was also a tenured associate professor in the College of Computing at the Georgia Institute of Technology, a visiting researcher at Microsoft Research, and a senior researcher at Intel Corporation. He is a Fellow of the ACM and IEEE, recipient of ACM SIGARCH’s Maurice Wilkes Award, Hall of Fame member for the MICRO, ISCA, and HPCA conferences, (co-)inventor on over one hundred US patent applications and over ninety granted patents, and a recipient of the US National Science Foundation Young Faculty CAREER Award.
Joel Hestness is a Senior Research Scientist at Cerebras Systems, an AI-focused hardware startup building the largest ever processors using wafer-scale integration. Joel helps define algorithms, performance optimizations, and scaling approaches for machine learning and NLP applications on the Cerebras Wafer-Scale Engine. Previously, Joel was a Research Scientist at Baidu’s Silicon Valley AI Lab (SVAIL), where he worked on deep learning speech and language modeling. His work was the first to demonstrate predictable accuracy scaling laws for modern deep learning algorithms, sparking a trend of scaling law studies now pervasive in the field. Joel received his PhD in computer architecture from the University of Wisconsin – Madison, and his Bachelor’s degrees in Mathematics and Computer Science also from UW-Madison.
Dave Nellans joined NVIDIA in 2013, where he is the Director of System Architecture Research. His research interests include building scalable computing systems that optimize node-level efficiency by improving the performance, utilization, and interaction of GPUs, CPUs, smart NICs, and storage systems. Dr. Nellans was previously an early engineering leader at Fusion-IO, one of the pioneers in PCIe-attached NAND-flash storage, where he helped the company invent, develop, and ship new datacenter storage products that ultimately led to Fusion-IO’s IPO in 2011. He holds a B.A. in Computer Science from Colgate University and a Ph.D. in Computer Science from the University of Utah.