Remote Ph.D. Defense: Candace Walden

Thursday, July 15, 2021
10:30 a.m.
https://umd.webex.com/umd/j.php?MTID=ma7e9a4181a2a031ac90db6b2227599aa Password: 9cfXAp32iNN
Maria Hoo
301 405 3681
mch@umd.edu

ANNOUNCEMENT:  Remote Ph.D. Defense
 

Name:   Candace Walden
 
Committee:
Prof. Donald Yeung, Advisor/Chair
Prof. Bruce Jacob
Prof. Manoj Franklin
Prof. Sahil Shah
Prof. Abhinav Bhatele
Prof. Hanan Samet, Dean’s Representative
 

Date/Time: Thursday, July 15, 2021 at 10:30 AM


Location:   https://umd.webex.com/umd/j.php?MTID=ma7e9a4181a2a031ac90db6b2227599aa

Password:    9cfXAp32iNN
 
 
Title: Monolithically Integrated SRAM-ReRAM Cache-Main Memory System
 

Abstract:
Emerging non-volatile memories are dense and potentially compatible with standard CMOS processes, enabling a monolithically integrated CPU-main memory chip. However, area constraints impact the feasibility of fitting the entirety of a multi-core CPU and main memory system into a single die. ReRAM presents a unique opportunity in that it can be fabricated in crosspoint subarrays which leave the bulk of transistors beneath them available for other logic. ReRAM also poses a performance challenge, though: its latency is generally much higher than that of DRAM. Compensating for this through the increased bandwidth afforded by being on-die poses an architectural problem.

The access circuitry for ReRAM subarrays requires only a small percentage of the area beneath the array. Still, this dense circuitry and wiring disrupt the layouts of irregular logic like CPUs. Caches are very regular and composed of smaller subarrays, making them a better candidate to place beneath crosspoint subarrays. By co-designing the cache subarrays and ReRAM crosspoint subarrays, minimal disruption to the cache logic can be achieved while still covering the bulk of the last-level cache area in ReRAM. Using a modified version of CACTI, we explore the design trade-offs when integrating ReRAM and cache and quantify the impact the ReRAM has on the last-level cache. We also examine how the physical integration presents opportunities for logical integration of the last-level cache and main memory.

Additionally, this work explores one architectural style that can balance the monolithic memory system and a general-purpose compute system: a tiled multicore with wide SIMD and multi-threading. We develop a simulator for this architecture capable of simulating a wide variety of system parameters. Through a design space exploration of many of the parameters across sparse, irregular graph kernels and dense, streaming computations, we find that monolithic ReRAM exceeds the performance of a state-of-the-art DRAM system for memory-intensive workloads given enough parallelism. We further develop an analytic model to describe our system and highlight the important performance characteristics for a monolithic CPU-main memory chip. The analytic model is validated against our simulation data. Using this model, we examine the architectural balance of the systems we simulated.

Finally, we develop an RTL model of the combined cache-main memory interface. This gives a more accurate model for the increase in resources required for the combined controller. We additionally develop a system-on-a-chip with an RTL model that alters requests to the FPGA's main memory to match the speed of ReRAM requests. This model is used to show the performance of more computationally intensive benchmarks. It is also the first step toward creating a test chip for a monolithically integrated ReRAM main memory.

 

 

Audience: Graduate, Faculty
