Ph.D. Dissertation Defense: Shang Li

Friday, June 28, 2019
11:00 a.m.
AVW 1146
Emily Irwin
301 405 0680
eirwin@umd.edu

NAME: Shang Li

 
Committee:
Prof. Bruce Jacob (advisor)
Prof. Donald Yeung
Prof. Manoj Franklin
Prof. Jeff Hollingsworth
Prof. Alan Sussman (dean's rep)

Date/Time: Friday, June 28, 2019, 11:00 a.m. to 1:00 p.m.

Place: 1146 A.V. Williams Building

Title:  Scalable and Accurate Memory System Simulation

Abstract:
Memory systems today are more complex than ever. On one hand, main memory
technology has a much more diverse portfolio. Besides mainstream DDR DRAMs,
LPDDR, GDDR, and stacked DRAMs such as HBM and HMC have been proliferating in
certain domains. Non-volatile memory (NVM) has also finally made it to the main
memory market, introducing more heterogeneity to the main memory media. On the
other hand, the scale of computer systems, from personal computers and servers
to high-performance computing systems, has been increasing. Memory systems have
to keep scaling so that they do not bottleneck the whole system. However,
current memory simulation tools cannot model these developments accurately or
efficiently, making it hard for researchers and developers to evaluate or
optimize memory system designs.
 
In this study, we attack these issues from multiple angles. First, we build a
fast and extensible cycle-accurate main memory simulator that can accurately
model almost all existing DRAM protocols and some NVM protocols, and that can
be easily extended to support upcoming protocols as well. We showcase this
simulator by conducting a thorough characterization of existing DRAM protocols
and provide insights on memory system design.
 
Second, to efficiently simulate increasingly parallel memory systems, we
propose a lax synchronization model that allows efficient parallel DRAM
simulation. We are able to speed up the overall simulation by a factor of two
with a single-digit percentage loss in accuracy compared to cycle-accurate
simulation. We also develop mitigation schemes that further improve accuracy
at no additional performance cost.
 
Moreover, we discuss the limitations of cycle-accurate models and explore
alternative ways of modeling DRAM. We propose a novel model that converts DRAM
timing simulation into a classification problem. By doing so, we can predict
the DRAM latency of each memory request upon first sight, which is much faster
than a cycle-accurate simulator and reduces the simulator's complexity. We
develop prototypes based on various machine learning models, and they
demonstrate a promising balance of performance and accuracy that makes this
approach a viable alternative to cycle-accurate models.
 
Finally, for large-scale memory systems where data movement is often the
bottleneck, we propose a set of interconnect topologies and implement them in
a parallel discrete event simulation framework. We evaluate the proposed
topologies through simulation and show that their scalability and performance
exceed those of existing topologies as system size or workload increases.

Audience: Graduate, Faculty

 
