ICE Project Descriptions: Summer 2003
1. Architectures for High-Performance, Low-Power Embedded Systems

We are building a complete
embedded-system simulator that simulates both the embedded microcontroller
and the RTOS. This will enable us to gather a large amount of information
on the behavior of real-time systems and will allow us to measure the
effect of changes to the system architecture that require modifications
to both hardware and software. Current measurement techniques do not allow
such flexibility; software simulators that execute applications directly
on an emulated processor neglect operating system activity, and systems
that attach logic probes to real hardware obtain accurate measurements
but do not allow modifications to the processor architecture.

Tasks within this project include
modeling hardware architectures in C and/or Verilog, building new hardware
constructs for real-time processing, developing enhancements for real-time
operating systems, and developing real-time embedded applications.
2. Robust Computer Architectures

The devices from which computer chips are built are shrinking rapidly in feature size and operating voltage, making them more vulnerable to electrical upset, either from external sources or from internal interference. Previous work in the area of circuit-level fault tolerance has focused on surviving small numbers of random transient or stuck-at errors. The solution in the memory system (both the DRAM system and the cache system) has been to provide error-correcting code (ECC) bits that detect and correct such errors introduced into the memory system. With enough redundant bits, such codes can detect and correct any desired number of errors.

The solution on the processor side has been to replicate resources at different levels of granularity. For example, some systems have multiple identical processors performing the same task at the same time and use a voting algorithm to ignore any erroneous results. Other systems replicate components within the architecture -- for example, by having multiple adders that perform identical computations with a similar voting method to choose the correct result. These approaches are intended to catch transient errors that occur in one (but not usually more than one) of the processors or components involved. As the susceptibility of circuits increases with smaller, higher-performance parts, the likelihood that errors are transient and random decreases rapidly.

There is a solution. Rollback recovery has long been used in distributed systems and transaction processing to provide high degrees of reliability in the face of occasional catastrophic failures (e.g., disk crashes). A reliable software system using the technique periodically saves to reliable storage just enough state that the system can be successfully restarted using only that saved state. When a catastrophic failure occurs and is detected, the system is restarted from that saved state.
We propose to use rollback recovery not at the system-software level but at the microarchitecture level (i.e., the chip level) to provide a high degree of reliability. Periodically, the microprocessor will dump consistent system state (the contents of its internal storage and any recent changes to external storage) to a safe location. Upon detection of an error, this state will be restored to the processor, and the processor will resume execution from this "known good" state.

Tasks within this project include modeling hardware architectures in
Verilog, designing, fabricating, and testing new hardware
prototypes, and developing testbed applications.
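
The checkpoint-and-rollback cycle described in this project can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the proposed hardware design: the `Machine` class, the `run` function, the checkpoint interval, and the fault-injection flag (a stand-in for a hardware error detector firing) are all hypothetical names introduced for illustration.

```python
import copy


class Machine:
    """Toy processor state: a program counter and an accumulator (hypothetical)."""

    def __init__(self):
        self.state = {"pc": 0, "acc": 0}
        self._saved = copy.deepcopy(self.state)

    def checkpoint(self):
        # Save a consistent copy of the state to a "safe location".
        self._saved = copy.deepcopy(self.state)

    def rollback(self):
        # Restore the last known-good state.
        self.state = copy.deepcopy(self._saved)

    def step(self, value, fault=False):
        self.state["acc"] += value
        if fault:
            # Simulate a transient upset by flipping one bit of the result.
            self.state["acc"] ^= 0x40
        self.state["pc"] += 1


def run(values, checkpoint_every=2, faulty_step=None):
    """Add up `values`, rolling back once if a fault is injected at `faulty_step`."""
    machine = Machine()
    rollbacks = 0
    pending_fault = faulty_step
    while machine.state["pc"] < len(values):
        pc = machine.state["pc"]
        if pc % checkpoint_every == 0:
            machine.checkpoint()
        inject = pc == pending_fault
        machine.step(values[pc], fault=inject)
        if inject:
            # Assumption: the error is detected right after the faulty step.
            pending_fault = None  # the fault is transient, so it does not recur
            machine.rollback()
            rollbacks += 1
    return machine.state["acc"], rollbacks
```

In this model, a fault at step 3 sends execution back to the checkpoint taken at step 2, and the re-executed steps produce the same final sum as a fault-free run. The checkpoint interval trades the cost of saving state against the amount of work redone after an error.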
3. Satisfiability Problem and its Application in VLSI CAD

In the (boolean) satisfiability problem, we are given a formula on a set of (boolean) variables, and we are asked to assign each variable either 0 or 1 to make the formula true. The formula consists of variables and three types of basic operations: (i) '+' (OR): x+y is true if at least one of the variables x and y gets the value '1'; (ii) '*' (AND): x*y is true if and only if both x and y get the value '1'; (iii) ''' (complement): x' is true if and only if x gets the value '0'. For example, any of the following assignments will make the formula x'+y*z true: {x=0, y=0, z=1}, {x=1, y=1, z=1}, {x=0, y=1, z=1}. The satisfiability problem
has numerous applications in computer science, complexity theory, and
computer-aided design (CAD) of very-large-scale integrated (VLSI) circuits.
The problem is hard (it is NP-complete), and many heuristics have been proposed to solve
it. Because these problem solvers come from very different fields and
target very different types of formulas, it is difficult to compare their
performance. Our goals in this project include: (1) understanding the
problem and basic ideas of different solvers, (2) building testbeds for
different solvers, (3) developing new algorithms to solve the problem,
and (4) improving C/C++/Java programming skills.
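
The example formula x'+y*z from this project description can be checked by brute force. The sketch below is only an illustration of the problem statement, not one of the solvers the project studies; the function names `formula` and `satisfying_assignments` are hypothetical. It tries every 0/1 assignment, which takes time exponential in the number of variables and is why practical solvers rely on heuristics.

```python
from itertools import product


def formula(x, y, z):
    """Evaluate x' + y*z: (NOT x) OR (y AND z)."""
    return bool((not x) or (y and z))


def satisfying_assignments(f, num_vars):
    """Try all 2**num_vars assignments and return those that make f true."""
    return [bits for bits in product((0, 1), repeat=num_vars) if f(*bits)]


# Each solution is a tuple (x, y, z).
solutions = satisfying_assignments(formula, 3)
```

The three assignments listed in the text, (0, 0, 1), (1, 1, 1), and (0, 1, 1), all appear in `solutions`, along with (0, 0, 0) and (0, 1, 0), since any assignment with x=0 satisfies the formula.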
4. Subordinate Multithreading Architectures and Applications

Today, multithreading is beginning to penetrate the mass computing market and is already available in production volumes (e.g., Intel's Pentium 4 with Hyper-Threading). As multithreaded processors gain widespread acceptance, it becomes critical for workloads to effectively exploit the available thread-level parallelism. One obvious source of thread-level parallelism is multiprogrammed workloads (executing multiple applications together); unfortunately, many multiprogrammed workloads cannot provide a sustained source of thread-level parallelism. Another source of thread-level parallelism is parallel workloads. However, this approach requires explicit parallelism, which is usually too challenging for compilers to extract and too labor-intensive for humans to write.

Given the spare execution resources in future under-utilized multithreaded processors, an extremely promising approach is subordinate multithreading. In addition to running workload (or "main") threads, subordinate multithreading also runs subordinate threads to perform computations on behalf of the main threads. These helper threads can assist or extend the functionality of the main thread in some fashion, or attempt to directly improve application performance. Currently, we are investigating using such helper threads to perform data prefetching. We have recently built the first compiler to automatically generate prefetching code that runs as subordinate threads to improve the performance of the main thread. In ongoing research, we are also investigating new uses of helper threads, including traditional runtime or operating-system-level optimizations and functions such as dynamic compilation, garbage collection, and on-line performance feedback.

Students participating in this project will investigate new hardware techniques to support subordinate multithreading, as well as study novel applications of subordinate threads.
Tasks will include developing simulation models in the context of multithreaded processor simulators, porting applications to simulation infrastructure, and running experiments.
5. Synthesis-assistance and Compilation Software for Embedded Systems

The term embedded systems refers to the class of application-specific computer systems used today as controllers and monitors in a variety of consumer and business applications. Such embedded systems are ubiquitous today in cell phones, DVD players, PDAs, household appliances, consumer electronics, communication systems, remote sensing, and vehicle control, to name just a few. Since 1999, the dollar volume (total sales) of embedded CPUs has exceeded that of desktop CPUs such as the Pentium, and it is growing much more rapidly. Over $50 billion in embedded CPUs were sold in the year 2001. Embedded systems promise to revolutionize our day-to-day lives with ever-increasing intelligence and connectivity at decreasing cost.

Yet many of the software technologies for embedded systems remain antiquated, from compilers that produce code whose performance and power consumption are substantially inferior to those of hand-written assembly language programs, to synthesis software that provides little guidance to the designer on what decisions to make. These shortcomings decrease system performance and increase time-to-market and software and hardware development cost.

This project focuses on developing fundamental technologies to propel the software for embedded systems to the next level of automation. Opportunities along two directions are being explored: increased automation of the synthesis of embedded soft cores, and new compiler strategies for the management of heterogeneous memories in embedded systems. Both directions rely on improved compiler analysis of application domains. When deployed, these innovations are expected to substantially improve the time to market, cost, and performance of embedded designs.

MERIT interns on this project will function as full-fledged group members and will work with Dr. Barua and his graduate students in delivering key infrastructure components or technologies.
Only work that is critical to the project will be assigned to interns; consequently, if the intern is able to complete the project, there is a good chance of co-authorship on a conference or journal publication.

Prerequisites: Programming experience in C and/or C++ is a requirement --
the more the better. Courses in Data Structures (CMSC 420) and Computer
Organization (ENEE 350) will be a significant plus.