Specialized hardware accelerators (e.g., ASICs, FPGAs) are a promising solution for dealing with our increasing computational demands, as they offer high parallelism and energy efficiency. However, a major barrier to their global success is the difficulty of hardware design. High-Level Synthesis (HLS) tools generate digital hardware designs from high-level programming languages (e.g., C/C++) and promise to liberate designers from low-level hardware description details. Yet, HLS tools are still primarily usable by designers with hardware expertise and produce acceptable results only for limited classes of applications.
The research of DYNAMO focuses on methods to generate good-quality hardware designs from high-level programming languages. Our goal is to advance the usability of digital hardware platforms and to enable a broad range of programmers to accelerate emerging compute-intensive applications. Our research spans several areas of computer science and electrical engineering, including compilers, formal methods, electronic design automation, digital hardware design, and computer architecture.
Current Research Topics
Synthesizing General-Purpose Code into Dynamically Scheduled Circuits
Despite their recent commercial success and their ability to accelerate certain types of applications, standard HLS tools still rely heavily on manual code restructuring, annotations, and exhaustive trial and error with various configuration parameters. Additionally, these tools deliver poor performance and have limited capabilities in applications where critical information on memory and control dependencies is not available at compile time. To overcome these limitations, we are actively developing Dynamatic, an open-source HLS compiler that produces dynamically scheduled dataflow circuits from C/C++ code. The goal is to achieve high-performance circuits out of the box and to realize behaviors that are beyond the capabilities of standard HLS tools while minimizing the programmer’s effort and need for hardware design expertise.
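To make the notion of a dependency that is unknown at compile time concrete, here is a minimal sketch of the kind of loop in question. The kernel, its name (weighted_histogram), and its parameters are illustrative assumptions and are not taken from the Dynamatic code base. The bin index x[i] is only known at run time, so a compiler cannot tell whether one iteration reads the bin that the previous iteration has just updated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (an assumption for illustration): weighted histogram.
 * Whether iteration i+1 depends on iteration i is decided by the input data:
 * the two iterations conflict only when x[i+1] == x[i]. */
void weighted_histogram(const int *x, const float *w, float *hist,
                        size_t n, size_t bins)
{
    for (size_t i = 0; i < n; i++) {
        /* Guard against out-of-range bin indices. */
        assert(x[i] >= 0 && (size_t)x[i] < bins);
        /* Read-modify-write on an address computed from input data. */
        hist[x[i]] += w[i];
    }
}
```

A static schedule typically has to assume that every iteration may conflict with the previous one, and it spaces the iterations accordingly. A dynamically scheduled circuit can, in principle, stall only when two nearby iterations actually touch the same bin.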
Introducing Features of Superscalar Processors to Hardware Accelerators
Practically all high-end application processors in our computing devices and data centers are superscalar out-of-order processors: they develop the operation schedule dynamically, execute instructions speculatively, and resolve memory dependencies at runtime. Hardware accelerators such as FPGAs are moving to data centers and facing broader application classes; thus, they will also need to satisfy the needs of general-purpose markets and exploit a variety of parallelism opportunities. To this end, we are developing hardware mechanisms to introduce key characteristics of superscalar out-of-order processors (e.g., out-of-order memory accesses and speculative execution) into the context of hardware acceleration. We are also developing HLS compilation analyses and techniques to automatically determine where these features are useful and to incorporate them into a complete hardware design without programmer intervention.
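As an illustration of where speculative execution could help an accelerator, consider the sketch below. The function name (dot_until) and its parameters are hypothetical and chosen only for this example. The loop's exit condition depends on the running sum, which in hardware is produced by a multi-cycle floating-point multiply and add.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (an assumption for illustration): accumulate products
 * until the running sum reaches a threshold. Returns the number of
 * iterations executed and stores the final sum in *sum_out. */
size_t dot_until(const float *a, const float *b, size_t n,
                 float threshold, float *sum_out)
{
    assert(sum_out != NULL);
    float sum = 0.0f;
    size_t i = 0;
    /* The decision to run iteration i+1 needs the sum from iteration i. */
    while (i < n && sum < threshold) {
        sum += a[i] * b[i];
        i++;
    }
    *sum_out = sum;
    return i;
}
```

Without speculation, each iteration has to wait until the comparison from the previous one has resolved. A speculative circuit could tentatively start the next iteration and discard its results if the loop turns out to have exited.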
Formal Methods in High-Level Synthesis
HLS tools typically rely on functional verification of a particular circuit through extensive hardware simulation. However, this process is infeasible or extremely time-consuming for complex designs. Furthermore, the lack of formal proofs of correctness for particular compilation steps and the resulting hardware modules prevents the adoption of HLS in safety-critical domains or when hardware production is expensive (e.g., ASICs). DYNAMO employs formal methods to enhance the reliability of HLS tools, improve the quality of the resulting circuits, and reduce the programmer’s verification effort. We use Petri nets to describe and optimize circuit behavior, area, and performance. We exploit model checking to verify critical circuit properties (e.g., under appropriate assumptions, does the circuit execution always terminate?) and to prove the correctness of particular compiler transformations (e.g., is a translation of a semantically correct compiler representation into a circuit always correct?). We also address the scalability issues that arise when applying these techniques to complex circuits.
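The idea of checking a property on a Petri net model can be sketched with a small explicit-state search. This is a toy under stated assumptions, not DYNAMO's tooling: the net (a one-slot buffer feeding a single functional unit), all type and function names, and the exploration caps are invented for this example. It checks deadlock-freedom, which is a simpler property than the termination question mentioned above, and practical model checkers generally use far more scalable techniques than plain enumeration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical net: a one-slot buffer in front of one functional unit. */
enum { FREE, FULL, IDLE, BUSY, PLACES };
enum { WRITE, START, FINISH, TRANSITIONS };
#define MAX_STATES 256   /* exploration cap, an assumption of this sketch */
#define TOKEN_BOUND 4    /* per-place token cap, also an assumption */

typedef struct { unsigned char tokens[PLACES]; } Marking;
typedef struct {
    unsigned char pre[TRANSITIONS][PLACES];   /* tokens consumed */
    unsigned char post[TRANSITIONS][PLACES];  /* tokens produced */
} PetriNet;
typedef struct { int states; bool deadlock; bool overflow; } Report;

PetriNet handshake_net(bool finish_returns_idle)
{
    PetriNet net;
    memset(&net, 0, sizeof net);
    net.pre[WRITE][FREE] = 1;   net.post[WRITE][FULL] = 1;
    net.pre[START][FULL] = 1;   net.pre[START][IDLE] = 1;
    net.post[START][BUSY] = 1;  net.post[START][FREE] = 1;
    net.pre[FINISH][BUSY] = 1;
    net.post[FINISH][IDLE] = finish_returns_idle ? 1 : 0;
    return net;
}

static bool enabled(const PetriNet *net, const Marking *m, int t)
{
    for (int p = 0; p < PLACES; p++)
        if (m->tokens[p] < net->pre[t][p]) return false;
    return true;
}

/* Breadth-first enumeration of all markings reachable from `initial`. */
Report explore(const PetriNet *net, Marking initial)
{
    Marking seen[MAX_STATES];
    Report r = {0, false, false};
    int head = 0;
    seen[r.states++] = initial;
    while (head < r.states) {
        Marking cur = seen[head++];
        bool any_enabled = false;
        for (int t = 0; t < TRANSITIONS; t++) {
            if (!enabled(net, &cur, t)) continue;
            any_enabled = true;
            Marking next = cur;
            bool too_many = false;
            for (int p = 0; p < PLACES; p++) {
                next.tokens[p] = (unsigned char)
                    (cur.tokens[p] - net->pre[t][p] + net->post[t][p]);
                if (next.tokens[p] > TOKEN_BOUND) too_many = true;
            }
            if (too_many) { r.overflow = true; continue; }
            bool known = false;
            for (int i = 0; i < r.states && !known; i++)
                known = memcmp(&seen[i], &next, sizeof next) == 0;
            if (known) continue;
            if (r.states == MAX_STATES) { r.overflow = true; continue; }
            seen[r.states++] = next;
        }
        if (!any_enabled) r.deadlock = true;  /* no transition can fire */
    }
    return r;
}
```

Starting from one token in FREE and one in IDLE, the search visits every reachable marking of the handshake and reports no deadlock. If FINISH fails to return the IDLE token (handshake_net(false)), the search finds a marking in which nothing can fire. The number of markings grows quickly with circuit size, which is one source of the scalability issues noted above.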
FPGA-Oriented Area and Timing Optimizations
HLS aims to automatically produce circuits whose area and performance are comparable to those of hand-optimized hardware designs. Yet, starting from a high-level software description, it is often challenging for the HLS tool to account for the physical properties of the underlying hardware, especially when it comes to complex reconfigurable devices such as FPGAs. DYNAMO develops a variety of area and timing optimizations for HLS-produced circuits. We redefine classic synthesis techniques such as retiming, pipelining, and resource sharing and apply them to the context of dynamically scheduled circuits produced from C/C++ code. We explore optimization techniques that consider the peculiarities of the FPGA architecture and account for the effects of FPGA synthesis, placement, and routing to produce area- and timing-efficient FPGA circuits.