Serena Curzel - Alumni

Major research topic

Modern High-Level Synthesis: improving productivity with a multi-level approach

Abstract

High-Level Synthesis (HLS) tools simplify the design of hardware accelerators by automatically generating Verilog/VHDL code from a general-purpose software programming language, usually C/C++. They apply a wide range of optimization techniques in the process, most of them on a low-level intermediate representation (IR) of the code. Because of the mismatch between the requirements of hardware descriptions and the characteristics of input languages, HLS tools often rely on users to add specific directives (pragmas) that augment the input specification and guide the generation of optimized hardware. A good result thus still requires hardware design knowledge and non-trivial design space exploration, which can be an obstacle for domain scientists seeking to accelerate applications written, for example, in Python-based programming frameworks.
This thesis proposes a modern approach based on multi-level compiler technologies to bridge the gap between HLS and high-level frameworks, using domain-specific abstractions to solve domain-specific problems. The key enabling technology is the Multi-Level Intermediate Representation (MLIR), a framework for building reusable compiler infrastructure that is inspired by (and part of) the LLVM project.
The proposed approach uses MLIR to introduce new optimizations at appropriate levels of abstraction outside the HLS tool while still relying on years of HLS research in the low-level hardware generation steps.
Users and developers of HLS tools can thus increase their productivity, obtain accelerators with higher performance, and not be limited by the features of a specific (possibly closed-source) backend.
The presented tools and techniques were designed, implemented, and tested to synthesize machine learning algorithms, but they are broadly applicable to any input specification written in a language that can be translated to MLIR. Generated accelerators can be deployed on Field-Programmable Gate Arrays (FPGAs) or Application-Specific Integrated Circuits (ASICs), and they can reach an energy efficiency of roughly 10-100 GFLOPS/W without any manual modification of the code.
