E-Locus - Institutional Repository of the University of Crete - Design-space exploration of FPGA architectures for efficient HPC acceleration

Identifier

000429201

Title

Design-space exploration of FPGA architectures for efficient HPC acceleration

Alternative Title

Εξερεύνηση αρχιτεκτονικών προγραμματιζόμενης λογικής για αποδοτική επιτάχυνση εφαρμογών υψηλών επιδόσεων

Author

Γιαουρτάς, Μιχαήλ Π.

Thesis advisor

Κατεβαίνης, Μανώλης

Reviewer

Παπαευσταθίου, Βασίλης
Πρατικάκης, Πολύβιος
Δόλλας, Απόστολος

Abstract

Field Programmable Gate Arrays (FPGAs) have an ever-expanding impact on a growing range of applications, from Deep Neural Networks to High-Performance Computing (HPC), as well as uses such as customized Instruction Set Extensions and computation offloading in systems with tightly coupled embedded FPGAs (eFPGAs). As applications diverge in complexity, performance, memory needs, and area limitations, a wider variety of FPGA architectures is needed. However, developing and implementing new FPGA architectures remains challenging and time-consuming, because they rely heavily on custom layout design and require design software and flows tailored to each specific architecture, which leads to the production of more generic products. Many academic works focus on automating the FPGA design generation process, in an attempt to promote customizability and reduce time-to-market. Other approaches target only the exploration process, seeking the optimal architecture for a specific use case with the help of area and delay estimation models. In this thesis we combine the two approaches. We develop an extension for the popular open-source tool Verilog-to-Routing (VTR) that exports user-specified FPGA architectures as Verilog, supports custom user hard blocks (RAMs, DSPs, FP units), and generates bitstreams for given benchmarks. Our objective is to create synthesizable, technology-independent RTL code that can be synthesized with any standard cell library. Using our proposed ASIC flow, we obtain real area and delay measurements for an FPGA architecture and then proceed with the exploration of optimal FPGA architectures for given sets of benchmarks. Using our VTR extension, we perform FPGA design-space exploration for a set of HPC-oriented benchmarks derived using Xilinx's High Level Synthesis (HLS).
Our exploration first identifies Pareto-optimal FPGA architectures with respect to the size of Lookup Tables (LUTs) and the number of LUTs per Configurable Logic Block (CLB), and then explores the size of routing channels and the configuration of wire segments. We also compare the optimal FPGA architectures derived from the HPC benchmarks with those derived from the generic MCNC benchmarks. Finally, we create TCL scripts for synthesis and back-end implementation (place and route) that adapt to any architectural characteristics and size, automating the ASIC flow for new FPGA chips.
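
As an illustration only, the Pareto-optimality step mentioned in the abstract can be sketched as a filter over (area, delay) measurements. This is a minimal sketch, not the thesis's implementation: the `ArchPoint` type, the `dominates` and `pareto_front` helpers, and all numeric values below are hypothetical placeholders, and the thesis may use different metrics or a different selection procedure.

```python
from typing import List, NamedTuple


class ArchPoint(NamedTuple):
    """One candidate FPGA architecture (hypothetical record layout)."""
    lut_size: int      # K: number of inputs per LUT
    cluster_size: int  # N: number of LUTs per CLB
    area: float        # measured area, arbitrary units
    delay: float       # measured critical-path delay, arbitrary units


def dominates(a: ArchPoint, b: ArchPoint) -> bool:
    """True if a is no worse than b in both metrics and strictly better in one."""
    no_worse = a.area <= b.area and a.delay <= b.delay
    strictly_better = a.area < b.area or a.delay < b.delay
    return no_worse and strictly_better


def pareto_front(points: List[ArchPoint]) -> List[ArchPoint]:
    """Keep only the points that no other point dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up sample values, NOT results from the thesis.
candidates = [
    ArchPoint(4, 8, 1.00, 1.30),
    ArchPoint(5, 8, 1.10, 1.10),
    ArchPoint(6, 10, 1.25, 1.00),
    ArchPoint(6, 4, 1.40, 1.20),  # dominated by (K=5, N=8)
    ArchPoint(4, 4, 1.05, 1.35),  # dominated by (K=4, N=8)
]

front = pareto_front(candidates)
for p in front:
    print(f"K={p.lut_size} N={p.cluster_size} area={p.area} delay={p.delay}")
```

The quadratic pairwise comparison is adequate for the small number of (K, N) combinations typical of such a sweep; a sort-based sweep would be preferable for much larger candidate sets.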

Language

English

Subject

ASIC

EDA

Programmable logic

Issue date

2020-03-27

Collection

School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses

Type of Work--Post-graduate theses
