
Doctoral theses

Identifier 000364160
Title Direct communication and synchronization mechanisms in chip multiprocessors
Alternative Title Μηχανισμοί απευθείας επικοινωνίας κ' συγχρονισμού σε πολυεπεξεργαστές ψηφίδας
Author Καββαδίας, Σταμάτης
Thesis advisor Κατεβαίνης, Μανόλης
Reviewer Νικολόπουλος, Δημήτρης
Πνευματικάτος, Διονύσιος
Μπίλας, Άγγελος
Abstract The physical constraints of transistor integration have made chip multiprocessors (CMPs) a necessity, and increasing the number of cores (CPUs) the best approach so far for exploiting more transistors. The feasible number of cores per chip is already growing beyond our ability to utilize them for general purposes. Although many important application domains can easily benefit from more cores, scaling single-application performance with multiprocessing remains, in general, a tough challenge for computer science. The use of per-core on-chip memories managed in software with RDMA, as adopted in the IBM Cell processor, has challenged the mainstream approach of using coherent caches for the on-chip memory hierarchy of CMPs. The two architectures have largely different implications for software, and researchers are divided over which is the most suitable approach to multicore exploitation. We demonstrate a combination of the two approaches, integrating a network interface (NI) for explicit interprocessor communication into the cache and allowing flexible dynamic allocation of on-chip memory to hardware-managed (cache) and software-managed parts. The network interface architecture combines messages and RDMA-based transfers with remote load-store access to the software-managed memories, and allows multipath routing in the processor interconnection network. We propose the technique of event responses, which efficiently exploits the normal cache access flow for network interface functions, and prototype our combined approach in an FPGA-based multicore system; the prototype shows reasonable logic overhead (less than 20%) in cache datapaths and controllers for the basic NI functionality. We also design and implement synchronization mechanisms in the network interface (counters and queues) that take advantage of event responses and exploit the cache tag and data arrays for synchronization state.
We propose novel queues that efficiently support multiple readers, providing hardware lock and job dispatching services, and counters that enable selective fences for explicit transfers and can be synthesized to implement barriers in the memory system. Evaluation of the cache-integrated NI on the hardware prototype demonstrates the flexibility of exploiting both cacheable and explicitly managed data, and the potential advantages of NI transfer mechanism alternatives. Simulations of CMPs with up to 128 cores show that our synchronization primitives provide significant benefits for contended locks and barriers, and can improve task scheduling efficiency in the Cilk run-time system, especially for regular codes.
Language English
Subject Cache memory
On-chip communication
Synchronization mechanisms
Communication (Επικοινωνία)
Cache memory (Κρυφή μνήμη)
Scratchpad memory (Πρόχειρη μνήμη)
Synchronization (Συγχρονισμός)
Issue date 2011-03-18
Collection   School/Department--School of Sciences and Engineering--Department of Computer Science--Doctoral theses
  Type of Work--Doctoral theses
Permanent Link https://elocus.lib.uoc.gr//dlib/c/6/9/metadata-dlib-6f512544cf84668982ed058201242537_1300188596.tkl
