Identifier 000368505
Title Design and Implementation of a Directory based Cache Coherence Protocol
Alternative Title Design and implementation of a directory-type cache coherence protocol
Author Τσαλιαγκός, Δημήτριος Μιχαήλ
Thesis advisor Κατεβαίνης, Μανώλης
Abstract As the number of processors per chip increases, so does the need for efficient, high-speed communication support, so that applications can exploit the numerous cores available in today's chip multiprocessors. Although explicit communication mechanisms such as RDMA can be used, implicit migration of data among the cores significantly simplifies programming in large-scale systems by providing a simple and intuitive programming model. This approach, however, introduces the problem known as cache coherence: multiple copies of the same data must be kept consistent. One solution is to use directory-based coherence protocols, which scale better than broadcast protocols because they reduce the volume of messages exchanged. In this thesis, a directory-based cache coherence protocol is implemented in a four-core FPGA-based prototype that was developed at the CARV (Computer Architecture and VLSI Systems) laboratory of FORTH (Foundation for Research and Technology - Hellas). The implemented protocol can support up to 16 processors and is integrated with the existing system, which also provides RDMA and special hardware support for synchronization and explicit management of cache memories. Our main finding is that the area overhead of the coherent system compared to a non-coherent one is only 4% in terms of logic. We evaluate our protocol using custom software micro-benchmarks that emulate common operations found in parallel applications, such as locks and barriers. A matrix multiplication algorithm and a producer-consumer benchmark were also developed for evaluating the protocol. Our results show that our design scales for the matrix multiplication algorithm, achieving a speedup that ranges from 1.96 to 3.74.
Language English
Subject Cache Coherence
Caches
Directory Protocols
Multiprocessors
Memory coherence directories
Multicore processors
Coherence of cache memories
Issue date 2011-07-15
Collection   School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses
  Type of Work--Post-graduate theses
