

Identifier 000358687
Title Runtime support for programming explicit communication chip multiprocessors
Author Ζαμπετάκης, Μιχαήλ
Thesis advisor Νικολόπουλος, Δημήτριος
Abstract Modern chip multiprocessors (CMPs) with explicitly managed local memories offer robust and efficient development systems. Explicitly managed memories let programmers control the locality and exchange of their programs' data. With this direct control over data exchange, programmers can develop applications that achieve high performance by optimizing data transfers and distributing data appropriately between local and global memories. To fully exploit the available resources, however, applications must be tailored to each system. In this work we develop several applications on a modern multicore development system based on multiple processors and local memories managed by explicit and implicit communication mechanisms. We use the available communication mechanisms to manage memories explicitly and apply data exchange patterns that maximize the resource utilization of the system. For each application, we measure and analyze its performance under various conditions. We develop a Fast Fourier Transform (FFT), a bitonic sort algorithm, three applications based on the MapReduce framework, and a stream application that measures the performance of the communication mechanisms by stressing the system. The system we use was developed at the CARV (Computer Architecture and VLSI Systems) laboratory of FORTH (Foundation for Research and Technology - Hellas) and is based on a modern FPGA (Field Programmable Gate Array) development platform. In this thesis we introduce modules and functionality in system software libraries to exploit explicit on-chip communication mechanisms in parallel programming models.
Moreover, we port the applications to the development system, analyze their performance, and report techniques for exploiting the available explicit communication mechanisms to achieve high performance. We measure that performance and the minimum granularity at which the parallel applications gain speedup in various cases. Finally, we identify the difficulties and limitations of porting the applications to the prototype system. For the bitonic sort application, parallel execution achieves speedup even when sequential execution takes as few as 700 cycles. In the MapReduce applications we achieve speedups of almost 2 and 4 on two and four processors, respectively, and in the stream application we stress the communication mechanisms of the prototype system and achieve an on-chip data transfer rate of up to 3200 MB/s.
Language English
Issue date 2010-07-16
Collection   School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses
  Type of Work--Post-graduate theses
