Identifier |
000463037 |
Title |
TeraHeap for G1: efficient caching for latency-sensitive applications |
Alternative Title |
TeraHeap in G1 for efficient caching in latency-sensitive applications |
Author
|
Χαραλάμπους, Μαρία Χ.
|
Thesis advisor
|
Πρατικάκης, Πολύβιος
|
Reviewer
|
Μπίλας, Άγγελος
Μαγκούτης, Κωνσταντίνος
|
Abstract |
Big data analytics frameworks like Apache Spark handle vast amounts of
data by moving objects outside the JVM managed heap (off-heap) onto a fast
storage device. However, this strategy leads to high serialization/deserialization
(S/D) costs and high garbage collection (GC) overhead when off-heap objects are
relocated back into the managed heap for processing. TeraHeap is a mechanism
that eliminates these overheads by extending the JVM to use a second,
high-capacity heap (H2) that is memory-mapped over a fast storage device
and coexists alongside the regular heap (H1). TeraHeap eliminates the S/D cost
through memory-mapped I/O and reduces the GC cost by avoiding GC
scans over the secondary heap. TeraHeap achieves this by (1) marking candidate
objects for placement in H2 and indicating when to move them, (2) tracking
live objects in H1 that are referenced from H2, and (3) reclaiming dead objects
in H2. TeraHeap was originally implemented in the Parallel Scavenge collector,
where long GC pauses are acceptable because the main concern is application
throughput. However, Parallel Scavenge does not perform well with real-time
applications because of its long pauses. The Garbage-First (G1) collector targets
latency-sensitive applications: its GC pauses are short and meet a soft real-time
goal with high probability while achieving high throughput.
In this thesis, we ported the TeraHeap mechanism to the G1 GC. We aim to
solve the off-heap problem of big data in latency-sensitive applications that need
quick responses without long GC pauses. Porting TeraHeap to G1 introduces
unique challenges not encountered with Parallel Scavenge, highlighting the design
differences between the two collectors. These challenges include (1) concurrent
heap marking alongside the application threads, (2) G1's use of evacuation rather
than compaction to keep pauses short during heap collection, and (3) the incremental
collection approach applied to the old generation. Our evaluation shows that, for
the same DRAM size, TeraHeap improves performance by up to 72% compared to
native Spark. However, there is still room for further work in refining this port,
given its demonstrated complexity and non-trivial nature.
|
Language |
English |
Subject |
Big data |
|
GC |
|
Garbage collection |
|
JVM |
|
TeraHeap |
|
Large volume of data |
Issue date |
2024-03-22 |
Collection
|
School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses
|
|
Type of Work--Post-graduate theses
|
Permanent Link |
https://elocus.lib.uoc.gr//dlib/0/8/d/metadata-dlib-1709548157-620220-11412.tkl
|