
Item Details

Scheduling Strategies in a Main-Memory MapReduce Framework, Approach for countering Reduce side skew

Venkatachalapathy, M. (2012). Scheduling Strategies in a Main-Memory MapReduce Framework, Approach for countering Reduce side skew. Master Thesis, Universität des Saarlandes, Saarbrücken.


Basic Information

Genre: Thesis

Files

2012_Mahendiran Venkatachalapathy_MSc Thesis.pdf (any full text), 2MB
File permalink: -
Name: 2012_Mahendiran Venkatachalapathy_MSc Thesis.pdf
Description: -
OA-Status:
Visibility: Restricted (Max Planck Institute for Informatics, MSIN)
MIME-Type / Checksum: application/pdf
Technical Metadata:
Copyright Date: -
Copyright Info: -
CC License: -


Creators

Creators:
Venkatachalapathy, Mahendiran (1), Author
Dittrich, Jens (2), Advisor
Quiané-Ruiz, Jorge-Arnulfo (3), Referee
Affiliations:
(1) International Max Planck Research School, MPI for Informatics, Max Planck Society, ou_1116551
(2) Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018
(3) External Organizations, ou_persistent22

Content

Free keywords: -
Abstract: Over the past few decades, there has been a multifold increase in the amount of digital data being generated. Various attempts are being made to process this vast amount of data quickly and efficiently. Hadoop MapReduce is one such software framework that has gained popularity in the last few years. It provides a reliable and easier way to process huge amounts of data in parallel on large computing clusters. However, Hadoop always persists intermediate results to the local disk. As a result, Hadoop usually suffers from long execution runtimes, as it typically pays a high I/O cost for running jobs. State-of-the-art computing clusters have enough main memory capacity to hold terabytes of data in main memory. We have built the M3R (Main Memory MapReduce) framework, a prototype for generic main-memory-based data processing. M3R can execute MapReduce jobs as well as general data processing jobs. This master's thesis focuses in particular on countering the data skew problem for MapReduce jobs on M3R. Intermediate data that follows a skewed distribution can lead to computational imbalance among the reduce tasks, resulting in longer MapReduce job execution times. This leaves scope for rebalancing the intermediate data and thereby reducing total job runtimes. We propose a novel dynamic data-rebalancing approach to counter reducer-side data skew. Our on-the-fly skew-countering approach attempts to detect the level of skew in the intermediate data and rebalances the intermediate data among the reduce tasks. The proposed mechanism performs all skew-countering activities during the execution of the actual MapReduce job. We have implemented this reduce-side skew-countering mechanism as part of the M3R framework. The experiments conducted to study the behavior of this M3R data-rebalancing approach show a significant reduction in MapReduce job runtimes. For data-skewed input, our proposed skew-control approach for M3R reduced the total MapReduce job runtime (up to 31 ) when compared to M3R without skew control.
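The abstract does not spell out how skew is detected or how data is rebalanced, so the following is only an illustrative sketch of the general idea, not the thesis's algorithm. It assumes integer keys, a static modulo partitioner as the baseline, a max-to-mean load ratio as the skew measure, and a greedy largest-group-first reassignment. All function names and the threshold value are hypothetical.

```python
import heapq


def skew_ratio(loads):
    """Largest reducer load divided by the mean load (1.0 = perfectly balanced)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean > 0 else 1.0


def static_loads(key_counts, num_reducers):
    """Per-reducer record counts under a plain modulo partitioner (assumed baseline)."""
    loads = [0] * num_reducers
    for key, count in key_counts.items():
        loads[key % num_reducers] += count
    return loads


def rebalance(key_counts, num_reducers):
    """Greedy reassignment: place the largest key groups first, each on the
    currently least-loaded reducer. Returns (key -> reducer, per-reducer loads)."""
    heap = [(0, r) for r in range(num_reducers)]
    heapq.heapify(heap)
    assignment = {}
    for key, count in sorted(key_counts.items(), key=lambda kv: -kv[1]):
        load, reducer = heapq.heappop(heap)
        assignment[key] = reducer
        heapq.heappush(heap, (load + count, reducer))
    loads = [0] * num_reducers
    for key, reducer in assignment.items():
        loads[reducer] += key_counts[key]
    return assignment, loads


def maybe_rebalance(key_counts, num_reducers, threshold=1.5):
    """Keep the static partitioning unless the measured skew exceeds the
    (hypothetical) threshold; otherwise rebalance."""
    loads = static_loads(key_counts, num_reducers)
    if skew_ratio(loads) <= threshold:
        return {key: key % num_reducers for key in key_counts}, loads
    return rebalance(key_counts, num_reducers)
```

For example, with record counts `{0: 100, 1: 10, 2: 10, 3: 10, 4: 90, 5: 10}` and two reducers, the modulo partitioner puts both heavy keys on the same reducer, while the greedy reassignment separates them. Whole key groups are moved rather than split, since all values for a key must reach the same reduce task.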

Details

Language(s): eng - English
Dates: 2012-09-25, 2012
Publication Status: Published
Pages: -
Publishing info: Saarbrücken : Universität des Saarlandes
Table of Contents: -
Rev. Type: -
Identifiers (DOI, ISBN, etc.): BibTex Citekey: Venkatachalapathy2012
Degree: Master
