Social Media-based Polling Data Processing using MapReduce on the Hadoop Framework

Authors

  • Yusuf Yunadian, Telkom University
  • Hilal H. Nuha, Telkom University
  • Sidik Prabowo, Telkom University

DOI:

https://doi.org/10.70323/yrftg243

Keywords:

Polling, Hadoop, MapReduce, WordCount, Processing Speed

Abstract

Processing the data collected by a polling system is important because the results can serve as a reference for addressing community issues. Social media use continues to grow, and Indonesia ranks fifth globally in Twitter usage. With large data sets, processing speed becomes a concern, which prompted the authors to develop a more time-efficient system for handling polling data gathered through social media. Hadoop is a strong candidate for this purpose because of its two main modules: the Hadoop Distributed File System (HDFS) for distributed storage and MapReduce for distributed computation. Experiments compared a wordcount program run with MapReduce on Hadoop against a version run without MapReduce, and the MapReduce version processed the data faster. On average, data processing with MapReduce on Hadoop was 1.3 times faster than processing without MapReduce.
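
The wordcount computation mentioned in the abstract can be illustrated with a minimal sketch of the map, shuffle, and reduce steps. This is not the authors' implementation and it does not run on Hadoop: it simulates the three phases in a single Python process, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample posts are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted counts under their word key."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the grouped counts to get one total per word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return reduce_phase(shuffle_phase(mapped))


# Hypothetical poll responses, invented for illustration only.
sample_posts = [
    "vote A",
    "vote B",
    "vote a",
]
print(word_count(sample_posts))
```

On a real cluster the map and reduce steps would run in parallel across nodes over data stored in HDFS, which is where any speed advantage would come from; this single-process sketch only shows the logic of the computation.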

Published

2024-03-30

Section

Articles