This is an unedited manuscript accepted for publication and provided as an Article in Press for early access at the author’s request. The article will undergo copyediting, typesetting, and galley proof review before final publication. Please be aware that errors may be identified during production that could affect the content. All legal disclaimers of the journal apply.
Vanisha Mavi
- Research Scholar, Department of CSE, Shobhit Institute of Engineering & Technology (Deemed to be University), Meerut, Uttar Pradesh, India
Abstract
In today’s data-driven era, handling vast amounts of information efficiently has become increasingly important. Data compression plays a vital role here: it encodes information so that significantly fewer bits are needed to store or transmit a file. By shrinking data to a more compact form, compression techniques save storage space, reduce bandwidth consumption, and speed up data transfer.
A wide variety of data compression techniques exist today, operating across both online and offline environments. However, the sheer number of available options makes it genuinely challenging for practitioners and researchers to determine which technique best fits their specific needs. Every method has its own set of trade-offs regarding speed, compression efficiency, and computational complexity.
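The trade-off between compression efficiency and speed can be illustrated with a minimal sketch. It assumes Python's standard-library codecs (`zlib`, `bz2`, `lzma`) as stand-ins for the families of techniques discussed; these are not necessarily the methods compared in the paper. The helper name `compare_codecs` and the sample text are hypothetical.

```python
import bz2
import lzma
import time
import zlib


def compare_codecs(text: str) -> dict:
    """Compress the same text with three standard-library codecs.

    Returns, per codec, the original size, compressed size,
    compressed/original ratio, and compression time in seconds.
    """
    raw = text.encode("utf-8")
    codecs = {
        "zlib": (zlib.compress, zlib.decompress),
        "bz2": (bz2.compress, bz2.decompress),
        "lzma": (lzma.compress, lzma.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        start = time.perf_counter()
        packed = compress(raw)
        elapsed = time.perf_counter() - start
        # Lossless check: decompression must restore the exact input.
        if decompress(packed) != raw:
            raise ValueError(f"{name} round trip failed")
        results[name] = {
            "original_bytes": len(raw),
            "compressed_bytes": len(packed),
            "ratio": len(packed) / len(raw),
            "seconds": elapsed,
        }
    return results


# Hypothetical, highly repetitive sample text (an assumption for illustration).
sample = "big data needs compact storage and fast transfer. " * 200

for name, stats in compare_codecs(sample).items():
    print(
        f"{name}: {stats['original_bytes']} -> {stats['compressed_bytes']} bytes "
        f"(ratio {stats['ratio']:.3f}, {stats['seconds'] * 1000:.2f} ms)"
    )
```

The absolute numbers depend on the input and the machine, so the sketch is useful only for comparing codecs on the same data, which is the kind of side-by-side comparison a practitioner would need before choosing a technique.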
In this paper, we explore and compare several key techniques used to compress and decompress text data, aiming to give readers a clearer understanding of how these methods work in practice. Beyond compression alone, we also take a closer look at Hadoop, one of the most widely adopted frameworks for distributed data processing, covering its design philosophy, development journey, and current standing in real-world deployment.
Additionally, we provide a comprehensive overview of YARN (Yet Another Resource Negotiator), the resource management layer within the Hadoop ecosystem. We discuss its core concepts, the challenges that arise during its use, and the tools and techniques that can be used alongside it to build more efficient and scalable data pipelines.
Keywords: Hadoop, MapReduce, YARN, Data compression, Textual substitution
Vanisha Mavi. A Dynamic Text Compression Model for Big Data Applications Using Hadoop. International Journal of Algorithms Design and Analysis Review. 2026; 04(02):-. Available from: https://journals.stmjournals.com/ijadar/article=2026/view=245694

International Journal of Algorithms Design and Analysis Review
| Volume | 04 |
| Issue | 02 |
| Received | 15/04/2026 |
| Accepted | 29/04/2026 |
| Published | 02/06/2026 |
| Publication Time | 48 Days |