Mayuri M. Rajpara, Hirenkumar Thakor
- Assistant Professor, Faculty of Computer Engineering, Noble University, Junagadh, Bamangam, Gujarat, India
- Associate Professor, Faculty of Computer Application, Noble University, Junagadh, Bamangam, Gujarat, India
Abstract
The exponential growth of big data in recent years has created an urgent need for innovative and efficient processing frameworks capable of managing and analyzing massive and complex datasets. Among these, MapReduce has gained prominence as a powerful tool for distributed data processing due to its simplicity and scalability. However, traditional MapReduce frameworks often encounter significant limitations in terms of efficiency, scalability, and resource optimization, particularly when handling large-scale and heterogeneous datasets. To address these challenges, this study introduces an advanced MapReduce algorithm that incorporates dynamic task scheduling, enhanced data partitioning strategies, and a robust load-balancing mechanism. These improvements are designed to optimize processing speed, improve resource utilization, and enhance fault tolerance. The proposed solution is rigorously evaluated through extensive experiments on a diverse range of datasets, demonstrating substantial improvements over conventional MapReduce frameworks. The results underscore the potential of the advanced algorithm to revolutionize big data processing, offering a scalable, efficient, and adaptable solution for a wide array of applications across multiple domains, fostering innovation in data-driven fields.
Keywords: Big data, MapReduce, external memory algorithm, big data processing, advanced algorithm, load balancing, data processing efficiency
[This article belongs to International Journal of Data Structure Studies]
Mayuri M. Rajpara, Hirenkumar Thakor. Optimizing Data Processing Efficiency in Big Data: Advanced MapReduce Algorithm Innovations. International Journal of Data Structure Studies. 2025; 03(01):1-7. Available from: https://journals.stmjournals.com/ijdss/article=2025/view=203300
| Volume | 03 |
| Issue | 01 |
| Received | 20/12/2024 |
| Accepted | 07/01/2025 |
| Published | 10/03/2025 |
| Publication Time | 80 Days |

