Unravelling the Power of Avro and Hadoop: Revolutionising Big Data Processing and Serialization

Year: 2024 | Volume: 02 | Issue: 01 | Page: 8-13
By

    Rajesh Yadav

  1. Assistant Professor, Department of Computer Science, SIES College of Arts, Science & Commerce (Autonomous), Maharashtra, India

Abstract

As technology advances, the volume of accumulated data keeps growing, and organisations are constantly seeking innovative ways to process and analyse it. Hadoop has proved a game-changer in the realm of big data processing, while Avro has emerged as a solution for data serialization. This review explores Avro and Hadoop: their basics, features, and key components, the connection between the two, and real-world examples of Avro's integration with Hadoop.

Keywords: Avro, data serialization, Hadoop HDFS, MapReduce, big data

[This article belongs to International Journal of Data Structure Studies (ijdss)]

How to cite this article: Rajesh Yadav. Unravelling the Power of Avro and Hadoop: Revolutionising Big Data Processing and Serialization. ijdss. 2024; 02: 8-13.
How to cite this URL: Rajesh Yadav. Unravelling the Power of Avro and Hadoop: Revolutionising Big Data Processing and Serialization. ijdss. 2024 [cited 2024 Feb 21]; 02: 8-13. Available from: https://journals.stmjournals.com/ijdss/article=2024/view=133422


Regular Issue | Subscription | Review Article
Received November 22, 2023
Accepted November 29, 2023
Published February 21, 2024