Different NLP Libraries for Indian Languages

Year: 2024 | Volume: 11 | Issue: 01 | Page: 8-14
By

Vinay Verma

Shivam Gupta

  1. Research Scholar, MCA, Thakur Institute of Management Studies, Career Development & Research (TIMSCDR), Mumbai, Maharashtra, India
  2. Research Scholar, MCA, Thakur Institute of Management Studies, Career Development & Research (TIMSCDR), Mumbai, Maharashtra, India

Abstract

Humans are by nature diverse and multilingual. While English is the most widely spoken language globally, Hindi is also widely used worldwide. Natural language processing (NLP) is the branch of artificial intelligence (AI) concerned with giving computers the ability to interpret text and spoken language in much the same way that humans can. Many everyday products, such as Alexa and Siri, are built on NLP. However, NLP for Indian languages still faces several ambiguities. Currently, NLP libraries such as iNLTK, Indic NLP, and Stanford NLP, among others, are used to process several Indian languages. This article surveys these NLP libraries, the processing tasks they support, and the accuracy of their models for Indian languages.

Keywords: iNLTK, tokenization, natural language processing (NLP), deep learning, multilingual NLP

[This article belongs to Journal of Open Source Developments (joosd)]

How to cite this article: Vinay Verma, Shivam Gupta. Different NLP Libraries for Indian Languages. Journal of Open Source Developments. 2024; 11(01):8-14.
How to cite this URL: Vinay Verma, Shivam Gupta. Different NLP Libraries for Indian Languages. Journal of Open Source Developments. 2024; 11(01):8-14. Available from: https://journals.stmjournals.com/joosd/article=2024/view=140071





Regular Issue | Subscription | Review Article
Received February 29, 2024
Accepted March 22, 2024
Published April 5, 2024