Generative Artificial Intelligence with Emphasis on Large Language Models: Review and Current Trends

Year : 2025 | Volume : 12 | Issue : 01 | Page : 40–46
By

  • Joshua Michael,
  • Fabian Barreto,
  • Sujata Deshmukh

  1. Assistant Professor, Department of Computer Engineering, Fr. Conceicao Rodrigues College of Engineering, Fr. Agnel Ashram, Bandra West, Mumbai, Maharashtra, India
  2. Assistant Professor, Department of Electronics and Telecommunication, Xavier Institute of Engineering, Mahim West, Mumbai, Maharashtra, India
  3. Professor, Department of Computer Engineering, Fr. Conceicao Rodrigues College of Engineering, Fr. Agnel Ashram, Bandra West, Mumbai, Maharashtra, India

Abstract

Generative Artificial Intelligence (Generative AI) refers to AI systems that generate new content, such as text and images, by learning the patterns present in existing text and image data. Generative AI opened an era of major advancement in AI, producing more refined and human-like results. Large Language Models (LLMs) are a part of Generative AI with applications in Natural Language Processing (NLP) such as text generation, translation, summarization, sentiment detection, and question answering. This rapidly advancing technology is shaping the future of creative and analytical tasks across multiple industries. OpenAI released ChatGPT, a conversational system built on an LLM. Tech giants such as Google and Meta then entered the race to develop and improve on existing models, with products like Gemma and Llama 3, respectively. LLMs also raise ethical issues, including biases inherent in their training data and the risk of being used to create misleading information. This study reviews both proprietary and open-source LLMs in the literature and discusses cost considerations, current trends, and future scope.

Keywords: ChatGPT, deep learning, generative artificial intelligence, large language models, natural language processing, retrieval augmented generation
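
As a minimal illustrative sketch of the text-generation task named in the abstract (not the authors' own code), the snippet below uses the open-source Hugging Face transformers pipeline API with GPT-2, a small, freely available LLM chosen purely for illustration; the larger proprietary and open-source models the review surveys (e.g., GPT-4, Gemma, Llama 3) are accessed through their own APIs or released checkpoints:

    # Minimal text-generation sketch (Python, Hugging Face transformers).
    # GPT-2 is an illustrative stand-in for the larger LLMs surveyed in this review.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    outputs = generator(
        "Generative AI is transforming",
        max_new_tokens=40,       # bound the length of the generated continuation
        num_return_sequences=1,  # request a single sampled completion
    )
    print(outputs[0]["generated_text"])

The same pipeline interface also exposes other tasks the abstract lists, including "translation", "summarization", and "question-answering".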

[This article belongs to Journal of Artificial Intelligence Research & Advances]

How to cite this article:
Joshua Michael, Fabian Barreto, Sujata Deshmukh. Generative Artificial Intelligence with Emphasis on Large Language Models: Review and Current Trends. Journal of Artificial Intelligence Research & Advances. 2024; 12(01):40-46.
How to cite this URL:
Joshua Michael, Fabian Barreto, Sujata Deshmukh. Generative Artificial Intelligence with Emphasis on Large Language Models: Review and Current Trends. Journal of Artificial Intelligence Research & Advances. 2024; 12(01):40-46. Available from: https://journals.stmjournals.com/joaira/article=2024/view=191589


Regular Issue | Subscription | Review Article
Volume 12
Issue 01
Received 05/08/2024
Accepted 12/11/2024
Published 30/12/2024

