Retrieval Augmented Generation for Question Answering in Financial Documents


Open Access


Notice

This is an unedited manuscript accepted for publication and provided as an Article in Press for early access at the author's request. The article will undergo copyediting, typesetting, and galley proof review before final publication. Please be aware that errors may be identified during production that could affect the content. All legal disclaimers of the journal apply.


Year: 2025 | Volume: 12 | Issue: 02 | Pages: 08-14


    By

S. Sharon Benita, V. Srividhya

  1. Student, Master of Computer Applications, Avinashilingam Institute for Home Science and Higher Education for Women, Coimbatore, Tamil Nadu, India
  2. Associate Professor, Department of Computer Science, Avinashilingam Institute for Home Science and Higher Education for Women, Coimbatore, Tamil Nadu, India

Abstract


In recent years, the integration of Question Answering (QA) with Retrieval Augmented Generation (RAG) has transformed how users interact with large collections of documents. It uses Natural Language Processing (NLP) techniques to improve the accuracy and relevance of responses drawn from large document sets. RAG combines the strengths of retrieval and generation, allowing systems to extract context from multiple sources and produce natural responses. A key motivation for RAG is that it grounds a Large Language Model (LLM) in retrieved evidence rather than relying on the model's parameters alone, and the architecture can also incorporate personalized, domain-specific information. In this work, RAG is applied to financial documents to answer complex questions, retrieving domain-specific terminology and context to generate answers. A collection of financial documents is used to create question-answer pairs based on the information they contain, and the system's responses are compared against known ground-truth answers. Performance is evaluated with the ROUGE score, which measures the overlap between generated responses and matched reference answers across a range of questions. Overall, incorporating RAG into question answering frameworks can improve the user experience, confidence, and accuracy of automated solutions in the finance domain.
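The pipeline described above, embedding document chunks, retrieving the passage most relevant to a question, generating an answer, and scoring it against a reference with ROUGE, can be sketched in miniature. This is a hedged illustration, not the paper's implementation: the bag-of-words "embedding", the sample chunks, and the answer strings are hypothetical stand-ins for a real embedding model and an LLM.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding': a token-count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, chunks):
    """Return the document chunk most similar to the question (the 'R' in RAG)."""
    q = embed(question)
    return max(chunks, key=lambda c: cosine(q, embed(c)))

def rouge1_f1(generated, reference):
    """ROUGE-1 F1: unigram overlap between a generated answer and a reference answer."""
    g, r = Counter(generated.lower().split()), Counter(reference.lower().split())
    overlap = sum((g & r).values())  # multiset intersection of unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(g.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical financial chunks and QA pair, for illustration only.
chunks = [
    "Net revenue for fiscal 2024 increased to 4.2 billion dollars.",
    "The board declared a quarterly dividend of 0.35 per share.",
]
question = "What was the net revenue in fiscal 2024?"
context = retrieve(question, chunks)  # a real system would pass this context to an LLM
answer = "Net revenue in fiscal 2024 was 4.2 billion dollars."
reference = "Net revenue increased to 4.2 billion dollars in fiscal 2024."
score = rouge1_f1(answer, reference)
```

In a full system the retrieved context is inserted into the LLM prompt before generation; here only the retrieval and ROUGE-scoring steps are made concrete.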


Keywords: Retrieval augmented generation, natural language processing, embedding, large language model, ROUGE score

[This article belongs to Journal of Advanced Database Management & Systems]


How to cite this article:
S. Sharon Benita, V. Srividhya. Retrieval Augmented Generation for Question Answering in Financial Documents. Journal of Advanced Database Management & Systems. 10/09/2025; 12(02):08-14.


How to cite this URL:
S. Sharon Benita, V. Srividhya. Retrieval Augmented Generation for Question Answering in Financial Documents. Journal of Advanced Database Management & Systems. 10/09/2025; 12(02):08-14. Available from: https://journals.stmjournals.com/joadms/article=10/09/2025/view=0




Regular Issue | Subscription | Review Article


Volume 12
Issue 02
Received 28/04/2025
Accepted 20/06/2025
Published 10/09/2025
Publication Time 135 Days
