TITLE
EXPLORING RAG IN MEDICAL QUESTION ANSWERING: INTEGRATING LLMS AND VECTOR DATABASES
AUTHOR(S)
Matija Špeletić, Stevica Cvetković, Milan Protić, Saša V. Nikolić
ABSTRACT
This paper presents the design, implementation, and evaluation of a Retrieval-Augmented Generation (RAG) system for medical question answering. The proposed system integrates a state-of-the-art large language model (LLM) with a vector database powered by Neo4j for efficient information retrieval. To strengthen the retrieval component, we use Nomic’s embedding model to generate vector representations of medical documents. The architecture couples retrieval with generation, so that the LLM produces context-aware responses grounded in relevant clinical trial data. A test dataset was created by extracting clinical trial reports from open-source documents and generating synthetic questions with a language model. Our experimental results demonstrate the potential of RAG-based systems in the medical domain, highlighting their ability to provide accurate and context-rich answers. The study discusses both the strengths and the limitations of the RAG approach for specialized domains such as healthcare.
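The retrieve-then-generate flow described in the abstract can be illustrated with a minimal, self-contained sketch. It is not the authors' implementation: a toy bag-of-words vector stands in for Nomic's embedding model, an in-memory list stands in for the Neo4j vector index, and the sketch stops at prompt construction instead of calling an LLM. The function names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the three sample passages are invented placeholders for illustration only.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, index, k=2):
    """Return the k passages whose vectors are most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(question, passages):
    """Assemble the retrieved passages and the question into one LLM prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Invented placeholder passages, not data from the paper.
DOCS = [
    "Trial A evaluated drug X for hypertension in 200 adult patients.",
    "Trial B compared insulin regimens in patients with type 2 diabetes.",
    "Trial C studied physiotherapy outcomes after knee surgery.",
]

# In-memory stand-in for a vector database: (passage, vector) pairs.
INDEX = [(d, embed(d)) for d in DOCS]

if __name__ == "__main__":
    question = "Which trial studied insulin in diabetes patients?"
    print(build_prompt(question, retrieve(question, INDEX, k=1)))
```

In a full system the prompt returned by `build_prompt` would be passed to the LLM, and the linear scan in `retrieve` would be replaced by a nearest-neighbor query against the vector database.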
DOI
http://www.doi.org/10.70456/GAJM2853
DOWNLOAD
https://unitechsp.tugab.bg/images/2024/4-CST/s4_p127_v1.pdf
How to cite this article:
Matija Špeletić, Stevica Cvetković, Milan Protić, Saša V. Nikolić, EXPLORING RAG IN MEDICAL QUESTION ANSWERING: INTEGRATING LLMS AND VECTOR DATABASES, UNITECH – SELECTED PAPERS - 2024