LLM-Grounded Citations for Trustworthy RAG Answers
The Challenge
Citations are a key factor in explainable question answering, but the citations returned alongside answers in our RAG pipeline were often incorrect or misleading, undermining citation accuracy. The LLM did not consistently draw on all retrieved documents, which made it hard to trace answers back to their sources. Hallucinations and unstated assumptions further produced misleading or incomplete information, so we also needed to reduce hallucinations in the generated answers.
Hypothesis
Having the LLM explicitly generate citations based on the content it actually used would ground its outputs in the retrieved sources and make the system more trustworthy. In addition, query preprocessing would normalize the incoming query, and answer postprocessing would validate the answer and its citations against the retrieved context.
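A minimal sketch of the pre/postprocessing idea, assuming the retrieved context is a mapping of document IDs to text; the function names, normalization steps, and fallback message are illustrative assumptions, not the production implementation:

```python
import re

def preprocess_query(query: str) -> str:
    """Normalize the user query before retrieval: trim whitespace and
    collapse internal runs of spaces that tend to confuse the retriever."""
    query = query.strip()
    query = re.sub(r"\s+", " ", query)
    return query

def postprocess_answer(answer: str,
                       cited_ids: list[str],
                       retrieved: dict[str, str]) -> tuple[str, list[str]]:
    """Validate the answer against the retrieved context: keep only citations
    that point at documents we actually retrieved, and flag the case where
    no citation could be confirmed."""
    valid_ids = [doc_id for doc_id in cited_ids if doc_id in retrieved]
    if not valid_ids:
        answer += "\n\n(No supporting documents could be confirmed for this answer.)"
    return answer, valid_ids
```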
Execution
We integrated an LLM-driven citation mechanism, together with a query preprocessor and an answer postprocessor, into the RAG pipeline. After generating an answer, the system selectively identifies the most relevant supporting documents, so citations reflect what the model actually used rather than listing every retrieved context. The postprocessor then validates the answer and its citations against the retrieved documents.
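The citation mechanism can be sketched roughly as below, assuming an OpenAI-style chat client and a retriever that returns document ID/text pairs; the prompt wording, model name, and JSON response shape are assumptions for illustration rather than the deployed code:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

CITATION_PROMPT = """Answer the question using ONLY the documents below.
Then list the IDs of the documents you actually relied on.
Respond as JSON: {{"answer": "...", "citations": ["doc_id", ...]}}

Documents:
{documents}

Question: {question}"""

def answer_with_citations(question: str, retrieved: dict[str, str]) -> dict:
    """Ask the LLM to answer and to cite only the documents it actually used."""
    docs_block = "\n\n".join(f"[{doc_id}] {text}" for doc_id, text in retrieved.items())
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any JSON-mode-capable model works
        response_format={"type": "json_object"},
        messages=[{
            "role": "user",
            "content": CITATION_PROMPT.format(documents=docs_block, question=question),
        }],
    )
    result = json.loads(response.choices[0].message.content)
    # Keep only citations that refer to documents we actually retrieved.
    result["citations"] = [c for c in result.get("citations", []) if c in retrieved]
    return result
```

The key design choice is that the model is asked for the IDs it relied on, and those IDs are then filtered against the retrieved set, so a citation can never point outside the provided context.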
Outcomes
The updated approach significantly improved citation relevance and accuracy, raising the quality of RAG-based answers and making them easier for end users to validate, which in turn increased trust in the system.