Abstract:
While Large Language Models (LLMs) have grown in both popularity and performance, “hallucinations”, or simply errors, persist in their generated output and can erode the trust of human users. Moreover, these models can express great confidence in false results, or even reverse course into an incorrect statement when sufficiently interrogated by the user. While retrieval-augmented generation (RAG) has been shown to reduce errors in generated responses, without an explicit quantification of uncertainty, human users of these systems are still left to trust the output blindly. To this end, we review existing forms of uncertainty quantification for language models, highlighting approaches to calibrating a language model such as Bayesian belief matching and conformal prediction. We then review methods of uncertainty quantification for unimodal LLMs and discuss the challenges that arise when moving toward multimodal LLMs.
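To give a flavor of one of the calibration methods mentioned above, the following is a minimal sketch of split conformal prediction applied to an LLM's answer probabilities. It assumes we already have the model's softmax probability for the correct answer on each question in a held-out calibration set, and candidate-answer probabilities for a new question; the arrays and function names here are illustrative placeholders, not part of any specific system discussed in the talk.

```python
import numpy as np

def conformal_quantile(cal_probs, alpha=0.1):
    """Finite-sample-corrected (1 - alpha) quantile of calibration nonconformity scores."""
    scores = 1.0 - np.asarray(cal_probs)              # low probability on the truth -> high score
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

def prediction_set(test_probs, q_hat):
    """Indices of candidate answers whose nonconformity falls below the calibrated threshold."""
    scores = 1.0 - np.asarray(test_probs)
    return np.nonzero(scores <= q_hat)[0]

# Hypothetical calibration probabilities and one new question's candidate probabilities.
cal_probs = np.array([0.92, 0.35, 0.80, 0.67, 0.99, 0.55, 0.71, 0.88, 0.40, 0.95])
test_probs = np.array([0.60, 0.25, 0.10, 0.05])

q_hat = conformal_quantile(cal_probs, alpha=0.1)
print("kept answer indices:", prediction_set(test_probs, q_hat))
```

Under exchangeability of the calibration and test data, the resulting prediction set covers the true answer with probability at least 1 - alpha, which is the kind of explicit uncertainty guarantee the abstract contrasts with blind trust in a single generated response.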
Dr. Karl Pazdernik is a Senior Data Scientist within the National Security Directorate at Pacific Northwest National Laboratory (PNNL), a team lead within the Artificial Intelligence and Data Analytics division at PNNL, and a Research Assistant Professor at North Carolina State University. His research has focused on uncertainty quantification and dynamic modeling of multi-modal data, with particular interests in text analytics, spatial statistics, pattern recognition, and anomaly detection. He received a B.A. in Mathematics from Saint John's University and a Ph.D. in Statistics from Iowa State University.