Analysis and Research on Trustworthy LLMs

Talk by Xuming HU

Friday, Mar 29, 2024

Abstract:

In this report, we explore the phenomenon of hallucination in large language models (LLMs) and conduct a thorough analysis from seven distinct perspectives. These perspectives cover aspects such as how LLMs store factual knowledge, timely knowledge updates, multi-fact reasoning, domain-specific knowledge reasoning, and robustness to adversarial examples. To evaluate the extent of LLMs’ mastery of factual knowledge, we designed a benchmark named “Pinocchio,” which comprises 20,000 fact-based questions spanning diverse sources, timelines, domains, regions, and languages to assess the scope and depth of factual knowledge in LLMs. Our experiments reveal that existing LLMs still have significant deficiencies in utilizing such factual knowledge, especially in resisting adversarial examples. Adversarial attacks that slightly modify factual knowledge can induce LLMs to produce hallucinations, causing serious issues in many related fields. For example, in the medical field, adversarial attacks could mislead LLMs into making incorrect medical decisions for patients. We therefore conducted additional research on this point and found that adversaries can successfully induce generative search engines (such as Bing Chat, PerplexityAI, and YouChat) to provide incorrect responses, even in practical, high-risk settings with only limited black-box access to the systems. Through comprehensive human assessments, we demonstrated the effectiveness of adversarial fact-based questions in inducing incorrect responses. Our research also found that generative search engines with retrieval augmentation are more prone to factual inaccuracies than LLMs without retrieval augmentation.

Overall, our work reveals significant challenges faced by large language models in practical applications, especially in ensuring the accuracy and safety of their outputs. These findings are not only crucial for improving the performance and reliability of LLMs but also offer new directions for future research in this field. Through our studies, we aim to advance the development of more trustworthy artificial intelligence technologies.

Time:

Friday, Mar 29, 2024

11:00-11:50

Location:

Rm W1-101, GZ Campus

Online Zoom

Join Zoom at https://hkust-gz-edu-cn.zoom.us/j/4236852791 or use Meeting ID 423 685 2791

Speaker Bio:

Dr. Xuming HU

Assistant Professor, AI Thrust, The Hong Kong University of Science and Technology (Guangzhou)

Dr. Xuming HU is an Assistant Professor in the AI Thrust at The Hong Kong University of Science and Technology (Guangzhou). He earned his Ph.D. from Tsinghua University, where he was recognized as an Outstanding Graduate of Beijing, under the supervision of Prof. Philip S. Yu. His research interests include trustworthy large language models (LLMs), multimodal LLMs, and the distillation and acceleration of LLMs. In the past five years, Dr. Hu has published over 20 first-author papers in top-tier journals and conferences in data mining and natural language processing, such as KDD, ICLR, ACL, and TKDE. He has served as an Area Chair for top-tier conferences including ACL, NAACL, and EACL, and as an Action Editor for ACL Rolling Review. Personal homepage: https://xuminghu.github.io/