📄 NoLan: Mitigating Object Hallucinations in Large Vision-Language Models via Dynamic Suppression of Language Priors
👥 Authors: Lingfeng Ren, Weihao Yu, Runpeng Yu, Xinchao Wang
📅 Published: 2026-02-25
🔥 Upvotes: 1
🎯 What This Research Is About
Object hallucination is a critical issue in Large Vision-Language Models (LVLMs), where AI systems generate descriptions of objects that don't actually appear in the input image. This research investigates a fundamental question: which component is primarily responsible for these hallucinations—the vision encoder that processes images, or the language decoder that generates text?
Through systematic experiments, the researchers found that object hallucinations predominantly stem from strong priors in the language decoder. Based on this insight, they developed NoLan (No-Language-Hallucination Decoding), a simple training-free framework that refines the output distribution by dynamically suppressing language priors, using the difference between the model's predictions under multimodal input and under text-only input.
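The core idea, contrasting a multimodal pass against a text-only pass, can be sketched as follows. This is a minimal illustration of generic contrastive-decoding-style suppression, not NoLan's exact formulation: the function name `suppress_language_prior`, the fixed `alpha` weight, and the toy logits are all assumptions (the paper's weighting is dynamic, and its precise rule is not given in this summary).

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a logit vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def suppress_language_prior(mm_logits, text_logits, alpha=1.0):
    """Contrastive adjustment (hypothetical sketch): amplify evidence
    present in the multimodal pass but absent from the text-only pass,
    thereby down-weighting tokens favored purely by the language prior."""
    adjusted = mm_logits + alpha * (mm_logits - text_logits)
    return softmax(adjusted)

# Toy next-token vocabulary: ["cat", "dog", "table"]
mm = np.array([2.0, 0.5, 1.0])    # logits given image + text
txt = np.array([2.0, 2.5, 0.0])   # text-only logits: the prior favors "dog"

p_plain = softmax(mm)
p_adj = suppress_language_prior(mm, txt)
# "dog", driven mainly by the language prior, is suppressed
# relative to plain decoding from the multimodal logits.
assert p_adj[1] < p_plain[1]
```

The contrast term `mm_logits - text_logits` is large only for tokens whose probability depends on actually seeing the image, so prior-driven (hallucination-prone) tokens lose mass without retraining the model.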
💡 Why This Matters
- Critical Problem Solved: Addresses object hallucinations where AI models "see" objects that don't exist in images—a major reliability issue in vision-language systems
- Training-Free Solution: Can be applied to existing models without retraining, making it immediately practical and cost-effective for deployment
- Significant Improvements: Achieves accuracy gains of up to 6.45 points for LLaVA-1.5 7B and 7.21 points for Qwen-VL 7B on the POPE benchmark
- Root Cause Identified: Reveals that language decoder biases, not vision processing, are the primary culprit—shifting how we approach this problem
- Universal Applicability: Works across various LVLMs and different tasks, demonstrating broad effectiveness
Curated from Hugging Face daily papers • February 26, 2026