Haiyang Xu

A new approach improves efficiency in vision-language pre-training tasks.

2025-10-11T17:07:48+00:00 ― 6 min read

TRIPS enhances efficiency in vision-language tasks by selecting relevant image patches.

2025-09-17T20:38:36+00:00 ― 7 min read

This article discusses a new framework for assessing hallucinations in large vision-language models (LVLMs).

2025-09-04T12:02:06+00:00 ― 6 min read

MIBench tests multimodal models' performance on multiple images.

2025-07-09T14:23:18+00:00 ― 6 min read

mPLUG-Owl3 improves understanding of images and videos for better responses.

2025-06-30T17:13:12+00:00 ― 6 min read

MaVEn enhances AI's ability to process multiple images for better reasoning.

2025-06-23T15:38:00+00:00 ― 5 min read