Ji Zhang

A new approach improves efficiency in Vision-Language Pre-training tasks.

2025-10-11T17:07:48+00:00 ― 6 min read

A new method enhances out-of-distribution (OOD) detection by combining global and local data representations.

2025-10-06T14:01:18+00:00 ― 5 min read

A new approach improves task performance in vision-language models.

2025-09-26T23:36:18+00:00 ― 6 min read

A new approach uses multi-agent systems to enhance smaller language models.

2025-09-17T04:26:54+00:00 ― 6 min read

A new framework for assessing hallucinations in large vision-language models (LVLMs).

2025-09-04T12:02:06+00:00 ― 6 min read

A new benchmark evaluates how role-playing agents interact socially.

2025-08-27T12:43:24+00:00 ― 6 min read

A new framework improves how language agents learn and perform tasks.

2025-08-27T05:28:54+00:00 ― 6 min read

MIBench tests multimodal models' performance on multiple images.

2025-07-09T14:23:18+00:00 ― 6 min read

mPLUG-Owl3 improves understanding of images and videos for better responses.

2025-06-30T17:13:12+00:00 ― 6 min read

A new method combines language models more effectively.

2025-06-29T22:23:30+00:00 ― 6 min read

New modeling techniques enhance our understanding of bacterial movement.

2025-06-24T23:02:57+00:00 ― 5 min read

MaVEn enhances AI's ability to process multiple images for better reasoning.

2025-06-23T15:38:00+00:00 ― 5 min read

A new framework seeks to improve image generation using human feedback.

2025-06-13T07:34:24+00:00 ― 5 min read

A look at how social media shapes collective opinions.

2025-05-20T18:31:30+00:00 ― 8 min read

Discover how skip tuning enhances efficiency in vision-language models.

2025-02-28T07:07:30+00:00 ― 7 min read