Shanghang Zhang

A new method improves how models perceive depth and spatial relationships in images.

2025-09-12T21:44:54+00:00 ― 6 min read

SPHINX-V enhances AI's ability to interpret images through user interaction.

2025-08-24T07:49:48+00:00 ― 6 min read

A new framework enhances AI's grasp of 3D spaces.

2025-08-11T19:14:05+00:00 ― 7 min read

A novel method for creating detailed 3D images from a single image using multiview diffusion.

2025-08-10T01:29:48+00:00 ― 4 min read

CoCoGesture creates lifelike gestures that match spoken words, enhancing interaction.

2025-08-06T05:04:00+00:00 ― 5 min read

A new model enhances the link between visual and language understanding.

2025-07-25T11:10:06+00:00 ― 5 min read

MMTrail combines visual and audio descriptions for better video-language models.

2025-07-08T06:53:20+00:00 ― 4 min read

FactorLLM improves efficiency in language models by reorganizing knowledge storage.

2025-06-27T17:27:42+00:00 ― 5 min read

A new method enhances detail in image creation using regional prompts.

2025-05-31T15:49:06+00:00 ― 6 min read

A novel approach enhances model learning from varied image data.

2025-05-26T18:30:36+00:00 ― 7 min read

A new technique boosts image clarity in busy street environments.

2025-05-11T11:06:40+00:00 ― 7 min read

Discover how ASGDiffusion changes high-resolution image generation.

2025-04-02T23:14:33+00:00 ― 6 min read