Michael Tschannen

A new framework enhances visual reasoning using language models as controllers.

2025-09-19T03:35:06+00:00 ― 5 min read

A new approach improves image captioning with location-aware techniques.

2025-08-24T17:34:24+00:00 ― 6 min read

PaliGemma combines image and text understanding for versatile applications.

2025-07-15T20:45:06+00:00 ― 6 min read

JetFormer generates images and text jointly and efficiently.

2025-05-01T00:18:40+00:00 ― 6 min read

Vision-language models (VLMs) combine vision and language, helping machines better understand the world.

2025-04-17T05:06:27+00:00 ― 6 min read

How Jet transforms noise into images.

2025-02-13T10:00:45+00:00 ― 8 min read