Zhou Zhao

A groundbreaking system merges 3D visuals and language for improved interactions.

2025-10-08T04:20:12+00:00 ― 5 min read

New methods in S2ST improve translation quality while maintaining speaker identity.

2025-09-11T16:07:50+00:00 ― 5 min read

A method for more intuitive control over singing voices using natural language prompts.

2025-08-17T01:33:05+00:00 ― 7 min read

ROSVOT enhances accuracy in transcribing singing voices, even in noisy environments.

2025-08-05T10:11:50+00:00 ― 5 min read

Frieren model improves audio quality and sync for video.

2025-08-02T10:07:55+00:00 ― 6 min read

New method improves conversion from speech to singing using self-supervised learning.

2025-08-01T09:50:25+00:00 ― 7 min read

MelodyLM simplifies music creation using text and voice inputs.

2025-07-23T16:55:55+00:00 ― 6 min read

A new method improves emotion recognition even with incomplete data.

2025-07-17T21:51:48+00:00 ― 5 min read

A new dataset enhances machine speech for Mandarin, aiming for natural expression.

2025-07-14T09:26:55+00:00 ― 6 min read

New AI tools are simplifying music editing with innovative techniques and improved precision.

2025-07-13T18:52:25+00:00 ― 5 min read

OmniBind integrates various data types for improved content understanding and generation.

2025-07-12T14:16:42+00:00 ― 5 min read

MulliVC transforms voices across languages with impressive accuracy and clarity.

2025-07-03T11:54:30+00:00 ― 5 min read

Learn how semantic tokenization improves recommendation systems.

2025-06-13T16:39:30+00:00 ― 5 min read

A new approach to enhance multimodal learning effectiveness.

2025-06-01T11:57:48+00:00 ― 7 min read

Learn how 3D models enhance object orientation estimation for tech applications.

2025-01-28T07:12:27+00:00 ― 7 min read