Ziyang Ma

New methods for selecting speech data minimize labeling while improving recognition accuracy.

2025-09-20T13:53:50+00:00 ― 5 min read

New methods improve how text descriptions are linked to sound events.

2025-08-31T16:09:40+00:00 ― 7 min read

ELLA-V enhances text-to-speech quality and control, surpassing previous models.

2025-08-30T01:17:40+00:00 ― 5 min read

A new model enhances machines' understanding of spatial audio.

2025-08-26T15:30:45+00:00 ― 5 min read

MuPT uses ABC notation for effective AI music generation.

2025-08-12T09:00:00+00:00 ― 5 min read

MAP-Neo aims for transparency and performance in AI language modeling.

2025-08-04T21:04:18+00:00 ― 5 min read

GigaSpeech 2 offers a vast dataset for low-resource languages to improve speech recognition.

2025-07-29T02:29:15+00:00 ― 5 min read

A new method improves speech model performance across various tasks.

2025-06-21T02:44:25+00:00 ― 6 min read

VQTalker creates realistic talking avatars in multiple languages, enhancing digital interactions.

2025-03-09T22:14:42+00:00 ― 7 min read