PLCMOS offers a new way to evaluate speech quality without human listeners.
― 5 min read
Cutting-edge science explained simply
A new method combines speech recognition and speaker identification for overlapping speech.
― 5 min read
A new method for voice conversion improves clarity and adaptation.
― 6 min read
Explore how diffusion models transform noise into valuable data outputs.
― 6 min read
A new model improves voice isolation in noisy environments.
― 5 min read
DeCoR helps machines learn new sounds without forgetting old ones.
― 5 min read
A new method enhances the naturalness and variety of text-to-speech output.
― 5 min read
Treff adapter improves audio classification with limited labeled data.
― 5 min read
Research highlights effective methods for recognizing emotions in speech using embeddings.
― 6 min read
This research analyzes dialects using audio recordings to reveal their similarities.
― 6 min read
A novel method enhances audio classification by learning new sounds efficiently.
― 4 min read
A new method aligns disfluent speech with text efficiently.
― 5 min read
A new method for training keyword spotting models using weak supervision in noisy environments.
― 6 min read
MERT addresses music modeling challenges through innovative self-supervised learning techniques.
― 6 min read
AVLIT model combines sound and video for better speech clarity in noisy settings.
― 6 min read
Discover how SVVAD improves voice activity detection for better speaker verification.
― 5 min read
UnDiff enhances audio quality using innovative speech restoration techniques.
― 5 min read
Discover the innovative Multi-Window Masked Autoencoder method for enhanced audio processing.
― 5 min read
A novel method merges audio and visual data to repair missing speech.
― 6 min read
SingNet improves beat tracking in singing voices using past data.
― 6 min read
A fresh look at speaker anonymization and the crucial role of vocoders.
― 5 min read
A new method aims to improve fake audio detection without losing past knowledge.
― 6 min read
New model LinDiff improves speech synthesis speed and quality.
― 4 min read
Techniques to improve speech recognition amidst background noise.
― 5 min read
HiddenSinger improves singing voice quality using advanced AI techniques.
― 5 min read
New methods improve speech clarity for electrolarynx users.
― 6 min read
Recent research improves ASR models for Norwegian, enhancing performance in Bokmål and Nynorsk.
― 4 min read
Gesper framework enhances speech clarity in noisy environments.
― 5 min read
This article discusses a new method for building efficient ASR systems.
― 5 min read
New algorithms enhance audio processing performance across varying sample rates.
― 5 min read
A new model improves music transcription accuracy for multiple instruments.
― 5 min read
A guide to using AI models for music on the Bela platform.
― 5 min read
A new model improves voice conversion by simplifying speech separation techniques.
― 6 min read
A new method transforms mono signals into engaging stereo experiences.
― 5 min read
A new system enhances detection of manipulated audio through innovative techniques.
― 5 min read
LyricWhiz combines advanced models to improve lyric transcription accuracy across languages.
― 5 min read
This article discusses challenges and techniques for managing dataset imbalance in audio classification.
― 6 min read
Whisper-AT combines speech recognition and audio tagging for improved performance.
― 5 min read
A new method enhances speaker identification in film and TV localization.
― 5 min read
New method improves accuracy in turning piano audio into sheet music.
― 4 min read