Researchers improve detection of machine-generated speech using phase information adjustments.
― 6 min read
A new framework enhances the study of unsupervised speech recognition systems.
― 6 min read
New model LinDiff improves speech synthesis speed and quality.
― 4 min read
Researchers blend visual and sound features to improve speech for electrolarynx users.
― 5 min read
This research highlights how LLMs enhance speech understanding in long videos.
― 4 min read
A new method optimizes speech models for better performance with fewer resources.
― 5 min read
EM-Network enhances sequence learning in speech and language processing tasks.
― 5 min read
This study assesses various models for predicting synthesized speech quality.
― 5 min read
This article discusses enhancing speech recognition using confidence-based ensemble methods.
― 5 min read
GenerTTS enhances text-to-speech technology for cross-lingual applications.
― 5 min read
A new model improves speech extraction from noisy backgrounds using deep learning.
― 5 min read
A study on improving vocal sound reproduction through advanced synthesis techniques.
― 5 min read
New methods aim to hide speaker identities while maintaining speech clarity.
― 5 min read
A new method to improve speech quality using energy-efficient networks.
― 5 min read
Researchers analyze how emotions are shared through speech using diverse data.
― 5 min read
New methods improve the quality of synthesized speech using self-supervised learning.
― 5 min read
Federated Learning improves speech recognition while keeping user data private.
― 5 min read
A new method improves emotion detection from speech using audio only.
― 5 min read
O-1 improves speech recognition by optimizing self-training methods.
― 5 min read
Research highlights real-time detection methods for fake audio created by AI.
― 5 min read
New pruning methods enhance zero-shot multi-speaker text-to-speech model performance.
― 7 min read
New methods for selecting speech data minimize labeling while improving recognition accuracy.
― 5 min read
A new method enhances speech quality ranking using listener preference scores.
― 5 min read
A method to enhance ASR systems for users who stutter.
― 5 min read
New single-step methods improve accuracy in formant tracking for speech sounds.
― 4 min read
A new approach enhances the integration of speech with language models.
― 7 min read
Examining how pretrained language models improve text-to-speech quality.
― 5 min read
Microsoft's MuLanTTS offers natural and expressive French text-to-speech capabilities.
― 5 min read
A project aims to improve French speech processing using self-supervised learning.
― 5 min read
New methods improve how machines recognize speech rhythm and emotion.
― 6 min read
This study improves ASR systems' ability to recognize children's speech.
― 5 min read
VoxtLM combines speech recognition, synthesis, text generation, and continuation in one model.
― 4 min read
Libriheavy offers 50,000 hours of spoken English to boost speech recognition technology.
― 5 min read
AV2Wav enhances speech quality using audio and visual cues.
― 5 min read
Core-set selection improves text-to-speech models by focusing on diverse data.
― 5 min read
New method preserves emotional tone in voice conversion for better human-computer interaction.
― 5 min read
Research reveals emotional speech impacts model performance in speech separation tasks.
― 6 min read
Research combines self-supervised learning and new measurement techniques for improved speech inversion.
― 5 min read
Researchers develop a new framework to enhance speech clarity for electrolarynx users.
― 5 min read
A new method uses synthetic data to enhance ASR systems in unfamiliar areas.
― 6 min read