Latest Articles for Speech Technology

Sound Advancements in Speech Countermeasure Systems

Researchers improve detection of machine-generated speech using phase information adjustments.

2025-10-26T17:55:10+00:00 ― 6 min read

Audio and Speech Processing Advancements in Unsupervised Speech Recognition

A new framework enhances the study of unsupervised speech recognition systems.

2025-10-25T13:34:45+00:00 ― 6 min read

Sound LinDiff: A Leap Forward in Speech Synthesis

The new LinDiff model improves speech synthesis speed and quality.

2025-10-25T00:37:25+00:00 ― 4 min read

Sound Innovative Advances in Electrolaryngeal Speech Technology

Researchers blend visual and sound features to improve speech for electrolarynx users.

2025-10-24T12:28:40+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Recognition with Large Language Models

This research highlights how LLMs enhance speech understanding in long videos.

2025-10-23T22:42:45+00:00 ― 4 min read

Audio and Speech Processing Efficient Management of Large Speech Models

A new method optimizes speech models for better performance with fewer resources.

2025-10-23T21:54:10+00:00 ― 5 min read

Machine Learning EM-Network: A New Approach in Sequence Learning

EM-Network enhances sequence learning in speech and language processing tasks.

2025-10-23T07:19:40+00:00 ― 5 min read

Sound Evaluating Speech Quality with Machine Learning Models

This study assesses various models for predicting synthesized speech quality.

2025-10-21T16:27:40+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Recognition through Confidence-Based Ensembles

This article discusses enhancing speech recognition using confidence-based ensemble methods.

2025-10-16T18:14:30+00:00 ― 5 min read

Audio and Speech Processing Advancing Text-to-Speech: GenerTTS Model Explained

GenerTTS enhances text-to-speech technology for cross-lingual applications.

2025-10-16T15:48:45+00:00 ― 5 min read

Sound Advancing Voice Isolation Technology

A new model improves speech extraction from noisy backgrounds using deep learning.

2025-10-16T02:02:50+00:00 ― 5 min read

Sound Advancements in Articulatory Speech Synthesis

A study on improving vocal sound reproduction through advanced synthesis techniques.

2025-10-11T02:12:30+00:00 ― 5 min read

Audio and Speech Processing Advancements in Speaker Anonymisation Techniques

New methods aim to hide speaker identities while maintaining speech clarity.

2025-10-08T01:20:00+00:00 ― 5 min read

Sound Advancements in Speech Enhancement Using Spiking Neural Networks

A new method to improve speech quality using energy-efficient networks.

2025-10-03T21:44:15+00:00 ― 5 min read

Artificial Intelligence Measuring Emotions in Speech: A New Approach

Researchers analyze how emotions are shared through speech using diverse data.

2025-10-03T09:07:42+00:00 ― 5 min read

Computation and Language Advancements in Text-to-Speech Technology

New methods improve the quality of synthesized speech using self-supervised learning.

2025-09-30T17:37:25+00:00 ― 5 min read

Sound Advancing Speech Recognition with Federated Learning

Federated Learning improves speech recognition while keeping user data private.

2025-09-30T08:43:00+00:00 ― 5 min read

Computation and Language EmoDistill: Advancing Speech Emotion Recognition

A new method improves emotion detection from speech using audio only.

2025-09-28T23:55:36+00:00 ― 5 min read

Machine Learning O-1: A New Frontier in Speech Recognition Training

O-1 improves speech recognition by optimizing self-training methods.

2025-09-26T09:10:10+00:00 ― 5 min read

Sound New Study on Detecting AI-Generated Speech

Research highlights real-time detection methods for fake audio created by AI.

2025-09-21T19:02:50+00:00 ― 5 min read

Sound Improving Voice Synthesis with Pruning Techniques

New pruning methods enhance zero-shot multi-speaker text-to-speech model performance.

2025-09-20T15:31:00+00:00 ― 7 min read

Audio and Speech Processing Advancements in Self-Supervised Learning for Speech Recognition

New methods for selecting speech data minimize labeling while improving recognition accuracy.

2025-09-20T13:53:50+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Quality Assessment with Preference Scores

A new method enhances speech quality ranking using listener preference scores.

2025-09-20T07:25:10+00:00 ― 5 min read

Sound Improving Speech Recognition for Stutterers

A method to enhance ASR systems for users who stutter.

2025-09-20T06:36:35+00:00 ― 5 min read

Audio and Speech Processing Advancements in Formant Tracking for Speech Processing

New single-step methods improve accuracy in formant tracking for speech sounds.

2025-09-19T02:16:10+00:00 ― 4 min read

Computation and Language Connecting Speech with Language Models: The BLSP Method

A new approach enhances the integration of speech with language models.

2025-09-18T15:44:35+00:00 ― 7 min read

Computation and Language The Role of Pretrained Language Models in TTS

Examining how pretrained language models improve text-to-speech quality.

2025-09-17T20:18:35+00:00 ― 5 min read

Audio and Speech Processing MuLanTTS: A New Frontier in Text-to-Speech

Microsoft's MuLanTTS offers natural and expressive French text-to-speech capabilities.

2025-09-15T22:57:55+00:00 ― 5 min read

Computation and Language Advancements in Self-Supervised Learning for French Speech Technologies

A project aims to improve French speech processing using self-supervised learning.

2025-09-14T12:57:25+00:00 ― 5 min read

Audio and Speech Processing Advancements in Automatic Prosody Annotation

New methods improve how machines recognize speech rhythm and emotion.

2025-09-14T12:08:50+00:00 ― 6 min read

Audio and Speech Processing Advancements in Speech Recognition for Children

This study improves ASR systems' ability to recognize children's speech.

2025-09-14T02:25:50+00:00 ― 5 min read

Audio and Speech Processing VoxtLM: A Unified Approach to Speech and Text

VoxtLM combines speech recognition, synthesis, text generation, and continuation in one model.

2025-09-13T11:02:45+00:00 ― 4 min read

Audio and Speech Processing Libriheavy: A New Dataset for Speech Recognition

Libriheavy offers 50,000 hours of spoken English to boost speech recognition technology.

2025-09-12T18:51:05+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Clarity with AV2Wav Technology

AV2Wav enhances speech quality using audio and visual cues.

2025-09-12T17:13:55+00:00 ― 5 min read

Sound Optimizing Text-to-Speech with Core-Set Selection

Core-set selection improves text-to-speech models by focusing on diverse data.

2025-09-12T08:19:30+00:00 ― 5 min read

Audio and Speech Processing Emo-StarGAN: Advancing Voice Conversion Technology

New method preserves emotional tone in voice conversion for better human-computer interaction.

2025-09-11T23:25:05+00:00 ― 5 min read

Sound Emotional Speech Challenges Speech Separation Models

Research reveals emotional speech impacts model performance in speech separation tasks.

2025-09-11T18:33:35+00:00 ― 6 min read

Audio and Speech Processing Enhancing Speech Inversion through Self-Supervised Learning

Research combines self-supervised learning and new measurement techniques for improved speech inversion.

2025-09-10T01:15:50+00:00 ― 5 min read

Sound Improving Clarity in Electrolaryngeal Speech

Researchers develop a new framework to enhance speech clarity for electrolaryngeal users.

2025-09-09T22:50:05+00:00 ― 5 min read

Audio and Speech Processing Improving ASR Systems with Synthetic Data

A new method uses synthetic data to enhance ASR systems in unfamiliar domains.

2025-09-09T15:32:50+00:00 ― 6 min read