Latest Articles for Speech Recognition

Audio and Speech Processing Modular Domain Adaptation: A New Approach to Speech Recognition

MDA enhances speech recognition by optimizing models for specific data domains.

2025-11-05T01:24:10+00:00 ― 6 min read

Audio and Speech Processing Improving Speech Recognition for All Speakers

A new method aims to enhance ASR systems for dysarthric speakers.

2025-11-04T22:58:25+00:00 ― 5 min read

Computation and Language Advancements in Learning Spoken Words with MAMLCon

A new method improves computer understanding of spoken commands with fewer examples.

2025-11-04T22:09:50+00:00 ― 5 min read

Computation and Language Improving Speaker Diarization Using Word Analysis

Enhancing speaker identification by combining acoustic cues with the words spoken.

2025-11-04T18:55:30+00:00 ― 5 min read

Audio and Speech Processing Advancements in Active Speaker Detection Technology

A new framework improves active speaker detection using audio and visual cues.

2025-11-04T16:29:45+00:00 ― 5 min read

Audio and Speech Processing Improving General Audio Models for Speech Tasks

A new method enhances general audio models for effective speech recognition.

2025-11-04T05:58:10+00:00 ― 6 min read

Audio and Speech Processing Advancing Spoken Language Understanding with Continual Learning

This research addresses forgetting in AI through continual learning in spoken language understanding.

2025-11-04T00:18:05+00:00 ― 8 min read

Sound Developing Empathetic Voice Assistants with CALLS

CALLS aims to improve voice assistants' ability to handle customer interactions.

2025-11-03T19:26:35+00:00 ― 5 min read

Audio and Speech Processing Advancing Speech Recognition with Weakly-Supervised Learning

New methods leverage speaker identity to improve speech recognition performance.

2025-11-03T01:37:45+00:00 ― 5 min read

Computation and Language Advancing Slovak Speech Recognition with Czech Knowledge

Using transfer learning from Czech models boosts Slovak speech recognition accuracy.

2025-11-02T21:19:36+00:00 ― 4 min read

Audio and Speech Processing Advancing Text-to-Speech for Turkic Languages

Building TTS systems for lesser-known Turkic languages using Kazakh data.

2025-11-02T18:20:30+00:00 ― 5 min read

Sound Advancements in Speech Separation with S4M

A new model improves voice isolation in noisy environments.

2025-11-02T10:14:40+00:00 ― 5 min read

Computation and Language Advancements in Lip-Reading Technology with OpenSR

OpenSR enhances lip-reading models using audio data for better accuracy and accessibility.

2025-11-01T17:48:30+00:00 ― 6 min read

Computation and Language Improving Speech Recognition with Disfluency Correction

Research presents a model that improves disfluency correction in speech recognition systems.

2025-11-01T17:32:42+00:00 ― 6 min read

Human-Computer Interaction The Impact of Speech Misrecognition on Learning with Teachable Agents

A study on how speech errors affect learning with teachable agents.

2025-11-01T15:42:06+00:00 ― 5 min read

Sound Addressing Challenges in Speech Recognition with Enharmonic Words

A new method improves speech recognition for names that sound alike.

2025-11-01T14:48:40+00:00 ― 5 min read

Machine Learning Advancements in Multi-Task Self-Supervised Learning

New methods improve model flexibility and performance in audio tasks.

2025-11-01T08:20:00+00:00 ― 4 min read

Computation and Language Advancing Spoken Language Understanding with Discrete Units

New method improves spoken language understanding without needing written transcripts.

2025-11-01T00:14:10+00:00 ― 5 min read

Computation and Language Advancements in Translation for Underrepresented Languages

Improving translation technology for low-resource languages like Tamasheq and Quechua.

2025-10-31T16:39:36+00:00 ― 5 min read

Computation and Language New Benchmark for Speech Learning Models

BabySLM evaluates how well machines learn to understand speech based on children's language.

2025-10-31T11:33:20+00:00 ― 7 min read

Sound Advancements in Silent Speech Interfaces

Improving systems for silent speech recognition with new techniques.

2025-10-31T07:13:55+00:00 ― 5 min read

Sound Advancements in Weakly Supervised Keyword Spotting

A new method for training keyword spotting models using weak supervision in noisy environments.

2025-10-31T01:33:50+00:00 ― 6 min read

Sound Improving RNN-T Models with Reinforcement Learning

A new approach enhances RNN-T performance in automatic speech recognition.

2025-10-30T19:53:45+00:00 ― 6 min read

Computation and Language Advancements in Multilingual Speech Recognition Systems

Exploring methods for improved multilingual speech recognition in Indian languages.

2025-10-30T10:10:45+00:00 ― 6 min read

Sound Advancing Voice Activity Detection with SVVAD

Discover how SVVAD improves voice activity detection for better speaker verification.

2025-10-30T09:22:10+00:00 ― 5 min read

Sound Advancements in Automatic Pronunciation Assessment

A new method improves pronunciation feedback for language learners.

2025-10-30T08:33:35+00:00 ― 6 min read

Computation and Language Measuring Adaptability in Speech Recognition Models

A new framework evaluates how well speech models adapt to specific tasks.

2025-10-30T06:56:25+00:00 ― 6 min read

Computation and Language Advancements in Multilingual Speech Translation

Research improves multilingual speech translation using semantic knowledge.

2025-10-30T06:07:50+00:00 ― 4 min read

Hardware Architecture Introducing Sparq: A New Processing Solution for Quantized Neural Networks

Sparq aims to improve performance in quantized neural networks with lower resource needs.

2025-10-30T00:45:54+00:00 ― 4 min read

Sound Slowdown in Speech Recognition: A Closer Look at SlothSpeech

SlothSpeech reveals vulnerabilities in speech recognition systems, slowing them down significantly.

2025-10-29T17:10:30+00:00 ― 5 min read

Sound EmoMix: Advancing Emotional Speech Synthesis

EmoMix enables the creation of speech expressing mixed emotions with precise intensity.

2025-10-29T13:56:10+00:00 ― 5 min read

Computation and Language HK-LegiCoST: Bridging Cantonese Spoken and Written Language

A new corpus for translating Cantonese audio to English text.

2025-10-29T11:59:36+00:00 ― 5 min read

Sound MW-MAE: A New Approach to Audio Learning

Discover the innovative Multi-Window Masked Autoencoder method for enhanced audio processing.

2025-10-29T11:30:25+00:00 ― 5 min read

Audio and Speech Processing Improving ASR Technology with Sequential-Level Generalized Entropy Minimization

A new method enhances automatic speech recognition systems for better accuracy and adaptability.

2025-10-29T02:36:00+00:00 ― 6 min read

Computation and Language Improving Speech Recognition with Contextual Biasing

Contextual biasing enhances ASR systems, improving accuracy in specialized tasks.

2025-10-29T00:58:50+00:00 ― 5 min read

Sound New Method for Improving Language Pronunciation Detection

This study presents a new system for detecting pronunciation errors in language learners.

2025-10-28T21:44:30+00:00 ― 6 min read

Computation and Language Advancing Multilingual Speech Recognition with DistilXLSR

A new model reduces size while improving multilingual speech recognition.

2025-10-28T11:12:55+00:00 ― 6 min read

Computation and Language Advancements in Speech Recognition for Multiple Speakers

A new system improves speech recognition in multi-speaker settings.

2025-10-28T00:41:20+00:00 ― 6 min read

Audio and Speech Processing Combining Speech Processing with Visual Learning

This study examines the benefits of merging speech processing with visual data.

2025-10-27T20:38:25+00:00 ― 6 min read

Computation and Language Assessing Whisper's Performance on Arabic Dialects

A look at how Whisper handles various Arabic dialects and accents.

2025-10-27T13:21:10+00:00 ― 5 min read