Latest Articles for Speech Recognition

Computation and Language Advancing Spoken Language Understanding with CCL

A new method improves how systems handle errors in spoken language understanding.

2025-08-08T14:20:30+00:00 ― 6 min read

Computation and Language Generative Fusion Decoding: Advancing Text Recognition

A new method enhances text recognition accuracy across various applications.

2025-08-07T22:00:54+00:00 ― 6 min read

Computation and Language New Attack Method Silences ASR Systems

A universal audio clip can mute advanced ASR models like Whisper.

2025-08-07T03:29:35+00:00 ― 6 min read

Computation and Language Advancements in Federated Learning for Speech Recognition

Harnessing early-exit models for efficient federated learning in ASR systems.

2025-08-06T09:48:24+00:00 ― 8 min read

Computation and Language Integrating Audio and Language Models: SpeechVerse

SpeechVerse bridges audio understanding and language processing for improved human-computer interaction.

2025-08-06T06:26:25+00:00 ― 6 min read

Computation and Language Improving Classroom Speech Recognition with Continued Pretraining

Advanced training techniques enhance classroom speech recognition and, in turn, improve learning.

2025-08-05T19:06:15+00:00 ― 6 min read

Machine Learning Advancements in Automatic Speech Recognition with Denoising Language Models

Denoising Language Models improve error correction in speech recognition systems using synthetic data.

2025-08-03T22:34:10+00:00 ― 7 min read

Sound Advancements in Speech Inpainting Techniques

Learn how speech inpainting is restoring audio quality in various fields.

2025-08-02T18:13:45+00:00 ― 6 min read

Audio and Speech Processing Introducing the 4D Model in Speech Recognition

A new model improves speech recognition using multiple decoding methods.

2025-08-01T01:44:35+00:00 ― 6 min read

Computation and Language Improving Arabic Speech Recognition Through Knowledge Distillation

A study on enhancing ASR for Arabic dialects using efficient model techniques.

2025-07-31T23:18:50+00:00 ― 5 min read

Computation and Language Advancements in Self-Supervised Learning for Speech

Exploring self-supervised learning's role in speech processing and its challenges.

2025-07-30T15:51:24+00:00 ― 7 min read

Audio and Speech Processing Advancements in Target Speech Diarization Technology

A look at new methods in understanding overlapping speech during conversations.

2025-07-30T14:06:55+00:00 ― 8 min read

Sound Improving Backdoor Attacks in Speech Recognition

New method targets rhythm changes for stealthy speech attacks.

2025-07-29T08:09:20+00:00 ― 5 min read

Audio and Speech Processing AV-CrossNet: Improving Speech Recognition in Noise

A new system helps separate speech from noise for clearer communication.

2025-07-29T03:17:50+00:00 ― 6 min read

Sound Real-Time Speaker Diarization: An Overview

Learn about online speaker diarization and its significance in various applications.

2025-07-28T06:14:40+00:00 ― 6 min read

Sound Evaluating Discrete Audio Tokens for Speech Tasks

New benchmark tool assesses discrete audio tokens for various speech processing tasks.

2025-07-28T04:37:30+00:00 ― 8 min read

Computation and Language Improving Speech Error Correction in ASR Systems

A new method combines acoustic features and confidence scores for better error correction.

2025-07-25T20:45:15+00:00 ― 5 min read

Computation and Language How Speech Recognition Models Handle Sound Changes

A study on how machines adapt to phonological changes in speech.

2025-07-25T20:31:00+00:00 ― 7 min read

Audio and Speech Processing Improving Speaker Detection with Audio and Visual Data

A system combines audio and video to enhance speaker detection accuracy.

2025-07-25T10:13:40+00:00 ― 5 min read

Computation and Language Advancements in Spoken Dialogue Systems

A new method improves machine dialogue through pseudo-stereo data.

2025-07-25T08:36:30+00:00 ― 6 min read

Computation and Language Improving Chinese Speech Recognition Through Pinyin Regularization

This study presents a dataset and method to enhance Chinese ASR accuracy using Pinyin.

2025-07-25T07:47:55+00:00 ― 7 min read

Sound Breaking Down Deepfake Audio Detection Techniques

This study focuses on improving detection of deepfake audio using advanced methods.

2025-07-25T02:56:25+00:00 ― 5 min read

Sound The Importance of Measuring Uncertainty in Speech Emotion Recognition

Understanding uncertainty boosts the accuracy of emotion recognition in real-world scenarios.

2025-07-24T17:13:25+00:00 ― 6 min read

Audio and Speech Processing New Approach to Speaker Diarization

A system for speaker recognition in multilingual audio without extensive data.

2025-07-24T01:01:45+00:00 ― 5 min read

Computation and Language Advancements in Multilingual Speaker Anonymization

Improving speaker anonymization technology for nine languages to ensure privacy.

2025-07-23T03:58:35+00:00 ― 5 min read

Audio and Speech Processing Advancements in Audio-Visual Speech Recognition

Research highlights the role of video in improving speech recognition in noisy environments.

2025-07-22T20:41:20+00:00 ― 5 min read

Sound Advancements in Multi-Talker Speech Recognition

A new method improves accuracy in recognizing speech from multiple speakers.

2025-07-22T10:58:20+00:00 ― 5 min read

Neuroscience Understanding How Our Brains Process Sound

Explore how the auditory cortex integrates sound over time.

2025-07-22T08:05:26+00:00 ― 6 min read

Sound Advancements in Speech Enhancement Technology

A new method improves speech clarity in noisy environments using dual neural networks.

2025-07-22T06:55:25+00:00 ― 5 min read

Audio and Speech Processing Advancements in Streaming Automatic Speech Recognition

XLSR-Transducer model excels in real-time transcription with minimal data.

2025-07-21T18:46:40+00:00 ― 5 min read

Audio and Speech Processing Seed-ASR: Advancing Speech Recognition Technology

A new model improves accuracy in speech-to-text capabilities across multiple languages.

2025-07-21T14:43:45+00:00 ― 5 min read

Sound Vulnerability in Speech Recognition Systems Exposed

Research reveals risks in multi-task speech models like Whisper.

2025-07-21T09:52:15+00:00 ― 5 min read

Computation and Language TokenVerse: Streamlining Conversation Analysis

TokenVerse simplifies the analysis of spoken conversations by integrating multiple tasks into a single model.

2025-07-21T08:15:05+00:00 ― 6 min read

Sound Advancing Few-Shot Keyword Spotting with Mix-Training

This study examines Mix-Training for keyword spotting in noisy speech conditions.

2025-07-19T16:39:18+00:00 ― 5 min read

Audio and Speech Processing Advancing Speech Recognition for Low-Resource Languages

Improving speech recognition systems for languages with limited online data.

2025-07-19T04:25:45+00:00 ― 5 min read

Audio and Speech Processing Spectrograms and Neural Networks in Speech Recognition

This study examines how neural networks interpret speech using spectrograms.

2025-07-18T22:45:40+00:00 ― 6 min read

Audio and Speech Processing Improving Speech Recognition with Contextual Clues

Learn how context improves the accuracy of automatic speech recognition, including word recognition.

2025-07-16T14:53:25+00:00 ― 5 min read

Computation and Language Analyzing Vowel Harmony in Assamese with fiwGAN

This study uses fiwGAN to explore vowel harmony patterns in the Assamese language.

2025-07-16T07:17:06+00:00 ― 5 min read

Audio and Speech Processing Improving Code-Switching ASR with Knowledge Distillation

A new framework enhances ASR performance using limited data and resources.

2025-07-15T22:41:45+00:00 ― 5 min read

Audio and Speech Processing Improving Number Formatting in ASR Transcripts

This article discusses ways to enhance numeric expression formatting in automatic transcripts.

2025-07-14T15:55:35+00:00 ― 5 min read