Latest Articles for Speech Recognition

[Sound] Advancing Speech Recognition with Time-Sparse Transducer

New model improves speech recognition speed and memory usage.

2025-10-07T23:42:50+00:00 ― 6 min read

[Audio and Speech Processing] Advancements in Domain-Sensitive Speech Recognition Technology

New methods enhance speech recognition across specific fields without extensive data.

2025-10-07T15:37:00+00:00 ― 6 min read

[Audio and Speech Processing] Advancements in Acoustic Word Embeddings

A new model improves how computers process spoken language.

2025-10-07T04:16:50+00:00 ― 4 min read

[Computation and Language] Advancements in Speech Recognition Technology

The Bayes Risk Transducer improves speech recognition efficiency and accuracy.

2025-10-06T21:31:36+00:00 ― 5 min read

[Computation and Language] Advancements in Spoken Question Answering with LibriSQA

New dataset and framework improve spoken question answering capabilities.

2025-10-06T17:42:30+00:00 ― 5 min read

[Sound] New Framework Improves Speech Recognition with Metadata

Integrating metadata enhances performance in speech tasks like language identification.

2025-10-06T12:05:10+00:00 ― 6 min read

[Audio and Speech Processing] Advancements in Transducer Models for Speech Recognition

This article discusses the Transducer model's real-time capabilities and recent improvements.

2025-10-06T11:16:35+00:00 ― 6 min read

[Audio and Speech Processing] Advancements in Topic Identification from Audio Data

Research explores methods for identifying topics directly from audio recordings.

2025-10-05T23:56:25+00:00 ― 5 min read

[Sound] Advancing Speech Technology with SCRAPS

A new model connects phonetics and acoustics for better speech technology.

2025-10-05T13:24:50+00:00 ― 7 min read

[Audio and Speech Processing] Advancements in Active Speaker Detection Using Audio

Research shows benefits of multiple microphones for detecting and locating speakers.

2025-10-03T11:12:40+00:00 ― 5 min read

[Audio and Speech Processing] Advancements in Speech Enhancement with PCNN

Introducing a new model for clearer speech in noisy environments.

2025-10-03T07:58:20+00:00 ― 5 min read

[Sound] Advancements in Speaker Diarization Through Audio-Visual Integration

New systems improve speaker identification using both audio and visual data.

2025-10-02T15:46:40+00:00 ― 5 min read

[Computation and Language] Advancements in Pronunciation Training Technology

Researchers are improving pronunciation training with new technologies for language learners.

2025-10-02T07:40:50+00:00 ― 5 min read

[Information Retrieval] Advancements in Voice Search Technology

Voice search technology evolves, addressing ASR errors for improved user experience.

2025-09-30T17:00:24+00:00 ― 6 min read

[Sound] Advancements in Fake Audio Detection with RAWM

A new method improves detection of fake audio using adaptive weight modification.

2025-09-29T01:08:15+00:00 ― 5 min read

[Sound] Advancements in Target-Speaker Speech Recognition

New model improves speech recognition in noisy environments by focusing on a single speaker.

2025-09-28T08:08:00+00:00 ― 4 min read

[Audio and Speech Processing] Advancing Bilingual Speech Recognition with Grapheme Units

Enhancing hybrid ASR systems for bilingual speech using grapheme units.

2025-09-27T03:47:35+00:00 ― 5 min read

[Computation and Language] Advances in Joint Speech-Text Learning

A new model improves speech and text alignment for better automatic recognition.

2025-09-27T02:10:25+00:00 ― 6 min read

[Computation and Language] New Methods for Evaluating Speaker Diarization

Introducing fresh metrics to assess speaker diarization accuracy in conversational AI.

2025-09-26T18:04:30+00:00 ― 6 min read

[Computation and Language] Advancements in Speech Recognition Technology

New methods enhance accuracy and speed in speech recognition systems.

2025-09-26T11:35:55+00:00 ― 5 min read

[Computation and Language] Improving Automatic Speech Recognition with Text Injection

A new method enhances ASR performance through text data integration.

2025-09-26T07:33:00+00:00 ― 6 min read

[Computation and Language] Improving Speech Recognition with Text Injection

Text injection helps recognize personal information while maintaining privacy.

2025-09-26T06:44:25+00:00 ― 5 min read

[Sound] Advancements in Speech Recognition with mmWave Technology

Radio2Text uses mmWave signals for real-time speech recognition in noisy environments.

2025-09-25T22:38:35+00:00 ― 6 min read

[Computation and Language] Improving Grapheme-to-Phoneme Conversion with New Sampling Method

This study enhances G2P models by focusing on error-prone areas during training.

2025-09-25T05:38:20+00:00 ― 5 min read

[Audio and Speech Processing] Advancements in Formant Tracking Techniques

Discover methods that improve accuracy in formant tracking for speech analysis.

2025-09-24T22:21:05+00:00 ― 6 min read

[Computation and Language] Advancements in Speech Language Modeling

New methods improve speech processing and generation in language models.

2025-09-19T16:02:05+00:00 ― 5 min read

[Sound] Advancements in Noise Suppression Technology

New techniques improve audio clarity in noisy environments.

2025-09-19T15:13:30+00:00 ― 6 min read

[Audio and Speech Processing] Advancing Few-Shot Keyword Spotting with Reading Speech Data

New methods improve keyword spotting using available reading speech data.

2025-09-19T13:36:20+00:00 ― 4 min read

[Audio and Speech Processing] Advancing Confidence Estimation in Automatic Speech Recognition

A new approach enhances confidence estimation in ASR systems for better accuracy.

2025-09-15T03:14:28+00:00 ― 4 min read

[Machine Learning] Challenges in Using Convnets for Audio Filterbank Design

This study explores issues with using convnets for audio filterbank creation.

2025-09-14T14:34:35+00:00 ― 5 min read

[Audio and Speech Processing] Improving Speaker Diarization with Language Models

This article explores advancements in speaker diarization using language models for better accuracy.

2025-09-14T03:14:25+00:00 ― 5 min read

[Audio and Speech Processing] PromptASR: Next-Level Speech Recognition Technology

New system enhances speech recognition using context-aware prompts.

2025-09-13T10:14:10+00:00 ― 4 min read

[Sound] Advancements in Universal Audio Models

EnCodecMAE combines self-supervised learning and audio codecs for improved audio task performance.

2025-09-13T09:25:35+00:00 ― 5 min read

[Audio and Speech Processing] A New Approach to Keyword Spotting

Introducing a flexible method for recognizing keywords in speech across languages.

2025-09-13T06:11:15+00:00 ― 5 min read

[Sound] New System Improves Voice Extraction from Unstable Head Positions

PIAVE helps machines extract voices clearly, even when speakers turn their heads.

2025-09-12T19:39:40+00:00 ― 6 min read

[Sound] A New Framework for Speaker Anonymization

Introducing a flexible framework to enhance voice privacy research.

2025-09-12T05:05:10+00:00 ― 7 min read

[Computation and Language] Improving Explanations for Speech Models

A new method simplifies understanding of speech classification models.

2025-09-12T02:39:25+00:00 ― 6 min read

[Sound] M-AUDIODEC: A New Way to Compress Audio

M-AUDIODEC compresses multi-channel audio while retaining speaker position and quality.

2025-09-11T16:56:25+00:00 ― 6 min read

[Audio and Speech Processing] Improving Sound Quality in Hearables

Research reveals new models to enhance voice clarity in smart earbuds.

2025-09-11T12:04:55+00:00 ― 5 min read

[Artificial Intelligence] Improving Robot Understanding of Human Instructions

A new method enhances robots' ability to follow spoken directions accurately.

2025-09-11T08:21:18+00:00 ― 5 min read