Latest Articles for Speech Recognition

Sound
Advancements in Speech Restoration: MaskSR2

MaskSR2 improves speech clarity and quality using innovative techniques.

2025-06-11T07:06:40+00:00 ― 5 min read

Computation and Language
Improving Speech Recognition with Disfluency Detection

A new method enhances speech recognition systems by detecting interruptions in speech.

2025-06-11T05:08:42+00:00 ― 6 min read

Neural and Evolutionary Computing
Advancements in Spiking Neural Networks with Analog Circuits

A new system leverages spiking neural networks for efficient data processing.

2025-06-10T22:33:42+00:00 ― 5 min read

Computation and Language
Advancements in Multilingual Speech Translation Systems

New methods enhance translation accuracy and efficiency for multiple languages.

2025-06-10T16:14:30+00:00 ― 6 min read

Audio and Speech Processing
Challenges and Advancements in Keyword Spotting for Urdu

An overview of keyword spotting technologies and their challenges with the Urdu language.

2025-06-10T10:52:05+00:00 ― 6 min read

Audio and Speech Processing
Design Choices Impacting Speech Model Performance

A study on how design choices affect speech foundation models.

2025-06-10T06:00:35+00:00 ― 7 min read

Audio and Speech Processing
Improving Speech Recognition for Accents

This article discusses methods to enhance speech recognition for accented speech.

2025-06-08T12:42:50+00:00 ― 6 min read

Computation and Language
Improving Audio Language Models for Thai and English

This study addresses challenges in audio language models for low-resource languages.

2025-06-08T08:39:55+00:00 ― 5 min read

Audio and Speech Processing
Improving TTS Systems for Indian Languages

Enhancing speech synthesis in Indian languages using inter-pausal units.

2025-06-08T02:59:50+00:00 ― 6 min read

Sound
Advancing Automatic Speech Recognition with CADA-GAN

CADA-GAN enhances ASR systems' performance across various recording environments.

2025-06-07T23:45:30+00:00 ― 6 min read

Computer Vision and Pattern Recognition
Advancements in Audio-Visual Speech Recognition

Llama-AVSR merges audio and visual inputs for enhanced speech recognition accuracy.

2025-06-07T18:05:25+00:00 ― 6 min read

Sound
Advancements in Language Learning Feedback Systems

A new method uses virtual shadowing to enhance language learners' pronunciation feedback.

2025-06-07T05:56:40+00:00 ― 6 min read

Machine Learning
Advancements in Speech Recognition for Children

A new ASR method helps technology understand children's speech better.

2025-06-06T20:13:40+00:00 ― 5 min read

Computer Vision and Pattern Recognition
New System Combines Sound and Vision for Object Recognition

YOSS uses audio to improve object identification in images.

2025-06-05T10:22:06+00:00 ― 4 min read

Audio and Speech Processing
Building Better Speech Datasets for Under-Served Languages

A project developing speech and text datasets for languages with limited resources.

2025-06-04T06:41:20+00:00 ― 5 min read

Audio and Speech Processing
Improving Speaker Verification with CA-MHFA

A new framework enhances voice recognition and adapts to various speech tasks.

2025-06-04T05:52:45+00:00 ― 4 min read

Computation and Language
Advancements in Textless Speech Processing Techniques

New methods improve speech recognition for low-resource languages without text.

2025-06-03T18:32:35+00:00 ― 4 min read

Computation and Language
Improving Speech Recognition Through Phonetic Techniques

New methods enhance accuracy in speech recognition systems using phonetic understanding.

2025-06-03T16:55:25+00:00 ― 5 min read

Sound
Improving Speech Recognition with Human-Inspired Features

New acoustic features enhance ASR systems' performance in noisy environments.

2025-06-03T14:29:40+00:00 ― 4 min read

Audio and Speech Processing
Whisper-Medusa: Advancing Speech Recognition Efficiency

New model achieves faster speech transcription without sacrificing accuracy.

2025-06-03T00:43:45+00:00 ― 4 min read

Audio and Speech Processing
Matryoshka Speaker Embeddings: A Flexible Approach to Voice Recognition

Discover how Matryoshka embeddings improve speaker recognition efficiency and flexibility.

2025-06-02T20:40:50+00:00 ― 4 min read

Sound
Advancements in Text-to-Speech Adaptation

New model VoiceGuider improves TTS for diverse speakers.

2025-06-02T19:03:40+00:00 ― 6 min read

Audio and Speech Processing
Advancements in Speech Recognition Technology

A new method improves speech recognition for long recordings.

2025-05-30T21:54:17+00:00 ― 5 min read

Audio and Speech Processing
Advancements in Speech Language Models Without Extensive Training Data

New method for speech language models reduces need for extensive data.

2025-05-29T17:50:26+00:00 ― 6 min read

Audio and Speech Processing
The Evolution of Speaker Diarization

How new methods are transforming speaker identification in audio recordings.

2025-05-25T18:57:25+00:00 ― 6 min read

Sound
Target Speaker Extraction: Enhancing Clarity in Noisy Settings

Learn how TSE improves speech recognition in crowded environments using text cues.

2025-05-25T00:14:51+00:00 ― 6 min read

Audio and Speech Processing
Using Voice Assistants to Detect Mild Cognitive Impairment

Voice assistants help identify early signs of memory issues in older adults.

2025-05-24T01:31:44+00:00 ― 7 min read

Sound
Mamba: Advancing Speech Recognition Technology

Mamba enhances speech recognition with speed and accuracy, reshaping interaction with devices.

2025-05-19T22:39:54+00:00 ― 4 min read

Sound
Using Visual Cues to Clear Up Speech in Noise

New method enhances speech clarity using visual information from surroundings.

2025-05-18T20:42:14+00:00 ― 5 min read

Sound
SAMOS: Advancing Speech Quality Assessment

SAMOS offers a new way to measure speech quality, enhancing naturalness.

2025-05-11T19:57:24+00:00 ― 6 min read

Sound
Tiny-Align: A New Approach to Voice Assistants

Tiny-Align enhances voice assistants for better personal interaction on small devices.

2025-05-07T01:43:40+00:00 ― 6 min read

Machine Learning
VQalAttent: A New Approach to Speech Generation

Introducing VQalAttent, a simpler model for generating realistic machine speech.

2025-05-05T05:35:38+00:00 ― 5 min read

Audio and Speech Processing
United-MedASR: Improving Medical Speech Recognition

A new ASR system enhances medical speech recognition for accurate patient care.

2025-04-30T00:58:50+00:00 ― 6 min read

Sound
Detecting Deepfakes: The Role of ASR Models

Exploring how ASR models help identify speech deepfakes effectively.

2025-04-24T01:54:40+00:00 ― 7 min read

Computation and Language
A New Method for Speaker-Attributed Speech Recognition

Efficiently tracks speakers in multilingual settings using automatic speech recognition.

2025-04-20T15:33:18+00:00 ― 6 min read

Audio and Speech Processing
Advancing Speech Recognition for Dysfluency

Improving machine transcription for better understanding of speech disorders.

2025-04-17T08:35:42+00:00 ― 5 min read

Computation and Language
Enhancing Speech Recognition with Pinyin

New model improves Chinese speech recognition accuracy significantly.

2025-04-15T08:10:03+00:00 ― 6 min read

Sound
Introducing Noro: A Reliable Voice Conversion System

Noro enhances voice conversion, making it effective even in noisy settings.

2025-04-15T07:14:42+00:00 ― 6 min read

Computation and Language
GLM-4-Voice: The Next Step in Chatbots

A new chatbot offering human-like conversations with emotional awareness.

2025-04-02T18:12:36+00:00 ― 3 min read

Computation and Language
Transforming Speech Recognition: New Evaluation Methods

Discover how style-agnostic evaluation improves Automatic Speech Recognition systems.

2025-03-26T13:05:15+00:00 ― 7 min read