Latest Articles for ASR

Cryptography and Security New Backdoor Attack Method for Large ML Models

A resource-efficient approach to backdoor attacks on advanced machine learning models.

2025-08-07T00:01:36+00:00 ― 5 min read

Computation and Language Advancements in Federated Learning for Speech Recognition

Harnessing early-exit models for efficient federated learning in ASR systems.

2025-08-06T09:48:24+00:00 ― 8 min read

Machine Learning Advancements in Automatic Speech Recognition with Denoising Language Models

Denoising Language Models improve error correction in speech recognition systems using synthetic data.

2025-08-03T22:34:10+00:00 ― 7 min read

Audio and Speech Processing Advancements in Speech Enhancement with VPIDM

New model VPIDM improves clarity of speech in noisy environments.

2025-08-03T16:54:05+00:00 ― 6 min read

Robotics Advancements in Desktop-Level Robots

A study on desktop robots using natural language and visual recognition technologies.

2025-08-03T13:39:45+00:00 ― 12 min read

Computation and Language Enhancing Language Model Stability Against Attacks

New methods improve language model predictions under varying input conditions.

2025-08-03T07:56:30+00:00 ― 6 min read

Audio and Speech Processing Introducing the 4D Model in Speech Recognition

A new model improves speech recognition using multiple decoding methods.

2025-08-01T01:44:35+00:00 ― 6 min read

Artificial Intelligence New Approach to Evaluate Multilingual Models

A fresh method for testing language model safety and multilingual skills.

2025-07-28T02:37:54+00:00 ― 7 min read

Artificial Intelligence Mitigating Backdoor Attacks in Language Models

A new defense strategy for LLMs against backdoor attacks.

2025-07-26T23:22:36+00:00 ― 5 min read

Computation and Language Improving Speech Error Correction in ASR Systems

A new method combines acoustic features and confidence scores for better error correction.

2025-07-25T20:45:15+00:00 ― 5 min read

Computation and Language Improving Chinese Speech Recognition Through Pinyin Regularization

This study presents a dataset and method to enhance Chinese ASR accuracy using Pinyin.

2025-07-25T07:47:55+00:00 ― 7 min read

Computation and Language Advancing Speech Technology for Tunisian Arabic

This study evaluates speech technology in low-resource languages like Tunisian Arabic.

2025-07-21T12:18:00+00:00 ― 5 min read

Audio and Speech Processing Introducing Emilia: A New Speech Generation Dataset

Emilia provides a diverse dataset for improving speech generation models.

2025-07-20T09:34:45+00:00 ― 6 min read

Audio and Speech Processing Improving Number Formatting in ASR Transcripts

This article discusses ways to enhance numeric expression formatting in automatic transcripts.

2025-07-14T15:55:35+00:00 ― 5 min read

Computation and Language Advancements in Speech Translation Technology

A new model aims to improve speech translation quality through integrated systems.

2025-07-11T02:54:20+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Recognition with AI Collaboration

AI models enhance accuracy of speech-to-text conversions.

2025-07-07T09:50:10+00:00 ― 5 min read

Computation and Language Improving Speech Recognition for Specialized Terms

Research enhances ASR systems using language models for better accuracy.

2025-07-06T20:41:12+00:00 ― 7 min read

Computation and Language Improving Speech Recognition with Context Noise Representation Learning

A method to enhance speech recognition quality in noisy environments.

2025-07-01T23:28:15+00:00 ― 6 min read

Multimedia Advancements in E-Commerce Product Retrieval

A new method enhances product searches across different media formats.

2025-07-01T08:45:24+00:00 ― 6 min read

Artificial Intelligence SAGE-RT: A New Method for Language Model Safety

SAGE-RT creates synthetic data to improve language model safety assessments.

2025-06-28T06:37:42+00:00 ― 5 min read

Sound Advancements in Voice Quality Assessment Using Technology

New methods improve voice quality assessments for patients with vocal system issues.

2025-06-26T07:26:15+00:00 ― 6 min read

Computation and Language Assessing Automatic Speech Recognition Accuracy

A look at measuring accuracy in speech recognition systems with new methods.

2025-06-22T20:50:45+00:00 ― 5 min read

Computation and Language Improving Automatic Speech Recognition with Language Models

New method enhances ASR accuracy using language models for better transcriptions.

2025-06-21T20:33:15+00:00 ― 4 min read

Sound Advancements in Multi-Speaker Speech Recognition

New methods improve speech recognition in challenging multi-speaker situations.

2025-06-20T21:52:55+00:00 ― 4 min read

Computation and Language Using Speech Data for Autism Diagnosis

A new method leverages speech data to improve autism assessments.

2025-06-19T19:12:12+00:00 ― 6 min read

Audio and Speech Processing Enhancing Automatic Speech Recognition with Modularity

Research on modular ASR systems aims to improve performance in noisy environments.

2025-06-16T17:28:35+00:00 ― 4 min read

Audio and Speech Processing Advancements in Speech Recognition with Sortformer

Sortformer integrates speaker diarization and ASR for improved audio processing.

2025-06-15T09:05:15+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Recognition in Multi-Speaker Settings

A new approach enhances ASR by focusing on specific speaker details.

2025-06-11T17:38:15+00:00 ― 5 min read

Sound ESPnet-EZ: Simplifying Speech Model Development

An easy-to-use tool for fine-tuning speech models without complex code.

2025-06-11T15:12:30+00:00 ― 6 min read

Robotics Improving Robot Speech Recognition for Better Collaboration

A new model helps robots follow unclear human instructions more effectively.

2025-06-11T14:53:18+00:00 ― 6 min read

Sound Advancing Automatic Speech Recognition with CADA-GAN

CADA-GAN enhances ASR systems' performance across various recording environments.

2025-06-07T23:45:30+00:00 ― 6 min read

Computation and Language Advancing Speech Recognition with Implicit Techniques

A new method improves speech interactions by integrating recognition and response processes.

2025-06-06T03:21:12+00:00 ― 5 min read

Audio and Speech Processing Evaluating Neural Audio Codecs: Insights from Codec-SUPERB Challenge

A look at the Codec-SUPERB challenge results and codec performance metrics.

2025-06-05T06:58:50+00:00 ― 5 min read

Computation and Language Innovating Speech Recognition for Malasar Language

A project improves speech recognition for the Malasar language using Tamil resources.

2025-05-23T02:48:37+00:00 ― 5 min read

Sound Mamba: Advancing Speech Recognition Technology

Mamba enhances speech recognition with speed and accuracy, reshaping interaction with devices.

2025-05-19T22:39:54+00:00 ― 4 min read

Computation and Language Bridging Bangla Dialects: A Unified Approach

This project aims to standardize Bangla dialects for clearer communication.

2025-05-12T19:19:18+00:00 ― 6 min read

Audio and Speech Processing United-MedASR: Improving Medical Speech Recognition

A new ASR system enhances medical speech recognition for accurate patient care.

2025-04-30T00:58:50+00:00 ― 6 min read

Computation and Language A New Method for Speaker-Attributed Speech Recognition

A method that efficiently tracks speakers in multilingual settings using automatic speech recognition.

2025-04-20T15:33:18+00:00 ― 6 min read

Computation and Language Enhancing Speech Recognition with Pinyin

New model improves Chinese speech recognition accuracy significantly.

2025-04-15T08:10:03+00:00 ― 6 min read

Computation and Language Saving Neo-Aramaic: A Language in Peril

Efforts to document and preserve the endangered Neo-Aramaic language.

2025-04-13T14:26:15+00:00 ― 6 min read