Electrical Engineering and Systems Science - Audio and Speech Processing

Audio and Speech Processing Advancing Multilingual Automatic Speech Recognition with Adaptive Masking

A new approach improves efficiency in multilingual ASR models by integrating adaptive masking techniques.

2025-09-06T09:00:15+00:00 ― 5 min read

Sound Using Deepfake Audio for Better Transcription Systems

Investigating deepfake audio to enhance transcription models for less common languages.

2025-09-06T07:23:05+00:00 ― 8 min read

Machine Learning Improving Weak Label Learning Through Negative Example Selection

New strategies enhance weak label learning by selecting relevant negative examples.

2025-09-06T04:57:20+00:00 ― 6 min read

Sound New Watermarking Technique for Audio Models

A novel method to watermark audio created by diffusion models for ownership protection.

2025-09-06T04:08:45+00:00 ― 6 min read

Audio and Speech Processing Improving Speech Recognition with Memory Networks

New techniques improve how ASR systems recognize long-form speech.

2025-09-06T03:20:10+00:00 ― 5 min read

Audio and Speech Processing Advancements in Keyword Spotting Systems

New techniques aim to boost the accuracy of voice-activated devices against attacks.

2025-09-06T01:43:00+00:00 ― 6 min read

Audio and Speech Processing DurIAN-E: Advancing Text-to-Speech Technology

DurIAN-E improves synthetic speech with enhanced expressiveness and natural flow.

2025-09-06T00:54:25+00:00 ― 4 min read

Audio and Speech Processing Advancements in Speech Emotion Recognition Technology

Discover how SER enhances human-machine interactions through emotion detection.

2025-09-06T00:05:50+00:00 ― 5 min read

Audio and Speech Processing Efficient Model Selection for Speech Recognition

A method to choose the best ASR model based on audio features.

2025-09-05T23:17:15+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Clarity with Dereverberation Techniques

Learn how dereverberation boosts speech recognition in noisy environments.

2025-09-05T12:45:40+00:00 ― 4 min read

Sound Introducing Coco-Nut: A Comprehensive Voice Database for TTS

Coco-Nut offers diverse Japanese voice samples for advanced text-to-speech applications.

2025-09-05T11:57:05+00:00 ― 10 min read

Audio and Speech Processing New Method for Room Volume Estimation Using Attention Models

This study presents an attention-based model for estimating room volumes from audio recordings.

2025-09-05T11:08:30+00:00 ― 5 min read

Sound Introducing ASCA: A New Approach to Audio Classification

The ASCA model improves audio classification accuracy on small datasets.

2025-09-05T10:19:55+00:00 ― 5 min read

Computation and Language My Science Tutor Project: A New Way to Learn

MyST aims to improve children's science learning through virtual tutoring.

2025-09-05T09:31:20+00:00 ― 5 min read

Sound Evaluating Sound Event Localization with Different Audio Setups

A study compares the sound localization accuracy of four-channel and two-channel audio formats.

2025-09-05T08:42:45+00:00 ― 5 min read

Sound Advancements in Meeting Transcription Technology

A look at M2MeT 2.0 and its impact on meeting transcription.

2025-09-05T03:51:15+00:00 ― 5 min read

Audio and Speech Processing Advancements in Speaker Anonymization Using Neural Audio Codecs

A new audio processing method enhances speaker anonymity while maintaining speech clarity.

2025-09-05T01:25:30+00:00 ― 5 min read

Sound Transforming Tongue Movements into Speech Sounds

This study converts MRI tongue data into real speech audio.

2025-09-04T22:11:10+00:00 ― 4 min read

Audio and Speech Processing Advances and Challenges in Speech Recognition Models

This study examines how model compression impacts speech recognition in noisy environments.

2025-09-04T19:45:25+00:00 ― 5 min read

Audio and Speech Processing Advancements in Sound Event Detection with OAL

Explore how Online Active Learning improves sound recognition efficiency.

2025-09-04T18:56:50+00:00 ― 6 min read

Sound Advancements in Audio and Speech Recognition Models

A new model improves understanding of speech and sounds simultaneously.

2025-09-04T18:08:15+00:00 ― 6 min read

Machine Learning Automatic Classification in Motivational Interviewing

A system that classifies client language in therapy sessions using multiple communication methods.

2025-09-04T16:31:05+00:00 ― 6 min read

Audio and Speech Processing Advancements in Dysarthria Detection Using Machine Learning

New technology improves dysarthria detection and severity classification.

2025-09-04T11:39:35+00:00 ― 5 min read

Audio and Speech Processing Advancements in Voice Pathology Detection

New methods enhance early detection of voice problems using glottal source features.

2025-09-04T10:02:25+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Recognition for Diverse Accents

Enhancing speech models to better recognize and adapt to different accents.

2025-09-04T08:25:15+00:00 ― 4 min read

Sound Advancements in Audio Classification Using DCLS

DCLS enhances audio classification performance by learning kernel positions during training.

2025-09-04T07:36:40+00:00 ― 5 min read

Computer Vision and Pattern Recognition Improving Audio-Visual Learning with Speed Co-Augmentation

A new method enhances machine learning of audio-visual data.

2025-09-04T05:59:30+00:00 ― 5 min read

Audio and Speech Processing Advancements in Speech Extraction Technology

Introducing new models for better speech extraction in noisy environments.

2025-09-04T02:45:10+00:00 ― 5 min read

Computation and Language Improving Speech Recognition with Low-Rank Adaptation

A new method enhances speech recognition efficiency using low-rank adaptation.

2025-09-04T00:19:25+00:00 ― 5 min read

Signal Processing A New Approach to Identifying Schizophrenia Symptoms

Combining audio, video, and text for better mental health assessments.

2025-09-03T22:42:15+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Recognition with New Techniques

A look at advancements in speech recognition to boost speed and accuracy.

2025-09-03T21:05:05+00:00 ― 5 min read

Computation and Language Bridging Language Gaps in Healthcare

Improving doctor-patient communication through advanced speech recognition technologies.

2025-09-03T18:39:20+00:00 ― 6 min read

Cryptography and Security The Privacy Risks of Voice-Controlled Devices

Explore the privacy and security threats of voice-controlled technology.

2025-09-03T16:13:35+00:00 ― 4 min read

Sound Synthia's Melody: A New Tool for Audio Research

Synthia's Melody helps researchers test audio models against varied data.

2025-09-03T14:36:25+00:00 ― 5 min read

Computation and Language Addressing Challenges in Long-Form Automatic Speech Recognition

Research focuses on improving ASR systems for unsegmented audio.

2025-09-03T13:47:50+00:00 ― 4 min read

Audio and Speech Processing Advancing Vocal Synthesis for Realistic Audio

Research focuses on optimizing synthesizers for human vocalizations in various media.

2025-09-03T09:44:55+00:00 ― 5 min read

Audio and Speech Processing Advancing Speaker Verification: Addressing Session Variability

A new method improves speaker verification by managing session variability effectively.

2025-09-03T08:56:20+00:00 ― 6 min read

Computation and Language Improving Speech Recognition with Large Language Models

LLMs enhance accuracy and error correction in speech recognition systems.

2025-09-03T06:30:35+00:00 ― 5 min read

Audio and Speech Processing MC-SimCLR: Advancing Sound Learning and Location Awareness

A new method enhances sound recognition and source location without labels.

2025-09-03T00:50:30+00:00 ― 5 min read

Computation and Language HyPoradise: Advancing Automatic Speech Recognition Accuracy

A new benchmark to improve ASR accuracy using language models.

2025-09-02T23:13:20+00:00 ― 6 min read