Electrical Engineering and Systems Science - Audio and Speech Processing

Audio and Speech Processing Advancements in Automatic Speech Recognition Technology

New method improves speech recognition models while reducing knowledge loss.

2025-11-09T22:48:45+00:00 ― 4 min read

Latest Articles

Audio and Speech Processing Advancements in Hearing Aid Noise Reduction

Discover new methods to enhance hearing aid performance and speech clarity.

2025-11-09T06:37:05+00:00 ― 5 min read

Sound New Self-Supervised Approach for Speech Recognition

A novel method improves speech recognition tasks using less labeled data.

2025-11-09T00:08:25+00:00 ― 5 min read

Sound Advancements in Audio Captioning Techniques

This article examines recent improvements in creating written audio descriptions.

2025-11-08T21:42:40+00:00 ― 5 min read

Sound New Audio Fingerprinting System for TVs

Efficient audio recognition technology designed for low-power television devices.

2025-11-08T13:36:50+00:00 ― 4 min read

Sound Introducing SCHmUBERT: A New Model for Music Generation

SCHmUBERT offers a fresh approach to creating symbolic music with AI.

2025-11-08T12:48:15+00:00 ― 6 min read

Computer Vision and Pattern Recognition Addressing the Invasion of Pomacea canaliculata

Using AI to identify the pink eggs of an invasive snail for better management.

2025-11-08T11:11:05+00:00 ― 5 min read

Sound Advancements in Confidence Estimation for Speech Recognition

A new model enhances confidence scores in speech recognition systems.

2025-11-08T02:16:40+00:00 ― 5 min read

Audio and Speech Processing Advancements in Dysarthric Speech Recognition

New techniques improve understanding of dysarthric speech in communication systems.

2025-11-08T01:28:05+00:00 ― 5 min read

Sound Advancements in Speech Separation Techniques

A novel unsupervised approach enhances voice isolation in audio mixtures.

2025-11-07T23:50:55+00:00 ― 4 min read

Sound ML-SUPERB: Benchmarking Multilingual Speech Models

A new benchmark for evaluating how well machine learning models understand speech across languages.

2025-11-07T23:02:20+00:00 ― 6 min read

Computation and Language Improving Phone Classification in Speech Recognition

This article discusses methods to enhance phone classification using audio features.

2025-11-07T21:25:10+00:00 ― 6 min read

Audio and Speech Processing Advancing AI: Human-Like Audio Understanding

A new model enhances audio perception and reasoning capabilities in AI.

2025-11-07T16:33:40+00:00 ― 6 min read

Sound Advancements in Speech Separation with NASS

NASS improves voice isolation in noisy environments, outperforming traditional methods.

2025-11-07T15:45:05+00:00 ― 4 min read

Audio and Speech Processing Improving Synthetic Voices through Audio Enhancement

A novel approach to enhance audio quality for synthetic voice creation.

2025-11-07T14:07:55+00:00 ― 6 min read

Audio and Speech Processing Advancements in Sound Event Detection with Multi-Task Learning

New techniques improve sound recognition efficiency and reduce labeling costs.

2025-11-07T13:19:20+00:00 ― 6 min read

Sound Updating Sound Quality Metrics for Better Accuracy

Enhancing sound quality metrics using new loudness calculation methods.

2025-11-07T12:30:45+00:00 ― 5 min read

Computation and Language Advancements in Real-Time Speech Translation

AlignAtt enhances simultaneous speech translation with improved speed and quality.

2025-11-07T11:42:10+00:00 ― 5 min read

Sound Balancing Privacy and Efficiency in Speech Models

A new method ensures privacy in speech classification without sacrificing performance.

2025-11-07T10:05:00+00:00 ― 6 min read

Sound Adapting Text-to-Speech Accents with Ease

This study shows how to adapt TTS technology to different accents efficiently.

2025-11-07T09:16:25+00:00 ― 5 min read

Human-Computer Interaction Advancing Socially Interactive Agents with AMII Model

AMII model enhances communication for socially interactive agents through improved non-verbal behavior.

2025-11-07T08:27:50+00:00 ― 5 min read

Audio and Speech Processing Improving Parkinson's Detection with Federated Learning

Using federated learning to enhance speech analysis for Parkinson's diagnosis across languages.

2025-11-07T07:39:15+00:00 ― 5 min read

Computation and Language Identifying Arabic Dialects with Modern Techniques

This study focuses on recognizing Arabic dialects using advanced methods and limited data.

2025-11-07T06:02:05+00:00 ― 4 min read

Computer Vision and Pattern Recognition A New Model for Multi-Modal Data Processing

Introducing a model that integrates various data types for complex tasks.

2025-11-07T05:13:30+00:00 ― 6 min read

Sound Advancements in Bioacoustic Sound Detection

Researchers are improving how we detect animal sounds automatically.

2025-11-07T05:03:27+00:00 ― 6 min read

Audio and Speech Processing Whisper's Versatile Speech Recognition Abilities

Discover how Whisper adapts to various speech tasks using prompt engineering.

2025-11-07T04:24:55+00:00 ― 5 min read

Computation and Language Improving Speech Recognition for Minority Languages

This study examines ways to enhance ASR for low-resource languages using data techniques.

2025-11-07T01:59:10+00:00 ― 4 min read

Audio and Speech Processing FastFit: A New Approach to Speech Generation

FastFit improves speech generation speed without losing sound quality.

2025-11-07T00:22:00+00:00 ― 5 min read

Audio and Speech Processing Advancements in Keyword Spotting with TACos

A new method improves keyword detection in audio recordings.

2025-11-06T23:33:25+00:00 ― 5 min read

Audio and Speech Processing A New Method for Measuring Tongue Movement in Speech

This study introduces a method to better measure tongue movement during speech using X-ray data.

2025-11-06T21:56:15+00:00 ― 6 min read

Sound Advancements in Speaker Diarization with AED-EEND

AED-EEND system enhances speaker diarization by integrating advanced techniques for better accuracy.

2025-11-06T20:19:05+00:00 ― 5 min read

Audio and Speech Processing Pengi: Bridging Audio and Text Processing

Pengi merges audio understanding and text generation into a single model.

2025-11-06T19:30:30+00:00 ― 7 min read

Audio and Speech Processing Reducing Latency in Speech Recognition with Delay-Penalized CTC

A new approach aims to minimize delays in speech recognition systems while maintaining accuracy.

2025-11-06T17:53:20+00:00 ― 4 min read

Audio and Speech Processing Advancing Keyword Spotting with Continuous Learning

A new method enhances keyword spotting systems for better performance in changing audio.

2025-11-06T17:04:45+00:00 ― 4 min read

Sound Advancements in Multilingual Text-to-Speech Technology

A new TTS system enhances speech generation across multiple languages with limited data.

2025-11-06T13:50:25+00:00 ― 6 min read

Computer Vision and Pattern Recognition Composable Diffusion: A New Frontier in Content Creation

CoDi enables simultaneous generation of diverse content types from various inputs.

2025-11-06T13:01:50+00:00 ― 4 min read

Sound Advancements in Sound Separation Using Deep Learning

New techniques improve sound separation from Ambisonics mixes for better audio experiences.

2025-11-06T12:13:15+00:00 ― 6 min read

Audio and Speech Processing Advancements in Speech Model Compression Techniques

A new method improves speech models while reducing resource needs.

2025-11-06T11:24:40+00:00 ― 6 min read

Sound Advancements in Speech-Based Health Monitoring

New methods using speech show promise in identifying breathing patterns and health conditions.

2025-11-06T10:36:05+00:00 ― 4 min read

Sound MIDI-Draw: A New Way to Create Melodies

MIDI-Draw allows anyone to make music by drawing melodies intuitively.

2025-11-06T09:47:30+00:00 ― 5 min read

Sound Innovative Methods for Assessing Audio Quality

New techniques borrowing from image processing enhance audio quality evaluation.

2025-11-06T08:58:55+00:00 ― 6 min read