New model improves speech recognition in noisy environments by focusing on a single speaker.
― 4 min read
New methods aim to protect speech privacy in audio monitoring systems.
― 5 min read
A new dataset enhances speech synthesis by capturing emotional expression without relying on text.
― 5 min read
New strategies to enhance training stability for music pitch classification.
― 6 min read
Phoneme Hallucinator transforms voice conversion with limited data for clearer outputs.
― 5 min read
A new method creates realistic gestures from raw speech audio.
― 5 min read
Researchers develop Neural Latent Aligner to better interpret brain signals during speaking tasks.
― 6 min read
Enhancing hybrid ASR systems for bilingual speech using grapheme units.
― 5 min read
A new model improves speech and text alignment for better automatic recognition.
― 6 min read
Lip2Vec enhances visual speech recognition using less labeled data.
― 7 min read
New methods enhance accuracy and speed in speech recognition systems.
― 5 min read
O-1 improves speech recognition by optimizing self-training methods.
― 5 min read
A new method enhances ASR performance through text data integration.
― 6 min read
Text injection helps recognize personal information while maintaining privacy.
― 5 min read
Discover how new techniques are transforming sound event detection for various applications.
― 6 min read
Exploring nonlinear methods in audio for music production and speech analysis.
― 6 min read
A new method for accurate pitch detection in music and sound.
― 5 min read
Radio2Text uses mmWave signals for real-time speech recognition in noisy environments.
― 6 min read
A study examines the effectiveness of automated sound maskers in public spaces.
― 5 min read
Graph neural networks improve speaker recognition accuracy by analyzing voice sample relationships.
― 5 min read
A study evaluating emotion recognition in speech models across six languages.
― 5 min read
AffectEcho model enhances emotional expression in AI-generated speech.
― 6 min read
This study enhances G2P models by focusing on error-prone areas during training.
― 5 min read
Discover methods that improve accuracy in formant tracking for speech analysis.
― 6 min read
Researchers develop speech-based methods for more accurate Parkinson's disease assessment.
― 5 min read
Meta-SELD enhances sound event localization in diverse environments.
― 5 min read
AVMIT offers researchers insights into how sound and vision relate in action recognition.
― 6 min read
A new AI model enhances the prediction of audio quality scores.
― 5 min read
This research examines how sampling methods affect AI-generated music quality.
― 5 min read
A new method improves detection of fake audio in voice recognition systems.
― 6 min read
New methods enhance beat tracking accuracy in complex classical music.
― 6 min read
A look at how language diarization helps in multilingual conversations.
― 4 min read
A new framework simplifies audio texture generation by reducing labeling needs.
― 6 min read
A new system improves voice recognition in loud settings using advanced techniques.
― 5 min read
Assessing the effectiveness of voice anonymization without losing natural sound.
― 6 min read
New models enhance audio classification accuracy and resilience against noise and attacks.
― 4 min read
An overview of AI tools for music creation and their unique features.
― 11 min read
Research explores deep learning for creating audio to match silent video content.
― 6 min read
A new method enhances sound recordings using visual cues.
― 6 min read
A look at how XLS-R models improve audio quality assessment in online meetings.
― 5 min read