Electrical Engineering and Systems Science - Audio and Speech Processing

Audio and Speech Processing Improving Speaker Verification with CA-MHFA

A new framework enhances voice recognition and adapts to various speech tasks.

2025-06-04T05:52:45+00:00 ― 4 min read

Sound Addressing the Rise of Deepfake Speech Detection

New methods are needed to detect advanced deepfake speech technologies.

2025-06-04T05:04:10+00:00 ― 5 min read

Sound Improving Bioacoustic Event Detection with New Strategies

New methods boost accuracy in identifying animal sounds from limited data.

2025-06-04T04:15:35+00:00 ― 5 min read

Sound Advancements in Augmented Reality Sound Design

New method improves virtual sound integration in AR environments.

2025-06-04T00:12:40+00:00 ― 6 min read

Sound Advancing Voice Privacy with New Conversion Techniques

A new method aims to preserve voice privacy while allowing for effective communication.

2025-06-03T23:24:05+00:00 ― 4 min read

Computation and Language Advancements in Textless Speech Processing Techniques

New methods improve speech recognition for low-resource languages without text.

2025-06-03T18:32:35+00:00 ― 4 min read

Computation and Language Improving Speech Recognition Through Phonetic Techniques

New methods enhance accuracy in speech recognition systems using phonetic understanding.

2025-06-03T16:55:25+00:00 ― 5 min read

Multimedia A New System for Real-Time Speech and Gesture Generation

This framework improves real-time animations by synchronizing speech and gestures seamlessly.

2025-06-03T15:18:15+00:00 ― 5 min read

Sound Improving Speech Recognition with Human-Inspired Features

New acoustic features enhance ASR systems' performance in noisy environments.

2025-06-03T14:29:40+00:00 ― 4 min read

Audio and Speech Processing Advancing Speech Processing with Consistency in Phase Reconstruction

A new loss function boosts audio quality by aligning phase and magnitude.

2025-06-03T12:03:55+00:00 ― 6 min read

Sound New Model Makes Text-to-Speech More Human

A new TTS model adds emotional depth to computer-generated speech.

2025-06-03T09:38:10+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Recognition for Child-Adult Conversations

Evaluating speech recognition models for autism diagnostic sessions.

2025-06-03T08:01:00+00:00 ― 6 min read

Audio and Speech Processing Advancements in Speech Restoration Techniques

Recent methods improve audio clarity and quality using advanced models.

2025-06-03T07:12:25+00:00 ― 6 min read

Sound New Method for Detecting Speech Deepfakes

A fresh approach improves detection of fake audio recordings.

2025-06-03T05:35:15+00:00 ― 5 min read

Audio and Speech Processing Advancements in Neural Codecs with ESPnet-Codec

ESPnet-Codec enhances training and evaluation of neural codecs for audio and speech.

2025-06-03T03:09:30+00:00 ― 7 min read

Audio and Speech Processing Adjusting Sample Rates for Realistic Audio Effects

Exploring methods to adapt RNNs for varying audio sample rates.

2025-06-03T01:32:20+00:00 ― 6 min read

Audio and Speech Processing Whisper-Medusa: Advancing Speech Recognition Efficiency

New model achieves faster speech transcription without sacrificing accuracy.

2025-06-03T00:43:45+00:00 ― 4 min read

Audio and Speech Processing Matryoshka Speaker Embeddings: A Flexible Approach to Voice Recognition

Discover how Matryoshka embeddings improve speaker recognition efficiency and flexibility.

2025-06-02T20:40:50+00:00 ― 4 min read

Sound NanoVoice: Advancing Personalized Text-to-Speech Technology

Introducing NanoVoice, a quick and efficient text-to-speech model for personalized audio.

2025-06-02T19:52:15+00:00 ― 5 min read

Sound Advancements in Text-to-Speech Adaptation

New model VoiceGuider improves TTS for diverse speakers.

2025-06-02T19:03:40+00:00 ― 6 min read

Sound Advancements in Multilingual Voice Conversion

A novel method for converting voices across languages while preserving unique characteristics.

2025-06-02T15:49:20+00:00 ― 5 min read

Audio and Speech Processing Advancements in Text-to-Speech Style Transfer

New techniques improve expressive speech quality across different speakers.

2025-06-02T15:00:45+00:00 ― 5 min read

Sound Improving Music Classification with Perceptual Metrics

This article explores the role of perceptual metrics in music genre classification.

2025-06-02T12:35:00+00:00 ― 4 min read

Audio and Speech Processing Advancing Multi-Task Learning in Speech Models

A new method improves speech and audio processing across multiple tasks.

2025-06-02T10:57:50+00:00 ― 5 min read

Audio and Speech Processing Improving Speaker Diarization in Meetings

A new system enhances speaker identification during discussions with multiple participants.

2025-06-02T06:54:55+00:00 ― 5 min read

Audio and Speech Processing Advancements in Emotional Text-to-Speech Technology

A new framework enhances emotional expression in TTS systems.

2025-06-02T02:52:00+00:00 ― 5 min read

Sound Pressure Sensors: A New Eavesdropping Risk

Recent findings reveal pressure sensors can be used for eavesdropping.

2025-06-01T13:54:40+00:00 ― 4 min read

Sound Advancements in Sound Event Detection with PMAM

A new algorithm improves sound event detection using self-supervised learning.

2025-06-01T10:40:20+00:00 ― 5 min read

Sound Tackling the Challenge of Fake Speech Detection

Research focuses on improving methods for detecting realistic fake speech.

2025-06-01T09:51:45+00:00 ― 5 min read

Machine Learning Advancements in Audio-Video Generation Techniques

A new method streamlines audio and video creation for better synchronization.

2025-06-01T08:14:35+00:00 ― 5 min read

Audio and Speech Processing Text2FX: Simplifying Audio Effects with Language

Control audio effects using simple language descriptions for easier sound adjustments.

2025-06-01T00:08:45+00:00 ― 5 min read

Sound Advancing Multi-Audio Processing with MALLM

Introducing a new model and benchmark for evaluating multi-audio tasks.

2025-05-31T19:17:15+00:00 ― 5 min read

Sound Animating Emotions for Realistic Talking Heads

A new system models emotional intensity in animated characters for enhanced realism.

2025-05-31T16:51:30+00:00 ― 6 min read

Sound OpenSep: Advancing Audio Separation Technology

OpenSep automates audio separation for clearer sound experiences without manual input.

2025-05-31T07:15:34+00:00 ― 6 min read

Sound PALM: A New Approach to Audio Recognition

PALM enhances audio recognition by optimizing prompt representation and efficiency.

2025-05-31T01:54:50+00:00 ― 4 min read

Audio and Speech Processing Understanding Guitar Pickups: Wire Turns and Gauge

Explore how wire turns and gauge impact guitar pickup sound.

2025-05-31T00:34:39+00:00 ― 7 min read

Audio and Speech Processing Advancements in Speech Recognition Technology

A new method improves speech recognition for long recordings.

2025-05-30T21:54:17+00:00 ― 5 min read

Sound Integrating Audio-Visual Data for Speech Processing

This study analyzes how audio, video, and text work together in speech recognition.

2025-05-30T15:13:22+00:00 ― 7 min read

Computation and Language Advancing Text-to-Speech with New Intonation Model

A new model improves naturalness in text-to-speech systems by analyzing pitch patterns.

2025-05-30T01:51:32+00:00 ― 4 min read

Computation and Language Advancing Speech Technology for African Languages

A new model enhances speech representation for African languages, boosting inclusivity in technology.

2025-05-29T21:50:59+00:00 ― 5 min read