Electrical Engineering and Systems Science - Audio and Speech Processing

Computation and Language Improving Language Learning with L1-MultiMDD

A new system enhances pronunciation skills by considering first language influences.

2025-09-12T01:50:50+00:00 ― 5 min read

Emerging Technologies Quantum Computing Meets Music Composition

Discover how quantum tools change music creation and performance.

2025-09-12T00:31:30+00:00 ― 6 min read

Audio and Speech Processing Advancements in Voice Conversion Technology

New method improves emotion preservation in voice conversion processes.

2025-09-12T00:13:40+00:00 ― 6 min read

Audio and Speech Processing Emo-StarGAN: Advancing Voice Conversion Technology

New method preserves emotional tone in voice conversion for better human-computer interaction.

2025-09-11T23:25:05+00:00 ― 5 min read

Computation and Language Advancements in Direct Text to Speech Translation

New systems improve translation from text to spoken language without intermediates.

2025-09-11T20:59:20+00:00 ― 4 min read

Audio and Speech Processing Improving Heart Sound Classification with Data Augmentation

Researchers enhance heart sound classification accuracy using codec data augmentation methods.

2025-09-11T19:22:10+00:00 ― 5 min read

Sound Emotional Speech Challenges Speech Separation Models

Research reveals emotional speech impacts model performance in speech separation tasks.

2025-09-11T18:33:35+00:00 ― 6 min read

Sound M-AUDIODEC: A New Way to Compress Audio

M-AUDIODEC compresses multi-channel audio while retaining speaker position and quality.

2025-09-11T16:56:25+00:00 ― 6 min read

Sound Advancements in Speech-to-Speech Translation Technology

New methods in S2ST improve translation quality while maintaining speaker identity.

2025-09-11T16:07:50+00:00 ― 5 min read

Sound Advancing Audio Compression with Neural Techniques

A novel system enhances spatial audio compression for clearer sound and efficiency.

2025-09-11T15:19:15+00:00 ― 4 min read

Audio and Speech Processing MusiLingo: Bridging Music and Language

A new system that connects music and language for better understanding.

2025-09-11T14:30:40+00:00 ― 6 min read

Audio and Speech Processing Improving Sound Quality in Hearables

Research reveals new models to enhance voice clarity in smart earbuds.

2025-09-11T12:04:55+00:00 ― 5 min read

Sound Enhancing Bird Sound Recognition with Metadata

Using extra information boosts our ability to identify bird calls.

2025-09-11T11:16:20+00:00 ― 5 min read

Sound Improving Audio Generation Through Text Alignment Techniques

A new approach enhances audio generation by aligning audio with text descriptions.

2025-09-11T07:13:25+00:00 ― 5 min read

Computation and Language Advancements in Speech Recognition Technology

Researchers work to improve online speech recognition using structured state-space models.

2025-09-11T04:47:40+00:00 ― 5 min read

Audio and Speech Processing Real-Time Speaker Detection for Modern Meetings

A new system enhances meeting experiences by identifying speakers in real-time.

2025-09-11T03:10:30+00:00 ― 4 min read

Audio and Speech Processing Advancing Fake Speech Detection Techniques

New methods are improving our ability to detect fake speech effectively.

2025-09-11T02:21:55+00:00 ― 6 min read

Audio and Speech Processing Anonymizing Speech Data: A New Approach

A voice conversion method that improves both privacy and speech quality.

2025-09-11T01:33:20+00:00 ― 7 min read

Sound Advancements in Audio Deepfake Detection Systems

New methods enhance the ability to distinguish fake audio from real.

2025-09-10T22:19:00+00:00 ― 6 min read

Sound New Method to Detect Synthetic Speech

A method improves detection of synthetic voices and identifies their creators.

2025-09-10T20:41:50+00:00 ― 5 min read

Sound Advancements in Tiny Speech Enhancement Models

New methods improve tiny models for better speech enhancement using fewer resources.

2025-09-10T19:53:15+00:00 ― 5 min read

Sound Improving Speech Recognition with Personalisation Techniques

A new method enhances ASR models for individual users using quantisation and adaptation.

2025-09-10T13:24:35+00:00 ― 6 min read

Sound Improving Vocoder Training with Contrastive Learning

New methods enhance vocoder performance with limited audio data.

2025-09-10T12:36:00+00:00 ― 5 min read

Sound Understanding Dysarthria: Speech Disorder Insights

A look into dysarthria, its detection, and the role of technology.

2025-09-10T06:55:55+00:00 ― 6 min read

Sound Improving Speech Recognition with Soft Prompts

Soft prompts enhance speech recognition technology for better performance in noisy environments.

2025-09-10T04:30:10+00:00 ― 5 min read

Audio and Speech Processing Enhancing Speech Inversion through Self-Supervised Learning

Research combines self-supervised learning and new measurement techniques for improved speech inversion.

2025-09-10T01:15:50+00:00 ― 5 min read

Sound Improving Clarity in Electrolaryngeal Speech

Researchers develop a new framework to enhance speech clarity for electrolaryngeal users.

2025-09-09T22:50:05+00:00 ― 5 min read

Cryptography and Security Improving Deepfake Detection Through Diverse Training Methods

This study explores training strategies to enhance detection of fake audio.

2025-09-09T22:01:30+00:00 ― 5 min read

Audio and Speech Processing Advancements in Speech Recognition through Early-Exit Models

New models adapt to improve speech recognition efficiency and responsiveness.

2025-09-09T21:12:55+00:00 ― 5 min read

Audio and Speech Processing Introducing RECAP: A New Frontier in Audio Captioning

RECAP uses advanced techniques to generate accurate audio captions without retraining.

2025-09-09T20:24:20+00:00 ― 5 min read

Sound Fundamentals of Music Theory and Harmony

A practical guide to understanding music theory through harmony and scales.

2025-09-09T16:21:25+00:00 ― 7 min read

Audio and Speech Processing Improving ASR Systems with Synthetic Data

A new method uses synthetic data to enhance ASR systems in unfamiliar areas.

2025-09-09T15:32:50+00:00 ― 6 min read

Sound Estimating Crowd Density with Sound While Protecting Privacy

A new audio-based method estimates crowd sizes without invading personal privacy.

2025-09-09T13:55:40+00:00 ― 5 min read

Computation and Language Advancing Speech Recognition: Instruction-Following Systems

A new approach to speech recognition enhances user interaction with flexible instructions.

2025-09-09T08:15:35+00:00 ― 4 min read

Sound A New Method for Detecting Voice Spoofing

A robust approach to identify audio anomalies and combat voice spoofing.

2025-09-09T07:27:00+00:00 ― 5 min read

Computation and Language Advancements in Emotion Recognition in Conversations

A new model enhances understanding of emotions during conversations.

2025-09-09T06:38:25+00:00 ― 5 min read

Computation and Language Do Computer-Generated Speech Symbols Follow Zipf's Law?

This study examines if learned speech symbols mimic word frequency patterns.

2025-09-09T04:12:40+00:00 ― 5 min read

Sound DiCon: A New Approach to Speech Synthesis

Introducing a faster method for high-quality speech synthesis using diffusion models.

2025-09-09T03:24:05+00:00 ― 6 min read

Audio and Speech Processing HiFTNet: Advancing Text-to-Speech Technology

HiFTNet offers faster, high-quality speech synthesis using efficient, innovative techniques.

2025-09-09T02:35:30+00:00 ― 5 min read

Sound Advancements in Voice Conversion Technology Using Face Images

New method transforms voices using facial features for diverse applications.

2025-09-09T01:46:55+00:00 ― 8 min read