Electrical Engineering and Systems Science - Audio and Speech Processing

Sound New System Improves Voice Extraction from Unstable Head Positions

PIAVE helps machines extract voices clearly, even when speakers turn their heads.

2025-09-12T19:39:40+00:00 ― 6 min read

Audio and Speech Processing Libriheavy: A New Dataset for Speech Recognition

Libriheavy offers 50,000 hours of spoken English to boost speech recognition technology.

2025-09-12T18:51:05+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Clarity with AV2Wav Technology

AV2Wav enhances speech quality using audio and visual cues.

2025-09-12T17:13:55+00:00 ― 5 min read

Audio and Speech Processing EmoConv-Diff: A New Way to Change Emotions in Speech

A fresh method for machines to alter speech emotions naturally.

2025-09-12T16:25:20+00:00 ― 5 min read

Sound Detecting AI-Generated Singing Voices

New methods are being developed to identify deepfake singing voices in the music industry.

2025-09-12T14:48:10+00:00 ― 6 min read

Sound Optimizing Text-to-Speech with Core-Set Selection

Core-set selection improves text-to-speech models by focusing on diverse data.

2025-09-12T08:19:30+00:00 ― 5 min read

Sound Advancements in Speech Emotion Recognition Systems

New models are transforming how we analyze emotions in speech.

2025-09-12T07:30:55+00:00 ― 6 min read

Computer Vision and Pattern Recognition Privacy-First Action Recognition with Ultrasound Technology

A new method uses ultrasound to recognize actions while protecting privacy.

2025-09-12T06:42:20+00:00 ― 5 min read

Sound A New Framework for Speaker Anonymization

Introducing a flexible framework to enhance voice privacy research.

2025-09-12T05:05:10+00:00 ― 7 min read

Sound CiwaGAN: A New Model for Speech Learning

CiwaGAN combines control of speech movements and information sharing for better speech learning.

2025-09-12T04:16:35+00:00 ― 6 min read

Computation and Language IntraVerbalPA: A New Approach to Pronunciation Assessment

A framework that blends verbal and non-verbal cues for better language learning.

2025-09-12T03:28:00+00:00 ― 5 min read

Computation and Language Improving Explanations for Speech Models

A new method simplifies understanding of speech classification models.

2025-09-12T02:39:25+00:00 ― 6 min read

Computation and Language Improving Language Learning with L1-MultiMDD

A new system enhances pronunciation skills by considering first language influences.

2025-09-12T01:50:50+00:00 ― 5 min read

Emerging Technologies Quantum Computing Meets Music Composition

Discover how quantum tools change music creation and performance.

2025-09-12T00:31:30+00:00 ― 6 min read

Audio and Speech Processing Advancements in Voice Conversion Technology

A new method improves emotion preservation in voice conversion.

2025-09-12T00:13:40+00:00 ― 6 min read

Audio and Speech Processing Emo-StarGAN: Advancing Voice Conversion Technology

A new method preserves emotional tone in voice conversion for better human-computer interaction.

2025-09-11T23:25:05+00:00 ― 5 min read

Computation and Language Advancements in Direct Text to Speech Translation

New systems improve translation from text to spoken language without intermediates.

2025-09-11T20:59:20+00:00 ― 4 min read

Audio and Speech Processing Improving Heart Sound Classification with Data Augmentation

Researchers enhance heart sound classification accuracy using codec data augmentation methods.

2025-09-11T19:22:10+00:00 ― 5 min read

Sound Emotional Speech Challenges Speech Separation Models

Research reveals emotional speech impacts model performance in speech separation tasks.

2025-09-11T18:33:35+00:00 ― 6 min read

Sound M-AUDIODEC: A New Way to Compress Audio

M-AUDIODEC compresses multi-channel audio while retaining speaker position and quality.

2025-09-11T16:56:25+00:00 ― 6 min read

Sound Advancements in Speech-to-Speech Translation Technology

New methods in S2ST improve translation quality while maintaining speaker identity.

2025-09-11T16:07:50+00:00 ― 5 min read

Sound Advancing Audio Compression with Neural Techniques

A novel system enhances spatial audio compression for clearer sound and efficiency.

2025-09-11T15:19:15+00:00 ― 4 min read

Audio and Speech Processing MusiLingo: Bridging Music and Language

A new system that connects music and language for better understanding.

2025-09-11T14:30:40+00:00 ― 6 min read

Audio and Speech Processing Improving Sound Quality in Hearables

Research reveals new models to enhance voice clarity in smart earbuds.

2025-09-11T12:04:55+00:00 ― 5 min read

Sound Enhancing Bird Sound Recognition with Metadata

Using extra information boosts our ability to identify bird calls.

2025-09-11T11:16:20+00:00 ― 5 min read

Sound Improving Audio Generation Through Text Alignment Techniques

A new approach enhances audio generation by aligning audio with text descriptions.

2025-09-11T07:13:25+00:00 ― 5 min read

Computation and Language Advancements in Speech Recognition Technology

Researchers work to improve online speech recognition using structured state-space models.

2025-09-11T04:47:40+00:00 ― 5 min read

Audio and Speech Processing Real-Time Speaker Detection for Modern Meetings

A new system enhances meeting experiences by identifying speakers in real time.

2025-09-11T03:10:30+00:00 ― 4 min read

Audio and Speech Processing Advancing Fake Speech Detection Techniques

New methods are improving our ability to detect fake speech effectively.

2025-09-11T02:21:55+00:00 ― 6 min read

Audio and Speech Processing Anonymizing Speech Data: A New Approach

A voice conversion method that improves both privacy and speech quality.

2025-09-11T01:33:20+00:00 ― 7 min read

Sound Advancements in Audio Deepfake Detection Systems

New methods enhance the ability to distinguish fake audio from real audio.

2025-09-10T22:19:00+00:00 ― 6 min read

Sound New Method to Detect Synthetic Speech

A method improves detection of synthetic voices and identifies their creators.

2025-09-10T20:41:50+00:00 ― 5 min read

Sound Advancements in Tiny Speech Enhancement Models

New methods improve tiny models for better speech enhancement using fewer resources.

2025-09-10T19:53:15+00:00 ― 5 min read

Sound Improving Speech Recognition with Personalisation Techniques

A new method enhances ASR models for individual users using quantisation and adaptation.

2025-09-10T13:24:35+00:00 ― 6 min read

Sound Improving Vocoder Training with Contrastive Learning

New methods enhance vocoder performance with limited audio data.

2025-09-10T12:36:00+00:00 ― 5 min read

Sound Understanding Dysarthria: Speech Disorder Insights

A look into dysarthria, its detection, and the role of technology.

2025-09-10T06:55:55+00:00 ― 6 min read

Sound Improving Speech Recognition with Soft Prompts

Soft prompts enhance speech recognition technology for better performance in noisy environments.

2025-09-10T04:30:10+00:00 ― 5 min read

Audio and Speech Processing Enhancing Speech Inversion through Self-Supervised Learning

Research combines self-supervised learning and new measurement techniques for improved speech inversion.

2025-09-10T01:15:50+00:00 ― 5 min read

Sound Improving Clarity in Electrolaryngeal Speech

Researchers develop a new framework to enhance speech clarity for electrolaryngeal users.

2025-09-09T22:50:05+00:00 ― 5 min read

Cryptography and Security Improving Deepfake Detection Through Diverse Training Methods

This study explores training strategies to enhance detection of fake audio.

2025-09-09T22:01:30+00:00 ― 5 min read