Electrical Engineering and Systems Science - Audio and Speech Processing

Audio and Speech Processing Real-Time Speaker Detection for Modern Meetings

A new system enhances meeting experiences by identifying speakers in real-time.

2025-09-11T03:10:30+00:00 ― 4 min read

Audio and Speech Processing Advancing Fake Speech Detection Techniques

New methods are improving our ability to detect fake speech effectively.

2025-09-11T02:21:55+00:00 ― 6 min read

Audio and Speech Processing Anonymizing Speech Data: A New Approach

A voice conversion method that improves both privacy and speech quality.

2025-09-11T01:33:20+00:00 ― 7 min read

Sound Advancements in Audio Deepfake Detection Systems

New methods enhance the ability to distinguish fake audio from real.

2025-09-10T22:19:00+00:00 ― 6 min read

Sound New Method to Detect Synthetic Speech

A method improves detection of synthetic voices and identifies their creators.

2025-09-10T20:41:50+00:00 ― 5 min read

Sound Advancements in Tiny Speech Enhancement Models

New methods improve tiny models for better speech enhancement using fewer resources.

2025-09-10T19:53:15+00:00 ― 5 min read

Sound Improving Speech Recognition with Personalisation Techniques

A new method enhances ASR models for individual users using quantisation and adaptation.

2025-09-10T13:24:35+00:00 ― 6 min read

Sound Improving Vocoder Training with Contrastive Learning

New methods enhance vocoder performance with limited audio data.

2025-09-10T12:36:00+00:00 ― 5 min read

Sound Understanding Dysarthria: Speech Disorder Insights

A look into dysarthria, its detection, and the role of technology.

2025-09-10T06:55:55+00:00 ― 6 min read

Sound Improving Speech Recognition with Soft Prompts

Soft prompts enhance speech recognition technology for better performance in noisy environments.

2025-09-10T04:30:10+00:00 ― 5 min read

Audio and Speech Processing Enhancing Speech Inversion through Self-Supervised Learning

Research combines self-supervised learning and new measurement techniques for improved speech inversion.

2025-09-10T01:15:50+00:00 ― 5 min read

Sound Improving Clarity in Electrolaryngeal Speech

Researchers develop a new framework to enhance speech clarity for electrolaryngeal users.

2025-09-09T22:50:05+00:00 ― 5 min read

Cryptography and Security Improving Deepfake Detection Through Diverse Training Methods

This study explores training strategies to enhance detection of fake audio.

2025-09-09T22:01:30+00:00 ― 5 min read

Audio and Speech Processing Advancements in Speech Recognition through Early-Exit Models

New models adapt to improve speech recognition efficiency and responsiveness.

2025-09-09T21:12:55+00:00 ― 5 min read

Audio and Speech Processing Introducing RECAP: A New Frontier in Audio Captioning

RECAP uses advanced techniques to generate accurate audio captions without retraining.

2025-09-09T20:24:20+00:00 ― 5 min read

Sound Fundamentals of Music Theory and Harmony

A practical guide to understanding music theory through harmony and scales.

2025-09-09T16:21:25+00:00 ― 7 min read

Audio and Speech Processing Improving ASR Systems with Synthetic Data

A new method uses synthetic data to enhance ASR systems in unfamiliar areas.

2025-09-09T15:32:50+00:00 ― 6 min read

Sound Estimating Crowd Density with Sound While Protecting Privacy

A new audio-based method estimates crowd sizes without invading personal privacy.

2025-09-09T13:55:40+00:00 ― 5 min read

Computation and Language Advancing Speech Recognition: Instruction-Following Systems

A new approach to speech recognition enhances user interaction with flexible instructions.

2025-09-09T08:15:35+00:00 ― 4 min read

Sound A New Method for Detecting Voice Spoofing

A robust approach to identify audio anomalies and combat voice spoofing.

2025-09-09T07:27:00+00:00 ― 5 min read

Computation and Language Advancements in Emotion Recognition in Conversations

A new model enhances understanding of emotions during conversations.

2025-09-09T06:38:25+00:00 ― 5 min read

Computation and Language Do Computer-Generated Speech Symbols Follow Zipf's Law?

This study examines whether learned speech symbols mimic word frequency patterns.

2025-09-09T04:12:40+00:00 ― 5 min read

Sound DiCon: A New Approach to Speech Synthesis

Introducing a faster method for high-quality speech synthesis using diffusion models.

2025-09-09T03:24:05+00:00 ― 6 min read

Audio and Speech Processing HiFTNet: Advancing Text-to-Speech Technology

HiFTNet offers faster, high-quality speech synthesis using efficient, innovative techniques.

2025-09-09T02:35:30+00:00 ― 5 min read

Sound Advancements in Voice Conversion Technology Using Face Images

A new method transforms voices using facial features for diverse applications.

2025-09-09T01:46:55+00:00 ― 8 min read

Audio and Speech Processing Introducing AV-SUPERB: A New Benchmark for Audio-Visual Models

AV-SUPERB evaluates audio-visual models across a range of tasks.

2025-09-08T22:32:35+00:00 ― 5 min read

Sound Improving Speaker Diarization with Semantic Information

A new approach enhances speaker diarization by integrating semantic data into the process.

2025-09-08T20:06:50+00:00 ― 5 min read

Sound Faster Text-to-Audio Generation Using Consistency Distillation

A new method improves speed and efficiency in text-to-audio generation.

2025-09-08T18:29:40+00:00 ― 4 min read

Audio and Speech Processing Advancements in Speech Emotion Recognition: A Multilingual Approach

Research shows improved accuracy in recognizing emotions from speech across languages.

2025-09-08T16:03:55+00:00 ― 4 min read

Sound Improving Speech Recognition with Test-Time Training

Explore how TTT enhances speech recognition by adapting to distribution shifts.

2025-09-08T14:26:45+00:00 ― 6 min read

Computer Vision and Pattern Recognition Advancing Sound Source Localization Techniques

Improving the way we identify sound sources using audio-visual data.

2025-09-08T12:49:35+00:00 ― 6 min read

Computer Vision and Pattern Recognition Mapping Sounds: A New Approach to Soundscape Analysis

A method to visualize and predict sounds in various environments using advanced technology.

2025-09-08T11:12:25+00:00 ― 5 min read

Computation and Language Advancements in Spoken Language Identification

New methods combine audio and metadata for better language recognition.

2025-09-08T07:09:30+00:00 ― 5 min read

Sound New Voice Recognition System Tackles Spoofing Threats

A system designed to detect voice presentation attacks enhances security in voice recognition.

2025-09-08T06:20:55+00:00 ― 6 min read

Audio and Speech Processing Improving Whisper for Low-Resource Languages

Enhancing Whisper's speech recognition for Vietnamese and other low-resource languages.

2025-09-08T03:55:10+00:00 ― 4 min read

Sound Advancements in Text-Based Speech Editing

FluentEditor improves audio editing by focusing on natural flow and consistency.

2025-09-07T20:37:55+00:00 ― 4 min read

Computation and Language New Methods in Simultaneous Speech Translation

Improving real-time translation through advanced segmentation techniques.

2025-09-07T18:12:10+00:00 ― 5 min read

Computation and Language Advancements in Simultaneous Speech Translation

Improving real-time translations through innovative methods and smart policies.

2025-09-07T17:23:35+00:00 ― 5 min read

Audio and Speech Processing Advancing Automatic Speech Recognition for Tunisian Arabic

Efforts to improve ASR systems for Tunisian Arabic and code-switching.

2025-09-07T16:35:00+00:00 ― 5 min read

Sound Personalizing Music Generation: New Approaches

Innovative methods aim to tailor music generation to user preferences.

2025-09-07T15:46:25+00:00 ― 6 min read