Electrical Engineering and Systems Science - Audio and Speech Processing

Computation and Language Improving Speech Recognition with Cleancoder

Cleancoder enhances ASR systems by reducing background noise for clearer speech understanding.

2025-09-16T21:38:15+00:00 ― 4 min read

Computer Vision and Pattern Recognition RADIO: A New Approach to Talking Heads

RADIO creates realistic talking faces using just one reference image.

2025-09-16T16:46:45+00:00 ― 6 min read

Computation and Language RoDia: A New Dataset for Romanian Dialect Identification

RoDia provides crucial audio samples for identifying Romanian dialects.

2025-09-16T15:58:10+00:00 ― 5 min read

Audio and Speech Processing The Role of Non-Verbal Cues in Communication

Exploring how gestures and expressions enhance our understanding of spoken language.

2025-09-16T08:40:55+00:00 ― 7 min read

Human-Computer Interaction The Art and Science of Music Mixing

A look at mixing music, blending technical skills with artistic vision.

2025-09-16T07:03:45+00:00 ― 5 min read

Audio and Speech Processing Advancements in Sound Event Detection and Localization

Exploring new methods in sound detection and localization using synthetic data.

2025-09-16T05:26:35+00:00 ― 5 min read

Audio and Speech Processing Sound Simulation System for Musicians

A new system helps musicians experience sound on a virtual stage.

2025-09-16T03:00:50+00:00 ― 6 min read

Sound Advancements in Detecting Partially Spoofed Audio

New method improves detection of fake audio segments in recordings.

2025-09-16T01:23:40+00:00 ― 5 min read

Sound Advancements in Music Technology: Separating Rhythm and Harmony

Computers are learning to separate rhythm and harmony in music for creative applications.

2025-09-15T23:46:30+00:00 ― 4 min read

Audio and Speech Processing MuLanTTS: A New Frontier in Text-to-Speech

Microsoft's MuLanTTS offers natural and expressive French text-to-speech capabilities.

2025-09-15T22:57:55+00:00 ― 5 min read

Sound Advancements in Acoustic Traffic Monitoring Technology

New datasets and methods improve vehicle classification for better traffic management.

2025-09-15T13:14:55+00:00 ― 6 min read

Sound Advancements in Automatic Speech Recognition Systems

New methods improve accuracy and speed in speech recognition technology.

2025-09-15T06:46:15+00:00 ― 6 min read

Sound Advancements in Foley Sound Synthesis with Machine Learning

A new synthesizer improves the generation of realistic sound effects for media.

2025-09-15T05:57:40+00:00 ― 5 min read

Audio and Speech Processing Advancing Confidence Estimation in Automatic Speech Recognition

A new approach enhances confidence estimation in ASR systems for better accuracy.

2025-09-15T03:14:28+00:00 ― 4 min read

Sound Advancements in Speech Generation Technology

Introducing a framework for more natural and expressive speech synthesis.

2025-09-15T01:06:10+00:00 ― 6 min read

Sound Classifying Music Genres with Technology

Learn how technology helps categorize music genres efficiently.

2025-09-14T21:51:50+00:00 ― 6 min read

Sound New Model Enhances Fish Feeding Intensity Assessment

A unified approach to assess fish feeding using audio and video data.

2025-09-14T21:03:15+00:00 ― 5 min read

Sound Advancements in Emotional Talking-Head Technology

A new method improves the creation of emotionally expressive talking-head videos.

2025-09-14T15:23:10+00:00 ― 6 min read

Machine Learning Challenges in Using Convnets for Audio Filterbank Design

This study explores issues with using convnets for audio filterbank creation.

2025-09-14T14:34:35+00:00 ― 5 min read

Sound Advancements in Audio and Language Models

The CLAP model bridges audio and text processing for various applications.

2025-09-14T13:46:00+00:00 ― 4 min read

Computation and Language Advancements in Self-Supervised Learning for French Speech Technologies

A project aims to improve French speech processing using self-supervised learning.

2025-09-14T12:57:25+00:00 ― 5 min read

Audio and Speech Processing Advancements in Automatic Prosody Annotation

New methods improve how machines recognize speech rhythm and emotion.

2025-09-14T12:08:50+00:00 ― 6 min read

Sound New Method for Sound Estimation in Scattered Environments

A new approach improves sound estimation in spaces with scattering objects.

2025-09-14T06:28:45+00:00 ― 6 min read

Sound The Impact of Undecidability on Music Production

Examines how undecidability influences music composition and production today.

2025-09-14T05:40:10+00:00 ― 4 min read

Audio and Speech Processing Improving Speaker Diarization with Language Models

This article explores advancements in speaker diarization using language models for better accuracy.

2025-09-14T03:14:25+00:00 ― 5 min read

Audio and Speech Processing Advancements in Speech Recognition for Children

This study improves ASR systems' ability to recognize children's speech.

2025-09-14T02:25:50+00:00 ― 5 min read

Audio and Speech Processing The Role of Audio in Pedestrian Detection

Researchers explore audio sensing technology for improved pedestrian detection in urban areas.

2025-09-14T00:48:40+00:00 ― 5 min read

Audio and Speech Processing Advancements in Sound Field Recording Techniques

New method enhances sound source localization and field separation.

2025-09-13T20:45:45+00:00 ― 6 min read

Sound Advancements in Synthesizing Percussive Sounds

A new method improves drum sound synthesis by focusing on sharp transient elements.

2025-09-13T19:57:10+00:00 ― 6 min read

Sound Creating Privacy-Friendly Synthetic Voice Datasets

Researchers are developing synthetic voice data to protect privacy in voice recognition.

2025-09-13T15:05:40+00:00 ― 5 min read

Audio and Speech Processing VoxtLM: A Unified Approach to Speech and Text

VoxtLM combines speech recognition, synthesis, text generation, and continuation in one model.

2025-09-13T11:02:45+00:00 ― 4 min read

Audio and Speech Processing PromptASR: Next-Level Speech Recognition Technology

New system enhances speech recognition using context-aware prompts.

2025-09-13T10:14:10+00:00 ― 4 min read

Sound Advancements in Universal Audio Models

EnCodecMAE combines self-supervised learning and audio codecs for improved audio task performance.

2025-09-13T09:25:35+00:00 ― 5 min read

Audio and Speech Processing Advancing Autism Diagnosis Through Sound Recognition

A study on using machine learning to identify children's sounds for ASD assessment.

2025-09-13T07:48:25+00:00 ― 5 min read

Audio and Speech Processing A New Approach to Keyword Spotting

Introducing a flexible method for recognizing keywords in speech across languages.

2025-09-13T06:11:15+00:00 ― 5 min read

Audio and Speech Processing Assessing Speech Quality in Audio Communication

A look at how speech quality is tested using crowdsourcing.

2025-09-13T05:22:40+00:00 ― 5 min read

Audio and Speech Processing Advancements in Audio Captioning with Text-Only Training

A new method trains audio captioning systems using only text descriptions.

2025-09-13T02:56:55+00:00 ― 6 min read

Sound Essential Steps for Writing Academic Papers

A guide to crafting clear and effective academic papers.

2025-09-13T01:19:45+00:00 ― 3 min read

Cryptography and Security Backdoor Attacks: A Hidden Threat to Voice Verification

Examining the risks of backdoor attacks on speaker verification systems.

2025-09-12T22:54:00+00:00 ― 6 min read

Computer Vision and Pattern Recognition Advancements in Audio-Visual Segmentation Techniques

A new method enhances audio-visual segmentation without detailed labels.

2025-09-12T20:28:15+00:00 ― 5 min read