Electrical Engineering and Systems Science - Audio and Speech Processing

Computation and Language Advancements in Text-to-Speech Technology

New methods improve the quality of synthesized speech using self-supervised learning.

2025-09-30T17:37:25+00:00 ― 5 min read

Computation and Language Improving Speech Recognition with Keyword Boosting

A new method enhances the transcription of rare keywords in business conversations.

2025-09-30T10:20:10+00:00 ― 6 min read

Sound Advancing Speech Recognition with Federated Learning

Federated Learning improves speech recognition while keeping user data private.

2025-09-30T08:43:00+00:00 ― 5 min read

Sound MusicLDM: A New Approach to Text-to-Music Generation

MusicLDM transforms text into original music, offering fresh avenues for creativity.

2025-09-30T05:28:40+00:00 ― 7 min read

Sound Improving Singing Melody Extraction Techniques with Deep Learning

New methods enhance the accuracy of extracting singing melodies from mixed audio.

2025-09-30T01:25:45+00:00 ― 7 min read

Sound Advancements in Speech Enhancement Techniques

A new model improves speech clarity in noisy environments.

2025-09-29T22:11:25+00:00 ― 5 min read

Sound Analyzing Korean Folk Songs Through Technology

A study on Korean folk songs using modern analytical methods.

2025-09-29T21:22:50+00:00 ― 8 min read

Graphics DiffDance: A New Era in Dance Generation

DiffDance creates detailed dance sequences that match music effectively.

2025-09-29T16:31:20+00:00 ― 5 min read

Sound Addressing Gender Bias in Singing Voice Transcription

Examining fairness in singing voice transcription technology across genders.

2025-09-29T15:42:45+00:00 ― 8 min read

Sound Advancements in Hotword Customization for ASR Systems

SeACo-Paraformer brings flexibility and accuracy to speech recognition technology.

2025-09-29T14:05:35+00:00 ― 5 min read

Audio and Speech Processing Examining Voice Quality and Its Impact

This study explores voice quality classification methods and their significance in communication.

2025-09-29T12:28:25+00:00 ― 4 min read

Audio and Speech Processing Advancements in Active Noise Control Technology

Learn how new algorithms improve noise cancellation techniques for various applications.

2025-09-29T05:59:45+00:00 ― 4 min read

Audio and Speech Processing New Tool Measures Audio Quality with Video Insights

AudioVMAF combines video metrics for improved audio quality assessment.

2025-09-29T01:56:50+00:00 ― 5 min read

Sound Advancements in Fake Audio Detection with RAWM

A new method improves detection of fake audio using adaptive weight modification.

2025-09-29T01:08:15+00:00 ― 5 min read

Cryptography and Security The Growing Need for Steganalysis in Information Security

Steganalysis helps detect hidden messages in multimedia, ensuring secure communication.

2025-09-28T23:31:05+00:00 ― 4 min read

Audio and Speech Processing Separating Speaker Identity from Speech Data

A study on disentangling speaker identity from speech signals for improved processing.

2025-09-28T19:28:10+00:00 ― 5 min read

Multimedia TranSTYLer: A Leap in Virtual Communication

Transforming gestures for virtual agents with preserved meaning.

2025-09-28T18:39:35+00:00 ― 6 min read

Sound Advancements in Sound Source Localization Using Neural Networks

Exploring how neural networks improve the accuracy of sound source localization.

2025-09-28T12:10:55+00:00 ― 6 min read

Computation and Language Improving Punjabi Speech Recognition with Self-Training Methods

Researchers enhance automatic speech recognition for Punjabi using innovative self-training techniques.

2025-09-28T08:56:35+00:00 ― 5 min read

Sound Advancements in Target-Speaker Speech Recognition

A new model improves speech recognition in noisy environments by focusing on a single speaker.

2025-09-28T08:08:00+00:00 ― 4 min read

Sound Balancing Privacy and Smart Audio Monitoring

New methods aim to protect speech privacy in audio monitoring systems.

2025-09-28T06:30:50+00:00 ― 5 min read

Computation and Language Advancing Expressive Speech Synthesis with New Dataset

A new dataset enhances speech synthesis by capturing emotional expression without relying on text.

2025-09-27T18:22:05+00:00 ― 5 min read

Audio and Speech Processing Improving Music Pitch Classification with SDTW

New strategies to enhance training stability for music pitch classification.

2025-09-27T13:30:35+00:00 ― 6 min read

Sound Advancements in Voice Conversion Technology

Phoneme Hallucinator transforms voice conversion with limited data for clearer outputs.

2025-09-27T10:16:15+00:00 ― 5 min read

Sound Advancing Gesture Generation for Digital Humans

A new method creates realistic gestures from raw speech audio.

2025-09-27T08:39:05+00:00 ― 5 min read

Machine Learning New Method for Analyzing Brain Activity During Speech

Researchers develop Neural Latent Aligner to better interpret brain signals during speaking tasks.

2025-09-27T05:24:45+00:00 ― 6 min read

Audio and Speech Processing Advancing Bilingual Speech Recognition with Grapheme Units

Enhancing hybrid ASR systems for bilingual speech using grapheme units.

2025-09-27T03:47:35+00:00 ― 5 min read

Computation and Language Advances in Joint Speech-Text Learning

A new model improves speech and text alignment for better automatic recognition.

2025-09-27T02:10:25+00:00 ― 6 min read

Sound Advancements in Visual Speech Recognition with Lip2Vec

Lip2Vec enhances visual speech recognition using less labeled data.

2025-09-27T01:21:50+00:00 ― 7 min read

Computation and Language Advancements in Speech Recognition Technology

New methods enhance accuracy and speed in speech recognition systems.

2025-09-26T11:35:55+00:00 ― 5 min read

Machine Learning O-1: A New Frontier in Speech Recognition Training

O-1 improves speech recognition by optimizing self-training methods.

2025-09-26T09:10:10+00:00 ― 5 min read

Computation and Language Improving Automatic Speech Recognition with Text Injection

A new method enhances ASR performance through text data integration.

2025-09-26T07:33:00+00:00 ― 6 min read

Computation and Language Improving Speech Recognition with Text Injection

Text injection helps recognize personal information while maintaining privacy.

2025-09-26T06:44:25+00:00 ― 5 min read

Sound Advancements in Sound Event Detection Using Generative Learning

Discover how new techniques are transforming sound event detection for various applications.

2025-09-26T05:55:50+00:00 ― 6 min read

Audio and Speech Processing The Importance of Nonlinear Audio Processing

Exploring nonlinear methods in audio for music production and speech analysis.

2025-09-26T03:30:05+00:00 ― 6 min read

Sound Advancements in Pitch Extraction with PitchNet

A new method for accurate pitch detection in music and sound.

2025-09-26T02:41:30+00:00 ― 5 min read

Sound Advancements in Speech Recognition with mmWave Technology

Radio2Text uses mmWave signals for real-time speech recognition in noisy environments.

2025-09-25T22:38:35+00:00 ― 6 min read

Audio and Speech Processing Evaluating an Automatic Sound Masker System in Urban Parks

A study examines the effectiveness of automated sound maskers in public spaces.

2025-09-25T18:35:40+00:00 ― 5 min read

Audio and Speech Processing Advancements in Speaker Recognition with Graph Neural Networks

Graph neural networks improve speaker recognition accuracy by analyzing voice sample relationships.

2025-09-25T09:41:15+00:00 ― 5 min read

Computation and Language Advancements in Speech Emotion Recognition Across Languages

A study evaluating emotion recognition in speech models across six languages.

2025-09-25T08:04:05+00:00 ― 5 min read