Electrical Engineering and Systems Science - Audio and Speech Processing

Sound Advancements in Text-to-Speech Adaptation Technology

New method improves TTS adaptation with minimal data requirements.

2025-10-31T21:48:25+00:00 ― 6 min read

Computation and Language Understanding Explainable AI in Speech Recognition Systems

An overview of explainable AI methods in automatic speech recognition.

2025-10-31T20:11:15+00:00 ― 6 min read

Sound Advancing Audio Question Answering with MWAFM Model

A new model improves how machines understand and respond to audio questions.

2025-10-31T18:34:05+00:00 ― 5 min read

Audio and Speech Processing Evaluating Turn-Taking in Text-to-Speech Systems

Research highlights the need for improved turn-taking in TTS technology.

2025-10-31T17:45:30+00:00 ― 6 min read

Computation and Language New Benchmark for Speech Learning Models

BabySLM evaluates how well machines learn to understand speech based on children's language.

2025-10-31T11:33:20+00:00 ― 7 min read

Audio and Speech Processing Optimizing Synthetic Speech for Better ASR Training

A new method improves synthetic speech selection for enhanced ASR system accuracy.

2025-10-31T08:51:05+00:00 ― 6 min read

Audio and Speech Processing Improving Speech Disorder Alignment with New Techniques

A new method aligns disfluent speech with text efficiently.

2025-10-31T08:02:30+00:00 ― 5 min read

Sound Advancements in Silent Speech Interfaces

Improving systems for silent speech recognition with new techniques.

2025-10-31T07:13:55+00:00 ― 5 min read

Computation and Language Improving ASR Accuracy with Contextual Biasing

New methods enhance automatic speech recognition for rare words using context.

2025-10-31T02:22:25+00:00 ― 6 min read

Sound Advancements in Weakly Supervised Keyword Spotting

A new method for training keyword spotting models using weak supervision in noisy environments.

2025-10-31T01:33:50+00:00 ― 6 min read

Computation and Language Advancing Speech Translation for Low-Resource Languages

Methods to improve speech translation systems for underrepresented languages.

2025-10-31T00:45:15+00:00 ― 4 min read

Sound MERT: A Self-Supervised Model for Music Understanding

MERT addresses music modeling challenges through innovative self-supervised learning techniques.

2025-10-30T23:56:40+00:00 ― 6 min read

Sound Improving RNN-T Models with Reinforcement Learning

A new approach enhances RNN-T performance in automatic speech recognition.

2025-10-30T19:53:45+00:00 ― 6 min read

Audio and Speech Processing AVLIT: Advancing Speech Separation in Noise

The AVLIT model combines sound and video for better speech clarity in noisy settings.

2025-10-30T18:16:35+00:00 ― 6 min read

Machine Learning Addressing Shortcut Learning in Voice Recognition Systems

Examining the impact of biased data in audio detection technologies.

2025-10-30T17:28:00+00:00 ― 6 min read

Sound Improving Speech Separation with Multiple Microphones

A new method enhances voice separation using multiple microphones without labeled data.

2025-10-30T15:50:50+00:00 ― 4 min read

Sound Advancing Audio Anti-Spoofing Techniques

A study improves speaker verification models for better identity protection.

2025-10-30T15:02:15+00:00 ― 6 min read

Computation and Language Advancements in Audio Question Answering Systems

New models improve how machines respond to audio-based questions.

2025-10-30T13:25:05+00:00 ― 5 min read

Computation and Language Enhancing Language Identification in Code-Switching Speech

Research aims to improve language detection in English-Mandarin conversations.

2025-10-30T12:36:30+00:00 ― 7 min read

Computation and Language Advances in Swiss German Speech Synthesis

New methods enhance speech synthesis for Swiss German from standard German text.

2025-10-30T10:59:20+00:00 ― 5 min read

Computation and Language Advancements in Multilingual Speech Recognition Systems

Exploring methods for improved multilingual speech recognition in Indian languages.

2025-10-30T10:10:45+00:00 ― 6 min read

Sound Advancing Voice Activity Detection with SVVAD

Discover how SVVAD improves voice activity detection for better speaker verification.

2025-10-30T09:22:10+00:00 ― 5 min read

Sound Advancements in Automatic Pronunciation Assessment

A new method improves pronunciation feedback for language learners.

2025-10-30T08:33:35+00:00 ― 6 min read

Computation and Language Measuring Adaptability in Speech Recognition Models

A new framework evaluates how well speech models adapt to specific tasks.

2025-10-30T06:56:25+00:00 ― 6 min read

Computation and Language Advancements in Multilingual Speech Translation

Research improves multilingual speech translation using semantic knowledge.

2025-10-30T06:07:50+00:00 ― 4 min read

Sound Advancing Speech Processing with HuBERT

HuBERT models improve performance on speech tasks by using multiple resolutions.

2025-10-29T22:02:00+00:00 ― 5 min read

Audio and Speech Processing Advancements in Speaker Identification Technology

New techniques improve accuracy in recognizing speakers and detecting imposters.

2025-10-29T20:24:50+00:00 ― 4 min read

Sound Improving Virtual Analog Audio Effects with Deep Learning

A new approach enhances phase response in virtual audio effects using deep learning.

2025-10-29T18:47:40+00:00 ― 5 min read

Sound Slowdown in Speech Recognition: A Closer Look at SlothSpeech

SlothSpeech reveals vulnerabilities in speech recognition systems that can be exploited to slow them down significantly.

2025-10-29T17:10:30+00:00 ― 5 min read

Sound UnDiff: A New Approach to Audio Clarity

UnDiff enhances audio quality using innovative speech restoration techniques.

2025-10-29T16:21:55+00:00 ― 5 min read

Computation and Language New Insights into Generative Spoken Language Modeling

Researchers examine how GSLM processes speech in noisy environments.

2025-10-29T15:33:20+00:00 ― 6 min read

Sound Advancements in Stuttering Detection Technology

New methods in machine learning enhance stuttering detection capabilities.

2025-10-29T14:44:45+00:00 ― 5 min read

Sound EmoMix: Advancing Emotional Speech Synthesis

EmoMix enables the creation of speech expressing mixed emotions with precise intensity.

2025-10-29T13:56:10+00:00 ― 5 min read

Sound MW-MAE: A New Approach to Audio Learning

Discover the innovative Multi-Window Masked Autoencoder method for enhanced audio processing.

2025-10-29T11:30:25+00:00 ― 5 min read

Sound Improving Audio Restoration with Visual Cues

A novel method merges audio and visual data to repair missing speech.

2025-10-29T10:41:50+00:00 ― 6 min read

Computation and Language Addressing Hate Speech in Low-Resource Languages

Exploring methods for detecting hate speech in audio broadcasts of under-resourced languages.

2025-10-29T09:04:40+00:00 ― 4 min read

Audio and Speech Processing Reviving Sound: The BABE Method for Audio Restoration

A new method restores lost high frequencies in historical recordings.

2025-10-29T06:38:55+00:00 ― 7 min read

Audio and Speech Processing Improving ASR Technology with Sequential-Level Generalized Entropy Minimization

A new method enhances automatic speech recognition systems for better accuracy and adaptability.

2025-10-29T02:36:00+00:00 ― 6 min read

Sound Advancing Sound Simulation with BEDRF

A new model improves sound diffraction in virtual environments.

2025-10-29T01:47:25+00:00 ― 6 min read

Computation and Language Improving Speech Recognition with Contextual Biasing

Contextual biasing enhances ASR systems, improving accuracy in specialized tasks.

2025-10-29T00:58:50+00:00 ― 5 min read