Electrical Engineering and Systems Science - Audio and Speech Processing

Sound Advancements in Emotional Text-to-Speech Technology

New model ZET-Speech enhances emotional speech synthesis for diverse speakers.

2025-11-03T23:29:30+00:00 ― 5 min read

Sound Advancements in Transcribing Piano and Violin Music

Study finds new mixing techniques improve music transcription accuracy.

2025-11-03T21:52:20+00:00 ― 4 min read

Sound Advancing Human-Machine Interaction with Empathetic Dialogue

A new method enhances machine responses through better emotional understanding.

2025-11-03T21:03:45+00:00 ― 5 min read

Sound Advancing Speech Recognition in Multi-Talker Settings

A new method improves accuracy in automatic speech recognition for meetings.

2025-11-03T20:15:10+00:00 ― 5 min read

Sound Developing Empathetic Voice Assistants with CALLS

CALLS aims to improve voice assistants' ability to handle customer interactions.

2025-11-03T19:26:35+00:00 ― 5 min read

Audio and Speech Processing Advancements in Audio Inpainting Technology

New methods improve audio restoration and production quality.

2025-11-03T17:49:25+00:00 ― 5 min read

Audio and Speech Processing Advancements in Quantization for Speech Recognition Models

Research enhances quantization techniques to improve speech recognition model efficiency.

2025-11-03T11:20:45+00:00 ― 7 min read

Sound Revolutionizing Audio Quality Measurement with PLCMOS

PLCMOS offers a new way to evaluate speech quality without human listeners.

2025-11-03T10:32:10+00:00 ― 5 min read

Human-Computer Interaction LoopBoxes: A New Way to Make Music

LoopBoxes helps children create music easily and collaboratively.

2025-11-03T08:55:00+00:00 ― 5 min read

Sound Innovative Sound Synthesis Using Neural Networks

A new method for creating realistic impact sounds through neural networks.

2025-11-03T08:06:25+00:00 ― 5 min read

Computation and Language Improving Speech Recognition for Non-Native Speakers

New technique enhances ASR systems for better recognition of non-native accents.

2025-11-03T02:26:20+00:00 ― 6 min read

Audio and Speech Processing Advancing Speech Recognition with Weakly-Supervised Learning

New methods leverage speaker identity to improve speech recognition performance.

2025-11-03T01:37:45+00:00 ― 5 min read

Sound Improving Speech Recognition with the Sidecar Approach

A new method combines speech recognition and speaker identification for overlapping speech.

2025-11-03T00:49:10+00:00 ― 5 min read

Computation and Language Advancing Simultaneous Speech Translation with DiSeg

A novel method improves real-time translation quality and efficiency.

2025-11-03T00:00:35+00:00 ― 4 min read

Computation and Language Improving Few-Shot Learning with Attention Mechanism

A novel approach improves learning from fewer examples by using multimodal data.

2025-11-02T22:23:25+00:00 ― 6 min read

Sound Estimating Room Impulse Responses with Multiple Sound Sources

A new method to estimate room responses in complex sound environments.

2025-11-02T21:34:50+00:00 ― 7 min read

Audio and Speech Processing Advancements in Voice Conversion Technology

A new method for voice conversion improves clarity and adaptation.

2025-11-02T19:57:40+00:00 ― 6 min read

Audio and Speech Processing Advancing Text-to-Speech for Turkic Languages

Building TTS systems for lesser-known Turkic languages using Kazakh data.

2025-11-02T18:20:30+00:00 ― 5 min read

Sound Introducing MeLoDy: Speedy Music Generation Unveiled

MeLoDy quickly generates high-quality music from text prompts.

2025-11-02T17:31:55+00:00 ― 5 min read

Sound Addressing Security Threats in Voice Recognition Systems

New methods emerge to protect voice recognition from adversarial attacks.

2025-11-02T16:43:20+00:00 ― 5 min read

Audio and Speech Processing Introducing AudioDec: A New Era in Audio Streaming

AudioDec offers real-time high-quality audio with low data usage.

2025-11-02T15:06:10+00:00 ― 5 min read

Sound New Method Reveals Privacy Risks in Diffusion Models

A novel technique checks for training data exposure in diffusion models.

2025-11-02T13:29:00+00:00 ― 5 min read

Sound Advancements in Speech Separation with S4M

A new model improves voice isolation in noisy environments.

2025-11-02T10:14:40+00:00 ― 5 min read

Audio and Speech Processing Replicating the Sound of Magnetic Tape with Digital Tools

This article discusses how to recreate magnetic tape sound using digital technology.

2025-11-02T09:26:05+00:00 ― 6 min read

Audio and Speech Processing Advancements in Speech Synthesis Technology

New framework improves voice generation quality in speech synthesis.

2025-11-02T06:11:45+00:00 ― 5 min read

Audio and Speech Processing Advancements in Personalized Synthetic Voices

Researchers develop technology to recreate unique voices for those with speech challenges.

2025-11-01T23:43:05+00:00 ― 5 min read

Audio and Speech Processing Improving Speaker Verification with OS-KDFT Method

A new method enhances speaker verification by combining knowledge distillation and fine-tuning.

2025-11-01T22:05:55+00:00 ― 6 min read

Audio and Speech Processing DeCoR: A New Method for Audio Learning

DeCoR helps machines learn new sounds without forgetting old ones.

2025-11-01T21:17:20+00:00 ― 5 min read

Sound Advancements in Real-Time Audio Tagging

Streaming audio transformers improve speed and efficiency in audio tagging systems.

2025-11-01T20:28:45+00:00 ― 6 min read

Computation and Language Advancements in Speech Transcription Methods

New techniques improve accuracy and speed in converting speech to text.

2025-11-01T16:25:50+00:00 ― 5 min read

Sound Assessing Dysarthric Speech: New Methods for Clarity

This research introduces improved assessments for clearer communication in individuals with dysarthria.

2025-11-01T15:37:15+00:00 ― 5 min read

Sound Addressing Challenges in Speech Recognition with Enharmonic Words

A new method improves speech recognition for names that sound alike.

2025-11-01T14:48:40+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Diversity in TTS Systems

A new method enhances the naturalness and variety of text-to-speech output.

2025-11-01T13:11:30+00:00 ― 5 min read

Audio and Speech Processing Advancements in Audio Classification with Treff Adapter

Treff adapter improves audio classification with limited labeled data.

2025-11-01T12:22:55+00:00 ― 5 min read

Machine Learning Advancements in Multi-Task Self-Supervised Learning

New methods improve model flexibility and performance in audio tasks.

2025-11-01T08:20:00+00:00 ― 4 min read

Audio and Speech Processing Advancements in Speech Emotion Recognition Using Speaker Embeddings

Research highlights effective methods for recognizing emotions in speech using embeddings.

2025-11-01T07:31:25+00:00 ― 6 min read

Sound Efficient Audio Tagging with E-PANNs

Discover how E-PANNs improve sound recognition efficiency.

2025-11-01T04:17:05+00:00 ― 5 min read

Computation and Language Analyzing Dialects Through Audio Processing

This research analyzes dialects using audio recordings to reveal their similarities.

2025-11-01T02:39:55+00:00 ― 6 min read

Computation and Language Advancing Spoken Language Understanding with Discrete Units

New method improves spoken language understanding without needing written transcripts.

2025-11-01T00:14:10+00:00 ― 5 min read

Sound Advancements in Audio Classification Techniques

A novel method enhances audio classification by learning new sounds efficiently.

2025-10-31T22:37:00+00:00 ― 4 min read