Electrical Engineering and Systems Science - Audio and Speech Processing

Sound Detecting Synthetic Speech: Challenges and Solutions

A look at identifying fake audio in today's tech-driven world.

2025-11-05T10:18:35+00:00 ― 4 min read

Computation and Language Advancing Speech Models Through Text Knowledge

Using text models to enhance speech generation for better understanding.

2025-11-05T09:30:00+00:00 ― 8 min read

Computation and Language Improving ASR Accuracy with Synthetic Data Techniques

Research shows how synthetic text can enhance ASR systems effectively.

2025-11-05T04:38:30+00:00 ― 5 min read

Machine Learning Advancing Multi-modal Learning with C-MCR

C-MCR simplifies multi-modal learning by connecting existing knowledge efficiently.

2025-11-05T03:49:55+00:00 ― 6 min read

Sound FluentSpeech: A New Approach to Stutter Removal

FluentSpeech offers an automatic solution for smoother speech editing.

2025-11-05T02:12:45+00:00 ― 6 min read

Audio and Speech Processing Modular Domain Adaptation: A New Approach to Speech Recognition

MDA enhances speech recognition by adapting models to specific data domains.

2025-11-05T01:24:10+00:00 ― 6 min read

Medical Physics New Study Links Brain Signals to Tongue Movement

Research shows brain signals can help predict tongue movements during speech.

2025-11-04T23:54:21+00:00 ― 6 min read

Sound Advances in Text-to-Speech Technology with U-DiT

U-DiT TTS system enhances natural speech generation through innovative architecture.

2025-11-04T23:47:00+00:00 ― 4 min read

Audio and Speech Processing Improving Speech Recognition for All Speakers

A new method aims to enhance ASR systems for dysarthric speakers.

2025-11-04T22:58:25+00:00 ― 5 min read

Computation and Language Advancements in Learning Spoken Words with MAMLCon

A new method improves computer understanding of spoken commands with fewer examples.

2025-11-04T22:09:50+00:00 ― 5 min read

Computation and Language Improving Speaker Diarization Using Word Analysis

Enhancing speaker identification by combining sound and spoken words in audio.

2025-11-04T18:55:30+00:00 ― 5 min read

Audio and Speech Processing Adapting Gestures for Virtual Agents

Virtual agents learn to mimic human gestures for better interaction.

2025-11-04T18:06:55+00:00 ― 6 min read

Sound Simplifying Sound Synthesis with NAS-FM

A new method for creating synthesizers that benefits musicians.

2025-11-04T17:18:20+00:00 ― 6 min read

Audio and Speech Processing Advancements in Active Speaker Detection Technology

A new framework improves active speaker detection using audio and visual cues.

2025-11-04T16:29:45+00:00 ― 5 min read

Sound Strengthening Voice Verification Against Advanced Threats

A look at challenges and defenses in automatic speaker verification systems.

2025-11-04T15:41:10+00:00 ― 4 min read

Sound The Role of Optical Networks in Modern Communication

Optical networks enable fast data transfer, shaping the future of communication technology.

2025-11-04T14:04:00+00:00 ― 5 min read

Audio and Speech Processing Improving General Audio Models for Speech Tasks

A new method enhances general audio models for effective speech recognition.

2025-11-04T05:58:10+00:00 ― 6 min read

Computation and Language Advancements in Emotion Recognition in Conversations

New model enhances emotional understanding in dialogues.

2025-11-04T05:09:35+00:00 ― 6 min read

Computation and Language New Model Enhances Speech Translation Quality

A model combines spoken language and text to improve translation accuracy.

2025-11-04T04:21:00+00:00 ― 5 min read

Machine Learning Studying Marmoset Calls Through Human Speech Models

Research uses human speech models to analyze marmoset vocalizations effectively.

2025-11-04T03:32:25+00:00 ― 6 min read

Audio and Speech Processing Advancements in Lung Sound Analysis Technology

New methods improve early detection of respiratory diseases using sound data.

2025-11-04T02:43:50+00:00 ― 5 min read

Sound Distinguishing Between Happy and Mocking Laughter

This study examines how laughter conveys emotions through sound analysis.

2025-11-04T01:55:15+00:00 ― 4 min read

Audio and Speech Processing EfficientSpeech: On-Device Text-to-Speech Technology

A new model brings voice capabilities to devices without internet.

2025-11-04T01:06:40+00:00 ― 5 min read

Audio and Speech Processing Advancing Spoken Language Understanding with Continual Learning

This research uses continual learning to address forgetting in spoken language understanding models.

2025-11-04T00:18:05+00:00 ― 8 min read

Sound Advancements in Emotional Text-To-Speech Technology

New model ZET-Speech enhances emotional speech synthesis for diverse speakers.

2025-11-03T23:29:30+00:00 ― 5 min read

Sound Advancements in Transcribing Piano and Violin Music

Study finds new mixing techniques improve music transcription accuracy.

2025-11-03T21:52:20+00:00 ― 4 min read

Sound Advancing Human-Machine Interaction with Empathetic Dialogue

A new method enhances machine responses through better emotional understanding.

2025-11-03T21:03:45+00:00 ― 5 min read

Sound Advancing Speech Recognition in Multi-Talker Settings

A new method improves accuracy in automatic speech recognition for meetings.

2025-11-03T20:15:10+00:00 ― 5 min read

Sound Developing Empathetic Voice Assistants with CALLS

CALLS aims to improve voice assistants' ability to handle customer interactions.

2025-11-03T19:26:35+00:00 ― 5 min read

Audio and Speech Processing Advancements in Audio Inpainting Technology

New methods improve audio restoration and production quality.

2025-11-03T17:49:25+00:00 ― 5 min read

Audio and Speech Processing Advancements in Quantization for Speech Recognition Models

Research enhances quantization techniques to improve speech recognition model efficiency.

2025-11-03T11:20:45+00:00 ― 7 min read

Sound Revolutionizing Audio Quality Measurement with PLCMOS

PLCMOS offers a new way to evaluate speech quality without human listeners.

2025-11-03T10:32:10+00:00 ― 5 min read

Human-Computer Interaction LoopBoxes: A New Way to Make Music

LoopBoxes helps children create music easily and collaboratively.

2025-11-03T08:55:00+00:00 ― 5 min read

Sound Innovative Sound Synthesis Using Neural Networks

A new method for creating realistic impact sounds through neural networks.

2025-11-03T08:06:25+00:00 ― 5 min read

Computation and Language Improving Speech Recognition for Non-Native Speakers

New technique enhances ASR systems for better recognition of non-native accents.

2025-11-03T02:26:20+00:00 ― 6 min read

Audio and Speech Processing Advancing Speech Recognition with Weakly-Supervised Learning

New methods leverage speaker identity to improve speech recognition performance.

2025-11-03T01:37:45+00:00 ― 5 min read

Sound Improving Speech Recognition with the Sidecar Approach

A new method combines speech recognition and speaker identification for overlapping speech.

2025-11-03T00:49:10+00:00 ― 5 min read

Computation and Language Advancing Simultaneous Speech Translation with DiSeg

A novel method improves real-time translation quality and efficiency.

2025-11-03T00:00:35+00:00 ― 4 min read

Computation and Language Improving Few-Shot Learning with Attention Mechanism

A novel approach improves learning from fewer examples using multimodal data.

2025-11-02T22:23:25+00:00 ― 6 min read

Sound Estimating Room Impulse Responses with Multiple Sound Sources

A new method to estimate room responses in complex sound environments.

2025-11-02T21:34:50+00:00 ― 7 min read