Electrical Engineering and Systems Science - Audio and Speech Processing

Audio and Speech Processing Improving AI Understanding of Speech and Emotion

A new approach trains AI to better recognize speech and emotions in noisy environments.

2025-11-05T17:35:50+00:00 ― 5 min read

Audio and Speech Processing Innovative Audio Analysis for Family Interaction

New methods aim to improve understanding of family dynamics and children's mental health.

2025-11-05T16:47:15+00:00 ― 6 min read

Audio and Speech Processing Advances in Speaker Protection Systems

New deep learning methods enhance speaker diaphragm movement predictions.

2025-11-05T16:10:24+00:00 ― 5 min read

Computation and Language Harnessing ciwGAN for Phonological Analysis

Exploring how ciwGAN can learn and represent phonological features like nasality.

2025-11-05T15:10:05+00:00 ― 4 min read

Audio and Speech Processing Advancements in Speech Recognition with MH-SSM

A new model improves speech recognition efficiency and accuracy.

2025-11-05T14:21:30+00:00 ― 4 min read

Audio and Speech Processing Advancing Speech Recognition with Contextual Insight

A new method enhances speech recognition accuracy using contextual information.

2025-11-05T13:32:55+00:00 ― 5 min read

Sound Simulating Noisy Speech for Better Recognition

Researchers use GANs to generate noisy speech from clean audio, improving speech models.

2025-11-05T12:44:20+00:00 ― 6 min read

Sound Introducing the JNV Corpus: A New Collection of Japanese Nonverbal Vocalizations

The JNV corpus captures diverse emotional sounds in Japanese, enriching existing collections.

2025-11-05T11:55:45+00:00 ― 5 min read

Sound Advancements in Realistic Laughter Synthesis

New methods improve laughter generation for realistic human-computer interactions.

2025-11-05T11:07:10+00:00 ― 5 min read

Sound Detecting Synthetic Speech: Challenges and Solutions

A look at identifying fake audio in today's tech-driven world.

2025-11-05T10:18:35+00:00 ― 4 min read

Computation and Language Advancing Speech Models Through Text Knowledge

Using text models to enhance speech generation for better understanding.

2025-11-05T09:30:00+00:00 ― 8 min read

Computation and Language Improving ASR Accuracy with Synthetic Data Techniques

Research shows how synthetic text can enhance ASR systems effectively.

2025-11-05T04:38:30+00:00 ― 5 min read

Machine Learning Advancing Multi-modal Learning with C-MCR

C-MCR simplifies multi-modal learning by connecting existing knowledge efficiently.

2025-11-05T03:49:55+00:00 ― 6 min read

Sound FluentSpeech: A New Approach to Stutter Removal

FluentSpeech offers an automatic solution for smoother speech editing.

2025-11-05T02:12:45+00:00 ― 6 min read

Audio and Speech Processing Modular Domain Adaptation: A New Approach to Speech Recognition

MDA improves speech recognition by adapting models to specific data domains.

2025-11-05T01:24:10+00:00 ― 6 min read

Medical Physics New Study Links Brain Signals to Tongue Movement

Research shows brain signals can help predict tongue movements during speech.

2025-11-04T23:54:21+00:00 ― 6 min read

Sound Advances in Text-to-Speech Technology with U-DiT

U-DiT TTS system enhances natural speech generation through innovative architecture.

2025-11-04T23:47:00+00:00 ― 4 min read

Audio and Speech Processing Improving Speech Recognition for All Speakers

A new method aims to enhance ASR systems for dysarthric speakers.

2025-11-04T22:58:25+00:00 ― 5 min read

Computation and Language Advancements in Learning Spoken Words with MAMLCon

A new method improves computer understanding of spoken commands with fewer examples.

2025-11-04T22:09:50+00:00 ― 5 min read

Computation and Language Improving Speaker Diarization Using Word Analysis

Enhancing speaker identification by combining sound and spoken words in audio.

2025-11-04T18:55:30+00:00 ― 5 min read

Audio and Speech Processing Adapting Gestures for Virtual Agents

Virtual agents learn to mimic human gestures for better interaction.

2025-11-04T18:06:55+00:00 ― 6 min read

Sound Simplifying Sound Synthesis with NAS-FM

A new method for creating synthesizers that benefits musicians.

2025-11-04T17:18:20+00:00 ― 6 min read

Audio and Speech Processing Advancements in Active Speaker Detection Technology

A new framework improves active speaker detection using audio and visual cues.

2025-11-04T16:29:45+00:00 ― 5 min read

Sound Strengthening Voice Verification Against Advanced Threats

A look at challenges and defenses in automatic speaker verification systems.

2025-11-04T15:41:10+00:00 ― 4 min read

Sound The Role of Optical Networks in Modern Communication

Optical networks enable fast data transfer, shaping the future of communication technology.

2025-11-04T14:04:00+00:00 ― 5 min read

Audio and Speech Processing Improving General Audio Models for Speech Tasks

A new method enhances general audio models for effective speech recognition.

2025-11-04T05:58:10+00:00 ― 6 min read

Computation and Language Advancements in Emotion Recognition in Conversations

New model enhances emotional understanding in dialogues.

2025-11-04T05:09:35+00:00 ― 6 min read

Computation and Language New Model Enhances Speech Translation Quality

A model combines spoken language and text to improve translation accuracy.

2025-11-04T04:21:00+00:00 ― 5 min read

Machine Learning Studying Marmoset Calls Through Human Speech Models

Research uses human speech models to analyze marmoset vocalizations effectively.

2025-11-04T03:32:25+00:00 ― 6 min read

Audio and Speech Processing Advancements in Lung Sound Analysis Technology

New methods improve early detection of respiratory diseases using sound data.

2025-11-04T02:43:50+00:00 ― 5 min read

Sound Distinguishing Between Happy and Mocking Laughter

This study examines how laughter conveys emotions through sound analysis.

2025-11-04T01:55:15+00:00 ― 4 min read

Audio and Speech Processing EfficientSpeech: On-Device Text-to-Speech Technology

A new model brings voice capabilities to devices without internet.

2025-11-04T01:06:40+00:00 ― 5 min read

Audio and Speech Processing Advancing Spoken Language Understanding with Continual Learning

This research addresses forgetting in AI through continual learning in spoken language understanding.

2025-11-04T00:18:05+00:00 ― 8 min read

Sound Advancements in Emotional Text-To-Speech Technology

New model ZET-Speech enhances emotional speech synthesis for diverse speakers.

2025-11-03T23:29:30+00:00 ― 5 min read

Sound Advancements in Transcribing Piano and Violin Music

Study finds new mixing techniques improve music transcription accuracy.

2025-11-03T21:52:20+00:00 ― 4 min read

Sound Advancing Human-Machine Interaction with Empathetic Dialogue

A new method enhances machine responses through better emotional understanding.

2025-11-03T21:03:45+00:00 ― 5 min read

Sound Advancing Speech Recognition in Multi-Talker Settings

A new method improves accuracy in automatic speech recognition for meetings.

2025-11-03T20:15:10+00:00 ― 5 min read

Sound Developing Empathetic Voice Assistants with CALLS

CALLS aims to improve voice assistants' ability to handle customer interactions.

2025-11-03T19:26:35+00:00 ― 5 min read

Audio and Speech Processing Advancements in Audio Inpainting Technology

New methods improve audio restoration and production quality.

2025-11-03T17:49:25+00:00 ― 5 min read

Audio and Speech Processing Advancements in Quantization for Speech Recognition Models

Research enhances quantization techniques to improve speech recognition model efficiency.

2025-11-03T11:20:45+00:00 ― 7 min read