Latest Articles for Speech Processing

Audio and Speech Processing Advancements in Spoken-Term Discovery with DUSTED

DUSTED improves efficiency in identifying spoken words by analyzing phonetic patterns.

2025-06-25T02:17:15+00:00 ― 5 min read

Audio and Speech Processing Advancements in Text-to-Speech with DualSpeech

The DualSpeech model improves text-to-speech (TTS) clarity and speaker resemblance.

2025-06-24T10:54:10+00:00 ― 6 min read

Computation and Language New Benchmark for Hindi Speech Recognition

Research improves speech recognition for Hindi with diverse accents.

2025-06-24T05:11:42+00:00 ― 4 min read

Audio and Speech Processing Advancements in Audio Technology: Introducing X-Codec

X-Codec improves audio generation by integrating semantic understanding into processing.

2025-06-21T15:41:45+00:00 ― 6 min read

Sound Advancements in Speech Emotion Recognition Systems

This study enhances speech emotion recognition (SER) through improved preprocessing and efficient attention models.

2025-06-18T12:23:30+00:00 ― 4 min read

Computation and Language Advancing Speech Models with Visual Learning

Research focuses on enhancing language learning through visually grounded speech models.

2025-06-18T03:42:12+00:00 ― 8 min read

Audio and Speech Processing Advancements in Voice Reconstruction Technology for Hearables

New methods improve voice clarity in noisy environments for hearables.

2025-06-17T23:26:10+00:00 ― 5 min read

Audio and Speech Processing Advancing Speech Quality in Noisy Settings

A new method improves speech clarity in loud environments.

2025-06-17T00:45:50+00:00 ― 5 min read

Audio and Speech Processing New Approach in Speech Emotion Recognition

A novel method combines meaning and sound for improved emotion detection in speech.

2025-06-16T16:40:00+00:00 ― 6 min read

Sound Advancements in Audio-Visual Speaker Diarization

An overview of audio-visual speaker diarization methods, challenges, and systems.

2025-06-15T21:14:00+00:00 ― 5 min read

Audio and Speech Processing Evaluating Mamba Model in Speech Processing Tasks

This research analyzes Mamba's performance in speech tasks, emphasizing sound reconstruction and recognition.

2025-06-14T23:22:15+00:00 ― 5 min read

Audio and Speech Processing Advancements in Text-Based Speech Generation

SSR-Speech offers new solutions for speech generation and editing.

2025-06-14T16:05:00+00:00 ― 5 min read

Audio and Speech Processing Acoustic Landmarks: A New Dataset for Speech Processing

Researchers develop a dataset to improve speech recognition and analysis techniques.

2025-06-13T19:50:25+00:00 ― 6 min read

Sound Understanding Emotion Recognition in Speech

A study reveals how deep learning models recognize emotions in speech.

2025-06-11T16:01:05+00:00 ― 5 min read

Audio and Speech Processing Advancing Speaker Verification with IML-KD Technique

A new method improves machine voice recognition for speaker verification.

2025-06-11T09:32:25+00:00 ― 6 min read

Audio and Speech Processing Improving Human-Robot Interaction through Emotion Recognition

Study highlights advances in robot emotion recognition using Vision Transformers.

2025-06-10T02:46:15+00:00 ― 6 min read

Audio and Speech Processing Advancements in Speech Recognition for Multi-Talker Scenarios

A new framework simplifies speech recognition in busy environments.

2025-06-07T20:31:10+00:00 ― 5 min read

Audio and Speech Processing Advancing Speech Processing with Consistency in Phase Reconstruction

A new loss function boosts audio quality by aligning phase and magnitude.

2025-06-03T12:03:55+00:00 ― 6 min read

Audio and Speech Processing Advancements in Neural Codecs with ESPnet-Codec

ESPnet-Codec enhances training and evaluation of neural codecs for audio and speech.

2025-06-03T03:09:30+00:00 ― 7 min read

Audio and Speech Processing Advancing Multi-Task Learning in Speech Models

A new method improves speech and audio processing across multiple tasks.

2025-06-02T10:57:50+00:00 ― 5 min read

Sound Integrating Audio-Visual Data for Speech Processing

This study analyzes how audio, video, and text work together in speech recognition.

2025-05-30T15:13:22+00:00 ― 7 min read

Sound Advancements in Speaker Emotion Recognition Technology

Exploring new methods for recognizing emotions in speech using advanced models.

2025-05-24T20:14:18+00:00 ― 7 min read

Computation and Language Topological Data Analysis in Natural Language Processing

Discover how TDA enhances understanding in language analysis.

2025-05-22T13:35:24+00:00 ― 6 min read

Audio and Speech Processing Identifying the Source of Fake Speech

A new method aims to detect the origin of synthetic voices.

2025-05-03T14:39:08+00:00 ― 7 min read

Audio and Speech Processing Advancements in Speech Separation with Codecformer-EL

New methods improve speech separation using neural audio codecs for clearer communication.

2025-04-26T00:20:40+00:00 ― 8 min read

Computation and Language Advancements in Speech Recognition Technology

New methods improve speech recognition while retaining previously learned knowledge.

2025-04-21T11:17:42+00:00 ― 5 min read

Sound Advancements in Automatic Speech Recognition

New methods improve how machines recognize spoken language.

2025-04-20T10:37:12+00:00 ― 8 min read

Sound The Future of Voice Cloning: A New Era

Voice cloning technology is advancing, creating lifelike speech that mimics human conversation.

2025-04-11T04:32:42+00:00 ― 6 min read

Audio and Speech Processing Preserving Syllable Stress in Noisy Environments

Research explores how speech enhancement models maintain syllable stress amidst noise.

2025-03-07T10:31:48+00:00 ― 6 min read

Sound Boosting Target Speaker Extraction with New Data

Researchers improve speech processing using Libri2Vox and synthetic data techniques.

2025-02-23T07:21:54+00:00 ― 6 min read

Sound Bringing Dubbing to Life: Enhancing Lip Synchrony

A new method improves lip synchrony in dubbed videos for a natural viewing experience.

2025-02-03T03:44:06+00:00 ― 6 min read