Latest Articles for Audio Processing

Machine Learning

Challenges in Using Convnets for Audio Filterbank Design

This study explores issues with using convnets for audio filterbank creation.

2025-09-14T14:34:35+00:00 ― 5 min read

Sound

Advancements in Audio and Language Models

The CLAP model bridges audio and text processing for various applications.

2025-09-14T13:46:00+00:00 ― 4 min read

Sound

New System Improves Voice Extraction from Unstable Head Positions

PIAVE helps machines extract voices clearly, even when speakers turn their heads.

2025-09-12T19:39:40+00:00 ― 6 min read

Audio and Speech Processing

Improving Speech Clarity with AV2Wav Technology

AV2Wav enhances speech quality using audio and visual cues.

2025-09-12T17:13:55+00:00 ― 5 min read

Sound

A New Framework for Speaker Anonymization

Introducing a flexible framework to enhance voice privacy research.

2025-09-12T05:05:10+00:00 ― 7 min read

Sound

Emotional Speech Challenges Speech Separation Models

Research reveals emotional speech impacts model performance in speech separation tasks.

2025-09-11T18:33:35+00:00 ― 6 min read

Audio and Speech Processing

Advancing Fake Speech Detection Techniques

New methods are improving our ability to detect fake speech effectively.

2025-09-11T02:21:55+00:00 ― 6 min read

Sound

Improving Vocoder Training with Contrastive Learning

New methods enhance vocoder performance with limited audio data.

2025-09-10T12:36:00+00:00 ― 5 min read

Sound

A New Method for Detecting Voice Spoofing

A robust approach to identify audio anomalies and combat voice spoofing.

2025-09-09T07:27:00+00:00 ― 5 min read

Sound

DiCon: A New Approach to Speech Synthesis

Introducing a faster method for high-quality speech synthesis using diffusion models.

2025-09-09T03:24:05+00:00 ― 6 min read

Audio and Speech Processing

HiFTNet: Advancing Text-to-Speech Technology

HiFTNet offers faster, high-quality speech synthesis using efficient, innovative techniques.

2025-09-09T02:35:30+00:00 ― 5 min read

Audio and Speech Processing

Introducing AV-SUPERB: A New Benchmark for Audio-Visual Models

AV-SUPERB evaluates audio and visual models across various tasks for better performance.

2025-09-08T22:32:35+00:00 ― 5 min read

Sound

Faster Text-to-Audio Generation Using Consistency Distillation

New method improves speed and efficiency in Text-to-Audio generation.

2025-09-08T18:29:40+00:00 ― 4 min read

Audio and Speech Processing

Introducing the SPGM Model for Speech Separation

A new model improves speech separation efficiency and performance.

2025-09-07T10:54:55+00:00 ― 5 min read

Sound

Innovative Method Transforms Audio Captioning with Text Data

A new approach generates audio captions using only text, improving data efficiency.

2025-09-07T00:23:20+00:00 ― 7 min read

Sound

Connecting Music: Audio and Sheet Music Retrieval

Exploring the challenges and innovations in matching audio recordings to sheet music.

2025-09-06T21:57:35+00:00 ― 6 min read

Audio and Speech Processing

Improving Audio Datasets with K-Means Clustering

Using k-means clustering to optimize audio data for better model training.

2025-09-06T15:28:55+00:00 ― 5 min read

Audio and Speech Processing

Improving Speech Recognition with Audio Augmentation Techniques

Study shows audio augmentation can enhance speech recognition in low-resource languages.

2025-09-06T09:48:50+00:00 ― 5 min read

Machine Learning

Improving Weak Label Learning Through Negative Example Selection

New strategies enhance weak label learning by selecting relevant negative examples.

2025-09-06T04:57:20+00:00 ― 6 min read

Audio and Speech Processing

Efficient Model Selection for Speech Recognition

A method to choose the best ASR model based on audio features.

2025-09-05T23:17:15+00:00 ― 5 min read

Audio and Speech Processing

Improving Speech Clarity with Dereverberation Techniques

Learn how dereverberation boosts speech recognition in noisy environments.

2025-09-05T12:45:40+00:00 ― 4 min read

Audio and Speech Processing

New Method for Room Volume Estimation Using Attention Models

This study presents an attention-based model for estimating room volumes from audio recordings.

2025-09-05T11:08:30+00:00 ― 5 min read

Sound

Introducing ASCA: A New Approach to Audio Classification

The ASCA model enhances audio classification accuracy for small datasets.

2025-09-05T10:19:55+00:00 ― 5 min read

Sound

Transforming Tongue Movements into Speech Sounds

This study converts MRI tongue data into real speech audio.

2025-09-04T22:11:10+00:00 ― 4 min read

Audio and Speech Processing

Advances and Challenges in Speech Recognition Models

This study examines how model compression impacts speech recognition in noisy environments.

2025-09-04T19:45:25+00:00 ― 5 min read

Audio and Speech Processing

Advancements in Sound Event Detection with OAL

Explore how Online Active Learning improves sound recognition efficiency.

2025-09-04T18:56:50+00:00 ― 6 min read

Sound

Advancements in Audio and Speech Recognition Model

A new model improves understanding of speech and sounds simultaneously.

2025-09-04T18:08:15+00:00 ― 6 min read

Sound

Advancements in Audio Classification Using DCLS

DCLS enhances audio classification performance by learning kernel positions during training.

2025-09-04T07:36:40+00:00 ― 5 min read

Computer Vision and Pattern Recognition

Improving Audio-Visual Learning with Speed Co-Augmentation

A new method enhances machine learning of audio-visual data.

2025-09-04T05:59:30+00:00 ― 5 min read

Audio and Speech Processing

MC-SimCLR: Advancing Sound Learning and Location Awareness

A new method enhances sound recognition and source location without labels.

2025-09-03T00:50:30+00:00 ― 5 min read

Sound

New Insights into Generalization in Neural Networks

Exploring how sharpness of minima influences model performance on unseen audio data.

2025-09-02T15:56:05+00:00 ― 5 min read

Sound

Transformers in Music Representation Learning

A study on using transformers for effective music tagging and representation.

2025-09-02T07:01:40+00:00 ― 6 min read

Audio and Speech Processing

A Universal Approach to Speech Enhancement

This research presents a model for improving speech clarity across different conditions.

2025-09-02T02:10:10+00:00 ― 5 min read

Sound

The Rise of Automated Audio Captioning

Exploring advancements in automated audio captioning and its impact on accessibility.

2025-09-02T01:21:35+00:00 ― 5 min read

Sound

Advancements in Text-to-Audio Grounding Techniques

New methods enhance linking text descriptions to sound events.

2025-08-31T16:09:40+00:00 ― 7 min read

Audio and Speech Processing

Advancements in Speaker Diarization with E-SHARC Method

E-SHARC improves speaker identification in various audio environments.

2025-08-28T06:22:45+00:00 ― 6 min read

Computer Vision and Pattern Recognition

Advancing Audio-Visual Segmentation with Unsupervised Techniques

A new approach simplifies audio-visual segmentation without costly labeled data.

2025-08-27T01:00:18+00:00 ― 7 min read

Audio and Speech Processing

New Method to Clear Echoed Speech

A method enhances speech clarity in noisy environments without clean training data.

2025-08-26T17:56:30+00:00 ― 6 min read

Functional Analysis

Wavelets and Smoothness: A Practical Insight

Explore the role of wavelets in analyzing function smoothness and its applications.

2025-08-24T23:53:28+00:00 ― 5 min read

Audio and Speech Processing

Improving Speaker Diarization with Multi-Microphone Approaches

New methods enhance voice activity and overlap detection in speaker diarization.

2025-08-24T13:18:35+00:00 ― 6 min read