Computer Science - Sound

Sound MoisesDB: A Breakthrough in Music Source Separation

MoisesDB offers a detailed dataset for advanced music source separation.

2025-10-02T09:18:00+00:00 ― 6 min read

Sound Advancing Music Captioning with Large Language Models

Using LLMs to create a vast dataset for music captioning.

2025-10-02T08:29:25+00:00 ― 6 min read

Computation and Language Advancements in Pronunciation Training Technology

Researchers are improving pronunciation training with new technologies for language learners.

2025-10-02T07:40:50+00:00 ― 5 min read

Sound Advancements in Voice Style Transfer Technology

HierVST transforms voices seamlessly, enhancing audio quality without needing extensive data.

2025-10-02T05:15:05+00:00 ― 5 min read

Multimedia Advancements in Engagement Estimation for Conversations

Research develops a model to accurately measure engagement in conversations.

2025-10-01T21:57:50+00:00 ― 6 min read

Computer Vision and Pattern Recognition DAVIS: A New Approach to Sound Separation

DAVIS offers a fresh way to tackle audio-visual sound separation.

2025-10-01T19:32:05+00:00 ― 5 min read

Sound Advancing Audio-Visual Segmentation Techniques

A new method more accurately identifies sound-producing objects in videos.

2025-10-01T13:52:00+00:00 ― 6 min read

Sound Advancements in Text-to-Speech with DiffProsody

DiffProsody enhances speech synthesis speed and quality through innovative prosody generation.

2025-10-01T13:03:25+00:00 ― 4 min read

Sound Addressing the Loudness War with De-limiter Networks

New technology aims to restore music quality lost in loudness compression.

2025-10-01T02:31:50+00:00 ― 5 min read

Sound Automated System for Identifying Aphasia

New method promises quicker identification of speech disorders like aphasia.

2025-09-30T21:40:20+00:00 ― 5 min read

Cryptography and Security Inaudible Sound Techniques for Speech Manipulation

New method uses ultrasonic sounds to confuse speech recognition systems without detection.

2025-09-30T19:14:35+00:00 ― 6 min read

Computation and Language Advancements in Text-to-Speech Technology

New methods improve the quality of synthesized speech using self-supervised learning.

2025-09-30T17:37:25+00:00 ― 5 min read

Computation and Language Improving Speech Recognition with Keyword Boosting

A new method enhances the transcription of rare keywords in business conversations.

2025-09-30T10:20:10+00:00 ― 6 min read

Sound Advancing Speech Recognition with Federated Learning

Federated Learning improves speech recognition while keeping user data private.

2025-09-30T08:43:00+00:00 ― 5 min read

Sound MusicLDM: A New Approach to Text-to-Music Generation

MusicLDM transforms text into original music, offering fresh avenues for creativity.

2025-09-30T05:28:40+00:00 ― 7 min read

Sound Improving Singing Melody Extraction Techniques with Deep Learning

New methods enhance the accuracy of extracting singing melodies from mixed audio.

2025-09-30T01:25:45+00:00 ― 7 min read

Computation and Language Advancements in Audio Captioning Technology

New methods aim to enhance audio captioning for better accuracy and efficiency.

2025-09-30T00:25:00+00:00 ― 5 min read

Sound Advancements in Speech Enhancement Techniques

New model improves speech clarity in noisy environments using innovative methods.

2025-09-29T22:11:25+00:00 ― 5 min read

Sound Analyzing Korean Folk Songs Through Technology

A study on Korean folk songs using modern analytical methods.

2025-09-29T21:22:50+00:00 ― 8 min read

Graphics DiffDance: A New Era in Dance Generation

DiffDance creates detailed dance sequences that match music effectively.

2025-09-29T16:31:20+00:00 ― 5 min read

Sound Addressing Gender Bias in Singing Voice Transcription

Examining fairness in singing voice transcription technology across genders.

2025-09-29T15:42:45+00:00 ― 8 min read

Sound Advancements in Hotword Customization for ASR Systems

SeACo-Paraformer brings flexibility and accuracy to speech recognition technology.

2025-09-29T14:05:35+00:00 ― 5 min read

Audio and Speech Processing Examining Voice Quality and Its Impact

This study explores voice quality classification methods and their significance in communication.

2025-09-29T12:28:25+00:00 ― 4 min read

Audio and Speech Processing Advancements in Active Noise Control Technology

Learn how new algorithms improve noise cancellation techniques for various applications.

2025-09-29T05:59:45+00:00 ― 4 min read

Audio and Speech Processing New Tool Measures Audio Quality with Video Insights

AudioVMAF applies video quality metrics to audio quality assessment.

2025-09-29T01:56:50+00:00 ― 5 min read

Sound Advancements in Fake Audio Detection with RAWM

A new method improves detection of fake audio using adaptive weight modification.

2025-09-29T01:08:15+00:00 ― 5 min read

Cryptography and Security The Growing Need for Steganalysis in Information Security

Steganalysis helps detect hidden messages in multimedia, ensuring secure communication.

2025-09-28T23:31:05+00:00 ― 4 min read

Multimedia TranSTYLer: A Leap in Virtual Communication

Transforming gestures for virtual agents with preserved meaning.

2025-09-28T18:39:35+00:00 ― 6 min read

Sound Advancements in Sound Source Localization Using Neural Networks

Exploring how neural networks improve the accuracy of sound source localization.

2025-09-28T12:10:55+00:00 ― 6 min read

Computation and Language Improving Punjabi Speech Recognition with Self-Training Methods

Researchers enhance automatic speech recognition for Punjabi using innovative self-training techniques.

2025-09-28T08:56:35+00:00 ― 5 min read

Sound Advancements in Target-Speaker Speech Recognition

New model improves speech recognition in noisy environments by focusing on a single speaker.

2025-09-28T08:08:00+00:00 ― 4 min read

Sound Balancing Privacy and Smart Audio Monitoring

New methods aim to protect speech privacy in audio monitoring systems.

2025-09-28T06:30:50+00:00 ― 5 min read

Computation and Language Advancing Expressive Speech Synthesis with New Dataset

A new dataset enhances speech synthesis by capturing emotional expression without relying on text.

2025-09-27T18:22:05+00:00 ― 5 min read

Audio and Speech Processing Improving Music Pitch Classification with SDTW

New strategies to enhance training stability for music pitch classification.

2025-09-27T13:30:35+00:00 ― 6 min read

Sound Advancements in Voice Conversion Technology

Phoneme Hallucinator transforms voice conversion with limited data for clearer outputs.

2025-09-27T10:16:15+00:00 ― 5 min read

Sound Advancing Gesture Generation for Digital Humans

A new method creates realistic gestures from raw speech audio.

2025-09-27T08:39:05+00:00 ― 5 min read

Audio and Speech Processing Advancing Bilingual Speech Recognition with Grapheme Units

Enhancing hybrid ASR systems for bilingual speech using grapheme units.

2025-09-27T03:47:35+00:00 ― 5 min read

Computation and Language Advances in Joint Speech-Text Learning

A new model improves speech and text alignment for better automatic recognition.

2025-09-27T02:10:25+00:00 ― 6 min read

Sound Advancements in Visual Speech Recognition with Lip2Vec

Lip2Vec enhances visual speech recognition using less labeled data.

2025-09-27T01:21:50+00:00 ― 7 min read

Computation and Language Advancements in Speech Recognition Technology

New methods enhance accuracy and speed in speech recognition systems.

2025-09-26T11:35:55+00:00 ― 5 min read