Computer Science - Sound

Sound Advancements in Automated Audio Captioning

New methods improve accessibility and accuracy in audio captioning.

2025-08-21T14:03:15+00:00 ― 6 min read

Sound Detecting Deepfake Audio Calls: A New Approach

Learn how to identify fake audio calls with innovative challenge-response techniques.

2025-08-21T07:34:35+00:00 ― 5 min read

Computer Vision and Pattern Recognition CustomListener: A New Era in Virtual Interactions

CustomListener creates realistic avatars that respond to conversations dynamically.

2025-08-21T05:08:50+00:00 ― 6 min read

Sound Advancements in Automatic Speaker Diarization Techniques

Research highlights the importance of timing over specific speaker features in diarization models.

2025-08-21T00:17:20+00:00 ― 6 min read

Multimedia Advancements in Lip-to-Speech Technology

New method enhances speech synthesis for individuals who cannot speak.

2025-08-20T20:14:25+00:00 ― 6 min read

Human-Computer Interaction Advancements in Silent Speech Interfaces

A look at MONA, a system enhancing silent speech communication.

2025-08-20T16:11:30+00:00 ― 5 min read

Sound Understanding Automatic Speech Recognition Technology

An overview of ASR and its advancements in modern applications.

2025-08-20T15:22:55+00:00 ― 4 min read

Audio and Speech Processing Advancements in Speech Emotion Recognition with EMOVOME Database

Exploring new methods to improve speech emotion recognition using natural data.

2025-08-20T01:37:00+00:00 ― 5 min read

Robotics Improving Robot Voice Recognition in Noisy Settings

Research focuses on helping robots better understand speech amidst background noise.

2025-08-19T22:22:40+00:00 ― 5 min read

Sound Automating Music Difficulty Assessment Using Audio Analysis

This study advances music education by automating the assessment of piano piece difficulty.

2025-08-19T21:34:05+00:00 ― 6 min read

Audio and Speech Processing Evaluating Voice Recognition in Noisy Environments

A new benchmark assesses voice recognition systems' performance amidst various disturbances.

2025-08-19T14:16:50+00:00 ― 5 min read

Sound The Future of AI in Music Creation

Exploring AI's role in shaping music through advanced techniques and structures.

2025-08-18T14:47:55+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Models with RobustDistiller

A new method enhances speech model performance and efficiency in noisy environments.

2025-08-18T05:53:30+00:00 ― 5 min read

Sound Neural-SRP: Advancing Sound Source Localization

A new method combines traditional techniques with neural networks for better sound localization.

2025-08-17T23:24:50+00:00 ― 5 min read

Sound Advancing Acoustic Sensing with Deep Learning

A novel approach to enhance acoustic sensing without compromising audio quality.

2025-08-17T20:59:05+00:00 ― 6 min read

Sound Advancements in Gesture Generation from Speech

A new system improves realistic gesture creation using only speech audio.

2025-08-17T14:30:25+00:00 ― 6 min read

Sound Notochord: A New MIDI Tool for Musicians

Notochord enhances real-time MIDI music creation using AI for richer performances.

2025-08-17T06:24:35+00:00 ― 6 min read

Sound Prompt-Singer: A New Approach to Singing Voice Control

A method for more intuitive control over singing voices using natural language prompts.

2025-08-17T01:33:05+00:00 ― 7 min read

Sound Advancements in Speech Emotion Recognition with emoDARTS

New model emoDARTS improves accuracy in recognizing speech emotions using deep learning.

2025-08-16T17:27:15+00:00 ― 6 min read

Sound Advancements in Text-to-Speech Voice Characteristics

A study on improving TTS systems with diverse voice samples.

2025-08-16T12:35:45+00:00 ― 4 min read

Audio and Speech Processing Advances in Speech Editing Technology

New tools enhance voice recording editing and production quality.

2025-08-15T09:03:55+00:00 ― 5 min read

Computer Vision and Pattern Recognition Advancements in Dance Accompaniment Technology

New models enhance duet interactions in virtual dance performances.

2025-08-14T21:43:45+00:00 ― 6 min read

Audio and Speech Processing Reviving History: The Art of Audio Restoration

Discover how generative equalization breathes new life into old music recordings.

2025-08-14T20:06:35+00:00 ― 7 min read

Computation and Language Classifying Sorani Kurdish Subdialects Through Audio Data

Research identifies and classifies Sorani Kurdish dialects using extensive audio recordings.

2025-08-14T07:57:50+00:00 ― 6 min read

Audio and Speech Processing Automating Sound Tuning for Realistic Acoustics

A new method improves sound processing through automatic tuning of Feedback Delay Networks.

2025-08-14T07:09:15+00:00 ― 6 min read

Audio and Speech Processing Advancements in Automatic Speech Quality Assessment

A new method improves speech evaluation using entire recordings.

2025-08-14T06:20:40+00:00 ― 7 min read

Sound Measuring Adherence in Generative Music Models

A new approach to evaluate how well music follows audio prompts.

2025-08-13T23:03:25+00:00 ― 8 min read

Computer Vision and Pattern Recognition Introducing the 360+x Dataset for Enhanced Scene Understanding

A new dataset improves how robots interpret real-world environments.

2025-08-13T18:11:55+00:00 ― 6 min read

Sound New Approach to Audio Separation Using Language

This method improves audio separation by combining language descriptions with sound analysis.

2025-08-13T14:57:35+00:00 ― 6 min read

Computer Vision and Pattern Recognition Introducing UniAV: A Unified Approach to Video Localization

UniAV combines action localization, sound detection, and audio-visual event localization for better video understanding.

2025-08-13T10:06:05+00:00 ― 7 min read

Audio and Speech Processing CLaM-TTS: Advancing Text-to-Speech Technology

CLaM-TTS improves speech synthesis using advanced techniques for better efficiency and quality.

2025-08-13T08:28:55+00:00 ― 6 min read

Social and Information Networks Analyzing Music Through Graphs

Graphs allow for new insights into music structure and relationships.

2025-08-13T03:09:57+00:00 ― 5 min read

Audio and Speech Processing Improving Text-to-Speech with RALL-E

RALL-E enhances text-to-speech synthesis for clearer, more natural speech.

2025-08-13T01:11:40+00:00 ― 5 min read

Sound Advancements in Virtual Analog Audio Modeling

Exploring machine learning techniques for modeling analog audio effects.

2025-08-12T22:37:18+00:00 ― 6 min read

Sound MuPT: Advancing Music Generation with ABC Notation

MuPT utilizes ABC notation for effective music generation with AI.

2025-08-12T09:00:00+00:00 ― 5 min read

Audio and Speech Processing Advancing Audio Learning with M2D and M2D-X

New methods improve audio representation through self-supervised learning techniques.

2025-08-12T07:22:50+00:00 ― 6 min read

Audio and Speech Processing Improving Sound Field Reconstruction with AI

A method using AI enhances sound representation in various environments.

2025-08-12T00:54:10+00:00 ― 6 min read

Classical Physics Understanding Spectral Moments in Electromagnetic Testing

Explore the role of spectral moments in reverberation chamber testing and the impact of noise.

2025-08-12T00:28:33+00:00 ― 5 min read

Audio and Speech Processing Efficient Real-Time Piano Transcription Model

A new system for accurate and lightweight real-time piano transcription.

2025-08-12T00:05:35+00:00 ― 5 min read

Computer Vision and Pattern Recognition Any2Point: Bridging 3D Understanding in AI Models

A new framework enhances AI's grasp of 3D spaces.

2025-08-11T19:14:05+00:00 ― 7 min read