Computer Science - Sound

Sound Advancing Multi-Audio Processing with MALLM

Introducing a new model and benchmark for evaluating multi-audio tasks.

2025-05-31T19:17:15+00:00 ― 5 min read

Sound Animating Emotions for Realistic Talking Heads

A new system models emotional intensity in animated characters for enhanced realism.

2025-05-31T16:51:30+00:00 ― 6 min read

Sound OpenSep: Advancing Audio Separation Technology

OpenSep automates audio separation for clearer sound experiences without manual input.

2025-05-31T07:15:34+00:00 ― 6 min read

Sound PALM: A New Approach to Audio Recognition

PALM enhances audio recognition by optimizing prompt representation and efficiency.

2025-05-31T01:54:50+00:00 ― 4 min read

Audio and Speech Processing Understanding Guitar Pickups: Wire Turns and Gauge

Explore how wire turns and gauge impact guitar pickup sound.

2025-05-31T00:34:39+00:00 ― 7 min read

Audio and Speech Processing Advancements in Speech Recognition Technology

A new method improves speech recognition for long recordings.

2025-05-30T21:54:17+00:00 ― 5 min read

Sound Integrating Audio-Visual Data for Speech Processing

This study analyzes how audio, video, and text work together in speech recognition.

2025-05-30T15:13:22+00:00 ― 7 min read

Computation and Language Advancing Text-to-Speech with New Intonation Model

A new model improves naturalness in text-to-speech systems by analyzing pitch patterns.

2025-05-30T01:51:32+00:00 ― 4 min read

Computation and Language Advancing Speech Technology for African Languages

A new model enhances speech representation for African languages, boosting inclusivity in technology.

2025-05-29T21:50:59+00:00 ― 5 min read

Sound Melody-Guided AI Music Generation

A new model improves music creation using melody and text descriptions.

2025-05-29T20:30:48+00:00 ― 4 min read

Audio and Speech Processing Advancements in Speech Language Models Without Extensive Training Data

A new method for speech language models reduces the need for extensive training data.

2025-05-29T17:50:26+00:00 ― 6 min read

Sound Changing Voices: The Voice Conversion Process

Learn how voice conversion works and its exciting applications.

2025-05-29T13:49:53+00:00 ― 4 min read

Multimedia Evaluating Multimedia Quality with CCI

Discover how CCI improves multimedia quality assessments.

2025-05-29T12:29:42+00:00 ― 6 min read

Multimedia The New Age of Lie Detection

Researchers combine audio and visual cues to detect lies more accurately.

2025-05-29T11:09:31+00:00 ― 6 min read

Human-Computer Interaction Innovative Communication System for Disaster Response

A new voice-based network bridges language gaps in emergencies.

2025-05-29T09:49:20+00:00 ― 6 min read

Audio and Speech Processing Advancements in Device-Directed Speech Detection

Learn how virtual assistants understand user commands better.

2025-05-29T05:48:47+00:00 ― 6 min read

Sound Revolutionizing Audio Captioning with MACE

MACE improves audio captioning by linking sounds to accurate text descriptions.

2025-05-28T17:47:08+00:00 ― 5 min read

Sound Predicting Song Cover Success with Machine Learning

Using machine learning to forecast audience reaction to song covers.

2025-05-28T15:06:46+00:00 ― 7 min read

Sound Improving Audio Classification with ADD Loss

A new approach to enhance classification through Angular Distance Distribution Loss.

2025-05-28T13:46:35+00:00 ― 6 min read

Computation and Language Advancements in Speech Recognition for People with Disabilities

New methods improve communication tools for individuals with speech difficulties.

2025-05-28T11:06:13+00:00 ― 7 min read

Sound Estimating Human Poses Using Sound Waves

Researchers use sound waves to estimate human poses without cameras.

2025-05-27T23:13:12+00:00 ― 8 min read

Audio and Speech Processing Improving Sound Detection in Noisy Environments

New methods using language models enhance sound detection amidst background noise.

2025-05-27T03:01:49+00:00 ― 6 min read

Sound Fish-Speech: A New Era in Text-to-Speech

Fish-Speech enhances voice technology for a more natural communication experience.

2025-05-27T01:41:38+00:00 ― 6 min read

Sound EmoSphere++: A New Era in Emotional Machines

EmoSphere++ enables machines to express emotions like humans, enhancing interactions.

2025-05-26T05:38:53+00:00 ― 7 min read

Sound New Method for Underwater Boundary Estimation

U-COTANS improves underwater boundary detection using deep learning techniques.

2025-05-26T02:58:31+00:00 ― 6 min read

Sound Introducing PIAST: A New Dataset for Piano Music Research

PIAST offers a unique collection of piano music for researchers.

2025-05-26T01:38:20+00:00 ― 5 min read

Computer Vision and Pattern Recognition Advancing Technology with 3D Audio-Visual Segmentation

Machines learn to connect sound and visuals in 3D spaces.

2025-05-25T21:37:47+00:00 ― 7 min read

Audio and Speech Processing The Evolution of Speaker Diarization

How new methods are transforming speaker identification in audio recordings.

2025-05-25T18:57:25+00:00 ― 6 min read

Sound The Soul of Ghanaian Seperewa Music

A look into the traditional sounds of the seperewa harp-lute.

2025-05-25T06:50:24+00:00 ― 6 min read

Sound Target Speaker Extraction: Enhancing Clarity in Noisy Settings

Learn how TSE improves speech recognition in crowded environments using text cues.

2025-05-25T00:14:51+00:00 ― 6 min read

Sound Innovative Audio System Enhances Construction Site Safety

A new system detects screams to improve worker safety on construction sites.

2025-05-24T22:54:40+00:00 ― 7 min read

Sound Advancements in Speaker Emotion Recognition Technology

Exploring new methods for recognizing emotions in speech using advanced models.

2025-05-24T20:14:18+00:00 ― 7 min read

Sound The Concatenator: A New Way to Create Music

A fresh system for merging audio samples to help music creators innovate easily.

2025-05-24T05:32:17+00:00 ― 6 min read

Sound Dynamic Range Compression: Improving Sound Quality

A look at how dynamic range compression enhances audio experiences.

2025-05-24T04:12:06+00:00 ― 6 min read

Audio and Speech Processing Using Voice Assistants to Detect Mild Cognitive Impairment

Voice assistants help identify early signs of memory issues in older adults.

2025-05-24T01:31:44+00:00 ― 7 min read

Sound Dynamic Music Generation for Tabletop RPGs

A system creates real-time music based on tabletop role-playing game narratives.

2025-05-23T16:10:27+00:00 ― 7 min read

Computation and Language SLAM-ASR: A Look at Speech Recognition's Potential

Examining SLAM-ASR's strengths, weaknesses, and future in speech recognition.

2025-05-23T14:50:16+00:00 ― 5 min read

Signal Processing Clearing Up Sound: The SoundSil-DS Method

A new method to clarify and visualize sound-field images.

2025-05-23T13:48:54+00:00 ― 7 min read

Computation and Language Innovating Speech Recognition for Malasar Language

A project improves speech recognition for the Malasar language using Tamil resources.

2025-05-23T02:48:37+00:00 ― 5 min read

Sound Acoustic Volume Rendering: A Leap in Sound Realism

Discover how sound enhances virtual experiences through acoustic volume rendering.

2025-05-21T22:44:46+00:00 ― 7 min read