Electrical Engineering and Systems Science - Audio and Speech Processing

Sound A New Approach to Emotion-Driven Piano Music Generation

This method enhances music generation by separating emotional aspects into valence and arousal.

2025-07-08T06:04:45+00:00 ― 5 min read

Sound Introducing PiCoGen: A New Way to Create Piano Covers

PiCoGen offers an innovative method for generating piano covers without paired data.

2025-07-08T04:27:35+00:00 ― 5 min read

Sound Addressing Abusive Speech in Audio

Research focuses on identifying abusive speech in audio recordings across languages.

2025-07-08T02:50:25+00:00 ― 5 min read

Computer Vision and Pattern Recognition Generating Synchronized Audio for Silent Videos

A method to create audio that matches first-person viewpoint videos.

2025-07-07T23:36:05+00:00 ― 7 min read

Sound Advancing Beat Tracking in Music Analysis

A new system improves beat tracking across various musical genres.

2025-07-07T15:30:15+00:00 ― 5 min read

Sound AI Music Generation: Listener Preferences in Progressive Metal

Study reveals listener views on AI-generated versus human music.

2025-07-07T13:53:05+00:00 ― 7 min read

Sound Advancing Detection of Lossy Audio Compression

A study on improving methods to detect lossy audio compression for better sound quality.

2025-07-07T12:15:55+00:00 ― 6 min read

Sound Evaluating Large Language Models in Music Creation

This study examines how well LLMs understand and generate music.

2025-07-07T10:38:45+00:00 ― 5 min read

Audio and Speech Processing Improving Speech Recognition with AI Collaboration

Collaborating AI models improve the accuracy of speech-to-text conversion.

2025-07-07T09:50:10+00:00 ― 5 min read

Audio and Speech Processing Balancing Privacy and Utility in Conversation Analysis

Examining techniques to protect privacy while analyzing recorded conversations.

2025-07-07T04:10:05+00:00 ― 5 min read

Sound MIDI Music Generation: Current Challenges and Future Directions

An overview of MIDI music creation and its expressive potential.

2025-07-07T00:55:45+00:00 ― 5 min read

Sound ChordSync: Aligning Music Chords with Audio

A new model that synchronizes chord annotations with music audio seamlessly.

2025-07-06T22:30:00+00:00 ― 5 min read

Audio and Speech Processing SynesLM: Advancing Audio-Visual Speech Technology

A new model integrates audio and visual data for speech recognition and translation.

2025-07-06T20:04:15+00:00 ― 6 min read

Sound A Clear Method for Estimating Music Difficulty

This study proposes a transparent way to assess music difficulty for educators.

2025-07-06T17:38:30+00:00 ― 6 min read

Computation and Language Bailing-TTS: Advancing Text-to-Speech for Chinese Dialects

A new model enhances speech synthesis for various Chinese dialects.

2025-07-06T14:24:10+00:00 ― 5 min read

Sound Advances in Automatic Piano Cover Generation

A new method improves piano cover creation, balancing quality and musical integrity.

2025-07-06T11:58:25+00:00 ― 4 min read

Sound New Method for Detecting Deepfakes Using Audio and Video

A framework that effectively identifies deepfake content through combined audio and visual analysis.

2025-07-06T08:44:05+00:00 ― 5 min read

Sound Assessing Music Understanding with MuChoMusic Benchmark

A new benchmark to evaluate models analyzing music and language.

2025-07-06T05:29:45+00:00 ― 6 min read

Multimedia Advancing Audio-Visual Generalized Zero-Shot Learning

A new framework improves classification in unseen audio-visual tasks.

2025-07-06T04:41:10+00:00 ― 6 min read

Sound Advancements in Music Generation with AI

A new model enhances music generation using compound tokens and sequential decoding.

2025-07-06T03:04:00+00:00 ― 5 min read

Sound Reviving Ancient Korean Court Music Through Technology

A project reintroducing forgotten Korean court music using modern techniques.

2025-07-06T01:26:50+00:00 ― 6 min read

Audio and Speech Processing Advancements in Emotional Speech Generation

New methods enhance emotional expression in machine speech synthesis.

2025-07-05T22:12:30+00:00 ― 6 min read

Sound Advancing Music Generation Through Melody and Rhythm Separation

A new method improves computer-generated music quality by separating melody and rhythm.

2025-07-05T18:58:10+00:00 ― 5 min read

Sound Connecting Emotions in Music and Sounds

This study examines how music and sounds evoke emotions together.

2025-07-05T13:18:05+00:00 ― 6 min read

Sound Revolutionizing Music Generation with AI

New methods in AI music generation offer improved structure and diversity.

2025-07-05T12:29:30+00:00 ― 5 min read

Audio and Speech Processing Advancing Speech Tech for Arabic Dialects

New framework enhances speech recognition for diverse Arabic dialects.

2025-07-05T10:52:20+00:00 ― 4 min read

Sound Generating Unique Drumbeats from Text Prompts

A system that creates unique drum rhythms based on written prompts for musicians.

2025-07-05T06:49:25+00:00 ― 4 min read

Sound Addressing Accent Recognition Challenges in Speech Technology

New methods improve speech recognition accuracy for diverse accents.

2025-07-05T05:12:15+00:00 ― 4 min read

Sound Assessing Stem Compatibility in Music Production

A new method for judging how well audio stems fit together in a piece of music.

2025-07-05T04:23:40+00:00 ― 5 min read

Sound Optimizing Speaker Diarization for Faster Results

Methods to speed up speaker diarization without sacrificing accuracy.

2025-07-05T00:20:45+00:00 ― 6 min read

Sound GRAFX: A New Tool for Audio Processing

GRAFX offers an open-source solution for efficient audio processing with PyTorch.

2025-07-04T17:52:05+00:00 ― 4 min read

Audio and Speech Processing Advancements in Acoustic Sensor Networks with iDANSE

iDANSE enhances sound processing in acoustic sensor networks for better real-time applications.

2025-07-04T08:09:05+00:00 ― 4 min read

Audio and Speech Processing Advancements in Binaural Signal Matching Techniques

Improving binaural sound reproduction for better audio experiences in various devices.

2025-07-04T07:20:30+00:00 ― 7 min read

Computation and Language New Framework Transforms Speech Into Knowledge Graphs

Wav2graph creates knowledge graphs from spoken language for improved AI understanding.

2025-07-04T04:06:10+00:00 ― 7 min read

Computation and Language Introducing Speech-MASSIVE: A New Dataset for Multilingual Spoken Language Understanding

Speech-MASSIVE aims to enhance spoken language understanding in various languages.

2025-07-04T01:40:25+00:00 ― 6 min read

Audio and Speech Processing Ensuring Speech Data Privacy with New Methods

Innovative techniques protect sensitive speech data while maintaining processing accuracy.

2025-07-04T00:51:50+00:00 ― 7 min read

Audio and Speech Processing Advancements in Cinematic Audio Source Separation

Research on new models improves audio quality in film and television.

2025-07-03T17:34:35+00:00 ― 5 min read

Audio and Speech Processing Advancements in Voice Anonymisation Techniques

New methods improve privacy while preserving speech content and emotions.

2025-07-03T15:57:25+00:00 ― 6 min read

Applications Tracking Infant Vocalizations: Insights into Language Development

Analyzing a child's sounds reveals crucial stages of language growth.

2025-07-03T15:13:32+00:00 ― 5 min read

Sound Improving RNNs for Audio Effects Modeling

New methods for better control of RNNs enhance audio effect simulations.

2025-07-03T15:08:50+00:00 ― 8 min read