A new system combines transcription and translation for better communication.

2025-10-12T11:24:25+00:00 ― 4 min read

Sound Advancements in Speech Recognition with Whisper-AT

Whisper-AT combines speech recognition and audio tagging for improved performance.

2025-10-12T08:10:05+00:00 ― 5 min read

Audio and Speech Processing Integrating Speech with Language Models: The Speech-LLaMA Method

A new approach that combines speech with language models for improved translation.

2025-10-11T18:24:10+00:00 ― 4 min read

Sound Advancements in Automatic Piano Transcription

New method improves accuracy in turning piano audio into sheet music.

2025-10-11T14:21:15+00:00 ― 4 min read

Sound Advancements in Articulatory Speech Synthesis

A study on improving vocal sound reproduction through advanced synthesis techniques.

2025-10-11T02:12:30+00:00 ― 5 min read

Sound Introducing VampNet: A New Approach to Music Creation

VampNet transforms music processing through innovative token modeling techniques.

2025-10-11T01:23:55+00:00 ― 4 min read

Sound EchoVest: A New Hope for Hearing Impairment

Affordable wearable technology for individuals with hearing loss.

2025-10-10T23:46:45+00:00 ― 5 min read

Sound Advancing Lyrics Alignment in Music Services

A new model improves timing accuracy for lyrics in music applications.

2025-10-10T18:55:15+00:00 ― 6 min read

Human-Computer Interaction Introducing SnakeSynth: A New Way to Create Sound

A web-based synthesizer that allows users to create music using simple gestures.

2025-10-10T16:29:30+00:00 ― 4 min read

Sound AI and Creativity in Progressive Metal Music

A study on AI's role in generating progressive metal music.

2025-10-10T13:15:10+00:00 ― 6 min read

Sound ShredGP: A New Way to Generate Guitar Music

A model that creates guitar tablature reflecting famous guitarists' styles.

2025-10-10T12:26:35+00:00 ― 5 min read

Sound Advancements in Self-Supervised Learning for Music Analysis

Exploring the potential of self-supervised learning in music information retrieval.

2025-10-10T10:00:50+00:00 ― 6 min read

Sound Audio Analysis in COVID-19 Detection

Using audio signals to identify respiratory health risks.

2025-10-10T09:12:15+00:00 ― 7 min read

Computation and Language SummaryMixing: A New Approach to Speech Recognition

A new method improves speech recognition speed and accuracy while reducing resource use.

2025-10-10T07:35:05+00:00 ― 5 min read

Audio and Speech Processing Advancements in Bioacoustics Through Feature Embeddings

This study enhances wildlife monitoring using audio feature embeddings for better sound classification.

2025-10-10T02:43:35+00:00 ― 8 min read

Audio and Speech Processing Advancements in Voice Conversion with Urhythmic Technology

Urhythmic enhances voice conversion by focusing on speech rhythm.

2025-10-09T21:52:05+00:00 ― 5 min read

Sound Advancements in Real-Time Music Information Retrieval for Guitarists

Research supports percussive fingerstyle guitar techniques using real-time music information retrieval.

2025-10-09T15:23:25+00:00 ― 7 min read

Computation and Language Advancements in Speech Intent Classification and Slot Filling

This article explores a new model for speech intent and slot identification.

2025-10-09T12:09:05+00:00 ― 6 min read

Sound Detecting the Truth in Synthetic Voices

As voice cloning technology advances, reliable detection methods are crucial.

2025-10-09T06:29:00+00:00 ― 6 min read

Computation and Language Improving Speech Recognition for Older Adults

A study improves ASR for older speakers using innovative techniques.

2025-10-09T01:37:30+00:00 ― 6 min read

Computation and Language Advancements in Speech Summarization with BASS

BASS improves summarization of long audio by processing in blocks.

2025-10-08T15:05:55+00:00 ― 5 min read

Sound Risks of Stealthy Backdoor Attacks in Speech Recognition Systems

New methods pose serious security risks for speech recognition technology.

2025-10-08T14:17:20+00:00 ― 7 min read

Audio and Speech Processing New Dataset Aims to Improve Hebrew Speech Recognition

ivrit.ai provides vital resources for enhancing Hebrew ASR technology.

2025-10-08T05:22:55+00:00 ― 6 min read

Computation and Language Advancements in Multilingual Speech Translation Technology

Innovative techniques are transforming how we translate spoken language.

2025-10-08T02:57:10+00:00 ― 6 min read

Audio and Speech Processing Advancements in Speaker Anonymisation Techniques

New methods aim to hide speaker identities while maintaining speech clarity.

2025-10-08T01:20:00+00:00 ― 5 min read

Sound Advancing Speech Recognition with Time-Sparse Transducer

New model improves speech recognition speed and memory usage.

2025-10-07T23:42:50+00:00 ― 6 min read

Sound Introducing the JAZZVAR Dataset for Jazz Piano Variations

A new dataset highlights the creative interpretations of jazz pianists on classic standards.

2025-10-07T14:48:25+00:00 ― 5 min read

Audio and Speech Processing Advancements in HRTF Modeling for Realistic Sound

New methods improve sound representation in virtual and augmented reality.

2025-10-07T10:45:30+00:00 ― 7 min read

Sound FlexiAST: A Flexible Approach to Audio Processing

FlexiAST allows models to adapt to various audio patch sizes efficiently.

2025-10-07T09:56:55+00:00 ― 6 min read

Machine Learning Advances in Speech Analysis for Throat Cancer Detection

Researchers are using machine learning to improve throat cancer diagnosis through speech analysis.

2025-10-07T06:42:35+00:00 ― 6 min read

Sound Introducing Polyffusion: A New Way to Create Music Scores

Polyffusion uses visual techniques to generate and control music effectively.

2025-10-07T01:51:05+00:00 ― 6 min read

Audio and Speech Processing Advancements in Detecting Alzheimer's Through Speech Analysis

Researchers are using speech patterns to detect Alzheimer's earlier and more effectively.

2025-10-07T00:13:55+00:00 ― 6 min read

Sound New Framework Improves Speech Recognition with Metadata

Integrating metadata enhances performance in speech tasks like language identification.

2025-10-06T12:05:10+00:00 ― 6 min read

Audio and Speech Processing Advancements in Transducer Models for Speech Recognition

This article discusses the Transducer model's real-time capabilities and recent improvements.

2025-10-06T11:16:35+00:00 ― 6 min read

Audio and Speech Processing Bias in Transfer Learning for Music Recognition

This study explores bias in audio models used for instrument recognition.

2025-10-06T09:39:25+00:00 ― 6 min read

Sound Advancements in Music Genre Classification Using Deep Learning

This study explores a deep learning approach to accurately classify music genres.

2025-10-06T08:50:50+00:00 ― 7 min read

Sound Automated Sound Source Localization in Shallow Waters

New method improves sound source location tracking in shallow aquatic environments.

2025-10-05T13:27:48+00:00 ― 7 min read

Sound Advancing Speech Technology with SCRAPS

A new model connects phonetics and acoustics for better speech technology.

2025-10-05T13:24:50+00:00 ― 7 min read

Sound Advancements in Emotion Recognition with Self-Supervised Learning

This study highlights the role of self-supervised learning in detecting emotions from audio data.

2025-10-05T08:33:20+00:00 ― 6 min read

Audio and Speech Processing Making Music Easy for Everyone

A new interface simplifies music creation for beginners using text-to-audio technology.

2025-10-04T18:47:25+00:00 ― 5 min read
