A new framework converts MEG signals into meaningful text, aiding communication technology.
― 9 min read
Cutting-edge science explained simply
A new approach to audio captioning reduces reliance on paired data.
― 5 min read
This study examines audio methods for tracking pedestrian movement in urban areas.
― 7 min read
A new system helps separate speech from noise for clearer communication.
― 6 min read
A new system helps robots learn tasks using audio from real-life demonstrations.
― 7 min read
A study on using text and audio data to improve emotion recognition.
― 6 min read
New dataset improves audio generation from detailed text descriptions.
― 4 min read
Introducing MERGE datasets to improve emotion classification in music.
― 6 min read
A look at deepfake creation and detection methods.
― 6 min read
Examining how feedback during collisions shapes user experience in crowded VR spaces.
― 6 min read
A novel approach improves deepfake detection using audio-visual analysis.
― 5 min read
A new method enhances sound creation for realistic 3D human models.
― 7 min read
A new method combines text, emotions, and audio for better mental health detection.
― 7 min read
A project offering emotional support through audio responses for those in need.
― 5 min read
A new text-to-audio model using only public data.
― 5 min read
OmniBind integrates various data types for improved content understanding and generation.
― 5 min read
Examining how codecs retain emotional tones in voice data.
― 5 min read
A study on improving methods to detect lossy audio compression for better sound quality.
― 6 min read
A new model that synchronizes chord annotations with music audio seamlessly.
― 5 min read
A framework that effectively identifies deepfake content through combined audio and visual analysis.
― 5 min read
A new approach merges audio, video, and text data for effective depression diagnosis.
― 8 min read
VAT-CMR allows robots to retrieve items using visual, audio, and tactile data.
― 6 min read
UniTalker merges datasets for better facial animation accuracy.
― 6 min read
Style-Talker improves conversations between humans and machines through emotional depth.
― 8 min read
A new approach focuses on subtle inconsistencies in deepfake detection.
― 6 min read
A new method combines EEG, audio, and facial expressions to assess mental health.
― 6 min read
A look into the complexities of identifying mixed audio tracks.
― 6 min read
A new model separates timbre and structure for better audio creation.
― 7 min read
RoboMNIST aids robots in recognizing various activities using WiFi, video, and audio.
― 6 min read
X-Codec improves audio generation by integrating semantic understanding into processing.
― 6 min read
New methods improve voice separation in noisy environments.
― 5 min read
A novel system generates speech from text using minimal data.
― 4 min read
New watermarking methods protect creators in audio generative models.
― 4 min read
A new framework enhances motion generation for animations and virtual experiences.
― 6 min read
A new model streamlines audio production by automatically eliminating breath sounds.
― 6 min read
A novel method improves audio transformation while preserving melody and sound quality.
― 6 min read
This study evaluates neural networks for replicating spring reverb characteristics.
― 7 min read
ParaEVITS improves emotional expression in TTS through natural language guidance.
― 5 min read
New methods improve access to spoken news by segmenting topics more effectively.
― 6 min read
SoloAudio improves sound extraction using advanced techniques and synthetic data.
― 5 min read