Computer Science - Multimedia

Computer Vision and Pattern Recognition Advancing Image Compression with Frequency Analysis

A new method improves image compression by focusing on frequency bands.

2025-09-16T14:45:18+00:00 ― 6 min read

Computer Vision and Pattern Recognition Vlogger: A New Tool for Video Creation

Vlogger simplifies video blogging, making it quicker and easier for creators.

2025-09-16T11:12:00+00:00 ― 6 min read

Multimedia The Environmental Costs of Video Streaming

Examining the energy use and environmental impact of video streaming.

2025-09-16T01:59:00+00:00 ― 6 min read

Sound New Model Enhances Fish Feeding Intensity Assessment

A unified approach to assess fish feeding using audio and video data.

2025-09-14T21:03:15+00:00 ― 5 min read

Computer Vision and Pattern Recognition The Impact of AI on Video Technology

Discover how AI is changing video creation and streaming.

2025-09-13T11:18:36+00:00 ― 5 min read

Image and Video Processing Introducing the Video Conferencing Dataset for Real-World Communication

A dataset tailored for testing video quality in conferencing situations.

2025-09-13T03:45:30+00:00 ― 5 min read

Computer Vision and Pattern Recognition New Framework Connects Video and Text More Effectively

Researchers develop a framework for better video and text understanding.

2025-09-12T20:49:36+00:00 ― 5 min read

Computer Vision and Pattern Recognition Advancements in Audio-Visual Segmentation Techniques

A new method enhances audio-visual segmentation without detailed labels.

2025-09-12T20:28:15+00:00 ― 5 min read

Sound New System Improves Voice Extraction from Unstable Head Positions

PIAVE helps machines extract voices clearly, even when speakers turn their heads.

2025-09-12T19:39:40+00:00 ― 6 min read

Audio and Speech Processing MusiLingo: Bridging Music and Language

A new system that connects music and language for better understanding.

2025-09-11T14:30:40+00:00 ― 6 min read

Multimedia Effective Poster Design Through Simple Metrics

Learn how to design posters that communicate messages clearly and attractively.

2025-09-09T08:49:24+00:00 ― 5 min read

Multimedia BDIQA: Advancing Video Question Answering with Theory of Mind

A new dataset enhances AI's ability to interpret human behavior in videos.

2025-09-09T07:30:24+00:00 ― 7 min read

Human-Computer Interaction Spica: A New Tool for Blind Users

Spica enhances video access for blind and low-vision users through interactivity.

2025-09-09T06:43:00+00:00 ― 4 min read

Robotics Testing Robots for Unexpected Challenges

Exploring methods to improve robot performance in unpredictable environments.

2025-09-09T02:53:54+00:00 ― 4 min read

Sound Advancements in Voice Conversion Technology Using Face Images

New method transforms voices using facial features for diverse applications.

2025-09-09T01:46:55+00:00 ― 8 min read

Audio and Speech Processing Introducing AV-SUPERB: A New Benchmark for Audio-Visual Models

AV-SUPERB evaluates audio and visual models across a range of tasks.

2025-09-08T22:32:35+00:00 ― 5 min read

Information Retrieval Improving Video Search with Modern Techniques

A new method simplifies video searching by combining various information types.

2025-09-08T20:50:30+00:00 ― 6 min read

Multimedia Creating Emotion-Sensitive Machines for Better Interaction

Developing machines that respond based on emotions for improved human-computer interaction.

2025-09-08T19:31:30+00:00 ― 6 min read

Sound Faster Text-to-Audio Generation Using Consistency Distillation

New method improves speed and efficiency in Text-to-Audio generation.

2025-09-08T18:29:40+00:00 ― 4 min read

Computer Vision and Pattern Recognition Advancing Sound Source Localization Techniques

Improving the way we identify sound sources using audio-visual data.

2025-09-08T12:49:35+00:00 ― 6 min read

Computer Vision and Pattern Recognition Mapping Sounds: A New Approach to Soundscape Analysis

A method to visualize and predict sounds in various environments using advanced technology.

2025-09-08T11:12:25+00:00 ― 5 min read

Multimedia Green-LL: Improving Live Video Streaming Experience

A new approach to enhance mobile live video streaming quality and energy efficiency.

2025-09-08T06:13:36+00:00 ― 8 min read

Information Retrieval Personalized Food Recommendations with ChatDiet

ChatDiet combines personal data and population knowledge for better food advice.

2025-09-07T00:28:12+00:00 ― 8 min read

Multimedia Television Debates: A Closer Look at Bias and Civility

An analysis of bias and incivility in Indian television debates.

2025-09-06T18:16:54+00:00 ― 6 min read

Image and Video Processing Advancements in Video Compression Techniques

New framework improves video compression efficiency and quality.

2025-09-06T14:40:20+00:00 ― 5 min read

Human-Computer Interaction The Role of Visual Media in Propaganda

This article examines how images impacted public opinion during the Russia-Ukraine conflict.

2025-09-05T06:04:24+00:00 ― 4 min read

Image and Video Processing Improving Wireless Image Transmission in Noisy Environments

A new method enhances image quality during wireless transmission over noisy channels.

2025-09-05T03:18:04+00:00 ― 5 min read

Computers and Society MemeCraft: A New Tool for Social Advocacy

MemeCraft creates engaging memes to promote social causes safely.

2025-09-04T14:48:00+00:00 ― 10 min read

Computer Vision and Pattern Recognition Improving Audio-Visual Learning with Speed Co-Augmentation

A new method enhances machine learning of audio-visual data.

2025-09-04T05:59:30+00:00 ― 5 min read

Computation and Language RVS Task: A New Look at Giving Directions

Research reveals broader ways to deliver directions using spatial knowledge.

2025-09-03T23:39:30+00:00 ― 7 min read

Signal Processing A New Approach to Identifying Schizophrenia Symptoms

Combining audio, video, and text for better mental health assessments.

2025-09-03T22:42:15+00:00 ― 5 min read

Computer Vision and Pattern Recognition Advancements in Talking Face Generation Technology

New framework improves lip synchronization and visual quality in talking face videos.

2025-09-03T04:02:24+00:00 ― 5 min read

Computer Vision and Pattern Recognition Advancing Defect Detection with Synthetic Samples

A new method generates synthetic defective samples to improve anomaly detection in manufacturing.

2025-09-02T21:51:06+00:00 ― 6 min read

Sound Combining Voice and Face for Better Identification

New method improves speaker verification by merging audio and visual data.

2025-09-02T07:50:15+00:00 ― 5 min read

Multimedia Advancements in Audio Visual Speaker Localization

A new method enhances speaker tracking using audio and visual data.

2025-09-02T06:13:05+00:00 ― 6 min read

Sound A New Model for Music Generation with AI

MusicAOG simplifies music creation and understanding through innovative graph representation.

2025-08-31T08:52:25+00:00 ― 6 min read

Human-Computer Interaction The Importance of Non-Typical Emotions

Analyzing stress and depression can enhance our understanding of mental health.

2025-08-31T02:02:36+00:00 ― 6 min read

Computer Vision and Pattern Recognition Detecting Humor in Videos with FunnyNet-W

A new model identifies funny moments in videos using visual, audio, and text data.

2025-08-30T23:09:25+00:00 ― 6 min read

Computer Vision and Pattern Recognition AesopAgent: Transforming Stories into Videos

AesopAgent enables users to create videos from stories using advanced AI tools.

2025-08-30T18:32:18+00:00 ― 5 min read

Human-Computer Interaction The Role of Images in Wikipedia Learning

Examining how images impact learning in Wikipedia articles.

2025-08-30T02:28:30+00:00 ― 5 min read