Computer Science - Multimedia

Image and Video Processing GAMIVAL: A New Tool for Gaming Video Quality

GAMIVAL evaluates streaming quality for mobile cloud gaming without reference videos.

2025-11-13T21:33:00+00:00 ― 4 min read

Latest Articles

Computer Vision and Pattern Recognition Improving Semantic Segmentation with Depth Data

A new method enhances segmentation accuracy by integrating depth information without source data.

2025-11-12T00:01:30+00:00 ― 6 min read

Computer Vision and Pattern Recognition New Framework Transforms Video Generation from Text

A new method improves video creation from text with added control and quality.

2025-11-11T16:15:24+00:00 ― 6 min read

Audio and Speech Processing Advancements in Speech-to-Singing Technology

Research presents a method to convert spoken words into singing efficiently.

2025-11-11T12:52:10+00:00 ― 7 min read

Computer Vision and Pattern Recognition Advancing Machine Learning with Integrated Multimodal Perception

A look at how Integrated Multimodal Perception enhances machine learning capabilities.

2025-11-10T19:51:55+00:00 ― 6 min read

Sound Advancements in Speech Synthesis with CoMoSpeech

CoMoSpeech improves speech synthesis speed and quality with a one-step process.

2025-11-10T05:17:25+00:00 ― 4 min read

Human-Computer Interaction Addressing Hate Raids in Live Streaming Communities

A look into hate raids and their impact on marginalized streamers.

2025-11-09T22:07:24+00:00 ― 5 min read

Computer Vision and Pattern Recognition Advancing Image Compression for Human Perception

A new method improves image compression by prioritizing human-friendly features.

2025-11-09T19:34:25+00:00 ― 5 min read

Computation and Language Understanding Memes Through Contextual Analysis

This study highlights the importance of context in interpreting memes.

2025-11-09T18:10:24+00:00 ― 5 min read

Sound Innovative Approaches to Music Rearrangement

A new method for creating unique music versions by rearranging existing pieces.

2025-11-09T15:31:30+00:00 ― 6 min read

Information Retrieval Introducing the SURE Dataset for Shopping Dialogues

A dataset designed to improve interactions between customers and salespeople in stores.

2025-11-09T10:24:18+00:00 ― 6 min read

Computer Vision and Pattern Recognition A New Approach to Visual Question Answering

Introducing a modular method for zero-shot visual question answering.

2025-11-08T19:07:54+00:00 ― 4 min read

Computation and Language Revising Task Steps Using Video Analysis

A new method to better organize task steps with video insights.

2025-11-08T18:04:42+00:00 ― 6 min read

Computer Vision and Pattern Recognition Advancements in Deblurring Quality Measurement

Improving metrics for assessing deblurring methods using a new dataset.

2025-11-08T16:14:06+00:00 ― 5 min read

Computer Vision and Pattern Recognition Improving Vision-Language Models with CLIP Feedback

A new method enhances vision-language models through real-time feedback for better performance.

2025-11-08T04:38:54+00:00 ― 6 min read

Computation and Language Advancing Fake News Detection Models

New models enhance the detection of fake news using diverse data techniques.

2025-11-08T01:13:30+00:00 ― 5 min read

Computer Vision and Pattern Recognition Advancements in Multi-Camera Systems for Autonomous Vehicles

Occ-BEV enhances vehicle perception through multi-camera 3D modeling and data integration.

2025-11-07T14:57:18+00:00 ― 6 min read

Cryptography and Security Analyzing the J-UNIWARD Method and Its Error

A look into J-UNIWARD's message hiding technique and its minor calculation error.

2025-11-06T17:05:54+00:00 ― 4 min read

Computer Vision and Pattern Recognition Addressing Bias in Visual Question Answering

A new approach tackles language and vision biases in VQA systems.

2025-11-06T14:27:54+00:00 ― 6 min read

Computer Vision and Pattern Recognition Improving Compression Quality of 3D Point Clouds

A method to enhance compressed 3D point cloud data using advanced neural networks.

2025-11-06T06:33:54+00:00 ― 6 min read

Machine Learning Advancing Multi-modal Learning with C-MCR

C-MCR simplifies multi-modal learning by connecting existing knowledge efficiently.

2025-11-05T03:49:55+00:00 ― 6 min read

Sound Simplifying Sound Synthesis with NAS-FM

A new method for creating synthesizers that benefits musicians.

2025-11-04T17:18:20+00:00 ― 6 min read

Computer Vision and Pattern Recognition Do-GOOD Benchmark: Enhancing Document Understanding Models

New benchmark reveals performance gaps in document processing models.

2025-11-04T02:17:36+00:00 ― 7 min read

Computer Vision and Pattern Recognition Advancements in Panoramic Semantic Segmentation

New model improves panoramic image analysis for real-world applications.

2025-11-04T00:19:06+00:00 ― 4 min read

Human-Computer Interaction LoopBoxes: A New Way to Make Music

LoopBoxes helps children create music easily and collaboratively.

2025-11-03T08:55:00+00:00 ― 5 min read

Computer Vision and Pattern Recognition Challenges in Text-Video Retrieval and Solutions

A look at biases in text-video retrieval and ways to enhance accuracy.

2025-11-03T00:45:00+00:00 ― 6 min read

Sound Advancements in Audio Classification Techniques

A novel method enhances audio classification by learning new sounds efficiently.

2025-10-31T22:37:00+00:00 ― 4 min read

Multimedia 360TripleView: Enhancing 360-Degree Video Experience

A new system improves viewing direction selection in 360-degree videos.

2025-10-31T20:44:30+00:00 ― 6 min read

Computer Vision and Pattern Recognition GeneCIS: Advancing Conditional Image Similarity in Computer Vision

A benchmark for assessing image similarity based on user-defined conditions.

2025-10-31T19:09:42+00:00 ― 6 min read

Sound Advancing Audio Question Answering with MWAFM Model

A new model improves how machines understand and respond to audio questions.

2025-10-31T18:34:05+00:00 ― 5 min read

Multimedia Balancing Active Learning in Multimodal Data

A new strategy ensures equal representation of data types in machine learning.

2025-10-31T02:02:42+00:00 ― 6 min read

Computer Vision and Pattern Recognition Advancements in Video Copy Detection Techniques

A new dataset challenges methods for detecting altered video content.

2025-10-30T18:16:36+00:00 ― 6 min read

Computer Vision and Pattern Recognition Advancing Remote Sensing with RS5M and DVLM

A new dataset and model improve remote sensing image analysis.

2025-10-29T03:49:48+00:00 ― 5 min read

Multimedia Optimizing Video Storage for Cataract Surgeries

Research shows effective ways to compress cataract surgery videos for better storage management.

2025-10-28T02:25:06+00:00 ― 5 min read

Sound Analyzing Music with BERT: A New Approach

Research explores BERT's potential in bar-level music analysis.

2025-10-27T07:41:05+00:00 ― 5 min read

Sound Advancing Melody Harmonization with Emotional Context

A new model improves melody harmonization by considering emotional factors.

2025-10-26T21:58:05+00:00 ― 6 min read

Multimedia Advancements in Video Compression Technology

A new method improves video compression while maintaining quality and efficiency.

2025-10-26T05:46:25+00:00 ― 5 min read

Computer Vision and Pattern Recognition Improving Food Instance Segmentation with Smart Labeling

A new framework reduces manual labeling costs in food image segmentation.

2025-10-25T23:35:42+00:00 ― 6 min read

Information Retrieval A New Framework for Multimodal Recommendations

This framework streamlines data processing for better recommendation systems.

2025-10-25T18:59:12+00:00 ― 6 min read

Multimedia Improving Video Encoding Efficiency with New Techniques

A new method speeds up video encoding while maintaining quality.

2025-10-25T14:30:36+00:00 ― 4 min read

Sound Creating Melodies from Simple Beats

This project helps anyone compose music using basic beats and advanced computer methods.

2025-10-25T11:57:35+00:00 ― 5 min read