Learn about advancements in generating long videos that captivate audiences.
Xin Yan, Yuxuan Cai, Qiuyue Wang
― 6 min read
New Science Research Articles Every Day
Latest Articles
Po-Hsuan Huang, Jeng-Lin Li, Chin-Po Chen
― 7 min read
Ze Zhang, Enyuan Zhao, Ziyi Wan
― 7 min read
Vera Prohaska, Eduardo Castelló Ferrer
― 7 min read
Taekyung Ki, Dongchan Min, Gyeongsu Chae
― 7 min read
Muhammad Umar Farooq, Awais Khan, Ijaz Ul Haq
― 7 min read
Explore how new technology blends text, images, and sounds for creative content.
Shufan Li, Konstantinos Kallidromitis, Akash Gokul
― 6 min read
SyncFlow merges audio and video generation for seamless content creation.
Haohe Liu, Gael Le Lan, Xinhao Mei
― 4 min read
SizeGS offers a smarter way to compress 3D content without losing quality.
Shuzhao Xie, Jiahang Liu, Weixiang Zhang
― 6 min read
AI learns to create art through self-feedback for better image alignment.
Leigang Qu, Haochuan Li, Wenjie Wang
― 8 min read
Using machine learning to enhance judo match analysis and coaching.
Anthony Miyaguchi, Jed Moutahir, Tanmay Sutar
― 8 min read
AI systems are learning to navigate using language and spatial awareness.
Xuesong Zhang, Yunbo Xu, Jia Li
― 7 min read
New method enhances 3D modeling from videos for gaming and VR.
Jinbo Yan, Rui Peng, Luyang Tang
― 5 min read
Find the perfect music tailored to your unique taste with Diff4Steer.
Xuchan Bao, Judith Yue Li, Zhong Yi Wan
― 6 min read
Discover how semantic multi-item compression changes image sharing and storage.
Tom Bachard, Thomas Maugey
― 6 min read
RoboMM and RoboData transform how robots learn and operate in real environments.
Feng Yan, Fanfan Liu, Liming Zheng
― 7 min read
Discover how AI agents send hidden messages through playful actions.
Ching-Chun Chang, Isao Echizen
― 8 min read
Learn how AI is turning music into captivating visual experiences.
Leonardo Pina, Yongmin Li
― 7 min read
Learn how combining text and images enhances sentiment analysis.
Nguyen Van Doan, Dat Tran Nguyen, Cam-Van Thi Nguyen
― 6 min read
Discover how POINTS1.5 enhances image and text processing capabilities.
Yuan Liu, Le Tian, Xiao Zhou
― 6 min read
WavFusion combines audio, text, and visuals for better emotion recognition.
Feng Li, Jiusong Luo, Wanjun Xia
― 6 min read
TextRefiner boosts Vision-Language Models' performance, making them faster and more accurate.
Jingjing Xie, Yuxin Zhang, Jun Peng
― 7 min read
Explore the rise of machine-generated music and the quest for detection methods.
Yupei Li, Hanqian Li, Lucia Specia
― 6 min read
A new system revolutionizes how music pairs with video content.
Shanti Stewart, Gouthaman KV, Lie Lu
― 6 min read
Learn about innovative video watermarking techniques for content protection.
Pierre Fernandez, Hady Elsahar, I. Zeki Yalniz
― 5 min read
A new model blends music and AI, creating innovative tunes.
Shansong Liu, Atin Sakkeer Hussain, Qilong Wu
― 7 min read
OV-VSS revolutionizes how machines understand video content, identifying new objects seamlessly.
Xinhao Li, Yun Liu, Guolei Sun
― 8 min read
AI TrackMate offers producers objective feedback to improve their music skills.
Yi-Lin Jiang, Chia-Ho Hsiung, Yen-Tung Yeh
― 6 min read
Discover how MMCSAL improves learning efficiency with multimodal data.
Meng Shen, Yake Wei, Jianxiong Yin
― 6 min read
Learn about Frechet Music Distance and its role in evaluating AI-generated music.
Jan Retkowski, Jakub Stępniak, Mateusz Modrzejewski
― 8 min read
Discover how AI can transform sound design in videos and games.
Sudha Krishnamurthy
― 5 min read
A new approach enhances audio-visual question answering accuracy and efficiency.
Zhangbin Li, Jinxing Zhou, Jing Zhang
― 6 min read
A new framework enhances the alignment of sounds and visuals in videos.
Kexin Li, Zongxin Yang, Yi Yang
― 6 min read
Revolutionizing text-to-speech with improved efficiency and natural-sounding voices.
Haowei Lou, Helen Paik, Pari Delir Haghighi
― 6 min read
Combining video and audio for better emotion detection.
Antonio Fernandez, Suzan Awinat
― 9 min read
New techniques improve how machines recognize and interpret video scenes.
Phúc H. Le Khac, Graham Healy, Alan F. Smeaton
― 7 min read
YingSound transforms video production by automating sound effects generation.
Zihao Chen, Haomin Zhang, Xinhan Di
― 6 min read
Researchers use echoes to watermark audio, ensuring creators' rights are protected.
Christopher J. Tralie, Matt Amery, Benjamin Douglas
― 8 min read
This study assesses how well language models recognize music entities in text.
Simon Hachmeier, Robert Jäschke
― 7 min read
Discover how cover songs are identified on YouTube using new methods.
Simon Hachmeier, Robert Jäschke
― 6 min read
Learn how flight patterns keep drones safe and organized.
Shuqin Zhu, Shahram Ghandeharizadeh
― 5 min read
Discover how drones create interactive 3D displays for entertainment and healthcare.
Nima Yazdani, Hamed Alimohammadzadeh, Shahram Ghandeharizadeh
― 5 min read
A new method helps summarize video content easily.
Shiping Ge, Qiang Chen, Zhiwei Jiang
― 6 min read
A new model speeds up video search while improving accuracy.
Jinpeng Wang, Niu Lian, Jun Li
― 6 min read
DAAN improves how machines learn from audio-visual data in zero-shot scenarios.
RunLin Yu, Yipu Gong, Wenrui Li
― 5 min read
Transform your filmmaking with enhanced camera control and artistic effects.
Xi Wang, Robin Courant, Marc Christie
― 6 min read