This study uses Visual Question Answering for assessing charts created by AI models.
James Ford, Xingmeng Zhao, Dan Schumacher
― 7 min read
Cutting-edge science explained simply
Using an ontology can boost MLLMs' ability to identify plant diseases accurately.
Jihen Amara, Birgitta König-Ries, Sheeba Samuel
― 6 min read
Introducing a method for AI to generate images without large labeled datasets.
Zhiqiang Chen, Guofan Fan, Jinying Gao
― 7 min read
GeCo improves object counting with fewer examples, enhancing accuracy and reliability.
Jer Pelhan, Alan Lukežič, Vitjan Zavrtanik
― 5 min read
A new method enhances image privacy classification with clear, user-friendly explanations.
Alina Elena Baia, Andrea Cavallaro
― 7 min read
New method enhances CT images for better cancer treatment planning.
Belén Serrano-Antón, Mubashara Rehman, Niki Martinel
― 6 min read
Enhancements in LiDAR perception improve performance in multi-sensor environments.
Marc Uecker, J. Marius Zöllner
― 6 min read
A comprehensive dataset aims to improve flood prediction and response efforts globally.
Brandon Victor, Mathilde Letard, Peter Naylor
― 6 min read
A method for clearer satellite images directly from unprocessed data.
Michael Sprintson, Rama Chellappa, Cheng Peng
― 5 min read
CION advances person re-identification by focusing on identity correlations across videos.
Jialong Zuo, Ying Nie, Hanyu Zhou
― 6 min read
A framework merging different knowledge types to improve model performance.
Yaomin Huang, Zaomin Yan, Chaomin Shen
― 5 min read
A new method improves gaze target detection with less labeled data.
Francesco Tonini, Nicola Dall'Asen, Lorenzo Vaquero
― 6 min read
A new approach enhances deep learning model performance amidst noise.
Seyedarmin Azizi, Mohammad Erfan Sadeghi, Mehdi Kamal
― 5 min read
A new framework improves pixel labeling by addressing uncertainty in semantic segmentation.
Xiaoke Hao, Shiyu Liu, Chuanbo Feng
― 6 min read
This study assesses the effectiveness of pre-trained models in Earth Observation applications.
Jose Sosa, Mohamed Aloulou, Danila Rukhovich
― 6 min read
Temporal2Seq framework streamlines multiple video understanding tasks into one model.
Min Yang, Zichen Zhang, Limin Wang
― 8 min read
TAKFL optimizes knowledge sharing in federated learning for diverse device capabilities.
Mahdi Morafah, Vyacheslav Kungurtsev, Hojin Chang
― 6 min read
A method that aligns 3D shapes with 2D images without matched points.
Jingwei Song, Maani Ghaffari
― 6 min read
Explore the essential concepts of molecular physics and their practical applications.
Jun Liu, Geng Yuan, Weihao Zeng
― 4 min read
This new method streamlines image generation in AI models, improving efficiency and speed.
Seongmin Hong, Suh Yoon Jeon, Kyeonghyun Lee
― 6 min read
A new framework enhances video-language dataset quality through iterative refinement.
Xiao Wang, Jianlong Wu, Zijia Lin
― 5 min read
Combining street view images with data to analyze building exteriors.
Zongrong Li, Yunlei Su, Chenyuan Zhu
― 6 min read
A model to assess segmentation quality without ground truth benchmarks.
Ahjol Senbi, Tianyu Huang, Fei Lyu
― 8 min read
MedCLIP-SAMv2 improves tumor detection using advanced segmentation techniques and minimal labeled data.
Taha Koleilat, Hojat Asgariandehkordi, Hassan Rivaz
― 5 min read
A look into how CNNs learn image features and their universal similarities.
Florentin Guth, Brice Ménard
― 7 min read
Researchers use CRISP to improve biodiversity tracking through better image analysis.
Andy V. Huynh, Lauren E. Gillespie, Jael Lopez-Saucedo
― 6 min read
A new index helps assess diversity in AI-generated medical images.
Mohammed Talha Alam, Raza Imam, Mohammad Areeb Qazi
― 9 min read
New methods speed up video encoding and decoding.
Hao Chen, Saining Xie, Ser-Nam Lim
― 5 min read
A new framework enhances the connection between images and text.
Mayug Maniparambil, Raiymbek Akshulakov, Yasser Abdelaziz Dahou Djilali
― 7 min read
Learn how machine learning models can improve when facing new and unseen data.
Zongbo Han, Jialong Yang, Junfan Li
― 7 min read
A look at the role and methods of diffusion models in image creation.
Zheyuan Zhan, Defang Chen, Jian-Ping Mei
― 7 min read
Exploring methods to improve multimodal models in breaking down visual questions.
Haowei Zhang, Jianzhe Liu, Zhen Han
― 6 min read
A new model generates reports from 3D CT scans efficiently and accurately.
Hao Chen, Wei Zhao, Yingli Li
― 8 min read
A new pipeline for generating 3D models from 2D images efficiently.
Potito Aghilar, Vito Walter Anelli, Michelantonio Trizio
― 5 min read
TrojVLM exposes vulnerabilities in Vision Language Models to backdoor attacks.
Weimin Lyu, Lu Pang, Tengfei Ma
― 7 min read
This study reveals effective methods for recognizing hand gestures through ultrasound imaging.
Keshav Bimbraw, Ankit Talele, Haichong K. Zhang
― 5 min read
A new framework improves data generation across multiple sources using energy-based models.
Shiyu Yuan, Jiali Cui, Hanao Li
― 5 min read
SATA improves the robustness and efficiency of Vision Transformers for image classification tasks.
Nick Nikzad, Yi Liao, Yongsheng Gao
― 4 min read
A new method improves object recognition using masks without detailed labels.
Heeseong Shin, Chaehyun Kim, Sunghwan Hong
― 5 min read
A new method simplifies the removal of unwanted content in visual datasets.
Saehyung Lee, Jisoo Mok, Sangha Park
― 6 min read