New framework enhances face recognition security against spoof attacks.
Xinxu Ge, Xin Liu, Zitong Yu
― 6 min read
DICS model enhances image classification by focusing on key features.
Qiaowei Miao, Yawei Luo, Yi Yang
― 5 min read
GLCONet improves the detection of camouflaged objects using local and global features.
Yanguang Sun, Hanyu Xuan, Jian Yang
― 6 min read
A new method improves feature transfer in implicit neural representations for images.
Kushal Vyas, Ahmed Imtiaz Humayun, Aniket Dashpute
― 6 min read
A new method enhances image clarity and recognition in noisy environments.
Thomas C Markhorst, Jan C van Gemert, Osman S Kayhan
― 7 min read
Learn how AMRF enhances image segmentation in industrial applications.
Zheming Zuo, Joseph Smith, Jonathan Stonehouse
― 5 min read
This method enhances interpretability in semantic segmentation using prototypes and multi-scale representation.
Hugo Porta, Emanuele Dalsasso, Diego Marcos
― 5 min read
MAC-VO enhances camera position estimation in challenging environments.
Yuheng Qiu, Yutian Chen, Zihao Zhang
― 5 min read
A study compares pre-trained CNNs and foundation models for medical image retrieval.
Amirreza Mahbod, Nematollah Saeidi, Sepideh Hatamikia
― 6 min read
FKAN improves image and 3D shape representation using learnable activation functions.
Ali Mehrabian, Parsa Mojarad Adi, Moein Heidari
― 5 min read
A new method enhances AI's grasp of human actions through specialized data.
Dewen Zhang, Wangpeng An, Hayaru Shouno
― 7 min read
This method estimates orientations without labeled data using deep learning.
Shiqi Li, Jihua Zhu, Yifan Xie
― 5 min read
This paper evaluates VLMs' ability to reason about sizes and distances.
Yuan-Hong Liao, Rafid Mahmood, Sanja Fidler
― 6 min read
Overview of techniques for detecting and classifying human actions.
Jungpil Shin, Najmul Hassan, Abu Saleh Musa Miah
― 4 min read
SparX enhances image processing by mimicking the human visual system.
Meng Lou, Yunxiang Fu, Yizhou Yu
― 6 min read
Research showcases LLMs' potential for recognizing objects in event-based visuals.
Zongyou Yu, Qiang Qu, Xiaoming Chen
― 6 min read
Integrating motion information enhances object detection accuracy in images.
Cagri Gungor, Adriana Kovashka
― 5 min read
ScaleFlow++ improves 3D motion estimation using monocular cameras for various applications.
Han Ling, Yinghui Sun, Quansen Sun
― 5 min read
NSSR-DIL transforms low-quality images efficiently without large datasets.
Sree Rama Vamsidhar S, Rama Krishna Gorthi
― 4 min read
A machine learning approach harnessing motion for effective visual data learning.
Simone Marullo, Matteo Tiezzi, Marco Gori
― 7 min read
This framework allows quick learning of new object categories with minimal data.
Yanan Jian, Fuxun Yu, Qi Zhang
― 5 min read
A new system improves the speed and accuracy of video labeling.
Alexandru Bobe, Jan C. van Gemert
― 6 min read
KAT improves deep learning by using advanced KANs to replace MLPs.
Xingyi Yang, Xinchao Wang
― 5 min read
A new framework improves understanding of human actions through skeleton data.
Lehong Wu, Lilang Lin, Jiahang Zhang
― 6 min read
A new method improves robots' grasping ability using natural language commands.
Vineet Bhat, Prashanth Krishnamurthy, Ramesh Karri
― 6 min read
FOLK enhances self-supervised learning through adaptive frequency masking and a teacher-student design.
Amin Karimi Monsefi, Mengxi Zhou, Nastaran Karimi Monsefi
― 5 min read
Adapting DINOv2 boosts BEV segmentation for safer self-driving cars.
Merve Rabia Barın, Görkay Aydemir, Fatma Güney
― 5 min read
A new dataset brings together RGB and event camera data for better facial analysis.
Federico Becattini, Luca Cultrera, Lorenzo Berlincioni
― 8 min read
SteeredMarigold improves depth maps, aiding robots in navigation and interaction.
Jakub Gregorek, Lazaros Nalpantidis
― 5 min read
Introducing GRIN, a new model for depth estimation using sparse data.
Vitor Guizilini, Pavel Tokmakov, Achal Dave
― 7 min read
NVLM enhances AI's grasp of language and visuals for diverse tasks.
Wenliang Dai, Nayeon Lee, Boxin Wang
― 5 min read
This work enhances CLIP's accuracy by addressing intra-modal overlap using lightweight adapters.
Alexey Kravets, Vinay Namboodiri
― 5 min read
A new framework improves segmentation with limited examples.
Amirreza Fateh, Mohammad Reza Mohammadi, Mohammad Reza Jahed Motlagh
― 6 min read
SLAck offers a new approach to tracking diverse objects in videos.
Siyuan Li, Lei Ke, Yung-Hsu Yang
― 6 min read
A benchmark for generalized few-shot segmentation in remote sensing is introduced.
Clifford Broni-Bediako, Junshi Xia, Jian Song
― 5 min read
A new method enhances pose estimation using RGB images informed by depth data.
Alessandro Simoni, Francesco Marchetti, Guido Borghi
― 6 min read
TRIM method reduces image tokens in multi-modal language models while maintaining performance.
Dingjie Song, Wenjun Wang, Shunian Chen
― 5 min read
A new framework accurately estimates depth from single defocused images.
Jinchang Zhang, Ningning Xu, Hao Zhang
― 6 min read
A new method improves efficiency in 3D data capture for various applications.
Zhizhou Jia, Shaohui Zhang, Qun Hao
― 6 min read
WaveMixSR-V2 transforms low-resolution images into high-quality outputs efficiently.
Pranav Jeevan, Neeraj Nixon, Amit Sethi
― 5 min read