Introducing the 3DGrocery100 Dataset for Enhanced Grocery Recognition
A new dataset aims to improve grocery item recognition through detailed 3D data.
Table of Contents
- Dataset Overview
- Importance of 3D Data
- The Need for 3DGrocery100
- Data Collection Process
- Data Processing and Cleaning
- The Value of 3DGrocery100
- Benchmarking the Dataset
- Few-Shot Learning and Class-Incremental Learning
- Results and Findings
- Limitations and Future Directions
- Conclusion
- Original Source
Recognizing grocery items accurately is important for self-checkout machines, in-store robots, and assistive technology for people with visual impairments. Most current grocery datasets consist of 2D images, which limits how well models can learn to recognize products because 2D images do not capture the full shape of an item. Recently, 3D sensors such as LiDAR and TrueDepth have been built into smartphones, making it possible to collect more detailed 3D data. However, there is still no dedicated large-scale, real-world 3D benchmark dataset for grocery items.
To address this, we introduce a new large-scale grocery dataset called 3DGrocery100. It covers 100 classes of grocery items, with a total of 87,898 3D point clouds created from 10,755 single-view RGB-D images. We also benchmarked six recent state-of-the-art 3D point cloud classification models on the dataset. The dataset lays a foundation for further research in grocery recognition.
Dataset Overview
3DGrocery100 consists of 10,755 RGB-D images and 87,898 point clouds across 100 classes. The classes fall into three main groups: Fruits (10 apple and 24 non-apple classes), Vegetables (28 classes), and Packages (38 classes). The dataset was collected under real-world grocery store conditions, offering a diverse representation of grocery items and their arrangements.
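The class counts and sample totals above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in this overview; the variable names are illustrative.

```python
# Class counts per group, as stated in the overview.
classes_per_group = {
    "Fruits (apple)": 10,
    "Fruits (non-apple)": 24,
    "Vegetables": 28,
    "Packages": 38,
}
num_images = 10_755
num_point_clouds = 87_898

total_classes = sum(classes_per_group.values())
clouds_per_image = num_point_clouds / num_images
clouds_per_class = num_point_clouds / total_classes

print(f"classes: {total_classes}")
print(f"point clouds per image (average): {clouds_per_image:.2f}")
print(f"point clouds per class (average): {clouds_per_class:.2f}")
```

These averages are only a rough guide, since the overview does not say how evenly the samples are spread across images or classes.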
Some grocery items, especially fresh produce, can be challenging to recognize due to pricing issues, random placement, and varying orientations. These issues highlight the need for better data collection methods, particularly ones that capture 3D features effectively.
Importance of 3D Data
3D computer vision is increasingly important with applications in areas like healthcare and augmented reality. In grocery stores, accurately identifying and locating items can improve the shopping experience and assist in inventory management. Traditional 2D datasets do not provide the depth information needed to fully recognize and classify grocery items.
3D data adds value because it captures the shape and structure of items. This data is essential for deep learning models that need to learn the fine details of grocery objects, which can affect recognition performance significantly.
The Need for 3DGrocery100
Despite recent advances in 3D data collection, 3D grocery datasets remain scarce. Existing 3D datasets lack fine-grained grocery categories and have limited training samples. To create a practical dataset, we used mobile phones equipped with 3D sensors to capture images simply and efficiently, as in ordinary photo capture rather than by moving around each object. This approach lets us convert single-view RGB and depth images into usable 3D point clouds.
Our dataset aims to bridge the gap in 3D grocery recognition by providing a well-organized collection of point clouds that represent various grocery items in detail.
Data Collection Process
Our data collection took place over four months in 18 different grocery stores. The process involved taking RGB-D images of items in various store setups. We used an iOS app that works with modern mobile phone cameras to capture both RGB images and depth data. This app allowed for effective image collection, even when grocery items were placed in less-than-ideal lighting or positioning.
The iPhone's LiDAR and TrueDepth sensors helped achieve better depth mapping and point cloud quality, leading to more accurate representations of grocery items.
Data Hierarchy
Once the data was gathered, it was organized into structured categories. The dataset classifies items into Fruits, Vegetables, and Packages, with additional subcategories for finer granularity. Each class has its own set of images and corresponding point cloud samples, allowing for varied analysis during experiments.
Data Annotation
Annotating the collected images was an important part of creating the dataset. We marked the boundaries of grocery items within the 2D RGB images so that each 3D point cloud is generated only from the pixels belonging to an item. We took care to select object boundaries accurately, because extraneous points would add noise to the point clouds and hinder analysis.
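To make the link between a 2D annotation and a 3D point cloud concrete, here is a minimal sketch of back-projecting the annotated pixels of an RGB-D image with a pinhole camera model. It is not the pipeline used for 3DGrocery100: the function name, the boolean per-pixel mask format, and the intrinsics (fx, fy, cx, cy) are all assumptions made for illustration.

```python
import numpy as np

def backproject(depth, rgb, mask, fx, fy, cx, cy):
    """Lift masked RGB-D pixels to an (N, 6) array of x, y, z, r, g, b."""
    h, w = depth.shape
    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Keep annotated pixels that also have a valid (positive) depth reading.
    keep = mask & (depth > 0)
    z = depth[keep]
    x = (u[keep] - cx) * z / fx
    y = (v[keep] - cy) * z / fy
    xyz = np.stack([x, y, z], axis=1)
    colors = rgb[keep].astype(np.float64) / 255.0  # scale colors to [0, 1]
    return np.hstack([xyz, colors])

# Toy 4x4 frame with made-up values.
depth = np.full((4, 4), 2.0)
depth[0, 0] = 0.0  # simulate a missing depth reading
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
rgb[..., 0] = 255  # pure red image
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True  # hypothetical annotated region
cloud = backproject(depth, rgb, mask, fx=2.0, fy=2.0, cx=1.0, cy=1.0)
print(cloud.shape)
```

Pixels outside the mask never enter the cloud, which is why loose boundaries translate directly into stray points.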
Data Processing and Cleaning
Processing RGB-D images into point clouds involves some challenges. Often, outliers and noise can be introduced during the conversion process. To address these issues, we applied specific techniques to clean the data, including outlier removal and denoising methods. This ensures a higher quality dataset that accurately reflects the grocery items.
Outlier Removal
Using PointCleanNet, we were able to identify and remove noisy points from the dataset. By focusing on maintaining higher-quality point clouds, we ensure that the resulting dataset can be used reliably for further research and model training.
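PointCleanNet is a learned approach, and reproducing it is beyond a short example. To illustrate the general idea of outlier removal, the sketch below swaps in a much simpler classical technique, statistical outlier removal: a point is dropped when its mean distance to its k nearest neighbors is far above the average for the cloud. The function name and the parameters k and std_ratio are illustrative choices, not values from the 3DGrocery100 pipeline.

```python
import numpy as np

def remove_statistical_outliers(points, k=8, std_ratio=2.0):
    """Drop points whose mean k-nearest-neighbor distance is unusually large."""
    # Pairwise Euclidean distances, shape (N, N); suitable for small clouds only.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    # Sort each row; column 0 is the point itself (distance 0), so skip it.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    mean_knn = knn.mean(axis=1)
    threshold = mean_knn.mean() + std_ratio * mean_knn.std()
    keep = mean_knn <= threshold
    return points[keep], keep

# Synthetic example: a tight cluster plus one far-away point.
rng = np.random.default_rng(0)
cluster = rng.normal(0.0, 0.01, size=(200, 3))
points = np.vstack([cluster, [[5.0, 5.0, 5.0]]])
cleaned, keep = remove_statistical_outliers(points)
print(points.shape[0], "->", cleaned.shape[0])
```

The brute-force distance matrix keeps the example short; for real point clouds a spatial index such as a k-d tree would be the usual choice.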
The Value of 3DGrocery100
The introduction of 3DGrocery100 presents an opportunity for significant advancements in grocery recognition systems. By providing a large and varied dataset, we aim to support the development of methods that can classify and recognize grocery items more effectively.
The dataset is not only large but also covers different types of grocery items in real-world settings. This allows researchers to build and refine models for applications such as automated checkout or assistance for visually impaired shoppers.
Benchmarking the Dataset
To validate the usefulness of our dataset, we benchmarked six models known for their performance in point cloud classification on it. We evaluated how well each model could classify the grocery items in the dataset, which provides insight into the models' strengths and weaknesses.
Classification Models Used
We tested six state-of-the-art models designed for 3D point cloud classification. Each model was evaluated to see how it handled the unique challenges posed by our dataset. The results of these benchmarks offer a better understanding of the capabilities and limitations of existing methods for grocery recognition.
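Point cloud classification benchmarks commonly report overall accuracy together with mean per-class accuracy, which is less sensitive to class imbalance. Which metrics were actually used for 3DGrocery100 is not stated here, so the following is only a sketch of how such scores could be computed. The function name and the labels are made up for illustration.

```python
from collections import defaultdict

def overall_and_mean_class_accuracy(y_true, y_pred):
    """Return (overall accuracy, mean of per-class accuracies)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    overall = sum(correct.values()) / len(y_true)
    # Average the per-class accuracies so every class counts equally.
    mean_class = sum(correct[c] / total[c] for c in total) / len(total)
    return overall, mean_class

# Imbalanced toy example: 8 "apple" samples, 2 "leek" samples.
y_true = ["apple"] * 8 + ["leek"] * 2
y_pred = ["apple"] * 8 + ["apple", "leek"]
overall, mean_class = overall_and_mean_class_accuracy(y_true, y_pred)
print(overall, mean_class)
```

In this toy case the overall accuracy looks high because the frequent class dominates, while the mean class accuracy exposes the weaker result on the rare class.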
Few-Shot Learning and Class-Incremental Learning
Few-shot learning and class-incremental learning are essential areas of study in machine learning, particularly when dealing with new or evolving datasets. Our dataset allows for experimentation in these areas, helping to explore how well models can generalize from limited examples or adapt to new classes of items over time.
Few-Shot Learning
We created a variant of our dataset called 3DGrocery63 by merging some classes with similar shapes. It serves as a basis for few-shot evaluations, enabling researchers to test how well models adapt with limited training data.
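Few-shot classification is often evaluated in episodes: a handful of classes is drawn (N-way), each with a few labeled support samples (K-shot) and some held-out query samples. The exact protocol used with 3DGrocery63 is not described here, so the sketch below only illustrates generic episode sampling. The function name, the toy data, and the 5-way 1-shot setting are assumptions.

```python
import random

def sample_episode(samples_by_class, n_way, k_shot, n_query, rng):
    """Return (support, query) lists of (sample, class) pairs for one episode."""
    classes = rng.sample(sorted(samples_by_class), n_way)
    support, query = [], []
    for cls in classes:
        # Draw disjoint support and query samples for this class.
        picked = rng.sample(samples_by_class[cls], k_shot + n_query)
        support += [(s, cls) for s in picked[:k_shot]]
        query += [(s, cls) for s in picked[k_shot:]]
    return support, query

# Toy stand-in for a dataset: 10 classes with 20 sample identifiers each.
data = {
    f"class_{i}": [f"class_{i}_cloud_{j}" for j in range(20)]
    for i in range(10)
}
rng = random.Random(0)
support, query = sample_episode(data, n_way=5, k_shot=1, n_query=3, rng=rng)
print(len(support), len(query))
```

A model is then adapted on the support set and scored on the query set, with results averaged over many such episodes.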
Class-Incremental Learning
Our dataset is also suitable for class-incremental learning, allowing us to explore how well models maintain their performance as new classes are introduced. This is particularly useful for grocery recognition applications, where new products are frequently added or changed in stores.
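A typical class-incremental setup trains on a base set of classes, then adds the remaining classes in steps, evaluating on every class seen so far after each step. The split below is a hypothetical example of such a schedule for 100 classes. The base size of 50 and the step of 10 are illustrative and are not taken from the 3DGrocery100 experiments.

```python
def incremental_tasks(class_names, base_size, step):
    """Split classes into a base task followed by fixed-size increments."""
    tasks = [class_names[:base_size]]
    for start in range(base_size, len(class_names), step):
        tasks.append(class_names[start:start + step])
    return tasks

# Placeholder class names; real names would come from the dataset.
classes = [f"class_{i:03d}" for i in range(100)]
tasks = incremental_tasks(classes, base_size=50, step=10)

seen = []
for t, new_classes in enumerate(tasks):
    seen.extend(new_classes)
    # After each task the model is evaluated on all classes seen so far.
    print(f"task {t}: +{len(new_classes)} classes, evaluate on {len(seen)}")
```

Accuracy on the earlier classes usually drops as new ones are added, and that forgetting is what class-incremental methods try to limit.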
Results and Findings
The results from our benchmarking and evaluations provide valuable insights into the performance of different models using our dataset. We observed that while some models excelled in specific tasks, others struggled to adapt to the complexities of grocery item recognition.
Performance Summary
The benchmarking highlighted the importance of color and geometric features in classification tasks. Models performed significantly better when using color data alongside geometric information, showcasing how valuable a complete 3D representation can be for accurate grocery recognition.
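One common way to give a point cloud classifier both kinds of information is to feed it per-point features of the form (x, y, z, r, g, b) instead of (x, y, z) alone. The sketch below builds both input variants. The function name, the unit-sphere normalization, and the random data are assumptions for illustration, not details of the benchmarked models.

```python
import numpy as np

def make_model_input(xyz, rgb=None):
    """Center and scale xyz to the unit sphere; optionally append colors in [0, 1]."""
    centered = xyz - xyz.mean(axis=0)
    scale = np.linalg.norm(centered, axis=1).max()
    normalized = centered / scale if scale > 0 else centered
    if rgb is None:
        return normalized  # geometry only: shape (N, 3)
    colors = rgb.astype(np.float64) / 255.0
    return np.hstack([normalized, colors])  # geometry + color: shape (N, 6)

# Random stand-in for one point cloud with 1024 points.
rng = np.random.default_rng(0)
xyz = rng.uniform(-1.0, 1.0, size=(1024, 3))
rgb = rng.integers(0, 256, size=(1024, 3))
geometry_only = make_model_input(xyz)
with_color = make_model_input(xyz, rgb)
print(geometry_only.shape, with_color.shape)
```

Comparing a model trained on the three-channel input with one trained on the six-channel input is one way to measure how much color contributes.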
Limitations and Future Directions
While 3DGrocery100 represents an important step in the field of grocery recognition, there are still challenges to address. Issues with data quality, annotation processes, and 3D representation conversion indicate areas for improvement in future iterations.
Future Work
Potential future work includes the exploration of unsupervised learning techniques to streamline data annotation and improve overall dataset quality. Additionally, more extensive benchmarking may reveal further insights into the capabilities of various models in real-world grocery scenarios.
Conclusion
The 3DGrocery100 dataset has the potential to significantly advance research and development in grocery recognition systems. By combining a broad range of grocery categories with mobile 3D data collection, it serves as a valuable resource for improving the machine learning models used in this field.
Continued exploration and advancement in 3D grocery recognition will pave the way for innovative solutions that can transform the shopping experience for consumers and streamline operations for retailers.
Original Source
Title: A Benchmark Grocery Dataset of Realworld Point Clouds From Single View
Abstract: Fine-grained grocery object recognition is an important computer vision problem with broad applications in automatic checkout, in-store robotic navigation, and assistive technologies for the visually impaired. Existing datasets on groceries are mainly 2D images. Models trained on these datasets are limited to learning features from the regular 2D grids. While portable 3D sensors such as Kinect were commonly available for mobile phones, sensors such as LiDAR and TrueDepth, have recently been integrated into mobile phones. Despite the availability of mobile 3D sensors, there are currently no dedicated real-world large-scale benchmark 3D datasets for grocery. In addition, existing 3D datasets lack fine-grained grocery categories and have limited training samples. Furthermore, collecting data by going around the object versus the traditional photo capture makes data collection cumbersome. Thus, we introduce a large-scale grocery dataset called 3DGrocery100. It constitutes 100 classes, with a total of 87,898 3D point clouds created from 10,755 RGB-D single-view images. We benchmark our dataset on six recent state-of-the-art 3D point cloud classification models. Additionally, we also benchmark the dataset on few-shot and continual learning point cloud classification tasks. Project Page: https://bigdatavision.org/3DGrocery100/.
Authors: Shivanand Venkanna Sheshappanavar, Tejas Anvekar, Shivanand Kundargi, Yufan Wang, Chandra Kambhamettu
Last Update: 2024-04-07 00:00:00
Language: English
Source URL: https://arxiv.org/abs/2402.07819
Source PDF: https://arxiv.org/pdf/2402.07819
Licence: https://creativecommons.org/licenses/by/4.0/
Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.
Thank you to arXiv for use of its open access interoperability.