Revolutionizing Receipt Digitization with a New App

An app that simplifies receipt scanning and storage through automatic detection.

In today's world, many payments are moving from cash to digital methods. However, paper receipts are still commonly given after purchases in physical stores. These receipts are important because they serve multiple purposes. They provide proof of purchase which can be useful in cases of theft or for returning items. They also help in documenting expenses for employers or tax authorities. Additionally, paper receipts contain detailed information that is often not available through digital payment methods, such as the items bought, time and location of the purchase, and any discounts used. Therefore, it seems unlikely that paper receipts will completely disappear anytime soon.

Existing Solutions

Several smartphone apps help capture and digitize paper receipts. Popular ones include Apple Notes, Expensify, and Zoho. The Money Forward ME app has over 12 million users in Japan and processes millions of receipt images each month. Most of these apps require the user to align the receipt within a specific area on the phone screen. This process can be tedious and prone to errors. For instance, pressing the button to take a photo can inadvertently shift the camera's position, resulting in a blurry image. Users may also struggle to take an overhead photo if they have to stand up to align the receipt properly. An automatic method for detecting and correcting receipt images would therefore make the process easier for the user and also improve the accuracy of later tasks, such as reading and managing the text on the receipt.

Our Proposal

This paper discusses a new smartphone application that allows users to quickly digitize paper receipts by "waving" their phone over the receipts. The app automatically detects and corrects the receipt images, making it easy for users to store them. An essential step in this process is the correction of the image, which requires accurate detection of the corners of the receipt.
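To see why the corner positions matter so much, it helps to look at how a perspective correction can be computed from four points. The following is a minimal sketch, not the app's actual implementation: it assumes the four corners are already known, solves for a 3x3 projective transformation with NumPy, and then shows that moving a single corner by a few pixels changes where the true corner lands. The coordinates, the 300 x 600 output size, and the function names are all hypothetical.

```python
import numpy as np


def homography_from_corners(src, dst):
    """Return the 3x3 matrix H that maps four src points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per point pair, with H[2, 2] fixed to 1.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H, point):
    """Map one (x, y) point through H, dividing by the homogeneous coordinate."""
    u, v, w = H @ np.array([point[0], point[1], 1.0])
    return (float(u / w), float(v / w))


# Hypothetical corners in photo pixels: top-left, top-right, bottom-right, bottom-left.
detected = [(120.0, 80.0), (420.0, 110.0), (400.0, 700.0), (90.0, 660.0)]
# Upright rectangle the receipt should be mapped onto (assumed output size).
target = [(0.0, 0.0), (300.0, 0.0), (300.0, 600.0), (0.0, 600.0)]

H = homography_from_corners(detected, target)

# Same corners, but the top-left one is off by 15 pixels in x and y.
noisy = [(135.0, 95.0)] + detected[1:]
H_noisy = homography_from_corners(noisy, target)

# With the noisy estimate, the true top-left corner no longer maps to (0, 0).
print(apply_homography(H, detected[0]))
print(apply_homography(H_noisy, detected[0]))
```

Because every pixel of the output is produced by the same matrix, an error in one corner distorts the whole rectified receipt, not just the area near that corner.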

Challenges in Detection

Traditional methods for detecting edges and corners in images often struggle with paper receipts. Real-world receipts have uneven edges, and their colors can be similar to the background, which complicates detection. Other rectangular objects in the frame can interfere as well. Inaccurate corner detection then leads to distorted images when correcting the perspective. Our approach treats each of the four corners of the receipt as a separate object. We train a Single Shot Detection (SSD) MobileNet object detection model on a small amount of real receipt images combined with a large amount of synthetic data created to mimic real-world scenarios.

Data Generation

Collecting a large set of real receipt images can be expensive and time-consuming. To overcome this, we generate synthetic data by combining actual receipt images with various backgrounds. We first take a set of scanned images of real receipts, ensuring they are in a vertical position with minimal background exposed. We then apply random transformations, such as rotations and shifts, to simulate how users might take photos from different angles and positions.

To create the synthetic data, we choose diverse backgrounds that users might place their receipts on. This way, we can train the model to recognize receipts against a variety of backgrounds, including those that may have similar colors or textures. By generating a set of images that includes multiple receipts in random positions, we ensure the model learns to ignore interfering objects and focus on the target receipt.
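The compositing step can be illustrated with a small sketch. It assumes images are NumPy arrays, uses a random-noise array as a stand-in for a background photo and a plain light rectangle as a stand-in for a scanned receipt, and pastes the receipt at a random position without rotation. The function name, the image sizes, and the corner ordering are assumptions made for illustration; the key point is that the corner labels come for free because we chose where the receipt was placed.

```python
import numpy as np


def paste_receipt(background, receipt, top, left):
    """Paste the receipt onto a copy of the background and return corner labels."""
    h, w = receipt.shape[:2]
    bg_h, bg_w = background.shape[:2]
    if top < 0 or left < 0 or top + h > bg_h or left + w > bg_w:
        raise ValueError("receipt does not fit inside the background")
    image = background.copy()
    image[top:top + h, left:left + w] = receipt
    # Corner pixels as (x, y): top-left, top-right, bottom-right, bottom-left.
    corners = [
        (left, top),
        (left + w - 1, top),
        (left + w - 1, top + h - 1),
        (left, top + h - 1),
    ]
    return image, corners


rng = np.random.default_rng(0)
# Stand-ins for real data: a noisy background and a near-white receipt.
background = rng.integers(0, 256, size=(480, 320, 3), dtype=np.uint8)
receipt = np.full((300, 120, 3), 245, dtype=np.uint8)

# Random placement that keeps the whole receipt inside the frame.
top = int(rng.integers(0, 480 - 300 + 1))
left = int(rng.integers(0, 320 - 120 + 1))
image, corners = paste_receipt(background, receipt, top, left)
print(corners)
```

Pasting several receipts into the same background, while labeling only the corners of the target one, would follow the same pattern.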

Augmentation Techniques

Once we have our synthetic receipts, we apply a series of transformations to generate a range of different images. These include changing the scale, shifting the positions, and applying rotation to create a variety of perspectives. This helps simulate real-life scenarios where the user's camera might not be perfectly positioned.

By applying such transformations, we not only create a more extensive dataset but also help the model learn how to identify receipt corners even when they are not clearly visible or perfectly aligned.
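Whenever an image is scaled, rotated, or shifted, its corner labels have to be moved in exactly the same way, otherwise the model is trained on wrong targets. The sketch below applies one random scale, rotation, and shift to a set of corner coordinates. The parameter ranges, the rotation center, and the function name are assumptions chosen for illustration, not values from the paper.

```python
import math
import random


def transform_corners(corners, scale, angle_deg, shift, center):
    """Scale and rotate corners about center in the pixel plane, then shift them."""
    cx, cy = center
    dx, dy = shift
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    moved = []
    for x, y in corners:
        # Offset from the center, scaled.
        rx, ry = (x - cx) * scale, (y - cy) * scale
        # Standard 2D rotation of the offset, followed by the shift.
        moved.append((cx + rx * cos_a - ry * sin_a + dx,
                      cy + rx * sin_a + ry * cos_a + dy))
    return moved


rng = random.Random(42)
# Hypothetical corners of an upright receipt: TL, TR, BR, BL.
corners = [(100.0, 50.0), (220.0, 50.0), (220.0, 350.0), (100.0, 350.0)]
augmented = transform_corners(
    corners,
    scale=rng.uniform(0.8, 1.2),          # assumed range
    angle_deg=rng.uniform(-15.0, 15.0),   # assumed range
    shift=(rng.uniform(-20.0, 20.0), rng.uniform(-20.0, 20.0)),
    center=(160.0, 200.0),
)
print(augmented)
```

The same scale, angle, and shift would be applied to the image pixels so that image and labels stay in agreement.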

Training the Model

We train our model using both real and synthetic data. For our training process, we use a popular deep learning framework that allows us to feed in our labeled data and adjust the model parameters to improve its accuracy. The model learns to recognize the four corners of a receipt as unique objects, rather than looking for the entire receipt as a single entity.
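Object detectors are usually trained on labeled bounding boxes rather than single points, so treating corners as objects implies turning each corner into a small box with its own class. The following sketch shows one plausible way to do that. The class names, the 40-pixel box size, and the dictionary layout are assumptions for illustration; the source does not specify how the boxes are defined.

```python
# Assumed class names, one per corner, in clockwise order from the top-left.
CORNER_CLASSES = ("top_left", "top_right", "bottom_right", "bottom_left")


def corners_to_boxes(corners, box_size, image_width, image_height):
    """Wrap each corner point in a square box, clipped to the image bounds."""
    half = box_size / 2.0
    boxes = []
    for label, (x, y) in zip(CORNER_CLASSES, corners):
        boxes.append({
            "label": label,
            "xmin": max(0.0, x - half),
            "ymin": max(0.0, y - half),
            "xmax": min(float(image_width), x + half),
            "ymax": min(float(image_height), y + half),
        })
    return boxes


def box_center(box):
    """Recover a corner estimate from a box by taking its center."""
    return ((box["xmin"] + box["xmax"]) / 2.0,
            (box["ymin"] + box["ymax"]) / 2.0)


# Hypothetical corner annotations for one 480 x 800 training image.
corners = [(120.0, 80.0), (420.0, 110.0), (400.0, 700.0), (90.0, 660.0)]
boxes = corners_to_boxes(corners, box_size=40, image_width=480, image_height=800)
recovered = [box_center(b) for b in boxes]
print(boxes[0])
print(recovered)
```

At inference time, the center of each predicted box can serve as the corner estimate. Note that a box clipped at the image border no longer has the corner at its center, which is one reason partially visible corners are harder to handle.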

During training, we monitor the model's performance and make adjustments as needed. Our goal is for the model to achieve high accuracy in detecting corners even in challenging conditions, such as low contrast or overlapping receipts.

Evaluation of the Model

To evaluate our model, we compare its performance to a traditional edge detection-based approach. Our approach is significantly more accurate: on real-world data, the traditional approach achieves a receipt detection accuracy of only 36.9%, while our model achieves 85.3%. This improvement is crucial for ensuring that users can trust the app to recognize and store their receipts correctly.
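A receipt-level accuracy of this kind can be computed in several ways. The sketch below uses one plausible criterion, which is an assumption and not necessarily the one used in the paper: a receipt counts as detected only if all four predicted corners lie within a pixel tolerance of the annotated corners. The sample coordinates are toy values, not results from the study.

```python
import math


def receipt_detected(predicted, truth, tolerance):
    """True if four corners were predicted and each is within tolerance of its label."""
    return len(predicted) == 4 and all(
        math.dist(p, t) <= tolerance for p, t in zip(predicted, truth)
    )


def detection_accuracy(samples, tolerance):
    """Fraction of (predicted, truth) pairs that count as detected."""
    hits = sum(receipt_detected(p, t, tolerance) for p, t in samples)
    return hits / len(samples)


truth = [(0.0, 0.0), (100.0, 0.0), (100.0, 200.0), (0.0, 200.0)]
# Toy predictions for three images of the same hypothetical receipt.
samples = [
    ([(2.0, 1.0), (99.0, 3.0), (101.0, 198.0), (1.0, 202.0)], truth),  # all close
    ([(0.0, 0.0), (140.0, 0.0), (100.0, 200.0), (0.0, 200.0)], truth),  # one corner far off
    (list(truth), truth),                                               # exact
]
accuracy = detection_accuracy(samples, tolerance=10.0)
print(f"{accuracy:.3f}")
```

Under this criterion a single bad corner makes the whole receipt count as a miss, which matches the observation that one inaccurate corner is enough to distort the rectified image.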

User Experience

One of the main goals of our application is to simplify the user experience. Instead of requiring users to align their receipts perfectly, the app allows them to take a more relaxed approach by sweeping their phone over the receipts. This reduces frustration and the likelihood of errors.

We plan to integrate this receipt detection feature into the Money Forward ME app, providing users with a seamless way to manage their receipts. Users will not have to worry about the exact positioning or alignment, making the process more enjoyable and less stressful.

Future Improvements

While our current model shows promising results, we recognize the potential for further improvements. One area we want to explore is the ability to detect corners that may not be fully visible, either because they are hidden or damaged. We also plan to investigate how to rectify images of receipts that are curved or folded.

By continuing to enhance our model and using more varied real-world data, we hope to achieve even better performance. This will make it easier for users to capture and manage their receipts, regardless of the conditions.

Conclusion

In summary, we have developed a novel smartphone application that allows users to easily digitize paper receipts by scanning them with their phones. Our approach leverages modern object detection techniques, which have proven to be more effective than traditional methods. By generating synthetic data and training our model on a diverse set of images, we can achieve high accuracy in detecting receipt corners even in challenging conditions.

This application will help streamline the process of managing receipts, making it more accessible and user-friendly. In the future, we aim to enhance the app further by tackling more complex issues related to receipt detection and correction. We appreciate the feedback on our work and look forward to making this tool even better for users.

Original Source

Title: Automatic Detection and Rectification of Paper Receipts on Smartphones

Abstract: We describe the development of a real-time smartphone app that allows the user to digitize paper receipts in a novel way by "waving" their phone over the receipts and letting the app automatically detect and rectify the receipts for subsequent text recognition. We show that traditional computer vision algorithms for edge and corner detection do not robustly detect the non-linear and discontinuous edges and corners of a typical paper receipt in real-world settings. This is particularly the case when the colors of the receipt and background are similar, or where other interfering rectangular objects are present. Inaccurate detection of a receipt's corner positions then results in distorted images when using an affine projective transformation to rectify the perspective. We propose an innovative solution to receipt corner detection by treating each of the four corners as a unique "object", and training a Single Shot Detection MobileNet object detection model. We use a small amount of real data and a large amount of automatically generated synthetic data that is designed to be similar to real-world imaging scenarios. We show that our proposed method robustly detects the four corners of a receipt, giving a receipt detection accuracy of 85.3% on real-world data, compared to only 36.9% with a traditional edge detection-based approach. Our method works even when the color of the receipt is virtually indistinguishable from the background. Moreover, our method is trained to detect only the corners of the central target receipt and implicitly learns to ignore other receipts, and other rectangular objects. Including synthetic data allows us to train an even better model. These factors are a major advantage over traditional edge detection-based approaches, allowing us to deliver a much better experience to the user.

Authors: Edward Whittaker, Masashi Tanaka, Ikuo Kitagishi

Last Update: 2023-03-10 00:00:00

Language: English

Source URL: https://arxiv.org/abs/2303.05763

Source PDF: https://arxiv.org/pdf/2303.05763

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arxiv for use of its open access interoperability.