Automated Debugging in Machine Learning Models
A new system improves AutoML performance by automatically fixing search bugs.
― 7 min read
Deep Learning models are now a key part of many software systems. To help design these models automatically, researchers created Automated Machine Learning (AutoML) systems. These systems automatically find the right model structure and settings for specific tasks. However, like all software, AutoML systems can have bugs. Two major problems we found in AutoML are:
- Performance Bug: The search for the desired model takes an unreasonably long time.
- Ineffective Search Bug: The system cannot find an accurate enough model at all.
When we looked into how AutoML systems work, we saw that they overlook potential opportunities in their search space, search method, and search feedback. These oversights lead to the two bugs above. Based on our findings, we developed a system that can automatically debug and repair these issues in AutoML systems.
Our system watches the AutoML process as it runs. It collects detailed feedback about the model search and automatically repairs bugs by expanding the search space and applying a feedback-driven search strategy.
Introduction
In today’s world of Software 2.0, Machine Learning (ML) techniques play a big role in making software smarter. The growth of Deep Learning (DL) offers new opportunities for software intelligence. As a result, DL models are becoming essential parts of software systems. The market for AI software at the edge is projected to increase significantly in the coming years. The COVID-19 pandemic has also sped up the use of DL techniques across various industries.
Despite the growth of these technologies, many people working in specialized fields do not have a deep understanding of DL. This creates challenges for the software engineering community. To tackle this, researchers developed AutoML, which aims to create ML models without requiring extensive knowledge from the user; only the task at hand needs to be specified. Many organizations have created and shared AutoML systems to help users with little or no DL background build effective models.
The AutoML process includes several steps (a minimal sketch of this loop follows the list):
- Data Preparation: The AutoML engine collects and cleans the data. It may also enhance the data if necessary.
- Model Generation: The system looks for possible models and their settings based on the provided task.
- Training and Evaluation: The models are trained and evaluated. If a model does not meet the set accuracy level, the search continues to find a better one.
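To make this loop concrete, here is a minimal sketch in Python. Everything in it is illustrative: the helper functions are hypothetical stand-ins for an engine's internals and the scores are simulated, so it shows the shape of the search loop rather than any real engine's API.

```python
import random

def prepare_data(raw_data):
    # Stand-in for cleaning, normalizing, or augmenting the data.
    return raw_data

def sample_candidate():
    # Stand-in for picking a candidate model structure and its settings.
    return {"layers": random.choice([2, 3, 4]),
            "learning_rate": random.choice([1e-2, 1e-3, 1e-4])}

def train_and_evaluate(candidate, data):
    # Stand-in for real training; returns a simulated accuracy score.
    return random.uniform(0.5, 1.0)

def automl_search(raw_data, target_accuracy=0.90, max_trials=100):
    data = prepare_data(raw_data)
    best_candidate, best_score = None, 0.0
    for _ in range(max_trials):
        candidate = sample_candidate()
        score = train_and_evaluate(candidate, data)
        if score > best_score:
            best_candidate, best_score = candidate, score
        if best_score >= target_accuracy:   # stop once the target is met
            break
    return best_candidate, best_score

best, score = automl_search(raw_data=[], target_accuracy=0.90)
print(best, round(score, 3))
```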
This approach has shown good results in many areas. For example, one AutoML system is used in healthcare to help diagnose medical conditions faster. It has also improved how content is classified online.
However, AutoML systems are not without their flaws. The two bugs we mentioned earlier, performance bugs and ineffective search bugs, are common issues that waste time and resources.
To better understand these bugs, we examined AutoKeras, a well-known AutoML engine. We discovered that existing AutoML engines miss various optimization opportunities, leading to the two main bugs.
Performance Bug
First, the search space in many AutoML systems lacks certain valuable options. Many parameters are fixed in these engines, making it hard to find the best models. Making some of these parameters searchable might greatly improve the system’s performance.
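The sketch below illustrates the idea. The parameter names and values are hypothetical, not taken from any specific engine: a hard-coded setting behaves like a search dimension with a single option, and expanding it turns that constant back into a real choice.

```python
# Illustrative only: parameter names and values are hypothetical.
fixed_space = {
    "optimizer": ["adam"],        # hard-coded: never explored
    "block_type": ["resnet"],
    "learning_rate": [1e-3],
}

expanded_space = {
    "optimizer": ["adam", "sgd", "adamw"],            # now searchable
    "block_type": ["resnet", "xception", "vanilla"],
    "learning_rate": [1e-2, 1e-3, 1e-4, 1e-5],
}

def space_size(space):
    # Number of distinct configurations the space can express.
    size = 1
    for options in space.values():
        size *= len(options)
    return size

print(space_size(fixed_space), "->", space_size(expanded_space))  # 1 -> 36
```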
Ineffective Search Bug
Second, the search strategies used by existing AutoML engines often ignore important feedback. These strategies rely on a lot of random guessing, which lowers the chances of finding a good model, and they depend on training huge numbers of models to refine their guesses, which is rarely practical. As a result, the chances of discovering an optimal model and its settings can be very low.
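A minimal sketch of what a feedback-free strategy looks like, assuming a toy search space: every trial is an independent random guess, so nothing learned from earlier trials narrows the search.

```python
import random

# Hypothetical search space; values are illustrative.
space = {"learning_rate": [1e-2, 1e-3, 1e-4], "depth": [2, 3, 4, 5]}

def random_trial():
    # Each trial samples independently; past results are never consulted.
    return {name: random.choice(options) for name, options in space.items()}

history = [random_trial() for _ in range(5)]
print(history)
```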
Need for Improvement
The AutoML process needs hundreds of attempts even with simple datasets. Moreover, the models it produces often underperform compared to expectations. We designed a system to identify and repair both the performance bugs and the ineffective search bugs in AutoML.
Our system not only monitors the training and evaluation processes but also applies improvements when it finds issues. These improvements include expanding the search space and employing an effective feedback-driven strategy. We drew on previous research to demonstrate the positive impact of these changes on model performance.
Background on Automated Machine Learning
Building ML models can be complex and challenging. To address this, AutoML systems automate the creation of DL models for given datasets, minimizing the need for human input. In AutoML, each attempt to create a model is called a search. Each search involves:
- Preparing the data and creating features.
- Generating a model.
- Training and evaluating that model.
The whole process is repeated until a satisfactory model is found.
AutoML engines generally evaluate each trained model using its accuracy score. To measure how efficient the searches are, we look at the GPU hours or days each search requires.
Steps of the AutoML Pipeline
The AutoML pipeline includes several steps:
Data Preparation
This step gets the training data ready and may involve cleaning, normalizing, or enhancing it.
Model Generation
This part involves searching for the right models and settings. The search space defines all the potential model structures and settings. The search strategy determines how the system explores this space.
Training and Evaluation
Once models are generated, they are trained. After training, the models are evaluated to determine their accuracy. If the model meets the required performance level, the search stops. If not, the system continues to search for better models.
Model Debugging and Repairing
Fixing issues in models is an essential part of software engineering. Various approaches have been proposed to debug and fix problems in ML models. These include:
- Creating adversarial examples to identify weaknesses in models.
- Cleaning up wrongly labeled training data.
- Analyzing input data to fix overfitting and underfitting issues.
- Using specific tools designed to automate debugging practices.
Performance Bugs in AutoML
AutoML systems also face bugs similar to other software systems, including performance bugs and ineffective search bugs. These bugs can lead to issues like extended processing times and low accuracy.
Examining AutoKeras
To study these bugs, we looked closely at AutoKeras. We set a target accuracy, and each of the three search strategies failed to meet the goal in a reasonable time. For example, the Bayesian search strategy took over 9 hours to reach a modest accuracy score, while other strategies failed to meet the target even after long periods.
Our Proposed System
We created a new system that aims to automatically fix performance and ineffective search bugs in AutoML pipelines. Our system monitors the searches for potential issues and suggests necessary changes.
Monitoring and Feedback
Our system collects detailed feedback on training and evaluation to help identify bugs. We defined specific symptoms for both performance and ineffective search bugs, allowing the system to spot them quickly.
When the AutoML engine takes too long to reach a set goal, we classify it as a performance bug. If there’s no improvement in the model score over several attempts, it is labeled as an ineffective search bug.
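A minimal sketch of these two symptom checks, with illustrative thresholds; the actual detection rules in the system are more detailed than this.

```python
import time

class SearchMonitor:
    def __init__(self, time_budget_s, patience):
        self.start = time.monotonic()
        self.time_budget_s = time_budget_s   # max time allowed to reach the goal
        self.patience = patience             # trials allowed without improvement
        self.best_score = 0.0
        self.stale_trials = 0

    def record(self, score, target):
        # Call once per trial with the trial's score; returns a bug label or None.
        if score > self.best_score:
            self.best_score = score
            self.stale_trials = 0
        else:
            self.stale_trials += 1
        elapsed = time.monotonic() - self.start
        if self.best_score < target and elapsed > self.time_budget_s:
            return "performance bug"         # too slow to reach the goal
        if self.stale_trials >= self.patience:
            return "ineffective search bug"  # score has stopped improving
        return None

monitor = SearchMonitor(time_budget_s=3600, patience=10)
status = monitor.record(score=0.71, target=0.90)   # None: no bug yet
```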
Feedback-Driven Search
The heart of our proposed system is the feedback-driven search, which uses the collected feedback to guide the search process. By analyzing this data, the system can choose the most promising action to take next.
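The sketch below shows the core idea in a deliberately simplified form: instead of guessing at random, score each candidate action by the average improvement it produced in earlier trials and prefer the most promising one. The action names and bookkeeping are illustrative; the system's real strategy uses richer training feedback than a single score.

```python
def pick_next_action(history, actions):
    # history: list of (action, score_improvement) pairs from past trials.
    def avg_gain(action):
        gains = [g for a, g in history if a == action]
        # Untried actions get priority so every option is explored at least once.
        return sum(gains) / len(gains) if gains else float("inf")
    return max(actions, key=avg_gain)

actions = ["widen_model", "change_optimizer", "adjust_learning_rate"]
history = [("widen_model", 0.01), ("change_optimizer", 0.05),
           ("widen_model", 0.00)]
print(pick_next_action(history, actions))  # untried action wins, else best gain
```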
Experimentation and Results
We carried out experiments using popular datasets to test the efficiency and effectiveness of our debugging system. The results show that our system significantly outperformed existing strategies.
Performance Improvement
Our system achieved better accuracy scores compared to the baseline methods, indicating that it can effectively fix the issues in the AutoML pipeline.
Efficiency of the System
The time taken to reach target scores was much shorter with our system compared to traditional AutoML strategies.
Conclusion
In summary, our automatic debugging and repairing system for AutoML focuses on fixing performance and ineffective search bugs. By collecting detailed feedback and monitoring the training processes, our system can make informed changes to improve the search and model performance. The results from our evaluation demonstrate that our system is effective and can significantly enhance the performance of AutoML engines compared to existing methods.
Title: DREAM: Debugging and Repairing AutoML Pipelines
Abstract: Deep Learning models have become an integrated component of modern software systems. In response to the challenge of model design, researchers proposed Automated Machine Learning (AutoML) systems, which automatically search for model architecture and hyperparameters for a given task. Like other software systems, existing AutoML systems suffer from bugs. We identify two common and severe bugs in AutoML, performance bug (i.e., searching for the desired model takes an unreasonably long time) and ineffective search bug (i.e., AutoML systems are not able to find an accurate enough model). After analyzing the workflow of AutoML, we observe that existing AutoML systems overlook potential opportunities in search space, search method, and search feedback, which results in performance and ineffective search bugs. Based on our analysis, we design and implement DREAM, an automatic debugging and repairing system for AutoML systems. It monitors the process of AutoML to collect detailed feedback and automatically repairs bugs by expanding search space and leveraging a feedback-driven search strategy. Our evaluation results show that DREAM can effectively and efficiently repair AutoML bugs.
Authors: Xiaoyu Zhang, Juan Zhai, Shiqing Ma, Chao Shen
Last Update: 2023-12-30 00:00:00
Language: English
Source URL: https://arxiv.org/abs/2401.00379
Source PDF: https://arxiv.org/pdf/2401.00379
Licence: https://creativecommons.org/licenses/by/4.0/
Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.
Thank you to arxiv for use of its open access interoperability.