Managing Harm in Decision Support Systems
A study on reducing counterfactual harm in decision-making tools.
― 8 min read
Table of Contents
- Characterizing Harm
- Goals of the Study
- Framework Development
- Structural Causal Models
- Monotonicity Assumptions
- Computational Framework
- Conformal Risk Control
- Practical Application
- Trade-off Between Accuracy and Harm
- Experimental Validation
- Dataset Overview
- Results and Observations
- Discussion
- Limitations
- Conclusion
- Original Source
- Reference Links
Decision support systems are tools that help people make decisions by providing useful information. They can be particularly helpful in classification tasks, where a person must choose one label from many possible categories. When such a system narrows the choice down to a smaller group of candidate labels, that group is called a prediction set.
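To make the idea concrete, here is a minimal sketch of one common way a prediction set can be built from a classifier's per-label scores. The threshold rule and all names below are our own illustration, not the specific construction used in the study.

```python
def prediction_set(scores, labels, threshold):
    """Return the candidate labels whose score clears the threshold.

    A lower threshold yields a larger, more permissive set; a higher
    threshold yields a smaller, more restrictive one.
    """
    return [label for label, score in zip(labels, scores) if score >= threshold]

# A four-class example where the classifier hesitates between two labels.
labels = ["cat", "dog", "fox", "wolf"]
scores = [0.46, 0.41, 0.09, 0.04]
print(prediction_set(scores, labels, threshold=0.20))  # ['cat', 'dog']
```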
While these systems can improve the average accuracy of decisions, they also come with risks. For instance, a person who is good at making predictions on their own might choose the correct label unaided yet go wrong when relying on the system. This raises the question of how often these systems cause harm, that is, how often they lead someone to a worse decision than they would have made without assistance.
Our aim is to create a way to control how often these systems cause harm. To reach this goal, we first need to define what we mean by harm in this context. Then, we show how the frequency of harm can be estimated from human decisions made without the system.
Characterizing Harm
Harm can be understood in simple terms: it occurs when a decision leads to a worse outcome than the person would have reached on their own, without help from a system. In our study, we say that a decision support system causes harm on an instance if the person would have made a better decision without it.
A common setting for these systems involves a classifier, which is a model that predicts a label based on certain characteristics of the data. The idea is that a human expert will take the model's prediction and improve it based on their own knowledge and experience. However, it is not always clear if using the model will make the expert's predictions more accurate.
To help bridge this gap, a new approach has emerged where instead of giving just one label prediction, the system provides a set of potential labels. The human expert then picks one from this set. This method aims to prevent a situation where the expert second-guesses their own judgment, which can lead to mistakes.
However, while these systems can help, they can also do damage. An expert who is used to deciding confidently may defer to the suggested labels instead of trusting their own judgment, and because the expert is asked to always predict from the set, a set that leaves out the correct label turns what would have been a success into a guaranteed failure.
Goals of the Study
Our primary goal is to develop decision support systems that limit how much harm they can cause. We want to ensure that, on average, the harm caused is below a certain level that a decision-maker can set.
To do this, we start by laying out the definition of harm and testing how often these systems might cause harm based purely on human predictions.
Framework Development
We design decision support systems within a systematic framework that limits harm by applying a method called conformal risk control, which keeps the risk of causing harm below a predefined level.

To set this up, we first model how predictions are made, since context matters a great deal: we account for the factors surrounding each decision and how they influence the result.
Structural Causal Models
We use structural causal models (SCMs) to analyze the decision process. These models let us trace how different variables influence the prediction outcomes, and they help clarify the relationships between the inputs to the decision-making process and the conclusions eventually drawn.
Using these models, we can define the risk associated with relying on a decision support system. By assessing how often a decision made with the system would be worse than one made without it, we can identify circumstances when harm occurs.
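To make this concrete, the notion of harm can be written as a counterfactual comparison. The notation below is our own shorthand rather than the paper's exact formalism: Y stands for the true label, while the two hatted quantities are the expert's unaided prediction and the counterfactual prediction they would have made using the prediction set.

```latex
% Our shorthand, not necessarily the paper's exact notation:
%   Y                     true label
%   \hat{Y}_\emptyset     expert's prediction on their own
%   \hat{Y}_\mathcal{C}   counterfactual prediction using prediction set C
\[
  \text{harm}
    = \mathbf{1}\left[\hat{Y}_{\emptyset} = Y \,\wedge\, \hat{Y}_{\mathcal{C}} \neq Y\right],
  \qquad
  \text{design goal:}\quad \mathbb{E}[\text{harm}] \le \alpha,
\]
% where alpha is the user-specified limit on average counterfactual harm.
```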
Monotonicity Assumptions
To further analyze potential harm, we make two important assumptions: counterfactual monotonicity and interventional monotonicity.
Counterfactual Monotonicity: This assumption says that if an expert succeeds at a prediction with the system's help, they would also have succeeded making the same decision on their own, all else being equal; conversely, if they fail with the system, they would likely have failed without it as well.
Interventional Monotonicity: This assumes that as we add more labels to a prediction set, the chances of experts succeeding in their predictions either stay the same or improve.
Both of these ideas help us form the basis for estimating how often a decision support system may cause harm, allowing us to work on systems that better protect against making poor decisions.
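One contribution to harm can in fact be computed without any assumption: because the expert is asked to always predict from the suggested set, an instance where they were correct on their own but the set excludes the true label is guaranteed harm. The sketch below computes that directly observable fraction; the names and data are hypothetical, and the paper's estimators, which rely on the monotonicity assumptions above, go further than this simple bound.

```python
import numpy as np

def guaranteed_harm_fraction(alone_correct, set_covers_truth):
    """Fraction of instances where the expert succeeded unaided but the
    prediction set excludes the true label.

    Since the expert must predict from the set, each such instance flips
    a success into a failure with certainty, so this fraction is a
    directly computable lower bound on average counterfactual harm.
    """
    alone_correct = np.asarray(alone_correct, dtype=bool)
    set_covers_truth = np.asarray(set_covers_truth, dtype=bool)
    return float(np.mean(alone_correct & ~set_covers_truth))

# Hypothetical per-instance calibration records.
alone = [True, True, False, True, False, True]
covers = [True, False, True, True, True, False]
print(guaranteed_harm_fraction(alone, covers))  # 2/6, roughly 0.33
```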
Computational Framework
With our assumptions in place, we set out to design decision support systems that stay within user-defined limits on acceptable harm. We build on existing models of decision-making and extend them to strike a balance between accuracy and the risk of counterfactual harm.
Conformal Risk Control
Conformal risk control is a method that we can apply to manage the risk of harm effectively. This method focuses on ensuring that the expected loss from using the predictive model does not exceed a predetermined limit.
Through this framework, we can compute thresholds under which harm remains acceptable and identify which prediction sets lead to better outcomes. Rigorous checks within the model reveal how often a particular decision support system lowers accuracy and flag configurations whose potential for harm rises above the designated limit.
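As a sketch of what the calibration step might look like, the function below follows the generic conformal risk control recipe: given a harm loss bounded by one and nonincreasing in a set-size parameter lambda, it returns the smallest lambda whose inflated empirical risk on the calibration data stays below the budget alpha. Everything here, the names, the loss table, and the monotone parametrization, is our own illustration of the general technique, not the paper's exact algorithm.

```python
import numpy as np

def calibrate(harm_losses, lambdas, alpha):
    """Generic conformal risk control calibration step.

    harm_losses: (n, k) array where harm_losses[i, j] is the harm loss of
    calibration instance i under parameter lambdas[j]. Losses are assumed
    to lie in [0, 1] and to be nonincreasing in lambda (here, a larger
    lambda means a larger, less restrictive prediction set).
    Returns the smallest lambda whose adjusted empirical risk is at most
    alpha, which guarantees expected harm <= alpha on a fresh,
    exchangeable instance.
    """
    n = harm_losses.shape[0]
    for j, lam in enumerate(lambdas):  # lambdas sorted in ascending order
        adjusted_risk = (n * harm_losses[:, j].mean() + 1.0) / (n + 1)
        if adjusted_risk <= alpha:
            return lam
    return lambdas[-1]  # most permissive fallback if no lambda qualifies
```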
Practical Application
To validate our approach, we apply our framework to real datasets with actual human decisions. We seek to understand how well our systems work in practice and identify any patterns or trends. Through various experiments, we analyze how systems perform across different settings and ensure they align with the theoretical expectations we laid out earlier.
Our experiments allow us to evaluate the average counterfactual harm caused by these systems, giving us a clearer picture of their effectiveness in real-world scenarios. We compare results from different predictive models to see how they perform with the calibration sets, looking closely at the accuracy achieved by human experts versus the counterfactual harm experienced.
Trade-off Between Accuracy and Harm
One crucial discovery from our experiments is that there is often a trade-off between accuracy and potential counterfactual harm. When systems are designed to prioritize accuracy, they may inadvertently increase the likelihood of causing harm, leading to poorer outcomes for users.
This trade-off emphasizes the need to carefully calibrate decision support systems. By adjusting the parameters and thresholds used in our models, we can push for better accuracy while still keeping potential harm at bay.
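To see the trade-off as a knob one can turn, the synthetic sweep below calibrates the same kind of monotone loss table at several harm budgets. A tighter alpha forces a larger, more conservative parameter (bigger sets, smaller accuracy gains), while a looser alpha admits smaller, more decisive prediction sets. The data is randomly generated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
lambdas = np.linspace(0.0, 1.0, 21)  # larger lambda -> larger sets
# Synthetic monotone losses: instance i stops being harmed once lambda
# exceeds its cutoff, so losses are nonincreasing in lambda as required.
cutoffs = rng.uniform(0.0, 1.0, size=500)
harm_losses = (lambdas[None, :] < cutoffs[:, None]).astype(float)

n = harm_losses.shape[0]
for alpha in (0.05, 0.20, 0.50):
    ok = (n * harm_losses.mean(axis=0) + 1.0) / (n + 1) <= alpha
    lam = float(lambdas[np.argmax(ok)]) if ok.any() else float(lambdas[-1])
    print(f"alpha={alpha:.2f} -> smallest admissible lambda={lam:.2f}")
```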
Experimental Validation
To assess the validity of our framework, we use two different datasets of human predictions, following a structured experimental setup. This helps ensure we gather reliable data across varied conditions.
Dataset Overview
In our evaluations, one dataset features noisy images generated from natural images. Participants were asked to classify these images, providing a rich source of predictions based on different levels of difficulty. The second dataset involves predictions made by human subjects using decision support systems, allowing for a direct comparison against purely human predictions.
Results and Observations
From the data gathered, we found consistent patterns. Generally, the decision support systems caused some level of counterfactual harm regardless of the model used. The results also showed that the systems were effective in identifying harm-controlling sets, which were often conservative and did not include all potential options.
As we assessed the average accuracy achieved by human participants with different systems, we also noted that while accuracy improved, it often came at the expense of increased potential harm.
Discussion
Reflecting on our findings, several key points arise. First, we acknowledge that while the decision support systems improve accuracy, they can still cause counterfactual harm. This highlights the importance of keeping potential harm in check when designing these systems.
Moreover, we recognized that our analysis focuses primarily on average counterfactual harm. When these predictions carry significant consequences for individuals, ensuring fairness across different groups is essential. Addressing potential disparities in harm is critical and should be taken into account in future designs.
Limitations
There are also some notable limitations to our work. Our framework assumes specific conditions related to data samples and expert predictions. Hence, it is important to consider extending the model to encompass various factors such as distribution shifts and variations in label accuracy.
Additionally, our experiments were confined to the task of classifying noisy images. Testing our framework across different real-world applications would help refine its applicability and broaden its results.
Conclusion
In this study, we presented an approach to managing counterfactual harm in decision support systems based on prediction sets. By establishing a computational framework that incorporates principles of conformal risk control, we demonstrated how to control the frequency with which these systems can cause harm.
The experiments conducted highlight the importance of balancing accuracy and potential harm, supporting the notion that while enhancing decision-making processes through technology is beneficial, careful consideration must be given to the implications of such systems.
In our future work, we plan to further investigate the trade-off between accuracy and counterfactual harm in various domains, particularly in scenarios where decisions have high stakes, such as healthcare or legal contexts. By doing so, we hope to contribute to the creation of more responsible and effective decision support systems.
Title: Controlling Counterfactual Harm in Decision Support Systems Based on Prediction Sets
Abstract: Decision support systems based on prediction sets help humans solve multiclass classification tasks by narrowing down the set of potential label values to a subset of them, namely a prediction set, and asking them to always predict label values from the prediction sets. While these systems have been proven to be effective at improving the average accuracy of the predictions made by humans, by restricting human agency they may cause harm: a human who has succeeded at predicting the ground-truth label of an instance on their own may have failed had they used these systems. In this paper, our goal is to control how frequently a decision support system based on prediction sets may cause harm, by design. To this end, we start by characterizing the above notion of harm using the theoretical framework of structural causal models. Then, we show that, under a natural, albeit unverifiable, monotonicity assumption, we can estimate how frequently a system may cause harm using only predictions made by humans on their own. Further, we also show that, under a weaker monotonicity assumption, which can be verified experimentally, we can bound how frequently a system may cause harm again using only predictions made by humans on their own. Building upon these assumptions, we introduce a computational framework to design decision support systems based on prediction sets that are guaranteed to cause harm less frequently than a user-specified value using conformal risk control. We validate our framework using real human predictions from two different human subject studies and show that, in decision support systems based on prediction sets, there is a trade-off between accuracy and counterfactual harm.
Authors: Eleni Straitouri, Suhas Thejaswi, Manuel Gomez Rodriguez
Last Update: 2024-12-04
Language: English
Source URL: https://arxiv.org/abs/2406.06671
Source PDF: https://arxiv.org/pdf/2406.06671
Licence: https://creativecommons.org/licenses/by/4.0/
Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.
Thank you to arXiv for use of its open access interoperability.