Simple Science

Cutting edge science explained simply


Auditing Machine Learning Systems: A Practical Approach

A guide to ethical auditing methods for machine learning technologies.



Ethics in ML Auditing: a guide to ensuring responsible machine learning practices.

The use of Machine Learning (ML) systems is growing rapidly, bringing with it ethical issues and public concerns. There is a clear need to audit these systems to ensure they follow ethical standards. For auditing to become standard practice, two prerequisites must be in place: a lifecycle model tailored towards transparency and accountability, and a risk assessment process that properly scopes the audit.

This article explains a practical approach to auditing ML systems that extends the AI-HLEG guidelines published by the European Commission. Our auditing method is based on a lifecycle model that focuses on documentation, accountability, and quality assurance, serving as common ground between the auditors and the organizations being audited.

We describe two pilot studies involving real ML projects, discuss the challenges faced in ML auditing, and suggest future improvements.

The Need for Auditing ML Systems

With the rise of ML technologies, questions about their ethical use and their potential for bias have become pressing. Many organizations have created their own ethical AI guidelines, but these documents often do not change how developers actually make decisions, mainly because the guidelines are too vague and practical tools to support them are usually missing. As a result, developers may feel less responsible for their choices.

One solution to this issue is to apply auditing processes when designing and operating ML systems. Auditing can help ensure accountability and make ethics guidelines more effective. Audits can either be done internally by people in the same organization or by an external party, which is often seen as more trustworthy, especially for high-risk applications.

While external audits can be costly, they often provide more trust to stakeholders. On the other hand, internal audits can promote better documentation and risk assessments, increasing the overall traceability of the systems being audited.

The Challenge of Auditing

Conducting audits effectively requires clear conditions to be in place: both auditors and the audited organizations must understand the expectations and practices of the audit process, including which standards apply, how evidence is collected, what testing looks like, and the roles of the different individuals involved.

However, there is a lack of standard practices for assessing risks in ML systems. We propose a new auditing approach inspired by existing Information Systems auditing practices, tailored to address these challenges.

The Proposed Audit Procedure

Our auditing procedure consists of three main phases: planning, fieldwork/documentation, and reporting. While these phases may seem sequential, it is important to note that auditing ML systems should be an ongoing process, reflecting the rapid changes and iterations that often occur in ML development.

Planning Phase

The planning phase aims to set the scope of the audit and create a roadmap for the following phases. This involves reviewing previous audit reports and conducting a risk assessment. It is crucial to determine the resources and skills needed for the audit, often requiring a team with diverse backgrounds to cover the complexities of ML systems.

A good understanding of the overall system architecture and the processes involved in creating and deploying ML models is key during this phase. To achieve this, we follow a lifecycle model that emphasizes ethical principles and identifies key risks.

Lifecycle Model

A lifecycle model serves as a common reference for both auditors and the organizations being audited. Most existing models focus heavily on technical details and do not sufficiently incorporate ethical principles such as transparency and accountability.

We propose an enhanced lifecycle model that includes four main steps: formalization, data management, model management, and deployment. In addition to these steps, we introduce three important aspects that promote accountability and transparency:

  1. Agility of Each Phase: Instead of viewing the lifecycle as a linear process, our model treats each phase as iterative and emphasizes the need for documentation and quality checks throughout.

  2. Transparency and Accountability: Our model aligns the different phases with the roles and responsibilities of individuals involved, making it clear what documentation should be produced.

  3. Continuous Impact Assessment: The model highlights the importance of ongoing assessments of how the ML system affects its users and the context in which it operates.

Mapping the audited system onto this lifecycle model helps identify relevant phases and documents to collect, as well as tailor the risk assessment accordingly.
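As a minimal sketch of what this mapping could look like in practice, the snippet below ties the four lifecycle steps to responsible roles and the documentation an auditor would collect. The phase names follow the model above, but the roles and artefacts are hypothetical examples rather than a format prescribed by the article:

```python
from dataclasses import dataclass

@dataclass
class LifecyclePhase:
    """One iterative step of the lifecycle model."""
    name: str
    responsible_roles: list[str]    # who is accountable during this step
    required_documents: list[str]   # evidence the auditor expects to collect

# The four main steps named above, with illustrative roles and artefacts.
LIFECYCLE = [
    LifecyclePhase("formalization", ["product owner"], ["problem statement", "impact assessment"]),
    LifecyclePhase("data management", ["data engineer"], ["datasheet", "data quality report"]),
    LifecyclePhase("model management", ["ML engineer"], ["model card", "evaluation report"]),
    LifecyclePhase("deployment", ["operations engineer"], ["monitoring plan", "incident log"]),
]

def documents_to_collect(phases_in_scope: set[str]) -> list[str]:
    """Map the audited system onto the lifecycle: list the evidence to gather."""
    return [doc for phase in LIFECYCLE if phase.name in phases_in_scope
            for doc in phase.required_documents]

print(documents_to_collect({"model management", "deployment"}))
```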

Risk Assessment

Effective auditing relies on having a documented knowledge base of potential risks. For ML systems, this knowledge can be more difficult to gather. Our proposed risk assessment method uses the lifecycle model to simplify the analysis of risks by breaking it down into manageable components.

By using existing frameworks like the European Commission's Assessment List for Trustworthy Artificial Intelligence (ALTAI), we can develop relevant questions to ask during each phase of the lifecycle. This helps ensure that subsequent fieldwork and documentation are guided by these questions.
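As a sketch of this filtering step, the snippet below tags paraphrased ALTAI-style questions with lifecycle phases and selects those relevant to the audit scope. The question wording and the tagging scheme are our own illustrative assumptions:

```python
# Paraphrased ALTAI-style questions tagged by lifecycle phase (illustrative).
QUESTION_BANK = [
    {"phase": "formalization",    "theme": "human agency", "question": "Could users over-rely on the system's output?"},
    {"phase": "data management",  "theme": "privacy",      "question": "Is personal data minimised and lawfully processed?"},
    {"phase": "model management", "theme": "robustness",   "question": "Are performance thresholds defined and tested?"},
    {"phase": "deployment",       "theme": "transparency", "question": "Is model uncertainty communicated to users?"},
]

def questions_for(phases_in_scope: set[str]) -> list[str]:
    """Filter the question bank to the lifecycle phases under audit."""
    return [q["question"] for q in QUESTION_BANK if q["phase"] in phases_in_scope]

for question in questions_for({"model management", "deployment"}):
    print("-", question)
```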

Fieldwork and Documentation Phase

In this phase, the auditor collects evidence to verify compliance with regulations and assess the effectiveness of control measures through various tests. Evidence can be collected using two main mechanisms:

  1. Transparency Mechanisms: This involves reviewing information disclosed by the developers, such as datasheets and model cards.

  2. Examinability Mechanisms: Here, the auditor conducts experiments directly on the system to validate the information provided earlier.

Once evidence is gathered, the auditor can perform compliance testing, which checks for any discrepancies between the organization’s specifications and the actual implementation. This can help identify any weaknesses in quality assessments and documentation.
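A minimal sketch of such a compliance test, assuming the developers' claims come from a model card and the auditor re-measures the same metrics (all names and figures below are hypothetical):

```python
def compliance_check(declared: dict[str, float],
                     measured: dict[str, float],
                     tolerance: float = 0.02) -> list[str]:
    """Flag discrepancies between documented claims and audit measurements."""
    findings = []
    for metric, claimed in declared.items():
        if metric not in measured:
            findings.append(f"{metric}: claimed {claimed} but not re-measured")
        elif measured[metric] < claimed - tolerance:
            findings.append(f"{metric}: claimed {claimed}, measured {measured[metric]:.3f}")
    return findings

# Hypothetical model-card claims vs. the auditor's own test results.
print(compliance_check(declared={"accuracy": 0.95, "recall": 0.90},
                       measured={"accuracy": 0.91, "recall": 0.90}))
```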

Custom Testing

In cases where standard tests are not sufficient, auditors can create their own tests to evaluate aspects of the ML system that may not have been thoroughly assessed by the development team. While this approach can lead to some inconsistencies in the audit, it can be crucial for ensuring that all relevant factors are considered.
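The article does not prescribe concrete custom tests, but as one illustration, an auditor-defined probe might check how stable the system's predictions are under small input perturbations, an aspect a development team may not have assessed:

```python
import random

def stability_probe(predict, inputs, noise=0.01, trials=100):
    """Auditor-defined test: fraction of predictions unchanged under
    small Gaussian perturbations of the input features."""
    stable = 0
    for _ in range(trials):
        x = random.choice(inputs)
        x_noisy = [v + random.gauss(0, noise) for v in x]
        if predict(x) == predict(x_noisy):
            stable += 1
    return stable / trials

# Toy stand-in for the audited model: a simple threshold classifier.
model = lambda features: int(sum(features) > 1.0)
print(stability_probe(model, inputs=[[0.2, 0.9], [0.4, 0.4]]))
```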

Reporting Phase

After the audit is complete, the auditor compiles the results of the various tests and defines criteria for future audits. This could involve scheduling regular audits or conducting them when significant changes occur in the system or user feedback indicates a problem.

The auditor should ensure that any recommended mitigation measures are implemented before the next audit iteration.
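The re-audit criteria could be as simple as the following sketch, where the audit interval and feedback threshold are illustrative assumptions rather than values from the article:

```python
def reaudit_due(days_since_last_audit: int,
                significant_change: bool,
                user_complaints: int,
                max_interval_days: int = 365,
                complaint_threshold: int = 10) -> bool:
    """Trigger a new audit on schedule, after significant system changes,
    or when user feedback indicates a problem."""
    return (days_since_last_audit >= max_interval_days
            or significant_change
            or user_complaints >= complaint_threshold)

print(reaudit_due(days_since_last_audit=120, significant_change=True, user_complaints=2))
```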

Case Studies: Conducting the Pilots

Our auditing procedure has been tested through two pilot studies involving real ML applications. These pilots are not designed to serve as universal templates but rather to encourage dialogue about best practices in ML auditing.

Pilot 1: AI-Assisted Calibration System

This pilot involved an ML system that automates calibration processes for safety components, which have traditionally been conducted manually by engineers. The goal is to support engineers without replacing their expertise.

During the audit, we focused on the formalization, model management, and operationalization processes, since the data management step had already been reviewed. The lifecycle model helped structure discussions and identify relevant documentation.

Risk Assessment

The risk assessment process involved filtering questions relevant to the specific steps of the lifecycle model. Key ethical concerns arising from this audit included transparency, explainability, robustness, and safety.

  1. Transparency and Explainability: The system’s output is clear, but it does not adequately convey the uncertainty associated with its recommendations. Suggested improvements included logging user selections for validation (see the sketch after this list) and running experiments to assess over-reliance on the model’s output.

  2. Robustness and Safety: There is a need for thorough documentation of design decisions, failures, and post-market surveillance to ensure safety. Additionally, recommended performance thresholds should be established.
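The logging suggestion from point 1 could look like the following sketch, which records each model recommendation alongside the engineer's actual choice so over-reliance can be analysed later (the file name and record fields are hypothetical):

```python
import json, time

def log_selection(logfile, recommendation, confidence, user_choice):
    """Append one interaction record: the model's recommendation, its
    confidence, and what the engineer actually selected."""
    record = {
        "timestamp": time.time(),
        "recommendation": recommendation,
        "confidence": confidence,    # the uncertainty the UI should also display
        "user_choice": user_choice,
        "overridden": user_choice != recommendation,
    }
    with open(logfile, "a") as f:
        f.write(json.dumps(record) + "\n")

log_selection("selections.jsonl", recommendation="setting_A",
              confidence=0.72, user_choice="setting_B")
```

A high rate of un-overridden low-confidence recommendations in such a log would be one signal of over-reliance.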

Pilot 2: Geriatronics Project - Vision System

The second pilot focused on the vision module of GARMI, a robotic platform designed to assist elderly people. This project was still in the research stage, and while ethical concerns had been previously addressed, there was insufficient documentation for full auditing.

Risk Assessment

We identified several ethical requirements from the ALTAI that needed to be documented. The audit identified the need for better traceability of data and model management processes, as well as ensuring compliance with privacy regulations.

Documentation templates could help streamline the process, allowing the team to focus on integrating ethical considerations into system development moving forward.

Lessons Learned from the Pilots

The pilots highlighted several key takeaways for future auditing of ML systems:

  1. Auditability Criteria: Not all systems are suitable for auditing due to restrictions on data access or lack of necessary documentation. Establishing auditability criteria prior to planning is crucial.

  2. No One-Size-Fits-All Solution: Different ML systems may require varied auditing approaches based on their risk levels and specific contexts.

  3. Continuous Auditing: Early engagement between auditors and development teams can lead to better documentation and reduced compliance costs. Ongoing collaboration throughout the ML system’s lifecycle may help catch issues earlier and simplify future audits.

  4. Database of Documented Risks: Maintaining a database of past incidents and their associated risks can inform future audits and aid in developing effective preventive measures (a minimal sketch follows this list).
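As a minimal sketch of such a risk database, using SQLite for self-containment (the schema is our own assumption):

```python
import sqlite3

con = sqlite3.connect(":memory:")  # a file path would persist across audits
con.execute("""CREATE TABLE incidents (
    id INTEGER PRIMARY KEY,
    lifecycle_phase TEXT,   -- where in the lifecycle the risk materialised
    description TEXT,       -- what went wrong
    mitigation TEXT         -- preventive measure adopted afterwards
)""")
con.execute("INSERT INTO incidents (lifecycle_phase, description, mitigation) VALUES (?, ?, ?)",
            ("deployment", "users over-relied on uncalibrated scores",
             "display uncertainty and log user overrides"))

# Future audits can query past incidents by phase to seed the risk assessment.
for description, mitigation in con.execute(
        "SELECT description, mitigation FROM incidents WHERE lifecycle_phase = ?", ("deployment",)):
    print(description, "->", mitigation)
con.close()
```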

Conclusion

This article presents a practical approach to auditing machine learning systems, focusing on creating a common understanding between practitioners and auditors. Our proposed process includes a lifecycle model and risk assessment method that integrates ethical principles to enhance accountability and transparency.

We demonstrated our methodology through real-world examples and discussed the challenges that remain in this field. As the ML landscape evolves, developing standardized practices and adapting our auditing methods will be essential for ensuring ethical and trustworthy AI systems.

Original Source

Title: Pragmatic auditing: a pilot-driven approach for auditing Machine Learning systems

Abstract: The growing adoption and deployment of Machine Learning (ML) systems came with its share of ethical incidents and societal concerns. It also unveiled the necessity to properly audit these systems in light of ethical principles. For such a novel type of algorithmic auditing to become standard practice, two main prerequisites need to be available: A lifecycle model that is tailored towards transparency and accountability, and a principled risk assessment procedure that allows the proper scoping of the audit. Aiming to make a pragmatic step towards a wider adoption of ML auditing, we present a respective procedure that extends the AI-HLEG guidelines published by the European Commission. Our audit procedure is based on an ML lifecycle model that explicitly focuses on documentation, accountability, and quality assurance; and serves as a common ground for alignment between the auditors and the audited organisation. We describe two pilots conducted on real-world use cases from two different organisations and discuss the shortcomings of ML algorithmic auditing as well as future directions thereof.

Authors: Djalel Benbouzid, Christiane Plociennik, Laura Lucaj, Mihai Maftei, Iris Merget, Aljoscha Burchardt, Marc P. Hauer, Abdeldjallil Naceri, Patrick van der Smagt

Last Update: 2024-05-21

Language: English

Source URL: https://arxiv.org/abs/2405.13191

Source PDF: https://arxiv.org/pdf/2405.13191

Licence: https://creativecommons.org/licenses/by-sa/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arxiv for use of its open access interoperability.
