Connecting Safety and Software Change Management
A method to improve safety in software development for critical systems.
When we build safety-critical software, such as systems that control airplanes or medical devices, a failure can hurt people or damage the environment. Safety therefore has to be assessed continuously, not just once before release.
One way to show that we have thought about safety is a Safety Assurance Case (SAC). A SAC is a structured argument, backed by evidence, that the system is safe to use. It breaks top-level safety claims down into smaller claims, each supported by evidence such as test or simulation results. This helps everyone understand how decisions about safety were made.
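The claim-and-evidence structure described above can be pictured as a small tree. The following is only a sketch under stated assumptions: the `Claim` class, the `unsupported_claims` helper, and the example claims and evidence names are hypothetical and are not taken from the original paper or from any SAC tool.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """One node of a (hypothetical) safety assurance case."""
    text: str
    subclaims: list = field(default_factory=list)
    evidence: list = field(default_factory=list)  # e.g. names of test reports


def unsupported_claims(claim):
    """Return the text of every leaf claim that has no evidence attached."""
    if not claim.subclaims:
        return [] if claim.evidence else [claim.text]
    found = []
    for sub in claim.subclaims:
        found.extend(unsupported_claims(sub))
    return found


# Illustrative example only; claim wording and evidence names are made up.
root = Claim(
    "The drone does not enter restricted airspace",
    subclaims=[
        Claim("Airspace data is up to date", evidence=["simulation-run-12"]),
        Claim("Planned paths avoid restricted zones"),
    ],
)

gaps = unsupported_claims(root)
print(gaps)
```

A check like this only says where evidence is missing; it says nothing about whether the evidence that is present is still valid after a change.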
Creating and keeping these safety cases updated is not easy, especially when the software changes. This is because there is sometimes a gap between the work done on safety and the work done on software. People who focus on safety might not know enough about the software itself, and vice versa. Because of this disconnect, it can be hard to keep the safety arguments updated whenever changes occur in the software.
Safety experts also often lack good tool support for managing changes to SACs. So, when something in the software changes, it can be hard for them to work out how the change affects safety.
They want to know a few things when the software changes:
- Why did the software change?
- What risk is this change trying to fix?
- How could this change affect safety?
To make things easier, we can connect the safety arguments directly to the software changes. This means linking the relevant parts of the software to the safety arguments that depend on them, and keeping track of every change. This way, the people working on safety will know when something in the software changes and how it might affect safety.
Connecting Software and Safety
To tackle this problem, we propose a method to connect software changes with safety arguments in an organized way. Using a technique called Safety Artifact Forest Analysis (SAFA), we can automatically detect changes in the software. SAFA compares two versions of the system and creates a visual representation of what has changed. With this view, safety teams can see exactly what has been added, removed, or modified, and how it affects their safety arguments.
When the software changes, our method also helps capture the reasons for those changes. People who work on the software, like developers, can provide their reasons and other important details for each change. This helps safety teams understand whether the changes are good or bad for safety.
For instance, when designing a system for drones that are used in emergencies like search and rescue, one important requirement is that the drone must know where it can and cannot fly. This is crucial because a drone that flies into restricted airspace could cause an accident.
One requirement for the drones was that they needed to keep checking for airspace information while flying. In a newer version of the software, this was changed to check airspace only when planning new flight paths. Our system can show this change clearly and help safety experts assess if this is still safe.
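The version comparison described in this section can be sketched as a simple diff over artifacts keyed by ID. This is an illustration under assumptions, not SAFA's actual implementation: the `Artifact` class, the `diff_versions` function, and the artifact IDs are hypothetical, and `REQ-GEOFENCE` and `REQ-LOG` are invented so that the example shows all three kinds of change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    artifact_id: str  # hypothetical identifier, e.g. "REQ-AIRSPACE"
    text: str


def diff_versions(old, new):
    """Classify artifacts as added, removed, or modified between two versions."""
    old_by_id = {a.artifact_id: a for a in old}
    new_by_id = {a.artifact_id: a for a in new}
    shared = old_by_id.keys() & new_by_id.keys()
    return {
        "added": sorted(new_by_id.keys() - old_by_id.keys()),
        "removed": sorted(old_by_id.keys() - new_by_id.keys()),
        "modified": sorted(
            i for i in shared if old_by_id[i].text != new_by_id[i].text
        ),
    }


v1 = [
    Artifact("REQ-AIRSPACE", "Check airspace information continuously during flight."),
    Artifact("REQ-GEOFENCE", "Reject waypoints inside restricted zones."),
]
v2 = [
    Artifact("REQ-AIRSPACE", "Check airspace information when planning a new flight path."),
    Artifact("REQ-GEOFENCE", "Reject waypoints inside restricted zones."),
    Artifact("REQ-LOG", "Log every airspace check."),
]

changes = diff_versions(v1, v2)
print(changes)
```

Here the airspace requirement shows up as modified, which is the cue for a safety analyst to re-examine the arguments that depend on it.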
Linking All the Pieces
When developing safe systems, we start by identifying potential hazards in a preliminary hazard analysis (PHA). From there, we create Fault Trees (FT) or a Failure Mode, Effects, and Criticality Analysis (FMECA) to understand these risks better. While our focus is mainly on Fault Trees, the same ideas can apply to FMECA.
Fault Trees break a hazard down into smaller parts, showing which combinations of lower-level events could lead to a safety issue. This makes it easier to see where a risk comes from and where it can be mitigated.
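As a rough illustration of how a Fault Tree combines events, the sketch below models a tree with AND and OR gates and checks whether the top-level hazard occurs for a given set of basic events. The `Event` class, the `occurs` function, and the events in the example tree are hypothetical and only loosely based on the drone scenario above.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    gate: str = "BASIC"  # one of "BASIC", "AND", "OR"
    children: list = field(default_factory=list)


def occurs(event, active):
    """Return True if `event` occurs given the set of active basic events."""
    if event.gate == "BASIC":
        return event.name in active
    results = [occurs(child, active) for child in event.children]
    if event.gate == "AND":
        return all(results)
    if event.gate == "OR":
        return any(results)
    raise ValueError(f"unknown gate: {event.gate}")


# Hypothetical tree: the hazard needs a path through a restricted zone AND
# a failure of the airspace check (stale data OR a skipped check).
top = Event("drone enters restricted airspace", "AND", [
    Event("path crosses restricted zone"),
    Event("airspace check fails", "OR", [
        Event("airspace data is stale"),
        Event("airspace check is skipped"),
    ]),
])

hazard = occurs(top, {"path crosses restricted zone", "airspace data is stale"})
print(hazard)
```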
In our method, we make sure that there is a direct link between a Fault Tree and its relevant parts in the Safety Assurance Case. This link helps us tie together the safety arguments and the system’s design.
By keeping these connections clear and up to date, any change in the system can be followed along the links to the safety arguments that depend on it. This way, if something in the software changes, we can quickly see which parts of the safety case need to be reassessed.
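One way to picture this is as a graph of trace links that is walked from the changed artifact outward. The sketch below assumes the links are stored as a plain dictionary; the `impacted` function and all artifact names are hypothetical and do not describe the paper's tooling.

```python
from collections import deque


def impacted(links, changed):
    """Return every artifact reachable from `changed` via trace links.

    `links` maps an artifact ID to the IDs that depend on it. Results are
    listed in breadth-first order, so the closest dependents come first.
    """
    seen = {changed}
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for target in links.get(node, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


# Hypothetical trace links: code -> requirement -> fault tree node -> SAC claim.
links = {
    "code:AirspaceChecker": ["REQ-AIRSPACE"],
    "REQ-AIRSPACE": ["FT:stale-airspace-data"],
    "FT:stale-airspace-data": ["SAC:claim-avoids-restricted-airspace"],
}

to_review = impacted(links, "code:AirspaceChecker")
print(to_review)
```

The `seen` set keeps the walk from looping if the links happen to contain a cycle.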
Importance of Capturing Reasons
To truly understand how changes affect safety, we need to capture the reasons behind them. When a developer changes something in the code, they need to provide details about why they made that change. This can include what alternatives they considered and what led them to their final decision.
For example, if a safety requirement for a drone says that it should fetch real-time airspace data, changes to how this is done could impact safety. Maintaining a clear record of why and how these requirements change is essential. It provides context that safety analysts need to evaluate whether the new design is still safe or if more precautions are necessary.
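A rationale record of this kind could be captured in a small structure that also flags what the developer has not yet filled in. This is a sketch under assumptions: the `ChangeRationale` class, its field names, and the example reason and alternatives are hypothetical and were invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeRationale:
    artifact_id: str  # hypothetical ID of the changed artifact
    summary: str      # what changed
    reason: str       # why it changed
    alternatives_considered: list = field(default_factory=list)

    def missing_fields(self):
        """List the rationale fields a safety analyst would still need."""
        missing = []
        if not self.reason.strip():
            missing.append("reason")
        if not self.alternatives_considered:
            missing.append("alternatives_considered")
        return missing


# Illustrative record; the reason and alternative are made up for the example.
record = ChangeRationale(
    artifact_id="REQ-AIRSPACE",
    summary="Check airspace only when planning a new flight path.",
    reason="Reduce network load during flight.",
    alternatives_considered=["Keep continuous checks at a lower frequency"],
)
incomplete = ChangeRationale("REQ-AIRSPACE", "Changed check timing.", reason="  ")

print(record.missing_fields(), incomplete.missing_fields())
```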
Challenges Ahead
There are still many challenges to address when linking safety and software development. One key area is knowledge management. What do safety analysts need to know? How can we collect this information effectively? These are crucial questions that need answering.
Another challenge is analyzing changes intelligently. Software changes can happen at various levels. We want to distinguish between changes that are harmless and those that could be risky. For future approaches, we hope to use smarter systems to analyze changes and suggest actionable steps when safety might be at risk.
Lastly, we need to improve the tools used in this process. Current systems often lack the support needed to manage the connection between safety claims and software changes. Developing user-friendly tools that help maintain this connection, while making it easier for analysts and developers to do their work, is vital.
Conclusion
In summary, maintaining safety in software development is vital, especially for systems where lives could be at stake. Connecting safety arguments to software changes helps keep safety a priority throughout the development process.
By creating clear links between changes in software and safety arguments, and by capturing the reasons behind those changes, we can improve how we manage safety in software development. As we move forward, addressing the challenges we face will be essential for creating safer systems that can adapt to new changes while keeping everyone safe.
Title: Leveraging Traceability to Integrate Safety Analysis Artifacts into the Software Development Process
Abstract: Safety-critical system's failure or malfunction can cause loss of human lives or damage to the physical environment; therefore, continuous safety assessment is crucial for such systems. In many domains this includes the use of Safety assurance cases (SACs) as a structured argument that the system is safe for use. SACs can be challenging to maintain during system evolution due to the disconnect between the safety analysis and system development process. Further, safety analysts often lack domain knowledge and tool support to evaluate the SAC. We propose a solution that leverages software traceability to connect relevant system artifacts to safety analysis models, and then uses these connections to visualize the change. We elicit design rationales for system changes to help safety stakeholders analyze the impact of system changes on safety. We present new traceability techniques for closer integration of the safety analysis and system development process, and illustrate the viability of our approach using examples from a cyber-physical system that deploys Unmanned Aerial Vehicles for emergency response.
Authors: Ankit Agrawal, Jane Cleland-Huang
Last Update: 2023-07-14 00:00:00
Language: English
Source URL: https://arxiv.org/abs/2307.07437
Source PDF: https://arxiv.org/pdf/2307.07437
Licence: https://creativecommons.org/licenses/by/4.0/
Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.
Thank you to arXiv for use of its open access interoperability.