Software Design For Reliability

With all the complaints you hear about products rebooting and software crashing, do companies really practice Software Reliability? Some do, but mostly in industries that require high availability, such as telecom, defense, and space, or in safety-critical industries, such as medical and industrial plant operation. Most other industries pay far less attention to it. The best way to increase Software Reliability without significantly increasing schedules or budgets is to use a Software Design for Reliability (SDFR) approach. These are the key steps.

Software Reliability Assessment

Before starting a Software Reliability program, perform a Software Reliability Assessment to gauge your team’s capability to produce good software. Benchmark your development practices against industry best practices to ensure you have a solid foundation on which to integrate the other reliability techniques. The benchmark study will help you fill gaps by identifying the internal best practices and techniques that already yield the desired results. It will also help you define a set of reliability practices that move defect prevention and detection as far upstream in the development cycle as possible. Once you complete the assessment, choose the specific software techniques and integrate Software Reliability throughout the lifecycle of your program.

Software Reliability Integration in the Concept Phase

In the concept phase, there are two main Software Reliability techniques:

Software Reliability Goal Setting
Software Reliability Program Plan

Perform Software Reliability Goal Setting by defining system-level software reliability goals. These goals become part of the overall Software Reliability Program Plan. Apply the goals to the design and testing phases.
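To make a system-level goal concrete, it can help to translate it into numbers the team can design and test against. The sketch below is only an illustration, not part of the SDFR method itself: it assumes the goal is stated as an availability target, and it uses the standard steady-state relation availability = MTTF / (MTTF + MTTR). The function names and the use of mean time to repair (MTTR) are assumptions made for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability: float) -> float:
    """Allowed downtime per year, in minutes, for a target availability in [0, 1]."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


def availability_from_mttf(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = MTTF / (MTTF + MTTR)."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical goal: "five nines" availability.
    print(round(downtime_budget_minutes(0.99999), 3), "minutes of downtime per year")
    # Hypothetical measurements: 999 hours between failures, 1 hour to recover.
    print(round(availability_from_mttf(999.0, 1.0), 4), "achieved availability")
```

Stating the goal this way gives the design and testing phases a single figure to check against, whichever form of goal your program actually adopts.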

Software Reliability Integration in the Design Phase

In the design phase, there are six main Software Reliability techniques:

Facilitation of Team Design Template Reviews
Facilitation of Team Design Reviews
Software Failure Modes and Effects Analysis (SFMEA)
Software Fault Tree Analysis (SFTA)
Software Failure Analysis
Software Fault Tolerance

Use Facilitation of Team Design Template Reviews to conduct group pre-design review meetings, which give your team a forum to expand their knowledge of design techniques by exchanging design templates.

Use Facilitation of Team Design Reviews to improve design inspection results. Your team will get much more out of its inspections if they are preceded by brief, informal, highly interactive reviews held at multiple points in the progression from system architecture through low-level design.

Use Software Failure Modes and Effects Analysis (SFMEA) and Software Fault Tree Analysis (SFTA) to identify and mitigate failure modes in software, much as you would use FMEA and FTA for hardware. Before the final stage of a design, use Software Failure Analysis to identify core and vulnerable sections of the software that may benefit from additional run-time protection through Software Fault Tolerance techniques.
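As a rough illustration of the fault tree analysis mentioned above, the following sketch combines basic-event probabilities through AND and OR gates to estimate the probability of a top-level failure. It assumes the basic events are statistically independent, which real software failures often are not, and the event names and probabilities are hypothetical values chosen only for the example.

```python
from math import prod


def and_gate(probabilities):
    """The gate output fails only if every input fails (independent events)."""
    return prod(probabilities)


def or_gate(probabilities):
    """The gate output fails if at least one input fails (independent events)."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical basic-event probabilities per mission.
p_config_error = 0.001    # bad configuration is loaded
p_primary_crash = 0.01    # primary process crashes
p_failover_fault = 0.05   # failover logic does not take over

# Top event "service outage" = config error OR (primary crash AND failover fault).
p_redundancy_lost = and_gate([p_primary_crash, p_failover_fault])
p_outage = or_gate([p_config_error, p_redundancy_lost])

if __name__ == "__main__":
    print(f"P(redundancy lost) = {p_redundancy_lost:.6f}")
    print(f"P(service outage)  = {p_outage:.6f}")
```

Even a small tree like this shows which branch dominates the top event, which is one way to decide where additional fault tolerance is worth the effort.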

Software Reliability Integration in the Implementation Phase

In the implementation phase, there are two main Software Reliability techniques:

Facilitation of Code Reviews
Software Robustness and Coverage Testing

For Facilitation of Code Reviews, hold reliability reviews that target the core and vulnerable sections of code, so that the owner of the source code can work closely with a small team of developers to find defects. Focus your testing efforts on efficient detection of software faults, using Software Robustness and Coverage Testing techniques for thorough module-level testing.

Software Reliability Integration in the Testing Phase

In the testing phase, there are four main Software Reliability techniques:

Software Reliability Measurements and Metrics
Usage Profile-Based Testing
Software Reliability Estimation
Software Reliability Demonstration Tests

Use Software Reliability Measurements and Metrics to track the number of remaining software defects, to calculate the software mean time to failure (MTTF), and to anticipate when the software is ready for deployment. Apply Usage Profile-Based Testing to emphasize test cases according to their anticipated frequency of execution in the field. One important new technique in Software Reliability is Software Reliability Growth, which tracks how reliability improves as defects are found and fixed during testing.
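Two of the ideas above can be sketched in a few lines. The first function computes a simple point estimate of MTTF as total operating time divided by the number of observed failures. The second draws test cases in proportion to how often each operation is expected to run in the field. Both are minimal sketches under stated assumptions: the function names, the operation names, and the usage weights are hypothetical, and a real program would likely use a more detailed reliability model.

```python
import random


def mean_time_to_failure(total_operating_hours: float, failure_count: int) -> float:
    """Point estimate of MTTF: total operating time divided by observed failures."""
    if failure_count <= 0:
        raise ValueError("at least one observed failure is needed for this estimate")
    return total_operating_hours / failure_count


def pick_test_cases(usage_profile: dict, count: int, seed=None) -> list:
    """Draw `count` operations, each chosen with probability proportional to its weight."""
    rng = random.Random(seed)  # a seed makes the test selection reproducible
    operations = list(usage_profile)
    weights = [usage_profile[op] for op in operations]
    return rng.choices(operations, weights=weights, k=count)


if __name__ == "__main__":
    # Hypothetical test log: 1,200 hours of operation with 4 failures.
    print("Estimated MTTF (hours):", mean_time_to_failure(1200.0, 4))

    # Hypothetical usage profile: fraction of field executions per operation.
    profile = {"place_call": 0.70, "send_message": 0.25, "update_firmware": 0.05}
    selected = pick_test_cases(profile, count=20, seed=1)
    for operation in profile:
        print(operation, selected.count(operation))
```

Weighting test selection this way means the failures found in test are more likely to be the ones customers would actually hit, so an MTTF measured during such testing is a better predictor of field behavior.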

All of these steps appear as a large resource requirement in schedule and staff hours. But time and time again, full-program ROI analysis has shown that including this methodology brings a net gain in product and program performance, one that can be measured even before customer delivery.

-Adam

Comments

Larry Weaver says
January 3, 2018 at 5:46 PM
Thanks for explaining that software reliability Measurements and Metrics is the first technique of the testing phase. Having product reliability software that works through all of the techniques in the testing phase would be the type of software I would look for. If I were to get product reliability software, I would be sure to get it from a professional company.