Reliability Methods

Appendix C: Reliability Methods Sorted by Method Category

The following is an excerpt from The Process of Reliability Engineering, a book by Carl S. Carlson and Fred Schenkelberg. Within the book, see section 8.3.3, Potential reliability methods, for a listing of a wide range of reliability methods to consider when selecting the best methods to inform key decisions.

This appendix provides a listing of reliability methods sorted by category with decision type annotated. Keep in mind that this is not an exhaustive list of tools, techniques, or methods but a subset thereof that may be useful as you create a reliability plan. It is a list to provide awareness of common methods that aid in the creation of both a reliability plan and a highly reliable product.

Each listed method includes a very brief description of the method and the typical output. We have found many references for how to execute each of these methods, yet few detail why one would use a method. Therefore, we mention what each method provides to assist in matching the best method for your plan’s needs.

Type of decision a method addresses

After each method name in the listing of methods below is a term in parentheses indicating the type of decision the method often addresses. This additional information may assist you in identifying the right method for a specific key decision. The six different decision types are the following:

Prevention: What can we do now to avoid failures or improve reliability?
Comparison: Which design, vendor, or procedure option is better based on reliability considerations?
Priority: Where should we focus our resources to best improve reliability?
Resources: Who and when should accomplish a specific task?
Objective: How do we determine the reliability and availability performance objectives, goals, or requirements?
Measurement: What is the reliability performance now or expected to be in the future?

For a complete discussion of each type of decision, see Appendix B.

Graphic showing reliability method categories (requirements, risk reduction, assurance, and organizational methods) with associated sub-categories (11 total sub-categories) — **Figure C.1.** Categories and subcategories of reliability methods.

Categories of reliability methods

We are using four categories to organize the list of reliability methods. The first three—requirements, risk reduction, and assurance methods—follow a very basic product development process. The fourth category—organizational methods—includes process descriptions and guidelines. Each subcategory below has a short description repeated from Chapter 8 plus a list of methods that fit within that subcategory, including a short description of each method.

Figure C.1 shows the four main categories of methods along with 11 subcategories. The following gives a brief description of each category and subcategory along with a listing of methods within each subcategory.

Requirements methods

Requirements methods are the class of reliability methods that support the identification of reliability goals and targets. They also define the associated data measurement needs.

Set reliability targets

This category embraces information-gathering and requirements-setting methods that need to be measurable at the system, subsystem, and component level; verifiable during the product development timeframe; and converted into actionable technical specifications.

Customer and market analysis (Objective)

The evaluation of customer expectations concerning reliability performance, plus the evaluation of the reliability performance of other solutions available in the market. Customer and market analysis provides customer expectation information related to reliability performance.

Environment and use profiling or characterization (Objective)

The set of environmental factors and the range (or distribution) of values that apply (or are expected to apply) to an item during manufacturing, shipping, storage, and use. This includes factors such as weather (e.g., temperature, humidity, and wind) and usage (e.g., frequency, human interaction, misuse). Often includes details concerning potential sources of stress on an item. A range provides boundaries for the expected stress, whereas a distribution enables advanced forms of stress analysis.

Key performance indicators (Objective)

Also known as KPIs, these are measures of business processes or project characteristics. They provide a monitoring function to evaluate the success of an organization, process, procedure, supplier, etc. Managers often create and use KPIs for a specific project, then retire them when the project completes.

PESTLE analysis (Objective)

A framework based on six macro-environmental factors for strategic analysis or market research. The factors are political, economic, social, technological, legal, and environmental. The analysis uses descriptions of factors that may bound what is feasible when establishing reliability-related objectives.

Reliability allocation (Objective)

The process of breaking down system reliability and availability objectives to subsystems and major components. Reliability allocation specifies clearly stated reliability goals for all elements of an item. This information is useful when seeking suitable elements (parts or subsystems). It gives a means to compare the reliability performance of subsystems or elements to a specific and measurable objective.
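One simple way to make the breakdown concrete is equal apportionment, sketched below. It assumes the subsystems are independent and arranged in series, so each one receives the same target. The function name and the example numbers are illustrative only; weighted schemes that account for complexity or cost are common in practice.

```python
def equal_apportionment(system_reliability: float, n_subsystems: int) -> float:
    """Reliability each of n independent series subsystems must achieve."""
    if not 0.0 < system_reliability <= 1.0:
        raise ValueError("system_reliability must be in (0, 1]")
    if n_subsystems < 1:
        raise ValueError("n_subsystems must be at least 1")
    # For a series system, R_system = R_sub ** n, so R_sub = R_system ** (1 / n).
    return system_reliability ** (1.0 / n_subsystems)


# Hypothetical goal: 0.90 system reliability split across four subsystems.
subsystem_target = equal_apportionment(0.90, 4)
print(f"Each subsystem needs about {subsystem_target:.3f}")
```

With these assumed inputs, each subsystem target comes out near 0.974, which shows how quickly a modest system goal turns into demanding subsystem goals.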

Reliability goal setting (Objective)

The process of creating a statement that includes the function(s), environment and use conditions, probability of survival, and duration for an item. It is the reliability performance objective for the item under consideration. It provides a detailed statement that is measurable to communicate the desired reliability performance of an item. Included is sufficient information to guide decisions related to material selection, design architecture, and types of or expected environmental and use stresses.

SWOT analysis (Objective)

A technique used to identify strengths, weaknesses, opportunities, and threats related to business competition or project planning. SWOT analysis makes use of a concise description of factors that may influence or frame the establishment of a reliability vision or related objectives.

System effectiveness (Objective)

(a) For repairable systems and items: the probability that a system can successfully meet an operational demand within a given time when operated under specified conditions.

(b) For “one-shot” devices and nonrepairable items: the probability that the system will operate successfully when called upon to do so under specified conditions.

There are various specific elements defined by different authors, yet they generally comprise the concepts of availability, dependability, and capability.

Identify reliability data needs

Reliability data comprise the fuel that supports measuring and assuring that reliability requirements are attained. Data can take the form of test failures, field failures, degradation measurements, test or analysis successes, and other forms of information. The focus needs to be on data integrity and correct measurements.

Measurement system analysis (Measurement)

An experimental and mathematical method of determining the amount of variation that exists within a measurement process.

Metric monitoring & tracking system (Measurement)

A system to identify and establish the process to gather and report data required to assess progress or attainment of objectives such as the reliability goal, KPIs, etc. For each objective specified for a project or program, monitoring and tracking are performed on all specific measurables.

Risk-reduction methods

Risk-reduction methods are the class of reliability methods that support reliability in design, supplier reliability, and manufacturing reliability.

Design-in reliability

This category includes the design for reliability (DFR) methods that can be executed in the design and manufacturing stages of the product development process. Examples include failure mode and effects analysis, physics of failure modeling, design margin analysis, and highly accelerated life testing. The focus of the DFR methods should be on high-risk and new concepts.

Analysis of variance (ANOVA) (Comparison)

A basic statistical technique for determining the proportion of influence a factor or set of factors has on total variation. It subdivides the total variation of a data set into meaningful component parts associated with specific sources of variation to test a hypothesis on the parameters of the model or to estimate variance components. There are three models: fixed, random, and mixed. ANOVA provides a means to statistically analyze two or more factors at a time.

Derating (Prevention)

Designing the use of or selecting an item in such a way that applied stresses are below rated values. Derating supplies a means to create a system that is robust over a range of applied stress values, thus minimizing failures.

Design for manufacturing (DFM) and design for assembly (DFA) (Prevention)

Methods that use fundamental design principles and related design analysis methods, focusing on production manufacturing and assembly. They seek to optimize product design, manufacturing, and ease of assembly.

Design for reliability (DFR) (Prevention)

The deliberate creation of an item considering and facilitating its capability to meet reliability goals. DFR provides a set of tools and expectations to enable design decisions that address reliability performance objectives.

Design for six sigma (Prevention)

A set of techniques and tools for process improvement. Six sigma is characterized by five steps: define, measure, analyze, improve, and control (DMAIC). Design for six sigma expands on six sigma to include a focus on customer expectations or requirements to inform design decisions. It is also a set of tools to solve problems using a define, measure, analyze, design, and verify (DMADV) process. It is a methodology to design robust solutions, meaning that items can withstand a wide range of applied stress without failure.

Design for X (Prevention)

The deliberate creation of an item by considering and facilitating its capability to meet X business and customer expectations. X may be manufacturing, reliability, serviceability, environment, sustainability, or some other category. Design for X provides a framework for and an intentional focus on a specific set of considerations during the design process.

Design margin analysis (Prevention)

The engineering evaluation of the ratio of an item’s absolute strength to the expected or actual applied load. It expresses how much stronger the system is than it needs to be for an intended load.

Design of experiments (DOE) (Comparison)

The arrangement in which an experimental program is to be conducted and selection of the levels of one or more factors or factor combinations to be included in the experiment. It is also known as statistical design of experiment, as it entails a versatile statistical approach to screen for important factors, optimize factor settings, or detect important factor relationships or interactions.

Design review (Priority)

A milestone or periodic activity within a product development process whereby a design is evaluated against its requirements to verify the outcomes of previous activities and identify issues before committing to (and if need be reprioritizing) further work.

Error proofing (Prevention)

A method to identify and mitigate potential problems in product design or manufacturing processes by modifying the design or process to ensure that anticipated errors are less likely to occur during assembly or manufacturing.

Failure mode and effects analysis (FMEA) (Priority)

A qualitative and systematic analysis method intended to identify, prioritize, and reduce technical risks of failures to an acceptable level, with emphasis on improving product design or manufacturing process. FMEA is a structured process used to identify and prioritize potential failure modes that should receive an appropriate level of mitigation (elimination, reduction, or prevention).

Fault tree analysis (FTA) (Prevention)

A technique to explore the many potential or actual causes of product or system failure. FTA is a top-down analysis method, where one starts with a symptom or fault and then lists the many possible causes in a structured manner. It provides a structured way to organize information about the occurrence of the top event. FTA may be useful to create a system reliability model to estimate system reliability performance or to organize brainstorming potential paths of events that lead to the occurrence of the top event.

Finite element analysis (FEA) (Measurement)

The process of simulating the behavior of a part or assembly under given conditions so that it can be analyzed. FEA uses mathematical models to aid in understanding and quantifying the effects of real-world conditions on a part or assembly.

Hazard analysis (HA) (Priority)

A process of examining a system throughout its life cycle to identify and eliminate or mitigate to an acceptable level inherent safety-related risks. HA provides a prioritized assessment of real or potential conditions that may cause harm or undesired damage, enabling a focus on eliminating or mitigating to an acceptable level the most severe and/or frequent hazards.

Highly accelerated life testing (HALT) (Priority)

An experiment to quickly discover weaknesses in an item’s design or assembly process by applying increasing steps of one or more stress vectors until failures occur.

Human factor analysis (Prevention)

The systematic evaluation of interactions among humans and other elements of a system, and the application of theory, principles, data, and other methods to design in order to optimize human well-being and overall system performance. Human factor analysis supplies a means to assess and prioritize design improvements that minimize human-error-induced failures.

Level of repair analysis (LORA) (Prevention)

The analytical method used to determine when an item should be repaired, replaced, or retired based on cost and operational requirements. LORA provides a strategy and supporting information useful when establishing a maintenance program for the analyzed system.

Maintenance task analysis (MTA) (Prevention)

The identification of the steps, spares, materials, tools, equipment, skill level, and facility requirement for a given repair task. MTA assists in understanding the logistics support necessary to execute the repair.

Markov chain modeling (Objective)

An analysis technique based on the idea that the system or components within a system may be in one of two states (functional or failed). The modeling includes the probability of failure and repair along with the relationship between items and provides a method to model complex repairable systems.
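As a minimal sketch of the two-state idea, the following steps a single repairable item through a discrete-time Markov chain and compares the result with the closed-form long-run availability. The per-step failure and repair probabilities are assumed values for illustration, and a real model would usually involve many items and states.

```python
def long_run_availability(p_fail: float, p_repair: float) -> float:
    """Closed-form steady-state probability of being in the functional state."""
    return p_repair / (p_fail + p_repair)


def iterate_chain(p_fail: float, p_repair: float, steps: int = 1000) -> float:
    """Propagate the probability of being functional through the chain."""
    p_up = 1.0  # start in the functional state
    for _ in range(steps):
        # Stay up with probability (1 - p_fail), or get repaired from down.
        p_up = p_up * (1.0 - p_fail) + (1.0 - p_up) * p_repair
    return p_up


# Assumed per-step probabilities: 1% chance of failure, 10% chance of repair.
iterated = iterate_chain(0.01, 0.10)
closed_form = long_run_availability(0.01, 0.10)
print(f"iterated {iterated:.4f}, closed form {closed_form:.4f}")
```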

Monte Carlo reliability modeling (Objective)

A mathematical technique that generates random variables (time-to-failure distribution when modeling the reliability of a system) for modeling risk or uncertainty of a system. The approach incorporates all available information and relationships including continuous, discrete, and categorical data. It provides a means to aggregate information that may be discrete, continuous, or the output of deterministic and/or stochastic processes.
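The sketch below illustrates the sampling idea on a deliberately simple case: a series system of independent components with exponential times to failure. It is chosen because the simulated answer can be checked against a closed-form result. The failure rates, mission time, and function name are assumptions for illustration.

```python
import random
from math import exp


def simulate_series_reliability(failure_rates, mission_time, trials=20_000, seed=1):
    """Estimate the probability that a series system survives the mission."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    survived = 0
    for _ in range(trials):
        # A series system fails at the earliest component failure.
        system_life = min(rng.expovariate(rate) for rate in failure_rates)
        if system_life > mission_time:
            survived += 1
    return survived / trials


rates = [0.001, 0.002]  # assumed failures per hour for two components
estimate = simulate_series_reliability(rates, mission_time=100.0)
exact = exp(-sum(rates) * 100.0)  # closed form for this special case
print(f"simulated {estimate:.3f} vs. closed form {exact:.3f}")
```

The value of the simulation approach shows up when no closed form exists, for example with mixed distributions, repair, or shared loads. The same loop structure still applies.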

Petri net modeling (Objective)

An analysis technique that is a general-purpose graphical and mathematical tool for describing relationships between conditions and events. Petri net modeling provides a method to model complex repairable systems.

Physics of failure (PoF) (Measurement)

The approach for the design and development of a reliable product to prevent failure. The approach is based on knowledge of root causes of failure mechanisms using failure-mechanism-based models or simulation tools. In PoF, a set of failure-mechanism-specific models useful when designing for reliability is constructed.

Reliability block diagram (RBD) (Objective)

A drawing with blocks for each element of a system and connecting lines representing the series, parallel, and more complex relationships between elements. Statistically, an RBD contains the probability of success (reliability) information for each block. For a single point in time (e.g., the warranty period), the probability may be a single number. The probability of success could also be a reliability function based on distributions or nonparametric models. RBDs provide a structure to perform reliability allocation and are also useful to gather reliability measurements to estimate system reliability or to identify elements for improvement priority.
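For the single-point-in-time case, the series and parallel relationships reduce to two short formulas. The sketch below assumes independent blocks, and the block values are made up for illustration.

```python
from math import prod


def series(reliabilities):
    """All blocks must work: multiply the block reliabilities."""
    return prod(reliabilities)


def parallel(reliabilities):
    """At least one block must work: one minus the product of unreliabilities."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical system: one controller in series with two redundant pumps.
system_reliability = series([0.99, parallel([0.90, 0.90])])
print(f"System reliability: {system_reliability:.4f}")
```

Because the functions accept and return plain numbers, nested structures can be built by passing the result of one call into another, as the example does.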

Safety margin, design margin, or factor of safety (Prevention)

A ratio of an item’s absolute strength to the actual applied load. It expresses how much stronger a system is than it needs to be for an intended load. Using such a ratio enables one to account for expected and unexpected variation in applied loads to minimize item failure.

Stress–strength analysis (Prevention)

An analysis of the strength of the materials and the interference of the stresses placed on the materials, where “materials” are not necessarily the raw goods or parts but can be an entire system. Stress–strength analysis provides a means to account for the range of expected and unexpected loads, thus minimizing failures.
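When both stress and strength can be treated as independent, normally distributed quantities, the interference has a closed form. The sketch below relies on that assumption, which will not suit every application, and all numeric values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def interference_reliability(mean_strength, sd_strength, mean_stress, sd_stress):
    """Probability that strength exceeds stress for independent normal variables."""
    z = (mean_strength - mean_stress) / sqrt(sd_strength**2 + sd_stress**2)
    return NormalDist().cdf(z)


# Hypothetical values in consistent units (e.g., MPa).
reliability = interference_reliability(500.0, 40.0, 350.0, 30.0)
print(f"Probability that strength exceeds stress: {reliability:.5f}")
```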

Tolerance analysis (Prevention)

The calculation of the allowable deviation from a standard or nominal value that maintains fit, form, and function. Tolerance analysis provides a means to account for the natural variation of parts and fittings.
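Two common ways to combine individual tolerances in a one-dimensional stack are worst case and root sum of squares (RSS). The sketch below shows both. The tolerance values are hypothetical, and the RSS result assumes independent, roughly normal part variation.

```python
from math import sqrt


def worst_case_stack(tolerances):
    """Every part sits at its limit at once: add the tolerances."""
    return sum(abs(t) for t in tolerances)


def rss_stack(tolerances):
    """Statistical stack: square root of the sum of squared tolerances."""
    return sqrt(sum(t * t for t in tolerances))


part_tolerances = [0.03, 0.04, 0.12]  # assumed +/- values in mm
worst = worst_case_stack(part_tolerances)
rss = rss_stack(part_tolerances)
print(f"worst case +/-{worst:.2f} mm, RSS +/-{rss:.2f} mm")
```

The gap between the two results indicates how much margin the worst-case approach reserves for a combination of extremes that is unlikely to occur.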

Achieve supplier reliability

Supplier reliability can be achieved by incorporating reliability specifications and tasks into supplier bid packages, selecting suppliers who are capable of achieving reliability objectives, identifying critical supplier parts, and reviewing and approving supplier tasks for critical parts before shipment.

Acceptance sampling (Comparison)

An inspection in which the decision to accept an item (often a lot) depends on the results of the inspection. Acceptance sampling provides a cost-effective and statistically based assessment of a lot of items.
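For a single sampling plan, the chance of accepting a lot can be sketched with the binomial distribution. This assumes the lot is large relative to the sample, and the plan parameters shown are hypothetical.

```python
from math import comb


def probability_of_acceptance(sample_size, acceptance_number, defect_rate):
    """P(defectives found <= acceptance_number) under a binomial model."""
    return sum(
        comb(sample_size, k)
        * defect_rate**k
        * (1.0 - defect_rate) ** (sample_size - k)
        for k in range(acceptance_number + 1)
    )


# Hypothetical plan: inspect 50 units, accept the lot if at most 1 is defective.
p_accept = probability_of_acceptance(50, 1, defect_rate=0.02)
print(f"Chance of accepting a lot that is 2% defective: {p_accept:.3f}")
```

Evaluating the function over a range of defect rates traces out the operating characteristic curve for the plan.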

Approved vendor list (Priority)

A list of vendors of components, subsystems, or systems that meet specified criteria and/or have an acceptable performance record when used in the organization’s products or systems. This subset of potential vendors is used to minimize selecting unacceptable vendor products and to minimize contractual agreements with suppliers.

Corrective and preventive action (CAPA)

A process initiated by the identification of a nonconforming or other unwanted outcome that may include root cause analysis, short-term and long-term solutions, and follow up to check on the effectiveness of implemented solutions. CAPA affords a means to track, prioritize, and avoid future similar or related nonconformities or unwanted outcomes.

Vendor selection process (Comparison)

An established process to assess potential vendors, often utilizing more than one criterion. Criteria may include technical capability, quality and reliability performance, financial stability, ability to deliver on time, responsiveness to requests, cost, and environmental impact. Vendor selection often occurs when first considering engaging with a new supplier or on a regular basis as an audit. The goal is to offer a means to avoid contracting or buying items or services from unsuitable suppliers.

Implement reliable manufacturing

Manufacturing reliability is essential to ensuring that manufacturing and assembly operations do not reduce the inherent design reliability of products. Steps must be taken to control manufacturing processes so that they are both stable and capable.

Hazard and operability study (HAZOP) (Prevention)

A systematic investigation of a current or future operation. The goal of a HAZOP is to identify and evaluate problems within a plant environment that could pose a risk to employees or equipment.

Highly accelerated stress screening (HASS), highly accelerated stress audit (HASA), environmental stress screening (ESS), and burn-in (Priority)

Variations on the technique to apply one or more stresses to an item to reveal items with latent defects while not adversely harming items without latent defects. Use of these techniques may provide information for the improvement of manufacturing processes.

Both HASS and HASA have stress conditions derived from prior HALT and often use multiple stresses applied over a short duration at high levels.

In ESS, the stresses applied are typically at expected or moderately high levels of weather stress (e.g., temperature, humidity, and rain) and/or use conditions (e.g., frequency of use, insect or animal exposure, and vibration).

Burn-in refers to the application of low to modest stress (often involving only one stress) applied to detect a known failure mechanism for repair or removal.

Layer of protection analysis (LOPA) (Measurement)

A technique used to determine the effectiveness of safeguards and safety-instrumented functions for providing risk reduction as protection from hazards identified for a given process. The identified hazards can be the result of an earlier HAZOP.

Process capability (Comparison)

A statistical estimate that describes the process’s ability to fulfill the characteristic’s requirements with a stable, in-control process. It is a measure of an item’s inherent variability as a product of the processes involved and provides a means to assess whether the output of a process is within the requirements or specifications.
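The usual capability indices, Cp and Cpk, can be sketched as follows. The process mean, standard deviation, and specification limits are assumed values, and the indices are only meaningful for a stable process with roughly normal output.

```python
def process_capability(mean, std_dev, lsl, usl):
    """Return (Cp, Cpk) for a process against lower and upper spec limits."""
    cp = (usl - lsl) / (6.0 * std_dev)  # spread only
    cpk = min(usl - mean, mean - lsl) / (3.0 * std_dev)  # spread and centering
    return cp, cpk


# Hypothetical process: mean 10.2, standard deviation 0.2, specification 9.0 to 11.0.
cp, cpk = process_capability(10.2, 0.2, lsl=9.0, usl=11.0)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

Cpk falls below Cp here because the assumed process mean sits closer to the upper limit than to the center of the specification.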

Statistical process control (SPC)

The set of activities using statistical techniques to reduce variation, increase knowledge of the process, and direct the process in a desired manner. SPC is often associated with control charting and provides a means to identify and minimize unwanted process variation and maintain process stability.

Assurance methods

Assurance methods are the class of reliability methods that support verification of reliability requirements, continuously improving reliability, and maintaining high reliability throughout the life of the product.

Verify that reliability requirements are met

This category involves using physical testing methods, as well as analytical modeling techniques. The focus needs to be on analysis and accelerated life testing. Supplier reliability requirements for critical parts must be verified before shipment.

Bayes success-run theorem (Measurement)

A useful method, based on the binomial distribution, that can be used to determine an appropriate risk-based sample size for process validations. The Bayes success-run theorem is as follows: R = (1 − C)^(1/n), where R = reliability (or probability of success), C = confidence level, and n = sample size.
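Solving the relationship for n gives the sample size needed to demonstrate a target reliability. This applies only when every sampled unit passes (zero failures), which is an assumption of this sketch. The function names are illustrative.

```python
from math import ceil, log


def demonstrated_reliability(confidence, n):
    """R = (1 - C) ** (1 / n) for n consecutive successes."""
    return (1.0 - confidence) ** (1.0 / n)


def success_run_sample_size(reliability, confidence):
    """Smallest n with zero failures that demonstrates R at confidence C."""
    return ceil(log(1.0 - confidence) / log(reliability))


n_95_95 = success_run_sample_size(0.95, 0.95)
n_90_90 = success_run_sample_size(0.90, 0.90)
print(f"95/95 needs {n_95_95} units; 90/90 needs {n_90_90} units")
```

For these inputs the calculation gives 59 units for 95% reliability at 95% confidence and 22 units for 90% reliability at 90% confidence.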

Environmental testing (Measurement)

A series of experiments to determine whether an item will successfully function within the range of expected or specified environmental conditions, such as temperature exposure, salt fog exposure, fungus growth, or insect infestation.

Hypothesis testing (parametric, nonparametric, and graphical methods) (Comparison)

A set of statistical techniques that compare a population of items to a specification or to another population. Example techniques include the t-test to compare a mean to a specification or to another mean using samples, the F-test to compare variances, and box plots to graphically compare two samples or populations. Hypothesis testing provides a means to quantify the detection of a statistically significant difference.

Life testing and accelerated life testing (ALT) (Measurement)

An experiment conducted to ascertain the time-to-failure characteristics of an item attributable to a specific failure mechanism. ALT shortens the time to failure by using more stressful conditions or higher use rates.
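How much a higher stress shortens the time to failure depends on the failure mechanism. For temperature-driven mechanisms, the Arrhenius model is one commonly used relationship. The sketch below uses it as an assumption rather than as a method named in the listing above, and the activation energy and temperatures are hypothetical.

```python
from math import exp

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration_factor(activation_energy_ev, use_temp_c, stress_temp_c):
    """Ratio of life at use temperature to life at stress temperature."""
    use_k = use_temp_c + 273.15
    stress_k = stress_temp_c + 273.15
    return exp(
        (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / stress_k)
    )


# Assumed mechanism with 0.7 eV activation energy, 40 C use, 85 C test.
af = arrhenius_acceleration_factor(0.7, use_temp_c=40.0, stress_temp_c=85.0)
print(f"Acceleration factor: {af:.1f}")
```

Under these assumptions the factor is roughly 26, so one hour at the test temperature would represent about 26 hours at the use temperature for that one mechanism.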

Parts count prediction (Measurement)

The analysis of parts and components to predict the rate at which an item fails. In parts count prediction, the expected or known failure rates for all components within an item are tallied to estimate the item’s overall failure rate. (However, this method is proven to be very inaccurate at estimating the actual field failure rate.)
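The tally itself is simple arithmetic, as the sketch below shows. It assumes constant failure rates and a series arrangement in which any part failure fails the item. The part names and FIT values (failures per billion device-hours) are made up for illustration, and the accuracy caveat above still applies.

```python
def parts_count_failure_rate(parts):
    """Sum quantity * failure rate over (quantity, FIT) pairs."""
    return sum(quantity * fit for quantity, fit in parts)


# Hypothetical bill of materials: part type -> (quantity, FIT per part).
bill_of_materials = {
    "resistor": (10, 2.0),
    "capacitor": (5, 5.0),
    "microcontroller": (1, 55.0),
}
total_fit = parts_count_failure_rate(bill_of_materials.values())
mtbf_hours = 1e9 / total_fit  # FIT is failures per 1e9 hours
print(f"Total {total_fit:.0f} FIT, MTBF {mtbf_hours:.0f} hours")
```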

Regression analysis (Weibull analysis) (Measurement)

A statistical study of the relationship between two or more variables. Regression analysis is a process used to define the mathematical relationship or model between two or more variables. When analyzing time-to-failure data, this is commonly called Weibull analysis in which one may or may not attempt to fit a Weibull distribution to the data.
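One way the regression and Weibull ideas meet is median rank regression, sketched below. It assumes complete (uncensored) failure data and uses Benard's approximation for the plotting positions. The failure times are hypothetical, and other fitting methods such as maximum likelihood are also widely used.

```python
from math import exp, log


def weibull_mrr(failure_times):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression."""
    times = sorted(failure_times)
    n = len(times)
    xs = [log(t) for t in times]
    # Benard's approximation for the median rank of the i-th ordered failure.
    ys = [log(-log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = sxy / sxx  # slope of the least-squares line
    intercept = y_bar - beta * x_bar
    eta = exp(-intercept / beta)  # intercept equals -beta * ln(eta)
    return beta, eta


# Hypothetical times to failure in hours.
beta, eta = weibull_mrr([150, 340, 560, 800, 1130, 1720])
print(f"beta = {beta:.2f}, eta = {eta:.0f} hours")
```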

Reliability growth and management (Measurement)

The improvement in a reliability parameter caused by the successful correction of deficiencies in item design or manufacture. It entails the systematic planning for reliability achievement as a function of time and other resources and controlling the ongoing rate of achievement by reallocation of resources based on comparisons between planned and assessed reliability values.

Simulation (Measurement)

An approach in which one attempts to model a real-world system by representing key characteristics using different conditions over time. Simulation modeling for engineering analysis is often less expensive than creating prototypes.

Continuously improve reliability

The value of physical testing should be enhanced through use of appropriate reliability growth models and life data analysis on a continuous basis. A failure reporting, analysis, and corrective action system (FRACAS) can be instituted. Reliability improvement methods should be continuously used throughout the product life cycle.

Alpha testing (Priority)

A form of verification testing accomplished by providing late-stage prototypes or pre-release production units to employees to use under normal conditions with an obligation to provide feedback and defect reporting.

Beta testing (Priority)

A form of verification testing accomplished by providing late-stage prototypes or pre-release production units to select customers to use under normal conditions with an obligation to provide feedback and defect reporting.

Early field results analysis (Priority)

The evaluation of customer complaints, product returns, and associated data for a specified period of time since initial product launch.

Failure analysis (FA) (Prevention)

Subsequent to a failure, the logical systematic examination of an item, its construction, application, and documentation to identify the failure and determine the failure mechanism and its basic cause. FA provides the necessary information to identify suitable short-term and long-term solutions to avoid similar failures in the future.

Failure reporting, analysis, and corrective action system (FRACAS) (Priority)

A closed-loop system of data collection, analysis, and dissemination to identify and correct failures of a product or process. FRACAS offers a process to track from identification to a suitable resolution of all problems, issues, or failures encountered.

Field data analysis (Measurement)

The mathematical and/or graphical analysis of time-to-failure data of items in use by customers. Analysis of field data enables an estimate of the actual field time-to-failure distribution or failure rate.

Nevada chart (Measurement)

A way to organize field return data that captures the time-to-failure and censored data. A Nevada chart uses sales and returns and organizes the count of returns by the month or week of sale in which the units originated and by when each return was reported or arrived. This convenient ongoing chart enables one to tally returns as they arrive while preserving the necessary information for analysis.
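A minimal sketch of the layout follows, with one row per shipment period and one column per return period. The period labels, shipment quantities, and return records are all hypothetical, and the dictionary structure is just one convenient representation.

```python
from collections import Counter


def nevada_chart(shipments, returns):
    """Tabulate returns by shipment period (rows) and return period (columns).

    shipments: {ship_period: units shipped}
    returns: iterable of (ship_period, return_period) pairs, one per returned unit
    """
    returns = list(returns)
    counts = Counter(returns)
    periods = sorted(set(shipments) | {ret for _, ret in returns})
    chart = {}
    for ship in sorted(shipments):
        # A unit cannot be returned before the period in which it shipped.
        row = {ret: counts.get((ship, ret), 0) for ret in periods if ret >= ship}
        chart[ship] = {
            "shipped": shipments[ship],
            "returns": row,
            # Units not yet returned are the censored (still operating) items.
            "unreturned": shipments[ship] - sum(row.values()),
        }
    return chart


shipments = {"2024-01": 100, "2024-02": 120, "2024-03": 90}
returns = [
    ("2024-01", "2024-02"),
    ("2024-01", "2024-03"),
    ("2024-01", "2024-03"),
    ("2024-02", "2024-03"),
]
chart = nevada_chart(shipments, returns)
for ship_period, row in chart.items():
    print(ship_period, row)
```

Each cell gives a count of failures at a known age (return period minus shipment period), and the unreturned column supplies the censored units needed for a life data analysis.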

Ongoing reliability testing (ORT) (Measurement)

The repetitive sampling of items from production used for a set of experiments to detect changes that impact time-to-failure performance.

Pareto analysis (Priority)

A graphical display of count, frequency, or percentages of failure modes, mechanisms, or sources of variation ordered from highest to lowest contribution to overall product or system issues. A Pareto analysis provides a simple method for an organization to identify the costliest or most often occurring events.
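The ordering behind the chart can be sketched in a few lines. The failure mode labels and counts below are hypothetical, and the cumulative percentage column is a common addition rather than a requirement.

```python
from collections import Counter


def pareto_table(observations):
    """Return (category, count, cumulative percent) rows, largest count first."""
    counts = Counter(observations)
    total = sum(counts.values())
    table = []
    running = 0
    for category, count in counts.most_common():
        running += count
        table.append((category, count, 100.0 * running / total))
    return table


# Hypothetical failure records, one entry per observed failure.
failures = ["solder crack"] * 5 + ["connector"] * 3 + ["firmware"] * 2
table = pareto_table(failures)
for category, count, cumulative in table:
    print(f"{category:<14}{count:>3}{cumulative:>8.1f}%")
```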

Production reliability acceptance testing (PRAT) (Measurement)

Testing performed to measure any degradation in the reliability of a product over the course of production or to assure that products being delivered meet the customer’s reliability requirements and/or expectations.

Root cause analysis (RCA) (Prevention)

A structured process or procedure to determine the most underlying or fundamental factor or reason for a failure. RCA is used to gather the necessary information to identify suitable short-term and long-term solutions to avoid similar failures in the future.

Warranty analysis (Measurement)

The statistical and/or graphical analysis of data related to the expected or actual item failures during the warranty period. Warranty analysis is performed to estimate the warranty expenses for a product using reliability estimations. In addition, the analysis of existing warranty claims data may reveal patterns or trends suitable for product improvements or for forecasting the expected number of future warranty claims.

Maintain high reliability throughout life

Establishing and implementing proper service and maintenance procedures will extend product life and help ensure safe and trouble-free usage. Understanding and addressing customer issues during field usage is important.

Condition-based maintenance (CbM) (Prevention)

A proactive maintenance technique that uses real-time data (collected through sensors) to identify when an asset’s performance or condition reaches an unsatisfactory level.

Corrective maintenance (Prevention)

All actions performed as a result of failure to restore an item to a specified condition. These can include any or all of the following steps: localization, isolation, disassembly, interchange, reassembly, alignment, and checkout. Corrective maintenance provides a process to identify and implement the set of tasks to restore an item to service after a failure.

Maintenance planning (Prevention)

The process of identifying what work needs to be done with what specific materials, tools, equipment, and documentation. The process also addresses why a specific maintenance approach is taken, while detailing how to accomplish the work. It provides critical information in anticipation of upcoming maintenance activities for scheduling.

Maintenance scheduling (Prevention)

The process of determining when to accomplish specific maintenance tasks, giving detailed information on who will perform what maintenance and when.

Maintenance storeroom management (Prevention)

The coordinated effort to acquire, house, and maintain parts, tools, and materials necessary for maintenance tasks. Such management facilitates having available the items for kitting in an efficient manner.

Predictive maintenance (Prevention)

Techniques used to estimate the condition of in-service equipment to determine when to perform maintenance, providing a means to optimize preventative maintenance and equipment use while avoiding unwanted downtime.

Preventative maintenance (Prevention)

All actions performed in an attempt to retain an item in a specified condition by providing systematic inspection, detection, and prevention of incipient failures. Preventative maintenance provides a means to avoid unwanted downtime or equipment failures.

Reliability-centered maintenance (RCM) (Prevention)

A logical discipline for developing a scheduled-maintenance program that will realize the inherent reliability levels of complex equipment at minimum cost. RCM provides a framework to organize maintenance activities within an organization.

Spares forecasting (Measurement)

A process for estimating the expected demand for spare parts, components, subsystems, etc. necessary for future repairs or replacements. There are many potential processes for conducting the analysis, ranging from graphical or statistical analysis of historical demand to algorithms that account for minimum order quantities, discounts, order-to-arrival times, etc. Spares forecasting creates a demand forecast for each item used as a spare, which helps maintain the appropriate number of spares in stock and avoids having too many or too few.
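As one illustration of the statistical end of that range, the sketch below sizes a spares stock by assuming demand follows a Poisson distribution, a common simplification that holds only when the failure rate is roughly constant. The function names, the parameters, and the example figures are hypothetical and are not taken from the book.

```python
import math


def poisson_cdf(k: int, mean: float) -> float:
    """Return P(X <= k) for X ~ Poisson(mean)."""
    term = math.exp(-mean)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mean / i  # P(X = i) from P(X = i - 1)
        total += term
    return total


def spares_needed(failure_rate_per_hour: float,
                  operating_hours: float,
                  units_in_service: int,
                  confidence: float) -> int:
    """Smallest stock level that covers demand with the given probability.

    Assumes a constant failure rate and no resupply during the period.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    expected_demand = failure_rate_per_hour * operating_hours * units_in_service
    stock = 0
    while poisson_cdf(stock, expected_demand) < confidence:
        stock += 1
    return stock


if __name__ == "__main__":
    # Hypothetical fleet: 10 units, 2,000 hours each, 1e-4 failures per hour.
    print(spares_needed(1e-4, 2000, 10, confidence=0.95))
```

With these illustrative inputs the expected demand is about two failures, and the 95% stock level comes out at five spares. A practical forecast would also need to account for lead times, order quantities, and repairable returns, which this sketch ignores.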

Organizational processes

Organizational processes identify and establish needed organizational resources, institutionalize reliability methods, and advance the organization up the maturity matrix.

Establish organizational resources

Organizational resources include personnel, training, procedures, and business processes. The organizational resource tasks will help move the organization to higher stages in the maturity matrix.

Environmental test manual (Prevention)

A document providing technical details on weather, location, and use factors including statistical summaries and distributions, along with approaches for evaluating an item for those conditions where an item may operate. The manual comprises a documented set of environmental stresses common for an organization’s products and markets and details suitable testing approaches.

DFR handbook (Prevention)

A document detailing a range of design activities and tools that support the effort to create a reliable item. The handbook may include basic procedures and approaches, resources for additional information, best practices, and lessons-learned information. Specific sets of approaches, tools, and tasks are given to guide the organization’s DFR work.

Infrastructure investment (Resources)

An ongoing process in which an organization proposes, budgets, funds, procures, and installs capital equipment, involving deployment of a system to obtain items such as test chambers, expensive failure analysis equipment, etc.

Lessons-learned system (Prevention)

A compilation of an organization’s experiences that should be actively taken into account in future endeavors. It provides a method to avoid past mistakes or reinforce the use of best practices.

Professional development (Resources)

An informal or formal process within an organization to encourage, fund, or otherwise support ongoing education for employees, vendors, or customers. Such professional development provides a means to deploy or add reliability-related educational content or classes using an existing company system.

Risk management process (Priority)

The systematic application of management policies, procedures, and practices that help minimize risks, with the primary goal of maintaining operational efficiency in the face of unexpected events.

Safety margin and safety factor guidelines (Prevention)

A published set of safety margin requirements for specific situations or applications. Such guidelines offer policy guidance and details for use during the design and verification of an item.

System safety management (Priority)

All plans and actions taken to identify hazards; assess and mitigate associated risks; and track, control, accept, and document risks encountered in the design, development, test, acquisition, use, and disposal of systems, subsystems, equipment, and infrastructure.

Total quality management (TQM) (Priority)

The set of tools and practices used to manage company resources with a focus on customer satisfaction as the means to achieving sustained financial success. All levels of the organization participate in the development, implementation, and continuous improvement of processes and systems to support customer needs.

Institutionalize reliability methods

It is the responsibility of the engineering department to achieve reliability objectives. Reliability methods must be integrated into ongoing engineering procedures and tasks, including design reviews, supplier selection, work instructions, and design procedures.

Failure consequence and liability management (Priority)

An informal or formal process within an organization used to assess potential adverse risks related to product use, misuse, or failure and the potential consequences to the customer, environment, and organization. This process may include legal reviews, insurance policies, and responses or participation in civil or criminal legal actions. It provides a means to assess potential risks, address those risks, and respond appropriately to events that fall outside the normal warranty claims process.

Product development process work instructions (Resources)

A process to review each stage of the product development process work instructions for integration of appropriate reliability methods and tasks, including who implements the method or task and how and when it is done.

Warranty management (Objective)

A process to establish the terms of a product’s warranty, including duration and policies on replacement, repair, or compensation. Warranty management may also include the oversight and operation of call centers and repair centers, financial analysis, the reporting of warranty accruals and expenses, field data analysis, and related warranty establishment and servicing activities. The goal is to provide oversight and administration of product warranty-related policies and activities.

Advance up the maturity matrix

It is important for any organization to improve its capability to consistently achieve reliability objectives. The maturity matrix represents advancing stages of reliability maturity.

Reliability maturity assessment (Priority)

An analysis, based on the reliability maturity matrix, used to ascertain the actual organizational culture around reliability-related decisions.

Value stream mapping (Objective)

A technique to visually represent (in a manner similar to a flowchart) all the elements, including information, material, and process elements, that make up a value stream, a portion of a value stream, or a specific process. Value stream mapping enables existing or potential processes, including durations, dependencies, and interactions, to be studied.

Comments

al says
February 14, 2023 at 10:33 AM
Hi,
Cannot wait to see the published book.
Few items I did not see:
* CTQ- defining critical to quality which ties with Cpk
* Following industry standards; e.g. IPC 610 in electronics (class I, II, III)
* Design for redundancy
all the best,
- Fred Schenkelberg says
  February 14, 2023 at 10:47 AM
  Hi Al,
  Excellent additions – thanks for the suggestions
  cheers,
  Fred