If we have even a moderately complex product, we often need to break up our design team into smaller groups. Perhaps electronic, mechanical and hydraulic teams. Perhaps teams for specific functional components. And these teams need to work at least partially independently. So what do we do if we have a system reliability goal? What goals do we pass on to our smaller teams? This is where reliability allocation comes into play. Reliability allocation is often portrayed as a very complex, very involved and extremely exhaustive process. It should never be! In fact, the more exhaustive you make it, the less likely it is to work.
If you want to learn more about a straightforward approach to reliability allocation – read this!
Let’s say you are designing a product (… or a system, or a device). And that product has a bunch of functional elements that need to work together to make your product. Power supply. Housing. Sensor. Each functional element can be provided by a single component, subsystem, module or assembly. For simplicity – we will refer to these collectively as ‘components.’
Each component is different. Based on different technologies. Created by different design teams. Perhaps supplied by different suppliers. The design team leader is almost forced to treat the design of each component as its own ‘little design project.’
But there is a problem. You have a reliability goal for the overall product. Let’s say your product consists of nine components working together to do ‘something.’ And your product needs to be 95 percent reliable after three years. Well … all you need to do is say to all the design teams and suppliers of your nine components that their reliability goals are also 95 percent. Right?
Wrong. If you do the math and model the system, it turns out that nine components that each have 95 percent reliability will result in a product with 63 percent reliability. So a 5 percent probability of failure has inadvertently increased to 37 percent.
Welcome to reliability allocation:
… the assigning of numerical reliability goals to subsystems and components in support of system level reliability performance characteristics.
It is possible to work out component reliability goals that ensure we meet our product reliability goals. And when we come up with our component reliability goals, our smaller design teams can now work independently. But perhaps most importantly, we manage our money, budget, and design trade-offs in a smarter way.
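The arithmetic behind the nine-component example can be checked in a few lines. This is only a sketch. It assumes a pure series system (every component must work, and failures are independent), and the function names are made up for illustration. The equal split shown here is just the simplest possible starting point, not the allocation method this series recommends.

```python
import math


def series_reliability(component_reliabilities):
    """Reliability of a series system: the product of its component reliabilities.

    Assumes independent failures and that every component must work.
    """
    return math.prod(component_reliabilities)


def equal_allocation(system_goal, n_components):
    """Component goal if the system goal is split equally across n components.

    Hypothetical helper: the n-th root of the system goal.
    """
    return system_goal ** (1.0 / n_components)


# Nine components, each 95 percent reliable.
system = series_reliability([0.95] * 9)
print(f"System reliability: {system:.4f}")  # 0.6302

# What each of the nine components needs for a 95 percent system.
required = equal_allocation(0.95, 9)
print(f"Required component reliability: {required:.4f}")  # 0.9943
```

So under these assumptions, each component needs to be roughly 99.4 percent reliable after three years, not 95 percent.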
This article is the first of five on the topic of reliability allocation. We focus on the reliability design cycle in this article to set the scene for reliability allocation. Article #2 outlines the six steps of reliability allocation. Article #3 is perhaps the most important article – it talks about doing something! After all, reliability allocation is all about providing design guidance. Guidance for action. Article #4 covers the three ways of allocating reliability. And finally, Article #5 talks about different types and different considerations for reliability allocation.
A word of caution …
Most reliability allocation efforts fail. Fail badly. Why? Because after we allocate these goals to our nine components – we think our job is done! This is not even a little bit true. When we start to design products, there is (understandably) a great deal of uncertainty. If there wasn’t – then design and engineering would be really, really simple. We would essentially know what the final design looks like at the start!
The first reliability goals we allocate during the design process need to be part of a bigger design leadership activity. They become the initial guidelines or a starting point to help monitor our efforts. They need to be adjusted and optimized continually. When one team struggles to meet their allocated goal – we need to help them. And because we know this is bound to happen for at least some of our components (remember that uncertainty bit?) we need the rest of our design teams to be trying to exceed their allocated goals. Which allows us to ‘give’ our struggling design teams some of the reliability allocated to others.
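To see what ‘giving’ reliability to a struggling team can look like, here is a rough sketch. It again assumes a simple series system with independent failures. The function name and the numbers are hypothetical, chosen only to illustrate the idea.

```python
import math


def relaxed_goal(system_goal, other_estimates):
    """Lowest reliability one component can have while a series system
    still meets its goal, given current estimates for all other components.

    A result above 1.0 means the other components cannot carry the system
    on their own, so no relaxation is possible.
    """
    return system_goal / math.prod(other_estimates)


# Hypothetical scenario: system goal of 0.95 across nine components.
# Equal allocation would ask each team for about 0.9943.
equal_share = 0.95 ** (1.0 / 9)

# Eight teams are tracking slightly above their allocation.
others = [0.996] * 8

# The ninth team is struggling. How far can its goal be relaxed?
new_goal = relaxed_goal(0.95, others)
print(f"Original allocation: {equal_share:.4f}")
print(f"Relaxed allocation for the struggling team: {new_goal:.4f}")
```

A small surplus from eight teams adds up to meaningful relief for the ninth, which is why we want everyone else trying to exceed their goals.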
So reliability allocation will fail and make everyone in your organization frustrated if you don’t use it as part of your reliability design cycle.
The Reliability Design Cycle
There are so many design cycle process flows out there. Some of them are useful. Many of them are industry-specific. And some of them are rubbish. We need to talk about one right here. And we will try and keep it simple.
Hopefully this little process flow works. But before we break it down into anything more useful, we need to focus on the design bit.
Design is an iterative process. We keep coming back to the previous version of our design when we realize it can be improved. Which is a nicer way of saying that we need to fix something we messed up or hadn’t thought of. Analytical design and redesign happen before we build a prototype. Perhaps we use something like finite element modeling (FEM) to realize our strut isn’t as strong as it needs to be. So we change the blueprint there and then. Or better yet, when we have a real design review and someone quickly points out an issue that needs addressing, we have saved ourselves the time and effort it takes to do things like FEM.
I say ‘real’ design reviews to make a point of difference to ‘ritual’ design reviews. You know – the ones where everyone turns up with reams of paperwork and the aim of getting as many boxes ticked as quickly as possible without any disturbances to budget and schedule. Or lunch. These are the worst. They confuse effort (or meetings) with outcomes.
And then there is prototypical design. This is where we have exhausted all avenues of analysis, or where analytical design will now take too much time and money. Which often happens when our design matures. So we create a mock-up, a physical representation, a little product, or something that we can then go and test. And this then feeds back into our design to make it more robust.
But we all know that design is a little dirtier than this. It looks a little bit like this:
And even this might be a little too simplistic! But the main point here is that you need a design process that principally aligns with something like this one, but it must work for your organization or team.
And the bit we haven’t talked about is the reliability design cycle. The bit in the middle.
Instead of having the arrows of iteration, we have a more defined cycle that looks like this:
Now we are getting somewhere. The reliability design cycle starts with reliability allocation. Which then informs the smart design of our product. And this design means we can update the system reliability model. Which also means we can always estimate the reliability of our system at all stages of design. Uncertainties and all.
But more importantly, this model can help us understand what the vital few failure mechanisms are. The typically small number of weak points or design flaws that will cause most of our failures.
This is great! Only having one or two things we need to worry about means we don’t have to completely redesign our product, or strengthen every strut, increase the margin for every transistor or so on. That is a waste of money (and what we call overengineering). Instead, we only need to address a small part of our design to drastically improve reliability. This saves time, money and stress.
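One rough way to spot the vital few from a system reliability model is to rank each item by its share of the total unreliability. The sketch below assumes a series system where each unreliability is small, so the shares are approximate. The component names and estimates are invented for illustration, and the same idea applies if the items are failure mechanisms rather than components.

```python
def rank_by_unreliability(estimates):
    """Rank items by their approximate share of system unreliability.

    `estimates` maps a name to its current reliability estimate.
    Returns (name, share) pairs, largest share first.
    """
    unreliabilities = {name: 1.0 - r for name, r in estimates.items()}
    total = sum(unreliabilities.values())
    shares = {name: q / total for name, q in unreliabilities.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical current estimates from the system reliability model.
current = {
    "power_supply": 0.970,
    "sensor": 0.995,
    "housing": 0.999,
}

for name, share in rank_by_unreliability(current):
    print(f"{name:>12}: {share:.0%} of expected failures")
```

With numbers like these, one component accounts for most of the expected failures, so that is where the next round of analysis or testing should go.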
Once we know the vital few ways our product will fail, we can then simulate, analyse or test our product – focusing on these weak points. Which means we learn more about our dominant failure mechanisms, which then allows us to update our system reliability estimate.
And then we can ask ourselves – are we finished? Have we met our goals? Is our product ready to be manufactured? We typically go through this cycle several times before we get it right – so we may be asking ourselves this question a lot.
If we haven’t met our goals, then we decide to do something. Not just record in some formal document that we didn’t meet our goals and implore our teams to work harder. That isn’t leadership – that is angry cheerleading. We will come back to the idea of deciding to do something a little later.
And once we go through these steps, something really cool happens. Reliability grows. Quickly. And cheaply. And to keep reliability growing we need to keep going through the cycle. So we may need to reallocate reliability and keep going.
Just letting you know what we will be covering in this series
Reliability allocation focuses on … reliability. But there are some other things and metrics we will talk about as well. We will cover those things after we go through the reliability allocation process because the underlying principle needs to be taught first.
We will eventually talk about Availability Allocation – which is very similar to reliability allocation. And we will also talk about MTBF Allocation. The MTBF is not an ideal metric – and is not reliability. But there are some situations where this approach is necessary. We also begrudgingly talk about Maintainability Allocation. There are many fundamental problems with trying to control the maintainability in isolation. If a maintenance action takes a long time to complete, we can either reduce the number of times we do it (by improving reliability) or make our product easier to maintain. Or both. Which means you should really look at Availability Allocation instead.
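A quick sketch shows why availability ties the two levers together. It uses the common steady-state formula, availability = uptime / (uptime + downtime), expressed with a mean time between failures and a mean time to repair. That formula and the numbers below are assumptions for illustration, not something this series has defined yet.

```python
def steady_state_availability(mean_time_between_failures, mean_time_to_repair):
    """Fraction of time the product is up, in the long run.

    Both arguments must use the same time unit (hours here).
    """
    return mean_time_between_failures / (
        mean_time_between_failures + mean_time_to_repair
    )


# Hypothetical baseline: fails every 1000 hours, takes 10 hours to repair.
baseline = steady_state_availability(1000.0, 10.0)

# Lever 1: improve reliability so failures happen half as often.
fewer_failures = steady_state_availability(2000.0, 10.0)

# Lever 2: make the product easier to maintain so repairs take half as long.
faster_repairs = steady_state_availability(1000.0, 5.0)

print(f"Baseline:       {baseline:.4f}")
print(f"Fewer failures: {fewer_failures:.4f}")
print(f"Faster repairs: {faster_repairs:.4f}")
```

Under this simple model, halving how often we repair and halving how long each repair takes give the same availability. So a maintainability goal set in isolation tells only half the story.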
We also talk about Reliability Allocation for Multiple Reliability Requirements. This typically applies to scenarios where you might have a ‘critical failure’ reliability requirement in addition to a more basic definition. So some components or subsystems may be subject to more stringent reliability requirements because of how serious it is when they fail. And then there are Incremental Reliability Goals.
But we will get to all of these after we go through the fundamental approach to reliability allocation.
So how does reliability allocation fit in?
Before you think about giving your components their own reliability goals, you need to be really sure about your system goals. Not only what the customer wants, but what your product needs to get through a product development process that includes verification and validation. So we typically focus on six key steps that enable reliability allocation.
The first step is identifying your customer requirements. This seems obvious and trivial – but gloss over this at your peril. And once you understand what your customer wants, work out how your product or system needs to perform to get there. That is the second step.
The third step is to work out what you want your reliability design margin to be. You need something up your sleeve. The fourth step is to establish what we call a preliminary functional series system. This helps us work out if and when we need redundancy. We follow the process to work this out. Not just guess upfront.
The fifth step is allocating reliability. That’s right – we need to do a fair bit of work before we get there. And lastly, we do something. Goals are guidance – not journal entries. So if we are not meeting our goals, it is up to you as the design team leader to get your design over the line.
Don’t worry. We go through all these steps in greater detail in this series of articles. And we make it easy for you!
A quick word about ‘effort,’ ‘accuracy’ and ‘perfection’ before we go
YOUR INITIAL RELIABILITY ALLOCATIONS DON’T NEED TO BE ‘PERFECT.’
In fact, no matter what you do, they never can be ‘perfect.’ The only time that there is no uncertainty about your design is the period after you have completed it. So embrace what might feel like imperfection from the very start.
Many organizations forget about this and turn initial reliability allocation into a suffocatingly exhaustive exercise where all unknowns and uncertainties are analyzed as if it is possible to end up in a state of absolute clarity.
The huge problem with expending so much effort upfront is that you quickly exhaust your organization’s supply of ‘reliability engineering tolerance.’ This is an issue. Because this is what our reliability engineering efforts start to feel like:
Needlessly and exhaustively analyzing an ‘un-analyzable’ problem makes your initial goals sacrosanct even though everyone privately acknowledges the uncertainties involved. These goals then tend to remain unchanged throughout the development process because of the effort everyone put into getting the first ones. And inevitably some components will fail to meet their targets.
They go on to fail demonstration testing. Badly. So the organization inevitably concludes these allocated goals are no longer relevant – everyone asks themselves why they put so much effort into getting them right in the first place. Cynicism grows and everyone misses out on an important opportunity for reliability allocations to act as guides that change and optimize as we learn more.
Instead, this is what we want:
And a big tip – there is no such thing as verifying or validating allocated reliability goals. Why? Because they are goals. Concepts. Aspirations. Everyone can (for example) create a goal where they lose 5 kg or 10 pounds over the summer. Nothing wrong with this goal. Whether you achieve it is another thing. And not achieving it for valid reasons doesn’t mean the goal wasn’t valid. Or verified. Because a goal is a concept!
So let’s go through the reliability allocation process step by step. Follow this, and your time as a design leader becomes much easier.
Our next article covers the six steps of reliability allocation and how they relate to the reliability design cycle. Most of these steps set the scene for reliability allocation – or help you find out what you need to do once the goals are set.
Looking forward to seeing you then!