HALT in 4 Not Always Easy Steps
Highly Accelerated Life Testing, HALT, is a method to discover the weaknesses in a design. Using a step stress approach of single and combined stresses, you can quickly expose the salient weaknesses in your design and/or assembly process.
The value of HALT is that it is quick and often finds problems not previously known. You will destroy one or more prototypes, yet the value of knowing specifically what needs improvement more than justifies the sacrifice of a few prototypes.
Conducting HALT may be part of your reliability plan. Keeping a few steps in mind will help make sure your HALT does provide value back to your development efforts.
Step 1 — Brainstorm Stresses
The basic question is: what stresses (internal or external forces) will the item under consideration experience? In short, what types and ranges of stress does your product need to survive to be considered reliable?
This includes the basics of:
- Thermal cycling
- Rain, moisture
- Salt fog
It also includes the common elements of:
- Drop, shock, vibration
- Power connection or mis-connection
- Fungus, insects, dust, debris, etc.
- Chemical exposure (lotions, cleaners, etc.)
Plus a few that may be specific to your application, such as:
- Mechanical loading, load cycling
- Data stream loading, load cycling
- Interference, such as electromagnetic interference
- Ultraviolet light exposure
Make this a long list. Include the range of values the product is expected to experience along with the nominal values, if possible. Even a product in storage is exposed to some stresses, if only gravity.
Think about all the ways something impinges on or effects change in your system.
Step 2 — Select Stresses for Testing
Now, only after step 1 has generated a long list of stresses, do you narrow down the list for use in HALT.
Do not limit the selection to what your HALT chamber can provide. If important stresses are not available with your chamber, you will need to use something other than just the chamber.
The important part in selecting stresses is to use the stresses that, either alone or in combination, will excite the most different types of failure mechanisms. This matters especially in areas where you and your team have little practical knowledge (such as field failures) of how the applied stress will affect the product’s performance.
Select stresses that:
- Excite multiple failure mechanisms
- Are applied at some level during the product’s existence
- Have little margin (robustness) or an unknown margin based on engineering judgement or simulation work
- Have a relatively high risk of failure (consider FMEA results for this one)
Narrow down the list to those you can apply alone or in combination with other stresses.
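One way to make the narrowing less ad hoc is to score each candidate against the criteria listed above. The sketch below is only an illustration: the field names, the 1 to 10 FMEA risk scale, the bonus of 5 for an uncertain margin, and the example values are all assumptions, not part of any standard HALT procedure.

```python
from dataclasses import dataclass


@dataclass
class CandidateStress:
    name: str
    mechanisms_excited: int  # distinct failure mechanisms it may excite (estimate)
    seen_in_use: bool        # applied at some level during the product's existence
    margin_uncertain: bool   # True when the margin is small or unknown
    fmea_risk: int           # hypothetical scale: 1 (low) to 10 (high)


def priority(stress: CandidateStress) -> int:
    """Hypothetical score; higher means a stronger candidate for HALT."""
    if not stress.seen_in_use:
        return 0  # a stress the product never sees is not a candidate
    score = stress.mechanisms_excited + stress.fmea_risk
    if stress.margin_uncertain:
        score += 5  # assumed bonus for little or unknown margin
    return score


# Illustrative values only.
candidates = [
    CandidateStress("salt fog", 1, False, True, 2),
    CandidateStress("vibration", 4, True, False, 6),
    CandidateStress("thermal cycling", 3, True, True, 7),
]

ranked = sorted(candidates, key=priority, reverse=True)
for stress in ranked:
    print(stress.name, priority(stress))
```

Treat the ranking as a conversation starter for the team rather than a decision rule; engineering judgement still decides which stresses go into the test.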
Step 3 — Apply Stresses and Monitor Performance
Create a means to apply the stresses that permits increasing the stress levels in increments and returning to nominal when necessary. Determine how you will monitor the product’s performance, as you need to know when the applied stress causes an anomaly or failure.
Create a plan that starts with the stress least likely to cause a failure, and begin at levels that are not likely to cause failures, perhaps only the expected range of normal operating stress levels. Do this for each of the selected stresses.
With each application of stress, monitor the product’s performance. With each anomaly or failure, start the failure analysis process by gathering data about the symptoms and the applied stress.
On failure, return the stress levels to nominal values. If the product returns to operation, the stress level at which it failed defines the operational limit for that stress. If the product does not return to operation, that level is defined as a destruct limit.
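The step stress logic for a single stress can be sketched as a short loop. This is a minimal simulation under stated assumptions, not a chamber control script: `SimulatedUnit`, `step_stress`, and the temperature values are hypothetical, and a real test would replace `works_at` with your actual monitoring of the product.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class StepStressResult:
    operating_limit: Optional[float] = None  # first level with a recoverable failure
    destruct_limit: Optional[float] = None   # first level with a permanent failure


class SimulatedUnit:
    """Hypothetical stand-in for a monitored prototype."""

    def __init__(self, soft_fail_at: float, hard_fail_at: float):
        self.soft_fail_at = soft_fail_at
        self.hard_fail_at = hard_fail_at
        self.damaged = False

    def works_at(self, level: float) -> bool:
        if level >= self.hard_fail_at:
            self.damaged = True  # permanent damage, no recovery at nominal
        return not self.damaged and level < self.soft_fail_at


def step_stress(unit: SimulatedUnit, levels: Iterable[float], nominal: float) -> StepStressResult:
    result = StepStressResult()
    for level in levels:  # increase the stress in increments
        if unit.works_at(level):
            continue
        # Failure observed: return to nominal and check for recovery.
        if unit.works_at(nominal):
            if result.operating_limit is None:
                result.operating_limit = level
        else:
            result.destruct_limit = level
            break  # this unit is done; continue with another prototype
    return result


# Illustrative values: temperature steps of 10 degrees starting at 40.
unit = SimulatedUnit(soft_fail_at=90, hard_fail_at=120)
result = step_stress(unit, levels=range(40, 151, 10), nominal=25)
print(result)
```

Note that the loop keeps stepping after a recoverable failure, which mirrors the idea of continuing past the operational limit to look for the destruct limit.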
The intent is to find as many operational and destruct limits as possible, or as many failures as possible. Thus, after noting the circumstances and conditions for any failure, attempt to bypass the issue causing the failure and find additional failures. For example, by turning off or ignoring specific performance results or test protocols, you may be able to find additional failures at the same or higher stress levels.
Step 4 — Do the Failure Analysis on Every Anomaly or Failure
The failure analysis starts as the testing identifies operating or destruct limits and continues after the actual testing. Conduct a thorough failure analysis to determine the root causes of the failures.
The intent is to understand the failures well enough to permit the development team to design out the possibility of the failure occurring in the future, or to at least mitigate the effects of the failure on product performance.
Avoid jumping to conclusions or ignoring failures that occur above user operating stress levels. Failures that occur above normal use levels simply occur sooner in HALT than they would for a customer. You may not know how much sooner, yet they will occur unless addressed and resolved.
HALT is a discovery process. It is a suitable test when you want to identify the margin between the expected use stresses and the stress levels that lead to damage and failures. The larger the margin, the more robust the product.
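As a rough illustration of comparing margins across stresses, the sketch below computes a relative margin for each one and picks out the smallest. The stress names and numbers are made up for the example, and the relative margin formula is just one convenient assumption; it only makes sense for upper limits with a positive expected use stress.

```python
# Hypothetical maximum expected use stress and operating limit found in HALT.
use_max = {"temperature_C": 60.0, "vibration_Grms": 5.0}
operating_limit = {"temperature_C": 90.0, "vibration_Grms": 6.0}


def relative_margin(limit: float, use: float) -> float:
    """Margin above the expected use stress, as a fraction of that use stress."""
    return (limit - use) / use


margins = {
    name: relative_margin(operating_limit[name], use_max[name])
    for name in use_max
}

# The stress with the smallest margin is a natural place to focus improvements.
weakest = min(margins, key=margins.get)
print(margins, weakest)
```

In this made-up case vibration has the smaller margin, so it would be the first candidate for design attention.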
You cannot pass HALT, although you can learn a great deal about your product when it is done well.
Did I miss anything critical to running a useful HALT? Add your comments below.
Also published on Medium.