How Formal Experimentation Differs from Simplistic Approaches

Last Verified July 24, 2024

Statistically based DOE provides several advantages over more simplistic approaches such as “one-factor-at-a-time” experimentation. These advantages include:

The use of statistical methodology to determine which factors are actually (statistically) significant
Balanced experimental designs to allow stronger conclusions with respect to cause and effect relationships (as opposed to just finding correlations)
The ability to understand and estimate interactions between factors
The development of predictive models that are used to find optimal solutions for one or more responses

This article will explore the first two advantages in a bit more detail. The last two advantages will be discussed in the next article.

To understand the importance of using statistical methods when making decisions, consider the following (true) story. A former neighbor of ours (Sheryl) used to bowl on the Women’s Professional Bowling Tour. One day she confessed to my wife that she had a habit of eating a Hershey’s Chocolate Bar before every tournament match, as she thought it gave her more energy. My wife suggested she might do even better if she tried a Snickers Bar instead, presumably due to the extra protein that peanuts provide. Sheryl agreed to participate in a little experiment where she bowled practice rounds, sometimes after eating a Hershey’s Bar and sometimes after eating a Snickers Bar. Here were the scores (the trials were randomized).

To summarize the results, we may construct what is called a “main-effects” plot. It shows the average response at each of the two levels of the factor (type of candy bar). The average score after eating Snickers was 208 and the average score after eating Hershey’s was 205. These results are shown graphically on the main effects plot below.

So, based on these results, should we conclude that Snickers produces higher scores? Unfortunately, many people will answer yes, not realizing that we must consider the expected randomness in results due to factors other than the type of candy bar eaten. This is called “experimental error,” and it must be considered whenever making decisions. If you go back and look at the raw data, it should be clear that the 3-pin difference (on average) is small compared to the variability we see among games with the same candy bar! If the 215 game had been more like the 199 game (both were Snickers games), the results would completely flip! If you flip a coin 10 times, will you always get 5 heads and 5 tails? Of course not, because random variation will produce a proportion of heads different from 50%, especially when the sample size is small.
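To put a number on the coin-flip point, the short sketch below computes exact binomial probabilities for 10 flips of a fair coin. The function name `prob_heads` is our own, introduced only for this illustration.

```python
from math import comb

def prob_heads(k, n=10, p=0.5):
    """Probability of exactly k heads in n independent flips (binomial formula)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Chance of a perfect 5-5 split
exactly_five = prob_heads(5)

# Chance of a split at least as lopsided as 7-3, in either direction
lopsided = sum(prob_heads(k) for k in range(11) if k <= 3 or k >= 7)

print(f"P(exactly 5 heads)    = {exactly_five:.3f}")  # 0.246
print(f"P(7-3 split or worse) = {lopsided:.3f}")      # 0.344
```

In other words, a perfectly fair coin gives an even split only about a quarter of the time, and a 7-3 or more lopsided split about a third of the time. Small samples wander a lot on their own.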

Therefore, we must consider expected variation to make sure any change in the response that we observe when a factor goes from one level to another is a real effect. This is what it means to be statistically significant. It gives us confidence that the results are real and are likely not just due to some random outcome.

DOE uses Hypothesis Testing when determining whether the effects of factors (and interactions) on the response are statistically significant. Thus, the models that we develop only include factors that we have high confidence are predictive.
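As a rough illustration of what such a test does, here is a minimal sketch applied to the candy bar story. Two assumptions are worth stating loudly. First, the scores are made up: they were chosen only to match the averages quoted above (208 and 205) and are not Sheryl's actual scores. Second, DOE software typically relies on ANOVA or t-tests; this sketch swaps in an exact permutation test because it needs nothing beyond the Python standard library and makes the logic easy to see. The function name `permutation_p_value` is ours.

```python
from itertools import combinations

# HYPOTHETICAL scores, chosen only so the averages are 208 and 205
snickers = [215, 199, 212, 204, 210]
hersheys = [198, 211, 206, 202, 208]

def average(values):
    return sum(values) / len(values)

def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference in means.

    If the candy bar made no difference, the labels are arbitrary, so we
    relabel the games in every possible way and count how often the
    difference in averages is at least as large as the one observed.
    """
    pooled = a + b
    total = sum(pooled)
    observed = abs(average(a) - average(b))
    extreme = 0
    splits = 0
    for idx in combinations(range(len(pooled)), len(a)):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / len(a) - (total - sum_a) / len(b))
        if diff >= observed - 1e-9:  # small tolerance for float rounding
            extreme += 1
        splits += 1
    return extreme / splits

observed_diff = average(snickers) - average(hersheys)
p_value = permutation_p_value(snickers, hersheys)
print(f"Observed difference: {observed_diff:.1f} pins")
print(f"Permutation p-value: {p_value:.2f}")
```

With these made-up numbers the p-value comes out well above the conventional 0.05 threshold, meaning a 3-pin gap is easily produced by chance alone when game-to-game variation is this large.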

With simplistic experimentation approaches such as trial and error, it is extremely easy to simply believe whatever is tried must have caused any observed change in the outcome. Statistical methodology must be used to obtain valid results.

The second advantage of formal DOE is that the designs are naturally balanced, which helps avoid confusing and invalid conclusions. Very often people will look at data and conclude that cause and effect relationships are present simply based on correlations in the data. Correlation just means that a relationship exists, not necessarily that one event causes the other.

Consider an analysis that was done to relate the expenditures on medical care to the expenditures on milk based on surveys of families. The scatter plot below summarizes the relationship that was observed.

It would seem from this graph that the more money that is spent on milk, the higher the medical costs are! Does milk actually cause health issues? Do we all need to switch to almond milk?

The problem with simply taking existing data and trying to find relationships in it is that we do not know whether all other factors that may explain the relationship were controlled for. Upon further analysis of the survey data, it became clear that family size played a big part in explaining this relationship. That is, people who live alone tend to spend relatively little on medical care and milk, and very large families tend to spend a lot on both items. Moderately large families are in the middle on both types of expenses. So, if we don’t control for family size, we may wrongly conclude that milk consumption leads to higher medical costs.
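The milk example can be mimicked with a small simulation. The sketch below is not the survey data: every number in it (family sizes from 1 to 8, the dollar amounts, the noise levels) is an assumption picked for illustration. By construction, milk spending and medical spending each depend only on family size and never on each other.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def pearson(x, y):
    """Pearson correlation coefficient, written out with plain sums."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# HYPOTHETICAL families: both expenses are driven by size alone
families = []
for _ in range(2000):
    size = random.randint(1, 8)
    milk = 5 * size + random.gauss(0, 3)          # no link to medical
    medical = 100 * size + random.gauss(0, 60)    # no link to milk
    families.append((size, milk, medical))

# Ignoring family size: milk and medical spending look tightly related
overall_r = pearson([f[1] for f in families], [f[2] for f in families])
print(f"All families pooled: r = {overall_r:.2f}")

# Controlling for family size: compare only families of the same size
within_r = {}
for size in range(1, 9):
    group = [f for f in families if f[0] == size]
    within_r[size] = pearson([f[1] for f in group], [f[2] for f in group])
    print(f"Family size {size}: r = {within_r[size]:+.2f} (n = {len(group)})")
```

The pooled correlation should come out strongly positive, while the correlation within each family size should hover near zero. The apparent milk effect disappears once the lurking factor is held fixed.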

DOE methodology takes great care to avoid these issues. For factors that are included in the experiment, we ensure that we mix up the levels of each factor so that no two factors always change levels together. If they did, the result would be confounding (confusion) as to which factor caused the effect. Said another way, we balance the design so that we can isolate the true impacts of each factor and interaction (more on interactions in the next article). For factors that we feel may affect the response but are not included in the study, we try to hold them as fixed as possible during the study so they do not influence the results.
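One way to see what balance buys us is to code each factor's low and high levels as -1 and +1 and compare two sets of runs. The sketch below assumes a two-level full factorial design, which is just one common type of balanced design; the factor names A, B and C and the helper `column_dot` are hypothetical.

```python
from itertools import product

factors = ["A", "B", "C"]  # hypothetical factor names

# Balanced: every combination of low (-1) and high (+1) levels, 8 runs
balanced = list(product([-1, 1], repeat=len(factors)))

# Poorly planned: A and B always change together, 4 runs
confounded = [(-1, -1, -1), (-1, -1, 1), (1, 1, -1), (1, 1, 1)]

def column_dot(design, i, j):
    """Dot product of two coded columns.

    Zero means each level of one factor is paired equally often with both
    levels of the other, so their effects can be separated.
    """
    return sum(run[i] * run[j] for run in design)

for name, design in [("balanced", balanced), ("confounded", confounded)]:
    print(name)
    for i in range(len(factors)):
        for j in range(i + 1, len(factors)):
            print(f"  {factors[i]} vs {factors[j]}: {column_dot(design, i, j)}")
```

In the balanced design every pair of columns has a dot product of 0. In the second design the A and B columns are identical (dot product 4), so any change in the response could be credited to either factor and the data cannot tell us which.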

This is one big advantage of DOE over simply taking existing data and trying to develop models using regression techniques. DOE results tend to be much stronger since the data is collected in a balanced way and potentially important factors are controlled for.

About Steven Wachs
