A graphical technique that every engineer needs in their toolbox
If there is one graphical technique that deserves a lot more attention than it gets, and that every engineer should use in their day-to-day work, my vote would definitely go to the mighty Youden Plot.
Have you heard of it? What is it? You may have come across a post I wrote on it a while ago. If you didn’t, here’s the CliffsNotes version:
- Originally developed by William J. Youden back in 1959 for inter-laboratory comparisons (hence the name Youden Plot)
- An excellent technique for diagnosis and experimentation
- A special kind of scatterplot where paired measurements are plotted
- The two axes span the same range, and there is a 45-degree line called the line of perfect agreement (not a regression line!)
- Departures perpendicular to the 45-degree line denote variation within the pairs of measurements
- Range of values on either axis signifies variation between the units
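To make the anatomy above concrete, here is a minimal base R sketch of a Youden-style plot. It is not how draw_youden_plot() is implemented; the data are simulated purely for illustration, and the variable names (true_value, axis_limits) are my own assumptions.

```r
# Simulated paired data (made up for illustration; not the sherlock dataset)
set.seed(1)
true_value    <- rnorm(30, mean = 10, sd = 1)      # unit-to-unit variation
measurement_1 <- true_value + rnorm(30, sd = 0.1)  # first measurement of each unit
measurement_2 <- true_value + rnorm(30, sd = 0.1)  # second measurement of each unit

# Both axes share the same limits so the 45-degree line is meaningful
axis_limits <- range(c(measurement_1, measurement_2))

plot(measurement_1, measurement_2,
     xlim = axis_limits, ylim = axis_limits, asp = 1,
     xlab = "Measurement 1", ylab = "Measurement 2",
     main = "Youden Plot (sketch)")

# Line of perfect agreement: intercept 0, slope 1 (not a regression fit)
abline(a = 0, b = 1, lty = 2)
```

Points stretched along the dashed line reflect differences between units; points scattered away from it reflect disagreement within a pair.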
OK, great, but what can they be used for?
They can be used for any kind of diagnostic or experimental study where a paired comparison of the same experimental units is conducted. The goal can be to compare:
- Different states of the same units
- The same units measured under different conditions or at different locations
- An actual state to an ideal state
A few specific uses include but are not limited to:
- Measurement systems analysis: repeatability error (repeats 1 and 2)
- Method comparison studies: comparing two measurement systems
- Bias studies: comparing actual to truth/known standard
- Diagnosis: separating inputs from the function (see Diagnosing Performance and Reliability by David Hartshorne)
- Taguchi robust engineering studies: comparing the actual function to the ideal function
The good news is that I built a function for it specifically for engineers. The draw_youden_plot() function of the sherlock package makes plotting Youden Plots really easy. Let’s walk through a couple of examples! You can access the code here.
In the first example, we are going to look at data from two repeated measurements of randomly selected parts, which will give us the ability to assess the repeatability of the measurement system.
Let’s run a few lines of code:
# EDITION 009: YOUDEN PLOT
# 0. LOAD PACKAGES ----
library(tidyverse)
library(sherlock)
# 1. DRAW_YOUDEN_PLOT() FUNCTION ----
# 1.1 REPEATABILITY ASSESSMENT ----
youden_plot_data %>%
  draw_youden_plot(x_axis_var = measurement_1,
                   y_axis_var = measurement_2,
                   size = 3)
Above is the Youden Plot we’ve created. The shape of the point cloud tells us that the repeatability of the system is pretty decent; you can ballpark this by comparing the width of the interval perpendicular to the 45-degree line with the variation in the parts (the range of measurements on either axis).
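If you want a number to go with the eyeball estimate, the ballpark described above can be sketched in a few lines of base R. This is only a rough ratio under my own assumptions, not a formal gage R&R and not something the sherlock package computes; youden_spread() is a hypothetical helper and the data are simulated.

```r
# Rough repeatability ballpark for paired repeats (a sketch, not a formal gage R&R)
youden_spread <- function(x, y) {
  stopifnot(length(x) == length(y))
  perpendicular <- (y - x) / sqrt(2)          # signed distance from the line y = x
  within_width  <- diff(range(perpendicular)) # width of the cloud across the line
  part_range    <- diff(range(c(x, y)))       # spread of the units along the axes
  list(within_width = within_width,
       part_range   = part_range,
       ratio        = within_width / part_range)
}

# Simulated repeats (made up; not the sherlock example data)
set.seed(42)
true_value    <- rnorm(30, mean = 10, sd = 1)
measurement_1 <- true_value + rnorm(30, sd = 0.1)
measurement_2 <- true_value + rnorm(30, sd = 0.1)

youden_spread(measurement_1, measurement_2)
```

A small ratio means the measurement noise is narrow relative to the part-to-part variation, which is what a long, thin point cloud along the 45-degree line looks like.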
We can also group by a specific variable, for example operator. In this case, let’s group by a variable called location, which refers to the measurement location within the unit.
youden_plot_data %>%
  draw_youden_plot(x_axis_var = measurement_1,
                   y_axis_var = measurement_2,
                   grouping_var = location)
Another application of Youden Plots is the method comparison study. In the example below, two gages are compared by measuring the same set of units once with each gage (gage 1 and gage 2, respectively).
# 1.2 METHOD COMPARISON ----
youden_plot_data_2 %>%
  draw_youden_plot(x_axis_var = gage_1,
                   y_axis_var = gage_2,
                   size = 3)
There’s clearly disagreement here in that measurements from gage 2 tend to come in higher than those from gage 1. This indicates systematic error, that is, bias.
There is a neat built-in feature that allows you to draw a median line to visualize the amount of bias and to see whether the bias is consistent over the range. All you do is set median_line to TRUE. Here we go:
# 1.2.1 MEDIAN LINE PLOTTED ----
youden_plot_data_2 %>%
  draw_youden_plot(x_axis_var = gage_1,
                   y_axis_var = gage_2,
                   median_line = TRUE,
                   size = 3,
                   analysis_desc_label = "Method Comparison - gage 1 and 2")
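For a numerical companion to the median line, one simple option is to take the median of the paired differences and compare it between the lower and upper halves of the measurement range. This is a sketch under my own assumptions and not necessarily how draw_youden_plot() computes its median line; bias_summary() is a hypothetical helper and the gage data are simulated.

```r
# Median-based bias summary for two gages measuring the same units (a sketch)
bias_summary <- function(gage_1, gage_2) {
  stopifnot(length(gage_1) == length(gage_2))
  difference <- gage_2 - gage_1                  # positive: gage 2 reads higher
  pair_mean  <- (gage_1 + gage_2) / 2            # where the unit sits in the range
  lower_half <- pair_mean <= median(pair_mean)   # split units at the mid-point
  c(overall = median(difference),
    lower   = median(difference[lower_half]),
    upper   = median(difference[!lower_half]))
}

# Simulated gages (made up): gage 2 reads about 0.5 higher than gage 1
set.seed(7)
true_value <- rnorm(30, mean = 10, sd = 1)
gage_1     <- true_value + rnorm(30, sd = 0.1)
gage_2     <- true_value + 0.5 + rnorm(30, sd = 0.1)

bias_summary(gage_1, gage_2)
```

If the lower and upper values are close to each other, the bias looks consistent over the range; if they differ noticeably, the bias likely depends on the size of the measurement.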
That’s it for now, but I will be talking more about Youden Plots in future editions. I hope you enjoyed this week’s edition!
Resources for this week’s edition:
- sherlock package
- Post on Youden plots
Resources for learning R:
- R for Data Science: a very thorough reference book by Hadley Wickham, the creator of the tidyverse. Absolutely free of charge and full of relevant examples and practice tests.
- ggplot2 reference book: a super detailed online book on the ggplot2 plotting package.
- My favorite R course, Business Science DS4B101-R: I learned R mainly through this course. Highly recommended if you want to get up to speed and beyond in a relatively short time. It has everything one will need from data cleaning to data visualization to modeling. This course is especially useful for engineers trying to learn or get good at R as it heavily focuses on the fundamentals but goes way beyond just that. Note: this is an affiliate link, meaning you get a hefty discount if you purchase a course, and I receive a small commission.