Expectation and Moment Generating Functions

Last Verified September 27, 2021

In statistics and reliability, we use distributions to describe time-to-failure patterns. The four functions commonly used in reliability engineering are

The probability density function
The cumulative distribution function
The reliability function
The hazard function

We often use terms like mean, variance, skewness, and kurtosis to describe distributions (along with shape, scale, and location). Each of these is defined through a moment of the distribution; the mean, for example, is the first moment about the origin. First, though, let’s back up to the concept of center of gravity (cog) from mechanics.

Mean

Any shape has a point where it will balance, meaning the moments of the area or mass on either side of that point cancel. The balance point need not be the middle of the shape, nor the point with an equal area or mass on either side. From our engineering studies, the cog of a shape is

$$ \large\displaystyle cog=\frac{\int_{-\infty }^{\infty }{xf\left( x \right)dx}}{\int_{-\infty }^{\infty }{f\left( x \right)dx}}$$

where f(x) describes the height of the shape at x, so that f(x)dx is the element of area at x.

Recall now from statistics that the total area under a probability density function f(x) is defined as equal to one. Thus the denominator drops out of the equation.

The numerator, the integral of x f(x), is the first moment about the origin. The definition of the mean of a distribution and the cog are the same, thus we are left with the expected value (mean, or cog) of the distribution f(x). Note that this is the balance point of the distribution, not necessarily the most likely x to occur (that is the mode).

$$ \large\displaystyle E\left( x \right)=\int_{-\infty }^{\infty }{xf\left( x \right)dx}$$

For the population mean we use the Greek letter, μ, thus the expected value of the distribution is

$$ \large\displaystyle \mu =\int_{-\infty }^{\infty }{xf\left( x \right)dx.}$$
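As a rough numerical check of this integral, the sketch below approximates the expected value of an exponential distribution with the midpoint rule. The exponential density, the rate of 0.5, and the function names are illustrative assumptions, not part of the discussion above; the known mean of an exponential distribution is 1/rate.

```python
import math


def exponential_pdf(x, rate):
    """Density of an exponential distribution; zero for negative x."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def expected_value(pdf, lower, upper, steps=200_000):
    """Approximate the integral of x * pdf(x) over [lower, upper] (midpoint rule)."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width  # midpoint of the i-th slice
        total += x * pdf(x) * width
    return total


rate = 0.5  # hypothetical failure rate (failures per hour)
# The upper limit of 100 stands in for infinity; the tail beyond it is negligible here.
mean = expected_value(lambda x: exponential_pdf(x, rate), 0.0, 100.0)
print(mean, 1.0 / rate)  # the two values should agree closely
```

The same helper works for any density, provided the integration limits cover essentially all of the probability.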

In practical terms the mean for a sample is

$$ \large\displaystyle \bar{X}=\frac{\sum\limits_{i=1}^{N}{{{X}_{i}}}}{N}$$

Variance

The second moment provides information on the dispersion of the mass about the origin. In the case of probability distributions, taking the second moment about the mean (the second central moment) lets us define the dispersion about the mean. We use the square of the differences so that deviations above and below the mean do not cancel.

$$ \large\displaystyle Var\left( X \right)=E\left[ {{\left( X-\mu \right)}^{2}} \right]$$

With a little algebra and using the Greek letter σ we have variance for the population

$$ \large\displaystyle {{\sigma }^{2}}=\int_{-\infty }^{\infty }{{{x}^{2}}f\left( x \right)dx}-{{\mu }^{2}}$$
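For reference, the algebra behind this result expands the square and uses the linearity of expectation together with E(X) = μ:

$$ \large\displaystyle E\left[ {{\left( X-\mu \right)}^{2}} \right]=E\left[ {{X}^{2}} \right]-2\mu E\left[ X \right]+{{\mu }^{2}}=E\left[ {{X}^{2}} \right]-{{\mu }^{2}}$$

Here E[X²] is the integral of x² f(x), the second moment about the origin.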

In practical terms variance for a sample is

$$ \large\displaystyle {{s}^{2}}=\frac{\sum\limits_{i=1}^{N}{{{\left( {{X}_{i}}-\bar{X} \right)}^{2}}}}{N-1}$$
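A minimal sketch of the sample mean and sample variance formulas, using only the Python standard library. The times-to-failure values are made up for illustration, and the function names are not from the article. The standard library's `statistics.variance` also uses the N − 1 divisor, so it serves as a cross-check.

```python
import statistics


def sample_mean(data):
    """Sum of the observations divided by the number of observations."""
    return sum(data) / len(data)


def sample_variance(data):
    """Sum of squared deviations from the sample mean, divided by N - 1."""
    n = len(data)
    x_bar = sample_mean(data)
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)


# Hypothetical times to failure, in hours
times_to_failure = [12.0, 15.0, 9.0, 21.0, 18.0]

x_bar = sample_mean(times_to_failure)
s_squared = sample_variance(times_to_failure)
print(x_bar, s_squared)
print(statistics.variance(times_to_failure))  # cross-check with the standard library
```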

Skewness

The normal distribution is symmetrical about the mean. Many distributions are asymmetrical about the mean, and we call this asymmetry skewness. The third moment about the mean provides a measure of the asymmetry of the distribution.

The visual characteristic of skewness is a long tail. If the long tail is on the right the skewness is positive. Likewise, a negative skewness has the long tail on the left.

$$ \large\displaystyle Skew\left( X \right)=E\left[ {{\left( \frac{X-\mu }{\sigma } \right)}^{3}} \right]$$

This is known as Pearson’s moment coefficient of skewness and is the third standardized moment, hence the use of μ and σ. Expanding the cube and using E[X²] = σ² + μ² gives

$$ \large\displaystyle \gamma =\frac{\int\limits_{-\infty }^{\infty }{{{x}^{3}}f\left( x \right)dx}-3\mu {{\sigma }^{2}}-{{\mu }^{3}}}{{{\sigma }^{3}}}$$

In practical terms skewness for a sample is

$$ \large\displaystyle c=\frac{\sum\limits_{i=1}^{N}{{{\left( {{X}_{i}}-\bar{X} \right)}^{3}}}}{\left( N-1 \right){{s}^{3}}}$$
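The sign convention can be checked with a short sketch of the sample skewness formula above. The data sets and the function name are hypothetical, chosen so that one has a long right tail, one is its mirror image, and one is symmetric. Other references use slightly different divisors for sample skewness; this follows the (N − 1)s³ form shown here.

```python
import math


def sample_skewness(data):
    """Sum of cubed deviations from the mean, divided by (N - 1) * s**3."""
    n = len(data)
    x_bar = sum(data) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))
    return sum((x - x_bar) ** 3 for x in data) / ((n - 1) * s ** 3)


right_tailed = [1.0, 2.0, 2.0, 3.0, 10.0]   # long tail on the right
left_tailed = [-x for x in right_tailed]    # mirror image, long tail on the left
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]       # balanced about its mean

print(sample_skewness(right_tailed))  # positive
print(sample_skewness(left_tailed))   # negative, same magnitude
print(sample_skewness(symmetric))     # zero
```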

Kurtosis

The fourth moment describes the bunchiness or peakedness of the distribution. Again, Pearson suggests using the standardized moment.

$$ \large\displaystyle Kurt\left( X \right)=\frac{E\left[ {{\left( X-\mu \right)}^{4}} \right]}{{{\left( E\left[ {{\left( X-\mu \right)}^{2}} \right] \right)}^{2}}}$$

In practical terms kurtosis for a sample is

$$ \large\displaystyle k=\frac{\sum\limits_{i=1}^{N}{{{\left( {{X}_{i}}-\bar{X} \right)}^{4}}}}{\left( N-1 \right){{s}^{4}}}$$

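The standardized fourth moment equals 3 for a normal distribution, so kurtosis is often reported as excess kurtosis, the value minus 3. A positive excess kurtosis means the data is bunched near the mean with heavier tails than the normal distribution. A negative excess kurtosis means the distribution is flatter, with lighter tails.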
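A short sketch of the sample kurtosis formula above, using two hypothetical data sets: one spread evenly (flat) and one bunched at the mean with two far-out values. For comparison, the standardized fourth moment of a normal distribution is 3. The function name and data are illustrative only.

```python
def sample_kurtosis(data):
    """Sum of fourth-power deviations from the mean, divided by (N - 1) * s**4."""
    n = len(data)
    x_bar = sum(data) / n
    s_squared = sum((x - x_bar) ** 2 for x in data) / (n - 1)
    return sum((x - x_bar) ** 4 for x in data) / ((n - 1) * s_squared ** 2)


flat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]      # evenly spread
bunched = [0.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 10.0]  # peak plus far-out values

k_flat = sample_kurtosis(flat)
k_bunched = sample_kurtosis(bunched)
print(k_flat, k_flat - 3.0)        # below 3, so negative excess kurtosis
print(k_bunched, k_bunched - 3.0)  # above 3, so positive excess kurtosis
```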

Comments

Andrew Rowland says
September 1, 2014 at 10:30 AM
Hey Fred, good post. I like the balancing analogy to explain the mean. I’ve used a similar discussion with clients to help them understand why MTBF does not mean failure free period and why the underlying s-distribution matters, especially when comparing MTBF’s.
I would like to comment on something I found a bit unclear (but that’s probably because it’s a holiday weekend and I may be a little sleepy from last night’s activities). The moment generating function (mgf) of the random variable X is defined as m_X(t) = E(exp(tX)). It should be apparent that the mgf is connected with a distribution rather than a random variable. In other words, there is only one mgf for a distribution, not one mgf for each moment.
The mean and other moments can be defined using the mgf. The kth moment of X is the kth derivative of the mgf evaluated at t = 0. That is, the first moment (the mean) is the first derivative of the mgf, the variance is the second derivative, etc.
For the interested reader, the mgf of the exponential distribution is lambda / (lambda – t). Readers can use it to “prove” to themselves the relationship between the mgf and the functions for mean and variance which most will already know. For those readers that would rather not do the calculus by hand, the application Maxima does a pretty good job at symbolic mathematics. Of course, who doesn’t like to do calculus by hand?
- Fred Schenkelberg says
  September 1, 2014 at 10:35 AM
  Hi Andrew,
  Thanks for the comment. I agree the discussion around the mgf could be clearer, and you are right, when you substitute a particular probability density function in for the f(x) it is specific for that distribution.
  I try to avoid calculus before 10am, when possible, 😉
  Cheers,
  Fred
- Mesfin Esayas Lelisho says
  October 30, 2022 at 5:58 AM
  Dear Andrew Rowland
  The mean and other moments can be defined using the mgf. The kth moment of X is the kth derivative of the mgf evaluated at t = 0. That is, the first moment (the mean) is the first derivative of the mgf, the variance is the second derivative, etc.
  That is right!
  But I think there is exception?
  Here I have some concern,
  From the definition of the continuous uniform distribution, X has probability density function:
  f(x)= 1/(b-a), a≤x≤b
  0, otherwise
  The mgf is given as M_X(t) = [exp(tb) - exp(ta)] / [t(b - a)]
  If we want to find the mean, the first moment about the origin, we have to take the derivative and evaluate it at t = 0. But when we do so it is undefined! How do we solve this?
  - M. S. Bhalachandra says
    November 2, 2022 at 5:00 AM
    I think the mgf is not differentiable at t = 0.
    So, the method is not applicable to obtain the expression for EX.
    Instead, expand the mgf in powers of t and then find the coefficient of t in the expansion.
    This gives the expression for EX.
    Verify my answer with experts.
izang Salomi says
June 9, 2021 at 4:20 AM
So what is the relationship between the two?
- Fred Schenkelberg says
  June 9, 2021 at 7:18 AM
  Hi Izang, the moment generating functions help us to define the expected value for specific distribution items like the mean, variance, skewness and kurtosis. Cheers, Fred
Mesfin Esayas Lelisho says
October 30, 2022 at 5:48 AM
That is a really important post. Thank you, sir!
