The standard normal distribution is a special case of the normal distribution. For the standard normal distribution, the value of the mean is equal to zero (\(\mu = 0\)), and the value of the standard deviation is equal to 1 (\(\sigma = 1\)).

Thus, by plugging \(\mu = 0\) and \(\sigma = 1\) into the PDF of the normal distribution, the equation simplifies to

\[\begin{align} f(x)& = \frac{1}{\sigma \sqrt{2 \pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \\ & =\frac{1}{1 \times \sqrt{2 \pi}}e^{-\frac{1}{2}\left(\frac{x-0}{1}\right)^2} \\ & = \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2} \end{align}\]
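As a quick numerical check of this simplification, the sketch below compares the closed-form expression with R's built-in dnorm function. The helper name std_normal_pdf and the example values in x are hypothetical and introduced only for illustration.

```r
# Hypothetical helper: the simplified PDF of the standard normal distribution
std_normal_pdf <- function(x) {
  1 / sqrt(2 * pi) * exp(-0.5 * x^2)
}

# A few arbitrary example values
x <- c(-2, -1, 0, 1, 2)

# dnorm() uses mean = 0 and sd = 1 by default, so both results should agree
all.equal(std_normal_pdf(x), dnorm(x))
## [1] TRUE
```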

The random variable that follows the standard normal distribution is denoted by \(z\). Consequently, the units of the standard normal distribution curve are denoted by \(z\) and are called \(z\)-values or \(z\)-scores. They are also called standard units or standard scores.

The cumulative distribution function (CDF) of the standard normal distribution, corresponding to the area under the curve for the interval \((-\infty, z]\), usually denoted by the capital Greek letter \(\Phi\), is given by

\[P(x \le z) = \Phi (z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z}e^{-\frac{1}{2}x^2}dx\]

where \(e \approx 2.71828\) and \(\pi \approx 3.14159\).
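To make the link between the integral and the area under the curve concrete, the following sketch integrates the density numerically with R's integrate function and compares the result with pnorm. The function name f and the value \(z = 1.5\) are arbitrary choices made for this illustration.

```r
# Density of the standard normal distribution, written out explicitly
f <- function(x) 1 / sqrt(2 * pi) * exp(-0.5 * x^2)

z <- 1.5  # arbitrary example value

# Numerical integration of the density from -Inf up to z
area <- integrate(f, lower = -Inf, upper = z)$value

# Both entries should agree up to the numerical integration error
c(integrated = area, pnorm = pnorm(z))
```

The two numbers are expected to match only approximately, because integrate returns a numerical approximation of the integral.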


Basic Properties of the Standard Normal Curve

The standard normal curve is a special case of the normal curve, and thus also a probability density curve. Therefore, the basic properties of the normal distribution hold for the standard normal curve as well (Weiss 2010).

  1. The total area under the standard normal curve is 1 (this property is shared by all density curves).
  2. The standard normal curve extends indefinitely in both directions, approaching, but never touching, the horizontal axis as it does so.
  3. The standard normal curve is bell-shaped and centered at \(z=0\). Almost all the area under the standard normal curve lies between \(z=-3\) and \(z=3\).

The \(z\)-values on the right side of the mean are positive and those on the left side are negative. The \(z\)-value for a point on the horizontal axis gives the distance between the mean (\(z=0\)) and that point in terms of the standard deviation. For example, a point with a value of \(z=2\) is two standard deviations to the right of the mean. Similarly, a point with a value of \(z=-2\) is two standard deviations to the left of the mean.


Determining probabilities by calculating the area under the standard normal curve is a widely applied technique. That is why probability tables exist for looking up the area for a particular \(z\)-value. However, R is such a powerful tool that we can calculate the area under the curve directly for any particular \(z\)-score.

To calculate the area under the curve for a standard normal distribution we apply the pnorm function. The pnorm function is defined as pnorm(q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE). We disregard the log.p = FALSE argument. For the moment we keep the default lower.tail = TRUE. Further, we see that the defaults for the mean and the standard deviation are \(0\) and \(1\), respectively. Thus, the pnorm function, applied to the standard normal distribution, simplifies to pnorm(q). We calculate the area under the curve for \(z = -3, -2, -1, 0, 1, 2, 3\), or written more formally:

\[P(x\le z) \qquad \text{for } z \in \{-3, -2, -1, 0, 1, 2, 3\}\]

pnorm(-3)
## [1] 0.001349898
pnorm(-2)
## [1] 0.02275013
pnorm(-1)
## [1] 0.1586553
pnorm(0)
## [1] 0.5
pnorm(1)
## [1] 0.8413447
pnorm(2)
## [1] 0.9772499
pnorm(3)
## [1] 0.9986501

Perfect! We confirmed some of the properties of the standard normal curve stated above. Recall that we kept the default value of the lower.tail argument in the pnorm function, lower.tail = TRUE. This means we calculated the area under the curve for the interval \((-\infty, z]\). Calling pnorm(-3) yields a very low number: only about 0.1% of the total area under the curve lies to the left of \(z=-3\), that is, more than 3 standard deviations below the mean. Moreover, pnorm(0) yields 50%. Awesome! Because the total area under the curve is \(1\), we conclude that the area under the curve for the interval \((-\infty, 0]\) is the same as the area under the curve for the interval \([0, \infty)\). Again, this agrees with the properties of the standard normal curve stated above. And finally, calling pnorm(3) yields a high number close to 1. Thus, approximately 99.9% of the area under the curve can be found in the interval \((-\infty, 3]\), leaving only about 0.1% for the area beyond \(z = 3\).
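The small area beyond \(z = 3\) can also be computed directly. The sketch below shows two equivalent ways of doing so, using only the lower.tail argument from the pnorm signature given earlier. The variable names upper_complement and upper_direct are hypothetical.

```r
# Area to the right of z = 3, computed as the complement of the lower tail
upper_complement <- 1 - pnorm(3)

# The same area, requested directly with lower.tail = FALSE
upper_direct <- pnorm(3, lower.tail = FALSE)

# By symmetry, both should agree with the area to the left of z = -3
c(upper_complement, upper_direct, pnorm(-3))
```

All three values should be about 0.00135, the same number that pnorm(-3) returns, which reflects the symmetry of the curve around \(z = 0\).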

Recall that we may explicitly calculate the area under the curve for any interval of interest:

\[\begin{align} P(a \le z \le b) & = P(z \le b) - P(z \le a) \\ & =\int_{a}^{b}f(z)dz \\ & = \int_{-\infty}^{b}f(z)dz - \int_{-\infty}^{a}f(z)dz \end{align}\]

Let us calculate the area under the curve for the following intervals: \([-1,1], [-2,2], [-3,3]\). Or in words, let us determine the area under the curve for \(\pm 1\) standard deviation, for \(\pm 2\) standard deviations, and for \(\pm 3\) standard deviations.

# 1 standard deviation
pnorm(1) - pnorm(-1)
## [1] 0.6826895
# 2 standard deviations
pnorm(2) - pnorm(-2)
## [1] 0.9544997
# 3 standard deviations
pnorm(3) - pnorm(-3)
## [1] 0.9973002
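Because pnorm accepts a vector of quantiles, the three areas can also be obtained in a single step. The following is a minimal sketch of that idea. The variable names k and areas are hypothetical.

```r
# Number of standard deviations on either side of the mean
k <- 1:3

# pnorm() is vectorized, so all three areas are computed at once
areas <- pnorm(k) - pnorm(-k)
round(areas, 4)
## [1] 0.6827 0.9545 0.9973
```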

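Awesome, we just confirmed the Empirical Rule, also known as the 68-95-99.7 rule, which is related to Chebyshev's theorem. For a bell-shaped distribution, the three rules are that approximately

  1. 68% of the observations lie within one standard deviation of the mean,
  2. 95% of the observations lie within two standard deviations of the mean, and
  3. 99.7% of the observations lie within three standard deviations of the mean.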

To strengthen our intuition, the Empirical rule is visualized below.