The shape of the distribution of a continuous random variable can be visualized with a smooth curve. Such curves are called probability density functions (PDFs), or simply density functions. Probability density functions have three main properties (Mann 2012, Weiss 2010):
The total area under the curve, obtained by integrating the density f(x) from −∞ to +∞, is 1:
$$\int_{-\infty}^{+\infty} f(x)\,dx = 1$$
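The normalization property can be checked numerically. The following is a minimal sketch, assuming the standard normal density `dnorm()` and an exponential density `dexp()` with a rate of 2 as stand-ins for f(x); neither choice comes from the text above.

```r
# Assumption: the standard normal density dnorm() plays the role of f(x).
total_area <- integrate(dnorm, lower = -Inf, upper = Inf)
print(total_area$value)

# A second, hypothetical density: exponential with rate 2. It is zero for
# x < 0, so integrating from 0 to Inf covers its whole support.
exp_area <- integrate(dexp, lower = 0, upper = Inf, rate = 2)
print(exp_area$value)
```

Both integrals should come out at 1, up to the numerical tolerance of `integrate()`.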
The probability that a continuous random variable x takes a value within a certain interval is given by the area under the curve between the two limits of the interval. The colored area under the curve for the interval (−∞,a] (left panel) and for the interval [a,∞) (right panel) is shown in the figure below.
The probability that x falls in the interval (−∞,a] is
$$P(X \le a) = \int_{-\infty}^{a} f(x)\,dx$$
and the probability that x falls in the interval [a,∞) is
$$P(X \ge a) = 1 - P(X \le a) = \int_{a}^{\infty} f(x)\,dx$$
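The two tail probabilities can be sketched in R. This example assumes a standard normal distribution and a hypothetical threshold `a <- 1`; both are illustrative choices only. It compares numerical integration of the density with the built-in distribution function `pnorm()`.

```r
# Assumptions: standard normal density, hypothetical threshold a = 1.
a <- 1

# Areas under the density curve, computed by numerical integration.
p_lower <- integrate(dnorm, lower = -Inf, upper = a)$value  # P(X <= a)
p_upper <- integrate(dnorm, lower = a, upper = Inf)$value   # P(X >= a)

# The same probabilities from the cumulative distribution function.
p_lower_cdf <- pnorm(a)
p_upper_cdf <- pnorm(a, lower.tail = FALSE)

print(c(p_lower, p_lower_cdf))
print(c(p_upper, p_upper_cdf))

# The upper tail is the complement of the lower tail.
print(p_lower + p_upper)
```

In practice `pnorm()` is usually preferable to calling `integrate()` on `dnorm()`, since it evaluates the distribution function directly.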
The same holds for a bounded interval: the probability that a continuous random variable x assumes a value between a and b is the area under the curve between these two limits. The colored area under the curve from a to b in the figure below gives the probability that x falls in the interval [a,b].
$$P(a \le x \le b) = \int_{a}^{b} f(x)\,dx = P(x \le b) - P(x \le a) = \int_{-\infty}^{b} f(x)\,dx - \int_{-\infty}^{a} f(x)\,dx$$
Note: The interval a≤x≤b states that x is greater than or equal to a but less than or equal to b.
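The interval probability can be computed in two equivalent ways, as the formula suggests. The sketch below assumes a standard normal distribution and hypothetical limits `a <- -1` and `b <- 2`, chosen only for illustration.

```r
# Assumptions: standard normal density, hypothetical limits a = -1 and b = 2.
a <- -1
b <- 2

# Direct route: area under the density between a and b.
p_direct <- integrate(dnorm, lower = a, upper = b)$value

# CDF route: P(x <= b) - P(x <= a).
p_cdf <- pnorm(b) - pnorm(a)

print(p_direct)
print(p_cdf)
```

Both routes should agree up to numerical error. The difference of two `pnorm()` calls is the more common idiom when the distribution function is available.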
For a continuous probability distribution, probabilities are always calculated for intervals. The probability that a continuous random variable x assumes any single value is zero, because the probability of picking exactly one value out of the infinitely many values in ℝ is zero. Geometrically, a single point corresponds to a line of zero width under the curve, and the area of a line is zero.
$$P(x = a) = 0$$
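One way to see this numerically is to shrink an interval around a point and watch the probability fall toward zero. The sketch below assumes a standard normal distribution, a hypothetical point `a <- 0.5`, and an arbitrary set of interval widths.

```r
# Assumptions: standard normal distribution, hypothetical point a = 0.5.
a <- 0.5
widths <- c(1, 0.1, 0.01, 0.001, 1e-6)

# Probability of the interval [a - w/2, a + w/2] for each width w.
probs <- sapply(widths, function(w) pnorm(a + w / 2) - pnorm(a - w / 2))
print(data.frame(width = widths, probability = probs))

# An interval of width zero has exactly zero probability.
p_point <- pnorm(a) - pnorm(a)
print(p_point)
```

The probabilities shrink together with the interval width, which is consistent with a single point carrying zero probability.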
From this we can deduce the following for a continuous random variable:
$$P(a \le x \le b) = P(a < x < b)$$
In other words, the probability that x assumes a value in the interval a to b is the same, whether or not the values a and b are included in the interval.
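The intermediate step can be spelled out as follows, assuming a < b. The closed interval splits into three disjoint events, namely the two endpoints and the open interval between them, so their probabilities add:

$$P(a \le x \le b) = P(x = a) + P(a < x < b) + P(x = b) = 0 + P(a < x < b) + 0 = P(a < x < b)$$

The same argument applies to the half-open intervals, so P(a ≤ x < b) and P(a < x ≤ b) take the same value as well.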
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by email at soga[at]zedat.fu-berlin.de.
Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.