
Probability and statistics formula sheet


Symbols

Remember that \( \sum_{i=1}^{i=n} \) or \( \sum_{i=1}^{n} \) denotes the sum of a sequence of numbers, in this case from \(1\) to \(n\): \( \sum_{i=m}^{n} a_{i} = a_{m} + a_{m+1} + a_{m+2} + \cdots + a_{n-1} + a_{n}\).
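As a quick sketch, the summation notation above corresponds to a plain loop (the values of \(a_i\) here are hypothetical):

```python
a = [2, 4, 6, 8]   # a_1, a_2, a_3, a_4

# sum_{i=1}^{4} a_i, accumulated term by term
total = 0
for a_i in a:
    total += a_i

print(total)  # → 20
```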

name symbol
class width (amplitude) $$A$$
class mark $$CM$$
event $$E$$
sample size $$N,\ n$$
absolute cumulative frequency $$N_{i}$$
absolute frequency $$n_{i}$$
relative frequency $$f_{i}$$
relative cumulative frequency $$F_{i}$$
mean absolute deviation $$MD$$
probability of an event $$P\left( E \right)$$
probability of the complement of an event $$P\left( E^{\complement} \right)$$
union of probabilities $$P\left( A \cup B \right)$$
intersection of probabilities $$P\left( A \cap B \right)$$
(conditional) probability of \(A\) given \(B\) $$P\left( A \vert B \right)$$
range $$R$$
sample space $$S$$
sample standard deviation $$s$$
sample variance $$s^{2}$$
sample elements $$X_{i}$$
average $$\bar{x}$$
median $$\tilde{x}$$
mode $$\hat{x}$$
value $$x_{i}$$
Fisher's moment coefficient of skewness $$\gamma_{1}$$
average $$\mu$$
k-th central moment $$\mu_{k}$$
standard deviation $$\sigma$$
variance $$\sigma^{2}$$
sample space $$\Omega$$
impossible event $$\emptyset$$

Statistics

name equation
ceiling function $$\left\lceil -1.5 \right\rceil = -1$$ $$\left\lceil 1.5 \right\rceil = 2$$
floor function $$\left\lfloor -1.5 \right\rfloor = -2$$ $$\left\lfloor 1.5 \right\rfloor = 1$$
absolute frequency $$n_{i}$$
average $$\bar{x} = \frac{1}{n} \sum_{i = 1}^{n} x_{i}$$
average (grouped data) $$\bar{x} = \frac{\sum_{i = 1}^{n} \left(f_{i} \cdot x_{i}\right)}{\sum_{i = 1}^{n} f_{i}} $$
class mark $$CM = \frac{\mathrm{Upper\ Limit} + \mathrm{Lower\ Limit}}{2}$$
relative frequency $$f_{i} = \frac{n_{i}}{n}$$
absolute cumulative frequency $$N_{i} = \sum_{j\leq i} n_{j}$$
relative cumulative frequency $$F_{i} = \sum_{j \leq i} \frac{n_{j}}{n}$$
median $$\tilde{x} = \begin{cases} x_{(n+1)/2} & n \text{ is odd} \\ \frac{x_{(n/2)} + x_{(n/2)+1}}{2} & n \text{ is even}\end{cases}$$
median (grouped data) $$\tilde{x} = L_{i} + A \left( \frac{\frac{n}{2} - N_{i - 1}}{n_{i}} \right)$$
Here \(L_{i}\) is the lower limit of the median class, \(A\) its width, \(N_{i-1}\) the absolute cumulative frequency of the previous class, and \(n_{i}\) the absolute frequency of the median class.
mode (the most frequent value) $$\hat{x} = \underset{x_{i}}{\operatorname{argmax}}\, n_{i}$$
arguments of the maximum $$\begin{split}\operatorname{argmax}_{S} f &:= \underset{x \in S}{\operatorname{argmax}} f\left( x \right) \\ &:= \left\lbrace x \in S \mid f(s) \leq f(x)\ \forall s \in S \right\rbrace\end{split}$$
It is defined as the set of points \(x\) in \(S\) at which \(f\) attains its maximum: those \(x\) such that \(f(s) \leq f(x)\) for every \(s\) in \(S\).
mode (grouped data) $$\hat{x} = L_{i} + A \left( \frac{f_{i} - f_{i - 1}}{\left( f_{i} - f_{i - 1} \right) + \left( f_{i} - f_{i + 1} \right)} \right)$$
range $$R = x_{\max} - x_{\min}$$
variance $$\sigma^{2} = \frac{1}{n} \sum_{i = 1}^{n} \left(x_{i} - \bar{x}\right)^{2}$$
variance (less rounding error) $$\sigma^{2} = \frac{1}{n} \left\lbrack \sum_{i = 1}^{n} x_{i}^{2} - \frac{1}{n} \left(\sum_{i = 1}^{n}x_{i}\right)^{2} \right\rbrack$$
standard deviation $$\sigma \equiv \sqrt{\sigma^{2}}$$
sample average $$\bar{x} = \sum_{i = 1}^{m} x_{i} f \left( x_{i} \right)$$
sample variance $$s^{2} = \frac{n}{n-1} \sum_{i = 1}^{m} \left(x_{i} - \bar{x}\right)^{2}f\left( x_{i} \right)$$
sample variance (less rounding error) $$s^{2} = \frac{1}{n-1} \left\lbrace \sum_{i = 1}^{m} x_{i}^{2}\, n f\left( x_{i} \right) - \frac{1}{n} \left\lbrack \sum_{i = 1}^{m} x_{i}\, n f \left( x_{i} \right)\right\rbrack^{2} \right\rbrace$$
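The central tendency and dispersion formulas above can be checked directly in Python; the data set below is hypothetical, and the median and mode are compared against the standard-library `statistics` module:

```python
import statistics

x = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample, n = 8
n = len(x)

mean = sum(x) / n                                       # x-bar
med = statistics.median(x)                              # even n: mean of middle two
mode = statistics.mode(x)                               # most frequent value
rng = max(x) - min(x)                                   # range R
var_pop = sum((xi - mean) ** 2 for xi in x) / n         # population variance sigma^2
var_samp = sum((xi - mean) ** 2 for xi in x) / (n - 1)  # sample variance s^2
```

For this sample the mean is 5.0, the median 4.5, the mode 4, and the range 7.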

Measures of statistical dispersion

Mean Absolute Deviation Standard deviation Variance
Individual data $$MD = \frac{\underset{i = 1}{\overset{n}{\sum}}\vert x_{i} - \bar{x} \vert}{n}$$ $$\sigma = \sqrt{\frac{\underset{i = 1}{\overset{n}{\sum}}\left( x_{i} - \bar{x} \right)^{2}}{n}}$$ $$\sigma^{2} = \frac{\underset{i = 1}{\overset{n}{\sum}}\left( x_{i} - \bar{x} \right)^{2}}{n}$$
Frequency distribution $$MD = \frac{\underset{i = 1}{\overset{n}{\sum}} f_{i} \cdot \vert x_{i} - \bar{x} \vert}{n}$$ $$\sigma = \sqrt{\frac{\underset{i = 1}{\overset{n}{\sum}}\left( f_{i} \right)\left( x_{i} - \bar{x} \right)^{2}}{n}}$$ $$\sigma^{2} = \frac{\underset{i = 1}{\overset{n}{\sum}}\left( f_{i} \right)\left( x_{i} - \bar{x} \right)^{2}}{n}$$
Grouped data $$MD = \frac{\underset{i = 1}{\overset{n}{\sum}} f_{CM_{i}} \cdot \vert CM_{i} - \bar{x} \vert}{n}$$ $$\sigma = \sqrt{\frac{\underset{i = 1}{\overset{n}{\sum}}\left( f_{CM_{i}} \right)\left( CM_{i} - \bar{x} \right)^{2}}{n}}$$ $$\sigma^{2} = \frac{\underset{i = 1}{\overset{n}{\sum}}\left( f_{CM_{i}} \right)\left( CM_{i} - \bar{x} \right)^{2}}{n}$$
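A minimal sketch of the individual-data column, using a hypothetical data set:

```python
x = [3, 5, 7, 9]   # hypothetical individual data
n = len(x)
mean = sum(x) / n                                 # x-bar = 6.0

md = sum(abs(xi - mean) for xi in x) / n          # mean absolute deviation
sigma2 = sum((xi - mean) ** 2 for xi in x) / n    # variance
sigma = sigma2 ** 0.5                             # standard deviation
```

The frequency-distribution and grouped-data rows follow the same pattern, weighting each deviation by its frequency.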

Quantiles

Quartiles Deciles Percentiles
Position $$q_{i} = \left( n + 1 \right) \frac{i}{4},\ i \in \left\lbrace 0, 1, \ldots, 4 \right\rbrace$$ $$d_{i} = \left( n + 1 \right) \frac{i}{10},\ i \in \left\lbrace 0, 1, \ldots, 10 \right\rbrace$$ $$p_{i} = \left( n + 1 \right) \frac{i}{100},\ i \in \left\lbrace 0, 1, \ldots, 100 \right\rbrace$$
Value $$Q_{i} = x_{\left\lfloor q_{i} \right\rfloor} + \left( q_{i} - \left\lfloor q_{i} \right\rfloor \right) \left( x_{\left\lfloor q_{i} \right\rfloor + 1} - x_{\left\lfloor q_{i} \right\rfloor} \right)$$ $$D_{i} = x_{\left\lfloor d_{i} \right\rfloor} + \left( d_{i} - \left\lfloor d_{i} \right\rfloor \right) \left( x_{\left\lfloor d_{i} \right\rfloor + 1} - x_{\left\lfloor d_{i} \right\rfloor} \right)$$ $$P_{i} = x_{\left\lfloor p_{i} \right\rfloor} + \left( p_{i} - \left\lfloor p_{i} \right\rfloor \right) \left( x_{\left\lfloor p_{i} \right\rfloor + 1} - x_{\left\lfloor p_{i} \right\rfloor} \right)$$
Range $$IQR = Q_{3} - Q_{1}$$ $$IDR = D_{9} - D_{1},\ \mathrm{(most\ common)}$$ $$IDR = D_{b} - D_{a}$$ $$IPR = P_{90} - P_{10},\ \mathrm{(most\ common)}$$ $$IPR = P_{b} - P_{a}$$
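The position-then-interpolate scheme in the table can be sketched as one function that covers quartiles (\(m = 4\)), deciles (\(m = 10\)), and percentiles (\(m = 100\)); the data set and function name are hypothetical:

```python
def quantile_value(data, i, m):
    """Value at position q = (n + 1) * i / m with linear interpolation.

    data must be sorted; positions are 1-indexed as in the table above.
    """
    q = (len(data) + 1) * i / m
    lo = int(q)                       # floor of the position
    frac = q - lo
    if lo < 1:                        # position before the first element
        return data[0]
    if lo >= len(data):               # position past the last element
        return data[-1]
    return data[lo - 1] + frac * (data[lo] - data[lo - 1])

x = sorted([1, 3, 5, 7, 9, 11, 13])   # hypothetical data, n = 7
q1 = quantile_value(x, 1, 4)          # first quartile
q3 = quantile_value(x, 3, 4)          # third quartile
iqr = q3 - q1                         # interquartile range
```

Note that software packages differ in their quantile conventions; this sketch follows the \((n+1)\,i/m\) positions given above.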

Histogram

Number of bins and width

name equation notes
number of bins from width $$ k = \left\lceil \frac{\max x - \min x}{h} \right\rceil $$ $$k = \mathrm{number\ of\ bins},\ h = \mathrm{bin\ width}$$
Square-root choice $$k = \left\lceil \sqrt{n} \right\rceil$$
Sturges' formula $$k = \left\lceil \log_{2} n \right\rceil + 1 = \left\lceil \frac{\log_{10} n}{\log_{10} 2} \right\rceil + 1$$ Derived from a binomial distribution and implicitly assumes an approximately normal distribution.
Rice rule $$k = \left\lceil 2\sqrt[3]{n} \right\rceil$$ Alternative to Sturges' rule.
Doane's formula $$k = 1 + \log_{2}\left( n \right) + \log_{2}\left( 1 + \frac{\vert g_{1} \vert}{\sigma_{g_{1}}} \right)$$ $$\sigma_{g_{1}} = \sqrt{\frac{6\left( n - 2 \right)}{\left( n + 1 \right)\left( n + 3 \right)}}$$ Modification of Sturges' formula which attempts to improve its performance with non-normal data. \(g_{1}\) is the estimated 3rd-moment skewness of the distribution.
Scott's normal reference rule $$h = \frac{3.5 \hat{\sigma}}{\sqrt[3]{n}}$$ Where \({\hat {\sigma }}\) is the sample standard deviation.
Freedman-Diaconis' choice $$h = 2\frac{\mathrm{IQR}\left( x \right)}{\sqrt[3]{n}}$$ Replaces \(3.5 \sigma \) of Scott's rule with \(2\ \mathrm{IQR}\), which is less sensitive than the standard deviation to outliers in data.
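For comparison, the three simplest bin-count rules above evaluated for a hypothetical sample size of \(n = 100\):

```python
import math

n = 100   # hypothetical sample size

k_sqrt = math.ceil(math.sqrt(n))          # square-root choice
k_sturges = math.ceil(math.log2(n)) + 1   # Sturges' formula
k_rice = math.ceil(2 * n ** (1 / 3))      # Rice rule
```

The rules disagree even here: the square-root choice and the Rice rule both give 10 bins, while Sturges' formula gives 8.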

Probability

name equation
probability of an event $$P\left( E \right) = \lim_{n \to \infty} f\left( E \right) = \lim_{n \to \infty} \frac{n_{E}}{n}$$
union of probabilities $$P\left(A \cup B\right) = \begin{cases} P\left(A\right) + P\left(B\right) - P\left(A \cap B\right) & A \cap B \neq \emptyset \\ P\left(A\right) + P\left(B\right) & A \cap B = \emptyset\end{cases}$$
complement probability $$P\left( E^{\complement} \right) = 1 - P\left( E \right)$$ $$P\left( \neg E \right) = 1 - P\left( E \right)$$
intersection of disjoint events $$P\left( A \cap B \right) = 0$$
intersection of independent events $$P\left( A \cap B \right) = P\left( A \right)P\left( B \right)$$
intersection of dependent events $$\begin{split}P\left( A \cap B \right) &= P\left( B \vert A \right) P\left( A\right)\\ &= P\left( A \vert B \right) P\left( B\right)\end{split}$$ $$\begin{split}P\left( A \cap B^{\complement} \right) &= P\left( B^{\complement} \vert A \right) P\left( A \right)\\ &= P\left( A \vert B^{\complement} \right) P\left( B^{\complement} \right)\end{split}$$ $$\begin{split}P\left( A^{\complement} \cap B \right) &= P\left( B \vert A^{\complement} \right) P\left( A^{\complement} \right)\\ &= P\left( A^{\complement} \vert B \right) P\left( B\right)\end{split}$$ $$\begin{split}P\left( A^{\complement} \cap B^{\complement} \right) &= P\left( B^{\complement} \vert A^{\complement} \right) P\left( A^{\complement} \right)\\ &= P\left( A^{\complement} \vert B^{\complement} \right) P\left( B^{\complement} \right)\end{split}$$
complement of dependent events $$P\left( B \vert A \right) = 1 - P\left( B^{\complement} \vert A \right)$$ $$P\left( B \vert A^{\complement} \right) = 1 - P\left( B^{\complement} \vert A^{\complement} \right)$$
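A worked example of the union, complement, and conditional rules above, using one roll of a fair die (exact fractions keep the arithmetic transparent):

```python
from fractions import Fraction

# One roll of a fair die.
# A = "even" = {2, 4, 6};  B = "greater than 3" = {4, 5, 6}
p_a = Fraction(3, 6)
p_b = Fraction(3, 6)
p_a_and_b = Fraction(2, 6)           # A ∩ B = {4, 6}

p_a_or_b = p_a + p_b - p_a_and_b     # inclusion-exclusion → 2/3
p_not_a = 1 - p_a                    # complement rule → 1/2
p_b_given_a = p_a_and_b / p_a        # P(B|A) = P(A ∩ B)/P(A) → 2/3
```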

Permutations and Combinations

Remember that permutations take order into account, while combinations do not. Here, \(n\) represents the total number of elements and \(k\) the number of selected elements.

name equation
permutations of a set of \( n \) different elements taking one subset of \( k \) chosen elements without repetition $$P_k^n = P(n,k)= \frac{n!}{\left( n - k\right)!}$$
permutations of \(n\) distinct elements (arranging all) $$n! = 1 \cdot 2 \cdot 3 \cdots n$$
ordered sequences of length \(k\) formed from \(n\) distinct elements (with repetition) $$n^k$$
permutations of \(n\) elements with repetitions (indistinguishable items). If there are \(r\) types with counts \(n_1, n_2, \ldots, n_r\), with \(n_1 + \ldots + n_r = n\) $$P^{n}_{n_{1}, n_{2},..., n_{r}} = \binom{n}{n_1, n_2, \ldots, n_r} = \frac{n!}{n_{1}!n_{2}!\cdots n_{r}!}$$
circular permutations of \(n\) distinct elements (rotations considered the same) $$(n - 1)!$$
combinations of a set of \( n \) different elements taking one subset of \( k \) chosen elements without repetition $$C^n_k = C(n,k) = {n \choose k} = \frac{n!}{k!\left( n-k\right)!}$$
combinations of a set of \( n \) different elements taking one subset of \( k \) chosen elements with repetition $${n + k - 1 \choose k} = \frac{\left( n + k - 1\right)!}{k!\left( n - 1\right)!} $$
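The counting formulas above map directly onto the standard library's `math.perm`, `math.comb`, and `math.factorial` (Python 3.8+); the values of \(n\) and \(k\) are hypothetical:

```python
import math

n, k = 5, 2   # hypothetical: select 2 out of 5 distinct elements

perms = math.perm(n, k)               # n!/(n-k)! = 20
combs = math.comb(n, k)               # n!/(k!(n-k)!) = 10
combs_rep = math.comb(n + k - 1, k)   # combinations with repetition = 15
circular = math.factorial(n - 1)      # circular permutations of n elements = 24
```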

Bayesian probability

name equation
conditional probability of \(A\) given \(B\) $$P\left( A \vert B \right) = \frac{P\left( A \cap B \right)}{P\left( B \right)}$$
Bayes' theorem (special case) $$\begin{split}P\left( A \vert B \right) &= \frac{P\left( B \vert A\right) P\left(A\right)}{P\left( B \right)} \\ &= \frac{P\left( B \vert A \right)P\left( A \right)}{P\left( B \vert A \right)P\left( A \right) + P(B \vert A^{\complement}) P\left( A^{\complement} \right)}\end{split}$$
Bayes' theorem (general) $$P\left( A_{k} \vert B \right) = \frac{P\left( B \vert A_{k} \right)P\left( A_{k} \right)}{\sum_{j}P\left( B \vert A_{j} \right)P\left(A_{j} \right)}$$
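A numerical sketch of the special case of Bayes' theorem, with hypothetical screening-test numbers (prevalence, sensitivity, and false-positive rate are made up for illustration):

```python
p_a = 0.01              # P(A): prior (e.g. prevalence)
p_b_given_a = 0.95      # P(B|A): e.g. test sensitivity
p_b_given_not_a = 0.05  # P(B|A^c): e.g. false-positive rate

# Denominator: law of total probability, as in the expanded form above.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# P(A|B): the posterior is only about 16% despite the 95% sensitivity,
# because the prior P(A) is small.
p_a_given_b = p_b_given_a * p_a / p_b
```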

Probability distributions

name equation
probability function (for discrete variables) $$f\left( x \right) = P\left( X = x \right)$$
probability density (for continuous variables) $$f\left( x \right) = \frac{d F\left( x \right)}{dx}$$
k-th raw moment (discrete) $$E\left( X^{k} \right) = \sum_{i}x_{i}^{k}f\left( x_{i} \right)$$
k-th raw moment (continuous) $$E\left( X^{k} \right) = \int_{-\infty}^{\infty} x^{k} f\left( x \right)dx$$
k-th central moment (discrete) $$\mu_{k} = E\left\lbrack \left( X - \mu \right)^{k} \right\rbrack = \sum_{i} \left(x_{i} - \mu \right)^{k}f\left( x_{i} \right)$$
k-th central moment (continuous) $$\mu_{k} = E\left\lbrack \left( X - \mu \right)^{k} \right\rbrack = \int_{-\infty}^{\infty} \left( x - \mu \right)^{k} f\left( x \right)dx$$
Fisher's moment coefficient of skewness $$\gamma_{1} = \frac{\mu_{3}}{\sigma^{3}}$$
Moment-generating function (discrete) $$G\left( t \right) = E\left( e^{tX} \right) = \sum_{i} e^{t x_{i}}f\left( x_{i} \right)$$
Moment-generating function (continuous) $$G\left( t \right) = E\left( e^{tX} \right) = \int_{-\infty}^{\infty} e^{t x}f\left( x \right) dx$$
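The discrete moment formulas above can be sketched with a fair die, where every outcome has \(f(x_i) = 1/6\); the skewness comes out zero because the distribution is symmetric:

```python
# Discrete distribution: one roll of a fair die.
xs = [1, 2, 3, 4, 5, 6]
f = 1 / 6                                  # f(x_i), equal for every outcome

mu = sum(x * f for x in xs)                # first raw moment E(X) = 3.5
mu2 = sum((x - mu) ** 2 * f for x in xs)   # second central moment (variance) = 35/12
mu3 = sum((x - mu) ** 3 * f for x in xs)   # third central moment (0 by symmetry)
gamma1 = mu3 / mu2 ** 1.5                  # Fisher's skewness coefficient
```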

See also

Bayes' Theorem

Box-and-Whisker Diagram

Probability distributions