Introductory Probability Course Note

Made by Mike_Zhang


All posts:
Introductory Probability Course Note
Python Basic Note
Limits and Continuity Note
Calculus for Engineers Course Note
Introduction to Data Analytics Course Note
Introduction to Computer Systems Course Note


Personal notes, for reference only
FOR REFERENCE ONLY

Course note of AMA1104 Introductory Probability, The Hong Kong Polytechnic University, 2021.


1. Probability

1.1 Permutations Rule

The number of ways to arrange $r$ items selected from $n$ distinct items, where order matters:

$$P^n_r=\frac{n!}{(n-r)!}$$


1.2 Combinations Rule

The number of ways to choose $r$ items from $n$ distinct items, where order does not matter:

$$C^n_r=\binom{n}{r}=\frac{n!}{r!(n-r)!}$$

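As a quick sanity check, Python's built-in `math` module implements both counting rules directly (a minimal sketch with arbitrary small numbers):

```python
import math

# Permutations rule: ordered arrangements of r items chosen from n
# P(5, 2) = 5!/(5-2)! = 20
assert math.perm(5, 2) == 20

# Combinations rule: unordered selections of r items chosen from n
# C(5, 2) = 5!/(2!*3!) = 10
assert math.comb(5, 2) == 10
```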

1.3 Collectively Exhaustive

A set of events $\{A_1, A_2,\dots, A_n\}$ is said to be collectively exhaustive if at least one of the events must occur (the list covers all possible outcomes); the sample space is then

$$S=A_1\cup A_2\cup\dots\cup A_n$$


1.4 Joint Probability

The probability of the intersection of two events is called their joint probability, which is $P(A\cap B)$, the probability that both $A$ and $B$ occur.


1.5 Union Probability

The probability of the union of two events is called their union probability, which is:

$$P(A\cup B)=P(A)+P(B)-P(A\cap B)$$


1.6 Mutually Exclusive

Two events are said to be mutually exclusive if, when one of the two events occurs in an experiment, the other cannot occur, which is:

$$P(A\cap B)=0$$

If the events $A$ and $B$ are mutually exclusive, the probability that either event occurs is

$$P(A\cup B)=P(A)+P(B)$$


1.7 Conditional Probability

The probability of an event $A$ given that an event $B$ has occurred is called the conditional probability of $A$ given $B$, denoted by the symbol $P(A|B)$ and read as 'the probability of $A$ given that $B$ has already occurred'.

If $A$ and $B$ are two events with $P(A)\neq 0$ and $P(B)\neq 0$, then

$$P(A|B)=\frac{P(A\cap B)}{P(B)},\qquad P(B|A)=\frac{P(A\cap B)}{P(A)}$$

The probability that both of the two events $A$ and $B$ occur is

$$P(A\cap B)=P(B)P(A|B)=P(A)P(B|A)$$


1.8 Independent

Two events $A$ and $B$ are said to be independent if the occurrence of one does not affect the probability of the occurrence of the other.

$A$ and $B$ are independent events if:

$$P(A|B)=P(A)\quad\text{or}\quad P(B|A)=P(B)$$

If two events $A$ and $B$ are independent, then:

$$P(A\cap B)=P(A)P(B)$$


1.9 Law of Total Probability

Assume that $B_1,B_2,\dots,B_n$ are collectively exhaustive events where $P(B_i)\gt 0$ for $i=1,2,\dots,n$, and $B_i$ and $B_j$ are mutually exclusive events for $i\neq j$.
Then for any event $A$:

$$P(A)=\sum_{i=1}^n P(B_i)P(A|B_i)=P(B_1)P(A|B_1)+P(B_2)P(A|B_2)+\dots+P(B_n)P(A|B_n)$$


1.10 Bayes’ Theorem

Suppose that $B_1,B_2,\dots,B_n$ are $n$ mutually exclusive and collectively exhaustive events. Then for any event $A$ with $P(A)\gt 0$:

$$P(B_k|A)=\frac{P(B_k\cap A)}{P(A)}=\frac{P(B_k)P(A|B_k)}{\sum_{i=1}^n P(B_i)P(A|B_i)}$$

$\because P(B_k\cap A) = P(B_k)\cdot P(A|B_k)$, based on the Conditional Probability, and

$P(A)=P(B_1)P(A|B_1)+P(B_2)P(A|B_2)+\dots+P(B_n)P(A|B_n)$, based on the Law of Total Probability
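The theorem can be checked numerically. The sketch below uses hypothetical numbers: a test with 99% sensitivity, a 5% false-positive rate, and a 1% disease prevalence:

```python
# Hypothetical numbers: a test with 99% sensitivity, a 5% false-positive
# rate, and a disease prevalence of 1%.
p_d = 0.01        # P(B1): person has the disease
p_pos_d = 0.99    # P(A|B1): positive test given disease
p_pos_nd = 0.05   # P(A|B2): positive test given no disease

# Law of Total Probability: P(A) = P(B1)P(A|B1) + P(B2)P(A|B2)
p_pos = p_d * p_pos_d + (1 - p_d) * p_pos_nd

# Bayes' Theorem: P(B1|A) = P(B1)P(A|B1) / P(A)
p_d_pos = p_d * p_pos_d / p_pos
assert abs(p_d_pos - 1 / 6) < 1e-6   # only ~16.7%, despite the accurate test
```

The low posterior probability illustrates why the prior $P(B_1)$ matters as much as the test accuracy.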


2. Probability Distribution

2.1 Discrete Random Variable

2.1.1 Probability Distribution

  • $0\le P(x)\le 1$ for each value of $x$;
  • $\sum P(x)=1$;
  • $P(x)=P(X=x)$

Mean or Expected value: $\mu=E(X)=\sum xP(x)$

Variance: $\sigma^2=Var(X)=\sum x^2P(x)-\mu^2$

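A short sketch of the two formulas applied to a small hypothetical probability table:

```python
# Hypothetical distribution: P(X=0)=0.2, P(X=1)=0.5, P(X=2)=0.3
dist = {0: 0.2, 1: 0.5, 2: 0.3}

assert abs(sum(dist.values()) - 1) < 1e-9                # sum of P(x) = 1
mean = sum(x * p for x, p in dist.items())               # E(X) = sum of x*P(x)
var = sum(x**2 * p for x, p in dist.items()) - mean**2   # Var(X) = E(X^2) - mu^2
assert abs(mean - 1.1) < 1e-9
assert abs(var - 0.49) < 1e-9
```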

2.1.2 Binomial Probability Distribution

$X\sim Bin(n,p)$

$$P(X=x)=\binom{n}{x}p^x(1-p)^{n-x},\quad x=0,1,\dots,n$$

$n$ = total number of trials
$p$ = probability of success
$x$ = number of successes in $n$ trials

Mean or Expected value: $E(X)=np$

Variance: $Var(X)=np(1-p)$
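The PMF and the mean formula can be verified against each other in plain Python (hypothetical parameters $n=10$, $p=0.3$):

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

assert abs(sum(pmf) - 1) < 1e-9                  # the PMF sums to 1
mean = sum(x * q for x, q in enumerate(pmf))
assert abs(mean - n * p) < 1e-9                  # E(X) = np
```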


2.1.3 Poisson Probability Distribution

$X\sim Poisson(\lambda)$

$$P(X=x)=\frac{\lambda^x e^{-\lambda}}{x!},\quad x=0,1,2,\dots$$

where $\lambda$ is the mean number of occurrences in that interval

Mean or Expected value: $E(X)=\lambda$

Variance: $Var(X)=\lambda$

Poisson Approximation to the Binomial Distribution:
when the number of trials $n$ is large and the probability $p$ is small (generally such that $\mu=np\le7$), $Bin(n,p)$ can be approximated by $Poisson(\lambda)$ with:

$X$ = number of successes from $n$ independent trials
$p$ = probability of success
$\lambda = \mu = np$
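A sketch of the approximation with hypothetical parameters ($n=1000$, $p=0.003$, so $\lambda=np=3$), comparing the two PMFs term by term:

```python
import math

def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    return lam**x * math.exp(-lam) / math.factorial(x)

# Large n, small p: Bin(1000, 0.003) vs Poisson(lambda = np = 3)
n, p = 1000, 0.003
for x in range(8):
    assert abs(binom_pmf(x, n, p) - poisson_pmf(x, n * p)) < 0.005
```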


2.1.4 Negative Binomial Probability Distribution

The probability that $k$ independent trials are required to accumulate a total of $r$ successes

$X \sim NegBin (r,p)$

$$P(X=k)=\binom{k-1}{r-1}p^r(1-p)^{k-r},\quad k=r,r+1,\dots$$

$p$ = probability of each trial being a success
$k$ = number of independent trials, NOT fixed
$r$ = number of successes, fixed

Mean or Expected value: $E(X)=\dfrac{r}{p}$

Variance: $Var(X)=\dfrac{r(1-p)}{p^2}$


2.1.5 Geometric Probability Distribution

The probability that the first occurrence of success requires $k$ independent trials
(i.e. $X \sim NegBin (1,p)$)

$X \sim Geo (p)$

$$P(X=k)=p(1-p)^{k-1},\quad k=1,2,\dots$$

$p$ = probability of each trial being a success
$k$ = number of independent trials, NOT fixed

Mean or Expected value: $E(X)=\dfrac{1}{p}$

Variance: $Var(X)=\dfrac{1-p}{p^2}$
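The mean $1/p$ can be checked by simulation (a sketch with a fixed seed and hypothetical $p=0.25$):

```python
import random

random.seed(1104)

def trials_until_first_success(p):
    """Simulate X ~ Geo(p): number of trials up to the first success."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

p = 0.25
samples = [trials_until_first_success(p) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
assert abs(sample_mean - 1 / p) < 0.1   # E(X) = 1/p = 4
```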


2.1.6 Hypergeometric Probability Distribution

When sampling is without replacement, and the number of elements $N$ in the population is small (or when the sample size $n$ is large relative to $N$), the number of "successes" in a random sample of $n$ items has a hypergeometric probability distribution.

$X\sim Hp(x)$:

$$P(X=x)=\frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$$

$N$ = number of elements in the population
$r$ = number of successes in the population
$n$ = sample size (drawn from $N$)
$x$ = number of successes in the sample (successful draws from $N$)

Mean or Expected value: $E(X)=n\cdot\dfrac{r}{N}$

Variance: $Var(X)=n\cdot\dfrac{r}{N}\bigg(1-\dfrac{r}{N}\bigg)\bigg(\dfrac{N-n}{N-1}\bigg)$

where the $\bigg(\dfrac{N-n}{N-1}\bigg)$ is the finite population correction factor
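A sketch checking that the PMF sums to 1 and that the mean equals $n\cdot r/N$ (hypothetical $N=20$, $r=7$, $n=5$):

```python
import math

def hypergeom_pmf(x, N, r, n):
    """P(X = x): x successes in n draws without replacement."""
    return math.comb(r, x) * math.comb(N - r, n - x) / math.comb(N, n)

N, r, n = 20, 7, 5
pmf = [hypergeom_pmf(x, N, r, n) for x in range(min(r, n) + 1)]

assert abs(sum(pmf) - 1) < 1e-9
mean = sum(x * q for x, q in enumerate(pmf))
assert abs(mean - n * r / N) < 1e-9          # E(X) = n*r/N = 1.75
```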


2.2 Continuous Random Variable

2.2.1 Probability Distribution

$P(X=c) = 0$
$P(X\lt c)=P(X\le c)$

2.2.2 Normal Distribution

PDF:

$$f(x)=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}},\quad -\infty\lt x\lt\infty$$

$X$ follows a normal distribution with mean $\mu$, standard deviation $\sigma$, and variance $\sigma^2$

$X\sim N(\mu,\sigma^2)$

Mean or Expected value: $E(X)=\mu$

Variance: $Var(X)=\sigma^2$


2.2.3 Standard Normal Distribution

Normal Distribution with $\mu = 0$ and $\sigma=1$

$Z\sim N(0,1)$

Standardizing a Normal Distribution: converting an $X$ value to a $Z$ value:

$$Z=\frac{X-\mu}{\sigma}$$

where $X\sim N(\mu,\sigma^2)$

Normal Distribution as an Approximation to Binomial Distribution:

when both $np\ge 5$ and $n(1-p)\ge 5$

3 Steps:

  1. Get $\mu=np$ and $\sigma=\sqrt{np(1-p)}$ for the binomial distribution;
  2. Convert the discrete random variable to a continuous random variable (apply the continuity correction, $\pm 0.5$);
  3. Compute the required probability using the normal distribution.
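The three steps can be sketched with the standard library's `statistics.NormalDist` (hypothetical example: $X\sim Bin(100, 0.5)$, approximating $P(X\le 55)$):

```python
from statistics import NormalDist

# Step 1: mu and sigma of Bin(100, 0.5)  (np = n(1-p) = 50, both >= 5)
n, p = 100, 0.5
mu = n * p
sigma = (n * p * (1 - p)) ** 0.5

# Step 2: continuity correction, P(X <= 55) -> P(Y <= 55.5)
# Step 3: compute with the normal CDF
approx = NormalDist(mu, sigma).cdf(55.5)
assert abs(approx - 0.8643) < 0.001   # exact binomial value is ~0.8644
```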


3. Sampling Distribution & Estimation

3.1 Sampling Distribution of the Sample Mean

Sampling Distribution of $\bar{X}$

Mean: $\mu_{\bar{X}}=\mu$

Standard Error: $\sigma_{\bar{X}}=\dfrac{\sigma}{\sqrt{n}}$

When $n/N \gt 0.05$ ($N$ for population size): $\sigma_{\bar{X}}=\dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}$

Shape: approximately normal when the population is normal, or when the sample size $n$ is large (by the Central Limit Theorem)

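A simulation sketch of these facts for a uniform population (fixed seed; for Uniform(0,1) the values $\mu=1/2$ and $\sigma^2=1/12$ are standard):

```python
import random

random.seed(1104)

# Population: Uniform(0, 1), so mu = 1/2 and sigma^2 = 1/12
n, reps = 30, 20_000
means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]

mu, sigma = 0.5, (1 / 12) ** 0.5
grand_mean = sum(means) / reps
se = (sum((m - grand_mean) ** 2 for m in means) / reps) ** 0.5

assert abs(grand_mean - mu) < 0.01            # mean of X-bar is about mu
assert abs(se - sigma / n ** 0.5) < 0.01      # SE is about sigma / sqrt(n)
```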

3.2 Sampling Distribution of the Sample Proportion

Sampling Distribution of $\bar{p}$

Mean: $\mu_{\bar{p}}=p$

Standard Error: $\sigma_{\bar{p}}=\sqrt{\dfrac{p(1-p)}{n}}$

When $n/N \gt 0.05$ ($N$ for population size): $\sigma_{\bar{p}}=\sqrt{\dfrac{p(1-p)}{n}}\sqrt{\dfrac{N-n}{N-1}}$

Shape: approximately normal when $np\ge 5$ and $n(1-p)\ge 5$


3.3 Sampling Distribution of the Sample Variance

Sampling Distribution of $s^2$

For a random sample of size $n$ from a normal population $N(\mu,\sigma^2)$, with sample variance

$$s^2=\frac{1}{n-1}\sum_{i=1}^n (X_i-\bar{X})^2$$

So, the statistic

$$\chi^2=\frac{(n-1)s^2}{\sigma^2}$$

has a chi-square($\chi^2$) distribution with $n-1$ degrees of freedom

Mean: $E(\chi^2)=n-1$

Variance: $Var(\chi^2)=2(n-1)$


3.4 Confidence interval of population mean $\mu$ with known Variance

a $(1-\alpha)100\%$ C.I.:

$$\bar{x}\pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

$\sigma$: population standard deviation
$n$: sample size
$Z_{\alpha/2}$: from the standard normal distribution table with $\alpha/2$ upper-tail probability


3.5 Confidence interval of population mean $\mu$ with unknown Variance

a $(1-\alpha)100\%$ C.I.:

$$\bar{x}\pm t_{\alpha/2,n-1}\frac{s}{\sqrt{n}}$$

$s$: sample standard deviation
$n$: sample size
$t_{\alpha/2,n-1}$: from the $t$ distribution table with $\alpha/2$ upper-tail probability and $n-1$ degrees of freedom
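Both intervals follow the same pattern. A sketch of the known-variance case using the standard library's `statistics.NormalDist` (hypothetical sample values):

```python
from statistics import NormalDist

def ci_known_sigma(xbar, sigma, n, alpha=0.05):
    """(1 - alpha)*100% C.I. for mu when sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}, ~1.96 for 95%
    half = z * sigma / n ** 0.5
    return (xbar - half, xbar + half)

# Hypothetical sample: x-bar = 50, sigma = 10, n = 100
lo, hi = ci_known_sigma(50, 10, 100)
assert abs(lo - 48.04) < 0.01 and abs(hi - 51.96) < 0.01   # 95% C.I.
```

For the unknown-variance case, swap in $s$ and the $t_{\alpha/2,n-1}$ critical value (the standard library has no $t$ quantile function, so a table or SciPy would be needed).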


References

Slides of AMA1104 Introductory Probability, The Hong Kong Polytechnic University.


Personal notes, for reference only; please credit the source when reposting
FOR REFERENCE ONLY

Made by Mike_Zhang




Introductory Probability Course Note
https://ultrafish.io/post/introductory-probability-course-note/
Author: Mike_Zhang
Posted on: December 12, 2021