Probability: Difference between revisions

Revision as of 17:16, 7 November 2019

Calculus-based Probability

Basics

Axioms of Probability

\(\displaystyle 0 \leq P(E) \leq 1\)
\(\displaystyle P(S) = 1\) where \(\displaystyle S\) is your sample space
For mutually exclusive events \(\displaystyle E_1, E_2, \ldots\), \(\displaystyle P\left(\bigcup_{i=1}^\infty E_i\right) = \sum_{i=1}^\infty P(E_i)\)

Monotonicity

For all events \(\displaystyle A\), \(\displaystyle B\), \(\displaystyle A \subset B \implies P(A) \leq P(B)\)

Proof
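
One standard argument, using only the axioms above (a sketch):
\(\displaystyle B = A \cup (B \setminus A)\), where \(\displaystyle A\) and \(\displaystyle B \setminus A\) are mutually exclusive.
\(\displaystyle P(B) = P(A) + P(B \setminus A) \geq P(A)\), since \(\displaystyle P(B \setminus A) \geq 0\).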

Expectation and Variance

Some definitions and properties.

Definitions

Let \(\displaystyle X \sim D\) for some distribution \(\displaystyle D\). Let \(\displaystyle S\) be the support or domain of your distribution.

\(\displaystyle E(X) = \sum_S xp(x)\) or \(\displaystyle \int_S xp(x)dx\)
\(\displaystyle Var(X) = E[(X-E(X))^2] = E(X^2) - (E(X))^2\)
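
The two definitions can be checked on a small discrete example. The sketch below assumes a fair six-sided die (an illustrative choice, not taken from these notes) and uses exact fractions, so both forms of the variance can be compared without rounding.

```python
from fractions import Fraction

def expectation(pmf, f=lambda x: x):
    """E[f(X)] for a discrete distribution given as {value: probability}."""
    return sum(f(x) * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) computed from the definition E[(X - E(X))^2]."""
    mu = expectation(pmf)
    return expectation(pmf, lambda x: (x - mu) ** 2)

# Fair six-sided die: an assumed example distribution.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean = expectation(die)
var_def = variance(die)
# Shortcut form: E(X^2) - (E(X))^2
var_shortcut = expectation(die, lambda x: x * x) - mean ** 2

print(mean, var_def, var_shortcut)  # 7/2 35/12 35/12
```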

Total Expectation

\(\displaystyle E(X) = E(E(X|Y))\)
Dr. Xu refers to this as the smooth property.

Proof

\(\displaystyle E(X) = \int_S xp(x)dx = \int_x x \int_y p(x,y)dy dx = \int_x x \int_y p(x|y)p(y)dy dx = \int_y\left(\int_x x p(x|y)dx\right)p(y)dy = \int_y E(X|Y=y)p(y)dy = E(E(X|Y))\)
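
The identity can also be verified numerically in the discrete case, where the integrals become sums. The joint pmf below is hypothetical, chosen only so the arithmetic is easy to follow.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint pmf p(x, y); the values are illustrative only.
joint = {
    (0, 0): Fraction(1, 8), (1, 0): Fraction(1, 8),
    (0, 1): Fraction(1, 4), (2, 1): Fraction(1, 2),
}

# Marginal p(y) = sum over x of p(x, y)
p_y = defaultdict(Fraction)
for (x, y), p in joint.items():
    p_y[y] += p

def cond_mean_x(y):
    """E(X | Y = y) = sum over x of x * p(x, y) / p(y)."""
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y[y]

lhs = sum(x * p for (x, _), p in joint.items())            # E(X)
rhs = sum(cond_mean_x(y) * py for y, py in p_y.items())    # E(E(X|Y))

print(lhs, rhs)  # 9/8 9/8
```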

Total Variance

\(\displaystyle Var(Y) = E(Var(Y|X)) + Var(E(Y|X))\)
This one is not used as often on tests as total expectation.

Proof
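
One standard derivation (a sketch), applying total expectation to \(\displaystyle Y\) and \(\displaystyle Y^2\):
\(\displaystyle Var(Y) = E(Y^2) - (E(Y))^2 = E(E(Y^2|X)) - (E(E(Y|X)))^2\)
\(\displaystyle E(E(Y^2|X)) = E(Var(Y|X) + (E(Y|X))^2) = E(Var(Y|X)) + E((E(Y|X))^2)\)
\(\displaystyle Var(Y) = E(Var(Y|X)) + E((E(Y|X))^2) - (E(E(Y|X)))^2 = E(Var(Y|X)) + Var(E(Y|X))\)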

Convergence

There are 4 types of convergence typically taught in undergraduate courses.
See Wikipedia Convergence of random variables

Almost Surely

\(\displaystyle P(\lim X_i = X^*) = 1\)

In Probability

For all \(\displaystyle \epsilon > 0\),
\(\displaystyle \lim_{i \rightarrow \infty} P(|X_i - X^*| \geq \epsilon) = 0\)

Implies Convergence in distribution
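
The definition can be illustrated by simulation. In the sketch below, the sequence is assumed to be the sample mean of n Uniform(0, 1) draws and the limit is the constant 0.5; both are illustrative choices, as are the tolerance and the number of trials. The estimated tail probability should shrink as n grows.

```python
import random

def tail_probability(n, eps, trials, rng):
    """Monte Carlo estimate of P(|mean of n Uniform(0,1) draws - 0.5| >= eps)."""
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        if abs(sample_mean - 0.5) >= eps:
            hits += 1
    return hits / trials

rng = random.Random(0)  # fixed seed so the run is reproducible
tail_probs = {
    n: tail_probability(n, eps=0.05, trials=1000, rng=rng)
    for n in (10, 100, 1000)
}
for n, p in tail_probs.items():
    print(n, p)
```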

In Distribution

Pointwise convergence of the cdf
A sequence of random variables \(\displaystyle X_1, X_2, \ldots\) converges to \(\displaystyle X^*\) in distribution if \(\displaystyle \lim_{i \rightarrow \infty} F_i(x) = F^*(x)\) at every \(\displaystyle x\) where \(\displaystyle F^*\) is continuous.

Equivalent to convergence in probability if the limit is a degenerate distribution (a constant).

In Mean Squared

Delta Method

See Wikipedia
Suppose \(\displaystyle \sqrt{n}(X_n - \theta) \xrightarrow{D} N(0, \sigma^2)\).
Let \(\displaystyle g\) be a function such that \(\displaystyle g'\) exists and \(\displaystyle g'(\theta) \neq 0\)
Then \(\displaystyle \sqrt{n}(g(X_n) - g(\theta)) \xrightarrow{D} N(0, \sigma^2 g'(\theta)^2)\)
Multivariate:
\(\displaystyle \sqrt{n}(B - \beta) \xrightarrow{D} N(0, \Sigma) \implies \sqrt{n}(h(B)-h(\beta)) \xrightarrow{D} N(0, \nabla h(\beta)^T \Sigma \nabla h(\beta))\)

Notes

You can think of this like the Mean Value theorem for random variables.

\(\displaystyle (g(X_n) - g(\theta)) \approx g'(\theta)(X_n - \theta)\)
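
A quick simulation can make the univariate statement concrete. The sketch below assumes the sequence is the sample mean of n Exponential(1) draws (so \(\displaystyle \theta = 1\), \(\displaystyle \sigma^2 = 1\)) and \(\displaystyle g(x) = x^2\); these choices, the sample size, and the seed are illustrative. The empirical variance of \(\displaystyle \sqrt{n}(g(X_n) - g(\theta))\) should land near \(\displaystyle \sigma^2 g'(\theta)^2 = 4\).

```python
import math
import random
import statistics

def g(x):
    return x * x  # illustrative choice of g; g'(theta) = 2 * theta

theta, sigma2 = 1.0, 1.0   # mean and variance of Exponential(1)
n, trials = 200, 3000
rng = random.Random(1)     # fixed seed so the run is reproducible

scaled = []
for _ in range(trials):
    x_bar = sum(rng.expovariate(1.0) for _ in range(n)) / n
    scaled.append(math.sqrt(n) * (g(x_bar) - g(theta)))

empirical_var = statistics.pvariance(scaled)
predicted_var = sigma2 * (2 * theta) ** 2   # sigma^2 * g'(theta)^2
print(empirical_var, predicted_var)
```

The agreement is only approximate for finite n, since the delta method is an asymptotic statement.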

Limit Theorems

Markov's Inequality

Chebyshev's Inequality

Central Limit Theorem

Very very important. Never forget this.
For i.i.d. samples from any distribution with finite variance, the standardized sample mean converges in distribution to a normal. Let \(\displaystyle \mu = E(X)\) and \(\displaystyle \sigma^2 = Var(X)\).
Different ways of saying the same thing:

\(\displaystyle \sqrt{n}(\bar{x} - \mu) \xrightarrow{D} N(0, \sigma^2)\)
\(\displaystyle \frac{\sqrt{n}}{\sigma}(\bar{x} - \mu) \xrightarrow{D} N(0, 1)\)
\(\displaystyle \bar{x} \approx N(\mu, \sigma^2/n)\) for large \(\displaystyle n\) (informal)
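
A small simulation shows the theorem at work. The sketch assumes Uniform(0, 1) samples (so \(\displaystyle \mu = 1/2\), \(\displaystyle \sigma^2 = 1/12\)), with an illustrative sample size and seed. If the standardized sample mean is close to N(0, 1), roughly 95% of its values should fall within \(\displaystyle \pm 1.96\).

```python
import math
import random

mu, sigma = 0.5, math.sqrt(1 / 12)   # mean and sd of Uniform(0, 1)
n, trials = 50, 5000
rng = random.Random(2)               # fixed seed so the run is reproducible

inside = 0
for _ in range(trials):
    x_bar = sum(rng.random() for _ in range(n)) / n
    z = math.sqrt(n) * (x_bar - mu) / sigma   # standardized sample mean
    if abs(z) <= 1.96:
        inside += 1

coverage = inside / trials
print(coverage)
```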

Law of Large Numbers

Relationships between distributions

This is important for tests.
See Relationships among probability distributions.

Poisson Distributions

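The sum of independent Poisson random variables is Poisson, with rate equal to the sum of the rates: if \(\displaystyle X_i \sim Poisson(\lambda_i)\) independently, then \(\displaystyle \sum_i X_i \sim Poisson\left(\sum_i \lambda_i\right)\).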
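
For two independent Poisson variables this can be checked exactly: the pmf of the sum is the convolution of the two pmfs, and it should match the Poisson pmf with the summed rate. The rates below are arbitrary illustrative values.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam1, lam2 = 1.5, 2.5   # illustrative rates

max_gap = 0.0
for n in range(10):
    # P(X1 + X2 = n) = sum over j of P(X1 = j) * P(X2 = n - j)
    conv = sum(poisson_pmf(j, lam1) * poisson_pmf(n - j, lam2) for j in range(n + 1))
    direct = poisson_pmf(n, lam1 + lam2)
    max_gap = max(max_gap, abs(conv - direct))

print(max_gap)  # tiny, floating-point error only
```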

Normal Distributions

If \(\displaystyle X_1 \sim N(\mu_1, \sigma_1^2)\) and \(\displaystyle X_2 \sim N(\mu_2, \sigma_2^2)\) are independent, then \(\displaystyle \lambda_1 X_1 + \lambda_2 X_2 \sim N(\lambda_1 \mu_1 + \lambda_2 \mu_2, \lambda_1^2 \sigma_1^2 + \lambda_2^2 \sigma_2^2)\) for any \(\displaystyle \lambda_1, \lambda_2 \in \mathbb{R}\)
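
A simulation sketch of the linear-combination rule for independent normals, checking that the mean is \(\displaystyle \lambda_1 \mu_1 + \lambda_2 \mu_2\) and the variance is \(\displaystyle \lambda_1^2 \sigma_1^2 + \lambda_2^2 \sigma_2^2\). All parameter values and the seed are illustrative assumptions.

```python
import random
import statistics

lam1, lam2 = 2.0, -3.0
mu1, sigma1 = 1.0, 2.0     # X1 ~ N(1, 2^2)
mu2, sigma2 = -1.0, 0.5    # X2 ~ N(-1, 0.5^2)
rng = random.Random(3)     # fixed seed so the run is reproducible

# random.gauss takes the standard deviation, not the variance.
draws = [
    lam1 * rng.gauss(mu1, sigma1) + lam2 * rng.gauss(mu2, sigma2)
    for _ in range(20000)
]

predicted_mean = lam1 * mu1 + lam2 * mu2                      # 5.0
predicted_var = lam1 ** 2 * sigma1 ** 2 + lam2 ** 2 * sigma2 ** 2   # 18.25
empirical_mean = statistics.fmean(draws)
empirical_var = statistics.pvariance(draws)
print(empirical_mean, predicted_mean)
print(empirical_var, predicted_var)
```

This only checks the first two moments; that the combination is itself normal is the content of the stated result.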

Gamma Distributions

Note exponential distributions are also Gamma distributions (the case \(\displaystyle k = 1\)).

If \(\displaystyle X \sim \Gamma(k, \theta)\) then \(\displaystyle cX \sim \Gamma(k, c\theta)\) for any \(\displaystyle c > 0\).
If \(\displaystyle X_1 \sim \Gamma(k_1, \theta)\) and \(\displaystyle X_2 \sim \Gamma(k_2, \theta)\) are independent, then \(\displaystyle X_1 + X_2 \sim \Gamma(k_1 + k_2, \theta)\).
If \(\displaystyle X_1 \sim \Gamma(\alpha, \theta)\) and \(\displaystyle X_2 \sim \Gamma(\beta, \theta)\) are independent, then \(\displaystyle \frac{X_1}{X_1 + X_2} \sim B(\alpha, \beta)\).
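
The sum and ratio properties can be sanity-checked by simulation. The sketch assumes the shape-scale parameterization used above, with illustrative shapes 2 and 3 and common scale 2. It compares sample moments with the \(\displaystyle \Gamma(5, 2)\) mean (10) and the \(\displaystyle B(2, 3)\) mean (0.4) and variance (0.04).

```python
import random
import statistics

k1, k2, theta = 2.0, 3.0, 2.0   # illustrative shapes and common scale
rng = random.Random(4)          # fixed seed so the run is reproducible
trials = 20000

sums, ratios = [], []
for _ in range(trials):
    x1 = rng.gammavariate(k1, theta)   # gammavariate(shape, scale)
    x2 = rng.gammavariate(k2, theta)
    sums.append(x1 + x2)
    ratios.append(x1 / (x1 + x2))

sum_mean = statistics.fmean(sums)          # Gamma(k1 + k2, theta) mean: (k1 + k2) * theta
ratio_mean = statistics.fmean(ratios)      # Beta(k1, k2) mean: k1 / (k1 + k2)
ratio_var = statistics.pvariance(ratios)   # Beta variance: k1*k2 / ((k1+k2)^2 * (k1+k2+1))
print(sum_mean, ratio_mean, ratio_var)
```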

T-distribution

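The ratio of a standard normal to the square root of an independent Chi-sq divided by its degrees of freedom yields a T-distribution: if \(\displaystyle Z \sim N(0, 1)\) and \(\displaystyle V \sim \chi^2_\nu\) are independent, then \(\displaystyle \frac{Z}{\sqrt{V/\nu}} \sim t_\nu\).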

Chi-Sq Distribution

The ratio of two independent Chi-sq variables, each divided by its degrees of freedom, has an F-distribution.

F Distribution

Too many. See the Wikipedia Page. Most important are Chi-sq and T distribution

Textbooks

Sheldon Ross' A First Course in Probability
Hogg and Craig's Mathematical Statistics
Casella and Berger's Statistical Inference
