Probability: Difference between revisions
Revision as of 17:16, 7 November 2019
Calculus-based Probability
Basics
Axioms of Probability
- \(\displaystyle 0 \leq P(E) \leq 1\)
- \(\displaystyle P(S) = 1\) where \(\displaystyle S\) is your sample space
- For mutually exclusive events \(\displaystyle E_1, E_2, ...\), \(\displaystyle P\left(\bigcup_{i=1}^\infty E_i\right) = \sum_{i=1}^\infty P(E_i)\)
Monotonicity
- For all events \(\displaystyle A\), \(\displaystyle B\), \(\displaystyle A \subset B \implies P(A) \leq P(B)\)
Expectation and Variance
Some definitions and properties.
Definitions
Let \(\displaystyle X \sim D\) for some distribution \(\displaystyle D\). Let \(\displaystyle S\) be the support or domain of your distribution.
- \(\displaystyle E(X) = \sum_S xp(x)\) or \(\displaystyle \int_S xp(x)dx\)
- \(\displaystyle Var(X) = E[(X-E(X))^2] = E(X^2) - (E(X))^2\)
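As a quick sanity check of the two variance formulas, here is a minimal sketch that evaluates both on a discrete distribution. The fair six-sided die and the helper name `expectation` are assumptions chosen for illustration, not part of the notes.

```python
from fractions import Fraction

# Hypothetical example: pmf of a fair six-sided die, support S = {1, ..., 6}.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, f=lambda x: x):
    """E[f(X)] as the sum over the support of f(x) * p(x)."""
    return sum(f(x) * p for x, p in pmf.items())

mean = expectation(pmf)
var_definition = expectation(pmf, lambda x: (x - mean) ** 2)   # E[(X - E(X))^2]
var_shortcut = expectation(pmf, lambda x: x ** 2) - mean ** 2  # E(X^2) - (E(X))^2

print(mean, var_definition, var_shortcut)
```

Exact fractions are used so the two variance expressions can be compared without rounding error.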
Total Expectation
\(\displaystyle E(X) = E(E(X|Y))\)
Dr. Xu refers to this as the smooth property.
\(\displaystyle E(X) = \int_S xp(x)dx = \int_x x \int_y p(x,y)dy dx = \int_x x \int_y p(x|y)p(y)dy dx = \int_y\int_x x p(x|y)dxp(y)dy \)
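The same identity can be checked on a discrete example, where the integrals become sums. The sketch below assumes a small made-up joint pmf; the helper names are hypothetical.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y) on {0, 1} x {0, 1}.
joint = {
    (0, 0): Fraction(1, 8), (1, 0): Fraction(3, 8),
    (0, 1): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def marginal_y(joint):
    """p(y) obtained by summing the joint pmf over x."""
    p_y = {}
    for (_, y), p in joint.items():
        p_y[y] = p_y.get(y, 0) + p
    return p_y

def cond_mean_x_given_y(joint, y, p_y):
    """E(X | Y = y) using p(x | y) = p(x, y) / p(y)."""
    return sum(x * p / p_y[y] for (x, yy), p in joint.items() if yy == y)

p_y = marginal_y(joint)
direct = sum(x * p for (x, _), p in joint.items())                          # E(X)
tower = sum(cond_mean_x_given_y(joint, y, p_y) * p for y, p in p_y.items())  # E(E(X|Y))

print(direct, tower)
```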
Total Variance
\(\displaystyle Var(Y) = E(Var(Y|X)) + Var(E(Y | X))\)
This one is not used as often on tests as total expectation.
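To see the decomposition at work, the following sketch computes both sides exactly for a made-up joint pmf. The pmf values and the helper `mean_var` are assumptions for illustration only.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y); X takes values 0 and 1.
joint = {
    (0, 0): Fraction(1, 4), (0, 2): Fraction(1, 4),
    (1, 1): Fraction(1, 6), (1, 4): Fraction(1, 3),
}

def mean_var(pmf):
    """Mean and variance of a discrete pmf given as {value: probability}."""
    m = sum(v * p for v, p in pmf.items())
    return m, sum((v - m) ** 2 * p for v, p in pmf.items())

# Marginals of X and Y.
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0) + p
    p_y[y] = p_y.get(y, 0) + p

# Conditional pmf of Y given X = x, then its mean and variance.
cond = {x: {y: p / p_x[x] for (xx, y), p in joint.items() if xx == x} for x in p_x}
cond_stats = {x: mean_var(cond[x]) for x in p_x}

_, var_y = mean_var(p_y)                                           # Var(Y)
expected_cond_var = sum(cond_stats[x][1] * p_x[x] for x in p_x)    # E(Var(Y|X))
mean_of_cond_mean = sum(cond_stats[x][0] * p_x[x] for x in p_x)
var_of_cond_mean = sum((cond_stats[x][0] - mean_of_cond_mean) ** 2 * p_x[x]
                       for x in p_x)                               # Var(E(Y|X))

print(var_y, expected_cond_var + var_of_cond_mean)
```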
Convergence
There are 4 types of convergence typically taught in undergraduate courses.
See Wikipedia Convergence of random variables
Almost Surely
- \(\displaystyle P(\lim X_i = X^*) = 1\)
In Probability
For all \(\displaystyle \epsilon \gt 0\)
\(\displaystyle \lim P(|X_i - X| \geq \epsilon) = 0\)
- Implies Convergence in distribution
In Distribution
Pointwise convergence of the cdf
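A sequence of random variables \(\displaystyle X_1, X_2, ...\) with cdfs \(\displaystyle F_1, F_2, ...\) converges in distribution to \(\displaystyle X^*\) with cdf \(\displaystyle F^*\)
if for all \(\displaystyle x\) at which \(\displaystyle F^*\) is continuous,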
\(\displaystyle \lim_{i \rightarrow \infty} F_i(x) = F^*(x)\)
- Equivalent to convergence in probability if it converges to a degenerate distribution
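As an illustration of pointwise cdf convergence, consider an example that is not from the notes: for i.i.d. Uniform(0, 1) draws, \(\displaystyle n(1 - \max_i U_i)\) has cdf \(\displaystyle 1 - (1 - x/n)^n\) on \(\displaystyle [0, n]\), which tends to the Exponential(1) cdf. A minimal sketch of that limit at a single point:

```python
import math

def cdf_n(x, n):
    """cdf of n * (1 - max(U_1..U_n)) for i.i.d. Uniform(0, 1), valid for 0 <= x <= n."""
    return 1 - (1 - x / n) ** n

def cdf_limit(x):
    """cdf of the Exponential(1) limit."""
    return 1 - math.exp(-x)

x = 1.0  # arbitrary evaluation point
gaps = [abs(cdf_n(x, n) - cdf_limit(x)) for n in (10, 100, 1000, 10**6)]
print(gaps)  # the gap shrinks as n grows
```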
In Mean Squared
Delta Method
See Wikipedia
Suppose \(\displaystyle \sqrt{n}(X_n - \theta) \xrightarrow{D} N(0, \sigma^2)\).
Let \(\displaystyle g\) be a function such that \(\displaystyle g'\) exists and \(\displaystyle g'(\theta) \neq 0\)
Then \(\displaystyle \sqrt{n}(g(X_n) - g(\theta)) \xrightarrow{D} N(0, \sigma^2 g'(\theta)^2)\)
Multivariate:
\(\displaystyle \sqrt{n}(B - \beta) \xrightarrow{D} N(0, \Sigma) \implies \sqrt{n}(h(B)-h(\beta)) \xrightarrow{D} N(0, h'(\beta)^T \Sigma h'(\beta))\), where \(\displaystyle h'(\beta)\) is the gradient of \(\displaystyle h\) at \(\displaystyle \beta\)
- Notes
- You can think of this like the Mean Value theorem for random variables.
- \(\displaystyle (g(X_n) - g(\theta)) \approx g'(\theta)(X_n - \theta)\)
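A rough simulation can make the univariate statement concrete. The sketch below assumes \(\displaystyle X_n\) is the mean of \(\displaystyle n\) Exponential(1) draws (so \(\displaystyle \theta = 1\), \(\displaystyle \sigma^2 = 1\)) and takes \(\displaystyle g(x) = x^2\); these choices, the seed, and the sample sizes are all illustrative assumptions.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

n, reps = 200, 2000        # sample size and number of replications (arbitrary)
theta, sigma2 = 1.0, 1.0   # mean and variance of an Exponential(1) draw

def g(x):
    """Hypothetical smooth transformation; g'(theta) = 2 * theta."""
    return x ** 2

scaled_errors = []
for _ in range(reps):
    x_bar = statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    scaled_errors.append(math.sqrt(n) * (g(x_bar) - g(theta)))

empirical_var = statistics.variance(scaled_errors)
delta_var = sigma2 * (2 * theta) ** 2   # sigma^2 * g'(theta)^2

print(empirical_var, delta_var)
```

The empirical variance should land near the delta-method value, up to simulation noise and a small finite-sample bias.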
Limit Theorems
Markov's Inequality
Chebyshev's Inequality
Central Limit Theorem
Very very important. Never forget this.
For i.i.d. draws from any distribution with finite variance, the standardized sample mean converges in distribution to a normal.
Let \(\displaystyle \mu = E(x)\) and \(\displaystyle \sigma^2 = Var(x)\)
Different ways of saying the same thing:
- \(\displaystyle \sqrt{n}(\bar{x} - \mu) \xrightarrow{D} N(0, \sigma^2)\)
- \(\displaystyle \frac{\sqrt{n}}{\sigma}(\bar{x} - \mu) \xrightarrow{D} N(0, 1)\)
- \(\displaystyle \bar{x} \overset{\text{approx}}{\sim} N(\mu, \sigma^2/n)\) for large \(\displaystyle n\)
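The theorem can be checked roughly by simulation. This sketch assumes Uniform(0, 1) draws (so \(\displaystyle \mu = 1/2\), \(\displaystyle \sigma^2 = 1/12\)) and arbitrary sample sizes; it measures how often the standardized mean falls within \(\displaystyle \pm 1.96\), which should be close to 95% if the normal approximation holds.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

n, reps = 50, 5000                  # sample size and replications (arbitrary)
mu, sigma = 0.5, math.sqrt(1 / 12)  # mean and sd of Uniform(0, 1)

z_scores = []
for _ in range(reps):
    x_bar = statistics.fmean([random.random() for _ in range(n)])
    z_scores.append(math.sqrt(n) * (x_bar - mu) / sigma)

# Fraction of standardized means inside the central 95% region of N(0, 1).
coverage = sum(abs(z) <= 1.96 for z in z_scores) / reps
print(coverage)
```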
Law of Large Numbers
Relationships between distributions
This is important for tests.
See Relationships among probability distributions.
Poisson Distributions
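The sum of independent Poisson random variables is Poisson, with rate equal to the sum of the individual rates: if \(\displaystyle X_i \sim Poisson(\lambda_i)\) independently, then \(\displaystyle \sum_i X_i \sim Poisson\left(\sum_i \lambda_i\right)\).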
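One way to convince yourself of the Poisson sum rule is to convolve two Poisson pmfs and compare against the pmf with the summed rate. The rates below are made-up values, and independence of the two variables is assumed.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam1, lam2 = 1.5, 2.5   # hypothetical rates

def sum_pmf(k):
    """P(X1 + X2 = k) for independent X1, X2, by convolving the two pmfs."""
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1))

# Largest discrepancy between the convolution and Poisson(lam1 + lam2) over k = 0..19.
max_gap = max(abs(sum_pmf(k) - poisson_pmf(k, lam1 + lam2)) for k in range(20))
print(max_gap)
```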
Normal Distributions
- If \(\displaystyle X_1 \sim N(\mu_1, \sigma_1^2)\) and \(\displaystyle X_2 \sim N(\mu_2, \sigma_2^2)\) are independent, then \(\displaystyle \lambda_1 X_1 + \lambda_2 X_2 \sim N(\lambda_1 \mu_1 + \lambda_2 \mu_2, \lambda_1^2 \sigma_1^2 + \lambda_2^2 \sigma_2^2)\) for any \(\displaystyle \lambda_1, \lambda_2 \in \mathbb{R}\)
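A simulation sketch of the linear-combination rule for independent normals: the mean should be \(\displaystyle \lambda_1 \mu_1 + \lambda_2 \mu_2\) and the variance \(\displaystyle \lambda_1^2 \sigma_1^2 + \lambda_2^2 \sigma_2^2\). The parameter values and seed are arbitrary assumptions, and `a`, `b` stand in for \(\displaystyle \lambda_1, \lambda_2\).

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

mu1, sd1 = 1.0, 2.0    # X1 ~ N(1, 4)
mu2, sd2 = -2.0, 3.0   # X2 ~ N(-2, 9)
a, b = 2.0, -0.5       # play the role of lambda_1 and lambda_2

# Independent draws of a * X1 + b * X2.
draws = [a * random.gauss(mu1, sd1) + b * random.gauss(mu2, sd2)
         for _ in range(20000)]

expected_mean = a * mu1 + b * mu2                 # lambda_1 mu_1 + lambda_2 mu_2
expected_var = a ** 2 * sd1 ** 2 + b ** 2 * sd2 ** 2

print(statistics.fmean(draws), expected_mean)
print(statistics.variance(draws), expected_var)
```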
Gamma Distributions
Note that exponential distributions are also Gamma distributions (with shape \(\displaystyle k = 1\)).
- If \(\displaystyle X \sim \Gamma(k, \theta)\) then \(\displaystyle cX \sim \Gamma(k, c\theta)\) for any \(\displaystyle c > 0\).
- If \(\displaystyle X_1 \sim \Gamma(k_1, \theta)\) and \(\displaystyle X_2 \sim \Gamma(k_2, \theta)\) are independent, then \(\displaystyle X_1 + X_2 \sim \Gamma(k_1 + k_2, \theta)\).
- If \(\displaystyle X_1 \sim \Gamma(\alpha, \theta)\) and \(\displaystyle X_2 \sim \Gamma(\beta, \theta)\) are independent, then \(\displaystyle \frac{X_1}{X_1 + X_2} \sim B(\alpha, \beta)\).
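The sum and ratio properties can be spot-checked by simulation using the shape/scale parameterization, which is also what Python's `random.gammavariate(alpha, beta)` uses (shape `alpha`, scale `beta`). The shapes, scale, and seed below are assumptions for illustration; only the first two moments are compared, which is a weak check rather than a proof.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

k1, k2, theta = 2.0, 5.0, 3.0   # hypothetical shapes and common scale
reps = 20000

sums, ratios = [], []
for _ in range(reps):
    x1 = random.gammavariate(k1, theta)   # X1 ~ Gamma(k1, theta)
    x2 = random.gammavariate(k2, theta)   # X2 ~ Gamma(k2, theta), independent of X1
    sums.append(x1 + x2)
    ratios.append(x1 / (x1 + x2))

# Gamma(k1 + k2, theta) has mean (k1 + k2) * theta and variance (k1 + k2) * theta^2.
sum_mean, sum_var = (k1 + k2) * theta, (k1 + k2) * theta ** 2
# Beta(k1, k2) has mean k1 / (k1 + k2).
ratio_mean = k1 / (k1 + k2)

print(statistics.fmean(sums), sum_mean)
print(statistics.variance(sums), sum_var)
print(statistics.fmean(ratios), ratio_mean)
```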
T-distribution
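The ratio of a standard normal to the square root of an independent Chi-sq divided by its degrees of freedom yields a T-distribution: if \(\displaystyle Z \sim N(0, 1)\) and \(\displaystyle V \sim \chi^2_\nu\) are independent, then \(\displaystyle \frac{Z}{\sqrt{V/\nu}} \sim t_\nu\).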
Chi-Sq Distribution
The ratio of two independent Chi-sq random variables, each divided by its degrees of freedom, follows an F-distribution.
F Distribution
Too many. See the Wikipedia Page. Most important are Chi-sq and T distribution
Textbooks
- Sheldon Ross' A First Course in Probability
- Hogg and Craig's Mathematical Statistics
- Casella and Berger's Statistical Inference