Probability


Calculus-based Probability

Basics

Axioms of Probability

  • \(\displaystyle 0 \leq P(E) \leq 1\)
  • \(\displaystyle P(S) = 1\) where \(\displaystyle S\) is your sample space
  • For mutually exclusive events \(\displaystyle E_1, E_2, ...\), \(\displaystyle P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i)\)

Monotonicity

  • For all events \(\displaystyle A\), \(\displaystyle B\), \(\displaystyle A \subset B \implies P(A) \leq P(B)\)
Proof
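
\(\displaystyle B = A \cup (B \setminus A)\), where \(\displaystyle A\) and \(\displaystyle B \setminus A\) are disjoint, so \(\displaystyle P(B) = P(A) + P(B \setminus A) \geq P(A)\).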

Expectation and Variance

Some definitions and properties.

Definitions

Let \(\displaystyle X \sim D\) for some distribution \(\displaystyle D\). Let \(\displaystyle S\) be the support or domain of your distribution.

  • \(\displaystyle E(X) = \sum_S xp(x)\) or \(\displaystyle \int_S xp(x)dx\)
  • \(\displaystyle Var(X) = E[(X-E(X))^2] = E(X^2) - (E(X))^2\)
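
A minimal NumPy sketch to sanity-check the variance identity by simulation (the exponential distribution and its mean here are arbitrary example choices):

  # Check Var(X) = E[(X-E(X))^2] = E(X^2) - (E(X))^2 by simulation.
  # Example choice: exponential with mean 2, so Var(X) = 4.
  import numpy as np

  rng = np.random.default_rng(0)
  x = rng.exponential(2.0, size=1_000_000)
  print(np.mean((x - x.mean())**2))    # E[(X - E(X))^2], ~4
  print(np.mean(x**2) - x.mean()**2)   # E(X^2) - (E(X))^2, same value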

Total Expectation

\(\displaystyle E(X) = E(E(X|Y))\)
Dr. Xu refers to this as the smooth property.

Proof

\(\displaystyle E(X) = \int_S xp(x)dx = \int_x x \int_y p(x,y)dy\,dx = \int_x x \int_y p(x|y)p(y)dy\,dx = \int_y \left[ \int_x x p(x|y)dx \right] p(y)dy = \int_y E(X|Y=y)\,p(y)dy = E(E(X|Y)) \)
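
A minimal NumPy sketch of the identity on a two-component mixture (the Bernoulli/normal mixture below is an arbitrary example choice):

  # Y ~ Bernoulli(0.3); X|Y=0 ~ N(0,1), X|Y=1 ~ N(5,1).
  # Then E(E(X|Y)) = 0.7*0 + 0.3*5 = 1.5, which should match E(X).
  import numpy as np

  rng = np.random.default_rng(0)
  n = 1_000_000
  y = rng.binomial(1, 0.3, size=n)
  cond_mean = np.where(y == 1, 5.0, 0.0)   # E(X|Y) for each draw of Y
  x = rng.normal(cond_mean, 1.0)           # draw X given Y
  print(x.mean(), cond_mean.mean())        # both ~1.5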

Total Variance

This one is not used as often on tests as total expectation.
\(\displaystyle Var(Y) = E(Var(Y|X)) + Var(E(Y|X))\)

Proof
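
\(\displaystyle Var(Y) = E(Y^2) - (E(Y))^2\)
\(\displaystyle = E(E(Y^2|X)) - (E(E(Y|X)))^2\) by total expectation
\(\displaystyle = E(Var(Y|X) + (E(Y|X))^2) - (E(E(Y|X)))^2\)
\(\displaystyle = E(Var(Y|X)) + \left[ E\left((E(Y|X))^2\right) - (E(E(Y|X)))^2 \right]\)
\(\displaystyle = E(Var(Y|X)) + Var(E(Y|X))\)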

Convergence

There are 4 types of convergence typically taught in undergraduate courses.
See the Wikipedia article Convergence of random variables.

Almost Surely
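
  • \(\displaystyle P\left(\lim_{n\to\infty} X_n = X\right) = 1\)
  • Implies convergence in probability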

In Probability

  • Implies convergence in distribution
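  • Definition: \(\displaystyle \lim_{n\to\infty} P(|X_n - X| > \epsilon) = 0\) for every \(\displaystyle \epsilon > 0\)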

In Distribution

  • Equivalent to convergence in probability when the limit is a constant (a degenerate distribution)
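  • Definition: \(\displaystyle \lim_{n\to\infty} F_{X_n}(x) = F_X(x)\) at every \(\displaystyle x\) where \(\displaystyle F_X\) is continuous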

In Mean Squared
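
  • \(\displaystyle \lim_{n\to\infty} E\left[(X_n - X)^2\right] = 0\)
  • Implies convergence in probability (apply Markov's inequality to \(\displaystyle (X_n - X)^2\))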

Delta Method

See Wikipedia
Suppose \(\displaystyle \sqrt{n}(X_n - \theta) \xrightarrow{D} N(0, \sigma^2)\).
Let \(\displaystyle g\) be a function such that \(\displaystyle g'\) exists and \(\displaystyle g'(\theta) \neq 0\).
Then \(\displaystyle \sqrt{n}(g(X_n) - g(\theta)) \xrightarrow{D} N(0, \sigma^2 g'(\theta)^2)\)
Multivariate:
\(\displaystyle \sqrt{n}(B - \beta) \xrightarrow{D} N(0, \Sigma) \implies \sqrt{n}(h(B)-h(\beta)) \xrightarrow{D} N(0, h'(\beta)^T \Sigma h'(\beta))\)

Notes
  • You can think of this like the Mean Value theorem for random variables.
\(\displaystyle (g(X_n) - g(\theta)) \approx g'(\theta)(X_n - \theta)\)
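
A minimal NumPy sketch of the delta method in action (the exponential model and \(\displaystyle g(x) = 1/x\) are arbitrary example choices):

  # X_i ~ exponential with mean theta = 2, so sigma^2 = Var(X_i) = theta^2 = 4.
  # g(x) = 1/x, g'(theta) = -1/theta^2 = -1/4, so the delta method predicts
  # sqrt(n)*(g(xbar) - g(theta)) -> N(0, sigma^2 * g'(theta)^2) = N(0, 0.25).
  import numpy as np

  rng = np.random.default_rng(0)
  theta, n, reps = 2.0, 2_000, 5_000
  xbar = rng.exponential(theta, size=(reps, n)).mean(axis=1)
  z = np.sqrt(n) * (1 / xbar - 1 / theta)
  print(z.var())   # ~0.25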

Limit Theorems

Markov's Inequality
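
For a nonnegative random variable \(\displaystyle X\) and any \(\displaystyle a > 0\):
\(\displaystyle P(X \geq a) \leq \frac{E(X)}{a}\)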

Chebyshev's Inequality
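
If \(\displaystyle E(X) = \mu\) and \(\displaystyle Var(X) = \sigma^2 < \infty\), then for any \(\displaystyle k > 0\):
\(\displaystyle P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}\)
This follows from applying Markov's inequality to \(\displaystyle (X - \mu)^2\).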

Central Limit Theorem

Very very important. Never forget this.
For i.i.d. samples from any distribution with finite variance, the sample mean converges in distribution to a normal. Let \(\displaystyle \mu = E(x_i)\) and \(\displaystyle \sigma^2 = Var(x_i) < \infty\).
Different ways of saying the same thing:

  • \(\displaystyle \sqrt{n}(\bar{x} - \mu) \xrightarrow{D} N(0, \sigma^2)\)
  • \(\displaystyle \frac{\sqrt{n}}{\sigma}(\bar{x} - \mu) \xrightarrow{D} N(0, 1)\)
  • \(\displaystyle \bar{x} \sim N(\mu, \sigma^2/n)\), approximately, for large \(\displaystyle n\)
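
A minimal NumPy sketch: standardized sample means from an exponential distribution (an arbitrary non-normal example choice) should look standard normal for large \(\displaystyle n\):

  # Exponential(1) has mu = 1 and sigma^2 = 1.
  import numpy as np

  rng = np.random.default_rng(0)
  n, reps = 1_000, 10_000
  xbar = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
  z = np.sqrt(n) * (xbar - 1.0) / 1.0
  print(z.mean(), z.var())            # ~0 and ~1
  print(np.mean(np.abs(z) < 1.96))    # ~0.95, as N(0,1) predicts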

Law of Large Numbers
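
If \(\displaystyle x_1, x_2, \dots\) are i.i.d. with \(\displaystyle E(x_i) = \mu < \infty\), then \(\displaystyle \bar{x}_n \xrightarrow{P} \mu\) (weak law); the strong law upgrades this to almost-sure convergence.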

Relationships between distributions

This is important for tests.
See Relationships among probability distributions.

Poisson Distributions

The sum of independent Poisson random variables is Poisson with rate equal to the sum of the rates: if \(\displaystyle X_i \sim Poisson(\lambda_i)\) are independent, then \(\displaystyle \sum_i X_i \sim Poisson\left(\sum_i \lambda_i\right)\).

Normal Distributions

  • If \(\displaystyle X_1 \sim N(\mu_1, \sigma_1^2)\) and \(\displaystyle X_2 \sim N(\mu_2, \sigma_2^2)\) are independent, then \(\displaystyle \lambda_1 X_1 + \lambda_2 X_2 \sim N(\lambda_1 \mu_1 + \lambda_2 \mu_2,\ \lambda_1^2 \sigma_1^2 + \lambda_2^2 \sigma_2^2)\) for any \(\displaystyle \lambda_1, \lambda_2 \in \mathbb{R}\) (see the sketch below)
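
A minimal NumPy sketch checking the rule for example values \(\displaystyle \lambda_1 = 3, \lambda_2 = -2\) (arbitrary choices):

  # X1 ~ N(1, 4), X2 ~ N(5, 9) independent.
  # 3*X1 - 2*X2 should be N(3*1 - 2*5, 9*4 + 4*9) = N(-7, 72).
  import numpy as np

  rng = np.random.default_rng(0)
  x1 = rng.normal(1.0, 2.0, size=1_000_000)   # sd 2 -> variance 4
  x2 = rng.normal(5.0, 3.0, size=1_000_000)   # sd 3 -> variance 9
  y = 3 * x1 - 2 * x2
  print(y.mean(), y.var())                    # ~ -7 and ~72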

Gamma Distributions

Note that exponential distributions are also Gamma distributions: in the shape-scale parameterization, an exponential with mean \(\displaystyle \theta\) is \(\displaystyle \Gamma(1, \theta)\).

  • If \(\displaystyle X \sim \Gamma(k, \theta)\) then \(\displaystyle cX \sim \Gamma(k, c\theta)\) for any \(\displaystyle c > 0\).
  • If \(\displaystyle X_1 \sim \Gamma(k_1, \theta)\) and \(\displaystyle X_2 \sim \Gamma(k_2, \theta)\) are independent, then \(\displaystyle X_1 + X_2 \sim \Gamma(k_1 + k_2, \theta)\).
  • If \(\displaystyle X_1 \sim \Gamma(\alpha, \theta)\) and \(\displaystyle X_2 \sim \Gamma(\beta, \theta)\) are independent, then \(\displaystyle \frac{X_1}{X_1 + X_2} \sim B(\alpha, \beta)\), a Beta distribution (see the sketch below).
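
A minimal NumPy sketch of the last two relationships, using NumPy's shape/scale Gamma parameterization (example parameter values are arbitrary):

  import numpy as np

  rng = np.random.default_rng(0)
  n = 1_000_000
  x1 = rng.gamma(shape=2.0, scale=3.0, size=n)   # Gamma(k1=2, theta=3)
  x2 = rng.gamma(shape=5.0, scale=3.0, size=n)   # Gamma(k2=5, theta=3)
  s = x1 + x2                # should be Gamma(7, 3): mean 21, variance 63
  print(s.mean(), s.var())
  b = x1 / (x1 + x2)         # should be Beta(2, 5): mean 2/7 ~ 0.286
  print(b.mean())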

T-distribution

A standard normal divided by the square root of an independent chi-squared over its degrees of freedom yields a t-distribution: if \(\displaystyle Z \sim N(0,1)\) and \(\displaystyle V \sim \chi^2_k\) are independent, then \(\displaystyle \frac{Z}{\sqrt{V/k}} \sim t_k\).

Chi-Sq Distribution

The ratio of two independent chi-squared variables, each divided by its degrees of freedom, is an F-distribution: if \(\displaystyle V_1 \sim \chi^2_{d_1}\) and \(\displaystyle V_2 \sim \chi^2_{d_2}\) are independent, then \(\displaystyle \frac{V_1/d_1}{V_2/d_2} \sim F_{d_1, d_2}\).

F Distribution

Too many relationships to list; see the Wikipedia page. The most important are the relationships with the chi-squared and t distributions.

Textbooks