# Statistics



## Estimation

### Method of Moments Estimator

Sometimes referred to as MME or MoM.

• Write the population moments as a function of the parameters: $$\displaystyle E(X) = g(\theta)$$
• Then invert to get your parameters as a function of your moments
• $$\displaystyle \theta = g^{-1}(E(X))$$
• Replace population moments with sample moments
• $$\displaystyle E(X) \rightarrow \bar{x}$$
• $$\displaystyle E(X^2) \rightarrow \frac{1}{n}\sum x_i^2$$ (equivalently, $$\displaystyle Var(X) \rightarrow \frac{1}{n}\sum(x_i - \bar{x})^2$$)
• $$\displaystyle \hat{\theta} = g^{-1}(\bar{x})$$
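
The steps above can be sketched in code. This is a minimal illustration, assuming a Gamma model with shape $$\displaystyle k$$ and scale $$\displaystyle s$$ (not part of the notes above), where $$\displaystyle E(X) = ks$$ and $$\displaystyle Var(X) = ks^2$$. The function name `gamma_method_of_moments` is hypothetical.

```python
import random
import statistics

def gamma_method_of_moments(xs):
    """Method-of-moments estimates (shape, scale) for a Gamma sample."""
    m1 = statistics.fmean(xs)                    # sample first moment
    m2 = statistics.fmean([x * x for x in xs])   # sample second raw moment
    var = m2 - m1 ** 2                           # implied variance (divisor n)
    scale = var / m1                             # invert Var(X) / E(X) = s
    shape = m1 / scale                           # invert E(X) = k * s
    return shape, scale

# Simulated data with known parameters (shape 3, scale 2), fixed seed.
rng = random.Random(0)
sample = [rng.gammavariate(3.0, 2.0) for _ in range(20000)]
shape_hat, scale_hat = gamma_method_of_moments(sample)
print(shape_hat, scale_hat)
```

With a sample this large, the estimates should land close to the true values used in the simulation.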

### Maximum Likelihood Estimator

To find the maximum likelihood estimator (MLE):

• Write out the likelihood function $$\displaystyle L(\theta; \mathbf{x}) = f(\mathbf{x}; \theta)$$
• (Optional) Write out the log-likelihood function $$\displaystyle l(\theta) = \log L(\theta; \mathbf{x})$$
• Take the derivative of the log-likelihood function w.r.t $$\displaystyle \theta$$
• Find the maximum of the log-likelihood function by setting the first derivative to 0
• (Optional) Make sure it is the maximum by checking that the Hessian is negative definite (in one dimension, that the second derivative is negative)
• Your MLE $$\displaystyle \hat{\theta}$$ is the value which maximizes $$\displaystyle L(\theta)$$
• Note: if the derivative is zero everywhere, then every value is an MLE. If it is always positive, the likelihood is increasing, so take the largest value in the parameter space; if it is always negative, take the smallest.
Notes
• If $$\displaystyle \hat{\theta}$$ is the MLE for $$\displaystyle \theta$$ then the MLE for $$\displaystyle g(\theta)$$ is $$\displaystyle g(\hat{\theta})$$
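
A minimal sketch of the procedure, assuming an Exponential model with rate $$\displaystyle \lambda$$ (an illustrative choice). Here $$\displaystyle l(\lambda) = n\log\lambda - \lambda\sum x_i$$, and setting the derivative to 0 gives $$\displaystyle \hat{\lambda} = 1/\bar{x}$$. The code compares this closed form with a simple numerical maximization. The helper `maximize_1d` is a hypothetical ternary search, which works here because the log-likelihood is concave.

```python
import math
import random

def exp_log_likelihood(rate, xs):
    """Log-likelihood of i.i.d. Exponential(rate) data."""
    return len(xs) * math.log(rate) - rate * sum(xs)

def maximize_1d(f, lo, hi, iters=200):
    """Ternary search for the maximizer of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if f(a) < f(b):
            lo = a   # the maximum lies to the right of a
        else:
            hi = b   # the maximum lies to the left of b
    return (lo + hi) / 2

rng = random.Random(1)
data = [rng.expovariate(2.5) for _ in range(5000)]

rate_closed_form = len(data) / sum(data)   # 1 / sample mean
rate_numeric = maximize_1d(lambda r: exp_log_likelihood(r, data), 1e-6, 50.0)

# Invariance: the MLE of the mean 1/rate is 1/rate_hat, i.e. the sample mean.
mean_mle = 1 / rate_closed_form
print(rate_closed_form, rate_numeric, mean_mle)
```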

### Uniformly Minimum Variance Unbiased Estimator (UMVUE)

UMVUE, sometimes called MVUE or UMVU.
See Wikipedia: Lehmann–Scheffé theorem
An unbiased estimator that is a function of a complete sufficient statistic is the UMVUE.
In general, you should find a complete sufficient statistic using the properties of exponential families.
Then correct it (e.g. with a constant factor) so that it is unbiased to get the UMVUE.
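
As a worked sketch of this recipe, take $$\displaystyle X_1, \dots, X_n$$ i.i.d. Exponential with rate $$\displaystyle \lambda$$ and $$\displaystyle n \geq 2$$ (the model is an illustrative assumption):

• The density $$\displaystyle f(x; \lambda) = \lambda e^{-\lambda x}$$ is a one-parameter exponential family, so $$\displaystyle T = \sum_{i=1}^{n} X_i$$ is a complete sufficient statistic.
• $$\displaystyle T \sim \text{Gamma}(n, \lambda)$$, which gives $$\displaystyle E\left[\frac{1}{T}\right] = \frac{\lambda}{n-1}$$
• Hence $$\displaystyle \hat{\lambda} = \frac{n-1}{T}$$ is unbiased and a function of $$\displaystyle T$$, so it is the UMVUE of $$\displaystyle \lambda$$. The MLE $$\displaystyle \frac{n}{T}$$ differs from it only by the factor $$\displaystyle \frac{n}{n-1}$$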

### Properties

#### Unbiased

An estimator $$\displaystyle \hat{\theta}$$ is unbiased for $$\displaystyle \theta$$ if $$\displaystyle E[\hat{\theta}] = \theta$$

• Example: a single observation $$\displaystyle X_n$$ is unbiased for $$\displaystyle E[X]$$ but is not consistent

#### Consistent

An estimator $$\displaystyle \hat{\theta}$$ is consistent for $$\displaystyle \theta$$ if it converges in probability to $$\displaystyle \theta$$

• Example: $$\displaystyle \frac{1}{n}\sum (X_i-\bar{X})^2$$ is a consistent estimator
for $$\displaystyle \sigma^2$$ for $$\displaystyle N(\mu, \sigma^2)$$ but is not unbiased.
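
The difference between the two properties can be checked by simulation. The sketch below assumes $$\displaystyle N(0, 4)$$ data and uses the divisor-$$\displaystyle n$$ variance estimator. Its average over many small samples should be near $$\displaystyle \frac{n-1}{n}\sigma^2$$ (biased), while a single large sample should give a value near $$\displaystyle \sigma^2$$ (consistent). The sample sizes and seed are arbitrary choices.

```python
import random
import statistics

def biased_variance(xs):
    """Variance estimate with divisor n."""
    mean = statistics.fmean(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

rng = random.Random(2)
sigma2 = 4.0      # true variance (standard deviation 2)
n_small = 5

# Average the estimator over many small samples to approximate E[estimator].
reps = 40000
avg_small = statistics.fmean(
    [biased_variance([rng.gauss(0.0, 2.0) for _ in range(n_small)])
     for _ in range(reps)]
)
expected_small = (n_small - 1) / n_small * sigma2

# One large sample: the estimate should be close to sigma2.
large = biased_variance([rng.gauss(0.0, 2.0) for _ in range(100000)])
print(avg_small, expected_small, large)
```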

### Efficiency

#### Fisher Information

• $$\displaystyle I(\theta) = E[ (\frac{\partial}{\partial \theta} \log f(X; \theta) )^2 | \theta]$$
• or if $$\displaystyle \log f(x)$$ is twice differentiable $$\displaystyle I(\theta) = -E[ \frac{\partial^2}{\partial \theta^2} \log f(X; \theta) | \theta]$$
• For an i.i.d. sample of size $$\displaystyle n$$, $$\displaystyle I_n(\theta) = n I(\theta)$$ is the Fisher information of the sample. Equivalently, replace $$\displaystyle f$$ with the full likelihood of the sample.
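
A worked sketch for a single Bernoulli($$\displaystyle p$$) observation (an illustrative choice), checking that both expressions agree:

• $$\displaystyle \log f(x; p) = x \log p + (1-x)\log(1-p)$$
• Score: $$\displaystyle \frac{\partial}{\partial p} \log f(x; p) = \frac{x}{p} - \frac{1-x}{1-p}$$, so $$\displaystyle E\left[\left(\frac{X}{p} - \frac{1-X}{1-p}\right)^2\right] = p \cdot \frac{1}{p^2} + (1-p) \cdot \frac{1}{(1-p)^2} = \frac{1}{p(1-p)}$$
• Second derivative: $$\displaystyle \frac{\partial^2}{\partial p^2} \log f(x; p) = -\frac{x}{p^2} - \frac{1-x}{(1-p)^2}$$, so $$\displaystyle -E\left[\frac{\partial^2}{\partial p^2} \log f(X; p)\right] = \frac{1}{p} + \frac{1}{1-p} = \frac{1}{p(1-p)}$$
• For $$\displaystyle n$$ i.i.d. observations, $$\displaystyle I_n(p) = \frac{n}{p(1-p)}$$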

#### Cramer-Rao Lower Bound

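Given an estimator $$\displaystyle T(X)$$, let $$\displaystyle \psi(\theta)=E[T(X)]$$. Then $$\displaystyle Var(T) \geq \frac{(\psi'(\theta))^2}{I(\theta)}$$, where $$\displaystyle I(\theta)$$ is the Fisher information of the data $$\displaystyle X$$ (use $$\displaystyle I_n(\theta)$$ for an i.i.d. sample of size $$\displaystyle n$$).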

Notes
• If $$\displaystyle T(X)$$ is unbiased then $$\displaystyle \psi(\theta)=\theta \implies \psi'(\theta) = 1$$, so the lower bound is $$\displaystyle \frac{1}{I(\theta)}$$

The efficiency of an unbiased estimator is defined as $$\displaystyle e(T) = \frac{I(\theta)^{-1}}{Var(T)}$$
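
A simulation sketch of the bound, assuming $$\displaystyle n$$ i.i.d. Bernoulli($$\displaystyle p$$) observations and the sample mean as the estimator. The information used is that of the whole sample, $$\displaystyle I_n(p) = \frac{n}{p(1-p)}$$, so the bound is $$\displaystyle \frac{p(1-p)}{n}$$. The sample mean attains it, so the estimated efficiency should be close to 1. The values of `p`, `n`, and `reps` are arbitrary.

```python
import random
import statistics

p, n, reps = 0.3, 50, 20000
rng = random.Random(3)

def sample_mean():
    """Mean of n simulated Bernoulli(p) draws."""
    return sum(rng.random() < p for _ in range(n)) / n

# Monte Carlo estimate of Var(T) for T = sample mean.
means = [sample_mean() for _ in range(reps)]
var_mean = statistics.pvariance(means)

fisher_one = 1 / (p * (1 - p))    # information in one observation
crlb = 1 / (n * fisher_one)       # bound for unbiased estimators, sample of size n
efficiency = crlb / var_mean
print(crlb, var_mean, efficiency)
```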

## Tests

### Basic Tests

#### T-test

Used to test hypotheses about a mean (or a difference of means) when the variance is estimated from the sample.
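
A minimal sketch of the one-sample statistic $$\displaystyle t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$ with $$\displaystyle n-1$$ degrees of freedom. The function name `one_sample_t` and the data are made up for illustration. The standard library has no t-distribution CDF, so the statistic would be compared against a table or a statistics package to get a p-value.

```python
import math
import statistics

def one_sample_t(xs, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(xs)
    mean = statistics.fmean(xs)
    s = statistics.stdev(xs)      # sample standard deviation, divisor n - 1
    return (mean - mu0) / (s / math.sqrt(n)), n - 1

t_stat, dof = one_sample_t([2, 4, 4, 4, 5, 5, 7, 9], mu0=4.0)
print(t_stat, dof)
```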

#### F-test

Used to test the ratio of two variances.
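
A minimal sketch of the statistic $$\displaystyle F = \frac{s_1^2}{s_2^2}$$, which under normality and equal variances follows an F distribution with $$\displaystyle (n_1 - 1, n_2 - 1)$$ degrees of freedom. The function name `variance_ratio` and the data are made up for illustration, and the p-value lookup is left to a table or a statistics package.

```python
import statistics

def variance_ratio(xs, ys):
    """Return (F statistic, numerator df, denominator df)."""
    f_stat = statistics.variance(xs) / statistics.variance(ys)  # divisor n - 1
    return f_stat, len(xs) - 1, len(ys) - 1

f_stat, df1, df2 = variance_ratio([2, 4, 4, 4, 5, 5, 7, 9], [1, 2, 3, 4, 5])
print(f_stat, df1, df2)
```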

### Likelihood Ratio Test

• $$\displaystyle LR = -2 \log \frac{\sup_{\theta \in \Theta_0} L(\theta)}{\sup_{\theta \in \Theta} L(\theta)}$$
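
A minimal sketch of the statistic, assuming Exponential data and the null hypothesis $$\displaystyle \lambda = \lambda_0$$ (an illustrative setup). The supremum over $$\displaystyle \Theta$$ is attained at the MLE $$\displaystyle \hat{\lambda} = 1/\bar{x}$$. The cutoff 3.841 is the 0.95 quantile of a chi-square distribution with 1 degree of freedom, which is the usual large-sample approximation (Wilks' theorem) and is an assumption here rather than something stated above.

```python
import math
import random

def exp_log_likelihood(rate, xs):
    """Log-likelihood of i.i.d. Exponential(rate) data."""
    return len(xs) * math.log(rate) - rate * sum(xs)

rng = random.Random(4)
data = [rng.expovariate(1.0) for _ in range(200)]

rate0 = 1.0                          # value under the null hypothesis
rate_hat = len(data) / sum(data)     # unrestricted MLE

# LR = -2 * log( sup over Theta_0 / sup over Theta )
lr = -2 * (exp_log_likelihood(rate0, data) - exp_log_likelihood(rate_hat, data))

# Large-sample rule at the 5% level (chi-square, 1 degree of freedom).
reject = lr > 3.841
print(lr, reject)
```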

### Uniformly Most Powerful Test

UMP Test
See Wikipedia: Neyman-Pearson Lemma

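• For simple hypotheses $$\displaystyle H_0: \theta = \theta_0$$ vs. $$\displaystyle H_1: \theta = \theta_1$$, the most powerful test rejects $$\displaystyle H_0$$ on $$\displaystyle R_{NP} = \left\{x : \frac{L(\theta_0 | x)}{L(\theta_1 | x)} \leq \eta\right\}$$, where $$\displaystyle \eta$$ is chosen to give the desired size $$\displaystyle \alpha$$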
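
A worked sketch, assuming $$\displaystyle X_1, \dots, X_n$$ i.i.d. $$\displaystyle N(\theta, 1)$$ with $$\displaystyle \theta_0 < \theta_1$$ (an illustrative setup):

• $$\displaystyle \frac{L(\theta_0 | x)}{L(\theta_1 | x)} = \exp\left( (\theta_0 - \theta_1)\sum x_i + \frac{n(\theta_1^2 - \theta_0^2)}{2} \right)$$
• Since $$\displaystyle \theta_0 - \theta_1 < 0$$, the ratio is decreasing in $$\displaystyle \sum x_i$$, so the region has the form $$\displaystyle \{x : \bar{x} \geq c\}$$
• For size $$\displaystyle \alpha$$, take $$\displaystyle c = \theta_0 + \frac{z_{1-\alpha}}{\sqrt{n}}$$, where $$\displaystyle z_{1-\alpha}$$ is the standard normal quantile
• The region does not depend on $$\displaystyle \theta_1$$, so the same test is UMP for $$\displaystyle H_1: \theta > \theta_0$$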

## Confidence Sets

Confidence Intervals

## Bootstrapping

See Wikipedia: Bootstrapping (statistics)
Bootstrapping resamples from your sample to get a measure of accuracy (e.g. a standard error or confidence interval) for your statistics.

### Nonparametric Bootstrapping

In nonparametric bootstrapping, you resample from your sample with replacement.
In this scenario, you don't need to know the family of distributions that your sample comes from.
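
A minimal sketch of the nonparametric bootstrap, used here to estimate the standard error of a statistic. The function name `bootstrap_se`, the number of resamples, and the simulated data are illustrative assumptions. For the mean, the result can be sanity-checked against the usual formula $$\displaystyle s/\sqrt{n}$$.

```python
import random
import statistics

def bootstrap_se(xs, stat, n_boot=2000, seed=0):
    """Bootstrap standard error of stat, resampling xs with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    # Each replicate is the statistic computed on a resample of size n.
    reps = [stat(rng.choices(xs, k=n)) for _ in range(n_boot)]
    return statistics.stdev(reps)

rng = random.Random(5)
sample = [rng.gauss(10.0, 3.0) for _ in range(400)]

se_median = bootstrap_se(sample, statistics.median)
se_mean = bootstrap_se(sample, statistics.fmean)
analytic_se_mean = statistics.stdev(sample) / len(sample) ** 0.5
print(se_median, se_mean, analytic_se_mean)
```

The median has no simple closed-form standard error without distributional assumptions, which is the kind of case where resampling is convenient.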

### Parametric Bootstrapping

In parametric bootstrapping, you assume a family of distributions and estimate its parameters from your sample, e.g. with MLE.
Then you generate new samples from the fitted distribution on a computer and recompute your statistic on each.
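
A minimal sketch of the parametric version, assuming an Exponential model fitted by MLE ($$\displaystyle \hat{\lambda} = 1/\bar{x}$$). The function name `parametric_bootstrap_se` and the simulated data are illustrative. The result can be compared with the large-sample approximation $$\displaystyle \hat{\lambda}/\sqrt{n}$$.

```python
import random
import statistics

def parametric_bootstrap_se(xs, n_boot=2000, seed=0):
    """Standard error of the Exponential rate MLE via parametric bootstrap."""
    rng = random.Random(seed)
    n = len(xs)
    rate_hat = n / sum(xs)               # fit the model by MLE
    reps = []
    for _ in range(n_boot):
        # Simulate a new sample from the fitted model, then refit.
        fake = [rng.expovariate(rate_hat) for _ in range(n)]
        reps.append(n / sum(fake))
    return rate_hat, statistics.stdev(reps)

rng = random.Random(6)
observed = [rng.expovariate(0.5) for _ in range(300)]
rate_hat, se_rate = parametric_bootstrap_se(observed)
asymptotic_se = rate_hat / len(observed) ** 0.5
print(rate_hat, se_rate, asymptotic_se)
```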