# Statistics


## Estimation

### Method of Moments Estimator

Sometimes referred to as MME or MoM

- $E(X)=g(\theta )$
- Then invert to get your parameters as a function of your moments: $\theta =g^{-1}(E(X))$
- Replace population moments with sample moments:
  - $E(X)\rightarrow {\bar {x}}$
  - $\operatorname {Var} (X)\rightarrow {\frac {1}{n}}\sum (x_{i}-{\bar {x}})^{2}$
- ${\hat {\theta }}=g^{-1}({\bar {x}})$
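As a concrete sketch of the recipe above (the exponential model and the function name are my own illustration, not from the notes): for $X\sim \operatorname {Exponential} (\lambda )$, $E(X)=1/\lambda$, so inverting gives ${\hat {\lambda }}=1/{\bar {x}}$.

```python
# Method-of-moments sketch for an Exponential(rate) model (illustrative choice).
# E(X) = 1/rate, so inverting gives rate = 1/E(X); substituting the
# sample mean for the population moment yields rate_hat = 1/x_bar.
def mom_exponential_rate(xs):
    """Method-of-moments estimate of the Exponential rate parameter."""
    x_bar = sum(xs) / len(xs)        # sample moment replaces E(X)
    return 1.0 / x_bar               # invert g: theta = g^{-1}(E(X))

sample = [0.5, 1.0, 1.5, 2.0]        # x_bar = 1.25
print(mom_exponential_rate(sample))  # 0.8
```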

### Maximum Likelihood Estimator (MLE)

- Write out the likelihood function $L(\theta ;\mathbf {x} )=f(\mathbf {x} ;\theta )$
- (Optional) Write out the log-likelihood function $l(\theta )=\log L(\theta ;\mathbf {x} )$
- Take the derivative of the log-likelihood function w.r.t. $\theta$
- Find the maximum of the log-likelihood function by setting the first derivative to 0
- (Optional) Make sure it is a maximum by checking that the Hessian is negative definite
- Your MLE ${\hat {\theta }}$ is the value which maximizes $L(\theta )$
- Note: if the derivative is always 0, then any value is the MLE. If it is always positive, take the largest possible value.
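The steps above can be sketched for a Bernoulli sample (an illustrative choice; the function names are my own):

```python
import math

# MLE sketch for Bernoulli(p). Log-likelihood: l(p) = k*log(p) + (n-k)*log(1-p),
# where k = sum(x). Setting l'(p) = k/p - (n-k)/(1-p) = 0 gives p_hat = k/n.
def bernoulli_log_likelihood(p, xs):
    k, n = sum(xs), len(xs)
    return k * math.log(p) + (n - k) * math.log(1 - p)

def bernoulli_mle(xs):
    return sum(xs) / len(xs)       # closed-form maximizer p_hat = k/n

data = [1, 0, 1, 1]
p_hat = bernoulli_mle(data)        # 0.75
# p_hat attains a higher log-likelihood than a nearby value:
assert bernoulli_log_likelihood(p_hat, data) > bernoulli_log_likelihood(0.5, data)
print(p_hat)
```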
Notes
- If ${\hat {\theta }}$ is the MLE for $\theta$ , then the MLE for $g(\theta )$ is $g({\hat {\theta }})$

### Uniformly Minimum Variance Unbiased Estimator (UMVUE)

UMVUE, sometimes called MVUE or UMVU.
See Wikipedia: Lehmann–Scheffé theorem
An unbiased estimator that is a function of a complete sufficient statistic is a UMVUE.
In general, you should find a complete sufficient statistic using the properties of exponential families.
Then adjust it with some factors to make it unbiased, which gives the UMVUE.
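A worked example of these steps (Bernoulli is my own illustrative choice, not from the notes):

```latex
% UMVUE sketch for Bernoulli(p).
% X_1,\dots,X_n \sim \mathrm{Bernoulli}(p) is an exponential family, and
% T = \sum_i X_i is complete and sufficient for p.
% E[T/n] = p, so \bar{X} = T/n is an unbiased function of T;
% by Lehmann--Scheffe, \hat{p} = \bar{X} is the UMVUE of p.
\[
  T=\sum_{i=1}^{n}X_{i},\qquad
  E\!\left[\tfrac{T}{n}\right]=p
  \;\Longrightarrow\;
  \hat{p}_{\mathrm{UMVUE}}=\bar{X}.
\]
```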

### Properties

#### Unbiased

An estimator ${\hat {\theta }}$ is unbiased for $\theta$ if $E[{\hat {\theta }}]=\theta$

- Example: $X_{n}$ (the last observation alone) is unbiased for $E[X]$ but is not consistent

#### Consistent

An estimator ${\hat {\theta }}$ is consistent for $\theta$ if it converges in probability to $\theta$

- Example: ${\frac {1}{n}}\sum (X_{i}-{\bar {X}})^{2}$ is a consistent estimator of $\sigma ^{2}$ for $N(\mu ,\sigma ^{2})$ but is not unbiased.

### Efficiency

#### Fisher Information

- $I(\theta )=E[({\frac {\partial }{\partial \theta }}\log f(X;\theta ))^{2}\mid \theta ]$
- Or, if $\log f(x;\theta )$ is twice differentiable, $I(\theta )=-E[{\frac {\partial ^{2}}{\partial \theta ^{2}}}\log f(X;\theta )\mid \theta ]$
- $I_{n}(\theta )=n\cdot I(\theta )$ is the Fisher information of the sample. Replace $f$ with your full likelihood.
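A minimal check of the second-derivative form, using Bernoulli as an illustrative model (the function name is my own): for Bernoulli, $I(p)=1/(p(1-p))$.

```python
# Fisher information sketch for Bernoulli(p).
# log f(x;p) = x*log(p) + (1-x)*log(1-p)
# d2/dp2 log f = -x/p**2 - (1-x)/(1-p)**2
# I(p) = -E[d2/dp2 log f], taking the expectation over x in {0, 1}.
def fisher_info_bernoulli(p):
    second_deriv = lambda x: -x / p**2 - (1 - x) / (1 - p) ** 2
    # Weight the two outcomes by their probabilities p and 1-p.
    return -(p * second_deriv(1) + (1 - p) * second_deriv(0))

p = 0.25
print(fisher_info_bernoulli(p))   # matches the closed form 1/(p*(1-p))
assert abs(fisher_info_bernoulli(p) - 1 / (p * (1 - p))) < 1e-9
```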

#### Cramer-Rao Lower Bound

Given an estimator $T(X)$ , let $\psi (\theta )=E[T(X)]$ . Then $Var(T)\geq {\frac {(\psi '(\theta ))^{2}}{I(\theta )}}$

Notes
- If $T(X)$ is unbiased, then $\psi (\theta )=\theta \implies \psi '(\theta )=1$ , and the lower bound becomes ${\frac {1}{I(\theta )}}$
- The efficiency of an unbiased estimator is defined as $e(T)={\frac {I(\theta )^{-1}}{Var(T)}}$
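A sketch of the bound for the Bernoulli sample mean (illustrative, not from the notes): with $I(p)=1/(p(1-p))$, the sample-level bound $1/(nI(p))=p(1-p)/n$ equals $Var({\bar {x}})$, so ${\bar {x}}$ has efficiency 1.

```python
# Cramer-Rao sketch for the Bernoulli sample mean.
# For X_1..X_n ~ Bernoulli(p): I(p) = 1/(p*(1-p)), so the bound for an
# unbiased estimator is 1/(n*I(p)) = p*(1-p)/n, which Var(x_bar) attains.
def crlb_unbiased(n, fisher_info_one_obs):
    return 1.0 / (n * fisher_info_one_obs)

p, n = 0.3, 50
fisher_info = 1 / (p * (1 - p))    # Fisher information of one observation
var_xbar = p * (1 - p) / n         # exact variance of the sample mean
efficiency = crlb_unbiased(n, fisher_info) / var_xbar
print(efficiency)                  # x_bar attains the bound (efficiency 1)
```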

## Tests

### Basic Tests

#### T-test

Used to test the mean when the population variance is unknown.
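A minimal sketch of the one-sample t statistic using only the standard library (the helper name and data are my own):

```python
import math

# One-sample t statistic: t = (x_bar - mu0) / (s / sqrt(n)), where s is the
# sample standard deviation (n-1 denominator). Compare |t| against a
# critical value of the t distribution with n-1 degrees of freedom.
def one_sample_t(xs, mu0):
    n = len(xs)
    x_bar = sum(xs) / n
    s2 = sum((x - x_bar) ** 2 for x in xs) / (n - 1)   # unbiased variance
    return (x_bar - mu0) / math.sqrt(s2 / n)

data = [5.1, 4.9, 5.3, 5.2, 4.8, 5.0]
print(one_sample_t(data, 5.0))
```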

#### F-test

Used to test the ratio of two variances.
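A sketch of the F statistic (the helper name and data are my own):

```python
# F statistic for comparing two variances: F = s1^2 / s2^2, compared against
# an F distribution with (n1 - 1, n2 - 1) degrees of freedom.
def f_statistic(xs, ys):
    def sample_var(zs):
        m = sum(zs) / len(zs)
        return sum((z - m) ** 2 for z in zs) / (len(zs) - 1)
    return sample_var(xs) / sample_var(ys)

a = [1.0, 2.0, 3.0, 4.0]     # sample variance 5/3
b = [2.0, 2.5, 3.0, 3.5]     # sample variance 5/12
print(f_statistic(a, b))     # ratio of the two sample variances
```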

### Likelihood Ratio Test

- $LR=-2\log {\frac {\sup _{\theta \in \Theta _{0}}L(\theta )}{\sup _{\theta \in \Theta }L(\theta )}}$
- Under regularity conditions, $LR$ is asymptotically $\chi ^{2}$-distributed with degrees of freedom equal to the difference in dimension between $\Theta$ and $\Theta _{0}$ (Wilks' theorem)
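A sketch for Bernoulli data, testing $H_{0}:p=0.5$ against a free $p$ (illustrative; the names are my own):

```python
import math

# Likelihood-ratio test sketch for Bernoulli data, H0: p = p0 vs free p.
def bernoulli_loglik(p, k, n):
    return k * math.log(p) + (n - k) * math.log(1 - p)

def lr_statistic(k, n, p0=0.5):
    p_hat = k / n    # unrestricted MLE maximizes the denominator likelihood
    return -2 * (bernoulli_loglik(p0, k, n) - bernoulli_loglik(p_hat, k, n))

# 30 successes in 40 trials; compare against a chi-squared(1) critical value.
print(lr_statistic(30, 40))
```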

### Uniformly Most Powerful Test

UMP Test
See Wikipedia: Neyman-Pearson Lemma

- $R_{NP}=\left\{x:{\frac {L(\theta _{0}\mid x)}{L(\theta _{1}\mid x)}}\leq \eta \right\}$

## Confidence Sets

Confidence Intervals
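As a sketch (my own example, not from the notes), a 95% normal-approximation confidence interval for a mean:

```python
import math

# 95% normal-approximation confidence interval for a mean:
# x_bar +/- z * s/sqrt(n), with z = 1.96 for 95% coverage.
def mean_ci_95(xs):
    n = len(xs)
    x_bar = sum(xs) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    half_width = 1.96 * s / math.sqrt(n)
    return (x_bar - half_width, x_bar + half_width)

lo, hi = mean_ci_95([4.8, 5.0, 5.2, 5.4, 4.6])
print(lo, hi)
```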

## Bootstrapping

See Wikipedia: Bootstrapping (statistics)
Bootstrapping is used to resample from your sample to get a measure of accuracy of your statistics.

### Nonparametric Bootstrapping

In nonparametric bootstrapping, you resample from your sample with replacement.
In this scenario, you don't need to know the family of distributions that your sample comes from.
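A minimal sketch with the standard library (the function name and data are my own):

```python
import random

# Nonparametric bootstrap: resample the data with replacement and look at
# the spread of the recomputed statistic (here, the sample mean).
def bootstrap_means(xs, n_boot=1000, seed=0):
    rng = random.Random(seed)
    means = []
    for _ in range(n_boot):
        resample = [rng.choice(xs) for _ in xs]   # sample with replacement
        means.append(sum(resample) / len(resample))
    return means

data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
means = bootstrap_means(data)
# The spread of these bootstrap means estimates the standard error of x_bar.
print(min(means), max(means))
```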

### Parametric Bootstrapping

In parametric bootstrapping, you learn the distribution parameters of your sample, e.g. with MLE.
Then you can generate samples from that distribution on a computer.
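A minimal sketch, assuming a normal model fitted by MLE (the function name and data are my own):

```python
import random

# Parametric bootstrap: fit a normal by MLE (sample mean and the
# 1/n-denominator standard deviation), then simulate new samples from
# the fitted model and recompute the statistic on each.
def parametric_bootstrap_means(xs, n_boot=1000, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    mu_hat = sum(xs) / n                                        # MLE of mu
    sigma_hat = (sum((x - mu_hat) ** 2 for x in xs) / n) ** 0.5  # MLE of sigma
    means = []
    for _ in range(n_boot):
        sim = [rng.gauss(mu_hat, sigma_hat) for _ in range(n)]
        means.append(sum(sim) / n)
    return means

means = parametric_bootstrap_means([2.0, 4.0, 4.0, 5.0, 7.0, 9.0])
print(min(means), max(means))
```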