==Estimation==
===Method of Moments Estimator===
Sometimes referred to as MME or MoM (method of moments)
* Calculate your population moments in terms of your parameters
** <math>E(X) = g(\theta)</math>
* Then invert to get your parameters as a function of your moments
** <math>\theta = g^{-1}(E(X))</math>
* Replace population moments with sample moments
** <math>E(X) \rightarrow \bar{x}</math>
** <math>E(X^2) \rightarrow \frac{1}{n}\sum x_i^2</math>
** <math>\hat{\theta} = g^{-1}(\bar{x})</math>
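The steps above can be sketched in code. A minimal example, assuming an Exponential(<math>\lambda</math>) sample where <math>E[X] = 1/\lambda</math>, so inverting gives <math>\hat{\lambda} = 1/\bar{x}</math> (the data below is hypothetical):

```python
# Method of moments sketch for Exponential(rate): E[X] = 1/rate,
# so inverting the moment equation gives rate_hat = 1 / xbar.
def mme_exponential_rate(xs):
    xbar = sum(xs) / len(xs)
    return 1.0 / xbar

sample = [0.5, 1.0, 1.5, 2.0]        # hypothetical data, xbar = 1.25
print(mme_exponential_rate(sample))  # 1 / 1.25 = 0.8
```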
===Maximum Likelihood Estimator===
(MLE)
 
* Write out the likelihood function <math>L(\theta; \mathbf{x}) = f(\mathbf{x}; \theta)</math>
* (Optional) Write out the log-likelihood function <math>l(\theta) = \log L(\theta; \mathbf{x})</math>
* Take the derivative of the log-likelihood function with respect to <math>\theta</math>
* Find the maximum of the log-likelihood function by setting the first derivative to 0
* (Optional) Verify it is a maximum by checking that the second derivative is negative (in the multiparameter case, that the Hessian is negative definite)
* The MLE <math>\hat{\theta}</math> is the value which maximizes <math>L(\theta)</math>
* Note: if the derivative is identically 0, any value in the parameter space is an MLE. If it is always positive, the likelihood is increasing, so take the largest possible value.
 
;Notes
* If <math>\hat{\theta}</math> is the MLE for <math>\theta</math> then the MLE for <math>g(\theta)</math> is <math>g(\hat{\theta})</math>
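As a worked sketch of the steps above, for a Normal(<math>\mu, \sigma^2</math>) sample setting the score to zero gives the closed forms <math>\hat{\mu} = \bar{x}</math> and <math>\hat{\sigma}^2 = \frac{1}{n}\sum(x_i - \bar{x})^2</math> (the data is hypothetical):

```python
# MLE sketch for Normal(mu, sigma^2): setting the derivative of the
# log-likelihood to zero yields
#   mu_hat     = xbar
#   sigma2_hat = (1/n) * sum((x - xbar)^2)   # note 1/n, not 1/(n-1)
def normal_mle(xs):
    n = len(xs)
    mu = sum(xs) / n
    sigma2 = sum((x - mu) ** 2 for x in xs) / n
    return mu, sigma2

mu, s2 = normal_mle([1.0, 2.0, 3.0, 4.0])
print(mu, s2)  # 2.5 1.25
```

By the invariance note above, the MLE of <math>\sigma</math> is just the square root of <code>s2</code>.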
 
===Uniformly Minimum Variance Unbiased Estimator (UMVUE)===
{{main | Wikipedia: Minimum-variance unbiased estimator}}
UMVUE, sometimes called MVUE or UMVU.<br>
See [[Wikipedia: Lehmann–Scheffé theorem]]<br>
An unbiased estimator that is a function of a complete sufficient statistic is the UMVUE.<br>
In general, you should find a complete sufficient statistic using the property of exponential families.<br>
Then rescale or adjust it to be unbiased to obtain the UMVUE.<br>
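As a sketch of the rescaling step, take the standard Uniform(<math>0, \theta</math>) case: <math>T = \max(X_i)</math> is complete sufficient with <math>E[T] = \frac{n}{n+1}\theta</math>, so multiplying by <math>\frac{n+1}{n}</math> makes it unbiased, hence the UMVUE (the data below is hypothetical):

```python
# UMVUE sketch for theta in Uniform(0, theta): T = max(X) is complete
# sufficient with E[T] = n*theta/(n+1), so (n+1)/n * max(X) is unbiased.
def umvue_uniform_theta(xs):
    n = len(xs)
    return (n + 1) / n * max(xs)

print(umvue_uniform_theta([0.2, 0.9, 0.5, 0.7]))  # (5/4) * 0.9 = 1.125
```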
 
===Properties===
====Unbiased====
An estimator <math>\hat{\theta}</math> is unbiased for <math>\theta</math> if <math>E[\hat{\theta}] = \theta</math>
* <math>X_n</math> is unbiased for <math>E[X]</math> but is not consistent
 
====Consistent====
An estimator <math>\hat{\theta}</math> is consistent for <math>\theta</math> if it converges in probability to <math>\theta</math>
* Example: <math>\frac{1}{n}\sum (X_i-\bar{X})^2</math> is a consistent estimator
: for <math>\sigma^2</math> under <math>N(\mu, \sigma^2)</math> but is not unbiased.
 
===Efficiency===
====Fisher Information====
{{main | Wikipedia: Fisher Information}}
 
* <math>I(\theta) = E[ (\frac{\partial}{\partial \theta} \log f(X; \theta) )^2 | \theta]</math>
* or if <math>\log f(x)</math> is twice differentiable <math>I(\theta) = -E[ \frac{\partial^2}{\partial \theta^2} \log f(X; \theta) | \theta]</math>
* <math>I_n(\theta) = n \cdot I(\theta)</math> is the Fisher information of the sample. Replace <math>f</math> with the full likelihood.
 
====Cramer-Rao Lower Bound====
{{main | Wikipedia: Cramér–Rao bound}}
Given an estimator <math>T(X)</math>, let <math>\psi(\theta)=E[T(X)]</math>.
Then <math>Var(T) \geq \frac{(\psi'(\theta))^2}{I(\theta)}</math>
 
;Notes
* If <math>T(X)</math> is unbiased then <math>\psi(\theta)=\theta \implies \psi'(\theta) = 1</math>
: Our lower bound will be <math>\frac{1}{I(\theta)}</math>
 
The efficiency of an unbiased estimator is defined as <math>e(T) = \frac{I(\theta)^{-1}}{Var(T)}</math>
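A minimal numerical sketch, using a single Bernoulli(<math>p</math>) observation: the score is <math>\frac{\partial}{\partial p} \log f(x;p) = \frac{x}{p} - \frac{1-x}{1-p}</math>, and averaging its square over the pmf recovers the known closed form <math>I(p) = \frac{1}{p(1-p)}</math>, so the CRLB for an unbiased estimator from <math>n</math> observations is <math>\frac{p(1-p)}{n}</math> — exactly <math>Var(\bar{X})</math>, making <math>\bar{X}</math> efficient.

```python
# Fisher information sketch for one Bernoulli(p) observation:
# I(p) = E[(d/dp log f(X;p))^2], summed over the pmf on {0, 1}.
def fisher_bernoulli(p):
    def score(x):
        return x / p - (1 - x) / (1 - p)
    return sum(score(x) ** 2 * (p if x == 1 else 1 - p) for x in (0, 1))

p = 0.3
print(fisher_bernoulli(p), 1 / (p * (1 - p)))  # both ~4.7619
```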
 
===Sufficient Statistics===
 
====Ancillary Statistics====
 
==Tests==
===Basic Tests===
====T-test====
Used to test the mean.
====F-test====
Used to test the ratio of variances.
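A minimal sketch of the one-sample t statistic, <math>t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}</math>, where <math>s</math> uses the <math>\frac{1}{n-1}</math> convention (the data is hypothetical; the F statistic is analogous, a ratio of two sample variances):

```python
# One-sample t statistic: t = (xbar - mu0) / (s / sqrt(n)),
# where s is the sample standard deviation (1/(n-1) convention).
import math

def t_statistic(xs, mu0):
    n = len(xs)
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    return (xbar - mu0) / math.sqrt(s2 / n)

print(t_statistic([2.1, 1.9, 2.4, 2.0], 2.0))  # small |t| => weak evidence against mu0
```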
===Likelihood Ratio Test===
See [[Wikipedia: Likelihood Ratio Test]]<br>
* <math> LR = -2 \log \frac{\sup_{\theta \in \Theta_0} L(\theta)}{\sup_{\theta \in \Theta} L(\theta)}</math>
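A sketch with hypothetical data: for <math>H_0: \mu = \mu_0</math> in a Normal model with known <math>\sigma^2</math>, the statistic simplifies to <math>LR = \frac{n(\bar{x} - \mu_0)^2}{\sigma^2}</math>, which is <math>\chi^2_1</math> under <math>H_0</math>. The code checks the algebra numerically:

```python
# LRT sketch for H0: mu = mu0, Normal model, known sigma^2.
# sup over Theta0 is L(mu0); sup over Theta is L(xbar), so
# -2 log(ratio) = [sum((x-mu0)^2) - sum((x-xbar)^2)] / sigma^2.
def lr_stat(xs, mu0, sigma2):
    n = len(xs)
    xbar = sum(xs) / n
    num = sum((x - mu0) ** 2 for x in xs) - sum((x - xbar) ** 2 for x in xs)
    return num / sigma2

xs = [1.0, 2.0, 2.5, 3.0]
xbar = sum(xs) / len(xs)
direct = len(xs) * (xbar - 1.5) ** 2 / 4.0   # n*(xbar - mu0)^2 / sigma^2
print(lr_stat(xs, 1.5, 4.0), direct)         # both 0.390625
```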
===Uniformly Most Powerful Test===
UMP Test<br>
See [[Wikipedia: Neyman-Pearson Lemma]]<br>
* <math>R_{NP} = \left\{x : \frac{L(\theta_0 | x)}{L(\theta_1 | x)} \leq \eta\right\}</math>
 
===ANOVA===
==Confidence Sets==
Confidence Intervals
===Relationship with Tests===
==Regression==


==Quadratic Forms==
==Bootstrapping==
[https://en.wikipedia.org/wiki/Bootstrapping_(statistics) Wikipedia]<br>
Bootstrapping resamples from your sample to estimate the accuracy (e.g. standard error or bias) of your statistics.
===Nonparametric Bootstrapping===
In nonparametric bootstrapping, you resample from your sample with replacement.<br>
In this scenario, you don't need to know the family of distributions that your sample comes from.
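A minimal sketch of the resampling loop, estimating the standard error of the mean on hypothetical data (stdlib only; the seed is fixed for reproducibility):

```python
# Nonparametric bootstrap: resample the data with replacement many times
# and use the spread of the recomputed statistic as its standard error.
import random
import statistics

def bootstrap_se(xs, stat=statistics.mean, reps=2000, seed=0):
    rng = random.Random(seed)
    estimates = [stat([rng.choice(xs) for _ in xs]) for _ in range(reps)]
    return statistics.stdev(estimates)

data = [2.3, 1.9, 3.1, 2.8, 2.2, 2.7]   # hypothetical sample
print(bootstrap_se(data))  # approximates the standard error of the mean
```

Passing a different <code>stat</code> (e.g. <code>statistics.median</code>) bootstraps that statistic instead, which is the usual appeal of the method.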
===Parametric Bootstrapping===
In parametric bootstrapping, you learn the distribution parameters of your sample, e.g. with MLE.<br>
Then you can generate samples from that distribution on a computer.
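A minimal sketch under a Normal assumption: fit <math>\mu</math> and <math>\sigma</math> by MLE, then simulate fresh samples from the fitted distribution rather than resampling the data (same hypothetical data and fixed seed as above):

```python
# Parametric bootstrap: fit a Normal by MLE, then draw fresh samples
# from the fitted distribution instead of resampling the data.
import random
import statistics

def parametric_bootstrap_se(xs, reps=2000, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    mu = statistics.fmean(xs)                            # MLE of mu
    sigma = (sum((x - mu) ** 2 for x in xs) / n) ** 0.5  # MLE of sigma (1/n)
    estimates = [statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
                 for _ in range(reps)]
    return statistics.stdev(estimates)

data = [2.3, 1.9, 3.1, 2.8, 2.2, 2.7]
print(parametric_bootstrap_se(data))
```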


==Textbooks==
* [https://smile.amazon.com/dp/0534243126 Casella and Berger's Statistical Inference]
* [https://smile.amazon.com/dp/0321795431 Hogg, McKean, and Craig's Introduction to Mathematical Statistics (7th Edition)]