==Estimation==
===Method of Moments Estimator===
Sometimes referred to as MME or MOM.
* Calculate your population moments in terms of your parameters
** <math>E(X) = g(\theta)</math>
* Then invert to get your parameters as a function of your moments
** <math>\theta = g^{-1}(E(X))</math>
* Replace population moments with sample moments
** <math>E(X) \rightarrow \bar{x}</math>
** <math>E(X^2) \rightarrow \frac{1}{n}\sum x_i^2</math>, or equivalently <math>Var(X) \rightarrow \frac{1}{n}\sum(x_i - \bar{x})^2</math>
** <math>\hat{\theta} = g^{-1}(\bar{x})</math>
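For example, a minimal numpy sketch of these steps for <math>N(\mu, \sigma^2)</math>, where <math>E(X) = \mu</math> and <math>E(X^2) = \mu^2 + \sigma^2</math> invert to <math>\hat{\mu} = \bar{x}</math> and <math>\hat{\sigma}^2 = \frac{1}{n}\sum x_i^2 - \bar{x}^2</math> (the true parameter values below are arbitrary illustration choices):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=10_000)  # illustration values

# First moment: E[X] = mu  =>  mu_hat = sample mean
mu_hat = x.mean()
# Second moment: E[X^2] = mu^2 + sigma^2  =>  sigma2_hat = mean(x^2) - xbar^2
sigma2_hat = (x ** 2).mean() - mu_hat ** 2

print(mu_hat, sigma2_hat)  # should be near 2.0 and 9.0
</syntaxhighlight>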
===Maximum Likelihood Estimator===
(MLE)
* Write out the likelihood function <math>L(\theta; \mathbf{x}) = f(\mathbf{x}; \theta)</math>
* (Optional) Write out the log-likelihood function <math>l(\theta) = \log L(\theta; \mathbf{x})</math>
* Take the derivative of the log-likelihood function w.r.t <math>\theta</math>
* Find the maximum of the log-likelihood function by setting the first derivative to 0
* (Optional) Make sure it is the maximum by checking that the Hessian is negative definite
* Your MLE <math>\hat{\theta}</math> is the value which maximizes <math>L(\theta)</math>
* Note: if the derivative is identically 0, the likelihood is flat and any parameter value is an MLE. If the derivative is always positive, the likelihood is increasing, so take the largest value of <math>\theta</math> allowed by the parameter space.
;Notes
* If <math>\hat{\theta}</math> is the MLE for <math>\theta</math> then the MLE for <math>g(\theta)</math> is <math>g(\hat{\theta})</math>
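For example, a minimal numpy sketch of the steps above for <math>\text{Exponential}(\lambda)</math>: setting <math>\frac{\partial}{\partial \lambda} l(\lambda) = \frac{n}{\lambda} - \sum x_i = 0</math> gives <math>\hat{\lambda} = 1/\bar{x}</math>, and a grid search over the log-likelihood serves as a numeric sanity check (the true rate below is an arbitrary illustration choice):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1 / 2.5, size=5_000)  # true lambda = 2.5

# Closed form: d/d(lambda) log L = n/lambda - sum(x) = 0  =>  lambda_hat = 1/xbar
lam_closed = 1 / x.mean()

# Numeric check: maximize log L(lambda) = n*log(lambda) - lambda*sum(x) on a grid
grid = np.linspace(0.1, 10, 10_000)
loglik = len(x) * np.log(grid) - grid * x.sum()
lam_grid = grid[np.argmax(loglik)]

print(lam_closed, lam_grid)  # both near 2.5
</syntaxhighlight>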
===Uniformly Minimum Variance Unbiased Estimator (UMVUE)===
{{main | Wikipedia: Minimum-variance unbiased estimator}}
UMVUE, sometimes called MVUE or UMVU.<br>
See [[Wikipedia: Lehmann–Scheffé theorem]]<br>
An unbiased estimator that is a function of a complete sufficient statistic is a UMVUE.<br>
In general, find a complete sufficient statistic using the properties of exponential families.<br>
Then adjust it (e.g. with a constant factor) to make it unbiased; the result is the UMVUE.<br>
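Worked example: if <math>X_1, \dots, X_n \sim \text{Poisson}(\lambda)</math>, then <math>\sum X_i</math> is complete sufficient (the Poisson is an exponential family) and <math>\bar{X}</math> is an unbiased function of it, so by Lehmann–Scheffé <math>\bar{X}</math> is the UMVUE of <math>\lambda</math>.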
===Properties===
====Unbiased====
An estimator <math>\hat{\theta}</math> is unbiased for <math>\theta</math> if <math>E[\hat{\theta}] = \theta</math>
* Example: the single observation <math>X_n</math> is unbiased for <math>E[X]</math> but is not consistent
====Consistent====
An estimator <math>\hat{\theta}</math> is consistent for <math>\theta</math> if it converges in probability to <math>\theta</math>
* Example: <math>\frac{1}{n}\sum (X_i-\bar{X})^2</math> is a consistent estimator for <math>\sigma^2</math> under <math>N(\mu, \sigma^2)</math> but is not unbiased.
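A minimal simulation sketch of this distinction, assuming numpy (the variance and sample sizes are arbitrary illustration values):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def mle_var(x):
    # (1/n) * sum((x - xbar)^2): consistent for sigma^2 but biased
    return ((x - x.mean()) ** 2).mean()

# Bias at n = 5: averaging over many replications gives about
# (n-1)/n * sigma^2 = 0.8 * 4.0 = 3.2, not 4.0
est = [mle_var(rng.normal(0, 2.0, size=5)) for _ in range(20_000)]
print(np.mean(est))

# Consistency: one large sample lands close to sigma^2 = 4.0
print(mle_var(rng.normal(0, 2.0, size=1_000_000)))
</syntaxhighlight>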
===Efficiency===
====Fisher Information====
{{main | Wikipedia: Fisher Information}}
* <math>I(\theta) = E[ (\frac{\partial}{\partial \theta} \log f(X; \theta) )^2 | \theta]</math>
* or if <math>\log f(x)</math> is twice differentiable, <math>I(\theta) = -E[ \frac{\partial^2}{\partial \theta^2} \log f(X; \theta) | \theta]</math>
* <math>I_n(\theta) = n \cdot I(\theta)</math> is the Fisher information of the sample; it is what you get by replacing <math>f</math> with the full likelihood.
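Worked example: for <math>X \sim \text{Bernoulli}(p)</math>, <math>\log f(x; p) = x \log p + (1-x) \log(1-p)</math>, so <math>\frac{\partial^2}{\partial p^2} \log f = -\frac{x}{p^2} - \frac{1-x}{(1-p)^2}</math> and <math>I(p) = \frac{p}{p^2} + \frac{1-p}{(1-p)^2} = \frac{1}{p(1-p)}</math>.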
====Cramér–Rao Lower Bound====
{{main | Wikipedia: Cramér–Rao bound}}
Given an estimator <math>T(X)</math>, let <math>\psi(\theta)=E[T(X)]</math>.
Then <math>Var(T) \geq \frac{(\psi'(\theta))^2}{I(\theta)}</math>
;Notes
* If <math>T(X)</math> is unbiased then <math>\psi(\theta)=\theta \implies \psi'(\theta) = 1</math>
: Our lower bound will be <math>\frac{1}{I(\theta)}</math>
The efficiency of an unbiased estimator is defined as <math>e(T) = \frac{I(\theta)^{-1}}{Var(T)}</math>
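Worked example: for <math>N(\mu, \sigma^2)</math> with <math>\sigma^2</math> known, <math>I(\mu) = 1/\sigma^2</math>, so any unbiased estimator of <math>\mu</math> based on <math>n</math> observations has variance at least <math>\sigma^2/n</math>. Since <math>Var(\bar{X}) = \sigma^2/n</math>, the sample mean attains the bound and has efficiency <math>e(\bar{X}) = 1</math>.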
===Sufficient Statistics===
A statistic <math>T(X)</math> is sufficient for <math>\theta</math> if the conditional distribution of <math>X</math> given <math>T(X)</math> does not depend on <math>\theta</math>.
====Ancillary Statistics====
A statistic is ancillary for <math>\theta</math> if its distribution does not depend on <math>\theta</math>.
==Tests==
===Basic Tests===
Used to test the ratio of variances.
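Assuming this refers to the two-sample F-test for equality of variances, a minimal scipy sketch (sample sizes and parameters are arbitrary illustration values):
<syntaxhighlight lang="python">
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(0, 2.0, size=30)
y = rng.normal(0, 2.0, size=40)

# F statistic: ratio of the unbiased sample variances
f_stat = x.var(ddof=1) / y.var(ddof=1)
dfx, dfy = len(x) - 1, len(y) - 1

# Two-sided p-value from the F(dfx, dfy) reference distribution
p = 2 * min(stats.f.cdf(f_stat, dfx, dfy), stats.f.sf(f_stat, dfx, dfy))
print(f_stat, p)
</syntaxhighlight>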
===Likelihood Ratio Test===
See [[Wikipedia: Likelihood Ratio Test]]<br>
* <math> LR = -2 \log \frac{\sup_{\theta \in \Theta_0} L(\theta)}{\sup_{\theta \in \Theta} L(\theta)}</math>
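A minimal sketch of this statistic for testing <math>H_0: \lambda = \lambda_0</math> with exponential data, using the asymptotic <math>\chi^2_1</math> null distribution from Wilks' theorem (the rates below are arbitrary illustration choices):
<syntaxhighlight lang="python">
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.exponential(scale=1 / 2.5, size=200)  # true lambda = 2.5

def loglik(lam, x):
    # log L(lambda) = n*log(lambda) - lambda*sum(x)
    return len(x) * np.log(lam) - lam * x.sum()

lam0 = 2.0               # restricted (null) value
lam_hat = 1 / x.mean()   # unrestricted MLE

LR = -2 * (loglik(lam0, x) - loglik(lam_hat, x))
p = stats.chi2.sf(LR, df=1)
print(LR, p)
</syntaxhighlight>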
===Uniformly Most Powerful Test===
UMP Test<br>
See [[Wikipedia: Neyman-Pearson Lemma]]<br>
* <math>R_{NP} = \left\{x : \frac{L(\theta_0 | x)}{L(\theta_1 | x)} \leq \eta\right\}</math>
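Worked example: for <math>X_1, \dots, X_n \sim N(\theta, \sigma^2)</math> with <math>\sigma^2</math> known and <math>H_0: \theta = \theta_0</math> vs <math>H_1: \theta = \theta_1 > \theta_0</math>, the ratio <math>L(\theta_0|x)/L(\theta_1|x)</math> is decreasing in <math>\bar{x}</math>, so the Neyman–Pearson region rejects for large <math>\bar{x}</math>. Because this region does not depend on the particular <math>\theta_1</math>, the same test is UMP against every <math>\theta > \theta_0</math>.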
===ANOVA===
==Quadratic Forms==
==Bootstrapping==
[https://en.wikipedia.org/wiki/Bootstrapping_(statistics) Wikipedia]<br>
Bootstrapping is used to resample from your sample to get a measure of the accuracy of your statistics.
===Nonparametric Bootstrapping===
In nonparametric bootstrapping, you resample from your sample with replacement.<br>
In this scenario, you don't need to know the family of distributions that your sample comes from.
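A minimal numpy sketch, estimating the standard error of the sample median (the observed sample here is simulated only for illustration):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100)  # stand-in for observed data

B = 2_000
boot_medians = np.empty(B)
for b in range(B):
    # Resample from the sample, with replacement, at the original size
    resample = rng.choice(x, size=len(x), replace=True)
    boot_medians[b] = np.median(resample)

# Point estimate and its bootstrap standard error
print(np.median(x), boot_medians.std(ddof=1))
</syntaxhighlight>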
===Parametric Bootstrapping===
In parametric bootstrapping, you estimate the parameters of your sample's distribution, e.g. with MLE.<br>
Then you can generate samples from that fitted distribution on a computer.
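A minimal numpy sketch for <math>\text{Exponential}(\lambda)</math>, fitting <math>\hat{\lambda} = 1/\bar{x}</math> by MLE and then resampling from the fitted distribution (the true rate below is an arbitrary illustration choice):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1 / 2.5, size=100)  # stand-in for observed data

lam_hat = 1 / x.mean()  # MLE for the exponential rate

B = 2_000
boot_lams = np.empty(B)
for b in range(B):
    # Draw a fresh sample from the fitted distribution and re-estimate
    xb = rng.exponential(scale=1 / lam_hat, size=len(x))
    boot_lams[b] = 1 / xb.mean()

# Point estimate and its parametric-bootstrap standard error
print(lam_hat, boot_lams.std(ddof=1))
</syntaxhighlight>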
==Textbooks==
* [https://smile.amazon.com/dp/0534243126 Casella and Berger's Statistical Inference]
* [https://smile.amazon.com/dp/0321795431 Hogg, McKean, and Craig's Introduction to Mathematical Statistics (7th Edition)]