Machine Learning

 
The cross entropy loss is
* <math>J(\theta) = -\sum_i \left[ y^{(i)}\log(h_\theta(x^{(i)})) + (1-y^{(i)})\log(1-h_\theta(x^{(i)})) \right]</math>
;Notes
* This is the negative sum of the log probabilities of picking the correct class (i.e. <math>p</math> if <math>y=1</math> or <math>1-p</math> if <math>y=0</math>), so minimizing it maximizes the likelihood.
* If our model is <math>h_\theta(x^{(i)}) = g(\theta^Tx^{(i)})</math> where <math>g(z)</math> is the sigmoid function <math>\frac{e^z}{1+e^z}</math>, then this loss is convex in <math>\theta</math>
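A minimal NumPy sketch of this loss for a sigmoid model (the helper names and toy data here are illustrative, not from the article):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = e^z / (1 + e^z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy_loss(theta, X, y):
    """Cross entropy loss J(theta) for logistic regression.

    X: (n, d) design matrix; y: (n,) labels in {0, 1}.
    """
    p = sigmoid(X @ theta)  # h_theta(x^(i)) for every example
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Toy data: first column of X acts as an intercept feature.
X = np.array([[1.0, 2.0], [1.0, -2.0]])
y = np.array([1.0, 0.0])
theta = np.array([0.0, 1.0])
loss = cross_entropy_loss(theta, X, y)  # both points classified well, so loss is small
```

A perfectly confident correct prediction would drive each log term toward 0; a confident wrong one drives the loss toward infinity, which is why this objective punishes overconfidence.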




==SVM==
[https://see.stanford.edu/materials/aimlcs229/cs229-notes3.pdf Andrew Ng Notes]<br>
Support Vector Machine<br>
This is a linear classifier, like the perceptron, except the goal is not just to classify the data correctly but also to maximize the margin between the classes.<br>
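The margin-maximization idea can be sketched with the standard soft-margin (hinge loss) objective; the function name, toy data, and the <math>C</math> parameter below are illustrative, not from the article:

```python
import numpy as np

def svm_objective(w, b, X, y, C=1.0):
    """Soft-margin SVM objective: (1/2)||w||^2 + C * sum_i max(0, 1 - y_i (w.x_i + b)).

    Labels y_i are in {-1, +1}. The hinge term penalizes points on the wrong
    side of (or inside) the margin; minimizing ||w||^2 maximizes the geometric
    margin width 2/||w||.
    """
    margins = y * (X @ w + b)
    return 0.5 * np.dot(w, w) + C * np.sum(np.maximum(0.0, 1.0 - margins))

# Two points on opposite sides, each at functional margin 2: no hinge penalty,
# so the objective reduces to ||w||^2 / 2 = 0.5.
X = np.array([[2.0, 0.0], [-2.0, 0.0]])
y = np.array([1.0, -1.0])
w = np.array([1.0, 0.0])
obj = svm_objective(w, 0.0, X, y)
```

A perceptron would accept any separating hyperplane here; the SVM objective prefers the one with the widest margin.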
Hoeffding's inequality: Let <math>X_1,...,X_n</math> be independent random variables bounded in <math>(a,b)</math>.<br>
Then <math>P(|\bar{X}-E[\bar{X}]| \geq t) \leq 2\exp(-\frac{2nt^2}{(b-a)^2})</math>
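A quick simulation can sanity-check the bound. This sketch assumes Uniform(0,1) samples, which are bounded in <math>(a,b)=(0,1)</math>; the trial counts are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n, t, trials = 100, 0.1, 10_000
a, b = 0.0, 1.0

# Each row is one experiment: n Uniform(a, b) draws.
samples = rng.uniform(a, b, size=(trials, n))
deviations = np.abs(samples.mean(axis=1) - 0.5)  # E[X-bar] = 0.5 for Uniform(0, 1)
empirical = np.mean(deviations >= t)             # estimated P(|X-bar - E[X-bar]| >= t)
bound = 2 * np.exp(-2 * n * t**2 / (b - a)**2)   # 2 exp(-2) ~ 0.27
```

The empirical tail probability comes out far below the bound, as expected: Hoeffding's inequality holds for any bounded distribution, so for a specific one it is typically loose.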
==See Also==
* [[Supervised Learning]]
* [[Unsupervised Learning]]
* [[Deep Learning]]