Machine Learning
The cross entropy loss is
* <math>J(\theta) = -\sum_i \left[ y^{(i)}\log(h_\theta(x^{(i)})) + (1-y^{(i)})\log(1-h_\theta(x^{(i)})) \right]</math>
;Notes
* This is the negative sum of the log probabilities of picking the correct class (i.e. p if y=1 or 1-p if y=0), so minimizing it maximizes the likelihood.
* If our model is <math>h_\theta(x^{(i)}) = g(\theta^Tx^{(i)})</math> where <math>g(x)</math> is the sigmoid function <math>\frac{e^x}{1+e^x}</math>, then this loss is convex in <math>\theta</math> (see the sketch after these notes).
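A minimal NumPy sketch of this loss for the sigmoid model above; it is not from the original page, and the names <code>sigmoid</code> and <code>cross_entropy</code> as well as the toy data are illustrative assumptions.
<syntaxhighlight lang="python">
import numpy as np

def sigmoid(z):
    # g(z) = e^z / (1 + e^z) = 1 / (1 + e^-z)
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(theta, X, y):
    # h_theta(x_i) = g(theta^T x_i) for every row of X
    h = sigmoid(X @ theta)
    # J(theta) = -sum[ y*log(h) + (1-y)*log(1-h) ]
    return -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

# toy example: 4 points, a bias column plus 2 features (hypothetical data)
X = np.array([[1.0, 0.5, 1.2],
              [1.0, -1.0, 0.3],
              [1.0, 2.0, -0.5],
              [1.0, 0.1, 0.8]])
y = np.array([1, 0, 1, 0])
theta = np.zeros(3)
print(cross_entropy(theta, X, y))  # 4 * log(2) ~ 2.77 at theta = 0
</syntaxhighlight>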
==SVM==
[https://see.stanford.edu/materials/aimlcs229/cs229-notes3.pdf Andrew Ng Notes]<br>
Support Vector Machine<br>
This is a linear classifier, like the perceptron, except the goal is not just to classify the data correctly but also to maximize the margin.<br>
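The sketch below is an illustrative assumption rather than the notes' implementation: it trains a soft-margin linear SVM by subgradient descent on the hinge loss, where shrinking <math>\|w\|</math> widens the margin <math>2/\|w\|</math>. The function name <code>train_linear_svm</code>, the hyperparameters, and the toy data are made up for the example.
<syntaxhighlight lang="python">
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Primal soft-margin SVM via subgradient descent on the hinge loss.

    Minimizes  lam/2 * ||w||^2 + (1/n) * sum_i max(0, 1 - y_i (w.x_i + b)),
    with labels y_i in {-1, +1}.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1                     # points inside the margin or misclassified
        grad_w = lam * w - (y[mask, None] * X[mask]).sum(axis=0) / n
        grad_b = -y[mask].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# toy linearly separable data (hypothetical)
X = np.array([[2.0, 2.0], [1.5, 2.5], [-2.0, -1.0], [-1.0, -2.0]])
y = np.array([1, 1, -1, -1])
w, b = train_linear_svm(X, y)
print(np.sign(X @ w + b))  # should match y
</syntaxhighlight>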
==Hoeffding's Inequality==
Let <math>X_1,...,X_n</math> be independent random variables bounded in <math>[a,b]</math>.<br>
Then <math>P(|\bar{X}-E[\bar{X}]| \geq t) \leq 2\exp(-\frac{2nt^2}{(b-a)^2})</math>
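As a sanity check (illustrative, not part of the original notes), the following NumPy simulation compares the empirical probability <math>P(|\bar{X}-E[\bar{X}]| \geq t)</math> with the Hoeffding bound for uniform samples on <math>[a,b]</math>; the parameter values are assumptions for the example, and the bound itself holds for any independent, bounded variables.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
a, b, n, t, trials = 0.0, 1.0, 100, 0.1, 100_000

# each row is one sample of X_1, ..., X_n ~ Uniform(a, b)
samples = rng.uniform(a, b, size=(trials, n))
deviations = np.abs(samples.mean(axis=1) - (a + b) / 2)   # |Xbar - E[Xbar]|

empirical = np.mean(deviations >= t)                       # estimated P(|Xbar - E[Xbar]| >= t)
bound = 2 * np.exp(-2 * n * t**2 / (b - a) ** 2)           # Hoeffding upper bound
print(f"empirical: {empirical:.4f}  bound: {bound:.4f}")   # empirical should not exceed the bound
</syntaxhighlight>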
==See Also== | |||
* [[Supervised Learning]] | |||
* [[Unsupervised Learning]] | |||
* [[Deep Learning]] | |||