Machine Learning

In this case, one way to get around this is to perform a non-linear preprocessing of the data <math>\phi(x)</math>.<br>
For example <math>\phi(x) = \begin{bmatrix}x \\ x^2 \\ x^3\end{bmatrix}</math>
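As an illustrative sketch (the data and cubic target here are hypothetical, assuming NumPy), this feature map lets an ordinary linear least-squares fit capture a cubic relationship that a linear model in <math>x</math> alone could not:

```python
import numpy as np

def phi(x):
    # Non-linear preprocessing phi(x) = [x, x^2, x^3]
    return np.stack([x, x**2, x**3], axis=-1)

# Toy data: y is a cubic function of x
x = np.linspace(-2, 2, 50)
y = x**3 - x

# Ordinary least squares, but on the transformed features phi(x)
w, *_ = np.linalg.lstsq(phi(x), y, rcond=None)
print(w)  # close to [-1, 0, 1], recovering y = -x + x^3
```

The model is still linear in its parameters <math>w</math>; the non-linearity lives entirely in <math>\phi</math>.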
\(\DeclareMathOperator{\sign}{sign}\)
Suppose our model is <math>\hat{y}=\sign \sum_{i=1}^{n} w_i y_i \langle x_i, z \rangle</math>. 
In this case, our model is a linear combination of the training labels <math>y_i</math>, where <math>\langle x_i, z \rangle</math> measures the similarity between <math>z</math> and <math>x_i</math>. 
Since the model depends on the data only through inner products <math>\langle x, z\rangle</math>, replacing them with <math>\phi(x)^T\phi(z)</math> is enough to simulate a non-linear preprocessing of the data.
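A minimal sketch of this prediction rule (the dual form of a linear classifier; the training points, labels, and fixed weights below are hypothetical stand-ins for learned values), assuming NumPy:

```python
import numpy as np

# Hypothetical training set: points x_i with labels y_i in {-1, +1}
X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -1.0], [-2.0, -0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = np.ones(len(X))  # dual weights w_i (learned in practice; fixed here)

def predict(z):
    # y_hat = sign( sum_i w_i * y_i * <x_i, z> )
    scores = X @ z  # inner products <x_i, z> with every training point
    return np.sign(np.sum(w * y * scores))

print(predict(np.array([1.5, 1.5])))    # +1: similar to the positive points
print(predict(np.array([-1.5, -1.0])))  # -1: similar to the negative points
```

Swapping each <math>\langle x_i, z \rangle</math> for a kernel evaluation <math>K(x_i, z)</math> is the only change needed to make this model non-linear.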
 
A kernel <math>K(x,z)</math> is a function that can be expressed as <math>K(x,z)=\phi(x)^T\phi(z)</math> for some function <math>\phi</math>.<br>
Ideally, the kernel should be computable without ever forming <math>\phi(x)</math> or evaluating the full dot product explicitly.
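For instance (a standard textbook example, not specific to this article), the polynomial kernel <math>K(x,z)=(x^Tz)^2</math> on <math>\mathbb{R}^2</math> corresponds to the 3-dimensional feature map <math>\phi(x) = \begin{bmatrix}x_1^2 \\ \sqrt{2}\,x_1 x_2 \\ x_2^2\end{bmatrix}</math>, yet evaluating it costs only a single 2-dimensional dot product:

```python
import numpy as np

def phi(x):
    # Explicit feature map for K(x, z) = (x . z)^2 on R^2
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

def K(x, z):
    # Kernel trick: equals phi(x)^T phi(z) without ever forming phi
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])
print(K(x, z), np.dot(phi(x), phi(z)))  # both 121.0
```

The gap widens for higher-degree kernels: <math>(x^Tz)^d</math> still costs one dot product, while the explicit <math>\phi</math> grows to <math>O(n^d)</math> features.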
====Identifying if a function is a kernel====
Basic check: