
Machine Learning: Difference between revisions

Since the data enters only through inner products <math>\langle x, z\rangle</math>, we only need <math>\phi(x)^T\phi(z)</math> to simulate a non-linear processing of the data.


A kernel <math>K(x,z)</math> is a function that can be expressed as <math>K(x,z)=\phi(x)^T\phi(z)</math> for some function <math>\phi</math>.  
Ideally, the kernel can be evaluated directly, without computing <math>\phi</math> explicitly or forming the full dot product.
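As a minimal sketch of this idea, the quadratic kernel <math>K(x,z)=(x^Tz)^2</math> is a standard example: it equals <math>\phi(x)^T\phi(z)</math> when <math>\phi(x)</math> lists all pairwise products <math>x_i x_j</math>. The names <code>phi</code>, <code>dot</code>, and <code>quadratic_kernel</code> below are illustrative and do not come from the article.

```python
from itertools import product


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def phi(x):
    """Explicit feature map (illustrative): all n^2 pairwise products x_i * x_j."""
    return [xi * xj for xi, xj in product(x, repeat=2)]


def quadratic_kernel(x, z):
    """K(x, z) = (x^T z)^2, evaluated in O(n) without ever building phi."""
    return dot(x, z) ** 2


x = [1.0, 2.0, 3.0]
z = [4.0, 5.0, 6.0]

explicit = dot(phi(x), phi(z))     # builds n^2 features for each vector
implicit = quadratic_kernel(x, z)  # only needs the n-dimensional dot product

print(explicit, implicit)
```

Both values agree, but the kernel form avoids the <math>O(n^2)</math> feature vector entirely.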
 
====Identifying if a function is a kernel====
Basic check:
Since a kernel is an inner product between <math>\phi(x)</math> and <math>\phi(z)</math>, it must satisfy the axioms of inner products, in particular symmetry: <math>K(x,z)=K(z,x)</math>. A function that is not symmetric is not a kernel.
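The symmetry check can be sketched numerically on a handful of sample points. This is only a hedged illustration: <code>is_symmetric</code>, <code>linear_kernel</code>, and <code>asymmetric_fn</code> are hypothetical names, and passing the check on some points is a necessary condition, not proof that the function is a kernel.

```python
def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def is_symmetric(K, points, tol=1e-12):
    """Return True if K(x, z) == K(z, x) (up to tol) for every pair of sample points."""
    return all(abs(K(x, z) - K(z, x)) <= tol for x in points for z in points)


def linear_kernel(x, z):
    """K(x, z) = x^T z, a valid kernel with phi(x) = x."""
    return dot(x, z)


def asymmetric_fn(x, z):
    """Hypothetical non-kernel: the extra x[0] term breaks symmetry."""
    return dot(x, z) + x[0]


points = [(1.0, 2.0), (3.0, -1.0), (0.0, 5.0)]

linear_ok = is_symmetric(linear_kernel, points)
asymmetric_ok = is_symmetric(asymmetric_fn, points)

print(linear_ok, asymmetric_ok)
```

A single asymmetric pair is enough to rule a function out, which is why the check is a useful first filter.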
 
====Mercer's Theorem====
Let our kernel function be <math>K(z,x)</math>.