In this post, an SVM classifier is implemented with both a “Hard Margin” and a “Soft Margin”. The hard margin is used to classify separable data, while the soft margin is used to classify inseparable data. In addition, kernels can be used in this SVM classifier; here the SVM is implemented with a linear kernel and a Gaussian (RBF) kernel.
SVM Classifier
We will use the quadratic program solver CVXOPT to solve the Lagrangian dual of the SVM. A tutorial of CVXOPT can be found here (cvxopt).

```python
import numpy as np
import matplotlib.pyplot as plt
import cvxopt
from cvxopt import matrix
from cvxopt import solvers
```
Hard Margin SVM
The primal problem of the hard-margin SVM is

$$\min_{\lambda, \lambda_0} \; \frac{1}{2}\|\lambda\|^2 \quad \text{s.t.} \quad y_i(\lambda^T \mathbf{x_i} + \lambda_0) \geq 1, \; \forall i$$
The Lagrangian of the SVM is

$$L(\lambda, \lambda_0, \alpha) = \frac{1}{2}\|\lambda\|^2 - \sum_{i=1}^n \alpha_i \left[ y_i(\lambda^T \mathbf{x_i} + \lambda_0) - 1 \right], \quad \alpha_i \geq 0 \; \forall i$$
The KKT conditions are

$$\frac{\partial L}{\partial \lambda} = 0 \Rightarrow \lambda = \sum_{i=1}^n \alpha_i y_i \mathbf{x_i}, \qquad \frac{\partial L}{\partial \lambda_0} = 0 \Rightarrow \sum_{i=1}^n \alpha_i y_i = 0,$$

$$\alpha_i \geq 0, \qquad y_i(\lambda^T \mathbf{x_i} + \lambda_0) - 1 \geq 0, \qquad \alpha_i \left[ y_i(\lambda^T \mathbf{x_i} + \lambda_0) - 1 \right] = 0, \quad \forall i$$
The dual problem is

$$\max_{\alpha} \; \sum_{i=1}^n \alpha_i - \frac{1}{2}\sum_{i=1}^n \sum_{j=1}^n \alpha_i \alpha_j y_i y_j \langle \mathbf{x_i}, \mathbf{x_j} \rangle \quad \text{s.t.} \quad \alpha_i \geq 0 \; \forall i, \quad \sum_{i=1}^n \alpha_i y_i = 0$$
This dual problem can be solved by a quadratic program solver. CVXOPT solves problems of the standard form

$$\min_{\alpha} \; \frac{1}{2}\alpha^T \mathbf{P} \alpha + \mathbf{q}^T \alpha \quad \text{s.t.} \quad \mathbf{G}\alpha \preceq \mathbf{h}, \quad \mathbf{A}\alpha = \mathbf{b}$$

Rewriting the dual as a minimization, we identify $\mathbf{P}_{ij} = y_i y_j \langle \mathbf{x_i}, \mathbf{x_j} \rangle$, $\mathbf{q} = -\mathbf{1}$, $\mathbf{G} = -\mathbf{I}$, $\mathbf{h} = \mathbf{0}$, $\mathbf{A} = \mathbf{y}^T$ and $b = 0$.
The optimal value of $\alpha_i$ can be obtained by solving the above quadratic problem using CVXOPT.
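As a concrete sketch of this step (not the post's original code), the dual can be passed to `cvxopt.solvers.qp` as follows. The names `X`, `y`, and `solve_hard_margin_dual` are assumptions made for illustration, with `X` an $n \times d$ array of training points and `y` the $\pm 1$ labels.

```python
import numpy as np
from cvxopt import matrix, solvers

def solve_hard_margin_dual(X, y):
    # X: (n, d) array of training points, y: (n,) array of labels in {-1, +1}
    n = X.shape[0]
    K = X @ X.T                                       # linear-kernel Gram matrix <x_i, x_j>
    P = matrix((np.outer(y, y) * K).astype(float))    # P_ij = y_i y_j <x_i, x_j>
    q = matrix(-np.ones(n))                           # q = -1
    G = matrix(-np.eye(n))                            # -alpha_i <= 0, i.e. alpha_i >= 0
    h = matrix(np.zeros(n))
    A = matrix(y.reshape(1, -1).astype(float))        # equality constraint sum_i alpha_i y_i = 0
    b = matrix(0.0)
    sol = solvers.qp(P, q, G, h, A, b)
    return np.ravel(sol['x'])                         # optimal alpha vector
```

The entries of the returned $\alpha$ vector that are strictly positive correspond to the support vectors used below.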
Making Predictions Using the Linear Kernel
For the linear kernel, we make a prediction by

$$\hat{y} = \text{sign}\left( \lambda^{\star T} \mathbf{x} + \lambda_0^{\star} \right)$$

We can compute $\lambda^{\star}$ from the KKT stationarity condition,

$$\lambda^{\star} = \sum_{i=1}^n \alpha_i^{\star} y_i \mathbf{x_i}$$

Then we can use $\lambda^{\star}$ and the support vectors to compute $\lambda_0^{\star}$ by

$$\lambda_0^{\star} = y_k - \lambda^{\star T} \mathbf{x_k} \quad \text{for any support vector } \mathbf{x_k}$$

In this implementation, $\lambda_0^{\star}$ is computed from every support vector, and the average is taken as the final $\lambda_0^{\star}$.
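A small sketch of this step, reusing the assumed names from the snippet above (`X`, `y`, and the `alpha` returned by the solver); the support-vector threshold `1e-5` is an illustrative choice rather than anything from the original implementation.

```python
import numpy as np

def linear_weights_and_bias(X, y, alpha, tol=1e-5):
    # Support vectors are the points with alpha_i above a small threshold.
    sv = alpha > tol
    # lambda* = sum_i alpha_i y_i x_i
    w = ((alpha[sv] * y[sv])[:, None] * X[sv]).sum(axis=0)
    # lambda_0* averaged over all support vectors
    b = np.mean(y[sv] - X[sv] @ w)
    return w, b

def predict_linear(X_new, w, b):
    # y_hat = sign(lambda*^T x + lambda_0*)
    return np.sign(X_new @ w + b)
```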
Gaussian Kernel (RBF)
For a non-linear kernel, if we do not know how the kernel maps features into the new feature space, $\lambda^{\star}$ cannot be computed by $\lambda^{\star} = \sum_{i=1}^n \alpha_i^{\star} y_i \mathbf{x_i^{\mathcal{H_k}}}$, because the mapped points $\mathbf{x_i^{\mathcal{H_k}}}$ are unknown.
However, all we actually need are the values of the inner products of the $\mathbf{x_i^{\mathcal{H_k}}}$ in the new feature space, which the kernel gives us directly.
For solving the dual problem, we just need the inner product values in the feature space, $\langle \mathbf{x_i^{\mathcal{H_k}}}, \mathbf{x_j^{\mathcal{H_k}}} \rangle = K(\mathbf{x_i}, \mathbf{x_j})$.
For making a prediction, we likewise only need the inner products of the new point $\mathbf{x}$ with the support vectors in the feature space:

$$\hat{y} = \text{sign}\left( \sum_{i=1}^n \alpha_i^{\star} y_i \, K(\mathbf{x_i}, \mathbf{x}) + \lambda_0^{\star} \right)$$
Thus, for the SVM using the Gaussian kernel, all we need is the kernel value of two vectors,

$$K(\mathbf{x_i}, \mathbf{x_j}) = \exp\left( -\frac{\|\mathbf{x_i} - \mathbf{x_j}\|^2}{2\sigma^2} \right)$$

Then we can use this inner product value to solve the dual problem and make predictions.
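As a hedged sketch, the Gaussian kernel value, the Gram matrix needed by the dual, and the kernelized prediction could be computed like this; the function names, the default `sigma`, and the argument layout are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(x1, x2, sigma=1.0):
    # K(x1, x2) = exp(-||x1 - x2||^2 / (2 sigma^2))
    return np.exp(-np.sum((x1 - x2) ** 2) / (2.0 * sigma ** 2))

def gram_matrix(X, kernel):
    # K_ij = kernel(x_i, x_j), used in the dual in place of <x_i, x_j>
    n = X.shape[0]
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(X[i], X[j])
    return K

def predict_kernel(X_new, sv_X, sv_y, sv_alpha, b, kernel):
    # y_hat = sign( sum_i alpha_i y_i K(x_i, x) + lambda_0 )
    scores = [np.sum(sv_alpha * sv_y * np.array([kernel(x_i, x) for x_i in sv_X])) + b
              for x in X_new]
    return np.sign(np.array(scores))
```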
Soft Margin
The primal problem of the soft-margin SVM is

$$\min_{\lambda, \lambda_0, \xi} \; \frac{1}{2}\|\lambda\|^2 + C\sum_{i=1}^n \xi_i \quad \text{s.t.} \quad y_i(\lambda^T \mathbf{x_i} + \lambda_0) \geq 1 - \xi_i, \quad \xi_i \geq 0, \; \forall i$$
The Lagrangian of this primal is

$$L(\lambda, \lambda_0, \xi, \alpha, \mu) = \frac{1}{2}\|\lambda\|^2 + C\sum_{i=1}^n \xi_i - \sum_{i=1}^n \alpha_i \left[ y_i(\lambda^T \mathbf{x_i} + \lambda_0) - 1 + \xi_i \right] - \sum_{i=1}^n \mu_i \xi_i, \quad \alpha_i \geq 0, \; \mu_i \geq 0$$
The dual problem is

$$\max_{\alpha} \; \sum_{i=1}^n \alpha_i - \frac{1}{2}\sum_{i=1}^n \sum_{j=1}^n \alpha_i \alpha_j y_i y_j \langle \mathbf{x_i}, \mathbf{x_j} \rangle \quad \text{s.t.} \quad 0 \leq \alpha_i \leq C \; \forall i, \quad \sum_{i=1}^n \alpha_i y_i = 0$$
The only difference from the hard-margin dual is the box constraint $0 \leq \alpha_i \leq C \text{ }\forall i$. To solve this quadratic program, all we need to change are the matrices $\mathbf{G}$ and $\mathbf{h}$: the extra constraint $\alpha_i \leq C$ stacks an identity block onto $\mathbf{G}$ and a block of $C$'s onto $\mathbf{h}$, as sketched below.
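To make the change concrete, here is a minimal sketch of the updated constraint matrices; `n` and `C` are placeholder values chosen for illustration.

```python
import numpy as np
from cvxopt import matrix

n, C = 100, 1.0   # number of training points and the penalty; placeholder values

# Hard margin: G = -I, h = 0 enforces alpha_i >= 0.
# Soft margin: stack alpha_i <= C underneath, so G = [-I; I] and h = [0; C*1].
G = matrix(np.vstack([-np.eye(n), np.eye(n)]))
h = matrix(np.hstack([np.zeros(n), C * np.ones(n)]))
# P, q, A and b stay exactly as in the hard-margin case.
```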
The SVM is implemented in Python as a class named `SVM`.
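A minimal sketch of such a class is given below, tying the pieces above together. The constructor arguments, the support-vector threshold, and the default linear kernel are illustrative assumptions rather than a reproduction of the original code.

```python
import numpy as np
from cvxopt import matrix, solvers

class SVM:
    """Kernel SVM trained by solving the dual with CVXOPT."""

    def __init__(self, kernel=None, C=None):
        # kernel: a function k(x1, x2); None means the linear kernel <x1, x2>.
        # C: soft-margin penalty; None means a hard margin.
        self.kernel = kernel if kernel is not None else (lambda a, b: a @ b)
        self.C = C

    def fit(self, X, y, tol=1e-5):
        n = X.shape[0]
        # Gram matrix K_ij = k(x_i, x_j)
        K = np.array([[self.kernel(X[i], X[j]) for j in range(n)] for i in range(n)])
        P = matrix((np.outer(y, y) * K).astype(float))
        q = matrix(-np.ones(n))
        if self.C is None:                       # hard margin: alpha_i >= 0
            G = matrix(-np.eye(n))
            h = matrix(np.zeros(n))
        else:                                    # soft margin: 0 <= alpha_i <= C
            G = matrix(np.vstack([-np.eye(n), np.eye(n)]))
            h = matrix(np.hstack([np.zeros(n), self.C * np.ones(n)]))
        A = matrix(y.reshape(1, -1).astype(float))
        b = matrix(0.0)
        solvers.options['show_progress'] = False
        alpha = np.ravel(solvers.qp(P, q, G, h, A, b)['x'])

        sv = alpha > tol                         # keep only the support vectors
        self.alpha, self.sv_X, self.sv_y = alpha[sv], X[sv], y[sv]
        # lambda_0*: averaged over all support vectors, as described above
        self.b = np.mean([
            y_k - np.sum(self.alpha * self.sv_y *
                         np.array([self.kernel(x_i, x_k) for x_i in self.sv_X]))
            for x_k, y_k in zip(self.sv_X, self.sv_y)
        ])
        return self

    def predict(self, X):
        # y_hat = sign( sum_i alpha_i y_i K(x_i, x) + lambda_0* )
        scores = np.array([
            np.sum(self.alpha * self.sv_y *
                   np.array([self.kernel(x_i, x) for x_i in self.sv_X])) + self.b
            for x in X
        ])
        return np.sign(scores)
```

For example, `SVM(C=1.0).fit(X_train, y_train).predict(X_test)` would train a soft-margin linear SVM, while passing `kernel=lambda a, b: np.exp(-np.sum((a - b) ** 2) / 2.0)` gives the Gaussian kernel with $\sigma = 1$.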
That’s it.