Linear algorithms such as the linear SVM and logistic regression cannot separate classes whose boundary is nonlinear.
Kernel methods, built on a mapping function \(\phi\), can solve such nonlinear problems.
The mapping function projects nonlinear combinations of the original features into a higher-dimensional space where the classes become linearly separable; a separating hyperplane is found there, and mapping it back to the original feature space yields a nonlinear decision boundary.
Ex) \(\phi(x_1,x_2) = (z_1,z_2,z_3) = (x_1,x_2,x_1^2+x_2^2)\)
Decision regions can thus be obtained through a '2D -> 3D -> 2D' projection process, as the sketch below illustrates.
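As a minimal sketch of this idea (using hypothetical concentric-ring data for illustration), applying the mapping above makes two radially separated classes linearly separable along the new \(z_3\) axis:

import numpy as np

# two concentric rings: not linearly separable in the original 2D space
rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 100)
inner = np.c_[0.5 * np.cos(angles[:50]), 0.5 * np.sin(angles[:50])]   # class 0, radius 0.5
outer = np.c_[2.0 * np.cos(angles[50:]), 2.0 * np.sin(angles[50:])]   # class 1, radius 2.0
X = np.vstack([inner, outer])

# phi(x1, x2) = (x1, x2, x1^2 + x2^2)
Z = np.c_[X[:, 0], X[:, 1], X[:, 0] ** 2 + X[:, 1] ** 2]

# along z3 the two rings are separated by a plane, e.g. z3 = 1
print(Z[:50, 2].max(), Z[50:, 2].min())   # ~0.25 vs ~4.0 -> linearly separable in 3D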
However, mapping functions have a drawback: explicitly constructing the new features is computationally expensive, and the cost grows quickly for high-dimensional data. To avoid this cost, we use a kernel function, defined as a dot product between two mapped points.
kernel function : \(K(x^{(i)}, x^{(j)}) = \phi(x^{(i)})^{T}\phi(x^{(j)})\)
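To see why this dot-product form avoids the explicit mapping, here is a minimal sketch with a degree-2 polynomial kernel (chosen for illustration; it is not the \(\phi\) from the example above): the kernel value computed directly in the original 2D space equals the dot product computed in the mapped 3D space.

import numpy as np

# for K(x, z) = (x^T z)^2, the implicit mapping is phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)
def phi(x):
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

explicit = phi(x) @ phi(z)   # dot product in the mapped space
implicit = (x @ z) ** 2      # kernel computed directly in the original space
print(explicit, implicit)    # both 16.0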
One of the most frequently used kernels is the radial basis function (RBF) kernel, also called the Gaussian kernel.
\(K(x^{(i)}, x^{(j)}) = \exp\left(-\frac{||x^{(i)} - x^{(j)}||^2}{2\sigma^2}\right) = \exp(-\gamma||x^{(i)} - x^{(j)}||^2)\)
* \(\gamma =\frac{1}{2\sigma^2}\)
\(\gamma\) is a parameter that controls the width of the Gaussian bell. Increasing \(\gamma\) reduces the reach of each support vector's influence; as a result, the decision boundary hugs the samples more closely and becomes more winding.
The kernel can also be interpreted as a similarity function between two samples. The negative sign turns the distance measure into a similarity score, and the exponential function bounds that score between 0 (very dissimilar samples) and 1 (identical samples).
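A minimal sketch of this similarity interpretation, comparing a by-hand computation with scikit-learn's rbf_kernel helper:

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

gamma = 0.1
a = np.array([[1.0, 2.0]])
b = np.array([[2.0, 4.0]])

# exp(-gamma * squared distance): 1 for identical samples, toward 0 as they move apart
manual = np.exp(-gamma * np.sum((a - b) ** 2))
print(manual)                            # ~0.6065
print(rbf_kernel(a, b, gamma = gamma))   # scikit-learn computes the same value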
Using SVM to solve nonlinear problems
Load libraries
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.svm import SVC
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
Create nonlinear data
np.random.seed(1)
# Create nonlinear dataset
X_xor = np.random.randn(200, 2)
y_xor = np.logical_xor(X_xor[:, 0] > 0,
                       X_xor[:, 1] > 0)
y_xor = np.where(y_xor, 1, -1)
Draw a scatter plot
# scatter plot
plt.scatter(X_xor[y_xor == 1, 0],
            X_xor[y_xor == 1, 1],
            c = 'b', marker = 'x',
            label = '1')
plt.scatter(X_xor[y_xor == -1, 0],
            X_xor[y_xor == -1, 1],
            c = 'r', marker = 's',
            label = '-1')
plt.xlim([-3, 3])
plt.ylim([-3, 3])
plt.legend(loc = 'best')
plt.tight_layout()
plt.show()
Define a decision-regions plotting function
# define function about visualizing decision_regions
def plot_decision_regions(X, y, classifier, test_idx = None, resolution = 0.02):
    # set markers and colormap
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])

    # draw the decision boundary on a grid covering the feature space
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha = 0.3, cmap = cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

    # plot the samples of each class
    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(X[y == cl, 0], X[y == cl, 1],
                    alpha = 0.8, c = colors[idx],   # alpha : transparency of markers
                    marker = markers[idx], label = cl,
                    edgecolor = 'black')

    # highlight test-set samples, if given
    if test_idx:
        X_test, y_test = X[test_idx, :], y[test_idx]
        plt.scatter(X_test[:, 0], X_test[:, 1],
                    c = 'none', edgecolor = 'black', alpha = 1,
                    s = 100, label = 'test set')
Train the model and visualize the decision regions
svm = SVC(kernel = 'rbf', random_state = 1, gamma = 0.1, C = 10)
svm.fit(X_xor, y_xor)
plot_decision_regions(X_xor, y_xor, classifier = svm)
plt.legend(loc = 'upper left')
plt.tight_layout()
plt.show()
We can confirm that the decision regions form a nonlinear boundary that separates the two XOR classes well.
The same RBF-kernel SVM also handles linearly separable data: with a small \(\gamma\) the decision regions stay close to the ones a linear model would produce, as the sketch below illustrates.
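A minimal sketch of this behavior, assuming the standard Iris dataset (which the imports above already provide) and reusing the plot_decision_regions function: a small \(\gamma\) yields soft, near-linear regions, while a very large \(\gamma\) makes them wind tightly around the training samples.

# Sketch: RBF-kernel SVM on (mostly) linearly separable Iris data
iris = datasets.load_iris()
X = iris.data[:, [2, 3]]          # petal length and petal width
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size = 0.3, random_state = 1, stratify = y)

sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)

for gamma in (0.2, 100.0):        # small gamma -> near-linear, large gamma -> winding
    svm = SVC(kernel = 'rbf', random_state = 1, gamma = gamma, C = 1.0)
    svm.fit(X_train_std, y_train)
    plot_decision_regions(X_train_std, y_train, classifier = svm)
    plt.legend(loc = 'upper left')
    plt.tight_layout()
    plt.show()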