Building Support Vector Machine Classifiers on the make_moons Dataset

  • Published: 2019-10-22
  • Difficulty: Fairly hard
  • Category: Classification and prediction, support vector machines
  • Tags: Python, scikit-learn, support vector machine, make_moons

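1. Problem Description

This section uses the SVC classification model to classify the make_moons dataset. Three support vector machines are built, each with a different kernel function: a linear kernel, a polynomial kernel, and a Gaussian (RBF) kernel. In the run shown below, the three classifiers reach different accuracies on the data they were fitted to: 0.87, 0.81, and 0.89 respectively.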
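To make the difference between the three kernels concrete, here is a minimal NumPy sketch of the kernel functions themselves. The helper names are illustrative and not part of scikit-learn. The parameter names `gamma`, `coef0`, and `degree` mirror those of `SVC`, but the default values used here are chosen for the example only and are not necessarily the library's defaults.

```python
import numpy as np

def linear_kernel(a, b):
    """Linear kernel: the plain inner product <a, b>."""
    return float(np.dot(a, b))

def poly_kernel(a, b, gamma=1.0, coef0=0.0, degree=3):
    """Polynomial kernel: (gamma * <a, b> + coef0) ** degree."""
    return float((gamma * np.dot(a, b) + coef0) ** degree)

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * ||a - b||^2)."""
    diff = np.asarray(a) - np.asarray(b)
    return float(np.exp(-gamma * np.dot(diff, diff)))

# Two illustrative 2-D points (hypothetical values, not taken from make_moons)
a = np.array([1.0, 2.0])
b = np.array([2.0, 0.0])

print(linear_kernel(a, b))  # 2.0
print(poly_kernel(a, b))    # 8.0
print(rbf_kernel(a, a))     # 1.0, since the distance from a point to itself is 0
```

The linear kernel can only produce a straight decision boundary in the input space, while the polynomial and Gaussian kernels allow curved boundaries, which is why their results on the two interleaved half-moons can differ.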

2. Implementation

In [1]:
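# Generate the make_moons data and split it into training and test sets
import sklearn.datasets
from sklearn.model_selection import train_test_split
X, y = sklearn.datasets.make_moons(100, noise=0.3)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Three support vector machines with different kernel functions
from sklearn import svm
clf_linear = svm.SVC(kernel='linear')
clf_poly = svm.SVC(kernel='poly')
clf_rbf = svm.SVC(kernel='rbf')
# Evaluation. Note: the models are fitted and scored on the full data set (X, y),
# so the printed values are training accuracies; the split above is not used here.
clf_linear.fit(X, y)
clf_poly.fit(X, y)
clf_rbf.fit(X, y)
print(clf_linear.score(X, y))  # accuracy with the linear kernel
print(clf_poly.score(X, y))  # accuracy with the polynomial kernel
print(clf_rbf.score(X, y))  # accuracy with the Gaussian (RBF) kernel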
0.87
0.81
0.89
D:\software\Anaconda3\lib\site-packages\sklearn\svm\base.py:193: FutureWarning: The default value of gamma will change from 'auto' to 'scale' in version 0.22 to account better for unscaled features. Set gamma explicitly to 'auto' or 'scale' to avoid this warning.
  "avoid this warning.", FutureWarning)
In [3]:
# The predictions of each kernel's model can also be plotted
# Define the plotting function
import numpy as np
import matplotlib.pyplot as plt
def plot_hyperplane(clf, X, y, h=0.02, draw_sv=True, title='hyperplane'):
    # Build a grid with step h that covers the data plus a margin of 1
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    plt.title(title)
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())
    plt.xticks(())
    plt.yticks(())
    # Predict a class for every grid point and shade the decision regions
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, cmap='hot', alpha=0.5)
    # Plot the samples with one marker and one color per class
    markers = ['o', 's', '^']
    colors = ['b', 'r', 'c']
    labels = np.unique(y)
    for label in labels:
        plt.scatter(X[y==label][:, 0],
                    X[y==label][:, 1],
                    c=colors[label],
                    marker=markers[label], s=20)
    if draw_sv:
        # Mark the support vectors with black crosses
        sv = clf.support_vectors_
        plt.scatter(sv[:, 0], sv[:, 1], c='black', marker='x', s=15)
# Draw the plot
plt.figure(figsize=(5,4), dpi=120)
plot_hyperplane(clf_linear, X, y, h=0.05, title='linear kernel')  # clf_linear can be replaced by clf_poly or clf_rbf (change the title to match)
plt.show()
plt.show()
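
The first cell creates a train/test split but reports accuracy on the same data the models were fitted to. As a complement, the following sketch fits each kernel on the training part only and reports training accuracy, held-out test accuracy, and the number of support vectors (the points marked with crosses in the plot). The `random_state=0` passed to `make_moons` and the explicit `gamma="scale"` are assumptions added here for reproducibility and to avoid the gamma FutureWarning; they are not part of the original run, so the numbers will not match the output above.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# random_state=0 is an assumption added here so that reruns give the same data.
X, y = make_moons(n_samples=100, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

results = {}
for kernel in ("linear", "poly", "rbf"):
    # gamma is set explicitly, which avoids the FutureWarning about its default.
    clf = SVC(kernel=kernel, gamma="scale")
    clf.fit(X_train, y_train)  # fit on the training part only
    results[kernel] = {
        "train_acc": clf.score(X_train, y_train),
        "test_acc": clf.score(X_test, y_test),
        # support_vectors_ has one row per support vector
        "n_sv": clf.support_vectors_.shape[0],
    }

for kernel, r in results.items():
    print(f"{kernel:6s} train={r['train_acc']:.2f} "
          f"test={r['test_acc']:.2f} support_vectors={r['n_sv']}")
```

With only 30 test points, the test accuracy changes in steps of about 0.03, so small differences between kernels should be read with caution.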