Visualizing Machine Learning Models: scikit-plot, a Library Built on sklearn and Matplotlib

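scikit-learn (sklearn) is a widely used machine learning library for Python that covers the common classification, regression, and clustering algorithms. Once a model is trained, the usual next step is to visualize it, which normally means writing Matplotlib plotting code by hand.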

scikit-plot is a library built on sklearn and Matplotlib. Its main job is to visualize trained models, and its API is small and easy to pick up.

https://scikit-plot.readthedocs.io

pip install scikit-plot
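
The snippets in this article use X, y, X_train, X_test, y_train, and y_test without defining them. Here is a minimal sketch of one possible setup. It assumes the Iris dataset, which the article does not name, although the feature names in the feature-importance snippet hint at it. The split ratio and random seed are arbitrary choices. Note that a KS statistic compares two classes, so the KS snippet needs a binary target instead.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: Iris stands in for whatever data the article actually used.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for evaluation; stratify keeps the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

print(X_train.shape, X_test.shape)
```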

Feature 1: Visualizing evaluation metrics

scikitplot.metrics.plot_confusion_matrix plots the confusion matrix computed from the true labels and the model's predictions.

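import matplotlib.pyplot as plt
import scikitplot as skplt
from sklearn.ensemble import RandomForestClassifier

# X_train, y_train, X_test, y_test: an existing train/test split
rf = RandomForestClassifier()
rf = rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)

skplt.metrics.plot_confusion_matrix(y_test, y_pred, normalize=True)
plt.show()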
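
To make the normalized matrix concrete, the sketch below computes a row-normalized confusion matrix directly with scikit-learn and NumPy. It does not call scikit-plot. The labels are made up for illustration, and dividing each row by its total is an assumption about what normalize=True displays.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels and predictions for a three-class problem.
y_true = [0, 0, 1, 1, 1, 2]
y_hat = [0, 1, 1, 1, 0, 2]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_hat)

# Divide every row by its total so each row sums to 1.
cm_normalized = cm / cm.sum(axis=1, keepdims=True)

print(cm)
print(cm_normalized.round(2))
```

Each row of the normalized matrix then reads as "of the samples that truly belong to this class, which fraction was predicted as each class".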

scikitplot.metrics.plot_roc plots the ROC curve of each class from the model's predicted probabilities.

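import matplotlib.pyplot as plt
import scikitplot as skplt
from sklearn.naive_bayes import GaussianNB

nb = GaussianNB()
nb = nb.fit(X_train, y_train)
y_probas = nb.predict_proba(X_test)  # shape (n_samples, n_classes)

skplt.metrics.plot_roc(y_test, y_probas)
plt.show()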
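
A per-class ROC curve treats one class as positive and all others as negative. The sketch below shows that one-vs-rest idea with scikit-learn's roc_curve and auc. The helper per_class_auc and the toy probabilities are hypothetical, and this is not scikit-plot's own code.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve


def per_class_auc(y_true, y_probas):
    """Return {class_label: AUC}, treating each class as positive in turn."""
    y_true = np.asarray(y_true)
    y_probas = np.asarray(y_probas)
    classes = np.unique(y_true)  # sorted, like the columns of predict_proba
    result = {}
    for column, label in enumerate(classes):
        is_positive = (y_true == label).astype(int)
        fpr, tpr, _ = roc_curve(is_positive, y_probas[:, column])
        result[int(label)] = auc(fpr, tpr)
    return result


# Hypothetical two-class probabilities, one row per sample.
y_true = [0, 0, 1, 1]
y_probas = [[0.9, 0.1], [0.6, 0.4], [0.65, 0.35], [0.2, 0.8]]
aucs = per_class_auc(y_true, y_probas)
print(aucs)
```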

scikitplot.metrics.plot_ks_statistic generates a KS statistic plot from labels and scores/probabilities.

import matplotlib.pyplot as plt
import scikitplot as skplt
from sklearn.linear_model import LogisticRegression

lr = LogisticRegression()
lr = lr.fit(X_train, y_train)
y_probas = lr.predict_proba(X_test)

skplt.metrics.plot_ks_statistic(y_test, y_probas)
plt.show()
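
A KS plot compares the cumulative score distributions of the two classes, and the KS statistic is the largest vertical gap between them. The NumPy sketch below computes that gap by hand. The function name ks_statistic and the toy scores are hypothetical, and the code assumes binary labels encoded as 0 and 1.

```python
import numpy as np


def ks_statistic(y_true, scores):
    """Largest gap between the empirical score CDFs of class 0 and class 1."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    thresholds = np.unique(scores)
    scores_0 = scores[y_true == 0]
    scores_1 = scores[y_true == 1]
    # Fraction of each class scoring at or below every threshold.
    cdf_0 = np.array([(scores_0 <= t).mean() for t in thresholds])
    cdf_1 = np.array([(scores_1 <= t).mean() for t in thresholds])
    return float(np.max(np.abs(cdf_0 - cdf_1)))


# Perfectly separated scores versus identical score distributions.
separated = ks_statistic([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
overlapping = ks_statistic([0, 1, 0, 1], [0.1, 0.1, 0.9, 0.9])
print(separated, overlapping)
```

A value near 1 means the scores separate the classes almost perfectly, while a value near 0 means the two score distributions are nearly indistinguishable.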

scikitplot.metrics.plot_precision_recall generates precision-recall (PR) curves from labels and probabilities.

import matplotlib.pyplot as plt
import scikitplot as skplt
from sklearn.naive_bayes import GaussianNB

nb = GaussianNB()
nb.fit(X_train, y_train)
y_probas = nb.predict_proba(X_test)

skplt.metrics.plot_precision_recall(y_test, y_probas)
plt.show()
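
For reference, the points on a PR curve and its summary score can be obtained from scikit-learn directly. This is a small sketch on made-up binary scores, not a reproduction of scikit-plot's internals.

```python
from sklearn.metrics import average_precision_score, precision_recall_curve

# Hypothetical binary labels and positive-class scores.
y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]

# One (precision, recall) pair per threshold, plus a final (1, 0) end point.
precision, recall, thresholds = precision_recall_curve(y_true, y_scores)

# Average precision summarizes the curve as a single number.
ap = average_precision_score(y_true, y_scores)

print(precision, recall, thresholds)
print(round(ap, 4))
```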

scikitplot.metrics.plot_silhouette performs silhouette analysis on clustering results.

import matplotlib.pyplot as plt
import scikitplot as skplt
from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=4, random_state=1)
cluster_labels = kmeans.fit_predict(X)

skplt.metrics.plot_silhouette(X, cluster_labels)
plt.show()
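
Silhouette analysis gives every sample a coefficient between -1 and 1, and the overall score is the mean of those coefficients. The sketch below computes both with scikit-learn. The synthetic blobs from make_blobs and the explicit n_init value are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

# Assumption: synthetic 2-D blobs stand in for the article's X.
X_demo, _ = make_blobs(n_samples=300, centers=4, random_state=1)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=1)
cluster_labels = kmeans.fit_predict(X_demo)

# One coefficient per sample; higher means a better fit to its own cluster.
per_sample = silhouette_samples(X_demo, cluster_labels)
overall = silhouette_score(X_demo, cluster_labels)

for k in np.unique(cluster_labels):
    print(k, round(float(per_sample[cluster_labels == k].mean()), 3))
print("overall:", round(float(overall), 3))
```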

scikitplot.metrics.plot_calibration_curve plots calibration curves for classifiers.
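
As background, a calibration curve bins the predicted probabilities and compares each bin's mean prediction with the fraction of positives actually observed in it. The sketch below uses scikit-learn's calibration_curve on synthetic data that is calibrated by construction. It also rescales unbounded decision scores into [0, 1] with min-max scaling, which is an assumed way of putting scores such as those from decision_function on the same axis as probabilities.

```python
import numpy as np
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(0)

# Synthetic, well-calibrated predictions: P(y = 1) equals the predicted value.
y_prob = rng.uniform(0, 1, size=2000)
y_true = (rng.uniform(0, 1, size=2000) < y_prob).astype(int)

# Fraction of positives and mean predicted probability in each of 10 bins.
frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=10)
for p, f in zip(mean_pred, frac_pos):
    print(f"predicted {p:.2f} -> observed {f:.2f}")

# Hypothetical unbounded scores, rescaled to [0, 1] before binning.
raw_scores = rng.normal(size=100)
scaled = (raw_scores - raw_scores.min()) / (raw_scores.max() - raw_scores.min())
```

For a well-calibrated classifier the printed pairs stay close to each other, which corresponds to a curve near the diagonal.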

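import matplotlib.pyplot as plt
import scikitplot as skplt
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC

rf = RandomForestClassifier()
lr = LogisticRegression()
nb = GaussianNB()
svm = LinearSVC()
rf_probas = rf.fit(X_train, y_train).predict_proba(X_test)
lr_probas = lr.fit(X_train, y_train).predict_proba(X_test)
nb_probas = nb.fit(X_train, y_train).predict_proba(X_test)
# LinearSVC has no predict_proba, so its decision scores are used instead
svm_scores = svm.fit(X_train, y_train).decision_function(X_test)
probas_list = [rf_probas, lr_probas, nb_probas, svm_scores]
clf_names = ['Random Forest', 'Logistic Regression',
             'Gaussian Naive Bayes', 'Support Vector Machine']

skplt.metrics.plot_calibration_curve(y_test, probas_list, clf_names)
plt.show()

Feature 2: Visualizing estimators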

scikitplot.estimators.plot_learning_curve plots the training and test learning curves for different numbers of training samples.

import matplotlib.pyplot as plt
import scikitplot as skplt
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier()

skplt.estimators.plot_learning_curve(rf, X, y)
plt.show()
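
The numbers behind a learning curve can also be computed with scikit-learn's learning_curve. The sketch below assumes the Iris dataset, 5-fold cross-validation, and five training-set sizes. These are illustrative choices, not settings taken from the article.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import learning_curve

# Assumption: Iris stands in for the article's X and y.
X_demo, y_demo = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=20, random_state=0)

# Refit the model on growing subsets and score it on each CV fold.
train_sizes, train_scores, test_scores = learning_curve(
    rf, X_demo, y_demo,
    cv=5,
    train_sizes=np.linspace(0.2, 1.0, 5),
    shuffle=True,       # Iris is sorted by class, so shuffle before subsetting
    random_state=0,
)

for n, tr, te in zip(train_sizes, train_scores.mean(axis=1), test_scores.mean(axis=1)):
    print(f"{int(n):4d} samples: train={tr:.3f}, cv={te:.3f}")
```

A large gap between the training and cross-validation scores that shrinks as samples are added suggests that more data would help.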

scikitplot.estimators.plot_feature_importances visualizes feature importances.
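
The values such a chart draws come from the fitted model's feature_importances_ attribute. The sketch below ranks them by hand, assuming Iris and a seeded random forest. One thing worth checking: names passed as feature_names are matched to columns by position, and load_iris orders its columns as sepal length, sepal width, petal length, petal width.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Assumption: Iris in its standard column order.
iris = load_iris()
rf = RandomForestClassifier(n_estimators=50, random_state=0)
rf.fit(iris.data, iris.target)

importances = rf.feature_importances_      # one value per column, summing to 1
order = np.argsort(importances)[::-1]      # column indices, most important first
ranked = [(iris.feature_names[i], float(importances[i])) for i in order]

for name, value in ranked:
    print(f"{name:20s} {value:.3f}")
```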

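import matplotlib.pyplot as plt
import scikitplot as skplt
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier()
rf.fit(X, y)

# feature_names must follow the column order of X
skplt.estimators.plot_feature_importances(
    rf, feature_names=['petal length', 'petal width',
                       'sepal length', 'sepal width'])
plt.show()

Feature 3: Visualizing clustering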

scikitplot.cluster.plot_elbow_curve draws the elbow plot for a clustering model.
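
An elbow plot tracks the clustering error as the number of clusters grows. The "elbow" is the point after which adding clusters stops paying off. The sketch below computes that curve with plain scikit-learn, using KMeans inertia (the within-cluster sum of squared distances) as the error. The synthetic three-blob data and the range of k values are assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Assumption: three synthetic blobs stand in for the article's X.
X_demo, _ = make_blobs(n_samples=300, centers=3, random_state=1)

cluster_range = range(1, 8)
inertias = []
for k in cluster_range:
    model = KMeans(n_clusters=k, n_init=10, random_state=1).fit(X_demo)
    inertias.append(model.inertia_)  # within-cluster sum of squared distances

for k, inertia in zip(cluster_range, inertias):
    print(k, round(inertia, 1))
```

With data like this, the error typically drops sharply up to the true number of blobs and flattens afterwards.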

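import matplotlib.pyplot as plt
import scikitplot as skplt
from sklearn.cluster import KMeans

kmeans = KMeans(random_state=1)

skplt.cluster.plot_elbow_curve(kmeans, X, cluster_ranges=range(1, 30))
plt.show()

Feature 4: Visualizing dimensionality reduction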

scikitplot.decomposition.plot_pca_component_variance plots the explained variance ratio of the PCA components.

import matplotlib.pyplot as plt
import scikitplot as skplt
from sklearn.decomposition import PCA

pca = PCA(random_state=1)
pca.fit(X)

skplt.decomposition.plot_pca_component_variance(pca)
plt.show()
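
The quantity being plotted is available on the fitted PCA object as explained_variance_ratio_. The sketch below accumulates it to find how many components reach a chosen share of the variance. It assumes Iris, and the 0.75 target is a hypothetical threshold picked for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Assumption: Iris stands in for the article's X.
X_demo, _ = load_iris(return_X_y=True)
pca = PCA(random_state=1).fit(X_demo)

ratios = pca.explained_variance_ratio_   # share of variance per component
cumulative = np.cumsum(ratios)           # running total, ends at 1.0

target = 0.75  # hypothetical threshold for "enough" explained variance
# Index of the first cumulative value >= target, converted to a count.
n_needed = int(np.searchsorted(cumulative, target) + 1)

print(ratios.round(4))
print(f"{n_needed} component(s) reach {target:.0%} of the variance")
```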

scikitplot.decomposition.plot_pca_2d_projection draws a 2D scatter plot of the data after PCA dimensionality reduction.

import matplotlib.pyplot as plt
import scikitplot as skplt
from sklearn.decomposition import PCA

pca = PCA(random_state=1)
pca.fit(X)

skplt.decomposition.plot_pca_2d_projection(pca, X, y)
plt.show()
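
The coordinates in such a scatter plot are the data projected onto the first two principal components. The sketch below computes them with scikit-learn alone, assuming Iris, and summarizes each class by its centroid in the projected plane instead of drawing a figure.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Assumption: Iris stands in for the article's X and y.
X_demo, y_demo = load_iris(return_X_y=True)

# Keep only the first two principal components.
X_2d = PCA(n_components=2, random_state=1).fit_transform(X_demo)

# Mean position of each class in the 2-D projection.
centroids = {
    int(label): X_2d[y_demo == label].mean(axis=0)
    for label in np.unique(y_demo)
}

for label, center in centroids.items():
    print(label, center.round(2))
```

Well-separated centroids suggest that the classes will show up as distinct groups in the scatter plot.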
