Scikit-learn Model Evaluation Methods

Published: 2020-01-23 Category: Machine Learning

Accuracy

1.accuracy_score

# Accuracy
import numpy as np
from sklearn.metrics import accuracy_score
y_pred = [0, 2, 1, 3, 9, 9, 8, 5, 8]
y_true = [0, 1, 2, 3, 2, 6, 3, 5, 9]

accuracy_score(y_true, y_pred)
Out[127]: 0.33333333333333331

accuracy_score(y_true, y_pred, normalize=False)  # normalize=False returns the number of correctly classified samples instead of the fraction
Out[128]: 3
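
As a sanity check, accuracy is just the fraction of positions where the prediction equals the label. The following is a minimal sketch with NumPy that reuses the lists above; the names `n_correct` and `manual_accuracy` are illustrative and not part of scikit-learn.

```python
import numpy as np
from sklearn.metrics import accuracy_score

y_pred = [0, 2, 1, 3, 9, 9, 8, 5, 8]
y_true = [0, 1, 2, 3, 2, 6, 3, 5, 9]

# Count the positions where prediction and label agree.
n_correct = int(np.sum(np.array(y_true) == np.array(y_pred)))
manual_accuracy = n_correct / len(y_true)

print(n_correct)        # 3
print(manual_accuracy)  # 3 / 9

# Both values should agree with accuracy_score.
assert n_correct == accuracy_score(y_true, y_pred, normalize=False)
assert np.isclose(manual_accuracy, accuracy_score(y_true, y_pred))
```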

2.metrics

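  • Macro-averaging is often the more reasonable choice, but micro-averaging is not useless; which one to use depends on how the samples are distributed across the classes in the dataset.
  • Macro-averaging computes the metric for each class separately and then takes the arithmetic mean over all classes, so every class counts equally regardless of its size.
  • Micro-averaging pools every instance in the dataset, regardless of class, into one global confusion matrix and then computes the metric from it, so frequent classes dominate the result.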
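
To make the difference concrete, here is a small sketch on made-up imbalanced data (the labels below are hypothetical and chosen only for illustration). It uses recall, because the effect of a rare class is easy to see there: the rare class pulls the macro average down much more than the micro average.

```python
import numpy as np
from sklearn.metrics import recall_score

# Hypothetical imbalanced data: class 0 is frequent, class 1 is rare.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1]

per_class = recall_score(y_true, y_pred, average=None)  # one value per class
macro = recall_score(y_true, y_pred, average='macro')
micro = recall_score(y_true, y_pred, average='micro')

print(per_class)  # recall of class 0 is 1.0, recall of class 1 is 0.5
print(macro)      # 0.75
print(micro)      # 0.875

# Macro is the unweighted mean of the per-class values.
assert np.isclose(macro, per_class.mean())
# For single-label data, micro-averaged recall equals the overall hit rate.
assert np.isclose(micro, np.mean(np.array(y_true) == np.array(y_pred)))
```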
from sklearn import metrics
metrics.precision_score(y_true, y_pred, average='micro')  # micro-averaged precision
Out[130]: 0.33333333333333331

metrics.precision_score(y_true, y_pred, average='macro')  # macro-averaged precision
Out[131]: 0.375

metrics.precision_score(y_true, y_pred, labels=[0, 1, 2, 3], average='macro')  # macro-averaged precision restricted to the given labels
Out[133]: 0.5
  • The average parameter accepts six values: None, 'binary' (the default), 'micro', 'macro', 'weighted', and 'samples'.

Recall

metrics.recall_score(y_true, y_pred, average='micro')
Out[134]: 0.33333333333333331

metrics.recall_score(y_true, y_pred, average='macro')
Out[135]: 0.3125

F1

metrics.f1_score(y_true, y_pred, average='weighted')
Out[136]: 0.37037037037037035

F2

Computed directly from the formula F2 = 5PR / (4P + R), where P is precision and R is recall:

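from sklearn.metrics import precision_score, recall_score

def calc_f2(label, predict):
    # precision_score and recall_score default to average='binary',
    # so this helper expects binary labels.
    p = precision_score(label, predict)
    r = recall_score(label, predict)
    if p == 0 and r == 0:
        return 0.0  # avoid dividing by zero when there are no true positives
    # F-beta with beta = 2: (1 + 2^2) * p * r / (2^2 * p + r)
    return 5 * p * r / (4 * p + r)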
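
scikit-learn also provides `fbeta_score`, which should give the same number as the hand-written formula when called with `beta=2`. A minimal sketch, assuming binary labels; the data below is made up for illustration.

```python
from sklearn.metrics import fbeta_score, precision_score, recall_score

# Hypothetical binary labels, used only for illustration.
label = [1, 0, 1, 1, 0, 1]
predict = [1, 0, 0, 1, 0, 1]

p = precision_score(label, predict)  # 3 true positives, 0 false positives
r = recall_score(label, predict)     # 3 true positives, 1 false negative
manual_f2 = 5 * p * r / (4 * p + r)
library_f2 = fbeta_score(label, predict, beta=2)

print(manual_f2, library_f2)
assert abs(manual_f2 - library_f2) < 1e-9
```

Because beta = 2 weights recall more heavily than precision, F2 is a natural choice when missing a positive is more costly than raising a false alarm.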

Confusion matrix

from sklearn.metrics import confusion_matrix
confusion_matrix(y_true, y_pred)

Out[137]:
array([[1, 0, 0, ..., 0, 0, 0],
[0, 0, 1, ..., 0, 0, 0],
[0, 1, 0, ..., 0, 0, 1],
...,
[0, 0, 0, ..., 0, 0, 1],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 1, 0]])
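
A smaller example makes the layout easier to read: rows correspond to true labels and columns to predicted labels. The sketch below uses a hypothetical three-class dataset and passes `labels` explicitly so the row and column order is fixed.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]

# Fixing `labels` pins the order: rows = true class, columns = predicted class.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
print(cm)
# [[2 0 0]
#  [0 0 1]
#  [1 0 2]]

# The diagonal counts correct predictions; dividing by the row sums
# gives the recall of each class.
per_class_recall = np.diag(cm) / cm.sum(axis=1)
print(per_class_recall)
```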

Classification report

Includes precision, recall, F1-score, averages, and support (the number of samples per class).

# Classification report: precision / recall / F1-score / averages / support
from sklearn.metrics import classification_report
y_true = [0, 1, 2, 2, 0]
y_pred = [0, 0, 2, 2, 0]
target_names = ['class 0', 'class 1', 'class 2']
print(classification_report(y_true, y_pred, target_names=target_names))

Output

             precision    recall  f1-score   support

    class 0       0.67      1.00      0.80         2
    class 1       0.00      0.00      0.00         1
    class 2       1.00      1.00      1.00         2

avg / total       0.67      0.80      0.72         5
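
When the numbers are needed programmatically rather than as printed text, the report can be returned as a dictionary. This is a sketch that assumes a scikit-learn version recent enough to support the `output_dict` and `zero_division` arguments; in such versions the summary rows are named 'macro avg' and 'weighted avg' instead of 'avg / total'.

```python
from sklearn.metrics import classification_report

y_true = [0, 1, 2, 2, 0]
y_pred = [0, 0, 2, 2, 0]
target_names = ['class 0', 'class 1', 'class 2']

# output_dict=True returns nested dicts keyed by class name.
# zero_division=0 reports 0.0 for class 1, which is never predicted,
# instead of emitting a warning.
report = classification_report(
    y_true, y_pred, target_names=target_names,
    output_dict=True, zero_division=0,
)

print(report['class 0']['recall'])         # 1.0
print(report['weighted avg']['f1-score'])  # support-weighted mean of the per-class F1
```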

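Kappa score

  • The kappa score is a number between -1 and 1. A score above 0.8 is generally considered good agreement; a score of 0 or lower means no agreement (effectively random labels).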
from sklearn.metrics import cohen_kappa_score
y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]
cohen_kappa_score(y_true, y_pred)
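
Cohen's kappa compares the observed agreement p_o with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below recomputes it from the confusion matrix for the same data; the names `p_o`, `p_e`, and `manual_kappa` are illustrative.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred)
n = cm.sum()

# Observed agreement: fraction of samples on the diagonal.
p_o = np.trace(cm) / n
# Chance agreement: product of the two marginal frequencies, summed over classes.
p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2

manual_kappa = (p_o - p_e) / (1 - p_e)
print(manual_kappa)
assert np.isclose(manual_kappa, cohen_kappa_score(y_true, y_pred))
```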

ROC

1.Computing the area under the ROC curve (AUC)

import numpy as np
from sklearn.metrics import roc_auc_score
y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
roc_auc_score(y_true, y_scores)
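
One way to interpret this value: the AUC is the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. A minimal sketch that checks this by comparing every positive/negative pair (the helper names are illustrative):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

pos = y_scores[y_true == 1]
neg = y_scores[y_true == 0]

# Compare every positive score with every negative score; ties count as half.
diff = pos[:, None] - neg[None, :]
pairwise_auc = ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

print(pairwise_auc)  # 3 of the 4 pairs are ranked correctly -> 0.75
assert np.isclose(pairwise_auc, roc_auc_score(y_true, y_scores))
```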

2.ROC curve

import numpy as np
from sklearn.metrics import roc_curve
y = np.array([1, 1, 2, 2])
scores = np.array([0.1, 0.4, 0.35, 0.8])
fpr, tpr, thresholds = roc_curve(y, scores, pos_label=2)  # treat label 2 as the positive class
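
The returned arrays can be inspected point by point, and the area under the curve can be obtained from them with `sklearn.metrics.auc`. A short sketch on the same data:

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

y = np.array([1, 1, 2, 2])
scores = np.array([0.1, 0.4, 0.35, 0.8])

fpr, tpr, thresholds = roc_curve(y, scores, pos_label=2)

# Each threshold yields one (false positive rate, true positive rate) point.
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th}: fpr={f}, tpr={t}")

# Trapezoidal area under the (fpr, tpr) points.
area = auc(fpr, tpr)
print(area)  # 0.75
```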

Hamming loss

from sklearn.metrics import hamming_loss
y_pred = [1, 2, 3, 4]
y_true = [2, 2, 3, 4]
hamming_loss(y_true, y_pred)
0.25
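
For single-label multiclass data like this, the Hamming loss is simply 1 minus the accuracy. For multilabel indicator arrays it counts the fraction of individual label entries that are wrong. A minimal sketch of both cases (the multilabel arrays are made up for illustration):

```python
import numpy as np
from sklearn.metrics import accuracy_score, hamming_loss

# Multiclass: one of four predictions is wrong.
y_pred = [1, 2, 3, 4]
y_true = [2, 2, 3, 4]
multiclass_loss = hamming_loss(y_true, y_pred)
assert np.isclose(multiclass_loss, 1 - accuracy_score(y_true, y_pred))

# Multilabel: each row is a sample, each column a label (hypothetical data).
Y_true = np.array([[0, 1], [1, 1]])
Y_pred = np.zeros((2, 2))
multilabel_loss = hamming_loss(Y_true, Y_pred)

print(multiclass_loss)  # 0.25
print(multilabel_loss)  # 3 of the 4 label entries differ -> 0.75
```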

Jaccard similarity

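import numpy as np
# Note: jaccard_similarity_score was deprecated in scikit-learn 0.21 and removed
# in later releases in favour of jaccard_score. For multiclass input such as
# this, it returned the same value as accuracy_score.
from sklearn.metrics import jaccard_similarity_score
y_pred = [0, 2, 1, 3, 4]
y_true = [0, 1, 2, 3, 4]
jaccard_similarity_score(y_true, y_pred)
0.6
jaccard_similarity_score(y_true, y_pred, normalize=False)
3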
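
In recent scikit-learn versions the replacement function is `jaccard_score`. It computes the Jaccard index per class (true positives divided by true positives plus false positives plus false negatives) and needs an `average` argument for multiclass data. The following is a sketch, assuming a version that ships `jaccard_score`:

```python
from sklearn.metrics import jaccard_score

y_pred = [0, 2, 1, 3, 4]
y_true = [0, 1, 2, 3, 4]

per_class = jaccard_score(y_true, y_pred, average=None)  # one value per class
macro = jaccard_score(y_true, y_pred, average='macro')   # mean of per-class values
micro = jaccard_score(y_true, y_pred, average='micro')   # pooled TP / (TP + FP + FN)

print(per_class)  # classes 0, 3 and 4 are perfect; classes 1 and 2 are swapped
print(macro)      # 0.6
print(micro)      # 3 / 7
```

Note that these values do not match the old function in general: the old multiclass result was an accuracy, while `jaccard_score` is a genuine set-overlap measure.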

Explained variance score

from sklearn.metrics import explained_variance_score
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
explained_variance_score(y_true, y_pred)

Mean absolute error

from sklearn.metrics import mean_absolute_error
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
mean_absolute_error(y_true, y_pred)

Mean squared error

from sklearn.metrics import mean_squared_error
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
mean_squared_error(y_true, y_pred)

Median absolute error

from sklearn.metrics import median_absolute_error
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
median_absolute_error(y_true, y_pred)
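
The three error metrics above differ only in how they aggregate the absolute errors. A minimal NumPy sketch on the same data that recomputes each by hand and compares it with scikit-learn (the variable names are illustrative):

```python
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             median_absolute_error)

y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])

abs_err = np.abs(y_true - y_pred)  # [0.5, 0.5, 0.0, 1.0]

mae = abs_err.mean()          # mean of the absolute errors
mse = (abs_err ** 2).mean()   # mean of the squared errors
medae = np.median(abs_err)    # median of the absolute errors, robust to outliers

print(mae, mse, medae)  # 0.5 0.375 0.5

assert np.isclose(mae, mean_absolute_error(y_true, y_pred))
assert np.isclose(mse, mean_squared_error(y_true, y_pred))
assert np.isclose(medae, median_absolute_error(y_true, y_pred))
```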

R² score (coefficient of determination)

from sklearn.metrics import r2_score
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
r2_score(y_true, y_pred)
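
R² and the explained variance score are closely related: R² uses the raw sum of squared residuals, while explained variance uses the variance of the residuals and therefore ignores a constant bias in the predictions. A minimal sketch on the same data, with both recomputed by hand (names are illustrative):

```python
import numpy as np
from sklearn.metrics import explained_variance_score, r2_score

y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])
residual = y_true - y_pred

# R^2 = 1 - SS_res / SS_tot
ss_res = np.sum(residual ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
manual_r2 = 1 - ss_res / ss_tot

# Explained variance = 1 - Var(residual) / Var(y_true)
manual_ev = 1 - residual.var() / y_true.var()

print(manual_r2, manual_ev)

assert np.isclose(manual_r2, r2_score(y_true, y_pred))
assert np.isclose(manual_ev, explained_variance_score(y_true, y_pred))
```

The two scores coincide when the residuals have zero mean; here the mean residual is -0.25, so the explained variance comes out slightly higher than R².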


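Author: HeoLis
Original link: https://ishero.net/SKlearn%E6%A8%A1%E5%9E%8B%E8%AF%84%E4%BC%B0%E6%96%B9%E6%B3%95.html
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit the source when reposting!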
