ermutuxia

2021-04-21   Views: 1458

Scikit-learn Python

How to compute a confusion matrix with the sklearn library


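When we evaluate the predictions of a binary classification model, we usually need to look at the confusion matrix.

So how do we compute a confusion matrix in Python with the sklearn library?

Once we know the predicted values and the actual values of the binary target variable y, we can compute the confusion matrix. Here we simply make up a few data points for the demonstration.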

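# Import the metrics submodule explicitly; a bare "import sklearn"
# may not make sklearn.metrics available.
import sklearn.metrics

Y_real = [1, 0, 1, 1, 1, 0, 0, 0, 0, 0]      # actual labels
Y_predict = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1]   # predicted labels

# How to compute the confusion matrix
confusion_matrix_1 = sklearn.metrics.confusion_matrix(Y_real, Y_predict)
print("Confusion matrix:", confusion_matrix_1, sep="\n")

# How to get the classification report
r_1 = sklearn.metrics.classification_report(Y_real, Y_predict)
print("Classification report:", r_1, sep="\n")

The output is as follows:

Confusion matrix:
[[4 2]
 [3 1]]
Classification report:
              precision    recall  f1-score   support

           0       0.57      0.67      0.62         6
           1       0.33      0.25      0.29         4

    accuracy                           0.50        10
   macro avg       0.45      0.46      0.45        10
weighted avg       0.48      0.50      0.48        10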
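As a cross-check, the numbers in the report can be recomputed by hand from the four cells of the matrix. The following is only a minimal sketch on the same made-up data; the variable names tn, fp, fn, tp, precision_1, recall_1 and accuracy are chosen here for illustration and are not part of sklearn.

```python
from sklearn.metrics import confusion_matrix

Y_real = [1, 0, 1, 1, 1, 0, 0, 0, 0, 0]
Y_predict = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1]

# For binary labels 0/1 the matrix is laid out as [[TN, FP], [FN, TP]],
# so flattening it row by row yields the four counts in that order.
tn, fp, fn, tp = (int(v) for v in confusion_matrix(Y_real, Y_predict).ravel())

precision_1 = tp / (tp + fp)                 # share of predicted 1s that are really 1
recall_1 = tp / (tp + fn)                    # share of real 1s that were found
accuracy = (tp + tn) / (tn + fp + fn + tp)   # share of all samples predicted correctly

print(tn, fp, fn, tp)
print(round(precision_1, 2), round(recall_1, 2), round(accuracy, 2))
```

The rounded values should line up with the row for class 1 and the accuracy row of the classification report.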






You can also take a look at the help text of the confusion matrix function:

In [11]: help(sklearn.metrics.confusion_matrix)
Help on function confusion_matrix in module sklearn.metrics._classification:

confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None, normalize=None)
    Compute confusion matrix to evaluate the accuracy of a classification.

    By definition a confusion matrix :math:`C` is such that :math:`C_{i, j}`
    is equal to the number of observations known to be in group :math:`i` and
    predicted to be in group :math:`j`.

    Thus in binary classification, the count of true negatives is
    :math:`C_{0,0}`, false negatives is :math:`C_{1,0}`, true positives is
    :math:`C_{1,1}` and false positives is :math:`C_{0,1}`.

    Read more in the :ref:`User Guide <confusion_matrix>`.

    Parameters
    ----------
    y_true : array-like of shape (n_samples,)
        Ground truth (correct) target values.

    y_pred : array-like of shape (n_samples,)
        Estimated targets as returned by a classifier.

    labels : array-like of shape (n_classes), default=None
        List of labels to index the matrix. This may be used to reorder
        or select a subset of labels.
        If ``None`` is given, those that appear at least once
        in ``y_true`` or ``y_pred`` are used in sorted order.

    sample_weight : array-like of shape (n_samples,), default=None
        Sample weights.

        .. versionadded:: 0.18

    normalize : {'true', 'pred', 'all'}, default=None
        Normalizes confusion matrix over the true (rows), predicted (columns)
        conditions or all the population. If None, confusion matrix will not be
        normalized.

    Returns
    -------
    C : ndarray of shape (n_classes, n_classes)
        Confusion matrix whose i-th row and j-th
        column entry indicates the number of
        samples with true label being i-th class
        and predicted label being j-th class.
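To see what the labels and normalize parameters described in the help text do, here is a small sketch on the same made-up data. The variable names cm_reordered and cm_row_norm are chosen here for illustration only, and it assumes a scikit-learn version whose confusion_matrix accepts normalize, as the signature above does.

```python
from sklearn.metrics import confusion_matrix

Y_real = [1, 0, 1, 1, 1, 0, 0, 0, 0, 0]
Y_predict = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1]

# labels=[1, 0] puts class 1 first, so the layout becomes [[TP, FN], [FP, TN]].
cm_reordered = confusion_matrix(Y_real, Y_predict, labels=[1, 0])

# normalize="true" divides every row by its row total,
# so the diagonal then holds the recall of each class.
cm_row_norm = confusion_matrix(Y_real, Y_predict, normalize="true")

print(cm_reordered)          # counts: 1 and 3 in the first row, 2 and 4 in the second
print(cm_row_norm.round(2))  # diagonal: 0.67 for class 0, 0.25 for class 1
```

Reordering is mostly a matter of taste, but it helps when a textbook or a colleague expects the positive class in the top-left corner.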

