2019-03-04 Views: 828
A question about sklearn

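With sklearn's predict_proba() method, we usually look only at the class with the highest probability. How can I output the probabilities of the top n classes (n > 1)?

For example, if the output of predict_proba() looks like this, how do I return the 2 highest probabilities and their associated classes?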

result_prob = clf.predict_proba(X_test)

Returns:

array([

2.55420153e-02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,

3.41739673e-02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,

0.00000000e+00, 0.00000000e+00, 2.11688875e-05, 0.00000000e+00,

0.00000000e+00, 0.00000000e+00, 8.02579585e-01, 0.00000000e+00,

0.00000000e+00, 1.37978949e-02, 0.00000000e+00, 0.00000000e+00,

0.00000000e+00, 1.15640553e-02, 0.00000000e+00, 0.00000000e+00,

0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 6.76391638e-02,

9.06030431e-03, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,

0.00000000e+00, 0.00000000e+00, 3.56218448e-02, 0.00000000e+00,

0.00000000e+00, 0.00000000e+00, 0.00000000e+00])

In this case, the classes with probabilities 8.02579585e-01 and 6.76391638e-02 should be returned.

Solution: this is really a NumPy question; you can use np.argpartition:

import numpy as np

x = np.array([

2.55420153e-02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,

3.41739673e-02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,

0.00000000e+00, 0.00000000e+00, 2.11688875e-05, 0.00000000e+00,

0.00000000e+00, 0.00000000e+00, 8.02579585e-01, 0.00000000e+00,

0.00000000e+00, 1.37978949e-02, 0.00000000e+00, 0.00000000e+00,

0.00000000e+00, 1.15640553e-02, 0.00000000e+00, 0.00000000e+00,

0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 6.76391638e-02,

9.06030431e-03, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,

0.00000000e+00, 0.00000000e+00, 3.56218448e-02, 0.00000000e+00,

0.00000000e+00, 0.00000000e+00, 0.00000000e+00])

k = 2 # top-k

ind = np.argpartition(x, -k)[-k:]

x[ind]

Result:

array([0.06763916, 0.80257959])

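As requested, the indices of the corresponding classes are in ind. These are column positions in the predict_proba() output; the actual class labels are clf.classes_[ind]: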

ind

# array([27, 14])
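Note that np.argpartition does not guarantee any order among the k entries it returns, and that predict_proba() normally returns one row per sample rather than a single 1-D array. A minimal sketch that handles both points might look like the following. The helper name top_k_classes and the toy data are hypothetical, made up for illustration; with a fitted sklearn classifier you would pass clf.predict_proba(X_test) and clf.classes_ instead.

```python
import numpy as np

def top_k_classes(proba, classes, k=2):
    """Return (labels, probabilities) of the k most probable classes per row,
    ordered from highest to lowest probability. Hypothetical helper."""
    proba = np.atleast_2d(proba)      # also accept a single 1-D row
    classes = np.asarray(classes)
    # argpartition moves the k largest entries of each row into the last
    # k columns, but in no particular order
    part = np.argpartition(proba, -k, axis=1)[:, -k:]
    # sort just those k candidates by descending probability
    part_proba = np.take_along_axis(proba, part, axis=1)
    order = np.argsort(-part_proba, axis=1)
    top_idx = np.take_along_axis(part, order, axis=1)
    top_proba = np.take_along_axis(proba, top_idx, axis=1)
    # map column indices to class labels
    return classes[top_idx], top_proba

# Toy data standing in for clf.predict_proba(X_test) and clf.classes_
proba = np.array([[0.10, 0.70, 0.20],
                  [0.50, 0.05, 0.45]])
classes = np.array(["cat", "dog", "fish"])

labels, probs = top_k_classes(proba, classes, k=2)
print(labels)
print(probs)
```

Partitioning first and sorting only the k candidates avoids a full sort of every row, which matters mainly when the number of classes is large; for a handful of classes a plain np.argsort would do just as well.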
