Implementing Logistic Regression in Python

啊啊啊啊啊吖

2018-11-09 Views: 1033

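We need: a sigmoid function, the model body, parameter initialization, gradient-descent-based parameter updates for training, and testing on data with a visualization.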

First, define a sigmoid function:

import numpy as np

def sigmoid(x):
    z = 1 / (1 + np.exp(-x))
    return z
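For inputs with a large magnitude, `np.exp(-x)` in the function above can overflow and emit a runtime warning. A minimal sketch of one common workaround follows; `stable_sigmoid` is a hypothetical helper name, not part of the original post, and it always returns an array of at least one dimension.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that keeps the argument of np.exp non-positive (hypothetical helper)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) <= 1, so the usual formula is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, use the equivalent form exp(x) / (1 + exp(x)).
    ex = np.exp(x[~pos])
    out[~pos] = ex / (1.0 + ex)
    return out

print(stable_sigmoid(np.array([-1000.0, 0.0, 1000.0])))
```

Both branches compute the same mathematical function; only the way the exponential is evaluated differs.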

Define the function that initializes the model parameters:

def initialize_params(dims):
    W = np.zeros((dims, 1))
    b = 0
    return W, b

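Define the main body of the logistic regression model, including the model's forward computation, the loss function, and the gradients of the parameters: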

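def logistic(X, y, W, b):
    # X has shape (num_train, num_feature); y must be a column vector (num_train, 1)
    num_train = X.shape[0]

    # Forward pass: predicted probabilities
    a = sigmoid(np.dot(X, W) + b)
    # Cross-entropy loss averaged over the training samples
    cost = -1/num_train * np.sum(y*np.log(a) + (1-y)*np.log(1-a))

    # Gradients of the loss with respect to W and b
    dW = np.dot(X.T, (a-y))/num_train
    db = np.sum(a-y)/num_train
    cost = np.squeeze(cost)

    return a, cost, dW, db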
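One way to gain confidence in the gradient formula is to compare it with a finite-difference estimate. The sketch below restates the same loss and gradient expressions so that it runs on its own; `cost_and_grads` and `numerical_grad_W` are hypothetical names, and the random data is made up purely for the check.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def cost_and_grads(X, y, W, b):
    """Same loss and gradient formulas as logistic(), restated for a standalone check."""
    n = X.shape[0]
    a = sigmoid(X @ W + b)
    cost = -np.sum(y * np.log(a) + (1 - y) * np.log(1 - a)) / n
    dW = X.T @ (a - y) / n
    db = np.sum(a - y) / n
    return cost, dW, db

def numerical_grad_W(X, y, W, b, eps=1e-6):
    """Central-difference estimate of d(cost)/dW, one weight at a time."""
    grad = np.zeros_like(W)
    for j in range(W.shape[0]):
        step = np.zeros_like(W)
        step[j, 0] = eps
        cost_plus, _, _ = cost_and_grads(X, y, W + step, b)
        cost_minus, _, _ = cost_and_grads(X, y, W - step, b)
        grad[j, 0] = (cost_plus - cost_minus) / (2 * eps)
    return grad

# Made-up data: 20 samples, 3 features, labels as a (20, 1) column vector.
rng = np.random.RandomState(0)
X = rng.randn(20, 3)
y = (rng.rand(20, 1) > 0.5).astype(float)
W = rng.randn(3, 1) * 0.1
b = 0.2

_, dW, _ = cost_and_grads(X, y, W, b)
max_diff = np.max(np.abs(dW - numerical_grad_W(X, y, W, b)))
print("max |analytic - numerical| =", max_diff)
```

If the analytic gradient is right, the printed difference should be tiny compared with the gradient values themselves.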

Define the training process, which updates the parameters by gradient descent:

def logistic_train(X, y, learning_rate, epochs):
    # Initialize the model parameters
    W, b = initialize_params(X.shape[1])
    cost_list = []

    # Training iterations
    for i in range(epochs):
        # Compute the model output, loss, and parameter gradients for this iteration
        a, cost, dW, db = logistic(X, y, W, b)
        # Update the parameters
        W = W - learning_rate * dW
        b = b - learning_rate * db

        # Record and print the loss every 100 iterations
        if i % 100 == 0:
            cost_list.append(cost)
            print('epoch %d cost %f' % (i, cost))

    # Save the parameters
    params = {
        'W': W,
        'b': b
    }
    # Save the gradients from the last iteration
    grads = {
        'dW': dW,
        'db': db
    }
    return cost_list, params, grads
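As a rough illustration of how the training loop behaves, here is a condensed, self-contained version run on toy data. The function name `train`, the toy data, and the hyperparameter values are assumptions chosen for this sketch, not values from the original post. Note that the labels are a column vector of shape (n, 1); with a flat (n,) array, `a - y` would broadcast to an (n, n) matrix and the loss and gradients would be wrong.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def train(X, y, learning_rate, epochs):
    """Condensed gradient-descent loop using the same update rule as logistic_train."""
    W = np.zeros((X.shape[1], 1))
    b = 0.0
    costs = []
    n = X.shape[0]
    for i in range(epochs):
        a = sigmoid(X @ W + b)
        cost = -np.sum(y * np.log(a) + (1 - y) * np.log(1 - a)) / n
        W = W - learning_rate * (X.T @ (a - y)) / n
        b = b - learning_rate * np.sum(a - y) / n
        if i % 100 == 0:
            costs.append(float(cost))
    return costs, W, b

# Toy data (made up for illustration): two overlapping Gaussian blobs.
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(50, 2) - 1.0, rng.randn(50, 2) + 1.0])
y = np.vstack([np.zeros((50, 1)), np.ones((50, 1))])  # shape (100, 1), not (100,)

costs, W, b = train(X, y, learning_rate=0.1, epochs=1000)
print(costs[0], costs[-1])
```

With zero-initialized parameters every prediction starts at 0.5, so the first recorded loss equals ln 2, and it should fall from there as training proceeds.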

Define the prediction function for test data:

def predict(X, params):
    # Predicted probabilities for each sample
    y_prediction = sigmoid(np.dot(X, params['W']) + params['b'])
    # Threshold at 0.5 to obtain class labels
    for i in range(len(y_prediction)):
        if y_prediction[i] > 0.5:
            y_prediction[i] = 1
        else:
            y_prediction[i] = 0
    return y_prediction
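The thresholding loop can also be written as a single array comparison, and an accuracy score is a natural companion to it. The sketch below is one possible variant; `predict_vectorized`, `accuracy`, and the hand-picked parameters are hypothetical and serve only as an illustration.

```python
import numpy as np

def predict_vectorized(X, params, threshold=0.5):
    """Threshold the predicted probabilities in one step (hypothetical variant of predict)."""
    probs = 1 / (1 + np.exp(-(X @ params['W'] + params['b'])))
    return (probs > threshold).astype(float)

def accuracy(y_true, y_pred):
    """Fraction of matching labels; both inputs are flattened to avoid shape mismatches."""
    return float(np.mean(np.ravel(y_true) == np.ravel(y_pred)))

# Hand-picked parameters, for illustration only.
params = {'W': np.array([[1.0], [-1.0]]), 'b': 0.0}
X = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
y_pred = predict_vectorized(X, params)
print(y_pred.ravel())
print(accuracy(np.array([1, 0, 0]), y_pred))
```

As in the loop version, a probability of exactly 0.5 is assigned to class 0 because the comparison is strict.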




Use sklearn to generate a simulated binary classification dataset for training and testing the model:

import matplotlib.pyplot as plt
# make_classification is imported from sklearn.datasets;
# the older sklearn.datasets.samples_generator module has been removed
from sklearn.datasets import make_classification

X, labels = make_classification(n_samples=100, n_features=2, n_redundant=0, n_informative=2, random_state=1, n_clusters_per_class=2)
# Add uniform noise to the features
rng = np.random.RandomState(2)
X += 2*rng.uniform(size=X.shape)

# Plot each class in its own color
unique_labels = set(labels)
colors = plt.cm.Spectral(np.linspace(0, 1, len(unique_labels)))
for k, col in zip(unique_labels, colors):
    x_k = X[labels==k]
    plt.plot(x_k[:, 0], x_k[:, 1], 'o', markerfacecolor=col, markeredgecolor="k",
             markersize=14)
plt.title('data by make_classification()')
plt.show()
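Before this data can be passed to the training function, it needs to be split into training and test sets, and the labels need to become a column vector, since `make_classification` returns them with shape (n,). A minimal sketch of that step follows; `split_train_test`, its 90/10 ratio, and the stand-in arrays are assumptions made for illustration, not something the original post specifies.

```python
import numpy as np

def split_train_test(X, labels, train_ratio=0.9, seed=0):
    """Shuffle and split the data; labels are reshaped to (n, 1) column vectors."""
    rng = np.random.RandomState(seed)
    idx = rng.permutation(X.shape[0])
    offset = int(X.shape[0] * train_ratio)
    train_idx, test_idx = idx[:offset], idx[offset:]
    y = labels.reshape(-1, 1)
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# Stand-in arrays with the same shapes as the make_classification output.
X_demo = np.arange(200, dtype=float).reshape(100, 2)
labels_demo = np.arange(100) % 2

X_train, y_train, X_test, y_test = split_train_test(X_demo, labels_demo)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

With the real data, `X` and `labels` would take the place of the stand-in arrays, and the resulting `X_train` and `y_train` could then be passed to `logistic_train`.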
