Building a case prediction model with logistic regression
Fitting the case data with logistic regression
Previously the prediction model was built with deep learning; here a classical machine-learning approach is used instead. The first candidate is a linear model, so logistic regression is used to train the model.
The principle behind logistic regression is actually quite simple: the model is fit with gradient ascent. Unlike gradient descent, which minimizes a loss, logistic regression maximizes the likelihood of the labels.
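Concretely, for labels y_i in {0, 1} and the sigmoid function σ, the log-likelihood being maximized and its gradient are the standard results (they correspond to the update step used in grad_ascend below):

\ell(w) = \sum_i \big[ y_i \log \sigma(x_i^\top w) + (1 - y_i) \log(1 - \sigma(x_i^\top w)) \big]

\nabla_w \ell(w) = X^\top (y - \sigma(Xw))

so each gradient-ascent step is w \leftarrow w + \alpha X^\top (y - \sigma(Xw)), which is exactly the weights = weights + alpha * np.dot(data.transpose(), error) line in the code below.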
One thing to note in particular: every array must be converted to np.array before doing element-wise addition, subtraction, and multiplication. np.array behaves like a MATLAB matrix, so a function can operate on the whole array at once.
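A minimal illustration of the difference, with toy values:
import numpy as np
a = [1, 2, 3]
b = [4, 5, 6]
print(a + b)        # plain Python lists: '+' concatenates -> [1, 2, 3, 4, 5, 6]
x = np.array(a)
y = np.array(b)
print(x + y)        # element-wise addition -> [5 7 9]
print(x * y)        # element-wise multiplication -> [ 4 10 18]
print(np.exp(-x))   # ufuncs such as np.exp apply to the whole array at once, like MATLAB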
import numpy as np
Here a constant 1 is prepended to every sample, so the model also learns an intercept (bias) term, which makes the fit more accurate.
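A quick sketch (toy values) of why prepending a 1 works: the intercept w0 gets absorbed into the weight vector, since [1, x1, ..., xn] · [w0, w1, ..., wn] = w0 + x · w. A vectorized equivalent of the loop in the function below would be:
features = np.array([[2.0, 3.0], [4.0, 5.0]])                   # toy feature matrix
with_bias = np.hstack([np.ones((len(features), 1)), features])  # prepend a column of ones
print(with_bias)                                                # [[1. 2. 3.] [1. 4. 5.]]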
# Read the data and split it into training and test sets
def load_dataSet():
    data = np.loadtxt('all_info.txt')
    # The last column of every row is the label
    label = [[data[i, -1]] for i in range(len(data))]
    data1 = []
    for i in range(len(data)):
        temp = [1]                      # constant term for the intercept
        for j in data[i][:-1]:
            temp.append(j)
        data1.append(temp)
    # First 251 samples for training, the rest for testing
    train_data = data1[:251]
    test_data = data1[251:]
    train_label = label[:251]
    test_label = label[251:]
    return train_data, test_data, train_label, test_label
train_data, test_data, train_label, test_label = load_dataSet()
np.shape(train_data)
(251L, 35L)
np.shape(train_label)
(251L, 1L)
When z < -30, exp(-z) is already huge (on the order of 1e13), so 1/(1 + exp(-z)) is effectively 0; returning 0 directly also avoids overflow warnings from np.exp for very negative z.
def sigmoid(z):
    if z < -30:
        return 0
    return 1.0 / (1 + np.exp(-z))
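A quick sanity check of the guard, with illustrative values:
print(sigmoid(-40))   # 0, returned by the guard instead of evaluating np.exp(40)
print(sigmoid(0))     # 0.5
print(sigmoid(40))    # ~1.0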
# Gradient ascent
def grad_ascend(data, label):
    # Convert the inputs to np.array so the arithmetic below is element-wise
    data = np.array(data)
    label = np.array(label)
    m, n = np.shape(data)
    alpha = 0.01          # initial learning rate
    alpha_step = 0.99     # decay factor applied every iteration
    maxCycles = 1000
    weights = np.random.randn(n, 1)
    for k in range(maxCycles):
        alpha = alpha * alpha_step
        # Predicted probability for every sample
        h = np.array([[sigmoid(i[0])] for i in np.dot(data, weights)])
        error = label - h
        # Log-likelihood gradient step: w <- w + alpha * X^T (y - h)
        weights = weights + alpha * np.dot(data.transpose(), error)
    return weights
w1 = grad_ascend(train_data, train_label)
def test_accuracy(data, label, weights):
    # Threshold the predicted probabilities at 0.5 to get class labels
    result = np.array([[1.0 if sigmoid(i[0]) >= 0.5 else 0.0] for i in np.dot(data, weights)])
    accuracy = float((result == np.array(label)).sum()) / len(label)
    return accuracy
test_accuracy(test_data, test_label, w1)
0.6274509803921569
Surprisingly, the test accuracy reaches 62.7%, which is almost the same as the accuracy of the TensorFlow fully-connected model.
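As a sanity check on the hand-rolled gradient ascent, an off-the-shelf baseline could be fit with scikit-learn. This is only a sketch, assuming scikit-learn is available; the hand-added constant-1 column is dropped because LogisticRegression fits its own intercept:
from sklearn.linear_model import LogisticRegression
X_train = np.array(train_data)[:, 1:]   # drop the constant-1 bias column
X_test = np.array(test_data)[:, 1:]
y_train = np.ravel(train_label)
y_test = np.ravel(test_label)
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))        # mean accuracy on the held-out test set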