
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
# Column names: 出行 (travel), 游戏时间 (gaming time), 冰激凌 (ice cream), 配对结果 (match result)
df = pd.read_csv('datatest.csv', names=['出行', '游戏时间', '冰激凌', '配对结果'])
print(df)
feature = df.iloc[:, 0:3]
print(feature)
target = df.iloc[:, -1]
print(target)
x_train, x_test, y_train, y_test = train_test_split(feature, target, test_size=0.21, random_state=1500)
print(x_train.shape, x_test.shape)
ss = StandardScaler()
x_train = ss.fit_transform(x_train)
print(x_train)
knn = KNeighborsClassifier(n_neighbors=3)
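# Train the model on the training set's feature data and target data
knn.fit(x_train, y_train)
# Evaluate on the test set to get the score. The test data must be standardized too,
# but the mean and variance are not recomputed: fitting on the training set already
# found them. That same conversion rule was applied to the training set and can be
# applied directly to the test set, so here we only call transform().
x_test = ss.transform(x_test)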
score = knn.score(x_test, y_test)
print(f'Model score: {score}')
x_test1 = [[20000, 2.54, 7.65], [2322, 5, 2]]
x_test1 = ss.transform(x_test1)
y_predict = knn.predict(x_test1)
print(y_predict)
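
The script relies on the fact that `ss.transform` reuses the mean and standard deviation learned during `fit_transform` on the training set. The following is a minimal sketch of that rule, assuming a small made-up array rather than the contents of `datatest.csv`; the names `x_train_demo`, `x_test_demo`, and `scaler` are hypothetical and exist only for this illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data (not taken from datatest.csv)
x_train_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
x_test_demo = np.array([[2.0, 40.0]])

scaler = StandardScaler()
scaler.fit(x_train_demo)  # learns mean_ and scale_ from the training rows only

# transform() applies the stored statistics: (x - mean_) / scale_
manual = (x_test_demo - scaler.mean_) / scaler.scale_
scaled = scaler.transform(x_test_demo)

print(scaler.mean_)                 # per-column mean of the training data
print(np.allclose(manual, scaled))  # True: same rule, no refitting on test data
```

Calling `fit_transform` on the test set instead would compute new statistics from the test rows, so the model would see features on a different scale from the one it was trained on.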