Contents: Data preparation; The KNN prediction process (1. Compute the distance between the new sample and each known sample, 2. Sort by distance, 3. Choose k, 4. Vote among the k nearest points); KNN in scikit-learn

Data preparation
import matplotlib.pyplot as plt
import numpy as np

# Sample features
data_X = [
    [0.5, 2],
    [1.8, 3],
    [3.9, 1],
    [4.7, 4],
    [6.2, 6],
    [7.5, 5],
    [8.3, 3.5],
    [9.1, 7],
    [9.8, 4.5]
]

# Sample labels
data_y = [0, 0, 0, 1, 1, 1, 1, 1, 1]

X_train = np.array(data_X)
y_train = np.array(data_y)

X_train
array([[0.5, 2. ],
       [1.8, 3. ],
       [3.9, 1. ],
       [4.7, 4. ],
       [6.2, 6. ],
       [7.5, 5. ],
       [8.3, 3.5],
       [9.1, 7. ],
       [9.8, 4.5]])

y_train
array([0, 0, 0, 1, 1, 1, 1, 1, 1])

Select the features of the samples labeled 0
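# Boolean mask: True where the label is 0
y_train == 0
array([ True,  True,  True, False, False, False, False, False, False])

X_train[y_train == 0]
array([[0.5, 2. ],
       [1.8, 3. ],
       [3.9, 1. ]])

# First feature of the samples labeled 0
X_train[y_train == 0, 0]
array([0.5, 1.8, 3.9])

# Second feature of the samples labeled 0
X_train[y_train == 0, 1]
array([2., 3., 1.])

X_train[y_train == 1, 0].shape
(6,)

X_train[y_train == 1, 1].shape
(6,)

# Plot label 0 as red crosses and label 1 as black circles
plt.scatter(X_train[y_train == 0, 0], X_train[y_train == 0, 1], color='red', marker='x')
plt.scatter(X_train[y_train == 1, 0], X_train[y_train == 1, 1], color='black', marker='o')
plt.show()

Add a new sample point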
data_new = np.array([4, 5])

# Same plot as before, with the new point drawn as a blue triangle
plt.scatter(X_train[y_train == 0, 0], X_train[y_train == 0, 1], color='red', marker='x')
plt.scatter(X_train[y_train == 1, 0], X_train[y_train == 1, 1], color='black', marker='o')
plt.scatter(data_new[0], data_new[1], color='b', marker='^')
plt.show()

The KNN prediction process
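For reference, the distance used in step 1 of this process is the Euclidean distance. The symbols here are notation introduced for this note and do not appear in the original code: x is the new sample, x^{(i)} is the i-th known sample, and n is the number of features (n = 2 in this example).

```latex
d\left(x, x^{(i)}\right) = \sqrt{\sum_{j=1}^{n} \left(x_j - x^{(i)}_j\right)^2}
```

As a quick check, for the new point (4, 5) and the known sample (4.7, 4) this gives \sqrt{0.7^2 + 1^2} = \sqrt{1.49} \approx 1.2207.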
1. Compute the distance between the new sample and each known sample

# Euclidean distance from data_new to every training sample
for data in X_train:
    print(np.sqrt(np.sum((data - data_new) ** 2)))
4.6097722286464435
2.973213749463701
4.001249804748512
1.2206555615733703
2.4166091947189146
3.5
4.5541190146942805
5.478138369920935
5.821511831131154

# The same computation, collected into a list
distances = [np.sqrt(np.sum((data - data_new) ** 2)) for data in X_train]
distances
[4.6097722286464435,
 2.973213749463701,
 4.001249804748512,
 1.2206555615733703,
 2.4166091947189146,
 3.5,
 4.5541190146942805,
 5.478138369920935,
 5.821511831131154]

2. Sort by distance
np.sort(distances)
array([1.22065556, 2.41660919, 2.97321375, 3.5       , 4.0012498 ,
       4.55411901, 4.60977223, 5.47813837, 5.82151183])

# argsort returns the indices of the samples ordered from nearest to farthest
sort_index = np.argsort(distances)
sort_index
array([3, 4, 1, 5, 2, 6, 0, 7, 8], dtype=int64)

3. Choose k
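k = 5

4. Vote among the k nearest points

# Labels of the k nearest samples
first_k = [y_train[i] for i in sort_index[:k]]
first_k
[1, 1, 0, 1, 0]

from collections import Counter

Counter(first_k)
Counter({1: 3, 0: 2})

Counter(first_k).most_common()
[(1, 3), (0, 2)]

Counter(first_k).most_common(1)
[(1, 3)]

# The most common label among the k neighbors is the prediction
predict_y = Counter(first_k).most_common(1)[0][0]
predict_y
1

The result is 1: KNN predicts that the label of the newly added point data_new should be 1. The plot shows the same thing, since the new point lies closer to the cluster of points labeled 1.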
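The four steps above can be collected into a single function. The following is a minimal sketch, not code from the original post: the function name knn_predict and its parameter names are hypothetical, and it replaces the Python loop with a vectorized NumPy expression that computes the same Euclidean distances.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k=5):
    """Predict the label of x_new by majority vote among its k nearest samples.

    Hypothetical helper written for illustration only.
    """
    # Step 1: Euclidean distance from x_new to every training sample.
    # Broadcasting subtracts x_new from each row of X_train.
    distances = np.sqrt(np.sum((X_train - x_new) ** 2, axis=1))
    # Step 2: indices of the training samples, nearest first.
    sort_index = np.argsort(distances)
    # Steps 3 and 4: take the labels of the k nearest samples and vote.
    first_k = y_train[sort_index[:k]]
    return Counter(first_k.tolist()).most_common(1)[0][0]


X_train = np.array([
    [0.5, 2], [1.8, 3], [3.9, 1], [4.7, 4], [6.2, 6],
    [7.5, 5], [8.3, 3.5], [9.1, 7], [9.8, 4.5],
])
y_train = np.array([0, 0, 0, 1, 1, 1, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([4, 5]), k=5))
```

With an even k the vote can tie. In this sketch a tie goes to whichever tied label appears first among the nearest neighbors, because Counter.most_common keeps elements with equal counts in the order they were first encountered.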
KNN in scikit-learn

from sklearn.neighbors import KNeighborsClassifier

kNN_classifier = KNeighborsClassifier(n_neighbors=5)
kNN_classifier.fit(X_train, y_train)

# predict expects a 2D array of shape (n_samples, n_features)
data_new.reshape(1, -1)
array([[4, 5]])

predict_y = kNN_classifier.predict(data_new.reshape(1, -1))
predict_y
array([1])

This matches the result of the handwritten KNN: both predict 1.
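To see why the two results agree, it can help to inspect which neighbors scikit-learn actually used. The sketch below is an addition to the original walkthrough and assumes the default KNeighborsClassifier settings (uniform weights, Euclidean distance). It uses the classifier's kneighbors and predict_proba methods; the variable names neigh_dist, neigh_ind and proba are chosen here for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X_train = np.array([
    [0.5, 2], [1.8, 3], [3.9, 1], [4.7, 4], [6.2, 6],
    [7.5, 5], [8.3, 3.5], [9.1, 7], [9.8, 4.5],
])
y_train = np.array([0, 0, 0, 1, 1, 1, 1, 1, 1])
data_new = np.array([4, 5])

kNN_classifier = KNeighborsClassifier(n_neighbors=5)
kNN_classifier.fit(X_train, y_train)

# kneighbors returns the distances to the k nearest training samples
# and their indices in X_train, nearest first.
neigh_dist, neigh_ind = kNN_classifier.kneighbors(data_new.reshape(1, -1))
print(neigh_ind[0])           # indices of the 5 nearest samples
print(y_train[neigh_ind[0]])  # labels that take part in the vote

# With uniform weights, predict_proba is the share of the k votes per class.
proba = kNN_classifier.predict_proba(data_new.reshape(1, -1))
print(proba)
```

The indices should agree with sort_index[:k] from the handwritten version, and the vote shares correspond to the 3-to-2 count produced by Counter.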