
Study_note(zb_data)/Machine Learning

Study Note (kNN)

πŸ“Œ kNN?

- μ‹€μ‹œκ°„ μ˜ˆμΈ‘μ„ μœ„ν•œ ν•™μŠ΅μ΄ ν•„μš”ν•˜μ§€ μ•Šλ‹€.

- 고차원 λ°μ΄ν„°μ—λŠ” μ ν•©ν•˜μ§€ μ•Šλ‹€.

Source: Zerobase Data School

πŸ“Œ Practice

- μ‹€μ‹œκ°„ μ˜ˆμΈ‘μ„ μœ„ν•œ ν•™μŠ΅μ΄ ν•„μš”ν•˜μ§€ μ•Šλ‹€.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()

# stratify keeps the class proportions equal in the train and test sets
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target,
                                                    test_size=0.2, random_state=13,
                                                    stratify=iris.target)

πŸ”» A fit step is required

from sklearn.neighbors import KNeighborsClassifier
# n_neighbors -> how many nearest points to consider in the vote
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
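The choice of `n_neighbors` is the main hyperparameter of kNN. One common way to pick it is cross-validation over a few candidate values; the sketch below (my addition, not part of the course code) uses odd k values, which avoids ties in binary voting and makes them rarer with three classes.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# 5-fold cross-validated accuracy for a few odd k values
for k in [1, 3, 5, 7, 9]:
    knn = KNeighborsClassifier(n_neighbors=k)
    scores = cross_val_score(knn, iris.data, iris.target, cv=5)
    print(f"k={k}: mean accuracy {scores.mean():.3f}")
```

On iris most of these k values score similarly, so k=5 (the sklearn default) is a reasonable choice here.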

πŸ”» Check the prediction results

from sklearn.metrics import accuracy_score

pred = knn.predict(X_test)
print(accuracy_score(y_test, pred))
>>>>
0.9666666666666667
from sklearn.metrics import classification_report, confusion_matrix

print(confusion_matrix(y_test, pred))
>>>>
[[10  0  0]
 [ 0  9  1]
 [ 0  0 10]]

πŸ”» classification_report

print(classification_report(y_test, pred))
>>>>
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      0.90      0.95        10
           2       0.91      1.00      0.95        10

    accuracy                           0.97        30
   macro avg       0.97      0.97      0.97        30
weighted avg       0.97      0.97      0.97        30
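Every number in the report above can be checked by hand from the confusion matrix: precision for a class is its diagonal entry over the column sum (all points predicted as that class), recall is the diagonal entry over the row sum (all points truly of that class), and f1 is their harmonic mean. A small sketch verifying this:

```python
# Confusion matrix from above: rows = true class, columns = predicted class
cm = [[10, 0, 0],
      [0, 9, 1],
      [0, 0, 10]]

n_classes = len(cm)
total = sum(sum(row) for row in cm)
correct = sum(cm[i][i] for i in range(n_classes))
print("accuracy:", correct / total)  # 29/30 = 0.9667

for c in range(n_classes):
    tp = cm[c][c]
    predicted_c = sum(cm[r][c] for r in range(n_classes))  # column sum
    actual_c = sum(cm[c])                                  # row sum
    precision = tp / predicted_c
    recall = tp / actual_c
    f1 = 2 * precision * recall / (precision + recall)
    print(f"class {c}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

For class 1 this gives precision 9/9 = 1.00 but recall 9/10 = 0.90 (one versicolor was misclassified as virginica), matching the report.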