Study Notes (ML8_Cross validation)
📌 KFold
import numpy as np
from sklearn.model_selection import KFold

X = np.array([[1,2], [3,4], [1,2], [3,4]])
y = np.array([1,2,3,4])
kf = KFold(n_splits=2)  # split the data into 2 folds
print(kf.get_n_splits(X))
kf
>>> 2
KFold(n_splits=2, random_state=None, shuffle=False)

for train_idx, test_idx in kf.split(X):
    print('---- idx')
    print(train_idx, test_idx)
    print('---- train data')
    pr..
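To make the index output of the split easier to read, here is a minimal pure-Python sketch of how an unshuffled k-fold split can assign indices to folds. The helper name `kfold_indices` is hypothetical and is not part of scikit-learn. It only mimics the documented behavior that, without shuffling, folds are consecutive blocks and the first `n_samples % n_splits` folds receive one extra sample.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) lists for consecutive, unshuffled folds.

    Hypothetical helper for illustration only; not the scikit-learn code.
    """
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    # Every fold gets `base` samples; the first `extra` folds get one more.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        # The current block is the test set; everything else is training data.
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


# Same shape as the note's example: 4 samples, 2 folds.
folds = list(kfold_indices(n_samples=4, n_splits=2))
for train_idx, test_idx in folds:
    print(train_idx, test_idx)
```

With 4 samples and 2 folds, each sample lands in the test set exactly once, which is the property cross validation relies on.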
Study Notes (ML6_Wine)
📌 plotly.express
import pandas as pd

red = pd.read_csv('../data/winequality-red.csv', sep=';')
white = pd.read_csv('../data/winequality-white.csv', sep=';')
red['color'] = 1.
white['color'] = 0.
wine = pd.concat([red, white])
wine.info()
wine['quality'].unique()
>>>> array([5, 6, 7, 4, 8, 3, 9], dtype=int64)

import plotly.express as px

fig = px.histogram(wine, x='quality')
fig.show()
fig = px.histo..
Study Notes (ML5)
📌 LabelEncoder
import pandas as pd

df = pd.DataFrame({
    'A' : ['a', 'b', 'c', 'a', 'b'],
    'B' : [1, 2, 3, 1, 0],
})
df

📌 Label_encoder - fit -> transform
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(df['A'])
le.classes_
>>>> array(['a', 'b', 'c'], dtype=object)
le.transform(df['A'])
>>>> array([0, 1, 2, 0, 1])
le.fit_transform(df['A'])
>>>> array([0, 1, 2, 0, 1])
le.inverse_transform(df['..
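As a rough sketch of what fit, transform, and inverse_transform do conceptually, the same mapping can be written in plain Python. This is an illustration under the assumption that classes are the sorted unique labels (which matches the `classes_` output above). The names `to_code`, `encoded`, and `decoded` are made up for this example and are not scikit-learn attributes.

```python
values = ['a', 'b', 'c', 'a', 'b']

# "fit": collect the sorted unique labels, analogous to le.classes_.
classes = sorted(set(values))
to_code = {label: code for code, label in enumerate(classes)}

# "transform": replace each label with its integer position in classes.
encoded = [to_code[v] for v in values]

# "inverse_transform": map each integer code back to its label.
decoded = [classes[c] for c in encoded]

print(classes)
print(encoded)
print(decoded)
```

The round trip returns the original labels, which is why the encoder can be used to decode model predictions back into readable categories.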