Machine Learning

1. Input and Explore

dataset: df {X, y (target)}

upload files:

predict: X, save w/ id

select as target:

click to populate!

select as time:
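A minimal sketch of the input step, assuming the uploaded file is a CSV and that pandas is used. The column names (`id`, `date`, `label`) and the helper `load_dataset` are hypothetical, chosen only for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for an uploaded file.
csv_text = """id,date,feature_a,feature_b,label
1,2023-01-02,0.5,10,0
2,2023-01-09,1.5,20,1
3,2023-02-06,2.5,30,0
"""


def load_dataset(source, target_col, time_col=None):
    """Read a CSV and split it into features X and target y."""
    df = pd.read_csv(source)
    if time_col is not None:
        # Parse the column selected as time into datetime values.
        df[time_col] = pd.to_datetime(df[time_col])
    y = df[target_col]
    X = df.drop(columns=[target_col])
    return X, y


X, y = load_dataset(io.StringIO(csv_text), target_col="label", time_col="date")
print(X.dtypes)
print(y.value_counts())
```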

2. Preprocess

Drop rows with NaN? Clear outliers with IQR?
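One possible reading of these two options, sketched with pandas. The multiplier `k=1.5` is the common IQR convention and is an assumption here, as are the helper names.

```python
import numpy as np
import pandas as pd


def drop_nan_rows(df):
    """Drop every row that contains at least one NaN."""
    return df.dropna()


def clear_outliers_iqr(df, k=1.5):
    """Keep rows whose numeric values all lie within [Q1 - k*IQR, Q3 + k*IQR]."""
    num = df.select_dtypes(include="number")
    q1 = num.quantile(0.25)
    q3 = num.quantile(0.75)
    iqr = q3 - q1
    inside = ((num >= q1 - k * iqr) & (num <= q3 + k * iqr)).all(axis=1)
    return df[inside]


df = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0, 100.0, np.nan],
        "b": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
    }
)
clean = clear_outliers_iqr(drop_nan_rows(df))
print(clean)
```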

Drop row if condition! UnderSampling(0 more!)
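A rough sketch of both options, assuming the condition is a pandas query string and that undersampling means randomly shrinking every class to the size of the smallest one. Both assumptions, and the helper names, are illustrative only.

```python
import pandas as pd


def drop_rows_where(df, condition):
    """Drop rows that match a pandas query string."""
    return df.drop(df.query(condition).index)


def undersample(df, target_col, random_state=0):
    """Randomly sample each class down to the size of the smallest class."""
    n_min = df[target_col].value_counts().min()
    parts = [
        group.sample(n=n_min, random_state=random_state)
        for _, group in df.groupby(target_col)
    ]
    return pd.concat(parts).sort_index()


df = pd.DataFrame(
    {"x": [1, 2, 3, 4, 5, 6, 7, 8], "label": [0, 0, 0, 0, 0, 0, 1, 1]}
)
df = drop_rows_where(df, "x < 2")
balanced = undersample(df, "label")
print(balanced["label"].value_counts())
```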

Create field:X.eval("fld1=...;...");no target!
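A small sketch of field creation with `DataFrame.eval`. Splitting the input on `;` and evaluating one assignment at a time is an assumption about how the string is handled. Column names are hypothetical. Because the expressions run on X only, the target column is not available to them.

```python
import pandas as pd


def create_fields(X, expressions):
    """Apply ';'-separated assignments such as 'c = a + b' to a copy of X."""
    out = X.copy()
    for expr in expressions.split(";"):
        expr = expr.strip()
        if expr:
            # eval with an assignment returns a new frame with the added column.
            out = out.eval(expr)
    return out


X = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})
X2 = create_fields(X, "total = a + b; ratio = b / a")
print(X2)
```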

time to serials(YMW) DropNonNumCol_fillna

Normalization? Stop training to show X
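The remaining preprocessing options might look like the sketch below. It assumes YMW means year, month, and ISO week, that missing values are filled with 0, and that normalization is min-max scaling. None of these choices is confirmed by the notes above, and the helper names are hypothetical.

```python
import pandas as pd


def time_to_parts(X, time_col):
    """Replace a time column with numeric year, month, and ISO week columns."""
    out = X.copy()
    t = pd.to_datetime(out[time_col])
    out["year"] = t.dt.year
    out["month"] = t.dt.month
    out["week"] = t.dt.isocalendar().week.astype(int)
    return out.drop(columns=[time_col])


def drop_non_numeric_fillna(X, fill_value=0):
    """Keep numeric columns only, then fill missing values."""
    return X.select_dtypes(include="number").fillna(fill_value)


def normalize(X):
    """Min-max scale each column to [0, 1]; constant columns become 0."""
    span = (X.max() - X.min()).replace(0, 1)
    return (X - X.min()) / span


X = pd.DataFrame(
    {"date": ["2023-01-02", "2023-02-06"], "name": ["a", "b"], "v": [1.0, None]}
)
num = drop_non_numeric_fillna(time_to_parts(X, "date"))
scaled = normalize(num)
print(scaled)
```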

3. Train/validate split

size (0.4 magic):

random_state:

shuffle
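The three settings above correspond to arguments of scikit-learn's `train_test_split`. A minimal sketch, assuming a validation fraction of 0.4 and an arbitrary `random_state` of 42, on toy data:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 10 samples, 2 features.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

X_train, X_val, y_train, y_val = train_test_split(
    X,
    y,
    test_size=0.4,     # fraction held out for validation (assumed value)
    random_state=42,   # fixes the shuffle so the split is reproducible
    shuffle=True,
)
print(len(X_train), len(X_val))
```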

===model parameters===

Tree:

Depth:

alpha:

gamma:

C:

degree
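How these fields map onto estimator arguments is not stated. The sketch below assumes "Tree" is the number of trees (`n_estimators`), "Depth" is `max_depth`, "alpha" is the Lasso/Ridge penalty strength, and "gamma", "C", and "degree" are the SVM settings. The values are placeholders.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Lasso
from sklearn.svm import SVC

# Hypothetical form values; the keys mirror the labels above.
params = {
    "trees": 100,
    "depth": 5,
    "alpha": 1.0,
    "gamma": "scale",
    "C": 1.0,
    "degree": 3,
}

forest = RandomForestClassifier(
    n_estimators=params["trees"], max_depth=params["depth"]
)
lasso = Lasso(alpha=params["alpha"])
svc = SVC(
    kernel="poly",
    C=params["C"],
    gamma=params["gamma"],
    degree=params["degree"],  # only used by the polynomial kernel
)
print(forest.get_params()["n_estimators"], svc.get_params()["degree"])
```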

4. Model Selection

DecisionTreeClassifier

RandomForestClassifier(independent)

RandomForestRegressor

GradientBoostingClassifier(tree by tree)

GradientBoostingRegressor

XGBClassifier(derivatives)

XGBRegressor

LinearRegression

weight regularization: Lasso (L1), Ridge (L2), none

LogisticRegression (class)

SVM.SVC(class)

SVM.SVR(regress) Linear Poly RBF
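One way to wire a model choice to an estimator is a name-to-constructor table, sketched below for a few of the scikit-learn classifiers listed. The `CLASSIFIERS` table, `build_model`, and the parameter values are hypothetical. The XGBoost models come from the separate `xgboost` package and are left out so the sketch needs only scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Map each menu label to a function that builds a fresh estimator.
CLASSIFIERS = {
    "DecisionTreeClassifier": lambda: DecisionTreeClassifier(
        max_depth=3, random_state=0
    ),
    # Random forest: trees are trained independently of each other.
    "RandomForestClassifier": lambda: RandomForestClassifier(
        n_estimators=20, random_state=0
    ),
    # Gradient boosting: trees are added one after another.
    "GradientBoostingClassifier": lambda: GradientBoostingClassifier(
        n_estimators=20, random_state=0
    ),
    "LogisticRegression": lambda: LogisticRegression(max_iter=1000),
    "SVC": lambda: SVC(kernel="rbf"),
}


def build_model(name):
    """Return a new, unfitted estimator for the selected label."""
    return CLASSIFIERS[name]()


X, y = make_classification(n_samples=100, n_features=5, random_state=0)
fitted = {name: build_model(name).fit(X, y) for name in CLASSIFIERS}
for name, model in fitted.items():
    print(name, model.score(X, y))
```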

5. Error/Accuracy

MAE:

MSE:

R2_score:

recall:

accuracy:

Predicted=[0]

feature_importances:
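The fields above can be filled from `sklearn.metrics` and from a fitted tree model's `feature_importances_` attribute. A short sketch on made-up values; the recall here is for the positive class (label 1), which is scikit-learn's default for binary targets.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    mean_absolute_error,
    mean_squared_error,
    r2_score,
    recall_score,
)
from sklearn.tree import DecisionTreeClassifier

# Regression metrics on made-up values.
y_true_reg = [3.0, 5.0, 2.0, 7.0]
y_pred_reg = [2.0, 5.0, 4.0, 7.0]
mae = mean_absolute_error(y_true_reg, y_pred_reg)
mse = mean_squared_error(y_true_reg, y_pred_reg)
r2 = r2_score(y_true_reg, y_pred_reg)

# Classification metrics on made-up labels.
y_true_cls = [0, 1, 1, 0, 1]
y_pred_cls = [0, 1, 0, 0, 1]
recall = recall_score(y_true_cls, y_pred_cls)
accuracy = accuracy_score(y_true_cls, y_pred_cls)

# Feature importances: the second feature is constant, so it cannot be used.
X = np.array([[0, 5], [1, 5], [0, 5], [1, 5]])
y = np.array([0, 1, 0, 1])
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
importances = tree.feature_importances_

print(mae, mse, r2, recall, accuracy, importances)
```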

6. y_pred vs y_test

model save? model load?
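Saving and loading could be done with Python's `pickle` module, as in the sketch below; the storage format actually used is not stated, so this is an assumption. The file name `model.pkl` and the toy data are hypothetical. The reloaded model's `y_pred` is then compared with `y_test`.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data following y = 2x + 1.
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([1.0, 3.0, 5.0, 7.0])
X_test = np.array([[4.0], [5.0]])
y_test = np.array([9.0, 11.0])

model = LinearRegression().fit(X_train, y_train)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    # Save the fitted model to disk.
    with open(path, "wb") as f:
        pickle.dump(model, f)
    # Load it back into a new object.
    with open(path, "rb") as f:
        restored = pickle.load(f)

y_pred = restored.predict(X_test)
for predicted, actual in zip(y_pred, y_test):
    print(f"predicted={predicted:.2f} actual={actual:.2f}")
```

Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.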

img here