Day 5: Model Evaluation, Validation, Improvement & Advanced Engineering
Everything in full detail, with no external references needed. Explanations are kept conversational so learning stays fun too.
1) Model Evaluation & Metrics
The model is ready; now it needs to be evaluated. For classification, accuracy, precision, recall, F1, and ROC-AUC are the workhorses; for regression, MSE, MAE, RMSE, and R².
1.1 Classification Metrics: Accuracy, Precision, Recall, F1
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix
y_true = [0,1,1,0,1,0,1,1]
y_pred = [0,1,0,0,1,0,1,0]
print("Accuracy:", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:", recall_score(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("Confusion Matrix:\\n", confusion_matrix(y_true, y_pred))
Your Turn: Modify y_pred to create more false positives or more false negatives, and observe how precision vs recall changes.
1.2 ROC Curve & AUC
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt
y_prob = [0.1,0.9,0.2,0.4,0.8,0.05,0.9,0.3]
fpr, tpr, _ = roc_curve(y_true, y_prob)
print("AUC:", roc_auc_score(y_true, y_prob))
plt.plot(fpr, tpr, label="ROC")
plt.plot([0,1],[0,1],'--', label="Random")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend()
plt.show()
Practice: Change one of the y_prob values drastically (e.g., give a correctly labeled example a low probability) and observe how AUC changes.
1.3 Regression Metrics: MAE, MSE, RMSE, R²
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
y_true_reg = [10,12,9,15,20]
y_pred_reg = [11,11,8,14,19]
print("MAE:", mean_absolute_error(y_true_reg, y_pred_reg))
print("MSE:", mean_squared_error(y_true_reg, y_pred_reg))
print("RMSE:", mean_squared_error(y_true_reg, y_pred_reg, squared=False))
print("R²:", r2_score(y_true_reg, y_pred_reg))
Practice: Introduce one larger error in y_pred_reg, like changing one prediction to 30, and observe how MSE vs MAE react.
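A quick sketch of this experiment (the value 30 is just an illustrative outlier):
y_pred_out = [11, 11, 8, 14, 30]  # last prediction is now a big miss
print("MAE with outlier:", mean_absolute_error(y_true_reg, y_pred_out))
print("MSE with outlier:", mean_squared_error(y_true_reg, y_pred_out))
# MSE reacts much more strongly because squaring amplifies large errors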
2) Model Validation Techniques
A single train-test split can give unstable results. Better: use k-fold CV, stratified CV, leave-one-out (very slow), or nested CV.
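Leave-one-out doesn't get its own subsection below, so here is a minimal self-contained sketch (kept small on purpose, since LeaveOneOut refits the model once per sample):
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.linear_model import LogisticRegression
import numpy as np
X_small = np.random.rand(30, 4)  # small n on purpose: LOO fits 30 models here
y_small = np.random.randint(0, 2, 30)
loo_scores = cross_val_score(LogisticRegression(), X_small, y_small, cv=LeaveOneOut())
print("LOO mean accuracy:", loo_scores.mean())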
2.1 K-Fold Cross-Validation
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
import numpy as np
X = np.random.rand(200,5)
y = np.random.randint(0,2,200)
rf = RandomForestClassifier()
scores = cross_val_score(rf, X, y, cv=5)
print("CV Scores:", scores)
print("Mean:", scores.mean())
Your Turn: Set cv=10 and see if the mean score gets more stable (less variance across folds).
2.2 Stratified K-Fold for Imbalanced Classes
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.ensemble import RandomForestClassifier
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
s_scores = cross_val_score(rf, X, y, cv=skf)
print("Stratified CV Mean:", s_scores.mean())
Practice: Create an imbalanced y (e.g., 90 zeros and 10 ones) and compare plain CV vs stratified CV, as in the sketch below.
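A sketch of this comparison, reusing X and rf from above (the 90/10 ratio scaled to the 200 rows of X):
from sklearn.model_selection import KFold
y_imb = np.array([0]*180 + [1]*20)  # 90% zeros, 10% ones
plain = cross_val_score(rf, X, y_imb, cv=KFold(n_splits=5, shuffle=True, random_state=42))
strat = cross_val_score(rf, X, y_imb, cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42))
print("Plain KFold: mean", plain.mean(), "std", plain.std())
print("Stratified:  mean", strat.mean(), "std", strat.std())
# Stratification keeps the 90/10 ratio inside every fold, so fold scores vary less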
2.3 Nested Cross-Validation (for tuning + evaluation)
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
inner_cv = StratifiedKFold(4, shuffle=True, random_state=1)
outer_cv = StratifiedKFold(5, shuffle=True, random_state=2)
param_grid = {'n_estimators':[50,100], 'max_depth':[5,10]}
grid = GridSearchCV(RandomForestClassifier(), param_grid, cv=inner_cv)
nested_scores = cross_val_score(grid, X, y, cv=outer_cv)
print("Nested CV Mean:", nested_scores.mean())
Your Turn: Compare the nested CV result to a simple train-test split with tuning, and note the possible over-optimism of single-split tuning; one way to check is sketched below.
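A direct way to see the over-optimism, reusing grid and nested_scores from above:
grid.fit(X, y)  # tune on all the data, no outer loop
print("Inner-CV best score:", grid.best_score_)   # selection bias tends to make this look better
print("Nested CV mean:", nested_scores.mean())    # the more honest estimate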
3) Model Improvement Techniques
To improve a model, use hyperparameter tuning, ensembles, regularization, feature selection, and techniques to avoid overfitting.
3.1 Hyperparameter Tuning: Grid vs Random Search
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.ensemble import RandomForestClassifier
from scipy.stats import randint
# Grid Search
param_grid = {'n_estimators': [50,100], 'max_depth': [5,10]}
grid = GridSearchCV(RandomForestClassifier(random_state=1), param_grid, cv=4)
grid.fit(X, y)
print("Grid best:", grid.best_params_, grid.best_score_)
# Randomized Search
param_dist = {'n_estimators': randint(50,200), 'max_depth': randint(3,15)}
rand = RandomizedSearchCV(RandomForestClassifier(random_state=1), param_dist, n_iter=5, cv=4, random_state=0)
rand.fit(X, y)
print("Random best:", rand.best_params_, rand.best_score_)
Practice: Run both searches and note the runtime difference when the parameter grid is large; randomized search can be more efficient.
3.2 Ensembles: Voting, Stacking
from sklearn.ensemble import VotingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
clf1 = LogisticRegression()
clf2 = DecisionTreeClassifier()
clf3 = SVC(probability=True)
voter = VotingClassifier([('lr',clf1),('dt',clf2),('svc',clf3)], voting='soft')
print("Voting CV:", cross_val_score(voter, X, y, cv=5).mean())
stack = StackingClassifier([('rf',rf),('svc',clf3)], final_estimator=LogisticRegression())
print("Stacking CV:", cross_val_score(stack, X, y, cv=5).mean())
Practice: Try adding KNeighborsClassifier to the ensemble (a sketch follows) and compare single models vs ensemble performance.
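A sketch of the KNN addition (KNeighborsClassifier with its default k=5):
from sklearn.neighbors import KNeighborsClassifier
voter_knn = VotingClassifier([('lr', clf1), ('dt', clf2), ('svc', clf3),
                              ('knn', KNeighborsClassifier())], voting='soft')
print("Voting+KNN CV:", cross_val_score(voter_knn, X, y, cv=5).mean())
print("LR alone CV:", cross_val_score(clf1, X, y, cv=5).mean())  # single-model baseline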
3.3 Regularization: L1 vs L2
from sklearn.linear_model import LogisticRegression
lr_l2 = LogisticRegression(penalty='l2', solver='liblinear')
lr_l1 = LogisticRegression(penalty='l1', solver='liblinear')
print("L2 CV:", cross_val_score(lr_l2, X, y, cv=5).mean())
print("L1 CV:", cross_val_score(lr_l1, X, y, cv=5).mean())
Practice: Change the regularization strength (C) and observe the impact on performance and sparsity; see the sketch below.
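A sketch sweeping C (smaller C means stronger regularization; the grid of values is arbitrary):
for C in [0.01, 0.1, 1.0, 10.0]:
    m = LogisticRegression(penalty='l1', solver='liblinear', C=C).fit(X, y)
    nonzero = (m.coef_ != 0).sum()  # L1 zeroes out coefficients as C shrinks
    print(f"C={C}: CV={cross_val_score(m, X, y, cv=5).mean():.3f}, non-zero coefs={nonzero}")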
3.4 Feature Selection: Filter vs Wrapper vs Embedded
from sklearn.feature_selection import SelectKBest, f_classif, RFE
sel = SelectKBest(f_classif, k=3)
X_sel = sel.fit_transform(X, y)
print("SelectKBest shape:", X_sel.shape)
rfe = RFE(estimator=LogisticRegression(), n_features_to_select=3)
X_rfe = rfe.fit_transform(X, y)
print("RFE shape:", X_rfe.shape)
Practice: Train a model on the selected features vs the full features and compare accuracy and training time, as in the sketch below.
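A minimal version of that comparison, reusing X_sel from the SelectKBest step:
import time
t0 = time.time()
print("Full features CV:", cross_val_score(rf, X, y, cv=5).mean(), f"({time.time()-t0:.2f}s)")
t0 = time.time()
print("Selected features CV:", cross_val_score(rf, X_sel, y, cv=5).mean(), f"({time.time()-t0:.2f}s)")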
4) Advanced Feature Engineering
4.1 Interaction & Polynomial Features
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(2, include_bias=False)
X_poly = poly.fit_transform(X[:10])
print("Original shape:", X[:10].shape, "Poly shape:", X_poly.shape)
Practice: Print the feature names with poly.get_feature_names_out() to see how the interactions are created.
4.2 Frequency & Target Encoding
import pandas as pd
df_cat = pd.DataFrame({'city':['A','B','A','C','B','A'], 'y':[1,0,1,0,1,0]})
freq = df_cat['city'].value_counts().to_dict()
df_cat['city_freq'] = df_cat['city'].map(freq)
target_mean = df_cat.groupby('city')['y'].mean().to_dict()
df_cat['city_tgt'] = df_cat['city'].map(target_mean)
print(df_cat)
Practice: Think about leakage: how do you avoid it using K-fold within target encoding? One answer is sketched below.
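One standard fix is out-of-fold target encoding: each row is encoded with target means computed only on the other folds, so no row ever sees its own label. A minimal sketch on the toy df_cat above:
import numpy as np
from sklearn.model_selection import KFold
df_cat['city_tgt_oof'] = np.nan
for tr_idx, val_idx in KFold(n_splits=3, shuffle=True, random_state=0).split(df_cat):
    fold_means = df_cat.iloc[tr_idx].groupby('city')['y'].mean()  # means from other folds only
    col = df_cat.columns.get_loc('city_tgt_oof')
    df_cat.iloc[val_idx, col] = df_cat.iloc[val_idx]['city'].map(fold_means).values
df_cat['city_tgt_oof'] = df_cat['city_tgt_oof'].fillna(df_cat['y'].mean())  # unseen categories
print(df_cat)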
4.3 Binning Continuous Variables
import pandas as pd
vals = [18,22,35,45,60,70]
bins = [0,20,40,60,100]
labels = ['teen','adult','mid','senior']
print(pd.cut(pd.Series(vals), bins=bins, labels=labels))
Practice: Bin 'fare' values into categories like 'low', 'medium', 'high' and use them as categorical features; see the sketch below.
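A sketch assuming a Titanic-style DataFrame with a 'fare' column (the bin edges here are arbitrary):
df_t = pd.DataFrame({'fare': [7.25, 13.0, 35.5, 80.0, 512.33]})  # toy fares
df_t['fare_band'] = pd.cut(df_t['fare'], bins=[0, 10, 50, 600],
                           labels=['low', 'medium', 'high'])
print(df_t)
# 'fare_band' can now be one-hot encoded like any other categorical feature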
4.4 Feature Importance (Tree & SHAP sketch)
import numpy as np
rf.fit(X, y)
print("Feature importances:", rf.feature_importances_)
# For actual SHAP, install shap and use shap.TreeExplainer
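A minimal SHAP sketch (assumes pip install shap; the shape of the output varies a bit across shap versions):
import shap
explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X)  # per-sample, per-feature contributions
shap.summary_plot(shap_values, X)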
Practice: Plot the feature importances, drop the 2 lowest, retrain, and see if the score drops.
5) End-to-End Case Study — Churn Prediction Workflow
Now for a complete practical workflow: data load → preprocessing → feature engineering → model train → evaluate → tune → final model. Everything is copy-paste-run.
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, classification_report
df = pd.read_csv('customer_churn.csv') # make sure CSV present
df = df.dropna(subset=['churn'])
df['age'] = df['age'].fillna(df['age'].median())      # avoid chained inplace fillna (deprecated in pandas 2.x)
df['balance'] = df['balance'].fillna(df['balance'].median())
df['gender'] = df['gender'].fillna('other')
df['region'] = df['region'].fillna('unknown')
X = df.drop(['customer_id','churn'], axis=1)
y = df['churn'].map({'No':0,'Yes':1}) if df['churn'].dtype=='object' else df['churn']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42, stratify=y)
num_cols = ['age','balance']
cat_cols = ['gender','region']
preprocessor = ColumnTransformer([
('num', StandardScaler(), num_cols),
('cat', OneHotEncoder(handle_unknown='ignore', sparse_output=False), cat_cols)  # sparse_output replaced sparse in sklearn 1.2
])
pipeline = Pipeline([
('pre', preprocessor),
('clf', RandomForestClassifier(random_state=42))
])
pipeline.fit(X_train, y_train)
pred = pipeline.predict(X_test)
print("Accuracy:", pipeline.score(X_test, y_test))
print(classification_report(y_test, pred))
param_grid = {'clf__n_estimators':[50,100], 'clf__max_depth':[5,10]}
grid = GridSearchCV(pipeline, param_grid, cv=4, scoring='roc_auc')
grid.fit(X_train, y_train)
best = grid.best_estimator_
print("Best params:", grid.best_params_)
final_pred = best.predict(X_test)
print("Final ROC-AUC:", roc_auc_score(y_test, best.predict_proba(X_test)[:,1]))
print(classification_report(y_test, final_pred))
Your Turn: Run the code, then try changing the model (e.g., XGBoost, sketched below), adding features (like products), or expanding the param grid to improve ROC-AUC.
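A hedged sketch of the XGBoost swap (assumes pip install xgboost; everything else in the pipeline stays the same):
from xgboost import XGBClassifier
xgb_pipeline = Pipeline([
    ('pre', preprocessor),
    ('clf', XGBClassifier(random_state=42, eval_metric='logloss'))
])
xgb_pipeline.fit(X_train, y_train)
print("XGB ROC-AUC:", roc_auc_score(y_test, xgb_pipeline.predict_proba(X_test)[:, 1]))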
6) Practice Exercises: Copy → Run
These exercises are ready to run. Copy → paste → run them, then add your own twist and see how the results change.
6.1 Evaluate a classifier: change y_pred to increase FP
... (as above)
6.2 Plot ROC curve with your own probabilities
... (as above)
6.3 Regression metrics with one outlier
... (as above)
6.4 Perform stratified 5-fold CV on a synthetic dataset
... (as above)
6.5 Ensemble Voting vs Stacking comparison
... (as above)
Resources & Next Steps
- Download sample datasets (Titanic, Churn) from Kaggle
- Scikit-learn documentation on metrics, model selection, pipelines
- SHAP for advanced explainability
- YouTube tutorials for ensemble methods & tuning