ML Basics
Supervised and unsupervised learning, neural networks, and evaluation metrics: your machine learning starter reference.
01 ML Fundamentals
Supervised
Labeled data. Input+output pairs. Learns to predict.
Unsupervised
No labels. Finds patterns/clusters.
Reinforcement
Agent learns from rewards/penalties.
Overfitting
Model memorizes training data, fails on new data. Fix: more data, regularization, dropout.
Underfitting
Model too simple, misses patterns. Fix: more features, a more complex model.
Bias-Variance
High bias leads to underfitting. High variance leads to overfitting. Goal: balance both.
💡
Bias-Variance Tradeoff: simple models have high bias, complex models have high variance. Cross-validation helps find the sweet spot.
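A minimal sketch of the tradeoff, assuming scikit-learn is installed. The synthetic dataset, the decision tree, and the `max_depth` values are illustrative choices, not part of this reference. Tree depth stands in for model complexity.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with label noise (flip_y) so that a deep tree can overfit.
X, y = make_classification(n_samples=400, n_features=10, n_informative=4,
                           flip_y=0.15, random_state=0)

results = {}
for depth in (1, 3, None):  # None = grow until every leaf is pure
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    cv_acc = cross_val_score(model, X, y, cv=5).mean()  # accuracy on held-out folds
    train_acc = model.fit(X, y).score(X, y)             # accuracy on data already seen
    results[depth] = (train_acc, cv_acc)
    print(f"max_depth={depth}: train={train_acc:.2f}, cv={cv_acc:.2f}")
```

Low train and cv accuracy together suggest high bias. A large gap between train and cv accuracy suggests high variance.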
02 Supervised Algorithms
Algorithm            Type            Best for
Linear Regression    Regression      Predicting continuous values
Logistic Regression  Classification  Binary classification
Decision Tree        Both            Interpretable models
Random Forest        Both            High accuracy, less overfitting
SVM                  Classification  High-dimensional data
KNN                  Both            Small datasets, simple
Neural Network       Both            Complex patterns, large data
Gradient Boosting    Both            Competitions, tabular data
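Several of the algorithms above share the same `fit` / `score` interface in scikit-learn, so they can be swapped with little code. A rough sketch, assuming scikit-learn is installed. The synthetic dataset and the hyperparameters are placeholders, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Illustrative binary classification data.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)                      # learn from labeled pairs
    accuracies[name] = model.score(X_test, y_test)   # accuracy on unseen data
    print(f"{name}: {accuracies[name]:.2f}")
```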
ML: Model evaluation
# Regression metrics
MAE = mean(|actual-predicted|)
MSE = mean((actual-predicted)^2)
RMSE = sqrt(MSE)
R^2 = 1 - SS_res/SS_tot

# Classification metrics
Accuracy = correct/total
Precision = TP/(TP+FP)
Recall = TP/(TP+FN)
F1 = 2*(Precision*Recall)/(Precision+Recall)
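The formulas above translate directly into plain Python. This is a sketch only: the function names and sample numbers are made up for illustration, and zero denominators (for example, no predicted positives) are not handled.

```python
import math

def regression_metrics(actual, predicted):
    """MAE, MSE, RMSE and R^2 for two equal-length lists of numbers."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse),
            "R2": 1 - ss_res / ss_tot}

def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {"Accuracy": (tp + tn) / (tp + fp + fn + tn),
            "Precision": precision,
            "Recall": recall,
            "F1": 2 * precision * recall / (precision + recall)}

reg = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
clf = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(reg)
print(clf)
```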
03 Neural Networks
Perceptron
Single neuron. Weighted sum of inputs plus bias, passed through an activation function.
Activation functions
ReLU: max(0,x). Sigmoid: 1/(1+e^-x). Tanh: (e^x-e^-x)/(e^x+e^-x).
Forward pass
Data flows input->hidden->output.
Backpropagation
Error flows backward. Gradients calculated via chain rule.
Gradient descent
Weights updated: w = w - learning_rate * gradient.
Epoch
One full pass through the training data.
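The pieces above (forward pass, backpropagation, gradient descent, epochs) can be seen together in a single sigmoid neuron. A minimal sketch in plain Python, assuming a toy logical-OR dataset, cross-entropy loss, and a learning rate of 0.5. All of these are illustrative choices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy dataset: logical OR, as ((x1, x2), target) pairs.
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
        ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]

w = [0.0, 0.0]
b = 0.0
learning_rate = 0.5
epoch_losses = []

for epoch in range(2000):            # one epoch = one pass over all samples
    total_loss = 0.0
    for (x1, x2), y in data:
        # Forward pass: weighted sum plus bias, then activation.
        z = w[0] * x1 + w[1] * x2 + b
        a = sigmoid(z)
        total_loss += -(y * math.log(a) + (1 - y) * math.log(1 - a))
        # Backward pass: with sigmoid + cross-entropy the chain rule
        # simplifies to dLoss/dz = a - y.
        dz = a - y
        # Gradient descent: w = w - learning_rate * gradient.
        w[0] -= learning_rate * dz * x1
        w[1] -= learning_rate * dz * x2
        b -= learning_rate * dz
    epoch_losses.append(total_loss / len(data))

predictions = [round(sigmoid(w[0] * x1 + w[1] * x2 + b)) for (x1, x2), _ in data]
print(predictions, epoch_losses[0], epoch_losses[-1])
```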
ML: Simple neural net concept
Input layer -> Hidden layers -> Output layer

Each neuron: z = w1*x1 + w2*x2 + b
Activation: a = ReLU(z) = max(0, z)

Loss function: measures prediction error
Optimizer: an algorithm such as Adam or SGD that adjusts weights to minimize loss
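To make the layer flow concrete, here is a small forward pass through one hidden layer in plain Python. The weights, biases, input, and target are hand-picked hypothetical values, and the `dense` helper is an illustration rather than any library's API.

```python
def relu(z):
    return max(0.0, z)

def identity(z):
    return z

def dense(inputs, weights, biases, activation):
    """One fully connected layer: each neuron computes activation(w . x + b)."""
    return [activation(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
            for neuron_w, b in zip(weights, biases)]

# Hypothetical parameters: 2 inputs -> 2 hidden neurons -> 1 output.
hidden_w = [[1.0, -1.0], [0.5, 0.5]]
hidden_b = [0.0, -3.0]
output_w = [[2.0, 1.0]]
output_b = [0.5]

x = [3.0, 1.0]
hidden = dense(x, hidden_w, hidden_b, relu)            # input -> hidden
output = dense(hidden, output_w, output_b, identity)   # hidden -> output

target = 5.0
loss = (output[0] - target) ** 2   # squared error for this single example
print(hidden, output, loss)
```

The second hidden neuron has a negative weighted sum, so ReLU clips it to zero and it contributes nothing to the output.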
04 Model Evaluation
ML: Train/Test split
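# X, y: feature matrix and labels; model: any scikit-learn estimator
from sklearn.model_selection import cross_val_score, train_test_split

# Split data: hold out 20% for testing (random_state makes the split repeatable)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Cross-validation (more reliable than a single split): 5 folds, one score per fold
scores = cross_val_score(model, X, y, cv=5)

# Confusion matrix
#            Predicted +  Predicted -
# Actual +   TP           FN
# Actual -   FP           TN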
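A short sketch of building this matrix with scikit-learn's `confusion_matrix`, using made-up labels. By default scikit-learn sorts the labels, so 0/1 data comes out as [[TN, FP], [FN, TP]]. Passing `labels=[1, 0]` reorders it to match the layout above.

```python
from sklearn.metrics import confusion_matrix

# Illustrative labels: 1 = positive, 0 = negative.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# Rows are actual classes, columns are predicted classes, positive first.
cm = confusion_matrix(y_true, y_pred, labels=[1, 0])
tp, fn, fp, tn = (int(v) for v in cm.ravel())

precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(cm)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```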
❓ Quiz
What does overfitting mean?
Overfitting: model learns training data too well including noise, so it fails to generalize to unseen data. Fix: regularization, more data, simpler model.
05 Feature Engineering
Normalization
Scale features to 0-1 range. MinMax scaler.
Standardization
Mean=0, std=1. Z-score scaling.
One-hot encoding
Convert categorical to binary columns.
Feature selection
Remove irrelevant features. Reduces overfitting.
PCA
Dimensionality reduction. Keep most important variance.
Missing values
Drop, mean/median impute, or predict.
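Several of these steps are simple enough to write by hand, which shows what the library scalers and encoders compute. A sketch in plain Python. The helper names and sample values are hypothetical, and constant columns (zero range or zero standard deviation) are not handled.

```python
import statistics

def impute_median(values):
    """Replace None with the median of the known values."""
    median = statistics.median(v for v in values if v is not None)
    return [median if v is None else v for v in values]

def min_max_scale(values):
    """Normalization: rescale to the 0-1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Standardization: mean 0, std 1 (population std, as StandardScaler uses)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def one_hot(categories):
    """One binary column per distinct category, columns in sorted order."""
    columns = sorted(set(categories))
    return [[1 if c == col else 0 for col in columns] for c in categories]

ages = [20.0, None, 30.0, 40.0]
filled = impute_median(ages)
normalized = min_max_scale(filled)
z_scores = standardize(filled)
encoded = one_hot(["red", "blue", "red"])
print(filled, normalized, z_scores, encoded)
```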
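ML: Preprocessing
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Fit the scaler on training data only, then reuse it on test data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# One-hot encoding (df: a DataFrame with a "category" column; returns a new DataFrame)
df = pd.get_dummies(df, columns=["category"])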
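Standardization and PCA from this section are often chained, since PCA is sensitive to feature scale. A minimal sketch, assuming scikit-learn is installed. The synthetic data and `n_components=3` are arbitrary illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Illustrative data: 200 samples, 10 features.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Fit both steps on training data only to avoid leaking test information.
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=3).fit(scaler.transform(X_train))

X_train_reduced = pca.transform(scaler.transform(X_train))
X_test_reduced = pca.transform(scaler.transform(X_test))

print(X_train_reduced.shape, X_test_reduced.shape)
print(pca.explained_variance_ratio_)  # share of variance kept per component
```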