🧠 AI Computer Institute
Content is AI-generated for educational purposes. Verify critical information independently. A bharath.ai initiative.

Machine Learning Cheat Sheet


Visual Overview: ML Algorithm Decision Flowchart

ML Algorithm Selection Flowchart

Start: Is your data labeled? (Do you have target values?)
├─ YES → Supervised Learning (Classification or Regression)
│   Is the target categorical or continuous?
│   ├─ Categorical → Classification (Logistic Regression, SVM, Random Forest)
│   └─ Continuous  → Regression (Linear, Polynomial, SVR)
└─ NO → Unsupervised Learning (Pattern Discovery)
    What pattern do you want to find?
    ├─ Groups   → Clustering (K-Means, DBSCAN, Hierarchical)
    └─ Features → Dimensionality Reduction (PCA, t-SNE, Autoencoders)

Choose your approach based on whether your data is labeled and what patterns you're looking for

Learning Types

Type | Labels | Examples | Goal
Supervised | Yes | Regression, Classification | Predict output from input
Unsupervised | No | Clustering, Dimensionality Reduction | Find hidden patterns
Reinforcement | Reward signal | Game AI, Robotics | Maximize reward through actions
Semi-supervised | Mixed | Few labeled + many unlabeled | Leverage unlabeled data
Self-supervised | Self-generated | Contrastive learning, BERT | Learn from unlabeled data

Regression Algorithms

Algorithm | Complexity | When to Use | Notes
Linear Regression | Low | Linear relationships | Fast, interpretable, simple baseline
Polynomial Regression | Medium | Curved relationships | Prone to overfitting
Ridge/Lasso | Low | Multicollinearity | Adds a regularization penalty
SVR (Support Vector Regression) | Medium | Non-linear data, outliers | Good for small-to-medium datasets
Decision Tree Regression | Medium | Non-linear data, interactions | Easy to interpret
Random Forest | High | Complex patterns | Ensemble; reduces overfitting
Gradient Boosting | High | Maximum accuracy | Often wins competitions
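The simplest row above, linear regression with one feature, has a closed-form least-squares solution (slope = covariance / variance). A minimal plain-Python sketch, with no library assumed:

```python
# Simple linear regression (one feature) via the closed-form
# least-squares solution: slope = cov(x, y) / var(x).
def fit_line(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    slope = cov / var
    intercept = my - slope * mx
    return slope, intercept

slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # data from y = 2x + 1
print(slope, intercept)  # → 2.0 1.0
```

In practice you would use a library implementation (e.g. scikit-learn's LinearRegression), but the arithmetic is exactly this.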

Classification Algorithms

Algorithm | Model Type | When to Use | Pros/Cons
Logistic Regression | Linear | Fast baseline | Simple, interpretable
Naive Bayes | Probabilistic | Text data, speed | Fast, but assumes feature independence
SVM | Non-linear | Small-to-medium data | Powerful; slow on large data
Decision Tree | Non-linear | Interpretability needed | Easy to overfit
Random Forest | Non-linear | Best general-purpose | Accurate; less interpretable
Gradient Boosting | Non-linear | High accuracy needed | Slow; prone to overfitting
K-Nearest Neighbors | Non-linear | Small datasets | Simple; slow at prediction time
Neural Network | Complex | Deep patterns, large data | Powerful; needs tuning
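K-Nearest Neighbors from the table is simple enough to sketch in a few lines: predict the majority label among the k training points closest to the query. A minimal 1-D illustration (no library assumed):

```python
from collections import Counter

# Minimal k-nearest-neighbors classifier for 1-D features:
# predict the majority label among the k closest training points.
def knn_predict(train, query, k=3):
    # train: list of (feature, label) pairs
    nearest = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    labels = [label for _, label in nearest]
    return Counter(labels).most_common(1)[0][0]

train = [(1.0, "A"), (1.2, "A"), (1.1, "A"), (5.0, "B"), (5.2, "B"), (4.9, "B")]
print(knn_predict(train, 1.3))  # → A
print(knn_predict(train, 5.1))  # → B
```

Note why prediction is slow for k-NN: every query scans (or sorts) the whole training set, since there is no training phase to compress the data.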

Unsupervised Learning

// Clustering
K-Means: Group by distance to centroids
Hierarchical: Tree-like cluster structure
DBSCAN: Density-based, finds irregular shapes
Gaussian Mixture: Probabilistic clustering
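K-Means is the easiest of these to sketch by hand: alternate between assigning points to the nearest centroid and moving each centroid to the mean of its points. A minimal 1-D version (real use would be multi-dimensional, e.g. scikit-learn's KMeans):

```python
# Minimal 1-D k-means: alternate between assigning each point to
# its nearest centroid and moving centroids to the mean of their
# assigned points, until the centroids stop moving.
def kmeans_1d(points, centroids, max_iter=100):
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        new_centroids = [sum(c) / len(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids

print(kmeans_1d([1, 2, 3, 10, 11, 12], [1.0, 10.0]))  # → [2.0, 11.0]
```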

// Dimensionality Reduction
PCA: Linear dimensionality reduction
t-SNE: Non-linear visualization (2D/3D)
UMAP: Faster than t-SNE; better preserves global structure
Autoencoders: Neural network compression
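PCA can be sketched without a library for the 2-D case: center the data, build the 2×2 covariance matrix, and take the eigenvector of its largest eigenvalue as the first principal component. A minimal sketch (real use would call scikit-learn's PCA):

```python
import math

# Sketch of PCA for 2-D data: center the data, form the 2x2
# covariance matrix, and take the eigenvector of its largest
# eigenvalue as the first principal component.
def pca_first_component(points):
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    a = sum(x * x for x, _ in centered) / (n - 1)  # var(x)
    b = sum(x * y for x, y in centered) / (n - 1)  # cov(x, y)
    c = sum(y * y for _, y in centered) / (n - 1)  # var(y)
    # Largest eigenvalue of [[a, b], [b, c]]
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b * b)
    # Corresponding eigenvector (b == 0 means the axes are the components)
    if b == 0:
        v = (1.0, 0.0) if a >= c else (0.0, 1.0)
    else:
        v = (lam - c, b)
    norm = math.hypot(*v)
    return (v[0] / norm, v[1] / norm)

# Points lying on the line y = x: the first component is (1, 1)/√2
print(pca_first_component([(1, 1), (2, 2), (3, 3)]))
```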

// Rules
Apriori: Find frequent itemsets
Eclat: Depth-first variant
Association Rules: If X then Y

// Anomaly Detection
Isolation Forest: Isolate anomalies
Local Outlier Factor (LOF): Density-based
One-Class SVM: Learn normal behavior
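The algorithms listed above need a library, but the core idea of flagging points that sit far from the bulk of the data can be shown with a plain z-score detector. To be clear, this is not Isolation Forest or LOF, just the simplest statistical stand-in:

```python
import math

# Simplest statistical anomaly detector: flag points whose
# z-score (distance from the mean in standard deviations)
# exceeds a threshold. Not Isolation Forest or LOF; just the core idea.
def zscore_outliers(data, threshold=2.5):
    n = len(data)
    mean = sum(data) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return [x for x in data if abs(x - mean) / std > threshold]

data = [10, 11, 9, 10, 12, 11, 10, 50]  # 50 is the anomaly
print(zscore_outliers(data))  # → [50]
```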

Evaluation Metrics

Metric | Formula/Use | Range | Good When
Accuracy | (TP+TN)/(TP+TN+FP+FN) | 0-1 | Balanced classes
Precision | TP/(TP+FP) | 0-1 | False positives costly
Recall | TP/(TP+FN) | 0-1 | False negatives costly
F1-Score | 2×(P×R)/(P+R) | 0-1 | Balancing precision & recall
ROC-AUC | Area under ROC curve | 0-1 | Threshold-independent comparison
PR-AUC | Area under precision-recall curve | 0-1 | Imbalanced classes
MAE | Mean Absolute Error | 0-∞ | Regression
RMSE | √(Mean Squared Error) | 0-∞ | Regression; penalizes large errors
R² | Variance explained | -∞ to 1 | Regression
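The classification formulas above follow directly from confusion-matrix counts; a minimal sketch with made-up counts:

```python
# Computing the classification metrics above directly from
# confusion-matrix counts (true/false positives/negatives).
def classification_metrics(tp, fp, tn, fn):
    accuracy  = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)
    f1        = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Example counts (hypothetical): 80 true positives, 20 false positives,
# 90 true negatives, 10 false negatives
acc, p, r, f1 = classification_metrics(tp=80, fp=20, tn=90, fn=10)
print(acc, p)  # → 0.85 0.8
```

Note how precision and recall pull on different error types: raising the decision threshold typically trades false positives (precision) against false negatives (recall), which is why F1 combines them.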

Bias-Variance Tradeoff

// High Bias (Underfitting)
- Model too simple for data
- High training error
- High test error
- Example: Linear model on non-linear data

Solutions:
→ Increase model complexity
→ Add more features
→ Train longer
→ Reduce regularization

// High Variance (Overfitting)
- Model too complex for data
- Low training error
- High test error
- Example: High-degree polynomial on few samples

Solutions:
→ More training data
→ Reduce model complexity
→ Regularization (L1/L2)
→ Dropout, early stopping
→ Cross-validation

// Optimal Balance
→ Use validation set to find sweet spot
→ Learning curves: plot train vs validation loss
→ Bias decreases, Variance increases with model complexity
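A tiny numeric illustration of high bias: on data generated by y = 2x, a constant (mean) predictor underfits badly, while a least-squares line drives the training error to zero. A plain-Python sketch:

```python
# High-bias illustration: a constant (mean) model underfits data
# with a clear linear trend, giving high training error, while a
# least-squares line fits it exactly.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]          # y = 2x, perfectly linear

mean_y = sum(ys) / len(ys)     # constant model: always predict the mean
mse_constant = sum((y - mean_y) ** 2 for y in ys) / len(ys)

mx = sum(xs) / len(xs)
slope = sum((x - mx) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
intercept = mean_y - slope * mx
mse_line = sum((y - (slope * x + intercept)) ** 2
               for x, y in zip(xs, ys)) / len(ys)

print(mse_constant, mse_line)  # → 8.0 0.0
```

The mirror-image experiment (a high-degree polynomial on a handful of noisy points) would show the variance side: near-zero training error but large test error.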

Hyperparameter Tuning

// Grid Search: Try all combinations
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

params = {
    'C': [0.1, 1, 10],           # regularization strength
    'kernel': ['linear', 'rbf']  # kernel type
}
grid = GridSearchCV(SVC(), params, cv=5)  # 3 × 2 = 6 combos, 5-fold CV each
grid.fit(X, y)
best_model = grid.best_estimator_

// Random Search: Sample a fixed number of random combinations
from sklearn.model_selection import RandomizedSearchCV
search = RandomizedSearchCV(model, params, n_iter=20, cv=5)
search.fit(X, y)

// Cross-Validation
from sklearn.model_selection import cross_val_score
scores = cross_val_score(model, X, y, cv=5)
# Splits data into 5 folds, trains 5 times
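What `cv=5` does under the hood is split the sample indices into 5 folds; each fold serves once as the validation set while the rest train. A minimal sketch of the index splitting (scikit-learn's KFold does this, plus shuffling options):

```python
# Split n_samples indices into k contiguous folds, spreading any
# remainder across the first folds (as scikit-learn's KFold does).
def kfold_indices(n_samples, k):
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

print(kfold_indices(10, 5))  # → [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```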

// Learning Rate Schedules
Constant: lr = 0.01
Step decay: Reduce after N epochs
Exponential decay: lr = lr0 × e^(-kt)
1/t decay: lr = lr0 / (1 + kt)
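The four schedules above are easy to express as functions of the step t; a minimal sketch using an assumed starting rate lr0 = 0.01 and decay constant k = 0.1:

```python
import math

# The four learning-rate schedules above as functions of step t,
# with assumed lr0 = 0.01 and decay constant k = 0.1.
lr0, k = 0.01, 0.1

def constant(t):    return lr0
def step_decay(t):  return lr0 * (0.5 ** (t // 10))  # halve every 10 epochs
def exp_decay(t):   return lr0 * math.exp(-k * t)
def inv_t_decay(t): return lr0 / (1 + k * t)

print(step_decay(25))   # → 0.0025 (halved twice)
print(inv_t_decay(10))  # → 0.005
```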

// Early Stopping
Stop training when validation loss plateaus
Prevents overfitting and saves computation
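"Plateaus" is usually operationalized with a patience counter: stop once the validation loss has gone `patience` epochs without improving. A minimal sketch of that rule:

```python
# Patience-based early stopping: stop once the validation loss
# has not improved for `patience` consecutive epochs.
def early_stop_epoch(val_losses, patience=2):
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch        # stop here
    return len(val_losses) - 1  # patience never exhausted

losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63]
print(early_stop_epoch(losses))  # → 4 (no improvement since epoch 2)
```

In practice you would also restore the weights from the best epoch, not the stopping epoch.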
