MAL-TO/Workshop-ML

Workshop on Machine Learning

This is the repository for the Machine Learning workshop organized by the MALTO (MAchine Learning @ Polito) team at the Politecnico di Torino.

PART I - Data Exploration and Preprocessing

  • Introduction
  • Dataset Loading
  • Dataset Overview
  • Data Quality
  • Descriptive Statistics
  • Basic Data Visualization
  • Exploring Feature Relationships
  • Handling Missing Values
  • Feature Engineering
  • EXTRA EXERCISE: Unsupervised Exploration: PCA and t-SNE

The labs of Part I have been curated by Andrea Lolli, Samet Basarat, and Claudio Savelli.

PART II - Classification

  • Introduction
  • Setup
  • Training Different Models
    • Logistic Regression
    • K-Nearest Neighbors
    • Decision Tree
    • Random Forest
    • Support Vector Classifier
    • Naive Bayes
  • Model Evaluation
    • Confusion Matrix
    • Accuracy, Precision, Recall, F1 Score
    • ROC Curve and AUC
  • Hyperparameter Tuning
  • Cross-Validation

The labs of Part II have been curated by Ayberk Munis and Claudio Savelli.

PART III - Regression

  • Introduction
    • Regression vs Classification
    • Evaluation Metrics
  • Loading and Inspecting the Dataset
  • Train/Validation/Test Split
  • Feature Scaling
  • Training Different Models and Error Evaluation
    • Linear Regression
    • Polynomial Regression
    • Ridge Regression
    • Lasso Regression
    • Decision Tree Regressor
    • Random Forest Regressor
    • Support Vector Regressor
  • Error Analysis

The labs of Part III have been curated by Tommaso Mazzarini, Arman, and Claudio Savelli.

PART IV - Clustering

  • Introduction
    • Supervised Learning vs Unsupervised Learning
  • Loading and Preparing the Data
    • Initial Feature Visualization and Distribution
      • Histogram
      • Pairplot
      • Importance of Feature Scaling
  • Exploring the Data Structure
    • PCA (Principal Component Analysis)
    • t-SNE (t-Distributed Stochastic Neighbor Embedding)
  • Clustering Algorithms
    • K-means
    • DBSCAN
    • Hierarchical Clustering
  • How to Evaluate Clustering Performance
    • Elbow Method
    • Silhouette Score
    • Rand Index
    • Adjusted Rand Index
    • Davies-Bouldin Index

The labs of Part IV have been curated by Niccolò Malgeri, Emanuele, and Claudio Savelli.
