This is the repository for the Machine Learning workshop organized by the MALTO (MAchine Learning @ Polito) team at the Politecnico di Torino.
- Introduction
- Dataset Loading
- Dataset Overview
- Data Quality
- Descriptive Statistics
- Basic Data Visualization
- Exploring Feature Relationships
- Handling Missing Values
- Feature Engineering
- EXTRA EXERCISE: Unsupervised Exploration: PCA and t-SNE
The labs of Part I have been curated by Andrea Lolli, Samet Basarat, and Claudio Savelli.
- Introduction
- Setup
- Training Different Models
- Logistic Regression
- K-Nearest Neighbors
- Decision Tree
- Random Forest
- Support Vector Classifier
- Naive Bayes
- Model Evaluation
- Confusion Matrix
- Accuracy, Precision, Recall, F1 Score
- ROC Curve and AUC
- Hyperparameter Tuning
- Cross-Validation
The labs of Part II have been curated by Ayberk Munis and Claudio Savelli.
- Introduction
- Regression vs Classification
- Evaluation Metrics
- Loading and Inspecting the Dataset
- Train/Validation/Test Split
- Feature Scaling
- Training Different Models and Error Evaluation
- Linear Regression
- Polynomial Regression
- Ridge Regression
- Lasso Regression
- Decision Tree Regressor
- Random Forest Regressor
- Support Vector Regressor
- Error Analysis
The labs of Part III have been curated by Tommaso Mazzarini, Arman, and Claudio Savelli.
- Introduction
- Supervised Learning vs Unsupervised Learning
- Loading and Preparing the Data
- Initial Feature Visualization and Distribution
- Histogram
- Pairplot
- Importance of Feature Scaling
- Exploring the Data Structure
- PCA (Principal Component Analysis)
- t-SNE (t-Distributed Stochastic Neighbor Embedding)
- Clustering Algorithms
- K-means
- DBSCAN
- Hierarchical Clustering
- How to Evaluate Clustering Performance
- Elbow Method
- Silhouette Score
- Rand Index
- Adjusted Rand Index
- Davies-Bouldin Index
The labs of Part IV have been curated by Niccolò Malgeri, Emanuele, and Claudio Savelli.