Mehak2327/ML_Assignments

📘 MACHINE LEARNING ASSIGNMENTS – UML501

Mehak | B.Tech COE (TIET)

This repository contains eight Machine Learning assignments completed for the UML501 course. Each assignment implements ML algorithms hands-on in Python, reinforcing both the underlying theory and practical programming skills.


📂 Assignment Overview

Assignment 1 – NumPy Operations & Matrix Computations

Objective: Introduction to numerical computing using NumPy
Key Tasks:

  • Array creation, slicing, reshaping, flattening
  • Matrix operations (addition, multiplication, inverse, determinant, eigenvalues)
  • Statistical measures (mean, median, standard deviation, covariance, percentiles)
  • Image-to-array conversion & file handling
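
The array and matrix tasks above can be sketched as follows. This is a minimal illustration, not the assignment's actual notebook: the matrices `A` and `B` and the sample vector `x` are made-up values chosen so the results are easy to check by hand.

```python
import numpy as np

# Array creation, reshaping, slicing, flattening
a = np.arange(12)        # values 0..11
m = a.reshape(3, 4)      # 3 rows, 4 columns
sub = m[1:, :2]          # rows 1-2, columns 0-1
flat = m.flatten()       # 1-D copy of m

# Matrix operations on a small invertible matrix (illustrative values)
A = np.array([[4.0, 1.0], [2.0, 3.0]])
B = np.eye(2)
total = A + B
product = A @ B                  # matrix product
det = np.linalg.det(A)           # 4*3 - 1*2 = 10
inv = np.linalg.inv(A)
eigvals = np.linalg.eigvals(A)   # roots of t^2 - 7t + 10

# Statistical measures on a made-up sample
x = np.array([2.0, 4.0, 4.0, 5.0, 7.0, 9.0])
y = 2 * x + 1
mean, median, sd = x.mean(), np.median(x), x.std()
cov_xy = np.cov(x, y)[0, 1]              # sample covariance (ddof=1)
p25, p75 = np.percentile(x, [25, 75])

print(sub, det, np.sort(eigvals), median, cov_xy)
```

Note that `x.std()` is the population standard deviation (`ddof=0`), while `np.cov` defaults to the sample estimate (`ddof=1`); mixing the two is a common source of small mismatches.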

Assignment 2 – Data Preprocessing Techniques

Objective: Clean and transform raw data for ML models
Key Tasks:

  • Handling missing values & noise removal
  • Normalization & standardization
  • Binning & discretization
  • One-hot encoding, ordinal encoding
  • Similarity & correlation metrics (Jaccard, Cosine, Pearson, Simple Matching)

Dataset: Bike Buyers Dataset (synthetic equivalent)
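
The four similarity and correlation measures named above can be written in a few lines of NumPy. The sketch below assumes binary vectors for Jaccard and Simple Matching; the function names and the vectors `p` and `q` are illustrative and are not taken from the assignment.

```python
import numpy as np

def jaccard(a, b):
    """Jaccard coefficient for binary vectors: f11 / (f11 + f10 + f01)."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    union = np.sum(a | b)
    return np.sum(a & b) / union if union else 1.0

def simple_matching(a, b):
    """Simple Matching Coefficient: share of positions where a and b agree."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    return np.mean(a == b)

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def pearson(a, b):
    """Pearson correlation coefficient."""
    return np.corrcoef(a, b)[0, 1]

# Illustrative binary vectors
p = [1, 0, 0, 1, 1]
q = [1, 1, 0, 0, 1]
print(jaccard(p, q), simple_matching(p, q), cosine(p, q), pearson(p, q))
```

Jaccard ignores positions where both vectors are 0, whereas Simple Matching counts them as agreement, so the two differ most on sparse data.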

Assignment 3 – Regression Models & PCA

Objective: Compare analytical & iterative regression training
Key Tasks:

  • Linear Regression using Normal Equation + Gradient Descent
  • 5-Fold Cross Validation evaluation
  • Model performance comparison via R² score
  • PCA for dimensionality reduction (before vs after comparison)
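
A minimal sketch of the regression comparison above, assuming synthetic data and hand-written helpers (`fit_normal_equation`, `fit_gradient_descent`, `cross_val_r2` are illustrative names, not the assignment's code). It covers the Normal Equation, batch Gradient Descent, and 5-fold R² evaluation; PCA is left out.

```python
import numpy as np

# Synthetic data (an assumption; the assignment's dataset is not stated here)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=200)

def add_bias(X):
    """Prepend a column of ones so the first weight acts as the intercept."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_normal_equation(X, y):
    """Solve (Xb^T Xb) w = Xb^T y directly."""
    Xb = add_bias(X)
    return np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

def fit_gradient_descent(X, y, lr=0.1, epochs=1000):
    """Batch gradient descent on the mean squared error."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        grad = (2 / len(y)) * Xb.T @ (Xb @ w - y)
        w -= lr * grad
    return w

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

def cross_val_r2(fit, X, y, k=5, seed=1):
    """Mean R^2 over k folds; the fixed seed gives both models the same folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = fit(X[train], y[train])
        scores.append(r2_score(y[test], add_bias(X[test]) @ w))
    return float(np.mean(scores))

w_ne = fit_normal_equation(X, y)
w_gd = fit_gradient_descent(X, y)
r2_ne = cross_val_r2(fit_normal_equation, X, y)
r2_gd = cross_val_r2(fit_gradient_descent, X, y)
print(w_ne, w_gd, r2_ne, r2_gd)
```

On well-scaled features like these, both methods should land on essentially the same weights; gradient descent becomes the practical choice when the feature count makes solving the linear system expensive.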

Assignment 4 – Web Scraping & Data Extraction

Objective: Collect real-world structured data
Key Tasks:

  • Static scraping using BeautifulSoup
  • Dynamic scraping using Selenium
  • Extracted data from:
    • BooksToScrape
    • IMDb Top 250 Movies
    • TimeAndDate Global weather reports
  • Export to CSV for analysis
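
The parse-then-export flow above can be sketched without network access. The assignment uses BeautifulSoup and Selenium; this example swaps in the standard library's `html.parser` so it runs with no third-party packages, and the inline `HTML` snippet, class names, and `BookParser` class are hypothetical stand-ins for a real book-listing page.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup standing in for a downloaded listing page
HTML = """
<article class="product_pod">
  <h3><a href="#" title="Book One">Book One</a></h3>
  <p class="price_color">10.50</p>
</article>
<article class="product_pod">
  <h3><a href="#" title="Book Two">Book Two</a></h3>
  <p class="price_color">7.25</p>
</article>
"""

class BookParser(HTMLParser):
    """Collects one {title, price} row per book entry."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._title = None
        self._in_price = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "title" in attrs:
            self._title = attrs["title"]       # remember the current title
        elif tag == "p" and attrs.get("class") == "price_color":
            self._in_price = True              # next text node is the price

    def handle_data(self, data):
        if self._in_price:
            self.rows.append({"title": self._title, "price": data.strip()})
            self._in_price = False

parser = BookParser()
parser.feed(HTML)

# Export to CSV (written to an in-memory buffer instead of a file)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["title", "price"])
writer.writeheader()
writer.writerows(parser.rows)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, replace the buffer with `open("books.csv", "w", newline="")`.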

Assignment 5 – Ridge & Lasso Regression + Cross Validation

Objective: Regularized regression & model selection
Key Tasks:

  • Ridge Regression using Gradient Descent (tuning the regularization strength α and the learning rate)
  • Linear vs Ridge vs Lasso comparison
  • RidgeCV & LassoCV on the Boston Housing dataset
  • Hitters Dataset regression evaluation and best model justification
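
A minimal sketch of Ridge Regression trained by gradient descent, checked against the closed-form solution. It assumes synthetic data, an unpenalized intercept, and the objective (1/n)·‖Xw + b − y‖² + (α/n)·‖w‖²; the function names and the values of `alpha`, `lr`, and `epochs` are illustrative choices, not the tuned values from the assignment.

```python
import numpy as np

def ridge_gradient_descent(X, y, alpha=1.0, lr=0.05, epochs=2000):
    """Gradient descent on squared error plus an L2 penalty on w (not on b)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        err = X @ w + b - y
        w -= lr * (2 / n) * (X.T @ err + alpha * w)
        b -= lr * (2 / n) * err.sum()
    return w, b

def ridge_closed_form(X, y, alpha=1.0):
    """Same objective solved directly; centering handles the free intercept."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y.mean() - X.mean(axis=0) @ w

# Synthetic data (an assumption, used only to exercise the functions)
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
y = X @ np.array([3.0, -1.0, 0.0, 2.0]) + 1.0 + rng.normal(scale=0.5, size=150)

w_gd, b_gd = ridge_gradient_descent(X, y, alpha=10.0)
w_cf, b_cf = ridge_closed_form(X, y, alpha=10.0)
w_ols, _ = ridge_closed_form(X, y, alpha=0.0)   # alpha = 0 is plain least squares

print(w_gd, w_cf, np.linalg.norm(w_cf), np.linalg.norm(w_ols))
```

The ridge weights have a smaller norm than the least-squares weights, which is the shrinkage effect that the Linear vs Ridge vs Lasso comparison examines.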

Assignment 6 – Naïve Bayes & GridSearchCV

Objective: Bayesian modeling and hyperparameter tuning
Key Tasks:

  • Gaussian Naïve Bayes – manual & built-in implementation
  • GridSearchCV to select the best K for KNN
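
A manual Gaussian Naïve Bayes classifier can be sketched in NumPy as below. The class name `ManualGaussianNB`, the variance-smoothing constant, and the two synthetic clusters are assumptions made for illustration; the GridSearchCV part of the assignment is not covered here.

```python
import numpy as np

class ManualGaussianNB:
    """Gaussian Naive Bayes: per-class feature means, variances, and priors."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.prior_ = np.array([np.mean(y == c) for c in self.classes_])
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # Small constant keeps every variance strictly positive
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        return self

    def predict(self, X):
        # log P(c) + sum_j log N(x_j | mean_cj, var_cj) for every class c
        diff = X[:, None, :] - self.mean_[None, :, :]          # shape (n, k, d)
        log_lik = -0.5 * np.sum(
            np.log(2 * np.pi * self.var_) + diff ** 2 / self.var_, axis=2
        )
        return self.classes_[np.argmax(np.log(self.prior_) + log_lik, axis=1)]

# Two synthetic, well-separated clusters (illustrative data)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(4.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

model = ManualGaussianNB().fit(X, y)
accuracy = np.mean(model.predict(X) == y)
print(accuracy)
```

Working in log space avoids numerical underflow when many per-feature likelihoods are multiplied together.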

Assignment 7 – Support Vector Machines

Objective: Classification using different SVM kernels
Key Tasks:

  • SVC with Linear / Polynomial / RBF kernels
  • Metrics: Accuracy, Precision, Recall, F1-score
  • Confusion Matrix visualization
  • Effect of feature scaling on SVM performance
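
One way to see why feature scaling matters for kernel SVMs is to look at the RBF kernel itself, k(x, z) = exp(−γ‖x − z‖²). The sketch below does not train an SVC; it only compares kernel matrices before and after standardization on a made-up dataset in which one feature is in the tens of thousands. The helper names and `gamma=0.5` are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, gamma=0.5):
    """Pairwise RBF kernel matrix exp(-gamma * squared Euclidean distance)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    return np.exp(-gamma * sq_dist)

def standardize(X):
    """Zero mean and unit variance per feature (column)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Made-up samples: feature 0 lies in [0, 1], feature 1 is in the tens of thousands
X = np.array([
    [0.1, 20000.0],
    [0.9, 21000.0],
    [0.2, 50000.0],
    [0.8, 52000.0],
])

K_raw = rbf_kernel(X)
K_scaled = rbf_kernel(standardize(X))

off_diag = ~np.eye(len(X), dtype=bool)
print(K_raw[off_diag].max(), K_scaled[off_diag].min())
```

Without scaling, the large feature dominates every distance, so all off-diagonal kernel values collapse to zero and each point looks unrelated to every other; after standardization both features contribute to the similarity.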

Assignment 8 – AdaBoost (Text, Medical & Sensor Data)

Objective: Boosting for stronger ensemble models
Parts Implemented:

  1. SMS Spam Classification
    • TF-IDF vectorization + manual AdaBoost (T=15) + sklearn AdaBoost
  2. Heart Disease Prediction
    • UCI Heart dataset with hyperparameter tuning of the number of estimators & learning rate
  3. WISDM Smartphone & Watch Motion Sensor Dataset
    • Accelerometer windowing, feature extraction, manual AdaBoost vs sklearn AdaBoost
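
A manual AdaBoost of the kind described above can be sketched with decision stumps as weak learners. This is a generic version of discrete AdaBoost under stated assumptions: labels are in {−1, +1}, the data is a synthetic 2-D set with a diagonal boundary, and the function names are illustrative. Only `T=15` is taken from the text.

```python
import numpy as np

def stump_predict(X, j, thr, pol):
    """Predict +1 where pol * (X[:, j] - thr) >= 0, otherwise -1."""
    return np.where(pol * (X[:, j] - thr) >= 0, 1, -1)

def fit_stump(X, y, w):
    """Exhaustive search for the stump with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)   # (weighted error, feature, threshold, polarity)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                err = w[stump_predict(X, j, thr, pol) != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost_fit(X, y, T=15):
    n = len(y)
    w = np.full(n, 1 / n)                      # start with uniform sample weights
    ensemble = []
    for _ in range(T):
        err, j, thr, pol = fit_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * np.log((1 - err) / err)  # weight of this weak learner
        pred = stump_predict(X, j, thr, pol)
        w *= np.exp(-alpha * y * pred)         # up-weight misclassified samples
        w /= w.sum()
        ensemble.append((alpha, j, thr, pol))
    return ensemble

def adaboost_predict(ensemble, X):
    score = sum(a * stump_predict(X, j, thr, pol) for a, j, thr, pol in ensemble)
    return np.where(score >= 0, 1, -1)

# Synthetic data: a diagonal boundary that no single stump can match
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

ensemble = adaboost_fit(X, y, T=15)
train_accuracy = np.mean(adaboost_predict(ensemble, X) == y)
print(train_accuracy)
```

For text or sensor data, `X` would be the TF-IDF matrix or the windowed accelerometer features, with class labels mapped to −1 and +1 first.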

🧠 Learning Outcomes

  • Hands-on implementation of ML algorithms (regression, boosting, SVM, Naïve Bayes)
  • Understanding preprocessing, regularization & model evaluation
  • Experience with sensor, medical & text datasets
  • Practical ML pipeline skills (EDA → preprocessing → model → metrics)
  • Data scraping & automation using Python

🧰 Tools & Libraries Used

  • Python 3.x
  • NumPy, Pandas
  • Scikit-learn
  • Matplotlib & Seaborn
  • BeautifulSoup, Selenium
  • Jupyter / Spyder IDE

✍️ Author

Mehak
B.Tech – Computer Engineering (3rd Year)
Thapar Institute of Engineering and Technology
📧 mmehak2_be23@thapar.edu


⭐ Acknowledgement

This repository is part of the Machine Learning (UML501) coursework under the guidance of faculty at TIET, Patiala.

