Phishing ML

Intro

This is a project I am working on to introduce myself to the world of Machine Learning, using scikit-learn.

NOTE: This project is still in progress.

What?

I am training a Machine Learning model to predict whether a given URL is a phishing link.

Get started

Clone the repo:

git clone https://github.com/SiddDevCS/PhishingML.git

Go to the project folder:

cd PhishingML/

Set up a virtual environment (venv):

python3 -m venv venv
source venv/bin/activate

Install the required libraries:

pip install -r requirements.txt

Train the model and save it to the models/ directory:

python3 notebooks/train_model.py

Note: if there is already a model in models/, delete the new model created by `train_model.py`.
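One way to avoid ending up with two models is to check for an existing model file before training. The snippet below is a minimal sketch and is not part of this repository: the helper name `should_train` is hypothetical, and the path assumes the `models/phishing_model.pkl` location shown under Project Structure.

```python
from pathlib import Path

# Location taken from the project structure; adjust if your layout differs.
MODEL_PATH = Path("models") / "phishing_model.pkl"


def should_train(model_path: Path = MODEL_PATH) -> bool:
    """Return True when no non-empty model file exists at model_path.

    Hypothetical helper: an empty or missing file counts as "no model".
    """
    return not (model_path.is_file() and model_path.stat().st_size > 0)


if should_train():
    print(f"No model found at {MODEL_PATH}: run notebooks/train_model.py")
else:
    print(f"Model already present at {MODEL_PATH}: training can be skipped")
```

Run it from the project root so that the relative path resolves to the models/ directory.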

Finally, start the Flask web app:

python3 app/app.py

Visit the web app in your browser at: http://127.0.0.1:5000/

Workflow

The datasets are stored as JSON for the model to be trained on.
tldextract splits each given link into its parts (subdomain, domain, suffix).
The ML model is trained on this data.
The trained model outputs whether a given link is a phishing link or not.

Project Structure:

phishing-detector/
├── data/
├────── fetch.py                # Script to fetch phishing datasets (use a VPN to avoid getting blocked)
├────── phish-data.json         # JSON datasets
├── notebooks/              
├────── load_data.py            # loads JSON into dataframe
├────── train_model.py          # trains/creates model in models/ dir
├── models/      
├────── phishing_model.pkl      # Trained ML model
├── app/
├────── app.py
├────── extract_features.py     # tldextract splitting up link
├────── static/
├───────────── style.css        # UI
├────── templates/
├───────────── index.html       # UI
├── README.md
└── requirements.txt
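
As an illustration of the "loads JSON into dataframe" step listed for `load_data.py`, here is a minimal pandas sketch. The layout of `phish-data.json` is not documented here, so the record shape (a JSON array of objects with `url` and `label` fields) and the function name `load_dataset` are assumptions for this example only.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def load_dataset(path):
    """Load a JSON array of records into a cleaned DataFrame.

    Assumes each record has a "url" field (hypothetical schema).
    """
    with open(path, encoding="utf-8") as fh:
        records = json.load(fh)
    df = pd.DataFrame(records)
    # Drop rows without a URL and collapse exact duplicate URLs.
    df = df.dropna(subset=["url"]).drop_duplicates(subset="url")
    return df.reset_index(drop=True)


# Made-up sample standing in for data/phish-data.json.
sample = [
    {"url": "http://login.example.net/verify", "label": 1},
    {"url": "http://login.example.net/verify", "label": 1},  # duplicate
    {"url": "https://example.org/", "label": 0},
    {"url": None, "label": 1},  # missing URL
]

with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "phish-data.json"
    sample_path.write_text(json.dumps(sample), encoding="utf-8")
    df = load_dataset(sample_path)

print(df)
```

Deduplicating before training keeps repeated URLs from being counted more than once.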

License

This project is licensed under the MIT License - see the LICENSE file for details.
