Anime Recommender System

A machine learning model that uses collaborative filtering to generate personalized recommendations for users, written in Python using PyTorch, NumPy and pandas. To learn more about the theory, check out the project page on my website!

Dataset and preprocessing

The dataset used was a subset of the MyAnimeList dataset, which contains around 80 million ratings of 14k anime by 300k users. The 5,000 most popular anime and 100,000 randomly sampled users who had each rated at least 50 anime were kept as the "cleaned up" dataset on which the latent features were trained.
Dask was used to process the relatively large dataset in a distributed way.
You can find the code for this in MAL Dataset/Dataset prep.ipynb.
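
The filtering described above can be sketched roughly as follows. This is a minimal illustration in plain pandas, not the notebook's code: the column names (user_id, anime_id, rating), the function name select_subset, and the choice to measure popularity by number of ratings and to count each user's ratings before the anime filter are all assumptions. Dask DataFrames expose a similar interface, so the same idea should carry over to the full dataset with some adjustment.

```python
import pandas as pd

def select_subset(ratings, n_anime, n_users, min_ratings, seed=0):
    """Keep the n_anime most-rated anime and a random sample of active users.

    Assumes `ratings` has hypothetical columns user_id, anime_id, rating.
    """
    # Assumption: "popular" means "received the most ratings".
    popular = ratings["anime_id"].value_counts().nlargest(n_anime).index

    # Assumption: activity is counted over the full dataset, before filtering anime.
    per_user = ratings["user_id"].value_counts()
    eligible = per_user[per_user >= min_ratings].index.to_series()

    # Randomly sample users among those who rated enough anime.
    sampled = eligible.sample(n=min(n_users, len(eligible)), random_state=seed)

    keep = ratings["anime_id"].isin(popular) & ratings["user_id"].isin(sampled)
    return ratings[keep].reset_index(drop=True)

# Tiny made-up example; the README's values would be 5000, 100000 and 50.
toy = pd.DataFrame({
    "user_id":  [1, 1, 1, 2, 2, 3, 3, 3, 4],
    "anime_id": [10, 20, 30, 10, 20, 10, 20, 40, 10],
    "rating":   [8, 7, 9, 6, 5, 9, 8, 7, 10],
})
subset = select_subset(toy, n_anime=2, n_users=2, min_ratings=3)
print(subset)
```

In the toy data only users 1 and 3 have three or more ratings, and anime 10 and 20 are the two most rated, so the result keeps just their overlapping rows.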

Training

Matrix factorization is at the heart of this algorithm. The 5,000 x 100,000 sparse ratings matrix is decomposed into a 5,000 x 10 'anime_matrix' and a 10 x 100,000 'user_matrix'. These matrices are initialized at random and iteratively updated until their product matches the original ratings matrix at the entries where ratings are known. The product also has values at the entries that were empty, so we have essentially "filled in" the missing ratings.
PyTorch was used in order to leverage the GPU and speed up training significantly. I also used autograd and an optimizer from PyTorch, because why not.
Walkthrough and code can be found in Training.ipynb.
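
To make the idea concrete, here is a small sketch of matrix factorization by gradient descent. It is not the notebook's code: it uses plain NumPy with hand-written gradients on the CPU, where the project uses PyTorch's autograd, an optimizer and the GPU. The function names, the mask convention (1 where a rating is known, 0 otherwise), the L2 regularization term and all hyperparameters are assumptions chosen for the toy example.

```python
import numpy as np

def factorize(ratings, mask, n_features=10, lr=0.01, reg=0.01, epochs=3000, seed=0):
    """Fit anime_matrix @ user_matrix to `ratings` on the entries where mask == 1."""
    rng = np.random.default_rng(seed)
    n_anime, n_users = ratings.shape
    # Small random initialization of both factor matrices.
    anime_matrix = 0.1 * rng.standard_normal((n_anime, n_features))
    user_matrix = 0.1 * rng.standard_normal((n_features, n_users))

    for _ in range(epochs):
        # Error only on known ratings; unknown entries contribute nothing.
        error = mask * (anime_matrix @ user_matrix - ratings)
        # Gradients of 0.5 * sum(error**2) + 0.5 * reg * (squared norms of factors).
        grad_anime = error @ user_matrix.T + reg * anime_matrix
        grad_user = anime_matrix.T @ error + reg * user_matrix
        anime_matrix -= lr * grad_anime
        user_matrix -= lr * grad_user
    return anime_matrix, user_matrix

def masked_rmse(ratings, mask, anime_matrix, user_matrix):
    """Root mean squared error over the known ratings only."""
    diff = mask * (anime_matrix @ user_matrix - ratings)
    return float(np.sqrt((diff ** 2).sum() / mask.sum()))

# Toy problem: a 6 x 8 rank-2 matrix with roughly 70% of entries observed.
rng = np.random.default_rng(1)
toy_ratings = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 8))
toy_mask = (rng.random((6, 8)) < 0.7).astype(float)

anime_matrix, user_matrix = factorize(toy_ratings, toy_mask, n_features=2)
print("RMSE on known ratings:", masked_rmse(toy_ratings, toy_mask, anime_matrix, user_matrix))

# The full product also contains values for the entries that were never observed.
filled_in = anime_matrix @ user_matrix
```

The toy ratings are small numbers centered on zero. Real 1 to 10 scores would probably need centering or scaling, or a different learning rate, before a loop like this behaves well.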

Prediction

When we need to recommend anime to a user who wasn't one of the 100,000 users in the training set, we fetch their profile from MyAnimeList using jikanpy, train a 10 x 1 vector specific to them, and then use that vector to predict their ratings.
The code for this is in Predict.ipynb.
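
A rough sketch of the new-user step, under some assumptions: the trained anime_matrix is kept fixed, the user's ratings have already been fetched and mapped to row indices of anime_matrix (the jikanpy call is left out here), and the vector is fitted with plain NumPy gradient descent instead of PyTorch. The function names fit_new_user and recommend, the regularization term and the hyperparameters are hypothetical.

```python
import numpy as np

def fit_new_user(anime_matrix, rated_idx, rated_scores,
                 lr=0.01, reg=0.01, epochs=2000, seed=0):
    """Train one user's feature vector while anime_matrix stays fixed."""
    rng = np.random.default_rng(seed)
    seen = anime_matrix[rated_idx]                      # rows for anime the user rated
    target = np.asarray(rated_scores, dtype=float).reshape(-1, 1)
    user_vector = 0.1 * rng.standard_normal((anime_matrix.shape[1], 1))

    for _ in range(epochs):
        error = seen @ user_vector - target
        # Gradient of 0.5 * sum(error**2) + 0.5 * reg * ||user_vector||**2.
        user_vector -= lr * (seen.T @ error + reg * user_vector)
    return user_vector

def recommend(anime_matrix, user_vector, rated_idx, top_n=5):
    """Return row indices of the top_n unrated anime by predicted rating."""
    scores = (anime_matrix @ user_vector).ravel()
    scores[rated_idx] = -np.inf                         # never recommend what was already rated
    return np.argsort(scores)[::-1][:top_n]

# Toy check: 30 anime with 3 latent features and a known "true" user vector.
rng = np.random.default_rng(2)
toy_anime_matrix = rng.standard_normal((30, 3))
true_vector = rng.standard_normal((3, 1))
rated_idx = np.arange(15)                               # pretend the user rated the first 15
rated_scores = (toy_anime_matrix[rated_idx] @ true_vector).ravel()

user_vector = fit_new_user(toy_anime_matrix, rated_idx, rated_scores)
picks = recommend(toy_anime_matrix, user_vector, rated_idx, top_n=5)
print(picks)
```

With anime_matrix fixed this is an ordinary regularized least-squares problem, so it could also be solved in closed form. The loop is kept here because the README describes the vector as being trained.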

Things to try out

I don't think I'll work on it anytime soon, but if someone wants to take the project further, these are some first steps:

Add content based recommendation
Figure out some way to get new anime ratings without killing the MAL API :P
Understand and implement some of the best performing submissions from the Netflix Prize contest

Thanks for reading!

greenfish8090/Anime-Recommender-System
