This repository provides tools to import your Discord chat data in JSON format and train a bot to emulate one of the users in a group chat using GPT-2. I made a similar project that does the same thing for Facebook Messenger chats: https://github.com/aoneillmark/Facebook-messenger-chatbot
This is a small project that anyone can put together in an afternoon using an old GPT-2 model. The responses are often very silly and nonsensical, and are easily distinguished from real messages. That said, users of this repository are expected to use it responsibly: never attempt to train a chatbot based on someone else without their permission, and always respect the privacy of others.
I do not condone or support any use of this repository for malicious purposes, and in using this repository you accept all liability for your actions.
- Code/ - Contains all the Python scripts for data preparation and model training.
  - dataset_prep.py - Script for preparing the dataset from the JSON files.
  - gpt2_prompter.py - Script for generating responses using the GPT-2 model.
  - gpt2_trainer.py - Script for training the GPT-2 model.
- Data/ - Directory to place your JSON data files.
- Results/ - Directory where the results of the model training will be stored.
- Download your JSON chat data from Discord (https://github.com/prathercc/discrub-ext)
- Place your Discord chat data in JSON format in the Data/ directory.
- Run the dataset_prep.py script to prepare the data for training.
- Run the gpt2_trainer.py script to train the GPT-2 model.
- Use the gpt2_prompter.py script to generate responses from the trained model.
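
To give a rough idea of what the data preparation step involves, here is a minimal sketch. It is not the code in dataset_prep.py, which may work differently. It assumes each JSON file holds a list of message objects with an `author.username` field and a `content` field (an assumption about the export format, so check your own files), and the function names and the `Results/train.txt` output path are hypothetical.

```python
import json
from pathlib import Path


def extract_pairs(messages):
    """Return (username, content) pairs from a list of message dicts.

    Assumed (not confirmed) layout: msg["author"]["username"] and msg["content"].
    Messages with no author or no text (e.g. image-only posts) are skipped.
    """
    pairs = []
    for msg in messages:
        author = (msg.get("author") or {}).get("username")
        # Collapse newlines and repeated spaces so each message fits on one line.
        content = " ".join((msg.get("content") or "").split())
        if author and content:
            pairs.append((author, content))
    return pairs


def to_training_text(pairs):
    """Format each message as 'username: content', one message per line."""
    return "\n".join(f"{author}: {content}" for author, content in pairs)


def prepare(data_dir="Data", out_file="Results/train.txt"):
    """Read every .json file in data_dir and write one plain-text training file.

    Both paths are hypothetical defaults chosen for this sketch.
    Messages are kept in the order they appear in each file.
    """
    pairs = []
    for json_path in sorted(Path(data_dir).glob("*.json")):
        with open(json_path, encoding="utf-8") as f:
            pairs.extend(extract_pairs(json.load(f)))
    out_path = Path(out_file)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(to_training_text(pairs), encoding="utf-8")
    return len(pairs)
```

Calling `prepare()` from the repository root would then produce a single text file in which every line starts with the speaker's name. Prefixing each line this way is one simple option for letting the model learn who says what, so that a prompt ending in a chosen username nudges it to reply in that user's style.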