A Pokémon sprite generator trained with the DDPM forward process, with the reverse (sampling) process implemented using both DDPM and stochastic DDIM samplers.
The dataset used for training is located at data/archive.zip. It was retrieved from Kaggle's Pokémon Sprites Dataset.
- Size: 10,437 images
- Resolution: 96x96
- Classes: 898 Pokémon across various games
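Loading the sprites for training can be as simple as extracting the archive and pointing torchvision at it. A minimal sketch; the extracted folder layout (`data/pokemon` with class subfolders, as `ImageFolder` expects) is an assumption, so adjust paths to match the actual archive:

```python
import zipfile

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# One-time extraction of the Kaggle archive (target path is an assumption)
zipfile.ZipFile("data/archive.zip").extractall("data/pokemon")

transform = transforms.Compose([
    transforms.Resize((96, 96)),
    transforms.ToTensor(),                       # scale pixels to [0, 1]
    transforms.Normalize([0.5] * 3, [0.5] * 3),  # then to [-1, 1], the usual DDPM input range
])
dataset = datasets.ImageFolder("data/pokemon", transform=transform)
loader = DataLoader(dataset, batch_size=64, shuffle=True)
```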
The model is a U-Net trained with the DDPM forward process for 50 epochs.
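For reference, the DDPM forward process corrupts a clean image x₀ into xₜ in closed form, and each training step regresses the noise that was added. A minimal sketch of one such step; the schedule length `T`, the beta range, and the `model(x, t)` call signature are assumptions, not the notebook's exact code:

```python
import torch
import torch.nn.functional as F

T = 1000                                         # number of diffusion timesteps (assumed)
betas = torch.linspace(1e-4, 0.02, T)            # linear beta schedule from the DDPM paper
alphas_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative product ᾱ_t

def training_step(model, x0):
    """One DDPM training step: noise x0 to a random timestep t, predict the noise."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,), device=x0.device)
    eps = torch.randn_like(x0)
    ab = alphas_bar.to(x0.device)[t].view(b, 1, 1, 1)
    # Closed-form forward process: x_t = sqrt(ᾱ_t) * x_0 + sqrt(1 - ᾱ_t) * ε
    xt = ab.sqrt() * x0 + (1.0 - ab).sqrt() * eps
    return F.mse_loss(model(xt, t), eps)         # simple ε-prediction objective
```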
- Trained Model: The trained `state_dict` is available at `model/pokemon.pth`.
- Code: The full implementation is in `Diffusion_updated.ipynb`.
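To reuse the checkpoint outside the notebook, something along these lines should work; note that `UNet()` is a hypothetical constructor standing in for whatever class the notebook defines:

```python
import torch

model = UNet()  # hypothetical: use the U-Net class defined in Diffusion_updated.ipynb
state = torch.load("model/pokemon.pth", map_location="cpu")
model.load_state_dict(state)
model.eval()    # disable dropout/batch-norm updates for sampling
```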
The generated samples are resized to 64x64 for display purposes.
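The resize itself is a one-liner, for example with `torch.nn.functional.interpolate` (assuming `samples` holds a batch of generated 96x96 tensors):

```python
import torch.nn.functional as F

# Downsample a (N, 3, 96, 96) batch to (N, 3, 64, 64) for display
display = F.interpolate(samples, size=(64, 64), mode="bilinear", align_corners=False)
```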
Below are examples of outputs from the model. Empirically, DDIM sampling tends to produce better results compared to DDPM sampling:
Follow these steps to generate your own Pokémon:
- Clone this repository to your local environment
- Create a Python virtual environment and install torch and matplotlib in it
- Open the folder by running `cd pokemon_generator_diffusion`
- To use the DDPM sampler, run `make generate_ddpm`
- To use the DDIM sampler, run `make generate_ddim`
- Time: The generation process typically takes 1 to 2 minutes, depending on your device.
- Occasional Issues: The model occasionally produces completely white images. If this happens, simply run the generation again.
The model was trained with the DDPM objective, so running DDIM with a sampling interval larger than 1 can produce unstable or noisy outputs: the network never learned to bridge such large jumps between timesteps.
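For context, a deterministic (η = 0) DDIM sampler with a stride looks roughly like the sketch below; `stride` is what "interval" refers to above, and `T` and `alphas_bar` are the same assumed schedule tensors as in the training sketch:

```python
import torch

@torch.no_grad()
def ddim_sample(model, shape, stride=1, device="cpu"):
    """Deterministic DDIM (η = 0); stride > 1 skips timesteps, which is where instability shows up."""
    x = torch.randn(shape, device=device)
    ab = alphas_bar.to(device)
    timesteps = list(range(T - 1, -1, -stride))
    for i, t in enumerate(timesteps):
        t_prev = timesteps[i + 1] if i + 1 < len(timesteps) else -1
        ab_t = ab[t]
        ab_prev = ab[t_prev] if t_prev >= 0 else torch.tensor(1.0, device=device)
        eps = model(x, torch.full((shape[0],), t, device=device, dtype=torch.long))
        # Predict x_0 from the noise estimate, then step straight to the previous timestep
        x0_pred = (x - (1.0 - ab_t).sqrt() * eps) / ab_t.sqrt()
        x = ab_prev.sqrt() * x0_pred + (1.0 - ab_prev).sqrt() * eps
    return x
```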
- Fine-tuning the model with DDIM: Adapt the model to handle larger intervals for faster sampling.
- Experimenting with stochastic DDIM: Improve stability and quality by tuning the stochasticity parameter (η) or experimenting with hybrid sampling methods (see the sketch below).
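For the η experiments, the stochastic DDIM update differs from the deterministic one only in a per-step noise scale σₜ. A hedged sketch of that computation, using the same assumed ᾱ values as above:

```python
import torch

def ddim_sigma(ab_t, ab_prev, eta):
    """Per-step noise scale from the DDIM paper: η = 0 is deterministic DDIM,
    η = 1 recovers DDPM-like variance."""
    return eta * ((1.0 - ab_prev) / (1.0 - ab_t)).sqrt() * (1.0 - ab_t / ab_prev).sqrt()

# Inside the sampling loop, the deterministic update is replaced with:
#   sigma = ddim_sigma(ab_t, ab_prev, eta)
#   x = ab_prev.sqrt() * x0_pred \
#       + (1.0 - ab_prev - sigma**2).sqrt() * eps \
#       + sigma * torch.randn_like(x)
```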


