StableGen is a project that recreates the core components of Stable Diffusion from scratch. The repository includes PyTorch implementations of the VAE, diffusion model, and CLIP model. Each module is built from the ground up and exposes configuration options so it can be adapted to different generative tasks.
- VAE (Variational Autoencoder): Encodes images into a compressed latent representation and decodes latents back into images (sketched in the first example below).
- Diffusion model: A U-Net combined with the diffusion process to generate images from random noise (the forward noising step and training objective are sketched in the second example below).
- CLIP (Contrastive Language–Image Pre-Training): Learns text–image relationships, enabling text-guided image generation (its contrastive objective is sketched in the third example below).
- Full implementation of each core component.
- Training scripts for each component (a minimal training-loop skeleton is sketched in the last example below).
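
The following is a minimal sketch of the VAE idea: encode an image into a latent distribution, sample from it with the reparameterization trick, and decode back to image space. The class name, layer sizes, and latent channel count are illustrative assumptions, not the repository's actual implementation.

```python
# Minimal VAE sketch (illustrative; not the repository's actual architecture).
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, latent_channels: int = 4):
        super().__init__()
        # Downsample 3-channel images to a smaller latent grid, producing
        # a mean and log-variance for each latent element.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.SiLU(),
            nn.Conv2d(32, 2 * latent_channels, kernel_size=3, stride=2, padding=1),
        )
        # Upsample latents back to image space.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_channels, 32, kernel_size=4, stride=2, padding=1),
            nn.SiLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1),
        )

    def encode(self, x: torch.Tensor):
        mean, logvar = self.encoder(x).chunk(2, dim=1)
        # Reparameterization trick: sample latents while keeping gradients.
        z = mean + torch.exp(0.5 * logvar) * torch.randn_like(mean)
        return z, mean, logvar

    def decode(self, z: torch.Tensor) -> torch.Tensor:
        return self.decoder(z)

vae = TinyVAE()
images = torch.randn(2, 3, 64, 64)           # dummy batch
latents, mean, logvar = vae.encode(images)   # latents: (2, 4, 16, 16)
reconstruction = vae.decode(latents)         # (2, 3, 64, 64)
```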
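Next, a sketch of the forward diffusion process with a linear beta schedule and the standard noise-prediction training objective. The schedule endpoints, tensor shapes, and the tiny convolutional stand-in for the U-Net are assumptions for illustration only.

```python
# Forward diffusion and noise-prediction loss (illustrative sketch).
import torch
import torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)              # linear noise schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def add_noise(x0: torch.Tensor, t: torch.Tensor, noise: torch.Tensor) -> torch.Tensor:
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise

# Stand-in for the U-Net: any module mapping a noisy latent to a noise estimate.
# A real U-Net would also be conditioned on the timestep t (and on text).
noise_predictor = nn.Conv2d(4, 4, kernel_size=3, padding=1)

x0 = torch.randn(2, 4, 16, 16)                     # clean latents
t = torch.randint(0, T, (2,))                      # random timesteps
noise = torch.randn_like(x0)
x_t = add_noise(x0, t, noise)                      # noised latents
loss = nn.functional.mse_loss(noise_predictor(x_t), noise)  # training objective
```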
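The third sketch shows CLIP's contrastive objective: project image and text features into a shared embedding space and train each image to match its own caption. The linear "encoders", feature dimensions, and temperature value are placeholders, not the repository's models.

```python
# CLIP-style contrastive objective (illustrative sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

embed_dim = 128
image_encoder = nn.Linear(512, embed_dim)   # stands in for a vision backbone
text_encoder = nn.Linear(256, embed_dim)    # stands in for a text transformer

image_features = torch.randn(8, 512)        # dummy pooled image features
text_features = torch.randn(8, 256)         # dummy pooled text features

# Project into the shared space and L2-normalize.
img = F.normalize(image_encoder(image_features), dim=-1)
txt = F.normalize(text_encoder(text_features), dim=-1)

logits = img @ txt.t() / 0.07               # cosine similarities / temperature
targets = torch.arange(8)                   # the i-th image matches the i-th text
loss = (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2
```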
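Finally, a skeleton of the kind of loop a per-component training script might run. The model, data, and hyperparameters below are placeholders; they only illustrate the optimizer step structure, not the repository's scripts.

```python
# Generic training-loop skeleton (illustrative; placeholders throughout).
import torch
import torch.nn as nn

model = nn.Conv2d(4, 4, kernel_size=3, padding=1)   # placeholder component
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

for step in range(100):
    batch = torch.randn(8, 4, 16, 16)                # placeholder data loader
    target = torch.randn_like(batch)                 # placeholder training target
    loss = nn.functional.mse_loss(model(batch), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```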