
Improving Editability in Image Generation with Layer-wise Memory

[Teaser image]
CVPR 2025

🔥 Latest News

  • Sep. 12, 2025: Code released on GitHub!
  • Sep. 4, 2025: Dataset released on GitHub!
  • May 5, 2025: Paper released on arXiv!
  • Feb. 26, 2025: Paper accepted to CVPR 2025!

TODOs

  • [x] Paper release
  • [x] Benchmark dataset release
  • [x] Code release
  • [ ] Extended benchmark dataset & result release

Overview

This repository provides the official implementation of the CVPR 2025 paper "Improving Editability in Image Generation with Layer-wise Memory." The method improves editability in image generation through a layer-wise memory integrated with the PixArt-alpha pipeline. The code includes an interactive Gradio demo for inpainting and evaluation scripts using CLIP- and LLaVA-based metrics on a multi-edit benchmark dataset.

Key features:

  • Layer-wise inpainting with a custom memory for improved object editing (see the sketch after this list).
  • Support for cross-attention masking and multi-query disentanglement.
  • Evaluation on a custom benchmark with metrics like CLIP score, BLEU, METEOR, and ROUGE.
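
The layer-wise memory lives inside the pipeline code rather than as a standalone module, but the idea is to cache the latent and mask of each completed edit so that later edits can be composited against earlier content instead of regenerating it. A minimal conceptual sketch (class and method names here are illustrative, not the repository's actual API):

    from dataclasses import dataclass, field

    import torch


    @dataclass
    class LayerwiseMemory:
        """Illustrative cache of per-edit latents and masks.

        Each finished edit is stored as a "layer"; later edits composite
        against the remembered layers instead of regenerating them.
        """

        layers: list = field(default_factory=list)  # [(latent, mask), ...]

        def push(self, latent: torch.Tensor, mask: torch.Tensor) -> None:
            # Remember the latent of the region that was just edited.
            self.layers.append((latent.clone(), mask.clone()))

        def composite(self, current: torch.Tensor) -> torch.Tensor:
            # Paste remembered layers onto the current latent, oldest
            # first, so newer edits win where masks overlap.
            out = current.clone()
            for latent, mask in self.layers:
                out = mask * latent + (1 - mask) * out
            return out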

Installation

Prerequisites

  • Python 3.8 or higher
  • CUDA-enabled GPU (for faster inference and evaluation)
  • Git

Setup

  1. Clone the repository:

    git clone https://github.com/SNU-VGILab/improving-editability.git
    cd improving-editability
  2. Create and activate a virtual environment (recommended):

    conda create -n editability python=3.12 -y
    conda activate editability
  3. Install dependencies from requirements.txt:

    pip install -r requirements.txt
  4. Install custom dependencies:

    • Diffusers (from custom fork):
      cd diffusers
      pip install -e .
      cd ..
    • CLIP (from OpenAI):
      pip install "git+https://github.com/openai/CLIP.git@dcba3cb2e2827b4022701e7e1c7d9fed8a20ef1"
  5. Download the benchmark dataset (multi_edit_bench_original_100.json) and place it in the root directory or specify its path via command-line arguments.
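
Once the steps above finish, a short Python session can confirm that the custom diffusers fork, CLIP, and a GPU are all visible (a sanity check, not part of the repository's scripts):

    # Sanity check: core dependencies import and a GPU is visible.
    import torch
    import diffusers  # custom fork installed with `pip install -e .`
    import clip       # OpenAI CLIP installed from GitHub

    print("torch:", torch.__version__)
    print("diffusers:", diffusers.__version__)
    print("CUDA available:", torch.cuda.is_available())
    print("CLIP models:", clip.available_models())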

Usage

Interactive Demo (demo.py)

Launch the Gradio interface for interactive inpainting:

python app/demo.py --GPU_IDX 0 --result_dir ./output
  • Interface Overview:
    • Input a prompt, draw a mask on the sketchpad, and optionally provide an initial image.
    • Adjust parameters like scheduler, guidance scales, vanilla ratio, and more via sliders and dropdowns.
    • Click "Generate" to produce the output image.
  • Example: generate an image using "A red apple on a table" as the prompt and a sketched mask.
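
The sketchpad produces the inpainting mask interactively, but the mask itself is just a binary image: white where the model should generate, black where the image should be kept. To prepare one programmatically instead, a minimal Pillow example (size and coordinates are arbitrary):

    from PIL import Image, ImageDraw

    # Build an inpainting mask: white = region to edit, black = keep.
    mask = Image.new("L", (1024, 1024), 0)        # all-black canvas
    draw = ImageDraw.Draw(mask)
    draw.ellipse((380, 420, 640, 680), fill=255)  # white editable region
    mask.save("apple_mask.png")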

Evaluation (evaluate.py)

Run batch evaluation on generated images:

python eval/evaluate.py --result_dir ./output/ours --dataset_json multi_edit_bench_original_100.json --GPU_IDX 1
  • This script computes CLIP scores (class- and prompt-level, each with standard deviation) and LLaVA-based metrics (BLEU, METEOR with standard deviation, and ROUGE).
  • Results are printed to the console and saved as text files in the result_dir (e.g., clip_scores.txt, bleu_scores.txt).
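
For reference, a CLIP score is the cosine similarity between the CLIP embeddings of an image and a text prompt (some definitions scale it by 100). A stand-alone computation with the OpenAI CLIP package installed above, independent of the repository's exact implementation (the image path and prompt are placeholders):

    import clip
    import torch
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)

    image = preprocess(Image.open("output/sample.png")).unsqueeze(0).to(device)
    text = clip.tokenize(["A red apple on a table"]).to(device)

    with torch.no_grad():
        image_features = model.encode_image(image)
        text_features = model.encode_text(text)

    # Cosine similarity between the normalized embeddings.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    print("CLIP score:", (image_features @ text_features.T).item())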

Generation and Evaluation Pipeline (evaluate.py with generation)

The evaluation script can also generate the images itself using the PixArt pipeline before scoring them. Customize the run via command-line arguments:

python eval/evaluate.py --gpu 0 --dataset multi_edit_bench_original_100.json --result_dir ./output/ours --vanilla_ratio 0.05 --cattn_masking --multi_query_disentanglement --seed 334 --shard 0
  • Generates images in ./output/ours/gen/ and evaluates them automatically.
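
The --shard flag suggests the benchmark can be split across parallel runs. One way to fan the run out over multiple GPUs from Python, assuming --shard selects the i-th slice of the dataset (check eval/evaluate.py for the actual sharding semantics):

    import subprocess

    # Launch one generation+evaluation process per GPU, one shard each.
    procs = []
    for shard, gpu in enumerate([0, 1]):
        cmd = [
            "python", "eval/evaluate.py",
            "--gpu", str(gpu),
            "--dataset", "multi_edit_bench_original_100.json",
            "--result_dir", "./output/ours",
            "--vanilla_ratio", "0.05",
            "--cattn_masking",
            "--multi_query_disentanglement",
            "--seed", "334",
            "--shard", str(shard),
        ]
        procs.append(subprocess.Popen(cmd))

    for p in procs:
        p.wait()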

Project Structure

  • app/: Contains the Gradio demo script (demo.py).
  • configs/: Configuration files (auto-generated in output directories).
  • diffusers/: Custom diffusers library (installed via Git).
  • diffusion/: Custom schedulers (e.g., sa_solver_diffusers.py).
  • eval/: Evaluation scripts and functions.
    • evaluate_functions.py: CLIP and LLAVA evaluation functions.
    • evaluate.py: Main evaluation script with generation pipeline.
    • multiedit_dataset.py: Custom dataset loader for multi-edit JSON.
  • scripts/: Pipeline scripts (e.g., pipeline_pixart_inpaint_with_latent_memory_improved.py).
  • .gitignore: Ignores unnecessary files like caches and outputs.
  • environment.yml: Conda environment file (optional).
  • multi_edit_bench_original_100.json: Benchmark dataset (100 samples).
  • README.md: This documentation.
  • requirements.txt: Filtered list of dependencies.

Dataset

The benchmark dataset (multi_edit_bench_original_100.json) contains 100 samples for multi-object editing evaluation. Each entry includes background prompts, local prompts, bounding boxes, and classes. Place it in the root or specify via --dataset_json.
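
The authoritative schema lives in eval/multiedit_dataset.py; a quick way to inspect the file directly (the field names below are assumptions inferred from the description above, not verified against the loader):

    import json

    with open("multi_edit_bench_original_100.json") as f:
        samples = json.load(f)  # assumed to be a list of 100 entries

    print("samples:", len(samples))

    # Field names are assumptions; see eval/multiedit_dataset.py for
    # the actual keys used by the evaluation pipeline.
    first = samples[0]
    print("background prompt:", first.get("background_prompt"))
    print("local prompts:", first.get("local_prompts"))
    print("bounding boxes:", first.get("bboxes"))
    print("classes:", first.get("classes"))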

Citation

If you find this work useful, please cite our paper:

@inproceedings{dkm2025improving,
  author    = {Kim, Daneul and Lee, Jaeah and Park, Jaesik},
  title     = {Improving Editability in Image Generation with Layer-wise Memory},
  booktitle = {CVPR},
  year      = {2025},
}

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

  • Built on PixArt-alpha and Hugging Face Diffusers.
  • Evaluation uses CLIP (OpenAI) and LLaVA (llava-hf).
  • Thanks to the open-source community for essential libraries.
