Project page: https://pixelarena.reify.ing/project
Web viewer for the results: https://pixelarena.reify.ing/
Setup project:
- Clone the repository: `git clone https://github.com/ifsheldon/mllm-semantic-segmentation.git`
- (Optional) Set up submodules: `git submodule update --init --recursive`
- Install uv: https://docs.astral.sh/uv/getting-started/installation/
- Run `uv sync` to install dependencies.
- Run `uv run poe setup-frontend` to install frontend dependencies.
- (Optional) Install oxen: https://docs.oxen.ai/getting-started/install
- (Optional) Run `oxen clone https://hub.oxen.ai/ifsheldon/mllm-segmentation-data` to get all results.
  - Remember to run `ln -s mllm-segmentation-data/results results` if you need to run the frontend.
Run frontend: `uv run poe run-frontend`
A random subset (500 images) of the CelebAMask-HQ dataset is used for evaluation.
- Images: `eval-set/celeb/images`, 512x512
- Images (150): `eval-set/celeb/images-150`, 512x512, a 150-image subset of the above
- Reference masks: `eval-set/celeb/masks-512`, 512x512
- Upscaled reference masks: `eval-set/celeb/masks-1024`, 1024x1024
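For illustration, here is a minimal Python sketch that pairs each evaluation image with its reference mask under the layout above; the matching file stems and `.png` mask extension are assumptions, not something this repo guarantees.

```python
from pathlib import Path

from PIL import Image

IMAGES = Path("eval-set/celeb/images")
MASKS = Path("eval-set/celeb/masks-512")

# Pair each evaluation image with its reference mask by file stem
# (matching stems and .png masks are assumptions).
for image_path in sorted(IMAGES.glob("*")):
    mask_path = MASKS / f"{image_path.stem}.png"
    if not mask_path.exists():
        continue
    image = Image.open(image_path)
    mask = Image.open(mask_path)
    assert image.size == (512, 512) and mask.size == (512, 512)
```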
Test results should be saved in the `results/celeb` directory. For Gemini- and GPT-generated masks, the naming convention is `<id>.mask.[0-4].{raw.jpeg, raw.png, pred.png}`: `[0-4]` is the attempt index (5 attempts in total), `raw.{jpeg, png}` is the colorful mask image generated by Gemini/GPT, and `pred.png` is the P-mode PNG converted from the colorful raw image.
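As a hedged sketch (not code from this repo), the convention can be consumed like this in Python; the helper name and Pillow usage are illustrative assumptions:

```python
from pathlib import Path

from PIL import Image

RESULTS = Path("results/celeb")
NUM_ATTEMPTS = 5  # attempt indices 0-4, per the naming convention

def load_pred_masks(image_id: str) -> list[Image.Image]:
    """Load every available predicted mask for one image id (hypothetical helper)."""
    masks = []
    for attempt in range(NUM_ATTEMPTS):
        pred_path = RESULTS / f"{image_id}.mask.{attempt}.pred.png"
        if pred_path.exists():
            mask = Image.open(pred_path)
            assert mask.mode == "P"  # pred.png is the converted P-mode PNG
            masks.append(mask)
    return masks
```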
A random subset (150 images) of the COCO dataset is used for evaluation.
- Images (150): `eval-set/coco/images-150`, 512x512
- Reference masks: `eval-set/coco/masks-1024`, 1024x1024
Test results should be saved in the `results/coco` directory, following the same naming convention as the CelebAMask-HQ results above.
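The evaluation code itself is not shown here, but as a sketch of how a predicted mask could be scored against a reference mask, assuming both are P-mode images of the same size that share label indices, a per-class IoU might look like this:

```python
import numpy as np
from PIL import Image

def per_class_iou(pred: Image.Image, ref: Image.Image) -> dict[int, float]:
    """Per-label IoU over palette indices; assumes equal size and shared labels."""
    p = np.asarray(pred)
    r = np.asarray(ref)
    ious = {}
    for label in np.union1d(np.unique(p), np.unique(r)):
        inter = np.logical_and(p == label, r == label).sum()
        union = np.logical_or(p == label, r == label).sum()
        ious[int(label)] = float(inter / union) if union else 0.0
    return ious
```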
Results are tracked by oxen, a version control system for large datasets.