ifsheldon/PixelArena


Data Processing for PixelArena

Project page: https://pixelarena.reify.ing/project

Web viewer for the results: https://pixelarena.reify.ing/

Usage

Set up the project:

  1. Clone the repository: git clone https://github.com/ifsheldon/mllm-semantic-segmentation.git
  2. (Optional) Set up submodules: git submodule update --init --recursive
  3. Install uv: https://docs.astral.sh/uv/getting-started/installation/
  4. Run uv sync to install dependencies.
  5. Run uv run poe setup-frontend to install frontend dependencies.
  6. (Optional) Install oxen: https://docs.oxen.ai/getting-started/install
  7. (Optional) Run oxen clone https://hub.oxen.ai/ifsheldon/mllm-segmentation-data to get all results.
    • Remember to run ln -s mllm-segmentation-data/results results if you need to run the frontend.

Run frontend: uv run poe run-frontend

Eval Set

CelebAMask-HQ

A random subset (500 images) of the CelebAMask-HQ dataset is used for evaluation.

  • Images: eval-set/celeb/images, 512x512
  • Images (150): eval-set/celeb/images-150, 512x512, a subset (150) of the images
  • Reference masks: eval-set/celeb/masks-512, 512x512
  • Upscaled reference masks: eval-set/celeb/masks-1024, 1024x1024

Test results should be saved in the results/celeb directory. Masks generated by Gemini and GPT are named <id>.mask.[0-4].{raw.jpeg, raw.png, pred.png}, where [0-4] is the attempt index (5 attempts in total). raw.jpeg or raw.png is the color mask image as generated by Gemini/GPT, and pred.png is the P-mode (palette-indexed) PNG converted from that raw color mask.
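
To make the naming convention concrete, here is a minimal Python sketch that parses and builds mask file names. It is not code from this repository: MASK_NAME, MaskFile, parse_mask_name, and pred_name are hypothetical names chosen for illustration, and the sketch assumes that <id> may contain any characters.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for <id>.mask.<attempt>.<kind>, where <attempt> is 0-4
# and <kind> is one of raw.jpeg, raw.png, or pred.png.
MASK_NAME = re.compile(
    r"(?P<id>.+)\.mask\.(?P<attempt>[0-4])\.(?P<kind>raw\.jpeg|raw\.png|pred\.png)"
)


class MaskFile(NamedTuple):
    image_id: str  # the <id> part of the file name
    attempt: int   # attempt index, 0 to 4
    kind: str      # "raw.jpeg", "raw.png", or "pred.png"


def parse_mask_name(filename: str) -> Optional[MaskFile]:
    """Return the parsed parts of a mask file name, or None if it does not match."""
    match = MASK_NAME.fullmatch(filename)
    if match is None:
        return None
    return MaskFile(match["id"], int(match["attempt"]), match["kind"])


def pred_name(image_id: str, attempt: int) -> str:
    """Build the file name of the P-mode prediction for one attempt."""
    if not 0 <= attempt <= 4:
        raise ValueError("attempt index must be between 0 and 4")
    return f"{image_id}.mask.{attempt}.pred.png"
```

For example, parse_mask_name("123.mask.0.raw.jpeg") yields the id "123", attempt 0, and kind "raw.jpeg", while a name with attempt index 5 is rejected.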

COCO

A random subset (150 images) of the COCO dataset is used for evaluation.

  • Images (150): eval-set/coco/images-150, 512x512, a subset (150) of the images
  • Reference masks: eval-set/coco/masks-1024, 1024x1024

Test results should be saved in the results/coco directory. Masks generated by Gemini and GPT are named <id>.mask.[0-4].{raw.jpeg, raw.png, pred.png}, where [0-4] is the attempt index (5 attempts in total). raw.jpeg or raw.png is the color mask image as generated by Gemini/GPT, and pred.png is the P-mode (palette-indexed) PNG converted from that raw color mask.
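
Because each image is expected to have 5 attempts, a quick completeness check over a results directory can be useful. The following Python sketch is not part of this repository: missing_attempts and PRED_NAME are hypothetical names, and it assumes the result files sit directly inside the directory (a flat layout). It only reports images that have at least one pred.png.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical pattern for <id>.mask.<attempt>.pred.png with attempt index 0-4.
PRED_NAME = re.compile(r"(?P<id>.+)\.mask\.(?P<attempt>[0-4])\.pred\.png")
EXPECTED_ATTEMPTS = set(range(5))  # attempt indices 0..4


def missing_attempts(results_dir: str) -> dict[str, list[int]]:
    """Map each image id to the attempt indices that have no pred.png yet."""
    seen: dict[str, set[int]] = defaultdict(set)
    for path in Path(results_dir).iterdir():
        match = PRED_NAME.fullmatch(path.name)
        if match:
            seen[match["id"]].add(int(match["attempt"]))
    # Keep only ids whose set of attempts is incomplete.
    return {
        image_id: sorted(EXPECTED_ATTEMPTS - attempts)
        for image_id, attempts in seen.items()
        if attempts != EXPECTED_ATTEMPTS
    }
```

Calling missing_attempts("results/coco") returns an empty dict when every image found has all 5 predictions.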

Results

Results are tracked by oxen, a version control system for large datasets.

About

PixelArena: A benchmark for Pixel-Precision Visual Intelligence
