Description
Is your feature request related to a problem? Please describe.
Add support in guideLLM to benchmark geospatial image-to-image models such as Prithvi and TerraMind.
Unlike LLM-style workloads:
- These models do not consume or produce tokens
- They take images as input and return images as output
- Inference requests are sent to a vLLM server at the /pooling endpoint
From a quick look, guideLLM's benchmarks appear to be token-centric, so the existing metrics do not correctly represent performance for this type of workload. For geospatial models, token metrics are not relevant, which leaves requests per second and request latency as the main metrics.
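For context, a request to the /pooling endpoint carries an image reference rather than a prompt, and the only timing that matters is the end-to-end round trip. Below is a minimal sketch of such a request in Python. The server URL, image URL, and payload fields are illustrative assumptions: the exact request schema is defined by the IO processor plugin serving the model, not by guideLLM.

```python
import time
import requests

# Hypothetical values for illustration only.
VLLM_ENDPOINT = "http://localhost:8000"
IMAGE_URL = "https://example.com/sentinel2_tile.tiff"

# Payload fields are illustrative; the actual schema comes from the
# IO processor plugin serving the geospatial model.
payload = {
    "model": "ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11",
    "data": {
        "data": IMAGE_URL,
        "data_format": "url",
        "out_data_format": "b64_json",
    },
}

start = time.perf_counter()
response = requests.post(f"{VLLM_ENDPOINT}/pooling", json=payload, timeout=300)
latency = time.perf_counter() - start

response.raise_for_status()
print(f"end-to-end latency: {latency:.3f}s")
# The response body carries the output image; there is no token usage to report.
```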
Describe the solution you'd like
As a first step, I would like to:
- Add support for benchmarking via the /pooling endpoint
- Handle cases where input/output data contains no tokens
- Report performance using the following metrics (a minimal aggregation sketch follows after this list):
- requests_per_second (1 request = 1 image)
- request_latency (end-to-end image processing time)
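For illustration, here is a minimal sketch of how these two metrics could be aggregated from per-request wall-clock timings. The function and field names are hypothetical, not existing guideLLM APIs; the 25/75/99 percentiles mirror the ones used in the example command below.

```python
import statistics


def summarize(latencies_s: list[float], total_duration_s: float) -> dict:
    """Aggregate per-request latencies into the two proposed metrics."""
    latencies = sorted(latencies_s)

    def percentile(p: float) -> float:
        # Nearest-rank percentile over the sorted latencies.
        idx = min(len(latencies) - 1, int(p / 100 * len(latencies)))
        return latencies[idx]

    return {
        "requests_per_second": len(latencies) / total_duration_s,  # 1 request = 1 image
        "request_latency_mean_s": statistics.mean(latencies),
        "request_latency_p25_s": percentile(25),
        "request_latency_p75_s": percentile(75),
        "request_latency_p99_s": percentile(99),
    }


# Example: print(summarize(measured_latencies, total_duration_s=500.0))
```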
Additional context
We currently benchmark these models using vLLM’s built-in benchmarking scripts with a custom backend. A prototype implementation is available in this fork.
Example command used today:
python -m vllm.entrypoints.cli.main bench serve \
--base-url ${VLLM_ENDPOINT} \
--dataset-name=custom \
--model ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11 \
--skip-tokenizer-init true \
--endpoint /pooling \
--backend io-processor-plugin \
--metric-percentiles 25,75,99 \
--percentile-metrics e2el \
--dataset-path ./benchmarks/dataset_url_input_india.jsonl \
--num-prompts 5000 \
--request-rate 10 \
--max-concurrency 10