
Conversation

@planetf1

@planetf1 planetf1 commented Jan 20, 2026

Misc PR

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

Adds pytest markers to categorize tests by backend, resource requirements, and test type.

Key benefit: Run pytest with zero config—tests auto-skip based on your system (Ollama running? GPU? API keys?) with clear messages about what's missing.

Advanced: Target specific tests: pytest -m "ollama" or pytest -m "not (requires_gpu or requires_api_key)"
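
For illustration, a test opted into this system might look like the following sketch (the marker name is registered by this PR; the test body is hypothetical):

```python
import pytest

# The "ollama" marker is registered in pyproject.toml by this PR;
# test/conftest.py auto-skips the test when nothing is listening on
# the Ollama port.
@pytest.mark.ollama
def test_ollama_roundtrip():
    ...  # hypothetical test body
```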

Auto-Detection

| Check    | Method             | Threshold            |
|----------|--------------------|----------------------|
| Ollama   | Port 11434 socket  | Must be listening    |
| API Keys | Env vars           | Must exist           |
| GPU      | torch (CUDA/MPS)   | Must be available    |
| RAM      | psutil             | 48GB for heavy tests |
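
A minimal sketch of how these checks can be implemented (function names are illustrative, not necessarily those used in test/conftest.py):

```python
import os
import socket

import psutil  # added as a dependency by this PR


def ollama_listening(host: str = "127.0.0.1", port: int = 11434) -> bool:
    # A successful TCP connect means the Ollama service is listening.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0


def api_key_present(var: str) -> bool:
    # e.g. var="WATSONX_API_KEY"; checks existence only, not validity.
    return bool(os.environ.get(var))


def gpu_available() -> bool:
    try:
        import torch
        return torch.cuda.is_available() or torch.backends.mps.is_available()
    except ImportError:
        return False


def enough_ram(required_gb: float = 48.0) -> bool:
    return psutil.virtual_memory().total / 1024**3 >= required_gb
```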

Usage

```
pytest                                    # Auto-skips based on system
pytest -m "not llm"                       # Fast unit tests only
pytest -m "ollama"                        # Ollama tests only
pytest --ignore-gpu-check                 # Override (for debugging)
```

Implementation

Markers: ollama, openai, watsonx, huggingface, vllm, requires_api_key, requires_gpu, requires_heavy_ram, qualitative, llm

Files:

  • pyproject.toml (psutil dep + markers)
  • test/conftest.py (auto-detection + skip logic)
  • test/MARKERS_GUIDE.md (430-line guide)
  • README.md / docs/tutorial.md (links)
  • 4 test files (example markers)
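
One common way to wire markers to skips, shown here as a sketch building on the detection helpers above (not necessarily the PR's exact code):

```python
# test/conftest.py (sketch)
import pytest


def pytest_collection_modifyitems(config, items):
    # Compute each capability once per session; map marker -> (ok, reason).
    checks = {
        "ollama": (ollama_listening(), "Ollama not listening on port 11434"),
        "requires_gpu": (gpu_available(), "no CUDA/MPS GPU available"),
        "requires_heavy_ram": (enough_ram(48.0), "Insufficient RAM (< 48GB)"),
    }
    for item in items:
        for marker, (ok, reason) in checks.items():
            if marker in item.keywords and not ok:
                item.add_marker(
                    pytest.mark.skip(reason=f"Skipping test: {reason}")
                )
```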

Notes:

See test/MARKERS_GUIDE.md for complete documentation.

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

Testing notes: Added markers to 4 test files; auto-detection tested via existing suite; all pre-commit hooks passed.

…g#322)

Implements comprehensive marker system to categorize tests by backend,
resource requirements, and test type. Auto-detects system capabilities
(Ollama, API keys, GPU, RAM) and skips tests with clear messaging.

- Add backend markers: ollama, openai, watsonx, huggingface, vllm
- Add capability markers: requires_api_key, requires_gpu, requires_heavy_ram
- Auto-detection: Ollama (port 11434), GPU (CUDA/MPS), RAM (48GB threshold)
- CLI overrides: --ignore-{gpu,ram,ollama,api-key}-check (see the sketch after this list)
- Add comprehensive test/MARKERS_GUIDE.md documentation
- Link guide from README.md and docs/tutorial.md
- Extends existing CI optimizations from generative-computing#293 (preserves qualitative skip)
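
For the CLI overrides, option registration in conftest.py could look like this sketch (the flag names come from the commit message; the implementation is illustrative):

```python
# conftest.py (sketch)
def pytest_addoption(parser):
    for name in ("gpu", "ram", "ollama", "api-key"):
        parser.addoption(
            f"--ignore-{name}-check",
            action="store_true",
            default=False,
            help=f"Run tests even if the {name} availability check fails",
        )
```

The skip logic would then consult config.getoption("--ignore-gpu-check") (and its siblings) before adding a skip marker.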
@github-actions
Contributor

The PR description has been updated. Please fill out the template for your PR to be reviewed.

@mergify

mergify bot commented Jan 20, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:

@planetf1
Author

I started the RAM check at 32GB, but even after extensive cleanup on my MacBook M1 I couldn't run the PyTorch tests, so I raised it to 48GB, which I cannot test locally. Changes to the model or quantization could also reduce the footprint (I could try adding tests with a smaller model in a follow-up issue/PR if it helps?), so the need for this threshold may change, but I think it expresses a clear, developer-friendly intent.

@planetf1 planetf1 marked this pull request as ready for review January 20, 2026 14:26
@planetf1
Author

planetf1 commented Jan 20, 2026

Example output from uv run pytest -v

test/backends/test_huggingface.py::test_adapters SKIPPED (Skipping test: Insufficient RAM (32.0GB < ...) [  0%]

or

test/backends/test_litellm_watsonx.py::test_generate_from_raw SKIPPED (Skipping test: watsonx API ke...) [ 11%]

Currently the detection code doesn't check that the required models are available.

Keeping this in draft while I verify execution, at least on my local 32GB Mac.

@planetf1 planetf1 marked this pull request as draft January 20, 2026 14:43
- Add markers to 7 more backend test files (litellm_ollama, openai_ollama,
  vision_ollama, vision_openai, litellm_watsonx, vllm_tools, huggingface_tools)
- Add note that Ollama check only verifies service, not loaded models
- Add fixture to normalize OLLAMA_HOST from 0.0.0.0 to 127.0.0.1 for client connections
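
A sketch of what that OLLAMA_HOST normalization fixture could look like (the fixture name is hypothetical):

```python
import os

import pytest


@pytest.fixture(autouse=True)
def normalize_ollama_host(monkeypatch):
    # 0.0.0.0 is a server bind address, not a client-connectable one;
    # point clients at loopback instead.
    host = os.environ.get("OLLAMA_HOST", "")
    if "0.0.0.0" in host:
        monkeypatch.setenv("OLLAMA_HOST", host.replace("0.0.0.0", "127.0.0.1"))
```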
Adds xgrammar to hf optional dependencies to support constrained decoding
in granite_common. Fixes test_answerability failure.
Adds pytest markers to 5 stdlib test files based on model requirements:

Heavy RAM tests (8B+ models):
- test_spans.py: granite-3.3-8b (huggingface, gpu, heavy_ram)
- test_think_budget_forcing.py: gpt-oss:20b (ollama, heavy_ram)

GPU-only tests (4B models):
- test_rag.py: granite-4.0-micro (huggingface, gpu)

Lightweight tests (1B models):
- test_sofai_sampling.py: llama3.2:1b (ollama)
- test_sofai_graph_coloring.py: llama3.2:1b (ollama)
Adds markers to 3 test files using large Ollama models:
- test_genslot.py: granite3.3:8b (22GB, unquantized)
- test_component_typing.py: granite3.3:8b (22GB, unquantized)
- test_think_budget_forcing.py: gpt-oss:20b (adds requires_gpu)

The granite3.3:8b Ollama model is 22GB (full precision), requiring
both GPU and heavy RAM despite being only 8B parameters.
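
Applied at module level, these markers can be combined in a pytestmark list; a sketch of what a file like test_genslot.py might declare:

```python
import pytest

# granite3.3:8b at full precision (~22GB) needs both a GPU and heavy RAM.
pytestmark = [
    pytest.mark.ollama,
    pytest.mark.requires_gpu,
    pytest.mark.requires_heavy_ram,
]
```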
@planetf1
Author

With the current configuration I was able to run uv run pytest -v and all tests succeeded. Some were close on resources (macOS, ARM M1, 32GB), in part because Ollama leaves models loaded, but it's a starting point. Some further refinements/test changes may make the experience smoother.

@planetf1 planetf1 marked this pull request as ready for review January 20, 2026 17:25


Development

Successfully merging this pull request may close these issues.

tests: qualitative vs LLM usage
