AI VIsion Library for Robot Framework that verifies UI/screenshots (including template “look & feel”) by sending instructions plus one or more images to an OpenAI-compatible API (Ollama, OpenAI, Perplexity, Gemini, etc.).
The main keyword (Verify That) expects the model to return a strict RESULT: / EXPLANATION: format and will fail the test if the result is not pass.
- Visual assertions on one or more screenshots using natural-language instructions.
- Optional non-image attachments (txt, pdf, source, log) included as context for assertions.
- Template comparison keyword to validate “actual vs expected” look & feel (optionally creates a side-by-side image).
- Image utilities built on Pillow: open/convert, watermark, combine images, auto-generate names, save into Robot Framework output directory.
- Configurable AI system prompt for enforcing response format and verification behavior.
- Works with multiple providers via the
openaiPython client and OpenAI-compatible endpoints (base_url).
Keywords documentation can be found here.
Install from PyPI (once published):
pip install -U robotframework-aivisionRuntime dependencies include Robot Framework, Pillow, and the openai Python client.
Import the library in Robot Framework and choose a provider using platform plus optional overrides (base_url, api_key, model, image_detail).
Default (Ollama-like local setup):
*** Settings ***
Library AIVisionOpenAI (API key required):
*** Settings ***
Library AIVision
... platform=OpenAI
... api_key=%{OPENAI_API_KEY}
... model=gpt-5.2Perplexity:
*** Settings ***
Library AIVision
... platform=Perplexity
... api_key=%{PPLX_API_KEY}
... model=sonar-proGemini (OpenAI-compatible endpoint):
*** Settings ***
Library AIVision
... platform=Gemini
... api_key=%{GEMINI_API_KEY}
... model=gemini-2.5-flashYou can override the default AI system prompt that enforces the strict RESULT: / EXPLANATION: response format.
*** Settings ***
Library AIVision
... platform=OpenAI
... api_key=%{OPENAI_API_KEY}
... system_prompt=You are a test automation assistant. Return exactly: RESULT: pass|fail and EXPLANATION: ...The library defines these platform presets (model and base_url) which you can override via import arguments.
| Platform | Default base_url |
Default model | API key |
|---|---|---|---|
| Ollama | http://localhost:11434/v1 |
qwen3-coder:480b-cloud |
Not required |
| DockerModel | http://localhost:12434/engines/v1 |
ai/qwen3-vl:8B-Q8_K_XL |
Not required. |
| OpenAI | https://api.openai.com/v1 |
gpt-5.2 |
Required. |
| Perplexity | https://api.perplexity.ai |
sonar-pro |
Required. |
| Gemini | https://generativelanguage.googleapis.com/v1beta/openai/ |
gemini-2.5-flash |
Required. |
| Manual | None |
None |
Required. |
All keywords below are implemented in AIVision and are available after importing the library.
| Keyword | Purpose |
|---|---|
Verify That |
Send one or more screenshots and/or file attachments with instructions to the model, parse the RESULT and raise AssertionError on failure. |
Verify Screenshot Matches Look And Feel Template |
Compare a screenshot against a reference template with a built-in instruction set; optional combined image creation. |
Open Image |
Open an image (and optionally convert mode, default RGB). |
Save Image |
Save a PIL image to a path (defaults to RF output directory) with optional watermark. |
Generate Image Name |
Create a unique timestamp-based filename with prefix/extension. |
Combine Images On Paths Side By Side |
Combine two image files side-by-side (optionally watermark) and optionally save. |
Combine Images Side By Side |
Combine two in-memory PIL images side-by-side (optionally watermark). |
Add Watermark To Image |
Add watermark text using the included font file. |
*** Settings ***
Library AIVision platform=Ollama
*** Test Cases ***
Login button is correct
Verify That ${CURDIR}/screens/login.png Login button is visible and labeled as 'Sign In'*** Settings ***
Library AIVision
*** Test Cases ***
UI matches log output
@{files} = Create List ${CURDIR}/screens/home.png ${CURDIR}/logs/ui.log
Verify That ${files} The log mentions the same banner text shown on the page.Note: For PDF attachments, the library will extract text using PyMuPDF (fitz). If no text is found and a vision model is in use, it will try to render PDF pages as images using PyMuPDF (fitz); otherwise the PDF is included as base64 text. All other non-image attachments are read as text by default (including unknown extensions).
*** Settings ***
Library AIVision
*** Test Cases ***
Home page matches template
Verify Screenshot Matches Look And Feel Template
... ${CURDIR}/screens/home_actual.png
... ${CURDIR}/templates/home_expected.png*** Settings ***
Library AIVision
*** Test Cases ***
Home page matches template - custom rules
Verify Screenshot Matches Look And Feel Template
... ${CURDIR}/screens/home_actual.png
... ${CURDIR}/templates/home_expected.png
... override_instructions=Verify layout, spacing, typography, and brand colors match the template exactly.