Skip to content

Conversation

@mbarlow12
Copy link

@mbarlow12 mbarlow12 commented May 5, 2025

Note

This feature was created from the rmi-pip-install branch and should be reviewed/merged after #7 has been completed. Similarly, this can be edited/merged as needed in @mbarlow12 's absence.

Description

This PR introduces an attribute-auditing framework for OPGEE models.
Auditing can now be enabled at four granularities (none, field, processes, all) and:

  • Reports the provenance (input, static_default, smart_default, unknown) of every field‐level attribute.
  • Optionally writes a PNG process diagram for the audited field.
  • Persists audit data (field_audit.csv) alongside existing result artifacts.

Motivation and Context

Model transparency is critical when reconciling results with known external data and for building an understanding of the underlying model. By recording where every final attribute value originates—and, when requested, how it flows through the process graph—users can:

  • Validate inputs and defaults.
  • Debug unexpected results more quickly.
  • Document and explain unexpected outputs without (or with less) manual tracing

Types of changes (check all that apply)

  • Bug fix (non-breaking)
  • New feature (non-breaking)
  • Docs/comments
  • Test

Change Summary

  • Core

    • opgee/audit.py (new):
      • AuditFlag, AuditRow, level translation/validation helpers.
      • _generate_field_audit_report() inspects AttrDefs, original XML, smart-default registry, etc.
      • audit_field() orchestrates reporting and diagram generation.
    • opgee/field.py – add audit_data slot to FieldResult.
    • opgee/manager.py
      • Wire auditing into _run_field and save_results().
      • Consolidate audit_dfs, write field_audit.csv only when data present.
      • Catch correct dask.distributed.TimeoutError.
    • opgee/built_ins/run_plugin.py – propagate -o/--output-dir to global config so diagrams save in the right place.
    • opgee/etc/system.cfg – new OPGEE.AuditLevel setting with inline documentation.
  • Tests

    • tests/test_audit.py (>150 LOC) covers:
      • Source classification accuracy.
      • Behaviour of each audit level.
      • CSV / PNG persistence.
      • Audit execution even when Field.run raises.
    • Added tests/files/audit_model.xml and minor additions to tests/files/test_model.xml.
  • Misc

    • Pin ruff >=0.11.8 in dev extras.
    • Trailing newline fixes in two table CSVs.
    • Minor refactor / doc tweaks across commits.

Additional notes

  • The audit pipeline is opt-in; default config remains None, so existing workflows are unaffected.
  • PNG generation leverages existing write_process_diagram()—no extra dependencies introduced.
  • When running via CLI the new behaviour is automatically triggered by setting
    OPGEE.AuditLevel in opgee.cfg or with --set OPGEE.AuditLevel=Field (etc.).

Commit list

  • 6659768 (2025-05-04) fix: Ensure audit data saved in correct scenarios
    • Skip when audit_data empty
    • Add tests for audit levels and failure cases
    • Save if Field.run fails
  • f3a52cf (2025-05-04) test(audit): fix missing imports, add run subcommand assertions
  • 437aa57 (2025-05-04) Merge branch 'rmi-pip-install' into xml-audit
  • bdd55b3 (2025-05-01) feat: Improve audit functionality and add ruff to dev dependencies
    • Fix audit level string comparison
    • Add tests for audit functionality
    • Update ruff dependency in pyproject.toml
    • Improve audit field function
    • Add test for audit save results
  • a0e6518 (2025-04-30) revert manager formatting
  • 1d9cfe7 (2025-04-30) revert autoformatting, unused functions, and unnecessary type hinting
  • 6d62dfd (2025-04-30) initial passing audit tests, save audit data to csv
  • 9e2dbeb (2025-04-30) Merge branch 'rmi-pip-install' into xml-audit
  • e28a3de (2025-04-22) feat: Enhance field auditing with configurable levels and process diagram
    • Add audit level translation and validation
    • Support field and process auditing
    • Generate process diagram for audited fields
    • Improve error handling and logging
  • 6113070 (2025-04-16) refactor: Simplify audit module and improve type safety
  • d10053d (2025-04-16) test: add model for audit testing
  • ef5f0e3 (2025-04-11) test(audit): add initial tests for audit feature
    • use rst for docstrings
  • ed18c5c (2025-04-11) feat: add audit functionality for tracking attribute sources in field models

…gram

- Add audit level translation and validation
- Support field and process auditing
- Generate process diagram for audited fields
- Improve error handling and logging
- Fix audit level string comparison
- Add tests for audit functionality
- Update ruff dependency in pyproject.toml
- Improve audit field function
- Add test for audit save results
- Skip when audit_data empty
- Add tests for audit levels and failure cases
- Save if `Field.run` fails
Options shown in system config didn't match the actual strings used in code. And added a warning that currently there's an issue with the "Processes" auditing level
@lloyd-rmi
Copy link

Tests are passing through GitHub Actions, but are a few are failing for me related to graphing. Might be related to the issues I'm having running the "Processes" auditing locally as well.

Copy link

@lloyd-rmi lloyd-rmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm running into some issues locally, but I am able to get the basic Field auditing functionality to work, which is the main thing I want to focus on. Would need to understand why Processes is running into issues for me and confirm if it's an actual bug or just something that I'm missing locally. But that can be worked on after this change.


if attr_name in original_attrs:
source = "input"
elif attr_name in SmartDefault.registry:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just making sure I understand the logic: all provided attributes are captured in the original_attrs variable so then if an attribute shows up outside of that, we can check against the registry if it was possible to supply it with a smart default, correct? So the order of operations matters here.

@lloyd-rmi lloyd-rmi merged commit 757e436 into main May 9, 2025
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants