Refactor: Migrate data models and collections to Pydantic #7

Open
osama-ata wants to merge 1 commit into master from feature/pydantic-migration

@osama-ata (Owner)

This commit completes the migration of the XER parser's data handling to use Pydantic for all data models and their collections.

Key changes include:

  • Core Models (src/xer_parser/model/classes/*.py): All classes
    (e.g., Account, Task, Project, WBS, Rsrc, TaskPred) now inherit
    from pydantic.BaseModel.

    • Manual __init__ methods replaced with Pydantic fields, providing
      automatic type validation, coercion, and default values.
    • XER data types (strings, integers, floats, dates, booleans as 'Y'/'N')
      are mapped to appropriate Python types (str, int, float, datetime,
      Optional[str] for flags).
    • get_tsv() methods updated to use model_dump() and correctly
      format data for XER output, including datetime objects and None values.
    • Removed static obj_list and related class methods for instance tracking.
  • Collection Classes (src/xer_parser/model/*.py): All collection
    classes (e.g., Accounts, Tasks, Projects) updated.

    • add() methods now use ModelClass.model_validate(params) for robust
      Pydantic model instantiation.
    • A data_context (referencing the main Data object) is passed to
      model instances, enabling relationship navigation via properties
      (e.g., task.calendar, project.activities).
  • Reader (src/xer_parser/reader.py):

    • The central Data class (src/xer_parser/model/classes/data.py)
      is now a Pydantic model, managing all collections with
      default_factory and setting up data_context via model_post_init.
    • Reader.create_object() simplified, delegating validation and type
      conversion to Pydantic models via collection add() methods.
    • Removed manual type conversions (e.g., locale.atof).
  • Writer (src/xer_parser/write.py):

    • Adapted to accept the Pydantic Data model instance.
    • Accesses collections via attributes of the Data object (e.g.,
      data_obj.tasks) and calls item.get_tsv() on Pydantic models.
  • Utilities & Analysis:

    • XerExplorer (tools/explorer.py) now correctly interacts with
      Pydantic models, benefiting from typed attributes.
    • DCMA14 analysis (dcma14/analysis.py) updated to use direct
      attribute access on Pydantic models, removing redundant type
      conversions and improving accuracy.
  • Dependencies:

    • Added pydantic>=2.0 as a core dependency in pyproject.toml.
  • Tests (tests/*.py):

    • Updated tests/test_reader.py to assert Pydantic model types,
      attribute types, and data context propagation.
    • Added tests for Pydantic ValidationError on malformed XER input.
    • (Other tests like test_writer.py, test_dcma14.py may require
      follow-up adjustments to fully align with Pydantic inputs/outputs if
      their test data setup or assertions relied on pre-Pydantic structures).
  • Documentation (docs/):

    • installation.rst updated to list Pydantic as a dependency.
    • api_reference.rst reviewed to ensure Pydantic models are documented.
    • Example snippets in getting_started.rst and examples.rst updated
      to reflect Pydantic model usage (e.g., direct typed attribute access).

This migration enhances data integrity, type safety, and developer experience by leveraging Pydantic's robust validation and modeling capabilities.
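
A minimal sketch of the pattern described above, assuming Pydantic v2. The `Task` field names, the `"%R"` row prefix, and the date format are illustrative assumptions rather than the code in this PR, and `Tasks` is written as a plain class for brevity even though the real collections may differ.

```python
from datetime import datetime
from typing import Any, Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class Task(BaseModel):
    """Hypothetical, trimmed-down model; real XER tasks have many more fields."""

    task_id: int
    task_name: str
    target_drtn_hr_cnt: Optional[float] = None
    target_start_date: Optional[datetime] = None
    # Back-reference to the owning Data object; excluded from model_dump().
    data_context: Any = Field(default=None, exclude=True, repr=False)

    def get_tsv(self) -> list:
        """Format the dumped fields as strings for one output row."""
        row = ["%R"]  # assumed row marker
        for value in self.model_dump().values():
            if value is None:
                row.append("")
            elif isinstance(value, datetime):
                row.append(value.strftime("%Y-%m-%d %H:%M"))
            else:
                row.append(str(value))
        return row


class Tasks:
    """Collection that validates raw dicts and attaches the data context."""

    def __init__(self) -> None:
        self._tasks = []
        self.data_context = None

    def add(self, params: dict) -> Task:
        task = Task.model_validate(params)  # raises ValidationError on bad input
        task.data_context = self.data_context
        self._tasks.append(task)
        return task

    def __iter__(self):
        return iter(self._tasks)


class Data(BaseModel):
    """Central container; each instance gets its own fresh collection."""

    model_config = ConfigDict(arbitrary_types_allowed=True)

    tasks: Tasks = Field(default_factory=Tasks)

    def model_post_init(self, context: Any) -> None:
        # Wire the collection back to this Data instance after construction.
        self.tasks.data_context = self


data = Data()
task = data.tasks.add(
    {
        "task_id": "101",  # string input is coerced to int
        "task_name": "Pour slab",
        "target_drtn_hr_cnt": "8",
        "target_start_date": "2025-05-26T08:00:00",
    }
)
print(task.get_tsv())

try:
    data.tasks.add({"task_id": "not-a-number", "task_name": "Bad row"})
except ValidationError as exc:
    print(f"rejected with {exc.error_count()} error(s)")
```

Marking `data_context` with `exclude=True` keeps the back-reference out of `model_dump()`, so it never reaches the TSV output, and `repr=False` avoids a recursive `repr` through the `Data` object.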

@osama-ata requested a review from Copilot on May 26, 2025 23:11

Copilot AI left a comment

Pull Request Overview

This PR refactors the XER parser’s data models to leverage Pydantic, enhancing type validation and maintainability while migrating collection and model classes. Key changes include:

  • Updating core model classes and collection classes to inherit from Pydantic’s BaseModel and use Field definitions.
  • Revising helper methods (e.g. get_tsv(), get_id(), and model_post_init()) to integrate Pydantic’s model_dump() and model validation methods.
  • Adjusting related documentation and dependency configuration to support Pydantic >=2.0.

Reviewed Changes

Copilot reviewed 55 out of 55 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/xer_parser/model/classes/pcatval.py | Converted to a Pydantic model with updated field types and added a generic "data" field. |
| src/xer_parser/model/classes/pcattype.py | Updated to inherit from BaseModel, with Field aliases and removal of legacy obj_list. |
| src/xer_parser/model/classes/obs.py | Refactored as a Pydantic model; streamlined get_tsv() and `__repr__` methods. |
| src/xer_parser/model/classes/nonwork.py | Migrated to Pydantic with type conversions and standardized get_tsv() implementation. |
| src/xer_parser/model/classes/fintmpl.py | Updated to use Pydantic with field validations and consolidated get_tsv() formatting. |
| src/xer_parser/model/classes/data.py | Central Data model now uses default_factory for collections and sets up data context using model_post_init. |
| src/xer_parser/model/classes/currency.py | Converted to Pydantic, standardizing field types and string conversions in get_tsv(). |
| src/xer_parser/model/classes/calendar_data.py | Expanded parsing logic using Pydantic, regex improvements, and post-init field population. |
| src/xer_parser/model/classes/calendar.py | Refactored Calendar model with updated type conversions and improved TSV formatting. |
| src/xer_parser/model/classes/acttype.py | Transitioned to a Pydantic model with simplified field definitions. |
| src/xer_parser/model/classes/activitycode.py | Refactored to leverage Pydantic model validation and updated property/method implementations. |
| src/xer_parser/model/classes/account.py | Simplified conversion to Pydantic and standardized TSV output. |
| src/xer_parser/model/calendars.py | Updated collection class to use Pydantic model instances and improved iteration and TSV export. |
| src/xer_parser/model/acttypes.py | Migrated ActTypes collection to use Pydantic models with updated add() and iteration methods. |
| src/xer_parser/model/activityresources.py | Refactored collection handling for TaskRsrc models, including data context propagation. |
| src/xer_parser/model/activitycodes.py | Revised ActivityCodes collection to leverage Pydantic and consistent TSV generation. |
| src/xer_parser/model/accounts.py | Updated Accounts collection to use Pydantic models and improved iteration and finder methods. |
| pyproject.toml | Added dependency on Pydantic (>=2.0) as a core project requirement. |
| docs/source/installation.rst | Updated dependency list to include Pydantic (>=2.0). |
| docs/source/api_reference.rst | Enhanced API documentation to automatically document the central Data model. |

Comments suppressed due to low confidence (1)

src/xer_parser/model/classes/pcatval.py:13

  • The use of a generic field name 'data' may reduce clarity. Consider renaming it to 'data_context' or a more descriptive name to better indicate that it stores contextual information.
data: Any = Field(default=None, exclude=True) # Standard data field

Comment on lines +52 to 61

            self.index = 0  # Reset index for each new iteration
            return self

        def __next__(self) -> Calendar:
    -       if self.index >= len(self._calendars):
    +       if self.index < len(self._calendars):
                result = self._calendars[self.index]
                self.index += 1
                return result
            else:
                raise StopIteration

Copilot AI May 26, 2025

Rather than manually resetting and maintaining an index for iteration, consider implementing `__iter__` as `return iter(self._calendars)` to simplify the code and reduce potential iteration issues.
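
The suggestion above can be sketched as follows. This is an illustration only: the `Calendar` stand-in, its `clndr_id` and `clndr_name` fields, and the `add()` signature are hypothetical, not the classes in this PR.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Calendar:
    """Stand-in for the real Calendar model (fields are assumptions)."""

    clndr_id: int
    clndr_name: str


class Calendars:
    """Collection that delegates iteration to its underlying list."""

    def __init__(self) -> None:
        self._calendars: List[Calendar] = []

    def add(self, calendar: Calendar) -> None:
        self._calendars.append(calendar)

    def __iter__(self) -> Iterator[Calendar]:
        # Each call returns a fresh, independent list iterator, so there is
        # no shared index to reset and no __next__ method to maintain.
        return iter(self._calendars)

    def __len__(self) -> int:
        return len(self._calendars)


calendars = Calendars()
calendars.add(Calendar(1, "Standard 5-day"))
calendars.add(Calendar(2, "7-day"))

# Nested loops over the same collection do not interfere with each other.
pairs = [(a.clndr_id, b.clndr_id) for a in calendars for b in calendars]
print(pairs)
```

With a single shared `self.index`, a nested or interrupted loop over the same collection can skip items or end early; independent iterators avoid that class of bug.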
