Refactor: Migrate data models and collections to Pydantic #7
Open
This commit completes the migration of the XER parser's data handling
to use Pydantic for all data models and their collections.
Key changes include:
- **Core Models (`src/xer_parser/model/classes/*.py`):** All classes (e.g., Account, Task, Project, WBS, Rsrc, TaskPred) now inherit from `pydantic.BaseModel`.
  - Manual `__init__` methods replaced with Pydantic fields, providing automatic type validation, coercion, and default values.
  - XER data types (strings, integers, floats, dates, booleans as 'Y'/'N') are mapped to appropriate Python types (str, int, float, datetime, Optional[str] for flags).
  - `get_tsv()` methods updated to use `model_dump()` and correctly format data for XER output, including datetime objects and None values.
  - Removed static `obj_list` and related class methods for instance tracking.
- **Collection Classes (`src/xer_parser/model/*.py`):** All collection classes (e.g., Accounts, Tasks, Projects) updated.
  - `add()` methods now use `ModelClass.model_validate(params)` for robust Pydantic model instantiation.
  - A `data_context` (referencing the main `Data` object) is passed to model instances, enabling relationship navigation via properties (e.g., `task.calendar`, `project.activities`).
- **Reader (`src/xer_parser/reader.py`):**
  - The central `Data` class (`src/xer_parser/model/classes/data.py`) is now a Pydantic model, managing all collections with `default_factory` and setting up `data_context` via `model_post_init`.
  - `Reader.create_object()` simplified, delegating validation and type conversion to Pydantic models via collection `add()` methods.
  - Removed manual type conversions (e.g., `locale.atof`).
- **Writer (`src/xer_parser/write.py`):**
  - Adapted to accept the Pydantic `Data` model instance.
  - Accesses collections via attributes of the `Data` object (e.g., `data_obj.tasks`) and calls `item.get_tsv()` on Pydantic models.
- **Utilities & Analysis:**
  - `XerExplorer` (`tools/explorer.py`) now correctly interacts with Pydantic models, benefiting from typed attributes.
  - `DCMA14` analysis (`dcma14/analysis.py`) updated to use direct attribute access on Pydantic models, removing redundant type conversions and improving accuracy.
- **Dependencies:**
  - Added `pydantic>=2.0` as a core dependency in `pyproject.toml`.
- **Tests (`tests/*.py`):**
  - Updated `tests/test_reader.py` to assert Pydantic model types, attribute types, and data context propagation.
  - Added tests for Pydantic `ValidationError` on malformed XER input.
  - Other tests such as `test_writer.py` and `test_dcma14.py` may require follow-up adjustments to fully align with Pydantic inputs/outputs if their test data setup or assertions relied on pre-Pydantic structures.
- **Documentation (`docs/`):**
  - `installation.rst` updated to list Pydantic as a dependency.
  - `api_reference.rst` reviewed to ensure Pydantic models are documented.
  - Example snippets in `getting_started.rst` and `examples.rst` updated to reflect Pydantic model usage (e.g., direct typed attribute access).
This migration enhances data integrity, type safety, and developer experience
by leveraging Pydantic's robust validation and modeling capabilities.
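
For reviewers who want a concrete picture of the model and collection pattern described above, here is a minimal sketch. It is not the code in this PR: the field names on `Task`, the `%R` row prefix returned by `get_tsv()`, the date format string, and the shape of the `Tasks` collection are all assumptions chosen for illustration. Only `BaseModel`, `Field`, `model_validate()`, `model_dump()`, and `ValidationError` are real Pydantic v2 APIs.

```python
from datetime import datetime
from typing import Any, Optional

from pydantic import BaseModel, Field, ValidationError


class Task(BaseModel):
    # Hypothetical subset of columns; the real model defines many more.
    task_id: int
    task_name: str
    target_drtn_hr_cnt: Optional[float] = None
    act_start_date: Optional[datetime] = None
    driving_path_flag: Optional[str] = None  # XER 'Y'/'N' flag kept as text
    # Back-reference to the owning Data object, left out of model_dump().
    data_context: Any = Field(default=None, exclude=True)

    def get_tsv(self) -> list[str]:
        # Assumed output shape: a row marker followed by one string per field.
        row = ["%R"]
        for value in self.model_dump().values():
            if value is None:
                row.append("")
            elif isinstance(value, datetime):
                row.append(value.strftime("%Y-%m-%d %H:%M"))
            else:
                row.append(str(value))
        return row


class Tasks:
    """Hypothetical collection that validates raw rows on add()."""

    def __init__(self, data_context: Any = None) -> None:
        self._tasks: list[Task] = []
        self.data_context = data_context

    def add(self, params: dict[str, Any]) -> Task:
        # Pydantic coerces the string values read from the file.
        task = Task.model_validate(params)
        task.data_context = self.data_context
        self._tasks.append(task)
        return task


tasks = Tasks()
task = tasks.add(
    {
        "task_id": "1001",
        "task_name": "Pour slab",
        "target_drtn_hr_cnt": "40",
        "act_start_date": "2024-01-15T08:00:00",
        "driving_path_flag": "Y",
    }
)
print(task.get_tsv())

try:
    tasks.add({"task_id": "not-a-number", "task_name": "Bad row"})
except ValidationError as exc:
    print(f"rejected row with {exc.error_count()} error(s)")
```

Because validation happens inside `add()`, a reader built this way would not need its own type conversion step; a malformed row surfaces as a `ValidationError` at the point where it is added.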
Pull Request Overview
This PR refactors the XER parser’s data models to leverage Pydantic, enhancing type validation and maintainability while migrating collection and model classes. Key changes include:
- Updating core model classes and collection classes to inherit from Pydantic’s BaseModel and use Field definitions.
- Revising helper methods (e.g. get_tsv(), get_id(), and model_post_init()) to integrate Pydantic’s model_dump() and model validation methods.
- Adjusting related documentation and dependency configuration to support Pydantic >=2.0.
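
The `default_factory` and `model_post_init()` combination mentioned in the overview could look roughly like the sketch below. The `Tasks` and `Calendars` classes here are empty stand-ins, and the way the context is attached (a `data_context` attribute set on each collection) is an assumption rather than the PR's actual wiring. `arbitrary_types_allowed` is needed in this sketch only because the stand-in collections are plain Python classes.

```python
from typing import Any

from pydantic import BaseModel, ConfigDict, Field


class Tasks:
    """Stand-in collection; the real class holds Task models."""

    def __init__(self) -> None:
        self.items: list[Any] = []
        self.data_context: Any = None


class Calendars:
    """Stand-in collection; the real class holds Calendar models."""

    def __init__(self) -> None:
        self.items: list[Any] = []
        self.data_context: Any = None


class Data(BaseModel):
    # Allow plain (non-Pydantic) classes as field types.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    # default_factory gives every Data instance its own fresh collections.
    tasks: Tasks = Field(default_factory=Tasks)
    calendars: Calendars = Field(default_factory=Calendars)

    def model_post_init(self, context: Any, /) -> None:
        # Runs after field initialization: hand each collection a
        # reference back to this Data object.
        for name in type(self).model_fields:
            getattr(self, name).data_context = self


data = Data()
print(data.tasks.data_context is data)
print(Data().tasks is data.tasks)
```

Using `default_factory` rather than a shared default instance matters here: two parsed files must not end up appending into the same collection object.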
Reviewed Changes
Copilot reviewed 55 out of 55 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/xer_parser/model/classes/pcatval.py | Converted to a Pydantic model with updated field types and added a generic "data" field. |
| src/xer_parser/model/classes/pcattype.py | Updated to inherit from BaseModel, with Field aliases and removal of legacy obj_list. |
| src/xer_parser/model/classes/obs.py | Refactored as a Pydantic model; streamlined get_tsv() and repr methods. |
| src/xer_parser/model/classes/nonwork.py | Migrated to Pydantic with type conversions and standardized get_tsv() implementation. |
| src/xer_parser/model/classes/fintmpl.py | Updated to use Pydantic with field validations and consolidated get_tsv() formatting. |
| src/xer_parser/model/classes/data.py | Central Data model now uses default_factory for collections and sets up data context using model_post_init. |
| src/xer_parser/model/classes/currency.py | Converted to Pydantic, standardizing field types and string conversions in get_tsv(). |
| src/xer_parser/model/classes/calendar_data.py | Expanded parsing logic using Pydantic, regex improvements, and post-init field population. |
| src/xer_parser/model/classes/calendar.py | Refactored Calendar model with updated type conversions and improved TSV formatting. |
| src/xer_parser/model/classes/acttype.py | Transitioned to a Pydantic model with simplified field definitions. |
| src/xer_parser/model/classes/activitycode.py | Refactored to leverage Pydantic model validation and updated property/method implementations. |
| src/xer_parser/model/classes/account.py | Simplified conversion to Pydantic and standardized TSV output. |
| src/xer_parser/model/calendars.py | Updated collection class to use Pydantic model instances and improved iteration and TSV export. |
| src/xer_parser/model/acttypes.py | Migrated ActTypes collection to use Pydantic models with updated add() and iteration methods. |
| src/xer_parser/model/activityresources.py | Refactored collection handling for TaskRsrc models, including data context propagation. |
| src/xer_parser/model/activitycodes.py | Revised ActivityCodes collection to leverage Pydantic and consistent TSV generation. |
| src/xer_parser/model/accounts.py | Updated Accounts collection to use Pydantic models and improved iteration and finder methods. |
| pyproject.toml | Added dependency on Pydantic (>=2.0) as a core project requirement. |
| docs/source/installation.rst | Updated dependency list to include Pydantic (>=2.0). |
| docs/source/api_reference.rst | Enhanced API documentation to automatically document the central Data model. |
Comments suppressed due to low confidence (1)
src/xer_parser/model/classes/pcatval.py:13
- The use of a generic field name 'data' may reduce clarity. Consider renaming it to 'data_context' or a more descriptive name to better indicate that it stores contextual information.
data: Any = Field(default=None, exclude=True) # Standard data field
Comment on lines +52 to 61

             self.index = 0  # Reset index for each new iteration
             return self

         def __next__(self) -> Calendar:
    -        if self.index >= len(self._calendars):
    +        if self.index < len(self._calendars):
                 result = self._calendars[self.index]
                 self.index += 1
                 return result
             else:
                 raise StopIteration
Rather than manually resetting and maintaining an index for iteration, consider implementing `__iter__` as `return iter(self._calendars)`. This simplifies the code and avoids iteration bugs caused by sharing one index across loops, such as nested iteration over the same collection.
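
A minimal sketch of that suggestion, using a simplified stand-in for `Calendar` (the `clndr_id` field and the `add()` signature are assumptions, not the PR's code):

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Calendar:
    # Simplified stand-in for the real Pydantic Calendar model.
    clndr_id: int


class Calendars:
    def __init__(self) -> None:
        self._calendars: list[Calendar] = []

    def add(self, calendar: Calendar) -> None:
        self._calendars.append(calendar)

    def __iter__(self) -> Iterator[Calendar]:
        # Each call returns a fresh list iterator, so no index state
        # lives on the collection itself and no __next__ is needed.
        return iter(self._calendars)

    def __len__(self) -> int:
        return len(self._calendars)


calendars = Calendars()
calendars.add(Calendar(clndr_id=1))
calendars.add(Calendar(clndr_id=2))

# Nested loops over the same collection stay independent of each other.
pairs = [(a.clndr_id, b.clndr_id) for a in calendars for b in calendars]
print(pairs)
```

With a shared `self.index`, the inner loop would reset and then exhaust the index the outer loop depends on, so the outer loop would stop early.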