feat: Validation Split & MLflow Tracking (fixes #22) #78

Open

verdhanyash wants to merge 1 commit into etsi-ai:main from verdhanyash:feat/validation-split-mlflow-tracking

Contributor

verdhanyash commented Feb 17, 2026 (edited)

Fixes

Closes: #22

Type of Change

Bug fix
New feature
Documentation / Refactor
Math / Logic correction

Description

Implemented validation split and MLflow tracking for monitoring model generalization performance.

Changes:

Added validation_split: float = 0.2 parameter to Model.train() method
Split data into training and validation sets after preprocessing (shuffled, seed-reproducible)
Compute validation loss once per epoch inside the existing Rust progress_callback; the training loop stays entirely in Rust, which preserves optimizer state and avoids FFI overhead
Added _calculate_validation_loss() method supporting both classification (cross-entropy) and regression (MSE)
Extended Rust bindings with forward() method to expose raw model outputs for validation loss calculation
Updated save_model() to log both loss and val_loss metrics to MLflow with epoch steps
Added input validation for validation_split parameter range
Stored val_loss_history on model instance, initialized in __init__() and load()
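
For reviewers who want the split behaviour at a glance, here is a minimal sketch of a shuffled, seed-reproducible split with a range check. It is not the code in this PR: the helper name split_train_val, the accepted range [0, 1), and the rounding of the validation size are all assumptions made for illustration.

```python
import numpy as np


def split_train_val(X, y, validation_split=0.2, seed=None):
    """Shuffle rows, then hold out a fraction of them for validation.

    Hypothetical helper: the name, the accepted range, and the rounding
    rule are assumptions, not the PR's actual implementation.
    """
    # Assumed range: 0.0 disables validation; 1.0 would leave nothing to train on.
    if not 0.0 <= validation_split < 1.0:
        raise ValueError(
            f"validation_split must be in [0, 1), got {validation_split}"
        )

    X = np.asarray(X)
    y = np.asarray(y)
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")

    # Same seed -> same permutation -> same split.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))

    n_val = int(len(X) * validation_split)  # rounds down
    val_idx, train_idx = order[:n_val], order[n_val:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]
```

Shuffling before slicing matters when the input is ordered (for example, sorted by class); slicing off the tail of an unshuffled array would give a validation set drawn from only part of the distribution.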

Key architectural decisions (addressing PR #62 feedback):

Training loop remains 100% in Rust: rust_model.train() is called exactly once
Optimizer state (Adam moments, SGD) is never reset between epochs
Validation loss computed in the existing progress_callback (no extra FFI overhead)
Data shuffled before split using np.random.default_rng(seed) for reproducibility
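
The callback-based validation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: the function validation_loss, the factory make_progress_callback, the callback signature (epoch, train_loss), and the StubModel stand-in are hypothetical, and the classification branch assumes forward() returns raw logits with integer class labels as targets.

```python
import numpy as np


def validation_loss(outputs, targets, task):
    """Mean validation loss from raw model outputs (hypothetical helper)."""
    outputs = np.asarray(outputs, dtype=float)
    if task == "regression":
        # Mean squared error; targets are reshaped to match the output layout.
        targets = np.asarray(targets, dtype=float).reshape(outputs.shape)
        return float(np.mean((outputs - targets) ** 2))
    if task == "classification":
        # Cross-entropy from logits via a numerically stable log-softmax.
        # Assumes outputs has shape (n_samples, n_classes).
        targets = np.asarray(targets, dtype=int)
        shifted = outputs - outputs.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        return float(-log_probs[np.arange(len(targets)), targets].mean())
    raise ValueError(f"unknown task: {task!r}")


class StubModel:
    """Stand-in for the Rust-backed model; only forward() is needed here."""

    def forward(self, X):
        # Toy linear map so the example runs without the Rust extension.
        return np.asarray(X, dtype=float) @ np.array([[1.0], [1.0]])


def make_progress_callback(model, X_val, y_val, task, val_loss_history):
    """Build a per-epoch callback that appends the current validation loss."""

    def progress_callback(epoch, train_loss):
        # Assumed signature; the real callback arguments may differ.
        val_loss = validation_loss(model.forward(X_val), y_val, task)
        val_loss_history.append(val_loss)

    return progress_callback


history = []
callback = make_progress_callback(
    StubModel(), [[1.0, 2.0], [3.0, 4.0]], [3.0, 5.0], "regression", history
)
callback(0, 0.9)  # in the real flow, the training loop would invoke this
```

Because the callback only reads the model through forward(), training is never restarted and optimizer state is untouched between epochs.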

Result: MLflow dashboard now displays overlapping train/validation loss curves for monitoring overfitting and generalization.
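
A rough sketch of how the two curves could be logged so that MLflow overlays them: each metric is written with an explicit step per epoch. The helper name log_loss_curves, the injectable log_metric argument, and the 0-based step numbering are assumptions for illustration; only the metric names loss and val_loss come from this PR, and mlflow.log_metric(key, value, step=...) is the standard MLflow call.

```python
def log_loss_curves(loss_history, val_loss_history, log_metric=None):
    """Log per-epoch train and validation losses as two MLflow metrics.

    Hypothetical helper. `log_metric` defaults to mlflow.log_metric and is
    injectable so the function can be exercised without a tracking server.
    """
    if log_metric is None:
        import mlflow  # imported lazily; only needed for real logging

        log_metric = mlflow.log_metric

    # One point per epoch; matching step values let the UI align both curves.
    for step, loss in enumerate(loss_history):
        log_metric("loss", float(loss), step=step)
    for step, val_loss in enumerate(val_loss_history):
        log_metric("val_loss", float(val_loss), step=step)


# Usage with a recorder instead of a live MLflow run:
recorded = []
log_loss_curves(
    [0.9, 0.5],
    [1.0, 0.7],
    log_metric=lambda key, value, step=None: recorded.append((key, value, step)),
)
```

A validation curve that flattens or rises while the training curve keeps falling is the usual sign of overfitting that this view is meant to expose.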

How Has This Been Tested?

Unit Tests: Created and ran test_validation_split.py with 8 tests covering classification, regression, edge cases, seed reproducibility, and MLflow logging
Smoke Testing: Verified end to end with mock datasets; the progress bar shows both Train Loss and Val Loss
Integration: Confirmed Rust core forward() method works correctly with Python API
Existing Tests: All 46 pytest tests pass (38 existing + 8 new, zero regressions)

Screenshots / Logs

Contribution Context

I am contributing through the SWoC program.


feat: Validation Split & MLflow Tracking (fixes etsi-ai#22)

8dd5418

github-actions bot commented Feb 17, 2026

Thank you for opening this PR! Our automated system is currently verifying the PR requirements.
Internal Discussion: Discord

github-actions bot added the SWoC26 label

github-actions bot commented Feb 17, 2026

Validation Successful!

This pull request has been verified and linked to issue #22. The system is now synchronizing metadata from the referenced issue. Kindly await maintainer review of your changes.

github-actions bot requested review from Aamod007, Arsh123344423, Romit23, Satyamgupta2365, SrishtiSonam and debug-soham on February 17, 2026 20:33

github-actions bot assigned verdhanyash

github-actions bot added the area: python-api, feature, and Medium labels

Collaborator

Aamod007 commented Feb 18, 2026

Good Work @verdhanyash

Reviewers

Awaiting requested review from: Aamod007, Arsh123344423, debug-soham, Romit23, Satyamgupta2365, SrishtiSonam

Labels

area: python-api, feature, Medium, SWoC26