Cleanup v3 #59
Conversation
Pull request overview
This pull request refactors optimizer state management by moving most of the functionality from the model-specific LLamaOptimizerStateManager into the base AdamWStateManager class. This cleanup effort introduces a new GenericTensorContainer class for generic tensor storage and consolidates common optimizer state operations.
Key changes:
- Introduced GenericTensorContainer for storing tensors in a vector-based container (sketched below)
- Moved optimizer state allocation and management logic into the base AdamWStateManager class
- Simplified checkpoint save/load operations by delegating to the optimizer's methods
- Added virtual methods to IModel for creating block and non-block tensor containers
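For reference, here is a minimal sketch of what a vector-backed GenericTensorContainer could look like. The real class lives in src/utilities/tensor_container.h; the opaque `Tensor` type here is an assumption standing in for whatever tensor type the project uses.

```cpp
#include <cstddef>
#include <vector>

struct Tensor; // placeholder for the project's actual tensor type (assumption)

class GenericTensorContainer {
public:
    // Pre-size the container so slots can be filled in any order.
    explicit GenericTensorContainer(size_t count) : tensors_(count, nullptr) {}

    void set(size_t index, Tensor* t) { tensors_.at(index) = t; }
    Tensor* get(size_t index) const { return tensors_.at(index); }
    size_t size() const { return tensors_.size(); }

private:
    std::vector<Tensor*> tensors_; // vector-based storage, per the PR description
};
```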
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/utilities/tensor_container.h | Adds GenericTensorContainer class for storing tensors in a vector |
| src/training/model.h | Replaces optimizer accessor methods with single optimizer() method; adds virtual methods for tensor container creation |
| src/training/model.cpp | Implements helper methods for creating block and non-block containers |
| src/training/checkpoint.cpp | Delegates optimizer checkpoint operations to the optimizer's save/load methods |
| src/training/adamw_optimizer.h | Moves optimizer state storage and buffer management to base class; keeps checkpoint methods virtual |
| src/training/adamw_optimizer.cpp | Implements generic state allocation and buffer management in base class |
| src/models/llama_optimizer.h | Removes most implementation details, keeping only checkpoint-specific logic |
| src/models/llama_optimizer.cpp | Implements checkpoint save/load using OptStateWrapper for tensor name mapping |
| src/models/llama_model.h | Updates interface to use new optimizer accessor and adds container shape methods |
| src/models/llama_model.cpp | Implements tensor shape filling and updates optimizer initialization |
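The interface change in src/training/model.h might look roughly like the following. The method names and signatures are assumptions inferred from the summary above, not the actual declarations.

```cpp
#include <memory>

class GenericTensorContainer; // from src/utilities/tensor_container.h

class IModel {
public:
    virtual ~IModel() = default;

    // One container per transformer block, holding that block's tensors.
    virtual std::unique_ptr<GenericTensorContainer>
    create_block_container(int block_index) = 0;

    // A single container for tensors outside the blocks
    // (e.g. embeddings, final norm, output head).
    virtual std::unique_ptr<GenericTensorContainer>
    create_non_block_container() = 0;
};
```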
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
Move almost all optimizer state functionality into the base class. Only the mapping from tensor indices to checkpoint names needs to remain in the "llama"-specific implementation.
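A sketch of that remaining llama-specific piece, assuming OptStateWrapper pairs a container index with a checkpoint tensor name (its actual interface, and the example tensor names, are assumptions):

```cpp
#include <string>
#include <vector>

struct OptStateWrapper {
    size_t index;      // slot in the optimizer-state tensor container
    std::string name;  // tensor name used in the checkpoint file
};

// Build the per-block name table; the generic save/load code in the base
// class would walk this list instead of hard-coding llama tensor names.
std::vector<OptStateWrapper> llama_block_names(int block) {
    const std::string prefix = "blk." + std::to_string(block) + ".";
    return {
        {0, prefix + "attn_q.weight"},   // example entries (assumptions)
        {1, prefix + "attn_k.weight"},
        {2, prefix + "ffn_up.weight"},
    };
}
```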