Skip to content

Comments

[ML] Add per allocation and per deployment memory metadata fields to …#6

Open
MitchLewis930 wants to merge 1 commit intopr_016_beforefrom
pr_016_after
Open

[ML] Add per allocation and per deployment memory metadata fields to …#6
MitchLewis930 wants to merge 1 commit intopr_016_beforefrom
pr_016_after

Conversation

@MitchLewis930
Copy link

PR_016

…the trained models config (elastic#98139)

To improve the required memory estimation of NLP models, this PR introduces two new metadata fields: per_deployment_memory_bytes and per_allocation_memory_bytes.

per_deployment_memory_bytes is the memory required to load the model in the deployment
per_allocation_memory_bytes is the temporary additional memory used during the inference for every allocation.

This PR extends the memory usage estimation logic while ensuring backward compatibility.

In a follow-up PR, I will adjust the assignment planner to use the refined memory usage information.
@MitchLewis930 MitchLewis930 requested a review from Copilot January 31, 2026 00:37
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds per-deployment and per-allocation memory metadata fields to the trained model deployment system. These fields enable more accurate memory usage estimation for PyTorch model deployments by allowing models to specify custom memory requirements that can differ from the default calculations.

Changes:

  • Added per_deployment_memory_bytes and per_allocation_memory_bytes fields to TaskParams and TrainedModelConfig
  • Updated memory estimation logic to use custom memory values when available, falling back to default calculations
  • Added transport version V_8_500_064 for backward compatibility of the new fields

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
StartTrainedModelDeploymentAction.java Added new memory fields to TaskParams, updated estimateMemoryUsageBytes to support custom memory calculations
TrainedModelConfig.java Added getter methods for per-deployment and per-allocation memory bytes from metadata
TransportStartTrainedModelDeploymentAction.java Updated to extract and pass memory values from model config to task params
TransportGetTrainedModelsStatsAction.java Modified memory estimation to include number of allocations parameter
TrainedModelDeploymentTask.java Updated updateNumberOfAllocations to preserve new memory fields
TrainedModelAssignmentNodeService.java Updated task params construction to include memory fields
TrainedModelAssignment.java Updated setNumberOfAllocations to preserve memory fields
TransportVersion.java Added V_8_500_064 version constant and updated CURRENT version
PyTorchModelRestTestCase.java Added test utility methods for creating models with memory metadata
PyTorchModelIT.java Added integration test for memory estimation with and without metadata
Multiple test files Updated test constructors to include new memory parameters with default values

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Request request = new Request("PUT", "/_ml/trained_models/" + modelId);
request.setJsonEntity("""
String metadata;
if (perDeploymentMemoryBytes > 0 && perAllocationMemoryBytes > 0) {
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition checks both values must be greater than 0, but the logic in estimateMemoryUsageBytes treats them independently (line 726 checks both equal 0). Consider checking if either value is non-zero to be consistent with the estimation logic.

Suggested change
if (perDeploymentMemoryBytes > 0 && perAllocationMemoryBytes > 0) {
if (perDeploymentMemoryBytes > 0 || perAllocationMemoryBytes > 0) {

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants