1 change: 1 addition & 0 deletions .github/workflows/ci.yaml
@@ -166,6 +166,7 @@ jobs:
        env:
          LUXONISML_BUCKET: luxonis-test-bucket
          SUITE: ${{ matrix.suite }}
          HUBAI_API_KEY: ${{ secrets.HUBAI_API_KEY }}
        run: pytest -x --cov --junitxml=junit.xml -o junit_family=legacy -m "${SUITE}"

      - name: Upload test results to Codecov
4 changes: 4 additions & 0 deletions .github/workflows/tests.yaml
@@ -29,6 +29,9 @@ on:
      CODECOV_TOKEN:
        description: 'Codecov upload token'
        required: true
      HUBAI_API_KEY:
        description: 'HubAI API key'
        required: true

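Since the new secret is declared with `required: true` for this reusable workflow, callers must forward it explicitly. A sketch of what a caller might pass (the job name here is illustrative):

```yaml
jobs:
  tests:
    uses: ./.github/workflows/tests.yaml
    secrets:
      CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
      HUBAI_API_KEY: ${{ secrets.HUBAI_API_KEY }}
```
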
permissions:
  pull-requests: write
@@ -92,6 +95,7 @@ jobs:
        working-directory: luxonis-train
        env:
          LUXONISML_BUCKET: luxonis-test-bucket
          HUBAI_API_KEY: ${{ secrets.HUBAI_API_KEY }}
        run: pytest --cov --junitxml=junit.xml -o junit_family=legacy

      - name: Upload test results to Codecov
47 changes: 42 additions & 5 deletions configs/README.md
@@ -28,7 +28,8 @@ You can create your own config or use/edit one of the examples.
  - [Trainer Tips](#trainer-tips)
- [Exporter](#exporter)
  - [`ONNX`](#onnx)
  - [`HubAI`](#hubai)
  - [`Blob` (Deprecated)](#blob-deprecated)
- [Tuner](#tuner)
  - [Storage](#storage)
- [ENVIRON](#environ)
@@ -355,10 +356,13 @@ trainer:
        patience: 3
        monitor: "val/loss"
        mode: "min"
    - name: "ExportOnTrainEnd"
    - name: "ConvertOnTrainEnd"
    - name: "TestOnTrainEnd"
```

> [!NOTE]
> `ConvertOnTrainEnd` is the recommended callback for model conversion. It combines export, archive, and platform-specific conversion (blobconverter/HubAI SDK) into a single step. Use this instead of separate `ExportOnTrainEnd` and `ArchiveOnTrainEnd` callbacks.

### Optimizer

What optimizer to use for training.
@@ -490,14 +494,15 @@ Here you can define configuration for exporting.
| ------------------------ | --------------------------------- | ------------- | ---------------------------------------------------------------------------------------------- |
| `name` | `str \| None` | `None` | Name of the exported model |
| `input_shape` | `list\[int\] \| None` | `None` | Input shape of the model. If not provided, inferred from the dataset |
| `data_type` | `Literal["INT8", "FP16", "FP32"]` | `"FP16"` | Data type of the exported model. Only used for conversion to BLOB |
| `target_precision` | `Literal["INT8", "FP16", "FP32"]` | `"FP16"` | Data type of the exported model. Alias: `data_type` |
| `reverse_input_channels` | `bool` | `True` | Whether to reverse the image channels in the exported model. Relevant for `BLOB` export |
| `scale_values` | `list[float] \| None` | `None` | What scale values to use for input normalization. If not provided, inferred from augmentations |
| `mean_values` | `list[float] \| None` | `None` | What mean values to use for input normalization. If not provided, inferred from augmentations |
| `upload_to_run` | `bool` | `True` | Whether to upload the exported files to the tracked run as artifacts |
| `upload_url` | `str \| None` | `None` | Exported model will be uploaded to this URL if specified |
| `onnx` | `dict` | `{}` | Options specific for ONNX export. See [ONNX](#onnx) section for details |
| `blobconverter` | `dict` | `{}` | Options for converting to BLOB format. See [Blob](#blob) section for details |
| `hubai` | `dict` | `{}` | Options for HubAI SDK conversion. See [HubAI](#hubai) section for details |
| `blobconverter` | `dict` | `{}` | Options for converting to BLOB format (deprecated). See [Blob](#blob-deprecated) section |
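Taken together, a minimal `exporter` section using the options above might look like this (values are illustrative; normalization values are normally inferred from the augmentations):

```yaml
exporter:
  name: my_model
  target_precision: fp16
  reverse_input_channels: true
  upload_to_run: true
```
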

### `ONNX`

@@ -510,7 +515,37 @@ Option specific for `ONNX` export.
| `disable_onnx_simplification` | `bool` | `False` | Disable ONNX simplification after export |
| `unique_onnx_initializers` | `bool` | `False` | Re-assign names to identifiers after export to ensure they are per-block unique |

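For instance, to keep the raw exported graph, ONNX simplification can be turned off from the config (a sketch using only the options listed above):

```yaml
exporter:
  onnx:
    disable_onnx_simplification: true
```
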
### `Blob`
### `HubAI`

The [HubAI SDK](https://github.com/luxonis/hubai-sdk) provides model conversion for multiple platforms (RVC2, RVC3, RVC4, Hailo).
This is the recommended way to convert models for deployment.

> [!NOTE]
> Requires `HUBAI_API_KEY` environment variable to be set.

| Key                   | Type                                                | Default value | Description                                                               |
| --------------------- | --------------------------------------------------- | ------------- | ------------------------------------------------------------------------- |
| `active`              | `bool`                                               | `False`       | Whether to use the HubAI SDK for conversion                                |
| `platform`            | `Literal["rvc2", "rvc3", "rvc4", "hailo"] \| None`   | `None`        | Target platform for conversion. Required when `active` is `True`           |
| `delete_remote_model` | `bool`                                               | `False`       | Whether to delete the uploaded model variant from HubAI after conversion   |
| `params`              | `dict`                                               | `{}`          | Additional parameters passed to the HubAI SDK conversion function          |

**Example:**

```yaml
exporter:
  target_precision: fp16
  hubai:
    active: true
    platform: rvc2
    params:
      superblob: true
```

### `Blob` (Deprecated)

> [!WARNING]
> `blobconverter` is deprecated and only supports legacy conversion of RVC2 models to `.blob`.

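If you are still relying on `blobconverter`, a migration to the HubAI SDK could look roughly like this (a sketch; see the [HubAI](#hubai) section above for the available options):

```yaml
exporter:
  blobconverter:
    active: false
  hubai:
    active: true
    platform: rvc2
```
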
| Key | Type | Default value | Description |
| --------- | ---------------------------------------------------------------- | ------------- | ---------------------------------------- |
@@ -554,6 +589,7 @@ Here you can specify options for tuning.
> - `UploadCheckpoint`
> - `ExportOnTrainEnd`
> - `ArchiveOnTrainEnd`
> - `ConvertOnTrainEnd`
> - `TestOnTrainEnd`

### Storage
@@ -609,6 +645,7 @@ For more info on the variables, see [Credentials](../README.md#credentials).
| `AWS_ACCESS_KEY_ID` | `str \| None` | `None` |
| `AWS_SECRET_ACCESS_KEY` | `str \| None` | `None` |
| `AWS_S3_ENDPOINT_URL` | `str \| None` | `None` |
| `HUBAI_API_KEY` | `str \| None` | `None` |
| `MLFLOW_CLOUDFLARE_ID` | `str \| None` | `None` |
| `MLFLOW_CLOUDFLARE_SECRET` | `str \| None` | `None` |
| `MLFLOW_S3_BUCKET` | `str \| None` | `None` |
3 changes: 1 addition & 2 deletions configs/complex_model.yaml
@@ -116,8 +116,7 @@ trainer:
        patience: 3
        monitor: val/loss
        mode: min
    - name: ExportOnTrainEnd
    - name: ArchiveOnTrainEnd
    - name: ConvertOnTrainEnd
    - name: TestOnTrainEnd

  optimizer:
3 changes: 1 addition & 2 deletions configs/detection_heavy_model.yaml
@@ -46,8 +46,7 @@ trainer:
        decay: 0.9999
        use_dynamic_decay: True
        decay_tau: 2000
    - name: ExportOnTrainEnd
    - name: TestOnTrainEnd
    - name: ConvertOnTrainEnd

  training_strategy:
    name: TripleLRSGDStrategy
3 changes: 1 addition & 2 deletions configs/detection_light_model.yaml
@@ -46,8 +46,7 @@ trainer:
        decay: 0.9999
        use_dynamic_decay: True
        decay_tau: 2000
    - name: ExportOnTrainEnd
    - name: TestOnTrainEnd
    - name: ConvertOnTrainEnd

  training_strategy:
    name: "TripleLRSGDStrategy"
3 changes: 1 addition & 2 deletions configs/fomo_heavy_model.yaml
@@ -27,5 +27,4 @@ trainer:
  gradient_clip_val: 10

  callbacks:
    - name: ExportOnTrainEnd
    - name: TestOnTrainEnd
    - name: ConvertOnTrainEnd
3 changes: 1 addition & 2 deletions configs/fomo_light_model.yaml
@@ -27,5 +27,4 @@ trainer:
  gradient_clip_val: 10

  callbacks:
    - name: ExportOnTrainEnd
    - name: TestOnTrainEnd
    - name: ConvertOnTrainEnd
3 changes: 1 addition & 2 deletions configs/instance_segmentation_heavy_model.yaml
@@ -42,8 +42,7 @@ trainer:
        decay: 0.9999
        use_dynamic_decay: True
        decay_tau: 2000
    - name: ExportOnTrainEnd
    - name: TestOnTrainEnd
    - name: ConvertOnTrainEnd
    - name: GradientAccumulationScheduler
      params:
        # warmup phase is 3 epochs
3 changes: 1 addition & 2 deletions configs/instance_segmentation_light_model.yaml
@@ -42,8 +42,7 @@ trainer:
        decay: 0.9999
        use_dynamic_decay: True
        decay_tau: 2000
    - name: ExportOnTrainEnd
    - name: TestOnTrainEnd
    - name: ConvertOnTrainEnd
    - name: GradientAccumulationScheduler
      params:
        scheduling: # warmup phase is 3 epochs
3 changes: 1 addition & 2 deletions configs/keypoint_bbox_heavy_model.yaml
@@ -49,8 +49,7 @@ trainer:
        decay: 0.9999
        use_dynamic_decay: True
        decay_tau: 2000
    - name: ExportOnTrainEnd
    - name: TestOnTrainEnd
    - name: ConvertOnTrainEnd
    # For best results, always accumulate gradients to
    # effectively use 64 batch size
    - name: GradientAccumulationScheduler
3 changes: 1 addition & 2 deletions configs/keypoint_bbox_light_model.yaml
@@ -49,8 +49,7 @@ trainer:
        decay: 0.9999
        use_dynamic_decay: True
        decay_tau: 2000
    - name: ExportOnTrainEnd
    - name: TestOnTrainEnd
    - name: ConvertOnTrainEnd
    # For best results, always accumulate gradients to
    # effectively use 64 batch size
    - name: GradientAccumulationScheduler
3 changes: 1 addition & 2 deletions configs/ocr_recognition_light_model.yaml
@@ -28,8 +28,7 @@ trainer:
  n_log_images: 8

  callbacks:
    - name: TestOnTrainEnd
    - name: ExportOnTrainEnd
    - name: ConvertOnTrainEnd

  optimizer:
    name: Adam
3 changes: 1 addition & 2 deletions configs/segmentation_heavy_model.yaml
@@ -24,8 +24,7 @@ trainer:
  n_log_images: 8

  callbacks:
    - name: TestOnTrainEnd
    - name: ExportOnTrainEnd
    - name: ConvertOnTrainEnd

  optimizer:
    name: SGD
3 changes: 1 addition & 2 deletions configs/segmentation_light_model.yaml
@@ -25,8 +25,7 @@ trainer:
  n_log_images: 8

  callbacks:
    - name: TestOnTrainEnd
    - name: ExportOnTrainEnd
    - name: ConvertOnTrainEnd

  optimizer:
    name: SGD
30 changes: 30 additions & 0 deletions luxonis_train/__main__.py
@@ -52,7 +52,7 @@
    from luxonis_train.utils.dataset_metadata import DatasetMetadata

    if weights is not None and config is None:
        ckpt = torch.load(weights, map_location="cpu")  # nosemgrep
        if "config" not in ckpt:  # pragma: no cover
            raise ValueError(
                f"Checkpoint '{weights}' does not contain the 'config' key. "
@@ -401,6 +401,36 @@
)


@app.command(group=export_group, sort_key=3)
def convert(
    opts: list[str] | None = None,
    /,
    *,
    config: str | None = None,
    save_dir: str | None = None,
    weights: str | None = None,
):
    """Export, archive, and convert the model to target platform format.

    This is a unified command that combines export, archive, and
    platform conversion (RVC2/RVC3/RVC4/Hailo) steps based on the
    configuration.

    @type config: str
    @param config: Path to the configuration file.
    @type save_dir: str
    @param save_dir: Directory where all outputs will be saved. If not
        specified, the default run save directory will be used.
    @type weights: str
    @param weights: Path to the model weights.
    @type opts: list[str]
    @param opts: A list of optional CLI overrides of the config file.
    """
    create_model(config, opts, weights=weights).convert(
        weights=weights, save_dir=save_dir
    )


@upgrade_app.command()
def config(
    config: Annotated[
20 changes: 20 additions & 0 deletions luxonis_train/callbacks/README.md
@@ -9,6 +9,7 @@ List of all supported callbacks.
- [`PytorchLightning` Callbacks](#pytorchlightning-callbacks)
- [`ExportOnTrainEnd`](#exportontrainend)
- [`ArchiveOnTrainEnd`](#archiveontrainend)
- [`ConvertOnTrainEnd`](#convertontrainend)
- [`MetadataLogger`](#metadatalogger)
- [`TestOnTrainEnd`](#testontrainend)
- [`UploadCheckpoint`](#uploadcheckpoint)
@@ -51,6 +52,25 @@ Callback to create an `NN Archive` at the end of the training.
| ---------------------- | --------------------------- | ------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `preferred_checkpoint` | `Literal["metric", "loss"]` | `"metric"` | Which checkpoint should the callback use. If the preferred checkpoint is not available, the other option is used. If none is available, the callback is skipped |

## `ConvertOnTrainEnd`

Unified callback that exports, archives, and converts the archive to the target platform at the end of training. This is the recommended callback for model conversion as it combines the functionality of `ExportOnTrainEnd` and `ArchiveOnTrainEnd`, and also runs platform-specific conversions (blobconverter or HubAI SDK) if configured.

**Steps:**

1. Exports the model to ONNX
2. Creates an NN Archive from the ONNX
3. Runs blobconverter if `exporter.blobconverter.active` is `true`
4. Runs HubAI SDK conversion if `exporter.hubai.active` is `true`

**Parameters:**

| Key | Type | Default value | Description |
| ---------------------- | --------------------------- | ------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `preferred_checkpoint` | `Literal["metric", "loss"]` | `"metric"` | Which checkpoint should the callback use. If the preferred checkpoint is not available, the other option is used. If none is available, the callback is skipped |
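
A minimal configuration sketch enabling the callback (assuming the same `params` mechanism as the other callbacks):

```yaml
trainer:
  callbacks:
    - name: ConvertOnTrainEnd
      params:
        preferred_checkpoint: loss
```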

## `MetadataLogger`

Callback that logs training metadata.
3 changes: 3 additions & 0 deletions luxonis_train/callbacks/__init__.py
@@ -12,6 +12,7 @@
from luxonis_train.registry import CALLBACKS

from .archive_on_train_end import ArchiveOnTrainEnd
from .convert_on_train_end import ConvertOnTrainEnd
from .ema import EMACallback
from .export_on_train_end import ExportOnTrainEnd
from .gpu_stats_monitor import GPUStatsMonitor
@@ -43,11 +44,13 @@
CALLBACKS.register(module=TrainingManager)
CALLBACKS.register(module=GracefulInterruptCallback)
CALLBACKS.register(module=TrainingProgressCallback)
CALLBACKS.register(module=ConvertOnTrainEnd)


__all__ = [
"ArchiveOnTrainEnd",
"BaseLuxonisProgressBar",
"ConvertOnTrainEnd",
"EMACallback",
"ExportOnTrainEnd",
"GPUStatsMonitor",
28 changes: 28 additions & 0 deletions luxonis_train/callbacks/convert_on_train_end.py
@@ -0,0 +1,28 @@
import lightning.pytorch as pl
from loguru import logger

import luxonis_train as lxt

from .needs_checkpoint import NeedsCheckpoint


class ConvertOnTrainEnd(NeedsCheckpoint):
    """Callback that exports, archives, and converts the model on train
    end."""

    def on_train_end(
        self, _: pl.Trainer, pl_module: "lxt.LuxonisLightningModule"
    ) -> None:
        """Converts the model on train end.

        @type trainer: L{pl.Trainer}
        @param trainer: Pytorch Lightning trainer.
        @type pl_module: L{pl.LightningModule}
        @param pl_module: Pytorch Lightning module.
        """
        checkpoint = self.get_checkpoint(pl_module)
        if checkpoint is None:  # pragma: no cover
            logger.warning("Skipping model conversion.")
            return

        pl_module.core.convert(weights=checkpoint)