[FEATURE] Concatenation Layer for Per-Ticker Data #8

@GongJr0

Feature Details

Implement a concatenation module that merges numeric AR(n) features with categorical embeddings (ticker, sector, period, etc.) into a unified torch.Tensor input. This will act as the final pre-model feature aggregation stage, ensuring consistent input formatting for the downstream neural network.

The module should:

  • Accept heterogeneous feature groups (lag features, embeddings, derived stats)
  • Handle embeddings whose dimension differs from feature to feature while still producing a consistent output shape
  • Be modular so additional features can be appended later without breaking existing pipelines
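As a rough illustration of the requirements above, here is a minimal sketch of such a layer. The names `FeatureConcat`, `cardinalities`, and `embed_dims` are hypothetical and not part of the existing codebase, and the sketch assumes lag features arrive as a float tensor of shape `(batch, n_lags)` and each categorical feature as a tensor of integer ids of shape `(batch,)`. How this would plug into `FeatureGen` is left open.

```python
import torch
from torch import nn


class FeatureConcat(nn.Module):
    """Hypothetical layer: numeric lag features + categorical embeddings."""

    def __init__(self, n_lags, cardinalities, embed_dims):
        super().__init__()
        # One embedding table per categorical feature; widths may differ.
        self.embeddings = nn.ModuleDict(
            {name: nn.Embedding(cardinalities[name], embed_dims[name])
             for name in cardinalities}
        )
        # The output width is fixed at construction time, whatever the mix.
        self.out_dim = n_lags + sum(embed_dims[name] for name in cardinalities)

    def forward(self, lags, categories):
        # lags: (batch, n_lags) float; categories[name]: (batch,) integer ids
        parts = [lags]
        for name, table in self.embeddings.items():
            parts.append(table(categories[name]))
        # Concatenate along the feature (last) dimension.
        return torch.cat(parts, dim=-1)


layer = FeatureConcat(
    n_lags=5,
    cardinalities={"ticker": 10, "sector": 4},
    embed_dims={"ticker": 8, "sector": 3},
)
x = layer(
    torch.randn(2, 5),
    {"ticker": torch.tensor([0, 3]), "sector": torch.tensor([1, 2])},
)
print(x.shape)  # torch.Size([2, 16])
```

Under these assumptions, appending a new categorical feature only means adding one entry to each dictionary; the lag features and existing embeddings keep their positions in the output.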

Affected Modules

As stated in the parent issue.

Implementation Checklist

  • Define input spec for numeric and embedding features
  • Implement concatenation logic in FeatureGen (likely torch.cat along the feature dimension)
  • Add shape validation to ensure outputs are consistent across tickers and time steps
  • Support batch-wise concatenation for multiple tickers simultaneously
  • Unit tests:
    • Verify numeric + embedding features combine to the expected final dimension
    • Check correct handling of missing embeddings or features
    • Test batched input scenarios
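The validation and test items in this checklist could look roughly like the sketch below. `concat_features`, `spec`, and `lead_shape` are hypothetical names, and zero-filling a missing group is only one possible policy, assumed here for illustration; the issue does not say how missing embeddings or features should be handled.

```python
import torch


def concat_features(groups, spec, lead_shape):
    """Validate feature groups, then concatenate them along the last dim.

    groups:     name -> tensor of shape (*lead_shape, spec[name])
    spec:       name -> expected feature width, in output order
    lead_shape: shared leading dims, e.g. (n_tickers, n_steps)
    """
    unknown = set(groups) - set(spec)
    if unknown:
        raise ValueError(f"groups not in spec: {sorted(unknown)}")

    parts = []
    for name, width in spec.items():
        tensor = groups.get(name)
        if tensor is None:
            # Assumption: a missing group is replaced by zeros.
            tensor = torch.zeros(*lead_shape, width)
        expected = (*lead_shape, width)
        if tuple(tensor.shape) != expected:
            raise ValueError(
                f"{name}: expected shape {expected}, got {tuple(tensor.shape)}"
            )
        parts.append(tensor)
    return torch.cat(parts, dim=-1)


spec = {"lags": 5, "ticker_emb": 8, "sector_emb": 3}
lead = (4, 20)  # 4 tickers, 20 time steps
full = {name: torch.randn(*lead, width) for name, width in spec.items()}
out_full = concat_features(full, spec, lead)

# Drop one group to exercise the missing-feature path.
partial = {name: full[name] for name in ("lags", "ticker_emb")}
out_partial = concat_features(partial, spec, lead)
```

Because the output width always equals the sum of the widths in `spec`, the result has the same shape for every ticker and time step, whether or not a group is present.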

Limitations

As stated in the parent issue.

Metadata

Assignees

Labels

feature: Implementation tracking for approved features

Projects

Status

Ready

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
