DISCLAIMER: We are working on a cleaned-up version of the code, to be published here soon.
This repository contains code and data for the attribution hackathon.
- Install conda.
- Create the conda environment:

  ```
  conda env create -f attr_hack_env.yml
  ```

- Activate the conda environment:

  ```
  conda activate attr_hack
  ```

- Run data processing / simulation (only needed to re-run data processing; you may skip this step):
  - GPP model:

    ```
    python Simple_GPP_model/process_predictor-variables.py
    python Simple_GPP_model/gpp_model.py
    ```

  - SM model:

    ```
    python simple_sm_model/era_sm_sim.py
    ```
| file | description |
|---|---|
| `hackathon/models/linear.py` | A dummy model. |
| `hackathon/base_model.py` | The base model class. |
| `hackathon/model_runner.py` | The model runner class. |
| `hackathon/data_pipeline.py` | The dataloader. |
| `benchmark.py` | Evaluate model. |
- Make a copy of `hackathon/models/linear.py` within the same directory and rename it to a meaningful name, e.g., `hackathon/models/rnn.py`.
Replace the class
Linearwith your model, give it a meaningful name, e.g.,class RNN(BaseModel). The model must subclassBaseModel! -
- The model's `__init__` method must take `**kwargs` and pass them to the parent class (`BaseModel`) like so:

  ```python
  class MyModel(BaseModel):
      def __init__(self, num_features: int, num_targets: int, **kwargs) -> None:
          super(MyModel, self).__init__(**kwargs)
          self.linear = torch.nn.Linear(num_features, num_targets)
          self.softplus = torch.nn.Softplus()

      def forward(self, x: Tensor) -> Tensor:
          out = self.softplus(self.linear(x))
          return out
  ```
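  Following this pattern, a hypothetical `RNN` replacement might look like the sketch below. The `hidden_size` hyperparameter and the use of `torch.nn.GRU` are our own illustrative choices, and a trivial stand-in `BaseModel` is included only so the snippet runs standalone; in the repository you would subclass `hackathon.base_model.BaseModel` instead:

  ```python
  import torch
  from torch import Tensor


  class BaseModel(torch.nn.Module):
      """Trivial stand-in so this sketch runs standalone.

      In the repository, subclass hackathon.base_model.BaseModel instead.
      """

      def __init__(self, **kwargs) -> None:
          super().__init__()


  class RNN(BaseModel):
      def __init__(self, num_features: int, num_targets: int,
                   hidden_size: int = 32, **kwargs) -> None:
          super().__init__(**kwargs)  # forward BaseModel kwargs to the parent
          self.rnn = torch.nn.GRU(num_features, hidden_size, batch_first=True)
          self.head = torch.nn.Linear(hidden_size, num_targets)
          self.softplus = torch.nn.Softplus()

      def forward(self, x: Tensor) -> Tensor:
          h, _ = self.rnn(x)                  # (batch, time, hidden_size)
          return self.softplus(self.head(h))  # (batch, time, num_targets)


  model = RNN(num_features=8, num_targets=1)
  out = model(torch.randn(4, 10, 8))
  print(out.shape)  # torch.Size([4, 10, 1])
  ```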
- Define a function `model_setup` within the same file (e.g., `hackathon/models/rnn.py`) that returns an initialized model. The function must take the argument `norm_stats` and must follow this pattern:

  ```python
  def model_setup(norm_stats: dict[str, Tensor]) -> BaseModel:
      """Create a model as subclass of hackathon.base_model.BaseModel.

      Parameters
      ----------
      norm_stats:
          Feature normalization stats with signature {'mean': Tensor, 'std': Tensor},
          both tensors with shape (num_features,).

      Returns
      -------
      A model.
      """
      model = MyCustomModel(      # <- Is a subclass of BaseModel
          num_features=8,         # <- A model HP
          num_targets=1,          # <- A model HP
          num_layers=1000000,     # <- A model HP
          ...                     # <- more model HPs
          learning_rate=0.01,     # <- BaseModel kwarg
          weight_decay=0.0,       # <- BaseModel kwarg
          norm_stats=norm_stats)  # <- BaseModel kwarg
      return model
  ```
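  A minimal sketch of what a valid `norm_stats` argument looks like, built here from hypothetical training features. The standardization step shown, `(x - mean) / std`, is the usual convention for such stats; how `BaseModel` actually consumes them is defined in `hackathon/base_model.py`:

  ```python
  import torch

  # Hypothetical training features with shape (num_samples, num_features).
  features = torch.randn(100, 8)

  # norm_stats with the documented signature: both tensors shaped (num_features,).
  norm_stats = {"mean": features.mean(dim=0), "std": features.std(dim=0)}

  # Standardization as commonly applied with such stats (assumption, for illustration).
  x_norm = (features - norm_stats["mean"]) / norm_stats["std"]
  print(norm_stats["mean"].shape)  # torch.Size([8])
  ```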
- Add your model to the `benchmark.py` file by importing and adding the `model_setup` function:

  ```python
  from hackathon.models.linear import model_setup as linear_model

  model_funs = [linear_model]
  ```
- Run your model with `python benchmark.py` (`--quickrun` for a developer run over one epoch / CV fold with less training and validation data).
- The `log_dir` argument sets the base directory of the experiment.
- Cross-validation runs are each put into a subdirectory (`log_dir/fold_00` etc.).
- The final model (an ensemble of all CV models) and its predictions are saved in `log_dir/final`.
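The resulting layout can be sketched as follows. The `log_dir` value and the fold count are made up for illustration; the actual number of folds depends on the CV setup:

```python
from pathlib import Path

log_dir = Path("experiments/my_run")  # hypothetical log_dir value
num_folds = 5                         # assumed; depends on the CV setup

fold_dirs = [log_dir / f"fold_{i:02d}" for i in range(num_folds)]
final_dir = log_dir / "final"  # ensemble model and its predictions

print(fold_dirs[0].as_posix())  # experiments/my_run/fold_00
```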