-
Notifications
You must be signed in to change notification settings - Fork 4
Open
Labels
documentationImprovements or additions to documentationImprovements or additions to documentation
Description
The interface would be at least like the following: `AnnData -> dict[AnnData]'.
Let's limit our variables to consider to at most 2 (e.g., cell_type:a/b and split: train/val/test)
Cases:
- User wants to split for each set
train, test, vals.t. all these preserve their cell type distribution. E.g. if there is quarter celltype a in the whole dataset in train, test and val they should also be 1/4 of their respective sets. (here no entry is duplicate but in sampling this might not be the case) - User wants to sample each set so that the classes have equal proportion. (here there can be duplicate entries)
@FrancescaDr here do you have any more cases you'd like to discuss? Also which one would be more important for you?
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
documentationImprovements or additions to documentationImprovements or additions to documentation