[Tracking] Tracking issue for the new memory-efficient dataset. #740

Starting with `2.0a11`, PyHealth uses a disk-based, memory-efficient dataset to reduce memory usage for large datasets such as MIMIC4.

This issue tracks bugs in, and improvements needed for, the new memory-efficient dataset.

## Improvements
- [ ] Batch processing for task transformation to speed it up. https://github.com/sunlabuiuc/PyHealth/pull/750
- [x] Support multiple workers for task transformation. https://github.com/sunlabuiuc/PyHealth/pull/748
- [x] Support configuring `n_worker` for Dask. https://github.com/sunlabuiuc/PyHealth/pull/743

## Bugs
- [ ] The temporary folder for the dataset is not properly cleaned up after dataset processing. https://github.com/sunlabuiuc/PyHealth/pull/753
- [ ] Cached data is not cleaned up if the program crashes midway, which may leave a corrupted cache file. https://github.com/sunlabuiuc/PyHealth/pull/753
- [x] Incorrect null handling for `patient_id` and timestamp. https://github.com/sunlabuiuc/PyHealth/pull/746
- [x] https://github.com/sunlabuiuc/PyHealth/issues/742 https://github.com/sunlabuiuc/PyHealth/pull/744
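
For discussion of the two open cache bugs above, here is a minimal sketch of one common pattern: build the cache in a temporary sibling directory and rename it into place only once it is complete. This is not PyHealth's API and not necessarily what the linked PR does; `build_cache_atomically` and `build_fn` are hypothetical names used for illustration only.

```python
import os
import shutil
import tempfile
from pathlib import Path


def build_cache_atomically(cache_dir, build_fn):
    """Hypothetical helper: build a cache in a temp dir, then rename it into place.

    `build_fn(tmp_dir)` is an assumed callback that writes all cache files
    into the directory it is given.
    """
    cache_dir = Path(cache_dir)
    if cache_dir.exists():
        # Only a fully built cache is ever renamed to this path, so reuse it.
        return cache_dir

    cache_dir.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp dir next to the target so the rename stays on one filesystem.
    tmp_dir = Path(
        tempfile.mkdtemp(prefix=cache_dir.name + ".tmp-", dir=cache_dir.parent)
    )
    try:
        build_fn(tmp_dir)
        # Publish the finished cache in a single step.
        os.rename(tmp_dir, cache_dir)
    finally:
        # After a successful rename tmp_dir no longer exists and this is a no-op;
        # after an exception it removes the partially written files.
        shutil.rmtree(tmp_dir, ignore_errors=True)
    return cache_dir
```

With this layout, an exception during processing leaves neither a temporary folder nor a partial cache behind. A hard kill can still leave a stray `*.tmp-*` directory, but never a half-written cache at the final path, so a startup sweep of stale temp directories would be enough to cover that case.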
