From 6b01d56d10e8473ac7a52a286ae3200fe9fbb9de Mon Sep 17 00:00:00 2001
From: Dmitry Duplyakin
Date: Fri, 2 Jun 2023 13:30:44 -0600
Subject: [PATCH 1/3] Star md for eagle-jobs

---
 eagle-jobs.md | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
 create mode 100644 eagle-jobs.md

diff --git a/eagle-jobs.md b/eagle-jobs.md
new file mode 100644
index 0000000..1bf30af
--- /dev/null
+++ b/eagle-jobs.md
@@ -0,0 +1,17 @@
+# Eagle Jobs
+
+## Description
+
+HPC dataset with 11M+ jobs from NREL's Eagle supercomputer. These jobs were submitted to run on Eagle between Nov 2018 and Feb 2023. The data are sufficiently anonymized and do not include sensitive user or project data. HPC research community does not have many public, large, and complete job traces like this one, and release of this dataset should help address this gap.
+
+## Data
+
+This repository contains a single large file: eagle_data.parquet, which uses parquet, a column-oriented data file format with compression.
+
+## References
+
+To be added when the paper is published.
+
+## Disclaimer and Attribution
+
+To be added later.

From eeec05864212374459caf8faf01d5d2046358b9a Mon Sep 17 00:00:00 2001
From: Dmitry Duplyakin
Date: Fri, 2 Jun 2023 13:38:46 -0600
Subject: [PATCH 2/3] Add OEDI link

---
 eagle-jobs.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/eagle-jobs.md b/eagle-jobs.md
index 1bf30af..9408522 100644
--- a/eagle-jobs.md
+++ b/eagle-jobs.md
@@ -6,7 +6,9 @@ HPC dataset with 11M+ jobs from NREL's Eagle supercomputer. These jobs were subm
 
 ## Data
 
-This repository contains a single large file: eagle_data.parquet, which uses parquet, a column-oriented data file format with compression.
+The data is available through OEDI: [https://data.openei.org/submissions/5860](https://data.openei.org/submissions/5860).
+
+This link points to a repository that contains a single large file: `eagle_data.parquet`, which uses parquet, a column-oriented data file format with compression.
 
 ## References
 
From 0c7cb8f9ba2e076d4e31d82cd9879d18be1e9221 Mon Sep 17 00:00:00 2001
From: Dmitry Duplyakin
Date: Wed, 18 Sep 2024 07:58:11 -0600
Subject: [PATCH 3/3] Update eagle-jobs.md

Add missing references and citation info
---
 eagle-jobs.md | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/eagle-jobs.md b/eagle-jobs.md
index 9408522..fd7fcdf 100644
--- a/eagle-jobs.md
+++ b/eagle-jobs.md
@@ -12,7 +12,35 @@ This link points to a repository that contains a single large file: `eagle_data.
 
 ## References
 
-To be added when the paper is published.
+Published Paper:
+
+[Mastering HPC Runtime Prediction: From Observing Patterns to a Methodological Approach](https://www.nrel.gov/docs/fy23osti/86526.pdf)
+
+For citing our research, please use:
+
+```
+@incollection{menear2023mastering,
+  title={Mastering HPC Runtime Prediction: From Observing Patterns to a Methodological Approach},
+  author={Menear, Kevin and Nag, Ambarish and Perr-Sauer, Jordan and Lunacek, Monte and Potter, Kristi and Duplyakin, Dmitry},
+  booktitle={Practice and Experience in Advanced Research Computing},
+  pages={75--85},
+  year={2023}
+}
+```
+
+You can also cite the dataset we produced and released:
+
+```
+@div{oedi_5860,
+  title = {NREL Eagle supercomputer jobs},
+  author = {Duplyakin, Dmitry, Menear, Kevin.},
+  abstractNote = {High performance computing dataset with 11M+ jobs from NREL's Eagle supercomputer. These jobs were submitted to run on Eagle between Nov 2018 and Feb 2023. The data are sufficiently anonymized and do not include sensitive user or project data. HPC research community does not have many public, large, and complete job traces like this one, and releasing this dataset should help address this gap.},
+  url = {https://data.openei.org/submissions/5860},
+  place = {United States},
+  year = {2023},
+  month = {02}
+}
+```
 
 ## Disclaimer and Attribution