diff --git a/pipeline-templates/README.MD b/pipeline-templates/README.MD
index 10ffbfa..6a765f1 100644
--- a/pipeline-templates/README.MD
+++ b/pipeline-templates/README.MD
@@ -1,6 +1,25 @@
 # Pipeline Templates
 
-A collection of jsonnet templates that provide specific capabilities.
+A collection of jsonnet pipeline templates that provide specific capabilities.
 
-Templates:
-* `split-combine` can be used to split text files into multiple files every x lines or combine multiple files into a single file.
+## Template Subdirectories
+
+---
+## `split-combine`
+### Description:
+Can be used to split text files into multiple files every x lines or to combine multiple files into a single file.
+### Usage
+Splitting:
+This takes every file in the input repo `myrepo` and splits it into separate files every `1000` lines.
+```bash
+pachctl create pipeline --jsonnet https://raw.githubusercontent.com/pachyderm/examples/master/pipeline-templates/split-combine/splitcombine.jsonnet \
+--arg name="mypipeline" --arg mode="split" \
+--arg lines=1000 --arg source="myrepo"
+```
+Combining:
+This combines all the files in the `mypipeline_split` repo into a single file called `/pfs/out/combined.csv`.
+```bash
+pachctl create pipeline --jsonnet https://raw.githubusercontent.com/pachyderm/examples/master/pipeline-templates/split-combine/splitcombine.jsonnet \
+--arg name="mypipeline_2" --arg mode="combine" --arg source="mypipeline_split" --arg output="/pfs/out/combined.csv"
+```
+---
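
To make the two modes described in the README concrete, here is a rough local sketch of the same idea using only coreutils. It does not call `pachctl` and is not the template's actual implementation. The scratch directory, the sample file `data.csv`, and the chunk naming are assumptions made for illustration; only the `1000`-line chunk size and the `combined.csv` output name come from the README text.

```bash
#!/usr/bin/env bash
# Local approximation of the "split" and "combine" modes with coreutils.
# Assumption: this only mimics the described behaviour; it is not what the
# jsonnet template runs inside the pipeline.
set -euo pipefail

workdir="$(mktemp -d)"   # hypothetical scratch directory
mkdir -p "$workdir/in" "$workdir/split" "$workdir/out"

# Create a 2500-line sample input file (hypothetical data).
seq 1 2500 > "$workdir/in/data.csv"

lines=1000               # mirrors --arg lines=1000

# "split": cut every input file into chunks of at most $lines lines.
for f in "$workdir"/in/*; do
  split -l "$lines" "$f" "$workdir/split/$(basename "$f")."
done

# "combine": concatenate all chunks (glob order is lexical) into one file.
cat "$workdir"/split/* > "$workdir/out/combined.csv"

chunk_count="$(ls "$workdir/split" | wc -l | tr -d ' ')"
combined_lines="$(wc -l < "$workdir/out/combined.csv" | tr -d ' ')"
echo "chunks=$chunk_count combined_lines=$combined_lines"
```

With 2500 input lines and a chunk size of 1000, the split step yields three chunks, and concatenating them in lexical order restores the original file.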