From b4d3fec864aba8384943de0368717b5aaa753d57 Mon Sep 17 00:00:00 2001
From: Russell Bryant
Date: Wed, 10 Jul 2024 15:37:13 -0400
Subject: [PATCH] sdg: Add details on how to configure serving for custom
 pipelines

The design proposal in #109, and the corresponding implementation in
https://github.com/instructlab/sdg/pull/86, raised the importance of
clearly defining how a custom pipeline that requires a model with custom
adapters would be configured. This document explores that topic.

It's possible this should just become a subsection of #109.

Signed-off-by: Russell Bryant
---
 docs/sdg/data-generate-serve-config.yaml | 100 +++++++++++++++++++++++
 1 file changed, 100 insertions(+)
 create mode 100644 docs/sdg/data-generate-serve-config.yaml

diff --git a/docs/sdg/data-generate-serve-config.yaml b/docs/sdg/data-generate-serve-config.yaml
new file mode 100644
index 00000000..95a837cf
--- /dev/null
+++ b/docs/sdg/data-generate-serve-config.yaml
@@ -0,0 +1,100 @@

# Serve Config for `data generate` command

`ilab` currently automates model serving under the following conditions:

* `ilab model serve`
* `ilab model chat` without a custom API endpoint and without `ilab
  model serve` already running.
* `ilab data generate` without a custom API endpoint and without `ilab
  model serve` already running.
* `ilab model evaluate`

As features are added to the `instructlab-sdg` library, the configuration
requirements are growing beyond what is currently available through the `ilab`
CLI's `data generate` command. This document reviews the requirements and makes
a proposal for how to configure `ilab` for the expanded SDG use cases.

## Requirements

In all existing cases of automatically serving a model, `ilab` serves only a
single model. We now need to serve both a base model and that same model with
custom adapters. 
This is [supported by
vllm](https://docs.vllm.ai/en/latest/models/lora.html#serving-lora-adapters),
one of the model serving backends available in `ilab`.

In addition to specifying which LoRA adapter(s) to serve, we must also be able
to configure the model ID used for each one in the OpenAI API. A related design
is [proposing a configuration format for SDG flows](https://github.com/instructlab/sdg/pull/86).
A flow configuration file will declare one or more model IDs that it expects to
be accessible, so we need a way to ensure our serve config matches those
expectations.

## Proposal

### Use Case

`ilab data generate` with a custom workflow that requires a custom model adapter
in addition to the model without an adapter.

### Configuration

First, let's put the custom adapter aside and review how we would configure the
teacher model today.

The `serve:` section of `config.yaml` includes a `model_path` field. The model
path is also used as the model ID for requesting this model via the OpenAI API.
The same model ID must be set in the `generate.model` configuration option for
it to be used as the default teacher model.

```yaml
serve:
  model_path: "path/to/model_directory" # both a path and the model ID used in the API
...
generate:
  model: "path/to/model_directory" # the default model ID to request from the API
...
```

If we want to serve a model with a custom adapter, we can do so using custom
`vllm_args` in the configuration file.

```yaml
serve:
  model_path: "path/to/model_directory" # both a path and the model ID used in the API
  backend: "vllm"
  backend_args:
    vllm_args:
      - ...
      - "--lora-modules"
      - "my_custom_adapter=path/to/my_custom_adapter"
      - ...
...
generate:
  model: "path/to/model_directory" # the default model ID to request from the API
...
```

In this example, we have added another model ID, `my_custom_adapter`, to the
OpenAI API endpoint served by `vllm`. 
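As an illustration of what the configuration above implies, the sketch below derives the set of model IDs the OpenAI-compatible endpoint would expose: the base model addressed by its path, plus one ID per `--lora-modules` entry. The helper function and the in-memory config dict are hypothetical, not part of the `ilab` CLI:

```python
# Hypothetical sketch: derive the model IDs an OpenAI-compatible vllm
# endpoint would expose, given a `serve:` config like the one above.

def served_model_ids(serve_config: dict) -> list[str]:
    """Return the model IDs reachable via the OpenAI API for this config."""
    # The base model is addressed by its path.
    ids = [serve_config["model_path"]]
    # Each "name=path" entry following "--lora-modules" adds another ID.
    vllm_args = serve_config.get("backend_args", {}).get("vllm_args", [])
    for i, arg in enumerate(vllm_args):
        if arg == "--lora-modules":
            for entry in vllm_args[i + 1:]:
                if entry.startswith("--"):
                    break
                ids.append(entry.split("=", 1)[0])
    return ids

# The `serve:` example from above, as a Python dict.
serve = {
    "model_path": "path/to/model_directory",
    "backend": "vllm",
    "backend_args": {
        "vllm_args": [
            "--lora-modules",
            "my_custom_adapter=path/to/my_custom_adapter",
        ]
    },
}

print(served_model_ids(serve))
# -> ['path/to/model_directory', 'my_custom_adapter']
```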
This model ID can then match the model
expected by a custom flow configuration file. Using a potential configuration
example from [an open PR](https://github.com/instructlab/sdg/pull/86), here is
how the expectation of `my_custom_adapter` could be expressed. Note that the
details of this configuration format are pending the resolution of the
[corresponding design proposal](https://github.com/instructlab/dev-docs/pull/109).

```yaml
version: "1.0"
models:
  - name: my_custom_adapter
    description: a funky adapter for generating questions
block_configs:
  - block_type: LLMBlock
    block_config:
      block_name: gen_questions
      config_path: configs/skills/freeform_questions.yaml
      add_num_samples: True
      model: my_custom_adapter
      output_cols:
        - question
    drop_duplicates:
      - question
```
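To make the "config matches the flow expectations" requirement concrete, here is a hypothetical sketch that cross-checks the model names declared in a flow configuration against the model IDs the `serve:` section would expose. The function name and dict shapes are assumptions based on the draft formats above:

```python
# Hypothetical sketch: verify that every model name a flow config expects
# is actually served. Field names follow the draft formats shown in this
# document and may change as the design proposals evolve.

def missing_models(flow_config: dict, served_ids: set[str]) -> list[str]:
    """Return the flow's expected model names that are not being served."""
    expected = [m["name"] for m in flow_config.get("models", [])]
    return [name for name in expected if name not in served_ids]

# A minimal flow config mirroring the example above.
flow = {
    "version": "1.0",
    "models": [
        {
            "name": "my_custom_adapter",
            "description": "a funky adapter for generating questions",
        }
    ],
}

# Model IDs exposed by the `serve:` example: the base model path plus the
# LoRA adapter name.
served = {"path/to/model_directory", "my_custom_adapter"}

print(missing_models(flow, served))  # -> [] (the config satisfies the flow)
```

A check like this could run before `ilab data generate` starts serving, so a mismatch between the serve config and a custom flow fails fast instead of surfacing as an API error mid-generation.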