
Documentation

Manuel Pastor edited this page Sep 6, 2019 · 6 revisions

Documentation strategy

Scope

Model documentation aims to persist, in an organized way, all relevant information resulting from model development.

Documentation in flame

Flame will support model documentation in the following ways:

  1. Defining a documentation class with all the required information
  2. Automatically populating the documentation class fields that describe methodological aspects, whether defined as modeling parameters or obtained as quality analysis results
  3. Supporting the input of user-defined documentation class fields, like the description of the endpoint, the name and address of the modeling engineer, etc.
  4. Generating diverse model documentation formats (e.g. QMRF) as PDF or other exportable/editable text formats
  5. Providing API access to the documentation

Key features:

  • Model documentation will be stored within the model folder of each version, making it exportable
  • Only end-user input will be stored as a YAML file. All other information will be retrieved on the fly from the respective classes (parameters/results) upon request, to guarantee full consistency
  • End-user input and report generation will be supported by manage commands, as well as by API calls
  • Model ID will not be automatically generated. Uniqueness within a given domain (project, company, etc.) is the responsibility of the end-user
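The idea of storing only end-user input and retrieving everything else on request can be illustrated with a minimal Python sketch. The function name `build_documentation` and all field names are hypothetical and not part of Flame; plain dictionaries stand in for the stored user input and for the parameters/results classes.

```python
def build_documentation(user_input, parameters, results):
    """Assemble a documentation view on request.

    Only `user_input` is assumed to be persisted. `parameters` and
    `results` are read in their current state on every call, so the
    view cannot drift out of sync with the model.
    """
    doc = dict(user_input)   # start from the stored end-user fields
    doc.update(parameters)   # methodological fields, read on the fly
    doc.update(results)      # quality analysis fields, read on the fly
    return doc


# Hypothetical field names and values, for illustration only
user_input = {"ID": "my-project-001", "endpoint": "free-text description"}
parameters = {"model": "RF"}
results = {"n_objects": 100}

doc = build_documentation(user_input, parameters, results)

# Changing a parameter changes the view without touching the stored input
parameters["model"] = "SVM"
doc_after = build_documentation(user_input, parameters, results)
```

Because nothing derived from parameters or results is written to disk in this sketch, there is no second copy that could become stale.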

Type of information

Currently, three levels of documentation are considered according to the final user of the model.

  1. Level 1: Basic information (Date, endpoint, contact details, short description)
  2. Level 2: Contains details on model building (descriptors, parameters, applicability domain technique, etc.). Intended to facilitate reproducibility among modelers.
  3. Level 3: Intended for regulators. Contains extended information on the model as well as justification of its use.
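One way to picture the three levels is as a tag attached to every documentation field. The sketch below is illustrative only: the `Level` enum, the field names, and the assumption that levels are cumulative (a level 3 report also includes level 1 and 2 fields) are not taken from Flame.

```python
from enum import IntEnum


class Level(IntEnum):
    BASIC = 1        # date, endpoint, contact details, short description
    BUILDING = 2     # model building details, for reproducibility
    REGULATORY = 3   # extended information and justification of use


# Hypothetical mapping of field names to documentation levels
FIELD_LEVELS = {
    "date": Level.BASIC,
    "endpoint": Level.BASIC,
    "contact": Level.BASIC,
    "descriptors": Level.BUILDING,
    "applicability_domain": Level.BUILDING,
    "justification": Level.REGULATORY,
}


def fields_up_to(level):
    """Return the field names a report at `level` would include.

    Assumes levels are cumulative, which the text does not state.
    """
    return [name for name, lvl in FIELD_LEVELS.items() if lvl <= level]


basic_fields = fields_up_to(Level.BASIC)
all_fields = fields_up_to(Level.REGULATORY)
```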

Information sources

  1. parameters.yaml: Contains configuration information that is necessary for documentation of the model.
  2. results.pkl: Contains information resulting from model pre-processing of data and model performance.
  3. user: Information that the user should fill in by hand, e.g., detailed information on the endpoint, justification of model applicability, etc. (basically level 3 information and some level 1 fields)

Documentation will be generated automatically every time a model is built (QMRF-like) or used for prediction (QPRF-like), using sources 1 and 2 from the list above.

For this documentation to be complete, the user needs to provide additional information, which will be appended to the previous information and stored in the model folder, using a command like the following:

flame -c manage -a document -f template.txt
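How user-provided values might be appended to the automatically generated fields can be sketched in Python. This is not Flame's implementation: the function names, the `USER_FIELDS` list, and the field names are hypothetical, and the format of the template file is deliberately left out.

```python
# Hypothetical names of the fields that only the user can provide
USER_FIELDS = ("endpoint_description", "applicability_justification", "contact")


def apply_user_input(auto_doc, user_values):
    """Append user-provided values to automatically generated documentation.

    Only keys listed in USER_FIELDS are copied, so fields outside that
    list (those coming from parameters or results) stay untouched.
    """
    merged = dict(auto_doc)
    for key in USER_FIELDS:
        value = user_values.get(key)
        if value:
            merged[key] = value
    return merged


def missing_user_fields(doc):
    """List the user fields that are still empty or absent."""
    return [key for key in USER_FIELDS if not doc.get(key)]


auto_doc = {"model": "RF"}  # stands in for sources 1 and 2
doc = apply_user_input(
    auto_doc,
    {"contact": "modeler@example.org", "model": "SVM"},  # "model" is ignored
)
pending = missing_user_fields(doc)
```

A check such as `missing_user_fields` would make it easy to tell the user which entries still need to be completed before a level 3 report is generated.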

Class documentation.py

This class will load into documentation.yaml all the fields that can be assigned automatically after model building, and will also manage the different types of output documents to generate.

Documents to be generated:

Model information:

  • Spreadsheet template: Template defined to document models in a machine-readable format.
  • Markdown draft: Template defined to document models in QMRF format.

Prediction report:

  • Report on the application of the model to query compounds (QPRF-like template). This QPRF is generated using the last set of compounds predicted by a model. This implies the serialization of some information just after the prediction (i.e., date, molecules, etc.).

All generated documents are stored in the model folder and can be accessed using the manage command.
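The serialization step mentioned above could look like the following minimal sketch. The file name `prediction_info.json`, the JSON format, and both function names are assumptions made for illustration; the text does not specify how or where Flame stores this information.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

INFO_FILE = "prediction_info.json"  # hypothetical file name


def save_prediction_info(model_dir, molecule_names):
    """Store what a later QPRF-like report would need about the last prediction."""
    info = {
        "date": datetime.now(timezone.utc).isoformat(),
        "molecules": list(molecule_names),
    }
    path = Path(model_dir) / INFO_FILE
    # Overwrites any previous file, so only the last prediction is kept
    path.write_text(json.dumps(info, indent=2), encoding="utf-8")
    return path


def load_prediction_info(model_dir):
    """Read back the information stored after the last prediction."""
    path = Path(model_dir) / INFO_FILE
    return json.loads(path.read_text(encoding="utf-8"))
```

Overwriting the file on each call mirrors the statement that the report uses the last set of compounds predicted by a model.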

[under development]
