From 1eed1f8163346b3055d32f601834b97b0725afc9 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Tue, 8 Feb 2022 16:13:52 -0500 Subject: [PATCH 01/33] Adding ensemble metrics to sidebars. --- .../RosettaScripts/FeaturesReporter/_Sidebar.md | 2 ++ .../FeaturesReporter/features_reporters/_Sidebar.md | 4 ++++ .../RosettaScripts/FeaturesReporter/rscripts/_Sidebar.md | 4 ++++ scripting_documentation/RosettaScripts/Filters/_Sidebar.md | 2 ++ scripting_documentation/RosettaScripts/Movers/_Sidebar.md | 2 ++ .../RosettaScripts/SimpleMetrics/_Sidebar.md | 4 ++++ .../RosettaScripts/TaskOperations/_Sidebar.md | 2 ++ scripting_documentation/RosettaScripts/_Sidebar.md | 2 ++ .../RosettaScripts/composite_protocols/_Sidebar.md | 2 ++ 9 files changed, 24 insertions(+) diff --git a/scripting_documentation/RosettaScripts/FeaturesReporter/_Sidebar.md b/scripting_documentation/RosettaScripts/FeaturesReporter/_Sidebar.md index de27f22ed..4d334d69b 100644 --- a/scripting_documentation/RosettaScripts/FeaturesReporter/_Sidebar.md +++ b/scripting_documentation/RosettaScripts/FeaturesReporter/_Sidebar.md @@ -20,6 +20,8 @@ * [[Simple Metrics | SimpleMetrics]] + * [[Ensemble Metrics|EnsembleMetrics]] + * [[Filters|Filters-RosettaScripts]] * [[FeaturesReporters|Features-reporter-overview]] diff --git a/scripting_documentation/RosettaScripts/FeaturesReporter/features_reporters/_Sidebar.md b/scripting_documentation/RosettaScripts/FeaturesReporter/features_reporters/_Sidebar.md index 4faf9cccc..209569d19 100644 --- a/scripting_documentation/RosettaScripts/FeaturesReporter/features_reporters/_Sidebar.md +++ b/scripting_documentation/RosettaScripts/FeaturesReporter/features_reporters/_Sidebar.md @@ -14,6 +14,10 @@ * [[Filters|Filters-RosettaScripts]] + * [[Simple Metrics|SimpleMetrics]] + + * [[Ensemble Metrics|EnsembleMetrics]] + * [[Residue Selectors|ResidueSelectors]] * [[PackerPalettes|PackerPalette]] diff --git 
a/scripting_documentation/RosettaScripts/FeaturesReporter/rscripts/_Sidebar.md b/scripting_documentation/RosettaScripts/FeaturesReporter/rscripts/_Sidebar.md index 94af5a392..3800286b4 100644 --- a/scripting_documentation/RosettaScripts/FeaturesReporter/rscripts/_Sidebar.md +++ b/scripting_documentation/RosettaScripts/FeaturesReporter/rscripts/_Sidebar.md @@ -14,6 +14,10 @@ * [[Residue Selectors|ResidueSelectors]] + * [[Simple Metrics|SimpleMetrics]] + + * [[Ensemble Metrics|EnsembleMetrics]] + * [[PackerPalettes|PackerPalette]] * [[Filters|Filters-RosettaScripts]] diff --git a/scripting_documentation/RosettaScripts/Filters/_Sidebar.md b/scripting_documentation/RosettaScripts/Filters/_Sidebar.md index b37eb316e..3752c97cf 100644 --- a/scripting_documentation/RosettaScripts/Filters/_Sidebar.md +++ b/scripting_documentation/RosettaScripts/Filters/_Sidebar.md @@ -20,6 +20,8 @@ * [[Simple Metrics | SimpleMetrics]] + * [[Ensemble Metrics | EnsembleMetrics]] + * [[Filters|Filters-RosettaScripts]] * [[FeaturesReporters|Features-reporter-overview]] diff --git a/scripting_documentation/RosettaScripts/Movers/_Sidebar.md b/scripting_documentation/RosettaScripts/Movers/_Sidebar.md index b37eb316e..dd495fff9 100644 --- a/scripting_documentation/RosettaScripts/Movers/_Sidebar.md +++ b/scripting_documentation/RosettaScripts/Movers/_Sidebar.md @@ -20,6 +20,8 @@ * [[Simple Metrics | SimpleMetrics]] + * [[Ensemble Metrics|EnsembleMetrics]] + * [[Filters|Filters-RosettaScripts]] * [[FeaturesReporters|Features-reporter-overview]] diff --git a/scripting_documentation/RosettaScripts/SimpleMetrics/_Sidebar.md b/scripting_documentation/RosettaScripts/SimpleMetrics/_Sidebar.md index 19e142d78..e5e662775 100644 --- a/scripting_documentation/RosettaScripts/SimpleMetrics/_Sidebar.md +++ b/scripting_documentation/RosettaScripts/SimpleMetrics/_Sidebar.md @@ -14,6 +14,10 @@ * [[Residue Selectors|ResidueSelectors]] + * [[Simple Metrics|SimpleMetrics]] + + * [[Ensemble Metrics|EnsembleMetrics]] + 
* [[PackerPalettes|PackerPalette]] * [[Task Operations|TaskOperations-RosettaScripts]] diff --git a/scripting_documentation/RosettaScripts/TaskOperations/_Sidebar.md b/scripting_documentation/RosettaScripts/TaskOperations/_Sidebar.md index f41e9d5c9..d2c0e52c8 100644 --- a/scripting_documentation/RosettaScripts/TaskOperations/_Sidebar.md +++ b/scripting_documentation/RosettaScripts/TaskOperations/_Sidebar.md @@ -19,6 +19,8 @@ * [[Task Operations|TaskOperations-RosettaScripts]] * [[Simple Metrics | SimpleMetrics]] + + * [[Ensemble Metrics|EnsembleMetrics]] * [[Filters|Filters-RosettaScripts]] diff --git a/scripting_documentation/RosettaScripts/_Sidebar.md b/scripting_documentation/RosettaScripts/_Sidebar.md index b217aad09..003412fb8 100644 --- a/scripting_documentation/RosettaScripts/_Sidebar.md +++ b/scripting_documentation/RosettaScripts/_Sidebar.md @@ -22,6 +22,8 @@ * [[Simple Metrics | SimpleMetrics]] + * [[Ensemble Metrics|EnsembleMetrics]] + * [[Filters|Filters-RosettaScripts]] * [[FeaturesReporters|Features-reporter-overview]] diff --git a/scripting_documentation/RosettaScripts/composite_protocols/_Sidebar.md b/scripting_documentation/RosettaScripts/composite_protocols/_Sidebar.md index 33d6bcd71..35255fec7 100644 --- a/scripting_documentation/RosettaScripts/composite_protocols/_Sidebar.md +++ b/scripting_documentation/RosettaScripts/composite_protocols/_Sidebar.md @@ -20,6 +20,8 @@ * [[Simple Metrics | SimpleMetrics]] + * [[Ensemble Metrics|EnsembleMetrics]] + * [[Filters|Filters-RosettaScripts]] * [[Features Reporters|Features-reporter-overview]] From 9156829f734fefb725cc0fa9a2d005db07f7fb96 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Tue, 8 Feb 2022 16:22:50 -0500 Subject: [PATCH 02/33] Updating main RosettaScripts page. 
--- scripting_documentation/RosettaScripts/RosettaScripts.md | 1 + 1 file changed, 1 insertion(+) diff --git a/scripting_documentation/RosettaScripts/RosettaScripts.md b/scripting_documentation/RosettaScripts/RosettaScripts.md index 8162b50ea..87c20cb24 100644 --- a/scripting_documentation/RosettaScripts/RosettaScripts.md +++ b/scripting_documentation/RosettaScripts/RosettaScripts.md @@ -19,6 +19,7 @@ Fleishman SJ, Leaver-Fay A, Corn JE, Strauch EM, Khare SD, et al. (2011) Rosetta - [[JumpSelectors |JumpSelectors]] - [[PackerPalettes|PackerPalette]] - [[SimpleMetrics]] +- [[EnsembleMetrics]] --------------------- From 35cb71758edae4c2a29f3a6fd480a4e3a7c65867 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Tue, 8 Feb 2022 16:39:13 -0500 Subject: [PATCH 03/33] Adding page for CentralTendency metric. --- .../ensemble_metric_pages/CentralTendency.md | 40 +++++++++++++++++++ 1 file changed, 40 insertions(+) create mode 100644 scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md new file mode 100644 index 000000000..24f4c6c3b --- /dev/null +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md @@ -0,0 +1,40 @@ +# CentralTendency Ensemble Metric +*Back to [[SimpleMetrics]] page.* +## CentralTendency Ensemble Metric + +[[_TOC_]] + +### Description + +The Central Tendency metric accepts as input a real-valued [[SimpleMetric|SimpleMetrics]]. It then applies it to each pose in an ensemble, collecting a series of values. At reporting time, the metric computes measures of central tendency (mean, median, and mode), plus other descriptive statistics about the distribution of the measured value over the ensemble (standard deviation, standard error, min, max, range). 
+ +### Author and history + +Created Tuesday, 8 February 2022 by Vikram K. Mulligan, Center for Computational Biology, Flatiron Institute (vmulligan@flatironinstitute.org). This was the first [[EnsembleMetric|EnsembleMetrics]] implemented. + +### Interface + +[[include:ensemble_metric_CentralTendencyEnsembleMetric_type]] + +### Named values produced + +Measure | Name (used for the [[EnsembleFilter]]) | Description +--------|----------------------------------------|------------ +Mean | mean | The average of the values measured for the poses in the ensemble. +Median | median | When values measured from all of the poses in the ensemble are listed in increasing order, this is the middle value. If the number of poses in the ensemble is even, the middle two values are averaged. +Mode | mode | The most frequently seen value in the values measured from the poses in the ensemble. If more than one value appears with equal frequency and this frequency is highest, the values are averaged. +Standard Deviation | stddev | Estimate of the standard deviation of the measured values, defined as sqrt( sum_i( ( S_i - mean )^2 ) / N ), where S_i is the ith sample, mean is the average of all the samples, and N is the number of samples. +Standard Error | stderr | Estimate of the standard error of the mean, defined by stddev / sqrt(N), where N is the number of samples. +Min | min | The minimum value seen. +Max | max | The maximum value seen. +Range | range | The largest value seen minus the smallest. + +#### Note about mode + +The mode of a set of floating-point numbers can be thrown off by floating-point error. For instance, two poses may have energies of -3.7641 kJ/mol, but the process of computing that energy may result in slightly different values at the 15th decimal place. This would prevent the metric from recognizing this as the most frequent value. + +##See Also + +* [[SimpleMetrics]]: Available SimpleMetrics. +* [[EnsembleMetrics]]: Available EnsembleMetrics. 
+* [[I want to do x]]: Guide to choosing a tool in Rosetta. \ No newline at end of file From cc023e6f6f18990aa429c1cb4e7e4476aecfec84 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Tue, 8 Feb 2022 16:39:30 -0500 Subject: [PATCH 04/33] Updating auto-generated docs. --- .../RosettaScripts/xsd/filter_FragmentScoreFilter_type.md | 2 +- .../RosettaScripts/xsd/mover_ParsedProtocol_type.md | 5 +++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/scripting_documentation/RosettaScripts/xsd/filter_FragmentScoreFilter_type.md b/scripting_documentation/RosettaScripts/xsd/filter_FragmentScoreFilter_type.md index cf17faf3b..b556a2dfb 100644 --- a/scripting_documentation/RosettaScripts/xsd/filter_FragmentScoreFilter_type.md +++ b/scripting_documentation/RosettaScripts/xsd/filter_FragmentScoreFilter_type.md @@ -13,7 +13,7 @@ Filter based on any score that can be calculated in fragment_picker. outputs_name="(pose &string;)" csblast="(&string;)" blast_pgp="(&string;)" placeholder_seqs="(&string;)" sparks-x="(&string;)" sparks-x_query="(&string;)" psipred="(&string;)" - vall_path="(/scratch/benchmark/W.hojo-1/rosetta.Hojo-1/master/main/database//sampling/vall.jul19.2011.gz &string;)" + vall_path="(/Users/vmulligan/rosetta_git_workingcopy/Rosetta/main/tools/doc_tools/../../database//sampling/vall.jul19.2011.gz &string;)" frags_scoring_config="(&string;)" n_frags="(200 &non_negative_integer;)" n_candidates="(1000 &non_negative_integer;)" print_to_pdb="(false &xs:boolean;)" diff --git a/scripting_documentation/RosettaScripts/xsd/mover_ParsedProtocol_type.md b/scripting_documentation/RosettaScripts/xsd/mover_ParsedProtocol_type.md index 2a4c1874f..5284f7d94 100644 --- a/scripting_documentation/RosettaScripts/xsd/mover_ParsedProtocol_type.md +++ b/scripting_documentation/RosettaScripts/xsd/mover_ParsedProtocol_type.md @@ -11,8 +11,8 @@ This is a special mover that allows making a single compound mover and filter ve apply_probability="(ℜ)" resume_support="(false 
&bool;)" > + ensemble_metrics="(&string;)" apply_probability="(ℜ)" + report_at_end="(true &bool;)" never_rerun_filter="(false &bool;)" /> @@ -33,6 +33,7 @@ Subtag **Add**: The steps to be applied. - **filter**: The filter whose execution is desired - **metrics**: A comma-separated list of metrics to run at this point. - **labels**: A comma-separated list of labels to use for the provided metrics in the output. If empty/missing, use the metric names from the metrics setting. If '-', use the metric's default. +- **ensemble_metrics**: A comma-separated list of ensemble metrics to add at this point. Ensemble metrics will collect information about the pose at this point, and will later report statistics about the ensemble of poses that they have seen. - **apply_probability**: by default equal probability for all tags - **report_at_end**: Report filter value via filter re-evaluation on final pose after conclusion of protocol. Otherwise report filter value as evaluated mid-protocol. - **never_rerun_filter**: Never run this filter after the original apply-time run. Use this option to avoid expensive re-runs when reporting From a18aca64b55832e3a39da76a0c87a105ff946b05 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Tue, 8 Feb 2022 16:44:29 -0500 Subject: [PATCH 05/33] Adding auto-generated ensemble metric docs. 
--- .../ensemble_metric_CentralTendency_type.md | 25 +++++++++++++++++++ 1 file changed, 25 insertions(+) create mode 100644 scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md diff --git a/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md b/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md new file mode 100644 index 000000000..71e9363f8 --- /dev/null +++ b/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md @@ -0,0 +1,25 @@ + + +_Autogenerated Tag Syntax Documentation:_ + +--- +An ensemble metric that takes a real-valued simple metric, applies it to all poses in an ensemble, and calculates measures of central tendency (mean, median, mode) and other statistics about the distribution (standard deviation, standard error of the mean, min, max, range, etc.). Values that this ensemble metric returns are referred to in scripts as: mean, median, mode, stddev, stderr, min, max, and range. + +```xml + +``` + +- **label_prefix**: If provided, this prefix is prepended to the label for this ensemble metric (with an underscore after the prefix and before the ensemble metric name). +- **label_suffix**: If provided, this suffix is appended to the label for this ensemble metric (with an underscore after the ensemble metric name and before the suffix). +- **ensemble_generating_protocol**: An optional ParsedProtocol or other mover for generating an ensemble from the current pose. This protocol will be applied repeatedly (ensemble_generating_protocol_repeats times) to generate the ensemble of structures. Each generated pose will be measured by this metric, then discarded. The ensemble properties are then reported. If not provided, the current pose is measured and the report will be produced later (e.g. at termination with the JD2 rosetta_scripts application). 
+- **ensemble_generating_protocol_repeats**: The number of times that the ensemble_generating_protocol is applied. This is the maximum number of structures in the ensemble (though the actual number may be smaller if the protocol contains filters or movers that can fail for some attempts). Only used if an ensemble-generating protocol is provided with the ensemble_generating_protocol option. +- **n_threads**: The number of threads to request for generating ensembles in parallel. This is only used in multi-threaded compilations of Rosetta (compiled with extras=cxx11thread), and only when an ensemble-generating protocol is provided with the ensemble_generating_protocol option. A value of 0 means to use all available threads. In single-threaded builds, this must be set to 0 or 1. +- **use_additional_output_from_last_mover**: If true, this ensemble metric will use the additional output from the previous pose (assuming the previous pose generates multiple outputs) as the ensemble, analysing it and producing a report immediately. If false, then it will behave normally. False by default. +- **real_valued_metric**: The name of a real-valued simple metric defined previously. Required input. + +--- From 3e8cc7ff9f4af78f9362b1b625f6af3fcf940fa1 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Tue, 8 Feb 2022 16:57:25 -0500 Subject: [PATCH 06/33] Updating CentralTendency ensemble metric doc. 
--- .../EnsembleMetrics/ensemble_metric_pages/CentralTendency.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md index 24f4c6c3b..2a2b9678f 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md @@ -14,7 +14,7 @@ Created Tuesday, 8 February 2022 by Vikram K. Mulligan, Center for Computational ### Interface -[[include:ensemble_metric_CentralTendencyEnsembleMetric_type]] +[[include:ensemble_metric_CentralTendency_type]] ### Named values produced From 0e5918f8357deeab4dae4ab0e5b67ad6e53b25dc Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Wed, 9 Feb 2022 19:39:47 -0500 Subject: [PATCH 07/33] Working on documentation for EnsembleMetrics. --- .../EnsembleMetrics/EnsembleMetrics.md | 36 +++++++++++++++++++ 1 file changed, 36 insertions(+) create mode 100644 scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md new file mode 100644 index 000000000..63fb0f03d --- /dev/null +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -0,0 +1,36 @@ +# EnsembleMetrics +*Back to main [[RosettaScripts|RosettaScripts]] page.* + +Page created Wed, 9 February 2022 by Vikram K. Mulligan, Flatiron Institute (vmulligan@flatironinstitute.org). + +[[_TOC_]] + +## Description + +Just as [[SimpleMetrics]] measure some property of a pose, EnsembleMetrics measure some property of a group (or _ensemble_) of poses. They are designed to be used in two phases. 
In the _accumulation_ phase, an EnsembleMetric is applied to each pose in an ensemble in sequence, allowing it to store any relevant measurements from that pose that will later be needed to calculate properties of the ensemble. In the _reporting_ phase, the EnsembleMetric generates a report about the properties of the ensemble and writes this report to disk or to tracer. Following reporting, an EnsembleMetric may be _interrogated_ by such modules as the [[EnsembleFilter]], allowing retrieval of any floating-point values computed by the EnsembleMetric for filtering. Alternatively, the EnsembleMetric may be _reset_ for re-use (meaning that accumulated data, but not configuration settings, are wiped). + +## Usage modes + +EnsembleMetrics have three intended usage modes in [[RosettaScripts]]: + +Mode | Setup | Accumulation Phase | Reporting Phase | Subsequent Interrogation | Subsequent Resetting +---- | ----- | ------------------ | --------------- | ------------------------ | -------------------- +Basic accumulator mode | Added to a protocol at point of accumulation. | The EnsembleMetric is applied to each pose that the RosettaScripts script handles, in sequence. | The EnsembleMetric produces its report at termination of the RosettaScripts application. This report covers all poses seen during this RosettaScripts run. | None. | None. +Internal generation mode | Provided with a ParsedProtocol for generating the ensemble of poses from the input pose, and a number to generate. Added to protocol at point where ensemble should be generated from pose at that point. | Accumulates information about each pose in the ensemble it generates. Poses are then discaded. | The report is provided immediately once the ensemble has been generated. The script then continues with the input pose. | After reporting. | On next nstruct (repeat) or next job. +Multiple pose mover mode | Set to use input from a mover that produces many outputs (a [[MultiplePoseMover]]). 
Placed in script after such a mover. | Collects data from each pose produced by previous mover. | Reports immediately after collecting data on all poses produced by previous mover. The script then continues on. | After reporting. | On next nstruct (repeat) or next job. + +CONTINUE HEREs + +##Available EnsembleMetrics + +EnsembleMetric | Description +------------ | ------------- +**[[CentralTendency]]** | Takes a [[real-valued SimpleMetric|SimpleMetrics]], applies it to each pose in an ensemble, and returns measures of central tendency (mean, median, mode) and other measures of the distribution (standard deviation, standard error, etc.). + +##See Also + +* [[SimpleMetrics]]: Measure a property of a single pose. +* [[Filters|Filters-RosettaScripts]]: Filter on a measured feature of a pose. +* [[EnsembleFilter]]: Filter on a property of an ensemble of poses. +* [[Movers|Movers-RosettaScripts]]: Modify a pose. +* [[I want to do x]]: Guide to choosing a Rosetta protocol. \ No newline at end of file From b3730e1e00ebdf585f8f69139ce4ba576e73d55d Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Wed, 9 Feb 2022 19:56:34 -0500 Subject: [PATCH 08/33] Fleshing out EnsembleMetric documentation. --- .../EnsembleMetrics/EnsembleMetrics.md | 101 +++++++++++++++++- 1 file changed, 96 insertions(+), 5 deletions(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index 63fb0f03d..afee96cf7 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -9,6 +9,12 @@ Page created Wed, 9 February 2022 by Vikram K. Mulligan, Flatiron Institute (vmu Just as [[SimpleMetrics]] measure some property of a pose, EnsembleMetrics measure some property of a group (or _ensemble_) of poses. They are designed to be used in two phases. 
In the _accumulation_ phase, an EnsembleMetric is applied to each pose in an ensemble in sequence, allowing it to store any relevant measurements from that pose that will later be needed to calculate properties of the ensemble. In the _reporting_ phase, the EnsembleMetric generates a report about the properties of the ensemble and writes this report to disk or to tracer. Following reporting, an EnsembleMetric may be _interrogated_ by such modules as the [[EnsembleFilter]], allowing retrieval of any floating-point values computed by the EnsembleMetric for filtering. Alternatively, the EnsembleMetric may be _reset_ for re-use (meaning that accumulated data, but not configuration settings, are wiped). +##Available EnsembleMetrics + +EnsembleMetric | Description +------------ | ------------- +**[[CentralTendency]]** | Takes a [[real-valued SimpleMetric|SimpleMetrics]], applies it to each pose in an ensemble, and returns measures of central tendency (mean, median, mode) and other measures of the distribution (standard deviation, standard error, etc.). + ## Usage modes EnsembleMetrics have three intended usage modes in [[RosettaScripts]]: @@ -19,13 +25,98 @@ Basic accumulator mode | Added to a protocol at point of accumulation. | The Ens Internal generation mode | Provided with a ParsedProtocol for generating the ensemble of poses from the input pose, and a number to generate. Added to protocol at point where ensemble should be generated from pose at that point. | Accumulates information about each pose in the ensemble it generates. Poses are then discaded. | The report is provided immediately once the ensemble has been generated. The script then continues with the input pose. | After reporting. | On next nstruct (repeat) or next job. Multiple pose mover mode | Set to use input from a mover that produces many outputs (a [[MultiplePoseMover]]). Placed in script after such a mover. | Collects data from each pose produced by previous mover. 
| Reports immediately after collecting data on all poses produced by previous mover. The script then continues on. | After reporting. | On next nstruct (repeat) or next job. -CONTINUE HEREs +### Example of basic usage -##Available EnsembleMetrics +In this example, the input is a cyclic peptide. This script perturbs the peptide backbone, relaxes the peptide, and then applies a [[CentralTendency EnsembleMetric|CentralTendency]] that in turn applies a [[TotalEnergyMetric]], measuring total score. At the end of execution (after repeat execution, a number of times set with the `-nstruct` flag), the EnsembleMetric produces a report about the mean, median, mode, etc. of the samples. -EnsembleMetric | Description ------------- | ------------- -**[[CentralTendency]]** | Takes a [[real-valued SimpleMetric|SimpleMetrics]], applies it to each pose in an ensemble, and returns measures of central tendency (mean, median, mode) and other measures of the distribution (standard deviation, standard error, etc.). +```xml + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +``` + +### Example of internal generation mode + +TODO + +### Example of multiple pose mover mode + +TODO + +## Interrogating EnsembleMetric floating-point values by name + +## Note about running in MPI mode + +TODO ##See Also From 97d78921d5e8239bc7843bf0f2afd1f7dcd36883 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Wed, 9 Feb 2022 23:15:51 -0500 Subject: [PATCH 09/33] Updating auto-generated docs. 
--- .../ensemble_metric_CentralTendency_type.md | 5 ++++- .../xsd/filter_EnsembleFilter_type.md | 21 +++++++++++++++++++ 2 files changed, 25 insertions(+), 1 deletion(-) create mode 100644 scripting_documentation/RosettaScripts/xsd/filter_EnsembleFilter_type.md diff --git a/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md b/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md index 71e9363f8..833bb10a0 100644 --- a/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md +++ b/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md @@ -7,7 +7,8 @@ An ensemble metric that takes a real-valued simple metric, applies it to all pos ```xml + +_Autogenerated Tag Syntax Documentation:_ + +--- +A filter that filters based on some named float-valued property measured by an EnsembleMetric. Note that the value produced by the EnsembleMetric is based on an ensemble generated earlier in the protocol, presumably from the pose on which we are currently filtering. + +```xml + +``` + +- **ensemble_metric**: (REQUIRED) A previously-defined EnsembleMetric that produces at least one floating-point value. This filter will filter a pose based on that value. +- **named_value**: (REQUIRED) A named floating-point value produced by the EnsembleMetric, on which this filter will filter. +- **threshold**: The threshold for rejecting a pose. +- **filter_acceptance_mode**: The criterion for ACCEPTING a pose. For instance, if the value returned by the ensemble metric is greater than the threshold, and the mode is 'less_than_or_equal' (the default mode), then the pose is rejected. Allowed modes are: 'greater_than', 'less_than', 'greater_than_or_equal', 'less_than_or_equal', 'equal', and 'not_equal'. 
+- **confidence**: Probability that the pose will be filtered out if it does not pass this Filter + +--- From ca9eebdb51f0460004ccc1b43deaf4d0ca1244d2 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 15:57:07 -0500 Subject: [PATCH 10/33] Adding note about accessing named values. --- .../EnsembleMetrics/EnsembleMetrics.md | 46 +++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index afee96cf7..5c1483324 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -114,6 +114,52 @@ TODO ## Interrogating EnsembleMetric floating-point values by name +Each EnsembleMetric can return one or more floating-point values describing different features of the ensemble. Each of these has a name associated with it. + +### From C++ or Python code + +From C++ (or Python) code, after an EnsembleMetric produces its final report, these values can be interrogated with the `get_metric_by_name()` method. To see all names offered by a particular EnsembleMetric, call `real_valued_metric_names()`: + +```C++ + // Create an EnsembleMetric: + CentralTendency my_ensemble_metric; + // Configure this EnsembleMetric here. This particular + // example would require a SimpleMetric to be passed to + // it, though in general the setup for EnsembleMetrics + // will vary from EnsembleMetric subclass to subclass. + + for( core::Size i=1; i<=nstruct; ++i ) { + // Generate a pose here. + // ... 
+ + // Collect data from it: + my_ensemble_metric.apply( pose ); + } + + // Produce final report (to tracer or disk, + // depending on configuration): + my_ensemble_metric.produce_final_report(); + + // Get the names of floating point values + // that the EnsembleMetric has calculated: + utility::vector1< std::string > const value_names( + my_ensemble_metric.real_valued_metric_names() + ); + + // Confirm that "median" is a name of a value + // returned by this particular metric: + runtime_assert( value_names.has_value( "median" ) ); //This passes. + + // Get the median value from the ensemble: + core::Real const median_value( + my_ensemble_metric.get_metric_by_name( "median" ) + ); +``` + +### Using filters + +TODO + ## Note about running in MPI mode TODO From bbfdb09d097f6de0fea8de16107f577d9704b2b6 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 16:08:53 -0500 Subject: [PATCH 11/33] Adding note about filtering. --- .../RosettaScripts/EnsembleMetrics/EnsembleMetrics.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index 5c1483324..773b568a9 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -158,7 +158,11 @@ From C++ (or Python) code, after an EnsembleMetric produces its final report, th ### Using filters -TODO +In RosettaScripts (or in PyRosetta or even C++ code), when an EnsembleMetric is used in internal generator mode or multiple pose mover mode (_i.e._ it applies itself to an ensemble of poses that it either generates internally or receives from a previous mover) a subsequent [[EnsembleFilter]] may be used to interrogate a named value computed by the EnsembleMetric, and to cause the protocol to pass or fail depending on that property of the 
ensemble. + +Why would someone want to do this? One example would be if one wanted to write a script that would design a protein, generate for each design a conformational ensemble, and score the propensity to favour the designed state (_e.g._ with the planned [[PNear]] EnsembleMetric), then discard those designs that have poor propensith to favour the designed state based on the ensemble analysis. This would ensure that one could produce thousands or tens of thousands of designs in memory, analyze them all, and only write to disk the ones worth carrying forward. Other similar usage patterns are possible. + +For more information, see the page for the [[EnsembleFilter]]. ## Note about running in MPI mode From 9dd6e8930753870e202fc3a914fbb9507ef7d06d Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 16:11:27 -0500 Subject: [PATCH 12/33] Revising text slightly. --- .../RosettaScripts/EnsembleMetrics/EnsembleMetrics.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index 773b568a9..9b48cb12c 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -160,7 +160,7 @@ From C++ (or Python) code, after an EnsembleMetric produces its final report, th In RosettaScripts (or in PyRosetta or even C++ code), when an EnsembleMetric is used in internal generator mode or multiple pose mover mode (_i.e._ it applies itself to an ensemble of poses that it either generates internally or receives from a previous mover) a subsequent [[EnsembleFilter]] may be used to interrogate a named value computed by the EnsembleMetric, and to cause the protocol to pass or fail depending on that property of the ensemble. -Why would someone want to do this? 
One example would be if one wanted to write a script that would design a protein, generate for each design a conformational ensemble, and score the propensity to favour the designed state (_e.g._ with the planned [[PNear]] EnsembleMetric), then discard those designs that have poor propensith to favour the designed state based on the ensemble analysis. This would ensure that one could produce thousands or tens of thousands of designs in memory, analyze them all, and only write to disk the ones worth carrying forward. Other similar usage patterns are possible. +Why would someone want to do this? One example would be if one wanted to write a script that would design a protein, generate for each design a conformational ensemble, and score the propensity to favour the designed conformation (_e.g._ with the planned [[PNear]] EnsembleMetric), then discard those designs that have poor propensity to favour the designed state based on the ensemble analysis. This would ensure that one could produce thousands or tens of thousands of designs in memory, analyze them all, and only write to disk the ones worth carrying forward. Variant patterns include generating initial designs using a low-cost initial design protocol, doing moderate-cost ensemble analysis, discarding poor designs with the EnsembleFilter, and refining those designs that pass the filter using higher-cost refinement protocols. Other similar usage patterns are possible. For more information, see the page for the [[EnsembleFilter]]. From 9629b4b3a5db9f5af38ca7581c2c0b16aad4620d Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 16:16:42 -0500 Subject: [PATCH 13/33] Adding note about MPI mode. 
--- .../RosettaScripts/EnsembleMetrics/EnsembleMetrics.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index 9b48cb12c..b53f94158 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -166,9 +166,9 @@ For more information, see the page for the [[EnsembleFilter]]. ## Note about running in MPI mode -TODO +Note that EnsembleMetrics that run in different MPI processes cannot share information about the different poses that they have seen at present. This means that they will produce reports about only the ensemble of poses that they have seen _in their own MPI process_. They can still be used in MPI mode to analyse different ensembles in each MPI process. Support for generating giant ensembles by MPI and analysing them with EnsembleMetrics is planned for the future. -##See Also +## See Also * [[SimpleMetrics]]: Measure a property of a single pose. * [[Filters|Filters-RosettaScripts]]: Filter on a measured feature of a pose. From 33019004e61376e3a11572706b34140d5b178fa6 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 18:28:56 -0500 Subject: [PATCH 14/33] Adding example of internal generation mode. 
--- .../EnsembleMetrics/EnsembleMetrics.md | 122 +++++++++++++++++- 1 file changed, 116 insertions(+), 6 deletions(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index b53f94158..f89a2aa9a 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -27,7 +27,7 @@ Multiple pose mover mode | Set to use input from a mover that produces many outp ### Example of basic usage -In this example, the input is a cyclic peptide. This script perturbs the peptide backbone, relaxes the peptide, and then applies a [[CentralTendency EnsembleMetric|CentralTendency]] that in turn applies a [[TotalEnergyMetric]], measuring total score. At the end of execution (after repeat execution, a number of times set with the `-nstruct` flag), the EnsembleMetric produces a report about the mean, median, mode, etc. of the samples. +In this example, the input is a cyclic peptide (provided with the `-in:file:s` commandline option). This script perturbs the peptide backbone, relaxes the peptide, and then applies a [[CentralTendency EnsembleMetric|CentralTendency]] that in turn applies a [[TotalEnergyMetric]], measuring total score. At the end of execution (after repeat execution, a number of times set with the `-nstruct` commandline option), the EnsembleMetric produces a report about the mean, median, mode, etc. of the samples. ```xml @@ -38,7 +38,9 @@ In this example, the input is a cyclic peptide. This script perturbs the peptid - + @@ -59,7 +61,9 @@ In this example, the input is a cyclic peptide. This script perturbs the peptid - + @@ -96,17 +100,123 @@ In this example, the input is a cyclic peptide. 
This script perturbs the peptid - + - + ``` ### Example of internal generation mode -TODO +This example is similar to the example above, only this time, we load one or more cyclic peptides (provided with the `-in:file:s` or `-in:file:l` commandline options), generate a conformational ensemble for each peptide _in memory_, without writing all structures to disk, and perform ensemble analysis on that ensemble, filtering on the results with the [[EnsembleMetric]]. + +```xml + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +``` ### Example of multiple pose mover mode From a5778bb6f493572edf75aebc22242ff2e49d86ba Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 18:38:22 -0500 Subject: [PATCH 15/33] Adding note about multithreading. --- .../EnsembleMetrics/EnsembleMetrics.md | 26 +++++++++++-------- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index f89a2aa9a..f9dd81b62 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -5,7 +5,7 @@ Page created Wed, 9 February 2022 by Vikram K. Mulligan, Flatiron Institute (vmu [[_TOC_]] -## Description +## 1. Description Just as [[SimpleMetrics]] measure some property of a pose, EnsembleMetrics measure some property of a group (or _ensemble_) of poses. They are designed to be used in two phases. In the _accumulation_ phase, an EnsembleMetric is applied to each pose in an ensemble in sequence, allowing it to store any relevant measurements from that pose that will later be needed to calculate properties of the ensemble. 
In the _reporting_ phase, the EnsembleMetric generates a report about the properties of the ensemble and writes this report to disk or to tracer. Following reporting, an EnsembleMetric may be _interrogated_ by such modules as the [[EnsembleFilter]], allowing retrieval of any floating-point values computed by the EnsembleMetric for filtering. Alternatively, the EnsembleMetric may be _reset_ for re-use (meaning that accumulated data, but not configuration settings, are wiped). @@ -15,7 +15,7 @@ EnsembleMetric | Description ------------ | ------------- **[[CentralTendency]]** | Takes a [[real-valued SimpleMetric|SimpleMetrics]], applies it to each pose in an ensemble, and returns measures of central tendency (mean, median, mode) and other measures of the distribution (standard deviation, standard error, etc.). -## Usage modes +## 2. Usage modes EnsembleMetrics have three intended usage modes in [[RosettaScripts]]: @@ -25,7 +25,7 @@ Basic accumulator mode | Added to a protocol at point of accumulation. | The Ens Internal generation mode | Provided with a ParsedProtocol for generating the ensemble of poses from the input pose, and a number to generate. Added to protocol at point where ensemble should be generated from pose at that point. | Accumulates information about each pose in the ensemble it generates. Poses are then discaded. | The report is provided immediately once the ensemble has been generated. The script then continues with the input pose. | After reporting. | On next nstruct (repeat) or next job. Multiple pose mover mode | Set to use input from a mover that produces many outputs (a [[MultiplePoseMover]]). Placed in script after such a mover. | Collects data from each pose produced by previous mover. | Reports immediately after collecting data on all poses produced by previous mover. The script then continues on. | After reporting. | On next nstruct (repeat) or next job. 
-### Example of basic usage +### 2.1 Example of basic usage In this example, the input is a cyclic peptide (provided with the `-in:file:s` commandline option). This script perturbs the peptide backbone, relaxes the peptide, and then applies a [[CentralTendency EnsembleMetric|CentralTendency]] that in turn applies a [[TotalEnergyMetric]], measuring total score. At the end of execution (after repeat execution, a number of times set with the `-nstruct` commandline option), the EnsembleMetric produces a report about the mean, median, mode, etc. of the samples. @@ -110,7 +110,7 @@ In this example, the input is a cyclic peptide (provided with the `-in:file:s` c ``` -### Example of internal generation mode +### 2.2 Example of internal generation mode This example is similar to the example above, only this time, we load one or more cyclic peptides (provided with the `-in:file:s` or `-in:file:l` commandline options), generate a conformational ensemble for each peptide _in memory_, without writing all structures to disk, and perform ensemble analysis on that ensemble, filtering on the results with the [[EnsembleMetric]]. @@ -218,15 +218,19 @@ This example is similar to the example above, only this time, we load one or mor ``` -### Example of multiple pose mover mode +#### 2.2.1 Multi-threading + +When used in internal generation mode, the EnsembleMetric can generate members of the ensemble in [[parallel threads|Multithreading]]. This uses the [[RosettaThreadManager]], assigning work to available threads up to a user-specied maximum number to request. To set the maximum number of threads to request, use the `n_threads` option (where a setting of zero means to request all available threads). This functionality is only available in multi-threaded builds of Rosetta (built using `extras=cxx11thread` in the `scons` command), and requires that the total number of Rosetta threads be set at the command line using the `-multithreading:total_threads` commandline option. 
Note that an EnsembleMetric may be assigned fewer than the requested number of threads if other modules are using threads; at a minimum, it is guaranteed to be assigned the calling thread. + +### 2.3 Example of multiple pose mover mode TODO -## Interrogating EnsembleMetric floating-point values by name +## 3. Interrogating EnsembleMetric floating-point values by name Each EnsembleMetric can return one or more floating-point values describing different features of the ensemble. Each of these has a name associated with it. -### From C++ or Python code +### 3.1 From C++ or Python code From C++ (or Python) code, after an EnsembleMetric produces its final report, these values can be interrogated with the `get_metric_by_name()` method. To see all names offered by a particular EnsembleMetric, call `real_valued_metric_names()`: @@ -266,19 +270,19 @@ From C++ (or Python) code, after an EnsembleMetric produces its final report, th ); ``` -### Using filters +### 3.2 Using filters In RosettaScripts (or in PyRosetta or even C++ code), when an EnsembleMetric is used in internal generator mode or multiple pose mover mode (_i.e._ it applies itself to an ensemble of poses that it either generates internally or receives from a previous mover) a subsequent [[EnsembleFilter]] may be used to interrogate a named value computed by the EnsembleMetric, and to cause the protocol to pass or fail depending on that property of the ensemble. Why would someone want to do this? One example would be if one wanted to write a script that would design a protein, generate for each design a conformational ensemble, and score the propensity to favour the designed conformation (_e.g._ with the planned [[PNear]] EnsembleMetric), then discard those designs that have poor propensity to favour the designed state based on the ensemble analysis. This would ensure that one could produce thousands or tens of thousands of designs in memory, analyze them all, and only write to disk the ones worth carrying forward. 
Variant patterns include generating initial designs using a low-cost initial design protocol, doing moderate-cost ensemble analysis, discarding poor designs with the EnsembleFilter, and refining those designs that pass the filter using higher-cost refinement protocols. Other similar usage patterns are possible. -For more information, see the page for the [[EnsembleFilter]]. +Note that if one simply wants the value produced by the EnsembleMetric to be recorded in the pose, the EnsembleFilter can be used for that purpose as well by setting `confidence="0"` (so that the filter never rejects anything, but only reports). At some point, a SimpleMetric may be written for that purpose. For more information, see the page for the [[EnsembleFilter]]. -## Note about running in MPI mode +## 4. Note about running in MPI mode Note that EnsembleMetrics that run in different MPI processes cannot share information about the different poses that they have seen at present. This means that they will produce reports about only the ensemble of poses that they have seen _in their own MPI process_. They can still be used in MPI mode to analyse different ensembles in each MPI process. Support for generating giant ensembles by MPI and analysing them with EnsembleMetrics is planned for the future. -## See Also +## 5. See Also * [[SimpleMetrics]]: Measure a property of a single pose. * [[Filters|Filters-RosettaScripts]]: Filter on a measured feature of a pose. From a04596251da283a4aac16a07d9165cc84fd25f03 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 21:48:06 -0500 Subject: [PATCH 16/33] Updating note about multi-threading. 
--- .../RosettaScripts/EnsembleMetrics/EnsembleMetrics.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index f9dd81b62..a1c596109 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -220,7 +220,7 @@ This example is similar to the example above, only this time, we load one or mor #### 2.2.1 Multi-threading -When used in internal generation mode, the EnsembleMetric can generate members of the ensemble in [[parallel threads|Multithreading]]. This uses the [[RosettaThreadManager]], assigning work to available threads up to a user-specied maximum number to request. To set the maximum number of threads to request, use the `n_threads` option (where a setting of zero means to request all available threads). This functionality is only available in multi-threaded builds of Rosetta (built using `extras=cxx11thread` in the `scons` command), and requires that the total number of Rosetta threads be set at the command line using the `-multithreading:total_threads` commandline option. Note that an EnsembleMetric may be assigned fewer than the requested number of threads if other modules are using threads; at a minimum, it is guaranteed to be assigned the calling thread. +When used in internal generation mode, the EnsembleMetric can generate members of the ensemble in [[parallel threads|Multithreading]]. This uses the [[RosettaThreadManager]], assigning work to available threads up to a user-specied maximum number to request. To set the maximum number of threads to request, use the `n_threads` option (where a setting of zero means to request all available threads). 
This functionality is only available in multi-threaded builds of Rosetta (built using `extras=cxx11thread` in the `scons` command), and requires that the total number of Rosetta threads be set at the command line using the `-multithreading:total_threads` commandline option. Note that an EnsembleMetric may be assigned fewer than the requested number of threads if other modules are using threads; at a minimum, it is guaranteed to be assigned the calling thread. **Note: this is a _highly_ experimental feature that can fail for many ensemble-generating protocols. When in doubt, it is safest to set `n_threads` to 1 (the default) for an EnsembleMetric.** ### 2.3 Example of multiple pose mover mode From eb93e85d9bf1858897fc981c174639468d8dc925 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 22:53:08 -0500 Subject: [PATCH 17/33] Adding example for mode 3. --- .../EnsembleMetrics/EnsembleMetrics.md | 56 ++++++++++++++++++- 1 file changed, 55 insertions(+), 1 deletion(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index a1c596109..c9fc29d57 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -224,7 +224,61 @@ When used in internal generation mode, the EnsembleMetric can generate members o ### 2.3 Example of multiple pose mover mode -TODO +The following example uses the [[BundleGridSampler]] mover to grid-sample helical bundle conformations parametrically. For each conformation sampled, the protocol then uses the [[Disulfidize]] mover to generate all possible disulfides joining the helices as an ensemble of poses. It then computes the median disulfide pair energy, and discards conformations for which this energy is above a cutoff. 
+ +```xml + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +``` ## 3. Interrogating EnsembleMetric floating-point values by name From 62a915844db80b2e5fb7a2fca23de4e324eca579 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 22:55:37 -0500 Subject: [PATCH 18/33] Add EnsembleFilter docs to filter list. --- .../RosettaScripts/Filters/Filters-RosettaScripts.md | 1 + 1 file changed, 1 insertion(+) diff --git a/scripting_documentation/RosettaScripts/Filters/Filters-RosettaScripts.md b/scripting_documentation/RosettaScripts/Filters/Filters-RosettaScripts.md index 5beb4a849..8518db1fe 100644 --- a/scripting_documentation/RosettaScripts/Filters/Filters-RosettaScripts.md +++ b/scripting_documentation/RosettaScripts/Filters/Filters-RosettaScripts.md @@ -37,6 +37,7 @@ Filter | Description **[[CompoundStatement|CompoundStatementFilter]]** | Uses previously defined filters with logical operations to construct a compound filter. **[[CombinedValue|CombinedValueFilter]]** | Weighted sum of multiple filters. **[[CalculatorFilter]]** | Combine multiple filters with a mathematical expression. +**[[EnsembleFilter]]** | Filter based, not on a property of a single pose, but on a property of an _ensemble_ of many poses. **[[ReplicateFilter]]** | Repeat a filter multiple times and average. **[[Boltzmann|BoltzmannFilter]]** | Boltzmann weighted sum of positive/negative filters. **[[MoveBeforeFilter]]** | Apply a mover before applying the filter. From d38608f9c01f68b2ea717778ad2f9a3941f2c252 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 22:59:47 -0500 Subject: [PATCH 19/33] Moving some filters that were in the wrong folder. 
--- .../Filters/{ => filter_pages}/AlignmentAAFinderFilter.md | 0 .../Filters/{ => filter_pages}/AlignmentGapInserterFilter.md | 0 .../RosettaScripts/Filters/{ => filter_pages}/ChainBreakFilter.md | 0 .../RosettaScripts/Filters/{ => filter_pages}/FragQualFilter.md | 0 .../Filters/{ => filter_pages}/FragmentScoreFilter.md | 0 .../Filters/{ => filter_pages}/HelixHelixAngleFilter.md | 0 .../RosettaScripts/Filters/{ => filter_pages}/HolesFilter.md | 0 .../{ => filter_pages}/LongestContinuousApolarSegmentFilter.md | 0 .../Filters/{ => filter_pages}/MPSpanAngleFilter.md | 0 .../Filters/{ => filter_pages}/SequenceDistanceFilter.md | 0 .../Filters/{ => filter_pages}/SpanTopologyMatchPoseFilter.md | 0 .../RosettaScripts/Filters/{ => filter_pages}/TMsAACompFilter.md | 0 12 files changed, 0 insertions(+), 0 deletions(-) rename scripting_documentation/RosettaScripts/Filters/{ => filter_pages}/AlignmentAAFinderFilter.md (100%) rename scripting_documentation/RosettaScripts/Filters/{ => filter_pages}/AlignmentGapInserterFilter.md (100%) rename scripting_documentation/RosettaScripts/Filters/{ => filter_pages}/ChainBreakFilter.md (100%) rename scripting_documentation/RosettaScripts/Filters/{ => filter_pages}/FragQualFilter.md (100%) rename scripting_documentation/RosettaScripts/Filters/{ => filter_pages}/FragmentScoreFilter.md (100%) rename scripting_documentation/RosettaScripts/Filters/{ => filter_pages}/HelixHelixAngleFilter.md (100%) rename scripting_documentation/RosettaScripts/Filters/{ => filter_pages}/HolesFilter.md (100%) rename scripting_documentation/RosettaScripts/Filters/{ => filter_pages}/LongestContinuousApolarSegmentFilter.md (100%) rename scripting_documentation/RosettaScripts/Filters/{ => filter_pages}/MPSpanAngleFilter.md (100%) rename scripting_documentation/RosettaScripts/Filters/{ => filter_pages}/SequenceDistanceFilter.md (100%) rename scripting_documentation/RosettaScripts/Filters/{ => filter_pages}/SpanTopologyMatchPoseFilter.md (100%) rename 
scripting_documentation/RosettaScripts/Filters/{ => filter_pages}/TMsAACompFilter.md (100%) diff --git a/scripting_documentation/RosettaScripts/Filters/AlignmentAAFinderFilter.md b/scripting_documentation/RosettaScripts/Filters/filter_pages/AlignmentAAFinderFilter.md similarity index 100% rename from scripting_documentation/RosettaScripts/Filters/AlignmentAAFinderFilter.md rename to scripting_documentation/RosettaScripts/Filters/filter_pages/AlignmentAAFinderFilter.md diff --git a/scripting_documentation/RosettaScripts/Filters/AlignmentGapInserterFilter.md b/scripting_documentation/RosettaScripts/Filters/filter_pages/AlignmentGapInserterFilter.md similarity index 100% rename from scripting_documentation/RosettaScripts/Filters/AlignmentGapInserterFilter.md rename to scripting_documentation/RosettaScripts/Filters/filter_pages/AlignmentGapInserterFilter.md diff --git a/scripting_documentation/RosettaScripts/Filters/ChainBreakFilter.md b/scripting_documentation/RosettaScripts/Filters/filter_pages/ChainBreakFilter.md similarity index 100% rename from scripting_documentation/RosettaScripts/Filters/ChainBreakFilter.md rename to scripting_documentation/RosettaScripts/Filters/filter_pages/ChainBreakFilter.md diff --git a/scripting_documentation/RosettaScripts/Filters/FragQualFilter.md b/scripting_documentation/RosettaScripts/Filters/filter_pages/FragQualFilter.md similarity index 100% rename from scripting_documentation/RosettaScripts/Filters/FragQualFilter.md rename to scripting_documentation/RosettaScripts/Filters/filter_pages/FragQualFilter.md diff --git a/scripting_documentation/RosettaScripts/Filters/FragmentScoreFilter.md b/scripting_documentation/RosettaScripts/Filters/filter_pages/FragmentScoreFilter.md similarity index 100% rename from scripting_documentation/RosettaScripts/Filters/FragmentScoreFilter.md rename to scripting_documentation/RosettaScripts/Filters/filter_pages/FragmentScoreFilter.md diff --git 
a/scripting_documentation/RosettaScripts/Filters/HelixHelixAngleFilter.md b/scripting_documentation/RosettaScripts/Filters/filter_pages/HelixHelixAngleFilter.md similarity index 100% rename from scripting_documentation/RosettaScripts/Filters/HelixHelixAngleFilter.md rename to scripting_documentation/RosettaScripts/Filters/filter_pages/HelixHelixAngleFilter.md diff --git a/scripting_documentation/RosettaScripts/Filters/HolesFilter.md b/scripting_documentation/RosettaScripts/Filters/filter_pages/HolesFilter.md similarity index 100% rename from scripting_documentation/RosettaScripts/Filters/HolesFilter.md rename to scripting_documentation/RosettaScripts/Filters/filter_pages/HolesFilter.md diff --git a/scripting_documentation/RosettaScripts/Filters/LongestContinuousApolarSegmentFilter.md b/scripting_documentation/RosettaScripts/Filters/filter_pages/LongestContinuousApolarSegmentFilter.md similarity index 100% rename from scripting_documentation/RosettaScripts/Filters/LongestContinuousApolarSegmentFilter.md rename to scripting_documentation/RosettaScripts/Filters/filter_pages/LongestContinuousApolarSegmentFilter.md diff --git a/scripting_documentation/RosettaScripts/Filters/MPSpanAngleFilter.md b/scripting_documentation/RosettaScripts/Filters/filter_pages/MPSpanAngleFilter.md similarity index 100% rename from scripting_documentation/RosettaScripts/Filters/MPSpanAngleFilter.md rename to scripting_documentation/RosettaScripts/Filters/filter_pages/MPSpanAngleFilter.md diff --git a/scripting_documentation/RosettaScripts/Filters/SequenceDistanceFilter.md b/scripting_documentation/RosettaScripts/Filters/filter_pages/SequenceDistanceFilter.md similarity index 100% rename from scripting_documentation/RosettaScripts/Filters/SequenceDistanceFilter.md rename to scripting_documentation/RosettaScripts/Filters/filter_pages/SequenceDistanceFilter.md diff --git a/scripting_documentation/RosettaScripts/Filters/SpanTopologyMatchPoseFilter.md 
b/scripting_documentation/RosettaScripts/Filters/filter_pages/SpanTopologyMatchPoseFilter.md similarity index 100% rename from scripting_documentation/RosettaScripts/Filters/TMsAACompFilter.md rename to scripting_documentation/RosettaScripts/Filters/filter_pages/TMsAACompFilter.md From 11215f250ac63370b3d33720a5bf5e6c7fb8f4ff Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 23:12:25 -0500 Subject: [PATCH 20/33] Adding documentation for EnsembleFilter. --- .../Filters/filter_pages/EnsembleFilter.md | 132 ++++++++++++++++++ 1 file changed, 132 insertions(+) create mode 100644 scripting_documentation/RosettaScripts/Filters/filter_pages/EnsembleFilter.md diff --git a/scripting_documentation/RosettaScripts/Filters/filter_pages/EnsembleFilter.md b/scripting_documentation/RosettaScripts/Filters/filter_pages/EnsembleFilter.md new file mode 100644 index 000000000..7dfbda089 --- /dev/null +++ b/scripting_documentation/RosettaScripts/Filters/filter_pages/EnsembleFilter.md @@ -0,0 +1,132 @@ +# EnsembleFilter +*Back to [[SimpleMetrics]] page.* +*Back to [[Filters | Filters-RosettaScripts]] page.* +## EnsembleFilter + +Created by Vikram K. Mulligan (vmulligan@flatironinstitute.org) on 10 February 2022. + +[[_TOC_]] + +### Description + +This filter takes as input an [[EnsembleMetric|EnsembleMetrics]] that has been used to evaluate some set of properties of an ensemble of poses, retrieves a named floating-point value from the metric, and filters based on whether that value is greater than, equal to, or less than some threshold.
(Note that [[EnsembleMetrics]] evaluate a property of a collection or _ensemble_ poses, not of a single pose. This makes this filter unusual: where most discard a trajectory based on the state of a single pose, this can discard a trajectory based on the state of large ensemble of poses -- for example, based on many sampled conformatinos of a single design.) + + +### Options + +[[include:filter_SimpleMetricFilter_type]] + +### Example: + +In this example, we load one or more cyclic peptides (provided with the `-in:file:s` or `-in:file:l` commandline options), generate a conformational ensemble of slightly perturbed conformations for each peptide _in memory_, without writing all structures to disk, and perform ensemble analysis on that ensemble with the [[CentralTendency EnsembleMetric|CentralTendency]], filtering on the results with the EnsembleFilter. Only those peptides that have low-energy ensembles of perturbed conformations pass the filter. + +```xml + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +``` + +### See also + +* [[EnsembleMetrics]]: Available SimpleMetrics +* [[SimpleMetrics]]: Available SimpleMetrics +* [[SimpleMetricFilter]]: Filter on an arbitrary SimpleMetric +* [[Movers|Movers-RosettaScripts]]: Available Movers +* [[I want to do x]]: Guide to choosing a Rosetta protocol. \ No newline at end of file From b7e9ef79447b69e504b0ca3e0f2f8c1add90e04a Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 23:13:08 -0500 Subject: [PATCH 21/33] Minor typos. 
--- .../RosettaScripts/EnsembleMetrics/EnsembleMetrics.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index c9fc29d57..d9559cf96 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -112,7 +112,7 @@ In this example, the input is a cyclic peptide (provided with the `-in:file:s` c ### 2.2 Example of internal generation mode -This example is similar to the example above, only this time, we load one or more cyclic peptides (provided with the `-in:file:s` or `-in:file:l` commandline options), generate a conformational ensemble for each peptide _in memory_, without writing all structures to disk, and perform ensemble analysis on that ensemble, filtering on the results with the [[EnsembleMetric]]. +This example is similar to the example above, only this time, we load one or more cyclic peptides (provided with the `-in:file:s` or `-in:file:l` commandline options), generate a conformational ensemble for each peptide _in memory_, without writing all structures to disk, and perform ensemble analysis on that ensemble, filtering on the results with the [[EnsembleFilter]]. ```xml @@ -215,7 +215,6 @@ This example is similar to the example above, only this time, we load one or mor - ``` #### 2.2.1 Multi-threading From fc6e6749a4cd0c7f0f819f7380275f24100fdc7e Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 23:15:41 -0500 Subject: [PATCH 22/33] Expanding note about mode. 
--- .../EnsembleMetrics/ensemble_metric_pages/CentralTendency.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md index 2a2b9678f..80393e4f0 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md @@ -31,7 +31,7 @@ Range | range | the largest value seen minus the smallest. #### Note about mode -The mode of a set of floating-point numbers can be thrown off by floating-point error. For instance, two poses may have energies of -3.7641 kJ/mol, but the process of computing that energy may result in slightly different values at the 15th decimal point. This would prevent the filter from recognizing this is at the most frequent value. +The mode of a set of floating-point numbers can be thrown off by floating-point error. For instance, two poses may have energies of -3.7641 kJ/mol, but the process of computing that energy may result in slightly different values at the 15th decimal point. This could prevent the filter from recognizing this is at the most frequent value. Mode is most useful as a metric when the "floating-point" values are actually integers (for instance, given a [[SimpleMetric|SimpleMetrics]] like the [[SelectedResidueCountMetric]], which returns integer counts). ##See Also From 08dc81e128e6259038d411f636a7159d5150169f Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Feb 2022 23:23:36 -0500 Subject: [PATCH 23/33] Minor tweak. 
--- .../RosettaScripts/EnsembleMetrics/EnsembleMetrics.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index d9559cf96..3f30a2cf1 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -288,8 +288,10 @@ Each EnsembleMetric can return one or more floating-point values describing diff From C++ (or Python) code, after an EnsembleMetric produces its final report, these values can be interrogated with the `get_metric_by_name()` method. To see all names offered by a particular EnsembleMetric, call `real_valued_metric_names()`: ```C++ + // C++ pseudo-code: + // Create an EnsembleMetric: - CentralTendency my_ensemble_metric; + CentralTendencyEnsembleMetric my_ensemble_metric; // Configure this EnsembleMetric here. This particular // example would require a SimpleMetric to be passed to // it, though in general the setup for EnsembleMetrics @@ -341,4 +343,4 @@ Note that EnsembleMetrics that run in different MPI processes cannot share infor * [[Filters|Filters-RosettaScripts]]: Filter on a measured feature of a pose. * [[EnsembleFilter]]: Filter on a property of an ensemble of poses. * [[Movers|Movers-RosettaScripts]]: Modify a pose. -* [[I want to do x]]: Guide to choosing a Rosetta protocol. \ No newline at end of file +* [[I want to do x]]: Guide to choosing a Rosetta protocol. From af0cded938bd0459692a7791ca8df89a1d975c65 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 24 Feb 2022 19:39:23 -0500 Subject: [PATCH 24/33] Updating note about MPI. 
--- .../EnsembleMetrics/EnsembleMetrics.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index 3f30a2cf1..0340f35ca 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -11,9 +11,9 @@ Just as [[SimpleMetrics]] measure some property of a pose, EnsembleMetrics measu ##Available EnsembleMetrics -EnsembleMetric | Description ------------- | ------------- -**[[CentralTendency]]** | Takes a [[real-valued SimpleMetric|SimpleMetrics]], applies it to each pose in an ensemble, and returns measures of central tendency (mean, median, mode) and other measures of the distribution (standard deviation, standard error, etc.). +EnsembleMetric | Description | MPI support? +-------------- | ----------- | ------------ +**[[CentralTendency]]** | Takes a [[real-valued SimpleMetric|SimpleMetrics]], applies it to each pose in an ensemble, and returns measures of central tendency (mean, median, mode) and other measures of the distribution (standard deviation, standard error, etc.). | YES ## 2. Usage modes @@ -335,7 +335,11 @@ Note that if one simply wants the value produced by the EnsembleMetric to be rec ## 4. Note about running in MPI mode -Note that EnsembleMetrics that run in different MPI processes cannot share information about the different poses that they have seen at present. This means that they will produce reports about only the ensemble of poses that they have seen _in their own MPI process_. They can still be used in MPI mode to analyse different ensembles in each MPI process. Support for generating giant ensembles by MPI and analysing them with EnsembleMetrics is planned for the future. 
+The [[Message Passing Interface (MPI)|MPI]] permits massively parallel execution of a Rosetta protocol. If an EnsembleMetric is used in basic mode (Section 2.1) using the [[MPI build|Build-Documentation]] of Rosetta, all poses seen _by all processes_ are considered part of the ensemble that is being analysed. At the end of the protocol, all of the instances of the EnsembleMetric on worker processes will report back to the director process with the measurements needed to allow the director process to perform the analysis on the whole ensemble. This can be convenient for rapidly analysing very large ensembles generated in memory across a large cluster, without needing to write thousands or millions of structures to disk. This functionality is currently only available in the [[JD2]] version of the [[RosettaScripts]] application, and only when the [[MPIWorkPoolJobDistributor|JD2]] (the default MPI JD2 job distributor) is used. Support for [[JD3|RosettaScripts-JD3]] is planned. + +Note that EnsembleMetrics that run in different MPI processes, and which generate ensembles internally using either a generating protocol (Section 2.2) or a multiple pose mover (Section 2.3), report immediately on the ensemble seen locally _in that process_. In this case, no information is shared between processes. + +As a final note, some EnsembleMetrics may not support MPI job collection. These should tell you so with a suitable error message at parse time (i.e. before you run an expensive protocol and try to collect results). See the table of EnsembleMetrics for MPI support. ## 5. See Also From 41dacf139c387c6c14daf8281cd1b589cfc77227 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Fri, 25 Feb 2022 00:41:39 -0500 Subject: [PATCH 25/33] Updating CentralTendency and FragmentScore auto-generated docs.
--- .../xsd/ensemble_metric_CentralTendency_type.md | 10 +++++----- .../xsd/filter_FragmentScoreFilter_type.md | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md b/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md index 833bb10a0..2aaf50613 100644 --- a/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md +++ b/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md @@ -9,8 +9,8 @@ An ensemble metric that takes a real-valued simple metric, applies it to all pos ``` @@ -20,9 +20,9 @@ An ensemble metric that takes a real-valued simple metric, applies it to all pos - **output_mode**: The output mode for reports from this ensemble metric. Default is 'tracer'. Allowed modes are: 'tracer', 'tracer_and_file', or 'file'. - **output_filename**: The file to which the ensemble metric report will be written if output mode is 'tracer_and_file' or 'file'. Note that this filename will have the job name and number prepended so that each report is unique. - **ensemble_generating_protocol**: An optional ParsedProtocol or other mover for generating an ensemble from the current pose. This protocol will be applied repeatedly (ensemble_generating_protocol_repeats times) to generate the ensemble of structures. Each generated pose will be measured by this metric, then discarded. The ensemble properties are then reported. If not provided, the current pose is measured and the report will be produced later (e.g. at termination with the JD2 rosetta_scripts application). -- **ensemble_generating_protocol_repeats**: The number of times that the ensemble_generating_protocol is applied. This is the maximum number of structures in the ensemble (though the actual number may be smaller if the protocol contains filters or movers that can fail for some attempts). 
Only used if an ensemble-generating protocol is provided with the ensemble_generating_protocol option. -- **n_threads**: The number of threads to request for generating ensembles in parallel. This is only used in multi-threaded compilations of Rosetta (compiled with extras=cxx11thread), and only when an ensemble-generating protocol is provided with the ensemble_generating_protocol option. A value of 0 means to use all available threads. In single-threaded builds, this must be set to 0 or 1. +- **ensemble_generating_protocol_repeats**: The number of times that the ensemble_generating_protocol is applied. This is the maximum number of structures in the ensemble (though the actual number may be smaller if the protocol contains filters or movers that can fail for some attempts). Only used if an ensemble-generating protocol is provided with the ensemble_generating_protocol option. Defaults to 1. +- **n_threads**: The number of threads to request for generating ensembles in parallel. This is only used in multi-threaded compilations of Rosetta (compiled with extras=cxx11thread), and only when an ensemble-generating protocol is provided with the ensemble_generating_protocol option. A value of 0 means to use all available threads. In single-threaded builds, this must be set to 0 or 1. Defaults to 1. NOTE THAT MULTI-THREADING IS HIGHLY EXPERIMENTAL AND LIKELY TO FAIL FOR MANY ENSEMBLE-GENERATING PROTOCOLS. When in doubt, leave this set to 1. - **use_additional_output_from_last_mover**: If true, this ensemble metric will use the additional output from the previous pose (assuming the previous pose generates multiple outputs) as the ensemble, analysing it and producing a report immediately. If false, then it will behave normally. False by default. -- **real_valued_metric**: The name of a real-valued simple metric defined previously. Required input. +- **real_valued_metric**: (REQUIRED) The name of a real-valued simple metric defined previously. Required input. 
--- diff --git a/scripting_documentation/RosettaScripts/xsd/filter_FragmentScoreFilter_type.md b/scripting_documentation/RosettaScripts/xsd/filter_FragmentScoreFilter_type.md index b556a2dfb..288cf53a7 100644 --- a/scripting_documentation/RosettaScripts/xsd/filter_FragmentScoreFilter_type.md +++ b/scripting_documentation/RosettaScripts/xsd/filter_FragmentScoreFilter_type.md @@ -13,7 +13,7 @@ Filter based on any score that can be calculated in fragment_picker. outputs_name="(pose &string;)" csblast="(&string;)" blast_pgp="(&string;)" placeholder_seqs="(&string;)" sparks-x="(&string;)" sparks-x_query="(&string;)" psipred="(&string;)" - vall_path="(/Users/vmulligan/rosetta_git_workingcopy/Rosetta/main/tools/doc_tools/../../database//sampling/vall.jul19.2011.gz &string;)" + vall_path="(/home/vikram/rosetta_devcopy/Rosetta/main/database//sampling/vall.jul19.2011.gz &string;)" frags_scoring_config="(&string;)" n_frags="(200 &non_negative_integer;)" n_candidates="(1000 &non_negative_integer;)" print_to_pdb="(false &xs:boolean;)" From e0d42133e2b6bb2397d00a790a674b4921d5bde3 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Mar 2022 20:10:10 -0500 Subject: [PATCH 26/33] Add auto-generated docs for PNearEnsembleMetric. 
--- .../xsd/ensemble_metric_PNear_type.md | 48 +++++++++++++++++++ 1 file changed, 48 insertions(+) create mode 100644 scripting_documentation/RosettaScripts/xsd/ensemble_metric_PNear_type.md diff --git a/scripting_documentation/RosettaScripts/xsd/ensemble_metric_PNear_type.md b/scripting_documentation/RosettaScripts/xsd/ensemble_metric_PNear_type.md new file mode 100644 index 000000000..9946ce070 --- /dev/null +++ b/scripting_documentation/RosettaScripts/xsd/ensemble_metric_PNear_type.md @@ -0,0 +1,48 @@ + + +_Autogenerated Tag Syntax Documentation:_ + +--- +An ensemble metric that computes PNear, an estimate of the Boltzmann probability that a molecule's conformation is close to a desired conformation, based on an ensemble of sampled conformations and their energies. PNear was described in Bhardwaj, Mulligan, Bahl et al. (2016). Accurate de novo design of hyperstable constrained peptides. Nature 538(7625):329-335. doi: 10.1038/nature19791. Given N samples, it is defined by: PNear = sum_i( exp( -(R_i/lambda)^2) exp(-E_i/(kbt)) ) / sum_j( exp(-E_j/(kbt)) ), where E_i and E_j are the energies of the ith and jth samples, respectively, and R_i is the RMSD of the ith sample to the reference state. + +```xml + +``` + +- **label_prefix**: If provided, this prefix is prepended to the label for this ensemble metric (with an underscore after the prefix and before the ensemble metric name). +- **label_suffix**: If provided, this suffix is appended to the label for this ensemble metric (with an underscore after the ensemble metric name and before the suffix). +- **output_mode**: The output mode for reports from this ensemble metric. Default is 'tracer'. Allowed modes are: 'tracer', 'tracer_and_file', or 'file'. +- **output_filename**: The file to which the ensemble metric report will be written if output mode is 'tracer_and_file' or 'file'. Note that this filename will have the job name and number prepended so that each report is unique.
+- **ensemble_generating_protocol**: An optional ParsedProtocol or other mover for generating an ensemble from the current pose. This protocol will be applied repeatedly (ensemble_generating_protocol_repeats times) to generate the ensemble of structures. Each generated pose will be measured by this metric, then discarded. The ensemble properties are then reported. If not provided, the current pose is measured and the report will be produced later (e.g. at termination with the JD2 rosetta_scripts application). +- **ensemble_generating_protocol_repeats**: The number of times that the ensemble_generating_protocol is applied. This is the maximum number of structures in the ensemble (though the actual number may be smaller if the protocol contains filters or movers that can fail for some attempts). Only used if an ensemble-generating protocol is provided with the ensemble_generating_protocol option. Defaults to 1. +- **n_threads**: The number of threads to request for generating ensembles in parallel. This is only used in multi-threaded compilations of Rosetta (compiled with extras=cxx11thread), and only when an ensemble-generating protocol is provided with the ensemble_generating_protocol option. A value of 0 means to use all available threads. In single-threaded builds, this must be set to 0 or 1. Defaults to 1. NOTE THAT MULTI-THREADING IS HIGHLY EXPERIMENTAL AND LIKELY TO FAIL FOR MANY ENSEMBLE-GENERATING PROTOCOLS. When in doubt, leave this set to 1. +- **use_additional_output_from_last_mover**: If true, this ensemble metric will use the additional output from the previous pose (assuming the previous pose generates multiple outputs) as the ensemble, analysing it and producing a report immediately. If false, then it will behave normally. False by default. +- **compute_pnear_to_native**: Should PNear be computed using the structure provided with the -in:file:native option as the reference state? Defaults to true.
If multiple scorefunctions are provided, one PNear value is computed using each in turn. +- **compute_pnear_to_lowestE**: Should PNear be computed using the lowest-energy structure in the ensemble as the reference state? Defaults to false. If multiple scorefunctions are provided, one PNear value is computed using each in turn. +- **scorefxns**: (REQUIRED) A comma-separated list of previously-defined scoring functions. At least one must be provided. +- **kbt**: The Boltzmann temperature, used to compute Boltzmann probabilities of conformational states. Defaults to 0.62 kcal/mol, which is approximately physiological temperature. Must be positive. +- **lambda**: The breadth of the Gaussian, in Angstroms, used to determine whether the RMSD to the reference state is small enough for a sampled state to be considered 'near-native' or not. The default value, 2.5 A, is appropriate for small proteins. For peptides, a value of 1.5 A to 2.0 A is more appropriate. Very large proteins or protein complexes might warrant a larger value. Must be positive. +- **superimpose_for_rmsd**: Set whether the poses should be superimposed (aligned) for the RMSD calculation. The default is true, which is appropriate for PNear calculations for folding. For PNear calculations for docking, users may want to set this to false (so that docked configurations that preserve backbone conformation but involve a different rigid-body transform than native register as high-RMSD states). +- **use_backbone_in_rmsd**: Set whether backbone heavyatoms should be used in computing RMSD values. True by default. If no residue selector is provided with the backbone_residue_selector option, all residues in the pose are used if this option is set to true. +- **use_CB_in_rmsd**: If this option is true, then the first sidechain heavyatom (CB in canonical amino acids) is included with the backbone heavyatoms when computing heavyatom RMSD. Requires use_backbone_in_rmsd to be set to true. False by default.
+- **use_sidechain_in_rmsd**: Set whether sidechain heavyatoms should be used in computing RMSD values. False by default. If no residue selector is provided with the sidechain_residue_selector option, all residues in the pose are used if this option is set to true. +- **native_file**: A PDB, CIF, or other structure file for the native pose. Only used if compute_pnear_to_native is true. If this is left as an empty string (the default), then the file passed on the commandline with -in:file:native is used. +- **native_preparation_protocol**: A mover applied to the native pose on load. (This can be useful for setting up cyclic geometry, for example.) Not used if not provided. +- **backbone_residue_selector**: An optional residue selector to select the residues whose backbone heavyatoms (and first sidechain heavyatoms, if use_CB_in_rmsd is set to true) are used when computing RMSD values. If this is not provided, all positions are used. This option does nothing unless use_backbone_in_rmsd is set to true. The name of a previously declared residue selector or a logical expression of AND, NOT (!), OR, parentheses, and the names of previously declared residue selectors. Any capitalization of AND, NOT, and OR is accepted. An exclamation mark can be used instead of NOT. Boolean operators have their traditional priorities: NOT then AND then OR. For example, if selectors s1, s2, and s3 have been declared, you could write: 's1 or s2 and not s3' which would select a particular residue if that residue were selected by s1 or if it were selected by s2 but not by s3. +- **sidechain_residue_selector**: An optional residue selector to select the residues whose sidechain heavyatoms are used when computing RMSD values. If this is not provided, all positions are used. This option does nothing unless use_sidechain_in_rmsd is set to true.
The name of a previously declared residue selector or a logical expression of AND, NOT (!), OR, parentheses, and the names of previously declared residue selectors. Any capitalization of AND, NOT, and OR is accepted. An exclamation mark can be used instead of NOT. Boolean operators have their traditional priorities: NOT then AND then OR. For example, if selectors s1, s2, and s3 have been declared, you could write: 's1 or s2 and not s3' which would select a particular residue if that residue were selected by s1 or if it were selected by s2 but not by s3. + +--- From 5faa10703dfa679e51a0d1813d553115179829d4 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Mar 2022 20:10:35 -0500 Subject: [PATCH 27/33] Add PNear ensemble metric to EnsembleMetrics documenation page. --- .../RosettaScripts/EnsembleMetrics/EnsembleMetrics.md | 1 + 1 file changed, 1 insertion(+) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index 0340f35ca..4a7f3f7b5 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -14,6 +14,7 @@ Just as [[SimpleMetrics]] measure some property of a pose, EnsembleMetrics measu EnsembleMetric | Description | MPI support? -------------- | ----------- | ------------ **[[CentralTendency]]** | Takes a [[real-valued SimpleMetric|SimpleMetrics]], applies it to each pose in an ensemble, and returns measures of central tendency (mean, median, mode) and other measures of the distribution (standard deviation, standard error, etc.). | YES +**[[PNear|PNearEnsembleMetric]]** | Based on a conformational ensemble, computes the propensity to favour a desired state or the lowest-energy state sampled. | YES ## 2. Usage modes From d00043ea95ab8008a0aba7cd3b9f400f4ce6406b Mon Sep 17 00:00:00 2001 From: "Vikram K. 
Mulligan" Date: Thu, 10 Mar 2022 20:40:08 -0500 Subject: [PATCH 28/33] Updating CentralTendency. --- .../EnsembleMetrics/ensemble_metric_pages/CentralTendency.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md index 80393e4f0..0e1d7c6b3 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/CentralTendency.md @@ -1,5 +1,5 @@ # CentralTendency Ensemble Metric -*Back to [[SimpleMetrics]] page.* +*Back to [[EnsembleMetrics]] page.* ## CentralTendency Ensemble Metric [[_TOC_]] @@ -37,4 +37,5 @@ The mode of a set of floating-point numbers can be thrown off by floating-point * [[SimpleMetrics]]: Available SimpleMetrics. * [[EnsembleMetrics]]: Available EnsembleMetrics. +* [[PNear ensemble metric|PNearEnsembleMetric]]: An ensemble metric that computes propensity to favour a desired conformation given a conformational ensemble. * [[I want to do x]]: Guide to choosing a tool in Rosetta. \ No newline at end of file From f08de9eee2ee6c83086878f4521db74d92e082bb Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Thu, 10 Mar 2022 20:40:24 -0500 Subject: [PATCH 29/33] Adding documentation for PNear ensemble metric. 
--- .../PNearEnsembleMetric.md | 63 ++++++++++++++++++ .../ensemble_metric_pages/PNear_Eqn.png | Bin 0 -> 24464 bytes 2 files changed, 63 insertions(+) create mode 100644 scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNearEnsembleMetric.md create mode 100644 scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNear_Eqn.png diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNearEnsembleMetric.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNearEnsembleMetric.md new file mode 100644 index 000000000..9a3b9d4cc --- /dev/null +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNearEnsembleMetric.md @@ -0,0 +1,63 @@ +# PNear Ensemble Metric +*Back to [[EnsembleMetrics]] page.* +## PNear Ensemble Metric + +[[_TOC_]] + +### Description + +PNear is a metric used to describe the propensity of a sequence to favour a particular conformational state. It was first described in Bhardwaj, Mulligan, Bahl _et al_. (2016) _Nature_ 538(7625):329-335, doi: 10.1038/nature19791. The PNear [[EnsembleMetric|EnsembleMetrics]] computes PNear given a conformational ensemble generated by some protocol. It can optionally compute the propensity to favour some desired state, provided with the `-in:file:native` commandline option or with the `native_file` option in RosettaScripts, or it can compute the propensity to favour the lowest-energy state observed in the ensemble. + +### Author and history + +Created Thursday, 10 March 2022 by Vikram K. Mulligan, Center for Computational Biology, Flatiron Institute (vmulligan@flatironinstitute.org). 
+ +### Details of the calculation + +PNear is defined as follows: + +![Expression defining PNear](/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNear_Eqn.png) + +In the above, _N_ is the number of structures in the ensemble, _Ei_ and _rmsdi_ are the energy and the RMSD to target conformation of the ith sample, respectively, and λ (lambda) and kBT are two parameters controlling the calculation of PNear. The parameter λ, measured in Angstroms, defines how close a structure has to be to the target conformation (either the user-provided native state or the lowest-energy state in the ensemble) in order for it to be considered "close enough" to count as contributing to high propensity to favour the target state. The parameter kBT, measured in kcal/mol, is a Boltzmann temperature that determines how much each sampled conformation contributes to the distribution of states as its energy rises. + +### Interface + +[[include:ensemble_metric_PNear_type]] + +### Vector-valued metrics produced + +The PNear ensemble metric produces two vectors of outputs that can be accessed from C++ or Python by calling the `PNearEnsembleMetric::get_realvector_metric_value_by_name( std::string const & name, core::Size const index )` method. These are called "pnear_to_native" and "pnear_to_lowestE". There is one entry in the vector for every scoring function provided when configuring the PNear ensemble metric. (For ordinary use, when only one scoring function is provided, these are 1-vectors.) + +### Example usage + +The following script reads a series of PDB files or structures from a silent file (passed in with one of the `-in:file:s`, `-in:file:l` or `-in:file:silent` options), aligns each to a native structure (`inputs/native.pdb`), scores each with Rosetta's `ref2015` scoring function, and computes PNear to both the native structure and to the lowest-energy structure in the ensemble of input structures.
+ + +```xml + + + + + + + + + + + + + + +``` + +##See Also + +* [[EnsembleMetrics]]: Available EnsembleMetrics. +* [[SimpleMetrics]]: Available SimpleMetrics. +* [[CentralTendency ensemble metric|CentralTendency]]: An ensemble metric that computes mean, median, mode, etc. of values produced by a [[SimpleMetric|SimpleMetrics]]. +* [[I want to do x]]: Guide to choosing a tool in Rosetta. \ No newline at end of file diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNear_Eqn.png b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNear_Eqn.png new file mode 100644 index 0000000000000000000000000000000000000000..f946fa265fbf6560bfb1ee7b4ad76448645a9a64 GIT binary patch literal 24464 [base85-encoded PNG data for PNear_Eqn.png omitted]
z=d+Q48J87dg3rJxxWo-pOb{fwsaRvd!g>h0f_t}3{;Ah#Q^owN6_#NIjm@UVk|7C= z1c9}?WnaGu1gR=sLi-{bL2g!r3=0Hz#A%nohzsOL2Nt1P!MGnQaRYuD&Zs*Niob14 zKjvhjdnzt|dP`yiv+e)#0E+H?{X7LO*bAO6fq$Qf-FZmaiW_b=PmMrml>&{i)#1IyaF^Y|q*xZu9GER_QoI`WyQ=Eh0m`}&k7 zFsFQsr2s*2<)^2=sqjgND4P8&HXqD~Xbjr~EDU6HRn<`Fy8iYhf|Pm8YrCZ*Mbo!! zU^J484slfku|E+q&=2+|JezFJIT!Izi;@e{sK4f>huEvE)SLN=Xyp3kWox%{%)fkq z!J;H}@20{k-3WY6Z+72r!acuvqG)|F1jW&gbMFut+E%Gn|8@PY_y0y-MEt^@K4s=F&M z$|?~diq*H{e~7TnFfFj*0n`PXlWej^<0%n;BTyv;f`N!ta%1+3#!+0ZPl# z@VrIKvjA`46nk^5)u{%O7;7cNo&f1L6!P;WIj}p&aYlGMKvwAD@TWJPJ1ws&pZ`Ea z>r`FWf1jLzA!ZU@&@^C@l_Z7Y;@RB{@)%BL(vCL_;^W9@+dk&kUUe<#_K6GMa8NjllqEc;s4AKvG(_)eb_t$71SI7;sRSd;0n%Eh)SVyaPfcRIS5=ZUK8>A z|ER}Dtw3VibCmf|&Vn>S-%YjH|HWG%1!9S<>MZqfZ25an2&_pu(l`xQ_>8s}aFa?k z&b*1$%RL4~B*F~RCLF0i%gDFBz3AY<`%f0osM;=!KFNajvC z=qEZZQhI$M{~}?_*n_NGjDCZ8?1xh0(XJ9f227YK=WVdnxs3>Ox?&iAJdCp~L0mzDZ-d>f37R21tH4*pT@90e|6 z3pySf*)nn!Z&~;~p|2Z;!Jc=lfDe`_4HY)Nt7t2JAs9I4o-r7A z*pAGsSGAYyNo=PWiNlsPy4@d#`G32-MtE4zOb2iU7QhR$IrHE1>^TD18n|%2sI@f# z=6uA-cVx{RNqq)T!EqUTuh0N1-Hr-PO|K1=-^I)x^b$> zNeIJ9xw%0zM9}GRWKp+rAu=FmVy3$b%()>zGHj`k#e?}`NOT+h@k^{lv05LS&yt_; z)vdQE=icwZZW4qeE>xxKy_6QZ(O7PojS;kcjLGup4yWIz0D$6!F#3-$5eZW;@KeP- z0Lvgrp0`U98@iSRQN(XD;#Ks&iwuh;5D0OA=G>54(0s8|S@scEn8%x$L869loVB}hIZspv8P}ztNKq;-NXxWx#p?G6fHrQz<)S|*q z8AWMM$2&2x4iC^-7Gd4bX8}wJEZ<3eb4$j$Q{KW8PGztW>Q4mK20#*n0id-;YrqCqaCDy!1BM~LRnj0HeLw;inlel>siiqq6kag2D$JeHuPA^ zB@+p7hL<#aBn{{pjB<6b$!=7Uc4 zJP=mF4zHP0}&@2>g+$NiH zA1Y(DocRheB#ypamW=q&_xieZRqus*d35IOZ3FNf<*iF$B1pDy-yRs~XLkjqhZjhv z{=q-cKg(2bukpJH1dEAYxOR6G_8#T54V+qqV2VF{I&ePN-&@>9&$rTjp1^%=c$j?{aVM|+mv|8kC`pVsBGw*CUy!)^h-|oA7dQauvALTqFaGj4W zgse9mX#0c7F45C_u<1<#QAUb5GnitPu(LkN+urEiOi;=oM*qPI@8mHvi4iMrssya&pg@SK#7{>h)Hdnm{dY+$FEhguL zM@4a`0t?$UM0oS)ACAqzqU?`KScC2bb1ZfQ6_o(-N%RIO?6@OM_nB@jE~}< z074*DLncg{qjxXPn^0Rk@=h$nIuAB!ft-8F*OIZ)fMwe5`@D$;iG!O8#YCfX91=kS zMj~*)!)9s=d7J5J$(_B8YYm``Av5kXaB4$x3L!S&pGBc>$66GnKopkITTgY4#ATa^ z?Yg)LfoWzi?KD^y!_LYDLSmkFMzc4T=dRil3~u8Eu{B}`#T7gMLx7~@4UBNAyyxbL 
zZmaE8ki}=u*%tv>pc`B=*9+DuLgAiv&IOqAcs9UPwmOz)T2q1W=3js2KldjQiaGm4 zT-7`$X$Z%f&UpvZf?!A`rOo&N1H*lTe*QdpF!$(~@k%(YW;rQ>0~`Lw&PEX|en#Is zq$3q8E*C>LgM1O(1qWsO`R#^{tCy4)u@q*`KBF9~l8Zv&9RKghs2lQQ0vV~m&NKyk zD72!p&B^6UgVva!*J8N|wnFM)HU`Z>lwijQQ}ZGwEk-b-F)5zQ>gE-&Tx}QF{ckP4 zI#BZhxVlZH$ZbOuAz-U+)2U`5q9L%;ArtJ`` z=iS+I9GQjwbyPM8vwyAbeF6DT0@LhPqGpXBHwH1^{~&bL|)@P0#q^YE}RZ90v!C$eaD(FH&)JF7G3>jex%GBqA|&%toc+(rX=)vTq&!8 z*s4Kw)t4O{SigrC{x;S{E_9naT)`N?yQ<(NCnn${?&1y(Xu1b)ozAmxNE8_R^wcCa ziZUhu#*%W;^vB*&sZ2l>Yv{W?Ue(n2)x4Q2>2zj7b;8Xy2QRW0zqpppzW6gk`*J`E zTzYcgv<0e(l92em4GS__Z8-(FB6-17q90N|Tv+Bbc>&adAM>uY_9=kYJfwQPF-aPu z-q^VFdNQm?cI*t8jXSTjN)ffRw4Tk7rIqJlwC@qW>h*h|17ZiVuwfPCWpIU;$=Rb9yg4T{SAS9VwyEK&>%n7#ez9{iVxM#T8 zTJ1#~mTp1jd3Kiu^D!mDt~OGP%%Fk;&QErtSwY9a#XZX_T?=LohZVkhnDi;zay9Av zDvzNL(x|Vm55;4Lpg4vFzXB+(qcDac`S^j*WhQ2w%aQinBsb$EgIHL%v3Fjks&_1 zZ6Jv8*6zO0d{!kD=9V-%PoFx4C|A^Z4ELMAt?s>C?f1rVhRtx}3pn#dFC`F&DaJY_ z;k&M2V<(aUVv(@;P{haen5b?=i}>W&@nSQAo)h7a1Oo9hev9pe*n}xLWodpedGxN2 z|L4oAcjyyv|9bqFSf}BfQG=k>8SIQFECobPJihqzqcM)6Wu2qQud-ZJMQ&gb3pPc- zDHu{GLB!JOBZhdYtl4qy@b@M6A0{&4*kKnFa1e#hv?VN_(~ovU*byL;5h)zAD+F)@ zabOx~&;F?-i&KAT+r0Oh?9-!WS5?_OgCV!#jx`1mUS?|y62}u)E?qT_cj^u{4%a{P zN1_g&V6_k!L^AU%y~_OjRltUHd(^)I=D9M05ErVcb&eR zWv$z)TjgN>Rqb`HeYeOwDEp<+u`l*+GTk7RgfaS(B~avaX4i3oJjE0Uym{`f?RLI9 zSA-+P{~<e#@gE30`djMfqx@i;zr%C}Op3!ITw#NQP&ja`v`rPWdmMqgQpBamo;S``g>i9` zZtTA-k(afJOUAOEFJVbzoyve;Ecv+ZzWR*k&W)KPZUt|Fn3!K#Uxs<^?=p-fu%;qN zOb#^8d^Qk+=Y}I{*a*$DFuN~Gh82c#MD!k61dAX&#{b(P0(+59x6iqj6J-jWC9 zAxJ&I5W|FD(3GTqRenx(BY3V!z-Tm_dBIZ>45v^anCp>$HaG$z?bN(0?OM7aEQk!n z^sPa&pSh`?F%h>9K$A6mJU4GD_z2+0&|D=a-%4b4R4)CcG=5}CkEr%Pi+P|lzg;SE zn-+HP8R%o_u(xSbZBXU6|7^3653^mGqoFPcf+pv~px?d0$@YSb#;)DiK(m`r7zS2F5Q*g&SUa{*~wcJ>%4T zT1&~RoXOOgu<)?|1)^4}n3HLTF;(b~T;WZneb`wdjLl%YfkI$KEYRSxlS`ydO$%|0*~ z-uv8{)WqOK?=;j*s9+XYp!%ekqOF#O-qZlIE8sD7e+NdjOOZ{vWJ%r}SR{KL29X$n zoMj1Yn}9Pv0jkebEMK_s3cW@(I&LF7FmoADxqP+qUs+m+AGVkE7OAC8V_6)zqPZ|T zyi7*YYn=0-o(^wHw*&Cb+q-ASUr6IjE_B~vyY|t@)}5c+2l%gwXAKX1pxTFva&1{T 
zH6JuD%tDlEQr4TJfQm7v6Oqa$)uEz!?NXv)XM^GO^PJVW+kX_@3Imf}!hs((J!cPA z#(lsVY1Iw3TMG}WcRFxEsCtta4+k|lRs>7#yyhKgB$zj*ZiH2xo#r~%boZV?1u&=?WMG)e$@$Mx4suqBX+yXDEn?Yu42r%xGa}l0axq<|1=Xk znqC9v8VnDw=?0_yb<46-@Q2yl*Iv`r&*0xW1kGEtcHJ#&=QFx@W?5;i0jx!Hg8 zK80w136vWQsjpLAa!5k(pK*_GGU`yGf7(Cv=xk&iG$?jX?i0y|3&xPCj@Dbd!)0*< zi(_Vussoed8h9wCy@r)1<`G&Yh3tB}hU<5t6_?s0J7?YU{?F>t;5w^X;eWa+!X{STz93>T9B1L#8=-x8Wr#zLC|R& z4ye}s_4)OTf;mUxSqra?d1eg8D%)7W`%arIHoG9*rmyRl$MOiSwO;TUbX&f)cQDkc z3}TjOBlUgX>=-tJYWS!s=J9#N>bi6fc7%94sZ%^VyI>>mk%mX#Z;^I^lC8sp#}Y%m z49+El$|zXmO@WJ}q5iEG;AxMl7}r5})=#k&GM3``Rb32DV>$Wp0$-y9Pp=$ZQjFD& z$H_@a7&5iEw#Y8iZC`%&L56v0WwQpy+BK6-b?3-^6T{JLp-o+CV1#z|4$@*Ykki3k zSej(7C#Z7;`nKkSqj?%^+vZ(wA3S(~QF*yLf1j7uqldawPv-2IIiRlIWe5c(UwWo` zf+h7^Jir+#w>{o!K3d)vp62cA*KJF==~xZd;HoyQ{{!_eL0(2XC}%)<8L}J4NWGzB zk;>LVfQh!Ea1IH$4HH>6pk({>z^+7FU&;iWFD+^ss~`7W$*+S9q}>51 z0+R#d!oH=M-&kQ0`^&X4)tNyxirWEax*;x%G9tEm!8m)bz zqD88F3HTD3h#Yz*p#sHY|<5<8unI3e-j*kHL28g{>$>Q?5&!!;0xnzZ}Y3l9# z?GmDNuz|Lk?)#Bzo?2BT_0^lVRfez+C1)eM1j+{Kh1zt&VuRrbZuU5MIgN@7oB1QK z6;4ADbuIB15)gkg?~Ny1TMD)%^Sv6d4PeFK{z{HbgNv5_GR95a?@&dVn-i_{|#zwvQ=utWwZLcyQ>f2Gj_<%8L!9W1p_363ifWisr!ms%N)D#w^B6v@{gV5u2 z)ZCi!Dy9J$MAolQ3@_sdVCJ|CC~TKNr!~1PQ;)KlSznOA94L%wFQ#i=0=s*n4Sfp&&KwOz`Ab2PaDR=I8 z6;UgsnRDUc9t(_11V`#OC|w|LR)f54=_%iQkMENO!tU{^B|#CUPy?8tuXZ(A-v$}q zq&M_c8ABx@OT!_K*E!zG*9X5*hVZX@GvzzDZKEAd!55)}JPl?R5`IYRkO$@rz@`e> zt|&yS*n_5Ft$Mq`IeInVpteQ{{aDkuVDrrAivi~Rm!3WE#*=j)oxKBPt663uI6l+O zM8o-Lor1`tA!7UyttN=v&x0bT|2F`w|b5a`8FA)KV! zz(3haUaJ8s9HBAcen4{uNU>RF*I~+1O4^BLiXW;^>I;X8#AcIkZU1zNa58nfK2VZ6 zNc!Xoyl(%J57TUClC5`vGZ|-4@CB|ZcN^DxWIF}?wFG%>QD`yl{Nl&9qQV0z55JO8 zYzC{CoM2n+8T-oRZ!TiCCcRSeYK0;!ER1%pphpJln4vpT^xnM{X$;-L2jI8Vt%1e@ zKbM!A7^aD0Zw0z_aj%oVR@Vjpkmp8J;t!wAjSUd-l={rlSC#Y%wU{W+DWJuJRKXn< z1V7WW=CHw<9ap0=rVY>RfKp52xug9e+7ZW&UAF@+9s`o^_qVPqvSrbu{ejX0-T_E} zz(xIBGzPkxiy~$e-}V~qUm);vJvd;2Eh!)NO@9mq{8?O7v#S_~H~>&JEAe7#xZ_?u zGD-L-?@z&5I=#m%+mux}3r4k0G3O8PlRUK#n$(3XWN+`x=Si{?1!M7MtJ%NV-?-U|0F>da2uWmcQ! 
zV#%b_zM9K0+7iYB zVMsVc{L^dw=Vhz&1N#B0W09kWVS7ZqyzVE?e~+rA0sf}_3@rUAS5pK@gX*{lLxWv8 z4ui_eJ*&_C#F52-rjPvt6*=eS<&_4|>p%J+iT^BhL}ILG04hVr$#dKl Date: Thu, 10 Mar 2022 20:44:27 -0500 Subject: [PATCH 30/33] Adding a little more description. --- .../ensemble_metric_pages/PNearEnsembleMetric.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNearEnsembleMetric.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNearEnsembleMetric.md index 9a3b9d4cc..f13e952fb 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNearEnsembleMetric.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNearEnsembleMetric.md @@ -6,7 +6,7 @@ ### Description -PNear is a metric used to describe the propensity of a sequence to favour a particular conformational state. It was first described in Bhardwaj, Mulligan, Bahl _et al_. (2016) _Nature_ 538(7625):329-335, doi: 10.1038/nature19791. The PNear [[EnsembleMetric|EnsembleMetrics]] computes PNear given a conformational ensemble generated by some protocol. It can optionally compute the propensity to favour some desired state, provided with the `-in:file:native` commandline option or with the `native_file` option in RosettaScripts, or it can compute the propensity to favour the lowest-energy state observed in the ensemble. +PNear is a metric used to describe the propensity of a sequence to favour a particular conformational state. It was first described in Bhardwaj, Mulligan, Bahl _et al_. (2016) _Nature_ 538(7625):329-335, doi: 10.1038/nature19791. The PNear [[EnsembleMetric|EnsembleMetrics]] computes PNear given a conformational ensemble generated by some protocol. 
It can optionally compute the propensity to favour some desired state, provided with the `-in:file:native` commandline option or with the `native_file` option in RosettaScripts, or it can compute the propensity to favour the lowest-energy state observed in the ensemble. ### Author and history @@ -14,11 +14,13 @@ Created Thursday, 10 March 2022 by Vikram K. Mulligan, Center for Computational ### Details of the calculation -PNear is defined as follows: +PNear is defined as follows: ![Expression defining PNear](/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNear_Eqn.png) -In the above, _N_ is the number of structures in the ensemble, _Ei_ and _rmsdi_ are the energy and the RMSD to target conformation of the ith sample, respectively, and λ (lambda) and kBT are two parameters controlling the calculation of PNear. The parameter λ, measured in Angstroms, defines how close a structure has to be to the target conformation (either the user-provided native state or the lowest-energy state in the ensemble) in order for it to be considered "close enough" to count as contributing to high propensity to favour the target state. The parameter kBT, meaured in kcal/mol, is a Boltzmann temperature that determines how much each sampled conformation contributes to distribution of states as its energy rises. +In the above, _N_ is the number of structures in the ensemble, _Ei_ and _rmsdi_ are the energy and the RMSD to target conformation of the ith sample, respectively, and λ (lambda) and kBT are two parameters controlling the calculation of PNear. The parameter λ, measured in Angstroms, defines how close a structure has to be to the target conformation (either the user-provided native state or the lowest-energy state in the ensemble) in order for it to be considered "close enough" to count as contributing to high propensity to favour the target state. 
The parameter kBT, measured in kcal/mol, is a Boltzmann temperature that determines how much each sampled conformation contributes to the distribution of states as its energy rises. + +The value of PNear ranges from 0 to 1, with values close to 0 indicating that the molecule has very low propensity to favour the desired conformation and values close to 1 indicating that it has very high propensity to do so. It may be thought of as the Boltzmann probability of being near to the desired conformation, where "near" is defined fuzzily by a Gaussian function of RMSD with breadth λ. By defining this fuzzily rather than sharply, numerical instability on repeated sampling runs is avoided: small changes in the distribution of samples result in _small_ changes in PNear, but could result in _large_ changes in the values of earlier metrics that used hard cutoffs to determine whether a sampled state was "native-like" or not. ### Interface @@ -30,7 +32,7 @@ The PNear ensemble metric produces two vectors of outputs that can be accessed f ### Example usage -The following script reads a series of PDB files or structures from a silent file (passed in with one of the `-in:file:s`, `-in:file:l` or `-in:file:silent` options), aligns each to a native structure (`inputs/native.pdb`), scores each with Rosetta's `ref2015` scoring function, and computes PNear to both the native structure and to the lowest-energy structure in the ensemble of input structuers. +The following script reads a series of PDB files or structures from a silent file (passed in with one of the `-in:file:s`, `-in:file:l` or `-in:file:silent` options), aligns each to a native structure (`inputs/native.pdb`), scores each with Rosetta's `ref2015` scoring function, and computes PNear to both the native structure and to the lowest-energy structure in the ensemble of input structures. ```xml From 0801e4a56a7e97351b0d5d0898ed31cb1a2e212e Mon Sep 17 00:00:00 2001 From: "Vikram K.
Mulligan" Date: Thu, 10 Mar 2022 20:51:06 -0500 Subject: [PATCH 31/33] Adding MPI note. --- .../ensemble_metric_pages/PNearEnsembleMetric.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNearEnsembleMetric.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNearEnsembleMetric.md index f13e952fb..6ef977938 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNearEnsembleMetric.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNearEnsembleMetric.md @@ -8,6 +8,8 @@ PNear is a metric used to describe the propensity of a sequence to favour a particular conformational state. It was first described in Bhardwaj, Mulligan, Bahl _et al_. (2016) _Nature_ 538(7625):329-335, doi: 10.1038/nature19791. The PNear [[EnsembleMetric|EnsembleMetrics]] computes PNear given a conformational ensemble generated by some protocol. It can optionally compute the propensity to favour some desired state, provided with the `-in:file:native` commandline option or with the `native_file` option in RosettaScripts, or it can compute the propensity to favour the lowest-energy state observed in the ensemble. +As with any ensemble metric, the analysis can be performed for an ensemble of structures on disk, an ensemble generated on the fly in memory (but never written to disk), or even a distributed ensemble sampled on many nodes across a large cluster via MPI (with no single node ever seeing all of the structures). + ### Author and history Created Thursday, 10 March 2022 by Vikram K. Mulligan, Center for Computational Biology, Flatiron Institute (vmulligan@flatironinstitute.org). From 99146f77c5a8879577c9bb0c53b9b5d22c155908 Mon Sep 17 00:00:00 2001 From: "Vikram K. Mulligan" Date: Fri, 11 Mar 2022 11:58:15 -0500 Subject: [PATCH 32/33] Updating auto-generated docs. 
--- .../xsd/ensemble_metric_CentralTendency_type.md | 5 +++++ .../RosettaScripts/xsd/filter_EnsembleFilter_type.md | 5 +++++ 2 files changed, 10 insertions(+) diff --git a/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md b/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md index 2aaf50613..6e5c2d553 100644 --- a/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md +++ b/scripting_documentation/RosettaScripts/xsd/ensemble_metric_CentralTendency_type.md @@ -5,6 +5,11 @@ _Autogenerated Tag Syntax Documentation:_ --- An ensemble metric that takes a real-valued simple metric, applies it to all poses in an ensemble, and calculates measures of central tendency (mean, median, mode) and other statistics about the distribution (standard deviation, standard error of the mean, min, max, range, etc.). Values that this ensemble metric returns are referred to in scripts as: mean, median, mode, stddev, stderr, min, max, and range. +References and author information for the CentralTendency ensemble metric: + +CentralTendencyEnsembleMetric SimpleMetric's author(s): +Vikram K. Mulligan, Systems Biology group, Center for Computational Biology, Flatiron Institute [vmulligan@flatironinstitute.org] (Created the ensemble metric framework and wrote the CentralTendency ensemble metric.) + ```xml Date: Mon, 18 Apr 2022 19:07:09 -0400 Subject: [PATCH 33/33] Update documentation with mention of support in MPIFileBufJobDistributor for ensemble metrics.
--- .../RosettaScripts/EnsembleMetrics/EnsembleMetrics.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md index 0340f35ca..3e373d7d9 100644 --- a/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md +++ b/scripting_documentation/RosettaScripts/EnsembleMetrics/EnsembleMetrics.md @@ -335,7 +335,7 @@ Note that if one simply wants the value produced by the EnsembleMetric to be rec ## 4. Note about running in MPI mode -The [[Message Passing Interface (MPI)|MPI]] permits massively parallel execution of a Rosetta protocol. If an EnsembleMetric is used in basic mode (Section 2.1) using the [[MPI build|Build-Documentation]] of Rosetta, all poses seen _by all processes_ are considered part of the ensemble that is being analysed. At the end of the protocol, all of the instances of the EnsembleMetric on worker processes will report back to the director process with the measurements needed to allow the director process to perform the analysis on the whole ensemble. This can be convenient for rapidly analysing very large ensembles generated in memory across a large cluster, without needing to write thousands or millions of structuers to disk. This functionality is currently only available in the [[JD2]] version of the [[RosettaScripts]] application, and only when the [[MPIWorkPoolJobDistributor|JD2]] (the default MPI JD2 job distributor) is used. Support for [[JD3|RosettaScripts-JD3]] is planned. +The [[Message Passing Interface (MPI)|MPI]] permits massively parallel execution of a Rosetta protocol. If an EnsembleMetric is used in basic mode (Section 2.1) using the [[MPI build|Build-Documentation]] of Rosetta, all poses seen _by all processes_ are considered part of the ensemble that is being analysed. 
At the end of the protocol, all of the instances of the EnsembleMetric on worker processes will report back to the director process with the measurements needed to allow the director process to perform the analysis on the whole ensemble. This can be convenient for rapidly analysing very large ensembles generated in memory across a large cluster, without needing to write thousands or millions of structures to disk. This functionality is currently only available in the [[JD2]] version of the [[RosettaScripts]] application, and only when the [[MPIWorkPoolJobDistributor|JD2]] (the default MPI JD2 job distributor) or the MPIFileBufJobDistributor (the default MPI JD2 job distributor for use with Rosetta silent files) is used. Support for [[JD3|RosettaScripts-JD3]] is planned. Note that EnsembleMetrics that run in different MPI processes, and which generate ensembles internally using either a generating protocol (Section 2.2) or a multiple pose mover (Section 2.3), report immediately on the ensemble seen locally _in that process_. In this case, no information is shared between processes.
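
For reference, the PNear calculation described in the PNearEnsembleMetric page earlier in this series can be sketched in a few lines of Python. This is an illustrative sketch only, not Rosetta's implementation. The documentation gives the defining equation as an image, so the form used here (a Boltzmann-weighted average of a Gaussian in RMSD with breadth λ, as in the cited Bhardwaj _et al_. paper) is an assumption, and the function name `pnear` and its argument names are hypothetical.

```python
import math


def pnear(energies, rmsds, lam, kbt):
    """Illustrative PNear sketch (hypothetical helper, not a Rosetta API).

    Assumed form:
        sum_i exp(-rmsd_i^2 / lam^2) * exp(-E_i / kbt)
        ----------------------------------------------
                    sum_i exp(-E_i / kbt)

    energies: per-sample energies (kcal/mol)
    rmsds:    per-sample RMSD to the target conformation (Angstroms)
    lam:      breadth of the Gaussian in RMSD (Angstroms)
    kbt:      Boltzmann temperature (kcal/mol)
    """
    if not energies or len(energies) != len(rmsds):
        raise ValueError("energies and rmsds must be non-empty and equal in length")
    if lam <= 0 or kbt <= 0:
        raise ValueError("lam and kbt must be positive")

    # Subtract the minimum energy before exponentiating. The shift cancels
    # in the ratio and avoids overflow for very negative energies.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kbt) for e in energies]

    # Each sample counts as "near" to the degree given by a Gaussian of its RMSD.
    numerator = sum(w * math.exp(-((r / lam) ** 2)) for w, r in zip(weights, rmsds))
    return numerator / sum(weights)
```

Under this assumed form, both the numerator and the denominator are plain sums over samples, so partial sums could in principle be accumulated separately and combined afterwards. Whether the MPI reporting described above works this way is not stated in this documentation.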