diff --git a/core/gui/src/app/workspace/component/property-editor/operator-property-edit-frame/operator-property-edit-frame.component.html b/core/gui/src/app/workspace/component/property-editor/operator-property-edit-frame/operator-property-edit-frame.component.html
index 1f2c2963f29..5c4c9d1fec7 100644
--- a/core/gui/src/app/workspace/component/property-editor/operator-property-edit-frame/operator-property-edit-frame.component.html
+++ b/core/gui/src/app/workspace/component/property-editor/operator-property-edit-frame/operator-property-edit-frame.component.html
@@ -47,6 +47,150 @@
nzTheme="outline"
[nzPopoverContent]="PythonLambdaPopContent"
class="question-circle-button">
+
You can add a new column by:
@@ -80,6 +224,476 @@
Typing in the expression as True if tuple_["Unit Price"] > 500 else False
+
+ You can cast the type of existing columns by:
+
+ - Clicking on the blue "+" button
+ - Selecting the existing column you want to cast in the drop-down list
+ - Selecting the target data type you want to convert to
+ - The column will maintain its name but change its data type
+
+ Available data types include:
+
+ - String - Text data
+ - Integer - Whole numbers
+ - Double - Decimal numbers
+ - Long - Large integer values
+ - Boolean - True/false values
+ - TimeStamp - Date and datetime values
+ - Binary - Binary data
+
+
+ Example: Convert a text column "Age" containing numbers to integer type
+ Operations:
+
+ - Clicking on the blue "+" button
+ - Selecting "Age" from the column drop-down list
+ - Selecting "Integer" as the target data type
+ - The "Age" column will now be treated as integer values instead of text
+
+
+ Note: Type casting may fail if the source data cannot be converted to the target type (e.g.,
+ converting "abc" to integer).
+ Ensure your data is compatible with the target type.
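+ The casting behavior described above can be sketched in plain Python (a standalone illustration with made-up values, not Texera's internal cast logic):

```python
def cast_to_integer(value):
    """Cast a text value to an integer, failing the way the type-cast step does."""
    try:
        return int(value)
    except ValueError:
        raise ValueError(f"cannot cast {value!r} to integer") from None

# A text "Age" column holding numeric strings casts cleanly.
print([cast_to_integer(v) for v in ["34", "27", "41"]])  # [34, 27, 41]

# Incompatible data raises, matching the note above.
try:
    cast_to_integer("abc")
except ValueError as err:
    print(err)  # cannot cast 'abc' to integer
```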
+
+
+
+ You can load data from a CSV file by:
+
+ - Clicking "Select File" to select your CSV file
+ - Setting the appropriate File Encoding (usually UTF-8)
+ - Configuring the Delimiter if not using commas
+ - Enabling "Header" if your CSV has column names in the first row
+ - Setting a Limit to control how many rows to read (optional)
+
+ File format options:
+
+ - File Encoding - Character encoding (UTF-8, UTF-16, etc.)
+ - Delimiter - Character separating values (comma, semicolon, etc.)
+ - Header - Whether first row contains column names
+ - Limit - Maximum number of rows to read
+
+
+ Example: Load a CSV file with headers and comma-separated values
+ Operations:
+
+ - Clicking "Select File" and selecting your CSV file
+ - Setting File Encoding to "UTF-8"
+ - Setting Delimiter to "," (comma)
+ - Enabling "Header" checkbox
+ - Leaving Limit empty to read all rows
+
+
+ Note: Make sure your CSV file is properly formatted and accessible. Large files may take longer to
+ process.
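+ The options above map directly onto Python's standard csv module; the sketch below uses an in-memory string in place of an uploaded file (the data is hypothetical):

```python
import csv
import io

# A small in-memory CSV standing in for the uploaded file.
raw = "state,year,custody_tot\nAlabama,2019,21900\nAlaska,2019,4100\n"

limit = None  # like the Limit option: None reads all rows
with io.StringIO(raw) as f:  # real use: open("file.csv", encoding="utf-8")
    # Header enabled: the first row supplies the column names (dict keys).
    reader = csv.DictReader(f, delimiter=",")
    rows = [row for i, row in enumerate(reader) if limit is None or i < limit]

print(rows[0])  # {'state': 'Alabama', 'year': '2019', 'custody_tot': '21900'}
```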
+
+
+
+ You can select and reorder columns by:
+
+ - Clicking the "Add" button to add columns to the output
+ - Selecting which columns to include from the dropdown
+ - Using the Drag Handles to reorder columns by dragging up or down
+ - Clicking the Trash Can button to remove columns from the output
+ - Only the selected columns will appear in the final result
+
+ Drop Option:
+
+ - Check the "Drop Option" box to drop the selected columns instead
+
+ Column operations:
+
+ - Add Column - Include a column in the selection
+ - Remove Column - Click the trash can to remove from selection
+ - Reorder Columns - Drag the handles to change column order
+ - Alias - Rename columns in the output by typing a new name in the alias field
+
+
+ Example: Keep only "state", "year", and "custody_tot" columns and rename "custody_tot" to "Total"
+ Operations:
+
+ - Setting "Drop Option" to "Keep Columns"
+ - Clicking the "Add" button
+ - Selecting "state" from the dropdown
+ - Adding "year" and "custody_tot" columns similarly
+ - Typing "Total" in the alias field for "custody_tot"
+
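+ The keep-and-rename example above can be sketched in plain Python (hypothetical rows; the selection order determines the output order):

```python
rows = [
    {"state": "Alabama", "year": 2019, "custody_tot": 21900, "region": "South"},
    {"state": "Alaska", "year": 2019, "custody_tot": 4100, "region": "West"},
]

# Each pair is (source column, output alias); unselected columns are dropped.
keep = [("state", "state"), ("year", "year"), ("custody_tot", "Total")]
projected = [{alias: row[col] for col, alias in keep} for row in rows]

print(projected[0])  # {'state': 'Alabama', 'year': 2019, 'Total': 21900}
```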
+
+
+
+ You can join two datasets by:
+
+ - Selecting the Left Input Attribute from the first dataset
+ - Selecting the Right Input Attribute from the second dataset
+ - Choosing the Join Type from the dropdown
+ - Rows with matching values in both attributes will be combined according to the join type
+
+ Join types explained:
+
+ - Inner Join - Only rows with matches in both datasets
+ - Left Outer Join - All rows from left dataset, matched rows from right (nulls for non-matches)
+ - Right Outer Join - All rows from right dataset, matched rows from left (nulls for non-matches)
+ - Full Outer Join - All rows from both datasets (nulls where no match exists)
+
+
+ Example: Join annual and quarterly prison data on custody_tot values using inner join
+ Operations:
+
+ - Selecting "custody_tot" as the Left Input Attribute
+ - Selecting "custody_tot" as the Right Input Attribute
+ - Setting Join Type to "inner"
+ - Only records with matching custody_tot values in both datasets will appear
+
+
+ Note: Make sure the join attributes have compatible data types and meaningful matching values for
+ the best results.
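+ The inner-join example above behaves like this plain-Python sketch (hypothetical rows; it assumes the join key is unique on the right side):

```python
annual = [
    {"custody_tot": 21900, "state": "Alabama"},
    {"custody_tot": 4100, "state": "Alaska"},
]
quarterly = [
    {"custody_tot": 21900, "quarter": "Q4"},
    {"custody_tot": 9999, "quarter": "Q1"},
]

# Index the right input by the join attribute, then keep only left rows
# whose key also appears on the right (inner-join semantics).
right_index = {r["custody_tot"]: r for r in quarterly}
joined = [{**left, **right_index[left["custody_tot"]]}
          for left in annual if left["custody_tot"] in right_index]

print(joined)  # only the matching custody_tot = 21900 row survives
```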
+
+
+
+ You can filter rows based on conditions by:
+
+ - Selecting the Attribute (column) you want to filter on
+ - Choosing a Condition operator (=, !=, >, <, etc.)
+ - Entering the Value to compare against
+ - Adding multiple predicates for complex filtering using the Predicates section
+
+ Available operators include:
+
+ - Equals (=) - Exact match
+ - Not Equals (!=) - Does not match
+ - Greater Than (>) - Numeric comparison
+ - Less Than (<) - Numeric comparison
+ - Greater Than or Equal (>=) - Numeric comparison
+ - Less Than or Equal (<=) - Numeric comparison
+ - Is Null - Attribute value is null
+ - Is Not Null - Attribute value is not null
+
+
+ Example: Filter to show only data for Alabama
+ Operations:
+
+ - Selecting "state" as the Attribute
+ - Selecting "=" (equals) as the Condition
+ - Entering "Alabama" as the Value
+ - Only rows where state equals "Alabama" will pass through the filter
+
+
+ Multiple Conditions: Use the Predicates section to add multiple filter conditions.
+ All predicates are combined with OR logic - a row passes through if it matches
+ ANY of the conditions.
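+ The OR semantics of multiple predicates can be sketched in plain Python (hypothetical rows; each lambda stands for one Attribute / Condition / Value triple):

```python
rows = [
    {"state": "Alabama", "year": 2019},
    {"state": "Alaska", "year": 2019},
]

# Each predicate mirrors one configured condition.
predicates = [lambda r: r["state"] == "Alabama"]

# OR logic: a row passes if it matches ANY predicate.
filtered = [r for r in rows if any(p(r) for p in predicates)]

print(filtered)  # [{'state': 'Alabama', 'year': 2019}]
```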
+
+
+
+ You can perform aggregations by:
+
+ - Selecting an Aggregate Function (sum, count, average, etc.)
+ - Choosing the Attribute (column) to aggregate
+ - Specifying a Result Attribute name for the output column
+ - Clicking the + button to add more aggregations
+
+ Available aggregation functions:
+
+ - Sum - Total of numeric values
+ - Count - Number of rows
+ - Average - Mean of numeric values
+ - Min - Smallest value
+ - Max - Largest value
+ - Concat - Concatenate text values
+
+
+ Example: Calculate total mortality and average custody statistics
+ Operations:
+
+ - Setting Aggregate Function to "sum"
+ - Selecting "mortality_tot" as the Attribute
+ - Setting Result Attribute to "mortality_tot_by_year"
+ - Adding another aggregation for average custody statistics
+
+
+ Multiple Aggregations: You can add multiple aggregation functions to calculate different
+ statistics.
+ Each aggregation will create a new column in the output with the specified result attribute name.
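+ The sum and average example above reduces to the following plain-Python sketch (hypothetical numbers; each entry in the result is one configured aggregation with its result attribute name):

```python
rows = [
    {"year": 2019, "mortality_tot": 120, "custody_tot": 21900},
    {"year": 2019, "mortality_tot": 80,  "custody_tot": 4100},
]

# One output column per aggregation, named by its result attribute.
result = {
    "mortality_tot_by_year": sum(r["mortality_tot"] for r in rows),
    "custody_avg": sum(r["custody_tot"] for r in rows) / len(rows),
}

print(result)  # {'mortality_tot_by_year': 200, 'custody_avg': 13000.0}
```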
+
+
+
+ You can sort your data by:
+
+ - Selecting the Attribute (column) you want to sort by
+ - Specifying the Attribute Domain Min (minimum expected value)
+ - Specifying the Attribute Domain Max (maximum expected value)
+ - The data will be sorted in ascending order by the selected attribute
+
+ Domain configuration:
+
+ - Attribute Domain Min - Sets the minimum value in the sorting range
+ - Attribute Domain Max - Sets the maximum value in the sorting range
+
+
+ Example: Sort data by year from 2013 to 2020
+ Operations:
+
+ - Selecting "year" as the Attribute
+ - Setting Attribute Domain Min to "2013"
+ - Setting Attribute Domain Max to "2020"
+ - Data will be sorted by year in ascending order within the specified range
+
+
+
+
+ You can filter data using regular expressions by:
+
+ - Selecting the Attribute (column) to search in
+ - Entering a Regex pattern to match against
+ - Optionally enabling Case Insensitive matching
+ - Only rows where the attribute matches the regex pattern will pass through
+
+ Common regex patterns:
+
+ - 15-31 - Matches any text containing "15-31"
+ - ^2020 - Matches text starting with "2020"
+ - Dec$ - Matches text ending with "Dec"
+ - [0-9]+ - Matches one or more digits
+ - .*EOY.* - Matches text containing "EOY" anywhere
+
+
+ Example: Keep only end-of-year data (dates containing "12-31")
+ Operations:
+
+ - Selecting "date" as the Attribute
+ - Entering "12-31" as the Regex pattern
+ - Enabling Case Insensitive if needed
+ - Only rows with dates containing "12-31" will be kept
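+ The same match can be sketched with Python's standard re module (hypothetical rows; the pattern and flag mirror the Regex and Case Insensitive fields):

```python
import re

rows = [{"date": "2019-12-31"}, {"date": "2019-06-15"}]

# Case Insensitive enabled; search() matches the pattern anywhere in the value.
pattern = re.compile("12-31", re.IGNORECASE)
kept = [r for r in rows if pattern.search(r["date"])]

print(kept)  # [{'date': '2019-12-31'}]
```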
+
+
+
+
+
+ You can render HTML content by:
+
+ - Specifying the HTML content field name (usually html-content)
+ - The selected field should contain valid HTML
+ - The visualizer will render the HTML content in the result panel
+ - This is useful for Visualization Operators and formatted displays
+
+
+
+ Example: Display formatted mortality statistics in HTML
+ Operations:
+
+ - Setting HTML content to "html-content"
+ - Ensuring the "html-content" field contains valid HTML markup
+ - The visualizer will render tables, charts, or formatted text
+ - Results appear in a web-friendly format in the result panel
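+ One way an upstream operator could populate the html-content field is to build the markup in Python; this sketch uses hypothetical statistics and escapes cell values to keep the HTML valid:

```python
import html

stats = [("Alabama", 120), ("Alaska", 80)]

# Build a small HTML table for the "html-content" field, escaping text cells.
body = "".join(
    f"<tr><td>{html.escape(state)}</td><td>{deaths}</td></tr>"
    for state, deaths in stats
)
record = {
    "html-content": f"<table><tr><th>State</th><th>Deaths</th></tr>{body}</table>"
}

print(record["html-content"])
```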
+
+
+
+
+
+ You can create custom R functions by:
+
+ - Clicking "Edit code content" to write your R code
+ - Setting the Worker count for parallel processing
+ - Defining Attribute Names and Attribute Types for output columns
+ - Optionally enabling Use Tuple API for advanced data access
+ - Optionally enabling Retain input columns to keep original data
+
+ Configuration options:
+
+ - Edit code content - Write custom R code to process your data
+ - Worker count - Number of parallel R workers for processing
+ - Use Tuple API - Enable row-by-row processing with tuple access
+ - Retain input columns - Keep original input columns in the output
+
+
+ Example: Process mortality data with custom R calculations
+ Operations:
+
+ - Clicking "Edit code content" to open the R code editor
+ - Writing R code to analyze custody_tot, state, date, and year data
+ - Defining output attributes like "custody_tot" (double), "state" (string)
+
+ - Setting appropriate data types for each output column
+ - Configuring worker count based on data size and R processing needs
+
+
+ R Code Editor: Write any R code to transform, analyze, or create new data columns.
+
+
+
+ You can create custom Python functions by:
+
+ - Clicking "Edit code content" to write your Python code
+ - Setting the Worker count for parallel processing
+ - Clicking "+ extra output columns" to add new columns created by your Python code
+ - Defining Attribute Names and Attribute Types for each new output column
+ - Optionally enabling Use Tuple API for advanced data access
+ - Optionally enabling Retain input columns to keep input columns in output
+
+ Configuration options:
+
+ - Edit code content - Write custom Python code to process your data
+ - Worker count - Number of parallel Python workers for processing
+ - + extra output columns - Add definitions for new columns your Python code creates
+ - Use Tuple API - Enable row-by-row processing with tuple access
+ - Retain input columns - Keep original input columns in the output
+
+
+ Example: Process mortality data and create new calculated columns
+ Operations:
+
+ - Clicking "Edit code content" to open the Python code editor
+ - Writing Python code that creates new columns like mortality_rate, risk_category
+ - Clicking "+ extra output columns" for each new column
+ - Defining "mortality_rate" (double), "risk_category" (string)
+ - Setting appropriate data types for each new column your Python code produces
+
+
+ Python Code Editor: Write any Python code to transform, analyze, or create new data columns.
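+ The per-row logic of the example above could look like the following sketch. The surrounding Texera UDF class and Tuple API boilerplate are omitted, and the rate formula and risk threshold are hypothetical:

```python
def enrich(row):
    """Add the two extra output columns described above to one input row."""
    # Hypothetical rate: deaths per 10,000 people in custody.
    rate = row["mortality_tot"] / row["custody_tot"] * 10000
    return {
        **row,                        # Retain input columns
        "mortality_rate": rate,       # extra output column (double)
        "risk_category": "high" if rate > 50 else "low",  # extra column (string)
    }

out = enrich({"state": "Alabama", "mortality_tot": 120, "custody_tot": 21900})
print(out["mortality_rate"], out["risk_category"])
```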
+
+
+
+ You can create dumbbell plots by:
+
+ - Setting the Category Column Name for grouping data points
+ - Defining Dumbbell Start Value and Dumbbell End Value columns
+ - Specifying the Measurement Column Name for the data being plotted
+ - Setting the Compared Column Name for comparison categories
+ - Configuring the Dot Column Value for additional data points
+ - Optionally enabling Show Legend to display plot legend
+
+ Plot configuration:
+
+ - Category Column Name - Groups data points (e.g., boundaries, regions)
+ - Start/End Values - Define the range for each dumbbell (lower/upper bounds)
+ - Measurement Column - The metric being visualized (e.g., confidence intervals)
+ - Compared Column - Categories to compare (e.g., states, groups)
+ - Dot Column Value - Additional data points on the plot (e.g., median, IQR)
+
+
+ Example: Compare confidence intervals across state boundaries
+ Operations:
+
+ - Setting Category Column Name to "Boundary"
+ - Setting Dumbbell Start Value to "lower" and End Value to "upper"
+
+ - Setting Measurement Column Name to "CI" (confidence interval)
+ - Setting Compared Column Name to "State"
+ - Setting Dot Column Value to "IRR"
+
+
+ Use Case: Dumbbell plots are ideal for comparing ranges, confidence intervals, or before/after
+ values.
+ Legend: Enable "Show Legend" to help users understand different elements in the plot.
+
+
+
+ You can create nested tables by:
+
+ - Adding Attribute groups to create hierarchical sections
+ - Setting the Original attribute Name (source column)
+ - Setting the New Attribute Name (display name for the nested table)
+ - Clicking "Add attribute" to include more columns in each group
+
+ Nested table structure:
+
+ - Attribute group - Creates nested sections/categories in the table
+ - Original attribute Name - The actual column name from your data
+ - New Attribute Name - User-friendly name shown in the nested table
+ - Multiple groups - Each group becomes a separate nested section
+
+
+ Example: Create nested table with demographics and statistics sections
+ Operations:
+
+ - Creating first group: "Demographics"
+ - Adding "year" → "Year" and "state" → "State"
+
+ - Creating second group: "Statistics"
+ - Adding "mortality_tot" → "Total Deaths"
+
+
+
+
+ You can create line charts by:
+
+ - Setting the X Label and Y Label for axis titles
+ - Choosing the X Value and Y Value columns for data points
+ - Selecting a Line Mode (line with dots, solid line, etc.)
+ - Configuring Line Color for each data series
+ - Adding multiple lines by clicking the "+" at the bottom
+
+ Chart configuration:
+
+ - X/Y Labels - Titles displayed on the chart axes
+ - X/Y Values - Data columns to plot (X = horizontal, Y = vertical)
+ - Line Mode - Visual style of the line (with/without dots)
+ - Line Color - Color for each data series line
+ - Multiple Lines - Add multiple Y values to compare data series
+
+
+ Example: Plot death rates over time by age group
+ Operations:
+
+ - Setting X Label to "Year" and Y Label to "Death Rate"
+ - Setting X Value to "year" and Y Value to "mortality_65over_per_10000"
+
+ - Choosing Line Mode as "line with dots"
+ - Adding more lines for other age groups like "mortality_under_65_per_10000"
+
+
+
+
+ You can sort your data by:
+
+ - Selecting the Attribute (column) to sort by
+ - Choosing ASC (ascending) or DESC (descending)
+
+
+ Example: Sort by year (newest first)
+
+ - Attribute: year
+ - Sort Preference: DESC
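+ The DESC example above behaves like this plain-Python sketch (hypothetical rows):

```python
rows = [{"year": 2018}, {"year": 2020}, {"year": 2019}]

# Attribute: year, Sort Preference: DESC (newest first).
sorted_rows = sorted(rows, key=lambda r: r["year"], reverse=True)

print([r["year"] for r in sorted_rows])  # [2020, 2019, 2018]
```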
+
+
+
+
+ You can create bar charts by:
+
+ - Setting the Fields for chart labeling (e.g., xlabel)
+ - Choosing the Category Column for grouping data
+ - Selecting the Value Column associated with each category
+ - Optionally enabling Horizontal Orientation for horizontal bars
+ - Adding Pattern textures based on data attributes
+
+ Chart configuration:
+
+ - Category Column - Groups data into separate bars (e.g., regions, states)
+ - Value Column - The value column plotted for each category
+ - Horizontal Orientation - Display bars horizontally instead of vertically
+ - Pattern - Add visual textures to distinguish data categories
+
+
+ Example: Compare regional percentages with a bar chart
+ Operations:
+
+ - Setting Category Column to "region"
+ - Setting Value Column to "perc_exp" (percentage values)
+ - Setting Fields to "xlabel" for labeling
+
+