diff --git a/SUMMARY.md b/SUMMARY.md
index 32de888c6..0d3a1ea80 100644
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -32,6 +32,7 @@
  * [Troubleshooting](use/alerting/notifications/troubleshooting.md)
  * [Customize](dynamic/customize-alerting.md)
  * [Add a monitor using the CLI](use/alerting/k8s-add-monitors-cli.md)
+ * [Derived State monitor](use/alerting/k8s-derived-state-monitors.md)
  * [Override monitor arguments](use/alerting/k8s-override-monitor-arguments.md)
  * [Write a remediation guide](use/alerting/k8s-write-remediation-guide.md)
 
diff --git a/use/alerting/k8s-derived-state-monitors.md b/use/alerting/k8s-derived-state-monitors.md
new file mode 100644
index 000000000..fd708f7c4
--- /dev/null
+++ b/use/alerting/k8s-derived-state-monitors.md
@@ -0,0 +1,39 @@
+---
+description: SUSE Observability
+---
+
+# Derived State Monitors
+
+## Overview
+
+In Observability scenarios where logical (business) components lack direct monitors but are affected by issues in their technical dependencies, you can use the derived-state-monitor function to derive a state from the connected technical components for the logical component.
+This monitor traverses component dependencies and selects the most critical health state based on direct observations (e.g., from metrics), ignoring any already-derived states. It will apply the derived state to all components selected through the `componentTypes` parameter.
+During traversal, only components with observed (non-derived) health states are considered for health derivation. Components with derived states are skipped in evaluation but still traversed to reach deeper dependencies, for example logical components depending on other logical components.
+
+## Derived Health State Monitor example
+
+A Monitor implemented using the `derived-state-monitor` function looks like:
+
+```
+  - _type: "Monitor"
+    name: "Aggregated health state of a Deployment, StatefulSet, ReplicaSet and DaemonSet"
+    tags:
+      - deployments
+      - replicasets
+      - statefulsets
+      - daemonsets
+      - derived
+      - propagated
+    identifier: "urn:custom:monitor:..."
+    status: "DISABLED"
+    description: "Description"
+    function: {{ get "urn:stackpack:common:monitor-function:derived-state-monitor" }}
+    arguments:
+      componentTypes: "deployment, replicaset, statefulset, daemonset"
+    intervalSeconds: 30
+    remediationHint: "Investigate component [{{ causeName }}](/#/components/{{ causeComponentUrnForUrl }}) as it is causing the workload to be unhealthy."
+```
+* The function has a single argument `componentTypes` where you can express the different component types as a single string of `,` separated values
+* The function offers two values to use in the remediation guide, `causeComponentName` being the component name where the state is propagated from and its `causeComponentUrnForUrl` to be able to create a link
+
+The monitor can be implemented using the guide at [Add a threshold monitor to components using the CLI](/use/alerting/k8s-add-monitors-cli.md)
\ No newline at end of file
diff --git a/use/alerting/kubernetes-monitors.md b/use/alerting/kubernetes-monitors.md
index 038d1a75b..f45845804 100644
--- a/use/alerting/kubernetes-monitors.md
+++ b/use/alerting/kubernetes-monitors.md
@@ -144,22 +144,18 @@ Cluster doesn't have any health itself. But a cluster is build from few componen
 - all nodes
 and then takes the most critical health state.
 
-### Aggregated health state of a DaemonSet
+### Derived Workloads health state (Deployment, DaemonSet, ReplicaSet, StatefulSet)
 
-The monitor aggregates states of all children Pods and then returns the most critical health state.
+The monitor aggregates states of all top-most dependencies and then returns the most critical health state based on direct observations (e.g., from metrics).
+This approach ensures that health signals propagate from low-level technical components (like Pods) to higher-level logical components, but only when the component itself lacks an observed health state.
+To use this monitor effectively, make sure that some or all of the following health checks are disabled:
+* Deployment desired replicas
+* DaemonSet desired replicas
+* ReplicaSet desired replicas
+* StatefulSet desired replicas
 
-### Aggregated health state of a Deployment
+If you have a use case where logical components have no direct monitors, then you can use the [Derived State Monitor](/use/alerting/k8s-derived-state-monitors.md) function to infer their health based on the technical components they depend on.
 
-The monitor aggregates states of all children ReplicaSets and then returns the most critical health state. ReplicaSets have
-the similar Monitor, so eventually this one aggregates health states of all children ReplicaSets and Pods.
-
-### Aggregated health state of a ReplicaSet
-
-The monitor aggregates states of all children Pods and then returns the most critical health state.
-
-### Aggregated health state of a StatefulSet
-
-The monitor aggregates states of all children Pods and then returns the most critical health state.
 
 ## See also
 
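
The traversal described in the Overview of `k8s-derived-state-monitors.md` can be illustrated with a short Python sketch. It is not the implementation of the `derived-state-monitor` function. The `Health`, `Component`, and `derive_state` names are hypothetical, the three health levels are simplified, and stopping at the first observed state on each path is one reading of "top-most dependencies" rather than documented behaviour.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Optional


class Health(IntEnum):
    # Hypothetical, simplified levels; ordered so max() picks the most critical.
    CLEAR = 0
    DEVIATING = 1
    CRITICAL = 2


@dataclass
class Component:
    # Hypothetical model of a topology component.
    name: str
    observed: Optional[Health] = None  # set only by a direct monitor (e.g. metrics)
    dependencies: list["Component"] = field(default_factory=list)


def derive_state(component: Component) -> Optional[Health]:
    """Most critical observed state among the nearest observed dependencies."""
    found: list[Health] = []
    seen: set[int] = set()
    stack = list(component.dependencies)
    while stack:
        dep = stack.pop()
        if id(dep) in seen:  # guard against cycles and shared dependencies
            continue
        seen.add(id(dep))
        if dep.observed is not None:
            # Assumption: the first observed state on a path is the "top-most" one.
            found.append(dep.observed)
        else:
            # No observed state: skipped in evaluation, but still traversed.
            stack.extend(dep.dependencies)
    return max(found) if found else None


pod_a = Component("pod-a", observed=Health.CLEAR)
pod_b = Component("pod-b", observed=Health.CRITICAL)
replicaset = Component("replicaset", dependencies=[pod_a, pod_b])
deployment = Component("deployment", dependencies=[replicaset])

result = derive_state(deployment)
print(result.name)  # CRITICAL
```

In this sketch the ReplicaSet has no observed state, so it is passed through and the Deployment takes the most critical state of the two Pods. This mirrors why the patch asks for the "desired replicas" health checks to be disabled on workloads that should receive a derived state.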