diff --git a/.gitbook/assets/otel/aws_nodejs_otel_proxy_collector_configuration.svg b/.gitbook/assets/otel/aws_nodejs_otel_proxy_collector_configuration.svg
deleted file mode 100644
index 148152233..000000000
--- a/.gitbook/assets/otel/aws_nodejs_otel_proxy_collector_configuration.svg
+++ /dev/null
@@ -1,16 +0,0 @@
-
\ No newline at end of file
diff --git a/.gitbook/assets/otel/open-telemetry-collector-kubernetes.png b/.gitbook/assets/otel/open-telemetry-collector-kubernetes.png
new file mode 100644
index 000000000..04ba6da81
Binary files /dev/null and b/.gitbook/assets/otel/open-telemetry-collector-kubernetes.png differ
diff --git a/.gitbook/assets/otel/open-telemetry-collector-lambda.png b/.gitbook/assets/otel/open-telemetry-collector-lambda.png
new file mode 100644
index 000000000..e1f2c3466
Binary files /dev/null and b/.gitbook/assets/otel/open-telemetry-collector-lambda.png differ
diff --git a/.gitbook/assets/otel/open-telemetry-collector-linux.png b/.gitbook/assets/otel/open-telemetry-collector-linux.png
new file mode 100644
index 000000000..d9a08be79
Binary files /dev/null and b/.gitbook/assets/otel/open-telemetry-collector-linux.png differ
diff --git a/SUMMARY.md b/SUMMARY.md
index c48d37d6a..345802d29 100644
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -92,16 +92,21 @@
* [Certificates for sidecar injection](setup/agent/k8sTs-agent-request-tracing-certificates.md)
## 🔭 Open Telemetry
-* [Getting started](setup/otel/getting-started.md)
+* [Overview](setup/otel/overview.md)
+* [Getting started](setup/otel/getting-started/README.md)
+ * [Concepts](setup/otel/concepts.md)
+ * [Rancher & Kubernetes](setup/otel/getting-started/getting-started-k8s.md)
+ * [Linux](setup/otel/getting-started/getting-started-linux.md)
+ * [AWS Lambda](setup/otel/getting-started/getting-started-lambda.md)
* [Open telemetry collector](setup/otel/collector.md)
- * [Collector as a proxy](setup/otel/proxy-collector.md)
-* [Languages](setup/otel/languages/README.md)
- * [Generic Exporter configuration](setup/otel/languages/sdk-exporter-config.md)
- * [Java](setup/otel/languages/java.md)
- * [Node.js](setup/otel/languages/node.js.md)
- * [Auto-instrumentation of Lambdas](setup/otel/languages/node.js/auto-instrumentation-of-lambdas.md)
- * [.NET](setup/otel/languages/dot-net.md)
- * [Verify the results](setup/otel/languages/verify.md)
+ * [Sampling](setup/otel/sampling.md)
+ * [SUSE Observability OTLP APIs](setup/otel/otlp-apis.md)
+* [Instrumentation](setup/otel/instrumentation/README.md)
+ * [Java](setup/otel/instrumentation/java.md)
+ * [Node.js](setup/otel/instrumentation/node.js.md)
+ * [Auto-instrumentation of Lambdas](setup/otel/instrumentation/node.js/auto-instrumentation-of-lambdas.md)
+ * [.NET](setup/otel/instrumentation/dot-net.md)
+ * [SDK Exporter configuration](setup/otel/instrumentation/sdk-exporter-config.md)
* [Troubleshooting](setup/otel/troubleshooting.md)
## CLI
@@ -171,7 +176,7 @@
## 🔐 Security
* [Service Tokens](use/security/k8s-service-tokens.md)
-* [Ingestion API Keys](use/security/k8s-ingestion-api-keys.md)
+* [API Keys](use/security/k8s-ingestion-api-keys.md)
## ☁️ SaaS
* [User Management](saas/user-management.md)
diff --git a/setup/install-stackstate/kubernetes_openshift/ingress.md b/setup/install-stackstate/kubernetes_openshift/ingress.md
index 092fc4031..40d62e888 100644
--- a/setup/install-stackstate/kubernetes_openshift/ingress.md
+++ b/setup/install-stackstate/kubernetes_openshift/ingress.md
@@ -55,7 +55,7 @@ This step assummes that [Generate `baseConfig_values.yaml` and `sizing_values.ya
{% endhint %}
-## Configure Ingress Rule for Open Telemetry Traces via the SUSE Observability Helm chart
+## Configure Ingress Rule for Open Telemetry
The SUSE Observability Helm chart exposes an `opentelemetry-collector` service in its values where a dedicated `ingress` can be created. This is disabled by default. The ingress needed for `opentelemetry-collector` purposed needs to support GRPC protocol. The example below shows how to use the Helm chart to configure an nginx-ingress controller with GRPC and TLS encryption enabled. Note that setting up the controller itself and the certificates is beyond the scope of this document.
diff --git a/setup/otel/collector.md b/setup/otel/collector.md
index ce4e4b20a..cd7c498b5 100644
--- a/setup/otel/collector.md
+++ b/setup/otel/collector.md
@@ -6,258 +6,244 @@ description: SUSE Observability
The OpenTelemetry Collector offers a vendor-agnostic implementation to receive, process and export telemetry data. Applications instrumented with Open Telemetry SDKs can use the collector to send telemetry data to SUSE Observability (traces and metrics).
-Your applications, when set up with OpenTelemetry SDKs, can use the collector to send telemetry data, like traces and metrics, straight to SUSE Observability. The collector is set up to receive this data by default via OTLP, the native open telemetry protocol. It can also receive data in other formats provided by other instrumentation SDKs like Jaeger and Zipkin for traces, and Influx and Prometheus for metrics.
+Your applications, when set up with OpenTelemetry SDKs, can use the collector to send telemetry data, like traces and metrics, to SUSE Observability or another collector (for further processing). The collector is set up to receive this data by default via OTLP, the native open telemetry protocol. It can also receive data in other formats provided by other instrumentation SDKs like Jaeger and Zipkin for traces, and Influx and Prometheus for metrics.
-Usually, the collector is running close to your application, like in the same Kubernetes cluster, making the process efficient.
+The collector runs close to your application: in the same Kubernetes cluster, on the same virtual machine, etc. This allows SDKs to quickly offload data to the collector, which can then take care of transformations, batching and filtering. It can be shared by multiple applications and makes it easy to change your data processing pipeline.
-For SUSE Observability integration, it's simple: SUSE Observability offers an OTLP endpoint using the gRPC protocol and uses bearer tokens for authentication. This means configuring your OpenTelemetry collector to send data to SUSE Observability is easy and standardized.
+For installation instructions use the [getting started guides](./getting-started/). They provide a basic collector configuration to get started, but over time you'll likely want to customize it and add additional receivers, processors, and exporters to tailor the ingestion pipeline to your needs.
-## Pre-requisites
+## Configuration
-1. A Kubernetes cluster with an application that is [instrumented with Open Telemetry](./languages/README.md)
-2. An API key for SUSE Observability
-3. Permissions to deploy the open telemetry collector in a namespace on the cluster (i.e. create resources like deployments and configmaps in a namespace). To be able to enrich the data with Kubernetes attributes permission is needed to create a [cluster role](https://github.com/open-telemetry/opentelemetry-helm-charts/blob/main/charts/opentelemetry-collector/templates/clusterrole.yaml) and role binding.
+The collector configuration defines pipelines for processing the different telemetry signals. The components in a processing pipeline can be divided into several categories, and each component has its own configuration. Here we'll give an overview of the different configuration sections and how to use them.
-## Kubernetes configuration and deployment
+### Receivers
-To install and configure the collector for usage with SUSE Observability we'll use the [Open Telemetry Collector helm chart](https://opentelemetry.io/docs/kubernetes/helm/collector/) and add the configuration needed for SUSE Observability:
+Receivers accept telemetry data from instrumented applications, here via OTLP:
-1. [Configure the collector](#configure-the-collector)
- 1. helm chart configuration
- 2. generating metrics from traces
- 3. sending the data to SUSE Observability
- 4. combine it all together in pipelines
-2. [Create a Kubernetes secret for the SUSE Observability API key](#create-secret-for-the-api-key)
-3. [Deploy the collector](#deploy-the-collector)
-4. [Configure your instrumented applicatins to send telemetry to the collector](#configure-applications)
+```yaml
+receivers:
+ otlp:
+ protocols:
+ grpc:
+ endpoint: 0.0.0.0:4317
+ http:
+ endpoint: 0.0.0.0:4318
+```
-### Configure the collector
+There are many more receivers that accept data via other protocols, for example Zipkin traces, or that actively collect data from various sources, for example:
+* Host metrics
+* Kubernetes metrics
+* Prometheus metrics (OpenMetrics)
+* Databases
-Here is the full values file needed, continue reading below the file for an explanation of the different parts. Or skip ahead to the next step, but make sure to replace:
-* `` with the OTLP endpoint of your SUSE Observability. If, for example, you access SUSE Observability on `play.stackstate.com` the OTLP endpoint is `otlp-play.stackstate.com` for GRPC and `otlp-http-play.stackstate.com` for HTTP traffic. So simply prefixing `otlp-` or `otlp-http-` to the normal SUSE Observability url will do.
-* `` with the cluster name you configured in SUSE Observability. **This must be the same cluster name used when installing the SUSE Observability agent**. Using a different cluster name will result in an empty traces perspective for Kubernetes components.
+Some receivers support all 3 signals (traces, metrics, logs), while others support only 1 or 2; for example, the Prometheus receiver can only collect metrics. The opentelemetry-collector-contrib repository has [all receivers](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver) with documentation on their configuration.
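+
+For example, a minimal sketch of a host metrics receiver that collects CPU, memory and filesystem metrics from the machine the collector runs on (the scraper list and interval are illustrative, see the receiver documentation for all options):
+
+```yaml
+receivers:
+  hostmetrics:
+    # How often to collect metrics from the host
+    collection_interval: 30s
+    scrapers:
+      cpu: {}
+      memory: {}
+      filesystem: {}
+```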
-{% hint style="warning" %}
-The Kubernetes attributes and the span metrics namespace are required for SUSE Observability to provide full functionality.
-{% endhint %}
+### Processors
-{% hint style="info" %}
-The suggested configuration includes tail sampling for traces. Sampling can be fully customized and, depending on your applications and the volume of traces, it may be needed to [change this configuration](#trace-sampling). For example an increase (or decrease) in `max_total_spans_per_second`. It is highly recommended to keep sampling enabled to keep resource usage and cost under control.
-{% endhint %}
+The data from the receivers can be transformed or filtered by processors.
-{% code title="otel-collector.yaml" lineNumbers="true" %}
```yaml
-extraEnvsFrom:
- - secretRef:
- name: open-telemetry-collector
-mode: deployment
-image:
- repository: "otel/opentelemetry-collector-k8s"
-ports:
- metrics:
- enabled: true
-presets:
- kubernetesAttributes:
- enabled: true
- extractAllPodLabels: true
-config:
- extensions:
- bearertokenauth:
- scheme: SUSEObservability
- token: "${env:API_KEY}"
- exporters:
- otlp/stackstate:
- auth:
- authenticator: bearertokenauth
- endpoint: :443
- otlphttp/stackstate:
- auth:
- authenticator: bearertokenauth
- endpoint: https://
- processors:
- tail_sampling:
- decision_wait: 10s
- policies:
- - name: rate-limited-composite
- type: composite
- composite:
- max_total_spans_per_second: 500
- policy_order: [errors, slow-traces, rest]
- composite_sub_policy:
- - name: errors
- type: status_code
- status_code:
- status_codes: [ ERROR ]
- - name: slow-traces
- type: latency
- latency:
- threshold_ms: 1000
- - name: rest
- type: always_sample
- rate_allocation:
- - policy: errors
- percent: 33
- - policy: slow-traces
- percent: 33
- - policy: rest
- percent: 34
- resource:
- attributes:
- - key: k8s.cluster.name
- action: upsert
- value:
- - key: service.instance.id
- from_attribute: k8s.pod.uid
- action: insert
- - key: service.namespace
- from_attribute: k8s.namespace.name
- action: insert
- filter/dropMissingK8sAttributes:
- error_mode: ignore
- traces:
- span:
- - resource.attributes["k8s.node.name"] == nil
- - resource.attributes["k8s.pod.uid"] == nil
- - resource.attributes["k8s.namespace.name"] == nil
- - resource.attributes["k8s.pod.name"] == nil
- connectors:
- spanmetrics:
- metrics_expiration: 5m
- namespace: otel_span
- routing/traces:
- error_mode: ignore
- table:
- - statement: route()
- pipelines: [traces/sampling, traces/spanmetrics]
- service:
- extensions:
- - health_check
- - bearertokenauth
- pipelines:
- traces:
- receivers: [otlp]
- processors: [filter/dropMissingK8sAttributes, memory_limiter, resource]
- exporters: [routing/traces]
- traces/spanmetrics:
- receivers: [routing/traces]
- processors: []
- exporters: [spanmetrics]
- traces/sampling:
- receivers: [routing/traces]
- processors: [tail_sampling, batch]
- exporters: [debug, otlp/stackstate]
- metrics:
- receivers: [otlp, spanmetrics, prometheus]
- processors: [memory_limiter, resource, batch]
- exporters: [debug, otlp/stackstate]
+processors:
+ batch: {}
```
-{% endcode %}
-The `config` section customizes the collector config itself and is discussed in the next section. The other parts are:
+The batch processor batches all 3 signals, improving compression and reducing the number of outgoing connections. The opentelemetry-collector-contrib repository has [all processors](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor) with documentation on their configuration.
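+
+Another processor used throughout the examples in this documentation is the `memory_limiter`, which helps prevent out-of-memory situations by periodically checking memory usage and refusing data when the limits are exceeded. A sketch with the same settings as the getting started guides:
+
+```yaml
+processors:
+  memory_limiter:
+    # How often memory usage is checked
+    check_interval: 5s
+    # Soft memory limit as a percentage of the total available memory
+    limit_percentage: 80
+    # Additional spike headroom on top of the soft limit
+    spike_limit_percentage: 25
+```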
-* `extraEnvsFrom`: Sets environment variables from the specified secret, in the next step this secret is created for storing the SUSE Observability API key (Receiver / [Ingestion API Key](../../use/security/k8s-ingestion-api-keys.md))
-* `mode`: Run the collector as a Kubernetes deployment, when to use the other modes is discussed [here](https://opentelemetry.io/docs/kubernetes/helm/collector/).
-* `ports`: Used to enable the metrics port such that the collector can scrape its own metrics
-* `presets`: Used to enable the default configuration for adding Kubernetes metadata as attributes, this includes Kubernetes labels and metadata like namespace, pod, deployment etc. Enabling the metadata also introduces the cluster role and role binding mentioned in the pre-requisites.
+### Exporters
-#### Configuration
+To send data to the SUSE Observability backend the collector uses exporters. There are exporters for different protocols, push- or pull-based, and for different backends. Using the OTLP protocol it is also possible to send the data to another collector for additional processing.
-The `service` section determines what components of the collector are enabled. The configuration for those components comes from the other sections (extensions, receivers, connectors, processors and exporters). The `extensions` section enables:
-* `health_check`, doesn't need additional configuration but adds an endpoint for Kubernetes liveness and readiness probes
-* `bearertokenauth`, this extension adds an authentication header to each request with the SUSE Observability API key. In its configuration, we can see it is getting the SUSE Observability API key from the environment variable `API_KEY`.
+```yaml
+exporters:
+ # The gRPC otlp exporter
+ otlp/suse-observability:
+ auth:
+ authenticator: bearertokenauth
+ # Put in your own otlp endpoint
+    endpoint: <otlp-endpoint>
+ # Use snappy compression, if no compression specified the data will be uncompressed
+ compression: snappy
+```
-The `pipelines` section defines pipelines for the traces and metrics. The metrics pipeline defines:
-* `receivers`, to receive metrics from instrumented applications (via the OTLP protocol, `otlp`), from spans (the `spanmetrics` connector) and by scraping Prometheus endpoints (the `prometheus` receiver). The latter is configured by default in the collector Helm chart to scrape the collectors own metrics
-* `processors`: The `memory_limiter` helps to prevent out-of-memory errors. The `batch` processor helps better compress the data and reduce the number of outgoing connections required to transmit the data. The `resource` processor adds additional resource attributes (discussed separately)
-* `exporters`: The `debug` exporter simply logs to stdout which helps when troubleshooting. The `otlp/stackstate` exporter sends telemetry data to SUSE Observability using the OTLP protocol via GRPC (Default). The `otlphttp/stackstate` exporter sends telemetry data to SUSE Observability using the OTLP protocol via HTTP and is meant to be used where there area some impediments to use the GRPC one (needs to be activated in the pipelines). Both OTLP exporters are configured to use the bearertokenauth extension for authentication to send data to the SUSE Observability OTLP endpoint.
+The SUSE Observability exporter requires authentication using an API key; to configure that, an [authentication extension](#extensions) is used. The opentelemetry-collector-contrib repository has [all exporters](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter) with documentation on their configuration.
-For traces, there are 3 pipelines that are connected:
-* `traces`: The pipeline that receives traces from SDKs (via the `otlp` receiver) and does the initial processing using the same processors as for metrics. It exports into a router which routes all spans to both other traces pipelines. This setup makes it possible to calculate span metrics for all spans while applying sampling to the traces that are exported.
-* `traces/spanmetrics`: Use the `spanmetrics` connector as an exporter to generate metrics from the spans (`otel_span_duration` and `otel_span_calls`). It is configured to not report time series anymore when no spans have been observed for 5 minutes. SUSE Observability expects the span metrics to be prefixed with `otel_span_`, which is taken care of by the `namespace` configuration.
-* `traces/sampling`: The pipeline that exports traces to SUSE Observability using the OTLP protocol, but uses the tail sampling processor to make the trace volume that is sent to SUSE Observability predictable to keep the cost predictable as well. Sampling is discussed in a [separate section](#trace-sampling).
+If the gRPC exporter doesn't work for you (see also [troubleshooting](./troubleshooting.md#some-proxies-and-firewalls-dont-work-well-with-grpc)), you can switch to the slightly less efficient OTLP over HTTP protocol by using the `otlphttp` exporter instead. Replace all references to `otlp/suse-observability` with `otlphttp/suse-observability` (don't forget the references in the pipelines) and make sure to update the exporter config to:
-The `resource` processor is configured for both metrics and traces. It adds extra resource attributes:
+```yaml
+exporters:
+  # The HTTP otlp exporter
+ otlphttp/suse-observability:
+ auth:
+ authenticator: bearertokenauth
+ # Put in your own otlp HTTP endpoint
+    endpoint: <otlp-http-endpoint>
+ # Use snappy compression, if no compression specified the data will be uncompressed
+ compression: snappy
+```
-* The `k8s.cluster.name` is added by providing the cluster name in the configuration. SUSE Observability needs the cluster name and Open Telemetry does not have a consistent way of determining it. Because some SDKs, in some environments, provide a cluster name that does not match what SUSE Observability expects the cluster name is an `upsert` (overwrites any pre-existing value).
-* The `service.instance.id` is added based on the pod uid. It is recommended to always provide a service instance id, and the pod uid is an easy way to get a unique identifier if the SDKs don't provide one.
+{% hint style="warning" %}
+The OTLP HTTP endpoint for SUSE Observability is different from the OTLP endpoint. Use the [OTLP APIs](./otlp-apis.md) to find the correct URL.
+{% endhint %}
-#### Trace Sampling
+### Service pipeline
-It is highly recommended to use sampling for traces:
+For each telemetry signal a separate pipeline is configured. The pipelines are configured in the `service.pipelines` section and define which receivers, processors and exporters are used and in which order. Before a component can be used in a pipeline it must first be defined in its configuration section. The `batch` processor, for example, doesn't need any configuration but still has to be declared in the `processors` section. Components that are configured but not included in any pipeline will not be active at all.
-* To manage resource usage by only processing and storing the most relevant traces
-* To manage costs and have predictable costs
-* To reduce noise and focus on the important traces only, for example by filtering out health checks
+```yaml
+service:
+ pipelines:
+ traces:
+ receivers: [otlp]
+ processors: [memory_limiter, resource, batch]
+ exporters: [debug, spanmetrics, otlp/suse-observability]
+ metrics:
+ receivers: [otlp, spanmetrics, prometheus]
+ processors: [memory_limiter, resource, batch]
+ exporters: [debug, otlp/suse-observability]
+```
-There are 2 approaches for sampling, head sampling and tail sampling. This [Open Telemetry docs page](https://opentelemetry.io/docs/concepts/sampling/) discusses the pros and cons of both approaches in detail. The collector configuration provided here uses tail sampling to support these requirements:
+### Extensions
-1. Have predictable cost by having a predictable trace volume
-2. Have a large sample of all errors
-3. Have a large sample of all slow traces
-4. Have a sample of all other traces to see the normal application behavior
+Extensions are not used directly in pipelines for processing data but extend the capabilities of the collector in other ways. For SUSE Observability an extension is used to configure authentication with an API key. Extensions must be defined in a configuration section before they can be used. Similar to the pipeline components, an extension is only active when it is enabled in the `service.extensions` section.
-Criteria 2 and 3 can only be fulfilled by tail sampling. Let's look at the sampling policies used in the configuration of the tail sampler now:
+```yaml
+extensions:
+ bearertokenauth:
+ scheme: SUSEObservability
+ token: "${env:API_KEY}"
+service:
+ extensions: [ bearertokenauth ]
+```
-* There is only one top-level policy, it is a `composite` policy. It uses a rate limit, allowing at most 500 traces per second, giving a predictable trace volume. It uses other policies as sub-policies to make the actual sampling decissions.
-* The `errors` policy is of type `status_code` and is configured to only sample traces that contain errors. 33% of the rate limit is reserved for errors, via the `rate_allocation` section of the composite policy.
-* The `slow-traces` policy is of type `latency` and filters all traces slower than 1 second. 33% of the rate limits is reserved for the slow traces.
-* The `rest` policy is of the `always_sample` type. It will sample all traces until it hits the rate limit enforced by the composite policy, which is 34% of the total rate limit of 500 traces.
+The opentelemetry-collector-contrib repository has [all extensions](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/extension) with documentation on their configuration.
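+
+Another extension used in the getting started configurations is `health_check`, which exposes an HTTP endpoint that liveness and readiness probes can use. A minimal sketch (the port shown is the extension's default and only illustrative):
+
+```yaml
+extensions:
+  health_check:
+    endpoint: 0.0.0.0:13133
+service:
+  extensions: [ health_check, bearertokenauth ]
+```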
-There are many more policies available that can be added to the configuration when needed. For example, it is possible to filter traces based on certain attributes (only for a specific application or customer). The tail sampler can also be replaced with the probabilistic sampler. For all configuration options please use the documentation of these processors:
-* [Tail sampling](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor)
-* [Probabilistic sampling](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/probabilisticsamplerprocessor)
+## Transforming telemetry
-### Create a secret for the API key
+There are many processors available in the [opentelemetry-collector-contrib repository](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor); its documentation describes each processor and its configuration in detail. Here we give an overview of commonly used processors and their capabilities.
-The collector needs a Kubernetes secret with the SUSE Observability API key. Create that in the same namespace (here we are using the `open-telemetry` namespace) where the collector will be installed (replace `` with your API key):
+### Filtering
-```bash
-kubectl create secret generic open-telemetry-collector \
- --namespace open-telemetry \
- --from-literal=API_KEY=''
+Some instrumentations or applications may generate a lot of telemetry data that is noisy or not needed for your use case. The [filter processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/filterprocessor) can be used to drop that data in the collector, so it is never sent to SUSE Observability. For example, to drop all data for one specific service:
+
+```yaml
+processors:
+ filter/ignore-service1:
+ error_mode: ignore
+ traces:
+ span:
+ - resource.attributes["service.name"] == "service1"
```
-SUSE Observability supports two types of keys:
-- Receiver API Key
-- Ingestion API Key
+The filter processor uses the [Open Telemetry Transformation Language (OTTL)](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/ottl/README.md) to define the filters.
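+
+Another common use case is dropping noisy health check spans. A sketch, assuming your spans carry a `url.path` attribute (adjust the attribute and value to whatever your instrumentation actually produces):
+
+```yaml
+processors:
+  filter/drop-health-checks:
+    error_mode: ignore
+    traces:
+      span:
+        - attributes["url.path"] == "/healthz"
+```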
-#### Receiver API Key
+### Adding, modifying or deleting attributes
-You can find the API key for SUSE Observability on the Kubernetes Stackpack installation screen:
+The [attributes processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/attributesprocessor) can change attributes of spans, logs or metrics.
-1. Open SUSE Observability
-2. Navigate to StackPacks and select the Kubernetes StackPack
-3. Open one of the installed instances
-4. Scroll down to the first set of installation instructions. It shows the API key as `STACKSTATE_RECEIVER_API_KEY` in text and as `'stackstate.apiKey'` in the command.
+```yaml
+processors:
+ attributes/accountid:
+ actions:
+ - key: account_id
+ value: 2245
+ action: insert
+```
-#### Ingestion API Key
+The [resource processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/resourceprocessor) can modify attributes of a [resource](concepts.md#resources). For example to add a Kubernetes cluster name to every resource:
-SUSE Observability supports creating multiple Ingestion Keys. This allows you to assign a unique key to each OpenTelemetry Collector for better security and access control.
-For instructions on generating an Ingestion API Key, refer to the [documentation page](../../use/security/k8s-ingestion-api-keys.md).
+```yaml
+ processors:
+ resource/add-k8s-cluster:
+ attributes:
+ - key: k8s.cluster.name
+ action: upsert
+ value: my-k8s-cluster
+```
+
+For changing metric names and other metric-specific information there is also the [metrics transformer](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/metricstransformprocessor).
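+
+For example, a sketch that renames a metric (the metric names are hypothetical):
+
+```yaml
+processors:
+  metricstransform:
+    transforms:
+      # Rename the metric while keeping its data points
+      - include: system.cpu.usage
+        match_type: strict
+        action: update
+        new_name: system.cpu.utilization
+```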
+
+### Transformations
+
+The [transform processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/transformprocessor) can be used to, for example, set a span status:
+
+```yaml
+processors:
+ transform:
+ error_mode: ignore
+ trace_statements:
+ - set(span.status.code, STATUS_CODE_OK) where span.attributes["http.request.status_code"] == 400
+```
+
+It supports many more transformations, like modifying the span name, converting metric types or modifying log events. See its [readme](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/transformprocessor) for all the possibilities. It uses the [Open Telemetry Transformation Language (OTTL)](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/ottl/README.md) to define the transformation statements.
+
+## Scrub sensitive data
+
+The collector is the ideal place to remove or obfuscate sensitive data, because it sits right between your applications and SUSE Observability and has processors to [filter and transform your data](#transforming-telemetry). Next to the filtering and transformation capabilities already discussed, there is also a [redaction processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/redactionprocessor) that can mask attribute values matching a block list. It can also remove attributes that are not on a list of allowed attributes, but using that option can quickly drop most attributes and leave you with very limited observability. Note that it does not process resource attributes.
-### Deploy the collector
+An example that only masks specific attributes and/or values:
-To deploy the collector first make sure you have the Open Telemetry helm charts repository configured:
+```yaml
+processors:
+ redaction:
+ allow_all_keys: true
+ # attributes matching the regexes on the list are masked.
+ blocked_key_patterns:
+ - ".*token.*"
+ - ".*api_key.*"
+ blocked_values: # Regular expressions for blocking values of allowed span attributes
+ - '4[0-9]{12}(?:[0-9]{3})?' # Visa credit card number
+ - '(5[1-5][0-9]{14})' # MasterCard number
+ summary: debug
+```
+
+## Trying out the collector
+
+The getting started guides show how to deploy the collector to Kubernetes or with Linux packages for a production-ready setup. To try it out, for example for tests, it can also be run directly as a Docker container:
```bash
-helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
+docker run \
+ -p 127.0.0.1:4317:4317 \
+ -p 127.0.0.1:4318:4318 \
+ -v $(pwd)/config.yaml:/etc/otelcol-contrib/config.yaml \
+ otel/opentelemetry-collector-contrib:latest
```
-Now install the collector, using the configuration defined in the previous steps:
+This uses the collector contrib image, which includes all contributed components (receivers, processors, etc.). A smaller image is also available, but it contains only a very limited set of components:
```bash
-helm upgrade --install opentelemetry-collector open-telemetry/opentelemetry-collector \
- --values otel-collector.yaml \
- --namespace open-telemetry
+docker run \
+ -p 127.0.0.1:4317:4317 \
+ -p 127.0.0.1:4318:4318 \
+ -v $(pwd)/config.yaml:/etc/otelcol/config.yaml \
+ otel/opentelemetry-collector:latest
```
-### Configure applications
+Note that the Kubernetes installation defaults to the Kubernetes distribution of the collector image, `otel/opentelemetry-collector-k8s`, which has more components than the basic image, but fewer than the contrib image. If you run into missing components with that image you can simply switch to the contrib image, `otel/opentelemetry-collector-contrib`, instead.
+
+## Troubleshooting
+
+### HTTP Requests from the exporter are too big
+
+In some cases HTTP requests for telemetry data can become very large and may be refused by SUSE Observability. SUSE Observability has a limit of 4MB for the gRPC protocol. If you run into request size limits you can lower the request size by changing the compression algorithm and limiting the maximum batch size.
-The collector as it is configured now is ready to receive and send telemetry data. The only thing left to do is to update the SDK configuration for your applications to send their telemetry via the collector to the agent.
+#### HTTP request compression
+
+The getting started guides enable `snappy` compression on the collector; it does not compress as well as `gzip` but uses less CPU. If you removed the compression you can enable it again, or you can switch to a compression algorithm that offers a better [compression ratio](https://github.com/open-telemetry/opentelemetry-collector/blob/main/config/configgrpc/README.md#compression-comparison). The same compression types are available for the gRPC and HTTP protocols.
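+
+For example, to switch the SUSE Observability exporter from the getting started guides to `gzip`, only the `compression` setting needs to change (a sketch):
+
+```yaml
+exporters:
+  otlp/suse-observability:
+    # gzip compresses better than snappy at the cost of some extra CPU
+    compression: gzip
+```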
+
+#### Max batch size
+
+The HTTP request size can be reduced by adding configuration to the `batch` processor that limits the batch size:
+
+```yaml
+processors:
+  batch:
+    send_batch_size: 8192 # This is the default value
+    send_batch_max_size: 10000 # The default is 0, meaning no max size at all
+```
-Use the [generic configuration for the SDKs](./languages/sdk-exporter-config.md) to export data to the collector. Follow the [language-specific instrumentation instructions](./languages/README.md) to enable the SDK for your applications.
+The batch size is defined in number of spans, metric data points, or log records (not in bytes), so you might need some experimentation to find the correct setting for your situation. For more details please refer to the [batch processor documentation](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md).
-## Related resources
+## Related resources
The Open Telemetry documentation provides much more details on the configuration and alternative installation options:
diff --git a/setup/otel/concepts.md b/setup/otel/concepts.md
new file mode 100644
index 000000000..c14f2a873
--- /dev/null
+++ b/setup/otel/concepts.md
@@ -0,0 +1,53 @@
+---
+description: SUSE Observability
+---
+
+# Open Telemetry concepts
+
+This is a summary of the most important concepts in Open Telemetry and should be sufficient to get started. For a more detailed introduction use the [Open Telemetry documentation](https://opentelemetry.io/docs/concepts/).
+
+## Signals
+
+Open Telemetry recognizes 3 telemetry signals:
+
+* Traces
+* Metrics
+* Logs
+
+At the moment SUSE Observability supports traces and metrics; logs will be supported in a future version. For Kubernetes logs it is possible to use the [SUSE Observability agent](/k8s-quick-start-guide.md) instead.
+
+### Traces
+
+Traces allow us to visualize the path of a request through your application. A trace consists of one or more spans that together form a tree. A single trace can be entirely within a single service, but it can also go across many services. Each span represents an operation in the processing of the request and has:
+* a name
+* start and end time, from that a duration can be calculated
+* status
+* attributes
+* resource attributes (see [resources](#resources))
+* events
+
+Span attributes provide metadata for the span. For example, a span for an operation that places an order can have the `orderId` as an attribute, and a span for an HTTP operation can have the HTTP method and URL as attributes.
+
+Span events can be used to represent a point in time where something important happened within the operation of the span. For example if the span failed there can be an `exception` or an `error` event that captures the error message, a stacktrace and the exact point in time the error occurred.
+
+### Metrics
+
+Metrics are measurements captured at runtime; each measurement results in a metric event. Metrics are important indicators of application performance and availability and are often used to alert on an outage or performance problem. Metrics have:
+* a name
+* a timestamp
+* a kind (counter, gauge, histogram, etc.)
+* attributes
+* resource attributes (see [resources](#resources))
+
+Attributes provide the metadata for a metric.
+
+## Resources
+
+A resource is the entity that produces the telemetry data. The resource attributes provide the metadata for the resource. For example a process running in a container, in a pod, in a namespace in a Kubernetes cluster can have resource attributes for all these entities.
+
+Resource attributes are often automatically assigned by the SDKs. However, it is recommended to always set the `service.name` and `service.namespace` attributes explicitly. The first is the logical name for the service; if it is not set, the SDK will use an `unknown_service` value, making it very hard to use the data later in SUSE Observability. The namespace is a convenient way to organize your services, which is especially useful if you have the same services running in multiple locations.
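+
+Most SDKs let you set these attributes through standard environment variables, for example (the values are purely illustrative):
+
+```bash
+export OTEL_SERVICE_NAME="checkout"
+export OTEL_RESOURCE_ATTRIBUTES="service.namespace=webshop,service.version=1.4.2"
+```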
+
+## Semantic conventions
+
+Open Telemetry defines common names for operations and data, called the semantic conventions. Semantic conventions follow a naming scheme that allows for standardized processing of data across languages, libraries and code bases. There are semantic conventions for all signals and for resource attributes. They are defined for many different platforms and operations on the [Open Telemetry website](https://opentelemetry.io/docs/specs/semconv/attributes-registry/). SDKs use the semantic conventions to assign these attributes, and SUSE Observability also respects and relies on the conventions, for example to recognize Kubernetes resources.
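+
+For example, an HTTP server span typically carries attributes named according to the HTTP semantic conventions, roughly like this (an illustrative sketch, not a complete list):
+
+```yaml
+http.request.method: GET
+url.path: /api/orders
+http.response.status_code: 200
+```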
+
diff --git a/setup/otel/getting-started.md b/setup/otel/getting-started.md
deleted file mode 100644
index 4438a0ef9..000000000
--- a/setup/otel/getting-started.md
+++ /dev/null
@@ -1,22 +0,0 @@
----
-description: SUSE Observability
----
-
-# Getting Started with Open Telemetry
-
-
-
-SUSE Observability supports [Open Telemetry](https://opentelemetry.io/docs/what-is-opentelemetry/). Open Telemetry is a set of standardized protocols and an open-source framework to collect, transform and ship telemetry data such as traces, metrics and logs. Open telemetry supports a wide variety of programming languages and platforms.
-
-SUSE Observability has support for both metrics and traces and adds the Open Telemetry metrics and traces to the (Kubernetes) topology data that is provided by the SUSE Observability agent. Therefore it is still needed to also install the SUSE Observability agent. Support for logs and using Open Telemetry without the SUSE Observability agent is coming soon.
-
-Open Telemetry consists of several different components. For usage with SUSE Observability, the [SDKs](./languages/README.md) to instrument your application and the [Open Telemetry collector](./collector.md) are the most important parts. We'll show how to configure both for usage with SUSE Observability.
-
-If your application is already instrumented with Open Telemetry or with any other library that is supported by Open Telemetry, like Jaeger or Zipkin, the collector can be used to ship that data to SUSE Observability and no additional instrumentation is needed.
-
-SUSE Observability requires the collector to be configured with specific processors and authentication to make sure all data used by SUSE Observability is available.
-
-## References
-
-* [Open Telemetry collector](https://opentelemetry.io/docs/collector/) on the Open Telemetry documentation
-* [SDKs to instrument your application](https://opentelemetry.io/docs/languages/) on the Open Telemetry documentation
\ No newline at end of file
diff --git a/setup/otel/getting-started/README.md b/setup/otel/getting-started/README.md
new file mode 100644
index 000000000..4b1a29285
--- /dev/null
+++ b/setup/otel/getting-started/README.md
@@ -0,0 +1,12 @@
+---
+description: SUSE Observability
+---
+
+# Getting started
+
+You might first want to familiarize yourself with the Open Telemetry [terminology and concepts](../concepts.md), like signals, resources, etc.
+
+To get started monitoring one of your own applications follow the getting started guide that best matches your deployment setup:
+* [Kubernetes or Rancher](./getting-started-k8s.md)
+* [Linux host](./getting-started-linux.md)
+* [AWS Lambda functions](./getting-started-lambda.md)
diff --git a/setup/otel/getting-started/getting-started-k8s.md b/setup/otel/getting-started/getting-started-k8s.md
new file mode 100644
index 000000000..7b455e5dc
--- /dev/null
+++ b/setup/otel/getting-started/getting-started-k8s.md
@@ -0,0 +1,173 @@
+---
+description: SUSE Observability
+---
+
+# Getting Started with Open Telemetry on Rancher / Kubernetes
+
+Here is the setup we'll be creating, for an application that needs to be monitored:
+
+* The monitored application / workload running in cluster A
+* The Open Telemetry collector running near the observed application(s), so in cluster A, and sending the data to SUSE Observability
+* SUSE Observability running in cluster B, or SUSE Cloud Observability
+
+![Open Telemetry collector on Kubernetes](../../../.gitbook/assets/otel/open-telemetry-collector-kubernetes.png)
+
+## The Open Telemetry collector
+
+{% hint style="info" %}
+For a production setup it is strongly recommended to install the collector, since it allows your service to offload data quickly and the collector can take care of additional handling like retries, batching, encryption or even sensitive data filtering.
+{% endhint %}
+
+First we'll install the OTel (Open Telemetry) collector in cluster A. We configure it to:
+
+* Receive data from, potentially many, instrumented applications
+* Enrich collected data with Kubernetes attributes
+* Generate metrics for traces
+* Forward the data to SUSE Observability, including authentication using the API key
+
+Next to that it will also retry sending data when there are connection problems.
+
+### Create the namespace and a secret for the API key
+
+We'll install in the `open-telemetry` namespace and use the receiver API key generated during installation (see [here](/use/security/k8s-ingestion-api-keys.md#api-keys) where to find it):
+
+```bash
+kubectl create namespace open-telemetry
+kubectl create secret generic open-telemetry-collector \
+ --namespace open-telemetry \
+  --from-literal=API_KEY='<suse-observability-api-key>'
+```
+
+### Configure and install the collector
+
+We install the collector with a Helm chart provided by the Open Telemetry project. Make sure you have the Open Telemetry helm charts repository configured:
+
+```bash
+helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
+```
+
+Create an `otel-collector.yaml` values file for the Helm chart. Here is a good starting point for usage with SUSE Observability. Replace `<otlp-endpoint>` with your OTLP endpoint (see [OTLP API](../otlp-apis.md) for your endpoint) and insert the name of your Kubernetes cluster instead of `<your-cluster-name>`:
+
+{% code title="otel-collector.yaml" lineNumbers="true" %}
+```yaml
+# Set the API key from the secret as an env var:
+extraEnvsFrom:
+ - secretRef:
+ name: open-telemetry-collector
+mode: deployment
+image:
+ # Use the collector container image that has all components important for k8s. In case of missing components the otel/opentelemetry-collector-contrib image can be used which
+ # has all components in the contrib repository: https://github.com/open-telemetry/opentelemetry-collector-contrib
+ repository: "otel/opentelemetry-collector-k8s"
+ports:
+ metrics:
+ enabled: true
+presets:
+ kubernetesAttributes:
+ enabled: true
+ extractAllPodLabels: true
+# This is the config file for the collector:
+config:
+ receivers:
+ nop: {}
+ otlp:
+ protocols:
+ grpc:
+ endpoint: 0.0.0.0:4317
+ http:
+ endpoint: 0.0.0.0:4318
+ extensions:
+ # Use the API key from the env for authentication
+ bearertokenauth:
+ scheme: SUSEObservability
+ token: "${env:API_KEY}"
+ exporters:
+ nop: {}
+ otlp/suse-observability:
+ auth:
+ authenticator: bearertokenauth
+ # Put in your own otlp endpoint
+      endpoint: <otlp-endpoint>
+ compression: snappy
+ processors:
+ memory_limiter:
+ check_interval: 5s
+ limit_percentage: 80
+ spike_limit_percentage: 25
+ batch: {}
+ resource:
+ attributes:
+ - key: k8s.cluster.name
+ action: upsert
+ # Insert your own cluster name
+          value: <your-cluster-name>
+ - key: service.instance.id
+ from_attribute: k8s.pod.uid
+ action: insert
+ # Use the k8s namespace also as the open telemetry namespace
+ - key: service.namespace
+ from_attribute: k8s.namespace.name
+ action: insert
+ connectors:
+ # Generate metrics for spans
+ spanmetrics:
+ metrics_expiration: 5m
+ namespace: otel_span
+ service:
+ extensions: [ health_check, bearertokenauth ]
+ pipelines:
+ traces:
+ receivers: [otlp]
+ processors: [memory_limiter, resource, batch]
+ exporters: [debug, spanmetrics, otlp/suse-observability]
+ metrics:
+ receivers: [otlp, spanmetrics, prometheus]
+ processors: [memory_limiter, resource, batch]
+ exporters: [debug, otlp/suse-observability]
+ logs:
+ receivers: [nop]
+ processors: []
+ exporters: [nop]
+```
+{% endcode %}
+
+{% hint style="warning" %}
+**Use the same cluster name as used for installing the SUSE Observability agent** if you also use the SUSE Observability agent with the Kubernetes stackpack. Using a different cluster name will result in an empty traces perspective for Kubernetes components and will overall make correlating information much harder for SUSE Observability and your users.
+{% endhint %}
+
+Now install the collector, using the configuration file:
+
+```bash
+helm upgrade --install opentelemetry-collector open-telemetry/opentelemetry-collector \
+ --values otel-collector.yaml \
+ --namespace open-telemetry
+```
+
+The collector offers many more receivers, processors and exporters; for more details see our [collector page](../collector.md). In production, often large numbers of spans are generated and you will want to set up [sampling](../sampling.md).
+
+## Collect telemetry data from your application
+
+The common way to collect telemetry data is to instrument your application using the Open Telemetry SDKs. We've documented some quick start guides for a few languages, but there are many more:
+* [Java](../instrumentation/java.md)
+* [.NET](../instrumentation/dot-net.md)
+* [Node.js](../instrumentation/node.js.md)
+
+For other languages follow the documentation on [opentelemetry.io](https://opentelemetry.io/docs/languages/) and make sure to configure the SDK exporter to ship data to the collector you just installed by following [these instructions](../instrumentation/sdk-exporter-config.md).
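+
+As a quick sketch, an application running in the same cluster can usually be pointed at the collector with environment variables like the ones below. The service name assumes the Helm release name (`opentelemetry-collector`) and namespace (`open-telemetry`) used above, adjust them if yours differ:
+
+```yaml
+env:
+  - name: OTEL_EXPORTER_OTLP_ENDPOINT
+    value: "http://opentelemetry-collector.open-telemetry.svc.cluster.local:4318"
+  - name: OTEL_EXPORTER_OTLP_PROTOCOL
+    value: "http/protobuf"
+  - name: OTEL_SERVICE_NAME
+    value: "my-service"
+```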
+
+## View the results
+Go to SUSE Observability and make sure the Open Telemetry Stackpack is installed (via the main menu -> Stackpacks).
+
+After a short while and if your pods are getting some traffic you should be able to find them under their service name in the Open Telemetry -> services and service instances overviews. Traces will appear in the [trace explorer](/use/traces/k8sTs-explore-traces.md) and in the [trace perspective](/use/views/k8s-traces-perspective.md) for the service and service instance components. Span metrics and language specific metrics (if available) will become available in the [metrics perspective](/use/views/k8s-metrics-perspective.md) for the components.
+
+If you also have the Kubernetes stackpack installed the instrumented pods will also have the traces available in the [trace perspective](/use/views/k8s-traces-perspective.md).
+
+## Next steps
+You can add new charts for your application to components, for example the service or service instance, by following [our guide](/use/metrics/k8s-add-charts.md). It is also possible to create [new monitors](/use/alerting/k8s-monitors.md) using the metrics and to set up [notifications](/use/alerting/notifications/configure.md) to get notified when your application is unavailable or having performance issues.
+
+## More info
+
+* [API keys](/use/security/k8s-ingestion-api-keys.md)
+* [Open Telemetry API](../otlp-apis.md)
+* [Customizing Open Telemetry Collector configuration](../collector.md)
+* [Open Telemetry SDKs](../instrumentation/README.md)
\ No newline at end of file
diff --git a/setup/otel/getting-started/getting-started-lambda.md b/setup/otel/getting-started/getting-started-lambda.md
new file mode 100644
index 000000000..5f2ab8ed3
--- /dev/null
+++ b/setup/otel/getting-started/getting-started-lambda.md
@@ -0,0 +1,160 @@
+---
+description: SUSE Observability
+---
+
+# Getting started for AWS Lambda
+
+We'll set up monitoring for one or more AWS Lambda functions:
+* The monitored AWS Lambda function(s) (instrumented using Open Telemetry)
+* The Open Telemetry collector
+* SUSE Observability or SUSE Cloud Observability
+
+![Open Telemetry collector as a proxy for AWS Lambda](../../../.gitbook/assets/otel/open-telemetry-collector-lambda.png)
+
+## The Open Telemetry collector
+
+{% hint style="info" %}
+For a production setup it is strongly recommended to install the collector, since it allows your service to offload data quickly and the collector can take care of additional handling like retries, batching, encryption or even sensitive data filtering.
+{% endhint %}
+
+First we'll install the OTel (Open Telemetry) collector, in this example we use a Kubernetes cluster to run it close to the Lambda functions. A similar setup can be made using a collector installed on a virtual machine instead. The configuration used here only acts as a secure proxy to offload data quickly from the Lambda functions and runs within trusted network infrastructure.
+
+### Create the namespace and a secret for the API key
+
+We'll install in the `open-telemetry` namespace and use the receiver API key generated during installation (see [here](/use/security/k8s-ingestion-api-keys.md#api-keys) where to find it):
+
+```bash
+kubectl create namespace open-telemetry
+kubectl create secret generic open-telemetry-collector \
+ --namespace open-telemetry \
+  --from-literal=API_KEY='<suse-observability-api-key>'
+```
+
+### Configure and install the collector
+
+We install the collector with a Helm chart provided by the Open Telemetry project. Make sure you have the Open Telemetry helm charts repository configured:
+
+```bash
+helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
+```
+
+Create an `otel-collector.yaml` values file for the Helm chart. Here is a good starting point for usage with SUSE Observability. Replace `<otlp-endpoint>` with your OTLP endpoint (see [OTLP API](../otlp-apis.md) for your endpoint). When using the ingress configuration also make sure to insert your own domain name and the corresponding TLS certificate secret in the marked locations.
+
+{% code title="otel-collector.yaml" lineNumbers="true" %}
+```yaml
+mode: deployment
+presets:
+ kubernetesAttributes:
+ enabled: true
+    # You can also configure the preset to add all the associated pod's labels and annotations to your telemetry.
+ # The label/annotation name will become the resource attribute's key.
+ extractAllPodLabels: true
+extraEnvsFrom:
+ - secretRef:
+ name: open-telemetry-collector
+image:
+ repository: "otel/opentelemetry-collector-k8s"
+
+config:
+ receivers:
+ otlp:
+ protocols:
+ grpc:
+ endpoint: 0.0.0.0:4317
+ http:
+ endpoint: 0.0.0.0:4318
+ extensions:
+ # Use the API key from the env for authentication
+ bearertokenauth:
+ scheme: SUSEObservability
+ token: "${env:API_KEY}"
+ exporters:
+ otlp:
+ auth:
+ authenticator: bearertokenauth
+ # Put in your own otlp endpoint
+      endpoint: <otlp-endpoint>
+
+ service:
+ extensions: [health_check, bearertokenauth]
+ pipelines:
+ traces:
+ receivers: [otlp]
+ processors: [batch]
+ exporters: [otlp]
+ metrics:
+ receivers: [otlp]
+ processors: [batch]
+ exporters: [otlp]
+ logs:
+ receivers: [otlp]
+ processors: [batch]
+ exporters: [otlp]
+
+ingress:
+ enabled: true
+ annotations:
+ kubernetes.io/ingress.class: ingress-nginx-external
+ nginx.ingress.kubernetes.io/ingress.class: ingress-nginx-external
+ nginx.ingress.kubernetes.io/backend-protocol: GRPC
+ # "12.34.56.78/32" IP address of NatGateway in the VPC where the otel data is originating from
+ # nginx.ingress.kubernetes.io/whitelist-source-range: "12.34.56.78/32"
+ hosts:
+    - host: "otlp-collector-proxy.<your-domain>"
+ paths:
+ - path: /
+ pathType: ImplementationSpecific
+ port: 4317
+ tls:
+    - secretName: <tls-secret-name>
+ hosts:
+        - "otlp-collector-proxy.<your-domain>"
+
+# Instead of ingress:
+
+# Alternative 1, load balancer service
+#service:
+# type: LoadBalancer
+#  loadBalancerSourceRanges: ["12.34.56.78/32"] # The IP address of the NatGateway in the VPC for the lambda functions
+
+# Alternative 2, node port service
+#service:
+# type: NodePort
+#ports:
+# otlp:
+# nodePort: 30317
+```
+{% endcode %}
+
+Now install the collector, using the configuration file:
+
+```bash
+helm upgrade --install opentelemetry-collector open-telemetry/opentelemetry-collector \
+ --values otel-collector.yaml \
+ --namespace open-telemetry
+```
+
+Make sure that the proxy collector is accessible to the Lambda functions, either by making the ingress publicly accessible or by having the collector IP in the same VPC as the Lambda functions. It is recommended to use a source-range whitelist to filter out data from untrusted and/or unknown sources (see the comment in the yaml). Besides the ingress setup it is also possible to expose the collector to the Lambda functions via:
+* a LoadBalancer service that restricts access by limiting the source ranges, see "Alternative 1".
+* a NodePort service for the collector, see "Alternative 2".
+
+The collector offers many more receivers, processors and exporters; for more details see our [collector page](../collector.md). In production, often large numbers of spans are generated and you will want to set up [sampling](../sampling.md).
+
+## Instrument a Lambda function
+
+Open Telemetry supports instrumenting Lambda functions in multiple languages using Lambda layers. The configuration of those Lambda layers should use the address of the collector from the previous step to ship the data. To instrument a Node.js lambda follow our [detailed instructions here](../instrumentation/node.js/auto-instrumentation-of-lambdas.md). For instrumenting other languages apply the same configuration as for Node.js but use one of the other [Open Telemetry Lambda layers](https://opentelemetry.io/docs/platforms/faas/lambda-auto-instrument/).
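+
+As a sketch, the layer is typically configured through environment variables on the Lambda function; the wrapper path below is the one used by the Node.js layer and the endpoint matches the gRPC ingress host from the example above (adjust both to your setup):
+
+```bash
+# Enable the Open Telemetry wrapper shipped in the Lambda layer (Node.js layer path)
+AWS_LAMBDA_EXEC_WRAPPER=/opt/otel-handler
+# Send telemetry to the collector proxy exposed via the ingress (TLS, port 443)
+OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-collector-proxy.<your-domain>
+```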
+
+## View the results
+Go to SUSE Observability and make sure the Open Telemetry Stackpack is installed (via the main menu -> Stackpacks).
+
+After a short while and if your Lambda function(s) are getting some traffic you should be able to find the functions under their service name in the Open Telemetry -> services and service instances overviews. Traces will appear in the [trace explorer](/use/traces/k8sTs-explore-traces.md) and in the [trace perspective](/use/views/k8s-traces-perspective.md) for the service and service instance components. Span metrics and language specific metrics (if available) will become available in the [metrics perspective](/use/views/k8s-metrics-perspective.md) for the components.
+
+## Next steps
+You can add new charts for your application to components, for example the service or service instance, by following [our guide](/use/metrics/k8s-add-charts.md). It is also possible to create [new monitors](/use/alerting/k8s-monitors.md) using the metrics and to set up [notifications](/use/alerting/notifications/configure.md) to get notified when your application is unavailable or having performance issues.
+
+## More info
+
+* [API keys](/use/security/k8s-ingestion-api-keys.md)
+* [Open Telemetry API](../otlp-apis.md)
+* [Customizing Open Telemetry Collector configuration](../collector.md)
+* [Open Telemetry SDKs](../instrumentation/README.md)
\ No newline at end of file
diff --git a/setup/otel/getting-started/getting-started-linux.md b/setup/otel/getting-started/getting-started-linux.md
new file mode 100644
index 000000000..3338061a0
--- /dev/null
+++ b/setup/otel/getting-started/getting-started-linux.md
@@ -0,0 +1,174 @@
+---
+description: SUSE Observability
+---
+
+# Getting Started with Open Telemetry on Linux
+
+Here is the setup we'll be creating, for an application that needs to be monitored:
+
+* The monitored application / workload running on a Linux host
+* The Open Telemetry collector running on the same Linux host
+* SUSE Observability or SUSE Cloud Observability
+
+
+![Open Telemetry collector on a Linux host](../../../.gitbook/assets/otel/open-telemetry-collector-linux.png)
+
+## Install the Open Telemetry collector
+
+{% hint style="info" %}
+For a production setup it is strongly recommended to install the collector, since it allows your service to offload data quickly and the collector can take care of additional handling like retries, batching, encryption or even sensitive data filtering.
+{% endhint %}
+
+First we'll install the collector. We configure it to:
+
+* Receive data from, potentially many, instrumented applications
+* Enrich collected data with host attributes
+* Generate metrics for traces
+* Forward the data to SUSE Observability, including authentication using the API key
+
+Next to that it will also retry sending data when there are connection problems.
+
+### Install and configure the collector
+
+The collector project provides packages (apk, deb and rpm) for most Linux distributions and architectures and uses `systemd` for automatic service configuration. To install it, find the [latest release on GitHub](https://github.com/open-telemetry/opentelemetry-collector-releases/releases) and update the URLs in the examples to use the latest version:
+
+{% tabs %}
+{% tab title="DEB AMD64" %}
+```bash
+wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.123.1/otelcol-contrib_0.123.1_linux_amd64.deb
+sudo dpkg -i otelcol-contrib_0.123.1_linux_amd64.deb
+```
+{% endtab %}
+{% tab title="DEB ARM64" %}
+```bash
+wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.123.1/otelcol-contrib_0.123.1_linux_arm64.deb
+sudo dpkg -i otelcol-contrib_0.123.1_linux_arm64.deb
+```
+{% endtab %}
+{% tab title="RPM AMD64" %}
+```bash
+wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.123.1/otelcol-contrib_0.123.1_linux_amd64.rpm
+sudo rpm -ivh otelcol-contrib_0.123.1_linux_amd64.rpm
+```
+{% endtab %}
+{% tab title="RPM ARM64" %}
+```bash
+wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.123.1/otelcol-contrib_0.123.1_linux_arm64.rpm
+sudo rpm -ivh otelcol-contrib_0.123.1_linux_arm64.rpm
+```
+{% endtab %}
+{% endtabs %}
+
+For other installation options use the [Open Telemetry instructions](https://opentelemetry.io/docs/collector/installation/#linux).
+
+After installation, modify the collector configuration by editing `/etc/otelcol-contrib/config.yaml`. Change the file so it looks like the `config.yaml` example below: fill in your OTLP endpoint as the exporter `endpoint` (see [OTLP APIs](../otlp-apis.md) for the endpoint of your installation) and insert your Receiver API key as the `bearertokenauth` token (see [API keys](/use/security/k8s-ingestion-api-keys.md#api-keys) for where to find it):
+
+{% code title="config.yaml" lineNumbers="true" %}
+```yaml
+receivers:
+ nop: {}
+ otlp:
+ protocols:
+ # Only bind to localhost to keep the collector secure, see https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks
+ grpc:
+ endpoint: 127.0.0.1:4317
+ http:
+ endpoint: 127.0.0.1:4318
+ # Collect own metrics
+ prometheus:
+ config:
+ scrape_configs:
+ - job_name: 'otel-collector'
+ scrape_interval: 10s
+ static_configs:
+ - targets: ['0.0.0.0:8888']
+extensions:
+ health_check: {}
+ pprof:
+ endpoint: 0.0.0.0:1777
+ zpages:
+ endpoint: 0.0.0.0:55679
+ # Use the API key from the env for authentication
+ bearertokenauth:
+ scheme: SUSEObservability
+ token: ""
+exporters:
+ nop: {}
+ debug: {}
+ otlp/suse-observability:
+ compression: snappy
+ auth:
+ authenticator: bearertokenauth
+ # Put in your own otlp endpoint
+ endpoint:
+processors:
+ memory_limiter:
+ check_interval: 5s
+ limit_percentage: 80
+ spike_limit_percentage: 25
+ batch: {}
+ # Optionally include resource information from the system running the collector
+ resourcedetection/system:
+ detectors: [env, system] # Replace system with gcp, ec2, azure when running in cloud environments
+ system:
+ hostname_sources: ["os"]
+connectors:
+ # Generate metrics for spans
+ spanmetrics:
+ metrics_expiration: 5m
+ namespace: otel_span
+service:
+ extensions: [ bearertokenauth, health_check, pprof, zpages ]
+ pipelines:
+ traces:
+ receivers: [otlp]
+ processors: [memory_limiter, resourcedetection/system, batch]
+ exporters: [debug, spanmetrics, otlp/suse-observability]
+ metrics:
+ receivers: [otlp, spanmetrics, prometheus]
+ processors: [memory_limiter, batch, resourcedetection/system]
+ exporters: [debug, otlp/suse-observability]
+ logs:
+ receivers: [nop]
+ processors: []
+ exporters: [nop]
+```
+{% endcode %}
+
+Finally restart the collector:
+
+```bash
+sudo systemctl restart otelcol-contrib
+```
+
+To see the logs of the collector use:
+```bash
+sudo journalctl -u otelcol-contrib
+```
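+
+To quickly check that the collector started correctly you can query the `health_check` extension configured above (this sketch assumes the extension's default port `13133`):
+
+```bash
+curl http://localhost:13133
+```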
+
+## Collect telemetry data from your application
+
+The most common way to collect telemetry data is to instrument your application using the Open Telemetry SDKs. We've documented quick start guides for a few languages, but many more are available:
+* [Java](../instrumentation/java.md)
+* [.NET](../instrumentation/dot-net.md)
+* [Node.js](../instrumentation/node.js.md)
+
+No additional configuration is needed for the SDKs; by default they export to the collector on localhost via OTLP (gRPC) or OTLP over HTTP, depending on which protocols the SDK supports.
+
+For other languages follow the documentation on [opentelemetry.io](https://opentelemetry.io/docs/languages/).
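+
+As a minimal sketch for Node.js (assuming the auto-instrumentation package from the quick start guide is installed; `my-service` and `app.js` are placeholders), only a service name needs to be set before starting the application:
+
+```bash
+export OTEL_SERVICE_NAME="my-service"
+node --require @opentelemetry/auto-instrumentations-node/register app.js
+```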
+
+## View the results
+Go to SUSE Observability and make sure the Open Telemetry Stackpack is installed (via the main menu -> Stackpacks).
+
+After a short while, and provided your application is processing some traffic, you should be able to find it under its service name in the Open Telemetry -> services and service instances overviews. Traces will appear in the [trace explorer](/use/traces/k8sTs-explore-traces.md) and in the [trace perspective](/use/views/k8s-traces-perspective.md) for the service and service instance components. Span metrics and language-specific metrics (if available) will become available in the [metrics perspective](/use/views/k8s-metrics-perspective.md) for the components.
+
+## Next steps
+You can add new charts to components for your application, for example the service or service instance, by following [our guide](/use/metrics/k8s-add-charts.md). It is also possible to create [new monitors](/use/alerting/k8s-monitors.md) using the metrics and to set up [notifications](/use/alerting/notifications/configure.md) to get notified when your application is unavailable or has performance issues.
+
+## More info
+
+* [API keys](/use/security/k8s-ingestion-api-keys.md)
+* [Open Telemetry API](../otlp-apis.md)
+* [Customizing Open Telemetry Collector configuration](../collector.md)
+* [Open Telemetry SDKs](../instrumentation/README.md)
\ No newline at end of file
diff --git a/setup/otel/languages/README.md b/setup/otel/instrumentation/README.md
similarity index 100%
rename from setup/otel/languages/README.md
rename to setup/otel/instrumentation/README.md
diff --git a/setup/otel/languages/dot-net.md b/setup/otel/instrumentation/dot-net.md
similarity index 94%
rename from setup/otel/languages/dot-net.md
rename to setup/otel/instrumentation/dot-net.md
index 550efb358..41f03cf82 100644
--- a/setup/otel/languages/dot-net.md
+++ b/setup/otel/instrumentation/dot-net.md
@@ -32,9 +32,9 @@ env:
- name: OTEL_DOTNET_AUTO_HOME
value: "/autoinstrumentation"
```
-3. Also add the extra environment variables [to configure the service name and exporter endpoint](./sdk-exporter-config.md) on the pod.
+3. Also add the extra environment variables [to configure the service name and exporter endpoint](./sdk-exporter-config.md) on the pod; the supported protocols are gRPC and protobuf over HTTP.
4. Deploy your application with the changes
-5. [Verify](./verify.md) SUSE Observability is receiving traces and/or metrics
+5. Verify SUSE Observability is receiving traces and/or metrics by searching for your service name in the metrics explorer and the trace explorer
For more details please refer to the [Open Telemetry documentation](https://opentelemetry.io/docs/languages/java/automatic/).
diff --git a/setup/otel/languages/java.md b/setup/otel/instrumentation/java.md
similarity index 93%
rename from setup/otel/languages/java.md
rename to setup/otel/instrumentation/java.md
index 54f3b3a85..da61cc294 100644
--- a/setup/otel/languages/java.md
+++ b/setup/otel/instrumentation/java.md
@@ -17,8 +17,8 @@ Automatic instrumentation does not require any modifications of the application.
```bash
java -javaagent:/path/to/opentelemetry-javaagent.jar -jar myapp.jar
```
-3. Deploy your application with the extra environment variables [to configure the service name and exporter endpoint](./sdk-exporter-config.md).
-4. [Verify](./verify.md) SUSE Observability is receiving traces and/or metrics
+3. Deploy your application with the extra environment variables [to configure the service name and exporter endpoint](./sdk-exporter-config.md); the supported protocols are gRPC and protobuf over HTTP.
+4. Verify SUSE Observability is receiving traces and/or metrics by searching for your service name in the metrics explorer and the trace explorer
For more details please refer to the [Open Telemetry documentation](https://opentelemetry.io/docs/languages/java/automatic/).
diff --git a/setup/otel/languages/node.js.md b/setup/otel/instrumentation/node.js.md
similarity index 92%
rename from setup/otel/languages/node.js.md
rename to setup/otel/instrumentation/node.js.md
index 6676261b6..64203152c 100644
--- a/setup/otel/languages/node.js.md
+++ b/setup/otel/instrumentation/node.js.md
@@ -19,8 +19,8 @@ npm install --save @opentelemetry/auto-instrumentations-node
```bash
node --require @opentelemetry/auto-instrumentations-node/register app.js
```
-3. Deploy your application with the extra environment variables [to configure the service name and exporter endpoint](./sdk-exporter-config.md).
-4. [Verify](./verify.md) SUSE Observability is receiving traces and/or metrics
+3. Deploy your application with the extra environment variables [to configure the service name and exporter endpoint](./sdk-exporter-config.md); the supported protocols are gRPC and protobuf over HTTP.
+4. Verify SUSE Observability is receiving traces and/or metrics by searching for your service name in the metrics explorer and the trace explorer
For more details please refer to the [Open Telemetry documentation](https://opentelemetry.io/docs/languages/js/automatic/).
diff --git a/setup/otel/languages/node.js/auto-instrumentation-of-lambdas.md b/setup/otel/instrumentation/node.js/auto-instrumentation-of-lambdas.md
similarity index 94%
rename from setup/otel/languages/node.js/auto-instrumentation-of-lambdas.md
rename to setup/otel/instrumentation/node.js/auto-instrumentation-of-lambdas.md
index d89ac09d3..a43444c34 100644
--- a/setup/otel/languages/node.js/auto-instrumentation-of-lambdas.md
+++ b/setup/otel/instrumentation/node.js/auto-instrumentation-of-lambdas.md
@@ -93,9 +93,7 @@ Be aware this collector is used to send the data over to a next collector which
Depending on the desired functionality, or based upon factors such as volumes of data being generated by lambdas instrumented in this way, collectors can be set up for batching, tail-sampling, and other pre-processing techniques to reduce the impact on SUSE Observability.
-See this page for [guidance and instruction](../../proxy-collector.md) on how to set up a batching collector that acts as a security proxy for SUSE Observability.
-See this page for [instructions](../../collector.md) on how to set up a collector that does tail-sampling as well.
-For more information about processor configuration on the opentelemetry collector, see the [official documentation](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/README.md).
+Follow the [getting started guide](../../getting-started/getting-started-lambda.md) to set up a collector that sends the data to SUSE Observability. How to customize the collector configuration, for example to set up sampling or filtering, is described in our [collector documentation](../../collector.md).

diff --git a/setup/otel/instrumentation/sdk-exporter-config.md b/setup/otel/instrumentation/sdk-exporter-config.md
new file mode 100644
index 000000000..83f9fa219
--- /dev/null
+++ b/setup/otel/instrumentation/sdk-exporter-config.md
@@ -0,0 +1,86 @@
+---
+description: SUSE Observability
+---
+
+# Configuring SDK exporters
+
+To send data to SUSE Observability the SDKs that are used to instrument your application use a built-in exporter. A production-ready setup uses [a collector](#with-a-collector-production-setup) close to your instrumented applications to send the data to SUSE Observability, but it is also possible to have the instrumented application [send the telemetry data directly](#without-a-collector) to SUSE Observability.
+
+## With a collector (production setup)
+
+### SDK Exporter config for Kubernetes
+
+All SDKs, regardless of the language, use the same basic configuration for defining the Open Telemetry [service name](https://opentelemetry.io/docs/concepts/glossary/#service) and the exporter endpoint (i.e. where the telemetry is sent).
+
+These can be configured by setting environment variables for your instrumented application.
+
+In Kubernetes, set these environment variables in the manifest for your workload (fill in a name for your application service as the `OTEL_SERVICE_NAME` value):
+
+```yaml
+...
+spec:
+ containers:
+ - env:
+ - name: OTEL_EXPORTER_OTLP_ENDPOINT
+ value: http://opentelemetry-collector.open-telemetry.svc.cluster.local:4317
+ - name: OTEL_SERVICE_NAME
+ value:
+ - name: OTEL_EXPORTER_OTLP_PROTOCOL
+ value: grpc
+...
+```
+
+The endpoint specified in the example assumes the collector was installed using the defaults from [the installation guide](../collector.md). It uses port `4317`, which carries the `gRPC` version of the OTLP protocol. Some instrumentations only support HTTP; in that case use port `4318`.
+
+The service name can also be derived from Kubernetes labels that may already be present. For example like this:
+```yaml
+spec:
+ containers:
+ - env:
+ - name: OTEL_SERVICE_NAME
+ valueFrom:
+ fieldRef:
+ apiVersion: v1
+ fieldPath: metadata.labels['app.kubernetes.io/component']
+```
+
+### SDK Exporter config for other installations
+
+To configure the SDK set these environment variables for your application:
+
+```bash
+export OTEL_EXPORTER_OTLP_ENDPOINT="http://:4317"
+export OTEL_EXPORTER_OTLP_PROTOCOL="grpc"
+export OTEL_SERVICE_NAME=""
+export OTEL_RESOURCE_ATTRIBUTES='service.namespace='
+```
+
+The example uses port `4317`, which carries the `gRPC` version of the OTLP protocol. Some instrumentations only support HTTP; in that case use port `4318` with the protocol set to `http/protobuf`. Check the SDK documentation for your language to see which protocols it supports. `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_EXPORTER_OTLP_PROTOCOL` can be omitted; they have default values which send data to the preferred endpoint on localhost.
+
+`OTEL_RESOURCE_ATTRIBUTES` is optional and, besides defining a service namespace, can be used to set more resource attributes as a comma-separated list.
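+
+For example, a hypothetical set of resource attributes (the attribute values are placeholders):
+
+```bash
+export OTEL_RESOURCE_ATTRIBUTES='service.namespace=shop,service.version=1.2.3,deployment.environment=production'
+```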
+
+### gRPC vs HTTP
+
+OTLP, the Open Telemetry Protocol, supports gRPC and protobuf over HTTP. In the previous sections the exporter protocol is set to `grpc`, which usually gives the best performance. Besides the SDK not supporting gRPC, there can be other reasons to prefer HTTP:
+
+* Some firewalls are not set up to handle gRPC
+* (reverse) proxies and load balancers may not support gRPC without additional configuration
+* gRPC's long-lived connections may cause problems when load-balancing.
+
+To switch from gRPC to HTTP, change the protocol to `http/protobuf` *and* use port `4318`.
+
+To summarize, try HTTP in case gRPC is not working for you (see the sketch after this list for an example):
+
+* `grpc` protocol uses port `4317` on the collector
+* `http` protocol uses port `4318` on the collector
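+
+A minimal sketch of the switched configuration (`<collector-host>` is a placeholder for the host your collector runs on; use `localhost` when it runs on the same host):
+
+```bash
+export OTEL_EXPORTER_OTLP_ENDPOINT="http://<collector-host>:4318"
+export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
+```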
+
+## Without a collector
+
+In small test setups it can be convenient to directly send data from your instrumented application to SUSE Observability. The only difference from the collector setup documented above is to use a different value for `OTEL_EXPORTER_OTLP_ENDPOINT`:
+
+* For gRPC use the OTLP endpoint for SUSE Observability, see the [OTLP APIs page](../otlp-apis.md).
+* For HTTP use the OTLP over HTTP endpoint for SUSE Observability, see the [OTLP APIs page](../otlp-apis.md).
+
+{% hint type="info" %}
+Replace both the collector URL **and** the port with the SUSE Observability endpoints. Depending on your SUSE Observability installation the ports will be different.
+{% endhint %}
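+
+A hypothetical example for SUSE Cloud Observability over gRPC (`<your-instance>` is a placeholder; see the [OTLP APIs page](../otlp-apis.md) for the exact endpoint and any authentication requirements of your installation):
+
+```bash
+export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp-<your-instance>.app.stackstate.com:443"
+export OTEL_EXPORTER_OTLP_PROTOCOL="grpc"
+export OTEL_SERVICE_NAME="my-service"
+```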
diff --git a/setup/otel/languages/sdk-exporter-config.md b/setup/otel/languages/sdk-exporter-config.md
deleted file mode 100644
index bf52ab0c9..000000000
--- a/setup/otel/languages/sdk-exporter-config.md
+++ /dev/null
@@ -1,54 +0,0 @@
----
-description: SUSE Observability
----
-
-# Exporter config
-
-All SDKs, regardless of the language, use the same basic configuration for defining the Open Telemetry [service name](https://opentelemetry.io/docs/concepts/glossary/#service) and the exporter endpoint (i.e. where the telemetry is sent).
-
-These can be configured by setting environment variables for your instrumented application.
-
-In Kubernetes set these environment variables in the manifest for your workload (replace `` with a name for your application service):
-
-```yaml
-...
-spec:
- containers:
- - env:
- - name: OTEL_EXPORTER_OTLP_ENDPOINT
- value: http://opentelemetry-collector.open-telemetry.svc.cluster.local:4317
- - name: OTEL_SERVICE_NAME
- value:
- - name: OTEL_EXPORTER_OTLP_PROTOCOL
- value: grpc
-...
-```
-
-The endpoint specified in the example assumes the collector was installed using the defaults from [the installation guide](../collector.md). It uses port `4317` which uses the `gRPC` version of the OTLP protocol. Some instrumentations only support HTTP, in that case, use port `4318`.
-
-The service name can also be derived from Kubernetes labels that may already be present. For example like this:
-```yaml
-spec:
- containers:
- - env:
- - name: OTEL_SERVICE_NAME
- valueFrom:
- fieldRef:
- apiVersion: v1
- fieldPath: metadata.labels['app.kubernetes.io/component']
-```
-
-## gRPC vs HTTP
-
-OTLP, the Open Telemetry Protocol, supports gRPC and protobuf over HTTP. Some SDKs also support JSON over HTTP. In the previous section, the exporter protocol is set to `gRPC`, this usually gives the best performance and is the default for many SDKs. However, in some cases it may be problematic:
-
-* Some firewalls are not setup to handle gRPC
-* (reverse) proxies and load balancers may not support gRPC without additional configuration
-* gRPC's long-lived connections may cause problems when load-balancing.
-
-To switch to HTTP instead of gRPC change the protocol to `http` *and* use port `4318`.
-
-To summarize, use HTTP in case gRPC is given problems:
-
-* `grpc` protocol uses port `4317`
-* `http` protocol uses port `4318`
diff --git a/setup/otel/languages/verify.md b/setup/otel/languages/verify.md
deleted file mode 100644
index 1fee35778..000000000
--- a/setup/otel/languages/verify.md
+++ /dev/null
@@ -1,22 +0,0 @@
----
-description: SUSE Observability
----
-
-# Verify the instrumentation is working
-
-If the collector and the instrumentation setup has been successful data should be available in SUSE Observability within about a minute or two.
-
-You can check that SUSE Observability is receiving traces:
-
-1. Open SUSE Observability in a browser
-2. Find (one of) the pods that is instrumented
-3. Select the pod to open the Highlights page
-4. Open the trace perspective. If the pod is serving traffic it should now show traces
-
-To check that SUSE Observability is receiving metrics:
-
-1. Open SUSE Observability in a browser
-2. Open the metrics explorer from the menu
-3. Search for the metrics exposed by your application
-
-If there are still no metrics after 5 minutes something is likely mis-configured. See [troubleshooting](../troubleshooting.md) for help.
\ No newline at end of file
diff --git a/setup/otel/otlp-apis.md b/setup/otel/otlp-apis.md
new file mode 100644
index 000000000..b460acb37
--- /dev/null
+++ b/setup/otel/otlp-apis.md
@@ -0,0 +1,68 @@
+---
+description: SUSE Observability
+---
+
+# SUSE Observability Open Telemetry Protocol support
+
+SUSE Observability supports two versions of the OTLP protocol: the `grpc` version (also referred to as OTLP) and `http/protobuf` (also referred to as OTLP over HTTP). In the collector configuration you can choose which exporter to use, but make sure to configure the correct URL for SUSE Observability. The `grpc` version of the protocol is preferred because it allows for larger payloads and higher throughput, but in case of poor support for `grpc` in your infrastructure you can switch to the HTTP version. See also [troubleshooting](./troubleshooting.md#some-proxies-and-firewalls-dont-work-well-with-grpc).
+
+## SUSE Cloud Observability
+
+The endpoints for SUSE Cloud Observability are:
+
+* OTLP: `https://otlp-.app.stackstate.com:443`
+* OTLP over HTTP: `https://otlp-http-.app.stackstate.com`
+
+## Self-hosted SUSE Observability
+
+For a self-hosted installation you need to enable one of the endpoints, or both, by configuring the ingress for SUSE Observability as [described here](../install-stackstate/kubernetes_openshift/ingress.md#configure-ingress-rule-for-open-telemetry).
+
+When SUSE Observability is running in the same cluster as the collector you can also use it without ingress by using the service endpoints:
+* OTLP: `http://suse-observability-otel-collector..svc.cluster.local:4317`
+* OTLP over HTTP: `http://suse-observability-otel-collector..svc.cluster.local:4318`
+
+Make sure to set `insecure: true` in the collector configuration (see the next section) to allow the use of plain HTTP endpoints instead of HTTPS.
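+
+A minimal sketch of the exporter configuration for the in-cluster service endpoint (the namespace in the endpoint is a placeholder for the namespace SUSE Observability is installed in):
+
+```yaml
+exporters:
+  otlp/suse-observability:
+    auth:
+      authenticator: bearertokenauth
+    # In-cluster service endpoint uses plain HTTP, so TLS is disabled
+    endpoint: http://suse-observability-otel-collector.<namespace>.svc.cluster.local:4317
+    tls:
+      insecure: true
+```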
+
+## Collector configuration
+
+The examples in the collector configuration use the OTLP protocol like this:
+
+```yaml
+extensions:
+ bearertokenauth:
+ scheme: SUSEObservability
+ token: "${env:API_KEY}"
+exporters:
+ otlp/suse-observability:
+ auth:
+ authenticator: bearertokenauth
+ endpoint:
+ # Optional TLS configurations:
+ #tls:
+ # To disable TLS entirely:
+ # insecure: true
+ # To disable certificate verification (but still use TLS):
+ # insecure_skip_verify: true
+```
+
+To use the OTLP over HTTP protocol, use the `otlphttp` exporter instead. Don't forget to update the exporter references in your pipelines from `otlp/suse-observability` to `otlphttp/suse-observability` (see the pipeline sketch after the example below)!
+
+```yaml
+extensions:
+ bearertokenauth:
+ scheme: SUSEObservability
+ token: "${env:API_KEY}"
+exporters:
+  otlphttp/suse-observability:
+ auth:
+ authenticator: bearertokenauth
+ endpoint:
+ # Optional TLS configurations:
+ #tls:
+ # To disable TLS entirely:
+ # insecure: true
+    # To disable certificate verification (but still use TLS):
+ # insecure_skip_verify: true
+```
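+
+A minimal sketch of the corresponding pipeline change (the pipeline contents are assumptions based on the getting started configuration; keep your own receivers and processors):
+
+```yaml
+service:
+  pipelines:
+    traces:
+      receivers: [otlp]
+      processors: [memory_limiter, batch]
+      exporters: [debug, spanmetrics, otlphttp/suse-observability]
+    metrics:
+      receivers: [otlp, spanmetrics]
+      processors: [memory_limiter, batch]
+      exporters: [debug, otlphttp/suse-observability]
+```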
+
+There is more configuration available to control the exact requirements and behavior of the exporter. For example it is also possible to use a custom CA root certificate or to enable client certificates. See the [OTLP exporter documentation](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/otlpexporter/README.md) for the details.
diff --git a/setup/otel/overview.md b/setup/otel/overview.md
new file mode 100644
index 000000000..55b8faf44
--- /dev/null
+++ b/setup/otel/overview.md
@@ -0,0 +1,21 @@
+---
+description: SUSE Observability
+---
+
+# Open Telemetry Overview
+
+
+
+SUSE Observability supports [Open Telemetry](https://opentelemetry.io/docs/what-is-opentelemetry/). Open Telemetry is a set of standardized protocols and an open-source framework to collect, transform and ship telemetry data such as traces, metrics and logs. Open Telemetry supports a wide variety of programming languages and platforms.
+
+SUSE Observability has support for both metrics and traces. When used in combination with the Kubernetes stackpack, Kubernetes pods will be enriched with traces when available. By installing the Open Telemetry stackpack, new overview pages for services and service instances become available, providing access to traces and span metrics. Open Telemetry metrics can be used in monitors and metric bindings. The stackpack comes with metric bindings for span metrics and for .NET and JVM memory metrics. There are also out-of-the-box monitors for span error rates and duration.
+
+The recommended setup for usage with SUSE Observability is to instrument applications with the applicable Open Telemetry [SDKs](./instrumentation/README.md) and to install the [Open Telemetry collector](./collector.md) close to your instrumented applications, to pre-process the data (enrich with Kubernetes labels, sample traces, etc.) and ship it to SUSE Observability. The Open Telemetry collector can also be used to collect metrics from many types of telemetry sources without instrumenting the applications with the SDKs. See the [Open Telemetry collector integrations](https://opentelemetry.io/ecosystem/registry/?language=collector) to check whether your database or telemetry protocol is supported.
+
+Follow the [getting started](./getting-started/README.md) guide to set up everything such that it works best with SUSE Observability.
+
+## References
+
+* [Open Telemetry collector](https://opentelemetry.io/docs/collector/) on the Open Telemetry documentation
+* [Open Telemetry collector integrations](https://opentelemetry.io/ecosystem/registry/?language=collector)
+* [SDKs to instrument your application](https://opentelemetry.io/docs/instrumentation/README.md) on the Open Telemetry documentation
\ No newline at end of file
diff --git a/setup/otel/proxy-collector.md b/setup/otel/proxy-collector.md
deleted file mode 100644
index 462d4cb36..000000000
--- a/setup/otel/proxy-collector.md
+++ /dev/null
@@ -1,94 +0,0 @@
----
-description: SUSE Observability
----
-
-# Open Telemetry Collector as a proxy
-
-The normal configuration of the Opentelemetry Collector for tail-sampling traces can be found [here](collector.md)
-
-The below configuration describes a deployment that only does batching, and no further processing of traces, metrics,
-or logs. It is meant as a security proxy that exists outside the SUSE Observability cluster, but within trusted network
-infrastructure. Security credentials for the proxy and SUSE Observability can be set up separately, adding a layer of
-authentication that does not reside with the caller, but with the host.
-
-
-
-{% code title="otel-collector.yaml" lineNumbers="true" %}
-```yaml
-mode: deployment
-presets:
- kubernetesAttributes:
- enabled: true
- # You can also configure the preset to add all the associated pod's labels and annotations to you telemetry.
- # The label/annotation name will become the resource attribute's key.
- extractAllPodLabels: true
-extraEnvsFrom:
- - secretRef:
- name: open-telemetry-collector
-image:
- repository: "otel/opentelemetry-collector-k8s"
-
-config:
- receivers:
- otlp:
- protocols:
- grpc:
- endpoint: 0.0.0.0:4317
- http:
- endpoint: 0.0.0.0:4318
-
- exporters:
- # Exporter for traces to traffic mirror (used by the common config)
- otlp:
- endpoint:
- auth:
- authenticator: bearertokenauth
-
- extensions:
- bearertokenauth:
- scheme: SUSEObservability
- token: "${env:API_KEY}"
-
- service:
- extensions: [health_check, bearertokenauth]
- pipelines:
- traces:
- receivers: [otlp]
- processors: [batch]
- exporters: [otlp]
- metrics:
- receivers: [otlp]
- processors: [batch]
- exporters: [otlp]
- logs:
- receivers: [otlp]
- processors: [batch]
- exporters: [otlp]
-
-ingress:
- enabled: true
- annotations:
- kubernetes.io/ingress.class: ingress-nginx-external
- nginx.ingress.kubernetes.io/ingress.class: ingress-nginx-external
- nginx.ingress.kubernetes.io/backend-protocol: GRPC
- # "12.34.56.78/32" IP address of NatGateway in the VPC where the otel data is originating from
- # nginx.ingress.kubernetes.io/whitelist-source-range: "12.34.56.78/32"
- hosts:
- - host: "otlp-collector-proxy.${CLUSTER_NAME}"
- paths:
- - path: /
- pathType: ImplementationSpecific
- port: 4317
- tls:
- - secretName: ${CLUSTER_NODOT}-ecc-tls
- hosts:
- - "otlp-collector-proxy.${CLUSTER_NAME}"
-```
-{% endcode %}
-
-
-### Ingress Source Range Whitelisting
-
-To emphasize the role of the proxy collector as a security measure, it is recommended to use a source-range whitelist
-to filter out data from untrusted and/or unknown sources. In contrast, the SUSE Observability ingestion collector may
-have to accept data from multiple sources, maintaining a whitelist on that level does not scale well.
\ No newline at end of file
diff --git a/setup/otel/sampling.md b/setup/otel/sampling.md
new file mode 100644
index 000000000..89a8f33b5
--- /dev/null
+++ b/setup/otel/sampling.md
@@ -0,0 +1,118 @@
+---
+description: SUSE Observability
+---
+
+# Sampling
+
+Sampling is used to reduce the volume of data that is exported to SUSE Observability, while compromising the quality of the telemetry data as little as possible. The main reason to apply sampling is to reduce cost (of network, storage, etc).
+
+If your applications generate little data there is no need for sampling and it can even hinder observability due to a lack of telemetry data. However if your application has a significant amount of traffic, for example more than 1000 spans per second, it can already make sense to apply sampling.
+
+There are two main types of sampling: head sampling and tail sampling.
+
+## Head sampling
+
+Head sampling makes the sampling decision (whether to export the data or not) as early as possible. Therefore the decision cannot be based on the entire trace but only on the very limited information that is available at that point. The Open Telemetry collector has the [probabilistic sampling processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/probabilisticsamplerprocessor), which implements consistent probability sampling. The sampler is configurable and makes a sampling decision based on the trace id (useful for traces) or on a hash of an attribute (useful for logs). This ensures that the spans of a trace are either all sampled or all dropped, so you will have complete traces in SUSE Observability.
+
+The advantages of head sampling are:
+* Easy to understand
+* Efficient
+* Simple to configure
+
+A downside is that it is impossible to make sampling decisions based on an entire trace, for example to sample all failed traces and only a small selection of the successful traces.
+
+To enable head sampling, configure the processor and include it in the pipelines (a pipeline sketch follows the example below). This example samples 1 out of 4 traces based on the trace id:
+
+```yaml
+processors:
+ probabilistic_sampler:
+ sampling_percentage: 25
+ mode: "proportional"
+```
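+
+A minimal sketch of referencing the sampler in a traces pipeline (the receiver, other processors and exporter names are assumptions based on the getting started configuration):
+
+```yaml
+service:
+  pipelines:
+    traces:
+      receivers: [otlp]
+      processors: [memory_limiter, probabilistic_sampler, batch]
+      exporters: [otlp/suse-observability]
+```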
+
+## Tail sampling
+
+Tail sampling postpones the sampling decision until a trace is (almost) complete. This allows the tail sampler to make decisions based on the entire trace, for example to always sample failed traces and/or slow traces. There are many more possibilities of course. The Open Telemetry collector has a [tail sampling processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor) to apply tail sampling.
+
+The main advantage of tail sampling is the much greater flexibility it provides in making sampling decisions. But it comes at a price:
+* Harder to configure properly and understand
+* Must be stateful to store the spans for traces until a sampling decision is made
+* Therefore also (a lot) more resource usage
+* The sampler might not keep up and needs extra monitoring and scaling for that
+
+To enable tail sampling, configure the processor and include it in the pipelines:
+
+```yaml
+processors:
+ tail_sampling:
+ decision_wait: 10s
+ policies:
+ - name: rate-limited-composite
+ type: composite
+ composite:
+ max_total_spans_per_second: 500
+ policy_order: [errors, slow-traces, rest]
+ composite_sub_policy:
+ - name: errors
+ type: status_code
+ status_code:
+ status_codes: [ ERROR ]
+ - name: slow-traces
+ type: latency
+ latency:
+ threshold_ms: 1000
+ - name: rest
+ type: always_sample
+ rate_allocation:
+ - policy: errors
+ percent: 33
+ - policy: slow-traces
+ percent: 33
+ - policy: rest
+ percent: 34
+```
+
+The example samples:
+* A maximum of 500 spans per second
+* All spans in traces that have errors, up to 33% of the maximum
+* All spans in traces slower than 1 second, up to 33% of the maximum
+* Other spans, up to the maximum rate allowed
+
+For more details on the configuration options and the different policies, see the [tail sampling readme](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor).
+
+It is, however, not completely set-it-and-forget-it: if its resource usage starts growing you might want to scale out to multiple collectors to handle the tail sampling, which will then also require [routing](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/connector/routingconnector/README.md) to route traffic based on the trace id.
+
+## Sampling traces in combination with span metrics
+
+In the getting started section the collector configuration doesn't include sampling. When adding sampling we want to keep the metrics that are calculated from traces as accurate as possible. Tail sampling in particular can result in very skewed metrics, because the sampled traces typically over-represent errors. To avoid this we split the traces pipeline into multiple parts and connect them with the forward connector. Modify the config to include the extra connector and the sampling processor, and modify the pipelines as shown here:
+
+```yaml
+connectors:
+ # enable the forwarder
+ forward:
+processors:
+ # Configure the probabilistic sampler to sample 25% of the traffic
+ probabilistic_sampler:
+ sampling_percentage: 25
+ mode: "proportional"
+service:
+ pipelines:
+ traces:
+ receivers: [otlp]
+ processors: [memory_limiter, resource]
+ exporters: [forward]
+ traces/spanmetrics:
+ receivers: [forward]
+ processors: []
+ exporters: [spanmetrics]
+ traces/sampling:
+ receivers: [forward]
+ processors: [probabilistic_sampler, batch]
+      exporters: [debug, otlp/suse-observability]
+ metrics:
+ receivers: [otlp, spanmetrics, prometheus]
+ processors: [memory_limiter, resource, batch]
+      exporters: [debug, otlp/suse-observability]
+```
+
+The example uses the probabilistic sampler configured to sample 25% of the traffic. You'll likely want to tune the percentage for your situation or switch to the [tail sampler](#tail-sampling) instead. The pipeline setup is the same for the tail sampler; just replace the reference to `probabilistic_sampler` with `tail_sampling`.
diff --git a/setup/otel/troubleshooting.md b/setup/otel/troubleshooting.md
index 97214d7f7..f5ec90e47 100644
--- a/setup/otel/troubleshooting.md
+++ b/setup/otel/troubleshooting.md
@@ -32,14 +32,13 @@ To ensure the api key is configured correctly check that:
1. the secret contains a valid API key (verify this in SUSE Observability)
2. the secret is used as environment variables on the pod
3. the `bearertokenauth` extension is using the correct scheme and the value from the `API_KEY` environment variable
-4. the `bearertokenauth` extension is used by the `otlp/stackstate` exporter
+4. the `bearertokenauth` extension is used by the `otlp/suse-observability` exporter
### Some proxies and firewalls don't work well with gRPC
-If the collector needs to send data through a proxy or a firewall it can be that they either block the traffic completely or possibly drop some parts of the gRPC messages or unexpectedly drop the long-lived gRPC connection completely. The easiest fix is to switch from gRPC to use HTTP instead, by replacing the `otlp/stackstate` exporter configuration and all its references with the `otlphttp/stackstate` exporter which is already configured and ready.
+If the collector needs to send data to SUSE Observability through a proxy or a firewall, it can happen that the traffic is blocked completely, that parts of the gRPC messages are dropped, or that the long-lived gRPC connection is closed unexpectedly. The easiest fix is to switch from gRPC to HTTP, by replacing the `otlp/suse-observability` exporter configuration and all its references with the `otlphttp/suse-observability` exporter, which is already configured and ready.
-
-Here `` is similar to the ``, but instead of a `otlp-` prefix it has `otlp-http-` prefix, for example, `otlp-http-play.stackstate.com`.
+The OTLP over HTTP endpoint is similar to the OTLP endpoint, but instead of an `otlp-` prefix it has an `otlp-http-` prefix, for example `otlp-http-play.stackstate.com`. For more details see the [collector configuration](./collector.md#exporters).
## The instrumented application cannot send data to the collector
@@ -52,15 +51,11 @@ If the SDK logs network connection timeouts it can be that either there is a mis
### The language SDK doesn't support gRPC
-Not all language SDKs have support for gRPC. If OTLP over gRPC is not supported it is best to switch to OTLP over HTTP. The [SDK exporter config](./languages/sdk-exporter-config.md#grpc-vs-http) describes how to make this switch.
+Not all language SDKs have support for gRPC. If OTLP over gRPC is not supported it is best to switch to OTLP over HTTP. The [SDK exporter config](./instrumentation/sdk-exporter-config.md#grpc-vs-http) describes how to make this switch.
### The language SDK uses the wrong port
-Using the wrong port usually appears as a connection error but can also show up as network connections being unexpectedly closed. Make sure the SDK exporter is using the right port when sending data. See the [SDK exporter config](./languages/sdk-exporter-config.md#grpc-vs-http).
-
-### Some proxies and firewalls don't work well with gRPC
-
-If the collector needs to send data through a proxy or a firewall it can be that they either block the traffic completely or possibly drop some parts of the gRPC messages or unexpectedly drop the long-lived gRPC connection completely. The [SDK exporter config](./languages/sdk-exporter-config.md#grpc-vs-http) describes how to switch from gRPC to HTTP instead.
+Using the wrong port usually appears as a connection error but can also show up as network connections being unexpectedly closed. Make sure the SDK exporter is using the right port when sending data. See the [SDK exporter config](./instrumentation/sdk-exporter-config.md#grpc-vs-http).
## Kubernetes pods with hostNetwork enabled
diff --git a/use/security/k8s-ingestion-api-keys.md b/use/security/k8s-ingestion-api-keys.md
index 132caabfe..b6863e20f 100644
--- a/use/security/k8s-ingestion-api-keys.md
+++ b/use/security/k8s-ingestion-api-keys.md
@@ -2,9 +2,19 @@
description: SUSE Observability
---
-# Ingestion API Keys
+# API Keys
-## Overview
+API keys are used for sending telemetry data to SUSE Observability. SUSE Observability offers two types of API keys:
+- Receiver API Key: This key is typically generated during the initial installation of your SUSE Observability instance, and it never expires.
+- Ingestion API Key: You can create Ingestion API Keys using the SUSE Observability CLI (`sts`). These keys have expiration dates, requiring periodic rotation for continued functionality.
+
+The Receiver API key can be found in your `values.yaml` as the `receiverApiKey`, but you can also find it in the installation instructions of the StackPacks. For example, if you installed the Kubernetes StackPack:
+1. Open SUSE Observability
+2. Navigate to StackPacks and select the Kubernetes StackPack
+3. Open one of the installed instances
+4. Scroll down to the first set of installation instructions. It shows the API key as `STACKSTATE_RECEIVER_API_KEY` in text and as `'stackstate.apiKey'` in the command.
+
+## Ingestion API Keys
Ingestion API Keys are used by external tools to ingest data (like metrics, events, traces and so on) to the SUSE Observability cluster.
These tools can be STS Agent or/and OTel Collector.
@@ -74,14 +84,13 @@ An Ingestion API Key can be deleted using the `sts` CLI. Pass the ID of the Key
✅ Ingestion Api Key deleted: 250558013078953
```
-## Authenticating using service tokens
+## Authenticate using Ingestion API keys
Once created, an Ingestion API Key can be used to authenticate:
-- stackstate-k8s-agent
+- suse-observability-agent
- OTel Collector
-
-### stackstate-k8s-agent
+### suse-observability-agent
The SUSE Observability agent requires an API key for communication, historically known as the Receiver API Key. SUSE Observability now offers two options for authentication:
- Receiver API Key: This key is typically generated during the initial installation of your SUSE Observability instance,
@@ -92,20 +101,18 @@ The SUSE Observability agent requires an API key for communication, historically
When using the SUSE Observability collector, you'll need to include an `Authorization` header in your configuration. The collector accepts either a Receiver API Key or an Ingestion API Key for authentication.
The following code snippet provides an example configuration:
```yaml
- extensions:
- bearertokenauth:
- scheme: SUSE Observability
- token: "${env:API_KEY}"
-
- ...
-
- exporters:
- otlp/stackstate:
- auth:
- authenticator: bearertokenauth
- endpoint: :443
- otlphttp/stackstate:
- auth:
- authenticator: bearertokenauth
- endpoint: https://
+extensions:
+ bearertokenauth:
+ scheme: SUSE Observability
+ token: "${env:API_KEY}"
+exporters:
+ otlp/suse-observability:
+ auth:
+ authenticator: bearertokenauth
+ endpoint: :443
+ # or
+ otlphttp/suse-observability:
+ auth:
+ authenticator: bearertokenauth
+ endpoint: https://
```
\ No newline at end of file