
Conversation

@vasconsaurus
Contributor

@vasconsaurus vasconsaurus commented Sep 12, 2025

Updates the collector configuration to handle receiving and exporting metrics from multiple services.

Important

When thinking about adding multiple receivers/exporters, we need to take the following into consideration:

  • unlike most otel-collector receivers and exporters, we can't add multiple prometheus receivers by aliasing (e.g. prometheus/service_1, prometheus/service_2). We can have only one prometheus receiver, which can have multiple jobs.
  • unlike when sending traces, we need to specify the dataset when exporting metrics to Honeycomb via the otlp exporter.

The way we dealt with this is by using the routing connector (a minimal sketch follows the list):

  • we have a pipeline with the prometheus receiver, and the connector functioning as the exporter
  • we have a second pipeline with the connector functioning as the receiver, and otlp set up as the exporter
  • inside the connector we have a table where we use the service name to decide which pipeline the metrics should go into
    • the prometheus receiver sets the service name from the job_name
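
A minimal sketch of how these pieces fit together (component aliases like routing/metrics and otlp/pender_metrics, the env var names, and the Honeycomb endpoint/header values are illustrative assumptions, not necessarily the exact config in this PR):

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: pender   # the receiver sets service.name from this job_name
          scrape_interval: 15s
          static_configs:
            - targets: ["${env:pender_metrics_endpoint}"]

connectors:
  routing/metrics:
    # default_pipelines: [metrics/debug]   # optional catch-all (see Notes)
    table:
      - statement: route() where attributes["service.name"] == "pender"
        pipelines: [metrics/pender]

exporters:
  otlp/pender_metrics:
    endpoint: api.honeycomb.io:443
    headers:
      x-honeycomb-team: "${env:honeycomb_api_key}"
      x-honeycomb-dataset: pender   # metrics must name a dataset explicitly

service:
  pipelines:
    metrics:            # prometheus receiver -> routing connector
      receivers: [prometheus]
      exporters: [routing/metrics]
    metrics/pender:     # routing connector -> honeycomb
      receivers: [routing/metrics]
      exporters: [otlp/pender_metrics]
```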

If we want to send prometheus metrics from another service, we would need to (see the sketch after this list):

  • add the endpoint variable to the otel-collector configuration in the docker-compose file (local development)
  • add a new job to prometheus' scrape_configs
  • add another entry to the routing connector table
  • add another exporter, otlp/service_metrics, with the correct Dataset (since we are also sending metrics to Honeycomb)
  • add another pipeline, metrics/service, with the new exporter
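
For example, the additions for a hypothetical new service check_api might look like this (all names, the env var, and the dataset value are assumptions):

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        # ...existing pender job stays as-is...
        - job_name: check_api   # new job; becomes the service name used for routing
          scrape_interval: 15s
          static_configs:
            - targets: ["${env:check_api_metrics_endpoint}"]

connectors:
  routing/metrics:
    table:
      # ...existing pender entry stays as-is...
      - statement: route() where attributes["service.name"] == "check_api"
        pipelines: [metrics/check_api]

exporters:
  otlp/check_api_metrics:
    endpoint: api.honeycomb.io:443
    headers:
      x-honeycomb-team: "${env:honeycomb_api_key}"
      x-honeycomb-dataset: check_api   # separate dataset per service

service:
  pipelines:
    metrics/check_api:
      receivers: [routing/metrics]
      exporters: [otlp/check_api_metrics]
```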

Notes

  • This does add some duplication, but it seems like the most straightforward way to deal with this, at least for now
  • There is a tool to dynamically allocate targets for the prometheus receiver, but it is specific to Kubernetes, so not useful for us right now
  • Regarding the connector and debugging: default_pipelines is where anything that does not match a table entry is sent. We currently only use it when debugging. To debug the connector, you can also use the debug pipeline (commented out) together with the debug exporter; a sketch follows this list.
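
A sketch of what that debug path could look like (pipeline and exporter names assumed):

```yaml
exporters:
  debug:
    verbosity: detailed   # prints each routed metric to the collector's stdout

connectors:
  routing/metrics:
    default_pipelines: [metrics/debug]   # anything unmatched lands here

service:
  pipelines:
    metrics/debug:
      receivers: [routing/metrics]
      exporters: [debug]
```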

I also renamed it to be more pender-specific, to make it obvious
that those are only pender-related metrics.

Adding other exporters and pipelines seems pretty straightforward,
just add '/service_name'. But adding a second prometheus receiver, or making
sure it sends scraped data to the correct pipeline, seems more complex.
@vasconsaurus vasconsaurus requested a review from dmou September 12, 2025 18:39
```diff
   scrape_interval: 15s
   static_configs:
-    - targets: ["pender:3200"]
+    - targets: ["${env:pender_metrics_endpoint}"]
```
Contributor

what if the environment variable was just prometheus_targets and included the ["..."] text? 🙂

Contributor Author

A question: let's say the targets are the pender endpoint and the check-api endpoint. How do we make sure we send them to the correct exporter and dataset?

I'm wondering if we would have separate prometheus configs for each endpoint, or if we could use one for all of them.

Or does that not matter?

Contributor

ah right, the way we're using datasets here has them as one per service, and metrics in Honeycomb are required to be attached to a dataset, so we will need to have a separate exporter for each service

we could have them all use the same receiver and then just filter per service based on the metric attribute app or something like that using a processor (https://opentelemetry.io/docs/collector/configuration/#processors), but it might be simpler to just have completely separate pipelines 😒
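
A rough sketch of that filtering alternative, assuming the contrib filter processor with an OTTL condition and illustrative names (one pipeline per service, all sharing the one prometheus receiver):

```yaml
processors:
  filter/pender_only:
    error_mode: ignore
    metrics:
      metric:
        # drop any metric whose resource is not pender
        - 'resource.attributes["service.name"] != "pender"'

service:
  pipelines:
    metrics/pender:
      receivers: [prometheus]            # shared receiver fans out to each pipeline
      processors: [filter/pender_only]
      exporters: [otlp/pender_metrics]
```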

Contributor Author

I found this blog post, which I think might be relevant: https://www.honeycomb.io/blog/simplify-opentelemetry-pipelines-headers-setter

For now, do we want to assume completely separate pipelines or not? If completely separate pipelines, do you still want changes to the env var?

Contributor

let's assume separate pipelines and keep the env var you have! I think you'll have to change the prometheus receiver name to be prometheus/pender or something similarly unique though (https://github.com/open-telemetry/opentelemetry-collector/blob/main/receiver/README.md#configuring-receivers)

Contributor Author

I think that does not work for the prometheus receiver: open-telemetry/opentelemetry-operator#3034

Contributor

... well then. it looks like we will have to run an otel collector for each service 😒 or try to get the processor filtering above working

ew

Contributor Author

We can set up different jobs from Prometheus, but the issue here would be how to send each to the correct Dataset, right?

Maybe we can take a step back and just have one main 'check' Dataset, instead of one per service? I just assumed one per service made sense. Then we could have a more generic approach. If we need a different configuration we could set up a second job; if we don't, we can pass the environment variables as you first suggested.

This would make the configuration easier, I think. What do you think? Are there any drawbacks?

So we can route a specific service's prometheus metrics
to that service's honeycomb dataset.

Notes:
1. We don't need to send the dataset when sending traces,
but we still do when sending metrics.
2. While we can alias most otel-collector components, e.g.
otlp/pender_metrics, the prometheus receiver does not support that,
so we need the routing connector to send to the right pipeline.
(There is a target-allocation feature in the prometheus receiver,
but it is specific to kubernetes.)
@vasconsaurus
Contributor Author

vasconsaurus commented Sep 25, 2025

@dmou I was able to get metrics in Honeycomb using the routing connector 🎉
If you have some time, please review and let me know what you think.

@dmou
Contributor

dmou commented Sep 25, 2025

> @dmou I was able to get metrics in Honeycomb using the routing connector 🎉 If you have some time, please review and let me know what you think.

amazing! could you update the PR description with how we would add an additional service? just to have some record of what resources would need to be duplicated and which ones can remain 🙂

@vasconsaurus vasconsaurus changed the title from "6473 – Update collector configuration with endpoint environment variable" to "6473 – Update collector configuration to allow multiple services" Sep 29, 2025
@vasconsaurus vasconsaurus requested a review from dmou September 29, 2025 14:05
@vasconsaurus
Contributor Author

> @dmou I was able to get metrics in Honeycomb using the routing connector 🎉 If you have some time, please review and let me know what you think.

> amazing! could you update the PR description with how we would add an additional service? just to have some record of what resources would need to be duplicated and which ones can remain 🙂

@dmou, I updated the PR, let me know what you think.

Contributor

@dmou dmou left a comment

lgtm if you're intentionally keeping the "for debugging purposes" lines for now!

@vasconsaurus vasconsaurus merged commit 5ba61d6 into develop Sep 29, 2025
1 check passed
@vasconsaurus vasconsaurus deleted the 6473-update-otelconfig-endpoint-var branch September 29, 2025 15:13