@michel-laterman
Contributor

What does this PR do?

Fixes flaky TestFleetDownloadProxyURL test by changing how agent status is gathered.

Related issues

@michel-laterman requested a review from a team as a code owner December 18, 2025 20:33
@michel-laterman added the Team:Elastic-Agent-Control-Plane label Dec 18, 2025
@michel-laterman added the flaky-test label Dec 18, 2025
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@mergify
Contributor

mergify bot commented Dec 18, 2025

This pull request does not have a backport label. Could you fix it @michel-laterman? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-./d./d is the label that automatically backports to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

Comment on lines +863 to +865
// ExecStatusRaw executes `elastic-agent status --output=json`.
//
// Returns the output parsed as map[string]any and the error from the execution. Keep in mind the agent exits with status 1 if it's
Contributor Author

The agent control server uses a non-standard format when serializing timestamps, which breaks the ExecStatus command. I've added this method to work around that.

Member

This is weird, how is something formatted with time.Format not a valid time.Time?

Do you have an example of the error this causes?

I don't like just duplicating the function like this, it would be better to fix things in place, but I can't suggest alternatives unless I know what specific error we are working around.

Contributor Author

@michel-laterman Dec 18, 2025

agent status returned an error: could not unmarshal agent status output: parsing time "2025-12-18 19:02:32 +0000 UTC" as "2006-01-02T15:04:05Z07:00": cannot parse " 19:02:32 +0000 UTC" as "T".

By default, Go tries to parse time strings using the RFC3339 nano format (docs). The control server does not use this format when serializing times to send to the client.
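
For reference, here is a minimal sketch that reproduces the failure outside the test. The `status` struct, its `upgrade_time` field, and the `decodeStatus` helper are hypothetical stand-ins, not the agent's real status types. The first timestamp is copied from the error above; its layout matches what `time.Time.String()` produces, though whether the control server actually calls `String()` is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// status is a hypothetical stand-in for the status payload.
// The field name is illustrative only.
type status struct {
	UpgradeTime time.Time `json:"upgrade_time"`
}

// decodeStatus unmarshals raw JSON into status, relying on the
// default time.Time JSON decoding, which expects RFC 3339.
func decodeStatus(raw string) (status, error) {
	var s status
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

func main() {
	// Timestamp taken from the error message above: decoding fails.
	_, err := decodeStatus(`{"upgrade_time":"2025-12-18 19:02:32 +0000 UTC"}`)
	fmt.Println("non-standard layout:", err)

	// The same instant written as RFC 3339 decodes without error.
	s, err := decodeStatus(`{"upgrade_time":"2025-12-18T19:02:32Z"}`)
	fmt.Println("RFC 3339 layout:", s.UpgradeTime, err)
}
```

Decoding into `map[string]any` sidesteps the problem because the timestamp then stays a plain string and is never parsed as a `time.Time`.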

Member

Is there a drawback to changing to RFC3339? It seems like this would always unconditionally fail without a custom unmarshal implementation. Is this just the first time anyone has ever tried to unmarshal this as JSON in Go?

This is clearly just an artifact of the control server. The upgrade watcher uses gRPC instead, so I suspect it doesn't hit this because it never unmarshals the status as JSON.

Why do we have this non-standard format?
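
To make the "custom unmarshal implementation" option concrete, here is a rough sketch of what a tolerant decoder could look like. It is not code from this PR: `lenientTime`, `stringLayout`, and `parseLenient` are hypothetical names, and the fallback layout is assumed to be the one used by `time.Time.String()`, which matches the timestamp in the error quoted earlier.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// stringLayout mirrors the layout produced by time.Time.String().
// Assumption: this is the non-standard layout the status output uses.
const stringLayout = "2006-01-02 15:04:05.999999999 -0700 MST"

// lenientTime is a hypothetical wrapper that accepts RFC 3339 first
// and falls back to stringLayout.
type lenientTime struct{ time.Time }

// UnmarshalJSON overrides the method promoted from time.Time.
func (t *lenientTime) UnmarshalJSON(b []byte) error {
	if string(b) == "null" {
		return nil
	}
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	if parsed, err := time.Parse(time.RFC3339Nano, s); err == nil {
		t.Time = parsed
		return nil
	}
	parsed, err := time.Parse(stringLayout, s)
	if err != nil {
		return fmt.Errorf("parsing %q with either layout: %w", s, err)
	}
	t.Time = parsed
	return nil
}

// parseLenient decodes a single JSON string value into a time.Time.
func parseLenient(raw string) (time.Time, error) {
	var t lenientTime
	err := json.Unmarshal([]byte(raw), &t)
	return t.Time, err
}

func main() {
	a, errA := parseLenient(`"2025-12-18 19:02:32 +0000 UTC"`)
	b, errB := parseLenient(`"2025-12-18T19:02:32Z"`)
	fmt.Println(errA, errB, a.Equal(b))
}
```

Serializing with RFC 3339 on the server side would make a wrapper like this unnecessary, since the default `time.Time` JSON decoding would then succeed on its own.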

Contributor Author

I can't think of a drawback from a technical standpoint; I've made a draft to change it here: #11923.
I don't know why the non-standard format was chosen, perhaps @blakerouse or @michalpristas would know?

Labels

  • flaky-test: Unstable or unreliable test cases.
  • skip-changelog
  • Team:Elastic-Agent-Control-Plane: Label for the Agent Control Plane team

Development

Successfully merging this pull request may close these issues.

[Flaky Test]: TestFleetDownloadProxyURL – Unable to verify that upgrade has failed.

3 participants