
Conversation

@JakobJingleheimer
Member

@JakobJingleheimer JakobJingleheimer commented Nov 10, 2025

This PR adds support for an xfail directive (similar to skip and todo), whereby a user can flag a test-case as expected to fail.
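
For context, a minimal sketch of the proposed usage, mirroring the existing skip/todo directives (the test names and bodies below are illustrative only):

import { test, it } from 'node:test';

// Via the options object, like { skip } and { todo }:
test('parses trailing commas', { xfail: 'bug #382' }, (t) => {
  // assertions that are currently known to throw
});

// Or via the method form, like it.skip / it.todo:
it.xfail('parses trailing commas', (t) => {
  // assertions that are currently known to throw
});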

@JakobJingleheimer JakobJingleheimer added the test_runner Issues and PRs related to the test runner subsystem. label Nov 10, 2025
@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/test_runner

@nodejs-github-bot nodejs-github-bot added the needs-ci PRs that need a full CI run. label Nov 10, 2025
@vassudanagunta
Contributor

vassudanagunta commented Nov 24, 2025

There is long-established precedent for xfail in other test frameworks and ecosystems.

What is the commitment level and timeline for this? I ask because I've already implemented xfail, xpass and some other additional feature extensions to node:test in a yet-to-be-released node-test-extra package (like fs-extra is to node:fs). I'd be delighted if xfail were subsumed into node:test. Because I've put a lot of thought into it, I might have something to contribute to nodejs on this. I could write up my thoughts, collaborate with @JakobJingleheimer, and/or create an alternate PR.

lmk, please

@JakobJingleheimer
Member Author

JakobJingleheimer commented Nov 24, 2025

Ah, sure. I'm not particular about the name as long as it's intuitive (which xfail seems to be).

I expect this could land pretty quickly: quite a few collaborators had previously voiced support for it, and to my knowledge there have been no general concerns raised against the feature itself.

Aside from the name (which is trivial to update in the PR), the remaining work is:

  • find where the result data gets passed to reporters (and add this property to that—the actual implementation work there is probably ~5 seconds)
  • update the docs

Creating competing PRs is never desirable and generally pisses off the person / people already working on it.

Happy to collaborate on the feature; but aside from influencing the name, I think this is very straightforward, with very little implementation work remaining (almost none, really).

In terms of timeline to finish this, I expect to have time this week (the past ~2 weeks were very tumultuous for me).

@vassudanagunta
Contributor

Additional features that my version has:

  • option to specify the expected error. I think this is really important. If for example you are documenting/reproducing/tracking via a test a bug that is in released code, you will want it to xfail, since it ought not fail CI, but only if it fails in the expected way (a sketch follows this list).

  • command line option to treat xfails as regular tests, i.e. treating them as test failures. This is useful when you want to see the failure details, which most reporters only show for failed tests. With node-test-extra, it will also show all debug/console logging, which is otherwise suppressed.

    • I am considering augmenting that command line option with a filter arg, so you can filter which xfails are treated as regular tests.
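
A hypothetical sketch of what those two extras could look like; the expectedError option name and the --test-xfail-as-fail flag are assumptions for illustration, not the actual node-test-extra API:

import { test } from 'node:test';

// Only a failure matching the expected error lets the xfail count as expected;
// any other failure still fails CI.
test('issue #382: trailing comma crashes the parser', {
  xfail: 'released bug, tracked in issue #382',
  expectedError: /unexpected token/i,  // hypothetical: matched against err.message
}, () => {
  JSON.parse('[1, 2,]');  // currently throws SyntaxError: Unexpected token …
});

// Hypothetical command line flag to treat xfails as regular failures,
// so reporters show full failure details and logging:
//   node --test --test-xfail-as-fail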

I think that's it, though I'll have to take a look at

@JakobJingleheimer
Member Author

Alex and I considered those and pushed them to v2 to get the base implementation landed sooner (many users probably won't need the v2 extras, so no need to make them wait for things they don't want/need). If you already have an implementation for that, I'm happy to defer to you for that once this lands 🙂

@vassudanagunta
Contributor

vassudanagunta commented Nov 24, 2025

Deferring to v2 makes sense, but I think you should quickly consider how it would be added to the API, just to avoid potential API thrashing or getting painted into a corner. In the implementation, you have a string value given for xfail that is treated as a reporting message, just like todo. That makes sense:

test('this should do that', { xfail: 'bug #382' }, (t) => {
   //...
});

How would you, in v2, specify the expected failure? An xfail-error property that specifies the error message or pattern? Something that can be parsed out of the existing xfail property? Or within the test function, e.g. via a new assert.xfail method that specifies both the expected value and also the expected erroneous value? Or...?

@vassudanagunta
Contributor

vassudanagunta commented Nov 24, 2025

Arguably a new kind of assert might be better than adding an xfail marker as a test property.

assert.xfail(expected, actual, expectedErroneous?, message?)

OTOH, maybe failure assertions should be reserved for testing that the code throws the correct exception under specific conditions.
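
For illustration, a sketch of how that proposed signature might read in use; assert.xfail does not exist in node:assert, and both the values and the message are made up:

import test from 'node:test';
import assert from 'node:assert';

test('adds floats to the documented precision', () => {
  // Hypothetical: expected value, actual value, the known-erroneous value, reason.
  // The assertion "passes" (as an expected failure) while the bug still
  // produces the erroneous value, and flags an unexpected pass once it's fixed.
  assert.xfail(0.3, 0.1 + 0.2, 0.30000000000000004, 'bug #382: float rounding');
});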

@JakobJingleheimer
Member Author

That's contrary to the intention of this feature: the test should be identical to the non-failing version. This is a flag to flip.

assert.xfail is a different feature 🙂

@JakobJingleheimer
Member Author

you have a string value given for xfail that is treated as a reporting message, just like todo. That makes sense:


test('this should do that', { xfail: 'bug #382' }, (t) => {

   //...

});

We were thinking that, in the case of it.fail, any failure is okay.

In the case of { fail }, it would accept true, a string, or a regex to compare against err.message (and maybe an instance of Error/AssertionError).

For all of these, there is no API thrashing as they're all truthy/falsy, which makes them additive to the current API.
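
A sketch of those additive shapes, written with the xfail name the PR later settled on (this comment predates the fail → xfail rename); only the truthy/falsy forms are in this PR, and the string/regex forms are the proposed later behaviour:

import { it } from 'node:test';

// v1: any failure is accepted
it.xfail('known-broken case', () => { throw new Error('anything'); });
it('known-broken case (options form)', { xfail: true }, () => { throw new Error('anything'); });

// Proposed follow-up: only a failure matching the string/regex is accepted
it('should add', { xfail: 'Expected 2 to equal 1' }, () => { /* … */ });
it('should add', { xfail: /expected 2 to equal 1/i }, () => { /* … */ });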

@vassudanagunta
Contributor

vassudanagunta commented Nov 24, 2025

It all sounds good to me. I'm glad this is getting into node:test. Thank you!

As to reporting, I think you have three options:

  1. xfail is a special case of pass, if you want existing reporters that don't know about xfail to report it as a regular pass.

  2. xfail is a special case of todo, if you want existing reporters that don't know about xfail to report it as a todo. This makes conceptual sense, since it is something that shouldn't be treated as "all green" and should instead be highlighted as needing to be addressed.

  3. You don't worry about breaking existing reporters, and xfail is reported distinctly from both pass and todo.

In my implementation, I chose option 2. A failing xfail generates a test todo event, where the todo message is prefixed with "xfail:". A passing xfail generates a test fail event, with an error indicating that it was an xfail that unexpectedly passed.

This allows my xfail aware reporter to produce a more informative report (it formats xfail results differently from both pass and todo), while degrading gracefully for legacy reporters.
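
For readers wanting to see how that graceful degradation plays out, here is a minimal sketch of an xfail-aware custom reporter under that option-2 convention. It assumes only the documented custom-reporter interface (an async generator over TestsStream events) plus the "xfail:"-prefixed todo message described above, which is a convention of that implementation, not a node:test API:

// xfail-aware-reporter.mjs
// Usage: node --test --test-reporter=./xfail-aware-reporter.mjs
export default async function* xfailAwareReporter(source) {
  for await (const event of source) {
    if (event.type === 'test:pass') {
      const { name, todo } = event.data;
      // Convention above: a failing xfail surfaces as a todo whose message
      // is prefixed with "xfail:", so legacy reporters still show a todo.
      if (typeof todo === 'string' && todo.startsWith('xfail:')) {
        yield `expected failure: ${name} (${todo.slice('xfail:'.length).trim()})\n`;
      } else {
        yield `pass: ${name}\n`;
      }
    } else if (event.type === 'test:fail') {
      // An unexpectedly passing xfail arrives here as an ordinary failure.
      yield `fail: ${event.data.name}\n`;
    }
  }
}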

@JakobJingleheimer
Member Author

I (and I think Alex too?) was imagining the result of it.xfail would go into the existing pass (when it throws) or fail (when it doesn't throw) and would additionally have a property expectedFail: true (not married to the prop name either; if a well-known one exists that isn't crazy, happy to go with that, especially if reporters already support it).

Reporters that support the extra prop do so, and those that don't just behave "normally" (even the uninformed report is still correct).

@vassudanagunta
Contributor

OK, that's what I meant by option 1. Since I've implemented an xfail-aware custom reporter, it's all good for my purposes.

[Screenshot, 2025-11-24: xfail-aware custom reporter output]

@JakobJingleheimer
Member Author

Huzzah! TAP reporter is now consuming/surfacing xfail. The others aren't yet because I can't find where they compose their output text.

[Screenshot: TAP reporter output consuming 'xfail' 🎉]
[Screenshot: Spec reporter output missing 'xfail']

@vassudanagunta
Contributor

vassudanagunta commented Dec 6, 2025

The others aren't yet because I can't find where they compose their output text

Jakob, both the spec and dot reporters rely on:

function formatTestReport(type, data, prefix = '', indent = '', hasChildren = false, showErrorDetails = true) {

I know this as I spent a lot of time trying to reverse engineer the semantics of TestsStream, which isn't documented nearly enough for people to write custom reporters without inferring semantics by reading all of the built-in reporter code. I will open a discussion or issue on that topic when I get the chance.

But as part of my reverse engineering, I ended up paying off some test_runner tech debt in precisely this part of the code: #59700. It seems to be falling through the cracks in PR limbo. Maybe we can get that merged? It isn't strictly necessary for your PR, but since we'll have review eyes on this area of code, it would be efficient to do both at the same time?



it.todo('sync pass todo', () => {
it.xfail('sync expect fail (method)', () => {
Contributor


I think you should also add a "sync pass xfail with message" like there is with todo and skip.

This opens up the question of what that should look like for TAP. In my own reporter, I append #XFAIL to the message. The discussion on TAP Directives for XPASS and XFAIL that I linked to in the main thread debates #XFAIL vs #TODO. The latter makes a lot of sense too... see my comment that conceptually an xfail is a special case of todo.
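
To make the trade-off concrete, hypothetical TAP lines for a failing-as-expected xfail under each directive style (neither line is produced by this PR as-is):

# With a custom #XFAIL directive (xfail-aware consumers only):
ok 2 - sync expect fail (method) # XFAIL known bug

# Reusing the standard #TODO directive (legacy TAP consumers degrade gracefully):
ok 2 - sync expect fail (method) # TODO xfail: known bug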

Contributor


Maybe the folks behind the TAP spec, TAPE and tapjs/node-tap can chime in? @isaacs @ljharb

Member Author

@JakobJingleheimer JakobJingleheimer Dec 7, 2025


Do you mean in the case of specifying it via the config object? If so, the value that would subsequently be supported would be the expected failure we talked about in #60669 (comment), so:

it('should add', { xfail: 'Expected 2 to equal 1' }, () => {  });

where the test-case must fail with an error whose message is 'Expected 2 to equal 1'

Or as a regex:

it('should add', { xfail: /expected 2 to equal 1/i }, () => {  });

When it fails with a different error, xfail would then not "pass" (and would be marked "failed", reported as failed, etc).

That would then be used in the test-case reporter output like

✔ should add # Expected 2 to equal 1
✔ should add # /expected 2 to equal 1/i
✖ should add # Expected 2 to equal 1
✖ should add # /expected 2 to equal 1/i

But since that's part of v1.1, I think for now it should not handle any value other than truthy/falsy.
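
A sketch of the matching rule being described, as a hypothetical helper (not code from the PR):

// Hypothetical: does a caught error satisfy the xfail value?
function xfailMatches(xfail, err) {
  if (xfail === true) return true;                          // v1: any failure is fine
  if (typeof xfail === 'string') return err.message === xfail;
  if (xfail instanceof RegExp) return xfail.test(err.message);
  return false;
}

// xfailMatches('Expected 2 to equal 1', new Error('Expected 2 to equal 1'));  // true
// xfailMatches(/expected 2 to equal 1/i, new Error('Expected 2 to equal 1')); // true
// xfailMatches('Expected 2 to equal 1', new Error('boom'));                   // false, reported as failed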

Contributor


I think both a reason (like todo and skip) and expected error should be supported. Example reasons:

✔ should add #XFAIL Issue #389

✔ should add #XFAIL Can't fix until SemVer Major increment

I'm building a system where released bugs are tracked both as a bug-tracking issue and as an XFAIL test case.

I took your comment to mean the following (an update of the current documentation, to illustrate):

  • skip <boolean> | <string> If truthy, the test is skipped. If a string is provided, that string is displayed in the test results as the reason for skipping the test. Default: false.
  • todo <boolean> | <string> If truthy, the test is marked as TODO. If a string is provided, that string is displayed in the test results as the reason why the test is TODO. Default: false.
  • xfail <boolean> | <string> | <RegExp> If truthy, the test is marked as XFAIL. If a string is provided, that string is displayed in the test results as the reason why the test is expected to fail. If a RegExp is provided, only a failure that matches is permitted. Default: false.

Now I see my interpretation was wrong. In any case, that approach would allow EITHER a reason OR an expected error, NOT BOTH. I think both should be supported.

But as you say, we can punt until v1.1. 🤫🤐🙃

Member Author


I think both a reason (like todo and skip) and expected error should be supported.

How? I think that creates a paradox: is it merely a label or is it the expected error? There's no way to know.

Contributor


I see two reasonable ways to do it:

  1. The xfail property works exactly like todo and skip: if it is a string it is the reason. Another optional property would be added to specify a specific error or error pattern, e.g. xfail_error.
  2. xfail:
    • true
    • string: reason
    • RegExp: expected pattern in error message
    • {reason: string, error: RegExp}
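
In usage, the two shapes would read roughly as follows; both the xfail_error name and the object form are proposals from this comment, not implemented API:

import { it } from 'node:test';

// Option 1: xfail stays a skip/todo-style reason; a separate property carries the pattern
it('should add', {
  xfail: 'Issue #389',                    // reason, shown in reporter output
  xfail_error: /expected 2 to equal 1/i,  // expected failure pattern
}, () => { /* … */ });

// Option 2: a single xfail property that also accepts an object
it('should add', {
  xfail: { reason: 'Issue #389', error: /expected 2 to equal 1/i },
}, () => { /* … */ });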

@JakobJingleheimer
Member Author

Jakob, both the spec and dot reporters rely on:

function formatTestReport(type, data, prefix = '', indent = '', hasChildren = false, showErrorDetails = true) {

Huzzah, yes, that's it. Thanks!

@JakobJingleheimer JakobJingleheimer marked this pull request as ready for review December 7, 2025 19:26
@codecov

codecov bot commented Dec 7, 2025

Codecov Report

❌ Patch coverage is 95.83333% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 88.52%. Comparing base (a65421a) to head (ea2f54c).
⚠️ Report is 23 commits behind head on main.

Files with missing lines                 | Patch % | Lines
lib/internal/test_runner/tests_stream.js | 75.00%  | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #60669      +/-   ##
==========================================
- Coverage   88.54%   88.52%   -0.02%     
==========================================
  Files         703      703              
  Lines      208545   208606      +61     
  Branches    40216    40242      +26     
==========================================
+ Hits       184652   184671      +19     
- Misses      15905    15947      +42     
  Partials     7988     7988              
Files with missing lines                   | Coverage Δ
lib/internal/test_runner/harness.js        | 91.82% <100.00%> (ø)
lib/internal/test_runner/reporter/tap.js   | 98.23% <100.00%> (+0.01%) ⬆️
lib/internal/test_runner/reporter/utils.js | 93.26% <100.00%> (+0.13%) ⬆️
lib/internal/test_runner/test.js           | 97.36% <100.00%> (+0.01%) ⬆️
lib/internal/test_runner/tests_stream.js   | 89.65% <75.00%> (-0.35%) ⬇️

... and 40 files with indirect coverage changes


@JakobJingleheimer
Member Author

PR is quasi-blocked by a bug in the Validate commit message workflow (pending fix: nodejs/core-validate-commit#130).

@JakobJingleheimer JakobJingleheimer force-pushed the test_runner/feat/should-fail branch from 241e5dd to e93deac Compare December 21, 2025 10:05
@JakobJingleheimer JakobJingleheimer added the request-ci Add this label to start a Jenkins CI on a PR. label Dec 23, 2025
@github-actions github-actions bot added request-ci-failed An error occurred while starting CI via request-ci label, and manual intervention is needed. and removed request-ci Add this label to start a Jenkins CI on a PR. labels Dec 23, 2025
@github-actions
Contributor

Failed to start CI
   ⚠  Commits were pushed since the last approving review:
   ⚠  - test_runner: support expecting a test-case to fail
   ⚠  - fixup!: rename `fail` → `xfail`
   ⚠  - fixup!: add `getXFail` to `TestStream`
   ⚠  - fixup!: test-case descriptions
   ⚠  - fixup!: add doc for `xfail`
   ⚠  - fixup!: tidy todo comment
   ⚠  - fixup!: add `# EXPECTED FAILURE` to other builtin reporter output
   ⚠  - fixup!: update snapshot
   ⚠  - fixup!: add meta to doc
   ⚠  - fixup!: remove unnecessary code comment
   ⚠  - fixup!: update snapshot
   ⚠  - fixup!: add `xfail` cases to all reporter tests
   ⚠  - fixup!: update snapshots
   ⚠  - fixup!: update snapshot
   ⚠  - fixup!: update API doc to be less "why", just "what"
   ✘  Refusing to run CI on potentially unsafe PR
https://github.com/nodejs/node/actions/runs/20462588334

Member

@pmarchini pmarchini left a comment


LGTM!

@JakobJingleheimer JakobJingleheimer added request-ci Add this label to start a Jenkins CI on a PR. and removed request-ci-failed An error occurred while starting CI via request-ci label, and manual intervention is needed. labels Dec 23, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Dec 23, 2025
@nodejs-github-bot
Collaborator

@JakobJingleheimer
Member Author

I did quite a bit of digging, and it seems there is no consensus on a name. xfail seems the most prevalent (⅔ to ¾ of what I found); some use failing, but it's not even the biggest minority (based on what I could find).

IMO none of the names are particularly good, so I think going with the one most likely to be recognised (xfail) is the better choice.

@vassudanagunta
Contributor

@ljharb's point is that in the JS world, Ava and Jest call it failing. But Jest just copied Ava. And when Ava added it to their reporting output, they went with knownFailure/knownFailureCount instead of failing/failingCount, obviously to disambiguate.

And when you search both Ava and Jest's repos or Issues for "failing" or "failing test", you get more hits for regular test failures, not expected failures. failing is just too ambiguous. It will make documentation harder to write. People will constantly have to clarify what they mean by "failing test".

It doesn't have to be xfail, but I really don't think it should be failing.

@ljharb
Member

ljharb commented Dec 23, 2025

if it's bikeshed time, what about mustFail? It's short, clear, and conveys that it's an error if it doesn't fail.

@vassudanagunta
Contributor

mustFail makes me think of an API test that asserts an error result or exception must occur in a given circumstance.

knownFailure, which Ava uses for reporting, makes a lot of sense.

@JakobJingleheimer
Member Author

I dunno about bike-shedding time 😆 I would say we don't have especially good options; what we currently have may be the best of the worst, so unless a dark horse appears to wow everyone, let's go with what is already done. If something awesome comes up later, we can alias it and phase out the current (and provide a migration; the API won't change, just the name).

@ljharb
Member

ljharb commented Dec 23, 2025

I'm not sure why there's a rush to land something with a name that folks aren't broadly happy with?

@JakobJingleheimer
Member Author

JakobJingleheimer commented Dec 23, 2025

A time-boxed discussion on a better name sounds reasonable, and I don't mind making the change (it's a trivial find+replace since xfail is such a unique needle). But I don't want this to languish whilst users are stuck waiting.

Shall we say end of day CET Saturday, 27 December? I'll be on a long train Sunday, which will give me time to make relevant changes and then babysit CI. I'll be unavailable much of January.

@ljharb
Member

ljharb commented Dec 23, 2025

so far we've got:

  • xfail (common outside JS, but a bit unintuitive)
  • failing (jest/ava, but can be confusing to reference in prose)
  • mustFail
  • knownFailure

Also, there's @ts-expect-error, which has the same semantics, so maybe:

  • expectError

Any more thoughts from folks?

@JakobJingleheimer
Member Author

Let's move this to a discussion (I think that will be easier, and we can eventually use its poll feature): nodejs/test-runner#1
