Skip to content

Commit 436e420

Browse files
authored
Check parent done condition before rejecting extraneous worker (#5238)
The `WorkerThreadPoolHierarchicalTestExecutorServiceTests.limitsWorkerThreadsToMaxPoolSize` is flaky with a rather puzzling exception. While there are 3 permits available, 1 worker active thread. And yet the task rejected from the pool. ``` Caused by: java.util.concurrent.RejectedExecutionException: Task withWorkerLeaseManager [parallelism = 3, semaphore = java.util.concurrent.Semaphore@5af97070[Permits = 3]] rejected from java.util.concurrent.ThreadPoolExecutor@f5237bd2[Running, pool size = 3, active threads = 1, queued tasks = 0, completed tasks = 2] at org.junit.platform.engine.support.hierarchical.WorkerThreadPoolHierarchicalTestExecutorService$LeaseAwareRejectedExecutionHandler.rejectedExecution(WorkerThreadPoolHierarchicalTestExecutorService.java:922) ``` This happens because, given the following test tree with a max of 3 workers, worker 1 starts execution at the root node: ``` root <- worker 1 |- child1 | |- leaf1a | |- leaf1b |- child2 | |- leaf2a | |- leaf2b ``` When the child nodes are executed, there are 2 worker threads active and 1 available. The state at this point looks like this: ``` root |- child1 <- worker 1 | |- leaf1a | |- leaf1b |- child2 <- worker 2 | |- leaf2a | |- leaf2b ``` Both workers will then race each other trying to start 1 more. The winner of that race, child1 for the sake of argument, will then execute the `leaf1a`, while the other thread will start working on the second `leaf1b` node. The state at this point looks like this: ``` root |- child1 | |- leaf1a <- worker 1 | |- leaf1b <- worker 3 |- child2 | |- leaf2a <- worker 2 | |- leaf2b ``` Then a second race condition happens. After `leaf1a` is done, the thread starts to wait for `leaf1b` to be finished. While it is waiting, it returns its worker lease and tries to compensate. At the time is tries to compensate, both worker 2 and worker 3 are still busy and the executor rejects the task. The state at this point looks like this: ``` root |- child1 <- worker 1 (awaiting leaf1b) | |- leaf1a | |- leaf1b <- worker 3 |- child2 | |- leaf2a <- worker 2 | |- leaf2b ``` Then a second race condition happens. After `leaf1a` is done, the thread starts to wait for `leaf1b` to be finished. While it is waiting, it returns its worker lease and tries to compensate. At the time is tries to compensate, both worker 2 and worker 3 are still busy and the executor rejects the task. By the time the `RejectedExecutionHandler` is invoked, worker 2 and 3 have finished their work and returned their leases. The state at this point looks like this: ``` root |- child1 <- worker 1 (awaiting leaf1b) | |- leaf1a | |- leaf1b |- child2 | |- leaf2a | |- leaf2b ``` At this point the `executor.isShutdown() || workerLeaseManager.isAtLeastOneLeaseTaken()` condition evaluates to false because 1) the executor is not shutdown and 2) there are no workers with active leases so an exception is thrown. By also checking if the `doneCondition` of the worker that tried to start the new worker we can see that it was waiting in vain.
1 parent 5a3d882 commit 436e420

File tree

2 files changed

+6
-4
lines changed

2 files changed

+6
-4
lines changed

documentation/modules/ROOT/partials/release-notes/release-notes-6.0.2.adoc

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,7 @@ repository on GitHub.
1717
==== Bug Fixes
1818

1919
* Make `ConsoleLauncher` compatible with JDK 26 by avoiding final field mutations.
20+
* Recheck done condition before rejecting extraneous worker in `WorkerThreadPoolHierarchicalTestExecutorService`
2021

2122
[[v6.0.2-junit-platform-deprecations-and-breaking-changes]]
2223
==== Deprecations and Breaking Changes

junit-platform-engine/src/main/java/org/junit/platform/engine/support/hierarchical/WorkerThreadPoolHierarchicalTestExecutorService.java

Lines changed: 5 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -212,13 +212,13 @@ private void maybeStartWorker(BooleanSupplier doneCondition) {
212212
if (workerLease == null) {
213213
return;
214214
}
215-
threadPool.execute(new RunLeaseAwareWorker(workerLease,
215+
threadPool.execute(new RunLeaseAwareWorker(workerLease, doneCondition,
216216
() -> WorkerThread.getOrThrow().processQueueEntries(workerLease, doneCondition),
217217
() -> this.maybeStartWorker(doneCondition)));
218218
}
219219

220-
private record RunLeaseAwareWorker(WorkerLease workerLease, Runnable work, Runnable onWorkerFinished)
221-
implements Runnable {
220+
private record RunLeaseAwareWorker(WorkerLease workerLease, BooleanSupplier parentDoneCondition, Runnable work,
221+
Runnable onWorkerFinished) implements Runnable {
222222

223223
@Override
224224
public void run() {
@@ -916,7 +916,8 @@ public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
916916
return;
917917
}
918918
worker.workerLease.release(false);
919-
if (executor.isShutdown() || workerLeaseManager.isAtLeastOneLeaseTaken()) {
919+
if (executor.isShutdown() || workerLeaseManager.isAtLeastOneLeaseTaken()
920+
|| worker.parentDoneCondition.getAsBoolean()) {
920921
return;
921922
}
922923
throw new RejectedExecutionException("Task with " + workerLeaseManager + " rejected from " + executor);

0 commit comments

Comments
 (0)