Issue: vitest-dev/vitest#10097
With describe.concurrent and maxConcurrency: 5, users expect at most 5 tests' resources to be held simultaneously. Instead, all N concurrent tests fire their beforeEach hooks (allocating resources) before any test body runs — producing N simultaneous resource allocations regardless of maxConcurrency.
Root cause: the current scheduler is effectively BFS over the task tree. All nodes at depth d complete a lifecycle phase before any node at depth d+1 begins.
Every test run is a recursive tree where every node has a lifecycle (not just leaf tests):
File
└── Suite [aroundAll / beforeAll → children → afterAll]
├── Suite [aroundAll / beforeAll → children → afterAll]
│ ├── Test [aroundEach / beforeEach → body → afterEach]
│ └── Test [aroundEach / beforeEach → body → afterEach]
└── Test [aroundEach / beforeEach → body → afterEach]
Resources can be allocated at any level of this tree (not just in beforeEach). A suite's beforeAll opens resources for the whole suite. Nested concurrent suites can each open their own beforeAll resources. The BFS problem applies at every level, not just at the leaf (test) level.
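To make the "every node has a lifecycle" point concrete, the tree can be modeled roughly as below. This is a simplified sketch: the type names (`TestNode`, `SuiteNode`, `TaskNode`) and helpers (`makeTest`, `makeSuite`, `countLifecycles`) are hypothetical and are not Vitest's actual types.

```typescript
// Simplified, hypothetical model of the tree above; these are not Vitest's real types.
type Hook = () => void | Promise<void>

interface TestNode {
  kind: 'test'
  name: string
  beforeEach: Hook[] // may allocate per-test resources
  body: Hook
  afterEach: Hook[] // releases them
}

interface SuiteNode {
  kind: 'suite'
  name: string
  concurrent: boolean // true when the children are dispatched as a concurrent group
  beforeAll: Hook[] // may allocate suite-wide resources
  afterAll: Hook[]
  children: TaskNode[]
}

type TaskNode = SuiteNode | TestNode

const noop: Hook = () => {}

function makeTest(name: string): TestNode {
  return { kind: 'test', name, beforeEach: [noop], body: noop, afterEach: [noop] }
}

function makeSuite(name: string, children: TaskNode[]): SuiteNode {
  return { kind: 'suite', name, concurrent: true, beforeAll: [noop], afterAll: [noop], children }
}

// Mirrors the diagram: an outer suite holding a nested suite (two tests) and one test.
const exampleTree = makeSuite('outer', [
  makeSuite('inner', [makeTest('a'), makeTest('b')]),
  makeTest('c'),
])

// Every node, suite or test, owns a lifecycle in which resources may be allocated.
function countLifecycles(node: TaskNode): number {
  if (node.kind === 'test') return 1
  return 1 + node.children.reduce((sum, child) => sum + countLifecycles(child), 0)
}
```

For `exampleTree`, `countLifecycles` returns 5: two suite lifecycles plus three test lifecycles, each a place where resources can be opened.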
Current (BFS): In a concurrent group of N children, all N subtrees start simultaneously. Their operations queue into a flat FIFO. Result: all beforeEach hooks across all N tests complete before any test body runs.
Desired (DFS with bounded parallelism): In a concurrent group, start at most maxConcurrency children at a time. When one child's entire subtree completes (including all its hooks at every level), free that slot for the next sibling. This keeps at most maxConcurrency subtrees genuinely in-flight in each concurrent group, at every level of the tree.
At any concurrent group in the tree, at most maxConcurrency children's subtrees are simultaneously in-flight.
"In-flight" means: the subtree has started (its first before-hook has begun) but not yet completed (its last after-hook has not yet finished). This invariant holds recursively at every level of the tree.
This claim is:
- Sufficient to bound resource ownership at every level
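- Deadlock-free: each subtree holds exactly one "slot", taken from its parent group's own pool; slots at different levels come from independent pools, so no pool is ever acquired from recursively and there are no cycles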
- General: applies uniformly to suites with beforeAll/afterAll/aroundAll and tests with beforeEach/afterEach/aroundEach, at any depth
The fix lives at the concurrent dispatch point — wherever the tree fans out into concurrent children. Instead of launching all children simultaneously, launch at most K, and when one subtree fully completes, start the next:
runConcurrentGroup(children, K):
  run at most K children simultaneously
  when a child's full subtree completes → start next child
Applied recursively at every concurrent group in the tree, this naturally produces DFS-ordered execution with bounded parallelism.
No special-casing of leaf tests needed. No changes to the hook call chain needed. The tree structure itself enforces the invariant.
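The pseudocode above could be sketched in TypeScript roughly as follows. This is an illustration under stated assumptions, not the runner's code: `TreeNode`, `Stats`, `runSubtree`, and `demo` are hypothetical names, hooks are stubbed with `await Promise.resolve()`, every suite is treated as a concurrent group, and `k` is assumed to be at least 1.

```typescript
type TreeNode =
  | { kind: 'test'; name: string }
  | { kind: 'suite'; name: string; children: TreeNode[] }

interface Stats {
  peakPerGroup: number // highest in-flight count observed inside any single group
  finished: string[] // test names in completion order
}

// Run at most k children at once; a slot frees only when a child's whole subtree settles.
async function runConcurrentGroup(children: TreeNode[], k: number, stats: Stats): Promise<void> {
  let next = 0
  let inFlight = 0 // scoped to this group only
  async function worker(): Promise<void> {
    while (next < children.length) {
      const child = children[next++]
      inFlight++
      stats.peakPerGroup = Math.max(stats.peakPerGroup, inFlight)
      try {
        await runSubtree(child, k, stats)
      } finally {
        inFlight-- // slot is released even if the subtree rejects
      }
    }
  }
  const workers = Array.from({ length: Math.min(k, children.length) }, () => worker())
  await Promise.all(workers)
}

async function runSubtree(node: TreeNode, k: number, stats: Stats): Promise<void> {
  await Promise.resolve() // stands in for beforeAll / beforeEach
  if (node.kind === 'suite') await runConcurrentGroup(node.children, k, stats)
  else stats.finished.push(node.name)
  await Promise.resolve() // stands in for afterAll / afterEach
}

// Three concurrent suites of four tests each, with k = 2.
async function demo(): Promise<Stats> {
  const tests = (prefix: string): TreeNode[] =>
    [1, 2, 3, 4].map((i): TreeNode => ({ kind: 'test', name: `${prefix}-t${i}` }))
  const root: TreeNode[] = ['s1', 's2', 's3'].map(
    (name): TreeNode => ({ kind: 'suite', name, children: tests(name) }),
  )
  const stats: Stats = { peakPerGroup: 0, finished: [] }
  await runConcurrentGroup(root, 2, stats)
  return stats
}
```

In this sketch `demo()` resolves with `peakPerGroup` equal to 2 and all 12 tests finished: the bound holds inside each group, while the third suite does not start until one of the first two has completed its entire subtree.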
In runSuite, where a concurrent group is dispatched, wrap each child with a short-lived limiter:
// before
await Promise.all(tasksGroup.map(c => runSuiteChild(c, runner)))
// after
const groupLimiter = limitConcurrency(runner.config.maxConcurrency)
await Promise.all(tasksGroup.map(c => groupLimiter(() => runSuiteChild(c, runner))))
Each child holds one slot in groupLimiter for the entire duration of its subtree. The slot is released only when runSuiteChild resolves (after all hooks at all levels within that subtree have completed). When a slot frees, the next waiting sibling starts.
The limiter instance is created per concurrent group and GC'd when the group finishes. Multiple instances exist simultaneously only if concurrent groups are nested or running in parallel — each scoped to its own group, no cross-level interference.
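For readers unfamiliar with this kind of helper, a limiter with the call shape used in the snippet above could look roughly like this. It is an illustrative stand-in written for this explanation, assuming a simple FIFO hand-off; it is not the runner's actual `limitConcurrency` implementation, and `demo` is a hypothetical usage function.

```typescript
// Illustrative stand-in; not the runner's actual limitConcurrency.
function limitConcurrency(max: number) {
  let active = 0 // slots currently occupied
  const waiting: Array<() => void> = [] // FIFO of parked callers

  return async function limit<T>(fn: () => Promise<T>): Promise<T> {
    if (active >= max) {
      // Park until a finishing task hands its slot over.
      await new Promise<void>((resolve) => waiting.push(resolve))
    } else {
      active++
    }
    try {
      return await fn()
    } finally {
      const next = waiting.shift()
      if (next) next() // transfer the slot directly; active stays unchanged
      else active--
    }
  }
}

// Ten tasks through a limiter of three; returns the highest number running at once.
async function demo(): Promise<number> {
  const limit = limitConcurrency(3)
  let running = 0
  let peak = 0
  await Promise.all(
    Array.from({ length: 10 }, () =>
      limit(async () => {
        running++
        peak = Math.max(peak, running)
        await new Promise<void>((resolve) => setTimeout(resolve, 1))
        running--
      }),
    ),
  )
  return peak
}
```

Here `demo()` resolves to 3. Because the limiter is a plain closure, creating one per concurrent group costs little, and it becomes unreachable as soon as the group's `Promise.all` settles.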
A global instance with subtree-scoped holding would deadlock: a parent concurrent group holds K slots (one per in-flight child), and when those children contain their own concurrent groups, the children try to acquire more slots from the same exhausted pool. Per-group instances avoid this entirely — each level has its own independent pool.
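The deadlock can be reproduced in a few lines. The sketch below is a toy model, not runner code: `limitConcurrency` is a minimal stand-in limiter, `runSuite` and `finishes` are hypothetical helpers, each suite has two children, and a 50 ms timeout is used only to detect that nothing makes progress.

```typescript
type Limit = <T>(fn: () => Promise<T>) => Promise<T>

// Minimal stand-in limiter (FIFO hand-off); not the runner's implementation.
function limitConcurrency(max: number): Limit {
  let active = 0
  const waiting: Array<() => void> = []
  return async function limit<T>(fn: () => Promise<T>): Promise<T> {
    if (active >= max) await new Promise<void>((resolve) => waiting.push(resolve))
    else active++
    try {
      return await fn()
    } finally {
      const next = waiting.shift()
      if (next) next()
      else active--
    }
  }
}

// A suite with two children; each child holds its slot for its whole subtree.
async function runSuite(getLimiter: () => Limit, depth: number): Promise<void> {
  if (depth === 0) return // leaf test: nothing further to acquire
  const limit = getLimiter()
  await Promise.all([0, 1].map(() => limit(() => runSuite(getLimiter, depth - 1))))
}

// Resolves true if the two-level tree completes, false if it stalls for 50 ms.
function finishes(getLimiter: () => Limit): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), 50)
    runSuite(getLimiter, 2).then(() => {
      clearTimeout(timer)
      resolve(true)
    })
  })
}

const shared = limitConcurrency(2)
const sharedPool = finishes(() => shared) // one global pool for every level
const perGroupPool = finishes(() => limitConcurrency(2)) // fresh pool per group
```

With the shared pool, the two top-level children take both slots and their own children then wait forever on the same exhausted pool, so `sharedPool` resolves to `false`. With a fresh limiter per group, `perGroupPool` resolves to `true`.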
The current global limitMaxConcurrency wraps individual hook calls and the test body. Under the new model, that single limiter separates into two orthogonal concerns:
1. Subtree-level concurrency (the fix): a per-group limiter at dispatch, which bounds how many sibling subtrees are in-flight simultaneously. This is the resource-ownership guarantee.
2. Within-lifecycle hook parallelism: when sequence.hooks = 'parallel', multiple hooks within a single lifecycle run concurrently. This is independent of subtree concurrency and can use its own limiter (or be left unbounded, since the number of hooks per lifecycle is small and fixed).
The existing global limitMaxConcurrency was conflating both concerns. After the fix, it can be removed or narrowed to concern (2) only.
Concrete example: 400 concurrent tests, maxConcurrency=5, each test does beforeEach(alloc) → body → afterEach(release).
Current (BFS): All 400 runTest calls start immediately via Promise.all. Each enqueues into the global FIFO:
Queue: [t1.bE, t2.bE, t3.bE, ..., t400.bE] ← all 400 enqueued upfront
running: t1–t5
t1.bE completes → slot freed → t6.bE starts
t1 enqueues t1.body at the BACK of the queue:
Queue: [t7.bE, t8.bE, ..., t400.bE, t1.body]
→ t1.body sits behind 394 other beforeEach items
→ by the time t1.body runs, all 400 beforeEach have completed
→ 400 resources allocated simultaneously
Desired (DFS with bounded parallelism): Only 5 runTest calls start initially. The other 395 wait on the group limiter before even beginning:
groupLimiter slots: [t1, t2, t3, t4, t5] ← t6–t400 haven't started at all
Each of t1–t5 runs its lifecycle as a sequential chain:
t1: beforeEach → body → afterEach
t2: beforeEach → body → afterEach
...
When t1 fully completes → slot freed → t6 starts its lifecycle
At most 5 resources held at any time.
Two properties hold together:
- Only K tests exist at all: the rest haven't started, so their beforeEach hasn't been called yet.
- Within each started test, the lifecycle is a sequential chain: t1.body follows immediately after t1.beforeEach with nothing in between, not after 394 other items.
The bounding happens before the lifecycle begins, not inside it. There is no global queue for individual operations to race in.
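Both walkthroughs can be checked with a small simulation. This is a toy model under stated assumptions rather than the runner itself: `limitConcurrency` is a minimal FIFO stand-in, `peakHeld` is a hypothetical helper, a "resource" is just a counter incremented in beforeEach and decremented in afterEach, and each operation takes one timer tick.

```typescript
type Limit = <T>(fn: () => Promise<T>) => Promise<T>

// Minimal FIFO stand-in limiter; not the runner's implementation.
function limitConcurrency(max: number): Limit {
  let active = 0
  const waiting: Array<() => void> = []
  return async function limit<T>(fn: () => Promise<T>): Promise<T> {
    if (active >= max) await new Promise<void>((resolve) => waiting.push(resolve))
    else active++
    try {
      return await fn()
    } finally {
      const next = waiting.shift()
      if (next) next()
      else active--
    }
  }
}

const tick = () => new Promise<void>((resolve) => setTimeout(resolve, 0))

// Returns the peak number of simultaneously held resources for the given model.
async function peakHeld(mode: 'per-operation' | 'per-group', tests = 400, k = 5): Promise<number> {
  const limit = limitConcurrency(k)
  let held = 0
  let peak = 0
  const beforeEach = async () => { held++; peak = Math.max(peak, held); await tick() }
  const body = async () => { await tick() }
  const afterEach = async () => { await tick(); held-- }

  const runTest = async (): Promise<void> => {
    if (mode === 'per-operation') {
      // Every hook and body competes individually in one global FIFO.
      await limit(beforeEach)
      await limit(body)
      await limit(afterEach)
    } else {
      // The lifecycle is an uninterrupted sequential chain.
      await beforeEach()
      await body()
      await afterEach()
    }
  }

  await Promise.all(
    Array.from({ length: tests }, () =>
      // Per-group: the slot wraps the whole lifecycle, so unstarted tests hold nothing.
      mode === 'per-group' ? limit(runTest) : runTest(),
    ),
  )
  return peak
}
```

In this model `peakHeld('per-operation')` resolves to 400 and `peakHeld('per-group')` resolves to 5, matching the two traces above: the per-operation limiter never exceeds 5 running operations yet lets every lifecycle open, while the per-group limiter bounds the lifecycles themselves.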
PR #9653 introduced a per-operation global FIFO limiter to fix beforeAll throttling (#8367). It had its own clean mathematical claim:
"At most K operations execute simultaneously."
This is correct and uniform — every hook call and test body competes equally for K slots. No special cases. But it was modeling at the wrong level of abstraction for the domain.
The mismatch: users reason about maxConcurrency in terms of test lifecycles ("at most K tests running at once, so at most K database connections open"). They don't think in terms of individual hook/body calls. A test lifecycle spans many sequential operations — beforeEach, body, afterEach — and the per-operation model makes no promise about how many lifecycles are simultaneously open. As shown in the walkthrough above, all 400 lifecycles can be open at once even with K=5.
The per-operation model was also not simple in practice. To make it safe, it required:
- A "leaf only takes lock" discipline (only leaf operations acquire slots, not wrappers)
- Explicit setup/teardown slot acquisition in callAroundHooks to manage the aroundEach phases
- Careful avoidance of nested acquisition to prevent deadlock
All of that complexity was compensating for the fact that the model was correct but misaligned with the domain.
The per-group dispatch model aligns implementation with mental model:
"At any concurrent group, at most K subtrees are simultaneously in-flight."
This is what users mean when they set maxConcurrency. The complexity in callAroundHooks evaporates because the invariant is now enforced at the right level — before a lifecycle begins, not inside it.
This is a great explanation