feat(testutil): add lazy timeout context with location-based reset #21120

mafredri · 2025-12-05T12:32:20Z

It's common to create a context early in a test body, then do setup work
unrelated to that context. By the time the context is actually used, it
may have already timed out. This was detected as test failures in #21091.

The new Context function returns a context that resets its timeout when
accessed from new lines in the test file. The timeout does not begin until
the context is first used (lazy initialization).

This is useful for integration tests that pass contexts through various
subsystems, where each subsystem should get a fresh timeout window.

Key behaviors:

Timer starts on first Done, Deadline, or Err call
Value does not trigger initialization
Each unique line in a _test.go file resets the timeout window
Same-line access (e.g., in loops) does not reset
Expired contexts cannot be resurrected

Limitations:

Wrapping with a child context (e.g., context.WithCancel) prevents
reset since the childs methods shadow the parent
Storing the Done channel circumvents resets

The original fixed-timeout behavior is available via ContextFixed.

Arguably, it might suffice to simplify this implementation to only
implement the start timer on first use functionality. But given our
massively parallel test-suite, I'm biased towards enabling the reset as
well.

It's common to create a context early in a test body, then do setup work unrelated to that context. By the time the context is actually used, it may have already timed out. This was detected as test failures in #21091. The new Context() function returns a context that resets its timeout when accessed from new lines in the test file. The timeout does not begin until the context is first used (lazy initialization). This is useful for integration tests that pass contexts through many subsystems, where each subsystem should get a fresh timeout window. Key behaviors: - Timer starts on first Done(), Deadline(), or Err() call - Value() does not trigger initialization (used for tracing/logging) - Each unique line in a _test.go file gets a fresh timeout window - Same-line access (e.g., in loops) does not reset - Expired contexts cannot be resurrected Limitations: - Wrapping with a child context (e.g., context.WithCancel) prevents resets since the child's methods don't call through to the parent - Storing the Done() channel prevents resets on subsequent accesses The original fixed-timeout behavior is available via ContextFixed().

deansheather · 2025-12-09T04:11:17Z

We can't change the timeout for each individual call which is unfortunate. I know a lot of tests just use a single long context for the entire test, but some tests do have multiple contexts with different timeouts for different calls.

Also, the behavior of resetting on every new call means we can't set test-wide timeouts anymore (except with the -timeout flag on go test, which is intentionally very large to account for the entire test suite).

Maybe we need a different API to solve these things? Here's something off the top of my head:

func TestBlah(t *testing.T) {
  xctx := testutil.ContextProvider(t, testutil.WaitLong) // full test timeout, lazily creates a parent context when we hit the first call
  // ...
  err := MyFunction(xctx.New(testutil.WaitMedium), param) // makes a new "child" context from the parent
}



func testutil.ContextProvider(t *testing.T, testTimeout time.Duration) *testutil.ParentContext {
  return &testutil.ParentContext{
    testTimeout: testTimeout,
  }
}

type testutil.ParentContext struct {
  testTimeout time.Duration

  // todo: synchronization fields...
  ctx context.Context
}

func (c *testutil.ParentContext) New(timeout time.Duration) context.Context {
  c.ensureInitialized() // ensures c.ctx is set
  return context.WithTimeout(c.ctx, timeout)
}

// todo: Other methods as needed, e.g. NewWithCancel, NewWithDeadline

^ Which would just be a very light wrapper around the functions in the context package.

It's a shame that we have tests that initialize the context early followed by heavy non-context consuming startup cost. Does this mostly just stem from tests using coderdtest?

deansheather

comment above oops

mafredri · 2025-12-09T13:56:47Z

Thanks for taking a look and the discussion @deansheather, I didn't make it clear in the ticket but this is more of a design/proposal to spark discussion.

Also, the behavior of resetting on every new call means we can't set test-wide timeouts anymore (except with the -timeout flag on go test, which is intentionally very large to account for the entire test suite).

This is a valid concern. The reasoning is that in a massively parallel test suite, it's hard to reason about what length of timeout is actually valid, and rather move the thought process to "maximum time we want this test to sit idle".

Do keep in mind that call-site is tracked, so you do get an approximate timeout for the test, even though it's not exactly 15 seconds, perhaps more like 25 seconds.

Granted, this implementation is not a great way to shift the thinking from "max test time" to "max idle time".

Maybe we need a different API to solve these things? Here's something off the top of my head:

This is an interesting alternative, and one we could probably enforce/lint. (It's very hard to lint the referenced bug-case without also causing false positives, which sparked this PR.)

Taking some inspiration from you proposal, I could see an more ergonomic alternative that also functions as a drop-in.

package testutil

var _ context.Context = (*TestContext)(nil)

func Context(...) *TestContext { ... } // always lazy start (or option?)

type TestContext struct{}

func (tc *TestContext) Cancel() { ... }                  // we were missing this previously
func (tc *TestContext) New(timeout) *TestContext { ... } // make timeout optional?
func (tc *TestContext) Reset() *TestContext { ... }      // resets the timer

func (tc *TestContext) Join(ctx ...context.Context) *TestContext { ... } // sometimes you have two independent contexts, but can only pass one along
// more?

This would still allow the most basic use to be ctx := testutil.Context(t, testutil.WaitMedium) whilst using it as a normal context.

spikecurtis · 2025-12-10T02:43:54Z

My opinion is that this is a lot of complexity for a relatively narrow benefit.

in a massively parallel test suite, it's hard to reason about what length of timeout is actually valid, and rather move the thought process to "maximum time we want this test to sit idle".

This trades one thing that is hard to reason about for something that is much harder. Speaking about the original lazy context, when you make a call in the test function and pass the context, it is very difficult to know whether it will trigger a reset. If the call synchronously "uses" the context, then it is reset, but if it spawns a goroutine that "uses" the context, then it is not.

Even if the resets are explicit, like in your second proposal, the question of what duration to use is still hard to reason about. Like, if your test has steps A, B, C, and D with resets in between, with the current, basic context you have to estimate the sum of the durations. With resets you have to estimate the max. Have we actually made life easier for test authors? Dubious, IMO.

mafredri · 2025-12-10T18:24:25Z

@spikecurtis thanks for taking a look and for the feedback. You raise valid concerns and I honestly don’t disagree with you on any point.

It seems the only concept here that might hold some value is the delayed start, but is it worth the decrease in predictability? Probably not.

I briefly considered flipping the logic, error of context is not used within 1/2 or 2/3 of the timeout. But now we’re just breaking valid use-cases. Maybe something takes a context but doesn’t use it and takes long enough to process you hit the error. Rejected.

One more option comes to mind, swapping out the implementation via build tag allowing you to run a form of ”lint” on the test-suite, where we could enforce the error case. But again, having this code in the code base simply might not be worth catching the few weird flakes we run into.

This was a fun thought experiment, thanks for participating! Unless you guys have any better ideas, I’m going to close this one out.

github-actions bot assigned mafredri Dec 5, 2025

mafredri force-pushed the mafredri/lazy-timeout-context branch 2 times, most recently from 9896675 to e6e4582 Compare December 5, 2025 13:00

mafredri marked this pull request as ready for review December 8, 2025 14:13

mafredri requested review from deansheather and spikecurtis December 8, 2025 14:13

mafredri force-pushed the mafredri/lazy-timeout-context branch 3 times, most recently from 195362e to c9238e2 Compare December 8, 2025 15:59

mafredri force-pushed the mafredri/lazy-timeout-context branch from c9238e2 to 166e3b0 Compare December 8, 2025 16:02

deansheather reviewed Dec 9, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(testutil): add lazy timeout context with location-based reset #21120

feat(testutil): add lazy timeout context with location-based reset #21120

mafredri commented Dec 5, 2025 •

edited

Loading

Uh oh!

deansheather commented Dec 9, 2025

Uh oh!

deansheather left a comment

Uh oh!

mafredri commented Dec 9, 2025

Uh oh!

spikecurtis commented Dec 10, 2025

Uh oh!

mafredri commented Dec 10, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

feat(testutil): add lazy timeout context with location-based reset #21120

Are you sure you want to change the base?

feat(testutil): add lazy timeout context with location-based reset #21120

Conversation

mafredri commented Dec 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

deansheather commented Dec 9, 2025

Uh oh!

deansheather left a comment

Choose a reason for hiding this comment

Uh oh!

mafredri commented Dec 9, 2025

Uh oh!

spikecurtis commented Dec 10, 2025

Uh oh!

mafredri commented Dec 10, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mafredri commented Dec 5, 2025 •

edited

Loading