
flake: TestAgent_SessionTTYShell #1177

@flake-investigator


CI Run Link: https://github.com/coder/coder/actions/runs/19971155962
Failed Job: test-go-pg (macos-latest)
Date/Time: 2025-12-05 ~17:58 UTC

Commit Info:

  • SHA: 61beb7bfa8b371ef7c1c9399043e3c4a3e1dd6f2
  • Author: Mathias Fredriksson
  • Title: docs: rewrite dev containers documentation for GA (#21080)

Error Evidence (from logs):

=== FAIL: agent TestAgent_SessionTTYShell/(22) (11.58s)
    agent_test.go:483: 2025-12-05 17:56:31.610: cmd: read error: match deadline exceeded: context deadline exceeded (wanted 1 bytes; got 0: "")
    agent_test.go:483:
        	Error Trace:	/Users/runner/work/coder/coder/pty/ptytest/ptytest.go:374
        	            				/Users/runner/work/coder/coder/pty/ptytest/ptytest.go:248
        	            				/Users/runner/work/coder/coder/agent/agent_test.go:483
        	Error:      	read error
        	Test:       	TestAgent_SessionTTYShell/(22)
        	Messages:   	match deadline exceeded: context deadline exceeded (wanted 1 bytes; got 0: "")
...
--- FAIL: TestAgent_SessionTTYShell/(22) (11.58s)

=== FAIL: agent TestAgent_SessionTTYShell (0.00s)
DONE 2091 tests, 37 skipped, 2 failures in 141.324s

Test Function and Location:

  • File: agent/agent_test.go
  • Function: TestAgent_SessionTTYShell
  • Lines: ~455-486 at commit 61beb7b (start located via pattern ^func TestAgent_SessionTTYShell)

Assignment Analysis (Test Function Blame attempt):

  • Located function start at approx line 455 in agent/agent_test.go at 61beb7b.
  • Recent commits touching this test file include:
    • 33b42fca (Ethan) "test: fix flake in TestAgent_Metrics_SSH"
    • 5807fe01 (Spike) "test: prevent TestAgent_ReconnectingPTY connection reporting check from interfering"
    • 4d1003ea (Zach) "fix: remove initial global HTTP client usage"
    • 51d8a053 (Ethan) "test: disable direct connections for a deterministic reachable peers metric"
  • None of these commits clearly modifies TestAgent_SessionTTYShell itself. Because precise per-line blame tooling was not available here, the last person to modify the function could not be determined.

Root Cause Classification: Flaky Test

  • Symptom: a PTY-based expect timed out (match deadline exceeded) while interacting with an interactive shell over SSH on the macOS runner. No bytes were received before the deadline (wanted 1 byte; got 0).
  • No data race warnings observed.
  • No panic/OOM indicators observed.
  • Appears timing-dependent on macOS (pty/expecter interaction). Similar flakes have occurred in other PTY tests.

Related Issues (similar failure pattern, different tests):

Proposed Next Steps:

  • Review TestAgent_SessionTTYShell to add more robust synchronization before expecting PTY output (e.g., explicit prompt detection with retries or longer wait on macOS).
  • Consider increasing timeouts or using require.Eventually around PTY expect patterns for macOS.
  • Investigate whether Homebrew auto-updates or environment setup during the run adds latency; ensure the test is insulated from external env timing.

Reproduction:

  • Run on the macOS runner: job "test-go-pg (macos-latest)". The test starts an interactive shell via SSH with a PTY, runs echo test, and expects "test" in the output. Intermittently, the expecter times out before receiving a single byte.

Assignment:

  • Assigning to @ethanndickson per secondary rule (recent meaningful changes to agent/agent_test.go and agent test infra). Component area: agent PTY/SSH tests.
