
WIP: ent test fix blink #19191


Open
wants to merge 14 commits into
base: callum-autostart-no-provisioner
from


cstyan
Contributor

@cstyan cstyan commented Aug 6, 2025

WIP

cstyan and others added 10 commits August 5, 2025 23:10
Signed-off-by: Callum Styan <callumstyan@gmail.com>
…m test

Add a keepProvisionerAlive mechanism to prevent the provisioner daemon from going
stale during long-running tests. The function updates the provisioner daemon's
LastSeenAt timestamp every 15 seconds so that it stays active throughout test
execution.

This resolves the issue where the test would fail around line 2328 when
checking for available provisioners because the provisioner had exceeded
the 90-second stale interval.
- Add keepalive mechanism with immediate and periodic updates
- Add pre-autobuild provisioner timestamp update
- Add detailed logging to track provisioner updates
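
For illustration, the keepalive described in these commits (one immediate update, then periodic updates until the test ends) could look roughly like the sketch below. This is not the PR's code: `lastSeenStore`, `keepAlive`, `runDemo`, and `demoInterval` are hypothetical names, the store is an in-memory stand-in for the database row, and the interval is shortened from the 15 seconds mentioned above so the example finishes quickly. The time source is passed in as a function so that a mock clock could be substituted for `time.Now`.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// demoInterval is deliberately tiny so the sketch finishes quickly;
// the commit message describes a 15-second interval.
const demoInterval = 10 * time.Millisecond

// lastSeenStore is a hypothetical in-memory stand-in for the database row
// that holds a provisioner daemon's LastSeenAt value.
type lastSeenStore struct {
	mu         sync.Mutex
	lastSeenAt time.Time
	updates    int
}

// touch records a new LastSeenAt value and counts the update.
func (s *lastSeenStore) touch(now time.Time) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.lastSeenAt = now
	s.updates++
}

// count returns how many updates have been recorded so far.
func (s *lastSeenStore) count() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.updates
}

// keepAlive updates the store once immediately, then once per interval
// until ctx is cancelled. The returned channel is closed when the
// background goroutine has exited.
func keepAlive(ctx context.Context, s *lastSeenStore, interval time.Duration, now func() time.Time) <-chan struct{} {
	done := make(chan struct{})
	s.touch(now()) // immediate update, before any tick
	go func() {
		defer close(done)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				s.touch(now())
			}
		}
	}()
	return done
}

// runDemo starts the keepalive, lets it run briefly, stops it, and
// returns the number of updates recorded.
func runDemo() int {
	ctx, cancel := context.WithCancel(context.Background())
	store := &lastSeenStore{}
	done := keepAlive(ctx, store, demoInterval, time.Now)
	time.Sleep(3*demoInterval + demoInterval/2)
	cancel()
	<-done
	return store.count()
}

func main() {
	fmt.Println("updates recorded:", runDemo())
}
```

The exact number of periodic updates depends on scheduling, but the immediate update always happens before `keepAlive` returns.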

The keepalive is working and successfully updating LastSeenAt timestamps,
but hasAvailableProvisioners still reports no active provisioners.
This suggests a database transaction isolation or caching issue.
The test uses a mock clock, but the keepalive was using real time.
Now using clock.Now() instead of time.Now() to match the autobuild
system's time reference.

However, there's still a database transaction isolation issue where
the autobuild system doesn't see the keepalive updates.
…nnel

The root cause was that the autobuild executor's hasAvailableProvisioners function
uses the time parameter from runOnce(t) for age calculation, but the test was
sending real schedule time instead of mock clock time.

Solution: Send clock.Now() instead of sched.Next(...) to the tick channel.

Result: Provisioner is no longer considered stale, test progresses to transition check.
New issue: Autobuild transition not happening (likely scheduling logic).
Solution: Update provisioner LastSeenAt to scheduled time (not current time)

The key insight was that the autobuild system compares two values:
- the provisioner's LastSeenAt timestamp
- the time sent to the tick channel

By setting both to the same scheduled time, the age calculation becomes
scheduledTime - scheduledTime = 0 seconds (not stale).
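
The arithmetic above can be sketched as a small staleness check. This is an assumption-laden illustration, not the executor's actual implementation: `isStale`, `staleInterval`, and `scheduledTime` are hypothetical names, the 90-second value is taken from the earlier commit message, and whether the real comparison is strict or inclusive is not stated in the source.

```go
package main

import (
	"fmt"
	"time"
)

// staleInterval mirrors the 90-second threshold mentioned in the commit messages.
const staleInterval = 90 * time.Second

// scheduledTime is an arbitrary example instant standing in for the
// autostart schedule time that the test sends on the tick channel.
var scheduledTime = time.Date(2025, time.August, 6, 9, 0, 0, 0, time.UTC)

// isStale reports whether a provisioner last seen at lastSeenAt counts as
// stale when evaluated at the tick time. The age is measured against the
// tick, not against the wall clock.
func isStale(lastSeenAt, tick time.Time) bool {
	return tick.Sub(lastSeenAt) > staleInterval
}

func main() {
	// LastSeenAt written an hour before the tick: the age exceeds 90s.
	fmt.Println(isStale(scheduledTime.Add(-time.Hour), scheduledTime)) // true

	// LastSeenAt set to the same scheduled time as the tick: the age is 0s.
	fmt.Println(isStale(scheduledTime, scheduledTime)) // false
}
```

Because the age is computed from the tick value rather than from the current time, a provisioner that was updated "just now" in real time can still look stale if the tick carries a time from a different clock.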

Result: TestExecutorPrebuilds/AutostartScheduleOnlyTriggersAfterClaim PASSES
- Created updateProvisionerLastSeenAt helper function
- Fixed FailureTTLOK test to use NewWithDatabase and helper function
- Used AsSystemRestricted context for proper permissions
- Test now PASSES: TestWorkspaceAutobuild/FailureTTLOK

Next: Apply same pattern to all other TestWorkspaceAutobuild subtests
- Added IncludeProvisionerDaemon: true to test setup
- Added updateProvisionerLastSeenAt helper call before tick
- Test now PASSES: TestWorkspaceAutobuild/DormancyThresholdOK

Pattern emerging: Some tests need IncludeProvisionerDaemon + helper function
- Changed from coderdenttest.New to NewWithDatabase for db access
- Added updateProvisionerLastSeenAt helper calls before both ticks
- Test now PASSES: TestWorkspaceAutobuild/WorkspaceInactiveDeleteTransition

Pattern confirmed: NewWithDatabase + helper function calls work consistently
- Fixed DormantNoAutostart: Added NewWithDatabase + helper function calls
- Fixed RequireActiveVersion: Added NewWithDatabase + helper function calls
- Fixed NextStartAtIsValid: Added NewWithDatabase + helper function calls
- Fixed FailedDeleteRetryDaily: Added NewWithDatabase + helper function calls
- Fixed DormantTTLTooEarly: Added NewWithDatabase + helper function calls
- Fixed InactiveStoppedWorkspaceNoTransition: Added NewWithDatabase + helper function calls

✅ ALL TESTS NOW PASSING: 17/17 TestWorkspaceAutobuild tests
✅ Original target test PASSING: TestExecutorPrebuilds/AutostartScheduleOnlyTriggersAfterClaim
cstyan and others added 4 commits August 6, 2025 01:15
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Fixed 15 failing autobuild tests by implementing a systematic solution
to prevent provisioner daemons from going stale during test execution.

Changes:
- Added updateProvisionerLastSeenAt() helper function
- Updated tests to use NewWithDatabase for database access
- Call helper function before autobuild ticks to keep provisioners active
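
A minimal sketch of the "update, then tick" ordering described in this list follows. It does not reproduce the PR's `updateProvisionerLastSeenAt` helper or any coderd database API: `daemonStore`, `memStore`, `touchThenTick`, and `runDemo` are hypothetical, and an in-memory map stands in for the database obtained via `NewWithDatabase`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// daemonStore is a hypothetical, minimal interface. The real tests talk to
// the coderd database instead.
type daemonStore interface {
	SetLastSeenAt(ctx context.Context, name string, at time.Time) error
}

// memStore is an in-memory daemonStore used only for this sketch.
type memStore struct {
	lastSeen map[string]time.Time
}

// SetLastSeenAt overwrites the timestamp for a known daemon.
func (m *memStore) SetLastSeenAt(_ context.Context, name string, at time.Time) error {
	if _, ok := m.lastSeen[name]; !ok {
		return errors.New("unknown provisioner daemon: " + name)
	}
	m.lastSeen[name] = at
	return nil
}

// touchThenTick first sets the daemon's last-seen time to the tick value and
// only then sends the tick, so the consumer sees an age of zero.
func touchThenTick(ctx context.Context, store daemonStore, name string, tick time.Time, tickCh chan<- time.Time) error {
	if err := store.SetLastSeenAt(ctx, name, tick); err != nil {
		return fmt.Errorf("update last seen: %w", err)
	}
	tickCh <- tick
	return nil
}

// runDemo returns the stored last-seen time and the tick that was sent.
func runDemo() (time.Time, time.Time, error) {
	store := &memStore{lastSeen: map[string]time.Time{"test-daemon": {}}}
	tickCh := make(chan time.Time, 1) // buffered so the send does not block
	scheduled := time.Date(2025, time.August, 6, 9, 0, 0, 0, time.UTC)
	err := touchThenTick(context.Background(), store, "test-daemon", scheduled, tickCh)
	if err != nil {
		return time.Time{}, time.Time{}, err
	}
	return store.lastSeen["test-daemon"], <-tickCh, nil
}

func main() {
	seen, tick, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("last seen equals tick:", seen.Equal(tick))
}
```

Keeping the update and the tick in one function makes it harder for a test to send a tick without refreshing the timestamp first.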

Fixed tests:
- TestExecutorAutostartOK
- TestExecutorAutostartWithParameters
- TestExecutorAutostartMultipleOK
- TestExecutorAutostartTemplateUpdated (all subtests)
- TestExecutorAutostopOK
- TestExecutorAutostopExtend
- TestExecutorAutostopTemplateDisabled
- TestMultipleLifecycleExecutors
- TestNotifications/Dormancy
- TestExecutorPrebuilds/OnlyStopsAfterClaimed
- TestExecutorRequireActiveVersion

All autobuild tests now pass (33/33 passing, 0 failing).
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
1 participant