-
Notifications
You must be signed in to change notification settings - Fork 1.1k
fix: suppress startup script logs in non-blocking SSH mode #21073
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
mafredri
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with not dumping logs in non-blocking mode, but skipping the stage log doesn't add value IMO and introduces inconsistency.
cli/cliui/agent.go
Outdated
| case codersdk.WorkspaceAgentLifecycleStartTimeout: | ||
| // Backwards compatibility: Avoid printing warning if | ||
| // coderd is old and doesn't set ReadyAt for timeouts. | ||
| // Backwards compat: old coderd may not set ReadyAt for timeouts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This seems like a needless change? I guess we should update CLAUDE.md.
cli/cliui/agent.go
Outdated
| default: | ||
| if agent.LifecycleState.Starting() { | ||
| sw.Log(time.Time{}, codersdk.LogLevelWarn, noticeStartupRunning) | ||
| sw.Log(time.Time{}, codersdk.LogLevelWarn, troubleshootingMessage(agent, docsURL+"/admin/templates/troubleshooting#your-workspace-may-be-incomplete")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: For consistency, we should make this an anchor variable too.
cli/cliui/agent_test.go
Outdated
| "⧗ Waiting for the workspace agent to connect", | ||
| "✔ Waiting for the workspace agent to connect", | ||
| "⧗ Running workspace agent startup scripts (non-blocking)", | ||
| // Non-blocking mode: no "Running" stage, just a notice |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why we want this removed, considering we show the Waiting above anyway?
For consistency, we should still show the Running line, just not dump logs.
|
I took a look at #13580, and it seems like this isn't addressing the decision in that ticket, which is to remove the log streaming entirely from all modes and replace it with instructions. Or am I misunderstanding the conclusion from that ticket? |
I don't have an opinion on live-log-streaming. My instinct is we shouldn't do it in the We should definitely not show a massive dump of failed logs every single SSH invocation though. That turns what could be a transient failure into a permanent one for things like Mux which rely on |
Alright 👍🏻, we can leave it in then for interactive mode. But just to be clear, it's totally fine to do this for Consider: ❯ ssh dev.w.me.coder -- 'echo hi 1>&2' | grep hi
hiWe output on stderr inside the SSH session, but it gets printed on stdout. So we're not preventing normal scriptability.
Totally agree with this 👍🏻. I'd think this could be done with a fairly minor diff, though. |
c5547f3 to
40655a8
Compare
cli/cliui/agent_test.go
Outdated
| }, | ||
| { | ||
| // In non-blocking mode, a startup error still shows the warning but | ||
| // doesn't stream logs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Copy this test rather than change it so we don't lose coverage.
cli/cliui/agent.go
Outdated
| } | ||
| // For Starting states in non-blocking mode, complete the | ||
| // stage since we're not waiting for scripts to finish. | ||
| if agent.LifecycleState.Starting() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: The previous behavior of not completing the stage and showing the Notice: seemed fine, as long as we don't show script logs.
In non-blocking mode (--wait=no), skip streaming startup script logs while still showing the stage indicator and warnings for failures. This eliminates log spam when using SSH via ProxyCommand or with --wait=no, bringing behavior closer to standard SSH expectations. Fixes #13580
40655a8 to
3dceb0e
Compare
Add guidelines to prevent common PR feedback issues: - Don't reword existing comments or code unless directly motivated by the task - Don't delete existing comments that explain non-obvious behavior - Add new test cases instead of modifying existing ones Refs #21073
| err = func() error { | ||
| // In non-blocking mode, skip streaming logs. | ||
| // See: https://github.com/coder/coder/issues/13580 | ||
| if !opts.Wait { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine, but code within this func could be simplified, checking follow and having conditionals does not make sense anymore.
|
@mafredri can you take this to completion? I think you have a very specific vision for how it should work, which is great, but I don't intend on repeat contributions so the dialogue to get the change acceptable not going to yield long-term benefits for us. |
Sure, I don't mind. 👍🏻 |
|
Ok thanks. I can rubber stamp those if you want but my code review shouldn't be worth anything here. |
I'm fairly confident we retained behavior (test coverage and manual testing), but I'll still get another engineer to review in the meantime. Feel free to stamp, at least the latter PR, assuming it matches what you expect to happen. Early attach (does not skip build logs): During startup: After startup (still shows warning, but no logs): |
Summary
This PR improves parity between
sshbehavior for Coder workspaces vs non-Coder hosts. When you runssh server -- echo hello, you gethello. Butssh workspace.coder -- echo hellowould dump hundreds of lines of startup script logs first—even when not waiting for them.This is part of making Coder Workspaces and Mux work better together.
Fixes #13580
Changes
--wait=noor via ProxyCommand): Show only a concise warning if startup failed/timed out, no logsssh workspace.coder -- cmd): Suppress all agent status output via newQuietoption--wait=yes): Unchanged—still streams logsBefore/After
ssh workspace.coder -- echo hellohellocoder ssh --wait=no ws(TTY)coder ssh --wait=yes wsGenerated with mux