-
Notifications
You must be signed in to change notification settings - Fork 1.1k
feat(agent): add agent socket API #20717
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bf9e8fd to
1daf67e
Compare
a17863a to
9fdc98d
Compare
| // Request whether a unit is ready to be started. That is, all dependencies are satisfied. | ||
| rpc SyncReady(SyncReadyRequest) returns (SyncReadyResponse); | ||
| // Get the status of a unit and list its dependencies. | ||
| rpc SyncStatus(SyncStatusRequest) returns (SyncStatusResponse); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the flow going to be:
- SyncWant(fooscript, barscript)
- SyncWant(fooscript, bazscript)
- while ! SyncReady(fooscript)
- SyncStart(fooscript)
- ...
- SyncComplete(fooscript)
I assume SyncStatus is more useful or debugging than actual script writing (which is fine).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, wdyt about supporting watch vs polling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thought, what happens if I reverse the order of want and start? So start -> want want -> ready. (I imagine users will feel it's natural to start immediately once your script gets going, it may be wrong use but still.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your described flow is correct. Although SyncStart should probably do the polling server side.
In terms of watch vs poll, we agreed with Cian to only poll for now. We can implement watch later.
To your last thought:
- Want should almost always be called before Start.
- Wants that are added after Start will only be applied on the next Start.
- We could add a guard to say that dependencies cannot be added to a unit that has been started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On further consideration: we do need to think about what happens when a script fails (without being marked as such) and gets retried. The status will be stuck in "started".
agent/agentsocket/socket_unix.go
Outdated
|
|
||
| // Fall back to /tmp with user-specific path | ||
| uid := os.Getuid() | ||
| return filepath.Join("/tmp", fmt.Sprintf("coder-agent-%d.sock", uid)), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I wonder if this might create conflicts between tests. Especially later on with CLI support and CLI tests. Consider adding a CLI flag to configure it.)
f7245fa to
79d783a
Compare
Further slim down the agentsocket proxy simplify the agentsocket server interface avoid two race conditions use t.Cleanup where applicable
| func (s *Server) acceptConnections() { | ||
| // In an edge case, Stop() might race with acceptConnections() and set s.listener to nil. | ||
| // Therefore, we grab a copy of the listener under a lock. We might still get a nil listener, | ||
| // but then we know close has already run and we can return early. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outdated comment (no Stop). Entire comment is not needed after my suggestion for type Service.
|
|
||
| // getDefaultSocketPath returns the default socket path for Unix-like systems | ||
| func getDefaultSocketPath() (string, error) { | ||
| // Try XDG_RUNTIME_DIR first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please leave this for future refactor, but it's a good idea to handle these types of things in the agent, and pass them down rather than have each component do their own thing.
agent/agentsocket/socket_unix.go
Outdated
| } | ||
|
|
||
| // Try to connect to see if it's actually listening | ||
| conn, err := net.Dial("unix", path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use net.Dialer{}.DialContext(ctx, ...) here so we don't block forever if the socket is in a bad state or there's a bad agent listening on it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
make gen
…duction agents must specify a well-known agentsocket path and clients must be able to specify the socket to address
DanielleMaywood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stamping with a nit about a comment
agent/agentsocket/server.go
Outdated
|
|
||
| server.listener = listener | ||
|
|
||
| // This context is canceled by s.Stop() when the server is stopped. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // This context is canceled by s.Stop() when the server is stopped. | |
| // This context is canceled by s.Close() when the server is stopped. |
I think this comment is outdated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I got to that one 😄. On it. Thanks!
mtojek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
post-merge LGTM! Thank you both
relates to: coder/internal#1094
This is number 2 of 5 pull requests in an effort to add agent script ordering. It adds a drpc API that is exposed via a local socket. This API serves access to a lightweight DAG based dependency manager that was inspired by systemd.
In follow-up PRs:
I used an LLM to produce some of these changes, but I have conducted thorough self review and consider this contribution to be ready for an external reviewer.