Open
Labels
needs-triage: Issue that requires triage
Description
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
The Retry button doesn't work correctly after a failed shutdown. If the workspace's instance has already stopped, a retry actually starts the instance again while changing the workspace status in Coder to "stopped".
Please add "sync" functionality to synchronize the state of the Coder workspace with the state of the instance in AWS.
Relevant Log Output
Error: updating EC2 Instance (i-0bc1806a0f3973987) user data: waiting for EC2 Instance (i-0bc1806a0f3973987) stop: timeout while waiting for state to become 'stopped' (last state: 'stopping', timeout: 10m0s)
on main.tf line 211, in resource "aws_instance" "dev":
211: resource "aws_instance" "dev" {
Expected Behavior
The Coder workspace state should sync with the state of the AWS instance.
The issue is described in detail in this ticket: https://help.coder.com/hc/en-us/requests/4024
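To make the requested behavior concrete, here is a minimal sketch of what a "sync" could decide. It is not Coder's implementation: `WorkspaceStatus`, `EC2_TO_WORKSPACE`, and `sync_status` are hypothetical names, and the only outside facts assumed are the EC2 instance state names that AWS reports (`pending`, `running`, `stopping`, `stopped`, `shutting-down`, `terminated`).

```python
from enum import Enum


class WorkspaceStatus(str, Enum):
    """Hypothetical subset of workspace statuses, for illustration only."""
    RUNNING = "running"
    STOPPED = "stopped"
    FAILED = "failed"


# Settled EC2 states that map directly onto a workspace status.
# Transitional states (pending, stopping, shutting-down) are left out on purpose.
EC2_TO_WORKSPACE = {
    "running": WorkspaceStatus.RUNNING,
    "stopped": WorkspaceStatus.STOPPED,
}


def sync_status(recorded: WorkspaceStatus, ec2_state: str) -> WorkspaceStatus:
    """Return the status to record, given the instance's actual EC2 state.

    If the instance is in a settled state, that state wins over whatever
    was recorded. Otherwise the recorded status is kept unchanged.
    """
    return EC2_TO_WORKSPACE.get(ec2_state, recorded)


if __name__ == "__main__":
    # The case from this report: the build failed on a timeout,
    # but the instance finished stopping afterwards.
    print(sync_status(WorkspaceStatus.FAILED, "stopped").value)
```

With something like this, a workspace left in "failed" after the timeout would be moved to "stopped" once the instance finishes stopping, instead of needing a retry that starts the instance again.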
Steps to Reproduce
- Create an EC2 workspace for a p3.2xlarge GPU instance
- Once it's running, stop the instance from the Coder UI
- The aws_instance resource will time out at 10 minutes. Increasing the timeout period in Terraform has no effect on this.
- Once the operation times out, the workspace will have status “failed”. However, the AWS instance is actually “stopped”
- The only possible option from the “failed” state is “retry”
- Running “retry” will change the workspace to have state “stopped”. However, the AWS instance is actually “started”
Environment
- Host OS: Ubuntu 22.04
- Coder version: 2.22.1
Additional Context
The issue occurs consistently.