bug: UI retry doesn't work after failed shutdown #19112

@mambon2

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

The retry button doesn't work after a failed shutdown. If the underlying AWS instance has already stopped, a retry starts the instance again while changing the workspace status in Coder to "stopped".

Please add "sync" functionality that synchronizes the state of the Coder workspace with the state of the instance in AWS.

Relevant Log Output

Error: updating EC2 Instance (i-0bc1806a0f3973987) user data: waiting for EC2 Instance (i-0bc1806a0f3973987) stop: timeout while waiting for state to become 'stopped' (last state: 'stopping', timeout: 10m0s)
on main.tf line 211, in resource "aws_instance" "dev":
  211: resource "aws_instance" "dev" {

Expected Behavior

The Coder workspace state should sync with the state of the AWS instance.
The issue is described in detail in this ticket: https://help.coder.com/hc/en-us/requests/4024
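
To make the requested behavior concrete, here is a minimal sketch of what a sync step might decide. The EC2 state names are the ones the EC2 API reports; the workspace status names, the mapping, and the function `reconcile_status` are assumptions for illustration only and are not Coder's actual implementation.

```python
from typing import Optional

# EC2 instance state names as reported by the EC2 API, mapped to
# workspace statuses. The mapping is a hypothetical illustration,
# not how Coder represents or derives workspace state.
EC2_TO_WORKSPACE_STATUS = {
    "pending": "starting",
    "running": "running",
    "stopping": "stopping",
    "stopped": "stopped",
    "shutting-down": "deleting",
    "terminated": "deleted",
}


def reconcile_status(recorded_status: str, ec2_state: str) -> Optional[str]:
    """Return the status the workspace should show, or None if it already matches."""
    observed = EC2_TO_WORKSPACE_STATUS.get(ec2_state)
    if observed is None:
        raise ValueError(f"unknown EC2 state: {ec2_state!r}")
    return None if observed == recorded_status else observed
```

With this sketch, `reconcile_status("failed", "stopped")` yields `"stopped"` (the situation after the timeout), and `reconcile_status("stopped", "running")` yields `"running"` (the situation after the retry).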

Steps to Reproduce

  1. Create an EC2 workspace for a p3.2xlarge GPU instance.
  2. Once it's running, stop the instance from the Coder UI.
  3. The aws_instance operation times out after 10 minutes. Increasing the timeout period in Terraform has no effect on this.
  4. Once the operation times out, the workspace has the status “failed”. However, the AWS instance is actually “stopped”.
  5. The only available option from the “failed” state is “retry”.
  6. Running “retry” changes the workspace state to “stopped”. However, the AWS instance has actually been “started”.
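
To confirm the mismatch in steps 4 and 6 independently of the Coder UI, the instance state can be read straight from AWS. The following is a rough sketch using the boto3 EC2 client's `describe_instances` call; the helper name `get_instance_state` is hypothetical, and it assumes AWS credentials and a region are already configured.

```python
def get_instance_state(ec2_client, instance_id: str) -> str:
    """Return the EC2 state name, e.g. 'stopping', 'stopped', or 'running'."""
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    reservations = response["Reservations"]
    if not reservations or not reservations[0]["Instances"]:
        raise LookupError(f"instance not found: {instance_id}")
    return reservations[0]["Instances"][0]["State"]["Name"]


# Example usage (needs boto3 plus valid AWS credentials and region):
#   import boto3
#   ec2 = boto3.client("ec2")
#   print(get_instance_state(ec2, "i-0bc1806a0f3973987"))
```

Running this after step 4 and again after step 6 shows the state AWS reports, which can then be compared with the status displayed for the workspace.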

Environment

  • Host OS: Ubuntu 22.04
  • Coder version: 2.22.1

Additional Context

The issue occurs consistently.

Labels

  • needs-triage
