Merged
feat(coderd/database/dbpurge): add retention for workspace agent logs
Replace hardcoded 7-day retention for workspace agent logs with configurable
retention from deployment settings. Falls back to global retention when not
set, and skips deletion entirely when effective retention is 0.

Depends on #21038
Updates #20743
mafredri committed Dec 2, 2025
commit 62e3ab8af0001f0a4680dfbbdb09927521ad72f2
5 changes: 5 additions & 0 deletions cli/testdata/coder_server_--help.golden
@@ -717,6 +717,11 @@ that data type.
How long connection log entries are retained. Set to 0 to disable
(keep indefinitely).

--workspace-agent-logs-retention duration, $CODER_WORKSPACE_AGENT_LOGS_RETENTION (default: 7d)
How long workspace agent logs are retained. Logs from non-latest
workspace builds are deleted after this period to free up storage
space. Set to 0 to disable automatic deletion of workspace agent logs.

TELEMETRY OPTIONS:
Telemetry is critical to our ability to improve Coder. We strip all personal
information before sending data to our servers. Please only disable telemetry
5 changes: 5 additions & 0 deletions cli/testdata/server-config.yaml.golden
@@ -761,3 +761,8 @@ retention:
# an expired key. Set to 0 to disable automatic deletion of expired keys.
# (default: 7d, type: duration)
api_keys: 168h0m0s
# How long workspace agent logs are retained. Logs from non-latest workspace
# builds are deleted after this period to free up storage space. Set to 0 to
# disable automatic deletion of workspace agent logs.
# (default: 7d, type: duration)
workspace_agent_logs: 168h0m0s
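
The golden file shows `168h0m0s` for a `7d` default, which matches how Go's `time.Duration` prints seven days. A small sketch, assuming a day is exactly 24 hours; `days` is a hypothetical helper used only for illustration and is not part of this change.

```go
package main

import (
	"fmt"
	"time"
)

// days is a hypothetical helper that converts whole days to a time.Duration,
// assuming every day is exactly 24 hours.
func days(n int) time.Duration {
	return time.Duration(n) * 24 * time.Hour
}

func main() {
	// Seven days rendered in Go's duration notation.
	fmt.Println(days(7))
	// Zero renders as "0s", the form used to express "disabled".
	fmt.Println(days(0))
}
```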
4 changes: 4 additions & 0 deletions coderd/apidoc/docs.go

4 changes: 4 additions & 0 deletions coderd/apidoc/swagger.json

15 changes: 10 additions & 5 deletions coderd/database/dbpurge/dbpurge.go
@@ -18,8 +18,7 @@ import (
)

const (
-delay = 10 * time.Minute
-maxAgentLogAge = 7 * 24 * time.Hour
+delay = 10 * time.Minute
// Connection events are now inserted into the `connection_logs` table.
// We'll slowly remove old connection events from the `audit_logs` table.
// The `connection_logs` table is purged based on the configured retention.
@@ -66,9 +65,15 @@ func New(ctx context.Context, logger slog.Logger, db database.Store, vals *coder
return nil
}

-deleteOldWorkspaceAgentLogsBefore := start.Add(-maxAgentLogAge)
-if err := tx.DeleteOldWorkspaceAgentLogs(ctx, deleteOldWorkspaceAgentLogsBefore); err != nil {
-	return xerrors.Errorf("failed to delete old workspace agent logs: %w", err)
+workspaceAgentLogsRetention := vals.Retention.WorkspaceAgentLogs.Value()
+if workspaceAgentLogsRetention == 0 {
+	workspaceAgentLogsRetention = vals.Retention.Global.Value()
+}
+if workspaceAgentLogsRetention > 0 {
+	deleteOldWorkspaceAgentLogsBefore := start.Add(-workspaceAgentLogsRetention)
+	if err := tx.DeleteOldWorkspaceAgentLogs(ctx, deleteOldWorkspaceAgentLogsBefore); err != nil {
+		return xerrors.Errorf("failed to delete old workspace agent logs: %w", err)
+	}
}
if err := tx.DeleteOldWorkspaceAgentStats(ctx); err != nil {
return xerrors.Errorf("failed to delete old workspace agent stats: %w", err)
98 changes: 97 additions & 1 deletion coderd/database/dbpurge/dbpurge_test.go
@@ -246,7 +246,11 @@ func TestDeleteOldWorkspaceAgentLogs(t *testing.T) {
// After dbpurge completes, the ticker is reset. Trap this call.

done := awaitDoTick(ctx, t, clk)
-closer := dbpurge.New(ctx, logger, db, &codersdk.DeploymentValues{}, clk)
+closer := dbpurge.New(ctx, logger, db, &codersdk.DeploymentValues{
+	Retention: codersdk.RetentionConfig{
+		WorkspaceAgentLogs: serpent.Duration(7 * 24 * time.Hour),
+	},
+}, clk)
defer closer.Close()
<-done // doTick() has now run.

@@ -392,6 +396,98 @@ func mustCreateAgentLogs(ctx context.Context, t *testing.T, db database.Store, a
require.NotEmpty(t, agentLogs, "agent logs must be present")
}

func TestDeleteOldWorkspaceAgentLogsRetention(t *testing.T) {
t.Parallel()

now := time.Date(2025, 1, 15, 7, 30, 0, 0, time.UTC)

testCases := []struct {
name string
retentionConfig codersdk.RetentionConfig
logsAge time.Duration
expectDeleted bool
}{
{
name: "RetentionEnabled",
retentionConfig: codersdk.RetentionConfig{
WorkspaceAgentLogs: serpent.Duration(7 * 24 * time.Hour), // 7 days
},
logsAge: 8 * 24 * time.Hour, // 8 days ago
expectDeleted: true,
},
{
name: "RetentionDisabled",
retentionConfig: codersdk.RetentionConfig{
WorkspaceAgentLogs: serpent.Duration(0),
},
logsAge: 60 * 24 * time.Hour, // 60 days ago
expectDeleted: false,
},
{
name: "GlobalRetentionFallback",
retentionConfig: codersdk.RetentionConfig{
Global: serpent.Duration(14 * 24 * time.Hour), // 14 days global
WorkspaceAgentLogs: serpent.Duration(0), // Not set, falls back to global
},
logsAge: 15 * 24 * time.Hour, // 15 days ago
expectDeleted: true,
},
{
name: "CustomRetention30Days",
retentionConfig: codersdk.RetentionConfig{
WorkspaceAgentLogs: serpent.Duration(30 * 24 * time.Hour), // 30 days
},
logsAge: 31 * 24 * time.Hour, // 31 days ago
expectDeleted: true,
},
}

for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
t.Parallel()

ctx := testutil.Context(t, testutil.WaitShort)
clk := quartz.NewMock(t)
clk.Set(now).MustWait(ctx)

oldTime := now.Add(-tc.logsAge)

db, _ := dbtestutil.NewDB(t, dbtestutil.WithDumpOnFailure())
logger := slogtest.Make(t, &slogtest.Options{IgnoreErrors: true})
org := dbgen.Organization(t, db, database.Organization{})
user := dbgen.User(t, db, database.User{})
_ = dbgen.OrganizationMember(t, db, database.OrganizationMember{UserID: user.ID, OrganizationID: org.ID})
tv := dbgen.TemplateVersion(t, db, database.TemplateVersion{OrganizationID: org.ID, CreatedBy: user.ID})
tmpl := dbgen.Template(t, db, database.Template{OrganizationID: org.ID, ActiveVersionID: tv.ID, CreatedBy: user.ID})

ws := dbgen.Workspace(t, db, database.WorkspaceTable{Name: "test-ws", OwnerID: user.ID, OrganizationID: org.ID, TemplateID: tmpl.ID})
wb1 := mustCreateWorkspaceBuild(t, db, org, tv, ws.ID, oldTime, 1)
wb2 := mustCreateWorkspaceBuild(t, db, org, tv, ws.ID, oldTime, 2)
agent1 := mustCreateAgent(t, db, wb1)
agent2 := mustCreateAgent(t, db, wb2)
mustCreateAgentLogs(ctx, t, db, agent1, &oldTime, "agent 1 logs")
mustCreateAgentLogs(ctx, t, db, agent2, &oldTime, "agent 2 logs")

// Run the purge.
done := awaitDoTick(ctx, t, clk)
closer := dbpurge.New(ctx, logger, db, &codersdk.DeploymentValues{
Retention: tc.retentionConfig,
}, clk)
defer closer.Close()
testutil.TryReceive(ctx, t, done)

// Verify results.
if tc.expectDeleted {
assertNoWorkspaceAgentLogs(ctx, t, db, agent1.ID)
} else {
assertWorkspaceAgentLogs(ctx, t, db, agent1.ID, "agent 1 logs")
}
// Latest build logs are always retained.
assertWorkspaceAgentLogs(ctx, t, db, agent2.ID, "agent 2 logs")
})
}
}

//nolint:paralleltest // It uses LockIDDBPurge.
func TestDeleteOldProvisionerDaemons(t *testing.T) {
// TODO: must refactor DeleteOldProvisionerDaemons to allow passing in cutoff
14 changes: 14 additions & 0 deletions codersdk/deployment.go
@@ -829,6 +829,9 @@ type RetentionConfig struct {
// Keys are only deleted if they have been expired for at least this duration.
// Defaults to 7 days to preserve existing behavior.
APIKeys serpent.Duration `json:"api_keys" typescript:",notnull"`
// WorkspaceAgentLogs controls how long workspace agent logs are retained.
// Defaults to 7 days to preserve existing behavior.
WorkspaceAgentLogs serpent.Duration `json:"workspace_agent_logs" typescript:",notnull"`
}

type NotificationsConfig struct {
@@ -3420,6 +3423,17 @@ Write out the current server config as YAML to stdout.`,
YAML: "api_keys",
Annotations: serpent.Annotations{}.Mark(annotationFormatDuration, "true"),
},
{
Name: "Workspace Agent Logs Retention",
Description: "How long workspace agent logs are retained. Logs from non-latest workspace builds are deleted after this period to free up storage space. Set to 0 to disable automatic deletion of workspace agent logs.",
Flag: "workspace-agent-logs-retention",
Env: "CODER_WORKSPACE_AGENT_LOGS_RETENTION",
Value: &c.Retention.WorkspaceAgentLogs,
Default: "7d",
Group: &deploymentGroupRetention,
YAML: "workspace_agent_logs",
Annotations: serpent.Annotations{}.Mark(annotationFormatDuration, "true"),
},
{
Name: "Enable Authorization Recordings",
Description: "All api requests will have a header including all authorization calls made during the request. " +
41 changes: 29 additions & 12 deletions docs/admin/setup/data-retention.md
@@ -1,8 +1,8 @@
# Data Retention

Coder supports configurable retention policies that automatically purge old
-Audit Logs, Connection Logs, and API keys. These policies help manage database
-growth by removing records older than a specified duration.
+Audit Logs, Connection Logs, Workspace Agent Logs, and API keys. These policies
+help manage database growth by removing records older than a specified duration.

## Overview

@@ -16,7 +16,8 @@ Retention policies help you:

> [!NOTE]
> Retention policies are disabled by default (set to `0`) to preserve existing
-> behavior. The only exception is API keys, which defaults to 7 days.
+> behavior. The exceptions are API keys and workspace agent logs, which default
+> to 7 days.

## Configuration

@@ -25,11 +26,12 @@ a YAML configuration file.

### Settings

-| Setting | CLI Flag | Environment Variable | Default | Description |
-|-----------------|-------------------------------|-----------------------------------|----------------|--------------------------------------|
-| Audit Logs | `--audit-logs-retention` | `CODER_AUDIT_LOGS_RETENTION` | `0` (disabled) | How long to retain Audit Log entries |
-| Connection Logs | `--connection-logs-retention` | `CODER_CONNECTION_LOGS_RETENTION` | `0` (disabled) | How long to retain Connection Logs |
-| API Keys | `--api-keys-retention` | `CODER_API_KEYS_RETENTION` | `7d` | How long to retain expired API keys |
+| Setting | CLI Flag | Environment Variable | Default | Description |
+|----------------------|------------------------------------|----------------------------------------|----------------|------------------------------------------|
+| Audit Logs | `--audit-logs-retention` | `CODER_AUDIT_LOGS_RETENTION` | `0` (disabled) | How long to retain Audit Log entries |
+| Connection Logs | `--connection-logs-retention` | `CODER_CONNECTION_LOGS_RETENTION` | `0` (disabled) | How long to retain Connection Logs |
+| API Keys | `--api-keys-retention` | `CODER_API_KEYS_RETENTION` | `7d` | How long to retain expired API keys |
+| Workspace Agent Logs | `--workspace-agent-logs-retention` | `CODER_WORKSPACE_AGENT_LOGS_RETENTION` | `7d` | How long to retain workspace agent logs |

### Duration Format

@@ -48,7 +50,8 @@ Go duration units (`h`, `m`, `s`):
coder server \
--audit-logs-retention=365d \
--connection-logs-retention=90d \
---api-keys-retention=7d
+--api-keys-retention=7d \
+--workspace-agent-logs-retention=7d
```

### Environment Variables Example
@@ -57,6 +60,7 @@ coder server \
export CODER_AUDIT_LOGS_RETENTION=365d
export CODER_CONNECTION_LOGS_RETENTION=90d
export CODER_API_KEYS_RETENTION=7d
export CODER_WORKSPACE_AGENT_LOGS_RETENTION=7d
```

### YAML Configuration Example
@@ -66,6 +70,7 @@ retention:
audit_logs: 365d
connection_logs: 90d
api_keys: 7d
workspace_agent_logs: 7d
```

## How Retention Works
@@ -100,6 +105,16 @@ ago. Active keys are never deleted by the retention policy.
Keeping expired keys for a short period allows Coder to return a more helpful
error message when users attempt to use an expired key.

### Workspace Agent Logs Behavior

Workspace agent logs are retained based on the retention period, but **logs from
the latest build of each workspace are always retained** regardless of age. This
ensures you can always debug issues with active workspaces.

Only logs from non-latest workspace builds that are older than the retention
period are deleted. Setting `--workspace-agent-logs-retention=7d` keeps all logs
from the latest build plus logs from previous builds for up to 7 days.

## Best Practices

### Recommended Starting Configuration
@@ -111,6 +126,7 @@ retention:
audit_logs: 365d
connection_logs: 90d
api_keys: 7d
workspace_agent_logs: 7d
```

### Compliance Considerations
@@ -150,9 +166,10 @@ To keep data indefinitely for any data type, set its retention value to `0`:

```yaml
retention:
-audit_logs: 0s # Keep audit logs forever
-connection_logs: 0s # Keep connection logs forever
-api_keys: 0s # Keep expired API keys forever
+audit_logs: 0s # Keep audit logs forever
+connection_logs: 0s # Keep connection logs forever
+api_keys: 0s # Keep expired API keys forever
+workspace_agent_logs: 0s # Keep workspace agent logs forever
```

## Monitoring
4 changes: 3 additions & 1 deletion docs/reference/api/general.md

24 changes: 16 additions & 8 deletions docs/reference/api/schemas.md

11 changes: 11 additions & 0 deletions docs/reference/cli/server.md
