perf: optimize GetTemplateAppInsightsByTemplate by pre-filtering on start/end times #20669
-- name: GetTemplateAppInsightsByTemplate :many
-- GetTemplateAppInsightsByTemplate is used for Prometheus metrics. Keep
-- in sync with GetTemplateAppInsights and UpsertTemplateUsageStats.
Double checking: is there anything to update in GetTemplateAppInsights or UpsertTemplateUsageStats?
Oh nice catch, I hadn't seen that part of the comment. Will double check.
@zedkipp UpsertTemplateUsageStats uses the same workspace_app_stats table, so we can apply the same change there. GetTemplateAppInsights already uses a table with pre-aggregated data and does no expensive calculation post-join, so I don't believe there's any benefit to applying the same optimization to the join there. We can reevaluate later if that query shows up in query insights as being expensive.
start/end times Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
391b435 to 2074695
In this PR we're optimizing the `GetTemplateAppInsightsByTemplate` query by pre-filtering out apps that do not have an active session during the start/end time window.

Note: as of Nov 4 this is our most expensive query internally in terms of DB load. Although it is only called ~900 times in a 24h period, it has a 1s average execution time and a minimum of ~400ms.

This query currently looks at all entries in `workspace_app_stats` (IIUC each row is an app session), then splits them into per-minute buckets before filtering out buckets that do not fall within our start/end time window. This leads to expensive sequential scans and joins over data that will eventually be thrown away for being outside the time range.

Instead, we can pre-filter out the majority of the buckets that need to be thrown away by only retrieving entries from `workspace_app_stats` for sessions where at least some portion of the session's active time range is within our start/end time range. We keep the existing bucket filter to ensure we still drop buckets that fall outside the time range.

The default time window for the query is 5 minutes, so start = `time.Now() - 5m` and end = `time.Now()`. Using `EXPLAIN` and a specific 5 minute time window from earlier today (2025-11-03 18:00–18:05 UTC) the difference is pretty obvious.

I compared/verified the output of the queries, the filtered rows and distinct app names/template IDs, and that `terminal` is still not considered an app.
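
To make the pre-filter idea concrete, here is a minimal in-memory sketch in Go. It is not the SQL from this PR: `session`, `overlaps`, `minuteBuckets`, `countBuckets`, `sampleWindow`, `sampleSessions`, and the app names are all hypothetical stand-ins. It assumes sessions are half-open ranges `[start, end)` and that the query window is minute-aligned; under those assumptions, dropping non-overlapping sessions before bucketing gives the same per-app bucket counts while expanding far fewer buckets.

```go
package main

import (
	"fmt"
	"time"
)

// session is a hypothetical stand-in for one workspace_app_stats row.
type session struct {
	app        string
	start, end time.Time // active range, treated as [start, end)
}

// overlaps reports whether any part of the session falls in [winStart, winEnd).
func overlaps(s session, winStart, winEnd time.Time) bool {
	return s.start.Before(winEnd) && s.end.After(winStart)
}

// minuteBuckets expands a session into the minute-aligned buckets it touches.
func minuteBuckets(s session) []time.Time {
	var out []time.Time
	for t := s.start.Truncate(time.Minute); t.Before(s.end); t = t.Add(time.Minute) {
		out = append(out, t)
	}
	return out
}

// countBuckets returns the number of in-window buckets per app and the total
// number of buckets expanded. With prefilter set, sessions that do not overlap
// the window are skipped before any bucket is generated.
func countBuckets(sessions []session, winStart, winEnd time.Time, prefilter bool) (map[string]int, int) {
	counts := make(map[string]int)
	expanded := 0
	for _, s := range sessions {
		if prefilter && !overlaps(s, winStart, winEnd) {
			continue
		}
		for _, b := range minuteBuckets(s) {
			expanded++
			// The existing bucket filter stays in place in both modes.
			if !b.Before(winStart) && b.Before(winEnd) {
				counts[s.app]++
			}
		}
	}
	return counts, expanded
}

// sampleWindow returns a minute-aligned 5 minute window (illustrative values).
func sampleWindow() (time.Time, time.Time) {
	start := time.Date(2025, 11, 3, 18, 0, 0, 0, time.UTC)
	return start, start.Add(5 * time.Minute)
}

// sampleSessions returns made-up sessions around the sample window.
func sampleSessions() []session {
	at := func(h, m int) time.Time { return time.Date(2025, 11, 3, h, m, 0, 0, time.UTC) }
	return []session{
		{"code-server", at(17, 30), at(17, 45)}, // entirely before the window
		{"jupyter", at(17, 58), at(18, 2)},      // overlaps the window start
		{"code-server", at(18, 3), at(18, 10)},  // overlaps the window end
		{"jupyter", at(18, 5), at(18, 20)},      // starts exactly at the window end
	}
}

func main() {
	winStart, winEnd := sampleWindow()
	all, nAll := countBuckets(sampleSessions(), winStart, winEnd, false)
	pre, nPre := countBuckets(sampleSessions(), winStart, winEnd, true)
	fmt.Println("buckets expanded without pre-filter:", nAll)
	fmt.Println("buckets expanded with pre-filter:   ", nPre)
	fmt.Println("code-server:", all["code-server"], pre["code-server"])
	fmt.Println("jupyter:    ", all["jupyter"], pre["jupyter"])
}
```

With this sample data the unfiltered path expands 41 buckets and the pre-filtered path expands 11, while both report 2 in-window buckets for each app. The overlap test is the usual interval-intersection check; if the window were not minute-aligned, a session starting inside the final partial minute could be dropped by the pre-filter while its truncated bucket would still pass the bucket filter, so the sketch relies on that alignment assumption.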