Improve scalability of get-license action #134457
Today this action runs on the transport worker thread and forwards the request on to the master by default. It turns out that Elastic Agent uses this API as a readiness check whenever opening a new connection, so a thundering herd of thousands of agents can prevent the transport worker threads from doing more useful work for far too long, leading to high latency and timeouts. This commit changes the default behaviour to run the action on the local node rather than forwarding to the master (although the option remains to specify `?local=false`) and dispatches the work off the transport worker thread early.
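To illustrate the dispatch idea in isolation, here is a minimal Python sketch. It is not Elasticsearch code: `lookup_license`, `handle_inline`, `handle_forked` and the `other-pool` thread pool are hypothetical names invented for this illustration, and the calling thread merely plays the role of a transport worker. It shows that once the work is handed to another pool, none of it runs on the calling thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def lookup_license():
    """Hypothetical stand-in for the license lookup.

    Returns the name of the thread it ran on, so the demo can show
    where the work actually happened.
    """
    return threading.current_thread().name


def handle_inline(n_requests):
    # Every request is served on the calling thread, which plays the
    # role of the transport worker and stays busy for the whole batch.
    return [lookup_license() for _ in range(n_requests)]


def handle_forked(n_requests, pool):
    # The calling thread only enqueues tasks; the lookups run on the pool.
    futures = [pool.submit(lookup_license) for _ in range(n_requests)]
    # Results are gathered here only so the demo can inspect them.
    return [f.result() for f in futures]


if __name__ == "__main__":
    # "other-pool" is an assumed name, not a real Elasticsearch thread pool.
    with ThreadPoolExecutor(max_workers=2, thread_name_prefix="other-pool") as pool:
        print("inline ran on:", set(handle_inline(1000)))
        print("forked ran on:", set(handle_forked(1000, pool)))
```

In a real server the response would be sent from the pool thread via a callback rather than collected by the caller; the sketch waits on the futures only to make the thread names observable.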
Pinging @elastic/es-security (Team:Security)
Hi @DaveCTurner, I've created a changelog YAML for you.
The default for the `?local` parameter to the `GET _license` API changed from `false` to `true` in elastic/elasticsearch#134457. This commit adjusts the documentation to match.
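For callers that want to be explicit about the parameter rather than rely on the default, a small Python sketch of building the request URL follows. The helper name `license_url` and the base address `http://localhost:9200` are assumptions made for illustration; only the `_license` endpoint and the `local` parameter come from the discussion above.

```python
from urllib.parse import urlencode


def license_url(base_url, local=None):
    """Build the URL for GET _license (hypothetical helper).

    local=None omits the parameter, so the server-side default applies.
    """
    url = base_url.rstrip("/") + "/_license"
    if local is not None:
        # Boolean query parameters are written as the strings true/false.
        url += "?" + urlencode({"local": "true" if local else "false"})
    return url


if __name__ == "__main__":
    base = "http://localhost:9200"  # assumed address of a local node
    print(license_url(base))               # relies on the default
    print(license_url(base, local=False))  # explicitly requests ?local=false
```

Passing `local=False` pins the request to the `?local=false` behaviour regardless of which default the target version uses.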
ywangd left a comment
LGTM
I think it means a default GetLicense request now forks twice: First in its REST handler and second time in TransportMasterNodeAction. I think it's fine. Just wanted to call it out explicitly.
Yes you're right; I'd forgotten about that second dispatch in […]. It concerns me slightly that any non-local […]
* Update get-license default for ?local

  The default for the `?local` parameter to the `GET _license` API changed from `false` to `true` in elastic/elasticsearch#134457. This commit adjusts the documentation to match.

* Explain false