Description
We are running a large-scale Elasticsearch 7 cluster. While profiling the master node, we observed that about 50% of its CPU time is spent in TransportClusterHealthAction.executeHealth. Half of that time is spent in the validateRequest method and the other half in getResponse. Both methods perform the same heavy operation: each builds a ClusterHealthResponse, first to validate it and then again to return it. For our workload, building the response only once would save about 25% of the master node's overall CPU time and presumably make cluster health requests respond about twice as fast.
I have already written the small code change needed to optimize this (just a new nullable method validateRequestAndGetResponse). Would you like me to submit this as a PR?
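To illustrate the idea, here is a minimal, self-contained sketch of the build-once pattern. It is not the actual Elasticsearch code or my patch: `HealthSketch`, `Health`, `buildHealth`, `isValid`, and the int-array "cluster state" are hypothetical stand-ins, and only the method name validateRequestAndGetResponse comes from the proposal above. I am also assuming here that "nullable" means the method returns null when validation fails.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins only; these are NOT the real Elasticsearch classes.
final class HealthSketch {

    /** Simplified stand-in for ClusterHealthResponse. */
    static final class Health {
        final int activeShards;
        final boolean green;

        Health(int activeShards, boolean green) {
            this.activeShards = activeShards;
            this.green = green;
        }
    }

    /** Counts how often the expensive build runs, to make the saving visible. */
    static final AtomicInteger builds = new AtomicInteger();

    /** Stand-in for the expensive pass over the cluster state (1 = active shard). */
    static Health buildHealth(int[] shardStates) {
        builds.incrementAndGet();
        int active = 0;
        for (int state : shardStates) {
            if (state == 1) {
                active++;
            }
        }
        return new Health(active, active == shardStates.length);
    }

    /** Stand-in for the wait-condition check performed on a built response. */
    static boolean isValid(Health health, boolean waitForGreen) {
        return !waitForGreen || health.green;
    }

    /** Current shape: one build to validate, a second build to return. */
    static Health validateThenGet(int[] shardStates, boolean waitForGreen) {
        if (!isValid(buildHealth(shardStates), waitForGreen)) {
            return null;
        }
        return buildHealth(shardStates);
    }

    /** Proposed shape: build once, return the response if valid, else null. */
    static Health validateRequestAndGetResponse(int[] shardStates, boolean waitForGreen) {
        Health health = buildHealth(shardStates);
        return isValid(health, waitForGreen) ? health : null;
    }
}
```

In this sketch the caller treats a null result as "conditions not met yet" and a non-null result as the response to send, so the expensive build runs once per attempt instead of twice.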