@afoucret
Contributor

afoucret commented Dec 4, 2025

Added implicit, configurable limits for COMPLETION and RERANK

  • Execution limits
    • The maximum number of rows processed by COMPLETION and RERANK must be bounded by default.
    • When the input dataset exceeds this limit, it should be truncated before the command executes.
    • Administrators must be able to override or disable these limits through cluster settings:
| Setting | Description | Default |
| --- | --- | --- |
| esql.command.completion.enabled | Enable or disable the COMPLETION command | true |
| esql.command.completion.limit | Maximum number of rows to be processed by COMPLETION | 100 |
| esql.command.rerank.enabled | Enable or disable the RERANK command | true |
| esql.command.rerank.limit | Maximum number of rows to be processed by RERANK | 1000 |
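
The truncation rule described above can be sketched roughly as follows. This is only an illustration of the intended semantics, not the code in this PR: the class `ImplicitLimitSketch` and the method `applyImplicitLimit` are hypothetical names, and throwing an exception when a command is disabled is an assumption. The default values 100 and 1000 come from the settings table.

```java
import java.util.List;

// Hypothetical illustration only: not the Elasticsearch implementation.
final class ImplicitLimitSketch {

    // Defaults taken from the settings table in this description.
    static final int DEFAULT_COMPLETION_LIMIT = 100;
    static final int DEFAULT_RERANK_LIMIT = 1000;

    /**
     * Returns the rows the command would actually process.
     * Assumption: a disabled command is rejected with an exception.
     */
    static <T> List<T> applyImplicitLimit(List<T> rows, boolean enabled, int rowLimit) {
        if (!enabled) {
            throw new IllegalStateException("command is disabled by cluster settings");
        }
        if (rows.size() <= rowLimit) {
            // Input already fits within the limit: nothing to truncate.
            return rows;
        }
        // Truncate before the command executes, keeping the first rowLimit rows.
        return rows.subList(0, rowLimit);
    }
}
```

With 250 input rows, this sketch would pass 100 rows to COMPLETION and all 250 rows to RERANK under the default limits.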

Updated documentation

  • Documentation.
    • < 9.3
      • COMPLETION and RERANK documentation must explicitly warn about risks.
      • Documentation must strongly recommend adding a LIMIT before COMPLETION and RERANK.
    • >= 9.3.0 Added a note about the implicit limit in RERANK and COMPLETION

Others:

  • Added integration tests for COMPLETION and RERANK so that we can have test scenarios using settings

Closes: #136861

@afoucret changed the title from "Esql usage limit v3" to "[ESQL][Inference] Introduce usage limits for COMPLETION and RERANK" on Dec 8, 2025
@afoucret added the >enhancement and :Search Relevance/ES|QL (Search functionality in ES|QL) labels on Dec 8, 2025
@elasticsearchmachine
Collaborator

Hi @afoucret, I've created a changelog YAML for you.

@afoucret force-pushed the esql-usage-limit-v3 branch from afdfd7c to 2f2c354 on December 8, 2025 09:49
@afoucret marked this pull request as ready for review on December 8, 2025 10:30
@elasticsearchmachine added the Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch) label on Dec 8, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@github-actions
Contributor

github-actions bot commented Dec 8, 2025

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

🤔 Need help?

@afoucret force-pushed the esql-usage-limit-v3 branch from ff1608f to b414480 on December 9, 2025 09:12
@afoucret linked an issue on Dec 9, 2025 that may be closed by this pull request
@afoucret closed this on Dec 9, 2025
@afoucret reopened this on Dec 9, 2025
Source.readFrom((PlanStreamInput) in),
in.readNamedWriteable(LogicalPlan.class),
in.readNamedWriteable(Expression.class),
in.getTransportVersion().supports(ESQL_INFERENCE_ROW_LIMIT_TRANSPORT_VERSION)
Contributor

We don't need to introduce a new transport version; in fact, we don't need this method at all.
Now that RERANK and COMPLETION have an implicit limit, they will always be executed on the coordinator, meaning we never need to send them to the data nodes.
So we can simplify this further: remove the NamedWriteable and just throw an exception if we ever need to serialize them (which would be a bug, on a code path that should never be reached):

@Override
public void writeTo(StreamOutput out) {
    throw new UnsupportedOperationException("not serialized");
}

@Override
public String getWriteableName() {
    throw new UnsupportedOperationException("not serialized");
}

ChangePoint, Fuse, Fork etc are also not serialized.

Contributor Author

Completely removed the serialization logic for Rerank and Completion logical and physical plans.

try (var resp = run(query)) {
    List<List<Object>> values = getValuesList(resp);
    // Should be limited by the default row limit (100)
    assertThat(values.size(), lessThanOrEqualTo(100));
Contributor

The index we create always has 6 documents, so we are not really testing the limit enforcement.
Let's change createAndPopulateTestIndex so that we create an index with more than 100 documents when we test COMPLETION and more than 1000 when we test RERANK.
And here we should check that we get exactly 100 documents back.
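
To illustrate the concern, here is a rough, self-contained sketch that uses no Elasticsearch classes. `LimitAssertionSketch`, `run`, `weakCheck` and `strictCheck` are hypothetical names, and the `limitEnforced` flag simply models a build where the implicit limit is (or is not) applied. It shows that an "at most 100" assertion on a 6-document index passes even when the limit is never applied, while an "exactly 100" assertion on a larger index catches that case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the integration test; no Elasticsearch classes are used.
final class LimitAssertionSketch {

    /** Fake query result: docCount rows, truncated to limit only when limitEnforced is true. */
    static List<Integer> run(int docCount, int limit, boolean limitEnforced) {
        int rowCount = limitEnforced ? Math.min(docCount, limit) : docCount;
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            rows.add(i);
        }
        return rows;
    }

    /** Weak check: passes whenever the result is not larger than the limit. */
    static boolean weakCheck(List<Integer> rows, int limit) {
        return rows.size() <= limit;
    }

    /** Strict check: passes only when exactly limit rows come back. */
    static boolean strictCheck(List<Integer> rows, int limit) {
        return rows.size() == limit;
    }
}
```

In this sketch, `weakCheck(run(6, 100, false), 100)` is true even though nothing was truncated, whereas `strictCheck(run(150, 100, false), 100)` is false and `strictCheck(run(150, 100, true), 100)` is true.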

Contributor Author

Tests have been updated.


try (var resp = run(query)) {
    List<List<Object>> values = getValuesList(resp);
    assertThat(values.size(), lessThanOrEqualTo(customLimit));
Contributor

Let's check that we get exactly customLimit docs back (take a look at the other comment that suggests changing createAndPopulateTestIndex).

Contributor Author

Tests have been updated.

@leemthompo
Contributor

Docs preview LGTM, nice!

@afoucret requested a review from ioanatia on December 9, 2025 11:45
public EsqlStatement parse(
String query,
QueryParams params,
SettingsValidationContext settingsValidationCtx,
Contributor

I wonder if we should consider embedding inferenceSettings in SettingsValidationContext?
I know that SettingsValidationContext serves a different purpose, so maybe we can just have a ValidationContext that is initialized in EsqlSession and can receive whatever context is necessary for validation during parsing?
Happy to do this as a separate follow-up and not in this PR, since it would increase the scope.

@afoucret enabled auto-merge (squash) on December 9, 2025 13:10
@afoucret merged commit 3d754d2 into elastic:main on Dec 9, 2025
34 checks passed
Labels

>enhancement
:Search Relevance/ES|QL (Search functionality in ES|QL)
Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch)
v9.3.0

Development

Successfully merging this pull request may close these issues.

[ES|QL] Cannot use STATS after RERANK or COMPLETION
[ESQL][Inference] Introduce usage limits for COMPLETION and RERANK commands

4 participants