Merged
Changes from 7 commits
40 changes: 14 additions & 26 deletions docs/content/stable/additional-features/auto-analyze.md
@@ -4,8 +4,6 @@ headerTitle: Auto Analyze service
linkTitle: Auto Analyze
description: Use the Auto Analyze service to keep table statistics up to date
headcontent: Keep table statistics up to date automatically
tags:
feature: early-access
menu:
stable:
identifier: auto-analyze
@@ -22,40 +20,30 @@ Similar to [PostgreSQL autovacuum](https://www.postgresql.org/docs/current/routi

## Enable Auto Analyze

Before you can use the feature, you must enable it by setting `ysql_enable_auto_analyze_service` to true on all YB-Masters, and both `ysql_enable_auto_analyze_service` and `ysql_enable_table_mutation_counter` to true on all YB-TServers.
For new universes running v2025.2 or later, Auto Analyze is enabled by default when you deploy using yugabyted, YugabyteDB Anywhere, or YugabyteDB Aeon.

For example, to create a single-node [yugabyted](../../reference/configuration/yugabyted/) cluster with Auto Analyze enabled, use the following command:
In addition, when upgrading a deployment to v2025.2 or later, if the universe has the cost-based optimizer enabled (`on`), YugabyteDB will enable Auto Analyze.

You can explicitly enable or disable Auto Analyze by setting `ysql_enable_auto_analyze` on both YB-Master and YB-TServer.

For example, to create a single-node [yugabyted](../../reference/configuration/yugabyted/) cluster with Auto Analyze explicitly enabled, use the following command:

```sh
./bin/yugabyted start \
--master_flags "ysql_enable_auto_analyze_service=true" \
--tserver_flags "ysql_enable_auto_analyze_service=true,ysql_enable_table_mutation_counter=true"
--master_flags "ysql_enable_auto_analyze=true" \
--tserver_flags "ysql_enable_auto_analyze=true"
```

Enabling Auto Analyze on an existing cluster requires a rolling restart to set `ysql_enable_auto_analyze_service` and `ysql_enable_table_mutation_counter` to true.

## Configure Auto Analyze

You can control how frequently the service updates table statistics using the following YB-TServer flags:

- `ysql_auto_analyze_threshold` - the minimum number of mutations (INSERT, UPDATE, and DELETE) needed to run ANALYZE on a table. Default is 50.
- `ysql_auto_analyze_scale_factor` - a fraction that determines when enough mutations have been accumulated to run ANALYZE for a table. Default is 0.1.

Increasing either of these flags reduces the frequency of statistics updates.

If the total number of mutations for a table is greater than its analyze threshold, then the service runs ANALYZE on the table. The analyze threshold of a table is calculated as follows:

```sh
analyze_threshold = ysql_auto_analyze_threshold + (ysql_auto_analyze_scale_factor * <table_size>)
```

where `<table_size>` is the current `reltuples` column value stored in the `pg_class` catalog.
The auto analyze service counts the number of mutations (INSERT, UPDATE, and DELETE) to a table and triggers ANALYZE on the table automatically when certain thresholds are reached. You can configure this behavior using the following settings.

`ysql_auto_analyze_threshold` is important for small tables. With default settings, if a table has 100 rows and 20 are mutated, ANALYZE won't run as the threshold is not met, even though 20% of the rows are mutated.
A table needs to accumulate a minimum number of mutations before it is considered for ANALYZE. This minimum is the sum of:

On the other hand, `ysql_auto_analyze_scale_factor` is especially important for big tables. If a table has 1,000,000,000 rows, 10% (100,000,000 rows) would have to be mutated before ANALYZE runs. Set the scale factor to a lower value to allow for more frequent statistics collection for such large tables.
- A fraction of the table size. This is controlled by [ysql_auto_analyze_scale_factor](../../reference/configuration/yb-tserver/#ysql-auto-analyze-scale-factor). This setting defaults to 0.1, which translates to 10% of the current table size. Current table size is determined by the [reltuples](https://www.postgresql.org/docs/15/catalog-pg-class.html#:~:text=CREATE%20INDEX.-,reltuples,-float4) column value stored in the `pg_class` catalog entry for that table.
- A static count of [ysql_auto_analyze_threshold](../../reference/configuration/yb-tserver/#ysql-auto-analyze-threshold) (default 50) mutations. This setting keeps small tables, which easily meet the scale factor requirement, from being analyzed too aggressively.
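
As a rough illustration of the arithmetic described above, the following Python sketch adds the two components using the documented defaults. The function names are hypothetical, and the strict `>` comparison is an assumption; the service's exact boundary behavior is not stated here.

```python
def analyze_threshold(reltuples: float,
                      base_threshold: int = 50,
                      scale_factor: float = 0.1) -> float:
    """Minimum mutation count for a table (illustrative only).

    reltuples      -- row estimate from pg_class.reltuples
    base_threshold -- ysql_auto_analyze_threshold (default 50)
    scale_factor   -- ysql_auto_analyze_scale_factor (default 0.1)
    """
    return base_threshold + scale_factor * reltuples


def should_analyze(mutations: int, reltuples: float,
                   base_threshold: int = 50,
                   scale_factor: float = 0.1) -> bool:
    # Assumption: ANALYZE is considered once mutations exceed the minimum.
    return mutations > analyze_threshold(reltuples, base_threshold, scale_factor)


if __name__ == "__main__":
    # Small table: 100 rows, 20 mutated.
    print(analyze_threshold(100), should_analyze(20, 100))
    # Large table: one billion rows.
    print(analyze_threshold(1_000_000_000))
```

With the defaults, a 100-row table needs more than 60 mutations, so 20 mutated rows (20% of the table) would not trigger ANALYZE. A 1,000,000,000-row table would need more than 100,000,050 mutations, which is why a lower scale factor can make sense for very large tables.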

In addition, `ysql_auto_analyze_batch_size` controls the maximum number of tables the Auto Analyze service tries to analyze in a single ANALYZE statement. The default is 10. Setting this flag to a larger value can potentially reduce the number of YSQL catalog cache refreshes if Auto Analyze decides to ANALYZE many tables in the same database at the same time.
Separately, Auto Analyze applies a per-table cooldown so that it does not trigger ANALYZE too aggressively. After every run of ANALYZE on a table, a cooldown period is enforced before the next run of ANALYZE on that table, even if the mutation thresholds are met. The cooldown period starts at [ysql_auto_analyze_min_cooldown_per_table](../../reference/configuration/yb-tserver/#ysql_auto_analyze_min_cooldown_per_table) (default: 10 seconds) and increases exponentially up to [ysql_auto_analyze_max_cooldown_per_table](../../reference/configuration/yb-tserver/#ysql_auto_analyze_max_cooldown_per_table) (default: 24 hours). Cooldown values for a table do not reset. As a result, a frequently updated table is, in most cases, eventually analyzed only once every `ysql_auto_analyze_max_cooldown_per_table` period.
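
To give a feel for how quickly the cooldown reaches its ceiling, here is a minimal Python sketch of a capped exponential schedule using the documented defaults (10 seconds and 24 hours, in milliseconds). The function name is hypothetical, and the growth factor of 2 is an assumption: the text above only says the cooldown increases exponentially.

```python
def cooldown_schedule(runs: int,
                      min_ms: int = 10_000,       # ysql_auto_analyze_min_cooldown_per_table
                      max_ms: int = 86_400_000,   # ysql_auto_analyze_max_cooldown_per_table
                      growth: float = 2.0) -> list[int]:
    """Cooldown (ms) enforced after each of `runs` successive ANALYZE runs.

    Assumption: the cooldown is multiplied by `growth` after every run and
    capped at max_ms. The real growth factor is not documented here.
    """
    schedule = []
    cooldown = min_ms
    for _ in range(runs):
        schedule.append(cooldown)
        cooldown = min(int(cooldown * growth), max_ms)
    return schedule


if __name__ == "__main__":
    for run, ms in enumerate(cooldown_schedule(16), start=1):
        print(f"after run {run}: wait {ms / 1000:.0f} s")
```

Under the assumed doubling, the cooldown would reach the 24-hour ceiling after roughly fifteen ANALYZE runs on the same table and stay there, because cooldown values do not reset.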

For more information on flags used to configure the Auto Analyze service, refer to [Auto Analyze service flags](../../reference/configuration/yb-tserver/#auto-analyze-service-flags).

@@ -94,4 +82,4 @@ SELECT reltuples FROM pg_class WHERE relname = 'test';

## Limitations

Because ANALYZE is a DDL statement, it can cause DDL conflicts when run concurrently with other DDL statements. As Auto Analyze runs ANALYZE in the background, you should turn off Auto Analyze if you want to execute DDL statements. You can do this by setting `ysql_enable_auto_analyze_service` to false on all YB-TServers at runtime.
ANALYZE is technically considered a DDL statement (schema change) and normally conflicts with other [concurrent DDLs](../best-practices-operations/administration/#concurrent-ddl-during-a-ddl-operation). When run by the Auto Analyze service, however, ANALYZE can run concurrently with other DDL statements: it is preempted by the concurrent DDL and retried later. The exception is when [transactional DDL](../explore/transactions/transactional-ddl/) is enabled (off by default): certain transactions that contain DDL may fail with a `kConflict` error if a background ANALYZE from the Auto Analyze service interrupts them. In such cases, disable the Auto Analyze service explicitly and trigger ANALYZE manually. Issue {{<issue 28903>}} tracks this scenario.
14 changes: 11 additions & 3 deletions docs/content/stable/reference/configuration/yb-master.md
@@ -1058,16 +1058,24 @@ Default: `true`

## Auto Analyze service flags

{{<tags/feature/ea idea="590">}}To learn about the Auto Analyze service, see [Auto Analyze service](../../../additional-features/auto-analyze).
To learn about the Auto Analyze service, see [Auto Analyze service](../../../additional-features/auto-analyze).

Auto Analyze is automatically enabled when the [cost-based optimizer](../../../architecture/query-layer/planner-optimizer/) (CBO) is enabled by setting the [yb_enable_cbo](../yb-tserver/#yb_enable_cbo) flag to `on`.

To explicitly control the service, you can set the `ysql_enable_auto_analyze` flag.

See also [Auto Analyze Service TServer flags](../yb-tserver/#auto-analyze-service-flags).

##### ysql_enable_auto_analyze_service
##### ysql_enable_auto_analyze

{{<tags/feature/ea idea="590">}}Enable the Auto Analyze service, which automatically runs ANALYZE to update table statistics for tables that have changed more than a configurable threshold.
Enable the Auto Analyze service, which automatically runs ANALYZE to update table statistics for tables that have changed more than a configurable threshold.

Default: false

##### ysql_enable_auto_analyze_service (deprecated)

Use `ysql_enable_auto_analyze` instead.

## Advisory lock flags

To learn about advisory locks, see [Advisory locks](../../../architecture/transactions/concurrency-control/#advisory-locks).
65 changes: 46 additions & 19 deletions docs/content/stable/reference/configuration/yb-tserver.md
@@ -2131,64 +2131,71 @@ Default: `legacy_mode`

Enables the YugabyteDB [cost-based optimizer](../../../architecture/query-layer/planner-optimizer/) (CBO). Options are `on`, `off`, `legacy_mode`, and `legacy_stats_mode`.

When enabling CBO, you must run ANALYZE on user tables to maintain up-to-date statistics.
When CBO is enabled (set to `on`), [auto analyze](#auto-analyze-service-flags) is also enabled automatically. If you disable auto analyze explicitly, you are responsible for periodically running ANALYZE on user tables to maintain up-to-date statistics.

For information on using this parameter to configure CBO, refer to [Enable cost-based optimizer](../../../best-practices-operations/ysql-yb-enable-cbo/).

### Auto Analyze service flags

{{<tags/feature/ea idea="590">}}To learn about the Auto Analyze service, see [Auto Analyze service](../../../additional-features/auto-analyze).
To learn about the Auto Analyze service, see [Auto Analyze service](../../../additional-features/auto-analyze).

{{< note title="Note" >}}
Auto analyze is automatically enabled when the [cost-based optimizer](../../../best-practices-operations/ysql-yb-enable-cbo/) (CBO) is enabled ([yb_enable_cbo](#yb_enable_cbo) is set to `on`).

To fully enable the Auto Analyze service, you need to enable `ysql_enable_auto_analyze_service` on all YB-Masters and YB-TServers, and `ysql_enable_table_mutation_counter` on all YB-TServers.
In v2025.2 and later, CBO and Auto Analyze are enabled by default in new universes when you deploy using yugabyted, YugabyteDB Anywhere, or YugabyteDB Aeon. In addition, when upgrading a deployment to v2025.2 or later, if the universe has the cost-based optimizer enabled (`on`), YugabyteDB will enable Auto Analyze.

{{< /note >}}
To explicitly control the service, you can set the `ysql_enable_auto_analyze` flag.

See also [Auto Analyze Service Master flags](../yb-master/#auto-analyze-service-flags).

##### --ysql_enable_auto_analyze_service
##### --ysql_enable_auto_analyze

{{% tags/wrap %}}
{{<tags/feature/ea idea="590">}}
{{<tags/feature/t-server>}}
{{<tags/feature/restart-needed>}}
{{<tags/feature/t-server>}}
Default: `false`
{{% /tags/wrap %}}

Enable the Auto Analyze service, which automatically runs ANALYZE to update table statistics for tables that have changed more than a configurable threshold.

##### --ysql_enable_table_mutation_counter
##### --ysql_auto_analyze_threshold

{{% tags/wrap %}}


Default: `false`
{{<tags/feature/restart-needed>}}
Default: `50`
{{% /tags/wrap %}}

Enable per table mutation (INSERT, UPDATE, DELETE) counting. The Auto Analyze service runs ANALYZE when the number of mutations of a table exceeds the threshold determined by the [ysql_auto_analyze_threshold](#ysql-auto-analyze-threshold) and [ysql_auto_analyze_scale_factor](#ysql-auto-analyze-scale-factor) settings.
The minimum number of mutations needed to run ANALYZE on a table. For more details, see [Auto Analyze service](../../../additional-features/auto-analyze).

##### --ysql_auto_analyze_threshold
##### --ysql_auto_analyze_scale_factor

{{% tags/wrap %}}

{{<tags/feature/restart-needed>}}
Default: `50`
Default: `0.1`
{{% /tags/wrap %}}

The minimum number of mutations needed to run ANALYZE on a table.
The fraction defining when sufficient mutations have been accumulated to run ANALYZE for a table. For more details, see [Auto Analyze service](../../../additional-features/auto-analyze).

##### --ysql_auto_analyze_scale_factor
##### --ysql_auto_analyze_min_cooldown_per_table

{{% tags/wrap %}}

{{<tags/feature/restart-needed>}}
Default: `0.1`
Default: `10000` (10 seconds)
{{% /tags/wrap %}}

The fraction defining when sufficient mutations have been accumulated to run ANALYZE for a table.
The minimum duration (in milliseconds) for the cooldown period between successive runs of ANALYZE on a specific table by the auto analyze service. For more details, see [Auto Analyze service](../../../additional-features/auto-analyze).

##### --ysql_auto_analyze_max_cooldown_per_table

{{% tags/wrap %}}

{{<tags/feature/restart-needed>}}
Default: `86400000` (24 hours)
{{% /tags/wrap %}}

ANALYZE runs when the mutation count exceeds `ysql_auto_analyze_scale_factor * <table_size> + ysql_auto_analyze_threshold`, where table_size is the value of the `reltuples` column in the `pg_class` catalog.
The maximum duration (in milliseconds) for the cooldown period between successive runs of ANALYZE on a specific table by the auto analyze service. For more details, see [Auto Analyze service](../../../additional-features/auto-analyze).

##### --ysql_auto_analyze_batch_size

@@ -2240,6 +2247,26 @@ Default: `5000`

Timeout, in milliseconds, for the node-level mutation reporting RPC to the Auto Analyze service.

##### --ysql_enable_auto_analyze_service (deprecated)

{{% tags/wrap %}}
{{<tags/feature/t-server>}}
{{<tags/feature/restart-needed>}}
Default: `false`
{{% /tags/wrap %}}

Enable the Auto Analyze service, which automatically runs ANALYZE to update table statistics for tables that have changed more than a configurable threshold.

##### --ysql_enable_table_mutation_counter (deprecated)

{{% tags/wrap %}}


Default: `false`
{{% /tags/wrap %}}

Enable per table mutation (INSERT, UPDATE, DELETE) counting. The Auto Analyze service runs ANALYZE when the number of mutations of a table exceeds the threshold determined by the [ysql_auto_analyze_threshold](#ysql-auto-analyze-threshold) and [ysql_auto_analyze_scale_factor](#ysql-auto-analyze-scale-factor) settings.

### Advisory lock flags

To learn about advisory locks, see [Advisory locks](../../../architecture/transactions/concurrency-control/#advisory-locks).