sdk/cosmos/azure-cosmos-spark_3-1_2-12/docs/configuration-reference.md (+14 -13 lines)
@@ -20,16 +20,17 @@ Configuration Reference:
|`spark.cosmos.useGatewayMode`|`false`| Use gateway mode for the client operations |
|`spark.cosmos.read.forceEventualConsistency`|`true`| Makes the client use Eventual consistency for read operations instead of using the default account level consistency |
|`spark.cosmos.applicationName`| None | Application name |
-|`spark.cosmos.preferredRegionsList`| None | Preferred regions list to be used for a multi region Cosmos DB account. This is a comma separated value (e.g., `[eastus,westus]`) provided preferred regions will be used as hint. You should use a collocated spark cluster with your Cosmos DB account and pass the spark cluster region as preferred region. See list of azure regions [here](https://docs.microsoft.com/dotnet/api/microsoft.azure.documents.locationnames?view=azure-dotnet)|
+|`spark.cosmos.preferredRegionsList`| None | Preferred regions list to be used for a multi region Cosmos DB account. This is a comma separated value (e.g., `[East US, West US]` or `East US, West US`); the provided preferred regions will be used as a hint. You should use a collocated spark cluster with your Cosmos DB account and pass the spark cluster region as preferred region. See the list of azure regions [here](https://docs.microsoft.com/dotnet/api/microsoft.azure.documents.locationnames?view=azure-dotnet)|
-|`spark.cosmos.write.maxRetryCount`|`3`| Cosmos DB Write Max Retry Attempts on failure |
-|`spark.cosmos.write.maxConcurrency`| None | Cosmos DB Item Write Max concurrency. If not specified it will be determined based on the Spark executor VM Size |
-|`spark.cosmos.write.bulkEnabled`|`true`| Cosmos DB Item Write bulk enabled |
+|`spark.cosmos.write.maxRetryCount`|`10`| Cosmos DB Write Max Retry Attempts on retriable failures (e.g., connection error) |
+|`spark.cosmos.write.point.maxConcurrency`| None | Cosmos DB Item Write Max concurrency. If not specified it will be determined based on the Spark executor VM Size |
+|`spark.cosmos.write.bulk.maxPendingOperations`| None | Cosmos DB Item Write bulk mode maximum pending operations. If not specified it will be determined based on the Spark executor VM Size |
+|`spark.cosmos.write.bulk.enabled`|`true`| Cosmos DB Item Write bulk enabled |

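For illustration, here is a minimal sketch of how a few of the client and write options above could be passed when writing a DataFrame through this connector. The endpoint, key, database, container, and data values are placeholders, `cosmos.oltp` is assumed as the data source name, and the option values shown are examples rather than recommendations.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("cosmos-config-sample").getOrCreate()
import spark.implicits._

// Placeholder account coordinates - replace with your own values.
val cosmosWriteConfig = Map(
  "spark.cosmos.accountEndpoint" -> "https://<your-account>.documents.azure.com:443/",
  "spark.cosmos.accountKey"      -> "<your-account-key>",
  "spark.cosmos.database"        -> "sampleDB",
  "spark.cosmos.container"       -> "sampleContainer",
  // Client options from the table above.
  "spark.cosmos.applicationName"      -> "MySparkIngestionJob",
  "spark.cosmos.preferredRegionsList" -> "[East US, West US]",
  // Write options from the table above.
  "spark.cosmos.write.bulk.enabled"   -> "true",
  "spark.cosmos.write.maxRetryCount"  -> "10"
)

// A tiny sample DataFrame; Cosmos DB items need a string `id` column.
val df = Seq(("id-1", "Alice"), ("id-2", "Bob")).toDF("id", "name")

df.write
  .format("cosmos.oltp")
  .options(cosmosWriteConfig)
  .mode("append")
  .save()
```
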
### Query Config
@@ -39,19 +40,19 @@ When doing read operations, users can specify a custom schema or allow the connector to infer it
| Config Property Name | Default | Description |
| :--- | :---- | :--- |
-|`spark.cosmos.read.inferSchemaEnabled`|`true`| When schema inference is disabled and user is not providing a schema, raw json will be returned. |
-|`spark.cosmos.read.inferSchemaQuery`|`SELECT * FROM r`| When schema inference is enabled, used as custom query to infer it. For example, if you store multiple entities with different schemas within a container and you want to ensure inference only looks at certain document types or you want to project only particular columns. |
-|`spark.cosmos.read.inferSchemaSamplingSize`|`1000`| Sampling size to use when inferring schema and not using a query. |
-|`spark.cosmos.read.inferSchemaIncludeSystemProperties`|`false`| When schema inference is enabled, whether the resulting schema will include all [Cosmos DB system properties](https://docs.microsoft.com/azure/cosmos-db/account-databases-containers-items#properties-of-an-item). |
-|`spark.cosmos.read.inferSchemaIncludeTimestamp`|`false`| When schema inference is enabled, whether the resulting schema will include the document Timestamp (`_ts`). Not required if `spark.cosmos.read.inferSchemaIncludeSystemProperties` is enabled, as it will already include all system properties. |
+|`spark.cosmos.read.inferSchema.enabled`|`true`| When schema inference is disabled and user is not providing a schema, raw json will be returned. |
+|`spark.cosmos.read.inferSchema.query`|`SELECT * FROM r`| When schema inference is enabled, used as custom query to infer it. For example, if you store multiple entities with different schemas within a container and you want to ensure inference only looks at certain document types or you want to project only particular columns. |
+|`spark.cosmos.read.inferSchema.samplingSize`|`1000`| Sampling size to use when inferring schema and not using a query. |
+|`spark.cosmos.read.inferSchema.includeSystemProperties`|`false`| When schema inference is enabled, whether the resulting schema will include all [Cosmos DB system properties](https://docs.microsoft.com/azure/cosmos-db/account-databases-containers-items#properties-of-an-item). |
+|`spark.cosmos.read.inferSchema.includeTimestamp`|`false`| When schema inference is enabled, whether the resulting schema will include the document Timestamp (`_ts`). Not required if `spark.cosmos.read.inferSchema.includeSystemProperties` is enabled, as it will already include all system properties. |

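Here is a minimal sketch of a read that exercises the renamed `spark.cosmos.read.inferSchema.*` options. It assumes the same placeholder account coordinates and `spark` session as the write sketch above, and the inference query shown is just an example projection.

```scala
val cosmosReadConfig = Map(
  "spark.cosmos.accountEndpoint" -> "https://<your-account>.documents.azure.com:443/",
  "spark.cosmos.accountKey"      -> "<your-account-key>",
  "spark.cosmos.database"        -> "sampleDB",
  "spark.cosmos.container"       -> "sampleContainer",
  // Infer the schema from a sample of documents, projecting only the columns of interest.
  "spark.cosmos.read.inferSchema.enabled"          -> "true",
  "spark.cosmos.read.inferSchema.query"            -> "SELECT c.id, c.name, c._ts FROM c",
  "spark.cosmos.read.inferSchema.samplingSize"     -> "1000",
  "spark.cosmos.read.inferSchema.includeTimestamp" -> "true"
)

val readDf = spark.read.format("cosmos.oltp").options(cosmosReadConfig).load()
readDf.printSchema()
```
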
#### Json conversion configuration
-When reading json documents, if a document contains an attribute that does not map to the schema type, the user can decide whether to use a `null` value (Relaxed) or an exception (Strict).
| Config Property Name | Default | Description |
| :--- | :---- | :--- |
-|`spark.cosmos.read.schemaConversionMode`|`Relaxed`| The schema conversion behavior (Relaxed, Strict) |
+| `spark.cosmos.read.schemaConversionMode` | `Relaxed` | The schema conversion behavior (`Relaxed`, `Strict`). When reading json documents, if a document contains an attribute that does not map to the schema type, the user can decide whether to use a `null` value (Relaxed) or an exception (Strict). |

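As an example of the conversion mode, and reusing the hypothetical `cosmosReadConfig` map from the sketch above, switching to `Strict` makes attributes that do not fit the resolved schema fail the read instead of coming back as `null`:

```scala
// Fail fast on documents whose attributes do not match the resolved schema.
val strictReadConfig = cosmosReadConfig + ("spark.cosmos.read.schemaConversionMode" -> "Strict")

val strictDf = spark.read.format("cosmos.oltp").options(strictReadConfig).load()
```
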
#### Partitioning Strategy Config
57
58
@@ -65,7 +66,7 @@ When reading json documents, if a document contains an attribute that does not map to the schema type
| Config Property Name | Default | Description |
| :--- | :---- | :--- |
-|`spark.cosmos.throughputControlEnabled`|`false`| Whether throughput control is enabled |
+|`spark.cosmos.throughputControl.enabled`|`false`| Whether throughput control is enabled |
|`spark.cosmos.throughputControl.name`| None | Throughput control group name |
|`spark.cosmos.throughputControl.targetThroughput`| None | Throughput control group target throughput |
|`spark.cosmos.throughputControl.targetThroughputThreshold`| None | Throughput control group target throughput threshold |
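For illustration, here is a minimal sketch of attaching a write to a throughput control group using only the options listed above. The group name and the 1000 RU/s target are arbitrary, it reuses the hypothetical `cosmosWriteConfig` map and `df` from the first sketch, and depending on the connector version additional throughput control settings (such as a global control container) may also be required.

```scala
val throttledWriteConfig = cosmosWriteConfig ++ Map(
  "spark.cosmos.throughputControl.enabled"          -> "true",
  "spark.cosmos.throughputControl.name"             -> "bulkIngestionGroup",
  // Cap this job at roughly 1000 RU/s of the container's provisioned throughput.
  "spark.cosmos.throughputControl.targetThroughput" -> "1000"
)

df.write
  .format("cosmos.oltp")
  .options(throttledWriteConfig)
  .mode("append")
  .save()
```
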
see [Query Configuration](https://github.com/Azure/azure-sdk-for-java/blob/master/sdk/cosmos/azure-cosmos-spark_3-1_2-12/docs/configuration-reference.md#query-config) for more detail.
Note when running queries, unless you are interested in getting back the raw json payload,
-we recommend setting `spark.cosmos.read.inferSchemaEnabled` to be `true`.
+we recommend setting `spark.cosmos.read.inferSchema.enabled` to be `true`.

see [Schema Inference Configuration](https://github.com/Azure/azure-sdk-for-java/blob/master/sdk/cosmos/azure-cosmos-spark_3-1_2-12/docs/configuration-reference.md#schema-inference-config) for more detail.
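To illustrate that recommendation, here is a minimal sketch of running a Spark SQL query over the container through a temporary view with schema inference enabled; it assumes the hypothetical `cosmosReadConfig` map from the earlier sketch and the `cosmos.oltp` data source name.

```scala
// Register the Cosmos container as a temporary view so it can be queried with Spark SQL.
spark.read
  .format("cosmos.oltp")
  .options(cosmosReadConfig + ("spark.cosmos.read.inferSchema.enabled" -> "true"))
  .load()
  .createOrReplaceTempView("cosmosItems")

// With an inferred schema the columns are typed, instead of a single raw json string per document.
spark.sql("SELECT id, name FROM cosmosItems WHERE name = 'Alice'").show()
```
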