Commit a2d7842

hlcianfagna authored and amotl committed:
Scale: Add tutorial "Scale CrateDB ... to cope with peaks in demand"

docs/admin/clustering/scale/demand.md (178 additions, 0 deletions)

@@ -7,4 +7,182 @@

and how it is applied in a real-world data management scenario, specifically
about tuning your database cluster to cope with high-demand situations.

## Introduction

Many organizations size their database infrastructure for the maximum load they can anticipate, yet load is often seasonal, in some cases concentrated around specific events on a few days of the year. The usual compromise is infrastructure that sits idle for most of the year and still delivers less performance than desired when requests peak.

CrateDB allows clusters to be scaled both up and down by adding and removing nodes. This enables significant savings during quiet periods and makes it possible to provision extra capacity for particular periods of high activity, keeping performance optimal.

When nodes are added or removed, CrateDB automatically rebalances shards. However, when there are very large volumes of historical data and new nodes are only added for a short period of time, we may want to avoid relocating any of the historical data to the temporary nodes.

## About

The approach described below achieves this using [shard allocation filtering](https://crate.io/docs/crate/reference/en/5.1/general/ddl/shard-allocation.html) on a table that is partitioned by day. The idea is that the period of high activity can be foreseen, so scaling up takes place the day before a big event and scaling down some days after the event has ended.

The same approach can be applied to multiple tables. It is particularly relevant for the larger tables; smaller tables can be kept on the baseline nodes, but it is worth considering the impact on query performance if, during the big event, the small tables are `JOIN`ed to the big tables that hold data on the temporary nodes.

## Preparing the test environment

In this example, we will imagine that the surge in demand we are preparing for is related to the 2022 FIFA Men’s World Cup, running from 20/11/2022 to 18/12/2022.

We will start with a 3-node cluster, on which we will create a test table and populate it with some pre-World Cup data:

```sql
CREATE TABLE test (
    ts TIMESTAMP,
    recorddetails TEXT,
    "day" GENERATED ALWAYS AS date_trunc('day',ts)
)
PARTITIONED BY ("day")
CLUSTERED INTO 4 SHARDS
WITH (number_of_replicas=1);

INSERT INTO test (ts) VALUES ('2022-11-18'),('2022-11-19');
```

The shards will initially look like this:

![image|690x167](https://global.discourse-cdn.com/flex020/uploads/crate/original/1X/ac18a9cb507201d8e54771e320501f4aaac0eb16.png)

## Before deploying extra nodes

We want to make sure that the addition of the temporary nodes does not result in data from the large tables getting rebalanced onto these nodes.

We will be able to identify the new nodes by a custom attribute (`node.attr.storage=temporarynodes`; see further down for details on how to configure this), so the first step is to configure the existing partitions so that the new nodes are not considered suitable targets for shard allocation.

In CrateDB 5.1.2 or higher we can achieve this with:

```sql
/* this applies the setting to all existing partitions and to new partitions */
ALTER TABLE test SET ("routing.allocation.exclude.storage" = 'temporarynodes');

/* then we reset the table-level setting so that it does not apply to new partitions */
ALTER TABLE ONLY test RESET ("routing.allocation.exclude.storage");
```

No data gets reallocated when running this, and there is no impact on querying or ingestion.

Starting in CrateDB 5.2 this setting is visible in `settings['routing']` in `information_schema.table_partitions`.

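For example, a quick way to inspect this is the query below; this is only a minimal sketch, the exact nesting of the `settings` object may vary between versions, and the `WHERE` filter assumes the table name used in this tutorial:

```sql
-- Sketch: list the per-partition routing settings of the test table (CrateDB 5.2+)
SELECT table_name,
       values['day']::TIMESTAMP AS "day",
       settings['routing']
FROM information_schema.table_partitions
WHERE table_name = 'test'
ORDER BY 2;
```
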
## Deploying the extra nodes

We want to deploy the new nodes with a custom attribute set.
If using containers, add a line to `args` in your YAML file with:

```
- -Cnode.attr.storage=temporarynodes
```

Otherwise, add this to `crate.yml` (typically in `/etc/crate`):

```
node.attr.storage: temporarynodes
```

Please note that the word `storage` does not have any special meaning for CrateDB in this context; it is just the name we have chosen for the custom attribute.

Starting with CrateDB 5.2 these node attributes will be visible in `sys.nodes`.

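As a quick check once the new nodes have joined, the attributes can be listed from `sys.nodes`; this is a minimal sketch, assuming CrateDB 5.2 or higher and that the attributes appear in a column named `attributes`:

```sql
-- Sketch: show each node with its custom attributes (CrateDB 5.2+);
-- the temporary nodes should report storage = temporarynodes
SELECT name, attributes
FROM sys.nodes
ORDER BY name;
```
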
We need to calculate how many shards will be created each day (that is, on each partition) during the special event. Since our test table is `CLUSTERED INTO 4 SHARDS` `WITH (number_of_replicas=1)`, we will have 4 (shards per partition) x 2 (primary + replica) = 8 shards per daily partition.

We then take the ceiling of that number (8) divided by the total number of nodes (baseline + temporary); if that is for instance 3 + 2 = 5 nodes, we get ceiling(8/5) = 2.

That means that if a maximum of 2 of the new shards created each day during the event go to each node, the new data will be balanced across all nodes.

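As a quick sanity check, this ceiling can be computed directly in SQL; the 8 and 5 below are just the example numbers from above:

```sql
-- Sketch: 8 shards per daily partition spread over 5 nodes (3 baseline + 2 temporary)
SELECT ceil(8.0 / 5.0) AS max_shards_per_node;
-- returns 2
```
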
With the nodes ready, on the day before the event, we need to configure the special tables so that any new partitions follow this rule:

```sql
ALTER TABLE ONLY test SET ("routing.allocation.total_shards_per_node" = 2);
```

This setting (`total_shards_per_node`) is visible at partition level in `settings['routing']` in `information_schema.table_partitions`.

No data gets reallocated when running this, and there is no impact on querying or ingestion.

This setting can be checked by using:

```sql
SHOW CREATE TABLE test;
```

## During the event

Let’s now simulate the arrival of data during the event:

```sql
INSERT INTO test (ts) VALUES
    ('2022-11-20'),('2022-11-21'),('2022-11-22'),('2022-11-23'),
    ('2022-11-24'),('2022-11-25'),('2022-11-26'),('2022-11-27'),
    ('2022-11-28'),('2022-11-29'),('2022-11-30'),('2022-12-01'),
    ('2022-12-02'),('2022-12-03'),('2022-12-04'),('2022-12-05'),
    ('2022-12-06'),('2022-12-07'),('2022-12-08'),('2022-12-09'),
    ('2022-12-10'),('2022-12-11'),('2022-12-12'),('2022-12-13'),
    ('2022-12-14'),('2022-12-15'),('2022-12-16'),('2022-12-17'),
    ('2022-12-18')
```

We can see that data from before the event stays on the baseline nodes while data for the days of the event gets distributed over all nodes:

![image|690x220](https://global.discourse-cdn.com/flex020/uploads/crate/original/1X/b1c1a1ac42ac3d0eb644529e57c4b9c49eae2e87.png)

The same can be checked programmatically with this query:

```sql
SELECT table_partitions.table_schema,
       table_partitions.table_name,
       table_partitions.values['day']::TIMESTAMP,
       shards.primary,
       shards.node['name']
FROM sys.shards
JOIN information_schema.table_partitions
  ON shards.partition_ident=table_partitions.partition_ident
ORDER BY 1,2,3,4,5;
```

## The day the event ends

On the last day of the event, we need to configure the table so that the next partition goes to the baseline nodes:

```sql
ALTER TABLE ONLY test SET ("routing.allocation.exclude.storage" = 'temporarynodes');

ALTER TABLE ONLY test RESET ("routing.allocation.total_shards_per_node");
```

## A day after the event has ended

New data should now again go to the baseline nodes only.

Let’s confirm it:

```sql
INSERT INTO test (ts) VALUES ('2022-12-19'),('2022-12-20')
```

![image|690x73](https://global.discourse-cdn.com/flex020/uploads/crate/original/1X/72b9f0bd28fb88402ea951f9f8a9a15c7c491ad2.png)

When we are ready to decommission the temporary nodes, we need to move the data collected during the days of the event.

In CrateDB 5.1.2 or higher we can achieve this with:

```sql
ALTER TABLE test SET ("routing.allocation.exclude.storage" = 'temporarynodes');
ALTER TABLE test RESET ("routing.allocation.total_shards_per_node");
```

The data movement takes place one replica at a time, and there is no impact on querying the event’s data while it is being moved. New data also continues to flow to the baseline nodes, and its ingestion and querying are not impacted either.

We can monitor the progress of the data relocation by querying `sys.shards` and `sys.allocations`.

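For example, here is a minimal sketch of such a check: it counts how many shards of the test table are still located on each node, so once the temporary nodes no longer appear in the result, all of the data has been moved away from them.

```sql
-- Sketch: shard count per node for the test table
SELECT node['name'] AS node_name,
       count(*) AS shard_count
FROM sys.shards
WHERE table_name = 'test'
GROUP BY node['name']
ORDER BY 1;
```
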
Once all shards have moved away from the temporary nodes, we can decommission them gracefully:

```sql
ALTER CLUSTER DECOMMISSION 'nodename';
```

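A decommissioned node leaves the cluster, so a quick way to confirm it is gone is to list the remaining cluster members (a minimal sketch):

```sql
-- Sketch: nodes currently part of the cluster;
-- the decommissioned node should no longer appear
SELECT name
FROM sys.nodes
ORDER BY name;
```
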
Once this is done, the machines can safely be shut down.

## When the time comes for the next event

If desired, new nodes can be deployed reusing the same names that were used for the temporary nodes before.

[shard allocation filtering]: inv:crate-reference#ddl_shard_allocation
