
Commit abab6f4

antiguru and teskje authored
Design doc for replacing materialized views (#34106)
Design for replacing materialized views. --------- Signed-off-by: Moritz Hoffmann <mh@materialize.com> Signed-off-by: Moritz Hoffmann <antiguru@gmail.com> Co-authored-by: Jan Teske <jteske@posteo.net>
1 parent 6771e75 commit abab6f4


1 file changed

+234
-0
lines changed
@@ -0,0 +1,234 @@
1+
# Replacement materialized views
2+
3+
Associated:
4+
- https://github.com/MaterializeInc/materialize/pull/34032 (MVP: `CREATE REPLACEMENT`)
5+
- https://github.com/MaterializeInc/materialize/pull/34039 (Per-object read only mode)
6+
- https://github.com/MaterializeInc/materialize/pull/34234 (MVP: `CREATE MATERIALIZED VIEW ... REPLACING`)
7+
8+
## Problem
9+
10+
At the moment, a materialized view names a _view definition_ and the _output columns_ derived from that definition.
11+
We cannot change one or the other independently, which means that any change to a materialized view requires dropping and recreating it.
12+
This is inconvenient for users as changes need to cascade through the dependency graph.
13+
14+
We call this strong coupling between a materialized view and its definition and output columns.
15+
This design explores an alternative in which we can decouple these concepts, allowing users to change one without needing to drop and recreate the other.
16+
This would move us closer to a loosely coupled system, which is easier to maintain and evolve over time.
17+
18+
## Success criteria
19+
20+
We allow users to change the definition and output columns of a materialized view without needing to drop and recreate it.
21+
We preserve existing dependencies on the materialized view when doing so, and we ensure that the system remains consistent.
22+
23+
Changing a running materialized view can cause additional work for downstream consumers.
24+
While we cannot avoid this work, we aim to provide tools to quantify the amount of changed data.
25+
26+
We provide a mechanism that allows cutting over to a new definition with minimal downtime.
27+
A stop-and-restart approach is not acceptable, unless explicitly requested by the user.
28+
29+
## Background
30+
31+
Materialized views are views that Materialize maintains and writes down durably.
32+
They're identified by a name, have a SQL definition, and produce a set of output columns.
33+
The name of a materialized view uniquely identifies a _shard_, which is the durable storage location for the materialized view's data.
34+
When a materialized view is created, Materialize plans and deploys a dataflow that computes the view's contents based on its definition.
35+
36+
(We can rename materialized views, which merely changes the name associated with the shard.)
37+
38+
A materialized view has a unique and immutable catalog item ID that is valid for the lifetime of the object.
39+
It also has a global ID that identifies the dataflow, and associates a schema (`RelationDesc`) with the shard.
40+
The global ID associates compute and storage: it needs to be registered with persist so persist knows what schema to expect, and to reclaim durable resources when the materialized view is dropped.
41+
42+
When a materialized view is dropped, Materialize tears down the dataflow and reclaims the shard.
43+
44+
## Principles
45+
46+
While not specific to this project, I want to outline some principles that helped me to come up with the design.
47+
A common principle in user interface design is that every operation can be undone, for example by pressing Ctrl+Z.
48+
This allows users to correct mistakes or just decide otherwise.
49+
50+
Materialize, by contrast, offers mechanisms with potentially disastrous outcomes that cannot be undone.
51+
For example, a cluster might become overloaded, or we might seal a materialized view for all time.
52+
Extending the "undo" principle, we split operations into a _stage_ and _apply_ operation.
53+
The stage operation creates objects, but localizes their effect to a cluster.
54+
The effect becomes global only once the user applies the change.
55+
56+
The following solution uses this principle to apply changes to materialized views.
57+
58+
## Solution proposal
59+
60+
We introduce the notion of a "replacement materialized view".
61+
A replacement materialized view allows users to stage the change of definition and output columns of a materialized view.
62+
The user can then inspect the replacement, and decide to apply or discard it.
63+
64+
We add the following SQL syntax:
65+
* `CREATE MATERIALIZED VIEW replacement_name REPLACING mv_name AS SELECT ...`
66+
Creates a replacement for the specified materialized view with the new definition.
67+
The usual properties for materialized views apply, such as the cluster and its options.
68+
* `ALTER MATERIALIZED VIEW mv_name APPLY REPLACEMENT replacement_name`
69+
Applies the specified replacement to the materialized view.
70+
This updates the definition and output columns of the materialized view to match those of the replacement.
71+
Existing dependencies on the materialized view are preserved.
72+
The replacement materialized view is dropped.
73+
74+
When a replacement materialized view is created, we validate that the new definition is compatible with the existing materialized view.
75+
Replacement materialized views are "read-only":
76+
Their dataflows hydrate normally, but their storage sinks are configured to not perform writes to the output shard.
77+
Only when a replacement is applied is the dataflow given permission to start writing to the output.
78+
79+
Compared to regular materialized views, replacement materialized views are more limited:
80+
* It is not possible to select from or depend on replacement materialized views.
81+
* For each materialized view, at most one replacement can exist at any point in time.
82+
83+
Replacement materialized views can be inspected like regular materialized views, using `SHOW CREATE MATERIALIZED VIEW`, `mz_materialized_views`, `EXPLAIN MATERIALIZED VIEW`, etc.
84+
Additionally, a new system relation `mz_replacements` is provided, specifying the replacement targets for all replacements in the system.
85+
86+
Internally, we change the definition of a materialized view as follows:
87+
* A materialized view is uniquely identified by its name and catalog item ID.
88+
* A materialized view uniquely identifies a persist shard.
89+
* A materialized view has a current definition and output columns, identified by a unique global ID.
90+
* A materialized view can have additional versions, each with their own unique global ID and schema.
91+
92+
This corresponds to switching from a strongly coupled model to a loosely coupled one: instead of binding both the view definition and the schema, we bind only the schema.
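The internal model in the list above can be sketched in a few lines of Python. This is only an illustration of the relationships between item ID, shard, and versions. The class names, field names, and ID strings are hypothetical and do not correspond to Materialize's actual catalog types, and `columns` is a stand-in for a `RelationDesc`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    """One version of a materialized view: a global ID, a schema, a definition."""
    global_id: str              # unique per version
    columns: tuple[str, ...]    # hypothetical stand-in for a RelationDesc
    definition: str             # the SQL text of this version


@dataclass
class MaterializedView:
    """Hypothetical catalog entry; all names here are illustrative."""
    item_id: str                # immutable for the lifetime of the object
    name: str                   # may change on rename
    shard_id: str               # every version maps to this one shard
    versions: list[Version] = field(default_factory=list)

    @property
    def current(self) -> Version:
        # The most recently applied version is the current one.
        return self.versions[-1]

    def apply_replacement(self, replacement: Version) -> None:
        # Applying a replacement adds a version; item ID and shard stay fixed.
        self.versions.append(replacement)


mv = MaterializedView(
    item_id="item-1",
    name="mv",
    shard_id="shard-1",
    versions=[Version("gid-1", ("a", "b"), "SELECT a, b FROM t")],
)
mv.apply_replacement(Version("gid-2", ("a", "b"), "SELECT a, b FROM t WHERE a > 0"))
```

After the call, `mv.current` points at the new global ID while `mv.item_id` and `mv.shard_id` are unchanged, which is what allows existing dependencies to be preserved.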
93+
94+
## Formalism
95+
96+
Replacing a materialized view with a new version means that at some point in time, we need to switch the contents of the materialized view from the old definition to the new definition.
97+
98+
Let `mv = [mv-updates, since, upper]` be a correct view of the updates in `mv`.
99+
Now consider switching over to a new definition `mv' = [mv'-updates, since', upper']` by applying a replacement `r = [r-updates, since_r, upper_r]`.
100+
It must be true that:
101+
* `since_r <= upper` (the replacement can start at or before the current upper frontier of the materialized view),
102+
* `upper < upper_r` (the replacement must be able to catch up to the current upper frontier of the materialized view).
103+
104+
When applying the replacement at time `upper'`, we need to ensure that:
105+
106+
```
107+
mv'-updates = append(mv-updates, upper', diff(upper', mv-updates, r-updates))
108+
```
109+
110+
From this moment onwards, the materialized view `mv` will reflect the updates from `mv'`.
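To make the cut-over condition concrete, here is a minimal Python sketch of `append` and `diff`. It assumes that updates are `(data, time, diff)` triples, that timestamps and frontiers are plain integers (real frontiers are more general), and that `snapshot` and `can_apply` are helper names introduced only for this illustration.

```python
from collections import Counter

Update = tuple[str, int, int]  # (data, time, diff)


def snapshot(updates: list[Update], time: int) -> dict[str, int]:
    """Accumulate all updates at times <= `time` into row multiplicities."""
    acc: Counter = Counter()
    for data, t, d in updates:
        if t <= time:
            acc[data] += d
    return {row: n for row, n in acc.items() if n != 0}


def diff(time: int, old: list[Update], new: list[Update]) -> list[Update]:
    """Updates at `time` that turn the contents of `old` into those of `new`."""
    before, after = snapshot(old, time), snapshot(new, time)
    out = []
    for row in sorted(set(before) | set(after)):
        delta = after.get(row, 0) - before.get(row, 0)
        if delta != 0:
            out.append((row, time, delta))
    return out


def append(updates: list[Update], time: int, new: list[Update]) -> list[Update]:
    """Append `new` (all at `time`) to updates that lie strictly before `time`."""
    if any(t >= time for _data, t, _diff in updates):
        raise ValueError("existing updates must be before the cut-over time")
    return updates + new


def can_apply(since_r: int, upper_r: int, upper: int) -> bool:
    """The two frontier conditions from the list above."""
    return since_r <= upper < upper_r


mv_updates = [("a", 1, 1), ("b", 2, 1)]   # old definition, upper = 3
r_updates = [("a", 1, 1), ("c", 2, 1)]    # replacement, since_r = 0, upper_r = 4
apply_at = 3                               # plays the role of upper'

correction = diff(apply_at, mv_updates, r_updates)
mv_new = append(mv_updates, apply_at, correction)
```

In this example the correction retracts `b` and inserts `c` at time 3. Reads of `mv_new` before time 3 still see the old contents, while reads at time 3 see exactly what the replacement computed.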
111+
112+
### Schema evolution and multiple versions
113+
114+
When applying a replacement, we need to ensure that the new schema is compatible with the existing schema.
115+
We define compatibility as follows:
116+
1. The schema must be the same as the original schema,
117+
2. Or, the schema must be a superset of the original schema (i.e., it can add new columns but cannot remove existing ones).
118+
119+
Schema evolution is tied to what persist considers a safe schema change.
120+
At the moment, this covers adding new nullable columns and nothing else.
121+
If persist supports more schema changes in the future, we could consider allowing them here as well.
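The compatibility rule, restricted to what persist is described as supporting today, could be checked roughly as follows. This is a sketch under assumptions: the `Column` type and `is_compatible` function are hypothetical, columns are matched by name, and column order is ignored for simplicity. It is not persist's actual schema-evolution logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    typ: str
    nullable: bool


def is_compatible(old: list[Column], new: list[Column]) -> bool:
    """True if `new` keeps every old column unchanged and only adds nullable ones."""
    old_by_name = {c.name: c for c in old}
    new_by_name = {c.name: c for c in new}
    # Every existing column must survive with the same type and nullability.
    for name, col in old_by_name.items():
        if new_by_name.get(name) != col:
            return False
    # Any added column must be nullable.
    return all(c.nullable for c in new if c.name not in old_by_name)


original = [Column("id", "int8", False), Column("name", "text", True)]
with_nullable = original + [Column("note", "text", True)]
with_required = original + [Column("note", "text", False)]
```

Here `is_compatible(original, with_nullable)` holds, while `is_compatible(original, with_required)` does not, and neither does any schema that drops or retypes an existing column.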
122+
123+
Even if we do not change the schema, we need to register a new version of the materialized view.
124+
A version is defined as a global ID and a relation description.
125+
All versions map to the same persist shard.
126+
127+
## Timestamp selection
128+
129+
Replacing a materialized view should result in a well-defined history that has definite data at each readable timestamp.
130+
This means that we need to start the replacement materialized view at the write frontier of the existing materialized view.
131+
The implementation needs to ensure this, and must warn or refuse to apply a replacement if this is not possible.
132+
133+
The replacement can pick a later timestamp than the old write frontier, but it then needs to wait for the old materialized view to reach that timestamp before it can be applied.
134+
135+
Optionally, we could allow users to let the new materialized view start at a later timestamp and let the time jump forward as needed.
136+
This is highly risky as it introduces gaps in the history, and we should only consider it if there is a strong use case.
137+
The behavior should be guarded with an explicit clause, such as `WITH (FORWARD TIME)`.
138+
139+
### Future work: Marking spans of time as invalid
140+
141+
Except for some subscribes and sinks, times that have errors aren't generally readable by users.
142+
We could use this property to mask the period that we jump forward as invalid, by emitting an error at the beginning of the jump until the end of the jump.
143+
Specifically, when advancing time from `upper` to `upper'`, with `upper <= upper'`, we would emit the error `[("masked interval", upper + 1, 1), ("masked interval", upper', -1)]`.
144+
An issue with implementing this is that the materialized view dataflow would need to be aware of the replacement, which complicates the implementation.
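The masking idea can be illustrated with a small Python sketch. It assumes integer timestamps and treats a time as readable when the accumulated error count at that time is zero. The function names are hypothetical, and returning no updates for an empty interval (`upper' <= upper + 1`) is a choice made for this sketch that the text above does not specify.

```python
ErrorUpdate = tuple[str, int, int]  # (error message, time, diff)


def masking_updates(upper: int, new_upper: int) -> list[ErrorUpdate]:
    """Error updates that mask the times skipped when jumping to `new_upper`."""
    if new_upper < upper:
        raise ValueError("time cannot move backwards")
    if new_upper <= upper + 1:
        # Assumption of this sketch: nothing to mask, so emit nothing.
        return []
    return [
        ("masked interval", upper + 1, 1),    # error appears at upper + 1
        ("masked interval", new_upper, -1),   # error is retracted at upper'
    ]


def is_readable(errors: list[ErrorUpdate], time: int) -> bool:
    """A time is readable if no error is present once updates <= time accumulate."""
    return sum(d for _msg, t, d in errors if t <= time) == 0


errors = masking_updates(5, 9)
masked = [t for t in range(4, 11) if not is_readable(errors, t)]
```

With `upper = 5` and `upper' = 9`, the masked times are 6, 7, and 8, while times up to 5 and from 9 onwards remain readable.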
145+
146+
## Minimal Viable Prototype
147+
148+
* Update the parser to support the above syntax.
149+
* Implement planning and sequencing for the new commands.
150+
* Add catalog relations for replacements: `mz_replacements`.
151+
* Record replacements and state transitions in the audit log.
152+
* Do not support schema evolution in the MVP.
153+
* Implement "correct" timestamp selection for replacements, starting at the write frontier of the existing materialized view.
154+
* Provide better introspection data for replacements, such as the ability to see the differences between the current and replacement definitions.
155+
* Surface metadata about the amount of staged changes (records, bytes) between the current and replacement definitions.
156+
* Document how users can observe hydration progress, or implement new ways to do so.
157+
Specifically, users should be able to monitor progress through the `mz_frontiers` and `mz_arrangement_sizes` introspection views.
158+
* Design a mechanism to read from replacement materialized views.
159+
(The actual design of the mechanism is out of scope for this document.)
160+
161+
## GA considerations
162+
163+
The `mz_replacements` relation is separate from the existing `mz_materialized_views` system relation, so users need to query both to get a complete picture of materialized views and their replacements.
164+
For GA, we should add a column `replacement_for` to `mz_materialized_views` that indicates whether a materialized view is a replacement, and if so, for which materialized view.
165+
For regular materialized views, this column would be NULL.
166+
We should only implement this change for GA, as it would be a breaking change to remove it in the future.
167+
168+
## Future work
169+
170+
* Provide insights into the impact of applying a replacement beyond metadata:
171+
* Introspect the actual changes.
172+
For example, which rows would be added or removed.
173+
* Automate applying a replacement once the new definition is hydrated.
174+
* Allow replacements to jump forward in time.
175+
* Support replacements for other maintained objects, such as upsert sources.
176+
177+
## Alternatives
178+
179+
### Replacements as first-class catalog items
180+
181+
Instead of modelling replacements as special materialized views, we could make them separate catalog items instead.
182+
183+
```
184+
CREATE REPLACEMENT <replacement_mv_name> FOR MATERIALIZED VIEW <mv_name> AS SELECT ...
185+
```
186+
187+
We seriously considered this approach and there exists an MVP implementing it.
188+
However, it seems inferior to the special-materialized-views approach:
189+
* Implementation-wise, it requires a lot of new code to support the new item type.
190+
This includes a significant amount of duplication in planning and sequencing.
191+
* UX-wise, having a new item type is a potential source of user confusion.
192+
Existing monitoring for materialized views wouldn't work for replacements out of the box.
193+
194+
It is worth pointing out that the limitations of replacement materialized views, particularly the inability to select from them, are a source of user confusion as well.
195+
We are hopeful that it will be possible to reduce these limitations in the future.
196+
197+
### Multi-output materialized views
198+
199+
The MVP design lets the replacement write to the same shard as the original materialized view.
200+
This comes with limitations, most importantly that we need to ensure that we don't write until cut-over time.
201+
In turn, this prevents reads from the replacement until we're cutting over.
202+
203+
An alternative design is to let the replacement materialized view write to a separate shard in addition to the original materialized view's shard.
204+
A benefit would be that the replacement can be read from immediately, allowing users to inspect the data.
205+
206+
Several parts of Materialize would need to be updated to support this:
207+
* Dataflows support multiple exports, but the behavior is untested and collides with the assumption that we can identify a dataflow by a single global ID.
208+
* The catalog would need to support multiple shards per materialized view.
209+
* We would need to control read-write mode per dataflow export.
210+
This might be a positive change as the controller sends instructions per global ID, not per dataflow.
211+
* It is unclear how we could cut over from one shard to another to reclaim storage space.
212+
If we don't, we'll have to pay the hydration and storage cost for all shards indefinitely.
213+
(We can't cut over existing dataflows to a new shard, and the only other time we can make breaking changes is when deploying a new version of Materialize.
214+
It feels odd to tie a clean-up mechanism to a version deployment.)
215+
216+
## Open questions
217+
218+
<!--
219+
What is left unaddressed by this design document that needs to be
220+
closed out?
221+
222+
When a design document is authored and shared, there might still be
223+
open questions that need to be explored. Through the design document
224+
process, you are responsible for getting answers to these open
225+
questions. All open questions should be answered by the time a design
226+
document is merged.
227+
-->
228+
229+
### SQL syntax
230+
231+
The current proposal uses the term `REPLACING` in the `CREATE MATERIALIZED VIEW` statement.
232+
An alternative is to use `AS REPLACEMENT FOR mv_name`, which might be more explicit.
233+
We could also consider the term `REPLACE`, but that might be confused with `CREATE OR REPLACE`.
234+
Alternatively, we could use `REPLACES` instead of `REPLACING`.
