|
| 1 | +# Replacement materialized views |
| 2 | + |
| 3 | +Associated: |
| 4 | +- https://github.com/MaterializeInc/materialize/pull/34032 (MVP: `CREATE REPLACEMENT`) |
| 5 | +- https://github.com/MaterializeInc/materialize/pull/34039 (Per-object read only mode) |
| 6 | +- https://github.com/MaterializeInc/materialize/pull/34234 (MVP: `CREATE MATERIALIZED VIEW ... REPLACING`) |
| 7 | + |
| 8 | +## Problem |
| 9 | + |
| 10 | +At the moment, a materialized view names a _view definition_ and the _output columns_ derived from that definition. |
| 11 | +We cannot change one or the other independently, which means that any change to a materialized view requires dropping and recreating it. |
| 12 | +This is inconvenient for users as changes need to cascade through the dependency graph. |
| 13 | + |
| 14 | +We call this strong coupling between a materialized view and its definition and output columns. |
| 15 | +This design explores an alternative in which we can decouple these concepts, allowing users to change one without needing to drop and recreate the other. |
| 16 | +This would move us closer to a loosely coupled system, which is easier to maintain and evolve over time. |
| 17 | + |
| 18 | +## Success criteria |
| 19 | + |
| 20 | +We allow users to change the definition and output columns of a materialized view without needing to drop and recreate it. |
| 21 | +We preserve existing dependencies on the materialized view when doing so, and we ensure that the system remains consistent. |
| 22 | + |
| 23 | +Changing a running materialized view can cause additional work for downstream consumers. |
| 24 | +While we cannot avoid this work, we aim to provide tools to quantify the amount of changed data. |
| 25 | + |
| 26 | +We provide a mechanism that allows cutting over to a new definition with minimal downtime. |
| 27 | +A stop-and-restart approach is not acceptable, unless explicitly requested by the user. |
| 28 | + |
| 29 | +## Background |
| 30 | + |
| 31 | +Materialized views are views that Materialize maintains and writes down durably. |
| 32 | +They're identified by a name, have a SQL definition, and produce a set of output columns. |
| 33 | +The name of a materialized view uniquely identifies a _shard_, which is the durable storage location for the materialized view's data. |
| 34 | +When a materialized view is created, Materialize plans and deploys a dataflow that computes the view's contents based on its definition. |
| 35 | + |
| 36 | +(We can rename materialized views, which merely changes the name associated with the shard.) |
| 37 | + |
| 38 | +A materialized view has a unique and immutable catalog item ID that is valid for the lifetime of the object. |
| 39 | +It also has a global ID that identifies the dataflow, and associates a schema (`RelationDesc`) with the shard. |
| 40 | +The global ID associates compute and storage: it needs to be registered with persist so persist knows what schema to expect, and to reclaim durable resources when the materialized view is dropped. |
| 41 | + |
| 42 | +When a materialized view is dropped, Materialize tears down the dataflow and reclaims the shard. |
| 43 | + |
| 44 | +## Principles |
| 45 | + |
| 46 | +While not specific to this project, I want to outline some principles that helped me to come up with the design. |
| 47 | +In user interfaces, it is a principle that all operations can be undone, for example by pressing Ctrl+z. |
| 48 | +This allows users to correct mistakes or just decide otherwise. |
| 49 | + |
| 50 | +In contrast, Materialize offers operations with potentially disastrous outcomes that cannot be undone.
| 51 | +For example, a cluster might become overloaded, or we might seal a materialized view for all time.
| 52 | +Extending the "undo" principle, we split such operations into a _stage_ and an _apply_ step.
| 53 | +The stage step creates objects, but localizes their effect to a cluster.
| 54 | +The effect becomes global only once the user applies the change.
| 55 | + |
| 56 | +The following solution uses this principle to apply changes to materialized views. |
| 57 | + |
| 58 | +## Solution proposal |
| 59 | + |
| 60 | +We introduce the notion of a "replacement materialized view". |
| 61 | +A replacement materialized view allows users to stage the change of definition and output columns of a materialized view. |
| 62 | +The user can then inspect the replacement, and decide to apply or discard it. |
| 63 | + |
| 64 | +We add the following SQL syntax: |
| 65 | +* `CREATE MATERIALIZED VIEW replacement_name REPLACING mv_name AS SELECT ...` |
| 66 | + Creates a replacement for the specified materialized view with the new definition. |
| 67 | + The usual properties for materialized views apply, such as the cluster and its options. |
| 68 | +* `ALTER MATERIALIZED VIEW mv_name APPLY REPLACEMENT replacement_name` |
| 69 | + Applies the specified replacement to the materialized view. |
| 70 | + This updates the definition and output columns of the materialized view to match those of the replacement. |
| 71 | + Existing dependencies on the materialized view are preserved. |
| 72 | + The replacement materialized view is dropped. |
| 73 | + |
| 74 | +When a replacement materialized view is created, we validate that the new definition is compatible with the existing materialized view. |
| 75 | +Replacement materialized views are "read-only": |
| 76 | +Their dataflows hydrate normally, but their storage sinks are configured to not perform writes to the output shard. |
| 77 | +Only when a replacement is applied is the dataflow given permission to start writing to the output. |
| 78 | + |
| 79 | +Compared to regular materialized views, replacement materialized views are more limited: |
| 80 | +* It is not possible to select from or depend on replacement materialized views. |
| 81 | +* For each materialized view, at most one replacement can exist at any point in time. |
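The stage/apply lifecycle and the two limitations above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: `Catalog`, `MaterializedView`, and their methods are hypothetical names, not Materialize's actual catalog code, and the compatibility validation is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MaterializedView:
    definition: str
    # Name of the target if this view is a replacement, otherwise None.
    replacement_for: Optional[str] = None
    # Replacements hydrate, but must not write to the output shard.
    read_only: bool = False


@dataclass
class Catalog:
    """Hypothetical stand-in for the catalog, for illustration only."""

    views: Dict[str, MaterializedView] = field(default_factory=dict)

    def create(self, name: str, definition: str) -> None:
        if name in self.views:
            raise ValueError(f"{name} already exists")
        self.views[name] = MaterializedView(definition)

    def create_replacement(self, name: str, target: str, definition: str) -> None:
        # Stage: the replacement exists, but has no global effect yet.
        if name in self.views:
            raise ValueError(f"{name} already exists")
        if target not in self.views:
            raise ValueError(f"unknown materialized view {target}")
        if any(v.replacement_for == target for v in self.views.values()):
            raise ValueError(f"{target} already has a replacement")
        self.views[name] = MaterializedView(
            definition, replacement_for=target, read_only=True
        )

    def apply_replacement(self, target: str, replacement: str) -> None:
        staged = self.views.get(replacement)
        if staged is None or staged.replacement_for != target:
            raise ValueError(f"{replacement} is not a replacement for {target}")
        # Apply: the target keeps its name, so dependencies stay intact,
        # and the replacement itself disappears from the catalog.
        self.views[target].definition = staged.definition
        del self.views[replacement]


catalog = Catalog()
catalog.create("mv", "SELECT a FROM t")
catalog.create_replacement("mv_new", "mv", "SELECT a FROM t WHERE a > 0")
catalog.apply_replacement("mv", "mv_new")
```

In this model, discarding a staged replacement would simply remove its entry without touching the target, which is what makes the stage step undoable.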
| 82 | + |
| 83 | +Replacement materialized views can be inspected like regular materialized views, using `SHOW CREATE MATERIALIZED VIEW`, `mz_materialized_views`, `EXPLAIN MATERIALIZED VIEW`, etc. |
| 84 | +Additionally, a new system relation `mz_replacements` is provided, specifying the replacement targets for all replacements in the system. |
| 85 | + |
| 86 | +Internally, we change the definition of a materialized view as follows: |
| 87 | +* A materialized view is uniquely identified by its name and catalog item ID. |
| 88 | +* A materialized view uniquely identifies a persist shard. |
| 89 | +* A materialized view has a current definition and output columns, identified by a unique global ID. |
| 90 | +* A materialized view can have additional versions, each with their own unique global ID and schema. |
| 91 | + |
| 92 | +This corresponds to switching from a strongly coupled model to a loosely coupled model: instead of binding both the view definition and the schema, we bind only the schema.
| 93 | + |
| 94 | +## Formalism |
| 95 | + |
| 96 | +Replacing a materialized view with a new version means that at some point in time, we need to switch the contents of the materialized view from the old definition to the new definition. |
| 97 | + |
| 98 | +Let `mv = [mv-updates, since, upper]` be a correct view of the updates in `mv`. |
| 99 | +Now consider switching over to a new definition `mv' = [mv'-updates, since', upper']` by applying a replacement `r = [r-updates, since_r, upper_r]`. |
| 100 | +It must be true that: |
| 101 | +* `since_r <= upper` (the replacement can start at or before the current upper frontier of the materialized view), |
| 102 | +* `upper < upper_r` (the replacement must be able to catch up to the current upper frontier of the materialized view).
| 103 | + |
| 104 | +When applying the replacement at time `upper'`, we need to ensure that: |
| 105 | + |
| 106 | +``` |
| 107 | +mv'-updates = append(mv-updates, upper', diff(upper', mv-updates, r-updates))
| 108 | +``` |
| 109 | + |
| 110 | +From this moment onwards, the materialized view `mv` will reflect the updates from `mv'`.
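One way to read the cut-over formula is in terms of `(data, time, diff)` update triples. The sketch below is a minimal model under stated assumptions: integer timestamps, string data, and the helper names `accumulate`, `diff`, and `append` are illustrative and not persist's API.

```python
from collections import Counter


def accumulate(updates, time):
    """Contents of a collection at `time`: sum the diffs of all updates <= time."""
    acc = Counter()
    for data, t, d in updates:
        if t <= time:
            acc[data] += d
    return {data: count for data, count in acc.items() if count != 0}


def diff(time, old_updates, new_updates):
    """Updates at `time` that turn the old contents into the new contents."""
    old = accumulate(old_updates, time)
    new = accumulate(new_updates, time)
    out = []
    for data in sorted(set(old) | set(new)):
        delta = new.get(data, 0) - old.get(data, 0)
        if delta != 0:
            out.append((data, time, delta))
    return out


def append(updates, time, new_updates):
    """Append correction updates, all of which must carry the cut-over time."""
    new_updates = list(new_updates)
    if any(t != time for _, t, _ in new_updates):
        raise ValueError("appended updates must all carry the cut-over time")
    return list(updates) + new_updates


# Old view: written up to (excluding) time 2.
mv_updates = [("a", 0, 1), ("b", 1, 1)]
# Replacement: same input, new definition that yields "a" and "c".
r_updates = [("a", 0, 1), ("c", 0, 1)]

cutover = 2  # plays the role of upper'
mv_prime_updates = append(mv_updates, cutover, diff(cutover, mv_updates, r_updates))
```

In this example the history before the cut-over is untouched (`a` and `b` at time 1), while from time 2 onwards the accumulated contents match those of the replacement (`a` and `c`).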
| 111 | + |
| 112 | +### Schema evolution and multiple versions |
| 113 | + |
| 114 | +When applying a replacement, we need to ensure that the new schema is compatible with the existing schema. |
| 115 | +We define compatibility as follows: |
| 116 | +1. The schema must be the same as the original schema, |
| 117 | +2. Or, the schema must be a superset of the original schema (i.e., it can add new columns but cannot remove existing ones). |
| 118 | + |
| 119 | +Schema evolution is tied to what persist considers a safe schema change. |
| 120 | +At the moment, the only such change is adding new nullable columns.
| 121 | +If persist supports more schema changes in the future, we could consider allowing them here as well.
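The compatibility rule can be sketched as a small predicate. This is a hedged illustration: the `Column` type and `is_compatible` function are hypothetical, and the requirement that new columns are appended after the existing ones (rather than inserted anywhere) is an assumption of this sketch, not something the design states.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Column:
    name: str
    typ: str
    nullable: bool


def is_compatible(old: Sequence[Column], new: Sequence[Column]) -> bool:
    """Return True if `new` equals `old` or only adds nullable columns.

    Assumption: added columns come after all existing columns.
    """
    if len(new) < len(old):
        return False  # columns were removed
    if list(new[: len(old)]) != list(old):
        return False  # an existing column was changed or reordered
    return all(col.nullable for col in new[len(old):])


old_schema = [Column("id", "int8", False), Column("total", "numeric", True)]
same_schema = list(old_schema)
wider_schema = old_schema + [Column("note", "text", True)]
bad_schema = old_schema + [Column("note", "text", False)]
```

Here `same_schema` and `wider_schema` would be accepted, while `bad_schema` would be rejected because the added column is not nullable.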
| 122 | + |
| 123 | +Even if we do not change the schema, we need to register a new version of the materialized view. |
| 124 | +A version is defined as a global ID, and a relation description. |
| 125 | +All versions map to the same persist shard. |
| 126 | + |
| 127 | +## Timestamp selection |
| 128 | + |
| 129 | +Replacing a materialized view should result in a well-defined history that has definite data at each readable timestamp. |
| 130 | +This means that we need to start the replacement materialized view at the write frontier of the existing materialized view. |
| 131 | +The implementation needs to ensure this, and must warn or refuse to apply a replacement if this is not possible. |
| 132 | + |
| 133 | +It can pick a later timestamp than the old write frontier, but then needs to wait for the old materialized view to reach that timestamp before the replacement can be applied. |
| 134 | + |
| 135 | +Optionally, we could allow users to let the new materialized view start at a later timestamp and let the time jump forward as needed. |
| 136 | +This is highly risky as it introduces gaps in the history, and we should only consider it if there is a strong use case. |
| 137 | +The behavior should be guarded with an explicit clause, such as `WITH (FORWARD TIME)`. |
| 138 | + |
| 139 | +### Future work: Marking spans of time as invalid |
| 140 | + |
| 141 | +Except for some subscribes and sinks, times that have errors aren't generally readable by users. |
| 142 | +We could use this property to mask the period that we jump over as invalid, by emitting an error at the beginning of the jump and retracting it at the end.
| 143 | +Specifically, when advancing time from `upper` to `upper'`, with `upper <= upper'`, we would emit the error updates `[("masked interval", upper + 1, 1), ("masked interval", upper', -1)]`.
| 144 | +An issue with implementing this is that the materialized view dataflow would need to be aware of the replacement, which complicates the implementation. |
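To make the masking idea concrete, the following sketch produces the two error updates and checks which times would be masked. It assumes integer timestamps; the function names are hypothetical, and returning no updates when there is nothing between `upper` and `upper'` is a choice made for this sketch.

```python
from typing import List, Tuple

# (error message, time, diff)
Update = Tuple[str, int, int]


def masked_interval_updates(upper: int, upper_prime: int) -> List[Update]:
    """Error updates that mask the times skipped when jumping to upper_prime."""
    if upper_prime <= upper + 1:
        # Assumption of this sketch: nothing is skipped, so emit nothing.
        return []
    return [
        ("masked interval", upper + 1, 1),
        ("masked interval", upper_prime, -1),
    ]


def is_masked(errors: List[Update], time: int) -> bool:
    """A time is masked if any error is present in the accumulated collection."""
    return sum(d for _, t, d in errors if t <= time) > 0


errors = masked_interval_updates(5, 8)
```

With `upper = 5` and `upper' = 8`, times 6 and 7 carry the error, and time 8 is readable again because the error is retracted there.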
| 145 | + |
| 146 | +## Minimum viable prototype
| 147 | + |
| 148 | +* Update the parser to support the above syntax. |
| 149 | +* Implement planning and sequencing for the new commands. |
| 150 | +* Add catalog relations for replacements: `mz_replacements`. |
| 151 | +* Record replacements and state transitions in the audit log. |
| 152 | +* Do not support schema evolution in the MVP. |
| 153 | +* Implement "correct" timestamp selection for replacements, starting at the write frontier of the existing materialized view. |
| 154 | +* Provide better introspection data for replacements, such as the ability to see the differences between the current and replacement definitions. |
| 155 | + * Surface metadata about the amount of staged changes (records, bytes) between the current and replacement definitions. |
| 156 | + * Document how users can observe hydration progress, or implement new ways to do so. |
| 157 | + Specifically, users should be able to monitor progress through the `mz_frontiers` and `mz_arrangement_sizes` introspection views. |
| 158 | +* Design a mechanism to read from replacement materialized views. |
| 159 | + (The actual design of the mechanism is out of scope for this document.) |
| 160 | + |
| 161 | +## GA considerations |
| 162 | + |
| 163 | +The `mz_replacements` relation is separate from the existing `mz_materialized_views` system relation, so to get a complete picture of materialized views and their replacements, users need to query both.
| 164 | +For GA, we should add a column `replacement_for` to `mz_materialized_views` that indicates whether a materialized view is a replacement, and if so, for which materialized view. |
| 165 | +For regular materialized views, this column would be NULL. |
| 166 | +We should only implement this change for GA, as it would be a breaking change to remove it in the future. |
| 167 | + |
| 168 | +## Future work |
| 169 | + |
| 170 | +* Provide insights into the impact of applying a replacement beyond metadata: |
| 171 | + * Introspect the actual changes. |
| 172 | + For example, which rows would be added or removed. |
| 173 | +* Automate applying a replacement once the new definition is hydrated. |
| 174 | +* Allow replacements to jump forward in time. |
| 175 | +* Support replacements for other maintained objects, such as upsert sources. |
| 176 | + |
| 177 | +## Alternatives |
| 178 | + |
| 179 | +### Replacements as first-class catalog items |
| 180 | + |
| 181 | +Instead of modelling replacements as special materialized views, we could make them separate catalog items.
| 182 | + |
| 183 | +``` |
| 184 | +CREATE REPLACEMENT <replacement_mv_name> FOR MATERIALIZED VIEW <mv_name> AS SELECT ... |
| 185 | +``` |
| 186 | + |
| 187 | +We seriously considered this approach and there exists an MVP implementing it. |
| 188 | +However, it seems inferior to the special-materialized-views approach: |
| 189 | +* Implementation-wise, it requires a lot of new code to support the new item type. |
| 190 | + This includes a significant amount of duplication in planning and sequencing. |
| 191 | +* UX-wise, having a new item type is a potential source of user confusion. |
| 192 | + Existing monitoring for materialized views wouldn't work for replacements out of the box. |
| 193 | + |
| 194 | +It is worth pointing out that the limitations of replacement materialized views, particularly the inability to select from them, are a source of user confusion as well.
| 195 | +We are hopeful that it will be possible to reduce these limitations in the future. |
| 196 | + |
| 197 | +### Multi-output materialized views |
| 198 | + |
| 199 | +The MVP design lets the replacement write to the same shard as the original materialized view.
| 200 | +This comes with limitations, most importantly that we need to ensure that we don't write until cut-over time. |
| 201 | +In turn, this prevents reads from the replacement until we're cutting over. |
| 202 | + |
| 203 | +An alternative design is to let the replacement materialized view write to a separate shard in addition to the original materialized view's shard. |
| 204 | +A benefit would be that the replacement can be read from immediately, allowing users to inspect the data. |
| 205 | + |
| 206 | +Several parts of Materialize would need to be updated to support this: |
| 207 | +* Dataflows support multiple exports, but the behavior is untested and collides with the assumption that we can identify a dataflow by a single global ID. |
| 208 | +* The catalog would need to support multiple shards per materialized view. |
| 209 | +* We would need to control read-write mode per dataflow export. |
| 210 | + This might be a positive change as the controller sends instructions per global ID, not per dataflow. |
| 211 | +* It is unclear how we could cut over from one shard to another to reclaim storage space. |
| 212 | + If we don't, we'll have to pay the hydration and storage cost for all shards indefinitely. |
| 213 | + (We can't cut over existing dataflows to a new shard, and the only other time we can make breaking changes is when deploying a new version of Materialize.
| 214 | + It feels odd to tie a clean-up mechanism to a version deployment.) |
| 215 | + |
| 216 | +## Open questions |
| 217 | + |
| 218 | +<!-- |
| 219 | +What is left unaddressed by this design document that needs to be |
| 220 | +closed out? |
| 221 | +
| 222 | +When a design document is authored and shared, there might still be |
| 223 | +open questions that need to be explored. Through the design document |
| 224 | +process, you are responsible for getting answers to these open |
| 225 | +questions. All open questions should be answered by the time a design |
| 226 | +document is merged. |
| 227 | +--> |
| 228 | + |
| 229 | +### SQL syntax |
| 230 | + |
| 231 | +The current proposal uses the term `REPLACING` in the `CREATE MATERIALIZED VIEW` statement. |
| 232 | +An alternative is to use `AS REPLACEMENT FOR mv_name`, which might be more explicit. |
| 233 | +We could also consider the term `REPLACE`, but that might be confused with `CREATE OR REPLACE`. |
| 234 | +Alternatively, we could use `REPLACES` instead of `REPLACING`. |