change_column_type_materialized_view/README.md (17 additions & 7 deletions)
@@ -6,9 +6,9 @@ To change a column type in a Materialized View Data Source is a process that nee

This change needs to re-create the Materialized View and populate it again with all the data without stopping our ingestion.

- For that the steps will be:
+ For that, the steps will be:

- 1. Create a new Materialized View (Pipe and Data Source) to change the type to the colum.
+ 1. Create a new Materialized View (Pipe and Data Source) to change the type of the column (see the sketch below).
2. Run CI.
3. Backfill the new Materialized View with the data ingested prior to its creation.
4. Run CD and run the backfill in the main Workspace.
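
As a reference for step 1, here is a minimal sketch of what the re-created Data Source file (e.g. `analytics_pages_mv_1.datasource`) could look like. The schema is an assumption for illustration: only `date`, `device`, and `pathname` appear in this diff, and the `hits UInt64` column stands in for whichever column gets its type changed.

```
SCHEMA >
    `date` Date,
    `device` String,
    `pathname` String,
    `hits` UInt64

ENGINE "SummingMergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(date)"
ENGINE_SORTING_KEY "date, device, pathname"
```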
@@ -48,7 +48,7 @@ Create a Copy Pipe `analytics_pages_backfill.pipe` for backfilling purposes:

```
NODE analytics_pages_backfill_node

SQL >
-
+    %
    SELECT
        toDate(timestamp) AS date,
        device,
@@ -67,14 +67,24 @@ SQL >
        pathname

TYPE COPY
- DATASOURCE analytics_pages_mv_1
+ TARGET_DATASOURCE analytics_pages_mv_1
```
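
When it is time to backfill (steps 3 and 4), this Copy Pipe is triggered as a one-off job. A minimal sketch, assuming the Tinybird CLI's `tb pipe copy run` command is available in your version:

```
tb pipe copy run analytics_pages_backfill
```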
## 2: Run CI

Make sure the changes are deployed correctly in the CI Tinybird Branch. Optionally, you can add automated tests or verify it from the `tmp_ci_*` Branch created as part of the CI pipeline.
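
For a quick manual check, this is what that verification could look like from the CLI. The Branch name `tmp_ci_pr_42` is hypothetical, and the sketch assumes the `tb branch use` and `tb sql` commands:

```
tb branch use tmp_ci_pr_42
tb sql "SELECT count() FROM analytics_pages_mv_1"
```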

- ## 3: Backfilling
+ ## 3: (For large datasets) Splitting the Data into Chunks for Backfilling

If your data source is large, you may run into a memory error like this:

```
error: "There was a problem while copying data: [Error] Memory limit (for query) exceeded. Make sure the query just process the required data. Contact us at support@tinybird.co for help or read this SQL tip: https://tinybird.co/docs/guides/best-practices-for-faster-sql.html#memory-limit-reached-title"
```

To avoid memory issues, you will need to break the backfill operation into smaller, manageable chunks. This approach reduces the memory load per query by processing only a subset of the data at a time. You can use the ***data source's sorting key*** to define each chunk, as in the sketch below.

Refer to [this guide](https://www.tinybird.co/docs/work-with-data/strategies/backfill-strategies#scenario-3-streaming-ingestion-with-incremental-timestamp-column) for more details.
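
As an illustration of that strategy, here is a sketch of the Copy Pipe's SQL parameterized by a timestamp range. The parameter names (`start_backfill_timestamp`, `end_backfill_timestamp`) and the `analytics_pages` source Data Source are assumptions for illustration, and the column list is abbreviated:

```
SQL >
    %
    SELECT
        toDate(timestamp) AS date,
        device,
        pathname
    FROM analytics_pages
    WHERE
        timestamp >= {{ DateTime(start_backfill_timestamp) }}
        AND timestamp < {{ DateTime(end_backfill_timestamp) }}

TYPE COPY
TARGET_DATASOURCE analytics_pages_mv_1
```

Each chunk is then copied with one run of the Pipe, assuming your CLI version supports the `--param` flag on `tb pipe copy run`:

```
tb pipe copy run analytics_pages_backfill \
  --param start_backfill_timestamp='2024-01-01 00:00:00' \
  --param end_backfill_timestamp='2024-02-01 00:00:00'
```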
## 4: Backfilling
Wait for the first event to be ingested into `analytics_pages_mv_1` and then proceed with the backfilling.
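
A quick sanity check that events are already landing in the new Data Source (a sketch using `tb sql`):

```
tb sql "SELECT count(), min(date) FROM analytics_pages_mv_1"
```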
@@ -93,10 +103,10 @@ tb sql "select timestamp from tinybird.datasources_ops_log where event_type = 'c