
Commit 779393b

add solution to managing incoming data
1 parent: 266be47 · commit: 779393b

File tree

1 file changed: +21 -2 lines


episodes/pseudocode.md

Lines changed: 21 additions & 2 deletions
```diff
@@ -140,7 +140,7 @@ One way to do that is to check we have coded the workflow correctly *before* we
 
 Write some pseudocode for that step of the process.
 
-:::
+::::::::::::::
 
 ::: challenge
 
@@ -152,7 +152,26 @@ In order to analyse the data over time, you need to append the weekly file digest
 
 Write some pseudocode of how you might automate this process.
 
-:::
+::: solution
+
+In order to analyse the data over time, you need to append the weekly file digest to the existing, now very large, main data file. Before adding anything new, and in order to safeguard the integrity of the data, you need to create a backup of that main data file and send a copy of that backup to your cloud storage account for safekeeping. Once the new data has been appended, you need to rename the new main data file with today's date as part of the file name, and run software against the file to check the integrity of the data, e.g., that no data is missing (which might indicate a malfunctioning device).
+
+*Steps in the data combination process*
+
+1. Create a new copy of the main data file with today's date as part of the file name.
+2. Move the previous version of the main data file to cloud storage.
+3. Save all the new data files individually to local storage with the device ID of each as part of the file names.
+4. Create a new weekly data file digest into which the daily digests from the different devices will be imported.
+5. Import each daily digest into that data file with an `append` command, ensuring that the device ID for each file's data is written into a separate column.
+6. Append the weekly digest to the new, dated main data file.
+7. Verify that no data is missing; in [OpenRefine](https://openrefine.org/), using `Facet by Blank` on the relevant data fields is one way to do this.
+
+*Using a shell script to automate the work*
+
+Again, a shell script could be used to automate this work. Given that these tasks are run weekly, it would make sense to turn this into an automated rather than a manual task, as that will not only be faster but will also reduce the opportunity for error.
+
+::::::::::
+::::::::::
 
 -------------------
 
```
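To make the solution's closing point concrete, here is a minimal bash sketch of the weekly job described in the diff above. It is one possible implementation under stated assumptions, not part of the lesson: the file names (`main_data.csv`, daily digests saved as `digests/device-<ID>.csv`), the CSV column layout, and the `rclone` remote `backup:` are all hypothetical, and the final `grep` check is a rough command-line stand-in for OpenRefine's `Facet by Blank`.

```bash
#!/usr/bin/env bash
# Hypothetical weekly data-combination script following the numbered steps
# in the solution above. All file names, the column layout, and the rclone
# remote are assumptions for illustration.
set -euo pipefail

MAIN_FILE="main_data.csv"                   # assumed name of the main data file
TODAY=$(date +%Y-%m-%d)
NEW_MAIN="main_data_${TODAY}.csv"           # step 1: dated copy of the main file
WEEKLY_DIGEST="weekly_digest_${TODAY}.csv"  # step 4: new weekly digest

# Step 1: create a new copy of the main data file with today's date in the name.
cp "$MAIN_FILE" "$NEW_MAIN"

# Step 2: move the previous version to cloud storage for safekeeping
# ("backup:" is an assumed rclone remote; any sync tool would do).
rclone move "$MAIN_FILE" "backup:weekly-backups/"

# Step 4: start the weekly digest with a header that has a separate
# device ID column (assumed to match the main file's layout).
echo "device_id,timestamp,value" > "$WEEKLY_DIGEST"

# Step 3 is assumed already done: each device's daily digest has been saved
# locally as digests/device-<ID>.csv.
# Step 5: append each daily digest, prefixing every data row with the
# device ID taken from the file name.
for daily in digests/device-*.csv; do
    device_id=$(basename "$daily" .csv)   # e.g. "device-42"
    device_id=${device_id#device-}        # strip the prefix, leaving "42"
    tail -n +2 "$daily" | sed "s/^/${device_id},/" >> "$WEEKLY_DIGEST"
done

# Step 6: append the weekly digest (minus its header row) to the new,
# dated main data file.
tail -n +2 "$WEEKLY_DIGEST" >> "$NEW_MAIN"

# Step 7: a rough integrity check -- flag any row with a blank field,
# which might indicate a malfunctioning device (the command-line
# counterpart of OpenRefine's "Facet by Blank").
if grep -nE '(^,|,,|,$)' "$NEW_MAIN"; then
    echo "WARNING: rows with blank fields found in ${NEW_MAIN}" >&2
fi
```

A script like this could then be scheduled to run weekly, for example as a `cron` job, so the manual step disappears entirely.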