Description
When data is exported from the ODK server, the main dataset is downloaded in wide format, while each "repeat group" is exported separately as a .csv file in long format. In these repeat group files, each repeated entry is represented as a new row, capturing nested or looped survey responses.
This format mismatch creates challenges during data processing. The main dataset does not include any variables from the repeat groups, and the long format of the repeat group files is structurally incompatible with the wide format of the main dataset. Additionally, nested repeat groups, where one repeat group contains another, require careful handling to preserve hierarchical relationships.
To prepare the data for analysis, each repeat group must first be reshaped from long format to wide format so that it has one row per submission, consistent with the main dataset. Once reshaped, the repeat group data can be merged back into the main dataset using unique identifiers. For nested repeat groups, merging should be done using the parent_key to maintain the correct linkage between parent and child records.
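The reshape-and-merge steps described above could be sketched roughly as follows with pandas. This is only an illustration under stated assumptions: the column names `KEY` and `PARENT_KEY`, the toy tables `main`, `members`, and `visits`, and the helper `repeat_to_wide` are all hypothetical and should be adapted to the actual export.

```python
import pandas as pd


def repeat_to_wide(repeat, parent_col="PARENT_KEY", key_col="KEY"):
    """Reshape a long repeat-group table to one row per parent record.

    Hypothetical helper; the column names are assumptions about the export.
    """
    # The row's own key is only needed to link child repeats, so drop it here.
    df = repeat.drop(columns=[key_col], errors="ignore")
    # Number the repeated entries within each parent: 1, 2, 3, ...
    df["_idx"] = df.groupby(parent_col).cumcount() + 1
    # Every remaining variable becomes one column per entry number.
    wide = df.pivot(index=parent_col, columns="_idx")
    # Flatten the (variable, entry number) column pairs into names like "name_1".
    wide.columns = [f"{var}_{i}" for var, i in wide.columns]
    return wide.reset_index()


# Toy data standing in for the exported files (illustrative values only).
main = pd.DataFrame({"KEY": ["uuid:a", "uuid:b"], "village": ["North", "South"]})
members = pd.DataFrame({
    "PARENT_KEY": ["uuid:a", "uuid:a", "uuid:b"],
    "KEY": ["uuid:a/m[1]", "uuid:a/m[2]", "uuid:b/m[1]"],
    "name": ["Ann", "Bob", "Cy"],
    "age": [30, 5, 41],
})
# Nested repeat: each visit points at a row of `members` through PARENT_KEY.
visits = pd.DataFrame({
    "PARENT_KEY": ["uuid:a/m[1]", "uuid:a/m[1]", "uuid:b/m[1]"],
    "clinic": ["C1", "C2", "C1"],
})

# Step 1: widen the innermost repeat and attach it to its parent repeat.
visits_wide = repeat_to_wide(visits).rename(columns={"PARENT_KEY": "KEY"})
members_full = members.merge(visits_wide, on="KEY", how="left")

# Step 2: widen the parent repeat and merge it into the main dataset.
merged = (
    main.merge(repeat_to_wide(members_full),
               left_on="KEY", right_on="PARENT_KEY", how="left")
        .drop(columns="PARENT_KEY")
)
```

Working from the innermost repeat group outward keeps each merge a simple one-to-one join on the parent key. In this sketch a nested column such as `clinic_2_1` reads as "second visit of the first member"; submissions with fewer entries are left with missing values in the unused columns.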