Repeat group data merge for ODK #3

@armanmahmud1

When data is exported from the ODK server, the main dataset is downloaded in wide format, while each "repeat group" is exported separately as a .csv file in long format. In these repeat group files, each repeated entry is represented as a new row, capturing nested or looped survey responses.

This format mismatch complicates data processing. The main dataset does not include any variables from the repeat groups, and the long format of the repeat group files is structurally incompatible with the wide format of the main dataset. Additionally, nested repeat groups, where one repeat group contains another, require careful handling to preserve hierarchical relationships.

To prepare the data for analysis, each repeat group must first be reshaped from long format to wide format so that it has the same structure as the main dataset. Once reshaped, the repeat group data can be merged back into the main dataset using unique identifiers. For nested repeat groups, merging should be done using the parent_key to maintain the correct linkage between parent and child records.
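
The reshape-then-merge steps above could be sketched roughly as follows in Python with pandas. This is only an illustration under assumptions, not a settled implementation: the column names `KEY` and `PARENT_KEY`, the helper names `repeat_to_wide` and `merge_repeat`, and the toy data (a `members` repeat with a nested `visits` repeat) are all hypothetical and would need to match the actual export.

```python
import pandas as pd


def repeat_to_wide(repeat_df, parent_col="PARENT_KEY", key_col="KEY"):
    """Reshape a long repeat-group table to one row per parent record.

    Assumes each row carries the parent's identifier in `parent_col`.
    """
    # The row's own key is not needed once its children have been attached.
    df = repeat_df.drop(columns=[key_col], errors="ignore")
    # Number the repeated entries within each parent: 1, 2, 3, ...
    df["_idx"] = df.groupby(parent_col).cumcount() + 1
    wide = df.set_index([parent_col, "_idx"]).unstack("_idx")
    # Flatten (variable, index) column pairs into names like "age_1".
    wide.columns = [f"{var}_{i}" for var, i in wide.columns]
    return wide.reset_index()


def merge_repeat(parent_df, repeat_df, parent_key="KEY"):
    """Widen a repeat group and left-join it onto its parent table."""
    wide = repeat_to_wide(repeat_df).rename(columns={"PARENT_KEY": parent_key})
    return parent_df.merge(wide, on=parent_key, how="left")


# Toy data standing in for the exported files (hypothetical values).
main = pd.DataFrame({"KEY": ["uuid:a", "uuid:b"], "village": ["X", "Y"]})
members = pd.DataFrame({
    "PARENT_KEY": ["uuid:a", "uuid:a", "uuid:b"],
    "KEY": ["uuid:a/members[1]", "uuid:a/members[2]", "uuid:b/members[1]"],
    "name": ["Ann", "Bob", "Cy"],
    "age": [30, 5, 41],
})
visits = pd.DataFrame({  # nested repeat: its parent is a row of `members`
    "PARENT_KEY": ["uuid:a/members[1]", "uuid:a/members[1]"],
    "KEY": ["uuid:a/members[1]/visits[1]", "uuid:a/members[1]/visits[2]"],
    "clinic": ["C1", "C2"],
})

# Work from the innermost repeat outwards: child -> parent repeat -> main.
members_full = merge_repeat(members, visits)
merged = merge_repeat(main, members_full)
print(merged)
```

Working from the innermost repeat outwards keeps the hierarchy intact: in this sketch a nested variable ends up with two suffixes (`clinic_2_1` is the second visit of the first member). Parents with fewer repeats than the maximum get missing values in the unused columns.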