Investigate performance limits of DisclosureProtection metric #691

@frances-h

Problem Description

Currently, the DisclosureProtection metric warns about poor performance when the input data has more than 50,000 rows. This threshold was chosen without investigating the metric's actual performance. It'd be helpful to know how the metric's performance changes with the size of the input, so that we can warn the user of possible poor performance earlier and suggest an alternative metric.

Expected behavior

Investigate the performance of the DisclosureProtection metric, considering input data length, number of known/sensitive columns, and number of unique discrete values in those columns. Also test across the different CAP methods.
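As a starting point, the investigation could be structured as a timing grid over the dimensions listed above. The following is only a minimal sketch: `make_discrete_rows`, `run_grid`, and `placeholder_metric` are hypothetical names, and `placeholder_metric` is a cheap stand-in used to exercise the harness, not the DisclosureProtection algorithm. It uses only the standard library and assumes one callable per CAP method is supplied.

```python
import itertools
import random
import time


def make_discrete_rows(num_rows, num_columns, num_unique, seed=0):
    """Build random discrete data as a list of row tuples (hypothetical helper)."""
    rng = random.Random(seed)
    return [
        tuple(rng.randrange(num_unique) for _ in range(num_columns))
        for _ in range(num_rows)
    ]


def run_grid(metric_fns, row_counts, column_counts, unique_counts, num_known=1):
    """Time each metric callable on every combination of input sizes.

    metric_fns maps a label (for example a CAP method name) to a callable
    taking (real_rows, synthetic_rows, num_known). The first num_known
    columns are treated as known, the remaining ones as sensitive.
    """
    results = []
    grid = itertools.product(row_counts, column_counts, unique_counts)
    for num_rows, num_columns, num_unique in grid:
        real = make_discrete_rows(num_rows, num_columns, num_unique, seed=1)
        synthetic = make_discrete_rows(num_rows, num_columns, num_unique, seed=2)
        for label, metric_fn in metric_fns.items():
            start = time.perf_counter()
            metric_fn(real, synthetic, num_known)
            elapsed = time.perf_counter() - start
            results.append({
                "method": label,
                "num_rows": num_rows,
                "num_columns": num_columns,
                "num_unique": num_unique,
                "seconds": elapsed,
            })
    return results


def placeholder_metric(real, synthetic, num_known):
    """Stand-in workload, NOT the real metric: fraction of real rows whose
    known-column values also appear in the synthetic data."""
    known_keys = {row[:num_known] for row in synthetic}
    return sum(row[:num_known] in known_keys for row in real) / len(real)


if __name__ == "__main__":
    timings = run_grid(
        {"placeholder": placeholder_metric},
        row_counts=[1_000, 10_000],
        column_counts=[2, 4],
        unique_counts=[5, 50],
    )
    for entry in timings:
        print(entry)
```

To benchmark the actual metric, each callable in `metric_fns` would wrap a call to DisclosureProtection with one CAP method selected. The exact call signature should be checked against the installed SDMetrics version rather than taken from this sketch.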

Once we have a good understanding of the performance, we should update the warning in DisclosureProtection based on the results of the investigation.
