Commit 7c2253b
Outlier detection numerical values (#944)
## Changes
Added new check function for outlier detection of numeric values. The
checkuses a statistical method called MAD (Median Absolute Deviation) to
check whether the specified column's values are within the calculated
limits. The lower limit is calculated as median - 3.5 * MAD and the
upper limit as median + 3.5 * MAD. Values outside these limits are
considered outliers.
### Linked issues
<!-- DOC: Link issue with a keyword: close, closes, closed, fix, fixes,
fixed, resolve, resolves, resolved. See
https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword
-->
Resolves #359
### Tests
<!-- How is this tested? Please see the checklist below and also
describe any other relevant tests -->
- [x] manually tested
- [x] added unit tests
- [x] added integration tests
- [ ] added end-to-end tests
- [x] added performance tests
---------
Co-authored-by: Marcin Wojtyczka <marcin.wojtyczka@databricks.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>1 parent 0055e66 commit 7c2253b
File tree
7 files changed
+480
-1
lines changed- docs/dqx/docs/reference
- src/databricks/labs/dqx
- tests
- integration
- perf
- resources
- unit
7 files changed
+480
-1
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
69 | 69 | | |
70 | 70 | | |
71 | 71 | | |
| 72 | + | |
72 | 73 | | |
73 | 74 | | |
74 | 75 | | |
| |||
1396 | 1397 | | |
1397 | 1398 | | |
1398 | 1399 | | |
| 1400 | + | |
1399 | 1401 | | |
1400 | 1402 | | |
1401 | 1403 | | |
| |||
1735 | 1737 | | |
1736 | 1738 | | |
1737 | 1739 | | |
| 1740 | + | |
| 1741 | + | |
| 1742 | + | |
| 1743 | + | |
| 1744 | + | |
| 1745 | + | |
| 1746 | + | |
1738 | 1747 | | |
1739 | 1748 | | |
1740 | 1749 | | |
| |||
2103 | 2112 | | |
2104 | 2113 | | |
2105 | 2114 | | |
| 2115 | + | |
| 2116 | + | |
| 2117 | + | |
| 2118 | + | |
| 2119 | + | |
| 2120 | + | |
| 2121 | + | |
2106 | 2122 | | |
2107 | 2123 | | |
2108 | 2124 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
973 | 973 | | |
974 | 974 | | |
975 | 975 | | |
| 976 | + | |
| 977 | + | |
| 978 | + | |
| 979 | + | |
| 980 | + | |
| 981 | + | |
| 982 | + | |
| 983 | + | |
| 984 | + | |
| 985 | + | |
| 986 | + | |
| 987 | + | |
| 988 | + | |
| 989 | + | |
| 990 | + | |
| 991 | + | |
| 992 | + | |
| 993 | + | |
| 994 | + | |
| 995 | + | |
| 996 | + | |
| 997 | + | |
| 998 | + | |
| 999 | + | |
| 1000 | + | |
| 1001 | + | |
| 1002 | + | |
| 1003 | + | |
| 1004 | + | |
| 1005 | + | |
| 1006 | + | |
| 1007 | + | |
| 1008 | + | |
| 1009 | + | |
| 1010 | + | |
| 1011 | + | |
| 1012 | + | |
| 1013 | + | |
| 1014 | + | |
| 1015 | + | |
| 1016 | + | |
| 1017 | + | |
| 1018 | + | |
| 1019 | + | |
| 1020 | + | |
| 1021 | + | |
| 1022 | + | |
| 1023 | + | |
| 1024 | + | |
| 1025 | + | |
| 1026 | + | |
| 1027 | + | |
| 1028 | + | |
| 1029 | + | |
| 1030 | + | |
| 1031 | + | |
| 1032 | + | |
| 1033 | + | |
| 1034 | + | |
| 1035 | + | |
| 1036 | + | |
| 1037 | + | |
| 1038 | + | |
| 1039 | + | |
| 1040 | + | |
| 1041 | + | |
| 1042 | + | |
| 1043 | + | |
| 1044 | + | |
| 1045 | + | |
| 1046 | + | |
| 1047 | + | |
| 1048 | + | |
| 1049 | + | |
| 1050 | + | |
| 1051 | + | |
| 1052 | + | |
| 1053 | + | |
| 1054 | + | |
976 | 1055 | | |
977 | 1056 | | |
978 | 1057 | | |
| |||
2920 | 2999 | | |
2921 | 3000 | | |
2922 | 3001 | | |
| 3002 | + | |
| 3003 | + | |
| 3004 | + | |
| 3005 | + | |
| 3006 | + | |
| 3007 | + | |
| 3008 | + | |
| 3009 | + | |
| 3010 | + | |
| 3011 | + | |
| 3012 | + | |
| 3013 | + | |
| 3014 | + | |
| 3015 | + | |
| 3016 | + | |
| 3017 | + | |
| 3018 | + | |
| 3019 | + | |
| 3020 | + | |
| 3021 | + | |
| 3022 | + | |
| 3023 | + | |
| 3024 | + | |
| 3025 | + | |
| 3026 | + | |
| 3027 | + | |
| 3028 | + | |
| 3029 | + | |
| 3030 | + | |
| 3031 | + | |
| 3032 | + | |
| 3033 | + | |
| 3034 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
79 | 79 | | |
80 | 80 | | |
81 | 81 | | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
82 | 91 | | |
83 | 92 | | |
84 | 93 | | |
| |||
0 commit comments