@@ -141,7 +141,13 @@ We have tried to resolve any conflicts in the *best* possible manner.
141141 Each dataset consists of 200-1050 observations in 2 dimensions.
142142
143143
144- 3. [`other`](catalogue/other.md) includes:
144+ 3. [`mnist`](catalogue/mnist.md) -
145+ LeCun's MNIST database of handwritten digits
146+ and Zalando's Fashion-MNIST dataset.
147+
148+
149+
150+ 4. [`other`](catalogue/other.md) includes:
145151
146152 * `hdbscan` - a dataset used for demonstrating the outputs of the
147153 [Python implementation](https://github.com/scikit-learn-contrib/hdbscan)
@@ -172,7 +178,7 @@ We have tried to resolve any conflicts in the *best* possible manner.
172178 (TODO: help needed).
173179
174180
175- 4. [`sipu`](catalogue/sipu.md) -
181+ 5. [`sipu`](catalogue/sipu.md) -
176182 datasets available at the SIPU (Speech and Image Processing Unit,
177183 School of Computing, University of Eastern Finland) website
178184
@@ -190,7 +196,7 @@ We have tried to resolve any conflicts in the *best* possible manner.
190196 We excluded the `DIM`-sets as they turn out to be too easy
191197 for most algorithms.
192198
193- 5. [`uci`](catalogue/uci.md) -
199+ 6. [`uci`](catalogue/uci.md) -
194200 a selection of datasets available at the University of California, Irvine,
195201 [Machine Learning Repository](http://archive.ics.uci.edu/ml/)
196202 (Dua and Graff, 2019)
@@ -201,23 +207,23 @@ We have tried to resolve any conflicts in the *best* possible manner.
201207 also listed in the SIPU repository.
202208 Note that "the" Iris dataset is available elsewhere (see `other`).
203209
204- 6. [`wut`](catalogue/wut.md) -
210+ 7. [`wut`](catalogue/wut.md) -
205211 authored by the fantastic students
206212 of Marek Gagolewski's Python for Data Analysis course at
207213 Warsaw University of Technology:
208214 Przemysław Kosewski, Jędrzej Krauze, Eliza Kaczorek, Anna Gierlak,
209215 Adam Wawrzyniak, Aleksander Truszczyński, Mateusz Kobyłka and Michał Maciąg.
210216
211217
212- 7. [`g2mg`](catalogue/g2mg.md) -
218+ 8. [`g2mg`](catalogue/g2mg.md) -
213219 a modified version of `G2`-sets from SIPU with variances
214220 dependent on datasets' dimensionalities, i.e., s*np.sqrt(d/2),
215221 which makes these problems more difficult.
216222
217223 Each dataset consists of 2048 observations belonging
218224 to either of two Gaussian clusters in 1, 2, ..., 128 dimensions.
219225
220- 8. [`h2mg`](catalogue/h2mg.md) -
226+ 9. [`h2mg`](catalogue/h2mg.md) -
221227 two Gaussian-like hubs with spread dependent on datasets' dimensionalities
222228
223229 Each dataset consists of 2048 observations in 1, 2, ..., 128 dimensions.
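
The dimension-dependent spread quoted for `g2mg`, s*np.sqrt(d/2), can be illustrated with a small sketch. This is not the script used to build the catalogue: the function names, the cluster centres (500 and 600 on every axis), the even split between clusters, and the seed are all assumptions made for illustration, and the quantity s*sqrt(d/2) is treated here as a standard deviation. Only the standard library is used, so `math.sqrt` stands in for `np.sqrt`.

```python
import math
import random


def scaled_spread(s, d):
    """Return s * sqrt(d / 2), the dimension-dependent spread quoted above."""
    return s * math.sqrt(d / 2)


def two_gaussian_clusters(n, d, s, centres=(500.0, 600.0), seed=123):
    """Draw n points in d dimensions from two spherical Gaussian clusters.

    Hypothetical generator: the centres, the seed, and the even split
    between the two clusters are illustrative assumptions.
    """
    rng = random.Random(seed)
    sigma = scaled_spread(s, d)  # assumed to act as a standard deviation
    points, labels = [], []
    for i in range(n):
        k = i % 2  # alternate clusters; for even n each receives n // 2 points
        points.append([rng.gauss(centres[k], sigma) for _ in range(d)])
        labels.append(k + 1)  # cluster labels 1 and 2
    return points, labels


# 2048 observations in 8 dimensions, base spread s = 10
X, y = two_gaussian_clusters(n=2048, d=8, s=10.0)
```

With d = 2 the factor sqrt(d/2) equals 1, so the spread reduces to s; as d grows the clusters widen and overlap more, which is consistent with the remark that the modification makes these problems more difficult.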
@@ -231,85 +237,88 @@ We have tried to resolve any conflicts in the *best* possible manner.
231237## List of Datasets
232238
233239
234- | | dataset | n| d|
235- | :--| :----------------------| ------:| --:|
236- | 1 | fcps/atom | 800| 3|
237- | 2 | fcps/chainlink | 1000| 3|
238- | 3 | fcps/engytime | 4096| 2|
239- | 4 | fcps/hepta | 212| 3|
240- | 5 | fcps/lsun | 400| 2|
241- | 6 | fcps/target | 770| 2|
242- | 7 | fcps/tetra | 400| 3|
243- | 8 | fcps/twodiamonds | 800| 2|
244- | 9 | fcps/wingnut | 1016| 2|
245- | 10 | graves/dense | 200| 2|
246- | 11 | graves/fuzzyx | 1000| 2|
247- | 12 | graves/line | 250| 2|
248- | 13 | graves/parabolic | 1000| 2|
249- | 14 | graves/ring | 1000| 2|
250- | 15 | graves/ring_noisy | 1050| 2|
251- | 16 | graves/ring_outliers | 1030| 2|
252- | 17 | graves/zigzag | 250| 2|
253- | 18 | graves/zigzag_noisy | 300| 2|
254- | 19 | graves/zigzag_outliers | 280| 2|
255- | 20 | other/chameleon_t4_8k | 8000| 2|
256- | 21 | other/chameleon_t5_8k | 8000| 2|
257- | 22 | other/chameleon_t7_10k | 10000| 2|
258- | 23 | other/chameleon_t8_8k | 8000| 2|
259- | 24 | other/hdbscan | 2309| 2|
260- | 25 | other/iris | 150| 4|
261- | 26 | other/iris5 | 105| 4|
262- | 27 | other/square | 1000| 2|
263- | 28 | sipu/a1 | 3000| 2|
264- | 29 | sipu/a2 | 5250| 2|
265- | 30 | sipu/a3 | 7500| 2|
266- | 31 | sipu/aggregation | 788| 2|
267- | 32 | sipu/birch1 | 100000| 2|
268- | 33 | sipu/birch2 | 100000| 2|
269- | 34 | sipu/compound | 399| 2|
270- | 35 | sipu/d31 | 3100| 2|
271- | 36 | sipu/flame | 240| 2|
272- | 37 | sipu/jain | 373| 2|
273- | 38 | sipu/pathbased | 300| 2|
274- | 39 | sipu/r15 | 600| 2|
275- | 40 | sipu/s1 | 5000| 2|
276- | 41 | sipu/s2 | 5000| 2|
277- | 42 | sipu/s3 | 5000| 2|
278- | 43 | sipu/s4 | 5000| 2|
279- | 44 | sipu/spiral | 312| 2|
280- | 45 | sipu/unbalance | 6500| 2|
281- | 46 | sipu/worms_2 | 105600| 2|
282- | 47 | sipu/worms_64 | 105000| 64|
283- | 48 | uci/ecoli | 336| 7|
284- | 49 | uci/glass | 214| 9|
285- | 50 | uci/ionosphere | 351| 34|
286- | 51 | uci/sonar | 208| 60|
287- | 52 | uci/statlog | 2310| 19|
288- | 53 | uci/wdbc | 569| 30|
289- | 54 | uci/wine | 178| 13|
290- | 55 | uci/yeast | 1484| 8|
291- | 56 | wut/circles | 4000| 2|
292- | 57 | wut/cross | 2000| 2|
293- | 58 | wut/graph | 2500| 2|
294- | 59 | wut/isolation | 9000| 2|
295- | 60 | wut/labirynth | 3546| 2|
296- | 61 | wut/mk1 | 300| 2|
297- | 62 | wut/mk2 | 1000| 2|
298- | 63 | wut/mk3 | 600| 3|
299- | 64 | wut/mk4 | 1500| 3|
300- | 65 | wut/olympic | 5000| 2|
301- | 66 | wut/smile | 1000| 2|
302- | 67 | wut/stripes | 5000| 2|
303- | 68 | wut/trajectories | 10000| 2|
304- | 69 | wut/trapped_lovers | 5000| 3|
305- | 70 | wut/twosplashes | 400| 2|
306- | 71 | wut/windows | 2977| 2|
307- | 72 | wut/x1 | 120| 2|
308- | 73 | wut/x2 | 120| 2|
309- | 74 | wut/x3 | 185| 2|
310- | 75 | wut/z1 | 192| 2|
311- | 76 | wut/z2 | 900| 2|
312- | 77 | wut/z3 | 1000| 2|
240+ | | dataset | n| d|
241+ | :--| :----------------------| ------:| ---:|
242+ | 1 | fcps/atom | 800| 3|
243+ | 2 | fcps/chainlink | 1000| 3|
244+ | 3 | fcps/engytime | 4096| 2|
245+ | 4 | fcps/hepta | 212| 3|
246+ | 5 | fcps/lsun | 400| 2|
247+ | 6 | fcps/target | 770| 2|
248+ | 7 | fcps/tetra | 400| 3|
249+ | 8 | fcps/twodiamonds | 800| 2|
250+ | 9 | fcps/wingnut | 1016| 2|
251+ | 10 | graves/dense | 200| 2|
252+ | 11 | graves/fuzzyx | 1000| 2|
253+ | 12 | graves/line | 250| 2|
254+ | 13 | graves/parabolic | 1000| 2|
255+ | 14 | graves/ring | 1000| 2|
256+ | 15 | graves/ring_noisy | 1050| 2|
257+ | 16 | graves/ring_outliers | 1030| 2|
258+ | 17 | graves/zigzag | 250| 2|
259+ | 18 | graves/zigzag_noisy | 300| 2|
260+ | 19 | graves/zigzag_outliers | 280| 2|
261+ | 20 | mnist/digits | 70000| 784|
262+ | 21 | mnist/fashion | 70000| 784|
263+ | 22 | other/chameleon_t4_8k | 8000| 2|
264+ | 23 | other/chameleon_t5_8k | 8000| 2|
265+ | 24 | other/chameleon_t7_10k | 10000| 2|
266+ | 25 | other/chameleon_t8_8k | 8000| 2|
267+ | 26 | other/hdbscan | 2309| 2|
268+ | 27 | other/iris | 150| 4|
269+ | 28 | other/iris5 | 105| 4|
270+ | 29 | other/square | 1000| 2|
271+ | 30 | sipu/a1 | 3000| 2|
272+ | 31 | sipu/a2 | 5250| 2|
273+ | 32 | sipu/a3 | 7500| 2|
274+ | 33 | sipu/aggregation | 788| 2|
275+ | 34 | sipu/birch1 | 100000| 2|
276+ | 35 | sipu/birch2 | 100000| 2|
277+ | 36 | sipu/compound | 399| 2|
278+ | 37 | sipu/d31 | 3100| 2|
279+ | 38 | sipu/flame | 240| 2|
280+ | 39 | sipu/jain | 373| 2|
281+ | 40 | sipu/pathbased | 300| 2|
282+ | 41 | sipu/r15 | 600| 2|
283+ | 42 | sipu/s1 | 5000| 2|
284+ | 43 | sipu/s2 | 5000| 2|
285+ | 44 | sipu/s3 | 5000| 2|
286+ | 45 | sipu/s4 | 5000| 2|
287+ | 46 | sipu/spiral | 312| 2|
288+ | 47 | sipu/unbalance | 6500| 2|
289+ | 48 | sipu/worms_2 | 105600| 2|
290+ | 49 | sipu/worms_64 | 105000| 64|
291+ | 50 | uci/ecoli | 336| 7|
292+ | 51 | uci/glass | 214| 9|
293+ | 52 | uci/ionosphere | 351| 34|
294+ | 53 | uci/sonar | 208| 60|
295+ | 54 | uci/statlog | 2310| 19|
296+ | 55 | uci/wdbc | 569| 30|
297+ | 56 | uci/wine | 178| 13|
298+ | 57 | uci/yeast | 1484| 8|
299+ | 58 | wut/circles | 4000| 2|
300+ | 59 | wut/cross | 2000| 2|
301+ | 60 | wut/graph | 2500| 2|
302+ | 61 | wut/isolation | 9000| 2|
303+ | 62 | wut/labirynth | 3546| 2|
304+ | 63 | wut/mk1 | 300| 2|
305+ | 64 | wut/mk2 | 1000| 2|
306+ | 65 | wut/mk3 | 600| 3|
307+ | 66 | wut/mk4 | 1500| 3|
308+ | 67 | wut/olympic | 5000| 2|
309+ | 68 | wut/smile | 1000| 2|
310+ | 69 | wut/stripes | 5000| 2|
311+ | 70 | wut/trajectories | 10000| 2|
312+ | 71 | wut/trapped_lovers | 5000| 3|
313+ | 72 | wut/twosplashes | 400| 2|
314+ | 73 | wut/windows | 2977| 2|
315+ | 74 | wut/x1 | 120| 2|
316+ | 75 | wut/x2 | 120| 2|
317+ | 76 | wut/x3 | 185| 2|
318+ | 77 | wut/z1 | 192| 2|
319+ | 78 | wut/z2 | 900| 2|
320+ | 79 | wut/z3 | 1000| 2|
321+
313322
314323
315324