Commit f0a9239

Merge pull request #114 from Quantmetry/doc_cosmetic_changes
Doc cosmetic changes
2 parents 0948780 + 428ec66 commit f0a9239

5 files changed, with 25 additions and 20 deletions

README.rst

Lines changed: 4 additions & 4 deletions
@@ -47,7 +47,7 @@ Qolmat can be installed in different ways:
 .. code:: sh

 $ pip install qolmat # installation via `pip`
-$ pip install qolmat[pytorch] # if you need pytorch
+$ pip install qolmat[pytorch] # if you need ImputerDiffusion relying on pytorch
 $ pip install git+https://github.com/Quantmetry/qolmat # or directly from the github repository

 ⚡️ Quickstart
@@ -105,8 +105,8 @@ The full documentation can be found `on this link <https://qolmat.readthedocs.io

 **How does Qolmat work ?**

-Qolmat allows model selection for scikit-learn compatible imputation algorithms, by performing three steps pictured below:
-1) For each of the K folds, Qolmat artificially masks a set of observed values using a default or user specified `hole generator <explanation.html#hole-generator>`_,
+| Qolmat allows model selection for scikit-learn compatible imputation algorithms, by performing three steps pictured below:
+1) For each of the K folds, Qolmat artificially masks a set of observed values using a default or user specified `hole generator <explanation.html#hole-generator>`_.
 2) For each fold and each compared `imputation method <imputers.html>`_, Qolmat fills both the missing and the masked values, then computes each of the default or user specified `performance metrics <explanation.html#metrics>`_.
 3) For each compared imputer, Qolmat pools the computed metrics from the K folds into a single value.

@@ -117,7 +117,7 @@ This is very similar in spirit to the `cross_val_score <https://scikit-learn.org

 **Imputation methods**

-The following table contains the available imputation methods. We distinguish single imputation methods (aiming for pointwise accuracy, mostly deterministic) from multiple imputation methods (aiming for distribution similarity, mostly stochastic).
+The following table contains the available imputation methods. We distinguish single imputation methods (aiming for pointwise accuracy, mostly deterministic) from multiple imputation methods (aiming for distribution similarity, mostly stochastic). For further details regarding the distinction between single and multiple imputation, you can refer to the `Imputation article <https://en.wikipedia.org/wiki/Imputation_(statistics)>`_ on Wikipedia.

 .. list-table::
 :widths: 25 70 15 15
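
The three steps listed in the README hunk above (mask, impute and score, pool) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Qolmat's implementation: `mask_uniform`, `impute_with` and `compare_imputers` are hypothetical names, the data is a toy one-column list, and the only metric is the mean absolute error.

```python
import random
import statistics


def mask_uniform(values, ratio_masked, rng):
    """Step 1: hide a random subset of the observed entries (None = missing)."""
    observed = [i for i, v in enumerate(values) if v is not None]
    n_masked = max(1, round(ratio_masked * len(observed)))
    masked = set(rng.sample(observed, n_masked))
    corrupted = [None if i in masked else v for i, v in enumerate(values)]
    return corrupted, sorted(masked)


def impute_with(values, statistic):
    """Fill every missing entry with a statistic of the remaining observed ones."""
    fill = statistic([v for v in values if v is not None])
    return [fill if v is None else v for v in values]


def compare_imputers(values, imputers, n_folds=3, ratio_masked=0.2, seed=0):
    """Return one pooled mean absolute error per imputer (hypothetical helper)."""
    rng = random.Random(seed)
    errors = {name: [] for name in imputers}
    for _ in range(n_folds):
        corrupted, masked = mask_uniform(values, ratio_masked, rng)
        for name, statistic in imputers.items():
            imputed = impute_with(corrupted, statistic)
            # Step 2: score only the artificially masked positions,
            # because those are the entries whose true value is known.
            mae = statistics.fmean(abs(imputed[i] - values[i]) for i in masked)
            errors[name].append(mae)
    # Step 3: pool the per-fold metrics into a single value per imputer.
    return {name: statistics.fmean(fold) for name, fold in errors.items()}


# Toy column with two genuinely missing values and one outlier.
column = [1.0, 2.0, None, 2.5, 3.0, 100.0, 2.2, None, 1.8, 2.1]
scores = compare_imputers(
    column, {"mean": statistics.fmean, "median": statistics.median}
)
print(scores)
```

The imputer with the lowest pooled error would be the one selected; which one wins here depends on the seed and on whether the outlier ends up masked.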

docs/api.rst

Lines changed: 11 additions & 1 deletion
@@ -103,4 +103,14 @@ Diffusion engine

 imputations.imputers_pytorch.ImputerDiffusion
 imputations.diffusions.ddpms.TabDDPM
-imputations.diffusions.ddpms.TsDDPM
+imputations.diffusions.ddpms.TsDDPM
+
+
+Utils
+================
+
+.. autosummary::
+:toctree: generated/
+:template: function.rst
+
+utils.data.add_holes

docs/explanation.rst

Lines changed: 7 additions & 4 deletions
@@ -99,7 +99,7 @@ We compute the associated complete dataset :math:`\hat{X}^{(k)}` for the partial
 -----------------

 Evaluating the imputers requires to generate holes that are representative of the holes at hand.
-The missingness mechanisms have been classified by Rubin [1] into MCAR, MAR and MNAR.
+The missingness mechanisms have been classified by :ref:`Rubin [1]<rubin-article>` into MCAR, MAR and MNAR.

 Suppose we have :math:`X_{obs}`, a subset of a complete data model :math:`X = (X_{obs}, X_{mis})`, which is not fully observable (:math:`X_{mis}` is the missing part).
 We define the matrix :math:`M` such that :math:`M_{ij}=1` if :math:`X_{ij}` is missing, and 0 otherwise, and we assume distribution of :math:`M` is parametrised by :math:`\psi`.
@@ -108,14 +108,14 @@ The observations are said to be Missing Completely at Random (MCAR) if the proba
 Formally,

 .. math::
-P(M | X_{obs}, X_{mis}, \psi) = P(M, \psi), \quad \forall \psi.
+P(M | X_{obs}, X_{mis}, \psi) = P(M | \psi), \quad \forall \psi.

 The observations are said to be Missing at Random (MAR) if the probability of an observation to be missing only depends on the observations. Formally,

 .. math::
 P(M | X_{obs}, X_{mis}, \psi) = P(M | X_{obs}, \psi), \quad \forall \psi, X_{mis}.

-Finally, the observations are said to be Missing Not at Random (MNAR) in all other cases, i.e. if P(M | X_{obs}, X_{mis}, \psi) does not simplify.
+Finally, the observations are said to be Missing Not at Random (MNAR) in all other cases, i.e. if :math:`P(M | X_{obs}, X_{mis}, \psi)` does not simplify.

 Qolmat allows to generate new missing values on a an existing dataset, but only in the MCAR case.
121121

@@ -140,4 +140,7 @@ Qolmat can be used to search for hyperparameters in imputation functions. Let sa

 References
 ----------
-[1] Rubin, Donald B. `Inference and missing data. <https://www.math.wsu.edu/faculty/xchen/stat115/lectureNotes3/Rubin%20Inference%20and%20Missing%20Data.pdf>`_ Biometrika 63.3 (1976): 581-592.
+
+.. _rubin-article:
+
+[1] Rubin, Donald B. `Inference and missing data. <https://www.math.wsu.edu/faculty/xchen/stat115/lectureNotes3/Rubin%20Inference%20and%20Missing%20Data.pdf>`_ Biometrika 63.3 (1976): 581-592.
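
The corrected MCAR formula in the explanation.rst diff above, P(M | X_obs, X_mis, ψ) = P(M | ψ), says that the mask is drawn without looking at the data. One simple instance, sketched here under the assumption that each entry is masked independently with probability ψ, is shown below. The names `mcar_mask` and `apply_mask` are hypothetical and are not part of Qolmat's hole generators.

```python
import random


def mcar_mask(n_rows, n_cols, psi, seed=None):
    """Draw M with M[i][j] = 1 (missing) with probability psi.

    The data X is not an argument: under MCAR the mask only depends on psi.
    """
    rng = random.Random(seed)
    return [
        [1 if rng.random() < psi else 0 for _ in range(n_cols)]
        for _ in range(n_rows)
    ]


def apply_mask(X, M):
    """Replace the entries flagged in M by None, leaving the others untouched."""
    return [
        [None if m else x for x, m in zip(row_x, row_m)]
        for row_x, row_m in zip(X, M)
    ]


# Toy complete dataset with 1000 rows and 3 columns.
X = [[float(3 * i + j) for j in range(3)] for i in range(1000)]
M = mcar_mask(len(X), 3, psi=0.2, seed=42)
X_obs = apply_mask(X, M)

rate = sum(map(sum, M)) / (len(X) * 3)
print(f"empirical missing rate: {rate:.3f}")
```

A MAR or MNAR generator would instead need X (observed or missing values) as an input to the mask draw, which is exactly what the signature of `mcar_mask` rules out.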

examples/tutorials/plot_tuto_diffusion_models.py

Lines changed: 0 additions & 1 deletion
@@ -7,7 +7,6 @@
 and :class:`~qolmat.imputations.diffusions.ddpms.TsDDPM` classes.
 """

-# %%
 import pandas as pd
 import numpy as np
 import matplotlib.pyplot as plt

examples/tutorials/plot_tuto_mean_median.py

Lines changed: 3 additions & 10 deletions
@@ -1,6 +1,6 @@
 """
 ========================================================================================
-Tutorial for comparison between mean and median imputations with uniform hole generation
+Comparison of basic imputers
 ========================================================================================

 In this tutorial, we show how to use the Qolmat comparator
@@ -31,11 +31,10 @@
 # the 82nd column contains the critical temperature which is used as the
 # target variable.
 # The data does not contain missing values; so for the purpose of this notebook,
-# we corrupt the data, with the ``qolmat.utils.data.add_holes`` function.
+# we corrupt the data, with the :func:`qolmat.utils.data.add_holes` function.
 # In this way, each column has missing values.

-df_data = data.get_data("Superconductor")
-df = data.add_holes(df_data, ratio_masked=0.2, mean_size=120)
+df = data.add_holes(data.get_data("Superconductor"), ratio_masked=0.2, mean_size=120)

 # %%
 # The dataset contains 82 columns. For simplicity,
@@ -76,10 +75,6 @@
 imputer_median = imputers.ImputerMedian()
 dict_imputers = {"mean": imputer_mean, "median": imputer_median}

-generator_holes = missing_patterns.UniformHoleGenerator(
-n_splits=2, subset=cols_to_impute, ratio_masked=0.1
-)
-
 metrics = ["mae", "wmape", "KL_columnwise"]

 # %%
@@ -88,9 +83,7 @@
 # (those previously mentioned),
 # a list with the columns names to impute,
 # a generator of holes specifying the type of holes to create.
-# Just a few words about hole generation.
 # in this example, we have chosen the uniform hole generator.
-# You can see what this looks like.
 # For example, by imposing that 10% of missing data be created
 # ``ratio_masked=0.1`` and creating missing values in columns
 # ``subset=cols_to_impute``:
