Commit fcb015a

minor things
1 parent 1c20597 commit fcb015a

1 file changed: +1 -1 lines changed

paper/sections/methodology_2.rmd

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ MMD({X}^\prime,\tilde{X}^\prime) &= \frac{1}{m(m-1)}\sum_{i=1}^m\sum_{j\neq i}^m
\end{aligned}
\end{equation}

-where $X=\{x_1,...,x_m\}$, $\tilde{X}=\{\tilde{x}_1,...,\tilde{x}_n\}$ represent independent and identically distributed samples drawn from probability distributions $\mathcal{X}$ and $\mathcal{\tilde{X}}$ respectively @gretton2012kernel. MMD is a measure of the distance between the kernel mean embeddings of $\mathcal{X}$ and $\mathcal{\tilde{X}}$ in a Reproducing Kernel Hilbert Space, $\mathcal{H}$ [@berlinet2011reproducing]. An important consideration is the choice of the kernel function $k(\cdot,\cdot)$. In our implementation we make use of a Gaussian kernel with a constant length-scale parameter of $0.5$. As the Gaussian kernel captures all moments of distributions $\mathcal{X}$ and $\mathcal{\tilde{X}}$, we have that $MMD(X,\tilde{X})=0$ if and only if $X=\tilde{X}$. Conversely, larger values $MMD(X,\tilde{X})>0$ indicate that it is more likely that $\mathcal{X}$ and $\mathcal{\tilde{X}}$ are different distributions. In our context, large values therefore indicate that a domain shift indeed seems to have occurred.
+where $X=\{x_1,...,x_m\}$, $\tilde{X}=\{\tilde{x}_1,...,\tilde{x}_n\}$ represent independent and identically distributed samples drawn from probability distributions $\mathcal{X}$ and $\mathcal{\tilde{X}}$ respectively @gretton2012kernel. MMD is a measure of the distance between the kernel mean embeddings of $\mathcal{X}$ and $\mathcal{\tilde{X}}$ in a Reproducing Kernel Hilbert Space, $\mathcal{H}$ [@berlinet2011reproducing]. An important consideration is the choice of the kernel function $k(\cdot,\cdot)$. In our implementation, we make use of a Gaussian kernel with a constant length-scale parameter of $0.5$. As the Gaussian kernel captures all moments of distributions $\mathcal{X}$ and $\mathcal{\tilde{X}}$, we have that $MMD(X,\tilde{X})=0$ if and only if $X=\tilde{X}$. Conversely, larger values $MMD(X,\tilde{X})>0$ indicate that it is more likely that $\mathcal{X}$ and $\mathcal{\tilde{X}}$ are different distributions. In our context, large values, therefore, indicate that a domain shift indeed seems to have occurred.

To assess the statistical significance of the observed shifts under the null hypothesis that samples $X$ and $\tilde{X}$ were drawn from the same probability distribution, we follow @arcones1992bootstrap. To that end, we combine the two samples and generate a large number of permutations of $X + \tilde{X}$. We then split the permuted data into two new samples $X^\prime$ and $\tilde{X}^\prime$ of the same sizes as the original samples. Under the null hypothesis, $MMD(X^\prime,\tilde{X}^\prime)$ should be approximately equal to $MMD(X,\tilde{X})$. The corresponding $p$-value can then be estimated as the share of permutations for which $MMD(X^\prime,\tilde{X}^\prime)$ is at least as large as $MMD(X,\tilde{X})$.
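
The permutation procedure described in this paragraph can be sketched in a few lines of Python. This is an illustrative sketch only, not the paper's implementation. It assumes the unbiased estimator of squared MMD, a Gaussian kernel parameterised as $\exp(-\lVert x-y\rVert^2/(2\ell^2))$ with $\ell=0.5$, and a $p$-value defined as the share of permutations whose statistic is at least as large as the observed one. The function names `gaussian_kernel`, `mmd` and `permutation_p_value` are hypothetical.

```python
import math
import random


def gaussian_kernel(x, y, length_scale=0.5):
    """Gaussian kernel; the exp(-||x-y||^2 / (2 l^2)) form is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def mmd(xs, ys, kernel=gaussian_kernel):
    """Unbiased estimate of squared MMD between two samples of points."""
    m, n = len(xs), len(ys)
    # Within-sample terms exclude the diagonal (i == j).
    k_xx = sum(kernel(xs[i], xs[j]) for i in range(m) for j in range(m) if i != j)
    k_yy = sum(kernel(ys[i], ys[j]) for i in range(n) for j in range(n) if i != j)
    # Cross-sample term uses all pairs.
    k_xy = sum(kernel(x, y) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


def permutation_p_value(xs, ys, n_permutations=200, seed=0):
    """Return (observed statistic, permutation p-value)."""
    rng = random.Random(seed)
    observed = mmd(xs, ys)
    pooled = list(xs) + list(ys)  # combine the two samples
    m = len(xs)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # permute the pooled data in place
        # Split into two new samples with the original sizes.
        if mmd(pooled[:m], pooled[m:]) >= observed:
            exceed += 1
    return observed, exceed / n_permutations


if __name__ == "__main__":
    rng = random.Random(1)
    original = [(rng.gauss(0.0, 1.0),) for _ in range(30)]
    shifted = [(rng.gauss(3.0, 1.0),) for _ in range(30)]
    print(permutation_p_value(original, shifted))
```

Each point is a tuple of coordinates, so the same functions work for multivariate samples. A small $p$-value suggests that the observed MMD is unusually large relative to what random relabelling produces, which is the evidence of a domain shift discussed above.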
