
Commit d18d18a: sorted out charts

1 parent 7b8a6bc

File tree

8 files changed: +117 −88 lines changed

Manifest.toml

Lines changed: 71 additions & 70 deletions
Large diffs are not rendered by default.

intro.qmd

Lines changed: 40 additions & 12 deletions
````diff
@@ -15,22 +15,17 @@ To start with, we will look at a proof-of-concept that demonstrates the main obs
 
 We begin by generating the synthetic data for a simple binary classification problem. For illustrative purposes, we will use data that is linearly separable. The chart below shows the data $\mathcal{D}$ at time zero, before any implementation of recourse.
 
+
 ```{julia}
 #| output: true
 #| label: fig-data
 #| fig-cap: "Linearly separable synthetic data"
 
-N = 1000
-xmax = 2
-X, ys = make_blobs(
-    N, 2;
-    centers=2, as_table=false, center_box=(-xmax => xmax), cluster_std=0.1
-)
-ys .= ys.==2
-X = X'
-xs = Flux.unstack(X,2)
-data = zip(xs,ys)
-counterfactual_data = CounterfactualData(X,ys')
+max_obs = 1000
+catalogue = load_synthetic(max_obs)
+counterfactual_data = catalogue[:linearly_separable]
+X = counterfactual_data.X
+ys = vec(counterfactual_data.y)
 plot()
 scatter!(counterfactual_data)
 ```
````
````diff
@@ -44,6 +39,7 @@ n_epochs = 100
 model = Chain(Dense(2,1))
 mod = FluxModel(model)
 Models.train(mod, counterfactual_data; n_epochs=n_epochs)
+mod_orig = deepcopy(mod)
 ```
 
 @fig-model below shows the linear separation of the two classes.
````
````diff
@@ -72,7 +68,7 @@ Markdown.parse(
 ```
 
 ```{julia}
-opt = Flux.Adam(0.01)
+opt = Flux.Descent(0.01)
 gen = GenericGenerator(;decision_threshold=γ, opt=opt)
 ```
 
````
````diff
@@ -143,5 +139,37 @@ plt_single_repeat = plot(mod,counterfactual_data′;zoom=0,colorbar=false,title=
 
 plt = plot(plt_original, plt_single, plt_single_retrained, plt_single_repeat, layout=(1,4), legend=false, axis=nothing, size=(600,165))
 savefig(plt, joinpath(www_path, "poc.png"))
+savefig(plt, "paper/www/poc.png")
+display(plt)
+```
+
+## Mitigation Strategies
+
+```{julia}
+#| output: true
+#| label: fig-mitigate
+#| fig-cap: "Mitigation strategies."
+
+# Generators:
+generators = Dict(
+    "Generic (γ=0.5)" => GenericGenerator(opt = opt, decision_threshold=0.5),
+    "Generic (γ=0.9)" => GenericGenerator(opt = opt, decision_threshold=0.9),
+    "Gravitational" => GravitationalGenerator(opt = opt),
+    "ClaPROAR" => ClapROARGenerator(opt = opt)
+)
+
+# Counterfactuals
+x = select_factual(counterfactual_data, rand(candidates))
+counterfactuals = Dict([name => generate_counterfactual(x, 1, counterfactual_data, mod_orig, gen) for (name, gen) in generators])
+
+# Plots:
+plts = []
+for (name, ce) ∈ counterfactuals
+    plt = plot(ce; title=name, colorbar=false, ticks=false, legend=false, zoom=0)
+    plts = vcat(plts..., plt)
+end
+plt = plot(plts..., size=(750,200), layout=(1,4))
+savefig(plt, joinpath(www_path, "mitigation.png"))
+savefig(plt, "paper/www/mitigation.png")
 display(plt)
 ```
````
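The proof-of-concept that this `intro.qmd` diff builds up iterates a simple loop: grant recourse to a random subset of negatively classified individuals, update the data, retrain, and repeat, which makes the decision boundary drift toward the non-target class. The following is a schematic Python/NumPy sketch of that loop, not the commit's code (which is Julia); the logistic model, `gamma`, learning rates, and cluster parameters are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train(X, y, epochs=500, lr=0.1):
    """Fit a logistic regression classifier by batch gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def recourse(x, w, b, gamma=0.5, lr=0.1, max_steps=1000):
    """Gradient-based counterfactual: move x until p(y=1|x) exceeds gamma."""
    x = x.copy()
    for _ in range(max_steps):
        p = sigmoid(x @ w + b)
        if p > gamma:
            break
        x += lr * (1 - p) * w  # gradient of log p(y=1|x) with respect to x
    return x

# Linearly separable blobs, loosely mirroring the synthetic data in the intro.
X = np.vstack([rng.normal(-1, 0.1, (100, 2)), rng.normal(1, 0.1, (100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])
w, b = train(X, y)

# Repeatedly: a random subset of the negative class implements recourse,
# their features and labels are updated, and the model is retrained.
for _ in range(5):
    idx = rng.choice(np.where(sigmoid(X @ w + b) < 0.5)[0], size=10, replace=False)
    for i in idx:
        X[i] = recourse(X[i], w, b)
        y[i] = 1.0
    w, b = train(X, y)
# The retrained boundary (points x with w @ x + b = 0) drifts toward the
# original negative cluster, i.e. away from the target class.
```

This is only meant to make the dynamics concrete; the commit's actual experiment uses the Julia generators (`GenericGenerator`, `GravitationalGenerator`, `ClapROARGenerator`) shown in the diff above.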

paper/paper.pdf

16.7 KB
Binary file not shown.

paper/paper.tex

Lines changed: 5 additions & 5 deletions
```diff
@@ -657,17 +657,17 @@ \section{Introduction}\label{intro}}
 
 }
 
-\caption{Dynamics in Algorithmic Recourse: (a) we have a simple linear classifier trained for binary classification where samples from the negative class ($y=0$) are marked in blue and samples of the positive class ($y=1$) are marked in orange; (b) the implementation of AR for a random subset of individuals leads to a noticable domain shift; (c) as the classifier is retrained we observe a corresponding model shift; (d) as this process is repeated, the decision boundary moves away from the target class.}\label{fig:poc}
+\caption{Dynamics in Algorithmic Recourse: (a) we have a simple linear classifier trained for binary classification where samples from the negative class ($y=0$) are marked in orange and samples of the positive class ($y=1$) are marked in blue; (b) the implementation of AR for a random subset of individuals leads to a noticeable domain shift; (c) as the classifier is retrained we observe a corresponding model shift; (d) as this process is repeated, the decision boundary moves away from the target class.}\label{fig:poc}
 \end{figure}
 
-We think that these types of endogenous dynamics may be problematic and deserve our attention. From a purely technical perspective we note the following: firstly, model shifts may inadvertently change classification outcomes for individuals who never received and implemented recourse. Secondly, we observe in Figure \ref{fig:poc} that as the decision boundary moves in the direction of the non-target class, counterfactual paths become shorter. We think that in some practical applications, this can be expected to generate costs for involved stakeholders. To follow our argument, consider the following two examples:
+We think that these types of endogenous dynamics may be problematic and deserve our attention. From a purely technical perspective, we note the following: firstly, model shifts may inadvertently change classification outcomes for individuals who never received and implemented recourse. Secondly, we observe in Figure \ref{fig:poc} that as the decision boundary moves in the direction of the non-target class, counterfactual paths become shorter. We think that in some practical applications, this can be expected to generate costs for involved stakeholders. To follow our argument, consider the following two examples:
 
 \begin{example}[Consumer Credit]
-\protect\hypertarget{exm:consumer}{}\label{exm:consumer}Suppose Figure \ref{fig:poc} relates to an automated decision-making system used by a retail bank to evaluate credit applicants with respect to their creditworthiness. Assume that the two features are meaningful in the sense that creditworthiness increases in the south-east direction. Then we can think of the outcome in panel (d) as representing a situation where the bank supplies credit to more borrowers (orange), but these borrowers are on average less creditworthy and more of them can be expected to default on their loan. This represents a cost to the retail bank.
+\protect\hypertarget{exm:consumer}{}\label{exm:consumer}Suppose Figure \ref{fig:poc} relates to an automated decision-making system used by a retail bank to evaluate credit applicants with respect to their creditworthiness. Assume that the two features are meaningful in the sense that creditworthiness increases in the South-East direction. Then we can think of the outcome in panel (d) as representing a situation where the bank supplies credit to more borrowers (orange), but these borrowers are on average less creditworthy and more of them can be expected to default on their loan. This represents a cost to the retail bank.
 \end{example}
 
 \begin{example}[Student Admission]
-\protect\hypertarget{exm:student}{}\label{exm:student}Suppose Figure \ref{fig:poc} relates to an automated decision-making system used by a university in its student admission process. Assume that the two features are meaningful in the sense that the likelihood of students completing their degree increases in the south-east direction. Then we can think of the outcome in panel (b) as representing a situation where more students are admitted to university (orange), but they are more likely to fail their degree than students that were admitted in previous years. The university admission committee catches on to this and suspends its efforts to offer Algorithmic Recourse. This represents an opportunity cost to future student applicants, that may have derived utility from being offered recourse.
+\protect\hypertarget{exm:student}{}\label{exm:student}Suppose Figure \ref{fig:poc} relates to an automated decision-making system used by a university in its student admission process. Assume that the two features are meaningful in the sense that the likelihood of students completing their degree increases in the South-East direction. Then we can think of the outcome in panel (b) as representing a situation where more students are admitted to university (orange), but they are more likely to fail their degree than students that were admitted in previous years. The university admission committee catches on to this and suspends its efforts to offer Algorithmic Recourse. This represents an opportunity cost to future student applicants, that may have derived utility from being offered recourse.
 \end{example}
 
 Both examples are exaggerated simplifications of potential real-world scenarios, but they serve to illustrate the point that recourse for one single individual may exert negative externalities on other individuals.
@@ -821,7 +821,7 @@ \subsubsection{Domain Shifts}\label{domain-shifts}}
 \end{aligned}
 \end{equation}
 
-where \(X=\{x_1,...,x_m\}\), \(\tilde{X}=\{\tilde{x}_1,...,\tilde{x}_n\}\) represent independent and identically distributed samples drawn from probability distributions \(\mathcal{X}\) and \(\mathcal{\tilde{X}}\) respectively \protect\hyperlink{ref-gretton2012kernel}{{[}25{]}}. MMD is a measure of the distance between the kernel mean embeddings of \(\mathcal{X}\) and \(\mathcal{\tilde{X}}\) in a Reproducing Kernel Hilbert Space, \(\mathcal{H}\) \protect\hyperlink{ref-berlinet2011reproducing}{{[}26{]}}. An important consideration is the choice of the kernel function \(k(\cdot,\cdot)\). In our implementation we make use of a Gaussian kernel with a constant length-scale parameter of \(0.5\). As the Gaussian kernel captures all moments of distributions \(\mathcal{X}\) and \(\mathcal{\tilde{X}}\), we have that \(MMD(X,\tilde{X})=0\) if and only if \(X=\tilde{X}\). Conversely, larger values \(MMD(X,\tilde{X})>0\) indicate that it is more likely that \(\mathcal{X}\) and \(\mathcal{\tilde{X}}\) are different distributions. In our context, large values therefore indicate that a domain shift indeed seems to have occurred.
+where \(X=\{x_1,...,x_m\}\), \(\tilde{X}=\{\tilde{x}_1,...,\tilde{x}_n\}\) represent independent and identically distributed samples drawn from probability distributions \(\mathcal{X}\) and \(\mathcal{\tilde{X}}\) respectively \protect\hyperlink{ref-gretton2012kernel}{{[}25{]}}. MMD is a measure of the distance between the kernel mean embeddings of \(\mathcal{X}\) and \(\mathcal{\tilde{X}}\) in a Reproducing Kernel Hilbert Space, \(\mathcal{H}\) \protect\hyperlink{ref-berlinet2011reproducing}{{[}26{]}}. An important consideration is the choice of the kernel function \(k(\cdot,\cdot)\). In our implementation, we make use of a Gaussian kernel with a constant length-scale parameter of \(0.5\). As the Gaussian kernel captures all moments of distributions \(\mathcal{X}\) and \(\mathcal{\tilde{X}}\), we have that \(MMD(X,\tilde{X})=0\) if and only if \(X=\tilde{X}\). Conversely, larger values \(MMD(X,\tilde{X})>0\) indicate that it is more likely that \(\mathcal{X}\) and \(\mathcal{\tilde{X}}\) are different distributions. In our context, large values, therefore, indicate that a domain shift indeed seems to have occurred.
 
 To assess the statistical significance of the observed shifts under the null hypothesis that samples \(X\) and \(\tilde{X}\) were drawn from the same probability distribution, we follow \protect\hyperlink{ref-arcones1992bootstrap}{{[}27{]}}. To that end, we combine the two samples and generate a large number of permutations of \(X + \tilde{X}\). Then, we split the permuted data into two new samples \(X^\prime\) and \(\tilde{X}^\prime\) having the same size as the original samples. Then under the null hypothesis, we should have that \(MMD(X^\prime,\tilde{X}^\prime)\) be approximately equal to \(MMD(X,\tilde{X})\). The corresponding \(p\)-value can then be calculated by counting how often these two quantities are not equal.
```
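The MMD-based shift test described in the `paper.tex` passage above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation (which is in Julia); only the Gaussian kernel with length-scale 0.5 and the permutation-based p-value come from the text, while all function names, sample sizes, and the permutation count are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, lengthscale=0.5):
    """Pairwise Gaussian kernel matrix: k(a, b) = exp(-||a - b||^2 / (2 l^2))."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2 * lengthscale**2))

def mmd2(X, X_tilde, lengthscale=0.5):
    """Biased estimate of the squared Maximum Mean Discrepancy."""
    Kxx = gaussian_kernel(X, X, lengthscale)
    Kyy = gaussian_kernel(X_tilde, X_tilde, lengthscale)
    Kxy = gaussian_kernel(X, X_tilde, lengthscale)
    return Kxx.mean() + Kyy.mean() - 2 * Kxy.mean()

def mmd_permutation_test(X, X_tilde, n_perm=200, lengthscale=0.5, seed=42):
    """p-value: share of permuted splits whose MMD is at least the observed one."""
    rng = np.random.default_rng(seed)
    observed = mmd2(X, X_tilde, lengthscale)
    pooled = np.vstack([X, X_tilde])
    m = len(X)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        count += mmd2(pooled[idx[:m]], pooled[idx[m:]], lengthscale) >= observed
    return observed, count / n_perm

rng = np.random.default_rng(0)
X = rng.normal(0.0, 0.3, size=(100, 2))
X_shifted = rng.normal(1.0, 0.3, size=(100, 2))  # a clear domain shift
mmd_obs, p = mmd_permutation_test(X, X_shifted)
print(mmd_obs, p)  # large MMD with small p suggests the distributions differ
```

Under the null hypothesis the observed statistic looks like a typical permuted one, so the p-value is large; a shifted sample like the one above pushes the observed MMD far into the tail.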

paper/sections/empirical_2.rmd

Lines changed: 1 addition & 1 deletion
```diff
@@ -6,7 +6,7 @@ Below, we first present our main experimental findings regarding these questions
 
 We start this section off with the key high-level observations. Across all datasets (synthetic and real), classifiers and counterfactual generators we observe either most or all of the following dynamics at varying degrees:
 
-- Statistically significant domain and model shift as measured by MMD.
+- Statistically significant domain and model shifts as measured by MMD.
 - A deterioration in out-of-sample model performance as measured by the F-Score evaluated on a test sample. In many cases this drop in performance is substantial.
 - Significant perturbations to the model parameters as well as an increase in the model's decisiveness.
 - Disagreement between the original and retrained model, in some cases large.
```

paper/www/mitigation.png

-3.86 KB

paper/www/placeholder.png

-48.5 KB
Binary file not shown.

paper/www/poc.png

3.03 KB
