
Commit ad8ec4a

minor thing
1 parent aaa7ca2 commit ad8ec4a


5 files changed: +14 -14 lines changed


paper/paper.Rmd

Lines changed: 4 additions & 4 deletions
@@ -125,6 +125,10 @@ knitr::opts_chunk$set(
 ```{r, child=child_docs}
 ```
 
+# Acknowledgements {-}
+
+Some of the members of TU Delft were partially funded by ICAI AI for Fintech Research, an ING --- TU Delft collaboration.
+
 # References {.unnumbered}
 
 ::: {#refs}
@@ -136,8 +140,4 @@ knitr::opts_chunk$set(
 
 Granular results for all of our experiments can be found in this online companion: [https://www.paltmeyer.com/endogenous-macrodynamics-in-algorithmic-recourse/](https://www.paltmeyer.com/endogenous-macrodynamics-in-algorithmic-recourse/). The Github repository containing all the code used to produce the results in this paper can be found here: [https://github.com/pat-alt/endogenous-macrodynamics-in-algorithmic-recourse](https://github.com/pat-alt/endogenous-macrodynamics-in-algorithmic-recourse).
 
-# Acknowledgements {-}
-
-Some of the members of TU Delft were partially funded by ICAI AI for Fintech Research, an ING --- TU Delft collaboration.
-
 

paper/paper.pdf

-1.67 KB
Binary file not shown.

paper/paper.tex

Lines changed: 8 additions & 8 deletions
@@ -801,7 +801,7 @@ \subsection{Simulations}\label{method-2-experiment}}
 
 Note that the operation in line 4 is an assignment, rather than a copy operation, so any updates to `batch' will also affect \(\mathcal{D}\). The function \(\text{eval}(M,\mathcal{D})\) loosely denotes the computation of various evaluation metrics introduced below. In practice, these metrics can also be computed at regular intervals as opposed to every round.
 
-Along with any other fixed parameters affecting the counterfactual search, the parameters \(T\) and \(B\) are assumed as given in Algorithm \ref{algo-experiment}. Still, it is worth noting that the higher these values, the more factual instances undergo recourse throughout the entire experiment. Of course, this is likely to lead to more pronounced domain and model shifts by time \(T\). In our experiments, we choose the values such that \(T \cdot B\) corresponds to the application of recourse on \(\approx50\%\) of the negative instances from the initial dataset. As we compute evaluation metrics at regular intervals throughout the procedure, we can also verify the impact of recourse when it is implemented for a smaller number of individuals.
+Along with any other fixed parameters affecting the counterfactual search, the parameters \(T\) and \(B\) are assumed as given in Algorithm \ref{algo-experiment}. Still, it is worth noting that the higher these values, the more factual instances undergo recourse throughout the entire experiment. Of course, this is likely to lead to more pronounced domain and model shifts by time \(T\). In our experiments, we choose the values such that the majority of the negative instances from the initial dataset receive recourse. As we compute evaluation metrics at regular intervals throughout the procedure, we can also verify the impact of recourse when it is implemented for a smaller number of individuals.
 
 Algorithm \ref{algo-experiment} summarizes the proposed simulation experiment for a given dataset \(\mathcal{D}\), model \(M\) and generator \(G\), but naturally, we are interested in comparing simulation outcomes for different sources of data, models and generators. The framework we have built facilitates this, making use of multi-threading in order to speed up computations. Holding the initial model and dataset constant, the experiments are run for all generators, since our primary concern is to benchmark different recourse methods. To ensure that each generator is faced with the same initial conditions in each round \(t\), the candidate batch of individuals from the non-target class is randomly drawn from the intersection of all non-target class individuals across all experiments \(\left\{\textsc{Experiment}(M,\mathcal{D},G)\right\}_{j=1}^J\) where \(J\) is the total number of generators.

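To make the hunk above easier to follow, here is a minimal, self-contained Julia sketch of the simulation loop it describes. It is not the paper's implementation (that lives in `AlgorithmicRecourseDynamics.jl`): the dataset, classifier and "generator" are deliberately crude stand-ins, and every name (`ToyData`, `ToyModel`, `retrain!`, `recourse`, `experiment!`) is hypothetical. What it does illustrate is the assignment-versus-copy remark (the batch is a view, so overwriting it mutates \(\mathcal{D}\) in place), the warm-started retraining, and the fact that \(\text{eval}(M,\mathcal{D})\) only needs to run at regular intervals.

```julia
using Random, Statistics, LinearAlgebra

mutable struct ToyData
    X::Matrix{Float64}   # columns are instances
    y::Vector{Int}       # 0 = non-target class, 1 = target class
end

mutable struct ToyModel
    w::Vector{Float64}
    b::Float64
end

# Predicted probability of the target class under a logistic classifier.
predict(M::ToyModel, X::AbstractMatrix) = 1 ./ (1 .+ exp.(-(X' * M.w .+ M.b)))

# Warm-started retraining: parameters keep their values from round t-1 and are
# updated for a fixed number of epochs on the (shifted) data.
function retrain!(M::ToyModel, D::ToyData; epochs::Int = 10, lr::Float64 = 0.5)
    for _ in 1:epochs
        p = predict(M, D.X)
        M.w .-= lr .* (D.X * (p .- D.y)) ./ length(D.y)   # gradient of the logistic loss
        M.b -= lr * mean(p .- D.y)
    end
    return M
end

# Crude stand-in for a counterfactual generator: push x along w until the
# classifier assigns it to the target class (or the step budget runs out).
function recourse(M::ToyModel, x::AbstractVector; step::Float64 = 0.1, max_iter::Int = 200)
    x = copy(x)
    dir = M.w ./ (norm(M.w) + 1e-12)
    for _ in 1:max_iter
        predict(M, reshape(x, :, 1))[1] > 0.5 && break
        x .+= step .* dir
    end
    return x
end

# One experiment: T rounds; in each round a batch of negatives receives recourse,
# the data is mutated in place, and the model is retrained with a warm start.
function experiment!(D::ToyData, M::ToyModel; T::Int = 50, frac::Float64 = 0.05, eval_every::Int = 10)
    metrics = Float64[]
    for t in 1:T
        candidates = findall(==(0), D.y)            # factual instances still in the non-target class
        isempty(candidates) && break
        B = max(1, round(Int, frac * length(candidates)))
        idx = shuffle(candidates)[1:B]

        batch = view(D.X, :, idx)                   # a view, not a copy: writes go through to D.X
        for (k, j) in enumerate(idx)
            batch[:, k] = recourse(M, D.X[:, j])    # counterfactual overwrites the factual in D
            D.y[j] = 1                              # the individual now counts as target class
        end

        retrain!(M, D)                              # 10 epochs, warm start from round t-1
        if t % eval_every == 0                      # eval(M, D) at regular intervals, not every round
            push!(metrics, mean((predict(M, D.X) .> 0.5) .== (D.y .== 1)))  # stand-in metric: accuracy
        end
    end
    return metrics
end
```
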

@@ -843,7 +843,7 @@ \subsubsection{Model Shifts}\label{model-shifts}}
 \hypertarget{empirical}{%
 \section{Experiment Setup}\label{empirical}}
 
-This section presents the exact ingredients and parameter choices describing the simulation experiments we ran to produce the findings presented in the next section (\ref{empirical-2}). For convenience, we use Algorithm \ref{algo-experiment} as a template to guide us through this section. A few high-level details upfront: each experiment is run for a total of \(T=50\) rounds, where in each round we provide recourse to five per cent of all individuals in the non-target class, so \(B_t=0.05 * N_t^{\mathcal{D}_0}\)\footnote{As mentioned in the previous section, we end up providing recourse to a total of \(\approx50\%\) by the end of round \(T=50\).}. All classifiers and generative models are retrained for 10 epochs in each round \(t\) of the experiment. Rather than retraining models from scratch, we initialize all parameters at their previous levels (\(t-1\)) and backpropagate for 10 epochs using the new training data as inputs into the existing model. Evaluation metrics are computed and stored every 10 rounds. To account for noise, each individual experiment is repeated five times.\footnote{In the current implementation, we use the same train-test split each time to only account for stochasticity associated with randomly selecting individuals for recourse. An interesting alternative may be to also perform data splitting each time, thereby adding an additional layer of randomness.}
+This section presents the exact ingredients and parameter choices describing the simulation experiments we ran to produce the findings presented in the next section (\ref{empirical-2}). For convenience, we use Algorithm \ref{algo-experiment} as a template to guide us through this section. A few high-level details upfront: each experiment is run for a total of \(T=50\) rounds, where in each round we provide recourse to five per cent of all individuals in the non-target class, so \(B_t=0.05 * N_t^{\mathcal{D}_0}\). All classifiers and generative models are retrained for 10 epochs in each round \(t\) of the experiment. Rather than retraining models from scratch, we initialize all parameters at their previous levels (\(t-1\)) and backpropagate for 10 epochs using the new training data as inputs into the existing model. Evaluation metrics are computed and stored every 10 rounds. To account for noise, each individual experiment is repeated five times.\footnote{In the current implementation, we use the same train-test split each time to only account for stochasticity associated with randomly selecting individuals for recourse. An interesting alternative may be to also perform data splitting each time, thereby adding an additional layer of randomness.}
 
 \hypertarget{empirical-classifiers}{%
 \subsection{\texorpdfstring{\(M\)---Classifiers and Generative Models}{M---Classifiers and Generative Models}}\label{empirical-classifiers}}

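As a hedged illustration of the parameter choices in this hunk (\(T=50\) rounds, recourse for five per cent of the non-target individuals per round, metrics every 10 rounds, five repetitions on a fixed dataset), the snippet below continues the toy sketch shown after the Simulations hunk. The two-blob synthetic data stands in for the paper's real datasets, and reading \(B_t\) as five per cent of the non-target individuals remaining at round \(t\) is an assumption about the notation.

```julia
using Random, Statistics

# Continuing the toy sketch above with the stated choices: T = 50, B_t = 5% of the
# current non-target individuals (assumption), metrics every 10 rounds, 5 repetitions.
function make_toy_data(n_per_class::Int = 500)
    X = hcat(randn(2, n_per_class) .- 1.5, randn(2, n_per_class) .+ 1.5)  # two Gaussian blobs
    y = vcat(zeros(Int, n_per_class), ones(Int, n_per_class))
    return ToyData(X, y)
end

Random.seed!(2023)
D0 = make_toy_data()                            # one fixed dataset, mirroring the same-split setup

results = map(1:5) do rep                       # five repetitions to account for noise
    Random.seed!(rep)                           # randomness only in who is selected for recourse
    D = ToyData(copy(D0.X), copy(D0.y))         # fresh copy so repetitions do not interfere
    M = ToyModel(zeros(2), 0.0)
    retrain!(M, D; epochs = 100)                # initial fit before any recourse is applied
    experiment!(D, M; T = 50, frac = 0.05, eval_every = 10)
end

# Average the five accuracy trajectories (one entry per 10 rounds).
println(vec(mean(hcat(results...); dims = 2)))
```
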
@@ -1113,6 +1113,12 @@ \section{Concluding Remarks}\label{conclusion}}
 
 This work has revisited and extended some of the most general and defining concepts underlying the literature on Counterfactual Explanations and, in particular, Algorithmic Recourse. We demonstrate that long-held beliefs as to what defines optimality in AR, may not always be suitable. Specifically, we run experiments that simulate the application of recourse in practice using various state-of-the-art counterfactual generators and find that all of them induce substantial domain and model shifts. We argue that these shifts should be considered as an expected external cost of individual recourse and call for a paradigm shift from individual to collective recourse in these types of situations. By proposing an adapted counterfactual search objective that incorporates this cost, we make that paradigm shift explicit. We show that this modified objective lends itself to mitigation strategies that can be used to effectively decrease the magnitude of induced domain and model shifts. Through our work, we hope to inspire future research on this important topic. To this end we have open-sourced all of our code along with a Julia package: \href{https://anonymous.4open.science/r/AlgorithmicRecourseDynamics/README.md}{\texttt{AlgorithmicRecourseDynamics.jl}}. Future researchers should find it easy to replicate, modify and extend the simulation experiments presented here and apply them to their own custom counterfactual generators.
 
+\hypertarget{acknowledgements}{%
+\section*{Acknowledgements}\label{acknowledgements}}
+\addcontentsline{toc}{section}{Acknowledgements}
+
+Some of the members of TU Delft were partially funded by ICAI AI for Fintech Research, an ING --- TU Delft collaboration.
+
 \hypertarget{references}{%
 \section*{References}\label{references}}
 \addcontentsline{toc}{section}{References}
@@ -1285,11 +1291,5 @@ \section*{Appendix}\label{appendix}}
 
 Granular results for all of our experiments can be found in this online companion: \url{https://www.paltmeyer.com/endogenous-macrodynamics-in-algorithmic-recourse/}. The Github repository containing all the code used to produce the results in this paper can be found here: \url{https://github.com/pat-alt/endogenous-macrodynamics-in-algorithmic-recourse}.
 
-\hypertarget{acknowledgements}{%
-\section*{Acknowledgements}\label{acknowledgements}}
-\addcontentsline{toc}{section}{Acknowledgements}
-
-Some of the members of TU Delft were partially funded by ICAI AI for Fintech Research, an ING --- TU Delft collaboration.
-
 \end{document}

paper/sections/empirical.rmd

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Experiment Setup {#empirical}
 
-This section presents the exact ingredients and parameter choices describing the simulation experiments we ran to produce the findings presented in the next section (\@ref(empirical-2)). For convenience, we use Algorithm \ref{algo-experiment} as a template to guide us through this section. A few high-level details upfront: each experiment is run for a total of $T=50$ rounds, where in each round we provide recourse to five per cent of all individuals in the non-target class, so $B_t=0.05 * N_t^{\mathcal{D}_0}$^[As mentioned in the previous section, we end up providing recourse to a total of $\approx50\%$ by the end of round $T=50$.]. All classifiers and generative models are retrained for 10 epochs in each round $t$ of the experiment. Rather than retraining models from scratch, we initialize all parameters at their previous levels ($t-1$) and backpropagate for 10 epochs using the new training data as inputs into the existing model. Evaluation metrics are computed and stored every 10 rounds. To account for noise, each individual experiment is repeated five times.^[In the current implementation, we use the same train-test split each time to only account for stochasticity associated with randomly selecting individuals for recourse. An interesting alternative may be to also perform data splitting each time, thereby adding an additional layer of randomness.]
+This section presents the exact ingredients and parameter choices describing the simulation experiments we ran to produce the findings presented in the next section (\@ref(empirical-2)). For convenience, we use Algorithm \ref{algo-experiment} as a template to guide us through this section. A few high-level details upfront: each experiment is run for a total of $T=50$ rounds, where in each round we provide recourse to five per cent of all individuals in the non-target class, so $B_t=0.05 * N_t^{\mathcal{D}_0}$. All classifiers and generative models are retrained for 10 epochs in each round $t$ of the experiment. Rather than retraining models from scratch, we initialize all parameters at their previous levels ($t-1$) and backpropagate for 10 epochs using the new training data as inputs into the existing model. Evaluation metrics are computed and stored every 10 rounds. To account for noise, each individual experiment is repeated five times.^[In the current implementation, we use the same train-test split each time to only account for stochasticity associated with randomly selecting individuals for recourse. An interesting alternative may be to also perform data splitting each time, thereby adding an additional layer of randomness.]
 
 ## $M$---Classifiers and Generative Models {#empirical-classifiers}

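One hedged note on the change above, which removes the footnote claiming recourse for a total of $\approx50\%$ by round $T=50$: if $B_t=0.05 * N_t^{\mathcal{D}_0}$ is read as five per cent of the non-target individuals still present at round $t$, the share of the initial negatives never selected after $T=50$ rounds is $(1-0.05)^{50} \approx 0.08$, i.e. roughly nine out of ten negatives would receive recourse. That back-of-the-envelope figure is consistent with the more cautious "majority" wording introduced elsewhere in this commit, but the reading of the notation is an assumption, not something stated in the diff.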

paper/sections/methodology_2.rmd

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ In order to simulate the dynamic process, we suppose that the model $M$ is retra
 
 Note that the operation in line 4 is an assignment, rather than a copy operation, so any updates to 'batch' will also affect $\mathcal{D}$. The function $\text{eval}(M,\mathcal{D})$ loosely denotes the computation of various evaluation metrics introduced below. In practice, these metrics can also be computed at regular intervals as opposed to every round.
 
-Along with any other fixed parameters affecting the counterfactual search, the parameters $T$ and $B$ are assumed as given in Algorithm \ref{algo-experiment}. Still, it is worth noting that the higher these values, the more factual instances undergo recourse throughout the entire experiment. Of course, this is likely to lead to more pronounced domain and model shifts by time $T$. In our experiments, we choose the values such that $T \cdot B$ corresponds to the application of recourse on $\approx50\%$ of the negative instances from the initial dataset. As we compute evaluation metrics at regular intervals throughout the procedure, we can also verify the impact of recourse when it is implemented for a smaller number of individuals.
+Along with any other fixed parameters affecting the counterfactual search, the parameters $T$ and $B$ are assumed as given in Algorithm \ref{algo-experiment}. Still, it is worth noting that the higher these values, the more factual instances undergo recourse throughout the entire experiment. Of course, this is likely to lead to more pronounced domain and model shifts by time $T$. In our experiments, we choose the values such that the majority of the negative instances from the initial dataset receive recourse. As we compute evaluation metrics at regular intervals throughout the procedure, we can also verify the impact of recourse when it is implemented for a smaller number of individuals.
 
 Algorithm \ref{algo-experiment} summarizes the proposed simulation experiment for a given dataset $\mathcal{D}$, model $M$ and generator $G$, but naturally, we are interested in comparing simulation outcomes for different sources of data, models and generators. The framework we have built facilitates this, making use of multi-threading in order to speed up computations. Holding the initial model and dataset constant, the experiments are run for all generators, since our primary concern is to benchmark different recourse methods. To ensure that each generator is faced with the same initial conditions in each round $t$, the candidate batch of individuals from the non-target class is randomly drawn from the intersection of all non-target class individuals across all experiments $\left\{\textsc{Experiment}(M,\mathcal{D},G)\right\}_{j=1}^J$ where $J$ is the total number of generators.

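The closing paragraph of this hunk describes how each round's candidate batch is drawn from the intersection of the individuals that remain in the non-target class across all $J$ generator experiments, so that every generator starts the round from the same pool. A minimal sketch of that idea follows, in the same toy Julia style as above; `shared_batch` and its arguments are illustrative names, not the package's API.

```julia
using Random

# Hedged sketch of the shared-candidate-pool idea: each round's batch is drawn
# from the indices that are still non-target in *every* generator's experiment,
# so all generators face the same initial conditions in round t.
function shared_batch(negatives_per_experiment::Vector{Vector{Int}}, B::Int;
                      rng = Random.default_rng())
    pool = reduce(intersect, negatives_per_experiment)   # still negative under all J generators
    return shuffle(rng, pool)[1:min(B, length(pool))]
end

# Example with J = 3 generator experiments that have diverged slightly:
negatives = [[1, 2, 3, 4, 5, 7, 9], [1, 2, 3, 5, 7, 8, 9], [2, 3, 4, 5, 7, 9, 10]]
batch = shared_batch(negatives, 3)    # three indices drawn from the intersection {2, 3, 5, 7, 9}
```
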
