
Commit d5905aa

Merge pull request #121 from pythonhealthdatascience/dev
2 parents 187b1b6 + a4a50d8 · commit d5905aa

File tree: 33 files changed, +1977 additions, -237 deletions


.flake8

Lines changed: 2 additions & 2 deletions
```diff
@@ -3,15 +3,15 @@ per-file-ignores =
     docstrings*.py: F811
     length_warmup*.py: F401,F821
     logs*.py: F811
-    mathematical*.py: F401
+    mathematical*.py: F401,F821,E402
     n_reps*.py: E0602,F401,F821,W0611
     outputs*.py: F811
     parallel*.py: F401,F821
     parameters_file*.py: E402,F811,E0102
     parameters_validation*.py: F821
     replications*.py: F401,F811,F821
     scenarios*.py: F401,F821
-    tables_figures*.py: F401,F821
+    tables_figures*.py: F401,F821,E402
     tests*.py: F821
     tests_resources/*.py: E0401
     */outputs_resources/*.py: E261,E262,F821
```

.lintr

Lines changed: 12 additions & 8 deletions
```diff
@@ -5,23 +5,28 @@ exclusions: list(
     unused_import_linter = Inf,
     object_usage_linter = Inf
   ),
+  "pages/experiments/tables_figures.qmd" = list(
+    object_name_linter = 65
+  ),
   "pages/inputs/parameters_validation.qmd" = list(
     object_usage_linter = 771:772
   ),
   "pages/output_analysis/length_warmup.qmd" = list(
-    unused_import_linter = Inf
+    unused_import_linter = Inf,
+    object_usage_linter = Inf
+  ),
+  "pages/output_analysis/length_warmup_resources/metrics.R" = list(
+    object_usage_linter = Inf
   ),
   "pages/output_analysis/n_reps.qmd" = list(
     unused_import_linter = Inf,
     object_usage_linter = Inf
   ),
   "pages/output_analysis/outputs.qmd" = list(
-    one_call_pipe_linter = Inf,
     line_length_linter = 2850
   ),
   "pages/output_analysis/parallel.qmd" = list(
-    object_usage_linter = Inf,
-    one_call_pipe_linter = Inf
+    object_usage_linter = Inf
   ),
   "pages/output_analysis/outputs_resources/model.R" = list(
     object_usage_linter = Inf
@@ -34,10 +39,9 @@ exclusions: list(
   ),
   "pages/style_docs/linting_resources/code.R",
   "pages/verification_validation/mathematical.qmd" = list(
-    unused_import_linter = Inf
-  ),
-  "pages/verification_validation/tests_resources/simulation.R" = list(
-    one_call_pipe_linter = 228
+    unused_import_linter = Inf,
+    object_usage_linter = Inf,
+    library_call_linter = Inf
   ),
   "pages/verification_validation/tests_resources/test_back.R" = list(
     expect_identical_linter = Inf
```

_quarto.yml

Lines changed: 5 additions & 0 deletions
```diff
@@ -23,6 +23,7 @@ website:
         - pages/project/stars.qmd
       - section: "Introduction"
         contents:
+          - pages/intro/des.qmd
           - pages/intro/rap.qmd
           - pages/intro/guidelines.qmd
           - pages/intro/foss.qmd
@@ -77,6 +78,10 @@ website:
           - pages/sharing/citation.qmd
          - pages/sharing/changelog.qmd
          - pages/sharing/archive.qmd
+      - section: "Closing remarks"
+        contents:
+          - pages/further_info/conclusion.qmd
+          - pages/further_info/feedback.qmd
   favicon: images/stars_logo_blue.png
   navbar:
     collapse: false
```

pages/experiments/scenarios.qmd

Lines changed: 14 additions & 0 deletions
```diff
@@ -21,6 +21,9 @@ bibliography: scenarios_resources/references.bib
 **Relevant reproducibility guidelines:**
 
 * STARS Reproducibility Recommendations (⭐): Provide code for all scenarios and sensitivity analyses.
+* STARS Reproducibility Recommendations: Save outputs to a file.
+* STARS Reproducibility Recommendations: Avoid excessive output files.
+* STARS Reproducibility Recommendations: Address large file sizes.
 
 **Pre-reading:**
 
@@ -441,6 +444,17 @@ kable(sensitivity_results) |> scroll_box(height = "400px")
 
 :::
 
+## Saving results
+
+Saving your simulation results to file is important for reproducibility, as it allows others to verify your findings and generate consistent (or new) figures and analyses, even if they can't re-run your simulation.
+This practice is transparent, providing a clear record of what you found. It is also valuable for you, ensuring you always know exactly what results you obtained and can regenerate your own tables and figures from them.
+
+However, there are two key things to keep in mind:
+
+**1. Number of files.** Running many scenarios or replications can easily lead to an explosion of output files. Do not save each scenario or run as a separate file unless there is a specific need. Instead, combine all results into a **single file with columns marking scenario and replication IDs**.
+
+**2. Avoid large file sizes.** Be strategic about what you save. For short tests and debugging, saving detailed ("patient-level") results makes sense. But for full-scale runs with many replications, those files can become unmanageably large. Generally, **save summary outputs** to file for analysis (e.g. means from each run), not massive raw datasets. If you absolutely need to save or share large files, use compressed formats (e.g. `csv.gz`). Also, keep in mind practical size limits for version control: for example, GitHub's individual file size limit is 100 MB.
+
 ## Explore the example models
 
 <div class="h3-tight"></div>
```
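The "single combined file" and "compressed summary outputs" advice added above can be sketched in a few lines of Python. This is a hypothetical illustration, not code from the book's example models: the scenario names, column names, and output file name are invented, and the random draws stand in for real run summaries.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Collect one summary row per (scenario, replication) pair, rather than
# writing a separate file for each run. Names here are illustrative only.
frames = []
for scenario in ["base", "extra_doctor"]:
    for rep in range(3):
        frames.append(pd.DataFrame({
            "scenario": [scenario],
            "replication": [rep],
            # Stand-in for a real summary metric from one run
            "mean_wait_time": [rng.uniform(5, 15)],
        }))

# One tidy table with scenario and replication ID columns...
results = pd.concat(frames, ignore_index=True)

# ...saved as a single compressed file (pandas infers gzip from the suffix)
results.to_csv("scenario_results.csv.gz", index=False)
```

The same pattern works in R: bind run summaries into one data frame (e.g. with `dplyr::bind_rows()`) and write it with `readr::write_csv()`, which also infers gzip compression from a `.csv.gz` path.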

pages/experiments/tables_figures.qmd

Lines changed: 43 additions & 23 deletions
````diff
@@ -8,6 +8,7 @@ bibliography: scenarios_resources/references.bib
 ```{python}
 #| echo: false
 # pylint: disable=too-many-locals,undefined-variable,unused-import
+# pylint: disable=pointless-statement
 ```
 
 :::: {.pale-blue}
@@ -165,7 +166,7 @@ def summarise_scenarios(results, groups, result_vars, path_prefix=None):
     result_vars : list
         List of performance measures to get results on (provided as strings).
     path_prefix : str, optional
-        Path prefix to save tables to. Each metric will be saved as 
+        Path prefix to save tables to. Each metric will be saved as
         {path_prefix}_{metric}.csv
 
     Returns
@@ -182,8 +183,10 @@ def summarise_scenarios(results, groups, result_vars, path_prefix=None):
         .apply(pd.Series)
         .reset_index()
     )
-    summary_table.columns = (list(summary_table.columns[:-4]) +
-                             ["mean", "std_dev", "ci_lower", "ci_upper"])
+    summary_table.columns = (
+        list(summary_table.columns[:-4]) +
+        ["mean", "std_dev", "ci_lower", "ci_upper"]
+    )
 
     # Add column to identify which metric this is
     summary_table["metric"] = result_var
@@ -312,7 +315,9 @@ HTML(to_html_datatable(sensitivity_tables["mean_patients_in_system"]))
 #'
 #' @return A named list of summary data frames.
 
-summarise_scenarios <- function(results, groups, result_vars, path_prefix = NULL) {
+summarise_scenarios <- function(
+  results, groups, result_vars, path_prefix = NULL
+) {
   summary_tables <- list()
 
   for (result_var in result_vars) {
@@ -321,8 +326,8 @@ summarise_scenarios <- function(results, groups, result_vars, path_prefix = NULL
     summarise(
       mean = mean(.data[[result_var]], na.rm = TRUE),
       std_dev = sd(.data[[result_var]], na.rm = TRUE),
-      ci_lower = t.test(.data[[result_var]])$conf.int[1],
-      ci_upper = t.test(.data[[result_var]])$conf.int[2],
+      ci_lower = t.test(.data[[result_var]])$conf.int[1L],
+      ci_upper = t.test(.data[[result_var]])$conf.int[2L],
       .groups = "drop"
     ) |>
     mutate(metric = result_var)
@@ -344,7 +349,7 @@ summarise_scenarios <- function(results, groups, result_vars, path_prefix = NULL
 ### Scenario analysis
 
 ```{r}
-result_variables = c(
+result_variables <- c(
   "mean_wait_time_doctor",
   "utilisation_doctor",
   "mean_queue_length_doctor",
@@ -699,49 +704,64 @@ This function is used to plot the results from the scenarios and sensitivity analyses
 ```{r}
 #' Plot multiple performance measures at once
 #'
-#' @param summary_tables Named list of summary tables, one per metric (like output from summarise_scenarios()).
+#' @param summary_tables Named list of summary tables, one per metric
+#' (like output from summarise_scenarios()).
 #' @param x_var Name of variable to plot on x axis.
 #' @param colour_var Name of variable to colour lines with (can be NULL).
 #' @param name_mappings Optional named list for prettier axis/legend labels.
 #' @param path_prefix Optional path prefix to save figures.
 #' @return Named list of ggplot objects
 
 plot_metrics <- function(
-  summary_tables, x_var, colour_var = NULL,
-  name_mappings = NULL, path_prefix = NULL
+  summary_tables, x_var, colour_var = NULL, name_mappings = NULL,
+  path_prefix = NULL
 ) {
+  # List to store plots for each metric
   plot_list <- list()
 
  for (metric_name in names(summary_tables)) {
+    # Extract relevant results table
     summary_table <- summary_tables[[metric_name]]
 
-    y_var <- "mean"
-    ci_lower <- "ci_lower"
-    ci_upper <- "ci_upper"
-
-    xaxis_title <- if (!is.null(name_mappings[[x_var]])) name_mappings[[x_var]] else x_var
-    yaxis_title <- if (!is.null(name_mappings[[metric_name]])) name_mappings[[metric_name]] else metric_name
-    legend_title <- if (!is.null(colour_var) && !is.null(name_mappings[[colour_var]])) name_mappings[[colour_var]] else colour_var
+    # Helper to map a variable to display name if available in `name_mappings`.
+    # Just uses variable name if no mapping is found.
+    get_name <- function(var) {
+      if (!is.null(var) && !is.null(name_mappings[[var]])) {
+        name_mappings[[var]]
+      } else {
+        var
+      }
+    }
+    xaxis_title <- get_name(x_var)
+    yaxis_title <- get_name(metric_name)
+    legend_title <- get_name(colour_var)
 
+    # Create plot, with or without grouping colour variable
     if (!is.null(colour_var)) {
       summary_table[[colour_var]] <- as.factor(summary_table[[colour_var]])
-      p <- ggplot(summary_table, aes_string(x = x_var, y = y_var, group = colour_var, color = colour_var, fill = colour_var)) +
+      p <- ggplot(summary_table,
+                  aes_string(x = x_var, y = "mean", group = colour_var,
+                             color = colour_var, fill = colour_var)) +
         geom_line() +
-        geom_ribbon(aes_string(ymin = ci_lower, ymax = ci_upper), alpha = 0.1) +
-        labs(x = xaxis_title, y = yaxis_title, color = legend_title, fill = legend_title) +
+        geom_ribbon(aes_string(ymin = "ci_lower", ymax = "ci_upper"),
+                    alpha = 0.1) +
+        labs(x = xaxis_title, y = yaxis_title, color = legend_title,
+             fill = legend_title) +
         theme_minimal()
     } else {
-      p <- ggplot(summary_table, aes_string(x = x_var, y = y_var)) +
+      p <- ggplot(summary_table, aes_string(x = x_var, y = "mean")) +
         geom_line() +
-        geom_ribbon(aes_string(ymin = ci_lower, ymax = ci_upper), alpha = 0.1, show.legend = FALSE) +
+        geom_ribbon(aes_string(ymin = "ci_lower", ymax = "ci_upper"),
+                    alpha = 0.1, show.legend = FALSE) +
         labs(x = xaxis_title, y = yaxis_title) +
         theme_minimal()
     }
 
     # Save plot if prefix supplied
     if (!is.null(path_prefix)) {
       output_path <- paste0(path_prefix, "_", metric_name, ".png")
-      ggsave(filename = output_path, plot = p, width = 6.5, height = 4, bg = "white")
+      ggsave(filename = output_path, plot = p, width = 6.5, height = 4L,
+             bg = "white")
     }
 
     plot_list[[metric_name]] <- p
````

pages/further_info/conclusion.qmd

Lines changed: 127 additions & 0 deletions
```diff
@@ -0,0 +1,127 @@
+---
+title: Conclusion
+---
+
+<!-- Hide as no python-content or r-content blocks -->
+<style>
+#quarto-announcement {
+  display: none !important;
+}
+</style>
+
+<br>
+
+::: {.pale-blue}
+
+**Well done, you made it to the end!** 😁
+
+This book has shared with you the knowledge and tools to create simulation models in Python or R as part of a reproducible analytical pipeline - models that others can reproduce, trust, understand, and build upon.
+
+:::
+
+## What's next?
+
+<div class="h3-tight"></div>
+
+### Explore the example models
+
+These example models demonstrate many of the practices covered in this book. They may seem daunting at first - there's a lot going on - but having worked through the book, you're in a good position to understand them.
+
+Remember, these are **examples**, not prescriptions. They're not perfect, and there's no single "right way" to build reproducible models. They simply show one approach to implementing the principles you've learned.
+
+**Nurse visit simulation:**
+
+{{< include ../../html/pydesrapmms.html >}}
+
+{{< include ../../html/rdesrapmms.html >}}
+
+**Stroke pathway simulation:**
+
+{{< include ../../html/pydesrapstroke.html >}}
+
+{{< include ../../html/rdesrapstroke.html >}}
+
+### Make your own model
+
+The best way to solidify what you've learned is to apply it. When planning your model, remember that a good simulation starts with **conceptual modelling**. As defined in Robinson (2007):
+
+> "The conceptual model is a non-software specific description of the simulation model that is to be developed, describing the objectives, inputs, outputs, content, assumptions and simplifications of the model."
+
+Some good resources on conceptual modelling include:
+
+* Robinson, Stewart. 2007. "Chapter 5: Conceptual Modelling." In Simulation: The Practice of Model Development and Use, 63–75. John Wiley & Sons.
+* Robinson, Stewart. 2007. "Chapter 6: Developing the Conceptual Model." In Simulation: The Practice of Model Development and Use, 77–93. John Wiley & Sons.
+
+This book focused on building simple model structures to help you establish robust foundations and reproducible workflows. However, real-world simulation models often involve additional features and complexities, such as reneging, balking, priority classes, resource scheduling, branching, blocking, or more detailed patient pathways.
+
+For inspiration on implementing a wider range of features, the [HSMA "little book of DES"](https://des.hsma.co.uk/) is an excellent resource. Its examples use Python, and the setup or structure may differ from those in this book, but the simulation principles apply whatever language you use. Focus on understanding the logic - how features are implemented and why - then adapt those ideas for the language, structure, or workflow that best fits your own model.
+
+### Review your existing models
+
+Already have simulation models in development or completed? Now's a great time to audit them against the practices you've learned.
+
+Use the checklists linked below to identify what you've already achieved (woohoo!) and what's missing. Then revisit relevant sections of the book to fill the gaps. Even small improvements - adding seeds, externalising parameters, or documenting dependencies - can significantly enhance your model's reproducibility.
+
+## Checklists
+
+Download checklists to **audit existing models** or **guide development of new models**.
+
+You can see examples of completed checklists in the nurse visit and stroke simulation example model repositories.
+
+{{< downloadthis conclusion_resources/stars_reproducibility_recommendations.md dname="stars_reproducibility_recommendations" label="Download the STARS reproducibility recommendations" type="primary" >}}
+
+{{< downloadthis conclusion_resources/nhs_levels_of_rap.md dname="nhs_levels_of_rap" label="Download the NHS Levels of RAP maturity framework" type="primary" >}}
+
+Also, don't forget about the handy **verification and validation** checklist:
+
+{{< downloadthis ../verification_validation/verification_validation_resources/verification_validation_checklist.md dname="verification_validation_checklist" label="Download the verification and validation checklist" type="primary" >}}
+
+## Acknowledgements
+
+This book builds upon the generous work of many contributors to the open-source and simulation communities. We are particularly grateful for:
+
+* The **SimPy** and **simmer** development teams for creating and maintaining excellent open-source DES libraries.
+* The **NHS RAP Community of Practice** for their maturity framework.
+* The **HSMA Programme** (Health Service Modelling Associates) for their [little book of DES](https://des.hsma.co.uk/).
+* All the **researchers and practitioners who have openly shared their simulation models**, enabling the research that informed this book.
+* **Contributors and reviewers** who have provided feedback to improve this resource.
+
+Full references and citations appear throughout the book where specific resources are discussed.
+
+### Please cite this book
+
+The code in this book is licensed under an **MIT License**, and the text is under **CC-BY-SA**, making it free to use, modify and share. However, we kindly ask/require that you **cite or acknowledge** this work when you use it.
+
+Suggested citation:
+
+> Heather, A., Monks, T., Mustafee, N., & Harper, A. (2025). DES RAP Book: Reproducible Discrete-Event Simulation in Python and R. https://github.com/pythonhealthdatascience/des_rap_book. https://doi.org/10.5281/zenodo.17094155.
+
+## Find out more about STARS
+
+This book is part of the **STARS (Sharing Tools and Artefacts for Reusable and Reproducible Simulations)** project, supported by the Medical Research Council [grant number MR/Z503915/1].
+
+![](../../images/stars_banner.png)
+
+STARS tackles the challenges of sharing, reusing, and reproducing discrete-event simulation (DES) models in healthcare. Our goal is to create open resources using the two most popular open-source languages for DES: Python and R. As part of this project, you'll find tutorials, code examples, and tools to help researchers and practitioners develop, validate, and share DES models more effectively.
+
+Learn more:
+
+* **GitHub organisation:** <https://github.com/pythonhealthdatascience>
+
+<!-- TODO: Add link to STARS summary website once created -->
+
+## Well done! It's a journey, not a race
+
+Whether you're new to reproducible simulation or deepening your existing practice, you've taken important steps toward building more trustworthy, transparent models.
+
+Implementing these practices requires time and iteration. **Perfection is not required immediately - or ever**. Many of us are crunched for time, juggling multiple projects and deadlines. Finding space to improve workflows can feel impossible.
+
+When time is limited, focus on what matters most for your specific project. Go through the checklists or flip back through the book to identify key steps that would make the biggest difference. **Every small change moves your work forward.**
+
+**Don't be afraid to share your model**, even if it doesn't have all the bells and whistles. You benefit from others' shared work; others benefit from yours. Shared models spark conversations, enable collaboration, and push the field forward. Each model shared raises the bar for transparency and makes it easier for the next person to do the same.
+
+<br>
+
+*See the next page for details on giving feedback and contributing to this resource.*
+
+<br><br>
```
