Commit 91c5c18 ("Finish the last section")
1 parent 0b80363 commit 91c5c18

1 file changed: usage/threadsafe-evaluation/index.qmd (52 additions, 19 deletions)
@@ -14,7 +14,7 @@ This page specificaly discusses Turing's support for threadsafe model evaluation
 
 :::{.callout-note}
 Please note that this is a rapidly-moving topic, and things may change in future releases of Turing.
-If you are ever unsure about what works and doesn't, please don't hesitate to ask on Slack or Discourse (links can be found at the footer of this site)!
+If you are ever unsure about what works and doesn't, please don't hesitate to ask on [Slack](https://julialang.slack.com/archives/CCYDC34A0) or [Discourse](https://discourse.julialang.org/c/domain/probprog/48)!
 :::
 
 ## MCMC sampling
@@ -102,7 +102,7 @@ sample(model, NUTS(), 100; check_model=false, progress=false)
 ::: {.callout-warning}
 ## Upcoming changes
 
-In the next release of Turing, if you use tilde-observations inside threaded blocks, you will have to declare this upfront using:
+Starting from DynamicPPL 0.39, if you use tilde-statements or `@addlogprob!` inside threaded blocks, you will have to declare this upfront using:
 
 ```julia
 model = threaded_obs() | (; y = randn(N))
@@ -136,7 +136,14 @@ model = threaded_assume_bad(100)
 model()
 ```
 
-**Note, in particular, that this means that you cannot use `predict` to sample new data in parallel.**
+**Note, in particular, that this means that you cannot currently use `predict` to sample new data in parallel.**
+
+:::{.callout-note}
+## Threaded `predict`
+
+Support for threaded `predict` will be added in DynamicPPL 0.39 (see [this pull request](https://github.com/TuringLang/DynamicPPL.jl/pull/1130)).
+:::
+
 That is, even for `threaded_obs` where `y` was originally an observed term, you _cannot_ do:
 
 ```{julia}
@@ -148,13 +155,6 @@ pmodel = threaded_obs(N) # don't condition on data
 predict(pmodel, chn)
 ```
 
-
-:::{.callout-note}
-## Threaded `predict`
-
-Support for the above call to `predict` may land in the near future, with [this pull request](https://github.com/TuringLang/DynamicPPL.jl/pull/1130).
-:::
-
 ## Alternatives to threaded observation
 
 An alternative to using threaded observations is to manually calculate the log-likelihood term (which can be parallelised using any of Julia's standard mechanisms), and then _outside_ of the threaded block, [add it to the model using `@addlogprob!`]({{< meta usage-modifying-logprob >}}).
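This alternative can be sketched as follows. Note that the model name `threaded_obs_addlogprob` is taken from its usage later on this page, but the body below is an illustrative reconstruction under assumptions, not the page's actual definition:

```julia
using Turing

# Hedged sketch: the name `threaded_obs_addlogprob` appears later on this page,
# but this body is an illustrative reconstruction, not the page's own code.
@model function threaded_obs_addlogprob(N, y)
    x ~ Normal()
    # Compute per-observation log-likelihoods in parallel. Each iteration
    # writes to its own slot of `lls`, so there is no data race.
    lls = zeros(typeof(x), N)  # eltype follows x so this also works under autodiff
    Threads.@threads for i in 1:N
        lls[i] = logpdf(Normal(x), y[i])
    end
    # Add the total log-likelihood *outside* the threaded block.
    @addlogprob! sum(lls)
end
```

With this structure, only plain Julia code runs inside `Threads.@threads`; the single `@addlogprob!` call happens serially, which is what makes the pattern safe.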
@@ -187,7 +187,12 @@ See [this Discourse post](https://discourse.julialang.org/t/parallelism-within-t
 
 We make no promises about the use of tilde-statements _with_ these packages (indeed it will most likely error), but as long as you use them to only parallelise regular Julia code (i.e., not tilde-statements), they will work as intended.
 
-One benefit of rewriting the model this way is that sampling from this model with `MCMCThreads()` will always be reproducible.
+The main downsides of this approach are:
+
+1. You can't use conditioning syntax to provide data; it has to be passed as an argument or otherwise included inside the model.
+2. You can't use `predict` to sample new data.
+
+On the other hand, one benefit of rewriting the model this way is that sampling from this model with `MCMCThreads()` will always be reproducible.
 
 ```{julia}
 using Random
@@ -196,17 +201,19 @@ y = randn(N)
 model = threaded_obs_addlogprob(N, y)
 nuts_kwargs = (check_model=false, progress=false, verbose=false)
 
-chain1 = sample(Xoshiro(468), model, NUTS(), MCMCThreads(), 100, 4; nuts_kwargs...)
-chain2 = sample(Xoshiro(468), model, NUTS(), MCMCThreads(), 100, 4; nuts_kwargs...)
+chain1 = sample(Xoshiro(468), model, NUTS(), MCMCThreads(), 1000, 4; nuts_kwargs...)
+chain2 = sample(Xoshiro(468), model, NUTS(), MCMCThreads(), 1000, 4; nuts_kwargs...)
 mean(chain1[:x]), mean(chain2[:x]) # should be identical
 ```
 
 In contrast, the original `threaded_obs` (which used tilde inside `Threads.@threads`) is not reproducible when using `MCMCThreads()`.
+(In principle, we would like to fix this bug, but we haven't yet investigated where it stems from.)
 
 ```{julia}
 model = threaded_obs(N) | (; y = y)
-chain1 = sample(Xoshiro(468), model, NUTS(), MCMCThreads(), 100, 4; nuts_kwargs...)
-chain2 = sample(Xoshiro(468), model, NUTS(), MCMCThreads(), 100, 4; nuts_kwargs...)
+nuts_kwargs = (check_model=false, progress=false, verbose=false)
+chain1 = sample(Xoshiro(468), model, NUTS(), MCMCThreads(), 1000, 4; nuts_kwargs...)
+chain2 = sample(Xoshiro(468), model, NUTS(), MCMCThreads(), 1000, 4; nuts_kwargs...)
 mean(chain1[:x]), mean(chain2[:x]) # oops!
 ```
 
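To see why drawing values inside a threaded loop loses reproducibility, here is a minimal non-Turing sketch (purely illustrative; `threaded_draws` is an invented name and none of this is Turing's API):

```julia
using Random

# When several threads draw from a shared RNG, the thread interleaving decides
# which draw lands at which index, so identical seeds need not give identical
# results. Illustrative only; not Turing internals.
function threaded_draws(rng, N)
    xs = zeros(N)
    lk = ReentrantLock()
    Threads.@threads for i in 1:N
        lock(lk) do
            xs[i] = randn(rng)  # which draw reaches index i depends on scheduling
        end
    end
    return xs
end

# With more than one thread, these two runs can differ despite equal seeds:
threaded_draws(Xoshiro(468), 1000) == threaded_draws(Xoshiro(468), 1000)
```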

@@ -228,10 +235,36 @@ In particular:
 This part will likely only be of interest to DynamicPPL developers and the very curious user.
 :::
 
-TODO: Something about metadata, accumulators, and TSVI.
+### Why is VarInfo not threadsafe?
+
+As alluded to above, the issue with threaded tilde-statements stems from the fact that these tilde-statements modify the VarInfo object used for model evaluation, leading to potential data races.
+
+Traditionally, VarInfo objects contain both *metadata* and *accumulators*.
+Metadata is where information about the random variables' values is stored.
+It is a Dict-like structure, and pushing to it from multiple threads is therefore not threadsafe (Julia's `Dict` has similar limitations).
+
+On the other hand, accumulators are used to store outputs of the model, such as log-probabilities.
+The way DynamicPPL's threadsafe evaluation works is to create one set of accumulators per thread, and then combine the results at the end of model evaluation.
+
+In this way, any function call that involves _solely_ accumulators can be made threadsafe.
+For example, this is why observations are supported: there is no need to modify metadata, and only the log-likelihood accumulator needs to be updated.
+
+However, `assume` tilde-statements always modify the metadata, and thus cannot currently be made threadsafe.
+
+### OnlyAccsVarInfo
+
+As it happens, much of what is needed in DynamicPPL can be constructed such that it relies *only* on accumulators.
+
+For example, as long as there is no need to *sample* new values of random variables, it is actually fine to completely omit the metadata object.
+This is the case for `LogDensityFunction`: since values are provided in the input vector, there is no need to store them in metadata.
+We need only calculate the associated log-prior probability, which is stored in an accumulator.
+Thus, starting from DynamicPPL v0.39, `LogDensityFunction` itself is in fact completely threadsafe.
 
-TODO: Say how OnlyAccsVarInfo and FastLDF changes this.
+Technically speaking, this is achieved using `OnlyAccsVarInfo`, a subtype of `VarInfo` that contains only accumulators and no metadata at all.
+It implements enough of the `VarInfo` interface to be used in model evaluation, but will error if any function attempts to modify or read its metadata.
 
-Essentially, `predict(model, chn)` SHOULD work after #1130 because that uses OAVI, which doesn't have Metadata. It uses VAIMAcc to accumulate the values, but that is threadsafe as long as TSVI is used.
+There is currently an ongoing push to use `OnlyAccsVarInfo` in as many settings as we possibly can.
+For example, this is why `predict` will be threadsafe in DynamicPPL v0.39: instead of modifying metadata to store the predicted values, we store them inside a `ValuesAsInModelAccumulator` and combine them at the end of evaluation.
 
-FastLDF, _once constructed_, also works with threaded assume. The only problem is that to get the ranges and linked status it has to first generate a VarInfo, which cannot be done. But if there's a way to either manually provide the ranges OR use an accumulator instead to get the ranges/linked status, then it would straight up enable threaded assume with NUTS / any sampler that only uses FastLDF.
+However, propagating these changes up to Turing will require a substantial amount of additional work, since there are many places in Turing which currently rely on a full VarInfo (with metadata).
+See, e.g., [this PR](https://github.com/TuringLang/DynamicPPL.jl/pull/1154) for more information.
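The "one accumulator per task, combine at the end" strategy described in this deep dive can be illustrated with a generic sketch. This shows the general pattern only, not DynamicPPL's actual implementation; `chunked_loglikelihood` and everything inside it are invented for illustration:

```julia
using Distributions

# One accumulator per task; combine at the end. Mirrors the idea of per-thread
# log-probability accumulators, in deliberately simplified form.
function chunked_loglikelihood(dist, ys; nchunks = Threads.nthreads())
    chunks = Iterators.partition(eachindex(ys), cld(length(ys), nchunks))
    tasks = map(collect(chunks)) do idxs
        Threads.@spawn begin
            acc = 0.0                      # this task's private accumulator
            for i in idxs
                acc += logpdf(dist, ys[i])
            end
            acc
        end
    end
    return sum(fetch, tasks)               # combine per-task results in a fixed order
end
```

Because each task touches only its own accumulator, no locking is needed, and the combine step is an ordinary reduction over the tasks in a fixed order, which also keeps the result deterministic.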
