Merged
130 changes: 60 additions & 70 deletions Manifest.toml

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion Project.toml
@@ -51,4 +51,4 @@ StatsPlots = "f3b207a7-027a-5e70-b257-86293d7955fd"
Turing = "fce5fe82-541a-59a6-adf8-730c64b5f9a0"

[compat]
-Turing = "0.41"
+Turing = "0.42"
12 changes: 4 additions & 8 deletions README.md
@@ -95,19 +95,15 @@ If you wish to speed up local rendering, there are two options available:

## Troubleshooting build issues

-As described in the [Quarto docs](https://quarto.org/docs/computations/julia.html#using-the-julia-engine), Quarto's Julia engine uses a worker process behind the scenes.
+Quarto's Julia engine uses a separate worker process behind the scenes.
Sometimes this can result in issues with old package code not being unloaded (e.g. when package versions are upgraded).
-If you find that Quarto's execution is failing with errors that aren't reproducible via a normal REPL, try adding the `--execute-daemon-restart` flag to the `quarto render` command:
+If you find that Quarto's execution is failing with errors that aren't reproducible via a normal REPL, try running:

```bash
-quarto render /path/to/index.qmd --execute-daemon-restart
+quarto call engine julia kill
```

-And also, kill any stray Quarto processes that are still running (sometimes it keeps running in the background):
-
-```bash
-pkill -9 -f quarto
-```
+before rerunning the build (see [the Quarto docs](https://quarto.org/docs/computations/julia.html#quarto-call-engine-julia-commands) for more information).

## Licence

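Condensed, the troubleshooting advice introduced by this diff amounts to a two-step workflow; a sketch, where the `.qmd` path is a placeholder for your own file:

```bash
# Restart Quarto's persistent Julia worker so stale package code is unloaded
quarto call engine julia kill

# Then rerun the build with a fresh worker
quarto render /path/to/index.qmd
```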
2 changes: 1 addition & 1 deletion _quarto.yml
@@ -43,7 +43,7 @@ website:
href: https://turinglang.org/team/
right:
# Current version
-  - text: "v0.41"
+  - text: "v0.42"
menu:
- text: Changelog
href: https://turinglang.org/docs/changelog.html
7 changes: 5 additions & 2 deletions developers/compiler/model-manual/index.qmd
@@ -54,7 +54,10 @@ function gdemo2(model, varinfo, x)
# value and the updated varinfo.
return nothing, varinfo
end
-gdemo2(x) = DynamicPPL.Model(gdemo2, (; x))
+
+# The `false` type parameter here indicates that this model does not need
+# threadsafe evaluation (see the threadsafe evaluation page for details)
+gdemo2(x) = DynamicPPL.Model{false}(gdemo2, (; x))

# Instantiate a Model object with our data variables.
model2 = gdemo2([1.5, 2.0])
@@ -66,4 +69,4 @@ We can sample from this model in the same way:
chain = sample(model2, NUTS(), 1000; progress=false)
```

-The subsequent pages in this section will show how the `@model` macro does this behind-the-scenes.
\ No newline at end of file
+The subsequent pages in this section will show how the `@model` macro does this behind-the-scenes.
13 changes: 3 additions & 10 deletions developers/contexts/submodel-condition/index.qmd
@@ -265,18 +265,11 @@ We should like `myprefix(big_ctx, @varname(x))` to return `@varname(a.b.x)`.
Consider the following naive implementation, which mirrors a lot of code in the tilde-pipeline:

```{julia}
-using DynamicPPL: NodeTrait, IsLeaf, IsParent, childcontext, AbstractContext
+using DynamicPPL: childcontext, AbstractContext, AbstractParentContext
using AbstractPPL: AbstractPPL

-function myprefix(ctx::DynamicPPL.AbstractContext, vn::VarName)
-    return myprefix(NodeTrait(ctx), ctx, vn)
-end
-function myprefix(::IsLeaf, ::AbstractContext, vn::VarName)
-    return vn
-end
-function myprefix(::IsParent, ctx::AbstractContext, vn::VarName)
-    return myprefix(childcontext(ctx), vn)
-end
+myprefix(::AbstractContext, vn::VarName) = vn
+myprefix(ctx::AbstractParentContext, vn::VarName) = myprefix(childcontext(ctx), vn)
function myprefix(ctx::DynamicPPL.PrefixContext, vn::VarName)
# The functionality to actually manipulate the VarNames is in AbstractPPL
new_vn = AbstractPPL.prefix(vn, ctx.vn_prefix)
2 changes: 1 addition & 1 deletion developers/inference/abstractmcmc-interface/index.qmd
@@ -320,4 +320,4 @@ It looks like we're extremely close to our true parameters of `Normal(5,3)`, tho

## Conclusion

-We've seen how to implement the sampling interface for general projects. Turing's interface methods are ever-evolving, so please open an issue at [AbstractMCMC](https://github.com/TuringLang/AbstractMCMC.jl) with feature requests or problems.
\ No newline at end of file
+We've seen how to implement the sampling interface for general projects. Turing's interface methods are ever-evolving, so please open an issue at [AbstractMCMC](https://github.com/TuringLang/AbstractMCMC.jl) with feature requests or problems.
122 changes: 82 additions & 40 deletions developers/inference/implementing-samplers/index.qmd

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion developers/models/varinfo-overview/index.qmd
@@ -233,7 +233,7 @@ If you use it, you must pay close attention to correctness:**

1. For models with multiple variables, the order in which these variables occur in the vector is not obvious. The short answer is that it depends on the order in which the variables are added to the VarInfo during its initialisation. If you have models where the order of variables can vary from one execution to another, then `unflatten` can easily lead to incorrect results.

-2. The meaning of the values passed in will generally depend on whether the VarInfo is linked or not (see the [Variable Transformations page]({{< meta developers/transforms/dynamicppl >}}) for more information about linked VarInfos). You must make sure that the values passed in are consistent with the link status of the VarInfo. In contrast, `InitFromParams` always uses unlinked values.
+2. The meaning of the values passed in will generally depend on whether the VarInfo is linked or not (see the [Variable Transformations page]({{< meta dev-transforms-dynamicppl >}}) for more information about linked VarInfos). You must make sure that the values passed in are consistent with the link status of the VarInfo. In contrast, `InitFromParams` always uses unlinked values.

3. While `unflatten` modifies the parameter values stored in the VarInfo, it does not modify any other information, such as log probabilities. Thus, after calling `unflatten`, your VarInfo will be in an inconsistent state, and you should not attempt to read any other information from it until you have called `evaluate!!` again (which recomputes e.g. log probabilities).

46 changes: 10 additions & 36 deletions developers/transforms/dynamicppl/index.qmd
@@ -126,50 +126,24 @@ This sequence of events is summed up in the following diagram, where `f(..., arg
In the final part of this article, we will take a more in-depth look at the internal DynamicPPL machinery that allows us to convert between representations and obtain the correct probability densities.
Before that, though, we will take a quick high-level look at how the HMC sampler in Turing.jl uses the functions introduced so far.

-## Case study: HMC in Turing.jl
+## HMC in Turing.jl

While DynamicPPL provides the _functionality_ for transforming variables, the transformation itself happens at an even higher level, i.e. in the sampler itself.
The HMC sampler in Turing.jl is in [this file](https://github.com/TuringLang/Turing.jl/blob/5b24cebe773922e0f3d5c4cb7f53162eb758b04d/src/mcmc/hmc.jl).
-In the first step of sampling, it calls `link` on the sampler.
-This transformation is preserved throughout the sampling process, meaning that `is_transformed()` always returns true.
-
-We can observe this by inserting print statements into the model.
-Here, `__varinfo__` is the internal symbol for the `VarInfo` object used in model evaluation:
-
-```{julia}
-setprogress!(false)
-
-@model function demo2()
-    x ~ LogNormal()
-    if x isa AbstractFloat
-        println("-----------")
-        println("model repn: $(DynamicPPL.getindex(__varinfo__, @varname(x)))")
-        println("internal repn: $(DynamicPPL.getindex_internal(__varinfo__, @varname(x)))")
-        println("is_transformed: $(is_transformed(__varinfo__, @varname(x)))")
-    end
-end
-
-sample(demo2(), HMC(0.1, 3), 3);
-```
-
-(Here, the check on `if x isa AbstractFloat` prevents the printing from occurring during computation of the derivative.)
-You can see that during the three sampling steps, `is_transformed` is always kept as `true`.
-
-::: {.callout-note}
-The first two model evaluations where `is_transformed` is `false` occur prior to the actual sampling.
-One occurs when the model is checked for correctness (using [`DynamicPPL.check_model_and_trace`](https://github.com/TuringLang/DynamicPPL.jl/blob/ba490bf362653e1aaefe298364fe3379b60660d3/src/debug_utils.jl#L582-L612)).
-The second occurs because the model is evaluated once to generate a set of initial parameters inside [DynamicPPL's implementation of `AbstractMCMC.step`](https://github.com/TuringLang/DynamicPPL.jl/blob/ba490bf362653e1aaefe298364fe3379b60660d3/src/sampler.jl#L98-L117).
-Both of these steps occur with all samplers in Turing.jl, so are not specific to the HMC example shown here.
-:::

+In the first step of sampling, it calls `link` on the sampler.
What this means is that from the perspective of the HMC sampler, it _never_ sees the constrained variable: it always thinks that it is sampling from an unconstrained distribution.

The biggest prerequisite for this to work correctly is that the potential energy term in the Hamiltonian—or in other words, the model log density—must be programmed correctly to include the Jacobian term.
This is exactly the same as how we had to make sure to define `logq(y)` correctly in the toy HMC example above.

-Within Turing.jl, this is correctly handled because a statement like `x ~ LogNormal()` in the model definition above is translated into `assume(LogNormal(), @varname(x), __varinfo__)`, defined [here](https://github.com/TuringLang/DynamicPPL.jl/blob/ba490bf362653e1aaefe298364fe3379b60660d3/src/context_implementations.jl#L225-L229).
-If you follow the trail of function calls, you can verify that the `assume` function does indeed check for the presence of the `is_transformed` flag and adds the Jacobian term accordingly.
+Within Turing.jl, this is correctly handled by calling
+
+```julia
+x, inv_logjac = with_logabsdet_jacobian(y, inverse_transform)
+```
+
+and then passing `inv_logjac` to DynamicPPL's `LogJacobianAccumulator`.

## A deeper dive into DynamicPPL's internal machinery

@@ -415,4 +389,4 @@ In this chapter of the Turing docs, we've looked at:
- the higher-level usage of transforms in DynamicPPL and Turing.

This will hopefully have equipped you with a better understanding of how constrained variables are handled in the Turing framework.
-With this knowledge, you should especially find it easier to navigate DynamicPPL's `VarInfo` type, which forms the backbone of model evaluation.
\ No newline at end of file
+With this knowledge, you should especially find it easier to navigate DynamicPPL's `VarInfo` type, which forms the backbone of model evaluation.
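For readers unfamiliar with the function referenced in the rewritten section above: `with_logabsdet_jacobian` comes from ChangesOfVariables.jl (re-exported by Bijectors.jl) and takes the transform first and the value second. A small standalone sketch under those assumptions (variable names are illustrative; import paths may vary across Bijectors versions):

```julia
using Bijectors: bijector, inverse
using ChangesOfVariables: with_logabsdet_jacobian
using Distributions: LogNormal, logpdf

d = LogNormal()
link_fn = bijector(d)      # maps the support (0, ∞) to ℝ
invlink_fn = inverse(link_fn)

y = 0.3                    # a point in unconstrained space
# Transform back to constrained space, collecting the log-Jacobian term
x, inv_logjac = with_logabsdet_jacobian(invlink_fn, y)

# The model log-density in unconstrained space includes the correction:
logq_y = logpdf(d, x) + inv_logjac
```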
18 changes: 12 additions & 6 deletions faq/index.qmd
@@ -47,7 +47,7 @@ To understand more about how Turing determines whether a variable is treated as

## Can I use parallelism / threads in my model?

-Yes, but with important caveats! There are two types of parallelism to consider:
+Yes, but with some important caveats:

### 1. Parallel Sampling (Multiple Chains)
Turing.jl fully supports sampling multiple chains in parallel:
@@ -58,16 +58,24 @@ Turing.jl fully supports sampling multiple chains in parallel:
See the [Core Functionality guide]({{<meta core-functionality>}}#sampling-multiple-chains) for examples.

### 2. Threading Within Models
-Using threads inside your model (e.g., `Threads.@threads`) requires more care:
+
+Using threads inside your model (e.g., `Threads.@threads`) requires more care.
+In particular, only threaded **observe** statements are safe to use; threaded **assume** statements can lead to crashes or incorrect results.
+Please see the [Threadsafe Evaluation page]({{< meta usage-threadsafe-evaluation >}}) for complete details.

```julia
@model function f(y)
x = Vector{Float64}(undef, length(y))
Threads.@threads for i in eachindex(y)
-        x[i] ~ Normal() # UNSAFE: `assume` statements in @threads can crash!
-        y[i] ~ Normal(x[i]) # `observe` statements are okay
+        # This would be unsafe!
+        # x[i] ~ Normal()
+        # This is safe:
+        y[i] ~ Normal()
end
end
+# If you have parallel tilde-statements or `@addlogprob!` in a model,
+# you must mark the model as threadsafe:
+model = setthreadsafe(f(y), true)
```

**Important limitations:**
@@ -76,8 +84,6 @@ end
- **Assume statements** (sampling statements): Often crash unpredictably or produce incorrect results
- **AD backend compatibility**: Many AD backends don't support threading. Check the [multithreaded column in ADTests](https://turinglang.org/ADTests/) for compatibility

-For safe parallelism within models, consider vectorised operations instead of explicit threading.

## How do I check the type stability of my Turing model?

Type stability is crucial for performance. Check out:
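As a concrete companion to the parallel-sampling answer above, multi-chain sampling looks like this in practice (a hedged sketch; the model, chain count, and sample count are illustrative):

```julia
using Turing

@model function coinflip(y)
    p ~ Beta(1, 1)
    for i in eachindex(y)
        y[i] ~ Bernoulli(p)
    end
end

data = rand(Bool, 100)
# Four chains on four threads; requires launching Julia with `--threads 4`.
# MCMCDistributed() and MCMCSerial() work analogously.
chain = sample(coinflip(data), NUTS(), MCMCThreads(), 1000, 4; progress=false)
```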
4 changes: 2 additions & 2 deletions tutorials/bayesian-neural-networks/index.qmd
@@ -227,7 +227,7 @@ nn_forward(x, θ) = nn(x, vector_to_parameters(θ, ps))
fig = plot_data()
# Find the index that provided the highest log posterior in the chain.
-_, i = findmax(ch[:lp])
+_, i = findmax(ch[:logjoint])
# Extract the max row value from i.
i = i.I[1]
@@ -286,4 +286,4 @@ anim = @gif for i in 1:n_end
end every 5
```

-This has been an introduction to the applications of Turing and Lux in defining Bayesian neural networks.
\ No newline at end of file
+This has been an introduction to the applications of Turing and Lux in defining Bayesian neural networks.
24 changes: 11 additions & 13 deletions tutorials/bayesian-time-series-analysis/index.qmd
@@ -175,11 +175,6 @@ function mean_ribbon(samples)
return m, (m - low, up - m)
end

-function get_decomposition(model, x, cyclic_features, chain, op)
-    chain_params = get_sections(chain, :parameters)
-    return returned(model(x, cyclic_features, op), chain_params)
-end
-
function plot_fit(x, y, decomp, ymax)
trend = mapreduce(x -> x.trend, hcat, decomp)
cyclic = mapreduce(x -> x.cyclic, hcat, decomp)
@@ -219,8 +214,9 @@ function plot_fit(x, y, decomp, ymax)
return trend_plt, cyclic_plt
end

-chain = sample(decomp_model(x, cyclic_features, +) | (; y=yf), NUTS(), 2000, progress=false)
-yf_samples = predict(decomp_model(x, cyclic_features, +), chain)
+model = decomp_model(x, cyclic_features, +) | (; y = yf)
+chain = sample(model, NUTS(), 2000, progress=false)
+yf_samples = predict(decondition(model), chain)
m, conf = mean_ribbon(yf_samples)
predictive_plt = plot(
t,
Expand All @@ -233,7 +229,7 @@ predictive_plt = plot(
)
scatter!(predictive_plt, t, yf .+ yf_max; color=2, label="Data", legend=:topleft)

-decomp = get_decomposition(decomp_model, x, cyclic_features, chain, +)
+decomp = returned(model, chain)
decomposed_plt = plot_fit(t, yf, decomp, yf_max)
plot(predictive_plt, decomposed_plt...; layout=(3, 1), size=(700, 1000))
```
@@ -308,8 +304,9 @@ scatter!(t, yf; color=2, label="Data")
```

```{julia}
-chain = sample(decomp_model(x, cyclic_features, .*) | (; y=yg), NUTS(), 2000, progress=false)
-yg_samples = predict(decomp_model(x, cyclic_features, .*), chain)
+model = decomp_model(x, cyclic_features, .*) | (; y = yg)
+chain = sample(model, NUTS(), 2000, progress=false)
+yg_samples = predict(decondition(model), chain)
m, conf = mean_ribbon(yg_samples)
predictive_plt = plot(
t,
@@ -322,7 +319,7 @@ predictive_plt = plot(
)
scatter!(predictive_plt, t, yg; color=2, label="Data", legend=:topleft)

-decomp = get_decomposition(decomp_model, x, cyclic_features, chain, .*)
+decomp = returned(model, chain)
decomposed_plt = plot_fit(t, yg, decomp, 0)
plot(predictive_plt, decomposed_plt...; layout=(3, 1), size=(700, 1000))
```
@@ -337,14 +334,15 @@ let
end
```

-The model fits! What about the infered cyclic components?
+The model fits!
+What about the inferred cyclic components?

```{julia}
βc = Array(group(chain, :βc))
plot_cyclic_features(βc[:, begin:num_freqs, :], βc[:, (num_freqs + 1):end, :])
```

-While multiplicative model fits to the data, it does not recover the true parameters for this dataset.
+While a multiplicative model does manage to fit the data, it does not recover the true parameters for this dataset.

# Wrapping up

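The `condition`/`decondition` pattern adopted in the time-series changes above generalises to any model; a minimal sketch (the model and observed value are illustrative):

```julia
using Turing

@model function gauss()
    μ ~ Normal(0, 5)
    y ~ Normal(μ, 1)
end

# Condition on an observed value of y, sample, then strip the
# conditioning again so `predict` treats y as a random variable:
model = gauss() | (; y = 2.5)
chain = sample(model, NUTS(), 500; progress=false)
y_pred = predict(decondition(model), chain)
```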