VarNamedTuple, with an application for FastLDF #1150

mhauru · 2025-11-20T12:59:57Z

I decided that rather than take over VarInfo like in #1074, the first use case of VarNamedTuple should be replacing the NamedTuple/Dict combo in FastLDF. That's what this PR does.

This is still work in progress:

Documentation is lacking/out of date
There's dead code, and unnecessarily complex code
Performance on Julia v1.11 needs fixing
There's type piracy
This doesn't handle Colons in VarNames.

However, tests seem to pass, so I'm putting this up. I ran the familiar FastLDF benchmarks from #1132, adapted a bit. Source code:

module VNTBench

using DynamicPPL, Distributions, LogDensityProblems, Chairmarks, LinearAlgebra
using ADTypes, ForwardDiff, ReverseDiff
@static if VERSION < v"1.12"
    using Enzyme, Mooncake
end

const adtypes = @static if VERSION < v"1.12"
    [
        ("FD", AutoForwardDiff()),
        ("RD", AutoReverseDiff()),
        ("MC", AutoMooncake()),
        ("EN" => AutoEnzyme(; mode=set_runtime_activity(Reverse), function_annotation=Const))
    ]
else
    [
        ("FD", AutoForwardDiff()),
        ("RD", AutoReverseDiff()),
    ]
end

function benchmark_ldfs(model; skip=Union{})
    vi = VarInfo(model)
    x = vi[:]
    ldf_no = DynamicPPL.LogDensityFunction(model, getlogjoint, vi)
    fldf_no = DynamicPPL.Experimental.FastLDF(model, getlogjoint, vi)
    @assert LogDensityProblems.logdensity(ldf_no, x) ≈ LogDensityProblems.logdensity(fldf_no, x)
    median_new = median(@be LogDensityProblems.logdensity(fldf_no, x))
    print("           FastLDF: eval      ----  ")
    display(median_new)
    for name_adtype in adtypes
        name, adtype = name_adtype
        adtype isa skip && continue
        ldf = DynamicPPL.LogDensityFunction(model, getlogjoint, vi; adtype=adtype)
        fldf = DynamicPPL.Experimental.FastLDF(model, getlogjoint, vi; adtype=adtype)
        ldf_grad = LogDensityProblems.logdensity_and_gradient(ldf, x)
        fldf_grad = LogDensityProblems.logdensity_and_gradient(fldf, x)
        @assert ldf_grad[2] ≈ fldf_grad[2]
        median_new = median(@be LogDensityProblems.logdensity_and_gradient(fldf, x))
        print("           FastLDF: grad ($name) ----  ")
        display(median_new)
    end
end

println("Trivial model")
@model f() = x ~ Normal()
benchmark_ldfs(f())

println("Eight schools")
y = [28, 8, -3, 7, -1, 1, 18, 12]
sigma = [15, 10, 16, 11, 9, 11, 10, 18]
@model function eight_schools(y, sigma)
    mu ~ Normal(0, 5)
    tau ~ truncated(Cauchy(0, 5); lower=0)
    theta ~ MvNormal(fill(mu, length(y)), tau^2 * I)
    for i in eachindex(y)
        y[i] ~ Normal(theta[i], sigma[i])
    end
    return (mu=mu, tau=tau)
end
benchmark_ldfs(eight_schools(y, sigma))

println("IndexLenses, dim=1_000")
@model function badvarnames()
    N = 1_000
    x = Vector{Float64}(undef, N)
    for i in 1:N
        x[i] ~ Normal()
    end
end
benchmark_ldfs(badvarnames())

println("Submodel")
@model function inner()
    m ~ Normal(0, 1)
    s ~ Exponential()
    return (m=m, s=s)
end
@model function withsubmodel()
    params ~ to_submodel(inner())
    y ~ Normal(params.m, params.s)
    1.0 ~ Normal(y)
end
benchmark_ldfs(withsubmodel())

end

Results on Julia v1.12:

On base(breaking):
Trivial model
           FastLDF: eval      ----  18.047 ns
           FastLDF: grad (FD) ----  51.805 ns (3 allocs: 96 bytes)
           FastLDF: grad (RD) ----  3.157 μs (45 allocs: 1.531 KiB)
Eight schools
           FastLDF: eval      ----  165.723 ns (4 allocs: 256 bytes)
           FastLDF: grad (FD) ----  685.846 ns (11 allocs: 2.594 KiB)
           FastLDF: grad (RD) ----  39.959 μs (562 allocs: 20.562 KiB)
IndexLenses, dim=1_000
           FastLDF: eval      ----  24.250 μs (14 allocs: 8.312 KiB)
           FastLDF: grad (FD) ----  6.296 ms (1516 allocs: 11.197 MiB)
           FastLDF: grad (RD) ----  2.577 ms (38029 allocs: 1.321 MiB)
Submodel
           FastLDF: eval      ----  57.568 ns
           FastLDF: grad (FD) ----  179.448 ns (3 allocs: 112 bytes)
           FastLDF: grad (RD) ----  10.750 μs (145 allocs: 5.062 KiB)

On this branch:
Trivial model
           FastLDF: eval      ----  11.869 ns
           FastLDF: grad (FD) ----  53.264 ns (3 allocs: 96 bytes)
           FastLDF: grad (RD) ----  3.273 μs (45 allocs: 1.531 KiB)
Eight schools
           FastLDF: eval      ----  203.159 ns (4 allocs: 256 bytes)
           FastLDF: grad (FD) ----  718.750 ns (11 allocs: 2.594 KiB)
           FastLDF: grad (RD) ----  39.792 μs (562 allocs: 20.562 KiB)
IndexLenses, dim=1_000
           FastLDF: eval      ----  9.181 μs (2 allocs: 8.031 KiB)
           FastLDF: grad (FD) ----  4.235 ms (508 allocs: 11.174 MiB)
           FastLDF: grad (RD) ----  2.560 ms (38017 allocs: 1.321 MiB)
Submodel
           FastLDF: eval      ----  49.660 ns
           FastLDF: grad (FD) ----  221.359 ns (3 allocs: 112 bytes)
           FastLDF: grad (RD) ----  10.667 μs (148 allocs: 5.219 KiB)

Same thing but in Julia v1.11:

On base(breaking):
Trivial model
           FastLDF: eval      ----  11.082 ns
           FastLDF: grad (FD) ----  53.747 ns (3 allocs: 96 bytes)
           FastLDF: grad (RD) ----  3.069 μs (46 allocs: 1.562 KiB)
           FastLDF: grad (MC) ----  221.910 ns (2 allocs: 64 bytes)
           FastLDF: grad (EN) ----  128.970 ns (2 allocs: 64 bytes)
Eight schools
           FastLDF: eval      ----  164.326 ns (4 allocs: 256 bytes)
           FastLDF: grad (FD) ----  690.049 ns (11 allocs: 2.594 KiB)
           FastLDF: grad (RD) ----  39.250 μs (562 allocs: 20.562 KiB)
           FastLDF: grad (MC) ----  1.082 μs (10 allocs: 656 bytes)
           FastLDF: grad (EN) ----  733.325 ns (13 allocs: 832 bytes)
IndexLenses, dim=1_000
           FastLDF: eval      ----  33.458 μs (15 allocs: 8.344 KiB)
           FastLDF: grad (FD) ----  6.652 ms (1516 allocs: 11.197 MiB)
           FastLDF: grad (RD) ----  2.488 ms (38028 allocs: 1.321 MiB)
           FastLDF: grad (MC) ----  89.583 μs (21 allocs: 24.469 KiB)
           FastLDF: grad (EN) ----  92.833 μs (20 allocs: 102.531 KiB)
Submodel
           FastLDF: eval      ----  70.884 ns
           FastLDF: grad (FD) ----  135.958 ns (3 allocs: 112 bytes)
           FastLDF: grad (RD) ----  10.563 μs (148 allocs: 5.188 KiB)
           FastLDF: grad (MC) ----  481.250 ns (2 allocs: 80 bytes)
           FastLDF: grad (EN) ----  344.612 ns (2 allocs: 80 bytes)

On this branch:
Trivial model
           FastLDF: eval      ----  1.309 μs (27 allocs: 800 bytes)
           FastLDF: grad (FD) ----  1.522 μs (30 allocs: 960 bytes)
           FastLDF: grad (RD) ----  4.667 μs (71 allocs: 2.344 KiB)
           FastLDF: grad (MC) ----  358.143 ns (7 allocs: 224 bytes)
           FastLDF: grad (EN) ----  130.768 ns (2 allocs: 64 bytes)
Eight schools
           FastLDF: eval      ----  164.326 ns (4 allocs: 256 bytes)
           FastLDF: grad (FD) ----  645.378 ns (11 allocs: 2.594 KiB)
           FastLDF: grad (RD) ----  39.541 μs (562 allocs: 20.562 KiB)
           FastLDF: grad (MC) ----  1.043 μs (10 allocs: 656 bytes)
           FastLDF: grad (EN) ----  747.925 ns (13 allocs: 832 bytes)
IndexLenses, dim=1_000
           FastLDF: eval      ----  9.430 μs (3 allocs: 8.062 KiB)
           FastLDF: grad (FD) ----  4.616 ms (508 allocs: 11.174 MiB)
           FastLDF: grad (RD) ----  2.467 ms (38016 allocs: 1.321 MiB)
           FastLDF: grad (MC) ----  73.292 μs (9 allocs: 24.188 KiB)
           FastLDF: grad (EN) ----  72.875 μs (8 allocs: 102.250 KiB)
Submodel
           FastLDF: eval      ----  52.213 ns
           FastLDF: grad (FD) ----  107.166 ns (3 allocs: 112 bytes)
           FastLDF: grad (RD) ----  10.521 μs (142 allocs: 5.078 KiB)
           FastLDF: grad (MC) ----  453.493 ns (2 allocs: 80 bytes)
           FastLDF: grad (EN) ----  320.367 ns (2 allocs: 80 bytes)

So on 1.12 all looks good: This is a bit faster than the old version, substantial faster when there are a lot of IndexLenses, as it should. On 1.11 performance is destroyed, probably because type inference fails/gives up, and I need to fix that.

The main point of this PR is not performance, but having a general data structure for storing information keyed by VarNames, so I'm happy as long as performance doesn't degrade. Next up would be using this same data structure for ConditionContext (hoping to fix #1148), ValuesAsInModelAcc, maybe some other Accumulators, InitFromParams, GibbsContext, and finally to implement an AbstractVarInfo type.

I'll update the docs page with more information about what the current design is that I've implemented, but the one sentence summary is that it's nested NamedTuples, and then whenever we meet IndexLenses, it's an Array for the values together with a mask-Array that marks which values are valid values and which are just placeholders.

I think I know how to fix all the current short-comings, except for Colons in VarNames. Setting a value in a VNT with a Colon could be done, but getting seems ill-defined, at least without providing further information about the size the value should be.

vnt = VarNamedTuple(
vnt = setindex!!(vnt, 1, @varname(x[2]))
vnt = setindex!!(vnt, 1, @varname(x[4]))
getindex(@varname(x[:])  # What should this return?

cc @penelopeysm, though this isn't ready for reviews yet.

* Remove NodeTrait * Changelog * Fix exports * docs * fix a bug * Fix doctests * Fix test * tweak changelog

* Fast Log Density Function * Make it work with AD * Optimise performance for identity VarNames * Mark `get_range_and_linked` as having zero derivative * Update comment * make AD testing / benchmarking use FastLDF * Fix tests * Optimise away `make_evaluate_args_and_kwargs` * const func annotation * Disable benchmarks on non-typed-Metadata-VarInfo * Fix `_evaluate!!` correctly to handle submodels * Actually fix submodel evaluate * Document thoroughly and organise code * Support more VarInfos, make it thread-safe (?) * fix bug in parsing ranges from metadata/VNV * Fix get_param_eltype for TSVI * Disable Enzyme benchmark * Don't override _evaluate!!, that breaks ForwardDiff (sometimes) * Move FastLDF to experimental for now * Fix imports, add tests, etc * More test fixes * Fix imports / tests * Remove AbstractFastEvalContext * Changelog and patch bump * Add correctness tests, fix imports * Concretise parameter vector in tests * Add zero-allocation tests * Add Chairmarks as test dep * Disable allocations tests on multi-threaded * Fast InitContext (#1125) * Make InitContext work with OnlyAccsVarInfo * Do not convert NamedTuple to Dict * remove logging * Enable InitFromPrior and InitFromUniform too * Fix `infer_nested_eltype` invocation * Refactor FastLDF to use InitContext * note init breaking change * fix logjac sign * workaround Mooncake segfault * fix changelog too * Fix get_param_eltype for context stacks * Add a test for threaded observe * Export init * Remove dead code * fix transforms for pathological distributions * Tidy up loads of things * fix typed_identity spelling * fix definition order * Improve docstrings * Remove stray comment * export get_param_eltype (unfortunatley) * Add more comment * Update comment * Remove inlines, fix OAVI docstring * Improve docstrings * Simplify InitFromParams constructor * Replace map(identity, x[:]) with [i for i in x[:]] * Simplify implementation for InitContext/OAVI * Add another model to allocation tests Co-authored-by: Markus Hauru <markus@mhauru.org> * Revert removal of dist argument (oops) * Format * Update some outdated bits of FastLDF docstring * remove underscores --------- Co-authored-by: Markus Hauru <markus@mhauru.org>

* print output * fix * reenable * add more lines to guide the eye * reorder table * print tgrad / trel as well * forgot this type

github-actions · 2025-11-20T13:06:04Z

Benchmark Report

this PR's head: c818bf887ab474d15944394e62237f8be29b0f32
base branch: 1bb97aebfa262438f042132a65470dd4397e837a

Computer Information

Julia Version 1.11.7
Commit f2b3dbda30a (2025-09-08 12:10 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 4 × AMD EPYC 7763 64-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Benchmark Results

┌───────────────────────┬───────┬─────────────┬───────────────────┬────────┬────────────────────────────────┬────────────────────────────┬─────────────────────────────────┐
│                       │       │             │                   │        │        t(eval) / t(ref)        │     t(grad) / t(eval)      │        t(grad) / t(ref)         │
│                       │       │             │                   │        │ ──────────┬──────────┬──────── │ ───────┬─────────┬──────── │ ──────────┬───────────┬──────── │
│                 Model │   Dim │  AD Backend │           VarInfo │ Linked │      base │  this PR │ speedup │   base │ this PR │ speedup │      base │   this PR │ speedup │
├───────────────────────┼───────┼─────────────┼───────────────────┼────────┼───────────┼──────────┼─────────┼────────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│               Dynamic │    10 │    mooncake │             typed │   true │    383.87 │   367.84 │    1.04 │   8.96 │   12.97 │    0.69 │   3440.54 │   4770.03 │    0.72 │
│                   LDA │    12 │ reversediff │             typed │   true │   2700.89 │  2820.33 │    0.96 │   8.54 │    4.87 │    1.75 │  23076.91 │  13739.12 │    1.68 │
│   Loop univariate 10k │ 10000 │    mooncake │             typed │   true │ 104336.18 │ 57389.65 │    1.82 │   4.13 │    6.43 │    0.64 │ 430426.68 │ 368965.17 │    1.17 │
├───────────────────────┼───────┼─────────────┼───────────────────┼────────┼───────────┼──────────┼─────────┼────────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│    Loop univariate 1k │  1000 │    mooncake │             typed │   true │   7910.87 │  5777.88 │    1.37 │   4.98 │    5.81 │    0.86 │  39431.65 │  33545.80 │    1.18 │
│      Multivariate 10k │ 10000 │    mooncake │             typed │   true │  32326.37 │ 35487.48 │    0.91 │  10.33 │   13.18 │    0.78 │ 333953.86 │ 467886.15 │    0.71 │
│       Multivariate 1k │  1000 │    mooncake │             typed │   true │   3689.41 │  3722.80 │    0.99 │   9.19 │    9.07 │    1.01 │  33922.28 │  33774.19 │    1.00 │
├───────────────────────┼───────┼─────────────┼───────────────────┼────────┼───────────┼──────────┼─────────┼────────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│ Simple assume observe │     1 │ forwarddiff │             typed │  false │      2.72 │     2.61 │    1.05 │   3.93 │    3.89 │    1.01 │     10.70 │     10.14 │    1.06 │
│           Smorgasbord │   201 │ forwarddiff │             typed │  false │   1214.47 │  1090.53 │    1.11 │ 120.67 │   71.26 │    1.69 │ 146554.97 │  77715.75 │    1.89 │
│           Smorgasbord │   201 │ forwarddiff │       simple_dict │   true │       err │      err │     err │    err │     err │     err │       err │       err │     err │
├───────────────────────┼───────┼─────────────┼───────────────────┼────────┼───────────┼──────────┼─────────┼────────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│           Smorgasbord │   201 │ forwarddiff │ simple_namedtuple │   true │       err │      err │     err │    err │     err │     err │       err │       err │     err │
│           Smorgasbord │   201 │      enzyme │             typed │   true │   1719.48 │  1511.00 │    1.14 │   5.37 │    8.11 │    0.66 │   9241.55 │  12256.49 │    0.75 │
│           Smorgasbord │   201 │    mooncake │             typed │   true │   1725.80 │  1472.96 │    1.17 │   5.24 │    6.11 │    0.86 │   9048.33 │   8992.74 │    1.01 │
├───────────────────────┼───────┼─────────────┼───────────────────┼────────┼───────────┼──────────┼─────────┼────────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│           Smorgasbord │   201 │ reversediff │             typed │   true │   1702.88 │  1510.74 │    1.13 │  88.04 │  100.19 │    0.88 │ 149919.54 │ 151365.18 │    0.99 │
│           Smorgasbord │   201 │ forwarddiff │      typed_vector │   true │   1687.07 │  1485.59 │    1.14 │  56.61 │   68.33 │    0.83 │  95497.74 │ 101511.35 │    0.94 │
│           Smorgasbord │   201 │ forwarddiff │           untyped │   true │   1694.97 │  1495.02 │    1.13 │  55.72 │   65.71 │    0.85 │  94439.65 │  98236.99 │    0.96 │
├───────────────────────┼───────┼─────────────┼───────────────────┼────────┼───────────┼──────────┼─────────┼────────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│           Smorgasbord │   201 │ forwarddiff │    untyped_vector │   true │   1673.63 │  1491.35 │    1.12 │  55.06 │   68.39 │    0.81 │  92152.07 │ 101997.83 │    0.90 │
│              Submodel │     1 │    mooncake │             typed │   true │      7.47 │     3.22 │    2.32 │   5.03 │   11.25 │    0.45 │     37.59 │     36.19 │    1.04 │
└───────────────────────┴───────┴─────────────┴───────────────────┴────────┴───────────┴──────────┴─────────┴────────┴─────────┴─────────┴───────────┴───────────┴─────────┘

codecov · 2025-11-20T13:14:58Z

Codecov Report

❌ Patch coverage is 41.66667% with 182 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.00%. Comparing base (1bb97ae) to head (c818bf8).

Files with missing lines	Patch %	Lines
src/varnamedtuple.jl	37.37%	181 Missing ⚠️
src/utils.jl	75.00%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##           breaking    #1150      +/-   ##
============================================
- Coverage     80.01%   77.00%   -3.01%     
============================================
  Files            41       42       +1     
  Lines          3877     4153     +276     
============================================
+ Hits           3102     3198      +96     
- Misses          775      955     +180

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

penelopeysm · 2025-11-20T13:29:38Z

It looks to me that the 1.11 perf is only a lot worse on the trivial model. In my experience (ran into this exact issue with Enzyme once, see also https://github.com/TuringLang/DynamicPPL.jl/pull/877/files), trivial models with 1 variable can be quite susceptible to changes in inlining strategy. It may be that a judicious @inline or @noinline somewhere will fix this.

… and also `bundle_samples` (#1129) * Implement `ParamsWithStats` for `FastLDF` * Add comments * Implement `bundle_samples` for ParamsWithStats -> MCMCChains * Remove redundant comment * don't need Statistics?

* Make FastLDF the default * Add miscellaneous LogDensityProblems tests * Use `init!!` instead of `fast_evaluate!!` * Rename files, rebalance tests

mhauru · 2025-11-27T19:18:12Z

Performance on v1.11 has been fixed, and many other improvements made. Things that remain unfinished:

Adding support for Colons as much as we can. I think we can add support at least within the tilde pipeline, by converting colons into ranges.
Study and document what are the restrictions that this design of VNT imposes modelling syntax. This forbids things like having @varname(a[1,1]) and @varname(a[1,1,2]) in the same model, which could previously easily be the case because of linear indexing or a non-Array a, as well as @varname(a[-1]) and other oddities enabled by unusual array types like OffsetArray. Some of these we can add support for, others we would have to ban if we keep this design of VNT, but we should first figure out what are all the limitations and then decide how much we care about them.

This probably isn't ready to merge due to aforementioned limitations, but fixing them will be adding things, rather than modifying things, compared to what is in this PR, so I think now is a good time for a first review.

penelopeysm

I just happened to be reading the docs, will leave the code for next time - but the design sounds good!

docs/src/internals/varnamedtuple.md

github-actions · 2025-11-27T19:55:47Z

DynamicPPL.jl documentation for PR #1150 is available at:
https://TuringLang.github.io/DynamicPPL.jl/previews/PR1150/

Co-authored-by: Penelope Yong <penelopeysm@gmail.com>

* Make threadsafe evaluation opt-in * Reduce number of type parameters in methods * Make `warned_warn_about_threads_threads_threads_threads` shorter * Improve `setthreadsafe` docstring * warn on bare `@threads` as well * fix merge * Fix performance issues * Use maxthreadid() in TSVI * Move convert_eltype code to threadsafe eval function * Point to new Turing docs page * Add a test for setthreadsafe * Tidy up check_model * Apply suggestions from code review Fix outdated docstrings Co-authored-by: Markus Hauru <markus@mhauru.org> * Improve warning message * Export `requires_threadsafe` * Add an actual docstring for `requires_threadsafe` --------- Co-authored-by: Markus Hauru <markus@mhauru.org>

mhauru · 2025-12-01T17:09:34Z

Oh, forgot to mention: I'm up for reconsidering the name. Given the role of PartialArray, I wonder if we should call it something more generic. VarNameMap?

VarNameDict would be good if we made it a subtype of AbstractDict. A bit unsure if we should do that. Seems like you should define merge, map, filter, values, keys, and pairs. We could do all those, though not map! and filter! which may be part of the interface as well. The thing that makes me hesitate is that the notion of keys is a bit unusual: You can setindex!!(vnt, val, @varname(a[1:2]), but @varname(a[1:2]) won't be in keys(vnt), rather @varname(a[1]) and @varname(a[2]) will. AbstractDict is woefully underd-ocumented, but I wonder if that would violate some assumptions.

Opinions welcome.

penelopeysm · 2025-12-01T11:38:20Z

src/varnamedtuple.jl

+_haskey(arr::AbstractArray, optic::IndexLens) = _haskey(arr, optic.indices)
+_haskey(arr::AbstractArray, inds) = checkbounds(Bool, arr, inds...)


Suggested change

_haskey(arr::AbstractArray, optic::IndexLens) = _haskey(arr, optic.indices)

_haskey(arr::AbstractArray, inds) = checkbounds(Bool, arr, inds...)

_haskey(arr::AbstractArray, optic::IndexLens) = _hasindices(arr, optic.indices)

_hasindices(arr::AbstractArray, inds) = checkbounds(Bool, arr, inds...)

I would prefer different function names for different signatures!

Would the logic then be that it's _haskey when the second argument is a lens, and _hasindices when it's a tuple? We should then probably rename _haskey on PartialArray too. It wouldn't be the worst, but I feel like it would introduce an extra mental overhead of keeping track of the distinction between IndexLenses and their index tuples.

penelopeysm · 2025-12-01T11:40:19Z

src/varnamedtuple.jl

+    data::Array{ElType,num_dims}
+    mask::Array{Bool,num_dims}
+
+    function PartialArray(
+        data::Array{ElType,num_dims}, mask::Array{Bool,num_dims}
+    ) where {ElType,num_dims}
+        if size(data) != size(mask)
+            throw(ArgumentError("Data and mask arrays must have the same size"))
+        end
+        return new{ElType,num_dims}(data, mask)
+    end


Do we want FixedSizeArrays, or too much faff?

src/varnamedtuple.jl

penelopeysm · 2025-12-01T11:43:42Z

src/varnamedtuple.jl

+"""Take the minimum size that a dimension of a PartialArray needs to be, and return the size
+we choose it to be. This size will be the smallest possible power of
+PARTIAL_ARRAY_DIM_GROWTH_FACTOR. Growing PartialArrays in big jumps like this helps reduce
+data copying, as resizes aren't needed as often.
+"""
+function _partial_array_dim_size(min_dim)
+    factor = PARTIAL_ARRAY_DIM_GROWTH_FACTOR
+    return factor^(Int(ceil(log(factor, min_dim))))
+end


Is this better for performance? Would it be also equally OK to make it min_dim to begin but still scale by 4 each time, or is it just magically faster whenever the size is always a power of 4?

I don't think there's anything special about powers of 4. I somehow found it simpler to think about it when all size-setting went through this function, but we could indeed bypass this when first creating an array. I should also say that I've benchmarked none of the impact of this. I assume, quite strongly, that doing these steps is helpful, but by how much and what would be a good step size, I have no idea. 4 is a pure guess.

penelopeysm · 2025-12-01T11:46:40Z

src/varnamedtuple.jl

+resized in exponentially increasing steps. This means that most `setindex!!` calls are very
+fast, but some may incur substantial overhead due to resizing and copying data. It also


although the cost of setindex!! is still O(1) amortised...!

That's right. Do you think we should mention that in the docstring?

penelopeysm · 2025-12-01T17:32:58Z

src/varnamedtuple.jl

+heterogeneous data under different indices of the same symbol. That is, if one either
+
+* sets `a[1]` and `a[2]` to be of different types, or
+* sets `a[1].b` and `a[2].c`, without setting `a[1].c`. or `a[2].b`,


Suggested change

* sets `a[1].b` and `a[2].c`, without setting `a[1].c`. or `a[2].b`,

* sets `a[1].b` and `a[2].c`, without setting `a[1].c` and `a[2].b`,

Though I think this could be simplified to just

if `a[1]` and `a[2]` both exist, sets `a[1].b` without also setting `a[2].b`

Yeah, that's simpler. I put in a slight modification of your last proposal.

src/varnamedtuple.jl

penelopeysm · 2025-12-01T17:39:04Z

src/varnamedtuple.jl

+    # return VarNamedTuple(_setindex!!(vnt.data, value, S))
+    # but that seems to be type unstable. Why? Shouldn't it obviously be the same as the
+    # below?


maybe because the symbol S is no longer represented at the type level?

I thought constant propagation would deal with it still. It seems like quite a trivial case of const prop to me.

src/varnamedtuple.jl

penelopeysm · 2025-12-01T17:42:52Z

src/varnamedtuple.jl

+# TODO(mhauru) Should this return tuples, like it does now? That makes sense for
+# VarNamedTuple itself, but if there is a nested PartialArray the tuple might get very big.
+# Also, this is not very type stable, it fails even in basic cases. A generated function


Personally I would be kind of inclined to just let it return a vector. I don't know if keys is used in performance sensitive aspects.

I should try how a Vector works here. I wanted to mimic this:

julia> Base.keys((; a=1, b=2)) (:a, :b)

But the thought of some 10k element tuple from a PartialArray is quite off-putting.

* Standardise `:lp` -> `:logjoint` * changelog * fix a test

Co-authored-by: Penelope Yong <penelopeysm@gmail.com>

mhauru · 2025-12-04T17:35:55Z

In a conversation with @penelopeysm we just realised that we don't understand how VNT handles something like

x[1:2] ~ Dirichlet(ones(2))

In LogDensityFunction, we come to assign a RangesAndLinked to x[1:2], but that's a scalar value being assigned to a range, so that would store the same value for both x[1] and x[2]. Shouldn't that crash? Do we not have a test for that? Also, I don't know what we should do with this. Worse still, in the above example, what would we do in extract_priors? The RangesAndLinked for two variables you could imagine breaking up into two RangesAndLinked objects, one for each, but you can't do that for a joint multivariate distribution like Dirichlet.

penelopeysm and others added 21 commits October 21, 2025 18:08

v0.39

8c3d30f

Merge remote-tracking branch 'origin/main' into breaking

262d732

Merge branch 'main' into breaking

77af4eb

Merge remote-tracking branch 'origin/main' into breaking

c57de02

Update DPPL compats for benchmarks and docs

7300c22

remove merge conflict markers

79150ba

Merge remote-tracking branch 'origin/main' into breaking

6dc7c02

Merge branch 'main' into breaking

2ca96cc

Merge branch 'main' into breaking

a8eb2e7

Remove NodeTrait (#1133)

4ca9528

* Remove NodeTrait * Changelog * Fix exports * docs * fix a bug * Fix doctests * Fix test * tweak changelog

implement LogDensityProblems.dimension

9624103

forgot about capabilities...

ce80713

Merge branch 'main' into breaking

1d21728

Merge branch 'main' into breaking

c4cec0b

use interpolation in run_ad

8553e40

Improvements to benchmark outputs (#1146)

3cd8d34

* print output * fix * reenable * add more lines to guide the eye * reorder table * print tgrad / trel as well * forgot this type

Add VarNamedTuple, tests, and WIP docs

eab7131

Add comparisons and merge

0c7825b

Start using VNT in FastLDF

15d5a8a

Move _compose_no_identity to utils.jl

871eb9f

github-actions bot assigned mhauru Nov 20, 2025

Allow generation of ParamsWithStats from FastLDF plus parameters,…

4a11560

… and also `bundle_samples` (#1129) * Implement `ParamsWithStats` for `FastLDF` * Add comments * Implement `bundle_samples` for ParamsWithStats -> MCMCChains * Remove redundant comment * don't need Statistics?

mhauru mentioned this pull request Nov 24, 2025

Make FastLDF the default #1139

Merged

penelopeysm and others added 3 commits November 25, 2025 11:41

Make FastLDF the default (#1139)

766f663

* Make FastLDF the default * Add miscellaneous LogDensityProblems tests * Use `init!!` instead of `fast_evaluate!!` * Rename files, rebalance tests

Minor refactor

c1b935b

Remove IndexDict

262a6f9

Make VNT merge type stable. Simplify printing, improve tests.

3ca36c4

mhauru marked this pull request as ready for review November 27, 2025 19:18

mhauru added 5 commits November 27, 2025 19:23

Add VNT too API docs

59f67fd

Fix doctests

9aba468

Clean up tests a bit

0b4c772

Fix API docs

38662a8

Fix a bug and a docstring

e41afca

mhauru requested a review from penelopeysm November 27, 2025 19:49

penelopeysm reviewed Nov 27, 2025

View reviewed changes

penelopeysm self-requested a review November 27, 2025 19:55

mhauru and others added 2 commits November 28, 2025 09:14

Apply suggestions from code review

8c50bbb

Co-authored-by: Penelope Yong <penelopeysm@gmail.com>

Fix VNT docs

cae8864

yebai mentioned this pull request Nov 28, 2025

Depreciate VarInfo in favour of SimpleVarInfo #416

Closed

4 tasks

penelopeysm and others added 2 commits December 1, 2025 13:22

Merge remote-tracking branch 'origin/main' into breaking

993cc5b

penelopeysm reviewed Dec 1, 2025

View reviewed changes

Standardise :lp -> :logjoint (#1161)

54ae7e3

* Standardise `:lp` -> `:logjoint` * changelog * fix a test

Base automatically changed from breaking to main December 2, 2025 12:45

mhauru added 3 commits December 3, 2025 17:05

Merge branch 'tmp2' into mhauru/vnt-for-fastldf

6012b11

Merge commit 'ee863d6' into mhauru/vnt-for-fastldf

f114e40

Merge branch 'breaking' into mhauru/vnt-for-fastldf

bccfdf0

mhauru changed the base branch from main to breaking December 3, 2025 17:13

mhauru and others added 4 commits December 3, 2025 17:36

Apply suggestions from code review

384e3ac

Co-authored-by: Penelope Yong <penelopeysm@gmail.com>

Add a microoptimisation

9d61a54

Improve docstrings

8c8e39f

Simplify use of QuoteNodes

c818bf8

		_haskey(arr::AbstractArray, optic::IndexLens) = _haskey(arr, optic.indices)
		_haskey(arr::AbstractArray, inds) = checkbounds(Bool, arr, inds...)

		resized in exponentially increasing steps. This means that most `setindex!!` calls are very
		fast, but some may incur substantial overhead due to resizing and copying data. It also

	* sets `a[1].b` and `a[2].c`, without setting `a[1].c`. or `a[2].b`,
	* sets `a[1].b` and `a[2].c`, without setting `a[1].c` and `a[2].b`,

VarNamedTuple, with an application for FastLDF #1150

Are you sure you want to change the base?

VarNamedTuple, with an application for FastLDF #1150

Uh oh!

Conversation

mhauru commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark Report

Computer Information

Benchmark Results

Uh oh!

codecov bot commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

penelopeysm commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mhauru commented Nov 27, 2025

Uh oh!

penelopeysm left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented Nov 27, 2025

Uh oh!

mhauru commented Dec 1, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mhauru commented Dec 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

mhauru commented Nov 20, 2025 •

edited

Loading

github-actions bot commented Nov 20, 2025 •

edited

Loading

codecov bot commented Nov 20, 2025 •

edited

Loading

penelopeysm commented Nov 20, 2025 •

edited

Loading