|
1 | | -PreallocationTools.jl |
| 1 | +# PreallocationTools.jl |
| 2 | + |
| 3 | +[](https://gitter.im/JuliaDiffEq/Lobby?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) |
| 4 | +[](https://github.com/SciML/PreallocationTools.jl/actions?query=workflow%3ACI) |
| 5 | + |
| 6 | +PreallocationTools.jl is a set of tools for helping build non-allocating |
| 7 | +pre-cached functions for high-performance computing in Julia. Its tools handle |
| 8 | +edge cases of automatic differentiation to make it easier for users to get |
| 9 | +high performance even in the cases where code generation may change the |
| 10 | +function that is being called. |
| 11 | + |
| 12 | +## dualcache |
| 13 | + |
| 14 | +`dualcache` is a method for generating doubly-preallocated vectors which are |
| 15 | +compatible with non-allocating forward-mode automatic differentiation by |
| 16 | +ForwardDiff.jl. Since ForwardDiff uses chunked duals in its forward pass, two |
| 17 | +vector sizes are required in order for the arrays to be properly defined. |
| 18 | +`dualcache` creates a dispatching type to solve this, so that by passing a |
| 19 | +qualifier it can automatically switch between the required cache. This method |
| 20 | +is fully type-stable and non-dynamic, made for when the highest performance is |
| 21 | +needed. |
| 22 | + |
| 23 | +### Using dualcache |
| 24 | + |
| 25 | +```julia |
| 26 | +dualcache(u::AbstractArray, N = Val{default_cache_size(length(u))}) |
| 27 | +``` |
| 28 | + |
| 29 | +The `dualcache` function builds a `DualCache` object that stores both a version |
| 30 | +of the cache for `u` and for the `Dual` version of `u`, allowing use of |
| 31 | +pre-cached vectors with forward-mode automatic differentiation. To access the |
| 32 | +caches, one uses: |
| 33 | + |
| 34 | +```julia |
| 35 | +get_tmp(tmp::DualCache, u) |
| 36 | +``` |
| 37 | + |
| 38 | +When `u` has an element subtype of `Dual` numbers, then it returns the `Dual` |
| 39 | +version of the cache. Otherwise it returns the standard cache (for use in the |
| 40 | +calls without automatic differentiation). |
| 41 | + |
| 42 | +In order to preallocate to the right size, the `dualcache` needs to be specified |
| 43 | +to have the corrent `N` matching the chunk size of the dual numbers or larger. |
| 44 | +In a differential equation, optimization, etc., the default chunk size is computed |
| 45 | +from the state vector `u`, and thus if one creates the `dualcache` via |
| 46 | +`dualcache(u)` it will match the default chunking of the solver libraries. |
| 47 | + |
| 48 | +### Example |
| 49 | + |
| 50 | +```julia |
| 51 | +using LinearAlgebra, OrdinaryDiffEq |
| 52 | +function foo(du, u, (A, tmp), t) |
| 53 | + mul!(tmp, A, u) |
| 54 | + @. du = u + tmp |
| 55 | + nothing |
| 56 | +end |
| 57 | +prob = ODEProblem(foo, ones(5, 5), (0., 1.0), (ones(5,5), zeros(5,5))) |
| 58 | +solve(prob, Rosenbrock23()) |
| 59 | +``` |
| 60 | + |
| 61 | +fails because `tmp` is only real numbers, but during automatic differentiation |
| 62 | +we need `tmp` to be a cache of dual numbers. Since `u` is the value that will |
| 63 | +have the dual numbers, we dispatch based on that: |
| 64 | + |
| 65 | +```julia |
| 66 | +using LinearAlgebra, OrdinaryDiffEq, PreallocationTools |
| 67 | +function foo(du, u, (A, tmp), t) |
| 68 | + tmp = get_tmp(tmp, u) |
| 69 | + mul!(tmp, A, u) |
| 70 | + @. du = u + tmp |
| 71 | + nothing |
| 72 | +end |
| 73 | +chunk_size = 5 |
| 74 | +prob = ODEProblem(foo, ones(5, 5), (0., 1.0), (ones(5,5), dualcache(zeros(5,5), Val{chunk_size}))) |
| 75 | +solve(prob, TRBDF2(chunk_size=chunk_size)) |
| 76 | +``` |
| 77 | + |
| 78 | +or just using the default chunking: |
| 79 | + |
| 80 | +```julia |
| 81 | +using LinearAlgebra, OrdinaryDiffEq, PreallocationTools |
| 82 | +function foo(du, u, (A, tmp), t) |
| 83 | + tmp = get_tmp(tmp, u) |
| 84 | + mul!(tmp, A, u) |
| 85 | + @. du = u + tmp |
| 86 | + nothing |
| 87 | +end |
| 88 | +chunk_size = 5 |
| 89 | +prob = ODEProblem(foo, ones(5, 5), (0., 1.0), (ones(5,5), dualcache(zeros(5,5)))) |
| 90 | +solve(prob, TRBDF2()) |
| 91 | +``` |
| 92 | + |
| 93 | +## LazyBufferCache |
| 94 | + |
| 95 | +```julia |
| 96 | +LazyBufferCache(f::F=identity) |
| 97 | +``` |
| 98 | + |
| 99 | +A `LazyBufferCache` is a `Dict`-like type for the caches which automatically defines |
| 100 | +new cache vectors on demand when they are required. The function `f` is a length |
| 101 | +map which maps `length_of_cache = f(length(u))`, which by default creates cache |
| 102 | +vectors of the same length. |
| 103 | + |
| 104 | +Note that `LazyBufferCache` does cause a dynamic dispatch, though it is type-stable. |
| 105 | +This gives it a ~100ns overhead, and thus on very small problems it can reduce |
| 106 | +performance, but for any sufficiently sized calculation (e.g. >20 ODEs) this |
| 107 | +may not be even measurable. |
| 108 | + |
| 109 | +### Example |
| 110 | + |
| 111 | +```julia |
| 112 | +using LinearAlgebra, OrdinaryDiffEq, PreallocationTools |
| 113 | +function foo(du, u, (A, lbc), t) |
| 114 | + tmp = lbc[u] |
| 115 | + mul!(tmp, A, u) |
| 116 | + @. du = u + tmp |
| 117 | + nothing |
| 118 | +end |
| 119 | +prob = ODEProblem(foo, ones(5, 5), (0., 1.0), (ones(5,5), LazyBufferCache())) |
| 120 | +solve(prob, TRBDF2()) |
| 121 | +``` |
| 122 | + |
| 123 | +## Similar Projects |
| 124 | + |
| 125 | +[AutoPreallocation.jl](https://github.com/oxinabox/AutoPreallocation.jl) tries |
| 126 | +to do this automatically at the compiler level. |
0 commit comments