@@ -90,6 +90,105 @@ prob = ODEProblem(foo, ones(5, 5), (0., 1.0), (ones(5,5), dualcache(zeros(5,5)))
solve(prob, TRBDF2())
```

### Handling Chunk Sizes

It is important that the chunk size of the `DiffCache` matches the chunk size used in the actual
differentiation. Let's understand this by looking at a real problem:

```julia
using ForwardDiff
using PreallocationTools

randmat = rand(10, 2)
sto = similar(randmat)
stod = dualcache(sto)

function claytonsample!(sto, τ; randmat = randmat)
    # fetch a temporary buffer whose element type matches τ (Float64 or ForwardDiff.Dual)
    sto = get_tmp(sto, τ)
    @show typeof(τ)
    @show size(sto), size(randmat)
    sto .= randmat
    τ == 0 && return sto

    # transform the second column via the Clayton copula conditional distribution with parameter τ
    n = size(sto, 1)
    for i in 1:n
        v = sto[i, 2]
        u = sto[i, 1]
        sto[i, 2] = (1 - u^(-τ) + u^(-τ) * v^(-(τ / (1 + τ))))^(-1 / τ)
    end
    return sto
end

ForwardDiff.derivative(τ -> claytonsample!(stod, τ), 0.3)
```

Notice that by default `dualcache` builds a cache that is compatible with differentiation w.r.t.
a variable sized like the cache variable itself, i.e. it is naturally set up for Jacobians.

```julia
julia> typeof(dualcache(sto))
PreallocationTools.DiffCache{Matrix{Float64}, Matrix{ForwardDiff.Dual{nothing, Float64, 10}}}
```

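As a quick sanity check, ForwardDiff's chunk-size heuristic (assuming the default chunk threshold) picks exactly that chunk size for an input with as many elements as `sto`:

```julia
using ForwardDiff

# `sto` has 10*2 = 20 elements; ForwardDiff's default heuristic maps this to chunk size 10,
# matching the Dual type shown above.
ForwardDiff.pickchunksize(20)  # 10
```
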
Notice that it chose a chunk size of 10, matching the chunk size that `ForwardDiff.jacobian` would use
to calculate the derivative w.r.t. an input the size of `sto` (i.e., the Jacobian of `claytonsample!` w.r.t. `randmat`).
However, in our actual differentiation we have:

```julia
typeof(τ) = ForwardDiff.Dual{ForwardDiff.Tag{var"#49#50", Float64}, Float64, 1}
```

i.e., a chunk size of 1, because we are differentiating w.r.t. a single scalar. This mismatch messes up the sizes in the reinterpretation of the cache:

```julia
(size(sto), size(randmat)) = ((55, 2), (10, 2))
```

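To see where the 55 comes from, here is a back-of-the-envelope sketch, assuming ForwardDiff's standard `Dual` layout (one value plus one `Float64` per partial):

```julia
# The cache was sized for chunk-10 Duals: each stores 1 value + 10 partials = 11 Float64s.
stored_floats = 10 * 2 * (10 + 1)    # 220 Float64s back the dual cache
# Reinterpreted as chunk-1 Duals (1 value + 1 partial = 2 Float64s each):
n_duals = stored_floats ÷ (1 + 1)    # 110 Dual elements
n_duals ÷ 2                          # 55 rows once arranged into the 2 columns
```
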
The fix is to ensure that the cache is generated with the correct chunk size, i.e.:

```julia
stod = dualcache(sto, Val{1})
```

Thus the following code successfully computes the derivative in a non-allocating way:

```julia
using ForwardDiff
using PreallocationTools

randmat = rand(10, 2)
sto = similar(randmat)
stod = dualcache(sto, Val{1})

function claytonsample!(sto, τ; randmat = randmat)
    sto = get_tmp(sto, τ)
    sto .= randmat
    τ == 0 && return sto

    n = size(sto, 1)
    for i in 1:n
        v = sto[i, 2]
        u = sto[i, 1]
        sto[i, 2] = (1 - u^(-τ) + u^(-τ) * v^(-(τ / (1 + τ))))^(-1 / τ)
    end
    return sto
end

ForwardDiff.derivative(τ -> claytonsample!(stod, τ), 0.3)

10×2 Matrix{Float64}:
 0.0   0.171602
 0.0  -0.412736
 0.0   0.149273
 0.0   0.18172
 0.0   0.144151
 0.0  -0.110773
 0.0   0.221714
 0.0  -0.111034
 0.0  -0.0723283
 0.0   0.251095
```

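More generally, the chunk size of the cache can be derived from whatever input is being differentiated with respect to. As a sketch going beyond the example above (using `ForwardDiff.pickchunksize` to mirror ForwardDiff's own heuristic; the names `x` and `stod_jac` are just illustrative):

```julia
# Hypothetical: size the cache for a Jacobian w.r.t. a vector `x` instead of a scalar τ.
x = rand(20)
stod_jac = dualcache(sto, Val{ForwardDiff.pickchunksize(length(x))})
```

Note that computing the chunk size from a runtime length like this is not type-stable; for a fixed problem size you would typically hard-code the value.
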
## LazyBufferCache

```julia
@@ -104,7 +203,9 @@ vectors of the same length.
Note that `LazyBufferCache` does cause a dynamic dispatch, though it is type-stable.
This gives it a ~100ns overhead, and thus on very small problems it can reduce
performance, but for any sufficiently sized calculation (e.g. >20 ODEs) this
may not even be measurable. The upside of `LazyBufferCache` is that the user does
not have to worry about potential issues with chunk sizes and such: `LazyBufferCache`
is much easier!

### Example
