Commit 673f022 ("ode with unc. in param"), 1 parent b3757a0

2 files changed: +430 −0

docs/src/lecture_13/lecture.md: 342 additions, 0 deletions

# Data-driven Ordinary Differential Equations

We have looked into uncertainty propagation through an ODE in the previous lecture. The uncertainty may stem from:
- unknown boundary conditions (e.g. initial conditions),
- unknown parameters,
- missing terms (hidden dynamics) of the ODE.

Differential equations are commonly used in science to describe many aspects of the physical world, ranging from dynamical systems and curves in space to complex multi-physics phenomena.

As an example, consider a simple non-linear ordinary differential equation:

```math
\begin{align}
\dot{x}&=\alpha x-\beta xy,\\
\dot{y}&=\delta xy-\gamma y,
\end{align}
```

which describes the behavior of a predator-prey model in continuous time:
- ``x`` is the population of prey (sheep),
- ``y`` is the population of predators (wolves),
- the derivatives represent instantaneous growth rates of the populations,
- ``t`` is the time and ``\alpha, \beta, \gamma, \delta`` are parameters.

The model can be written in vector form with ``\mathbf{x}=[x,y]``:
```math
\frac{d\mathbf{x}}{dt}=f(\mathbf{x},\theta)
```
with an arbitrary function ``f`` and a vector of parameters ``\theta``.


The first thing we may want to do with an ODE is to see its evolution in time. The simplest approach is to discretize the time axis into steps
``t = [t_1, t_2, t_3, \ldots t_T]``
and evaluate the solution at these points.

Replacing the derivative by a finite difference
```math
\dot x \leftarrow \frac{x_t-x_{t-1}}{\Delta t}
```
we can derive a general scheme (the Euler method):
```math
\mathbf{x}_t = \mathbf{x}_{t-1} + \Delta{}t\,f(\mathbf{x}_{t-1},\theta)
```
which can be written generically in Julia:
```julia
function f(x,θ)
    α,β,γ,δ = θ
    x1,x2 = x
    dx1 = α*x1 - β*x1*x2   # prey: growth minus predation
    dx2 = δ*x1*x2 - γ*x2   # predators: growth minus natural death
    [dx1,dx2]
end

function solve(f,x0::AbstractVector,θ,dt,N)
    X = hcat([zero(x0) for i=1:N]...)   # preallocate the solution matrix
    X[:,1] = x0
    for t=1:N-1
        X[:,t+1] = X[:,t] + dt*f(X[:,t],θ)   # explicit Euler step
    end
    X
end
```

This is simple and works (with sufficiently small ``dt``):

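For instance, calling the solver with the parameter values and initial conditions used later in this lecture (``θ0`` collects ``\alpha,\beta,\gamma,\delta``):
```julia
θ0 = [0.1,0.2,0.3,0.2]    # α, β, γ, δ
x0 = [1.0,1.0]            # initial populations of prey and predators
dt, N = 0.1, 1000
X = solve(f,x0,θ0,dt,N)   # 2×1000 matrix holding the Euler trajectory
```
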
![](lotka.svg)

An ODE of this kind is an example of a "complex" simulation code that we may want to use, interact with, modify, or incorporate into a more complex scheme.
- we will test how to re-define the elementary operations using custom types, automatic differentiation and automatic code generation,
- we will redefine the plotting operation to display the new type correctly,
- we will use composition to incorporate the ODE into a more complex solver.

## Uncertainty propagation

Prediction of the ODE model is valid only if all parameters and all initial conditions are accurate. This is almost never the case. While the number of sheep may be known, the number of wolves in a forest is more uncertain. The same model applies to predator-prey relations in insects, where the number of individuals can only be estimated.

Uncertain initial conditions:
- the numbers of predators and prey are given by a probability distribution,
- an interval ``[0.8,1.2]`` corresponds to the uniform distribution ``U(0.8,1.2)``,
- a Gaussian ``N(\mu,\sigma)``, with mean ``\mu`` and standard deviation ``\sigma``, e.g. ``N(1,0.1)``,
- more complicated distributions are more realistic (the number of animals is not negative!).

### Ensemble approach

The simplest approach is to represent the distribution by an empirical density, i.e. by discrete samples:
```math
p(\mathbf{x})\approx \frac{1}{K}\sum_{k=1}^{K} \delta(\mathbf{x}-\mathbf{x}^{(k)})
```

In the case of a Gaussian, we just sample:
```julia
K = 10
X0 = [x0 .+ 0.1*randn(2) for _=1:K]        # samples of initial conditions
Xens = [solve(f,X0[i],θ0,dt,N) for i=1:K]  # solve multiple times
```
(This can be implemented more elegantly using multiple dispatch on `Vector{Vector}`, as sketched below.)
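A minimal sketch of that dispatch (treating a vector of initial-condition vectors as an ensemble is our assumption):
```julia
# One trajectory per sample: dispatch on a vector of initial-condition vectors.
solve(f,x0s::AbstractVector{<:AbstractVector},θ,dt,N) = [solve(f,x0,θ,dt,N) for x0 in x0s]

Xens = solve(f,X0,θ0,dt,N)   # same result as the comprehension above
```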

![](LV_ensemble.svg)

While this is very simple and universal, it may become hard to interpret:
- What is the probability that ``x`` will be higher than ``x_{max}``?
- Improving the accuracy requires a higher number of samples (expensive!).

### Propagating a Gaussian

Propagation of uncertainty has been studied in many areas of science. The relation between accuracy and computational speed is always a tradeoff.

A common approach to propagation of uncertainty is the linearized Gaussian:
- a variable ``x`` is represented by a Gaussian ``N(\mu,\sigma)``,
- transformation of addition: ``x+a\sim N(\mu+a,\sigma)``,
- transformation of multiplication: ``a*x\sim N(a*\mu,|a|*\sigma)``,
- a general transformation is approximated by:
```math
g(x)\sim N(g(\mu),|g'(\mu)|*\sigma)
```

This can be efficiently implemented in Julia:
```julia
struct GaussNum{T<:Real}
    μ::T
    σ::T
end
import Base: +, *
+(x::GaussNum{T},a::T) where T = GaussNum(x.μ+a,x.σ)
+(a::T,x::GaussNum{T}) where T = GaussNum(x.μ+a,x.σ)
*(x::GaussNum{T},a::T) where T = GaussNum(x.μ*a,abs(a)*x.σ)
*(a::T,x::GaussNum{T}) where T = GaussNum(x.μ*a,abs(a)*x.σ)
```

For the ODE we need multiplication of two Gaussians. Using a Taylor expansion and neglecting covariances:
```math
g(x_1,x_2)\sim N\left(g(\mu_1,\mu_2), \sqrt{\left(\frac{dg}{dx_1}(\mu_1,\mu_2)\sigma_1\right)^2 + \left(\frac{dg}{dx_2}(\mu_1,\mu_2)\sigma_2\right)^2}\right)
```
which trivially applies to the sum: ``x_1+x_2\sim N(\mu_1+\mu_2, \sqrt{\sigma_1^2 + \sigma_2^2})``

```julia
+(x1::GaussNum{T},x2::GaussNum{T}) where T = GaussNum(x1.μ+x2.μ, sqrt(x1.σ^2 + x2.σ^2))
*(x1::GaussNum{T},x2::GaussNum{T}) where T = GaussNum(x1.μ*x2.μ, sqrt((x2.μ*x1.σ)^2 + (x1.μ*x2.σ)^2))
```

Following the principle of defining the necessary functions on the type, we can make it pass through the ODE solver:
- it is necessary to define new initialization (the `zero` function), as sketched below,
- define a nice-looking constructor (``±``):
```julia
±(a::T,b::T) where T<:Real = GaussNum(a,b)
```
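A minimal sketch of the remaining methods (that `zero` and `-` are all the solver needs beyond `+` and `*` is our assumption; subtraction follows the same linearization as addition):
```julia
import Base: zero, -
# zero elements used by the solver's preallocation
zero(::Type{GaussNum{T}}) where T = GaussNum(zero(T),zero(T))
zero(x::GaussNum) = zero(typeof(x))
# subtraction, needed inside f
-(x1::GaussNum{T},x2::GaussNum{T}) where T = GaussNum(x1.μ-x2.μ, sqrt(x1.σ^2 + x2.σ^2))
```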

```julia
GX = solve(f,[1.0±0.1,1.0±0.1],[0.1,0.2,0.3,0.2],0.1,1000)
```

![](LV_GaussNum.svg)

- function overloading follows a deterministic procedure => it can be automated (macros, generated functions), e.g. as sketched below
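
A minimal sketch of such automation (the helper name `overload_binary` is hypothetical; it `@eval`s the linearized rule for any binary function given its two partial derivatives):
```julia
# Define the GaussNum-GaussNum method of `op` from its partial derivatives ∂1, ∂2.
function overload_binary(op, ∂1, ∂2)
    @eval function $op(x1::GaussNum{T}, x2::GaussNum{T}) where T
        σ = sqrt(($∂1(x1.μ,x2.μ)*x1.σ)^2 + ($∂2(x1.μ,x2.μ)*x2.σ)^2)
        GaussNum($op(x1.μ,x2.μ), σ)
    end
end

overload_binary(+, (μ1,μ2)->1.0, (μ1,μ2)->1.0)  # reproduces the addition rule
overload_binary(*, (μ1,μ2)->μ2,  (μ1,μ2)->μ1)   # reproduces the multiplication rule
```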

### Flexibility

The great advantage of this approach is the ability to run arbitrary code with uncertainty attached to an arbitrary number.

For example, we may know the initial conditions, but not the value of a parameter:

```julia
GX = solve(f,[1.0±0.1,1.0±0.1],[0.1±0.1,0.2,0.3,0.2],0.1,1000)
```

![](LV_GaussNum2.svg)

### Disadvantage

The result does not correspond to the ensemble version above:
- we have ignored the covariances,
- an extension with covariances is possible by keeping track of the correlations (`Measurements.jl`), where the other variables are stored in a dictionary:
  - correlations are found by language manipulations,
  - very flexible and easy to use,
  - discovering the covariances requires building the covariance matrix from `ids` (expensive if done too often).


## Vector uncertainty
The previous simple approach ignores the covariances between variables. Even if we track covariances linearly in the same fashion (`Measurements.jl`), the approach will suffer from a loss of precision under non-linearity.

![](https://photos1.blogger.com/blogger/5955/293/1600/unscented-transform-explained.jpg)

- The linearization-based approach propagates through the non-linearity only the mean and models its neighborhood by a plane.
- Propagating all samples is too expensive.
- Methods based on quadrature or cubature rules are a compromise.


The cubature approach is based on moment matching:
```math
\mu_g = \int g(x) p(x) dx
```
for which ``g(\mu)`` is a poor approximation, corresponding to:
```math
\mu_g = g(\mu) = \int g(x) \delta(x-\mu) dx
```
For a Gaussian distribution, we can use a smarter integration rule, called the Gauss-Hermite quadrature:
```math
\mu_g = \int g(x) p(x) dx \approx \sum_{j=1}^J w_j g(x_j)
```
where ``x_j`` are prescribed quadrature points (see e.g. [online tables](https://www.efunda.com/math/num_integration/findgausshermite.cfm)).
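
A quick one-dimensional sanity check (the three nodes and weights below are the standard values for the weight function ``e^{-t^2}``; the substitution ``x=\mu+\sqrt{2}\sigma t`` adapts the rule to a Gaussian density):
```julia
# Three-point Gauss-Hermite rule for ∫ g(x) N(x;μ,σ²) dx
t = [-sqrt(3/2), 0.0, sqrt(3/2)]         # nodes for the weight exp(-t²)
w = [sqrt(π)/6, 2sqrt(π)/3, sqrt(π)/6]   # matching weights

μ, σ = 1.0, 0.1
gh_mean(g) = sum(w .* g.(μ .+ sqrt(2)*σ*t)) / sqrt(π)

gh_mean(identity)   # ≈ 1.0:  recovers the mean
gh_mean(x -> x^2)   # ≈ 1.01: recovers E[x²] = μ² + σ² (exact up to degree 5)
```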

In the multivariate setting, the same problem is typically solved with the aim of reducing the computational cost to linear complexity in the dimension, most often aiming at ``O(2d)`` complexity, where ``d`` is the dimension of the vector ``x``.

One of the most popular approaches today is based on cubature rules approximating the Gaussian in radial-spherical coordinates.

### Cubature rules

Consider a Gaussian distribution with mean ``\mu`` and a positive definite covariance matrix ``\Sigma`` with square root ``\sqrt\Sigma``, such that ``\sqrt\Sigma \sqrt\Sigma^T=\Sigma``. The quadrature points are:
```math
x_i = \mu + \sqrt\Sigma q_i
```
```math
\begin{align}
q_{1}&=\sqrt{d}\begin{bmatrix}1\\ 0\\ \vdots \end{bmatrix}
&
q_{2}&=\sqrt{d}\begin{bmatrix}0\\ 1\\ \vdots \end{bmatrix} \ldots
&
q_{d+1}&=\sqrt{d}\begin{bmatrix}-1\\ 0\\ \vdots \end{bmatrix}
&
q_{d+2}&=\sqrt{d}\begin{bmatrix}0\\ -1\\ \vdots \end{bmatrix} \ldots
\end{align}
```
These can be composed into a constant matrix ``Q=[q_1,\ldots q_{2d}]``:
```math
Q = \sqrt{d} [ I_d, -I_d]
```

![](cubature.png)

In the integration, these quadrature points are weighted by:
```math
w_i = \frac{1}{2d}, \quad i=1,\ldots,2d
```
where ``d`` is the dimension of the vectors.

The quadrature points are propagated through the non-linearity in parallel (``x_i'=g(x_i)``) and the resulting Gaussian distribution is:
```math
\begin{align}
x' & \sim N(\mu',\Sigma')\\
\mu' & = \frac{1}{2d}\sum_{i=1}^{2d} x'_i\\
\Sigma' &= \frac{1}{2d}\sum_{i=1}^{2d} (x'_i-\mu') (x'_i-\mu')^T
\end{align}
```
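
A minimal sketch of this transform in Julia (the helper name `cubature_transform` and the use of a Cholesky factor as ``\sqrt\Sigma`` are our assumptions):
```julia
using LinearAlgebra

# Propagate N(μ,Σ) through g using the 2d cubature points described above.
function cubature_transform(g, μ::AbstractVector, Σ::AbstractMatrix)
    d = length(μ)
    sqΣ = cholesky(Σ).L                      # square root: sqΣ*sqΣ' == Σ
    Q = sqrt(d) * [I(d) -I(d)]               # constant matrix of directions q_i
    X′ = [g(μ + sqΣ*Q[:,i]) for i in 1:2d]   # propagate each sigma-point
    μ′ = sum(X′)/2d
    Σ′ = sum((x - μ′)*(x - μ′)' for x in X′)/2d
    μ′, Σ′
end
```
For the identity ``g(x)=x`` it returns ``(\mu,\Sigma)`` up to rounding, which is exactly the check below.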

It is easy to check that if the sigma-points are propagated through an identity, they preserve the mean and variance:
```math
\begin{align}
\mu' & = \frac{1}{2d}\sum_{i=1}^{2d} (\mu + \sqrt{\Sigma}q_i)\\
& = \frac{1}{2d}\left(2d\mu + \sqrt{\Sigma} \sum_{i=1}^{2d} q_i\right)\\
& = \mu,
\end{align}
```
since the ``q_i`` sum to zero.

For our example:

![](LV_Quadrics.svg)

- only 4 trajectories are propagated deterministically,
- it cannot be implemented using a single number type,
- the number of points to store is proportional to the dimension,
- manipulation requires operations from linear algebra,
- moving to representations in vector form is
  - simple for initial conditions,
  - but how to extend it to operate also on parameters?

### Smarter implementation
The easiest solution is to put the corresponding parts of the problem together:
- the ODE function ``f``,
- its state ``x0``,
- and parameters ``θ``

can be wrapped into an `ODEProblem`:

```julia
struct ODEProblem{F,T,X<:AbstractVector,P<:AbstractVector}
    f::F
    tspan::T
    x0::X
    θ::P
end
```
- the solver can then operate on the `ODEProblem` type, e.g. as sketched below
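
A minimal sketch of such a method (wrapping the Euler solver above; passing a fixed step `dt` is our assumption):
```julia
# Dispatch solve on the problem type: unpack the fields and reuse the Euler solver.
function solve(prob::ODEProblem, dt)
    t0, tmax = prob.tspan
    N = round(Int, (tmax - t0)/dt)
    solve(prob.f, prob.x0, prob.θ, dt, N)
end
```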

### Uncertainty propagation in vectors

Example: consider uncertainty in the state ``[x_1,x_2]`` and the first parameter ``\theta_1``.

Quick and dirty:
```julia
getuncertainty(o::ODEProblem) = [o.x0[1:2]; o.θ[1]]
function setuncertainty!(o::ODEProblem, x::AbstractVector)
    o.x0[1:2] = x[1:2]   # write back the uncertain part of the state
    o.θ[1] = x[3]        # and the uncertain parameter
end
```
and write a general cubature solver using multiple dispatch.

Practical issues:
- how to check bounds? (asserts)
- what if we provide an incompatible ODEProblem?
- define a type that specifies the type of uncertainty:

```julia
struct GaussODEProblem
    mean::ODEProblem
    unc_in_u  # any indexing type accepted by to_index()
    unc_in_θ
    sqΣ0
end
```

We can dispatch the cubature solver on `GaussODEProblem` and call the ordinary `solve` on the inner `mean` problem internally.

```julia
getmean(gop::GaussODEProblem) = [gop.mean.x0[gop.unc_in_u]; gop.mean.θ[gop.unc_in_θ]]
setmean!(gop::GaussODEProblem, x::AbstractVector) = begin
    gop.mean.x0[gop.unc_in_u] = x[1:length(gop.unc_in_u)]
    gop.mean.θ[gop.unc_in_θ] = x[length(gop.unc_in_u) .+ (1:length(gop.unc_in_θ))]
end
```

The constructor accepts an `ODEProblem` with uncertain numbers and converts it to a `GaussODEProblem`; a sketch follows the list:
- it goes through the `ODEProblem` ``x0`` and ``θ`` fields and checks their types,
- replaces `GaussNum`s in the `ODEProblem` by ordinary numbers,
- remembers the indices of `GaussNum`s in ``x0`` and ``θ``,
- copies the standard deviations in `GaussNum` to ``sqΣ0``.
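
A minimal sketch of such a constructor under the assumptions above (mixed element types in `x0` and `θ`; a diagonal initial covariance, so its square root is `Diagonal(σs)`):
```julia
using LinearAlgebra

function GaussODEProblem(op::ODEProblem)
    # find where the uncertain numbers live
    unc_in_u = findall(x -> x isa GaussNum, op.x0)
    unc_in_θ = findall(x -> x isa GaussNum, op.θ)
    # strip GaussNums down to their means
    mean_of(x) = x isa GaussNum ? x.μ : x
    mop = ODEProblem(op.f, op.tspan, mean_of.(op.x0), mean_of.(op.θ))
    # collect the standard deviations into the diagonal square-root covariance
    σs = [getfield.(op.x0[unc_in_u], :σ); getfield.(op.θ[unc_in_θ], :σ)]
    GaussODEProblem(mop, unc_in_u, unc_in_θ, Diagonal(σs))
end
```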
