Commit 164b365

semantic line breaks
1 parent ab9cbaf commit 164b365

2 files changed: +27 -56 lines changed


tutorials/bayesian-poisson-regression/index.qmd

Lines changed: 1 addition & 1 deletion
@@ -235,4 +235,4 @@ chains_new = chain[201:end, :, :]
plot(chains_new)
```

-As can be seen from the numeric values and the plots above, the standard deviation values have decreased and all the plotted values are from the estimated posteriors. The exponentiated mean values, with the warmup samples removed, have not changed by much and they are still in accordance with their intuitive meanings as described earlier.
+As can be seen from the numeric values and the plots above, the standard deviation values have decreased and all the plotted values are from the estimated posteriors. The exponentiated mean values, with the warmup samples removed, have not changed by much and they are still in accordance with their intuitive meanings as described earlier.
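
The "exponentiated mean values" referred to above can be checked with a short post-processing step. A minimal sketch, assuming `chain` is the MCMCChains.Chains object sampled earlier in that tutorial; the parameter name `b0` is a hypothetical placeholder, not a name taken from this diff:

```julia
# Hedged sketch, not part of this commit: drop the warmup draws and map a
# coefficient's posterior mean back to the rate scale of the Poisson model.
chains_new = chain[201:end, :, :]   # keep draws 201 onwards (warmup removed)
exp(mean(chains_new[:b0]))          # exponentiated posterior mean of coefficient b0
```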

tutorials/gaussian-processes-introduction/index.qmd

Lines changed: 26 additions & 55 deletions
@@ -12,20 +12,13 @@ using Pkg;
Pkg.instantiate();
```

-[JuliaGPs](https://github.com/JuliaGaussianProcesses/#welcome-to-juliagps) packages integrate well with Turing.jl because they implement the Distributions.jl
-interface.
+[JuliaGPs](https://github.com/JuliaGaussianProcesses/#welcome-to-juliagps) packages integrate well with Turing.jl because they implement the Distributions.jl interface.
This tutorial assumes basic knowledge of [Gaussian processes](https://en.wikipedia.org/wiki/Gaussian_process) (i.e., a general understanding of what they are); for a comprehensive introduction, see [Rasmussen and Williams (2006)](http://www.gaussianprocess.org/gpml/).
-For a more in-depth understanding of the
-[JuliaGPs](https://github.com/JuliaGaussianProcesses/#welcome-to-juliagps) functionality
-used here, please consult the
-[JuliaGPs](https://github.com/JuliaGaussianProcesses/#welcome-to-juliagps) docs.
-
-In this tutorial, we will model the putting dataset discussed in Chapter 21 of
-[Bayesian Data Analysis](http://www.stat.columbia.edu/%7Egelman/book/).
-The dataset comprises the result of measuring how often a golfer successfully gets the ball
-in the hole, depending on how far away from it they are.
-The goal of inference is to estimate the probability of any given shot being successful at a
-given distance.
+For a more in-depth understanding of the [JuliaGPs](https://github.com/JuliaGaussianProcesses/#welcome-to-juliagps) functionality used here, please consult the [JuliaGPs](https://github.com/JuliaGaussianProcesses/#welcome-to-juliagps) docs.
+
+In this tutorial, we will model the putting dataset discussed in Chapter 21 of [Bayesian Data Analysis](http://www.stat.columbia.edu/%7Egelman/book/).
+The dataset comprises the result of measuring how often a golfer successfully gets the ball in the hole, depending on how far away from it they are.
+The goal of inference is to estimate the probability of any given shot being successful at a given distance.

### Let's download the data and take a look at it:

@@ -36,15 +29,15 @@ df = CSV.read("golf.dat", DataFrame; delim=' ', ignorerepeated=true)
df[1:5, :]
```

-We've printed the first 5 rows of the dataset (which comprises only 19 rows in total).
+These are the first 5 rows of the dataset (which comprises only 19 rows in total).
Observe it has three columns:

-1. `distance` -- how far away from the hole. I'll refer to `distance` as `d` throughout the rest of this tutorial
+1. `distance` -- how far away from the hole. We will refer to `distance` as `d` throughout the rest of this tutorial
2. `n` -- how many shots were taken from a given distance
3. `y` -- how many shots were successful from a given distance

-We will use a Binomial model for the data, whose success probability is parametrised by a
-transformation of a GP. Something along the lines of:
+We will use a Binomial model for the data, whose success probability is parametrised by a transformation of a GP. Something along the lines of:
+
$$
\begin{aligned}
f & \sim \operatorname{GP}(0, k) \\

@@ -68,25 +61,15 @@ using AbstractGPs, LogExpFunctions, Turing
end
```

-We first define an `AbstractGPs.GP`, which represents a distribution over functions, and
-is entirely separate from Turing.jl.
+We first define an `AbstractGPs.GP`, which represents a distribution over functions, and is entirely separate from Turing.jl.
We place a prior over its variance `v` and length-scale `l`.
-`f(d, jitter)` constructs the multivariate Gaussian comprising the random variables
-in `f` whose indices are in `d` (plus a bit of independent Gaussian noise with variance
-`jitter` -- see [the docs](https://juliagaussianprocesses.github.io/AbstractGPs.jl/dev/api/#FiniteGP-and-AbstractGP)
-for more details).
-`f(d, jitter)` has the type `AbstractMvNormal`, and is the bit of AbstractGPs.jl that implements the
-Distributions.jl interface, so it's legal to put it on the right-hand side
-of a `~`.
-From this you should deduce that `f_latent` is distributed according to a multivariate
-Gaussian.
-The remaining lines comprise standard Turing.jl code that is encountered in other tutorials
-and Turing documentation.
-
-Before performing inference, we might want to inspect the prior that our model places over
-the data, to see whether there is anything obviously wrong.
-These kinds of prior predictive checks are straightforward to perform using Turing.jl, since
-it is possible to sample from the prior easily by just calling the model:
+`f(d, jitter)` constructs the multivariate Gaussian comprising the random variables in `f` whose indices are in `d` (plus a bit of independent Gaussian noise with variance `jitter` -- see [the docs](https://juliagaussianprocesses.github.io/AbstractGPs.jl/dev/api/#FiniteGP-and-AbstractGP) for more details).
+`f(d, jitter)` has the type `AbstractMvNormal`, and is the bit of AbstractGPs.jl that implements the Distributions.jl interface, so it's legal to put it on the right-hand side of a `~`.
+From this you should deduce that `f_latent` is distributed according to a multivariate Gaussian.
+The remaining lines comprise standard Turing.jl code that is encountered in other tutorials and Turing documentation.
+
+Before performing inference, we might want to inspect the prior that our model places over the data, to see whether there is anything obviously wrong.
+These kinds of prior predictive checks are straightforward to perform using Turing.jl, since it is possible to sample from the prior easily by just calling the model:

```{julia}
m = putting_model(Float64.(df.distance), df.n)
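
The `putting_model` definition itself sits outside this hunk, so the names discussed above (`v`, `l`, `f(d, jitter)`, `f_latent`) are not visible in the diff. For context, a minimal sketch of a model along these lines, where the kernel choice and the Gamma priors are illustrative assumptions rather than code taken from this commit:

```julia
using AbstractGPs, LogExpFunctions, Turing

@model function putting_model(d, n; jitter=1e-4)
    # Priors over the kernel variance and length-scale (hypothetical choices).
    v ~ Gamma(2, 1)
    l ~ Gamma(4, 1)

    # Zero-mean GP; `f(d, jitter)` is the finite-dimensional MvNormal over the
    # latent function values at the observed distances, plus a little jitter.
    f = GP(v * with_lengthscale(SEKernel(), l))
    f_latent ~ f(d, jitter)

    # Binomial likelihood whose success probability is the logistic transform
    # of the latent GP; `y` is left free so it can be conditioned on later.
    y ~ product_distribution(Binomial.(n, logistic.(f_latent)))
    return (; f_latent, y)
end
```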
@@ -117,22 +100,13 @@ end
plot(hists...; layout=(4, 5))
```

-In this case, the only prior knowledge I have is that the proportion of successful shots
-ought to decrease monotonically as the distance from the hole increases, which should show
-up in the data as the blue lines generally go down as we move from left to right on each
-graph.
-Unfortunately, there is not a simple way to enforce monotonicity in the samples from a GP,
-and we can see this in some of the plots above, so we must hope that we have enough data to
-ensure that this relationship holds approximately under the posterior.
-In any case, you can judge for yourself whether you think this is the most useful
-visualisation that we can perform -- if you think there is something better to look at,
-please let us know!
+In this case, the only prior knowledge we have is that the proportion of successful shots ought to decrease monotonically as the distance from the hole increases, which should show up in the data as the blue lines generally go down as we move from left to right on each graph.
+Unfortunately, there is not a simple way to enforce monotonicity in the samples from a GP, and we can see this in some of the plots above, so we must hope that we have enough data to ensure that this relationship holds approximately under the posterior.
+In any case, you can judge for yourself whether you think this is the most useful visualisation that we can perform; if you think there is something better to look at, please let us know!

Moving on, we generate samples from the posterior using the default `NUTS` sampler.
-We'll make use of [ReverseDiff.jl](https://github.com/JuliaDiff/ReverseDiff.jl), as it has
-better performance than [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl/) on
-this example. See the [automatic differentiation docs]({{< meta usage-automatic-differentiation >}}) for more info.
-
+We'll make use of [ReverseDiff.jl](https://github.com/JuliaDiff/ReverseDiff.jl), as it has better performance than [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl/) on this example.
+See the [automatic differentiation docs]({{< meta usage-automatic-differentiation >}}) for more info.

```{julia}
using Random, ReverseDiff

@@ -141,8 +115,7 @@ m_post = m | (y=df.y,)
chn = sample(Xoshiro(123456), m_post, NUTS(; adtype=AutoReverseDiff()), 1_000, progress=false)
```

-We can use these samples and the `posterior` function from `AbstractGPs` to sample from the
-posterior probability of success at any distance we choose:
+We can use these samples and the `posterior` function from `AbstractGPs` to sample from the posterior probability of success at any distance we choose:

```{julia}
d_pred = 1:0.2:21

@@ -154,7 +127,5 @@ plot!(d_pred, reduce(hcat, samples); label="", color=:blue, alpha=0.2)
scatter!(df.distance, df.y ./ df.n; label="", color=:red)
```

-We can see that the general trend is indeed down as the distance from the hole increases,
-and that if we move away from the data, the posterior uncertainty quickly inflates.
-This suggests that the model is probably going to do a reasonable job of interpolating
-between observed data, but less good a job at extrapolating to larger distances.
+We can see that the general trend is indeed down as the distance from the hole increases, and that if we move away from the data, the posterior uncertainty quickly inflates.
+This suggests that the model is probably going to do a reasonable job of interpolating between observed data, but less good a job at extrapolating to larger distances.
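
The code that computes the `samples` matrix plotted above falls outside these hunks. A hedged sketch of that step, conditioning the latent GP on a single posterior draw and pushing the result through the logistic link at the prediction grid; the kernel construction mirrors the hypothetical model sketch given earlier and is an assumption, not code from this commit:

```julia
using AbstractGPs, LogExpFunctions

# Hedged sketch: posterior success probabilities over d_pred for one draw of
# (v, l, f_latent) taken from the chain.
function success_probabilities(v, l, f_latent, d, d_pred; jitter=1e-4)
    f = GP(v * with_lengthscale(SEKernel(), l))     # assumed kernel
    f_post = posterior(f(d, jitter), f_latent)      # condition on the latent draw
    return logistic.(rand(f_post(d_pred, jitter)))  # map to success probabilities
end
```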
