Commit 8cbe199

2 parents 1c9f187 + 6e3631d commit 8cbe199

2 files changed: +1 -178 lines changed

docs/src/lecture_05/lab.md

Lines changed: 0 additions & 177 deletions
@@ -56,46 +56,6 @@ We are getting a little ahead of ourselves in this lab, as understanding of thes
 - run the function with the argument once before running `@time` or use `@btime` if you have `BenchmarkTools` readily available in your environment
 - To see some measurable difference with this simple function, a longer vector of coefficients may be needed.
 
-!!! details
-```@repl lab05_polynomial
-function polynomial_stable(a, x)
-    accumulator = zero(x)
-    for i in length(a):-1:1
-        accumulator += x^(i-1) * a[i]
-    end
-    accumulator
-end
-```
-
-```@repl lab05_polynomial
-@code_warntype polynomial_stable(a, x) # type stable
-@code_warntype polynomial_stable(a, xf) # type stable
-```
-
-```@repl lab05_polynomial
-polynomial(a, xf) #hide
-polynomial_stable(a, xf) #hide
-@time polynomial(a, xf)
-@time polynomial_stable(a, xf)
-```
-
-The difference is only really visible when evaluating multiple times.
-```julia
-julia> using BenchmarkTools
-
-julia> @btime polynomial($a, $xf)
-  31.806 ns (0 allocations: 0 bytes)
-128.0
-
-julia> @btime polynomial_stable($a, $xf)
-  28.522 ns (0 allocations: 0 bytes)
-128.0
-```
-The difference is only a few nanoseconds.
-
-
-*Note*: Recalling homework from lab 1. Adding `zero` also extends this function to the case of `x` being a matrix, see `?` menu.
-
 
 Code stability issues are something unique to Julia, as its JIT compilation allows it to produce code that contains boxed variables, whose type can be inferred during runtime. This is one of the reasons why interpreted languages are slow to run but fast to type. Julia's way of solving it is based around compiling functions for specific arguments; however, in order for this to work without the interpreter, the compiler has to be able to infer the types.
 
@@ -201,22 +161,6 @@ In order to get more of a visual feel for profiling, there are packages that all
 Let's compare this with the type unstable situation.
 
 
-!!! details
-First let's define the function that allows us to run the `polynomial` multiple times.
-```@repl lab05_polynomial
-function run_polynomial(a, x, n)
-    for _ in 1:n
-        polynomial(a, x)
-    end
-end
-```
-
-```julia
-@profview run_polynomial(a, xf, Int(1e5)) # clears the profile for us
-```
-![poly_unstable](poly_unstable.png)
-
-
 Other options for viewing profiler outputs
 - [ProfileView](https://github.com/timholy/ProfileView.jl) - close cousin of `ProfileSVG`, spawns GTK window with interactive FlameGraph
 - [VSCode](https://www.julia-vscode.org/docs/stable/release-notes/v0_17/#Profile-viewing-support-1) - always imported `@profview` macro, flamegraphs (js extension required), filtering, one click access to source code
@@ -235,47 +179,6 @@ We have noticed that no matter if the function is type stable or unstable the ma
 [^1]: Explanation of the Horner scheme can be found on [https://en.wikipedia.org/wiki/Horner%27s_method](https://en.wikipedia.org/wiki/Horner%27s_method).
 
 
-!!! details
-```julia
-function polynomial(a, x)
-    accumulator = a[end] * one(x)
-    for i in length(a)-1:-1:1
-        accumulator = accumulator * x + a[i]
-    end
-    accumulator
-end
-```
-
-Speed up:
-- 49ns -> 8ns ~ 6x on integer valued input
-- 59ns -> 8ns ~ 7x on real valued input
-
-```
-julia> @btime polynomial($a, $x)
-  8.008 ns (0 allocations: 0 bytes)
-97818
-
-julia> @btime polynomial_stable($a, $x)
-  49.173 ns (0 allocations: 0 bytes)
-97818
-
-julia> @btime polynomial($a, $xf)
-  8.008 ns (0 allocations: 0 bytes)
-97818.0
-
-julia> @btime polynomial_stable($a, $xf)
-  58.773 ns (0 allocations: 0 bytes)
-97818.0
-```
-These numbers will be different on different hardware.
-
-**BONUS**: The profile trace does not even contain the calling of mathematical operators and is mainly dominated by the iteration utilities. In this case we had to increase the number of runs to `1e6` to get some meaningful trace.
-
-```julia
-@profview run_polynomial(a, xf, Int(1e6))
-```
-![poly_horner](poly_horner.png)
-
 ---
 
 ### Where to find source code?
@@ -409,42 +312,6 @@ nothing # hide
 ![lab04-ecosystem](ecosystems/lab04-worldstep.png)
 
 
-!!! details
-Red bars indicate type instabilities. The bars stacked on top of them are high,
-narrow and not filling the whole width, indicating that the problem is pretty
-serious. In our case the worst offender is the `filter` method inside
-`find_food` and `find_mate` functions.
-In both cases the bars on top of it are narrow and not the full width, meaning
-that not that much time has really been spent working, but instead inferring the
-types in the function itself during runtime.
-
-As a reminder, this is the `find_food` function:
-```julia
-# original
-function find_food(a::Animal, w::World)
-    as = filter(x -> eats(a,x), w.agents |> values |> collect)
-    isempty(as) ? nothing : sample(as)
-end
-```
-Just from looking at that piece of code it's not obvious what the problem is,
-however the red color indicates that the code may be type unstable. Let's see if
-that is the case by evaluating the function with some isolated inputs.
-
-```julia
-using InteractiveUtils # hide
-w = Wolf(4000)
-find_food(w, world)
-@code_warntype find_food(w, world)
-```
-
-Indeed we see that the return type is not inferred precisely but ends up being
-just the `Union{Nothing, Agent}`, this is better than straight out `Any`, which
-is the union of all types but still, Julia has to do dynamic dispatch here, which is slow.
-
-The underlying issue here is that we are working with an array of type `Vector{Agent}`,
-where `Agent` is abstract, which does not allow the compiler to specialize the
-code for the loop body.
-
 ## Different `Ecosystem.jl` versions
 
 In order to fix the type instability in the `Vector{Agent}` we somehow have to
@@ -478,50 +345,6 @@ end
 
 Which differences can you observe? Why is one version faster than the other?
 
-!!! details
-It turns out that with this simple change we can already gain a little bit of speed:
-
-| | `find_food` | `reproduce!` |
-|-------------------------------------------|-------------|--------------|
-|`Animal{A}` & `Dict{Int,Agent}` | 43.917 μs | 439.666 μs |
-|`Animal{A}` & `Dict{Int,Union{...}}` | 12.208 μs | 340.041 μs |
-
-We are gaining performance here because for small `Union`s of types the Julia
-compiler can precompile the multiple available code branches. If we have just a
-`Dict` of `Agent`s this is not possible.
-
-This, however, does not yet fix our type instabilities completely. We are still working with `Union`s of types
-which we can see again using `@code_warntype`:
-```@setup uniondict
-include("ecosystems/animal_S_world_DictUnion/Ecosystem.jl")
-
-function make_counter()
-    n = 0
-    counter() = n += 1
-end
-
-function create_world()
-    n_grass = 1_000
-    n_sheep = 40
-    n_wolves = 4
-
-    nextid = make_counter()
-
-    World(vcat(
-        [Grass(nextid()) for _ in 1:n_grass],
-        [Sheep(nextid()) for _ in 1:n_sheep],
-        [Wolf(nextid()) for _ in 1:n_wolves],
-    ))
-end
-world = create_world();
-```
-```@example uniondict
-using InteractiveUtils # hide
-w = Wolf(4000)
-find_food(w, world)
-@code_warntype find_food(w, world)
-```
-
 ---
 
 Julia still has to perform runtime dispatch on the small `Union` of `Agent`s that is in our dictionary.
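
Editor's sketch (not part of the commit): the difference between an abstractly typed container and a small `Union` container can be tried in isolation. The types `Agent`, `Grass`, `Sheep` and the helper `first_id` below are hypothetical stand-ins, not the definitions from the lab's `Ecosystem.jl`, and the exact `@code_warntype` output may vary between Julia versions.

```julia
using InteractiveUtils  # provides @code_warntype outside the REPL

# Hypothetical stand-in types, not the lab's Ecosystem.jl definitions.
abstract type Agent end

struct Grass <: Agent
    id::Int
end

struct Sheep <: Agent
    id::Int
end

# Reads a field from the first element; what the compiler can infer here
# depends on the element type of the container.
first_id(agents) = first(agents).id

abstract_agents = Agent[Grass(1), Sheep(2)]              # Vector{Agent}
union_agents = Union{Grass,Sheep}[Grass(1), Sheep(2)]    # Vector{Union{Grass, Sheep}}

# Compare the inferred return type reported for each call.
@code_warntype first_id(abstract_agents)
@code_warntype first_id(union_agents)
```

With the abstract element type the compiler only knows that it holds some `Agent`, while the small `Union` lists every possible concrete type, which is what lets it generate a branch per type.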

docs/src/lecture_05/lecture.md

Lines changed: 1 addition & 1 deletion
@@ -228,7 +228,7 @@ BenchmarkTools.Trial: 11 samples with 1 evaluation.
 
 We can see that we have approximately 3-fold improvement.
 
-Let's profile again, not forgetting to use `Profile.clear()` to clear already stored probes.
+Let's profile again.
 ```
 prof = @profview g2(p,n)
 ProfileCanvas.html_file("profiles/profile2.html", prof)
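
Editor's sketch (not part of the commit): the removed sentence referred to `Profile.clear()`, and a comment elsewhere in this diff notes that `@profview` "clears the profile for us". When using the standard-library `Profile` module directly, clearing is a manual step. The workload `busy_work` below is a hypothetical placeholder, not the lecture's `g2`.

```julia
using Profile

# Hypothetical workload, standing in for the lecture's g2(p, n).
function busy_work(n)
    s = 0.0
    for i in 1:n
        s += sin(i)
    end
    s
end

busy_work(10)                # run once so compilation is not part of the profile
Profile.clear()              # discard samples stored by earlier profiling runs
@profile busy_work(10^7)     # collect fresh samples
Profile.print(maxdepth = 8)  # plain-text view of the collected call tree
```

Without the `Profile.clear()` call, samples from previous `@profile` runs would be mixed into the printed tree.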
