
Commit 759b5a6

admonitions
1 parent 6e3631d commit 759b5a6

File tree

3 files changed (+324, -381 lines)


docs/src/lecture_05/lecture.md

Lines changed: 98 additions & 97 deletions
@@ -795,6 +795,104 @@ BenchmarkTools.Trial: 2440 samples with 1 evaluation.
```
By not checking the bounds, we bring the speed close to the version based on matrix multiplication, while keeping the memory requirements small (further speedup can be achieved using threading).

## NamedTuples are more efficient than Dicts
It happens a lot in scientific code that an experiment has many parameters. It is therefore very convenient to store them in a `Dict`, so that when we add a new parameter, we do not have to go over all defined functions and change their signatures.

Imagine that we have a (nonsensical) simulation like
```julia
settings = Dict(:stepsize => 0.01, :h => 0.001, :iters => 500, :info => "info")
function find_min!(f, x, p)
    for i in 1:p[:iters]
        x̃ = x + p[:h]
        fx = f(x)                                   # line 4
        x -= p[:stepsize] * (f(x̃) - fx) / p[:h]     # line 5
    end
    x
end
```
Notice that the parameter `p` is a `Dict` and that it can contain arbitrary parameters, which is useful. Hence, a `Dict` is a handy way of passing parameters.
Let's now run the function through the profiler
```julia
x₀ = rand()
f(x) = x^2
prof = @profview find_min!(f, x₀, settings)
ProfileCanvas.html_file("profiles/profile6.html", prof)
```
From the profiler's output [here](profiles/profile6.html) we can see some type instabilities. Where do they come from?
The compiler does not have any information about the types stored in `settings`, since the type of the stored values is `Any` (caused by storing values of different types: `Float64`, `Int`, and `String`).
```julia
julia> typeof(settings)
Dict{Symbol, Any}
```
The second problem is that the lookup in a dictionary is a relatively expensive operation (although technically it is O(1)), because the key has to be hashed and searched for in the table. Moreover, `Dict`s are designed as mutable containers, which is not needed in our use-case, as the settings are static. For such use-cases, Julia offers `NamedTuple`, with which we can construct the settings as
```julia
nt_settings = (; stepsize = 0.01, h = 0.001, iters = 500, info = "info")
```
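Note that a `NamedTuple` can also be indexed with a `Symbol`, so the `find_min!` above, which accesses `p[:iters]`, runs unchanged with either container; with a literal key the lookup resolves at compile time.
```julia
nt_settings[:iters]   # 500 — same as nt_settings.iters, so find_min! works without changes
```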
The `NamedTuple` is fully typed, by which we mean that the names of the fields as well as their types are part of the `NamedTuple`'s type. You can think of it as a struct. Moreover, when accessing a field of a `NamedTuple`, the compiler knows precisely where it is located in memory, which drastically reduces the access time.
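To see this, inspect the type itself (the exact printing may differ slightly between Julia versions):
```julia
julia> typeof(nt_settings)
NamedTuple{(:stepsize, :h, :iters, :info), Tuple{Float64, Float64, Int64, String}}
```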
Let's see the effect in `BenchmarkTools`.
```julia
julia> @benchmark find_min!(x -> x^2, x₀, settings)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   86.350 μs …   4.814 ms  ┊ GC (min … max): 0.00% … 97.61%
 Time  (median):      90.747 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   102.405 μs ± 127.653 μs  ┊ GC (mean ± σ):  4.69% ±  3.75%

  ▅██▆▂ ▁▁ ▁                                                    ▂
  ███████▇▇████▇███▇█▇████▇▇▆▆▇▆▇▇▇▆▆▆▆▇▆▇▇▅▇▆▆▆▆▄▅▅▄▅▆▆▅▄▅▃▅▃▅ █
  86.4 μs       Histogram: log(frequency) by time        209 μs <

 Memory estimate: 70.36 KiB, allocs estimate: 4002.

julia> @benchmark find_min!(x -> x^2, x₀, nt_settings)
BenchmarkTools.Trial: 10000 samples with 7 evaluations.
 Range (min … max):  4.179 μs … 21.306 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.188 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.493 μs ±  1.135 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▃▁ ▁  ▁ ▁                                                 ▁
  ████▇████▄██▄█▃██▄▄▇▇▇▇▅▆▆▅▄▄▅▄▅▅▅▄▁▅▄▁▄▄▆▆▇▄▅▆▄▄▃▄▆▅▆▁▄▄▄ █
  4.18 μs      Histogram: log(frequency) by time      10.8 μs <

 Memory estimate: 16 bytes, allocs estimate: 1.
```

Checking the output with JET, there is no type instability anymore
```julia
@report_opt find_min!(f, x₀, nt_settings)
No errors !
```
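If JET is not at hand, the built-in `@code_warntype` macro gives a similar, if coarser, check: the `Dict` version shows `Any`-typed values, while the `NamedTuple` version infers concrete types everywhere.
```julia
# Compare the types inferred for the two parameter containers.
@code_warntype find_min!(f, x₀, settings)     # Dict: values are inferred as `Any`
@code_warntype find_min!(f, x₀, nt_settings)  # NamedTuple: all types are concrete
```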
## Don't use IO unless you have to
- debug printing in performance-critical code should be kept to a minimum, or routed through an in-memory or file-based logger from the `Logging` stdlib (a `Logging` sketch follows the code block below)
```julia
function find_min!(f, x, p; verbose=true)
    for i in 1:p[:iters]
        x̃ = x + p[:h]
        fx = f(x)
        x -= p[:stepsize] * (f(x̃) - fx) / p[:h]
        verbose && println("x = ", x, " | f(x) = ", fx)
    end
    x
end

@btime find_min!($f, $x₀, $nt_settings; verbose=true)
@btime find_min!($f, $x₀, $nt_settings; verbose=false)
```
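A minimal sketch of the `Logging`-based alternative mentioned above (the name `find_min_logged!` is hypothetical, and any concrete logger setup will do): `@debug` records are skipped cheaply when the debug level is disabled, and a `SimpleLogger` can write them to an in-memory buffer or a file instead of `stdout`.
```julia
using Logging

function find_min_logged!(f, x, p)
    for i in 1:p[:iters]
        x̃ = x + p[:h]
        fx = f(x)
        x -= p[:stepsize] * (f(x̃) - fx) / p[:h]
        @debug "find_min!" i x fx    # only materialized when a Debug-level logger is active
    end
    x
end

# Collect the records in memory (or pass an open file handle) instead of printing to stdout.
io = IOBuffer()
with_logger(SimpleLogger(io, Logging.Debug)) do
    find_min_logged!(f, x₀, nt_settings)
end
```
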
- interpolation of strings is even worse, see [Avoid string interpolation for I/O](https://docs.julialang.org/en/v1/manual/performance-tips/#Avoid-string-interpolation-for-I/O)
```julia
function find_min!(f, x, p; verbose=true)
    for i in 1:p[:iters]
        x̃ = x + p[:h]
        fx = f(x)
        x -= p[:stepsize] * (f(x̃) - fx) / p[:h]
        verbose && println("x = $x | f(x) = $fx")
    end
    x
end
@btime find_min!($f, $x₀, $nt_settings; verbose=true)
```
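The rule from the linked manual section, applied to the print statement above: pass the pieces as separate arguments instead of building an interpolated `String` first.
```julia
verbose && println("x = ", x, " | f(x) = ", fx)   # preferred: no intermediate String is built
verbose && println("x = $x | f(x) = $fx")         # worse: interpolation allocates a String first
```
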
## Boxing in closure
Recall that a closure is a function which captures some variables from the scope in which it was defined.

@@ -954,100 +1052,3 @@ No errors !

So when you use closures, you should be careful about accidental boxing, since it can slow down your code. **This is a big deal in multithreading and in automatic differentiation**, both of which heavily use closures. You can track the discussion [here](https://github.com/JuliaLang/julia/issues/15276).
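A minimal sketch of what accidental boxing looks like, together with the usual `let`-rebinding workaround (a toy example in the spirit of that issue, not taken from the lecture):
```julia
function boxed()
    x = 1
    closure = () -> x      # `x` is captured here…
    x = 2                  # …and reassigned afterwards, so it is stored in a `Core.Box` (typed `Any`)
    closure
end

function unboxed()
    x = 1
    closure = let x = x    # rebind: the closure captures a fresh `x` that is never reassigned
        () -> x
    end
    x = 2
    closure
end
```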
