@@ -34,58 +34,59 @@ When passed to [`probabilities`](@ref) the output depends on the input data type
3434 by using [`CountOccurrences`](@ref). When giving the resulting probabilities to
3535 [`entropy`](@ref), the original permutation entropy is computed [^BandtPompe2002].
3636- **Multivariate data**. If applied to a `D`-dimensional `Dataset`,
37- then no embedding is constructed, and we each vector ``\\bf{x}_i`` of the dataset
38- directly to its permutation pattern ``\\pi_{i}``, ``\\pi_{i}`` by comparing the
37+ then no embedding is constructed, `m` must be equal to `D` and `τ` is ignored.
38+ Each vector ``\\bf{x}_i`` of the dataset is mapped
39+ directly to its permutation pattern ``\\pi_{i}`` by comparing the
3940 relative magnitudes of the elements of ``\\bf{x}_i``.
4041 Like above, probabilities are estimated as the frequencies of the permutation symbols.
41- In this case, `m` is ignored,
42- but `m` must still match the dimension of the dataset for optimization.
4342 The resulting probabilities can be used to compute multivariate permutation
4443 entropy [^He2016], although here we don't perform any further subdivision
4544 of the permutation patterns (as in Figure 3 of [^He2016]).
4645
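The mapping from dataset vectors to permutation patterns described above can be illustrated with a minimal sketch. It uses only Base Julia (`sortperm`, `eachrow`) on a plain matrix whose rows stand in for the vectors of a `Dataset`; the variable names are illustrative, and this is an assumption-laden sketch rather than the package's internal implementation.

```julia
# Sketch only: plain Base Julia, not the internals of `SymbolicPermutation`.
# Each row of `X` plays the role of one 3-dimensional dataset vector.
X = [0.9 0.1 0.5;
     0.2 0.7 0.4;
     0.8 0.3 0.6]

# `sortperm` returns the indices that sort a vector in ascending order,
# which is one way of writing down its ordinal (permutation) pattern.
patterns = [sortperm(collect(row)) for row in eachrow(X)]

# Estimate probabilities as relative frequencies of the observed patterns.
counts = Dict{Vector{Int}, Int}()
for pat in patterns
    counts[pat] = get(counts, pat, 0) + 1
end
probs = Dict(pat => c / length(patterns) for (pat, c) in counts)
```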
4746Internally, [`SymbolicPermutation`](@ref) uses the [`OrdinalPatternEncoding`](@ref)
4847to represent ordinal patterns as integers for efficient computations.
4948
49+ See [`SymbolicWeightedPermutation`](@ref) and [`SymbolicAmplitudeAwarePermutation`](@ref)
50+ for estimators that not only consider ordinal (sorting) patterns, but also incorporate
51+ information about within-state-vector amplitudes.
52+ For a version of this estimator that can be used on spatial data, see
53+ [`SpatialSymbolicPermutation`](@ref).
54+
55+ !!! note "Handling equal values in ordinal patterns"
56+ In Bandt & Pompe (2002), equal values are ordered after their order of appearance, but
57+ this can lead to erroneous temporal correlations, especially for data with
58+ low amplitude resolution [^Zunino2017]. Here, by default, if two values are equal,
59+ then one of them is randomly assigned as "the largest", using
60+ `lt = ComplexityMeasures.isless_rand`.
61+ To get the behaviour from Bandt and Pompe (2002), use `lt = Base.isless`.
62+
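To make the difference between the two tie-handling rules concrete, here is a small sketch in Base Julia. The function `isless_rand_sketch` is a hypothetical stand-in written for this illustration; it is assumed to behave like `ComplexityMeasures.isless_rand`, but it is not taken from the package source.

```julia
# Hypothetical stand-in for random tie-breaking (an assumption for illustration only):
# tied values are ordered by a coin flip, all other values by `isless`.
isless_rand_sketch(a, b) = a == b ? rand(Bool) : isless(a, b)

x = [0.5, 0.5, 0.1]  # the first two values are tied

# `Base.isless` with the default stable sort keeps tied values in order of appearance.
pattern_bp = sortperm(x; lt = Base.isless)

# With random tie-breaking, the relative order of indices 1 and 2 may differ between calls.
pattern_rand = sortperm(x; lt = isless_rand_sketch)
```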
5063## Outcome space
5164
5265The outcome space `Ω` for `SymbolicPermutation` is the set of length-`m` ordinal
53- patterns (i.e. permutations) that can be formed by the integers `1, 2, …, m`,
54- ordered lexicographically. There are `factorial(m)` such patterns.
66+ patterns (i.e. permutations) that can be formed by the integers `1, 2, …, m`.
67+ There are `factorial(m)` such patterns.
5568
56- For example, the outcome `[3, 1, 2]` corresponds to the ordinal pattern of having
57- first the largest value, then the lowest value, and then the value in between.
69+ For example, the outcome `[2, 3, 1]` corresponds to the ordinal pattern of having
70+ the smallest value in the second position, the next smallest value in the third
71+ position, and the largest value in the first position.
72+ See also [`OrdinalPatternEncoding`](@ref).
5873
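As a rough illustration of the outcome space, the sketch below enumerates all permutations of `1, 2, …, m` for `m = 3` and checks the `[2, 3, 1]` example. The helper `all_permutations` is hypothetical and written here in Base Julia only; it is not part of ComplexityMeasures, and the package may represent or order its outcomes differently.

```julia
# Hypothetical helper (not part of ComplexityMeasures): all permutations of `v`.
function all_permutations(v::Vector{Int})
    length(v) <= 1 && return [copy(v)]
    perms = Vector{Vector{Int}}()
    for i in eachindex(v)
        # Fix v[i] as the first element and permute the remaining elements.
        rest = [v[j] for j in eachindex(v) if j != i]
        for p in all_permutations(rest)
            push!(perms, vcat(v[i], p))
        end
    end
    return perms
end

m = 3
Ω = all_permutations(collect(1:m))  # has factorial(m) elements

# Smallest value at index 2, next smallest at index 3, largest at index 1.
x = [0.9, 0.1, 0.5]
pattern = sortperm(x)
```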
5974## In-place symbolization
6075
6176`SymbolicPermutation` also implements the in-place [`probabilities!`](@ref)
62- for `Dataset` input (or embedded vector input).
63- The length of the pre-allocated symbol vector must match the length of the dataset.
77+ for `Dataset` input (or embedded vector input), which reduces allocations in looping scenarios.
78+ The length of the pre-allocated symbol vector must equal the length of the dataset.
6479For example
6580
6681```julia
67- using DelayEmbeddings, ComplexityMeasures
82+ using ComplexityMeasures
6883m, N = 2, 100
6984est = SymbolicPermutation(; m) # `τ` is ignored for `Dataset` input
70- x = Dataset(rand(N, m) # timeseries example
85+ x = Dataset(rand(N, m)) # some input dataset
7186πs_ts = zeros(Int, N) # length must match length of `x`
7287p = probabilities!(πs_ts, est, x)
7388```
7489
75- See [`SymbolicWeightedPermutation`](@ref) and [`SymbolicAmplitudeAwarePermutation`](@ref)
76- for estimators that not only consider ordinal (sorting) patterns, but also incorporate
77- information about within-state-vector amplitudes.
78- For a version of this estimator that can be used on high-dimensional arrays, see
79- [`SpatialSymbolicPermutation`](@ref).
80-
81- !!! note "Handling equal values in ordinal patterns"
82- In Bandt & Pompe (2002), equal values are ordered after their order of appearance, but
83- this can lead to erroneous temporal correlations, especially for data with
84- low amplitude resolution [^Zunino2017]. Here, by default, if two values are equal,
85- then one of the is randomly assigned as "the largest", using
86- `lt = ComplexityMeasures.isless_rand`. To get the behaviour from Bandt and Pompe (2002), use
87- `lt = Base.isless`).
88-
8990[^BandtPompe2002]: Bandt, Christoph, and Bernd Pompe. "Permutation entropy: a natural
9091 complexity measure for timeseries." Physical review letters 88.17 (2002): 174102.
9192[^Zunino2017]: Zunino, L., Olivares, F., Scholkmann, F., & Rosso, O. A. (2017).