|
58 | 58 | ### Tangent types |
59 | 59 | The types of tangents and cotangents depend on the types of the primals. However, some function arguments have derivatives that we cannot compute or do not need; we represent the tangent of such an argument as `NoTangent`. `ZeroTangent` is used when the tangent is equal to zero. |
60 | 60 |
|
| 61 | +!!! note "Quick rrule recipe (template)" |
| 62 | + 1. Compute the primal output `y = f(args...)`. |
| 63 | + 2. Capture any intermediate values you need for the backward pass. |
| 64 | + 3. Return `(y, pullback)` where `pullback(ȳ)` returns a tuple of tangents for `(f, args...)`. |
| 65 | + 4. Use `NoTangent()` for arguments that are not differentiable and `ZeroTangent()` when the tangent is exactly zero.
| 66 | + |
| 67 | + Minimal template: |
| 68 | + |
| 69 | + ```julia |
| 70 | + # ...existing code... |
| 71 | + function ChainRulesCore.rrule(::typeof(f), arg1::A, arg2::B) where {A,B} |
| 72 | + # 1) primal |
| 73 | + y = f(arg1, arg2) |
| 74 | + |
| 75 | + # 2) capture intermediates if needed |
| 76 | + # e.g. cached = some_intermediate(arg1, arg2) |
| 77 | + |
| 78 | + # 3) define pullback |
| 79 | + function pullback(ȳ) |
| 80 | + # compute cotangents for args; shapes must match original args |
| 81 | + ∂arg1 = ... # same shape/type as arg1 |
| 82 | + ∂arg2 = ... # same shape/type as arg2 |
| 83 | + # first return value corresponds to the function object itself |
| 84 | + return NoTangent(), ∂arg1, ∂arg2 |
| 85 | + end |
| 86 | + |
| 87 | + return y, pullback |
| 88 | + end |
| 89 | + # ...existing code... |
| 90 | + ``` |
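
As a concrete instance of the template, here is a minimal sketch of an `rrule` for a made-up function. The name `scaled_sumsq` and its integer argument `k` are assumptions chosen purely for illustration; they do not come from the lab. Because `k` is an integer, its tangent is returned as `NoTangent()`.

```julia
using ChainRulesCore

# Hypothetical function used only for illustration: k * sum(x .^ 2)
scaled_sumsq(x::AbstractVector, k::Int) = k * sum(abs2, x)

function ChainRulesCore.rrule(::typeof(scaled_sumsq), x::AbstractVector, k::Int)
    # 1) primal
    y = scaled_sumsq(x, k)

    # 2) nothing extra to capture: the pullback closes over x and k

    # 3) pullback: ȳ is a scalar because y is a scalar
    function scaled_sumsq_pullback(ȳ)
        Δ = unthunk(ȳ)          # materialise a possibly lazy cotangent
        x̄ = (2 * k * Δ) .* x    # same shape as x
        # tangents for (function object, x, k); k is not differentiable
        return NoTangent(), x̄, NoTangent()
    end

    return y, scaled_sumsq_pullback
end

# Usage: call the rule directly and seed the pullback with 1.0
y, pb = rrule(scaled_sumsq, [1.0, 2.0, 3.0], 2)
f̄, x̄, k̄ = pb(1.0)
```

Calling the rule by hand like this is a quick way to check that the pullback returns one tangent per input (plus one for the function) before handing the rule to an AD engine.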
| 91 | + |
| 92 | +!!! note "General Hints for rrules" |
| 93 | + - For a plain function, return `NoTangent()` as the first element of the pullback tuple (it is the tangent of the function object itself).
| 94 | + - Use `zero(x)` or `zeros(eltype(x), size(x))` for cotangent buffers that you accumulate into; `similar(x)` preserves type and shape but is uninitialized, so use it only when every element is overwritten.
| 95 | + - Use `.+=` when writing into `x̄` if segments can overlap (this prevents losing accumulated contributions).
| 96 | + - Watch out for shapes: the cotangent `ȳ` passed to the pullback has exactly the same shape as `y`.
| 97 | + - For reductions (sum/maximum), think which inputs share the same contribution and broadcast the cotangent accordingly. |
| 98 | + - For maxima/argmax: decide tie semantics (equal split vs first index) and document your choice. |
| 99 | + - Mutating primals inside pullbacks breaks the purity assumption — avoid it. |
| 100 | + |
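
The hints about zero-initialised buffers and accumulation can be sketched with an indexing function whose indices may repeat. The helper `gather` below is hypothetical and exists only for this illustration. When an index occurs more than once, its cotangent contributions must be added together, which is why the buffer starts at zero and is updated with `+=` rather than overwritten.

```julia
using ChainRulesCore

# Hypothetical helper used only for illustration: select x at idx (indices may repeat)
gather(x::AbstractVector, idx::AbstractVector{Int}) = x[idx]

function ChainRulesCore.rrule(::typeof(gather), x::AbstractVector, idx::AbstractVector{Int})
    y = gather(x, idx)

    function gather_pullback(ȳ)
        Δ = unthunk(ȳ)            # same shape as y, i.e. length(idx)
        x̄ = zero(x)               # zero-initialised, same shape and eltype as x
        for (j, i) in enumerate(idx)
            x̄[i] += Δ[j]          # accumulate: repeated indices add up
        end
        # idx is an integer array, so it has no tangent
        return NoTangent(), x̄, NoTangent()
    end

    return y, gather_pullback
end

# Index 1 is read twice, so it receives two contributions
x = [10.0, 20.0, 30.0]
y, pb = rrule(gather, x, [1, 1, 3])
_, x̄, _ = pb([1.0, 1.0, 1.0])
# x̄ == [2.0, 0.0, 1.0]
```

Had the buffer been created with `similar(x)`, the entry for index 2 (never written) would contain arbitrary memory, and assigning with `=` instead of `+=` would have dropped one of the two contributions to index 1.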
61 | 101 |
|
62 | 102 | !!! warning "Exercise" |
63 | 103 | ```@example lab09 |
|