@ChrisRackauckas
Member

Continues #168

@mcabbott

I didn't have write permissions for some reason, so I'm continuing it here.

src/zygote.jl Outdated
Comment on lines 46 to 52
# Define a new species of projection operator for this type:
ChainRulesCore.ProjectTo(x::VectorOfArray) = ChainRulesCore.ProjectTo{VectorOfArray}()

# Gradient from iteration will be e.g. Vector{Vector}, this makes it another AbstractMatrix
(::ChainRulesCore.ProjectTo{VectorOfArray})(dx::AbstractVector{<:AbstractArray}) = VectorOfArray(dx)
# Gradient from broadcasting will be another AbstractArray
(::ChainRulesCore.ProjectTo{VectorOfArray})(dx::AbstractArray) = dx
Contributor

I believe this may not be necessary at all. One thing I wanted to test was whether iteration like this works without it, and it does, since it hits the @adjoint getindex rule:

julia> function iter(vofa)
       s = 0
       for a in vofa
         s += prod(a)
       end
       s
       end;

julia> gradient(iter, va)[1]
VectorOfArray{Float64,3}:
2-element Vector{Matrix{Float64}}:
 [0.007377548303139424 0.0014293014720444096 0.0004998127128840348; 0.0005414269337139141 0.0007721441834498009 0.0006559506948612249; 0.0042378737935180105 0.0006765914005991947 0.00045986425172967415]
 [0.002774992305796606 0.002978041675310144 0.004412709924140469; 0.0056425205408066285 0.005228088118453952 0.0036646150274027; 0.0036825199036535465 0.004902176341789764 0.045170987413739046]
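The gradient printed above can be sanity-checked without Zygote or RecursiveArrayTools. The sketch below is an illustration only: it replaces the VectorOfArray with a plain Vector of matrices, uses made-up test data (the `va` from the comment is not shown), and introduces the hypothetical helpers `analytic_gradient` and `fd_gradient`. It relies on the fact that each entry of the gradient of prod(a) is prod(a) divided by that entry, and compares this against central finite differences.

```julia
# Same loop as in the comment above, applied to a plain Vector of matrices.
function iter(arrays)
    s = 0.0
    for a in arrays
        s += prod(a)
    end
    return s
end

# Analytic gradient: d prod(a) / d a[k] = prod(a) / a[k], valid when no entry is zero.
analytic_gradient(arrays) = [prod(a) ./ a for a in arrays]

# Central finite differences, perturbing one entry at a time.
function fd_gradient(f, arrays; h = 1e-6)
    grads = [zero(a) for a in arrays]
    for (n, a) in enumerate(arrays), k in eachindex(a)
        old = a[k]
        a[k] = old + h
        fplus = f(arrays)
        a[k] = old - h
        fminus = f(arrays)
        a[k] = old            # restore the original entry
        grads[n][k] = (fplus - fminus) / (2h)
    end
    return grads
end

# Hypothetical test data standing in for `va`.
arrays = [collect(reshape(range(0.5, 1.3; length = 9), 3, 3)),
          collect(reshape(range(0.8, 1.6; length = 9), 3, 3))]

an = analytic_gradient(arrays)
fd = fd_gradient(iter, arrays)

# Largest entrywise discrepancy between the two gradients.
maxerr = maximum(maximum(abs, g .- a) for (g, a) in zip(fd, an))
println("max abs difference: ", maxerr)
```

If an AD result for the same function has this structure (one matrix per inner array, each entry equal to the product of that array divided by the entry), it agrees with the analytic gradient.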

src/zygote.jl Outdated

ZygoteRules.@adjoint function getindex(VA::AbstractVectorOfArray, i::Union{Int,AbstractArray{Int},CartesianIndex,Colon,BitArray,AbstractArray{Bool}})
    function AbstractVectorOfArray_getindex_adjoint(Δ)
        Δ′ = [ (i == j ? Δ : zero(x)) for (x,j) in zip(VA.u, 1:length(VA))]
Contributor

Unrelated to the PR, but this seems to allocate quite a bit when iterating a VectorOfArray. I guess that using Fill(0.0, size(Δ)) would often make Δ′ have an abstract type?

Member Author

Yes, it would make it an abstract type and sometimes hurt inference. Then we'd have to rely on union optimizations and pray.

Member Author

Relying on union optimizations might be the right idea here, though. I'll have to check.
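The trade-off in this exchange can be illustrated in plain Julia. This is a rough sketch under stated assumptions, not the package's code: `ConstFill` is a hypothetical minimal stand-in for FillArrays' Fill, and `us` stands in for VA.u. It builds Δ′ both ways and inspects the element type of the resulting vector.

```julia
# Hypothetical lazy constant array, standing in for FillArrays.Fill.
struct ConstFill{T,N} <: AbstractArray{T,N}
    value::T
    dims::NTuple{N,Int}
end
Base.size(A::ConstFill) = A.dims
Base.getindex(A::ConstFill{T,N}, I::Vararg{Int,N}) where {T,N} = A.value

us = [rand(2, 2), rand(2, 2), rand(2, 2)]   # stand-in for VA.u
Δ = ones(2, 2)                              # incoming cotangent for one slice
i = 2                                       # index that was read by getindex

# Dense version, as in the adjoint: allocates a zero matrix for every other slot.
dense = [(i == j ? Δ : zero(x)) for (j, x) in enumerate(us)]

# Lazy version: no zero matrices are allocated, but the entries have mixed types.
lazy = [(i == j ? Δ : ConstFill(0.0, size(x))) for (j, x) in enumerate(us)]

println("dense eltype: ", eltype(dense), ", concrete: ", isconcretetype(eltype(dense)))
println("lazy eltype:  ", eltype(lazy), ", concrete: ", isconcretetype(eltype(lazy)))
```

The dense version keeps a concrete element type at the cost of allocating a zero array per slot. The lazy version avoids those allocations but yields a vector whose element type is not concrete, which is the inference concern raised above.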

ChrisRackauckas and others added 3 commits September 30, 2021 01:40
Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
@ChrisRackauckas merged commit 1d94f35 into master Sep 30, 2021
@ChrisRackauckas deleted the grad branch September 30, 2021 11:18