It turns out that the use of a generated function in my MWE for #60348 was superfluous to the effect and distracted from the actual issue: the `invoke` mechanism from #56660 is slow no matter where the code instance came from.

Here is a better MWE:
```julia
using BenchmarkTools
using Compiler
using Core.IR

struct SplitCacheOwner; end

struct SplitCacheInterp <: Compiler.AbstractInterpreter
    world::UInt
    inf_params::Compiler.InferenceParams
    opt_params::Compiler.OptimizationParams
    inf_cache::Vector{Compiler.InferenceResult}
    codegen_cache::IdDict{CodeInstance,CodeInfo}
    function SplitCacheInterp(;
            world::UInt = Base.get_world_counter(),
            inf_params::Compiler.InferenceParams = Compiler.InferenceParams(),
            opt_params::Compiler.OptimizationParams = Compiler.OptimizationParams(),
            inf_cache::Vector{Compiler.InferenceResult} = Compiler.InferenceResult[])
        new(world, inf_params, opt_params, inf_cache, IdDict{CodeInstance,CodeInfo}())
    end
end

Compiler.InferenceParams(interp::SplitCacheInterp) = interp.inf_params
Compiler.OptimizationParams(interp::SplitCacheInterp) = interp.opt_params
Compiler.get_inference_world(interp::SplitCacheInterp) = interp.world
Compiler.get_inference_cache(interp::SplitCacheInterp) = interp.inf_cache
Compiler.cache_owner(::SplitCacheInterp) = SplitCacheOwner()
Compiler.codegen_cache(interp::SplitCacheInterp) = interp.codegen_cache

import Core.OptimizedGenerics.CompilerPlugins: typeinf, typeinf_edge

@eval @noinline typeinf(::SplitCacheOwner, mi::MethodInstance, source_mode::UInt8) =
    Base.invoke_in_world(which(typeinf, Tuple{SplitCacheOwner, MethodInstance, UInt8}).primary_world, Compiler.typeinf_ext_toplevel, SplitCacheInterp(; world=Base.tls_world_age()), mi, source_mode)
```

and then:
```julia
julia> const cinst = let world = Base.get_world_counter()
           sig = Tuple{typeof(sin), Float64}
           method_table = nothing
           mi = @ccall jl_method_lookup_by_tt(sig::Any, world::Csize_t, method_table::Any)::Any
           Compiler.typeinf_ext_toplevel(SplitCacheInterp(; world), mi, Compiler.SOURCE_MODE_ABI)
       end
CodeInstance for MethodInstance for sin(::Float64)
```

```julia
julia> @btime invoke(sin, cinst, x) setup=(x=rand())
  118.912 ns (2 allocations: 32 bytes)
0.803755305547964
```

I'm seeing this on both v1.12 and nightly.
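
For scale, a local comparison sketch (not part of the measurements above; timings will vary by machine and are intentionally not reproduced here) is to benchmark the same call as a plain dynamic call and as a classic signature-based `invoke`, alongside the CodeInstance form:

```julia
# Comparison points, reusing `cinst` from the session above.
@btime sin(x) setup=(x=rand())                          # ordinary call
@btime invoke(sin, Tuple{Float64}, x) setup=(x=rand())  # signature-based invoke
@btime invoke(sin, cinst, x) setup=(x=rand())           # CodeInstance invoke (#56660)
```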
Note that if `x` is a compile-time constant, the compiler is still able to compile away the whole invocation:

```julia
julia> @btime invoke(sin, cinst, 1.0)
  0.991 ns (0 allocations: 0 bytes)
0.8414709848078965
```

but performing the actual `invoke` step at runtime is quite expensive.
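
A quick local check (a sketch, reusing `cinst` from above; the helper name is made up for illustration) to confirm the per-call allocation reported by `@btime` is not a benchmarking artifact:

```julia
# Wrap the call so @allocated does not also count compilation of the expression;
# the second invocation reports the steady-state per-call allocation.
alloc_per_call(x) = @allocated invoke(sin, cinst, x)
alloc_per_call(rand())   # warm-up / compile
alloc_per_call(rand())   # expected to roughly match the 32 bytes reported above
```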