cu(; unified=true) doesn't use unified memory with sparse matrices #2974

@matteosecli

Describe the bug

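Calling cu(A; unified=true) on a sparse matrix ignores the unified keyword: the resulting GPU sparse matrix stores its data in device memory instead of unified memory (checked below on the rowVal buffer).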

To reproduce

using CUDA
using CUDA.CUSPARSE
using LinearAlgebra
using SparseArrays

A = sprand(ComplexF64, 1000, 1000, 0.01)
dA = cu(A; unified=true)
println(typeof(dA.rowVal))

This prints CuArray{Int32, 1, CUDA.DeviceMemory}. @albertomercurio has run into the same bug.

Expected behavior

I expected the output to be CuArray{Int32, 1, CUDA.UnifiedMemory}.
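
To narrow the report down, it may help to check all three buffers of the sparse matrix rather than rowVal alone, and to compare them with a dense array converted the same way. The following is a minimal diagnostic sketch, not a fix. It assumes dA is a CuSparseMatrixCSC with fields colPtr, rowVal and nzVal (rowVal is the only one used in the reproducer above), and that the dense cu path honors unified=true as described in the CUDA.jl memory documentation. No output is shown because it has not been run on the reporter's system.

```julia
using CUDA
using CUDA.CUSPARSE
using SparseArrays

A = sprand(ComplexF64, 1000, 1000, 0.01)
dA = cu(A; unified=true)

# Dense reference: same keyword, but on a plain Vector.
dv = cu(rand(Float32, 10); unified=true)
println("dense: ", typeof(dv))

# Inspect every buffer backing the sparse matrix.
# Assumption: dA is a CuSparseMatrixCSC with these three fields.
for name in (:colPtr, :rowVal, :nzVal)
    println(name, ": ", typeof(getfield(dA, name)))
end
```

If the dense line reports CUDA.UnifiedMemory while the three sparse buffers report CUDA.DeviceMemory, the keyword is being dropped specifically in the sparse conversion path.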

Version info

Details on Julia:

Julia Version 1.12.1
Commit ba1e628ee49 (2025-10-17 13:02 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: Linux (aarch64-linux-gnu)
  CPU: 288 × unknown
  WORD_SIZE: 64
  LLVM: libLLVM-18.1.7 (ORCJIT, neoverse-v2)
  GC: Built with stock GC
Threads: 288 default, 1 interactive, 288 GC (on 288 virtual cores)
Environment:
  JULIA_NUM_THREADS =
  JULIA_VSCODE_REPL = 1
  JULIA_EDITOR = code
  LD_LIBRARY_PATH = /capstor/store/cscs/2go/go54/nvhpc/Linux_aarch64/25.3/math_libs/nvpl/lib/:
  JULIA_CUDA_MEMORY_POOL = none
  JULIA_DEPOT_PATH = /capstor/scratch/cscs/msecl/gh200/juliaup/depot
  JULIA_ADIOS2_PATH = /user-environment/linux-sles15-neoverse_v2/gcc-13.3.0/adios2-2.10.2-uoe2ctr7j34tm7oed7sc4b6qr7g6a2ei
  JULIA_LOAD_PATH = :/user-environment/juhpc_setup/julia_preferences
  JULIA_PROJECT = XXXX

Details on CUDA:

CUDA toolchain:
- runtime 12.9, artifact installation
- driver 550.54.15 for 13.0
- compiler 12.9

CUDA libraries:
- CUBLAS: 12.9.1
- CURAND: 10.3.10
- CUFFT: 11.4.1
- CUSOLVER: 11.7.5
- CUSPARSE: 12.5.10
- CUPTI: 2025.2.1 (API 12.9.1)
- NVML: 12.0.0+550.54.15

Julia packages:
- CUDA: 5.9.4
- CUDA_Driver_jll: 13.0.2+0
- CUDA_Compiler_jll: 0.3.0+0
- CUDA_Runtime_jll: 0.19.2+0

Toolchain:
- Julia: 1.12.1
- LLVM: 18.1.7

Environment:
- JULIA_CUDA_MEMORY_POOL: none

Preferences:
- CUDA_Runtime_jll.version: 12.9

4 devices:
  0: NVIDIA GH200 120GB (sm_90, 93.326 GiB / 95.577 GiB available)
  1: NVIDIA GH200 120GB (sm_90, 93.873 GiB / 95.577 GiB available)
  2: NVIDIA GH200 120GB (sm_90, 93.886 GiB / 95.577 GiB available)
  3: NVIDIA GH200 120GB (sm_90, 93.881 GiB / 95.577 GiB available)
