TensorFlow creates a single compute CUDA stream, so if my model runs small kernels, I am unable to run multiple kernels concurrently on multiple compute streams.
I was hoping that if I loaded multiple TensorFlow instances of the same model, each would get a different CUDA stream. However, looking at an nsys profile, I still see only one CUDA stream that all my clients' inference calls go to, so I don't get the multi-stream option.
Is this configurable? Is there something that can be done about it?
Thanks,
Eyal
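
One knob that may be relevant here is partitioning the physical GPU into logical devices via `tf.config.set_logical_device_configuration` (TF 2.x). Each logical GPU gets its own executor, which can give separate stream groups per device; whether each logical GPU truly maps to its own CUDA compute stream is an implementation detail worth confirming with nsys. A minimal sketch, assuming TensorFlow 2.4+ and a visible GPU (the memory limits and matmul workload are placeholders):

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
if gpus:
    # Split GPU:0 into two logical GPUs; must be called before the
    # runtime is initialized. Memory limits (MB) are illustrative.
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=2048),
         tf.config.LogicalDeviceConfiguration(memory_limit=2048)],
    )
    logical = tf.config.list_logical_devices("GPU")

    x = tf.random.normal([512, 512])
    # Pin work to each logical GPU so the kernels can potentially overlap.
    with tf.device(logical[0].name):
        out0 = tf.linalg.matmul(x, x)
    with tf.device(logical[1].name):
        out1 = tf.linalg.matmul(x, x)
```

If the profile still shows a single stream after this, the serialization is happening elsewhere (e.g. in the serving layer funneling all requests to one session/device).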