Description
Let's say you want to start multiple GPU instances of a Python Triton model. How do you do it? Short answer: I think it is currently not possible.
Example:
In Triton terms, let's assume this is our desired configuration:
instance_group [
{ kind: KIND_GPU gpus: [0] count: 1 },
{ kind: KIND_GPU gpus: [3] count: 2 }
]
As written in the docs, the Python backend does not use this setting; you have to set devices in your model.py yourself, which would be fine if it worked. With the config above in your config.pbtxt, Triton spawns three processes, and the logs look like this:
I1202 16:02:04.802832 240 python_be.cc:2289] "TRITONBACKEND_ModelInstanceInitialize: pyannote_0_0 (GPU device 0)"
I1202 16:02:04.803008 240 python_be.cc:2289] "TRITONBACKEND_ModelInstanceInitialize: pyannote_1_0 (GPU device 3)"
I1202 16:02:04.803094 240 python_be.cc:2289] "TRITONBACKEND_ModelInstanceInitialize: pyannote_1_1 (GPU device 3)"
However, how am I supposed to set the correct GPU ID in my model.py now? args["model_config"] just returns my global configuration; it seems I have no way of finding out which instance I am currently in.
Did I overlook anything? There should be a way to configure this, right?
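To make the problem concrete, here is a minimal sketch of what an instance sees in its initialize(). The args dict below is a simplified stand-in for what Triton passes in; the point is that parsing the global model config exposes all GPU groups but nothing that identifies the current instance:

```python
import json

# Simplified stand-in for the args dict the Python backend passes
# to initialize(); the real dict carries the same global config.
args = {
    "model_config": json.dumps({
        "name": "pyannote",
        "instance_group": [
            {"kind": "KIND_GPU", "gpus": [0], "count": 1},
            {"kind": "KIND_GPU", "gpus": [3], "count": 2},
        ],
    })
}

config = json.loads(args["model_config"])
# Both GPU groups are visible, but nothing here says which one
# *this* instance belongs to:
gpus = [g for group in config["instance_group"] for g in group["gpus"]]
print(gpus)  # [0, 3] -- the global list, not this instance's device
```

Every one of the three instances gets the exact same config, so there is nothing to branch on.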
Triton Information
25.11 Container
To Reproduce
This is code-agnostic; any Python model will run into this problem if you want to assign a specific GPU.
Expected behavior
I would like something like an environment variable that tells me which instance I am in and which GPU is supposed to be used for this instance (according to the configuration in my config.pbtxt).
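As a sketch of the requested behavior: if Triton exported a per-instance variable, say TRITON_MODEL_INSTANCE_DEVICE_ID (a hypothetical name, it does not exist today), model.py could pin itself to the right GPU like this:

```python
import os

# Hypothetical env var illustrating the feature request; Triton does
# not currently set this. Falling back to device 0 when unset.
device_id = int(os.environ.get("TRITON_MODEL_INSTANCE_DEVICE_ID", "0"))

# The instance could then pin its framework to that device,
# e.g. with PyTorch: torch.cuda.set_device(device_id)
print(f"cuda:{device_id}")
```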