
Python backend: Starting multiple GPU instances is currently not feasible/possible with a global config #8555

@protonicage

Description

Let's say you want to start multiple GPU instances for a Python Triton model. How do you do it? Short answer: I think it is currently not possible.

Example:
In Triton terms, let's assume this is the configuration we want:

instance_group [
  { kind: KIND_GPU gpus: [0] count: 1 },
  { kind: KIND_GPU gpus: [3] count: 2 }
]

As written in the docs, the Python backend does not use this setting; you have to set devices in your model.py yourself, which would be fine if it worked. With the snippet above in your config.pbtxt, Triton spawns three processes and the logs look like this:

I1202 16:02:04.802832 240 python_be.cc:2289] "TRITONBACKEND_ModelInstanceInitialize: pyannote_0_0 (GPU device 0)"
I1202 16:02:04.803008 240 python_be.cc:2289] "TRITONBACKEND_ModelInstanceInitialize: pyannote_1_0 (GPU device 3)"
I1202 16:02:04.803094 240 python_be.cc:2289] "TRITONBACKEND_ModelInstanceInitialize: pyannote_1_1 (GPU device 3)"

However, how am I supposed to set the correct GPU ID in my model.py now? args["model_config"] just returns the global configuration, so I seem to have no way of finding out which instance I am currently in.
Did I overlook anything? There should be a way to configure this, right?
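To illustrate the problem, here is a minimal sketch of what model.py can see (the `instance_groups` helper and the prints are mine, not Triton API): args["model_config"] deserializes to the same global config in every instance, listing all instance groups but nothing tying the current process to one of them.

```python
import json

def instance_groups(model_config_json):
    """Return every instance_group entry from the serialized model config.
    Illustrates the issue: the global config lists all groups, but nothing
    here identifies which group/copy the current process belongs to."""
    config = json.loads(model_config_json)
    return [(g.get("kind"), g.get("gpus"), g.get("count"))
            for g in config.get("instance_group", [])]

class TritonPythonModel:
    def initialize(self, args):
        # args["model_config"] is identical across all three instances.
        for kind, gpus, count in instance_groups(args["model_config"]):
            print(kind, gpus, count)
```

Every one of the three spawned processes would print both groups; none of them learns "you are pyannote_1_1, use GPU 3".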

Triton Information
25.11 Container

To Reproduce
This is model-agnostic: any Python model runs into this problem if you want to assign a GPU per instance.

Expected behavior
I would like something like an environment variable that tells me which instance I am in and which GPU is supposed to be used for this instance (according to the configuration in my config.pbtxt).
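Until there is an official mechanism, one fragile workaround sketch: derive the group index from the instance name the backend assigns (the `<model>_<group>_<copy>` pattern seen in the logs above, which recent backend versions expose to model.py as args["model_instance_name"]) and look the GPU up in the parsed config. The helper below is hypothetical and relies on that naming being stable, which is not a documented contract.

```python
import json
import re

def gpu_for_instance(instance_name, model_config_json):
    """Hypothetical helper: map an instance name like 'pyannote_1_0'
    (as seen in the Triton logs) back to the GPU id configured for its
    instance_group. Assumes the '<model>_<group>_<copy>' naming scheme."""
    m = re.search(r"_(\d+)_(\d+)$", instance_name)
    if m is None:
        raise ValueError(f"unexpected instance name: {instance_name}")
    group_idx, copy_idx = int(m.group(1)), int(m.group(2))
    group = json.loads(model_config_json)["instance_group"][group_idx]
    gpus = group["gpus"]
    # A group may list several GPUs; spread copies across them.
    return gpus[copy_idx % len(gpus)]
```

With the configuration from the example, `gpu_for_instance("pyannote_1_1", args["model_config"])` would resolve to GPU 3, but an explicit per-instance field or env variable would be far more robust than parsing names.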
