Hi, I’d like to deploy Qwen3-VL-4B on a Jetson Orin Nano Super.
Could you please advise:
Which inference framework works best on this hardware (e.g., TensorRT-LLM, ONNX Runtime, or Transformers + accelerate)?
Is jetson-inference compatible with Qwen-VL models?
What environment setup (JetPack version, CUDA, quantization, etc.) is recommended to fit the model in the 8 GB of unified memory?
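For context on that last question, here is the rough memory math I did before asking (a sketch only: the 4B parameter count is taken from the model name, the 8 GB figure is the Orin Nano Super dev kit's unified memory, and I'm ignoring activations, the vision encoder's KV cache, and OS overhead):

```python
# Back-of-envelope weight-memory estimate for a ~4B-parameter model
# on the Jetson Orin Nano Super's 8 GB of unified (CPU+GPU) memory.
PARAMS = 4e9          # "4B" from the model name
GIB = 1024 ** 3

for name, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    weights_gib = PARAMS * bytes_per_param / GIB
    print(f"{name}: ~{weights_gib:.1f} GiB for weights alone")

# fp16: ~7.5 GiB  -> weights alone nearly fill the 8 GB, so fp16 seems infeasible
# int8: ~3.7 GiB  -> plausible, with headroom for KV cache and the OS
# int4: ~1.9 GiB  -> comfortable
```

This is why I suspect int8 or int4 quantization is required, but I'd appreciate confirmation from anyone who has actually run a VLM of this size on the board.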
Any example or guidance would be very helpful. Thanks!