Hi,
I am trying to run the final training step to generate camel_train.pklz (the final weights), which should work on any video.
I am using this command, as per the docs:
uv run tracklab -cn cameltrack_train dataset=dancetrack
But I am getting the following error:
Error executing job with overrides: ['dataset=dancetrack']
Traceback (most recent call last):
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/trainer/call.py", line 48, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 599, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 988, in _run
self.strategy.setup(self)
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/strategies/strategy.py", line 159, in setup
self.setup_optimizers(trainer)
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/strategies/strategy.py", line 139, in setup_optimizers
self.optimizers, self.lr_scheduler_configs = _init_optimizers_and_lr_schedulers(self.lightning_module)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/core/optimizer.py", line 180, in _init_optimizers_and_lr_schedulers
optim_conf = call._call_lightning_module_hook(model.trainer, "configure_optimizers", pl_module=model)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/trainer/call.py", line 176, in _call_lightning_module_hook
output = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/cameltrack/camel.py", line 356, in configure_optimizers
num_warmup_steps=self.trainer.estimated_stepping_batches // 20,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 1707, in estimated_stepping_batches
self.fit_loop.setup_data()
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/loops/fit_loop.py", line 275, in setup_data
iter(self._data_fetcher) # creates the iterator inside the fetcher
^^^^^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/loops/fetchers.py", line 105, in __iter__
super().__iter__()
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/loops/fetchers.py", line 52, in __iter__
self.iterator = iter(self.combined_loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/utilities/combined_loader.py", line 351, in __iter__
iter(iterator)
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/utilities/combined_loader.py", line 92, in __iter__
super().__iter__()
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/utilities/combined_loader.py", line 43, in __iter__
self.iterators = [iter(iterable) for iterable in self.iterables]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/utilities/combined_loader.py", line 43, in <listcomp>
self.iterators = [iter(iterable) for iterable in self.iterables]
^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 491, in __iter__
return self._get_iterator()
^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 422, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1199, in __init__
self._reset(loader, first_iter=True)
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1236, in _reset
self._try_put_index()
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1486, in _try_put_index
index = self._next_index()
^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 698, in _next_index
return next(self._sampler_iter) # may raise StopIteration
^^^^^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/cameltrack/train/sampler.py", line 53, in __iter__
yield from batched(self.sample_generator(), self.batch_size)
File "/efs/notebook/yash/CAMELTrack/cameltrack/train/sampler.py", line 311, in batched
batch = tuple(islice(it, n))
^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/cameltrack/train/sampler.py", line 35, in sample_generator
key = self.rng.choice(range(len(samplers)), p=probs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "numpy/random/_generator.pyx", line 824, in numpy.random._generator.Generator.choice
ValueError: probabilities contain NaN
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/efs/notebook/yash/CAMELTrack/.venv/bin/tracklab", line 10, in <module>
sys.exit(main())
^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/hydra/main.py", line 94, in decorated_main
_run_hydra(
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
_run_app(
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/hydra/_internal/utils.py", line 457, in _run_app
run_and_report(
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
raise ex
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
return func()
^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
lambda: hydra.run(
^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/hydra/_internal/hydra.py", line 132, in run
_ = ret.return_value
^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
^^^^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/tracklab/main.py", line 47, in main
module.train(tracking_dataset, pipeline, evaluator, OmegaConf.to_container(cfg.dataset, resolve=True))
File "/efs/notebook/yash/CAMELTrack/cameltrack/cameltrack.py", line 330, in train
trainer.fit(self.CAMEL, self.datamodule, ckpt_path=ckpt_path)
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 561, in fit
call._call_and_handle_interrupt(
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/trainer/call.py", line 69, in _call_and_handle_interrupt
trainer._teardown()
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 1039, in _teardown
loop.teardown()
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/loops/fit_loop.py", line 502, in teardown
self._data_fetcher.teardown()
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/loops/fetchers.py", line 80, in teardown
self.reset()
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/loops/fetchers.py", line 142, in reset
super().reset()
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/loops/fetchers.py", line 76, in reset
self.length = sized_len(self.combined_loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/lightning_fabric/utilities/data.py", line 52, in sized_len
length = len(dataloader) # type: ignore [arg-type]
^^^^^^^^^^^^^^^
File "/efs/notebook/yash/CAMELTrack/.venv/lib/python3.11/site-packages/pytorch_lightning/utilities/combined_loader.py", line 358, in __len__
raise RuntimeError("Please call `iter(combined_loader)` first.")
RuntimeError: Please call `iter(combined_loader)` first.
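For reference, the first exception (the ValueError) can be reproduced in isolation. This is only a minimal sketch, not the actual code in cameltrack/train/sampler.py: it assumes that `probs` is obtained by dividing per-sampler weights by their sum, so that all-zero weights (for example, samplers with no usable training samples) give 0/0 = NaN. The `normalize` helper and the example weights are hypothetical.

```python
import numpy as np


def normalize(weights):
    """Hypothetical helper: turn non-negative weights into probabilities."""
    weights = np.asarray(weights, dtype=float)
    # Silence the 0/0 warning so the NaN result is easy to see.
    with np.errstate(invalid="ignore", divide="ignore"):
        return weights / weights.sum()


rng = np.random.default_rng(0)

# Non-empty samplers: valid probabilities, choice() works.
probs_ok = normalize([3, 1])
key = rng.choice(range(len(probs_ok)), p=probs_ok)

# All weights zero (assumed failure mode): 0/0 gives NaN probabilities.
probs_bad = normalize([0, 0])
try:
    rng.choice(range(len(probs_bad)), p=probs_bad)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(probs_bad, error_message)
```

If this assumption holds, the RuntimeError in the second traceback is only a side effect raised during teardown, and the NaN probabilities are the error to investigate.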
I have done the setup correctly and have downloaded all three datasets (test, val and train). All three checkpoint files are also available in the states directory.
Here are my directory structure and cameltrack_train.yaml:
cameltrack_train.yaml:
defaults:
  - cameltrack
  - override dataset: dancetrack
  - _self_
pipeline:
  - track
use_wandb: false
wandb:
  mode: disabled
state:
  load_file: "${dataset.dataset_path}/states/dancetrack-${dataset.eval_set}.pklz"
  save_file: null
  load_from_public_dets: true
modules:
  track:
    training_enabled: true
    use_wandb: false
    wandb:
      mode: disabled
    # Generate the training dataset setup: compile tracklets in a pickle file for each split
    datamodule_cfg:
      name: "camel" # Name for the pickle files containing the training tracklets (will be appended with the split name)
      path: "${dataset.dataset_path}/states/camel_training" # Where to store those training states
      tracker_states:
        train: "${dataset.dataset_path}/states/dancetrack-train.pklz" # Update this path to your states
        val: "${dataset.dataset_path}/states/dancetrack-val.pklz" # Update this path to your states
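As a sanity check on the tracker_states paths above, a small script can confirm that each file resolves to a non-empty file on disk. This is only a sketch: `DATASET_PATH` is a placeholder for whatever `${dataset.dataset_path}` resolves to on your machine, and `check_state_files` is a hypothetical helper, not part of tracklab or CAMELTrack.

```python
from pathlib import Path

# Placeholder: replace with the resolved value of ${dataset.dataset_path}.
DATASET_PATH = "data/DanceTrack"

# Relative paths taken from the tracker_states entries in the config.
STATE_FILES = [
    "states/dancetrack-train.pklz",
    "states/dancetrack-val.pklz",
]


def check_state_files(dataset_path, relative_files):
    """Map each relative path to its size in bytes, or None if it is missing."""
    root = Path(dataset_path)
    report = {}
    for rel in relative_files:
        candidate = root / rel
        report[rel] = candidate.stat().st_size if candidate.is_file() else None
    return report


for rel, size in check_state_files(DATASET_PATH, STATE_FILES).items():
    status = "MISSING" if size is None else f"{size} bytes"
    print(f"{rel}: {status}")
```

A missing or zero-byte file here would point to a path problem rather than a training problem. The check says nothing about whether the contents of the files are valid.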
I am quite close to running my first-ever training pipeline, so any help is appreciated.
Thanks