The compatibility issues among DPGEN 0.13.2, DeepMD 2.2.10, and MKL 2023. #1837
li-shining started this conversation in General
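Since the title concerns the combination of DPGEN 0.13.2, DeepMD 2.2.10, and MKL 2023, it may help to record the versions actually installed in the environment that produced the log below. This is only a minimal sketch: the helper names `installed_versions` and `format_report` are made up for illustration, and the distribution names in `PACKAGES` are assumed to match how the packages were installed (a conda-installed MKL, for example, may not be visible through Python package metadata and would then show as "not found").

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names; adjust if the environment registers them differently.
PACKAGES = ("dpgen", "deepmd-kit", "dpdispatcher", "tensorflow", "numpy", "mkl")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not registered in this interpreter's package metadata.
            found[name] = None
    return found


def format_report(versions):
    """Render the mapping as one 'name: version' line per package."""
    return "\n".join(
        f"{name}: {ver if ver is not None else 'not found'}"
        for name, ver in versions.items()
    )
```

Running `print(format_report(installed_versions()))` with the interpreter of the `deepmd` conda environment gives a short list that can be pasted next to the log.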
INFO:dpgen:start running
INFO:dpgen:=============================iter.000000==============================
INFO:dpgen:-------------------------iter.000000 task 00--------------------------
INFO:dpgen:-------------------------iter.000000 task 01--------------------------
Traceback (most recent call last):
  File "/gdata01/opt/miniconda3/envs/deepmd/lib/python3.10/site-packages/dpdispatcher/submission.py", line 356, in handle_unexpected_submission_state
    job.handle_unexpected_job_state()
  File "/gdata01/opt/miniconda3/envs/deepmd/lib/python3.10/site-packages/dpdispatcher/submission.py", line 855, in handle_unexpected_job_state
    raise RuntimeError(err_msg)
RuntimeError: job:d1ea028110299b4834db99b4a6da6cf923abbf33 9004 failed 3 times.
Possible remote error message: ==> /gdata54/jmhan/DPGen/test/dpgen-master/dpgen_example/run_node16/work/ad59723500ae5d1649165e081ee9ba9395e9981b/000/train.log <==
ges/deepmd/train/trainer.py:1197: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
options available in V2.
- tf.py_function takes a python function which manipulates tf eager
tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means tf.py_functions can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
- tf.numpy_function maintains the semantics of the deprecated tf.py_func
(it is not differentiable, and manipulates numpy arrays). It drops the
stateful argument making all functions stateful.
DEEPMD INFO average training time: 0.0000 s/batch (exclude first 1000 batches)
DEEPMD INFO finished training
DEEPMD INFO wall time: 0.451 s
corrupted size vs. prev_size in fastbins
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/gdata01/opt/miniconda3/envs/deepmd/bin/dpgen", line 10, in <module>
    sys.exit(main())
  File "/gdata01/opt/miniconda3/envs/deepmd/lib/python3.10/site-packages/dpgen/main.py", line 255, in main
    args.func(args)
  File "/gdata01/opt/miniconda3/envs/deepmd/lib/python3.10/site-packages/dpgen/generator/run.py", line 5474, in gen_run
    run_iter(args.PARAM, args.MACHINE)
  File "/gdata01/opt/miniconda3/envs/deepmd/lib/python3.10/site-packages/dpgen/generator/run.py", line 4805, in run_iter
    run_train(ii, jdata, mdata)
  File "/gdata01/opt/miniconda3/envs/deepmd/lib/python3.10/site-packages/dpgen/generator/run.py", line 724, in run_train
    return run_train_dp(iter_index, jdata, mdata)
  File "/gdata01/opt/miniconda3/envs/deepmd/lib/python3.10/site-packages/dpgen/generator/run.py", line 927, in run_train_dp
    submission.run_submission()
  File "/gdata01/opt/miniconda3/envs/deepmd/lib/python3.10/site-packages/dpdispatcher/submission.py", line 260, in run_submission
    self.handle_unexpected_submission_state()
  File "/gdata01/opt/miniconda3/envs/deepmd/lib/python3.10/site-packages/dpdispatcher/submission.py", line 360, in handle_unexpected_submission_state
    raise RuntimeError(
RuntimeError: Meet errors will handle unexpected submission state.
Debug information: remote_root==/home/jmhan/gdata/DPGen/test/dpgen-master/dpgen_example/run_node16/work/ad59723500ae5d1649165e081ee9ba9395e9981b.
Debug information: submission_hash==ad59723500ae5d1649165e081ee9ba9395e9981b.
Please check error messages above and in remote_root. The submission information is saved in /home/jmhan/.dpdispatcher/submission/ad59723500ae5d1649165e081ee9ba9395e9981b.json.
For furthur actions, run the following command with proper flags: dpdisp submission ad59723500ae5d1649165e081ee9ba9395e9981b
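The message above asks to check the error messages in remote_root. One way to do that in a single pass is to print the tail of every log file under that directory. This is a rough sketch using only the standard library: `tail_logs` and `print_tails` are hypothetical helper names, and the `*.log` pattern is an assumption based on the `000/train.log` file mentioned in the log, so other output files may need a different pattern.

```python
from collections import deque
from pathlib import Path


def tail_logs(root, pattern="*.log", n_lines=20):
    """Return {relative path: last n_lines lines} for each matching file under root."""
    root = Path(root)
    tails = {}
    for path in sorted(root.rglob(pattern)):
        if not path.is_file():
            continue
        with path.open(errors="replace") as handle:
            # A bounded deque keeps only the final n_lines lines in memory.
            last = deque(handle, maxlen=n_lines)
        tails[str(path.relative_to(root))] = [line.rstrip("\n") for line in last]
    return tails


def print_tails(root, pattern="*.log", n_lines=20):
    """Print each tail under a '==> name <==' header, similar to `tail` on several files."""
    for name, lines in tail_logs(root, pattern, n_lines).items():
        print(f"==> {name} <==")
        print("\n".join(lines))
```

Calling `print_tails(remote_root)` with the directory from the "Debug information" line should show the end of each task's `train.log` side by side, which makes it easier to see whether every task ends with the same `corrupted size vs. prev_size in fastbins` message.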