The torch backend compress error #4416

@Jeremy1189

Bug summary

I trained se_atten_v2 and dpa2 models from scratch with DeePMD-kit v3.0.0, and both compressed successfully. However, after fine-tuning a dpa2 model, I distilled a student model from the fine-tuned model. The downloaded student model freezes properly, but it cannot be compressed with the command (dp --pt compress), which I ran following the instructions in the tutorial (https://bohrium.dp.tech/notebooks/16449433825?utm_source=weixin&utm_medium=weixin&utm_campaign=article&utm_term=jc1124&test=aaa).

DeePMD-kit Version

3.0.0

Backend and its version

pytorch

How did you download the software?

Offline packages

Input Files, Running Commands, Error Log, etc.

(base) root@bohrium-12166-1226011:/personal/dpa2_hea/version11/result_distillation/no_data_stat_nbatch/task.0000# dp --pt freeze
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
[2024-11-25 13:30:01,481] DEEPMD INFO DeePMD version: 3.0.0
[2024-11-25 13:30:01,733] DEEPMD WARNING The rcut goes beyond table upper boundary, performing extrapolation.
[2024-11-25 13:30:02,859] DEEPMD INFO Saved frozen model to frozen_model.pth
(base) root@bohrium-12166-1226011:/personal/dpa2_hea/version11/result_distillation/no_data_stat_nbatch/task.0000# dp --pt compress
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
[2024-11-25 13:30:11,496] DEEPMD INFO DeePMD version: 3.0.0
Traceback (most recent call last):
  File "/opt/deepmd-kit/bin/dp", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/deepmd-kit/lib/python3.12/site-packages/deepmd/main.py", line 927, in main
    deepmd_main(args)
  File "/opt/deepmd-kit/lib/python3.12/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 348, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/opt/deepmd-kit/lib/python3.12/site-packages/deepmd/pt/entrypoints/main.py", line 562, in main
    enable_compression(
  File "/opt/deepmd-kit/lib/python3.12/site-packages/deepmd/pt/entrypoints/compress.py", line 41, in enable_compression
    model_def_script = json.loads(saved_model.model_def_script)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/deepmd-kit/lib/python3.12/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/deepmd-kit/lib/python3.12/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/deepmd-kit/lib/python3.12/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
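
The final frames show json.loads failing at character 0, which is what happens when the input string is empty or does not start with a JSON value. One possible reading, which is an assumption and not confirmed by the log, is that the distilled student model was saved with an empty model_def_script. The sketch below uses only the Python standard library to show that an empty string produces this exact message. The helper parse_model_def_script is hypothetical and not part of DeePMD-kit; it only illustrates how such a case could be reported more clearly.

```python
import json


def parse_model_def_script(raw: str) -> dict:
    """Parse a model definition script, failing with a clearer message.

    Hypothetical helper for illustration only; not a DeePMD-kit function.
    """
    # Assumption: an empty or whitespace-only script means the model file
    # carries no training-time model definition to rebuild from.
    if not raw or not raw.strip():
        raise ValueError(
            "model_def_script is empty; the model definition needed for "
            "compression is not stored in this file"
        )
    return json.loads(raw)


# An empty string triggers the same error text seen in the traceback.
try:
    json.loads("")
except json.JSONDecodeError as err:
    message = str(err)
    print(message)
```

If this reading is right, inspecting the model_def_script attribute of the frozen model before calling compress would confirm whether the string is empty.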

Steps to Reproduce

fine-tune -> distill -> freeze (dp --pt freeze) -> compress (dp --pt compress)

Further Information, Files, and Links

No response
