-
Notifications
You must be signed in to change notification settings - Fork 582
feat(pd): support dpa2/dpa3 C++ inference #4870
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(pd): support dpa2/dpa3 C++ inference #4870
Conversation
…eepmd-kit into support_comm_2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for DPA2/DPA3 C++ inference with LAMMPS integration for the Paddle framework. The changes implement necessary Paddle operators and build infrastructure to enable message passing and communication patterns required for DPA2/DPA3 models in distributed computing environments.
- Adds custom Paddle operators for border communication (message passing) needed for DPA2/DPA3
- Updates C++ API to support Paddle model loading and inference with communication dictionary
- Fixes various tensor operations and API calls to be compatible with Paddle framework requirements
Reviewed Changes
Copilot reviewed 32 out of 32 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| source/op/pd/setup.py | Setup script for building Paddle C++ extension |
| source/op/pd/comm.cc | Core C++ implementation of border communication operators |
| source/op/pd/CMakeLists.txt | Build configuration for Paddle operators |
| source/install/build_cc_pd.sh | Build script for C++ Paddle inference |
| source/api_cc/src/common.cc | Updates to load Paddle operators and detect Paddle models |
| source/api_cc/src/DeepPotPD.cc | Main Paddle C++ API implementation with communication support |
| source/api_cc/include/DeepPotPD.h | Header updates for communication tensor storage |
| deepmd/pd/utils/stat.py | Statistics computation utilities with type distinction support |
| deepmd/pd/utils/nlist.py | Neighbor list utilities with API compatibility fixes |
| deepmd/pd/model/descriptor/*.py | Various descriptor implementations with communication support |
| deepmd/pd/entrypoints/main.py | Model freezing updates for static graph compilation |
Comments suppressed due to low confidence (3)
source/op/pd/comm.cc:228
- [nitpick] Consider using a named constant or compile-time check instead of runtime sizeof comparison for better performance and clarity.
communicator_tensor, nlocal_tensor, nghost_tensor);
source/op/pd/comm.cc:246
- Incomplete documentation block. The comment should be properly formatted as a standard C++ comment or completed as a proper docblock.
paddle::Tensor& recv_g1_tensor_grad) {
deepmd/pd/utils/stat.py:129
- The function has incomplete documentation. The existing docstring should describe all parameters and the return value, not just the purpose.
def _post_process_stat(
for more information, see https://pre-commit.ci
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughThis update introduces a PaddlePaddle custom operator library ( Changes
Sequence Diagram(s)sequenceDiagram
participant PythonUser
participant deepmd.pd.cxx_op
participant deepmd_op_pd (Custom Op)
participant DescriptorModule
participant PaddleRuntime
PythonUser->>deepmd.pd.cxx_op: import cxx_op
deepmd.pd.cxx_op->>deepmd_op_pd (Custom Op): Try dynamic load ("deepmd_op_pd")
deepmd.pd.cxx_op-->>PythonUser: Expose ENABLE_CUSTOMIZED_OP, paddle_ops_deepmd
PythonUser->>DescriptorModule: Call .forward(..., comm_dict)
DescriptorModule->>deepmd.pd.cxx_op: Use paddle_ops_deepmd.border_op (if enabled)
alt Custom Op enabled
DescriptorModule->>deepmd_op_pd (Custom Op): border_op(sendlist, ... , g1, ...)
deepmd_op_pd (Custom Op)->>PaddleRuntime: MPI communication, tensor exchange
deepmd_op_pd (Custom Op)-->>DescriptorModule: Return processed tensor
else Custom Op not enabled
DescriptorModule-->>PythonUser: Raise NotImplementedError
end
Estimated code review effort🎯 5 (Critical) | ⏱️ ~90 minutes Suggested labels
Suggested reviewers
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
| config->EnableNewIR(true); | ||
| config->EnableCustomPasses({"add_shadow_output_after_dead_parameter_pass"}, | ||
| true); | ||
| // config->SwitchIrOptim(false); |
Check notice
Code scanning / CodeQL
Commented-out code Note
| config_fl->EnableNewIR(true); | ||
| config_fl->EnableCustomPasses({"add_shadow_output_after_dead_parameter_pass"}, | ||
| true); | ||
| // config_fl->SwitchIrOptim(false); |
Check notice
Code scanning / CodeQL
Commented-out code Note
…Sulfate/deepmd-kit into support_dpa2/3_infernce
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
deepmd/pd/model/descriptor/repflows.py (1)
595-673: Comprehensive parallel mode implementation with proper spin supportThis implementation correctly handles both regular and spin modes in parallel execution:
Strengths:
- Proper detection of spin mode via
comm_dictlength- Correct tensor padding and reshaping for both modes
- Appropriate use of the custom
border_opwith CPU-placed integer tensors- Clean separation of real/virtual embeddings in spin mode
- Proper use of
concat_switch_virtualfor spin tensor rearrangementPotential concerns:
- The assertion
assert len(comm_dict) >= 6appears twice (lines 595, 646) - consider consolidating- Complex nested conditionals for spin vs non-spin modes could benefit from helper methods
Consider extracting helper methods to reduce complexity:
def _prepare_node_embedding_for_parallel(self, node_ebd, comm_dict, nloc, nall): """Helper to prepare node embeddings for parallel processing.""" has_spin = len(comm_dict) >= 7 if not has_spin: return self._pad_regular_embedding(node_ebd, nloc, nall), nloc, nall else: return self._prepare_spin_embedding(node_ebd, nloc, nall) def _pad_regular_embedding(self, node_ebd, nloc, nall): """Pad regular (non-spin) node embeddings.""" # Current padding logic here def _prepare_spin_embedding(self, node_ebd, nloc, nall): """Prepare spin mode embeddings with proper splitting and concatenation.""" # Current spin logic here
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
deepmd/pd/model/atomic_model/base_atomic_model.py(13 hunks)deepmd/pd/model/descriptor/repflows.py(8 hunks)deepmd/pd/model/descriptor/repformers.py(8 hunks)source/api_cc/CMakeLists.txt(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- source/api_cc/CMakeLists.txt
- deepmd/pd/model/descriptor/repformers.py
- deepmd/pd/model/atomic_model/base_atomic_model.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-09-19T04:25:12.408Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
📚 Learning: refactoring between test classes `testinferdeeppotdpapt` and `testinferdeeppotdpaptnopbc` is address...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Applied to files:
deepmd/pd/model/descriptor/repflows.py
🧬 Code Graph Analysis (1)
deepmd/pd/model/descriptor/repflows.py (3)
deepmd/pd/utils/spin.py (1)
concat_switch_virtual(6-29)deepmd/pd/model/descriptor/repformers.py (1)
border_op(57-71)deepmd/pd/model/network/network.py (1)
Tensor(30-33)
🪛 Ruff (0.12.2)
deepmd/pd/model/descriptor/repflows.py
449-450: Use a single if statement instead of nested if statements
Combine if statements using and
(SIM102)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Test C++ (true)
- GitHub Check: Test C++ (false)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C library (2.14, >=2.5.0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-macosx_x86_64
🔇 Additional comments (6)
deepmd/pd/model/descriptor/repflows.py (6)
13-16: LGTM: Clean conditional import structureThe import of custom operator components is well-structured and follows the expected pattern for optional dependencies.
41-43: LGTM: Appropriate import for spin functionalityThe import of
concat_switch_virtualis correctly added to support the spin mode functionality implemented later in the file.
58-79: LGTM: Robust fallback mechanism for custom operatorsThe conditional setup provides a clean fallback when custom operators are not available, with clear error messaging that references documentation. The assignment pattern matches other descriptor modules.
446-446: LGTM: Type annotation update aligns with interface changesThe change from
Optional[dict]toOptional[list[paddle.Tensor]]correctly reflects the new communication interface used across descriptor modules.
450-451: LGTM: Appropriate guards for dynamic mode assertionsThe addition of
paddle.in_dynamic_mode()guards around mapping assertions is correct. These assertions are only meaningful in dynamic mode where JIT tracing doesn't apply.Also applies to: 530-531, 576-577, 587-588
457-457: LGTM: Tensor compatibility improvementReplacing scalar
-1withpaddle.full_like(..., -1)ensures proper tensor operations and type consistency. This is a good practice for tensor frameworks.Also applies to: 480-480
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
source/api_cc/src/DeepPotPD.cc (2)
213-213: Remove commented code.This commented line should be removed from production code as flagged by static analysis.
- // config->SwitchIrOptim(false);
222-222: Remove commented code.This commented line should be removed from production code as flagged by static analysis.
- // config_fl->SwitchIrOptim(false);
🧹 Nitpick comments (1)
source/api_cc/src/DeepPotPD.cc (1)
14-165: Consider moving the Logger class to a separate header file.The Logger class is a substantial addition (150+ lines) that provides comprehensive logging functionality. While the implementation looks solid with proper RAII semantics and configurable levels, it would be better placed in a dedicated header file for reusability across the codebase.
Consider extracting this to
source/api_cc/include/Logger.handsource/api_cc/src/Logger.ccfor better modularity and potential reuse in other components.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
deepmd/pd/utils/nlist.py(8 hunks)source/api_cc/src/DeepPotPD.cc(9 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- deepmd/pd/utils/nlist.py
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-09-19T04:25:12.408Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202
Timestamp: 2024-10-16T21:49:57.401Z
Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in `deepmd/dpmodel/atomic_model/base_atomic_model.py`, do not suggest simplifying `for kk in ret_dict.keys()` to `for kk in ret_dict` unless it's relevant to the PR.
📚 Learning: refactoring between test classes `testinferdeeppotdpapt` and `testinferdeeppotdpaptnopbc` is address...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Applied to files:
source/api_cc/src/DeepPotPD.cc
📚 Learning: in `deepmd/dpmodel/model/make_model.py`, the variable `nall` assigned but not used is intentional an...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/model/make_model.py:370-373
Timestamp: 2024-10-16T21:50:10.680Z
Learning: In `deepmd/dpmodel/model/make_model.py`, the variable `nall` assigned but not used is intentional and should not be flagged in future reviews.
Applied to files:
source/api_cc/src/DeepPotPD.cc
📚 Learning: for code blocks in `doc/model/train-fitting-dos.md` that display commands, use 'txt' as the language...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#3875
File: doc/model/train-fitting-dos.md:107-107
Timestamp: 2024-10-08T15:32:11.479Z
Learning: For code blocks in `doc/model/train-fitting-dos.md` that display commands, use 'txt' as the language specification as per user njzjz's preference.
Applied to files:
source/api_cc/src/DeepPotPD.cc
🧬 Code Graph Analysis (1)
source/api_cc/src/DeepPotPD.cc (1)
source/api_cc/src/common.cc (2)
load_op_library(412-439)load_op_library(412-412)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test C++ (false)
- GitHub Check: Test C++ (true)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Build C library (2.14, >=2.5.0,<2.15, libdeepmd_c_cu11.tar.gz)
🔇 Additional comments (10)
source/api_cc/src/DeepPotPD.cc (10)
197-197: LGTM: Op library loading enabled.The uncommented call to
deepmd::load_op_library()correctly enables the Paddle custom operator library loading, which is essential for DPA2/DPA3 inference support as mentioned in the PR objectives.
269-344: Enhanced logging improves debugging capabilities.The replacement of ad-hoc
std::coutstatements with structured logging using the new Logger class significantly improves code maintainability and debugging capabilities. The debug-level logging of hyperparameters will be valuable for troubleshooting model issues.
376-376: Input tensor name updated to match Paddle model specifications.The change from
"coord"to"extended_coord"aligns with the updated Paddle model freezing specifications mentioned in the AI summary.
380-380: Input tensor name updated to match Paddle model specifications.The change from
"atype"to"extended_atype"aligns with the updated Paddle model freezing specifications mentioned in the AI summary.
388-448: Message passing implementation enables distributed computation.The implementation replaces the previous exception with comprehensive MPI-based message passing logic. The pointer-to-integer conversion workaround (lines 431-438) is a clever solution to Paddle's limitation with double pointer tensors.
Key improvements:
- Proper tensor setup for MPI communication parameters
- Safe type casting with assertions
- Efficient pointer address conversion for
sendlisthandlingThe assertion on Line 421 ensures pointer size compatibility before the conversion, which is good defensive programming.
445-448: Tensor handle promoted to member variable.Converting
mapping_tensorfrom a local to a member variable (this->mapping_tensor) is likely necessary to maintain tensor lifetime during asynchronous Paddle inference operations.
451-455: Tensor handle promoted to member variable.Converting
firstneigh_tensorfrom a local to a member variable (this->firstneigh_tensor) is likely necessary to maintain tensor lifetime during asynchronous Paddle inference operations.
457-461: Redundant parentheses removed in CopyFromCpu call.The removal of redundant parentheses in
CopyFromCpu(fparam.data())improves code readability without changing functionality.
562-562: Input tensor name updated to match Paddle model specifications.The change from
"coord"to"extended_coord"in the second predictor aligns with the updated Paddle model freezing specifications.
567-567: Input tensor name updated to match Paddle model specifications.The change from
"atype"to"extended_atype"in the second predictor aligns with the updated Paddle model freezing specifications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
deepmd/pd/model/task/fitting.py (1)
629-629: Remove invalid.to(device=...)method call.This matches the previous review comment - PaddlePaddle tensors don't support the
.to(device=...)method. This should be removed or replaced with appropriate PaddlePaddle tensor placement.- ).to(device=descriptor.place) # jit assertion + ) # jit assertion
🧹 Nitpick comments (1)
deepmd/pd/model/task/fitting.py (1)
98-101: Use ternary operator for cleaner code.The conditional assignment can be simplified using a ternary operator as suggested by static analysis.
- if callable(merged): - sampled = merged() - else: - sampled = merged + sampled = merged() if callable(merged) else merged
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
deepmd/pd/model/task/fitting.py(7 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-09-19T04:25:12.408Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202
Timestamp: 2024-10-16T21:49:57.401Z
Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in `deepmd/dpmodel/atomic_model/base_atomic_model.py`, do not suggest simplifying `for kk in ret_dict.keys()` to `for kk in ret_dict` unless it's relevant to the PR.
🪛 Ruff (0.12.2)
deepmd/pd/model/task/fitting.py
98-101: Use ternary operator sampled = merged() if callable(merged) else merged instead of if-else-block
Replace if-else-block with sampled = merged() if callable(merged) else merged
(SIM108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Test C++ (true)
- GitHub Check: Test C++ (false)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Build C library (2.14, >=2.5.0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C++ (rocm, rocm)
🔇 Additional comments (10)
deepmd/pd/model/task/fitting.py (10)
7-7: LGTM!The
Callableimport is correctly added and used in the newcompute_input_statsmethod signature.
75-161: Well-implemented input statistics computation.The method correctly handles both frame and atomic parameter statistics with proper:
- Support for both callable and list inputs
- Protection against division by zero
- Tensor operations for computing means and standard deviations
- Parameter updates using
paddle.assign
337-338: LGTM!The new attribute for controlling middle output evaluation is properly initialized.
461-463: LGTM!The setter method provides a clean interface for controlling middle output evaluation.
523-525: LGTM! Proper tensor casting for PaddlePaddle.The change from
.to()to.astype()is correct for PaddlePaddle tensor type casting operations.
614-618: LGTM! Well-implemented middle output handling for mixed types.The conditional middle output computation using
call_until_lastis correctly implemented and properly guarded by the evaluation flag.
625-640: LGTM! Comprehensive middle output handling for non-mixed types.The implementation correctly:
- Handles middle outputs for each atom type separately
- Uses proper masking with
paddle.tileandpaddle.where- Accumulates results across all types
655-657: LGTM! Improved masking withpaddle.full_like.The use of
paddle.full_like(atom_property, 0.0)for masking is more explicit and appropriate than usingpaddle.zeros_like.
662-662: LGTM! Proper boolean mask casting.The use of
.astype("bool")for explicit boolean casting is correct and clear.
665-666: LGTM! Improved return structure.Returning a results dictionary provides better structure and supports the new middle output functionality while maintaining the original output under
self.var_name.
dfa0dd0 to
c2e337f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
source/api_cc/src/DeepPotPD.cc (1)
213-213: Remove commented-out code or document its purpose.These commented-out
SwitchIrOptim(false)calls should either be removed entirely or properly documented if they're intended for future debugging/configuration use.Also applies to: 222-222
🧹 Nitpick comments (1)
source/api_cc/src/DeepPotPD.cc (1)
14-165: Consider moving Logger class to a separate header file.This Logger implementation is comprehensive but adds significant code bulk to the DeepPotPD implementation file. For better modularity and reusability across the codebase, consider extracting it to a dedicated logging utility header.
The Logger class could be moved to something like
source/api_cc/include/logging.hto enable reuse across other C++ components.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
source/api_cc/src/DeepPotPD.cc(8 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-09-19T04:25:12.408Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202
Timestamp: 2024-10-16T21:49:57.401Z
Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in `deepmd/dpmodel/atomic_model/base_atomic_model.py`, do not suggest simplifying `for kk in ret_dict.keys()` to `for kk in ret_dict` unless it's relevant to the PR.
📚 Learning: refactoring between test classes `testinferdeeppotdpapt` and `testinferdeeppotdpaptnopbc` is address...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-09-19T04:25:12.408Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Applied to files:
source/api_cc/src/DeepPotPD.cc
📚 Learning: in `deepmd/dpmodel/model/make_model.py`, the variable `nall` assigned but not used is intentional an...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/model/make_model.py:370-373
Timestamp: 2024-10-16T21:50:10.680Z
Learning: In `deepmd/dpmodel/model/make_model.py`, the variable `nall` assigned but not used is intentional and should not be flagged in future reviews.
Applied to files:
source/api_cc/src/DeepPotPD.cc
📚 Learning: for code blocks in `doc/model/train-fitting-dos.md` that display commands, use 'txt' as the language...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#3875
File: doc/model/train-fitting-dos.md:107-107
Timestamp: 2024-10-08T15:32:11.479Z
Learning: For code blocks in `doc/model/train-fitting-dos.md` that display commands, use 'txt' as the language specification as per user njzjz's preference.
Applied to files:
source/api_cc/src/DeepPotPD.cc
🧬 Code Graph Analysis (1)
source/api_cc/src/DeepPotPD.cc (1)
source/api_cc/src/common.cc (2)
load_op_library(412-439)load_op_library(412-412)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Test C++ (true)
- GitHub Check: Build C library (2.14, >=2.5.0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Test C++ (false)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Analyze (python)
🔇 Additional comments (8)
source/api_cc/src/DeepPotPD.cc (8)
197-197: LGTM: Critical for DPA2/DPA3 support.Enabling
deepmd::load_op_library()is essential for loading the PaddlePaddle custom operator library (deepmd_op_pd) that enables DPA2/DPA3 inference. This aligns with the PR's core objective.
269-295: Excellent logging improvements.The structured logging replaces ad-hoc debug prints with proper leveled logging, providing valuable debugging information for DPA2/DPA3 model initialization while being controllable via log levels.
Also applies to: 309-328, 333-344
376-376: LGTM: Updated tensor input names for DPA2/DPA3.The change from generic names to
"extended_coord"and"extended_atype"aligns with the updated input naming conventions for DPA2/DPA3 models that support extended coordinate and atom type tensors for message passing.Also applies to: 380-380
407-415: LGTM: Proper data type handling for sendnum.The conditional type conversion for
sendnumproperly handles cases where the source data type differs from the expectedint32_t, ensuring compatibility across different platforms.
421-421: Good defensive programming with pointer size assertion.The assertion ensures that
std::intptr_tis 64-bit before the pointer-to-integer conversion, preventing potential issues on platforms with different pointer sizes.
426-438: Clever workaround for PaddlePaddle limitation.The conversion of pointer arrays to
intptr_taddresses PaddlePaddle's inability to construct tensors from double pointers. The detailed comment explains the rationale well.
445-445: LGTM: Proper tensor handle lifetime management.Storing tensor handles as member variables ensures they persist for the lifetime of inference operations, preventing premature destruction that could cause runtime errors.
Also applies to: 451-451
457-468: LGTM: Clean parameter tensor handling.The optional parameter tensor setup properly handles both fparam and aparam cases with appropriate reshaping and data copying. Using local unique_ptr is correct for these immediate-use tensors.
5a3f647 to
6c4309b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
deepmd/pd/entrypoints/main.py (1)
376-377: Why isdo_atomic_virialhardcoded to False?The change from a dynamic boolean input to a hardcoded
Falsevalue could break functionality for users who need atomic virial calculations. Is this intentional? If so, please document the reasoning. If temporary, consider adding a TODO comment.source/api_cc/src/DeepPotPD.cc (1)
213-213: Remove or uncomment the configuration lines.These commented-out configuration lines appear to be debug settings. Either remove them entirely or uncomment them with appropriate conditions if they're needed for debugging.
// Remove these lines entirely: // config->SwitchIrOptim(false); // config_fl->SwitchIrOptim(false);Also applies to: 222-222
🧹 Nitpick comments (2)
deepmd/pd/model/descriptor/repformers.py (1)
207-208: Minor docstring improvement.Removing the default value mention from the docstring makes the parameter documentation cleaner and more precise.
source/api_cc/src/DeepPotPD.cc (1)
20-165: Consider using existing logging framework instead of custom implementation.While the Logger class is well-implemented with proper features (log levels, colors, timestamps, stream interface), introducing a custom logging solution in a large codebase like deepmd-kit may not be ideal. Consider using an established logging library like spdlog or integrating with the project's existing logging infrastructure.
The implementation itself is solid with proper RAII, move semantics, and thread-safety considerations for the stream interface.
If keeping the custom logger, consider these improvements:
// Add thread safety for static members #include <mutex> class Logger { private: static std::mutex config_mutex; // ... existing members public: static void setLevel(Level level) { std::lock_guard<std::mutex> lock(config_mutex); minLevel = level; } // Similar for other setters };
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
deepmd/pd/entrypoints/main.py(2 hunks)deepmd/pd/model/descriptor/repflows.py(8 hunks)deepmd/pd/model/descriptor/repformers.py(8 hunks)deepmd/pd/utils/serialization.py(3 hunks)source/api_cc/src/DeepPotPD.cc(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- deepmd/pd/utils/serialization.py
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-09-19T04:25:12.408Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202
Timestamp: 2024-10-16T21:49:57.401Z
Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in `deepmd/dpmodel/atomic_model/base_atomic_model.py`, do not suggest simplifying `for kk in ret_dict.keys()` to `for kk in ret_dict` unless it's relevant to the PR.
📚 Learning: when reviewing prs, avoid making refactor suggestions that are not directly related to the pr's chan...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202
Timestamp: 2024-10-16T21:49:57.401Z
Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in `deepmd/dpmodel/atomic_model/base_atomic_model.py`, do not suggest simplifying `for kk in ret_dict.keys()` to `for kk in ret_dict` unless it's relevant to the PR.
Applied to files:
deepmd/pd/entrypoints/main.py
📚 Learning: refactoring between test classes `testinferdeeppotdpapt` and `testinferdeeppotdpaptnopbc` is address...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Applied to files:
source/api_cc/src/DeepPotPD.ccdeepmd/pd/model/descriptor/repformers.pydeepmd/pd/model/descriptor/repflows.py
📚 Learning: in `deepmd/dpmodel/model/make_model.py`, the variable `nall` assigned but not used is intentional an...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/model/make_model.py:370-373
Timestamp: 2024-10-16T21:50:10.680Z
Learning: In `deepmd/dpmodel/model/make_model.py`, the variable `nall` assigned but not used is intentional and should not be flagged in future reviews.
Applied to files:
source/api_cc/src/DeepPotPD.cc
📚 Learning: for code blocks in `doc/model/train-fitting-dos.md` that display commands, use 'txt' as the language...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#3875
File: doc/model/train-fitting-dos.md:107-107
Timestamp: 2024-06-13T16:32:13.786Z
Learning: For code blocks in `doc/model/train-fitting-dos.md` that display commands, use 'txt' as the language specification as per user njzjz's preference.
Applied to files:
source/api_cc/src/DeepPotPD.cc
🧬 Code Graph Analysis (2)
source/api_cc/src/DeepPotPD.cc (1)
source/api_cc/src/common.cc (2)
load_op_library(412-439)load_op_library(412-412)
deepmd/pd/model/descriptor/repformers.py (3)
deepmd/pd/utils/spin.py (1)
concat_switch_virtual(6-29)deepmd/pd/model/descriptor/repflows.py (1)
border_op(60-74)deepmd/pd/model/network/network.py (1)
Tensor(30-33)
🪛 Ruff (0.12.2)
deepmd/pd/model/descriptor/repflows.py
449-450: Use a single if statement instead of nested if statements
Combine if statements using and
(SIM102)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Build C library (2.14, >=2.5.0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Test C++ (false)
- GitHub Check: Test C++ (true)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Analyze (python)
🔇 Additional comments (20)
deepmd/pd/entrypoints/main.py (2)
371-373: LGTM! Enhanced flexibility with dynamic batch sizing.The change from fixed batch size
[1, ...]to dynamic batch size[-1, ...]for coordinate, atom type, and box tensors improves the model's flexibility to handle variable batch sizes during inference.
392-412: Well-structured communication tensor specifications.The expanded
comm_dictinput specification properly supports the new communication tensor interface with individual specs forsend_list,send_proc,recv_proc,send_num,recv_num, andcommunicator. The updated input names (extended_coord,extended_atype) also align with the broader codebase conventions.deepmd/pd/model/descriptor/repflows.py (4)
13-16: Excellent custom operator integration with proper fallback.The conditional import and assignment of
border_opprovides a clean fallback mechanism when the custom Paddle operator library isn't available. The stub function with clear error message guides users to the documentation, and the spin utility import is properly added.Also applies to: 41-43, 58-79
450-451: Improved robustness with dynamic mode checks and tensor operations.The addition of
paddle.in_dynamic_mode()guards before assertions prevents errors during static compilation, and the replacement of scalar-1withpaddle.full_like(..., -1)ensures proper tensor type compatibility.Also applies to: 457-457, 480-480, 530-531, 576-577, 587-588
446-446: Proper type standardization for communication interface.The change from
Optional[dict[str, paddle.Tensor]]toOptional[list[paddle.Tensor]]aligns with the codebase-wide standardization of the communication tensor interface.
595-675: Well-implemented parallel communication with spin support.The parallel mode implementation properly distinguishes between spin and non-spin cases:
- Non-spin: Simple padding along the atom dimension
- Spin: Split real/virtual parts, concatenate along features, pad, then use
concat_switch_virtualThe integration with
border_opfollows the correct pattern with CPU-placed integer tensors and proper tensor reshaping.deepmd/pd/model/descriptor/repformers.py (4)
13-16: Consistent custom operator integration.The implementation matches the pattern in
repflows.pywith proper conditional import and fallback mechanism for theborder_opcustom operator.Also applies to: 38-40, 55-76
411-411: Consistent type standardization.The
comm_dictparameter type change aligns with the broader codebase standardization effort.
413-414: Improved conditional logic with proper mode checks.The conditional logic properly handles both parallel and non-parallel execution paths with appropriate dynamic mode checks to prevent assertion errors during static compilation.
Also applies to: 439-439, 447-448, 465-467, 476-478
483-560: Consistent parallel communication implementation.The parallel mode implementation follows the same well-structured pattern as in
repflows.py, properly handling both spin and non-spin cases with appropriate tensor padding,border_opcommunication, andconcat_switch_virtualfor spin mode.source/api_cc/src/DeepPotPD.cc (10)
14-19: New header includes for logging functionality.The additional includes support the new Logger class implementation. The includes are appropriate for the functionality being added.
269-275: Good logging integration replacing debug prints.The replacement of
std::coutstatements with the structured logger is a significant improvement. The logging provides clear information about model loading and CINN compilation status.Also applies to: 277-284, 286-294
308-327: Excellent debugging output for model introspection.The addition of debug logs to print input/output tensor names for both predictors will be very helpful for debugging and understanding the model interface. The lambda function for formatting the output is clean and reusable.
332-343: Good addition of hyperparameter debug logging.Adding debug output for all the loaded hyperparameters (
do_message_passing,rcut,ntypes, etc.) provides valuable insight into the model configuration during initialization.
387-438: Comprehensive message passing implementation.This is a major improvement - replacing the previous exception-throwing placeholder with full implementation of message passing tensor setup. The code properly handles:
- Multiple MPI communication tensors (send_proc, recv_proc, send_num, recv_num, communicator, send_list)
- Data type conversion for
sendnumtoint32_t- Workaround for Paddle's lack of double pointer support by converting to
intptr_tThe pointer address conversion approach is a clever workaround for Paddle's limitations.
One potential concern: The assertion
assert(sizeof(std::intptr_t) == 8);assumes 64-bit systems. Consider adding a more descriptive error message or handling 32-bit systems gracefully:static_assert(sizeof(std::intptr_t) == 8, "This implementation requires 64-bit pointers for Paddle compatibility");
439-447: Proper mapping tensor implementation.The replacement of the previous exception with actual mapping tensor creation is good. The code correctly:
- Creates mapping with proper remapping via
fwd_map- Uses
int64_tas expected by the model- Stores as member variable for lifetime management
450-453: Member variable assignment for tensor handles.Converting
firstneigh_tensorfrom local to member variable ensures the tensor handle's lifetime extends beyond the method scope, which is important for proper resource management in the Paddle inference context.
456-459: Consistent tensor handling patterns.The changes to use
std::unique_ptr<paddle_infer::Tensor>and remove redundant parentheses inCopyFromCpucalls improve code consistency and readability.Also applies to: 462-467
197-197: Confirm custom operator library and loader implementationI wasn’t able to locate a built
deepmd_op_pdshared library or the definition ofdeepmd::load_op_library()in the repository. Please verify that:
- Your CMake setup (e.g. CMakeLists.txt) adds and installs
deepmd_op_pdas a shared library.- The
deepmd::load_op_library()function is implemented and calls the correct library filename (e.g.libdeepmd_op_pd.so,.dll, or.dylib).
375-375: Verified consistency of tensor input names across C++ and model definitions
Renamed inputs"extended_coord"and"extended_atype"are used in source/api_cc/src/DeepPotPD.cc (lines 375 & 379) and throughout the Python model definitions (e.g. InputSpec names and tests). No remaining calls toGetInputHandle("coord")orGetInputHandle("atype")were found.
No further action required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
deepmd/pd/entrypoints/main.py(2 hunks)deepmd/pd/utils/serialization.py(3 hunks)source/api_cc/src/DeepPotPD.cc(7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- deepmd/pd/entrypoints/main.py
- deepmd/pd/utils/serialization.py
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-09-19T04:25:12.408Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202
Timestamp: 2024-10-16T21:49:57.401Z
Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in `deepmd/dpmodel/atomic_model/base_atomic_model.py`, do not suggest simplifying `for kk in ret_dict.keys()` to `for kk in ret_dict` unless it's relevant to the PR.
📚 Learning: refactoring between test classes `testinferdeeppotdpapt` and `testinferdeeppotdpaptnopbc` is address...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Applied to files:
source/api_cc/src/DeepPotPD.cc
📚 Learning: in `deepmd/dpmodel/model/make_model.py`, the variable `nall` assigned but not used is intentional an...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/model/make_model.py:370-373
Timestamp: 2024-10-16T21:50:10.680Z
Learning: In `deepmd/dpmodel/model/make_model.py`, the variable `nall` assigned but not used is intentional and should not be flagged in future reviews.
Applied to files:
source/api_cc/src/DeepPotPD.cc
📚 Learning: for code blocks in `doc/model/train-fitting-dos.md` that display commands, use 'txt' as the language...
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#3875
File: doc/model/train-fitting-dos.md:107-107
Timestamp: 2024-06-13T16:32:13.786Z
Learning: For code blocks in `doc/model/train-fitting-dos.md` that display commands, use 'txt' as the language specification as per user njzjz's preference.
Applied to files:
source/api_cc/src/DeepPotPD.cc
🧬 Code Graph Analysis (1)
source/api_cc/src/DeepPotPD.cc (2)
source/api_cc/src/common.cc (2)
load_op_library(412-439)load_op_library(412-412)source/api_cc/tests/test_deeppot_pd.cc (5)
firstneigh(225-225)firstneigh(287-287)firstneigh(370-370)firstneigh(448-448)firstneigh(510-510)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Analyze (python)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C library (2.14, >=2.5.0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Test C++ (true)
- GitHub Check: Test C++ (false)
🔇 Additional comments (6)
source/api_cc/src/DeepPotPD.cc (6)
197-197: LGTM: Operator library loading is now properly enabled.This uncomments the call to
deepmd::load_op_library()which is essential for loading the Paddle custom operator library (deepmd_op_pd) as mentioned in the PR objectives.
269-344: LGTM: Improved logging with proper formatting.The replacement of
std::coutstatements with structured logging calls using the new Logger class significantly improves code quality. The addition of debug output for hyperparameters will be valuable for troubleshooting. This addresses the previous review comments about debug print statements by implementing proper logging infrastructure.
439-447: LGTM: Mapping tensor implementation is correct.The mapping tensor setup properly handles the remapping through
fwd_mapand the conditional logic correctly handles cases where mapping may or may not be present.
450-453: LGTM: Proper tensor handle lifetime management.Converting
firstneigh_tensorto a member variable (this->firstneigh_tensor) ensures the tensor handle remains valid throughout the computation, which is essential for proper memory management.
456-466: LGTM: Clean variable declarations and simplified method calls.The removal of redundant parentheses in
CopyFromCpucalls improves code clarity. The variable declaration changes are appropriate.
387-438: Ensure message‐passing path is tested and pointer size check is portableI wasn’t able to find any tests exercising the
do_message_passing == 1 && nghost > 0branch inDeepPotPD.cc(lines 387–438). Please verify or add test coverage for this scenario to avoid regressions.Also, to improve portability on 32-bit systems, replace the runtime
assertwith a compile‐time check:- assert(sizeof(std::intptr_t) == 8); + static_assert(sizeof(std::intptr_t) >= sizeof(void*), + "intptr_t must be large enough to hold pointer values");
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## devel #4870 +/- ##
==========================================
- Coverage 84.71% 84.35% -0.36%
==========================================
Files 699 701 +2
Lines 68128 68567 +439
Branches 3541 3573 +32
==========================================
+ Hits 57716 57843 +127
- Misses 9277 9583 +306
- Partials 1135 1141 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@njzjz this PR is ready for review. |
njzjz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current state of this PR looks good to me.
|
@coderabbitai resolve |
✅ Actions performedComments resolved. |
Support DPA2/DPA3 C++ inference with lammps
Note
The intermediate representation (IR) of PaddlePaddle's computation graph does not support the
Dict[str, Tensor]data type. Considering the special nature ofcomm_dict, we replaced it withList[Tensor], which allowed us to successfully run inference for DPA2 and DPA3.DPA2

DPA3

Summary by CodeRabbit
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Build/Chores