
Conversation

@OutisLi (Collaborator) commented Nov 23, 2025

The unit test for this descriptor will be submitted in a separate PR.

Summary by CodeRabbit

  • Breaking Changes

    • Model compression configuration updated to include new required parameters
  • Improvements

    • Precomputed type-embedding data improves model compression efficiency and inference performance


- Introduced a new method for type embedding compression, allowing precomputation of strip-mode type embeddings for all type pairs.
- Added runtime checks to ensure compatibility with the strip input mode and the existence of necessary filter layers.
- Updated the forward method to utilize precomputed type embeddings when available, improving performance during inference.

These changes optimize the handling of type embeddings, enhancing the efficiency of the descriptor model.
- Changed the assignment of type embedding data to use `register_buffer`, so the precomputed tensor is tracked as part of the module state and follows device placement.
- Introduced a new variable `embd_tensor` for clarity and maintainability in the embedding computation process.

These modifications enhance the structure of the code while maintaining existing functionality.
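
A minimal sketch of the precomputation these notes describe, assuming a hypothetical strip_filter handle for the strip-mode filter network and assuming the pair feature is the concatenation of the two per-type embeddings; the actual code in deepmd/pt/model/descriptor/se_t_tebd.py may combine them differently:

# Sketch only: strip_filter and the pairwise concatenation are illustrative
# assumptions, not a copy of the descriptor implementation.
import torch


def precompute_type_pair_embeddings(type_embedding_net, strip_filter, ntypes_with_padding, device):
    # Full per-type embedding table, shape (ntypes_with_padding, tebd_dim).
    tt_full = type_embedding_net.get_full_embedding(device)
    # Broadcast to all (i, j) type pairs and concatenate along the feature axis.
    tt_i = tt_full.unsqueeze(1).expand(-1, ntypes_with_padding, -1)
    tt_j = tt_full.unsqueeze(0).expand(ntypes_with_padding, -1, -1)
    pair_feat = torch.cat([tt_i, tt_j], dim=-1).reshape(ntypes_with_padding**2, -1)
    # Evaluate the strip filter once for every pair; at inference this becomes a table lookup.
    embd_tensor = strip_filter(pair_feat)
    return embd_tensor

The resulting embd_tensor is then what gets stored on the block, e.g. self.register_buffer("type_embd_data", embd_tensor).
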
Copilot AI review requested due to automatic review settings November 23, 2025 05:25
Copilot AI (Contributor) left a comment

Pull request overview

This PR implements type embedding compression for the se_e3_tebd descriptor. The compression allows the model to pre-compute and store type embedding outputs during compression, avoiding redundant computation during inference.

Key changes:

  • Added type embedding network parameter to compression pipeline
  • Pre-computes type embeddings for all type pairs and stores in type_embd_data buffer
  • Modified forward pass to use pre-computed embeddings when compression is enabled


coderabbitai bot (Contributor) commented Nov 23, 2025

📝 Walkthrough

Modified SE-T TEBD descriptor compression to restrict compression to "strip" input mode, introduced a type_embedding_net parameter to enable_compression, added a type_embd_data buffer for storing precomputed type-embedding data, and reworked forward logic to use pre-tabulated compression data when enabled.

Changes

Cohort: SE-T TEBD Descriptor Compression Enhancement
File(s): deepmd/pt/model/descriptor/se_t_tebd.py
Summary: Added a compression guard for the "strip" input mode only. Refactored the enable_compression signature to accept type_embedding_net, table_data, table_config, lower, and upper parameters. Introduced a type_embd_data buffer in DescrptBlockSeTTebd to store precomputed compressed type-embedding data. Reworked the strip-mode forward logic to conditionally use pre-tabulated compression data instead of on-the-fly computation. Added runtime validation for filter_layers_strip existence during compression enablement.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Descriptor as DescrptSeTTebd
    participant Block as DescrptBlockSeTTebd
    participant Net as TypeEmbedNet

    User->>Descriptor: enable_compression(type_embedding_net, ...)
    rect rgb(200, 220, 255)
    Note over Descriptor: Guard check: tebd_input_mode == "strip"
    end
    Descriptor->>Descriptor: Validate filter_layers_strip exists
    Descriptor->>Block: enable_compression(type_embedding_net, ...)
    Block->>Net: Compute type embeddings
    Net-->>Block: Precomputed embedding tensor
    rect rgb(220, 240, 220)
    Block->>Block: register_buffer(type_embd_data, embedding)
    end
    Block-->>Descriptor: Compression enabled

    User->>Descriptor: forward(...)
    alt compress is True (strip mode)
        rect rgb(255, 240, 200)
        Note over Block: Use precomputed type_embd_data
        Block-->>Descriptor: Type-related embeddings from buffer
        end
    else compress is False
        rect rgb(240, 220, 255)
        Note over Block: Compute type embeddings on-the-fly
        Block-->>Descriptor: Type-related embeddings computed
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Method signature overhaul: enable_compression parameters completely changed from table configuration primitives to high-level objects (type_embedding_net, table_data, table_config); requires understanding the new interface contract
  • Dual-path forward logic: Conditional branching based on compression state adds complexity; both paths need verification for correctness (see the sketch after this list)
  • Buffer lifecycle management: New type_embd_data buffer registration and usage patterns need careful review to ensure proper tensor lifecycle and device placement
  • Runtime validation logic: Guard conditions and error messages specific to TEBD compression require careful verification against intended restrictions
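
For reference, the dual-path strip-mode logic could be sketched roughly as below; strip_filter, the attribute names, and the row-major pair-index layout are illustrative assumptions rather than the code in the diff:

import torch


def strip_type_embedding(block, strip_filter, nei_type_i, nei_type_j, tt_full):
    # Sketch of the strip-mode branch; `block` stands in for DescrptBlockSeTTebd.
    if block.compress:
        # Compressed path: one precomputed row per (type_i, type_j) pair.
        idx = nei_type_i * block.ntypes_with_padding + nei_type_j
        return block.type_embd_data[idx.long()]
    # Uncompressed path: build the pair feature and run the filter network on the fly.
    pair_feat = torch.cat([tt_full[nei_type_i], tt_full[nei_type_j]], dim=-1)
    return strip_filter(pair_feat)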

Possibly related PRs

Suggested labels

Python, Core

Suggested reviewers

  • njzjz
  • iProzd
  • wanghan-iapcm

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Title check (⚠️ Warning): The PR title mentions 'se_e3_tebd' but the actual changes are exclusively in 'se_t_tebd.py'; the title is misleading about which descriptor is being modified. Resolution: update the title to 'feat(pt): Implement type embedding compression for se_t_tebd' to accurately reflect the changes being made.
✅ Passed checks (2 passed)
  • Description check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
  • Docstring coverage (✅ Passed): Docstring coverage is 83.33%, which is sufficient; the required threshold is 80.00%.


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
deepmd/pt/model/descriptor/se_t_tebd.py (1)

1092-1108: Consider simplifying the buffer update pattern.

The type embedding compression logic is correct: it precomputes all type-pair embeddings and stores them for efficient lookup during inference. However, the pattern of deleting and re-registering the buffer could be simplified.

Instead of:

if hasattr(self, "type_embd_data"):
    del self.type_embd_data
self.register_buffer("type_embd_data", embd_tensor)

Consider directly updating the buffer data:

-            if hasattr(self, "type_embd_data"):
-                del self.type_embd_data
-            self.register_buffer("type_embd_data", embd_tensor)
+            self.type_embd_data = embd_tensor

Since type_embd_data is already registered as a buffer in __init__, directly assigning the new tensor will update the buffer while maintaining proper serialization behavior. This approach is simpler and avoids potential issues with buffer management during multiple enable_compression calls.
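
A self-contained check of the PyTorch behavior this suggestion relies on, using a generic module rather than the descriptor itself:

import torch
import torch.nn as nn


class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered once in __init__, analogous to type_embd_data in the descriptor block.
        self.register_buffer("type_embd_data", torch.zeros(2, 3))


m = Demo()
# Plain attribute assignment routes through nn.Module.__setattr__, which updates the
# existing buffer entry, so it stays in state_dict and follows .to(device) calls.
m.type_embd_data = torch.ones(4, 5)
assert "type_embd_data" in m.state_dict()
assert m.state_dict()["type_embd_data"].shape == (4, 5)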

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e98dc5a and 16b383f.

📒 Files selected for processing (1)
  • deepmd/pt/model/descriptor/se_t_tebd.py (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
deepmd/pt/model/descriptor/se_t_tebd.py (2)
deepmd/dpmodel/utils/type_embed.py (1)
  • TypeEmbedNet (30-221)
deepmd/pt/model/network/network.py (2)
  • TypeEmbedNet (268-358)
  • get_full_embedding (317-332)
🪛 Ruff (0.14.5)
deepmd/pt/model/descriptor/se_t_tebd.py

557-557: Avoid specifying long messages outside the exception class

(TRY003)


1068-1068: Avoid specifying long messages outside the exception class

(TRY003)


1070-1072: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)
  • GitHub Check: Agent
  • GitHub Check: CodeQL analysis (python)
  • GitHub Check: Test Python (5, 3.9)
  • GitHub Check: Test Python (4, 3.9)
  • GitHub Check: Test Python (6, 3.9)
  • GitHub Check: Test Python (6, 3.12)
  • GitHub Check: Test Python (4, 3.12)
  • GitHub Check: Test Python (5, 3.12)
  • GitHub Check: Test Python (3, 3.12)
  • GitHub Check: Test Python (2, 3.9)
  • GitHub Check: Test Python (3, 3.9)
  • GitHub Check: Test Python (1, 3.12)
  • GitHub Check: Test Python (1, 3.9)
  • GitHub Check: Test Python (2, 3.12)
  • GitHub Check: Build C++ (cuda120, cuda)
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Build C++ (cpu, cpu)
  • GitHub Check: Build C++ (cuda, cuda)
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Build C library (2.14, >=2.5.0,<2.15, libdeepmd_c_cu11.tar.gz)
  • GitHub Check: Build wheels for cp310-manylinux_aarch64
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
  • GitHub Check: Build wheels for cp311-macosx_arm64
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build wheels for cp311-win_amd64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Test C++ (true)
  • GitHub Check: Test C++ (false)
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (c-cpp)
🔇 Additional comments (5)
deepmd/pt/model/descriptor/se_t_tebd.py (5)

556-557: LGTM: Compression restricted to strip mode.

The guard correctly restricts compression to "strip" mode, which aligns with the PR's implementation strategy for type embedding compression.


700-710: LGTM: Type embedding buffer properly registered.

The type_embd_data buffer is correctly registered for storing precomputed type-embedding compression data, with appropriate initialization and device placement.


997-1014: LGTM: Compression path correctly uses precomputed embeddings.

The conditional logic properly switches between precomputed type_embd_data (compressed) and on-the-fly computation (uncompressed). The index calculation for type pairs is correct.

Note: The code assumes idx values are within bounds of tt_full. This should be safe given that ntypes_with_padding defines both the index range and type_embd_data dimensions, but consider adding an assertion if defensive checks are desired.
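
If such a defensive check were added, it could look roughly like this (a sketch; the names are assumed):

import torch


def check_pair_index(idx: torch.Tensor, type_embd_data: torch.Tensor) -> None:
    # Hypothetical guard: every pair index must address a row of the precomputed
    # table, which holds ntypes_with_padding**2 entries.
    if int(idx.max()) >= type_embd_data.shape[0]:
        raise IndexError(
            f"type-pair index {int(idx.max())} is out of range for type_embd_data "
            f"with {type_embd_data.shape[0]} rows"
        )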


1044-1072: LGTM: Proper validation for type embedding compression requirements.

The updated signature and validation checks correctly enforce that:

  1. Compression only works in "strip" mode
  2. filter_layers_strip must exist

The error messages are clear and help users understand the constraints.
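
The two constraints could be sketched as follows; the parameter names are assumptions based on the summary above rather than the exact signatures in se_t_tebd.py:

def check_compression_prerequisites(tebd_input_mode: str, filter_layers_strip) -> None:
    # Type embedding compression is only defined for strip mode, where the type-pair
    # contribution is handled by a separate network and can be tabulated in advance.
    if tebd_input_mode != "strip":
        raise RuntimeError(
            "Type embedding compression requires tebd_input_mode == 'strip', "
            f"got {tebd_input_mode!r}"
        )
    if filter_layers_strip is None:
        raise RuntimeError(
            "filter_layers_strip must be initialized before enabling compression"
        )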


1074-1091: LGTM: Geometric embedding compression setup.

The compression configuration for the geometric embedding network is properly initialized with bounds, table config, and data.

codecov bot commented Nov 23, 2025

Codecov Report

❌ Patch coverage is 36.00000% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.24%. Comparing base (da452d7) to head (16b383f).
⚠️ Report is 5 commits behind head on devel.

Files with missing lines: deepmd/pt/model/descriptor/se_t_tebd.py (patch 36.00%, 16 lines missing ⚠️)
Additional details and impacted files
@@           Coverage Diff           @@
##            devel    #5059   +/-   ##
=======================================
  Coverage   84.24%   84.24%           
=======================================
  Files         709      709           
  Lines       70236    70296   +60     
  Branches     3623     3619    -4     
=======================================
+ Hits        59169    59221   +52     
- Misses       9900     9906    +6     
- Partials     1167     1169    +2     

☔ View full report in Codecov by Sentry.

@OutisLi requested review from iProzd and njzjz November 23, 2025 07:23
@iProzd added this pull request to the merge queue Nov 24, 2025
Merged via the queue into deepmodeling:devel with commit 5163e74 Nov 24, 2025
66 checks passed
@OutisLi deleted the pr/e3_compress branch November 25, 2025 11:29
