feat(pt): Implement type embedding compression for se_e3_tebd #5059
Conversation
- Introduced a new method for type embedding compression, allowing precomputation of strip-mode type embeddings for all type pairs.
- Added runtime checks to ensure compatibility with the strip input mode and the existence of the necessary filter layers.
- Updated the forward method to use precomputed type embeddings when available, improving performance during inference.

These changes optimize the handling of type embeddings, enhancing the efficiency of the descriptor model.
- Changed the assignment of type embedding data to use `register_buffer`, improving memory management.
- Introduced a new variable `embd_tensor` for clarity and maintainability in the embedding computation.

These modifications improve the structure of the code while preserving existing functionality.
…escrptBlockSeTTebd
…mbedding handling
Pull request overview
This PR implements type embedding compression for the se_e3_tebd descriptor: type embedding outputs are pre-computed and stored when the model is compressed, avoiding redundant computation during inference.
Key changes:
- Added a type embedding network parameter to the compression pipeline
- Pre-computes type embeddings for all type pairs and stores them in the `type_embd_data` buffer
- Modified the forward pass to use pre-computed embeddings when compression is enabled (see the sketch below)
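For orientation, here is a minimal sketch of the precompute-and-look-up idea. `get_full_embedding` is the helper referenced in this review's code graph, though its exact signature, the shapes, and the pair-flattening rule are assumptions for illustration, not the PR's actual implementation:

```python
import torch

def precompute_type_pair_embeddings(type_embedding_net, ntypes_with_padding):
    # Full per-type embedding table, shape (ntypes_with_padding, tebd_dim);
    # the device argument of get_full_embedding is assumed here.
    embd = type_embedding_net.get_full_embedding("cpu")
    # Enumerate all ordered type pairs (i, j); row i * n + j of the result
    # holds the concatenated embeddings of types i and j.
    n = ntypes_with_padding
    ii = torch.arange(n).repeat_interleave(n)
    jj = torch.arange(n).repeat(n)
    return torch.cat([embd[ii], embd[jj]], dim=-1)  # (n * n, 2 * tebd_dim)
```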
📝 Walkthrough

Modified SE-T TEBD descriptor compression to restrict compression to the "strip" input mode and introduced a `type_embd_data` buffer holding precomputed type embeddings for all type pairs.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Descriptor as DescrptSeTTebd
    participant Block as DescrptBlockSeTTebd
    participant Net as TypeEmbedNet
    User->>Descriptor: enable_compression(type_embedding_net, ...)
    rect rgb(200, 220, 255)
        Note over Descriptor: Guard check: tebd_input_mode == "strip"
    end
    Descriptor->>Descriptor: Validate filter_layers_strip exists
    Descriptor->>Block: enable_compression(type_embedding_net, ...)
    Block->>Net: Compute type embeddings
    Net-->>Block: Precomputed embedding tensor
    rect rgb(220, 240, 220)
        Block->>Block: register_buffer(type_embd_data, embedding)
    end
    Block-->>Descriptor: Compression enabled
    User->>Descriptor: forward(...)
    alt compress is True (strip mode)
        rect rgb(255, 240, 200)
            Note over Block: Use precomputed type_embd_data
            Block-->>Descriptor: Type-related embeddings from buffer
        end
    else compress is False
        rect rgb(240, 220, 255)
            Note over Block: Compute type embeddings on-the-fly
            Block-->>Descriptor: Type-related embeddings computed
        end
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning) | ✅ Passed checks (2)
Actionable comments posted: 1
🧹 Nitpick comments (1)
deepmd/pt/model/descriptor/se_t_tebd.py (1)
1092-1108: Consider simplifying the buffer update pattern.

The type embedding compression logic is correct: it precomputes all type-pair embeddings and stores them for efficient lookup during inference. However, the pattern of deleting and re-registering the buffer could be simplified.

Instead of:

```python
if hasattr(self, "type_embd_data"):
    del self.type_embd_data
self.register_buffer("type_embd_data", embd_tensor)
```

consider directly updating the buffer data:

```diff
-        if hasattr(self, "type_embd_data"):
-            del self.type_embd_data
-        self.register_buffer("type_embd_data", embd_tensor)
+        self.type_embd_data = embd_tensor
```

Since `type_embd_data` is already registered as a buffer in `__init__`, directly assigning the new tensor will update the buffer while maintaining proper serialization behavior. This approach is simpler and avoids potential issues with buffer management across multiple `enable_compression` calls.
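The suggestion relies on a PyTorch behavior worth spelling out: `nn.Module.__setattr__` routes a tensor assigned to an existing buffer name into `_buffers`, so the attribute stays a buffer. A minimal standalone demonstration (class and shapes are illustrative):

```python
import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered once here; later plain assignment updates the same buffer.
        self.register_buffer("type_embd_data", torch.zeros(1))

m = Demo()
m.type_embd_data = torch.ones(4)           # routed into m._buffers by nn.Module.__setattr__
print("type_embd_data" in m.state_dict())  # True: still serialized as a buffer
```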
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
deepmd/pt/model/descriptor/se_t_tebd.py (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
deepmd/pt/model/descriptor/se_t_tebd.py (2)
deepmd/dpmodel/utils/type_embed.py (1)
  - TypeEmbedNet (30-221)
deepmd/pt/model/network/network.py (2)
  - TypeEmbedNet (268-358)
  - get_full_embedding (317-332)
🪛 Ruff (0.14.5)
deepmd/pt/model/descriptor/se_t_tebd.py
- 557-557: Avoid specifying long messages outside the exception class (TRY003)
- 1068-1068: Avoid specifying long messages outside the exception class (TRY003)
- 1070-1072: Avoid specifying long messages outside the exception class (TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)
- GitHub Check: Agent
- GitHub Check: CodeQL analysis (python)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C library (2.14, >=2.5.0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Test C++ (true)
- GitHub Check: Test C++ (false)
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (c-cpp)
🔇 Additional comments (5)
deepmd/pt/model/descriptor/se_t_tebd.py (5)
556-557: LGTM: Compression restricted to strip mode.

The guard correctly restricts compression to "strip" mode, which aligns with the PR's implementation strategy for type embedding compression.
700-710: LGTM: Type embedding buffer properly registered.

The `type_embd_data` buffer is correctly registered for storing precomputed type-embedding compression data, with appropriate initialization and device placement.
997-1014: LGTM: Compression path correctly uses precomputed embeddings.

The conditional logic properly switches between the precomputed `type_embd_data` (compressed) and on-the-fly computation (uncompressed). The index calculation for type pairs is correct.

Note: the code assumes `idx` values are within the bounds of `tt_full`. This should be safe given that `ntypes_with_padding` defines both the index range and the `type_embd_data` dimensions, but consider adding an assertion if defensive checks are desired.
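As a hedged illustration of the index calculation and the optional defensive check mentioned above (`type_i`/`type_j` and the function boundary are assumptions; only `type_embd_data` and `ntypes_with_padding` come from the review):

```python
import torch

def lookup_pair_embedding(type_embd_data, type_i, type_j, ntypes_with_padding):
    # Flattened pair index: entry (i, j) lives at row i * N + j of the table.
    idx = type_i * ntypes_with_padding + type_j
    # The optional defensive check suggested above.
    assert int(idx.max()) < type_embd_data.shape[0], "type pair index out of bounds"
    return type_embd_data[idx]
```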
1044-1072: LGTM: Proper validation for type embedding compression requirements.

The updated signature and validation checks correctly enforce that:
- compression only works in "strip" mode
- `filter_layers_strip` must exist

The error messages are clear and help users understand the constraints.
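A minimal sketch of those guards, assuming the attribute names from the review (`tebd_input_mode`, `filter_layers_strip`); the enclosing class, error wording, and remaining parameters are hypothetical:

```python
class _CompressionGuard:
    def __init__(self, tebd_input_mode, filter_layers_strip):
        self.tebd_input_mode = tebd_input_mode
        self.filter_layers_strip = filter_layers_strip

    def enable_compression(self, type_embedding_net):
        # Compression is only defined for the "strip" input mode.
        if self.tebd_input_mode != "strip":
            raise RuntimeError("type embedding compression requires tebd_input_mode == 'strip'")
        # The strip filter layers must have been built at construction time.
        if self.filter_layers_strip is None:
            raise RuntimeError("filter_layers_strip is required to enable compression")
        # ... proceed to precompute type embeddings and tabulate networks ...
```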
1074-1091: LGTM: Geometric embedding compression setup.

The compression configuration for the geometric embedding network is properly initialized with bounds, table config, and data.
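For readers unfamiliar with table-based compression in general: the idea is to sample a small network once on a uniform grid bounded by the table config, then replace forward evaluations with interpolation into the table. The sketch below is a generic illustration under that assumption, not deepmd-kit's actual table format:

```python
import torch

def build_table(net, lower, upper, stride):
    # Tabulate the network's outputs on a uniform grid, once.
    xs = torch.arange(lower, upper, stride).unsqueeze(-1)  # (n, 1) grid
    with torch.no_grad():
        return xs, net(xs)                                 # (n, 1), (n, d)

def lookup(xs, table, x, lower, stride):
    # Linear interpolation between the two nearest grid points.
    pos = (x - lower) / stride
    i0 = pos.long().clamp(0, len(xs) - 2)
    w = (pos - i0.float()).unsqueeze(-1)
    return (1 - w) * table[i0] + w * table[i0 + 1]
```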
Codecov Report

❌ Patch coverage is …

Additional details and impacted files:

```diff
@@           Coverage Diff           @@
##            devel    #5059   +/-   ##
=======================================
  Coverage   84.24%   84.24%
=======================================
  Files         709      709
  Lines       70236    70296     +60
  Branches     3623     3619      -4
=======================================
+ Hits        59169    59221     +52
- Misses       9900     9906      +6
- Partials     1167     1169      +2
```

☔ View full report in Codecov by Sentry.
The unit tests for this descriptor will be submitted in a separate PR.
Summary by CodeRabbit
Breaking Changes
Improvements