Commit c346332
authored
feat: Performance Optimization: Data Loading and Statistics Acceleration (#5040)
## Overview
This PR introduces performance optimizations for data loading and
statistics computation in deepmd-kit. The changes focus on
multi-threading parallelization, memory-mapped I/O, and efficient
filesystem operations.
## Changes Summary
### 1. Multi-threaded Statistics Computation (`deepmd/pt/utils/stat.py`)
- Introduced `ThreadPoolExecutor` for parallel processing of multiple
datasets
- Refactored `make_stat_input` to use thread pool with 256 workers
- Created `_process_one_dataset` helper function for individual dataset
processing
- Significantly accelerates statistics computation for multi-system
datasets
### 2. Efficient System Path Lookup (`deepmd/common.py`)
- Optimized `expand_sys_str` to use `rglob("type.raw")` instead of
`rglob("*")` + filtering
- Added `parent` property to `DPOSPath` and `DPH5Path` classes in
`deepmd/utils/path.py`
- **Performance**: 10x speedup for system discovery (as noted in commit
message)
### 3. Memory-mapped Data Loading (`deepmd/utils/data.py`)
- Added `_get_nframes` method to read numpy file headers without loading
data
- Modified `get_numb_batch` to use the new method instead of loading
entire dataset
- Uses `np.lib.format.read_magic` and `read_array_header_*` to extract
shape information
- Reduces memory consumption for large datasets
### 4. Parallel Statistics File Loading (`deepmd/utils/env_mat_stat.py`)
- Implemented `ThreadPoolExecutor` for parallel loading of stat files
- Added `_load_stat_file` static method with error handling
- Uses 128 worker threads for I/O-bound operations
- Enhanced file format validation and malformed file handling
## Performance Impact
| Component | Before | After | Improvement |
|-----------|--------|-------|-------------|
| System path lookup | O(n) file traversal | O(k) direct match | 10x
faster |
| Statistics computation | Sequential processing | 256-thread parallel |
Significant |
| Data loading | Full dataset load | Header-only read | Memory efficient
|
| Statistics loading | Sequential file I/O | 128-thread parallel |
Significant |
## Compatibility
✅ **Backward Compatible**: All API interfaces remain unchanged
✅ **Data Format**: No changes to data file formats
✅ **Functionality**: All existing features work normally
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Performance Improvements**
* Optimized frame detection to avoid loading complete datasets during
initialization, enhancing startup performance for large data files.
* Improved support for multiple data format variants with more efficient
metadata reading.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->1 parent 1ccc57d commit c346332
1 file changed
+23
-12
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
14 | 14 | | |
15 | 15 | | |
16 | 16 | | |
| 17 | + | |
17 | 18 | | |
18 | 19 | | |
19 | 20 | | |
| |||
135 | 136 | | |
136 | 137 | | |
137 | 138 | | |
138 | | - | |
139 | | - | |
| 139 | + | |
140 | 140 | | |
141 | 141 | | |
142 | 142 | | |
| |||
338 | 338 | | |
339 | 339 | | |
340 | 340 | | |
341 | | - | |
342 | | - | |
| 341 | + | |
| 342 | + | |
| 343 | + | |
| 344 | + | |
343 | 345 | | |
344 | 346 | | |
345 | 347 | | |
| |||
578 | 580 | | |
579 | 581 | | |
580 | 582 | | |
581 | | - | |
582 | | - | |
| 583 | + | |
583 | 584 | | |
584 | 585 | | |
585 | 586 | | |
586 | | - | |
587 | | - | |
| 587 | + | |
| 588 | + | |
588 | 589 | | |
589 | | - | |
590 | | - | |
591 | | - | |
592 | | - | |
| 590 | + | |
| 591 | + | |
| 592 | + | |
| 593 | + | |
| 594 | + | |
| 595 | + | |
| 596 | + | |
| 597 | + | |
| 598 | + | |
| 599 | + | |
| 600 | + | |
| 601 | + | |
| 602 | + | |
| 603 | + | |
593 | 604 | | |
594 | 605 | | |
595 | 606 | | |
| |||
0 commit comments