@dzzz2001 dzzz2001 commented Dec 7, 2025

This PR refactors snap_psibeta_half_tddft in source/source_lcao/module_rt/snap_psibeta_half_tddft.cpp to improve code readability and performance.

Key Changes

  1. Refactoring: Split the monolithic function into logical sections with clear comments. Added helper functions init_gauss_legendre_grid and interpolate_radial.
  2. Performance Optimization:
    • Gauss-Legendre Grid Caching: Implemented thread-safe static caching for radial integration grid points.
    • Ylm Caching: Precomputed and cached spherical harmonics on the Lebedev angular grid for the projection loop.
    • Vectorization Friendly: Restructured the inner loops to separate geometric calculations from accumulation, aiding compiler vectorization.
    • Memory Management: Reused vectors to minimize allocation overhead in the loops.
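A minimal sketch of the grid-caching idea, not the PR's actual code: the names `GaussLegendreGrid`, `build_gauss_legendre` and `cached_gauss_legendre` are hypothetical, and the mutex-guarded map is only one possible way to make a static cache thread-safe. The nodes and weights are built once per order on [-1, 1] and later calls return the stored copy.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <mutex>
#include <vector>

// Hypothetical container: nodes x and weights w of an n-point rule on [-1, 1].
struct GaussLegendreGrid
{
    std::vector<double> x;
    std::vector<double> w;
};

// Build the n-point rule by Newton iteration on the Legendre polynomial P_n.
inline GaussLegendreGrid build_gauss_legendre(const int n)
{
    assert(n > 0);
    const double pi = std::acos(-1.0);
    GaussLegendreGrid g;
    g.x.resize(n);
    g.w.resize(n);
    for (int i = 0; i < n; ++i)
    {
        // Standard initial guess for the i-th root of P_n.
        double z = std::cos(pi * (i + 0.75) / (n + 0.5));
        double dp = 1.0;
        for (int it = 0; it < 100; ++it)
        {
            // Three-term recurrence: afterwards p1 = P_n(z), p0 = P_{n-1}(z).
            double p0 = 1.0;
            double p1 = z;
            for (int k = 2; k <= n; ++k)
            {
                const double pk = ((2.0 * k - 1.0) * z * p1 - (k - 1.0) * p0) / k;
                p0 = p1;
                p1 = pk;
            }
            dp = n * (z * p1 - p0) / (z * z - 1.0); // P_n'(z)
            const double dz = p1 / dp;
            z -= dz;
            if (std::fabs(dz) < 1e-15) { break; }
        }
        g.x[i] = z;
        g.w[i] = 2.0 / ((1.0 - z * z) * dp * dp);
    }
    return g;
}

// Return the cached grid for order n, building it on first use.
// The mutex serializes lookups so concurrent first calls are safe.
inline const GaussLegendreGrid& cached_gauss_legendre(const int n)
{
    static std::mutex mtx;
    static std::map<int, GaussLegendreGrid> cache;
    std::lock_guard<std::mutex> lock(mtx);
    auto it = cache.find(n);
    if (it == cache.end())
    {
        it = cache.emplace(n, build_gauss_legendre(n)).first;
    }
    return it->second; // std::map references stay valid after later inserts
}
```

Callers would map the cached nodes from [-1, 1] to their own radial interval; only that affine rescaling then remains per call.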

Performance Benchmark

Tested on a Si96 system (2 MPI processes, 8 OpenMP threads).

| Version | Time (s) |
| --- | --- |
| Baseline | 230.96 |
| Optimized | 147.28 |

Speedup: ~1.57x (~36% reduction in execution time for this function).
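
As a quick check, both figures follow directly from the two timings reported above:

```latex
\frac{230.96}{147.28} \approx 1.568,
\qquad
1 - \frac{147.28}{230.96} \approx 0.362
```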

Key optimizations include:
1. Cache Gauss-Legendre Grid: Implemented thread-safe caching for radial integration grid points and weights.
2. Precompute Spherical Harmonics: Precomputed Ylm and A dot r on the Lebedev angular grid to reduce inner loop overhead.
3. Memory Optimization: Lifted vector allocations out of loops and reused buffers to minimize allocation costs.
4. Inline Interpolation: Inlined polynomial interpolation and used precomputed inverse step size to avoid divisions.
5. Common Subexpression Elimination: Extracted invariant factors from the innermost loop to reduce arithmetic operations.
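
Items 3 and 4 can be illustrated with a small sketch. It assumes a uniform radial grid and a four-point Lagrange (cubic) polynomial. The names `interpolate_uniform_cubic` and `interpolate_many` are hypothetical, and the PR's own `interpolate_radial` helper may differ in both signature and formula.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Four-point Lagrange interpolation on a uniform grid r_i = i * dr.
// inv_dr = 1 / dr is computed once by the caller, so no division by dr
// is performed per evaluation.
inline double interpolate_uniform_cubic(const std::vector<double>& table,
                                        const double inv_dr,
                                        const double r)
{
    assert(r >= 0.0);
    const double pos = r * inv_dr;
    const std::size_t iq = static_cast<std::size_t>(pos);
    assert(iq + 3 < table.size());
    const double x0 = pos - static_cast<double>(iq); // fractional offset in [0, 1)
    const double x1 = 1.0 - x0;
    const double x2 = 2.0 - x0;
    const double x3 = 3.0 - x0;
    // Constant factors 1/6 and 1/2 are written as multiplications.
    return x1 * x2 * (table[iq] * x3 + table[iq + 3] * x0) * (1.0 / 6.0)
         + x0 * x3 * (table[iq + 1] * x2 - table[iq + 2] * x1) * 0.5;
}

// Fill a caller-owned buffer. Reusing `out` across calls avoids a fresh
// allocation once its capacity is large enough.
inline void interpolate_many(const std::vector<double>& table,
                             const double inv_dr,
                             const std::vector<double>& r,
                             std::vector<double>& out)
{
    out.resize(r.size());
    for (std::size_t i = 0; i < r.size(); ++i)
    {
        out[i] = interpolate_uniform_cubic(table, inv_dr, r[i]);
    }
}
```

Declaring the output buffer once outside the hot loop and passing it by reference is the usual way to lift allocations out of loops. The interpolation itself reproduces any cubic exactly, which makes it easy to test.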
@dzzz2001 dzzz2001 force-pushed the refactor_snap_psibeta branch from fbecd9d to 16d8b9f Compare December 7, 2025 16:43
@dzzz2001 dzzz2001 requested a review from AsTonyshment December 7, 2025 17:04
dzzz2001 commented Dec 8, 2025

@copilot Can you review this PR?

Copilot AI commented Dec 8, 2025

@dzzz2001 I've opened a new pull request, #6766, to work on those changes. Once the pull request is ready, I'll request review from you.

@AsTonyshment AsTonyshment left a comment

LGTM

@mohanchen mohanchen added the Refactor (Refactor ABACUS codes) and Performance (Issues related to fail running ABACUS) labels Dec 8, 2025
@mohanchen mohanchen merged commit be03870 into develop Dec 8, 2025
25 checks passed