Commit 9c17b96

njzjz and Copilot authored
fix(dpmodel): fix normalize scale of initial parameters (#4774)
The current scale is too large. This PR makes it consistent with PT.

## Summary by CodeRabbit

- **Refactor**
  - Improved the initialization of certain neural network parameters for enhanced stability and consistency. No changes to user-facing functionality.

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn>
Signed-off-by: Jinzhe Zeng <njzjz@qq.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent 881d95e · commit 9c17b96

File tree

1 file changed (+12 −3 lines changed)

deepmd/dpmodel/utils/network.py

Lines changed: 12 additions & 3 deletions
@@ -105,9 +105,18 @@ def __init__(
         # only use_timestep when skip connection is established.
         use_timestep = use_timestep and (num_out == num_in or num_out == num_in * 2)
         rng = np.random.default_rng(seed)
-        self.w = rng.normal(size=(num_in, num_out)).astype(prec)
-        self.b = rng.normal(size=(num_out,)).astype(prec) if bias else None
-        self.idt = rng.normal(size=(num_out,)).astype(prec) if use_timestep else None
+        scale_factor = 1.0 / np.sqrt(num_out + num_in)
+        self.w = rng.normal(size=(num_in, num_out), scale=scale_factor).astype(prec)
+        self.b = (
+            rng.normal(size=(num_out,), scale=scale_factor).astype(prec)
+            if bias
+            else None
+        )
+        self.idt = (
+            rng.normal(size=(num_out,), scale=scale_factor).astype(prec)
+            if use_timestep
+            else None
+        )
         self.activation_function = (
             activation_function if activation_function is not None else "none"
         )
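For context, here is a minimal standalone sketch of why the unit-scale draw was too large. It is not part of the commit; the sizes `num_in = 64`, `num_out = 128`, the batch size, and the seed are illustrative. It shows that a standard-normal weight matrix inflates pre-activations by roughly sqrt(num_in), while the Glorot-style scale 1/sqrt(num_in + num_out) used in the diff keeps them near unit order:

```python
import numpy as np

# Illustrative sizes; not taken from the commit.
num_in, num_out = 64, 128
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, num_in))  # inputs with roughly unit variance

# Old initialization: standard normal, std = 1.
w_old = rng.normal(size=(num_in, num_out))

# New initialization: Glorot-style scale, as in the diff above.
scale_factor = 1.0 / np.sqrt(num_out + num_in)
w_new = rng.normal(size=(num_in, num_out), scale=scale_factor)

# Pre-activation spread: ~sqrt(num_in) = 8 before the fix,
# ~sqrt(num_in / (num_in + num_out)) ~= 0.58 after it.
print((x @ w_old).std())
print((x @ w_new).std())
```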
