
leCunNormal initializer #8594

@vmukhachev

Description


tf.moments(tf.initializers.leCunNormal().apply([1, 1, 1, 1000000], 'float32')).variance.print()
prints something like:
0.7735087871551514

while it should be close to 1.0 (for this shape the fan-in is 1, so the LeCun target variance 1 / fanIn is 1.0).

The stddev passed to the truncated normal distribution should be scaled by 1.0 / 0.87962566103423978, as is done in https://github.com/keras-team/tf-keras/blob/v2.19.0/tf_keras/initializers/initializers.py#L662 .
This is because a truncated normal distribution has a different stddev-to-variance relationship than a regular normal distribution: cutting off the tails at 2 standard deviations shrinks the variance of the samples.
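The mismatch can be reproduced directly with the underlying op, independent of the initializer (a minimal check, assuming a Node environment with @tensorflow/tfjs installed):

```js
const tf = require('@tensorflow/tfjs');

// Samples from a truncated normal with stddev = 1 have variance ~ 0.774, not 1,
// because values farther than 2 stddev from the mean are re-drawn.
tf.moments(tf.truncatedNormal([1000000], 0, 1)).variance.print(); // ~0.7737
```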
The correction constant (0.87962566103423978 in this case) is the standard deviation of a unit normal truncated at ±x standard deviations and can be calculated as:
sqrt(1 - 2 * x * exp(-0.5 * x**2) / (sqrt(2 * pi) * erf(x / sqrt(2)))), where x is the truncation bound in units of stddev (x = 2 in the code).
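That constant can also be checked without an erf implementation by integrating the truncated density numerically (a quick sketch in plain JavaScript, since Math.erf does not exist):

```js
// Stddev of a unit normal truncated at +/- 2: integrate x^2 * pdf(x) over [-2, 2],
// divide by the probability mass in that interval, and take the square root.
const pdf = x => Math.exp(-0.5 * x * x) / Math.sqrt(2 * Math.PI);

function trapezoid(f, a, b, n = 1000000) {
  const h = (b - a) / n;
  let sum = 0.5 * (f(a) + f(b));
  for (let i = 1; i < n; i++) sum += f(a + i * h);
  return sum * h;
}

const mass = trapezoid(pdf, -2, 2);                         // ~0.9545, i.e. erf(2 / sqrt(2))
const secondMoment = trapezoid(x => x * x * pdf(x), -2, 2); // ~0.7385
console.log(Math.sqrt(secondMoment / mass));                // ~0.879625661...
```

The ratio secondMoment / mass ≈ 0.7737 is exactly the variance the repro above prints.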

Unfortunately, the documentation also ignores this fact, so it is not clear whether the code or the docs should be changed.
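Until this is fixed, a possible workaround is to build the initializer by hand and apply the correction yourself. This is only a sketch, not the library's API: correctedLeCunNormal and the explicit fanIn argument are hypothetical, and fanIn would normally be derived from the weight shape.

```js
const tf = require('@tensorflow/tfjs');

// Hypothetical helper: a LeCun-style truncated-normal initializer with the stddev
// pre-divided by the truncation correction, so the sampled values actually end up
// with variance ~ 1 / fanIn.
function correctedLeCunNormal(fanIn, seed) {
  const targetStd = Math.sqrt(1 / fanIn);
  const correction = 0.87962566103423978; // stddev of a unit normal truncated at +/- 2
  return tf.initializers.truncatedNormal({mean: 0, stddev: targetStd / correction, seed});
}

// Sanity check: for the shape from the repro, fanIn is 1, so the variance
// should now come out close to 1.0.
const init = correctedLeCunNormal(1);
tf.moments(init.apply([1, 1, 1, 1000000], 'float32')).variance.print();
```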
