PaddlePaddle 3.2.2 Release Note

Released by XiaoguangHu01 on 08 Dec 01:34 (commit d90d3ad)

Important Updates

PaddlePaddle 3.2.2 delivers optimizations and upgrades in three areas (distributed parallelism, operator mechanisms, and hardware adaptation) to further improve the framework's performance and stability.

1. Distributed Training

  • Optimized the communication flow for re-sharding in FlexCheckpoint; added a full interface to paddle.nn.Layer that returns the complete model parameters; added support for loading checkpoints in the HuggingFace open-source format. (#76249, #76291)
  • Added a sharded_state_dict function to the group_sharded_optimizer_stage2 optimizer. #76311
  • Fixed a device_id parameter error and a core dump that occurred when loading safetensors files with paddle.load. #76317
  • Introduced the PipelineDatasetPreprocessor mechanism to eliminate potential memory leaks under the pipeline parallelism strategy. #76260
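
For readers unfamiliar with checkpoint re-sharding, the idea can be sketched in plain Python: a parameter saved as N shards is reassembled and split again for M ranks. This is only a conceptual illustration and not FlexCheckpoint's implementation; the helper names split_even and reshard are hypothetical, and the sketch gathers the whole parameter, whereas an optimized implementation would typically move only the slices each rank needs.

```python
from typing import List


def split_even(flat: List[float], parts: int) -> List[List[float]]:
    """Split a flat parameter into contiguous shards whose sizes differ by at most 1."""
    base, extra = divmod(len(flat), parts)
    shards, start = [], 0
    for rank in range(parts):
        size = base + (1 if rank < extra else 0)  # first `extra` ranks get one more element
        shards.append(flat[start:start + size])
        start += size
    return shards


def reshard(shards: List[List[float]], new_parts: int) -> List[List[float]]:
    """Naive re-shard: gather the full parameter, then split it for the new rank count."""
    full = [x for shard in shards for x in shard]
    return split_even(full, new_parts)


# Hypothetical scenario: a checkpoint written by 4 ranks is resumed on 2 ranks.
saved = split_even([float(i) for i in range(10)], 4)
loaded = reshard(saved, 2)
print([len(s) for s in saved], [len(s) for s in loaded])
```

The gather-then-split approach is the simplest correct baseline; reducing the data exchanged between ranks is where communication optimizations usually apply.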

2. Operator Mechanisms

  • Fixed a precision issue in to_tensor when handling BFloat16 lists. #76242
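
The release note does not state the root cause of this issue. As general background on why BFloat16 conversion is sensitive to precision, the standard-library sketch below converts a float32 value to BFloat16 in two ways: truncating the low 16 bits, and rounding to nearest even. The function names are hypothetical and this is not Paddle's code; NaN and infinity handling is omitted.

```python
import struct


def f32_bits(x: float) -> int:
    """Return the IEEE 754 float32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits: int) -> float:
    """Interpret an unsigned 32-bit integer as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def to_bf16_truncate(x: float) -> float:
    """BFloat16 keeps the top 16 bits of float32; here the low 16 bits are simply dropped."""
    return bits_to_f32(f32_bits(x) & 0xFFFF0000)


def to_bf16_round(x: float) -> float:
    """Round to nearest even before dropping the low 16 bits (NaN/inf not handled)."""
    bits = f32_bits(x)
    bias = 0x7FFF + ((bits >> 16) & 1)  # ties go to the even upper half
    return bits_to_f32(((bits + bias) & 0xFFFFFFFF) & 0xFFFF0000)


value = 1.005859375  # 1 + 2**-8 + 2**-9, exactly representable in float32
print(to_bf16_truncate(value), to_bf16_round(value))
```

With only 7 explicit mantissa bits, the two strategies can land on different neighboring BFloat16 values, so the conversion path chosen for list inputs can visibly change results.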

3. Hardware Adaptation

  • Updated the standalone XPU memory monitoring module to keep it consistent with the latest memory monitoring logic. #76056

4. List of Contributors

qw86972190, xingmingyyj, zhangbo9674, zhangyuqin1998