Commit 735277b

Merge pull request #98 from hongyanwang/master
update docs and readme
2 parents e7090a6 + 695ae2a commit 735277b

13 files changed: 31 additions, 22 deletions

README.md

Lines changed: 6 additions & 3 deletions
@@ -1,4 +1,4 @@
-[DOC](https://paddledtx.readthedocs.io/zh_CN/latest) | [中文](./README_CN.md) | English
+[DOC](https://paddledtx.readthedocs.io/zh_CN/v2.0.0) | [中文](./README_CN.md) | English
 
 [![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](LICENSE)
 
@@ -24,9 +24,12 @@ Currently, XuperChain is the only blockchain framework that PaddleDTX supported.
 ![Image text](./images/architecture.png)
 
 ## Vertical Federated Learning
-The open source version of PaddleDTX supports two-party vertical federated learning(VFL) algorithms, including Linear Regression and Logistic Regression, more algorithms such as two-party Neural Network will be open sourced soon, along with multi-party VFL and multi-party HFL(horizontal federated learning) algorithms. Please refer to [crypto/ml](./crypto/core/machine_learning) for more about background and implementation of these two algorithms.
+The open source version of PaddleDTX supports vertical federated learning(VFL) algorithms, including two-party Linear Regression, two-party Logistic Regression and three-party DNN(Deep Neural Networks).
+Please refer to [crypto/ml](./crypto/core/machine_learning) for more about background and implementation of two-party VFL algorithms.
+The DNN implementation relies on the [PaddleFL](https://github.com/PaddlePaddle/PaddleFL) framework and all neural network models provided by PaddleFL can be used in PaddleDTX.
+More algorithms will be open sourced soon, including multi-party VFL and multi-party HFL(horizontal federated learning) algorithms.
 
-Training and predicting steps of VFL are shown as follows:
+Take two-party VFL algorithms as an example, training and prediction steps are shown as follows:
 
 ![Image text](./images/vertical_learning.png)
 

README_CN.md

Lines changed: 5 additions & 3 deletions
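@@ -1,4 +1,4 @@
-[DOC](https://paddledtx.readthedocs.io/zh_CN/latest) | [English](./README.md) | 中文
+[DOC](https://paddledtx.readthedocs.io/zh_CN/v2.0.0) | [English](./README.md) | 中文
 
 [![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](LICENSE)
 
@@ -24,9 +24,11 @@ SMPC is a framework that supports running multiple learning processes in parallel, and will gradually integrate more
 ![Image text](./images/architecture.png)
 
 ## 2. Vertical Federated Learning
-The open source part of PaddleDTX currently supports two-party vertical federated learning algorithms, including multivariate linear regression and multivariate logistic regression. For the principles and implementation of the algorithms, see [crypto/ml](./crypto/core/machine_learning). More two-party vertical federated learning algorithms, as well as multi-party vertical and horizontal federated learning algorithms, will be supported in the future.
+The open source part of PaddleDTX currently supports vertical federated learning algorithms, including two-party multivariate linear regression and multivariate logistic regression, and three-party neural networks. For the principles and implementation of the two-party vertical federated learning algorithms, see [crypto/ml](./crypto/core/machine_learning)
+The neural network implementation relies on the [PaddleFL framework](https://github.com/PaddlePaddle/PaddleFL), and all neural network models provided by PaddleFL can be used.
+PaddleDTX will support more vertical and horizontal federated learning algorithms in the future.
 
-The training and prediction steps of vertical federated learning are as follows:
+Taking the two-party case as an example, the training and prediction steps of vertical federated learning are as follows:
 
 ![Image text](./images/vertical_learning.png)
 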

crypto/README.md

Lines changed: 2 additions & 2 deletions
@@ -27,12 +27,12 @@ The model is based on multivariate linear regression model. It is continuously d
 The closer to 1, the greater the possibility it is the specified value. The training process is to look for optimal coefficients θ by iteration to ensure errors on training samples is as small as possible.
 
 ## Vertical Federated Learning Algorithms
-The project currently supported two-party vertical federated learning protocol.
+The project currently implemented two-party vertical federated learning protocol.
 In training process, each party calculates partial gradient and cost using own samples. Intermediate parameters are exchanged and integrated to obtain each party's model without leaking any data confidentiality.
 In prediction process, each party calculate local result using own model and deduce final result by the sum of all partial results.
 
 Two parties' sample numbers in training or prediction process may be different.
-Samples need to be aligned by ID list of each party. Please referr to [psi](./core/machine_learning/linear_regression/gradient_descent/mpc_vertical/psi.go) for more details about sample alignment.
+Samples need to be aligned by ID list of each party. Please refer to [psi](./core/machine_learning/linear_regression/gradient_descent/mpc_vertical/psi.go) for more details about sample alignment.
 
 The vertical federated learning steps of linear and logistic regression are shown as follows, suppose sample alignment has already been finished:
 
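Note on the hunk above: the prediction rule it describes (each party computes a local result with its own model, and the final result is derived from the sum of the partial results) can be illustrated with a minimal sketch. This is not the project's implementation. The function names, the sample values, and the choice to add the intercept at combination time are assumptions made for illustration, and the partial results are combined in the clear here, whereas the README states that the real protocol exchanges intermediate parameters without leaking data.

```python
import math


def partial_prediction(features, weights):
    """Local result one party computes from its own feature columns (hypothetical helper)."""
    return sum(x * w for x, w in zip(features, weights))


def combine_linear(partials, intercept=0.0):
    """Linear regression: the final result is the intercept plus the sum of partial results."""
    return intercept + sum(partials)


def combine_logistic(partials, intercept=0.0):
    """Logistic regression: apply y = 1 / (1 + e^(-z)) to the summed partial results."""
    z = intercept + sum(partials)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical aligned sample: party A holds two features, party B holds one.
part_a = partial_prediction([2.0, 1.0], [0.5, -1.0])
part_b = partial_prediction([4.0], [0.25])

linear_result = combine_linear([part_a, part_b], intercept=1.0)
logistic_result = combine_logistic([part_a, part_b], intercept=-1.0)
```

Because each party only multiplies its own features by its own coefficients, neither party needs the other's raw columns to contribute to the prediction; only the combination step sees values from both sides.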

crypto/README_CN.md

Lines changed: 1 addition & 1 deletion
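@@ -23,7 +23,7 @@ y = 1 / (1 + e<sup>-&theta;X</sup>)
 This model is derived from the linear regression model. It is continuously differentiable and guarantees that the target feature is a value in (0,1); the closer to 1, the greater the probability that the sample is the specified value. The learning process iteratively finds suitable parameters &theta; so that the model's error on the training set is as small as possible.
 
 ## 2. Vertical Federated Learning Algorithms
-The project currently supports two-party vertical federated learning algorithms. During training, both parties use their own sample data to compute partial gradients and losses, exchange and integrate intermediate parameters, and compute their own models without leaking private data. During prediction, each party uses its own model to compute a partial prediction, and the final prediction is derived from the sum of the partial predictions.
+The project implements two-party vertical federated learning algorithms. During training, both parties use their own sample data to compute partial gradients and losses, exchange and integrate intermediate parameters, and compute their own models without leaking private data. During prediction, each party uses its own model to compute a partial prediction, and the final prediction is derived from the sum of the partial predictions.
 
 During training and prediction, the two parties may have different numbers of samples, so the data must be aligned by each party's data IDs. For the principles and implementation, see [private set intersection](./core/machine_learning/linear_regression/gradient_descent/mpc_vertical/psi.go)
 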
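Note on the hunk above: the alignment requirement it mentions (the two parties may hold different numbers of samples, so rows must be matched by ID) can be pictured with the sketch below. It uses a plain set intersection and is therefore NOT private set intersection: both ID lists are visible to whoever runs it. The linked `psi.go` is the place to look for the privacy-preserving version; the function name and data here are hypothetical.

```python
def align_samples(rows_a, rows_b):
    """Keep only samples whose ID appears on both sides, in one shared order.

    rows_a / rows_b: dict mapping sample ID -> list of that party's feature values.
    Plain (non-private) intersection, for illustration only.
    """
    shared_ids = sorted(rows_a.keys() & rows_b.keys())
    aligned_a = [rows_a[i] for i in shared_ids]
    aligned_b = [rows_b[i] for i in shared_ids]
    return shared_ids, aligned_a, aligned_b


# Hypothetical data: three samples per party, only two IDs in common.
party_a = {"id3": [3.0], "id1": [1.0], "id2": [2.0]}
party_b = {"id2": [20.0], "id4": [40.0], "id1": [10.0]}

ids, xa, xb = align_samples(party_a, party_b)
```

After alignment, row k on each side refers to the same sample, which is what the training and prediction steps assume.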

docs/README.md

Lines changed: 2 additions & 2 deletions
@@ -10,8 +10,8 @@
 2. Start the service:
 ```
 cd docs
-mkdoc server
+mkdocs serve
 ```
 
-3. View the site on [`localhost:8000`](https://localhost:8000).
+3. View the site on [`localhost:8000`](http://localhost:8000).
 

docs/source/details/DAI.md

Lines changed: 1 addition & 1 deletion
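@@ -16,7 +16,7 @@ The secure multi-party computation framework implemented by PaddleDTX has the following features:
 - Supports model evaluation and dynamic model evaluation
 - Backed by blockchain, privacy-preserving computation, and ACL technologies to ensure the privacy and trustworthiness of data and models
 
-<img src='../../_static/smpc.png' width = "100%" height = "100%" align="middle"/>
+<img src='../../_static/smpc.png' width = "70%" height = "70%" align="middle"/>
 
 ## 3. Trusted Federated Learning
 In PaddleDTX, federated learning is divided into a training process and a prediction process. The computation requester publishes a training task, the task executor nodes carry out a data-trustworthiness endorsement with the data owner nodes, and this triggers the training process, which eventually yields a model that meets the requirements. When prediction is needed, the computation requester publishes a prediction task, the task executor nodes carry out a data-trustworthiness endorsement with the data owner nodes, and this triggers the prediction process, which eventually yields the prediction result. The currently integrated algorithms, with their principles and implementations, are covered in more detail in the [crypto](./crypto.md#id2) section.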

docs/source/details/XuperDB.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ XuperDB is highly secure, highly available, and auditable:
 ## 3. Architecture Design
 The XuperDB system architecture is shown in the figure below:
 
-<img src='../../_static/xdb.png' width = "100%" height = "100%" align="middle"/>
+<img src='../../_static/xdb.png' width = "80%" height = "80%" align="middle"/>
 
 The XuperDB network consists of three types of nodes:
 

docs/source/details/crypto.md

Lines changed: 8 additions & 4 deletions
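@@ -16,7 +16,7 @@ The crypto module of PaddleDTX implements several machine learning algorithms and the corresponding distributed
 - **Federated transfer learning**: the participants' samples and features both overlap little. In this scenario the data is not partitioned; instead, transfer learning is used to overcome the shortage of data or labels.
 
 ## 2. Machine Learning Algorithms
-PaddleDTX has open sourced multivariate linear regression and multivariate logistic regression algorithms; more machine learning algorithms such as decision trees and neural networks will be open sourced soon
+PaddleDTX has open sourced multivariate linear regression, multivariate logistic regression, and neural network algorithms; more machine learning algorithms such as decision trees will be open sourced soon
 
 ### 2.1 Multivariate Linear Regression
 Multivariate linear regression describes scenarios in which a variable is influenced by multiple factors and their relationship can be expressed by a multivariate linear equation. For example, house prices are influenced by factors such as house size, number of floors, and the surrounding environment.
@@ -36,12 +36,16 @@ y = 1 / (1 + e<sup>-&theta;X</sup>)
 
 This model is derived from the linear regression model. It is continuously differentiable and guarantees that the target feature is a value in (0,1); the closer to 1, the greater the probability that the sample is the specified value. The learning process iteratively finds suitable parameters &theta; so that the model's error on the training set is as small as possible.
 
+### 2.3 Neural Networks
+A neural network is a computational model made up of a large number of interconnected nodes (also called neurons); in theory it can approximate any function.
+Defining and training neural network models involves many standard algorithms and procedures, which gave rise to deep learning frameworks. The neural network implementation in PaddleDTX relies on the widely used [PaddleFL framework](https://github.com/PaddlePaddle/PaddleFL/blob/master/README_cn.md), and all neural network models provided by PaddleFL can be used.
+
 ## 3. Vertical Federated Learning
-PaddleDTX has already open sourced two-party vertical federated learning algorithms, including multivariate linear regression and multivariate logistic regression. Algorithms for multi-party horizontal and multi-party vertical federated learning will be open sourced soon, so stay tuned.
+PaddleDTX currently open sources two-party vertical federated learning algorithms, including multivariate linear regression and multivariate logistic regression. Algorithms for multi-party horizontal and multi-party vertical federated learning will be open sourced soon, so stay tuned.
 
-The vertical federated training and prediction steps are as follows
+The vertical federated training and prediction steps for multivariate linear regression and multivariate logistic regression are as follows
 
-<img src='../../_static/vertical_learning.png' width = "100%" height = "100%" align="middle"/>
+<img src='../../_static/vertical_learning.png' width = "80%" height = "80%" align="middle"/>
 
 ### 3.1 Data Preparation
 A computation task specifies the participants' sample data, which is stored in the decentralized storage system (XuperDB). Before the task starts, each computing party (that is, each data owner) needs to fetch its own sample data from XuperDB.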

docs/source/details/framework.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 
 PaddleDTX mainly consists of computation requesters, task executor nodes, data owner nodes, storage nodes, and blockchain nodes. The deployment architecture is shown in the figure below:
 
-<img src='../../_static/deployment.png' width = "100%" height = "100%" align="middle"/>
+<img src='../../_static/deployment.png' width = "60%" height = "60%" align="middle"/>
 
 !!! note "Note"
 

docs/source/index.md

Lines changed: 1 addition & 1 deletion
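@@ -31,7 +31,7 @@
 <div class="card-holder container">
 <div class="card rocket container">
 <p class="introtitle">Use Cases</p>
-<p class="introcontent">Test cases can be used to evaluate the results of model training and prediction. PaddleDTX provides a linear regression algorithm based on Boston housing price prediction and a logistic regression algorithm based on the iris dataset.</p>
+<p class="introcontent">Test cases can be used to evaluate the results of model training and prediction. PaddleDTX provides linear regression and neural network algorithms based on Boston housing price prediction and a logistic regression algorithm based on the iris dataset.</p>
 <p class="introdetails"><b><a href="./projectcases/linear">View details</a></b></p>
 </div>
 </div>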
