|
59 | 59 | "metadata": {}, |
60 | 60 | "source": [ |
61 | 61 | "## Example 1: Linear regression\n", |
62 | | - "In the below section, we are going to show the difference of trainers via a simple linear regression task, we firstly fit the linear dataset with `SDCA`, a linear trainer. And then with `LightGbm`, a tree-base non-linear trainer. And compare their performance on test dataset. The code below does\n", |
63 | | - "- Create linear dataset and split it into train/test part\n", |
64 | | - "- Create pipelines using `SDCA` and `LightGbm`\n", |
65 | | - "- Train both `SDCA` and `LightGbm` on linear trainset, and evaluate them on testset." |
| 62 | + "In the below section, we are going to show the difference of trainers via a linear regression task. First, we fit the linear dataset with the linear trainer, `SDCA`. Then we git the linear dataset with `LightGbm`, a tree-base non-linear trainer. Their performance is evaluated against a test dataset. The code below:\n", |
| 63 | + "- Creates a linear dataset and splits it into train/test sets\n", |
| 64 | + "- Create training pipelines using `SDCA` and `LightGbm`\n", |
| 65 | + "- Trains both `SDCA` and `LightGbm` on the linear training set, and evaluates them on the test set." |
66 | 66 | ] |
67 | 67 | }, |
68 | 68 | { |
|
161 | 161 | "metadata": {}, |
162 | 162 | "source": [ |
163 | 163 | "## Create linear dataset\n", |
164 | | - "The code below artifacts linear dataset with a random residual, and loaded it as train/test `DataFrame`" |
| 164 | + "The code below creates a linear dataset with a random residual. The dataset is loaded into train and test DataFrames" |
165 | 165 | ] |
166 | 166 | }, |
167 | 167 | { |
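For readers following along outside the notebook, here is a minimal sketch of what such a generator could look like with `Microsoft.Data.Analysis`. The column names (`X`, `Label`), the linear coefficients, the residual scale, and the row counts are illustrative assumptions, not the notebook's actual values:

```csharp
using System;
using System.Linq;
using Microsoft.Data.Analysis;

var rnd = new Random(0);

// Hypothetical linear relationship: Label = 3 * X + 2 + uniform noise.
DataFrame MakeLinearDf(int rowCount)
{
    var x = Enumerable.Range(0, rowCount).Select(_ => (float)(rnd.NextDouble() * 100)).ToArray();
    var y = x.Select(v => 3f * v + 2f + (float)((rnd.NextDouble() - 0.5) * 10)).ToArray();
    return new DataFrame(
        new SingleDataFrameColumn("X", x),
        new SingleDataFrameColumn("Label", y));
}

// Separate draws stand in for a train/test split.
var trainDf = MakeLinearDf(800);
var testDf = MakeLinearDf(200);
```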
|
203 | 203 | "metadata": {}, |
204 | 204 | "source": [ |
205 | 205 | "## Construct pipeline\n", |
206 | | - "The code below shows how to construct pipelines for both `SDCA` and `LightGbm`. The `Concatenate` transformer is necessary because it transfer a `single` column into `Vector<single>` type, which is the accepted feature type for both `SDCA` and `LightGbm` regressor." |
| 206 | + "The code below shows how to construct training pipelines for both `SDCA` and `LightGbm`. The `Concatenate` transformer is required to convert a `single` column into `Vector<single>` type, which is the expected feature type for both `SDCA` and `LightGbm` regressor." |
207 | 207 | ] |
208 | 208 | }, |
209 | 209 | { |
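As a rough sketch, assuming an `MLContext` named `context` and the hypothetical `X`/`Label` columns from the earlier sketch (`Features` is the conventional output column name), the two pipelines could be built like this:

```csharp
using Microsoft.ML;

var context = new MLContext(seed: 0);

// Concatenate turns the scalar "X" column into the Vector<single> "Features" column.
var featurizer = context.Transforms.Concatenate("Features", "X");

// Linear trainer.
var sdcaPipeline = featurizer.Append(
    context.Regression.Trainers.Sdca(labelColumnName: "Label", featureColumnName: "Features"));

// Tree-based, non-linear trainer (needs the Microsoft.ML.LightGbm package).
var lgbmPipeline = featurizer.Append(
    context.Regression.Trainers.LightGbm(labelColumnName: "Label", featureColumnName: "Features"));
```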
|
230 | 230 | "metadata": {}, |
231 | 231 | "source": [ |
232 | 232 | "## Train and evaluate model\n", |
233 | | - "The code below first trains `sdcaPipeline` and `lgbmPipeline` which are created above, then evaluate their performance on test dataset by calcuate `Root Mean Square Loss` between predicted and truth value. We can see that `SDCA` has better performance with a significant lower `Root Mean Square Loss` comparing with `LightGbm`, even it's a simple, linear model. This is because the training dataset is also linear, so `SDCA` can fit the dataset better than `LightGbm`." |
| 233 | + "The code below first trains `sdcaPipeline` and `lgbmPipeline` which are created above, then evaluate their performance on test dataset by calculating `Root Mean Square Loss` between predicted and actual value. `SDCA` has better performance with a significantly lower `Root Mean Square Loss` compared to `LightGbm` even though it's a simpler linear model. This is because the training dataset is also linear, so `SDCA` can fit the dataset better than `LightGbm`." |
234 | 234 | ] |
235 | 235 | }, |
236 | 236 | { |
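A sketch of the train/evaluate step, reusing the hypothetical `trainDf`/`testDf` and pipelines above (`RootMeanSquaredError` is the corresponding metric on ML.NET's `RegressionMetrics`):

```csharp
// Fit each pipeline on the training data, then score the held-out test data.
var sdcaModel = sdcaPipeline.Fit(trainDf);
var lgbmModel = lgbmPipeline.Fit(trainDf);

var sdcaMetrics = context.Regression.Evaluate(sdcaModel.Transform(testDf), labelColumnName: "Label");
var lgbmMetrics = context.Regression.Evaluate(lgbmModel.Transform(testDf), labelColumnName: "Label");

Console.WriteLine($"SDCA     RMSE: {sdcaMetrics.RootMeanSquaredError}");
Console.WriteLine($"LightGbm RMSE: {lgbmMetrics.RootMeanSquaredError}");
```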
|
281 | 281 | "metadata": {}, |
282 | 282 | "source": [ |
283 | 283 | "## Example 2: Non-linear regression on LightGbm.\n", |
284 | | - "This example is to show the importance of hyper-parameter optimization. We will first create a non-linear dataset and two pipelines. One pipeline has `LightGbm` with `numberOfLeaves` set to 10, the other's set to 1000. Then train both pipelines with the same train dataset and Comparing their training performance by evaluating them on the same test dataset." |
| 284 | + "This example shows the importance of hyper-parameter optimization. First we create a non-linear dataset and two pipelines. One pipeline has `LightGbm` with `numberOfLeaves` set to `10`, the other's set to `1000`. Both pipelines are trained with the same training dataset and their training performance is evaluated on the same test dataset." |
285 | 285 | ] |
286 | 286 | }, |
287 | 287 | { |
288 | 288 | "cell_type": "markdown", |
289 | 289 | "metadata": {}, |
290 | 290 | "source": [ |
291 | 291 | "## Create non-linear dataset\n", |
292 | | - "The code below artifacts non-linear dataset with a random residual, and loaded it as train/test `DataFrame`" |
| 292 | + "The code below creates a non-linear dataset with a random residual. The dataset is loaded into train and test DataFrames" |
293 | 293 | ] |
294 | 294 | }, |
295 | 295 | { |
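Following the same pattern as the linear sketch earlier, a non-linear generator might use, say, a sine term; again the column names, the function, and the row counts are illustrative assumptions:

```csharp
// Hypothetical non-linear relationship: Label = 50 * sin(X) + uniform noise.
DataFrame MakeNonLinearDf(int rowCount)
{
    var x = Enumerable.Range(0, rowCount).Select(_ => (float)(rnd.NextDouble() * 100)).ToArray();
    var y = x.Select(v => 50f * (float)Math.Sin(v) + (float)((rnd.NextDouble() - 0.5) * 10)).ToArray();
    return new DataFrame(
        new SingleDataFrameColumn("X", x),
        new SingleDataFrameColumn("Label", y));
}

var nonLinearTrainDf = MakeNonLinearDf(800);
var nonLinearTestDf = MakeNonLinearDf(200);
```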
|
331 | 331 | "metadata": {}, |
332 | 332 | "source": [ |
333 | 333 | "## Construct pipeline\n", |
334 | | - "The code below shows how to construct pipelines for `LightGbm` with different hyper parameters. The `Concatenate` transformer is necessary because it transfer a `single` column into `Vector<single>` type, which is the accepted feature type for `LightGbm` regressor." |
| 334 | + "The code below shows how to construct training pipelines for `LightGbm` with different hyper-parameters. The `Concatenate` transformer is required because it converts a `single` column into `Vector<single>` type, which is the expected feature type for the `LightGbm` trainer." |
335 | 335 | ] |
336 | 336 | }, |
337 | 337 | { |
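A sketch of the two pipelines, identical except for `numberOfLeaves` (reusing the hypothetical `featurizer`, `context`, and column names from before):

```csharp
// Same featurization; only the LightGbm numberOfLeaves hyper-parameter differs.
var smallLgbmPipeline = featurizer.Append(context.Regression.Trainers.LightGbm(
    labelColumnName: "Label", featureColumnName: "Features", numberOfLeaves: 10));

var largeLgbmPipeline = featurizer.Append(context.Regression.Trainers.LightGbm(
    labelColumnName: "Label", featureColumnName: "Features", numberOfLeaves: 1000));
```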
|
358 | 358 | "metadata": {}, |
359 | 359 | "source": [ |
360 | 360 | "## Train and evaluate model\n", |
361 | | - "The code below first trains `smallLgbmPipeline` and `largeLgbmPipeline` which are created above, then evaluate their performance on test dataset by calcuate `Root Mean Square Loss` between predicted and truth value. We can see that large lgbm has better performance with a lower rmse." |
| 361 | + "The code below first trains `smallLgbmPipeline` and `largeLgbmPipeline` which are created above, then evaluates their performance on the test dataset by calculating the `Root Mean Square Loss` between predicted and actual value. The model created by `largeLgbmPipeline` has better performance with a lower RMSE." |
362 | 362 | ] |
363 | 363 | }, |
364 | 364 | { |
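The train/evaluate step follows the same shape as in Example 1, just with the hypothetical non-linear data and the two LightGbm pipelines:

```csharp
var smallModel = smallLgbmPipeline.Fit(nonLinearTrainDf);
var largeModel = largeLgbmPipeline.Fit(nonLinearTrainDf);

var smallRmse = context.Regression.Evaluate(smallModel.Transform(nonLinearTestDf), labelColumnName: "Label").RootMeanSquaredError;
var largeRmse = context.Regression.Evaluate(largeModel.Transform(nonLinearTestDf), labelColumnName: "Label").RootMeanSquaredError;

Console.WriteLine($"numberOfLeaves=10   RMSE: {smallRmse}");
Console.WriteLine($"numberOfLeaves=1000 RMSE: {largeRmse}");
```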
|
400 | 400 | "metadata": {}, |
401 | 401 | "source": [ |
402 | 402 | "## Use AutoML to simplify hyper-parameter optimization.\n", |
403 | | - "Hyper-parameter optimization is tedious while important, and it's also something that can be done automatically. Using built-in `AutoMLExperiment` can greatly simplify hpo process. `AutoMLExperiment` applies the latest research from MSR so it can conduct swift, accurate and thorough hyper-parameter optimization in a limited time budget.\n", |
| 403 | + "Hyper-parameter optimization is an important process with lots of trial and error. This process can be automated and simplified using the built-in `AutoMLExperiment`. `AutoMLExperiment` applies the latest research from Microsoft Research to conduct a swift, accurate and thorough hyper-parameter optimization given a limited time budget.\n", |
404 | 404 | "\n", |
405 | | - "The code below shows how to use `AutoMLExperiment` for hpo and explore a better configuration for `LightGbm` on the non-linear dataset used in Example 2." |
| 405 | + "The code below shows how to use `AutoMLExperiment` for HPO to explore a better configuration for `LightGbm` on the non-linear dataset used in Example 2." |
406 | 406 | ] |
407 | 407 | }, |
408 | 408 | { |
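A minimal sketch of what an `AutoMLExperiment` run could look like (requires the Microsoft.ML.AutoML package; the sweepable pipeline, column names, metric choice, and 60-second budget are illustrative assumptions, not the notebook's exact configuration):

```csharp
using Microsoft.ML.AutoML;

// Sweepable pipeline: AutoML searches trainer hyper-parameters instead of fixing numberOfLeaves by hand.
var sweepablePipeline = context.Transforms.Concatenate("Features", "X")
    .Append(context.Auto().Regression(labelColumnName: "Label"));

var experiment = context.Auto().CreateExperiment();
experiment
    .SetPipeline(sweepablePipeline)
    .SetDataset(nonLinearTrainDf, nonLinearTestDf)
    .SetRegressionMetric(RegressionMetric.RootMeanSquaredError)
    .SetTrainingTimeInSeconds(60); // time budget for the search

var result = await experiment.RunAsync();
Console.WriteLine($"Best RMSE found: {result.Metric}");
```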
|