Commit a7d018e

Merge branch 'main' into u/xiaoyun/auto-featurizer
2 parents 9891b13 + 83beea1 commit a7d018e

30 files changed, +7958 -3073 lines changed

README.md

Lines changed: 37 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -4,9 +4,14 @@ Welcome to the home of .NET interactive notebooks for C#!
44

55
## How to Install
66

7+
### VS Code
78
1. Download the .NET Coding Pack for VS Code for [Windows](https://aka.ms/dotnet-coding-pack-win) or [macOS](https://aka.ms/dotnet-coding-pack-mac).
89
2. Install the [.NET Interactive Notebooks](https://marketplace.visualstudio.com/items?itemName=ms-dotnettools.dotnet-interactive-vscode) extension.
910

11+
### Visual Studio
12+
1. Download and install [Visual Studio 2022](https://visualstudio.microsoft.com/downloads/)
13+
2. Download and install [Notebook Editor Extension](https://marketplace.visualstudio.com/items?itemName=MLNET.notebook)
14+
1015
For more information and resources, visit [Learn to code C#](https://dotnet.microsoft.com/learntocode).
1116

1217
## C# 101
@@ -31,6 +36,38 @@ Download or clone this repo and open the `csharp-101` folder in VS Code to get s
3136
14 | Methods and Members | [14 Notebook](https://ntbk.io/csharp101-notebook14) | [14 Video](https://www.youtube.com/watch?v=xLhm3bEG__c&list=PLdo4fOcmZ0oVxKLQCHpiUWun7vlJJvUiN&index=17) | [Object Oriented Coding in C#](https://docs.microsoft.com/dotnet/csharp/fundamentals/tutorials/classes?WT.mc_id=csharpnotebook-35129-website)
3237
15 | Methods and Exceptions | [15 Notebook](https://ntbk.io/csharp101-notebook15) | [15 Video](https://www.youtube.com/watch?v=8YsoBBiVVzQ&list=PLdo4fOcmZ0oVxKLQCHpiUWun7vlJJvUiN&index=18) | [Object Oriented Coding in C#](https://docs.microsoft.com/dotnet/csharp/fundamentals/tutorials/classes?WT.mc_id=csharpnotebook-35129-website)
3338

39+
## Machine Learning
40+
41+
Download or clone this repo and open the `machine-learning` folder in Visual Studio 2022 to get started with the machine-learning notebooks. Or, if you prefer, just click one of the Notebook links below to open it automatically in Visual Studio!
42+
43+
**Links below require [Visual Studio 2022](https://visualstudio.microsoft.com/downloads/) and [Notebook Editor Extension](https://marketplace.visualstudio.com/items?itemName=MLNET.notebook) 0.3.4 or greater**
44+
45+
### Getting Started Series
46+
47+
| # | Topic | VS Notebook Link | GitHub Link |
48+
|---|--------------------------------------------|------------------------------------------------|-------------|
49+
1 | Intro to Machine Learning | [01 Notebook](https://ntbk.io/ml-01-intro) | [01 Notebook](https://github.com/dotnet/csharp-notebooks/blob/main/machine-learning/01-Intro%20to%20Machine%20Learning.ipynb)
50+
2 | Data Prep and Feature Engineering | [02 Notebook](https://ntbk.io/ml-02-data) | [02 Notebook](https://github.com/dotnet/csharp-notebooks/blob/main/machine-learning/02-Data%20Preparation%20and%20Feature%20Engineering.ipynb)
51+
3 | Training and AutoML | [03 Notebook](https://ntbk.io/ml-03-training) | [03 Notebook](https://github.com/dotnet/csharp-notebooks/blob/main/machine-learning/03-Training%20and%20AutoML.ipynb)
52+
4 | Model Evaluation | [04 Notebook](https://ntbk.io/ml-04-evaluation)| [04 Notebook](https://github.com/dotnet/csharp-notebooks/blob/main/machine-learning/04-Model%20Evaluation.ipynb)
53+
54+
### End to End (E2E) Notebooks - examples of the entire ML process.
55+
| # | Topic | VS Notebook Link | GitHub Link |
56+
|---|--------------------------------------------|---------------------------------------------------------------------------|-------------|
57+
E2E | Classification using AutoML (Iris Dataset) | [Iris E2E AutoML](https://ntbk.io/ml-e2e-iris) | [Iris E2E AutoML](https://github.com/dotnet/csharp-notebooks/blob/main/machine-learning/E2E-Classification%20with%20Iris%20Dataset.ipynb)
58+
E2E | Forecasting using Regression (Luna Dataset)| [Luna E2E Regression](https://ntbk.io/ml-e2e-luna-regression) | [Luna E2E Regression](https://github.com/dotnet/csharp-notebooks/blob/main/machine-learning/E2E-Forecasting%20using%20Regression%20with%20Luna%20Dataset.ipynb)
59+
E2E | Forecasting using SSA (Luna Dataset) | [Luna E2E SSA](https://ntbk.io/ml-e2e-luna-ssa) | [Luna E2E SSA](https://github.com/dotnet/csharp-notebooks/blob/main/machine-learning/E2E-Forecasting%20using%20SSA%20with%20Luna%20Dataset.ipynb)
60+
E2E | Regression using AutoML (Taxi Dataset) | [Taxi E2E AutoML](https://ntbk.io/ml-e2e-taxi) | [Taxi E2E AutoML](https://github.com/dotnet/csharp-notebooks/blob/main/machine-learning/E2E-Regression%20with%20Taxi%20Dataset.ipynb)
61+
E2E | Text Classification API (Yelp Dataset) | [Text Classification API](https://ntbk.io/ml-e2e-text-classification-api) | [Text Classification API](https://github.com/dotnet/csharp-notebooks/blob/main/machine-learning/E2E-Text-Classification-API-with-Yelp-Dataset.ipynb)
62+
63+
64+
### Reference Notebooks
65+
| # | Topic | VS Notebook Link | GitHub Link |
66+
|---|--------------------------------------------|-------------------------------------------------------|-------------|
67+
REF | Data Processing with DataFrame |[Data Frame](https://ntbk.io/ml-ref-data-frame) | [Data Frame](https://github.com/dotnet/csharp-notebooks/blob/main/machine-learning/REF-Data%20Processing%20with%20DataFrame.ipynb)
68+
REF | Graphs and Visualizations |[Visualizations](https://ntbk.io/ml-ref-visualizations)| [Visualizations](https://github.com/dotnet/csharp-notebooks/blob/main/machine-learning/REF-Graphs%20and%20Visualizations.ipynb)
69+
REF | Kaggle Competitions (Titanic Dataset) |[Kaggle](https://ntbk.io/ml-ref-kaggle-titanic) | [Kaggle](https://github.com/dotnet/csharp-notebooks/blob/main/machine-learning/REF-Kaggle%20with%20Titanic%20Dataset.ipynb)
70+
3471
## .NET Foundation
3572

3673
.NET Interactive Notebooks for C# is a [.NET Foundation](https://www.dotnetfoundation.org/projects) project.

machine-learning/01-Intro to Machine Learning.ipynb

Lines changed: 48 additions & 17 deletions
@@ -4,7 +4,12 @@
44
"cell_type": "markdown",
55
"metadata": {},
66
"source": [
7-
"# Introduction to Machine Learning"
7+
"# Introduction to Machine Learning\n",
8+
"\n",
9+
"In this notebook we'll cover: \n",
10+
"- What is Machine Learning?\n",
11+
"- What are the steps that need to be performed?\n",
12+
"- 'Hello ML.NET World' - training your first ML.NET model. "
813
]
914
},
1015
{
@@ -21,7 +26,7 @@
2126
"\n",
2227
"The above code shows how to consume a model that's already been trained. The end result of training a model is a function: you pass it some data (`HouseData.Size`) and it gives you back a prediction (`Prediction.Price`). \n",
2328
"\n",
24-
"The above is a simple example (probably too simple) but models can take in many more values. For instance - [Value Prediction(Regression) with Taxi Dataset](https://raw.githubusercontent.com/dotnet/csharp-notebooks/fa302c12c7494e5f8a5fdbe5d8283d8ff1fb7009/machine-learning/E2E-Value%20Prediction(Regression)%20with%20Taxi%20Dataset.ipynb) -- \n",
29+
"The above is a simple example (probably too simple) but models can take in many more values. For instance - [Value Prediction/Regression with Taxi Dataset](https://ntbk.io/ml-e2e-taxi) -- \n",
2530
"is a more complex example that takes in `vendor_id`, `rate_code`, `passenger_count`, `trip_time_in_secs`, `trip_distance`, `payment_type` and then predicts `fare_amount`.\n",
2631
"\n",
2732
"### How do you create that function?\n",
@@ -43,14 +48,14 @@
4348
" >You can think of this as a fancy for-loop to just try all the options. Our AutoML is a bit smarter than this ... but that is essentially what it does!\n",
4449
" >\n",
4550
" > For the example below we'll train a specific algorithm - so you can see how that works!\n",
46-
" 1. Pick a Task - [ML.NET Tasks](https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/tasks)\n",
47-
" 1. Pick an Algorithm - [ML.NET Algorithms](https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-choose-an-ml-net-algorithm)\n",
48-
" 1. Set Algorithm Parameters [Glossary - Hyperparameters](https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/glossary#hyperparameter)\n",
51+
" 1. Pick a Task - [ML.NET Tasks](https://docs.microsoft.com/dotnet/machine-learning/resources/tasks)\n",
52+
" 1. Pick an Algorithm - [ML.NET Algorithms](https://docs.microsoft.com/dotnet/machine-learning/how-to-choose-an-ml-net-algorithm)\n",
53+
" 1. Set Algorithm Parameters [Glossary - Hyperparameters](https://docs.microsoft.com/dotnet/machine-learning/resources/glossary#hyperparameter)\n",
4954
" 1. Train -\n",
5055
"       This is where the data actually gets fed to the algorithm to train the model. This can take some time depending on the amount of data, the algorithm, and the parameters for that algorithm.\n",
5156
"\n",
5257
"1. **Evaluate** \n",
53-
"   Once you've trained a model - how do you know it works? There are a bunch of techniques to evaluate your model's performance. If you'd like to take a deeper dive, check out [Evaluation Metrics](https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/metrics). Otherwise we'll give examples throughout these tutorials.\n",
58+
"   Once you've trained a model - how do you know it works? There are a bunch of techniques to evaluate your model's performance. If you'd like to take a deeper dive, check out [Evaluation Metrics](https://docs.microsoft.com/dotnet/machine-learning/resources/metrics). Otherwise we'll give examples throughout these tutorials.\n",
5459
"1. **Deploy** \n",
5560
"   After you've trained a model ... it's just .NET code! Build it and ship it, however you currently deploy your application."
5661
]
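
The note in the hunk above describes AutoML as "a fancy for-loop to just try all the options". A toy sketch of that analogy in plain C# (notebook/script style) follows. It is not how ML.NET AutoML is implemented: the data, the candidate weights, and the names `MeanSquaredError` and `BestWeight` are all hypothetical, chosen only to show "try each option, keep the best one".

```csharp
using System;

// Toy data where y is roughly 1.1 * x (illustrative values, not from the notebook).
double[] xs = { 1.0, 2.0, 3.0, 4.0 };
double[] ys = { 1.1, 2.2, 3.2, 4.5 };

// Mean squared error of the one-parameter model y = w * x.
double MeanSquaredError(double w, double[] x, double[] y)
{
    double sum = 0;
    for (int i = 0; i < x.Length; i++)
    {
        double error = y[i] - w * x[i];
        sum += error * error;
    }
    return sum / x.Length;
}

// The "fancy for-loop": score every candidate and keep the one with the lowest error.
double BestWeight(double[] candidates, double[] x, double[] y)
{
    double best = candidates[0];
    double bestError = double.MaxValue;
    foreach (double w in candidates)
    {
        double error = MeanSquaredError(w, x, y);
        if (error < bestError)
        {
            best = w;
            bestError = error;
        }
    }
    return best;
}

// Hypothetical options to try; a real search would cover algorithms and hyperparameters.
double[] candidates = { 0.5, 0.9, 1.0, 1.1, 1.5 };
Console.WriteLine($"Best candidate weight: {BestWeight(candidates, xs, ys)}");
```

Here the only "option" is a single weight. The same loop shape applies when each candidate is a different algorithm or parameter setting and the score comes from an evaluation metric.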
@@ -60,7 +65,7 @@
6065
"metadata": {},
6166
"source": [
6267
"## How do I get started?\n",
63-
"Below we have a quick introduction to ML.NET - \"Hello ML.NET World!\" and the next three Notebooks in the series take a deep dive into [Data Prep and Feature Engineering](https://raw.githubusercontent.com/dotnet/csharp-notebooks/fa302c12c7494e5f8a5fdbe5d8283d8ff1fb7009/machine-learning/02-Data%20Preparation%20and%20Feature%20Engineering.ipynb), [Training and AutoML](https://raw.githubusercontent.com/JakeRadMSFT/csharp-notebooks/main/machine-learning/03-Training%20and%20AutoML.ipynb), and [Model Evaluation](https://raw.githubusercontent.com/dotnet/csharp-notebooks/fa302c12c7494e5f8a5fdbe5d8283d8ff1fb7009/machine-learning/04-Model%20Evaluation.ipynb) "
68+
"Below we have a quick introduction to ML.NET - \"Hello ML.NET World!\" and the next three Notebooks in the series take a deep dive into [Data Prep and Feature Engineering](https://ntbk.io/ml-02-data), [Training and AutoML](https://ntbk.io/ml-03-training), and [Model Evaluation](https://ntbk.io/ml-04-evaluation) "
6469
]
6570
},
6671
{
@@ -84,9 +89,18 @@
8489
}
8590
},
8691
"source": [
87-
"#r \"nuget: Microsoft.ML, 1.7.1\""
92+
"#r \"nuget: Microsoft.ML, 2.0.0-preview.22356.1\""
8893
],
89-
"outputs": []
94+
"outputs": [
95+
{
96+
"output_type": "execute_result",
97+
"data": {
98+
"text/html": "<div><div></div><div></div><div><strong>Installed Packages</strong><ul><li><span>Microsoft.ML, 2.0.0-preview.22313.1</span></li></ul></div></div>"
99+
},
100+
"execution_count": 1,
101+
"metadata": {}
102+
}
103+
]
90104
},
91105
{
92106
"cell_type": "markdown",
@@ -113,7 +127,7 @@
113127
"cell_type": "markdown",
114128
"metadata": {},
115129
"source": [
116-
"Now we are ready to write the code to achieve the machine learning task we need to do. Always start with creating the [MLContext](https://docs.microsoft.com/dotnet/api/microsoft.ml.mlcontext?ranMID=43674&ranEAID=rl2xnKiLcHs&ranSiteID=rl2xnKiLcHs-LuTsrQLVgyEOYaht34D47g&epi=rl2xnKiLcHs-LuTsrQLVgyEOYaht34D47g&irgwc=1&OCID=AID2200057_aff_7795_1243925&tduid=(ir__2m3q0nl02wkf6gcatnnkkvci0e2xvxwafx3xgf9200)(7795)(1243925)(rl2xnKiLcHs-LuTsrQLVgyEOYaht34D47g)()&irclickid=_2m3q0nl02wkf6gcatnnkkvci0e2xvxwafx3xgf9200&view=ml-dotnet) which is the common context for all ML.NET operations"
130+
"Now we are ready to write the code to achieve the machine learning task we need to do. Always start with creating the [MLContext](https://docs.microsoft.com/dotnet/api/microsoft.ml.mlcontext?view=ml-dotnet) which is the common context for all ML.NET operations"
117131
]
118132
},
119133
{
@@ -225,7 +239,7 @@
225239
"cell_type": "markdown",
226240
"metadata": {},
227241
"source": [
228-
"Now we have the data ready, next we'll create the ML.NET pipeline specifying the trainer we are going to use to build our machine learning model. For house price prediction, we are going to use the regression trainer. ML.NET supports other machine learning trainers which can be used for other scenarios as needed. The pipeline will create what is called [Estimator](https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.iestimator-1?view=ml-dotnet) which used to define teh operations applied to the data"
242+
"Now that we have the data ready, we'll create the ML.NET pipeline, specifying the trainer we are going to use to build our machine learning model. For house price prediction, we are going to use a regression trainer. ML.NET supports other machine learning trainers which can be used for other scenarios as needed. The pipeline will create what is called an [Estimator](https://docs.microsoft.com/dotnet/api/microsoft.ml.iestimator-1?view=ml-dotnet), which is used to define the operations applied to the data"
229243
]
230244
},
231245
{
@@ -247,7 +261,7 @@
247261
"cell_type": "markdown",
248262
"metadata": {},
249263
"source": [
250-
"After creating the estimator, we are ready to apply the transformations and trainer defined in the pipeline to the data. To do that, call the [Fit](https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.iestimator-1.fit?ranMID=43674&ranEAID=rl2xnKiLcHs&ranSiteID=rl2xnKiLcHs-G8Db905fJ0jxggGna1mdkw&epi=rl2xnKiLcHs-G8Db905fJ0jxggGna1mdkw&irgwc=1&OCID=AID2200057_aff_7795_1243925&tduid=(ir__2m3q0nl02wkf6gcatnnkkvci0e2xvx1gft3xgf9200)(7795)(1243925)(rl2xnKiLcHs-G8Db905fJ0jxggGna1mdkw)()&irclickid=_2m3q0nl02wkf6gcatnnkkvci0e2xvx1gft3xgf9200&view=ml-dotnet) method."
264+
"After creating the estimator, we are ready to apply the transformations and trainer defined in the pipeline to the data. To do that, call the [Fit](https://docs.microsoft.com/dotnet/api/microsoft.ml.iestimator-1.fit?view=ml-dotnet) method."
251265
]
252266
},
253267
{
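
The two markdown cells above describe building an estimator pipeline and calling `Fit` on it. The notebook's own code cells are unchanged by this commit and so do not appear in the diff. The following is a minimal sketch of those steps for a .NET Interactive cell, under stated assumptions: the sample values are made up, the `Sdca` regression trainer is one possible choice (the diff does not show which trainer the notebook uses), and the package version is the older `1.7.1` that appears in the removed line.

```csharp
#r "nuget: Microsoft.ML, 1.7.1"

using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Input row: one feature (Size) and the label to learn (Price).
public class HouseData
{
    public float Size { get; set; }
    public float Price { get; set; }
}

// MLContext is the entry point for all ML.NET operations; a fixed seed is assumed here for repeatability.
var mlContext = new MLContext(seed: 0);

// Illustrative training rows (assumed values, not the notebook's data).
HouseData[] houseData =
{
    new HouseData { Size = 1.1F, Price = 1.2F },
    new HouseData { Size = 1.9F, Price = 2.3F },
    new HouseData { Size = 2.8F, Price = 3.0F },
    new HouseData { Size = 3.4F, Price = 3.7F }
};
IDataView trainingData = mlContext.Data.LoadFromEnumerable(houseData);

// The estimator: gather the input columns into "Features", then append a regression trainer.
var pipeline = mlContext.Transforms.Concatenate("Features", new[] { "Size" })
    .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: "Price", maximumNumberOfIterations: 100));

// Fit runs the transformations and the trainer over the data and returns the trained model (a transformer).
var model = pipeline.Fit(trainingData);

// A trained model can transform data; regression trainers write their prediction to the "Score" column.
IDataView scored = model.Transform(trainingData);
float[] scores = scored.GetColumn<float>("Score").ToArray();
Console.WriteLine($"Scored {scores.Length} rows");
```

Nothing is trained until `Fit` is called. The pipeline itself only describes the operations, which is why the same estimator can be fitted again on different data.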
@@ -309,14 +323,22 @@
309323
"// Print the R-squared value. The closer it is to 1, the better the model fits.\n",
310324
"Console.WriteLine($\"Coefficient of determination for the trained model: {trainedModelMetrics.RSquared:0.00}\");"
311325
],
312-
"outputs": []
326+
"outputs": [
327+
{
328+
"output_type": "execute_result",
329+
"data": {
330+
"text/plain": "Coefficient of determination for the trained model: 0.97\r\n"
331+
},
332+
"execution_count": 1,
333+
"metadata": {}
334+
}
335+
]
313336
},
314337
{
315338
"cell_type": "markdown",
316339
"metadata": {},
317340
"source": [
318-
"Now we have the trained model ready for prediction. Let's use this model to predict a sample house price. We do that by creating the the prediction engine [PredictionEngine<TSrc,TDst>](https://docs.microsoft.com/dotnet/api/microsoft.ml.predictionengine-2?view=ml-dotnet). The prediction engine is the class for making single predictions on a previously trained model (and preceding transform pipeline). Creation of the prediction engine from the trained mode can be done by the following code:\n",
319-
""
341+
"Now we have the trained model ready for prediction. Let's use this model to predict a sample house price. We do that by creating the prediction engine [PredictionEngine<TSrc,TDst>](https://docs.microsoft.com/dotnet/api/microsoft.ml.predictionengine-2?view=ml-dotnet). The prediction engine is the class for making single predictions on a previously trained model (and preceding transform pipeline). The prediction engine can be created from the trained model with the following code:"
320342
]
321343
},
322344
{
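
The evaluation cell in this hunk prints the coefficient of determination (R-squared), shown as `0.97` in the captured output. To make clear what that number measures, here is a small sketch that computes R-squared by hand using the standard definition, 1 minus the ratio of residual to total sum of squares. The helper name `RSquared` and the sample values are hypothetical. In the notebook the value comes from the `RSquared` property of the evaluation metrics rather than from code like this.

```csharp
using System;
using System.Linq;

// R^2 = 1 - SS_res / SS_tot, where
//   SS_res = sum of squared differences between actual and predicted values
//   SS_tot = sum of squared differences between actual values and their mean
// Note: if all actual values are identical, SS_tot is 0 and the result is not meaningful.
double RSquared(double[] actual, double[] predicted)
{
    if (actual.Length == 0 || actual.Length != predicted.Length)
        throw new ArgumentException("Arrays must be non-empty and of equal length.");

    double mean = actual.Average();
    double ssRes = actual.Zip(predicted, (a, p) => (a - p) * (a - p)).Sum();
    double ssTot = actual.Sum(a => (a - mean) * (a - mean));
    return 1.0 - ssRes / ssTot;
}

// Illustrative values only: predictions that are each off by 0.1.
double[] actual = { 1.2, 2.3, 3.0, 3.7 };
double[] predicted = { 1.3, 2.2, 3.1, 3.6 };
Console.WriteLine($"R-squared: {RSquared(actual, predicted):0.00}");
```

A value of 1 means the predictions match the actual values exactly. A value of 0 means the model does no better than always predicting the mean.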
@@ -352,7 +374,16 @@
352374
"var price = predictionEngine.Predict(size);\n",
353375
"Console.WriteLine($\"Predicted price for size: {size.Size*1000} sq ft= {price.Price*100:C}k\");"
354376
],
355-
"outputs": []
377+
"outputs": [
378+
{
379+
"output_type": "execute_result",
380+
"data": {
381+
"text/plain": "Predicted price for size: 2500 sq ft= $275.59k\r\n"
382+
},
383+
"execution_count": 1,
384+
"metadata": {}
385+
}
386+
]
356387
},
357388
{
358389
"cell_type": "markdown",
@@ -369,7 +400,7 @@
369400
"source": [
370401
"# Continue learning\n",
371402
"\n",
372-
"> [⏩ Next Module - Data Prep and Feature Engineering](https://raw.githubusercontent.com/dotnet/csharp-notebooks/fa302c12c7494e5f8a5fdbe5d8283d8ff1fb7009/machine-learning/02-Data%20Preparation%20and%20Feature%20Engineering.ipynb)"
403+
"> [⏩ Next Module - Data Prep and Feature Engineering](https://ntbk.io/ml-02-data)"
373404
]
374405
}
375406
],
