7 | 7 | "# Text Classification API (preview)" |
8 | 8 | ] |
9 | 9 | }, |
| 10 | + { |
| 11 | + "cell_type": "markdown", |
| 12 | + "metadata": {}, |
| 13 | + "source": [ |
| 14 | + "## What is text classification?\n", |
| 15 | + "\n", |
| 16 | + "Text classification as the name implies is the process of applying labels or categories to text.\n", |
| 17 | + "\n", |
| 18 | + "Common use cases include:\n", |
| 19 | + "\n", |
| 20 | + "- Categorizing e-mail as spam or not spam\n", |
| 21 | + "- Analyzing sentiment as positive or negative from customer reviews\n", |
| 22 | + "- Applying labels to support tickets\n", |
| 23 | + "\n", |
| 24 | + "## Solving text classification with machine learning\n", |
| 25 | + "\n", |
| 26 | + "Classification is a common problem in machine learning. There are a variety of algorithms you can use to train a classification model. Text classification is a subcategory of classification which deals specifically with raw text. Text poses interesting challenges because you have to account for the context and semantics in which the text occurs. As such, encoding meaning and context can be difficult. In recent years, deep learning models have emerged as a promising technique to solve natural language problems. More specifically, a type of neural network known as transformers has become the predominant way of solving natural language problems like text classification, translation, summarization, and question answering.\n", |
| 27 | + "\n", |
| 28 | + "Transformers were introduced in the paper [Attention is all you need](https://arxiv.org/abs/1706.03762). Some popular transformer architectures for natural language tasks include:\n", |
| 29 | + "\n", |
| 30 | + "- Bidirectional Encoder Representations from Transformers (BERT)\n", |
| 31 | + "- Robustly Optimized BERT Pretraining Approach (RoBERTa)\n", |
| 32 | + "- Generative Pre-trained Transformer 2 (GPT-2)\n", |
| 33 | + "- Generative Pre-trained Transformer 3 (GPT-3)\n", |
| 34 | + "\n", |
| 35 | + "At a high level, transformers are a model architecture consisting of encoding and decoding layers. The encoder takes raw text as input and maps the input to a numerical representation (including context) to produce features. The decoder uses information from the encoder to produce output such as a category or label in the case of text classification. What makes these layers so special is the concept of attention. Attention is the idea of focusing on specific parts of an input based on the importance of their context in relation to other inputs in a sequence. For example, let's say I'm categorizing news articles based on the headline. Not all words in the headline are relevant. In a headline like \"Auto sales are at an all-time high\", a word like \"sales\" might get more attention and lead to labeling the article as business or finance. \n", |
| 36 | + "\n", |
| 37 | + "Like most neural networks, training transformers from scratch can be expensive because they require large amounts of data and compute. However, you don't always have to train from scratch. Using a technique known as fine-tuning you can take a pre-trained model and retrain the layers specific to your domain or problem using your own data. This gives you the benefit of having a model that's more tailored to solve your problem without having to go through the process of training the entire model from scratch. \n", |
| 38 | + "\n", |
| 39 | + "## The Text Classification API (preview)\n", |
| 40 | + "\n", |
| 41 | + "Now that you have a general overview of how text classification problems can be solved using deep learning, let's take a look at how we've incorporated many of these techniques into the Text Classification API.\n", |
| 42 | + "\n", |
| 43 | + "The Text Classification API is powered by [TorchSharp](https://github.com/dotnet/TorchSharp). TorchSharp is a .NET library that provides access to libtorch, the library that powers PyTorch. TorchSharp contains the building blocks for training neural networks from scratch in .NET. The TorchSharp components however are low-level and building neural networks from scratch has a steep learning curve. In ML.NET, we've abstracted some of that complexity to the scenario level." |
| 44 | + ] |
| 45 | + }, |
10 | 46 | { |
11 | 47 | "cell_type": "markdown", |
12 | 48 | "metadata": {}, |