docs/01-intro.md (3 additions & 1 deletion)
@@ -66,7 +66,9 @@ We must always be aware of the potential for harm and deliberately take steps to
Humans have been interacting with AI chatbots for years; in fact, Alan Turing is credited with coming up with the concept for chatbots as early as 1950. Chatbots are software-based systems that interact with humans, typically through text or speech inputs rather than code. They mimic some human activity [@wikipedia_chatbot_2023; @abdulla2022chatbots] based on these language inputs. They process the inputs using natural language processing, commonly abbreviated as NLP. NLP is a kind of AI that parses human text or speech to determine structures and patterns and extract meaning. NLP uses large amounts of language data (such as books, websites, etc.) to train AI systems to identify these structures and patterns. For example, the AI model might identify whether a sentence is a question or a statement by examining various features of a prompt, such as the inclusion of a question mark or the use of words often found in questions [@wikipedia_natural_2023; @cahn2017chatbot].
- The methods used for chatbots have evolved over time. Now chatbots often utilize AI methods like [deep learning](https://en.wikipedia.org/wiki/Deep_learning) (which involve multiple layers of abstractions of the input data [@wikipedia_deep_learning_2023]) to extract meaning from the language data [@wikipedia_natural_2023]. As these methods use large quantities of text, they are therefore often called large language models [@wikipedia_large_language_2023].
+ The methods used for chatbots have evolved over time. Now chatbots often utilize AI methods like [deep learning](https://en.wikipedia.org/wiki/Deep_learning) (which involve multiple layers of abstractions of the input data [@wikipedia_deep_learning_2023]) to extract meaning from the language data [@wikipedia_natural_2023]. As these methods use large quantities of text, they are often called large language models, or LLMs [@wikipedia_large_language_2023].
+
+ Although it might _seem_ like LLMs are talking to you when you interact with them, it's important to remember that they aren't actually thinking. Instead, LLMs simply put together tokens, or parts of words, based on statistical patterns derived from an LLM's training data set. Essentially, an LLM's program figures out how frequently (and in what contexts) different words show up together in the training data. For example, the word "example" is often paired with the word "for" in the text for this course. An LLM trained on this course would then be more likely to produce the phrase "for example" than the phrase "for apples", as the training data includes multiple instances of the first phrase but only one instance of the second. (To be precise, the LLM would predict the tokens "ex", "am", and "ple", but we see them as the word "example".) If you're interested in learning more, check out this excellent [visual article](https://ig.ft.com/generative-ai/) by the Financial Times (we are not affiliated with them).
Although chatbots have been around for a while, the popularity of OpenAI's ChatGPT and DALL-E programs has sparked a recent surge of interest. These chatbots are particularly powerful in part because large amounts of computing power were used to train their NLP models on very large datasets [@caldarini2022literature; @cahn2017chatbot]. Large language model AIs can be divided into two categories: those that can be reached using an internet browser, and those that can be reached using an integrated development environment (IDE).
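The word-frequency idea described above can be sketched with a toy example in R (a made-up six-word corpus for illustration; real LLMs train on subword tokens over vastly larger data sets):

```r
# Made-up toy corpus, echoing the "for example" vs "for apples" illustration
corpus <- c("for", "example", "for", "example", "for", "apples")

# Count how often each word follows the word "for"
positions <- which(head(corpus, -1) == "for")
freq <- table(corpus[positions + 1])
# freq: apples = 1, example = 2

# The most frequent continuation acts as the toy model's "prediction"
names(which.max(freq))  # "example"
```

Because "for example" occurs twice in this corpus but "for apples" only once, the toy model completes "for" with "example", mirroring the frequency-based reasoning above.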
docs/04-refactoring.md (3 additions & 3 deletions)
@@ -553,7 +553,7 @@ proc.time() - start_time
```
## user system elapsed
- ## 8.933 0.003 8.935
+ ## 11.963 0.004 11.966
```
:::{.query}
@@ -581,7 +581,7 @@ proc.time() - start_time
```
## user system elapsed
- ## 0.775 0.564 0.625
+ ## 0.644 0.304 0.645
```
The `outer()` function performs the same calculation as the nested loop in the original code, but more efficiently. It returns a matrix of all possible combinations of `x` and `y` values, with each element of the matrix being the product of the corresponding `x` and `y` values. The `rowSums()` function is then used to sum the elements of each row of the matrix, which is equivalent to summing the products of `x` and `y` for each index `i` in the original loop. This approach avoids the nested loop entirely, resulting in a faster computation.
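As a sketch of that refactoring (with made-up vectors `x` and `y`; the course's actual data and loop body differ):

```r
# Made-up example vectors (the real exercise uses different data)
x <- as.numeric(1:200)
y <- as.numeric(1:200)

# Original pattern: for each index i, accumulate the products x[i] * y[j]
sums_loop <- numeric(length(x))
for (i in seq_along(x)) {
  for (j in seq_along(y)) {
    sums_loop[i] <- sums_loop[i] + x[i] * y[j]
  }
}

# Refactored pattern: outer() builds the matrix of all pairwise products,
# and rowSums() collapses it to the same per-index sums in one vectorized step
sums_vectorized <- rowSums(outer(x, y))

all.equal(sums_loop, sums_vectorized)  # TRUE
```

The two results agree (up to floating-point summation order), while the vectorized version pushes the looping down into compiled code.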
@@ -609,7 +609,7 @@ proc.time() - start_time
```
## user system elapsed
- ## 0.361 0.299 0.263
+ ## 0.303 0.264 0.361
```
One optimized way to perform the same calculation is to use the `%*%` operator for matrix multiplication. This can be done by converting `x` and `y` to matrices and transposing one of them so that their dimensions align for matrix multiplication. This code should be much faster than the original implementation because it takes advantage of R's highly optimized matrix multiplication routines.
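A sketch of the `%*%` idea under the same assumptions (made-up vectors; the exact shape of the course's computation may differ): an n-by-1 matrix times a 1-by-n matrix yields the full matrix of pairwise products.

```r
# Made-up example vectors, as before
x <- as.numeric(1:200)
y <- as.numeric(1:200)

# matrix(x, ncol = 1) is n x 1 and t(y) is 1 x n, so %*% produces the
# n x n matrix of pairwise products x[i] * y[j] -- the same matrix that
# outer() builds, computed via R's optimized matrix-multiplication code
products <- matrix(x, ncol = 1) %*% t(y)

all.equal(products, outer(x, y))  # TRUE
```

From here the same `rowSums()` step (or further matrix algebra) recovers the per-index sums, trading a little readability for speed.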
<h2><span class="header-section-number">1.4</span> The AI Chatbots</h2>
<div class="warning">
<p>The information presented in this course is meant for use with open source code and software. It is unclear what happens to the information fed to AI chatbots as prompts, or how secure those data are. We know data are saved and may be used to further train the AI tools, but the specifics of how data are saved, as well as how sensitive or personally identifiable information is protected, are unknown.</p>
</div>
<p>So much faster! We can look at another option by regenerating the response:</p>
<p>One optimized way to perform the same calculation is by using the <code>%*%</code> operator to perform matrix multiplication. This can be done by converting x and y to matrices and transposing one of them so that their dimensions align for matrix multiplication. This code should be much faster than the original implementation because it takes advantage of highly optimized matrix multiplication algorithms in R.</p>
<p>While this second suggestion is faster, you will need to consider what aspects of the codebase are most important in each instance. For example, this code runs more quickly, but <ahref="https://stat.ethz.ch/R-manual/R-patched/library/base/html/matmult.html">the <code>%*%</code> operator</a> might be unfamiliar to some R programmers. In cases where efficiency is less important, or the data are not large, you might consider maximizing readability.</p>