docfx/articles/Samples/Kaleidoscope-ch2.md (78 additions & 10 deletions)
@@ -1,14 +1,13 @@
1
-
# Implementing the parser
1
+
# 2. Kaleidoscope: Implementing the parser
2
2
The Llvm.NET version of Kaleidoscope leverages ANTLR4 to parse the language into a parse tree.
3
3
The chapter 2 sample doesn't actually generate any code. Instead it focuses on the general structure
4
4
of the samples and parsing of the language. The sample for this chapter enables all language features
5
5
to allow exploring the language and how it is parsed to help understand the rest of the chapters
6
-
better. It is hoped that users of this library find this helpful as understanding the language grammar
6
+
better. It is hoped that users of this library find this helpful. Understanding the language grammar
7
7
from reading the LLVM tutorials and source was a difficult task since it isn't formally defined in one
8
8
place. (There are some EBNF-like comments in the code, but they are spread around without much real discussion
9
9
of the language the tutorials guide you to implement.)
10
10
11
-
12
11
## Formal Grammar
13
12
### Lexer symbols
14
13
@@ -78,10 +77,10 @@ for the language. Subsequent chapters will introduce the meaning and use of each
78
77
79
78
#### Language Feature Defined Keywords
80
79
Chapters 5-7 each introduce new language features that add new keywords to the language. In order to
81
-
maintain a single grammar for all chapters the lexer uses a technique of ANTLR4 called semantic predicates.
82
-
These are basically boolean expressions that determine if a given rule should be applied. These are applied
83
-
to the rules for the feature specific keywords. Thus, at runtime, if a given feature is disabled then the
84
-
keyword is not recognized.
80
+
maintain a single grammar for all chapters, the lexer uses an ANTLR4 technique called [Semantic Predicates](https://github.com/antlr/antlr4/blob/master/doc/predicates.md).
81
+
These are basically boolean expressions that determine if a given rule should be applied while parsing the
82
+
input language. These are applied to the rules for the feature-specific keywords. Thus, at runtime, if a given
83
+
feature is disabled then the keyword is not recognized.
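
As a rough illustration of the idea (not the ANTLR4-generated lexer itself), the following Python sketch gates keyword recognition on a feature flag. The `Features` class, the `KEYWORD_RULES` table, the guarding of `then` and `else`, and the fallback to an `IDENTIFIER` token are all assumptions made for this example; only the `FeatureControlFlow` predicate name comes from the grammar.

```python
from dataclasses import dataclass


@dataclass
class Features:
    # Hypothetical flag standing in for the grammar's FeatureControlFlow predicate.
    control_flow: bool = False


# Each keyword maps to (token name, predicate). The predicate plays the role
# of the `{...}?` semantic predicate that guards the lexer rule.
KEYWORD_RULES = {
    "if": ("IF", lambda f: f.control_flow),
    "then": ("THEN", lambda f: f.control_flow),
    "else": ("ELSE", lambda f: f.control_flow),
}


def classify(word: str, features: Features) -> str:
    """Return the token name for `word` given the enabled features."""
    rule = KEYWORD_RULES.get(word)
    if rule is not None:
        token, predicate = rule
        if predicate(features):
            return token
    # Assumption: when the predicate fails, the text falls through to an
    # ordinary identifier rule.
    return "IDENTIFIER"
```

With the flag enabled, `classify("if", Features(control_flow=True))` yields the keyword token; with the default `Features()` the same text is treated like any other identifier.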
85
84
86
85
```antlr
87
86
IF: {FeatureControlFlow}? 'if';
@@ -266,9 +265,78 @@ for the call `foo(1, 2, 3);` is:
266
265
267
266

268
267
269
-
###### Other expressions
270
-
The other expressions are either simple tokens like `Identifier` and `Number` or more complex expressions that are
271
-
covered in detail in later chapters.
268
+
###### VarInExpression
269
+
```antlr
270
+
VAR initializer (COMMA initializer)* IN expression[0]
271
+
```
272
+
The VarInExpression rule provides variable declaration, with optional initialization. The scope of the
273
+
variables is that of the expression on the right of the `in` keyword. The `var ... in ...` expression is
274
+
in many ways like a declaration of an inline function. The variables declared are scoped to the internal
275
+
implementation of the function. Once the function produces the return value the variables no longer exist.
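
The inline-function analogy can be sketched in Python, used here only for illustration. The helper name `var_in` is hypothetical, and the sketch assumes every variable has a plain value as its initializer.

```python
def var_in(body, **initializers):
    """Evaluate `body` with the declared variables bound only for its duration."""
    # The variables exist solely as parameters of `body`; once it returns
    # they are gone, much like `var a = 1, b = 2 in a + b`.
    return body(**initializers)


# Roughly corresponds to: var a = 1, b = 2 in a + b
result = var_in(lambda a, b: a + b, a=1.0, b=2.0)
```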
276
+
277
+
###### ConditionalExpression
278
+
```antlr
279
+
IF expression[0] THEN expression[0] ELSE expression[0]
280
+
```
281
+
Conditional expressions use the very common and familiar if-then-else syntax and semantics with one notable
282
+
unique quality. In Kaleidoscope every language construct is an expression; there are no statements. Expressions
283
+
all produce a value. So the result of the conditional expression is the result of the sub expression selected
284
+
based on the condition. The condition value is computed and if the result == 0.0 (false) the `else` expression
285
+
is used to produce the final result. Otherwise, the `then` expression is executed to produce the result. Thus,
286
+
the actual semantics are more like the ternary operator found in C and other languages:
287
+
```C
288
+
condition ? thenExpression : elseExpression
289
+
```
290
+
291
+
Example:
292
+
```Kaleidoscope
293
+
def fib(x)
294
+
if x < 3 then
295
+
1
296
+
else
297
+
fib(x-1)+fib(x-2);
298
+
```
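
The selection rule described above can be mimicked in Python for illustration. This is a sketch of the semantics only, not the code the sample generates. The helpers `conditional` and `less_than` are hypothetical, and the branches are passed as callables on the assumption that only the selected sub expression is evaluated.

```python
def conditional(condition, then_expr, else_expr):
    # A condition equal to 0.0 counts as false and selects the else branch.
    return else_expr() if condition == 0.0 else then_expr()


def less_than(lhs, rhs):
    # Assumption: comparisons yield 1.0 for true and 0.0 for false,
    # since every Kaleidoscope value is a floating point number.
    return 1.0 if lhs < rhs else 0.0


def fib(x):
    # Mirrors: if x < 3 then 1 else fib(x-1)+fib(x-2)
    return conditional(less_than(x, 3.0),
                       lambda: 1.0,
                       lambda: fib(x - 1.0) + fib(x - 2.0))
```

Because the whole `conditional(...)` call is itself a value, it can be returned directly from `fib`, just as the Kaleidoscope function body is a single expression.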
299
+
###### ForInExpression
300
+
The ForInExpression provides support for classic for loop constructs. In particular, it provides a variable scope for a loop
301
+
value, a condition to test when to exit the loop and an optional step value for incrementing the loop value (default is 1.0).
302
+
303
+
```Kaleidoscope
304
+
extern putchard(char);
305
+
def printstar(n)
306
+
for i = 1, i < n, 1.0 in
307
+
putchard(42); # ascii 42 = '*'
308
+
309
+
# print 100 '*' characters
310
+
printstar(100);
311
+
```
312
+
Note: There are no statements in Kaleidoscope; everything is an expression and has a value. putchard() implicitly returns a
313
+
value, as does printstar(). (That is, there is no void return; ALL functions implicitly return a floating point value, even if it is always 0.0.)
314
+
When the language supports mutable variables, code using for loops may produce a result that isn't always 0.0, for example:
315
+
316
+
```Kaleidoscope
317
+
# Define ':' for sequencing: as a low-precedence operator that ignores operands
318
+
# and just returns the RHS.
319
+
def binary : 1 (x y) y;
320
+
321
+
# Recursive fib, we could do this before.
322
+
def fib(x)
323
+
if (x < 3) then
324
+
1
325
+
else
326
+
fib(x-1)+fib(x-2);
327
+
328
+
# Iterative fib.
329
+
def fibi(x)
330
+
var a = 1, b = 1, c in
331
+
(for i = 3, i < x in
332
+
c = a + b :
333
+
a = b :
334
+
b = c) :
335
+
b;
336
+
337
+
# Call it.
338
+
fibi(10);
339
+
```
272
340
273
341
#### Parse Tree
274
342
The Llvm.NET implementation of Kaleidoscope doesn't use an AST per se. Instead it uses the parse tree generated