llvm.org GIT mirror llvm / 5d82919
[Kaleidoscope] Add an initial "Building an ORC JIT" tutorial chapter. This is a work in progress - the chapter text is incomplete, though the example code compiles and runs. Feedback and patches are, as usual, most welcome. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270487 91177308-0d34-0410-b5e6-96231b3b80d8 Lang Hames 3 years ago
7 changed file(s) with 1612 addition(s) and 0 deletion(s).
0 =======================================================
1 Kaleidoscope: Building an ORC-based JIT in LLVM
2 =======================================================
3
4 .. contents::
5 :local:
6
7 **This tutorial is under active development. It is incomplete and details may
8 change frequently.** Nonetheless we invite you to try it out as it stands, and
9 we welcome any feedback.
10
11 Chapter 1 Introduction
12 ======================
13
14 Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This
15 tutorial runs through the implementation of a JIT compiler using LLVM's
16 On-Request-Compilation (ORC) APIs. It begins with a simplified version of the
17 KaleidoscopeJIT class used in the
18 `Implementing a language with LLVM `_ tutorials and then
19 introduces new features like optimization, lazy compilation and remote
20 execution.
21
22 The goal of this tutorial is to introduce you to LLVM's ORC JIT APIs, show how
23 these APIs interact with other parts of LLVM, and to teach you how to recombine
24 them to build a custom JIT that is suited to your use-case.
25
26 The structure of the tutorial is:
27
28 - Chapter #1: Investigate the simple KaleidoscopeJIT class. This will
29 introduce some of the basic concepts of the ORC JIT APIs, including the
30 idea of an ORC *Layer*.
31
32 - `Chapter #2 `_: Extend the basic KaleidoscopeJIT by adding
33 a new layer that will optimize IR and generated code.
34
35 - `Chapter #3 `_: Further extend the JIT by adding a
36 Compile-On-Demand layer to lazily compile IR.
37
38 - `Chapter #4 `_: Improve the laziness of our JIT by
39 replacing the Compile-On-Demand layer with a custom layer that uses the ORC
40 Compile Callbacks API directly to defer IR-generation until functions are
41 called.
42
43 - `Chapter #5 `_: Add process isolation by JITing code into
44 a remote process with reduced privileges using the JIT Remote APIs.
45
46 To provide input for our JIT we will use the Kaleidoscope REPL from
47 `Chapter 7 `_ of the "Implementing a language with LLVM" tutorial,
48 with one minor modification: We will remove the FunctionPassManager from the
49 code for that chapter and replace it with optimization support in our JIT class
50 in Chapter #2.
51
52 Finally, a word on API generations: ORC is the 3rd generation of LLVM JIT API.
53 It was preceded by MCJIT, and before that by the (now deleted) legacy JIT.
54 These tutorials don't assume any experience with these earlier APIs, but
55 readers acquainted with them will see many familiar elements. Where appropriate
56 we will make this connection with the earlier APIs explicit to help people who
57 are transitioning from them to ORC.
58
59 JIT API Basics
60 ==============
61
62 The purpose of a JIT compiler is to compile code "on-the-fly" as it is needed,
63 rather than compiling whole programs to disk ahead of time as a traditional
64 compiler does. To support that aim our initial, bare-bones JIT API will be:
65
66 1. Handle addModule(Module &M) -- Make the given IR module available for
67 execution.
68 2. JITSymbol findSymbol(const std::string &Name) -- Search for pointers to
69 symbols (functions or variables) that have been added to the JIT.
70 3. void removeModule(Handle H) -- Remove a module from the JIT, releasing any
71 memory that had been used for the compiled code.
72
73 A basic use-case for this API, executing the 'main' function from a module,
74 will look like:
75
76 .. code-block:: c++
77
78 std::unique_ptr<Module> M = buildModule();
79 JIT J;
80 Handle H = J.addModule(*M);
81 int (*Main)(int, char*[]) =
82 (int(*)(int, char*[]))J.findSymbol("main").getAddress();
83 int Result = Main(0, nullptr);
84 J.removeModule(H);
85
86 The APIs that we build in these tutorials will all be variations on this simple
87 theme. Behind the API we will refine the implementation of the JIT to add
88 support for optimization and lazy compilation. Eventually we will extend the
89 API itself to allow higher-level program representations (e.g. ASTs) to be
90 added to the JIT.
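
To make the handle lifecycle concrete, here is a minimal, self-contained sketch
that does not use LLVM at all. ToyJIT, ToyModule and runToyExample are
hypothetical names invented for illustration, and a "module" is reduced to a
table of symbol names and integer values. The sketch only mirrors the shape of
the three-method API above; a real JIT would compile and link code rather than
store values.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for an IR module: a table of symbol name -> value.
using ToyModule = std::map<std::string, int>;

// Hypothetical class mirroring the addModule / findSymbol / removeModule shape.
class ToyJIT {
public:
  using Handle = unsigned;

  // Take ownership of a module and return a handle that identifies it.
  Handle addModule(ToyModule M) {
    Handle H = NextHandle++;
    Modules.emplace(H, std::move(M));
    return H;
  }

  // Search every added module; return nullptr if the symbol is unknown.
  const int *findSymbol(const std::string &Name) const {
    for (const auto &Entry : Modules) {
      auto It = Entry.second.find(Name);
      if (It != Entry.second.end())
        return &It->second;
    }
    return nullptr;
  }

  // Drop the module; its symbols can no longer be found.
  void removeModule(Handle H) { Modules.erase(H); }

private:
  std::map<Handle, ToyModule> Modules;
  Handle NextHandle = 0;
};

// Add a module, look a symbol up, then remove the module again.
// Returns true if the symbol was visible only while the module was present.
bool runToyExample() {
  ToyJIT J;
  ToyJIT::Handle H = J.addModule({{"main", 42}});
  const int *Sym = J.findSymbol("main");
  bool FoundWhilePresent = Sym && *Sym == 42;
  J.removeModule(H);
  return FoundWhilePresent && J.findSymbol("main") == nullptr;
}
```

The point of the handle is that the client never needs to know how the JIT
stores a module internally; it only needs a token to hand back when the module
should be released.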
91
92 KaleidoscopeJIT
93 ===============
94
95 In the previous section we described our API. Now we examine a simple
96 implementation of it: The KaleidoscopeJIT class [1]_ that was used in the
97 `Implementing a language with LLVM `_ tutorials. We will use
98 the REPL code from `Chapter 7 `_ of that tutorial to supply the
99 input for our JIT: Each time the user enters an expression the REPL will add a
100 new IR module containing the code for that expression to the JIT. If the
101 expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also
102 use the findSymbol method of our JIT class to find and execute the code for the
103 expression, and then use the removeModule method to remove the code again
104 (since there's no way to re-invoke an anonymous expression). In later chapters
105 of this tutorial we'll modify the REPL to enable new interactions with our JIT
106 class, but for now we will take this setup for granted and focus our attention on
107 the implementation of our JIT itself.
108
109 Our KaleidoscopeJIT class is defined in the KaleidoscopeJIT.h header. After the
110 usual include guards and #includes [2]_, we get to the definition of our class:
111
112 .. code-block:: c++
113
114 #ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
115 #define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
116
117 #include "llvm/ExecutionEngine/ExecutionEngine.h"
118 #include "llvm/ExecutionEngine/RTDyldMemoryManager.h"
119 #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
120 #include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
121 #include "llvm/ExecutionEngine/Orc/LambdaResolver.h"
122 #include "llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h"
123 #include "llvm/IR/Mangler.h"
124 #include "llvm/Support/DynamicLibrary.h"
125
126 namespace llvm {
127 namespace orc {
128
129 class KaleidoscopeJIT {
130 private:
131
132 std::unique_ptr<TargetMachine> TM;
133 const DataLayout DL;
134 ObjectLinkingLayer<> ObjectLayer;
135 IRCompileLayer<decltype(ObjectLayer)> CompileLayer;
136
137 public:
138
139 typedef decltype(CompileLayer)::ModuleSetHandleT ModuleHandle;
140
141 Our class begins with four members: A TargetMachine, TM, which will be used
142 to build our LLVM compiler instance; A DataLayout, DL, which will be used for
143 symbol mangling (more on that later), and two ORC *layers*: An
144 ObjectLinkingLayer, and an IRCompileLayer. The ObjectLinkingLayer is the
145 foundation of our JIT: it takes in-memory object files produced by a
146 compiler and links them on the fly to make them executable. This
147 JIT-on-top-of-a-linker design was introduced in MCJIT, where the linker was
148 hidden inside the MCJIT class itself. In ORC we expose the linker as a visible,
149 reusable component so that clients can access and configure it directly
150 if they need to. In this tutorial our ObjectLinkingLayer will just be used to
151 support the next layer in our stack: the IRCompileLayer, which will be
152 responsible for taking LLVM IR, compiling it, and passing the resulting
153 in-memory object files down to the object linking layer below.
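
The stacking idea can be sketched without LLVM. In the following illustration
ToyLinkLayer, ToyCompileLayer and compileAndFetch are hypothetical names, and
"IR" and "object files" are plain strings. The sketch only shows the pattern:
the upper layer holds a reference to the layer below, transforms its input with
a compile functor, and passes the result down.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bottom layer: stands in for an object linking layer by simply
// recording the "object files" it is given.
class ToyLinkLayer {
public:
  using Handle = std::size_t;

  Handle addObject(std::string Obj) {
    Objects.push_back(std::move(Obj));
    return Objects.size() - 1;
  }

  const std::string &getObject(Handle H) const { return Objects.at(H); }

private:
  std::vector<std::string> Objects;
};

// Hypothetical upper layer: compiles its input, then forwards the result to
// the base layer it was constructed with.
template <typename BaseLayerT, typename CompileFtor>
class ToyCompileLayer {
public:
  ToyCompileLayer(BaseLayerT &Base, CompileFtor Compile)
      : Base(Base), Compile(std::move(Compile)) {}

  typename BaseLayerT::Handle addModule(const std::string &IR) {
    return Base.addObject(Compile(IR));
  }

private:
  BaseLayerT &Base;
  CompileFtor Compile;
};

// Build a two-layer stack, add one "module", and return what reached the
// bottom layer.
std::string compileAndFetch(const std::string &IR) {
  ToyLinkLayer Link;
  auto Compile = [](const std::string &S) { return "obj(" + S + ")"; };
  ToyCompileLayer<ToyLinkLayer, decltype(Compile)> CompileLayer(Link, Compile);
  auto H = CompileLayer.addModule(IR);
  return Link.getObject(H);
}
```

Because the upper layer is parameterized on the type of the layer beneath it,
a different bottom layer with the same interface could be substituted without
changing the upper layer's code.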
154
155 After our member variables comes a typedef: ModuleHandle. This is the handle
156 type that will be returned from our JIT's addModule method, and which can be
157 used to remove a module again using the removeModule method. The IRCompileLayer
158 class already provides a convenient handle type
159 (IRCompileLayer::ModuleSetHandleT), so we will just provide a type-alias for
160 this.
161
162 .. code-block:: c++
163
164 KaleidoscopeJIT()
165 : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
166 CompileLayer(ObjectLayer, SimpleCompiler(*TM)) {
167 llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
168 }
169
170 TargetMachine &getTargetMachine() { return *TM; }
171
172 Next up we have our class constructor. We begin by initializing TM using the
173 EngineBuilder::selectTarget helper method, which constructs a TargetMachine for
174 the current process. Next we use our newly created TargetMachine to initialize
175 DL, our DataLayout. Then we initialize our IRCompileLayer. Our IRCompileLayer
176 needs two things: (1) A reference to our object linking layer, and (2) a
177 compiler instance to use to perform the actual compilation from IR to object
178 files. We use the off-the-shelf SimpleCompiler instance for now, but in later
179 chapters we will substitute our own configurable compiler classes. Finally, in
180 the body of the constructor, we call the DynamicLibrary::LoadLibraryPermanently
181 method with a nullptr argument. Normally the LoadLibraryPermanently method is
182 called with the path of a dynamic library to load, but when passed a null
183 pointer it will 'load' the host process itself, making its exported symbols
184 available for execution.
185
186 .. code-block:: c++
187
188 ModuleHandle addModule(std::unique_ptr<Module> M) {
189 // We need a memory manager to allocate memory and resolve symbols for this
190 // new module. Create one that resolves symbols by looking back into the
191 // JIT.
192 auto Resolver = createLambdaResolver(
193 [&](const std::string &Name) {
194 if (auto Sym = CompileLayer.findSymbol(Name, false))
195 return RuntimeDyld::SymbolInfo(Sym.getAddress(), Sym.getFlags());
196 return RuntimeDyld::SymbolInfo(nullptr);
197 },
198 [](const std::string &S) { return nullptr; });
199 std::vector<std::unique_ptr<Module>> Ms;
200 Ms.push_back(std::move(M));
201 return CompileLayer.addModuleSet(std::move(Ms),
202 make_unique<SectionMemoryManager>(),
203 std::move(Resolver));
204 }
205
206 *To be done: describe addModule -- createLambdaResolver, resolvers, memory
207 managers, why 'module set' rather than a single module...*
208
209 .. code-block:: c++
210
211 JITSymbol findSymbol(const std::string Name) {
212 std::string MangledName;
213 raw_string_ostream MangledNameStream(MangledName);
214 Mangler::getNameWithPrefix(MangledNameStream, Name, DL);
215 return CompileLayer.findSymbol(MangledNameStream.str(), true);
216 }
217
218 void removeModule(ModuleHandle H) {
219 CompileLayer.removeModuleSet(H);
220 }
221
222 *To be done: describe findSymbol and removeModule -- why do we mangle? what's
223 the relationship between findSymbol and resolvers, why remove modules...*
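
Until that description is written, here is a small sketch of one reason
mangling is needed. Some object formats decorate C-level symbol names: Mach-O
prepends an underscore, for example, while ELF adds nothing. The JIT's symbol
tables hold the decorated names, so a lookup by source-level name has to apply
the same decoration first. The helper below, addGlobalPrefix, is a hypothetical
stand-in written for illustration. It is not LLVM's Mangler, and it handles
only the single-character global prefix.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: GlobalPrefix plays the role of the per-target prefix
// character, with '\0' meaning "this target adds no prefix".
std::string addGlobalPrefix(const std::string &Name, char GlobalPrefix) {
  if (GlobalPrefix == '\0')
    return Name;
  return std::string(1, GlobalPrefix) + Name;
}
```

With this helper, a lookup for "main" would search for "_main" on a target
whose prefix is an underscore and for "main" on a target with no prefix.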
224
225 *To be done: Conclusion, exercises (maybe a utility for a standalone IR JIT,
226 like a mini-LLI), feed to next chapter.*
227
228 Full Code Listing
229 =================
230
231 Here is the complete code listing for our running example, enhanced with
232 mutable variables and var/in support. To build this example, use:
233
234 .. code-block:: bash
235
236 # Compile
237 clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orc native` -O3 -o toy
238 # Run
239 ./toy
240
241 Here is the code:
242
243 .. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter1/KaleidoscopeJIT.h
244 :language: c++
245
246 `Next: Extending the KaleidoscopeJIT `_
247
248 .. [1] Actually we use a cut-down version of KaleidoscopeJIT that makes a
249 simplifying assumption: symbols cannot be re-defined. This will make it
250 impossible to re-define symbols in the REPL, but will make our symbol
251 lookup logic simpler. Re-introducing support for symbol redefinition is
252 left as an exercise for the reader. (The KaleidoscopeJIT.h used in the
253 original tutorials will be a helpful reference).
254
255 .. [2] +-----------------------+-----------------------------------------------+
256 | File | Reason for inclusion |
257 +=======================+===============================================+
258 | ExecutionEngine.h | Access to the EngineBuilder::selectTarget |
259 | | method. |
260 +-----------------------+-----------------------------------------------+
261 | | Access to the |
262 | RTDyldMemoryManager.h | RTDyldMemoryManager::getSymbolAddressInProcess|
263 | | method. |
264 +-----------------------+-----------------------------------------------+
265 | CompileUtils.h | Provides the SimpleCompiler class. |
266 +-----------------------+-----------------------------------------------+
267 | IRCompileLayer.h | Provides the IRCompileLayer class. |
268 +-----------------------+-----------------------------------------------+
269 | | Access the createLambdaResolver function, |
270 | LambdaResolver.h | which provides easy construction of symbol |
271 | | resolvers. |
272 +-----------------------+-----------------------------------------------+
273 | ObjectLinkingLayer.h | Provides the ObjectLinkingLayer class. |
274 +-----------------------+-----------------------------------------------+
275 | Mangler.h | Provides the Mangler class for platform |
276 | | specific name-mangling. |
277 +-----------------------+-----------------------------------------------+
278 | DynamicLibrary.h | Provides the DynamicLibrary class, which |
279 | | makes symbols in the host process searchable. |
280 +-----------------------+-----------------------------------------------+
21 21
22 22 OCamlLangImpl*
23 23
24 Kaleidoscope: Building an ORC-based JIT in LLVM
25 ===============================================
26
27 .. toctree::
28 :titlesonly:
29 :glob:
30 :numbered:
31
32 BuildingAJIT*
33
24 34 External Tutorials
25 35 ==================
26 36
0 set(LLVM_LINK_COMPONENTS
1 Analysis
2 Core
3 ExecutionEngine
4 InstCombine
5 Object
6 RuntimeDyld
7 ScalarOpts
8 Support
9 native
10 )
11
12 add_kaleidoscope_chapter(BuildingAJIT-Ch1
13 toy.cpp
14 )
15
16 export_executable_symbols(BuildingAJIT-Ch1)
0 //===----- KaleidoscopeJIT.h - A simple JIT for Kaleidoscope ----*- C++ -*-===//
1 //
2 // The LLVM Compiler Infrastructure
3 //
4 // This file is distributed under the University of Illinois Open Source
5 // License. See LICENSE.TXT for details.
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // Contains a simple JIT definition for use in the kaleidoscope tutorials.
10 //
11 //===----------------------------------------------------------------------===//
12
13 #ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
14 #define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
15
16 #include "llvm/ExecutionEngine/ExecutionEngine.h"
17 #include "llvm/ExecutionEngine/RTDyldMemoryManager.h"
18 #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
19 #include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
20 #include "llvm/ExecutionEngine/Orc/LambdaResolver.h"
21 #include "llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h"
22 #include "llvm/IR/Mangler.h"
23 #include "llvm/Support/DynamicLibrary.h"
24
25 namespace llvm {
26 namespace orc {
27
28 class KaleidoscopeJIT {
29 private:
30
31 std::unique_ptr<TargetMachine> TM;
32 const DataLayout DL;
33 ObjectLinkingLayer<> ObjectLayer;
34 IRCompileLayer<decltype(ObjectLayer)> CompileLayer;
35
36 public:
37
38 typedef decltype(CompileLayer)::ModuleSetHandleT ModuleHandle;
39
40 KaleidoscopeJIT()
41 : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
42 CompileLayer(ObjectLayer, SimpleCompiler(*TM)) {
43 llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
44 }
45
46 TargetMachine &getTargetMachine() { return *TM; }
47
48 ModuleHandle addModule(std::unique_ptr<Module> M) {
49 // We need a memory manager to allocate memory and resolve symbols for this
50 // new module. Create one that resolves symbols by looking back into the
51 // JIT.
52 auto Resolver = createLambdaResolver(
53 [&](const std::string &Name) {
54 if (auto Sym = CompileLayer.findSymbol(Name, false))
55 return RuntimeDyld::SymbolInfo(Sym.getAddress(), Sym.getFlags());
56 return RuntimeDyld::SymbolInfo(nullptr);
57 },
58 [](const std::string &S) { return nullptr; });
59 std::vector<std::unique_ptr<Module>> Ms;
60 Ms.push_back(std::move(M));
61 return CompileLayer.addModuleSet(std::move(Ms),
62 make_unique<SectionMemoryManager>(),
63 std::move(Resolver));
64 }
65
66 JITSymbol findSymbol(const std::string Name) {
67 std::string MangledName;
68 raw_string_ostream MangledNameStream(MangledName);
69 Mangler::getNameWithPrefix(MangledNameStream, Name, DL);
70 return CompileLayer.findSymbol(MangledNameStream.str(), true);
71 }
72
73 void removeModule(ModuleHandle H) {
74 CompileLayer.removeModuleSet(H);
75 }
76
77 };
78
79 } // End namespace orc.
80 } // End namespace llvm
81
82 #endif // LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
0 #include "llvm/ADT/APFloat.h"
1 #include "llvm/ADT/STLExtras.h"
2 #include "llvm/IR/BasicBlock.h"
3 #include "llvm/IR/Constants.h"
4 #include "llvm/IR/DerivedTypes.h"
5 #include "llvm/IR/Function.h"
6 #include "llvm/IR/Instructions.h"
7 #include "llvm/IR/IRBuilder.h"
8 #include "llvm/IR/LLVMContext.h"
9 #include "llvm/IR/LegacyPassManager.h"
10 #include "llvm/IR/Module.h"
11 #include "llvm/IR/Type.h"
12 #include "llvm/IR/Verifier.h"
13 #include "llvm/Support/TargetSelect.h"
14 #include "llvm/Target/TargetMachine.h"
15 #include "llvm/Transforms/Scalar.h"
16 #include "llvm/Transforms/Scalar/GVN.h"
17 #include "KaleidoscopeJIT.h"
18 #include <cassert>
19 #include <cctype>
20 #include <cstdint>
21 #include <cstdio>
22 #include <cstdlib>
23 #include <map>
24 #include <memory>
25 #include <string>
26 #include <utility>
27 #include <vector>
28
29 using namespace llvm;
30 using namespace llvm::orc;
31
32 //===----------------------------------------------------------------------===//
33 // Lexer
34 //===----------------------------------------------------------------------===//
35
36 // The lexer returns tokens [0-255] if it is an unknown character, otherwise one
37 // of these for known things.
38 enum Token {
39 tok_eof = -1,
40
41 // commands
42 tok_def = -2,
43 tok_extern = -3,
44
45 // primary
46 tok_identifier = -4,
47 tok_number = -5,
48
49 // control
50 tok_if = -6,
51 tok_then = -7,
52 tok_else = -8,
53 tok_for = -9,
54 tok_in = -10,
55
56 // operators
57 tok_binary = -11,
58 tok_unary = -12,
59
60 // var definition
61 tok_var = -13
62 };
63
64 static std::string IdentifierStr; // Filled in if tok_identifier
65 static double NumVal; // Filled in if tok_number
66
67 /// gettok - Return the next token from standard input.
68 static int gettok() {
69 static int LastChar = ' ';
70
71 // Skip any whitespace.
72 while (isspace(LastChar))
73 LastChar = getchar();
74
75 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
76 IdentifierStr = LastChar;
77 while (isalnum((LastChar = getchar())))
78 IdentifierStr += LastChar;
79
80 if (IdentifierStr == "def")
81 return tok_def;
82 if (IdentifierStr == "extern")
83 return tok_extern;
84 if (IdentifierStr == "if")
85 return tok_if;
86 if (IdentifierStr == "then")
87 return tok_then;
88 if (IdentifierStr == "else")
89 return tok_else;
90 if (IdentifierStr == "for")
91 return tok_for;
92 if (IdentifierStr == "in")
93 return tok_in;
94 if (IdentifierStr == "binary")
95 return tok_binary;
96 if (IdentifierStr == "unary")
97 return tok_unary;
98 if (IdentifierStr == "var")
99 return tok_var;
100 return tok_identifier;
101 }
102
103 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
104 std::string NumStr;
105 do {
106 NumStr += LastChar;
107 LastChar = getchar();
108 } while (isdigit(LastChar) || LastChar == '.');
109
110 NumVal = strtod(NumStr.c_str(), nullptr);
111 return tok_number;
112 }
113
114 if (LastChar == '#') {
115 // Comment until end of line.
116 do
117 LastChar = getchar();
118 while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
119
120 if (LastChar != EOF)
121 return gettok();
122 }
123
124 // Check for end of file. Don't eat the EOF.
125 if (LastChar == EOF)
126 return tok_eof;
127
128 // Otherwise, just return the character as its ascii value.
129 int ThisChar = LastChar;
130 LastChar = getchar();
131 return ThisChar;
132 }
133
134 //===----------------------------------------------------------------------===//
135 // Abstract Syntax Tree (aka Parse Tree)
136 //===----------------------------------------------------------------------===//
137 namespace {
138 /// ExprAST - Base class for all expression nodes.
139 class ExprAST {
140 public:
141 virtual ~ExprAST() {}
142 virtual Value *codegen() = 0;
143 };
144
145 /// NumberExprAST - Expression class for numeric literals like "1.0".
146 class NumberExprAST : public ExprAST {
147 double Val;
148
149 public:
150 NumberExprAST(double Val) : Val(Val) {}
151 Value *codegen() override;
152 };
153
154 /// VariableExprAST - Expression class for referencing a variable, like "a".
155 class VariableExprAST : public ExprAST {
156 std::string Name;
157
158 public:
159 VariableExprAST(const std::string &Name) : Name(Name) {}
160 const std::string &getName() const { return Name; }
161 Value *codegen() override;
162 };
163
164 /// UnaryExprAST - Expression class for a unary operator.
165 class UnaryExprAST : public ExprAST {
166 char Opcode;
167 std::unique_ptr<ExprAST> Operand;
168
169 public:
170 UnaryExprAST(char Opcode, std::unique_ptr<ExprAST> Operand)
171 : Opcode(Opcode), Operand(std::move(Operand)) {}
172 Value *codegen() override;
173 };
174
175 /// BinaryExprAST - Expression class for a binary operator.
176 class BinaryExprAST : public ExprAST {
177 char Op;
178 std::unique_ptr<ExprAST> LHS, RHS;
179
180 public:
181 BinaryExprAST(char Op, std::unique_ptr<ExprAST> LHS,
182 std::unique_ptr<ExprAST> RHS)
183 : Op(Op), LHS(std::move(LHS)), RHS(std::move(RHS)) {}
184 Value *codegen() override;
185 };
186
187 /// CallExprAST - Expression class for function calls.
188 class CallExprAST : public ExprAST {
189 std::string Callee;
190 std::vector<std::unique_ptr<ExprAST>> Args;
191
192 public:
193 CallExprAST(const std::string &Callee,
194 std::vector<std::unique_ptr<ExprAST>> Args)
195 : Callee(Callee), Args(std::move(Args)) {}
196 Value *codegen() override;
197 };
198
199 /// IfExprAST - Expression class for if/then/else.
200 class IfExprAST : public ExprAST {
201 std::unique_ptr<ExprAST> Cond, Then, Else;
202
203 public:
204 IfExprAST(std::unique_ptr<ExprAST> Cond, std::unique_ptr<ExprAST> Then,
205 std::unique_ptr<ExprAST> Else)
206 : Cond(std::move(Cond)), Then(std::move(Then)), Else(std::move(Else)) {}
207 Value *codegen() override;
208 };
209
210 /// ForExprAST - Expression class for for/in.
211 class ForExprAST : public ExprAST {
212 std::string VarName;
213 std::unique_ptr<ExprAST> Start, End, Step, Body;
214
215 public:
216 ForExprAST(const std::string &VarName, std::unique_ptr<ExprAST> Start,
217 std::unique_ptr<ExprAST> End, std::unique_ptr<ExprAST> Step,
218 std::unique_ptr<ExprAST> Body)
219 : VarName(VarName), Start(std::move(Start)), End(std::move(End)),
220 Step(std::move(Step)), Body(std::move(Body)) {}
221 Value *codegen() override;
222 };
223
224 /// VarExprAST - Expression class for var/in
225 class VarExprAST : public ExprAST {
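226 std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
227 std::unique_ptr<ExprAST> Body;
228
229 public:
230 VarExprAST(
231 std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames,
232 std::unique_ptr<ExprAST> Body)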
233 : VarNames(std::move(VarNames)), Body(std::move(Body)) {}
234 Value *codegen() override;
235 };
236
237 /// PrototypeAST - This class represents the "prototype" for a function,
238 /// which captures its name, and its argument names (thus implicitly the number
239 /// of arguments the function takes), as well as if it is an operator.
240 class PrototypeAST {
241 std::string Name;
242 std::vector<std::string> Args;
243 bool IsOperator;
244 unsigned Precedence; // Precedence if a binary op.
245
246 public:
247 PrototypeAST(const std::string &Name, std::vector<std::string> Args,
248 bool IsOperator = false, unsigned Prec = 0)
249 : Name(Name), Args(std::move(Args)), IsOperator(IsOperator),
250 Precedence(Prec) {}
251 Function *codegen();
252 const std::string &getName() const { return Name; }
253
254 bool isUnaryOp() const { return IsOperator && Args.size() == 1; }
255 bool isBinaryOp() const { return IsOperator && Args.size() == 2; }
256
257 char getOperatorName() const {
258 assert(isUnaryOp() || isBinaryOp());
259 return Name[Name.size() - 1];
260 }
261
262 unsigned getBinaryPrecedence() const { return Precedence; }
263 };
264
265 /// FunctionAST - This class represents a function definition itself.
266 class FunctionAST {
267 std::unique_ptr<PrototypeAST> Proto;
268 std::unique_ptr<ExprAST> Body;
269
270 public:
271 FunctionAST(std::unique_ptr<PrototypeAST> Proto,
272 std::unique_ptr<ExprAST> Body)
273 : Proto(std::move(Proto)), Body(std::move(Body)) {}
274 Function *codegen();
275 };
276 } // end anonymous namespace
277
278 //===----------------------------------------------------------------------===//
279 // Parser
280 //===----------------------------------------------------------------------===//
281
282 /// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
283 /// token the parser is looking at. getNextToken reads another token from the
284 /// lexer and updates CurTok with its results.
285 static int CurTok;
286 static int getNextToken() { return CurTok = gettok(); }
287
288 /// BinopPrecedence - This holds the precedence for each binary operator that is
289 /// defined.
290 static std::map<char, int> BinopPrecedence;
291
292 /// GetTokPrecedence - Get the precedence of the pending binary operator token.
293 static int GetTokPrecedence() {
294 if (!isascii(CurTok))
295 return -1;
296
297 // Make sure it's a declared binop.
298 int TokPrec = BinopPrecedence[CurTok];
299 if (TokPrec <= 0)
300 return -1;
301 return TokPrec;
302 }
303
304 /// LogError* - These are little helper functions for error handling.
305 std::unique_ptr<ExprAST> LogError(const char *Str) {
306 fprintf(stderr, "Error: %s\n", Str);
307 return nullptr;
308 }
309
310 std::unique_ptr<PrototypeAST> LogErrorP(const char *Str) {
311 LogError(Str);
312 return nullptr;
313 }
314
315 static std::unique_ptr<ExprAST> ParseExpression();
316
317 /// numberexpr ::= number
318 static std::unique_ptr<ExprAST> ParseNumberExpr() {
319 auto Result = llvm::make_unique<NumberExprAST>(NumVal);
320 getNextToken(); // consume the number
321 return std::move(Result);
322 }
323
324 /// parenexpr ::= '(' expression ')'
325 static std::unique_ptr<ExprAST> ParseParenExpr() {
326 getNextToken(); // eat (.
327 auto V = ParseExpression();
328 if (!V)
329 return nullptr;
330
331 if (CurTok != ')')
332 return LogError("expected ')'");
333 getNextToken(); // eat ).
334 return V;
335 }
336
337 /// identifierexpr
338 /// ::= identifier
339 /// ::= identifier '(' expression* ')'
340 static std::unique_ptr<ExprAST> ParseIdentifierExpr() {
341 std::string IdName = IdentifierStr;
342
343 getNextToken(); // eat identifier.
344
345 if (CurTok != '(') // Simple variable ref.
346 return llvm::make_unique<VariableExprAST>(IdName);
347
348 // Call.
349 getNextToken(); // eat (
350 std::vector<std::unique_ptr<ExprAST>> Args;
351 if (CurTok != ')') {
352 while (true) {
353 if (auto Arg = ParseExpression())
354 Args.push_back(std::move(Arg));
355 else
356 return nullptr;
357
358 if (CurTok == ')')
359 break;
360
361 if (CurTok != ',')
362 return LogError("Expected ')' or ',' in argument list");
363 getNextToken();
364 }
365 }
366
367 // Eat the ')'.
368 getNextToken();
369
370 return llvm::make_unique<CallExprAST>(IdName, std::move(Args));
371 }
372
373 /// ifexpr ::= 'if' expression 'then' expression 'else' expression
374 static std::unique_ptr<ExprAST> ParseIfExpr() {
375 getNextToken(); // eat the if.
376
377 // condition.
378 auto Cond = ParseExpression();
379 if (!Cond)
380 return nullptr;
381
382 if (CurTok != tok_then)
383 return LogError("expected then");
384 getNextToken(); // eat the then
385
386 auto Then = ParseExpression();
387 if (!Then)
388 return nullptr;
389
390 if (CurTok != tok_else)
391 return LogError("expected else");
392
393 getNextToken();
394
395 auto Else = ParseExpression();
396 if (!Else)
397 return nullptr;
398
399 return llvm::make_unique<IfExprAST>(std::move(Cond), std::move(Then),
400 std::move(Else));
401 }
402
403 /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
404 static std::unique_ptr<ExprAST> ParseForExpr() {
405 getNextToken(); // eat the for.
406
407 if (CurTok != tok_identifier)
408 return LogError("expected identifier after for");
409
410 std::string IdName = IdentifierStr;
411 getNextToken(); // eat identifier.
412
413 if (CurTok != '=')
414 return LogError("expected '=' after for");
415 getNextToken(); // eat '='.
416
417 auto Start = ParseExpression();
418 if (!Start)
419 return nullptr;
420 if (CurTok != ',')
421 return LogError("expected ',' after for start value");
422 getNextToken();
423
424 auto End = ParseExpression();
425 if (!End)
426 return nullptr;
427
428 // The step value is optional.
429 std::unique_ptr<ExprAST> Step;
430 if (CurTok == ',') {
431 getNextToken();
432 Step = ParseExpression();
433 if (!Step)
434 return nullptr;
435 }
436
437 if (CurTok != tok_in)
438 return LogError("expected 'in' after for");
439 getNextToken(); // eat 'in'.
440
441 auto Body = ParseExpression();
442 if (!Body)
443 return nullptr;
444
445 return llvm::make_unique<ForExprAST>(IdName, std::move(Start), std::move(End),
446 std::move(Step), std::move(Body));
447 }
448
449 /// varexpr ::= 'var' identifier ('=' expression)?
450 // (',' identifier ('=' expression)?)* 'in' expression
451 static std::unique_ptr<ExprAST> ParseVarExpr() {
452 getNextToken(); // eat the var.
453
454 std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
455
456 // At least one variable name is required.
457 if (CurTok != tok_identifier)
458 return LogError("expected identifier after var");
459
460 while (true) {
461 std::string Name = IdentifierStr;
462 getNextToken(); // eat identifier.
463
464 // Read the optional initializer.
465 std::unique_ptr<ExprAST> Init = nullptr;
466 if (CurTok == '=') {
467 getNextToken(); // eat the '='.
468
469 Init = ParseExpression();
470 if (!Init)
471 return nullptr;
472 }
473
474 VarNames.push_back(std::make_pair(Name, std::move(Init)));
475
476 // End of var list, exit loop.
477 if (CurTok != ',')
478 break;
479 getNextToken(); // eat the ','.
480
481 if (CurTok != tok_identifier)
482 return LogError("expected identifier list after var");
483 }
484
485 // At this point, we have to have 'in'.
486 if (CurTok != tok_in)
487 return LogError("expected 'in' keyword after 'var'");
488 getNextToken(); // eat 'in'.
489
490 auto Body = ParseExpression();
491 if (!Body)
492 return nullptr;
493
494 return llvm::make_unique<VarExprAST>(std::move(VarNames), std::move(Body));
495 }
496
497 /// primary
498 /// ::= identifierexpr
499 /// ::= numberexpr
500 /// ::= parenexpr
501 /// ::= ifexpr
502 /// ::= forexpr
503 /// ::= varexpr
504 static std::unique_ptr<ExprAST> ParsePrimary() {
505 switch (CurTok) {
506 default:
507 return LogError("unknown token when expecting an expression");
508 case tok_identifier:
509 return ParseIdentifierExpr();
510 case tok_number:
511 return ParseNumberExpr();
512 case '(':
513 return ParseParenExpr();
514 case tok_if:
515 return ParseIfExpr();
516 case tok_for:
517 return ParseForExpr();
518 case tok_var:
519 return ParseVarExpr();
520 }
521 }
522
523 /// unary
524 /// ::= primary
525 /// ::= '!' unary
526 static std::unique_ptr<ExprAST> ParseUnary() {
527 // If the current token is not an operator, it must be a primary expr.
528 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
529 return ParsePrimary();
530
531 // If this is a unary operator, read it.
532 int Opc = CurTok;
533 getNextToken();
534 if (auto Operand = ParseUnary())
535 return llvm::make_unique<UnaryExprAST>(Opc, std::move(Operand));
536 return nullptr;
537 }
538
539 /// binoprhs
540 /// ::= ('+' unary)*
541 static std::unique_ptr<ExprAST> ParseBinOpRHS(int ExprPrec,
542 std::unique_ptr<ExprAST> LHS) {
543 // If this is a binop, find its precedence.
544 while (true) {
545 int TokPrec = GetTokPrecedence();
546
547 // If this is a binop that binds at least as tightly as the current binop,
548 // consume it, otherwise we are done.
549 if (TokPrec < ExprPrec)
550 return LHS;
551
552 // Okay, we know this is a binop.
553 int BinOp = CurTok;
554 getNextToken(); // eat binop
555
556 // Parse the unary expression after the binary operator.
557 auto RHS = ParseUnary();
558 if (!RHS)
559 return nullptr;
560
561 // If BinOp binds less tightly with RHS than the operator after RHS, let
562 // the pending operator take RHS as its LHS.
563 int NextPrec = GetTokPrecedence();
564 if (TokPrec < NextPrec) {
565 RHS = ParseBinOpRHS(TokPrec + 1, std::move(RHS));
566 if (!RHS)
567 return nullptr;
568 }
569
570 // Merge LHS/RHS.
571 LHS =
572 llvm::make_unique<BinaryExprAST>(BinOp, std::move(LHS), std::move(RHS));
573 }
574 }
575
576 /// expression
577 /// ::= unary binoprhs
578 ///
579 static std::unique_ptr<ExprAST> ParseExpression() {
580 auto LHS = ParseUnary();
581 if (!LHS)
582 return nullptr;
583
584 return ParseBinOpRHS(0, std::move(LHS));
585 }
586
587 /// prototype
588 /// ::= id '(' id* ')'
589 /// ::= binary LETTER number? (id, id)
590 /// ::= unary LETTER (id)
591 static std::unique_ptr<PrototypeAST> ParsePrototype() {
592 std::string FnName;
593
594 unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
595 unsigned BinaryPrecedence = 30;
596
597 switch (CurTok) {
598 default:
599 return LogErrorP("Expected function name in prototype");
600 case tok_identifier:
601 FnName = IdentifierStr;
602 Kind = 0;
603 getNextToken();
604 break;
605 case tok_unary:
606 getNextToken();
607 if (!isascii(CurTok))
608 return LogErrorP("Expected unary operator");
609 FnName = "unary";
610 FnName += (char)CurTok;
611 Kind = 1;
612 getNextToken();
613 break;
614 case tok_binary:
615 getNextToken();
616 if (!isascii(CurTok))
617 return LogErrorP("Expected binary operator");
618 FnName = "binary";
619 FnName += (char)CurTok;
620 Kind = 2;
621 getNextToken();
622
623 // Read the precedence if present.
624 if (CurTok == tok_number) {
625 if (NumVal < 1 || NumVal > 100)
626 return LogErrorP("Invalid precedence: must be 1..100");
627 BinaryPrecedence = (unsigned)NumVal;
628 getNextToken();
629 }
630 break;
631 }
632
633 if (CurTok != '(')
634 return LogErrorP("Expected '(' in prototype");
635
636 std::vector<std::string> ArgNames;
637 while (getNextToken() == tok_identifier)
638 ArgNames.push_back(IdentifierStr);
639 if (CurTok != ')')
640 return LogErrorP("Expected ')' in prototype");
641
642 // success.
643 getNextToken(); // eat ')'.
644
645 // Verify right number of names for operator.
646 if (Kind && ArgNames.size() != Kind)
647 return LogErrorP("Invalid number of operands for operator");
648
649 return llvm::make_unique<PrototypeAST>(FnName, ArgNames, Kind != 0,
650 BinaryPrecedence);
651 }
652
653 /// definition ::= 'def' prototype expression
654 static std::unique_ptr<FunctionAST> ParseDefinition() {
655 getNextToken(); // eat def.
656 auto Proto = ParsePrototype();
657 if (!Proto)
658 return nullptr;
659
660 if (auto E = ParseExpression())
661 return llvm::make_unique<FunctionAST>(std::move(Proto), std::move(E));
662 return nullptr;
663 }
664
665 /// toplevelexpr ::= expression
666 static std::unique_ptr<FunctionAST> ParseTopLevelExpr() {
667 if (auto E = ParseExpression()) {
668 // Make an anonymous proto.
669 auto Proto = llvm::make_unique<PrototypeAST>("__anon_expr",
670 std::vector<std::string>());
671 return llvm::make_unique<FunctionAST>(std::move(Proto), std::move(E));
672 }
673 return nullptr;
674 }
675
676 /// external ::= 'extern' prototype
677 static std::unique_ptr<PrototypeAST> ParseExtern() {
678 getNextToken(); // eat extern.
679 return ParsePrototype();
680 }
681
682 //===----------------------------------------------------------------------===//
683 // Code Generation
684 //===----------------------------------------------------------------------===//
685
686 static LLVMContext TheContext;
687 static IRBuilder<> Builder(TheContext);
688 static std::unique_ptr<Module> TheModule;
689 static std::map<std::string, AllocaInst *> NamedValues;
690 static std::unique_ptr<KaleidoscopeJIT> TheJIT;
691 static std::map<std::string, std::unique_ptr<PrototypeAST>> FunctionProtos;
692
693 Value *LogErrorV(const char *Str) {
694 LogError(Str);
695 return nullptr;
696 }
697
698 Function *getFunction(std::string Name) {
699 // First, see if the function has already been added to the current module.
700 if (auto *F = TheModule->getFunction(Name))
701 return F;
702
703 // If not, check whether we can codegen the declaration from some existing
704 // prototype.
705 auto FI = FunctionProtos.find(Name);
706 if (FI != FunctionProtos.end())
707 return FI->second->codegen();
708
709 // If no existing prototype exists, return null.
710 return nullptr;
711 }
712
713 /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
714 /// the function. This is used for mutable variables etc.
715 static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
716 const std::string &VarName) {
717 IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
718 TheFunction->getEntryBlock().begin());
719 return TmpB.CreateAlloca(Type::getDoubleTy(TheContext), nullptr, VarName);
720 }
721
722 Value *NumberExprAST::codegen() {
723 return ConstantFP::get(TheContext, APFloat(Val));
724 }
725
726 Value *VariableExprAST::codegen() {
727 // Look this variable up in the function.
728 Value *V = NamedValues[Name];
729 if (!V)
730 return LogErrorV("Unknown variable name");
731
732 // Load the value.
733 return Builder.CreateLoad(V, Name.c_str());
734 }
735
736 Value *UnaryExprAST::codegen() {
737 Value *OperandV = Operand->codegen();
738 if (!OperandV)
739 return nullptr;
740
741 Function *F = getFunction(std::string("unary") + Opcode);
742 if (!F)
743 return LogErrorV("Unknown unary operator");
744
745 return Builder.CreateCall(F, OperandV, "unop");
746 }
747
748 Value *BinaryExprAST::codegen() {
749 // Special case '=' because we don't want to emit the LHS as an expression.
750 if (Op == '=') {
751 // Assignment requires the LHS to be an identifier.
752 // This assumes we're building without RTTI because LLVM builds that way by
753 // default. If you build LLVM with RTTI this can be changed to a
754 // dynamic_cast for automatic error checking.
755 VariableExprAST *LHSE = static_cast<VariableExprAST *>(LHS.get());
756 if (!LHSE)
757 return LogErrorV("destination of '=' must be a variable");
758 // Codegen the RHS.
759 Value *Val = RHS->codegen();
760 if (!Val)
761 return nullptr;
762
763 // Look up the name.
764 Value *Variable = NamedValues[LHSE->getName()];
765 if (!Variable)
766 return LogErrorV("Unknown variable name");
767
768 Builder.CreateStore(Val, Variable);
769 return Val;
770 }
771
772 Value *L = LHS->codegen();
773 Value *R = RHS->codegen();
774 if (!L || !R)
775 return nullptr;
776
777 switch (Op) {
778 case '+':
779 return Builder.CreateFAdd(L, R, "addtmp");
780 case '-':
781 return Builder.CreateFSub(L, R, "subtmp");
782 case '*':
783 return Builder.CreateFMul(L, R, "multmp");
784 case '<':
785 L = Builder.CreateFCmpULT(L, R, "cmptmp");
786 // Convert bool 0/1 to double 0.0 or 1.0
787 return Builder.CreateUIToFP(L, Type::getDoubleTy(TheContext), "booltmp");
788 default:
789 break;
790 }
791
792 // If it wasn't a builtin binary operator, it must be a user defined one. Emit
793 // a call to it.
794 Function *F = getFunction(std::string("binary") + Op);
795 assert(F && "binary operator not found!");
796
797 Value *Ops[] = {L, R};
798 return Builder.CreateCall(F, Ops, "binop");
799 }
800
801 Value *CallExprAST::codegen() {
802 // Look up the name in the global module table.
803 Function *CalleeF = getFunction(Callee);
804 if (!CalleeF)
805 return LogErrorV("Unknown function referenced");
806
807 // If argument mismatch error.
808 if (CalleeF->arg_size() != Args.size())
809 return LogErrorV("Incorrect # arguments passed");
810
811 std::vector<Value *> ArgsV;
812 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
813 ArgsV.push_back(Args[i]->codegen());
814 if (!ArgsV.back())
815 return nullptr;
816 }
817
818 return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
819 }
820
821 Value *IfExprAST::codegen() {
822 Value *CondV = Cond->codegen();
823 if (!CondV)
824 return nullptr;
825
826 // Convert condition to a bool by comparing non-equal to 0.0.
827 CondV = Builder.CreateFCmpONE(
828 CondV, ConstantFP::get(TheContext, APFloat(0.0)), "ifcond");
829
830 Function *TheFunction = Builder.GetInsertBlock()->getParent();
831
832 // Create blocks for the then and else cases. Insert the 'then' block at the
833 // end of the function.
834 BasicBlock *ThenBB = BasicBlock::Create(TheContext, "then", TheFunction);
835 BasicBlock *ElseBB = BasicBlock::Create(TheContext, "else");
836 BasicBlock *MergeBB = BasicBlock::Create(TheContext, "ifcont");
837
838 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
839
840 // Emit then value.
841 Builder.SetInsertPoint(ThenBB);
842
843 Value *ThenV = Then->codegen();
844 if (!ThenV)
845 return nullptr;
846
847 Builder.CreateBr(MergeBB);
848 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
849 ThenBB = Builder.GetInsertBlock();
850
851 // Emit else block.
852 TheFunction->getBasicBlockList().push_back(ElseBB);
853 Builder.SetInsertPoint(ElseBB);
854
855 Value *ElseV = Else->codegen();
856 if (!ElseV)
857 return nullptr;
858
859 Builder.CreateBr(MergeBB);
860 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
861 ElseBB = Builder.GetInsertBlock();
862
863 // Emit merge block.
864 TheFunction->getBasicBlockList().push_back(MergeBB);
865 Builder.SetInsertPoint(MergeBB);
866 PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(TheContext), 2, "iftmp");
867
868 PN->addIncoming(ThenV, ThenBB);
869 PN->addIncoming(ElseV, ElseBB);
870 return PN;
871 }
872
873 // Output for-loop as:
874 // var = alloca double
875 // ...
876 // start = startexpr
877 // store start -> var
878 // goto loop
879 // loop:
880 // ...
881 // bodyexpr
882 // ...
883 // loopend:
884 // step = stepexpr
885 // endcond = endexpr
886 //
887 // curvar = load var
888 // nextvar = curvar + step
889 // store nextvar -> var
890 // br endcond, loop, afterloop
891 // afterloop:
892 Value *ForExprAST::codegen() {
893 Function *TheFunction = Builder.GetInsertBlock()->getParent();
894
895 // Create an alloca for the variable in the entry block.
896 AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
897
898 // Emit the start code first, without 'variable' in scope.
899 Value *StartVal = Start->codegen();
900 if (!StartVal)
901 return nullptr;
902
903 // Store the value into the alloca.
904 Builder.CreateStore(StartVal, Alloca);
905
906 // Make the new basic block for the loop header, inserting after current
907 // block.
908 BasicBlock *LoopBB = BasicBlock::Create(TheContext, "loop", TheFunction);
909
910 // Insert an explicit fall through from the current block to the LoopBB.
911 Builder.CreateBr(LoopBB);
912
913 // Start insertion in LoopBB.
914 Builder.SetInsertPoint(LoopBB);
915
916 // Within the loop, the variable refers to the alloca created above. If it
917 // shadows an existing variable, we have to restore it, so save it now.
918 AllocaInst *OldVal = NamedValues[VarName];
919 NamedValues[VarName] = Alloca;
920
921 // Emit the body of the loop. This, like any other expr, can change the
922 // current BB. Note that we ignore the value computed by the body, but don't
923 // allow an error.
924 if (!Body->codegen())
925 return nullptr;
926
927 // Emit the step value.
928 Value *StepVal = nullptr;
929 if (Step) {
930 StepVal = Step->codegen();
931 if (!StepVal)
932 return nullptr;
933 } else {
934 // If not specified, use 1.0.
935 StepVal = ConstantFP::get(TheContext, APFloat(1.0));
936 }
937
938 // Compute the end condition.
939 Value *EndCond = End->codegen();
940 if (!EndCond)
941 return nullptr;
942
943 // Reload, increment, and restore the alloca. This handles the case where
944 // the body of the loop mutates the variable.
945 Value *CurVar = Builder.CreateLoad(Alloca, VarName.c_str());
946 Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
947 Builder.CreateStore(NextVar, Alloca);
948
949 // Convert condition to a bool by comparing non-equal to 0.0.
950 EndCond = Builder.CreateFCmpONE(
951 EndCond, ConstantFP::get(TheContext, APFloat(0.0)), "loopcond");
952
953 // Create the "after loop" block and insert it.
954 BasicBlock *AfterBB =
955 BasicBlock::Create(TheContext, "afterloop", TheFunction);
956
957 // Insert the conditional branch at the end of the current block.
958 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
959
960 // Any new code will be inserted in AfterBB.
961 Builder.SetInsertPoint(AfterBB);
962
963 // Restore the unshadowed variable.
964 if (OldVal)
965 NamedValues[VarName] = OldVal;
966 else
967 NamedValues.erase(VarName);
968
969 // for expr always returns 0.0.
970 return Constant::getNullValue(Type::getDoubleTy(TheContext));
971 }
972
973 Value *VarExprAST::codegen() {
974 std::vector<AllocaInst *> OldBindings;
975
976 Function *TheFunction = Builder.GetInsertBlock()->getParent();
977
978 // Register all variables and emit their initializer.
979 for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
980 const std::string &VarName = VarNames[i].first;
981 ExprAST *Init = VarNames[i].second.get();
982
983 // Emit the initializer before adding the variable to scope, this prevents
984 // the initializer from referencing the variable itself, and permits stuff
985 // like this:
986 // var a = 1 in
987 // var a = a in ... # refers to outer 'a'.
988 Value *InitVal;
989 if (Init) {
990 InitVal = Init->codegen();
991 if (!InitVal)
992 return nullptr;
993 } else { // If not specified, use 0.0.
994 InitVal = ConstantFP::get(TheContext, APFloat(0.0));
995 }
996
997 AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
998 Builder.CreateStore(InitVal, Alloca);
999
1000 // Remember the old variable binding so that we can restore the binding when
1001 // we unrecurse.
1002 OldBindings.push_back(NamedValues[VarName]);
1003
1004 // Remember this binding.
1005 NamedValues[VarName] = Alloca;
1006 }
1007
1008 // Codegen the body, now that all vars are in scope.
1009 Value *BodyVal = Body->codegen();
1010 if (!BodyVal)
1011 return nullptr;
1012
1013 // Pop all our variables from scope.
1014 for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
1015 NamedValues[VarNames[i].first] = OldBindings[i];
1016
1017 // Return the body computation.
1018 return BodyVal;
1019 }
1020
1021 Function *PrototypeAST::codegen() {
1022 // Make the function type: double(double,double) etc.
1023 std::vector<Type *> Doubles(Args.size(), Type::getDoubleTy(TheContext));
1024 FunctionType *FT =
1025 FunctionType::get(Type::getDoubleTy(TheContext), Doubles, false);
1026
1027 Function *F =
1028 Function::Create(FT, Function::ExternalLinkage, Name, TheModule.get());
1029
1030 // Set names for all arguments.
1031 unsigned Idx = 0;
1032 for (auto &Arg : F->args())
1033 Arg.setName(Args[Idx++]);
1034
1035 return F;
1036 }
1037
1038 Function *FunctionAST::codegen() {
1039 // Transfer ownership of the prototype to the FunctionProtos map, but keep a
1040 // reference to it for use below.
1041 auto &P = *Proto;
1042 FunctionProtos[Proto->getName()] = std::move(Proto);
1043 Function *TheFunction = getFunction(P.getName());
1044 if (!TheFunction)
1045 return nullptr;
1046
1047 // If this is an operator, install it.
1048 if (P.isBinaryOp())
1049 BinopPrecedence[P.getOperatorName()] = P.getBinaryPrecedence();
1050
1051 // Create a new basic block to start insertion into.
1052 BasicBlock *BB = BasicBlock::Create(TheContext, "entry", TheFunction);
1053 Builder.SetInsertPoint(BB);
1054
1055 // Record the function arguments in the NamedValues map.
1056 NamedValues.clear();
1057 for (auto &Arg : TheFunction->args()) {
1058 // Create an alloca for this variable.
1059 AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());
1060
1061 // Store the initial value into the alloca.
1062 Builder.CreateStore(&Arg, Alloca);
1063
1064 // Add arguments to variable symbol table.
1065 NamedValues[Arg.getName()] = Alloca;
1066 }
1067
1068 if (Value *RetVal = Body->codegen()) {
1069 // Finish off the function.
1070 Builder.CreateRet(RetVal);
1071
1072 // Validate the generated code, checking for consistency.
1073 verifyFunction(*TheFunction);
1074
1075 return TheFunction;
1076 }
1077
1078 // Error reading body, remove function.
1079 TheFunction->eraseFromParent();
1080
1081 if (P.isBinaryOp())
1082 BinopPrecedence.erase(P.getOperatorName()); // Proto was moved from above; use P.
1083 return nullptr;
1084 }
1085
1086 //===----------------------------------------------------------------------===//
1087 // Top-Level parsing and JIT Driver
1088 //===----------------------------------------------------------------------===//
1089
1090 static void InitializeModule() {
1091 // Open a new module.
1092 TheModule = llvm::make_unique<Module>("my cool jit", TheContext);
1093 TheModule->setDataLayout(TheJIT->getTargetMachine().createDataLayout());
1094 }
1095
1096 static void HandleDefinition() {
1097 if (auto FnAST = ParseDefinition()) {
1098 if (auto *FnIR = FnAST->codegen()) {
1099 fprintf(stderr, "Read function definition:");
1100 FnIR->dump();
1101 TheJIT->addModule(std::move(TheModule));
1102 InitializeModule();
1103 }
1104 } else {
1105 // Skip token for error recovery.
1106 getNextToken();
1107 }
1108 }
1109
1110 static void HandleExtern() {
1111 if (auto ProtoAST = ParseExtern()) {
1112 if (auto *FnIR = ProtoAST->codegen()) {
1113 fprintf(stderr, "Read extern: ");
1114 FnIR->dump();
1115 FunctionProtos[ProtoAST->getName()] = std::move(ProtoAST);
1116 }
1117 } else {
1118 // Skip token for error recovery.
1119 getNextToken();
1120 }
1121 }
1122
1123 static void HandleTopLevelExpression() {
1124 // Evaluate a top-level expression into an anonymous function.
1125 if (auto FnAST = ParseTopLevelExpr()) {
1126 if (FnAST->codegen()) {
1127 // JIT the module containing the anonymous expression, keeping a handle so
1128 // we can free it later.
1129 auto H = TheJIT->addModule(std::move(TheModule));
1130 InitializeModule();
1131
1132 // Search the JIT for the __anon_expr symbol.
1133 auto ExprSymbol = TheJIT->findSymbol("__anon_expr");
1134 assert(ExprSymbol && "Function not found");
1135
1136 // Get the symbol's address and cast it to the right type (takes no
1137 // arguments, returns a double) so we can call it as a native function.
1138 double (*FP)() = (double (*)())(intptr_t)ExprSymbol.getAddress();
1139 fprintf(stderr, "Evaluated to %f\n", FP());
1140
1141 // Delete the anonymous expression module from the JIT.
1142 TheJIT->removeModule(H);
1143 }
1144 } else {
1145 // Skip token for error recovery.
1146 getNextToken();
1147 }
1148 }
1149
1150 /// top ::= definition | external | expression | ';'
1151 static void MainLoop() {
1152 while (true) {
1153 fprintf(stderr, "ready> ");
1154 switch (CurTok) {
1155 case tok_eof:
1156 return;
1157 case ';': // ignore top-level semicolons.
1158 getNextToken();
1159 break;
1160 case tok_def:
1161 HandleDefinition();
1162 break;
1163 case tok_extern:
1164 HandleExtern();
1165 break;
1166 default:
1167 HandleTopLevelExpression();
1168 break;
1169 }
1170 }
1171 }
1172
1173 //===----------------------------------------------------------------------===//
1174 // "Library" functions that can be "extern'd" from user code.
1175 //===----------------------------------------------------------------------===//
1176
1177 /// putchard - putchar that takes a double and returns 0.
1178 extern "C" double putchard(double X) {
1179 fputc((char)X, stderr);
1180 return 0;
1181 }
1182
1183 /// printd - printf that takes a double and prints it as "%f\n", returning 0.
1184 extern "C" double printd(double X) {
1185 fprintf(stderr, "%f\n", X);
1186 return 0;
1187 }
1188
1189 //===----------------------------------------------------------------------===//
1190 // Main driver code.
1191 //===----------------------------------------------------------------------===//
1192
1193 int main() {
1194 InitializeNativeTarget();
1195 InitializeNativeTargetAsmPrinter();
1196 InitializeNativeTargetAsmParser();
1197
1198 // Install standard binary operators.
1199 // 1 is lowest precedence.
1200 BinopPrecedence['='] = 2;
1201 BinopPrecedence['<'] = 10;
1202 BinopPrecedence['+'] = 20;
1203 BinopPrecedence['-'] = 20;
1204 BinopPrecedence['*'] = 40; // highest.
1205
1206 // Prime the first token.
1207 fprintf(stderr, "ready> ");
1208 getNextToken();
1209
1210 TheJIT = llvm::make_unique<KaleidoscopeJIT>();
1211
1212 InitializeModule();
1213
1214 // Run the main "interpreter loop" now.
1215 MainLoop();
1216
1217 return 0;
1218 }
 5  5 add_llvm_example(${name} ${ARGN})
 6  6 endmacro(add_kaleidoscope_chapter name)
 7  7
    8 add_subdirectory(BuildingAJIT)
 8  9 add_subdirectory(Chapter2)
 9 10 add_subdirectory(Chapter3)
10 11 add_subdirectory(Chapter4)