[Kaleidoscope][BuildingAJIT] Add tutorial text for Chapter 2. This chapter discusses IR optimizations, the ORC IRTransformLayer, and the ORC layer concept itself. The text is still pretty rough, but I think the main ideas are there. Feedback is very welcome, as always. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271865 91177308-0d34-0410-b5e6-96231b3b80d8 Lang Hames 3 years ago
Chapter 2 Introduction
======================

Welcome to Chapter 2 of the "Building an ORC-based JIT in LLVM" tutorial. In
`Chapter 1 `_ of this series we examined a basic JIT
class, KaleidoscopeJIT, that could take LLVM IR modules as input and produce
executable code in memory. KaleidoscopeJIT was able to do this with relatively
little code by composing two off-the-shelf *ORC layers*: IRCompileLayer and
ObjectLinkingLayer, to do much of the heavy lifting.

In this chapter we'll learn more about the ORC layer concept by using a new
layer, IRTransformLayer, to add IR optimization support to KaleidoscopeJIT.

Optimizing Modules using the IRTransformLayer
=============================================

In `Chapter 4 `_ of the "Implementing a language with LLVM"
tutorial series the LLVM *FunctionPassManager* is introduced as a means for
optimizing LLVM IR. Interested readers may read that chapter for details, but
in short: to optimize a Module we create an llvm::FunctionPassManager
instance, configure it with a set of optimizations, then run the PassManager
on a Module to mutate it into a (hopefully) more optimized but semantically
equivalent form. In the original tutorial series the FunctionPassManager was
created outside the KaleidoscopeJIT, and modules were optimized before being
added to it. In this chapter we will make optimization a phase of our JIT
instead. For now this will provide us with a motivation to learn more about
ORC layers, but in the long term making optimization part of our JIT will
yield an important benefit: when we begin lazily compiling code (i.e.
deferring compilation of each function until the first time it's run), having
optimization managed by our JIT will allow us to optimize lazily too, rather
than having to do all our optimization up-front.

To add optimization support to our JIT we will take the KaleidoscopeJIT from
Chapter 1 and compose an ORC *IRTransformLayer* on top. We will look at how
the IRTransformLayer works in more detail below, but the interface is simple:
the constructor for this layer takes a reference to the layer below (as all
layers do) plus an *IR optimization function* that it will apply to each
Module that is added via addModuleSet:

.. code-block:: c++

  class KaleidoscopeJIT {
  private:
    std::unique_ptr<TargetMachine> TM;
    const DataLayout DL;
    ObjectLinkingLayer<> ObjectLayer;
    IRCompileLayer<decltype(ObjectLayer)> CompileLayer;

    typedef std::function<std::unique_ptr<Module>(std::unique_ptr<Module>)>
      OptimizeFunction;

    IRTransformLayer<decltype(CompileLayer), OptimizeFunction> OptimizeLayer;

  public:
    typedef decltype(OptimizeLayer)::ModuleSetHandleT ModuleHandle;

    KaleidoscopeJIT()
        : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
          CompileLayer(ObjectLayer, SimpleCompiler(*TM)),
          OptimizeLayer(CompileLayer,
                        [this](std::unique_ptr<Module> M) {
                          return optimizeModule(std::move(M));
                        }) {
      llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
    }

Our extended KaleidoscopeJIT class starts out the same as it did in Chapter 1,
but after the CompileLayer we introduce a typedef for our optimization
function. In this case we use a std::function (a handy wrapper for
"function-like" things) from a single std::unique_ptr<Module> input to a
std::unique_ptr<Module> output. With our optimization function typedef in
place we can declare our OptimizeLayer, which sits on top of our CompileLayer.

To initialize our OptimizeLayer we pass it a reference to the CompileLayer
below (standard practice for layers), and we initialize the OptimizeFunction
using a lambda. In the lambda, we just call out to the "optimizeModule"
function that we will define below.

.. code-block:: c++

  // ...
  auto Resolver = createLambdaResolver(
    [&](const std::string &Name) {
      if (auto Sym = OptimizeLayer.findSymbol(Name, false))
        return Sym.toRuntimeDyldSymbol();
      return RuntimeDyld::SymbolInfo(nullptr);
    },
  // ...
  // Add the set to the JIT with the resolver we created above and a newly
  // created SectionMemoryManager.
  return OptimizeLayer.addModuleSet(std::move(Ms),
                                    make_unique<SectionMemoryManager>(),
                                    std::move(Resolver));
  // ...

  // ...
  return OptimizeLayer.findSymbol(MangledNameStream.str(), true);
  // ...

  // ...
  OptimizeLayer.removeModuleSet(H);
  // ...

Next we need to replace references to 'CompileLayer' with references to
OptimizeLayer in our key methods: addModule, findSymbol, and removeModule. In
addModule we need to be careful to replace both references: the findSymbol
call inside our resolver, and the call through to addModuleSet.

.. code-block:: c++

  std::unique_ptr<Module> optimizeModule(std::unique_ptr<Module> M) {
    // Create a function pass manager.
    auto FPM = llvm::make_unique<legacy::FunctionPassManager>(M.get());

    // Add some optimizations.
    FPM->add(createInstructionCombiningPass());
    FPM->add(createReassociatePass());
    FPM->add(createGVNPass());
    FPM->add(createCFGSimplificationPass());
    FPM->doInitialization();

    // Run the optimizations over all functions in the module being added to
    // the JIT.
    for (auto &F : *M)
      FPM->run(F);

    return M;
  }

At the bottom of our JIT we add a private method to do the actual
optimization: *optimizeModule*. This function sets up a FunctionPassManager,
adds some passes to it, runs it over every function in the module, and then
returns the mutated module. The specific optimizations used are the same ones
used in `Chapter 4 `_ of the "Implementing a language with LLVM"
tutorial series -- readers may visit that chapter for a more in-depth
discussion of them, and of IR optimization in general.

And that's it: when a module is added to our JIT the OptimizeLayer will now
pass it to our optimizeModule function before passing the transformed module
on to the CompileLayer below. Of course, we could have called optimizeModule
directly in our addModule function and not gone to the bother of using the
IRTransformLayer, but it gives us an opportunity to see how layers compose,
and how one can be implemented, because IRTransformLayer turns out to be one
of the simplest implementations of the *layer* concept that can be devised:

.. code-block:: c++

  template <typename BaseLayerT, typename TransformFtor>
  class IRTransformLayer {
  public:
    typedef typename BaseLayerT::ModuleSetHandleT ModuleSetHandleT;

    IRTransformLayer(BaseLayerT &BaseLayer,
                     TransformFtor Transform = TransformFtor())
      : BaseLayer(BaseLayer), Transform(std::move(Transform)) {}

    template <typename ModuleSetT, typename MemoryManagerPtrT,
              typename SymbolResolverPtrT>
    ModuleSetHandleT addModuleSet(ModuleSetT Ms,
                                  MemoryManagerPtrT MemMgr,
                                  SymbolResolverPtrT Resolver) {

      for (auto I = Ms.begin(), E = Ms.end(); I != E; ++I)
        *I = Transform(std::move(*I));

      return BaseLayer.addModuleSet(std::move(Ms), std::move(MemMgr),
                                    std::move(Resolver));
    }

    void removeModuleSet(ModuleSetHandleT H) { BaseLayer.removeModuleSet(H); }

    JITSymbol findSymbol(const std::string &Name, bool ExportedSymbolsOnly) {
      return BaseLayer.findSymbol(Name, ExportedSymbolsOnly);
    }

    JITSymbol findSymbolIn(ModuleSetHandleT H, const std::string &Name,
                           bool ExportedSymbolsOnly) {
      return BaseLayer.findSymbolIn(H, Name, ExportedSymbolsOnly);
    }

    void emitAndFinalize(ModuleSetHandleT H) {
      BaseLayer.emitAndFinalize(H);
    }

    TransformFtor& getTransform() { return Transform; }

    const TransformFtor& getTransform() const { return Transform; }

  private:
    BaseLayerT &BaseLayer;
    TransformFtor Transform;
  };

This is the whole definition of IRTransformLayer, from
``llvm/include/llvm/ExecutionEngine/Orc/IRTransformLayer.h``, stripped of its
comments. It is a template class with two template arguments: ``BaseLayerT``
and ``TransformFtor``, which provide the type of the base layer and the type
of the "transform functor" (in our case a std::function) respectively. The
body of the class is concerned with two very simple jobs: (1) running every IR
Module that is added with addModuleSet through the transform functor, and
(2) conforming to the ORC layer interface, which is:

+------------------+-----------------------------------------------------------+
|     Interface    |                        Description                        |
+==================+===========================================================+
|                  | Provides a handle that can be used to identify a module   |
| ModuleSetHandleT | set when calling findSymbolIn, removeModuleSet, or        |
|                  | emitAndFinalize.                                          |
+------------------+-----------------------------------------------------------+
|                  | Takes a given set of Modules and makes them "available    |
|                  | for execution". This means that symbols in those modules  |
|                  | should be searchable via findSymbol and findSymbolIn, and |
|                  | the address of the symbols should be read/writable (for   |
|                  | data symbols), or executable (for function symbols) after |
|                  | JITSymbol::getAddress() is called. Note: This means that  |
| addModuleSet     | addModuleSet doesn't have to compile (or do any other     |
|                  | work) up-front. It *can*, like IRCompileLayer, act        |
|                  | eagerly, but it can also simply record the module and     |
|                  | take no further action until somebody calls               |
|                  | JITSymbol::getAddress(). In IRTransformLayer's case       |
|                  | addModuleSet eagerly applies the transform functor to     |
|                  | each module in the set, then passes the resulting set     |
|                  | of mutated modules down to the layer below.               |
+------------------+-----------------------------------------------------------+
|                  | Removes a set of modules from the JIT. Code or data       |
| removeModuleSet  | defined in these modules will no longer be available, and |
|                  | the memory holding the JIT'd definitions will be freed.   |
+------------------+-----------------------------------------------------------+
|                  | Searches for the named symbol in all modules that have    |
|                  | previously been added via addModuleSet (and not yet       |
| findSymbol       | removed by a call to removeModuleSet). In                 |
|                  | IRTransformLayer we just pass the query on to the layer   |
|                  | below. In our REPL this is our default way to search for  |
|                  | function definitions.                                     |
+------------------+-----------------------------------------------------------+
|                  | Searches for the named symbol in the module set indicated |
|                  | by the given ModuleSetHandleT. This is just an optimized  |
|                  | search, better for lookup-speed when you know exactly     |
|                  | where a symbol definition should be found. In             |
| findSymbolIn     | IRTransformLayer we just pass this query on to the layer  |
|                  | below. In our REPL we use this method to search for       |
|                  | functions representing top-level expressions, since we    |
|                  | know exactly where we'll find them: in the top-level      |
|                  | expression module we just added.                          |
+------------------+-----------------------------------------------------------+
|                  | Forces all of the actions required to make the code and   |
|                  | data in a module set (represented by a ModuleSetHandleT)  |
|                  | accessible. Behaves as if some symbol in the set had been |
|                  | searched for and JITSymbol::getAddress() called. This     |
| emitAndFinalize  | is rarely needed, but can be useful when dealing with     |
|                  | layers that usually behave lazily if the user wants to    |
|                  | trigger early compilation (for example, to use idle CPU   |
|                  | time to eagerly compile code in the background).          |
+------------------+-----------------------------------------------------------+

This interface attempts to capture the natural operations of a JIT (with some
wrinkles like emitAndFinalize for performance), similar to the basic JIT API
operations we identified in Chapter 1. Conforming to the layer concept allows
classes to compose neatly by implementing their behaviors in terms of these
same operations, carried out on the layer below. For example, an eager layer
(like IRTransformLayer) can implement addModuleSet by running each module in
the set through its transform up-front and immediately passing the result to
the layer below. A lazy layer, by contrast, could implement addModuleSet by
squirreling away the modules and doing no other up-front work, then applying
the transform (and calling addModuleSet on the layer below) when the client
calls findSymbol instead. The JIT'd program behavior will be the same either
way, but these choices will have different performance characteristics: doing
work eagerly means the JIT takes longer up-front, but proceeds smoothly once
this is done. Deferring work allows the JIT to get up-and-running quickly, but
will force the JIT to pause and wait whenever some code or data is needed that
hasn't already been processed.

Our current REPL is eager: each function definition is optimized and compiled
as soon as it's typed in. If we were to make the transform layer lazy (but not
change things otherwise) we could defer optimization until the first time we
reference a function in a top-level expression (see if you can figure out why,
then check out the answer below [1]_). In the next chapter, however, we'll
introduce fully lazy compilation, in which functions aren't compiled until
they're first called at run-time. At this point the trade-offs get much more
interesting: the lazier we are, the quicker we can start executing the first
function, but the more often we'll have to pause to compile newly encountered
functions. If we only code-gen lazily, but optimize eagerly, we'll have a slow
startup (while everything is optimized) but relatively short pauses as each
function just passes through code-gen. If we both optimize and code-gen lazily
we can start executing the first function more quickly, but we'll have longer
pauses as each function has to be both optimized and code-gen'd when it's
first executed. Things become even more interesting if we consider
interprocedural optimizations like inlining, which must be performed eagerly.
These are complex trade-offs, and there is no one-size-fits-all solution to
them, but by providing composable layers we leave the decisions to the person
implementing the JIT, and make it easy for them to experiment with different
configurations.

`Next: Adding Per-function Lazy Compilation `_

Full Code Listing
=================

.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter2/KaleidoscopeJIT.h
   :language: c++

.. [1] When we add our top-level expression to the JIT, any calls to functions
       that we defined earlier will appear to the ObjectLinkingLayer as
       external symbols. The ObjectLinkingLayer will call the SymbolResolver
       that we defined in addModuleSet, which in turn calls findSymbol on the
       OptimizeLayer, at which point even a lazy transform layer will have to
       do its work.