[ORC] Start adding ORCv1 to ORCv2 transition tips to the ORCv2 doc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366075 91177308-0d34-0410-b5e6-96231b3b80d8 Lang Hames a month ago
===============================
ORC Design and Implementation
===============================

Introduction
============

This document aims to provide a high-level overview of the design and
implementation of the ORC JIT APIs. Except where otherwise stated, all
discussion applies to the design of the APIs as of LLVM version 9 (ORCv2).

.. contents::
   :local:

Use-cases
=========

ORC provides a modular API for building JIT compilers. There is a range
of use cases for such an API. For example:

1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions
   compiled from a toy language: Kaleidoscope.

2. The LLVM debugger, LLDB, uses a cross-compiling JIT for expression
   evaluation. In this use case, cross compilation allows expressions compiled
   in the debugger process to be executed on the debug target process, which
   may be on a different device/architecture.

3. High-performance JITs (e.g. JVMs, Julia) that want to make use of LLVM's
   optimizations within an existing JIT infrastructure.

4. Interpreters and REPLs, e.g. Cling (C++) and the Swift interpreter.

By adopting a modular, library-based design we aim to make ORC useful in as
many of these contexts as possible.

Features
========

ORC provides the following features:

- *JIT-linking* links relocatable object files (COFF, ELF, MachO) [1]_ into a
  target process at runtime. The target process may be the same process that
  contains the JIT session object and jit-linker, or may be another process
  (even one running on a different machine or architecture) that communicates
  with the JIT via RPC.

- *LLVM IR compilation*, which is provided by off-the-shelf components
  (IRCompileLayer, SimpleCompiler, ConcurrentIRCompiler) that make it easy to
  add LLVM IR to a JIT'd process.

- *Eager and lazy compilation*. By default, ORC will compile symbols as soon as
  they are looked up in the JIT session object (``ExecutionSession``). Compiling
  eagerly by default makes it easy to use ORC as a simple in-memory compiler for
  an existing JIT. ORC also provides a simple mechanism, lazy-reexports, for
  deferring compilation until first call.

- *Support for custom compilers and program representations*. Clients can supply
  custom compilers for each symbol that they define in their JIT session. ORC
  will run the user-supplied compiler when a definition of a symbol is needed.
  ORC is actually fully language agnostic: LLVM IR is not treated specially,
  and is supported via the same wrapper mechanism (the ``MaterializationUnit``
  class) that is used for custom compilers.

- *Concurrent JIT'd code* and *concurrent compilation*. JIT'd code may spawn
  multiple threads, and may re-enter the JIT (e.g. for lazy compilation)
  concurrently from multiple threads. The ORC APIs also support running multiple
  compilers concurrently, and provide off-the-shelf infrastructure to track
  dependencies on running compiles (e.g. to ensure that we never call into code
  until it is safe to do so, even if that involves waiting on multiple
  compiles).

- *Orthogonality* and *composability*: Each of the features above can be used
  (or not) independently. It is possible to put ORC components together to make
  a non-lazy, in-process, single-threaded JIT or a lazy, out-of-process,
  concurrent JIT, or anything in between.

LLJIT and LLLazyJIT
===================

ORC provides two basic JIT classes off-the-shelf. These are useful both as
examples of how to assemble ORC components to make a JIT, and as replacements
for earlier LLVM JIT APIs (e.g. MCJIT).

The LLJIT class uses an IRCompileLayer and RTDyldObjectLinkingLayer to support
compilation of LLVM IR and linking of relocatable object files. All operations
are performed eagerly on symbol lookup (i.e. a symbol's definition is compiled
as soon as you attempt to look up its address). LLJIT is a suitable replacement
for MCJIT in most cases (note: some more advanced features, e.g.
JITEventListeners, are not supported yet).

The LLLazyJIT class extends LLJIT and adds a CompileOnDemandLayer to enable
lazy compilation of LLVM IR. When an LLVM IR module is added via the
addLazyIRModule method, function bodies in that module will not be compiled
until they are first called. LLLazyJIT aims to provide a replacement for LLVM's
original (pre-MCJIT) JIT API.

LLJIT and LLLazyJIT instances can be created using their respective builder
classes: LLJITBuilder and LLLazyJITBuilder. For example, assuming you have a
module ``M`` loaded into a ThreadSafeContext ``Ctx``:

.. code-block:: c++

  // Try to detect the host arch and construct an LLJIT instance.
  auto JIT = LLJITBuilder().create();

  // If we could not construct an instance, return an error.
  if (!JIT)
    return JIT.takeError();

  // Add the module.
  if (auto Err = JIT->addIRModule(ThreadSafeModule(std::move(M), Ctx)))
    return Err;

  // Look up the JIT'd code entry point.
  auto EntrySym = JIT->lookup("entry");
  if (!EntrySym)
    return EntrySym.takeError();

  // Cast the entry point address to a function pointer and call it.
  auto *Entry = (void(*)())EntrySym->getAddress();

  Entry();

The builder classes provide a number of configuration options that can be
specified before the JIT instance is constructed. For example:

.. code-block:: c++

  // Build an LLLazyJIT instance that uses four worker threads for compilation,
  // and jumps to a specific error handler (rather than null) on lazy compile
  // failures.

  void handleLazyCompileFailure() {
    // JIT'd code will jump here if lazy compilation fails, giving us an
    // opportunity to exit or throw an exception into JIT'd code.
    throw JITFailed();
  }

  auto JIT = LLLazyJITBuilder()
                 .setNumCompileThreads(4)
                 .setLazyCompileFailureAddr(
                     toJITTargetAddress(&handleLazyCompileFailure))
                 .create();

  // ...

For users wanting to get started with LLJIT, a minimal example program can be
found at ``llvm/examples/HowToUseLLJIT``.

Design Overview
===============

ORC's JIT'd program model aims to emulate the linking and symbol resolution
rules used by the static and dynamic linkers. This allows ORC to JIT
arbitrary LLVM IR, including IR produced by an ordinary static compiler (e.g.
clang) that uses constructs like symbol linkage and visibility, and weak and
common symbol definitions.

To see how this works, imagine a program ``myapp`` which links against a pair
of dynamic libraries: ``libA`` and ``libB``. On the command line, building this
system might look like:

.. code-block:: bash

  $ clang++ -shared -o libA.dylib a1.cpp a2.cpp
  $ clang++ -shared -o libB.dylib b1.cpp b2.cpp
  $ clang++ -o myapp myapp.cpp -L. -lA -lB
  $ ./myapp

In ORC, this would translate into API calls on a "CXXCompilingLayer" (with
error checking omitted for brevity) as:

.. code-block:: c++

  ExecutionSession ES;
  RTDyldObjectLinkingLayer ObjLinkingLayer(
      ES, []() { return llvm::make_unique<SectionMemoryManager>(); });
  CXXCompileLayer CXXLayer(ES, ObjLinkingLayer);

  // Create JITDylib "A" and add code to it using the CXX layer.
  auto &LibA = ES.createJITDylib("A");
  CXXLayer.add(LibA, MemoryBuffer::getFile("a1.cpp"));
  CXXLayer.add(LibA, MemoryBuffer::getFile("a2.cpp"));

  // Create JITDylib "B" and add code to it using the CXX layer.
  auto &LibB = ES.createJITDylib("B");
  CXXLayer.add(LibB, MemoryBuffer::getFile("b1.cpp"));
  CXXLayer.add(LibB, MemoryBuffer::getFile("b2.cpp"));

  // Specify the search order for the main JITDylib. This is equivalent to a
  // "links against" relationship in a command-line link.
  ES.getMainJITDylib().setSearchOrder({{&LibA, false}, {&LibB, false}});
  CXXLayer.add(ES.getMainJITDylib(), MemoryBuffer::getFile("main.cpp"));

  // Look up the JIT'd main, cast it to a function pointer, then call it.
  auto MainSym = ExitOnErr(ES.lookup({&ES.getMainJITDylib()}, "main"));
  auto *Main = (int(*)(int, char*[]))MainSym.getAddress();

  int Result = Main(...);

This example tells us nothing about *how* or *when* compilation will happen.
That will depend on the implementation of the hypothetical CXXCompilingLayer,
but the linking rules will be the same regardless. For example, if a1.cpp and
a2.cpp both define a function "foo" the API should generate a duplicate
definition error. On the other hand, if a1.cpp and b1.cpp both define "foo"
there is no error (different dynamic libraries may define the same symbol). If
main.cpp refers to "foo", it should bind to the definition in LibA rather than
the one in LibB, since main.cpp is part of the "main" dylib, and the main dylib
links against LibA before LibB.
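
The search-order binding rule can be modeled with a toy resolver in plain,
self-contained C++ (a sketch only; the names ``Dylib`` and ``resolve`` are
invented for illustration and are not part of the ORC API):

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Toy model of a JITDylib symbol table: a map from names to addresses.
using Dylib = std::map<std::string, uint64_t>;

// Resolve a name by scanning each dylib in link ("search") order and
// returning the first definition found. This mirrors how a reference from
// the main dylib binds to LibA's "foo" rather than LibB's when LibA
// appears first in the search order.
std::optional<uint64_t> resolve(const std::vector<const Dylib *> &SearchOrder,
                                const std::string &Name) {
  for (const auto *D : SearchOrder) {
    auto I = D->find(Name);
    if (I != D->end())
      return I->second;
  }
  return std::nullopt;
}
```

With ``LibA`` defining "foo" at one address and ``LibB`` at another, a search
order of main, LibA, LibB always yields LibA's definition, while names defined
only in LibB still resolve.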

Many JIT clients will have no need for this strict adherence to the usual
ahead-of-time linking rules, and should be able to get by just fine by putting
all of their code in a single JITDylib. However, clients who want to JIT code
for languages/projects that traditionally rely on ahead-of-time linking (e.g.
C++) will find that this feature makes life much easier.

Symbol lookup in ORC serves two other important functions, beyond basic lookup:
(1) it triggers compilation of the symbol(s) searched for, and (2) it provides
the synchronization mechanism for concurrent compilation. The pseudo-code for
the lookup process is:

.. code-block:: none

  construct a query object from a query set and query handler
  lock the session
  lodge query against requested symbols, collect required materializers (if any)
  unlock the session
  dispatch materializers (if any)

In this context a materializer is something that provides a working definition
of a symbol upon request. Generally materializers wrap compilers, but they may
also wrap a linker directly (if the program representation backing the
definitions is an object file), or even just a class that writes bits directly
into memory (if the definitions are stubs). Materialization is the blanket term
for any action (compiling, linking, splatting bits, registering with runtimes,
etc.) that is required to generate a symbol definition that is safe to call or
access.

As each materializer completes its work it notifies the JITDylib, which in turn
notifies any query objects that are waiting on the newly materialized
definitions. Each query object maintains a count of the number of symbols that
it is still waiting on, and once this count reaches zero the query object calls
the query handler with a *SymbolMap* (a map of symbol names to addresses)
describing the result. If any symbol fails to materialize the query immediately
calls the query handler with an error.
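
The query bookkeeping described above can be sketched in self-contained C++
(a toy model for illustration only; ``ToyQuery`` and its method names are
invented and do not correspond to ORC's actual classes):

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// A "SymbolMap" in the sense used above: symbol names to addresses.
using SymbolMap = std::map<std::string, uint64_t>;

// Toy model of a lookup query: it tracks how many requested symbols are
// still unmaterialized and fires its handler with the completed SymbolMap
// once that count reaches zero.
class ToyQuery {
public:
  ToyQuery(const std::vector<std::string> &Symbols,
           std::function<void(const SymbolMap &)> OnComplete)
      : Pending(Symbols.size()), OnComplete(std::move(OnComplete)) {}

  // Called (in ORC, by the JITDylib on behalf of a materializer) as each
  // symbol definition becomes available.
  void notifySymbolReady(const std::string &Name, uint64_t Addr) {
    Result[Name] = Addr;
    if (--Pending == 0)
      OnComplete(Result); // Last outstanding symbol: deliver the result.
  }

private:
  size_t Pending;
  SymbolMap Result;
  std::function<void(const SymbolMap &)> OnComplete;
};
```

The handler runs only after the final outstanding symbol is ready; earlier
notifications just decrement the count.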

The collected materialization units are sent to the ExecutionSession to be
dispatched, and the dispatch behavior can be set by the client. By default each
materializer is run on the calling thread. Clients are free to create new
threads to run materializers, or to send the work to a work queue for a thread
pool (this is what LLJIT/LLLazyJIT do).
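
The swappable dispatch policy can be sketched as follows (a toy model;
``ToySession`` and its members are invented for illustration and are not the
ExecutionSession API):

```cpp
#include <functional>
#include <thread>
#include <utility>

// Toy model of client-configurable materializer dispatch: the default
// policy runs work synchronously on the calling thread; a client can
// install a policy that hands work to new threads or a thread pool.
class ToySession {
public:
  using WorkFn = std::function<void()>;
  using DispatchFn = std::function<void(WorkFn)>;

  // Default: run the materializer on the calling thread.
  ToySession() : Dispatch([](WorkFn Work) { Work(); }) {}

  void setDispatchPolicy(DispatchFn D) { Dispatch = std::move(D); }

  // Invoked for each collected materializer.
  void dispatchMaterializer(WorkFn Work) { Dispatch(std::move(Work)); }

private:
  DispatchFn Dispatch;
};
```

A client could, for example, install a policy that spawns a thread per
materializer, or one that pushes work onto a shared queue drained by a fixed
pool of workers.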

Top Level APIs
==============

Many of ORC's top-level APIs are visible in the example above:

- *ExecutionSession* represents the JIT'd program and provides context for the
  JIT: it contains the JITDylibs and error reporting mechanisms, and dispatches
  the materializers.

- *JITDylibs* provide the symbol tables.

- *Layers* (ObjLinkingLayer and CXXLayer) are wrappers around compilers and
  allow clients to add uncompiled program representations supported by those
  compilers to JITDylibs.

Several other important APIs are used explicitly. JIT clients need not be aware
of them, but Layer authors will use them:

- *MaterializationUnit* - When XXXLayer::add is invoked it wraps the given
  program representation (in this example, C++ source) in a MaterializationUnit,
  which is then stored in the JITDylib. MaterializationUnits are responsible for
  describing the definitions they provide, and for unwrapping the program
  representation and passing it back to the layer when compilation is required
  (this ownership shuffle makes writing thread-safe layers easier, since
  ownership of the program representation will be passed back on the stack,
  rather than having to be fished out of a Layer member, which would require
  synchronization).

- *MaterializationResponsibility* - When a MaterializationUnit hands a program
  representation back to the layer it comes with an associated
  MaterializationResponsibility object. This object tracks the definitions
  that must be materialized and provides a way to notify the JITDylib once they
  are either successfully materialized or a failure occurs.
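
The ownership shuffle mentioned above can be illustrated with a toy in plain
C++ (``ToyModule`` and ``ToyMaterializationUnit`` are invented names; real
MaterializationUnits carry more state, such as the symbols they provide):

```cpp
#include <memory>
#include <string>
#include <utility>

// Toy program representation standing in for an LLVM module or C++ source.
struct ToyModule {
  std::string Source;
};

// Toy MaterializationUnit: it owns the program representation while the
// definitions sit unmaterialized in the JITDylib, then hands ownership
// back (on the stack, by value) when compilation is needed. Nothing has
// to be fished out of shared layer state under a lock, because the layer
// never held the representation.
class ToyMaterializationUnit {
public:
  explicit ToyMaterializationUnit(std::unique_ptr<ToyModule> M)
      : M(std::move(M)) {}

  // Called when materialization is triggered: ownership moves back out.
  std::unique_ptr<ToyModule> takeModule() { return std::move(M); }

private:
  std::unique_ptr<ToyModule> M;
};
```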

Handy utilities
===============

TBD: absolute symbols, aliases, off-the-shelf layers.

Laziness
========

Laziness in ORC is provided by a utility called "lazy-reexports". The aim of
this utility is to re-use the synchronization provided by the symbol lookup
mechanism to make it safe to lazily compile functions, even if calls to the
stub occur simultaneously on multiple threads of JIT'd code. It does this by
reducing lazy compilation to symbol lookup: the lazy stub performs a lookup of
its underlying definition on first call, updating the function body pointer
once the definition is available. If additional calls arrive on other threads
while compilation is ongoing they will be safely blocked by the normal lookup
synchronization guarantee (no result until the result is safe) and can then
proceed as soon as compilation completes.

TBD: Usage example.
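
The reduction of lazy compilation to lookup can be modeled in plain C++ with
``std::call_once`` standing in for the lookup synchronization (a toy sketch
only; real ORC stubs are generated machine code that re-enters the JIT, and
``LazyStub`` here is an invented name):

```cpp
#include <mutex>

// Toy lazy stub: the body pointer starts out unset. The first call
// "materializes" the body (in ORC, a lookup that triggers compilation);
// concurrent callers block on the once-flag until the body is ready,
// mirroring the "no result until the result is safe" guarantee.
using BodyFn = int (*)();

static int expensiveToCompileBody() { return 42; }

class LazyStub {
public:
  int operator()() {
    std::call_once(Compiled, [this] {
      // In ORC this would be a symbol lookup that runs the compiler.
      Body = &expensiveToCompileBody;
    });
    return Body(); // Body is guaranteed set once call_once returns.
  }

private:
  std::once_flag Compiled;
  BodyFn Body = nullptr;
};
```

Subsequent calls skip the once-block entirely and jump straight to the
already-materialized body.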

Supporting Custom Compilers
===========================

TBD.

Transitioning from ORCv1 to ORCv2
=================================

Since LLVM 7.0, new ORC development has focused on adding support for
concurrent compilation. In order to enable concurrency, new APIs were
introduced (ExecutionSession, JITDylib, etc.) and new implementations of
existing layers were written. In LLVM 8.0 the old layer implementations, which
do not support concurrency, were renamed (with a "Legacy" prefix) but remained
in tree. In LLVM 9.0 we have added a deprecation warning for the old layers and
utilities, and in LLVM 10.0 the old layers and utilities will be removed.

Clients currently using the legacy (ORCv1) layers and utilities will usually
find it easy to transition to the newer (ORCv2) variants. Most of the ORCv1
layers and utilities have ORCv2 counterparts [2]_ that can be substituted.
However, there are some differences between ORCv1 and ORCv2 to be aware of:

1. All JIT stacks now need an ExecutionSession instance which manages the
   string pool, error reporting, synchronization, and symbol lookup.

2. ORCv2 uses uniqued strings (``SymbolStringPtr`` instances) to reduce memory
   overhead and improve lookup performance. To get a uniqued string, call
   ``intern`` on your ExecutionSession instance:

   .. code-block:: c++

     ExecutionSession ES;

     // ...

     auto MainSymbolName = ES.intern("main");

3. Program representations (Modules, Object Files, etc.) are no longer added
   *to* layers. Instead they are added *to* JITDylibs *by* layers. The layer
   determines how the program representation will be compiled if it is needed.
   The JITDylib provides the symbol table, enforces linkage rules (e.g.
   rejecting duplicate definitions), and synchronizes concurrent compiles.

   Most ORCv1 clients (or MCJIT clients wanting to try out ORCv2) should
   simply add code to the default *main* JITDylib provided by the
   ExecutionSession:

   .. code-block:: c++

     ExecutionSession ES;
     RTDyldObjectLinkingLayer ObjLinkingLayer(
         ES, []() { return llvm::make_unique<SectionMemoryManager>(); });
     IRCompileLayer CompileLayer(ES, ObjLinkingLayer, SimpleIRCompiler(TM));

     auto M = loadModule(...);

     if (auto Err = CompileLayer.add(ES.getMainJITDylib(), M))
       return Err;

4. IR layers require ThreadSafeModule instances, rather than
   std::unique_ptr<Module>s. A ThreadSafeModule instance is a pair of a
   std::unique_ptr<Module> and a ThreadSafeContext, which is in turn a
   pair of a std::unique_ptr<LLVMContext> and a lock. This allows the JIT
   to ensure that the LLVMContext for a module is locked before the module
   is accessed. Multiple ThreadSafeModules may share a ThreadSafeContext
   value, but in that case the modules will not be able to be compiled
   concurrently [3]_.

   ThreadSafeContexts may be constructed explicitly:

   .. code-block:: c++

     // ThreadSafeContext shared between two modules.
     ThreadSafeContext TSCtx(llvm::make_unique<LLVMContext>());
     ThreadSafeModule TSM1(
         llvm::make_unique<Module>("M1", *TSCtx.getContext()), TSCtx);
     ThreadSafeModule TSM2(
         llvm::make_unique<Module>("M2", *TSCtx.getContext()), TSCtx);

   Alternatively, they can be created implicitly by passing a new LLVMContext
   to the ThreadSafeModule constructor:

   .. code-block:: c++

     // Constructing a ThreadSafeModule (and implicitly a ThreadSafeContext)
     // from a pair of a Module and a Context.
     auto Ctx = llvm::make_unique<LLVMContext>();
     auto M = llvm::make_unique<Module>("M", *Ctx);
     return ThreadSafeModule(std::move(M), std::move(Ctx));

5. The symbol resolution and lookup schemes have been fundamentally changed.
   Symbol lookup has been removed from the layer interface. Instead,
   symbols are looked up via the ``ExecutionSession::lookup`` method by
   scanning a list of JITDylibs.

   SymbolResolvers have been removed entirely. Resolution rules now follow the
   linkage relationship between JITDylibs. For example, to resolve a reference
   to a symbol *F* from a module *M* that has been added to JITDylib *J1* we
   would first search for a definition of *F* in *J1*, then (if no definition
   was found) search each of the JITDylibs that *J1* links against.

   While the new resolution scheme is, strictly speaking, less flexible than
   the old scheme of customizable resolvers, this has not yet led to problems
   in practice. Instead, using standard linker rules has removed a lot of
   boilerplate while providing correct [4]_ behavior for common and weak
   symbols.

   One notable difference is in exposing in-process symbols to the JIT. To
   support this (without requiring the set of symbols to be enumerated up
   front), JITDylibs allow for a *GeneratorFunction* to be attached to
   generate new definitions upon lookup. Reflecting the process's symbols into
   the JIT can be done by writing:

   .. code-block:: c++

     ExecutionSession ES;
     const DataLayout &DL = ...;

     {
       auto ProcessSymbolsGenerator =
         DynamicLibrarySearchGenerator::GetForCurrentProcess(DL.getGlobalPrefix());
       if (!ProcessSymbolsGenerator)
         return ProcessSymbolsGenerator.takeError();
       ES.getMainJITDylib().setGenerator(std::move(*ProcessSymbolsGenerator));
     }

6. Module removal is not yet supported. There is no equivalent of the
   layer concept removeModule/removeObject methods. Work on resource tracking
   and removal in ORCv2 is ongoing.

Future Features
===============

TBD: Speculative compilation. Object Caches.

.. [1] Formats/architectures vary in terms of supported features. MachO and
       ELF tend to have better support than COFF. Patches very welcome!

.. [2] The ``LazyEmittingLayer``, ``RemoteObjectClientLayer`` and
       ``RemoteObjectServerLayer`` do not have counterparts in the new
       system. In the case of ``LazyEmittingLayer`` it was simply no longer
       needed: in ORCv2, deferring compilation until symbols are looked up is
       the default. The removal of ``RemoteObjectClientLayer`` and
       ``RemoteObjectServerLayer`` means that JIT stacks can no longer be
       split across processes; however, this functionality appears not to
       have been used.

.. [3] Sharing ThreadSafeModules in a concurrent compilation can be dangerous:
       if interdependent modules are loaded on the same context, but compiled
       on different threads, a deadlock may occur (with each compile waiting
       for the other(s) to complete, and the other(s) unable to proceed
       because the context is locked).

.. [4] Mostly. Weak definitions are handled correctly within dylibs, but if
       multiple dylibs provide a weak definition of a symbol, each will end up
       with its own definition (similar to how weak symbols in Windows DLLs
       behave). This will be fixed in the future.
+0
-325
docs/ORCv2DesignAndImplementation.rst less more
None ===============================
1 ORC Design and Implementation
2 ===============================
3
4 Introduction
5 ============
6
7 This document aims to provide a high-level overview of the design and
8 implementation of the ORC JIT APIs. Except where otherwise stated, all
9 discussion applies to the design of the APIs as of LLVM verison 9 (ORCv2).
10
11 .. contents::
12 :local:
13
14 Use-cases
15 =========
16
17 ORC provides a modular API for building JIT compilers. There are a range
18 of use cases for such an API:
19
20 1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions
21 compiled from a toy languge: Kaleidoscope.
22
23 2. The LLVM debugger, LLDB, uses a cross-compiling JIT for expression
24 evaluation. In this use case, cross compilation allows expressions compiled
25 in the debugger process to be executed on the debug target process, which may
26 be on a different device/architecture.
27
28 3. In high-performance JITs (e.g. JVMs, Julia) that want to make use of LLVM's
29 optimizations within an existing JIT infrastructure.
30
31 4. In interpreters and REPLs, e.g. Cling (C++) and the Swift interpreter.
32
33 By adoping a modular, library-based design we aim to make ORC useful in as many
34 of these contexts as possible.
35
36 Features
37 ========
38
39 ORC provides the following features:
40
41 - *JIT-linking* links relocatable object files (COFF, ELF, MachO) [1]_ into a
42 target process an runtime. The target process may be the same process that
43 contains the JIT session object and jit-linker, or may be another process
44 (even one running on a different machine or architecture) that communicates
45 with the JIT via RPC.
46
47 - *LLVM IR compilation*, which is provided by off the shelf components
48 (IRCompileLayer, SimpleCompiler, ConcurrentIRCompiler) that make it easy to
49 add LLVM IR to a JIT'd process.
50
51 - *Eager and lazy compilation*. By default, ORC will compile symbols as soon as
52 they are looked up in the JIT session object (``ExecutionSession``). Compiling
53 eagerly by default makes it easy to use ORC as a simple in-memory compiler for
54 an existing JIT. ORC also provides a simple mechanism, lazy-reexports, for
55 deferring compilation until first call.
56
57 - *Support for custom compilers and program representations*. Clients can supply
58 custom compilers for each symbol that they define in their JIT session. ORC
59 will run the user-supplied compiler when the a definition of a symbol is
60 needed. ORC is actually fully language agnostic: LLVM IR is not treated
61 specially, and is supported via the same wrapper mechanism (the
62 ``MaterializationUnit`` class) that is used for custom compilers.
63
64 - *Concurrent JIT'd code* and *concurrent compilation*. JIT'd code may spawn
65 multiple threads, and may re-enter the JIT (e.g. for lazy compilation)
66 concurrently from multiple threads. The ORC APIs also support running multiple
67 compilers concurrently, and provides off-the-shelf infrastructure to track
68 dependencies on running compiles (e.g. to ensure that we never call into code
69 until it is safe to do so, even if that involves waiting on multiple
70 compiles).
71
72 - *Orthogonality* and *composability*: Each of the features above can be used (or
73 not) independently. It is possible to put ORC components together to make a
74 non-lazy, in-process, single threaded JIT or a lazy, out-of-process,
75 concurrent JIT, or anything in between.
76
77 LLJIT and LLLazyJIT
78 ===================
79
80 ORC provides two basic JIT classes off-the-shelf. These are useful both as
81 examples of how to assemble ORC components to make a JIT, and as replacements
82 for earlier LLVM JIT APIs (e.g. MCJIT).
83
84 The LLJIT class uses an IRCompileLayer and RTDyldObjectLinkingLayer to support
85 compilation of LLVM IR and linking of relocatable object files. All operations
86 are performed eagerly on symbol lookup (i.e. a symbol's definition is compiled
87 as soon as you attempt to look up its address). LLJIT is a suitable replacement
88 for MCJIT in most cases (note: some more advanced features, e.g.
89 JITEventListeners are not supported yet).
90
91 The LLLazyJIT extends LLJIT and adds a CompileOnDemandLayer to enable lazy
92 compilation of LLVM IR. When an LLVM IR module is added via the addLazyIRModule
93 method, function bodies in that module will not be compiled until they are first
94 called. LLLazyJIT aims to provide a replacement of LLVM's original (pre-MCJIT)
95 JIT API.
96
97 LLJIT and LLLazyJIT instances can be created using their respective builder
98 classes: LLJITBuilder and LLazyJITBuilder. For example, assuming you have a
99 module ``M`` loaded on an ThreadSafeContext ``Ctx``:
100
101 .. code-block:: c++
102
103 // Try to detect the host arch and construct an LLJIT instance.
104 auto JIT = LLJITBuilder().create();
105
106 // If we could not construct an instance, return an error.
107 if (!JIT)
108 return JIT.takeError();
109
110 // Add the module.
111 if (auto Err = JIT->addIRModule(TheadSafeModule(std::move(M), Ctx)))
112 return Err;
113
114 // Look up the JIT'd code entry point.
115 auto EntrySym = JIT->lookup("entry");
116 if (!EntrySym)
117 return EntrySym.takeError();
118
119 auto *Entry = (void(*)())EntrySym.getAddress();
120
121 Entry();
122
123 The builder clasess provide a number of configuration options that can be
124 specified before the JIT instance is constructed. For example:
125
126 .. code-block:: c++
127
128 // Build an LLLazyJIT instance that uses four worker threads for compilation,
129 // and jumps to a specific error handler (rather than null) on lazy compile
130 // failures.
131
132 void handleLazyCompileFailure() {
133 // JIT'd code will jump here if lazy compilation fails, giving us an
134 // opportunity to exit or throw an exception into JIT'd code.
135 throw JITFailed();
136 }
137
138 auto JIT = LLLazyJITBuilder()
139 .setNumCompileThreads(4)
140 .setLazyCompileFailureAddr(
141 toJITTargetAddress(&handleLazyCompileFailure))
142 .create();
143
144 // ...
145
146 For users wanting to get started with LLJIT a minimal example program can be
147 found at ``llvm/examples/HowToUseLLJIT``.
148
149 Design Overview
150 ===============
151
152 ORC's JIT'd program model aims to emulate the linking and symbol resolution
153 rules used by the static and dynamic linkers. This allows ORC to JIT
154 arbitrary LLVM IR, including IR produced by an ordinary static compiler (e.g.
155 clang) that uses constructs like symbol linkage and visibility, and weak and
156 common symbol definitions.
157
158 To see how this works, imagine a program ``foo`` which links against a pair
159 of dynamic libraries: ``libA`` and ``libB``. On the command line, building this
160 system might look like:
161
162 .. code-block:: bash
163
164 $ clang++ -shared -o libA.dylib a1.cpp a2.cpp
165 $ clang++ -shared -o libB.dylib b1.cpp b2.cpp
166 $ clang++ -o myapp myapp.cpp -L. -lA -lB
167 $ ./myapp
168
169 In ORC, this would translate into API calls on a "CXXCompilingLayer" (with error
170 checking omitted for brevity) as:
171
172 .. code-block:: c++
173
174 ExecutionSession ES;
175 RTDyldObjectLinkingLayer ObjLinkingLayer(
176 ES, []() { return llvm::make_unique(); });
177 CXXCompileLayer CXXLayer(ES, ObjLinkingLayer);
178
179 // Create JITDylib "A" and add code to it using the CXX layer.
180 auto &LibA = ES.createJITDylib("A");
181 CXXLayer.add(LibA, MemoryBuffer::getFile("a1.cpp"));
182 CXXLayer.add(LibA, MemoryBuffer::getFile("a2.cpp"));
183
184 // Create JITDylib "B" and add code to it using the CXX layer.
185 auto &LibB = ES.createJITDylib("B");
186 CXXLayer.add(LibB, MemoryBuffer::getFile("b1.cpp"));
187 CXXLayer.add(LibB, MemoryBuffer::getFile("b2.cpp"));
188
189 // Specify the search order for the main JITDylib. This is equivalent to a
190 // "links against" relationship in a command-line link.
191 ES.getMainJITDylib().setSearchOrder({{&LibA, false}, {&LibB, false}});
192 CXXLayer.add(ES.getMainJITDylib(), MemoryBuffer::getFile("main.cpp"));
193
194 // Look up the JIT'd main, cast it to a function pointer, then call it.
195 auto MainSym = ExitOnErr(ES.lookup({&ES.getMainJITDylib()}, "main"));
196 auto *Main = (int(*)(int, char*[]))MainSym.getAddress();
197
198 int Result = Main(...);
199
200
201 This example tells us nothing about *how* or *when* compilation will happen.
202 That will depend on the implementation of the hypothetical CXXCompilingLayer,
203 but the linking rules will be the same regardless. For example, if a1.cpp and
204 a2.cpp both define a function "foo" the API should generate a duplicate
205 definition error. On the other hand, if a1.cpp and b1.cpp both define "foo"
206 there is no error (different dynamic libraries may define the same symbol). If
207 main.cpp refers to "foo", it should bind to the definition in LibA rather than
208 the one in LibB, since main.cpp is part of the "main" dylib, and the main dylib
209 links against LibA before LibB.
210
211 Many JIT clients will have no need for this strict adherence to the usual
212 ahead-of-time linking rules and should be able to get by just fine by putting
213 all of their code in a single JITDylib. However, clients who want to JIT code
214 for languages/projects that traditionally rely on ahead-of-time linking (e.g.
215 C++) will find that this feature makes life much easier.
216
217 Symbol lookup in ORC serves two other important functions, beyond basic lookup:
218 (1) It triggers compilation of the symbol(s) searched for, and (2) it provides
219 the synchronization mechanism for concurrent compilation. The pseudo-code for
220 the lookup process is:
221
222 .. code-block:: none
223
224 construct a query object from a query set and query handler
225 lock the session
226 lodge query against requested symbols, collect required materializers (if any)
227 unlock the session
228 dispatch materializers (if any)
229
230 In this context a materializer is something that provides a working definition
231 of a symbol upon request. Generally materializers wrap compilers, but they may
232 also wrap a linker directly (if the program representation backing the
233 definitions is an object file), or even just a class that writes bits directly
234 into memory (if the definitions are stubs). Materialization is the blanket term
235 for any actions (compiling, linking, splatting bits, registering with runtimes,
236 etc.) that is requried to generate a symbol definition that is safe to call or
237 access.

As each materializer completes its work it notifies the JITDylib, which in turn
notifies any query objects that are waiting on the newly materialized
definitions. Each query object maintains a count of the number of symbols that
it is still waiting on, and once this count reaches zero the query object calls
the query handler with a *SymbolMap* (a map of symbol names to addresses)
describing the result. If any symbol fails to materialize the query immediately
calls the query handler with an error.
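
This notification-and-counting scheme can be sketched in plain C++. The
following is a toy model only (``ToyQuery`` and its members are hypothetical
names, not the real ORC classes, and session locking is omitted):

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for ORC's SymbolMap: symbol names to addresses.
using SymbolMap = std::map<std::string, std::uintptr_t>;

class ToyQuery {
public:
  // A query is built from a set of symbol names and a handler that is
  // called exactly once, with success/failure and the resulting map.
  ToyQuery(std::vector<std::string> Symbols,
           std::function<void(bool, const SymbolMap &)> Handler)
      : Remaining(Symbols.size()), Handler(std::move(Handler)) {}

  // Called by a materializer when one symbol definition becomes ready.
  // When the outstanding-symbol count hits zero, deliver the SymbolMap.
  void notifySymbolReady(const std::string &Name, std::uintptr_t Addr) {
    Result[Name] = Addr;
    if (--Remaining == 0)
      Handler(true, Result);
  }

  // Called if any symbol fails to materialize: fail the query immediately.
  void notifyFailure() { Handler(false, Result); }

private:
  std::size_t Remaining;
  SymbolMap Result;
  std::function<void(bool, const SymbolMap &)> Handler;
};
```

The handler fires only after the last awaited symbol is materialized, which
is what lets multiple concurrent compiles feed a single lookup safely.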

The collected materialization units are sent to the ExecutionSession to be
dispatched, and the dispatch behavior can be set by the client. By default each
materializer is run on the calling thread. Clients are free to create new
threads to run materializers, or to send the work to a work queue for a thread
pool (this is what LLJIT/LLLazyJIT do).

Top Level APIs
==============

Many of ORC's top-level APIs are visible in the example above:

- *ExecutionSession* represents the JIT'd program and provides context for the
  JIT: It contains the JITDylibs, error reporting mechanisms, and dispatches
  the materializers.

- *JITDylibs* provide the symbol tables.

- *Layers* (ObjLinkingLayer and CXXLayer) are wrappers around compilers and
  allow clients to add uncompiled program representations supported by those
  compilers to JITDylibs.

Several other important APIs are used explicitly. JIT clients need not be aware
of them, but Layer authors will use them:

- *MaterializationUnit* - When XXXLayer::add is invoked it wraps the given
  program representation (in this example, C++ source) in a
  MaterializationUnit, which is then stored in the JITDylib.
  MaterializationUnits are responsible for describing the definitions they
  provide, and for unwrapping the program representation and passing it back
  to the layer when compilation is required (this ownership shuffle makes
  writing thread-safe layers easier, since the ownership of the program
  representation will be passed back on the stack, rather than having to be
  fished out of a Layer member, which would require synchronization).

- *MaterializationResponsibility* - When a MaterializationUnit hands a program
  representation back to the layer it comes with an associated
  MaterializationResponsibility object. This object tracks the definitions
  that must be materialized and provides a way to notify the JITDylib once
  they are either successfully materialized or a failure occurs.
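
The ownership shuffle described above can be modeled in a few lines.
``ToySource``, ``ToyLayer``, and ``ToyMaterializationUnit`` below are
hypothetical stand-ins, not ORC classes; the point is only that the program
representation is moved back to the layer on the stack at materialization
time, so no shared member needs locking:

```cpp
#include <memory>
#include <string>
#include <utility>

// Toy program representation (e.g. some source text).
struct ToySource {
  std::string Text;
};

// Toy layer: receives ownership of the representation when asked to compile.
struct ToyLayer {
  std::unique_ptr<ToySource> Received;
  void emit(std::unique_ptr<ToySource> Src) { Received = std::move(Src); }
};

class ToyMaterializationUnit {
public:
  explicit ToyMaterializationUnit(std::unique_ptr<ToySource> Src)
      : Src(std::move(Src)) {}

  // On materialization, ownership is moved out of this unit and handed
  // back to the layer by value; after this call the unit owns nothing,
  // so no synchronization on a shared member is needed.
  void materialize(ToyLayer &L) { L.emit(std::move(Src)); }

private:
  std::unique_ptr<ToySource> Src;
};
```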

Handy utilities
===============

TBD: absolute symbols, aliases, off-the-shelf layers.

Laziness
========

Laziness in ORC is provided by a utility called "lazy-reexports". The aim of
this utility is to re-use the synchronization provided by the symbol lookup
mechanism to make it safe to lazily compile functions, even if calls to the
stub occur simultaneously on multiple threads of JIT'd code. It does this by
reducing lazy compilation to symbol lookup: The lazy stub performs a lookup of
its underlying definition on first call, updating the function body pointer
once the definition is available. If additional calls arrive on other threads
while compilation is ongoing they will be safely blocked by the normal lookup
synchronization guarantee (no result until the result is safe) and will
proceed as soon as compilation completes.

TBD: Usage example.

Supporting Custom Compilers
===========================

TBD.

Low Level (MCJIT style) Use
===========================

TBD.

Future Features
===============

TBD: Speculative compilation. Object Caches.

.. [1] Formats/architectures vary in terms of supported features. MachO and
   ELF tend to have better support than COFF. Patches very welcome!
GetElementPtr
Frontend/PerformanceTips
MCJITDesignAndImplementation
ORCv2
CodeOfConduct
CompileCudaWithLLVM
ReportingGuide

:doc:`MCJITDesignAndImplementation`
   Describes the inner workings of MCJIT execution engine.

:doc:`ORCv2`
   Describes the design and implementation of the ORC APIs, including some
   usage examples, and a guide for users transitioning from ORCv1 to ORCv2.

:doc:`BranchWeightMetadata`
   Provides information about Branch Prediction Information.