llvm.org GIT mirror llvm / 23dcb18
Adding a document to describe the MCJIT execution engine implementation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188943 91177308-0d34-0410-b5e6-96231b3b80d8 Andrew Kaylor 6 years ago
8 changed file(s) with 183 addition(s) and 0 deletion(s).
Binary diff not shown
0 ===============================
1 MCJIT Design and Implementation
2 ===============================
3
4 Introduction
5 ============
6
7 This document describes the internal workings of the MCJIT execution
8 engine and the RuntimeDyld component. It is intended as a high level
9 overview of the implementation, showing the flow and interactions of
10 objects throughout the code generation and dynamic loading process.
11
12 Engine Creation
13 ===============
14
15 In most cases, an EngineBuilder object is used to create an instance of
16 the MCJIT execution engine. The EngineBuilder takes an llvm::Module
17 object as an argument to its constructor. The client may then set various
18 options that will later be passed along to the MCJIT engine,
19 including the selection of MCJIT as the engine type to be created.
20 Of particular interest is the EngineBuilder::setMCJITMemoryManager
21 function. If the client does not explicitly create a memory manager at
22 this time, a default memory manager (specifically SectionMemoryManager)
23 will be created when the MCJIT engine is instantiated.
24
25 Once the options have been set, a client calls EngineBuilder::create to
26 create an instance of the MCJIT engine. If the client does not use the
27 form of this function that takes a TargetMachine as a parameter, a new
28 TargetMachine will be created based on the target triple associated with
29 the Module that was used to create the EngineBuilder.
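
The defaulting behaviour described above can be sketched with a small toy model. This is not the LLVM API: ``ToyEngineBuilder``, ``MemoryManager`` and ``Engine`` are hypothetical stand-ins, and only the fallback to a default memory manager named after SectionMemoryManager is taken from the text.

```cpp
#include <memory>
#include <string>
#include <utility>

// Toy stand-ins for illustration only; these are NOT the LLVM classes.
struct MemoryManager {
  std::string Kind;
};

struct Engine {
  std::string Triple;                    // triple taken from the module
  std::unique_ptr<MemoryManager> MemMgr; // owned by the engine after creation
};

class ToyEngineBuilder {
  std::string ModuleTriple;
  std::unique_ptr<MemoryManager> MemMgr; // stays empty unless the client sets one

public:
  explicit ToyEngineBuilder(std::string Triple)
      : ModuleTriple(std::move(Triple)) {}

  ToyEngineBuilder &setMemoryManager(std::unique_ptr<MemoryManager> MM) {
    MemMgr = std::move(MM);
    return *this;
  }

  // If no memory manager was supplied, fall back to a default one,
  // then hand ownership of everything to the new engine.
  Engine create() {
    if (!MemMgr)
      MemMgr = std::make_unique<MemoryManager>(
          MemoryManager{"SectionMemoryManager"});
    return Engine{ModuleTriple, std::move(MemMgr)};
  }
};
```

The builder gives up its memory manager in ``create``, mirroring the ownership transfer to the engine that the next paragraph describes.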
30
31 .. image:: MCJIT-engine-builder.png
32
33 EngineBuilder::create will call the static MCJIT::createJIT function,
34 passing in its pointers to the module, memory manager and target machine
35 objects, all of which will subsequently be owned by the MCJIT object.
36
37 The MCJIT class has a member variable, Dyld, which contains an instance of
38 the RuntimeDyld wrapper class. This member will be used for
39 communications between MCJIT and the actual RuntimeDyldImpl object that
40 gets created when an object is loaded.
41
42 .. image:: MCJIT-creation.png
43
44 Upon creation, MCJIT holds a pointer to the Module object that it received
45 from EngineBuilder but it does not immediately generate code for this
46 module. Code generation is deferred until either the
47 MCJIT::finalizeObject method is called explicitly or a function such as
48 MCJIT::getPointerToFunction is called which requires the code to have been
49 generated.
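
A minimal sketch of this deferral, assuming a hypothetical ``LazyEngine`` class (not the real MCJIT interface): the expensive step runs at most once, on whichever trigger arrives first. The returned address is a made-up placeholder.

```cpp
#include <cstdint>
#include <string>

// Toy model of deferred code generation; names mirror the text but the
// class itself is hypothetical.
class LazyEngine {
  bool Generated = false;
  int EmitCount = 0;

  void generateIfNeeded() {
    if (Generated)
      return;
    ++EmitCount; // stands in for the expensive code generation step
    Generated = true;
  }

public:
  // Explicit trigger.
  void finalizeObject() { generateIfNeeded(); }

  // Implicit trigger: an address can only be returned once code exists.
  std::uint64_t getPointerToFunction(const std::string & /*Name*/) {
    generateIfNeeded();
    return 0x1000; // placeholder address, for illustration only
  }

  int emitCount() const { return EmitCount; }
};
```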
50
51 Code Generation
52 ===============
53
54 When code generation is triggered, as described above, MCJIT will first
55 attempt to retrieve an object image from its ObjectCache member, if one
56 has been set. If a cached object image cannot be retrieved, MCJIT will
57 call its emitObject method. MCJIT::emitObject uses a local PassManager
58 instance and creates a new ObjectBufferStream instance, both of which it
59 passes to TargetMachine::addPassesToEmitMC before calling PassManager::run
60 on the Module with which it was created.
61
62 .. image:: MCJIT-load.png
63
64 The PassManager::run call causes the MC code generation mechanisms to emit
65 a complete relocatable binary object image (in either ELF or MachO
66 format, depending on the target) into the ObjectBufferStream object, which
67 is flushed to complete the process. If an ObjectCache is being used, the
68 image will be passed to the ObjectCache here.
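
The cache-first flow can be sketched as follows. ``ToyObjectCache`` and ``obtainObject`` are hypothetical names, a string stands in for the object image, and the ``CompileCount`` parameter exists only so the sketch can show when compilation actually happens.

```cpp
#include <map>
#include <optional>
#include <string>

using ObjectBytes = std::string; // stand-in for a relocatable object image

// Hypothetical cache keyed by a module identifier.
struct ToyObjectCache {
  std::map<std::string, ObjectBytes> Store;

  std::optional<ObjectBytes> get(const std::string &ModuleId) const {
    auto It = Store.find(ModuleId);
    if (It == Store.end())
      return std::nullopt;
    return It->second;
  }

  void notifyCompiled(const std::string &ModuleId, const ObjectBytes &Obj) {
    Store[ModuleId] = Obj;
  }
};

// Try the cache first; compile only on a miss, then hand the fresh image
// to the cache so a later run can skip code generation.
ObjectBytes obtainObject(const std::string &ModuleId, ToyObjectCache *Cache,
                         int &CompileCount) {
  if (Cache) {
    if (auto Hit = Cache->get(ModuleId))
      return *Hit;
  }
  ++CompileCount; // stands in for emitObject
  ObjectBytes Obj = "object-for-" + ModuleId;
  if (Cache)
    Cache->notifyCompiled(ModuleId, Obj);
  return Obj;
}
```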
69
70 At this point, the ObjectBufferStream contains the raw object image.
71 Before the code can be executed, the code and data sections from this
72 image must be loaded into suitable memory, relocations must be applied, and
73 memory permissions and code cache invalidation (if required) must be completed.
74
75 Object Loading
76 ==============
77
78 Once an object image has been obtained, either through code generation or
79 having been retrieved from an ObjectCache, it is passed to RuntimeDyld to
80 be loaded. The RuntimeDyld wrapper class examines the object to determine
81 its file format and creates an instance of either RuntimeDyldELF or
82 RuntimeDyldMachO (both of which derive from the RuntimeDyldImpl base
83 class) and calls the RuntimeDyldImpl::loadObject method to perform the
84 actual loading.
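
The text does not say how the wrapper recognizes the file format. One plausible approach, shown here purely as an assumption, is to inspect the magic number at the start of the buffer. The ELF and Mach-O magic values are the standard ones; ``detectFormat`` and ``ObjectFormat`` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class ObjectFormat { ELF, MachO, Unknown };

// Classify an object image by its first four bytes.
ObjectFormat detectFormat(const unsigned char *Data, std::size_t Size) {
  if (Size < 4)
    return ObjectFormat::Unknown;

  // ELF files start with 0x7F 'E' 'L' 'F'.
  static const unsigned char ElfMagic[4] = {0x7F, 'E', 'L', 'F'};
  if (std::memcmp(Data, ElfMagic, 4) == 0)
    return ObjectFormat::ELF;

  // Assemble the bytes big-endian so the check is host-independent.
  std::uint32_t Magic = (std::uint32_t(Data[0]) << 24) |
                        (std::uint32_t(Data[1]) << 16) |
                        (std::uint32_t(Data[2]) << 8) |
                        std::uint32_t(Data[3]);
  switch (Magic) {
  case 0xFEEDFACE: // 32-bit Mach-O, big-endian file
  case 0xFEEDFACF: // 64-bit Mach-O, big-endian file
  case 0xCEFAEDFE: // 32-bit Mach-O, little-endian file
  case 0xCFFAEDFE: // 64-bit Mach-O, little-endian file
    return ObjectFormat::MachO;
  default:
    return ObjectFormat::Unknown;
  }
}
```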
85
86 .. image:: MCJIT-dyld-load.png
87
88 RuntimeDyldImpl::loadObject begins by creating an ObjectImage instance
89 from the ObjectBuffer it received. ObjectImage, which wraps the
90 ObjectFile class, is a helper class which parses the binary object image
91 and provides access to the information contained in the format-specific
92 headers, including section, symbol and relocation information.
93
94 RuntimeDyldImpl::loadObject then iterates through the symbols in the
95 image. Information about common symbols is collected for later use. For
96 each function or data symbol, the associated section is loaded into memory
97 and the symbol is stored in a symbol table map data structure. When the
98 iteration is complete, a section is emitted for the common symbols.
99
100 Next, RuntimeDyldImpl::loadObject iterates through the sections in the
101 object image and for each section iterates through the relocations for
102 that section. For each relocation, it calls the format-specific
103 processRelocationRef method, which will examine the relocation and store
104 it in one of two data structures, a section-based relocation list map and
105 an external symbol relocation map.
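
A simplified sketch of the two bookkeeping structures follows. The field layout and the ``recordRelocation`` helper are assumptions made for illustration; the real processRelocationRef implementations are format-specific and track more detail, such as the symbol's offset within its section.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified relocation record.
struct RelocationEntry {
  unsigned SectionID;   // section containing the bytes to patch
  std::uint64_t Offset; // offset of those bytes within that section
  std::int64_t Addend;  // constant added to the symbol address
};

struct RelocationTables {
  // Keyed by the section in which the referenced symbol is defined.
  std::map<unsigned, std::vector<RelocationEntry>> BySection;
  // Keyed by name, for symbols not defined in this object.
  std::map<std::string, std::vector<RelocationEntry>> ByExternalSymbol;
};

// LocalSymbols maps each symbol defined in the object to its section ID.
void recordRelocation(RelocationTables &T,
                      const std::map<std::string, unsigned> &LocalSymbols,
                      const std::string &Symbol, const RelocationEntry &RE) {
  auto It = LocalSymbols.find(Symbol);
  if (It != LocalSymbols.end())
    T.BySection[It->second].push_back(RE); // resolvable within the object
  else
    T.ByExternalSymbol[Symbol].push_back(RE); // must be looked up later
}
```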
106
107 .. image:: MCJIT-load-object.png
108
109 When RuntimeDyldImpl::loadObject returns, all of the code and data
110 sections for the object will have been loaded into memory allocated by the
111 memory manager and relocation information will have been prepared, but the
112 relocations have not yet been applied and the generated code is still not
113 ready to be executed.
114
115 [Currently (as of August 2013) the MCJIT engine will immediately apply
116 relocations when loadObject completes. However, this shouldn't be
117 happening. Because the code may have been generated for a remote target,
118 the client should be given a chance to re-map the section addresses before
119 relocations are applied. It is possible to apply relocations multiple
120 times, but in the case where addresses are to be re-mapped, this first
121 application is wasted effort.]
122
123 Address Remapping
124 =================
125
126 At any time after initial code has been generated and before
127 finalizeObject is called, the client can remap the address of sections in
128 the object. Typically this is done because the code was generated for an
129 external process and is being mapped into that process' address space.
130 The client remaps the section address by calling MCJIT::mapSectionAddress.
131 This should happen before the section memory is copied to its new
132 location.
133
134 When MCJIT::mapSectionAddress is called, MCJIT passes the call on to
135 RuntimeDyldImpl (via its Dyld member). RuntimeDyldImpl stores the new
136 address in an internal data structure but does not update the code at this
137 time, since other sections are likely to change.
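
The record-now, patch-later behaviour can be sketched like this. ``ToyDyld`` and ``SectionEntry`` are hypothetical; each section simply carries two addresses, the one where its bytes live in the host and the one where the code is expected to run.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-section bookkeeping.
struct SectionEntry {
  std::uint64_t LocalAddress; // where the bytes live in the host process
  std::uint64_t LoadAddress;  // where the code is expected to execute
};

class ToyDyld {
  std::vector<SectionEntry> Sections;

public:
  // By default a section runs at the address it was loaded at.
  unsigned addSection(std::uint64_t LocalAddress) {
    Sections.push_back({LocalAddress, LocalAddress});
    return static_cast<unsigned>(Sections.size() - 1);
  }

  // Only records the new address; no section bytes are patched here.
  bool mapSectionAddress(std::uint64_t LocalAddress,
                         std::uint64_t TargetAddress) {
    for (auto &S : Sections) {
      if (S.LocalAddress == LocalAddress) {
        S.LoadAddress = TargetAddress;
        return true;
      }
    }
    return false; // unknown section
  }

  std::uint64_t loadAddress(unsigned ID) const {
    return Sections.at(ID).LoadAddress;
  }
};
```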
138
139 When the client is finished remapping section addresses, it will call
140 MCJIT::finalizeObject to complete the remapping process.
141
142 Final Preparations
143 ==================
144
145 When MCJIT::finalizeObject is called, MCJIT calls
146 RuntimeDyld::resolveRelocations. This function will attempt to locate any
147 external symbols and then apply all relocations for the object.
148
149 External symbols are resolved by calling the memory manager's
150 getPointerToNamedFunction method. The memory manager will return the
151 address of the requested symbol in the target address space. (Note, this
152 may not be a valid pointer in the host process.) RuntimeDyld will then
153 iterate through the list of relocations it has stored which are associated
154 with this symbol and invoke the resolveRelocation method which, through a
155 format-specific implementation, will apply the relocation to the loaded
156 section memory.
157
158 Next, RuntimeDyld::resolveRelocations iterates through the list of
159 sections and for each section iterates through a list of relocations that
160 have been saved which reference that section and calls resolveRelocation for
161 each entry in this list. The relocation list here is a list of
162 relocations for which the symbol associated with the relocation is located
163 in the section associated with the list. Each of these relocations will
164 have a target location at which the relocation will be applied that is
165 likely located in a different section.
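
To make "applying a relocation" concrete, here is a sketch that handles only one hypothetical relocation kind: an absolute 64-bit little-endian address computed as symbol value plus addend. Real implementations dispatch on many format- and target-specific relocation types; the ``Section`` and ``RelocationEntry`` structures are assumptions.

```cpp
#include <cstdint>
#include <vector>

struct Section {
  std::vector<std::uint8_t> Bytes; // host-side copy of the section contents
  std::uint64_t LoadAddress;       // address the section will execute at
};

struct RelocationEntry {
  unsigned SectionID;   // section containing the bytes to patch
  std::uint64_t Offset; // where in that section to write
  std::int64_t Addend;  // constant added to the symbol address
};

// Write Value + Addend as eight little-endian bytes at the relocation's
// target location. Value is the address of the referenced symbol in the
// target address space.
void resolveRelocation(std::vector<Section> &Sections,
                       const RelocationEntry &RE, std::uint64_t Value) {
  std::uint64_t Patched = Value + static_cast<std::uint64_t>(RE.Addend);
  auto &Bytes = Sections.at(RE.SectionID).Bytes;
  for (unsigned I = 0; I < 8; ++I)
    Bytes.at(RE.Offset + I) = static_cast<std::uint8_t>(Patched >> (8 * I));
}
```

Because the patch is computed from the current load address each time, calling the function again after a section has been remapped simply overwrites the earlier value, which is why relocations can be applied more than once.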
166
167 .. image:: MCJIT-resolve-relocations.png
168
169 Once relocations have been applied as described above, MCJIT calls
170 RuntimeDyld::getEHFrameSection, and if a non-zero result is returned
171 passes the section data to the memory manager's registerEHFrames method.
172 This allows the memory manager to call any desired target-specific
173 functions, such as registering the EH frame information with a debugger.
174
175 Finally, MCJIT calls the memory manager's finalizeMemory method. In this
176 method, the memory manager will invalidate the target code cache, if
177 necessary, and apply final permissions to the memory pages it has
178 allocated for code and data memory.
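
The permission step can be illustrated with a POSIX-only sketch that stages bytes in a writable page and then seals it read-only. This is an assumption about one way a memory manager might do it, not the SectionMemoryManager implementation; ``stageAndSeal`` is a hypothetical helper, and a code section would typically request execute permission instead. Cache invalidation is omitted.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cstddef>
#include <cstring>
#include <string>

// Copy Payload into a fresh writable page, drop write permission, and read
// the bytes back through the now read-only mapping.
bool stageAndSeal(const std::string &Payload, std::string &ReadBack) {
  const long PageSize = sysconf(_SC_PAGESIZE);
  if (PageSize <= 0 || Payload.size() > static_cast<std::size_t>(PageSize))
    return false;
  const std::size_t Len = static_cast<std::size_t>(PageSize);

  // While sections are being loaded and relocated, memory must be writable.
  void *Page = mmap(nullptr, Len, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (Page == MAP_FAILED)
    return false;
  std::memcpy(Page, Payload.data(), Payload.size());

  // Final permissions: no further writes are allowed after this call.
  const bool Sealed = mprotect(Page, Len, PROT_READ) == 0;
  if (Sealed)
    ReadBack.assign(static_cast<const char *>(Page), Payload.size());

  munmap(Page, Len);
  return Sealed;
}
```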
179
284 284 :doc:`DebuggingJITedCode`
285 285 How to debug JITed code with GDB.
286 286
287 :doc:`MCJITDesignAndImplementation`
288 Describes the inner workings of the MCJIT execution engine.
289
287 290 :doc:`BranchWeightMetadata`
288 291 Provides information about Branch Prediction Information.
289 292