llvm.org GIT mirror llvm / e97b132
LangRef documentation for the stackmap and patchpoint intrinsics. These still have "experimental" status, meaning we don't guarantee backward compatibility. However, they are already actively used by the open source WebKit project, and have started to be adopted by other projects. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197930 91177308-0d34-0410-b5e6-96231b3b80d8 Andrew Trick 6 years ago
3 changed file(s) with 503 addition(s) and 0 deletion(s).
365365 accessed runtime components pinned to specific hardware registers.
366366 At the moment only X86 supports this convention (both 32 and 64
367367 bit).
368 "``webkit_jscc``" - WebKit's JavaScript calling convention
369 This calling convention has been implemented for `WebKit FTL JIT
370 `_. It passes arguments on the
371 stack right to left (as cdecl does), and returns a value in the
372 platform's customary return register.
373 "``anyregcc``" - Dynamic calling convention for code patching
374 This is a special convention that supports patching an arbitrary code
375 sequence in place of a call site. This convention forces the call
376 arguments into registers but allows them to be dynamically
377 allocated. This can currently only be used with calls to
378 llvm.experimental.patchpoint because only this intrinsic records
379 the location of its arguments in a side table. See :doc:`StackMaps`.
368380 "``cc <n>``" - Numbered convention
369381 Any calling convention may be specified by number, allowing
370382 target-specific calling conventions to be used. Target specific
89118923
89128924 This intrinsic does nothing, and it's removed by optimizers and ignored
89138925 by codegen.
8926
8927 Stack Map Intrinsics
8928 --------------------
8929
8930 LLVM provides experimental intrinsics to support runtime patching
8931 mechanisms commonly desired in dynamic language JITs. These intrinsics
8932 are described in :doc:`StackMaps`.
0 ===================================
1 Stack maps and patch points in LLVM
2 ===================================
3
4 .. contents::
5 :local:
6 :depth: 2
7
8 Definitions
9 ===========
10
11 In this document we refer to the "runtime" collectively as all
12 components that serve as the LLVM client, including the LLVM IR
13 generator, object code consumer, and code patcher.
14
15 A stack map records the location of ``live values`` at a particular
16 instruction address. These ``live values`` do not refer to all the
17 LLVM values live across the stack map. Instead, they are only the
18 values that the runtime requires to be live at this point. For
19 example, they may be the values the runtime will need to resume
20 program execution at that point independent of the compiled function
21 containing the stack map.
22
23 LLVM emits stack map data into the object code within a designated
24 :ref:`stackmap-section`. This stack map data contains a record for
25 each stack map. The record stores the stack map's instruction address
26 and contains an entry for each mapped value. Each entry encodes a
27 value's location as a register, stack offset, or constant.
28
29 A patch point is an instruction address at which space is reserved for
30 patching a new instruction sequence at run time. Patch points look
31 much like calls to LLVM. They take arguments that follow a calling
32 convention and may return a value. They also imply stack map
33 generation, which allows the runtime to locate the patchpoint and
34 find the location of ``live values`` at that point.
35
36 Motivation
37 ==========
38
39 This functionality is currently experimental but is potentially useful
40 in a variety of settings, the most obvious being a runtime (JIT)
41 compiler. Example applications of the patchpoint intrinsics are
42 implementing an inline call cache for polymorphic method dispatch or
43 optimizing the retrieval of properties in dynamically typed languages
44 such as JavaScript.
45
46 The intrinsics documented here are currently used by the JavaScript
47 compiler within the open source WebKit project, see the `FTL JIT
48 `_, but they are designed to be
49 used whenever stack maps or code patching are needed. Because the
50 intrinsics have experimental status, compatibility across LLVM
51 releases is not guaranteed.
52
53 The stack map functionality described in this document is separate
54 from the functionality described in
55 :ref:`stack-map`. `GCFunctionMetadata` provides the location of
56 pointers into a collected heap captured by the `GCRoot` intrinsic,
57 which can also be considered a "stack map". Unlike the stack maps
58 defined above, the `GCFunctionMetadata` stack map interface does not
59 provide a way to associate live register values of arbitrary type with
60 an instruction address, nor does it specify a format for the resulting
61 stack map. The stack maps described here could potentially provide
62 richer information to a garbage collecting runtime, but that usage
63 will not be discussed in this document.
64
65 Intrinsics
66 ==========
67
68 The following two kinds of intrinsics can be used to implement stack
69 maps and patch points: ``llvm.experimental.stackmap`` and
70 ``llvm.experimental.patchpoint``. Both kinds of intrinsics generate a
71 stack map record, and they both allow some form of code patching. They
72 can be used independently (i.e. ``llvm.experimental.patchpoint``
73 implicitly generates a stack map without the need for an additional
74 call to ``llvm.experimental.stackmap``). The choice of which to use
75 depends on whether it is necessary to reserve space for code patching
76 and whether any of the intrinsic arguments should be lowered according
77 to calling conventions. ``llvm.experimental.stackmap`` does not
78 reserve any space, nor does it expect any call arguments. If the
79 runtime patches code at the stack map's address, it will destructively
80 overwrite the program text. This is unlike
81 ``llvm.experimental.patchpoint``, which reserves space for in-place
82 patching without overwriting surrounding code. The
83 ``llvm.experimental.patchpoint`` intrinsic also lowers a specified
84 number of arguments according to its calling convention. This allows
85 patched code to make in-place function calls without marshaling.
86
87 Each instance of one of these intrinsics generates a stack map record
88 in the :ref:`stackmap-section`. The record includes an ID, allowing
89 the runtime to uniquely identify the stack map, and the offset within
90 the code from the beginning of the enclosing function.
91
92 '``llvm.experimental.stackmap``' Intrinsic
93 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
94
95 Syntax:
96 """""""
97
98 ::
99
100 declare void
101 @llvm.experimental.stackmap(i64 <id>, i32 <numShadowBytes>, ...)
102
103 Overview:
104 """""""""
105
106 The '``llvm.experimental.stackmap``' intrinsic records the location of
107 specified values in the stack map without generating any code.
108
109 Operands:
110 """""""""
111
112 The first operand is an ID to be encoded within the stack map. The
113 second operand is the number of shadow bytes following the
114 intrinsic. The variable number of operands that follow are the ``live
115 values`` for which locations will be recorded in the stack map.
116
117 To use this intrinsic as a bare-bones stack map, with no code patching
118 support, the number of shadow bytes can be set to zero.
119
120 Semantics:
121 """"""""""
122
123 The stack map intrinsic generates no code in place, unless nops are
124 needed to cover its shadow (see below). However, its offset from
125 function entry is stored in the stack map. This is the relative
126 instruction address immediately following the instructions that
127 precede the stack map.
128
129 The stack map ID allows a runtime to locate the desired stack map
130 record. LLVM passes this ID through directly to the stack map
131 record without checking uniqueness.
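
As a minimal sketch of how a runtime might use the ID, the Python below indexes already-parsed records by ID and turns each function-relative offset into an absolute address. The names ``index_stack_maps`` and ``function_start`` are hypothetical, and the records are assumed to be ``(id, offset)`` pairs that all belong to one function. Because LLVM does not check uniqueness, the sketch rejects duplicate IDs itself.

```python
from typing import Dict, Iterable, Tuple


def index_stack_maps(records: Iterable[Tuple[int, int]],
                     function_start: int) -> Dict[int, int]:
    """Map each stack map ID to an absolute instruction address.

    `records` holds (id, offset-from-function-entry) pairs. Both the
    pair layout and `function_start` are assumptions of this sketch.
    """
    index: Dict[int, int] = {}
    for stackmap_id, offset in records:
        # LLVM passes IDs through unchecked, so the runtime has to
        # detect collisions on its own.
        if stackmap_id in index:
            raise ValueError(f"duplicate stack map ID {stackmap_id}")
        index[stackmap_id] = function_start + offset
    return index
```

For instance, ``index_stack_maps([(77, 0x05)], 0x1000)`` associates ID 77 with address ``0x1005``.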
132
133 LLVM guarantees a shadow of instructions following the stack map's
134 instruction offset during which neither the end of the basic block nor
135 another call to ``llvm.experimental.stackmap`` or
136 ``llvm.experimental.patchpoint`` may occur. This allows the runtime to
137 patch the code at this point in response to an event triggered from
138 outside the code. The code for instructions following the stack map
139 may be emitted in the stack map's shadow, and these instructions may
140 be overwritten by destructive patching. Without shadow bytes, this
141 destructive patching could overwrite program text or data outside the
142 current function. We disallow overlapping stack map shadows so that
143 the runtime does not need to consider this corner case.
144
145 For example, a stack map with 8 byte shadow:
146
147 .. code-block:: llvm
148
149 call void @runtime()
150 call void (i64, i32, ...)* @llvm.experimental.stackmap(i64 77, i32 8,
151 i64* %ptr)
152 %val = load i64* %ptr
153 %add = add i64 %val, 3
154 ret i64 %add
155
156 May require one byte of nop-padding:
157
158 .. code-block:: none
159
160 0x00 callq _runtime
161 0x05 nop <--- stack map address
162 0x06 movq (%rdi), %rax
163 0x07 addq $3, %rax
164 0x0a popq %rdx
165 0x0b ret <---- end of 8-byte shadow
166
167 Now, if the runtime needs to invalidate the compiled code, it may
168 patch 8 bytes of code at the stack map's address as follows:
169
170 .. code-block:: none
171
172 0x00 callq _runtime
173 0x05 movl $0xffff, %rax <--- patched code at stack map address
174 0x0a callq *%rax <---- end of 8-byte shadow
175
176 This way, after the normal call to the runtime returns, the code will
177 execute a patched call to a special entry point that can rebuild a
178 stack frame from the values located by the stack map.
179
180 '``llvm.experimental.patchpoint.*``' Intrinsic
181 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
182
183 Syntax:
184 """""""
185
186 ::
187
188 declare void
189 @llvm.experimental.patchpoint.void(i64 <id>, i32 <numBytes>,
190 i8* <target>, i32 <numArgs>, ...)
191 declare i64
192 @llvm.experimental.patchpoint.i64(i64 <id>, i32 <numBytes>,
193 i8* <target>, i32 <numArgs>, ...)
194
195 Overview:
196 """""""""
197
198 The '``llvm.experimental.patchpoint.*``' intrinsics create a function
199 call to the specified ``<target>`` and record the location of specified
200 values in the stack map.
201
202 Operands:
203 """""""""
204
205 The first operand is an ID, the second operand is the number of bytes
206 reserved for the patchable region, the third operand is the target
207 address of a function (optionally null), and the fourth operand
208 specifies how many of the following variable operands are considered
209 function call arguments. The remaining variable number of operands are
210 the ``live values`` for which locations will be recorded in the stack
211 map.
212
213 Semantics:
214 """"""""""
215
216 The patch point intrinsic generates a stack map. It also emits a
217 function call to the address specified by ``<target>`` if the address
218 is not a constant null. The function call and its arguments are
219 lowered according to the calling convention specified at the
220 intrinsic's callsite. Variants of the intrinsic with non-void return
221 type also return a value according to calling convention.
222
223 Requesting zero patch point arguments is valid. In this case, all
224 variable operands are handled just like
225 ``llvm.experimental.stackmap.*``. The difference is that space will
226 still be reserved for patching, a call will be emitted, and a return
227 value is allowed.
228
229 The locations of the arguments are not normally recorded in the stack
230 map because they are already fixed by the calling convention. The
231 remaining ``live values`` will have their location recorded, which
232 could be a register, stack location, or constant. A special calling
233 convention has been introduced for use with stack maps, anyregcc,
234 which forces the arguments to be loaded into registers but allows
235 those registers to be dynamically allocated. These argument registers
236 will have their register locations recorded in the stack map in
237 addition to the remaining ``live values``.
238
239 The patch point also emits nops to cover at least ``<numBytes>`` of
240 instruction encoding space. Hence, the client must ensure that
241 ``<numBytes>`` is enough to encode a call to the target address on the
242 supported targets. If the call target is constant null, then there is
243 no minimum requirement. A zero-byte null target patchpoint is
244 valid.
245
246 The runtime may patch the code emitted for the patch point, including
247 the call sequence and nops. However, the runtime may not assume
248 anything about the code LLVM emits within the reserved space. Partial
249 patching is not allowed. The runtime must patch all reserved bytes,
250 padding with nops if necessary.
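
The "patch all reserved bytes" rule can be sketched as a small helper that pads a replacement sequence out to the reserved size. This is an illustration only: ``build_patch`` is a hypothetical name, the single-byte ``0x90`` nop is an x86 assumption, and the example bytes are placeholders, not a real instruction encoding.

```python
X86_NOP = b"\x90"  # assumption: single-byte x86 nop


def build_patch(new_code: bytes, reserved_bytes: int,
                nop: bytes = X86_NOP) -> bytes:
    """Return exactly `reserved_bytes` bytes: `new_code` then nop padding."""
    if len(nop) != 1:
        raise ValueError("this sketch only handles single-byte nops")
    if len(new_code) > reserved_bytes:
        raise ValueError("patch does not fit in the reserved space")
    # Partial patching is not allowed, so every reserved byte is rewritten.
    return new_code + nop * (reserved_bytes - len(new_code))


# Placeholder bytes standing in for a 3-byte instruction sequence.
patch = build_patch(b"\x01\x02\x03", 15)
```

The runtime would then copy ``patch`` over the whole reserved region in one step, so that no bytes emitted by LLVM remain.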
251
252 This example shows a patch point reserving 15 bytes, with one argument
253 in $rdi, and a return value in $rax per native calling convention:
254
255 .. code-block:: llvm
256
257 %target = inttoptr i64 -281474976710654 to i8*
258 %val = call i64 (i64, i32, ...)*
259 @llvm.experimental.patchpoint.i64(i64 78, i32 15,
260 i8* %target, i32 1, i64* %ptr)
261 %add = add i64 %val, 3
262 ret i64 %add
263
264 May generate:
265
266 .. code-block:: none
267
268 0x00 movabsq $0xffff000000000002, %r11 <--- patch point address
269 0x0a callq *%r11
270 0x0d nop
271 0x0e nop <--- end of reserved 15-bytes
272 0x0f addq $0x3, %rax
273 0x10 movl %rax, 8(%rsp)
274
275 Note that no stack map locations will be recorded. If the patched code
276 sequence does not need arguments fixed to specific calling convention
277 registers, then the ``anyregcc`` convention may be used:
278
279 .. code-block:: none
280
281 %val = call anyregcc @llvm.experimental.patchpoint(i64 78, i32 15,
282 i8* %target, i32 1,
283 i64* %ptr)
284
285 The stack map now indicates the location of the %ptr argument and
286 return value:
287
288 .. code-block:: none
289
290 Stack Map: ID=78, Loc0=%r9 Loc1=%r8
291
292 The patch code sequence may now use the argument that happened to be
293 allocated in %r8 and return a value allocated in %r9:
294
295 .. code-block:: none
296
297 0x00 movslq 4(%r8), %r9 <--- patched code at patch point address
298 0x03 nop
299 ...
300 0x0e nop <--- end of reserved 15-bytes
301 0x0f addq $0x3, %r9
302 0x10 movl %r9, 8(%rsp)
303
304 .. _stackmap-format:
305
306 Stack Map Format
307 ================
308
309 The existence of a stack map or patch point intrinsic within an LLVM
310 Module forces code emission to create a :ref:`stackmap-section`. The
311 format of this section follows:
312
313 .. code-block:: none
314
315   uint32 : Reserved (header)
316   uint32 : NumConstants
317   Constants[NumConstants] {
318     uint64 : LargeConstant
319   }
320   uint32 : NumRecords
321   StkMapRecord[NumRecords] {
322     uint64 : PatchPoint ID
323     uint32 : Instruction Offset
324     uint16 : Reserved (record flags)
325     uint16 : NumLocations
326     Location[NumLocations] {
327       uint8  : Register | Direct | Indirect | Constant | ConstantIndex
328       uint8  : Reserved (location flags)
329       uint16 : Dwarf RegNum
330       int32  : Offset or SmallConstant
331     }
332     uint16 : NumLiveOuts
333     LiveOuts[NumLiveOuts] {
334       uint16 : Dwarf RegNum
335       uint8  : Reserved
336       uint8  : Size in Bytes
337     }
338   }
339
340 The first byte of each location encodes a type that indicates how to
341 interpret the ``RegNum`` and ``Offset`` fields as follows:
342
343 ======== ========== =================== ===========================
344 Encoding Type       Value               Description
345 -------- ---------- ------------------- ---------------------------
346 0x1      Register   Reg                 Value in a register
347 0x2      Direct     Reg + Offset        Frame index value
348 0x3      Indirect   [Reg + Offset]      Spilled value
349 0x4      Constant   Offset              Small constant
350 0x5      ConstIndex Constants[Offset]   Large constant
351 ======== ========== =================== ===========================
352
353 In the common case, a value is available in a register, and the
354 ``Offset`` field will be zero. Values spilled to the stack are encoded
355 as ``Indirect`` locations. The runtime must load those values from a
356 stack address, typically in the form ``[BP + Offset]``. If an
357 ``alloca`` value is passed directly to a stack map intrinsic, then
358 LLVM may fold the frame index into the stack map as an optimization to
359 avoid allocating a register or stack slot. These frame indices will be
360 encoded as ``Direct`` locations in the form ``BP + Offset``. LLVM may
361 also optimize constants by emitting them directly in the stack map,
362 either in the ``Offset`` of a ``Constant`` location or in the constant
363 pool, referred to by ``ConstantIndex`` locations.
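
The five location types can be summarized in a short sketch of how a runtime might turn one location into a value. Everything here apart from the encodings is an assumption for illustration: ``resolve_location`` is a hypothetical name, ``regs`` maps DWARF register numbers to captured register values, ``load`` reads one value from a stack address, and ``constants`` is the constant pool.

```python
from typing import Callable, Mapping, Sequence

# Encodings from the table above.
REGISTER, DIRECT, INDIRECT, CONSTANT, CONST_INDEX = 0x1, 0x2, 0x3, 0x4, 0x5


def resolve_location(kind: int, regnum: int, offset: int,
                     regs: Mapping[int, int],
                     load: Callable[[int], int],
                     constants: Sequence[int]) -> int:
    """Return the value described by a single stack map location."""
    if kind == REGISTER:
        return regs[regnum]                 # value lives in the register
    if kind == DIRECT:
        return regs[regnum] + offset        # the address itself is the value
    if kind == INDIRECT:
        return load(regs[regnum] + offset)  # spilled: load from the stack
    if kind == CONSTANT:
        return offset                       # small constant stored inline
    if kind == CONST_INDEX:
        return constants[offset]            # large constant from the pool
    raise ValueError(f"unknown location type {kind:#x}")
```

Note that only ``Register`` and ``Indirect`` depend on machine state that exists when execution reaches the stack map; the other three can be resolved from the record plus, for ``Direct``, the frame base.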
364
365 At each callsite, a "liveout" register list is also recorded. These
366 are the registers that are live across the stackmap and therefore must
367 be saved by the runtime. This is an important optimization when the
368 patchpoint intrinsic is used with a calling convention that by default
369 preserves most registers as callee-save.
370
371 Each entry in the liveout register list contains a DWARF register
372 number and size in bytes. The stackmap format deliberately omits
373 specific subregister information. Instead the runtime must interpret
374 this information conservatively. For example, if the stackmap reports
375 one byte at ``%rax``, then the value may be in either ``%al`` or
376 ``%ah``. It doesn't matter in practice, because the runtime will
377 simply save ``%rax``. However, if the stackmap reports 16 bytes at
378 ``%ymm0``, then the runtime can safely optimize by saving only
379 ``%xmm0``.
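
One way to read the conservative rule is as "save the narrowest whole register view that covers the reported size". The sketch below assumes the runtime supplies, per register, the list of widths it knows how to save; ``bytes_to_save`` and the width lists are hypothetical and not part of the stack map format.

```python
from typing import Iterable


def bytes_to_save(live_size: int, register_widths: Iterable[int]) -> int:
    """Pick the smallest saveable width that covers `live_size` bytes."""
    for width in sorted(register_widths):
        if live_size <= width:
            return width
    raise ValueError("live size exceeds every known register width")


# One live byte in a general purpose register saved only as 8 bytes.
gpr_width = bytes_to_save(1, [8])
# 16 live bytes in a vector register with 16- and 32-byte views.
vec_width = bytes_to_save(16, [16, 32])
```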
380
381 The stack map format is a contract between an LLVM SVN revision and
382 the runtime. It is currently experimental and may change in the short
383 term, but minimizing the need to update the runtime is
384 important. Consequently, the stack map design is motivated by
385 simplicity and extensibility. Compactness of the representation is
386 secondary because the runtime is expected to parse the data
387 immediately after compiling a module and encode the information in its
388 own format. Since the runtime controls the allocation of sections, it
389 can reuse the same stack map space for multiple modules.
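
Since the runtime is expected to parse the section right after compilation, a parser can be sketched directly from the layout above. This is a sketch under stated assumptions, not a reference implementation: it assumes little-endian byte order and no padding beyond the fields listed, and the class and function names are hypothetical. The format is experimental, so the layout should be checked against the LLVM revision actually in use.

```python
import struct
from typing import List, NamedTuple


class Location(NamedTuple):
    kind: int      # 0x1..0x5, see the encoding table
    regnum: int    # DWARF register number
    offset: int    # offset or small constant (signed)


class LiveOut(NamedTuple):
    regnum: int
    size: int      # size in bytes


class Record(NamedTuple):
    patchpoint_id: int
    instruction_offset: int
    locations: List[Location]
    liveouts: List[LiveOut]


class StackMapSection(NamedTuple):
    constants: List[int]
    records: List[Record]


def parse_stackmap_section(data: bytes) -> StackMapSection:
    pos = 0

    def take(fmt: str):
        # Read little-endian fields with no implicit alignment padding.
        nonlocal pos
        fields = struct.unpack_from("<" + fmt, data, pos)
        pos += struct.calcsize("<" + fmt)
        return fields

    _reserved, num_constants = take("II")
    constants = [take("Q")[0] for _ in range(num_constants)]
    (num_records,) = take("I")
    records = []
    for _ in range(num_records):
        pp_id, insn_offset, _flags, num_locations = take("QIHH")
        locations = []
        for _ in range(num_locations):
            kind, _loc_flags, regnum, offset = take("BBHi")
            locations.append(Location(kind, regnum, offset))
        (num_liveouts,) = take("H")
        liveouts = []
        for _ in range(num_liveouts):
            regnum, _res, size = take("HBB")
            liveouts.append(LiveOut(regnum, size))
        records.append(Record(pp_id, insn_offset, locations, liveouts))
    return StackMapSection(constants, records)
```

The order of ``locations`` within each record is preserved, which matters because the runtime identifies values by position.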
390
391 .. _stackmap-section:
392
393 Stack Map Section
394 ^^^^^^^^^^^^^^^^^
395
396 A JIT compiler can easily access this section by providing its own
397 memory manager via the LLVM C API
398 ``LLVMCreateSimpleMCJITMemoryManager()``. When creating the memory
399 manager, the JIT provides a callback:
400 ``LLVMMemoryManagerAllocateDataSectionCallback()``. When LLVM creates
401 this section, it invokes the callback and passes the section name. The
402 JIT can record the in-memory address of the section at this time and
403 later parse it to recover the stack map data.
404
405 On Darwin, the stack map section name is "__llvm_stackmaps". The
406 segment name is "__LLVM_STACKMAPS".
407
408 Stack Map Usage
409 ===============
410
411 The stack map support described in this document can be used to
412 precisely determine the location of values at a specific position in
413 the code. LLVM does not maintain any mapping between those values and
414 any higher-level entity. The runtime must be able to interpret the
415 stack map record given only the ID, offset, and the order of the
416 locations, which LLVM preserves.
417
418 Note that this is quite different from the goal of debug information,
419 which is a best-effort attempt to track the location of named
420 variables at every instruction.
421
422 An important motivation for this design is to allow a runtime to
423 commandeer a stack frame when execution reaches an instruction address
424 associated with a stack map. The runtime must be able to rebuild a
425 stack frame and resume program execution using the information
426 provided by the stack map. For example, execution may resume in an
427 interpreter or a recompiled version of the same function.
428
429 This usage restricts LLVM optimization. Clearly, LLVM must not move
430 stores across a stack map. However, loads must also be handled
431 conservatively. If the load may trigger an exception, hoisting it
432 above a stack map could be invalid. For example, the runtime may
433 determine that a load is safe to execute without a type check given
434 the current state of the type system. If the type system changes while
435 some activation of the load's function exists on the stack, the load
436 becomes unsafe. The runtime can prevent subsequent execution of that
437 load by immediately patching any stack map location that lies between
438 the current call site and the load (typically, the runtime would
439 simply patch all stack map locations to invalidate the function). If
440 the compiler had hoisted the load above the stack map, then the
441 program could crash before the runtime could take back control.
442
443 To enforce these semantics, stackmap and patchpoint intrinsics are
444 considered to potentially read and write all memory. This may limit
445 optimization more than some clients desire. To address this problem,
446 metadata could be added to the intrinsic call to express aliasing,
447 thereby allowing optimizations to hoist certain loads above stack
448 maps.
449
450 Direct Stack Map Entries
451 ^^^^^^^^^^^^^^^^^^^^^^^^
452
453 As shown in :ref:`stackmap-section`, a Direct stack map location
454 records the address of frame index. This address is itself the value
455 that the runtime requested. This differs from Indirect locations,
456 which refer to a stack locations from which the requested values must
457 be loaded. Direct locations can communicate the address if an alloca,
458 while Indirect locations handle register spills.
459
460 For example:
461
462 .. code-block:: none
463
464 entry:
465 %a = alloca i64...
466 llvm.experimental.stackmap(i64 <id>, i32 <numShadowBytes>, i64* %a)
467
468 The runtime can determine this alloca's relative location on the
469 stack immediately after compilation, or at any time thereafter. This
470 differs from Register and Indirect locations, because the runtime can
471 only read the values in those locations when execution reaches the
472 instruction address of the stack map.
473
474 This functionality requires LLVM to treat entry-block allocas
475 specially when they are directly consumed by an intrinsic. (This is
476 the same requirement imposed by the llvm.gcroot intrinsic.) LLVM
477 transformations must not substitute the alloca with any intervening
478 value. This can be verified by the runtime simply by checking that the
479 stack map's location is a Direct location type.
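
The verification step can be sketched as a check over the location types of one record. The helper name and its inputs are hypothetical: ``location_types`` is the list of type bytes in record order, and ``alloca_positions`` lists the operand positions at which the runtime passed an entry-block alloca.

```python
from typing import Iterable, Sequence

DIRECT = 0x2  # encoding of the Direct location type


def alloca_operands_are_direct(location_types: Sequence[int],
                               alloca_positions: Iterable[int]) -> bool:
    """True if every alloca operand was recorded as a Direct location."""
    return all(location_types[i] == DIRECT for i in alloca_positions)


# Operand 0 was an alloca and is Direct; operand 1 is a plain register.
ok = alloca_operands_are_direct([0x2, 0x1], [0])
```

A ``False`` result would suggest that some transformation replaced the alloca with an intervening value.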
233233 TableGen/LangRef
234234 HowToUseAttributes
235235 NVPTXUsage
236 StackMaps
236237
237238 :doc:`WritingAnLLVMPass`
238239 Information on how to write LLVM transformations and analyses.
307308 :doc:`NVPTXUsage`
308309 This document describes using the NVPTX back-end to compile GPU kernels.
309310
311 :doc:`StackMaps`
312 LLVM support for mapping instruction addresses to the location of
313 values and allowing code to be patched.
310314
311315 Development Process Documentation
312316 =================================