IR: New representation for CFI and virtual call optimization pass metadata.

The bitset metadata currently used in LLVM has a few problems:

1. It has the wrong name. The name "bitset" refers to an implementation detail of one use of the metadata (i.e. its original use case, CFI). This makes it harder to understand, as the name makes no sense in the context of virtual call optimization.
2. It is represented using a global named metadata node, rather than being directly associated with a global. This makes it harder to manipulate the metadata when rebuilding global variables, summarise it as part of ThinLTO and drop unused metadata when associated globals are dropped. For this reason, CFI does not currently work correctly when both CFI and vcall opt are enabled, as vcall opt needs to rebuild vtable globals, and fails to associate metadata with the rebuilt globals. As I understand it, the same problem could also affect ASan, which rebuilds globals with a red zone.

This patch solves both of those problems in the following way:

1. Rename the metadata to "type metadata". This new name reflects how the metadata is currently being used (i.e. to represent type information for CFI and vtable opt). The new name is reflected in the name for the associated intrinsic (llvm.type.test) and pass (LowerTypeTests).
2. Attach metadata directly to the globals that it pertains to, rather than using the "llvm.bitsets" global metadata node as we are doing now. This is done using the newly introduced capability to attach metadata to global variables (r271348 and r271358).

See also: http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html

Differential Revision: http://reviews.llvm.org/D21053

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273729 91177308-0d34-0410-b5e6-96231b3b80d8

Peter Collingbourne, 3 years ago
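As a hedged sketch of what this representational change means at the IR level (simplified and illustrative; the globals and identifiers below are not taken from the patch itself), the same vtable type information moves from a module-level named metadata node to a ``!type`` attachment carried directly on the global:

::

    ; Old encoding: a module-level "llvm.bitsets" node that refers to the global.
    !llvm.bitsets = !{!0}
    !0 = !{!"_ZTS1A", [3 x i8*]* @_ZTV1A, i64 16}

    ; New encoding: the metadata is attached to the global itself, so it is
    ; rebuilt, summarized and dropped together with the global.
    @_ZTV1A = constant [3 x i8*] [...], !type !1
    !1 = !{i64 16, !"_ZTS1A"}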
70 changed file(s) with 2417 addition(s) and 2539 deletion(s).
+0
-221
docs/BitSets.rst
None =======
1 Bitsets
2 =======
3
4 This is a mechanism that allows IR modules to co-operatively build pointer
5 sets corresponding to addresses within a given set of globals. One example
6 of a use case for this is to allow a C++ program to efficiently verify (at
7 each call site) that a vtable pointer is in the set of valid vtable pointers
8 for the type of the class or its derived classes.
9
10 To use the mechanism, a client creates a global metadata node named
11 ``llvm.bitsets``. Each element is a metadata node with three elements:
12
13 1. a metadata object representing an identifier for the bitset
14 2. either a global variable or a function
15 3. a byte offset into the global (generally zero for functions)
16
17 Each bitset must exclusively contain either global variables or functions.
18
19 .. admonition:: Limitation
20
21 The current implementation only supports functions as members of bitsets on
22 the x86-32 and x86-64 architectures.
23
24 An intrinsic, :ref:`llvm.bitset.test <bitset.test>`, is used to test
25 whether a given pointer is a member of a bitset.
26
27 Representing Type Information using Bitsets
28 ===========================================
29
30 This section describes how Clang represents C++ type information associated with
31 virtual tables using bitsets.
32
33 Consider the following inheritance hierarchy:
34
35 .. code-block:: c++
36
37 struct A {
38 virtual void f();
39 };
40
41 struct B : A {
42 virtual void f();
43 virtual void g();
44 };
45
46 struct C {
47 virtual void h();
48 };
49
50 struct D : A, C {
51 virtual void f();
52 virtual void h();
53 };
54
55 The virtual table objects for A, B, C and D look like this (under the Itanium ABI):
56
57 .. csv-table:: Virtual Table Layout for A, B, C, D
58 :header: Class, 0, 1, 2, 3, 4, 5, 6
59
60 A, A::offset-to-top, &A::rtti, &A::f
61 B, B::offset-to-top, &B::rtti, &B::f, &B::g
62 C, C::offset-to-top, &C::rtti, &C::h
63 D, D::offset-to-top, &D::rtti, &D::f, &D::h, D::offset-to-top, &D::rtti, thunk for &D::h
64
65 When an object of type A is constructed, the address of ``&A::f`` in A's
66 virtual table object is stored in the object's vtable pointer. In ABI parlance
67 this address is known as an `address point`_. Similarly, when an object of type
68 B is constructed, the address of ``&B::f`` is stored in the vtable pointer. In
69 this way, the vtable in B's virtual table object is compatible with A's vtable.
70
71 D is a little more complicated, due to the use of multiple inheritance. Its
72 virtual table object contains two vtables, one compatible with A's vtable and
73 the other compatible with C's vtable. Objects of type D contain two virtual
74 pointers, one belonging to the A subobject and containing the address of
75 the vtable compatible with A's vtable, and the other belonging to the C
76 subobject and containing the address of the vtable compatible with C's vtable.
77
78 The full set of compatibility information for the above class hierarchy is
79 shown below. The following table shows the name of a class, the offset of an
80 address point within that class's vtable and the name of one of the classes
81 with which that address point is compatible.
82
83 .. csv-table:: Bitsets for A, B, C, D
84 :header: VTable for, Offset, Compatible Class
85
86 A, 16, A
87 B, 16, A
88 , , B
89 C, 16, C
90 D, 16, A
91 , , D
92 , 48, C
93
94 The next step is to encode this compatibility information into the IR. The way
95 this is done is to create bitsets named after each of the compatible classes,
96 into which we add each of the compatible address points in each vtable.
97 For example, these bitset entries encode the compatibility information for
98 the above hierarchy:
99
100 ::
101
102 !0 = !{!"_ZTS1A", [3 x i8*]* @_ZTV1A, i64 16}
103 !1 = !{!"_ZTS1A", [4 x i8*]* @_ZTV1B, i64 16}
104 !2 = !{!"_ZTS1B", [4 x i8*]* @_ZTV1B, i64 16}
105 !3 = !{!"_ZTS1C", [3 x i8*]* @_ZTV1C, i64 16}
106 !4 = !{!"_ZTS1A", [7 x i8*]* @_ZTV1D, i64 16}
107 !5 = !{!"_ZTS1D", [7 x i8*]* @_ZTV1D, i64 16}
108 !6 = !{!"_ZTS1C", [7 x i8*]* @_ZTV1D, i64 48}
109
110 With these bitsets, we can now use the ``llvm.bitset.test`` intrinsic to test
111 whether a given pointer is compatible with a bitset. Working backwards,
112 if ``llvm.bitset.test`` returns true for a particular pointer, we can also
113 statically determine the identities of the virtual functions that a particular
114 virtual call may call. For example, if a program assumes a pointer to be in the
115 ``!"_ZST1A"`` bitset, we know that the address can be only be one of ``_ZTV1A+16``,
116 ``_ZTV1B+16`` or ``_ZTV1D+16`` (i.e. the address points of the vtables of A,
117 B and D respectively). If we then load an address from that pointer, we know
118 that the address can only be one of ``&A::f``, ``&B::f`` or ``&D::f``.
119
120 .. _address point: https://mentorembedded.github.io/cxx-abi/abi.html#vtable-general
121
122 Testing Bitset Addresses
123 ========================
124
125 If a program tests an address using ``llvm.bitset.test``, this will cause
126 a link-time optimization pass, ``LowerBitSets``, to replace calls to this
127 intrinsic with efficient code to perform bitset tests. At a high level,
128 the pass will lay out referenced globals in a consecutive memory region in
129 the object file, construct bit vectors that map onto that memory region,
130 and generate code at each of the ``llvm.bitset.test`` call sites to test
131 pointers against those bit vectors. Because of the layout manipulation, the
132 globals' definitions must be available at LTO time. For more information,
133 see the `control flow integrity design document`_.
134
135 A bit set containing functions is transformed into a jump table, which is a
136 block of code consisting of one branch instruction for each of the functions
137 in the bit set that branches to the target function. The pass will redirect
138 any taken function addresses to the corresponding jump table entry. In the
139 object file's symbol table, the jump table entries take the identities of
140 the original functions, so that addresses taken outside the module will pass
141 any verification done inside the module.
142
143 Jump tables may call external functions, so their definitions need not
144 be available at LTO time. Note that if an externally defined function is a
145 member of a bitset, there is no guarantee that its identity within the module
146 will be the same as its identity outside of the module, as the former will
147 be the jump table entry if a jump table is necessary.
148
149 The `GlobalLayoutBuilder`_ class is responsible for laying out the globals
150 efficiently to minimize the sizes of the underlying bitsets.
151
152 .. _control flow integrity design document: http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html
153
154 :Example:
155
156 ::
157
158 target datalayout = "e-p:32:32"
159
160 @a = internal global i32 0
161 @b = internal global i32 0
162 @c = internal global i32 0
163 @d = internal global [2 x i32] [i32 0, i32 0]
164
165 define void @e() {
166 ret void
167 }
168
169 define void @f() {
170 ret void
171 }
172
173 declare void @g()
174
175 !llvm.bitsets = !{!0, !1, !2, !3, !4, !5, !6}
176
177 !0 = !{!"bitset1", i32* @a, i32 0}
178 !1 = !{!"bitset1", i32* @b, i32 0}
179 !2 = !{!"bitset2", i32* @b, i32 0}
180 !3 = !{!"bitset2", i32* @c, i32 0}
181 !4 = !{!"bitset2", i32* @d, i32 4}
182 !5 = !{!"bitset3", void ()* @e, i32 0}
183 !6 = !{!"bitset3", void ()* @g, i32 0}
184
185 declare i1 @llvm.bitset.test(i8* %ptr, metadata %bitset) nounwind readnone
186
187 define i1 @foo(i32* %p) {
188 %pi8 = bitcast i32* %p to i8*
189 %x = call i1 @llvm.bitset.test(i8* %pi8, metadata !"bitset1")
190 ret i1 %x
191 }
192
193 define i1 @bar(i32* %p) {
194 %pi8 = bitcast i32* %p to i8*
195 %x = call i1 @llvm.bitset.test(i8* %pi8, metadata !"bitset2")
196 ret i1 %x
197 }
198
199 define i1 @baz(void ()* %p) {
200 %pi8 = bitcast void ()* %p to i8*
201 %x = call i1 @llvm.bitset.test(i8* %pi8, metadata !"bitset3")
202 ret i1 %x
203 }
204
205 define void @main() {
206 %a1 = call i1 @foo(i32* @a) ; returns 1
207 %b1 = call i1 @foo(i32* @b) ; returns 1
208 %c1 = call i1 @foo(i32* @c) ; returns 0
209 %a2 = call i1 @bar(i32* @a) ; returns 0
210 %b2 = call i1 @bar(i32* @b) ; returns 1
211 %c2 = call i1 @bar(i32* @c) ; returns 1
212 %d02 = call i1 @bar(i32* getelementptr ([2 x i32]* @d, i32 0, i32 0)) ; returns 0
213 %d12 = call i1 @bar(i32* getelementptr ([2 x i32]* @d, i32 0, i32 1)) ; returns 1
214 %e = call i1 @baz(void ()* @e) ; returns 1
215 %f = call i1 @baz(void ()* @f) ; returns 0
216 %g = call i1 @baz(void ()* @g) ; returns 1
217 ret void
218 }
219
220 .. _GlobalLayoutBuilder: http://llvm.org/klaus/llvm/blob/master/include/llvm/Transforms/IPO/LowerBitSets.h
48384838 !1 = !{!1} ; an identifier for the inner loop
48394839 !2 = !{!2} ; an identifier for the outer loop
48404840
4841 '``llvm.bitsets``'
4842 ^^^^^^^^^^^^^^^^^^
4843
4844 The ``llvm.bitsets`` global metadata is used to implement
4845 :doc:`bitsets <BitSets>`.
4846
48474841 '``invariant.group``' Metadata
48484842 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
48494843
1227312267 that the optimizer can otherwise deduce or facts that are of little use to the
1227412268 optimizer.
1227512269
12276 .. _bitset.test:
12277
12278 '``llvm.bitset.test``' Intrinsic
12270 .. _type.test:
12271
12272 '``llvm.type.test``' Intrinsic
1227912273 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1228012274
1228112275 Syntax:
1228312277
1228412278 ::
1228512279
12286 declare i1 @llvm.bitset.test(i8* %ptr, metadata %bitset) nounwind readnone
12280 declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone
1228712281
1228812282
1228912283 Arguments:
1229012284 """"""""""
1229112285
1229212286 The first argument is a pointer to be tested. The second argument is a
12293 metadata object representing an identifier for a :doc:`bitset <BitSets>`.
12294
12295 Overview:
12296 """""""""
12297
12298 The ``llvm.bitset.test`` intrinsic tests whether the given pointer is a
12299 member of the given bitset.
12287 metadata object representing a :doc:`type identifier <TypeMetadata>`.
12288
12289 Overview:
12290 """""""""
12291
12292 The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
12293 with the given type identifier.
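For illustration only (this example is not part of the reference text, and the identifier ``!"_ZTS1A"`` is assumed to be attached elsewhere as type metadata), a typical use pairs the test with ``llvm.assume``:

::

    %vtable_i8 = bitcast i8** %vtable to i8*
    %valid = call i1 @llvm.type.test(i8* %vtable_i8, metadata !"_ZTS1A")
    call void @llvm.assume(i1 %valid)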
1230012294
1230112295 '``llvm.donothing``' Intrinsic
1230212296 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
0 =============
1 Type Metadata
2 =============
3
4 Type metadata is a mechanism that allows IR modules to co-operatively build
5 pointer sets corresponding to addresses within a given set of globals. LLVM's
6 `control flow integrity`_ implementation uses this metadata to efficiently
7 check (at each call site) that a given address corresponds to either a
8 valid vtable or function pointer for a given class or function type, and its
9 whole-program devirtualization pass uses the metadata to identify potential
10 callees for a given virtual call.
11
12 To use the mechanism, a client creates metadata nodes with two elements:
13
14 1. a byte offset into the global (generally zero for functions)
15 2. a metadata object representing an identifier for the type
16
17 These metadata nodes are associated with globals by using global object
18 metadata attachments with the ``!type`` metadata kind.
19
20 Each type identifier must exclusively identify either global variables
21 or functions.
22
23 .. admonition:: Limitation
24
25 The current implementation only supports attaching metadata to functions on
26 the x86-32 and x86-64 architectures.
27
28 An intrinsic, :ref:`llvm.type.test <type.test>`, is used to test whether a
29 given pointer is associated with a type identifier.
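As a minimal sketch (using a hypothetical global ``@g`` and type identifier ``!"typeid1"``; a full worked example appears later in this document), an attachment and the corresponding test look like this:

::

    @g = internal constant i32 0, !type !0
    !0 = !{i32 0, !"typeid1"}

    define i1 @is_typeid1(i8* %p) {
      %x = call i1 @llvm.type.test(i8* %p, metadata !"typeid1")
      ret i1 %x
    }

    declare i1 @llvm.type.test(i8*, metadata)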
30
31 .. _control flow integrity: http://clang.llvm.org/docs/ControlFlowIntegrity.html
32
33 Representing Type Information using Type Metadata
34 =================================================
35
36 This section describes how Clang represents C++ type information associated with
37 virtual tables using type metadata.
38
39 Consider the following inheritance hierarchy:
40
41 .. code-block:: c++
42
43 struct A {
44 virtual void f();
45 };
46
47 struct B : A {
48 virtual void f();
49 virtual void g();
50 };
51
52 struct C {
53 virtual void h();
54 };
55
56 struct D : A, C {
57 virtual void f();
58 virtual void h();
59 };
60
61 The virtual table objects for A, B, C and D look like this (under the Itanium ABI):
62
63 .. csv-table:: Virtual Table Layout for A, B, C, D
64 :header: Class, 0, 1, 2, 3, 4, 5, 6
65
66 A, A::offset-to-top, &A::rtti, &A::f
67 B, B::offset-to-top, &B::rtti, &B::f, &B::g
68 C, C::offset-to-top, &C::rtti, &C::h
69 D, D::offset-to-top, &D::rtti, &D::f, &D::h, D::offset-to-top, &D::rtti, thunk for &D::h
70
71 When an object of type A is constructed, the address of ``&A::f`` in A's
72 virtual table object is stored in the object's vtable pointer. In ABI parlance
73 this address is known as an `address point`_. Similarly, when an object of type
74 B is constructed, the address of ``&B::f`` is stored in the vtable pointer. In
75 this way, the vtable in B's virtual table object is compatible with A's vtable.
76
77 D is a little more complicated, due to the use of multiple inheritance. Its
78 virtual table object contains two vtables, one compatible with A's vtable and
79 the other compatible with C's vtable. Objects of type D contain two virtual
80 pointers, one belonging to the A subobject and containing the address of
81 the vtable compatible with A's vtable, and the other belonging to the C
82 subobject and containing the address of the vtable compatible with C's vtable.
83
84 The full set of compatibility information for the above class hierarchy is
85 shown below. The following table shows the name of a class, the offset of an
86 address point within that class's vtable and the name of one of the classes
87 with which that address point is compatible.
88
89 .. csv-table:: Type Offsets for A, B, C, D
90 :header: VTable for, Offset, Compatible Class
91
92 A, 16, A
93 B, 16, A
94 , , B
95 C, 16, C
96 D, 16, A
97 , , D
98 , 48, C
99
100 The next step is to encode this compatibility information into the IR. The way
101 this is done is to create type metadata named after each of the compatible
102 classes, with which we associate each of the compatible address points in
103 each vtable. For example, these type metadata entries encode the compatibility
104 information for the above hierarchy:
105
106 ::
107
108 @_ZTV1A = constant [...], !type !0
109 @_ZTV1B = constant [...], !type !0, !type !1
110 @_ZTV1C = constant [...], !type !2
111 @_ZTV1D = constant [...], !type !0, !type !3, !type !4
112
113 !0 = !{i64 16, !"_ZTS1A"}
114 !1 = !{i64 16, !"_ZTS1B"}
115 !2 = !{i64 16, !"_ZTS1C"}
116 !3 = !{i64 16, !"_ZTS1D"}
117 !4 = !{i64 48, !"_ZTS1C"}
118
119 With this type metadata, we can now use the ``llvm.type.test`` intrinsic to
120 test whether a given pointer is compatible with a type identifier. Working
121 backwards, if ``llvm.type.test`` returns true for a particular pointer,
122 we can also statically determine the identities of the virtual functions
123 that a particular virtual call may call. For example, if a program assumes
124 a pointer to be a member of ``!"_ZTS1A"``, we know that the address can
125 only be one of ``_ZTV1A+16``, ``_ZTV1B+16`` or ``_ZTV1D+16`` (i.e. the
126 address points of the vtables of A, B and D respectively). If we then load
127 an address from that pointer, we know that the address can only be one of
128 ``&A::f``, ``&B::f`` or ``&D::f``.
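This is the reasoning that the whole-program devirtualization pass relies on. A hedged sketch of the pattern it looks for (simplified from what a front end might emit; the struct and identifier names are illustrative): a type test whose result feeds ``llvm.assume``, followed by a load from the tested vtable pointer and an indirect call:

::

    %struct.A = type { i32 (...)** }

    define void @call_f(%struct.A* %obj) {
      %vtableptr = bitcast %struct.A* %obj to i8***
      %vtable = load i8**, i8*** %vtableptr
      %vtable_i8 = bitcast i8** %vtable to i8*
      %p = call i1 @llvm.type.test(i8* %vtable_i8, metadata !"_ZTS1A")
      call void @llvm.assume(i1 %p)
      %fptrptr = getelementptr i8*, i8** %vtable, i64 0   ; slot 0: A::f
      %fptr_i8 = load i8*, i8** %fptrptr
      %fptr = bitcast i8* %fptr_i8 to void (%struct.A*)*
      call void %fptr(%struct.A* %obj)
      ret void
    }

    declare i1 @llvm.type.test(i8*, metadata)
    declare void @llvm.assume(i1)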
129
130 .. _address point: https://mentorembedded.github.io/cxx-abi/abi.html#vtable-general
131
132 Testing Addresses For Type Membership
133 =====================================
134
135 If a program tests an address using ``llvm.type.test``, this will cause
136 a link-time optimization pass, ``LowerTypeTests``, to replace calls to this
137 intrinsic with efficient code to perform type member tests. At a high level,
138 the pass will lay out referenced globals in a consecutive memory region in
139 the object file, construct bit vectors that map onto that memory region,
140 and generate code at each of the ``llvm.type.test`` call sites to test
141 pointers against those bit vectors. Because of the layout manipulation, the
142 globals' definitions must be available at LTO time. For more information,
143 see the `control flow integrity design document`_.
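As a rough illustration only (the pass chooses among several encodings, and the real output also folds an alignment check into the range check, so this hand-written sketch does not match its exact output), a lowered test conceptually reduces to a range check plus a bit vector lookup:

::

    @__combined = private constant [4 x [3 x i8*]] zeroinitializer ; laid-out globals (placeholder)
    @__bits = private constant [4 x i8] c"\01\01\00\01"            ; one byte per candidate slot

    define i1 @lowered_type_test(i8* %ptr) {
    entry:
      %ptrint = ptrtoint i8* %ptr to i64
      %offset = sub i64 %ptrint, ptrtoint ([4 x [3 x i8*]]* @__combined to i64)
      %index = lshr i64 %offset, 3                ; assume 8-byte spacing of members
      %inrange = icmp ult i64 %index, 4
      br i1 %inrange, label %test, label %exit
    test:
      %byteptr = getelementptr [4 x i8], [4 x i8]* @__bits, i64 0, i64 %index
      %byte = load i8, i8* %byteptr
      %masked = and i8 %byte, 1
      %bit = icmp ne i8 %masked, 0
      br label %exit
    exit:
      %result = phi i1 [ false, %entry ], [ %bit, %test ]
      ret i1 %result
    }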
144
145 A type identifier that identifies functions is transformed into a jump table,
146 which is a block of code consisting of one branch instruction for each
147 of the functions associated with the type identifier that branches to the
148 target function. The pass will redirect any taken function addresses to the
149 corresponding jump table entry. In the object file's symbol table, the jump
150 table entries take the identities of the original functions, so that addresses
151 taken outside the module will pass any verification done inside the module.
152
153 Jump tables may call external functions, so their definitions need not
154 be available at LTO time. Note that if an externally defined function is
155 associated with a type identifier, there is no guarantee that its identity
156 within the module will be the same as its identity outside of the module,
157 as the former will be the jump table entry if a jump table is necessary.
158
159 The `GlobalLayoutBuilder`_ class is responsible for laying out the globals
160 efficiently to minimize the sizes of the underlying bitsets.
161
162 .. _control flow integrity design document: http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html
163
164 :Example:
165
166 ::
167
168 target datalayout = "e-p:32:32"
169
170 @a = internal global i32 0, !type !0
171 @b = internal global i32 0, !type !0, !type !1
172 @c = internal global i32 0, !type !1
173 @d = internal global [2 x i32] [i32 0, i32 0], !type !2
174
175 define void @e() !type !3 {
176 ret void
177 }
178
179 define void @f() {
180 ret void
181 }
182
183 declare void @g() !type !3
184
185 !0 = !{i32 0, !"typeid1"}
186 !1 = !{i32 0, !"typeid2"}
187 !2 = !{i32 4, !"typeid2"}
188 !3 = !{i32 0, !"typeid3"}
189
190 declare i1 @llvm.type.test(i8* %ptr, metadata %typeid) nounwind readnone
191
192 define i1 @foo(i32* %p) {
193 %pi8 = bitcast i32* %p to i8*
194 %x = call i1 @llvm.type.test(i8* %pi8, metadata !"typeid1")
195 ret i1 %x
196 }
197
198 define i1 @bar(i32* %p) {
199 %pi8 = bitcast i32* %p to i8*
200 %x = call i1 @llvm.type.test(i8* %pi8, metadata !"typeid2")
201 ret i1 %x
202 }
203
204 define i1 @baz(void ()* %p) {
205 %pi8 = bitcast void ()* %p to i8*
206 %x = call i1 @llvm.type.test(i8* %pi8, metadata !"typeid3")
207 ret i1 %x
208 }
209
210 define void @main() {
211 %a1 = call i1 @foo(i32* @a) ; returns 1
212 %b1 = call i1 @foo(i32* @b) ; returns 1
213 %c1 = call i1 @foo(i32* @c) ; returns 0
214 %a2 = call i1 @bar(i32* @a) ; returns 0
215 %b2 = call i1 @bar(i32* @b) ; returns 1
216 %c2 = call i1 @bar(i32* @c) ; returns 1
217 %d02 = call i1 @bar(i32* getelementptr ([2 x i32]* @d, i32 0, i32 0)) ; returns 0
218 %d12 = call i1 @bar(i32* getelementptr ([2 x i32]* @d, i32 0, i32 1)) ; returns 1
219 %e = call i1 @baz(void ()* @e) ; returns 1
220 %f = call i1 @baz(void ()* @f) ; returns 0
221 %g = call i1 @baz(void ()* @g) ; returns 1
222 ret void
223 }
224
225 .. _GlobalLayoutBuilder: http://llvm.org/klaus/llvm/blob/master/include/llvm/Transforms/IPO/LowerTypeTests.h
260260 CoverageMappingFormat
261261 Statepoints
262262 MergeFunctions
263 BitSets
263 TypeMetadata
264264 FaultMaps
265265 MIRLangRef
266266
+0
-38
include/llvm/Analysis/BitSetUtils.h
None //===- BitSetUtils.h - Utilities related to pointer bitsets ------*- C++ -*-==//
1 //
2 // The LLVM Compiler Infrastructure
3 //
4 // This file is distributed under the University of Illinois Open Source
5 // License. See LICENSE.TXT for details.
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // This file contains functions that make it easier to manipulate bitsets for
10 // devirtualization.
11 //
12 //===----------------------------------------------------------------------===//
13
14 #ifndef LLVM_ANALYSIS_BITSETUTILS_H
15 #define LLVM_ANALYSIS_BITSETUTILS_H
16
17 #include "llvm/ADT/SmallVector.h"
18 #include "llvm/IR/CallSite.h"
19
20 namespace llvm {
21
22 /// A call site that could be devirtualized.
23 struct DevirtCallSite {
24 /// The offset from the address point to the virtual function.
25 uint64_t Offset;
26 /// The call site itself.
27 CallSite CS;
28 };
29
30 /// Given a call to the intrinsic @llvm.bitset.test, find all devirtualizable
31 /// call sites based on the call and return them in DevirtCalls.
32 void findDevirtualizableCalls(SmallVectorImpl<DevirtCallSite> &DevirtCalls,
33 SmallVectorImpl<CallInst *> &Assumes,
34 CallInst *CI);
35 }
36
37 #endif
0 //===- TypeMetadataUtils.h - Utilities related to type metadata --*- C++ -*-==//
1 //
2 // The LLVM Compiler Infrastructure
3 //
4 // This file is distributed under the University of Illinois Open Source
5 // License. See LICENSE.TXT for details.
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // This file contains functions that make it easier to manipulate type metadata
10 // for devirtualization.
11 //
12 //===----------------------------------------------------------------------===//
13
14 #ifndef LLVM_ANALYSIS_TYPEMETADATAUTILS_H
15 #define LLVM_ANALYSIS_TYPEMETADATAUTILS_H
16
17 #include "llvm/ADT/SmallVector.h"
18 #include "llvm/IR/CallSite.h"
19
20 namespace llvm {
21
22 /// A call site that could be devirtualized.
23 struct DevirtCallSite {
24 /// The offset from the address point to the virtual function.
25 uint64_t Offset;
26 /// The call site itself.
27 CallSite CS;
28 };
29
30 /// Given a call to the intrinsic @llvm.type.test, find all devirtualizable
31 /// call sites based on the call and return them in DevirtCalls.
32 void findDevirtualizableCalls(SmallVectorImpl<DevirtCallSite> &DevirtCalls,
33 SmallVectorImpl<CallInst *> &Assumes,
34 CallInst *CI);
35 }
36
37 #endif
2020 namespace llvm {
2121 class Comdat;
2222 class MDNode;
23 class Metadata;
2324 class Module;
2425
2526 class GlobalObject : public GlobalValue {
113114 /// Erase all metadata attachments with the given kind.
114115 void eraseMetadata(unsigned KindID);
115116
116 /// Copy metadata from Src.
117 void copyMetadata(const GlobalObject *Src);
117 /// Copy metadata from Src, adjusting offsets by Offset.
118 void copyMetadata(const GlobalObject *Src, unsigned Offset);
119
120 void addTypeMetadata(unsigned Offset, Metadata *TypeID);
118121
119122 void copyAttributesFrom(const GlobalValue *Src) override;
120123
662662 LLVMVectorSameWidth<0, llvm_i1_ty>],
663663 [IntrArgMemOnly]>;
664664
665 // Intrinsics to support bit sets.
666 def int_bitset_test : Intrinsic<[llvm_i1_ty], [llvm_ptr_ty, llvm_metadata_ty],
667 [IntrNoMem]>;
665 // Test whether a pointer is associated with a type metadata identifier.
666 def int_type_test : Intrinsic<[llvm_i1_ty], [llvm_ptr_ty, llvm_metadata_ty],
667 [IntrNoMem]>;
668668
669669 def int_load_relative: Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty, llvm_anyint_ty],
670670 [IntrReadMem, IntrArgMemOnly]>;
6767 MD_invariant_group = 16, // "invariant.group"
6868 MD_align = 17, // "align"
6969 MD_loop = 18, // "llvm.loop"
70 MD_type = 19, // "type"
7071 };
7172
7273 /// Known operand bundle tag IDs, which always have the same value. All
191191 void initializeLoopVersioningLICMPass(PassRegistry&);
192192 void initializeLoopVersioningPassPass(PassRegistry &);
193193 void initializeLowerAtomicLegacyPassPass(PassRegistry &);
194 void initializeLowerBitSetsPass(PassRegistry&);
195194 void initializeLowerEmuTLSPass(PassRegistry&);
196195 void initializeLowerExpectIntrinsicPass(PassRegistry&);
197196 void initializeLowerGuardIntrinsicPass(PassRegistry&);
198197 void initializeLowerIntrinsicsPass(PassRegistry&);
199198 void initializeLowerInvokePass(PassRegistry&);
200199 void initializeLowerSwitchPass(PassRegistry&);
200 void initializeLowerTypeTestsPass(PassRegistry&);
201201 void initializeMIRPrintingPassPass(PassRegistry&);
202202 void initializeMachineBlockFrequencyInfoPass(PassRegistry&);
203203 void initializeMachineBlockPlacementPass(PassRegistry&);
+0
-205
include/llvm/Transforms/IPO/LowerBitSets.h
None //===- LowerBitSets.h - Bitset lowering pass --------------------*- C++ -*-===//
1 //
2 // The LLVM Compiler Infrastructure
3 //
4 // This file is distributed under the University of Illinois Open Source
5 // License. See LICENSE.TXT for details.
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // This file defines parts of the bitset lowering pass implementation that may
10 // be usefully unit tested.
11 //
12 //===----------------------------------------------------------------------===//
13
14 #ifndef LLVM_TRANSFORMS_IPO_LOWERBITSETS_H
15 #define LLVM_TRANSFORMS_IPO_LOWERBITSETS_H
16
17 #include "llvm/ADT/DenseMap.h"
18 #include "llvm/ADT/SmallVector.h"
19
20 #include <cstdint>
21 #include <cstring>
22 #include <limits>
23 #include <set>
24 #include <vector>
25
26 namespace llvm {
27
28 class DataLayout;
29 class GlobalObject;
30 class Value;
31 class raw_ostream;
32
33 namespace lowerbitsets {
34
35 struct BitSetInfo {
36 // The indices of the set bits in the bitset.
37 std::set<uint64_t> Bits;
38
39 // The byte offset into the combined global represented by the bitset.
40 uint64_t ByteOffset;
41
42 // The size of the bitset in bits.
43 uint64_t BitSize;
44
45 // Log2 alignment of the bit set relative to the combined global.
46 // For example, a log2 alignment of 3 means that bits in the bitset
47 // represent addresses 8 bytes apart.
48 unsigned AlignLog2;
49
50 bool isSingleOffset() const {
51 return Bits.size() == 1;
52 }
53
54 bool isAllOnes() const {
55 return Bits.size() == BitSize;
56 }
57
58 bool containsGlobalOffset(uint64_t Offset) const;
59
60 bool containsValue(const DataLayout &DL,
61 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout,
62 Value *V, uint64_t COffset = 0) const;
63
64 void print(raw_ostream &OS) const;
65 };
66
67 struct BitSetBuilder {
68 SmallVector<uint64_t, 64> Offsets;
69 uint64_t Min, Max;
70
71 BitSetBuilder() : Min(std::numeric_limits<uint64_t>::max()), Max(0) {}
72
73 void addOffset(uint64_t Offset) {
74 if (Min > Offset)
75 Min = Offset;
76 if (Max < Offset)
77 Max = Offset;
78
79 Offsets.push_back(Offset);
80 }
81
82 BitSetInfo build();
83 };
84
85 /// This class implements a layout algorithm for globals referenced by bit sets
86 /// that tries to keep members of small bit sets together. This can
87 /// significantly reduce bit set sizes in many cases.
88 ///
89 /// It works by assembling fragments of layout from sets of referenced globals.
90 /// Each set of referenced globals causes the algorithm to create a new
91 /// fragment, which is assembled by appending each referenced global in the set
92 /// into the fragment. If a referenced global has already been referenced by an
93 /// fragment created earlier, we instead delete that fragment and append its
94 /// contents into the fragment we are assembling.
95 ///
96 /// By starting with the smallest fragments, we minimize the size of the
97 /// fragments that are copied into larger fragments. This is most intuitively
98 /// thought about when considering the case where the globals are virtual tables
99 /// and the bit sets represent their derived classes: in a single inheritance
100 /// hierarchy, the optimum layout would involve a depth-first search of the
101 /// class hierarchy (and in fact the computed layout ends up looking a lot like
102 /// a DFS), but a naive DFS would not work well in the presence of multiple
103 /// inheritance. This aspect of the algorithm ends up fitting smaller
104 /// hierarchies inside larger ones where that would be beneficial.
105 ///
106 /// For example, consider this class hierarchy:
107 ///
108 /// A B
109 /// \ / | \
110 /// C D E
111 ///
112 /// We have five bit sets: bsA (A, C), bsB (B, C, D, E), bsC (C), bsD (D) and
113 /// bsE (E). If we laid out our objects by DFS traversing B followed by A, our
114 /// layout would be {B, C, D, E, A}. This is optimal for bsB as it needs to
115 /// cover the only 4 objects in its hierarchy, but not for bsA as it needs to
116 /// cover 5 objects, i.e. the entire layout. Our algorithm proceeds as follows:
117 ///
118 /// Add bsC, fragments {{C}}
119 /// Add bsD, fragments {{C}, {D}}
120 /// Add bsE, fragments {{C}, {D}, {E}}
121 /// Add bsA, fragments {{A, C}, {D}, {E}}
122 /// Add bsB, fragments {{B, A, C, D, E}}
123 ///
124 /// This layout is optimal for bsA, as it now only needs to cover two (i.e. 3
125 /// fewer) objects, at the cost of bsB needing to cover 1 more object.
126 ///
127 /// The bit set lowering pass assigns an object index to each object that needs
128 /// to be laid out, and calls addFragment for each bit set passing the object
129 /// indices of its referenced globals. It then assembles a layout from the
130 /// computed layout in the Fragments field.
131 struct GlobalLayoutBuilder {
132 /// The computed layout. Each element of this vector contains a fragment of
133 /// layout (which may be empty) consisting of object indices.
134 std::vector<std::vector<uint64_t>> Fragments;
135
136 /// Mapping from object index to fragment index.
137 std::vector<uint64_t> FragmentMap;
138
139 GlobalLayoutBuilder(uint64_t NumObjects)
140 : Fragments(1), FragmentMap(NumObjects) {}
141
142 /// Add F to the layout while trying to keep its indices contiguous.
143 /// If a previously seen fragment uses any of F's indices, that
144 /// fragment will be laid out inside F.
145 void addFragment(const std::set<uint64_t> &F);
146 };
147
148 /// This class is used to build a byte array containing overlapping bit sets. By
149 /// loading from indexed offsets into the byte array and applying a mask, a
150 /// program can test bits from the bit set with a relatively short instruction
151 /// sequence. For example, suppose we have 15 bit sets to lay out:
152 ///
153 /// A (16 bits), B (15 bits), C (14 bits), D (13 bits), E (12 bits),
154 /// F (11 bits), G (10 bits), H (9 bits), I (7 bits), J (6 bits), K (5 bits),
155 /// L (4 bits), M (3 bits), N (2 bits), O (1 bit)
156 ///
157 /// These bits can be laid out in a 16-byte array like this:
158 ///
159 /// Byte Offset
160 /// 0123456789ABCDEF
161 /// Bit
162 /// 7 HHHHHHHHHIIIIIII
163 /// 6 GGGGGGGGGGJJJJJJ
164 /// 5 FFFFFFFFFFFKKKKK
165 /// 4 EEEEEEEEEEEELLLL
166 /// 3 DDDDDDDDDDDDDMMM
167 /// 2 CCCCCCCCCCCCCCNN
168 /// 1 BBBBBBBBBBBBBBBO
169 /// 0 AAAAAAAAAAAAAAAA
170 ///
171 /// For example, to test bit X of A, we evaluate ((bits[X] & 1) != 0), or to
172 /// test bit X of I, we evaluate ((bits[9 + X] & 0x80) != 0). This can be done
173 /// in 1-2 machine instructions on x86, or 4-6 instructions on ARM.
174 ///
175 /// This is a byte array, rather than (say) a 2-byte array or a 4-byte array,
176 /// because for one thing it gives us better packing (the more bins there are,
177 /// the less evenly they will be filled), and for another, the instruction
178 /// sequences can be slightly shorter, both on x86 and ARM.
179 struct ByteArrayBuilder {
180 /// The byte array built so far.
181 std::vector<uint8_t> Bytes;
182
183 enum { BitsPerByte = 8 };
184
185 /// The number of bytes allocated so far for each of the bits.
186 uint64_t BitAllocs[BitsPerByte];
187
188 ByteArrayBuilder() {
189 memset(BitAllocs, 0, sizeof(BitAllocs));
190 }
191
192 /// Allocate BitSize bits in the byte array where Bits contains the bits to
193 /// set. AllocByteOffset is set to the offset within the byte array and
194 /// AllocMask is set to the bitmask for those bits. This uses the LPT (Longest
195 /// Processing Time) multiprocessor scheduling algorithm to lay out the bits
196 /// efficiently; the pass allocates bit sets in decreasing size order.
197 void allocate(const std::set<uint64_t> &Bits, uint64_t BitSize,
198 uint64_t &AllocByteOffset, uint8_t &AllocMask);
199 };
200
201 } // end namespace lowerbitsets
202 } // end namespace llvm
203
204 #endif // LLVM_TRANSFORMS_IPO_LOWERBITSETS_H
0 //===- LowerTypeTests.h - type metadata lowering pass -----------*- C++ -*-===//
1 //
2 // The LLVM Compiler Infrastructure
3 //
4 // This file is distributed under the University of Illinois Open Source
5 // License. See LICENSE.TXT for details.
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // This file defines parts of the type test lowering pass implementation that
10 // may be usefully unit tested.
11 //
12 //===----------------------------------------------------------------------===//
13
14 #ifndef LLVM_TRANSFORMS_IPO_LOWERTYPETESTS_H
15 #define LLVM_TRANSFORMS_IPO_LOWERTYPETESTS_H
16
17 #include "llvm/ADT/DenseMap.h"
18 #include "llvm/ADT/SmallVector.h"
19
20 #include <cstdint>
21 #include <cstring>
22 #include <limits>
23 #include <set>
24 #include <vector>
25
26 namespace llvm {
27
28 class DataLayout;
29 class GlobalObject;
30 class Value;
31 class raw_ostream;
32
33 namespace lowertypetests {
34
35 struct BitSetInfo {
36 // The indices of the set bits in the bitset.
37 std::set<uint64_t> Bits;
38
39 // The byte offset into the combined global represented by the bitset.
40 uint64_t ByteOffset;
41
42 // The size of the bitset in bits.
43 uint64_t BitSize;
44
45 // Log2 alignment of the bit set relative to the combined global.
46 // For example, a log2 alignment of 3 means that bits in the bitset
47 // represent addresses 8 bytes apart.
48 unsigned AlignLog2;
49
50 bool isSingleOffset() const {
51 return Bits.size() == 1;
52 }
53
54 bool isAllOnes() const {
55 return Bits.size() == BitSize;
56 }
57
58 bool containsGlobalOffset(uint64_t Offset) const;
59
60 bool containsValue(const DataLayout &DL,
61 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout,
62 Value *V, uint64_t COffset = 0) const;
63
64 void print(raw_ostream &OS) const;
65 };
66
67 struct BitSetBuilder {
68 SmallVector<uint64_t, 64> Offsets;
69 uint64_t Min, Max;
70
71 BitSetBuilder() : Min(std::numeric_limits<uint64_t>::max()), Max(0) {}
72
73 void addOffset(uint64_t Offset) {
74 if (Min > Offset)
75 Min = Offset;
76 if (Max < Offset)
77 Max = Offset;
78
79 Offsets.push_back(Offset);
80 }
81
82 BitSetInfo build();
83 };
84
85 /// This class implements a layout algorithm for globals referenced by bit sets
86 /// that tries to keep members of small bit sets together. This can
87 /// significantly reduce bit set sizes in many cases.
88 ///
89 /// It works by assembling fragments of layout from sets of referenced globals.
90 /// Each set of referenced globals causes the algorithm to create a new
91 /// fragment, which is assembled by appending each referenced global in the set
92 /// into the fragment. If a referenced global has already been referenced by an
93 /// fragment created earlier, we instead delete that fragment and append its
94 /// contents into the fragment we are assembling.
95 ///
96 /// By starting with the smallest fragments, we minimize the size of the
97 /// fragments that are copied into larger fragments. This is most intuitively
98 /// thought about when considering the case where the globals are virtual tables
99 /// and the bit sets represent their derived classes: in a single inheritance
100 /// hierarchy, the optimum layout would involve a depth-first search of the
101 /// class hierarchy (and in fact the computed layout ends up looking a lot like
102 /// a DFS), but a naive DFS would not work well in the presence of multiple
103 /// inheritance. This aspect of the algorithm ends up fitting smaller
104 /// hierarchies inside larger ones where that would be beneficial.
105 ///
106 /// For example, consider this class hierarchy:
107 ///
108 /// A B
109 /// \ / | \
110 /// C D E
111 ///
112 /// We have five bit sets: bsA (A, C), bsB (B, C, D, E), bsC (C), bsD (D) and
113 /// bsE (E). If we laid out our objects by DFS traversing B followed by A, our
114 /// layout would be {B, C, D, E, A}. This is optimal for bsB as it needs to
115 /// cover the only 4 objects in its hierarchy, but not for bsA as it needs to
116 /// cover 5 objects, i.e. the entire layout. Our algorithm proceeds as follows:
117 ///
118 /// Add bsC, fragments {{C}}
119 /// Add bsD, fragments {{C}, {D}}
120 /// Add bsE, fragments {{C}, {D}, {E}}
121 /// Add bsA, fragments {{A, C}, {D}, {E}}
122 /// Add bsB, fragments {{B, A, C, D, E}}
123 ///
124 /// This layout is optimal for bsA, as it now only needs to cover two (i.e. 3
125 /// fewer) objects, at the cost of bsB needing to cover 1 more object.
126 ///
127 /// The bit set lowering pass assigns an object index to each object that needs
128 /// to be laid out, and calls addFragment for each bit set passing the object
129 /// indices of its referenced globals. It then assembles a layout from the
130 /// computed layout in the Fragments field.
131 struct GlobalLayoutBuilder {
132 /// The computed layout. Each element of this vector contains a fragment of
133 /// layout (which may be empty) consisting of object indices.
134 std::vector<std::vector<uint64_t>> Fragments;
135
136 /// Mapping from object index to fragment index.
137 std::vector<uint64_t> FragmentMap;
138
139 GlobalLayoutBuilder(uint64_t NumObjects)
140 : Fragments(1), FragmentMap(NumObjects) {}
141
142 /// Add F to the layout while trying to keep its indices contiguous.
143 /// If a previously seen fragment uses any of F's indices, that
144 /// fragment will be laid out inside F.
145 void addFragment(const std::set<uint64_t> &F);
146 };
147
148 /// This class is used to build a byte array containing overlapping bit sets. By
149 /// loading from indexed offsets into the byte array and applying a mask, a
150 /// program can test bits from the bit set with a relatively short instruction
151 /// sequence. For example, suppose we have 15 bit sets to lay out:
152 ///
153 /// A (16 bits), B (15 bits), C (14 bits), D (13 bits), E (12 bits),
154 /// F (11 bits), G (10 bits), H (9 bits), I (7 bits), J (6 bits), K (5 bits),
155 /// L (4 bits), M (3 bits), N (2 bits), O (1 bit)
156 ///
157 /// These bits can be laid out in a 16-byte array like this:
158 ///
159 /// Byte Offset
160 /// 0123456789ABCDEF
161 /// Bit
162 /// 7 HHHHHHHHHIIIIIII
163 /// 6 GGGGGGGGGGJJJJJJ
164 /// 5 FFFFFFFFFFFKKKKK
165 /// 4 EEEEEEEEEEEELLLL
166 /// 3 DDDDDDDDDDDDDMMM
167 /// 2 CCCCCCCCCCCCCCNN
168 /// 1 BBBBBBBBBBBBBBBO
169 /// 0 AAAAAAAAAAAAAAAA
170 ///
171 /// For example, to test bit X of A, we evaluate ((bits[X] & 1) != 0), or to
172 /// test bit X of I, we evaluate ((bits[9 + X] & 0x80) != 0). This can be done
173 /// in 1-2 machine instructions on x86, or 4-6 instructions on ARM.
174 ///
175 /// This is a byte array, rather than (say) a 2-byte array or a 4-byte array,
176 /// because for one thing it gives us better packing (the more bins there are,
177 /// the less evenly they will be filled), and for another, the instruction
178 /// sequences can be slightly shorter, both on x86 and ARM.
179 struct ByteArrayBuilder {
180 /// The byte array built so far.
181 std::vector<uint8_t> Bytes;
182
183 enum { BitsPerByte = 8 };
184
185 /// The number of bytes allocated so far for each of the bits.
186 uint64_t BitAllocs[BitsPerByte];
187
188 ByteArrayBuilder() {
189 memset(BitAllocs, 0, sizeof(BitAllocs));
190 }
191
192 /// Allocate BitSize bits in the byte array where Bits contains the bits to
193 /// set. AllocByteOffset is set to the offset within the byte array and
194 /// AllocMask is set to the bitmask for those bits. This uses the LPT (Longest
195 /// Processing Time) multiprocessor scheduling algorithm to lay out the bits
196 /// efficiently; the pass allocates bit sets in decreasing size order.
197 void allocate(const std::set<uint64_t> &Bits, uint64_t BitSize,
198 uint64_t &AllocByteOffset, uint8_t &AllocMask);
199 };
200
201 } // end namespace lowertypetests
202 } // end namespace llvm
203
204 #endif // LLVM_TRANSFORMS_IPO_LOWERTYPETESTS_H
9898 AccumBitVector After;
9999 };
100100
101 // Information about an entry in a particular bitset.
102 struct BitSetInfo {
101 // Information about a member of a particular type identifier.
102 struct TypeMemberInfo {
103103 // The VTableBits for the vtable.
104104 VTableBits *Bits;
105105
106106 // The offset in bytes from the start of the vtable (i.e. the address point).
107107 uint64_t Offset;
108108
109 bool operator<(const BitSetInfo &other) const {
109 bool operator<(const TypeMemberInfo &other) const {
110110 return Bits < other.Bits || (Bits == other.Bits && Offset < other.Offset);
111111 }
112112 };
113113
114114 // A virtual call target, i.e. an entry in a particular vtable.
115115 struct VirtualCallTarget {
116 VirtualCallTarget(Function *Fn, const BitSetInfo *BS);
116 VirtualCallTarget(Function *Fn, const TypeMemberInfo *TM);
117117
118118 // For testing only.
119 VirtualCallTarget(const BitSetInfo *BS, bool IsBigEndian)
120 : Fn(nullptr), BS(BS), IsBigEndian(IsBigEndian) {}
119 VirtualCallTarget(const TypeMemberInfo *TM, bool IsBigEndian)
120 : Fn(nullptr), TM(TM), IsBigEndian(IsBigEndian) {}
121121
122122 // The function stored in the vtable.
123123 Function *Fn;
124124
125 // A pointer to the bitset through which the pointer to Fn is accessed.
126 const BitSetInfo *BS;
125 // A pointer to the type identifier member through which the pointer to Fn is
126 // accessed.
127 const TypeMemberInfo *TM;
127128
128129 // When doing virtual constant propagation, this stores the return value for
129130 // the function when passed the currently considered argument list.
136137 // the vtable object before the address point (e.g. RTTI, access-to-top,
137138 // vtables for other base classes) and is equal to the offset from the start
138139 // of the vtable object to the address point.
139 uint64_t minBeforeBytes() const { return BS->Offset; }
140 uint64_t minBeforeBytes() const { return TM->Offset; }
140141
141142 // The minimum byte offset after the address point. This covers the bytes in
142143 // the vtable object after the address point (e.g. the vtable for the current
143144 // class and any later base classes) and is equal to the size of the vtable
144145 // object minus the offset from the start of the vtable object to the address
145146 // point.
146 uint64_t minAfterBytes() const { return BS->Bits->ObjectSize - BS->Offset; }
147 uint64_t minAfterBytes() const { return TM->Bits->ObjectSize - TM->Offset; }
147148
148149 // The number of bytes allocated (for the vtable plus the byte array) before
149150 // the address point.
150151 uint64_t allocatedBeforeBytes() const {
151 return minBeforeBytes() + BS->Bits->Before.Bytes.size();
152 return minBeforeBytes() + TM->Bits->Before.Bytes.size();
152153 }
153154
154155 // The number of bytes allocated (for the vtable plus the byte array) after
155156 // the address point.
156157 uint64_t allocatedAfterBytes() const {
157 return minAfterBytes() + BS->Bits->After.Bytes.size();
158 return minAfterBytes() + TM->Bits->After.Bytes.size();
158159 }
159160
160161 // Set the bit at position Pos before the address point to RetVal.
161162 void setBeforeBit(uint64_t Pos) {
162163 assert(Pos >= 8 * minBeforeBytes());
163 BS->Bits->Before.setBit(Pos - 8 * minBeforeBytes(), RetVal);
164 TM->Bits->Before.setBit(Pos - 8 * minBeforeBytes(), RetVal);
164165 }
165166
166167 // Set the bit at position Pos after the address point to RetVal.
167168 void setAfterBit(uint64_t Pos) {
168169 assert(Pos >= 8 * minAfterBytes());
169 BS->Bits->After.setBit(Pos - 8 * minAfterBytes(), RetVal);
170 TM->Bits->After.setBit(Pos - 8 * minAfterBytes(), RetVal);
170171 }
171172
172173 // Set the bytes at position Pos before the address point to RetVal.
175176 void setBeforeBytes(uint64_t Pos, uint8_t Size) {
176177 assert(Pos >= 8 * minBeforeBytes());
177178 if (IsBigEndian)
178 BS->Bits->Before.setLE(Pos - 8 * minBeforeBytes(), RetVal, Size);
179 TM->Bits->Before.setLE(Pos - 8 * minBeforeBytes(), RetVal, Size);
179180 else
180 BS->Bits->Before.setBE(Pos - 8 * minBeforeBytes(), RetVal, Size);
181 TM->Bits->Before.setBE(Pos - 8 * minBeforeBytes(), RetVal, Size);
181182 }
182183
183184 // Set the bytes at position Pos after the address point to RetVal.
184185 void setAfterBytes(uint64_t Pos, uint8_t Size) {
185186 assert(Pos >= 8 * minAfterBytes());
186187 if (IsBigEndian)
187 BS->Bits->After.setBE(Pos - 8 * minAfterBytes(), RetVal, Size);
188 TM->Bits->After.setBE(Pos - 8 * minAfterBytes(), RetVal, Size);
188189 else
189 BS->Bits->After.setLE(Pos - 8 * minAfterBytes(), RetVal, Size);
190 TM->Bits->After.setLE(Pos - 8 * minAfterBytes(), RetVal, Size);
190191 }
191192 };
192193
213213 /// manager.
214214 ModulePass *createBarrierNoopPass();
215215
216 /// \brief This pass lowers bitset metadata and the llvm.bitset.test intrinsic
217 /// to bitsets.
218 ModulePass *createLowerBitSetsPass();
216 /// \brief This pass lowers type metadata and the llvm.type.test intrinsic to
217 /// bitsets.
218 ModulePass *createLowerTypeTestsPass();
219219
220220 /// \brief This pass export CFI checks for use by external modules.
221221 ModulePass *createCrossDSOCFIPass();
222222
223 /// \brief This pass implements whole-program devirtualization using bitset
223 /// \brief This pass implements whole-program devirtualization using type
224224 /// metadata.
225225 ModulePass *createWholeProgramDevirtPass();
226226
+0
-82
lib/Analysis/BitSetUtils.cpp
None //===- BitSetUtils.cpp - Utilities related to pointer bitsets -------------===//
1 //
2 // The LLVM Compiler Infrastructure
3 //
4 // This file is distributed under the University of Illinois Open Source
5 // License. See LICENSE.TXT for details.
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // This file contains functions that make it easier to manipulate bitsets for
10 // devirtualization.
11 //
12 //===----------------------------------------------------------------------===//
13
14 #include "llvm/Analysis/BitSetUtils.h"
15 #include "llvm/IR/Intrinsics.h"
16 #include "llvm/IR/Module.h"
17
18 using namespace llvm;
19
20 // Search for virtual calls that call FPtr and add them to DevirtCalls.
21 static void
22 findCallsAtConstantOffset(SmallVectorImpl<DevirtCallSite> &DevirtCalls,
23 Value *FPtr, uint64_t Offset) {
24 for (const Use &U : FPtr->uses()) {
25 Value *User = U.getUser();
26 if (isa<BitCastInst>(User)) {
27 findCallsAtConstantOffset(DevirtCalls, User, Offset);
28 } else if (auto CI = dyn_cast<CallInst>(User)) {
29 DevirtCalls.push_back({Offset, CI});
30 } else if (auto II = dyn_cast<InvokeInst>(User)) {
31 DevirtCalls.push_back({Offset, II});
32 }
33 }
34 }
35
36 // Search for virtual calls that load from VPtr and add them to DevirtCalls.
37 static void
38 findLoadCallsAtConstantOffset(Module *M,
39 SmallVectorImpl<DevirtCallSite> &DevirtCalls,
40 Value *VPtr, uint64_t Offset) {
41 for (const Use &U : VPtr->uses()) {
42 Value *User = U.getUser();
43 if (isa<BitCastInst>(User)) {
44 findLoadCallsAtConstantOffset(M, DevirtCalls, User, Offset);
45 } else if (isa<LoadInst>(User)) {
46 findCallsAtConstantOffset(DevirtCalls, User, Offset);
47 } else if (auto GEP = dyn_cast<GetElementPtrInst>(User)) {
48 // Take into account the GEP offset.
49 if (VPtr == GEP->getPointerOperand() && GEP->hasAllConstantIndices()) {
50 SmallVector<Value *, 8> Indices(GEP->op_begin() + 1, GEP->op_end());
51 uint64_t GEPOffset = M->getDataLayout().getIndexedOffsetInType(
52 GEP->getSourceElementType(), Indices);
53 findLoadCallsAtConstantOffset(M, DevirtCalls, User, Offset + GEPOffset);
54 }
55 }
56 }
57 }
58
59 void llvm::findDevirtualizableCalls(
60 SmallVectorImpl<DevirtCallSite> &DevirtCalls,
61 SmallVectorImpl<CallInst *> &Assumes, CallInst *CI) {
62 assert(CI->getCalledFunction()->getIntrinsicID() == Intrinsic::bitset_test);
63
64 Module *M = CI->getParent()->getParent()->getParent();
65
66 // Find llvm.assume intrinsics for this llvm.bitset.test call.
67 for (const Use &CIU : CI->uses()) {
68 auto AssumeCI = dyn_cast<CallInst>(CIU.getUser());
69 if (AssumeCI) {
70 Function *F = AssumeCI->getCalledFunction();
71 if (F && F->getIntrinsicID() == Intrinsic::assume)
72 Assumes.push_back(AssumeCI);
73 }
74 }
75
76 // If we found any, search for virtual calls based on %p and add them to
77 // DevirtCalls.
78 if (!Assumes.empty())
79 findLoadCallsAtConstantOffset(M, DevirtCalls,
80 CI->getArgOperand(0)->stripPointerCasts(), 0);
81 }
44 Analysis.cpp
55 AssumptionCache.cpp
66 BasicAliasAnalysis.cpp
7 BitSetUtils.cpp
87 BlockFrequencyInfo.cpp
98 BlockFrequencyInfoImpl.cpp
109 BranchProbabilityInfo.cpp
7069 TargetTransformInfo.cpp
7170 Trace.cpp
7271 TypeBasedAliasAnalysis.cpp
72 TypeMetadataUtils.cpp
7373 ScopedNoAliasAA.cpp
7474 ValueTracking.cpp
7575 VectorUtils.cpp
0 //===- TypeMetadataUtils.cpp - Utilities related to type metadata ---------===//
1 //
2 // The LLVM Compiler Infrastructure
3 //
4 // This file is distributed under the University of Illinois Open Source
5 // License. See LICENSE.TXT for details.
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // This file contains functions that make it easier to manipulate type metadata
10 // for devirtualization.
11 //
12 //===----------------------------------------------------------------------===//
13
14 #include "llvm/Analysis/TypeMetadataUtils.h"
15 #include "llvm/IR/Intrinsics.h"
16 #include "llvm/IR/Module.h"
17
18 using namespace llvm;
19
20 // Search for virtual calls that call FPtr and add them to DevirtCalls.
21 static void
22 findCallsAtConstantOffset(SmallVectorImpl<DevirtCallSite> &DevirtCalls,
23 Value *FPtr, uint64_t Offset) {
24 for (const Use &U : FPtr->uses()) {
25 Value *User = U.getUser();
26 if (isa<BitCastInst>(User)) {
27 findCallsAtConstantOffset(DevirtCalls, User, Offset);
28 } else if (auto CI = dyn_cast<CallInst>(User)) {
29 DevirtCalls.push_back({Offset, CI});
30 } else if (auto II = dyn_cast<InvokeInst>(User)) {
31 DevirtCalls.push_back({Offset, II});
32 }
33 }
34 }
35
36 // Search for virtual calls that load from VPtr and add them to DevirtCalls.
37 static void
38 findLoadCallsAtConstantOffset(Module *M,
39 SmallVectorImpl<DevirtCallSite> &DevirtCalls,
40 Value *VPtr, uint64_t Offset) {
41 for (const Use &U : VPtr->uses()) {
42 Value *User = U.getUser();
43 if (isa<BitCastInst>(User)) {
44 findLoadCallsAtConstantOffset(M, DevirtCalls, User, Offset);
45 } else if (isa<LoadInst>(User)) {
46 findCallsAtConstantOffset(DevirtCalls, User, Offset);
47 } else if (auto GEP = dyn_cast<GetElementPtrInst>(User)) {
48 // Take into account the GEP offset.
49 if (VPtr == GEP->getPointerOperand() && GEP->hasAllConstantIndices()) {
50 SmallVector<Value *, 8> Indices(GEP->op_begin() + 1, GEP->op_end());
51 uint64_t GEPOffset = M->getDataLayout().getIndexedOffsetInType(
52 GEP->getSourceElementType(), Indices);
53 findLoadCallsAtConstantOffset(M, DevirtCalls, User, Offset + GEPOffset);
54 }
55 }
56 }
57 }
58
59 void llvm::findDevirtualizableCalls(
60 SmallVectorImpl<DevirtCallSite> &DevirtCalls,
61 SmallVectorImpl<CallInst *> &Assumes, CallInst *CI) {
62 assert(CI->getCalledFunction()->getIntrinsicID() == Intrinsic::type_test);
63
64 Module *M = CI->getParent()->getParent()->getParent();
65
66 // Find llvm.assume intrinsics for this llvm.type.test call.
67 for (const Use &CIU : CI->uses()) {
68 auto AssumeCI = dyn_cast<CallInst>(CIU.getUser());
69 if (AssumeCI) {
70 Function *F = AssumeCI->getCalledFunction();
71 if (F && F->getIntrinsicID() == Intrinsic::assume)
72 Assumes.push_back(AssumeCI);
73 }
74 }
75
76 // If we found any, search for virtual calls based on %p and add them to
77 // DevirtCalls.
78 if (!Assumes.empty())
79 findLoadCallsAtConstantOffset(M, DevirtCalls,
80 CI->getArgOperand(0)->stripPointerCasts(), 0);
81 }
133133 assert(LoopID == MD_loop && "llvm.loop kind id drifted");
134134 (void)LoopID;
135135
136 unsigned TypeID = getMDKindID("type");
137 assert(TypeID == MD_type && "type kind id drifted");
138 (void)TypeID;
139
136140 auto *DeoptEntry = pImpl->getOrInsertBundleTag("deopt");
137141 assert(DeoptEntry->second == LLVMContext::OB_deopt &&
138142 "deopt operand bundle id drifted!");
13921392 return getMetadata(getContext().getMDKindID(Kind));
13931393 }
13941394
1395 void GlobalObject::copyMetadata(const GlobalObject *Other) {
1395 void GlobalObject::copyMetadata(const GlobalObject *Other, unsigned Offset) {
13961396 SmallVector, 8> MDs;
13971397 Other->getAllMetadata(MDs);
1398 for (auto &MD : MDs)
1398 for (auto &MD : MDs) {
1399 // We need to adjust the type metadata offset.
1400 if (Offset != 0 && MD.first == LLVMContext::MD_type) {
1401 auto *OffsetConst = cast<ConstantInt>(
1402 cast<ConstantAsMetadata>(MD.second->getOperand(0))->getValue());
1403 Metadata *TypeId = MD.second->getOperand(1);
1404 auto *NewOffsetMD = ConstantAsMetadata::get(ConstantInt::get(
1405 OffsetConst->getType(), OffsetConst->getValue() + Offset));
1406 addMetadata(LLVMContext::MD_type,
1407 *MDNode::get(getContext(), {NewOffsetMD, TypeId}));
1408 continue;
1409 }
13991410 addMetadata(MD.first, *MD.second);
1411 }
1412 }
1413
1414 void GlobalObject::addTypeMetadata(unsigned Offset, Metadata *TypeID) {
1415 addMetadata(
1416 LLVMContext::MD_type,
1417 *MDTuple::get(getContext(),
1418 {llvm::ConstantAsMetadata::get(llvm::ConstantInt::get(
1419 Type::getInt64Ty(getContext()), Offset)),
1420 TypeID}));
14001421 }
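A hypothetical illustration (not part of the patch) of the offset adjustment that ``copyMetadata`` now performs: if a pass such as ASan rebuilds a vtable global with 8 bytes of new data prepended and calls ``copyMetadata(OldGV, 8)``, the copied ``!type`` attachment keeps the address point correct relative to the new global:

::

    ; original global
    @vt = constant [...], !type !0
    !0 = !{i64 16, !"_ZTS1A"}

    ; rebuilt global with an 8-byte prefix; the copied offset becomes 16 + 8
    @vt.new = constant [...], !type !1
    !1 = !{i64 24, !"_ZTS1A"}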
14011422
14021423 void Function::setSubprogram(DISubprogram *SP) {
640640 if (auto *NewGO = dyn_cast(NewGV)) {
641641 // Metadata for global variables and function declarations is copied eagerly.
642642 if (isa(SGV) || SGV->isDeclaration())
643 NewGO->copyMetadata(cast(SGV));
643 NewGO->copyMetadata(cast(SGV), 0);
644644 }
645645
646646 // Remove these copied constants in case this stays a declaration, since
966966 Dst.setPersonalityFn(Src.getPersonalityFn());
967967
968968 // Copy over the metadata attachments without remapping.
969 Dst.copyMetadata(&Src);
969 Dst.copyMetadata(&Src, 0);
970970
971971 // Steal arguments and splice the body of Src into Dst.
972972 Dst.stealArgumentListFrom(Src);
1818 Inliner.cpp
1919 Internalize.cpp
2020 LoopExtractor.cpp
21 LowerBitSets.cpp
21 LowerTypeTests.cpp
2222 MergeFunctions.cpp
2323 PartialInlining.cpp
2424 PassManagerBuilder.cpp
3535
3636 #define DEBUG_TYPE "cross-dso-cfi"
3737
38 STATISTIC(TypeIds, "Number of unique type identifiers");
38 STATISTIC(NumTypeIds, "Number of unique type identifiers");
3939
4040 namespace {
4141
4848 Module *M;
4949 MDNode *VeryLikelyWeights;
5050
51 ConstantInt *extractBitSetTypeId(MDNode *MD);
51 ConstantInt *extractNumericTypeId(MDNode *MD);
5252 void buildCFICheck();
5353
5454 bool doInitialization(Module &M) override;
7272 return false;
7373 }
7474
75 /// extractBitSetTypeId - Extracts TypeId from a hash-based bitset MDNode.
76 ConstantInt *CrossDSOCFI::extractBitSetTypeId(MDNode *MD) {
75 /// Extracts a numeric type identifier from an MDNode containing type metadata.
76 ConstantInt *CrossDSOCFI::extractNumericTypeId(MDNode *MD) {
7777 // This check excludes vtables for classes inside anonymous namespaces.
78 auto TM = dyn_cast(MD->getOperand(0));
78 auto TM = dyn_cast<ConstantAsMetadata>(MD->getOperand(1));
7979 if (!TM)
8080 return nullptr;
81 auto C = dyn_cast_or_null<ConstantInt>(TM->getValue());
8383 // We are looking for i64 constants.
8484 if (C->getBitWidth() != 64) return nullptr;
8585
86 // Sanity check.
87 auto FM = dyn_cast_or_null<ConstantAsMetadata>(MD->getOperand(1));
88 // Can be null if a function was removed by an optimization.
89 if (FM) {
90 auto F = dyn_cast<Function>(FM->getValue());
91 // But can never be a function declaration.
92 assert(!F || !F->isDeclaration());
93 (void)F; // Suppress unused variable warning in the no-asserts build.
94 }
9586 return C;
9687 }
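For reference, a small sketch of the node shape this function consumes: a !type attachment whose operands are {offset, numeric type id}, both i64 constants. The hash value below is made up for illustration:

#include "llvm/IR/Constants.h"
#include "llvm/IR/Metadata.h"
using namespace llvm;

// Builds the equivalent of !{i64 0, i64 0x1234abcd}: operand 0 is the byte
// offset within the global, operand 1 is the cross-DSO numeric type id that
// extractNumericTypeId returns as a ConstantInt.
static MDNode *makeNumericTypeNode(LLVMContext &Ctx) {
  Type *I64 = Type::getInt64Ty(Ctx);
  Metadata *Ops[] = {
      ConstantAsMetadata::get(ConstantInt::get(I64, 0)),
      ConstantAsMetadata::get(ConstantInt::get(I64, 0x1234abcdULL))};
  return MDNode::get(Ctx, Ops);
}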
9788
9889 /// buildCFICheck - emits __cfi_check for the current module.
9990 void CrossDSOCFI::buildCFICheck() {
10091 // FIXME: verify that __cfi_check ends up near the end of the code section,
101 // but before the jump slots created in LowerBitSets.
102 llvm::DenseSet<uint64_t> BitSetIds;
103 NamedMDNode *BitSetNM = M->getNamedMetadata("llvm.bitsets");
92 // but before the jump slots created in LowerTypeTests.
93 llvm::DenseSet<uint64_t> TypeIds;
94 SmallVector<MDNode *, 2> Types;
95 for (GlobalObject &GO : M->global_objects()) {
96 Types.clear();
97 GO.getMetadata(LLVMContext::MD_type, Types);
98 for (MDNode *Type : Types) {
99 // Sanity check. GO must not be a function declaration.
100 auto F = dyn_cast<Function>(&GO);
101 assert(!F || !F->isDeclaration());
104102
105 if (BitSetNM)
106 for (unsigned I = 0, E = BitSetNM->getNumOperands(); I != E; ++I)
107 if (ConstantInt *TypeId = extractBitSetTypeId(BitSetNM->getOperand(I)))
108 BitSetIds.insert(TypeId->getZExtValue());
103 if (ConstantInt *TypeId = extractNumericTypeId(Type))
104 TypeIds.insert(TypeId->getZExtValue());
105 }
106 }
109107
110108 LLVMContext &Ctx = M->getContext();
111109 Constant *C = M->getOrInsertFunction(
137135 IRBExit.CreateRetVoid();
138136
139137 IRBuilder<> IRB(BB);
140 SwitchInst *SI = IRB.CreateSwitch(&CallSiteTypeId, TrapBB, BitSetIds.size());
141 for (uint64_t TypeId : BitSetIds) {
138 SwitchInst *SI = IRB.CreateSwitch(&CallSiteTypeId, TrapBB, TypeIds.size());
139 for (uint64_t TypeId : TypeIds) {
142140 ConstantInt *CaseTypeId = ConstantInt::get(Type::getInt64Ty(Ctx), TypeId);
143141 BasicBlock *TestBB = BasicBlock::Create(Ctx, "test", F);
144142 IRBuilder<> IRBTest(TestBB);
145 Function *BitsetTestFn =
146 Intrinsic::getDeclaration(M, Intrinsic::bitset_test);
143 Function *BitsetTestFn = Intrinsic::getDeclaration(M, Intrinsic::type_test);
147144
148145 Value *Test = IRBTest.CreateCall(
149146 BitsetTestFn, {&Addr, MetadataAsValue::get(
152149 BI->setMetadata(LLVMContext::MD_prof, VeryLikelyWeights);
153150
154151 SI->addCase(CaseTypeId, TestBB);
155 ++TypeIds;
152 ++NumTypeIds;
156153 }
157154 }
158155
3838 initializeLoopExtractorPass(Registry);
3939 initializeBlockExtractorPassPass(Registry);
4040 initializeSingleLoopExtractorPass(Registry);
41 initializeLowerBitSetsPass(Registry);
41 initializeLowerTypeTestsPass(Registry);
4242 initializeMergeFunctionsPass(Registry);
4343 initializePartialInlinerPass(Registry);
4444 initializePostOrderFunctionAttrsLegacyPassPass(Registry);
+0
-1060
lib/Transforms/IPO/LowerBitSets.cpp
None //===-- LowerBitSets.cpp - Bitset lowering pass ---------------------------===//
1 //
2 // The LLVM Compiler Infrastructure
3 //
4 // This file is distributed under the University of Illinois Open Source
5 // License. See LICENSE.TXT for details.
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // This pass lowers bitset metadata and calls to the llvm.bitset.test intrinsic.
10 // See http://llvm.org/docs/LangRef.html#bitsets for more information.
11 //
12 //===----------------------------------------------------------------------===//
13
14 #include "llvm/Transforms/IPO/LowerBitSets.h"
15 #include "llvm/Transforms/IPO.h"
16 #include "llvm/ADT/EquivalenceClasses.h"
17 #include "llvm/ADT/Statistic.h"
18 #include "llvm/ADT/Triple.h"
19 #include "llvm/IR/Constant.h"
20 #include "llvm/IR/Constants.h"
21 #include "llvm/IR/Function.h"
22 #include "llvm/IR/GlobalObject.h"
23 #include "llvm/IR/GlobalVariable.h"
24 #include "llvm/IR/IRBuilder.h"
25 #include "llvm/IR/Instructions.h"
26 #include "llvm/IR/Intrinsics.h"
27 #include "llvm/IR/Module.h"
28 #include "llvm/IR/Operator.h"
29 #include "llvm/Pass.h"
30 #include "llvm/Support/Debug.h"
31 #include "llvm/Support/raw_ostream.h"
32 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
33
34 using namespace llvm;
35 using namespace lowerbitsets;
36
37 #define DEBUG_TYPE "lowerbitsets"
38
39 STATISTIC(ByteArraySizeBits, "Byte array size in bits");
40 STATISTIC(ByteArraySizeBytes, "Byte array size in bytes");
41 STATISTIC(NumByteArraysCreated, "Number of byte arrays created");
42 STATISTIC(NumBitSetCallsLowered, "Number of bitset calls lowered");
43 STATISTIC(NumBitSetDisjointSets, "Number of disjoint sets of bitsets");
44
45 static cl::opt<bool> AvoidReuse(
46 "lowerbitsets-avoid-reuse",
47 cl::desc("Try to avoid reuse of byte array addresses using aliases"),
48 cl::Hidden, cl::init(true));
49
50 bool BitSetInfo::containsGlobalOffset(uint64_t Offset) const {
51 if (Offset < ByteOffset)
52 return false;
53
54 if ((Offset - ByteOffset) % (uint64_t(1) << AlignLog2) != 0)
55 return false;
56
57 uint64_t BitOffset = (Offset - ByteOffset) >> AlignLog2;
58 if (BitOffset >= BitSize)
59 return false;
60
61 return Bits.count(BitOffset);
62 }
63
64 bool BitSetInfo::containsValue(
65 const DataLayout &DL,
66 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout, Value *V,
67 uint64_t COffset) const {
68 if (auto GV = dyn_cast<GlobalObject>(V)) {
69 auto I = GlobalLayout.find(GV);
70 if (I == GlobalLayout.end())
71 return false;
72 return containsGlobalOffset(I->second + COffset);
73 }
74
75 if (auto GEP = dyn_cast<GEPOperator>(V)) {
76 APInt APOffset(DL.getPointerSizeInBits(0), 0);
77 bool Result = GEP->accumulateConstantOffset(DL, APOffset);
78 if (!Result)
79 return false;
80 COffset += APOffset.getZExtValue();
81 return containsValue(DL, GlobalLayout, GEP->getPointerOperand(),
82 COffset);
83 }
84
85 if (auto Op = dyn_cast<Operator>(V)) {
86 if (Op->getOpcode() == Instruction::BitCast)
87 return containsValue(DL, GlobalLayout, Op->getOperand(0), COffset);
88
89 if (Op->getOpcode() == Instruction::Select)
90 return containsValue(DL, GlobalLayout, Op->getOperand(1), COffset) &&
91 containsValue(DL, GlobalLayout, Op->getOperand(2), COffset);
92 }
93
94 return false;
95 }
96
97 void BitSetInfo::print(raw_ostream &OS) const {
98 OS << "offset " << ByteOffset << " size " << BitSize << " align "
99 << (1 << AlignLog2);
100
101 if (isAllOnes()) {
102 OS << " all-ones\n";
103 return;
104 }
105
106 OS << " { ";
107 for (uint64_t B : Bits)
108 OS << B << ' ';
109 OS << "}\n";
110 }
111
112 BitSetInfo BitSetBuilder::build() {
113 if (Min > Max)
114 Min = 0;
115
116 // Normalize each offset against the minimum observed offset, and compute
117 // the bitwise OR of each of the offsets. The number of trailing zeros
118 // in the mask gives us the log2 of the alignment of all offsets, which
119 // allows us to compress the bitset by only storing one bit per aligned
120 // address.
121 uint64_t Mask = 0;
122 for (uint64_t &Offset : Offsets) {
123 Offset -= Min;
124 Mask |= Offset;
125 }
126
127 BitSetInfo BSI;
128 BSI.ByteOffset = Min;
129
130 BSI.AlignLog2 = 0;
131 if (Mask != 0)
132 BSI.AlignLog2 = countTrailingZeros(Mask, ZB_Undefined);
133
134 // Build the compressed bitset while normalizing the offsets against the
135 // computed alignment.
136 BSI.BitSize = ((Max - Min) >> BSI.AlignLog2) + 1;
137 for (uint64_t Offset : Offsets) {
138 Offset >>= BSI.AlignLog2;
139 BSI.Bits.insert(Offset);
140 }
141
142 return BSI;
143 }
144
145 void GlobalLayoutBuilder::addFragment(const std::set<uint64_t> &F) {
146 // Create a new fragment to hold the layout for F.
147 Fragments.emplace_back();
148 std::vector<uint64_t> &Fragment = Fragments.back();
149 uint64_t FragmentIndex = Fragments.size() - 1;
150
151 for (auto ObjIndex : F) {
152 uint64_t OldFragmentIndex = FragmentMap[ObjIndex];
153 if (OldFragmentIndex == 0) {
154 // We haven't seen this object index before, so just add it to the current
155 // fragment.
156 Fragment.push_back(ObjIndex);
157 } else {
158 // This index belongs to an existing fragment. Copy the elements of the
159 // old fragment into this one and clear the old fragment. We don't update
160 // the fragment map just yet, this ensures that any further references to
161 // indices from the old fragment in this fragment do not insert any more
162 // indices.
163 std::vector<uint64_t> &OldFragment = Fragments[OldFragmentIndex];
164 Fragment.insert(Fragment.end(), OldFragment.begin(), OldFragment.end());
165 OldFragment.clear();
166 }
167 }
168
169 // Update the fragment map to point our object indices to this fragment.
170 for (uint64_t ObjIndex : Fragment)
171 FragmentMap[ObjIndex] = FragmentIndex;
172 }
173
174 void ByteArrayBuilder::allocate(const std::set<uint64_t> &Bits,
175 uint64_t BitSize, uint64_t &AllocByteOffset,
176 uint8_t &AllocMask) {
177 // Find the smallest current allocation.
178 unsigned Bit = 0;
179 for (unsigned I = 1; I != BitsPerByte; ++I)
180 if (BitAllocs[I] < BitAllocs[Bit])
181 Bit = I;
182
183 AllocByteOffset = BitAllocs[Bit];
184
185 // Add our size to it.
186 unsigned ReqSize = AllocByteOffset + BitSize;
187 BitAllocs[Bit] = ReqSize;
188 if (Bytes.size() < ReqSize)
189 Bytes.resize(ReqSize);
190
191 // Set our bits.
192 AllocMask = 1 << Bit;
193 for (uint64_t B : Bits)
194 Bytes[AllocByteOffset + B] |= AllocMask;
195 }
196
197 namespace {
198
199 struct ByteArrayInfo {
200 std::set Bits;
201 uint64_t BitSize;
202 GlobalVariable *ByteArray;
203 Constant *Mask;
204 };
205
206 struct LowerBitSets : public ModulePass {
207 static char ID;
208 LowerBitSets() : ModulePass(ID) {
209 initializeLowerBitSetsPass(*PassRegistry::getPassRegistry());
210 }
211
212 Module *M;
213
214 bool LinkerSubsectionsViaSymbols;
215 Triple::ArchType Arch;
216 Triple::ObjectFormatType ObjectFormat;
217 IntegerType *Int1Ty;
218 IntegerType *Int8Ty;
219 IntegerType *Int32Ty;
220 Type *Int32PtrTy;
221 IntegerType *Int64Ty;
222 IntegerType *IntPtrTy;
223
224 // The llvm.bitsets named metadata.
225 NamedMDNode *BitSetNM;
226
227 // Mapping from bitset identifiers to the call sites that test them.
228 DenseMap<Metadata *, std::vector<CallInst *>> BitSetTestCallSites;
229
230 std::vector<ByteArrayInfo> ByteArrayInfos;
231
232 BitSetInfo
233 buildBitSet(Metadata *BitSet,
234 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout);
235 ByteArrayInfo *createByteArray(BitSetInfo &BSI);
236 void allocateByteArrays();
237 Value *createBitSetTest(IRBuilder<> &B, BitSetInfo &BSI, ByteArrayInfo *&BAI,
238 Value *BitOffset);
239 void lowerBitSetCalls(ArrayRef<Metadata *> BitSets,
240 Constant *CombinedGlobalAddr,
241 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout);
242 Value *
243 lowerBitSetCall(CallInst *CI, BitSetInfo &BSI, ByteArrayInfo *&BAI,
244 Constant *CombinedGlobal,
245 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout);
246 void buildBitSetsFromGlobalVariables(ArrayRef<Metadata *> BitSets,
247 ArrayRef<GlobalVariable *> Globals);
248 unsigned getJumpTableEntrySize();
249 Type *getJumpTableEntryType();
250 Constant *createJumpTableEntry(GlobalObject *Src, Function *Dest,
251 unsigned Distance);
252 void verifyBitSetMDNode(MDNode *Op);
253 void buildBitSetsFromFunctions(ArrayRef<Metadata *> BitSets,
254 ArrayRef<Function *> Functions);
255 void buildBitSetsFromDisjointSet(ArrayRef<Metadata *> BitSets,
256 ArrayRef<GlobalObject *> Globals);
257 bool buildBitSets();
258 bool eraseBitSetMetadata();
259
260 bool doInitialization(Module &M) override;
261 bool runOnModule(Module &M) override;
262 };
263
264 } // anonymous namespace
265
266 INITIALIZE_PASS_BEGIN(LowerBitSets, "lowerbitsets",
267 "Lower bitset metadata", false, false)
268 INITIALIZE_PASS_END(LowerBitSets, "lowerbitsets",
269 "Lower bitset metadata", false, false)
270 char LowerBitSets::ID = 0;
271
272 ModulePass *llvm::createLowerBitSetsPass() { return new LowerBitSets; }
273
274 bool LowerBitSets::doInitialization(Module &Mod) {
275 M = &Mod;
276 const DataLayout &DL = Mod.getDataLayout();
277
278 Triple TargetTriple(M->getTargetTriple());
279 LinkerSubsectionsViaSymbols = TargetTriple.isMacOSX();
280 Arch = TargetTriple.getArch();
281 ObjectFormat = TargetTriple.getObjectFormat();
282
283 Int1Ty = Type::getInt1Ty(M->getContext());
284 Int8Ty = Type::getInt8Ty(M->getContext());
285 Int32Ty = Type::getInt32Ty(M->getContext());
286 Int32PtrTy = PointerType::getUnqual(Int32Ty);
287 Int64Ty = Type::getInt64Ty(M->getContext());
288 IntPtrTy = DL.getIntPtrType(M->getContext(), 0);
289
290 BitSetNM = M->getNamedMetadata("llvm.bitsets");
291
292 BitSetTestCallSites.clear();
293
294 return false;
295 }
296
297 /// Build a bit set for BitSet using the object layouts in
298 /// GlobalLayout.
299 BitSetInfo LowerBitSets::buildBitSet(
300 Metadata *BitSet,
301 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout) {
302 BitSetBuilder BSB;
303
304 // Compute the byte offset of each element of this bitset.
305 if (BitSetNM) {
306 for (MDNode *Op : BitSetNM->operands()) {
307 if (Op->getOperand(0) != BitSet || !Op->getOperand(1))
308 continue;
309 Constant *OpConst =
310 cast(Op->getOperand(1))->getValue();
311 if (auto GA = dyn_cast(OpConst))
312 OpConst = GA->getAliasee();
313 auto OpGlobal = dyn_cast(OpConst);
314 if (!OpGlobal)
315 continue;
316 uint64_t Offset =
317 cast(cast(Op->getOperand(2))
318 ->getValue())->getZExtValue();
319
320 Offset += GlobalLayout.find(OpGlobal)->second;
321
322 BSB.addOffset(Offset);
323 }
324 }
325
326 return BSB.build();
327 }
328
329 /// Build a test that bit BitOffset mod sizeof(Bits)*8 is set in
330 /// Bits. This pattern matches to the bt instruction on x86.
331 static Value *createMaskedBitTest(IRBuilder<> &B, Value *Bits,
332 Value *BitOffset) {
333 auto BitsType = cast<IntegerType>(Bits->getType());
334 unsigned BitWidth = BitsType->getBitWidth();
335
336 BitOffset = B.CreateZExtOrTrunc(BitOffset, BitsType);
337 Value *BitIndex =
338 B.CreateAnd(BitOffset, ConstantInt::get(BitsType, BitWidth - 1));
339 Value *BitMask = B.CreateShl(ConstantInt::get(BitsType, 1), BitIndex);
340 Value *MaskedBits = B.CreateAnd(Bits, BitMask);
341 return B.CreateICmpNE(MaskedBits, ConstantInt::get(BitsType, 0));
342 }
343
344 ByteArrayInfo *LowerBitSets::createByteArray(BitSetInfo &BSI) {
345 // Create globals to stand in for byte arrays and masks. These never actually
346 // get initialized, we RAUW and erase them later in allocateByteArrays() once
347 // we know the offset and mask to use.
348 auto ByteArrayGlobal = new GlobalVariable(
349 *M, Int8Ty, /*isConstant=*/true, GlobalValue::PrivateLinkage, nullptr);
350 auto MaskGlobal = new GlobalVariable(
351 *M, Int8Ty, /*isConstant=*/true, GlobalValue::PrivateLinkage, nullptr);
352
353 ByteArrayInfos.emplace_back();
354 ByteArrayInfo *BAI = &ByteArrayInfos.back();
355
356 BAI->Bits = BSI.Bits;
357 BAI->BitSize = BSI.BitSize;
358 BAI->ByteArray = ByteArrayGlobal;
359 BAI->Mask = ConstantExpr::getPtrToInt(MaskGlobal, Int8Ty);
360 return BAI;
361 }
362
363 void LowerBitSets::allocateByteArrays() {
364 std::stable_sort(ByteArrayInfos.begin(), ByteArrayInfos.end(),
365 [](const ByteArrayInfo &BAI1, const ByteArrayInfo &BAI2) {
366 return BAI1.BitSize > BAI2.BitSize;
367 });
368
369 std::vector<uint64_t> ByteArrayOffsets(ByteArrayInfos.size());
370
371 ByteArrayBuilder BAB;
372 for (unsigned I = 0; I != ByteArrayInfos.size(); ++I) {
373 ByteArrayInfo *BAI = &ByteArrayInfos[I];
374
375 uint8_t Mask;
376 BAB.allocate(BAI->Bits, BAI->BitSize, ByteArrayOffsets[I], Mask);
377
378 BAI->Mask->replaceAllUsesWith(ConstantInt::get(Int8Ty, Mask));
379 cast<GlobalVariable>(BAI->Mask->getOperand(0))->eraseFromParent();
380 }
381
382 Constant *ByteArrayConst = ConstantDataArray::get(M->getContext(), BAB.Bytes);
383 auto ByteArray =
384 new GlobalVariable(*M, ByteArrayConst->getType(), /*isConstant=*/true,
385 GlobalValue::PrivateLinkage, ByteArrayConst);
386
387 for (unsigned I = 0; I != ByteArrayInfos.size(); ++I) {
388 ByteArrayInfo *BAI = &ByteArrayInfos[I];
389
390 Constant *Idxs[] = {ConstantInt::get(IntPtrTy, 0),
391 ConstantInt::get(IntPtrTy, ByteArrayOffsets[I])};
392 Constant *GEP = ConstantExpr::getInBoundsGetElementPtr(
393 ByteArrayConst->getType(), ByteArray, Idxs);
394
395 // Create an alias instead of RAUW'ing the gep directly. On x86 this ensures
396 // that the pc-relative displacement is folded into the lea instead of the
397 // test instruction getting another displacement.
398 if (LinkerSubsectionsViaSymbols) {
399 BAI->ByteArray->replaceAllUsesWith(GEP);
400 } else {
401 GlobalAlias *Alias = GlobalAlias::create(
402 Int8Ty, 0, GlobalValue::PrivateLinkage, "bits", GEP, M);
403 BAI->ByteArray->replaceAllUsesWith(Alias);
404 }
405 BAI->ByteArray->eraseFromParent();
406 }
407
408 ByteArraySizeBits = BAB.BitAllocs[0] + BAB.BitAllocs[1] + BAB.BitAllocs[2] +
409 BAB.BitAllocs[3] + BAB.BitAllocs[4] + BAB.BitAllocs[5] +
410 BAB.BitAllocs[6] + BAB.BitAllocs[7];
411 ByteArraySizeBytes = BAB.Bytes.size();
412 }
413
414 /// Build a test that bit BitOffset is set in BSI, where
415 /// BitSetGlobal is a global containing the bits in BSI.
416 Value *LowerBitSets::createBitSetTest(IRBuilder<> &B, BitSetInfo &BSI,
417 ByteArrayInfo *&BAI, Value *BitOffset) {
418 if (BSI.BitSize <= 64) {
419 // If the bit set is sufficiently small, we can avoid a load by bit testing
420 // a constant.
421 IntegerType *BitsTy;
422 if (BSI.BitSize <= 32)
423 BitsTy = Int32Ty;
424 else
425 BitsTy = Int64Ty;
426
427 uint64_t Bits = 0;
428 for (auto Bit : BSI.Bits)
429 Bits |= uint64_t(1) << Bit;
430 Constant *BitsConst = ConstantInt::get(BitsTy, Bits);
431 return createMaskedBitTest(B, BitsConst, BitOffset);
432 } else {
433 if (!BAI) {
434 ++NumByteArraysCreated;
435 BAI = createByteArray(BSI);
436 }
437
438 Constant *ByteArray = BAI->ByteArray;
439 Type *Ty = BAI->ByteArray->getValueType();
440 if (!LinkerSubsectionsViaSymbols && AvoidReuse) {
441 // Each use of the byte array uses a different alias. This makes the
442 // backend less likely to reuse previously computed byte array addresses,
443 // improving the security of the CFI mechanism based on this pass.
444 ByteArray = GlobalAlias::create(BAI->ByteArray->getValueType(), 0,
445 GlobalValue::PrivateLinkage, "bits_use",
446 ByteArray, M);
447 }
448
449 Value *ByteAddr = B.CreateGEP(Ty, ByteArray, BitOffset);
450 Value *Byte = B.CreateLoad(ByteAddr);
451
452 Value *ByteAndMask = B.CreateAnd(Byte, BAI->Mask);
453 return B.CreateICmpNE(ByteAndMask, ConstantInt::get(Int8Ty, 0));
454 }
455 }
456
457 /// Lower a llvm.bitset.test call to its implementation. Returns the value to
458 /// replace the call with.
459 Value *LowerBitSets::lowerBitSetCall(
460 CallInst *CI, BitSetInfo &BSI, ByteArrayInfo *&BAI,
461 Constant *CombinedGlobalIntAddr,
462 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout) {
463 Value *Ptr = CI->getArgOperand(0);
464 const DataLayout &DL = M->getDataLayout();
465
466 if (BSI.containsValue(DL, GlobalLayout, Ptr))
467 return ConstantInt::getTrue(M->getContext());
468
469 Constant *OffsetedGlobalAsInt = ConstantExpr::getAdd(
470 CombinedGlobalIntAddr, ConstantInt::get(IntPtrTy, BSI.ByteOffset));
471
472 BasicBlock *InitialBB = CI->getParent();
473
474 IRBuilder<> B(CI);
475
476 Value *PtrAsInt = B.CreatePtrToInt(Ptr, IntPtrTy);
477
478 if (BSI.isSingleOffset())
479 return B.CreateICmpEQ(PtrAsInt, OffsetedGlobalAsInt);
480
481 Value *PtrOffset = B.CreateSub(PtrAsInt, OffsetedGlobalAsInt);
482
483 Value *BitOffset;
484 if (BSI.AlignLog2 == 0) {
485 BitOffset = PtrOffset;
486 } else {
487 // We need to check that the offset both falls within our range and is
488 // suitably aligned. We can check both properties at the same time by
489 // performing a right rotate by log2(alignment) followed by an integer
490 // comparison against the bitset size. The rotate will move the lower
491 // order bits that need to be zero into the higher order bits of the
492 // result, causing the comparison to fail if they are nonzero. The rotate
493 // also conveniently gives us a bit offset to use during the load from
494 // the bitset.
495 Value *OffsetSHR =
496 B.CreateLShr(PtrOffset, ConstantInt::get(IntPtrTy, BSI.AlignLog2));
497 Value *OffsetSHL = B.CreateShl(
498 PtrOffset,
499 ConstantInt::get(IntPtrTy, DL.getPointerSizeInBits(0) - BSI.AlignLog2));
500 BitOffset = B.CreateOr(OffsetSHR, OffsetSHL);
501 }
502
503 Constant *BitSizeConst = ConstantInt::get(IntPtrTy, BSI.BitSize);
504 Value *OffsetInRange = B.CreateICmpULT(BitOffset, BitSizeConst);
505
506 // If the bit set is all ones, testing against it is unnecessary.
507 if (BSI.isAllOnes())
508 return OffsetInRange;
509
510 TerminatorInst *Term = SplitBlockAndInsertIfThen(OffsetInRange, CI, false);
511 IRBuilder<> ThenB(Term);
512
513 // Now that we know that the offset is in range and aligned, load the
514 // appropriate bit from the bitset.
515 Value *Bit = createBitSetTest(ThenB, BSI, BAI, BitOffset);
516
517 // The value we want is 0 if we came directly from the initial block
518 // (having failed the range or alignment checks), or the loaded bit if
519 // we came from the block in which we loaded it.
520 B.SetInsertPoint(CI);
521 PHINode *P = B.CreatePHI(Int1Ty, 2);
522 P->addIncoming(ConstantInt::get(Int1Ty, 0), InitialBB);
523 P->addIncoming(Bit, ThenB.GetInsertBlock());
524 return P;
525 }
526
527 /// Given a disjoint set of bitsets and globals, layout the globals, build the
528 /// bit sets and lower the llvm.bitset.test calls.
529 void LowerBitSets::buildBitSetsFromGlobalVariables(
530 ArrayRef<Metadata *> BitSets, ArrayRef<GlobalVariable *> Globals) {
531 // Build a new global with the combined contents of the referenced globals.
532 // This global is a struct whose even-indexed elements contain the original
533 // contents of the referenced globals and whose odd-indexed elements contain
534 // any padding required to align the next element to the next power of 2.
535 std::vector<Constant *> GlobalInits;
536 const DataLayout &DL = M->getDataLayout();
537 for (GlobalVariable *G : Globals) {
538 GlobalInits.push_back(G->getInitializer());
539 uint64_t InitSize = DL.getTypeAllocSize(G->getValueType());
540
541 // Compute the amount of padding required.
542 uint64_t Padding = NextPowerOf2(InitSize - 1) - InitSize;
543
544 // Cap at 128 was found experimentally to have a good data/instruction
545 // overhead tradeoff.
546 if (Padding > 128)
547 Padding = alignTo(InitSize, 128) - InitSize;
548
549 GlobalInits.push_back(
550 ConstantAggregateZero::get(ArrayType::get(Int8Ty, Padding)));
551 }
552 if (!GlobalInits.empty())
553 GlobalInits.pop_back();
554 Constant *NewInit = ConstantStruct::getAnon(M->getContext(), GlobalInits);
555 auto *CombinedGlobal =
556 new GlobalVariable(*M, NewInit->getType(), /*isConstant=*/true,
557 GlobalValue::PrivateLinkage, NewInit);
558
559 StructType *NewTy = cast<StructType>(NewInit->getType());
560 const StructLayout *CombinedGlobalLayout = DL.getStructLayout(NewTy);
561
562 // Compute the offsets of the original globals within the new global.
563 DenseMap<GlobalObject *, uint64_t> GlobalLayout;
564 for (unsigned I = 0; I != Globals.size(); ++I)
565 // Multiply by 2 to account for padding elements.
566 GlobalLayout[Globals[I]] = CombinedGlobalLayout->getElementOffset(I * 2);
567
568 lowerBitSetCalls(BitSets, CombinedGlobal, GlobalLayout);
569
570 // Build aliases pointing to offsets into the combined global for each
571 // global from which we built the combined global, and replace references
572 // to the original globals with references to the aliases.
573 for (unsigned I = 0; I != Globals.size(); ++I) {
574 // Multiply by 2 to account for padding elements.
575 Constant *CombinedGlobalIdxs[] = {ConstantInt::get(Int32Ty, 0),
576 ConstantInt::get(Int32Ty, I * 2)};
577 Constant *CombinedGlobalElemPtr = ConstantExpr::getGetElementPtr(
578 NewInit->getType(), CombinedGlobal, CombinedGlobalIdxs);
579 if (LinkerSubsectionsViaSymbols) {
580 Globals[I]->replaceAllUsesWith(CombinedGlobalElemPtr);
581 } else {
582 assert(Globals[I]->getType()->getAddressSpace() == 0);
583 GlobalAlias *GAlias = GlobalAlias::create(NewTy->getElementType(I * 2), 0,
584 Globals[I]->getLinkage(), "",
585 CombinedGlobalElemPtr, M);
586 GAlias->setVisibility(Globals[I]->getVisibility());
587 GAlias->takeName(Globals[I]);
588 Globals[I]->replaceAllUsesWith(GAlias);
589 }
590 Globals[I]->eraseFromParent();
591 }
592 }
593
594 void LowerBitSets::lowerBitSetCalls(
595 ArrayRef<Metadata *> BitSets, Constant *CombinedGlobalAddr,
596 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout) {
597 Constant *CombinedGlobalIntAddr =
598 ConstantExpr::getPtrToInt(CombinedGlobalAddr, IntPtrTy);
599
600 // For each bitset in this disjoint set...
601 for (Metadata *BS : BitSets) {
602 // Build the bitset.
603 BitSetInfo BSI = buildBitSet(BS, GlobalLayout);
604 DEBUG({
605 if (auto BSS = dyn_cast<MDString>(BS))
606 dbgs() << BSS->getString() << ": ";
607 else
608 dbgs() << ": ";
609 BSI.print(dbgs());
610 });
611
612 ByteArrayInfo *BAI = nullptr;
613
614 // Lower each call to llvm.bitset.test for this bitset.
615 for (CallInst *CI : BitSetTestCallSites[BS]) {
616 ++NumBitSetCallsLowered;
617 Value *Lowered =
618 lowerBitSetCall(CI, BSI, BAI, CombinedGlobalIntAddr, GlobalLayout);
619 CI->replaceAllUsesWith(Lowered);
620 CI->eraseFromParent();
621 }
622 }
623 }
624
625 void LowerBitSets::verifyBitSetMDNode(MDNode *Op) {
626 if (Op->getNumOperands() != 3)
627 report_fatal_error(
628 "All operands of llvm.bitsets metadata must have 3 elements");
629 if (!Op->getOperand(1))
630 return;
631
632 auto OpConstMD = dyn_cast(Op->getOperand(1));
633 if (!OpConstMD)
634 report_fatal_error("Bit set element must be a constant");
635 auto OpGlobal = dyn_cast(OpConstMD->getValue());
636 if (!OpGlobal)
637 return;
638
639 if (OpGlobal->isThreadLocal())
640 report_fatal_error("Bit set element may not be thread-local");
641 if (isa(OpGlobal) && OpGlobal->hasSection())
642 report_fatal_error(
643 "Bit set global var element may not have an explicit section");
644
645 if (isa(OpGlobal) && OpGlobal->isDeclarationForLinker())
646 report_fatal_error("Bit set global var element must be a definition");
647
648 auto OffsetConstMD = dyn_cast(Op->getOperand(2));
649 if (!OffsetConstMD)
650 report_fatal_error("Bit set element offset must be a constant");
651 auto OffsetInt = dyn_cast(OffsetConstMD->getValue());
652 if (!OffsetInt)
653 report_fatal_error("Bit set element offset must be an integer constant");
654 }
655
656 static const unsigned kX86JumpTableEntrySize = 8;
657
658 unsigned LowerBitSets::getJumpTableEntrySize() {
659 if (Arch != Triple::x86 && Arch != Triple::x86_64)
660 report_fatal_error("Unsupported architecture for jump tables");
661
662 return kX86JumpTableEntrySize;
663 }
664
665 // Create a constant representing a jump table entry for the target. This
666 // consists of an instruction sequence containing a relative branch to Dest. The
667 // constant will be laid out at address Src+(Len*Distance) where Len is the
668 // target-specific jump table entry size.
669 Constant *LowerBitSets::createJumpTableEntry(GlobalObject *Src, Function *Dest,
670 unsigned Distance) {
671 if (Arch != Triple::x86 && Arch != Triple::x86_64)
672 report_fatal_error("Unsupported architecture for jump tables");
673
674 const unsigned kJmpPCRel32Code = 0xe9;
675 const unsigned kInt3Code = 0xcc;
676
677 ConstantInt *Jmp = ConstantInt::get(Int8Ty, kJmpPCRel32Code);
678
679 // Build a constant representing the displacement between the constant's
680 // address and Dest. This will resolve to a PC32 relocation referring to Dest.
681 Constant *DestInt = ConstantExpr::getPtrToInt(Dest, IntPtrTy);
682 Constant *SrcInt = ConstantExpr::getPtrToInt(Src, IntPtrTy);
683 Constant *Disp = ConstantExpr::getSub(DestInt, SrcInt);
684 ConstantInt *DispOffset =
685 ConstantInt::get(IntPtrTy, Distance * kX86JumpTableEntrySize + 5);
686 Constant *OffsetedDisp = ConstantExpr::getSub(Disp, DispOffset);
687 OffsetedDisp = ConstantExpr::getTruncOrBitCast(OffsetedDisp, Int32Ty);
688
689 ConstantInt *Int3 = ConstantInt::get(Int8Ty, kInt3Code);
690
691 Constant *Fields[] = {
692 Jmp, OffsetedDisp, Int3, Int3, Int3,
693 };
694 return ConstantStruct::getAnon(Fields, /*Packed=*/true);
695 }
696
697 Type *LowerBitSets::getJumpTableEntryType() {
698 if (Arch != Triple::x86 && Arch != Triple::x86_64)
699 report_fatal_error("Unsupported architecture for jump tables");
700
701 return StructType::get(M->getContext(),
702 {Int8Ty, Int32Ty, Int8Ty, Int8Ty, Int8Ty},
703 /*Packed=*/true);
704 }
705
706 /// Given a disjoint set of bitsets and functions, build a jump table for the
707 /// functions, build the bit sets and lower the llvm.bitset.test calls.
708 void LowerBitSets::buildBitSetsFromFunctions(ArrayRef<Metadata *> BitSets,
709 ArrayRef<Function *> Functions) {
710 // Unlike the global bitset builder, the function bitset builder cannot
711 // re-arrange functions in a particular order and base its calculations on the
712 // layout of the functions' entry points, as we have no idea how large a
713 // particular function will end up being (the size could even depend on what
714 // this pass does!) Instead, we build a jump table, which is a block of code
715 // consisting of one branch instruction for each of the functions in the bit
716 // set that branches to the target function, and redirect any taken function
717 // addresses to the corresponding jump table entry. In the object file's
718 // symbol table, the symbols for the target functions also refer to the jump
719 // table entries, so that addresses taken outside the module will pass any
720 // verification done inside the module.
721 //
722 // In more concrete terms, suppose we have three functions f, g, h which are
723 // members of a single bitset, and a function foo that returns their
724 // addresses:
725 //
726 // f:
727 // mov 0, %eax
728 // ret
729 //
730 // g:
731 // mov 1, %eax
732 // ret
733 //
734 // h:
735 // mov 2, %eax
736 // ret
737 //
738 // foo:
739 // mov f, %eax
740 // mov g, %edx
741 // mov h, %ecx
742 // ret
743 //
744 // To create a jump table for these functions, we instruct the LLVM code
745 // generator to output a jump table in the .text section. This is done by
746 // representing the instructions in the jump table as an LLVM constant and
747 // placing them in a global variable in the .text section. The end result will
748 // (conceptually) look like this:
749 //
750 // f:
751 // jmp .Ltmp0 ; 5 bytes
752 // int3 ; 1 byte
753 // int3 ; 1 byte
754 // int3 ; 1 byte
755 //
756 // g:
757 // jmp .Ltmp1 ; 5 bytes
758 // int3 ; 1 byte
759 // int3 ; 1 byte
760 // int3 ; 1 byte
761 //
762 // h:
763 // jmp .Ltmp2 ; 5 bytes
764 // int3 ; 1 byte
765 // int3 ; 1 byte
766 // int3 ; 1 byte
767 //
768 // .Ltmp0:
769 // mov 0, %eax
770 // ret
771 //
772 // .Ltmp1:
773 // mov 1, %eax
774 // ret
775 //
776 // .Ltmp2:
777 // mov 2, %eax
778 // ret
779 //
780 // foo:
781 // mov f, %eax
782 // mov g, %edx
783 // mov h, %ecx
784 // ret
785 //
786 // Because the addresses of f, g, h are evenly spaced at a power of 2, in the
787 // normal case the check can be carried out using the same kind of simple
788 // arithmetic that we normally use for globals.
789
790 assert(!Functions.empty());
791
792 // Build a simple layout based on the regular layout of jump tables.
793 DenseMap<GlobalObject *, uint64_t> GlobalLayout;
794 unsigned EntrySize = getJumpTableEntrySize();
795 for (unsigned I = 0; I != Functions.size(); ++I)
796 GlobalLayout[Functions[I]] = I * EntrySize;
797
798 // Create a constant to hold the jump table.
799 ArrayType *JumpTableType =
800 ArrayType::get(getJumpTableEntryType(), Functions.size());
801 auto JumpTable = new GlobalVariable(*M, JumpTableType,
802 /*isConstant=*/true,
803 GlobalValue::PrivateLinkage, nullptr);
804 JumpTable->setSection(ObjectFormat == Triple::MachO
805 ? "__TEXT,__text,regular,pure_instructions"
806 : ".text");
807 lowerBitSetCalls(BitSets, JumpTable, GlobalLayout);
808
809 // Build aliases pointing to offsets into the jump table, and replace
810 // references to the original functions with references to the aliases.
811 for (unsigned I = 0; I != Functions.size(); ++I) {
812 Constant *CombinedGlobalElemPtr = ConstantExpr::getBitCast(
813 ConstantExpr::getGetElementPtr(
814 JumpTableType, JumpTable,
815 ArrayRef{ConstantInt::get(IntPtrTy, 0),
816 ConstantInt::get(IntPtrTy, I)}),
817 Functions[I]->getType());
818 if (LinkerSubsectionsViaSymbols || Functions[I]->isDeclarationForLinker()) {
819 Functions[I]->replaceAllUsesWith(CombinedGlobalElemPtr);
820 } else {
821 assert(Functions[I]->getType()->getAddressSpace() == 0);
822 GlobalAlias *GAlias = GlobalAlias::create(Functions[I]->getValueType(), 0,
823 Functions[I]->getLinkage(), "",
824 CombinedGlobalElemPtr, M);
825 GAlias->setVisibility(Functions[I]->getVisibility());
826 GAlias->takeName(Functions[I]);
827 Functions[I]->replaceAllUsesWith(GAlias);
828 }
829 if (!Functions[I]->isDeclarationForLinker())
830 Functions[I]->setLinkage(GlobalValue::PrivateLinkage);
831 }
832
833 // Build and set the jump table's initializer.
834 std::vector JumpTableEntries;
835 for (unsigned I = 0; I != Functions.size(); ++I)
836 JumpTableEntries.push_back(
837 createJumpTableEntry(JumpTable, Functions[I], I));
838 JumpTable->setInitializer(
839 ConstantArray::get(JumpTableType, JumpTableEntries));
840 }
841
842 void LowerBitSets::buildBitSetsFromDisjointSet(
843 ArrayRef BitSets, ArrayRef Globals) {
844 llvm::DenseMap BitSetIndices;
845 llvm::DenseMap GlobalIndices;
846 for (unsigned I = 0; I != BitSets.size(); ++I)
847 BitSetIndices[BitSets[I]] = I;
848 for (unsigned I = 0; I != Globals.size(); ++I)
849 GlobalIndices[Globals[I]] = I;
850
851 // For each bitset, build a set of indices that refer to globals referenced by
852 // the bitset.
853 std::vector> BitSetMembers(BitSets.size());
854 if (BitSetNM) {
855 for (MDNode *Op : BitSetNM->operands()) {
856 // Op = { bitset name, global, offset }
857 if (!Op->getOperand(1))
858 continue;
859 auto I = BitSetIndices.find(Op->getOperand(0));
860 if (I == BitSetIndices.end())
861 continue;
862
863 auto OpGlobal = dyn_cast(
864 cast(Op->getOperand(1))->getValue());
865 if (!OpGlobal)
866 continue;
867 BitSetMembers[I->second].insert(GlobalIndices[OpGlobal]);
868 }
869 }
870
871 // Order the sets of indices by size. The GlobalLayoutBuilder works best
872 // when given small index sets first.
873 std::stable_sort(
874 BitSetMembers.begin(), BitSetMembers.end(),
875 [](const std::set &O1, const std::set &O2) {
876 return O1.size() < O2.size();
877 });
878
879 // Create a GlobalLayoutBuilder and provide it with index sets as layout
880 // fragments. The GlobalLayoutBuilder tries to lay out members of fragments as
881 // close together as possible.
882 GlobalLayoutBuilder GLB(Globals.size());
883 for (auto &&MemSet : BitSetMembers)
884 GLB.addFragment(MemSet);
885
886 // Build the bitsets from this disjoint set.
887 if (Globals.empty() || isa(Globals[0])) {
888 // Build a vector of global variables with the computed layout.
889 std::vector OrderedGVs(Globals.size());
890 auto OGI = OrderedGVs.begin();
891 for (auto &&F : GLB.Fragments) {
892 for (auto &&Offset : F) {
893 auto GV = dyn_cast(Globals[Offset]);
894 if (!GV)
895 report_fatal_error(
896 "Bit set may not contain both global variables and functions");
897 *OGI++ = GV;
898 }
899 }
900
901 buildBitSetsFromGlobalVariables(BitSets, OrderedGVs);
902 } else {
903 // Build a vector of functions with the computed layout.
904 std::vector OrderedFns(Globals.size());
905 auto OFI = OrderedFns.begin();
906 for (auto &&F : GLB.Fragments) {
907 for (auto &&Offset : F) {
908 auto Fn = dyn_cast(Globals[Offset]);
909 if (!Fn)
910 report_fatal_error(
911 "Bit set may not contain both global variables and functions");
912 *OFI++ = Fn;
913 }
914 }
915
916 buildBitSetsFromFunctions(BitSets, OrderedFns);
917 }
918 }
919
920 /// Lower all bit sets in this module.
921 bool LowerBitSets::buildBitSets() {
922 Function *BitSetTestFunc =
923 M->getFunction(Intrinsic::getName(Intrinsic::bitset_test));
924 if (!BitSetTestFunc || BitSetTestFunc->use_empty())
925 return false;
926
927 // Equivalence class set containing bitsets and the globals they reference.
928 // This is used to partition the set of bitsets in the module into disjoint
929 // sets.
930 typedef EquivalenceClasses>
931 GlobalClassesTy;
932 GlobalClassesTy GlobalClasses;
933
934 // Verify the bitset metadata and build a mapping from bitset identifiers to
935 // their last observed index in BitSetNM. This will used later to
936 // deterministically order the list of bitset identifiers.
937 llvm::DenseMap BitSetIdIndices;
938 if (BitSetNM) {
939 for (unsigned I = 0, E = BitSetNM->getNumOperands(); I != E; ++I) {
940 MDNode *Op = BitSetNM->getOperand(I);
941 verifyBitSetMDNode(Op);
942 BitSetIdIndices[Op->getOperand(0)] = I;
943 }
944 }
945
946 for (const Use &U : BitSetTestFunc->uses()) {
947 auto CI = cast(U.getUser());
948
949 auto BitSetMDVal = dyn_cast(CI->getArgOperand(1));
950 if (!BitSetMDVal)
951 report_fatal_error(
952 "Second argument of llvm.bitset.test must be metadata");
953 auto BitSet = BitSetMDVal->getMetadata();
954
955 // Add the call site to the list of call sites for this bit set. We also use
956 // BitSetTestCallSites to keep track of whether we have seen this bit set
957 // before. If we have, we don't need to re-add the referenced globals to the
958 // equivalence class.
959 std::pair>::iterator,
960 bool> Ins =
961 BitSetTestCallSites.insert(
962 std::make_pair(BitSet, std::vector()));
963 Ins.first->second.push_back(CI);
964 if (!Ins.second)
965 continue;
966
967 // Add the bitset to the equivalence class.
968 GlobalClassesTy::iterator GCI = GlobalClasses.insert(BitSet);
969 GlobalClassesTy::member_iterator CurSet = GlobalClasses.findLeader(GCI);
970
971 if (!BitSetNM)
972 continue;
973
974 // Add the referenced globals to the bitset's equivalence class.
975 for (MDNode *Op : BitSetNM->operands()) {
976 if (Op->getOperand(0) != BitSet || !Op->getOperand(1))
977 continue;
978
979 auto OpGlobal = dyn_cast(
980 cast(Op->getOperand(1))->getValue());
981 if (!OpGlobal)
982 continue;
983
984 CurSet = GlobalClasses.unionSets(
985 CurSet, GlobalClasses.findLeader(GlobalClasses.insert(OpGlobal)));
986 }
987 }
988
989 if (GlobalClasses.empty())
990 return false;
991
992 // Build a list of disjoint sets ordered by their maximum BitSetNM index
993 // for determinism.
994 std::vector> Sets;
995 for (GlobalClassesTy::iterator I = GlobalClasses.begin(),
996 E = GlobalClasses.end();
997 I != E; ++I) {
998 if (!I->isLeader()) continue;
999 ++NumBitSetDisjointSets;
1000
1001 unsigned MaxIndex = 0;
1002 for (GlobalClassesTy::member_iterator MI = GlobalClasses.member_begin(I);
1003 MI != GlobalClasses.member_end(); ++MI) {
1004 if ((*MI).is())
1005 MaxIndex = std::max(MaxIndex, BitSetIdIndices[MI->get()]);
1006 }
1007 Sets.emplace_back(I, MaxIndex);
1008 }
1009 std::sort(Sets.begin(), Sets.end(),
1010 [](const std::pair &S1,
1011 const std::pair &S2) {
1012 return S1.second < S2.second;
1013 });
1014
1015 // For each disjoint set we found...
1016 for (const auto &S : Sets) {
1017 // Build the list of bitsets in this disjoint set.
1018 std::vector BitSets;
1019 std::vector Globals;
1020 for (GlobalClassesTy::member_iterator MI =
1021 GlobalClasses.member_begin(S.first);
1022 MI != GlobalClasses.member_end(); ++MI) {
1023 if ((*MI).is())
1024 BitSets.push_back(MI->get());
1025 else
1026 Globals.push_back(MI->get());
1027 }
1028
1029 // Order bitsets by BitSetNM index for determinism. This ordering is stable
1030 // as there is a one-to-one mapping between metadata and indices.
1031 std::sort(BitSets.begin(), BitSets.end(), [&](Metadata *M1, Metadata *M2) {
1032 return BitSetIdIndices[M1] < BitSetIdIndices[M2];
1033 });
1034
1035 // Lower the bitsets in this disjoint set.
1036 buildBitSetsFromDisjointSet(BitSets, Globals);
1037 }
1038
1039 allocateByteArrays();
1040
1041 return true;
1042 }
1043
1044 bool LowerBitSets::eraseBitSetMetadata() {
1045 if (!BitSetNM)
1046 return false;
1047
1048 M->eraseNamedMetadata(BitSetNM);
1049 return true;
1050 }
1051
1052 bool LowerBitSets::runOnModule(Module &M) {
1053 if (skipModule(M))
1054 return false;
1055
1056 bool Changed = buildBitSets();
1057 Changed |= eraseBitSetMetadata();
1058 return Changed;
1059 }
0 //===-- LowerTypeTests.cpp - type metadata lowering pass ------------------===//
1 //
2 // The LLVM Compiler Infrastructure
3 //
4 // This file is distributed under the University of Illinois Open Source
5 // License. See LICENSE.TXT for details.
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // This pass lowers type metadata and calls to the llvm.type.test intrinsic.
10 // See http://llvm.org/docs/TypeMetadata.html for more information.
11 //
12 //===----------------------------------------------------------------------===//
13
14 #include "llvm/Transforms/IPO/LowerTypeTests.h"
15 #include "llvm/Transforms/IPO.h"
16 #include "llvm/ADT/EquivalenceClasses.h"
17 #include "llvm/ADT/Statistic.h"
18 #include "llvm/ADT/Triple.h"
19 #include "llvm/IR/Constant.h"
20 #include "llvm/IR/Constants.h"
21 #include "llvm/IR/Function.h"
22 #include "llvm/IR/GlobalObject.h"
23 #include "llvm/IR/GlobalVariable.h"
24 #include "llvm/IR/IRBuilder.h"
25 #include "llvm/IR/Instructions.h"
26 #include "llvm/IR/Intrinsics.h"
27 #include "llvm/IR/Module.h"
28 #include "llvm/IR/Operator.h"
29 #include "llvm/Pass.h"
30 #include "llvm/Support/Debug.h"
31 #include "llvm/Support/raw_ostream.h"
32 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
33
34 using namespace llvm;
35 using namespace lowertypetests;
36
37 #define DEBUG_TYPE "lowertypetests"
38
39 STATISTIC(ByteArraySizeBits, "Byte array size in bits");
40 STATISTIC(ByteArraySizeBytes, "Byte array size in bytes");
41 STATISTIC(NumByteArraysCreated, "Number of byte arrays created");
42 STATISTIC(NumTypeTestCallsLowered, "Number of type test calls lowered");
43 STATISTIC(NumTypeIdDisjointSets, "Number of disjoint sets of type identifiers");
44
45 static cl::opt<bool> AvoidReuse(
46 "lowertypetests-avoid-reuse",
47 cl::desc("Try to avoid reuse of byte array addresses using aliases"),
48 cl::Hidden, cl::init(true));
49
50 bool BitSetInfo::containsGlobalOffset(uint64_t Offset) const {
51 if (Offset < ByteOffset)
52 return false;
53
54 if ((Offset - ByteOffset) % (uint64_t(1) << AlignLog2) != 0)
55 return false;
56
57 uint64_t BitOffset = (Offset - ByteOffset) >> AlignLog2;
58 if (BitOffset >= BitSize)
59 return false;
60
61 return Bits.count(BitOffset);
62 }
63
64 bool BitSetInfo::containsValue(
65 const DataLayout &DL,
66 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout, Value *V,
67 uint64_t COffset) const {
68 if (auto GV = dyn_cast<GlobalObject>(V)) {
69 auto I = GlobalLayout.find(GV);
70 if (I == GlobalLayout.end())
71 return false;
72 return containsGlobalOffset(I->second + COffset);
73 }
74
75 if (auto GEP = dyn_cast<GEPOperator>(V)) {
76 APInt APOffset(DL.getPointerSizeInBits(0), 0);
77 bool Result = GEP->accumulateConstantOffset(DL, APOffset);
78 if (!Result)
79 return false;
80 COffset += APOffset.getZExtValue();
81 return containsValue(DL, GlobalLayout, GEP->getPointerOperand(),
82 COffset);
83 }
84
85 if (auto Op = dyn_cast<Operator>(V)) {
86 if (Op->getOpcode() == Instruction::BitCast)
87 return containsValue(DL, GlobalLayout, Op->getOperand(0), COffset);
88
89 if (Op->getOpcode() == Instruction::Select)
90 return containsValue(DL, GlobalLayout, Op->getOperand(1), COffset) &&
91 containsValue(DL, GlobalLayout, Op->getOperand(2), COffset);
92 }
93
94 return false;
95 }
96
97 void BitSetInfo::print(raw_ostream &OS) const {
98 OS << "offset " << ByteOffset << " size " << BitSize << " align "
99 << (1 << AlignLog2);
100
101 if (isAllOnes()) {
102 OS << " all-ones\n";
103 return;
104 }
105
106 OS << " { ";
107 for (uint64_t B : Bits)
108 OS << B << ' ';
109 OS << "}\n";
110 }
111
112 BitSetInfo BitSetBuilder::build() {
113 if (Min > Max)
114 Min = 0;
115
116 // Normalize each offset against the minimum observed offset, and compute
117 // the bitwise OR of each of the offsets. The number of trailing zeros
118 // in the mask gives us the log2 of the alignment of all offsets, which
119 // allows us to compress the bitset by only storing one bit per aligned
120 // address.
121 uint64_t Mask = 0;
122 for (uint64_t &Offset : Offsets) {
123 Offset -= Min;
124 Mask |= Offset;
125 }
126
127 BitSetInfo BSI;
128 BSI.ByteOffset = Min;
129
130 BSI.AlignLog2 = 0;
131 if (Mask != 0)
132 BSI.AlignLog2 = countTrailingZeros(Mask, ZB_Undefined);
133
134 // Build the compressed bitset while normalizing the offsets against the
135 // computed alignment.
136 BSI.BitSize = ((Max - Min) >> BSI.AlignLog2) + 1;
137 for (uint64_t Offset : Offsets) {
138 Offset >>= BSI.AlignLog2;
139 BSI.Bits.insert(Offset);
140 }
141
142 return BSI;
143 }
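A small worked example of the computation above (a sketch assuming the BitSetBuilder/BitSetInfo declarations from the LowerTypeTests.h header included by this file):

#include <cassert>
#include "llvm/Transforms/IPO/LowerTypeTests.h"

// Offsets {0, 8, 16}: Min = 0 and Mask = 0 | 8 | 16 = 0b11000, so
// AlignLog2 = 3 (three trailing zeros). One bit is stored per 8-byte slot:
// BitSize = ((16 - 0) >> 3) + 1 = 3 and Bits = {0, 1, 2}, i.e. all-ones.
static void bitSetBuilderExample() {
  llvm::lowertypetests::BitSetBuilder BSB;
  BSB.addOffset(0);
  BSB.addOffset(8);
  BSB.addOffset(16);
  llvm::lowertypetests::BitSetInfo BSI = BSB.build();
  assert(BSI.AlignLog2 == 3 && BSI.BitSize == 3 && BSI.isAllOnes());
}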
144
145 void GlobalLayoutBuilder::addFragment(const std::set<uint64_t> &F) {
146 // Create a new fragment to hold the layout for F.
147 Fragments.emplace_back();
148 std::vector<uint64_t> &Fragment = Fragments.back();
149 uint64_t FragmentIndex = Fragments.size() - 1;
150
151 for (auto ObjIndex : F) {
152 uint64_t OldFragmentIndex = FragmentMap[ObjIndex];
153 if (OldFragmentIndex == 0) {
154 // We haven't seen this object index before, so just add it to the current
155 // fragment.
156 Fragment.push_back(ObjIndex);
157 } else {
158 // This index belongs to an existing fragment. Copy the elements of the
159 // old fragment into this one and clear the old fragment. We don't update
160 // the fragment map just yet; this ensures that any further references to
161 // indices from the old fragment in this fragment do not insert any more
162 // indices.
163 std::vector<uint64_t> &OldFragment = Fragments[OldFragmentIndex];
164 Fragment.insert(Fragment.end(), OldFragment.begin(), OldFragment.end());
165 OldFragment.clear();
166 }
167 }
168
169 // Update the fragment map to point our object indices to this fragment.
170 for (uint64_t ObjIndex : Fragment)
171 FragmentMap[ObjIndex] = FragmentIndex;
172 }
173
174 void ByteArrayBuilder::allocate(const std::set<uint64_t> &Bits,
175 uint64_t BitSize, uint64_t &AllocByteOffset,
176 uint8_t &AllocMask) {
177 // Find the smallest current allocation.
178 unsigned Bit = 0;
179 for (unsigned I = 1; I != BitsPerByte; ++I)
180 if (BitAllocs[I] < BitAllocs[Bit])
181 Bit = I;
182
183 AllocByteOffset = BitAllocs[Bit];
184
185 // Add our size to it.
186 unsigned ReqSize = AllocByteOffset + BitSize;
187 BitAllocs[Bit] = ReqSize;
188 if (Bytes.size() < ReqSize)
189 Bytes.resize(ReqSize);
190
191 // Set our bits.
192 AllocMask = 1 << Bit;
193 for (uint64_t B : Bits)
194 Bytes[AllocByteOffset + B] |= AllocMask;
195 }
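As a sketch of how the allocator packs sets, two small bit sets can share the same bytes while occupying different bit lanes (the concrete sets and sizes below are illustrative; ByteArrayBuilder comes from LowerTypeTests.h):

#include <cstdint>
#include "llvm/Transforms/IPO/LowerTypeTests.h"

// Both allocations start at byte offset 0: the first set gets bit lane 0
// (mask 0x01), the second bit lane 1 (mask 0x02), so Bytes[0..3] serve both
// sets without growing the array.
static void byteArrayBuilderExample() {
  llvm::lowertypetests::ByteArrayBuilder BAB;
  uint64_t Off0, Off1;
  uint8_t Mask0, Mask1;
  BAB.allocate({0, 2}, /*BitSize=*/4, Off0, Mask0);
  BAB.allocate({1, 3}, /*BitSize=*/4, Off1, Mask1);
}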
196
197 namespace {
198
199 struct ByteArrayInfo {
200 std::set Bits;
201 uint64_t BitSize;
202 GlobalVariable *ByteArray;
203 Constant *Mask;
204 };
205
206 struct LowerTypeTests : public ModulePass {
207 static char ID;
208 LowerTypeTests() : ModulePass(ID) {
209 initializeLowerTypeTestsPass(*PassRegistry::getPassRegistry());
210 }
211
212 Module *M;
213
214 bool LinkerSubsectionsViaSymbols;
215 Triple::ArchType Arch;
216 Triple::ObjectFormatType ObjectFormat;
217 IntegerType *Int1Ty;
218 IntegerType *Int8Ty;
219 IntegerType *Int32Ty;
220 Type *Int32PtrTy;
221 IntegerType *Int64Ty;
222 IntegerType *IntPtrTy;
223
224 // Mapping from type identifiers to the call sites that test them.
225 DenseMap<Metadata *, std::vector<CallInst *>> TypeTestCallSites;
226
227 std::vector<ByteArrayInfo> ByteArrayInfos;
228
229 BitSetInfo
230 buildBitSet(Metadata *TypeId,
231 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout);
232 ByteArrayInfo *createByteArray(BitSetInfo &BSI);
233 void allocateByteArrays();
234 Value *createBitSetTest(IRBuilder<> &B, BitSetInfo &BSI, ByteArrayInfo *&BAI,
235 Value *BitOffset);
236 void
237 lowerTypeTestCalls(ArrayRef<Metadata *> TypeIds, Constant *CombinedGlobalAddr,
238 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout);
239 Value *
240 lowerBitSetCall(CallInst *CI, BitSetInfo &BSI, ByteArrayInfo *&BAI,
241 Constant *CombinedGlobal,
242 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout);
243 void buildBitSetsFromGlobalVariables(ArrayRef<Metadata *> TypeIds,
244 ArrayRef<GlobalVariable *> Globals);
245 unsigned getJumpTableEntrySize();
246 Type *getJumpTableEntryType();
247 Constant *createJumpTableEntry(GlobalObject *Src, Function *Dest,
248 unsigned Distance);
249 void verifyTypeMDNode(GlobalObject *GO, MDNode *Type);
250 void buildBitSetsFromFunctions(ArrayRef<Metadata *> TypeIds,
251 ArrayRef<Function *> Functions);
252 void buildBitSetsFromDisjointSet(ArrayRef<Metadata *> TypeIds,
253 ArrayRef<GlobalObject *> Globals);
254 bool lower();
255
256 bool doInitialization(Module &M) override;
257 bool runOnModule(Module &M) override;
258 };
259
260 } // anonymous namespace
261
262 INITIALIZE_PASS(LowerTypeTests, "lowertypetests", "Lower type metadata", false,
263 false)
264 char LowerTypeTests::ID = 0;
265
266 ModulePass *llvm::createLowerTypeTestsPass() { return new LowerTypeTests; }
267
268 bool LowerTypeTests::doInitialization(Module &Mod) {
269 M = &Mod;
270 const DataLayout &DL = Mod.getDataLayout();
271
272 Triple TargetTriple(M->getTargetTriple());
273 LinkerSubsectionsViaSymbols = TargetTriple.isMacOSX();
274 Arch = TargetTriple.getArch();
275 ObjectFormat = TargetTriple.getObjectFormat();
276
277 Int1Ty = Type::getInt1Ty(M->getContext());
278 Int8Ty = Type::getInt8Ty(M->getContext());
279 Int32Ty = Type::getInt32Ty(M->getContext());
280 Int32PtrTy = PointerType::getUnqual(Int32Ty);
281 Int64Ty = Type::getInt64Ty(M->getContext());
282 IntPtrTy = DL.getIntPtrType(M->getContext(), 0);
283
284 TypeTestCallSites.clear();
285
286 return false;
287 }
288
289 /// Build a bit set for TypeId using the object layouts in
290 /// GlobalLayout.
291 BitSetInfo LowerTypeTests::buildBitSet(
292 Metadata *TypeId,
293 const DenseMap &GlobalLayout) {
294 BitSetBuilder BSB;
295
296 // Compute the byte offset of each address associated with this type
297 // identifier.
298 SmallVector<MDNode *, 2> Types;
299 for (auto &GlobalAndOffset : GlobalLayout) {
300 Types.clear();
301 GlobalAndOffset.first->getMetadata(LLVMContext::MD_type, Types);
302 for (MDNode *Type : Types) {
303 if (Type->getOperand(1) != TypeId)
304 continue;
305 uint64_t Offset =
306 cast<ConstantInt>(cast<ConstantAsMetadata>(Type->getOperand(0))
307 ->getValue())->getZExtValue();
308 BSB.addOffset(GlobalAndOffset.second + Offset);
309 }
310 }
311
312 return BSB.build();
313 }
314
315 /// Build a test that bit BitOffset mod sizeof(Bits)*8 is set in
316 /// Bits. This pattern matches to the bt instruction on x86.
317 static Value *createMaskedBitTest(IRBuilder<> &B, Value *Bits,
318 Value *BitOffset) {
319 auto BitsType = cast<IntegerType>(Bits->getType());
320 unsigned BitWidth = BitsType->getBitWidth();
321
322 BitOffset = B.CreateZExtOrTrunc(BitOffset, BitsType);
323 Value *BitIndex =
324 B.CreateAnd(BitOffset, ConstantInt::get(BitsType, BitWidth - 1));
325 Value *BitMask = B.CreateShl(ConstantInt::get(BitsType, 1), BitIndex);
326 Value *MaskedBits = B.CreateAnd(Bits, BitMask);
327 return B.CreateICmpNE(MaskedBits, ConstantInt::get(BitsType, 0));
328 }
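The IR built above is equivalent to the following scalar computation (a sketch; BitWidth is 32 or 64 in practice and must be a power of two):

#include <cstdint>

// Test bit (BitOffset mod BitWidth) of Bits; on x86 the backend selects
// this and/shl/icmp pattern as a single bt instruction.
static bool maskedBitTest(uint64_t Bits, uint64_t BitOffset, unsigned BitWidth) {
  uint64_t BitIndex = BitOffset & (BitWidth - 1); // BitOffset mod BitWidth
  uint64_t BitMask = uint64_t(1) << BitIndex;
  return (Bits & BitMask) != 0;
}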
329
330 ByteArrayInfo *LowerTypeTests::createByteArray(BitSetInfo &BSI) {
331 // Create globals to stand in for byte arrays and masks. These never actually
332 // get initialized, we RAUW and erase them later in allocateByteArrays() once
333 // we know the offset and mask to use.
334 auto ByteArrayGlobal = new GlobalVariable(
335 *M, Int8Ty, /*isConstant=*/true, GlobalValue::PrivateLinkage, nullptr);
336 auto MaskGlobal = new GlobalVariable(
337 *M, Int8Ty, /*isConstant=*/true, GlobalValue::PrivateLinkage, nullptr);
338
339 ByteArrayInfos.emplace_back();
340 ByteArrayInfo *BAI = &ByteArrayInfos.back();
341
342 BAI->Bits = BSI.Bits;
343 BAI->BitSize = BSI.BitSize;
344 BAI->ByteArray = ByteArrayGlobal;
345 BAI->Mask = ConstantExpr::getPtrToInt(MaskGlobal, Int8Ty);
346 return BAI;
347 }
348
349 void LowerTypeTests::allocateByteArrays() {
350 std::stable_sort(ByteArrayInfos.begin(), ByteArrayInfos.end(),
351 [](const ByteArrayInfo &BAI1, const ByteArrayInfo &BAI2) {
352 return BAI1.BitSize > BAI2.BitSize;
353 });
354
355 std::vector<uint64_t> ByteArrayOffsets(ByteArrayInfos.size());
356
357 ByteArrayBuilder BAB;
358 for (unsigned I = 0; I != ByteArrayInfos.size(); ++I) {
359 ByteArrayInfo *BAI = &ByteArrayInfos[I];
360
361 uint8_t Mask;
362 BAB.allocate(BAI->Bits, BAI->BitSize, ByteArrayOffsets[I], Mask);
363
364 BAI->Mask->replaceAllUsesWith(ConstantInt::get(Int8Ty, Mask));
365 cast<GlobalVariable>(BAI->Mask->getOperand(0))->eraseFromParent();
366 }
367
368 Constant *ByteArrayConst = ConstantDataArray::get(M->getContext(), BAB.Bytes);
369 auto ByteArray =
370 new GlobalVariable(*M, ByteArrayConst->getType(), /*isConstant=*/true,
371 GlobalValue::PrivateLinkage, ByteArrayConst);
372
373 for (unsigned I = 0; I != ByteArrayInfos.size(); ++I) {
374 ByteArrayInfo *BAI = &ByteArrayInfos[I];
375
376 Constant *Idxs[] = {ConstantInt::get(IntPtrTy, 0),
377 ConstantInt::get(IntPtrTy, ByteArrayOffsets[I])};
378 Constant *GEP = ConstantExpr::getInBoundsGetElementPtr(
379 ByteArrayConst->getType(), ByteArray, Idxs);
380
381 // Create an alias instead of RAUW'ing the gep directly. On x86 this ensures
382 // that the pc-relative displacement is folded into the lea instead of the
383 // test instruction getting another displacement.
384 if (LinkerSubsectionsViaSymbols) {
385 BAI->ByteArray->replaceAllUsesWith(GEP);
386 } else {
387 GlobalAlias *Alias = GlobalAlias::create(
388 Int8Ty, 0, GlobalValue::PrivateLinkage, "bits", GEP, M);
389 BAI->ByteArray->replaceAllUsesWith(Alias);
390 }
391 BAI->ByteArray->eraseFromParent();
392 }
393
394 ByteArraySizeBits = BAB.BitAllocs[0] + BAB.BitAllocs[1] + BAB.BitAllocs[2] +
395 BAB.BitAllocs[3] + BAB.BitAllocs[4] + BAB.BitAllocs[5] +
396 BAB.BitAllocs[6] + BAB.BitAllocs[7];
397 ByteArraySizeBytes = BAB.Bytes.size();
398 }
399
400 /// Build a test that bit BitOffset is set in BSI, where
401 /// BitSetGlobal is a global containing the bits in BSI.
402 Value *LowerTypeTests::createBitSetTest(IRBuilder<> &B, BitSetInfo &BSI,
403 ByteArrayInfo *&BAI, Value *BitOffset) {
404 if (BSI.BitSize <= 64) {
405 // If the bit set is sufficiently small, we can avoid a load by bit testing
406 // a constant.
407 IntegerType *BitsTy;
408 if (BSI.BitSize <= 32)
409 BitsTy = Int32Ty;
410 else
411 BitsTy = Int64Ty;
412
413 uint64_t Bits = 0;
414 for (auto Bit : BSI.Bits)
415 Bits |= uint64_t(1) << Bit;
416 Constant *BitsConst = ConstantInt::get(BitsTy, Bits);
417 return createMaskedBitTest(B, BitsConst, BitOffset);
418 } else {
419 if (!BAI) {
420 ++NumByteArraysCreated;
421 BAI = createByteArray(BSI);
422 }
423
424 Constant *ByteArray = BAI->ByteArray;
425 Type *Ty = BAI->ByteArray->getValueType();
426 if (!LinkerSubsectionsViaSymbols && AvoidReuse) {
427 // Each use of the byte array uses a different alias. This makes the
428 // backend less likely to reuse previously computed byte array addresses,
429 // improving the security of the CFI mechanism based on this pass.
430 ByteArray = GlobalAlias::create(BAI->ByteArray->getValueType(), 0,
431 GlobalValue::PrivateLinkage, "bits_use",
432 ByteArray, M);
433 }
434
435 Value *ByteAddr = B.CreateGEP(Ty, ByteArray, BitOffset);
436 Value *Byte = B.CreateLoad(ByteAddr);
437
438 Value *ByteAndMask = B.CreateAnd(Byte, BAI->Mask);
439 return B.CreateICmpNE(ByteAndMask, ConstantInt::get(Int8Ty, 0));
440 }
441 }
442
443 /// Lower a llvm.type.test call to its implementation. Returns the value to
444 /// replace the call with.
445 Value *LowerTypeTests::lowerBitSetCall(
446 CallInst *CI, BitSetInfo &BSI, ByteArrayInfo *&BAI,
447 Constant *CombinedGlobalIntAddr,
448 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout) {
449 Value *Ptr = CI->getArgOperand(0);
450 const DataLayout &DL = M->getDataLayout();
451
452 if (BSI.containsValue(DL, GlobalLayout, Ptr))
453 return ConstantInt::getTrue(M->getContext());
454
455 Constant *OffsetedGlobalAsInt = ConstantExpr::getAdd(
456 CombinedGlobalIntAddr, ConstantInt::get(IntPtrTy, BSI.ByteOffset));
457
458 BasicBlock *InitialBB = CI->getParent();
459
460 IRBuilder<> B(CI);
461
462 Value *PtrAsInt = B.CreatePtrToInt(Ptr, IntPtrTy);
463
464 if (BSI.isSingleOffset())
465 return B.CreateICmpEQ(PtrAsInt, OffsetedGlobalAsInt);
466
467 Value *PtrOffset = B.CreateSub(PtrAsInt, OffsetedGlobalAsInt);
468
469 Value *BitOffset;
470 if (BSI.AlignLog2 == 0) {
471 BitOffset = PtrOffset;
472 } else {
473 // We need to check that the offset both falls within our range and is
474 // suitably aligned. We can check both properties at the same time by
475 // performing a right rotate by log2(alignment) followed by an integer
476 // comparison against the bitset size. The rotate will move the lower
477 // order bits that need to be zero into the higher order bits of the
478 // result, causing the comparison to fail if they are nonzero. The rotate
479 // also conveniently gives us a bit offset to use during the load from
480 // the bitset.
481 Value *OffsetSHR =
482 B.CreateLShr(PtrOffset, ConstantInt::get(IntPtrTy, BSI.AlignLog2));
483 Value *OffsetSHL = B.CreateShl(
484 PtrOffset,
485 ConstantInt::get(IntPtrTy, DL.getPointerSizeInBits(0) - BSI.AlignLog2));
486 BitOffset = B.CreateOr(OffsetSHR, OffsetSHL);
487 }
488
489 Constant *BitSizeConst = ConstantInt::get(IntPtrTy, BSI.BitSize);
490 Value *OffsetInRange = B.CreateICmpULT(BitOffset, BitSizeConst);
491
492 // If the bit set is all ones, testing against it is unnecessary.
493 if (BSI.isAllOnes())
494 return OffsetInRange;
495
496 TerminatorInst *Term = SplitBlockAndInsertIfThen(OffsetInRange, CI, false);
497 IRBuilder<> ThenB(Term);
498
499 // Now that we know that the offset is in range and aligned, load the
500 // appropriate bit from the bitset.
501 Value *Bit = createBitSetTest(ThenB, BSI, BAI, BitOffset);
502
503 // The value we want is 0 if we came directly from the initial block
504 // (having failed the range or alignment checks), or the loaded bit if
505 // we came from the block in which we loaded it.
506 B.SetInsertPoint(CI);
507 PHINode *P = B.CreatePHI(Int1Ty, 2);
508 P->addIncoming(ConstantInt::get(Int1Ty, 0), InitialBB);
509 P->addIncoming(Bit, ThenB.GetInsertBlock());
510 return P;
511 }
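// Illustrative sketch (not part of the pass): the combined range-and-alignment
// check built in lowerBitSetCall above, evaluated on ordinary integers.
// Assumes 64-bit pointers and AlignLog2 > 0 (the AlignLog2 == 0 case is
// handled separately above); the helper name is hypothetical.
namespace {
inline bool rotateRangeCheckSketch(uint64_t PtrOffset, unsigned AlignLog2,
                                   uint64_t BitSize) {
  assert(AlignLog2 > 0 && AlignLog2 < 64 && "sketch assumes a nonzero rotate");
  // Right-rotate PtrOffset by AlignLog2 bits, as the lshr/shl/or sequence does
  // for a 64-bit pointer type.
  uint64_t BitOffset =
      (PtrOffset >> AlignLog2) | (PtrOffset << (64 - AlignLog2));
  // A misaligned offset rotates its low bits into the high bits of BitOffset,
  // so the unsigned comparison fails for it just as for an out-of-range offset.
  return BitOffset < BitSize;
}
} // end anonymous namespace
// For example, with AlignLog2 == 3 and BitSize == 4:
//   rotateRangeCheckSketch(16, 3, 4) -> true   (aligned, bit index 2 in range)
//   rotateRangeCheckSketch(17, 3, 4) -> false  (misaligned: bit 0 rotates high)
//   rotateRangeCheckSketch(40, 3, 4) -> false  (aligned, but index 5 >= 4)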
512
513 /// Given a disjoint set of type identifiers and globals, lay out the globals,
514 /// build the bit sets and lower the llvm.type.test calls.
515 void LowerTypeTests::buildBitSetsFromGlobalVariables(
516 ArrayRef<Metadata *> TypeIds, ArrayRef<GlobalVariable *> Globals) {
517 // Build a new global with the combined contents of the referenced globals.
518 // This global is a struct whose even-indexed elements contain the original
519 // contents of the referenced globals and whose odd-indexed elements contain
520 // any padding required to align the next element to the next power of 2.
521 std::vector<Constant *> GlobalInits;
522 const DataLayout &DL = M->getDataLayout();
523 for (GlobalVariable *G : Globals) {
524 GlobalInits.push_back(G->getInitializer());
525 uint64_t InitSize = DL.getTypeAllocSize(G->getValueType());
526
527 // Compute the amount of padding required.
528 uint64_t Padding = NextPowerOf2(InitSize - 1) - InitSize;
529
530 // Cap at 128 was found experimentally to have a good data/instruction
531 // overhead tradeoff.
532 if (Padding > 128)
533 Padding = alignTo(InitSize, 128) - InitSize;
534
535 GlobalInits.push_back(
536 ConstantAggregateZero::get(ArrayType::get(Int8Ty, Padding)));
537 }
538 if (!GlobalInits.empty())
539 GlobalInits.pop_back();
540 Constant *NewInit = ConstantStruct::getAnon(M->getContext(), GlobalInits);
541 auto *CombinedGlobal =
542 new GlobalVariable(*M, NewInit->getType(), /*isConstant=*/true,
543 GlobalValue::PrivateLinkage, NewInit);
544
545 StructType *NewTy = cast<StructType>(NewInit->getType());
546 const StructLayout *CombinedGlobalLayout = DL.getStructLayout(NewTy);
547
548 // Compute the offsets of the original globals within the new global.
549 DenseMap<GlobalObject *, uint64_t> GlobalLayout;
550 for (unsigned I = 0; I != Globals.size(); ++I)
551 // Multiply by 2 to account for padding elements.
552 GlobalLayout[Globals[I]] = CombinedGlobalLayout->getElementOffset(I * 2);
553
554 lowerTypeTestCalls(TypeIds, CombinedGlobal, GlobalLayout);
555
556 // Build aliases pointing to offsets into the combined global for each
557 // global from which we built the combined global, and replace references
558 // to the original globals with references to the aliases.
559 for (unsigned I = 0; I != Globals.size(); ++I) {
560 // Multiply by 2 to account for padding elements.
561 Constant *CombinedGlobalIdxs[] = {ConstantInt::get(Int32Ty, 0),
562 ConstantInt::get(Int32Ty, I * 2)};
563 Constant *CombinedGlobalElemPtr = ConstantExpr::getGetElementPtr(
564 NewInit->getType(), CombinedGlobal, CombinedGlobalIdxs);
565 if (LinkerSubsectionsViaSymbols) {
566 Globals[I]->replaceAllUsesWith(CombinedGlobalElemPtr);
567 } else {
568 assert(Globals[I]->getType()->getAddressSpace() == 0);
569 GlobalAlias *GAlias = GlobalAlias::create(NewTy->getElementType(I * 2), 0,
570 Globals[I]->getLinkage(), "",
571 CombinedGlobalElemPtr, M);
572 GAlias->setVisibility(Globals[I]->getVisibility());
573 GAlias->takeName(Globals[I]);
574 Globals[I]->replaceAllUsesWith(GAlias);
575 }
576 Globals[I]->eraseFromParent();
577 }
578 }
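// Illustrative sketch (not part of the pass): the padding rule used in the
// loop above, applied to plain sizes. NextPowerOf2 and alignTo are the LLVM
// MathExtras helpers already used by the pass; the function name is
// hypothetical.
namespace {
inline uint64_t interGlobalPaddingSketch(uint64_t InitSize) {
  uint64_t Padding = NextPowerOf2(InitSize - 1) - InitSize;
  if (Padding > 128)
    Padding = alignTo(InitSize, 128) - InitSize;
  return Padding;
}
} // end anonymous namespace
// e.g. interGlobalPaddingSketch(24)  == 8   (pad a 24-byte global up to 32)
//      interGlobalPaddingSketch(300) == 84  (capped: align up to 384, not 512)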
579
580 void LowerTypeTests::lowerTypeTestCalls(
581 ArrayRef<Metadata *> TypeIds, Constant *CombinedGlobalAddr,
582 const DenseMap<GlobalObject *, uint64_t> &GlobalLayout) {
583 Constant *CombinedGlobalIntAddr =
584 ConstantExpr::getPtrToInt(CombinedGlobalAddr, IntPtrTy);
585
586 // For each type identifier in this disjoint set...
587 for (Metadata *TypeId : TypeIds) {
588 // Build the bitset.
589 BitSetInfo BSI = buildBitSet(TypeId, GlobalLayout);
590 DEBUG({
591 if (auto MDS = dyn_cast<MDString>(TypeId))
592 dbgs() << MDS->getString() << ": ";
593 else
594 dbgs() << "<unnamed>: ";
595 BSI.print(dbgs());
596 });
597
598 ByteArrayInfo *BAI = nullptr;
599
600 // Lower each call to llvm.type.test for this type identifier.
601 for (CallInst *CI : TypeTestCallSites[TypeId]) {
602 ++NumTypeTestCallsLowered;
603 Value *Lowered =
604 lowerBitSetCall(CI, BSI, BAI, CombinedGlobalIntAddr, GlobalLayout);
605 CI->replaceAllUsesWith(Lowered);
606 CI->eraseFromParent();
607 }
608 }
609 }
610
611 void LowerTypeTests::verifyTypeMDNode(GlobalObject *GO, MDNode *Type) {
612 if (Type->getNumOperands() != 2)
613 report_fatal_error(
614 "All operands of type metadata must have 2 elements");
615
616 if (GO->isThreadLocal())
617 report_fatal_error("Bit set element may not be thread-local");
618 if (isa<GlobalVariable>(GO) && GO->hasSection())
619 report_fatal_error(
620 "A member of a type identifier may not have an explicit section");
621
622 if (isa<GlobalVariable>(GO) && GO->isDeclarationForLinker())
623 report_fatal_error(
624 "A global var member of a type identifier must be a definition");
625
626 auto OffsetConstMD = dyn_cast<ConstantAsMetadata>(Type->getOperand(0));
627 if (!OffsetConstMD)
628 report_fatal_error("Type offset must be a constant");
629 auto OffsetInt = dyn_cast<ConstantInt>(OffsetConstMD->getValue());
630 if (!OffsetInt)
631 report_fatal_error("Type offset must be an integer constant");
632 }
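// Illustrative sketch (not part of the pass): building a !type attachment of
// the shape verified above, i.e. !{offset, type identifier}, with the C++ API.
// The helper name and the choice of an MDString identifier are illustrative.
namespace {
inline void attachTypeMetadataSketch(GlobalVariable &VTable, uint64_t Offset,
                                     StringRef TypeId) {
  LLVMContext &Ctx = VTable.getContext();
  Metadata *Ops[] = {
      ConstantAsMetadata::get(
          ConstantInt::get(Type::getInt64Ty(Ctx), Offset)), // operand 0: offset
      MDString::get(Ctx, TypeId)};                          // operand 1: type id
  VTable.addMetadata(LLVMContext::MD_type, *MDNode::get(Ctx, Ops));
}
} // end anonymous namespace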
633
634 static const unsigned kX86JumpTableEntrySize = 8;
635
636 unsigned LowerTypeTests::getJumpTableEntrySize() {
637 if (Arch != Triple::x86 && Arch != Triple::x86_64)
638 report_fatal_error("Unsupported architecture for jump tables");
639
640 return kX86JumpTableEntrySize;
641 }
642
643 // Create a constant representing a jump table entry for the target. This
644 // consists of an instruction sequence containing a relative branch to Dest. The
645 // constant will be laid out at address Src+(Len*Distance) where Len is the
646 // target-specific jump table entry size.
647 Constant *LowerTypeTests::createJumpTableEntry(GlobalObject *Src,
648 Function *Dest,
649 unsigned Distance) {
650 if (Arch != Triple::x86 && Arch != Triple::x86_64)
651 report_fatal_error("Unsupported architecture for jump tables");
652
653 const unsigned kJmpPCRel32Code = 0xe9;
654 const unsigned kInt3Code = 0xcc;
655
656 ConstantInt *Jmp = ConstantInt::get(Int8Ty, kJmpPCRel32Code);
657
658 // Build a constant representing the displacement between the constant's
659 // address and Dest. This will resolve to a PC32 relocation referring to Dest.
660 Constant *DestInt = ConstantExpr::getPtrToInt(Dest, IntPtrTy);
661 Constant *SrcInt = ConstantExpr::getPtrToInt(Src, IntPtrTy);
662 Constant *Disp = ConstantExpr::getSub(DestInt, SrcInt);
663 ConstantInt *DispOffset =
664 ConstantInt::get(IntPtrTy, Distance * kX86JumpTableEntrySize + 5);
665 Constant *OffsetedDisp = ConstantExpr::getSub(Disp, DispOffset);
666 OffsetedDisp = ConstantExpr::getTruncOrBitCast(OffsetedDisp, Int32Ty);
667
668 ConstantInt *Int3 = ConstantInt::get(Int8Ty, kInt3Code);
669
670 Constant *Fields[] = {
671 Jmp, OffsetedDisp, Int3, Int3, Int3,
672 };
673 return ConstantStruct::getAnon(Fields, /*Packed=*/true);
674 }
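// Illustrative sketch (not part of the pass): the displacement arithmetic
// above, on plain addresses. The rel32 operand of the jmp is relative to the
// end of the 5-byte jmp instruction, which starts at Src + Distance * 8, hence
// the "+ 5" in DispOffset. The helper name is hypothetical.
namespace {
inline int32_t jumpEntryDisplacementSketch(uint64_t Src, uint64_t Dest,
                                           unsigned Distance) {
  uint64_t JmpEnd = Src + Distance * kX86JumpTableEntrySize + 5;
  return static_cast<int32_t>(Dest - JmpEnd);
}
} // end anonymous namespace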
675
676 Type *LowerTypeTests::getJumpTableEntryType() {
677 if (Arch != Triple::x86 && Arch != Triple::x86_64)
678 report_fatal_error("Unsupported architecture for jump tables");
679
680 return StructType::get(M->getContext(),
681 {Int8Ty, Int32Ty, Int8Ty, Int8Ty, Int8Ty},
682 /*Packed=*/true);
683 }
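// Illustrative note (not part of the pass): the packed entry type above is
// 1 + 4 + 1 + 1 + 1 = 8 bytes, i.e. kX86JumpTableEntrySize, so entry I of the
// jump table starts exactly I * 8 bytes after the table. A hypothetical host
// mirror of that layout:
namespace {
#pragma pack(push, 1)
struct JumpTableEntrySketch {
  uint8_t Jmp;     // 0xe9: jmp rel32
  int32_t Disp;    // displacement to the target function
  uint8_t Pad[3];  // int3 padding to 8 bytes
};
#pragma pack(pop)
static_assert(sizeof(JumpTableEntrySketch) == kX86JumpTableEntrySize,
              "jump table entries are laid out on an 8 byte stride");
} // end anonymous namespace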
684
685 /// Given a disjoint set of type identifiers and functions, build a jump table
686 /// for the functions, build the bit sets and lower the llvm.type.test calls.
687 void LowerTypeTests::buildBitSetsFromFunctions(ArrayRef<Metadata *> TypeIds,
688 ArrayRef<Function *> Functions) {
689 // Unlike the global bitset builder, the function bitset builder cannot
690 // re-arrange functions in a particular order and base its calculations on the
691 // layout of the functions' entry points, as we have no idea how large a
692 // particular function will end up being (the size could even depend on what
693 // this pass does!) Instead, we build a jump table, which is a block of code
694 // consisting of one branch instruction for each of the functions in the bit
695 // set that branches to the target function, and redirect any taken function
696 // addresses to the corresponding jump table entry. In the object file's
697 // symbol table, the symbols for the target functions also refer to the jump
698 // table entries, so that addresses taken outside the module will pass any
699 // verification done inside the module.
700 //
701 // In more concrete terms, suppose we have three functions f, g, h which are
702 // of the same type, and a function foo that returns their addresses:
703 //
704 // f:
705 // mov 0, %eax
706 // ret
707 //
708 // g:
709 // mov 1, %eax
710 // ret
711 //
712 // h:
713 // mov 2, %eax
714 // ret
715 //
716 // foo:
717 // mov f, %eax
718 // mov g, %edx
719 // mov h, %ecx
720 // ret
721 //
722 // To create a jump table for these functions, we instruct the LLVM code
723 // generator to output a jump table in the .text section. This is done by
724 // representing the instructions in the jump table as an LLVM constant and
725 // placing them in a global variable in the .text section. The end result will
726 // (conceptually) look like this:
727 //
728 // f:
729 // jmp .Ltmp0 ; 5 bytes
730 // int3 ; 1 byte
731 // int3 ; 1 byte
732 // int3 ; 1 byte
733 //
734 // g:
735 // jmp .Ltmp1 ; 5 bytes
736 // int3 ; 1 byte
737 // int3 ; 1 byte
738 // int3 ; 1 byte
739 //
740 // h:
741 // jmp .Ltmp2 ; 5 bytes
742 // int3 ; 1 byte
743 // int3 ; 1 byte
744 // int3 ; 1 byte
745 //
746 // .Ltmp0:
747 // mov 0, %eax
748 // ret
749 //
750 // .Ltmp1:
751 // mov 1, %eax
752 // ret
753 //
754 // .Ltmp2:
755 // mov 2, %eax
756 // ret
757 //
758 // foo:
759 // mov f, %eax
760 // mov g, %edx
761 // mov h, %ecx
762 // ret
763 //
764 // Because the addresses of f, g, h are evenly spaced at a power of 2, in the
765 // normal case the check can be carried out using the same kind of simple
766 // arithmetic that we normally use for globals.
767
768 assert(!Functions.empty());
769
770 // Build a simple layout based on the regular layout of jump tables.
771 DenseMap<GlobalObject *, uint64_t> GlobalLayout;
772 unsigned EntrySize = getJumpTableEntrySize();
773 for (unsigned I = 0; I != Functions.size(); ++I)
774 GlobalLayout[Functions[I]] = I * EntrySize;
775
776 // Create a constant to hold the jump table.
777 ArrayType *JumpTableType =
778 ArrayType::get(getJumpTableEntryType(), Functions.size());
779 auto JumpTable = new GlobalVariable(*M, JumpTableType,
780 /*isConstant=*/true,
781 GlobalValue::PrivateLinkage, nullptr);
782 JumpTable->setSection(ObjectFormat == Triple::MachO
783 ? "__TEXT,__text,regular,pure_instructions"
784 : ".text");
785 lowerTypeTestCalls(TypeIds, JumpTable, GlobalLayout);
786
787 // Build aliases pointing to offsets into the jump table, and replace
788 // references to the original functions with references to the aliases.
789 for (unsigned I = 0; I != Functions.size(); ++I) {
790 Constant *CombinedGlobalElemPtr = ConstantExpr::getBitCast(
791 ConstantExpr::getGetElementPtr(
792 JumpTableType, JumpTable,
793 ArrayRef<Constant *>{ConstantInt::get(IntPtrTy, 0),
794 ConstantInt::get(IntPtrTy, I)}),
795 Functions[I]->getType());
796 if (LinkerSubsectionsViaSymbols || Functions[I]->isDeclarationForLinker()) {
797 Functions[I]->replaceAllUsesWith(CombinedGlobalElemPtr);
798 } else {
799 assert(Functions[I]->getType()->getAddressSpace() == 0);
800 GlobalAlias *GAlias = GlobalAlias::create(Functions[I]->getValueType(), 0,
801 Functions[I]->getLinkage(), "",
802 CombinedGlobalElemPtr, M);
803 GAlias->setVisibility(Functions[I]->getVisibility());
804 GAlias->takeName(Functions[I]);
805 Functions[I]->replaceAllUsesWith(GAlias);
806 }
807 if (!Functions[I]->isDeclarationForLinker())
808 Functions[I]->setLinkage(GlobalValue::PrivateLinkage);
809 }
810
811 // Build and set the jump table's initializer.
812 std::vector<Constant *> JumpTableEntries;
813 for (unsigned I = 0; I != Functions.size(); ++I)
814 JumpTableEntries.push_back(
815 createJumpTableEntry(JumpTable, Functions[I], I));
816 JumpTable->setInitializer(
817 ConstantArray::get(JumpTableType, JumpTableEntries));
818 }
819
820 void LowerTypeTests::buildBitSetsFromDisjointSet(
821 ArrayRef<Metadata *> TypeIds, ArrayRef<GlobalObject *> Globals) {
822 llvm::DenseMap<Metadata *, unsigned> TypeIdIndices;
823 for (unsigned I = 0; I != TypeIds.size(); ++I)
824 TypeIdIndices[TypeIds[I]] = I;
825
826 // For each type identifier, build a set of indices that refer to members of
827 // the type identifier.
828 std::vector<std::set<uint64_t>> TypeMembers(TypeIds.size());
829 SmallVector<MDNode *, 2> Types;
830 unsigned GlobalIndex = 0;
831 for (GlobalObject *GO : Globals) {
832 Types.clear();
833 GO->getMetadata(LLVMContext::MD_type, Types);
834 for (MDNode *Type : Types) {
835 // Type = { offset, type identifier }
836 unsigned TypeIdIndex = TypeIdIndices[Type->getOperand(1)];
837 TypeMembers[TypeIdIndex].insert(GlobalIndex);
838 }
839 GlobalIndex++;
840 }
841
842 // Order the sets of indices by size. The GlobalLayoutBuilder works best
843 // when given small index sets first.
844 std::stable_sort(
845 TypeMembers.begin(), TypeMembers.end(),
846 [](const std::set<uint64_t> &O1, const std::set<uint64_t> &O2) {
847 return O1.size() < O2.size();
848 });
849
850 // Create a GlobalLayoutBuilder and provide it with index sets as layout
851 // fragments. The GlobalLayoutBuilder tries to lay out members of fragments as
852 // close together as possible.
853 GlobalLayoutBuilder GLB(Globals.size());
854 for (auto &&MemSet : TypeMembers)
855 GLB.addFragment(MemSet);
856
857 // Build the bitsets from this disjoint set.
858 if (Globals.empty() || isa<GlobalVariable>(Globals[0])) {
859 // Build a vector of global variables with the computed layout.
860 std::vector<GlobalVariable *> OrderedGVs(Globals.size());
861 auto OGI = OrderedGVs.begin();
862 for (auto &&F : GLB.Fragments) {
863 for (auto &&Offset : F) {
864 auto GV = dyn_cast<GlobalVariable>(Globals[Offset]);
865 if (!GV)
866 report_fatal_error("Type identifier may not contain both global "
867 "variables and functions");
868 *OGI++ = GV;
869 }
870 }
871
872 buildBitSetsFromGlobalVariables(TypeIds, OrderedGVs);
873 } else {
874 // Build a vector of functions with the computed layout.
875 std::vector<Function *> OrderedFns(Globals.size());
876 auto OFI = OrderedFns.begin();
877 for (auto &&F : GLB.Fragments) {
878 for (auto &&Offset : F) {
879 auto Fn = dyn_cast<Function>(Globals[Offset]);
880 if (!Fn)
881 report_fatal_error("Type identifier may not contain both global "
882 "variables and functions");
883 *OFI++ = Fn;
884 }
885 }
886