llvm.org GIT mirror llvm / 9a395de
[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.

When multiple loop transformations are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance,

    #pragma clang loop unroll_and_jam(enable)
    #pragma clang loop distribute(enable)

is the same as

    #pragma clang loop distribute(enable)
    #pragma clang loop unroll_and_jam(enable)

and will try to loop-distribute before unroll-and-jam because the LoopDistribute pass is scheduled before the UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also, the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used.

This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance,

    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll_and_jam.enable"}
    !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
    !3 = !{!"llvm.loop.distribute.enable"}

defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop.

Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, using Polly to perform these transformations, or adding a loop transformation pass which takes the order issue into account.

For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed.
The responsible pass cannot emit such a warning itself because the transformation might be 'hidden' in a followup attribute when the pass is executed, or the pass might not be present in the pipeline at all. For this reason, this patch introduces a WarnMissedTransformations pass to warn about orphaned transformations.

Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated.

To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added, which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied.

With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling).

Reviewed By: hfinkel, dmgreen

Differential Revision: https://reviews.llvm.org/D49281
Differential Revision: https://reviews.llvm.org/D55288

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348944 91177308-0d34-0410-b5e6-96231b3b80d8

Michael Kruse 8 months ago
56 changed file(s) with 2473 addition(s) and 122 deletion(s).
50755075 is treated as a boolean value; if it exists, it signals that the branch
50765076 or switch that it is attached to is completely unpredictable.
50775077
5078 .. _llvm.loop:
5079
50785080 '``llvm.loop``'
50795081 ^^^^^^^^^^^^^^^
50805082
51075109 ...
51085110 !0 = !{!0, !1}
51095111 !1 = !{!"llvm.loop.unroll.count", i32 4}
5112
5113 '``llvm.loop.disable_nonforced``'
5114 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5115
5116 This metadata disables all optional loop transformations unless
5117 explicitly instructed using other transformation metadata such as
5118 ``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
5119 whether a transformation is profitable. The purpose is to avoid that the
5120 loop is transformed to a different loop before an explicitly requested
5121 (forced) transformation is applied. For instance, loop fusion can make
5122 other transformations impossible. Mandatory loop canonicalizations such
5123 as loop rotation are still applied.
5124
5125 It is recommended to use this metadata in addition to any llvm.loop.*
5126 transformation directive. Also, any loop should have at most one
5127 directive applied to it (and a sequence of transformations built using
5128 followup-attributes). Otherwise, which transformation will be applied
5129 depends on implementation details such as the pass pipeline order.
5130
5131 See :ref:`transformation-metadata` for details.
51105132
51115133 '``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
51125134 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
51665188 0 or if the loop does not have this metadata the width will be
51675189 determined automatically.
51685190
5191 '``llvm.loop.vectorize.followup_vectorized``' Metadata
5192 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5193
5194 This metadata defines which loop attributes the vectorized loop will
5195 have. See :ref:`transformation-metadata` for details.
5196
5197 '``llvm.loop.vectorize.followup_epilogue``' Metadata
5198 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5199
5200 This metadata defines which loop attributes the epilogue will have. The
5201 epilogue is not vectorized and is executed when either the vectorized
5202 loop is not known to preserve semantics (because e.g., it processes two
5203 arrays that are found to alias by a runtime check) or for the last
5204 iterations that do not fill a complete set of vector lanes. See
5205 :ref:`Transformation Metadata <transformation-metadata>` for details.
5206
5207 '``llvm.loop.vectorize.followup_all``' Metadata
5208 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5209
5210 Attributes in the metadata will be added to both the vectorized and
5211 epilogue loop.
5212 See :ref:`Transformation Metadata <transformation-metadata>` for details.
5213
51695214 '``llvm.loop.unroll``'
51705215 ^^^^^^^^^^^^^^^^^^^^^^
51715216
52345279
52355280 !0 = !{!"llvm.loop.unroll.full"}
52365281
5282 '``llvm.loop.unroll.followup``' Metadata
5283 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5284
5285 This metadata defines which loop attributes the unrolled loop will have.
5286 See :ref:`Transformation Metadata <transformation-metadata>` for details.
5287
5288 '``llvm.loop.unroll.followup_remainder``' Metadata
5289 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5290
5291 This metadata defines which loop attributes the remainder loop after
5292 partial/runtime unrolling will have. See
5293 :ref:`Transformation Metadata <transformation-metadata>` for details.
5294
52375295 '``llvm.loop.unroll_and_jam``'
52385296 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
52395297
52875345
52885346 !0 = !{!"llvm.loop.unroll_and_jam.enable"}
52895347
5348 '``llvm.loop.unroll_and_jam.followup_outer``' Metadata
5349 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5350
5351 This metadata defines which loop attributes the outer unrolled loop will
5352 have. See :ref:`Transformation Metadata <transformation-metadata>` for
5353 details.
5354
5355 '``llvm.loop.unroll_and_jam.followup_inner``' Metadata
5356 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5357
5358 This metadata defines which loop attributes the inner jammed loop will
5359 have. See :ref:`Transformation Metadata <transformation-metadata>` for
5360 details.
5361
5362 '``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
5363 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5364
5365 This metadata defines which attributes the epilogue of the outer loop
5366 will have. This loop is usually unrolled, meaning there is no such
5367 loop. This attribute will be ignored in this case. See
5368 :ref:`Transformation Metadata <transformation-metadata>` for details.
5369
5370 '``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
5371 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5372
5373 This metadata defines which attributes the inner loop of the epilogue
5374 will have. The outer epilogue will usually be unrolled, meaning there
5375 can be multiple inner remainder loops. See
5376 :ref:`Transformation Metadata <transformation-metadata>` for details.
5377
5378 '``llvm.loop.unroll_and_jam.followup_all``' Metadata
5379 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5380
5381 Attributes specified in the metadata are added to all
5382 ``llvm.loop.unroll_and_jam.*`` loops. See
5383 :ref:`Transformation Metadata <transformation-metadata>` for details.
5384
52905385 '``llvm.loop.licm_versioning.disable``' Metadata
52915386 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
52925387
53185413
53195414 This metadata should be used in conjunction with ``llvm.loop`` loop
53205415 identification metadata.
5416
5417 '``llvm.loop.distribute.followup_coincident``' Metadata
5418 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5419
5420 This metadata defines which attributes extracted loops with no cyclic
5421 dependencies will have (i.e. can be vectorized). See
5422 :ref:`Transformation Metadata <transformation-metadata>` for details.
5423
5424 '``llvm.loop.distribute.followup_sequential``' Metadata
5425 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5426
5427 This metadata defines which attributes the isolated loops with unsafe
5428 memory dependencies will have. See
5429 :ref:`Transformation Metadata <transformation-metadata>` for details.
5430
5431 '``llvm.loop.distribute.followup_fallback``' Metadata
5432 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5433
5434 If loop versioning is necessary, this metadata defines the attributes
5435 the non-distributed fallback version will have. See
5436 :ref:`Transformation Metadata <transformation-metadata>` for details.
5437
5438 '``llvm.loop.distribute.followup_all``' Metadata
5439 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5440
5441 The attributes in this metadata are added to all followup loops of the
5442 loop distribution pass. See
5443 :ref:`Transformation Metadata <transformation-metadata>` for details.
53215444
53225445 '``llvm.mem``'
53235446 ^^^^^^^^^^^^^^^
12231223 Displays the post dominator tree using the GraphViz tool, but omitting function
12241224 bodies.
12251225
1226 ``-transform-warning``: Report missed forced transformations
1227 ------------------------------------------------------------
1228
1229 Emits warnings about not yet applied forced transformations (e.g. from
1230 ``#pragma omp simd``).
0 .. _transformation-metadata:
1
2 ============================
3 Code Transformation Metadata
4 ============================
5
6 .. contents::
7 :local:
8
9 Overview
10 ========
11
12 LLVM transformation passes can be controlled by attaching metadata to
13 the code to transform. By default, transformation passes use heuristics
14 to determine whether or not to perform transformations, and when doing
15 so, other details of how the transformations are applied (e.g., which
16 vectorization factor to select).
17 Unless the optimizer is otherwise directed, transformations are applied
18 conservatively. This conservatism generally allows the optimizer to
19 avoid unprofitable transformations, but in practice, this results in the
20 optimizer not applying transformations that would be highly profitable.
21
22 Frontends can give additional hints to LLVM passes on which
23 transformations they should apply. This can be additional knowledge that
24 cannot be derived from the emitted IR, or directives passed from the
25 user/programmer. OpenMP pragmas are an example of the latter.
26
27 If any such metadata is dropped from the program, the code's semantics
28 must not change.
29
30 Metadata on Loops
31 =================
32
33 Attributes can be attached to loops as described in :ref:`llvm.loop`.
34 Attributes can describe properties of the loop, disable transformations,
35 force specific transformations and set transformation options.
36
37 Because metadata nodes are immutable (with the exception of
38 ``MDNode::replaceOperandWith`` which is dangerous to use on uniqued
39 metadata), in order to add or remove a loop attribute, a new ``MDNode``
40 must be created and assigned as the new ``llvm.loop`` metadata. Any
41 connection between the old ``MDNode`` and the loop is lost. The
42 ``llvm.loop`` node is also used as LoopID (``Loop::getLoopID()``), i.e.
43 the loop effectively gets a new identifier. For instance,
44 ``llvm.mem.parallel_loop_access`` references the LoopID. Therefore, if
45 the parallel access property is to be preserved after adding/removing
46 loop attributes, any ``llvm.mem.parallel_loop_access`` reference must be
47 updated to the new LoopID.
48
49 Transformation Metadata Structure
50 =================================
51
52 Some attributes describe code transformations (unrolling, vectorizing,
53 loop distribution, etc.). They can either be a hint to the optimizer
54 that a transformation might be beneficial, an instruction to use a specific
55 option, or convey a specific request from the user (such as
56 ``#pragma clang loop`` or ``#pragma omp simd``).
57
58 If a transformation is forced but cannot be carried out for any reason,
59 an optimization-missed warning must be emitted. Semantic information
60 such as a transformation being safe (e.g.
61 ``llvm.mem.parallel_loop_access``) can be unused by the optimizer
62 without generating a warning.
63
64 Unless explicitly disabled, any optimization pass may heuristically
65 determine whether a transformation is beneficial and apply it. If
66 metadata for another transformation was specified, a different
67 transformation applied before it can inadvertently defeat the request: the
68 requested transformation may then be applied to a different loop, or the loop may no longer exist. To avoid having to
69 explicitly disable an unknown number of passes, the attribute
70 ``llvm.loop.disable_nonforced`` disables all optional, high-level,
71 restructuring transformations.
72
73 The following example avoids the loop being altered before being
74 vectorized, for instance being unrolled.
75
76 .. code-block:: llvm
77
78 br i1 %exitcond, label %for.exit, label %for.header, !llvm.loop !0
79 ...
80 !0 = distinct !{!0, !1, !2}
81 !1 = !{!"llvm.loop.vectorize.enable", i1 true}
82 !2 = !{!"llvm.loop.disable_nonforced"}
83
84 After a transformation is applied, follow-up attributes are set on the
85 transformed and/or new loop(s). This allows additional attributes
86 including followup-transformations to be specified. Specifying multiple
87 transformations in the same metadata node is possible for compatibility
88 reasons, but their execution order is undefined. For instance, when
89 ``llvm.loop.vectorize.enable`` and ``llvm.loop.unroll.enable`` are
90 specified at the same time, unrolling may occur either before or after
91 vectorization.
92
93 As an example, the following instructs a loop to be vectorized and only
94 then unrolled.
95
96 .. code-block:: llvm
97
98 !0 = distinct !{!0, !1, !2, !3}
99 !1 = !{!"llvm.loop.vectorize.enable", i1 true}
100 !2 = !{!"llvm.loop.disable_nonforced"}
101 !3 = !{!"llvm.loop.vectorize.followup_vectorized", !{!"llvm.loop.unroll.enable"}}
102
103 If, and only if, no followup is specified, the pass may add attributes itself.
104 For instance, the vectorizer adds a ``llvm.loop.isvectorized`` attribute and
105 all attributes from the original loop excluding its loop vectorizer
106 attributes. To avoid this, an empty followup attribute can be used, e.g.
107
108 .. code-block:: llvm
109
110 !3 = !{!"llvm.loop.vectorize.followup_vectorized"}
111
112 The followup attributes of a transformation that cannot be applied will
113 never be added to a loop and are therefore effectively ignored. This means
114 that any followup-transformation in such attributes requires that its
115 prior transformations are applied before the followup-transformation.
116 The user should receive a warning about the first transformation in the
117 transformation chain that could not be applied if it is a forced
118 transformation. All following transformations are skipped.
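
The chain semantics described above can be modeled with a small C++ sketch. It is only an illustration under stated assumptions: ``Step``, ``ChainResult`` and ``runChain`` are hypothetical names that do not exist in LLVM, and whether a step is applicable is supplied by the caller instead of being decided by a real pass.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one link in a followup chain; not an LLVM type.
struct Step {
  std::string Name; // e.g. "vectorize"
  bool Applicable;  // whether the pass can carry out the transformation
  bool Forced;      // whether the transformation was explicitly requested
};

struct ChainResult {
  std::vector<std::string> Applied; // transformations that were carried out
  std::string Warning;              // first forced step that failed, or empty
};

// Walk the chain in order. The followup of a step only becomes visible once
// that step has been applied, so the first failure ends the chain.
ChainResult runChain(const std::vector<Step> &Chain) {
  ChainResult R;
  for (const Step &S : Chain) {
    if (!S.Applicable) {
      if (S.Forced)
        R.Warning = S.Name; // the user is warned about this step only
      break;                // all following transformations are skipped
    }
    R.Applied.push_back(S.Name);
  }
  return R;
}
```

In this model, a chain of vectorize, unroll, distribute in which unrolling fails reports only the unroll step; the distribute step is never considered because its attribute was never added to a loop.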
119
120 Pass-Specific Transformation Metadata
121 =====================================
122
123 Transformation options are specific to each transformation. In the
124 following, we present the model for each LLVM loop optimization pass and
125 the metadata to influence them.
126
127 Loop Vectorization and Interleaving
128 -----------------------------------
129
130 Loop vectorization and interleaving are interpreted as a single
131 transformation. It is interpreted as forced if
132 ``!{!"llvm.loop.vectorize.enable", i1 true}`` is set.
133
134 Assuming the pre-vectorization loop is
135
136 .. code-block:: c
137
138 for (int i = 0; i < n; i+=1) // original loop
139 Stmt(i);
140
141 then the code after vectorization will be approximately (assuming an
142 SIMD width of 4):
143
144 .. code-block:: c
145
146 int i = 0;
147 if (rtc) {
148 for (; i + 3 < n; i+=4) // vectorized/interleaved loop
149 Stmt(i:i+3);
150 }
151 for (; i < n; i+=1) // epilogue loop
152 Stmt(i);
153
154 where ``rtc`` is a generated runtime check.
155
156 ``llvm.loop.vectorize.followup_vectorized`` will set the attributes for
157 the vectorized loop. If not specified, ``llvm.loop.isvectorized`` is
158 combined with the original loop's attributes to avoid it being
159 vectorized multiple times.
160
161 ``llvm.loop.vectorize.followup_epilogue`` will set the attributes for
162 the remainder loop. If not specified, it will have the original loop's
163 attributes combined with ``llvm.loop.isvectorized`` and
164 ``llvm.loop.unroll.runtime.disable`` (unless the original loop already
165 has unroll metadata).
166
167 The attributes specified by ``llvm.loop.vectorize.followup_all`` are
168 added to both loops.
169
170 When using a follow-up attribute, it replaces any automatically deduced
171 attributes for the generated loop in question. Therefore it is
172 recommended to add ``llvm.loop.isvectorized`` to
173 ``llvm.loop.vectorize.followup_all`` which avoids that the loop
174 vectorizer tries to optimize the loops again.
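
To make the loop structure concrete, the following C++ sketch replays the vectorized/epilogue model above for a SIMD width of 4 and records which generated loop handled each iteration. It is a simulation, not vectorizer output: ``Handler`` and ``simulateVectorizedLoop`` are hypothetical names, and ``rtc`` is passed in by the caller as a stand-in for the generated runtime check.

```cpp
#include <cassert>
#include <vector>

// Labels for the generated loop that executed an iteration (illustrative).
enum class Handler { None, Vectorized, Epilogue };

// Mirrors the loop structure of the model for a SIMD width of 4.
std::vector<Handler> simulateVectorizedLoop(int n, bool rtc) {
  std::vector<Handler> HandledBy(n > 0 ? n : 0, Handler::None);
  int i = 0;
  if (rtc) {
    for (; i + 3 < n; i += 4)              // vectorized/interleaved loop
      for (int Lane = 0; Lane < 4; ++Lane) // Stmt(i:i+3)
        HandledBy[i + Lane] = Handler::Vectorized;
  }
  for (; i < n; i += 1)                    // epilogue loop
    HandledBy[i] = Handler::Epilogue;      // Stmt(i)
  return HandledBy;
}
```

With ``n = 10`` and a passing check, iterations 0 to 7 belong to the vectorized loop and iterations 8 and 9 to the epilogue. With a failing check the epilogue executes all iterations, which is why its followup attributes matter even when the vectorized loop exists.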
175
176 Loop Unrolling
177 --------------
178
179 Unrolling is interpreted as forced if any ``!{!"llvm.loop.unroll.enable"}``
180 metadata or option (``llvm.loop.unroll.count``, ``llvm.loop.unroll.full``)
181 is present. Unrolling can be full unrolling, partial unrolling of a loop
182 with constant trip count or runtime unrolling of a loop with a trip
183 count unknown at compile-time.
184
185 If the loop has been unrolled fully, there is no followup-loop. For
186 partial/runtime unrolling, the original loop of
187
188 .. code-block:: c
189
190 for (int i = 0; i < n; i+=1) // original loop
191 Stmt(i);
192
193 is transformed into (using an unroll factor of 4):
194
195 .. code-block:: c
196
197 int i = 0;
198 for (; i + 3 < n; i+=4) { // unrolled loop
199 Stmt(i);
200 Stmt(i+1);
201 Stmt(i+2);
202 Stmt(i+3);
203 }
204 for (; i < n; i+=1) // remainder loop
205 Stmt(i);
206
207 ``llvm.loop.unroll.followup_unrolled`` will set the loop attributes of
208 the unrolled loop. If not specified, the attributes of the original loop
209 without the ``llvm.loop.unroll.*`` attributes are copied and
210 ``llvm.loop.unroll.disable`` added to it.
211
212 ``llvm.loop.unroll.followup_remainder`` defines the attributes of the
213 remainder loop. If not specified the remainder loop will have no
214 attributes. The remainder loop might not be present due to being fully
215 unrolled in which case this attribute has no effect.
216
217 Attributes defined in ``llvm.loop.unroll.followup_all`` are added to the
218 unrolled and remainder loops.
219
220 To avoid that the partially unrolled loop is unrolled again, it is
221 recommended to add ``llvm.loop.unroll.disable`` to
222 ``llvm.loop.unroll.followup_all``. If no follow-up attribute is specified
223 for a generated loop, it is added automatically.
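
The partial-unrolling model can be checked with a short C++ sketch that runs the unrolled/remainder loop pair for an unroll factor of 4 and records every ``Stmt`` call. ``UnrollTrace`` and ``runUnrolledBy4`` are hypothetical names used only for this illustration; the trip counters show how the iterations are split between the two loops that receive followup attributes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical trace type for this sketch; not part of LLVM.
struct UnrollTrace {
  std::vector<int> Order; // argument of every Stmt call, in execution order
  int UnrolledTrips = 0;  // iterations of the unrolled loop
  int RemainderTrips = 0; // iterations of the remainder loop
};

// Runs the unrolled/remainder loop pair from the model (factor 4).
UnrollTrace runUnrolledBy4(int n) {
  UnrollTrace T;
  int i = 0;
  for (; i + 3 < n; i += 4) { // unrolled loop
    T.Order.push_back(i);     // Stmt(i)
    T.Order.push_back(i + 1); // Stmt(i+1)
    T.Order.push_back(i + 2); // Stmt(i+2)
    T.Order.push_back(i + 3); // Stmt(i+3)
    ++T.UnrolledTrips;
  }
  for (; i < n; i += 1) {     // remainder loop
    T.Order.push_back(i);     // Stmt(i)
    ++T.RemainderTrips;
  }
  return T;
}
```

For ``n = 10`` the unrolled loop runs twice and the remainder loop twice, and the statements execute in the original order 0 to 9. For ``n = 8`` the remainder loop executes no iterations.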
224
225 Unroll-And-Jam
226 --------------
227
228 Unroll-and-jam uses the following transformation model (here with an
229 unroll factor of 2). Currently, it does not support a fallback version
230 when the transformation is unsafe.
231
232 .. code-block:: c
233
234 for (int i = 0; i < n; i+=1) { // original outer loop
235 Fore(i);
236 for (int j = 0; j < m; j+=1) // original inner loop
237 SubLoop(i, j);
238 Aft(i);
239 }
240
241 .. code-block:: c
242
243 int i = 0;
244 for (; i + 1 < n; i+=2) { // unrolled outer loop
245 Fore(i);
246 Fore(i+1);
247 for (int j = 0; j < m; j+=1) { // unrolled inner loop
248 SubLoop(i, j);
249 SubLoop(i+1, j);
250 }
251 Aft(i);
252 Aft(i+1);
253 }
254 for (; i < n; i+=1) { // remainder outer loop
255 Fore(i);
256 for (int j = 0; j < m; j+=1) // remainder inner loop
257 SubLoop(i, j);
258 Aft(i);
259 }
260
261 ``llvm.loop.unroll_and_jam.followup_outer`` will set the loop attributes
262 of the unrolled outer loop. If not specified, the attributes of the
263 original outer loop without the ``llvm.loop.unroll.*`` attributes are
264 copied and ``llvm.loop.unroll.disable`` added to it.
265
266 ``llvm.loop.unroll_and_jam.followup_inner`` will set the loop attributes
267 of the unrolled inner loop. If not specified, the attributes of the
268 original inner loop are used unchanged.
269
270 ``llvm.loop.unroll_and_jam.followup_remainder_outer`` sets the loop
271 attributes of the outer remainder loop. If not specified it will not
272 have any attributes. The remainder loop might not be present due to
273 being fully unrolled.
274
275 ``llvm.loop.unroll_and_jam.followup_remainder_inner`` sets the loop
276 attributes of the inner remainder loop. If not specified it will have
277 the attributes of the original inner loop. If the outer remainder loop
278 is unrolled, the inner remainder loop might be present multiple times.
279
280 Attributes defined in ``llvm.loop.unroll_and_jam.followup_all`` are
281 added to all of the aforementioned output loops.
282
283 To avoid that the unrolled loop is unrolled again, it is
284 recommended to add ``llvm.loop.unroll.disable`` to
285 ``llvm.loop.unroll_and_jam.followup_all``. It suppresses unroll-and-jam
286 as well as an additional inner loop unrolling. If no follow-up
287 attribute is specified for a generated loop, it is added automatically.
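
As a sanity check of the unroll-and-jam model, the C++ sketch below implements both loop nests for one concrete, assumed choice of ``Fore``, ``SubLoop`` and ``Aft`` (clear an accumulator, add one matrix element, double the row sum). These bodies and the function names are hypothetical; they were picked because iterations of the outer loop are independent, which is the situation in which jamming is safe.

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<int>>; // assumed to be rectangular

// Original nest.
std::vector<int> rowSumsOriginal(const Matrix &M) {
  const int n = static_cast<int>(M.size());
  const int m = n > 0 ? static_cast<int>(M[0].size()) : 0;
  std::vector<int> Acc(n);
  for (int i = 0; i < n; i += 1) {   // original outer loop
    Acc[i] = 0;                      // Fore(i)
    for (int j = 0; j < m; j += 1)   // original inner loop
      Acc[i] += M[i][j];             // SubLoop(i, j)
    Acc[i] *= 2;                     // Aft(i)
  }
  return Acc;
}

// Unroll-and-jam with an unroll factor of 2, following the model.
std::vector<int> rowSumsJammed(const Matrix &M) {
  const int n = static_cast<int>(M.size());
  const int m = n > 0 ? static_cast<int>(M[0].size()) : 0;
  std::vector<int> Acc(n);
  int i = 0;
  for (; i + 1 < n; i += 2) {        // unrolled outer loop
    Acc[i] = 0;                      // Fore(i)
    Acc[i + 1] = 0;                  // Fore(i+1)
    for (int j = 0; j < m; j += 1) { // unrolled (jammed) inner loop
      Acc[i] += M[i][j];             // SubLoop(i, j)
      Acc[i + 1] += M[i + 1][j];     // SubLoop(i+1, j)
    }
    Acc[i] *= 2;                     // Aft(i)
    Acc[i + 1] *= 2;                 // Aft(i+1)
  }
  for (; i < n; i += 1) {            // remainder outer loop
    Acc[i] = 0;                      // Fore(i)
    for (int j = 0; j < m; j += 1)   // remainder inner loop
      Acc[i] += M[i][j];             // SubLoop(i, j)
    Acc[i] *= 2;                     // Aft(i)
  }
  return Acc;
}
```

With an odd number of rows, the last row is handled by the remainder nest, so both ``followup_remainder_outer`` and ``followup_remainder_inner`` would have a loop to apply to in this sketch.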
288
289 Loop Distribution
290 -----------------
291
292 The LoopDistribution pass tries to separate vectorizable parts of a loop
293 from the non-vectorizable part (which otherwise would make the entire
294 loop non-vectorizable). Conceptually, it transforms a loop such as
295
296 .. code-block:: c
297
298 for (int i = 1; i < n; i+=1) { // original loop
299 A[i] = i;
300 B[i] = 2 + B[i];
301 C[i] = 3 + C[i - 1];
302 }
303
304 into the following code:
305
306 .. code-block:: c
307
308 if (rtc) {
309 for (int i = 1; i < n; i+=1) // coincident loop
310 A[i] = i;
311 for (int i = 1; i < n; i+=1) // coincident loop
312 B[i] = 2 + B[i];
313 for (int i = 1; i < n; i+=1) // sequential loop
314 C[i] = 3 + C[i - 1];
315 } else {
316 for (int i = 1; i < n; i+=1) { // fallback loop
317 A[i] = i;
318 B[i] = 2 + B[i];
319 C[i] = 3 + C[i - 1];
320 }
321 }
322
323 where ``rtc`` is a generated runtime check.
324
325 ``llvm.loop.distribute.followup_coincident`` sets the loop attributes of
326 all loops without loop-carried dependencies (i.e. vectorizable loops).
327 There might be more than one such loop. If not defined, the loops will
328 inherit the original loop's attributes.
329
330 ``llvm.loop.distribute.followup_sequential`` sets the loop attributes of the
331 loop with potentially unsafe dependencies. There should be at most one
332 such loop. If not defined, the loop will inherit the original loop's
333 attributes.
334
335 ``llvm.loop.distribute.followup_fallback`` defines the loop attributes
336 for the fallback loop, which is a copy of the original loop for when
337 loop versioning is required. If undefined, the fallback loop inherits
338 all attributes from the original loop.
339
340 Attributes defined in ``llvm.loop.distribute.followup_all`` are added to
341 all of the aforementioned output loops.
342
343 It is recommended to add ``llvm.loop.disable_nonforced`` to
344 ``llvm.loop.distribute.followup_fallback``. This avoids that the
345 fallback version (which is likely never executed) is further optimized
346 which would increase the code size.
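
The distribution model can be exercised with the following C++ sketch, which implements the distributed loops and the fallback loop and lets the caller choose between them. It is an illustration under assumptions: ``Arrays`` and the function names are hypothetical, the three arrays are separate vectors (so the situation the runtime check guards against cannot occur), and ``rtc`` is a plain parameter instead of a generated check.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container for the three arrays of the example.
struct Arrays {
  std::vector<int> A, B, C; // all assumed to have the same length
};

// Fallback loop: a copy of the original loop.
Arrays runFallback(Arrays S) {
  assert(S.B.size() == S.A.size() && S.C.size() == S.A.size());
  const int n = static_cast<int>(S.A.size());
  for (int i = 1; i < n; i += 1) {
    S.A[i] = i;
    S.B[i] = 2 + S.B[i];
    S.C[i] = 3 + S.C[i - 1];
  }
  return S;
}

// Distributed version: two coincident loops and one sequential loop.
Arrays runDistributed(Arrays S) {
  assert(S.B.size() == S.A.size() && S.C.size() == S.A.size());
  const int n = static_cast<int>(S.A.size());
  for (int i = 1; i < n; i += 1) // coincident loop
    S.A[i] = i;
  for (int i = 1; i < n; i += 1) // coincident loop
    S.B[i] = 2 + S.B[i];
  for (int i = 1; i < n; i += 1) // sequential loop (loop-carried dependence)
    S.C[i] = 3 + S.C[i - 1];
  return S;
}

// Loop versioning: rtc selects between the two versions.
Arrays runVersioned(const Arrays &S, bool rtc) {
  return rtc ? runDistributed(S) : runFallback(S);
}
```

Both versions compute the same arrays here; only the loop over ``C`` keeps a dependence from one iteration to the next, which is why it is the loop that ``followup_sequential`` would apply to.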
347
348 Versioning LICM
349 ---------------
350
351 The pass hoists code out of loops that are only loop-invariant when
352 dynamic conditions apply. For instance, it transforms the loop
353
354 .. code-block:: c
355
356 for (int i = 0; i < n; i+=1) // original loop
357 A[i] = B[0];
358
359 into:
360
361 .. code-block:: c
362
363 if (rtc) {
364 auto b = B[0];
365 for (int i = 0; i < n; i+=1) // versioned loop
366 A[i] = b;
367 } else {
368 for (int i = 0; i < n; i+=1) // unversioned loop
369 A[i] = B[0];
370 }
371
372 The runtime condition (``rtc``) checks that the array ``A`` and the
373 element ``B[0]`` do not alias.
374
375 Currently, this transformation does not support followup-attributes.
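
The versioned structure can be sketched in C++ as follows. This is an approximation, not what the pass generates: ``fillVersioned`` is a hypothetical name, the runtime check is written by hand as a pointer-range test, and the sketch assumes ``n > 0`` so that the hoisted load of ``B[0]`` is valid.

```cpp
#include <cassert>
#include <functional>

// Returns true when the versioned (hoisted) loop ran and false when the
// unversioned copy ran.
bool fillVersioned(int *A, const int *B, int n) {
  assert(n > 0 && "sketch assumes at least one iteration");
  // Stand-in for the generated runtime check: B[0] lies outside A[0..n).
  // std::less gives a well-defined order even for unrelated pointers.
  std::less<const int *> Before;
  const bool rtc = Before(B, A) || !Before(B, A + n);
  if (rtc) {
    const int b = B[0];            // load hoisted out of the loop
    for (int i = 0; i < n; i += 1) // versioned loop
      A[i] = b;
  } else {
    for (int i = 0; i < n; i += 1) // unversioned loop
      A[i] = B[0];
  }
  return rtc;
}
```

Calling the function with two separate arrays takes the versioned path, while passing a pointer into the destination array itself (for example ``fillVersioned(buf, buf + 2, 4)``) falls back to the unversioned loop.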
376
377 Loop Interchange
378 ----------------
379
380 Currently, the ``LoopInterchange`` pass does not use any metadata.
381
382 Ambiguous Transformation Order
383 ==============================
384
385 If there are multiple transformations defined, the order in which they are
386 executed depends on the order in LLVM's pass pipeline, which is subject
387 to change. The default optimization pipeline (anything higher than
388 ``-O0``) has the following order.
389
390 When using the legacy pass manager:
391
392 - LoopInterchange (if enabled)
393 - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
394 - VersioningLICM (if enabled)
395 - LoopDistribute
396 - LoopVectorizer
397 - LoopUnrollAndJam (if enabled)
398 - LoopUnroll (partial and runtime unrolling)
399
400 When using the legacy pass manager with LTO:
401
402 - LoopInterchange (if enabled)
403 - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
404 - LoopVectorizer
405 - LoopUnroll (partial and runtime unrolling)
406
407 When using the new pass manager:
408
409 - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
410 - LoopDistribute
411 - LoopVectorizer
412 - LoopUnrollAndJam (if enabled)
413 - LoopUnroll (partial and runtime unrolling)
414
415 Leftover Transformations
416 ========================
417
418 Forced transformations that have not been applied after the last
419 transformation pass should be reported to the user. The transformation
420 passes themselves cannot be responsible for this reporting because they
421 might not be in the pipeline, there might be multiple passes able to
422 apply a transformation (e.g. ``LoopInterchange`` and Polly) or a
423 transformation attribute may be 'hidden' inside another pass's followup
424 attribute.
425
426 The pass ``-transform-warning`` (``WarnMissedTransformationsPass``)
427 emits such warnings. It should be placed after the last transformation
428 pass.
429
430 The current pass pipeline has a fixed order in which transformation
431 passes are executed. A transformation can be in the followup of a pass
432 that is executed later and thus leftover. For instance, a loop nest
433 cannot be distributed and then interchanged with the current pass
434 pipeline. The loop distribution will execute, but there is no loop
435 interchange pass following such that any loop interchange metadata will
436 be ignored. The ``-transform-warning`` should emit a warning in this
437 case.
438
439 Future versions of LLVM may fix this by executing transformations using
440 a dynamic ordering.
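
How a fixed pipeline produces leftover transformations can be modeled in a few lines of C++. The sketch is a simplification under stated assumptions: ``leftoverAfterPipeline`` is a hypothetical helper, passes and transformations are identified by plain strings, every pass runs exactly once, and a pass that finds its transformation requested always succeeds.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One sweep over a fixed pass pipeline. Chain lists the requested
// transformations in followup order; only Chain[Next] is visible to the
// passes, because later entries are still hidden in followup attributes.
std::vector<std::string>
leftoverAfterPipeline(const std::vector<std::string> &Pipeline,
                      const std::vector<std::string> &Chain) {
  std::size_t Next = 0;
  for (const std::string &Pass : Pipeline)
    if (Next < Chain.size() && Pass == Chain[Next])
      ++Next; // transformation applied; its followup becomes visible
  // Everything not reached is what a warning pass would have to report.
  return std::vector<std::string>(
      Chain.begin() + static_cast<std::ptrdiff_t>(Next), Chain.end());
}
```

With a pipeline of distribute, vectorize, unroll_and_jam, unroll, a chain that requests unroll-and-jam followed by distribution leaves the distribution unapplied, because the distribution pass has already run by the time its attribute becomes visible.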
291291 Statepoints
292292 MergeFunctions
293293 TypeMetadata
294 TransformMetadata
294295 FaultMaps
295296 MIRLangRef
296297 Coroutines
399399 void initializeVerifierLegacyPassPass(PassRegistry&);
400400 void initializeVirtRegMapPass(PassRegistry&);
401401 void initializeVirtRegRewriterPass(PassRegistry&);
402 void initializeWarnMissedTransformationsLegacyPass(PassRegistry &);
402403 void initializeWasmEHPreparePass(PassRegistry&);
403404 void initializeWholeProgramDevirtPass(PassRegistry&);
404405 void initializeWinEHPreparePass(PassRegistry&);
219219 (void) llvm::createFloat2IntPass();
220220 (void) llvm::createEliminateAvailableExternallyPass();
221221 (void) llvm::createScalarizeMaskedMemIntrinPass();
222 (void) llvm::createWarnMissedTransformationsPass();
222223
223224 (void)new llvm::IntervalPartition();
224225 (void)new llvm::ScalarEvolutionWrapperPass();
0 //===- WarnMissedTransforms.h -----------------------------------*- C++ -*-===//
1 //
2 // The LLVM Compiler Infrastructure
3 //
4 // This file is distributed under the University of Illinois Open Source
5 // License. See LICENSE.TXT for details.
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // Emit warnings if forced code transformations have not been performed.
10 //
11 //===----------------------------------------------------------------------===//
12
13 #ifndef LLVM_TRANSFORMS_SCALAR_WARNMISSEDTRANSFORMS_H
14 #define LLVM_TRANSFORMS_SCALAR_WARNMISSEDTRANSFORMS_H
15
16 #include "llvm/IR/PassManager.h"
17
18 namespace llvm {
19 class Function;
20 class Loop;
21 class LPMUpdater;
22
23 // New pass manager boilerplate.
24 class WarnMissedTransformationsPass
25 : public PassInfoMixin<WarnMissedTransformationsPass> {
26 public:
27 explicit WarnMissedTransformationsPass() {}
28
29 PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
30 };
31
32 // Legacy pass manager boilerplate.
33 Pass *createWarnMissedTransformationsPass();
34 void initializeWarnMissedTransformationsLegacyPass(PassRegistry &);
35 } // end namespace llvm
36
37 #endif // LLVM_TRANSFORMS_SCALAR_WARNMISSEDTRANSFORMS_H
483483 // primarily to help other loop passes.
484484 //
485485 Pass *createLoopSimplifyCFGPass();
486
487 //===----------------------------------------------------------------------===//
488 //
489 // WarnMissedTransformations - This pass emits warnings for leftover forced
490 // transformations.
491 //
492 Pass *createWarnMissedTransformationsPass();
486493 } // End llvm namespace
487494
488495 #endif
170170 Optional<const MDOperand *> findStringMetadataForLoop(Loop *TheLoop,
171171 StringRef Name);
172172
173 /// Find named metadata for a loop with an integer value.
174 llvm::Optional<int> getOptionalIntLoopAttribute(Loop *TheLoop, StringRef Name);
175
176 /// Create a new loop identifier for a loop created from a loop transformation.
177 ///
178 /// @param OrigLoopID The loop ID of the loop before the transformation.
179 /// @param FollowupAttrs List of attribute names that contain attributes to be
180 /// added to the new loop ID.
181 /// @param InheritOptionsAttrsPrefix Selects which attributes should be inherited
182 /// from the original loop. The following values
183 /// are considered:
184 /// nullptr : Inherit all attributes from @p OrigLoopID.
185 /// "" : Do not inherit any attribute from @p OrigLoopID; only use
186 /// those specified by a followup attribute.
187 /// "<prefix>": Inherit all attributes except those which start with
188 /// <prefix>; commonly used to remove metadata for the
189 /// applied transformation.
190 /// @param AlwaysNew If true, do not try to reuse OrigLoopID and never return
191 /// None.
192 ///
193 /// @return The loop ID for the after-transformation loop. The following values
194 /// can be returned:
195 /// None : No followup attribute was found; it is up to the
196 /// transformation to choose attributes that make sense.
197 /// @p OrigLoopID: The original identifier can be reused.
198 /// nullptr : The new loop has no attributes.
199 /// MDNode* : A new unique loop identifier.
200 Optional<MDNode *>
201 makeFollowupLoopID(MDNode *OrigLoopID, ArrayRef<StringRef> FollowupAttrs,
202 const char *InheritOptionsAttrsPrefix = "",
203 bool AlwaysNew = false);
204
205 /// Look for the loop attribute that disables all transformation heuristics.
206 bool hasDisableAllTransformsHint(const Loop *L);
207
208 /// The mode sets how eagerly a transformation should be applied.
209 enum TransformationMode {
210 /// The pass can use heuristics to determine whether a transformation should
211 /// be applied.
212 TM_Unspecified,
213
214 /// The transformation should be applied without considering a cost model.
215 TM_Enable,
216
217 /// The transformation should not be applied.
218 TM_Disable,
219
220 /// Force is a flag and should not be used alone.
221 TM_Force = 0x04,
222
223 /// The transformation was directed by the user, e.g. by a #pragma in
224 /// the source code. If the transformation could not be applied, a
225 /// warning should be emitted.
226 TM_ForcedByUser = TM_Enable | TM_Force,
227
228 /// The transformation must not be applied. For instance, `#pragma clang loop
229 /// unroll(disable)` explicitly forbids any unrolling to take place. Unlike
230 /// general loop metadata, it must not be dropped. Most passes should not
231 /// behave differently under TM_Disable and TM_SuppressedByUser.
232 TM_SuppressedByUser = TM_Disable | TM_Force
233 };
234
235 /// @{
236 /// Get the mode for LLVM's supported loop transformations.
237 TransformationMode hasUnrollTransformation(Loop *L);
238 TransformationMode hasUnrollAndJamTransformation(Loop *L);
239 TransformationMode hasVectorizeTransformation(Loop *L);
240 TransformationMode hasDistributeTransformation(Loop *L);
241 TransformationMode hasLICMVersioningTransformation(Loop *L);
242 /// @}
243
173244 /// Set input string into loop metadata by keeping other values intact.
174245 void addStringMetadataToLoop(Loop *TheLoop, const char *MDString,
175246 unsigned V = 0);
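As a rough illustration of how the TransformationMode values declared in this hunk combine: TM_Force is a flag bit, so a check such as `Mode & TM_Disable` catches both TM_Disable and TM_SuppressedByUser, while the warning pass compares against TM_ForcedByUser exactly. The sketch below copies the enumerator values from the diff; the helper names `isDisabled` and `shouldWarnIfLeftover` are hypothetical and not part of the patch.

```cpp
// Standalone sketch; the enum mirrors the enumerators shown in the diff.
enum TransformationMode {
  TM_Unspecified,                              // 0: heuristics decide
  TM_Enable,                                   // 1: apply without a cost model
  TM_Disable,                                  // 2: do not apply
  TM_Force = 0x04,                             // flag bit, not used alone
  TM_ForcedByUser = TM_Enable | TM_Force,      // 5
  TM_SuppressedByUser = TM_Disable | TM_Force  // 6
};

// Hypothetical helper: true for TM_Disable and TM_SuppressedByUser alike,
// because both carry the TM_Disable bit.
constexpr bool isDisabled(TransformationMode M) {
  return (M & TM_Disable) != 0;
}

// Hypothetical helper: only a user-forced transformation that is still
// pending is worth a warning; TM_Enable without the force bit is not.
constexpr bool shouldWarnIfLeftover(TransformationMode M) {
  return M == TM_ForcedByUser;
}
```

This matches the shape of the checks in the passes touched by the patch, for example `hasUnrollTransformation(L) & TM_Disable` in LoopUnroll and `== TM_ForcedByUser` in the warning pass.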
3434
3535 using NewLoopsMap = SmallDenseMap<const Loop *, Loop *, 4>;
3636
37 /// @{
38 /// Metadata attribute names
39 const char *const LLVMLoopUnrollFollowupAll = "llvm.loop.unroll.followup_all";
40 const char *const LLVMLoopUnrollFollowupUnrolled =
41 "llvm.loop.unroll.followup_unrolled";
42 const char *const LLVMLoopUnrollFollowupRemainder =
43 "llvm.loop.unroll.followup_remainder";
44 /// @}
45
3746 const Loop* addClonedBlockToLoopInfo(BasicBlock *OriginalBB,
3847 BasicBlock *ClonedBB, LoopInfo *LI,
3948 NewLoopsMap &NewLoops);
6069 unsigned PeelCount, bool UnrollRemainder,
6170 LoopInfo *LI, ScalarEvolution *SE,
6271 DominatorTree *DT, AssumptionCache *AC,
63 OptimizationRemarkEmitter *ORE, bool PreserveLCSSA);
72 OptimizationRemarkEmitter *ORE, bool PreserveLCSSA,
73 Loop **RemainderLoop = nullptr);
6474
6575 bool UnrollRuntimeLoopRemainder(Loop *L, unsigned Count,
6676 bool AllowExpensiveTripCount,
6777 bool UseEpilogRemainder, bool UnrollRemainder,
68 LoopInfo *LI,
69 ScalarEvolution *SE, DominatorTree *DT,
70 AssumptionCache *AC,
71 bool PreserveLCSSA);
78 LoopInfo *LI, ScalarEvolution *SE,
79 DominatorTree *DT, AssumptionCache *AC,
80 bool PreserveLCSSA,
81 Loop **ResultLoop = nullptr);
7282
7383 void computePeelCount(Loop *L, unsigned LoopSize,
7484 TargetTransformInfo::UnrollingPreferences &UP,
8393 unsigned TripMultiple, bool UnrollRemainder,
8494 LoopInfo *LI, ScalarEvolution *SE,
8595 DominatorTree *DT, AssumptionCache *AC,
86 OptimizationRemarkEmitter *ORE);
96 OptimizationRemarkEmitter *ORE,
97 Loop **EpilogueLoop = nullptr);
8798
8899 bool isSafeToUnrollAndJam(Loop *L, ScalarEvolution &SE, DominatorTree &DT,
89100 DependenceInfo &DI);
112112 unsigned getWidth() const { return Width.Value; }
113113 unsigned getInterleave() const { return Interleave.Value; }
114114 unsigned getIsVectorized() const { return IsVectorized.Value; }
115 enum ForceKind getForce() const { return (ForceKind)Force.Value; }
115 enum ForceKind getForce() const {
116 if (Force.Value == FK_Undefined && hasDisableAllTransformsHint(TheLoop))
117 return FK_Disabled;
118 return (ForceKind)Force.Value;
119 }
116120
117121 /// If hints are provided that force vectorization, use the AlwaysPrint
118122 /// pass name to force the frontend to print the diagnostic.
236236 }
237237
238238 void Loop::setLoopID(MDNode *LoopID) const {
239 assert(LoopID && "Loop ID should not be null");
240 assert(LoopID->getNumOperands() > 0 && "Loop ID needs at least one operand");
241 assert(LoopID->getOperand(0) == LoopID && "Loop ID should refer to itself");
242
243 if (BasicBlock *Latch = getLoopLatch()) {
244 Latch->getTerminator()->setMetadata(LLVMContext::MD_loop, LoopID);
245 return;
246 }
247
248 assert(!getLoopLatch() &&
249 "The loop should have no single latch at this point");
239 assert((!LoopID || LoopID->getNumOperands() > 0) &&
240 "Loop ID needs at least one operand");
241 assert((!LoopID || LoopID->getOperand(0) == LoopID) &&
242 "Loop ID should refer to itself");
243
250244 BasicBlock *H = getHeader();
251245 for (BasicBlock *BB : this->blocks()) {
252246 Instruction *TI = BB->getTerminator();
253247 for (BasicBlock *Successor : successors(TI)) {
254 if (Successor == H)
248 if (Successor == H) {
255249 TI->setMetadata(LLVMContext::MD_loop, LoopID);
250 break;
251 }
256252 }
257253 }
258254 }
147147 #include "llvm/Transforms/Scalar/SpeculateAroundPHIs.h"
148148 #include "llvm/Transforms/Scalar/SpeculativeExecution.h"
149149 #include "llvm/Transforms/Scalar/TailRecursionElimination.h"
150 #include "llvm/Transforms/Scalar/WarnMissedTransforms.h"
150151 #include "llvm/Transforms/Utils/AddDiscriminators.h"
151152 #include "llvm/Transforms/Utils/BreakCriticalEdges.h"
152153 #include "llvm/Transforms/Utils/EntryExitInstrumenter.h"
834835 createFunctionToLoopPassAdaptor(LoopUnrollAndJamPass(Level)));
835836 }
836837 OptimizePM.addPass(LoopUnrollPass(LoopUnrollOptions(Level)));
838 OptimizePM.addPass(WarnMissedTransformationsPass());
837839 OptimizePM.addPass(InstCombinePass());
838840 OptimizePM.addPass(RequireAnalysisPass<OptimizationRemarkEmitterAnalysis, Function>());
839841 OptimizePM.addPass(createFunctionToLoopPassAdaptor(LICMPass(), DebugLogging));
229229 FUNCTION_PASS("verify", RegionInfoVerifierPass())
230230 FUNCTION_PASS("view-cfg", CFGViewerPass())
231231 FUNCTION_PASS("view-cfg-only", CFGOnlyViewerPass())
232 FUNCTION_PASS("transform-warning", WarnMissedTransformationsPass())
232233 #undef FUNCTION_PASS
233234
234235 #ifndef LOOP_ANALYSIS
701701 MPM.add(createLICMPass());
702702 }
703703
704 MPM.add(createWarnMissedTransformationsPass());
705
704706 // After vectorization and unrolling, assume intrinsics may tell us more
705707 // about pointer alignments.
706708 MPM.add(createAlignmentFromAssumptionsPass());
875877 // The vectorizer may have significantly shortened a loop body; unroll again.
876878 if (!DisableUnrollLoops)
877879 PM.add(createLoopUnrollPass(OptLevel));
880
881 PM.add(createWarnMissedTransformationsPass());
878882
879883 // Now that we've optimized loops (in particular loop induction variables),
880884 // we may have exposed more scalar opportunities. Run parts of the scalar
6868 StraightLineStrengthReduce.cpp
6969 StructurizeCFG.cpp
7070 TailRecursionElimination.cpp
71 WarnMissedTransforms.cpp
7172
7273 ADDITIONAL_HEADER_DIRS
7374 ${LLVM_MAIN_INCLUDE_DIR}/llvm/Transforms
7777 #define LDIST_NAME "loop-distribute"
7878 #define DEBUG_TYPE LDIST_NAME
7979
80 /// @{
81 /// Metadata attribute names
82 static const char *const LLVMLoopDistributeFollowupAll =
83 "llvm.loop.distribute.followup_all";
84 static const char *const LLVMLoopDistributeFollowupCoincident =
85 "llvm.loop.distribute.followup_coincident";
86 static const char *const LLVMLoopDistributeFollowupSequential =
87 "llvm.loop.distribute.followup_sequential";
88 static const char *const LLVMLoopDistributeFollowupFallback =
89 "llvm.loop.distribute.followup_fallback";
90 /// @}
91
8092 static cl::opt<bool>
8193 LDistVerify("loop-distribute-verify", cl::Hidden,
8294 cl::desc("Turn on DominatorTree and LoopInfo verification "
185197 /// Returns the loop where this partition ends up after distribution.
186198 /// If this partition is mapped to the original loop then use the block from
187199 /// the loop.
188 const Loop *getDistributedLoop() const {
200 Loop *getDistributedLoop() const {
189201 return ClonedLoop ? ClonedLoop : OrigLoop;
190202 }
191203
442454 assert(&*OrigPH->begin() == OrigPH->getTerminator() &&
443455 "preheader not empty");
444456
457 // Preserve the original loop ID for use after the transformation.
458 MDNode *OrigLoopID = L->getLoopID();
459
445460 // Create a loop for each partition except the last. Clone the original
446461 // loop before PH along with adding a preheader for the cloned loop. Then
447462 // update PH to point to the newly added preheader.
456471
457472 Part->getVMap()[ExitBlock] = TopPH;
458473 Part->remapInstructions();
474 setNewLoopID(OrigLoopID, Part);
459475 }
460476 Pred->getTerminator()->replaceUsesOfWith(OrigPH, TopPH);
477
478 // Also set a new loop ID for the last loop.
479 setNewLoopID(OrigLoopID, &PartitionContainer.back());
461480
462481 // Now go in forward order and update the immediate dominator for the
463482 // preheaders with the exiting block of the previous loop. Dominance
572591 PrevMatch = nullptr;
573592 ++I;
574593 }
594 }
595 }
596
597 /// Assign new LoopIDs for the partition's cloned loop.
598 void setNewLoopID(MDNode *OrigLoopID, InstPartition *Part) {
599 Optional<MDNode *> PartitionID = makeFollowupLoopID(
600 OrigLoopID,
601 {LLVMLoopDistributeFollowupAll,
602 Part->hasDepCycle() ? LLVMLoopDistributeFollowupSequential
603 : LLVMLoopDistributeFollowupCoincident});
604 if (PartitionID.hasValue()) {
605 Loop *NewLoop = Part->getDistributedLoop();
606 NewLoop->setLoopID(PartitionID.getValue());
575607 }
576608 }
577609 };
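The followup selection in setNewLoopID above can be sketched in isolation. This is a simplified model, not LLVM code: it assumes the default arguments (inherit nothing from the original loop ID, AlwaysNew false) and replaces MDNode with a hypothetical `LoopAttrs` map from attribute name to the attribute names nested inside it. The functions `collectFollowups` and `distributeFollowupNames` are likewise made up for illustration.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for loop metadata: attribute name -> nested attribute
// names (non-empty only for followup attributes).
using LoopAttrs = std::map<std::string, std::vector<std::string>>;

// Models the default case only: gather the payload of every requested
// followup attribute that is present. An empty optional means no followup
// attribute was found, so the pass picks attributes on its own.
std::optional<std::vector<std::string>>
collectFollowups(const LoopAttrs &Orig,
                 const std::vector<std::string> &FollowupNames) {
  bool HasAnyFollowup = false;
  std::vector<std::string> Result;
  for (const std::string &Name : FollowupNames) {
    auto It = Orig.find(Name);
    if (It == Orig.end())
      continue;
    HasAnyFollowup = true;
    Result.insert(Result.end(), It->second.begin(), It->second.end());
  }
  if (!HasAnyFollowup)
    return std::nullopt;
  return Result;
}

// Mirrors the name choice in setNewLoopID: followup_all always applies, plus
// the sequential variant for partitions with a dependence cycle and the
// coincident variant otherwise.
std::vector<std::string> distributeFollowupNames(bool HasDepCycle) {
  std::vector<std::string> Names;
  Names.push_back("llvm.loop.distribute.followup_all");
  Names.push_back(HasDepCycle ? "llvm.loop.distribute.followup_sequential"
                              : "llvm.loop.distribute.followup_coincident");
  return Names;
}
```

With a loop that carries only a `followup_coincident` attribute, a partition without a dependence cycle receives the nested attributes, while a partition with a cycle gets no value and keeps whatever the pass assigns by default.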
742774 return fail("TooManySCEVRuntimeChecks",
743775 "too many SCEV run-time checks needed.\n");
744776
777 if (!IsForced.getValueOr(false) && hasDisableAllTransformsHint(L))
778 return fail("HeuristicDisabled", "distribution heuristic disabled");
779
745780 LLVM_DEBUG(dbgs() << "\nDistributing loop: " << *L << "\n");
746781 // We're done forming the partitions set up the reverse mapping from
747782 // instructions to partitions.
761796 RtPtrChecking);
762797
763798 if (!Pred.isAlwaysTrue() || !Checks.empty()) {
799 MDNode *OrigLoopID = L->getLoopID();
800
764801 LLVM_DEBUG(dbgs() << "\nPointers:\n");
765802 LLVM_DEBUG(LAI->getRuntimePointerChecking()->printChecks(dbgs(), Checks));
766803 LoopVersioning LVer(*LAI, L, LI, DT, SE, false);
768805 LVer.setSCEVChecks(LAI->getPSE().getUnionPredicate());
769806 LVer.versionLoop(DefsUsedOutside);
770807 LVer.annotateLoopWithNoAlias();
808
809 // The unversioned loop will not be changed, so we inherit all attributes
810 // from the original loop, but remove the loop distribution metadata to
811 // avoid distributing it again.
812 MDNode *UnversionedLoopID =
813 makeFollowupLoopID(OrigLoopID,
814 {LLVMLoopDistributeFollowupAll,
815 LLVMLoopDistributeFollowupFallback},
816 "llvm.loop.distribute.", true)
817 .getValue();
818 LVer.getNonVersionedLoop()->setLoopID(UnversionedLoopID);
771819 }
772820
773821 // Create identical copies of the original loop for each partition and hook
5555
5656 #define DEBUG_TYPE "loop-unroll-and-jam"
5757
58 /// @{
59 /// Metadata attribute names
60 static const char *const LLVMLoopUnrollAndJamFollowupAll =
61 "llvm.loop.unroll_and_jam.followup_all";
62 static const char *const LLVMLoopUnrollAndJamFollowupInner =
63 "llvm.loop.unroll_and_jam.followup_inner";
64 static const char *const LLVMLoopUnrollAndJamFollowupOuter =
65 "llvm.loop.unroll_and_jam.followup_outer";
66 static const char *const LLVMLoopUnrollAndJamFollowupRemainderInner =
67 "llvm.loop.unroll_and_jam.followup_remainder_inner";
68 static const char *const LLVMLoopUnrollAndJamFollowupRemainderOuter =
69 "llvm.loop.unroll_and_jam.followup_remainder_outer";
70 /// @}
71
5872 static cl::opt<bool>
5973 AllowUnrollAndJam("allow-unroll-and-jam", cl::Hidden,
6074 cl::desc("Allows loops to be unroll-and-jammed."));
109123 // Returns true if the loop has an unroll_and_jam(enable) pragma.
110124 static bool HasUnrollAndJamEnablePragma(const Loop *L) {
111125 return GetUnrollMetadataForLoop(L, "llvm.loop.unroll_and_jam.enable");
112 }
113
114 // Returns true if the loop has an unroll_and_jam(disable) pragma.
115 static bool HasUnrollAndJamDisablePragma(const Loop *L) {
116 return GetUnrollMetadataForLoop(L, "llvm.loop.unroll_and_jam.disable");
117126 }
118127
119128 // If loop has an unroll_and_jam_count pragma return the (necessarily
298307 << L->getHeader()->getParent()->getName() << "] Loop %"
299308 << L->getHeader()->getName() << "\n");
300309
310 TransformationMode EnableMode = hasUnrollAndJamTransformation(L);
311 if (EnableMode & TM_Disable)
312 return LoopUnrollResult::Unmodified;
313
301314 // A loop with any unroll pragma (enabling/disabling/count/etc) is left for
302315 // the unroller, so long as it does not explicitly have unroll_and_jam
303316 // metadata. This means #pragma nounroll will disable unroll and jam as well
304317 // as unrolling
305 if (HasUnrollAndJamDisablePragma(L) ||
306 (HasAnyUnrollPragma(L, "llvm.loop.unroll.") &&
307 !HasAnyUnrollPragma(L, "llvm.loop.unroll_and_jam."))) {
318 if (HasAnyUnrollPragma(L, "llvm.loop.unroll.") &&
319 !HasAnyUnrollPragma(L, "llvm.loop.unroll_and_jam.")) {
308320 LLVM_DEBUG(dbgs() << " Disabled due to pragma.\n");
309321 return LoopUnrollResult::Unmodified;
310322 }
343355 return LoopUnrollResult::Unmodified;
344356 }
345357
358 // Save original loop IDs for after the transformation.
359 MDNode *OrigOuterLoopID = L->getLoopID();
360 MDNode *OrigSubLoopID = SubLoop->getLoopID();
361
362 // To assign the loop id of the epilogue, assign it before unrolling it so it
363 // is applied to every inner loop of the epilogue. We later apply the loop ID
364 // for the jammed inner loop.
365 Optional<MDNode *> NewInnerEpilogueLoopID = makeFollowupLoopID(
366 OrigOuterLoopID, {LLVMLoopUnrollAndJamFollowupAll,
367 LLVMLoopUnrollAndJamFollowupRemainderInner});
368 if (NewInnerEpilogueLoopID.hasValue())
369 SubLoop->setLoopID(NewInnerEpilogueLoopID.getValue());
370
346371 // Find trip count and trip multiple
347372 unsigned OuterTripCount = SE.getSmallConstantTripCount(L, Latch);
348373 unsigned OuterTripMultiple = SE.getSmallConstantTripMultiple(L, Latch);
358383 if (OuterTripCount && UP.Count > OuterTripCount)
359384 UP.Count = OuterTripCount;
360385
361 LoopUnrollResult UnrollResult =
362 UnrollAndJamLoop(L, UP.Count, OuterTripCount, OuterTripMultiple,
363 UP.UnrollRemainder, LI, &SE, &DT, &AC, &ORE);
386 Loop *EpilogueOuterLoop = nullptr;
387 LoopUnrollResult UnrollResult = UnrollAndJamLoop(
388 L, UP.Count, OuterTripCount, OuterTripMultiple, UP.UnrollRemainder, LI,
389 &SE, &DT, &AC, &ORE, &EpilogueOuterLoop);
390
391 // Assign new loop attributes.
392 if (EpilogueOuterLoop) {
393 Optional<MDNode *> NewOuterEpilogueLoopID = makeFollowupLoopID(
394 OrigOuterLoopID, {LLVMLoopUnrollAndJamFollowupAll,
395 LLVMLoopUnrollAndJamFollowupRemainderOuter});
396 if (NewOuterEpilogueLoopID.hasValue())
397 EpilogueOuterLoop->setLoopID(NewOuterEpilogueLoopID.getValue());
398 }
399
400 Optional<MDNode *> NewInnerLoopID =
401 makeFollowupLoopID(OrigOuterLoopID, {LLVMLoopUnrollAndJamFollowupAll,
402 LLVMLoopUnrollAndJamFollowupInner});
403 if (NewInnerLoopID.hasValue())
404 SubLoop->setLoopID(NewInnerLoopID.getValue());
405 else
406 SubLoop->setLoopID(OrigSubLoopID);
407
408 if (UnrollResult == LoopUnrollResult::PartiallyUnrolled) {
409 Optional<MDNode *> NewOuterLoopID = makeFollowupLoopID(
410 OrigOuterLoopID,
411 {LLVMLoopUnrollAndJamFollowupAll, LLVMLoopUnrollAndJamFollowupOuter});
412 if (NewOuterLoopID.hasValue()) {
413 L->setLoopID(NewOuterLoopID.getValue());
414
415 // Do not setLoopAlreadyUnrolled if a followup was given.
416 return UnrollResult;
417 }
418 }
364419
365420 // If loop has an unroll count pragma or unrolled by explicitly set count
366421 // mark loop as unrolled to prevent unrolling beyond that requested.
660660 return GetUnrollMetadataForLoop(L, "llvm.loop.unroll.enable");
661661 }
662662
663 // Returns true if the loop has an unroll(disable) pragma.
664 static bool HasUnrollDisablePragma(const Loop *L) {
665 return GetUnrollMetadataForLoop(L, "llvm.loop.unroll.disable");
666 }
667
668663 // Returns true if the loop has an runtime unroll(disable) pragma.
669664 static bool HasRuntimeUnrollDisablePragma(const Loop *L) {
670665 return GetUnrollMetadataForLoop(L, "llvm.loop.unroll.runtime.disable");
712707
713708 // Returns true if unroll count was set explicitly.
714709 // Calculates unroll count and writes it to UP.Count.
710 // Unless IgnoreUser is true, will also use metadata and command-line options
711 // that are specific to the LoopUnroll pass (which, for instance, are
712 // irrelevant for the LoopUnrollAndJam pass).
713 // FIXME: This function is used by LoopUnroll and LoopUnrollAndJam, but consumes
714 // many LoopUnroll-specific options. The shared functionality should be
715 // refactored into its own function.
715716 bool llvm::computeUnrollCount(
716717 Loop *L, const TargetTransformInfo &TTI, DominatorTree &DT, LoopInfo *LI,
717718 ScalarEvolution &SE, const SmallPtrSetImpl<const Value *> &EphValues,
718719 OptimizationRemarkEmitter *ORE, unsigned &TripCount, unsigned MaxTripCount,
719720 unsigned &TripMultiple, unsigned LoopSize,
720721 TargetTransformInfo::UnrollingPreferences &UP, bool &UseUpperBound) {
722
721723 // Check for explicit Count.
722724 // 1st priority is unroll count set by "unroll-count" option.
723725 bool UserUnrollCount = UnrollCount.getNumOccurrences() > 0;
968970 LLVM_DEBUG(dbgs() << "Loop Unroll: F["
969971 << L->getHeader()->getParent()->getName() << "] Loop %"
970972 << L->getHeader()->getName() << "\n");
971 if (HasUnrollDisablePragma(L))
973 if (hasUnrollTransformation(L) & TM_Disable)
972974 return LoopUnrollResult::Unmodified;
973975 if (!L->isLoopSimplifyForm()) {
974976 LLVM_DEBUG(
10651067 if (TripCount && UP.Count > TripCount)
10661068 UP.Count = TripCount;
10671069
1070 // Save loop properties before it is transformed.
1071 MDNode *OrigLoopID = L->getLoopID();
1072
10681073 // Unroll the loop.
1074 Loop *RemainderLoop = nullptr;
10691075 LoopUnrollResult UnrollResult = UnrollLoop(
10701076 L, UP.Count, TripCount, UP.Force, UP.Runtime, UP.AllowExpensiveTripCount,
10711077 UseUpperBound, MaxOrZero, TripMultiple, UP.PeelCount, UP.UnrollRemainder,
1072 LI, &SE, &DT, &AC, &ORE, PreserveLCSSA);
1078 LI, &SE, &DT, &AC, &ORE, PreserveLCSSA, &RemainderLoop);
10731079 if (UnrollResult == LoopUnrollResult::Unmodified)
10741080 return LoopUnrollResult::Unmodified;
1081
1082 if (RemainderLoop) {
1083 Optional<MDNode *> RemainderLoopID =
1084 makeFollowupLoopID(OrigLoopID, {LLVMLoopUnrollFollowupAll,
1085 LLVMLoopUnrollFollowupRemainder});
1086 if (RemainderLoopID.hasValue())
1087 RemainderLoop->setLoopID(RemainderLoopID.getValue());
1088 }
1089
1090 if (UnrollResult != LoopUnrollResult::FullyUnrolled) {
1091 Optional<MDNode *> NewLoopID =
1092 makeFollowupLoopID(OrigLoopID, {LLVMLoopUnrollFollowupAll,
1093 LLVMLoopUnrollFollowupUnrolled});
1094 if (NewLoopID.hasValue()) {
1095 L->setLoopID(NewLoopID.getValue());
1096
1097 // Do not setLoopAlreadyUnrolled if loop attributes have been specified
1098 // explicitly.
1099 return UnrollResult;
1100 }
1101 }
10751102
10761103 // If loop has an unroll count pragma or unrolled by explicitly set count
10771104 // mark loop as unrolled to prevent unrolling beyond that requested.
593593
594594 if (skipLoop(L))
595595 return false;
596
597 // Do not do the transformation if disabled by metadata.
598 if (hasLICMVersioningTransformation(L) & TM_Disable)
599 return false;
600
596601 // Get Analysis information.
597602 AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();
598603 SE = &getAnalysis<ScalarEvolutionWrapperPass>().getSE();
7474 initializeLoopUnrollPass(Registry);
7575 initializeLoopUnrollAndJamPass(Registry);
7676 initializeLoopUnswitchPass(Registry);
77 initializeWarnMissedTransformationsLegacyPass(Registry);
7778 initializeLoopVersioningLICMPass(Registry);
7879 initializeLoopIdiomRecognizeLegacyPassPass(Registry);
7980 initializeLowerAtomicLegacyPassPass(Registry);
0 //===- LoopTransformWarning.cpp - ----------------------------------------===//
1 //
2 // The LLVM Compiler Infrastructure
3 //
4 // This file is distributed under the University of Illinois Open Source
5 // License. See LICENSE.TXT for details.
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // Emit warnings if forced code transformations have not been performed.
10 //
11 //===----------------------------------------------------------------------===//
12
13 #include "llvm/Transforms/Scalar/WarnMissedTransforms.h"
14 #include "llvm/Analysis/OptimizationRemarkEmitter.h"
15 #include "llvm/Transforms/Utils/LoopUtils.h"
16
17 using namespace llvm;
18
19 #define DEBUG_TYPE "transform-warning"
20
21 /// Emit warnings for forced (i.e. user-defined) loop transformations which have
22 /// still not been performed.
23 static void warnAboutLeftoverTransformations(Loop *L,
24 OptimizationRemarkEmitter *ORE) {
25 if (hasUnrollTransformation(L) == TM_ForcedByUser) {
26 LLVM_DEBUG(dbgs() << "Leftover unroll transformation\n");
27 ORE->emit(
28 DiagnosticInfoOptimizationFailure(DEBUG_TYPE,
29 "FailedRequestedUnrolling",
30 L->getStartLoc(), L->getHeader())
31 << "loop not unrolled: the optimizer was unable to perform the "
32 "requested transformation; the transformation might be disabled or "
33 "specified as part of an unsupported transformation ordering");
34 }
35
36 if (hasUnrollAndJamTransformation(L) == TM_ForcedByUser) {
37 LLVM_DEBUG(dbgs() << "Leftover unroll-and-jam transformation\n");
38 ORE->emit(
39 DiagnosticInfoOptimizationFailure(DEBUG_TYPE,
40 "FailedRequestedUnrollAndJamming",
41 L->getStartLoc(), L->getHeader())
42 << "loop not unroll-and-jammed: the optimizer was unable to perform "
43 "the requested transformation; the transformation might be disabled "
44 "or specified as part of an unsupported transformation ordering");
45 }
46
47 if (hasVectorizeTransformation(L) == TM_ForcedByUser) {
48 LLVM_DEBUG(dbgs() << "Leftover vectorization transformation\n");
49 Optional<int> VectorizeWidth =
50 getOptionalIntLoopAttribute(L, "llvm.loop.vectorize.width");
51 Optional<int> InterleaveCount =
52 getOptionalIntLoopAttribute(L, "llvm.loop.interleave.count");
53
54 if (VectorizeWidth.getValueOr(0) != 1)
55 ORE->emit(
56 DiagnosticInfoOptimizationFailure(DEBUG_TYPE,
57 "FailedRequestedVectorization",
58 L->getStartLoc(), L->getHeader())
59 << "loop not vectorized: the optimizer was unable to perform the "
60 "requested transformation; the transformation might be disabled "
61 "or specified as part of an unsupported transformation ordering");
62 else if (InterleaveCount.getValueOr(0) != 1)
63 ORE->emit(
64 DiagnosticInfoOptimizationFailure(DEBUG_TYPE,
65 "FailedRequestedInterleaving",
66 L->getStartLoc(), L->getHeader())
67 << "loop not interleaved: the optimizer was unable to perform the "
68 "requested transformation; the transformation might be disabled "
69 "or specified as part of an unsupported transformation ordering");
70 }
71
72 if (hasDistributeTransformation(L) == TM_ForcedByUser) {
73 LLVM_DEBUG(dbgs() << "Leftover distribute transformation\n");
74 ORE->emit(
75 DiagnosticInfoOptimizationFailure(DEBUG_TYPE,
76 "FailedRequestedDistribution",
77 L->getStartLoc(), L->getHeader())
78 << "loop not distributed: the optimizer was unable to perform the "
79 "requested transformation; the transformation might be disabled or "
80 "specified as part of an unsupported transformation ordering");
81 }
82 }
83
84 static void warnAboutLeftoverTransformations(Function *F, LoopInfo *LI,
85 OptimizationRemarkEmitter *ORE) {
86 for (auto *L : LI->getLoopsInPreorder())
87 warnAboutLeftoverTransformations(L, ORE);
88 }
89
90 // New pass manager boilerplate
91 PreservedAnalyses
92 WarnMissedTransformationsPass::run(Function &F, FunctionAnalysisManager &AM) {
93 auto &ORE = AM.getResult<OptimizationRemarkEmitterAnalysis>(F);
94 auto &LI = AM.getResult<LoopAnalysis>(F);
95
96 warnAboutLeftoverTransformations(&F, &LI, &ORE);
97
98 return PreservedAnalyses::all();
99 }
100
101 // Legacy pass manager boilerplate
102 namespace {
103 class WarnMissedTransformationsLegacy : public FunctionPass {
104 public:
105 static char ID;
106
107 explicit WarnMissedTransformationsLegacy() : FunctionPass(ID) {
108 initializeWarnMissedTransformationsLegacyPass(
109 *PassRegistry::getPassRegistry());
110 }
111
112 bool runOnFunction(Function &F) override {
113 if (skipFunction(F))
114 return false;
115
116 auto &ORE = getAnalysis<OptimizationRemarkEmitterWrapperPass>().getORE();
117 auto &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
118
119 warnAboutLeftoverTransformations(&F, &LI, &ORE);
120 return false;
121 }
122
123 void getAnalysisUsage(AnalysisUsage &AU) const override {
124 AU.addRequired<OptimizationRemarkEmitterWrapperPass>();
125 AU.addRequired<LoopInfoWrapperPass>();
126
127 AU.setPreservesAll();
128 }
129 };
130 } // end anonymous namespace
131
132 char WarnMissedTransformationsLegacy::ID = 0;
133
134 INITIALIZE_PASS_BEGIN(WarnMissedTransformationsLegacy, "transform-warning",
135 "Warn about non-applied transformations", false, false)
136 INITIALIZE_PASS_DEPENDENCY(LoopInfoWrapperPass)
137 INITIALIZE_PASS_DEPENDENCY(OptimizationRemarkEmitterWrapperPass)
138 INITIALIZE_PASS_END(WarnMissedTransformationsLegacy, "transform-warning",
139 "Warn about non-applied transformations", false, false)
140
141 Pass *llvm::createWarnMissedTransformationsPass() {
142 return new WarnMissedTransformationsLegacy();
143 }
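The logic of the warning pass above can be modeled without any LLVM types. The sketch below assumes, as the pass's own comments suggest, that a loop still reporting TM_ForcedByUser for a transformation at this point in the pipeline never had that transformation applied. `LoopModel` and `leftoverWarnings` are hypothetical names, the returned strings are only the leading part of the real diagnostics, and the vectorize width / interleave count distinction made by the real pass is deliberately left out.

```cpp
#include <string>
#include <vector>

// Same enumerator values as TransformationMode in the diff.
enum TransformationMode {
  TM_Unspecified,
  TM_Enable,
  TM_Disable,
  TM_Force = 0x04,
  TM_ForcedByUser = TM_Enable | TM_Force,
  TM_SuppressedByUser = TM_Disable | TM_Force
};

// Hypothetical stand-in for a loop: one pending mode per transformation,
// i.e. what the has*Transformation(L) queries would report.
struct LoopModel {
  TransformationMode Unroll = TM_Unspecified;
  TransformationMode UnrollAndJam = TM_Unspecified;
  TransformationMode Vectorize = TM_Unspecified;
  TransformationMode Distribute = TM_Unspecified;
};

// Collect one message per transformation that is still forced by the user.
std::vector<std::string> leftoverWarnings(const LoopModel &L) {
  std::vector<std::string> Warnings;
  if (L.Unroll == TM_ForcedByUser)
    Warnings.push_back("loop not unrolled");
  if (L.UnrollAndJam == TM_ForcedByUser)
    Warnings.push_back("loop not unroll-and-jammed");
  if (L.Vectorize == TM_ForcedByUser)
    Warnings.push_back("loop not vectorized");
  if (L.Distribute == TM_ForcedByUser)
    Warnings.push_back("loop not distributed");
  return Warnings;
}
```

A loop with a merely enabled (not forced) transformation produces no warning in this model, matching the exact `== TM_ForcedByUser` comparisons in the pass.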
328328 ///
329329 /// This utility preserves LoopInfo. It will also preserve ScalarEvolution and
330330 /// DominatorTree if they are non-null.
331 ///
332 /// If RemainderLoop is non-null, it will receive the remainder loop (if
333 /// required and not fully unrolled).
331334 LoopUnrollResult llvm::UnrollLoop(
332335 Loop *L, unsigned Count, unsigned TripCount, bool Force, bool AllowRuntime,
333336 bool AllowExpensiveTripCount, bool PreserveCondBr, bool PreserveOnlyFirst,
334337 unsigned TripMultiple, unsigned PeelCount, bool UnrollRemainder,
335338 LoopInfo *LI, ScalarEvolution *SE, DominatorTree *DT, AssumptionCache *AC,
336 OptimizationRemarkEmitter *ORE, bool PreserveLCSSA) {
339 OptimizationRemarkEmitter *ORE, bool PreserveLCSSA, Loop **RemainderLoop) {
337340
338341 BasicBlock *Preheader = L->getLoopPreheader();
339342 if (!Preheader) {
467470 if (RuntimeTripCount && TripMultiple % Count != 0 &&
468471 !UnrollRuntimeLoopRemainder(L, Count, AllowExpensiveTripCount,
469472 EpilogProfitability, UnrollRemainder, LI, SE,
470 DT, AC, PreserveLCSSA)) {
473 DT, AC, PreserveLCSSA, RemainderLoop)) {
471474 if (Force)
472475 RuntimeTripCount = false;
473476 else {
166166
167167 isSafeToUnrollAndJam should be used prior to calling this to make sure the
168168 unrolling will be valid. Checking profitablility is also advisable.
169
170 If EpilogueLoop is non-null, it receives the epilogue loop (if it was
171 necessary to create one and not fully unrolled).
169172 */
170 LoopUnrollResult
171 llvm::UnrollAndJamLoop(Loop *L, unsigned Count, unsigned TripCount,
172 unsigned TripMultiple, bool UnrollRemainder,
173 LoopInfo *LI, ScalarEvolution *SE, DominatorTree *DT,
174 AssumptionCache *AC, OptimizationRemarkEmitter *ORE) {
173 LoopUnrollResult llvm::UnrollAndJamLoop(
174 Loop *L, unsigned Count, unsigned TripCount, unsigned TripMultiple,
175 bool UnrollRemainder, LoopInfo *LI, ScalarEvolution *SE, DominatorTree *DT,
176 AssumptionCache *AC, OptimizationRemarkEmitter *ORE, Loop **EpilogueLoop) {
175177
176178 // When we enter here we should have already checked that it is safe
177179 BasicBlock *Header = L->getHeader();
195197 if (TripMultiple == 1 || TripMultiple % Count != 0) {
196198 if (!UnrollRuntimeLoopRemainder(L, Count, /*AllowExpensiveTripCount*/ false,
197199 /*UseEpilogRemainder*/ true,
198 UnrollRemainder, LI, SE, DT, AC, true)) {
200 UnrollRemainder, LI, SE, DT, AC, true,
201 EpilogueLoop)) {
199202 LLVM_DEBUG(dbgs() << "Won't unroll-and-jam; remainder loop could not be "
200203 "generated when assuming runtime trip count\n");
201204 return LoopUnrollResult::Unmodified;
379379 }
380380 if (CreateRemainderLoop) {
381381 Loop *NewLoop = NewLoops[L];
382 MDNode *LoopID = NewLoop->getLoopID();
382383 assert(NewLoop && "L should have been cloned");
383384
384385 // Only add loop metadata if the loop is not going to be completely
385386 // unrolled.
386387 if (UnrollRemainder)
387388 return NewLoop;
389
390 Optional<MDNode *> NewLoopID = makeFollowupLoopID(
391 LoopID, {LLVMLoopUnrollFollowupAll, LLVMLoopUnrollFollowupRemainder});
392 if (NewLoopID.hasValue()) {
393 NewLoop->setLoopID(NewLoopID.getValue());
394
395 // Do not setLoopAlreadyUnrolled if loop attributes have been defined
396 // explicitly.
397 return NewLoop;
398 }
388399
389400 // Add unroll disable metadata to disable future unrolling for this loop.
390401 NewLoop->setLoopAlreadyUnrolled();
524535 bool llvm::UnrollRuntimeLoopRemainder(Loop *L, unsigned Count,
525536 bool AllowExpensiveTripCount,
526537 bool UseEpilogRemainder,
527 bool UnrollRemainder,
528 LoopInfo *LI, ScalarEvolution *SE,
529 DominatorTree *DT, AssumptionCache *AC,
530 bool PreserveLCSSA) {
538 bool UnrollRemainder, LoopInfo *LI,
539 ScalarEvolution *SE, DominatorTree *DT,
540 AssumptionCache *AC, bool PreserveLCSSA,
541 Loop **ResultLoop) {
531542 LLVM_DEBUG(dbgs() << "Trying runtime unrolling on Loop: \n");
532543 LLVM_DEBUG(L->dump());
533544 LLVM_DEBUG(UseEpilogRemainder ? dbgs() << "Using epilog remainder.\n"
910921 formDedicatedExitBlocks(remainderLoop, DT, LI, PreserveLCSSA);
911922 }
912923
924 auto UnrollResult = LoopUnrollResult::Unmodified;
913925 if (remainderLoop && UnrollRemainder) {
914926 LLVM_DEBUG(dbgs() << "Unrolling remainder loop\n");
915 UnrollLoop(remainderLoop, /*Count*/ Count - 1, /*TripCount*/ Count - 1,
916 /*Force*/ false, /*AllowRuntime*/ false,
917 /*AllowExpensiveTripCount*/ false, /*PreserveCondBr*/ true,
918 /*PreserveOnlyFirst*/ false, /*TripMultiple*/ 1,
919 /*PeelCount*/ 0, /*UnrollRemainder*/ false, LI, SE, DT, AC,
920 /*ORE*/ nullptr, PreserveLCSSA);
921 }
922
927 UnrollResult =
928 UnrollLoop(remainderLoop, /*Count*/ Count - 1, /*TripCount*/ Count - 1,
929 /*Force*/ false, /*AllowRuntime*/ false,
930 /*AllowExpensiveTripCount*/ false, /*PreserveCondBr*/ true,
931 /*PreserveOnlyFirst*/ false, /*TripMultiple*/ 1,
932 /*PeelCount*/ 0, /*UnrollRemainder*/ false, LI, SE, DT, AC,
933 /*ORE*/ nullptr, PreserveLCSSA);
934 }
935
936 if (ResultLoop && UnrollResult != LoopUnrollResult::FullyUnrolled)
937 *ResultLoop = remainderLoop;
923938 NumRuntimeUnrolled++;
924939 return true;
925940 }
4141
4242 #define DEBUG_TYPE "loop-utils"
4343
44 static const char *LLVMLoopDisableNonforced = "llvm.loop.disable_nonforced";
45
4446 bool llvm::formDedicatedExitBlocks(Loop *L, DominatorTree *DT, LoopInfo *LI,
4547 bool PreserveLCSSA) {
4648 bool Changed = false;
182184 INITIALIZE_PASS_DEPENDENCY(ScalarEvolutionWrapperPass)
183185 }
184186
185 /// Find string metadata for loop
186 ///
187 /// If it has a value (e.g. {"llvm.distribute", 1} return the value as an
188 /// operand or null otherwise. If the string metadata is not found return
189 /// Optional's not-a-value.
190 Optional<const MDOperand *> llvm::findStringMetadataForLoop(Loop *TheLoop,
191 StringRef Name) {
192 MDNode *LoopID = TheLoop->getLoopID();
187 static Optional<MDNode *> findOptionMDForLoopID(MDNode *LoopID,
188 StringRef Name) {
193189 // Return none if LoopID is false.
194190 if (!LoopID)
195191 return None;
208204 continue;
209205 // Return true if MDString holds expected MetaData.
210206 if (Name.equals(S->getString()))
211 switch (MD->getNumOperands()) {
212 case 1:
213 return nullptr;
214 case 2:
215 return &MD->getOperand(1);
216 default:
217 llvm_unreachable("loop metadata has 0 or 1 operand");
218 }
207 return MD;
219208 }
220209 return None;
210 }
211
212 static Optional<MDNode *> findOptionMDForLoop(const Loop *TheLoop,
213 StringRef Name) {
214 return findOptionMDForLoopID(TheLoop->getLoopID(), Name);
215 }
216
217 /// Find string metadata for loop
218 ///
219 /// If it has a value (e.g. {"llvm.distribute", 1} return the value as an
220 /// operand or null otherwise. If the string metadata is not found return
221 /// Optional's not-a-value.
222 Optional<const MDOperand *> llvm::findStringMetadataForLoop(Loop *TheLoop,
223 StringRef Name) {
224 auto MD = findOptionMDForLoop(TheLoop, Name).getValueOr(nullptr);
225 if (!MD)
226 return None;
227 switch (MD->getNumOperands()) {
228 case 1:
229 return nullptr;
230 case 2:
231 return &MD->getOperand(1);
232 default:
233 llvm_unreachable("loop metadata has 0 or 1 operand");
234 }
235 }
236
237 static Optional<bool> getOptionalBoolLoopAttribute(const Loop *TheLoop,
238 StringRef Name) {
239 Optional<MDNode *> MD = findOptionMDForLoop(TheLoop, Name);
240 if (!MD.hasValue())
241 return None;
242 MDNode *OptionNode = MD.getValue();
243 if (OptionNode == nullptr)
244 return None;
245 switch (OptionNode->getNumOperands()) {
246 case 1:
247 // When the value is absent it is interpreted as 'attribute set'.
248 return true;
249 case 2:
250 return mdconst::extract_or_null<ConstantInt>(
251 OptionNode->getOperand(1).get());
252 }
253 llvm_unreachable("unexpected number of options");
254 }
255
256 static bool getBooleanLoopAttribute(const Loop *TheLoop, StringRef Name) {
257 return getOptionalBoolLoopAttribute(TheLoop, Name).getValueOr(false);
258 }
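The boolean helpers above distinguish three states: attribute absent, attribute present without a value, and attribute present with an explicit value. The following is a minimal sketch of that lookup outside of LLVM, not the code of this patch. `BoolAttrs`, `getOptionalBool` and `getBool` are hypothetical names, and the sketch assumes that an explicit operand is read as a plain boolean.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical model of the boolean attributes of one loop: a key with an
// empty value stands for !{!"name"}, a key with a value for !{!"name", i1 v}.
using BoolAttrs = std::map<std::string, std::optional<bool>>;

// Returns std::nullopt if the attribute is absent, true if it is present
// without a value, and the explicit value otherwise.
std::optional<bool> getOptionalBool(const BoolAttrs &Attrs,
                                    const std::string &Name) {
  auto It = Attrs.find(Name);
  if (It == Attrs.end())
    return std::nullopt;
  // A missing value is interpreted as "attribute set".
  return It->second.value_or(true);
}

// Collapses the "absent" state to false.
bool getBool(const BoolAttrs &Attrs, const std::string &Name) {
  return getOptionalBool(Attrs, Name).value_or(false);
}
```

Callers that need to tell "explicitly disabled" apart from "not mentioned", such as the vectorizer query, would use the optional form. The others can use the collapsed one.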
259
260 llvm::Optional<int> llvm::getOptionalIntLoopAttribute(Loop *TheLoop,
261 StringRef Name) {
262 const MDOperand *AttrMD =
263 findStringMetadataForLoop(TheLoop, Name).getValueOr(nullptr);
264 if (!AttrMD)
265 return None;
266
267 ConstantInt *IntMD = mdconst::extract_or_null<ConstantInt>(AttrMD->get());
268 if (!IntMD)
269 return None;
270
271 return IntMD->getSExtValue();
272 }
273
274 Optional<MDNode *> llvm::makeFollowupLoopID(
275 MDNode *OrigLoopID, ArrayRef<StringRef> FollowupOptions,
276 const char *InheritOptionsExceptPrefix, bool AlwaysNew) {
277 if (!OrigLoopID) {
278 if (AlwaysNew)
279 return nullptr;
280 return None;
281 }
282
283 assert(OrigLoopID->getOperand(0) == OrigLoopID);
284
285 bool InheritAllAttrs = !InheritOptionsExceptPrefix;
286 bool InheritSomeAttrs =
287 InheritOptionsExceptPrefix && InheritOptionsExceptPrefix[0] != '\0';
288 SmallVector<Metadata *, 8> MDs;
289 MDs.push_back(nullptr);
290
291 bool Changed = false;
292 if (InheritAllAttrs || InheritSomeAttrs) {
293 for (const MDOperand &Existing : drop_begin(OrigLoopID->operands(), 1)) {
294 MDNode *Op = cast<MDNode>(Existing.get());
295
296 auto InheritThisAttribute = [InheritSomeAttrs,
297 InheritOptionsExceptPrefix](MDNode *Op) {
298 if (!InheritSomeAttrs)
299 return false;
300
301 // Malformed attribute metadata nodes are inherited unchanged.
302 if (Op->getNumOperands() == 0)
303 return true;
304 Metadata *NameMD = Op->getOperand(0).get();
305 if (!isa<MDString>(NameMD))
306 return true;
307 StringRef AttrName = cast<MDString>(NameMD)->getString();
308
309 // Do not inherit excluded attributes.
310 return !AttrName.startswith(InheritOptionsExceptPrefix);
311 };
312
313 if (InheritThisAttribute(Op))
314 MDs.push_back(Op);
315 else
316 Changed = true;
317 }
318 } else {
319 // Modified if we dropped at least one attribute.
320 Changed = OrigLoopID->getNumOperands() > 1;
321 }
322
323 bool HasAnyFollowup = false;
324 for (StringRef OptionName : FollowupOptions) {
325 MDNode *FollowupNode =
326 findOptionMDForLoopID(OrigLoopID, OptionName).getValueOr(nullptr);
327 if (!FollowupNode)
328 continue;
329
330 HasAnyFollowup = true;
331 for (const MDOperand &Option : drop_begin(FollowupNode->operands(), 1)) {
332 MDs.push_back(Option.get());
333 Changed = true;
334 }
335 }
336
337 // Attributes of the followup loop were not specified explicitly, so signal
338 // to the transformation pass that it should add suitable attributes.
339 if (!AlwaysNew && !HasAnyFollowup)
340 return None;
341
342 // If no attributes were added or removed, the previous loop ID can be reused.
343 if (!AlwaysNew && !Changed)
344 return OrigLoopID;
345
346 // No attributes is equivalent to having no !llvm.loop metadata at all.
347 if (MDs.size() == 1)
348 return nullptr;
349
350 // Build the new loop ID.
351 MDTuple *FollowupLoopID = MDNode::get(OrigLoopID->getContext(), MDs);
352 FollowupLoopID->replaceOperandWith(0, FollowupLoopID);
353 return FollowupLoopID;
354 }
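The followup rules of makeFollowupLoopID can be modeled without any LLVM types. The following is a minimal sketch under stated assumptions, not the implementation in this patch. `Attr`, `AttrList`, `makeFollowupAttrs` and `exampleLoop` are hypothetical names, and attribute values such as `i32 8` are dropped. Only the two modes with a non-null prefix are modeled: an empty prefix inherits nothing, and a non-empty prefix inherits every attribute whose name does not start with it. The reuse of an unchanged loop ID and the self-referencing first operand are left out.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one attribute node: a name plus, for followup
// attributes, the nested attributes it carries.
struct Attr {
  std::string Name;
  std::vector<Attr> Nested;
};
using AttrList = std::vector<Attr>;

// Computes the attributes of a loop produced by a transformation.
// Returns std::nullopt when no followup attribute was found and AlwaysNew is
// false, i.e. the pass is expected to choose suitable attributes itself.
std::optional<AttrList>
makeFollowupAttrs(const AttrList &Orig,
                  const std::vector<std::string> &FollowupOptions,
                  const std::string &ExceptPrefix = "",
                  bool AlwaysNew = false) {
  AttrList Result;

  // Inherit original attributes unless their name starts with ExceptPrefix.
  // An empty prefix means that nothing is inherited.
  if (!ExceptPrefix.empty())
    for (const Attr &A : Orig)
      if (A.Name.compare(0, ExceptPrefix.size(), ExceptPrefix) != 0)
        Result.push_back(A);

  // Append the payload of every requested followup attribute that exists.
  bool HasAnyFollowup = false;
  for (const std::string &Option : FollowupOptions)
    for (const Attr &A : Orig)
      if (A.Name == Option) {
        HasAnyFollowup = true;
        Result.insert(Result.end(), A.Nested.begin(), A.Nested.end());
        break; // the first attribute with a matching name wins
      }

  if (!AlwaysNew && !HasAnyFollowup)
    return std::nullopt;
  return Result;
}

// Attribute names taken from the loop distribution followup test.
AttrList exampleLoop() {
  return {
      Attr{"llvm.loop.distribute.enable", {}},
      Attr{"llvm.loop.distribute.followup_all", {Attr{"FollowupAll", {}}}},
      Attr{"llvm.loop.distribute.followup_coincident",
           {Attr{"FollowupCoincident", {}}}},
      Attr{"llvm.loop.distribute.followup_sequential",
           {Attr{"FollowupSequential", {}}}},
      Attr{"llvm.loop.distribute.followup_fallback",
           {Attr{"FollowupFallback", {}}}},
  };
}
```

Requesting `followup_all` and `followup_sequential` for `exampleLoop()` yields `FollowupAll` followed by `FollowupSequential`, which is consistent with the `LOOP_SEQUENTIAL` check lines of the distribution followup test.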
355
356 bool llvm::hasDisableAllTransformsHint(const Loop *L) {
357 return getBooleanLoopAttribute(L, LLVMLoopDisableNonforced);
358 }
359
360 TransformationMode llvm::hasUnrollTransformation(Loop *L) {
361 if (getBooleanLoopAttribute(L, "llvm.loop.unroll.disable"))
362 return TM_SuppressedByUser;
363
364 Optional<int> Count =
365 getOptionalIntLoopAttribute(L, "llvm.loop.unroll.count");
366 if (Count.hasValue())
367 return Count.getValue() == 1 ? TM_SuppressedByUser : TM_ForcedByUser;
368
369 if (getBooleanLoopAttribute(L, "llvm.loop.unroll.enable"))
370 return TM_ForcedByUser;
371
372 if (getBooleanLoopAttribute(L, "llvm.loop.unroll.full"))
373 return TM_ForcedByUser;
374
375 if (hasDisableAllTransformsHint(L))
376 return TM_Disable;
377
378 return TM_Unspecified;
379 }
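The order of the checks in hasUnrollTransformation is what gives explicit user requests priority over `llvm.loop.disable_nonforced`. The following is a minimal sketch of that precedence outside of LLVM. `UnrollMode`, `UnrollHints` and `unrollMode` are hypothetical names, and the struct is an assumed, flattened view of the loop metadata, not an LLVM data structure.

```cpp
#include <optional>

// Mirrors the TM_* values used in the patch.
enum class UnrollMode {
  Unspecified,
  Enable,
  Disable,
  ForcedByUser,
  SuppressedByUser
};

// Hypothetical flattened view of the unroll-related attributes of one loop.
struct UnrollHints {
  bool Disable = false;          // llvm.loop.unroll.disable
  std::optional<int> Count;      // llvm.loop.unroll.count
  bool Enable = false;           // llvm.loop.unroll.enable
  bool Full = false;             // llvm.loop.unroll.full
  bool DisableNonforced = false; // llvm.loop.disable_nonforced
};

UnrollMode unrollMode(const UnrollHints &H) {
  // An explicit disable wins over everything else.
  if (H.Disable)
    return UnrollMode::SuppressedByUser;
  // A count of 1 is equivalent to not unrolling at all.
  if (H.Count)
    return *H.Count == 1 ? UnrollMode::SuppressedByUser
                         : UnrollMode::ForcedByUser;
  if (H.Enable || H.Full)
    return UnrollMode::ForcedByUser;
  // Only consulted when the user did not request unrolling explicitly.
  if (H.DisableNonforced)
    return UnrollMode::Disable;
  return UnrollMode::Unspecified;
}
```

Because `DisableNonforced` is tested last, a loop that carries both `llvm.loop.unroll.enable` and `llvm.loop.disable_nonforced` is still reported as forced by the user.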
380
381 TransformationMode llvm::hasUnrollAndJamTransformation(Loop *L) {
382 if (getBooleanLoopAttribute(L, "llvm.loop.unroll_and_jam.disable"))
383 return TM_SuppressedByUser;
384
385 Optional<int> Count =
386 getOptionalIntLoopAttribute(L, "llvm.loop.unroll_and_jam.count");
387 if (Count.hasValue())
388 return Count.getValue() == 1 ? TM_SuppressedByUser : TM_ForcedByUser;
389
390 if (getBooleanLoopAttribute(L, "llvm.loop.unroll_and_jam.enable"))
391 return TM_ForcedByUser;
392
393 if (hasDisableAllTransformsHint(L))
394 return TM_Disable;
395
396 return TM_Unspecified;
397 }
398
399 TransformationMode llvm::hasVectorizeTransformation(Loop *L) {
400 Optional<bool> Enable =
401 getOptionalBoolLoopAttribute(L, "llvm.loop.vectorize.enable");
402
403 if (Enable == false)
404 return TM_SuppressedByUser;
405
406 Optional<int> VectorizeWidth =
407 getOptionalIntLoopAttribute(L, "llvm.loop.vectorize.width");
408 Optional<int> InterleaveCount =
409 getOptionalIntLoopAttribute(L, "llvm.loop.interleave.count");
410
411 if (Enable == true) {
412 // 'Forcing' vector width and interleave count to one effectively disables
413 // this transformation.
414 if (VectorizeWidth == 1 && InterleaveCount == 1)
415 return TM_SuppressedByUser;
416 return TM_ForcedByUser;
417 }
418
419 if (getBooleanLoopAttribute(L, "llvm.loop.isvectorized"))
420 return TM_Disable;
421
422 if (VectorizeWidth == 1 && InterleaveCount == 1)
423 return TM_Disable;
424
425 if (VectorizeWidth > 1 || InterleaveCount > 1)
426 return TM_Enable;
427
428 if (hasDisableAllTransformsHint(L))
429 return TM_Disable;
430
431 return TM_Unspecified;
432 }
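The vectorizer query has more cases than the other queries because `llvm.loop.vectorize.enable` is tri-state and the width and interleave count change the result. The following is a minimal sketch of the decision order outside of LLVM. `VectorizeMode`, `VectorizeHints` and `vectorizeMode` are hypothetical names, and the struct is an assumed, flattened view of the loop metadata.

```cpp
#include <optional>

// Mirrors the TM_* values used in the patch.
enum class VectorizeMode {
  Unspecified,
  Enable,
  Disable,
  ForcedByUser,
  SuppressedByUser
};

// Hypothetical flattened view of the vectorizer attributes of one loop.
struct VectorizeHints {
  std::optional<bool> Enable;    // llvm.loop.vectorize.enable
  std::optional<int> Width;      // llvm.loop.vectorize.width
  std::optional<int> Interleave; // llvm.loop.interleave.count
  bool IsVectorized = false;     // llvm.loop.isvectorized
  bool DisableNonforced = false; // llvm.loop.disable_nonforced
};

VectorizeMode vectorizeMode(const VectorizeHints &H) {
  // Width 1 together with interleave count 1 leaves the loop unchanged.
  const bool BothOne = H.Width.has_value() && *H.Width == 1 &&
                       H.Interleave.has_value() && *H.Interleave == 1;
  const bool AnyAboveOne = (H.Width.has_value() && *H.Width > 1) ||
                           (H.Interleave.has_value() && *H.Interleave > 1);

  if (H.Enable.has_value() && !*H.Enable)
    return VectorizeMode::SuppressedByUser;
  if (H.Enable.has_value() && *H.Enable)
    return BothOne ? VectorizeMode::SuppressedByUser
                   : VectorizeMode::ForcedByUser;

  // From here on, vectorize.enable is absent.
  if (H.IsVectorized || BothOne)
    return VectorizeMode::Disable;
  if (AnyAboveOne)
    return VectorizeMode::Enable;
  if (H.DisableNonforced)
    return VectorizeMode::Disable;
  return VectorizeMode::Unspecified;
}
```

In this model a width or interleave hint without `vectorize.enable` yields `Enable` rather than `ForcedByUser`, which follows the distinction made by the function above.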
433
434 TransformationMode llvm::hasDistributeTransformation(Loop *L) {
435 if (getBooleanLoopAttribute(L, "llvm.loop.distribute.enable"))
436 return TM_ForcedByUser;
437
438 if (hasDisableAllTransformsHint(L))
439 return TM_Disable;
440
441 return TM_Unspecified;
442 }
443
444 TransformationMode llvm::hasLICMVersioningTransformation(Loop *L) {
445 if (getBooleanLoopAttribute(L, "llvm.loop.licm_versioning.disable"))
446 return TM_SuppressedByUser;
447
448 if (hasDisableAllTransformsHint(L))
449 return TM_Disable;
450
451 return TM_Unspecified;
221452 }
222453
223454 /// Does a BFS from a given node to all of its children inside a given loop.
151151 #define LV_NAME "loop-vectorize"
152152 #define DEBUG_TYPE LV_NAME
153153
154 /// @{
155 /// Metadata attribute names
156 static const char *const LLVMLoopVectorizeFollowupAll =
157 "llvm.loop.vectorize.followup_all";
158 static const char *const LLVMLoopVectorizeFollowupVectorized =
159 "llvm.loop.vectorize.followup_vectorized";
160 static const char *const LLVMLoopVectorizeFollowupEpilogue =
161 "llvm.loop.vectorize.followup_epilogue";
162 /// @}
163
154164 STATISTIC(LoopsVectorized, "Number of loops vectorized");
155165 STATISTIC(LoopsAnalyzed, "Number of loops analyzed for vectorization");
156166
795805 }
796806 }
797807
798 static void emitMissedWarning(Function *F, Loop *L,
799 const LoopVectorizeHints &LH,
800 OptimizationRemarkEmitter *ORE) {
801 LH.emitRemarkWithHints();
802
803 if (LH.getForce() == LoopVectorizeHints::FK_Enabled) {
804 if (LH.getWidth() != 1)
805 ORE->emit(DiagnosticInfoOptimizationFailure(
806 DEBUG_TYPE, "FailedRequestedVectorization",
807 L->getStartLoc(), L->getHeader())
808 << "loop not vectorized: "
809 << "failed explicitly specified loop vectorization");
810 else if (LH.getInterleave() != 1)
811 ORE->emit(DiagnosticInfoOptimizationFailure(
812 DEBUG_TYPE, "FailedRequestedInterleaving", L->getStartLoc(),
813 L->getHeader())
814 << "loop not interleaved: "
815 << "failed explicitly specified loop interleaving");
816 }
817 }
818
819808 namespace llvm {
820809
821810 /// LoopVectorizationCostModel - estimates the expected speedups due to
13761365
13771366 if (!Hints.getWidth()) {
13781367 LLVM_DEBUG(dbgs() << "LV: Not vectorizing: No user vector width.\n");
1379 emitMissedWarning(Fn, OuterLp, Hints, ORE);
1368 Hints.emitRemarkWithHints();
13801369 return false;
13811370 }
13821371
13841373 // TODO: Interleave support is future work.
13851374 LLVM_DEBUG(dbgs() << "LV: Not vectorizing: Interleave is not supported for "
13861375 "outer loops.\n");
1387 emitMissedWarning(Fn, OuterLp, Hints, ORE);
1376 Hints.emitRemarkWithHints();
13881377 return false;
13891378 }
13901379
27382727 BasicBlock *OldBasicBlock = OrigLoop->getHeader();
27392728 BasicBlock *VectorPH = OrigLoop->getLoopPreheader();
27402729 BasicBlock *ExitBlock = OrigLoop->getExitBlock();
2730 MDNode *OrigLoopID = OrigLoop->getLoopID();
27412731 assert(VectorPH && "Invalid loop structure");
27422732 assert(ExitBlock && "Must have an exit block");
27432733
28802870 LoopExitBlock = ExitBlock;
28812871 LoopVectorBody = VecBody;
28822872 LoopScalarBody = OldBasicBlock;
2873
2874 Optional<MDNode *> VectorizedLoopID =
2875 makeFollowupLoopID(OrigLoopID, {LLVMLoopVectorizeFollowupAll,
2876 LLVMLoopVectorizeFollowupVectorized});
2877 if (VectorizedLoopID.hasValue()) {
2878 Lp->setLoopID(VectorizedLoopID.getValue());
2879
2880 // Do not setAlreadyVectorized if loop attributes have been defined
2881 // explicitly.
2882 return LoopVectorPreHeader;
2883 }
28832884
28842885 // Keep all loop hints from the original loop on the vector loop (we'll
28852886 // replace the vectorizer-specific hints below).
71767177 &Requirements, &Hints, DB, AC);
71777178 if (!LVL.canVectorize(EnableVPlanNativePath)) {
71787179 LLVM_DEBUG(dbgs() << "LV: Not vectorizing: Cannot prove legality.\n");
7179 emitMissedWarning(F, L, Hints, ORE);
7180 Hints.emitRemarkWithHints();
71807181 return false;
71817182 }
71827183
72497250 ORE->emit(createLVMissedAnalysis(Hints.vectorizeAnalysisPassName(),
72507251 "NoImplicitFloat", L)
72517252 << "loop not vectorized due to NoImplicitFloat attribute");
7252 emitMissedWarning(F, L, Hints, ORE);
7253 Hints.emitRemarkWithHints();
72537254 return false;
72547255 }
72557256
72647265 ORE->emit(
72657266 createLVMissedAnalysis(Hints.vectorizeAnalysisPassName(), "UnsafeFP", L)
72667267 << "loop not vectorized due to unsafe FP support.");
7267 emitMissedWarning(F, L, Hints, ORE);
7268 Hints.emitRemarkWithHints();
72687269 return false;
72697270 }
72707271
73067307 if (Requirements.doesNotMeet(F, L, Hints)) {
73077308 LLVM_DEBUG(dbgs() << "LV: Not vectorizing: loop did not meet vectorization "
73087309 "requirements.\n");
7309 emitMissedWarning(F, L, Hints, ORE);
7310 Hints.emitRemarkWithHints();
73107311 return false;
73117312 }
73127313
73837384 LVP.setBestPlan(VF.Width, IC);
73847385
73857386 using namespace ore;
7387 bool DisableRuntimeUnroll = false;
7388 MDNode *OrigLoopID = L->getLoopID();
73867389
73877390 if (!VectorizeLoop) {
73887391 assert(IC > 1 && "interleave count should not be 1 or 0");
74097412 // no runtime checks about strides and memory. A scalar loop that is
74107413 // rarely used is not worth unrolling.
74117414 if (!LB.areSafetyChecksAdded())
7412 AddRuntimeUnrollDisableMetaData(L);
7415 DisableRuntimeUnroll = true;
74137416
74147417 // Report the vectorization decision.
74157418 ORE->emit([&]() {
74217424 });
74227425 }
74237426
7424 // Mark the loop as already vectorized to avoid vectorizing again.
7425 Hints.setAlreadyVectorized();
7427 Optional<MDNode *> RemainderLoopID =
7428 makeFollowupLoopID(OrigLoopID, {LLVMLoopVectorizeFollowupAll,
7429 LLVMLoopVectorizeFollowupEpilogue});
7430 if (RemainderLoopID.hasValue()) {
7431 L->setLoopID(RemainderLoopID.getValue());
7432 } else {
7433 if (DisableRuntimeUnroll)
7434 AddRuntimeUnrollDisableMetaData(L);
7435
7436 // Mark the loop as already vectorized to avoid vectorizing again.
7437 Hints.setAlreadyVectorized();
7438 }
74267439
74277440 LLVM_DEBUG(verifyFunction(*L->getHeader()->getParent()));
74287441 return true;
245245 ; CHECK-O-NEXT: Running pass: InstCombinePass
246246 ; CHECK-O-NEXT: Running pass: LoopUnrollPass
247247 ; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
248 ; CHECK-O-NEXT: Running pass: WarnMissedTransformationsPass
248249 ; CHECK-O-NEXT: Running pass: InstCombinePass
249250 ; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}OptimizationRemarkEmitterAnalysis
250251 ; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass
223223 ; CHECK-POSTLINK-O-NEXT: Running pass: InstCombinePass
224224 ; CHECK-POSTLINK-O-NEXT: Running pass: LoopUnrollPass
225225 ; CHECK-POSTLINK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
226 ; CHECK-POSTLINK-O-NEXT: Running pass: WarnMissedTransformationsPass
226227 ; CHECK-POSTLINK-O-NEXT: Running pass: InstCombinePass
227228 ; CHECK-POSTLINK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}OptimizationRemarkEmitterAnalysis
228229 ; CHECK-POSTLINK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass
249249 ; CHECK-NEXT: Scalar Evolution Analysis
250250 ; CHECK-NEXT: Loop Pass Manager
251251 ; CHECK-NEXT: Loop Invariant Code Motion
252 ; CHECK-NEXT: Lazy Branch Probability Analysis
253 ; CHECK-NEXT: Lazy Block Frequency Analysis
254 ; CHECK-NEXT: Optimization Remark Emitter
255 ; CHECK-NEXT: Warn about non-applied transformations
252256 ; CHECK-NEXT: Alignment from assumptions
253257 ; CHECK-NEXT: Strip Unused Function Prototypes
254258 ; CHECK-NEXT: Dead Global Elimination
254254 ; CHECK-NEXT: Scalar Evolution Analysis
255255 ; CHECK-NEXT: Loop Pass Manager
256256 ; CHECK-NEXT: Loop Invariant Code Motion
257 ; CHECK-NEXT: Lazy Branch Probability Analysis
258 ; CHECK-NEXT: Lazy Block Frequency Analysis
259 ; CHECK-NEXT: Optimization Remark Emitter
260 ; CHECK-NEXT: Warn about non-applied transformations
257261 ; CHECK-NEXT: Alignment from assumptions
258262 ; CHECK-NEXT: Strip Unused Function Prototypes
259263 ; CHECK-NEXT: Dead Global Elimination
236236 ; CHECK-NEXT: Scalar Evolution Analysis
237237 ; CHECK-NEXT: Loop Pass Manager
238238 ; CHECK-NEXT: Loop Invariant Code Motion
239 ; CHECK-NEXT: Lazy Branch Probability Analysis
240 ; CHECK-NEXT: Lazy Block Frequency Analysis
241 ; CHECK-NEXT: Optimization Remark Emitter
242 ; CHECK-NEXT: Warn about non-applied transformations
239243 ; CHECK-NEXT: Alignment from assumptions
240244 ; CHECK-NEXT: Strip Unused Function Prototypes
241245 ; CHECK-NEXT: Dead Global Elimination
235235 ; CHECK-NEXT: Scalar Evolution Analysis
236236 ; CHECK-NEXT: Loop Pass Manager
237237 ; CHECK-NEXT: Loop Invariant Code Motion
238 ; CHECK-NEXT: Lazy Branch Probability Analysis
239 ; CHECK-NEXT: Lazy Block Frequency Analysis
240 ; CHECK-NEXT: Optimization Remark Emitter
241 ; CHECK-NEXT: Warn about non-applied transformations
238242 ; CHECK-NEXT: Alignment from assumptions
239243 ; CHECK-NEXT: Strip Unused Function Prototypes
240244 ; CHECK-NEXT: Dead Global Elimination
0 ; RUN: opt -loop-distribute -enable-loop-distribute=1 -S < %s | FileCheck %s
1 ;
2 ; Check that the disable_nonforced is honored by loop distribution.
3 ;
4 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
5
6 ; CHECK-LABEL: @disable_nonforced(
7 ; CHECK-NOT: for.body.ldist1:
8 define void @disable_nonforced(i32* noalias %a,
9 i32* noalias %b,
10 i32* noalias %c,
11 i32* noalias %d,
12 i32* noalias %e) {
13 entry:
14 br label %for.body
15
16 for.body:
17 %ind = phi i64 [ 0, %entry ], [ %add, %for.body ]
18
19 %arrayidxA = getelementptr inbounds i32, i32* %a, i64 %ind
20 %loadA = load i32, i32* %arrayidxA, align 4
21
22 %arrayidxB = getelementptr inbounds i32, i32* %b, i64 %ind
23 %loadB = load i32, i32* %arrayidxB, align 4
24
25 %mulA = mul i32 %loadB, %loadA
26
27 %add = add nuw nsw i64 %ind, 1
28 %arrayidxA_plus_4 = getelementptr inbounds i32, i32* %a, i64 %add
29 store i32 %mulA, i32* %arrayidxA_plus_4, align 4
30
31 %arrayidxD = getelementptr inbounds i32, i32* %d, i64 %ind
32 %loadD = load i32, i32* %arrayidxD, align 4
33
34 %arrayidxE = getelementptr inbounds i32, i32* %e, i64 %ind
35 %loadE = load i32, i32* %arrayidxE, align 4
36
37 %mulC = mul i32 %loadD, %loadE
38
39 %arrayidxC = getelementptr inbounds i32, i32* %c, i64 %ind
40 store i32 %mulC, i32* %arrayidxC, align 4
41
42 %exitcond = icmp eq i64 %add, 20
43 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
44
45 for.end:
46 ret void
47 }
48
49 !0 = distinct !{!0, !{!"llvm.loop.disable_nonforced"}}
0 ; RUN: opt -loop-distribute -S < %s | FileCheck %s
1 ;
2 ; Check that llvm.loop.distribute.enable overrides
3 ; llvm.loop.disable_nonforced.
4 ;
5 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
6
7 ; CHECK-LABEL: @disable_nonforced(
8 ; CHECK: for.body.ldist1:
9 define void @disable_nonforced(i32* noalias %a,
10 i32* noalias %b,
11 i32* noalias %c,
12 i32* noalias %d,
13 i32* noalias %e) {
14 entry:
15 br label %for.body
16
17 for.body:
18 %ind = phi i64 [ 0, %entry ], [ %add, %for.body ]
19
20 %arrayidxA = getelementptr inbounds i32, i32* %a, i64 %ind
21 %loadA = load i32, i32* %arrayidxA, align 4
22
23 %arrayidxB = getelementptr inbounds i32, i32* %b, i64 %ind
24 %loadB = load i32, i32* %arrayidxB, align 4
25
26 %mulA = mul i32 %loadB, %loadA
27
28 %add = add nuw nsw i64 %ind, 1
29 %arrayidxA_plus_4 = getelementptr inbounds i32, i32* %a, i64 %add
30 store i32 %mulA, i32* %arrayidxA_plus_4, align 4
31
32 %arrayidxD = getelementptr inbounds i32, i32* %d, i64 %ind
33 %loadD = load i32, i32* %arrayidxD, align 4
34
35 %arrayidxE = getelementptr inbounds i32, i32* %e, i64 %ind
36 %loadE = load i32, i32* %arrayidxE, align 4
37
38 %mulC = mul i32 %loadD, %loadE
39
40 %arrayidxC = getelementptr inbounds i32, i32* %c, i64 %ind
41 store i32 %mulC, i32* %arrayidxC, align 4
42
43 %exitcond = icmp eq i64 %add, 20
44 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
45
46 for.end:
47 ret void
48 }
49
50 !0 = distinct !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.distribute.enable", i32 1}}
0 ; RUN: opt -basicaa -loop-distribute -S < %s | FileCheck %s
1 ;
2 ; Check that followup loop-attributes are applied to the loops after
3 ; loop distribution.
4 ;
5 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
6
7 define void @f(i32* %a, i32* %b, i32* %c, i32* %d, i32* %e) {
8 entry:
9 br label %for.body
10
11 for.body:
12 %ind = phi i64 [ 0, %entry ], [ %add, %for.body ]
13
14 %arrayidxA = getelementptr inbounds i32, i32* %a, i64 %ind
15 %loadA = load i32, i32* %arrayidxA, align 4
16
17 %arrayidxB = getelementptr inbounds i32, i32* %b, i64 %ind
18 %loadB = load i32, i32* %arrayidxB, align 4
19
20 %mulA = mul i32 %loadB, %loadA
21
22 %add = add nuw nsw i64 %ind, 1
23 %arrayidxA_plus_4 = getelementptr inbounds i32, i32* %a, i64 %add
24 store i32 %mulA, i32* %arrayidxA_plus_4, align 4
25
26 %arrayidxD = getelementptr inbounds i32, i32* %d, i64 %ind
27 %loadD = load i32, i32* %arrayidxD, align 4
28
29 %arrayidxE = getelementptr inbounds i32, i32* %e, i64 %ind
30 %loadE = load i32, i32* %arrayidxE, align 4
31
32 %mulC = mul i32 %loadD, %loadE
33
34 %arrayidxC = getelementptr inbounds i32, i32* %c, i64 %ind
35 store i32 %mulC, i32* %arrayidxC, align 4
36
37 %exitcond = icmp eq i64 %add, 20
38 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
39
40 for.end:
41 ret void
42 }
43
44 !0 = distinct !{!0, !1, !2, !3, !4, !5}
45 !1 = !{!"llvm.loop.distribute.enable", i1 true}
46 !2 = !{!"llvm.loop.distribute.followup_all", !{!"FollowupAll"}}
47 !3 = !{!"llvm.loop.distribute.followup_coincident", !{!"FollowupCoincident", i1 false}}
48 !4 = !{!"llvm.loop.distribute.followup_sequential", !{!"FollowupSequential", i32 8}}
49 !5 = !{!"llvm.loop.distribute.followup_fallback", !{!"FollowupFallback"}}
50
51
52 ; CHECK-LABEL: for.body.lver.orig:
53 ; CHECK: br i1 %exitcond.lver.orig, label %for.end, label %for.body.lver.orig, !llvm.loop ![[LOOP_ORIG:[0-9]+]]
54 ; CHECK-LABEL: for.body.ldist1:
55 ; CHECK: br i1 %exitcond.ldist1, label %for.body.ph, label %for.body.ldist1, !llvm.loop ![[LOOP_SEQUENTIAL:[0-9]+]]
56 ; CHECK-LABEL: for.body:
57 ; CHECK: br i1 %exitcond, label %for.end, label %for.body, !llvm.loop ![[LOOP_COINCIDENT:[0-9]+]]
58
59 ; CHECK: ![[LOOP_ORIG]] = distinct !{![[LOOP_ORIG]], ![[FOLLOWUP_ALL:[0-9]+]], ![[FOLLOWUP_FALLBACK:[0-9]+]]}
60 ; CHECK: ![[FOLLOWUP_ALL]] = !{!"FollowupAll"}
61 ; CHECK: ![[FOLLOWUP_FALLBACK]] = !{!"FollowupFallback"}
62 ; CHECK: ![[LOOP_SEQUENTIAL]] = distinct !{![[LOOP_SEQUENTIAL]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_SEQUENTIAL:[0-9]+]]}
63 ; CHECK: ![[FOLLOWUP_SEQUENTIAL]] = !{!"FollowupSequential", i32 8}
64 ; CHECK: ![[LOOP_COINCIDENT]] = distinct !{![[LOOP_COINCIDENT]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_COINCIDENT:[0-9]+]]}
65 ; CHECK: ![[FOLLOWUP_COINCIDENT]] = !{!"FollowupCoincident", i1 false}
0 ; Legacy pass manager
1 ; RUN: opt < %s -transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
2 ; RUN: opt < %s -transform-warning -disable-output -pass-remarks-output=%t.yaml
3 ; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
4
5 ; New pass manager
6 ; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
7 ; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-output=%t.yaml
8 ; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
9
10
11 ; CHECK: warning: source.cpp:19:5: loop not distributed: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
12
13 ; YAML: --- !Failure
14 ; YAML-NEXT: Pass: transform-warning
15 ; YAML-NEXT: Name: FailedRequestedDistribution
16 ; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 19, Column: 5 }
17 ; YAML-NEXT: Function: _Z17test_array_boundsPiS_i
18 ; YAML-NEXT: Args:
19 ; YAML-NEXT: - String: 'loop not distributed: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering'
20 ; YAML-NEXT: ...
21
22 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
23
24 define void @_Z17test_array_boundsPiS_i(i32* nocapture %A, i32* nocapture readonly %B, i32 %Length) !dbg !8 {
25 entry:
26 %cmp9 = icmp sgt i32 %Length, 0, !dbg !32
27 br i1 %cmp9, label %for.body.preheader, label %for.end, !dbg !32
28
29 for.body.preheader:
30 br label %for.body, !dbg !35
31
32 for.body:
33 %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
34 %arrayidx = getelementptr inbounds i32, i32* %B, i64 %indvars.iv, !dbg !35
35 %0 = load i32, i32* %arrayidx, align 4, !dbg !35, !tbaa !18
36 %idxprom1 = sext i32 %0 to i64, !dbg !35
37 %arrayidx2 = getelementptr inbounds i32, i32* %A, i64 %idxprom1, !dbg !35
38 %1 = load i32, i32* %arrayidx2, align 4, !dbg !35, !tbaa !18
39 %arrayidx4 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv, !dbg !35
40 store i32 %1, i32* %arrayidx4, align 4, !dbg !35, !tbaa !18
41 %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !32
42 %lftr.wideiv = trunc i64 %indvars.iv.next to i32, !dbg !32
43 %exitcond = icmp eq i32 %lftr.wideiv, %Length, !dbg !32
44 br i1 %exitcond, label %for.end.loopexit, label %for.body, !dbg !32, !llvm.loop !50
45
46 for.end.loopexit:
47 br label %for.end
48
49 for.end:
50 ret void, !dbg !36
51 }
52
53 !llvm.dbg.cu = !{!0}
54 !llvm.module.flags = !{!9, !10}
55 !llvm.ident = !{!11}
56
57 !0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, producer: "clang version 3.5.0", isOptimized: true, runtimeVersion: 6, emissionKind: LineTablesOnly, file: !1, enums: !2, retainedTypes: !2, globals: !2, imports: !2)
58 !1 = !DIFile(filename: "source.cpp", directory: ".")
59 !2 = !{}
60 !4 = distinct !DISubprogram(name: "test", line: 1, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 1, file: !1, scope: !5, type: !6, retainedNodes: !2)
61 !5 = !DIFile(filename: "source.cpp", directory: ".")
62 !6 = !DISubroutineType(types: !2)
63 !7 = distinct !DISubprogram(name: "test_disabled", line: 10, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 10, file: !1, scope: !5, type: !6, retainedNodes: !2)
64 !8 = distinct !DISubprogram(name: "test_array_bounds", line: 16, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 16, file: !1, scope: !5, type: !6, retainedNodes: !2)
65 !9 = !{i32 2, !"Dwarf Version", i32 2}
66 !10 = !{i32 2, !"Debug Info Version", i32 3}
67 !11 = !{!"clang version 3.5.0"}
68 !12 = !DILocation(line: 3, column: 8, scope: !13)
69 !13 = distinct !DILexicalBlock(line: 3, column: 3, file: !1, scope: !4)
70 !16 = !DILocation(line: 4, column: 5, scope: !17)
71 !17 = distinct !DILexicalBlock(line: 3, column: 36, file: !1, scope: !13)
72 !18 = !{!19, !19, i64 0}
73 !19 = !{!"int", !20, i64 0}
74 !20 = !{!"omnipotent char", !21, i64 0}
75 !21 = !{!"Simple C/C++ TBAA"}
76 !22 = !DILocation(line: 5, column: 9, scope: !23)
77 !23 = distinct !DILexicalBlock(line: 5, column: 9, file: !1, scope: !17)
78 !24 = !DILocation(line: 8, column: 1, scope: !4)
79 !25 = !DILocation(line: 12, column: 8, scope: !26)
80 !26 = distinct !DILexicalBlock(line: 12, column: 3, file: !1, scope: !7)
81 !30 = !DILocation(line: 13, column: 5, scope: !26)
82 !31 = !DILocation(line: 14, column: 1, scope: !7)
83 !32 = !DILocation(line: 18, column: 8, scope: !33)
84 !33 = distinct !DILexicalBlock(line: 18, column: 3, file: !1, scope: !8)
85 !35 = !DILocation(line: 19, column: 5, scope: !33)
86 !36 = !DILocation(line: 20, column: 1, scope: !8)
87 !37 = distinct !DILexicalBlock(line: 24, column: 3, file: !1, scope: !46)
88 !38 = !DILocation(line: 27, column: 3, scope: !37)
89 !39 = !DILocation(line: 31, column: 3, scope: !37)
90 !40 = !DILocation(line: 28, column: 9, scope: !37)
91 !41 = !DILocation(line: 29, column: 11, scope: !37)
92 !42 = !DILocation(line: 29, column: 7, scope: !37)
93 !43 = !DILocation(line: 27, column: 32, scope: !37)
94 !44 = !DILocation(line: 27, column: 30, scope: !37)
95 !45 = !DILocation(line: 27, column: 21, scope: !37)
96 !46 = distinct !DISubprogram(name: "test_multiple_failures", line: 26, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 26, file: !1, scope: !5, type: !6, retainedNodes: !2)
97
98 !50 = !{!50, !{!"llvm.loop.distribute.enable"}}
0 ; Legacy pass manager
1 ; RUN: opt < %s -transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
2 ; RUN: opt < %s -transform-warning -disable-output -pass-remarks-output=%t.yaml
3 ; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
4
5 ; New pass manager
6 ; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
7 ; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-output=%t.yaml
8 ; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
9
10
11 ; CHECK: warning: source.cpp:19:5: loop not unroll-and-jammed: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
12
13 ; YAML: --- !Failure
14 ; YAML-NEXT: Pass: transform-warning
15 ; YAML-NEXT: Name: FailedRequestedUnrollAndJamming
16 ; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 19, Column: 5 }
17 ; YAML-NEXT: Function: _Z17test_array_boundsPiS_i
18 ; YAML-NEXT: Args:
19 ; YAML-NEXT: - String: 'loop not unroll-and-jammed: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering'
20 ; YAML-NEXT: ...
21
22 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
23
24 define void @_Z17test_array_boundsPiS_i(i32* nocapture %A, i32* nocapture readonly %B, i32 %Length) !dbg !8 {
25 entry:
26 %cmp9 = icmp sgt i32 %Length, 0, !dbg !32
27 br i1 %cmp9, label %for.body.preheader, label %for.end, !dbg !32
28
29 for.body.preheader:
30 br label %for.body, !dbg !35
31
32 for.body:
33 %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
34 %arrayidx = getelementptr inbounds i32, i32* %B, i64 %indvars.iv, !dbg !35
35 %0 = load i32, i32* %arrayidx, align 4, !dbg !35, !tbaa !18
36 %idxprom1 = sext i32 %0 to i64, !dbg !35
37 %arrayidx2 = getelementptr inbounds i32, i32* %A, i64 %idxprom1, !dbg !35
38 %1 = load i32, i32* %arrayidx2, align 4, !dbg !35, !tbaa !18
39 %arrayidx4 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv, !dbg !35
40 store i32 %1, i32* %arrayidx4, align 4, !dbg !35, !tbaa !18
41 %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !32
42 %lftr.wideiv = trunc i64 %indvars.iv.next to i32, !dbg !32
43 %exitcond = icmp eq i32 %lftr.wideiv, %Length, !dbg !32
44 br i1 %exitcond, label %for.end.loopexit, label %for.body, !dbg !32, !llvm.loop !50
45
46 for.end.loopexit:
47 br label %for.end
48
49 for.end:
50 ret void, !dbg !36
51 }
52
53 !llvm.dbg.cu = !{!0}
54 !llvm.module.flags = !{!9, !10}
55 !llvm.ident = !{!11}
56
57 !0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, producer: "clang version 3.5.0", isOptimized: true, runtimeVersion: 6, emissionKind: LineTablesOnly, file: !1, enums: !2, retainedTypes: !2, globals: !2, imports: !2)
58 !1 = !DIFile(filename: "source.cpp", directory: ".")
59 !2 = !{}
60 !4 = distinct !DISubprogram(name: "test", line: 1, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 1, file: !1, scope: !5, type: !6, retainedNodes: !2)
61 !5 = !DIFile(filename: "source.cpp", directory: ".")
62 !6 = !DISubroutineType(types: !2)
63 !7 = distinct !DISubprogram(name: "test_disabled", line: 10, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 10, file: !1, scope: !5, type: !6, retainedNodes: !2)
64 !8 = distinct !DISubprogram(name: "test_array_bounds", line: 16, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 16, file: !1, scope: !5, type: !6, retainedNodes: !2)
65 !9 = !{i32 2, !"Dwarf Version", i32 2}
66 !10 = !{i32 2, !"Debug Info Version", i32 3}
67 !11 = !{!"clang version 3.5.0"}
68 !12 = !DILocation(line: 3, column: 8, scope: !13)
69 !13 = distinct !DILexicalBlock(line: 3, column: 3, file: !1, scope: !4)
70 !16 = !DILocation(line: 4, column: 5, scope: !17)
71 !17 = distinct !DILexicalBlock(line: 3, column: 36, file: !1, scope: !13)
72 !18 = !{!19, !19, i64 0}
73 !19 = !{!"int", !20, i64 0}
74 !20 = !{!"omnipotent char", !21, i64 0}
75 !21 = !{!"Simple C/C++ TBAA"}
76 !22 = !DILocation(line: 5, column: 9, scope: !23)
77 !23 = distinct !DILexicalBlock(line: 5, column: 9, file: !1, scope: !17)
78 !24 = !DILocation(line: 8, column: 1, scope: !4)
79 !25 = !DILocation(line: 12, column: 8, scope: !26)
80 !26 = distinct !DILexicalBlock(line: 12, column: 3, file: !1, scope: !7)
81 !30 = !DILocation(line: 13, column: 5, scope: !26)
82 !31 = !DILocation(line: 14, column: 1, scope: !7)
83 !32 = !DILocation(line: 18, column: 8, scope: !33)
84 !33 = distinct !DILexicalBlock(line: 18, column: 3, file: !1, scope: !8)
85 !35 = !DILocation(line: 19, column: 5, scope: !33)
86 !36 = !DILocation(line: 20, column: 1, scope: !8)
87 !37 = distinct !DILexicalBlock(line: 24, column: 3, file: !1, scope: !46)
88 !38 = !DILocation(line: 27, column: 3, scope: !37)
89 !39 = !DILocation(line: 31, column: 3, scope: !37)
90 !40 = !DILocation(line: 28, column: 9, scope: !37)
91 !41 = !DILocation(line: 29, column: 11, scope: !37)
92 !42 = !DILocation(line: 29, column: 7, scope: !37)
93 !43 = !DILocation(line: 27, column: 32, scope: !37)
94 !44 = !DILocation(line: 27, column: 30, scope: !37)
95 !45 = !DILocation(line: 27, column: 21, scope: !37)
96 !46 = distinct !DISubprogram(name: "test_multiple_failures", line: 26, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 26, file: !1, scope: !5, type: !6, retainedNodes: !2)
97
98 !50 = !{!50, !{!"llvm.loop.unroll_and_jam.enable"}}
0 ; Legacy pass manager
1 ; RUN: opt < %s -transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
2 ; RUN: opt < %s -transform-warning -disable-output -pass-remarks-output=%t.yaml
3 ; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
4
5 ; New pass manager
6 ; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
7 ; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-output=%t.yaml
8 ; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
9
10
11 ; CHECK: warning: source.cpp:19:5: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
12
13 ; YAML: --- !Failure
14 ; YAML-NEXT: Pass: transform-warning
15 ; YAML-NEXT: Name: FailedRequestedUnrolling
16 ; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 19, Column: 5 }
17 ; YAML-NEXT: Function: _Z17test_array_boundsPiS_i
18 ; YAML-NEXT: Args:
19 ; YAML-NEXT: - String: 'loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering'
20 ; YAML-NEXT: ...
21
22 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
23
24 define void @_Z17test_array_boundsPiS_i(i32* nocapture %A, i32* nocapture readonly %B, i32 %Length) !dbg !8 {
25 entry:
26 %cmp9 = icmp sgt i32 %Length, 0, !dbg !32
27 br i1 %cmp9, label %for.body.preheader, label %for.end, !dbg !32
28
29 for.body.preheader:
30 br label %for.body, !dbg !35
31
32 for.body:
33 %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
34 %arrayidx = getelementptr inbounds i32, i32* %B, i64 %indvars.iv, !dbg !35
35 %0 = load i32, i32* %arrayidx, align 4, !dbg !35, !tbaa !18
36 %idxprom1 = sext i32 %0 to i64, !dbg !35
37 %arrayidx2 = getelementptr inbounds i32, i32* %A, i64 %idxprom1, !dbg !35
38 %1 = load i32, i32* %arrayidx2, align 4, !dbg !35, !tbaa !18
39 %arrayidx4 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv, !dbg !35
40 store i32 %1, i32* %arrayidx4, align 4, !dbg !35, !tbaa !18
41 %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !32
42 %lftr.wideiv = trunc i64 %indvars.iv.next to i32, !dbg !32
43 %exitcond = icmp eq i32 %lftr.wideiv, %Length, !dbg !32
44 br i1 %exitcond, label %for.end.loopexit, label %for.body, !dbg !32, !llvm.loop !50
45
46 for.end.loopexit:
47 br label %for.end
48
49 for.end:
50 ret void, !dbg !36
51 }
52
53 !llvm.dbg.cu = !{!0}
54 !llvm.module.flags = !{!9, !10}
55 !llvm.ident = !{!11}
56
57 !0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, producer: "clang version 3.5.0", isOptimized: true, runtimeVersion: 6, emissionKind: LineTablesOnly, file: !1, enums: !2, retainedTypes: !2, globals: !2, imports: !2)
58 !1 = !DIFile(filename: "source.cpp", directory: ".")
59 !2 = !{}
60 !4 = distinct !DISubprogram(name: "test", line: 1, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 1, file: !1, scope: !5, type: !6, retainedNodes: !2)
61 !5 = !DIFile(filename: "source.cpp", directory: ".")
62 !6 = !DISubroutineType(types: !2)
63 !7 = distinct !DISubprogram(name: "test_disabled", line: 10, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 10, file: !1, scope: !5, type: !6, retainedNodes: !2)
64 !8 = distinct !DISubprogram(name: "test_array_bounds", line: 16, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 16, file: !1, scope: !5, type: !6, retainedNodes: !2)
65 !9 = !{i32 2, !"Dwarf Version", i32 2}
66 !10 = !{i32 2, !"Debug Info Version", i32 3}
67 !11 = !{!"clang version 3.5.0"}
68 !12 = !DILocation(line: 3, column: 8, scope: !13)
69 !13 = distinct !DILexicalBlock(line: 3, column: 3, file: !1, scope: !4)
70 !16 = !DILocation(line: 4, column: 5, scope: !17)
71 !17 = distinct !DILexicalBlock(line: 3, column: 36, file: !1, scope: !13)
72 !18 = !{!19, !19, i64 0}
73 !19 = !{!"int", !20, i64 0}
74 !20 = !{!"omnipotent char", !21, i64 0}
75 !21 = !{!"Simple C/C++ TBAA"}
76 !22 = !DILocation(line: 5, column: 9, scope: !23)
77 !23 = distinct !DILexicalBlock(line: 5, column: 9, file: !1, scope: !17)
78 !24 = !DILocation(line: 8, column: 1, scope: !4)
79 !25 = !DILocation(line: 12, column: 8, scope: !26)
80 !26 = distinct !DILexicalBlock(line: 12, column: 3, file: !1, scope: !7)
81 !30 = !DILocation(line: 13, column: 5, scope: !26)
82 !31 = !DILocation(line: 14, column: 1, scope: !7)
83 !32 = !DILocation(line: 18, column: 8, scope: !33)
84 !33 = distinct !DILexicalBlock(line: 18, column: 3, file: !1, scope: !8)
85 !35 = !DILocation(line: 19, column: 5, scope: !33)
86 !36 = !DILocation(line: 20, column: 1, scope: !8)
87 !37 = distinct !DILexicalBlock(line: 24, column: 3, file: !1, scope: !46)
88 !38 = !DILocation(line: 27, column: 3, scope: !37)
89 !39 = !DILocation(line: 31, column: 3, scope: !37)
90 !40 = !DILocation(line: 28, column: 9, scope: !37)
91 !41 = !DILocation(line: 29, column: 11, scope: !37)
92 !42 = !DILocation(line: 29, column: 7, scope: !37)
93 !43 = !DILocation(line: 27, column: 32, scope: !37)
94 !44 = !DILocation(line: 27, column: 30, scope: !37)
95 !45 = !DILocation(line: 27, column: 21, scope: !37)
96 !46 = distinct !DISubprogram(name: "test_multiple_failures", line: 26, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 26, file: !1, scope: !5, type: !6, retainedNodes: !2)
97
98 !50 = !{!50, !{!"llvm.loop.unroll.enable"}}
0 ; Legacy pass manager
1 ; RUN: opt < %s -transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
2 ; RUN: opt < %s -transform-warning -disable-output -pass-remarks-output=%t.yaml
3 ; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
4
5 ; New pass manager
6 ; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
7 ; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-output=%t.yaml
8 ; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
9
10
11 ; C/C++ code for tests
12 ; void test(int *A, int Length) {
13 ; #pragma clang loop vectorize(enable) interleave(enable)
14 ; for (int i = 0; i < Length; i++) {
15 ; A[i] = i;
16 ; if (A[i] > Length)
17 ; break;
18 ; }
19 ; }
20 ; File, line, and column should match those specified in the metadata
21 ; CHECK: warning: source.cpp:19:5: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
22
23 ; YAML: --- !Failure
24 ; YAML-NEXT: Pass: transform-warning
25 ; YAML-NEXT: Name: FailedRequestedVectorization
26 ; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 19, Column: 5 }
27 ; YAML-NEXT: Function: _Z17test_array_boundsPiS_i
28 ; YAML-NEXT: Args:
29 ; YAML-NEXT: - String: 'loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering'
30 ; YAML-NEXT: ...
31
32 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
33
34 define void @_Z17test_array_boundsPiS_i(i32* nocapture %A, i32* nocapture readonly %B, i32 %Length) !dbg !8 {
35 entry:
36 %cmp9 = icmp sgt i32 %Length, 0, !dbg !32
37 br i1 %cmp9, label %for.body.preheader, label %for.end, !dbg !32, !llvm.loop !34
38
39 for.body.preheader:
40 br label %for.body, !dbg !35
41
42 for.body:
43 %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
44 %arrayidx = getelementptr inbounds i32, i32* %B, i64 %indvars.iv, !dbg !35
45 %0 = load i32, i32* %arrayidx, align 4, !dbg !35, !tbaa !18
46 %idxprom1 = sext i32 %0 to i64, !dbg !35
47 %arrayidx2 = getelementptr inbounds i32, i32* %A, i64 %idxprom1, !dbg !35
48 %1 = load i32, i32* %arrayidx2, align 4, !dbg !35, !tbaa !18
49 %arrayidx4 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv, !dbg !35
50 store i32 %1, i32* %arrayidx4, align 4, !dbg !35, !tbaa !18
51 %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !32
52 %lftr.wideiv = trunc i64 %indvars.iv.next to i32, !dbg !32
53 %exitcond = icmp eq i32 %lftr.wideiv, %Length, !dbg !32
54 br i1 %exitcond, label %for.end.loopexit, label %for.body, !dbg !32, !llvm.loop !34
55
56 for.end.loopexit:
57 br label %for.end
58
59 for.end:
60 ret void, !dbg !36
61 }
62
63 !llvm.dbg.cu = !{!0}
64 !llvm.module.flags = !{!9, !10}
65 !llvm.ident = !{!11}
66
67 !0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, producer: "clang version 3.5.0", isOptimized: true, runtimeVersion: 6, emissionKind: LineTablesOnly, file: !1, enums: !2, retainedTypes: !2, globals: !2, imports: !2)
68 !1 = !DIFile(filename: "source.cpp", directory: ".")
69 !2 = !{}
70 !4 = distinct !DISubprogram(name: "test", line: 1, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 1, file: !1, scope: !5, type: !6, retainedNodes: !2)
71 !5 = !DIFile(filename: "source.cpp", directory: ".")
72 !6 = !DISubroutineType(types: !2)
73 !7 = distinct !DISubprogram(name: "test_disabled", line: 10, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 10, file: !1, scope: !5, type: !6, retainedNodes: !2)
74 !8 = distinct !DISubprogram(name: "test_array_bounds", line: 16, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 16, file: !1, scope: !5, type: !6, retainedNodes: !2)
75 !9 = !{i32 2, !"Dwarf Version", i32 2}
76 !10 = !{i32 2, !"Debug Info Version", i32 3}
77 !11 = !{!"clang version 3.5.0"}
78 !12 = !DILocation(line: 3, column: 8, scope: !13)
79 !13 = distinct !DILexicalBlock(line: 3, column: 3, file: !1, scope: !4)
80 !14 = !{!14, !15, !15}
81 !15 = !{!"llvm.loop.vectorize.enable", i1 true}
82 !16 = !DILocation(line: 4, column: 5, scope: !17)
83 !17 = distinct !DILexicalBlock(line: 3, column: 36, file: !1, scope: !13)
84 !18 = !{!19, !19, i64 0}
85 !19 = !{!"int", !20, i64 0}
86 !20 = !{!"omnipotent char", !21, i64 0}
87 !21 = !{!"Simple C/C++ TBAA"}
88 !22 = !DILocation(line: 5, column: 9, scope: !23)
89 !23 = distinct !DILexicalBlock(line: 5, column: 9, file: !1, scope: !17)
90 !24 = !DILocation(line: 8, column: 1, scope: !4)
91 !25 = !DILocation(line: 12, column: 8, scope: !26)
92 !26 = distinct !DILexicalBlock(line: 12, column: 3, file: !1, scope: !7)
93 !27 = !{!27, !28, !29}
94 !28 = !{!"llvm.loop.interleave.count", i32 1}
95 !29 = !{!"llvm.loop.vectorize.width", i32 1}
96 !30 = !DILocation(line: 13, column: 5, scope: !26)
97 !31 = !DILocation(line: 14, column: 1, scope: !7)
98 !32 = !DILocation(line: 18, column: 8, scope: !33)
99 !33 = distinct !DILexicalBlock(line: 18, column: 3, file: !1, scope: !8)
100 !34 = !{!34, !15}
101 !35 = !DILocation(line: 19, column: 5, scope: !33)
102 !36 = !DILocation(line: 20, column: 1, scope: !8)
103 !37 = distinct !DILexicalBlock(line: 24, column: 3, file: !1, scope: !46)
104 !38 = !DILocation(line: 27, column: 3, scope: !37)
105 !39 = !DILocation(line: 31, column: 3, scope: !37)
106 !40 = !DILocation(line: 28, column: 9, scope: !37)
107 !41 = !DILocation(line: 29, column: 11, scope: !37)
108 !42 = !DILocation(line: 29, column: 7, scope: !37)
109 !43 = !DILocation(line: 27, column: 32, scope: !37)
110 !44 = !DILocation(line: 27, column: 30, scope: !37)
111 !45 = !DILocation(line: 27, column: 21, scope: !37)
112 !46 = distinct !DISubprogram(name: "test_multiple_failures", line: 26, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 26, file: !1, scope: !5, type: !6, retainedNodes: !2)
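The transform-warning tests above share one pattern: a loop still carries a forced-transformation attribute when the `transform-warning` pass runs, so a warning is reported for it. The Python sketch below is only a simplified model of that pattern, not LLVM's implementation. The function name `missed_transformation_warnings`, the tuple encoding of loop attributes, and the choice to ignore attribute operands are assumptions made for illustration. Only the two message prefixes visible in the CHECK lines above are modeled.

```python
# Common tail of the diagnostic, copied from the CHECK lines above.
REASON = (
    "the optimizer was unable to perform the requested transformation; "
    "the transformation might be disabled or specified as part of an "
    "unsupported transformation ordering"
)

# Attribute name -> message prefix, limited to the cases the tests show.
LEFTOVER_PREFIX = {
    "llvm.loop.unroll.enable": "loop not unrolled",
    "llvm.loop.vectorize.enable": "loop not vectorized",
}


def missed_transformation_warnings(loop_attrs):
    """Return one warning per forced attribute still attached to the loop.

    loop_attrs is a list of tuples (name, *operands). Operands are ignored
    in this simplified, hypothetical model.
    """
    warnings = []
    for attr in loop_attrs:
        prefix = LEFTOVER_PREFIX.get(attr[0])
        if prefix is not None:
            warnings.append(f"{prefix}: {REASON}")
    return warnings
```

Calling `missed_transformation_warnings([("llvm.loop.unroll.enable",)])` yields a single message that begins with `loop not unrolled:`, which mirrors the text the second test checks for.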
0 ; RUN: opt -loop-unroll -unroll-count=2 -S < %s | FileCheck %s
1 ;
2 ; Check that the disable_nonforced loop property is honored by
3 ; loop unroll.
4 ;
5 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
6
7 ; CHECK-LABEL: @disable_nonforced(
8 ; CHECK: load
9 ; CHECK-NOT: load
10 define void @disable_nonforced(i32* nocapture %a) {
11 entry:
12 br label %for.body
13
14 for.body:
15 %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
16 %arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
17 %0 = load i32, i32* %arrayidx, align 4
18 %inc = add nsw i32 %0, 1
19 store i32 %inc, i32* %arrayidx, align 4
20 %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
21 %exitcond = icmp eq i64 %indvars.iv.next, 64
22 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
23
24 for.end:
25 ret void
26 }
27
28 !0 = !{!0, !{!"llvm.loop.disable_nonforced"}}
0 ; RUN: opt -loop-unroll -unroll-count=2 -S < %s | FileCheck %s
1 ;
2 ; Check that the llvm.loop.unroll.count loop property overrides
3 ; llvm.loop.disable_nonforced.
4 ;
5 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
6
7 ; CHECK-LABEL: @disable_nonforced_count(
8 ; CHECK: store
9 ; CHECK: store
10 ; CHECK-NOT: store
11 define void @disable_nonforced_count(i32* nocapture %a) {
12 entry:
13 br label %for.body
14
15 for.body:
16 %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
17 %arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
18 %0 = load i32, i32* %arrayidx, align 4
19 %inc = add nsw i32 %0, 1
20 store i32 %inc, i32* %arrayidx, align 4
21 %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
22 %exitcond = icmp eq i64 %indvars.iv.next, 64
23 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
24
25 for.end:
26 ret void
27 }
28
29 !0 = !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.unroll.count", i32 2}}
0 ; RUN: opt -loop-unroll -unroll-count=2 -S < %s | FileCheck %s
1 ;
2 ; Check that the llvm.loop.unroll.enable loop property overrides
3 ; llvm.loop.disable_nonforced.
4 ;
5 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
6
7 ; CHECK-LABEL: @disable_nonforced_enable(
8 ; CHECK: store
9 ; CHECK: store
10 ; CHECK-NOT: store
11 define void @disable_nonforced_enable(i32* nocapture %a) {
12 entry:
13 br label %for.body
14
15 for.body:
16 %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
17 %arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
18 %0 = load i32, i32* %arrayidx, align 4
19 %inc = add nsw i32 %0, 1
20 store i32 %inc, i32* %arrayidx, align 4
21 %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
22 %exitcond = icmp eq i64 %indvars.iv.next, 64
23 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
24
25 for.end:
26 ret void
27 }
28
29 !0 = !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.unroll.enable"}}
0 ; RUN: opt -loop-unroll -S < %s | FileCheck %s
1 ;
2 ; Check that the llvm.loop.unroll.full loop property overrides
3 ; llvm.loop.disable_nonforced.
4 ;
5 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
6
7 ; CHECK-LABEL: @disable_nonforced_full(
8 ; CHECK: store
9 ; CHECK: store
10 ; CHECK: store
11 ; CHECK: store
12 ; CHECK-NOT: store
13 define void @disable_nonforced_full(i32* nocapture %a) {
14 entry:
15 br label %for.body
16
17 for.body:
18 %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
19 %arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
20 %0 = load i32, i32* %arrayidx, align 4
21 %inc = add nsw i32 %0, 1
22 store i32 %inc, i32* %arrayidx, align 4
23 %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
24 %exitcond = icmp eq i64 %indvars.iv.next, 4
25 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
26
27 for.end:
28 ret void
29 }
30
31 !0 = !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.unroll.full"}}
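The four unroll tests above can be summarized as one rule: `llvm.loop.disable_nonforced` suppresses unrolling requested only by heuristics or command-line options, while an explicit unroll attribute on the same loop still takes effect. The Python sketch below models just that decision as the tests exercise it. It is not LLVM's code; the function name `unroll_permitted` and the tuple encoding of attributes are hypothetical.

```python
# Unroll attributes that the tests above show overriding disable_nonforced.
FORCING_UNROLL_ATTRS = {
    "llvm.loop.unroll.count",
    "llvm.loop.unroll.enable",
    "llvm.loop.unroll.full",
}


def unroll_permitted(loop_attrs):
    """Decide whether the unroller may touch a loop (simplified model).

    loop_attrs is a list of tuples (name, *operands), standing in for the
    operands of the loop's !llvm.loop metadata node.
    """
    names = {attr[0] for attr in loop_attrs}
    if "llvm.loop.disable_nonforced" not in names:
        # No restriction: heuristics and command-line options may apply.
        return True
    # Restricted: only an explicit (forced) unroll request is honored.
    return bool(names & FORCING_UNROLL_ATTRS)
```

For the first test's metadata, `unroll_permitted([("llvm.loop.disable_nonforced",)])` is `False`, which corresponds to the single `load` that test expects despite `-unroll-count=2`.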
0 ; RUN: opt < %s -S -loop-unroll -unroll-count=2 | FileCheck %s -check-prefixes=COUNT,COMMON
1 ; RUN: opt < %s -S -loop-unroll -unroll-runtime=true -unroll-runtime-epilog=true | FileCheck %s -check-prefixes=EPILOG,COMMON
2 ; RUN: opt < %s -S -loop-unroll -unroll-runtime=true -unroll-runtime-epilog=false | FileCheck %s -check-prefixes=PROLOG,COMMON
3 ;
4 ; Check that followup-attributes are applied after LoopUnroll.
5 ;
6 target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
7
8 define i32 @test(i32* nocapture %a, i32 %n) nounwind uwtable readonly {
9 entry:
10 %cmp1 = icmp eq i32 %n, 0
11 br i1 %cmp1, label %for.end, label %for.body
12
13 for.body: ; preds = %for.body, %entry
14 %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
15 %sum.02 = phi i32 [ %add, %for.body ], [ 0, %entry ]
16 %arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
17 %0 = load i32, i32* %arrayidx, align 4
18 %add = add nsw i32 %0, %sum.02
19 %indvars.iv.next = add i64 %indvars.iv, 1
20 %lftr.wideiv = trunc i64 %indvars.iv.next to i32
21 %exitcond = icmp eq i32 %lftr.wideiv, %n
22 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !4
23
24 for.end: ; preds = %for.body, %entry
25 %sum.0.lcssa = phi i32 [ 0, %entry ], [ %add, %for.body ]
26 ret i32 %sum.0.lcssa
27 }
28
29 !1 = !{!"llvm.loop.unroll.followup_all", !{!"FollowupAll"}}
30 !2 = !{!"llvm.loop.unroll.followup_unrolled", !{!"FollowupUnrolled"}}
31 !3 = !{!"llvm.loop.unroll.followup_remainder", !{!"FollowupRemainder"}}
32 !4 = distinct !{!4, !1, !2, !3}
33
34
35 ; COMMON-LABEL: @test(
36
37
38 ; COUNT: br i1 %exitcond.1, label %for.end.loopexit, label %for.body, !llvm.loop ![[LOOP:[0-9]+]]
39
40 ; COUNT: ![[FOLLOWUP_ALL:[0-9]+]] = !{!"FollowupAll"}
41 ; COUNT: ![[FOLLOWUP_UNROLLED:[0-9]+]] = !{!"FollowupUnrolled"}
42 ; COUNT: ![[LOOP]] = distinct !{![[LOOP]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_UNROLLED]]}
43
44
45 ; EPILOG: br i1 %niter.ncmp.7, label %for.end.loopexit.unr-lcssa.loopexit, label %for.body, !llvm.loop ![[LOOP_0:[0-9]+]]
46 ; EPILOG: br i1 %epil.iter.cmp, label %for.body.epil, label %for.end.loopexit.epilog-lcssa, !llvm.loop ![[LOOP_2:[0-9]+]]
47
48 ; EPILOG: ![[LOOP_0]] = distinct !{![[LOOP_0]], ![[FOLLOWUP_ALL:[0-9]+]], ![[FOLLOWUP_UNROLLED:[0-9]+]]}
49 ; EPILOG: ![[FOLLOWUP_ALL]] = !{!"FollowupAll"}
50 ; EPILOG: ![[FOLLOWUP_UNROLLED]] = !{!"FollowupUnrolled"}
51 ; EPILOG: ![[LOOP_2]] = distinct !{![[LOOP_2]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_REMAINDER:[0-9]+]]}
52 ; EPILOG: ![[FOLLOWUP_REMAINDER]] = !{!"FollowupRemainder"}
53
54
55 ; PROLOG: br i1 %prol.iter.cmp, label %for.body.prol, label %for.body.prol.loopexit.unr-lcssa, !llvm.loop ![[LOOP_0:[0-9]+]]
56 ; PROLOG: br i1 %exitcond.7, label %for.end.loopexit.unr-lcssa, label %for.body, !llvm.loop ![[LOOP_2:[0-9]+]]
57
58 ; PROLOG: ![[LOOP_0]] = distinct !{![[LOOP_0]], ![[FOLLOWUP_ALL:[0-9]+]], ![[FOLLOWUP_REMAINDER:[0-9]+]]}
59 ; PROLOG: ![[FOLLOWUP_ALL]] = !{!"FollowupAll"}
60 ; PROLOG: ![[FOLLOWUP_REMAINDER]] = !{!"FollowupRemainder"}
61 ; PROLOG: ![[LOOP_2]] = distinct !{![[LOOP_2]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_UNROLLED:[0-9]+]]}
62 ; PROLOG: ![[FOLLOWUP_UNROLLED]] = !{!"FollowupUnrolled"}
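The test above expects the unrolled loop to end up with `FollowupAll` and `FollowupUnrolled`, and the remainder loop with `FollowupAll` and `FollowupRemainder`. The Python sketch below is a rough model of how such a loop ID could be assembled. It is an illustration rather than LLVM's implementation: the function name `followup_loop_attrs`, the tuple encoding, and the rule that attributes outside the `llvm.loop.unroll.` prefix are inherited are assumptions, since this test does not exercise inheritance.

```python
def followup_loop_attrs(orig_attrs, followup_names, drop_prefix):
    """Compute the attribute list for a loop produced by a transformation.

    orig_attrs:     list of tuples (name, *operands) on the original loop.
    followup_names: followup attributes that apply to the produced loop.
    drop_prefix:    attributes whose name has this prefix are not inherited.

    Returns None when none of the followup attributes is present, meaning
    the caller would fall back to its default behaviour.
    """
    by_name = {attr[0]: attr for attr in orig_attrs}
    if not any(name in by_name for name in followup_names):
        return None
    # Inherit everything that does not belong to the transformation itself.
    result = [a for a in orig_attrs if not a[0].startswith(drop_prefix)]
    # Append the operands of each matching followup attribute, in order.
    for name in followup_names:
        if name in by_name:
            result.extend(by_name[name][1:])
    return result


# The loop metadata from the test, encoded as tuples.
loop = [
    ("llvm.loop.unroll.followup_all", ("FollowupAll",)),
    ("llvm.loop.unroll.followup_unrolled", ("FollowupUnrolled",)),
    ("llvm.loop.unroll.followup_remainder", ("FollowupRemainder",)),
]
unrolled = followup_loop_attrs(
    loop,
    ["llvm.loop.unroll.followup_all", "llvm.loop.unroll.followup_unrolled"],
    "llvm.loop.unroll.",
)
remainder = followup_loop_attrs(
    loop,
    ["llvm.loop.unroll.followup_all", "llvm.loop.unroll.followup_remainder"],
    "llvm.loop.unroll.",
)
```

Here `unrolled` holds the `FollowupAll` and `FollowupUnrolled` entries and `remainder` holds `FollowupAll` and `FollowupRemainder`, matching the order of operands in the COUNT, EPILOG, and PROLOG check lines.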
0 ; RUN: opt -loop-unroll-and-jam -allow-unroll-and-jam -unroll-and-jam-count=2 -S < %s | FileCheck %s
1 ;
2 ; Check that the disable_nonforced loop property is honored by
3 ; loop unroll-and-jam.
4 ;
5 target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
6
7 ; CHECK-LABEL: disable_nonforced
8 ; CHECK: load
9 ; CHECK-NOT: load
10 define void @disable_nonforced(i32 %I, i32 %J, i32* noalias nocapture %A, i32* noalias nocapture readonly %B) {
11 entry:
12 %cmp = icmp ne i32 %J, 0
13 %cmp122 = icmp ne i32 %I, 0
14 %or.cond = and i1 %cmp, %cmp122
15 br i1 %or.cond, label %for.outer.preheader, label %for.end
16
17 for.outer.preheader:
18 br label %for.outer
19
20 for.outer:
21 %i.us = phi i32 [ %add8.us, %for.latch ], [ 0, %for.outer.preheader ]
22 br label %for.inner
23
24 for.inner:
25 %j.us = phi i32 [ 0, %for.outer ], [ %inc.us, %for.inner ]
26 %sum1.us = phi i32 [ 0, %for.outer ], [ %add.us, %for.inner ]
27 %arrayidx.us = getelementptr inbounds i32, i32* %B, i32 %j.us
28 %0 = load i32, i32* %arrayidx.us, align 4
29 %add.us = add i32 %0, %sum1.us
30 %inc.us = add nuw i32 %j.us, 1
31 %exitcond = icmp eq i32 %inc.us, %J
32 br i1 %exitcond, label %for.latch, label %for.inner
33
34 for.latch:
35 %add.us.lcssa = phi i32 [ %add.us, %for.inner ]
36 %arrayidx6.us = getelementptr inbounds i32, i32* %A, i32 %i.us
37 store i32 %add.us.lcssa, i32* %arrayidx6.us, align 4
38 %add8.us = add nuw i32 %i.us, 1
39 %exitcond25 = icmp eq i32 %add8.us, %I
40 br i1 %exitcond25, label %for.end.loopexit, label %for.outer, !llvm.loop !0
41
42 for.end.loopexit:
43 br label %for.end
44
45 for.end:
46 ret void
47 }
48
49 !0 = distinct !{!0, !{!"llvm.loop.disable_nonforced"}}
0 ; RUN: opt -loop-unroll-and-jam -allow-unroll-and-jam -S < %s | FileCheck %s
1 ;
2 ; Verify that the llvm.loop.unroll_and_jam.count loop property overrides
3 ; llvm.loop.disable_nonforced.
4 ;
5 target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
6
7 ; CHECK-LABEL: @disable_nonforced_enable(
8 ; CHECK: load
9 ; CHECK: load
10 ; CHECK-NOT: load
11 ; CHECK: br i1
12 define void @disable_nonforced_enable(i32 %I, i32 %J, i32* noalias nocapture %A, i32* noalias nocapture readonly %B) {
13 entry:
14 %cmp = icmp ne i32 %J, 0
15 %cmp122 = icmp ne i32 %I, 0
16 %or.cond = and i1 %cmp, %cmp122
17 br i1 %or.cond, label %for.outer.preheader, label %for.end
18
19 for.outer.preheader:
20 br label %for.outer
21
22 for.outer:
23 %i.us = phi i32 [ %add8.us, %for.latch ], [ 0, %for.outer.preheader ]
24 br label %for.inner
25
26 for.inner:
27 %j.us = phi i32 [ 0, %for.outer ], [ %inc.us, %for.inner ]
28 %sum1.us = phi i32 [ 0, %for.outer ], [ %add.us, %for.inner ]
29 %arrayidx.us = getelementptr inbounds i32, i32* %B, i32 %j.us
30 %0 = load i32, i32* %arrayidx.us, align 4
31 %add.us = add i32 %0, %sum1.us
32 %inc.us = add nuw i32 %j.us, 1
33 %exitcond = icmp eq i32 %inc.us, %J
34 br i1 %exitcond, label %for.latch, label %for.inner
35
36 for.latch:
37 %add.us.lcssa = phi i32 [ %add.us, %for.inner ]
38 %arrayidx6.us = getelementptr inbounds i32, i32* %A, i32 %i.us
39 store i32 %add.us.lcssa, i32* %arrayidx6.us, align 4
40 %add8.us = add nuw i32 %i.us, 1
41 %exitcond25 = icmp eq i32 %add8.us, %I
42 br i1 %exitcond25, label %for.end.loopexit, label %for.outer, !llvm.loop !0
43
44 for.end.loopexit:
45 br label %for.end
46
47 for.end:
48 ret void
49 }
50
51 !0 = distinct !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.unroll_and_jam.count", i32 2}}
0 ; RUN: opt -loop-unroll-and-jam -allow-unroll-and-jam -unroll-and-jam-count=2 -S < %s | FileCheck %s
1 ;
2 ; Verify that the llvm.loop.unroll_and_jam.enable loop property
3 ; overrides llvm.loop.disable_nonforced.
4 ;
5 target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
6
7 ; CHECK-LABEL: disable_nonforced_enable
8 ; CHECK: load
9 ; CHECK: load
10 ; CHECK-NOT: load
11 ; CHECK: br i1
12 define void @disable_nonforced_enable(i32 %I, i32 %J, i32* noalias nocapture %A, i32* noalias nocapture readonly %B) {
13 entry:
14 %cmp = icmp ne i32 %J, 0
15 %cmp122 = icmp ne i32 %I, 0
16 %or.cond = and i1 %cmp, %cmp122
17 br i1 %or.cond, label %for.outer.preheader, label %for.end
18
19 for.outer.preheader:
20 br label %for.outer
21
22 for.outer:
23 %i.us = phi i32 [ %add8.us, %for.latch ], [ 0, %for.outer.preheader ]
24 br label %for.inner
25
26 for.inner:
27 %j.us = phi i32 [ 0, %for.outer ], [ %inc.us, %for.inner ]
28 %sum1.us = phi i32 [ 0, %for.outer ], [ %add.us, %for.inner ]
29 %arrayidx.us = getelementptr inbounds i32, i32* %B, i32 %j.us
30 %0 = load i32, i32* %arrayidx.us, align 4
31 %add.us = add i32 %0, %sum1.us
32 %inc.us = add nuw i32 %j.us, 1
33 %exitcond = icmp eq i32 %inc.us, %J
34 br i1 %exitcond, label %for.latch, label %for.inner
35
36 for.latch:
37 %add.us.lcssa = phi i32 [ %add.us, %for.inner ]
38 %arrayidx6.us = getelementptr inbounds i32, i32* %A, i32 %i.us
39 store i32 %add.us.lcssa, i32* %arrayidx6.us, align 4
40 %add8.us = add nuw i32 %i.us, 1
41 %exitcond25 = icmp eq i32 %add8.us, %I
42 br i1 %exitcond25, label %for.end.loopexit, label %for.outer, !llvm.loop !0
43
44 for.end.loopexit:
45 br label %for.end
46
47 for.end:
48 ret void
49 }
50
51 !0 = distinct !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.unroll_and_jam.enable"}}
0 ; RUN: opt -basicaa -tbaa -loop-unroll-and-jam -allow-unroll-and-jam -unroll-and-jam-count=4 -unroll-remainder < %s -S | FileCheck %s
1 ;
2 ; Check that followup attributes are set in the new loops.
3 ;
4 target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
5
6 define void @followup(i32 %I, i32 %J, i32* noalias nocapture %A, i32* noalias nocapture readonly %B) {
7 entry:
8 %cmp = icmp ne i32 %J, 0
9 %cmp122 = icmp ne i32 %I, 0
10 %or.cond = and i1 %cmp, %cmp122
11 br i1 %or.cond, label %for.outer.preheader, label %for.end
12
13 for.outer.preheader:
14 br label %for.outer
15
16 for.outer:
17 %i.us = phi i32 [ %add8.us, %for.latch ], [ 0, %for.outer.preheader ]
18 br label %for.inner
19
20 for.inner:
21 %j.us = phi i32 [ 0, %for.outer ], [ %inc.us, %for.inner ]
22 %sum1.us = phi i32 [ 0, %for.outer ], [ %add.us, %for.inner ]
23 %arrayidx.us = getelementptr inbounds i32, i32* %B, i32 %j.us
24 %0 = load i32, i32* %arrayidx.us, align 4
25 %add.us = add i32 %0, %sum1.us
26 %inc.us = add nuw i32 %j.us, 1
27 %exitcond = icmp eq i32 %inc.us, %J
28 br i1 %exitcond, label %for.latch, label %for.inner
29
30 for.latch:
31 %add.us.lcssa = phi i32 [ %add.us, %for.inner ]
32 %arrayidx6.us = getelementptr inbounds i32, i32* %A, i32 %i.us
33 store i32 %add.us.lcssa, i32* %arrayidx6.us, align 4
34 %add8.us = add nuw i32 %i.us, 1
35 %exitcond25 = icmp eq i32 %add8.us, %I
36 br i1 %exitcond25, label %for.end.loopexit, label %for.outer, !llvm.loop !0
37
38 for.end.loopexit:
39 br label %for.end
40
41 for.end:
42 ret void
43 }
44
45 !0 = !{!0, !1, !2, !3, !4, !6}
46 !1 = !{!"llvm.loop.unroll_and_jam.enable"}
47 !2 = !{!"llvm.loop.unroll_and_jam.followup_outer", !{!"FollowupOuter"}}
48 !3 = !{!"llvm.loop.unroll_and_jam.followup_inner", !{!"FollowupInner"}}
49 !4 = !{!"llvm.loop.unroll_and_jam.followup_all", !{!"FollowupAll"}}
50 !6 = !{!"llvm.loop.unroll_and_jam.followup_remainder_inner", !{!"FollowupRemainderInner"}}
51
52
53 ; CHECK: br i1 %exitcond.3, label %for.latch, label %for.inner, !llvm.loop ![[LOOP_INNER:[0-9]+]]
54 ; CHECK: br i1 %niter.ncmp.3, label %for.end.loopexit.unr-lcssa.loopexit, label %for.outer, !llvm.loop ![[LOOP_OUTER:[0-9]+]]
55 ; CHECK: br i1 %exitcond.epil, label %for.latch.epil, label %for.inner.epil, !llvm.loop ![[LOOP_REMAINDER_INNER:[0-9]+]]
56 ; CHECK: br i1 %exitcond.epil.1, label %for.latch.epil.1, label %for.inner.epil.1, !llvm.loop ![[LOOP_REMAINDER_INNER]]
57 ; CHECK: br i1 %exitcond.epil.2, label %for.latch.epil.2, label %for.inner.epil.2, !llvm.loop ![[LOOP_REMAINDER_INNER]]
58
59 ; CHECK: ![[LOOP_INNER]] = distinct !{![[LOOP_INNER]], ![[FOLLOWUP_ALL:[0-9]+]], ![[FOLLOWUP_INNER:[0-9]+]]}
60 ; CHECK: ![[FOLLOWUP_ALL]] = !{!"FollowupAll"}
61 ; CHECK: ![[FOLLOWUP_INNER]] = !{!"FollowupInner"}
62 ; CHECK: ![[LOOP_OUTER]] = distinct !{![[LOOP_OUTER]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_OUTER:[0-9]+]]}
63 ; CHECK: ![[FOLLOWUP_OUTER]] = !{!"FollowupOuter"}
64 ; CHECK: ![[LOOP_REMAINDER_INNER]] = distinct !{![[LOOP_REMAINDER_INNER]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_REMAINDER_INNER:[0-9]+]]}
65 ; CHECK: ![[FOLLOWUP_REMAINDER_INNER]] = !{!"FollowupRemainderInner"}
315 !8 = distinct !{!"llvm.loop.unroll.disable"}
316 !9 = distinct !{!9, !10}
317 !10 = distinct !{!"llvm.loop.unroll.enable"}
- 318 !11 = distinct !{!11, !8, !6}
+ 318 !11 = distinct !{!11, !8, !6}
- 0 ; RUN: opt < %s -loop-vectorize -S -pass-remarks-missed='loop-vectorize' -pass-remarks-analysis='loop-vectorize' 2>&1 | FileCheck %s
- 1 ; RUN: opt < %s -loop-vectorize -o /dev/null -pass-remarks-output=%t.yaml
+ 0 ; RUN: opt < %s -loop-vectorize -transform-warning -S -pass-remarks-missed='loop-vectorize' -pass-remarks-analysis='loop-vectorize' 2>&1 | FileCheck %s
+ 1 ; RUN: opt < %s -loop-vectorize -transform-warning -o /dev/null -pass-remarks-output=%t.yaml
2 ; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
3
- 4 ; RUN: opt < %s -passes=loop-vectorize -S -pass-remarks-missed='loop-vectorize' -pass-remarks-analysis='loop-vectorize' 2>&1 | FileCheck %s
- 5 ; RUN: opt < %s -passes=loop-vectorize -o /dev/null -pass-remarks-output=%t.yaml
+ 4 ; RUN: opt < %s -passes=loop-vectorize,transform-warning -S -pass-remarks-missed='loop-vectorize' -pass-remarks-analysis='loop-vectorize' 2>&1 | FileCheck %s
+ 5 ; RUN: opt < %s -passes=loop-vectorize,transform-warning -o /dev/null -pass-remarks-output=%t.yaml
6 ; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
7
8 ; C/C++ code for tests
32 ; }
33 ; CHECK: remark: source.cpp:19:5: loop not vectorized: cannot identify array bounds
34 ; CHECK: remark: source.cpp:19:5: loop not vectorized
- 35 ; CHECK: warning: source.cpp:19:5: loop not vectorized: failed explicitly specified loop vectorization
+ 35 ; CHECK: warning: source.cpp:19:5: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
36
37 ; int foo();
38 ; void test_multiple_failures(int *A) {
93 ; YAML-NEXT: - String: ')'
94 ; YAML-NEXT: ...
95 ; YAML-NEXT: --- !Failure
- 96 ; YAML-NEXT: Pass: loop-vectorize
+ 96 ; YAML-NEXT: Pass: transform-warning
97 ; YAML-NEXT: Name: FailedRequestedVectorization
98 ; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 19, Column: 5 }
99 ; YAML-NEXT: Function: _Z17test_array_boundsPiS_i
100 ; YAML-NEXT: Args:
- 101 ; YAML-NEXT: - String: 'loop not vectorized: '
- 102 ; YAML-NEXT: - String: failed explicitly specified loop vectorization
+ 101 ; YAML-NEXT: - String: 'loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering'
102 ; YAML-NEXT: ...
103 ; YAML-NEXT: --- !Analysis
104 ; YAML-NEXT: Pass: loop-vectorize
0 ; RUN: opt -loop-vectorize -force-vector-interleave=1 -force-vector-width=2 -S < %s | FileCheck %s
1 ;
2 ; Check that the disable_nonforced loop property is honored by the
3 ; loop vectorizer.
4 ;
5 target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
6
7 ; CHECK-LABEL: @disable_nonforced(
8 ; CHECK-NOT: x i32>
9 define void @disable_nonforced(i32* nocapture %a, i32 %n) {
10 entry:
11 %cmp4 = icmp sgt i32 %n, 0
12 br i1 %cmp4, label %for.body, label %for.end
13
14 for.body:
15 %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
16 %arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
17 %0 = trunc i64 %indvars.iv to i32
18 store i32 %0, i32* %arrayidx, align 4
19 %indvars.iv.next = add i64 %indvars.iv, 1
20 %lftr.wideiv = trunc i64 %indvars.iv.next to i32
21 %exitcond = icmp eq i32 %lftr.wideiv, %n
22 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
23
24 for.end:
25 ret void
26 }
27
28 !0 = !{!0, !{!"llvm.loop.disable_nonforced"}}
0 ; RUN: opt -loop-vectorize -force-vector-interleave=1 -force-vector-width=2 -S < %s | FileCheck %s
1 ;
2 ; Check whether the llvm.loop.vectorize.enable loop property overrides
3 ; llvm.loop.disable_nonforced.
4 ;
5 target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
6
7 ; CHECK-LABEL: @disable_nonforced_enable(
8 ; CHECK: store <2 x i32>
9 define void @disable_nonforced_enable(i32* nocapture %a, i32 %n) {
10 entry:
11 %cmp4 = icmp sgt i32 %n, 0
12 br i1 %cmp4, label %for.body, label %for.end
13
14 for.body:
15 %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
16 %arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
17 %0 = trunc i64 %indvars.iv to i32
18 store i32 %0, i32* %arrayidx, align 4
19 %indvars.iv.next = add i64 %indvars.iv, 1
20 %lftr.wideiv = trunc i64 %indvars.iv.next to i32
21 %exitcond = icmp eq i32 %lftr.wideiv, %n
22 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
23
24 for.end:
25 ret void
26 }
27
28 !0 = !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.vectorize.enable", i32 1}}
0 ; RUN: opt -loop-vectorize -force-vector-width=4 -force-vector-interleave=1 -S < %s | FileCheck %s
1 ;
2 ; Check that the followup loop attributes are applied.
3 ;
4 target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
5
6 define void @followup(i32* nocapture %a, i32 %n) {
7 entry:
8 %cmp4 = icmp sgt i32 %n, 0
9 br i1 %cmp4, label %for.body, label %for.end
10
11 for.body:
12 %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
13 %arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
14 %0 = trunc i64 %indvars.iv to i32
15 store i32 %0, i32* %arrayidx, align 4
16 %indvars.iv.next = add i64 %indvars.iv, 1
17 %lftr.wideiv = trunc i64 %indvars.iv.next to i32
18 %exitcond = icmp eq i32 %lftr.wideiv, %n
19 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
20
21 for.end:
22 ret void
23 }
24
25 !0 = distinct !{!0, !3, !4, !5}
26 !3 = !{!"llvm.loop.vectorize.followup_vectorized", !{!"FollowupVectorized"}}
27 !4 = !{!"llvm.loop.vectorize.followup_epilogue", !{!"FollowupEpilogue"}}
28 !5 = !{!"llvm.loop.vectorize.followup_all", !{!"FollowupAll"}}
29
30
31 ; CHECK-LABEL: @followup(
32
33 ; CHECK-LABEL: vector.body:
34 ; CHECK: br i1 %13, label %middle.block, label %vector.body, !llvm.loop ![[LOOP_VECTOR:[0-9]+]]
35 ; CHECK-LABEL: for.body:
36 ; CHECK: br i1 %exitcond, label %for.end.loopexit, label %for.body, !llvm.loop ![[LOOP_EPILOGUE:[0-9]+]]
37
38 ; CHECK: ![[LOOP_VECTOR]] = distinct !{![[LOOP_VECTOR]], ![[FOLLOWUP_ALL:[0-9]+]], ![[FOLLOWUP_VECTORIZED:[0-9]+]]}
39 ; CHECK: ![[FOLLOWUP_ALL]] = !{!"FollowupAll"}
40 ; CHECK: ![[FOLLOWUP_VECTORIZED]] = !{!"FollowupVectorized"}
41 ; CHECK: ![[LOOP_EPILOGUE]] = distinct !{![[LOOP_EPILOGUE]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_EPILOGUE:[0-9]+]]}
42 ; CHECK: ![[FOLLOWUP_EPILOGUE]] = !{!"FollowupEpilogue"}
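The test above uses placeholder strings ("FollowupVectorized", "FollowupEpilogue", "FollowupAll") as followup attributes. As a minimal sketch of how a real attribute might be chained instead, the following hypothetical module (the function name @vectorize_then_unroll and the unroll count of 4 are illustrative assumptions, not part of this patch) requests vectorization and asks that an unroll hint be attached to the resulting vectorized loop. Whether the unroller then acts on that hint depends on it running after the vectorizer in the pipeline.

```llvm
; Hypothetical example, not taken from the patch: vectorize the loop, then
; attach an unroll-count hint to the vectorized loop via a followup attribute.
define void @vectorize_then_unroll(i32* nocapture %a, i32 %n) {
entry:
  %cmp4 = icmp sgt i32 %n, 0
  br i1 %cmp4, label %for.body, label %for.end

for.body:
  %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
  %arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
  %0 = trunc i64 %indvars.iv to i32
  store i32 %0, i32* %arrayidx, align 4
  %indvars.iv.next = add i64 %indvars.iv, 1
  %lftr.wideiv = trunc i64 %indvars.iv.next to i32
  %exitcond = icmp eq i32 %lftr.wideiv, %n
  br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

for.end:
  ret void
}

; !1 forces vectorization of the original loop.
; !2 names the attribute (!3) that the vectorized loop should carry afterwards.
!0 = distinct !{!0, !1, !2}
!1 = !{!"llvm.loop.vectorize.enable", i1 true}
!2 = !{!"llvm.loop.vectorize.followup_vectorized", !3}
!3 = !{!"llvm.loop.unroll.count", i32 4}
```

The shape mirrors the commit message's unroll_and_jam example: the followup node holds the attribute name followed by the attribute nodes to transfer, so the epilogue loop would receive nothing here because no followup_epilogue or followup_all is given.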
- 0 ; RUN: opt < %s -loop-vectorize -S 2>&1 | FileCheck %s
+ 0 ; RUN: opt < %s -loop-vectorize -transform-warning -S 2>&1 | FileCheck %s
1 1
2 2 ; Verify warning is generated when vectorization/interleaving is explicitly specified and fails to occur.
- 3 ; CHECK: warning: no_array_bounds.cpp:5:5: loop not vectorized: failed explicitly specified loop vectorization
- 4 ; CHECK: warning: no_array_bounds.cpp:10:5: loop not interleaved: failed explicitly specified loop interleaving
+ 3 ; CHECK: warning: no_array_bounds.cpp:5:5: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
+ 4 ; CHECK: warning: no_array_bounds.cpp:10:5: loop not interleaved: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
5 5
6 6 ; #pragma clang loop vectorize(enable)
7 7 ; for (int i = 0; i < number; i++) {
- 0 ; RUN: opt < %s -loop-vectorize -force-vector-width=4 -S 2>&1 | FileCheck %s
- 1 ; RUN: opt < %s -loop-vectorize -force-vector-width=1 -S 2>&1 | FileCheck %s -check-prefix=NOANALYSIS
- 2 ; RUN: opt < %s -loop-vectorize -force-vector-width=4 -pass-remarks-missed='loop-vectorize' -S 2>&1 | FileCheck %s -check-prefix=MOREINFO
+ 0 ; RUN: opt < %s -loop-vectorize -force-vector-width=4 -transform-warning -S 2>&1 | FileCheck %s
+ 1 ; RUN: opt < %s -loop-vectorize -force-vector-width=1 -transform-warning -S 2>&1 | FileCheck %s -check-prefix=NOANALYSIS
+ 2 ; RUN: opt < %s -loop-vectorize -force-vector-width=4 -transform-warning -pass-remarks-missed='loop-vectorize' -S 2>&1 | FileCheck %s -check-prefix=MOREINFO
3 3
4 4 ; CHECK: remark: source.cpp:4:5: loop not vectorized: loop contains a switch statement
- 5 ; CHECK: warning: source.cpp:4:5: loop not vectorized: failed explicitly specified loop vectorization
+ 5 ; CHECK: warning: source.cpp:4:5: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
6 6
7 7 ; NOANALYSIS-NOT: remark: {{.*}}
- 8 ; NOANALYSIS: warning: source.cpp:4:5: loop not interleaved: failed explicitly specified loop interleaving
+ 8 ; NOANALYSIS: warning: source.cpp:4:5: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
9 9
10 10 ; MOREINFO: remark: source.cpp:4:5: loop not vectorized: loop contains a switch statement
11 11 ; MOREINFO: remark: source.cpp:4:5: loop not vectorized (Force=true, Vector Width=4)
- 12 ; MOREINFO: warning: source.cpp:4:5: loop not vectorized: failed explicitly specified loop vectorization
+ 12 ; MOREINFO: warning: source.cpp:4:5: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
13 13
14 14 ; CHECK: _Z11test_switchPii