llvm.org GIT mirror llvm / 00e900a
[IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag

As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html

...this is a step in cleaning up our fast-math-flags implementation in IR to better match the capabilities of both clang's user-visible flags and the backend's flags for SDNode.

As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic reassociation - 'AllowReassoc'.

We're also adding a bit to allow approximations for library functions called 'ApproxFunc' (this was initially proposed as 'libm' or similar).

...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), but that's apparently already used for other purposes. Also, I don't think we can just add a field to FPMathOperator because Operator is not intended to be instantiated. We'll defer movement of FMF to another day.

We keep the 'fast' keyword. I thought about removing that, but seeing IR like this:
  %f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2
...made me think we want to keep the shortcut synonym.

Finally, this change is binary incompatible with existing IR as seen in the compatibility tests. This statement:
"Newer releases can ignore features from older releases, but they cannot miscompile them. For example, if nsw is ever replaced with something else, dropping it would be a valid way to upgrade the IR."
( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will fail to optimize some previously 'fast' code because it's no longer recognized as 'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'.

Note: an inter-dependent clang commit to use the new API name should closely follow commit.

Differential Revision: https://reviews.llvm.org/D39304

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317488 91177308-0d34-0410-b5e6-96231b3b80d8

Sanjay Patel 1 year, 9 months ago
32 changed file(s) with 443 addition(s) and 247 deletion(s).
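The compatibility argument in the commit message can be made concrete with a minimal sketch. This is not LLVM code: it models the seven flag bits as a plain bitmask, with bit positions copied from the `FastMathFlags` enum in this change. The names `OldFast`, `AllSevenBits`, and `isFastNew` are hypothetical, and the sketch assumes old 'fast' IR carried bits 0 through 5. Read with the new meaning, such a value has every flag except `afn`, so it is no longer recognized as 'fast'; no FP strictness is loosened, only some optimizations may be missed.

```cpp
#include <cstdint>

// Bit layout copied from the FastMathFlags enum in this commit.
enum : std::uint32_t {
  AllowReassoc    = 1u << 0, // this bit was UnsafeAlgebra before the change
  NoNaNs          = 1u << 1,
  NoInfs          = 1u << 2,
  NoSignedZeros   = 1u << 3,
  AllowReciprocal = 1u << 4,
  AllowContract   = 1u << 5,
  ApproxFunc      = 1u << 6  // new in this change
};

// Hypothetical constant: the bits an old 'fast' value is assumed to carry.
constexpr std::uint32_t OldFast = AllowReassoc | NoNaNs | NoInfs |
                                  NoSignedZeros | AllowReciprocal |
                                  AllowContract;

// All seven bits that fit in SubclassOptionalData.
constexpr std::uint32_t AllSevenBits = 0x7F;

// Hypothetical helper: under the new definition, 'fast' means all seven bits.
constexpr bool isFastNew(std::uint32_t Bits) {
  return (Bits & AllSevenBits) == AllSevenBits;
}
```

Under these assumptions `isFastNew(OldFast)` is false, while `isFastNew(OldFast | ApproxFunc)` is true.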
22712271 Fast-Math Flags
22722272 ---------------
22732273
2274 LLVM IR floating-point binary ops (:ref:`fadd <i_fadd>`,
2274 LLVM IR floating-point operations (:ref:`fadd <i_fadd>`,
22752275 :ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
22762276 :ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) and :ref:`call <i_call>`
2277 instructions have the following flags that can be set to enable
2278 otherwise unsafe floating point transformations.
2277 may use the following flags to enable otherwise unsafe
2278 floating-point transformations.
22792279
22802280 ``nnan``
22812281 No NaNs - Allow optimizations to assume the arguments and result are not
22992299 Allow floating-point contraction (e.g. fusing a multiply followed by an
23002300 addition into a fused multiply-and-add).
23012301
2302 ``afn``
2303 Approximate functions - Allow substitution of approximate calculations for
2304 functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
2305 for places where this can apply to LLVM's intrinsic math functions.
2306
2307 ``reassoc``
2308 Allow reassociation transformations for floating-point instructions.
2309 This may dramatically change results in floating point.
2310
23022311 ``fast``
2303 Fast - Allow algebraically equivalent transformations that may
2304 dramatically change results in floating point (e.g. reassociate). This
2305 flag implies all the others.
2312 This flag implies all of the others.
23062313
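As an illustration of why `reassoc` "may dramatically change results", here is a small sketch in plain C++ rather than LLVM IR. The function names are hypothetical, and the example assumes it is compiled without any fast-math options so that the source-level grouping is preserved.

```cpp
// Evaluates (a + b) + c exactly as written.
double sumLeftFirst(double a, double b, double c) { return (a + b) + c; }

// Evaluates a + (b + c), the grouping a reassociating optimizer might choose.
double sumRightFirst(double a, double b, double c) { return a + (b + c); }
```

With `a = 1e20`, `b = -1e20`, `c = 1.0`, the first grouping yields `1.0` while the second yields `0.0`, because `1.0` is lost when added to `-1e20` in double precision.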
23072314 .. _uselistorder:
23082315
1048210489 """""""
1048310490
1048410491 This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
10485 floating point or vector of floating point type. Not all targets support
10492 floating-point or vector of floating-point type. Not all targets support
1048610493 all types however.
1048710494
1048810495 ::
1049610503 Overview:
1049710504 """""""""
1049810505
10499 The '``llvm.sqrt``' intrinsics return the square root of the specified value,
10500 returning the same value as the libm '``sqrt``' functions would, but without
10501 trapping or setting ``errno``.
10502
10503 Arguments:
10504 """"""""""
10505
10506 The argument and return value are floating point numbers of the same type.
10507
10508 Semantics:
10509 """"""""""
10510
10511 This function returns the square root of the operand if it is a nonnegative
10512 floating point number.
10506 The '``llvm.sqrt``' intrinsics return the square root of the specified value.
10507
10508 Arguments:
10509 """"""""""
10510
10511 The argument and return value are floating-point numbers of the same type.
10512
10513 Semantics:
10514 """"""""""
10515
10516 Return the same value as a corresponding libm '``sqrt``' function but without
10517 trapping or setting ``errno``. For types specified by IEEE-754, the result
10518 matches a conforming libm implementation.
10519
10520 When specified with the fast-math-flag 'afn', the result may be approximated
10521 using a less accurate calculation.
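The text does not say how an `afn` approximation is computed, so the following is only a hedged illustration of "a less accurate calculation", not what LLVM or any target actually emits. The hypothetical `approxSqrt` runs a fixed number of Newton-Raphson steps, trading accuracy for a bounded amount of work.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for an approximate square root: a fixed number of
// Newton-Raphson steps starting from the input itself. Fewer steps means
// less work and a larger error compared with std::sqrt.
double approxSqrt(double x, int steps = 3) {
  if (x == 0.0)
    return 0.0;
  if (!(x > 0.0)) // negative input or NaN
    return std::numeric_limits<double>::quiet_NaN();
  double y = x; // crude initial guess
  for (int i = 0; i < steps; ++i)
    y = 0.5 * (y + x / y); // Newton step for y*y == x
  return y;
}
```

For `x = 2.0` and three steps the result differs from `std::sqrt(2.0)` by roughly 2e-6, which a strict computation would not tolerate but an `afn`-style relaxation would permit.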
1051310522
1051410523 '``llvm.powi.*``' Intrinsic
1051510524 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
1055610565 """""""
1055710566
1055810567 This is an overloaded intrinsic. You can use ``llvm.sin`` on any
10559 floating point or vector of floating point type. Not all targets support
10568 floating-point or vector of floating-point type. Not all targets support
1056010569 all types however.
1056110570
1056210571 ::
1057510584 Arguments:
1057610585 """"""""""
1057710586
10578 The argument and return value are floating point numbers of the same type.
10579
10580 Semantics:
10581 """"""""""
10582
10583 This function returns the sine of the specified operand, returning the
10584 same values as the libm ``sin`` functions would, and handles error
10585 conditions in the same way.
10587 The argument and return value are floating-point numbers of the same type.
10588
10589 Semantics:
10590 """"""""""
10591
10592 Return the same value as a corresponding libm '``sin``' function but without
10593 trapping or setting ``errno``.
10594
10595 When specified with the fast-math-flag 'afn', the result may be approximated
10596 using a less accurate calculation.
1058610597
1058710598 '``llvm.cos.*``' Intrinsic
1058810599 ^^^^^^^^^^^^^^^^^^^^^^^^^^
1059110602 """""""
1059210603
1059310604 This is an overloaded intrinsic. You can use ``llvm.cos`` on any
10594 floating point or vector of floating point type. Not all targets support
10605 floating-point or vector of floating-point type. Not all targets support
1059510606 all types however.
1059610607
1059710608 ::
1061010621 Arguments:
1061110622 """"""""""
1061210623
10613 The argument and return value are floating point numbers of the same type.
10614
10615 Semantics:
10616 """"""""""
10617
10618 This function returns the cosine of the specified operand, returning the
10619 same values as the libm ``cos`` functions would, and handles error
10620 conditions in the same way.
10624 The argument and return value are floating-point numbers of the same type.
10625
10626 Semantics:
10627 """"""""""
10628
10629 Return the same value as a corresponding libm '``cos``' function but without
10630 trapping or setting ``errno``.
10631
10632 When specified with the fast-math-flag 'afn', the result may be approximated
10633 using a less accurate calculation.
1062110634
1062210635 '``llvm.pow.*``' Intrinsic
1062310636 ^^^^^^^^^^^^^^^^^^^^^^^^^^
1062610639 """""""
1062710640
1062810641 This is an overloaded intrinsic. You can use ``llvm.pow`` on any
10629 floating point or vector of floating point type. Not all targets support
10642 floating-point or vector of floating-point type. Not all targets support
1063010643 all types however.
1063110644
1063210645 ::
1064610659 Arguments:
1064710660 """"""""""
1064810661
10649 The second argument is a floating point power, and the first is a value
10650 to raise to that power.
10651
10652 Semantics:
10653 """"""""""
10654
10655 This function returns the first value raised to the second power,
10656 returning the same values as the libm ``pow`` functions would, and
10657 handles error conditions in the same way.
10662 The arguments and return value are floating-point numbers of the same type.
10663
10664 Semantics:
10665 """"""""""
10666
10667 Return the same value as a corresponding libm '``pow``' function but without
10668 trapping or setting ``errno``.
10669
10670 When specified with the fast-math-flag 'afn', the result may be approximated
10671 using a less accurate calculation.
1065810672
1065910673 '``llvm.exp.*``' Intrinsic
1066010674 ^^^^^^^^^^^^^^^^^^^^^^^^^^
1066310677 """""""
1066410678
1066510679 This is an overloaded intrinsic. You can use ``llvm.exp`` on any
10666 floating point or vector of floating point type. Not all targets support
10680 floating-point or vector of floating-point type. Not all targets support
1066710681 all types however.
1066810682
1066910683 ::
1068310697 Arguments:
1068410698 """"""""""
1068510699
10686 The argument and return value are floating point numbers of the same type.
10687
10688 Semantics:
10689 """"""""""
10690
10691 This function returns the same values as the libm ``exp`` functions
10692 would, and handles error conditions in the same way.
10700 The argument and return value are floating-point numbers of the same type.
10701
10702 Semantics:
10703 """"""""""
10704
10705 Return the same value as a corresponding libm '``exp``' function but without
10706 trapping or setting ``errno``.
10707
10708 When specified with the fast-math-flag 'afn', the result may be approximated
10709 using a less accurate calculation.
1069310710
1069410711 '``llvm.exp2.*``' Intrinsic
1069510712 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
1069810715 """""""
1069910716
1070010717 This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
10701 floating point or vector of floating point type. Not all targets support
10718 floating-point or vector of floating-point type. Not all targets support
1070210719 all types however.
1070310720
1070410721 ::
1071810735 Arguments:
1071910736 """"""""""
1072010737
10721 The argument and return value are floating point numbers of the same type.
10722
10723 Semantics:
10724 """"""""""
10725
10726 This function returns the same values as the libm ``exp2`` functions
10727 would, and handles error conditions in the same way.
10738 The argument and return value are floating-point numbers of the same type.
10739
10740 Semantics:
10741 """"""""""
10742
10743 Return the same value as a corresponding libm '``exp2``' function but without
10744 trapping or setting ``errno``.
10745
10746 When specified with the fast-math-flag 'afn', the result may be approximated
10747 using a less accurate calculation.
1072810748
1072910749 '``llvm.log.*``' Intrinsic
1073010750 ^^^^^^^^^^^^^^^^^^^^^^^^^^
1073310753 """""""
1073410754
1073510755 This is an overloaded intrinsic. You can use ``llvm.log`` on any
10736 floating point or vector of floating point type. Not all targets support
10756 floating-point or vector of floating-point type. Not all targets support
1073710757 all types however.
1073810758
1073910759 ::
1075310773 Arguments:
1075410774 """"""""""
1075510775
10756 The argument and return value are floating point numbers of the same type.
10757
10758 Semantics:
10759 """"""""""
10760
10761 This function returns the same values as the libm ``log`` functions
10762 would, and handles error conditions in the same way.
10776 The argument and return value are floating-point numbers of the same type.
10777
10778 Semantics:
10779 """"""""""
10780
10781 Return the same value as a corresponding libm '``log``' function but without
10782 trapping or setting ``errno``.
10783
10784 When specified with the fast-math-flag 'afn', the result may be approximated
10785 using a less accurate calculation.
1076310786
1076410787 '``llvm.log10.*``' Intrinsic
1076510788 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1076810791 """""""
1076910792
1077010793 This is an overloaded intrinsic. You can use ``llvm.log10`` on any
10771 floating point or vector of floating point type. Not all targets support
10794 floating-point or vector of floating-point type. Not all targets support
1077210795 all types however.
1077310796
1077410797 ::
1078810811 Arguments:
1078910812 """"""""""
1079010813
10791 The argument and return value are floating point numbers of the same type.
10792
10793 Semantics:
10794 """"""""""
10795
10796 This function returns the same values as the libm ``log10`` functions
10797 would, and handles error conditions in the same way.
10814 The argument and return value are floating-point numbers of the same type.
10815
10816 Semantics:
10817 """"""""""
10818
10819 Return the same value as a corresponding libm '``log10``' function but without
10820 trapping or setting ``errno``.
10821
10822 When specified with the fast-math-flag 'afn', the result may be approximated
10823 using a less accurate calculation.
1079810824
1079910825 '``llvm.log2.*``' Intrinsic
1080010826 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
1080310829 """""""
1080410830
1080510831 This is an overloaded intrinsic. You can use ``llvm.log2`` on any
10806 floating point or vector of floating point type. Not all targets support
10832 floating-point or vector of floating-point type. Not all targets support
1080710833 all types however.
1080810834
1080910835 ::
1082310849 Arguments:
1082410850 """"""""""
1082510851
10826 The argument and return value are floating point numbers of the same type.
10827
10828 Semantics:
10829 """"""""""
10830
10831 This function returns the same values as the libm ``log2`` functions
10832 would, and handles error conditions in the same way.
10852 The argument and return value are floating-point numbers of the same type.
10853
10854 Semantics:
10855 """"""""""
10856
10857 Return the same value as a corresponding libm '``log2``' function but without
10858 trapping or setting ``errno``.
10859
10860 When specified with the fast-math-flag 'afn', the result may be approximated
10861 using a less accurate calculation.
1083310862
1083410863 '``llvm.fma.*``' Intrinsic
1083510864 ^^^^^^^^^^^^^^^^^^^^^^^^^^
1083810867 """""""
1083910868
1084010869 This is an overloaded intrinsic. You can use ``llvm.fma`` on any
10841 floating point or vector of floating point type. Not all targets support
10870 floating-point or vector of floating-point type. Not all targets support
1084210871 all types however.
1084310872
1084410873 ::
1085210881 Overview:
1085310882 """""""""
1085410883
10855 The '``llvm.fma.*``' intrinsics perform the fused multiply-add
10856 operation.
10857
10858 Arguments:
10859 """"""""""
10860
10861 The argument and return value are floating point numbers of the same
10862 type.
10863
10864 Semantics:
10865 """"""""""
10866
10867 This function returns the same values as the libm ``fma`` functions
10868 would, and does not set errno.
10884 The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
10885
10886 Arguments:
10887 """"""""""
10888
10889 The arguments and return value are floating-point numbers of the same type.
10890
10891 Semantics:
10892 """"""""""
10893
10894 Return the same value as a corresponding libm '``fma``' function but without
10895 trapping or setting ``errno``.
10896
10897 When specified with the fast-math-flag 'afn', the result may be approximated
10898 using a less accurate calculation.
1086910899
1087010900 '``llvm.fabs.*``' Intrinsic
1087110901 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
307307 /// Determine whether the exact flag is set.
308308 bool isExact() const;
309309
310 /// Set or clear the unsafe-algebra flag on this instruction, which must be an
310 /// Set or clear all fast-math-flags on this instruction, which must be an
311311 /// operator which supports this flag. See LangRef.html for the meaning of
312312 /// this flag.
313 void setHasUnsafeAlgebra(bool B);
313 void setFast(bool B);
314
315 /// Set or clear the reassociation flag on this instruction, which must be
316 /// an operator which supports this flag. See LangRef.html for the meaning of
317 /// this flag.
318 void setHasAllowReassoc(bool B);
314319
315320 /// Set or clear the no-nans flag on this instruction, which must be an
316321 /// operator which supports this flag. See LangRef.html for the meaning of
332337 /// this flag.
333338 void setHasAllowReciprocal(bool B);
334339
340 /// Set or clear the approximate-math-functions flag on this instruction,
341 /// which must be an operator which supports this flag. See LangRef.html for
342 /// the meaning of this flag.
343 void setHasApproxFunc(bool B);
344
335345 /// Convenience function for setting multiple fast-math flags on this
336346 /// instruction, which must be an operator which supports these flags. See
337347 /// LangRef.html for the meaning of these flags.
342352 /// LangRef.html for the meaning of these flags.
343353 void copyFastMathFlags(FastMathFlags FMF);
344354
345 /// Determine whether the unsafe-algebra flag is set.
346 bool hasUnsafeAlgebra() const;
355 /// Determine whether all fast-math-flags are set.
356 bool isFast() const;
357
358 /// Determine whether the allow-reassociation flag is set.
359 bool hasAllowReassoc() const;
347360
348361 /// Determine whether the no-NaNs flag is set.
349362 bool hasNoNaNs() const;
359372
360373 /// Determine whether the allow-contract flag is set.
361374 bool hasAllowContract() const;
375
376 /// Determine whether the approximate-math-functions flag is set.
377 bool hasApproxFunc() const;
362378
363379 /// Convenience function for getting all the fast-math flags, which must be an
364380 /// operator which supports these flags. See LangRef.html for the meaning of
162162
163163 unsigned Flags = 0;
164164
165 FastMathFlags(unsigned F) : Flags(F) { }
166
167 public:
168 /// This is how the bits are used in Value::SubclassOptionalData so they
169 /// should fit there too.
165 FastMathFlags(unsigned F) {
166 // If all 7 bits are set, turn this into -1. If the number of bits grows,
167 // this must be updated. This is intended to provide some forward binary
168 // compatibility insurance for the meaning of 'fast' in case bits are added.
169 if (F == 0x7F) Flags = ~0U;
170 else Flags = F;
171 }
172
173 public:
174 // This is how the bits are used in Value::SubclassOptionalData so they
175 // should fit there too.
176 // WARNING: We're out of space. SubclassOptionalData only has 7 bits. New
177 // functionality will require a change in how this information is stored.
170178 enum {
171 UnsafeAlgebra = (1 << 0),
179 AllowReassoc = (1 << 0),
172180 NoNaNs = (1 << 1),
173181 NoInfs = (1 << 2),
174182 NoSignedZeros = (1 << 3),
175183 AllowReciprocal = (1 << 4),
176 AllowContract = (1 << 5)
184 AllowContract = (1 << 5),
185 ApproxFunc = (1 << 6)
177186 };
178187
179188 FastMathFlags() = default;
180189
181 /// Whether any flag is set
182190 bool any() const { return Flags != 0; }
183
184 /// Set all the flags to false
191 bool none() const { return Flags == 0; }
192 bool all() const { return Flags == ~0U; }
193
185194 void clear() { Flags = 0; }
195 void set() { Flags = ~0U; }
186196
187197 /// Flag queries
198 bool allowReassoc() const { return 0 != (Flags & AllowReassoc); }
188199 bool noNaNs() const { return 0 != (Flags & NoNaNs); }
189200 bool noInfs() const { return 0 != (Flags & NoInfs); }
190201 bool noSignedZeros() const { return 0 != (Flags & NoSignedZeros); }
191202 bool allowReciprocal() const { return 0 != (Flags & AllowReciprocal); }
192 bool allowContract() const { return 0 != (Flags & AllowContract); }
193 bool unsafeAlgebra() const { return 0 != (Flags & UnsafeAlgebra); }
203 bool allowContract() const { return 0 != (Flags & AllowContract); }
204 bool approxFunc() const { return 0 != (Flags & ApproxFunc); }
205 /// 'Fast' means all bits are set.
206 bool isFast() const { return all(); }
194207
195208 /// Flag setters
209 void setAllowReassoc() { Flags |= AllowReassoc; }
196210 void setNoNaNs() { Flags |= NoNaNs; }
197211 void setNoInfs() { Flags |= NoInfs; }
198212 void setNoSignedZeros() { Flags |= NoSignedZeros; }
199213 void setAllowReciprocal() { Flags |= AllowReciprocal; }
214 // TODO: Change the other set* functions to take a parameter?
200215 void setAllowContract(bool B) {
201216 Flags = (Flags & ~AllowContract) | B * AllowContract;
202217 }
203 void setUnsafeAlgebra() {
204 Flags |= UnsafeAlgebra;
205 setNoNaNs();
206 setNoInfs();
207 setNoSignedZeros();
208 setAllowReciprocal();
209 setAllowContract(true);
210 }
218 void setApproxFunc() { Flags |= ApproxFunc; }
219 void setFast() { set(); }
211220
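The following standalone sketch mirrors two details of the class above: the constructor that turns the seven-bit pattern `0x7F` into `~0U`, and the `B * Flag` idiom used by `setAllowContract(bool)`. `FlagsModel` is a hypothetical name, only a subset of the members is modeled, and the constructor is public here for demonstration although it is private in the diff.

```cpp
// Hypothetical, reduced model of FastMathFlags from this change.
class FlagsModel {
  unsigned Flags = 0;

public:
  enum : unsigned {
    AllowReassoc    = 1u << 0,
    NoNaNs          = 1u << 1,
    NoInfs          = 1u << 2,
    NoSignedZeros   = 1u << 3,
    AllowReciprocal = 1u << 4,
    AllowContract   = 1u << 5,
    ApproxFunc      = 1u << 6
  };

  FlagsModel() = default;

  // If all 7 stored bits are set, canonicalize to "all bits" so that the
  // meaning of 'fast' survives if more flag bits are added later.
  explicit FlagsModel(unsigned F) : Flags(F == 0x7F ? ~0U : F) {}

  bool all() const { return Flags == ~0U; }
  bool isFast() const { return all(); }
  bool allowContract() const { return (Flags & AllowContract) != 0; }

  void setFast() { Flags = ~0U; }

  // Clear the bit, then OR it back in only when B is true (B * Flag is
  // either 0 or Flag).
  void setAllowContract(bool B) {
    Flags = (Flags & ~AllowContract) | B * AllowContract;
  }

  void operator&=(const FlagsModel &Other) { Flags &= Other.Flags; }
};
```

In this model `FlagsModel(0x7F).isFast()` is true, and clearing a single flag with `setAllowContract(false)` makes `isFast()` false again, which matches the "all bits are set" definition.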
212221 void operator&=(const FastMathFlags &OtherFlags) {
213222 Flags &= OtherFlags.Flags;
220229 private:
221230 friend class Instruction;
222231
223 void setHasUnsafeAlgebra(bool B) {
224 SubclassOptionalData =
225 (SubclassOptionalData & ~FastMathFlags::UnsafeAlgebra) |
226 (B * FastMathFlags::UnsafeAlgebra);
227
228 // Unsafe algebra implies all the others
229 if (B) {
230 setHasNoNaNs(true);
231 setHasNoInfs(true);
232 setHasNoSignedZeros(true);
233 setHasAllowReciprocal(true);
234 }
232 /// 'Fast' means all bits are set.
233 void setFast(bool B) {
234 setHasAllowReassoc(B);
235 setHasNoNaNs(B);
236 setHasNoInfs(B);
237 setHasNoSignedZeros(B);
238 setHasAllowReciprocal(B);
239 setHasAllowContract(B);
240 setHasApproxFunc(B);
241 }
242
243 void setHasAllowReassoc(bool B) {
244 SubclassOptionalData =
245 (SubclassOptionalData & ~FastMathFlags::AllowReassoc) |
246 (B * FastMathFlags::AllowReassoc);
235247 }
236248
237249 void setHasNoNaNs(bool B) {
262274 SubclassOptionalData =
263275 (SubclassOptionalData & ~FastMathFlags::AllowContract) |
264276 (B * FastMathFlags::AllowContract);
277 }
278
279 void setHasApproxFunc(bool B) {
280 SubclassOptionalData =
281 (SubclassOptionalData & ~FastMathFlags::ApproxFunc) |
282 (B * FastMathFlags::ApproxFunc);
265283 }
266284
267285 /// Convenience function for setting multiple fast-math flags.
277295 }
278296
279297 public:
280 /// Test whether this operation is permitted to be
281 /// algebraically transformed, aka the 'A' fast-math property.
282 bool hasUnsafeAlgebra() const {
283 return (SubclassOptionalData & FastMathFlags::UnsafeAlgebra) != 0;
284 }
285
286 /// Test whether this operation's arguments and results are to be
287 /// treated as non-NaN, aka the 'N' fast-math property.
298 /// Test if this operation allows all non-strict floating-point transforms.
299 bool isFast() const {
300 return ((SubclassOptionalData & FastMathFlags::AllowReassoc) != 0 &&
301 (SubclassOptionalData & FastMathFlags::NoNaNs) != 0 &&
302 (SubclassOptionalData & FastMathFlags::NoInfs) != 0 &&
303 (SubclassOptionalData & FastMathFlags::NoSignedZeros) != 0 &&
304 (SubclassOptionalData & FastMathFlags::AllowReciprocal) != 0 &&
305 (SubclassOptionalData & FastMathFlags::AllowContract) != 0 &&
306 (SubclassOptionalData & FastMathFlags::ApproxFunc) != 0);
307 }
308
309 /// Test if this operation may be simplified with reassociative transforms.
310 bool hasAllowReassoc() const {
311 return (SubclassOptionalData & FastMathFlags::AllowReassoc) != 0;
312 }
313
314 /// Test if this operation's arguments and results are assumed not-NaN.
288315 bool hasNoNaNs() const {
289316 return (SubclassOptionalData & FastMathFlags::NoNaNs) != 0;
290317 }
291318
292 /// Test whether this operation's arguments and results are to be
293 /// treated as NoN-Inf, aka the 'I' fast-math property.
319 /// Test if this operation's arguments and results are assumed not-infinite.
294320 bool hasNoInfs() const {
295321 return (SubclassOptionalData & FastMathFlags::NoInfs) != 0;
296322 }
297323
298 /// Test whether this operation can treat the sign of zero
299 /// as insignificant, aka the 'S' fast-math property.
324 /// Test if this operation can ignore the sign of zero.
300325 bool hasNoSignedZeros() const {
301326 return (SubclassOptionalData & FastMathFlags::NoSignedZeros) != 0;
302327 }
303328
304 /// Test whether this operation is permitted to use
305 /// reciprocal instead of division, aka the 'R' fast-math property.
329 /// Test if this operation can use reciprocal multiply instead of division.
306330 bool hasAllowReciprocal() const {
307331 return (SubclassOptionalData & FastMathFlags::AllowReciprocal) != 0;
308332 }
309333
310 /// Test whether this operation is permitted to
311 /// be floating-point contracted.
334 /// Test if this operation can be floating-point contracted (FMA).
312335 bool hasAllowContract() const {
313336 return (SubclassOptionalData & FastMathFlags::AllowContract) != 0;
337 }
338
339 /// Test if this operation allows approximations of math library functions or
340 /// intrinsics.
341 bool hasApproxFunc() const {
342 return (SubclassOptionalData & FastMathFlags::ApproxFunc) != 0;
314343 }
315344
316345 /// Convenience function for getting all the fast-math flags
330330 /// not have the "fast-math" property. Such operation requires a relaxed FP
331331 /// mode.
332332 bool hasUnsafeAlgebra() {
333 return InductionBinOp &&
334 !cast<FPMathOperator>(InductionBinOp)->hasUnsafeAlgebra();
333 return InductionBinOp && !cast<FPMathOperator>(InductionBinOp)->isFast();
335334 }
336335
337336 /// Returns induction operator that does not have "fast-math" property
338337 /// and requires FP unsafe mode.
339338 Instruction *getUnsafeAlgebraInst() {
340 if (!InductionBinOp ||
341 cast<FPMathOperator>(InductionBinOp)->hasUnsafeAlgebra())
339 if (!InductionBinOp || cast<FPMathOperator>(InductionBinOp)->isFast())
342340 return nullptr;
343341 return InductionBinOp;
344342 }
551551 KEYWORD(nsz);
552552 KEYWORD(arcp);
553553 KEYWORD(contract);
554 KEYWORD(reassoc);
555 KEYWORD(afn);
554556 KEYWORD(fast);
555557 KEYWORD(nuw);
556558 KEYWORD(nsw);
192192 FastMathFlags FMF;
193193 while (true)
194194 switch (Lex.getKind()) {
195 case lltok::kw_fast: FMF.setUnsafeAlgebra(); Lex.Lex(); continue;
195 case lltok::kw_fast: FMF.setFast(); Lex.Lex(); continue;
196196 case lltok::kw_nnan: FMF.setNoNaNs(); Lex.Lex(); continue;
197197 case lltok::kw_ninf: FMF.setNoInfs(); Lex.Lex(); continue;
198198 case lltok::kw_nsz: FMF.setNoSignedZeros(); Lex.Lex(); continue;
201201 FMF.setAllowContract(true);
202202 Lex.Lex();
203203 continue;
204 case lltok::kw_reassoc: FMF.setAllowReassoc(); Lex.Lex(); continue;
205 case lltok::kw_afn: FMF.setApproxFunc(); Lex.Lex(); continue;
204206 default: return FMF;
205207 }
206208 return FMF;
101101 kw_nsz,
102102 kw_arcp,
103103 kw_contract,
104 kw_reassoc,
105 kw_afn,
104106 kw_fast,
105107 kw_nuw,
106108 kw_nsw,
10451045
10461046 static FastMathFlags getDecodedFastMathFlags(unsigned Val) {
10471047 FastMathFlags FMF;
1048 if (0 != (Val & FastMathFlags::UnsafeAlgebra))
1049 FMF.setUnsafeAlgebra();
1048 if (0 != (Val & FastMathFlags::AllowReassoc))
1049 FMF.setAllowReassoc();
10501050 if (0 != (Val & FastMathFlags::NoNaNs))
10511051 FMF.setNoNaNs();
10521052 if (0 != (Val & FastMathFlags::NoInfs))
10571057 FMF.setAllowReciprocal();
10581058 if (0 != (Val & FastMathFlags::AllowContract))
10591059 FMF.setAllowContract(true);
1060 if (0 != (Val & FastMathFlags::ApproxFunc))
1061 FMF.setApproxFunc();
10601062 return FMF;
10611063 }
10621064
13201320 if (PEO->isExact())
13211321 Flags |= 1 << bitc::PEO_EXACT;
13221322 } else if (const auto *FPMO = dyn_cast<FPMathOperator>(V)) {
1323 if (FPMO->hasUnsafeAlgebra())
1324 Flags |= FastMathFlags::UnsafeAlgebra;
1323 if (FPMO->hasAllowReassoc())
1324 Flags |= FastMathFlags::AllowReassoc;
13251325 if (FPMO->hasNoNaNs())
13261326 Flags |= FastMathFlags::NoNaNs;
13271327 if (FPMO->hasNoInfs())
13321332 Flags |= FastMathFlags::AllowReciprocal;
13331333 if (FPMO->hasAllowContract())
13341334 Flags |= FastMathFlags::AllowContract;
1335 if (FPMO->hasApproxFunc())
1336 Flags |= FastMathFlags::ApproxFunc;
13351337 }
13361338
13371339 return Flags;
9494 // and it can't be handled by generating this shuffle sequence.
9595 // TODO: Implement scalarization of ordered reductions here for targets
9696 // without native support.
97 if (!II->getFastMathFlags().unsafeAlgebra())
97 if (!II->getFastMathFlags().isFast())
9898 continue;
9999 Vec = II->getArgOperand(1);
100100 break;
25842584 case Instruction::FAdd:
25852585 case Instruction::FMul:
25862586 if (const FPMathOperator *FPOp = dyn_cast<FPMathOperator>(Inst))
2587 if (FPOp->getFastMathFlags().unsafeAlgebra())
2587 if (FPOp->getFastMathFlags().isFast())
25882588 break;
25892589 LLVM_FALLTHROUGH;
25902590 default:
26302630
26312631 if (Inst->getOpcode() == OpCode || isa(U)) {
26322632 if (const FPMathOperator *FPOp = dyn_cast(Inst))
2633 if (!isa(FPOp) && !FPOp->getFastMathFlags().unsafeAlgebra())
2633 if (!isa(FPOp) && !FPOp->getFastMathFlags().isFast())
26342634 return false;
26352635 UsersToVisit.push_back(U);
26362636 } else if (const ShuffleVectorInst *ShufInst =
27242724 Flags.setNoInfs(FMF.noInfs());
27252725 Flags.setNoNaNs(FMF.noNaNs());
27262726 Flags.setNoSignedZeros(FMF.noSignedZeros());
2727 Flags.setUnsafeAlgebra(FMF.unsafeAlgebra());
2727 Flags.setUnsafeAlgebra(FMF.isFast());
27282728
27292729 SDValue BinNodeValue = DAG.getNode(OpCode, getCurSDLoc(), Op1.getValueType(),
27302730 Op1, Op2, Flags);
79587958
79597959 switch (Intrinsic) {
79607960 case Intrinsic::experimental_vector_reduce_fadd:
7961 if (FMF.unsafeAlgebra())
7961 if (FMF.isFast())
79627962 Res = DAG.getNode(ISD::VECREDUCE_FADD, dl, VT, Op2);
79637963 else
79647964 Res = DAG.getNode(ISD::VECREDUCE_STRICT_FADD, dl, VT, Op1, Op2);
79657965 break;
79667966 case Intrinsic::experimental_vector_reduce_fmul:
7967 if (FMF.unsafeAlgebra())
7967 if (FMF.isFast())
79687968 Res = DAG.getNode(ISD::VECREDUCE_FMUL, dl, VT, Op2);
79697969 else
79707970 Res = DAG.getNode(ISD::VECREDUCE_STRICT_FMUL, dl, VT, Op1, Op2);
11071107
11081108 static void WriteOptimizationInfo(raw_ostream &Out, const User *U) {
11091109 if (const FPMathOperator *FPO = dyn_cast<FPMathOperator>(U)) {
1110 // Unsafe algebra implies all the others, no need to write them all out
1111 if (FPO->hasUnsafeAlgebra())
1110 // 'Fast' is an abbreviation for all fast-math-flags.
1111 if (FPO->isFast())
11121112 Out << " fast";
11131113 else {
1114 if (FPO->hasAllowReassoc())
1115 Out << " reassoc";
11141116 if (FPO->hasNoNaNs())
11151117 Out << " nnan";
11161118 if (FPO->hasNoInfs())
11211123 Out << " arcp";
11221124 if (FPO->hasAllowContract())
11231125 Out << " contract";
1126 if (FPO->hasApproxFunc())
1127 Out << " afn";
11241128 }
11251129 }
11261130
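The printing rule in this hunk ('fast' as an abbreviation when every flag is set, individual keywords otherwise) can be sketched independently of LLVM. The function name `formatFlags` and the raw `unsigned` input are assumptions for illustration; the bit positions and keyword order follow the diff.

```cpp
#include <string>

// Hypothetical helper: render a 7-bit fast-math mask the way the writer
// above does, using " fast" only when all seven bits are set.
std::string formatFlags(unsigned Bits) {
  if ((Bits & 0x7F) == 0x7F)
    return " fast";

  std::string Out;
  if (Bits & (1u << 0)) Out += " reassoc";
  if (Bits & (1u << 1)) Out += " nnan";
  if (Bits & (1u << 2)) Out += " ninf";
  if (Bits & (1u << 3)) Out += " nsz";
  if (Bits & (1u << 4)) Out += " arcp";
  if (Bits & (1u << 5)) Out += " contract";
  if (Bits & (1u << 6)) Out += " afn";
  return Out;
}
```

For example, `formatFlags(0x41)` gives `" reassoc afn"`, and a mask with bits 0 through 5 only is spelled out flag by flag rather than printed as `fast`.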
145145 return cast(this)->isExact();
146146 }
147147
148 void Instruction::setHasUnsafeAlgebra(bool B) {
148 void Instruction::setFast(bool B) {
149149 assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
150 cast<FPMathOperator>(this)->setHasUnsafeAlgebra(B);
150 cast<FPMathOperator>(this)->setFast(B);
151 }
152
153 void Instruction::setHasAllowReassoc(bool B) {
154 assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
155 cast<FPMathOperator>(this)->setHasAllowReassoc(B);
151156 }
152157
153158 void Instruction::setHasNoNaNs(bool B) {
170175 cast(this)->setHasAllowReciprocal(B);
171176 }
172177
178 void Instruction::setHasApproxFunc(bool B) {
179 assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
180 cast<FPMathOperator>(this)->setHasApproxFunc(B);
181 }
182
173183 void Instruction::setFastMathFlags(FastMathFlags FMF) {
174184 assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
175185 cast<FPMathOperator>(this)->setFastMathFlags(FMF);
180190 cast<FPMathOperator>(this)->copyFastMathFlags(FMF);
181191 }
182192
183 bool Instruction::hasUnsafeAlgebra() const {
184 assert(isa(this) && "getting fast-math flag on invalid op");
185 return cast(this)->hasUnsafeAlgebra();
193 bool Instruction::isFast() const {
194 assert(isa(this) && "getting fast-math flag on invalid op");
195 return cast(this)->isFast();
196 }
197
198 bool Instruction::hasAllowReassoc() const {
199 assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");
200 return cast<FPMathOperator>(this)->hasAllowReassoc();
186201 }
187202
188203 bool Instruction::hasNoNaNs() const {
208223 bool Instruction::hasAllowContract() const {
209224 assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");
210225 return cast<FPMathOperator>(this)->hasAllowContract();
226 }
227
228 bool Instruction::hasApproxFunc() const {
229 assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");
230 return cast<FPMathOperator>(this)->hasApproxFunc();
211231 }
212232
213233 FastMathFlags Instruction::getFastMathFlags() const {
578598 switch (Opcode) {
579599 case FMul:
580600 case FAdd:
581 return cast<FPMathOperator>(this)->hasUnsafeAlgebra();
601 return cast<FPMathOperator>(this)->isFast();
582602 default:
583603 return false;
584604 }
399399 return false;
400400
401401 FastMathFlags FMF = FPOp->getFastMathFlags();
402 bool UnsafeDiv = HasUnsafeFPMath || FMF.unsafeAlgebra() ||
402 bool UnsafeDiv = HasUnsafeFPMath || FMF.isFast() ||
403403 FMF.allowReciprocal();
404404
405405 // With UnsafeDiv node will be optimized to just rcp and mul.
486486
487487 bool AMDGPULibCalls::isUnsafeMath(const CallInst *CI) const {
488488 if (auto Op = dyn_cast(CI))
489 if (Op->hasUnsafeAlgebra())
489 if (Op->isFast())
490490 return true;
491491 const Function *F = CI->getParent()->getParent();
492492 Attribute Attr = F->getFnAttribute("unsafe-fp-math");
481481 return nullptr;
482482
483483 FastMathFlags Flags;
484 Flags.setUnsafeAlgebra();
484 Flags.setFast();
485485 if (I0) Flags &= I->getFastMathFlags();
486486 if (I1) Flags &= I->getFastMathFlags();
487487
510510 }
511511
512512 Value *FAddCombine::simplify(Instruction *I) {
513 assert(I->hasUnsafeAlgebra() && "Should be in unsafe mode");
513 assert(I->isFast() && "Expected 'fast' instruction");
514514
515515 // Currently we are not able to handle vector type.
516516 if (I->getType()->isVectorTy())
13851385 if (Value *V = SimplifySelectsFeedingBinaryOp(I, LHS, RHS))
13861386 return replaceInstUsesWith(I, V);
13871387
1388 if (I.hasUnsafeAlgebra()) {
1388 if (I.isFast()) {
13891389 if (Value *V = FAddCombine(Builder).simplify(&I))
13901390 return replaceInstUsesWith(I, V);
13911391 }
17351735 if (Value *V = SimplifySelectsFeedingBinaryOp(I, Op0, Op1))
17361736 return replaceInstUsesWith(I, V);
17371737
1738 if (I.hasUnsafeAlgebra()) {
1738 if (I.isFast()) {
17391739 if (Value *V = FAddCombine(Builder).simplify(&I))
17401740 return replaceInstUsesWith(I, V);
17411741 }
20162016 }
20172017 case Intrinsic::fmuladd: {
20182018 // Canonicalize fast fmuladd to the separate fmul + fadd.
2019 if (II->hasUnsafeAlgebra()) {
2019 if (II->isFast()) {
20202020 BuilderTy::FastMathFlagGuard Guard(Builder);
20212021 Builder.setFastMathFlags(II->getFastMathFlags());
20222022 Value *Mul = Builder.CreateFMul(II->getArgOperand(0),
486486 IntrinsicInst *II = dyn_cast(Op);
487487 if (!II)
488488 return;
489 if (II->getIntrinsicID() != Intrinsic::log2 || !II->hasUnsafeAlgebra())
489 if (II->getIntrinsicID() != Intrinsic::log2 || !II->isFast())
490490 return;
491491 Log2 = II;
492492
497497 Instruction *I = dyn_cast(OpLog2Of);
498498 if (!I)
499499 return;
500 if (I->getOpcode() != Instruction::FMul || !I->hasUnsafeAlgebra())
500
501 if (I->getOpcode() != Instruction::FMul || !I->isFast())
501502 return;
502503
503504 if (match(I->getOperand(0), m_SpecificFP(0.5)))
600601 }
601602
602603 if (R) {
603 R->setHasUnsafeAlgebra(true);
604 R->setFast(true);
604605 InsertNewInstWith(R, *InsertBefore);
605606 }
606607
621622 SQ.getWithInstruction(&I)))
622623 return replaceInstUsesWith(I, V);
623624
624 bool AllowReassociate = I.hasUnsafeAlgebra();
625 bool AllowReassociate = I.isFast();
625626
626627 // Simplify mul instructions with a constant RHS.
627628 if (isa(Op1)) {
13401341 if (Instruction *R = FoldOpIntoSelect(I, SI))
13411342 return R;
13421343
1343 bool AllowReassociate = I.hasUnsafeAlgebra();
1344 bool AllowReassociate = I.isFast();
13441345 bool AllowReciprocal = I.hasAllowReciprocal();
13451346
13461347 if (Constant *Op1C = dyn_cast(Op1)) {
144144 static BinaryOperator *isReassociableOp(Value *V, unsigned Opcode) {
145145 if (V->hasOneUse() && isa(V) &&
146146 cast(V)->getOpcode() == Opcode &&
147 (!isa(V) ||
148 cast(V)->hasUnsafeAlgebra()))
147 (!isa(V) || cast(V)->isFast()))
149148 return cast(V);
150149 return nullptr;
151150 }
155154 if (V->hasOneUse() && isa(V) &&
156155 (cast(V)->getOpcode() == Opcode1 ||
157156 cast(V)->getOpcode() == Opcode2) &&
158 (!isa(V) ||
159 cast(V)->hasUnsafeAlgebra()))
157 (!isa(V) || cast(V)->isFast()))
160158 return cast(V);
161159 return nullptr;
162160 }
564562 assert((!isa(Op) ||
565563 cast(Op)->getOpcode() != Opcode
566564 || (isa(Op) &&
567 !cast(Op)->hasUnsafeAlgebra())) &&
565 !cast(Op)->isFast())) &&
568566 "Should have been handled above!");
569567 assert(Op->hasOneUse() && "Has uses outside the expression tree!");
570568
20162014 if (I->isCommutative())
20172015 canonicalizeOperands(I);
20182016
2019 // Don't optimize floating point instructions that don't have unsafe algebra.
2020 if (I->getType()->isFPOrFPVectorTy() && !I->hasUnsafeAlgebra())
2017 // Don't optimize floating-point instructions unless they are 'fast'.
2018 if (I->getType()->isFPOrFPVectorTy() && !I->isFast())
20212019 return;
20222020
20232021 // Do not reassociate boolean (i1) expressions. We want to preserve the
431431 InstDesc &Prev, bool HasFunNoNaNAttr) {
432432 bool FP = I->getType()->isFloatingPointTy();
433433 Instruction *UAI = Prev.getUnsafeAlgebraInst();
434 if (!UAI && FP && !I->hasUnsafeAlgebra())
434 if (!UAI && FP && !I->isFast())
435435 UAI = I; // Found an unsafe (unvectorizable) algebra instruction.
436436
437437 switch (I->getOpcode()) {
659659 break;
660660 }
661661
662 // We only match FP sequences with unsafe algebra, so we can unconditionally
662 // We only match FP sequences that are 'fast', so we can unconditionally
663663 // set it on any generated instructions.
664664 IRBuilder<>::FastMathFlagGuard FMFG(Builder);
665665 FastMathFlags FMF;
666 FMF.setUnsafeAlgebra();
666 FMF.setFast();
667667 Builder.setFastMathFlags(FMF);
668668
669669 Value *Cmp;
767767
768768 // Floating point operations had to be 'fast' to enable the induction.
769769 FastMathFlags Flags;
770 Flags.setUnsafeAlgebra();
770 Flags.setFast();
771771
772772 Value *MulExp = B.CreateFMul(StepValue, Index);
773773 if (isa(MulExp))
13371337 static Value *addFastMathFlag(Value *V) {
13381338 if (isa(V)) {
13391339 FastMathFlags Flags;
1340 Flags.setUnsafeAlgebra();
1340 Flags.setFast();
13411341 cast(V)->setFastMathFlags(Flags);
13421342 }
13431343 return V;
14001400 RD::MinMaxRecurrenceKind MinMaxKind = RD::MRK_Invalid;
14011401 // TODO: Support creating ordered reductions.
14021402 FastMathFlags FMFUnsafe;
1403 FMFUnsafe.setUnsafeAlgebra();
1403 FMFUnsafe.setFast();
14041404
14051405 switch (Opcode) {
14061406 case Instruction::Add:
11101110 // Example: x = 1000, y = 0.001.
11111111 // pow(exp(x), y) = pow(inf, 0.001) = inf, whereas exp(x*y) = exp(1).
11121112 auto *OpC = dyn_cast(Op1);
1113 if (OpC && OpC->hasUnsafeAlgebra() && CI->hasUnsafeAlgebra()) {
1113 if (OpC && OpC->isFast() && CI->isFast()) {
11141114 LibFunc Func;
11151115 Function *OpCCallee = OpC->getCalledFunction();
11161116 if (OpCCallee && TLI->getLibFunc(OpCCallee->getName(), Func) &&
11351135 LibFunc_sqrtl)) {
11361136 // If -ffast-math:
11371137 // pow(x, -0.5) -> 1.0 / sqrt(x)
1138 if (CI->hasUnsafeAlgebra()) {
1138 if (CI->isFast()) {
11391139 IRBuilder<>::FastMathFlagGuard Guard(B);
11401140 B.setFastMathFlags(CI->getFastMathFlags());
11411141
11561156 LibFunc_sqrtl)) {
11571157
11581158 // In -ffast-math, pow(x, 0.5) -> sqrt(x).
1159 if (CI->hasUnsafeAlgebra()) {
1159 if (CI->isFast()) {
11601160 IRBuilder<>::FastMathFlagGuard Guard(B);
11611161 B.setFastMathFlags(CI->getFastMathFlags());
11621162
11951195 return B.CreateFDiv(ConstantFP::get(CI->getType(), 1.0), Op1, "powrecip");
11961196
11971197 // In -ffast-math, generate repeated fmul instead of generating pow(x, n).
1198 if (CI->hasUnsafeAlgebra()) {
1198 if (CI->isFast()) {
11991199 APFloat V = abs(Op2C->getValueAPF());
12001200 // We limit to a max of 7 fmul(s). Thus max exponent is 32.
12011201 // This transformation applies to integer exponents only.
12831283
12841284 IRBuilder<>::FastMathFlagGuard Guard(B);
12851285 FastMathFlags FMF;
1286 if (CI->hasUnsafeAlgebra()) {
1287 // Unsafe algebra sets all fast-math-flags to true.
1288 FMF.setUnsafeAlgebra();
1286 if (CI->isFast()) {
1287 // If the call is 'fast', then anything we create here will also be 'fast'.
1288 FMF.setFast();
12891289 } else {
12901290 // At a minimum, no-nans-fp-math must be true.
12911291 if (!CI->hasNoNaNs())
13161316 if (UnsafeFPShrink && hasFloatVersion(Name))
13171317 Ret = optimizeUnaryDoubleFP(CI, B, true);
13181318
1319 if (!CI->hasUnsafeAlgebra())
1319 if (!CI->isFast())
13201320 return Ret;
13211321 Value *Op1 = CI->getArgOperand(0);
13221322 auto *OpC = dyn_cast(Op1);
13231323
1324 // The earlier call must also be unsafe in order to do these transforms.
1325 if (!OpC || !OpC->hasUnsafeAlgebra())
1324 // The earlier call must also be 'fast' in order to do these transforms.
1325 if (!OpC || !OpC->isFast())
13261326 return Ret;
13271327
13281328 // log(pow(x,y)) -> y*log(x)
13321332
13331333 IRBuilder<>::FastMathFlagGuard Guard(B);
13341334 FastMathFlags FMF;
1335 FMF.setUnsafeAlgebra();
1335 FMF.setFast();
13361336 B.setFastMathFlags(FMF);
13371337
13381338 LibFunc Func;
13641364 Callee->getIntrinsicID() == Intrinsic::sqrt))
13651365 Ret = optimizeUnaryDoubleFP(CI, B, true);
13661366
1367 if (!CI->hasUnsafeAlgebra())
1367 if (!CI->isFast())
13681368 return Ret;
13691369
13701370 Instruction *I = dyn_cast(CI->getArgOperand(0));
1371 if (!I || I->getOpcode() != Instruction::FMul || !I->hasUnsafeAlgebra())
1371 if (!I || I->getOpcode() != Instruction::FMul || !I->isFast())
13721372 return Ret;
13731373
13741374 // We're looking for a repeated factor in a multiplication tree,
13901390 Value *OtherMul0, *OtherMul1;
13911391 if (match(Op0, m_FMul(m_Value(OtherMul0), m_Value(OtherMul1)))) {
13921392 // Pattern: sqrt((x * y) * z)
1393 if (OtherMul0 == OtherMul1 &&
1394 cast(Op0)->hasUnsafeAlgebra()) {
1393 if (OtherMul0 == OtherMul1 && cast(Op0)->isFast()) {
13951394 // Matched: sqrt((x * x) * z)
13961395 RepeatOp = OtherMul0;
13971396 OtherOp = Op1;
14361435 if (!OpC)
14371436 return Ret;
14381437
1439 // Both calls must allow unsafe optimizations in order to remove them.
1440 if (!CI->hasUnsafeAlgebra() || !OpC->hasUnsafeAlgebra())
1438 // Both calls must be 'fast' in order to remove them.
1439 if (!CI->isFast() || !OpC->isFast())
14411440 return Ret;
14421441
14431442 // tan(atan(x)) -> x
21662165
21672166 // Command-line parameter overrides instruction attribute.
21682167 // This can't be moved to optimizeFloatingPointLibCall() because it may be
2169 // used by the intrinsic optimizations.
2168 // used by the intrinsic optimizations.
21702169 if (EnableUnsafeFPShrink.getNumOccurrences() > 0)
21712170 UnsafeFPShrink = EnableUnsafeFPShrink;
2172 else if (isa(CI) && CI->hasUnsafeAlgebra())
2171 else if (isa(CI) && CI->isFast())
21732172 UnsafeFPShrink = true;
21742173
21752174 // First, check for intrinsics.
384384 static Value *addFastMathFlag(Value *V) {
385385 if (isa(V)) {
386386 FastMathFlags Flags;
387 Flags.setUnsafeAlgebra();
387 Flags.setFast();
388388 cast(V)->setFastMathFlags(Flags);
389389 }
390390 return V;
27192719
27202720 // Floating point operations had to be 'fast' to enable the induction.
27212721 FastMathFlags Flags;
2722 Flags.setUnsafeAlgebra();
2722 Flags.setFast();
27232723
27242724 Value *MulOp = Builder.CreateFMul(Cv, Step);
27252725 if (isa(MulOp))
53955395 // operations, shuffles, or casts, as they don't change precision or
53965396 // semantics.
53975397 } else if (I.getType()->isFloatingPointTy() && (CI || I.isBinaryOp()) &&
5398 !I.hasUnsafeAlgebra()) {
5398 !I.isFast()) {
53995399 DEBUG(dbgs() << "LV: Found FP op with unsafe algebra.\n");
54005400 Hints->setPotentiallyUnsafe();
54015401 }
48794879 case RK_Min:
48804880 case RK_Max:
48814881 return Opcode == Instruction::ICmp ||
4882 cast(I->getOperand(0))->hasUnsafeAlgebra();
4882 cast(I->getOperand(0))->isFast();
48834883 case RK_UMin:
48844884 case RK_UMax:
48854885 assert(Opcode == Instruction::ICmp &&
52315231 Value *VectorizedTree = nullptr;
52325232 IRBuilder<> Builder(ReductionRoot);
52335233 FastMathFlags Unsafe;
5234 Unsafe.setUnsafeAlgebra();
5234 Unsafe.setFast();
52355235 Builder.setFastMathFlags(Unsafe);
52365236 unsigned i = 0;
52375237
55 @select = external global i1
66 @vec = external global <3 x float>
77 @arr = external global [3 x float]
8
9 declare float @foo(float)
810
911 define float @none(float %x, float %y) {
1012 entry:
8587 ret float %c
8688 }
8789
90 ; CHECK: @reassoc(
91 define float @reassoc(float %x, float %y) {
92 ; CHECK: %a = fsub reassoc float %x, %y
93 %a = fsub reassoc float %x, %y
94 ; CHECK: %b = fmul reassoc float %x, %y
95 %b = fmul reassoc float %x, %y
96 ; CHECK: %c = call reassoc float @foo(float %b)
97 %c = call reassoc float @foo(float %b)
98 ret float %c
99 }
100
101 ; CHECK: @afn(
102 define float @afn(float %x, float %y) {
103 ; CHECK: %a = fdiv afn float %x, %y
104 %a = fdiv afn float %x, %y
105 ; CHECK: %b = frem afn float %x, %y
106 %b = frem afn float %x, %y
107 ; CHECK: %c = call afn float @foo(float %b)
108 %c = call afn float @foo(float %b)
109 ret float %c
110 }
111
88112 ; CHECK: no_nan_inf
89113 define float @no_nan_inf(float %x, float %y) {
90114 entry:
129153 ; CHECK: %arr = load [3 x float], [3 x float]* @arr
130154 %arr = load [3 x float], [3 x float]* @arr
131155
132 ; CHECK: %a = fadd nnan ninf float %x, %y
133 %a = fadd ninf nnan float %x, %y
134 ; CHECK: %a_vec = fadd nnan <3 x float> %vec, %vec
135 %a_vec = fadd nnan <3 x float> %vec, %vec
156 ; CHECK: %a = fadd nnan ninf afn float %x, %y
157 %a = fadd ninf nnan afn float %x, %y
158 ; CHECK: %a_vec = fadd reassoc nnan <3 x float> %vec, %vec
159 %a_vec = fadd reassoc nnan <3 x float> %vec, %vec
136160 ; CHECK: %b = fsub fast float %x, %y
137161 %b = fsub nnan nsz fast float %x, %y
138162 ; CHECK: %b_vec = fsub nnan <3 x float> %vec, %vec
611611 %f.arcp = fadd arcp float %op1, %op2
612612 ; CHECK: %f.arcp = fadd arcp float %op1, %op2
613613 %f.fast = fadd fast float %op1, %op2
614 ; CHECK: %f.fast = fadd fast float %op1, %op2
614 ; 'fast' used to be its own bit, but this changed in Oct 2017.
615 ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
616 ; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2
615617 ret void
616618 }
617619
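The comments in the hunk above explain why a 'fast' instruction read from an older binary file no longer prints as 'fast'. A small sketch of that reasoning follows, assuming such a file carries only the five older flag bits. The constant names and bit positions are hypothetical and stand in for whatever encoding the bitcode reader actually uses.

```cpp
#include <cassert>

// Illustrative bit assignments (assumed for this example only).
constexpr unsigned kReassoc       = 1u << 0; // assumed slot of the old umbrella bit
constexpr unsigned kNoNaNs        = 1u << 1;
constexpr unsigned kNoInfs        = 1u << 2;
constexpr unsigned kNoSignedZeros = 1u << 3;
constexpr unsigned kArcp          = 1u << 4;
constexpr unsigned kContract      = 1u << 5;
constexpr unsigned kAfn           = 1u << 6;
constexpr unsigned kAllFlags      = (1u << 7) - 1;

// After this change, 'fast' means that every flag is set.
constexpr bool isFast(unsigned Bits) {
  return (Bits & kAllFlags) == kAllFlags;
}

// What an older file would carry for an instruction written as 'fast'
// before 'contract' and 'afn' existed.
constexpr unsigned kLegacyFast =
    kReassoc | kNoNaNs | kNoInfs | kNoSignedZeros | kArcp;
```

Under these assumptions `isFast(kLegacyFast)` is false, so the instruction is printed flag by flag. Nothing is loosened: the older IR simply carries fewer permissions than the new 'fast'.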
655655 %f.arcp = fadd arcp float %op1, %op2
656656 ; CHECK: %f.arcp = fadd arcp float %op1, %op2
657657 %f.fast = fadd fast float %op1, %op2
658 ; CHECK: %f.fast = fadd fast float %op1, %op2
658 ; 'fast' used to be its own bit, but this changed in Oct 2017.
659 ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
660 ; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2
659661 ret void
660662 }
661663
686686 %f.arcp = fadd arcp float %op1, %op2
687687 ; CHECK: %f.arcp = fadd arcp float %op1, %op2
688688 %f.fast = fadd fast float %op1, %op2
689 ; CHECK: %f.fast = fadd fast float %op1, %op2
689 ; 'fast' used to be its own bit, but this changed in Oct 2017.
690 ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
691 ; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2
690692 ret void
691693 }
692694
699701 ; CHECK-LABEL: fastMathFlagsForCalls(
700702 define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) {
701703 %call.fast = call fast float @fmf1()
702 ; CHECK: %call.fast = call fast float @fmf1()
704 ; 'fast' used to be its own bit, but this changed in Oct 2017.
705 ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
706 ; CHECK: %call.fast = call reassoc nnan ninf nsz arcp float @fmf1()
703707
704708 ; Throw in some other attributes to make sure those stay in the right places.
705709
757757 %f.arcp = fadd arcp float %op1, %op2
758758 ; CHECK: %f.arcp = fadd arcp float %op1, %op2
759759 %f.fast = fadd fast float %op1, %op2
760 ; CHECK: %f.fast = fadd fast float %op1, %op2
760 ; 'fast' used to be its own bit, but this changed in Oct 2017.
761 ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
762 ; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2
761763 ret void
762764 }
763765
770772 ; CHECK-LABEL: fastMathFlagsForCalls(
771773 define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) {
772774 %call.fast = call fast float @fmf1()
773 ; CHECK: %call.fast = call fast float @fmf1()
775 ; 'fast' used to be its own bit, but this changed in Oct 2017.
776 ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
777 ; CHECK: %call.fast = call reassoc nnan ninf nsz arcp float @fmf1()
774778
775779 ; Throw in some other attributes to make sure those stay in the right places.
776780
756756 ; CHECK: %f.nsz = fadd nsz float %op1, %op2
757757 %f.arcp = fadd arcp float %op1, %op2
758758 ; CHECK: %f.arcp = fadd arcp float %op1, %op2
759 ; 'fast' used to be its own bit, but this changed in Oct 2017.
760 ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
759761 %f.fast = fadd fast float %op1, %op2
760 ; CHECK: %f.fast = fadd fast float %op1, %op2
762 ; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2
761763 ret void
762764 }
763765
770772 ; CHECK-LABEL: fastMathFlagsForCalls(
771773 define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) {
772774 %call.fast = call fast float @fmf1()
773 ; CHECK: %call.fast = call fast float @fmf1()
775 ; 'fast' used to be its own bit, but this changed in Oct 2017.
776 ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
777 ; CHECK: %call.fast = call reassoc nnan ninf nsz arcp float @fmf1()
774778
775779 ; Throw in some other attributes to make sure those stay in the right places.
776780
764764 %f.contract = fadd contract float %op1, %op2
765765 ; CHECK: %f.contract = fadd contract float %op1, %op2
766766 %f.fast = fadd fast float %op1, %op2
767 ; CHECK: %f.fast = fadd fast float %op1, %op2
767 ; 'fast' used to be its own bit, but this changed in Oct 2017.
768 ; The binary test file does not have the newer 'afn' bit set, so this is not fully 'fast'.
769 ; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp contract float %op1, %op2
768770 ret void
769771 }
770772
777779 ; CHECK-LABEL: fastMathFlagsForCalls(
778780 define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) {
779781 %call.fast = call fast float @fmf1()
780 ; CHECK: %call.fast = call fast float @fmf1()
782 ; 'fast' used to be its own bit, but this changed in Oct 2017.
783 ; The binary test file does not have the newer 'afn' bit set, so this is not fully 'fast'.
784 ; CHECK: %call.fast = call reassoc nnan ninf nsz arcp contract float @fmf1()
781785
782786 ; Throw in some other attributes to make sure those stay in the right places.
783787
774774 ; CHECK: %f.arcp = fadd arcp float %op1, %op2
775775 %f.contract = fadd contract float %op1, %op2
776776 ; CHECK: %f.contract = fadd contract float %op1, %op2
777 %f.afn = fadd afn float %op1, %op2
778 ; CHECK: %f.afn = fadd afn float %op1, %op2
779 %f.reassoc = fadd reassoc float %op1, %op2
780 ; CHECK: %f.reassoc = fadd reassoc float %op1, %op2
777781 %f.fast = fadd fast float %op1, %op2
778782 ; CHECK: %f.fast = fadd fast float %op1, %op2
779783 ret void
143143 FastMathFlags FMF;
144144 Builder.setFastMathFlags(FMF);
145145
146 // By default, no flags are set.
146147 F = Builder.CreateFAdd(F, F);
147148 EXPECT_FALSE(Builder.getFastMathFlags().any());
148
149 FMF.setUnsafeAlgebra();
149 ASSERT_TRUE(isa(F));
150 FAdd = cast(F);
151 EXPECT_FALSE(FAdd->hasNoNaNs());
152 EXPECT_FALSE(FAdd->hasNoInfs());
153 EXPECT_FALSE(FAdd->hasNoSignedZeros());
154 EXPECT_FALSE(FAdd->hasAllowReciprocal());
155 EXPECT_FALSE(FAdd->hasAllowContract());
156 EXPECT_FALSE(FAdd->hasAllowReassoc());
157 EXPECT_FALSE(FAdd->hasApproxFunc());
158
159 // Set all flags in the instruction.
160 FAdd->setFast(true);
161 EXPECT_TRUE(FAdd->hasNoNaNs());
162 EXPECT_TRUE(FAdd->hasNoInfs());
163 EXPECT_TRUE(FAdd->hasNoSignedZeros());
164 EXPECT_TRUE(FAdd->hasAllowReciprocal());
165 EXPECT_TRUE(FAdd->hasAllowContract());
166 EXPECT_TRUE(FAdd->hasAllowReassoc());
167 EXPECT_TRUE(FAdd->hasApproxFunc());
168
169 // All flags are set in the builder.
170 FMF.setFast();
150171 Builder.setFastMathFlags(FMF);
151172
152173 F = Builder.CreateFAdd(F, F);
153174 EXPECT_TRUE(Builder.getFastMathFlags().any());
175 EXPECT_TRUE(Builder.getFastMathFlags().all());
154176 ASSERT_TRUE(isa(F));
155177 FAdd = cast(F);
156178 EXPECT_TRUE(FAdd->hasNoNaNs());
179 EXPECT_TRUE(FAdd->isFast());
157180
158181 // Now, try it with CreateBinOp
159182 F = Builder.CreateBinOp(Instruction::FAdd, F, F);
161184 ASSERT_TRUE(isa(F));
162185 FAdd = cast(F);
163186 EXPECT_TRUE(FAdd->hasNoNaNs());
187 EXPECT_TRUE(FAdd->isFast());
164188
165189 F = Builder.CreateFDiv(F, F);
166 EXPECT_TRUE(Builder.getFastMathFlags().any());
167 EXPECT_TRUE(Builder.getFastMathFlags().UnsafeAlgebra);
190 EXPECT_TRUE(Builder.getFastMathFlags().all());
168191 ASSERT_TRUE(isa(F));
169192 FDiv = cast(F);
170193 EXPECT_TRUE(FDiv->hasAllowReciprocal());
171194
195 // Clear all FMF in the builder.
172196 Builder.clearFastMathFlags();
173197
174198 F = Builder.CreateFDiv(F, F);
175199 ASSERT_TRUE(isa(F));
176200 FDiv = cast(F);
177201 EXPECT_FALSE(FDiv->hasAllowReciprocal());
178
202
203 // Try individual flags.
179204 FMF.clear();
180205 FMF.setAllowReciprocal();
181206 Builder.setFastMathFlags(FMF);
224249 FAdd = cast(FC);
225250 EXPECT_TRUE(FAdd->hasAllowContract());
226251
252 FMF.setApproxFunc();
227253 Builder.clearFastMathFlags();
254 Builder.setFastMathFlags(FMF);
255 // Now 'afn' and 'contract' are set.
256 F = Builder.CreateFMul(F, F);
257 FAdd = cast(F);
258 EXPECT_TRUE(FAdd->hasApproxFunc());
259 EXPECT_TRUE(FAdd->hasAllowContract());
260 EXPECT_FALSE(FAdd->hasAllowReassoc());
261
262 FMF.setAllowReassoc();
263 Builder.clearFastMathFlags();
264 Builder.setFastMathFlags(FMF);
265 // Now 'afn' and 'contract' and 'reassoc' are set.
266 F = Builder.CreateFMul(F, F);
267 FAdd = cast(F);
268 EXPECT_TRUE(FAdd->hasApproxFunc());
269 EXPECT_TRUE(FAdd->hasAllowContract());
270 EXPECT_TRUE(FAdd->hasAllowReassoc());
228271
229272 // Test a call with FMF.
230273 auto CalleeTy = FunctionType::get(Type::getFloatTy(Ctx),