llvm.org GIT mirror llvm / 8e229ec
Adding the width of the GEP index to the Data Layout. The width of the GEP index, which is used for address calculation, becomes one of the pointer properties in the Data Layout: p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits. The index size parameter is optional; if not specified, it is equal to the pointer size. Until now, the InstCombiner normalized GEPs and extended the index operand to the pointer width. That works fine if a pointer can be converted to an integer for address calculation, and all registered targets do this. But some ISAs have a very restricted instruction set for pointer calculation. During discussion it was decided to retrieve the information for the GEP index from the Data Layout. http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html I added an interface to the Data Layout and changed the InstCombiner and some other passes to take the index width into account. This change does not affect any in-tree target. I added tests to cover data layouts with an explicitly specified index size. Differential Revision: https://reviews.llvm.org/D42123 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325102 91177308-0d34-0410-b5e6-96231b3b80d8 Elena Demikhovsky 1 year, 7 months ago
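The defaulting rule described above (index size falls back to the pointer size when the field is omitted) can be sketched outside LLVM like this. This is a hypothetical standalone parser, not DataLayout::parseSpecifier itself; the struct and function names are illustrative only, and error handling is omitted:

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch of parsing a "p[n]:<size>:<abi>[:<pref>[:<idx>]]"
// pointer specifier. Not LLVM's real parser; sizes are kept in bits.
struct PointerSpec {
  unsigned AddrSpace = 0;
  unsigned SizeInBits = 0;
  unsigned ABIAlign = 0;
  unsigned PrefAlign = 0;
  unsigned IndexSizeInBits = 0;
};

PointerSpec parsePointerSpec(const std::string &Spec) {
  // Split the specifier on ':'.
  std::vector<std::string> Parts;
  std::stringstream SS(Spec);
  std::string Tok;
  while (std::getline(SS, Tok, ':'))
    Parts.push_back(Tok);

  PointerSpec P;
  // Everything after the leading 'p' is the optional address space.
  if (Parts[0].size() > 1)
    P.AddrSpace = std::stoul(Parts[0].substr(1));
  P.SizeInBits = std::stoul(Parts[1]);
  P.ABIAlign = std::stoul(Parts[2]);
  // Preferred alignment defaults to the ABI alignment.
  P.PrefAlign = Parts.size() > 3 ? std::stoul(Parts[3]) : P.ABIAlign;
  // The index size is the optional last field; it defaults to the
  // pointer size, matching the behavior the commit describes.
  P.IndexSizeInBits = Parts.size() > 4 ? std::stoul(Parts[4]) : P.SizeInBits;
  return P;
}
```

So "p:64:64:64:32" yields 64-bit pointers with 32-bit GEP indices, while "p2:32:32" leaves the index width equal to the 32-bit pointer width.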
30 changed file(s) with 1632 addition(s) and 133 deletion(s).
10641064 for return values.
10651065
10661066 .. _attr_align:
10671067
10681068 ``align <n>``
10691069 This indicates that the pointer value may be assumed by the optimizer to
10701070 have the specified alignment.
19071907 ``A<address space>``
19081908 Specifies the address space of objects created by '``alloca``'.
19091909 Defaults to the default address space of 0.
1910 ``p[n]:<size>:<abi>:<pref>``
1910 ``p[n]:<size>:<abi>:<pref>:<idx>``
19111911 This specifies the *size* of a pointer and its ``<abi>`` and
1912 ``<pref>``\erred alignments for address space ``n``. All sizes are in
1913 bits. The address space, ``n``, is optional, and if not specified,
1912 ``<pref>``\erred alignments for address space ``n``. The fourth parameter
1913 ``<idx>`` is the size of the index used for address calculation. If not
1914 specified, the default index size is equal to the pointer size. All sizes
1915 are in bits. The address space, ``n``, is optional, and if not specified,
19141916 denotes the default address space 0. The value of ``n`` must be
19151917 in the range [1,2^23).
19161918 ``i<size>:<abi>:<pref>``
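For illustration, two hypothetical layout strings using the new syntax (not tied to any in-tree target):

```llvm
; 64-bit pointers in address space 0, but GEP indices computed in 32 bits:
target datalayout = "e-p:64:64:64:32"

; index size omitted, so it defaults to the pointer size (64 bits):
target datalayout = "e-p:64:64:64"
```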
22802282 LLVM IR floating-point operations (:ref:`fadd <i_fadd>`,
22812283 :ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
22822284 :ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) and :ref:`call <i_call>`
22832285 may use the following flags to enable otherwise unsafe
22842286 floating-point transformations.
22852287
22862288 ``nnan``
23072309
23082310 ``afn``
23092311 Approximate functions - Allow substitution of approximate calculations for
23102312 functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
23112313 for places where this can apply to LLVM's intrinsic math functions.
23122314
23132315 ``reassoc``
23142316 Allow reassociation transformations for floating-point instructions.
23152317 This may dramatically change results in floating point.
23162318
23172319 ``fast``
68526854 Semantics:
68536855 """"""""""
68546856
68556857 Return the same value as a libm '``fmod``' function but without trapping or
68566858 setting ``errno``.
68576859
68586860 The remainder has the same sign as the dividend. This instruction can also
68596861 take any number of :ref:`fast-math flags <fastmath>`, which are optimization
68606862 hints to enable otherwise unsafe floating-point optimizations:
68616863
1050310505 """"""""""
1050410506
1050510507 The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
1050610508 at the destination location.
1050710509
1050810510 '``llvm.sqrt.*``' Intrinsic
1050910511 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
1053710539 """"""""""
1053810540
1053910541 Return the same value as a corresponding libm '``sqrt``' function but without
1054010542 trapping or setting ``errno``. For types specified by IEEE-754, the result
1054110543 matches a conforming libm implementation.
1054210544
1054310545 When specified with the fast-math-flag 'afn', the result may be approximated
1054410546 using a less accurate calculation.
1054510547
1054610548 '``llvm.powi.*``' Intrinsic
1061510617 Return the same value as a corresponding libm '``sin``' function but without
1061610618 trapping or setting ``errno``.
1061710619
1061810620 When specified with the fast-math-flag 'afn', the result may be approximated
1061910621 using a less accurate calculation.
1062010622
1062110623 '``llvm.cos.*``' Intrinsic
1065210654 Return the same value as a corresponding libm '``cos``' function but without
1065310655 trapping or setting ``errno``.
1065410656
1065510657 When specified with the fast-math-flag 'afn', the result may be approximated
1065610658 using a less accurate calculation.
1065710659
1065810660 '``llvm.pow.*``' Intrinsic
1069010692 Return the same value as a corresponding libm '``pow``' function but without
1069110693 trapping or setting ``errno``.
1069210694
1069310695 When specified with the fast-math-flag 'afn', the result may be approximated
1069410696 using a less accurate calculation.
1069510697
1069610698 '``llvm.exp.*``' Intrinsic
1072810730 Return the same value as a corresponding libm '``exp``' function but without
1072910731 trapping or setting ``errno``.
1073010732
1073110733 When specified with the fast-math-flag 'afn', the result may be approximated
1073210734 using a less accurate calculation.
1073310735
1073410736 '``llvm.exp2.*``' Intrinsic
1076610768 Return the same value as a corresponding libm '``exp2``' function but without
1076710769 trapping or setting ``errno``.
1076810770
1076910771 When specified with the fast-math-flag 'afn', the result may be approximated
1077010772 using a less accurate calculation.
1077110773
1077210774 '``llvm.log.*``' Intrinsic
1080410806 Return the same value as a corresponding libm '``log``' function but without
1080510807 trapping or setting ``errno``.
1080610808
1080710809 When specified with the fast-math-flag 'afn', the result may be approximated
1080810810 using a less accurate calculation.
1080910811
1081010812 '``llvm.log10.*``' Intrinsic
1084210844 Return the same value as a corresponding libm '``log10``' function but without
1084310845 trapping or setting ``errno``.
1084410846
1084510847 When specified with the fast-math-flag 'afn', the result may be approximated
1084610848 using a less accurate calculation.
1084710849
1084810850 '``llvm.log2.*``' Intrinsic
1088010882 Return the same value as a corresponding libm '``log2``' function but without
1088110883 trapping or setting ``errno``.
1088210884
1088310885 When specified with the fast-math-flag 'afn', the result may be approximated
1088410886 using a less accurate calculation.
1088510887
1088610888 '``llvm.fma.*``' Intrinsic
1091710919 Return the same value as a corresponding libm '``fma``' function but without
1091810920 trapping or setting ``errno``.
1091910921
1092010922 When specified with the fast-math-flag 'afn', the result may be approximated
1092110923 using a less accurate calculation.
1092210924
1092310925 '``llvm.fabs.*``' Intrinsic
1455714559 is replaced with an actual element size.
1455814560
1455914561 The optimizer is allowed to inline the memory assignment when it's profitable to do so.
14560
239239 bool IsJTAllowed = TLI->areJTsAllowed(SI.getParent()->getParent());
240240
241241 // Early exit if both a jump table and bit test are not allowed.
242 if (N < 1 || (!IsJTAllowed && DL.getPointerSizeInBits() < N))
242 if (N < 1 || (!IsJTAllowed && DL.getIndexSizeInBits(0u) < N))
243243 return N;
244244
245245 APInt MaxCaseVal = SI.case_begin()->getCaseValue()->getValue();
253253 }
254254
255255 // Check if suitable for a bit test
256 if (N <= DL.getPointerSizeInBits()) {
256 if (N <= DL.getIndexSizeInBits(0u)) {
257257 SmallPtrSet Dests;
258258 for (auto I : SI.cases())
259259 Dests.insert(I.getCaseSuccessor());
811811 bool rangeFitsInWord(const APInt &Low, const APInt &High,
812812 const DataLayout &DL) const {
813813 // FIXME: Using the pointer type doesn't seem ideal.
814 uint64_t BW = DL.getPointerSizeInBits();
814 uint64_t BW = DL.getIndexSizeInBits(0u);
815815 uint64_t Range = (High - Low).getLimitedValue(UINT64_MAX - 1) + 1;
816816 return Range <= BW;
817817 }
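The rangeFitsInWord check above can be modeled with plain integers. This is an illustrative re-implementation using uint64_t in place of APInt, with the same saturation trick to avoid overflow when computing the range size:

```cpp
#include <cassert>
#include <cstdint>

// A bit test is only possible when the whole case range fits in one
// index-width word. Mirrors the APInt logic: Range saturates at
// UINT64_MAX - 1 before the + 1, so the addition cannot overflow.
bool rangeFitsInWord(uint64_t Low, uint64_t High, uint64_t IndexWidthBits) {
  uint64_t Span = High - Low;
  uint64_t Range = (Span > UINT64_MAX - 2 ? UINT64_MAX - 1 : Span) + 1;
  return Range <= IndexWidthBits;
}
```

With a 32-bit index width, a switch covering 32 consecutive case values still fits in one word, but 33 values do not.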
9191 unsigned PrefAlign;
9292 uint32_t TypeByteWidth;
9393 uint32_t AddressSpace;
94 uint32_t IndexWidth;
9495
9596 /// Initializer
9697 static PointerAlignElem get(uint32_t AddressSpace, unsigned ABIAlign,
97 unsigned PrefAlign, uint32_t TypeByteWidth);
98 unsigned PrefAlign, uint32_t TypeByteWidth,
99 uint32_t IndexWidth);
98100
99101 bool operator==(const PointerAlignElem &rhs) const;
100102 };
164166 unsigned getAlignmentInfo(AlignTypeEnum align_type, uint32_t bit_width,
165167 bool ABIAlign, Type *Ty) const;
166168 void setPointerAlignment(uint32_t AddrSpace, unsigned ABIAlign,
167 unsigned PrefAlign, uint32_t TypeByteWidth);
169 unsigned PrefAlign, uint32_t TypeByteWidth,
170 uint32_t IndexWidth);
168171
169172 /// Internal helper method that returns requested alignment for type.
170173 unsigned getAlignment(Type *Ty, bool abi_or_pref) const;
320323 /// the backends/clients are updated.
321324 unsigned getPointerSize(unsigned AS = 0) const;
322325
326 // Index size used for address calculation.
327 unsigned getIndexSize(unsigned AS) const;
328
323329 /// Return the address spaces containing non-integral pointers. Pointers in
324330 /// this address space don't have a well-defined bitwise representation.
325331 ArrayRef<unsigned> getNonIntegralAddressSpaces() const {
342348 /// the backends/clients are updated.
343349 unsigned getPointerSizeInBits(unsigned AS = 0) const {
344350 return getPointerSize(AS) * 8;
351 }
352
353 /// Size in bits of index used for address calculation in getelementptr.
354 unsigned getIndexSizeInBits(unsigned AS) const {
355 return getIndexSize(AS) * 8;
345356 }
346357
347358 /// Layout pointer size, in bits, based on the type. If this function is
350361 /// of the pointer is returned. This should only be called with a pointer or
351362 /// vector of pointers.
352363 unsigned getPointerTypeSizeInBits(Type *) const;
364
365 /// Layout size of the index used in GEP calculation.
366 /// The function should be called with pointer or vector of pointers type.
367 unsigned getIndexTypeSizeInBits(Type *Ty) const;
353368
354369 unsigned getPointerTypeSize(Type *Ty) const {
355370 return getPointerTypeSizeInBits(Ty) / 8;
451466 /// \brief Returns the size of largest legal integer type size, or 0 if none
452467 /// are set.
453468 unsigned getLargestLegalIntTypeSizeInBits() const;
469
470 /// \brief Returns the type of a GEP index.
471 /// If it was not specified explicitly, it will be the integer type of the
472 /// pointer width - IntPtrType.
473 Type *getIndexType(Type *PtrTy) const;
454474
455475 /// \brief Returns the offset from the beginning of the type for the specified
456476 /// indices.
285285 APInt &Offset, const DataLayout &DL) {
286286 // Trivial case, constant is the global.
287287 if ((GV = dyn_cast<GlobalValue>(C))) {
288 unsigned BitWidth = DL.getPointerTypeSizeInBits(GV->getType());
288 unsigned BitWidth = DL.getIndexTypeSizeInBits(GV->getType());
289289 Offset = APInt(BitWidth, 0);
290290 return true;
291291 }
304304 if (!GEP)
305305 return false;
306306
307 unsigned BitWidth = DL.getPointerTypeSizeInBits(GEP->getType());
307 unsigned BitWidth = DL.getIndexTypeSizeInBits(GEP->getType());
308308 APInt TmpOffset(BitWidth, 0);
309309
310310 // If the base isn't a global+constant, we aren't either.
807807 // If this is a constant expr gep that is effectively computing an
808808 // "offsetof", fold it into 'cast int Size to T*' instead of 'gep 0, 0, 12'
809809 for (unsigned i = 1, e = Ops.size(); i != e; ++i)
810 if (!isa<ConstantInt>(Ops[i])) {
811
812 // If this is "gep i8* Ptr, (sub 0, V)", fold this as:
813 // "inttoptr (sub (ptrtoint Ptr), V)"
814 if (Ops.size() == 2 && ResElemTy->isIntegerTy(8)) {
815 auto *CE = dyn_cast<ConstantExpr>(Ops[1]);
816 assert((!CE || CE->getType() == IntPtrTy) &&
817 "CastGEPIndices didn't canonicalize index types!");
818 if (CE && CE->getOpcode() == Instruction::Sub &&
819 CE->getOperand(0)->isNullValue()) {
820 Constant *Res = ConstantExpr::getPtrToInt(Ptr, CE->getType());
821 Res = ConstantExpr::getSub(Res, CE->getOperand(1));
822 Res = ConstantExpr::getIntToPtr(Res, ResTy);
823 if (auto *FoldedRes = ConstantFoldConstant(Res, DL, TLI))
824 Res = FoldedRes;
825 return Res;
826 }
826827 }
828 return nullptr;
829 }
830830
831831 unsigned BitWidth = DL.getTypeSizeInBits(IntPtrTy);
832832 APInt Offset =
371371 /// Returns false if unable to compute the offset for any reason. Respects any
372372 /// simplified values known during the analysis of this callsite.
373373 bool CallAnalyzer::accumulateGEPOffset(GEPOperator &GEP, APInt &Offset) {
374 unsigned IntPtrWidth = DL.getPointerTypeSizeInBits(GEP.getType());
374 unsigned IntPtrWidth = DL.getIndexTypeSizeInBits(GEP.getType());
375375 assert(IntPtrWidth == Offset.getBitWidth());
376376
377377 for (gep_type_iterator GTI = gep_type_begin(GEP), GTE = gep_type_end(GEP);
16181618 return nullptr;
16191619
16201620 unsigned AS = V->getType()->getPointerAddressSpace();
1621 unsigned IntPtrWidth = DL.getPointerSizeInBits(AS);
1621 unsigned IntPtrWidth = DL.getIndexSizeInBits(AS);
16221622 APInt Offset = APInt::getNullValue(IntPtrWidth);
16231623
16241624 // Even though we don't look through PHI nodes, we could be called on an
37613761 // The following transforms are only safe if the ptrtoint cast
37623762 // doesn't truncate the pointers.
37633763 if (Ops[1]->getType()->getScalarSizeInBits() ==
3764 Q.DL.getPointerSizeInBits(AS)) {
3764 Q.DL.getIndexSizeInBits(AS)) {
37653765 auto PtrToIntOrZero = [GEPTy](Value *P) -> Value * {
37663766 if (match(P, m_Zero()))
37673767 return Constant::getNullValue(GEPTy);
38013801 if (Q.DL.getTypeAllocSize(LastType) == 1 &&
38023802 all_of(Ops.slice(1).drop_back(1),
38033803 [](Value *Idx) { return match(Idx, m_Zero()); })) {
3804 unsigned PtrWidth =
3805 Q.DL.getPointerSizeInBits(Ops[0]->getType()->getPointerAddressSpace());
3806 if (Q.DL.getTypeSizeInBits(Ops.back()->getType()) == PtrWidth) {
3807 APInt BasePtrOffset(PtrWidth, 0);
3804 unsigned IdxWidth =
3805 Q.DL.getIndexSizeInBits(Ops[0]->getType()->getPointerAddressSpace());
3806 if (Q.DL.getTypeSizeInBits(Ops.back()->getType()) == IdxWidth) {
3807 APInt BasePtrOffset(IdxWidth, 0);
38083808 Value *StrippedBasePtr =
38093809 Ops[0]->stripAndAccumulateInBoundsConstantOffsets(Q.DL,
38103810 BasePtrOffset);
7979 if (const GEPOperator *GEP = dyn_cast<GEPOperator>(V)) {
8080 const Value *Base = GEP->getPointerOperand();
8181
82 APInt Offset(DL.getPointerTypeSizeInBits(GEP->getType()), 0);
82 APInt Offset(DL.getIndexTypeSizeInBits(GEP->getType()), 0);
8383 if (!GEP->accumulateConstantOffset(DL, Offset) || Offset.isNegative() ||
8484 !Offset.urem(APInt(Offset.getBitWidth(), Align)).isMinValue())
8585 return false;
145145
146146 SmallPtrSet Visited;
147147 return ::isDereferenceableAndAlignedPointer(
148 V, Align, APInt(DL.getTypeSizeInBits(VTy), DL.getTypeStoreSize(Ty)), DL,
148 V, Align, APInt(DL.getIndexTypeSizeInBits(VTy), DL.getTypeStoreSize(Ty)), DL,
149149 CtxI, DT, Visited);
150150 }
151151
11261126 if (CheckType && PtrA->getType() != PtrB->getType())
11271127 return false;
11281128
1129 unsigned PtrBitWidth = DL.getPointerSizeInBits(ASA);
1129 unsigned IdxWidth = DL.getIndexSizeInBits(ASA);
11301130 Type *Ty = cast<PointerType>(PtrA->getType())->getElementType();
1131 APInt Size(PtrBitWidth, DL.getTypeStoreSize(Ty));
1132
1133 APInt OffsetA(PtrBitWidth, 0), OffsetB(PtrBitWidth, 0);
1131 APInt Size(IdxWidth, DL.getTypeStoreSize(Ty));
1132
1133 APInt OffsetA(IdxWidth, 0), OffsetB(IdxWidth, 0);
11341134 PtrA = PtrA->stripAndAccumulateInBoundsConstantOffsets(DL, OffsetA);
11351135 PtrB = PtrB->stripAndAccumulateInBoundsConstantOffsets(DL, OffsetB);
11361136
36713671 /// return true.
36723672 uint64_t ScalarEvolution::getTypeSizeInBits(Type *Ty) const {
36733673 assert(isSCEVable(Ty) && "Type is not SCEVable!");
3674 if (Ty->isPointerTy())
3675 return getDataLayout().getIndexTypeSizeInBits(Ty);
36743676 return getDataLayout().getTypeSizeInBits(Ty);
36753677 }
36763678
8888 if (unsigned BitWidth = Ty->getScalarSizeInBits())
8989 return BitWidth;
9090
91 return DL.getPointerTypeSizeInBits(Ty);
91 return DL.getIndexTypeSizeInBits(Ty);
9292 }
9393
9494 namespace {
11001100 unsigned SrcBitWidth;
11011101 // Note that we handle pointer operands here because of inttoptr/ptrtoint
11021102 // which fall through here.
1103 SrcBitWidth = Q.DL.getTypeSizeInBits(SrcTy->getScalarType());
1103 Type *ScalarTy = SrcTy->getScalarType();
1104 SrcBitWidth = ScalarTy->isPointerTy() ?
1105 Q.DL.getIndexTypeSizeInBits(ScalarTy) :
1106 Q.DL.getTypeSizeInBits(ScalarTy);
11041107
11051108 assert(SrcBitWidth && "SrcBitWidth can't be zero");
11061109 Known = Known.zextOrTrunc(SrcBitWidth);
15541557 assert((V->getType()->isIntOrIntVectorTy(BitWidth) ||
15551558 V->getType()->isPtrOrPtrVectorTy()) &&
15561559 "Not integer or pointer type!");
1557 assert(Q.DL.getTypeSizeInBits(V->getType()->getScalarType()) == BitWidth &&
1558 "V and Known should have same BitWidth");
1560
1561 Type *ScalarTy = V->getType()->getScalarType();
1562 unsigned ExpectedWidth = ScalarTy->isPointerTy() ?
1563 Q.DL.getIndexTypeSizeInBits(ScalarTy) : Q.DL.getTypeSizeInBits(ScalarTy);
1564 assert(ExpectedWidth == BitWidth && "V and Known should have same BitWidth");
15591565 (void)BitWidth;
1566 (void)ExpectedWidth;
15601567
15611568 const APInt *C;
15621569 if (match(V, m_APInt(C))) {
21932200 // in V, so for undef we have to conservatively return 1. We don't have the
21942201 // same behavior for poison though -- that's a FIXME today.
21952202
2196 unsigned TyBits = Q.DL.getTypeSizeInBits(V->getType()->getScalarType());
2203 Type *ScalarTy = V->getType()->getScalarType();
2204 unsigned TyBits = ScalarTy->isPointerTy() ?
2205 Q.DL.getIndexTypeSizeInBits(ScalarTy) :
2206 Q.DL.getTypeSizeInBits(ScalarTy);
2207
21972208 unsigned Tmp, Tmp2;
21982209 unsigned FirstAnswer = 1;
21992210
30903101 /// pointer plus a constant offset. Return the base and offset to the caller.
30913102 Value *llvm::GetPointerBaseWithConstantOffset(Value *Ptr, int64_t &Offset,
30923103 const DataLayout &DL) {
3093 unsigned BitWidth = DL.getPointerTypeSizeInBits(Ptr->getType());
3104 unsigned BitWidth = DL.getIndexTypeSizeInBits(Ptr->getType());
30943105 APInt ByteOffset(BitWidth, 0);
30953106
30963107 // We walk up the defs but use a visited set to handle unreachable code. In
31083119 // means when we construct GEPOffset, we need to use the size
31093120 // of GEP's pointer type rather than the size of the original
31103121 // pointer type.
3111 APInt GEPOffset(DL.getPointerTypeSizeInBits(Ptr->getType()), 0);
3122 APInt GEPOffset(DL.getIndexTypeSizeInBits(Ptr->getType()), 0);
31123123 if (!GEP->accumulateConstantOffset(DL, GEPOffset))
31133124 break;
31143125
15801580 // if size - offset meets the size threshold.
15811581 if (!Arg->getType()->isPointerTy())
15821582 continue;
1583 APInt Offset(DL->getPointerSizeInBits(
1583 APInt Offset(DL->getIndexSizeInBits(
15841584 cast<PointerType>(Arg->getType())->getAddressSpace()),
15851585 0);
15861586 Value *Val = Arg->stripAndAccumulateInBoundsConstantOffsets(*DL, Offset);
79767976 const GlobalValue *GV;
79777977 int64_t GVOffset = 0;
79787978 if (TLI->isGAPlusOffset(Ptr.getNode(), GV, GVOffset)) {
7979 unsigned PtrWidth = getDataLayout().getPointerTypeSizeInBits(GV->getType());
7980 KnownBits Known(PtrWidth);
7979 unsigned IdxWidth = getDataLayout().getIndexTypeSizeInBits(GV->getType());
7980 KnownBits Known(IdxWidth);
79817981 llvm::computeKnownBits(GV, Known, getDataLayout());
79827982 unsigned AlignBits = Known.countMinTrailingZeros();
79837983 unsigned Align = AlignBits ? 1 << std::min(31U, AlignBits) : 0;
34233423 DAG.getConstant(Offset, dl, N.getValueType()), Flags);
34243424 }
34253425 } else {
3426 MVT PtrTy =
3427 DAG.getTargetLoweringInfo().getPointerTy(DAG.getDataLayout(), AS);
3428 unsigned PtrSize = PtrTy.getSizeInBits();
3429 APInt ElementSize(PtrSize, DL->getTypeAllocSize(GTI.getIndexedType()));
3426 unsigned IdxSize = DAG.getDataLayout().getIndexSizeInBits(AS);
3427 MVT IdxTy = MVT::getIntegerVT(IdxSize);
3428 APInt ElementSize(IdxSize, DL->getTypeAllocSize(GTI.getIndexedType()));
34303429
34313430 // If this is a scalar constant or a splat vector of constants,
34323431 // handle it quickly.
34383437 if (CI) {
34393438 if (CI->isZero())
34403439 continue;
3441 APInt Offs = ElementSize * CI->getValue().sextOrTrunc(PtrSize);
3440 APInt Offs = ElementSize * CI->getValue().sextOrTrunc(IdxSize);
34423441 LLVMContext &Context = *DAG.getContext();
34433442 SDValue OffsVal = VectorWidth ?
3444 DAG.getConstant(Offs, dl, EVT::getVectorVT(Context, PtrTy, VectorWidth)) :
3445 DAG.getConstant(Offs, dl, PtrTy);
3443 DAG.getConstant(Offs, dl, EVT::getVectorVT(Context, IdxTy, VectorWidth)) :
3444 DAG.getConstant(Offs, dl, IdxTy);
34463445
34473446 // In an inbounds GEP with an offset that is nonnegative even when
34483447 // interpreted as signed, assume there is no unsigned overflow.
128128
129129 PointerAlignElem
130130 PointerAlignElem::get(uint32_t AddressSpace, unsigned ABIAlign,
131 unsigned PrefAlign, uint32_t TypeByteWidth) {
131 unsigned PrefAlign, uint32_t TypeByteWidth,
132 uint32_t IndexWidth) {
132133 assert(ABIAlign <= PrefAlign && "Preferred alignment worse than ABI!");
133134 PointerAlignElem retval;
134135 retval.AddressSpace = AddressSpace;
135136 retval.ABIAlign = ABIAlign;
136137 retval.PrefAlign = PrefAlign;
137138 retval.TypeByteWidth = TypeByteWidth;
139 retval.IndexWidth = IndexWidth;
138140 return retval;
139141 }
140142
143145 return (ABIAlign == rhs.ABIAlign
144146 && AddressSpace == rhs.AddressSpace
145147 && PrefAlign == rhs.PrefAlign
146 && TypeByteWidth == rhs.TypeByteWidth);
148 && TypeByteWidth == rhs.TypeByteWidth
149 && IndexWidth == rhs.IndexWidth);
147150 }
148151
149152 //===----------------------------------------------------------------------===//
188191 setAlignment((AlignTypeEnum)E.AlignType, E.ABIAlign, E.PrefAlign,
189192 E.TypeBitWidth);
190193 }
191 setPointerAlignment(0, 8, 8, 8);
194 setPointerAlignment(0, 8, 8, 8, 8);
192195
193196 parseSpecifier(Desc);
194197 }
286289 report_fatal_error(
287290 "Pointer ABI alignment must be a power of 2");
288291
292 // Size of index used in GEP for address calculation.
293 // The parameter is optional. By default it is equal to size of pointer.
294 unsigned IndexSize = PointerMemSize;
295
289296 // Preferred alignment.
290297 unsigned PointerPrefAlign = PointerABIAlign;
291298 if (!Rest.empty()) {
294301 if (!isPowerOf2_64(PointerPrefAlign))
295302 report_fatal_error(
296303 "Pointer preferred alignment must be a power of 2");
304
305 // Now read the index. It is the second optional parameter here.
306 if (!Rest.empty()) {
307 Split = split(Rest, ':');
308 IndexSize = inBytes(getInt(Tok));
309 if (!IndexSize)
310 report_fatal_error("Invalid index size of 0 bytes");
311 }
297312 }
298
299313 setPointerAlignment(AddrSpace, PointerABIAlign, PointerPrefAlign,
300 PointerMemSize);
314 PointerMemSize, IndexSize);
301315 break;
302316 }
303317 case 'i':
466480 }
467481
468482 void DataLayout::setPointerAlignment(uint32_t AddrSpace, unsigned ABIAlign,
469 unsigned PrefAlign,
470 uint32_t TypeByteWidth) {
483 unsigned PrefAlign, uint32_t TypeByteWidth,
484 uint32_t IndexWidth) {
471485 if (PrefAlign < ABIAlign)
472486 report_fatal_error(
473487 "Preferred alignment cannot be less than the ABI alignment");
475489 PointersTy::iterator I = findPointerLowerBound(AddrSpace);
476490 if (I == Pointers.end() || I->AddressSpace != AddrSpace) {
477491 Pointers.insert(I, PointerAlignElem::get(AddrSpace, ABIAlign, PrefAlign,
478 TypeByteWidth));
492 TypeByteWidth, IndexWidth));
479493 } else {
480494 I->ABIAlign = ABIAlign;
481495 I->PrefAlign = PrefAlign;
482496 I->TypeByteWidth = TypeByteWidth;
497 I->IndexWidth = IndexWidth;
483498 }
484499 }
485500
615630 "This should only be called with a pointer or pointer vector type");
616631 Ty = Ty->getScalarType();
617632 return getPointerSizeInBits(cast<PointerType>(Ty)->getAddressSpace());
633 }
634
635 unsigned DataLayout::getIndexSize(unsigned AS) const {
636 PointersTy::const_iterator I = findPointerLowerBound(AS);
637 if (I == Pointers.end() || I->AddressSpace != AS) {
638 I = findPointerLowerBound(0);
639 assert(I->AddressSpace == 0);
640 }
641 return I->IndexWidth;
642 }
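The fallback in getIndexSize above (an address space with no explicit entry uses the address-space-0 defaults) can be sketched outside LLVM like this. PtrEntry is a hypothetical stand-in for PointerAlignElem, and the sorted-vector lookup mirrors findPointerLowerBound:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Entries are kept sorted by address space, like DataLayout's Pointers list.
struct PtrEntry {
  uint32_t AddressSpace;
  uint32_t IndexWidth; // in bytes, like DataLayout's IndexWidth
};

uint32_t getIndexWidth(const std::vector<PtrEntry> &Pointers, uint32_t AS) {
  auto Less = [](const PtrEntry &E, uint32_t V) { return E.AddressSpace < V; };
  auto I = std::lower_bound(Pointers.begin(), Pointers.end(), AS, Less);
  if (I == Pointers.end() || I->AddressSpace != AS) {
    // No entry for this address space: fall back to address space 0,
    // which parseSpecifier guarantees always exists.
    I = std::lower_bound(Pointers.begin(), Pointers.end(), 0u, Less);
    assert(I != Pointers.end() && I->AddressSpace == 0 && "AS 0 must exist");
  }
  return I->IndexWidth;
}
```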
643
644 unsigned DataLayout::getIndexTypeSizeInBits(Type *Ty) const {
645 assert(Ty->isPtrOrPtrVectorTy() &&
646 "This should only be called with a pointer or pointer vector type");
647 Ty = Ty->getScalarType();
648 return getIndexSizeInBits(cast<PointerType>(Ty)->getAddressSpace());
618649 }
619650
620651 /*!
700731
701732 IntegerType *DataLayout::getIntPtrType(LLVMContext &C,
702733 unsigned AddressSpace) const {
703 return IntegerType::get(C, getPointerSizeInBits(AddressSpace));
734 return IntegerType::get(C, getIndexSizeInBits(AddressSpace));
704735 }
705736
706737 Type *DataLayout::getIntPtrType(Type *Ty) const {
707738 assert(Ty->isPtrOrPtrVectorTy() &&
708739 "Expected a pointer or pointer vector type.");
709 unsigned NumBits = getPointerTypeSizeInBits(Ty);
740 unsigned NumBits = getIndexTypeSizeInBits(Ty);
710741 IntegerType *IntTy = IntegerType::get(Ty->getContext(), NumBits);
711742 if (VectorType *VecTy = dyn_cast<VectorType>(Ty))
712743 return VectorType::get(IntTy, VecTy->getNumElements());
723754 unsigned DataLayout::getLargestLegalIntTypeSizeInBits() const {
724755 auto Max = std::max_element(LegalIntWidths.begin(), LegalIntWidths.end());
725756 return Max != LegalIntWidths.end() ? *Max : 0;
757 }
758
759 Type *DataLayout::getIndexType(Type *Ty) const {
760 assert(Ty->isPtrOrPtrVectorTy() &&
761 "Expected a pointer or pointer vector type.");
762 unsigned NumBits = getIndexTypeSizeInBits(Ty);
763 IntegerType *IntTy = IntegerType::get(Ty->getContext(), NumBits);
764 if (VectorType *VecTy = dyn_cast<VectorType>(Ty))
765 return VectorType::get(IntTy, VecTy->getNumElements());
766 return IntTy;
726767 }
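getIndexType's shape handling (a scalar pointer maps to an iN of the index width, a vector of pointers to a vector of such integers) can be illustrated with a toy string-based model. The names here are purely hypothetical; LLVM builds real IntegerType/VectorType objects instead:

```cpp
#include <cassert>
#include <string>

// Render the index type for a pointer of the given index width; a nonzero
// VectorElts models a vector-of-pointers operand.
std::string getIndexTypeName(unsigned IndexBits, unsigned VectorElts = 0) {
  std::string IntTy = "i" + std::to_string(IndexBits);
  if (VectorElts) // vector of pointers -> vector of index integers
    return "<" + std::to_string(VectorElts) + " x " + IntTy + ">";
  return IntTy;
}
```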
727768
728769 int64_t DataLayout::getIndexedOffsetInType(Type *ElemTy,
3434 bool GEPOperator::accumulateConstantOffset(const DataLayout &DL,
3535 APInt &Offset) const {
3636 assert(Offset.getBitWidth() ==
37 DL.getPointerSizeInBits(getPointerAddressSpace()) &&
38 "The offset must have exactly as many bits as our pointer.");
37 DL.getIndexSizeInBits(getPointerAddressSpace()) &&
38 "The offset bit width does not match DL specification.");
3939
4040 for (gep_type_iterator GTI = gep_type_begin(this), GTE = gep_type_end(this);
4141 GTI != GTE; ++GTI) {
586586 if (!getType()->isPointerTy())
587587 return this;
588588
589 assert(Offset.getBitWidth() == DL.getPointerSizeInBits(cast<PointerType>(
589 assert(Offset.getBitWidth() == DL.getIndexSizeInBits(cast<PointerType>(
590590 getType())->getAddressSpace()) &&
591 "The offset must have exactly as many bits as our pointer.");
591 "The offset bit width does not match the DL specification.");
592592
593593 // Even though we don't look through PHI nodes, we could be called on an
594594 // instruction in an unreachable block, which may be on a cycle.
17601760 Type *Ty = CI.getType();
17611761 unsigned AS = CI.getPointerAddressSpace();
17621762
1763 if (Ty->getScalarSizeInBits() == DL.getPointerSizeInBits(AS))
1763 if (Ty->getScalarSizeInBits() == DL.getIndexSizeInBits(AS))
17641764 return commonPointerCastTransforms(CI);
17651765
17661766 Type *PtrTy = DL.getIntPtrType(CI.getContext(), AS);
20132013 !match(BitCast.getOperand(0), m_OneUse(m_BinOp(BO))) ||
20142014 !BO->isBitwiseLogicOp())
20152015 return nullptr;
20162016
20172017 // FIXME: This transform is restricted to vector types to avoid backend
20182018 // problems caused by creating potentially illegal operations. If a fix-up is
20192019 // added to handle that situation, we can remove this check.
20202020 if (!DestTy->isVectorTy() || !BO->getType()->isVectorTy())
20212021 return nullptr;
20222022
20232023 Value *X;
20242024 if (match(BO->getOperand(0), m_OneUse(m_BitCast(m_Value(X)))) &&
20252025 X->getType() == DestTy && !isa(X)) {
681681 // 4. Emit GEPs to get the original pointers.
682682 // 5. Remove the original instructions.
683683 Type *IndexType = IntegerType::get(
684 Base->getContext(), DL.getPointerTypeSizeInBits(Start->getType()));
684 Base->getContext(), DL.getIndexTypeSizeInBits(Start->getType()));
685685
686686 DenseMap NewInsts;
687687 NewInsts[Base] = ConstantInt::getNullValue(IndexType);
789789 static std::pair
790790 getAsConstantIndexedAddress(Value *V, const DataLayout &DL) {
791791 Type *IndexType = IntegerType::get(V->getContext(),
792 DL.getPointerTypeSizeInBits(V->getType()));
792 DL.getIndexTypeSizeInBits(V->getType()));
793793
794794 Constant *Index = ConstantInt::getNullValue(IndexType);
795795 while (true) {
40304030 // Get scalar or pointer size.
40314031 unsigned BitWidth = Ty->isIntOrIntVectorTy()
40324032 ? Ty->getScalarSizeInBits()
4033 : DL.getTypeSizeInBits(Ty->getScalarType());
4033 : DL.getIndexTypeSizeInBits(Ty->getScalarType());
40344034
40354035 if (!BitWidth)
40364036 return nullptr;
11141114 // Start with the index over the outer type. Note that the type size
11151115 // might be zero (even if the offset isn't zero) if the indexed type
11161116 // is something like [0 x {int, int}]
1117 Type *IntPtrTy = DL.getIntPtrType(PtrTy);
1117 Type *IndexTy = DL.getIndexType(PtrTy);
11181118 int64_t FirstIdx = 0;
11191119 if (int64_t TySize = DL.getTypeAllocSize(Ty)) {
11201120 FirstIdx = Offset/TySize;
11291129 assert((uint64_t)Offset < (uint64_t)TySize && "Out of range offset");
11301130 }
11311131
1132 NewIndices.push_back(ConstantInt::get(IntPtrTy, FirstIdx));
1132 NewIndices.push_back(ConstantInt::get(IndexTy, FirstIdx));
11331133
11341134 // Index into the types. If we fail, set OrigBase to null.
11351135 while (Offset) {
11511151 } else if (ArrayType *AT = dyn_cast<ArrayType>(Ty)) {
11521152 uint64_t EltSize = DL.getTypeAllocSize(AT->getElementType());
11531153 assert(EltSize && "Cannot index into a zero-sized array");
1154 NewIndices.push_back(ConstantInt::get(IntPtrTy,Offset/EltSize));
1154 NewIndices.push_back(ConstantInt::get(IndexTy,Offset/EltSize));
11551155 Offset %= EltSize;
11561156 Ty = AT->getElementType();
11571157 } else {
15141514 // Eliminate unneeded casts for indices, and replace indices which displace
15151515 // by multiples of a zero size type with zero.
15161516 bool MadeChange = false;
1517 Type *IntPtrTy =
1518 DL.getIntPtrType(GEP.getPointerOperandType()->getScalarType());
1517
1518 // Index width may not be the same width as pointer width.
1519 // Data layout chooses the right type based on supported integer types.
1520 Type *NewScalarIndexTy =
1521 DL.getIndexType(GEP.getPointerOperandType()->getScalarType());
15191522
15201523 gep_type_iterator GTI = gep_type_begin(GEP);
15211524 for (User::op_iterator I = GEP.op_begin() + 1, E = GEP.op_end(); I != E;
15241527 if (GTI.isStruct())
15251528 continue;
15261529
1527 // Index type should have the same width as IntPtr
15281530 Type *IndexTy = (*I)->getType();
1529 Type *NewIndexType = IndexTy->isVectorTy() ?
1530 VectorType::get(IntPtrTy, IndexTy->getVectorNumElements()) : IntPtrTy;
1531 Type *NewIndexType =
1532 IndexTy->isVectorTy()
1533 ? VectorType::get(NewScalarIndexTy, IndexTy->getVectorNumElements())
1534 : NewScalarIndexTy;
15311535
15321536 // If the element type has zero size then any index over it is equivalent
15331537 // to an index of zero, so replace it with zero if it is not zero already.
17301734 if (GEP.getNumIndices() == 1) {
17311735 unsigned AS = GEP.getPointerAddressSpace();
17321736 if (GEP.getOperand(1)->getType()->getScalarSizeInBits() ==
1733 DL.getPointerSizeInBits(AS)) {
1737 DL.getIndexSizeInBits(AS)) {
17341738 Type *Ty = GEP.getSourceElementType();
17351739 uint64_t TyAllocSize = DL.getTypeAllocSize(Ty);
17361740
18561860 if (SrcElTy->isArrayTy() &&
18571861 DL.getTypeAllocSize(SrcElTy->getArrayElementType()) ==
18581862 DL.getTypeAllocSize(ResElTy)) {
1859 Type *IdxType = DL.getIntPtrType(GEP.getType());
1863 Type *IdxType = DL.getIndexType(GEP.getType());
18601864 Value *Idx[2] = { Constant::getNullValue(IdxType), GEP.getOperand(1) };
18611865 Value *NewGEP =
18621866 GEP.isInBounds()
18831887 unsigned BitWidth = Idx->getType()->getPrimitiveSizeInBits();
18841888 uint64_t Scale = SrcSize / ResSize;
18851889
1886 // Earlier transforms ensure that the index has type IntPtrType, which
1887 // considerably simplifies the logic by eliminating implicit casts.
1888 assert(Idx->getType() == DL.getIntPtrType(GEP.getType()) &&
1889 "Index not cast to pointer width?");
1890 // Earlier transforms ensure that the index has the right type
1891 // according to Data Layout, which considerably simplifies the
1892 // logic by eliminating implicit casts.
1893 assert(Idx->getType() == DL.getIndexType(GEP.getType()) &&
1894 "Index type does not match the Data Layout preferences");
18901895
18911896 bool NSW;
18921897 if (Value *NewIdx = Descale(Idx, APInt(BitWidth, Scale), NSW)) {
19221927 unsigned BitWidth = Idx->getType()->getPrimitiveSizeInBits();
19231928 uint64_t Scale = ArrayEltSize / ResSize;
19241929
1925 // Earlier transforms ensure that the index has type IntPtrType, which
1926 // considerably simplifies the logic by eliminating implicit casts.
1927 assert(Idx->getType() == DL.getIntPtrType(GEP.getType()) &&
1928 "Index not cast to pointer width?");
1930 // Earlier transforms ensure that the index has the right type
1931 // according to the Data Layout, which considerably simplifies
1932 // the logic by eliminating implicit casts.
1933 assert(Idx->getType() == DL.getIndexType(GEP.getType()) &&
1934 "Index type does not match the Data Layout preferences");
19291935
19301936 bool NSW;
19311937 if (Value *NewIdx = Descale(Idx, APInt(BitWidth, Scale), NSW)) {
19321938 // Successfully decomposed Idx as NewIdx * Scale, form a new GEP.
19331939 // If the multiplication NewIdx * Scale may overflow then the new
19341940 // GEP may not be "inbounds".
1935 Value *Off[2] = {
1936 Constant::getNullValue(DL.getIntPtrType(GEP.getType())),
1937 NewIdx};
1941 Type *IndTy = DL.getIndexType(GEP.getType());
1942 Value *Off[2] = {Constant::getNullValue(IndTy), NewIdx};
19381943
19391944 Value *NewGEP = GEP.isInBounds() && NSW
19401945 ? Builder.CreateInBoundsGEP(
19701975 if (BitCastInst *BCI = dyn_cast<BitCastInst>(PtrOp)) {
19711976 Value *Operand = BCI->getOperand(0);
19721977 PointerType *OpType = cast<PointerType>(Operand->getType());
1973 unsigned OffsetBits = DL.getPointerTypeSizeInBits(GEP.getType());
1978 unsigned OffsetBits = DL.getIndexTypeSizeInBits(GEP.getType());
19741979 APInt Offset(OffsetBits, 0);
19751980 if (!isa<BitCastInst>(Operand) &&
19761981 GEP.accumulateConstantOffset(DL, Offset)) {
20192024 }
20202025
20212026 if (!GEP.isInBounds()) {
2022 unsigned PtrWidth =
2023 DL.getPointerSizeInBits(PtrOp->getType()->getPointerAddressSpace());
2024 APInt BasePtrOffset(PtrWidth, 0);
2027 unsigned IdxWidth =
2028 DL.getIndexSizeInBits(PtrOp->getType()->getPointerAddressSpace());
2029 APInt BasePtrOffset(IdxWidth, 0);
20252030 Value *UnderlyingPtrOp =
20262031 PtrOp->stripAndAccumulateInBoundsConstantOffsets(DL,
20272032 BasePtrOffset);
20282033 if (auto *AI = dyn_cast<AllocaInst>(UnderlyingPtrOp)) {
20292034 if (GEP.accumulateConstantOffset(DL, BasePtrOffset) &&
20302035 BasePtrOffset.isNonNegative()) {
2031 APInt AllocSize(PtrWidth, DL.getTypeAllocSize(AI->getAllocatedType()));
2036 APInt AllocSize(IdxWidth, DL.getTypeAllocSize(AI->getAllocatedType()));
20322037 if (BasePtrOffset.ule(AllocSize)) {
20332038 return GetElementPtrInst::CreateInBounds(
20342039 PtrOp, makeArrayRef(Ops).slice(1), GEP.getName());
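The hunk above now accumulates the base pointer's constant offset in the *index* width rather than the pointer width before comparing it against the alloca size. A minimal standalone sketch of that comparison, using plain integers instead of LLVM's APInt (`offsetFitsInAlloc` is a hypothetical name, not an LLVM API):

```cpp
#include <cassert>
#include <cstdint>

// Offsets accumulated at IdxWidth bits wrap modulo 2^IdxWidth, so both the
// offset and the allocation size are reduced to the index width before the
// unsigned comparison. Mirrors BasePtrOffset.isNonNegative() &&
// BasePtrOffset.ule(AllocSize) with IdxWidth-bit APInts.
bool offsetFitsInAlloc(uint64_t Offset, uint64_t AllocSize,
                       unsigned IdxWidth) {
  uint64_t Mask = IdxWidth >= 64 ? ~0ULL : ((1ULL << IdxWidth) - 1);
  uint64_t SignBit = 1ULL << (IdxWidth - 1);
  uint64_t Off = Offset & Mask;
  if (Off & SignBit) // negative in IdxWidth-bit two's complement
    return false;
  return Off <= (AllocSize & Mask);
}
```

With a 40-bit pointer but a 32-bit index, this is why the APInts must be built with `IdxWidth` bits: comparing a 40-bit offset against a 32-bit one would assert inside APInt.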
36473647 auto *PartPtrTy = PartTy->getPointerTo(AS);
36483648 LoadInst *PLoad = IRB.CreateAlignedLoad(
36493649 getAdjustedPtr(IRB, DL, BasePtr,
3650 APInt(DL.getPointerSizeInBits(AS), PartOffset),
3650 APInt(DL.getIndexSizeInBits(AS), PartOffset),
36513651 PartPtrTy, BasePtr->getName() + "."),
36523652 getAdjustedAlignment(LI, PartOffset, DL), /*IsVolatile*/ false,
36533653 LI->getName());
37033703 StoreInst *PStore = IRB.CreateAlignedStore(
37043704 PLoad,
37053705 getAdjustedPtr(IRB, DL, StoreBasePtr,
3706 APInt(DL.getPointerSizeInBits(AS), PartOffset),
3706 APInt(DL.getIndexSizeInBits(AS), PartOffset),
37073707 PartPtrTy, StoreBasePtr->getName() + "."),
37083708 getAdjustedAlignment(SI, PartOffset, DL), /*IsVolatile*/ false);
37093709 PStore->copyMetadata(*LI, LLVMContext::MD_mem_parallel_loop_access);
37853785 auto AS = LI->getPointerAddressSpace();
37863786 PLoad = IRB.CreateAlignedLoad(
37873787 getAdjustedPtr(IRB, DL, LoadBasePtr,
3788 APInt(DL.getPointerSizeInBits(AS), PartOffset),
3788 APInt(DL.getIndexSizeInBits(AS), PartOffset),
37893789 LoadPartPtrTy, LoadBasePtr->getName() + "."),
37903790 getAdjustedAlignment(LI, PartOffset, DL), /*IsVolatile*/ false,
37913791 LI->getName());
37973797 StoreInst *PStore = IRB.CreateAlignedStore(
37983798 PLoad,
37993799 getAdjustedPtr(IRB, DL, StoreBasePtr,
3800 APInt(DL.getPointerSizeInBits(AS), PartOffset),
3800 APInt(DL.getIndexSizeInBits(AS), PartOffset),
38013801 StorePartPtrTy, StoreBasePtr->getName() + "."),
38023802 getAdjustedAlignment(SI, PartOffset, DL), /*IsVolatile*/ false);
38033803
12941294
12951295 // We changed p+o+c to p+c+o, p+c may not be inbound anymore.
12961296 const DataLayout &DAL = First->getModule()->getDataLayout();
1297 APInt Offset(DAL.getPointerSizeInBits(
1297 APInt Offset(DAL.getIndexSizeInBits(
12981298 cast<PointerType>(First->getType())->getAddressSpace()),
12991299 0);
13001300 Value *NewBase =
15291529 }
15301530 } else if (auto *GEP = dyn_cast<GetElementPtrInst>(&I)) {
15311531 unsigned BitWidth =
1532 M.getDataLayout().getPointerSizeInBits(GEP->getPointerAddressSpace());
1532 M.getDataLayout().getIndexSizeInBits(GEP->getPointerAddressSpace());
15331533 // Rewrite a constant GEP into a DIExpression. Since we are performing
15341534 // arithmetic to compute the variable's *value* in the DIExpression, we
15351535 // need to mark the expression with a DW_OP_stack_value.
21562156 if (!NewTy->isPointerTy())
21572157 return;
21582158
2159 unsigned BitWidth = DL.getTypeSizeInBits(NewTy);
2159 unsigned BitWidth = DL.getIndexTypeSizeInBits(NewTy);
21602160 if (!getConstantRangeFromMetadata(*N).contains(APInt(BitWidth, 0))) {
21612161 MDNode *NN = MDNode::get(OldLI.getContext(), None);
21622162 NewLI.setMetadata(LLVMContext::MD_nonnull, NN);
322322
323323 APInt Size(PtrBitWidth, DL.getTypeStoreSize(PtrATy));
324324
325 APInt OffsetA(PtrBitWidth, 0), OffsetB(PtrBitWidth, 0);
325 unsigned IdxWidth = DL.getIndexSizeInBits(ASA);
326 APInt OffsetA(IdxWidth, 0), OffsetB(IdxWidth, 0);
326327 PtrA = PtrA->stripAndAccumulateInBoundsConstantOffsets(DL, OffsetA);
327328 PtrB = PtrB->stripAndAccumulateInBoundsConstantOffsets(DL, OffsetB);
328329
0 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
1 ; RUN: opt < %s -instcombine -S | FileCheck %s
2
3 target datalayout = "e-m:m-p:40:64:64:32-i32:32-i16:16-i8:8-n32"
4
5 %struct.B = type { double }
6 %struct.A = type { %struct.B, i32, i32 }
7 %struct.C = type { [7 x i8] }
8
9
10 @Global = constant [10 x i8] c"helloworld"
11
12
13 ; Test that two array indexing geps fold
14 define i32* @test1(i32* %I) {
15 ; CHECK-LABEL: @test1(
16 ; CHECK-NEXT: [[B:%.*]] = getelementptr i32, i32* [[I:%.*]], i32 21
17 ; CHECK-NEXT: ret i32* [[B]]
18 ;
19 %A = getelementptr i32, i32* %I, i8 17
20 %B = getelementptr i32, i32* %A, i16 4
21 ret i32* %B
22 }
23
24 ; Test that two getelementptr insts fold
25 define i32* @test2({ i32 }* %I) {
26 ; CHECK-LABEL: @test2(
27 ; CHECK-NEXT: [[B:%.*]] = getelementptr { i32 }, { i32 }* [[I:%.*]], i32 1, i32 0
28 ; CHECK-NEXT: ret i32* [[B]]
29 ;
30 %A = getelementptr { i32 }, { i32 }* %I, i32 1
31 %B = getelementptr { i32 }, { i32 }* %A, i32 0, i32 0
32 ret i32* %B
33 }
34
35 define void @test3(i8 %B) {
36 ; This should be turned into a constexpr instead of being an instruction
37 ; CHECK-LABEL: @test3(
38 ; CHECK-NEXT: store i8 [[B:%.*]], i8* getelementptr inbounds ([10 x i8], [10 x i8]* @Global, i32 0, i32 4), align 1
39 ; CHECK-NEXT: ret void
40 ;
41 %A = getelementptr [10 x i8], [10 x i8]* @Global, i32 0, i32 4
42 store i8 %B, i8* %A
43 ret void
44 }
45
46 %as1_ptr_struct = type { i32 addrspace(1)* }
47 %as2_ptr_struct = type { i32 addrspace(2)* }
48
49 @global_as2 = addrspace(2) global i32 zeroinitializer
50 @global_as1_as2_ptr = addrspace(1) global %as2_ptr_struct { i32 addrspace(2)* @global_as2 }
51
52 ; This should be turned into a constexpr instead of being an instruction
53 define void @test_evaluate_gep_nested_as_ptrs(i32 addrspace(2)* %B) {
54 ; CHECK-LABEL: @test_evaluate_gep_nested_as_ptrs(
55 ; CHECK-NEXT: store i32 addrspace(2)* [[B:%.*]], i32 addrspace(2)* addrspace(1)* getelementptr inbounds (%as2_ptr_struct, [[AS2_PTR_STRUCT:%.*]] addrspace(1)* @global_as1_as2_ptr, i32 0, i32 0), align 8
56 ; CHECK-NEXT: ret void
57 ;
58 %A = getelementptr %as2_ptr_struct, %as2_ptr_struct addrspace(1)* @global_as1_as2_ptr, i32 0, i32 0
59 store i32 addrspace(2)* %B, i32 addrspace(2)* addrspace(1)* %A
60 ret void
61 }
62
63 @arst = addrspace(1) global [4 x i8 addrspace(2)*] zeroinitializer
64
65 define void @test_evaluate_gep_as_ptrs_array(i8 addrspace(2)* %B) {
66 ; CHECK-LABEL: @test_evaluate_gep_as_ptrs_array(
67 ; CHECK-NEXT: store i8 addrspace(2)* [[B:%.*]], i8 addrspace(2)* addrspace(1)* getelementptr inbounds ([4 x i8 addrspace(2)*], [4 x i8 addrspace(2)*] addrspace(1)* @arst, i32 0, i32 2), align 16
68 ; CHECK-NEXT: ret void
69 ;
70
71 %A = getelementptr [4 x i8 addrspace(2)*], [4 x i8 addrspace(2)*] addrspace(1)* @arst, i16 0, i16 2
72 store i8 addrspace(2)* %B, i8 addrspace(2)* addrspace(1)* %A
73 ret void
74 }
75
76 define i32* @test4(i32* %I, i32 %C, i32 %D) {
77 ; CHECK-LABEL: @test4(
78 ; CHECK-NEXT: [[A:%.*]] = getelementptr i32, i32* [[I:%.*]], i32 [[C:%.*]]
79 ; CHECK-NEXT: [[B:%.*]] = getelementptr i32, i32* [[A]], i32 [[D:%.*]]
80 ; CHECK-NEXT: ret i32* [[B]]
81 ;
82 %A = getelementptr i32, i32* %I, i32 %C
83 %B = getelementptr i32, i32* %A, i32 %D
84 ret i32* %B
85 }
86
87
88 define i1 @test5({ i32, i32 }* %x, { i32, i32 }* %y) {
89 ; CHECK-LABEL: @test5(
90 ; CHECK-NEXT: [[TMP_4:%.*]] = icmp eq { i32, i32 }* [[X:%.*]], [[Y:%.*]]
91 ; CHECK-NEXT: ret i1 [[TMP_4]]
92 ;
93 %tmp.1 = getelementptr { i32, i32 }, { i32, i32 }* %x, i32 0, i32 1
94 %tmp.3 = getelementptr { i32, i32 }, { i32, i32 }* %y, i32 0, i32 1
95 ;; seteq x, y
96 %tmp.4 = icmp eq i32* %tmp.1, %tmp.3
97 ret i1 %tmp.4
98 }
99
100 %S = type { i32, [ 100 x i32] }
101
102 define <2 x i1> @test6(<2 x i32> %X, <2 x %S*> %P) nounwind {
103 ; CHECK-LABEL: @test6(
104 ; CHECK-NEXT: [[C:%.*]] = icmp eq <2 x i32> [[X:%.*]],
105 ; CHECK-NEXT: ret <2 x i1> [[C]]
106 ;
107 %A = getelementptr inbounds %S, <2 x %S*> %P, <2 x i32> zeroinitializer, <2 x i32> , <2 x i32> %X
108 %B = getelementptr inbounds %S, <2 x %S*> %P, <2 x i32> , <2 x i32>
109 %C = icmp eq <2 x i32*> %A, %B
110 ret <2 x i1> %C
111 }
112
113 @G = external global [3 x i8]
114 define i8* @test7(i16 %Idx) {
115 ; CHECK-LABEL: @test7(
116 ; CHECK-NEXT: [[ZE_IDX:%.*]] = zext i16 [[IDX:%.*]] to i32
117 ; CHECK-NEXT: [[TMP:%.*]] = getelementptr [3 x i8], [3 x i8]* @G, i32 0, i32 [[ZE_IDX]]
118 ; CHECK-NEXT: ret i8* [[TMP]]
119 ;
120 %ZE_Idx = zext i16 %Idx to i32
121 %tmp = getelementptr i8, i8* getelementptr ([3 x i8], [3 x i8]* @G, i32 0, i32 0), i32 %ZE_Idx
122 ret i8* %tmp
123 }
124
125
126 ; Test folding of constantexpr geps into normal geps.
127 @Array = external global [40 x i32]
128 define i32 *@test8(i32 %X) {
129 ; CHECK-LABEL: @test8(
130 ; CHECK-NEXT: [[A:%.*]] = getelementptr [40 x i32], [40 x i32]* @Array, i32 0, i32 [[X:%.*]]
131 ; CHECK-NEXT: ret i32* [[A]]
132 ;
133 %A = getelementptr i32, i32* getelementptr ([40 x i32], [40 x i32]* @Array, i32 0, i32 0), i32 %X
134 ret i32* %A
135 }
136
137 define i32 *@test9(i32 *%base, i8 %ind) {
138 ; CHECK-LABEL: @test9(
139 ; CHECK-NEXT: [[TMP1:%.*]] = sext i8 [[IND:%.*]] to i32
140 ; CHECK-NEXT: [[RES:%.*]] = getelementptr i32, i32* [[BASE:%.*]], i32 [[TMP1]]
141 ; CHECK-NEXT: ret i32* [[RES]]
142 ;
143 %res = getelementptr i32, i32 *%base, i8 %ind
144 ret i32* %res
145 }
146
147 define i32 @test10() {
148 ; CHECK-LABEL: @test10(
149 ; CHECK-NEXT: ret i32 8
150 ;
151 %A = getelementptr { i32, double }, { i32, double }* null, i32 0, i32 1
152 %B = ptrtoint double* %A to i32
153 ret i32 %B
154 }
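The datalayout string in the test above, `p:40:64:64:32`, declares 40-bit pointers whose GEP index is only 32 bits wide. A rough sketch of decoding such a `p` component, with the new optional last field defaulting to the pointer size as the commit describes (`parsePointerSpec` is a hypothetical helper, not LLVM's actual parser, and it only handles the fields shown):

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Parsed form of p[n]:<size>:<abi_align>[:<pref_align>[:<index_size>]].
struct PointerSpec {
  unsigned AddrSpace = 0;
  unsigned SizeInBits = 0;
  unsigned ABIAlignInBits = 0;
  unsigned PrefAlignInBits = 0;
  unsigned IndexSizeInBits = 0;
};

PointerSpec parsePointerSpec(const std::string &Spec) {
  PointerSpec P;
  // Address space is the (possibly empty) text between 'p' and the first ':'.
  size_t Colon = Spec.find(':');
  std::string AS = Spec.substr(1, Colon - 1);
  P.AddrSpace = AS.empty() ? 0 : (unsigned)std::stoul(AS);
  // Remaining colon-separated numeric fields.
  std::vector<unsigned> Fields;
  std::stringstream SS(Spec.substr(Colon + 1));
  std::string Tok;
  while (std::getline(SS, Tok, ':'))
    Fields.push_back((unsigned)std::stoul(Tok));
  P.SizeInBits = Fields.at(0);
  P.ABIAlignInBits = Fields.at(1);
  P.PrefAlignInBits = Fields.size() > 2 ? Fields[2] : Fields[1];
  // The new rule: index size is optional and defaults to the pointer size.
  P.IndexSizeInBits = Fields.size() > 3 ? Fields[3] : P.SizeInBits;
  return P;
}
```

Under this rule, every in-tree target (which never specifies the fifth field) keeps an index width equal to its pointer width, which is why the change is NFC for them.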
0 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
1 ; RUN: opt < %s -instcombine -S | FileCheck %s
2
3 target datalayout = "e-p:40:64:64:32-p1:16:16:16-p2:32:32:32-p3:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
4
5 declare i32 @test58_d(i64 )
6
7 define i1 @test59(i8* %foo) {
8 ; CHECK-LABEL: @test59(
9 ; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds i8, i8* [[FOO:%.*]], i32 8
10 ; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8* [[GEP1]] to i32
11 ; CHECK-NEXT: [[USE:%.*]] = zext i32 [[TMP1]] to i64
12 ; CHECK-NEXT: [[CALL:%.*]] = call i32 @test58_d(i64 [[USE]])
13 ; CHECK-NEXT: ret i1 true
14 ;
15 %bit = bitcast i8* %foo to i32*
16 %gep1 = getelementptr inbounds i32, i32* %bit, i64 2
17 %gep2 = getelementptr inbounds i8, i8* %foo, i64 10
18 %cast1 = bitcast i32* %gep1 to i8*
19 %cmp = icmp ult i8* %cast1, %gep2
20 %use = ptrtoint i8* %cast1 to i64
21 %call = call i32 @test58_d(i64 %use)
22 ret i1 %cmp
23 }
24
25 define i1 @test59_as1(i8 addrspace(1)* %foo) {
26 ; CHECK-LABEL: @test59_as1(
27 ; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds i8, i8 addrspace(1)* [[FOO:%.*]], i16 8
28 ; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[GEP1]] to i16
29 ; CHECK-NEXT: [[USE:%.*]] = zext i16 [[TMP1]] to i64
30 ; CHECK-NEXT: [[CALL:%.*]] = call i32 @test58_d(i64 [[USE]])
31 ; CHECK-NEXT: ret i1 true
32 ;
33 %bit = bitcast i8 addrspace(1)* %foo to i32 addrspace(1)*
34 %gep1 = getelementptr inbounds i32, i32 addrspace(1)* %bit, i64 2
35 %gep2 = getelementptr inbounds i8, i8 addrspace(1)* %foo, i64 10
36 %cast1 = bitcast i32 addrspace(1)* %gep1 to i8 addrspace(1)*
37 %cmp = icmp ult i8 addrspace(1)* %cast1, %gep2
38 %use = ptrtoint i8 addrspace(1)* %cast1 to i64
39 %call = call i32 @test58_d(i64 %use)
40 ret i1 %cmp
41 }
42
43 define i1 @test60(i8* %foo, i64 %i, i64 %j) {
44 ; CHECK-LABEL: @test60(
45 ; CHECK-NEXT: [[TMP1:%.*]] = trunc i64 [[I:%.*]] to i32
46 ; CHECK-NEXT: [[TMP2:%.*]] = trunc i64 [[J:%.*]] to i32
47 ; CHECK-NEXT: [[GEP1_IDX:%.*]] = shl nuw i32 [[TMP1]], 2
48 ; CHECK-NEXT: [[TMP3:%.*]] = icmp slt i32 [[GEP1_IDX]], [[TMP2]]
49 ; CHECK-NEXT: ret i1 [[TMP3]]
50 ;
51 %bit = bitcast i8* %foo to i32*
52 %gep1 = getelementptr inbounds i32, i32* %bit, i64 %i
53 %gep2 = getelementptr inbounds i8, i8* %foo, i64 %j
54 %cast1 = bitcast i32* %gep1 to i8*
55 %cmp = icmp ult i8* %cast1, %gep2
56 ret i1 %cmp
57 }
58
59 define i1 @test60_as1(i8 addrspace(1)* %foo, i64 %i, i64 %j) {
60 ; CHECK-LABEL: @test60_as1(
61 ; CHECK-NEXT: [[TMP1:%.*]] = trunc i64 [[I:%.*]] to i16
62 ; CHECK-NEXT: [[TMP2:%.*]] = trunc i64 [[J:%.*]] to i16
63 ; CHECK-NEXT: [[GEP1_IDX:%.*]] = shl nuw i16 [[TMP1]], 2
64 ; CHECK-NEXT: [[TMP3:%.*]] = icmp slt i16 [[GEP1_IDX]], [[TMP2]]
65 ; CHECK-NEXT: ret i1 [[TMP3]]
66 ;
67 %bit = bitcast i8 addrspace(1)* %foo to i32 addrspace(1)*
68 %gep1 = getelementptr inbounds i32, i32 addrspace(1)* %bit, i64 %i
69 %gep2 = getelementptr inbounds i8, i8 addrspace(1)* %foo, i64 %j
70 %cast1 = bitcast i32 addrspace(1)* %gep1 to i8 addrspace(1)*
71 %cmp = icmp ult i8 addrspace(1)* %cast1, %gep2
72 ret i1 %cmp
73 }
74
75 ; Same as test60, but look through an addrspacecast instead of a
76 ; bitcast. This uses the same sized addrspace.
77 define i1 @test60_addrspacecast(i8* %foo, i64 %i, i64 %j) {
78 ; CHECK-LABEL: @test60_addrspacecast(
79 ; CHECK-NEXT: [[TMP1:%.*]] = trunc i64 [[J:%.*]] to i32
80 ; CHECK-NEXT: [[I_TR:%.*]] = trunc i64 [[I:%.*]] to i32
81 ; CHECK-NEXT: [[TMP2:%.*]] = shl i32 [[I_TR]], 2
82 ; CHECK-NEXT: [[TMP3:%.*]] = icmp slt i32 [[TMP2]], [[TMP1]]
83 ; CHECK-NEXT: ret i1 [[TMP3]]
84 ;
85 %bit = addrspacecast i8* %foo to i32 addrspace(3)*
86 %gep1 = getelementptr inbounds i32, i32 addrspace(3)* %bit, i64 %i
87 %gep2 = getelementptr inbounds i8, i8* %foo, i64 %j
88 %cast1 = addrspacecast i32 addrspace(3)* %gep1 to i8*
89 %cmp = icmp ult i8* %cast1, %gep2
90 ret i1 %cmp
91 }
92
93 define i1 @test60_addrspacecast_smaller(i8* %foo, i16 %i, i64 %j) {
94 ; CHECK-LABEL: @test60_addrspacecast_smaller(
95 ; CHECK-NEXT: [[GEP1_IDX:%.*]] = shl nuw i16 [[I:%.*]], 2
96 ; CHECK-NEXT: [[TMP1:%.*]] = trunc i64 [[J:%.*]] to i16
97 ; CHECK-NEXT: [[TMP2:%.*]] = icmp slt i16 [[GEP1_IDX]], [[TMP1]]
98 ; CHECK-NEXT: ret i1 [[TMP2]]
99 ;
100 %bit = addrspacecast i8* %foo to i32 addrspace(1)*
101 %gep1 = getelementptr inbounds i32, i32 addrspace(1)* %bit, i16 %i
102 %gep2 = getelementptr inbounds i8, i8* %foo, i64 %j
103 %cast1 = addrspacecast i32 addrspace(1)* %gep1 to i8*
104 %cmp = icmp ult i8* %cast1, %gep2
105 ret i1 %cmp
106 }
107
108 define i1 @test60_addrspacecast_larger(i8 addrspace(1)* %foo, i32 %i, i16 %j) {
109 ; CHECK-LABEL: @test60_addrspacecast_larger(
110 ; CHECK-NEXT: [[I_TR:%.*]] = trunc i32 [[I:%.*]] to i16
111 ; CHECK-NEXT: [[TMP1:%.*]] = shl i16 [[I_TR]], 2
112 ; CHECK-NEXT: [[TMP2:%.*]] = icmp slt i16 [[TMP1]], [[J:%.*]]
113 ; CHECK-NEXT: ret i1 [[TMP2]]
114 ;
115 %bit = addrspacecast i8 addrspace(1)* %foo to i32 addrspace(2)*
116 %gep1 = getelementptr inbounds i32, i32 addrspace(2)* %bit, i32 %i
117 %gep2 = getelementptr inbounds i8, i8 addrspace(1)* %foo, i16 %j
118 %cast1 = addrspacecast i32 addrspace(2)* %gep1 to i8 addrspace(1)*
119 %cmp = icmp ult i8 addrspace(1)* %cast1, %gep2
120 ret i1 %cmp
121 }
122
123 define i1 @test61(i8* %foo, i64 %i, i64 %j) {
124 ; CHECK-LABEL: @test61(
125 ; CHECK-NEXT: [[BIT:%.*]] = bitcast i8* [[FOO:%.*]] to i32*
126 ; CHECK-NEXT: [[TMP1:%.*]] = trunc i64 [[I:%.*]] to i32
127 ; CHECK-NEXT: [[GEP1:%.*]] = getelementptr i32, i32* [[BIT]], i32 [[TMP1]]
128 ; CHECK-NEXT: [[TMP2:%.*]] = trunc i64 [[J:%.*]] to i32
129 ; CHECK-NEXT: [[GEP2:%.*]] = getelementptr i8, i8* [[FOO]], i32 [[TMP2]]
130 ; CHECK-NEXT: [[CAST1:%.*]] = bitcast i32* [[GEP1]] to i8*
131 ; CHECK-NEXT: [[CMP:%.*]] = icmp ugt i8* [[GEP2]], [[CAST1]]
132 ; CHECK-NEXT: ret i1 [[CMP]]
133 ;
134 %bit = bitcast i8* %foo to i32*
135 %gep1 = getelementptr i32, i32* %bit, i64 %i
136 %gep2 = getelementptr i8, i8* %foo, i64 %j
137 %cast1 = bitcast i32* %gep1 to i8*
138 %cmp = icmp ult i8* %cast1, %gep2
139 ret i1 %cmp
140 ; Don't transform non-inbounds GEPs.
141 }
142
143 define i1 @test61_as1(i8 addrspace(1)* %foo, i16 %i, i16 %j) {
144 ; CHECK-LABEL: @test61_as1(
145 ; CHECK-NEXT: [[BIT:%.*]] = bitcast i8 addrspace(1)* [[FOO:%.*]] to i32 addrspace(1)*
146 ; CHECK-NEXT: [[GEP1:%.*]] = getelementptr i32, i32 addrspace(1)* [[BIT]], i16 [[I:%.*]]
147 ; CHECK-NEXT: [[GEP2:%.*]] = getelementptr i8, i8 addrspace(1)* [[FOO]], i16 [[J:%.*]]
148 ; CHECK-NEXT: [[CAST1:%.*]] = bitcast i32 addrspace(1)* [[GEP1]] to i8 addrspace(1)*
149 ; CHECK-NEXT: [[CMP:%.*]] = icmp ugt i8 addrspace(1)* [[GEP2]], [[CAST1]]
150 ; CHECK-NEXT: ret i1 [[CMP]]
151 ;
152 %bit = bitcast i8 addrspace(1)* %foo to i32 addrspace(1)*
153 %gep1 = getelementptr i32, i32 addrspace(1)* %bit, i16 %i
154 %gep2 = getelementptr i8, i8 addrspace(1)* %foo, i16 %j
155 %cast1 = bitcast i32 addrspace(1)* %gep1 to i8 addrspace(1)*
156 %cmp = icmp ult i8 addrspace(1)* %cast1, %gep2
157 ret i1 %cmp
158 ; Don't transform non-inbounds GEPs.
159 }
160
161 define i1 @test62(i8* %a) {
162 ; CHECK-LABEL: @test62(
163 ; CHECK-NEXT: ret i1 true
164 ;
165 %arrayidx1 = getelementptr inbounds i8, i8* %a, i64 1
166 %arrayidx2 = getelementptr inbounds i8, i8* %a, i64 10
167 %cmp = icmp slt i8* %arrayidx1, %arrayidx2
168 ret i1 %cmp
169 }
170
171 define i1 @test62_as1(i8 addrspace(1)* %a) {
172 ; CHECK-LABEL: @test62_as1(
173 ; CHECK-NEXT: ret i1 true
174 ;
175 %arrayidx1 = getelementptr inbounds i8, i8 addrspace(1)* %a, i64 1
176 %arrayidx2 = getelementptr inbounds i8, i8 addrspace(1)* %a, i64 10
177 %cmp = icmp slt i8 addrspace(1)* %arrayidx1, %arrayidx2
178 ret i1 %cmp
179 }
180
181
182 ; Variation of the above with an ashr
183 define i1 @icmp_and_ashr_multiuse(i32 %X) {
184 ; CHECK-LABEL: @icmp_and_ashr_multiuse(
185 ; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], 240
186 ; CHECK-NEXT: [[AND2:%.*]] = and i32 [[X]], 496
187 ; CHECK-NEXT: [[TOBOOL:%.*]] = icmp ne i32 [[AND]], 224
188 ; CHECK-NEXT: [[TOBOOL2:%.*]] = icmp ne i32 [[AND2]], 432
189 ; CHECK-NEXT: [[AND3:%.*]] = and i1 [[TOBOOL]], [[TOBOOL2]]
190 ; CHECK-NEXT: ret i1 [[AND3]]
191 ;
192 %shr = ashr i32 %X, 4
193 %and = and i32 %shr, 15
194 %and2 = and i32 %shr, 31 ; second use of the shift
195 %tobool = icmp ne i32 %and, 14
196 %tobool2 = icmp ne i32 %and2, 27
197 %and3 = and i1 %tobool, %tobool2
198 ret i1 %and3
199 }
200
201 define i1 @icmp_lshr_and_overshift(i8 %X) {
202 ; CHECK-LABEL: @icmp_lshr_and_overshift(
203 ; CHECK-NEXT: [[TOBOOL:%.*]] = icmp ugt i8 [[X:%.*]], 31
204 ; CHECK-NEXT: ret i1 [[TOBOOL]]
205 ;
206 %shr = lshr i8 %X, 5
207 %and = and i8 %shr, 15
208 %tobool = icmp ne i8 %and, 0
209 ret i1 %tobool
210 }
211
212 ; We shouldn't simplify this because the and uses bits that are shifted in.
213 define i1 @icmp_ashr_and_overshift(i8 %X) {
214 ; CHECK-LABEL: @icmp_ashr_and_overshift(
215 ; CHECK-NEXT: [[SHR:%.*]] = ashr i8 [[X:%.*]], 5
216 ; CHECK-NEXT: [[AND:%.*]] = and i8 [[SHR]], 15
217 ; CHECK-NEXT: [[TOBOOL:%.*]] = icmp ne i8 [[AND]], 0
218 ; CHECK-NEXT: ret i1 [[TOBOOL]]
219 ;
220 %shr = ashr i8 %X, 5
221 %and = and i8 %shr, 15
222 %tobool = icmp ne i8 %and, 0
223 ret i1 %tobool
224 }
225
226 ; PR16244
227 define i1 @test71(i8* %x) {
228 ; CHECK-LABEL: @test71(
229 ; CHECK-NEXT: ret i1 false
230 ;
231 %a = getelementptr i8, i8* %x, i64 8
232 %b = getelementptr inbounds i8, i8* %x, i64 8
233 %c = icmp ugt i8* %a, %b
234 ret i1 %c
235 }
236
237 define i1 @test71_as1(i8 addrspace(1)* %x) {
238 ; CHECK-LABEL: @test71_as1(
239 ; CHECK-NEXT: ret i1 false
240 ;
241 %a = getelementptr i8, i8 addrspace(1)* %x, i64 8
242 %b = getelementptr inbounds i8, i8 addrspace(1)* %x, i64 8
243 %c = icmp ugt i8 addrspace(1)* %a, %b
244 ret i1 %c
245 }
246
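In the tests above, i64 GEP indices are truncated to i32 (the 40-bit address space with a 32-bit index) or to i16 (address space 1) instead of being extended to the pointer width. The arithmetic effect of that normalization on a constant index can be sketched as follows (`truncToIndexWidth` is a hypothetical helper assuming the index fits in 64 bits, not an LLVM API):

```cpp
#include <cassert>
#include <cstdint>

// Truncate a 64-bit index to the address space's index width, keeping
// two's-complement semantics; the result is sign-extended back to 64 bits
// so it can be inspected as a signed value.
int64_t truncToIndexWidth(int64_t Idx, unsigned IdxBits) {
  unsigned Shift = 64 - IdxBits;
  // Left shift discards the high bits (trunc); the arithmetic right shift
  // sign-extends from bit IdxBits-1.
  return (int64_t)((uint64_t)Idx << Shift) >> Shift;
}
```

This is the transform the `trunc i64 ... to i32`/`to i16` CHECK lines are pinning down: the narrowing happens once on the index, and the comparisons then stay in the index width.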
0 ; RUN: opt -basicaa -loop-idiom < %s -S | FileCheck %s
1 target datalayout = "e-p:40:64:64:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
2
3 %struct.foo = type { i32, i32 }
4 %struct.foo1 = type { i32, i32, i32 }
5 %struct.foo2 = type { i32, i16, i16 }
6
7 ;void bar1(foo_t *f, unsigned n) {
8 ; for (unsigned i = 0; i < n; ++i) {
9 ; f[i].a = 0;
10 ; f[i].b = 0;
11 ; }
12 ;}
13 define void @bar1(%struct.foo* %f, i32 %n) nounwind ssp {
14 entry:
15 %cmp1 = icmp eq i32 %n, 0
16 br i1 %cmp1, label %for.end, label %for.body.preheader
17
18 for.body.preheader: ; preds = %entry
19 br label %for.body
20
21 for.body: ; preds = %for.body.preheader, %for.body
22 %indvars.iv = phi i32 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
23 %a = getelementptr inbounds %struct.foo, %struct.foo* %f, i32 %indvars.iv, i32 0
24 store i32 0, i32* %a, align 4
25 %b = getelementptr inbounds %struct.foo, %struct.foo* %f, i32 %indvars.iv, i32 1
26 store i32 0, i32* %b, align 4
27 %indvars.iv.next = add nuw nsw i32 %indvars.iv, 1
28 %exitcond = icmp ne i32 %indvars.iv.next, %n
29 br i1 %exitcond, label %for.body, label %for.end.loopexit
30
31 for.end.loopexit: ; preds = %for.body
32 br label %for.end
33
34 for.end: ; preds = %for.end.loopexit, %entry
35 ret void
36 ; CHECK-LABEL: @bar1(
37 ; CHECK: call void @llvm.memset
38 ; CHECK-NOT: store
39 }
40
41 ;void bar2(foo_t *f, unsigned n) {
42 ; for (unsigned i = 0; i < n; ++i) {
43 ; f[i].b = 0;
44 ; f[i].a = 0;
45 ; }
46 ;}
47 define void @bar2(%struct.foo* %f, i32 %n) nounwind ssp {
48 entry:
49 %cmp1 = icmp eq i32 %n, 0
50 br i1 %cmp1, label %for.end, label %for.body.preheader
51
52 for.body.preheader: ; preds = %entry
53 br label %for.body
54
55 for.body: ; preds = %for.body.preheader, %for.body
56 %indvars.iv = phi i32 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
57 %b = getelementptr inbounds %struct.foo, %struct.foo* %f, i32 %indvars.iv, i32 1
58 store i32 0, i32* %b, align 4
59 %a = getelementptr inbounds %struct.foo, %struct.foo* %f, i32 %indvars.iv, i32 0
60 store i32 0, i32* %a, align 4
61 %indvars.iv.next = add nuw nsw i32 %indvars.iv, 1
62 %exitcond = icmp ne i32 %indvars.iv.next, %n
63 br i1 %exitcond, label %for.body, label %for.end.loopexit
64
65 for.end.loopexit: ; preds = %for.body
66 br label %for.end
67
68 for.end: ; preds = %for.end.loopexit, %entry
69 ret void
70 ; CHECK-LABEL: @bar2(
71 ; CHECK: call void @llvm.memset
72 ; CHECK-NOT: store
73 }
74
75 ;void bar3(foo_t *f, unsigned n) {
76 ; for (unsigned i = n; i > 0; --i) {
77 ; f[i].a = 0;
78 ; f[i].b = 0;
79 ; }
80 ;}
81 define void @bar3(%struct.foo* nocapture %f, i32 %n) nounwind ssp {
82 entry:
83 %cmp1 = icmp eq i32 %n, 0
84 br i1 %cmp1, label %for.end, label %for.body.preheader
85
86 for.body.preheader: ; preds = %entry
87 br label %for.body
88
89 for.body: ; preds = %for.body.preheader, %for.body
90 %indvars.iv = phi i32 [ %n, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
91 %a = getelementptr inbounds %struct.foo, %struct.foo* %f, i32 %indvars.iv, i32 0
92 store i32 0, i32* %a, align 4
93 %b = getelementptr inbounds %struct.foo, %struct.foo* %f, i32 %indvars.iv, i32 1
94 store i32 0, i32* %b, align 4
95 %dec = add i32 %indvars.iv, -1
96 %cmp = icmp eq i32 %dec, 0
97 %indvars.iv.next = add nsw i32 %indvars.iv, -1
98 br i1 %cmp, label %for.end.loopexit, label %for.body
99
100 for.end.loopexit: ; preds = %for.body
101 br label %for.end
102
103 for.end: ; preds = %for.end.loopexit, %entry
104 ret void
105 ; CHECK-LABEL: @bar3(
106 ; CHECK: call void @llvm.memset
107 ; CHECK-NOT: store
108 }
109
110 ;void bar4(foo_t *f, unsigned n) {
111 ; for (unsigned i = 0; i < n; ++i) {
112 ; f[i].a = 0;
113 ; f[i].b = 1;
114 ; }
115 ;}
116 define void @bar4(%struct.foo* nocapture %f, i32 %n) nounwind ssp {
117 entry:
118 %cmp1 = icmp eq i32 %n, 0
119 br i1 %cmp1, label %for.end, label %for.body.preheader
120
121 for.body.preheader: ; preds = %entry
122 br label %for.body
123
124 for.body: ; preds = %for.body.preheader, %for.body
125 %indvars.iv = phi i32 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
126 %a = getelementptr inbounds %struct.foo, %struct.foo* %f, i32 %indvars.iv, i32 0
127 store i32 0, i32* %a, align 4
128 %b = getelementptr inbounds %struct.foo, %struct.foo* %f, i32 %indvars.iv, i32 1
129 store i32 1, i32* %b, align 4
130 %indvars.iv.next = add nuw nsw i32 %indvars.iv, 1
131 %exitcond = icmp ne i32 %indvars.iv.next, %n
132 br i1 %exitcond, label %for.body, label %for.end.loopexit
133
134 for.end.loopexit: ; preds = %for.body
135 br label %for.end
136
137 for.end: ; preds = %for.end.loopexit, %entry
138 ret void
139 ; CHECK-LABEL: @bar4(
140 ; CHECK-NOT: call void @llvm.memset
141 }
142
143 ;void bar5(foo1_t *f, unsigned n) {
144 ; for (unsigned i = 0; i < n; ++i) {
145 ; f[i].a = 0;
146 ; f[i].b = 0;
147 ; }
148 ;}
149 define void @bar5(%struct.foo1* nocapture %f, i32 %n) nounwind ssp {
150 entry:
151 %cmp1 = icmp eq i32 %n, 0
152 br i1 %cmp1, label %for.end, label %for.body.preheader
153
154 for.body.preheader: ; preds = %entry
155 br label %for.body
156
157 for.body: ; preds = %for.body.preheader, %for.body
158 %indvars.iv = phi i32 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
159 %a = getelementptr inbounds %struct.foo1, %struct.foo1* %f, i32 %indvars.iv, i32 0
160 store i32 0, i32* %a, align 4
161 %b = getelementptr inbounds %struct.foo1, %struct.foo1* %f, i32 %indvars.iv, i32 1
162 store i32 0, i32* %b, align 4
163 %indvars.iv.next = add nuw nsw i32 %indvars.iv, 1
164 %exitcond = icmp ne i32 %indvars.iv.next, %n
165 br i1 %exitcond, label %for.body, label %for.end.loopexit
166
167 for.end.loopexit: ; preds = %for.body
168 br label %for.end
169
170 for.end: ; preds = %for.end.loopexit, %entry
171 ret void
172 ; CHECK-LABEL: @bar5(
173 ; CHECK-NOT: call void @llvm.memset
174 }
175
176 ;void bar6(foo2_t *f, unsigned n) {
177 ; for (unsigned i = 0; i < n; ++i) {
178 ; f[i].a = 0;
179 ; f[i].b = 0;
180 ; f[i].c = 0;
181 ; }
182 ;}
183 define void @bar6(%struct.foo2* nocapture %f, i32 %n) nounwind ssp {
184 entry:
185 %cmp1 = icmp eq i32 %n, 0
186 br i1 %cmp1, label %for.end, label %for.body.preheader
187
188 for.body.preheader: ; preds = %entry
189 br label %for.body
190
191 for.body: ; preds = %for.body.preheader, %for.body
192 %indvars.iv = phi i32 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
193 %a = getelementptr inbounds %struct.foo2, %struct.foo2* %f, i32 %indvars.iv, i32 0
194 store i32 0, i32* %a, align 4
195 %b = getelementptr inbounds %struct.foo2, %struct.foo2* %f, i32 %indvars.iv, i32 1
196 store i16 0, i16* %b, align 4
197 %c = getelementptr inbounds %struct.foo2, %struct.foo2* %f, i32 %indvars.iv, i32 2
198 store i16 0, i16* %c, align 2
199 %indvars.iv.next = add nuw nsw i32 %indvars.iv, 1
200 %exitcond = icmp ne i32 %indvars.iv.next, %n
201 br i1 %exitcond, label %for.body, label %for.end.loopexit
202
203 for.end.loopexit: ; preds = %for.body
204 br label %for.end
205
206 for.end: ; preds = %for.end.loopexit, %entry
207 ret void
208 ; CHECK-LABEL: @bar6(
209 ; CHECK: call void @llvm.memset
210 ; CHECK-NOT: store
211 }
0 ; RUN: opt -basicaa -loop-idiom < %s -S | FileCheck %s
1 target datalayout = "e-p:64:64:64:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
2
3 ; CHECK: @.memset_pattern = private unnamed_addr constant [4 x i32] [i32 2, i32 2, i32 2, i32 2], align 16
4
5 target triple = "x86_64-apple-darwin10.0.0"
6
7 ;void test(int *f, unsigned n) {
8 ; for (unsigned i = 0; i < 2 * n; i += 2) {
9 ; f[i] = 0;
10 ; f[i+1] = 0;
11 ; }
12 ;}
13 define void @test(i32* %f, i32 %n) nounwind ssp {
14 entry:
15 %0 = shl i32 %n, 1
16 %cmp1 = icmp eq i32 %0, 0
17 br i1 %cmp1, label %for.end, label %for.body.preheader
18
19 for.body.preheader: ; preds = %entry
20 br label %for.body
21
22 for.body: ; preds = %for.body.preheader, %for.body
23 %indvars.iv = phi i32 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
24 %arrayidx = getelementptr inbounds i32, i32* %f, i32 %indvars.iv
25 store i32 0, i32* %arrayidx, align 4
26 %1 = or i32 %indvars.iv, 1
27 %arrayidx2 = getelementptr inbounds i32, i32* %f, i32 %1
28 store i32 0, i32* %arrayidx2, align 4
29 %indvars.iv.next = add nuw nsw i32 %indvars.iv, 2
30 %cmp = icmp ult i32 %indvars.iv.next, %0
31 br i1 %cmp, label %for.body, label %for.end.loopexit
32
33 for.end.loopexit: ; preds = %for.body
34 br label %for.end
35
36 for.end: ; preds = %for.end.loopexit, %entry
37 ret void
38 ; CHECK-LABEL: @test(
39 ; CHECK: call void @llvm.memset
40 ; CHECK-NOT: store
41 }
42
43 ;void test_pattern(int *f, unsigned n) {
44 ; for (unsigned i = 0; i < 2 * n; i += 2) {
45 ; f[i] = 2;
46 ; f[i+1] = 2;
47 ; }
48 ;}
49 define void @test_pattern(i32* %f, i32 %n) nounwind ssp {
50 entry:
51 %mul = shl i32 %n, 1
52 %cmp1 = icmp eq i32 %mul, 0
53 br i1 %cmp1, label %for.end, label %for.body.preheader
54
55 for.body.preheader: ; preds = %entry
56 br label %for.body
57
58 for.body: ; preds = %for.body.preheader, %for.body
59 %indvars.iv = phi i32 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
60 %arrayidx = getelementptr inbounds i32, i32* %f, i32 %indvars.iv
61 store i32 2, i32* %arrayidx, align 4
62 %x1 = or i32 %indvars.iv, 1
63 %arrayidx2 = getelementptr inbounds i32, i32* %f, i32 %x1
64 store i32 2, i32* %arrayidx2, align 4
65 %indvars.iv.next = add nuw nsw i32 %indvars.iv, 2
66 %cmp = icmp ult i32 %indvars.iv.next, %mul
67 br i1 %cmp, label %for.body, label %for.end.loopexit
68
69 for.end.loopexit: ; preds = %for.body
70 br label %for.end
71
72 for.end: ; preds = %for.end.loopexit, %entry
73 ret void
74 ; CHECK-LABEL: @test_pattern(
75 ; CHECK: call void @memset_pattern16
76 ; CHECK-NOT: store
77 }
0 ; RUN: opt -O3 -S -analyze -scalar-evolution < %s | FileCheck %s
1
2 target datalayout = "e-m:m-p:40:64:64:32-i32:32-i16:16-i8:8-n32"
3
4 ;
5 ; This file contains phase ordering tests for scalar evolution.
6 ; Test that the standard passes don't obfuscate the IR so scalar evolution can't
7 ; recognize expressions.
8
9 ; CHECK: test1
10 ; The loop body contains two increments by %div.
11 ; Make sure that 2*%div is recognizable, and not expressed as a bit mask of %d.
12 ; CHECK: --> {%p,+,(8 * (%d /u 4))}
13 define void @test1(i32 %d, i32* %p) nounwind uwtable ssp {
14 entry:
15 %div = udiv i32 %d, 4
16 br label %for.cond
17
18 for.cond: ; preds = %for.inc, %entry
19 %p.addr.0 = phi i32* [ %p, %entry ], [ %add.ptr1, %for.inc ]
20 %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
21 %cmp = icmp ne i32 %i.0, 64
22 br i1 %cmp, label %for.body, label %for.end
23
24 for.body: ; preds = %for.cond
25 store i32 0, i32* %p.addr.0, align 4
26 %add.ptr = getelementptr inbounds i32, i32* %p.addr.0, i32 %div
27 store i32 1, i32* %add.ptr, align 4
28 %add.ptr1 = getelementptr inbounds i32, i32* %add.ptr, i32 %div
29 br label %for.inc
30
31 for.inc: ; preds = %for.body
32 %inc = add i32 %i.0, 1
33 br label %for.cond
34
35 for.end: ; preds = %for.cond
36 ret void
37 }
38
39 ; CHECK: test1a
40 ; Same as test1, but it is even more tempting to fold 2 * (%d /u 2) into %d.
41 ; CHECK: --> {%p,+,(8 * (%d /u 2))}
42 define void @test1a(i32 %d, i32* %p) nounwind uwtable ssp {
43 entry:
44 %div = udiv i32 %d, 2
45 br label %for.cond
46
47 for.cond: ; preds = %for.inc, %entry
48 %p.addr.0 = phi i32* [ %p, %entry ], [ %add.ptr1, %for.inc ]
49 %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
50 %cmp = icmp ne i32 %i.0, 64
51 br i1 %cmp, label %for.body, label %for.end
52
53 for.body: ; preds = %for.cond
54 store i32 0, i32* %p.addr.0, align 4
55 %add.ptr = getelementptr inbounds i32, i32* %p.addr.0, i32 %div
56 store i32 1, i32* %add.ptr, align 4
57 %add.ptr1 = getelementptr inbounds i32, i32* %add.ptr, i32 %div
58 br label %for.inc
59
60 for.inc: ; preds = %for.body
61 %inc = add i32 %i.0, 1
62 br label %for.cond
63
64 for.end: ; preds = %for.cond
65 ret void
66 }
0 ; RUN: opt -S -simplifycfg < %s | FileCheck %s
1 target datalayout = "p:40:64:64:32"
2
3 declare void @foo1()
4
5 declare void @foo2()
6
7 define void @test1(i32 %V) {
8 %C1 = icmp eq i32 %V, 4 ; [#uses=1]
9 %C2 = icmp eq i32 %V, 17 ; [#uses=1]
10 %CN = or i1 %C1, %C2 ; [#uses=1]
11 br i1 %CN, label %T, label %F
12 T: ; preds = %0
13 call void @foo1( )
14 ret void
15 F: ; preds = %0
16 call void @foo2( )
17 ret void
18 ; CHECK-LABEL: @test1(
19 ; CHECK: switch i32 %V, label %F [
20 ; CHECK: i32 17, label %T
21 ; CHECK: i32 4, label %T
22 ; CHECK: ]
23 }
24
25 define void @test1_ptr(i32* %V) {
26 %C1 = icmp eq i32* %V, inttoptr (i32 4 to i32*)
27 %C2 = icmp eq i32* %V, inttoptr (i32 17 to i32*)
28 %CN = or i1 %C1, %C2 ; [#uses=1]
29 br i1 %CN, label %T, label %F
30 T: ; preds = %0
31 call void @foo1( )
32 ret void
33 F: ; preds = %0
34 call void @foo2( )
35 ret void
36 ; CHECK-LABEL: @test1_ptr(
37 ; DL: %magicptr = ptrtoint i32* %V to i32
38 ; DL: switch i32 %magicptr, label %F [
39 ; DL: i32 17, label %T
40 ; DL: i32 4, label %T
41 ; DL: ]
42 }
43
44 define void @test1_ptr_as1(i32 addrspace(1)* %V) {
45 %C1 = icmp eq i32 addrspace(1)* %V, inttoptr (i32 4 to i32 addrspace(1)*)
46 %C2 = icmp eq i32 addrspace(1)* %V, inttoptr (i32 17 to i32 addrspace(1)*)
47 %CN = or i1 %C1, %C2 ; [#uses=1]
48 br i1 %CN, label %T, label %F
49 T: ; preds = %0
50 call void @foo1( )
51 ret void
52 F: ; preds = %0
53 call void @foo2( )
54 ret void
55 ; CHECK-LABEL: @test1_ptr_as1(
56 ; DL: %magicptr = ptrtoint i32 addrspace(1)* %V to i16
57 ; DL: switch i16 %magicptr, label %F [
58 ; DL: i16 17, label %T
59 ; DL: i16 4, label %T
60 ; DL: ]
61 }
62
63 define void @test2(i32 %V) {
64 %C1 = icmp ne i32 %V, 4 ; [#uses=1]
65 %C2 = icmp ne i32 %V, 17 ; [#uses=1]
66 %CN = and i1 %C1, %C2 ; [#uses=1]
67 br i1 %CN, label %T, label %F
68 T: ; preds = %0
69 call void @foo1( )
70 ret void
71 F: ; preds = %0
72 call void @foo2( )
73 ret void
74 ; CHECK-LABEL: @test2(
75 ; CHECK: switch i32 %V, label %T [
76 ; CHECK: i32 17, label %F
77 ; CHECK: i32 4, label %F
78 ; CHECK: ]
79 }
80
81 define void @test3(i32 %V) {
82 %C1 = icmp eq i32 %V, 4 ; [#uses=1]
83 br i1 %C1, label %T, label %N
84 N: ; preds = %0
85 %C2 = icmp eq i32 %V, 17 ; [#uses=1]
86 br i1 %C2, label %T, label %F
87 T: ; preds = %N, %0
88 call void @foo1( )
89 ret void
90 F: ; preds = %N
91 call void @foo2( )
92 ret void
93
94 ; CHECK-LABEL: @test3(
95 ; CHECK: switch i32 %V, label %F [
96 ; CHECK: i32 4, label %T
97 ; CHECK: i32 17, label %T
98 ; CHECK: ]
99 }
100
101
102
103 define i32 @test4(i8 zeroext %c) nounwind ssp noredzone {
104 entry:
105 %cmp = icmp eq i8 %c, 62
106 br i1 %cmp, label %lor.end, label %lor.lhs.false
107
108 lor.lhs.false: ; preds = %entry
109 %cmp4 = icmp eq i8 %c, 34
110 br i1 %cmp4, label %lor.end, label %lor.rhs
111
112 lor.rhs: ; preds = %lor.lhs.false
113 %cmp8 = icmp eq i8 %c, 92
114 br label %lor.end
115
116 lor.end: ; preds = %lor.rhs, %lor.lhs.false, %entry
117 %0 = phi i1 [ true, %lor.lhs.false ], [ true, %entry ], [ %cmp8, %lor.rhs ]
118 %lor.ext = zext i1 %0 to i32
119 ret i32 %lor.ext
120
121 ; CHECK-LABEL: @test4(
122 ; CHECK: switch i8 %c, label %lor.rhs [
123 ; CHECK: i8 62, label %lor.end
124 ; CHECK: i8 34, label %lor.end
125 ; CHECK: i8 92, label %lor.end
126 ; CHECK: ]
127 }
128
129 define i32 @test5(i8 zeroext %c) nounwind ssp noredzone {
130 entry:
131 switch i8 %c, label %lor.rhs [
132 i8 62, label %lor.end
133 i8 34, label %lor.end
134 i8 92, label %lor.end
135 ]
136
137 lor.rhs: ; preds = %entry
138 %V = icmp eq i8 %c, 92
139 br label %lor.end
140
141 lor.end: ; preds = %entry, %entry, %entry, %lor.rhs
142 %0 = phi i1 [ true, %entry ], [ %V, %lor.rhs ], [ true, %entry ], [ true, %entry ]
143 %lor.ext = zext i1 %0 to i32
144 ret i32 %lor.ext
145 ; CHECK-LABEL: @test5(
146 ; CHECK: switch i8 %c, label %lor.rhs [
147 ; CHECK: i8 62, label %lor.end
148 ; CHECK: i8 34, label %lor.end
149 ; CHECK: i8 92, label %lor.end
150 ; CHECK: ]
151 }
152
153
154 define i1 @test6({ i32, i32 }* %I) {
155 entry:
156 %tmp.1.i = getelementptr { i32, i32 }, { i32, i32 }* %I, i64 0, i32 1 ; [#uses=1]
157 %tmp.2.i = load i32, i32* %tmp.1.i ; [#uses=6]
158 %tmp.2 = icmp eq i32 %tmp.2.i, 14 ; [#uses=1]
159 br i1 %tmp.2, label %shortcirc_done.4, label %shortcirc_next.0
160 shortcirc_next.0: ; preds = %entry
161 %tmp.6 = icmp eq i32 %tmp.2.i, 15 ; [#uses=1]
162 br i1 %tmp.6, label %shortcirc_done.4, label %shortcirc_next.1
163 shortcirc_next.1: ; preds = %shortcirc_next.0
164 %tmp.11 = icmp eq i32 %tmp.2.i, 16 ; [#uses=1]
165 br i1 %tmp.11, label %shortcirc_done.4, label %shortcirc_next.2
166 shortcirc_next.2: ; preds = %shortcirc_next.1
167 %tmp.16 = icmp eq i32 %tmp.2.i, 17 ; [#uses=1]
168 br i1 %tmp.16, label %shortcirc_done.4, label %shortcirc_next.3
169 shortcirc_next.3: ; preds = %shortcirc_next.2
170 %tmp.21 = icmp eq i32 %tmp.2.i, 18 ; [#uses=1]
171 br i1 %tmp.21, label %shortcirc_done.4, label %shortcirc_next.4
172 shortcirc_next.4: ; preds = %shortcirc_next.3
173 %tmp.26 = icmp eq i32 %tmp.2.i, 19 ; [#uses=1]
174 br label %UnifiedReturnBlock
175 shortcirc_done.4: ; preds = %shortcirc_next.3, %shortcirc_next.2, %shortcirc_next.1, %shortcirc_next.0, %entry
176 br label %UnifiedReturnBlock
177 UnifiedReturnBlock: ; preds = %shortcirc_done.4, %shortcirc_next.4
178 %UnifiedRetVal = phi i1 [ %tmp.26, %shortcirc_next.4 ], [ true, %shortcirc_done.4 ] ; [#uses=1]
179 ret i1 %UnifiedRetVal
180
181 ; CHECK-LABEL: @test6(
182 ; CHECK: %tmp.2.i.off = add i32 %tmp.2.i, -14
183 ; CHECK: %switch = icmp ult i32 %tmp.2.i.off, 6
184 }
185
186 define void @test7(i8 zeroext %c, i32 %x) nounwind ssp noredzone {
187 entry:
188 %cmp = icmp ult i32 %x, 32
189 %cmp4 = icmp eq i8 %c, 97
190 %or.cond = or i1 %cmp, %cmp4
191 %cmp9 = icmp eq i8 %c, 99
192 %or.cond11 = or i1 %or.cond, %cmp9
193 br i1 %or.cond11, label %if.then, label %if.end
194
195 if.then: ; preds = %entry
196 tail call void @foo1() nounwind noredzone
197 ret void
198
199 if.end: ; preds = %entry
200 ret void
201
202 ; CHECK-LABEL: @test7(
203 ; CHECK: %cmp = icmp ult i32 %x, 32
204 ; CHECK: br i1 %cmp, label %if.then, label %switch.early.test
205 ; CHECK: switch.early.test:
206 ; CHECK: switch i8 %c, label %if.end [
207 ; CHECK: i8 99, label %if.then
208 ; CHECK: i8 97, label %if.then
209 ; CHECK: ]
210 }
211
212 define i32 @test8(i8 zeroext %c, i32 %x, i1 %C) nounwind ssp noredzone {
213 entry:
214 br i1 %C, label %N, label %if.then
215 N:
216 %cmp = icmp ult i32 %x, 32
217 %cmp4 = icmp eq i8 %c, 97
218 %or.cond = or i1 %cmp, %cmp4
219 %cmp9 = icmp eq i8 %c, 99
220 %or.cond11 = or i1 %or.cond, %cmp9
221 br i1 %or.cond11, label %if.then, label %if.end
222
223 if.then: ; preds = %entry
224 %A = phi i32 [0, %entry], [42, %N]
225 tail call void @foo1() nounwind noredzone
226 ret i32 %A
227
228 if.end: ; preds = %entry
229 ret i32 0
230
231 ; CHECK-LABEL: @test8(
232 ; CHECK: switch.early.test:
233 ; CHECK: switch i8 %c, label %if.end [
234 ; CHECK: i8 99, label %if.then
235 ; CHECK: i8 97, label %if.then
236 ; CHECK: ]
237 ; CHECK: %A = phi i32 [ 0, %entry ], [ 42, %switch.early.test ], [ 42, %N ], [ 42, %switch.early.test ]
238 }
239
240 ;; This is "Example 7" from http://blog.regehr.org/archives/320
241 define i32 @test9(i8 zeroext %c) nounwind ssp noredzone {
242 entry:
243 %cmp = icmp ult i8 %c, 33
244 br i1 %cmp, label %lor.end, label %lor.lhs.false
245
246 lor.lhs.false: ; preds = %entry
247 %cmp4 = icmp eq i8 %c, 46
248 br i1 %cmp4, label %lor.end, label %lor.lhs.false6
249
250 lor.lhs.false6: ; preds = %lor.lhs.false
251 %cmp9 = icmp eq i8 %c, 44
252 br i1 %cmp9, label %lor.end, label %lor.lhs.false11
253
254 lor.lhs.false11: ; preds = %lor.lhs.false6
255 %cmp14 = icmp eq i8 %c, 58
256 br i1 %cmp14, label %lor.end, label %lor.lhs.false16
257
258 lor.lhs.false16: ; preds = %lor.lhs.false11
259 %cmp19 = icmp eq i8 %c, 59
260 br i1 %cmp19, label %lor.end, label %lor.lhs.false21
261
262 lor.lhs.false21: ; preds = %lor.lhs.false16
263 %cmp24 = icmp eq i8 %c, 60
264 br i1 %cmp24, label %lor.end, label %lor.lhs.false26
265
266 lor.lhs.false26: ; preds = %lor.lhs.false21
267 %cmp29 = icmp eq i8 %c, 62
268 br i1 %cmp29, label %lor.end, label %lor.lhs.false31
269
270 lor.lhs.false31: ; preds = %lor.lhs.false26
271 %cmp34 = icmp eq i8 %c, 34
272 br i1 %cmp34, label %lor.end, label %lor.lhs.false36
273
274 lor.lhs.false36: ; preds = %lor.lhs.false31
275 %cmp39 = icmp eq i8 %c, 92
276 br i1 %cmp39, label %lor.end, label %lor.rhs
277
278 lor.rhs: ; preds = %lor.lhs.false36
279 %cmp43 = icmp eq i8 %c, 39
280 br label %lor.end
281
282 lor.end: ; preds = %lor.rhs, %lor.lhs.false36, %lor.lhs.false31, %lor.lhs.false26, %lor.lhs.false21, %lor.lhs.false16, %lor.lhs.false11, %lor.lhs.false6, %lor.lhs.false, %entry
283 %0 = phi i1 [ true, %lor.lhs.false36 ], [ true, %lor.lhs.false31 ], [ true, %lor.lhs.false26 ], [ true, %lor.lhs.false21 ], [ true, %lor.lhs.false16 ], [ true, %lor.lhs.false11 ], [ true, %lor.lhs.false6 ], [ true, %lor.lhs.false ], [ true, %entry ], [ %cmp43, %lor.rhs ]
284 %conv46 = zext i1 %0 to i32
285 ret i32 %conv46
286
287 ; CHECK-LABEL: @test9(
288 ; CHECK: %cmp = icmp ult i8 %c, 33
289 ; CHECK: br i1 %cmp, label %lor.end, label %switch.early.test
290
291 ; CHECK: switch.early.test:
292 ; CHECK: switch i8 %c, label %lor.rhs [
293 ; CHECK: i8 92, label %lor.end
294 ; CHECK: i8 62, label %lor.end
295 ; CHECK: i8 60, label %lor.end
296 ; CHECK: i8 59, label %lor.end
297 ; CHECK: i8 58, label %lor.end
298 ; CHECK: i8 46, label %lor.end
299 ; CHECK: i8 44, label %lor.end
300 ; CHECK: i8 34, label %lor.end
301 ; CHECK: i8 39, label %lor.end
302 ; CHECK: ]
303 }
304
305 define i32 @test10(i32 %mode, i1 %Cond) {
306 %A = icmp ne i32 %mode, 0
307 %B = icmp ne i32 %mode, 51
308 %C = and i1 %A, %B
309 %D = and i1 %C, %Cond
310 br i1 %D, label %T, label %F
311 T:
312 ret i32 123
313 F:
314 ret i32 324
315
316 ; CHECK-LABEL: @test10(
317 ; CHECK: br i1 %Cond, label %switch.early.test, label %F
318 ; CHECK: switch.early.test:
319 ; CHECK: switch i32 %mode, label %T [
320 ; CHECK: i32 51, label %F
321 ; CHECK: i32 0, label %F
322 ; CHECK: ]
323 }
324
325 ; PR8780
326 define i32 @test11(i32 %bar) nounwind {
327 entry:
328 %cmp = icmp eq i32 %bar, 4
329 %cmp2 = icmp eq i32 %bar, 35
330 %or.cond = or i1 %cmp, %cmp2
331 %cmp5 = icmp eq i32 %bar, 53
332 %or.cond1 = or i1 %or.cond, %cmp5
333 %cmp8 = icmp eq i32 %bar, 24
334 %or.cond2 = or i1 %or.cond1, %cmp8
335 %cmp11 = icmp eq i32 %bar, 23
336 %or.cond3 = or i1 %or.cond2, %cmp11
337 %cmp14 = icmp eq i32 %bar, 55
338 %or.cond4 = or i1 %or.cond3, %cmp14
339 %cmp17 = icmp eq i32 %bar, 12
340 %or.cond5 = or i1 %or.cond4, %cmp17
341 %cmp20 = icmp eq i32 %bar, 35
342 %or.cond6 = or i1 %or.cond5, %cmp20
343 br i1 %or.cond6, label %if.then, label %if.end
344
345 if.then: ; preds = %entry
346 br label %return
347
348 if.end: ; preds = %entry
349 br label %return
350
351 return: ; preds = %if.end, %if.then
352 %retval.0 = phi i32 [ 1, %if.then ], [ 0, %if.end ]
353 ret i32 %retval.0
354
355 ; CHECK-LABEL: @test11(
356 ; CHECK: switch i32 %bar, label %if.end [
357 ; CHECK: i32 55, label %return
358 ; CHECK: i32 53, label %return
359 ; CHECK: i32 35, label %return
360 ; CHECK: i32 24, label %return
361 ; CHECK: i32 23, label %return
362 ; CHECK: i32 12, label %return
363 ; CHECK: i32 4, label %return
364 ; CHECK: ]
365 }
366
367 define void @test12() nounwind {
368 entry:
369 br label %bb49.us.us
370
371 bb49.us.us:
372 %A = icmp eq i32 undef, undef
373 br i1 %A, label %bb55.us.us, label %malformed
374
375 bb48.us.us:
376 %B = icmp ugt i32 undef, undef
377 br i1 %B, label %bb55.us.us, label %bb49.us.us
378
379 bb55.us.us:
380 br label %bb48.us.us
381
382 malformed:
383 ret void
384 ; CHECK-LABEL: @test12(
385
386 }
387
388 ; test13 - handle switch formation with ult.
389 define void @test13(i32 %x) nounwind ssp noredzone {
390 entry:
391 %cmp = icmp ult i32 %x, 2
392 br i1 %cmp, label %if.then, label %lor.lhs.false3
393
394 lor.lhs.false3:                                  ; preds = %entry
395 %cmp5 = icmp eq i32 %x, 3
396 br i1 %cmp5, label %if.then, label %lor.lhs.false6
397
398 lor.lhs.false6: ; preds = %lor.lhs.false3
399 %cmp8 = icmp eq i32 %x, 4
400 br i1 %cmp8, label %if.then, label %lor.lhs.false9
401
402 lor.lhs.false9: ; preds = %lor.lhs.false6
403 %cmp11 = icmp eq i32 %x, 6
404 br i1 %cmp11, label %if.then, label %if.end
405
406 if.then:                                          ; preds = %lor.lhs.false9, %lor.lhs.false6, %lor.lhs.false3, %entry
407 call void @foo1() noredzone
408 br label %if.end
409
410 if.end: ; preds = %if.then, %lor.lhs.false9
411 ret void
412 ; CHECK-LABEL: @test13(
413 ; CHECK: switch i32 %x, label %if.end [
414 ; CHECK: i32 6, label %if.then
415 ; CHECK: i32 4, label %if.then
416 ; CHECK: i32 3, label %if.then
417 ; CHECK: i32 1, label %if.then
418 ; CHECK: i32 0, label %if.then
419 ; CHECK: ]
420 }
421
422 ; test14 - like test13, but with the comparisons inverted (ugt/ne).
423 define void @test14(i32 %x) nounwind ssp noredzone {
424 entry:
425 %cmp = icmp ugt i32 %x, 2
426 br i1 %cmp, label %lor.lhs.false3, label %if.then
427
428 lor.lhs.false3:                                  ; preds = %entry
429 %cmp5 = icmp ne i32 %x, 3
430 br i1 %cmp5, label %lor.lhs.false6, label %if.then
431
432 lor.lhs.false6: ; preds = %lor.lhs.false3
433 %cmp8 = icmp ne i32 %x, 4
434 br i1 %cmp8, label %lor.lhs.false9, label %if.then
435
436 lor.lhs.false9: ; preds = %lor.lhs.false6
437 %cmp11 = icmp ne i32 %x, 6
438 br i1 %cmp11, label %if.end, label %if.then
439
440 if.then:                                          ; preds = %lor.lhs.false9, %lor.lhs.false6, %lor.lhs.false3, %entry
441 call void @foo1() noredzone
442 br label %if.end
443
444 if.end: ; preds = %if.then, %lor.lhs.false9
445 ret void
446 ; CHECK-LABEL: @test14(
447 ; CHECK: switch i32 %x, label %if.end [
448 ; CHECK: i32 6, label %if.then
449 ; CHECK: i32 4, label %if.then
450 ; CHECK: i32 3, label %if.then
451 ; CHECK: i32 1, label %if.then
452 ; CHECK: i32 0, label %if.then
453 ; CHECK: ]
454 }
455
456 ; Don't crash on ginormous ranges.
457 define void @test15(i128 %x) nounwind {
458 %cmp = icmp ugt i128 %x, 2
459 br i1 %cmp, label %if.end, label %lor.false
460
461 lor.false:
462 %cmp2 = icmp ne i128 %x, 100000000000000000000
463 br i1 %cmp2, label %if.end, label %if.then
464
465 if.then:
466 call void @foo1() noredzone
467 br label %if.end
468
469 if.end:
470 ret void
471
472 ; CHECK-LABEL: @test15(
473 ; CHECK-NOT: switch
474 ; CHECK: ret void
475 }
476
477 ; PR8675
478 ; rdar://5134905
479 define zeroext i1 @test16(i32 %x) nounwind {
480 entry:
481 ; CHECK-LABEL: @test16(
482 ; CHECK: %x.off = add i32 %x, -1
483 ; CHECK: %switch = icmp ult i32 %x.off, 3
484 %cmp.i = icmp eq i32 %x, 1
485 br i1 %cmp.i, label %lor.end, label %lor.lhs.false
486
487 lor.lhs.false:
488 %cmp.i2 = icmp eq i32 %x, 2
489 br i1 %cmp.i2, label %lor.end, label %lor.rhs
490
491 lor.rhs:
492 %cmp.i1 = icmp eq i32 %x, 3
493 br label %lor.end
494
495 lor.end:
496 %0 = phi i1 [ true, %lor.lhs.false ], [ true, %entry ], [ %cmp.i1, %lor.rhs ]
497 ret i1 %0
498 }
499
500 ; Check that we don't turn an icmp into a switch where it's not useful.
501 define void @test17(i32 %x, i32 %y) {
502 %cmp = icmp ult i32 %x, 3
503 %switch = icmp ult i32 %y, 2
504 %or.cond775 = or i1 %cmp, %switch
505 br i1 %or.cond775, label %lor.lhs.false8, label %return
506
507 lor.lhs.false8:
508 tail call void @foo1()
509 ret void
510
511 return:
512 ret void
513
514 ; CHECK-LABEL: @test17(
515 ; CHECK-NOT: switch.early.test
516 ; CHECK-NOT: switch i32
517 ; CHECK: ret void
518 }
519
520 define void @test18(i32 %arg) {
521 bb:
522 %tmp = and i32 %arg, -2
523 %tmp1 = icmp eq i32 %tmp, 8
524 %tmp2 = icmp eq i32 %arg, 10
525 %tmp3 = or i1 %tmp1, %tmp2
526 %tmp4 = icmp eq i32 %arg, 11
527 %tmp5 = or i1 %tmp3, %tmp4
528 %tmp6 = icmp eq i32 %arg, 12
529 %tmp7 = or i1 %tmp5, %tmp6
530 br i1 %tmp7, label %bb19, label %bb8
531
532 bb8: ; preds = %bb
533 %tmp9 = add i32 %arg, -13
534 %tmp10 = icmp ult i32 %tmp9, 2
535 %tmp11 = icmp eq i32 %arg, 16
536 %tmp12 = or i1 %tmp10, %tmp11
537 %tmp13 = icmp eq i32 %arg, 17
538 %tmp14 = or i1 %tmp12, %tmp13
539 %tmp15 = icmp eq i32 %arg, 18
540 %tmp16 = or i1 %tmp14, %tmp15
541 %tmp17 = icmp eq i32 %arg, 15
542 %tmp18 = or i1 %tmp16, %tmp17
543 br i1 %tmp18, label %bb19, label %bb20
544
545 bb19: ; preds = %bb8, %bb
546 tail call void @foo1()
547 br label %bb20
548
549 bb20: ; preds = %bb19, %bb8
550 ret void
551
552 ; CHECK-LABEL: @test18(
553 ; CHECK: %arg.off = add i32 %arg, -8
554 ; CHECK: icmp ult i32 %arg.off, 11
555 }
556
557 define void @PR26323(i1 %tobool23, i32 %tmp3) {
558 entry:
559 %tobool5 = icmp ne i32 %tmp3, 0
560 %neg14 = and i32 %tmp3, -2
561 %cmp17 = icmp ne i32 %neg14, -1
562 %or.cond = and i1 %tobool5, %tobool23
563 %or.cond1 = and i1 %cmp17, %or.cond
564 br i1 %or.cond1, label %if.end29, label %if.then27
565
566 if.then27: ; preds = %entry
567 call void @foo1()
568 unreachable
569
570 if.end29: ; preds = %entry
571 ret void
572 }
573
574 ; CHECK-LABEL: define void @PR26323(
575 ; CHECK: %tobool5 = icmp ne i32 %tmp3, 0
576 ; CHECK: %neg14 = and i32 %tmp3, -2
577 ; CHECK: %cmp17 = icmp ne i32 %neg14, -1
578 ; CHECK: %or.cond = and i1 %tobool5, %tobool23
579 ; CHECK: %or.cond1 = and i1 %cmp17, %or.cond
580 ; CHECK: br i1 %or.cond1, label %if.end29, label %if.then27
581
582 ; Form a switch when and'ing a negated power of two
583 ; CHECK-LABEL: define void @test19
584 ; CHECK: switch i32 %arg, label %else [
585 ; CHECK: i32 32, label %if
586 ; CHECK: i32 13, label %if
587 ; CHECK: i32 12, label %if
588 define void @test19(i32 %arg) {
589 %and = and i32 %arg, -2
590 %cmp1 = icmp eq i32 %and, 12
591 %cmp2 = icmp eq i32 %arg, 32
592 %pred = or i1 %cmp1, %cmp2
593 br i1 %pred, label %if, label %else
594
595 if:
596 call void @foo1()
597 ret void
598
599 else:
600 ret void
601 }
602
603 ; Since %cmp1 is always false, a switch is never formed
604 ; CHECK-LABEL: define void @test20
605 ; CHECK-NOT: switch
606 ; CHECK: ret void
607 define void @test20(i32 %arg) {
608 %and = and i32 %arg, -2
609 %cmp1 = icmp eq i32 %and, 13
610 %cmp2 = icmp eq i32 %arg, 32
611 %pred = or i1 %cmp1, %cmp2
612 br i1 %pred, label %if, label %else
613
614 if:
615 call void @foo1()
616 ret void
617
618 else:
619 ret void
620 }
621
622 ; Form a switch when or'ing a power of two
623 ; CHECK-LABEL: define void @test21
624 ; CHECK: i32 32, label %else
625 ; CHECK: i32 13, label %else
626 ; CHECK: i32 12, label %else
627 define void @test21(i32 %arg) {
628 %and = or i32 %arg, 1
629 %cmp1 = icmp ne i32 %and, 13
630 %cmp2 = icmp ne i32 %arg, 32
631 %pred = and i1 %cmp1, %cmp2
632 br i1 %pred, label %if, label %else
633
634 if:
635 call void @foo1()
636 ret void
637
638 else:
639 ret void
640 }
641
642 ; Since %cmp1 is always true, a switch is never formed
643 ; CHECK-LABEL: define void @test22
644 ; CHECK-NOT: switch
645 ; CHECK: ret void
646 define void @test22(i32 %arg) {
647 %and = or i32 %arg, 1
648 %cmp1 = icmp ne i32 %and, 12
649 %cmp2 = icmp ne i32 %arg, 32
650 %pred = and i1 %cmp1, %cmp2
651 br i1 %pred, label %if, label %else
652
653 if:
654 call void @foo1()
655 ret void
656
657 else:
658 ret void
659 }