llvm.org GIT mirror llvm / 224904b
[PM/LoopUnswitch] Teach the new unswitch to handle nontrivial unswitching of switches.

This works much like trivial unswitching of switches in that it reliably moves the switch out of the loop. Here we potentially clone the entire loop into each successor of the switch and re-point the cases at these clones.

Due to the complexity of actually doing nontrivial unswitching, this patch doesn't create a dedicated routine for handling switches -- it would duplicate far too much code. Instead, it generalizes the existing routine to handle both branches and switches as it largely reduces to looping in a few places instead of doing something once. This actually improves the results in some cases with branches due to being much more careful about how dead regions of code are managed. With branches, because exactly one clone is created and there are exactly two edges considered, somewhat sloppy handling of the dead regions of code was sufficient in most cases. But with switches, there are much more complicated patterns of dead code and so I've had to move to a more robust model generally. We still do as much pruning of the dead code early as possible because that allows us to avoid even cloning the code.

This also surfaced another problem with nontrivial unswitching before which is that we weren't as precise in reconstructing loops as we could have been. This seems to have been mostly harmless, but resulted in pointless LCSSA PHI nodes and other unnecessary cruft. With switches, we have to get this *right*, and everything benefits from it.

While the testing may seem a bit light here because we only have two real cases with actual switches, they do a surprisingly good job of exercising numerous edge cases. Also, because we share the logic with branches, most of the changes in this patch are reasonably well covered by existing tests.

The new unswitch now has all of the same fundamental power as the old one with the exception of the single unsound case of *partial* switch unswitching -- that really is just loop specialization and not unswitching at all. It doesn't fit into the canonicalization model in any way. We can add a loop specialization pass that runs late based on profile data if important test cases ever come up here.

Differential Revision: https://reviews.llvm.org/D47683
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335553 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth 1 year, 2 months ago
3 changed file(s) with 427 addition(s) and 219 deletion(s).
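As a source-level illustration of what the commit message describes, the sketch below shows a loop containing a switch on a loop-invariant value next to a hand-unswitched equivalent in which each case gets its own copy of the loop. This is only an analogy written in plain C++: the pass itself works on LLVM IR, and the function names here (`sumWithSwitchInLoop`, `sumUnswitched`, `checkSameResult`) are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <vector>

// Before: the switch on the loop-invariant `mode` is evaluated on every
// iteration of the loop.
int sumWithSwitchInLoop(const std::vector<int> &xs, int mode) {
  int sum = 0;
  for (int x : xs) {
    switch (mode) {
    case 0:
      sum += x;
      break;
    case 1:
      sum += 2 * x;
      break;
    default:
      sum -= x;
      break;
    }
  }
  return sum;
}

// After: the switch is hoisted out of the loop. Each case now points at its
// own clone of the loop, and the default destination keeps the retained loop.
int sumUnswitched(const std::vector<int> &xs, int mode) {
  int sum = 0;
  switch (mode) {
  case 0:
    for (int x : xs)
      sum += x;
    break;
  case 1:
    for (int x : xs)
      sum += 2 * x;
    break;
  default:
    for (int x : xs)
      sum -= x;
    break;
  }
  return sum;
}

// Small self-check: both forms must compute the same value.
void checkSameResult(const std::vector<int> &xs, int mode) {
  assert(sumWithSwitchInLoop(xs, mode) == sumUnswitched(xs, mode));
}
```

The cost of the transformation is visible here as well: the loop body is duplicated once per distinct case successor, which is why the pass weighs a cost estimate before committing to a non-trivial unswitch.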
1616
1717 namespace llvm {
1818
19 /// This pass transforms loops that contain branches on loop-invariant
20 /// conditions to have multiple loops. For example, it turns the left into the
21 /// right code:
19 /// This pass transforms loops that contain branches or switches on loop-
20 /// invariant conditions to have multiple loops. For example, it turns the left
21 /// into the right code:
2222 ///
2323 /// for (...) if (lic)
2424 /// A for (...)
3434 /// This pass expects LICM to be run before it to hoist invariant conditions out
3535 /// of the loop, to make the unswitching opportunity obvious.
3636 ///
37 /// There is a taxonomy of unswitching that we use to classify different forms
38 /// of this transformation:
39 ///
40 /// - Trivial unswitching: this is when the condition can be unswitched without
41 /// cloning any code from inside the loop. A non-trivial unswitch requires
42 /// code duplication.
43 ///
44 /// - Full unswitching: this is when the branch or switch is completely moved
45 /// from inside the loop to outside the loop. Partial unswitching removes the
46 /// branch from the clone of the loop but must leave a (somewhat simplified)
47 /// branch in the original loop. While theoretically partial unswitching can
48 /// be done for switches, the requirements are extreme - we need the loop
49 /// invariant input to the switch to be sufficient to collapse to a single
50 /// successor in each clone.
51 ///
52 /// This pass always does trivial, full unswitching for both branches and
53 /// switches. For branches, it also always does trivial, partial unswitching.
54 ///
55 /// If enabled (via the constructor's `NonTrivial` parameter), this pass will
56 /// additionally do non-trivial, full unswitching for branches and switches, and
57 /// will do non-trivial, partial unswitching for branches.
58 ///
59 /// Because partial unswitching of switches is extremely unlikely to be possible
60 /// in practice and significantly complicates the implementation, this pass does
61 /// not currently implement that in any mode.
3762 class SimpleLoopUnswitchPass : public PassInfoMixin<SimpleLoopUnswitchPass> {
3863 bool NonTrivial;
3964
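To make the "partial" entry of the taxonomy above concrete, here is a source-level sketch of partially unswitching a branch whose condition combines a loop-invariant value with a loop-varying one through `and`. It is an analogy in plain C++, not the IR transformation itself, and the names `countOriginal` and `countPartiallyUnswitched` are hypothetical. When the invariant is false the whole condition is known false, so the clone loses the branch entirely, while the original loop keeps a simplified branch on the varying part.

```cpp
#include <cassert>
#include <vector>

// Before: the branch condition mixes the loop-invariant `lic` with a
// loop-varying test on each element.
int countOriginal(const std::vector<int> &xs, bool lic) {
  int count = 0;
  for (int x : xs) {
    if (lic && x > 0)
      ++count;
    else
      --count;
  }
  return count;
}

// After a partial unswitch on `lic`:
//  - the cloned loop runs when `lic` is false; the `and` is then known to be
//    false, so the branch disappears and only the else side remains;
//  - the original loop runs when `lic` is true and still needs a (simplified)
//    branch on the loop-varying part of the condition.
int countPartiallyUnswitched(const std::vector<int> &xs, bool lic) {
  int count = 0;
  if (!lic) {
    for (int x : xs) {
      (void)x; // The element no longer influences control flow here.
      --count;
    }
  } else {
    for (int x : xs) {
      if (x > 0)
        ++count;
      else
        --count;
    }
  }
  assert(count == countOriginal(xs, lic) && "unswitching must preserve behavior");
  return count;
}
```

A full unswitch, by contrast, would leave no branch on the condition in either loop, which is only possible when the entire condition is loop invariant.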
714714 ///
715715 /// This routine handles cloning all of the necessary loop blocks and exit
716716 /// blocks including rewriting their instructions and the relevant PHI nodes.
717 /// It skips loop and exit blocks that are not necessary based on the provided
718 /// set. It also correctly creates the unconditional branch in the cloned
717 /// Any loop blocks or exit blocks which are dominated by a different successor
718 /// than the one for this clone of the loop blocks can be trivially skipped. We
719 /// use the `DominatingSucc` map to determine whether a block satisfies that
720 /// property with a simple map lookup.
721 ///
722 /// It also correctly creates the unconditional branch in the cloned
719723 /// unswitched parent block to only point at the unswitched successor.
720724 ///
721725 /// This does not handle most of the necessary updates to `LoopInfo`. Only exit
729733 Loop &L, BasicBlock *LoopPH, BasicBlock *SplitBB,
730734 ArrayRef ExitBlocks, BasicBlock *ParentBB,
731735 BasicBlock *UnswitchedSuccBB, BasicBlock *ContinueSuccBB,
732 const SmallPtrSetImpl &SkippedLoopAndExitBlocks,
736 const SmallDenseMap &DominatingSucc,
733737 ValueToValueMapTy &VMap,
734738 SmallVectorImpl &DTUpdates, AssumptionCache &AC,
735739 DominatorTree &DT, LoopInfo &LI) {
750754 return NewBB;
751755 };
752756
757 // We skip cloning blocks when they have a dominating succ that is not the
758 // succ we are cloning for.
759 auto SkipBlock = [&](BasicBlock *BB) {
760 auto It = DominatingSucc.find(BB);
761 return It != DominatingSucc.end() && It->second != UnswitchedSuccBB;
762 };
763
753764 // First, clone the preheader.
754765 auto *ClonedPH = CloneBlock(LoopPH);
755766
756767 // Then clone all the loop blocks, skipping the ones that aren't necessary.
757768 for (auto *LoopBB : L.blocks())
758 if (!SkippedLoopAndExitBlocks.count(LoopBB))
769 if (!SkipBlock(LoopBB))
759770 CloneBlock(LoopBB);
760771
761772 // Split all the loop exit edges so that when we clone the exit blocks, if
762773 // any of the exit blocks are *also* a preheader for some other loop, we
763774 // don't create multiple predecessors entering the loop header.
764775 for (auto *ExitBB : ExitBlocks) {
765 if (SkippedLoopAndExitBlocks.count(ExitBB))
776 if (SkipBlock(ExitBB))
766777 continue;
767778
768779 // When we are going to clone an exit, we don't need to clone all the
840851 // Update any PHI nodes in the cloned successors of the skipped blocks to not
841852 // have spurious incoming values.
842853 for (auto *LoopBB : L.blocks())
843 if (SkippedLoopAndExitBlocks.count(LoopBB))
854 if (SkipBlock(LoopBB))
844855 for (auto *SuccBB : successors(LoopBB))
845856 if (auto *ClonedSuccBB = cast_or_null<BasicBlock>(VMap.lookup(SuccBB)))
846857 for (PHINode &PN : ClonedSuccBB->phis())
11741185 }
11751186
11761187 static void
1188 deleteDeadClonedBlocks(Loop &L, ArrayRef<BasicBlock *> ExitBlocks,
1189 ArrayRef<std::unique_ptr<ValueToValueMapTy>> VMaps,
1190 DominatorTree &DT) {
1191 // Find all the dead clones, and remove them from their successors.
1192 SmallVector DeadBlocks;
1193 for (BasicBlock *BB : llvm::concat(L.blocks(), ExitBlocks))
1194 for (auto &VMap : VMaps)
1195 if (BasicBlock *ClonedBB = cast_or_null<BasicBlock>(VMap->lookup(BB)))
1196 if (!DT.isReachableFromEntry(ClonedBB)) {
1197 for (BasicBlock *SuccBB : successors(ClonedBB))
1198 SuccBB->removePredecessor(ClonedBB);
1199 DeadBlocks.push_back(ClonedBB);
1200 }
1201
1202 // Drop any remaining references to break cycles.
1203 for (BasicBlock *BB : DeadBlocks)
1204 BB->dropAllReferences();
1205 // Erase them from the IR.
1206 for (BasicBlock *BB : DeadBlocks)
1207 BB->eraseFromParent();
1208 }
1209
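The routine above follows a two-phase pattern: first detach every unreachable block from its successors, then erase the blocks. The sketch below models that pattern on a toy graph, as a minimal illustration under stated assumptions. `ToyCFG`, `reachableFrom`, and `deleteDeadBlocks` are hypothetical names, blocks are plain integers, and reachability is computed with a simple worklist walk, whereas the patch asks the dominator tree via `DT.isReachableFromEntry`.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Toy control-flow graph: blocks are ints, edges are kept in both directions.
struct ToyCFG {
  std::map<int, std::vector<int>> Succs;
  std::map<int, std::vector<int>> Preds;

  void addBlock(int B) {
    Succs[B];
    Preds[B];
  }
  void addEdge(int From, int To) {
    addBlock(From);
    addBlock(To);
    Succs[From].push_back(To);
    Preds[To].push_back(From);
  }
};

// Worklist walk over successor edges (stand-in for a dominator tree query).
std::set<int> reachableFrom(const ToyCFG &G, int Entry) {
  std::set<int> Seen;
  std::vector<int> Worklist{Entry};
  while (!Worklist.empty()) {
    int B = Worklist.back();
    Worklist.pop_back();
    if (!Seen.insert(B).second)
      continue;
    auto It = G.Succs.find(B);
    if (It != G.Succs.end())
      for (int S : It->second)
        Worklist.push_back(S);
  }
  return Seen;
}

// Deletes every block not reachable from Entry and returns the deleted blocks.
std::vector<int> deleteDeadBlocks(ToyCFG &G, int Entry) {
  assert(G.Succs.count(Entry) && "entry must be a block in the graph");
  std::set<int> Live = reachableFrom(G, Entry);

  std::vector<int> Dead;
  for (const auto &KV : G.Succs)
    if (!Live.count(KV.first))
      Dead.push_back(KV.first);

  // Phase 1: remove each dead block from its successors' predecessor lists.
  for (int B : Dead)
    for (int S : G.Succs[B]) {
      std::vector<int> &P = G.Preds[S];
      P.erase(std::remove(P.begin(), P.end(), B), P.end());
    }

  // Phase 2: only now erase the dead blocks themselves.
  for (int B : Dead) {
    G.Succs.erase(B);
    G.Preds.erase(B);
  }
  return Dead;
}
```

Splitting the work into two phases matters when dead blocks form a cycle: every dead block is still present while its outgoing edges are being removed, so no step touches an already erased block.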
1210 static void
11771211 deleteDeadBlocksFromLoop(Loop &L,
1178 const SmallVectorImpl &DeadBlocks,
11791212 SmallVectorImpl &ExitBlocks,
11801213 DominatorTree &DT, LoopInfo &LI) {
1214 // Find all the dead blocks, and remove them from their successors.
1215 SmallVector DeadBlocks;
1216 for (BasicBlock *BB : llvm::concat(L.blocks(), ExitBlocks))
1217 if (!DT.isReachableFromEntry(BB)) {
1218 for (BasicBlock *SuccBB : successors(BB))
1219 SuccBB->removePredecessor(BB);
1220 DeadBlocks.push_back(BB);
1221 }
1222
11811223 SmallPtrSet DeadBlockSet(DeadBlocks.begin(),
11821224 DeadBlocks.end());
11831225
11851227 // used in the caller.
11861228 llvm::erase_if(ExitBlocks,
11871229 [&](BasicBlock *BB) { return DeadBlockSet.count(BB); });
1188
1189 // Remove these blocks from their successors.
1190 for (auto *BB : DeadBlocks)
1191 for (BasicBlock *SuccBB : successors(BB))
1192 SuccBB->removePredecessor(BB, /*DontDeleteUselessPHIs*/ true);
11931230
11941231 // Walk from this loop up through its parents removing all of the dead blocks.
11951232 for (Loop *ParentL = &L; ParentL; ParentL = ParentL->getParentLoop()) {
15811618 } while (!DomWorklist.empty());
15821619 }
15831620
1584 /// Take an invariant branch that has been determined to be safe and worthwhile
1585 /// to unswitch despite being non-trivial to do so and perform the unswitch.
1586 ///
1587 /// This directly updates the CFG to hoist the predicate out of the loop, and
1588 /// clone the necessary parts of the loop to maintain behavior.
1589 ///
1590 /// It also updates both dominator tree and loopinfo based on the unswitching.
1591 ///
1592 /// Once unswitching has been performed it runs the provided callback to report
1593 /// the new loops and no-longer valid loops to the caller.
1594 static bool unswitchInvariantBranch(
1595 Loop &L, BranchInst &BI, ArrayRef Invariants, DominatorTree &DT,
1596 LoopInfo &LI, AssumptionCache &AC,
1621 static bool unswitchNontrivialInvariants(
1622 Loop &L, TerminatorInst &TI, ArrayRef Invariants,
1623 DominatorTree &DT, LoopInfo &LI, AssumptionCache &AC,
15971624 function_ref)> UnswitchCB) {
1598 auto *ParentBB = BI.getParent();
1599
1600 // We can only unswitch conditional branches with an invariant condition or
1601 // combining invariant conditions with an instruction.
1602 assert(BI.isConditional() && "Can only unswitch a conditional branch!");
1603 bool FullUnswitch = BI.getCondition() == Invariants[0];
1625 auto *ParentBB = TI.getParent();
1626 BranchInst *BI = dyn_cast<BranchInst>(&TI);
1627 SwitchInst *SI = BI ? nullptr : cast<SwitchInst>(&TI);
1628
1629 // We can only unswitch switches, conditional branches with an invariant
1630 // condition, or combining invariant conditions with an instruction.
1631 assert((SI || BI->isConditional()) &&
1632 "Can only unswitch switches and conditional branches!");
1633 bool FullUnswitch = SI || BI->getCondition() == Invariants[0];
16041634 if (FullUnswitch)
16051635 assert(Invariants.size() == 1 &&
16061636 "Cannot have other invariants with full unswitching!");
16071637 else
1608 assert(isa<Instruction>(BI.getCondition()) &&
1638 assert(isa<Instruction>(BI->getCondition()) &&
16091639 "Partial unswitching requires an instruction as the condition!");
16101640
16111641 // Constant and BBs tracking the cloned and continuing successor. When we are
16171647 bool Direction = true;
16181648 int ClonedSucc = 0;
16191649 if (!FullUnswitch) {
1620 if (cast<Instruction>(BI.getCondition())->getOpcode() != Instruction::Or) {
1621 assert(cast<Instruction>(BI.getCondition())->getOpcode() == Instruction::And &&
1622 "Only `or` and `and` instructions can combine invariants being unswitched.");
1650 if (cast<Instruction>(BI->getCondition())->getOpcode() != Instruction::Or) {
1651 assert(cast<Instruction>(BI->getCondition())->getOpcode() ==
1652 Instruction::And &&
1653 "Only `or` and `and` instructions can combine invariants being "
1654 "unswitched.");
16231655 Direction = false;
16241656 ClonedSucc = 1;
16251657 }
16261658 }
1627 auto *UnswitchedSuccBB = BI.getSuccessor(ClonedSucc);
1628 auto *ContinueSuccBB = BI.getSuccessor(1 - ClonedSucc);
1629
1630 assert(UnswitchedSuccBB != ContinueSuccBB &&
1631 "Should not unswitch a branch that always goes to the same place!");
1659
1660 BasicBlock *RetainedSuccBB =
1661 BI ? BI->getSuccessor(1 - ClonedSucc) : SI->getDefaultDest();
1662 SmallSetVector UnswitchedSuccBBs;
1663 if (BI)
1664 UnswitchedSuccBBs.insert(BI->getSuccessor(ClonedSucc));
1665 else
1666 for (auto Case : SI->cases())
1667 UnswitchedSuccBBs.insert(Case.getCaseSuccessor());
1668
1669 assert(!UnswitchedSuccBBs.count(RetainedSuccBB) &&
1670 "Should not unswitch the same successor we are retaining!");
16321671
16331672 // The branch should be in this exact loop. Any inner loop's invariant branch
16341673 // should be handled by unswitching that inner loop. The caller of this
16461685 for (auto *ExitBB : ExitBlocks)
16471686 if (isa(ExitBB->getFirstNonPHI()))
16481687 return false;
1649
1650 SmallPtrSet ExitBlockSet(ExitBlocks.begin(),
1651 ExitBlocks.end());
16521688
16531689 // Compute the parent loop now before we start hacking on things.
16541690 Loop *ParentL = L.getParentLoop();
16681704 OuterExitL = NewOuterExitL;
16691705 }
16701706
1671 // If the edge we *aren't* cloning in the unswitch (the continuing edge)
1672 // dominates its target, we can skip cloning the dominated region of the loop
1673 // and its exits. We compute this as a set of nodes to be skipped.
1674 SmallPtrSet SkippedLoopAndExitBlocks;
1675 if (ContinueSuccBB->getUniquePredecessor() ||
1676 llvm::all_of(predecessors(ContinueSuccBB), [&](BasicBlock *PredBB) {
1677 return PredBB == ParentBB || DT.dominates(ContinueSuccBB, PredBB);
1678 })) {
1679 visitDomSubTree(DT, ContinueSuccBB, [&](BasicBlock *BB) {
1680 SkippedLoopAndExitBlocks.insert(BB);
1681 return true;
1682 });
1683 }
1684 // If we are doing full unswitching, then similarly to the above, the edge we
1685 // *are* cloning in the unswitch (the unswitched edge) dominates its target,
1686 // we will end up with dead nodes in the original loop and its exits that will
1687 // need to be deleted. Here, we just retain that the property holds and will
1688 // compute the deleted set later.
1689 bool DeleteUnswitchedSucc =
1690 FullUnswitch &&
1691 (UnswitchedSuccBB->getUniquePredecessor() ||
1692 llvm::all_of(predecessors(UnswitchedSuccBB), [&](BasicBlock *PredBB) {
1693 return PredBB == ParentBB || DT.dominates(UnswitchedSuccBB, PredBB);
1694 }));
1707 // If the edge from this terminator to a successor dominates that successor,
1708 // store a map from each block in its dominator subtree to it. This lets us
1709 // tell when cloning for a particular successor if a block is dominated by
1710 // some *other* successor with a single data structure. We use this to
1711 // significantly reduce cloning.
1712 SmallDenseMap DominatingSucc;
1713 for (auto *SuccBB : llvm::concat(
1714 makeArrayRef(RetainedSuccBB), UnswitchedSuccBBs))
1715 if (SuccBB->getUniquePredecessor() ||
1716 llvm::all_of(predecessors(SuccBB), [&](BasicBlock *PredBB) {
1717 return PredBB == ParentBB || DT.dominates(SuccBB, PredBB);
1718 }))
1719 visitDomSubTree(DT, SuccBB, [&](BasicBlock *BB) {
1720 DominatingSucc[BB] = SuccBB;
1721 return true;
1722 });
16951723
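The dominating-successor map built above can be modeled outside of LLVM to see why a single lookup is enough to decide whether a block needs cloning. The following is a minimal sketch, assuming a toy dominator tree stored as child lists. `Block`, `DomTree`, `buildDominatingSucc`, and `skipBlock` are hypothetical stand-ins, and the sketch takes as given that the listed successors already passed the "edge dominates its target" check performed in the patch.

```cpp
#include <cassert>
#include <map>
#include <vector>

using Block = int;
// Toy dominator tree: each block maps to its immediate dominator-tree children.
using DomTree = std::map<Block, std::vector<Block>>;

// For every successor whose incoming edge dominates it, record that successor
// for each block in its dominator subtree.
std::map<Block, Block>
buildDominatingSucc(const DomTree &DT, const std::vector<Block> &Succs) {
  std::map<Block, Block> DominatingSucc;
  for (Block Succ : Succs) {
    std::vector<Block> Worklist{Succ};
    while (!Worklist.empty()) {
      Block B = Worklist.back();
      Worklist.pop_back();
      // Subtrees rooted at distinct successors should not overlap.
      assert(!DominatingSucc.count(B) && "block claimed by two successors");
      DominatingSucc[B] = Succ;
      auto It = DT.find(B);
      if (It != DT.end())
        for (Block Child : It->second)
          Worklist.push_back(Child);
    }
  }
  return DominatingSucc;
}

// A block can be skipped when cloning for `UnswitchedSucc` if it is dominated
// by some *other* successor: the clone can never reach it.
bool skipBlock(const std::map<Block, Block> &DominatingSucc, Block BB,
               Block UnswitchedSucc) {
  auto It = DominatingSucc.find(BB);
  return It != DominatingSucc.end() && It->second != UnswitchedSucc;
}
```

Blocks that are absent from the map, such as a merge point reachable from several successors, are never skipped, so they are cloned for every unswitched successor.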
16961724 // Split the preheader, so that we know that there is a safe place to insert
16971725 // the conditional branch. We will change the preheader to have a conditional
17011729 BasicBlock *SplitBB = L.getLoopPreheader();
17021730 BasicBlock *LoopPH = SplitEdge(SplitBB, L.getHeader(), &DT, &LI);
17031731
1704 // Keep a mapping for the cloned values.
1705 ValueToValueMapTy VMap;
1706
17071732 // Keep track of the dominator tree updates needed.
17081733 SmallVector DTUpdates;
17091734
1710 // Build the cloned blocks from the loop.
1711 auto *ClonedPH = buildClonedLoopBlocks(
1712 L, LoopPH, SplitBB, ExitBlocks, ParentBB, UnswitchedSuccBB,
1713 ContinueSuccBB, SkippedLoopAndExitBlocks, VMap, DTUpdates, AC, DT, LI);
1735 // Clone the loop for each unswitched successor.
1736 SmallVector<std::unique_ptr<ValueToValueMapTy>, 4> VMaps;
1737 VMaps.reserve(UnswitchedSuccBBs.size());
1738 SmallDenseMap ClonedPHs;
1739 for (auto *SuccBB : UnswitchedSuccBBs) {
1740 VMaps.emplace_back(new ValueToValueMapTy());
1741 ClonedPHs[SuccBB] = buildClonedLoopBlocks(
1742 L, LoopPH, SplitBB, ExitBlocks, ParentBB, SuccBB, RetainedSuccBB,
1743 DominatingSucc, *VMaps.back(), DTUpdates, AC, DT, LI);
1744 }
17141745
17151746 // The stitching of the branched code back together depends on whether we're
17161747 // doing full unswitching or not with the exception that we always want to
17171748 // nuke the initial terminator placed in the split block.
17181749 SplitBB->getTerminator()->eraseFromParent();
17191750 if (FullUnswitch) {
1720 // Remove the parent as a predecessor of the
1721 // unswitched successor.
1722 UnswitchedSuccBB->removePredecessor(ParentBB,
1723 /*DontDeleteUselessPHIs*/ true);
1724 DTUpdates.push_back({DominatorTree::Delete, ParentBB, UnswitchedSuccBB});
1725
1726 // Now splice the branch from the original loop and use it to select between
1727 // the two loops.
1728 SplitBB->getInstList().splice(SplitBB->end(), ParentBB->getInstList(), BI);
1729 BI.setSuccessor(ClonedSucc, ClonedPH);
1730 BI.setSuccessor(1 - ClonedSucc, LoopPH);
1751 for (BasicBlock *SuccBB : UnswitchedSuccBBs) {
1752 // Remove the parent as a predecessor of the unswitched successor.
1753 SuccBB->removePredecessor(ParentBB,
1754 /*DontDeleteUselessPHIs*/ true);
1755 DTUpdates.push_back({DominatorTree::Delete, ParentBB, SuccBB});
1756 }
1757
1758 // Now splice the terminator from the original loop and rewrite its
1759 // successors.
1760 SplitBB->getInstList().splice(SplitBB->end(), ParentBB->getInstList(), TI);
1761 if (BI) {
1762 assert(UnswitchedSuccBBs.size() == 1 &&
1763 "Only one possible unswitched block for a branch!");
1764 BasicBlock *ClonedPH = ClonedPHs.begin()->second;
1765 BI->setSuccessor(ClonedSucc, ClonedPH);
1766 BI->setSuccessor(1 - ClonedSucc, LoopPH);
1767 DTUpdates.push_back({DominatorTree::Insert, SplitBB, ClonedPH});
1768 } else {
1769 assert(SI && "Must either be a branch or switch!");
1770
1771 // Walk the cases and directly update their successors.
1772 for (auto &Case : SI->cases())
1773 Case.setSuccessor(ClonedPHs.find(Case.getCaseSuccessor())->second);
1774 // We need to use the set to populate domtree updates as even when there
1775 // are multiple cases pointing at the same successor we only want to
1776 // insert one edge in the domtree.
1777 for (BasicBlock *SuccBB : UnswitchedSuccBBs)
1778 DTUpdates.push_back(
1779 {DominatorTree::Insert, SplitBB, ClonedPHs.find(SuccBB)->second});
1780
1781 SI->setDefaultDest(LoopPH);
1782 }
17311783
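The switch branch above has to handle several cases that share one successor: every case is re-pointed, but only one dominator tree edge is inserted per distinct successor. The sketch below models that bookkeeping with standard containers. `ToySwitch` and `repointCases` are hypothetical, block names are plain strings, and the `.us` suffix for clones is only an illustrative naming choice borrowed from the test expectations.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy switch: (case value, successor name) pairs plus a default destination.
struct ToySwitch {
  std::vector<std::pair<int, std::string>> Cases;
  std::string DefaultDest;
};

// Re-points every case at the clone for its successor and the default at the
// retained loop's preheader. Returns the targets of the edges that would be
// inserted into the dominator tree, one per unique unswitched successor.
std::vector<std::string> repointCases(ToySwitch &SI, const std::string &LoopPH) {
  // Unique successors in first-seen order, each mapped to its cloned preheader.
  std::vector<std::string> UniqueSuccs;
  std::map<std::string, std::string> ClonedPHs;
  for (const auto &Case : SI.Cases)
    if (!ClonedPHs.count(Case.second)) {
      UniqueSuccs.push_back(Case.second);
      ClonedPHs[Case.second] = Case.second + ".us"; // Illustrative clone name.
    }

  // The retained (default) successor must not also be an unswitched one.
  assert(!ClonedPHs.count(SI.DefaultDest) &&
         "Should not unswitch the same successor we are retaining!");

  // Every case is updated, including cases that share a successor.
  for (auto &Case : SI.Cases)
    Case.second = ClonedPHs.at(Case.second);

  // Edges come from the unique set, so shared successors yield one edge.
  std::vector<std::string> InsertedEdges;
  for (const std::string &Succ : UniqueSuccs)
    InsertedEdges.push_back(ClonedPHs.at(Succ));

  SI.DefaultDest = LoopPH;
  return InsertedEdges;
}
```

Walking the cases directly for the edge list would report a duplicate edge whenever two cases share a target, which is the situation the comment in the patch warns about.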
17321784 // Create a new unconditional branch to the continuing block (as opposed to
17331785 // the one cloned).
1734 BranchInst::Create(ContinueSuccBB, ParentBB);
1786 BranchInst::Create(RetainedSuccBB, ParentBB);
17351787 } else {
1788 assert(BI && "Only branches have partial unswitching.");
1789 assert(UnswitchedSuccBBs.size() == 1 &&
1790 "Only one possible unswitched block for a branch!");
1791 BasicBlock *ClonedPH = ClonedPHs.begin()->second;
17361792 // When doing a partial unswitch, we have to do a bit more work to build up
17371793 // the branch in the split block.
17381794 buildPartialUnswitchConditionalBranch(*SplitBB, Invariants, Direction,
17391795 *ClonedPH, *LoopPH);
1740 }
1741
1742 // Before we update the dominator tree, collect the dead blocks if we're going
1743 // to end up deleting the unswitched successor.
1744 SmallVector DeadBlocks;
1745 if (DeleteUnswitchedSucc) {
1746 DeadBlocks.push_back(UnswitchedSuccBB);
1747 for (int i = 0; i < (int)DeadBlocks.size(); ++i) {
1748 // If we reach an exit block, stop recursing as the unswitched loop will
1749 // end up reaching the merge block which we make the successor of the
1750 // exit.
1751 if (ExitBlockSet.count(DeadBlocks[i]))
1752 continue;
1753
1754 // Insert the children that are within the loop or exit block set. Other
1755 // children may reach out of the loop. While we don't expect these to be
1756 // dead (as the unswitched clone should reach them) we don't try to prove
1757 // that here.
1758 for (DomTreeNode *ChildN : *DT[DeadBlocks[i]])
1759 if (L.contains(ChildN->getBlock()) ||
1760 ExitBlockSet.count(ChildN->getBlock()))
1761 DeadBlocks.push_back(ChildN->getBlock());
1762 }
1763 }
1764
1765 // Add the remaining edge to our updates and apply them to get an up-to-date
1766 // dominator tree. Note that this will cause the dead blocks above to be
1767 // unreachable and no longer in the dominator tree.
1768 DTUpdates.push_back({DominatorTree::Insert, SplitBB, ClonedPH});
1796 DTUpdates.push_back({DominatorTree::Insert, SplitBB, ClonedPH});
1797 }
1798
1799 // Apply the updates accumulated above to get an up-to-date dominator tree.
17691800 DT.applyUpdates(DTUpdates);
1801
1802 // Now that we have an accurate dominator tree, first delete the dead cloned
1803 // blocks so that we can accurately build any cloned loops. It is important to
1804 // not delete the blocks from the original loop yet because we still want to
1805 // reference the original loop to understand the cloned loop's structure.
1806 deleteDeadClonedBlocks(L, ExitBlocks, VMaps, DT);
17701807
17711808 // Build the cloned loop structure itself. This may be substantially
17721809 // different from the original structure due to the simplified CFG. This also
17731810 // handles inserting all the cloned blocks into the correct loops.
17741811 SmallVector NonChildClonedLoops;
1775 buildClonedLoops(L, ExitBlocks, VMap, LI, NonChildClonedLoops);
1776
1777 // Delete anything that was made dead in the original loop due to
1778 // unswitching.
1779 if (!DeadBlocks.empty())
1780 deleteDeadBlocksFromLoop(L, DeadBlocks, ExitBlocks, DT, LI);
1781
1812 for (std::unique_ptr<ValueToValueMapTy> &VMap : VMaps)
1813 buildClonedLoops(L, ExitBlocks, *VMap, LI, NonChildClonedLoops);
1814
1815 // Now that our cloned loops have been built, we can update the original loop.
1816 // First we delete the dead blocks from it and then we rebuild the loop
1817 // structure taking these deletions into account.
1818 deleteDeadBlocksFromLoop(L, ExitBlocks, DT, LI);
17821819 SmallVector HoistedLoops;
17831820 bool IsStillLoop = rebuildLoopAfterUnswitch(L, ExitBlocks, LI, HoistedLoops);
17841821
17891826 // verification steps.
17901827 assert(DT.verify(DominatorTree::VerificationLevel::Fast));
17911828
1792 // Now we want to replace all the uses of the invariants within both the
1793 // original and cloned blocks. We do this here so that we can use the now
1794 // updated dominator tree to identify which side the users are on.
1795 ConstantInt *UnswitchedReplacement =
1796 Direction ? ConstantInt::getTrue(BI.getContext())
1797 : ConstantInt::getFalse(BI.getContext());
1798 ConstantInt *ContinueReplacement =
1799 Direction ? ConstantInt::getFalse(BI.getContext())
1800 : ConstantInt::getTrue(BI.getContext());
1801 for (Value *Invariant : Invariants)
1802 for (auto UI = Invariant->use_begin(), UE = Invariant->use_end();
1803 UI != UE;) {
1804 // Grab the use and walk past it so we can clobber it in the use list.
1805 Use *U = &*UI++;
1806 Instruction *UserI = dyn_cast<Instruction>(U->getUser());
1807 if (!UserI)
1808 continue;
1809
1810 // Replace it with the 'continue' side if in the main loop body, and the
1811 // unswitched if in the cloned blocks.
1812 if (DT.dominates(LoopPH, UserI->getParent()))
1813 U->set(ContinueReplacement);
1814 else if (DT.dominates(ClonedPH, UserI->getParent()))
1815 U->set(UnswitchedReplacement);
1816 }
1829 if (BI) {
1830 // If we unswitched a branch which collapses the condition to a known
1831 // constant we want to replace all the uses of the invariants within both
1832 // the original and cloned blocks. We do this here so that we can use the
1833 // now updated dominator tree to identify which side the users are on.
1834 assert(UnswitchedSuccBBs.size() == 1 &&
1835 "Only one possible unswitched block for a branch!");
1836 BasicBlock *ClonedPH = ClonedPHs.begin()->second;
1837 ConstantInt *UnswitchedReplacement =
1838 Direction ? ConstantInt::getTrue(BI->getContext())
1839 : ConstantInt::getFalse(BI->getContext());
1840 ConstantInt *ContinueReplacement =
1841 Direction ? ConstantInt::getFalse(BI->getContext())
1842 : ConstantInt::getTrue(BI->getContext());
1843 for (Value *Invariant : Invariants)
1844 for (auto UI = Invariant->use_begin(), UE = Invariant->use_end();
1845 UI != UE;) {
1846 // Grab the use and walk past it so we can clobber it in the use list.
1847 Use *U = &*UI++;
1848 Instruction *UserI = dyn_cast<Instruction>(U->getUser());
1849 if (!UserI)
1850 continue;
1851
1852 // Replace it with the 'continue' side if in the main loop body, and the
1853 // unswitched if in the cloned blocks.
1854 if (DT.dominates(LoopPH, UserI->getParent()))
1855 U->set(ContinueReplacement);
1856 else if (DT.dominates(ClonedPH, UserI->getParent()))
1857 U->set(UnswitchedReplacement);
1858 }
1859 }
18171860
18181861 // We can change which blocks are exit blocks of all the cloned sibling
18191862 // loops, the current loop, and any parent loops which shared exit blocks
19361979 if (LI.getLoopFor(BB) != &L)
19371980 continue;
19381981
1982 if (auto *SI = dyn_cast<SwitchInst>(BB->getTerminator())) {
1983 // We can only consider fully loop-invariant switch conditions as we need
1984 // to completely eliminate the switch after unswitching.
1985 if (!isa<Constant>(SI->getCondition()) &&
1986 L.isLoopInvariant(SI->getCondition()))
1987 UnswitchCandidates.push_back({SI, {SI->getCondition()}});
1988 continue;
1989 }
1990
19391991 auto *BI = dyn_cast<BranchInst>(BB->getTerminator());
1940 // FIXME: Handle switches here!
19411992 if (!BI || !BI->isConditional() || isa<Constant>(BI->getCondition()) ||
19421993 BI->getSuccessor(0) == BI->getSuccessor(1))
19431994 continue;
20902141 TerminatorInst &TI = *TerminatorAndInvariants.first;
20912142 ArrayRef Invariants = TerminatorAndInvariants.second;
20922143 BranchInst *BI = dyn_cast<BranchInst>(&TI);
2093 int CandidateCost =
2094 ComputeUnswitchedCost(TI, /*FullUnswitch*/ Invariants.size() == 1 && BI &&
2095 Invariants[0] == BI->getCondition());
2144 int CandidateCost = ComputeUnswitchedCost(
2145 TI, /*FullUnswitch*/ !BI || (Invariants.size() == 1 &&
2146 Invariants[0] == BI->getCondition()));
20962147 LLVM_DEBUG(dbgs() << " Computed cost of " << CandidateCost
20972148 << " for unswitch candidate: " << TI << "\n");
20982149 if (!BestUnswitchTI || CandidateCost < BestUnswitchCost) {
21082159 return false;
21092160 }
21102161
2111 auto *UnswitchBI = dyn_cast<BranchInst>(BestUnswitchTI);
2112 if (!UnswitchBI) {
2113 // FIXME: Add support for unswitching a switch here!
2114 LLVM_DEBUG(dbgs() << "Cannot unswitch anything but a branch!\n");
2115 return false;
2116 }
2117
21182162 LLVM_DEBUG(dbgs() << " Trying to unswitch non-trivial (cost = "
2119 << BestUnswitchCost << ") branch: " << *UnswitchBI << "\n");
2120 return unswitchInvariantBranch(L, *UnswitchBI, BestUnswitchInvariants, DT, LI,
2121 AC, UnswitchCB);
2163 << BestUnswitchCost << ") terminator: " << *BestUnswitchTI
2164 << "\n");
2165 return unswitchNontrivialInvariants(
2166 L, *BestUnswitchTI, BestUnswitchInvariants, DT, LI, AC, UnswitchCB);
21222167 }
21232168
21242169 /// Unswitch control flow predicated on loop invariant conditions.
386386 loop_b:
387387 %b = load i32, i32* %b.ptr
388388 br i1 %v, label %loop_begin, label %loop_exit
389 ; The 'loop_b' unswitched loop.
389 ; The original loop, now non-looping due to unswitching.
390390 ;
391391 ; CHECK: entry.split:
392392 ; CHECK-NEXT: br label %loop_begin
397397 ; CHECK-NEXT: br label %loop_exit.split
398398 ;
399399 ; CHECK: loop_exit.split:
400 ; CHECK-NEXT: %[[A_LCSSA:.*]] = phi i32 [ %[[A]], %loop_begin ]
401400 ; CHECK-NEXT: br label %loop_exit
402401
403402 loop_exit:
404403 %ab.phi = phi i32 [ %b, %loop_b ], [ %a, %loop_begin ]
405404 ret i32 %ab.phi
406405 ; CHECK: loop_exit:
407 ; CHECK-NEXT: %[[AB_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %loop_exit.split ], [ %[[B_LCSSA]], %loop_exit.split.us ]
406 ; CHECK-NEXT: %[[AB_PHI:.*]] = phi i32 [ %[[A]], %loop_exit.split ], [ %[[B_LCSSA]], %loop_exit.split.us ]
408407 ; CHECK-NEXT: ret i32 %[[AB_PHI]]
409408 }
410409
457456 call void @sink1(i32 %a.phi)
458457 ret void
459458 ; CHECK: loop_exit1:
460 ; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %loop_exit1.split.us ]
461 ; CHECK-NEXT: call void @sink1(i32 %[[A_PHI]])
459 ; CHECK-NEXT: call void @sink1(i32 %[[A_LCSSA]])
462460 ; CHECK-NEXT: ret void
463461
464462 loop_exit2:
466464 call void @sink2(i32 %b.phi)
467465 ret void
468466 ; CHECK: loop_exit2:
469 ; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B]], %loop_b ]
470 ; CHECK-NEXT: call void @sink2(i32 %[[B_PHI]])
467 ; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B]], %loop_b ]
468 ; CHECK-NEXT: call void @sink2(i32 %[[B_LCSSA]])
471469 ; CHECK-NEXT: ret void
472470 }
473471
530528 call void @sink2(i32 %b.phi)
531529 ret void
532530 ; CHECK: loop_exit2:
533 ; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B_LCSSA]], %loop_exit2.split.us ]
534 ; CHECK-NEXT: call void @sink2(i32 %[[B_PHI]])
531 ; CHECK-NEXT: call void @sink2(i32 %[[B_LCSSA]])
535532 ; CHECK-NEXT: ret void
536533 }
537534
586583 call void @sink1(i32 %a.phi)
587584 br label %exit
588585 ; CHECK: loop_exit1:
589 ; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %loop_exit1.split.us ]
590 ; CHECK-NEXT: call void @sink1(i32 %[[A_PHI]])
586 ; CHECK-NEXT: call void @sink1(i32 %[[A_LCSSA]])
591587 ; CHECK-NEXT: br label %exit
592588
593589 loop_exit2:
595591 call void @sink2(i32 %b.phi)
596592 br label %exit
597593 ; CHECK: loop_exit2:
598 ; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B]], %loop_b ]
599 ; CHECK-NEXT: call void @sink2(i32 %[[B_PHI]])
594 ; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B]], %loop_b ]
595 ; CHECK-NEXT: call void @sink2(i32 %[[B_LCSSA]])
600596 ; CHECK-NEXT: br label %exit
601597
602598 exit:
662658 %v2 = load i1, i1* %ptr
663659 br i1 %v2, label %loop_begin, label %loop_exit
664660 ; CHECK: loop_latch:
665 ; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B]], %inner_loop_b ]
661 ; CHECK-NEXT: %[[B_INNER_LCSSA:.*]] = phi i32 [ %[[B]], %inner_loop_b ]
666662 ; CHECK-NEXT: %[[V2:.*]] = load i1, i1* %ptr
667663 ; CHECK-NEXT: br i1 %[[V2]], label %loop_begin, label %loop_exit.loopexit1
668664
670666 %ab.phi = phi i32 [ %a, %inner_loop_begin ], [ %b.phi, %loop_latch ]
671667 ret i32 %ab.phi
672668 ; CHECK: loop_exit.loopexit:
673 ; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %loop_exit.loopexit.split.us ]
674669 ; CHECK-NEXT: br label %loop_exit
675670 ;
676671 ; CHECK: loop_exit.loopexit1:
677 ; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B_LCSSA]], %loop_latch ]
672 ; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B_INNER_LCSSA]], %loop_latch ]
678673 ; CHECK-NEXT: br label %loop_exit
679674 ;
680675 ; CHECK: loop_exit:
681 ; CHECK-NEXT: %[[AB_PHI:.*]] = phi i32 [ %[[A_PHI]], %loop_exit.loopexit ], [ %[[B_PHI]], %loop_exit.loopexit1 ]
676 ; CHECK-NEXT: %[[AB_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %loop_exit.loopexit ], [ %[[B_LCSSA]], %loop_exit.loopexit1 ]
682677 ; CHECK-NEXT: ret i32 %[[AB_PHI]]
683678 }
684679
772767 ; CHECK-NEXT: br label %latch
773768 ;
774769 ; CHECK: latch:
775 ; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B_INNER_LCSSA]], %loop_b_inner_exit ]
776770 ; CHECK-NEXT: br i1 %[[V]], label %loop_begin, label %loop_exit.split
777771 ;
778772 ; CHECK: loop_exit.split:
779 ; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B_PHI]], %latch ]
773 ; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B_INNER_LCSSA]], %latch ]
780774 ; CHECK-NEXT: br label %loop_exit
781775
782776 loop_exit:
1465 1459   %v = load i1, i1* %ptr
1466 1460   br i1 %v, label %loop_begin, label %loop_exit
1467 1461   ; CHECK: inner_loop_exit:
1468      - ; CHECK-NEXT: %[[A_INNER_LCSSA:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]], %inner_loop_exit.split.us ]
1469 1462   ; CHECK-NEXT: %[[V:.*]] = load i1, i1* %ptr
1470 1463   ; CHECK-NEXT: br i1 %[[V]], label %loop_begin, label %loop_exit
1471 1464
1473 1466   %a.lcssa = phi i32 [ %a.inner_lcssa, %inner_loop_exit ]
1474 1467   ret i32 %a.lcssa
1475 1468   ; CHECK: loop_exit:
1476      - ; CHECK-NEXT: %[[A_LCSSA:.*]] = phi i32 [ %[[A_INNER_LCSSA]], %inner_loop_exit ]
     1469 + ; CHECK-NEXT: %[[A_LCSSA:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]], %inner_loop_exit ]
1477 1470   ; CHECK-NEXT: ret i32 %[[A_LCSSA]]
1478 1471   }
1479 1472
1554 1547   ret i32 %a.lcssa
1555 1548   ; CHECK: loop_exit:
1556 1549   ; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %loop_exit.split ], [ %[[A_PHI_US]], %loop_exit.split.us ]
1557      - ; CHECK-NEXT: ret i32 %[[AB_PHI]]
     1550 + ; CHECK-NEXT: ret i32 %[[A_PHI]]
1558 1551   }
1559 1552
1560 1553   ; Test that requires re-forming dedicated exits for the original loop.
1634 1627   ret i32 %a.lcssa
1635 1628   ; CHECK: loop_exit:
1636 1629   ; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_PHI_SPLIT]], %loop_exit.split ], [ %[[A_LCSSA_US]], %loop_exit.split.us ]
1637      - ; CHECK-NEXT: ret i32 %[[AB_PHI]]
     1630 + ; CHECK-NEXT: ret i32 %[[A_PHI]]
1638 1631   }
1639 1632
1640 1633   ; Check that if a cloned inner loop after unswitching doesn't loop and directly
1720 1713   %a.lcssa = phi i32 [ %a, %inner_loop_begin ], [ %a.inner_lcssa, %inner_loop_exit ]
1721 1714   ret i32 %a.lcssa
1722 1715   ; CHECK: loop_exit.loopexit:
1723      - ; CHECK-NEXT: %[[A_LCSSA_US:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]], %loop_exit.loopexit.split.us ]
1724 1716   ; CHECK-NEXT: br label %loop_exit
1725 1717   ;
1726 1718   ; CHECK: loop_exit.loopexit1:
1728 1720   ; CHECK-NEXT: br label %loop_exit
1729 1721   ;
1730 1722   ; CHECK: loop_exit:
1731      - ; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA_US]], %loop_exit.loopexit ], [ %[[A_LCSSA]], %loop_exit.loopexit1 ]
     1723 + ; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]], %loop_exit.loopexit ], [ %[[A_LCSSA]], %loop_exit.loopexit1 ]
1732 1724   ; CHECK-NEXT: ret i32 %[[A_PHI]]
1733 1725   }
1734 1726
1801 1793   %v3 = load i1, i1* %ptr
1802 1794   br i1 %v3, label %loop_latch, label %loop_exit
1803 1795   ; CHECK: inner_loop_exit:
1804      - ; CHECK-NEXT: %[[A_INNER_PHI:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]], %inner_loop_exit.split.us ]
1805 1796   ; CHECK-NEXT: %[[V:.*]] = load i1, i1* %ptr
1806 1797   ; CHECK-NEXT: br i1 %[[V]], label %loop_latch, label %loop_exit.loopexit1
1807 1798
1818 1809   ; CHECK-NEXT: br label %loop_exit
1819 1810   ;
1820 1811   ; CHECK: loop_exit.loopexit1:
1821      - ; CHECK-NEXT: %[[A_LCSSA_US:.*]] = phi i32 [ %[[A_INNER_PHI]], %inner_loop_exit ]
     1812 + ; CHECK-NEXT: %[[A_LCSSA_US:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]], %inner_loop_exit ]
1822 1813   ; CHECK-NEXT: br label %loop_exit
1823 1814   ;
1824 1815   ; CHECK: loop_exit:
1915 1906   %v4 = load i1, i1* %ptr
1916 1907   br i1 %v4, label %loop_begin, label %loop_exit
1917 1908   ; CHECK: inner_loop_exit.loopexit:
1918      - ; CHECK-NEXT: %[[A_INNER_LCSSA_US:.*]] = phi i32 [ %[[A_INNER_INNER_LCSSA_US]], %inner_loop_exit.loopexit.split.us ]
1919 1909   ; CHECK-NEXT: br label %inner_loop_exit
1920 1910   ;
1921 1911   ; CHECK: inner_loop_exit.loopexit1:
1923 1913   ; CHECK-NEXT: br label %inner_loop_exit
1924 1914   ;
1925 1915   ; CHECK: inner_loop_exit:
1926      - ; CHECK-NEXT: %[[A_INNER_PHI:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]], %inner_loop_exit.loopexit ], [ %[[A_INNER_LCSSA]], %inner_loop_exit.loopexit1 ]
     1916 + ; CHECK-NEXT: %[[A_INNER_PHI:.*]] = phi i32 [ %[[A_INNER_INNER_LCSSA_US]], %inner_loop_exit.loopexit ], [ %[[A_INNER_LCSSA]], %inner_loop_exit.loopexit1 ]
1927 1917   ; CHECK-NEXT: %[[V:.*]] = load i1, i1* %ptr
1928 1918   ; CHECK-NEXT: br i1 %[[V]], label %loop_begin, label %loop_exit
1929 1919
2009 1999   %v3 = load i1, i1* %ptr
2010 2000   br i1 %v3, label %inner_loop_latch, label %inner_loop_exit
2011 2001   ; CHECK: inner_inner_loop_exit:
2012      - ; CHECK-NEXT: %[[A_INNER_INNER_PHI:.*]] = phi i32 [ %[[A_INNER_INNER_LCSSA_US]], %inner_inner_loop_exit.split.us ]
2013 2002   ; CHECK-NEXT: %[[V:.*]] = load i1, i1* %ptr
2014 2003   ; CHECK-NEXT: br i1 %[[V]], label %inner_loop_latch, label %inner_loop_exit.loopexit1
2015 2004
2027 2016   ; CHECK-NEXT: br label %inner_loop_exit
2028 2017   ;
2029 2018   ; CHECK: inner_loop_exit.loopexit1:
2030      - ; CHECK-NEXT: %[[A_INNER_LCSSA_US:.*]] = phi i32 [ %[[A_INNER_INNER_PHI]], %inner_inner_loop_exit ]
     2019 + ; CHECK-NEXT: %[[A_INNER_LCSSA_US:.*]] = phi i32 [ %[[A_INNER_INNER_LCSSA_US]], %inner_inner_loop_exit ]
2031 2020   ; CHECK-NEXT: br label %inner_loop_exit
2032 2021   ;
2033 2022   ; CHECK: inner_loop_exit:
2295 2284   entry:
2296 2285   br label %loop_begin
2297 2286   ; CHECK-NEXT: entry:
2298      - ; CHECK-NEXT: br label %loop_begin
     2287 + ; CHECK-NEXT: switch i32 %cond2, label %[[ENTRY_SPLIT_EXIT:.*]] [
     2288 + ; CHECK-NEXT: i32 0, label %[[ENTRY_SPLIT_A:.*]]
     2289 + ; CHECK-NEXT: i32 1, label %[[ENTRY_SPLIT_A]]
     2290 + ; CHECK-NEXT: i32 13, label %[[ENTRY_SPLIT_B:.*]]
     2291 + ; CHECK-NEXT: i32 2, label %[[ENTRY_SPLIT_A]]
     2292 + ; CHECK-NEXT: i32 42, label %[[ENTRY_SPLIT_C:.*]]
     2293 + ; CHECK-NEXT: ]
2299 2294
2300 2295   loop_begin:
2301 2296   %var_val = load i32, i32* %var
2302      - switch i32 %cond2, label %loop_a [
2303      - i32 0, label %loop_b
2304      - i32 1, label %loop_b
2305      - i32 13, label %loop_c
2306      - i32 2, label %loop_b
2307      - i32 42, label %loop_exit
     2297 + switch i32 %cond2, label %loop_exit [
     2298 + i32 0, label %loop_a
     2299 + i32 1, label %loop_a
     2300 + i32 13, label %loop_b
     2301 + i32 2, label %loop_a
     2302 + i32 42, label %loop_c
2308 2303   ]
2309      - ; CHECK: loop_begin:
2310      - ; CHECK-NEXT: %[[V:.*]] = load i32, i32* %var
2311      - ; CHECK-NEXT: switch i32 %cond2, label %loop_a [
2312      - ; CHECK-NEXT: i32 0, label %loop_b
2313      - ; CHECK-NEXT: i32 1, label %loop_b
2314      - ; CHECK-NEXT: i32 13, label %loop_c
2315      - ; CHECK-NEXT: i32 2, label %loop_b
2316      - ; CHECK-NEXT: i32 42, label %loop_exit
2317      - ; CHECK-NEXT: ]
2318 2304
2319 2305   loop_a:
2320 2306   call void @a()
2321 2307   br label %loop_latch
2322      - ; CHECK: loop_a:
     2308 + ; Unswitched 'a' loop.
     2309 + ;
     2310 + ; CHECK: [[ENTRY_SPLIT_A]]:
     2311 + ; CHECK-NEXT: br label %[[LOOP_BEGIN_A:.*]]
     2312 + ;
     2313 + ; CHECK: [[LOOP_BEGIN_A]]:
     2314 + ; CHECK-NEXT: %{{.*}} = load i32, i32* %var
     2315 + ; CHECK-NEXT: br label %[[LOOP_A:.*]]
     2316 + ;
     2317 + ; CHECK: [[LOOP_A]]:
2323 2318   ; CHECK-NEXT: call void @a()
2324      - ; CHECK-NEXT: br label %loop_latch
     2319 + ; CHECK-NEXT: br label %[[LOOP_LATCH_A:.*]]
     2320 + ;
     2321 + ; CHECK: [[LOOP_LATCH_A]]:
     2322 + ; CHECK: br label %[[LOOP_BEGIN_A]]
2325 2323
2326 2324   loop_b:
2327 2325   call void @b()
2328 2326   br label %loop_latch
2329      - ; CHECK: loop_b:
     2327 + ; Unswitched 'b' loop.
     2328 + ;
     2329 + ; CHECK: [[ENTRY_SPLIT_B]]:
     2330 + ; CHECK-NEXT: br label %[[LOOP_BEGIN_B:.*]]
     2331 + ;
     2332 + ; CHECK: [[LOOP_BEGIN_B]]:
     2333 + ; CHECK-NEXT: %{{.*}} = load i32, i32* %var
     2334 + ; CHECK-NEXT: br label %[[LOOP_B:.*]]
     2335 + ;
     2336 + ; CHECK: [[LOOP_B]]:
2330 2337   ; CHECK-NEXT: call void @b()
2331      - ; CHECK-NEXT: br label %loop_latch
     2338 + ; CHECK-NEXT: br label %[[LOOP_LATCH_B:.*]]
     2339 + ;
     2340 + ; CHECK: [[LOOP_LATCH_B]]:
     2341 + ; CHECK: br label %[[LOOP_BEGIN_B]]
2332 2342
2333 2343   loop_c:
2334 2344   call void @c() noreturn nounwind
2335 2345   br label %loop_latch
2336      - ; CHECK: loop_c:
     2346 + ; Unswitched 'c' loop.
     2347 + ;
     2348 + ; CHECK: [[ENTRY_SPLIT_C]]:
     2349 + ; CHECK-NEXT: br label %[[LOOP_BEGIN_C:.*]]
     2350 + ;
     2351 + ; CHECK: [[LOOP_BEGIN_C]]:
     2352 + ; CHECK-NEXT: %{{.*}} = load i32, i32* %var
     2353 + ; CHECK-NEXT: br label %[[LOOP_C:.*]]
     2354 + ;
     2355 + ; CHECK: [[LOOP_C]]:
2337 2356   ; CHECK-NEXT: call void @c()
2338      - ; CHECK-NEXT: br label %loop_latch
     2357 + ; CHECK-NEXT: br label %[[LOOP_LATCH_C:.*]]
     2358 + ;
     2359 + ; CHECK: [[LOOP_LATCH_C]]:
     2360 + ; CHECK: br label %[[LOOP_BEGIN_C]]
2339 2361
2340 2362   loop_latch:
2341 2363   br label %loop_begin
2342      - ; CHECK: loop_latch:
2343      - ; CHECK-NEXT: br label %loop_begin
2344 2364
2345 2365   loop_exit:
2346 2366   %lcssa = phi i32 [ %var_val, %loop_begin ]
2347 2367   ret i32 %lcssa
     2368 + ; Unswitched exit edge (no longer a loop).
     2369 + ;
     2370 + ; CHECK: [[ENTRY_SPLIT_EXIT]]:
     2371 + ; CHECK-NEXT: br label %loop_begin
     2372 + ;
     2373 + ; CHECK: loop_begin:
     2374 + ; CHECK-NEXT: %[[V:.*]] = load i32, i32* %var
     2375 + ; CHECK-NEXT: br label %loop_exit
     2376 + ;
2348 2377   ; CHECK: loop_exit:
2349 2378   ; CHECK-NEXT: %[[LCSSA:.*]] = phi i32 [ %[[V]], %loop_begin ]
2350 2379   ; CHECK-NEXT: ret i32 %[[LCSSA]]
2823 2852   ; CHECK: loop_exit:
2824 2853   ; CHECK-NEXT: ret
2825 2854   }
     2855 +
     2856 + ; Non-trivial unswitching of a switch.
     2857 + define i32 @test27(i1* %ptr, i32 %cond) {
     2858 + ; CHECK-LABEL: @test27(
     2859 + entry:
     2860 + br label %loop_begin
     2861 + ; CHECK-NEXT: entry:
     2862 + ; CHECK-NEXT: switch i32 %cond, label %[[ENTRY_SPLIT_LATCH:.*]] [
     2863 + ; CHECK-NEXT: i32 0, label %[[ENTRY_SPLIT_A:.*]]
     2864 + ; CHECK-NEXT: i32 1, label %[[ENTRY_SPLIT_B:.*]]
     2865 + ; CHECK-NEXT: i32 2, label %[[ENTRY_SPLIT_C:.*]]
     2866 + ; CHECK-NEXT: ]
     2867 +
     2868 + loop_begin:
     2869 + switch i32 %cond, label %latch [
     2870 + i32 0, label %loop_a
     2871 + i32 1, label %loop_b
     2872 + i32 2, label %loop_c
     2873 + ]
     2874 +
     2875 + loop_a:
     2876 + call void @a()
     2877 + br label %latch
     2878 + ; Unswitched 'a' loop.
     2879 + ;
     2880 + ; CHECK: [[ENTRY_SPLIT_A]]:
     2881 + ; CHECK-NEXT: br label %[[LOOP_BEGIN_A:.*]]
     2882 + ;
     2883 + ; CHECK: [[LOOP_BEGIN_A]]:
     2884 + ; CHECK-NEXT: br label %[[LOOP_A:.*]]
     2885 + ;
     2886 + ; CHECK: [[LOOP_A]]:
     2887 + ; CHECK-NEXT: call void @a()
     2888 + ; CHECK-NEXT: br label %[[LOOP_LATCH_A:.*]]
     2889 + ;
     2890 + ; CHECK: [[LOOP_LATCH_A]]:
     2891 + ; CHECK-NEXT: %[[V_A:.*]] = load i1, i1* %ptr
     2892 + ; CHECK: br i1 %[[V_A]], label %[[LOOP_BEGIN_A]], label %[[LOOP_EXIT_A:.*]]
     2893 + ;
     2894 + ; CHECK: [[LOOP_EXIT_A]]:
     2895 + ; CHECK-NEXT: br label %loop_exit
     2896 +
     2897 + loop_b:
     2898 + call void @b()
     2899 + br label %latch
     2900 + ; Unswitched 'b' loop.
     2901 + ;
     2902 + ; CHECK: [[ENTRY_SPLIT_B]]:
     2903 + ; CHECK-NEXT: br label %[[LOOP_BEGIN_B:.*]]
     2904 + ;
     2905 + ; CHECK: [[LOOP_BEGIN_B]]:
     2906 + ; CHECK-NEXT: br label %[[LOOP_B:.*]]
     2907 + ;
     2908 + ; CHECK: [[LOOP_B]]:
     2909 + ; CHECK-NEXT: call void @b()
     2910 + ; CHECK-NEXT: br label %[[LOOP_LATCH_B:.*]]
     2911 + ;
     2912 + ; CHECK: [[LOOP_LATCH_B]]:
     2913 + ; CHECK-NEXT: %[[V_B:.*]] = load i1, i1* %ptr
     2914 + ; CHECK: br i1 %[[V_B]], label %[[LOOP_BEGIN_B]], label %[[LOOP_EXIT_B:.*]]
     2915 + ;
     2916 + ; CHECK: [[LOOP_EXIT_B]]:
     2917 + ; CHECK-NEXT: br label %loop_exit
     2918 +
     2919 + loop_c:
     2920 + call void @c()
     2921 + br label %latch
     2922 + ; Unswitched 'c' loop.
     2923 + ;
     2924 + ; CHECK: [[ENTRY_SPLIT_C]]:
     2925 + ; CHECK-NEXT: br label %[[LOOP_BEGIN_C:.*]]
     2926 + ;
     2927 + ; CHECK: [[LOOP_BEGIN_C]]:
     2928 + ; CHECK-NEXT: br label %[[LOOP_C:.*]]
     2929 + ;
     2930 + ; CHECK: [[LOOP_C]]:
     2931 + ; CHECK-NEXT: call void @c()
     2932 + ; CHECK-NEXT: br label %[[LOOP_LATCH_C:.*]]
     2933 + ;
     2934 + ; CHECK: [[LOOP_LATCH_C]]:
     2935 + ; CHECK-NEXT: %[[V_C:.*]] = load i1, i1* %ptr
     2936 + ; CHECK: br i1 %[[V_C]], label %[[LOOP_BEGIN_C]], label %[[LOOP_EXIT_C:.*]]
     2937 + ;
     2938 + ; CHECK: [[LOOP_EXIT_C]]:
     2939 + ; CHECK-NEXT: br label %loop_exit
     2940 +
     2941 + latch:
     2942 + %v = load i1, i1* %ptr
     2943 + br i1 %v, label %loop_begin, label %loop_exit
     2944 + ; Unswitched the 'latch' only loop.
     2945 + ;
     2946 + ; CHECK: [[ENTRY_SPLIT_LATCH]]:
     2947 + ; CHECK-NEXT: br label %[[LOOP_BEGIN_LATCH:.*]]
     2948 + ;
     2949 + ; CHECK: [[LOOP_BEGIN_LATCH]]:
     2950 + ; CHECK-NEXT: br label %[[LOOP_LATCH_LATCH:.*]]
     2951 + ;
     2952 + ; CHECK: [[LOOP_LATCH_LATCH]]:
     2953 + ; CHECK-NEXT: %[[V_LATCH:.*]] = load i1, i1* %ptr
     2954 + ; CHECK: br i1 %[[V_LATCH]], label %[[LOOP_BEGIN_LATCH]], label %[[LOOP_EXIT_LATCH:.*]]
     2955 + ;
     2956 + ; CHECK: [[LOOP_EXIT_LATCH]]:
     2957 + ; CHECK-NEXT: br label %loop_exit
     2958 +
     2959 + loop_exit:
     2960 + ret i32 0
     2961 + ; CHECK: loop_exit:
     2962 + ; CHECK-NEXT: ret i32 0
     2963 + }