llvm.org GIT mirror llvm / 379e7c2
Fix PR 24415 (at least), by making our post-dominator tree behavior sane. Summary: Currently, our post-dom tree tries to ignore and remove the effects of infinite loops. It fails miserably at this, because it tries to do it ahead of time, and thus can only detect self-loops, and any other type of infinite loop, it pretends doesn't exist at all. This can, in a bunch of cases, lead to wrong answers and a completely empty post-dom tree. Wrong answer: ``` declare void foo() define internal void @f() { entry: br i1 undef, label %bb35, label %bb3.i bb3.i: call void @foo() br label %bb3.i bb35.loopexit3: br label %bb35 bb35: ret void } ``` We get: ``` Inorder PostDominator Tree: [1] <<exit node>> {0,7} [2] %bb35 {1,6} [3] %bb35.loopexit3 {2,3} [3] %entry {4,5} ``` This is a trivial modification of the testcase for PR 6047 Note that we pretend bb3.i doesn't exist. We also pretend that bb35 post-dominates entry. While it's true that it does not exit in a theoretical sense, it's not really helpful to try to ignore the effect and pretend that bb35 post-dominates entry. Worse, we pretend the infinite loop does nothing (it's usually considered a side-effect), and doesn't even exist, even when it calls a function. Sadly, this makes it impossible to use when you are trying to move code safely. All compilers also create virtual or real single exit nodes (including us), and connect infinite loops there (which this patch does). In fact, others have worked around our behavior here, to the point of building their own post-dom trees: https://zneak.github.io/fcd/2016/02/17/structuring.html and pointing out the region infrastructure is near-useless for them with postdom in this state :( Completely empty post-dom tree: ``` define void @spam() #0 { bb: br label %bb1 bb1: ; preds = %bb1, %bb br label %bb1 bb2: ; No predecessors! ret void } ``` Printing analysis 'Post-Dominator Tree Construction' for function 'foo': =============================-------------------------------- Inorder PostDominator Tree: [1] <<exit node>> {0,1} :( (note that even if you ignore the effects of infinite loops, bb2 should be present as an exit node that post-dominates nothing). This patch changes post-dom to properly handle infinite loops and does root finding during calculation to prevent empty tress in such cases. We match gcc's (and the canonical theoretical) behavior for infinite loops (find the backedge, connect it to the exit block). Testcases coming as soon as i finish running this on a ton of random graphs :) Reviewers: chandlerc, davide Subscribers: bryant, llvm-commits Differential Revision: https://reviews.llvm.org/D29705 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296535 91177308-0d34-0410-b5e6-96231b3b80d8 Daniel Berlin 3 years ago
16 changed file(s) with 201 addition(s) and 109 deletion(s). Raw diff Collapse all Expand all
769769
770770 /// recalculate - compute a dominator tree for the given function
771771 template void recalculate(FT &F) {
772 typedef GraphTraits TraitsTy;
773772 reset();
774773 this->Vertex.push_back(nullptr);
775774
776775 if (!this->IsPostDominators) {
777 // Initialize root
778 NodeT *entry = TraitsTy::getEntryNode(&F);
779 addRoot(entry);
780
781776 Calculate(*this, F);
782777 } else {
783 // Initialize the roots list
784 for (auto *Node : nodes(&F))
785 if (TraitsTy::child_begin(Node) == TraitsTy::child_end(Node))
786 addRoot(Node);
787
788778 Calculate>(*this, F);
789779 }
790780 }
2424 #define LLVM_SUPPORT_GENERICDOMTREECONSTRUCTION_H
2525
2626 #include "llvm/ADT/DepthFirstIterator.h"
27 #include "llvm/ADT/PostOrderIterator.h"
2728 #include "llvm/ADT/SmallPtrSet.h"
2829 #include "llvm/Support/GenericDomTree.h"
2930
3839 df_iterator_dom_storage(BaseSet &Storage) : Storage(Storage) {}
3940
4041 typedef typename BaseSet::iterator iterator;
41 std::pair insert(NodeRef N) {
42 return Storage.insert({N, InfoType()});
42 std::pair insert(NodeRef To) {
43 auto Result = Storage.insert({To, InfoType()});
44
45 return Result;
4346 }
4447 void completed(NodeRef) {}
4548
5457 typename GraphT::NodeRef,
5558 typename DominatorTreeBaseByGraphTraits::InfoRec>
5659 DFStorage(DT.Info);
57 bool IsChildOfArtificialExit = (N != 0);
5860 for (auto I = idf_ext_begin(V, DFStorage), E = idf_ext_end(V, DFStorage);
5961 I != E; ++I) {
6062 typename GraphT::NodeRef BB = *I;
6668 if (I.getPathLength() > 1)
6769 BBInfo.Parent = DT.Info[I.getPath(I.getPathLength() - 2)].DFSNum;
6870 DT.Vertex.push_back(BB); // Vertex[n] = V;
69
70 if (IsChildOfArtificialExit)
71 BBInfo.Parent = 1;
72
73 IsChildOfArtificialExit = false;
7471 }
7572 return N;
7673 }
141138 void Calculate(DominatorTreeBaseByGraphTraits> &DT,
142139 FuncT &F) {
143140 typedef GraphTraits GraphT;
141 typedef GraphTraits FuncGraphT;
144142 static_assert(std::is_pointer::value,
145143 "NodeRef should be pointer type");
146144 typedef typename std::remove_pointer::type NodeType;
147145
148146 unsigned N = 0;
149 bool MultipleRoots = (DT.Roots.size() > 1);
150 if (MultipleRoots) {
147 bool NeedFakeRoot = DT.isPostDominator();
148 // If this is post dominators, push a fake node to start
149 if (NeedFakeRoot) {
151150 auto &BBInfo = DT.Info[nullptr];
152151 BBInfo.DFSNum = BBInfo.Semi = ++N;
153152 BBInfo.Label = nullptr;
154
155 DT.Vertex.push_back(nullptr); // Vertex[n] = V;
153 DT.Vertex.push_back(nullptr); // Vertex[n] = V;
154 } else {
155 // The root is the entry block of the CFG
156 DT.addRoot(FuncGraphT::getEntryNode(&F));
156157 }
157158
158159 // Step #1: Number blocks in depth-first order and initialize variables used
159160 // in later stages of the algorithm.
160 if (DT.isPostDominator()){
161 for (unsigned i = 0, e = static_cast(DT.Roots.size());
162 i != e; ++i)
163 N = ReverseDFSPass(DT, DT.Roots[i], N);
164 } else {
165 N = DFSPass(DT, DT.Roots[0], N);
166 }
167
168 // it might be that some blocks did not get a DFS number (e.g., blocks of
169 // infinite loops). In these cases an artificial exit node is required.
170 MultipleRoots |= (DT.isPostDominator() && N != GraphTraits::size(&F));
161 if (DT.isPostDominator()) {
162 unsigned Total = 0;
163 for (auto I : nodes(&F)) {
164 ++Total;
165 // If it has no *successors*, it is definitely a root.
166 if (FuncGraphT::child_begin(I) == FuncGraphT::child_end(I)) {
167 N = ReverseDFSPass(DT, I, N);
168 DT.Info[I].Parent = 1;
169 DT.addRoot(I);
170 }
171 }
172 // Accounting for the virtual exit, see if we had any unreachable nodes
173 if (Total + 1 != N ) {
174 // Make another DFS pass over all other nodes to find the unreachable
175 // blocks, and find the furthest paths we'll be able to make.
176 // Note that this looks N^2, but it's really 2N worst case, if every node
177 // is unreachable. This is because we are still going to only visit each
178 // unreachable node once, we may just visit it in two directions,
179 // depending on how lucky we get.
180 SmallPtrSet ConnectToExitBlock;
181 for (auto I : nodes(&F))
182 if (!DT.Info.count(I)) {
183 // Find the furthest away we can get by following successors, then
184 // follow them in reverse. This gives us some reasonable answer about
185 // the post-dom tree inside any infinite loop. In particular, it
186 // guarantees we get to the farthest away point along *some*
187 // path. This also matches GCC behavior. If we really wanted a
188 // totally complete picture of dominance inside this infinite loop, we
189 // could do it with SCC-like algorithms to find the lowest and highest
190 // points in the infinite loop. In theory, it would be nice to give
191 // the canonical backedge for the loop, but it's expensive.
192 auto *FurthestAway = *po_begin(I);
193 ConnectToExitBlock.insert(FurthestAway);
194 N = ReverseDFSPass(DT, FurthestAway, N);
195 }
196 // Finally, now everything should be visited, and anything with parent
197 // ==
198 // 0 should be connected to virtual exit.
199 for (auto *Node : ConnectToExitBlock) {
200 auto FindResult = DT.Info.find(Node);
201 assert(FindResult != DT.Info.end() &&
202 "Everything should have been visited by now");
203 if (FindResult->second.Parent == 0) {
204 FindResult->second.Parent = 1;
205 DT.addRoot(Node);
206 }
207 }
208 }
209 } else {
210 N = DFSPass(DT, GraphTraits::getEntryNode(&F), N);
211 }
171212
172213 // When naively implemented, the Lengauer-Tarjan algorithm requires a separate
173214 // bucket for each vertex. However, this is unnecessary, because each vertex
233274 WIDom = DT.IDoms[WIDom];
234275 }
235276
236 if (DT.Roots.empty()) return;
237
238277 // Add a node for the root. This node might be the actual root, if there is
239278 // one exit block, or it may be the virtual exit (denoted by (BasicBlock *)0)
240279 // which postdominates all real exits if there are multiple exit blocks, or
241280 // an infinite loop.
242 typename GraphT::NodeRef Root = !MultipleRoots ? DT.Roots[0] : nullptr;
281 typename GraphT::NodeRef Root = NeedFakeRoot ? nullptr : DT.Roots[0];
243282
244283 DT.RootNode =
245284 (DT.DomTreeNodes[Root] =
252252 }
253253 }
254254
255 // Mark blocks live if there is no path from the block to the
256 // return of the function or a successor for which this is true.
257 // This protects IDFCalculator which cannot handle such blocks.
258 for (auto &BBInfoPair : BlockInfo) {
259 auto &BBInfo = BBInfoPair.second;
260 if (BBInfo.terminatorIsLive())
255 // Mark blocks live if there is no path from the block to a
256 // return of the function.
257 // We do this by seeing which of the postdomtree root children exit the
258 // program, and for all others, mark the subtree live.
259 for (auto &PDTChild : children(PDT.getRootNode())) {
260 auto *BB = PDTChild->getBlock();
261 auto &Info = BlockInfo[BB];
262 // Real function return
263 if (isa(Info.Terminator)) {
264 DEBUG(dbgs() << "post-dom root child is not a return: " << BB->getName()
265 << '\n';);
261266 continue;
262 auto *BB = BBInfo.BB;
263 if (!PDT.getNode(BB)) {
264 markLive(BBInfo.Terminator);
265 continue;
266 }
267 for (auto *Succ : successors(BB))
268 if (!PDT.getNode(Succ)) {
269 markLive(BBInfo.Terminator);
270 break;
271 }
272 }
273
274 // Mark blocks live if there is no path from the block to the
275 // return of the function or a successor for which this is true.
276 // This protects IDFCalculator which cannot handle such blocks.
277 for (auto &BBInfoPair : BlockInfo) {
278 auto &BBInfo = BBInfoPair.second;
279 if (BBInfo.terminatorIsLive())
280 continue;
281 auto *BB = BBInfo.BB;
282 if (!PDT.getNode(BB)) {
283 DEBUG(dbgs() << "Not post-dominated by return: " << BB->getName()
284 << '\n';);
285 markLive(BBInfo.Terminator);
286 continue;
287 }
288 for (auto *Succ : successors(BB))
289 if (!PDT.getNode(Succ)) {
290 DEBUG(dbgs() << "Successor not post-dominated by return: "
291 << BB->getName() << '\n';);
292 markLive(BBInfo.Terminator);
293 break;
294 }
267 }
268
269 // This child is something else, like an infinite loop.
270 for (auto DFNode : depth_first(PDTChild))
271 markLive(BlockInfo[DFNode->getBlock()].Terminator);
295272 }
296273
297274 // Treat the entry block as always live
0 ; RUN: opt < %s -postdomtree -analyze | FileCheck %s
1 ; RUN: opt < %s -passes='print' 2>&1 | FileCheck %s
2
3 ; Function Attrs: nounwind ssp uwtable
4 define void @foo() {
5 br label %1
6
7 ;
8 br label %1
9 ; No predecessors!
10 ret void
11 }
12
13 ; CHECK: Inorder PostDominator Tree:
14 ; CHECK-NEXT: [1] <> {0,7}
15 ; CHECK-NEXT: [2] %2 {1,2}
16 ; CHECK-NEXT: [2] %1 {3,6}
17 ; CHECK-NEXT: [3] %0 {4,5}
1111 bb35:
1212 ret void
1313 }
14 ; CHECK: [3] %entry
14 ;CHECK:Inorder PostDominator Tree:
15 ;CHECK-NEXT: [1] <> {0,9}
16 ;CHECK-NEXT: [2] %bb35 {1,4}
17 ;CHECK-NEXT: [3] %bb35.loopexit3 {2,3}
18 ;CHECK-NEXT: [2] %entry {5,6}
19 ;CHECK-NEXT: [2] %bb3.i {7,8}
1515 bb35:
1616 ret void
1717 }
18 ; CHECK: [4] %entry
18 ; CHECK: Inorder PostDominator Tree:
19 ; CHECK-NEXT: [1] <> {0,11}
20 ; CHECK-NEXT: [2] %bb35 {1,4}
21 ; CHECK-NEXT: [3] %bb35.loopexit3 {2,3}
22 ; CHECK-NEXT: [2] %a {5,6}
23 ; CHECK-NEXT: [2] %entry {7,8}
24 ; CHECK-NEXT: [2] %bb3.i {9,10}
143143 bb35:
144144 ret void
145145 }
146 ; CHECK: [3] %entry
146 ; CHECK: Inorder PostDominator Tree:
147 ; CHECK-NEXT: [1] <> {0,97}
148 ; CHECK-NEXT: [2] %bb35 {1,92}
149 ; CHECK-NEXT: [3] %bb35.loopexit3 {2,3}
150 ; CHECK-NEXT: [3] %bb35.loopexit {4,5}
151 ; CHECK-NEXT: [3] %bb31 {6,7}
152 ; CHECK-NEXT: [3] %bb30 {8,9}
153 ; CHECK-NEXT: [3] %bb30.loopexit1 {10,11}
154 ; CHECK-NEXT: [3] %bb30.loopexit {12,13}
155 ; CHECK-NEXT: [3] %bb23 {14,15}
156 ; CHECK-NEXT: [3] %bb23.us {16,17}
157 ; CHECK-NEXT: [3] %bb23.preheader {18,19}
158 ; CHECK-NEXT: [3] %bb23.us.preheader {20,21}
159 ; CHECK-NEXT: [3] %bb.nph {22,23}
160 ; CHECK-NEXT: [3] %bb29.preheader {24,25}
161 ; CHECK-NEXT: [3] %bb20 {26,27}
162 ; CHECK-NEXT: [3] %bb19 {28,29}
163 ; CHECK-NEXT: [3] %bb.nph14 {30,31}
164 ; CHECK-NEXT: [3] %bb17.loopexit.split {32,33}
165 ; CHECK-NEXT: [3] %bb16 {34,35}
166 ; CHECK-NEXT: [3] %bb15 {36,37}
167 ; CHECK-NEXT: [3] %bb15.loopexit2 {38,39}
168 ; CHECK-NEXT: [3] %bb15.loopexit {40,41}
169 ; CHECK-NEXT: [3] %bb8 {42,43}
170 ; CHECK-NEXT: [3] %bb8.us {44,45}
171 ; CHECK-NEXT: [3] %bb8.preheader {46,47}
172 ; CHECK-NEXT: [3] %bb8.us.preheader {48,49}
173 ; CHECK-NEXT: [3] %bb.nph18 {50,51}
174 ; CHECK-NEXT: [3] %bb14.preheader {52,53}
175 ; CHECK-NEXT: [3] %bb5 {54,55}
176 ; CHECK-NEXT: [3] %bb4 {56,57}
177 ; CHECK-NEXT: [3] %bb.nph21 {58,59}
178 ; CHECK-NEXT: [3] %bb3.i.loopexit.us {60,61}
179 ; CHECK-NEXT: [3] %bb8.i.us {62,63}
180 ; CHECK-NEXT: [3] %bb4.i.us {64,65}
181 ; CHECK-NEXT: [3] %bb6.i.us {66,67}
182 ; CHECK-NEXT: [3] %bb1.i.us {68,69}
183 ; CHECK-NEXT: [3] %bb.i4.us.backedge {70,71}
184 ; CHECK-NEXT: [3] %bb7.i.us {72,73}
185 ; CHECK-NEXT: [3] %bb.i4.us {74,75}
186 ; CHECK-NEXT: [3] %bb3.split.us {76,77}
187 ; CHECK-NEXT: [3] %bb3 {78,79}
188 ; CHECK-NEXT: [3] %bb32.preheader {80,81}
189 ; CHECK-NEXT: [3] %_float32_unpack.exit8 {82,83}
190 ; CHECK-NEXT: [3] %bb.i5 {84,85}
191 ; CHECK-NEXT: [3] %_float32_unpack.exit {86,87}
192 ; CHECK-NEXT: [3] %bb.i {88,89}
193 ; CHECK-NEXT: [3] %bb {90,91}
194 ; CHECK-NEXT: [2] %entry {93,94}
195 ; CHECK-NEXT: [2] %bb3.i {95,96}
2020 bb35:
2121 ret void
2222 }
23 ; CHECK: [4] %entry
23 ; CHECK: Inorder PostDominator Tree:
24 ; CHECK-NEXT: [1] <> {0,15}
25 ; CHECK-NEXT: [2] %bb35 {1,4}
26 ; CHECK-NEXT: [3] %bb35.loopexit3 {2,3}
27 ; CHECK-NEXT: [2] %c {5,12}
28 ; CHECK-NEXT: [3] %b {6,7}
29 ; CHECK-NEXT: [3] %entry {8,9}
30 ; CHECK-NEXT: [3] %a {10,11}
31 ; CHECK-NEXT: [2] %bb3.i {13,14}
1515 }
1616 ; CHECK-NOT: =>
1717 ; CHECK: [0] 0 =>
18 ; CHECK: [1] 1 => 4
19 ; STAT: 2 region - The # of regions
20 ; STAT: 1 region - The # of simple regions
18 ; STAT: 1 region - The # of regions
2525 }
2626 ; CHECK-NOT: =>
2727 ; CHECK: [0] 0 =>
28 ; CHECK: [1] 1 => 3
28 ; CHECK: [1] 5 => 6
2929 ; STAT: 2 region - The # of regions
30 ; STAT: 1 region - The # of simple regions
3130
32 ; BBIT: 0, 1, 2, 5, 11, 6, 12, 3, 4,
33 ; BBIT: 1, 2, 5, 11, 6, 12,
31 ; BBIT: 0, 1, 2, 5, 11, 6, 12, 3, 4,
32 ; BBIT: 5, 11, 12,
3433
35 ; RNIT: 0, 1 => 3, 3, 4,
36 ; RNIT: 1, 2, 5, 11, 6, 12,
34 ; RNIT: 0, 1, 2, 5 => 6, 6, 3, 4,
35 ; RNIT: 5, 11, 12,
3737 ret void
3838 }
3939 ; CHECK-NOT: =>
40 ; CHECK: [0] 0 =>
41 ; CHECK-NEXT: [1] 1 => 3
42 ; CHECK-NEXT: [1] 7 => 1
40 ; CHECK:[0] 0 =>
41 ; CHECK-NEXT: [1] 5 => 6
42 ; CHECK-NEXT: [1] 9 => 10
4343 ; STAT: 3 region - The # of regions
44 ; STAT: 2 region - The # of simple regions
4544
46 ; BBIT: 0, 7, 1, 2, 5, 11, 6, 12, 3, 4, 8, 9, 13, 10, 14,
47 ; BBIT: 7, 8, 9, 13, 10, 14,
48 ; BBIT: 1, 2, 5, 11, 6, 12,
45 ; BBIT: 0, 7, 1, 2, 5, 11, 6, 12, 3, 4, 8, 9, 13, 10, 14,
46 ; BBIT: 5, 11, 12,
47 ; BBIT: 9, 13, 14,
4948
50 ; RNIT: 0, 7 => 1, 1 => 3, 3, 4,
51 ; RNIT: 7, 8, 9, 13, 10, 14,
52 ; RNIT: 1, 2, 5, 11, 6, 12,
49 ; RNIT: 0, 7, 1, 2, 5 => 6, 6, 3, 4, 8, 9 => 10, 10,
50 ; RNIT: 5, 11, 12,
51 ; RNIT: 9, 13, 14,
3737 }
3838 ; CHECK-NOT: =>
3939 ; CHECK: [0] 0 =>
40 ; CHECK-NEXT: [1] 7 => 3
41 ; STAT: 2 region - The # of regions
40 ; CHECK-NEXT: [1] 2 => 10
41 ; CHECK_NEXT: [2] 5 => 6
42 ; STAT: 3 region - The # of regions
4243 ; STAT: 1 region - The # of simple regions
4344
4445 ; BBIT: 0, 7, 1, 2, 5, 11, 6, 10, 8, 9, 13, 14, 12, 3, 4,
45 ; BBIT: 7, 1, 2, 5, 11, 6, 10, 8, 9, 13, 14, 12,
46
47 ; RNIT: 0, 7 => 3, 3, 4,
48 ; RNIT: 7, 1, 2, 5, 11, 6, 10, 8, 9, 13, 14, 12,
46 ; BBIT: 2, 5, 11, 6, 12,
47 ; BBIT: 5, 11, 12,
48 ; RNIT: 0, 7, 1, 2 => 10, 10, 8, 9, 13, 14, 3, 4,
49 ; RNIT: 2, 5 => 6, 6,
50 ; RNIT: 5, 11, 12,
1818
1919 ; CHECK: Region tree:
2020 ; CHECK-NEXT: [0] 0 =>
21 ; CHECK-NEXT: [1] 7 => 3
2221 ; CHECK-NEXT: End region tree
2322
2020
2121 ; CHECK: Region tree:
2222 ; CHECK-NEXT: [0] 0 =>
23 ; CHECK-NEXT: [1] 7 => 3
2423 ; CHECK-NEXT: End region tree
22 ; CHECK-LABEL: @invert_branch_on_arg_inf_loop(
33 ; CHECK: entry:
44 ; CHECK: %arg.inv = xor i1 %arg, true
5 ; CHECK: phi i1 [ false, %Flow1 ], [ %arg.inv, %entry ]
65 define void @invert_branch_on_arg_inf_loop(i32 addrspace(1)* %out, i1 %arg) {
76 entry:
8 br i1 %arg, label %for.end, label %for.body
7 br i1 %arg, label %for.end, label %sesestart
8 sesestart:
9 br label %for.body
910
1011 for.body: ; preds = %entry, %for.body
1112 store i32 999, i32 addrspace(1)* %out, align 4
12 br label %for.body
13 br i1 %arg, label %for.body, label %seseend
14 seseend:
15 ret void
1316
1417 for.end: ; preds = %Flow
1518 ret void
0 ; XFAIL: *
1 ; RUN: opt -S -o - -structurizecfg -verify-dom-info < %s | FileCheck %s
12
23 ; CHECK-LABEL: @no_branch_to_entry_undef(