llvm.org GIT mirror llvm / 21a0c18
[x86] Introduce a pass to begin more systematically fixing PR36028 and similar issues. The key idea is to lower COPY nodes populating EFLAGS by scanning the uses of EFLAGS and introducing dedicated code to preserve the necessary state in a GPR. In the vast majority of cases, these uses are cmovCC and jCC instructions. For such cases, we can very easily save and restore the necessary information by simply inserting a setCC into a GPR where the original flags are live, and then testing that GPR directly to feed the cmov or conditional branch. However, things are a bit more tricky if arithmetic is using the flags. This patch handles the vast majority of cases that seem to come up in practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of partially preserved EFLAGS as LLVM doesn't currently model that at all. There are a large number of operations that technically observe EFLAGS currently but shouldn't in this case -- they typically are using DF. Currently, they will not be handled by this approach. However, I have never seen this issue come up in practice. It is already pretty rare to have these patterns come up in practical code with LLVM. I had to resort to writing MIR tests to cover most of the logic in this pass already. I suspect even with its current amount of coverage of arithmetic users of EFLAGS it will be a significant improvement over the current use of pushf/popf. It will also produce substantially faster code in most of the common patterns. This patch also removes all of the old lowering for EFLAGS copies, and the hack that forced us to use a frame pointer when EFLAGS copies were found anywhere in a function so that the dynamic stack adjustment wasn't a problem. None of this is needed as we now lower all of these copies directly in MI and without requiring stack adjustments. Lots of thanks to Reid who came up with several aspects of this approach, and Craig who helped me work out a couple of things tripping me up while working on this. Differential Revision: https://reviews.llvm.org/D45146 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329657 91177308-0d34-0410-b5e6-96231b3b80d8 Chandler Carruth 1 year, 11 months ago
21 changed file(s) with 6907 addition(s) and 6263 deletion(s).
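To make the key idea concrete before the diff, here is a condensed sketch of the save/restore primitive the new pass is built around, using only the MI helpers that appear in X86FlagsCopyLowering.cpp below (BuildMI, X86::getSETFromCond, X86::TEST8ri); names such as TestMBB, Cond, and Pos stand in for the surrounding state:

// Save: while the original EFLAGS value is still live, capture the one
// condition a user actually needs into a fresh GR8 virtual register.
unsigned Reg = MRI->createVirtualRegister(&X86::GR8RegClass);
BuildMI(TestMBB, TestPos, TestLoc, TII->get(X86::getSETFromCond(Cond)), Reg);
// Restore: at each use, re-materialize ZF by testing that register, then
// rewrite the cmovCC/jCC to key off ZF instead of the copied flags.
BuildMI(MBB, Pos, Loc, TII->get(X86::TEST8ri)).addReg(Reg).addImm(-1);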
456456 /// Replace successor OLD with NEW and update probability info.
457457 void replaceSuccessor(MachineBasicBlock *Old, MachineBasicBlock *New);
458458
459 /// Copy a successor (and any probability info) from the original block to
460 /// this block. Uses an iterator into the original block's successors.
461 ///
462 /// This is useful when doing a partial clone of successors. Afterward, the
463 /// probabilities may need to be normalized.
464 void copySuccessor(MachineBasicBlock *Orig, succ_iterator I);
465
459466 /// Transfers all the successors from MBB to this machine basic block (i.e.,
460467 /// copies all the successors of FromMBB and removes all the successors from
461468 /// FromMBB).
719719 removeSuccessor(OldI);
720720 }
721721
722 void MachineBasicBlock::copySuccessor(MachineBasicBlock *Orig,
723 succ_iterator I) {
724 if (!Orig->Probs.empty())
725 addSuccessor(*I, Orig->getSuccProbability(I));
726 else
727 addSuccessorWithoutProb(*I);
728 }
729
722730 void MachineBasicBlock::addPredecessor(MachineBasicBlock *Pred) {
723731 Predecessors.push_back(Pred);
724732 }
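A minimal usage sketch for the new copySuccessor hook (shouldClone is a hypothetical predicate for this sketch; the real use is the partial clone in splitBlock() further down): copy a subset of successors into a new block, then normalize in case the copied probabilities no longer sum to one.

// Partially clone MBB's successor list into NewMBB.
for (auto SI = MBB.succ_begin(), SE = MBB.succ_end(); SI != SE; ++SI)
  if (shouldClone(*SI)) // hypothetical predicate for this sketch
    NewMBB.copySuccessor(&MBB, SI);
// The cloned subset may not cover the full probability mass.
NewMBB.normalizeSuccProbs();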
3333 X86FixupLEAs.cpp
3434 X86AvoidStoreForwardingBlocks.cpp
3535 X86FixupSetCC.cpp
36 X86FlagsCopyLowering.cpp
3637 X86FloatingPoint.cpp
3738 X86FrameLowering.cpp
3839 X86InstructionSelector.cpp
7777 /// Return a pass that avoids creating store forward block issues in the hardware.
7878 FunctionPass *createX86AvoidStoreForwardingBlocks();
7979
80 /// Return a pass that lowers EFLAGS copy pseudo instructions.
81 FunctionPass *createX86FlagsCopyLoweringPass();
82
8083 /// Return a pass that expands WinAlloca pseudo-instructions.
8184 FunctionPass *createX86WinAllocaExpander();
8285
0 //====- X86FlagsCopyLowering.cpp - Lowers COPY nodes of EFLAGS ------------===//
1 //
2 // The LLVM Compiler Infrastructure
3 //
4 // This file is distributed under the University of Illinois Open Source
5 // License. See LICENSE.TXT for details.
6 //
7 //===----------------------------------------------------------------------===//
8 /// \file
9 ///
10 /// Lowers COPY nodes of EFLAGS by directly extracting and preserving individual
11 /// flag bits.
12 ///
13 /// We have to do this by carefully analyzing and rewriting the usage of the
14 /// copied EFLAGS register because there is no general way to rematerialize the
15 /// entire EFLAGS register safely and efficiently. Using `popf` both forces
16 /// dynamic stack adjustment and can create correctness issues due to IF, TF,
17 /// and other non-status flags being overwritten. Sequences involving
18 /// SAHF don't work on all x86 processors and are often quite slow compared to
19 /// directly testing a single status flag preserved in its own GPR.
20 ///
21 //===----------------------------------------------------------------------===//
22
23 #include "X86.h"
24 #include "X86InstrBuilder.h"
25 #include "X86InstrInfo.h"
26 #include "X86Subtarget.h"
27 #include "llvm/ADT/ArrayRef.h"
28 #include "llvm/ADT/DenseMap.h"
29 #include "llvm/ADT/STLExtras.h"
30 #include "llvm/ADT/ScopeExit.h"
31 #include "llvm/ADT/SmallPtrSet.h"
32 #include "llvm/ADT/SmallSet.h"
33 #include "llvm/ADT/SmallVector.h"
34 #include "llvm/ADT/SparseBitVector.h"
35 #include "llvm/ADT/Statistic.h"
36 #include "llvm/CodeGen/MachineBasicBlock.h"
37 #include "llvm/CodeGen/MachineConstantPool.h"
38 #include "llvm/CodeGen/MachineFunction.h"
39 #include "llvm/CodeGen/MachineFunctionPass.h"
40 #include "llvm/CodeGen/MachineInstr.h"
41 #include "llvm/CodeGen/MachineInstrBuilder.h"
42 #include "llvm/CodeGen/MachineModuleInfo.h"
43 #include "llvm/CodeGen/MachineOperand.h"
44 #include "llvm/CodeGen/MachineRegisterInfo.h"
45 #include "llvm/CodeGen/MachineSSAUpdater.h"
46 #include "llvm/CodeGen/TargetInstrInfo.h"
47 #include "llvm/CodeGen/TargetRegisterInfo.h"
48 #include "llvm/CodeGen/TargetSchedule.h"
49 #include "llvm/CodeGen/TargetSubtargetInfo.h"
50 #include "llvm/IR/DebugLoc.h"
51 #include "llvm/MC/MCSchedule.h"
52 #include "llvm/Pass.h"
53 #include "llvm/Support/CommandLine.h"
54 #include "llvm/Support/Debug.h"
55 #include "llvm/Support/raw_ostream.h"
56 #include <algorithm>
57 #include <cassert>
58 #include <iterator>
59 #include <utility>
60
61 using namespace llvm;
62
63 #define PASS_KEY "x86-flags-copy-lowering"
64 #define DEBUG_TYPE PASS_KEY
65
66 STATISTIC(NumCopiesEliminated, "Number of copies of EFLAGS eliminated");
67 STATISTIC(NumSetCCsInserted, "Number of setCC instructions inserted");
68 STATISTIC(NumTestsInserted, "Number of test instructions inserted");
69 STATISTIC(NumAddsInserted, "Number of adds instructions inserted");
70
71 namespace llvm {
72
73 void initializeX86FlagsCopyLoweringPassPass(PassRegistry &);
74
75 } // end namespace llvm
76
77 namespace {
78
79 // Convenient array type for storing registers associated with each condition.
80 using CondRegArray = std::array<unsigned, X86::LAST_VALID_COND + 1>;
81
82 class X86FlagsCopyLoweringPass : public MachineFunctionPass {
83 public:
84 X86FlagsCopyLoweringPass() : MachineFunctionPass(ID) {
85 initializeX86FlagsCopyLoweringPassPass(*PassRegistry::getPassRegistry());
86 }
87
88 StringRef getPassName() const override { return "X86 EFLAGS copy lowering"; }
89 bool runOnMachineFunction(MachineFunction &MF) override;
90 void getAnalysisUsage(AnalysisUsage &AU) const override;
91
92 /// Pass identification, replacement for typeid.
93 static char ID;
94
95 private:
96 MachineRegisterInfo *MRI;
97 const X86InstrInfo *TII;
98 const TargetRegisterInfo *TRI;
99 const TargetRegisterClass *PromoteRC;
100
101 CondRegArray collectCondsInRegs(MachineBasicBlock &MBB,
102 MachineInstr &CopyDefI);
103
104 unsigned promoteCondToReg(MachineBasicBlock &MBB,
105 MachineBasicBlock::iterator TestPos,
106 DebugLoc TestLoc, X86::CondCode Cond);
107 std::pair<unsigned, bool>
108 getCondOrInverseInReg(MachineBasicBlock &TestMBB,
109 MachineBasicBlock::iterator TestPos, DebugLoc TestLoc,
110 X86::CondCode Cond, CondRegArray &CondRegs);
111 void insertTest(MachineBasicBlock &MBB, MachineBasicBlock::iterator Pos,
112 DebugLoc Loc, unsigned Reg);
113
114 void rewriteArithmetic(MachineBasicBlock &TestMBB,
115 MachineBasicBlock::iterator TestPos, DebugLoc TestLoc,
116 MachineInstr &MI, MachineOperand &FlagUse,
117 CondRegArray &CondRegs);
118 void rewriteCMov(MachineBasicBlock &TestMBB,
119 MachineBasicBlock::iterator TestPos, DebugLoc TestLoc,
120 MachineInstr &CMovI, MachineOperand &FlagUse,
121 CondRegArray &CondRegs);
122 void rewriteCondJmp(MachineBasicBlock &TestMBB,
123 MachineBasicBlock::iterator TestPos, DebugLoc TestLoc,
124 MachineInstr &JmpI, CondRegArray &CondRegs);
125 void rewriteCopy(MachineInstr &MI, MachineOperand &FlagUse,
126 MachineInstr &CopyDefI);
127 void rewriteSetCC(MachineBasicBlock &TestMBB,
128 MachineBasicBlock::iterator TestPos, DebugLoc TestLoc,
129 MachineInstr &SetCCI, MachineOperand &FlagUse,
130 CondRegArray &CondRegs);
131 };
132
133 } // end anonymous namespace
134
135 INITIALIZE_PASS_BEGIN(X86FlagsCopyLoweringPass, DEBUG_TYPE,
136 "X86 EFLAGS copy lowering", false, false)
137 INITIALIZE_PASS_END(X86FlagsCopyLoweringPass, DEBUG_TYPE,
138 "X86 EFLAGS copy lowering", false, false)
139
140 FunctionPass *llvm::createX86FlagsCopyLoweringPass() {
141 return new X86FlagsCopyLoweringPass();
142 }
143
144 char X86FlagsCopyLoweringPass::ID = 0;
145
146 void X86FlagsCopyLoweringPass::getAnalysisUsage(AnalysisUsage &AU) const {
147 MachineFunctionPass::getAnalysisUsage(AU);
148 }
149
150 namespace {
151 /// An enumeration of the arithmetic instruction mnemonics which have
152 /// interesting flag semantics.
153 ///
154 /// We can map instruction opcodes into these mnemonics to make it easy to
155 /// dispatch with specific functionality.
156 enum class FlagArithMnemonic {
157 ADC,
158 ADCX,
159 ADOX,
160 RCL,
161 RCR,
162 SBB,
163 };
164 } // namespace
165
166 static FlagArithMnemonic getMnemonicFromOpcode(unsigned Opcode) {
167 switch (Opcode) {
168 default:
169 report_fatal_error("No support for lowering a copy into EFLAGS when used "
170 "by this instruction!");
171
172 #define LLVM_EXPAND_INSTR_SIZES(MNEMONIC, SUFFIX) \
173 case X86::MNEMONIC##8##SUFFIX: \
174 case X86::MNEMONIC##16##SUFFIX: \
175 case X86::MNEMONIC##32##SUFFIX: \
176 case X86::MNEMONIC##64##SUFFIX:
177
178 #define LLVM_EXPAND_ADC_SBB_INSTR(MNEMONIC) \
179 LLVM_EXPAND_INSTR_SIZES(MNEMONIC, rr) \
180 LLVM_EXPAND_INSTR_SIZES(MNEMONIC, rr_REV) \
181 LLVM_EXPAND_INSTR_SIZES(MNEMONIC, rm) \
182 LLVM_EXPAND_INSTR_SIZES(MNEMONIC, mr) \
183 case X86::MNEMONIC##8ri: \
184 case X86::MNEMONIC##16ri8: \
185 case X86::MNEMONIC##32ri8: \
186 case X86::MNEMONIC##64ri8: \
187 case X86::MNEMONIC##16ri: \
188 case X86::MNEMONIC##32ri: \
189 case X86::MNEMONIC##64ri32: \
190 case X86::MNEMONIC##8mi: \
191 case X86::MNEMONIC##16mi8: \
192 case X86::MNEMONIC##32mi8: \
193 case X86::MNEMONIC##64mi8: \
194 case X86::MNEMONIC##16mi: \
195 case X86::MNEMONIC##32mi: \
196 case X86::MNEMONIC##64mi32: \
197 case X86::MNEMONIC##8i8: \
198 case X86::MNEMONIC##16i16: \
199 case X86::MNEMONIC##32i32: \
200 case X86::MNEMONIC##64i32:
201
202 LLVM_EXPAND_ADC_SBB_INSTR(ADC)
203 return FlagArithMnemonic::ADC;
204
205 LLVM_EXPAND_ADC_SBB_INSTR(SBB)
206 return FlagArithMnemonic::SBB;
207
208 #undef LLVM_EXPAND_ADC_SBB_INSTR
209
210 LLVM_EXPAND_INSTR_SIZES(RCL, rCL)
211 LLVM_EXPAND_INSTR_SIZES(RCL, r1)
212 LLVM_EXPAND_INSTR_SIZES(RCL, ri)
213 return FlagArithMnemonic::RCL;
214
215 LLVM_EXPAND_INSTR_SIZES(RCR, rCL)
216 LLVM_EXPAND_INSTR_SIZES(RCR, r1)
217 LLVM_EXPAND_INSTR_SIZES(RCR, ri)
218 return FlagArithMnemonic::RCR;
219
220 #undef LLVM_EXPAND_INSTR_SIZES
221
222 case X86::ADCX32rr:
223 case X86::ADCX64rr:
224 case X86::ADCX32rm:
225 case X86::ADCX64rm:
226 return FlagArithMnemonic::ADCX;
227
228 case X86::ADOX32rr:
229 case X86::ADOX64rr:
230 case X86::ADOX32rm:
231 case X86::ADOX64rm:
232 return FlagArithMnemonic::ADOX;
233 }
234 }
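For reference, one expansion of the helper macros above: LLVM_EXPAND_INSTR_SIZES(ADC, rr) produces the four register/register case labels, so each LLVM_EXPAND_ADC_SBB_INSTR covers every size and operand form of the mnemonic with a single return.

// LLVM_EXPAND_INSTR_SIZES(ADC, rr) expands to:
case X86::ADC8rr:
case X86::ADC16rr:
case X86::ADC32rr:
case X86::ADC64rr: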
235
236 static MachineBasicBlock &splitBlock(MachineBasicBlock &MBB,
237 MachineInstr &SplitI,
238 const X86InstrInfo &TII) {
239 MachineFunction &MF = *MBB.getParent();
240
241 assert(SplitI.getParent() == &MBB &&
242 "Split instruction must be in the split block!");
243 assert(SplitI.isBranch() &&
244 "Only designed to split a tail of branch instructions!");
245 assert(X86::getCondFromBranchOpc(SplitI.getOpcode()) != X86::COND_INVALID &&
246 "Must split on an actual jCC instruction!");
247
248 // Dig out the previous instruction to the split point.
249 MachineInstr &PrevI = *std::prev(SplitI.getIterator());
250 assert(PrevI.isBranch() && "Must split after a branch!");
251 assert(X86::getCondFromBranchOpc(PrevI.getOpcode()) != X86::COND_INVALID &&
252 "Must split after an actual jCC instruction!");
253 assert(!std::prev(PrevI.getIterator())->isTerminator() &&
254 "Must only have this one terminator prior to the split!");
255
256 // Grab the one successor edge that will stay in `MBB`.
257 MachineBasicBlock &UnsplitSucc = *PrevI.getOperand(0).getMBB();
258
259 // Analyze the original block to see if we are actually splitting an edge
260 // into two edges. This can happen when we have multiple conditional jumps to
261 // the same successor.
262 bool IsEdgeSplit =
263 std::any_of(SplitI.getIterator(), MBB.instr_end(),
264 [&](MachineInstr &MI) {
265 assert(MI.isTerminator() &&
266 "Should only have spliced terminators!");
267 return llvm::any_of(
268 MI.operands(), [&](MachineOperand &MOp) {
269 return MOp.isMBB() && MOp.getMBB() == &UnsplitSucc;
270 });
271 }) ||
272 MBB.getFallThrough() == &UnsplitSucc;
273
274 MachineBasicBlock &NewMBB = *MF.CreateMachineBasicBlock();
275
276 // Insert the new block immediately after the current one. Any existing
277 // fallthrough will be sunk into this new block anyways.
278 MF.insert(std::next(MachineFunction::iterator(&MBB)), &NewMBB);
279
280 // Splice the tail of instructions into the new block.
281 NewMBB.splice(NewMBB.end(), &MBB, SplitI.getIterator(), MBB.end());
282
283 // Copy the necessary successors (and their probability info) into the new
284 // block.
285 for (auto SI = MBB.succ_begin(), SE = MBB.succ_end(); SI != SE; ++SI)
286 if (IsEdgeSplit || *SI != &UnsplitSucc)
287 NewMBB.copySuccessor(&MBB, SI);
288 // Normalize the probabilities if we didn't end up splitting the edge.
289 if (!IsEdgeSplit)
290 NewMBB.normalizeSuccProbs();
291
292 // Now replace all of the moved successors in the original block with the new
293 // block. This will merge their probabilities.
294 for (MachineBasicBlock *Succ : NewMBB.successors())
295 if (Succ != &UnsplitSucc)
296 MBB.replaceSuccessor(Succ, &NewMBB);
297
298 // We should always end up replacing at least one successor.
299 assert(MBB.isSuccessor(&NewMBB) &&
300 "Failed to make the new block a successor!");
301
302 // Now update all the PHIs.
303 for (MachineBasicBlock *Succ : NewMBB.successors()) {
304 for (MachineInstr &MI : *Succ) {
305 if (!MI.isPHI())
306 break;
307
308 for (int OpIdx = 1, NumOps = MI.getNumOperands(); OpIdx < NumOps;
309 OpIdx += 2) {
310 MachineOperand &OpV = MI.getOperand(OpIdx);
311 MachineOperand &OpMBB = MI.getOperand(OpIdx + 1);
312 assert(OpMBB.isMBB() && "Block operand to a PHI is not a block!");
313 if (OpMBB.getMBB() != &MBB)
314 continue;
315
316 // Replace the operand for unsplit successors
317 if (!IsEdgeSplit || Succ != &UnsplitSucc) {
318 OpMBB.setMBB(&NewMBB);
319
320 // We have to continue scanning as there may be multiple entries in
321 // the PHI.
322 continue;
323 }
324
325 // When we have split the edge, append a new successor.
326 MI.addOperand(MF, OpV);
327 MI.addOperand(MF, MachineOperand::CreateMBB(&NewMBB));
328 break;
329 }
330 }
331 }
332
333 return NewMBB;
334 }
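An illustrative sketch of the IsEdgeSplit case handled above (block names are hypothetical): when two conditional jumps in MBB target the same successor A, splitting at the second jCC turns the single CFG edge MBB->A into two edges, so a PHI in A must gain a new incoming entry rather than having its operand retargeted.

// Before splitting:           After splitBlock(MBB, second jCC, TII):
//   MBB:  JA %A                 MBB:    JA %A      (stays in MBB)
//         JB %A        =>       NewMBB: JB %A      (spliced tail)
//         JMP %C                        JMP %C
// A PHI in %A that read a value from %MBB now also reads it from %NewMBB.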
335
336 bool X86FlagsCopyLoweringPass::runOnMachineFunction(MachineFunction &MF) {
337 DEBUG(dbgs() << "********** " << getPassName() << " : " << MF.getName()
338 << " **********\n");
339
340 auto &Subtarget = MF.getSubtarget<X86Subtarget>();
341 MRI = &MF.getRegInfo();
342 TII = Subtarget.getInstrInfo();
343 TRI = Subtarget.getRegisterInfo();
344 PromoteRC = &X86::GR8RegClass;
345
346 if (MF.begin() == MF.end())
347 // Nothing to do for a degenerate empty function...
348 return false;
349
350 SmallVector<MachineInstr *, 4> Copies;
351 for (MachineBasicBlock &MBB : MF)
352 for (MachineInstr &MI : MBB)
353 if (MI.getOpcode() == TargetOpcode::COPY &&
354 MI.getOperand(0).getReg() == X86::EFLAGS)
355 Copies.push_back(&MI);
356
357 for (MachineInstr *CopyI : Copies) {
358 MachineBasicBlock &MBB = *CopyI->getParent();
359
360 MachineOperand &VOp = CopyI->getOperand(1);
361 assert(VOp.isReg() &&
362 "The input to the copy for EFLAGS should always be a register!");
363 MachineInstr &CopyDefI = *MRI->getVRegDef(VOp.getReg());
364 if (CopyDefI.getOpcode() != TargetOpcode::COPY) {
365 // FIXME: The most likely candidates here are PHI nodes. We could in theory
366 // handle PHI nodes, but it gets really, really hard. Insanely hard. Hard
367 // enough that it is probably better to change every other part of LLVM
368 // to avoid creating them. The issue is that once we have PHIs we won't
369 // know which original EFLAGS value we need to capture with our setCCs
370 // below. The end result will be computing a complete set of setCCs that
371 // we *might* want, computing them in every place where we copy *out* of
372 // EFLAGS and then doing SSA formation on all of them to insert necessary
373 // PHI nodes and consume those here. Then hoping that somehow we DCE the
374 // unnecessary ones. This DCE seems very unlikely to be successful and so
375 // we will almost certainly end up with a glut of dead setCC
376 // instructions. Until we have a motivating test case and fail to avoid
377 // it by changing other parts of LLVM's lowering, we refuse to handle
378 // this complex case here.
379 DEBUG(dbgs() << "ERROR: Encountered unexpected def of an eflags copy: ";
380 CopyDefI.dump());
381 report_fatal_error(
382 "Cannot lower EFLAGS copy unless it is defined in turn by a copy!");
383 }
384
385 auto Cleanup = make_scope_exit([&] {
386 // All uses of the EFLAGS copy are now rewritten. Kill the copy into
387 // EFLAGS and, if it is now dead, the defining copy out of EFLAGS too.
388 CopyI->eraseFromParent();
389 if (MRI->use_empty(CopyDefI.getOperand(0).getReg()))
390 CopyDefI.eraseFromParent();
391 ++NumCopiesEliminated;
392 });
393
394 MachineOperand &DOp = CopyI->getOperand(0);
395 assert(DOp.isDef() && "Expected register def!");
396 assert(DOp.getReg() == X86::EFLAGS && "Unexpected copy def register!");
397 if (DOp.isDead())
398 continue;
399
400 MachineBasicBlock &TestMBB = *CopyDefI.getParent();
401 auto TestPos = CopyDefI.getIterator();
402 DebugLoc TestLoc = CopyDefI.getDebugLoc();
403
404 DEBUG(dbgs() << "Rewriting copy: "; CopyI->dump());
405
406 // Scan for usage of newly set EFLAGS so we can rewrite them. We just buffer
407 // jumps because their usage is very constrained.
408 bool FlagsKilled = false;
409 SmallVector<MachineInstr *, 4> JmpIs;
410
411 // Gather the condition flags that have already been preserved in
412 // registers. We do this from scratch each time as we expect there to be
413 // very few of them and we expect to not revisit the same copy definition
414 // many times. If either of those change sufficiently we could build a map
415 // of these up front instead.
416 CondRegArray CondRegs = collectCondsInRegs(TestMBB, CopyDefI);
417
418 for (auto MII = std::next(CopyI->getIterator()), MIE = MBB.instr_end();
419 MII != MIE;) {
420 MachineInstr &MI = *MII++;
421 MachineOperand *FlagUse = MI.findRegisterUseOperand(X86::EFLAGS);
422 if (!FlagUse) {
423 if (MI.findRegisterDefOperand(X86::EFLAGS)) {
424 // If EFLAGS are defined, it's as if they were killed. We can stop
425 // scanning here.
426 //
427 // NB!!! Many instructions only modify some flags. LLVM currently
428 // models this as clobbering all flags, but if that ever changes this
429 // will need to be carefully updated to handle that more complex
430 // logic.
431 FlagsKilled = true;
432 break;
433 }
434 continue;
435 }
436
437 DEBUG(dbgs() << " Rewriting use: "; MI.dump());
438
439 // Check the kill flag before we rewrite as that may change it.
440 if (FlagUse->isKill())
441 FlagsKilled = true;
442
443 // Once we encounter a branch, the rest of the instructions must also be
444 // branches. We can't rewrite in place here, so we handle them below.
445 //
446 // Note that we don't have to handle tail calls here, even conditional
447 // tail calls, as those are not introduced into the X86 MI until post-RA
448 // branch folding or block placement. As a consequence, we get to deal
449 // with the simpler formulation of conditional branches followed by tail
450 // calls.
451 if (X86::getCondFromBranchOpc(MI.getOpcode()) != X86::COND_INVALID) {
452 auto JmpIt = MI.getIterator();
453 do {
454 JmpIs.push_back(&*JmpIt);
455 ++JmpIt;
456 } while (JmpIt != MBB.instr_end() &&
457 X86::getCondFromBranchOpc(JmpIt->getOpcode()) !=
458 X86::COND_INVALID);
459 break;
460 }
461
462 // Otherwise we can just rewrite in-place.
463 if (X86::getCondFromCMovOpc(MI.getOpcode()) != X86::COND_INVALID) {
464 rewriteCMov(TestMBB, TestPos, TestLoc, MI, *FlagUse, CondRegs);
465 } else if (X86::getCondFromSETOpc(MI.getOpcode()) != X86::COND_INVALID) {
466 rewriteSetCC(TestMBB, TestPos, TestLoc, MI, *FlagUse, CondRegs);
467 } else if (MI.getOpcode() == TargetOpcode::COPY) {
468 rewriteCopy(MI, *FlagUse, CopyDefI);
469 } else {
470 // We assume that arithmetic instructions that use flags also def them.
471 assert(MI.findRegisterDefOperand(X86::EFLAGS) &&
472 "Expected a def of EFLAGS for this instruction!");
473
474 // NB!!! Several arithmetic instructions only *partially* update
475 // flags. Theoretically, we could generate MI code sequences that
476 // would rely on this fact and observe different flags independently.
477 // But currently LLVM models all of these instructions as clobbering
478 // all the flags in an undef way. We rely on that to simplify the
479 // logic.
480 FlagsKilled = true;
481
482 rewriteArithmetic(TestMBB, TestPos, TestLoc, MI, *FlagUse, CondRegs);
483 break;
484 }
485
486 // If this was the last use of the flags, we're done.
487 if (FlagsKilled)
488 break;
489 }
490
491 // If we didn't find a kill (or equivalent), check that the flags don't
492 // live out of the basic block. Currently we don't support lowering copies
493 // of flags that live out in this fashion.
494 if (!FlagsKilled &&
495 llvm::any_of(MBB.successors(), [](MachineBasicBlock *SuccMBB) {
496 return SuccMBB->isLiveIn(X86::EFLAGS);
497 })) {
498 DEBUG({
499 dbgs() << "ERROR: Found a copied EFLAGS live-out from basic block:\n"
500 << "----\n";
501 MBB.dump();
502 dbgs() << "----\n"
503 << "ERROR: Cannot lower this EFLAGS copy!\n";
504 });
505 report_fatal_error(
506 "Cannot lower EFLAGS copy that lives out of a basic block!");
507 }
508
509 // Now rewrite the jumps that use the flags. These we handle specially
510 // because if there are multiple jumps we'll have to do surgery on the CFG.
511 for (MachineInstr *JmpI : JmpIs) {
512 // Past the first jump we need to split the blocks apart.
513 if (JmpI != JmpIs.front())
514 splitBlock(*JmpI->getParent(), *JmpI, *TII);
515
516 rewriteCondJmp(TestMBB, TestPos, TestLoc, *JmpI, CondRegs);
517 }
518
519 // FIXME: Mark the last use of EFLAGS before the copy's def as a kill if
520 // the copy's def operand is itself a kill.
521 }
522
523 #ifndef NDEBUG
524 for (MachineBasicBlock &MBB : MF)
525 for (MachineInstr &MI : MBB)
526 if (MI.getOpcode() == TargetOpcode::COPY &&
527 (MI.getOperand(0).getReg() == X86::EFLAGS ||
528 MI.getOperand(1).getReg() == X86::EFLAGS)) {
529 DEBUG(dbgs() << "ERROR: Found a COPY involving EFLAGS: "; MI.dump());
530 llvm_unreachable("Unlowered EFLAGS copy!");
531 }
532 #endif
533
534 return true;
535 }
536
537 /// Collect any conditions that have already been set in registers so that we
538 /// can re-use them rather than adding duplicates.
539 CondRegArray
540 X86FlagsCopyLoweringPass::collectCondsInRegs(MachineBasicBlock &MBB,
541 MachineInstr &CopyDefI) {
542 CondRegArray CondRegs = {};
543
544 // Scan backwards across the range of instructions with live EFLAGS.
545 for (MachineInstr &MI : llvm::reverse(
546 llvm::make_range(MBB.instr_begin(), CopyDefI.getIterator()))) {
547 X86::CondCode Cond = X86::getCondFromSETOpc(MI.getOpcode());
548 if (Cond != X86::COND_INVALID && MI.getOperand(0).isReg() &&
549 TRI->isVirtualRegister(MI.getOperand(0).getReg()))
550 CondRegs[Cond] = MI.getOperand(0).getReg();
551
552 // Stop scanning when we see the first definition of EFLAGS; prior to
553 // this we would potentially capture the wrong flag state.
554 if (MI.findRegisterDefOperand(X86::EFLAGS))
555 break;
556 }
557 return CondRegs;
558 }
559
560 unsigned X86FlagsCopyLoweringPass::promoteCondToReg(
561 MachineBasicBlock &TestMBB, MachineBasicBlock::iterator TestPos,
562 DebugLoc TestLoc, X86::CondCode Cond) {
563 unsigned Reg = MRI->createVirtualRegister(PromoteRC);
564 auto SetI = BuildMI(TestMBB, TestPos, TestLoc,
565 TII->get(X86::getSETFromCond(Cond)), Reg);
566 (void)SetI;
567 DEBUG(dbgs() << " save cond: "; SetI->dump());
568 ++NumSetCCsInserted;
569 return Reg;
570 }
571
572 std::pair<unsigned, bool> X86FlagsCopyLoweringPass::getCondOrInverseInReg(
573 MachineBasicBlock &TestMBB, MachineBasicBlock::iterator TestPos,
574 DebugLoc TestLoc, X86::CondCode Cond, CondRegArray &CondRegs) {
575 unsigned &CondReg = CondRegs[Cond];
576 unsigned &InvCondReg = CondRegs[X86::GetOppositeBranchCondition(Cond)];
577 if (!CondReg && !InvCondReg)
578 CondReg = promoteCondToReg(TestMBB, TestPos, TestLoc, Cond);
579
580 if (CondReg)
581 return {CondReg, false};
582 else
583 return {InvCondReg, true};
584 }
585
586 void X86FlagsCopyLoweringPass::insertTest(MachineBasicBlock &MBB,
587 MachineBasicBlock::iterator Pos,
588 DebugLoc Loc, unsigned Reg) {
589 // We emit test instructions as register/immediate test against -1. This
590 // allows register allocation to fold a memory operand if needed (that will
591 // happen often due to the places this code is emitted). But it should also
592 // allow us to select the shorter `testb %reg, %reg` encoding when that
593 // would be equivalent.
594 auto TestI =
595 BuildMI(MBB, Pos, Loc, TII->get(X86::TEST8ri)).addReg(Reg).addImm(-1);
596 (void)TestI;
597 DEBUG(dbgs() << " test cond: "; TestI->dump());
598 ++NumTestsInserted;
599 }
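A standalone sketch (plain C++, not LLVM code) of why the immediate form chosen above is interchangeable with the register/register form:

#include <cstdint>
// `testb $-1, %r` computes r & 0xFF; `testb %r, %r` computes r & r. Both
// set ZF exactly when r == 0, so either form feeds the rewritten users,
// and the shorter register/register encoding can be chosen when nothing
// needs the foldable memory operand.
bool zfTestRI(uint8_t R) { return uint8_t(R & 0xFF) == 0; }
bool zfTestRR(uint8_t R) { return uint8_t(R & R) == 0; }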
600
601 void X86FlagsCopyLoweringPass::rewriteArithmetic(
602 MachineBasicBlock &TestMBB, MachineBasicBlock::iterator TestPos,
603 DebugLoc TestLoc, MachineInstr &MI, MachineOperand &FlagUse,
604 CondRegArray &CondRegs) {
605 // Arithmetic is either reading CF or OF. Figure out which condition we need
606 // to preserve in a register.
607 X86::CondCode Cond;
608
609 // The addend to use to reset CF or OF when added to the flag value.
610 int Addend;
611
612 switch (getMnemonicFromOpcode(MI.getOpcode())) {
613 case FlagArithMnemonic::ADC:
614 case FlagArithMnemonic::ADCX:
615 case FlagArithMnemonic::RCL:
616 case FlagArithMnemonic::RCR:
617 case FlagArithMnemonic::SBB:
618 Cond = X86::COND_B; // CF == 1
619 // Set up an addend such that adding one to it carries out of the top
620 // bit, as the byte has no higher bit available.
621 Addend = 255;
622 break;
623
624 case FlagArithMnemonic::ADOX:
625 Cond = X86::COND_O; // OF == 1
626 // Set up an addend such that adding one turns it from positive to
627 // negative, and thus overflows in the signed domain.
628 Addend = 127;
629 break;
630 }
631
632 // Now get a register that contains the value of the flag input to the
633 // arithmetic. We require exactly this flag to simplify the arithmetic
634 // required to materialize it back into the flag.
635 unsigned &CondReg = CondRegs[Cond];
636 if (!CondReg)
637 CondReg = promoteCondToReg(TestMBB, TestPos, TestLoc, Cond);
638
639 MachineBasicBlock &MBB = *MI.getParent();
640
641 // Insert an instruction that will set the flag back to the desired value.
642 unsigned TmpReg = MRI->createVirtualRegister(PromoteRC);
643 auto AddI =
644 BuildMI(MBB, MI.getIterator(), MI.getDebugLoc(), TII->get(X86::ADD8ri))
645 .addDef(TmpReg, RegState::Dead)
646 .addReg(CondReg)
647 .addImm(Addend);
648 (void)AddI;
649 DEBUG(dbgs() << " add cond: "; AddI->dump());
650 ++NumAddsInserted;
651 FlagUse.setIsKill(true);
652 }
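A standalone sketch (plain C++, not LLVM code) of why these addends re-create the right flag: after the setCC, CondReg holds exactly 0 or 1.

#include <cstdint>
// ADD8ri CondReg, 255: the 8-bit sum carries out of the byte (CF = 1)
// exactly when CondReg == 1, since 0 + 255 = 255 but 1 + 255 = 256.
bool cfAfterAdd(uint8_t Cond) { return unsigned(Cond) + 255u > 0xFFu; }
// ADD8ri CondReg, 127: the signed 8-bit sum overflows (OF = 1) exactly
// when CondReg == 1, since 1 + 127 = 128 exceeds INT8_MAX.
bool ofAfterAdd(int8_t Cond) { return int(Cond) + 127 > INT8_MAX; }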
653
654 void X86FlagsCopyLoweringPass::rewriteCMov(MachineBasicBlock &TestMBB,
655 MachineBasicBlock::iterator TestPos,
656 DebugLoc TestLoc,
657 MachineInstr &CMovI,
658 MachineOperand &FlagUse,
659 CondRegArray &CondRegs) {
660 // First get the register containing this specific condition.
661 X86::CondCode Cond = X86::getCondFromCMovOpc(CMovI.getOpcode());
662 unsigned CondReg;
663 bool Inverted;
664 std::tie(CondReg, Inverted) =
665 getCondOrInverseInReg(TestMBB, TestPos, TestLoc, Cond, CondRegs);
666
667 MachineBasicBlock &MBB = *CMovI.getParent();
668
669 // Insert a direct test of the saved register.
670 insertTest(MBB, CMovI.getIterator(), CMovI.getDebugLoc(), CondReg);
671
672 // Rewrite the CMov to use the !ZF flag from the test (but match register
673 // size and memory operand), and then kill its use of the flags afterward.
674 auto &CMovRC = *MRI->getRegClass(CMovI.getOperand(0).getReg());
675 CMovI.setDesc(TII->get(X86::getCMovFromCond(
676 Inverted ? X86::COND_E : X86::COND_NE, TRI->getRegSizeInBits(CMovRC) / 8,
677 !CMovI.memoperands_empty())));
678 FlagUse.setIsKill(true);
679 DEBUG(dbgs() << " fixed cmov: "; CMovI.dump());
680 }
681
682 void X86FlagsCopyLoweringPass::rewriteCondJmp(
683 MachineBasicBlock &TestMBB, MachineBasicBlock::iterator TestPos,
684 DebugLoc TestLoc, MachineInstr &JmpI, CondRegArray &CondRegs) {
685 // First get the register containing this specific condition.
686 X86::CondCode Cond = X86::getCondFromBranchOpc(JmpI.getOpcode());
687 unsigned CondReg;
688 bool Inverted;
689 std::tie(CondReg, Inverted) =
690 getCondOrInverseInReg(TestMBB, TestPos, TestLoc, Cond, CondRegs);
691
692 MachineBasicBlock &JmpMBB = *JmpI.getParent();
693
694 // Insert a direct test of the saved register.
695 insertTest(JmpMBB, JmpI.getIterator(), JmpI.getDebugLoc(), CondReg);
696
697 // Rewrite the jump to use the !ZF flag from the test, and kill its use of
698 // flags afterward.
699 JmpI.setDesc(TII->get(
700 X86::GetCondBranchFromCond(Inverted ? X86::COND_E : X86::COND_NE)));
701 const int ImplicitEFLAGSOpIdx = 1;
702 JmpI.getOperand(ImplicitEFLAGSOpIdx).setIsKill(true);
703 DEBUG(dbgs() << " fixed jCC: "; JmpI.dump());
704 }
705
706 void X86FlagsCopyLoweringPass::rewriteCopy(MachineInstr &MI,
707 MachineOperand &FlagUse,
708 MachineInstr &CopyDefI) {
709 // Just replace this copy with the original copy def.
710 MRI->replaceRegWith(MI.getOperand(0).getReg(),
711 CopyDefI.getOperand(0).getReg());
712 MI.eraseFromParent();
713 }
714
715 void X86FlagsCopyLoweringPass::rewriteSetCC(MachineBasicBlock &TestMBB,
716 MachineBasicBlock::iterator TestPos,
717 DebugLoc TestLoc,
718 MachineInstr &SetCCI,
719 MachineOperand &FlagUse,
720 CondRegArray &CondRegs) {
721 X86::CondCode Cond = X86::getCondFromSETOpc(SetCCI.getOpcode());
722 // Note that we can't usefully rewrite this to the inverse without complex
723 // analysis of the users of the setCC. Largely we rely on duplicate setCCs
724 // having already been avoided before we reach this point.
725 unsigned &CondReg = CondRegs[Cond];
726 if (!CondReg)
727 CondReg = promoteCondToReg(TestMBB, TestPos, TestLoc, Cond);
728
729 // Rewriting this is trivial: we just replace the register and remove the
730 // setcc.
731 MRI->replaceRegWith(SetCCI.getOperand(0).getReg(), CondReg);
732 SetCCI.eraseFromParent();
733 }
3872038720 }
3872138721 }
3872238722
38723 /// This function checks if any of the users of EFLAGS copies the EFLAGS. We
38724 /// know that the code that lowers COPY of EFLAGS has to use the stack, and if
38725 /// we don't adjust the stack we clobber the first frame index.
38726 /// See X86InstrInfo::copyPhysReg.
38727 static bool hasCopyImplyingStackAdjustment(const MachineFunction &MF) {
38728 const MachineRegisterInfo &MRI = MF.getRegInfo();
38729 return any_of(MRI.reg_instructions(X86::EFLAGS),
38730 [](const MachineInstr &RI) { return RI.isCopy(); });
38731 }
38732
38733 void X86TargetLowering::finalizeLowering(MachineFunction &MF) const {
38734 if (hasCopyImplyingStackAdjustment(MF)) {
38735 MachineFrameInfo &MFI = MF.getFrameInfo();
38736 MFI.setHasCopyImplyingStackAdjustment(true);
38737 }
38738
38739 TargetLoweringBase::finalizeLowering(MF);
38740 }
38741
3874238723 SDValue X86TargetLowering::expandIndirectJTBranch(const SDLoc& dl,
3874338724 SDValue Value, SDValue Addr,
3874438725 SelectionDAG &DAG) const {
11201120 bool lowerInterleavedStore(StoreInst *SI, ShuffleVectorInst *SVI,
11211121 unsigned Factor) const override;
11221122
1123 void finalizeLowering(MachineFunction &MF) const override;
1124
11251123 SDValue expandIndirectJTBranch(const SDLoc& dl, SDValue Value,
11261124 SDValue Addr, SelectionDAG &DAG)
11271125 const override;
67956795 return;
67966796 }
67976797
6798 bool FromEFLAGS = SrcReg == X86::EFLAGS;
6799 bool ToEFLAGS = DestReg == X86::EFLAGS;
6800 int Reg = FromEFLAGS ? DestReg : SrcReg;
6801 bool is32 = X86::GR32RegClass.contains(Reg);
6802 bool is64 = X86::GR64RegClass.contains(Reg);
6803
6804 if ((FromEFLAGS || ToEFLAGS) && (is32 || is64)) {
6805 int Mov = is64 ? X86::MOV64rr : X86::MOV32rr;
6806 int Push = is64 ? X86::PUSH64r : X86::PUSH32r;
6807 int PushF = is64 ? X86::PUSHF64 : X86::PUSHF32;
6808 int Pop = is64 ? X86::POP64r : X86::POP32r;
6809 int PopF = is64 ? X86::POPF64 : X86::POPF32;
6810 int AX = is64 ? X86::RAX : X86::EAX;
6811
6812 if (!Subtarget.hasLAHFSAHF()) {
6813 assert(Subtarget.is64Bit() &&
6814 "Not having LAHF/SAHF only happens on 64-bit.");
6815 // Moving EFLAGS to / from another register requires a push and a pop.
6816 // Notice that we have to adjust the stack if we don't want to clobber the
6817 // first frame index. See X86FrameLowering.cpp - usesTheStack.
6818 if (FromEFLAGS) {
6819 BuildMI(MBB, MI, DL, get(PushF));
6820 BuildMI(MBB, MI, DL, get(Pop), DestReg);
6821 }
6822 if (ToEFLAGS) {
6823 BuildMI(MBB, MI, DL, get(Push))
6824 .addReg(SrcReg, getKillRegState(KillSrc));
6825 BuildMI(MBB, MI, DL, get(PopF));
6826 }
6827 return;
6828 }
6829
6830 // The flags need to be saved, but saving EFLAGS with PUSHF/POPF is
6831 // inefficient. Instead:
6832 // - Save the overflow flag OF into AL using SETO, and restore it using a
6833 // signed 8-bit addition of AL and INT8_MAX.
6834 // - Save/restore the bottom 8 EFLAGS bits (CF, PF, AF, ZF, SF) to/from AH
6835 // using LAHF/SAHF.
6836 // - When RAX/EAX is live and isn't the destination register, make sure it
6837 // isn't clobbered by PUSH/POP'ing it before and after saving/restoring
6838 // the flags.
6839 // This approach is ~2.25x faster than using PUSHF/POPF.
6840 //
6841 // This is still somewhat inefficient because we don't know which flags are
6842 // actually live inside EFLAGS. Were we able to do a single SETcc instead of
6843 // SETO+LAHF / ADDB+SAHF the code could be 1.02x faster.
6844 //
6845 // PUSHF/POPF is also potentially incorrect because it affects other flags
6846 // such as TF/IF/DF, which LLVM doesn't model.
6847 //
6848 // Notice that we have to adjust the stack if we don't want to clobber the
6849 // first frame index.
6850 // See X86ISelLowering.cpp - X86::hasCopyImplyingStackAdjustment.
6851
6852 const TargetRegisterInfo &TRI = getRegisterInfo();
6853 MachineBasicBlock::LivenessQueryResult LQR =
6854 MBB.computeRegisterLiveness(&TRI, AX, MI);
6855 // We do not want to save and restore AX if we do not have to.
6856 // Moreover, if we do so whereas AX is dead, we would need to set
6857 // an undef flag on the use of AX, otherwise the verifier will
6858 // complain that we read an undef value.
6859 // We do not want to change the behavior of the machine verifier
6860 // as this is usually wrong to read an undef value.
6861 if (MachineBasicBlock::LQR_Unknown == LQR) {
6862 LivePhysRegs LPR(TRI);
6863 LPR.addLiveOuts(MBB);
6864 MachineBasicBlock::iterator I = MBB.end();
6865 while (I != MI) {
6866 --I;
6867 LPR.stepBackward(*I);
6868 }
6869 // AX contains the top most register in the aliasing hierarchy.
6870 // It may not be live, but one of its aliases may be.
6871 for (MCRegAliasIterator AI(AX, &TRI, true);
6872 AI.isValid() && LQR != MachineBasicBlock::LQR_Live; ++AI)
6873 LQR = LPR.contains(*AI) ? MachineBasicBlock::LQR_Live
6874 : MachineBasicBlock::LQR_Dead;
6875 }
6876 bool AXDead = (Reg == AX) || (MachineBasicBlock::LQR_Dead == LQR);
6877 if (!AXDead)
6878 BuildMI(MBB, MI, DL, get(Push)).addReg(AX, getKillRegState(true));
6879 if (FromEFLAGS) {
6880 BuildMI(MBB, MI, DL, get(X86::SETOr), X86::AL);
6881 BuildMI(MBB, MI, DL, get(X86::LAHF));
6882 BuildMI(MBB, MI, DL, get(Mov), Reg).addReg(AX);
6883 }
6884 if (ToEFLAGS) {
6885 BuildMI(MBB, MI, DL, get(Mov), AX).addReg(Reg, getKillRegState(KillSrc));
6886 BuildMI(MBB, MI, DL, get(X86::ADD8ri), X86::AL)
6887 .addReg(X86::AL)
6888 .addImm(INT8_MAX);
6889 BuildMI(MBB, MI, DL, get(X86::SAHF));
6890 }
6891 if (!AXDead)
6892 BuildMI(MBB, MI, DL, get(Pop), AX);
6893 return;
6798 if (SrcReg == X86::EFLAGS || DestReg == X86::EFLAGS) {
6799 // FIXME: We use a fatal error here because historically LLVM has tried to
6800 // lower some of these physreg copies and we want to ensure we get
6801 // reasonable bug reports if someone encounters a case no other testing
6802 // found. This path should be removed after the LLVM 7 release.
6803 report_fatal_error("Unable to copy EFLAGS physical register!");
68946804 }
68956805
68966806 DEBUG(dbgs() << "Cannot copy " << RI.getName(SrcReg)
6363 void initializeX86ExecutionDomainFixPass(PassRegistry &);
6464 void initializeX86DomainReassignmentPass(PassRegistry &);
6565 void initializeX86AvoidSFBPassPass(PassRegistry &);
66 void initializeX86FlagsCopyLoweringPassPass(PassRegistry &);
6667
6768 } // end namespace llvm
6869
8384 initializeX86ExecutionDomainFixPass(PR);
8485 initializeX86DomainReassignmentPass(PR);
8586 initializeX86AvoidSFBPassPass(PR);
87 initializeX86FlagsCopyLoweringPassPass(PR);
8688 }
8789
8890 static std::unique_ptr<TargetLoweringObjectFile> createTLOF(const Triple &TT) {
455457 addPass(createX86AvoidStoreForwardingBlocks());
456458 }
457459
460 addPass(createX86FlagsCopyLoweringPass());
458461 addPass(createX86WinAllocaExpander());
459462 }
460463 void X86PassConfig::addMachineSSAOptimization() {
99 ;
1010 ; X32-LABEL: test_add_i64:
1111 ; X32: # %bb.0:
12 ; X32-NEXT: pushl %ebp
13 ; X32-NEXT: .cfi_def_cfa_offset 8
14 ; X32-NEXT: .cfi_offset %ebp, -8
15 ; X32-NEXT: movl %esp, %ebp
16 ; X32-NEXT: .cfi_def_cfa_register %ebp
17 ; X32-NEXT: movl 16(%ebp), %eax
18 ; X32-NEXT: movl 20(%ebp), %edx
19 ; X32-NEXT: addl 8(%ebp), %eax
20 ; X32-NEXT: adcl 12(%ebp), %edx
21 ; X32-NEXT: popl %ebp
12 ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
13 ; X32-NEXT: movl {{[0-9]+}}(%esp), %edx
14 ; X32-NEXT: addl {{[0-9]+}}(%esp), %eax
15 ; X32-NEXT: adcl {{[0-9]+}}(%esp), %edx
2216 ; X32-NEXT: retl
2317 %ret = add i64 %arg1, %arg2
2418 ret i64 %ret
3636 ; CHECK-NEXT: X86 PIC Global Base Reg Initialization
3737 ; CHECK-NEXT: Expand ISel Pseudo-instructions
3838 ; CHECK-NEXT: Local Stack Slot Allocation
39 ; CHECK-NEXT: X86 EFLAGS copy lowering
3940 ; CHECK-NEXT: X86 WinAlloca Expander
4041 ; CHECK-NEXT: Eliminate PHI nodes for register allocation
4142 ; CHECK-NEXT: Two-Address instruction pass
8989 ; CHECK-NEXT: X86 LEA Optimize
9090 ; CHECK-NEXT: X86 Optimize Call Frame
9191 ; CHECK-NEXT: X86 Avoid Store Forwarding Block
92 ; CHECK-NEXT: X86 EFLAGS copy lowering
9293 ; CHECK-NEXT: X86 WinAlloca Expander
9394 ; CHECK-NEXT: Detect Dead Lanes
9495 ; CHECK-NEXT: Process Implicit Definitions
+0 -37 test/CodeGen/X86/clobber-fi0.ll (deleted)
0 ; RUN: llc < %s -verify-machineinstrs -mcpu=generic -mtriple=x86_64-linux | FileCheck %s
1
2 target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
3 target triple = "x86_64-apple-macosx10.7.0"
4
5 ; In the code below we need to copy the EFLAGS because of scheduling constraints.
6 ; When copying the EFLAGS we need to write to the stack with push/pop. This forces
7 ; us to emit the prolog.
8
9 ; CHECK: main
10 ; CHECK: subq{{.*}}rsp
11 ; CHECK: ret
12 define i32 @main(i32 %arg, i8** %arg1) nounwind {
13 bb:
14 %tmp = alloca i32, align 4 ; [#uses=3 type=i32*]
15 %tmp2 = alloca i32, align 4 ; [#uses=3 type=i32*]
16 %tmp3 = alloca i32 ; [#uses=1 type=i32*]
17 store volatile i32 1, i32* %tmp, align 4
18 store volatile i32 1, i32* %tmp2, align 4
19 br label %bb4
20
21 bb4: ; preds = %bb4, %bb
22 %tmp6 = load volatile i32, i32* %tmp2, align 4 ; [#uses=1 type=i32]
23 %tmp7 = add i32 %tmp6, -1 ; [#uses=2 type=i32]
24 store volatile i32 %tmp7, i32* %tmp2, align 4
25 %tmp8 = icmp eq i32 %tmp7, 0 ; [#uses=1 type=i1]
26 %tmp9 = load volatile i32, i32* %tmp ; [#uses=1 type=i32]
27 %tmp10 = add i32 %tmp9, -1 ; [#uses=1 type=i32]
28 store volatile i32 %tmp10, i32* %tmp3
29 br i1 %tmp8, label %bb11, label %bb4
30
31 bb11: ; preds = %bb4
32 %tmp12 = load volatile i32, i32* %tmp, align 4 ; [#uses=1 type=i32]
33 ret i32 %tmp12
34 }
35
36
0 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
1 ; RUN: llc -mtriple=i386-linux-gnu %s -o - | FileCheck %s --check-prefixes=32-ALL,32-GOOD-RA
2 ; RUN: llc -mtriple=i386-linux-gnu -pre-RA-sched=fast %s -o - | FileCheck %s --check-prefixes=32-ALL,32-FAST-RA
3
4 ; RUN: llc -mtriple=x86_64-linux-gnu %s -o - | FileCheck %s --check-prefixes=64-ALL,64-GOOD-RA
5 ; RUN: llc -mtriple=x86_64-linux-gnu -pre-RA-sched=fast %s -o - | FileCheck %s --check-prefixes=64-ALL,64-FAST-RA
6 ; RUN: llc -mtriple=x86_64-linux-gnu -mattr=+sahf %s -o - | FileCheck %s --check-prefixes=64-ALL,64-GOOD-RA-SAHF
7 ; RUN: llc -mtriple=x86_64-linux-gnu -mattr=+sahf -pre-RA-sched=fast %s -o - | FileCheck %s --check-prefixes=64-ALL,64-FAST-RA-SAHF
8 ; RUN: llc -mtriple=x86_64-linux-gnu -mcpu=corei7 %s -o - | FileCheck %s --check-prefixes=64-ALL,64-GOOD-RA-SAHF
9
10 ; TODO: Reenable verify-machineinstr once the if (!AXDead) // FIXME
11 ; in X86InstrInfo::copyPhysReg() is resolved.
1 ; RUN: llc -mtriple=i386-linux-gnu -verify-machineinstrs %s -o - | FileCheck %s --check-prefixes=32-ALL,32-GOOD-RA
2 ; RUN: llc -mtriple=i386-linux-gnu -verify-machineinstrs -pre-RA-sched=fast %s -o - | FileCheck %s --check-prefixes=32-ALL,32-FAST-RA
3
4 ; RUN: llc -mtriple=x86_64-linux-gnu -verify-machineinstrs %s -o - | FileCheck %s --check-prefixes=64-ALL,64-GOOD-RA
5 ; RUN: llc -mtriple=x86_64-linux-gnu -verify-machineinstrs -pre-RA-sched=fast %s -o - | FileCheck %s --check-prefixes=64-ALL,64-FAST-RA
6 ; RUN: llc -mtriple=x86_64-linux-gnu -verify-machineinstrs -mattr=+sahf %s -o - | FileCheck %s --check-prefixes=64-ALL,64-GOOD-RA-SAHF
7 ; RUN: llc -mtriple=x86_64-linux-gnu -verify-machineinstrs -mattr=+sahf -pre-RA-sched=fast %s -o - | FileCheck %s --check-prefixes=64-ALL,64-FAST-RA-SAHF
8 ; RUN: llc -mtriple=x86_64-linux-gnu -verify-machineinstrs -mcpu=corei7 %s -o - | FileCheck %s --check-prefixes=64-ALL,64-GOOD-RA-SAHF
129
1310 declare i32 @foo()
1411 declare i32 @bar(i64)
2118 ; ...
2219 ; use of eax
2320 ; During PEI the adjcallstackdown32 is replaced with the subl which
24 ; clobbers eflags, effectively interfering in the liveness interval.
25 ; Is this a case we care about? Maybe no, considering this issue
26 ; happens with the fast pre-regalloc scheduler enforced. A more
27 ; performant scheduler would move the adjcallstackdown32 out of the
28 ; eflags liveness interval.
21 ; clobbers eflags, effectively interfering in the liveness interval. However,
22 ; we then promote these copies into independent conditions in GPRs, which avoids
23 ; repeated saving and restoring logic and can be trivially managed by the
24 ; register allocator.
2925 define i64 @test_intervening_call(i64* %foo, i64 %bar, i64 %baz) nounwind {
3026 ; 32-GOOD-RA-LABEL: test_intervening_call:
3127 ; 32-GOOD-RA: # %bb.0: # %entry
32 ; 32-GOOD-RA-NEXT: pushl %ebp
33 ; 32-GOOD-RA-NEXT: movl %esp, %ebp
3428 ; 32-GOOD-RA-NEXT: pushl %ebx
3529 ; 32-GOOD-RA-NEXT: pushl %esi
36 ; 32-GOOD-RA-NEXT: movl 12(%ebp), %eax
37 ; 32-GOOD-RA-NEXT: movl 16(%ebp), %edx
38 ; 32-GOOD-RA-NEXT: movl 20(%ebp), %ebx
39 ; 32-GOOD-RA-NEXT: movl 24(%ebp), %ecx
40 ; 32-GOOD-RA-NEXT: movl 8(%ebp), %esi
30 ; 32-GOOD-RA-NEXT: pushl %eax
31 ; 32-GOOD-RA-NEXT: movl {{[0-9]+}}(%esp), %eax
32 ; 32-GOOD-RA-NEXT: movl {{[0-9]+}}(%esp), %edx
33 ; 32-GOOD-RA-NEXT: movl {{[0-9]+}}(%esp), %ebx
34 ; 32-GOOD-RA-NEXT: movl {{[0-9]+}}(%esp), %ecx
35 ; 32-GOOD-RA-NEXT: movl {{[0-9]+}}(%esp), %esi
4136 ; 32-GOOD-RA-NEXT: lock cmpxchg8b (%esi)
42 ; 32-GOOD-RA-NEXT: pushl %eax
43 ; 32-GOOD-RA-NEXT: seto %al
44 ; 32-GOOD-RA-NEXT: lahf
45 ; 32-GOOD-RA-NEXT: movl %eax, %esi
46 ; 32-GOOD-RA-NEXT: popl %eax
37 ; 32-GOOD-RA-NEXT: setne %bl
4738 ; 32-GOOD-RA-NEXT: subl $8, %esp
4839 ; 32-GOOD-RA-NEXT: pushl %edx
4940 ; 32-GOOD-RA-NEXT: pushl %eax
5041 ; 32-GOOD-RA-NEXT: calll bar
5142 ; 32-GOOD-RA-NEXT: addl $16, %esp
52 ; 32-GOOD-RA-NEXT: movl %esi, %eax
53 ; 32-GOOD-RA-NEXT: addb $127, %al
54 ; 32-GOOD-RA-NEXT: sahf
43 ; 32-GOOD-RA-NEXT: testb $-1, %bl
5544 ; 32-GOOD-RA-NEXT: jne .LBB0_3
5645 ; 32-GOOD-RA-NEXT: # %bb.1: # %t
5746 ; 32-GOOD-RA-NEXT: movl $42, %eax
6049 ; 32-GOOD-RA-NEXT: xorl %eax, %eax
6150 ; 32-GOOD-RA-NEXT: .LBB0_2: # %t
6251 ; 32-GOOD-RA-NEXT: xorl %edx, %edx
52 ; 32-GOOD-RA-NEXT: addl $4, %esp
6353 ; 32-GOOD-RA-NEXT: popl %esi
6454 ; 32-GOOD-RA-NEXT: popl %ebx
65 ; 32-GOOD-RA-NEXT: popl %ebp
6655 ; 32-GOOD-RA-NEXT: retl
6756 ;
6857 ; 32-FAST-RA-LABEL: test_intervening_call:
6958 ; 32-FAST-RA: # %bb.0: # %entry
70 ; 32-FAST-RA-NEXT: pushl %ebp
71 ; 32-FAST-RA-NEXT: movl %esp, %ebp
7259 ; 32-FAST-RA-NEXT: pushl %ebx
7360 ; 32-FAST-RA-NEXT: pushl %esi
74 ; 32-FAST-RA-NEXT: movl 8(%ebp), %esi
75 ; 32-FAST-RA-NEXT: movl 20(%ebp), %ebx
76 ; 32-FAST-RA-NEXT: movl 24(%ebp), %ecx
77 ; 32-FAST-RA-NEXT: movl 12(%ebp), %eax
78 ; 32-FAST-RA-NEXT: movl 16(%ebp), %edx
61 ; 32-FAST-RA-NEXT: pushl %eax
62 ; 32-FAST-RA-NEXT: movl {{[0-9]+}}(%esp), %esi
63 ; 32-FAST-RA-NEXT: movl {{[0-9]+}}(%esp), %ebx
64 ; 32-FAST-RA-NEXT: movl {{[0-9]+}}(%esp), %ecx
65 ; 32-FAST-RA-NEXT: movl {{[0-9]+}}(%esp), %eax
66 ; 32-FAST-RA-NEXT: movl {{[0-9]+}}(%esp), %edx
7967 ; 32-FAST-RA-NEXT: lock cmpxchg8b (%esi)
80 ; 32-FAST-RA-NEXT: pushl %eax
81 ; 32-FAST-RA-NEXT: seto %al
82 ; 32-FAST-RA-NEXT: lahf
83 ; 32-FAST-RA-NEXT: movl %eax, %ecx
84 ; 32-FAST-RA-NEXT: popl %eax
68 ; 32-FAST-RA-NEXT: setne %bl
8569 ; 32-FAST-RA-NEXT: subl $8, %esp
86 ; 32-FAST-RA-NEXT: pushl %eax
87 ; 32-FAST-RA-NEXT: movl %ecx, %eax
88 ; 32-FAST-RA-NEXT: addb $127, %al
89 ; 32-FAST-RA-NEXT: sahf
90 ; 32-FAST-RA-NEXT: popl %eax
91 ; 32-FAST-RA-NEXT: pushl %eax
92 ; 32-FAST-RA-NEXT: seto %al
93 ; 32-FAST-RA-NEXT: lahf
94 ; 32-FAST-RA-NEXT: movl %eax, %esi
95 ; 32-FAST-RA-NEXT: popl %eax
9670 ; 32-FAST-RA-NEXT: pushl %edx
9771 ; 32-FAST-RA-NEXT: pushl %eax
9872 ; 32-FAST-RA-NEXT: calll bar
9973 ; 32-FAST-RA-NEXT: addl $16, %esp
100 ; 32-FAST-RA-NEXT: movl %esi, %eax
101 ; 32-FAST-RA-NEXT: addb $127, %al
102 ; 32-FAST-RA-NEXT: sahf
74 ; 32-FAST-RA-NEXT: testb $-1, %bl
10375 ; 32-FAST-RA-NEXT: jne .LBB0_3
10476 ; 32-FAST-RA-NEXT: # %bb.1: # %t
10577 ; 32-FAST-RA-NEXT: movl $42, %eax
10880 ; 32-FAST-RA-NEXT: xorl %eax, %eax
10981 ; 32-FAST-RA-NEXT: .LBB0_2: # %t
11082 ; 32-FAST-RA-NEXT: xorl %edx, %edx
83 ; 32-FAST-RA-NEXT: addl $4, %esp
11184 ; 32-FAST-RA-NEXT: popl %esi
11285 ; 32-FAST-RA-NEXT: popl %ebx
113 ; 32-FAST-RA-NEXT: popl %ebp
11486 ; 32-FAST-RA-NEXT: retl
11587 ;
116 ; 64-GOOD-RA-LABEL: test_intervening_call:
117 ; 64-GOOD-RA: # %bb.0: # %entry
118 ; 64-GOOD-RA-NEXT: pushq %rbp
119 ; 64-GOOD-RA-NEXT: movq %rsp, %rbp
120 ; 64-GOOD-RA-NEXT: pushq %rbx
121 ; 64-GOOD-RA-NEXT: pushq %rax
122 ; 64-GOOD-RA-NEXT: movq %rsi, %rax
123 ; 64-GOOD-RA-NEXT: lock cmpxchgq %rdx, (%rdi)
124 ; 64-GOOD-RA-NEXT: pushfq
125 ; 64-GOOD-RA-NEXT: popq %rbx
126 ; 64-GOOD-RA-NEXT: movq %rax, %rdi
127 ; 64-GOOD-RA-NEXT: callq bar
128 ; 64-GOOD-RA-NEXT: pushq %rbx
129 ; 64-GOOD-RA-NEXT: popfq
130 ; 64-GOOD-RA-NEXT: jne .LBB0_3
131 ; 64-GOOD-RA-NEXT: # %bb.1: # %t
132 ; 64-GOOD-RA-NEXT: movl $42, %eax
133 ; 64-GOOD-RA-NEXT: jmp .LBB0_2
134 ; 64-GOOD-RA-NEXT: .LBB0_3: # %f
135 ; 64-GOOD-RA-NEXT: xorl %eax, %eax
136 ; 64-GOOD-RA-NEXT: .LBB0_2: # %t
137 ; 64-GOOD-RA-NEXT: addq $8, %rsp
138 ; 64-GOOD-RA-NEXT: popq %rbx
139 ; 64-GOOD-RA-NEXT: popq %rbp
140 ; 64-GOOD-RA-NEXT: retq
141 ;
142 ; 64-FAST-RA-LABEL: test_intervening_call:
143 ; 64-FAST-RA: # %bb.0: # %entry
144 ; 64-FAST-RA-NEXT: pushq %rbp
145 ; 64-FAST-RA-NEXT: movq %rsp, %rbp
146 ; 64-FAST-RA-NEXT: pushq %rbx
147 ; 64-FAST-RA-NEXT: pushq %rax
148 ; 64-FAST-RA-NEXT: movq %rsi, %rax
149 ; 64-FAST-RA-NEXT: lock cmpxchgq %rdx, (%rdi)
150 ; 64-FAST-RA-NEXT: pushfq
151 ; 64-FAST-RA-NEXT: popq %rbx
152 ; 64-FAST-RA-NEXT: movq %rax, %rdi
153 ; 64-FAST-RA-NEXT: callq bar
154 ; 64-FAST-RA-NEXT: pushq %rbx
155 ; 64-FAST-RA-NEXT: popfq
156 ; 64-FAST-RA-NEXT: jne .LBB0_3
157 ; 64-FAST-RA-NEXT: # %bb.1: # %t
158 ; 64-FAST-RA-NEXT: movl $42, %eax
159 ; 64-FAST-RA-NEXT: jmp .LBB0_2
160 ; 64-FAST-RA-NEXT: .LBB0_3: # %f
161 ; 64-FAST-RA-NEXT: xorl %eax, %eax
162 ; 64-FAST-RA-NEXT: .LBB0_2: # %t
163 ; 64-FAST-RA-NEXT: addq $8, %rsp
164 ; 64-FAST-RA-NEXT: popq %rbx
165 ; 64-FAST-RA-NEXT: popq %rbp
166 ; 64-FAST-RA-NEXT: retq
167 ;
168 ; 64-GOOD-RA-SAHF-LABEL: test_intervening_call:
169 ; 64-GOOD-RA-SAHF: # %bb.0: # %entry
170 ; 64-GOOD-RA-SAHF-NEXT: pushq %rbp
171 ; 64-GOOD-RA-SAHF-NEXT: movq %rsp, %rbp
172 ; 64-GOOD-RA-SAHF-NEXT: pushq %rbx
173 ; 64-GOOD-RA-SAHF-NEXT: pushq %rax
174 ; 64-GOOD-RA-SAHF-NEXT: movq %rsi, %rax
175 ; 64-GOOD-RA-SAHF-NEXT: lock cmpxchgq %rdx, (%rdi)
176 ; 64-GOOD-RA-SAHF-NEXT: pushq %rax
177 ; 64-GOOD-RA-SAHF-NEXT: seto %al
178 ; 64-GOOD-RA-SAHF-NEXT: lahf
179 ; 64-GOOD-RA-SAHF-NEXT: movq %rax, %rbx
180 ; 64-GOOD-RA-SAHF-NEXT: popq %rax
181 ; 64-GOOD-RA-SAHF-NEXT: movq %rax, %rdi
182 ; 64-GOOD-RA-SAHF-NEXT: callq bar
183 ; 64-GOOD-RA-SAHF-NEXT: movq %rbx, %rax
184 ; 64-GOOD-RA-SAHF-NEXT: addb $127, %al
185 ; 64-GOOD-RA-SAHF-NEXT: sahf
186 ; 64-GOOD-RA-SAHF-NEXT: jne .LBB0_3
187 ; 64-GOOD-RA-SAHF-NEXT: # %bb.1: # %t
188 ; 64-GOOD-RA-SAHF-NEXT: movl $42, %eax
189 ; 64-GOOD-RA-SAHF-NEXT: jmp .LBB0_2
190 ; 64-GOOD-RA-SAHF-NEXT: .LBB0_3: # %f
191 ; 64-GOOD-RA-SAHF-NEXT: xorl %eax, %eax
192 ; 64-GOOD-RA-SAHF-NEXT: .LBB0_2: # %t
193 ; 64-GOOD-RA-SAHF-NEXT: addq $8, %rsp
194 ; 64-GOOD-RA-SAHF-NEXT: popq %rbx
195 ; 64-GOOD-RA-SAHF-NEXT: popq %rbp
196 ; 64-GOOD-RA-SAHF-NEXT: retq
197 ;
198 ; 64-FAST-RA-SAHF-LABEL: test_intervening_call:
199 ; 64-FAST-RA-SAHF: # %bb.0: # %entry
200 ; 64-FAST-RA-SAHF-NEXT: pushq %rbp
201 ; 64-FAST-RA-SAHF-NEXT: movq %rsp, %rbp
202 ; 64-FAST-RA-SAHF-NEXT: pushq %rbx
203 ; 64-FAST-RA-SAHF-NEXT: pushq %rax
204 ; 64-FAST-RA-SAHF-NEXT: movq %rsi, %rax
205 ; 64-FAST-RA-SAHF-NEXT: lock cmpxchgq %rdx, (%rdi)
206 ; 64-FAST-RA-SAHF-NEXT: pushq %rax
207 ; 64-FAST-RA-SAHF-NEXT: seto %al
208 ; 64-FAST-RA-SAHF-NEXT: lahf
209 ; 64-FAST-RA-SAHF-NEXT: movq %rax, %rbx
210 ; 64-FAST-RA-SAHF-NEXT: popq %rax
211 ; 64-FAST-RA-SAHF-NEXT: movq %rax, %rdi
212 ; 64-FAST-RA-SAHF-NEXT: callq bar
213 ; 64-FAST-RA-SAHF-NEXT: movq %rbx, %rax
214 ; 64-FAST-RA-SAHF-NEXT: addb $127, %al
215 ; 64-FAST-RA-SAHF-NEXT: sahf
216 ; 64-FAST-RA-SAHF-NEXT: jne .LBB0_3
217 ; 64-FAST-RA-SAHF-NEXT: # %bb.1: # %t
218 ; 64-FAST-RA-SAHF-NEXT: movl $42, %eax
219 ; 64-FAST-RA-SAHF-NEXT: jmp .LBB0_2
220 ; 64-FAST-RA-SAHF-NEXT: .LBB0_3: # %f
221 ; 64-FAST-RA-SAHF-NEXT: xorl %eax, %eax
222 ; 64-FAST-RA-SAHF-NEXT: .LBB0_2: # %t
223 ; 64-FAST-RA-SAHF-NEXT: addq $8, %rsp
224 ; 64-FAST-RA-SAHF-NEXT: popq %rbx
225 ; 64-FAST-RA-SAHF-NEXT: popq %rbp
226 ; 64-FAST-RA-SAHF-NEXT: retq
88 ; 64-ALL-LABEL: test_intervening_call:
89 ; 64-ALL: # %bb.0: # %entry
90 ; 64-ALL-NEXT: pushq %rbx
91 ; 64-ALL-NEXT: movq %rsi, %rax
92 ; 64-ALL-NEXT: lock cmpxchgq %rdx, (%rdi)
93 ; 64-ALL-NEXT: setne %bl
94 ; 64-ALL-NEXT: movq %rax, %rdi
95 ; 64-ALL-NEXT: callq bar
96 ; 64-ALL-NEXT: testb $-1, %bl
97 ; 64-ALL-NEXT: jne .LBB0_2
98 ; 64-ALL-NEXT: # %bb.1: # %t
99 ; 64-ALL-NEXT: movl $42, %eax
100 ; 64-ALL-NEXT: popq %rbx
101 ; 64-ALL-NEXT: retq
102 ; 64-ALL-NEXT: .LBB0_2: # %f
103 ; 64-ALL-NEXT: xorl %eax, %eax
104 ; 64-ALL-NEXT: popq %rbx
105 ; 64-ALL-NEXT: retq
227106 entry:
228107 %cx = cmpxchg i64* %foo, i64 %bar, i64 %baz seq_cst seq_cst
229108 %v = extractvalue { i64, i1 } %cx, 0
330209 define i32 @test_feed_cmov(i32* %addr, i32 %desired, i32 %new) nounwind {
331210 ; 32-GOOD-RA-LABEL: test_feed_cmov:
332211 ; 32-GOOD-RA: # %bb.0: # %entry
333 ; 32-GOOD-RA-NEXT: pushl %ebp
334 ; 32-GOOD-RA-NEXT: movl %esp, %ebp
335 ; 32-GOOD-RA-NEXT: pushl %edi
212 ; 32-GOOD-RA-NEXT: pushl %ebx
336213 ; 32-GOOD-RA-NEXT: pushl %esi
337 ; 32-GOOD-RA-NEXT: movl 12(%ebp), %eax
338 ; 32-GOOD-RA-NEXT: movl 16(%ebp), %esi
339 ; 32-GOOD-RA-NEXT: movl 8(%ebp), %ecx
214 ; 32-GOOD-RA-NEXT: pushl %eax
215 ; 32-GOOD-RA-NEXT: movl {{[0-9]+}}(%esp), %eax
216 ; 32-GOOD-RA-NEXT: movl {{[0-9]+}}(%esp), %esi
217 ; 32-GOOD-RA-NEXT: movl {{[0-9]+}}(%esp), %ecx
340218 ; 32-GOOD-RA-NEXT: lock cmpxchgl %esi, (%ecx)
341 ; 32-GOOD-RA-NEXT: seto %al
342 ; 32-GOOD-RA-NEXT: lahf
343 ; 32-GOOD-RA-NEXT: movl %eax, %edi
219 ; 32-GOOD-RA-NEXT: sete %bl
344220 ; 32-GOOD-RA-NEXT: calll foo
345 ; 32-GOOD-RA-NEXT: pushl %eax
346 ; 32-GOOD-RA-NEXT: movl %edi, %eax
347 ; 32-GOOD-RA-NEXT: addb $127, %al
348 ; 32-GOOD-RA-NEXT: sahf
349 ; 32-GOOD-RA-NEXT: popl %eax
350 ; 32-GOOD-RA-NEXT: je .LBB2_2
221 ; 32-GOOD-RA-NEXT: testb $-1, %bl
222 ; 32-GOOD-RA-NEXT: jne .LBB2_2
351223 ; 32-GOOD-RA-NEXT: # %bb.1: # %entry
352224 ; 32-GOOD-RA-NEXT: movl %eax, %esi
353225 ; 32-GOOD-RA-NEXT: .LBB2_2: # %entry
354226 ; 32-GOOD-RA-NEXT: movl %esi, %eax
227 ; 32-GOOD-RA-NEXT: addl $4, %esp
355228 ; 32-GOOD-RA-NEXT: popl %esi
356 ; 32-GOOD-RA-NEXT: popl %edi
357 ; 32-GOOD-RA-NEXT: popl %ebp
229 ; 32-GOOD-RA-NEXT: popl %ebx
358230 ; 32-GOOD-RA-NEXT: retl
359231 ;
360232 ; 32-FAST-RA-LABEL: test_feed_cmov:
361233 ; 32-FAST-RA: # %bb.0: # %entry
362 ; 32-FAST-RA-NEXT: pushl %ebp
363 ; 32-FAST-RA-NEXT: movl %esp, %ebp
364 ; 32-FAST-RA-NEXT: pushl %edi
234 ; 32-FAST-RA-NEXT: pushl %ebx
365235 ; 32-FAST-RA-NEXT: pushl %esi
366 ; 32-FAST-RA-NEXT: movl 8(%ebp), %ecx
367 ; 32-FAST-RA-NEXT: movl 16(%ebp), %esi
368 ; 32-FAST-RA-NEXT: movl 12(%ebp), %eax
236 ; 32-FAST-RA-NEXT: pushl %eax
237 ; 32-FAST-RA-NEXT: movl {{[0-9]+}}(%esp), %ecx
238 ; 32-FAST-RA-NEXT: movl {{[0-9]+}}(%esp), %esi
239 ; 32-FAST-RA-NEXT: movl {{[0-9]+}}(%esp), %eax
369240 ; 32-FAST-RA-NEXT: lock cmpxchgl %esi, (%ecx)
370 ; 32-FAST-RA-NEXT: seto %al
371 ; 32-FAST-RA-NEXT: lahf
372 ; 32-FAST-RA-NEXT: movl %eax, %edi
241 ; 32-FAST-RA-NEXT: sete %bl
373242 ; 32-FAST-RA-NEXT: calll foo
374 ; 32-FAST-RA-NEXT: pushl %eax
375 ; 32-FAST-RA-NEXT: movl %edi, %eax
376 ; 32-FAST-RA-NEXT: addb $127, %al
377 ; 32-FAST-RA-NEXT: sahf
378 ; 32-FAST-RA-NEXT: popl %eax
379 ; 32-FAST-RA-NEXT: je .LBB2_2
243 ; 32-FAST-RA-NEXT: testb $-1, %bl
244 ; 32-FAST-RA-NEXT: jne .LBB2_2
380245 ; 32-FAST-RA-NEXT: # %bb.1: # %entry
381246 ; 32-FAST-RA-NEXT: movl %eax, %esi
382247 ; 32-FAST-RA-NEXT: .LBB2_2: # %entry
383248 ; 32-FAST-RA-NEXT: movl %esi, %eax
249 ; 32-FAST-RA-NEXT: addl $4, %esp
384250 ; 32-FAST-RA-NEXT: popl %esi
385 ; 32-FAST-RA-NEXT: popl %edi
386 ; 32-FAST-RA-NEXT: popl %ebp
251 ; 32-FAST-RA-NEXT: popl %ebx
387252 ; 32-FAST-RA-NEXT: retl
388253 ;
389 ; 64-GOOD-RA-LABEL: test_feed_cmov:
390 ; 64-GOOD-RA: # %bb.0: # %entry
391 ; 64-GOOD-RA-NEXT: pushq %rbp
392 ; 64-GOOD-RA-NEXT: movq %rsp, %rbp
393 ; 64-GOOD-RA-NEXT: pushq %r14
394 ; 64-GOOD-RA-NEXT: pushq %rbx
395 ; 64-GOOD-RA-NEXT: movl %edx, %ebx
396 ; 64-GOOD-RA-NEXT: movl %esi, %eax
397 ; 64-GOOD-RA-NEXT: lock cmpxchgl %edx, (%rdi)
398 ; 64-GOOD-RA-NEXT: pushfq
399 ; 64-GOOD-RA-NEXT: popq %r14
400 ; 64-GOOD-RA-NEXT: callq foo
401 ; 64-GOOD-RA-NEXT: pushq %r14
402 ; 64-GOOD-RA-NEXT: popfq
403 ; 64-GOOD-RA-NEXT: cmovel %ebx, %eax
404 ; 64-GOOD-RA-NEXT: popq %rbx
405 ; 64-GOOD-RA-NEXT: popq %r14
406 ; 64-GOOD-RA-NEXT: popq %rbp
407 ; 64-GOOD-RA-NEXT: retq
408 ;
409 ; 64-FAST-RA-LABEL: test_feed_cmov:
410 ; 64-FAST-RA: # %bb.0: # %entry
411 ; 64-FAST-RA-NEXT: pushq %rbp
412 ; 64-FAST-RA-NEXT: movq %rsp, %rbp
413 ; 64-FAST-RA-NEXT: pushq %r14
414 ; 64-FAST-RA-NEXT: pushq %rbx
415 ; 64-FAST-RA-NEXT: movl %edx, %ebx
416 ; 64-FAST-RA-NEXT: movl %esi, %eax
417 ; 64-FAST-RA-NEXT: lock cmpxchgl %edx, (%rdi)
418 ; 64-FAST-RA-NEXT: pushfq
419 ; 64-FAST-RA-NEXT: popq %r14
420 ; 64-FAST-RA-NEXT: callq foo
421 ; 64-FAST-RA-NEXT: pushq %r14
422 ; 64-FAST-RA-NEXT: popfq
423 ; 64-FAST-RA-NEXT: cmovel %ebx, %eax
424 ; 64-FAST-RA-NEXT: popq %rbx
425 ; 64-FAST-RA-NEXT: popq %r14
426 ; 64-FAST-RA-NEXT: popq %rbp
427 ; 64-FAST-RA-NEXT: retq
428 ;
429 ; 64-GOOD-RA-SAHF-LABEL: test_feed_cmov:
430 ; 64-GOOD-RA-SAHF: # %bb.0: # %entry
431 ; 64-GOOD-RA-SAHF-NEXT: pushq %rbp
432 ; 64-GOOD-RA-SAHF-NEXT: movq %rsp, %rbp
433 ; 64-GOOD-RA-SAHF-NEXT: pushq %r14
434 ; 64-GOOD-RA-SAHF-NEXT: pushq %rbx
435 ; 64-GOOD-RA-SAHF-NEXT: movl %edx, %ebx
436 ; 64-GOOD-RA-SAHF-NEXT: movl %esi, %eax
437 ; 64-GOOD-RA-SAHF-NEXT: lock cmpxchgl %edx, (%rdi)
438 ; 64-GOOD-RA-SAHF-NEXT: seto %al
439 ; 64-GOOD-RA-SAHF-NEXT: lahf
440 ; 64-GOOD-RA-SAHF-NEXT: movq %rax, %r14
441 ; 64-GOOD-RA-SAHF-NEXT: callq foo
442 ; 64-GOOD-RA-SAHF-NEXT: pushq %rax
443 ; 64-GOOD-RA-SAHF-NEXT: movq %r14, %rax
444 ; 64-GOOD-RA-SAHF-NEXT: addb $127, %al
445 ; 64-GOOD-RA-SAHF-NEXT: sahf
446 ; 64-GOOD-RA-SAHF-NEXT: popq %rax
447 ; 64-GOOD-RA-SAHF-NEXT: cmovel %ebx, %eax
448 ; 64-GOOD-RA-SAHF-NEXT: popq %rbx
449 ; 64-GOOD-RA-SAHF-NEXT: popq %r14
450 ; 64-GOOD-RA-SAHF-NEXT: popq %rbp
451 ; 64-GOOD-RA-SAHF-NEXT: retq
452 ;
453 ; 64-FAST-RA-SAHF-LABEL: test_feed_cmov:
454 ; 64-FAST-RA-SAHF: # %bb.0: # %entry
455 ; 64-FAST-RA-SAHF-NEXT: pushq %rbp
456 ; 64-FAST-RA-SAHF-NEXT: movq %rsp, %rbp
457 ; 64-FAST-RA-SAHF-NEXT: pushq %r14
458 ; 64-FAST-RA-SAHF-NEXT: pushq %rbx
459 ; 64-FAST-RA-SAHF-NEXT: movl %edx, %ebx
460 ; 64-FAST-RA-SAHF-NEXT: movl %esi, %eax
461 ; 64-FAST-RA-SAHF-NEXT: lock cmpxchgl %edx, (%rdi)
462 ; 64-FAST-RA-SAHF-NEXT: seto %al
463 ; 64-FAST-RA-SAHF-NEXT: lahf
464 ; 64-FAST-RA-SAHF-NEXT: movq %rax, %r14
465 ; 64-FAST-RA-SAHF-NEXT: callq foo
466 ; 64-FAST-RA-SAHF-NEXT: pushq %rax
467 ; 64-FAST-RA-SAHF-NEXT: movq %r14, %rax
468 ; 64-FAST-RA-SAHF-NEXT: addb $127, %al
469 ; 64-FAST-RA-SAHF-NEXT: sahf
470 ; 64-FAST-RA-SAHF-NEXT: popq %rax
471 ; 64-FAST-RA-SAHF-NEXT: cmovel %ebx, %eax
472 ; 64-FAST-RA-SAHF-NEXT: popq %rbx
473 ; 64-FAST-RA-SAHF-NEXT: popq %r14
474 ; 64-FAST-RA-SAHF-NEXT: popq %rbp
475 ; 64-FAST-RA-SAHF-NEXT: retq
254 ; 64-ALL-LABEL: test_feed_cmov:
255 ; 64-ALL: # %bb.0: # %entry
256 ; 64-ALL-NEXT: pushq %rbp
257 ; 64-ALL-NEXT: pushq %rbx
258 ; 64-ALL-NEXT: pushq %rax
259 ; 64-ALL-NEXT: movl %edx, %ebx
260 ; 64-ALL-NEXT: movl %esi, %eax
261 ; 64-ALL-NEXT: lock cmpxchgl %edx, (%rdi)
262 ; 64-ALL-NEXT: sete %bpl
263 ; 64-ALL-NEXT: callq foo
264 ; 64-ALL-NEXT: testb $-1, %bpl
265 ; 64-ALL-NEXT: cmovnel %ebx, %eax
266 ; 64-ALL-NEXT: addq $8, %rsp
267 ; 64-ALL-NEXT: popq %rbx
268 ; 64-ALL-NEXT: popq %rbp
269 ; 64-ALL-NEXT: retq
476270 entry:
477271 %res = cmpxchg i32* %addr, i32 %desired, i32 %new seq_cst seq_cst
478272 %success = extractvalue { i32, i1 } %res, 1
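The 64-ALL output above is the core of the new lowering: ZF from the lock cmpxchg is captured into a GPR with sete while the flags are still live, the call is then free to clobber EFLAGS, and the condition is recomputed afterward with testb $-1 feeding the cmov. A minimal C++ sketch of the same dataflow, not LLVM source; opaque_call is a hypothetical stand-in for foo, and the CAS uses the GCC/Clang __atomic builtin:

// Sketch: ZF from the cmpxchg is saved as a 0/1 byte (sete), an
// arbitrary call clobbers EFLAGS, and the select is re-derived from
// the byte (testb $-1, %reg; cmovnel).
static int opaque_call(int x) { return x + 1; } // stand-in for foo

int feed_cmov_sketch(int *addr, int desired, int newv) {
  int expected = desired;
  // Compiles to lock cmpxchgl on x86.
  bool success = __atomic_compare_exchange_n(addr, &expected, newv,
                                             /*weak=*/false,
                                             __ATOMIC_SEQ_CST,
                                             __ATOMIC_SEQ_CST);
  opaque_call(0);                   // clobbers EFLAGS, like calll foo
  return success ? newv : expected; // testb $-1 + cmovnel in the new output
}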
1818 ; X32-LABEL: test1:
1919 ; X32: # %bb.0: # %entry
2020 ; X32-NEXT: movb b, %cl
21 ; X32-NEXT: movb %cl, %al
21 ; X32-NEXT: movl %ecx, %eax
2222 ; X32-NEXT: incb %al
2323 ; X32-NEXT: movb %al, b
2424 ; X32-NEXT: incl c
25 ; X32-NEXT: pushl %eax
26 ; X32-NEXT: seto %al
27 ; X32-NEXT: lahf
28 ; X32-NEXT: movl %eax, %edx
29 ; X32-NEXT: popl %eax
25 ; X32-NEXT: sete %dl
3026 ; X32-NEXT: movb a, %ah
3127 ; X32-NEXT: movb %ah, %ch
3228 ; X32-NEXT: incb %ch
3329 ; X32-NEXT: cmpb %cl, %ah
3430 ; X32-NEXT: sete d
3531 ; X32-NEXT: movb %ch, a
36 ; X32-NEXT: pushl %eax
37 ; X32-NEXT: movl %edx, %eax
38 ; X32-NEXT: addb $127, %al
39 ; X32-NEXT: sahf
40 ; X32-NEXT: popl %eax
41 ; X32-NEXT: je .LBB0_2
32 ; X32-NEXT: testb $-1, %dl
33 ; X32-NEXT: jne .LBB0_2
4234 ; X32-NEXT: # %bb.1: # %if.then
43 ; X32-NEXT: pushl %ebp
44 ; X32-NEXT: movl %esp, %ebp
4535 ; X32-NEXT: movsbl %al, %eax
4636 ; X32-NEXT: pushl %eax
4737 ; X32-NEXT: calll external
4838 ; X32-NEXT: addl $4, %esp
49 ; X32-NEXT: popl %ebp
5039 ; X32-NEXT: .LBB0_2: # %if.end
5140 ; X32-NEXT: xorl %eax, %eax
5241 ; X32-NEXT: retl
5847 ; X64-NEXT: incb %al
5948 ; X64-NEXT: movb %al, {{.*}}(%rip)
6049 ; X64-NEXT: incl {{.*}}(%rip)
61 ; X64-NEXT: pushfq
62 ; X64-NEXT: popq %rsi
50 ; X64-NEXT: sete %sil
6351 ; X64-NEXT: movb {{.*}}(%rip), %cl
6452 ; X64-NEXT: movl %ecx, %edx
6553 ; X64-NEXT: incb %dl
6654 ; X64-NEXT: cmpb %dil, %cl
6755 ; X64-NEXT: sete {{.*}}(%rip)
6856 ; X64-NEXT: movb %dl, {{.*}}(%rip)
69 ; X64-NEXT: pushq %rsi
70 ; X64-NEXT: popfq
71 ; X64-NEXT: je .LBB0_2
57 ; X64-NEXT: testb $-1, %sil
58 ; X64-NEXT: jne .LBB0_2
7259 ; X64-NEXT: # %bb.1: # %if.then
73 ; X64-NEXT: pushq %rbp
74 ; X64-NEXT: movq %rsp, %rbp
60 ; X64-NEXT: pushq %rax
7561 ; X64-NEXT: movsbl %al, %edi
7662 ; X64-NEXT: callq external
77 ; X64-NEXT: popq %rbp
63 ; X64-NEXT: addq $8, %rsp
7864 ; X64-NEXT: .LBB0_2: # %if.end
7965 ; X64-NEXT: xorl %eax, %eax
8066 ; X64-NEXT: retq
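In both test1 outputs the old seto/lahf ... addb/sahf dance collapses to a single setcc byte, and the branch sense flips: the byte is nonzero exactly when the original condition held, so a je over restored flags becomes testb $-1 plus jne. A small sketch of that test semantics, assuming the saved byte is 0 or 1:

// Sketch: why "testb $-1, %reg; jne" takes the branch exactly when the
// saved setcc byte is nonzero. TEST ands its operands and sets ZF from
// the result without writing a register.
#include <cassert>
#include <cstdint>

bool jne_taken(uint8_t savedByte) {
  uint8_t anded = savedByte & 0xFF; // testb $-1, %reg
  bool zf = (anded == 0);
  return !zf;                       // jne: taken when ZF is clear
}

int main() {
  assert(jne_taken(1));
  assert(!jne_taken(0));
}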
10793 define i32 @test2(i32* %ptr) nounwind {
10894 ; X32-LABEL: test2:
10995 ; X32: # %bb.0: # %entry
110 ; X32-NEXT: pushl %ebp
111 ; X32-NEXT: movl %esp, %ebp
112 ; X32-NEXT: pushl %esi
113 ; X32-NEXT: movl 8(%ebp), %eax
96 ; X32-NEXT: pushl %ebx
97 ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
11498 ; X32-NEXT: incl (%eax)
115 ; X32-NEXT: seto %al
116 ; X32-NEXT: lahf
117 ; X32-NEXT: movl %eax, %esi
99 ; X32-NEXT: setne %bl
118100 ; X32-NEXT: pushl $42
119101 ; X32-NEXT: calll external
120102 ; X32-NEXT: addl $4, %esp
121 ; X32-NEXT: movl %esi, %eax
122 ; X32-NEXT: addb $127, %al
123 ; X32-NEXT: sahf
103 ; X32-NEXT: testb $-1, %bl
124104 ; X32-NEXT: je .LBB1_1
125 ; X32-NEXT: # %bb.3: # %else
105 ; X32-NEXT: # %bb.2: # %else
126106 ; X32-NEXT: xorl %eax, %eax
127 ; X32-NEXT: jmp .LBB1_2
107 ; X32-NEXT: popl %ebx
108 ; X32-NEXT: retl
128109 ; X32-NEXT: .LBB1_1: # %then
129110 ; X32-NEXT: movl $64, %eax
130 ; X32-NEXT: .LBB1_2: # %then
131 ; X32-NEXT: popl %esi
132 ; X32-NEXT: popl %ebp
111 ; X32-NEXT: popl %ebx
133112 ; X32-NEXT: retl
134113 ;
135114 ; X64-LABEL: test2:
136115 ; X64: # %bb.0: # %entry
137 ; X64-NEXT: pushq %rbp
138 ; X64-NEXT: movq %rsp, %rbp
139116 ; X64-NEXT: pushq %rbx
140 ; X64-NEXT: pushq %rax
141117 ; X64-NEXT: incl (%rdi)
142 ; X64-NEXT: pushfq
143 ; X64-NEXT: popq %rbx
118 ; X64-NEXT: setne %bl
144119 ; X64-NEXT: movl $42, %edi
145120 ; X64-NEXT: callq external
146 ; X64-NEXT: pushq %rbx
147 ; X64-NEXT: popfq
121 ; X64-NEXT: testb $-1, %bl
148122 ; X64-NEXT: je .LBB1_1
149 ; X64-NEXT: # %bb.3: # %else
123 ; X64-NEXT: # %bb.2: # %else
150124 ; X64-NEXT: xorl %eax, %eax
151 ; X64-NEXT: jmp .LBB1_2
125 ; X64-NEXT: popq %rbx
126 ; X64-NEXT: retq
152127 ; X64-NEXT: .LBB1_1: # %then
153128 ; X64-NEXT: movl $64, %eax
154 ; X64-NEXT: .LBB1_2: # %then
155 ; X64-NEXT: addq $8, %rsp
156129 ; X64-NEXT: popq %rbx
157 ; X64-NEXT: popq %rbp
158130 ; X64-NEXT: retq
159131 entry:
160132 %val = load i32, i32* %ptr
182154 define void @test_tail_call(i32* %ptr) nounwind optsize {
183155 ; X32-LABEL: test_tail_call:
184156 ; X32: # %bb.0: # %entry
185 ; X32-NEXT: pushl %ebp
186 ; X32-NEXT: movl %esp, %ebp
187 ; X32-NEXT: movl 8(%ebp), %eax
157 ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
188158 ; X32-NEXT: incl (%eax)
189 ; X32-NEXT: seto %al
190 ; X32-NEXT: lahf
191 ; X32-NEXT: movl %eax, %eax
159 ; X32-NEXT: setne %al
192160 ; X32-NEXT: incb a
193161 ; X32-NEXT: sete d
194 ; X32-NEXT: movl %eax, %eax
195 ; X32-NEXT: addb $127, %al
196 ; X32-NEXT: sahf
197 ; X32-NEXT: je .LBB2_1
198 ; X32-NEXT: # %bb.2: # %else
199 ; X32-NEXT: popl %ebp
200 ; X32-NEXT: jmp external_b # TAILCALL
201 ; X32-NEXT: .LBB2_1: # %then
202 ; X32-NEXT: popl %ebp
162 ; X32-NEXT: testb $-1, %al
163 ; X32-NEXT: jne external_b # TAILCALL
164 ; X32-NEXT: # %bb.1: # %then
203165 ; X32-NEXT: jmp external_a # TAILCALL
204166 ;
205167 ; X64-LABEL: test_tail_call:
206168 ; X64: # %bb.0: # %entry
207 ; X64-NEXT: pushq %rbp
208 ; X64-NEXT: movq %rsp, %rbp
209169 ; X64-NEXT: incl (%rdi)
210 ; X64-NEXT: pushfq
211 ; X64-NEXT: popq %rax
170 ; X64-NEXT: setne %al
212171 ; X64-NEXT: incb {{.*}}(%rip)
213172 ; X64-NEXT: sete {{.*}}(%rip)
214 ; X64-NEXT: pushq %rax
215 ; X64-NEXT: popfq
216 ; X64-NEXT: je .LBB2_1
217 ; X64-NEXT: # %bb.2: # %else
218 ; X64-NEXT: popq %rbp
219 ; X64-NEXT: jmp external_b # TAILCALL
220 ; X64-NEXT: .LBB2_1: # %then
221 ; X64-NEXT: popq %rbp
173 ; X64-NEXT: testb $-1, %al
174 ; X64-NEXT: jne external_b # TAILCALL
175 ; X64-NEXT: # %bb.1: # %then
222176 ; X64-NEXT: jmp external_a # TAILCALL
223177 entry:
224178 %val = load i32, i32* %ptr
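test_tail_call shows the payoff of dropping the frame-pointer hack: with the flag held in a GPR there is no pushf/popf to balance against the stack, so both tail calls survive with no frame setup at all. A minimal sketch of that shape, with external_a/external_b stubbed out purely for illustration:

// Sketch of the tail-call shape above: the saved byte feeds a
// conditional jump straight to a sibling call; no frame is needed.
extern "C" void external_a() {} // stub for illustration
extern "C" void external_b() {} // stub for illustration

void dispatch(bool flag) {       // flag plays the role of setne %al
  if (flag) return external_b(); // jne external_b # TAILCALL
  return external_a();           // jmp external_a # TAILCALL
}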
test/CodeGen/X86/eflags-copy-expansion.mir +0 -64
0 # RUN: llc -run-pass postrapseudos -mtriple=i386-apple-macosx -o - %s | FileCheck %s
1
2 # Verify that we correctly save and restore eax when copying eflags,
3 # even when only a smaller alias of eax is used. We used to check only
4 # eax and not its aliases.
5 # PR27624.
6
7 --- |
8 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
9
10 define void @foo() {
11 entry:
12 br label %false
13 false:
14 ret void
15 }
16
17 ...
18
19 ---
20 name: foo
21 tracksRegLiveness: true
22 liveins:
23 - { reg: '$edi' }
24 body: |
25 bb.0.entry:
26 liveins: $edi
27 NOOP implicit-def $al
28
29 ; The bug was triggered only when LivePhysRegs is used, which
30 ; happens only when the heuristic for the liveness computation
31 ; fails. The liveness computation heuristic looks at 10 instructions
32 ; before and after the copy. Make sure we do not reach the definition of
33 ; AL within 10 instructions; otherwise the heuristic will see that it is live.
34 NOOP
35 NOOP
36 NOOP
37 NOOP
38 NOOP
39 NOOP
40 NOOP
41 NOOP
42 NOOP
43 NOOP
44 NOOP
45 NOOP
46 NOOP
47 ; Save AL.
48 ; CHECK: PUSH32r killed $eax
49
50 ; Copy edi into EFLAGS
51 ; CHECK-NEXT: $eax = MOV32rr $edi
52 ; CHECK-NEXT: $al = ADD8ri $al, 127, implicit-def $eflags
53 ; CHECK-NEXT: SAHF implicit-def $eflags, implicit $ah
54 $eflags = COPY $edi
55
56 ; Restore AL.
57 ; CHECK-NEXT: $eax = POP32r
58 bb.1.false:
59 liveins: $al
60 NOOP implicit $al
61 RETQ
62
63 ...
0 # RUN: llc -run-pass x86-flags-copy-lowering -verify-machineinstrs -o - %s | FileCheck %s
1 #
2 # Lower various interesting copy patterns of EFLAGS without using LAHF/SAHF.
3
4 --- |
5 target triple = "x86_64-unknown-unknown"
6
7 declare void @foo()
8
9 define i32 @test_branch(i64 %a, i64 %b) {
10 entry:
11 call void @foo()
12 ret i32 0
13 }
14
15 define i32 @test_branch_fallthrough(i64 %a, i64 %b) {
16 entry:
17 call void @foo()
18 ret i32 0
19 }
20
21 define void @test_setcc(i64 %a, i64 %b) {
22 entry:
23 call void @foo()
24 ret void
25 }
26
27 define void @test_cmov(i64 %a, i64 %b) {
28 entry:
29 call void @foo()
30 ret void
31 }
32
33 define void @test_adc(i64 %a, i64 %b) {
34 entry:
35 call void @foo()
36 ret void
37 }
38
39 define void @test_sbb(i64 %a, i64 %b) {
40 entry:
41 call void @foo()
42 ret void
43 }
44
45 define void @test_adcx(i64 %a, i64 %b) {
46 entry:
47 call void @foo()
48 ret void
49 }
50
51 define void @test_adox(i64 %a, i64 %b) {
52 entry:
53 call void @foo()
54 ret void
55 }
56
57 define void @test_rcl(i64 %a, i64 %b) {
58 entry:
59 call void @foo()
60 ret void
61 }
62
63 define void @test_rcr(i64 %a, i64 %b) {
64 entry:
65 call void @foo()
66 ret void
67 }
68 ...
69 ---
70 name: test_branch
71 # CHECK-LABEL: name: test_branch
72 liveins:
73 - { reg: '$rdi', virtual-reg: '%0' }
74 - { reg: '$rsi', virtual-reg: '%1' }
75 body: |
76 bb.0:
77 successors: %bb.1, %bb.2, %bb.3
78 liveins: $rdi, $rsi
79
80 %0:gr64 = COPY $rdi
81 %1:gr64 = COPY $rsi
82 CMP64rr %0, %1, implicit-def $eflags
83 %2:gr64 = COPY $eflags
84 ; CHECK-NOT: COPY{{( killed)?}} $eflags
85 ; CHECK: %[[A_REG:[^:]*]]:gr8 = SETAr implicit $eflags
86 ; CHECK-NEXT: %[[B_REG:[^:]*]]:gr8 = SETBr implicit $eflags
87 ; CHECK-NOT: COPY{{( killed)?}} $eflags
88
89 ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
90 CALL64pcrel32 @foo, csr_64, implicit $rsp, implicit $ssp, implicit $rdi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax
91 ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
92
93 $eflags = COPY %2
94 JA_1 %bb.1, implicit $eflags
95 JB_1 %bb.2, implicit $eflags
96 JMP_1 %bb.3
97 ; CHECK-NOT: $eflags =
98 ;
99 ; CHECK: TEST8ri %[[A_REG]], -1, implicit-def $eflags
100 ; CHECK-NEXT: JNE_1 %bb.1, implicit killed $eflags
101 ; CHECK-SAME: {{$[[:space:]]}}
102 ; CHECK-NEXT: bb.4:
103 ; CHECK-NEXT: successors: {{.*$}}
104 ; CHECK-SAME: {{$[[:space:]]}}
105 ; CHECK-NEXT: TEST8ri %[[B_REG]], -1, implicit-def $eflags
106 ; CHECK-NEXT: JNE_1 %bb.2, implicit killed $eflags
107 ; CHECK-NEXT: JMP_1 %bb.3
108
109 bb.1:
110 %3:gr32 = MOV32ri64 42
111 $eax = COPY %3
112 RET 0, $eax
113
114 bb.2:
115 %4:gr32 = MOV32ri64 43
116 $eax = COPY %4
117 RET 0, $eax
118
119 bb.3:
120 %5:gr32 = MOV32r0 implicit-def dead $eflags
121 $eax = COPY %5
122 RET 0, $eax
123
124 ...
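The CHECK lines for test_branch spell out the branching strategy: one setcc byte per used condition is materialized while EFLAGS is live, each jCC is rewritten into TEST8ri plus JNE_1, and a fresh block (bb.4 here) is inserted so the second test has an explicit fall-through with its own successor list. The rewritten control flow reduces to the following sketch, with aByte/bByte standing in for the SETAr/SETBr results captured before the call:

// Sketch of the control flow the CHECK lines describe.
int branch3_sketch(bool aByte, bool bByte) {
  if (aByte)  // TEST8ri %a_reg, -1 ; JNE_1 %bb.1
    return 42;
  // bb.4: the fall-through block the pass inserts between the tests
  if (bByte)  // TEST8ri %b_reg, -1 ; JNE_1 %bb.2
    return 43;
  return 0;   // JMP_1 %bb.3
}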
125 ---
126 name: test_branch_fallthrough
127 # CHECK-LABEL: name: test_branch_fallthrough
128 liveins:
129 - { reg: '$rdi', virtual-reg: '%0' }
130 - { reg: '$rsi', virtual-reg: '%1' }
131 body: |
132 bb.0:
133 successors: %bb.1, %bb.2, %bb.3
134 liveins: $rdi, $rsi
135
136 %0:gr64 = COPY $rdi
137 %1:gr64 = COPY $rsi
138 CMP64rr %0, %1, implicit-def $eflags
139 %2:gr64 = COPY $eflags
140 ; CHECK-NOT: COPY{{( killed)?}} $eflags
141 ; CHECK: %[[A_REG:[^:]*]]:gr8 = SETAr implicit $eflags
142 ; CHECK-NEXT: %[[B_REG:[^:]*]]:gr8 = SETBr implicit $eflags
143 ; CHECK-NOT: COPY{{( killed)?}} $eflags
144
145 ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
146 CALL64pcrel32 @foo, csr_64, implicit $rsp, implicit $ssp, implicit $rdi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax
147 ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
148
149 $eflags = COPY %2
150 JA_1 %bb.2, implicit $eflags
151 JB_1 %bb.3, implicit $eflags
152 ; CHECK-NOT: $eflags =
153 ;
154 ; CHECK: TEST8ri %[[A_REG]], -1, implicit-def $eflags
155 ; CHECK-NEXT: JNE_1 %bb.2, implicit killed $eflags
156 ; CHECK-SAME: {{$[[:space:]]}}
157 ; CHECK-NEXT: bb.4:
158 ; CHECK-NEXT: successors: {{.*$}}
159 ; CHECK-SAME: {{$[[:space:]]}}
160 ; CHECK-NEXT: TEST8ri %[[B_REG]], -1, implicit-def $eflags
161 ; CHECK-NEXT: JNE_1 %bb.3, implicit killed $eflags
162 ; CHECK-SAME: {{$[[:space:]]}}
163 ; CHECK-NEXT: bb.1:
164
165 bb.1:
166 %5:gr32 = MOV32r0 implicit-def dead $eflags
167 $eax = COPY %5
168 RET 0, $eax
169
170 bb.2:
171 %3:gr32 = MOV32ri64 42
172 $eax = COPY %3
173 RET 0, $eax
174
175 bb.3:
176 %4:gr32 = MOV32ri64 43
177 $eax = COPY %4
178 RET 0, $eax
179
180 ...
181 ---
182 name: test_setcc
183 # CHECK-LABEL: name: test_setcc
184 liveins:
185 - { reg: '$rdi', virtual-reg: '%0' }
186 - { reg: '$rsi', virtual-reg: '%1' }
187 body: |
188 bb.0:
189 liveins: $rdi, $rsi
190
191 %0:gr64 = COPY $rdi
192 %1:gr64 = COPY $rsi
193 CMP64rr %0, %1, implicit-def $eflags
194 %2:gr64 = COPY $eflags
195 ; CHECK-NOT: COPY{{( killed)?}} $eflags
196 ; CHECK: %[[A_REG:[^:]*]]:gr8 = SETAr implicit $eflags
197 ; CHECK-NEXT: %[[B_REG:[^:]*]]:gr8 = SETBr implicit $eflags
198 ; CHECK-NEXT: %[[E_REG:[^:]*]]:gr8 = SETEr implicit $eflags
199 ; CHECK-NEXT: %[[NE_REG:[^:]*]]:gr8 = SETNEr implicit $eflags
200 ; CHECK-NOT: COPY{{( killed)?}} $eflags
201
202 ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
203 CALL64pcrel32 @foo, csr_64, implicit $rsp, implicit $ssp, implicit $rdi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax
204 ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
205
206 $eflags = COPY %2
207 %3:gr8 = SETAr implicit $eflags
208 %4:gr8 = SETBr implicit $eflags
209 %5:gr8 = SETEr implicit $eflags
210 %6:gr8 = SETNEr implicit killed $eflags
211 MOV8mr $rsp, 1, $noreg, -16, $noreg, killed %3
212 MOV8mr $rsp, 1, $noreg, -16, $noreg, killed %4
213 MOV8mr $rsp, 1, $noreg, -16, $noreg, killed %5
214 MOV8mr $rsp, 1, $noreg, -16, $noreg, killed %6
215 ; CHECK-NOT: $eflags =
216 ; CHECK-NOT: = SET{{.*}}
217 ; CHECK: MOV8mr {{.*}}, killed %[[A_REG]]
218 ; CHECK: MOV8mr {{.*}}, killed %[[B_REG]]
219 ; CHECK: MOV8mr {{.*}}, killed %[[E_REG]]
220 ; CHECK: MOV8mr {{.*}}, killed %[[NE_REG]]
221
222 RET 0
223
224 ...
225 ---
226 name: test_cmov
227 # CHECK-LABEL: name: test_cmov
228 liveins:
229 - { reg: '$rdi', virtual-reg: '%0' }
230 - { reg: '$rsi', virtual-reg: '%1' }
231 body: |
232 bb.0:
233 liveins: $rdi, $rsi
234
235 %0:gr64 = COPY $rdi
236 %1:gr64 = COPY $rsi
237 CMP64rr %0, %1, implicit-def $eflags
238 %2:gr64 = COPY $eflags
239 ; CHECK-NOT: COPY{{( killed)?}} $eflags
240 ; CHECK: %[[A_REG:[^:]*]]:gr8 = SETAr implicit $eflags
241 ; CHECK-NEXT: %[[B_REG:[^:]*]]:gr8 = SETBr implicit $eflags
242 ; CHECK-NEXT: %[[E_REG:[^:]*]]:gr8 = SETEr implicit $eflags
243 ; CHECK-NOT: COPY{{( killed)?}} $eflags
244
245 ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
246 CALL64pcrel32 @foo, csr_64, implicit $rsp, implicit $ssp, implicit $rdi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax
247 ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
248
249 $eflags = COPY %2
250 %3:gr64 = CMOVA64rr %0, %1, implicit $eflags
251 %4:gr64 = CMOVB64rr %0, %1, implicit $eflags
252 %5:gr64 = CMOVE64rr %0, %1, implicit $eflags
253 %6:gr64 = CMOVNE64rr %0, %1, implicit killed $eflags
254 ; CHECK-NOT: $eflags =
255 ; CHECK: TEST8ri %[[A_REG]], -1, implicit-def $eflags
256 ; CHECK-NEXT: %3:gr64 = CMOVNE64rr %0, %1, implicit killed $eflags
257 ; CHECK-NEXT: TEST8ri %[[B_REG]], -1, implicit-def $eflags
258 ; CHECK-NEXT: %4:gr64 = CMOVNE64rr %0, %1, implicit killed $eflags
259 ; CHECK-NEXT: TEST8ri %[[E_REG]], -1, implicit-def $eflags
260 ; CHECK-NEXT: %5:gr64 = CMOVNE64rr %0, %1, implicit killed $eflags
261 ; CHECK-NEXT: TEST8ri %[[E_REG]], -1, implicit-def $eflags
262 ; CHECK-NEXT: %6:gr64 = CMOVE64rr %0, %1, implicit killed $eflags
263 MOV64mr $rsp, 1, $noreg, -16, $noreg, killed %3
264 MOV64mr $rsp, 1, $noreg, -16, $noreg, killed %4
265 MOV64mr $rsp, 1, $noreg, -16, $noreg, killed %5
266 MOV64mr $rsp, 1, $noreg, -16, $noreg, killed %6
267
268 RET 0
269
270 ...
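test_cmov shows how condition codes are canonicalized: each cmov over the restored flags tests the matching setcc byte and becomes CMOVNE, except CMOVNE itself, which reuses the E byte with the sense inverted (the byte is set when the values compared equal, so the not-equal move must not fire). In sketch form, assuming each saved byte is 0 or 1:

// Sketch of the cmov rewrite: take b when the condition holds,
// otherwise keep a.
#include <cstdint>

uint64_t cmov_via_byte(bool condByte, uint64_t a, uint64_t b) {
  return condByte ? b : a;  // TEST8ri %byte, -1 ; CMOVNE64rr
}

uint64_t cmovne_via_e_byte(bool eByte, uint64_t a, uint64_t b) {
  // Original CMOVNE: move when ZF was clear. The E byte is set when
  // ZF was set, so the rewritten form moves when the byte is zero.
  return eByte ? a : b;     // TEST8ri %e_reg, -1 ; CMOVE64rr
}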
271 ---
272 name: test_adc
273 # CHECK-LABEL: name: test_adc
274 liveins:
275 - { reg: '$rdi', virtual-reg: '%0' }
276 - { reg: '$rsi', virtual-reg: '%1' }
277 body: |
278 bb.0:
279 liveins: $rdi, $rsi
280
281 %0:gr64 = COPY $rdi
282 %1:gr64 = COPY $rsi
283 %2:gr64 = ADD64rr %0, %1, implicit-def $eflags
284 %3:gr64 = COPY $eflags
285 ; CHECK-NOT: COPY{{( killed)?}} $eflags
286 ; CHECK: %[[CF_REG:[^:]*]]:gr8 = SETBr implicit $eflags
287 ; CHECK-NOT: COPY{{( killed)?}} $eflags
288
289 ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
290 CALL64pcrel32 @foo, csr_64, implicit $rsp, implicit $ssp, implicit $rdi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax
291 ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
292
293 $eflags = COPY %3
294 %4:gr64 = ADC64ri32 %2:gr64, 42, implicit-def $eflags, implicit $eflags
295 %5:gr64 = ADC64ri32 %4:gr64, 42, implicit-def $eflags, implicit $eflags
296 ; CHECK-NOT: $eflags =
297 ; CHECK: dead %{{[^:]*}}:gr8 = ADD8ri %[[CF_REG]], 255, implicit-def $eflags
298 ; CHECK-NEXT: %4:gr64 = ADC64ri32 %2, 42, implicit-def $eflags, implicit killed $eflags
299 ; CHECK-NEXT: %5:gr64 = ADC64ri32 %4, 42, implicit-def{{( dead)?}} $eflags, implicit{{( killed)?}} $eflags
300 MOV64mr $rsp, 1, $noreg, -16, $noreg, killed %5
301
302 RET 0
303
304 ...
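test_adc demonstrates the arithmetic path: CF is captured with SETBr, and before the dependent ADC the pass re-creates it with a dead ADD8ri of 255, which carries out of the byte exactly when the saved bit was 1. The arithmetic, assuming the byte is 0 or 1:

// Sketch of the carry rematerialization: "dead ADD8ri %byte, 255" sets
// CF precisely when the SETBr result was 1 (1 + 255 = 256 carries out
// of 8 bits; 0 + 255 = 255 does not).
#include <cassert>
#include <cstdint>

bool cf_after_add8ri_255(uint8_t cfByte) {
  unsigned sum = cfByte + 255u; // result is dead; only the carry matters
  return sum > 0xFF;            // CF of the 8-bit add
}

int main() {
  assert(cf_after_add8ri_255(1));
  assert(!cf_after_add8ri_255(0));
}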
305 ---
306 name: test_sbb
307 # CHECK-LABEL: name: test_sbb
308 liveins:
309 - { reg: '$rdi', virtual-reg: '%0' }
310 - { reg: '$rsi', virtual-reg: '%1' }
311 body: |
312 bb.0:
313 liveins: $rdi, $rsi
314
315 %0:gr64 = COPY $rdi
316 %1:gr64 = COPY $rsi
317 %2:gr64 = SUB64rr %0, %1, implicit-def $eflags
318 %3:gr64 = COPY killed $eflags
319 ; CHECK-NOT: COPY{{( killed)?}} $eflags
320 ; CHECK: %[[CF_REG:[^:]*]]:gr8 = SETBr implicit $eflags
321 ; CHECK-NOT: COPY{{( killed)?}} $eflags
322
323 ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
324 CALL64pcrel32 @foo, csr_64, implicit $rsp, implicit $ssp, implicit $rdi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax
325 ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
326
327 $eflags = COPY %3
328 %4:gr64 = SBB64ri32 %2:gr64, 42, implicit-def $eflags, implicit killed $eflags
329 %5:gr64 = SBB64ri32 %4:gr64, 42, implicit-def dead $eflags, implicit killed $eflags
330 ; CHECK-NOT: $eflags =
331 ; CHECK: dead %{{[^:]*}}:gr8 = ADD8ri %[[CF_REG]], 255, implicit-def $eflags
332 ; CHECK-NEXT: %4:gr64 = SBB64ri32 %2, 42, implicit-def $eflags, implicit killed $eflags
333 ; CHECK-NEXT: %5:gr64 = SBB64ri32 %4, 42, implicit-def{{( dead)?}} $eflags, implicit{{( killed)?}} $eflags
334 MOV64mr $rsp, 1, $noreg, -16, $noreg, killed %5
335
336 RET 0
337
338 ...
339 ---
340 name: test_adcx
341 # CHECK-LABEL: name: test_adcx
342 liveins:
343 - { reg: '$rdi', virtual-reg: '%0' }
344 - { reg: '$rsi', virtual-reg: '%1' }
345 body: |
346 bb.0:
347 liveins: $rdi, $rsi
348
349 %0:gr64 = COPY $rdi
350 %1:gr64 = COPY $rsi
351 %2:gr64 = ADD64rr %0, %1, implicit-def $eflags
352 %3:gr64 = COPY $eflags
353 ; CHECK-NOT: COPY{{( killed)?}} $eflags
354 ; CHECK: %[[E_REG:[^:]*]]:gr8 = SETEr implicit $eflags
355 ; CHECK-NEXT: %[[CF_REG:[^:]*]]:gr8 = SETBr implicit $eflags
356 ; CHECK-NOT: COPY{{( killed)?}} $eflags
357
358 ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
359 CALL64pcrel32 @foo, csr_64, implicit $rsp, implicit $ssp, implicit $rdi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax
360 ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
361
362 $eflags = COPY %3
363 %4:gr64 = CMOVE64rr %0, %1, implicit $eflags
364 %5:gr64 = MOV64ri32 42
365 %6:gr64 = ADCX64rr %2, %5, implicit-def $eflags, implicit $eflags
366 ; CHECK-NOT: $eflags =
367 ; CHECK: TEST8ri %[[E_REG]], -1, implicit-def $eflags
368 ; CHECK-NEXT: %4:gr64 = CMOVNE64rr %0, %1, implicit killed $eflags
369 ; CHECK-NEXT: %5:gr64 = MOV64ri32 42
370 ; CHECK-NEXT: dead %{{[^:]*}}:gr8 = ADD8ri %[[CF_REG]], 255, implicit-def $eflags
371 ; CHECK-NEXT: %6:gr64 = ADCX64rr %2, %5, implicit-def{{( dead)?}} $eflags, implicit killed $eflags
372 MOV64mr $rsp, 1, $noreg, -16, $noreg, killed %4
373 MOV64mr $rsp, 1, $noreg, -16, $noreg, killed %6
374
375 RET 0
376
377 ...
378 ---
379 name: test_adox
380 # CHECK-LABEL: name: test_adox
381 liveins:
382 - { reg: '$rdi', virtual-reg: '%0' }
383 - { reg: '$rsi', virtual-reg: '%1' }
384 body: |
385 bb.0:
386 liveins: $rdi, $rsi
387
388 %0:gr64 = COPY $rdi
389 %1:gr64 = COPY $rsi
390 %2:gr64 = ADD64rr %0, %1, implicit-def $eflags
391 %3:gr64 = COPY $eflags
392 ; CHECK-NOT: COPY{{( killed)?}} $eflags
393 ; CHECK: %[[E_REG:[^:]*]]:gr8 = SETEr implicit $eflags
394 ; CHECK-NEXT: %[[OF_REG:[^:]*]]:gr8 = SETOr implicit $eflags
395 ; CHECK-NOT: COPY{{( killed)?}} $eflags
396
397 ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
398 CALL64pcrel32 @foo, csr_64, implicit $rsp, implicit $ssp, implicit $rdi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax
399 ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
400
401 $eflags = COPY %3
402 %4:gr64 = CMOVE64rr %0, %1, implicit $eflags
403 %5:gr64 = MOV64ri32 42
404 %6:gr64 = ADOX64rr %2, %5, implicit-def $eflags, implicit $eflags
405 ; CHECK-NOT: $eflags =
406 ; CHECK: TEST8ri %[[E_REG]], -1, implicit-def $eflags
407 ; CHECK-NEXT: %4:gr64 = CMOVNE64rr %0, %1, implicit killed $eflags
408 ; CHECK-NEXT: %5:gr64 = MOV64ri32 42
409 ; CHECK-NEXT: dead %{{[^:]*}}:gr8 = ADD8ri %[[OF_REG]], 127, implicit-def $eflags
410 ; CHECK-NEXT: %6:gr64 = ADOX64rr %2, %5, implicit-def{{( dead)?}} $eflags, implicit killed $eflags
411 MOV64mr $rsp, 1, $noreg, -16, $noreg, killed %4
412 MOV64mr $rsp, 1, $noreg, -16, $noreg, killed %6
413
414 RET 0
415
416 ...
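test_adox is the OF analogue: the bit is captured with SETOr and re-created with a dead ADD8ri of 127, which overflows the signed byte exactly when the saved bit was 1. Again assuming a 0/1 byte:

// Sketch of the overflow rematerialization: "dead ADD8ri %byte, 127"
// sets OF precisely when the SETOr result was 1 (1 + 127 = 128 leaves
// the signed 8-bit range; 0 + 127 = 127 does not).
#include <cassert>
#include <cstdint>

bool of_after_add8ri_127(uint8_t ofByte) {
  int sum = static_cast<int8_t>(ofByte) + 127; // the 8-bit add, widened
  return sum < -128 || sum > 127;              // OF of that add
}

int main() {
  assert(of_after_add8ri_127(1));
  assert(!of_after_add8ri_127(0));
}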
417 ---
418 name: test_rcl
419 # CHECK-LABEL: name: test_rcl
420 liveins:
421 - { reg: '$rdi', virtual-reg: '%0' }
422 - { reg: '$rsi', virtual-reg: '%1' }
423 body: |
424 bb.0:
425 liveins: $rdi, $rsi
426
427 %0:gr64 = COPY $rdi
428 %1:gr64 = COPY $rsi
429 %2:gr64 = ADD64rr %0, %1, implicit-def $eflags
430 %3:gr64 = COPY $eflags
431 ; CHECK-NOT: COPY{{( killed)?}} $eflags
432 ; CHECK: %[[CF_REG:[^:]*]]:gr8 = SETBr implicit $eflags
433 ; CHECK-NOT: COPY{{( killed)?}} $eflags
434
435 ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
436 CALL64pcrel32 @foo, csr_64, implicit $rsp, implicit $ssp, implicit $rdi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax
437 ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
438
439 $eflags = COPY %3
440 %4:gr64 = RCL64r1 %2:gr64, implicit-def $eflags, implicit $eflags
441 %5:gr64 = RCL64r1 %4:gr64, implicit-def $eflags, implicit $eflags
442 ; CHECK-NOT: $eflags =
443 ; CHECK: dead %{{[^:]*}}:gr8 = ADD8ri %[[CF_REG]], 255, implicit-def $eflags
444 ; CHECK-NEXT: %4:gr64 = RCL64r1 %2, implicit-def $eflags, implicit killed $eflags
445 ; CHECK-NEXT: %5:gr64 = RCL64r1 %4, implicit-def{{( dead)?}} $eflags, implicit{{( killed)?}} $eflags
446 MOV64mr $rsp, 1, $noreg, -16, $noreg, killed %5
447
448 RET 0
449
450 ...
451 ---
452 name: test_rcr
453 # CHECK-LABEL: name: test_rcr
454 liveins:
455 - { reg: '$rdi', virtual-reg: '%0' }
456 - { reg: '$rsi', virtual-reg: '%1' }
457 body: |
458 bb.0:
459 liveins: $rdi, $rsi
460
461 %0:gr64 = COPY $rdi
462 %1:gr64 = COPY $rsi
463 %2:gr64 = ADD64rr %0, %1, implicit-def $eflags
464 %3:gr64 = COPY $eflags
465 ; CHECK-NOT: COPY{{( killed)?}} $eflags
466 ; CHECK: %[[CF_REG:[^:]*]]:gr8 = SETBr implicit $eflags
467 ; CHECK-NOT: COPY{{( killed)?}} $eflags
468
469 ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
470 CALL64pcrel32 @foo, csr_64, implicit $rsp, implicit $ssp, implicit $rdi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax
471 ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
472
473 $eflags = COPY %3
474 %4:gr64 = RCR64r1 %2:gr64, implicit-def $eflags, implicit $eflags
475 %5:gr64 = RCR64r1 %4:gr64, implicit-def $eflags, implicit $eflags
476 ; CHECK-NOT: $eflags =
477 ; CHECK: dead %{{[^:]*}}:gr8 = ADD8ri %[[CF_REG]], 255, implicit-def $eflags
478 ; CHECK-NEXT: %4:gr64 = RCR64r1 %2, implicit-def $eflags, implicit killed $eflags
479 ; CHECK-NEXT: %5:gr64 = RCR64r1 %4, implicit-def{{( dead)?}} $eflags, implicit{{( killed)?}} $eflags
480 MOV64mr $rsp, 1, $noreg, -16, $noreg, killed %5
481
482 RET 0
483
484 ...
55 ; X32-LABEL: test_1024:
66 ; X32: # %bb.0:
77 ; X32-NEXT: pushl %ebp
8 ; X32-NEXT: movl %esp, %ebp
98 ; X32-NEXT: pushl %ebx
109 ; X32-NEXT: pushl %edi
1110 ; X32-NEXT: pushl %esi
12 ; X32-NEXT: subl $996, %esp # imm = 0x3E4
13 ; X32-NEXT: movl 12(%ebp), %eax
14 ; X32-NEXT: movl 32(%eax), %eax
15 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
16 ; X32-NEXT: xorl %ecx, %ecx
17 ; X32-NEXT: mull %ecx
11 ; X32-NEXT: subl $1000, %esp # imm = 0x3E8
12 ; X32-NEXT: movl {{[0-9]+}}(%esp), %edx
13 ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
14 ; X32-NEXT: movl 48(%eax), %ecx
15 ; X32-NEXT: movl %eax, %esi
16 ; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
17 ; X32-NEXT: movl 32(%edx), %eax
18 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
19 ; X32-NEXT: xorl %edi, %edi
20 ; X32-NEXT: mull %edi
21 ; X32-NEXT: movl %edx, %ebp
1822 ; X32-NEXT: movl %eax, %ebx
19 ; X32-NEXT: movl %edx, %edi
20 ; X32-NEXT: movl 8(%ebp), %esi
21 ; X32-NEXT: movl 48(%esi), %eax
22 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
23 ; X32-NEXT: mull %ecx
24 ; X32-NEXT: xorl %ecx, %ecx
25 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
26 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
27 ; X32-NEXT: addl %ebx, %eax
28 ; X32-NEXT: adcl %edi, %edx
29 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
30 ; X32-NEXT: movl 32(%esi), %eax
31 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
32 ; X32-NEXT: mull %ecx
23 ; X32-NEXT: movl %ecx, %eax
24 ; X32-NEXT: mull %edi
3325 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
3426 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
3527 ; X32-NEXT: movl %eax, %ecx
3628 ; X32-NEXT: addl %ebx, %ecx
3729 ; X32-NEXT: movl %edx, %eax
38 ; X32-NEXT: adcl %edi, %eax
39 ; X32-NEXT: movl %edi, %ecx
40 ; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
41 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
42 ; X32-NEXT: movl 12(%ebp), %eax
30 ; X32-NEXT: adcl %ebp, %eax
31 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
32 ; X32-NEXT: movl 32(%esi), %eax
33 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
34 ; X32-NEXT: mull %edi
35 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
36 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
37 ; X32-NEXT: movl %eax, %ecx
38 ; X32-NEXT: addl %ebx, %ecx
39 ; X32-NEXT: movl %edx, %eax
40 ; X32-NEXT: adcl %ebp, %eax
41 ; X32-NEXT: movl %ebp, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
42 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
43 ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
4344 ; X32-NEXT: movl 36(%eax), %eax
4445 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
45 ; X32-NEXT: xorl %edx, %edx
46 ; X32-NEXT: mull %edx
47 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
48 ; X32-NEXT: movl %eax, %edi
49 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
50 ; X32-NEXT: addl %ecx, %edi
51 ; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
46 ; X32-NEXT: mull %edi
47 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
48 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
49 ; X32-NEXT: addl %ebp, %eax
50 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
5251 ; X32-NEXT: movl %edx, %eax
5352 ; X32-NEXT: adcl $0, %eax
5453 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
5554 ; X32-NEXT: movl 36(%esi), %eax
5655 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
57 ; X32-NEXT: xorl %ecx, %ecx
58 ; X32-NEXT: mull %ecx
59 ; X32-NEXT: movl %edx, %ecx
60 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
61 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
62 ; X32-NEXT: movl %eax, %edx
63 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
64 ; X32-NEXT: addl %esi, %edx
65 ; X32-NEXT: adcl $0, %ecx
66 ; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
56 ; X32-NEXT: mull %edi
57 ; X32-NEXT: movl %edx, %esi
58 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
59 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
60 ; X32-NEXT: movl %eax, %ebp
61 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
62 ; X32-NEXT: addl %edi, %ebp
63 ; X32-NEXT: adcl $0, %esi
6764 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
6865 ; X32-NEXT: movl %ecx, %eax
6966 ; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
7067 ; X32-NEXT: addl %ebx, %eax
7168 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
72 ; X32-NEXT: leal (%ebx,%edi), %eax
69 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
70 ; X32-NEXT: leal (%ebx,%eax), %eax
71 ; X32-NEXT: leal (%ecx,%ebp), %edx
72 ; X32-NEXT: adcl %eax, %edx
73 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
74 ; X32-NEXT: setb {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Folded Spill
75 ; X32-NEXT: addl %ecx, %ebp
76 ; X32-NEXT: movl %ebp, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
77 ; X32-NEXT: adcl %edi, %esi
78 ; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
79 ; X32-NEXT: setb {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Folded Spill
80 ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
81 ; X32-NEXT: movl (%eax), %eax
82 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
83 ; X32-NEXT: xorl %ecx, %ecx
84 ; X32-NEXT: mull %ecx
85 ; X32-NEXT: movl %eax, %esi
86 ; X32-NEXT: movl %edx, %ebx
87 ; X32-NEXT: movl {{[0-9]+}}(%esp), %ebp
88 ; X32-NEXT: movl 16(%ebp), %eax
89 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
90 ; X32-NEXT: mull %ecx
91 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
92 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
93 ; X32-NEXT: movl %eax, %ecx
94 ; X32-NEXT: addl %esi, %ecx
95 ; X32-NEXT: adcl %ebx, %edx
96 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
97 ; X32-NEXT: movl (%ebp), %eax
98 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
99 ; X32-NEXT: xorl %ecx, %ecx
100 ; X32-NEXT: mull %ecx
101 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
102 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
103 ; X32-NEXT: movl %esi, %ebp
104 ; X32-NEXT: addl %esi, %eax
105 ; X32-NEXT: movl %edx, %eax
106 ; X32-NEXT: adcl %ebx, %eax
107 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
108 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
109 ; X32-NEXT: addl %esi, %eax
110 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
111 ; X32-NEXT: adcl %ebx, %eax
112 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
113 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
114 ; X32-NEXT: addl %esi, %eax
115 ; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
116 ; X32-NEXT: movl %edi, %eax
117 ; X32-NEXT: adcl %ebx, %eax
118 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
119 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
120 ; X32-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
121 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
122 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
123 ; X32-NEXT: adcl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Spill
124 ; X32-NEXT: setb {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Folded Spill
125 ; X32-NEXT: movl {{[0-9]+}}(%esp), %esi
126 ; X32-NEXT: movl 4(%esi), %eax
127 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
128 ; X32-NEXT: xorl %ecx, %ecx
129 ; X32-NEXT: mull %ecx
130 ; X32-NEXT: movl %eax, %ecx
131 ; X32-NEXT: addl %ebx, %ecx
73132 ; X32-NEXT: movl %edx, %edi
74 ; X32-NEXT: leal (%ecx,%edx), %edx
75 ; X32-NEXT: adcl %eax, %edx
76 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
77 ; X32-NEXT: seto %al
78 ; X32-NEXT: lahf
79 ; X32-NEXT: movl %eax, %eax
80 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
81 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
82 ; X32-NEXT: addl %ecx, %edi
83 ; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
84 ; X32-NEXT: adcl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Spill
85 ; X32-NEXT: movl %esi, %ebx
86 ; X32-NEXT: setb {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Folded Spill
87 ; X32-NEXT: movl 12(%ebp), %eax
88 ; X32-NEXT: movl (%eax), %eax
133 ; X32-NEXT: adcl $0, %edi
134 ; X32-NEXT: addl %ebp, %ecx
135 ; X32-NEXT: movl %ebp, %esi
136 ; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
137 ; X32-NEXT: adcl %ebx, %edi
138 ; X32-NEXT: movl %ebx, %ebp
139 ; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
140 ; X32-NEXT: setb %cl
141 ; X32-NEXT: addl %eax, %edi
142 ; X32-NEXT: movl %edi, (%esp) # 4-byte Spill
143 ; X32-NEXT: movzbl %cl, %eax
144 ; X32-NEXT: adcl %edx, %eax
145 ; X32-NEXT: movl %eax, %ebx
146 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
147 ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
148 ; X32-NEXT: movl 8(%eax), %eax
89149 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
90150 ; X32-NEXT: xorl %ecx, %ecx
91151 ; X32-NEXT: mull %ecx
92 ; X32-NEXT: movl %eax, %esi
93 ; X32-NEXT: movl %edx, %edi
94 ; X32-NEXT: movl 8(%ebp), %ecx
95 ; X32-NEXT: movl 16(%ecx), %eax
152 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
153 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
154 ; X32-NEXT: addl %eax, %esi
155 ; X32-NEXT: adcl %edx, %ebp
156 ; X32-NEXT: addl %edi, %esi
157 ; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
158 ; X32-NEXT: adcl %ebx, %ebp
159 ; X32-NEXT: movl %ebp, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
160 ; X32-NEXT: movl {{[0-9]+}}(%esp), %ebp
161 ; X32-NEXT: movl 52(%ebp), %eax
162 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
163 ; X32-NEXT: xorl %ecx, %ecx
164 ; X32-NEXT: mull %ecx
165 ; X32-NEXT: movl %eax, %ebx
166 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
167 ; X32-NEXT: addl %edi, %ebx
168 ; X32-NEXT: movl %edx, %ecx
169 ; X32-NEXT: adcl $0, %ecx
170 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
171 ; X32-NEXT: addl %esi, %ebx
172 ; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
173 ; X32-NEXT: adcl %edi, %ecx
174 ; X32-NEXT: setb %bl
175 ; X32-NEXT: addl %eax, %ecx
176 ; X32-NEXT: movzbl %bl, %ebx
177 ; X32-NEXT: adcl %edx, %ebx
178 ; X32-NEXT: movl 56(%ebp), %eax
96179 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
97180 ; X32-NEXT: xorl %edx, %edx
98181 ; X32-NEXT: mull %edx
99 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
100 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
101 ; X32-NEXT: addl %esi, %eax
102 ; X32-NEXT: adcl %edi, %edx
103 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
104 ; X32-NEXT: movl (%ecx), %eax
105 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
106 ; X32-NEXT: xorl %ecx, %ecx
107 ; X32-NEXT: mull %ecx
108 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
109 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
110 ; X32-NEXT: addl %esi, %eax
111 ; X32-NEXT: movl %edx, %eax
112 ; X32-NEXT: adcl %edi, %eax
113 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
114 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
115 ; X32-NEXT: addl %esi, %eax
116 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
117 ; X32-NEXT: adcl %edi, %eax
118 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
119 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
120 ; X32-NEXT: addl %esi, %eax
121 ; X32-NEXT: movl %esi, %ecx
122 ; X32-NEXT: adcl %edi, %ebx
123 ; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
124 ; X32-NEXT: movl %edi, %ebx
182 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
183 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
184 ; X32-NEXT: movl %esi, %ebp
185 ; X32-NEXT: addl %eax, %ebp
186 ; X32-NEXT: adcl %edx, %edi
187 ; X32-NEXT: addl %ecx, %ebp
188 ; X32-NEXT: movl %ebp, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
189 ; X32-NEXT: adcl %ebx, %edi
125190 ; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
126 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
127 ; X32-NEXT: addl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Spill
128 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
129 ; X32-NEXT: adcl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Spill
130 ; X32-NEXT: setb {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Folded Spill
131 ; X32-NEXT: movl 12(%ebp), %eax
132 ; X32-NEXT: movl 4(%eax), %eax
133 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
134 ; X32-NEXT: xorl %edx, %edx
135 ; X32-NEXT: mull %edx
136 ; X32-NEXT: movl %eax, %edi
137 ; X32-NEXT: addl %ebx, %edi
138 ; X32-NEXT: movl %edx, %esi
139 ; X32-NEXT: adcl $0, %esi
140 ; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
141 ; X32-NEXT: addl %ecx, %edi
142 ; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
143 ; X32-NEXT: adcl %ebx, %esi
144 ; X32-NEXT: setb %bh
145 ; X32-NEXT: addl %eax, %esi
146 ; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
147 ; X32-NEXT: movzbl %bh, %eax
148 ; X32-NEXT: adcl %edx, %eax
149 ; X32-NEXT: movl %eax, %edi
150 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
151 ; X32-NEXT: movl 12(%ebp), %eax
152 ; X32-NEXT: movl 8(%eax), %eax
153 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
154 ; X32-NEXT: xorl %ebx, %ebx
155 ; X32-NEXT: mull %ebx
156 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
157 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
158 ; X32-NEXT: addl %eax, %ecx
159 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
160 ; X32-NEXT: adcl %edx, %eax
161 ; X32-NEXT: addl %esi, %ecx
162 ; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
163 ; X32-NEXT: adcl %edi, %eax
164 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
165 ; X32-NEXT: movl 8(%ebp), %eax
166 ; X32-NEXT: movl 52(%eax), %eax
167 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
168 ; X32-NEXT: mull %ebx
169 ; X32-NEXT: movl %eax, %edi
170 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
171 ; X32-NEXT: addl %ecx, %edi
172 ; X32-NEXT: movl %edx, %esi
173 ; X32-NEXT: adcl $0, %esi
174 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
175 ; X32-NEXT: addl %ebx, %edi
176 ; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
177 ; X32-NEXT: adcl %ecx, %esi
178 ; X32-NEXT: movl %ecx, %edi
179 ; X32-NEXT: setb %cl
180 ; X32-NEXT: addl %eax, %esi
181 ; X32-NEXT: movzbl %cl, %eax
182 ; X32-NEXT: adcl %edx, %eax
183 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
184 ; X32-NEXT: movl 8(%ebp), %eax
185 ; X32-NEXT: movl 56(%eax), %eax
186 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
187 ; X32-NEXT: xorl %ecx, %ecx
188 ; X32-NEXT: mull %ecx
189 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
190 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
191 ; X32-NEXT: movl %ebx, %ecx
192 ; X32-NEXT: addl %eax, %ebx
193 ; X32-NEXT: adcl %edx, %edi
194 ; X32-NEXT: addl %esi, %ebx
195 ; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
196 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Folded Reload
197 ; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
198 ; X32-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Folded Reload
199 ; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
191 ; X32-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Folded Reload
192 ; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
200193 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
201194 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
202195 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
203 ; X32-NEXT: movl %ebx, %eax
196 ; X32-NEXT: movl %ebp, %eax
204197 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
205198 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
206199 ; X32-NEXT: movl %edi, %eax
209202 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
210203 ; X32-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Folded Reload
211204 ; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
212 ; X32-NEXT: movzbl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 1-byte Folded Reload
213 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
214 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
215 ; X32-NEXT: movl 8(%ebp), %eax
205 ; X32-NEXT: movzbl {{[-0-9]+}}(%e{{[sb]}}p), %ebp # 1-byte Folded Reload
206 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %ebp # 4-byte Folded Reload
207 ; X32-NEXT: movl %ebp, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
208 ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
216209 ; X32-NEXT: movl 40(%eax), %eax
217210 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
218211 ; X32-NEXT: xorl %ecx, %ecx
227220 ; X32-NEXT: adcl %ebx, %ecx
228221 ; X32-NEXT: addl %esi, %edi
229222 ; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
230 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Folded Reload
223 ; X32-NEXT: adcl %ebp, %ecx
231224 ; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
232225 ; X32-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
233226 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
234227 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
235228 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
236229 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
237 ; X32-NEXT: seto %al
238 ; X32-NEXT: lahf
239 ; X32-NEXT: movl %eax, %eax
240 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
230 ; X32-NEXT: setb {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Folded Spill
241231 ; X32-NEXT: movl %edi, %eax
242232 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
243233 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
244234 ; X32-NEXT: movl %ecx, %eax
245235 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
246236 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
247 ; X32-NEXT: movl 12(%ebp), %ecx
237 ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
248238 ; X32-NEXT: movl 16(%ecx), %eax
249239 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
250240 ; X32-NEXT: xorl %ebx, %ebx
251241 ; X32-NEXT: mull %ebx
252242 ; X32-NEXT: movl %eax, %edi
253 ; X32-NEXT: movl %edx, %esi
254 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
243 ; X32-NEXT: movl %edx, %ebp
255244 ; X32-NEXT: movl 20(%ecx), %eax
256245 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
257246 ; X32-NEXT: mull %ebx
258247 ; X32-NEXT: movl %eax, %ebx
259 ; X32-NEXT: addl %esi, %ebx
248 ; X32-NEXT: addl %ebp, %ebx
260249 ; X32-NEXT: movl %edx, %ecx
261250 ; X32-NEXT: adcl $0, %ecx
262251 ; X32-NEXT: addl %edi, %ebx
263252 ; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
264 ; X32-NEXT: adcl %esi, %ecx
253 ; X32-NEXT: adcl %ebp, %ecx
265254 ; X32-NEXT: setb %bl
266255 ; X32-NEXT: addl %eax, %ecx
267256 ; X32-NEXT: movzbl %bl, %esi
268257 ; X32-NEXT: adcl %edx, %esi
269 ; X32-NEXT: movl 12(%ebp), %eax
258 ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
270259 ; X32-NEXT: movl 24(%eax), %eax
271260 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
272261 ; X32-NEXT: xorl %edx, %edx
275264 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
276265 ; X32-NEXT: movl %edi, %ebx
277266 ; X32-NEXT: addl %eax, %ebx
278 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
267 ; X32-NEXT: movl %ebp, %eax
279268 ; X32-NEXT: adcl %edx, %eax
280269 ; X32-NEXT: addl %ecx, %ebx
281 ; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
282270 ; X32-NEXT: adcl %esi, %eax
283 ; X32-NEXT: movl %eax, %edx
284 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
285 ; X32-NEXT: movl %esi, %eax
271 ; X32-NEXT: movl %eax, %esi
272 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
273 ; X32-NEXT: movl %ecx, %eax
286274 ; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
287275 ; X32-NEXT: addl %edi, %eax
288276 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
289 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
290 ; X32-NEXT: adcl %ecx, %eax
291 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
292 ; X32-NEXT: movl %esi, %eax
277 ; X32-NEXT: movl %ebp, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
278 ; X32-NEXT: adcl %ebp, %eax
279 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
280 ; X32-NEXT: movl %ecx, %eax
293281 ; X32-NEXT: addl %edi, %eax
294282 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
295283 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
296 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
284 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
285 ; X32-NEXT: adcl %edx, %eax
286 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
287 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
288 ; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
289 ; X32-NEXT: adcl %ebx, %eax
290 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
291 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
297292 ; X32-NEXT: adcl %esi, %eax
293 ; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
294 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
295 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
296 ; X32-NEXT: movl %ecx, %eax
297 ; X32-NEXT: addl %edi, %eax
298 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
299 ; X32-NEXT: adcl %ebp, %eax
300 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
301 ; X32-NEXT: movl %ecx, %eax
302 ; X32-NEXT: addl %edi, %eax
303 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
304 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
305 ; X32-NEXT: adcl %edx, %eax
298306 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
299307 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
300308 ; X32-NEXT: adcl %ebx, %eax
301309 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
302310 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
303 ; X32-NEXT: adcl %edx, %eax
304 ; X32-NEXT: movl %edx, %ebx
305 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
306 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
307 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
308 ; X32-NEXT: movl %edx, %eax
309 ; X32-NEXT: addl %edi, %eax
310 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
311 ; X32-NEXT: adcl %ecx, %eax
312 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
313 ; X32-NEXT: movl %edx, %eax
314 ; X32-NEXT: addl %edi, %eax
315 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
316 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
317311 ; X32-NEXT: adcl %esi, %eax
318312 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
319 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
320 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
321 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
322 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
323 ; X32-NEXT: adcl %ebx, %eax
324 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
325 ; X32-NEXT: movl 8(%ebp), %eax
326 ; X32-NEXT: movl 20(%eax), %eax
313 ; X32-NEXT: movl {{[0-9]+}}(%esp), %edi
314 ; X32-NEXT: movl 20(%edi), %eax
327315 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
328316 ; X32-NEXT: xorl %ecx, %ecx
329317 ; X32-NEXT: mull %ecx
332320 ; X32-NEXT: addl %ebx, %esi
333321 ; X32-NEXT: movl %edx, %ecx
334322 ; X32-NEXT: adcl $0, %ecx
335 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
336 ; X32-NEXT: addl %edi, %esi
323 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebp # 4-byte Reload
324 ; X32-NEXT: addl %ebp, %esi
337325 ; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
338326 ; X32-NEXT: adcl %ebx, %ecx
339327 ; X32-NEXT: setb %bl
340328 ; X32-NEXT: addl %eax, %ecx
341329 ; X32-NEXT: movzbl %bl, %esi
342330 ; X32-NEXT: adcl %edx, %esi
343 ; X32-NEXT: movl 8(%ebp), %eax
344 ; X32-NEXT: movl 24(%eax), %eax
331 ; X32-NEXT: movl 24(%edi), %eax
345332 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
346333 ; X32-NEXT: xorl %edx, %edx
347334 ; X32-NEXT: mull %edx
348335 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
349336 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
350 ; X32-NEXT: movl %edi, %edx
337 ; X32-NEXT: movl %ebp, %edi
351338 ; X32-NEXT: addl %eax, %edi
352339 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
353 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Folded Reload
340 ; X32-NEXT: adcl %edx, %ebx
354341 ; X32-NEXT: addl %ecx, %edi
355342 ; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
356343 ; X32-NEXT: adcl %esi, %ebx
357344 ; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
358 ; X32-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
359 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
345 ; X32-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %ebp # 4-byte Folded Reload
346 ; X32-NEXT: movl %ebp, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
360347 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
361348 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
362349 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
366353 ; X32-NEXT: movl %ebx, %eax
 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl 8(%ebp), %eax
-; X32-NEXT: movl 4(%eax), %eax
+; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
+; X32-NEXT: movl 4(%ecx), %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: xorl %ecx, %ecx
 ; X32-NEXT: mull %ecx
 ; X32-NEXT: addl %ecx, %esi
 ; X32-NEXT: movl %edx, %edi
 ; X32-NEXT: adcl $0, %edi
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
-; X32-NEXT: addl %ebx, %esi
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebp # 4-byte Reload
+; X32-NEXT: addl %ebp, %esi
 ; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: adcl %ecx, %edi
 ; X32-NEXT: setb %cl
 ; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movzbl %cl, %eax
 ; X32-NEXT: adcl %edx, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl 8(%ebp), %eax
+; X32-NEXT: movl %eax, %ebx
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
 ; X32-NEXT: movl 8(%eax), %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: xorl %ecx, %ecx
 ; X32-NEXT: movl %eax, %ecx
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl %ebx, %esi
-; X32-NEXT: movl %ebx, %eax
+; X32-NEXT: movl %ebp, %esi
+; X32-NEXT: movl %ebp, %eax
 ; X32-NEXT: addl %ecx, %eax
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
-; X32-NEXT: movl %ebx, %ecx
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebp # 4-byte Reload
+; X32-NEXT: movl %ebp, %ecx
 ; X32-NEXT: adcl %edx, %ecx
 ; X32-NEXT: addl %edi, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Folded Reload
+; X32-NEXT: adcl %ebx, %ecx
 ; X32-NEXT: movl %esi, %edx
+; X32-NEXT: movl %esi, %ebx
 ; X32-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
-; X32-NEXT: movl %edi, %edx
-; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
-; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: pushl %eax
-; X32-NEXT: seto %al
-; X32-NEXT: lahf
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
+; X32-NEXT: movl %esi, %edx
+; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
+; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: setb {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Folded Spill
 ; X32-NEXT: movl %eax, %edx
-; X32-NEXT: popl %eax
-; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl %eax, %edx
+; X32-NEXT: movl %eax, %edi
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %ecx, %eax
+; X32-NEXT: movl %ecx, %edx
 ; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl %esi, %edx
-; X32-NEXT: movl %esi, %eax
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
-; X32-NEXT: addl %esi, %eax
-; X32-NEXT: movl %ebx, %eax
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
-; X32-NEXT: adcl %ebx, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl %edx, %eax
-; X32-NEXT: addl %esi, %eax
+; X32-NEXT: movl %ebx, %eax
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
+; X32-NEXT: addl %ecx, %eax
+; X32-NEXT: movl %ebp, %eax
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebp # 4-byte Reload
+; X32-NEXT: adcl %ebp, %eax
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: addl %ecx, %ebx
+; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl %esi, %eax
+; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %edi, %eax
 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: movl %edx, %eax
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
 ; X32-NEXT: adcl %edi, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl %ecx, %eax
-; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
+; X32-NEXT: movl %ebx, %eax
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
 ; X32-NEXT: addl %edx, %eax
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
 ; X32-NEXT: adcl %ecx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
-; X32-NEXT: movl %ecx, %eax
+; X32-NEXT: movl %ebx, %eax
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
 ; X32-NEXT: addl %edx, %eax
-; X32-NEXT: adcl %ebx, %esi
-; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: addl %edx, %ecx
-; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: adcl %ebp, %esi
+; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: addl %edx, %ebx
+; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
 ; X32-NEXT: adcl %edi, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
-; X32-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Folded Reload
-; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
+; X32-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Folded Reload
+; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movzbl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 1-byte Folded Reload
 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl 12(%ebp), %eax
+; X32-NEXT: movl %eax, %ebx
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
 ; X32-NEXT: movl 40(%eax), %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: xorl %ecx, %ecx
 ; X32-NEXT: mull %ecx
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
-; X32-NEXT: movl %ebx, %edi
-; X32-NEXT: addl %eax, %edi
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
-; X32-NEXT: adcl %edx, %ecx
-; X32-NEXT: addl %esi, %edi
-; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Folded Reload
-; X32-NEXT: movl %ecx, %edx
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: addl %ebx, %eax
+; X32-NEXT: movl %edx, %ecx
+; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
+; X32-NEXT: movl %esi, %edx
+; X32-NEXT: addl %eax, %edx
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebp # 4-byte Reload
+; X32-NEXT: adcl %ecx, %ebp
+; X32-NEXT: addl %edi, %edx
+; X32-NEXT: adcl %ebx, %ebp
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: addl %esi, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
 ; X32-NEXT: adcl %ecx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: adcl %edi, %eax
-; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: adcl %edx, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: addl %ebx, %eax
+; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: adcl %ebp, %eax
+; X32-NEXT: movl %ebp, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: addl %esi, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
 ; X32-NEXT: adcl %ecx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: seto %al
-; X32-NEXT: lahf
-; X32-NEXT: movl %eax, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: adcl %edi, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: setb {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Folded Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
 ; X32-NEXT: adcl %edx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl 12(%ebp), %esi
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: adcl %ebp, %eax
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[0-9]+}}(%esp), %esi
 ; X32-NEXT: movl 48(%esi), %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: xorl %ecx, %ecx
 ; X32-NEXT: mull %ecx
-; X32-NEXT: movl %eax, %ebx
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl %eax, %ebp
 ; X32-NEXT: movl %edx, %edi
 ; X32-NEXT: movl 52(%esi), %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: addl %edi, %esi
 ; X32-NEXT: movl %edx, %ecx
 ; X32-NEXT: adcl $0, %ecx
-; X32-NEXT: addl %ebx, %esi
+; X32-NEXT: addl %ebp, %esi
 ; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: adcl %edi, %ecx
+; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: setb %bl
 ; X32-NEXT: addl %eax, %ecx
 ; X32-NEXT: movzbl %bl, %esi
 ; X32-NEXT: adcl %edx, %esi
-; X32-NEXT: movl 12(%ebp), %eax
+; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
 ; X32-NEXT: movl 56(%eax), %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: xorl %edx, %edx
 ; X32-NEXT: mull %edx
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
+; X32-NEXT: movl %ebp, %ebx
 ; X32-NEXT: addl %eax, %ebx
-; X32-NEXT: movl %edi, %edx
-; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Folded Reload
+; X32-NEXT: adcl %edx, %edi
 ; X32-NEXT: addl %ecx, %ebx
 ; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: adcl %esi, %edi
 ; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
+; X32-NEXT: movl %edx, %eax
+; X32-NEXT: movl %ebp, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: addl %ebp, %eax
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl %edx, %eax
+; X32-NEXT: addl %ebp, %eax
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: adcl %ebx, %eax
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: adcl %edi, %eax
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[0-9]+}}(%esp), %edi
+; X32-NEXT: movl 64(%edi), %eax
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: xorl %ecx, %ecx
+; X32-NEXT: mull %ecx
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
+; X32-NEXT: movl %esi, %ecx
+; X32-NEXT: movl %eax, %ebx
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: addl %eax, %ecx
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
 ; X32-NEXT: movl %ecx, %eax
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
-; X32-NEXT: addl %esi, %eax
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: adcl %edx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl %ecx, %eax
-; X32-NEXT: addl %esi, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: adcl %ebx, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: adcl %edi, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl 8(%ebp), %eax
-; X32-NEXT: movl 64(%eax), %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: xorl %ecx, %ecx
-; X32-NEXT: mull %ecx
-; X32-NEXT: movl %edx, %esi
-; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
-; X32-NEXT: movl %edi, %ecx
-; X32-NEXT: movl %eax, %edx
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: addl %eax, %ecx
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
-; X32-NEXT: movl %ebx, %eax
-; X32-NEXT: adcl %esi, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
-; X32-NEXT: movl %esi, %eax
-; X32-NEXT: addl %edx, %eax
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
-; X32-NEXT: movl %ecx, %eax
-; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl 8(%ebp), %eax
-; X32-NEXT: movl 80(%eax), %eax
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebp # 4-byte Reload
+; X32-NEXT: movl %ebp, %eax
+; X32-NEXT: addl %ebx, %eax
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
+; X32-NEXT: movl %ebx, %eax
+; X32-NEXT: adcl %edx, %eax
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl 80(%edi), %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: xorl %edx, %edx
 ; X32-NEXT: mull %edx
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl %esi, %eax
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
-; X32-NEXT: addl %esi, %eax
-; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: adcl %edx, %ecx
-; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: addl %esi, %edi
+; X32-NEXT: movl %ebp, %edi
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: addl %eax, %edi
+; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: adcl %edx, %ebx
 ; X32-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl 12(%ebp), %ecx
+; X32-NEXT: addl %eax, %esi
+; X32-NEXT: adcl %edx, %ecx
+; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
 ; X32-NEXT: movl 80(%ecx), %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: xorl %ebx, %ebx
-; X32-NEXT: mull %ebx
+; X32-NEXT: xorl %edi, %edi
+; X32-NEXT: mull %edi
 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
 ; X32-NEXT: addl %esi, %eax
 ; X32-NEXT: movl %edx, %eax
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
-; X32-NEXT: adcl %edi, %eax
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
+; X32-NEXT: adcl %ebx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl 64(%ecx), %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: mull %ebx
+; X32-NEXT: mull %edi
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %eax, %ecx
 ; X32-NEXT: addl %esi, %ecx
 ; X32-NEXT: movl %edx, %esi
 ; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %edx, %ecx
-; X32-NEXT: adcl %edi, %ecx
+; X32-NEXT: adcl %ebx, %ecx
 ; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %eax, %ecx
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
 ; X32-NEXT: adcl %ecx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %edx, %eax
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
-; X32-NEXT: addl %edi, %eax
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
+; X32-NEXT: addl %esi, %eax
 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Folded Reload
 ; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %edx, %eax
-; X32-NEXT: addl %edi, %eax
+; X32-NEXT: addl %esi, %eax
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebp # 4-byte Reload
+; X32-NEXT: adcl %ebp, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
 ; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
+; X32-NEXT: adcl %ecx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
 ; X32-NEXT: adcl %edx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
+; X32-NEXT: movb {{[-0-9]+}}(%e{{[sb]}}p), %al # 1-byte Reload
+; X32-NEXT: addb $255, %al
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
+; X32-NEXT: movl %ebx, %eax
 ; X32-NEXT: adcl %ecx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
-; X32-NEXT: movl %ebx, %eax
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
-; X32-NEXT: pushl %eax
-; X32-NEXT: movl %esi, %eax
-; X32-NEXT: addb $127, %al
-; X32-NEXT: sahf
-; X32-NEXT: popl %eax
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
+; X32-NEXT: movl %edi, %eax
 ; X32-NEXT: adcl %edx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
-; X32-NEXT: movl %esi, %eax
-; X32-NEXT: adcl %ecx, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
 ; X32-NEXT: movl %ecx, %eax
-; X32-NEXT: addl %edi, %eax
+; X32-NEXT: addl %esi, %eax
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
 ; X32-NEXT: adcl %edx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %ecx, %eax
-; X32-NEXT: addl %edi, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
+; X32-NEXT: addl %esi, %eax
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: adcl %ebp, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %ebx, %eax
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
 ; X32-NEXT: adcl %ebx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl %esi, %eax
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
-; X32-NEXT: adcl %esi, %eax
+; X32-NEXT: movl %edi, %eax
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
+; X32-NEXT: adcl %edi, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
 ; X32-NEXT: movl %ecx, %eax
-; X32-NEXT: addl %edi, %eax
+; X32-NEXT: addl %esi, %eax
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
 ; X32-NEXT: adcl %edx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl %ecx, %eax
-; X32-NEXT: addl %edi, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
+; X32-NEXT: addl %esi, %eax
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: adcl %ebp, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
 ; X32-NEXT: adcl %ebx, %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
-; X32-NEXT: adcl %esi, %eax
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl 8(%ebp), %eax
+; X32-NEXT: adcl %edi, %eax
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
 ; X32-NEXT: movl 68(%eax), %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: xorl %ecx, %ecx
 ; X32-NEXT: mull %ecx
-; X32-NEXT: movl %eax, %esi
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
-; X32-NEXT: addl %edi, %esi
+; X32-NEXT: movl %eax, %edi
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebp # 4-byte Reload
+; X32-NEXT: addl %ebp, %edi
 ; X32-NEXT: movl %edx, %ecx
 ; X32-NEXT: adcl $0, %ecx
 ; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
-; X32-NEXT: addl %ebx, %esi
-; X32-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: adcl %edi, %ecx
+; X32-NEXT: addl %ebx, %edi
+; X32-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: adcl %ebp, %ecx
 ; X32-NEXT: setb {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Folded Spill
 ; X32-NEXT: addl %eax, %ecx
-; X32-NEXT: movzbl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 1-byte Folded Reload
-; X32-NEXT: adcl %edx, %edi
-; X32-NEXT: movl 8(%ebp), %eax
+; X32-NEXT: movzbl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 1-byte Folded Reload
+; X32-NEXT: adcl %edx, %esi
+; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
 ; X32-NEXT: movl 72(%eax), %eax
 ; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
 ; X32-NEXT: xorl %edx, %edx
 ; X32-NEXT: mull %edx
-; X32-NEXT: movl %eax, %esi
-; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
-; X32-NEXT: movl %ebx, %eax
-; X32-NEXT: addl %esi, %eax
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
-; X32-NEXT: adcl %edx, %ebx
-; X32-NEXT: addl %ecx, %eax
-; X32-NEXT: adcl %edi, %ebx
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
-; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl %edx, %edi
+; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl %ebx, %edx
+; X32-NEXT: addl %eax, %ebx
+; X32-NEXT: adcl %edi, %ebp
+; X32-NEXT: addl %ecx, %ebx
+; X32-NEXT: adcl %esi, %ebp
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: addl %edx, %eax
+; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
+; X32-NEXT: adcl %eax, %ecx
+; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
+; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
+; X32-NEXT: adcl %ebx, %ecx