llvm.org GIT mirror llvm / 34a9d4b
This patch implements medium code model support for 64-bit PowerPC.

The default for 64-bit PowerPC is small code model, in which TOC entries must be addressable using a 16-bit offset from the TOC pointer. Additionally, only TOC entries are addressed via the TOC pointer.

With medium code model, TOC entries and data sections can all be addressed via the TOC pointer using a 32-bit offset. Cooperation with the linker allows 16-bit offsets to be used when these are sufficient, reducing the number of extra instructions that need to be executed. Medium code model also does not generate explicit TOC entries in ".section .toc" for variables that are wholly internal to the compilation unit.

Consider a load of an external 4-byte integer. With small code model, the compiler generates:

    ld 3, .LC1@toc(2)
    lwz 4, 0(3)

    .section .toc,"aw",@progbits
    .LC1:
    .tc ei[TC],ei

With medium model, it instead generates:

    addis 3, 2, .LC1@toc@ha
    ld 3, .LC1@toc@l(3)
    lwz 4, 0(3)

    .section .toc,"aw",@progbits
    .LC1:
    .tc ei[TC],ei

Here .LC1@toc@ha is a relocation requesting the upper 16 bits of the 32-bit offset of ei's TOC entry from the TOC base pointer. Similarly, .LC1@toc@l is a relocation requesting the lower 16 bits. Note that if the linker determines that ei's TOC entry is within a 16-bit offset of the TOC base pointer, it will replace the "addis" with a "nop", and replace the "ld" with the identical "ld" instruction from the small code model example.

Consider next a load of a function-scope static integer.
For small code model, the compiler generates:

    ld 3, .LC1@toc(2)
    lwz 4, 0(3)

    .section .toc,"aw",@progbits
    .LC1:
    .tc test_fn_static.si[TC],test_fn_static.si

    .type test_fn_static.si,@object
    .local test_fn_static.si
    .comm test_fn_static.si,4,4

For medium code model, the compiler generates:

    addis 3, 2, test_fn_static.si@toc@ha
    addi 3, 3, test_fn_static.si@toc@l
    lwz 4, 0(3)

    .type test_fn_static.si,@object
    .local test_fn_static.si
    .comm test_fn_static.si,4,4

Again, the linker may replace the "addis" with a "nop", calculating only a 16-bit offset when this is sufficient.

Note that it would be more efficient for the compiler to generate:

    addis 3, 2, test_fn_static.si@toc@ha
    lwz 4, test_fn_static.si@toc@l(3)

The current patch does not perform this optimization yet. This will be addressed as a peephole optimization in a later patch.

For the moment, the default code model for 64-bit PowerPC will remain the small code model. We plan to eventually change the default to medium code model, which matches current upstream GCC behavior. Note that the different code models are ABI-compatible, so code compiled with different models will link and execute correctly.

I've tested the regression suite and the application/benchmark test suite in two ways: once with the patch as submitted here, and once with additional logic to force medium code model as the default. The tests all compile cleanly, with one exception. The mandel-2 application test fails due to an unrelated ABI incompatibility in passing complex numbers. It just so happens that small code model was incredibly lucky, in that temporary values in floating-point registers held the expected values needed by the external library routine that was called incorrectly. My current thought is to correct the ABI problems with _Complex before making medium code model the default, to avoid introducing this "regression."
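The @toc@ha / @toc@l split used in the sequences above can be sketched in a few lines of C++. This is only an illustration, not code from the patch: the names tocLo, tocHa and fitsIn16 are hypothetical. It assumes the usual PowerPC ELF convention that the low half is consumed as a sign-extended 16-bit immediate, so the "upper 16 bits" requested by @ha are adjusted upward by one whenever the low half is negative.

```cpp
#include <cassert>   // assert, for usage checks
#include <cstdint>

// Low half as the ld/addi instruction sees it: a 16-bit immediate that the
// hardware sign-extends before adding it to the base register.
constexpr std::int64_t tocLo(std::int64_t offset) {
  const std::int64_t low = static_cast<std::int64_t>(
      static_cast<std::uint64_t>(offset) & 0xFFFFu);
  return low >= 0x8000 ? low - 0x10000 : low;
}

// High-adjusted half for addis: chosen so that
// (tocHa(offset) << 16) + tocLo(offset) == offset.
constexpr std::int64_t tocHa(std::int64_t offset) {
  return (offset - tocLo(offset)) / 0x10000;
}

// True when a single 16-bit displacement reaches the target. In that case
// the high half is zero and the addis contributes nothing, which is the
// situation in which the linker can turn it into a nop.
constexpr bool fitsIn16(std::int64_t offset) {
  return offset >= -0x8000 && offset <= 0x7FFF;
}

// Compile-time sanity checks of the invariants stated above.
static_assert(tocHa(0x18000) * 0x10000 + tocLo(0x18000) == 0x18000,
              "halves must recombine to the original offset");
static_assert(fitsIn16(-4) && tocHa(-4) == 0,
              "small offsets need no high half");
```

For an offset of 0x18000 the low half is 0x8000, which sign-extends to -0x8000, so the high half becomes 2 rather than 1.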
Here are a few comments on how the patch works, since the selection code can be difficult to follow:

The existing logic for small code model defines three pseudo-instructions: LDtoc for most uses, LDtocJTI for jump table addresses, and LDtocCPT for constant pool addresses. These are expanded by SelectCodeCommon(). The pseudo-instruction approach doesn't work for medium code model, because we need to generate two instructions when we match the same pattern. Instead, new logic in PPCDAGToDAGISel::Select() intercepts the TOC_ENTRY node for medium code model, and generates an ADDIStocHA followed by either a LDtocL or an ADDItocL. These new node types correspond naturally to the sequences described above.

The addis/ld sequence is generated for the following cases:

 * Jump table addresses
 * Function addresses
 * External global variables
 * Tentative definitions of global variables (common linkage)

The addis/addi sequence is generated for the following cases:

 * Constant pool entries
 * File-scope static global variables
 * Function-scope static variables

Expanding to the two-instruction sequences at select time exposes the instructions to subsequent optimization, particularly scheduling.

The rest of the processing occurs at assembly time, in PPCAsmPrinter::EmitInstruction. Each of the instructions is converted to a "real" PowerPC instruction. When a TOC entry needs to be created, this is done here in the same manner as for the existing LDtoc, LDtocJTI, and LDtocCPT pseudo-instructions (I factored out a new routine to handle this).

I had originally thought that if a TOC entry was needed for LDtocL or ADDItocL, it would already have been generated for the previous ADDIStocHA. However, at higher optimization levels, the ADDIStocHA may appear in a different block, which may be assembled textually following the block containing the LDtocL or ADDItocL. So it is necessary to include the possibility of creating a new TOC entry for those two instructions.
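The two case lists above amount to a small decision table. The following standalone C++ sketch models it; the SymbolKind and Sequence enums and the selectSequence and needsTocEntry functions are hypothetical names introduced for illustration and do not appear in the patch, which makes the same decision by inspecting the SelectionDAG operand and the GlobalValue.

```cpp
#include <cassert>   // assert, for usage checks

// Kinds of address operand distinguished by the commit message.
enum class SymbolKind {
  JumpTable,
  Function,
  ExternalVariable,
  CommonVariable,       // tentative definition (common linkage)
  ConstantPool,
  FileScopeStatic,
  FunctionScopeStatic
};

// The two medium code model sequences.
enum class Sequence {
  AddisLd,    // addis + ld: load the address out of a TOC entry
  AddisAddi   // addis + addi: compute the address relative to the TOC base
};

constexpr Sequence selectSequence(SymbolKind kind) {
  switch (kind) {
  case SymbolKind::JumpTable:
  case SymbolKind::Function:
  case SymbolKind::ExternalVariable:
  case SymbolKind::CommonVariable:
    return Sequence::AddisLd;
  case SymbolKind::ConstantPool:
  case SymbolKind::FileScopeStatic:
  case SymbolKind::FunctionScopeStatic:
    return Sequence::AddisAddi;
  }
  return Sequence::AddisAddi;  // not reached for valid enumerators
}

// Only the addis/ld form goes through an explicit TOC entry.
constexpr bool needsTocEntry(SymbolKind kind) {
  return selectSequence(kind) == Sequence::AddisLd;
}
```

In this model, symbols whose final address is not known within the compilation unit go through a TOC entry, while data defined locally is addressed directly off the TOC base.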
Note that for LDtocL, we generate a new form of LD called LDrs. This allows specifying the @toc@l relocation for the offset field of the LD instruction (i.e., the offset is replaced by a SymbolLo relocation). When the peephole optimization described above is added, we will need to do similar things for all immediate-form load and store operations.

The seven "mcm-n.ll" test cases are kept separate because otherwise the intermingling of various TOC entries and so forth makes the tests fragile and hard to understand.

The above assumes use of an external assembler. For use of the integrated assembler, new relocations are added and used by PPCELFObjectWriter. Testing is done with "mcm-obj.ll", which tests for proper generation of the various relocations for the same sequences tested with the external assembler.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168708 91177308-0d34-0410-b5e6-96231b3b80d8

Bill Schmidt 7 years ago
18 changed file(s) with 645 addition(s) and 12 deletion(s).
@@ -173,9 +173,11 @@
     VK_PPC_DARWIN_HA16, // ha16(symbol)
     VK_PPC_DARWIN_LO16, // lo16(symbol)
     VK_PPC_GAS_HA16, // symbol@ha
-    VK_PPC_GAS_LO16, // symbol@l
+    VK_PPC_GAS_LO16, // symbol@l
     VK_PPC_TPREL16_HA, // symbol@tprel@ha
     VK_PPC_TPREL16_LO, // symbol@tprel@l
+    VK_PPC_TOC16_HA, // symbol@toc@ha
+    VK_PPC_TOC16_LO, // symbol@toc@l
 
     VK_Mips_GPREL,
     VK_Mips_GOT_CALL,
@@ -471,8 +471,11 @@
   R_PPC64_ADDR16_HIGHER = 39,
   R_PPC64_ADDR16_HIGHEST = 41,
   R_PPC64_TOC16 = 47,
+  R_PPC64_TOC16_LO = 48,
+  R_PPC64_TOC16_HA = 50,
   R_PPC64_TOC = 51,
-  R_PPC64_TOC16_DS = 63
+  R_PPC64_TOC16_DS = 63,
+  R_PPC64_TOC16_LO_DS = 64
 };
 
 // ARM Specific e_flags
@@ -208,6 +208,8 @@
   case VK_PPC_GAS_LO16: return "l";
   case VK_PPC_TPREL16_HA: return "tprel@ha";
   case VK_PPC_TPREL16_LO: return "tprel@l";
+  case VK_PPC_TOC16_HA: return "toc@ha";
+  case VK_PPC_TOC16_LO: return "toc@l";
   case VK_Mips_GPREL: return "GPREL";
   case VK_Mips_GOT_CALL: return "GOT_CALL";
   case VK_Mips_GOT16: return "GOT16";
@@ -81,6 +81,9 @@
       case MCSymbolRefExpr::VK_None:
         Type = ELF::R_PPC_ADDR16_HA;
         break;
+      case MCSymbolRefExpr::VK_PPC_TOC16_HA:
+        Type = ELF::R_PPC64_TOC16_HA;
+        break;
       }
       break;
     case PPC::fixup_ppc_lo16:
@@ -92,6 +95,9 @@
       case MCSymbolRefExpr::VK_None:
         Type = ELF::R_PPC_ADDR16_LO;
         break;
+      case MCSymbolRefExpr::VK_PPC_TOC16_LO:
+        Type = ELF::R_PPC64_TOC16_LO;
+        break;
       }
       break;
     case PPC::fixup_ppc_lo14:
@@ -104,7 +110,15 @@
       Type = ELF::R_PPC64_TOC16;
       break;
     case PPC::fixup_ppc_toc16_ds:
-      Type = ELF::R_PPC64_TOC16_DS;
+      switch (Modifier) {
+      default: llvm_unreachable("Unsupported Modifier");
+      case MCSymbolRefExpr::VK_PPC_TOC_ENTRY:
+        Type = ELF::R_PPC64_TOC16_DS;
+        break;
+      case MCSymbolRefExpr::VK_PPC_TOC16_LO:
+        Type = ELF::R_PPC64_TOC16_LO_DS;
+        break;
+      }
       break;
     case FK_Data_8:
       switch (Modifier) {
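To make the new fixup-to-relocation mapping easier to follow, here is a minimal standalone model of the cases this patch adds to the object writer. The Fixup and Modifier enums, the tocRelocFor function and the kNotCovered sentinel are hypothetical names for illustration only; the relocation numbers are the ones added to the ELF enum in this patch (48, 50, 63, 64), and every combination this sketch does not model simply returns the sentinel.

```cpp
#include <cassert>   // assert, for usage checks

// Simplified stand-ins for the fixup kinds and symbol modifiers involved.
enum class Fixup { Ha16, Lo16, Toc16Ds };
enum class Modifier { TocEntry, Toc16Ha, Toc16Lo };

// Relocation numbers as listed in the ELF enum above.
constexpr unsigned R_PPC64_TOC16_LO = 48;
constexpr unsigned R_PPC64_TOC16_HA = 50;
constexpr unsigned R_PPC64_TOC16_DS = 63;
constexpr unsigned R_PPC64_TOC16_LO_DS = 64;

// Sentinel for combinations this sketch does not model.
constexpr unsigned kNotCovered = 0;

constexpr unsigned tocRelocFor(Fixup fixup, Modifier modifier) {
  // addis with sym@toc@ha
  if (fixup == Fixup::Ha16 && modifier == Modifier::Toc16Ha)
    return R_PPC64_TOC16_HA;
  // addi with sym@toc@l
  if (fixup == Fixup::Lo16 && modifier == Modifier::Toc16Lo)
    return R_PPC64_TOC16_LO;
  // ld with sym@toc (small code model, DS-form displacement)
  if (fixup == Fixup::Toc16Ds && modifier == Modifier::TocEntry)
    return R_PPC64_TOC16_DS;
  // ld with sym@toc@l (medium code model, DS-form displacement)
  if (fixup == Fixup::Toc16Ds && modifier == Modifier::Toc16Lo)
    return R_PPC64_TOC16_LO_DS;
  return kNotCovered;
}
```

The hexadecimal r_type values checked by mcm-obj.ll correspond to these numbers: 0x32 is 50, 0x30 is 48 and 0x40 is 64.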
@@ -72,6 +72,7 @@
       return "PowerPC Assembly Printer";
     }
 
+    MCSymbol *lookUpOrCreateTOCEntry(MCSymbol *Sym);
 
     virtual void EmitInstruction(const MachineInstr *MI);
 
@@ -306,6 +307,25 @@
   printOperand(MI, OpNo, O);
   O << ")";
   return false;
+}
+
+
+/// lookUpOrCreateTOCEntry -- Given a symbol, look up whether a TOC entry
+/// exists for it. If not, create one. Then return a symbol that references
+/// the TOC entry.
+MCSymbol *PPCAsmPrinter::lookUpOrCreateTOCEntry(MCSymbol *Sym) {
+
+  MCSymbol *&TOCEntry = TOC[Sym];
+
+  // To avoid name clash check if the name already exists.
+  while (TOCEntry == 0) {
+    if (OutContext.LookupSymbol(Twine(MAI->getPrivateGlobalPrefix()) +
+                                "C" + Twine(TOCLabelID++)) == 0) {
+      TOCEntry = GetTempSymbol("C", TOCLabelID);
+    }
+  }
+
+  return TOCEntry;
 }
 
 
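The look-up-or-create idea behind lookUpOrCreateTOCEntry can be shown outside of LLVM with standard containers. The sketch below is a simplified model under stated assumptions, not the patch's implementation: the TocTable class and its members are hypothetical, labels are formed as ".LC" plus a counter (matching the labels shown in the commit message), and the set of already-used names stands in for the symbol lookup in the MC context.

```cpp
#include <cassert>   // assert, for usage checks
#include <map>
#include <set>
#include <string>
#include <utility>

class TocTable {
public:
  // `taken` holds label names that already exist and must not be reused.
  explicit TocTable(std::set<std::string> taken) : taken_(std::move(taken)) {}

  // Return the TOC label for `sym`, creating a fresh one on first use.
  const std::string &lookUpOrCreate(const std::string &sym) {
    std::string &label = entries_[sym];  // empty string means "no entry yet"
    while (label.empty()) {
      std::string candidate = ".LC" + std::to_string(nextId_++);
      // Skip candidates that would clash with an existing name.
      if (taken_.insert(candidate).second)
        label = candidate;
    }
    return label;
  }

private:
  std::map<std::string, std::string> entries_;  // symbol -> TOC label
  std::set<std::string> taken_;                 // names already in use
  unsigned nextId_ = 0;                         // next label number to try
};
```

Calling lookUpOrCreate twice with the same symbol returns the same label, which is what lets ADDIStocHA and the later LDtocL or ADDItocL agree on one TOC entry regardless of which is emitted first.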
@@ -378,14 +398,8 @@
       MOSymbol = GetCPISymbol(MO.getIndex());
     else if (MO.isJTI())
       MOSymbol = GetJTISymbol(MO.getIndex());
-    MCSymbol *&TOCEntry = TOC[MOSymbol];
-    // To avoid name clash check if the name already exists.
-    while (TOCEntry == 0) {
-      if (OutContext.LookupSymbol(Twine(MAI->getPrivateGlobalPrefix()) +
-                                  "C" + Twine(TOCLabelID++)) == 0) {
-        TOCEntry = GetTempSymbol("C", TOCLabelID);
-      }
-    }
+
+    MCSymbol *TOCEntry = lookUpOrCreateTOCEntry(MOSymbol);
 
     const MCExpr *Exp =
       MCSymbolRefExpr::Create(TOCEntry, MCSymbolRefExpr::VK_PPC_TOC_ENTRY,
@@ -395,6 +409,109 @@
     return;
   }
 
+  case PPC::ADDIStocHA: {
+    // Transform %Xd = ADDIStocHA %X2, <ga:@sym>
+    LowerPPCMachineInstrToMCInst(MI, TmpInst, *this, Subtarget.isDarwin());
+
+    // Change the opcode to ADDIS8. If the global address is external,
+    // has common linkage, is a function address, or is a jump table
+    // address, then generate a TOC entry and reference that. Otherwise
+    // reference the symbol directly.
+    TmpInst.setOpcode(PPC::ADDIS8);
+    const MachineOperand &MO = MI->getOperand(2);
+    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI()) &&
+           "Invalid operand for ADDIStocHA!");
+    MCSymbol *MOSymbol = 0;
+    bool IsExternal = false;
+    bool IsFunction = false;
+    bool IsCommon = false;
+
+    if (MO.isGlobal()) {
+      const GlobalValue *GValue = MO.getGlobal();
+      MOSymbol = Mang->getSymbol(GValue);
+      const GlobalVariable *GVar = dyn_cast<GlobalVariable>(GValue);
+      IsExternal = GVar && !GVar->hasInitializer();
+      IsCommon = GVar && GValue->hasCommonLinkage();
+      IsFunction = !GVar;
+    } else if (MO.isCPI())
+      MOSymbol = GetCPISymbol(MO.getIndex());
+    else if (MO.isJTI())
+      MOSymbol = GetJTISymbol(MO.getIndex());
+
+    if (IsExternal || IsFunction || IsCommon || MO.isJTI())
+      MOSymbol = lookUpOrCreateTOCEntry(MOSymbol);
+
+    const MCExpr *Exp =
+      MCSymbolRefExpr::Create(MOSymbol, MCSymbolRefExpr::VK_PPC_TOC16_HA,
+                              OutContext);
+    TmpInst.getOperand(2) = MCOperand::CreateExpr(Exp);
+    OutStreamer.EmitInstruction(TmpInst);
+    return;
+  }
+  case PPC::LDtocL: {
+    // Transform %Xd = LDtocL <ga:@sym>, %Xs
+    LowerPPCMachineInstrToMCInst(MI, TmpInst, *this, Subtarget.isDarwin());
+
+    // Change the opcode to LDrs, which is a form of LD with the offset
+    // specified by a SymbolLo. If the global address is external, has
+    // common linkage, or is a jump table address, then reference the
+    // associated TOC entry. Otherwise reference the symbol directly.
+    TmpInst.setOpcode(PPC::LDrs);
+    const MachineOperand &MO = MI->getOperand(1);
+    assert((MO.isGlobal() || MO.isJTI()) && "Invalid operand for LDtocL!");
+    MCSymbol *MOSymbol = 0;
+
+    if (MO.isJTI())
+      MOSymbol = lookUpOrCreateTOCEntry(GetJTISymbol(MO.getIndex()));
+    else {
+      const GlobalValue *GValue = MO.getGlobal();
+      MOSymbol = Mang->getSymbol(GValue);
+      const GlobalVariable *GVar = dyn_cast<GlobalVariable>(GValue);
+
+      if (!GVar || !GVar->hasInitializer() || GValue->hasCommonLinkage())
+        MOSymbol = lookUpOrCreateTOCEntry(MOSymbol);
+    }
+
+    const MCExpr *Exp =
+      MCSymbolRefExpr::Create(MOSymbol, MCSymbolRefExpr::VK_PPC_TOC16_LO,
+                              OutContext);
+    TmpInst.getOperand(1) = MCOperand::CreateExpr(Exp);
+    OutStreamer.EmitInstruction(TmpInst);
+    return;
+  }
+  case PPC::ADDItocL: {
+    // Transform %Xd = ADDItocL %Xs, <ga:@sym>
+    LowerPPCMachineInstrToMCInst(MI, TmpInst, *this, Subtarget.isDarwin());
+
+    // Change the opcode to ADDI8L. If the global address is external, then
+    // generate a TOC entry and reference that. Otherwise reference the
+    // symbol directly.
+    TmpInst.setOpcode(PPC::ADDI8L);
+    const MachineOperand &MO = MI->getOperand(2);
+    assert((MO.isGlobal() || MO.isCPI()) && "Invalid operand for ADDItocL");
+    MCSymbol *MOSymbol = 0;
+    bool IsExternal = false;
+    bool IsFunction = false;
+
+    if (MO.isGlobal()) {
+      const GlobalValue *GValue = MO.getGlobal();
+      MOSymbol = Mang->getSymbol(GValue);
+      const GlobalVariable *GVar = dyn_cast<GlobalVariable>(GValue);
+      IsExternal = GVar && !GVar->hasInitializer();
+      IsFunction = !GVar;
+    } else if (MO.isCPI())
+      MOSymbol = GetCPISymbol(MO.getIndex());
+
+    if (IsFunction || IsExternal)
+      MOSymbol = lookUpOrCreateTOCEntry(MOSymbol);
+
+    const MCExpr *Exp =
+      MCSymbolRefExpr::Create(MOSymbol, MCSymbolRefExpr::VK_PPC_TOC16_LO,
+                              OutContext);
+    TmpInst.getOperand(2) = MCOperand::CreateExpr(Exp);
+    OutStreamer.EmitInstruction(TmpInst);
+    return;
+  }
   case PPC::MFCRpseud:
   case PPC::MFCR8pseud:
     // Transform: %R3 = MFCRpseud %CR7
@@ -24,6 +24,7 @@
 #include "llvm/Constants.h"
 #include "llvm/Function.h"
 #include "llvm/GlobalValue.h"
+#include "llvm/GlobalVariable.h"
 #include "llvm/Intrinsics.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/MathExtras.h"
@@ -1267,6 +1268,48 @@
                                       Chain), 0);
     return CurDAG->SelectNodeTo(N, Reg, MVT::Other, Chain);
   }
+  case PPCISD::TOC_ENTRY: {
+    assert (PPCSubTarget.isPPC64() && "Only supported for 64-bit ABI");
+
+    // For medium code model, we generate two instructions as described
+    // below. Otherwise we allow SelectCodeCommon to handle this, selecting
+    // one of LDtoc, LDtocJTI, and LDtocCPT.
+    if (TM.getCodeModel() != CodeModel::Medium)
+      break;
+
+    // The first source operand is a TargetGlobalAddress or a
+    // TargetJumpTable. If it is an externally defined symbol, a symbol
+    // with common linkage, a function address, or a jump table address,
+    // we generate:
+    //   LDtocL(<ga:@sym>, ADDIStocHA(%X2, <ga:@sym>))
+    // Otherwise we generate:
+    //   ADDItocL(ADDIStocHA(%X2, <ga:@sym>), <ga:@sym>)
+    SDValue GA = N->getOperand(0);
+    SDValue TOCbase = N->getOperand(1);
+    SDNode *Tmp = CurDAG->getMachineNode(PPC::ADDIStocHA, dl, MVT::i64,
+                                         TOCbase, GA);
+
+    if (isa<JumpTableSDNode>(GA))
+      return CurDAG->getMachineNode(PPC::LDtocL, dl, MVT::i64, GA,
+                                    SDValue(Tmp, 0));
+
+    if (GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(GA)) {
+      const GlobalValue *GValue = G->getGlobal();
+      const GlobalVariable *GVar = dyn_cast<GlobalVariable>(GValue);
+      assert((GVar || isa<Function>(GValue)) &&
+             "Unexpected global value subclass!");
+
+      // An external variable is one without an initializer. For these,
+      // for variables with common linkage, and for Functions, generate
+      // the LDtocL form.
+      if (!GVar || !GVar->hasInitializer() || GValue->hasCommonLinkage())
+        return CurDAG->getMachineNode(PPC::LDtocL, dl, MVT::i64, GA,
+                                      SDValue(Tmp, 0));
+    }
+
+    return CurDAG->getMachineNode(PPC::ADDItocL, dl, MVT::i64,
+                                  SDValue(Tmp, 0), GA);
+  }
   }
 
   return SelectCode(N);
@@ -574,6 +574,9 @@
   case PPCISD::TC_RETURN: return "PPCISD::TC_RETURN";
   case PPCISD::CR6SET: return "PPCISD::CR6SET";
   case PPCISD::CR6UNSET: return "PPCISD::CR6UNSET";
+  case PPCISD::ADDIS_TOC_HA: return "PPCISD::ADDIS_TOC_HA";
+  case PPCISD::LD_TOC_L: return "PPCISD::LD_TOC_L";
+  case PPCISD::ADDI_TOC_L: return "PPCISD::ADDI_TOC_L";
   }
 }
 
@@ -190,7 +190,21 @@
       /// byte-swapping load instruction. It loads "Type" bits, byte swaps it,
       /// then puts it in the bottom bits of the GPRC. TYPE can be either i16
       /// or i32.
-      LBRX
+      LBRX,
+
+      /// G8RC = ADDIS_TOC_HA %X2, Symbol - For medium code model, produces
+      /// an ADDIS8 instruction that adds the TOC base register to sym@toc@ha.
+      ADDIS_TOC_HA,
+
+      /// G8RC = LD_TOC_L Symbol, G8RReg - For medium code model, produces a
+      /// LD instruction with base register G8RReg and offset sym@toc@l.
+      /// Preceded by an ADDIS_TOC_HA to form a full 32-bit offset.
+      LD_TOC_L,
+
+      /// G8RC = ADDI_TOC_L G8RReg, Symbol - For medium code model, produces
+      /// an ADDI8 instruction that adds G8RReg to sym@toc@l.
+      /// Preceded by an ADDIS_TOC_HA to form a full 32-bit offset.
+      ADDI_TOC_L
     };
   }
 
@@ -30,6 +30,11 @@
 }
 def tocentry : Operand<iPTR> {
   let MIOperandInfo = (ops i32imm:$imm);
+}
+def memrs : Operand<iPTR> { // memri where the immediate is a symbolLo64
+  let PrintMethod = "printMemRegImm";
+  let EncoderMethod = "getMemRIXEncoding";
+  let MIOperandInfo = (ops symbolLo64:$off, ptr_rc:$reg);
 }
 
 //===----------------------------------------------------------------------===//
@@ -624,6 +629,12 @@
 def LD : DSForm_1<58, 0, (outs G8RC:$rD), (ins memrix:$src),
                   "ld $rD, $src", LdStLD,
                   [(set G8RC:$rD, (load ixaddr:$src))]>, isPPC64;
+def LDrs : DSForm_1<58, 0, (outs G8RC:$rD), (ins memrs:$src),
+                    "ld $rD, $src", LdStLD,
+                    []>, isPPC64;
+// The following three definitions are selected for small code model only.
+// Otherwise, we need to create two instructions to form a 32-bit offset,
+// so we have a custom matcher for TOC_ENTRY in PPCDAGToDAGIsel::Select().
 def LDtoc: Pseudo<(outs G8RC:$rD), (ins tocentry:$disp, G8RC:$reg),
                   "#LDtoc",
                   [(set G8RC:$rD,
@@ -669,6 +680,21 @@
           (LD ixaddr:$src)>;
 def : Pat<(PPCload xaddr:$src),
           (LDX xaddr:$src)>;
+
+// Support for medium code model.
+def ADDIStocHA: Pseudo<(outs G8RC:$rD), (ins G8RC:$reg, tocentry:$disp),
+                       "#ADDIStocHA",
+                       [(set G8RC:$rD,
+                         (PPCaddisTocHA G8RC:$reg, tglobaladdr:$disp))]>,
+                       isPPC64;
+def LDtocL: Pseudo<(outs G8RC:$rD), (ins tocentry:$disp, G8RC:$reg),
+                   "#LDtocL",
+                   [(set G8RC:$rD,
+                     (PPCldTocL tglobaladdr:$disp, G8RC:$reg))]>, isPPC64;
+def ADDItocL: Pseudo<(outs G8RC:$rD), (ins G8RC:$reg, tocentry:$disp),
+                     "#ADDItocL",
+                     [(set G8RC:$rD,
+                       (PPCaddiTocL G8RC:$reg, tglobaladdr:$disp))]>, isPPC64;
 
 let PPC970_Unit = 2 in {
 // Truncating stores.
@@ -165,6 +165,12 @@
                            [SDNPHasChain, SDNPMayLoad]>;
 def PPCstcx : SDNode<"PPCISD::STCX", SDT_PPCstcx,
                      [SDNPHasChain, SDNPMayStore]>;
+
+// Instructions to support medium code model
+def PPCaddisTocHA : SDNode<"PPCISD::ADDIS_TOC_HA", SDTIntBinOp, []>;
+def PPCldTocL : SDNode<"PPCISD::LD_TOC_L", SDTIntBinOp, [SDNPMayLoad]>;
+def PPCaddiTocL : SDNode<"PPCISD::ADDI_TOC_L", SDTIntBinOp, []>;
+
 
 // Instructions to support dynamic alloca.
 def SDTDynOp : SDTypeProfile<1, 2, []>;
@@ -0,0 +1,26 @@
+; RUN: llc -mcpu=pwr7 -O0 -code-model=medium <%s | FileCheck %s
+
+; Test correct code generation for medium code model (32-bit TOC offsets)
+; for loading and storing an external variable.
+
+target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64"
+target triple = "powerpc64-unknown-linux-gnu"
+
+@ei = external global i32
+
+define signext i32 @test_external() nounwind {
+entry:
+  %0 = load i32* @ei, align 4
+  %inc = add nsw i32 %0, 1
+  store i32 %inc, i32* @ei, align 4
+  ret i32 %0
+}
+
+; CHECK: test_external:
+; CHECK: addis [[REG1:[0-9]+]], 2, .LC[[TOCNUM:[0-9]+]]@toc@ha
+; CHECK: ld [[REG2:[0-9]+]], .LC[[TOCNUM]]@toc@l([[REG1]])
+; CHECK: lwz {{[0-9]+}}, 0([[REG2]])
+; CHECK: stw {{[0-9]+}}, 0([[REG2]])
+; CHECK: .section .toc
+; CHECK: .LC[[TOCNUM]]:
+; CHECK: .tc {{[a-z0-9A-Z_.]+}}[TC],{{[a-z0-9A-Z_.]+}}
@@ -0,0 +1,26 @@
+; RUN: llc -mcpu=pwr7 -O0 -code-model=medium <%s | FileCheck %s
+
+; Test correct code generation for medium code model (32-bit TOC offsets)
+; for loading and storing a static variable scoped to a function.
+
+target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64"
+target triple = "powerpc64-unknown-linux-gnu"
+
+@test_fn_static.si = internal global i32 0, align 4
+
+define signext i32 @test_fn_static() nounwind {
+entry:
+  %0 = load i32* @test_fn_static.si, align 4
+  %inc = add nsw i32 %0, 1
+  store i32 %inc, i32* @test_fn_static.si, align 4
+  ret i32 %0
+}
+
+; CHECK: test_fn_static:
+; CHECK: addis [[REG1:[0-9]+]], 2, [[VAR:[a-z0-9A-Z_.]+]]@toc@ha
+; CHECK: addi [[REG2:[0-9]+]], [[REG1]], [[VAR]]@toc@l
+; CHECK: lwz {{[0-9]+}}, 0([[REG2]])
+; CHECK: stw {{[0-9]+}}, 0([[REG2]])
+; CHECK: .type [[VAR]],@object
+; CHECK: .local [[VAR]]
+; CHECK: .comm [[VAR]],4,4
@@ -0,0 +1,28 @@
+; RUN: llc -mcpu=pwr7 -O0 -code-model=medium <%s | FileCheck %s
+
+; Test correct code generation for medium code model (32-bit TOC offsets)
+; for loading and storing a file-scope static variable.
+
+target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64"
+target triple = "powerpc64-unknown-linux-gnu"
+
+@gi = global i32 5, align 4
+
+define signext i32 @test_file_static() nounwind {
+entry:
+  %0 = load i32* @gi, align 4
+  %inc = add nsw i32 %0, 1
+  store i32 %inc, i32* @gi, align 4
+  ret i32 %0
+}
+
+; CHECK: test_file_static:
+; CHECK: addis [[REG1:[0-9]+]], 2, [[VAR:[a-z0-9A-Z_.]+]]@toc@ha
+; CHECK: addi [[REG2:[0-9]+]], [[REG1]], [[VAR]]@toc@l
+; CHECK: lwz {{[0-9]+}}, 0([[REG2]])
+; CHECK: stw {{[0-9]+}}, 0([[REG2]])
+; CHECK: .type [[VAR]],@object
+; CHECK: .data
+; CHECK: .globl [[VAR]]
+; CHECK: [[VAR]]:
+; CHECK: .long 5
@@ -0,0 +1,19 @@
+; RUN: llc -mcpu=pwr7 -O0 -code-model=medium <%s | FileCheck %s
+
+; Test correct code generation for medium code model (32-bit TOC offsets)
+; for loading a value from the constant pool (TOC-relative).
+
+target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64"
+target triple = "powerpc64-unknown-linux-gnu"
+
+define double @test_double_const() nounwind {
+entry:
+  ret double 0x3F4FD4920B498CF0
+}
+
+; CHECK: [[VAR:[a-z0-9A-Z_.]+]]:
+; CHECK: .quad 4562098671269285104
+; CHECK: test_double_const:
+; CHECK: addis [[REG1:[0-9]+]], 2, [[VAR]]@toc@ha
+; CHECK: addi [[REG2:[0-9]+]], [[REG1]], [[VAR]]@toc@l
+; CHECK: lfd {{[0-9]+}}, 0([[REG2]])
@@ -0,0 +1,59 @@
+; RUN: llc -mcpu=pwr7 -O0 -code-model=medium <%s | FileCheck %s
+
+; Test correct code generation for medium code model (32-bit TOC offsets)
+; for loading the address of a jump table from the TOC.
+
+target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64"
+target triple = "powerpc64-unknown-linux-gnu"
+
+define signext i32 @test_jump_table(i32 signext %i) nounwind {
+entry:
+  %i.addr = alloca i32, align 4
+  store i32 %i, i32* %i.addr, align 4
+  %0 = load i32* %i.addr, align 4
+  switch i32 %0, label %sw.default [
+    i32 3, label %sw.bb
+    i32 4, label %sw.bb1
+    i32 5, label %sw.bb2
+    i32 6, label %sw.bb3
+  ]
+
+sw.default: ; preds = %entry
+  br label %sw.epilog
+
+sw.bb: ; preds = %entry
+  %1 = load i32* %i.addr, align 4
+  %mul = mul nsw i32 %1, 7
+  store i32 %mul, i32* %i.addr, align 4
+  br label %sw.bb1
+
+sw.bb1: ; preds = %entry, %sw.bb
+  %2 = load i32* %i.addr, align 4
+  %dec = add nsw i32 %2, -1
+  store i32 %dec, i32* %i.addr, align 4
+  br label %sw.bb2
+
+sw.bb2: ; preds = %entry, %sw.bb1
+  %3 = load i32* %i.addr, align 4
+  %add = add nsw i32 %3, 3
+  store i32 %add, i32* %i.addr, align 4
+  br label %sw.bb3
+
+sw.bb3: ; preds = %entry, %sw.bb2
+  %4 = load i32* %i.addr, align 4
+  %shl = shl i32 %4, 1
+  store i32 %shl, i32* %i.addr, align 4
+  br label %sw.epilog
+
+sw.epilog: ; preds = %sw.bb3, %sw.default
+  %5 = load i32* %i.addr, align 4
+  ret i32 %5
+}
+
+; CHECK: test_jump_table:
+; CHECK: addis [[REG1:[0-9]+]], 2, .LC[[TOCNUM:[0-9]+]]@toc@ha
+; CHECK: ld [[REG2:[0-9]+]], .LC[[TOCNUM]]@toc@l([[REG1]])
+; CHECK: ldx {{[0-9]+}}, {{[0-9]+}}, [[REG2]]
+; CHECK: .section .toc
+; CHECK: .LC[[TOCNUM]]:
+; CHECK: .tc {{[a-z0-9A-Z_.]+}}[TC],{{[a-z0-9A-Z_.]+}}
@@ -0,0 +1,27 @@
+; RUN: llc -mcpu=pwr7 -O0 -code-model=medium < %s | FileCheck %s
+
+; Test correct code generation for medium code model (32-bit TOC offsets)
+; for loading and storing a tentatively defined variable.
+
+target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64"
+target triple = "powerpc64-unknown-linux-gnu"
+
+@ti = common global i32 0, align 4
+
+define signext i32 @test_tentative() nounwind {
+entry:
+  %0 = load i32* @ti, align 4
+  %inc = add nsw i32 %0, 1
+  store i32 %inc, i32* @ti, align 4
+  ret i32 %0
+}
+
+; CHECK: test_tentative:
+; CHECK: addis [[REG1:[0-9]+]], 2, .LC[[TOCNUM:[0-9]+]]@toc@ha
+; CHECK: ld [[REG2:[0-9]+]], .LC[[TOCNUM]]@toc@l([[REG1]])
+; CHECK: lwz {{[0-9]+}}, 0([[REG2]])
+; CHECK: stw {{[0-9]+}}, 0([[REG2]])
+; CHECK: .section .toc
+; CHECK: .LC[[TOCNUM]]:
+; CHECK: .tc [[VAR:[a-z0-9A-Z_.]+]][TC],{{[a-z0-9A-Z_.]+}}
+; CHECK: .comm [[VAR]],4,4
@@ -0,0 +1,25 @@
+; RUN: llc -mcpu=pwr7 -O0 -code-model=medium < %s | FileCheck %s
+
+; Test correct code generation for medium code model (32-bit TOC offsets)
+; for loading a function address.
+
+target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64"
+target triple = "powerpc64-unknown-linux-gnu"
+
+define i8* @test_fnaddr() nounwind {
+entry:
+  %func = alloca i32 (i32)*, align 8
+  store i32 (i32)* @foo, i32 (i32)** %func, align 8
+  %0 = load i32 (i32)** %func, align 8
+  %1 = bitcast i32 (i32)* %0 to i8*
+  ret i8* %1
+}
+
+declare signext i32 @foo(i32 signext)
+
+; CHECK: test_fnaddr:
+; CHECK: addis [[REG1:[0-9]+]], 2, .LC[[TOCNUM:[0-9]+]]@toc@ha
+; CHECK: ld [[REG2:[0-9]+]], .LC[[TOCNUM]]@toc@l([[REG1]])
+; CHECK: .section .toc
+; CHECK: .LC[[TOCNUM]]:
+; CHECK: .tc {{[a-z0-9A-Z_.]+}}[TC],{{[a-z0-9A-Z_.]+}}
@@ -0,0 +1,193 @@
+; RUN: llc -O0 -mcpu=pwr7 -code-model=medium -filetype=obj %s -o - | \
+; RUN: elf-dump --dump-section-data | FileCheck %s
+
+; FIXME: When asm-parse is available, could make this an assembly test.
+
+target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64"
+target triple = "powerpc64-unknown-linux-gnu"
+
+@ei = external global i32
+
+define signext i32 @test_external() nounwind {
+entry:
+  %0 = load i32* @ei, align 4
+  %inc = add nsw i32 %0, 1
+  store i32 %inc, i32* @ei, align 4
+  ret i32 %0
+}
+
+; Verify generation of R_PPC64_TOC16_HA and R_PPC64_TOC16_LO_DS for
+; accessing external variable ei.
+;
+; CHECK: '.rela.text'
+; CHECK: Relocation 0
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM1:[0-9]+]]
+; CHECK-NEXT: 'r_type', 0x00000032
+; CHECK: Relocation 1
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM1]]
+; CHECK-NEXT: 'r_type', 0x00000040
+
+@test_fn_static.si = internal global i32 0, align 4
+
+define signext i32 @test_fn_static() nounwind {
+entry:
+  %0 = load i32* @test_fn_static.si, align 4
+  %inc = add nsw i32 %0, 1
+  store i32 %inc, i32* @test_fn_static.si, align 4
+  ret i32 %0
+}
+
+; Verify generation of R_PPC64_TOC16_HA and R_PPC64_TOC16_LO for
+; accessing function-scoped variable si.
+;
+; CHECK: Relocation 2
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM2:[0-9]+]]
+; CHECK-NEXT: 'r_type', 0x00000032
+; CHECK: Relocation 3
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM2]]
+; CHECK-NEXT: 'r_type', 0x00000030
+
+@gi = global i32 5, align 4
+
+define signext i32 @test_file_static() nounwind {
+entry:
+  %0 = load i32* @gi, align 4
+  %inc = add nsw i32 %0, 1
+  store i32 %inc, i32* @gi, align 4
+  ret i32 %0
+}
+
+; Verify generation of R_PPC64_TOC16_HA and R_PPC64_TOC16_LO for
+; accessing file-scope variable gi.
+;
+; CHECK: Relocation 4
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM3:[0-9]+]]
+; CHECK-NEXT: 'r_type', 0x00000032
+; CHECK: Relocation 5
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM3]]
+; CHECK-NEXT: 'r_type', 0x00000030
+
+define double @test_double_const() nounwind {
+entry:
+  ret double 0x3F4FD4920B498CF0
+}
+
+; Verify generation of R_PPC64_TOC16_HA and R_PPC64_TOC16_LO for
+; accessing a constant.
+;
+; CHECK: Relocation 6
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM4:[0-9]+]]
+; CHECK-NEXT: 'r_type', 0x00000032
+; CHECK: Relocation 7
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM4]]
+; CHECK-NEXT: 'r_type', 0x00000030
+
+define signext i32 @test_jump_table(i32 signext %i) nounwind {
+entry:
+  %i.addr = alloca i32, align 4
+  store i32 %i, i32* %i.addr, align 4
+  %0 = load i32* %i.addr, align 4
+  switch i32 %0, label %sw.default [
+    i32 3, label %sw.bb
+    i32 4, label %sw.bb1
+    i32 5, label %sw.bb2
+    i32 6, label %sw.bb3
+  ]
+
+sw.default: ; preds = %entry
+  br label %sw.epilog
+
+sw.bb: ; preds = %entry
+  %1 = load i32* %i.addr, align 4
+  %mul = mul nsw i32 %1, 7
+  store i32 %mul, i32* %i.addr, align 4
+  br label %sw.bb1
+
+sw.bb1: ; preds = %entry, %sw.bb
+  %2 = load i32* %i.addr, align 4
+  %dec = add nsw i32 %2, -1
+  store i32 %dec, i32* %i.addr, align 4
+  br label %sw.bb2
+
+sw.bb2: ; preds = %entry, %sw.bb1
+  %3 = load i32* %i.addr, align 4
+  %add = add nsw i32 %3, 3
+  store i32 %add, i32* %i.addr, align 4
+  br label %sw.bb3
+
+sw.bb3: ; preds = %entry, %sw.bb2
+  %4 = load i32* %i.addr, align 4
+  %shl = shl i32 %4, 1
+  store i32 %shl, i32* %i.addr, align 4
+  br label %sw.epilog
+
+sw.epilog: ; preds = %sw.bb3, %sw.default
+  %5 = load i32* %i.addr, align 4
+  ret i32 %5
+}
+
+; Verify generation of R_PPC64_TOC16_HA and R_PPC64_TOC16_LO_DS for
+; accessing a jump table address.
+;
+; CHECK: Relocation 8
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM5:[0-9]+]]
+; CHECK-NEXT: 'r_type', 0x00000032
+; CHECK: Relocation 9
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM5]]
+; CHECK-NEXT: 'r_type', 0x00000040
+
+@ti = common global i32 0, align 4
+
+define signext i32 @test_tentative() nounwind {
+entry:
+  %0 = load i32* @ti, align 4
+  %inc = add nsw i32 %0, 1
+  store i32 %inc, i32* @ti, align 4
+  ret i32 %0
+}
+
+; Verify generation of R_PPC64_TOC16_HA and R_PPC64_TOC16_LO_DS for
+; accessing tentatively declared variable ti.
+;
+; CHECK: Relocation 10
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM6:[0-9]+]]
+; CHECK-NEXT: 'r_type', 0x00000032
+; CHECK: Relocation 11
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM6]]
+; CHECK-NEXT: 'r_type', 0x00000040
+
+define i8* @test_fnaddr() nounwind {
+entry:
+  %func = alloca i32 (i32)*, align 8
+  store i32 (i32)* @foo, i32 (i32)** %func, align 8
+  %0 = load i32 (i32)** %func, align 8
+  %1 = bitcast i32 (i32)* %0 to i8*
+  ret i8* %1
+}
+
+declare signext i32 @foo(i32 signext)
+
+; Verify generation of R_PPC64_TOC16_HA and R_PPC64_TOC16_LO_DS for
+; accessing function address foo.
+;
+; CHECK: Relocation 12
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM7:[0-9]+]]
+; CHECK-NEXT: 'r_type', 0x00000032
+; CHECK: Relocation 13
+; CHECK-NEXT: 'r_offset'
+; CHECK-NEXT: 'r_sym', 0x[[SYM7]]
+; CHECK-NEXT: 'r_type', 0x00000040
+