llvm.org GIT mirror llvm / 7897538
Make this document *substantially* better and cover a lot more territory. Document written by Mason Woo (http://www.woo.com)! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@59066 91177308-0d34-0410-b5e6-96231b3b80d8 Chris Lattner 10 years ago
1 changed file(s) with 2009 addition(s) and 240 deletion(s).
Writing an LLVM <del>b</del>ackend
Writing an LLVM <ins>Compiler B</ins>ackend

<del>Writing an LLVM backend</del>

Writing an LLVM Compiler Backend

  • Introduction
      • Audience
      • Prerequisite Reading
      • Basic Steps
      • Preliminaries
  • <del>Writing a backend</del>
      • <del>Machine backends</del>
          • <del>Outline</del>
          • <del>Implementation details</del>
      • <del>Language backends</del>
  • <del>Related reading material</del>
  • Target Machine
  • Register Set and Register Classes
      • Defining a Register
      • Defining a Register Class
      • Implement a subclass of TargetRegisterInfo
  • Instruction Set
      • Implement a subclass of TargetInstrInfo
      • Branch Folding and If Conversion
  • Instruction Selector
      • The SelectionDAG Legalize Phase
          • Promote
          • Expand
          • Custom
          • Legal
      • Calling Conventions
  • Assembly Printer
  • Subtarget Support
  • JIT Support
      • Machine Code Emitter
      • Target JIT Info

    <del>Written by Misha Brukman</del>

    Written by Mason Woo and Misha Brukman

    <del>This document describes techniques for writing backends for LLVM which
    convert the LLVM representation to machine assembly code or other
    languages.</del>

    This document describes techniques for writing compiler backends that
    convert the LLVM IR (intermediate representation) to code for a specified
    machine or other languages. Code intended for a specific machine can take
    the form of either assembly code or binary code (usable for a JIT compiler).

    The backend of LLVM features a target-independent code generator that may
    create output for several types of target CPUs, including X86, PowerPC,
    Alpha, and SPARC. The backend may also be used to generate code targeted at
    SPUs of the Cell processor or GPUs to support the execution of compute
    kernels.

    The document focuses on existing examples found in subdirectories of
    llvm/lib/Target in a downloaded LLVM release. In particular, this document
    focuses on the example of creating a static compiler (one that emits text
    assembly) for a SPARC target, because SPARC has fairly standard
    characteristics, such as a RISC instruction set and straightforward calling
    conventions.

    Audience

    The audience for this document is anyone who needs to write an LLVM backend
    to generate code for a specific hardware or software target.

    Prerequisite Reading

    These essential documents must be read before reading this document:

  • LLVM Language Reference Manual - a reference manual for the LLVM assembly
    language.
  • The LLVM Target-Independent Code Generator - a guide to the components
    (classes and code generation algorithms) for translating the LLVM internal
    representation to the machine code for a specified target. Pay particular
    attention to the descriptions of code generation stages: Instruction
    Selection, Scheduling and Formation, SSA-based Optimization, Register
    Allocation, Prolog/Epilog Code Insertion, Late Machine Code Optimizations,
    and Code Emission.
  • TableGen Fundamentals - a document that describes the TableGen (tblgen)
    application that manages domain-specific information to support LLVM code
    generation. TableGen processes input from a target description file (.td
    suffix) and generates C++ code that can be used for code generation.
  • Writing an LLVM Pass - The assembly printer is a FunctionPass, as are
    several SelectionDAG processing steps.

    To follow the SPARC examples in this document, have a copy of The SPARC
    Architecture Manual, Version 8 for reference. For details about the ARM
    instruction set, refer to the ARM Architecture Reference Manual. For more
    about the GNU Assembler format (GAS), see Using As, especially for the
    assembly printer. Using As contains lists of target machine dependent
    features.

    Basic Steps

    To write a compiler backend for LLVM that converts the LLVM IR
    (intermediate representation) to code for a specified target (machine or
    other language), follow these steps:

  • Create a subclass of the TargetMachine class that describes
    characteristics of your target machine. Copy existing examples of specific
    TargetMachine class and header files; for example, start with
    SparcTargetMachine.cpp and SparcTargetMachine.h, but change the file names
    for your target. Similarly, change code that references "Sparc" to
    reference your target.
  • Describe the register set of the target. Use TableGen to generate code for
    register definition, register aliases, and register classes from a
    target-specific RegisterInfo.td input file. You should also write
    additional code for a subclass of the TargetRegisterInfo class that
    represents the class register file data used for register allocation and
    also describes the interactions between registers.
  • Describe the instruction set of the target. Use TableGen to generate code
    for target-specific instructions from target-specific versions of
    TargetInstrFormats.td and TargetInstrInfo.td. You should write additional
    code for a subclass of the TargetInstrInfo class to represent machine
    instructions supported by the target machine.
  • Describe the selection and conversion of the LLVM IR from a DAG (directed
    acyclic graph) representation of instructions to native target-specific
    instructions. Use TableGen to generate code that matches patterns and
    selects instructions based on additional information in a target-specific
    version of TargetInstrInfo.td. Write code for XXXISelDAGToDAG.cpp (where
    XXX identifies the specific target) to perform pattern matching and
    DAG-to-DAG instruction selection. Also write code in XXXISelLowering.cpp
    to replace or remove operations and data types that are not supported
    natively in a SelectionDAG.
  • Write code for an assembly printer that converts LLVM IR to a GAS format
    for your target machine. You should add assembly strings to the
    instructions defined in your target-specific version of
    TargetInstrInfo.td. You should also write code for a subclass of
    AsmPrinter that performs the LLVM-to-assembly conversion and a trivial
    subclass of TargetAsmInfo.
  • Optionally, add support for subtargets (that is, variants with different
    capabilities). You should also write code for a subclass of the
    TargetSubtarget class, which allows you to use the -mcpu= and -mattr=
    command-line options.
  • Optionally, add JIT support and create a machine code emitter (subclass of
    TargetJITInfo) that is used to emit binary code directly into memory.

    In the .cpp and .h files, initially stub up these methods and then
    implement them later. Initially, you may not know which private members the
    class will need and which components will need to be subclassed.

    Preliminaries

    To actually create your compiler backend, you need to create and modify a
    few files. The absolute minimum is discussed here, but to actually use the
    LLVM target-independent code generator, you must perform the steps
    described in the LLVM Target-Independent Code Generator document
    (http://www.llvm.org/docs/CodeGenerator.html).

    First, you should create a subdirectory under lib/Target to hold all the
    files related to your target. If your target is called "Dummy", create the
    directory lib/Target/Dummy.

    In this new directory, create a Makefile. It is easiest to copy a Makefile
    of another target and modify it. It should at least contain the LEVEL,
    LIBRARYNAME and TARGET variables, and then include
    $(LEVEL)/Makefile.common. The library can be named LLVMDummy (for example,
    see the MIPS target). Alternatively, you can split the library into
    LLVMDummyCodeGen and LLVMDummyAsmPrinter, the latter of which should be
    implemented in a subdirectory below lib/Target/Dummy (for example, see the
    PowerPC target).

    Note that these two naming schemes are hardcoded into llvm-config. Using
    any other naming scheme will confuse llvm-config and produce lots of
    (seemingly unrelated) linker errors when linking llc.

    To make your target actually do something, you need to implement a
    subclass of TargetMachine. This implementation should typically be in the
    file lib/Target/DummyTargetMachine.cpp, but any file in the lib/Target
    directory will be built and should work. To use LLVM's target-independent
    code generator, you should do what all current machine backends do: create
    a subclass of LLVMTargetMachine. (To create a target from scratch, create a
    subclass of TargetMachine.)

    To get LLVM to actually build and link your target, you need to add it to
    the TARGETS_TO_BUILD variable. To do this, you modify the configure script
    to know about your target when parsing the --enable-targets option. Search
    the configure script for TARGETS_TO_BUILD, add your target to the lists
    there (some creativity required) and then reconfigure. Alternatively, you
    can change autoconf/configure.ac and regenerate configure by running
    ./autoconf/AutoRegen.sh.

    <del>Writing a backend</del>

    Target Machine

    LLVMTargetMachine is designed as a base class for targets implemented with
    the LLVM target-independent code generator. The LLVMTargetMachine class
    should be specialized by a concrete target class that implements the
    various virtual methods. LLVMTargetMachine is defined as a subclass of
    TargetMachine in include/llvm/Target/TargetMachine.h. The TargetMachine
    class implementation (TargetMachine.cpp) also processes numerous
    command-line options.

    To create a concrete target-specific subclass of LLVMTargetMachine, start
    by copying an existing TargetMachine class and header. You should name the
    files that you create to reflect your specific target. For instance, for
    the SPARC target, name the files SparcTargetMachine.h and
    SparcTargetMachine.cpp.

    For a target machine XXX, the implementation of XXXTargetMachine must have
    access methods to obtain objects that represent target components. These
    methods are named get*Info and are intended to obtain the instruction set
    (getInstrInfo), register set (getRegisterInfo), stack frame layout
    (getFrameInfo), and similar information. XXXTargetMachine must also
    implement the getTargetData method to access an object with
    target-specific data characteristics, such as data type size and alignment
    requirements.

    For instance, for the SPARC target, the header file SparcTargetMachine.h
    declares prototypes for several get*Info and getTargetData methods that
    simply return a class member.

    namespace llvm {

    class Module;

    class SparcTargetMachine : public LLVMTargetMachine {
      const TargetData DataLayout;       // Calculates type size & alignment
      SparcSubtarget Subtarget;
      SparcInstrInfo InstrInfo;
      TargetFrameInfo FrameInfo;

    protected:
      virtual const TargetAsmInfo *createTargetAsmInfo() const;

    public:
      SparcTargetMachine(const Module &M, const std::string &FS);

      virtual const SparcInstrInfo *getInstrInfo() const { return &InstrInfo; }
      virtual const TargetFrameInfo *getFrameInfo() const { return &FrameInfo; }
      virtual const TargetSubtarget *getSubtargetImpl() const { return &Subtarget; }
      virtual const TargetRegisterInfo *getRegisterInfo() const {
        return &InstrInfo.getRegisterInfo();
      }
      virtual const TargetData *getTargetData() const { return &DataLayout; }
      static unsigned getModuleMatchQuality(const Module &M);

      // Pass Pipeline Configuration
      virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
      virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
      virtual bool addAssemblyEmitter(PassManagerBase &PM, bool Fast,
                                      std::ostream &Out);
    };

    } // end namespace llvm

  • getInstrInfo
  • getRegisterInfo
  • getFrameInfo
  • getTargetData
  • getSubtargetImpl

    For some targets, you also need to support the following methods:

  • getTargetLowering
  • getJITInfo

    In addition, the XXXTargetMachine constructor should specify a
    TargetDescription string that determines the data layout for the target
    machine, including characteristics such as pointer size, alignment, and
    endianness. For example, the constructor for SparcTargetMachine contains
    the following:

    SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
      : DataLayout("E-p:32:32-f128:128:128"),
        Subtarget(M, FS), InstrInfo(Subtarget),
        FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
    }

    Hyphens separate portions of the TargetDescription string.

  • The "E" in the string indicates a big-endian target data model; a
    lower-case "e" would indicate little-endian.
  • "p:" is followed by pointer information: size, ABI alignment, and
    preferred alignment. If only two figures follow "p:", then the first value
    is pointer size, and the second value is both ABI and preferred alignment.
  • The remaining portions begin with a letter for numeric type alignment:
    "i", "f", "v", or "a" (corresponding to integer, floating point, vector,
    or aggregate). "i", "v", or "a" are followed by ABI alignment and
    preferred alignment. "f" is followed by three values: the first indicates
    the size of a long double, followed by ABI alignment and preferred
    alignment.
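
    As a rough illustration of how such a string breaks down, the following
    C++ sketch splits a description on hyphens and reads only the endianness
    flag and the "p:" portion. The LayoutSketch struct and the split and
    parseLayoutSketch helpers are hypothetical names invented for this example;
    they are not part of LLVM, and the sketch deliberately skips every other
    portion of the string.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for the fields this sketch extracts; not an LLVM type.
struct LayoutSketch {
  bool BigEndian = false;
  unsigned PointerSize = 0;
  unsigned PointerABIAlign = 0;
  unsigned PointerPrefAlign = 0;
};

// Split S into pieces at every occurrence of Sep.
static std::vector<std::string> split(const std::string &S, char Sep) {
  std::vector<std::string> Parts;
  std::string Part;
  std::istringstream In(S);
  while (std::getline(In, Part, Sep))
    Parts.push_back(Part);
  return Parts;
}

// Read the endianness flag and the "p:" portion; skip all other portions.
LayoutSketch parseLayoutSketch(const std::string &Desc) {
  LayoutSketch L;
  for (const std::string &Portion : split(Desc, '-')) {
    if (Portion == "E") {
      L.BigEndian = true;
    } else if (Portion == "e") {
      L.BigEndian = false;
    } else if (Portion.compare(0, 2, "p:") == 0) {
      std::vector<std::string> Nums = split(Portion.substr(2), ':');
      assert(Nums.size() >= 2 && "expected at least size and ABI alignment");
      L.PointerSize = static_cast<unsigned>(std::stoul(Nums[0]));
      L.PointerABIAlign = static_cast<unsigned>(std::stoul(Nums[1]));
      // With only two figures, the second one serves as both alignments.
      L.PointerPrefAlign = Nums.size() > 2
                               ? static_cast<unsigned>(std::stoul(Nums[2]))
                               : L.PointerABIAlign;
    }
  }
  return L;
}
```

    Applied to the SPARC string "E-p:32:32-f128:128:128", this sketch reports
    a big-endian target with a pointer size of 32 and both pointer alignments
    equal to 32, because only two figures follow "p:".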

    You must also register your target using the RegisterTarget template. (See
    the TargetMachineRegistry class.) For example, in SparcTargetMachine.cpp,
    the target is registered with:

    namespace {
      // Register the target.
      RegisterTarget<SparcTargetMachine> X("sparc", "SPARC");
    }
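
    The registration line defines a namespace-scope object, so its constructor
    runs during static initialization, before main. The following sketch shows
    that general self-registration idiom in isolation. It is an illustration
    under assumptions, not the actual RegisterTarget or TargetMachineRegistry
    implementation: RegisterTargetSketch, targetRegistrySketch, and
    DummyTargetMachineSketch are hypothetical names, and the registry here
    stores only a name and a description.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry mapping a short target name to its description.
// A function-local static avoids static initialization order problems.
std::map<std::string, std::string> &targetRegistrySketch() {
  static std::map<std::string, std::string> Registry;
  return Registry;
}

// Hypothetical stand-in for a registration template: constructing an object
// of this type records the target in the registry.
template <typename TargetMachineT> struct RegisterTargetSketch {
  RegisterTargetSketch(const char *Name, const char *Desc) {
    assert(Name && Desc && "name and description are required");
    targetRegistrySketch()[Name] = Desc;
  }
};

// Placeholder target machine type used only to instantiate the template.
struct DummyTargetMachineSketch {};

namespace {
// Runs before main and registers the hypothetical "dummy" target.
RegisterTargetSketch<DummyTargetMachineSketch>
    RegisteredDummy("dummy", "Dummy (hypothetical)");
} // namespace
```

    A driver could then look up "dummy" in the registry at run time without
    naming DummyTargetMachineSketch directly, which is the appeal of this
    pattern for optional targets.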

    Register Set and Register Classes

    You should describe a concrete target-specific class that represents the
    register file of a target machine. This class is called XXXRegisterInfo
    (where XXX identifies the target) and represents the class register file
    data that is used for register allocation and also describes the
    interactions between registers.

    You also need to define register classes to categorize related registers.
    A register class should be added for groups of registers that are all
    treated the same way for some instruction. Typical examples are register
    classes that include integer, floating-point, or vector registers. A
    register allocator allows an instruction to use any register in a
    specified register class to perform the instruction in a similar manner.
    Register classes allocate virtual registers to instructions from these
    sets, and register classes let the target-independent register allocator
    automatically choose the actual registers.

    Much of the code for registers, including register definition, register
    aliases, and register classes, is generated by TableGen from
    XXXRegisterInfo.td input files and placed in XXXGenRegisterInfo.h.inc and
    XXXGenRegisterInfo.inc output files. Some of the code in the
    implementation of XXXRegisterInfo requires hand-coding.

    <del>Machine backends</del>

    Defining a Register

    The XXXRegisterInfo.td file typically starts with register definitions for
    a target machine. The Register class (specified in Target.td) is used to
    define an object for each register. The specified string n becomes the
    Name of the register. The basic Register object does not have any
    subregisters and does not specify any aliases.

    class Register<string n> {
      string Namespace = "";
      string AsmName = n;
      string Name = n;
      int SpillSize = 0;
      int SpillAlignment = 0;
      list<Register> Aliases = [];
      list<Register> SubRegs = [];
      list<int> DwarfNumbers = [];
    }

    For example, in the X86RegisterInfo.td file, there are register
    definitions that utilize the Register class, such as:

    def AL : Register<"AL">,
             DwarfRegNum<[0, 0, 0]>;

    This defines the register AL and assigns it values (with DwarfRegNum) that
    are used by gcc, gdb, or a debug information writer (such as DwarfWriter
    in llvm/lib/CodeGen) to identify a register. For register AL, DwarfRegNum
    takes an array of 3 values, representing 3 different modes: the first
    element is for X86-64, the second for EH (exception handling) on X86-32,
    and the third is generic. -1 is a special Dwarf number that indicates the
    gcc number is undefined, and -2 indicates the register number is invalid
    for this mode.

    From the previously described line in the X86RegisterInfo.td file,
    TableGen generates this code in the X86GenRegisterInfo.inc file:

    static const unsigned GR8[] = { X86::AL, ... };

    const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };

    const TargetRegisterDesc RegisterDescriptors[] = {
      ...
      { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...

    From the register info file, TableGen generates a TargetRegisterDesc
    object for each register. TargetRegisterDesc is defined in
    include/llvm/Target/TargetRegisterInfo.h with the following fields:

    struct TargetRegisterDesc {
      const char     *AsmName;      // Assembly language name for the register
      const char     *Name;         // Printable name for the reg (for debugging)
      const unsigned *AliasSet;     // Register Alias Set
      const unsigned *SubRegs;      // Sub-register set
      const unsigned *ImmSubRegs;   // Immediate sub-register set
      const unsigned *SuperRegs;    // Super-register set
    };

    TableGen uses the entire target description file (.td) to determine text
    names for the register (in the AsmName and Name fields of
    TargetRegisterDesc) and the relationships of other registers to the
    defined register (in the other TargetRegisterDesc fields). In this
    example, other definitions establish the registers "AX", "EAX", and "RAX"
    as aliases for one another, so TableGen generates a null-terminated array
    (AL_AliasSet) for this register alias set.
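
    Because the alias set carries no separate length, code that reads it walks
    the array until it reaches the 0 terminator. The sketch below shows that
    traversal on a stand-in array. The AliasSketchReg enum values and the
    aliasCount and isAlias helpers are hypothetical; the real register numbers
    come from the TableGen-generated enum, and this is not code taken from
    LLVM.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical register numbers for illustration only. Zero is kept free so
// that it can act as the list terminator.
enum AliasSketchReg : unsigned { AS_NoReg = 0, AS_AL, AS_AX, AS_EAX, AS_RAX };

// Same shape as the generated AL_AliasSet: entries followed by a 0.
static const unsigned AL_AliasSetSketch[] = {AS_AX, AS_EAX, AS_RAX, 0};

// Count the entries that appear before the 0 terminator.
std::size_t aliasCount(const unsigned *AliasSet) {
  assert(AliasSet && "alias set must not be null");
  std::size_t N = 0;
  while (AliasSet[N] != 0)
    ++N;
  return N;
}

// Return true if Reg appears in the zero-terminated alias set.
bool isAlias(const unsigned *AliasSet, unsigned Reg) {
  assert(AliasSet && "alias set must not be null");
  for (const unsigned *A = AliasSet; *A != 0; ++A)
    if (*A == Reg)
      return true;
  return false;
}
```

    The terminator convention only works if no real register is numbered 0,
    which is why the sketch reserves that value.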


    The Register class is commonly used as a base class for more complex
    classes. In Target.td, the Register class is the base for the
    RegisterWithSubRegs class that is used to define registers that need to
    specify subregisters in the SubRegs list, as shown here:

    class RegisterWithSubRegs<string n,
                              list<Register> subregs> : Register<n> {
      let SubRegs = subregs;
    }

    In SparcRegisterInfo.td, additional register classes are defined for
    SPARC: a Register subclass, SparcReg, and further subclasses: Ri, Rf, and
    Rd. SPARC registers are identified by 5-bit ID numbers, which is a feature
    common to these subclasses. Note the use of ‘let’ expressions to override
    values that are initially defined in a superclass (such as the SubRegs
    field in the Rd class).

    class SparcReg<string n> : Register<n> {
      field bits<5> Num;
      let Namespace = "SP";
    }
    // Ri - 32-bit integer registers
    class Ri<bits<5> num, string n> : SparcReg<n> {
      let Num = num;
    }
    // Rf - 32-bit floating-point registers
    class Rf<bits<5> num, string n> : SparcReg<n> {
      let Num = num;
    }
    // Rd - Slots in the FP register file for 64-bit floating-point values.
    class Rd<bits<5> num, string n,
             list<Register> subregs> : SparcReg<n> {
      let Num = num;
      let SubRegs = subregs;
    }

    In the SparcRegisterInfo.td file, there are register definitions that
    utilize these subclasses of Register, such as:

    def G0 : Ri< 0, "G0">, DwarfRegNum<[0]>;
    def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
    ...
    def F0 : Rf< 0, "F0">, DwarfRegNum<[32]>;
    def F1 : Rf< 1, "F1">, DwarfRegNum<[33]>;
    ...
    def D0 : Rd< 0, "F0", [F0, F1]>, DwarfRegNum<[32]>;
    def D1 : Rd< 2, "F2", [F2, F3]>, DwarfRegNum<[34]>;

    The last two registers shown above (D0 and D1) are double-precision
    floating-point registers that are aliases for pairs of single-precision
    floating-point sub-registers. In addition to aliases, the sub-register and
    super-register relationships of the defined register are in fields of a
    register’s TargetRegisterDesc.

    Defining a Register Class

    The RegisterClass class (specified in Target.td) is used to define an
    object that represents a group of related registers and also defines the
    default allocation order of the registers. A target description file
    XXXRegisterInfo.td that uses Target.td can construct register classes
    using the following class:

    class RegisterClass<string namespace,
                        list<ValueType> regTypes, int alignment,
                        list<Register> regList> {
      string Namespace = namespace;
      list<ValueType> RegTypes = regTypes;
      int Size = 0;  // spill size, in bits; zero lets tblgen pick the size
      int Alignment = alignment;

      // CopyCost is the cost of copying a value between two registers
      // default value 1 means a single instruction
      // A negative value means copying is extremely expensive or impossible
      int CopyCost = 1;
      list<Register> MemberList = regList;

      // for register classes that are subregisters of this class
      list<RegisterClass> SubRegClassList = [];

      code MethodProtos = [{}];  // to insert arbitrary code
      code MethodBodies = [{}];
    }

    To define a RegisterClass, use the following 4 arguments:

  • The first argument of the definition is the name of the namespace.
  • The second argument is a list of ValueType register type values that are
    defined in include/llvm/CodeGen/ValueTypes.td. Defined values include
    integer types (such as i16, i32, and i1 for Boolean), floating-point types
    (f32, f64), and vector types (for example, v8i16 for an 8 x i16 vector).
    All registers in a RegisterClass must have the same ValueType, but some
    registers may store vector data in different configurations. For example,
    a register that can process a 128-bit vector may be able to handle 16
    8-bit integer elements, 8 16-bit integers, 4 32-bit integers, and so on.
  • The third argument of the RegisterClass definition specifies the alignment
    required of the registers when they are stored or loaded to memory.
  • The final argument, regList, specifies which registers are in this class.
    If an allocation_order_* method is not specified, then regList also
    defines the order of allocation used by the register allocator.

    In SparcRegisterInfo.td, three RegisterClass objects are defined: FPRegs,
    DFPRegs, and IntRegs. For all three register classes, the first argument
    defines the namespace with the string “SP”. FPRegs defines a group of 32
    single-precision floating-point registers (F0 to F31); DFPRegs defines a
    group of 16 double-precision registers (D0-D15). For IntRegs, the
    MethodProtos and MethodBodies fields are used by TableGen to insert the
    specified code into generated output.

    def FPRegs : RegisterClass<"SP", [f32], 32, [F0, F1, F2, F3, F4, F5, F6, F7,
      F8, F9, F10, F11, F12, F13, F14, F15, F16, F17, F18, F19, F20, F21, F22,
      F23, F24, F25, F26, F27, F28, F29, F30, F31]>;

    def DFPRegs : RegisterClass<"SP", [f64], 64, [D0, D1, D2, D3, D4, D5, D6, D7,
      D8, D9, D10, D11, D12, D13, D14, D15]>;

    def IntRegs : RegisterClass<"SP", [i32], 32, [L0, L1, L2, L3, L4, L5, L6, L7,
                                         I0, I1, I2, I3, I4, I5,
                                         O0, O1, O2, O3, O4, O5, O7,
                                         G1,
                                         // Non-allocatable regs:
                                         G2, G3, G4,
                                         O6, // stack ptr
                                         I6, // frame ptr
                                         I7, // return address
                                         G0, // constant zero
                                         G5, G6, G7 // reserved for kernel
                                         ]> {
      let MethodProtos = [{
        iterator allocation_order_end(const MachineFunction &MF) const;
      }];
      let MethodBodies = [{
        IntRegsClass::iterator
        IntRegsClass::allocation_order_end(const MachineFunction &MF) const {
          return end()-10  // Don't allocate special registers
             -1;
        }
      }];
    }

    Using SparcRegisterInfo.td with TableGen generates several output files
    that are intended for inclusion in other source code that you write.
    SparcRegisterInfo.td generates SparcGenRegisterInfo.h.inc, which should be
    included in the header file for the SPARC register implementation that you
    write (SparcRegisterInfo.h). In SparcGenRegisterInfo.h.inc a new structure
    is defined called SparcGenRegisterInfo that uses TargetRegisterInfo as its
    base. It also specifies types, based upon the defined register classes:
    DFPRegsClass, FPRegsClass, and IntRegsClass.

    SparcRegisterInfo.td also generates SparcGenRegisterInfo.inc, which is
    included at the bottom of SparcRegisterInfo.cpp, the SPARC register
    implementation. The code below shows only the generated integer registers
    and associated register classes. The order of registers in IntRegs
    reflects the order in the definition of IntRegs in the target description
    file. Take special note of the use of MethodBodies in SparcRegisterInfo.td
    to create code in SparcGenRegisterInfo.inc. MethodProtos generates similar
    code in SparcGenRegisterInfo.h.inc.

    // IntRegs Register Class...
    static const unsigned IntRegs[] = {
      SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
      SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3, SP::I4, SP::I5, SP::O0, SP::O1,
      SP::O2, SP::O3, SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3, SP::G4, SP::O6,
      SP::I6, SP::I7, SP::G0, SP::G5, SP::G6, SP::G7,
    };

    // IntRegsVTs Register Class Value Types...
    static const MVT::ValueType IntRegsVTs[] = {
      MVT::i32, MVT::Other
    };

    namespace SP {   // Register class instances
      DFPRegsClass    DFPRegsRegClass;
      FPRegsClass     FPRegsRegClass;
      IntRegsClass    IntRegsRegClass;
    ...

      // IntRegs Sub-register Classess...
      static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
        NULL
      };
    ...
      // IntRegs Super-register Classess...
      static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
        NULL
      };

      // IntRegs Register Class sub-classes...
      static const TargetRegisterClass* const IntRegsSubclasses [] = {
        NULL
      };
    ...

      // IntRegs Register Class super-classes...
      static const TargetRegisterClass* const IntRegsSuperclasses [] = {
        NULL
      };
    ...

      IntRegsClass::iterator
      IntRegsClass::allocation_order_end(const MachineFunction &MF) const {

        return end()-10  // Don't allocate special registers
           -1;
      }

      IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
        IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
        IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
    }
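
    To see what the end()-10-1 expression in allocation_order_end selects, the
    sketch below applies the same pointer arithmetic to a stand-in array of
    register names, listed in the same order as IntRegs. The names
    IntRegsSketch, NumIntRegsSketch, and allocatableIntRegsSketch are
    hypothetical, and strings replace the generated SP:: enum values so that
    the example is self-contained. The text does not say why one more entry is
    subtracted after the ten special registers; the arithmetic simply shows
    which entry that is.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the generated IntRegs array: register names in the same order
// as the IntRegs definition in the target description file.
static const char *const IntRegsSketch[] = {
    "L0", "L1", "L2", "L3", "L4", "L5", "L6", "L7",
    "I0", "I1", "I2", "I3", "I4", "I5",
    "O0", "O1", "O2", "O3", "O4", "O5", "O7",
    "G1",
    "G2", "G3", "G4", "O6", "I6", "I7", "G0", "G5", "G6", "G7"};

static const std::size_t NumIntRegsSketch =
    sizeof(IntRegsSketch) / sizeof(IntRegsSketch[0]);

// Hypothetical helper: return the names in [begin, end - 10 - 1), mirroring
// the range that allocation_order_end leaves to the allocator.
std::vector<std::string> allocatableIntRegsSketch() {
  assert(NumIntRegsSketch >= 11 && "need at least 11 trailing registers");
  const char *const *Begin = IntRegsSketch;
  const char *const *End = IntRegsSketch + NumIntRegsSketch;
  const char *const *AllocEnd = End - 10 - 1;
  return std::vector<std::string>(Begin, AllocEnd);
}
```

    With the 32 entries above, the range stops 11 entries before the end, so
    it covers the 21 registers from L0 through O7 and leaves out G1 along with
    the ten registers listed after the "Non-allocatable regs" comment.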

    Implement a subclass of TargetRegisterInfo


    The final step is to hand-code portions of XXXRegisterInfo, which
    implements the interface described in TargetRegisterInfo.h. These functions
    return 0, NULL, or false, unless overridden. Here is a list of functions that
    are overridden for the SPARC implementation in SparcRegisterInfo.cpp:

    752
    753
  • getCalleeSavedRegs (returns a list of callee-saved registers in the order of the desired callee-save stack frame offset)
  • getCalleeSavedRegClasses (returns a list of preferred register classes with which to spill each callee-saved register)
  • getReservedRegs (returns a bitset indexed by physical register numbers, indicating if a particular register is unavailable)
  • hasFP (returns a Boolean indicating if a function should have a dedicated frame pointer register)
  • eliminateCallFramePseudoInstr (if call frame setup or destroy pseudo instructions are used, this can be called to eliminate them)
  • eliminateFrameIndex (eliminates abstract frame indices from instructions that may use them)
  • emitPrologue (inserts prologue code into the function)
  • emitEpilogue (inserts epilogue code into the function)
    775
    776
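    To make the idea of a reserved-register bitset concrete, here is a minimal standalone sketch. It does not use LLVM headers: the ToyReg enum, the choice of which registers are reserved, and the function names are all assumptions made for illustration, not the SPARC backend's actual code.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical register numbers; a real target gets these from TableGen.
enum ToyReg : std::size_t { G0, G1, O6, I6, L0, L1, NumToyRegs };

// Analogue of getReservedRegs: bit N is set when register N is unavailable
// to the register allocator.
std::bitset<NumToyRegs> getReservedRegsSketch() {
  std::bitset<NumToyRegs> Reserved;
  Reserved.set(G0);  // assumed reserved for this example
  Reserved.set(O6);  // assumed reserved for this example
  Reserved.set(I6);  // assumed reserved for this example
  return Reserved;
}

// A register is allocatable when its bit is clear.
bool isAllocatable(std::size_t Reg) {
  return !getReservedRegsSketch().test(Reg);
}
```

    A caller indexes the bitset by physical register number, so isAllocatable(L0) is true in this sketch while isAllocatable(G0) is false.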
    777
    778
    Instruction Set
    780
    781
    782
    783

    During the early stages of code generation, the LLVM IR code is
    converted to a SelectionDAG with nodes that are instances of the SDNode class
    containing target instructions. An SDNode has an opcode, operands, type
    requirements, and operation properties (for example, whether an operation is
    commutative, or whether it loads from memory). The various operation node
    types are described in the include/llvm/CodeGen/SelectionDAGNodes.h file (values
    of the NodeType enum in the ISD namespace).

    790
    791

    TableGen uses the following target description (.td) input files
    to generate much of the code for instruction definition:

    793
    794
  • Target.td, where the Instruction, Operand, InstrInfo, and other fundamental classes are defined
  • TargetSelectionDAG.td, used by SelectionDAG instruction selection generators, which contains SDTC* classes (selection DAG type constraints), definitions of SelectionDAG nodes (such as imm, cond, bb, add, fadd, sub), and pattern support (Pattern, Pat, PatFrag, PatLeaf, ComplexPattern)
  • XXXInstrFormats.td, patterns for definitions of target-specific instructions
  • XXXInstrInfo.td, target-specific definitions of instruction templates, condition codes, and instructions of an instruction set. (For architecture modifications, a different file name may be used. For example, for Pentium with SSE instructions, this file is X86InstrSSE.td, and for Pentium with MMX, this file is X86InstrMMX.td.)
    810
    811

    There is also a target-specific XXX.td file, where XXX is the
    name of the target. The XXX.td file includes the other .td input files, but its
    contents are only directly important for subtargets.

    814
    815

    You should describe a concrete target-specific class XXXInstrInfo that
    represents machine instructions supported by a target machine. XXXInstrInfo
    contains an array of XXXInstrDescriptor objects, each of which describes one
    instruction. An instruction descriptor defines:

    821
    822
  • opcode mnemonic
  • number of operands
  • list of implicit register definitions and uses
  • target-independent properties (such as whether it accesses memory or is commutable)
  • target-specific flags
    833
    834

    The Instruction class (defined in Target.td) is mostly used as a
    base for more complex instruction classes.

    836
    837
    838
    839
    class Instruction {
      string Namespace = "";
      dag OutOperandList;       // A dag containing the MI def operand list.
      dag InOperandList;        // A dag containing the MI use operand list.
      string AsmString = "";    // The .s format to print the instruction with.
      list<dag> Pattern;        // Set to the DAG pattern for this instruction
      list<Register> Uses = [];
      list<Register> Defs = [];
      list<Predicate> Predicates = [];  // predicates turned into isel match code
      ... remainder not shown for space ...
    }
    850
    851
    852
    853

    A SelectionDAG node (SDNode) should contain an object
    representing a target-specific instruction that is defined in XXXInstrInfo.td. The
    instruction objects should represent instructions from the architecture manual
    of the target machine (such as the SPARC Architecture Manual for the SPARC target).

    858
    859

    A single instruction from the architecture manual is often modeled as multiple
    target instructions, depending upon its operands. For example, a manual might
    describe an add instruction that takes a register or an immediate operand. An
    LLVM target could model this with two instructions named ADDri and ADDrr.

    864
    865

    You should define a class for each instruction category and define each opcode
    as a subclass of the category with appropriate parameters such as the fixed
    binary encoding of opcodes and extended opcodes. You should map the register
    bits to the bits of the instruction in which they are encoded (for the JIT).
    You should also specify how the instruction is printed when the automatic
    assembly printer is used.

    872
    873

    As described in the SPARC Architecture Manual, Version 8, there are three
    major 32-bit formats for instructions. Format 1 is only for the CALL instruction.
    Format 2 is for branch on condition codes and SETHI (set high bits of a register)
    instructions. Format 3 is for other instructions.

    878
    879

    Each of these formats has corresponding classes in SparcInstrFormats.td. InstSP
    is a base class for other instruction classes. Additional base classes are
    specified for more precise formats: for example, in SparcInstrFormats.td, F2_1 is
    for SETHI, and F2_2 is for branches. There are three other base classes: F3_1 for
    register/register operations, F3_2 for register/immediate operations, and F3_3 for
    floating-point operations. SparcInstrInfo.td also adds the base class Pseudo for
    synthetic SPARC instructions.

    887
    888

    SparcInstrInfo.td largely consists of operand and instruction definitions for
    the SPARC target. In SparcInstrInfo.td, the following target description file
    entry, LDrr, defines the Load Integer instruction for a Word (the LD SPARC opcode)
    from a memory address to a register. The first parameter, the value 3 (11 in
    binary), is the operation value for this category of operation. The second
    parameter (000000 in binary) is the specific operation value for LD/Load Word. The
    third parameter is the output destination, which is a register operand and
    defined in the Register target description file (IntRegs).

    897
    898
    899
    def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
                     "ld [$addr], $dst",
                     [(set IntRegs:$dst, (load ADDRrr:$addr))]>;
    902
    903
    904
    905
    906

    The fourth parameter is the input source, which uses the address operand MEMrr
    that is defined earlier in SparcInstrInfo.td:

    909
    910
    911
    def MEMrr : Operand<i32> {
      let PrintMethod = "printMemOperand";
      let MIOperandInfo = (ops IntRegs, IntRegs);
    }
    915
    916
    917
    918

    The fifth parameter is a string that is used by the assembly
    printer and can be left as an empty string until the assembly printer interface
    is implemented. The sixth and final parameter is the pattern used to match the
    instruction during the SelectionDAG Select Phase described in
    The LLVM Target-Independent Code Generator.
    This parameter is detailed in the next section, Instruction Selector.

    924
    925
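    The two numeric parameters of a Format 3 definition end up as bit fields of the 32-bit instruction word. The following standalone sketch packs those fields by hand, assuming the SPARC V8 Format 3 layout (op in bits 31-30, rd in 29-25, op3 in 24-19, rs1 in 18-14, the i bit at 13, then either rs2 in 4-0 or simm13 in 12-0). The function names encodeF3_1 and encodeF3_2 are hypothetical and are not part of LLVM; in a real backend this mapping is expressed in the .td format classes.

```cpp
#include <cassert>
#include <cstdint>

// Register/register form (i = 0): op | rd | op3 | rs1 | 0 | asi=0 | rs2.
std::uint32_t encodeF3_1(std::uint32_t op, std::uint32_t op3, std::uint32_t rd,
                         std::uint32_t rs1, std::uint32_t rs2) {
  return ((op & 0x3u) << 30) | ((rd & 0x1Fu) << 25) | ((op3 & 0x3Fu) << 19) |
         ((rs1 & 0x1Fu) << 14) | (rs2 & 0x1Fu);
}

// Register/immediate form (i = 1): op | rd | op3 | rs1 | 1 | simm13.
std::uint32_t encodeF3_2(std::uint32_t op, std::uint32_t op3, std::uint32_t rd,
                         std::uint32_t rs1, std::int32_t simm13) {
  return ((op & 0x3u) << 30) | ((rd & 0x1Fu) << 25) | ((op3 & 0x3Fu) << 19) |
         ((rs1 & 0x1Fu) << 14) | (1u << 13) |
         (static_cast<std::uint32_t>(simm13) & 0x1FFFu);
}
```

    With op = 3 and op3 = 0b000000 (the LD values from the text) and arbitrary register numbers rd = 8, rs1 = 1, rs2 = 2, encodeF3_1 yields 0xD0004002.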

    Instruction class definitions are not overloaded for different
    operand types, so separate versions of instructions are needed for register,
    memory, or immediate value operands. For example, to perform a
    Load Integer instruction for a Word from an address formed with an immediate
    operand (MEMri) to a register, the following instruction class is defined:

    931
    932
    933
    def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
                     "ld [$addr], $dst",
                     [(set IntRegs:$dst, (load ADDRri:$addr))]>;
    936
    937
    938
    939

    Writing these definitions for so many similar instructions can
    involve a lot of cut and paste. In .td files, the multiclass directive enables
    the creation of templates to define several instruction classes at once (using
    the defm directive). For example, in SparcInstrInfo.td, the multiclass pattern
    F3_12 is defined to create two instruction classes each time F3_12 is invoked:

    945
    946
    947
    multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
      def rr  : F3_1 <2, Op3Val,
                     (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
                     !strconcat(OpcStr, " $b, $c, $dst"),
                     [(set IntRegs:$dst, (OpNode IntRegs:$b, IntRegs:$c))]>;
      def ri  : F3_2 <2, Op3Val,
                     (outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
                     !strconcat(OpcStr, " $b, $c, $dst"),
                     [(set IntRegs:$dst, (OpNode IntRegs:$b, simm13:$c))]>;
    }
    957
    958
    959
    960

    So when the defm directive is used for the XOR and ADD
    instructions, as seen below, it creates four instruction objects: XORrr, XORri,
    ADDrr, and ADDri.

    963
    964
    965
    defm XOR   : F3_12<"xor", 0b000011, xor>;
    defm ADD   : F3_12<"add", 0b000000, add>;
    967
    968
    969
    970
    971

    SparcInstrInfo.td also includes definitions for condition codes that are
    referenced by branch instructions. The following definitions in SparcInstrInfo.td
    assign each SPARC condition code its numeric value; for example, the value 10
    represents the ‘greater than’ condition for integers, and the value 22
    represents the ‘greater than’ condition for floats.

    977
    978
    979
    980
    def ICC_NE  : ICC_VAL< 9>;  // Not Equal
    def ICC_E   : ICC_VAL< 1>;  // Equal
    def ICC_G   : ICC_VAL<10>;  // Greater
    ...
    def FCC_U   : FCC_VAL<23>;  // Unordered
    def FCC_G   : FCC_VAL<22>;  // Greater
    def FCC_UG  : FCC_VAL<21>;  // Unordered or Greater
    ...
    988
    989
    990
    991
    992

    (Note that Sparc.h also defines enums that correspond to the same SPARC
    condition codes. Care must be taken to ensure the values in Sparc.h correspond
    to the values in SparcInstrInfo.td; that is, SPCC::ICC_NE = 9, SPCC::FCC_U = 23,
    and so on.)

    996
    997
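    One way to guard against the two sets of values drifting apart is a compile-time check. The sketch below is only an illustration of that idea: the td namespace holds values copied by hand from the .td definitions quoted in this section, the SPCC enum is a stand-in mirroring the values the text attributes to Sparc.h, and nothing here is taken from the actual LLVM sources.

```cpp
#include <cassert>

// Values copied by hand from the condition code definitions in the .td file.
namespace td {
constexpr unsigned ICC_NE = 9, ICC_E = 1, ICC_G = 10;
constexpr unsigned FCC_U = 23, FCC_G = 22, FCC_UG = 21;
}  // namespace td

// Stand-in for the enum that the text says lives in Sparc.h.
namespace SPCC {
enum CondCodes {
  ICC_NE = 9, ICC_E = 1, ICC_G = 10,
  FCC_U = 23, FCC_G = 22, FCC_UG = 21
};
}  // namespace SPCC

// Compilation fails if either side is edited without the other.
static_assert(static_cast<unsigned>(SPCC::ICC_NE) == td::ICC_NE, "ICC_NE mismatch");
static_assert(static_cast<unsigned>(SPCC::ICC_E) == td::ICC_E, "ICC_E mismatch");
static_assert(static_cast<unsigned>(SPCC::ICC_G) == td::ICC_G, "ICC_G mismatch");
static_assert(static_cast<unsigned>(SPCC::FCC_U) == td::FCC_U, "FCC_U mismatch");
static_assert(static_cast<unsigned>(SPCC::FCC_G) == td::FCC_G, "FCC_G mismatch");
static_assert(static_cast<unsigned>(SPCC::FCC_UG) == td::FCC_UG, "FCC_UG mismatch");
```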
    998
    999
    Implement a subclass of TargetInstrInfo
    1002
    1003
    1004
    1005

    The final step is to hand-code portions of XXXInstrInfo, which
    implements the interface described in TargetInstrInfo.h. These functions return
    0 or a Boolean, or they assert, unless overridden. Here is a list of functions
    that are overridden for the SPARC implementation in SparcInstrInfo.cpp:

    1009
    1010
  • isMoveInstr (returns true if the instruction is a register-to-register move; false otherwise)
  • isLoadFromStackSlot (if the specified machine instruction is a direct load from a stack slot, returns the register number of the destination and the FrameIndex of the stack slot)
  • isStoreToStackSlot (if the specified machine instruction is a direct store to a stack slot, returns the register number of the source and the FrameIndex of the stack slot)
  • copyRegToReg (copies values between a pair of registers)
  • storeRegToStackSlot (stores a register value to a stack slot)
  • loadRegFromStackSlot (loads a register value from a stack slot)
  • storeRegToAddr (stores a register value to memory)
  • loadRegFromAddr (loads a register value from memory)
  • foldMemoryOperand (attempts to fold a load or store into the instruction for the specified operand(s))
    1033
    1034
    1035
    1036
    1037
    Branch Folding and If Conversion
    1039
    1040
    1041

    Performance can be improved by combining instructions or by eliminating
    instructions that are never reached. The AnalyzeBranch method in XXXInstrInfo may
    be implemented to examine conditional instructions and remove unnecessary
    instructions. AnalyzeBranch looks at the end of a machine basic block (MBB) for
    opportunities for improvement, such as branch folding and if conversion. The
    BranchFolder and IfConverter machine function passes (see the source files
    BranchFolding.cpp and IfConversion.cpp in the lib/CodeGen directory) call
    AnalyzeBranch to improve the control flow graph that represents the
    instructions.

    1050
    1051

    Several implementations of AnalyzeBranch (for ARM, Alpha, and
    X86) can be examined as models for your own AnalyzeBranch implementation. Since
    SPARC does not implement a useful AnalyzeBranch, the ARM target implementation
    is shown below.

    1055
    1056

    AnalyzeBranch returns a Boolean value and takes four parameters:

    1057
    1058
  • MachineBasicBlock &MBB – the incoming block to be examined
  • MachineBasicBlock *&TBB – a destination block that is returned; for a conditional branch that evaluates to true, TBB is the destination
  • MachineBasicBlock *&FBB – for a conditional branch that evaluates to false, FBB is returned as the destination
  • std::vector<MachineOperand> &Cond – list of operands to evaluate a condition for a conditional branch
    1070
    1071
    1072

    In the simplest case, if a block ends without a branch, then it
    falls through to the successor block. No destination blocks are specified for
    either TBB or FBB, so both parameters return NULL. The start of AnalyzeBranch
    (see code below for the ARM target) shows the function parameters and the code
    for the simplest case.

    1077
    1078
    1079
    1080
    bool ARMInstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
            MachineBasicBlock *&TBB, MachineBasicBlock *&FBB,
            std::vector<MachineOperand> &Cond) const
    {
      MachineBasicBlock::iterator I = MBB.end();
      if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
        return false;
    1087
    1088
    1089
    1090
    1091

    If a block ends with a single unconditional branch instruction,
    then AnalyzeBranch (shown below) should return the destination of that branch
    in the TBB parameter.

    1094
    1095
    1096
    1097
    if (LastOpc == ARM::B || LastOpc == ARM::tB) {
      TBB = LastInst->getOperand(0).getMBB();
      return false;
    }
    1101
    1102
    1103
    1104
    1105

    If a block ends with two unconditional branches, then the second
    branch is never reached. In that situation, as shown below, remove the last
    branch instruction and return the destination of the penultimate branch in the
    TBB parameter.

    1108
    1109
    1110
    1111
    if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &&
        (LastOpc == ARM::B || LastOpc == ARM::tB)) {
      TBB = SecondLastInst->getOperand(0).getMBB();
      I = LastInst;
      I->eraseFromParent();
      return false;
    }
    1118
    1119
    1120
    1121

    A block may end with a single conditional branch instruction that
    falls through to the successor block if the condition evaluates to false. In that
    case, AnalyzeBranch (shown below) should return the destination of that
    conditional branch in the TBB parameter and a list of operands in the Cond
    parameter to evaluate the condition.

    1126
    1127
    1128
    1129
    if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
      // Block ends with fall-through condbranch.
      TBB = LastInst->getOperand(0).getMBB();
      Cond.push_back(LastInst->getOperand(1));
      Cond.push_back(LastInst->getOperand(2));
      return false;
    }
    1136
    1137
    1138
    1139
    1140

    If a block ends with both a conditional branch and an ensuing
    unconditional branch, then AnalyzeBranch (shown below) should return the
    conditional branch destination (assuming it corresponds to a conditional
    evaluation of ‘true’) in the TBB parameter and the unconditional branch
    destination in the FBB parameter (corresponding to a conditional evaluation of
    ‘false’). A list of operands to evaluate the condition should be returned in the
    Cond parameter.

    1147
    1148
    1149
    1150
    unsigned SecondLastOpc = SecondLastInst->getOpcode();
    if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
        (SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
      TBB = SecondLastInst->getOperand(0).getMBB();
      Cond.push_back(SecondLastInst->getOperand(1));
      Cond.push_back(SecondLastInst->getOperand(2));
      FBB = LastInst->getOperand(0).getMBB();
      return false;
    }
    1159
    1160
    1161
    1162
    1163

    For the last two cases (ending with a single conditional branch or
    ending with one conditional and one unconditional branch), the operands returned
    in the Cond parameter can be passed to methods of other instructions to create
    new branches or perform other operations. An implementation of AnalyzeBranch
    requires the helper methods RemoveBranch and InsertBranch to manage subsequent
    operations.

    1169
    1170

    AnalyzeBranch should return false, indicating success, in most circumstances.
    AnalyzeBranch should only return true when the method is stumped about what to
    do, for example, if a block has three terminating branches. AnalyzeBranch may
    also return true if it encounters a terminator it cannot handle, such as an
    indirect branch.

    1175
    1176
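    The return-value convention and the TBB, FBB, and Cond outputs can be seen together in a small standalone model. This is only a sketch under simplifying assumptions: Kind, Inst, Block, and analyzeBranchSketch are invented stand-ins rather than LLVM types, blocks are identified by integers with -1 playing the role of NULL, a condition is a single integer, and the unreachable second branch is left in place rather than erased.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for machine instructions; none of these are LLVM types.
enum class Kind { Other, Br, CondBr, IndirectBr };
struct Inst { Kind K; int Target; int Cond; };  // Target: block id, -1 if none
using Block = std::vector<Inst>;

static bool isTerminator(const Inst &I) { return I.K != Kind::Other; }

// Returns false on success and true when the terminators are not understood.
bool analyzeBranchSketch(const Block &MBB, int &TBB, int &FBB,
                         std::vector<int> &Cond) {
  TBB = FBB = -1;
  Cond.clear();
  const std::size_t N = MBB.size();
  std::size_t T = 0;  // number of trailing terminators
  while (T < N && isTerminator(MBB[N - 1 - T])) ++T;
  if (T == 0) return false;  // block falls through
  const Inst &Last = MBB[N - 1];
  if (T == 1) {
    if (Last.K == Kind::Br) { TBB = Last.Target; return false; }
    if (Last.K == Kind::CondBr) {
      TBB = Last.Target;
      Cond.push_back(Last.Cond);
      return false;
    }
    return true;  // for example, an indirect branch
  }
  if (T == 2) {
    const Inst &Prev = MBB[N - 2];
    if (Prev.K == Kind::CondBr && Last.K == Kind::Br) {
      TBB = Prev.Target;
      Cond.push_back(Prev.Cond);
      FBB = Last.Target;
      return false;
    }
    if (Prev.K == Kind::Br && Last.K == Kind::Br) {
      TBB = Prev.Target;  // the second branch is never reached
      return false;
    }
  }
  return true;  // three or more terminators, or an unhandled combination
}
```

    A block ending in a conditional branch to block 3 followed by an unconditional branch to block 4 yields TBB = 3, FBB = 4, and one entry in Cond, with a return value of false.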
    1177
    1178
    Instruction Selector
    1180
    1181
    1182
    1183
    1184

    LLVM uses a SelectionDAG to represent LLVM IR instructions, and nodes
    of the SelectionDAG ideally represent native target instructions. During code
    generation, instruction selection passes are performed to convert non-native
    DAG instructions into native target-specific instructions. The pass defined
    in XXXISelDAGToDAG.cpp is used to match patterns and perform DAG-to-DAG
    instruction selection. Optionally, a pass may be defined (in
    XXXBranchSelector.cpp) to perform similar DAG-to-DAG operations for branch
    instructions. Later, the code in XXXISelLowering.cpp replaces or removes
    operations and data types that are not supported natively (legalizes them) in a
    SelectionDAG.

    1194
    1195

    TableGen generates code for instruction selection using the
    following target description input files:

    1197
    1198
  • XXXInstrInfo.td contains definitions of instructions in a target-specific instruction set and is used to generate XXXGenDAGISel.inc, which is included in XXXISelDAGToDAG.cpp.
  • XXXCallingConv.td contains the calling and return value conventions for the target architecture and is used to generate XXXGenCallingConv.inc, which is included in XXXISelLowering.cpp.
    1205
    1206
    1207

    The implementation of an instruction selection pass must include
    a header that declares the FunctionPass class or a subclass of FunctionPass. In
    XXXTargetMachine.cpp, a Pass Manager (PM) should add each instruction selection
    pass into the queue of passes to run.

    1211
    1212

    The LLVM static compiler (llc) is an excellent tool for visualizing the
    contents of DAGs. To display the SelectionDAG before or after specific processing
    phases, use the command line options for llc, described at
    SelectionDAG Instruction Selection Process
    (http://llvm.org/docs/CodeGenerator.html#selectiondag_process).

    1219
    1220

    To describe instruction selector behavior, you should add
    patterns for lowering LLVM code into a SelectionDAG as the last parameter of
    the instruction definitions in XXXInstrInfo.td. For example, in
    SparcInstrInfo.td, this entry defines a register store operation, and the last
    parameter describes a pattern with the store DAG operator.

    1225
    1226
    1227
    1228
    def STrr  : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
                     "st $src, [$addr]", [(store IntRegs:$src, ADDRrr:$addr)]>;
    1230
    1231
    1232
    1233
    1234

    ADDRrr is a memory mode that is also defined in SparcInstrInfo.td:

    1235
    1236
    1237
    1238
    def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;
    
                      
                    
    1239
    1240
    1241
    1242
    1243

    The definition of ADDRrr refers to SelectADDRrr, which is a function defined in an
    implementation of the Instruction Selector (such as SparcISelDAGToDAG.cpp).

    1245
    1246

    In lib/Target/TargetSelectionDAG.td, the DAG operator for store
    is defined as follows:

    1248
    1249
    1250
    1251
    def store : PatFrag<(ops node:$val, node:$ptr),
                        (st node:$val, node:$ptr), [{
      if (StoreSDNode *ST = dyn_cast<StoreSDNode>(N))
        return !ST->isTruncatingStore() &&
               ST->getAddressingMode() == ISD::UNINDEXED;
      return false;
    }]>;
    1258
    1259
    1260
    1261

    From XXXInstrInfo.td, TableGen also generates (in XXXGenDAGISel.inc) the
    SelectCode method that is used to call the appropriate processing method for an
    instruction. In this example, SelectCode calls Select_ISD_STORE for the
    ISD::STORE opcode.

    1265
    1266
    1267
    1268
    SDNode *SelectCode(SDOperand N) {
      ...
      MVT::ValueType NVT = N.Val->getValueType(0);
      switch (N.getOpcode()) {
      case ISD::STORE: {
        switch (NVT) {
        default:
          return Select_ISD_STORE(N);
          break;
        }
        break;
      }
      ...
    1281
    1282
    1283
    1284

    The pattern for STrr is matched, so elsewhere in
    XXXGenDAGISel.inc, code for STrr is created for Select_ISD_STORE. The Emit_22 method
    is also generated in XXXGenDAGISel.inc to complete the processing of this
    instruction.

    1288
    1289
    1290
    1291
    SDNode *Select_ISD_STORE(const SDOperand &N) {
      SDOperand Chain = N.getOperand(0);
      if (Predicate_store(N.Val)) {
        SDOperand N1 = N.getOperand(1);
        SDOperand N2 = N.getOperand(2);
        SDOperand CPTmp0;
        SDOperand CPTmp1;

        // Pattern: (st:void IntRegs:i32:$src,
        //           ADDRrr:i32:$addr)<<P:Predicate_store>>
        // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
        // Pattern complexity = 13  cost = 1  size = 0
        if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
            N1.Val->getValueType(0) == MVT::i32 &&
            N2.Val->getValueType(0) == MVT::i32) {
          return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
        }
    ...
    1309
    1310
    1311
    1312
    1313
    The SelectionDAG Legalize Phase
    1315
    1316
    1317

    The Legalize phase converts a DAG to use types and operations
    that are natively supported by the target. For natively unsupported types and
    operations, you need to add code to the target-specific XXXTargetLowering
    implementation to convert unsupported types and operations to supported ones.

    1321
    1322

    In the constructor for the XXXTargetLowering class, first use the
    addRegisterClass method to specify which types are supported and which register
    classes are associated with them. The code for the register classes is generated
    by TableGen from XXXRegisterInfo.td and placed in XXXGenRegisterInfo.h.inc. For
    example, the implementation of the constructor for the SparcTargetLowering
    class (in SparcISelLowering.cpp) starts with the following code:

    1328
    1329
    1330
    1331
    addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
    addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
    addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);
    1334
    1335
    1336
    1337
    1338

    You should examine the node types in the ISD namespace
    (include/llvm/CodeGen/SelectionDAGNodes.h)
    and determine which operations the target natively supports. For operations
    that do not have native support, add a callback to the constructor for
    the XXXTargetLowering class, so the instruction selection process knows what to
    do. The TargetLowering class callback methods (declared in
    llvm/Target/TargetLowering.h) are:

    1345
    1346
  • setOperationAction (general operation)
  • setLoadExtAction (load with extension)
  • setTruncStoreAction (truncating store)
  • setIndexedLoadAction (indexed load)
  • setIndexedStoreAction (indexed store)
  • setConvertAction (type conversion)
  • setCondCodeAction (support for a given condition code)
    1360
    1361

    Note: on older releases, setLoadXAction is used instead of setLoadExtAction.
    Also, on older releases, setCondCodeAction may not be supported. Examine your
    release to see what methods are specifically supported.

    1364
    1365

    These callbacks are used to determine whether an operation does or
    does not work with a specified type (or types). In all cases, the third
    parameter is a LegalAction type enum value: Promote, Expand,
    Custom, or Legal. SparcISelLowering.cpp
    contains examples of all four LegalAction values.

    1370
    1371
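    Conceptually, each of these callbacks records an action for an (operation, type) pair that the legalizer later looks up. The following standalone sketch models that table. It is not LLVM code: ToyLowering, the Opcode and ValueType enums, and the FP_TO_SINT entry are hypothetical, and falling back to Legal for unlisted pairs is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <utility>

enum LegalizeAction { Legal, Promote, Expand, Custom };

// Hypothetical opcode and type tags for illustration only.
enum Opcode { ADD, FSIN, FCOS, FP_TO_SINT };
enum ValueType { i1, i32, f32 };

class ToyLowering {
  std::map<std::pair<Opcode, ValueType>, LegalizeAction> Actions;

public:
  // Record how the legalizer should treat Op when applied to type VT.
  void setOperationAction(Opcode Op, ValueType VT, LegalizeAction A) {
    Actions[{Op, VT}] = A;
  }

  // Pairs that were never registered are treated as Legal (an assumption here).
  LegalizeAction getOperationAction(Opcode Op, ValueType VT) const {
    auto It = Actions.find({Op, VT});
    return It == Actions.end() ? Legal : It->second;
  }
};

// Mirrors the shape of a target lowering constructor.
ToyLowering makeToyLowering() {
  ToyLowering TL;
  TL.setOperationAction(FSIN, f32, Expand);
  TL.setOperationAction(FCOS, f32, Expand);
  TL.setOperationAction(FP_TO_SINT, i32, Custom);  // hypothetical entry
  return TL;
}
```

    Keeping the table keyed on both the operation and the type is what lets a target mark an operation as Expand for one type while leaving it Legal for another.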
    561372
    571373
    58 Outline
    59
    60
    61
    62
    63

    In general, you want to follow the format of SPARC, X86 or PowerPC (in

    64 lib/Target). SPARC is the simplest backend, and is RISC, so if
    65 you're working on a RISC target, it is a good one to start with.

    66
    67

    To create a static compiler (one that emits text assembly), you need to

    68 implement the following:

    69
    70
    71
  • Describe the register set.
  • 72
    73
  • Create a TableGen description of
  • 74 the register set and register classes
    75
  • Implement a subclass of
  • 76 href="CodeGenerator.html#targetregisterinfo">TargetRegisterInfo
    77
    78
  • Describe the instruction set.
  • 79
    80
  • Create a TableGen description of
  • 81 the instruction set
    82
  • Implement a subclass of
  • 83 href="CodeGenerator.html#targetinstrinfo">TargetInstrInfo
    84
    85
  • Describe the target machine.
  • 86
    87
  • Create a TableGen description of
  • 88 the target that describes the pointer size and references the instruction
    89 set
    90
  • Implement a subclass of
  • 91 href="CodeGenerator.html#targetmachine">TargetMachine, which
    92 configures TargetData
    93 correctly
    94
  • Register your new target using the RegisterTarget
  • 95 template:

    96
    
                      
                    
    97 RegisterTarget<MyTargetMachine> M("short_name", " Target name");
    98
    99
    Here, MyTargetMachine is the name of your implemented
    100 subclass of
    101 href="CodeGenerator.html#targetmachine">TargetMachine,
    102 short_name is the option that will be active following
    103 -march= to select a target in llc and lli, and the last string
    104 is the description of your target to appear in -help
    105 listing.
    106
    107
  • Implement the assembly printer for the architecture.
  • 108
    109
  • Define all of the assembly strings for your target, adding them to the
  • 110 instructions in your *InstrInfo.td file.
    111
  • Implement the llvm::AsmPrinter interface.
  • 112
    113
    114
  • Implement an instruction selector for the architecture.
  • 115
    116
  • The recommended method is the
  • 117 pattern-matching DAG-to-DAG instruction selector (for example, see
    118 the PowerPC backend in PPCISelDAGtoDAG.cpp). Parts of instruction
    119 selector creation can be performed by adding patterns to the instructions
    120 in your .td file.
    121
    122
    123
  • Optionally, add subtarget support.
  • 124
    125
  • If your target has multiple subtargets (e.g. variants with different
  • 126 capabilities), implement the llvm::TargetSubtarget interface
    127 for your architecture. This allows you to add -mcpu= and
    128 -mattr= options.
    129
    130
  • Optionally, add JIT support.
  • 131
    132
  • Create a subclass of
  • 133 href="CodeGenerator.html#targetjitinfo">TargetJITInfo
    134
  • Create a machine code emitter that will be used to emit binary code
  • 135 directly into memory, given MachineInstrs
    136
    137 >
    Promote
    1375
    1376
    1377
    1378

    For an operation without native support for a given type, the
    specified type may be promoted to a larger type that is supported. For example,
    SPARC does not support a sign-extending load for Boolean values (i1 type), so
    in SparcISelLowering.cpp the third parameter below, Promote, changes i1 type
    values to a larger type before loading.

    1384
    1385
    1386
    1387
    setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);
    
                      
                    
    1388
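    As a rough picture of what promotion means for this case, the standalone sketch below performs a wider load and then sign-extends from the single Boolean bit. It is an assumption-laden illustration: the function name sextLoadI1Sketch is hypothetical, the promoted type is taken to be a byte, and the code LLVM actually emits for a promoted load may differ.

```cpp
#include <cassert>
#include <cstdint>

// Load a wider (8-bit) value, then sign-extend the low bit to 32 bits.
// A set bit becomes -1 (all ones); a clear bit becomes 0.
std::int32_t sextLoadI1Sketch(const std::uint8_t *Ptr) {
  std::uint8_t Byte = *Ptr;     // the wider load the target supports
  return (Byte & 1u) ? -1 : 0;  // sign extension of a 1-bit value
}
```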
    1381389
    1391390
    1401391
    1411392
    142 Implementation details
    143
    144
    145
    146
    147
    148
    149
  • TableGen register info description - describe a class which will store the
    register's number in the binary encoding of the instruction (e.g., for JIT
    purposes).

    You also need to define register classes to contain these registers, such as
    the integer register class and floating-point register class, so that you can
    allocate virtual registers to instructions from these sets, and let the
    target-independent register allocator automatically choose the actual
    architected registers.

    158
    159
    160
    
                      
                    
    // class Register is defined in Target.td
    class TargetReg<string name> : Register<name> {
      let Namespace = "Target";
    Expand

    For a type without native support, a value may need to be broken down
    further, rather than promoted. For an operation without native support, a
    combination of other operations may be used to similar effect. In SPARC, the
    floating-point sine and cosine trig operations are supported by expansion to
    other operations, as indicated by the third parameter, Expand, to
    setOperationAction:

    setOperationAction(ISD::FSIN, MVT::f32, Expand);
    setOperationAction(ISD::FCOS, MVT::f32, Expand);
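
    One way to picture what these calls record is a table keyed by operation
    and value type, consulted later by the legalizer. The sketch below is a toy
    model only: the enum names and the ToyLowering class are hypothetical
    stand-ins, not the real LLVM types, and the assumption that unregistered
    pairs default to Legal follows the description of Legal in this document.

```cpp
#include <cassert>   // used only by the checks in the usage note
#include <map>
#include <utility>

// Hypothetical stand-ins for ISD opcodes, MVT types, and LegalizeAction.
enum class ToyOpcode { FSin, FCos, SExtLoad, Add };
enum class ToyType { i1, i32, f32 };
enum class ToyAction { Legal, Promote, Expand, Custom };

class ToyLowering {
  // One entry per (operation, type) pair that needs special treatment.
  std::map<std::pair<ToyOpcode, ToyType>, ToyAction> Actions;

public:
  // Record how the legalizer should treat Op when it produces type VT.
  void setOperationAction(ToyOpcode Op, ToyType VT, ToyAction A) {
    Actions[{Op, VT}] = A;
  }

  // Look the pair up; anything never registered is treated as Legal.
  ToyAction getOperationAction(ToyOpcode Op, ToyType VT) const {
    auto It = Actions.find({Op, VT});
    return It == Actions.end() ? ToyAction::Legal : It->second;
  }
};
```

    After registering ToyOpcode::FSin with ToyType::f32 as Expand, a lookup for
    that pair yields Expand, while a lookup for an untouched pair such as
    (Add, i32) yields Legal.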
    1407
    1408
    1409
    1410
    1411
    Custom

    For some operations, simple type promotion or operation expansion may be
    insufficient. In some cases, a special intrinsic function must be
    implemented.

    For example, a constant value may require special treatment, or an operation
    may require spilling and restoring registers in the stack and working with
    register allocators.

    As seen in SparcISelLowering.cpp code below, to perform a type conversion
    from a floating point value to a signed integer, first the
    setOperationAction should be called with Custom as the third parameter:

    setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
    1430
    1431
    1432
    1433

    In the LowerOperation method, for each Custom operation, a case statement
    should be added to indicate what function to call. In the following code, an
    FP_TO_SINT opcode will call the LowerFP_TO_SINT method:

    SDOperand SparcTargetLowering::LowerOperation(
                                 SDOperand Op, SelectionDAG &DAG) {
      switch (Op.getOpcode()) {
      case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
      ...
      }
    }
    165
    166 class IntReg<bits<5> num, string name> : TargetReg<name> {
    167 field bits<5> Num = num;
    1446
    1447
    1448
    1449

    Finally, the LowerFP_TO_SINT method is implemented, using an FP register to
    convert the floating-point value to an integer.

    static SDOperand LowerFP_TO_SINT(SDOperand Op, SelectionDAG &DAG) {
      assert(Op.getValueType() == MVT::i32);
      Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
      return DAG.getNode(ISD::BIT_CONVERT, MVT::i32, Op);
    }
    169
    def R0 : IntReg<0, "%R0">;
    1460
    1461
    1462
    Legal

    The Legal LegalizeAction enum value simply indicates that an operation is
    natively supported. Legal represents the default condition, so it is rarely
    used. In SparcISelLowering.cpp, the action for CTPOP (an operation to count
    the bits set in an integer) is natively supported only for SPARC v9. The
    following code enables the Expand conversion technique for non-v9 SPARC
    implementations.

    1472
    1473
    1474
    1475
    setOperationAction(ISD::CTPOP, MVT::i32, Expand);
    
                      
                    
    1711476 ...
    172
    173 // class RegisterClass is defined in Target.td
    174 def IReg : RegisterClass<i64, 64, [R0, ... ]>;
    175
    176
    177
    178
    179
  • TableGen instruction info description - break up instructions into classes;
    usually the manufacturer has already done this (see the instruction manual).
    Define a class for each instruction category. Define each opcode as a
    subclass of the category, with appropriate parameters such as the fixed
    binary encoding of opcodes and extended opcodes, and map the register bits
    to the bits of the instruction in which they are encoded (for the JIT). Also
    specify how the instruction should be printed so it can use the automatic
    assembly printer, e.g.:

    187
    188
    189
    
                      
                    
    190 // class Instruction is defined in Target.td
    191 class Form<bits<6> opcode, dag OL, string asmstr> : Instruction {
    192 field bits<42> Inst;
    193
    194 let Namespace = "Target";
    195 let Inst{0-6} = opcode;
    196 let OperandList = OL;
    197 let AsmString = asmstr;
    1477 if (TM.getSubtarget<SparcSubtarget>().isV9())
    1478 setOperationAction(ISD::CTPOP, MVT::i32, Legal);
    1479 case ISD::SETULT: return SPCC::ICC_CS;
    1480 case ISD::SETULE: return SPCC::ICC_LEU;
    1481 case ISD::SETUGT: return SPCC::ICC_GU;
    1482 case ISD::SETUGE: return SPCC::ICC_CC;
    1483 }
    1981484 }
    199
    200 def ADD : Form<42, (ops IReg:$rD, IReg:$rA, IReg:$rB), "add $rD, $rA, $rB">;
    201
    202
    203
    204
    205
    206
    207
    208
    1485
    1486
    2091487
    2101488
    211 Language backends
    212
    213
    214
    215
    216

    For now, just take a look at lib/Target/CBackend for an example of

    217 how the C backend is written.

    218
    Calling Conventions

    To support target-specific calling conventions, XXXCallingConv.td uses
    interfaces (such as CCIfType and CCAssignToReg) that are defined in
    lib/Target/TargetCallingConv.td. TableGen can take the target descriptor file
    XXXCallingConv.td and generate the header file XXXGenCallingConv.inc, which
    is typically included in XXXISelLowering.cpp. You can use the interfaces in
    TargetCallingConv.td to specify:

  • the order of parameter allocation
  • where parameters and return values are placed (that is, on the stack or in
    registers)
  • which registers may be used
  • whether the caller or callee unwinds the stack
    1508
    1509

    The following example demonstrates the use of the CCIfType and CCAssignToReg
    interfaces. If the CCIfType predicate is true (that is, if the current
    argument is of type f32 or f64), then the action is performed. In this case,
    the CCAssignToReg action assigns the argument value to the first available
    register: either R0 or R1.

    CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>>
    1517
    1518
    1519
    1520

    SparcCallingConv.td contains definitions for a target-specific return-value
    calling convention (RetCC_Sparc32) and a basic 32-bit C calling convention
    (CC_Sparc32). The definition of RetCC_Sparc32 (shown below) indicates which
    registers are used for specified scalar return types. A single-precision
    float is returned in register F0, and a double-precision float is returned
    in register D0. A 32-bit integer is returned in register I0 or I1.

    def RetCC_Sparc32 : CallingConv<[
      CCIfType<[i32], CCAssignToReg<[I0, I1]>>,
      CCIfType<[f32], CCAssignToReg<[F0]>>,
      CCIfType<[f64], CCAssignToReg<[D0]>>
    ]>;
    1534
    1535
    1536
    1537

    The definition of CC_Sparc32 in SparcCallingConv.td introduces
    CCAssignToStack, which assigns the value to a stack slot with the specified
    size and alignment. In the example below, the first parameter, 4, indicates
    the size of the slot, and the second parameter, also 4, indicates the stack
    alignment along 4-byte units. (Special cases: if size is zero, then the ABI
    size is used; if alignment is zero, then the ABI alignment is used.)

    def CC_Sparc32 : CallingConv<[
      // All arguments get passed in integer registers if there is space.
      CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
      CCAssignToStack<4, 4>
    ]>;
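
    To make the first-fit behaviour of a rule list like CC_Sparc32 concrete,
    here is a small toy allocator. It is a sketch under stated assumptions, not
    the code TableGen generates: ToyLoc and ToyCCState are hypothetical names,
    every argument is assumed to take one register or one 4-byte slot, and the
    zero-means-ABI-default special cases are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record describing where one argument ended up.
struct ToyLoc {
  bool InReg;            // true if the value was assigned to a register
  std::string Reg;       // register name when InReg is true
  unsigned StackOffset;  // byte offset of the slot when InReg is false
};

class ToyCCState {
  std::vector<std::string> Regs;  // registers, in allocation order
  std::size_t NextReg = 0;        // index of the first unused register
  unsigned NextStackOffset = 0;   // first free byte of the argument area

public:
  ToyCCState() : Regs{"I0", "I1", "I2", "I3", "I4", "I5"} {}

  // Mirrors the rule order: try the register list, then fall back to a
  // stack slot of size 4 with alignment 4.
  ToyLoc assignArg() {
    if (NextReg < Regs.size())
      return ToyLoc{true, Regs[NextReg++], 0};
    return assignToStack(4, 4);
  }

private:
  ToyLoc assignToStack(unsigned Size, unsigned Align) {
    assert(Align != 0 && (Align & (Align - 1)) == 0 &&
           "alignment must be a power of two");
    // Round the next free offset up to the requested alignment.
    unsigned Offset = (NextStackOffset + Align - 1) & ~(Align - 1);
    NextStackOffset = Offset + Size;
    return ToyLoc{false, "", Offset};
  }
};
```

    Calling assignArg eight times on a fresh ToyCCState places the first six
    arguments in I0 through I5 and the next two at stack offsets 0 and 4.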
    1551
    1552
    1553
    1554

    CCDelegateTo is another commonly used interface, which tries to find a
    specified sub-calling convention and, if a match is found, invokes it. In
    the following example (in X86CallingConv.td), the definition of
    RetCC_X86_32_C ends with CCDelegateTo. After the current value is assigned to
    the register ST0 or ST1, the RetCC_X86Common is invoked.

    def RetCC_X86_32_C : CallingConv<[
      CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>,
      CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>,
      CCDelegateTo<RetCC_X86Common>
    ]>;
    1567
    1568
    1569
    1570

    CCIfCC is an interface that attempts to match the given name to the current
    calling convention. If the name identifies the current calling convention,
    then a specified action is invoked. In the following example (in
    X86CallingConv.td), if the Fast calling convention is in use, then
    RetCC_X86_32_Fast is invoked. If the SSECall calling convention is in use,
    then RetCC_X86_32_SSE is invoked.

    def RetCC_X86_32 : CallingConv<[
      CCIfCC<"CallingConv::Fast", CCDelegateTo<RetCC_X86_32_Fast>>,
      CCIfCC<"CallingConv::X86_SSECall", CCDelegateTo<RetCC_X86_32_SSE>>,
      CCDelegateTo<RetCC_X86_32_C>
    ]>;
    1584
    1585
    1586
    1587

    Other calling convention interfaces include:

  • CCIf <predicate, action> - if the predicate matches, apply the action
  • CCIfInReg <action> - if the argument is marked with the ‘inreg’ attribute,
    then apply the action
  • CCIfNest <action> - if the argument is marked with the ‘nest’ attribute,
    then apply the action
  • CCIfNotVarArg <action> - if the current function does not take a variable
    number of arguments, apply the action
  • CCAssignToRegWithShadow <registerList, shadowList> - similar to
    CCAssignToReg, but with a shadow list of registers
  • CCPassByVal <size, align> - assign value to a stack slot with the minimum
    specified size and alignment
  • CCPromoteToType <type> - promote the current value to the specified type
  • CallingConv <[actions]> - define each calling convention that is supported
    1612
    1613
    1614
    1615
    1616
    Assembly Printer

    During the code emission stage, the code generator may utilize an LLVM pass
    to produce assembly output. To do this, you want to implement the code for a
    printer that converts LLVM IR to a GAS-format assembly language for your
    target machine, using the following steps:

  • Define all the assembly strings for your target, adding them to the
    instructions defined in the XXXInstrInfo.td file. (See Instruction Set.)
    TableGen will produce an output file (XXXGenAsmWriter.inc) with an
    implementation of the printInstruction method for the XXXAsmPrinter class.
  • Write XXXTargetAsmInfo.h, which contains the bare-bones declaration of the
    XXXTargetAsmInfo class (a subclass of TargetAsmInfo).
  • Write XXXTargetAsmInfo.cpp, which contains target-specific values for
    TargetAsmInfo properties and sometimes new implementations for methods.
  • Write XXXAsmPrinter.cpp, which implements the AsmPrinter class that performs
    the LLVM-to-assembly conversion.
    1642
    1643
    1644

    The code in XXXTargetAsmInfo.h is usually a trivial declaration of the
    XXXTargetAsmInfo class for use in XXXTargetAsmInfo.cpp. Similarly,
    XXXTargetAsmInfo.cpp usually has a few declarations of XXXTargetAsmInfo
    replacement values that override the default values in TargetAsmInfo.cpp.
    For example, in SparcTargetAsmInfo.cpp:

    SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &TM) {
      Data16bitsDirective = "\t.half\t";
      Data32bitsDirective = "\t.word\t";
      Data64bitsDirective = 0;  // .xword is only supported by V9.
      ZeroDirective = "\t.skip\t";
      CommentString = "!";
      ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
    }
    1660
    1661
    1662
    1663

    The X86 assembly printer implementation (X86TargetAsmInfo) is an example
    where the target-specific TargetAsmInfo class uses overridden methods:
    ExpandInlineAsm and PreferredEHDataFormat.

    A target-specific implementation of AsmPrinter is written in
    XXXAsmPrinter.cpp, which implements the AsmPrinter class that converts LLVM
    to printable assembly. The implementation must include the following headers
    that have declarations for the AsmPrinter and MachineFunctionPass classes.
    The MachineFunctionPass is a subclass of FunctionPass.

    #include "llvm/CodeGen/AsmPrinter.h"
    #include "llvm/CodeGen/MachineFunctionPass.h"
    1677
    1678
    1679
    1680
    1681

    As a FunctionPass, AsmPrinter first calls doInitialization to set up the
    AsmPrinter. In SparcAsmPrinter, a Mangler object is instantiated to process
    variable names.

    In XXXAsmPrinter.cpp, the runOnMachineFunction method (declared in
    MachineFunctionPass) must be implemented for XXXAsmPrinter. In
    MachineFunctionPass, the runOnFunction method invokes runOnMachineFunction.
    Target-specific implementations of runOnMachineFunction differ, but generally
    do the following to process each machine function:

  • call SetupMachineFunction to perform initialization
  • call EmitConstantPool to print out (to the output stream) constants which
    have been spilled to memory
  • call EmitJumpTableInfo to print out jump tables used by the current function
  • print out the label for the current function
  • print out the code for the function, including basic block labels and the
    assembly for the instruction (using printInstruction)
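
    The order of these steps can be sketched with a toy printer that writes to a
    string. Everything here is hypothetical illustration: ToyBlock, ToyFunction,
    and printToyFunction are not LLVM classes, the .LCPI label prefix and the
    .word directive are assumptions of this sketch, and jump tables are left out
    for brevity.

```cpp
#include <cassert>   // used only by the checks in the usage note
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, already-selected machine code: instructions are plain strings.
struct ToyBlock {
  std::string Label;
  std::vector<std::string> Insts;
};

struct ToyFunction {
  std::string Name;
  std::vector<std::string> ConstantPool;  // constants spilled to memory
  std::vector<ToyBlock> Blocks;
};

// Emit the pieces in the order listed above: constant pool first, then the
// function label, then each basic block label followed by its instructions.
std::string printToyFunction(const ToyFunction &F) {
  std::ostringstream O;
  for (std::size_t i = 0; i < F.ConstantPool.size(); ++i)
    O << ".LCPI" << i << ":\t.word\t" << F.ConstantPool[i] << "\n";
  O << F.Name << ":\n";
  for (const ToyBlock &B : F.Blocks) {
    O << B.Label << ":\n";
    for (const std::string &I : B.Insts)
      O << "\t" << I << "\n";  // stands in for printInstruction
  }
  return O.str();
}
```

    A real runOnMachineFunction would walk MachineBasicBlock and MachineInstr
    objects and call the TableGen-generated printInstruction; the toy only shows
    the sequencing.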
    1703
    1704

    The XXXAsmPrinter implementation must also include the code generated by
    TableGen that is output in the XXXGenAsmWriter.inc file. The code in
    XXXGenAsmWriter.inc contains an implementation of the printInstruction
    method that may call these methods:

  • printOperand
  • printMemOperand
  • printCCOperand (for conditional statements)
  • printDataDirective
  • printDeclare
  • printImplicitDef
  • printInlineAsm
  • printLabel
  • printPICJumpTableEntry
  • printPICJumpTableSetLabel
    1729
    1730

    The implementations of printDeclare, printImplicitDef, printInlineAsm, and
    printLabel in AsmPrinter.cpp are generally adequate for printing assembly
    and do not need to be overridden. (printBasicBlockLabel is another method
    that is implemented in AsmPrinter.cpp that may be directly used in an
    implementation of XXXAsmPrinter.)

    The printOperand method is implemented with a long switch/case statement for
    the type of operand: register, immediate, basic block, external symbol,
    global address, constant pool index, or jump table index. For an instruction
    with a memory address operand, the printMemOperand method should be
    implemented to generate the proper output. Similarly, printCCOperand should
    be used to print a conditional operand.
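
    The shape of such a switch can be sketched as follows. This is a reduced toy
    with only three operand kinds; ToyOperandKind, ToyOperand, and
    printToyOperand are hypothetical names, and the "%" register prefix is an
    assumption borrowed from SPARC-style syntax rather than anything the real
    printOperand is required to do.

```cpp
#include <cassert>   // used only by the checks in the usage note
#include <string>

// Hypothetical subset of the operand kinds listed above.
enum class ToyOperandKind { Register, Immediate, GlobalAddress };

struct ToyOperand {
  ToyOperandKind Kind;
  std::string Name;  // register or symbol name, unused for immediates
  long Imm;          // immediate value, unused for the other kinds
};

// Return the assembly text for one operand, dispatching on its kind.
std::string printToyOperand(const ToyOperand &MO) {
  switch (MO.Kind) {
  case ToyOperandKind::Register:
    return "%" + MO.Name;
  case ToyOperandKind::Immediate:
    return std::to_string(MO.Imm);
  case ToyOperandKind::GlobalAddress:
    return MO.Name;
  }
  return "<unknown operand>";
}
```

    A real target extends the switch with the remaining kinds (basic block,
    external symbol, constant pool index, jump table index) and writes to the
    printer's output stream instead of returning a string.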

    1742
    1743

    doFinalization should be overridden in XXXAsmPrinter, and it should be
    called to shut down the assembly printer. During doFinalization, global
    variables and constants are printed to output.

    1746
    1747
    1748
    Subtarget Support

    Subtarget support is used to inform the code generation process of
    instruction set variations for a given chip set. For example, the LLVM SPARC
    implementation provided covers three major versions of the SPARC
    microprocessor architecture: Version 8 (V8, which is a 32-bit architecture),
    Version 9 (V9, a 64-bit architecture), and the UltraSPARC architecture. V8
    has 16 double-precision floating-point registers that are also usable as
    either 32 single-precision or 8 quad-precision registers. V8 is also purely
    big-endian. V9 has 32 double-precision floating-point registers that are
    also usable as 16 quad-precision registers, but cannot be used as
    single-precision registers. The UltraSPARC architecture combines V9 with
    UltraSPARC Visual Instruction Set extensions.

    If subtarget support is needed, you should implement a target-specific
    XXXSubtarget class for your architecture. This class should process the
    command-line options -mcpu= and -mattr=.

    1769
    1770

    TableGen uses definitions in the Target.td and Sparc.td files to generate
    code in SparcGenSubtarget.inc. In Target.td, shown below, the
    SubtargetFeature interface is defined. The first 4 string parameters of the
    SubtargetFeature interface are a feature name, an attribute set by the
    feature, the value of the attribute, and a description of the feature. (The
    fifth parameter is a list of features whose presence is implied, and its
    default value is an empty array.)

    class SubtargetFeature<string n, string a, string v, string d,
                           list<SubtargetFeature> i = []> {
      string Name = n;
      string Attribute = a;
      string Value = v;
      string Desc = d;
      list<SubtargetFeature> Implies = i;
    }
    1788
    1789
    1790
    1791

    In the Sparc.td file, the SubtargetFeature is used to define the following
    features.

    def FeatureV9 : SubtargetFeature<"v9", "IsV9", "true",
                         "Enable SPARC-V9 instructions">;
    def FeatureV8Deprecated : SubtargetFeature<"deprecated-v8",
                         "V8DeprecatedInsts", "true",
                         "Enable deprecated V8 instructions in V9 mode">;
    def FeatureVIS : SubtargetFeature<"vis", "IsVIS", "true",
                         "Enable UltraSPARC Visual Instruction Set extensions">;
    1803
    1804
    1805
    1806
    1807

    Elsewhere in Sparc.td, the Proc class is defined and then is used to define
    particular SPARC processor subtypes that may have the previously described
    features.

    class Proc<string Name, list<SubtargetFeature> Features>
     : Processor<Name, NoItineraries, Features>;

    def : Proc<"generic",         []>;
    def : Proc<"v8",              []>;
    def : Proc<"supersparc",      []>;
    def : Proc<"sparclite",       []>;
    def : Proc<"f934",            []>;
    def : Proc<"hypersparc",      []>;
    def : Proc<"sparclite86x",    []>;
    def : Proc<"sparclet",        []>;
    def : Proc<"tsc701",          []>;
    def : Proc<"v9",              [FeatureV9]>;
    def : Proc<"ultrasparc",      [FeatureV9, FeatureV8Deprecated]>;
    def : Proc<"ultrasparc3",     [FeatureV9, FeatureV8Deprecated]>;
    def : Proc<"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;
    1829
    1830
    1831
    1832
    1833

    From Target.td and Sparc.td files, the resulting SparcGenSubtarget.inc
    specifies enum values to identify the features, arrays of constants to
    represent the CPU features and CPU subtypes, and the ParseSubtargetFeatures
    method that parses the features string that sets specified subtarget
    options. The generated SparcGenSubtarget.inc file should be included in
    SparcSubtarget.cpp. The target-specific implementation of the XXXSubtarget
    constructor should follow this pseudocode:

    XXXSubtarget::XXXSubtarget(const Module &M, const std::string &FS) {
      // Set the default features
      // Determine default and user specified characteristics of the CPU
      // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
      // Perform any additional operations
    }
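
    To illustrate what parsing a features string amounts to, here is a
    hand-written toy. It is not the TableGen-generated ParseSubtargetFeatures:
    ToySubtarget and parseFeatures are hypothetical names, and the
    comma-separated "+name" / "-name" syntax is an assumption of this sketch.
    The feature names and attribute fields are taken from the Sparc.td
    definitions shown earlier.

```cpp
#include <cassert>   // used only by the checks in the usage note
#include <sstream>
#include <string>

// Hypothetical subtarget holding the three SPARC attributes defined above.
struct ToySubtarget {
  bool IsV9 = false;
  bool V8DeprecatedInsts = false;
  bool IsVIS = false;

  // Apply a comma-separated list such as "+v9,-vis" to the attribute fields.
  void parseFeatures(const std::string &FS) {
    std::istringstream In(FS);
    std::string Tok;
    while (std::getline(In, Tok, ',')) {
      // Skip empty or malformed entries rather than guessing their meaning.
      if (Tok.size() < 2 || (Tok[0] != '+' && Tok[0] != '-'))
        continue;
      const bool Enable = Tok[0] == '+';
      const std::string Name = Tok.substr(1);
      if (Name == "v9")
        IsV9 = Enable;
      else if (Name == "deprecated-v8")
        V8DeprecatedInsts = Enable;
      else if (Name == "vis")
        IsVIS = Enable;
      // Unknown feature names are ignored in this sketch.
    }
  }
};
```

    Parsing "+v9,+vis" on a default-constructed ToySubtarget sets IsV9 and IsVIS
    and leaves V8DeprecatedInsts false; a later "-vis" clears IsVIS again.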
    1849
    1850
    1851
    1852
    1853
    JIT Support

    The implementation of a target machine optionally includes a Just-In-Time
    (JIT) code generator that emits machine code and auxiliary structures as
    binary output that can be written directly to memory. To do this, implement
    JIT code generation by performing the following steps:

  • Write an XXXCodeEmitter.cpp file that contains a machine function pass that
    transforms target-machine instructions into relocatable machine code.
  • Write an XXXJITInfo.cpp file that implements the JIT interfaces for
    target-specific code-generation activities, such as emitting machine code
    and stubs.
  • Modify XXXTargetMachine so that it provides a TargetJITInfo object through
    its getJITInfo method.
    1874
    1875
    1876

    There are several different approaches to writing the JIT support code. For
    instance, TableGen and target descriptor files may be used for creating a
    JIT code generator, but are not mandatory. For the Alpha and PowerPC target
    machines, TableGen is used to generate XXXGenCodeEmitter.inc, which contains
    the binary coding of machine instructions and the getBinaryCodeForInstr
    method to access those codes. Other JIT implementations do not.

    Both XXXJITInfo.cpp and XXXCodeEmitter.cpp must include the
    llvm/CodeGen/MachineCodeEmitter.h header file that defines the
    MachineCodeEmitter class containing code for several callback functions that
    write data (in bytes, words, strings, etc.) to the output stream.

    2191888
    2201889
    2211890
    222 Files to create/modify
    223
    224
    225
    226
    227

    To actually create your backend, you need to create and modify a few files.

    228 Here, the absolute minimum will be discussed. To actually use LLVM's target
    229 independent codegenerator, you must implement extra
    230 things.

    231
    232

    First of all, you should create a subdirectory under lib/Target,

    233 which will hold all the files related to your target. Let's assume that our
    234 target is called, "Dummy", we would create the directory
    235 lib/Target/Dummy.

    236
    237

    In this new directory, you should put a Makefile. You can probably

    238 copy one from another target and modify it. It should at least contain the
    239 LEVEL, LIBRARYNAME and TARGET variables, and then
    240 include $(LEVEL)/Makefile.common. Be careful to give the library the
    241 correct name, it must be named LLVMDummy (see the MIPS target, for
    242 example). Alternatively, you can split the library into
    243 LLVMDummyCodeGen and LLVMDummyAsmPrinter, the latter of which
    244 should be implemented in a subdirectory below lib/Target/Dummy (see the
    245 PowerPC target, for example).

    246
    247

    Note that these two naming schemes are hardcoded into llvm-config. Using any

    248 other naming scheme will confuse llvm-config and produce lots of (seemingly
    249 unrelated) linker errors when linking llc.

    250
    251

    To make your target actually do something, you need to implement a subclass
    of TargetMachine. This implementation should typically be in the file
    lib/Target/DummyTargetMachine.cpp, but any file in the lib/Target directory
    will be built and should work. To use LLVM's target-independent code
    generator, you should create a subclass of LLVMTargetMachine. This is what
    all current machine backends do. To create a target from scratch, create a
    subclass of TargetMachine. This is what the current language backends do.

    259
    260

    To get LLVM to actually build and link your target, you also need to add it
    to the TARGETS_TO_BUILD variable. To do this, you need to modify the
    configure script to know about your target when parsing the
    --enable-targets option. Search the configure script for TARGETS_TO_BUILD,
    add your target to the lists there (some creativity required) and then
    reconfigure. Alternatively, you can change autoconf/configure.ac and
    regenerate configure by running ./autoconf/AutoRegen.sh.
    268
    269
    270
    271
    272
    273 Related reading material
    274
    275
    276
    277
    278
    Machine Code Emitter

    In XXXCodeEmitter.cpp, a target-specific version of the Emitter class is
    implemented as a function pass (subclass of MachineFunctionPass). The
    target-specific implementation of runOnMachineFunction (invoked by
    runOnFunction in MachineFunctionPass) iterates through each
    MachineBasicBlock and calls emitInstruction to process each instruction and
    emit binary code. emitInstruction is largely implemented with case
    statements on the instruction types defined in XXXInstrInfo.h. For example,
    in X86CodeEmitter.cpp, the emitInstruction method is built around the
    following switch/case statements:

    1903
    1904
    1905
    1906
    switch (Desc->TSFlags & X86::FormMask) {
    case X86II::Pseudo:  // for not yet implemented instructions
       ...               // or pseudo-instructions
       break;
    case X86II::RawFrm:  // for instructions with a fixed opcode value
       ...
       break;
    case X86II::AddRegFrm: // for instructions that have one register operand
       ...                 // added to their opcode
       break;
    case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
       ...                 // to specify a destination (register)
       break;
    case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
       ...                 // to specify a destination (memory)
       break;
    case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
       ...                 // to specify a source (register)
       break;
    case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
       ...                 // to specify a source (memory)
       break;
    case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on
    case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
    case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
    case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
       ...
       break;
    case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
    case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
    case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
    case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
       ...
       break;
    case X86II::MRMInitReg: // for instructions whose source and
       ...                  // destination are the same register
       break;
    }
    1944
    1945
    1946
    1947

    The implementations of these case statements often first emit the opcode and
    then get the operand(s). Then depending upon the operand, helper methods may
    be called to process the operand(s). For example, in X86CodeEmitter.cpp, for
    the X86II::AddRegFrm case, the first data emitted (by emitByte) is the
    opcode added to the register operand. Then an object representing the
    machine operand, MO1, is extracted. The helper methods such as isImmediate,
    isGlobalAddress, isExternalSymbol, isConstantPoolIndex, and isJumpTableIndex
    determine the operand type. (X86CodeEmitter.cpp also has private methods
    such as emitConstant, emitGlobalAddress, emitExternalSymbolAddress,
    emitConstPoolAddress, and emitJumpTableAddress that emit the data into the
    output stream.)

    1959
    1960
    1961
    1962
    case X86II::AddRegFrm:
      MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));

      if (CurOp != NumOps) {
        const MachineOperand &MO1 = MI.getOperand(CurOp++);
        unsigned Size = X86InstrInfo::sizeOfImm(Desc);
        if (MO1.isImmediate())
          emitConstant(MO1.getImm(), Size);
        else {
          unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
            : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
          if (Opcode == X86::MOV64ri)
            rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
          if (MO1.isGlobalAddress()) {
            bool NeedStub = isa<Function>(MO1.getGlobal());
            bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
            emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
                              NeedStub, isLazy);
          } else if (MO1.isExternalSymbol())
            emitExternalSymbolAddress(MO1.getSymbolName(), rt);
          else if (MO1.isConstantPoolIndex())
            emitConstPoolAddress(MO1.getIndex(), rt);
          else if (MO1.isJumpTableIndex())
            emitJumpTableAddress(MO1.getIndex(), rt);
        }
      }
      break;
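
    The immediate-operand path of the AddRegFrm case can be reproduced in
    isolation with a toy emitter that appends to a byte vector. ToyEmitter and
    emitAddRegFrm are hypothetical names; the sketch assumes a 4-byte immediate
    written in little-endian order and ignores relocations entirely. The
    opcode family used in the usage note, 0xB8 plus a register number, is the
    x86 encoding of MOV r32, imm32.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical emitter that collects bytes instead of writing to JIT memory.
class ToyEmitter {
  std::vector<std::uint8_t> Buffer;

public:
  void emitByte(std::uint8_t B) { Buffer.push_back(B); }

  // Write the low Size bytes of Val, least significant byte first.
  void emitConstant(std::uint32_t Val, unsigned Size) {
    for (unsigned i = 0; i != Size; ++i) {
      emitByte(static_cast<std::uint8_t>(Val & 0xFF));
      Val >>= 8;
    }
  }

  // AddRegFrm: the register number is added to the base opcode byte, then
  // the immediate operand follows.
  void emitAddRegFrm(std::uint8_t BaseOpcode, unsigned RegNum,
                     std::uint32_t Imm) {
    assert(RegNum < 8 && "register number must fit in the low three bits");
    emitByte(static_cast<std::uint8_t>(BaseOpcode + RegNum));
    emitConstant(Imm, 4);
  }

  const std::vector<std::uint8_t> &bytes() const { return Buffer; }
};
```

    With base opcode 0xB8, register number 1 (ECX), and immediate 0x12345678,
    the emitter produces the five bytes B9 78 56 34 12.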
    1989
    1990
    1991
    1992

    In the previous example, XXXCodeEmitter.cpp uses the variable rt,

    1993 which is a RelocationType enum that may be used to relocate addresses (for
    1994 example, a global address with a PIC base offset). The RelocationType enum for
    1995 that target is defined in the short target-specific XXXRelocations.h file. The
    1996 RelocationType is used by the relocate method defined in XXXJITInfo.cpp to
    1997 rewrite addresses for referenced global symbols.

    For example, X86Relocations.h specifies the following relocation
    types for the X86 addresses. In all four cases, the relocated value is added to
    the value already in memory. For reloc_pcrel_word and reloc_picrel_word,
    there is an additional initial adjustment.

    enum RelocationType {
      reloc_pcrel_word = 0,     // add reloc value after adjusting for the PC loc
      reloc_picrel_word = 1,    // add reloc value after adjusting for the PIC base
      reloc_absolute_word = 2,  // absolute relocation; no additional adjustment
      reloc_absolute_dword = 3  // absolute relocation; no additional adjustment
    };
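    The adjustments named in the enum comments can be illustrated with a small, simplified sketch. It is not the code from X86JITInfo.cpp: addresses are modeled as plain integers, and the names applyRelocation, addInPlace, BufAddr, Offset, Target, and PICBase are hypothetical. It also assumes the PC used for reloc_pcrel_word is the address just past the 4-byte field, and it ignores any extra constant a real relocation record might carry.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-in copy of the enum shown above (hypothetical, for illustration only).
enum RelocationType {
  reloc_pcrel_word = 0,
  reloc_picrel_word = 1,
  reloc_absolute_word = 2,
  reloc_absolute_dword = 3
};

// Add Delta to the value already stored at Pos (memcpy avoids alignment issues).
template <typename T>
static void addInPlace(uint8_t *Pos, T Delta) {
  T Value;
  std::memcpy(&Value, Pos, sizeof(T));
  Value += Delta;
  std::memcpy(Pos, &Value, sizeof(T));
}

// Buf holds emitted code; BufAddr is the (toy) address of Buf[0];
// Offset is where the relocated field lives inside Buf.
void applyRelocation(uint8_t *Buf, uint64_t BufAddr, uint64_t Offset,
                     RelocationType Type, uint64_t Target, uint64_t PICBase) {
  assert(Buf != nullptr);
  uint8_t *Pos = Buf + Offset;
  switch (Type) {
  case reloc_pcrel_word:
    // Adjust for the PC: assumed to be the address after the 4-byte field.
    addInPlace<uint32_t>(
        Pos, static_cast<uint32_t>(Target - (BufAddr + Offset + 4)));
    break;
  case reloc_picrel_word:
    // Adjust for the PIC base.
    addInPlace<uint32_t>(Pos, static_cast<uint32_t>(Target - PICBase));
    break;
  case reloc_absolute_word:
    // No adjustment; 32-bit field.
    addInPlace<uint32_t>(Pos, static_cast<uint32_t>(Target));
    break;
  case reloc_absolute_dword:
    // No adjustment; 64-bit field.
    addInPlace<uint64_t>(Pos, Target);
    break;
  }
}
```

    In every case the computed value is added to whatever the code emitter already wrote at that position, which is why an offset emitted earlier survives relocation.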

    Target JIT Info


    XXXJITInfo.cpp implements the JIT interfaces for target-specific code-generation
    activities, such as emitting machine code and stubs. At minimum,
    a target-specific version of XXXJITInfo implements the following:

  • Code generator - describes some of the classes in code generation at a
    high level, but it is not (yet) complete
  • TableGen fundamentals - describes how to use TableGen to describe your
    target information succinctly
  • Debugging code generation with bugpoint - shows bugpoint usage scenarios
    to simplify backend development
  • getLazyResolverFunction – initializes the JIT, gives the
    target a function that is used for compilation
  • emitFunctionStub – returns a native function with a
    specified address for a callback function
  • relocate – changes the addresses of referenced globals,
    based on relocation types
  • callback functions that are wrappers to a function stub that is
    used when the real target is not initially known
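    How these pieces cooperate can be modeled with a toy sketch. It does not use the real TargetJITInfo interface: ToyJITInfo, setCompiler, emitStub, and CompileCount are hypothetical names, and std::function objects stand in for native code and stubs. The point is only the control flow: a stub is handed out before the real function exists, and the first call through it triggers compilation.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Stand-ins for "native code" and for the JIT's compile-on-demand entry point.
using CompiledFn = std::function<int(int)>;
using JITCompilerFn = std::function<CompiledFn(const std::string &)>;

class ToyJITInfo {
  JITCompilerFn Compiler;                     // set once by the JIT
  std::map<std::string, CompiledFn> Resolved; // functions compiled so far

public:
  int CompileCount = 0; // how many times the compiler was invoked

  // Plays the role of getLazyResolverFunction: remember the compile function.
  void setCompiler(JITCompilerFn F) { Compiler = std::move(F); }

  // Plays the role of emitFunctionStub: return something callable now,
  // even though the real target is not known yet.
  CompiledFn emitStub(const std::string &Name) {
    assert(Compiler && "setCompiler must be called first");
    return [this, Name](int Arg) {
      // Plays the role of the callback: compile on first use, then forward.
      auto It = Resolved.find(Name);
      if (It == Resolved.end()) {
        ++CompileCount;
        It = Resolved.emplace(Name, Compiler(Name)).first;
      }
      return It->second(Arg);
    };
  }
};
```

    A real target patches the stub or the call site so later calls skip the callback entirely; this sketch approximates that with a lookup table.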

    getLazyResolverFunction is generally trivial to implement. It
    stores the incoming parameter as the global JITCompilerFunction and returns the
    callback function that will be used as a function wrapper. For the Alpha target
    (in AlphaJITInfo.cpp), the getLazyResolverFunction implementation is simply:

    TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
                                                JITCompilerFn F)
    {
      JITCompilerFunction = F;
      return AlphaCompilationCallback;
    }

    For the X86 target, the getLazyResolverFunction implementation is
    a little more complicated, because it returns a different callback function
    for processors with SSE instructions and XMM registers.
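    A minimal sketch of that selection follows. It is not the code from X86JITInfo.cpp: the typedefs are assumed shapes, the two callbacks are empty placeholders, and the HasSSE parameter is a hypothetical simplification (a real implementation would detect the processor feature itself).

```cpp
#include <cassert>

// Assumed function-pointer shapes, for illustration only.
typedef void *(*JITCompilerFn)(void *);
typedef void (*LazyResolverFn)();

// Global slot that the callbacks would read when they run.
static JITCompilerFn JITCompilerFunction = nullptr;

// Placeholder callbacks; real ones save and restore registers in assembler.
static void CompilationCallback() {}
static void CompilationCallbackSSE() {}

// Store the compile function, then pick the callback that matches the CPU.
LazyResolverFn getLazyResolverFunction(JITCompilerFn F, bool HasSSE) {
  assert(F != nullptr);
  JITCompilerFunction = F;
  return HasSSE ? CompilationCallbackSSE : CompilationCallback;
}
```

    The extra variant exists because a callback on an SSE-capable processor must also preserve the XMM registers that may hold incoming arguments.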

    The callback function initially saves and later restores the
    callee register values, incoming arguments, and frame and return address. The
    callback function needs low-level access to the registers or stack, so it is typically
    implemented with assembler.

    Misha Brukman
    Mason Woo and Misha Brukman
    The LLVM Compiler Infrastructure

    Last modified: $Date$