llvm.org GIT mirror llvm / 68cb319
Overhauled llvm/clang docs builds. Closes PR6613.

NOTE: 2nd part changeset for cfe trunk to follow.

*** PRE-PATCH ISSUES ADDRESSED

- clang api docs fail build from objdir
- clang/llvm api docs collide in install PREFIX/
- clang/llvm main docs collide in install
- clang/llvm main docs are full of hard-coded destination assumptions and make use of absolute root in static html files; namely, CommandGuide tools hard-code a website destination for cross references and some html cross references assume website root paths

*** IMPROVEMENTS

- bumped Doxygen from 1.4.x -> 1.6.3
- splits llvm/clang docs into 'main' and 'api' (doxygen) build trees
- provide consistent, reliable doc builds for both main+api docs
- support build vs. install vs. website intentions
- support objdir builds
- document targets with 'make help'
- correct clean and uninstall operations
- use recursive dir delete only where absolutely necessary
- added call function fn.RMRF which safeguards against botched 'rm -rf'; if any target (or any evaluated variable) attempts to remove any dirs which match a hard-coded 'safelist', a verbose error will be printed and make will error-stop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103213 91177308-0d34-0410-b5e6-96231b3b80d8

mike-m 9 years ago
257 changed file(s) with 72888 addition(s) and 72116 deletion(s).
155155 # Paths to miscellaneous programs we hope are present but might not be
156156 PERL := @PERL@
157157 BZIP2 := @BZIP2@
158 CAT := @CAT@
158159 DOT := @DOT@
159160 DOXYGEN := @DOXYGEN@
160161 GROFF := @GROFF@
166167 GAS := @GAS@
167168 POD2HTML := @POD2HTML@
168169 POD2MAN := @POD2MAN@
170 PDFROFF := @PDFROFF@
169171 RUNTEST := @RUNTEST@
170172 TCLSH := @TCLSH@
171173 ZIP := @ZIP@
10021002 dnl nothing. This just lets the build output show that we could have done
10031003 dnl something if the tool was available.
10041004 AC_PATH_PROG(BZIP2, [bzip2])
1005 AC_PATH_PROG(CAT, [cat])
10051006 AC_PATH_PROG(DOXYGEN, [doxygen])
10061007 AC_PATH_PROG(GROFF, [groff])
10071008 AC_PATH_PROG(GZIP, [gzip])
10081009 AC_PATH_PROG(POD2HTML, [pod2html])
10091010 AC_PATH_PROG(POD2MAN, [pod2man])
1011 AC_PATH_PROG(PDFROFF, [pdfroff])
10101012 AC_PATH_PROG(RUNTEST, [runtest])
10111013 DJ_AC_PATH_TCLSH
10121014 AC_PATH_PROG(ZIP, [zip])
15421544 dnl Configure the RPM spec file for LLVM
15431545 AC_CONFIG_FILES([llvm.spec])
15441546
1545 dnl Configure doxygen's configuration file
1546 AC_CONFIG_FILES([docs/doxygen.cfg])
1547
15481547 dnl Configure llvmc's Base plugin
15491548 AC_CONFIG_FILES([tools/llvmc/plugins/Base/Base.td])
15501549
735735 INSTALL_SCRIPT
736736 INSTALL_DATA
737737 BZIP2
738 CAT
738739 DOXYGEN
739740 GROFF
740741 GZIP
741742 POD2HTML
742743 POD2MAN
744 PDFROFF
743745 RUNTEST
744746 TCLSH
745747 ZIP
80158017 fi
80168018
80178019
8020 # Extract the first word of "cat", so it can be a program name with args.
8021 set dummy cat; ac_word=$2
8022 { echo "$as_me:$LINENO: checking for $ac_word" >&5
8023 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6; }
8024 if test "${ac_cv_path_CAT+set}" = set; then
8025 echo $ECHO_N "(cached) $ECHO_C" >&6
8026 else
8027 case $CAT in
8028 [\\/]* | ?:[\\/]*)
8029 ac_cv_path_CAT="$CAT" # Let the user override the test with a path.
8030 ;;
8031 *)
8032 as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
8033 for as_dir in $PATH
8034 do
8035 IFS=$as_save_IFS
8036 test -z "$as_dir" && as_dir=.
8037 for ac_exec_ext in '' $ac_executable_extensions; do
8038 if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; }; then
8039 ac_cv_path_CAT="$as_dir/$ac_word$ac_exec_ext"
8040 echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5
8041 break 2
8042 fi
8043 done
8044 done
8045 IFS=$as_save_IFS
8046
8047 ;;
8048 esac
8049 fi
8050 CAT=$ac_cv_path_CAT
8051 if test -n "$CAT"; then
8052 { echo "$as_me:$LINENO: result: $CAT" >&5
8053 echo "${ECHO_T}$CAT" >&6; }
8054 else
8055 { echo "$as_me:$LINENO: result: no" >&5
8056 echo "${ECHO_T}no" >&6; }
8057 fi
8058
8059
80188060 # Extract the first word of "doxygen", so it can be a program name with args.
80198061 set dummy doxygen; ac_word=$2
80208062 { echo "$as_me:$LINENO: checking for $ac_word" >&5
82098251 if test -n "$POD2MAN"; then
82108252 { echo "$as_me:$LINENO: result: $POD2MAN" >&5
82118253 echo "${ECHO_T}$POD2MAN" >&6; }
8254 else
8255 { echo "$as_me:$LINENO: result: no" >&5
8256 echo "${ECHO_T}no" >&6; }
8257 fi
8258
8259
8260 # Extract the first word of "pdfroff", so it can be a program name with args.
8261 set dummy pdfroff; ac_word=$2
8262 { echo "$as_me:$LINENO: checking for $ac_word" >&5
8263 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6; }
8264 if test "${ac_cv_path_PDFROFF+set}" = set; then
8265 echo $ECHO_N "(cached) $ECHO_C" >&6
8266 else
8267 case $PDFROFF in
8268 [\\/]* | ?:[\\/]*)
8269 ac_cv_path_PDFROFF="$PDFROFF" # Let the user override the test with a path.
8270 ;;
8271 *)
8272 as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
8273 for as_dir in $PATH
8274 do
8275 IFS=$as_save_IFS
8276 test -z "$as_dir" && as_dir=.
8277 for ac_exec_ext in '' $ac_executable_extensions; do
8278 if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; }; then
8279 ac_cv_path_PDFROFF="$as_dir/$ac_word$ac_exec_ext"
8280 echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5
8281 break 2
8282 fi
8283 done
8284 done
8285 IFS=$as_save_IFS
8286
8287 ;;
8288 esac
8289 fi
8290 PDFROFF=$ac_cv_path_PDFROFF
8291 if test -n "$PDFROFF"; then
8292 { echo "$as_me:$LINENO: result: $PDFROFF" >&5
8293 echo "${ECHO_T}$PDFROFF" >&6; }
82128294 else
82138295 { echo "$as_me:$LINENO: result: no" >&5
82148296 echo "${ECHO_T}no" >&6; }
1127411356 lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
1127511357 lt_status=$lt_dlunknown
1127611358 cat > conftest.$ac_ext <
11277 #line 11278 "configure"
11359 #line 11360 "configure"
1127811360 #include "confdefs.h"
1127911361
1128011362 #if HAVE_DLFCN_H
2029920381 ac_config_files="$ac_config_files llvm.spec"
2030020382
2030120383
20302 ac_config_files="$ac_config_files docs/doxygen.cfg"
20303
20304
2030520384 ac_config_files="$ac_config_files tools/llvmc/plugins/Base/Base.td"
2030620385
2030720386
2092121000 "include/llvm/System/DataTypes.h") CONFIG_HEADERS="$CONFIG_HEADERS include/llvm/System/DataTypes.h" ;;
2092221001 "Makefile.config") CONFIG_FILES="$CONFIG_FILES Makefile.config" ;;
2092321002 "llvm.spec") CONFIG_FILES="$CONFIG_FILES llvm.spec" ;;
20924 "docs/doxygen.cfg") CONFIG_FILES="$CONFIG_FILES docs/doxygen.cfg" ;;
2092521003 "tools/llvmc/plugins/Base/Base.td") CONFIG_FILES="$CONFIG_FILES tools/llvmc/plugins/Base/Base.td" ;;
2092621004 "tools/llvm-config/llvm-config.in") CONFIG_FILES="$CONFIG_FILES tools/llvm-config/llvm-config.in" ;;
2092721005 "setup") CONFIG_COMMANDS="$CONFIG_COMMANDS setup" ;;
2117521253 INSTALL_SCRIPT!$INSTALL_SCRIPT$ac_delim
2117621254 INSTALL_DATA!$INSTALL_DATA$ac_delim
2117721255 BZIP2!$BZIP2$ac_delim
21256 CAT!$CAT$ac_delim
2117821257 DOXYGEN!$DOXYGEN$ac_delim
2117921258 GROFF!$GROFF$ac_delim
2118021259 GZIP!$GZIP$ac_delim
2118121260 POD2HTML!$POD2HTML$ac_delim
2118221261 POD2MAN!$POD2MAN$ac_delim
21262 PDFROFF!$PDFROFF$ac_delim
2118321263 RUNTEST!$RUNTEST$ac_delim
2118421264 TCLSH!$TCLSH$ac_delim
2118521265 ZIP!$ZIP$ac_delim
2123221312 LTLIBOBJS!$LTLIBOBJS$ac_delim
2123321313 _ACEOF
2123421314
21235 if test `sed -n "s/.*$ac_delim\$/X/p" conf$$subs.sed | grep -c X` = 92; then
21315 if test `sed -n "s/.*$ac_delim\$/X/p" conf$$subs.sed | grep -c X` = 94; then
2123621316 break
2123721317 elif $ac_last_try; then
2123821318 { { echo "$as_me:$LINENO: error: could not make $CONFIG_STATUS" >&5
docs/AliasAnalysis.html (+0, -937)
LLVM Alias Analysis Infrastructure

  • Introduction
  • AliasAnalysis Class Overview
      • Representation of Pointers
      • The alias method
      • The getModRefInfo methods
      • Other useful AliasAnalysis methods
  • Writing a new AliasAnalysis Implementation
      • Different Pass styles
      • Required initialization calls
      • Interfaces which may be specified
      • AliasAnalysis chaining behavior
      • Updating analysis results for transformations
      • Efficiency Issues
  • Using alias analysis results
      • Using the MemoryDependenceAnalysis Pass
      • Using the AliasSetTracker class
      • Using the AliasAnalysis interface directly
  • Existing alias analysis implementations and clients
      • Available AliasAnalysis implementations
      • Alias analysis driven transformations
      • Clients for debugging and evaluation of implementations
  • Memory Dependence Analysis

    Written by Chris Lattner

    Introduction

    Alias Analysis (aka Pointer Analysis) is a class of techniques which attempt
    to determine whether or not two pointers ever can point to the same object in
    memory. There are many different algorithms for alias analysis and many
    different ways of classifying them: flow-sensitive vs flow-insensitive,
    context-sensitive vs context-insensitive, field-sensitive vs field-insensitive,
    unification-based vs subset-based, etc. Traditionally, alias analyses respond
    to a query with a Must, May, or No alias response,
    indicating that two pointers always point to the same object, might point to the
    same object, or are known to never point to the same object.

    The LLVM AliasAnalysis class
    (http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html)
    is the primary interface used by clients and implementations of alias
    analyses in the LLVM system. This class is the common interface between clients
    of alias analysis information and the implementations providing it, and is
    designed to support a wide range of implementations and clients (but currently
    all clients are assumed to be flow-insensitive). In addition to simple alias
    analysis information, this class exposes Mod/Ref information from those
    implementations which can provide it, allowing for powerful analyses and
    transformations to work well together.

    This document contains information necessary to successfully implement this
    interface, use it, and to test both sides. It also explains some of the finer
    points about what exactly results mean. If you feel that something is unclear
    or should be added, please let me
    know.


    AliasAnalysis Class Overview

    The AliasAnalysis class
    (http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html)
    defines the interface that the various alias analysis implementations
    should support. This class exports two important enums: AliasResult
    and ModRefResult which represent the result of an alias query or a
    mod/ref query, respectively.

    The AliasAnalysis interface exposes information about memory,
    represented in several different ways. In particular, memory objects are
    represented as a starting address and size, and function calls are represented
    as the actual call or invoke instruction that performs the
    call. The AliasAnalysis interface also exposes some helper methods
    which allow you to get mod/ref information for arbitrary instructions.


    Representation of Pointers

    Most importantly, the AliasAnalysis class provides several methods
    which are used to query whether or not two memory objects alias, whether
    function calls can modify or read a memory object, etc. For all of these
    queries, memory objects are represented as a pair of their starting address (a
    symbolic LLVM Value*) and a static size.

    Representing memory objects as a starting address and a size is critically
    important for correct Alias Analyses. For example, consider this (silly, but
    possible) C code:

        int i;
        char C[2];
        char A[10];
        /* ... */
        for (i = 0; i != 10; ++i) {
          C[0] = A[i];          /* One byte store */
          C[1] = A[9-i];        /* One byte store */
        }

    In this case, the basicaa pass will disambiguate the stores to
    C[0] and C[1] because they are accesses to two distinct
    locations one byte apart, and the accesses are each one byte. In this case, the
    LICM pass can use store motion to remove the stores from the loop. In
    contrast, the following code:

        int i;
        char C[2];
        char A[10];
        /* ... */
        for (i = 0; i != 10; ++i) {
          ((short*)C)[0] = A[i];  /* Two byte store! */
          C[1] = A[9-i];          /* One byte store */
        }

    In this case, the two stores to C do alias each other, because the access to
    the &C[0] element is a two byte access. If size information wasn't
    available in the query, even the first case would have to conservatively assume
    that the accesses alias.
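
    The role of the size in the two loops above can be sketched with a small
    stand-alone model. This is not LLVM code: ToyAccess and toyAccessesOverlap
    are hypothetical names, and the model assumes both accesses are known
    offsets into the same base object (here, the array C).

```cpp
#include <cstdint>

// Hypothetical toy model (not an LLVM type): an access is a byte offset into
// one known base object plus a static size in bytes.
struct ToyAccess {
  std::uint64_t Offset; // starting byte offset within the base object
  std::uint64_t Size;   // number of bytes accessed
};

// Two accesses to the same base object overlap only if their half-open byte
// ranges [Offset, Offset + Size) intersect.
bool toyAccessesOverlap(const ToyAccess &A, const ToyAccess &B) {
  return A.Offset < B.Offset + B.Size && B.Offset < A.Offset + A.Size;
}
```

    With this model, the one byte stores to C[0] and C[1] are the accesses
    {0, 1} and {1, 1}, which do not overlap, while the two byte store through
    (short*)C is {0, 2} and does overlap {1, 1}. Without the Size field the
    check could not tell the two cases apart.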


    The alias method

    The alias method is the primary interface used to determine whether or
    not two memory objects alias each other. It takes two memory objects as input
    and returns MustAlias, MayAlias, or NoAlias as appropriate.

    Must, May, and No Alias Responses

    The NoAlias response is used when the two pointers refer to distinct objects,
    regardless of whether the pointers compare equal. For example, freed pointers
    don't alias any pointers that were allocated afterwards. As a degenerate case,
    pointers returned by malloc(0) have no bytes for an object, and are considered
    NoAlias even when malloc returns the same pointer. The same rule applies to
    NULL pointers.

    The MayAlias response is used whenever the two pointers might refer to the
    same object. If the two memory objects overlap, but do not start at the same
    location, return MayAlias.

    The MustAlias response may only be returned if the two memory objects are
    guaranteed to always start at exactly the same location. A MustAlias response
    implies that the pointers compare equal.
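
    As a rough illustration of how the three responses relate, here is a
    stand-alone sketch. It is not the LLVM implementation: ToyAliasResult,
    ToyLocation and toyAlias are hypothetical names, and the model assumes each
    location is either a known (object, offset, size) triple or completely
    unknown (a negative ObjectId).

```cpp
#include <cstdint>

// Hypothetical result enum mirroring the three responses described above.
enum ToyAliasResult { ToyNoAlias, ToyMayAlias, ToyMustAlias };

// Hypothetical location: which object, where it starts, how many bytes.
// A negative ObjectId means the analysis does not know the object.
struct ToyLocation {
  int ObjectId;
  std::uint64_t Offset;
  std::uint64_t Size;
};

ToyAliasResult toyAlias(const ToyLocation &A, const ToyLocation &B) {
  // Nothing can be proven about an unknown object: answer conservatively.
  if (A.ObjectId < 0 || B.ObjectId < 0)
    return ToyMayAlias;
  // Distinct objects never alias.
  if (A.ObjectId != B.ObjectId)
    return ToyNoAlias;
  // Zero-sized locations have no bytes (the malloc(0) degenerate case).
  if (A.Size == 0 || B.Size == 0)
    return ToyNoAlias;
  // Same object and same starting location.
  if (A.Offset == B.Offset)
    return ToyMustAlias;
  // Overlapping, but not starting at the same location, is only MayAlias.
  bool Overlap =
      A.Offset < B.Offset + B.Size && B.Offset < A.Offset + A.Size;
  return Overlap ? ToyMayAlias : ToyNoAlias;
}
```

    Note that MustAlias is reserved for the case where both locations start at
    exactly the same place; partial overlap falls back to MayAlias, as the text
    above requires.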


    The getModRefInfo methods

    The getModRefInfo methods return information about whether the
    execution of an instruction can read or modify a memory location. Mod/Ref
    information is always conservative: if an instruction might read or write
    a location, ModRef is returned.

    The AliasAnalysis class also provides a getModRefInfo
    method for testing dependencies between function calls. This method takes two
    call sites (CS1 & CS2), returns NoModRef if the two calls refer to disjoint
    memory locations, Ref if CS1 reads memory written by CS2, Mod if CS1 writes to
    memory read or written by CS2, or ModRef if CS1 might read or write memory
    accessed by CS2. Note that this relation is not commutative.
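
    One way to picture the conservative nature of Mod/Ref results is as a pair
    of "may read" and "may write" bits. The sketch below is an assumption made
    for illustration, not the LLVM definition: the type ToyModRefResult, its
    bit encoding, and the helper names are all hypothetical.

```cpp
// Hypothetical encoding: bit 0 = may read (Ref), bit 1 = may write (Mod).
enum ToyModRefResult : unsigned {
  ToyNoModRef = 0,
  ToyRef = 1,
  ToyMod = 2,
  ToyModRef = 3
};

// Combining what is known about two possible behaviors must keep every
// effect either of them might have, so the bits are OR-ed together.
ToyModRefResult toyMerge(ToyModRefResult A, ToyModRefResult B) {
  return static_cast<ToyModRefResult>(A | B);
}

bool toyMayRead(ToyModRefResult R) { return (R & ToyRef) != 0; }
bool toyMayWrite(ToyModRefResult R) { return (R & ToyMod) != 0; }
```

    Under this encoding, an instruction that might read on one path and might
    write on another merges to ToyModRef, which matches the rule that the
    answer can only become more conservative as uncertainty grows.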


    Other useful AliasAnalysis methods

    Several other tidbits of information are often collected by various alias
    analysis implementations and can be put to good use by various clients.

    The pointsToConstantMemory method

    The pointsToConstantMemory method returns true if and only if the
    analysis can prove that the pointer only points to unchanging memory locations
    (functions, constant global variables, and the null pointer). This information
    can be used to refine mod/ref information: it is impossible for an unchanging
    memory location to be modified.
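
    The refinement mentioned above can be sketched as follows. This is a toy
    model under stated assumptions, not LLVM code: ToyModRefResult uses a
    hypothetical two-bit encoding (read bit, write bit), and
    toyRefineForConstantMemory is a hypothetical helper.

```cpp
// Hypothetical encoding: bit 0 = may read (Ref), bit 1 = may write (Mod).
enum ToyModRefResult : unsigned {
  ToyNoModRef = 0,
  ToyRef = 1,
  ToyMod = 2,
  ToyModRef = 3
};

// If the location is known to be unchanging memory, nothing can modify it,
// so the "may write" bit can be dropped from any conservative answer.
ToyModRefResult toyRefineForConstantMemory(ToyModRefResult R,
                                           bool PointsToConstantMemory) {
  if (!PointsToConstantMemory)
    return R;
  return static_cast<ToyModRefResult>(R & ToyRef);
}
```

    For example, a conservative ToyModRef answer for a call becomes ToyRef once
    the queried pointer is proven to point to constant memory.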


    The doesNotAccessMemory and onlyReadsMemory methods

    These methods are used to provide very simple mod/ref information for
    function calls. The doesNotAccessMemory method returns true for a
    function if the analysis can prove that the function never reads or writes to
    memory, or if the function only reads from constant memory. Functions with this
    property are side-effect free and only depend on their input arguments, allowing
    them to be eliminated if they form common subexpressions or be hoisted out of
    loops. Many common functions behave this way (e.g., sin and
    cos) but many others do not (e.g., acos, which modifies the
    errno variable).

    The onlyReadsMemory method returns true for a function if analysis
    can prove that (at most) the function only reads from non-volatile memory.
    Functions with this property are side-effect free, only depending on their input
    arguments and the state of memory when they are called. This property allows
    calls to these functions to be eliminated and moved around, as long as there is
    no store instruction that changes the contents of memory. Note that all
    functions that satisfy the doesNotAccessMemory method also satisfy
    onlyReadsMemory.
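
    The relationship between the two properties can be sketched with an ordered
    enum. Everything here is hypothetical and only illustrates the implication
    stated above: ToyFnBehavior, its three values, and toyCanReuseEarlierCall
    are not part of the AliasAnalysis interface.

```cpp
// Hypothetical ordering, from most restrictive to least restrictive.
enum ToyFnBehavior {
  ToyDoesNotAccessMemory = 0,
  ToyOnlyReadsMemory = 1,
  ToyUnknownBehavior = 2
};

bool toyDoesNotAccessMemory(ToyFnBehavior B) {
  return B == ToyDoesNotAccessMemory;
}

// Anything that never accesses memory also "only reads" memory.
bool toyOnlyReadsMemory(ToyFnBehavior B) { return B <= ToyOnlyReadsMemory; }

// A second identical call can reuse the first call's result if the function
// depends only on its arguments, or if it only reads memory and no store
// changed memory between the two calls.
bool toyCanReuseEarlierCall(ToyFnBehavior B, bool StoreInBetween) {
  if (toyDoesNotAccessMemory(B))
    return true;
  return toyOnlyReadsMemory(B) && !StoreInBetween;
}
```

    A client would still have to establish StoreInBetween itself, for instance
    from mod/ref queries; the sketch only shows how the two properties gate the
    decision.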


    Writing a new AliasAnalysis Implementation

    Writing a new alias analysis implementation for LLVM is quite
    straightforward. There are already several implementations that you can use
    for examples, and the following information should help fill in any details.
    For examples, take a look at the various alias analysis
    implementations included with LLVM.

    Different Pass styles

    The first step is to determine what type of LLVM pass (see
    WritingAnLLVMPass.html) you need to use for your Alias
    Analysis. As is the case with most other analyses and transformations, the
    answer should be fairly obvious from what type of problem you are trying to
    solve:

  • If you require interprocedural analysis, it should be a Pass.
  • If your analysis is function-local, subclass FunctionPass.
  • If you don't need to look at the program at all, subclass ImmutablePass.

    In addition to the pass that you subclass, you should also inherit from the
    AliasAnalysis interface, of course, and use the
    RegisterAnalysisGroup template to register as an implementation of
    AliasAnalysis.


    Required initialization calls

    Your subclass of AliasAnalysis is required to invoke two methods on
    the AliasAnalysis base class: getAnalysisUsage and
    InitializeAliasAnalysis. In particular, your implementation of
    getAnalysisUsage should explicitly call into the
    AliasAnalysis::getAnalysisUsage method in addition to
    declaring any pass dependencies your pass has. Thus you should have something
    like this:

        void getAnalysisUsage(AnalysisUsage &AU) const {
          AliasAnalysis::getAnalysisUsage(AU);
          // declare your dependencies here.
        }

    Additionally, you must invoke the InitializeAliasAnalysis method
    from your analysis run method (run for a Pass,
    runOnFunction for a FunctionPass, or InitializePass
    for an ImmutablePass). For example (as part of a Pass):

        bool run(Module &M) {
          InitializeAliasAnalysis(this);
          // Perform analysis here...
          return false;
        }

    Interfaces which may be specified

    All of the AliasAnalysis
    (/doxygen/classllvm_1_1AliasAnalysis.html)
    virtual methods default to providing chaining to another
    alias analysis implementation, which ends up returning conservatively correct
    information (returning "May" Alias and "Mod/Ref" for alias and mod/ref queries
    respectively). Depending on the capabilities of the analysis you are
    implementing, you just override the interfaces you can improve.


    AliasAnalysis chaining behavior

    With only two special exceptions (the basicaa and no-aa
    passes) every alias analysis pass chains to another alias analysis
    implementation (for example, the user can specify "-basicaa -ds-aa
    -licm" to get the maximum benefit from both alias
    analyses). The alias analysis class automatically takes care of most of this
    for methods that you don't override. For methods that you do override, in code
    paths that return a conservative MayAlias or Mod/Ref result, simply return
    whatever the superclass computes. For example:

        AliasAnalysis::AliasResult alias(const Value *V1, unsigned V1Size,
                                         const Value *V2, unsigned V2Size) {
          if (...)
            return NoAlias;
          ...

          // Couldn't determine a must or no-alias result.
          return AliasAnalysis::alias(V1, V1Size, V2, V2Size);
        }

    In addition to analysis queries, you must make sure to unconditionally pass
    LLVM update notification methods to the superclass as
    well if you override them, which allows all alias analyses in a chain to be
    updated.
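
    The chaining pattern can be tried outside of LLVM with a small stand-alone
    model. The classes below are hypothetical (ToyAA, ToySameIdAA and
    ToyParityAA are not LLVM classes), pointers are modeled as plain unsigned
    ids, and the "even and odd ids never alias" rule is an arbitrary stand-in
    for a real analysis fact.

```cpp
enum ToyAliasResult { ToyNoAlias, ToyMayAlias, ToyMustAlias };

// Base class: by default every query is forwarded to the next analysis in
// the chain, and the end of the chain answers with a conservative MayAlias.
class ToyAA {
public:
  explicit ToyAA(ToyAA *Next = nullptr) : Next(Next) {}
  virtual ~ToyAA() = default;

  virtual ToyAliasResult alias(unsigned P1, unsigned P2) {
    return Next ? Next->alias(P1, P2) : ToyMayAlias;
  }

private:
  ToyAA *Next; // not owned
};

// Knows only that identical ids must alias.
class ToySameIdAA : public ToyAA {
public:
  using ToyAA::ToyAA;
  ToyAliasResult alias(unsigned P1, unsigned P2) override {
    if (P1 == P2)
      return ToyMustAlias;
    // Could not improve on the conservative answer: defer to the chain.
    return ToyAA::alias(P1, P2);
  }
};

// Knows only the toy rule that an even id and an odd id never alias.
class ToyParityAA : public ToyAA {
public:
  using ToyAA::ToyAA;
  ToyAliasResult alias(unsigned P1, unsigned P2) override {
    if ((P1 % 2) != (P2 % 2))
      return ToyNoAlias;
    return ToyAA::alias(P1, P2);
  }
};
```

    Constructing a ToySameIdAA and passing its address to a ToyParityAA yields
    a two-element chain: each analysis answers the queries it can decide and
    hands everything else to the base class, so the combined result is at least
    as precise as either analysis alone.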


    Updating analysis results for transformations

    Alias analysis information is initially computed for a static snapshot of the
    program, but clients will use this information to make transformations to the
    code. All but the most trivial forms of alias analysis will need to have their
    analysis results updated to reflect the changes made by these transformations.

    The AliasAnalysis interface exposes two methods which are used to
    communicate program changes from the clients to the analysis implementations.
    Various alias analysis implementations should use these methods to ensure that
    their internal data structures are kept up-to-date as the program changes (for
    example, when an instruction is deleted), and clients of alias analysis must be
    sure to call these interfaces appropriately.

    The deleteValue method

    The deleteValue method is called by transformations when they remove an
    instruction or any other value from the program (including values that do not
    use pointers). Typically alias analyses keep data structures that have entries
    for each value in the program. When this method is called, they should remove
    any entries for the specified value, if they exist.

    The copyValue method

    The copyValue method is used when a new value is introduced into the
    program. There is no way to introduce a value into the program that did not
    exist before (this doesn't make sense for a safe compiler transformation), so
    this is the only way to introduce a new value. This method indicates that the
    new value has exactly the same properties as the value being copied.

    The replaceWithNewValue method

    This method is a simple helper method that is provided to make clients easier to
    use. It is implemented by copying the old analysis information to the new
    value, then deleting the old value. This method cannot be overridden by alias
    analysis implementations.
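
    A minimal sketch of how an implementation might honor these notifications
    follows. It is not LLVM code: ToyTrackingAA is a hypothetical class, values
    are modeled as integer ids, and the per-value analysis result is modeled as
    a string.

```cpp
#include <map>
#include <string>

// Hypothetical analysis that keeps one entry per program value.
class ToyTrackingAA {
public:
  void setInfo(int V, const std::string &Info) { InfoFor[V] = Info; }
  bool hasInfo(int V) const { return InfoFor.count(V) != 0; }
  std::string getInfo(int V) const {
    auto It = InfoFor.find(V);
    return It == InfoFor.end() ? std::string() : It->second;
  }

  // The value was removed from the program: drop its entry, if any.
  void deleteValue(int V) { InfoFor.erase(V); }

  // The new value has exactly the same properties as the one being copied.
  void copyValue(int From, int To) {
    auto It = InfoFor.find(From);
    if (It == InfoFor.end())
      return;
    std::string Copy = It->second;
    InfoFor[To] = Copy;
  }

  // Helper built from the two primitives: copy, then delete the old value.
  void replaceWithNewValue(int Old, int New) {
    copyValue(Old, New);
    deleteValue(Old);
  }

private:
  std::map<int, std::string> InfoFor;
};
```

    The helper is deliberately expressed in terms of copyValue and deleteValue,
    mirroring the description above, so an implementation only has to get the
    two primitives right.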

    Efficiency Issues

    From the LLVM perspective, the only thing you need to do to provide an
    efficient alias analysis is to make sure that alias analysis queries are
    serviced quickly. The actual calculation of the alias analysis results (the
    "run" method) is only performed once, but many (perhaps duplicate) queries may
    be performed. Because of this, try to move as much computation to the run
    method as possible (within reason).


    Using alias analysis results

    There are several different ways to use alias analysis results. In order of
    preference, these are...

    Using the MemoryDependenceAnalysis Pass

    The memdep pass uses alias analysis to provide high-level dependence
    information about memory-using instructions. This will tell you which store
    feeds into a load, for example. It uses caching and other techniques to be
    efficient, and is used by Dead Store Elimination, GVN, and memcpy optimizations.


    Using the AliasSetTracker class

    Many transformations need information about alias sets that are active
    in some scope, rather than information about pairwise aliasing. The
    AliasSetTracker class (/doxygen/classllvm_1_1AliasSetTracker.html)
    is used to efficiently build these Alias Sets from the pairwise alias analysis
    information provided by the AliasAnalysis interface.

    First you initialize the AliasSetTracker by using the "add" methods
    to add information about various potentially aliasing instructions in the scope
    you are interested in. Once all of the alias sets are completed, your pass
    should simply iterate through the constructed alias sets, using the
    AliasSetTracker begin()/end() methods.

    The AliasSets formed by the AliasSetTracker are guaranteed
    to be disjoint, calculate mod/ref information and volatility for the set, and
    keep track of whether or not all of the pointers in the set are Must aliases.
    The AliasSetTracker also makes sure that sets are properly folded due to call
    instructions, and can provide a list of pointers in each set.

    As an example user of this, the Loop
    Invariant Code Motion pass uses AliasSetTrackers to calculate alias
    sets for each loop nest. If an AliasSet in a loop is not modified,
    then all load instructions from that set may be hoisted out of the loop. If any
    alias sets are stored to and are must alias sets, then the stores may be
    sunk to outside of the loop, promoting the memory location to a register for the
    duration of the loop nest. Both of these transformations only apply if the
    pointer argument is loop-invariant.


    The AliasSetTracker implementation

    The AliasSetTracker class is implemented to be as efficient as possible. It
    uses the union-find algorithm to efficiently merge AliasSets when a pointer is
    inserted into the AliasSetTracker that aliases multiple sets. The primary data
    structure is a hash table mapping pointers to the AliasSet they are in.

    The AliasSetTracker class must maintain a list of all of the LLVM Value*'s
    that are in each AliasSet. Since the hash table already has entries for each
    LLVM Value* of interest, the AliasSets thread the linked list through these
    hash-table nodes to avoid having to allocate memory unnecessarily, and to make
    merging alias sets extremely efficient (the linked list merge is constant time).

    You shouldn't need to understand these details if you are just a client of
    the AliasSetTracker, but if you look at the code, hopefully this brief
    description will help make sense of why things are designed the way they
    are.
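
    For readers unfamiliar with union-find, the merging step described above
    can be sketched as follows. This is a generic textbook union-find, not the
    AliasSetTracker code: ToyAliasSets is a hypothetical class, pointers are
    modeled as indices into a vector rather than hash-table nodes, and the
    intrusive linked list of the real implementation is left out.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical union-find over pointer indices 0..N-1. Each index starts in
// its own singleton set.
class ToyAliasSets {
public:
  explicit ToyAliasSets(std::size_t NumPointers) : Parent(NumPointers) {
    std::iota(Parent.begin(), Parent.end(), std::size_t{0});
  }

  // Returns the representative of P's set, shortening the path as it goes.
  std::size_t find(std::size_t P) {
    while (Parent[P] != P) {
      Parent[P] = Parent[Parent[P]]; // path halving
      P = Parent[P];
    }
    return P;
  }

  // Called when two pointers are found to alias: fold their sets together.
  void merge(std::size_t A, std::size_t B) {
    std::size_t RootA = find(A);
    std::size_t RootB = find(B);
    Parent[RootA] = RootB;
  }

  bool sameSet(std::size_t A, std::size_t B) { return find(A) == find(B); }

private:
  std::vector<std::size_t> Parent;
};
```

    When a newly inserted pointer aliases members of two existing sets, two
    merge calls fold all three into one set, which is the situation the text
    above describes.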


    Using the AliasAnalysis interface directly

    If neither of these utility classes is what your pass needs, you should use
    the interfaces exposed by the AliasAnalysis class directly. Try to use
    the higher-level methods when possible (e.g., use mod/ref information instead of
    the alias method directly if possible) to get the
    best precision and efficiency.


    Existing alias analysis implementations and clients

    If you're going to be working with the LLVM alias analysis infrastructure,
    you should know what clients and implementations of alias analysis are
    available. In particular, if you are implementing an alias analysis, you should
    be aware of the clients that are useful
    for monitoring and evaluating different implementations.

    Available AliasAnalysis implementations

    This section lists the various implementations of the AliasAnalysis
    interface. With the exception of the -no-aa and
    -basicaa implementations, all of these
    chain to other alias analysis implementations.


    The -no-aa pass

    The -no-aa pass is just what it sounds like: an alias analysis that
    never returns any useful information. This pass can be useful if you think that
    alias analysis is doing something wrong and are trying to narrow down a
    problem.

    650
    651
    652
    653
    654
    655 The -basicaa pass
    656
    657
    658
    659
    660

    The -basicaa pass is the default LLVM alias analysis. It is an

    661 aggressive local analysis that "knows" many important facts:

    662
    663
    664
  • Distinct globals, stack allocations, and heap allocations can never
  • 665 alias.
    666
  • Globals, stack allocations, and heap allocations never alias the null
  • 667 pointer.
    668
  • Different fields of a structure do not alias.
  • 669
  • Indexes into arrays with statically differing subscripts cannot alias.
  • 670
  • Many common standard C library functions
  • 671 never access memory or only read memory.
    672
  • Pointers that obviously point to constant globals
  • 673 "pointToConstantMemory".
    674
  • Function calls cannot modify or reference stack allocations if they never
  • 675 escape from the function that allocates them (a common case for automatic
    676 arrays).
    677
    678
    679
    680
    681
    682
    683 The -globalsmodref-aa pass
    684
    685
    686
    687
    688

    This pass implements a simple context-sensitive mod/ref and alias analysis

    689 for internal global variables that don't "have their address taken". If a
    690 global does not have its address taken, the pass knows that no pointers alias
    691 the global. This pass also keeps track of functions that it knows never access
    692 memory or never read memory. This allows certain optimizations (e.g. GVN) to
    693 eliminate call instructions entirely.
    694

    695
    696

    The real power of this pass is that it provides context-sensitive mod/ref

    697 information for call instructions. This allows the optimizer to know that
    698 calls to a function do not clobber or read the value of the global, allowing
    699 loads and stores to be eliminated.

    700
    701

    Note that this pass is somewhat limited in its scope (it only supports

    702 non-address-taken globals), but it is a very quick analysis.

    703
    704
    705
    706
    707 The -steens-aa pass
    708
    709
    710
    711
    712

    The -steens-aa pass implements a variation on the well-known

    713 "Steensgaard's algorithm" for interprocedural alias analysis. Steensgaard's
    714 algorithm is a unification-based, flow-insensitive, context-insensitive, and
    715 field-insensitive alias analysis that is also very scalable (effectively linear
    716 time).

    717
    718

    The LLVM -steens-aa pass implements a "speculatively

    719 field-sensitive" version of Steensgaard's algorithm using the Data
    720 Structure Analysis framework. This gives it substantially more precision than
    721 the standard algorithm while maintaining excellent analysis scalability.

    722
    723

    Note that -steens-aa is available in the optional "poolalloc"

    724 module; it is not part of the LLVM core.

    725
    726
    727
    728
    729
    730 The -ds-aa pass
    731
    732
    733
    734
    735

    The -ds-aa pass implements the full Data Structure Analysis

    736 algorithm. Data Structure Analysis is a modular unification-based,
    737 flow-insensitive, context-sensitive, and speculatively
    738 field-sensitive alias analysis that is also quite scalable, usually at
    739 O(n*log(n)).

    740
    741

    This algorithm is capable of responding to a full variety of alias analysis

    742 queries, and can provide context-sensitive mod/ref information as well. The
    743 only major facility not implemented so far is support for must-alias
    744 information.

    745
    746

    Note that -ds-aa is available in the optional "poolalloc"

    747 module; it is not part of the LLVM core.

    748
    749
    750
    751
    752
    753
    754 Alias analysis driven transformations
    755
    756
    757
    758 LLVM includes several alias-analysis driven transformations which can be used
    759 with any of the implementations above.
    760
    761
    762
    763
    764 The -adce pass
    765
    766
    767
    768
    769

    The -adce pass, which implements Aggressive Dead Code Elimination,

    770 uses the AliasAnalysis interface to delete calls to functions that do
    771 not have side-effects and are not used.

    772
    773
    774
    775
    776
    777
    778 The -licm pass
    779
    780
    781
    782
    783

    The -licm pass implements various Loop Invariant Code Motion related

    784 transformations. It uses the AliasAnalysis interface for several
    785 different transformations:

    786
    787
    788
  • It uses mod/ref information to hoist or sink load instructions out of loops
  • 789 if there are no instructions in the loop that modify the memory loaded.
    790
    791
  • It uses mod/ref information to hoist function calls out of loops that do not
  • 792 write to memory and are loop-invariant.
    793
    794
  • It uses alias information to promote memory objects that are loaded and
  • 795 stored to in loops to live in a register instead. It can do this if there are
    796 no may aliases to the loaded/stored memory location.
    797
    798
    799
    800
    801
    802
    803 The -argpromotion pass
    804
    805
    806
    807

    808 The -argpromotion pass promotes by-reference arguments to be passed
    809 by value instead. In particular, if pointer arguments are only loaded from, it
    810 passes in the value loaded instead of the address to the function. This pass
    811 uses alias information to make sure that the value loaded from the argument
    812 pointer is not modified between the entry of the function and any load of the
    813 pointer.

    814
    815
    816
    817
    818 The -gvn, -memcpyopt, and -dse
    819 passes
    820
    821
    822
    823
    824

    These passes use AliasAnalysis information to reason about loads and stores.

    825

    826
    827
    828
    829
    830
    831 Clients for debugging and evaluation of
    832 implementations
    833
    834
    835
    836
    837

    These passes are useful for evaluating the various alias analysis

    838 implementations. You can use them with commands like 'opt -ds-aa
    839 -aa-eval foo.bc -disable-output -stats'.

    840
    841
    842
    843
    844
    845 The -print-alias-sets pass
    846
    847
    848
    849
    850

    The -print-alias-sets pass is exposed as part of the

    851 opt tool to print out the Alias Sets formed by the
    852 AliasSetTracker class. This is useful if you're using
    853 the AliasSetTracker class. To use it, use something like:

    854
    855
    856
    
                      
                    
    857 % opt -ds-aa -print-alias-sets -disable-output
    858
    859
    860
    861
    862
    863
    864
    865
    866 The -count-aa pass
    867
    868
    869
    870
    871

    The -count-aa pass is useful to see how many queries a particular

    872 pass is making and what responses are returned by the alias analysis. As an
    873 example,

    874
    875
    876
    
                      
                    
    877 % opt -basicaa -count-aa -ds-aa -count-aa -licm
    878
    879
    880
    881

    will print out how many queries are made (and what responses are returned) by the

    882 -licm pass (of the -ds-aa pass) and how many queries are made
    883 of the -basicaa pass by the -ds-aa pass. This can be useful
    884 when debugging a transformation or an alias analysis implementation.

    885
    886
    887
    888
    889
    890 The -aa-eval pass
    891
    892
    893
    894
    895

    The -aa-eval pass simply iterates through all pairs of pointers in a

    896 function and asks an alias analysis whether or not the pointers alias. This
    897 gives an indication of the precision of the alias analysis. Statistics are
    898 printed indicating the percent of no/may/must aliases found (a more precise
    899 algorithm will have a lower number of may aliases).
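The pairwise evaluation described above can be modeled in a few lines. This is only a toy sketch in Python, not LLVM's implementation: the function names, the string results "no"/"may"/"must", and the oracle are all hypothetical stand-ins for an alias analysis.

```python
from collections import Counter
from itertools import combinations


def evaluate_alias_oracle(pointers, alias):
    """Query every unordered pair of pointers and report result percentages.

    `alias` is a hypothetical oracle returning "no", "may", or "must".
    """
    counts = Counter(alias(a, b) for a, b in combinations(pointers, 2))
    total = sum(counts.values())
    return {
        kind: (100.0 * counts[kind] / total if total else 0.0)
        for kind in ("no", "may", "must")
    }


def toy_oracle(a, b):
    """Toy rule: two distinct names starting with '@' (globals) never alias."""
    if a == b:
        return "must"
    if a.startswith("@") and b.startswith("@"):
        return "no"
    return "may"


report = evaluate_alias_oracle(["@g1", "@g2", "%p"], toy_oracle)
```

A more precise oracle would move pairs out of the "may" bucket, which is the effect the statistics are meant to expose.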

    900
    901
    902
    903
    904
    905 Memory Dependence Analysis
    906
    907
    908
    909
    910
    911

    If you're just looking to be a client of alias analysis information, consider

    912 using the Memory Dependence Analysis interface instead. MemDep is a lazy,
    913 caching layer on top of alias analysis that is able to answer the question of
    914 what preceding memory operations a given instruction depends on, either at an
    915 intra- or inter-block level. Because of its laziness and caching
    916 policy, using MemDep can be a significant performance win over accessing alias
    917 analysis directly.

    918
    919
    920
    921
    922
    923
    924
    925
    929
    930 Chris Lattner
    931 LLVM Compiler Infrastructure
    932 Last modified: $Date$
    933
    934
    935
    936
    +0
    -1163
    docs/BitCodeFormat.html
    None
    1 "http://www.w3.org/TR/html4/strict.dtd">
    2
    3
    4
    5 LLVM Bitcode File Format
    6
    7
    8
    9
    LLVM Bitcode File Format
    10
    11
  • Abstract
  • 12
  • Overview
  • 13
  • Bitstream Format
  • 14
    15
  • Magic Numbers
  • 16
  • Primitives
  • 17
  • Abbreviation IDs
  • 18
  • Blocks
  • 19
  • Data Records
  • 20
  • Abbreviations
  • 21
  • Standard Blocks
  • 22
    23
    24
  • Bitcode Wrapper Format
  • 25
    26
  • LLVM IR Encoding
  • 27
    28
  • Basics
  • 29
  • MODULE_BLOCK Contents
  • 30
  • PARAMATTR_BLOCK Contents
  • 31
  • TYPE_BLOCK Contents
  • 32
  • CONSTANTS_BLOCK Contents
  • 33
  • FUNCTION_BLOCK Contents
  • 34
  • TYPE_SYMTAB_BLOCK Contents
  • 35
  • VALUE_SYMTAB_BLOCK Contents
  • 36
  • METADATA_BLOCK Contents
  • 37
  • METADATA_ATTACHMENT Contents
  • 38
    39
    40
    41
    42

    Written by Chris Lattner

    43 and Joshua Haberman.
    44

    45
    46
    47
    48
    49
    50
    51
    52
    53

    This document describes the LLVM bitstream file format and the encoding of

    54 the LLVM IR into it.

    55
    56
    57
    58
    59
    60
    61
    62
    63
    64

    65 What is commonly known as the LLVM bitcode file format (also, sometimes
    66 anachronistically known as bytecode) is actually two things: a
    67 bitstream container format
    68 and an encoding of LLVM IR into the container format.

    69
    70

    71 The bitstream format is an abstract encoding of structured data, very
    72 similar to XML in some ways. Like XML, bitstream files contain tags, and nested
    73 structures, and you can parse the file without having to understand the tags.
    74 Unlike XML, the bitstream format is a binary encoding, and unlike XML it
    75 provides a mechanism for the file to self-describe "abbreviations", which are
    76 effectively size optimizations for the content.

    77
    78

    LLVM IR files may be optionally embedded into a

    79 wrapper structure that makes it easy to embed extra data
    80 along with LLVM IR files.

    81
    82

    This document first describes the LLVM bitstream format, then describes the

    83 wrapper format, and finally describes the record structure used by LLVM IR files.
    84

    85
    86
    87
    88
    89
    90
    91
    92
    93
    94

    95 The bitstream format is literally a stream of bits, with a very simple
    96 structure. This structure consists of the following concepts:
    97

    98
    99
    100
  • A "magic number" that identifies the contents of
  • 101 the stream.
    102
  • Encoding primitives like variable bit-rate
  • 103 integers.
    104
  • Blocks, which define nested content.
  • 105
  • Data Records, which describe entities within the
  • 106 file.
    107
  • Abbreviations, which specify compression optimizations for the file.
  • 108
    109
    110

    Note that the

    111 llvm-bcanalyzer tool can be
    112 used to dump and inspect arbitrary bitstreams, which is very useful for
    113 understanding the encoding.

    114
    115
    116
    117
    118
    119
    120
    121
    122
    123

    The first two bytes of a bitcode file are 'BC' (0x42, 0x43).

    124 The second two bytes are an application-specific magic number. Generic
    125 bitcode tools can look at only the first two bytes to verify the file is
    126 bitcode, while application-specific programs will want to look at all four.
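A generic tool's check can be sketched as follows. This is a minimal Python illustration, and the helper names are hypothetical rather than part of any LLVM API.

```python
def looks_like_bitcode(data: bytes) -> bool:
    """Return True if the buffer starts with the generic 'BC' magic (0x42, 0x43)."""
    return len(data) >= 2 and data[0] == 0x42 and data[1] == 0x43


def application_magic(data: bytes) -> bytes:
    """Return the second two bytes, the application-specific magic number."""
    if len(data) < 4:
        raise ValueError("need at least four bytes to read the full magic")
    return data[2:4]
```

A generic tool would stop after `looks_like_bitcode`; an application-specific reader would also compare `application_magic` against its own expected value.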

    127
    128
    129
    130
    131
    132
    133
    134
    135
    136

    137 A bitstream literally consists of a stream of bits, which are read in order
    138 starting with the least significant bit of each byte. The stream is made up of a
    139 number of primitive values that encode a stream of unsigned integer values.
    140 These integers are encoded in two ways: either as Fixed
    141 Width Integers or as Variable Width
    142 Integers.
    143

    144
    145
    146
    147
    148
    149
    150
    151
    152
    153

    Fixed-width integer values have their low bits emitted directly to the file.

    154 For example, a 3-bit integer value encodes 1 as 001. Fixed width integers
    155 are used when there is a well-known number of options for a field. For
    156 example, boolean values are usually encoded with a 1-bit wide integer.
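To make the bit order concrete, here is a small reader sketch in Python. The class and method names are hypothetical; it assumes only what is stated above, namely that bits are consumed starting with the least significant bit of each byte and that a fixed-width field stores the low bits of its value.

```python
class BitReader:
    """Reads fixed-width fields from a byte buffer, least significant bit first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read_fixed(self, width: int) -> int:
        value = 0
        for i in range(width):
            byte = self.data[self.pos >> 3]      # byte holding the current bit
            bit = (byte >> (self.pos & 7)) & 1   # bit 0 of the byte comes first
            value |= bit << i                    # first bit read is the low bit
            self.pos += 1
        return value


reader = BitReader(bytes([0b00001001]))
first = reader.read_fixed(3)   # low three bits, 001
second = reader.read_fixed(1)  # a 1-bit boolean field
```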
    157

    158
    159
    160
    161
    162
    163 Variable Width Integers
    164
    165
    166
    167

    Variable-width integer (VBR) values encode values of arbitrary size,

    168 optimizing for the case where the values are small. Given a 4-bit VBR field,
    169 any 3-bit value (0 through 7) is encoded directly, with the high bit set to
    170 zero. Values larger than N-1 bits emit their bits in a series of N-1 bit
    171 chunks, where all but the last set the high bit.

    172
    173

    For example, the value 27 (0x1B) is encoded as 1011 0011 when emitted as a

    174 vbr4 value. The first set of four bits indicates the value 3 (011) with a
    175 continuation piece (indicated by a high bit of 1). The next set of four bits indicates a
    176 value of 24 (011 << 3) with no continuation. The sum (3+24) yields the value
    177 27.
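The chunking rule can be sketched in Python as below. The function names are hypothetical, and chunks are returned as a list of small integers rather than written to a real bitstream.

```python
def encode_vbr(value: int, width: int) -> list:
    """Split a non-negative integer into VBR chunks of `width` bits each."""
    if value < 0 or width < 2:
        raise ValueError("value must be non-negative and width at least 2")
    payload_bits = width - 1
    high_bit = 1 << payload_bits
    chunks = []
    while value >= high_bit:
        # Low payload bits plus the continuation flag in the high bit.
        chunks.append((value & (high_bit - 1)) | high_bit)
        value >>= payload_bits
    chunks.append(value)  # last chunk has the high bit clear
    return chunks


def decode_vbr(chunks: list, width: int) -> int:
    """Reassemble the value from chunks produced by encode_vbr."""
    payload_bits = width - 1
    mask = (1 << payload_bits) - 1
    value = 0
    for index, chunk in enumerate(chunks):
        value |= (chunk & mask) << (index * payload_bits)
    return value
```

For the value 27 as vbr4 this yields the two chunks 1011 and 0011, matching the worked example above.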
    178

    179
    180
    181
    182
    183
    184
    185
    186
    187

    6-bit characters encode common characters into a fixed 6-bit field. They

    188 represent the following characters with the following 6-bit values:

    189
    190
    191
    
                      
                    
    192 'a' .. 'z' — 0 .. 25
    193 'A' .. 'Z' — 26 .. 51
    194 '0' .. '9' — 52 .. 61
    195 '.' — 62
    196 '_' — 63
    197
    198
    199
    200

    This encoding is only suitable for encoding characters and strings that

    201 consist only of the above characters. It is completely incapable of encoding
    202 characters not in the set.
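A possible char6 table, written as a Python sketch with hypothetical helper names, follows directly from the mapping listed above.

```python
import string

# 'a'..'z' -> 0..25, 'A'..'Z' -> 26..51, '0'..'9' -> 52..61, '.' -> 62, '_' -> 63
CHAR6_ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits + "._"


def encode_char6(text: str) -> list:
    """Map each character to its 6-bit value; reject anything outside the set."""
    values = []
    for ch in text:
        index = CHAR6_ALPHABET.find(ch)
        if index < 0:
            raise ValueError("character %r cannot be encoded as char6" % ch)
        values.append(index)
    return values


def decode_char6(values: list) -> str:
    """Inverse of encode_char6."""
    return "".join(CHAR6_ALPHABET[v] for v in values)
```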

    203
    204
    205
    206
    207
    208
    209
    210
    211

    Occasionally, it is useful to emit zero bits until the bitstream is a

    212 multiple of 32 bits. This ensures that the bit position in the stream can be
    213 represented as a multiple of 32-bit words.
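The number of zero bits to emit is a one-line computation. The helper name below is hypothetical.

```python
def padding_to_32(bit_pos: int) -> int:
    """Number of zero bits needed to advance bit_pos to a multiple of 32."""
    return (-bit_pos) % 32
```

A stream that is already aligned needs no padding, so the function returns 0 in that case.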

    214
    215
    216
    217
    218
    219
    220
    221
    222
    223
    224

    225 A bitstream is a sequential series of Blocks and
    226 Data Records. Both of these start with an
    227 abbreviation ID encoded as a fixed-bitwidth field. The width is specified by
    228 the current block, as described below. The value of the abbreviation ID
    229 specifies either a builtin ID (which has a special meaning, defined below) or
    230 one of the abbreviation IDs defined for the current block by the stream itself.
    231

    232
    233

    234 The set of builtin abbrev IDs is:
    235

    236
    237
    238
  • 0 - END_BLOCK — This abbrev ID marks
  • 239 the end of the current block.
    240
  • 1 - ENTER_SUBBLOCK — This
  • 241 abbrev ID marks the beginning of a new block.
    242
  • 2 - DEFINE_ABBREV — This defines
  • 243 a new abbreviation.
    244
  • 3 - UNABBREV_RECORD — This ID
  • 245 specifies the definition of an unabbreviated record.
    246
    247
    248

    Abbreviation IDs 4 and above are defined by the stream itself, and specify

    249 an abbreviated record encoding.
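A reader's first dispatch step on an abbreviation ID could look like this Python sketch. The dictionary and function names are illustrative only.

```python
BUILTIN_ABBREV_IDS = {
    0: "END_BLOCK",
    1: "ENTER_SUBBLOCK",
    2: "DEFINE_ABBREV",
    3: "UNABBREV_RECORD",
}


def describe_abbrev_id(abbrev_id: int) -> str:
    """Classify an abbreviation ID as builtin or stream-defined."""
    if abbrev_id < 0:
        raise ValueError("abbreviation IDs are unsigned")
    if abbrev_id in BUILTIN_ABBREV_IDS:
        return BUILTIN_ABBREV_IDS[abbrev_id]
    return "stream-defined abbreviation #%d" % (abbrev_id - 4)
```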

    250
    251
    252
    253
    254
    255
    256
    257
    258
    259

    260 Blocks in a bitstream denote nested regions of the stream, and are identified by
    261 a content-specific id number (for example, LLVM IR uses an ID of 12 to represent
    262 function bodies). Block IDs 0-7 are reserved for standard blocks
    263 whose meaning is defined by Bitcode; block IDs 8 and greater are
    264 application specific. Nested blocks capture the hierarchical structure of the data
    265 encoded in them, and various properties are associated with blocks as the file is
    266 parsed. Block definitions allow the reader to efficiently skip blocks
    267 in constant time if the reader wants a summary of blocks, or if it wants to
    268 efficiently skip data it does not understand. The LLVM IR reader uses this
    269 mechanism to skip function bodies, lazily reading them on demand.
    270

    271
    272

    273 When reading and encoding the stream, several properties are maintained for the
    274 block. In particular, each block maintains:
    275

    276
    277
    278
  • A current abbrev id width. This value starts at 2 at the beginning of
  • 279 the stream, and is set every time a
    280 block record is entered. The block entry specifies the abbrev id width for
    281 the body of the block.
    282
    283
  • A set of abbreviations. Abbreviations may be defined within a block, in
  • 284 which case they are only defined in that block (neither subblocks nor
    285 enclosing blocks see the abbreviation). Abbreviations can also be defined
    286 inside a BLOCKINFO block, in which case
    287 they are defined in all blocks that match the ID that the BLOCKINFO block is
    288 describing.
    289
    290
    291
    292

    293 As sub-blocks are entered, these properties are saved and the new sub-block has
    294 its own set of abbreviations, and its own abbrev id width. When a sub-block is
    295 popped, the saved values are restored.
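The save and restore behavior can be modeled with a simple stack. This Python sketch uses hypothetical names and represents abbreviations as opaque objects. It assumes, as the text on abbreviation definitions states, that stream-defined IDs start at 4 and that abbreviations coming from a BLOCKINFO block are numbered before those defined in the block itself.

```python
class BlockScopes:
    """Tracks the abbrev id width and abbreviation list of the current block."""

    def __init__(self):
        self.abbrev_width = 2   # width in effect at the start of the stream
        self.abbrevs = []
        self._saved = []        # one saved (width, abbrevs) pair per open block

    def enter(self, new_abbrev_width, blockinfo_abbrevs=()):
        """Enter a sub-block: save the outer state, start a fresh scope."""
        self._saved.append((self.abbrev_width, self.abbrevs))
        self.abbrev_width = new_abbrev_width
        self.abbrevs = list(blockinfo_abbrevs)

    def define_abbrev(self, abbrev):
        """Add an abbreviation to the current block and return its ID."""
        self.abbrevs.append(abbrev)
        return 3 + len(self.abbrevs)  # the first abbreviation gets ID 4

    def leave(self):
        """Leave the current block: restore the enclosing block's state."""
        self.abbrev_width, self.abbrevs = self._saved.pop()
```

Because each scope owns its own list, an abbreviation defined in a block is invisible both to its sub-blocks and to its enclosing blocks.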
    296

    297
    298
    299
    300
    301
    302 ENTER_SUBBLOCK Encoding
    303
    304
    305
    306

    [ENTER_SUBBLOCK, blockid_vbr8, newabbrevlen_vbr4,

    307 <align32bits>, blocklen_32]

    308
    309

    310 The ENTER_SUBBLOCK abbreviation ID specifies the start of a new block
    311 record. The blockid value is encoded as an 8-bit VBR identifier, and
    312 indicates the type of block being entered, which can be
    313 a standard block or an application-specific block.
    314 The newabbrevlen value is a 4-bit VBR, which specifies the abbrev id
    315 width for the sub-block. The blocklen value is a 32-bit aligned value
    316 that specifies the size of the subblock in 32-bit words. This value allows the
    317 reader to skip over the entire block in one jump.
    318

    319
    320
    321
    322
    323
    324 END_BLOCK Encoding
    325
    326
    327
    328

    [END_BLOCK, <align32bits>]

    329
    330

    331 The END_BLOCK abbreviation ID specifies the end of the current block
    332 record. Its end is aligned to 32 bits to ensure that the size of the block is
    333 a multiple of 32 bits.
    334

    335
    336
    337
    338
    339
    340
    341
    342
    343
    344
    345

    346 Data records consist of a record code and a number of (up to) 64-bit
    347 integer values. The interpretation of the code and values is
    348 application specific and may vary between different block types.
    349 Records can be encoded either using an unabbrev record, or with an
    350 abbreviation. In the LLVM IR format, for example, there is a record
    351 which encodes the target triple of a module. The code is
    352 MODULE_CODE_TRIPLE, and the values of the record are the
    353 ASCII codes for the characters in the string.
    354

    355
    356
    357
    358
    359
    360 UNABBREV_RECORD Encoding
    361
    362
    363
    364

    [UNABBREV_RECORD, code_vbr6, numops_vbr6,

    365 op0_vbr6, op1_vbr6, ...]

    366
    367

    368 An UNABBREV_RECORD provides a default fallback encoding, which is both
    369 completely general and extremely inefficient. It can describe an arbitrary
    370 record by emitting the code and operands as VBRs.
    371

    372
    373

    374 For example, emitting an LLVM IR target triple as an unabbreviated record
    375 requires emitting the UNABBREV_RECORD abbrevid, a vbr6 for the
    376 MODULE_CODE_TRIPLE code, a vbr6 for the length of the string, which is
    377 equal to the number of operands, and a vbr6 for each character. Because there
    378 are no letters with values less than 32, each letter would need to be emitted as
    379 at least a two-part VBR, which means that each letter would require at least 12
    380 bits. This is not an efficient encoding, but it is fully general.
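The cost of this fallback encoding can be estimated with a short Python sketch. The function names are hypothetical. The record code 2 for TRIPLE and the abbrev id width of 3 are taken from the abbreviation example elsewhere in this document and serve here only as sample inputs.

```python
def vbr_bits(value: int, width: int) -> int:
    """Number of bits a non-negative value occupies as a VBR of the given width."""
    payload_bits = width - 1
    chunks = 1
    while value >= (1 << payload_bits):
        value >>= payload_bits
        chunks += 1
    return chunks * width


def unabbrev_record_bits(code: int, operands: list, abbrev_width: int) -> int:
    """Size of [UNABBREV_RECORD, code, numops, op0, op1, ...] in bits."""
    bits = abbrev_width                  # the UNABBREV_RECORD abbrev id
    bits += vbr_bits(code, 6)            # record code
    bits += vbr_bits(len(operands), 6)   # operand count
    bits += sum(vbr_bits(op, 6) for op in operands)
    return bits


# Each ASCII letter is >= 32, so it needs two vbr6 chunks (12 bits).
triple_bits = unabbrev_record_bits(2, [ord(c) for c in "abcd"], abbrev_width=3)
```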
    381

    382
    383
    384
    385
    386
    387 Abbreviated Record Encoding
    388
    389
    390
    391

    [<abbrevid>, fields...]

    392
    393

    394 An abbreviated record is an abbreviation id followed by a set of fields that are
    395 encoded according to the abbreviation definition.
    396 This allows records to be encoded significantly more densely than records
    397 encoded with the UNABBREV_RECORD type,
    398 and allows the abbreviation types to be specified in the stream itself, which
    399 allows the files to be completely self describing. The actual encoding of
    400 abbreviations is defined below.
    401

    402
    403

    The record code, which is the first field of an abbreviated record,

    404 may be encoded in the abbreviation definition (as a literal
    405 operand) or supplied in the abbreviated record (as a Fixed or VBR
    406 operand value).

    407
    408
    409
    410
    411
    412
    413
    414
    415

    416 Abbreviations are an important form of compression for bitstreams. The idea is
    417 to specify a dense encoding for a class of records once, then use that encoding
    418 to emit many records. It takes space to emit the encoding into the file, but
    419 the space is recouped (hopefully plus some) when the records that use it are
    420 emitted.
    421

    422
    423

    424 Abbreviations can be determined dynamically per client, per file. Because the
    425 abbreviations are stored in the bitstream itself, different streams of the same
    426 format can contain different sets of abbreviations according to the needs
    427 of the specific stream.
    428 As a concrete example, LLVM IR files usually emit an abbreviation
    429 for binary operators. If a specific LLVM module contained no or few binary
    430 operators, the abbreviation does not need to be emitted.
    431

    432
    433
    434
    435
    436 DEFINE_ABBREV Encoding
    437
    438
    439
    440

    [DEFINE_ABBREV, numabbrevops_vbr5, abbrevop0, abbrevop1,

    441 ...]

    442
    443

    444 A DEFINE_ABBREV record adds an abbreviation to the list of currently
    445 defined abbreviations in the scope of this block. This definition only exists
    446 inside this immediate block — it is not visible in subblocks or enclosing
    447 blocks. Abbreviations are implicitly assigned IDs sequentially starting from 4
    448 (the first application-defined abbreviation ID). Any abbreviations defined in a
    449 BLOCKINFO record for the particular block type
    450 receive IDs first, in order, followed by any
    451 abbreviations defined within the block itself. Abbreviated data records
    452 reference this ID to indicate what abbreviation they are invoking.
    453

    454
    455

    456 An abbreviation definition consists of the DEFINE_ABBREV abbrevid
    457 followed by a VBR that specifies the number of abbrev operands, then the abbrev
    458 operands themselves. Abbreviation operands come in three forms. They all start
    459 with a single bit that indicates whether the abbrev operand is a literal operand
    460 (when the bit is 1) or an encoding operand (when the bit is 0).
    461

    462
    463
    464
  • Literal operands, [1_1, litvalue_vbr8]:
  • 465 Literal operands specify that the value in the result is always a single
    466 specific value. This specific value is emitted as a vbr8 after the bit
    467 indicating that it is a literal operand.
    468
  • Encoding info without data, [0_1,
  • 469 encoding_3]: Operand encodings that do not have extra
    470 data are just emitted as their code.
    471
    472
  • Encoding info with data, [0_1, encoding_3,
  • 473 value_vbr5]: Operand encodings that do have extra data are
    474 emitted as their code, followed by the extra data.
    475
    476
    477
    478

    The possible operand encodings are:

    479
    480
    481
  • Fixed (code 1): The field should be emitted as
  • 482 a fixed-width value, whose width is specified by
    483 the operand's extra data.
    484
  • VBR (code 2): The field should be emitted as
  • 485 a variable-width value, whose width is
    486 specified by the operand's extra data.
    487
  • Array (code 3): This field is an array of values. The array operand
  • 488 has no extra data, but expects another operand to follow it, indicating
    489 the element type of the array. When reading an array in an abbreviated
    490 record, the first integer is a vbr6 that indicates the array length,
    491 followed by the encoded elements of the array. An array may only occur as
    492 the last operand of an abbreviation (except for the one final operand that
    493 gives the array's type).
    494
  • Char6 (code 4): This field should be emitted as
  • 495 a char6-encoded value. This operand type takes no
    496 extra data. Char6 encoding is normally used as an array element type.
    497
    498
  • Blob (code 5): This field is emitted as a vbr6, followed by padding to a
  • 499 32-bit boundary (for alignment) and an array of 8-bit objects. The array of
    500 bytes is further followed by tail padding to ensure that its total length is
    501 a multiple of 4 bytes. This makes it very efficient for the reader to
    502 decode the data without having to make a copy of it: it can use a pointer to
    503 the data in the memory-mapped file and access it directly. A blob may only
    504 occur as the last operand of an abbreviation.
    505
    506
    507

    508 For example, target triples in LLVM modules are encoded as a record of the
    509 form [TRIPLE, 'a', 'b', 'c', 'd']. Consider if the bitstream emitted
    510 the following abbrev entry:
    511

    512
    513
    514
    
                      
                    
    515 [0, Fixed, 4]
    516 [0, Array]
    517 [0, Char6]
    518
    519
    520
    521

    522 When emitting a record with this abbreviation, the above entry would be emitted
    523 as:
    524

    525
    526
    527

    528 [4_abbrevwidth, 2_4, 4_vbr6, 0_6,
    529 1_6, 2_6, 3_6]
    530

    531
    532
    533

    These values are:

    534
    535
    536
  • The first value, 4, is the abbreviation ID for this abbreviation.
  • 537
  • The second value, 2, is the record code for TRIPLE records within LLVM IR file MODULE_BLOCK blocks.
  • 538
  • The third value, 4, is the length of the array.
  • 539
  • The rest of the values are the char6 encoded values
  • 540 for "abcd".
    541
    542
    543

    544 With this abbreviation, the triple is emitted with only 37 bits (assuming an
    545 abbrev id width of 3). Without the abbreviation, significantly more space would
    546 be required to emit the target triple. Also, because the TRIPLE value
    547 is not emitted as a literal in the abbreviation, the abbreviation can also be
    548 used for any other string value.
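The 37-bit figure can be checked by listing each emitted field with its width. The Python sketch below uses hypothetical names and hard-codes the example abbreviation (a 4-bit fixed record code followed by a char6 array). To stay short it only handles strings shorter than 32 characters, so the array length fits in a single vbr6 chunk.

```python
import string

CHAR6_ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits + "._"


def emit_triple_fields(text, abbrev_id=4, abbrev_width=3, record_code=2):
    """Return (value, width_in_bits) pairs for a record using the example abbreviation."""
    if len(text) >= 32:
        raise ValueError("sketch only handles lengths that fit one vbr6 chunk")
    fields = [
        (abbrev_id, abbrev_width),  # which abbreviation is being used
        (record_code, 4),           # record code as a 4-bit fixed field
        (len(text), 6),             # array length as a single vbr6 chunk
    ]
    # Each array element is one 6-bit char6 value.
    fields += [(CHAR6_ALPHABET.index(ch), 6) for ch in text]
    return fields


fields = emit_triple_fields("abcd")
total_bits = sum(width for _, width in fields)
```

The widths add up as 3 + 4 + 6 + 4 * 6 bits.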
    549

    550
    551
    552
    553
    554
    555
    556
    557
    558
    559

    560 In addition to the basic block structure and record encodings, the bitstream
    561 also defines specific built-in block types. These block types specify how the
    562 stream is to be decoded or other metadata. In the future, new standard blocks
    563 may be added. Block IDs 0-7 are reserved for standard blocks.
    564

    565
    566
    567
    568
    569
    570 BLOCKINFO Block
    571
    572
    573
    574

    575 The BLOCKINFO block allows the description of metadata for other
    576 blocks. The currently specified records are:
    577

    578
    579
    580
    
                      
                    
    581 [SETBID (#1), blockid]
    582 [DEFINE_ABBREV, ...]
    583 [BLOCKNAME, ...name...]
    584 [SETRECORDNAME, RecordID, ...name...]
    585
    586
    587
    588

    589 The SETBID record (code 1) indicates which block ID is being
    590 described. SETBID records can occur multiple times throughout the
    591 block to change which block ID is being described. There must be
    592 a SETBID record prior to any other records.
    593

    594
    595

    596 Standard DEFINE_ABBREV records can occur inside BLOCKINFO
    597 blocks, but unlike their occurrence in normal blocks, the abbreviation is
    598 defined for blocks matching the block ID we are describing, not the
    599 BLOCKINFO block itself. The abbreviations defined
    600 in BLOCKINFO blocks receive abbreviation IDs as described
    601 in DEFINE_ABBREV.
    602

    603
    604

    The BLOCKNAME record (code 2) can optionally occur in this block. The elements of

    605 the record are the bytes of the string name of the block. llvm-bcanalyzer can use
    606 this to dump out bitcode files symbolically.

    607
    608

    The SETRECORDNAME record (code 3) can also optionally occur in this block. The

    609 first operand value is a record ID number, and the rest of the elements of the record are
    610 the bytes for the string name of the record. llvm-bcanalyzer can use
    611 this to dump out bitcode files symbolically.

    612
    613

    614 Note that although the data in BLOCKINFO blocks is described as
    615 "metadata," the abbreviations they contain are essential for parsing records
    616 from the corresponding blocks. It is not safe to skip them.
    617

    618
    619
    620
    621
    622
    623
    624
    625
    626
    627

    628 Bitcode files for LLVM IR may optionally be wrapped in a simple wrapper
    629 structure. This structure contains a simple header that indicates the offset
    630 and size of the embedded BC file. This allows additional information to be
    631 stored alongside the BC file. The structure of this file header is:
    632

    633
    634
    635

    636 [Magic_32, Version_32, Offset_32,
    637 Size_32, CPUType_32]
    638

    639
    640
    641

    642 Each field is a 32-bit value stored in little-endian form (as with
    643 the rest of the bitcode file fields). The Magic number is always
    644 0x0B17C0DE and the version is currently always 0. The Offset
    645 field is the offset in bytes to the start of the bitcode stream in the file, and
    646 the Size field is the size in bytes of the stream. CPUType is a target-specific
    647 value that can be used to encode the CPU of the target.
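Reading the wrapper header takes one `struct` call. The following Python sketch uses hypothetical function names and assumes only the five-field layout and magic value given above.

```python
import struct

WRAPPER_MAGIC = 0x0B17C0DE
WRAPPER_HEADER_SIZE = 20  # five 32-bit fields


def parse_wrapper(data: bytes) -> dict:
    """Decode [Magic, Version, Offset, Size, CPUType] from the start of a file."""
    if len(data) < WRAPPER_HEADER_SIZE:
        raise ValueError("buffer too small for a wrapper header")
    magic, version, offset, size, cputype = struct.unpack_from("<5I", data, 0)
    if magic != WRAPPER_MAGIC:
        raise ValueError("not a bitcode wrapper")
    return {"version": version, "offset": offset, "size": size, "cputype": cputype}


def extract_bitcode(data: bytes) -> bytes:
    """Return the embedded bitcode stream described by the wrapper header."""
    header = parse_wrapper(data)
    return data[header["offset"]:header["offset"] + header["size"]]


# A synthetic wrapped file: header, then a 4-byte payload at offset 20.
sample = struct.pack("<5I", WRAPPER_MAGIC, 0, 20, 4, 7) + b"BC\xc0\xde"
```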
    648

    649
    650
    651
    652
    653
    654
    655
    656
    657
    658

    659 LLVM IR is encoded into a bitstream by defining blocks and records. It uses
    660 blocks for things like constant pools, functions, symbol tables, etc. It uses
    661 records for things like instructions, global variable descriptors, type
    662 descriptions, etc. This document does not describe the set of abbreviations
    663 that the writer uses, as these are fully self-described in the file, and the
    664 reader is not allowed to build in any knowledge of this.
    665

    666
    667
    668
    669
    670
    671
    672
    673
    674
    675
    676
    677
    678

    679 The magic number for LLVM IR files is:
    680

    681
    682
    683

    684 [0x0_4, 0xC_4, 0xE_4, 0xD_4]
    685

    686
    687
    688

    689 When combined with the bitcode magic number and viewed as bytes, this is
    690 "BC 0xC0DE".
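The link between the four 4-bit values and the bytes 0xC0 0xDE follows from reading each byte's low bits first. The helper below is a Python sketch with a hypothetical name that splits bytes into 4-bit fields in that order.

```python
def nibbles_lsb_first(data: bytes) -> list:
    """Split each byte into two 4-bit fields, low nibble first."""
    fields = []
    for byte in data:
        fields.append(byte & 0xF)  # bits 0..3 are read first
        fields.append(byte >> 4)   # bits 4..7 follow
    return fields


llvm_ir_magic_bytes = b"\xc0\xde"  # the two bytes after 'BC'
magic_fields = nibbles_lsb_first(llvm_ir_magic_bytes)
```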
    691

    692
    693
    694
    695
    696
    697
    698
    699
    700

    701 Variable Width Integer encoding is an efficient way to
    702 encode arbitrarily sized unsigned values, but is extremely inefficient for
    703 encoding signed values, as signed values are otherwise treated as maximally large
    704 unsigned values.
    705

    706
    707

    As such, signed VBR values of a specific width are emitted as follows:

  • Positive values are emitted as VBRs of the specified width, but with their
    value shifted left by one.
  • Negative values are emitted as VBRs of the specified width, but the negated
    value is shifted left by one, and the low bit is set.

    With this encoding, small positive and small negative values can both
    be emitted efficiently. Signed VBR encoding is used in
    CST_CODE_INTEGER and CST_CODE_WIDE_INTEGER records
    within CONSTANTS_BLOCK blocks.
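The two rules for signed VBR values can be sketched as follows. The function names are hypothetical. The chunking helper assumes the usual VBR layout, in which each chunk of the given width keeps its high bit as a "more chunks follow" flag and carries data in the remaining bits, least significant chunk first.

```python
def encode_signed(value):
    """Map a signed integer onto the unsigned value that gets VBR-encoded."""
    if value >= 0:
        return value << 1            # positive: shift left, low bit clear
    return ((-value) << 1) | 1       # negative: shift the negation, set low bit


def decode_signed(encoded):
    """Inverse of encode_signed."""
    magnitude = encoded >> 1
    return -magnitude if encoded & 1 else magnitude


def vbr_chunks(value, width):
    """Split an unsigned value into VBR chunks of `width` bits each."""
    if width < 2 or value < 0:
        raise ValueError("width must be >= 2 and value must be unsigned")
    data_bits = width - 1
    data_mask = (1 << data_bits) - 1
    continue_bit = 1 << data_bits
    chunks = []
    while True:
        chunk = value & data_mask
        value >>= data_bits
        if value:
            chunks.append(chunk | continue_bit)  # more chunks follow
        else:
            chunks.append(chunk)                 # final chunk
            return chunks
```

For example, -3 maps to the unsigned value 7, which fits in a single 6-bit chunk, whereas treating -3 as a 64-bit unsigned value would need many chunks.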
    723

    724
    725
    726
    727
    728
    729
    730
    731
    732
    733

    LLVM IR is defined with the following blocks:

  • 8 - MODULE_BLOCK - This is the top-level block that
    contains the entire module, and describes a variety of per-module
    information.
  • 9 - PARAMATTR_BLOCK - This enumerates the parameter
    attributes.
  • 10 - TYPE_BLOCK - This describes all of the types in
    the module.
  • 11 - CONSTANTS_BLOCK - This describes constants for a
    module or function.
  • 12 - FUNCTION_BLOCK - This describes a function
    body.
  • 13 - TYPE_SYMTAB_BLOCK - This describes the type symbol
    table.
  • 14 - VALUE_SYMTAB_BLOCK - This describes a value symbol
    table.
  • 15 - METADATA_BLOCK - This describes metadata items.
  • 16 - METADATA_ATTACHMENT - This contains records associating metadata with function instruction values.
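A reader might keep the block IDs listed above in a small lookup table. The sketch below uses the IDs from the list; the enum name `BlockID` and the `block_name` helper, including its fallback string for unknown IDs, are illustrative assumptions.

```python
from enum import IntEnum


class BlockID(IntEnum):
    """Block IDs as listed in this document."""
    MODULE_BLOCK = 8
    PARAMATTR_BLOCK = 9
    TYPE_BLOCK = 10
    CONSTANTS_BLOCK = 11
    FUNCTION_BLOCK = 12
    TYPE_SYMTAB_BLOCK = 13
    VALUE_SYMTAB_BLOCK = 14
    METADATA_BLOCK = 15
    METADATA_ATTACHMENT = 16


def block_name(block_id):
    """Return a printable name, tolerating IDs this table does not know."""
    try:
        return BlockID(block_id).name
    except ValueError:
        return "UNKNOWN_BLOCK_%d" % block_id
```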
    756
    757
    758
    759
    760
    761
    762
    763
    764
    765

    The MODULE_BLOCK block (id 8) is the top-level block for LLVM
    bitcode files, and each bitcode file must contain exactly one. In
    addition to records (described below) containing information
    about the module, a MODULE_BLOCK block may contain the
    following sub-blocks:

  • BLOCKINFO
  • PARAMATTR_BLOCK
  • TYPE_BLOCK
  • TYPE_SYMTAB_BLOCK
  • VALUE_SYMTAB_BLOCK
  • CONSTANTS_BLOCK
  • FUNCTION_BLOCK
  • METADATA_BLOCK
    782
    783
    784
    785
    786
    787
    788
    789
    790
    791

    [VERSION, version#]

    792
    793

    The VERSION record (code 1) contains a single value
    indicating the format version. Only version 0 is supported at this
    time.

    796
    797
    798
    799
    800
    801
    802
    803

    [TRIPLE, ...string...]

    804
    805

    The TRIPLE record (code 2) contains a variable number of
    values representing the bytes of the target triple
    specification string.
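String-valued records such as TRIPLE carry one operand per byte of the string. A minimal sketch of the conversion is shown below; the helper names are hypothetical, and decoding as Latin-1 is an assumption chosen only because it maps every byte value to exactly one character.

```python
MODULE_CODE_TRIPLE = 2  # record code given in the text


def record_to_string(operands):
    """Turn a list of byte-valued operands back into a string."""
    return bytes(operands).decode("latin-1")


def string_to_operands(text):
    """Turn a string into the list of byte values a writer would emit."""
    return list(text.encode("latin-1"))
```

For instance, the operands 105, 54, 56, 54 spell "i686". The triple used in any test input is only an illustration, not a value taken from this document.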

    808
    809
    810
    811
    812
    813
    814
    815

    [DATALAYOUT, ...string...]

    816
    817

    The DATALAYOUT record (code 3) contains a variable number of
    values representing the bytes of the target datalayout
    specification string.

    820
    821
    822
    823
    824
    825
    826
    827

    [ASM, ...string...]

    828
    829

    The ASM record (code 4) contains a variable number of
    values representing the bytes of module asm strings, with
    individual assembly blocks separated by newline (ASCII 10) characters.

    832
    833
    834
    835
    836
    837
    838
    839

    [SECTIONNAME, ...string...]

    840
    841

    The SECTIONNAME record (code 5) contains a variable number
    of values representing the bytes of a single section name
    string. There should be one SECTIONNAME record for each
    section name referenced (e.g., in global variable or function
    section attributes) within the module. These records can be
    referenced by the 1-based index in the section fields of
    GLOBALVAR or FUNCTION records.

    848
    849
    850
    851
    852
    853
    854
    855

    [DEPLIB, ...string...]

    856
    857

    The DEPLIB record (code 6) contains a variable number of
    values representing the bytes of a single dependent library name
    string, one of the libraries mentioned in a deplibs
    declaration. There should be one DEPLIB record for each
    library name referenced.

    862
    863
    864
    865
    866
    867
    868
    869

    [GLOBALVAR, pointer type, isconst, initid, linkage, alignment, section, visibility, threadlocal]

    870
    871

    The GLOBALVAR record (code 7) marks the declaration or
    definition of a global variable. The operand fields are:

  • pointer type: The type index of the pointer type used to point to
    this global variable
  • isconst: Non-zero if the variable is treated as constant within
    the module, or zero if it is not
  • initid: If non-zero, the value index of the initializer for this
    variable, plus 1.
  • linkage: An encoding of the linkage
    type for this variable:
      • external: code 0
      • weak: code 1
      • appending: code 2
      • internal: code 3
      • linkonce: code 4
      • dllimport: code 5
      • dllexport: code 6
      • extern_weak: code 7
      • common: code 8
      • private: code 9
      • weak_odr: code 10
      • linkonce_odr: code 11
      • available_externally: code 12
      • linker_private: code 13
  • alignment: The logarithm base 2 of the variable's requested
    alignment, plus 1
  • section: If non-zero, the 1-based section index in the
    table of MODULE_CODE_SECTIONNAME
    entries.
  • visibility: If present, an
    encoding of the visibility of this variable:
      • default: code 0
      • hidden: code 1
      • protected: code 2
  • threadlocal: If present and non-zero, indicates that the variable
    is thread_local
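The GLOBALVAR operand encodings above (the "plus 1" offsets, the 1-based section index, and the optional trailing fields) can be decoded as sketched below. The function `decode_globalvar` is hypothetical. Treating an alignment operand of 0 as "no alignment requested" and a missing visibility as "default" are assumptions that follow from the "plus 1" and "if present" wording, not statements from the text.

```python
LINKAGE = {
    0: "external", 1: "weak", 2: "appending", 3: "internal", 4: "linkonce",
    5: "dllimport", 6: "dllexport", 7: "extern_weak", 8: "common",
    9: "private", 10: "weak_odr", 11: "linkonce_odr",
    12: "available_externally", 13: "linker_private",
}
VISIBILITY = {0: "default", 1: "hidden", 2: "protected"}


def decode_globalvar(ops, section_names):
    """Decode GLOBALVAR operands; section_names holds SECTIONNAME strings in order."""
    pointer_type, isconst, initid, linkage, alignment, section = ops[:6]
    visibility = ops[6] if len(ops) > 6 else 0    # optional field
    threadlocal = ops[7] if len(ops) > 7 else 0   # optional field
    return {
        "pointer_type": pointer_type,
        "isconst": isconst != 0,
        # initid stores the value index plus 1; 0 means no initializer
        "init_value_index": initid - 1 if initid else None,
        "linkage": LINKAGE[linkage],
        # alignment stores log2(alignment) + 1; 0 assumed to mean unspecified
        "alignment": (1 << (alignment - 1)) if alignment else None,
        # section is a 1-based index; 0 means no section
        "section": section_names[section - 1] if section else None,
        "visibility": VISIBILITY[visibility],
        "thread_local": threadlocal != 0,
    }
```

The "plus 1" convention lets a single operand express both "absent" (0) and every real value, which is why the decoder subtracts 1 only when the operand is non-zero.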
    922
    923
    924
    925
    926
    927
    928
    929
    930
    931
    932

    [FUNCTION, type, callingconv, isproto, linkage, paramattr, alignment, section, visibility, gc]

    933
    934

    The FUNCTION record (code 8) marks the declaration or
    definition of a function. The operand fields are:

  • type: The type index of the function type describing this function
  • callingconv: The calling convention number:
      • ccc: code 0
      • fastcc: code 8
      • coldcc: code 9
      • x86_stdcallcc: code 64
      • x86_fastcallcc: code 65
      • arm_apcscc: code 66
      • arm_aapcscc: code 67
      • arm_aapcs_vfpcc: code 68
  • isproto: Non-zero if this entry represents a declaration
    rather than a definition
  • linkage: An encoding of the linkage type
    for this function
  • paramattr: If nonzero, the 1-based parameter attribute index
    into the table of PARAMATTR_CODE_ENTRY
    entries.
  • alignment: The logarithm base 2 of the function's requested
    alignment, plus 1
  • section: If non-zero, the 1-based section index in the
    table of MODULE_CODE_SECTIONNAME
    entries.
  • visibility: An encoding of the visibility
    of this function
  • gc: If present and nonzero, the 1-based garbage collector
    index in the table of
    MODULE_CODE_GCNAME entries.
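The FUNCTION fields that differ from GLOBALVAR are the calling convention number, the isproto flag, and the optional gc index. A partial decoding sketch follows; `describe_function` is a hypothetical name, and printing unlisted calling convention numbers as "cc N" is an assumption made for illustration.

```python
CALLING_CONV = {
    0: "ccc", 8: "fastcc", 9: "coldcc", 64: "x86_stdcallcc",
    65: "x86_fastcallcc", 66: "arm_apcscc", 67: "arm_aapcscc",
    68: "arm_aapcs_vfpcc",
}


def describe_function(ops, gc_names):
    """Decode the type, callingconv, isproto and gc operands of a FUNCTION record."""
    type_index, callingconv, isproto = ops[0], ops[1], ops[2]
    gc = ops[9] if len(ops) > 9 else 0  # gc is the tenth operand and optional
    return {
        "type_index": type_index,
        "callingconv": CALLING_CONV.get(callingconv, "cc %d" % callingconv),
        "is_declaration": isproto != 0,
        # gc is a 1-based index into the GCNAME records; 0 means none
        "gc": gc_names[gc - 1] if gc else None,
    }
```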
    976
    977
    978
    979
    980
    981
    982
    983
    984
    985

    [ALIAS, alias type, aliasee val#, linkage, visibility]

    986
    987

    The ALIAS record (code 9) marks the definition of an
    alias. The operand fields are:

  • alias type: The type index of the alias
  • aliasee val#: The value index of the aliased value
  • linkage: An encoding of the linkage type
    for this alias
  • visibility: If present, an encoding of the
    visibility of the alias
    1000
    1001
    1002
    1003
    1004
    1005
    1006
    1007
    1008
    1009

    [PURGEVALS, numvals]

    1010
    1011

    The PURGEVALS record (code 10) resets the module-level
    value list to the size given by the single operand value. Module-level
    value list items are added by GLOBALVAR, FUNCTION,
    and ALIAS records. After a PURGEVALS record is seen,
    new value indices will start from the given numvals value.
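The effect of PURGEVALS on value numbering can be illustrated with a toy value list. This is only a sketch: the class `ModuleValueList` is hypothetical, and rejecting a numvals larger than the current list is an assumption, since the text does not say what a reader should do in that case.

```python
class ModuleValueList:
    """Toy model of the module-level value list."""

    def __init__(self):
        self.values = []

    def add(self, kind, name):
        """Append a value (from GLOBALVAR, FUNCTION or ALIAS) and return its index."""
        self.values.append((kind, name))
        return len(self.values) - 1

    def purge(self, numvals):
        """Model PURGEVALS: shrink the list so the next index is numvals."""
        if numvals > len(self.values):  # assumption: growing is not allowed
            raise ValueError("numvals exceeds the current value list size")
        del self.values[numvals:]
```

After `purge(1)` on a list of three values, the next value added receives index 1.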

    1016
    1017
    1018
    1019
    1020
    1021
    1022
    1023

    [GCNAME, ...string...]

    1024
    1025

    The GCNAME record (code 11) contains a variable number of
    values representing the bytes of a single garbage collector name
    string. There should be one GCNAME record for each garbage
    collector name referenced in function gc attributes within
    the module. These records can be referenced by 1-based index in the gc
    fields of FUNCTION records.

    1031
    1032
    1033
    1034
    1035
    1036
    1037
    1038
    1039

    The PARAMATTR_BLOCK block (id 9) ...

    1040

    1041
    1042
    1043
    1044
    1045
    1046
    1047
    1048
    1049
    1050
    1051

    [ENTRY, paramidx0, attr0, paramidx1, attr1...]

    1052
    1053

    The ENTRY record (code 1) ...

    1054

    1055
    1056
    1057
    1058
    1059
    1060
    1061
    1062
    1063

    The TYPE_BLOCK block (id 10) ...

    1064

    1065
    1066
    1067
    1068
    1069
    1070
    1071
    1072
    1073
    1074
    1075

    The CONSTANTS_BLOCK block (id 11) ...

    1076

    1077
    1078
    1079
    1080
    1081
    1082
    1083
    1084
    1085
    1086
    1087

    The FUNCTION_BLOCK block (id 12) ...

    1088

    1089
    1090

    In addition to the record types described below, a
    FUNCTION_BLOCK block may contain the following sub-blocks:

  • CONSTANTS_BLOCK
  • VALUE_SYMTAB_BLOCK
  • METADATA_ATTACHMENT
    1099
    1100
    1101
    1102
    1103
    1104
    1105
    1106
    1107
    1108
    1109

    The TYPE_SYMTAB_BLOCK block (id 13) ...

    1110

    1111
    1112
    1113
    1114
    1115
    1116
    1117
    1118
    1119
    1120
    1121

    The VALUE_SYMTAB_BLOCK block (id 14) ...

    1122

    1123
    1124
    1125
    1126
    1127
    1128
    1129
    1130
    1131
    1132
    1133

    The METADATA_BLOCK block (id 15) ...

    1134

    1135
    1136
    1137
    1138
    1139
    1140
    1141
    1142
    1143
    1144
    1145

    The METADATA_ATTACHMENT block (id 16) ...

    1146

    1147
    1148
    1149
    1150
    1151
    1152
    1153
    1157 Chris Lattner
    1158 The LLVM Compiler Infrastructure
    1159 Last modified: $Date$
    1160
    1161
    1162
    +0
    -250
    docs/Bugpoint.html
    None
    2
    3
    4 LLVM bugpoint tool: design and usage
    5
    6
    7
    8
    9 LLVM bugpoint tool: design and usage
    10
    11
    12
    13
  • Description
  • Design Philosophy
      • Automatic Debugger Selection
      • Crash debugger
      • Code generator debugger
      • Miscompilation debugger
  • Advice for using bugpoint
    23
    24
    25

    Written by Chris Lattner

    26
    27
    28
    29
    30 Description
    31
    32
    33
    34
    35
    36

    bugpoint narrows down the source of problems in LLVM tools and
    passes. It can be used to debug three types of failures: optimizer crashes,
    miscompilations by optimizers, or bad native code generation (including problems
    in the static and JIT compilers). It aims to reduce large test cases to small,
    useful ones. For example, if opt crashes while optimizing a
    file, it will identify the optimization (or combination of optimizations) that
    causes the crash, and reduce the file down to a small example which triggers the
    crash.

    For detailed case scenarios, such as debugging opt,
    llvm-ld, or one of the LLVM code generators, see the
    How To Submit a Bug Report document (HowToSubmitABug.html).

    48
    49
    50
    51
    52
    53 Design Philosophy
    54
    55
    56
    57
    58
    59

    bugpoint is designed to be a useful tool without requiring any
    hooks into the LLVM infrastructure at all. It works with any and all LLVM
    passes and code generators, and does not need to "know" how they work. Because
    of this, it may appear to do stupid things or miss obvious
    simplifications. bugpoint is also designed to trade off programmer
    time for computer time in the compiler-debugging process; consequently, it may
    take a long period of (unattended) time to reduce a test case, but we feel it
    is still worth it. Note that bugpoint is generally very quick unless
    debugging a miscompilation where each test of the program (which requires
    executing it) takes a long time.

    69
    70
    71
    72
    73
    74 Automatic Debugger Selection
    75
    76
    77
    78
    79

    bugpoint reads each .bc or .ll file specified on
    the command line and links them together into a single module, called the test
    program. If any LLVM passes are specified on the command line, it runs these
    passes on the test program. If any of the passes crash, or if they produce
    malformed output (which causes the verifier to abort), bugpoint starts
    the crash debugger.

    Otherwise, if the -output option was not specified,
    bugpoint runs the test program with the C backend (which is assumed to
    generate good code) to generate a reference output. Once bugpoint has
    a reference output for the test program, it tries executing it with the
    selected code generator. If the selected code generator crashes,
    bugpoint starts the crash debugger on the
    code generator. Otherwise, if the resulting output differs from the reference
    output, it assumes the difference resulted from a code generator failure, and
    starts the code generator debugger.

    Finally, if the output of the selected code generator matches the reference
    output, bugpoint runs the test program after all of the LLVM passes
    have been applied to it. If its output differs from the reference output, it
    assumes the difference resulted from a failure in one of the LLVM passes, and
    enters the miscompilation debugger.
    Otherwise, there is no problem bugpoint can debug.
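The selection order described above amounts to a short decision chain. The sketch below only restates that order; it is not bugpoint's code, and the function name, parameter names, and returned labels are hypothetical.

```python
def select_debugger(passes_crash, codegen_crashes,
                    codegen_output, optimized_output, reference_output):
    """Return which debugger the described procedure would start, or None.

    passes_crash also covers passes that produce malformed output.
    """
    if passes_crash:
        return "crash debugger (passes)"
    if codegen_crashes:
        return "crash debugger (code generator)"
    if codegen_output != reference_output:
        return "code generator debugger"
    if optimized_output != reference_output:
        return "miscompilation debugger"
    return None  # nothing bugpoint can debug
```

The order matters: a code generator mismatch is checked before the optimized program is run, so the miscompilation debugger only starts once the code generator has reproduced the reference output.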

    102
    103
    104
    105
    106
    107 Crash debugger
    108
    109
    110
    111
    112

    If an optimizer or code generator crashes, bugpoint will try as hard
    as it can to reduce the list of passes (for optimizer crashes) and the size of
    the test program. First, bugpoint figures out which combination of
    optimizer passes triggers the bug. This is useful when debugging a problem
    exposed by opt, for example, because it runs over 38 passes.

    Next, bugpoint tries removing functions from the test program, to
    reduce its size. Usually it is able to reduce a test program to a single
    function, when debugging intraprocedural optimizations. Once the number of
    functions has been reduced, it attempts to delete various edges in the control
    flow graph, to reduce the size of the function as much as possible. Finally,
    bugpoint deletes any individual LLVM instructions whose absence does
    not eliminate the failure. At the end, bugpoint should tell you what
    passes crash, give you a bitcode file, and give you instructions on how to
    reproduce the failure with opt or llc.
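The general idea of shrinking a list while the failure persists can be sketched with a simple greedy loop. This is not bugpoint's actual algorithm, which the text does not specify. `reduce_items` and the `still_fails` callback are hypothetical, and the pass names in any example input are only illustrative strings.

```python
def reduce_items(items, still_fails):
    """Greedily drop items one at a time while still_fails(candidate) stays true."""
    current = list(items)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                current = candidate   # item i was not needed to fail
                changed = True
                break                 # restart the scan on the shorter list
    return current
```

The same loop shape applies to passes, functions, or instructions: only the meaning of `still_fails` changes. Each call to it stands for a full run of the tool under test, which is why reduction trades computer time for programmer time.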

    127
    128
    129
    130
    131
    132 Code generator debugger
    133
    134
    135
    136
    137

    The code generator debugger attempts to narrow down the amount of code that
    is being miscompiled by the selected code generator. To do this, it takes the
    test program and partitions it into two pieces: one piece which it compiles
    with the C backend (into a shared object), and one piece which it runs with
    either the JIT or the static LLC compiler. It uses several techniques to
    reduce the amount of code pushed through the LLVM code generator, to reduce the
    potential scope of the problem. After it is finished, it emits two bitcode
    files (called "test" [to be compiled with the code generator] and "safe" [to be
    compiled with the C backend], respectively), and instructions for reproducing
    the problem. The code generator debugger assumes that the C backend produces
    good code.

    148
    149
    150
    151
    152
    153 Miscompilation debugger
    154
    155
    156
    157
    158

    The miscompilation debugger works similarly to the code generator debugger.
    It works by splitting the test program into two pieces, running the
    optimizations specified on one piece, linking the two pieces back together, and
    then executing the result. It attempts to narrow down the list of passes to
    the one (or few) which are causing the miscompilation, then reduce the portion
    of the test program which is being miscompiled. The miscompilation debugger
    assumes that the selected code generator is working properly.

    165
    166
    167
    168
    169
    170 Advice for using bugpoint
    171
    172
    173
    174
    175
    bugpoint can be a remarkably useful tool, but it sometimes works in
    non-obvious ways. Here are some hints and tips:

  • In the code generator and miscompilation debuggers, bugpoint only
    works with programs that have deterministic output. Thus, if the program
    outputs argv[0], the date, time, or any other "random" data,
    bugpoint may misinterpret differences in these data, when output,
    as the result of a miscompilation. Programs should be temporarily modified
    to disable outputs that are likely to vary from run to run.
  • In the code generator and miscompilation debuggers, debugging will go
    faster if you manually modify the program or its inputs to reduce the
    runtime while still exhibiting the problem.
  • bugpoint is extremely useful when working on a new optimization:
    it helps track down regressions quickly. To avoid having to relink
    bugpoint every time you change your optimization, however, have
    bugpoint dynamically load your optimization with the
    -load option.
  • bugpoint can generate a lot of output and run for a long period
    of time. It is often useful to capture the output of the program to a file.
    For example, in the C shell, you can run:

        bugpoint ... |& tee bugpoint.log

    to get a copy of bugpoint's output in the file
    bugpoint.log, as well as on your terminal.
  • bugpoint cannot debug problems with the LLVM linker. If
    bugpoint crashes before you see its "All input ok" message,
    you might try llvm-link -v on the same set of input files. If
    that also crashes, you may be experiencing a linker bug.
  • bugpoint is useful for proactively finding bugs in LLVM.
    Invoking bugpoint with the -find-bugs option will cause
    the list of specified optimizations to be randomized and applied to the
    program. This process will repeat until a bug is found or the user
    kills bugpoint.
  • bugpoint does not understand the -O option
    that is used to specify the optimization level to opt. You
    can use, e.g.,

        opt -O2 -debug-pass=Arguments foo.bc -disable-output

    to get a list of passes that are used with -O2 and
    then pass this list to bugpoint.

    229
    230
    231
    232
    233
    234
    235
    236
    237
    238
    242
    243 Chris Lattner
    244 LLVM Compiler Infrastructure
    245 Last modified: $Date$
    246
    247
    248
    249
    +0
    -29
    docs/CFEBuildInstrs.html
    None
    2
    3
    4
    5
    6 Building the LLVM C/C++ Front-End
    7
    8
    9
    10
    11 This page has moved here.
    12
    13
    14
    15
    16
    17
    18
    22
    23 LLVM Compiler Infrastructure
    24 Last modified: $Date: 2008-02-13 17:46:10 +0100 (Wed, 13 Feb 2008) $
    25
    26
    27
    28
    +0
    -384
    docs/CMake.html
    None
    2
    3
    4 Building LLVM with CMake
    5
    6
    7
    8
    9 Building LLVM with CMake
    10
    11
    12
    13
  • Introduction
  • Quick start
  • Basic CMake usage
  • Options and variables
      • Frequently-used CMake variables
      • LLVM-specific variables
  • Executing the test suite
  • Cross compiling
  • Embedding LLVM in your project
  • Compiler/Platform specific topics
      • Microsoft Visual C++
    28
    29
    30
    31

    Written by Oscar Fuentes

    32
    33
    34
    35
    36 Introduction
    37
    38
    39
    40
    41
    42

    CMake is a cross-platform
    build-generator tool. CMake does not build the project; it generates
    the files needed by your build tool (GNU make, Visual Studio, etc.) for
    building LLVM.

    If you just want a working LLVM build as quickly as possible,
    go to the Quick start section. If you
    are a CMake novice, start with Basic CMake
    usage and then go back to the Quick
    start once you know what you are
    doing. The Options and variables section
    is a reference for customizing your build. If you already have
    experience with CMake, this is the recommended starting point.
    55
    56
    57
    58
    59 Quick start
    60
    61
    62
    63
    64
    65

    Here we use the command-line, non-interactive CMake interface.

  • Download
    and install CMake. Version 2.6.2 is the minimum required.
  • Open a shell. Your development tools must be reachable from this
    shell through the PATH environment variable.
  • Create a directory to contain the build. Building LLVM in the
    source directory is not supported. cd to this
    directory:

        mkdir mybuilddir
        cd mybuilddir

  • Execute this command in the shell,
    replacing path/to/llvm/source/root with the path to the
    root of your LLVM source tree:

        cmake path/to/llvm/source/root

    CMake will detect your development environment, perform a
    series of tests and generate the files required for building
    LLVM. CMake will use default values for all build
    parameters. See the Options and variables
    section for fine-tuning your build.

    This can fail if CMake can't detect your toolset, or if it
    thinks that the environment is not sane enough. In this case,
    make sure that the toolset that you intend to use is the only
    one reachable from the shell and that the shell itself is the
    correct one for your development environment. CMake will refuse
    to build MinGW makefiles if you have a POSIX shell reachable
    through the PATH environment variable, for instance. You can
    force CMake to use a given build tool; see
    the Usage section.

    105
    106
    107
    108
    109
    110
    111
    112 Basic CMake usage
    113
    114
    115
    116
    117
    118

    This section explains basic aspects of CMake, mainly the
    options you may need in day-to-day
    usage.

    CMake comes with extensive documentation in the form of HTML
    files and in the cmake executable itself. Execute cmake
    --help for further help options.

    CMake needs to know which build tool it should generate
    files for (GNU make, Visual Studio, Xcode, etc.). If none is specified on
    the command line, it tries to guess based on your
    environment. Once the build tool is identified, CMake uses the
    corresponding Generator to create files for your build
    tool. You can explicitly specify the generator with the command
    line option -G "Name of the generator". To see the
    available generators on your platform, execute

        cmake --help

    This will list the generator names at the end of the help
    text. Generator names are case-sensitive. Example:

        cmake -G "Visual Studio 8 2005" path/to/llvm/source/root

    For a given development platform there can be more than one
    adequate generator. If you use Visual Studio, "NMake Makefiles"
    is a generator you can use for building with NMake. By default,
    CMake chooses the most specific generator supported by your
    development environment. If you want an alternative generator,
    you must tell CMake so with the -G option.

    TODO: explain variables and cache. Move explanation here from
    #options section.

    155
    156
    157
    158
    159
    160 Options and variables
    161
    162
    163
    164
    165
    166

    Variables customize how the build will be generated. Options are
    boolean variables, with possible values ON/OFF. Options and
    variables are defined on the CMake command line like this:

        cmake -DVARIABLE=value path/to/llvm/source

    You can set a variable again after the initial CMake invocation to
    change its value. You can also undefine a variable:

        cmake -UVARIABLE path/to/llvm/source

    Variables are stored in the CMake cache. This is a file
    named CMakeCache.txt in the root of the build
    directory. Do not hand-edit it.

    Variables are listed here with their type appended after a colon. It is
    valid to write both the variable and the type on the CMake command
    line:

        cmake -DVARIABLE:TYPE=value path/to/llvm/source
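When a script drives CMake, the -D, -U and -G forms shown above can be assembled programmatically. The sketch below only builds the argument list and does not run anything. The helper names `cmake_define` and `cmake_command` are hypothetical, and mapping Python booleans to ON/OFF is a convenience assumed here.

```python
def cmake_define(name, value, var_type=None):
    """Format one -DVARIABLE=value or -DVARIABLE:TYPE=value argument."""
    if isinstance(value, bool):
        value = "ON" if value else "OFF"   # options are ON/OFF variables
    key = "%s:%s" % (name, var_type) if var_type else name
    return "-D%s=%s" % (key, value)


def cmake_command(source_dir, defines=(), undefines=(), generator=None):
    """Build an argument list for cmake.

    defines is a sequence of (name, value) or (name, value, type) tuples.
    """
    args = ["cmake"]
    if generator:
        args += ["-G", generator]
    args += [cmake_define(*d) for d in defines]
    args += ["-U" + name for name in undefines]
    args.append(source_dir)
    return args
```

Passing the result as a list to a process launcher avoids shell quoting problems with generator names that contain spaces, such as "NMake Makefiles".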

    191
    192
    193
    194
    195
    196
    197 Frequently-used CMake variables
    198
    199
    200
    201
    202

    Listed here are some of the CMake variables that are used often,
    along with a brief explanation and LLVM-specific notes. For full
    documentation, check the CMake docs or execute cmake
    --help-variable VARIABLE_NAME.

    CMAKE_BUILD_TYPE:STRING
        Sets the build type for make-based generators. Possible
        values are Release, Debug, RelWithDebInfo and MinSizeRel. On
        systems like Visual Studio the user sets the build type with the IDE
        settings.

    CMAKE_INSTALL_PREFIX:PATH
        Path where LLVM will be installed if "make install" is invoked
        or the "INSTALL" target is built.

    LLVM_LIBDIR_SUFFIX:STRING
        Extra suffix to append to the directory where libraries are to
        be installed. On a 64-bit architecture, one could use
        -DLLVM_LIBDIR_SUFFIX=64 to install libraries to /usr/lib64.

    CMAKE_C_FLAGS:STRING
        Extra flags to use when compiling C source files.

    CMAKE_CXX_FLAGS:STRING
        Extra flags to use when compiling C++ source files.

    BUILD_SHARED_LIBS:BOOL
        Flag indicating whether shared libraries will be built. Its default
        value is OFF. Shared libraries are not supported on Windows and
        not recommended on other OSes.
    234
    235
    236
    237
    238
    239
    240 LLVM-specific variables
    241
    242
    243
    244
    245
    246
    LLVM_TARGETS_TO_BUILD:STRING
        Semicolon-separated list of targets to build, or all for
        building all targets. Case-sensitive. For Visual C++ it defaults
        to X86. In other cases it defaults to all. Example:
        -DLLVM_TARGETS_TO_BUILD="X86;PowerPC;Alpha".

    LLVM_BUILD_TOOLS:BOOL
        Build LLVM tools. Defaults to ON. Targets for building each tool
        are generated in any case. You can build a tool separately by
        invoking its target. For example, you can build llvm-as
        with a makefile-based system by executing make llvm-as in the
        root of your build directory.

    LLVM_BUILD_EXAMPLES:BOOL
        Build LLVM examples. Defaults to OFF. Targets for building each
        example are generated in any case. See the documentation
        for LLVM_BUILD_TOOLS above for more details.

    LLVM_ENABLE_THREADS:BOOL
        Build with threads support, if available. Defaults to ON.

    LLVM_ENABLE_ASSERTIONS:BOOL
        Enables code assertions. Defaults to OFF if and only if
        CMAKE_BUILD_TYPE is Release.

    LLVM_ENABLE_PIC:BOOL
        Add the -fPIC flag to the compiler command line, if the
        compiler supports this flag. Some systems, like Windows, do not
        need this flag. Defaults to ON.

    LLVM_ENABLE_WARNINGS:BOOL
        Enable all compiler warnings. Defaults to ON.

    LLVM_ENABLE_PEDANTIC:BOOL
        Enable pedantic mode. This disables compiler-specific extensions, if
        possible. Defaults to ON.

    LLVM_ENABLE_WERROR:BOOL
        Stop and fail the build if a compiler warning is
        triggered. Defaults to OFF.

    LLVM_BUILD_32_BITS:BOOL
        Build 32-bit executables and libraries on 64-bit systems. This
        option is available only on some 64-bit Unix systems. Defaults to
        OFF.

    LLVM_TARGET_ARCH:STRING
        LLVM target to use for native code generation. This is required
        for JIT generation. It defaults to "host", meaning that it will
        pick the architecture of the machine where LLVM is being built. If
        you are cross-compiling, set it to the target architecture
        name.

    LLVM_TABLEGEN:STRING
        Full path to a native TableGen executable (usually
        named tblgen). This is intended for cross-compiling: if the
        user sets this variable, no native TableGen will be created.
    303
    304
    305
    306
    307
    308
    309 Executing the test suite
    310
    311
    312
    313
    314
    315

    LLVM testing is not supported on Visual Studio.

    316
    317

    TODO

    318
    319
    320
    321
    322
    323 Cross compiling
    324
    325
    326
    327
    328
    329

    See this

    330 wiki page for generic instructions on how to cross-compile
    331 with CMake. It goes into detailed explanations and may seem
    332 daunting, but it is not. On the wiki page there are several
    333 examples including toolchain files. Go directly to
    334 this
    335 section for a quick solution.

    336
    337

    Also see the LLVM-specific variables

    338 section for variables used when cross-compiling.

    339
    340
    341
    342
    343
    344 Embedding LLVM in your project
    345
    346
    347
    348
    349
    350

    TODO

    351
    352
    353
    354
    355
    356
    357
    358 Compiler/Platform specific topics
    359
    360
    361
    362
    363
    364

    Notes for specific compilers and/or platforms.

    365
    366
    367
    368
    369
    370
    371
    372
    376
    377 Oscar Fuentes
    378 LLVM Compiler Infrastructure
    379 Last modified: $Date: 2008-12-31 03:59:36 +0100 (Wed, 31 Dec 2008) $
    380
    381
    382
    383
    docs/CodeGenerator.html (+0, -2169)
    2
    3
    4
    5 The LLVM Target-Independent Code Generator
    6
    7
    8
    9
    10
    11 The LLVM Target-Independent Code Generator
    12
    13
    14
    15
  • Introduction
  • 16
    17
  • Required components in the code generator
  • 18
  • The high-level design of the code
  • 19 generator
    20
  • Using TableGen for target description
  • 21
    22
    23
  • Target description classes
  • 24
    25
  • The TargetMachine class
  • 26
  • The TargetData class
  • 27
  • The TargetLowering class
  • 28
  • The TargetRegisterInfo class
  • 29
  • The TargetInstrInfo class
  • 30
  • The TargetFrameInfo class
  • 31
  • The TargetSubtarget class
  • 32
  • The TargetJITInfo class
  • 33
    34
    35
  • Machine code description classes
  • 36
    37
  • The MachineInstr class
  • 38
  • The MachineBasicBlock
  • 39 class
    40
  • The MachineFunction class
  • 41
    42
    43
  • Target-independent code generation algorithms
  • 44
    45
  • Instruction Selection
  • 46
    47
  • Introduction to SelectionDAGs
  • 48
  • SelectionDAG Code Generation
  • 49 Process
    50
  • Initial SelectionDAG
  • 51 Construction
    52
  • SelectionDAG LegalizeTypes Phase
  • 53
  • SelectionDAG Legalize Phase
  • 54
  • SelectionDAG Optimization
  • 55 Phase: the DAG Combiner
    56
  • SelectionDAG Select Phase
  • 57
  • SelectionDAG Scheduling and Formation
  • 58 Phase
    59
  • Future directions for the
  • 60 SelectionDAG
    61
    62
  • Live Intervals
  • 63
    64
  • Live Variable Analysis
  • 65
  • Live Intervals Analysis
  • 66
    67
  • Register Allocation
  • 68
    69
  • How registers are represented in
  • 70 LLVM
    71
  • Mapping virtual registers to physical
  • 72 registers
    73
  • Handling two address instructions
  • 74
  • The SSA deconstruction phase
  • 75
  • Instruction folding
  • 76
  • Built in register allocators
  • 77
    78
  • Code Emission
  • 79
    80
  • Generating Assembly Code
  • 81
  • Generating Binary Machine Code
  • 82
    83
    84
    85
  • Target-specific Implementation Notes
  • 86
    87
  • Tail call optimization
  • 88
  • Sibling call optimization
  • 89
  • The X86 backend
  • 90
  • The PowerPC backend
  • 91
    92
  • LLVM PowerPC ABI
  • 93
  • Frame Layout
  • 94
  • Prolog/Epilog
  • 95
  • Dynamic Allocation
  • 96
    97
    98
    99
    100
    101
    102

    Written by Chris Lattner,

    103 Bill Wendling,
    104 Fernando Magno Quintao
    105 Pereira and
    106 Jim Laskey

    107
    108
    109
    110

    Warning: This is a work in progress.

    111
    112
    113
    114
    115 Introduction
    116
    117
    118
    119
    120
    121

    The LLVM target-independent code generator is a framework that provides a

    122 suite of reusable components for translating the LLVM internal representation
    123 to the machine code for a specified target—either in assembly form
    124 (suitable for a static compiler) or in binary machine code format (usable for
    125 a JIT compiler). The LLVM target-independent code generator consists of five
    126 main components:

    127
    128
    129
  • Abstract target description interfaces which
  • 130 capture important properties about various aspects of the machine,
    131 independently of how they will be used. These interfaces are defined in
    132 include/llvm/Target/.
    133
    134
  • Classes used to represent the machine code
  • 135 being generated for a target. These classes are intended to be abstract
    136 enough to represent the machine code for any target machine. These
    137 classes are defined in include/llvm/CodeGen/.
    138
    139
  • Target-independent algorithms used to implement
  • 140 various phases of native code generation (register allocation, scheduling,
    141 stack frame representation, etc). This code lives
    142 in lib/CodeGen/.
    143
    144
  • Implementations of the abstract target description
  • 145 interfaces for particular targets. These machine descriptions make
    146 use of the components provided by LLVM, and can optionally provide custom
    147 target-specific passes, to build complete code generators for a specific
    148 target. Target descriptions live in lib/Target/.
    149
    150
  • The target-independent JIT components. The LLVM JIT is
  • 151 completely target independent (it uses the TargetJITInfo
    152 structure to interface for target-specific issues). The code for the
    153 target-independent JIT lives in lib/ExecutionEngine/JIT.
    154
    155
    156

    Depending on which part of the code generator you are interested in working

    157 on, different pieces of this will be useful to you. In any case, you should
    158 be familiar with the target description
    159 and machine code representation classes. If you
    160 want to add a backend for a new target, you will need
    161 to implement the target description classes for
    162 your new target and understand the LLVM code
    163 representation. If you are interested in implementing a
    164 new code generation algorithm, it should only
    165 depend on the target-description and machine code representation classes,
    166 ensuring that it is portable.

    167
    168
    169
    170
    171
    172 Required components in the code generator
    173
    174
    175
    176
    177

    The two pieces of the LLVM code generator are the high-level interface to the

    178 code generator and the set of reusable components that can be used to build
    179 target-specific backends. The two most important interfaces
    180 (TargetMachine
    181 and TargetData) are the only ones that are
    182 required to be defined for a backend to fit into the LLVM system, but the
    183 others must be defined if the reusable code generator components are going to
    184 be used.

    185
    186

    This design has two important implications. The first is that LLVM can

    187 support completely non-traditional code generation targets. For example, the
    188 C backend does not require register allocation, instruction selection, or any
    189 of the other standard components provided by the system. As such, it only
    190 implements these two interfaces, and does its own thing. Another example of
    191 a code generator like this is a (purely hypothetical) backend that converts
    192 LLVM to the GCC RTL form and uses GCC to emit machine code for a target.

    193
    194

    This design also implies that it is possible to design and implement

    195 radically different code generators in the LLVM system that do not make use
    196 of any of the built-in components. Doing so is not recommended at all, but
    197 could be required for radically different targets that do not fit into the
    198 LLVM machine description model: FPGAs for example.

    199
    200
    201
    202
    203
    204 The high-level design of the code generator
    205
    206
    207
    208
    209

    The LLVM target-independent code generator is designed to support efficient

    210 and high-quality code generation for standard register-based microprocessors.
    211 Code generation in this model is divided into the following stages:

    212
    213
    214
  • Instruction Selection — This phase
  • 215 determines an efficient way to express the input LLVM code in the target
    216 instruction set. This stage produces the initial code for the program in
    217 the target instruction set, then makes use of virtual registers in SSA
    218 form and physical registers that represent any required register
    219 assignments due to target constraints or calling conventions. This step
    220 turns the LLVM code into a DAG of target instructions.
    221
    222
  • Scheduling and Formation
  • 223 This phase takes the DAG of target instructions produced by the
    224 instruction selection phase, determines an ordering of the instructions,
    225 then emits the instructions
    226 as MachineInstrs with that ordering.
    227 Note that we describe this in the instruction
    228 selection section because it operates on
    229 a SelectionDAG.
    230
    231
  • SSA-based Machine Code Optimizations
  • 232 This optional stage consists of a series of machine-code optimizations
    233 that operate on the SSA-form produced by the instruction selector.
    234 Optimizations like modulo-scheduling or peephole optimization work
    235 here.
    236
    237
  • Register Allocation — The target code
  • 238 is transformed from an infinite virtual register file in SSA form to the
    239 concrete register file used by the target. This phase introduces spill
    240 code and eliminates all virtual register references from the program.
    241
    242
  • Prolog/Epilog Code Insertion — Once
  • 243 the machine code has been generated for the function and the amount of
    244 stack space required is known (used for LLVM alloca's and spill slots),
    245 the prolog and epilog code for the function can be inserted and "abstract
    246 stack location references" can be eliminated. This stage is responsible
    247 for implementing optimizations like frame-pointer elimination and stack
    248 packing.
    249
    250
  • Late Machine Code Optimizations
  • 251 Optimizations that operate on "final" machine code can go here, such as
    252 spill code scheduling and peephole optimizations.
    253
    254
  • Code Emission — The final stage
  • 255 actually puts out the code for the current function, either in the target
    256 assembler format or in machine code.
    257
    258
    259

    The code generator is based on the assumption that the instruction selector

    260 will use an optimal pattern matching selector to create high-quality
    261 sequences of native instructions. Alternative code generator designs based
    262 on pattern expansion and aggressive iterative peephole optimization are much
    263 slower. This design permits efficient compilation (important for JIT
    264 environments) and aggressive optimization (used when generating code offline)
    265 by allowing components of varying levels of sophistication to be used for any
    266 step of compilation.

    267
    268

    In addition to these stages, target implementations can insert arbitrary

    269 target-specific passes into the flow. For example, the X86 target uses a
    270 special pass to handle the 80x87 floating point stack architecture. Other
    271 targets with unusual requirements can be supported with custom passes as
    272 needed.

    273
    274
    275
    276
    277
    278 Using TableGen for target description
    279
    280
    281
    282
    283

    The target description classes require a detailed description of the target

    284 architecture. These target descriptions often have a large amount of common
    285 information (e.g., an add instruction is almost identical to a
    286 sub instruction). In order to allow the maximum amount of
    287 commonality to be factored out, the LLVM code generator uses
    288 the TableGen tool to describe big
    289 chunks of the target machine, which allows the use of domain-specific and
    290 target-specific abstractions to reduce the amount of repetition.

    291
    292

    As LLVM continues to be developed and refined, we plan to move more and more

    293 of the target description to the .td form. Doing so gives us a
    294 number of advantages. The most important is that it makes it easier to port
    295 LLVM because it reduces the amount of C++ code that has to be written, and
    296 the surface area of the code generator that needs to be understood before
    297 someone can get something working. Second, it makes it easier to change
    298 things. In particular, if tables and other things are all emitted
    299 by tblgen, we only need a change in one place (tblgen) to
    300 update all of the targets to a new interface.

    301
    302
    303
    304
    305
    306 Target description classes
    307
    308
    309
    310
    311
    312

    The LLVM target description classes (located in the

    313 include/llvm/Target directory) provide an abstract description of
    314 the target machine independent of any particular client. These classes are
    315 designed to capture the abstract properties of the target (such as the
    316 instructions and registers it has), and do not incorporate any particular
    317 pieces of code generation algorithms.

    318
    319

    All of the target description classes (except the

    320 TargetData class) are designed to be
    321 subclassed by the concrete target implementation, and have virtual methods
    322 implemented. To get to these implementations, the
    323 TargetMachine class provides accessors
    324 that should be implemented by the target.

    325
    326
    327
    328
    329
    330 The TargetMachine class
    331
    332
    333
    334
    335

    The TargetMachine class provides virtual methods that are used to

    336 access the target-specific implementations of the various target description
    337 classes via the get*Info methods (getInstrInfo,
    338 getRegisterInfo, getFrameInfo, etc.). This class is
    339 designed to be specialized by a concrete target implementation
    340 (e.g., X86TargetMachine) which implements the various virtual
    341 methods. The only required target description class is
    342 the TargetData class, but if the code
    343 generator components are to be used, the other interfaces should be
    344 implemented as well.

    345
    346
    347
    348
    349
    350 The TargetData class
    351
    352
    353
    354
    355

    The TargetData class is the only required target description class,

    356 and it is the only class that is not extensible (you cannot derive a new
    357 class from it). TargetData specifies information about how the
    358 target lays out memory for structures, the alignment requirements for various
    359 data types, the size of pointers in the target, and whether the target is
    360 little-endian or big-endian.
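
    To make the kind of information TargetData carries more concrete, here is a minimal, self-contained sketch of how per-type sizes and alignments could determine a structure's memory layout. It is not LLVM's implementation: FieldInfo, StructLayoutSketch, alignTo, and layOutStruct are hypothetical names invented for this illustration, and the padding rule shown is the common C-style one, assumed here rather than taken from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one struct field (not an LLVM type):
// its size and required alignment, both in bytes.
struct FieldInfo {
  std::size_t Size;
  std::size_t Align; // must be a power of two
};

// Round Offset up to the next multiple of Align.
inline std::size_t alignTo(std::size_t Offset, std::size_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "alignment must be a power of two");
  return (Offset + Align - 1) & ~(Align - 1);
}

// Result of laying out a struct: one offset per field plus totals.
struct StructLayoutSketch {
  std::vector<std::size_t> Offsets; // byte offset of each field
  std::size_t Size = 0;             // total size, including tail padding
  std::size_t Align = 1;            // alignment of the struct as a whole
};

// Place each field at the next offset that satisfies its alignment,
// then pad the total size to the largest field alignment.
inline StructLayoutSketch layOutStruct(const std::vector<FieldInfo> &Fields) {
  StructLayoutSketch Layout;
  std::size_t Offset = 0;
  for (const FieldInfo &F : Fields) {
    Offset = alignTo(Offset, F.Align);
    Layout.Offsets.push_back(Offset);
    Offset += F.Size;
    if (F.Align > Layout.Align)
      Layout.Align = F.Align;
  }
  Layout.Size = alignTo(Offset, Layout.Align);
  return Layout;
}
```

    With fields of (size, alignment) (1, 1), (4, 4), and (2, 2), this sketch places the fields at offsets 0, 4, and 8 and pads the structure to 12 bytes. A target with different alignment requirements would produce a different layout from the same field list, which is why this information has to come from the target description.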

    361
    362
    363
    364
    365
    366 The TargetLowering class
    367
    368
    369
    370
    371

    The TargetLowering class is used by SelectionDAG based instruction

    372 selectors primarily to describe how LLVM code should be lowered to
    373 SelectionDAG operations. Among other things, this class indicates:

    374
    375
    376
  • an initial register class to use for various ValueTypes,
  • 377
    378
  • which operations are natively supported by the target machine,
  • 379
    380
  • the return type of setcc operations,
  • 381
    382
  • the type to use for shift amounts, and
  • 383
    384
  • various high-level characteristics, like whether it is profitable to turn
  • 385 division by a constant into a multiplication sequence
    386
    387
    388
    389
    390
    391
    392 The TargetRegisterInfo class
    393
    394
    395
    396
    397

    The TargetRegisterInfo class is used to describe the register file

    398 of the target and any interactions between the registers.

    399
    400

    Registers are represented in the code generator by

    401 unsigned integers. Physical registers (those that actually exist in the
    402 target description) are unique small numbers, and virtual registers are
    403 generally large. Note that register #0 is reserved as a flag value.
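
    The numbering convention above can be sketched as a pair of predicates. This is an illustration only: the namespace, the function names, and in particular the boundary value 1024 are assumptions of this sketch (1024 is chosen to match the reg1024 names that appear in the machine code dumps later in this document), not a statement of the actual interface.

```cpp
#include <cassert>

namespace sketch {

// Register #0 is reserved as a flag value and never names a register.
const unsigned NoRegister = 0;

// Assumed boundary between physical and virtual register numbers.
const unsigned FirstVirtualRegister = 1024;

// Physical registers are the small, unique numbers below the boundary.
inline bool isPhysicalRegister(unsigned Reg) {
  assert(Reg != NoRegister && "register #0 is a flag value");
  return Reg < FirstVirtualRegister;
}

// Virtual registers are the large numbers at or above the boundary.
inline bool isVirtualRegister(unsigned Reg) {
  assert(Reg != NoRegister && "register #0 is a flag value");
  return Reg >= FirstVirtualRegister;
}

} // namespace sketch
```

    Encoding both kinds of register in one unsigned integer keeps operands compact and lets a single comparison tell them apart.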

    404
    405

    Each register in the processor description has an associated

    406 TargetRegisterDesc entry, which provides a textual name for the
    407 register (used for assembly output and debugging dumps) and a set of aliases
    408 (used to indicate whether one register overlaps with another).

    409
    410

    In addition to the per-register description, the TargetRegisterInfo

    411 class exposes a set of processor specific register classes (instances of the
    412 TargetRegisterClass class). Each register class contains sets of
    413 registers that have the same properties (for example, they are all 32-bit
    414 integer registers). Each SSA virtual register created by the instruction
    415 selector has an associated register class. When the register allocator runs,
    416 it replaces virtual registers with a physical register in the set.
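
    As a rough illustration of how a register class constrains allocation, the sketch below picks a physical register for a virtual register from the members of its class. RegisterClassSketch and pickFreeRegister are hypothetical names, and the "first free member" policy is a deliberate simplification; real allocators weigh live ranges, interference, and spill costs.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for a register class: an ordered list of the
// physical register numbers that share the same properties.
struct RegisterClassSketch {
  std::vector<unsigned> Regs;
};

// Return the first member of RC that is not already in use, or 0
// (the reserved flag value) if every member is taken, in which case
// a real allocator would have to spill.
inline unsigned pickFreeRegister(const RegisterClassSketch &RC,
                                 const std::set<unsigned> &InUse) {
  for (unsigned Reg : RC.Regs) {
    assert(Reg != 0 && "register #0 is a flag value, not a class member");
    if (InUse.count(Reg) == 0)
      return Reg;
  }
  return 0;
}
```

    Because the candidate set comes from the class attached to the virtual register, the allocator never needs target-specific knowledge about which registers are legal for a given value.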

    417
    418

    The target-specific implementations of these classes are auto-generated from

    419 a TableGen description of the
    420 register file.

    421
    422
    423
    424
    425
    426 The TargetInstrInfo class
    427
    428
    429
    430
    431

    The TargetInstrInfo class is used to describe the machine

    432 instructions supported by the target. It is essentially an array of
    433 TargetInstrDescriptor objects, each of which describes one
    434 instruction the target supports. Descriptors define things like the mnemonic
    435 for the opcode, the number of operands, the list of implicit register uses
    436 and defs, whether the instruction has certain target-independent properties
    437 (accesses memory, is commutable, etc.), and hold any target-specific
    438 flags.

    439
    440
    441
    442
    443
    444 The TargetFrameInfo class
    445
    446
    447
    448
    449

    The TargetFrameInfo class is used to provide information about the

    450 stack frame layout of the target. It holds the direction of stack growth, the
    451 known stack alignment on entry to each function, and the offset to the local
    452 area. The offset to the local area is the offset from the stack pointer on
    453 function entry to the first location where function data (local variables,
    454 spill locations) can be stored.

    455
    456
    457
    458
    459
    460 The TargetSubtarget class
    461
    462
    463
    464
    465

    The TargetSubtarget class is used to provide information about the

    466 specific chip set being targeted. A sub-target informs code generation of
    467 which instructions are supported, instruction latencies and instruction
    468 execution itinerary; i.e., which processing units are used, in what order,
    469 and for how long.

    470
    471
    472
    473
    474
    475
    476 The TargetJITInfo class
    477
    478
    479
    480
    481

    The TargetJITInfo class exposes an abstract interface used by the

    482 Just-In-Time code generator to perform target-specific activities, such as
    483 emitting stubs. If a TargetMachine supports JIT code generation, it
    484 should provide one of these objects through the getJITInfo
    485 method.

    486
    487
    488
    489
    490
    491 Machine code description classes
    492
    493
    494
    495
    496
    497

    At a high level, LLVM code is translated to a machine-specific

    498 representation formed out of
    499 MachineFunction,
    500 MachineBasicBlock,
    501 and MachineInstr instances (defined
    502 in include/llvm/CodeGen). This representation is completely target
    503 agnostic, representing instructions in their most abstract form: an opcode
    504 and a series of operands. This representation is designed to support both an
    505 SSA representation for machine code, as well as a register allocated, non-SSA
    506 form.

    507
    508
    509
    510
    511
    512 The MachineInstr class
    513
    514
    515
    516
    517

    Target machine instructions are represented as instances of the

    518 MachineInstr class. This class is an extremely abstract way of
    519 representing machine instructions. In particular, it only keeps track of an
    520 opcode number and a set of operands.

    521
    522

    The opcode number is a simple unsigned integer that only has meaning to a

    523 specific backend. All of the instructions for a target should be defined in
    524 the *InstrInfo.td file for the target. The opcode enum values are
    525 auto-generated from this description. The MachineInstr class does
    526 not have any information about how to interpret the instruction (i.e., what
    527 the semantics of the instruction are); for that you must refer to the
    528 TargetInstrInfo class.

    529
    530

    The operands of a machine instruction can be of several different types: a

    531 register reference, a constant integer, a basic block reference, etc. In
    532 addition, a machine operand should be marked as a def or a use of the value
    533 (though only registers are allowed to be defs).

    534
    535

    By convention, the LLVM code generator orders instruction operands so that

    536 all register definitions come before the register uses, even on architectures
    537 that are normally printed in other orders. For example, the SPARC add
    538 instruction: "add %i1, %i2, %i3" adds the "%i1", and "%i2" registers
    539 and stores the result into the "%i3" register. In the LLVM code generator,
    540 the operands should be stored as "%i3, %i1, %i2": with the
    541 destination first.

    542
    543

    Keeping destination (definition) operands at the beginning of the operand

    544 list has several advantages. In particular, the debugging printer will print
    545 the instruction like this:

    546
    547
    548
    
                      
                    
    %i3 = add %i1, %i2
    550
    551
    552
    553

    Also if the first operand is a def, it is easier to create

    554 instructions whose only def is the first operand.
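
    The definitions-first convention can be illustrated with a toy instruction type and printer. This is a sketch under stated assumptions, not the MachineInstr class: OperandSketch, InstrSketch, and printInstr are hypothetical names, and operands are plain strings for readability.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical operand: a register name plus a def/use marker.
struct OperandSketch {
  std::string Name; // e.g. "%i3"
  bool IsDef;       // true for a definition, false for a use
};

// Hypothetical instruction: an opcode and its operands, stored with
// all definitions before all uses.
struct InstrSketch {
  std::string Opcode;
  std::vector<OperandSketch> Operands;
};

// Print an instruction as "defs = opcode uses". Because definitions
// are stored first, the printer only needs one pass over the operands.
inline std::string printInstr(const InstrSketch &MI) {
  std::string Defs, Uses;
  bool SeenUse = false;
  for (const OperandSketch &Op : MI.Operands) {
    if (Op.IsDef) {
      assert(!SeenUse && "definitions must precede uses");
      if (!Defs.empty())
        Defs += ", ";
      Defs += Op.Name;
    } else {
      SeenUse = true;
      if (!Uses.empty())
        Uses += ", ";
      Uses += Op.Name;
    }
  }
  std::string Out;
  if (!Defs.empty())
    Out = Defs + " = ";
  Out += MI.Opcode;
  if (!Uses.empty())
    Out += " " + Uses;
  return Out;
}
```

    Storing the SPARC add from the text as "%i3, %i1, %i2" with the first operand marked as a def makes this printer produce the assignment-style form, independent of the order in which the target's assembler would print the operands.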

    555
    556
    557
    558
    559
    560 Using the MachineInstrBuilder.h functions
    561
    562
    563
    564
    565

    Machine instructions are created by using the BuildMI functions,

    566 located in the include/llvm/CodeGen/MachineInstrBuilder.h file. The
    567 BuildMI functions make it easy to build arbitrary machine
    568 instructions. Usage of the BuildMI functions looks like this:

    569
    570
    571
    
                      
                    
    // Create a 'DestReg = mov 42' (rendered in X86 assembly as 'mov DestReg, 42')
    // instruction. The '1' specifies how many operands will be added.
    MachineInstr *MI = BuildMI(X86::MOV32ri, 1, DestReg).addImm(42);

    // Create the same instr, but insert it at the end of a basic block.
    MachineBasicBlock &MBB = ...
    BuildMI(MBB, X86::MOV32ri, 1, DestReg).addImm(42);

    // Create the same instr, but insert it before a specified iterator point.
    MachineBasicBlock::iterator MBBI = ...
    BuildMI(MBB, MBBI, X86::MOV32ri, 1, DestReg).addImm(42);

    // Create a 'cmp Reg, 0' instruction, no destination reg.
    MI = BuildMI(X86::CMP32ri, 2).addReg(Reg).addImm(0);
    // Create an 'sahf' instruction which takes no operands and stores nothing.
    MI = BuildMI(X86::SAHF, 0);

    // Create a self looping branch instruction.
    BuildMI(MBB, X86::JNE, 1).addMBB(&MBB);
    591
    592
    593
    594

    The key thing to remember with the BuildMI functions is that you

    595 have to specify the number of operands that the machine instruction will
    596 take. This allows for efficient memory allocation. Also note that
    597 operands default to being uses of values, not definitions. If you need to
    598 add a definition operand (other than the optional destination register), you
    599 must explicitly mark it as such:

    600
    601
    602
    
                      
                    
    MI.addReg(Reg, RegState::Define);
    604
    605
    606
    607
    608
    609
    610
    611 Fixed (preassigned) registers
    612
    613
    614
    615
    616

    One important issue that the code generator needs to be aware of is the

    617 presence of fixed registers. In particular, there are often places in the
    618 instruction stream where the register allocator must arrange for a
    619 particular value to be in a particular register. This can occur due to
    620 limitations of the instruction set (e.g., the X86 can only do a 32-bit divide
    621 with the EAX/EDX registers), or external factors like
    622 calling conventions. In any case, the instruction selector should emit code
    623 that copies a virtual register into or out of a physical register when
    624 needed.

    625
    626

    For example, consider this simple LLVM example:

    627
    628
    629
    
                      
                    
    define i32 @test(i32 %X, i32 %Y) {
      %Z = udiv i32 %X, %Y
      ret i32 %Z
    }
    634
    635
    636
    637

    The X86 instruction selector produces this machine code for the div

    638 and ret (use "llc X.bc -march=x86 -print-machineinstrs" to
    639 get this):

    640
    641
    642
    
                      
                    
    ;; Start of div
    %EAX = mov %reg1024           ;; Copy X (in reg1024) into EAX
    %reg1027 = sar %reg1024, 31
    %EDX = mov %reg1027           ;; Sign extend X into EDX
    idiv %reg1025                 ;; Divide by Y (in reg1025)
    %reg1026 = mov %EAX           ;; Read the result (Z) out of EAX

    ;; Start of ret
    %EAX = mov %reg1026           ;; 32-bit return value goes in EAX
    ret
    653
    654
    655
    656

    By the end of code generation, the register allocator has coalesced the

    657 registers and deleted the resultant identity moves, producing the following
    658 code:

    659
    660
    661
    
                      
                    
    ;; X is in EAX, Y is in ECX
    mov %EAX, %EDX
    sar %EDX, 31
    idiv %ECX
    ret
    667
    668
    669
    670

    This approach is extremely general (if it can handle the X86 architecture, it

    671 can handle anything!) and allows all of the target specific knowledge about
    672 the instruction stream to be isolated in the instruction selector. Note that
    673 physical registers should have a short lifetime for good code generation, and
    674 all physical registers are assumed dead on entry to and exit from basic
    675 blocks (before register allocation). Thus, if you need a value to be live
    676 across basic block boundaries, it must live in a virtual
    677 register.
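
    The cleanup step described above, where copies between a virtual register and a fixed physical register disappear once both sides end up in the same place, can be sketched as follows. This is a simplification under stated assumptions: CopySketch and dropIdentityCopies are hypothetical names, registers are plain strings, and the assignment map is taken as given, whereas a real coalescer must first prove that merging two registers is safe.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical register-to-register copy: Dst = mov Src.
struct CopySketch {
  std::string Dst;
  std::string Src;
};

// Rewrite every copy through the virtual-to-physical assignment and
// drop the ones that become identity moves (Dst == Src).
inline std::vector<CopySketch>
dropIdentityCopies(const std::vector<CopySketch> &Copies,
                   const std::map<std::string, std::string> &Assignment) {
  // Registers missing from the map (already physical) map to themselves.
  auto Resolve = [&Assignment](const std::string &Reg) -> std::string {
    std::map<std::string, std::string>::const_iterator It =
        Assignment.find(Reg);
    return It == Assignment.end() ? Reg : It->second;
  };

  std::vector<CopySketch> Kept;
  for (const CopySketch &C : Copies) {
    assert(!C.Dst.empty() && !C.Src.empty() && "copy needs both operands");
    CopySketch Rewritten = {Resolve(C.Dst), Resolve(C.Src)};
    if (Rewritten.Dst != Rewritten.Src)
      Kept.push_back(Rewritten);
  }
  return Kept;
}
```

    If %reg1024 and %reg1026 from the example are both assigned to %EAX, the copies into and out of %EAX all become identity moves and are removed. Only copies whose two sides land in different physical registers survive.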

    678
    679
    680
    681
    682
    683 Machine code in SSA form
    684
    685
    686
    687
    688

    MachineInstr's are initially selected in SSA-form, and are

    689 maintained in SSA-form until register allocation happens. For the most part,
    690 this is trivially simple since LLVM is already in SSA form; LLVM PHI nodes
    691 become machine code PHI nodes, and virtual registers are only allowed to have
    692 a single definition.
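
    The single-definition property is easy to state as a check. The sketch below is illustrative only: hasSingleDefinitions is a hypothetical helper that takes the list of virtual registers defined by a function's instructions, in order, and reports whether any register is defined more than once.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Return true if every virtual register in DefinedRegs is defined
// exactly once, i.e. the list is consistent with SSA form.
inline bool hasSingleDefinitions(const std::vector<unsigned> &DefinedRegs) {
  std::set<unsigned> Seen;
  for (unsigned Reg : DefinedRegs) {
    assert(Reg != 0 && "register #0 is a flag value, not a register");
    // insert() reports whether the register was new; a repeat means
    // a second definition of the same virtual register.
    if (!Seen.insert(Reg).second)
      return false;
  }
  return true;
}
```

    A definition list such as 1024, 1025, 1026 passes this check, while 1024, 1025, 1024 does not. Physical registers are not subject to this restriction, which is one reason they are kept out of the list in this sketch.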

    693
    694