llvm.org GIT mirror llvm / 607faa5
Merge from mainline. git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_24@58967 91177308-0d34-0410-b5e6-96231b3b80d8 Tanya Lattner 10 years ago
1 changed file(s) with 535 addition(s) and 480 deletion(s).
1
21 "http://www.w3.org/TR/html4/strict.dtd">
32
43
54
65
7 LLVM 2.<del>3</del> Release Notes
6 LLVM 2.<ins>4</ins> Release Notes
87
98
109
11
LLVM 2.3 Release Notes
12
10
LLVM 2.4 Release Notes
11
1312
1413
  • Introduction
  • 15
  • Major Changes and Sub-project Status
  • 16
  • What's New?
  • 14
  • Sub-project Status Update
  • 15
  • What's New in LLVM?
  • 1716
  • Installation Instructions
  • 1817
  • Portability and Supported Platforms
  • 19
  • Known Problems
  • 18
  • Known Problems
  • 2019
  • Additional Information
  • 2120
    2221
    2322
    24

    Written by the LLVM Team

    25
    26
    27
    3027
    3431
    3532
    3633
    37

    This document contains the release notes for the LLVM compiler

    38 infrastructure, release 2.3. Here we describe the status of LLVM, including
    39 major improvements from the previous release and any known problems. All LLVM
    40 releases may be downloaded from the LLVM
    41 releases web site.

    34

    This document contains the release notes for the LLVM Compiler

    35 Infrastructure, release 2.4. Here we describe the status of LLVM, including
    36 major improvements from the previous release and significant known problems.
    37 All LLVM releases may be downloaded from the
    38 LLVM releases web site (http://llvm.org/releases/).

    4239
    4340

    For more information about LLVM, including information about the latest

    4441 release, please check out the main LLVM
    4542 web site. If you have questions or comments, the
    46 LLVM developer's mailing
    47 list (http://mail.cs.uiuc.edu/mailman/listinfo/llvmdev) is a good place to send them.

    48
    49

    Note that if you are reading this file from a Subversion checkout or the

    43 LLVM Developer's Mailing
    44 List (http://mail.cs.uiuc.edu/mailman/listinfo/llvmdev) is a good place to send them.

    45
    46

    Note that if you are reading this file from a Subversion checkout or the

    5047 main LLVM web page, this document applies to the next release, not the
    51 current one. To see the release notes for a specific releases, please see the
    48 current one. To see the release notes for a specific release, please see the
    5249 releases page.

    5350
    5451
    5552
    56
    57
    58 Major Changes and Sub-project Status
    59
    60
    61
    62
    63
    64

    This is the fourteenth public release of the LLVM Compiler Infrastructure.

    65 It includes a large number of features and refinements from LLVM 2.2.

    66
    67
    68
    69
    7460
    75
    76
    77 Major Changes in LLVM 2.3
    78
    79
    80
    81
    82

    LLVM 2.3 no longer supports llvm-gcc 4.0, it has been replaced with

    83 llvm-gcc 4.2.

    84
    85

    LLVM 2.3 no longer includes the llvm-upgrade tool. It was useful

    86 for upgrading LLVM 1.9 files to LLVM 2.x syntax, but you can always use a
    87 previous LLVM release to do this. One nice impact of this is that the LLVM
    88 regression test suite no longer depends on llvm-upgrade, which makes it run
    89 faster.

    90
    91

    The llvm2cpp tool has been folded into llc, use

    92 llc -march=cpp instead of llvm2cpp.

    93
    94

    LLVM API Changes:

    95
    96
    97
  • Several core LLVM IR classes have migrated to use the
  • 98 'FOOCLASS::Create(...)' pattern instead of 'new
    99 FOOCLASS(...)' (e.g. where FOOCLASS=BasicBlock). We hope to
    100 standardize on FOOCLASS::Create for all IR classes in the future,
    101 but not all of them have been moved over yet.
    102
  • LLVM 2.3 renames the LLVMBuilder and LLVMFoldingBuilder classes to
  • 103 IRBuilder.
    104
    105
  • MRegisterInfo was renamed to
  • 106
    107 TargetRegisterInfo.
    108
  • The MappedFile class is gone, please use
  • 109
    110 MemoryBuffer instead.
    111
  • The '-enable-eh' flag to llc has been removed. Now code should
  • 112 encode whether it is safe to omit unwind information for a function by
    113 tagging the Function object with the 'nounwind' attribute.
    114
  • The ConstantFP::get method that uses APFloat now takes one argument
  • 115 instead of two. The type argument has been removed, and the type is
    116 now inferred from the size of the given APFloat value.
    117
    118
    119
    120
    121
    122
    123 Other LLVM Sub-Projects
    124
    61
    65
    66
    67
    68 Sub-project Status Update
    69
    70
    12571
    12672
    12773

    128 The core LLVM 2.3 distribution currently consists of code from the core LLVM
    129 repository (which roughly contains the LLVM optimizer, code generators and
    74 The LLVM 2.4 distribution currently consists of code from the core LLVM
    75 repository (which roughly includes the LLVM optimizers, code generators and
    13076 supporting tools) and the llvm-gcc repository. In addition to this code, the
    13177 LLVM Project includes other sub-projects that are in development. The two which
    132 are the most actively developed are the new vmkit Project
    133 and the Clang Project.
    78 are the most actively developed are the Clang Project and
    79 the VMKit Project.
    13480

    135
    136
    137
    138
    139 vmkit
    140
    141
    142
    143

    144 The "vmkit" project is a new addition to the LLVM family. It is an
    145 implementation of a JVM and a CLI virtual machine (Microsoft .NET is an
    146 implementation of the CLI) using the Just-In-Time compiler of LLVM.

    147
    148

    The JVM, called JnJVM, executes real-world applications such as Apache

    149 projects (e.g. Felix and Tomcat) and the SpecJVM98 benchmark. It uses the GNU
    150 Classpath project for the base classes. The CLI implementation, called N3, is
    151 in its early stages but can execute simple applications and the "pnetmark"
    152 benchmark. It uses the pnetlib project as its core library.

    153
    154

    The 'vmkit' VMs compare in performance with industrial and top open-source

    155 VMs on scientific applications. Besides the JIT, the VMs use many features of
    156 the LLVM framework, including the standard set of optimizations, atomic
    157 operations, custom function provider and memory manager for JITed methods, and
    158 specific virtual machine optimizations. vmkit is not an official part of LLVM
    159 2.3 release. It is publicly available under the LLVM license and can be
    160 downloaded from:
    161

    162
    163
    164
    svn co http://llvm.org/svn/llvm-project/vmkit/trunk vmkit
    165
    166
    167
    168
    169
    170
    171 Clang
    81
    82
    83
    84
    85
    86
    87 Clang: C/C++/Objective-C Frontend Toolkit
    17288
    17389
    17490
    18197 yet production quality, it is progressing very nicely. In addition, C++
    18298 front-end work has started to make significant progress.

    18399
    184

    At this point, Clang is most useful if you are interested in source-to-source

    185 transformations (such as refactoring) and other source-level tools for C and
    186 Objective-C. Clang now also includes tools for turning C code into pretty HTML,
    187 and includes a new static
    188 analysis tool in development. This tool focuses on automatically finding
    189 bugs in C and Objective-C code.

    190
    100

    Clang, in conjunction with the ccc driver, is now usable as a

    101 replacement for gcc for building some small- to medium-sized C applications.
    102 Additionally, Clang now has code generation support for Objective-C on the Mac OS X
    103 platform. Major highlights include:

    104
    105
    106
  • Clang/ccc pass almost all of the LLVM test suite on Mac OS X and Linux
  • 107 on the 32-bit x86 architecture. This includes significant C
    108 applications such as sqlite3,
    109 lua, and
    110 Clam AntiVirus.
    111
    112
  • Clang can build the majority of Objective-C examples shipped with the
  • 113 Mac OS X Developer Tools.
    114
    115
    116

    Clang code generation still needs considerable testing and development,

    117 however. Some areas under active development include:

    118
    119
    120
  • Improved support for C and Objective-C features, for example
  • 121 variable-length arrays, va_arg, exception handling (Obj-C), and garbage
    122 collection (Obj-C).
    123
  • ABI compatibility, especially for platforms other than 32-bit
  • 124 x86.
    125
    126
    127
    128
    129
    130
    131 Clang Static Analyzer
    132
    133
    134
    135
    136

    The Clang project also includes an early stage static source code analysis

    137 tool for automatically
    138 finding bugs in C and Objective-C programs. The tool performs a growing set
    139 of checks to find bugs that occur on a specific path within a program. Examples
    140 of bugs the tool finds include logic errors such as null dereferences,
    141 violations of various API rules, dead code, and potential memory leaks in
    142 Objective-C programs. Since its inception, public feedback on the tool has been
    143 extremely positive, and conservative estimates put the number of real bugs it
    144 has found in industrial-quality software on the order of thousands.

    145
    146

    The tool also provides a simple web GUI to inspect potential bugs found by

    147 the tool. While still early in development, the GUI illustrates some of the key
    148 features of Clang: accurate source location information, which is used by the
    149 GUI to highlight specific code expressions that relate to a bug (including those
    150 that span multiple lines); and built-in knowledge of macros, which is used to
    151 perform inline expansion of macros within the GUI itself.

    152
    153

    The set of checks performed by the static analyzer is gradually expanding,

    154 and future plans for the tool include full source-level inter-procedural
    155 analysis and deeper checks such as buffer overrun detection. There are many
    156 opportunities to extend and enhance the static analyzer, and anyone interested
    157 in working on this project is encouraged to get involved!

    158
    159
    160
    161
    162
    163 VMKit: JVM/CLI Virtual Machine Implementation
    164
    165
    166
    167

    168 The VMKit project is an implementation of
    169 a JVM and a CLI virtual machine (Microsoft .NET is an
    170 implementation of the CLI) using the Just-In-Time compiler of LLVM.

    171
    172

    Following LLVM 2.4, VMKit has made its first release, 0.24, which is available on its

    173 webpage. The release includes
    174 bug fixes, cleanup and new features. The major changes are:

    175
    176
    177
    178
  • Support for generics in the .Net virtual machine.
  • 179
  • Initial support for the Mono class libraries.
  • 180
  • Support for MacOSX/x86, following LLVM's support for exceptions in
  • 181 JIT on MacOSX/x86.
    182
  • A new vmkit driver: a program to run java or .net applications. The driver
  • 183 supports llvm command line arguments including the new "-fast" option.
    184
  • A new memory allocation scheme in the JVM that makes unloading a
  • 185 class loader very fast.
    186
  • VMKit now follows the LLVM Makefile machinery.
  • 187
    188
    191189
    192190
    193191
    194192
    195193
    196 What's New?
    197
    198
    199
    200
    201
    202

    LLVM 2.3 includes a huge number of bug fixes, performance tweaks and minor

    203 improvements. Some of the major improvements and new features are listed in
    204 this section.
    194 What's New in LLVM?
    195
    196
    197
    198
    199
    200

    This release includes a huge number of bug fixes, performance tweaks, and

    201 minor improvements. Some of the major improvements and new features are listed
    202 in this section.
    205203

    206204
    207205
    212210
    213211
    214212
    215

    LLVM 2.3 includes several major new capabilities:

    216
    217
    218
  • The biggest change in LLVM 2.3 is Multiple Return Value (MRV) support.

  • 219 MRVs allow LLVM IR to directly represent functions that return multiple
    220 values without having to pass them "by reference" in the LLVM IR. This
    221 allows a front-end to generate more efficient code, as MRVs are generally
    222 returned in registers if a target supports them. See the
    223 LLVM IR Reference (LangRef.html#i_getresult) for more details.

    213

    LLVM 2.4 includes several major new capabilities:

    214
    215
    216
  • The most visible end-user change in LLVM 2.4 is that it includes many

  • 217 optimizations and changes to make -O0 compile times much faster. You should see
    218 speed improvements on the order of 30% (or more) over LLVM 2.3. There are
    219 many pieces to this change described in more detail below. The speedups and new
    220 components can also be used for JIT compilers that want fast
    221 compilation.

    222
    223
  • The biggest change to the LLVM IR is that Multiple Return Values (which

  • 224 were introduced in LLVM 2.3) have been generalized to full support for "First
    225 Class Aggregate" values in LLVM 2.4. This means that LLVM IR supports using
    226 structs and arrays as values in a function. This capability is mostly useful
    227 for front-end authors, who prefer to treat things like complex numbers, simple
    228 tuples, dope vectors, etc., as Value*'s instead of as a tuple of Value*'s or as
    229 memory values. Bitcode files from LLVM 2.3 will automatically migrate to the
    230 general representation.

    231
    232
  • LLVM 2.4 also includes an initial port for the PIC16 microprocessor. This

  • 233 target only has support for 8-bit registers, and a number of other crazy
    234 constraints. While the port is still in early development stages, it shows some
    235 interesting things you can do with LLVM.

    236
    237
    238
    239
    240
    241
    242
    243
    244 llvm-gcc 4.2 Improvements
    245
    246
    247
    248
    249

    LLVM fully supports the llvm-gcc 4.2 front-end, which marries the GCC

    250 front-ends and driver with the LLVM optimizer and code generator. It currently
    251 includes support for the C, C++, Objective-C, Ada, and Fortran front-ends.

    252
    253
    254
  • LLVM 2.4 supports the full set of atomic __sync_* builtins. LLVM
  • 255 2.3 only supported those used by OpenMP, but 2.4 supports them all. Note that
    256 while llvm-gcc supports all of these builtins, not all targets do. X86 supports
    257 them all in both 32-bit and 64-bit mode and PowerPC supports them all except for
    258 the 64-bit operations when in 32-bit mode.
    259
    260
  • llvm-gcc now supports an -flimited-precision option, which tells
  • 261 the compiler that it is okay to use low-precision approximations of certain libm
    262 functions (like exp, log, etc). This allows you to get high
    263 performance if you only need (say) 12 bits of precision.
    264
    265
  • llvm-gcc now supports a C language extension known as
  • 266 "Blocks" (http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-August/002670.html).
    267 This feature is similar to nested functions and closures, but does not
    268 require stack trampolines (with most ABIs), and supports returning closures
    269 from functions that define them. Note that actually using Blocks
    270 requires a small runtime that is not included with llvm-gcc.
    271
    272
  • llvm-gcc now supports a new -flto option. On systems that support
  • 273 transparent Link Time Optimization (currently Darwin systems with Xcode 3.1 and
    274 later) this allows the use of LTO with other optimization levels like -Os.
    275 Previously, LTO could only be used with -O4, which implied optimizations in
    276 -O3 that can increase code size.
    277
    278
    279
    280
    281
    282
    283
    284 LLVM Core Improvements
    285
    286
    287
    288

    New features include:

    289
    290
    291
  • A major change to the Use class landed, which shrank it by 25%. Since
  • 292 this is a pervasive part of LLVM, it ended up reducing the memory use of
    293 LLVM IR in general by 15% for most programs.
    294
    295
  • Values with no names are now pretty printed by llvm-dis more
  • 296 nicely. They now print as "%3 = add i32 %A, 4" instead of
    297 "add i32 %A, 4 ; <i32>:3", which makes it much easier to read.
    298
    299
    300
  • LLVM 2.4 includes some changes for better vector support. First, the shift
  • 301 operations (shl, ashr, and lshr) now all support
    302 vectors and do an element-by-element shift (shifts of the whole vector can be
    303 accomplished by bitcasting the vector to <1 x i128>, for example). Second,
    304 there is initial support in development for vector comparisons with the
    305 fcmp/icmp
    306 instructions. These instructions compare two vectors and return a vector of
    307 i1's for each result. Note that there is very little codegen support
    308 available for any of these IR features though.
    309
    310
  • A new DebugInfoBuilder class is available, which makes it much
  • 311 easier for front-ends to create debug info descriptors, similar to the way that
    312 IRBuilder makes it easier to create LLVM IR.
    313
    314
  • The IRBuilder class is now parameterized by a class responsible
  • 315 for constant folding. The default ConstantFolder class does target independent
    316 constant folding. The NoFolder class does no constant folding at all, which is
    317 useful when learning how LLVM works. The TargetFolder class folds the most,
    318 doing target dependent constant folding.
    319
    320
  • LLVM now supports "function attributes", which allow us to separate return
  • 321 value attributes from function attributes. LLVM now supports attributes on a
    322 function itself, a return value, and its parameters. New supported function
    323 attributes include noinline/alwaysinline and the opt-size flag,
    324 which says the function should be optimized for code size.
    325
    326
  • LLVM IR now directly represents "common" linkage, instead of
  • 327 representing it as a form of weak linkage.
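
The idea of parameterizing IRBuilder by a constant-folding class, described in the list above, can be sketched as a policy template. Everything in the sketch is a hypothetical stand-in (Expr, SimpleFolder, NeverFold, Builder); none of it is the LLVM API. It only shows how swapping the policy type changes whether an add of two constants is folded or emitted as written.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an IR value; not an LLVM class.
struct Expr {
  bool isConstant;
  int value;         // meaningful only when isConstant is true
  std::string text;  // printable form of the expression
};

Expr constant(int v) { return {true, v, std::to_string(v)}; }

// Folding policy: evaluate an add when both operands are constants.
struct SimpleFolder {
  static bool foldAdd(const Expr &l, const Expr &r, Expr &out) {
    if (!l.isConstant || !r.isConstant) return false;
    out = constant(l.value + r.value);
    return true;
  }
};

// Non-folding policy: never fold, so every instruction stays visible.
struct NeverFold {
  static bool foldAdd(const Expr &, const Expr &, Expr &) { return false; }
};

// Builder parameterized by its folding policy.
template <typename Folder = SimpleFolder>
struct Builder {
  Expr createAdd(const Expr &l, const Expr &r) const {
    Expr folded{};
    if (Folder::foldAdd(l, r, folded)) {
      assert(folded.isConstant);  // a successful fold yields a constant
      return folded;
    }
    return {false, 0, "add " + l.text + ", " + r.text};
  }
};
```

With the default policy, adding the constants 2 and 3 yields the constant 5; with NeverFold the same call yields the unevaluated text "add 2, 3".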
    224328
    225

    MRVs are fully supported in the LLVM IR, but are not yet fully supported

    226 on all targets. However, it is generally safe to return up to 2 values from
    227 a function: most targets should be able to handle at least that. MRV
    228 support is a critical requirement for X86-64 ABI support, as X86-64 requires
    229 the ability to return multiple registers from functions, and we use MRVs to
    230 accomplish this in a direct way.

    231
    232
  • LLVM 2.3 includes a complete reimplementation of the "llvmc"

  • 233 tool. It is designed to overcome several problems with the original
    234 llvmc and to provide a superset of the features of the
    235 'gcc' driver.

    236
    237

    The main features of llvmc2 are:

    238
    239
  • Extended handling of command line options and smart rules for
  • 240 dispatching them to different tools.
    241
  • Flexible (and extensible) rules for defining different tools.
  • 242
  • The different intermediate steps performed by tools are represented
  • 243 as edges in the abstract graph.
    244
  • The 'language' for driver behavior definition is tablegen and thus
  • 245 it's relatively easy to add new features.
    246
  • The definition of driver is transformed into set of C++ classes, thus
  • 247 no runtime interpretation is needed.
    248
    249
    250
    251
  • LLVM 2.3 includes a completely rewritten interface for

  • 252 Link Time Optimization (LinkTimeOptimization.html). This interface
    253 is written in C, which allows for easier integration with C code bases, and
    254 incorporates improvements we learned about from the first incarnation of the
    255 interface.

    256
    257
  • The Kaleidoscope tutorial now

  • 258 includes a "port" of the tutorial that
    259 uses the Ocaml bindings (tutorial/OCamlLangImpl1.html) to implement
    260 the Kaleidoscope language.

    261
    262
    263
    264
    265
    266
    267
    268
    269 llvm-gcc 4.2 Improvements
    270
    271
    272
    273
    274

    LLVM 2.3 fully supports the llvm-gcc 4.2 front-end, and includes support

    275 for the C, C++, Objective-C, Ada, and Fortran front-ends.

    276
    277

    278
    279
  • llvm-gcc 4.2 includes numerous fixes to better support the Objective-C
  • 280 front-end. Objective-C now works very well on Mac OS/X.
    281
    282
  • Fortran EQUIVALENCEs are now supported by the gfortran
  • 283 front-end.
    284
    285
  • llvm-gcc 4.2 includes many other fixes which improve conformance with the
  • 286 relevant parts of the GCC testsuite.
    287
    288
    289
    290
    291
    292
    293
    294
    295 LLVM Core Improvements
    329
    330
    331
    332
    333
    334
    335 Optimizer Improvements
    336
    337
    338
    339
    340

    In addition to a huge array of bug fixes and minor performance tweaks, this

    341 release includes a few major enhancements and additions to the optimizers:

    342
    343
    344
    345
  • The Global Value Numbering (GVN) pass now does local Partial Redundancy
  • 346 Elimination (PRE) to eliminate some partially redundant expressions in cases
    347 where doing so won't grow code size.
    348
    349
  • LLVM 2.4 includes a new loop deletion pass (which removes output-free
  • 350 provably-finite loops) and a rewritten Aggressive Dead Code Elimination (ADCE)
    351 pass that no longer uses control dependence information. These changes speed up
    352 the optimizer and also prevent it from deleting output-free infinite
    353 loops.
    354
    355
  • The new AddReadAttrs pass works out which functions are read-only or
  • 356 read-none (these correspond to 'pure' and 'const' in GCC) and marks them
    357 with the appropriate attribute.
    358
    359
  • LLVM 2.4 now includes a new SparsePropagation framework, which makes it
  • 360 trivial to build lattice-based dataflow solvers that operate over LLVM IR. Using
    361 this interface means that you just define objects to represent your lattice
    362 values and the transfer functions that operate on them. It handles the
    363 mechanics of worklist processing, liveness tracking, handling PHI nodes,
    364 etc.
    365
    366
  • The Loop Strength Reduction and induction variable optimization passes have
  • 367 several improvements to avoid inserting MAX expressions, to optimize simple
    368 floating point induction variables and to analyze trip counts of more
    369 loops.
    370
    371
  • Various helper functions (ComputeMaskedBits, ComputeNumSignBits, etc) were
  • 372 pulled out of the Instruction Combining pass and put into a new
    373 ValueTracking.h header, where they can be reused by other passes.
    374
    375
  • The tail duplication pass has been removed from the standard optimizer
  • 376 sequence used by llvm-gcc. This pass still exists, but the benefits it once
    377 provided are now achieved by other passes.
    378
    379
    380
    381
    382
    383
    384
    385 Code Generator Improvements
    386
    387
    388
    389
    390

    We have put a significant amount of work into the code generator infrastructure,

    391 which allows us to implement more aggressive algorithms and make it run
    392 faster:

    393
    394
    395
  • The target-independent code generator supports (and the X86 backend
  • 396 currently implements) a new interface for "fast" instruction selection. This
    397 interface is optimized to produce code as quickly as possible, sacrificing
    398 code quality to do it. This is used by default at -O0 or when using
    399 "llc -fast" on X86. It is straightforward to add support for
    400 other targets if faster -O0 compilation is desired.
    401
    402
  • In addition to the new 'fast' instruction selection path, many existing
  • 403 pieces of the code generator have been optimized in significant ways.
    404 SelectionDAG's are now pool allocated and use better algorithms in many
    405 places, the ".s" file printers now use raw_ostream to emit text much faster,
    406 etc. The end result of these improvements is that the compiler also takes
    407 substantially less time to generate code that is just as good as (and often
    408 better than) before.
    409
    410
  • Each target has been split to separate the ".s" file printing logic from the
  • 411 rest of the target. This enables JIT compilers that don't link in the
    412 (somewhat large) code and data tables used for printing a ".s" file.
    413
    414
  • The code generator now includes a "stack slot coloring" pass, which packs
  • 415 together individual spilled values into common stack slots. This reduces
    416 the size of stack frames with many spills, which tends to increase L1 cache
    417 effectiveness.
    418
    419
  • Various pieces of the register allocator (e.g. the coalescer and two-address
  • 420 operation elimination pass) now know how to rematerialize trivial operations
    421 to avoid copies and include several other optimizations.
    422
    423
  • The graphs produced by
  • 424 the llc -view-*-dags options are now significantly prettier and
    425 easier to read.
    426
    427
  • LLVM 2.4 includes a new register allocator based on Partitioned Boolean
  • 428 Quadratic Programming (PBQP). This register allocator is still in
    429 development, but is very simple and clean.
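
To make the "stack slot coloring" item in the list above more concrete, here is a minimal sketch of the general idea: spilled values whose live ranges do not overlap can share one stack slot. It is a simple greedy assignment over hypothetical half-open ranges and is not the algorithm LLVM implements; SpillRange and colorSlots are names invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical live range of one spilled value: [start, end) in
// instruction order.
struct SpillRange {
  int start;
  int end;
};

// Assign a slot index to every spilled value, reusing a slot once the value
// previously stored there is no longer live. The result is indexed like the
// input.
std::vector<int> colorSlots(const std::vector<SpillRange> &ranges) {
  std::vector<std::size_t> order(ranges.size());
  for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
  // Visit ranges by increasing start point.
  std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
    return ranges[a].start < ranges[b].start;
  });

  std::vector<int> slotEnd;                    // end of the last range in each slot
  std::vector<int> slotOf(ranges.size(), -1);  // chosen slot per value
  for (std::size_t idx : order) {
    const SpillRange &r = ranges[idx];
    assert(r.start < r.end);
    int chosen = -1;
    for (std::size_t s = 0; s < slotEnd.size(); ++s) {
      if (slotEnd[s] <= r.start) {  // previous occupant is dead: reuse
        chosen = static_cast<int>(s);
        break;
      }
    }
    if (chosen < 0) {  // every existing slot is still busy: open a new one
      chosen = static_cast<int>(slotEnd.size());
      slotEnd.push_back(0);
    }
    slotEnd[chosen] = r.end;
    slotOf[idx] = chosen;
  }
  return slotOf;
}
```

For three spills with ranges [0,4), [5,9) and [2,6), the first two share a slot and only the third needs a second one, so the frame holds two slots instead of three.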
    430
    431
    432
    433
    434
    435
    436
    437
    438 Target Specific Improvements
    439
    440
    441
    442

    New target-specific features include:

    443

    444
    445
    446
  • Exception handling is supported by default on Linux/x86-64.
  • 447
  • Position Independent Code (PIC) is now supported on Linux/x86-64.
  • 448
  • @llvm.frameaddress now supports getting the frame address of stack frames
  • 449 > 0 on x86/x86-64.
    450
  • MIPS has improved a lot since the last release. The most important changes
  • 451 are little-endian support, floating-point support, and allegrex core and
    452 intrinsics support. The O32 ABI is improved but isn't complete. The EABI
    453 was implemented and is fully supported. We also have support for small
    454 sections and gp_rel relocation for their access; a threshold in bytes can be
    455 specified on the command line.
    456
  • The PowerPC backend now supports trampolines.
  • 457
    458
    459
    460
    461
    462
    463
    464 Other Improvements
    296465
    297466
    298467
    300469

    301470
    302471
    303
  • LLVM IR now directly represents "common" linkage, instead of representing it
  • 304 as a form of weak linkage.
    305
    306
  • LLVM IR now has support for atomic operations, and this functionality can be
  • 307 accessed through the llvm-gcc "__sync_synchronize",
    308 "__sync_val_compare_and_swap", and related builtins. Support for
    309 atomics is available in the Alpha, X86, X86-64, and PowerPC backends.
    310
    311
  • The C and Ocaml bindings have been extended to cover pass managers, several
  • 312 transformation passes, iteration over the LLVM IR, target data, and parameter
    313 attribute lists.
    314
    315
    316
    317
    318
    319
    320 Optimizer Improvements
    321
    322
    323
    324
    325

    In addition to a huge array of bug fixes and minor performance tweaks, the

    326 LLVM 2.3 optimizers support a few major enhancements:

    327
    328
    329
    330
  • Loop index set splitting is on by default.

  • 331 This transformation hoists conditions from loop bodies and reduces a loop's
    332 iteration space to improve performance. For example,

    333
    334
    335
    336 for (i = LB; i < UB; ++i)
    337   if (i <= NV)
    338     LOOP_BODY
    341
    342

    is transformed into:

    343
    344

    345
    346 NUB = min(NV+1, UB)
    347 for (i = LB; i < NUB; ++i)
    348   LOOP_BODY
    349
    350
    351 </p>
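
The loop index set splitting example above can be checked with a small sketch that runs both forms and compares which iterations execute the body. The function names (visitOriginal, visitSplit) are hypothetical, and recording the index stands in for LOOP_BODY. The sketch assumes NV + 1 does not overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Original form: the condition on i is re-evaluated on every iteration.
std::vector<int> visitOriginal(int LB, int UB, int NV) {
  std::vector<int> visited;
  for (int i = LB; i < UB; ++i)
    if (i <= NV)
      visited.push_back(i);  // stands in for LOOP_BODY
  return visited;
}

// Split form: the condition is folded into a tighter upper bound, NUB.
std::vector<int> visitSplit(int LB, int UB, int NV) {
  assert(NV < std::numeric_limits<int>::max());  // NV + 1 must not overflow
  std::vector<int> visited;
  int NUB = std::min(NV + 1, UB);
  for (int i = LB; i < NUB; ++i)
    visited.push_back(i);  // stands in for LOOP_BODY
  return visited;
}
```

Both functions visit the same indices for any bounds that satisfy the overflow assumption; the split form simply never enters iterations in which the original condition would have been false.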
    472 llvmc2 (the generic compiler driver) gained plugin
    473 support. It is now easier to experiment with llvmc2 and
    474 build your own tools based on it.
    475
    476
  • LLVM 2.4 includes a number of new generic algorithms and data structures,
  • 477 including a scoped hash table, 'immutable' data structures, a simple
    478 free-list manager, and a raw_ostream class.
    479 The raw_ostream class and
    480 format allow for efficient file output, and various pieces of LLVM
    481 have switched over to use it. The eventual goal is to eliminate
    482 use of std::ostream in favor of it.
    483
    484
  • LLVM 2.4 includes an optional build system based on CMake. It
  • 485 still is in its early stages but can be useful for Visual C++
    486 users who cannot use the Visual Studio IDE.
    487
    488
    489
    490
    491
    492
    493
    494 Major Changes and Removed Features
    495
    496
    497
    498
    499

    If you're already an LLVM user or developer with out-of-tree changes based

    500 on LLVM 2.3, this section lists some "gotchas" that you may run into upgrading
    501 from the previous release.

    502
    503
    504
    505
  • The LLVM IR generated by llvm-gcc no longer names all instructions. This
  • 506 makes it run faster, but may be more confusing to some people. If you
    507 prefer to have names, the 'opt -instnamer' pass will add names to
    508 all instructions.
    509
    510
  • The LoadVN and GCSE passes have been removed from the tree. They are
  • 511 obsolete and have been replaced with the GVN and MemoryDependence passes.
    512
    513
    514
    515
    516

    In addition, many APIs have changed in this release. Some of the major LLVM

    517 API changes are:

    518
    519
    520
    521
  • Now, function attributes and return value attributes are managed
  • 522 separately. The interface formerly exported by the ParameterAttributes.h header is now
    523 exported by the Attributes.h header. The new attribute interface changes are:
    524
    525
  • getParamAttrs method is now replaced by
  • 526 getParamAttributes, getRetAttributes and
    527 getFnAttributes methods.
    528
  • Return value attributes are stored at index 0. Function attributes are
  • 529 stored at index ~0U. Parameter attributes are stored at the index that matches the
    530 parameter number.
    531
  • ParamAttr namespace is now renamed as Attribute.
  • 532
  • The name of the class that manages reference count of opaque
  • 533 attributes is changed from PAListPtr to AttrListPtr.
    534
  • ParamAttrsWithIndex is now renamed as AttributeWithIndex.
  • 352535
    353
    354
  • LLVM now includes a new memcpy optimization pass which removes
  • 355 dead memcpy calls, unneeded copies of aggregates, and performs
    356 return slot optimization. The LLVM optimizer now notices long sequences of
    357 consecutive stores and merges them into memcpy's where profitable.
    358
    359
  • Alignment detection for vector memory references and for memcpy and
  • 360 memset is now more aggressive.
    361
    362
  • The Aggressive Dead Code Elimination (ADCE) optimization has been rewritten
  • 363 to make it both faster and safer in the presence of code containing infinite
    364 loops. Some of its prior functionality has been factored out into the loop
    365 deletion pass, which is safe for infinite loops. The new ADCE pass is
    366 no longer based on control dependence, making it run faster.
    367
    368
  • The 'SimplifyLibCalls' pass, which optimizes calls to libc and libm
  • 369 functions for C-based languages, has been rewritten to be a FunctionPass
    370 instead of a ModulePass. This allows it to be run more often and to be
    371 included at -O1 in llvm-gcc. It was also extended to include more
    372 optimizations and several corner case bugs were fixed.
    373
    374
  • LLVM now includes a simple 'Jump Threading' pass, which attempts to simplify
  • 375 conditional branches using information about predecessor blocks, simplifying
    376 the control flow graph. This pass is pretty basic at this point, but
    377 catches some important cases and provides a foundation to build on.
    378
    379
  • Several corner case bugs which could lead to deleting volatile memory
  • 380 accesses have been fixed.
    381
    382
  • Several optimizations have been sped up, leading to faster code generation
  • 383 with the same code quality.
    384
    385
    386
    387
    388
    389
    390
    391 Code Generator Improvements
    392
    393
    394
    395
    396

    We put a significant amount of work into the code generator infrastructure,

    397 which allows us to implement more aggressive algorithms and make it run
    398 faster:

    399
    400
    401
  • The code generator now has support for carrying information about memory
  • 402 references throughout the entire code generation process, via the
    403
    404 MachineMemOperand class. In the future this will be used to improve
    405 both pre-pass and post-pass scheduling, and to improve compiler-debugging
    406 output.
    407
    408
  • The target-independent code generator infrastructure now uses LLVM's
  • 409 APInt
    410 class to handle integer values, which allows it to support integer types
    411 larger than 64 bits (for example i128). Note that support for such types is
    412 also dependent on target-specific support. Use of APInt is also a step
    413 toward support for non-power-of-2 integer sizes.
    414
    415
  • LLVM 2.3 includes several compile time speedups for code with large basic
  • 416 blocks, particularly in the instruction selection phase, register
    417 allocation, scheduling, and tail merging/jump threading.
    418
    419
  • LLVM 2.3 includes several improvements which make llc's
  • 420 --view-sunit-dags visualization of scheduling dependency graphs
    421 easier to understand.
    422
    423
  • The code generator allows targets to write patterns that generate subreg
  • 424 references directly in .td files now.
    425
    426
  • memcpy lowering in the backend is more aggressive, particularly for
  • 427 memcpy calls introduced by the code generator when handling
    428 pass-by-value structure argument copies.
    429
    430
  • Inline assembly with multiple register results now returns those results
  • 431 directly in the appropriate registers, rather than going through memory.
    432 Inline assembly that uses constraints like "ir" with immediates now uses the
    433 'i' form when possible instead of always loading the value in a register.
    434 This saves an instruction and reduces register use.
    435
    436
  • Added support for PIC/GOT style
  • 437 href="CodeGenerator.html#tailcallopt">tail calls on X86/32 and initial
    438 support for tail calls on PowerPC 32 (it may also work on PowerPC 64 but is
    439 not thoroughly tested).
    440
    441
    442
    443
    444
    445
    446
    447 X86/X86-64 Specific Improvements
    448
    449
    450
    451

    New target-specific features include:

    452

    453
    454
    455
  • llvm-gcc's X86-64 ABI conformance is far improved, particularly in the
  • 456 area of passing and returning structures by value. llvm-gcc compiled code
    457 now interoperates very well on X86-64 systems with other compilers.
    458
    459
  • Support for Win64 was added. This includes code generation itself, JIT
  • 460 support, and necessary changes to llvm-gcc.
    461
    462
  • The LLVM X86 backend now supports the SSE 4.1 instruction set, and
  • 463 the llvm-gcc 4.2 front-end supports the SSE 4.1 compiler builtins. Various
    464 generic vector operations (insert/extract/shuffle) are much more efficient
    465 when SSE 4.1 is enabled. The JIT automatically takes advantage of these
    466 instructions, but llvm-gcc must be explicitly told to use them, e.g. with
    467 -march=penryn.
    468
    469
  • The X86 backend now does a number of optimizations that aim to avoid
  • 470 converting numbers back and forth from SSE registers to the X87 floating
    471 point stack. This is important because most X86 ABIs require return values
    472 to be on the X87 Floating Point stack, but most CPUs prefer computation in
    473 the SSE units.
    474
    475
  • The X86 backend supports stack realignment, which is particularly useful for
  • 476 vector code on OS's without 16-byte aligned stacks, such as Linux and
    477 Windows.
    478
    479
  • The X86 backend now supports the "sseregparm" options in GCC, which allow
  • 480 functions to be tagged as passing floating point values in SSE
    481 registers.
    482
    483
  • Trampolines (taking the address of a nested function) now work on
  • 484 Linux/X86-64.
    485
    486
  • __builtin_prefetch is now compiled into the appropriate prefetch
  • 487 instructions instead of being ignored.
    488
    489
  • 128-bit integers are now supported on X86-64 targets. This can be used
  • 490 through __attribute__((TImode)) in llvm-gcc.
    491
    492
  • The register allocator can now rematerialize PIC-base computations, which is
  • 493 an important optimization for register use.
    494
    495
  • The "t" and "f" inline assembly constraints for the X87 floating point stack
  • 496 now work. However, the "u" constraint is still not fully supported.
    497
    498
    499
    500
    501
    502
    503
    504 Other Target Specific Improvements
    505
    506
    507
    508

    New target-specific features include:

    509

    510
    511
    512
  • The LLVM C backend now supports vector code.
  • 513
  • The Cell SPU backend includes a number of improvements. It generates better
  • 514 code and its stability/completeness is improving.
    515
    516
    517
    518
    519
    520
    521
    522
    523 Other Improvements
    524
    525
    526
    527

    New features include:

    528

    529
    530
    531
  • LLVM now builds with GCC 4.3.
  • 532
  • Bugpoint now supports running custom scripts (with the -run-custom
  • 533 option) to determine how to execute the command and whether it is making
    534 forward progress.
    535
    536
    537 div>
    536 ul>
    537
    538
    539
  • The DbgStopPointInst methods getDirectory and
  • 540 getFileName now return Value* instead of strings. These can be
    541 converted to strings using llvm::GetConstantStringInfo defined via
    542 "llvm/Analysis/ValueTracking.h".
    543
    544
  • The APIs to create various instructions have changed from lower case
  • 545 "create" methods to upper case "Create" methods (e.g.
    546 BinaryOperator::create). LLVM 2.4 includes both cases, but the
    547 lower case ones are removed in mainline (2.5 and later), so please migrate.
    548
    549
  • Various header files like "llvm/ADT/iterator" were given a ".h" suffix.
  • 550 Change your code to #include "llvm/ADT/iterator.h" instead.
    551
    552
  • The getresult instruction has been removed and replaced with the
  • 553 extractvalue instruction. This is part of support for first class
    554 aggregates.
    555
    556
  • In the code generator, many MachineOperand predicates were renamed to be
  • 557 shorter (e.g. isFrameIndex() -> isFI()),
    558 SDOperand was renamed to SDValue (and the "Val"
    559 member was changed to be the getNode() accessor), and the
    560 MVT::ValueType enum has been replaced with an "MVT"
    561 struct. The getSignExtended and getValue methods in the
    562 ConstantSDNode class were renamed to getSExtValue and
    563 getZExtValue respectively, to be more consistent with
    564 the ConstantInt class.
    565
    566
    567
    568
    569
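The renaming of `getSignExtended` and `getValue` to `getSExtValue` and `getZExtValue` makes explicit how a narrow constant is widened to 64 bits. The following is a minimal stand-alone sketch of that distinction. The `sketch::NarrowConstant` struct is hypothetical and is not `ConstantSDNode` or `ConstantInt`; it only borrows the two method names, and it assumes the value occupies the low `BitWidth` bits (1 to 64) of a 64-bit word.

```cpp
#include <cassert>
#include <stdint.h>

namespace sketch {

// Hypothetical stand-in for an integer constant of BitWidth bits (1..64)
// whose bits live in the low end of Raw.
struct NarrowConstant {
  uint64_t Raw;
  unsigned BitWidth;

  // Zero extension: keep the low BitWidth bits, clear everything above them.
  uint64_t getZExtValue() const {
    if (BitWidth == 64)
      return Raw;
    return Raw & ((uint64_t(1) << BitWidth) - 1);
  }

  // Sign extension: replicate bit (BitWidth - 1) into all higher bits.
  int64_t getSExtValue() const {
    uint64_t Z = getZExtValue();
    if (BitWidth == 64)
      return static_cast<int64_t>(Z);
    uint64_t SignBit = uint64_t(1) << (BitWidth - 1);
    // Flipping the sign bit and subtracting it back spreads it upward.
    return static_cast<int64_t>((Z ^ SignBit) - SignBit);
  }
};

} // namespace sketch
```

For an 8-bit constant holding the bit pattern 0xFF, the zero-extended reading is 255 while the sign-extended reading is -1, so code migrating from the old names should check which interpretation each call site relied on.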
    538570
    539571
    540572
    547579

    LLVM is known to work on the following platforms:

    548580
    549581
    550
  • Intel and AMD machines (IA32) running Red Hat Linux, Fedora Core and FreeBSD
  • 551 (and probably other unix-like systems).
    552
  • PowerPC and X86-based Mac OS X systems, running 10.3 and above in 32-bit and
  • 553 64-bit modes.
    582
  • Intel and AMD machines (IA32, X86-64, AMD64, EM64T) running Red Hat
  • 583 Linux, Fedora Core and FreeBSD (and probably other unix-like systems).
    584
  • PowerPC and X86-based Mac OS X systems, running 10.3 and above in 32-bit
  • 585 and 64-bit modes.
    554586
  • Intel and AMD machines running on Win32 using MinGW libraries (native).
  • 555587
  • Intel and AMD machines running on Win32 with the Cygwin libraries (limited
  • 556588 support is available for native builds with Visual C++).
    574606
    575607
    576608
    577

    This section contains all known problems with the LLVM system, listed by

    578 component. As new problems are discovered, they will be added to these
    579 sections. If you run into a problem, please check the
    609

    This section contains significant known problems with the LLVM system,

    610 listed by component. If you run into a problem, please check the
    580611 href="http://llvm.org/bugs/">LLVM bug database and submit a bug if
    581612 there isn't already one.

    582613
    597628 href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVMdev list.

    598629
    599630
    600
  • The MSIL, IA64, Alpha, SPU, and MIPS backends are experimental.
  • 631
  • The MSIL, IA64, Alpha, SPU, MIPS, and PIC16 backends are experimental.
  • 601632
  • The llc "-filetype=asm" (the default) is the only supported
  • 602633 value for this option.
    603634
    624655 to several
    625656 bugs due to lack of support for the
    626657 'u' inline assembly constraint and X87 floating point inline assembly.
    627
  • The X86-64 backend does not yet support position-independent code (PIC)
  • 628 generation on Linux targets.
    629658
  • The X86-64 backend does not yet support the LLVM IR instruction
  • 630659 va_arg. Currently, the llvm-gcc front-end supports variadic
    631660 argument constructs on X86-64 by lowering them manually.
    683712
    684713
    685714
    715 Known problems with the MIPS back-end
    716
    717
    718
    719
    720
    721
  • The O32 ABI is not fully supported.
  • 722
  • 64-bit MIPS targets are not supported yet.
  • 723
    724
    725
    726
    727
    728
    686729 Known problems with the Alpha back-end
    687730
    688731
    706749
    707750
  • The Itanium backend is highly experimental, and has a number of known
  • 708751 issues. We are looking for a maintainer for the Itanium backend. If you
    709 are interested, please contact the llvmdev mailing list.
    752 are interested, please contact the LLVMdev mailing list.
    710753
    711754
    712755
    739782
    740783

    llvm-gcc does not currently support Link-Time

    741784 Optimization on most platforms "out-of-the-box". Please inquire on the
    742 llvmdev mailing list if you are interested.

    785 LLVMdev mailing list if you are interested.

    743786
    744787

    The only major language feature of GCC not supported by llvm-gcc is

    745788 the __builtin_apply family of builtins. However, some extensions
    764807 itself, Qt, Mozilla, etc.

    765808
    766809
    767
  • Exception handling works well on the X86 and PowerPC targets, including
  • 768 X86-64 darwin. This works when linking to a libstdc++ compiled by GCC. It is
    769 supported on X86-64 linux, but that is disabled by default in this release.
    770
    771
    772
    773
    810
  • Exception handling works well on the X86 and PowerPC targets. Currently
  • 811 only Linux and Darwin targets are supported (both 32 and 64 bit).
    812
    813
    814
    815
    816
    817
    818 Known problems with the llvm-gcc Fortran front-end
    819
    820
    821
    822
    823
  • Fortran support generally works, but there are still several unresolved bugs
  • 824 in Bugzilla. Please see the tools/gfortran component for details.
    825
    826
  • The Fortran front-end currently does not build on Darwin (without tweaks)
  • 827 due to unresolved dependencies on the C front-end.
    828
    829
    774830
    775831
    776832
    787843 which does support trampolines.
    788844
  • The Ada front-end fails to bootstrap.
  • 789845 Workaround: configure with --disable-bootstrap.
    790
  • The c380004 and c393010 ACATS tests
  • 791 fail (c380004 also fails with gcc-4.2 mainline). When built at -O3, the
    792 cxg2021 ACATS test also fails.
    793
  • Some gcc specific Ada tests continue to crash the compiler. The testsuite
    794 reports most tests as having failed even though they pass.
    846
  • The c380004, c393010
  • 847 and cxg2021 ACATS tests fail
    848 (c380004 also fails with gcc-4.2 mainline).
    849
  • Some gcc specific Ada tests continue to crash the compiler.
  • 795850
  • The -E binder option (exception backtraces)
  • 796851 does not work and will result in programs
    797852 crashing if an exception is raised. Workaround: do not use -E.