llvm.org GIT mirror llvm / release_29
Update the release notes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_29@129054 91177308-0d34-0410-b5e6-96231b3b80d8 Bill Wendling 8 years ago
1 changed file(s) with 649 addition(s) and 906 deletion(s).
7 LLVM 2.<del>8</del> Release Notes
7 LLVM 2.<ins>9</ins> Release Notes
11 <div class="doc_title">LLVM 2.8 Release Notes>
11 <h1 class="doc_title">LLVM 2.9 Release Notes>
  • Introduction
  • Sub-project Status Update
  • External Projects Using LLVM 2.8
  • What's New in LLVM 2.8?
  • External Projects Using LLVM 2.9
  • What's New in LLVM 2.9?
  • Installation Instructions
  • Known Problems
  • Additional Information

    4040 Introduction

    This document contains the release notes for the LLVM Compiler

    47 Infrastructure, release 2.8. Here we describe the status of LLVM, including
    47 Infrastructure, release 2.9. Here we describe the status of LLVM, including
    4848 major improvements from the previous release and significant known problems.
    4949 All LLVM releases may be downloaded from the
    5050 LLVM releases web site (http://llvm.org/releases/).

    6161 releases page.

    86 <div class="doc_section">
    74 <!-- *********************************************************************** -->
    75

    8776 Sub-project Status Update

    93 The LLVM 2.8 distribution currently consists of code from the core LLVM

    82 The LLVM 2.9 distribution currently consists of code from the core LLVM
    9483 repository (which roughly includes the LLVM optimizers, code generators
    9584 and supporting tools), the Clang repository and the llvm-gcc repository. In
    9685 addition to this code, the LLVM Project includes other sub-projects that are in
    104 <div class="doc_subsection">
    93 <h2>
    10594 Clang: C/C++/Objective-C Frontend Toolkit
    106 div>
    95 h2>
    10796
    10897
    10998
    114103 modular, library-based architecture that makes it suitable for creating or
    115104 integrating with other development tools. Clang is considered a
    116105 production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86
    117 (32- and 64-bit), and for darwin-arm targets.

    118
    119

    In the LLVM 2.8 time-frame, the Clang team has made many improvements:

    120
    121
    122
  • Clang C++ is now feature-complete with respect to the ISO C++ 1998 and 2003 standards.
  • Added support for Objective-C++.
  • Clang now uses LLVM-MC to directly generate object code and to parse inline assembly (on Darwin).
  • Introduced many new warnings, including -Wmissing-field-initializers, -Wshadow, -Wno-protocol, -Wtautological-compare, -Wstrict-selector-match, -Wcast-align, -Wunused improvements, and greatly improved format-string checking.
  • Introduced the "libclang" library, a C interface to Clang intended to support IDE clients.
  • Added support for #pragma GCC visibility, #pragma align, and others.
  • Added support for SSE, AVX, ARM NEON, and AltiVec.
  • Improved support for many Microsoft extensions.
  • Implemented support for blocks in C++.
  • Implemented precompiled headers for C++.
  • Improved abstract syntax trees to retain more accurate source information.
  • Added driver support for handling LLVM IR and bitcode files directly.
  • Major improvements to compiler correctness for exception handling.
  • Improved generated code quality in some areas:
      • Good code generation for X86-32 and X86-64 ABI handling.
      • Improved code generation for bit-fields, although important work remains.
    146 Clang Static Analyzer
    147
    148
    149
    150
    151

    The Clang Static Analyzer

    152 project is an effort to use static source code analysis techniques to
    153 automatically find bugs in C and Objective-C programs (and hopefully
    154 href="http://clang-analyzer.llvm.org/dev_cxx.html">C++ in the
    155 future!). The tool is very good at finding bugs that occur on specific
    156 paths through code, such as on error conditions.

    157
    158

    The LLVM 2.8 release fixes a number of bugs and slightly improves precision

    159 over 2.7, but there are no major new features in the release.
    160

    161
    162
    163
    164
    165
    166 DragonEgg: llvm-gcc ported to gcc-4.5
    167
    168
    169
    170

    171 DragonEgg is a port of llvm-gcc to
    172 gcc-4.5. Unlike llvm-gcc, dragonegg in theory does not require any gcc-4.5
    173 modifications whatsoever (currently one small patch is needed) thanks to the
    174 new gcc plugin architecture.
    175 DragonEgg is a gcc plugin that makes gcc-4.5 use the LLVM optimizers and code
    176 generators instead of gcc's, just like with llvm-gcc.
    177

    178
    179

    180 DragonEgg is still a work in progress, but it is able to compile a lot of code,
    181 for example all of gcc, LLVM and clang. Currently Ada, C, C++ and Fortran work
    182 well, while all other languages either don't work at all or only work poorly.
    183 For the moment only the x86-32 and x86-64 targets are supported, and only on
    184 linux and darwin (darwin may need additional gcc patches).
    185

    186
    187

    188 The 2.8 release has the following notable changes:
    189
    190
  • The plugin loads faster due to exporting fewer symbols.
  • 191
  • Additional vector operations such as addps256 are now supported.
  • 192
  • Ada global variables with no initial value are no longer zero initialized,
  • 193 resulting in better optimization.
    194
  • The '-fplugin-arg-dragonegg-enable-gcc-optzns' flag now runs all gcc
  • 195 optimizers, rather than just a handful.
    196
  • Fortran programs using common variables now link correctly.
  • 197
  • GNU OMP constructs no longer crash the compiler.
    204 VMKit: JVM/CLI Virtual Machine Implementation
    205
    206
    207
    208

    209 The VMKit project is an implementation of
    210 a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and
    211 just-in-time compilation. As of LLVM 2.8, VMKit now supports copying garbage
    212 collectors, and can be configured to use MMTk's copy mark-sweep garbage
    213 collector. In LLVM 2.8, the VMKit .NET VM is no longer being maintained.
    214

    215
    216
    217
    218
    106 (32- and 64-bit), and for darwin/arm targets.

    In the LLVM 2.9 time-frame, the Clang team has made many improvements in C,

    109 C++ and Objective-C support. C++ support is now generally rock solid, has
    110 been exercised on a broad variety of code, and has several new
    111 C++'0x features (http://clang.llvm.org/cxx_status.html#cxx0x)
    112 implemented (such as rvalue references and variadic templates). LLVM 2.9 has
    113 also brought in a large range of bug fixes and minor features (e.g. __label__
    114 support), and is much more compatible with the Linux Kernel.

    115
    116

    If Clang rejects your code but another compiler accepts it, please take a

    117 look at the language
    118 compatibility guide to make sure this is not intentional or a known issue.
    119


    127 DragonEgg: GCC front-ends, LLVM back-end
    128
    129
    130
    131

    132 DragonEgg is a
    133 gcc plugin that replaces GCC's
    134 optimizers and code generators with LLVM's.
    135 Currently it requires a patched version of gcc-4.5.
    136 The plugin can target the x86-32 and x86-64 processor families and has been
    137 used successfully on the Darwin, FreeBSD and Linux platforms.
    138 The Ada, C, C++ and Fortran languages work well.
    139 The plugin is capable of compiling plenty of Obj-C, Obj-C++ and Java but it is
    140 not known whether the compiled code actually works or not!
    141

    142
    143

    144 The 2.9 release has the following notable changes:
    145
    146
  • The plugin is much more stable when compiling Fortran.
  • 147
  • Inline assembly where an asm output is tied to an input of a different size
  • 148 is now supported in many more cases.
    149
  • Basic support for the __float128 type was added. It is now possible to
  • 150 generate LLVM IR from programs using __float128 but code generation does not
    151 work yet.
    152
  • Compiling Java programs no longer systematically crashes the plugin.

    219159 compiler-rt: Compiler Runtime Library
    220 div>
    160 h2>
    221161
    222162
    223163

    230170 this and other low-level routines (some are 3x faster than the equivalent
    231171 libgcc routines).

    232172
    233

    234 All of the code in the compiler-rt project is available under the standard LLVM
    235 License, a "BSD-style" license. New in LLVM 2.8, compiler_rt now supports
    236 soft floating point (for targets that don't have a real floating point unit),
    237 and includes an extensive testsuite for the "blocks" language feature and the
    238 blocks runtime included in compiler_rt.

    243 <div class="doc_subsection">
    173 In the LLVM 2.9 timeframe, compiler_rt has had several minor changes for
    174 better ARM support, and a fairly major license change. All of the code in the
    175 compiler-rt project is now dual
    176 licensed under the MIT and UIUC licenses, which allows you to use compiler-rt
    177 in applications without the binary copyright reproduction clause. If you
    178 prefer the LLVM/UIUC license, you are free to continue using it under that
    179 license as well.


    244185 LLDB: Low Level Debugger
    245 div>
    186 h2>
    246187
    247188
    248189

    253194 LLVM disassembler and the LLVM JIT.

    254195
    255196

    256 LLDB is in early development and not included as part of the LLVM 2.8 release,
    257 but is mature enough to support basic debugging scenarios on Mac OS X in C,
    258 Objective-C and C++. We'd really like help extending and expanding LLDB to
    259 support new platforms, new languages, new architectures, and new features.
    260

    197 LLDB has advanced by leaps and bounds in the 2.9 timeframe. It is
    198 dramatically more stable and useful, and includes both a new
    199 tutorial (http://lldb.llvm.org/tutorial.html) and a
    200 side-by-side comparison with GDB (http://lldb.llvm.org/lldb-gdb.html).


    266207 libc++: C++ Standard Library
    267 div>
    208 h2>
    268209
    269210
    270211

    274215 delivering great performance.

    275216
    276217

    277 As of the LLVM 2.8 release, libc++ is virtually feature complete, but would
    278 benefit from more testing and better integration with Clang++. It is also
    279 looking forward to the C++ committee finalizing the C++'0x standard.
    280

    218 In the LLVM 2.9 timeframe, libc++ has had numerous bugs fixed, and is now being
    219 co-developed with Clang's C++'0x mode.

    220
    221

    222 Like compiler_rt, libc++ is now dual
    223 licensed under the MIT and UIUC licenses, allowing it to be used more
    224 permissively.
    225


    232 LLBrowse: IR Browser
    233
    234
    235
    236

    237
    238 LLBrowse is an interactive viewer for LLVM modules. It can load any LLVM
    239 module and displays its contents as an expandable tree view, facilitating an
    240 easy way to inspect types, functions, global variables, or metadata nodes. It
    241 is fully cross-platform, being based on the popular wxWidgets GUI toolkit.
    242

    243
    244
    245
    246

    247 VMKit
    248
    249
    250
    251

    The VMKit project is an implementation

    252 of a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and
    253 just-in-time compilation. As of LLVM 2.9, VMKit now supports generational
    254 garbage collectors. The garbage collectors are provided by the MMTk framework,
    255 and VMKit can be configured to use one of the numerous implemented collectors
    256 of MMTk.
    257

    324 External Open Source Projects Using LLVM 2.8
    325 </div>

    282 External Open Source Projects Using LLVM 2.9
    283
    326284
    327285
    328286
    329287
    330288

    An exciting aspect of LLVM is that it is used as an enabling technology for

    331289 a lot of other language and tools projects. This section lists some of the
    332 projects that have already been updated to work with LLVM 2.8.

    333
    334
    335
    336
    337 TTA-based Codesign Environment (TCE)
    338
    339
    340
    341

    342 TCE is a toolset for designing
    343 application-specific processors (ASP) based on the Transport triggered
    344 architecture (TTA). The toolset provides a complete co-design flow from C/C++
    345 programs down to synthesizable VHDL and parallel program binaries. Processor
    346 customization points include the register files, function units, supported
    347 operations, and the interconnection network.

    348
    349

    TCE uses llvm-gcc/Clang and LLVM for C/C++ language support, target

    350 independent optimizations and also for parts of code generation. It generates
    351 new LLVM-based code generators "on the fly" for the designed TTA processors and
    352 loads them in to the compiler backend as runtime libraries to avoid per-target
    353 recompilation of larger parts of the compiler chain.

    354
    355
    356
    357
    358
    359 Horizon Bytecode Compiler
    360
    361
    362
    363

    364 Horizon is a bytecode
    365 language and compiler written on top of LLVM, intended for producing
    366 single-address-space managed code operating systems that
    367 run faster than the equivalent multiple-address-space C systems.
    368 A more in-depth blurb is available on the
    369 wiki (http://www.quokforge.org/projects/horizon/wiki/Wiki).

    370
    371
    372
    373
    374
    375 Clam AntiVirus
    376
    377
    378
    379

    380 Clam AntiVirus is an open source (GPL)
    381 anti-virus toolkit for UNIX, designed especially for e-mail scanning on mail
    382 gateways. Since version 0.96 it has
    383 href="http://vrt-sourcefire.blogspot.com/2010/09/introduction-to-clamavs-low-level.html">bytecode
    384 signatures that allow writing detections for complex malware. It
    385 uses LLVM's JIT to speed up the execution of bytecode on
    386 X86, X86-64, PPC32/64, falling back to its own interpreter otherwise.
    387 The git version was updated to work with LLVM 2.8.
    388

    389
    390

    The

    391 href="http://git.clamav.net/gitweb?p=clamav-bytecode-compiler.git;a=blob_plain;f=docs/user/clambc-user.pdf">
    392 ClamAV bytecode compiler uses Clang and LLVM to compile a C-like
    393 language, insert runtime checks, and generate ClamAV bytecode.

    394
    395
    396
    397
    398
    399 Pure
    400
    401
    402
    403

    404 Pure
    405 is an algebraic/functional
    406 programming language based on term rewriting. Programs are collections
    407 of equations which are used to evaluate expressions in a symbolic
    408 fashion. Pure offers dynamic typing, eager and lazy evaluation, lexical
    409 closures, a hygienic macro system (also based on term rewriting),
    410 built-in list and matrix support (including list and matrix
    411 comprehensions) and an easy-to-use C interface. The interpreter uses
    412 LLVM as a backend to JIT-compile Pure programs to fast native code.

    413
    414

    Pure versions 0.44 and later have been tested and are known to work with

    415 LLVM 2.8 (and continue to work with older LLVM releases >= 2.5).

    416
    417
    418
    419
    420
    421 Glasgow Haskell Compiler (GHC)
    422
    423
    424
    425

    426 GHC is an open source,
    427 state-of-the-art programming suite for
    428 Haskell, a standard lazy functional programming language. It includes
    429 an optimizing static compiler generating good code for a variety of
    290 projects that have already been updated to work with LLVM 2.9.

    291
    292
    293
    294
    295

    Crack Programming Language

    296
    297
    298

    299 Crack aims to provide the
    300 ease of development of a scripting language with the performance of a compiled
    301 language. The language derives concepts from C++, Java and Python, incorporating
    302 object-oriented programming, operator overloading and strong typing.

    303
    304
    305
    306
    307

    TTA-based Codesign Environment (TCE)

    308
    309
    310

    TCE is a toolset for designing application-specific processors (ASP) based on

    311 the Transport triggered architecture (TTA). The toolset provides a complete
    312 co-design flow from C/C++ programs down to synthesizable VHDL and parallel
    313 program binaries. Processor customization points include the register files,
    314 function units, supported operations, and the interconnection network.

    315
    316

    TCE uses Clang and LLVM for C/C++ language support, target independent

    317 optimizations and also for parts of code generation. It generates new LLVM-based
    318 code generators "on the fly" for the designed TTA processors and loads them in
    319 to the compiler backend as runtime libraries to avoid per-target recompilation
    320 of larger parts of the compiler chain.

    321
    322
    323
    324
    325
    326

    PinaVM

    327
    328
    329

    PinaVM is an open

    330 source, SystemC front-end. Unlike many
    331 other front-ends, PinaVM actually executes the elaboration of the
    332 program analyzed using LLVM's JIT infrastructure. It later enriches the
    333 bitcode with SystemC-specific information.

    334
    335
    336
    337

    Pure

    338
    339
    340

    Pure is an

    341 algebraic/functional
    342 programming language based on term rewriting. Programs are collections
    343 of equations which are used to evaluate expressions in a symbolic
    344 fashion. The interpreter uses LLVM as a backend to JIT-compile Pure
    345 programs to fast native code. Pure offers dynamic typing, eager and lazy
    346 evaluation, lexical closures, a hygienic macro system (also based on
    347 term rewriting), built-in list and matrix support (including list and
    348 matrix comprehensions) and an easy-to-use interface to C and other
    349 programming languages (including the ability to load LLVM bitcode
    350 modules, and inline C, C++, Fortran and Faust code in Pure programs if
    351 the corresponding LLVM-enabled compilers are installed).

    352
    353

    Pure version 0.47 has been tested and is known to work with LLVM 2.9

    354 (and continues to work with older LLVM releases >= 2.5).

    355
    356
    357
    358

    IcedTea Java Virtual Machine Implementation

    359
    360
    361

    362 IcedTea provides a
    363 harness to build OpenJDK using only free software build tools and to provide
    364 replacements for the not-yet free parts of OpenJDK. One of the extensions that
    365 IcedTea provides is a new JIT compiler named
    366 href="http://icedtea.classpath.org/wiki/ZeroSharkFaq">Shark which uses LLVM
    367 to provide native code generation without introducing processor-dependent
    368 code.
    369

    370
    371

    OpenJDK 7 b112, IcedTea6 1.9 and IcedTea7 1.13 and later have been tested

    372 and are known to work with LLVM 2.9 (and continue to work with older LLVM
    373 releases >= 2.6 as well).

    374
    375
    376
    377

    Glasgow Haskell Compiler (GHC)

    378
    379
    380

    GHC is an open source, state-of-the-art programming suite for Haskell,

    381 a standard lazy functional programming language. It includes an
    382 optimizing static compiler generating good code for a variety of
    430383 platforms, together with an interactive system for convenient, quick
    431384 development.

    432385
    433386

    In addition to the existing C and native code generators, GHC 7.0 now

    434 supports an
    435 href="http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/Backends/LLVM">LLVM
    436 code generator. GHC supports LLVM 2.7 and later.

    437
    438
    439
    440
    441
    442 Clay Programming Language
    443
    444
    445
    446

    447 Clay is a new systems programming
    448 language that is specifically designed for generic programming. It makes
    449 generic programming very concise thanks to whole program type propagation. It
    450 uses LLVM as its backend.

    451
    452
    453
    454
    455
    456 llvm-py Python Bindings for LLVM
    457
    458
    459
    460

    461 llvm-py has been updated to work
    462 with LLVM 2.8. llvm-py provides Python bindings for LLVM, allowing you to write a
    463 compiler backend or a VM in Python.

    464
    387 supports an LLVM code generator. GHC supports LLVM 2.7 and later.

    388
    389
    390
    391

    Polly - Polyhedral optimizations for LLVM

    392
    393
    394

    Polly is a project that aims to provide advanced memory access optimizations

    395 to better take advantage of SIMD units, cache hierarchies, multiple cores or
    396 even vector accelerators for LLVM. Built around an abstract mathematical
    397 description based on Z-polyhedra, it provides the infrastructure to develop
    398 advanced optimizations in LLVM and to connect complex external optimizers. In
    399 its first year of existence Polly already provides an exact value-based
    400 dependency analysis as well as basic SIMD and OpenMP code generation support.
    401 Furthermore, Polly can use PoCC (Pluto), an advanced optimizer for data-locality
    402 and parallelism.

    403
    404
    405
    406

    Rubinius

    407
    408
    409

    Rubinius is an environment

    410 for running Ruby code which strives to write as much of the implementation in
    411 Ruby as possible. Combined with a bytecode interpreting VM, it uses LLVM to
    412 optimize and compile Ruby code down to machine code. Techniques such as type
    413 feedback, method inlining, and deoptimization are all used to remove dynamism
    414 from Ruby execution and increase performance.

    465415
    466416
    467417
    476426 audio signal processing. The name FAUST stands for Functional AUdio STream. Its
    477427 programming model combines two approaches: functional programming and block
    478428 diagram composition. In addition to the C, C++ and Java output formats, the
    479 Faust compiler can now generate LLVM bitcode, and works with LLVM 2.7 and
    480 2.8.

    481
    482
    483
    484
    485
    486 Jade Just-in-time Adaptive Decoder Engine
    487
    488
    489
    490

    491 Jade (Just-in-time Adaptive Decoder Engine;
    492 http://sourceforge.net/apps/trac/orcc/wiki/JadeDocumentation) is a generic video decoder engine using
    493 LLVM for just-in-time compilation of video decoder configurations. Those
    494 configurations are designed by the MPEG Reconfigurable Video Coding (RVC) committee.
    495 The MPEG RVC standard is built on a stream-based dataflow representation of
    496 decoders. It is composed of a standard library of coding tools written in
    497 RVC-CAL language and a dataflow configuration — block diagram —
    498 of a decoder.

    499
    500

    The Jade project is hosted as part of the Open

    501 RVC-CAL Compiler and requires it to translate the RVC-CAL standard library
    502 of video coding tools into LLVM assembly code.

    503
    504
    505
    506
    507
    508 LLVM JIT for Neko VM
    509
    510
    511
    512

    Neko LLVM JIT

    513 replaces the standard Neko JIT with an LLVM-based implementation. While not
    514 fully complete, it is already providing a 1.5x speedup on 64-bit systems.
    515 Neko LLVM JIT requires LLVM 2.8 or later.

    516
    517
    518
    519
    520
    521 Crack Scripting Language
    522
    523
    524
    525

    526 Crack aims to provide
    527 the ease of development of a scripting language with the performance of a
    528 compiled language. The language derives concepts from C++, Java and Python,
    529 incorporating object-oriented programming, operator overloading and strong
    530 typing. Crack 0.2 works with LLVM 2.7, and the forthcoming Crack 0.2.1 release
    531 builds on LLVM 2.8.

    532
    533
    534
    535
    536
    537 Dresden TM Compiler (DTMC)
    538
    539
    540
    541

    542 DTMC provides support for
    543 Transactional Memory, which is an easy-to-use and efficient way to synchronize
    544 accesses to shared memory. Transactions can contain normal C/C++ code (e.g.,
    545 __transaction { list.remove(x); x.refCount--; }) and will be executed
    546 virtually atomically and isolated from other transactions.

    547
    548
    549
    550
    551
    552 Kai Programming Language
    553
    554
    555
    556

    557 Kai (Japanese 会 for
    558 meeting/gathering) is an experimental interpreter that provides a highly
    559 extensible runtime environment and explicit control over the compilation
    560 process. Programs are defined using nested symbolic expressions, which are all
    561 parsed into first-class values with minimal intrinsic semantics. Kai can
    562 generate optimised code at run-time (using LLVM) in order to exploit the nature
    563 of the underlying hardware and to integrate with external software libraries.
    564 It is a unique exploration into the world of dynamic code compilation, and the
    565 interaction between high level and low level semantics.

    566
    567
    568
    569
    570
    571 OSL: Open Shading Language
    572
    573
    574
    575

    576 OSL is a shading
    577 language designed for use in physically based renderers and in particular
    578 production rendering. By using LLVM instead of the interpreter, it was able to
    579 meet its performance goals (>= C-code) while retaining the benefits of
    580 runtime specialization and a portable high-level language.
    581

    589 What's New in LLVM 2.8?
    590 >
    429 Faust compiler can now generate LLVM bitcode, and works with LLVM 2.7-2.9.>
    430
    431
    432
    433
    434

    435 What's New in LLVM 2.9?
    436
    603 <div class="doc_subsection">
    449 <h2>
    604450 Major New Features

    LLVM 2.8 includes several major new capabilities:

    610
    611
    612
  • As mentioned above, libc++ and LLDB are major new additions to the LLVM collective.
    614
  • LLVM 2.8 now has pretty decent support for debugging optimized code. You
  • 615 should be able to reliably get debug info for function arguments, assuming
    616 that the value is actually available where you have stopped.
    617
  • A new 'llvm-diff' tool is available that does a semantic diff of .ll
  • 618 files.
    619
  • The MC subproject has made major progress in this release.
  • 620 Direct .o file writing support for darwin/x86[-64] is now reliable and
    621 support for other targets and object file formats is in progress.
    622
    623
    624
    625
    626
    627 <div class="doc_subsection">
    451 </h2>

    LLVM 2.9 includes several major new capabilities:

    456
    457
    458
    459
  • Type Based Alias Analysis (TBAA) is now implemented and turned on by default
  • 460 in Clang. This allows substantially better load/store optimization in some
    461 cases. TBAA can be disabled by passing -fno-strict-aliasing.
    462
    463
    464
  • This release has seen a continued focus on quality of debug information.
  • 465 LLVM now generates much higher fidelity debug information, particularly when
    466 debugging optimized code.
    467
    468
  • Inline assembly now supports multiple alternative constraints.
  • 469
    470
  • A new backend for the NVIDIA PTX virtual ISA (used to target its GPUs) is
  • 471 under rapid development. It is not generally useful in 2.9, but is making
    472 rapid progress.
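
    The TBAA item in the list above can be illustrated with a small C sketch. The
    function names store_through_both and float_bits are made up for this example
    and do not come from the release notes. Under the C strict-aliasing rules an
    int object and a float object cannot overlap, so a type-based alias analysis
    may assume that the store through f leaves *i untouched. Whether a particular
    Clang build actually folds the reload depends on the optimization level and is
    not claimed here. Code that needs to reinterpret bits can use memcpy, which
    stays well defined with or without -fno-strict-aliasing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative only: i and f point to objects of different types, so under
 * strict aliasing the store to *f cannot change *i. A TBAA-aware optimizer
 * is allowed to return the constant 1 without reloading *i. */
int store_through_both(int *i, float *f) {
    *i = 1;
    *f = 2.0f;
    return *i;
}

/* Well-defined way to inspect the bits of a float: copy them with memcpy
 * instead of casting a float* to a uint32_t*. */
uint32_t float_bits(float value) {
    uint32_t bits;
    assert(sizeof bits == sizeof value); /* assumes a 32-bit float */
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

    Code that deliberately reads an object through a pointer of an unrelated type
    is the kind that may need -fno-strict-aliasing, as the item above notes.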

    628480 LLVM IR and Core Improvements
    629 div>
    481 h2>
    630482
    631483
    632484

    LLVM IR has several new features for better support of new targets and that

    633485 expose new optimization opportunities:

    634486
    635487
    636
  • The memcpy, memmove, and memset
  • 637 intrinsics now take address space qualified pointers and a bit to indicate
    638 whether the transfer is "volatile" or not.
    639
    640
  • Per-instruction debug info metadata is much faster and uses less memory by
  • 641 using the new DebugLoc class.
    642
  • LLVM IR now has a more formalized concept of "trap values" (LangRef.html#trapvalues), which allow the optimizer to optimize more aggressively in the presence of undefined behavior, while still producing predictable results.
    646
  • LLVM IR now supports two new linkage
  • 647 types (linker_private_weak and linker_private_weak_def_auto) which map
    648 onto some obscure MachO concepts.
    649
    650
    651
    652
    653
    654 <div class="doc_subsection">
  • The udiv, ashr, lshr, and shl instructions now support exact and nuw/nsw bits to indicate that they don't overflow or shift out bits. This is useful for optimization of pointer differences (http://llvm.org/PR8862) and other cases.
    492
    493
  • LLVM IR now supports the unnamed_addr
  • 494 attribute to indicate that constant global variables with identical
    495 initializers can be merged. This fixed an
    496 issue where LLVM would incorrectly merge two globals which were supposed
    497 to have distinct addresses.
    498
    499
  • The new hotpatch attribute has been added
  • 500 to allow runtime patching of functions.
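
    The pointer-difference case mentioned in the list above can be sketched in C.
    The function names are hypothetical, and how Clang lowers this code to IR is
    an assumption rather than something the notes state. When two pointers refer
    to elements of the same int array, their byte distance is always a multiple of
    sizeof(int), so the division (or shift) implied by the subtraction never
    discards bits. That is the kind of fact an exact bit can record.

```c
#include <assert.h>
#include <stddef.h>

/* Number of int elements between two pointers into the same array.
 * The byte distance is a multiple of sizeof(int), so the implied
 * division has no remainder. */
ptrdiff_t elements_between(const int *first, const int *last) {
    assert(first != NULL && last != NULL);
    return last - first;
}

/* The same computation spelled out by hand to make the division visible. */
ptrdiff_t elements_between_manual(const int *first, const int *last) {
    ptrdiff_t bytes = (const char *)last - (const char *)first;
    return bytes / (ptrdiff_t)sizeof(int);
}
```

    Both functions must return the same value for pointers into one array; the
    second form only exposes the arithmetic that the first one hides.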

    655507 Optimizer Improvements
    656 div>
    508 h2>
    657509
    658510
    659511
    661513 release includes a few major enhancements and additions to the optimizers:

    662514
    663515
    664
  • As mentioned above, the optimizer now has support for updating debug
  • 665 information as it goes. A key aspect of this is the new
    666 href="SourceLevelDebugging.html#format_common_value">llvm.dbg.value
    667 intrinsic. This intrinsic represents debug info for variables that are
    668 promoted to SSA values (typically by mem2reg or the -scalarrepl passes).
    669
    670
  • The JumpThreading pass is now much more aggressive about implied value
  • 671 relations, allowing it to thread conditions like "a == 4" when a is known to
    672 be 13 in one of the predecessors of a block. It does this in conjunction
    673 with the new LazyValueInfo analysis pass.
    674
  • The new RegionInfo analysis pass identifies single-entry single-exit regions
  • 675 in the CFG. You can play with it with the "opt -regions -analyze" or
    676 "opt -view-regions" commands.
    677
  • The loop optimizer has significantly improved strength reduction and analysis
  • 678 capabilities. Notably it is able to build on the trap value and signed
    679 integer overflow information to optimize <= and >= loops.
    680
  • The CallGraphSCCPassManager now has some basic support for iterating within
  • 681 an SCC when an optimizer devirtualizes a function call. This allows inlining
    682 through indirect call sites that are devirtualized by store-load forwarding
    683 and other optimizations.
    684
  • The new -loweratomic pass is available
  • 685 to lower atomic instructions into their non-atomic form. This can be useful
    686 to optimize generic code that expects to run in a single-threaded
    687 environment.
    688
    689
    690
    696
    697
    698
    699
    700 <div class="doc_subsection">
    516 <li>Link Time Optimization (LTO) has been improved to use MC for parsing inline
    517 assembly and now can build large programs like Firefox 4 on both Mac OS X and
    518 Linux.
    519
    520
  • The new -loop-idiom pass recognizes memset/memcpy loops (and memset_pattern
  • 521 on darwin), turning them into library calls, which are typically better
    522 optimized than inline code. If you are building a libc and notice that your
    523 memcpy and memset functions are compiled into infinite recursion, please build
    524 with -ffreestanding or -fno-builtin to disable this pass.
    525
    526
  • A new -early-cse pass does a fast pass over functions to fold constants,
  • 527 simplify expressions, perform simple dead store elimination, and perform
    528 common subexpression elimination. It does a good job at catching some of the
    529 trivial redundancies that exist in unoptimized code, making later passes more
    530 effective.
    531
    532
  • A new -loop-instsimplify pass is used to clean up loop bodies in the loop
  • 533 optimizer.
    534
    535
  • The new TargetLibraryInfo interface allows mid-level optimizations to know
  • 536 whether the current target's runtime library has certain functions. For
    537 example, the optimizer can now transform integer-only printf calls to call
    538 iprintf, allowing reduced code size for embedded C libraries (e.g. newlib).
    539
    540
    541
  • LLVM has a new RegionPass
  • 542 infrastructure for region-based optimizations.
    543
    544
  • Several optimizer passes have been substantially sped up:
  • 545 GVN is much faster on functions with deep dominator trees and lots of basic
    546 blocks. The dominator tree and dominance frontier passes are much faster to
    547 compute, and preserved by more passes (so they are computed less often). The
    548 -scalar-repl pass is also much faster and doesn't use DominanceFrontier.
    549
    550
    551
  • The Dead Store Elimination pass is more aggressive about optimizing stores of
  • 552 different types: e.g. a large store following a small one to the same address.
    553 The MemCpyOptimizer pass handles several new forms of memcpy elimination.
    554
    555
  • LLVM now optimizes various idioms for overflow detection into checks of the
  • 556 flag register on various CPUs. For example, we now compile:
    557
    558
    
                      
                    
    559 unsigned long t = a+b;
    560 if (t < a) ...
    561
    562 into:
    563
    
                      
                    
    564 addq %rdi, %rbx
    565 jno LBB0_2
    566
    567
    568
    569
    570
    571
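The loops that the -loop-idiom item describes look roughly like the following C sketch. The function names are hypothetical, and whether a given loop is actually rewritten into a memset or memcpy call depends on the compiler version and flags; the code only shows the source shape involved. It also illustrates the libc caveat: if these bodies were the implementation of memset or memcpy themselves, turning them into calls to those same functions would recurse, which is why the notes suggest -ffreestanding or -fno-builtin in that situation.

```c
#include <stddef.h>

/* A byte-wise fill loop: every element receives the same value,
   which is the shape of a memset. */
void zero_bytes(unsigned char *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = 0;
}

/* A byte-wise copy loop between non-overlapping buffers,
   which is the shape of a memcpy. */
void copy_bytes(unsigned char *restrict dst,
                const unsigned char *restrict src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}
```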
    572
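The overflow-detection idiom quoted in the optimizer notes can be wrapped in a small self-contained function, sketched below in C. The function names are hypothetical. Unsigned addition wraps around on overflow, so the sum is smaller than an operand exactly when the addition overflowed; this is the comparison the optimizer is described as turning into a flag check. The sketch makes no claim about which branch instruction a particular compiler emits.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if a + b does not fit in unsigned long, 0 otherwise. */
int add_overflows(unsigned long a, unsigned long b)
{
    unsigned long t = a + b; /* wraps modulo 2^N on overflow */
    return t < a;            /* wrapped sums are smaller than a */
}

/* Adds two values that the caller promises do not overflow. */
unsigned long add_checked(unsigned long a, unsigned long b)
{
    assert(!add_overflows(a, b));
    return a + b;
}
```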
    573
    574

    701575 MC Level Improvements
    702 div>
    576 h2>
    703577
    704578
    705579

    708582 and a number of other related areas that CPU instruction-set level tools work
    709583 in.

    710584
    711

    The MC subproject has made great leaps in LLVM 2.8. For example, support for

    712 directly writing .o files from LLC (and clang) now works reliably for
    713 darwin/x86[-64] (including inline assembly support) and the integrated
    714 assembler is turned on by default in Clang for these targets. This provides
    715 improved compile times among other things.

    716
    717
    718
  • The entire compiler has converted over to using the MCStreamer assembler API
  • 719 instead of writing out a .s file textually.
    720
  • The "assembler parser" is far more mature than in 2.7, supporting a full
  • 721 complement of directives, now supports assembler macros, etc.
    722
  • The "assembler backend" has been completed, including support for relaxation,
  • 723 relocation processing and all the other things that an assembler does.
    724
  • The MachO file format support is now fully functional.
  • 725
  • The MC disassembler now fully supports ARM and Thumb. ARM assembler support
  • 726 is still in early development though.
    727
  • The X86 MC assembler now supports the X86 AES and AVX instruction sets.
  • 728
  • Work on ELF and COFF object files and ARM target support is well underway,
  • 729 but isn't useful yet in LLVM 2.8. Please contact the llvmdev mailing list
    730 if you're interested in this.>
    585 >
    586
  • ELF MC support has matured enough for the integrated assembler to be turned
  • 587 on by default in Clang on X86-32 and X86-64 ELF systems.
    588
    589
  • MC supports and CodeGen uses the .file and .loc directives
  • 590 for producing line number debug info. This produces more compact line
    591 tables and easier to read .s files.
    592
    593
  • MC supports the .cfi_* directives for producing DWARF
  • 594 frame information, but it is still not used by CodeGen by default.
    595
    596
    597
  • The MC assembler now generates much better diagnostics for common errors,
  • 598 is much faster at matching instructions, is much more bug-compatible with
    599 the GAS assembler, and is now generally useful for a broad range of X86
    600 assembly.
    601
    602
  • We now have some basic internals
  • 603 documentation for MC.
    604
    605
  • .td files can now specify assembler aliases directly with the
  • 606 href="CodeGenerator.html#na_instparsing">MnemonicAlias and InstAlias
    607 tblgen classes.
    608
    609
  • LLVM now has an experimental format-independent object file manipulation
  • 610 library (lib/Object). It supports both PE/COFF and ELF. The llvm-nm tool has
    611 been extended to work with native object files, and the new llvm-objdump tool
    612 supports disassembly of object files (but no relocations are displayed yet).
    613
    614
    615
  • Win32 PE-COFF support in the MC assembler has made a lot of progress in the
  • 616 2.9 timeframe, but is still not generally useful.
    617
    731618
    732619
    733620

    For more information, please see the

    735622 LLVM MC Project Blog Post.
    736623

    737624
    738
    739
    740
    741
    742 <div class="doc_subsection">
    625 </div>
    626
    627
    628

    743629 Target Independent Code Generator Improvements
    744 div>
    630 h2>
    745631
    746632
    747633
    750636 it run faster:

    751637
    752638
    753
  • The clang/gcc -momit-leaf-frame-pointer argument is now supported.
  • 754
  • The clang/gcc -ffunction-sections and -fdata-sections arguments are now
  • 755 supported on ELF targets (like GCC).
    756
  • The MachineCSE pass is now tuned and on by default. It eliminates common
  • 757 subexpressions that are exposed when lowering to machine instructions.
    758
  • The "local" register allocator was replaced by a new "fast" register
  • 759 allocator. This new allocator (which is often used at -O0) is substantially
    760 faster and produces better code than the old local register allocator.
    761
  • A new LLC "-regalloc=default" option is available, which automatically
  • 762 chooses a register allocator based on the -O optimization level.
    763
  • The common code generator code was modified to promote illegal argument and
  • 764 return value vectors to wider ones when possible instead of scalarizing
    765 them. For example, <3 x float> will now pass in one SSE register
    766 instead of 3 on X86. This generates substantially better code since the
    767 rest of the code generator was already expecting this.
    768
  • The code generator uses a new "COPY" machine instruction. This speeds up
  • 769 the code generator and eliminates the need for targets to implement the
    770 isMoveInstr hook. Also, the copyRegToReg hook was renamed to copyPhysReg
    771 and simplified.
    772
  • The code generator now has a "LocalStackSlotPass", which optimizes stack
  • 773 slot access for targets (like ARM) that have limited stack displacement
    774 addressing.
    775
  • A new "PeepholeOptimizer" is available, which eliminates sign and zero
  • 776 extends, and optimizes away compare instructions when the condition result
    777 is available from a previous instruction.
    778
  • Atomic operations now get legalized into simpler atomic operations if not
  • 779 natively supported, easing the implementation burden on targets.
    780
  • We have added two new bottom-up pre-allocation register pressure aware schedulers:
  • 781
    782
  • The hybrid scheduler schedules aggressively to minimize schedule length when registers are available and avoids overscheduling in high pressure situations.
  • 783
  • The instruction-level-parallelism scheduler schedules for maximum ILP when registers are available and avoids overscheduling in high pressure situations.
  • 784
    785
  • The tblgen type inference algorithm was rewritten to be more consistent and
  • 786 diagnose more target bugs. If you have an out-of-tree backend, you may
    787 find that it finds bugs in your target description. This support also
    788 allows limited support for writing patterns for instructions that return
    789 multiple results (e.g. a virtual register and a flag result). The
    790 'parallel' modifier in tblgen was removed; you should use the new support
    791 for multiple results instead.
    792
  • A new (experimental) "-rendermf" pass is available which renders a
  • 793 MachineFunction into HTML, showing live ranges and other useful
    794 details.
    795
  • The new SubRegIndex tablegen class allows subregisters to be indexed
  • 796 symbolically instead of numerically. If your target uses subregisters you
    797 will need to adapt to use SubRegIndex when you upgrade to 2.8.
    798
    799
    800
  • The -fast-isel instruction selection path (used at -O0 on X86) was rewritten
  • 801 to work bottom-up on basic blocks instead of top down. This makes it
    802 slightly faster (because the MachineDCE pass is not needed any longer) and
    803 allows it to generate better code in some cases.
    804
    805
    806
    807
    808
    809 <div class="doc_subsection">
    639 <li>The pre-register-allocation (preRA) instruction scheduler models register
    640 pressure much more accurately in some cases. This allows the adoption of more
    641 aggressive scheduling heuristics without causing spills to be generated.
    642
    643
    644
  • LiveDebugVariables is a new pass that keeps track of debugging information
  • 645 for user variables that are promoted to registers in optimized builds.
    646
    647
  • The scheduler now models operand latency and pipeline forwarding.
  • 648
    649
  • A major register allocator infrastructure rewrite is underway. It is not on
  • 650 by default for 2.9 and you are not advised to use it, but it has made
    651 substantial progress in the 2.9 timeframe:
    652
    653
  • A new -regalloc=basic "basic" register allocator can be used as a simple
  • 654 fallback when debugging. It uses the new infrastructure.
    655
  • New infrastructure is in place for live range splitting. "SplitKit" can
  • 656 break a live interval into smaller pieces while preserving SSA form, and
    657 SpillPlacement can help find the best split points. This is a work in
    658 progress so the API is changing quickly.
    659
  • The inline spiller has learned to clean up after live range splitting. It
  • 660 can hoist spills out of loops, and it can eliminate redundant spills.
    661
  • Rematerialization works with live range splitting.
  • 662
  • The new "greedy" register allocator uses live range splitting. This will
  • 663 be the default register allocator in the next LLVM release, but it is not
    664 turned on by default in 2.9.
    665
    666
    667
    668
    669
    670
    671

    810672 X86-32 and X86-64 Target Improvements
    811 div>
    673 h2>
    812674
    813675
    814676

    New features and major changes in the X86 target include:

    815677

    816678
    817679
    818
  • The X86 backend now supports holding X87 floating point stack values
  • 819 in registers across basic blocks, dramatically improving performance of code
    820 that uses long double, and when targeting CPUs that don't support SSE.
    821
    822
  • The X86 backend now uses a SSEDomainFix pass to optimize SSE operations. On
  • 823 Nehalem ("Core i7") and newer CPUs there is a 2 cycle latency penalty on
    824 using a register in a different domain than where it was defined. This pass
    825 optimizes away these stalls.
    826
    827
  • The X86 backend now promotes 16-bit integer operations to 32-bits when
  • 828 possible. This avoids 0x66 prefixes, which are slow on some
    829 microarchitectures and bloat the code on all of them.
    830
    831
  • The X86 backend now supports the Microsoft "thiscall" calling convention,
  • 832 and a calling convention to support
    833 ghc.
    834
    835
  • The X86 backend supports a new "llvm.x86.int" intrinsic, which maps onto
  • 836 the X86 "int $42" and "int3" instructions.
    837
    838
  • At the IR level, the <2 x float> datatype is now promoted and passed
  • 839 around as a <4 x float> instead of being passed and returned as an MMX
    840 vector. If you have a frontend that uses this, please pass and return a
    841 <2 x i32> instead (using bitcasts).
    842
    843
  • When printing .s files in verbose assembly mode (the default for clang -S),
  • 844 the X86 backend now decodes X86 shuffle instructions and prints human
    845 readable comments after the most inscrutable of them, e.g.:
    680
  • LLVM 2.9 includes a complete reimplementation of the MMX instruction set.
  • 681 The reimplementation uses a new LLVM IR
    682 href="LangRef.html#t_x86mmx">x86_mmx type to ensure that MMX operations
    683 are only generated from source that uses MMX builtin operations. With
    684 this, random types like <2 x i32> are not turned into MMX operations
    685 (which can be catastrophic without proper "emms" insertion). Because the X86
    686 code generator always generates reliable code, the -disable-mmx flag is now
    687 removed.
    688
    689
    690
  • X86 support for FS/GS relative loads and stores using
  • 691 href="CodeGenerator.html#x86_memory">address space 256/257 works reliably
    692 now.
    693
    694
  • LLVM 2.9 generates much better code in several cases by using adc/sbb to
  • 695 avoid generation of conditional move instructions for conditional increment
    696 and other idioms.
    697
    698
  • The X86 backend has adopted a new preRA scheduling mode, "list-ilp", to
  • 699 shorten the height of instruction schedules without inducing register spills.
    700
    701
    702
  • The MC assembler supports 3dNow! and 3DNowA instructions.
  • 703
    704
  • Several bugs have been fixed in the Windows x64 code generator.
  • 705
    706
    707
    708
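The "conditional increment" idiom mentioned in the adc/sbb item has a familiar source-level shape, sketched below in C with a hypothetical function name. The counter is bumped only when a comparison holds, so the comparison result can be added into the counter directly instead of being selected with a conditional move. Whether a given compiler version actually chooses adc or sbb for this loop is an assumption that depends on the target and optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the elements of values[0..n) that are strictly below limit. */
size_t count_below(const unsigned *values, size_t n, unsigned limit)
{
    size_t count = 0;
    assert(values != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (values[i] < limit)
            ++count; /* conditional increment: add 1 only if the test holds */
    }
    return count;
}
```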
    709
    710

    711 ARM Target Improvements
    712
    713
    714
    715

    New features of the ARM target include:

    716

    717
    718
    719
  • The ARM backend now has a fast instruction selector, which dramatically
  • 720 improves -O0 compile times.
    721
  • The ARM backend has new tuning for Cortex-A8 and Cortex-A9 CPUs.
  • 722
  • The __builtin_prefetch builtin (and llvm.prefetch intrinsic) is compiled
  • 723 into prefetch instructions instead of being discarded.
    724
    725
  • The ARM backend preRA scheduler now models machine resources at cycle
  • 726 granularity. This allows the scheduler to both accurately model
    727 instruction latency and avoid overcommitting functional units.
    728
    729
  • Countless ARM microoptimizations have landed in LLVM 2.9.
  • 730
    731
    732
    733
    734

    735 Other Target Specific Improvements
    736
    737
    738
    739
    740
  • MicroBlaze: major updates for aggressive delay slot filler, MC-based
  • 741 assembly printing, assembly instruction parsing, ELF .o file emission, and MC
    742 instruction disassembler have landed.
    743
    744
  • SPARC: Many improvements, including using the Y registers for
  • 745 multiplications and addition of a simple delay slot filler.
    746
    747
  • PowerPC: The backend has been largely MC'ized and is ready to support
  • 748 directly writing out mach-o object files. No one seems interested in finishing
    749 this final step though.
    750
    751
    752
    753
    754
    755

    756 Major Changes and Removed Features
    757
    758
    759
    760
    761

    If you're already an LLVM user or developer with out-of-tree changes based

    762 on LLVM 2.8, this section lists some "gotchas" that you may run into upgrading
    763 from the previous release.

    764
    765
    766
  • This is the last release to support the llvm-gcc frontend.
  • 767
    768
  • LLVM has a new naming
  • 769 convention standard, though the codebase hasn't fully adopted it yet.
    770
    771
  • The new DIBuilder class provides a simpler interface for front ends to
  • 772 encode debug info in LLVM IR, and has replaced DIFactory.
    773
    774
  • LLVM IR and other tools always work on normalized target triples (which have
  • 775 been run through Triple::normalize).
    776
    777
  • The target triple x86_64--mingw64 is obsoleted. Use x86_64--mingw32
  • 778 instead.
    779
    780
  • The PointerTracking pass has been removed from mainline, and moved to the
  • 781 ClamAV project (its only client).
    846782
    847
    
                      
                    
    848 insertps $113, %xmm3, %xmm0 # xmm0 = zero,xmm0[1,2],xmm3[1]
    849 unpcklps %xmm1, %xmm0 # xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
    850 pshufd $1, %xmm1, %xmm1 # xmm1 = xmm1[1,0,0,0]
    851
    852
    853
    854
    855
    856
    857
    858
    859
    860 ARM Target Improvements
    861
    862
    863
    864

    New features of the ARM target include:

    865

    866
    867
    868
  • The ARM backend now optimizes tail calls into jumps.
  • 869
  • Scheduling is improved through the new list-hybrid scheduler as well
  • 870 as through better modeling of structural hazards.
    871
  • Half float instructions are now
  • 872 supported.
    873
  • NEON support has been improved to model instructions which operate on
  • 874 multiple consecutive registers more aggressively. This avoids lots of
    875 extraneous register copies.
    876
  • The ARM backend now uses a new "ARMGlobalMerge" pass, which merges several
  • 877 global variables into one, saving extra address computation (all the global
    878 variables can be accessed via same base address) and potentially reducing
    879 register pressure.
    880
    881
  • The ARM backend has received many minor improvements and tweaks which lead
  • 882 to substantially better performance in a wide range of different scenarios.
    883
    884
    885
  • The ARM NEON intrinsics have been substantially reworked to reduce
  • 886 redundancy and improve code generation. Some of the major changes are:
    887
    888
  • 889 All of the NEON load and store intrinsics (llvm.arm.neon.vld* and
    890 llvm.arm.neon.vst*) take an extra parameter to specify the alignment in bytes
    891 of the memory being accessed.
    892
    893
  • 894 The llvm.arm.neon.vaba intrinsic (vector absolute difference and
    895 accumulate) has been removed. This operation is now represented using
    896 the llvm.arm.neon.vabd intrinsic (vector absolute difference) followed by a
    897 vector add.
    898
    899
  • 900 The llvm.arm.neon.vabdl and llvm.arm.neon.vabal intrinsics (lengthening
    901 vector absolute difference with and without accumulation) have been removed.
    902 They are represented using the llvm.arm.neon.vabd intrinsic (vector absolute
    903 difference) followed by a vector zero-extend operation, and for vabal,
    904 a vector add.
    905
    906
  • 907 The llvm.arm.neon.vmovn intrinsic has been removed. Calls of this intrinsic
    908 are now replaced by vector truncate operations.
    909
    910
  • 911 The llvm.arm.neon.vmovls and llvm.arm.neon.vmovlu intrinsics have been
    912 removed. They are now represented as vector sign-extend (vmovls) and
    913 zero-extend (vmovlu) operations.
    914
    915
  • 916 The llvm.arm.neon.vaddl*, llvm.arm.neon.vaddw*, llvm.arm.neon.vsubl*, and
    917 llvm.arm.neon.vsubw* intrinsics (lengthening vector add and subtract) have
    918 been removed. They are replaced by vector add and vector subtract operations
    919 where one (vaddw, vsubw) or both (vaddl, vsubl) of the operands are either
    920 sign-extended or zero-extended.
    921
    922
  • 923 The llvm.arm.neon.vmulls, llvm.arm.neon.vmullu, llvm.arm.neon.vmlal*, and
    924 llvm.arm.neon.vmlsl* intrinsics (lengthening vector multiply with and without
    925 accumulation and subtraction) have been removed. These operations are now
    926 represented as vector multiplications where the operands are either
    927 sign-extended or zero-extended, followed by a vector add for vmlal or a
    928 vector subtract for vmlsl. Note that the polynomial vector multiply
    929 intrinsic, llvm.arm.neon.vmullp, remains unchanged.
    930
    931
    932
    933
    934
    935
    936
    937
    938
    939
    940 Major Changes and Removed Features
    941
    942
    943
    944
    945

    If you're already an LLVM user or developer with out-of-tree changes based

    946 on LLVM 2.7, this section lists some "gotchas" that you may run into upgrading
    947 from the previous release.

    948
    949
    950
  • The build configuration machinery changed the output directory names. It
  • 951 wasn't clear to many people that a "Release-Asserts" build was a release build
    952 without asserts. To make this more clear, "Release" does not include
    953 assertions and "Release+Asserts" does (likewise, "Debug" and
    954 "Debug+Asserts").
    955
  • The MSIL Backend was removed; it was unsupported and broken.
  • 956
  • The ABCD, SSI, and SCCVN passes were removed. These were not fully
  • 957 functional and their behavior has been or will be subsumed by the
    958 LazyValueInfo pass.
    959
  • The LLVM IR 'Union' feature was removed. While this is a desirable feature
  • 960 for LLVM IR to support, the existing implementation was half baked and
    961 barely useful. We'd really like anyone interested to resurrect the work and
    962 finish it for a future release.
    963
  • If you're used to reading .ll files, you'll probably notice that .ll file
  • 964 dumps don't produce #uses comments anymore. To get them, run a .bc file
    965 through "llvm-dis --show-annotations".
    966
  • Target triples are now stored in a normalized form, and all inputs from
  • 967 humans are expected to be normalized by Triple::normalize before being
    968 stored in a module triple or passed to another library.
    969
    970
    971
    972
    973

    In addition, many APIs have changed in this release. Some of the major LLVM

    974 API changes are:

    975
    976
  • LLVM 2.8 changes the internal order of operands in
  • 977 href="http://llvm.org/doxygen/classllvm_1_1InvokeInst.html">InvokeInst
    978 and CallInst.
    979 To be portable across releases, please use the CallSite class and the
    980 high-level accessors, such as getCalledValue and
    981 setUnwindDest.
    982
    983
  • 984 You can no longer pass use_iterators directly to cast<> (and similar),
    985 because these routines tend to perform costly dereference operations more
    986 than once. You have to dereference the iterators yourself and pass them in.
    987
    988
  • 989 llvm.memcpy.*, llvm.memset.*, llvm.memmove.* intrinsics take an extra
    990 parameter now ("i1 isVolatile"), totaling 5 parameters, and the pointer
    991 operands are now address-space qualified.
    992 If you were creating these intrinsic calls and prototypes yourself (as opposed
    993 to using Intrinsic::getDeclaration), you can use
    994 UpgradeIntrinsicFunction/UpgradeIntrinsicCall to be portable across releases.
    995
    996
  • 997 SetCurrentDebugLocation takes a DebugLoc now instead of a MDNode.
    998 Change your code to use
    999 SetCurrentDebugLocation(DebugLoc::getFromDILocation(...)).
    1000
    1001
  • 1002 The RegisterPass and RegisterAnalysisGroup templates are
    1003 considered deprecated, but continue to function in LLVM 2.8. Clients are
    1004 strongly advised to use the upcoming INITIALIZE_PASS() and
    1005 INITIALIZE_AG_PASS() macros instead.
    1006
    1007
  • 1008 The constructor for the Triple class no longer tries to understand odd triple
    1009 specifications. Frontends should ensure that they only pass valid triples to
    1010 LLVM. The Triple::normalize utility method has been added to help front-ends
    1011 deal with funky triples.
    1012
    1013
  • 1014 The signature of the GCMetadataPrinter::finishAssembly virtual
    1015 function changed: the raw_ostream and MCAsmInfo arguments
    1016 were dropped. GC plugins which compute stack maps must be updated to avoid
    1017 having the old definition overload the new signature.
    1018
    1019
  • 1020 The signature of MemoryBuffer::getMemBuffer changed. Unfortunately
    1021 calls intended for the old version still compile, but will not work correctly,
    1022 leading to a confusing error about an invalid header in the bitcode.
    1023
    1024
    1025
  • 1026 Some APIs were renamed:
    1027
    1028
  • llvm_report_error -> report_fatal_error
  • 1029
  • llvm_install_error_handler -> install_fatal_error_handler
  • 1030
  • llvm::DwarfExceptionHandling -> llvm::JITExceptionHandling
  • 1031
  • VISIBILITY_HIDDEN -> LLVM_LIBRARY_VISIBILITY
  • 1032
    1033
    1034
    1035
  • 1036 Some public headers were renamed:
    1037
    1038
  • llvm/Assembly/AsmAnnotationWriter.h was renamed
  • 1039 to llvm/Assembly/AssemblyAnnotationWriter.h
    1040
    1041
    1042
    1043
    1044
    1045
    1046
    1047
    1048 Development Infrastructure Changes
    1049
    1050
    1051
    1052
    1053

    This section lists changes to the LLVM development infrastructure. This

    1054 mostly impacts users who actively work on LLVM or follow development on
    1055 mainline, but may also impact users who leverage the LLVM build infrastructure
    1056 or are interested in LLVM qualification.

    1057
    1058
    1059
  • The default for make check is now to use
  • 1060 the lit testing tool, which is
    1061 part of LLVM itself. You can use lit directly as well, or use
    1062 the llvm-lit tool which is created as part of a Makefile or CMake
    1063 build (and knows how to find the appropriate tools). See the lit
    1064 documentation and the blog
    1065 post, and PR5217
    1066 for more information.
    1067
    1068
  • The LLVM test-suite infrastructure has a new "simple" test format
  • 1069 (make TEST=simple). The new format is intended to require only a
    1070 compiler and not a full set of LLVM tools. This makes it useful for testing
    1071 released compilers, for running the test suite with other compilers (for
    1072 performance comparisons), and makes sure that we are testing the compiler as
    1073 users would see it. The new format is also designed to work using reference
    1074 outputs instead of comparison to a baseline compiler, which makes it run much
    1075 faster and makes it less system dependent.
    1076
    1077
  • Significant progress has been made on a new interface to running the
  • 1078 LLVM test-suite (aka the LLVM "nightly tests") using
    1079 the LNT infrastructure. The LNT
    1080 interface to the test-suite brings significantly improved reporting
    1081 capabilities for monitoring the correctness and generated code quality
    1082 produced by LLVM over time.
    1083
    1084
    1085
    1086
    1087 <div class="doc_section">
    783 <li>The LoopIndexSplit, LiveValues, SimplifyHalfPowrLibCalls, GEPSplitter, and
    784 PartialSpecialization passes were removed. They were unmaintained,
    785 buggy, or deemed to be a bad idea.
    786
    787
    788
    789
    790
    791

    792 Internal API Changes
    793
    794
    795
    796
    797

    In addition, many APIs have changed in this release. Some of the major

    798 LLVM API changes are:

    799
    800
    801
  • include/llvm/System merged into include/llvm/Support.
  • 802
  • The llvm::APInt API was significantly
  • 803 cleaned up.
    804
    805
  • In the code generator, MVT::Flag was renamed to MVT::Glue to more accurately
  • 806 describe its behavior.
    807
    808
  • The system_error header from C++0x was added, and is now pervasively used to
  • 809 capture and handle i/o and other errors in LLVM.
    810
    811
  • The old sys::Path API has been deprecated in favor of the new PathV2 API,
  • 812 which is more efficient and flexible.
    813
    814
    815
    816
    817

    1088818 Known Problems
    1089 div>
    819 h1>
    1090820
    1091821
    1092822
    1099829
    1100830
    1101831
    1102 <div class="doc_subsection">
    832 <h2>
    1103833 Experimental features included with this release
    1104 div>
    834 h2>
    1105835
    1106836
    1107837
    1113843 href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVMdev list.

    1114844
    1115845
    1116
  • The Alpha, Blackfin, CellSPU, MicroBlaze, MSP430, MIPS, SystemZ
  • 846
  • The Alpha, Blackfin, CellSPU, MicroBlaze, MSP430, MIPS, PTX, SystemZ
  • 1117847 and XCore backends are experimental.
    1118848
  • llc "-filetype=obj" is experimental on all targets
  • 1119 other than darwin-i386 and darwin-x86_64.
    849 other than darwin and ELF X86 systems.
    850
    1120851
    1121852
    1122853
    1123854
    1124855
    1125 <div class="doc_subsection">
    856 <h2>
    1126857 Known problems with the X86 back-end
    1127 div>
    858 h2>
    1128859
    1129860
    1130861
    1133864 all inline assembly that uses the X86
    1134865 floating point stack. It supports the 'f' and 't' constraints, but not
    1135866 'u'.
    1136
  • Win64 code generation wasn't widely tested. Everything should work, but we
  • 1137 expect small issues to happen. Also, llvm-gcc cannot build the mingw64
    1138 runtime currently due to lack of support for the 'u' inline assembly
    1139 constraint and for X87 floating point inline assembly.
    1140867
  • The X86-64 backend does not yet support the LLVM IR instruction
  • 1141868 va_arg. Currently, front-ends support variadic
    1142869 argument constructs on X86-64 by lowering them manually.
    870
  • The Windows x64 (aka Win64) code generator has a few issues.
  • 871
    872
  • llvm-gcc cannot build the mingw-w64 runtime currently
  • 873 due to lack of support for the 'u' inline assembly
    874 constraint and for X87 floating point inline assembly.
    875
  • On mingw-w64, you will see unresolved symbol __chkstk
  • 876 due to Bug 8919.
    877 It is fixed in r128206.
    878
  • Misaligned MOVDQA might crash your program. It is due to
  • 879 Bug 9483,
    880 lack of handling aligned internal globals.
    881
    882
    883
    1143884
    1144885
    1145886
    1146887
    1147888
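The va_arg item above concerns the LLVM IR instruction, not the C macro of the same name. As a minimal sketch of the kind of source construct involved, the variadic function below uses the standard <stdarg.h> macros; per the note, front-ends lower such code manually on X86-64 instead of emitting the IR va_arg instruction. The function name sum_ints is hypothetical and only illustrates the pattern.

```c
#include <stdarg.h>

/* Hypothetical helper: sums the `count` int arguments that follow `count`. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);              /* begin walking the variadic arguments */
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);     /* C-level va_arg, lowered by the front-end */
    }
    va_end(args);
    return total;
}
```

Calling sum_ints(3, 1, 2, 3) adds the three trailing arguments; the first argument tells the function how many values to read.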
    1148 <div class="doc_subsection">
    889 <h2>
    1149890 Known problems with the PowerPC back-end
    1150 </div>
    891 </h2>
    1151892
    1152893
    1153894
    1159900
    1160901
    1161902
    1162 <div class="doc_subsection">
    903 <h2>
    1163904 Known problems with the ARM back-end
    1164 </div>
    905 </h2>
    1165906
    1166907
    1167908
    1176917
    1177918
    1178919
    1179 <div class="doc_subsection">
    920 <h2>
    1180921 Known problems with the SPARC back-end
    1181 </div>
    922 </h2>
    1182923
    1183924
    1184925
    1190931
    1191932
    1192933
    1193 <div class="doc_subsection">
    934 <h2>
    1194935 Known problems with the MIPS back-end
    1195 </div>
    936 </h2>
    1196937
    1197938
    1198939
    1203944
    1204945
    1205946
    1206 <div class="doc_subsection">
    947 <h2>
    1207948 Known problems with the Alpha back-end
    1208 </div>
    949 </h2>
    1209950
    1210951
    1211952
    1218959
    1219960
    1220961
    1221 <div class="doc_subsection">
    962 <h2>
    1222963 Known problems with the C back-end
    1223 </div>
    964 </h2>
    1224965
    1225966
    1226967
    1241982
    1242983
    1243984
    1244 <div class="doc_subsection">
    985 <h2>
    1245986 Known problems with the llvm-gcc front-end
    1246
    1247
    1248 <div class="doc_text">
    987 </h2>
    988
    989
    990
    991

    LLVM 2.9 will be the last release of llvm-gcc.

    1249992
    1250993

    llvm-gcc is generally very stable for the C family of languages. The only

    1251994 major language feature of GCC not supported by llvm-gcc is the
    12671010
    12681011
    12691012
    1270 <div class="doc_section">
    1013 <h1>
    12711014 Additional Information
    1272 </div>
    1015 </h1>
    12731016
    12741017
    12751018