llvm.org GIT mirror llvm / 43ab46e
Update release notes for 2.8 release. git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_28@115548 91177308-0d34-0410-b5e6-96231b3b80d8 Bill Wendling 8 years ago
1 changed file(s) with 815 addition(s) and 385 deletion(s).
LLVM 2.8 Release Notes
  • External Projects Using LLVM 2.8
  • What's New in LLVM 2.8?
  • Installation Instructions
  • Portability and Supported Platforms
  • Known Problems
  • Additional Information

    Written by the LLVM Team

    Almost dead code.
    include/llvm/Analysis/LiveValues.h => Dan
    lib/Transforms/IPO/MergeFunctions.cpp => consider for 2.8.
    llvm/Analysis/PointerTracking.h => Edwin wants this, consider for 2.8.
    GEPSplitterPass
    -->
    standards, fast compilation, and low memory use. Like LLVM, Clang provides a
    modular, library-based architecture that makes it suitable for creating or
    integrating with other development tools. Clang is considered a
    production-quality compiler for C and Objective-C on x86 (32- and 64-bit).

    production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86
    (32- and 64-bit), and for darwin-arm targets.


    In the LLVM 2.8 time-frame, the Clang team has made many improvements:

  • Clang C++ is now feature-complete with respect to the ISO C++ 1998 and 2003 standards.
  • Added support for Objective-C++.
  • Clang now uses LLVM-MC to directly generate object code and to parse inline assembly (on Darwin).
  • Introduced many new warnings, including -Wmissing-field-initializers, -Wshadow, -Wno-protocol, -Wtautological-compare, -Wstrict-selector-match, -Wcast-align, -Wunused improvements, and greatly improved format-string checking.
  • Introduced the "libclang" library, a C interface to Clang intended to support IDE clients.
  • Added support for #pragma GCC visibility, #pragma align, and others.
  • Added support for SSE, ARM NEON, and Altivec.
  • Improved support for many Microsoft extensions.
  • Implemented support for blocks in C++.
  • Implemented precompiled headers for C++.
  • Improved abstract syntax trees to retain more accurate source information.
  • Added driver support for handling LLVM IR and bitcode files directly.
  • Major improvements to compiler correctness for exception handling.
  • Improved generated code quality in some areas:
      • Good code generation for X86-32 and X86-64 ABI handling.
      • Improved code generation for bit-fields, although important work remains.

    future!). The tool is very good at finding bugs that occur on specific
    paths through code, such as on error conditions.

    In the LLVM 2.8 time-frame,

    The LLVM 2.8 release fixes a number of bugs and slightly improves precision
    over 2.7, but there are no major new features in the release.

    DragonEgg: llvm-gcc ported to gcc-4.5

    DragonEgg is a port of llvm-gcc to
    gcc-4.5. Unlike llvm-gcc, dragonegg in theory does not require any gcc-4.5
    modifications whatsoever (currently one small patch is needed) thanks to the
    new gcc plugin architecture.
    DragonEgg is a gcc plugin that makes gcc-4.5 use the LLVM optimizers and code
    generators instead of gcc's, just like with llvm-gcc.

    DragonEgg is still a work in progress, but it is able to compile a lot of code,
    for example all of gcc, LLVM and clang. Currently Ada, C, C++ and Fortran work
    well, while all other languages either don't work at all or only work poorly.
    For the moment only the x86-32 and x86-64 targets are supported, and only on
    linux and darwin (darwin may need additional gcc patches).

    The 2.8 release has the following notable changes:

  • The plugin loads faster due to exporting fewer symbols.
  • Additional vector operations such as addps256 are now supported.
  • Ada global variables with no initial value are no longer zero initialized,
    resulting in better optimization.
  • The '-fplugin-arg-dragonegg-enable-gcc-optzns' flag now runs all gcc
    optimizers, rather than just a handful.
  • Fortran programs using common variables now link correctly.
  • GNU OMP constructs no longer crash the compiler.

    The VMKit project is an implementation of
    a JVM and a CLI Virtual Machine (Microsoft .NET is an
    implementation of the CLI) using LLVM for static and just-in-time
    compilation.

    With the release of LLVM 2.8, ...

    a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and
    just-in-time compilation. As of LLVM 2.8, VMKit now supports copying garbage
    collectors, and can be configured to use MMTk's copy mark-sweep garbage
    collector. In LLVM 2.8, the VMKit .NET VM is no longer being maintained.

    All of the code in the compiler-rt project is available under the standard LLVM
    License, a "BSD-style" license. New in LLVM 2.8:

    Soft float support

    DragonEgg: llvm-gcc ported to gcc-4.5

    DragonEgg is a port of llvm-gcc to
    gcc-4.5. Unlike llvm-gcc, which makes many intrusive changes to the underlying
    gcc-4.2 code, dragonegg in theory does not require any gcc-4.5 modifications
    whatsoever (currently one small patch is needed). This is thanks to the new
    gcc plugin architecture, which
    makes it possible to modify the behaviour of gcc at runtime by loading a plugin,
    which is nothing more than a dynamic library which conforms to the gcc plugin
    interface. DragonEgg is a gcc plugin that causes the LLVM optimizers to be run
    instead of the gcc optimizers, and the LLVM code generators instead of the gcc
    code generators, just like llvm-gcc. To use it, you add
    "-fplugin=path/dragonegg.so" to the gcc-4.5 command line, and gcc-4.5 magically
    becomes llvm-gcc-4.5!

    DragonEgg is still a work in progress. Currently C works very well, while C++,
    Ada and Fortran work fairly well. All other languages either don't work at all,
    or only work poorly. For the moment only the x86-32 and x86-64 targets are
    supported, and only on linux and darwin (darwin needs an additional gcc patch).

    2.8 status here.

    llvm-mc: Machine Code Toolkit

    The LLVM Machine Code (aka MC) sub-project of LLVM was created to solve a number
    of problems in the realm of assembly, disassembly, object file format handling,
    and a number of other related areas that CPU instruction-set level tools work
    in. It is a sub-project of LLVM which provides it with a number of advantages
    over other compilers that do not have tightly integrated assembly-level tools.
    For a gentle introduction, please see the Intro to the LLVM MC Project Blog Post
    (http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html).

    2.8 status here

    License, a "BSD-style" license. New in LLVM 2.8, compiler_rt now supports
    soft floating point (for targets that don't have a real floating point unit),
    and includes an extensive testsuite for the "blocks" language feature and the
    blocks runtime included in compiler_rt.

    LLDB: Low Level Debugger

    LLDB is a brand new member of the LLVM
    umbrella of projects. LLDB is a next generation, high-performance debugger. It
    is built as a set of reusable components which highly leverage existing
    libraries in the larger LLVM Project, such as the Clang expression parser, the
    LLVM disassembler and the LLVM JIT.

    LLDB is in early development and not included as part of the LLVM 2.8 release,
    but is mature enough to support basic debugging scenarios on Mac OS X in C,
    Objective-C and C++. We'd really like help extending and expanding LLDB to
    support new platforms, new languages, new architectures, and new features.

    libc++: C++ Standard Library

    libc++ is another new member of the LLVM
    family. It is an implementation of the C++ standard library, written from the
    ground up to specifically target the forthcoming C++0x standard and focus on
    delivering great performance.

    As of the LLVM 2.8 release, libc++ is virtually feature complete, but would
    benefit from more testing and better integration with Clang++. It is also
    looking forward to the C++ committee finalizing the C++0x standard.

    KLEE: A Symbolic Execution Virtual Machine

    KLEE is a symbolic execution framework for
    programs in LLVM bitcode form. KLEE tries to symbolically evaluate "all" paths
    through the application and records state transitions that lead to fault
    states. This allows it to construct testcases that lead to faults and can even
    be used to verify some algorithms.

    Although KLEE does not have any major new features as of 2.8, we have made
    various minor improvements, particularly to ease development:

  • Added support for LLVM 2.8. KLEE currently maintains compatibility with
    LLVM 2.6, 2.7, and 2.8.
  • Added a buildbot for 2.6, 2.7, and trunk. A 2.8 buildbot will be coming
    soon following release.
  • Fixed many C++ code issues to allow building with Clang++. Mostly
    complete, except for the version of MiniSAT which is inside the KLEE STP
    version.
  • Improved support for building with separate source and build
    directories.
  • Added support for "long double" on x86.
  • Initial work on KLEE support for using the 'lit' test runner instead of
    DejaGNU.
  • Added configure support for using an external version of
    STP.

    projects that have already been updated to work with LLVM 2.8.

    TTA-based Codesign Environment (TCE)

    TCE is a toolset for designing
    application-specific processors (ASP) based on the Transport triggered
    architecture (TTA). The toolset provides a complete co-design flow from C/C++
    programs down to synthesizable VHDL and parallel program binaries. Processor
    customization points include the register files, function units, supported
    operations, and the interconnection network.

    TCE uses llvm-gcc/Clang and LLVM for C/C++ language support, target
    independent optimizations and also for parts of code generation. It generates
    new LLVM-based code generators "on the fly" for the designed TTA processors and
    loads them into the compiler backend as runtime libraries to avoid per-target
    recompilation of larger parts of the compiler chain.

    Horizon Bytecode Compiler

    Horizon is a bytecode
    language and compiler written on top of LLVM, intended for producing
    single-address-space managed code operating systems that
    run faster than the equivalent multiple-address-space C systems.
    A more in-depth blurb is available on the wiki
    (http://www.quokforge.org/projects/horizon/wiki/Wiki).

    Clam AntiVirus

    Clam AntiVirus is an open source (GPL)
    anti-virus toolkit for UNIX, designed especially for e-mail scanning on mail
    gateways. Since version 0.96 it has bytecode signatures
    (http://vrt-sourcefire.blogspot.com/2010/09/introduction-to-clamavs-low-level.html)
    that allow writing detections for complex malware. It
    uses LLVM's JIT to speed up the execution of bytecode on
    X86, X86-64, and PPC32/64, falling back to its own interpreter otherwise.
    The git version was updated to work with LLVM 2.8.

    The ClamAV bytecode compiler
    (http://git.clamav.net/gitweb?p=clamav-bytecode-compiler.git;a=blob_plain;f=docs/user/clambc-user.pdf)
    uses Clang and LLVM to compile a C-like
    language, insert runtime checks, and generate ClamAV bytecode.

    Pure

    Pure
    is an algebraic/functional
    programming language based on term rewriting. Programs are collections
    of equations which are used to evaluate expressions in a symbolic
    fashion. Pure offers dynamic typing, eager and lazy evaluation, lexical
    closures, a hygienic macro system (also based on term rewriting),
    built-in list and matrix support (including list and matrix
    comprehensions) and an easy-to-use C interface. The interpreter uses
    LLVM as a backend to JIT-compile Pure programs to fast native code.

    Pure versions 0.44 and later have been tested and are known to work with
    LLVM 2.8 (and continue to work with older LLVM releases >= 2.5).

    Glasgow Haskell Compiler (GHC)

    GHC is an open source,
    state-of-the-art programming suite for
    Haskell, a standard lazy functional programming language. It includes
    an optimizing static compiler generating good code for a variety of
    platforms, together with an interactive system for convenient, quick
    development.

    In addition to the existing C and native code generators, GHC 7.0 now
    supports an LLVM code generator
    (http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/Backends/LLVM).
    GHC supports LLVM 2.7 and later.

    Clay Programming Language

    Clay is a new systems programming
    language that is specifically designed for generic programming. It makes
    generic programming very concise thanks to whole program type propagation. It
    uses LLVM as its backend.

    llvm-py Python Bindings for LLVM

    llvm-py has been updated to work
    with LLVM 2.8. llvm-py provides Python bindings for LLVM, allowing you to write a
    compiler backend or a VM in Python.

    FAUST Real-Time Audio Signal Processing Language

    FAUST is a compiled language for real-time
    audio signal processing. The name FAUST stands for Functional AUdio STream. Its
    programming model combines two approaches: functional programming and block
    diagram composition. In addition to the C, C++, and Java output formats, the
    Faust compiler can now generate LLVM bitcode, and works with LLVM 2.7 and
    2.8.

    Jade Just-in-time Adaptive Decoder Engine

    Jade (http://sourceforge.net/apps/trac/orcc/wiki/JadeDocumentation)
    (Just-in-time Adaptive Decoder Engine) is a generic video decoder engine using
    LLVM for just-in-time compilation of video decoder configurations. Those
    configurations are designed by the MPEG Reconfigurable Video Coding (RVC) committee.
    The MPEG RVC standard is built on a stream-based dataflow representation of
    decoders. It is composed of a standard library of coding tools written in
    the RVC-CAL language and a dataflow configuration (block diagram)
    of a decoder.

    The Jade project is hosted as part of the Open
    RVC-CAL Compiler and requires it to translate the RVC-CAL standard library
    of video coding tools into LLVM assembly code.

    LLVM JIT for Neko VM

    Neko LLVM JIT
    replaces the standard Neko JIT with an LLVM-based implementation. While not
    fully complete, it is already providing a 1.5x speedup on 64-bit systems.
    Neko LLVM JIT requires LLVM 2.8 or later.

    Crack Scripting Language

    Crack aims to provide
    the ease of development of a scripting language with the performance of a
    compiled language. The language derives concepts from C++, Java and Python,
    incorporating object-oriented programming, operator overloading and strong
    typing. Crack 0.2 works with LLVM 2.7, and the forthcoming Crack 0.2.1 release
    builds on LLVM 2.8.

    Dresden TM Compiler (DTMC)

    DTMC provides support for
    Transactional Memory, which is an easy-to-use and efficient way to synchronize
    accesses to shared memory. Transactions can contain normal C/C++ code (e.g.,
    __transaction { list.remove(x); x.refCount--; }) and will be executed
    virtually atomically and isolated from other transactions.

    Kai Programming Language

    Kai (Japanese 会 for
    meeting/gathering) is an experimental interpreter that provides a highly
    extensible runtime environment and explicit control over the compilation
    process. Programs are defined using nested symbolic expressions, which are all
    parsed into first-class values with minimal intrinsic semantics. Kai can
    generate optimised code at run-time (using LLVM) in order to exploit the nature
    of the underlying hardware and to integrate with external software libraries.
    It is a unique exploration into the world of dynamic code compilation, and the
    interaction between high level and low level semantics.

    OSL: Open Shading Language

    OSL is a shading
    language designed for use in physically based renderers and in particular
    production rendering. By using LLVM instead of the interpreter, it was able to
    meet its performance goals (>= C-code) while retaining the benefits of
    runtime specialization and a portable high-level language.

    LLVM Community Changes

    In addition to changes to the code, between LLVM 2.7 and 2.8, a number of
    organization changes have happened:

    Major New Features

    LLVM 2.8 includes several major new capabilities:

  • As mentioned above, libc++ and
    LLDB are major new additions to the LLVM collective.
  • LLVM 2.8 now has pretty decent support for debugging optimized code. You
    should be able to reliably get debug info for function arguments, assuming
    that the value is actually available where you have stopped.
  • A new 'llvm-diff' tool is available that does a semantic diff of .ll
    files.
  • The MC subproject has made major progress in this release.
    Direct .o file writing support for darwin/x86[-64] is now reliable and
    support for other targets and object file formats is in progress.

    expose new optimization opportunities:

  • The memcpy, memmove, and memset
    intrinsics now take address space qualified pointers and a bit to indicate
    whether the transfer is "volatile" or not.
  • Per-instruction debug info metadata is much faster and uses less memory by
    using the new DebugLoc class.
  • LLVM IR now has a more formalized concept of "trap values"
    (LangRef.html#trapvalues), which allow the optimizer
    to optimize more aggressively in the presence of undefined behavior, while
    still producing predictable results.
  • LLVM IR now supports two new linkage
    types (linker_private_weak and linker_private_weak_def_auto) which map
    onto some obscure MachO concepts.

    Optimizer Improvements

    In addition to a large array of minor performance tweaks and bug fixes, this
    release includes a few major enhancements and additions to the optimizers:

  • As mentioned above, the optimizer now has support for updating debug
    information as it goes. A key aspect of this is the new llvm.dbg.value
    intrinsic (SourceLevelDebugging.html#format_common_value).
    This intrinsic represents debug info for variables that are
    promoted to SSA values (typically by mem2reg or the -scalarrepl passes).
  • The JumpThreading pass is now much more aggressive about implied value
    relations, allowing it to thread conditions like "a == 4" when a is known to
    be 13 in one of the predecessors of a block. It does this in conjunction
    with the new LazyValueInfo analysis pass.
  • The new RegionInfo analysis pass identifies single-entry single-exit regions
    in the CFG. You can play with it with the "opt -regions -analyze" or
    "opt -view-regions" commands.
  • The loop optimizer has significantly improved strength reduction and analysis
    capabilities. Notably it is able to build on the trap value and signed
    integer overflow information to optimize <= and >= loops.
  • The CallGraphSCCPassManager now has some basic support for iterating within
    an SCC when an optimizer devirtualizes a function call. This allows inlining
    through indirect call sites that are devirtualized by store-load forwarding
    and other optimizations.
  • The new -loweratomic pass is available
    to lower atomic instructions into their non-atomic form. This can be useful
    to optimize generic code that expects to run in a single-threaded
    environment.
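
    As a rough illustration of the JumpThreading item above, the following Python
    sketch models the decision made for each incoming edge of a block that ends in
    a test such as "a == 4". It is a toy model, not LLVM's implementation: the
    function name, the edge labels, and the dictionary standing in for what a
    value analysis such as LazyValueInfo might report are all hypothetical.

```python
from typing import Dict, Optional


def thread_edges(compare_to: int,
                 known_on_edge: Dict[str, Optional[int]]) -> Dict[str, str]:
    """Decide, per predecessor, where its edge into the test block may go.

    known_on_edge maps a predecessor name to the constant the tested value is
    known to hold on that edge, or None when nothing is known (assumed input).
    """
    targets = {}
    for pred, value in known_on_edge.items():
        if value is None:
            # Nothing known on this edge: the comparison must still execute.
            targets[pred] = "test_block"
        elif value == compare_to:
            # Condition is known true: jump straight to the true successor.
            targets[pred] = "true_successor"
        else:
            # Condition is known false: jump straight to the false successor.
            targets[pred] = "false_successor"
    return targets


# "a == 4" where a is known to be 13 when arriving from pred_a.
result = thread_edges(4, {"pred_a": 13, "pred_b": None})
print(result)
```

    In this model the edge from pred_a skips the comparison entirely, while the
    edge from pred_b still goes through the test block.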
    MC Level Improvements

    The LLVM Machine Code (aka MC) subsystem was created to solve a number
    of problems in the realm of assembly, disassembly, object file format handling,
    and a number of other related areas that CPU instruction-set level tools work
    in.

    The MC subproject has made great leaps in LLVM 2.8. For example, support for
    directly writing .o files from LLC (and clang) now works reliably for
    darwin/x86[-64] (including inline assembly support) and the integrated
    assembler is turned on by default in Clang for these targets. This provides
    improved compile times among other things.

  • The entire compiler has converted over to using the MCStreamer assembler API
    instead of writing out a .s file textually.
  • The "assembler parser" is far more mature than in 2.7: it supports a full
    complement of directives, assembler macros, etc.
  • The "assembler backend" has been completed, including support for relaxation,
    relocation processing, and all the other things that an assembler does.
  • The MachO file format support is now fully functional.
  • The MC disassembler now fully supports ARM and Thumb. ARM assembler support
    is still in early development though.
  • The X86 MC assembler now supports the X86 AES and AVX instruction sets.
  • Work on ELF and COFF object files and ARM target support is well underway,
    but isn't useful yet in LLVM 2.8. Please contact the llvmdev mailing list
    if you're interested in this.

    For more information, please see the Intro to the LLVM MC Project Blog Post
    (http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html).

    Target Independent Code Generator Improvements

    We have put a significant amount of work into the code generator
    infrastructure, which allows us to implement more aggressive algorithms and make
    it run faster:

  • The clang/gcc -momit-leaf-frame-pointer argument is now supported.
  • The clang/gcc -ffunction-sections and -fdata-sections arguments are now
    supported on ELF targets (like GCC).
  • The MachineCSE pass is now tuned and on by default. It eliminates common
    subexpressions that are exposed when lowering to machine instructions.
  • The "local" register allocator was replaced by a new "fast" register
    allocator. This new allocator (which is often used at -O0) is substantially
    faster and produces better code than the old local register allocator.
  • A new LLC "-regalloc=default" option is available, which automatically
    chooses a register allocator based on the -O optimization level.
  • The common code generator code was modified to promote illegal argument and
    return value vectors to wider ones when possible instead of scalarizing
    them. For example, <3 x float> will now pass in one SSE register
    instead of 3 on X86. This generates substantially better code since the
    rest of the code generator was already expecting this.
  • The code generator uses a new "COPY" machine instruction. This speeds up
    the code generator and eliminates the need for targets to implement the
    isMoveInstr hook. Also, the copyRegToReg hook was renamed to copyPhysReg
    and simplified.
  • The code generator now has a "LocalStackSlotPass", which optimizes stack
    slot access for targets (like ARM) that have limited stack displacement
    addressing.
  • A new "PeepholeOptimizer" is available, which eliminates sign and zero
    extends, and optimizes away compare instructions when the condition result
    is available from a previous instruction.
  • Atomic operations now get legalized into simpler atomic operations if not
    natively supported, easing the implementation burden on targets.
  • We have added two new bottom-up pre-allocation register pressure aware schedulers:
      • The hybrid scheduler schedules aggressively to minimize schedule length when registers are available and avoids overscheduling in high pressure situations.
      • The instruction-level-parallelism scheduler schedules for maximum ILP when registers are available and avoids overscheduling in high pressure situations.
  • The tblgen type inference algorithm was rewritten to be more consistent and
    diagnose more target bugs. If you have an out-of-tree backend, you may
    find that it finds bugs in your target description. This support also
    allows limited support for writing patterns for instructions that return
    multiple results (e.g. a virtual register and a flag result). The
    'parallel' modifier in tblgen was removed; you should use the new support
    for multiple results instead.
  • A new (experimental) "-rendermf" pass is available which renders a
    MachineFunction into HTML, showing live ranges and other useful
    details.
  • The new SubRegIndex tablegen class allows subregisters to be indexed
    symbolically instead of numerically. If your target uses subregisters you
    will need to adapt to use SubRegIndex when you upgrade to 2.8.
  • The -fast-isel instruction selection path (used at -O0 on X86) was rewritten
    to work bottom-up on basic blocks instead of top down. This makes it
    slightly faster (because the MachineDCE pass is not needed any longer) and
    allows it to generate better code in some cases.

    X86-32 and X86-64 Target Improvements

    New features and major changes in the X86 target include:

  • The X86 backend now supports holding X87 floating point stack values
    in registers across basic blocks, dramatically improving performance of code
    that uses long double, and when targeting CPUs that don't support SSE.
  • The X86 backend now uses a SSEDomainFix pass to optimize SSE operations. On
    Nehalem ("Core i7") and newer CPUs there is a 2 cycle latency penalty on
    using a register in a different domain than where it was defined. This pass
    optimizes away these stalls.
  • The X86 backend now promotes 16-bit integer operations to 32 bits when
    possible. This avoids 0x66 prefixes, which are slow on some
    microarchitectures and bloat the code on all of them.
  • The X86 backend now supports the Microsoft "thiscall" calling convention,
    and a calling convention to support
    ghc.
  • The X86 backend supports a new "llvm.x86.int" intrinsic, which maps onto
    the X86 "int $42" and "int3" instructions.
  • At the IR level, the <2 x float> datatype is now promoted and passed
    around as a <4 x float> instead of being passed and returned as an MMX
    vector. If you have a frontend that uses this, please pass and return a
    <2 x i32> instead (using bitcasts).
  • When printing .s files in verbose assembly mode (the default for clang -S),
    the X86 backend now decodes X86 shuffle instructions and prints human
    readable comments after the most inscrutable of them, e.g.:

        insertps $113, %xmm3, %xmm0 # xmm0 = zero,xmm0[1,2],xmm3[1]
        unpcklps %xmm1, %xmm0 # xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
        pshufd $1, %xmm1, %xmm1 # xmm1 = xmm1[1,0,0,0]
    851
    852
    853
    854
    855
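The 16-bit promotion described above is legal because, for operations such as add and multiply, the low 16 bits of a 32-bit result equal the 16-bit result. The following is a minimal sketch of that identity in portable C++, not the backend's actual implementation; the function names are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Reference 16-bit add: the result wraps modulo 2^16.
// (Hypothetical helper; on x86 a true 16-bit add needs a 0x66 prefix.)
uint16_t add16(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>(static_cast<uint32_t>(a) + b);
}

// The same add carried out in 32 bits and truncated afterwards.
// The upper 16 bits of the wide result are simply discarded.
uint16_t add16Promoted(uint16_t a, uint16_t b) {
    uint32_t wide = static_cast<uint32_t>(a) + static_cast<uint32_t>(b);
    return static_cast<uint16_t>(wide & 0xFFFFu);
}

// Multiply behaves the same way: only the low 16 bits are kept.
// Unsigned 32-bit arithmetic is used so that overflow is well defined.
uint16_t mul16Promoted(uint16_t a, uint16_t b) {
    uint32_t wide = static_cast<uint32_t>(a) * static_cast<uint32_t>(b);
    return static_cast<uint16_t>(wide & 0xFFFFu);
}
```

Operations whose result depends on the upper bits of the inputs (for example right shifts or division) would need the inputs explicitly extended first, which is why the note says the promotion happens "when possible".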
    856
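To show how a shuffle comment like the pshufd one above can be derived from the immediate, here is a small sketch of a decoder. It assumes the standard pshufd encoding, where each pair of bits in the 8-bit immediate selects the source element for one destination lane. The function name decodePshufd is hypothetical and is not the name used inside the X86 backend.

```cpp
#include <cassert>
#include <string>

// Build a comment-style description of a pshufd shuffle.
// Bits [1:0] of imm select the source element for lane 0,
// bits [3:2] for lane 1, bits [5:4] for lane 2, bits [7:6] for lane 3.
std::string decodePshufd(unsigned imm, const std::string &src) {
    std::string out = src + "[";
    for (int lane = 0; lane < 4; ++lane) {
        unsigned sel = (imm >> (2 * lane)) & 3u;  // 2-bit selector for this lane
        if (lane != 0)
            out += ",";
        out += std::to_string(sel);
    }
    out += "]";
    return out;
}
```

With an immediate of 1 and a source of "xmm1", this yields "xmm1[1,0,0,0]", which matches the comment shown for the pshufd line in the example output.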
    857
    858
    859
    860 ARM Target Improvements
    861
    862
    863
    864

    New features of the ARM target include:

    865

    866
    867
    868
  • The ARM backend now optimizes tail calls into jumps.
  • 869
  • Scheduling is improved through the new list-hybrid scheduler as well
  • 870 as through better modeling of structural hazards.
    871
  • Half float instructions are now
  • 872 supported.
    873
  • NEON support has been improved to model instructions which operate on
  • 874 multiple consecutive registers more aggressively. This avoids lots of
    875 extraneous register copies.
    876
  • The ARM backend now uses a new "ARMGlobalMerge" pass, which merges several
  • 877 global variables into one, saving extra address computation (all the global
    878 variables can be accessed via same base address) and potentially reducing
    879 register pressure.
    880
    881
  • The ARM backend has received many minor improvements and tweaks which lead to
  • 882 substantially better performance in a wide range of different scenarios.
    883
    884
  • The ARM NEON intrinsics have been substantially reworked to reduce
  • 885 redundancy and improve code generation. Some of the major changes are:
    886
    887
  • 888 All of the NEON load and store intrinsics (llvm.arm.neon.vld* and
    889 llvm.arm.neon.vst*) take an extra parameter to specify the alignment in bytes
    890 of the memory being accessed.
    891
    892
  • 893 The llvm.arm.neon.vaba intrinsic (vector absolute difference and
    894 accumulate) has been removed. This operation is now represented using
    895 the llvm.arm.neon.vabd intrinsic (vector absolute difference) followed by a
    896 vector add.
    897
    898
  • 899 The llvm.arm.neon.vabdl and llvm.arm.neon.vabal intrinsics (lengthening
    900 vector absolute difference with and without accumulation) have been removed.
    901 They are represented using the llvm.arm.neon.vabd intrinsic (vector absolute
    902 difference) followed by a vector zero-extend operation, and for vabal,
    903 a vector add.
    904
    905
  • 906 The llvm.arm.neon.vmovn intrinsic has been removed. Calls of this intrinsic
    907 are now replaced by vector truncate operations.
    908
    909
  • 910 The llvm.arm.neon.vmovls and llvm.arm.neon.vmovlu intrinsics have been
    911 removed. They are now represented as vector sign-extend (vmovls) and
    912 zero-extend (vmovlu) operations.
    913
    914
  • 915 The llvm.arm.neon.vaddl*, llvm.arm.neon.vaddw*, llvm.arm.neon.vsubl*, and
    916 llvm.arm.neon.vsubw* intrinsics (lengthening vector add and subtract) have
    917 been removed. They are replaced by vector add and vector subtract operations
    918 where one (vaddw, vsubw) or both (vaddl, vsubl) of the operands are either
    919 sign-extended or zero-extended.
    920
    921
  • 922 The llvm.arm.neon.vmulls, llvm.arm.neon.vmullu, llvm.arm.neon.vmlal*, and
    923 llvm.arm.neon.vmlsl* intrinsics (lengthening vector multiply with and without
    924 accumulation and subtraction) have been removed. These operations are now
    925 represented as vector multiplications where the operands are either
    926 sign-extended or zero-extended, followed by a vector add for vmlal or a
    927 vector subtract for vmlsl. Note that the polynomial vector multiply
    928 intrinsic, llvm.arm.neon.vmullp, remains unchanged.
    929
    930
    931
    932
    933
    934
    935
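The NEON intrinsic removals above all follow one pattern: a dedicated intrinsic is replaced by ordinary extend, truncate, add and subtract operations. The sketch below models three of them lane by lane in plain C++ for signed 16-bit elements. It is only an illustration of the semantics described in the list, not LLVM IR and not the backend's code; the type aliases and function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4i16 = std::array<int16_t, 4>;  // four 16-bit lanes
using V4i32 = std::array<int32_t, 4>;  // four 32-bit lanes

// vaddl (signed): sign-extend both operands, then add.
V4i32 lengtheningAdd(const V4i16 &a, const V4i16 &b) {
    V4i32 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = static_cast<int32_t>(a[i]) + static_cast<int32_t>(b[i]);
    return r;
}

// vaddw (signed): only the narrow operand is sign-extended before the add.
V4i32 wideningAdd(const V4i32 &w, const V4i16 &b) {
    V4i32 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = w[i] + static_cast<int32_t>(b[i]);
    return r;
}

// vmovn: a plain vector truncate that keeps the low 16 bits of each lane.
V4i16 narrowingMove(const V4i32 &w) {
    V4i16 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = static_cast<int16_t>(static_cast<uint32_t>(w[i]) & 0xFFFFu);
    return r;
}
```

The unsigned variants would use zero-extension instead of sign-extension, which is the distinction the list draws between, for example, vmovls and vmovlu.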
    936
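The effect of the "ARMGlobalMerge" pass mentioned above can be pictured at the source level as folding several separate globals into one aggregate, so that a single base address reaches all of them. The following sketch is a hand-written analogy under that assumption; the struct, variable and function names are hypothetical, and the real pass works on LLVM IR rather than on C++ source.

```cpp
#include <cassert>

// Before merging, these would be three independent globals, each needing
// its own address computation:
//   static int counter; static int limit; static int flags;

// After merging, they live in one aggregate that shares a base address.
struct MergedGlobals {
    int counter;
    int limit;
    int flags;
};

static MergedGlobals merged = {1, 10, 0};

// Every field is reached as base + constant offset, so one register
// holding the base address is enough for all three accesses.
int sumMerged() {
    const MergedGlobals *base = &merged;
    return base->counter + base->limit + base->flags;
}
```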
    937
    938
    939 Major Changes and Removed Features
    940
    941
    942
    943
    944

    If you're already an LLVM user or developer with out-of-tree changes based

    945 on LLVM 2.7, this section lists some "gotchas" that you may run into upgrading
    946 from the previous release.

    947
    948
    949
  • The build configuration machinery changed the output directory names. It
  • 950 wasn't clear to many people that a "Release-Asserts" build was a release build
    951 without asserts. To make this clearer, "Release" does not include
    952 assertions and "Release+Asserts" does (likewise, "Debug" and
    953 "Debug+Asserts").
    954
  • The MSIL Backend was removed; it was unsupported and broken.
  • 955
  • The ABCD, SSI, and SCCVN passes were removed. These were not fully
  • 956 functional and their behavior has been or will be subsumed by the
    957 LazyValueInfo pass.
    958
  • The LLVM IR 'Union' feature was removed. While this is a desirable feature
  • 959 for LLVM IR to support, the existing implementation was half baked and
    960 barely useful. We'd really like anyone interested to resurrect the work and
    961 finish it for a future release.
    962
  • If you're used to reading .ll files, you'll probably notice that .ll file
  • 963 dumps don't produce #uses comments anymore. To get them, run a .bc file
    964 through "llvm-dis --show-annotations".
    965
  • Target triples are now stored in a normalized form, and all inputs from
  • 966 humans are expected to be normalized by Triple::normalize before being
    967 stored in a module triple or passed to another library.
    968
    969
    970
    971
    972

    In addition, many APIs have changed in this release. Some of the major LLVM

    973 API changes are:

    974
    313975
  • LLVM 2.8 changes the internal order of operands in
  • 314976 InvokeInst (http://llvm.org/doxygen/classllvm_1_1InvokeInst.html)
    315977 and CallInst.
    316 To be portable across releases, resort to CallSite and the
    317 high-level accessors, such as getCalledValue and setUnwindDest.
    978 To be portable across releases, please use the CallSite class and the
    979 high-level accessors, such as getCalledValue and
    980 setUnwindDest.
    318981
    319982
  • 320 You can no longer pass use_iterators directly to cast<> (and similar), because
    321 these routines tend to perform costly dereference operations more than once. You
    322 have to dereference the iterators yourself and pass them in.
    983 You can no longer pass use_iterators directly to cast<> (and similar),
    984 because these routines tend to perform costly dereference operations more
    985 than once. You have to dereference the iterators yourself and pass them in.
    323986
    324987
  • 325 The Pass(intptr_t) and Pass(const void*) got replaced with a
    326 Pass(char&) constructor. This means you have to use ifdefs if you
    327 want your pass to work with both LLVM 2.7 and 2.8
    328
    329
  • 330 llvm.memcpy.*, llvm.memset.*, llvm.memmove.* (and possibly other?) intrinsics
    331 take an extra parameter now (i1 isVolatile), totaling 5 parameters.
    988 llvm.memcpy.*, llvm.memset.*, llvm.memmove.* intrinsics take an extra
    989 parameter now ("i1 isVolatile"), totaling 5 parameters, and the pointer
    990 operands are now address-space qualified.
    332991 If you were creating these intrinsic calls and prototypes yourself (as opposed
    333 to using Intrinsic::getDeclaration), you can use UpgradeIntrinsicFunction/UpgradeIntrinsicCall
    334 to be portable accross releases.
    335 Note that you cannot use Intrinsic::getDeclaration() in a backwards compatible
    336 way (needs 2/3 types now, in 2.7 it needed just 1).
    992 to using Intrinsic::getDeclaration), you can use
    993 UpgradeIntrinsicFunction/UpgradeIntrinsicCall to be portable across releases.
    337994
    338995
  • 339996 SetCurrentDebugLocation now takes a DebugLoc instead of an MDNode.
    341998 SetCurrentDebugLocation(DebugLoc::getFromDILocation(...)).
    342999
    3431000
  • 344 VISIBILITY_HIDDEN is gone.
    345
    346
  • 3471001 The RegisterPass and RegisterAnalysisGroup templates are
    3481002 considered deprecated, but continue to function in LLVM 2.8. Clients are
    3491003 strongly advised to use the upcoming INITIALIZE_PASS() and
    3501004 INITIALIZE_AG_PASS() macros instead.
    351
  • 352 SMDiagnostic takes different parameters now. //FIXME: how to upgrade?
    3531005
    3541006
  • 3551007 The constructor for the Triple class no longer tries to understand odd triple
    3561008 specifications. Frontends should ensure that they only pass valid triples to
    3571009 LLVM. The Triple::normalize utility method has been added to help front-ends
    3581010 deal with funky triples.
    1011
    1012
    3591013
  • 3601014 Some APIs got renamed:
    3611015
    362
  • llvm_report_error -> report_fatal_error
  • 363
  • llvm_install_error_handler -> install_fatal_error_handler
  • 364
  • llvm::DwarfExceptionHandling -> llvm::JITExceptionHandling
  • 1016
  • llvm_report_error -> report_fatal_error
  • 1017
  • llvm_install_error_handler -> install_fatal_error_handler
  • 1018
  • llvm::DwarfExceptionHandling -> llvm::JITExceptionHandling
  • 1019
  • VISIBILITY_HIDDEN -> LLVM_LIBRARY_VISIBILITY
  • 3651020
    366
    367
    368
    369
    370
    371
    372
    373 Optimizer Improvements
    374
    375
    376
    377
    378

    In addition to a large array of minor performance tweaks and bug fixes, this

    379 release includes a few major enhancements and additions to the optimizers:

    380
    381
    382
    383
  • 384
    385
    386
    387
    388
    389
    390
    391
    392 Interpreter and JIT Improvements
    393
    394
    395
    396
    397
    398
  • 399
    400
    401
    402
    403
    404
    405
    406 Target Independent Code Generator Improvements
    407
    408
    409
    410
    411

    We have put a significant amount of work into the code generator

    412 infrastructure, which allows us to implement more aggressive algorithms and make
    413 it run faster:

    414
    415
    416
  • MachO writer works.
  • 417
    418
    419
    420
    421
    422 X86-32 and X86-64 Target Improvements
    423
    424
    425
    426

    New features of the X86 target include:

    427

    428
    429
    430
  • The X86 backend now supports holding X87 floating point stack values
  • 431 in registers across basic blocks, dramatically improving performance of code
    432 that uses long double, and when targetting CPUs that don't support SSE.
    433
    434
    435
    436
    437
    438
    439
    440 ARM Target Improvements
    441
    442
    443
    444

    New features of the ARM target include:

    445

    446
    447
    448
    449
  • 450
    451
    452
    453
    454
    455
    456
    457
    458 New Useful APIs
    459
    460
    461
    462
    463

    This release includes a number of new APIs that are used internally, which

    464 may also be useful for external clients.
    465

    466
    467
    468
  • 469
    470
    471
    472
    473
    474
    475
    476 Other Improvements and New Features
    477
    478
    479
    480

    Other miscellaneous features include:

    481
    482
    483
  • 484
    485
    486
    487
    488
    489
    490
    491 Major Changes and Removed Features
    492
    493
    494
    495
    496

    If you're already an LLVM user or developer with out-of-tree changes based

    497 on LLVM 2.7, this section lists some "gotchas" that you may run into upgrading
    498 from the previous release.

    499
    500
    501
  • .ll file doesn't produce #uses comments anymore, to get them, run a .bc file
  • 502 through "llvm-dis --show-annotations".
    503
  • MSIL Backend removed.
  • 504
  • ABCD and SSI passes removed.
  • 505
  • 'Union' LLVM IR feature removed.
  • 506
    507
    508

    In addition, many APIs have changed in this release. Some of the major LLVM

    509 API changes are:

    510
    511
    512
    513
    514
    515
    516
    517
    518
    519
    520 Portability and Supported Platforms
    521
    522
    523
    524
    525
    526

    LLVM is known to work on the following platforms:

    527
    528
    529
  • Intel and AMD machines (IA32, X86-64, AMD64, EMT-64) running Red Hat
  • 530 Linux, Fedora Core, FreeBSD and AuroraUX (and probably other unix-like
    531 systems).
    532
  • PowerPC and X86-based Mac OS X systems, running 10.4 and above in 32-bit
  • 533 and 64-bit modes.
    534
  • Intel and AMD machines running on Win32 using MinGW libraries (native).
  • 535
  • Intel and AMD machines running on Win32 with the Cygwin libraries (limited
  • 536 support is available for native builds with Visual C++).
    537
  • Sun x86 and AMD64 machines running Solaris 10, OpenSolaris 0906.
  • 538
  • Alpha-based machines running Debian GNU/Linux.
  • 539
    540
    541

    The core LLVM infrastructure uses GNU autoconf to adapt itself

    542 to the machine and operating system on which it is built. However, minor
    543 porting may be required to get LLVM to work on new platforms. We welcome your
    544 portability patches and reports of successful builds or error messages.

    545
    546
    1021
    1022
    1023
    1024
    1025
    1026
    5471027
    5481028
    5491029
    5571037 listed by component. If you run into a problem, please check the
    5581038 LLVM bug database (http://llvm.org/bugs/) and submit a bug if
    5591039 there isn't already one.

    560
    561
    562
  • LLVM will not correctly compile on Solaris and/or OpenSolaris
  • 563 using the stock GCC 3.x.x series 'out the box',
    564 See: Broken versions of GCC and other tools.
    565 However, A Modern GCC Build
    566 for x86/x86-64 has been made available from the third party AuroraUX Project
    567 that has been meticulously tested for bootstrapping LLVM & Clang.
    568
  • There have been reports of Solaris and/or OpenSolaris build failures due
  • 569 to an incompatibility in the nm program as well. The nm from binutils does seem
    570 to work.
    571
    5721040
    5731041
    5741042
    5871055 LLVMdev list (http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev).

    5881056
    5891057
    590
  • The Alpha, SPU, MIPS, PIC16, Blackfin, MSP430, SystemZ and MicroBlaze
  • 591 backends are experimental.
    592
  • llc "-filetype=asm" (the default) is the only
  • 593 supported value for this option. XXX Update me
    1058
  • The Alpha, Blackfin, CellSPU, MicroBlaze, MSP430, MIPS, PIC16, SystemZ
  • 1059 and XCore backends are experimental.
    1060
  • llc "-filetype=obj" is experimental on all targets
  • 1061 other than darwin-i386 and darwin-x86_64.
    5941062
    5951063
    5961064
    6981166
    6991167
    7001168
    1169

    The C backend has numerous problems and is not being actively maintained.

    1170 Depending on it for anything serious is not advised.

    1171
    7011172
    7021173
  • The C backend has only basic support for
  • 7031174 inline assembly code.
    7131184
    7141185
    7151186
    716 Known problems with the llvm-gcc C and C++ front-end
    717
    718
    719
    720
    721

    The only major language feature of GCC not supported by llvm-gcc is

    722 the __builtin_apply family of builtins. However, some extensions
    723 are only supported on some targets. For example, trampolines are only
    724 supported on some targets (these are used when you take the address of a
    725 nested function).

    726
    727
    728
    729
    730
    731 Known problems with the llvm-gcc Fortran front-end
    732
    733
    734
    735
    736
  • Fortran support generally works, but there are still several unresolved bugs
  • 737 in Bugzilla. Please see the
    738 tools/gfortran component for details.
    739
    740
    741
    742
    743
    744 Known problems with the llvm-gcc Ada front-end
    745
    746
    747
    748 The llvm-gcc 4.2 Ada compiler works fairly well; however, this is not a mature
    749 technology, and problems should be expected.
    750
    751
  • The Ada front-end currently only builds on X86-32. This is mainly due
  • 752 to lack of trampoline support (pointers to nested functions) on other platforms.
    753 However, it also fails to build on X86-64
    754 which does support trampolines.
    755
  • The Ada front-end fails to bootstrap.
  • 756 This is due to lack of LLVM support for setjmp/longjmp style
    757 exception handling, which is used internally by the compiler.
    758 Workaround: configure with --disable-bootstrap.
    759
  • The c380004, c393010
  • 760 and cxg2021 ACATS tests fail
    761 (c380004 also fails with gcc-4.2 mainline).
    762 If the compiler is built with checks disabled then c393010
    763 causes the compiler to go into an infinite loop, using up all system memory.
    764
  • Some GCC specific Ada tests continue to crash the compiler.
  • 765
  • The -E binder option (exception backtraces)
  • 766 does not work and will result in programs
    767 crashing if an exception is raised. Workaround: do not use -E.
    768
  • Only discrete types are allowed to start
  • 769 or finish at a non-byte offset in a record. Workaround: do not pack records
    770 or use representation clauses that result in a field of a non-discrete type
    771 starting or finishing in the middle of a byte.
    772
  • The lli interpreter considers
  • 773 'main' as generated by the Ada binder to be invalid.
    774 Workaround: hand edit the file to use pointers for argv and
    775 envp rather than integers.
    776
  • The -fstack-check option is
  • 777 ignored.
    778
    1187 Known problems with the llvm-gcc front-end
    1188
    1189
    1190
    1191
    1192

    llvm-gcc is generally very stable for the C family of languages. The only

    1193 major language feature of GCC not supported by llvm-gcc is the
    1194 __builtin_apply family of builtins. However, some extensions
    1195 are only supported on some targets. For example, trampolines are only
    1196 supported on some targets (these are used when you take the address of a
    1197 nested function).

    1198
    1199

    Fortran support generally works, but there are still several unresolved bugs

    1200 in Bugzilla. Please see the
    1201 tools/gfortran component for details. Note that llvm-gcc is missing major
    1202 Fortran performance work in the frontend and library that went into GCC after
    1203 4.2. If you are interested in Fortran, we recommend that you consider using
    1204 dragonegg instead.

    1205
    1206

    The llvm-gcc 4.2 Ada compiler has basic functionality, but is no longer being

    1207 actively maintained. If you are interested in Ada, we recommend that you
    1208 consider using dragonegg instead.

    7791209
    7801210
    7811211