llvm.org GIT mirror llvm / release_30
Fully merge mainline release notes onto the release branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_30@145545 91177308-0d34-0410-b5e6-96231b3b80d8 Chandler Carruth 7 years ago
1 changed file(s) with 642 addition(s) and 540 deletion(s).
4444
4545

This document contains the release notes for the LLVM Compiler

4646 Infrastructure, release 3.0. Here we describe the status of LLVM, including
47 major improvements from the previous release and significant known problems.
47 major improvements from the previous release, improvements in various
48 subprojects of LLVM, and some of the current users of the code.
4849 All LLVM releases may be downloaded from
4950 the LLVM releases web site.

5051
6061 releases page.

6162
6263
63
64
72
64
65
7366
7467

7568 Sub-project Status Update
8073
8174

The LLVM 3.0 distribution currently consists of code from the core LLVM

8275 repository (which roughly includes the LLVM optimizers, code generators and
83 supporting tools), the Clang repository and the llvm-gcc repository. In
76 supporting tools), and the Clang repository. In
8477 addition to this code, the LLVM Project includes other sub-projects that are
8578 in development. Here we include updates on these subprojects.

8679
9891 provides a modular, library-based architecture that makes it suitable for
9992 creating or integrating with other development tools. Clang is considered a
10093 production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86
101 (32- and 64-bit), and for darwin/arm targets.

102
103

In the LLVM 3.0 time-frame, the Clang team has made many improvements:

104
94 (32- and 64-bit), and for Darwin/ARM targets.

95
96

In the LLVM 3.0 time-frame, the Clang team has made many improvements:

10597
10698
  • Greatly improved support for building C++ applications, with greater
  • 10799 stability and better diagnostics.
    108
    100
    109101
  • Improved support for
  • 110102 the C++
    111 2011 standard, including implementations of non-static data member
    112 initializers, alias templates, delegating constructors, the range-based
    113 for loop, and implicitly-generated move constructors and move assignment
    103 2011 standard (aka "C++'0x"), including implementations of non-static data member
    104 initializers, alias templates, delegating constructors, range-based
    105 for loops, and implicitly-generated move constructors and move assignment
    114106 operators, among others.
    115107
    116108
  • Implemented support for some features of the upcoming C1x standard,
  • 117109 including static assertions and generic selections.
    118
    110
    119111
  • Better detection of include and linking paths for system headers and
  • 120112 libraries, especially for Linux distributions.
    121113
    122
  • Implemented support
  • 123 for Automatic
    124 Reference Counting for Objective-C.
    114
  • Several improvements to Objective-C support, including:
  • 115
    116
    117
  • 118 Automatic Reference Counting (ARC) and an improved memory model
    119 cleanly separating object and C memory.
    120
    121
  • A migration tool for moving manual retain/release code to ARC
  • 122
    123
  • Better support for data hiding, allowing instance variables to be
  • 124 declared in implementation contexts or class extensions
    125
  • Weak linking support for Objective-C classes
  • 126
  • Improved static type checking by inferring the return type of methods
  • 127 such as +alloc and -init.
    128
    129
    130 Some new Objective-C features require either the Mac OS X 10.7 / iOS 5
    131 Objective-C runtime, or version 1.6 or later of the GNUstep Objective-C
    132 runtime.
    125133
    126134
  • Implemented a number of optimizations in libclang, the Clang C
  • 127135 interface, to improve the performance of code completion and the mapping
    128136 from source locations to abstract syntax tree nodes.
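As a rough illustration of the C++ 2011 features named in the list above, the sketch below uses a non-static data member initializer, an alias template, a delegating constructor, a range-based for loop and an implicitly generated move constructor. The `Vec` alias and the `Buffer` class are hypothetical names made up for this example; they do not come from Clang or its test suite.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Alias template: a shorthand for a family of types (hypothetical name).
template <typename T>
using Vec = std::vector<T>;

// Hypothetical class used only to exercise the language features.
class Buffer {
public:
  // Delegating constructor: forwards to the (size, fill) constructor.
  Buffer() : Buffer(4, 0) {}
  Buffer(std::size_t n, int fill) : data_(n, fill) {}

  // No copy operations or destructor are declared, so the move constructor
  // and move assignment operator are generated implicitly.

  int sum() const {
    int total = 0;
    for (int v : data_)  // range-based for loop
      total += v;
    return total + bias_;
  }

  std::size_t size() const { return data_.size(); }

private:
  Vec<int> data_;
  int bias_ = 1;  // non-static data member initializer
};
```

Constructing `Buffer b(3, 2)` and then `Buffer c = std::move(b)` exercises the implicit move constructor; `c.sum()` adds the three elements plus the default `bias_`.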
    129137
    130
    131
    138 For more details about the changes to Clang since the 2.9 release, see the
    139 Clang release notes
    140

    141
    142
    132143

    If Clang rejects your code but another compiler accepts it, please take a

    133144 look at the language
    134145 compatibility guide to make sure this is not intentional or a known
    144155
    145156

    DragonEgg is a

    146157 gcc plugin that replaces GCC's
    147 optimizers and code generators with LLVM's. Currently it requires a patched
    148 version of gcc-4.5. The plugin can target the x86-32 and x86-64 processor
    149 families and has been used successfully on the Darwin, FreeBSD and Linux
    150 platforms. The Ada, C, C++ and Fortran languages work well. The plugin is
    151 capable of compiling plenty of Obj-C, Obj-C++ and Java but it is not known
    152 whether the compiled code actually works or not!

    158 optimizers and code generators with LLVM's. It works with gcc-4.5 or gcc-4.6,
    159 targets the x86-32 and x86-64 processor families, and has been successfully
    160 used on the Darwin, FreeBSD, KFreeBSD, Linux and OpenBSD platforms. It fully
    161 supports Ada, C, C++ and Fortran. It has partial support for Go, Java, Obj-C
    162 and Obj-C++.

    153163
    154164

    The 3.0 release has the following notable changes:

    155165
    156
    157
    246

    247 VMKit
    248
    249
    250
    251
    252

    The VMKit project is an

    253 implementation of a Java Virtual Machine (Java VM or JVM) that uses LLVM for
    254 static and just-in-time compilation.
    255
    256

    In the LLVM 3.0 time-frame, VMKit has had significant improvements on both

    257 runtime and startup performance:

    258
    259
    260
  • Precompilation: by compiling ahead of time a small subset of Java's core
  • 261 library, the startup performance has been optimized to the point that
    262 running a 'Hello World' program takes less than 30 milliseconds.
    263
    264
  • Customization: by customizing virtual methods for individual classes,
  • 265 the VM can statically determine the target of a virtual call, and decide to
    266 inline it.
    267
    268
  • Inlining: the VM does more inlining than it did before, by allowing more
  • 269 bytecode instructions to be inlined, and thanks to customization. It also
    270 inlines GC barriers, and object allocations.
    271
    272
  • New exception model: the generated code for a method that does not do
  • 273 any try/catch is not penalized anymore by the eventuality of calling a
    274 method that throws an exception. Instead, the method that throws the
    275 exception jumps directly to the method that could catch it.
    276
    209277
    210278
    211279
    226294
    227295
    228296
    229
    230

    231 VMKit
    232
    233
    234
    235
    236

    The VMKit project is an implementation

    237 of a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and
    238 just-in-time compilation. As of LLVM 3.0, VMKit now supports generational
    239 garbage collectors. The garbage collectors are provided by the MMTk
    240 framework, and VMKit can be configured to use one of the numerous implemented
    241 collectors of MMTk.

    242
    243
    244
    245
    297
    246298
    247299
    279331

    AddressSanitizer

    280
    332
    281333
    282334
    283335

    AddressSanitizer

    290342
    291343
    292344

    ClamAV

    293
    345
    294346
    295347
    296348

    Clam AntiVirus is an open source (GPL)

    299351
    300352

    Since version 0.96 it

    301353 has bytecode
    302 signatures that allow writing detections for complex malware.

    303
    304

    It uses LLVM's JIT to speed up the execution of bytecode on X86, X86-64,

    354 signatures that allow writing detections for complex malware.
    355 It uses LLVM's JIT to speed up the execution of bytecode on X86, X86-64,
    305356 PPC32/64, falling back to its own interpreter otherwise. The git version was
    306357 updated to work with LLVM 3.0.

    358
    359
    360
    361
    362

    clang_complete for VIM

    363
    364
    365
    366

    clang_complete is a

    367 VIM plugin that provides accurate C/C++ autocompletion using the clang front
    368 end. The development version of clang_complete can directly use libclang,
    369 which can maintain a cache to speed up autocompletion.

    307370
    308371
    309372
    327390
    328391
    329392

    Cling is an interactive compiler interface

    330 (aka C++ interpreter). It uses LLVM's JIT and clang; it currently supports
    331 C++ and C. It has a prompt interface, runs source files, calls into shared
    393 (aka C++ interpreter). It supports C++ and C, and uses LLVM's JIT and the
    394 Clang parser. It has a prompt interface, runs source files, calls into shared
    332395 libraries, prints the value of expressions, even does runtime lookup of
    333396 identifiers (dynamic scopes). And it just behaves like one would expect from
    334397 an interpreter.

    336399
    337400
    338401
    339
    350
    405
    406

    Crack aims to provide

    407 the ease of development of a scripting language with the performance of a
    408 compiled language. The language derives concepts from C++, Java and Python,
    409 incorporating object-oriented programming, operator overloading and strong
    410 typing.

    411
    412
    413
    414
    415

    Eero

    416
    417
    418
    419

    Eero is a fully

    420 header-and-binary-compatible dialect of Objective-C 2.0, implemented with a
    421 patched version of the Clang/LLVM compiler. It features a streamlined syntax,
    422 Python-like indentation, and new operators, for improved readability and
    423 reduced code clutter. It also has new features such as limited forms of
    424 operator overloading and namespaces, and strict (type-and-operator-safe)
    425 enumerations. It is inspired by languages such as Smalltalk, Python, and
    426 Ruby.

    427
    428
    429
    430
    431

    FAUST Real-Time Audio Signal Processing Language

    432
    433
    434
    435

    FAUST is a compiled language for

    436 real-time audio signal processing. The name FAUST stands for Functional
    437 AUdio STream. Its programming model combines two approaches: functional
    438 programming and block diagram composition. In addition to the C, C++ and Java
    439 output formats, the Faust compiler can now generate LLVM bitcode, and works
    440 with LLVM 2.7 through 3.0.
    441

    442
    443
    444
    351445
    352446

    Glasgow Haskell Compiler (GHC)

    353
    447
    354448
    355449
    356450

    GHC is an open source, state-of-the-art programming suite for Haskell, a

    402496
    403497
    404498
    499

    ispc: The Intel SPMD Program Compiler

    500
    501
    502
    503

    ispc is a compiler for "single program,

    504 multiple data" (SPMD) programs. It compiles a C-based SPMD programming
    505 language to run on the SIMD units of CPUs; it often delivers 5-6x speedups on
    506 a single core of a CPU with an 8-wide SIMD unit compared to serial code,
    507 while still providing a clean and easy-to-understand programming model. For
    508 an introduction to the language and its performance,
    509 see the walkthrough of a short
    510 example program. ispc is licensed under the BSD license.

    511
    512
    513
    514
    515

    The Julia Programming Language

    516
    517
    518
    519

    Julia is a high-level,

    520 high-performance dynamic language for technical
    521 computing. It provides a sophisticated compiler, distributed parallel
    522 execution, numerical accuracy, and an extensive mathematical function
    523 library. The compiler uses type inference to generate fast code
    524 without any type declarations, and uses LLVM's optimization passes and
    525 JIT compiler. The language is designed around multiple dispatch,
    526 giving programs a large degree of flexibility. It is ready for use on many
    527 kinds of problems.

    528
    529
    530
    405531

    LanguageKit and Pragmatic Smalltalk

    406532
    407533
    412538 its own interpreter. Pragmatic Smalltalk is a dialect of Smalltalk, built on
    413539 top of LanguageKit, that interfaces directly with Objective-C, sharing the
    414540 same object representation and message sending behaviour. These projects are
    415 developed as part of the Étoié desktop environment.

    541 developed as part of the Étoilé desktop environment.

    416542
    417543
    418544
    438564 binary compatible with Microsoft.NET. Has an optional, dynamically-loaded
    439565 LLVM code generation backend in Mini, the JIT compiler.

    440566
    441

    Note that we use a Git mirror of LLVM with some patches. See:

    442 https://github.com/mono/llvm

    567

    Note that we use a Git mirror of LLVM

    568 with some patches (https://github.com/mono/llvm).

    569
    570
    571
    572
    573

    Polly

    574
    575
    576
    577

    Polly is an advanced data-locality

    578 optimizer and automatic parallelizer. It uses an advanced, mathematical
    579 model to calculate detailed data dependency information which it uses to
    580 optimize the loop structure of a program. Polly can speed up sequential code
    581 by improving memory locality and consequently the cache use. Furthermore,
    582 Polly is able to expose different kinds of parallelism which it exploits by
    583 introducing (basic) OpenMP and SIMD code. A mid-term goal of Polly is to
    584 automatically create optimized GPU code.

    443585
    444586
    445587
    458600
    459601
    460602

    Pure

    461
    603
    462604
    463605

    Pure is an

    464606 algebraic/functional programming language based on term rewriting. Programs
    471613 languages (including the ability to load LLVM bitcode modules, and inline C,
    472614 C++, Fortran and Faust code in Pure programs if the corresponding LLVM-enabled
    473615 compilers are installed).

    474
    616
    475617

    Pure version 0.48 has been tested and is known to work with LLVM 3.0

    476618 (and continues to work with older LLVM releases >= 2.5).

    477619
    532674 co-design flow from C/C++ programs down to synthesizable VHDL and parallel
    533675 program binaries. Processor customization points include the register files,
    534676 function units, supported operations, and the interconnection network.

    535
    677
    536678

    TCE uses Clang and LLVM for C/C++ language support, target independent

    537679 optimizations and also for parts of code generation. It generates new
    538680 LLVM-based code generators "on the fly" for the designed TTA processors and
    540682 per-target recompilation of larger parts of the compiler chain.

    541683
    542684
    543
    685
    544686
    545687

    Tart Programming Language

    546688
    576718
    577719
    578720
    579
    580

    The ZooLib C++ Cross-Platform Application Framework

    581
    582
    583
    584

    ZooLib is Open Source under the MIT

    585 License. It provides GUI, filesystem access, TCP networking, thread-safe
    586 memory management, threading and locking for Mac OS X, Classic Mac OS,
    587 Microsoft Windows, POSIX operating systems with X11, BeOS, Haiku, Apple's iOS
    588 and Research in Motion's BlackBerry.

    589
    590

    My current work is to use Clang's static analyzer to improve ZooLib's code

    591 quality. I also plan to set up LLVM compiles of the demo programs and test
    592 programs using Clang and LLVM on all the platforms that Clang, LLVM and
    593 ZooLib all support.

    680721
    681722
    682723
    698739
    699740
    700741
    701

    LLVM 3.0 includes several major new capabilities:

    742
    752
    753
    758
    759

    LLVM 3.0 includes several major changes and big features:

    702760
    703761
    704
    705
    708
    762
  • llvm-gcc is no longer supported, and not included in the release. We
  • 763 recommend switching to
    764 Clang (http://clang.llvm.org/) or
    765 DragonEgg (http://dragonegg.llvm.org/).
    766
    767
  • The linear scan register allocator has been replaced with a new "greedy"
  • 768 register allocator, enabling live range splitting and many other
    769 optimizations that lead to better code quality. Please see its
    770 blog post (http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html) or its talk at the
    771 Developer Meeting (http://llvm.org/devmtg/2011-11/)
    772 for more information.
    773
  • LLVM IR now includes full support for atomic
  • 774 memory operations, intended to support the C++'11 and C'1x memory models.
    775 This includes atomic load and store,
    776 compare and exchange, and read/modify/write instructions as well as a
    777 full set of memory ordering constraints.
    778 Please see the Atomics Guide for more
    779 information.
    780
    781
  • The LLVM IR exception handling representation has been redesigned and
  • 782 reimplemented, making it more elegant, fixing a huge number of bugs, and
    783 enabling inlining and other optimizations. Please see its blog
    784 post (http://blog.llvm.org/2011/11/llvm-30-exception-handling-redesign.html)
    785 and the Exception Handling
    786 documentation for more information.
    787
  • The LLVM IR Type system has been redesigned and reimplemented, making it
  • 788 faster and solving some long-standing problems.
    789 Please see its blog
    790 post (http://blog.llvm.org/2011/11/llvm-30-type-system-rewrite.html)
    791 for more information.
    792
    793
  • The MIPS backend has made major leaps in this release, going from an
  • 794 experimental target to being virtually production quality and supporting a
    795 wide variety of MIPS subtargets. See the MIPS section
    796 below for more information.
    797
    798
  • The optimizer and code generator now support gprof and gcov-style coverage
  • 799 and profiling information, and includes a new llvm-cov tool (but also works
    800 with gcov). Clang exposes coverage and profiling through GCC-compatible
    801 command line options.
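To make the atomics item in the list above more concrete, here is a minimal C++11 sketch of the source-level operations that the new IR support is intended to model: an atomic load and store, a read/modify/write, and a compare and exchange, each with an explicit memory ordering. The function and variable names are hypothetical, and how a particular frontend lowers them to IR is not shown here.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> counter{0};
std::atomic<bool> ready{false};
int payload = 0;  // plain data, published through the 'ready' flag

// Atomic store with release ordering: earlier writes become visible to a
// thread that later observes ready == true with an acquire load.
void publish(int value) {
  payload = value;
  ready.store(true, std::memory_order_release);
}

// Atomic load with acquire ordering; returns false if nothing was published.
bool try_consume(int &out) {
  if (!ready.load(std::memory_order_acquire))
    return false;
  out = payload;
  return true;
}

// Read/modify/write: returns the value held before the increment.
int bump() { return counter.fetch_add(1, std::memory_order_seq_cst); }

// Compare and exchange: replaces 'expected' with 'desired' only on a match.
bool claim(int expected, int desired) {
  return counter.compare_exchange_strong(expected, desired,
                                         std::memory_order_acq_rel,
                                         std::memory_order_acquire);
}
```

The sketch is single-file and can be exercised from one thread; the orderings only matter once `publish` and `try_consume` run on different threads.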
    709802
    710
    711
    803
    804
    805
    712806
    713807
    714808

    720814

    LLVM IR has several new features for better support of new targets and that

    721815 expose new optimization opportunities:

    722816
    723

    One of the biggest changes is that 3.0 has a new exception handling

    724 system. The old system used LLVM intrinsics to convey the exception handling
    725 information to the code generator. It worked in most cases, but not
    726 all. Inlining was especially difficult to get right. Also, the intrinsics
    727 could be moved away from the invoke instruction, making it hard
    728 to recover that information.

    729
    730

    The new EH system makes exception handling a first-class member of the IR. It

    731 adds two new instructions:

    732
    733
    734
  • landingpad
  • 735 this instruction defines a landing pad basic block. It contains all of the
    736 information that's needed by the code generator. It's also required to be
    737 the first non-PHI instruction in the landing pad. In addition, a landing
    738 pad may be jumped to only by the unwind edge of an invoke
    739 instruction.
    740
    741
  • resume — this
  • 742 instruction causes the current exception to resume traveling up the
    743 stack. It replaces the @llvm.eh.resume intrinsic.
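For orientation, the C++ sketch below shows the two source-level situations these instructions serve. It assumes a typical frontend lowering, which is not spelled out in these notes: the call inside `with_cleanup` would become an invoke whose unwind edge reaches a landing pad marked as a cleanup and ending in resume, while `with_catch` would get a landing pad with a catch clause for `std::runtime_error`. All names are hypothetical, and the exact IR depends on the frontend and optimization level.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

int cleanups_run = 0;

// Hypothetical RAII type; its destructor is the "cleanup" that has to run
// while an exception unwinds through with_cleanup.
struct Guard {
  ~Guard() { ++cleanups_run; }
};

void may_throw(bool fail) {
  if (fail)
    throw std::runtime_error("boom");
}

// No try/catch here, but unwinding still has to destroy 'g' before the
// exception continues up the stack.
void with_cleanup(bool fail) {
  Guard g;
  may_throw(fail);
}

// A catch clause for one specific exception type.
std::string with_catch(bool fail) {
  try {
    with_cleanup(fail);
    return "ok";
  } catch (const std::runtime_error &e) {
    return e.what();
  }
}
```

Calling `with_catch(true)` runs the `Guard` destructor during unwinding and then returns the message caught by the handler.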
    744
    745
    746

    Converting from the old EH API to the new EH API is rather simple, because a

    747 lot of complexity has been removed. The two intrinsics,
    748 @llvm.eh.exception and @llvm.eh.selector, have been
    749 superseded by the landingpad instruction. Instead of generating
    750 a call to @llvm.eh.exception and @llvm.eh.selector:
    751
    752
    753
    
                      
                    
    Function *ExcIntr = Intrinsic::getDeclaration(TheModule,
                                                  Intrinsic::eh_exception);
    Function *SlctrIntr = Intrinsic::getDeclaration(TheModule,
                                                    Intrinsic::eh_selector);

    // The exception pointer.
    Value *ExnPtr = Builder.CreateCall(ExcIntr, "exc_ptr");

    std::vector<Value*> Args;
    Args.push_back(ExnPtr);
    Args.push_back(Builder.CreateBitCast(Personality,
                                         Type::getInt8PtrTy(Context)));

    // Add selector clauses to Args.

    // The selector call.
    Builder.CreateCall(SlctrIntr, Args, "exc_sel");
    771
    772
    773
    774

    You should instead generate a landingpad instruction that

    775 returns an exception object and selector value:

    776
    777
    778
    
                      
                    
    LandingPadInst *LPadInst =
      Builder.CreateLandingPad(StructType::get(Int8PtrTy, Int32Ty, NULL),
                               Personality, 0);

    Value *LPadExn = Builder.CreateExtractValue(LPadInst, 0);
    Builder.CreateStore(LPadExn, getExceptionSlot());

    Value *LPadSel = Builder.CreateExtractValue(LPadInst, 1);
    Builder.CreateStore(LPadSel, getEHSelectorSlot());
    788
    789
    790
    791

    It's now trivial to add the individual clauses to the landingpad

    792 instruction.

    793
    794
    795
    
                      
                    
    // Adding a catch clause
    Constant *TypeInfo = getTypeInfo();
    LPadInst->addClause(TypeInfo);

    // Adding a C++ catch-all
    LPadInst->addClause(Constant::getNullValue(Builder.getInt8PtrTy()));

    // Adding a cleanup
    LPadInst->setCleanup(true);

    // Adding a filter clause. The type info is cast with ConstantExpr so the
    // result stays a Constant; Builder.CreateBitCast would return a Value*.
    std::vector<Constant*> TypeInfos;
    Constant *FilterTypeInfo = getFilterTypeInfo();
    TypeInfos.push_back(ConstantExpr::getBitCast(FilterTypeInfo,
                                                 Builder.getInt8PtrTy()));

    ArrayType *FilterTy = ArrayType::get(Int8PtrTy, TypeInfos.size());
    LPadInst->addClause(ConstantArray::get(FilterTy, TypeInfos));
    813
    814
    815
    816

    Converting from using the @llvm.eh.resume intrinsic to

    817 the resume instruction is trivial. It takes the exception
    818 pointer and exception selector values returned by
    819 the landingpad instruction:

    820
    821
    822
    
                      
                    
    Type *UnwindDataTy = StructType::get(Builder.getInt8PtrTy(),
                                         Builder.getInt32Ty(), NULL);
    Value *UnwindData = UndefValue::get(UnwindDataTy);
    Value *ExcPtr = Builder.CreateLoad(getExceptionObjSlot());
    Value *ExcSel = Builder.CreateLoad(getExceptionSelSlot());
    UnwindData = Builder.CreateInsertValue(UnwindData, ExcPtr, 0, "exc_ptr");
    UnwindData = Builder.CreateInsertValue(UnwindData, ExcSel, 1, "exc_sel");
    Builder.CreateResume(UnwindData);
    831
    832
    833
    817
    818
  • Atomic memory accesses and memory ordering are
  • 819 now directly expressible in the IR.
    820
  • A new llvm.fma intrinsic directly
  • 821 represents floating point multiply accumulate operations without an
    822 intermediate rounding stage.
    823
  • A new llvm.expect intrinsic allows a frontend to express expected control
  • 824 flow (and to implement the __builtin_expect builtin from GNU C).
    825
  • The llvm.prefetch intrinsic now
  • 826 takes a 4th argument that specifies whether the prefetch happens from the
    827 icache or dcache.
    828
  • The new uwtable function attribute
  • 829 allows a frontend to control emission of unwind tables.
    830
  • The new nonlazybind function
  • 831 attribute allows optimization of Global Offset Table (GOT) accesses.
    832
  • The new returns_twice attribute
  • 833 allows better modeling of functions like setjmp.
    834
  • The target datalayout string can now
  • 835 encode the natural alignment of the target's stack for better optimization.
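Two of the items above have direct source-level counterparts, sketched here in C++ under a few assumptions. `std::fma` is the standard library's fused multiply-add, which rounds once, as the llvm.fma item describes; the `volatile` temporary is only there to force the separately rounded product for comparison. `__builtin_expect` is the GNU C builtin mentioned in the llvm.expect item and is available in GCC and Clang. The function names are hypothetical, and whether a given compiler maps these calls onto the new intrinsics is not claimed here.

```cpp
#include <cassert>
#include <cmath>

// a*a - c with a rounding step after the multiply. The volatile temporary
// forces the product to be rounded to double before the subtraction.
double square_minus_rounded(double a, double c) {
  volatile double product = a * a;
  return product - c;
}

// The same expression with a single rounding at the end.
double square_minus_fused(double a, double c) {
  return std::fma(a, a, -c);
}

// __builtin_expect (GNU extension): mark the zero-divisor path as unlikely.
int checked_div(int num, int den) {
  if (__builtin_expect(den == 0, 0))
    return 0;  // unlikely error path
  return num / den;
}
```

With `a = 1 + 2^-27` and `c = 1 + 2^-26`, the exact value of `a*a - c` is `2^-54`. The separately rounded version loses that term and returns zero, while the fused version keeps it.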
    836
    837
    834838
    835839
    836840
    840844
    841845
    842846
    843

    In addition to a large array of minor performance tweaks and bug fixes, this

    847

    In addition to many minor performance tweaks and bug fixes, this

    844848 release includes a few major enhancements and additions to the
    845849 optimizers:

    846850
    847851
    848
    851
    852
    852
  • The pass manager now has an extension API that allows front-ends and plugins
  • 853 to insert their own optimizations in the well-known places in the standard
    854 pass optimization pipeline.
    855
    856
  • Information about branch probability
  • 857 and basic block frequency is now available within LLVM, based on a
    858 combination of static branch prediction heuristics and
    859 __builtin_expect calls. That information is currently used for
    860 register spill placement and if-conversion, with additional optimizations
    861 planned for future releases. The same framework is intended for eventual
    862 use with profile-guided optimization.
    863
    864
  • The "-indvars" induction variable simplification pass only modifies
  • 865 induction variables when profitable. Sign and zero extension
    866 elimination, linear function test replacement, loop unrolling, and
    867 other simplifications that require induction variable analysis have
    868 been generalized so they no longer require loops to be rewritten into
    869 canonical form prior to optimization. This new design
    870 preserves more IR level information, avoids undoing earlier loop
    871 optimizations (particularly hand-optimized loops), and no longer
    872 requires the code generator to reconstruct loops into an optimal form -
    873 an intractable problem.
    874
    875
  • LLVM now includes a pass to optimize retain/release calls for the
  • 876 Automatic
    877 Reference Counting (ARC) Objective-C language feature (in
    878 lib/Transforms/Scalar/ObjCARC.cpp). It is a decent example of implementing
    879 a source-language-specific optimization in LLVM.
    880
    853881
    854882
    855883
    864892

    The LLVM Machine Code (aka MC) subsystem was created to solve a number of

    865893 problems in the realm of assembly, disassembly, object file format handling,
    866894 and a number of other related areas that CPU instruction-set level tools work
    867 in.

    895 in. For more information, please see
    896 the Intro
    897 to the LLVM MC Project Blog Post.

    868898
    869899
    870
    900
  • The MC layer has undergone significant refactoring to eliminate layering
  • 901 violations that caused it to pull in the LLVM compiler backend code.
    902
  • The ELF object file writers are much more full featured.
  • 903
  • The integrated assembler now supports #line directives.
  • 904
  • An early implementation of a JIT built on top of the MC framework (known
  • 905 as MC-JIT) has been implemented and will eventually replace the old JIT.
    906 It emits object files direct to memory and uses a runtime dynamic linker to
    907 resolve references and drive lazy compilation. The MC-JIT enables much
    908 greater code reuse between the JIT and the static compiler and provides
    909 better integration with the platform ABI as a result.
    910
    911
  • The assembly printer now makes use of assembler instruction aliases
  • 912 (InstAliases) to print simplified mnemonics when possible.
    913
  • TableGen can now autogenerate MC expansion logic for pseudo
  • 914 instructions that expand to multiple MC instructions (through the
    915 PseudoInstExpansion class).
    916
  • A new llvm-dwarfdump tool provides a start of a drop-in
  • 917 replacement for the corresponding tool that uses LLVM libraries. As part of
    918 this, LLVM has the beginnings of a DWARF parsing library.
    919
  • llvm-objdump has more output, including symbol-by-symbol disassembly,
  • 920 inline relocations, section headers, symbol tables, and section contents.
    921 Support for archive files has also been added.
    922
  • llvm-nm has gained support for archives of binary files.
  • 923
  • llvm-size has been added. This tool prints out section sizes.
  • 873924
    874
    875

    For more information, please see

    876 the Intro
    877 to the LLVM MC Project Blog Post.

    878925
    879926
    880927
    890937 make it run faster:

    891938
    892939
    893
    940
  • LLVM can now produce code that works with libgcc
  • 941 to dynamically allocate stack
    942 segments, as opposed to allocating a worst-case chunk of
    943 virtual memory for each thread.
    944
  • LLVM generates substantially better code for indirect gotos due to a new
  • 945 tail duplication pass, which can be a substantial performance win for
    946 interpreter loops that use them.
    947
  • Exception handling and debug frame information is now emitted with CFI
  • 948 directives. This lets the assembler produce more compact info as it knows
    949 the final offsets, yielding much smaller executables for some C++ applications.
    950 If the system assembler doesn't support it, MC expands the directives when
    951 the integrated assembler is not used.
    952
    953
    954
  • The code generator now supports vector "select" operations on vector
  • 955 comparisons, turning them into various optimized code sequences (e.g.
    956 using the SSE4/AVX "blend" instructions).
    957
  • The SSE execution domain fix pass and the ARM NEON move fix pass have been
  • 958 merged to a target independent execution dependency fix pass. This pass is
    959 used to select alternative equivalent opcodes in a way that minimizes
    960 execution domain crossings. Closely connected instructions are moved to
    961 the same execution domain when possible. Targets can override the
    962 getExecutionDomain and setExecutionDomain hooks
    963 to use the pass.
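The indirect-goto item above refers to interpreter loops of roughly the following shape. This is a minimal sketch using the labels-as-values extension (`&&label` and `goto *`), which GCC and Clang accept in C++. The tiny stack bytecode and the `run` function are hypothetical, and the code assumes well-formed input that ends with `OP_HALT`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bytecode for a tiny stack machine.
enum Op { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

// Assumes well-formed code: enough operands, terminated by OP_HALT.
int run(const std::vector<int> &code) {
  // One label address per opcode (GNU labels-as-values extension).
  static void *const dispatch[] = {&&do_push, &&do_add, &&do_mul, &&do_halt};
  std::vector<int> stack;
  std::size_t pc = 0;

// Every handler ends with its own indirect jump to the next handler.
#define NEXT() goto *dispatch[code[pc++]]
  NEXT();

do_push:
  stack.push_back(code[pc++]);
  NEXT();
do_add: {
  int rhs = stack.back();
  stack.pop_back();
  stack.back() += rhs;
}
  NEXT();
do_mul: {
  int rhs = stack.back();
  stack.pop_back();
  stack.back() *= rhs;
}
  NEXT();
do_halt:
  return stack.empty() ? 0 : stack.back();
#undef NEXT
}
```

Running the program `PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT` leaves 20 on top of the stack. Giving each handler its own indirect jump is the pattern that the tail duplication item describes as benefiting.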
    896964
    897965
    898966
    906974

    New features and major changes in the X86 target include:

    907975
    908976
    909
    910
  • The CRC32 intrinsics have been renamed. The intrinsics were previously
  • 911 @llvm.x86.sse42.crc32.[8|16|32]
    912 and @llvm.x86.sse42.crc64.[8|64]. They have been renamed to
    913 @llvm.x86.sse42.crc32.32.[8|16|32] and
    914 @llvm.x86.sse42.crc32.64.[8|64].
    915
    977
  • The X86 backend, assembler and disassembler now have full support for AVX 1.
  • 978 To enable it, pass -mavx to the compiler. AVX2 implementation is
    979 underway on mainline.
    980
  • The integrated assembler and disassembler now support a broad range of new
  • 981 instructions including Atom, Ivy Bridge,
    982 SSE4a/BMI instructions (http://en.wikipedia.org/wiki/SSE4a),
    983 rdrand (http://en.wikipedia.org/wiki/RdRand) and many others.
    984
  • The X86 backend now fully supports the X87
  • 985 floating point stack inline assembly constraints.
    986
  • The integrated assembler now supports the .code32 and
  • 987 .code64 directives to switch between 32-bit and 64-bit
    988 instructions.
    989
  • The X86 backend now synthesizes horizontal add/sub instructions from generic
  • 990 vector code when the appropriate instructions are enabled.
    991
  • The X86-64 backend generates smaller and faster code at -O0 due to
  • 992 improvements in fast instruction selection.
    993
  • Native Client
  • 994 subtarget support has been added.
    995
    996
  • The CRC32 intrinsics have been renamed. The intrinsics were previously
  • 997 @llvm.x86.sse42.crc32.[8|16|32]
    998 and @llvm.x86.sse42.crc64.[8|64]. They have been renamed to
    999 @llvm.x86.sse42.crc32.32.[8|16|32] and
    1000 @llvm.x86.sse42.crc32.64.[8|64].
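
    The CRC32 rename in the list above can be sketched as a lookup table. This is
    only an illustration for code that has to translate old intrinsic names by
    hand: upgradeCrc32Name is a hypothetical helper, not an LLVM API. It matches
    whole names only, because the old name llvm.x86.sse42.crc32.32 is a prefix of
    several of the new names.

```cpp
#include <map>
#include <string>

// Hypothetical helper (not part of LLVM): returns the LLVM 3.0 name for a
// pre-3.0 SSE4.2 CRC32 intrinsic, or the input unchanged if the name is not
// one of the renamed intrinsics.
std::string upgradeCrc32Name(const std::string &Name) {
  static const std::map<std::string, std::string> Renames = {
      {"llvm.x86.sse42.crc32.8", "llvm.x86.sse42.crc32.32.8"},
      {"llvm.x86.sse42.crc32.16", "llvm.x86.sse42.crc32.32.16"},
      {"llvm.x86.sse42.crc32.32", "llvm.x86.sse42.crc32.32.32"},
      {"llvm.x86.sse42.crc64.8", "llvm.x86.sse42.crc32.64.8"},
      {"llvm.x86.sse42.crc64.64", "llvm.x86.sse42.crc32.64.64"},
  };
  // Exact-match lookup: names that are already in the new form are not keys
  // of the table, so they pass through untouched.
  auto It = Renames.find(Name);
  return It == Renames.end() ? Name : It->second;
}
```

    For example, upgradeCrc32Name("llvm.x86.sse42.crc64.64") yields
    "llvm.x86.sse42.crc32.64.64", while a name that is already in the new form
    is returned as-is.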

    New features of the ARM target include:

    MIPS Target Improvements

    This release has seen major new work on just about every aspect of the MIPS
    backend. Some of the major new features include:

  • Most MIPS32r1 and r2 instructions are now supported.
  • LE/BE MIPS32r1/r2 has been tested extensively.
  • The O32 ABI has been fully tested.
  • The MIPS backend has migrated to using the MC infrastructure for assembly printing. Initial support for direct object code emission has been implemented too.
  • The delay slot filler has been updated. It now tries to fill delay slots with useful instructions instead of always filling them with NOPs.
  • Support for old-style JIT is complete.
  • Support for old architectures (MIPS1 and MIPS2) has been removed.
  • Initial support for MIPS64 has been added.

    PTX Target Improvements

    The PTX back-end is still experimental, but is fairly usable for compute kernels
    in LLVM 3.0. Most scalar arithmetic is implemented, as well as intrinsics to
    access the special PTX registers and sync instructions. The major missing
    pieces are texture/sampler support and some vector operations.

    That said, the backend is already being used for domain-specific languages
    and can be used by Clang to compile OpenCL C code into PTX.

    Other Target Specific Improvements

    PPC32/ELF va_arg was implemented.

    PPC32 initial support for .o file writing was implemented.

  • Many PowerPC improvements have been implemented for ELF targets, including
    support for varargs and initial support for direct .o file emission.
  • MicroBlaze scheduling itineraries were added that model the
    3-stage and the 5-stage pipeline architectures. The 3-stage
    pipeline model can be selected with -mcpu=mblaze3
    and the 5-stage pipeline model can be selected with
    -mcpu=mblaze5.
    from the previous release.

  • The LLVMC front end code was removed while separating
    out language independence.
  • The LowerSetJmp pass wasn't used effectively by any
    target and has been removed.
  • LLVM 3.0 removes support for reading LLVM 2.8 and earlier files, and LLVM
    3.1 will eliminate support for reading LLVM 2.9 files. Going forward, we
    aim for all future versions of LLVM to read bitcode files and .ll files
    produced by LLVM 3.0.
  • TableGen has been split into a library, allowing the clang tblgen pieces
    to now live in the clang tree. The llvm version has been renamed to
    llvm-tblgen instead of tblgen.
  • The LLVMC meta compiler driver was removed.
  • The unused PostOrder Dominator Frontiers and LowerSetJmp passes were removed.
  • The old TailDup pass was not used in the standard pipeline
    and was unable to update SSA form, so it has been removed.
  • The syntax of volatile loads and stores in IR has been changed to
    "load volatile"/"store volatile". The old
    syntax ("volatile load"/"volatile store")
    is still accepted, but is now considered deprecated and will be removed in
    3.1.
  • llvm-gcc's frontend tests have been removed from llvm/test/Frontend*, sunk
    into the clang and dragonegg testsuites.
  • The old atomic intrinsics (llvm.memory.barrier and
    llvm.atomic.*) are now gone. Please use the new atomic
    instructions, described in the atomics guide.
  • LLVM's configure script doesn't depend on llvm-gcc anymore, eliminating a
    strange circular dependence between projects.
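
    The volatile syntax change listed above amounts to swapping two keywords in
    textual IR. A minimal sketch of a line-based rewriter follows, assuming the
    input is a single line of .ll text. upgradeVolatileSyntax and replaceAll are
    hypothetical helpers, not LLVM APIs, and the substitution is purely textual:
    it does not parse the IR, so it would also rewrite matching text inside
    comments or string constants.

```cpp
#include <string>

// Replaces every occurrence of From in S with To, scanning left to right and
// resuming the search just past each inserted replacement.
static void replaceAll(std::string &S, const std::string &From,
                       const std::string &To) {
  for (std::string::size_type Pos = S.find(From); Pos != std::string::npos;
       Pos = S.find(From, Pos + To.size()))
    S.replace(Pos, From.size(), To);
}

// Hypothetical helper (not part of LLVM): rewrites the deprecated
// "volatile load"/"volatile store" spelling to the LLVM 3.0 spelling
// "load volatile"/"store volatile" in one line of textual IR.
std::string upgradeVolatileSyntax(std::string Line) {
  replaceAll(Line, "volatile load", "load volatile");
  replaceAll(Line, "volatile store", "store volatile");
  return Line;
}
```

    For example, "%v = volatile load i32* %p" becomes
    "%v = load volatile i32* %p". A line that already uses the new spelling is
    returned unchanged.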

    Windows (32-bit)

    LLVM API changes are:

  • The biggest and most pervasive change is that llvm::Type's are no longer
    returned or accepted as 'const' values. Instead, just pass around
    non-const Type's.
  • The biggest and most pervasive change is that the type system has been
    rewritten: PATypeHolder and OpaqueType are gone,
    and all APIs deal with Type* instead of const
    Type*. If you need to create recursive structures, then create a
    named structure, and use setBody() when all its elements are
    built. Type merging and refining is gone too: named structures are not
    merged with other structures, even if their layout is identical. (Of
    course, anonymous structures are still uniqued by layout.)
  • PHINode::reserveOperandSpace has been removed. Instead, you
    must specify how many operands to reserve space for when you create the
    PHINode, by passing an extra argument
    use DIBuilder::finalize() at the end of translation unit to
    complete debugging information encoding.
  • TargetSelect.h moved to Support/ from Target/
  • UpgradeIntrinsicCall no longer upgrades pre-2.9 intrinsic calls (for
    This section contains significant known problems with the LLVM system, listed
    by component. If you run into a problem, please check
    the LLVM bug database and submit a bug if
    there isn't already one.

    Experimental features included with this release

    The following components of this LLVM release are either untested, known to
    be broken or unreliable, or are in early development. These components
    should not be relied on, and bugs should not be filed against them, but they
    may be useful to some people. In particular, if you would like to work on
    one of these components, please contact us on
    the LLVMdev list.

    LLVM is generally a production quality compiler, and is used by a broad range
    of applications and shipping in many products. That said, not every
    subsystem is as mature as the aggregate, particularly the more obscure
    targets. If you run into a problem, please check the
    LLVM bug database (http://llvm.org/bugs/) and submit a bug if
    there isn't already one or ask on the
    LLVMdev list (http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev).

    Known problem areas include:

  • The Alpha, Blackfin, CellSPU, MicroBlaze, MSP430, MIPS, PTX, SystemZ and
    XCore backends are experimental.
  • llc "-filetype=obj" is experimental on all targets other
    than darwin and ELF X86 systems.
  • The Alpha, Blackfin, CellSPU, MSP430, PTX, SystemZ and
    XCore backends are experimental, and the Alpha, Blackfin and SystemZ
    targets have already been removed from mainline.
  • The integrated assembler, disassembler, and JIT are not supported by
    several targets. If an integrated assembler is not supported, then a
    system assembler is required. For more details, see the
    Target Features Matrix (CodeGenerator.html#targetfeatures).
  • The C backend has numerous problems and is not being actively maintained.
    Depending on it for anything serious is not advised.

    Known problems with the X86 back-end

  • The X86 backend does not yet support
    all inline assembly that uses the X86
    floating point stack. It supports the 'f' and 't' constraints, but
    not 'u'.
  • The X86-64 backend does not yet support the LLVM IR instruction
    va_arg. Currently, front-ends support variadic argument
    constructs on X86-64 by lowering them manually.
  • The Windows x64 (aka Win64) code generator has a few issues.
  • llvm-gcc cannot build the mingw-w64 runtime currently due to lack of
    support for the 'u' inline assembly constraint and for X87 floating
    point inline assembly.
  • On mingw-w64, you will see unresolved symbol __chkstk due
    to Bug 8919. It is fixed in r128206.
  • Misaligned MOVDQA might crash your program. It is due to
    Bug 9483, lack
    of handling aligned internal globals.

    Known problems with the PowerPC back-end

  • The PPC32/ELF support lacks PIC support.

    Known problems with the ARM back-end

  • Thumb mode works only on ARMv6 or higher processors. On sub-ARMv6
    processors, thumb programs can crash or produce wrong results
    (PR1388).
  • Compilation for ARM Linux OABI (old ABI) is supported but not fully
    tested.

    Known problems with the SPARC back-end

  • The SPARC backend only supports the 32-bit SPARC ABI (-m32); it does not
    support the 64-bit SPARC ABI (-m64).

    Known problems with the MIPS back-end

  • 64-bit MIPS targets are not supported yet.

    Known problems with the Alpha back-end

  • On 21164s, some rare FP arithmetic sequences which may trap do not have
    the appropriate nops inserted to ensure restartability.

    Known problems with the C back-end

    The C backend has numerous problems and is not being actively maintained.
    Depending on it for anything serious is not advised.

  • The C backend has only basic support for
    inline assembly code.
  • The C backend violates the ABI of common
    C++ programs, preventing intermixing between C++ compiled by the CBE
    and C++ code compiled with llc or native compilers.
  • The C backend does not support all exception handling constructs.
  • The C backend does not support arbitrary precision integers.

    Known problems with the llvm-gcc front-end

    LLVM 2.9 was the last release of llvm-gcc.

    llvm-gcc is generally very stable for the C family of languages. The only
    major language feature of GCC not supported by llvm-gcc is the
    __builtin_apply family of builtins. However, some extensions
    are only supported on some targets. For example, trampolines are only
    supported on some targets (these are used when you take the address of a
    nested function).

    Fortran support generally works, but there are still several unresolved bugs
    in Bugzilla. Please see the
    tools/gfortran component for details. Note that llvm-gcc is missing major
    Fortran performance work in the frontend and library that went into GCC after
    4.2. If you are interested in Fortran, we recommend that you consider using
    dragonegg instead.

    The llvm-gcc 4.2 Ada compiler has basic functionality, but is no longer being
    actively maintained. If you are interested in Ada, we recommend that you
    consider using dragonegg instead.