llvm.org GIT mirror llvm / 4a53562
docs: Nuke the old release notes.

This change also removes a bunch of boilerplate and stuffing which made it unnecessarily hard to navigate and see the comparatively miniscule actual content that was added to this document during the 3.2 development period (or maybe even sticking around from earlier releases...).

The new organization (a flat list) optimizes for making it easy for people who know about changes to add them to the document. It's completely trivial for anyone with basic knowledge of LLVM to come in later (such as when preparing for the actual release) and cluster any changes into logical groups. However, I have left some comments indicating how to add larger descriptions, if someone is feeling adventurous ;)

Hopefully this organization will highlight how little effort is being put into producing accurate, high-quality release notes, prompting a corresponding improvement for the 3.3 release.

I have preserved the changes to this document that are not present in the 3.2 release notes. There were only two... I'm pretty sure we've been busier than that... (version control shows +213347/-173656 raw lines just in the LLVM repo since the 3.2 release).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172954 91177308-0d34-0410-b5e6-96231b3b80d8

Sean Silva 7 years ago
1 changed file(s) with 26 addition(s) and 505 deletion(s).
None .. raw:: html
1
2
3
4 .. role:: red
5
6 ======================
7 LLVM 3.2 Release Notes
1 LLVM 3.3 Release Notes
82 ======================
93
104 .. contents::
115 :local:
126
13 Written by the `LLVM Team `_
7 .. warning::
8 These are in-progress notes for the upcoming LLVM 3.3 release. You may
9 prefer the `LLVM 3.2 Release Notes
10 /ReleaseNotes.html>`_.
1411
15 :red:`These are in-progress notes for the upcoming LLVM 3.2 release. You may
16 prefer the` `LLVM 3.1 Release Notes
17 /ReleaseNotes.html>`_.
1812
1913 Introduction
2014 ============
2115
2216 This document contains the release notes for the LLVM Compiler Infrastructure,
23 release 3.2. Here we describe the status of LLVM, including major improvements
17 release 3.3. Here we describe the status of LLVM, including major improvements
2418 from the previous release, improvements in various subprojects of LLVM, and
2519 some of the current users of the code. All LLVM releases may be downloaded
2620 from the `LLVM releases web site `_.
3630 one. To see the release notes for a specific release, please see the `releases
3731 page `_.
3832
39 Sub-project Status Update
40 =========================
33 Non-comprehensive list of changes in this release
34 =================================================
4135
42 The LLVM 3.2 distribution currently consists of code from the core LLVM
43 repository, which roughly includes the LLVM optimizers, code generators and
44 supporting tools, and the Clang repository. In addition to this code, the LLVM
45 Project includes other sub-projects that are in development. Here we include
46 updates on these subprojects.
36 .. NOTE
37 For small 1-3 sentence descriptions, just add an entry at the end of
38 this list. If your description won't fit comfortably in one bullet
39 point (e.g. maybe you would like to give an example of the
40 functionality, or simply have a lot to talk about), see the `NOTE` below
41 for adding a new subsection.
4742
48 Clang: C/C++/Objective-C Frontend Toolkit
49 -----------------------------------------
43 * The CellSPU port has been removed. It can still be found in older versions.
5044
51 `Clang `_ is an LLVM front end for the C, C++, and
52 Objective-C languages. Clang aims to provide a better user experience through
53 expressive diagnostics, a high level of conformance to language standards, fast
54 compilation, and low memory use. Like LLVM, Clang provides a modular,
55 library-based architecture that makes it suitable for creating or integrating
56 with other development tools. Clang is considered a production-quality
57 compiler for C, Objective-C, C++ and Objective-C++ on x86 (32- and 64-bit), and
58 for Darwin/ARM targets.
45 * The IR-level extended linker APIs (for example, to link bitcode files out of
46 archives) have been removed. Any existing clients of these features should
47 move to using a linker with integrated LTO support.
5948
60 In the LLVM 3.2 time-frame, the Clang team has made many improvements.
61 Highlights include:
49 * ... next change ...
6250
63 #. More powerful warnings, especially `-Wuninitialized`
64 #. Template type diffing in diagnostic messages
65 #. Higher quality and more efficient debug info generation
51 .. NOTE
52 If you would like to document a larger change, then you can add a
53 subsection about it right here. You can copy the following boilerplate
54 and un-indent it (the indentation causes it to be inside this comment).
6655
67 For more details about the changes to Clang since the 3.1 release, see the
68 `Clang release notes. `_
56 Special New Feature
57 -------------------
6958
70 If Clang rejects your code but another compiler accepts it, please take a look
71 at the `language compatibility `_
72 guide to make sure this is not intentional or a known issue.
73
74 DragonEgg: GCC front-ends, LLVM back-end
75 ----------------------------------------
76
77 `DragonEgg `_ is a `gcc plugin
78 `_ that replaces GCC's optimizers and code
79 generators with LLVM's. It works with gcc-4.5 and gcc-4.6 (and partially with
80 gcc-4.7), can target the x86-32/x86-64 and ARM processor families, and has been
81 successfully used on the Darwin, FreeBSD, KFreeBSD, Linux and OpenBSD
82 platforms. It fully supports Ada, C, C++ and Fortran. It has partial support
83 for Go, Java, Obj-C and Obj-C++.
84
85 The 3.2 release has the following notable changes:
86
87 #. Able to load LLVM plugins such as Polly.
88 #. Supports thread-local storage models.
89 #. Passes knowledge of variable lifetimes to the LLVM optimizers.
90 #. No longer requires GCC to be built with LTO support.
91
92 compiler-rt: Compiler Runtime Library
93 -------------------------------------
94
95 The new LLVM `compiler-rt project `_ is a simple
96 library that provides an implementation of the low-level target-specific hooks
97 required by code generation and other runtime components. For example, when
98 compiling for a 32-bit target, converting a double to a 64-bit unsigned integer
99 is compiled into a runtime call to the ``__fixunsdfdi`` function. The
100 ``compiler-rt`` library provides highly optimized implementations of this and
101 other low-level routines (some are 3x faster than the equivalent libgcc
102 routines).
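
As a minimal sketch of the kind of source code the paragraph above refers to, the following C++ function converts a ``double`` to a 64-bit unsigned integer. The function name ``to_u64`` is hypothetical and used only for illustration. Whether the cast actually becomes a call to ``__fixunsdfdi`` depends on the target and compiler; on a 64-bit host it is typically a native instruction.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts a non-negative double to a 64-bit unsigned
// integer, truncating toward zero. On a 32-bit target without a native
// instruction for this conversion, the compiler may lower the cast to a
// runtime library call (``__fixunsdfdi``, per the text above).
std::uint64_t to_u64(double value) {
    // Converting a negative value to an unsigned type is undefined behavior.
    assert(value >= 0.0);
    return static_cast<std::uint64_t>(value);
}
```

For example, ``to_u64(4294967296.0)`` yields a value that does not fit in 32 bits, which is why a 32-bit target needs helper code for the conversion.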
103
104 The 3.2 release has the following notable changes:
105
106 #. ...
107
108 LLDB: Low Level Debugger
109 ------------------------
110
111 `LLDB `_ is a ground-up implementation of a command line
112 debugger, as well as a debugger API that can be used from other applications.
113 LLDB makes use of the Clang parser to provide high-fidelity expression parsing
114 (particularly for C++) and uses the LLVM JIT for target support.
115
116 The 3.2 release has the following notable changes:
117
118 #. ...
119
120 libc++: C++ Standard Library
121 ----------------------------
122
123 Like compiler_rt, libc++ is now :ref:`dual licensed
124 ` under the MIT and UIUC license, allowing it to be
125 used more permissively.
126
127 Within the LLVM 3.2 time-frame there were the following highlights:
128
129 #. ...
130
131 VMKit
132 -----
133
134 The `VMKit project `_ is an implementation of a Java
135 Virtual Machine (Java VM or JVM) that uses LLVM for static and just-in-time
136 compilation.
137
138 The 3.2 release has the following notable changes:
139
140 #. ...
141
142 Polly: Polyhedral Optimizer
143 ---------------------------
144
145 `Polly `_ is an *experimental* optimizer for data
146 locality and parallelism. It provides high-level loop optimizations and
147 automatic parallelisation.
148
149 Within the LLVM 3.2 time-frame there were the following highlights:
150
151 #. isl, the integer set library used by Polly, was relicensed to the MIT license
152 #. isl based code generation
153 #. MIT licensed replacement for CLooG (LGPLv2)
154 #. Fine grained option handling (separation of core and border computations,
155 control overhead vs. code size)
156 #. Support for FORTRAN and dragonegg
157 #. OpenMP code generation fixes
158
159 External Open Source Projects Using LLVM 3.2
160 ============================================
161
162 An exciting aspect of LLVM is that it is used as an enabling technology for a
163 lot of other language and tools projects. This section lists some of the
164 projects that have already been updated to work with LLVM 3.2.
165
166 Crack
167 -----
168
169 `Crack `_ aims to provide the ease of
170 development of a scripting language with the performance of a compiled
171 language. The language derives concepts from C++, Java and Python,
172 incorporating object-oriented programming, operator overloading and strong
173 typing.
174
175 FAUST
176 -----
177
178 `FAUST `_ is a compiled language for real-time audio
179 signal processing. The name FAUST stands for Functional AUdio STream. Its
180 programming model combines two approaches: functional programming and block
181 diagram composition. In addition to the C, C++, Java, and JavaScript output
182 formats, the Faust compiler can generate LLVM bitcode, and works with LLVM
183 2.7-3.1.
184
185 Glasgow Haskell Compiler (GHC)
186 ------------------------------
187
188 `GHC `_ is an open source compiler and programming
189 suite for Haskell, a lazy functional programming language. It includes an
190 optimizing static compiler generating good code for a variety of platforms,
191 together with an interactive system for convenient, quick development.
192
193 GHC 7.0 and onwards include an LLVM code generator, supporting LLVM 2.8 and
194 later.
195
196 Julia
197 -----
198
199 `Julia `_ is a high-level, high-performance
200 dynamic language for technical computing. It provides a sophisticated
201 compiler, distributed parallel execution, numerical accuracy, and an extensive
202 mathematical function library. The compiler uses type inference to generate
203 fast code without any type declarations, and uses LLVM's optimization passes
204 and JIT compiler. The `Julia Language `_ is designed
205 around multiple dispatch, giving programs a large degree of flexibility. It is
206 ready for use on many kinds of problems.
207
208 LLVM D Compiler
209 ---------------
210
211 `LLVM D Compiler `_ (LDC) is a compiler
212 for the D programming language. It is based on the DMD frontend and uses LLVM
213 as its backend.
214
215 Open Shading Language
216 ---------------------
217
218 `Open Shading Language (OSL)
219 `_ is a small but rich
220 language for programmable shading in advanced global illumination renderers and
221 other applications, ideal for describing materials, lights, displacement, and
222 pattern generation. It uses LLVM to JIT complex shader networks to x86 code at
223 runtime.
224
225 OSL was developed by Sony Pictures Imageworks for use in its in-house renderer
226 used for feature film animation and visual effects, and is distributed as open
227 source software with the "New BSD" license.
228
229 Portable OpenCL (pocl)
230 ----------------------
231
232 In addition to producing an easily portable open source OpenCL implementation,
233 another major goal of `pocl `_ is improving
234 performance portability of OpenCL programs with compiler optimizations,
235 reducing the need for target-dependent manual optimizations. An important part
236 of pocl is a set of LLVM passes used to statically parallelize multiple
237 work-items with the kernel compiler, even in the presence of work-group
238 barriers. This enables static parallelization of the fine-grained static
239 concurrency in the work groups in multiple ways (SIMD, VLIW, superscalar, ...).
240
241 Pure
242 ----
243
244 `Pure `_ is an algebraic/functional
245 programming language based on term rewriting. Programs are collections of
246 equations which are used to evaluate expressions in a symbolic fashion. The
247 interpreter uses LLVM as a backend to JIT-compile Pure programs to fast native
248 code. Pure offers dynamic typing, eager and lazy evaluation, lexical closures,
249 a hygienic macro system (also based on term rewriting), built-in list and
250 matrix support (including list and matrix comprehensions) and an easy-to-use
251 interface to C and other programming languages (including the ability to load
252 LLVM bitcode modules, and inline C, C++, Fortran and Faust code in Pure
253 programs if the corresponding LLVM-enabled compilers are installed).
254
255 Pure version 0.54 has been tested and is known to work with LLVM 3.1 (and
256 continues to work with older LLVM releases >= 2.5).
257
258 TTA-based Co-design Environment (TCE)
259 -------------------------------------
260
261 `TCE `_ is a toolset for designing application-specific
262 processors (ASP) based on the Transport triggered architecture (TTA). The
263 toolset provides a complete co-design flow from C/C++ programs down to
264 synthesizable VHDL/Verilog and parallel program binaries. Processor
265 customization points include the register files, function units, supported
266 operations, and the interconnection network.
267
268 TCE uses Clang and LLVM for C/C++ language support, target independent
269 optimizations and also for parts of code generation. It generates new
270 LLVM-based code generators "on the fly" for the designed TTA processors and
271 loads them into the compiler backend as runtime libraries to avoid per-target
272 recompilation of larger parts of the compiler chain.
273
274 Installation Instructions
275 =========================
276
277 See :doc:`GettingStarted`.
278
279 What's New in LLVM 3.2?
280 =======================
281
282 This release includes a huge number of bug fixes, performance tweaks and minor
283 improvements. Some of the major improvements and new features are listed in
284 this section.
285
286 Major New Features
287 ------------------
288
289 ..
290
291 Features that need text if they're finished for 3.2:
292 ARM EHABI
293 combiner-aa?
294 strong phi elim
295 loop dependence analysis
296 CorrelatedValuePropagation
297 Integrated assembler on by default for arm/thumb?
298
299 Near dead:
300 Analysis/RegionInfo.h + Dom Frontiers
301 SparseBitVector: used in LiveVar.
302 llvm/lib/Archive - replace with lib object?
303
304
305 LLVM 3.2 includes several major changes and big features:
306
307 #. New NVPTX back-end (replacing existing PTX back-end) based on NVIDIA sources
308 #. ...
309
310 LLVM IR and Core Improvements
311 -----------------------------
312
313 LLVM IR has several new features for better support of new targets and that
314 expose new optimization opportunities:
315
316 #. Thread local variables may have a specified TLS model. See the :ref:`Language
317 Reference Manual `.
318 #. ...
319
320 Optimizer Improvements
321 ----------------------
322
323 In addition to many minor performance tweaks and bug fixes, this release
324 includes a few major enhancements and additions to the optimizers:
325
326 Loop Vectorizer - We've added a loop vectorizer and we are now able to
327 vectorize small loops. The loop vectorizer is disabled by default and can be
328 enabled using the ``-mllvm -vectorize-loops`` flag. The SIMD vector width can
329 be specified using the flag ``-mllvm -force-vector-width=4``. The default
330 value is ``0`` which means auto-select.
331
332 We can now vectorize this function:
333
334 .. code-block:: c++
335
336 unsigned sum_arrays(int *A, int *B, int start, int end) {
337 unsigned sum = 0;
338 for (int i = start; i < end; ++i)
339 sum += A[i] + B[i] + i;
340 return sum;
341 }
342
343 We vectorize loops under the following conditions:
344
345 #. The innermost loops must have a single basic block.
346 #. The number of iterations is known before the loop starts to execute.
347 #. The loop counter needs to be incremented by one.
348 #. The loop trip count **can** be a variable.
349 #. Loops do **not** need to start at zero.
350 #. The induction variable can be used inside the loop.
351 #. Loop reductions are supported.
352 #. Arrays with affine access pattern do **not** need to be marked as
353 '``noalias``' and are checked at runtime.
354 #. ...
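
To make the conditions listed above concrete, here is a small sketch of two loops written to satisfy them. The function names ``sum_range`` and ``scale_add`` are hypothetical, and nothing here guarantees that a particular compiler version will vectorize these loops; they only illustrate the loop shapes described.

```cpp
#include <cassert>

// Single-block inner loop with a variable trip count, a non-zero start,
// a counter incremented by one, the induction variable used in the body,
// and a reduction into `sum`.
unsigned sum_range(const int *a, int start, int end) {
    assert(a != nullptr);
    unsigned sum = 0;
    for (int i = start; i < end; ++i)
        sum += static_cast<unsigned>(a[i] + i);
    return sum;
}

// Two arrays with affine accesses (index i) that are not marked as
// non-aliasing; per the list above, such accesses are checked at runtime.
void scale_add(int *dst, const int *src, int n) {
    assert(dst != nullptr && src != nullptr);
    for (int i = 0; i < n; ++i)
        dst[i] = dst[i] + 2 * src[i];
}
```

Both loops produce the same results whether or not they are vectorized, so they can be used to compare builds with and without the ``-mllvm -vectorize-loops`` flag mentioned above.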
355
356 SROA - We've re-written SROA to be significantly more powerful and generate
357 code which is much more friendly to the rest of the optimization pipeline.
358 Previously this pass had scaling problems that required it to only operate on
359 relatively small aggregates, and at times it would mistakenly replace a large
360 aggregate with a single very large integer in order to make it a scalar SSA
361 value. The result was a large number of i1024 and i2048 values representing any
362 small stack buffer. These in turn slowed down many subsequent optimization
363 paths.
364
365 The new SROA pass uses a different algorithm that allows it to only promote to
366 scalars the pieces of the aggregate actively in use. Because of this it doesn't
367 require any thresholds. It also always deduces the scalar values from the uses
368 of the aggregate rather than the specific LLVM type of the aggregate. These
369 features combine to both optimize more code with the pass and to improve the
370 compile time of many functions dramatically.
371
372 #. Branch weight metadata is preserved through more of the optimizer.
373 #. ...
374
375 MC Level Improvements
376 ---------------------
377
378 The LLVM Machine Code (aka MC) subsystem was created to solve a number of
379 problems in the realm of assembly, disassembly, object file format handling,
380 and a number of other related areas that CPU instruction-set level tools work
381 in. For more information, please see the `Intro to the LLVM MC Project Blog
382 Post `_.
383
384 #. ...
385
386 .. _codegen:
387
388 Target Independent Code Generator Improvements
389 ----------------------------------------------
390
391 We have put a significant amount of work into the code generator
392 infrastructure, which allows us to implement more aggressive algorithms and
393 make it run faster:
394
395 #. ...
396
397 Stack Coloring - We have implemented a new optimization pass to merge stack
398 objects which are used in disjoint areas of the code. This optimization reduces
399 the required stack space significantly, in cases where it is clear to the
400 optimizer that the stack slot is not shared. We use the lifetime markers to
401 tell the codegen that a certain alloca is used within a region.
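
The following sketch shows the kind of source code this description applies to: two local arrays whose lifetimes do not overlap. The function name ``disjoint_buffers`` is hypothetical, and whether the two arrays end up sharing a stack slot depends on the compiler and optimization level; the example only illustrates disjoint lifetimes.

```cpp
// `first` goes out of scope before `second` is created, so the two arrays
// are never live at the same time. A pass like the one described above
// could, in principle, place both in the same stack slot.
int disjoint_buffers(int seed) {
    int total = 0;
    {
        int first[256];
        for (int i = 0; i < 256; ++i)
            first[i] = seed + i;
        total += first[255];  // seed + 255
    }
    {
        int second[256];
        for (int i = 0; i < 256; ++i)
            second[i] = seed - i;
        total += second[255];  // seed - 255
    }
    return total;  // always 2 * seed
}
```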
402
403 We now merge consecutive loads and stores.
404
405 X86-32 and X86-64 Target Improvements
406 -------------------------------------
407
408 New features and major changes in the X86 target include:
409
410 #. ...
411
412 .. _ARM:
413
414 ARM Target Improvements
415 -----------------------
416
417 New features of the ARM target include:
418
419 #. ...
420
421 .. _armintegratedassembler:
422
423 MIPS Target Improvements
424 ------------------------
425
426 New features and major changes in the MIPS target include:
427
428 #. ...
429
430 PowerPC Target Improvements
431 ---------------------------
432
433 Many fixes and changes across LLVM (and Clang) for better compliance with the
434 64-bit PowerPC ELF Application Binary Interface, interoperability with GCC, and
435 overall 64-bit PowerPC support. Some highlights include:
436
437 #. MCJIT support added.
438 #. PPC64 relocation support and (small code model) TOC handling added.
439 #. Parameter passing and return value fixes (alignment issues, padding, varargs
440 support, proper register usage, odd-sized structure support, float support,
441 extension of return values for i32 return values).
442 #. Fixes in spill and reload code for vector registers.
443 #. C++ exception handling enabled.
444 #. Changes to remediate double-rounding compatibility issues with respect to
445 GCC behavior.
446 #. Refactoring to disentangle ``ppc64-elf-linux`` ABI from Darwin ppc64 ABI
447 support.
448 #. Assorted new test cases and test case fixes (endian and word size issues).
449 #. Fixes for big-endian codegen bugs, instruction encodings, and instruction
450 constraints.
451 #. Implemented ``-integrated-as`` support.
452 #. Additional support for Altivec compare operations.
453 #. IBM long double support.
454
455 There have also been code generation improvements for both 32- and 64-bit code.
456 Instruction scheduling support for the Freescale e500mc and e5500 cores has
457 been added.
458
459 PTX/NVPTX Target Improvements
460 -----------------------------
461
462 The PTX back-end has been replaced by the NVPTX back-end, which is based on the
463 LLVM back-end used by NVIDIA in their CUDA (nvcc) and OpenCL compiler. Some
464 highlights include:
465
466 #. Compatibility with PTX 3.1 and SM 3.5.
467 #. Support for NVVM intrinsics as defined in the NVIDIA Compiler SDK.
468 #. Full compatibility with old PTX back-end, with much greater coverage of LLVM
469 IR.
470
471 Please submit any back-end bugs to the LLVM Bugzilla site.
472
473 Other Target Specific Improvements
474 ----------------------------------
475
476 #. ...
477
478 Major Changes and Removed Features
479 ----------------------------------
480
481 If you're already an LLVM user or developer with out-of-tree changes based on
482 LLVM 3.2, this section lists some "gotchas" that you may run into upgrading
483 from the previous release.
484
485 #. The CellSPU port has been removed. It can still be found in older versions.
486 #. ...
487
488 Internal API Changes
489 --------------------
490
491 In addition, many APIs have changed in this release. Some of the major LLVM
492 API changes are:
493
494 We've added a new interface for allowing IR-level passes to access
495 target-specific information. A new IR-level pass, called
496 ``TargetTransformInfo`` provides a number of low-level interfaces. LSR and
497 LowerInvoke already use the new interface.
498
499 The ``TargetData`` structure has been renamed to ``DataLayout`` and moved to
500 ``VMCore`` to remove a dependency on ``Target``.
501
502 #. The IR-level extended linker APIs (for example, to link bitcode files out of
503 archives) have been removed. Any existing clients of these features should
504 move to using a linker with integrated LTO support.
505
506 Tools Changes
507 -------------
508
509 In addition, some tools have changed in this release. Some of the changes are:
510
511 #. ...
512
513 Python Bindings
514 ---------------
515
516 Officially supported Python bindings have been added! Feature support is far
517 from complete. The current bindings support interfaces to:
518
519 #. ...
520
521 Known Problems
522 ==============
523
524 LLVM is generally a production quality compiler, and is used by a broad range
525 of applications and shipping in many products. That said, not every subsystem
526 is as mature as the aggregate, particularly the more obscure targets. If you
527 run into a problem, please check the `LLVM bug database
528 `_ and submit a bug if there isn't already one or ask on
529 the `LLVMdev list `_.
530
531 Known problem areas include:
532
533 #. The MSP430 and XCore backends are experimental.
534
535 #. The integrated assembler, disassembler, and JIT are not supported by several
536 targets. If an integrated assembler is not supported, then a system
537 assembler is required. For more details, see the
538 :ref:`target-feature-matrix`.
59 Makes programs 10x faster by doing Special New Thing.
53960
54061 Additional Information
54162 ======================