llvm.org GIT mirror llvm / 047904e
[docs] Further organization of the Performance Tips document

Arranging the language specific property section into readable groupings and adding a couple of notes about pass order, extensions, and the like.

For the record, suggestions for wordsmithing are welcome. I'm happy to revise; I'm just trying to get *something* in place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245855 91177308-0d34-0410-b5e6-96231b3b80d8

Philip Reames 4 years ago
1 changed file(s) with 54 addition(s) and 15 deletion(s).
132132 Describing Language Specific Properties
133133 =======================================
134134
135 When translating a source language to LLVM, finding ways to express concepts and guarantees available in your source language which are not natively provided by LLVM IR will greatly improve LLVM's ability to optimize your code. As an example, C/C++'s ability to mark every add as "no signed wrap (nsw)" goes along way to assisting the optimizer in reasoning about loop induction variables.
136
137 The LLVM LangRef includes a number of mechanisms for annotating the IR with additional semantic information. It is *strongly* recommended that you become highly familiar with this document. The list below is intended to highlight a couple of items of particular interest, but is by no means exhaustive.
138
135 When translating a source language to LLVM, finding ways to express concepts
136 and guarantees available in your source language which are not natively
137 provided by LLVM IR will greatly improve LLVM's ability to optimize your code.
138 As an example, C/C++'s ability to mark every add as "no signed wrap (nsw)" goes
139 a long way to assisting the optimizer in reasoning about loop induction
140 variables and thus generating better code for loops.
141
142 The LLVM LangRef includes a number of mechanisms for annotating the IR with
143 additional semantic information. It is *strongly* recommended that you become
144 highly familiar with this document. The list below is intended to highlight a
145 couple of items of particular interest, but is by no means exhaustive.
146
147 Restricted Operation Semantics
148 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
139149 #. Add nsw/nuw flags as appropriate. Reasoning about overflow is
140150 generally hard for an optimizer so providing these facts from the frontend
141151 can be very impactful.
145155 optimizations that can be performed. This can be highly impactful for
146156 floating point intensive computations.
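
As a source-level illustration of the nsw point above, the sketch below contrasts a signed and an unsigned loop counter in C. Clang marks signed ``int`` addition as ``add nsw`` (absent ``-fwrapv``) because signed overflow is undefined in C, while unsigned addition wraps and is emitted as a plain ``add``. The function names are hypothetical; a frontend for another language would attach the flags wherever its own semantics rule out wrapping.

```c
/* Signed counter: overflow is undefined behavior in C, so the frontend can
 * emit the increment as `add nsw`, and the optimizer may assume `i` never
 * wraps when it reasons about the loop's trip count. */
long sum_every_other_signed(const int *data, int n) {
    long total = 0;
    for (int i = 0; i < n; i += 2)
        total += data[i];
    return total;
}

/* Unsigned counter: arithmetic wraps modulo 2^N, so the frontend emits a
 * plain `add`; any no-wrap fact has to be rediscovered by the optimizer. */
long sum_every_other_unsigned(const int *data, unsigned n) {
    long total = 0;
    for (unsigned i = 0; i < n; i += 2)
        total += data[i];
    return total;
}
```

Both functions compute the same result for valid inputs; the difference lies only in how much the frontend has told the optimizer about the induction variable.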
147157
148 #. Use inbounds on geps. This can help to disambiguate some aliasing queries.
158 Describing Aliasing Properties
159 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
149160
150161 #. Add noalias/align/dereferenceable/nonnull to function arguments and return
151162 values as appropriate.
152163
153 #. Mark functions as readnone/readonly or noreturn/nounwind when known. The
154 optimizer will try to infer these flags, but may not always be able to.
155 Manual annotations are particularly important for external functions that
156 the optimizer cannot analyze.
164 #. Use pointer aliasing metadata, especially tbaa metadata, to communicate
165 otherwise-non-deducible pointer aliasing facts.
166
167 #. Use inbounds on geps. This can help to disambiguate some aliasing queries.
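
For readers who know C, the ``noalias`` argument attribute has a familiar source-level counterpart: Clang lowers a ``restrict``-qualified pointer parameter to a ``noalias`` parameter in the IR. The following is a minimal sketch under that assumption; ``scale_add`` and ``scale_add_demo`` are hypothetical names used only for illustration.

```c
/* `restrict` promises that dst and src do not overlap. With that fact the
 * optimizer does not have to assume a store to dst[i] may change src[i+1],
 * which makes transformations such as vectorization easier to justify. */
void scale_add(float *restrict dst, const float *restrict src, float k, int n) {
    for (int i = 0; i < n; ++i)
        dst[i] += k * src[i];
}

/* Hypothetical driver that returns one element so the result is easy to check. */
float scale_add_demo(void) {
    float dst[3] = {1.0f, 2.0f, 3.0f};
    const float src[3] = {10.0f, 20.0f, 30.0f};
    scale_add(dst, src, 0.5f, 3);
    return dst[2]; /* 3 + 0.5 * 30 */
}
```

A frontend whose language guarantees non-overlapping arguments (for example, unique references) can emit the same attribute without any source-level keyword.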
168
169
170 Modeling Memory Effects
171 ^^^^^^^^^^^^^^^^^^^^^^^^
172
173 #. Mark functions as readnone/readonly/argmemonly or noreturn/nounwind when
174 known. The optimizer will try to infer these flags, but may not always be
175 able to. Manual annotations are particularly important for external
176 functions that the optimizer cannot analyze.
157177
158178 #. Use the lifetime.start/lifetime.end and invariant.start/invariant.end
159179 intrinsics where possible. Common profitable uses are for stack-like data
160180 structures (thus allowing dead store elimination) and for describing
161181 lifetimes of allocas (thus allowing smaller stack sizes).
162182
163 #. Use pointer aliasing metadata, especially tbaa metadata, to communicate
164 otherwise-non-deducible pointer aliasing facts.
165
166183 #. Mark invariant locations using !invariant.load and TBAA's constant flags.
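
The function-attribute advice above can be illustrated from the C side. GCC and Clang accept ``__attribute__((const))`` and ``__attribute__((pure))`` as extensions; to the best of my understanding Clang translates them to ``readnone`` and ``readonly`` respectively (spelled ``memory(none)`` and ``memory(read)`` in newer LLVM releases). The functions below are hypothetical and defined locally only so the sketch compiles on its own; the annotations matter most on declarations of external functions whose bodies the optimizer never sees.

```c
#include <stddef.h>

/* const: the result depends only on the argument values and the function
 * touches no memory, so repeated calls with the same input can be merged. */
__attribute__((const)) int square(int x) {
    return x * x;
}

/* pure: the function may read memory (here, the string) but never writes
 * it, so calls can be reordered past unrelated loads and removed if unused. */
__attribute__((pure)) size_t count_byte(const char *s, char c) {
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (*s == c)
            ++n;
    return n;
}
```

An incorrect annotation is a correctness bug rather than a missed optimization, so a frontend should only emit these attributes when the property is guaranteed.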
167184
168 #. If your language uses range checks, consider using the IRCE pass. It is not
169 currently part of the standard pass order.
185 Pass Ordering
186 ^^^^^^^^^^^^^
187
188 One of the most common mistakes made by new language frontend projects is to
189 use the existing -O2 or -O3 pass pipelines as is. These pass pipelines make a
190 good starting point for an optimizing compiler for any language, but they have
191 been carefully tuned for C and C++, not your target language. You will almost
192 certainly need to use a custom pass order to achieve optimal performance. A
193 couple of specific suggestions:
170194
171195 #. For languages with numerous rarely executed guard conditions (e.g. null
172196 checks, type checks, range checks) consider adding an extra execution or
174198 which is tuned for C and C++ applications, may not be sufficient to remove
175199 all dischargeable checks from loops.
176200
177 If you didn't find what you were looking for above, consider proposing a piece of metadata which provides the optimization hint you need. Such extensions are relatively common and are generally well received by the community. You will need to ensure that your proposal is sufficiently general so that it benefits others if you wish to contribute it upstream.
201 #. If your language uses range checks, consider using the IRCE pass. It is not
202 currently part of the standard pass order.
203
204 #. A useful sanity check is to run your optimized IR through the -O2
205 pipeline a second time. If you see noticeable improvement in the resulting
206 IR, you likely need to adjust your pass order.
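
The kind of loop that the guard-condition and range-check suggestions target can be sketched in C as follows. The per-iteration bounds check mimics what a memory-safe language frontend would emit; a pass such as IRCE aims to split the iteration space so that the main loop runs without the check. ``checked_sum`` is a hypothetical name, and whether a given pipeline actually removes the check depends on the pass order and the surrounding code.

```c
/* Sums data[0..n), checking every index against len the way a
 * bounds-checked language would. Returns -1 on an out-of-range access. */
long checked_sum(const int *data, int len, int n) {
    long total = 0;
    for (int i = 0; i < n; ++i) {
        if (i < 0 || i >= len) /* per-iteration range check */
            return -1;
        total += data[i];
    }
    return total;
}
```

When ``n <= len`` the check never fires, which is exactly the fact a range-check-aware pass tries to establish once, outside the loop, instead of on every iteration.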
207
208
209 I Still Can't Find What I'm Looking For
210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
211
212 If you didn't find what you were looking for above, consider proposing a piece
213 of metadata which provides the optimization hint you need. Such extensions are
214 relatively common and are generally well received by the community. You will
215 need to ensure that your proposal is sufficiently general so that it benefits
216 others if you wish to contribute it upstream.
178217
179218 Adding to this document
180219 =======================