llvm.org GIT mirror llvm / ae143ce
[Docs] Add CMake Primer document This document is intended to provide a basic overview of the CMake scripting language for LLVM developers. It was unorthodoxly reviewed for accuracy and content on the CMake developer list: http://public.kitware.com/pipermail/cmake-developers/2016-April/028300.html git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268096 91177308-0d34-0410-b5e6-96231b3b80d8 Chris Bieneman 3 years ago
2 changed file(s) with 466 addition(s) and 0 deletion(s).
============
CMake Primer
============

.. contents::
   :local:

.. warning::
   Disclaimer: This documentation is written by LLVM project contributors, *not*
   anyone affiliated with the CMake project. This document may contain
   inaccurate terminology, phrasing, or technical details. It is provided with
   the best intentions.

Introduction
============

The LLVM project and many of the core projects built on LLVM build using CMake.
This document aims to provide a brief overview of CMake for developers modifying
LLVM projects or building their own projects on top of LLVM.

The official CMake language reference is available in the cmake-language
manpage and in the cmake-language online documentation.

10,000 ft View
==============

CMake is a tool that reads script files in its own language that describe how a
software project builds. As CMake evaluates the scripts, it constructs an
internal representation of the software project. Once the scripts have been
fully processed, if there are no errors, CMake will generate build files to
actually build the project. CMake supports generating build files for a variety
of command line build tools as well as for popular IDEs.

When a user runs CMake, it performs a variety of checks similar to how autoconf
worked historically. During the checks and the evaluation of the build
description scripts, CMake caches values into the CMakeCache. This is useful
because it allows the build system to skip long-running checks during
incremental development. CMake caching also has some drawbacks, but those will
be discussed later.
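
As a minimal sketch of how a check ends up in the cache, the fragment below uses
the ``CheckIncludeFile`` module that ships with CMake. The project name, the
header, and the result variable ``HAVE_UNISTD_H`` are chosen only for
illustration.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(CacheDemo C)

   include(CheckIncludeFile)

   # The first configure run compiles a small test program and stores the
   # result in the cache. Later runs find HAVE_UNISTD_H there and skip the
   # check.
   check_include_file(unistd.h HAVE_UNISTD_H)

   if(HAVE_UNISTD_H)
     message(STATUS "unistd.h is available")
   endif()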

Scripting Overview
==================

CMake's scripting language has a very simple grammar. Every language construct
is a command that matches the pattern ``name(args)``. Commands come in three
primary types: language-defined (commands implemented in C++ in CMake), defined
functions, and defined macros. The CMake distribution also contains a suite of
CMake modules that contain definitions for useful functionality.

The example below is the full CMake build for building a C++ "Hello World"
program. The example uses only CMake language-defined commands.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)

The CMake language provides control flow constructs in the form of ``foreach``
loops and ``if`` blocks. To make the example above more complicated, you could
add an ``if`` block to define "APPLE" when targeting Apple platforms:

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)
   if(APPLE)
     target_compile_definitions(HelloWorld PUBLIC APPLE)
   endif()

Variables, Types, and Scope
===========================

Dereferencing
-------------

In CMake, variables are "stringly" typed. All variables are represented as
strings throughout evaluation. Wrapping a variable name in ``${}`` dereferences
it and results in a literal substitution of the value for the name. CMake refers
to this as "variable evaluation" in its documentation. Dereferences are
performed *before* the command being called receives the arguments. This means
dereferencing a list results in multiple separate arguments being passed to the
command.
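
The effect of expanding a list before the command sees its arguments can be
observed with ``message``, which concatenates all the arguments it receives. The
variable name below is only illustrative.

.. code-block:: cmake

   set(my_list a b c)

   # The list expands to three separate arguments, which message() joins
   # without a separator.
   message(${my_list})
   # prints: abc

   # Quoting keeps the expansion together as a single argument.
   message("${my_list}")
   # prints: a;b;c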

Variable dereferences can be nested and can be used to model complex data. For
example:

.. code-block:: cmake

   set(var_name var1)
   set(${var_name} foo) # same as "set(var1 foo)"
   set(${${var_name}}_var bar) # same as "set(foo_var bar)"

Dereferencing an unset variable results in an empty expansion. It is a common
pattern in CMake to set a variable only conditionally, knowing that it will also
be dereferenced in code paths where it was never set. There are examples of this
throughout the LLVM CMake build system.

An example of variable empty expansion is:

.. code-block:: cmake

   if(APPLE)
     set(extra_sources Apple.cpp)
   endif()
   add_executable(HelloWorld HelloWorld.cpp ${extra_sources})

In this example the ``extra_sources`` variable is only defined if you're
targeting an Apple platform. For all other targets ``extra_sources`` will be
evaluated as empty before ``add_executable`` is given its arguments.

One big "Gotcha" with variable dereferencing is that ``if`` commands implicitly
dereference values. This has some unexpected results. For example:

.. code-block:: cmake

   if("${SOME_VAR}" STREQUAL "MSVC")

In this code sample ``MSVC`` will be implicitly dereferenced if a variable with
that name is defined, so the ``if`` command ends up comparing the value of
``SOME_VAR`` with the value of ``MSVC`` rather than with the string "MSVC". A
common workaround for this problem is to prepend an ``x`` to the strings being
compared.

.. code-block:: cmake

   if("x${SOME_VAR}" STREQUAL "xMSVC")

This works because while ``MSVC`` is a defined variable, ``xMSVC`` is not. This
pattern is uncommon, but it does occur in LLVM's CMake scripts.

.. note::

   Once the LLVM project upgrades its minimum CMake version to 3.1 or later we
   can prevent this behavior by setting CMP0054 to ``NEW``. For more information
   on CMake policies please see the cmake-policies manpage or the cmake-policies
   online documentation.
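
As a hedged sketch of what opting in to the policy could look like, the fragment
below guards the setting with ``if(POLICY ...)`` so that it is skipped by CMake
versions that do not know CMP0054. The variable values are made up to show the
difference.

.. code-block:: cmake

   # if(POLICY ...) is true only when the running CMake knows the policy.
   if(POLICY CMP0054)
     cmake_policy(SET CMP0054 NEW)
   endif()

   set(MSVC 1)       # stand-in for a variable that happens to be defined
   set(SOME_VAR "1")

   # With CMP0054 set to NEW the quoted "MSVC" is not dereferenced, so the
   # strings "1" and "MSVC" are compared.
   if("${SOME_VAR}" STREQUAL "MSVC")
     message("matched")
   else()
     message("not matched") # taken when CMP0054 is NEW
   endif()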

Lists
-----

In CMake, lists are semicolon-delimited strings, and it is strongly advised that
you avoid using semicolons in list elements; it doesn't go smoothly. A few
examples of defining lists:

.. code-block:: cmake

   # Creates a list with members a, b, c, and d
   set(my_list a b c d)
   set(my_list "a;b;c;d")

   # Creates a string "a b c d"
   set(my_string "a b c d")
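
The difference between a list and a plain string shows up when the built-in
``list`` command is applied to each. This is a small sketch; the output variable
names are illustrative.

.. code-block:: cmake

   set(my_list a b c d)

   list(LENGTH my_list len)       # len is 4
   list(GET my_list 0 first)      # first is "a"
   list(APPEND my_list e)         # my_list is now "a;b;c;d;e"

   # A quoted string with spaces is a single element, not a list.
   set(my_string "a b c d")
   list(LENGTH my_string str_len) # str_len is 1

   message("len=${len} first=${first} str_len=${str_len}")
   # prints: len=4 first=a str_len=1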

Lists of Lists
--------------

One of the more complicated patterns in CMake is lists of lists. Because a list
cannot contain an element with a semicolon, to construct a list of lists you
make a list of variable names that refer to other lists. For example:

.. code-block:: cmake

   set(list_of_lists a b c)
   set(a 1 2 3)
   set(b 4 5 6)
   set(c 7 8 9)

With this layout you can iterate through the list of lists printing each value
with the following code:

.. code-block:: cmake

   foreach(list_name IN LISTS list_of_lists)
     foreach(value IN LISTS ${list_name})
       message(${value})
     endforeach()
   endforeach()

You'll notice that the inner ``foreach`` loop's list is doubly dereferenced.
The explicit ``${list_name}`` turns ``list_name`` into the name of the sub-list
(``a``, ``b``, or ``c`` in the example), and ``IN LISTS`` then performs the
second dereference to get the value of that list.

This pattern is used throughout CMake. The most common example is the compiler
flags options, which CMake refers to using the following variable expansions:
``CMAKE_${LANGUAGE}_FLAGS`` and ``CMAKE_${LANGUAGE}_FLAGS_${CMAKE_BUILD_TYPE}``.
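
A sketch of how such computed variable names can be read back is shown below.
It assumes a project with C and C++ enabled; the per-configuration variables use
the upper-case spelling of the build type (for example
``CMAKE_CXX_FLAGS_RELEASE``), so the build type is converted first.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(FlagsDemo C CXX)

   # Empty when no build type was given; the lookups below then expand to
   # nothing.
   string(TOUPPER "${CMAKE_BUILD_TYPE}" build_type)

   foreach(lang C CXX)
     # The inner ${lang} is expanded first and forms part of the outer name.
     message(STATUS "${lang} flags: ${CMAKE_${lang}_FLAGS}")
     message(STATUS "${lang} ${build_type} flags: ${CMAKE_${lang}_FLAGS_${build_type}}")
   endforeach()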

Other Types
-----------

Variables that are cached or specified on the command line can have types
associated with them. The variable's type is used by CMake's UI tool to display
the right input field. The variable's type generally doesn't impact evaluation.
One of the few exceptions is ``PATH`` variables, which CMake does have some
special handling for. You can read more about the special handling in the
documentation of CMake's ``set`` command.
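
For illustration, typed cache variables can be declared as sketched below. Both
variable names are hypothetical.

.. code-block:: cmake

   # The type selects the input field shown by CMake's UI tools, for example
   # a checkbox for BOOL and a directory chooser for PATH.
   set(EXAMPLE_ENABLE_FEATURE ON CACHE BOOL "Enable the example feature")
   set(EXAMPLE_DATA_DIR "" CACHE PATH "Directory containing example data")

   # A type can also be given on the command line:
   #   cmake -DEXAMPLE_DATA_DIR:PATH=data <path-to-source>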

Scope
-----

CMake inherently has directory-based scoping. Setting a variable in a
CMakeLists file will set the variable for that file and all subdirectories.
Variables set in a CMake module that is included in a CMakeLists file will be
set in the scope they are included from, and all subdirectories.

When a variable that is already set is set again in a subdirectory, it overrides
the value in that scope and any deeper subdirectories.

The CMake ``set`` command provides two scope-related options. ``PARENT_SCOPE``
sets a variable into the parent scope, and not the current scope. The ``CACHE``
option sets the variable in the CMakeCache, which results in it being set in all
scopes. The ``CACHE`` option will not set a variable that already exists in the
cache unless the ``FORCE`` option is specified.

In addition to directory-based scope, CMake functions also have their own scope.
This means variables set inside functions do not bleed into the parent scope.
This is not true of macros, and it is for this reason LLVM prefers functions
over macros whenever reasonable.

.. note::
   Unlike C-based languages, CMake's loop and control flow blocks do not have
   their own scopes.
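
The function scoping rules can be sketched as follows. The function and variable
names are illustrative, and the example assumes ``result`` is not set anywhere
else.

.. code-block:: cmake

   function(set_locally)
     set(result "local")               # visible only inside the function
   endfunction()

   function(set_in_parent)
     set(result "parent" PARENT_SCOPE) # set in the caller, not in here
   endfunction()

   set_locally()
   message("after set_locally: '${result}'")
   # prints: after set_locally: ''

   set_in_parent()
   message("after set_in_parent: '${result}'")
   # prints: after set_in_parent: 'parent'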

Control Flow
============

CMake features the same basic control flow constructs you would expect in any
scripting language, but there are a few quirks because, as with everything in
CMake, control flow constructs are commands.

If, ElseIf, Else
----------------

.. note::
   For the full documentation on the CMake ``if`` command, see CMake's own
   documentation. That resource is far more complete.

In general CMake ``if`` blocks work the way you'd expect:

.. code-block:: cmake

   if(<condition>)
     # do stuff
   elseif(<condition>)
     # do other stuff
   else()
     # do other other stuff
   endif()

The single most important thing to know about CMake's ``if`` blocks coming from
a C background is that they do not have their own scope. Variables set inside
conditional blocks persist after the ``endif()``.
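
A short sketch of that behavior, using an illustrative variable name:

.. code-block:: cmake

   if(TRUE)
     set(inside_if "still set")
   endif()

   # The variable set inside the block is still visible here.
   message("${inside_if}")
   # prints: still set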

Loops
-----

The most common form of the CMake ``foreach`` block is:

.. code-block:: cmake

   foreach(var ...)
     # do stuff
   endforeach()

The variable argument portion of the ``foreach`` block can contain dereferenced
lists, values to iterate, or a mix of both:

.. code-block:: cmake

   foreach(var foo bar baz)
     message(${var})
   endforeach()
   # prints:
   #  foo
   #  bar
   #  baz

   set(my_list 1 2 3)
   foreach(var ${my_list})
     message(${var})
   endforeach()
   # prints:
   #  1
   #  2
   #  3

   foreach(var ${my_list} out_of_bounds)
     message(${var})
   endforeach()
   # prints:
   #  1
   #  2
   #  3
   #  out_of_bounds

There is also a more modern CMake ``foreach`` syntax. The code below is
equivalent to the code above:

.. code-block:: cmake

   foreach(var IN ITEMS foo bar baz)
     message(${var})
   endforeach()
   # prints:
   #  foo
   #  bar
   #  baz

   set(my_list 1 2 3)
   foreach(var IN LISTS my_list)
     message(${var})
   endforeach()
   # prints:
   #  1
   #  2
   #  3

   foreach(var IN LISTS my_list ITEMS out_of_bounds)
     message(${var})
   endforeach()
   # prints:
   #  1
   #  2
   #  3
   #  out_of_bounds

Similar to the conditional statements, these generally behave how you would
expect, and they do not have their own scope.

CMake also supports ``while`` loops, although they are not widely used in LLVM.
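
For completeness, a minimal ``while`` loop might look like the sketch below. It
uses the built-in ``math`` command to advance an illustrative counter.

.. code-block:: cmake

   set(i 0)
   while(i LESS 3)
     message("${i}")
     math(EXPR i "${i} + 1")
   endwhile()
   # prints:
   #  0
   #  1
   #  2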

Modules, Functions and Macros
=============================

Modules
-------

Modules are CMake's vehicle for enabling code reuse. CMake modules are just
CMake script files. They can contain code to execute on include as well as
definitions for commands.

In CMake, macros and functions are universally referred to as commands, and they
are the primary method of defining code that can be called multiple times.

In LLVM we have several CMake modules that are included as part of our
distribution for developers who don't build our project from source. Those
modules are the fundamental pieces needed to build LLVM-based projects with
CMake. We also rely on modules as a way of organizing the build system's
functionality for maintainability and re-use within LLVM projects.
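
As a sketch of how a module is pulled in, the fragment below includes
``CheckCXXCompilerFlag``, a module that ships with CMake. The
``cmake/modules`` directory and the result variable name are hypothetical.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(ModuleDemo CXX)

   # Directories listed in CMAKE_MODULE_PATH are searched for modules in
   # addition to the modules distributed with CMake.
   list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake/modules")

   # Including the module runs its code and defines the
   # check_cxx_compiler_flag() command.
   include(CheckCXXCompilerFlag)
   check_cxx_compiler_flag(-Wall CXX_SUPPORTS_WALL)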

Argument Handling
-----------------

When defining a CMake command, handling arguments is very useful. The examples
in this section will all use the CMake ``function`` block, but this all applies
to the ``macro`` block as well.

CMake commands can have named arguments, but all commands are implicitly
variadic. If the command has named arguments they are required and must be
specified at every call site. Below is a trivial example of providing a wrapper
function for CMake's built-in command ``add_dependencies``.

.. code-block:: cmake

   function(add_deps target)
     add_dependencies(${target} ${ARGN})
   endfunction()

This example defines a new function named ``add_deps`` which takes a required
first argument, and just calls another command passing through the first
argument and all trailing arguments. When variable arguments are present, CMake
defines them in a list named ``ARGN``. The full argument list, including the
named arguments, is available as ``ARGV``, and the count of the arguments is
defined in ``ARGC``.

CMake provides a module ``CMakeParseArguments`` which provides an implementation
of advanced argument parsing. We use this all over LLVM, and it is recommended
for any function that has complex argument-based behaviors or optional
arguments. CMake's official documentation for the module is in the
``cmake-modules`` manpage, and is also available in the cmake-modules online
documentation.

.. note::
   As of CMake 3.5 the ``cmake_parse_arguments`` command has become a native
   command and the ``CMakeParseArguments`` module is empty and only left around
   for compatibility.
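
A minimal sketch of ``cmake_parse_arguments`` in use is shown below. The wrapper
``add_example_tool`` and its keywords are hypothetical and are not part of LLVM.

.. code-block:: cmake

   include(CMakeParseArguments) # needed before CMake 3.5, harmless afterwards

   # Hypothetical usage:
   #   add_example_tool(<name> [VERBOSE] [OUTPUT_NAME <n>]
   #                    SOURCES <src>... [DEPENDS <target>...])
   function(add_example_tool name)
     # Arguments: result prefix, boolean options, single-value keywords,
     # multi-value keywords, then the arguments to parse.
     cmake_parse_arguments(ARG "VERBOSE" "OUTPUT_NAME" "SOURCES;DEPENDS" ${ARGN})

     add_executable(${name} ${ARG_SOURCES})
     if(ARG_OUTPUT_NAME)
       set_target_properties(${name} PROPERTIES OUTPUT_NAME "${ARG_OUTPUT_NAME}")
     endif()
     if(ARG_DEPENDS)
       add_dependencies(${name} ${ARG_DEPENDS})
     endif()
     if(ARG_VERBOSE)
       message(STATUS "Added tool ${name}")
     endif()
   endfunction()

   add_example_tool(hello VERBOSE SOURCES HelloWorld.cpp)

Each parsed keyword becomes a variable with the chosen prefix, such as
``ARG_SOURCES``, and optional keywords that were not passed are simply left
empty or false.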

Functions Vs Macros
-------------------

Functions and macros look very similar in how they are used, but there is one
fundamental difference between the two. Functions have their own scope, and
macros don't. This means variables set in macros will bleed out into the calling
scope. That makes macros suitable for defining very small bits of functionality
only.

The other difference between CMake functions and macros is how arguments are
passed. Arguments to macros are not set as variables; instead, dereferences to
the parameters are resolved across the macro before executing it. This can
result in some unexpected behavior if using non-dereferenced variables. For
example:

.. code-block:: cmake

   macro(print_list my_list)
     foreach(var IN LISTS my_list)
       message("${var}")
     endforeach()
   endmacro()

   set(my_list a b c d)
   set(my_list_of_numbers 1 2 3 4)
   print_list(my_list_of_numbers)
   # prints:
   #  a
   #  b
   #  c
   #  d

Generally speaking this issue is uncommon because it requires using
non-dereferenced variables with names that overlap in the parent scope, but it
is important to be aware of because it can lead to subtle bugs.
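
For comparison, here is a sketch of the same helper written as a function. The
parameter is renamed to ``list_name`` for clarity and is dereferenced
explicitly, so the loop iterates over the list the caller named.

.. code-block:: cmake

   function(print_list list_name)
     # list_name is a real variable here; ${list_name} yields the name of
     # the caller's list, which IN LISTS then dereferences.
     foreach(var IN LISTS ${list_name})
       message("${var}")
     endforeach()
   endfunction()

   set(my_list a b c d)
   set(my_list_of_numbers 1 2 3 4)
   print_list(my_list_of_numbers)
   # prints:
   #  1
   #  2
   #  3
   #  4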

LLVM Project Wrappers
=====================

LLVM projects provide lots of wrappers around critical CMake built-in commands.
We use these wrappers to provide consistent behaviors across LLVM components
and to reduce code duplication.

We generally (but not always) follow the convention that commands prefaced with
``llvm_`` are intended to be used only as building blocks for other commands.
Wrapper commands that are intended for direct use are generally named with the
project in the middle of the command name (e.g. ``add_llvm_executable`` is the
wrapper for ``add_executable``). The LLVM ``add_*`` wrapper functions are all
defined in ``AddLLVM.cmake``, which is installed as part of the LLVM
distribution. It can be included and used by any LLVM sub-project that requires
LLVM.

.. note::

   Not all LLVM projects require LLVM for all use cases. For example compiler-rt
   can be built without LLVM, and the compiler-rt sanitizer libraries are used
   with GCC.
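
A rough sketch of how an out-of-tree project might pick up these wrappers is
shown below. It assumes an installed LLVM whose CMake package exports
``LLVM_CMAKE_DIR`` and ``LLVM_INCLUDE_DIRS``; the project, target, and source
file names are hypothetical.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.4.3)
   project(ExampleTool CXX)

   # Locate the installed LLVM CMake package.
   find_package(LLVM REQUIRED CONFIG)

   # Make the installed LLVM modules visible, then load the wrappers.
   list(APPEND CMAKE_MODULE_PATH "${LLVM_CMAKE_DIR}")
   include(AddLLVM)

   include_directories(${LLVM_INCLUDE_DIRS})

   # LLVM components the tool should link against (assumed here: Support).
   set(LLVM_LINK_COMPONENTS Support)
   add_llvm_executable(example-tool example-tool.cpp)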

Useful Built-in Commands
========================

CMake has a bunch of useful built-in commands. This document isn't going to
go into details about them because the CMake project has excellent
documentation. To highlight a few useful commands, see:

* ``add_custom_command``
* ``add_custom_target``
* ``file``
* ``list``
* ``math``
* ``string``

The full documentation for CMake commands is in the ``cmake-commands`` manpage
and available on CMake's website.
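
To give a flavor of a few of the commands listed above, here is a small sketch.
The variable names and the output file name are illustrative.

.. code-block:: cmake

   set(my_list a b c)

   # string(): join the list elements with spaces instead of semicolons.
   string(REPLACE ";" " " joined "${my_list}")

   # math(): integer arithmetic on strings.
   math(EXPR next "41 + 1")

   # list(): reverse the list in place.
   list(REVERSE my_list)

   # file(): write a small file into the current build directory.
   file(WRITE "${CMAKE_CURRENT_BINARY_DIR}/example.txt" "${joined}\n")

   message("joined='${joined}' next=${next} reversed='${my_list}'")
   # prints: joined='a b c' next=42 reversed='c;b;a'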

   :hidden:

   CMake
   CMakePrimer
   AdvancedBuilds
   HowToBuildOnARM
   HowToCrossCompileLLVM