llvm.org GIT mirror llvm / 89b77ce
docs: Add some information about Fuzzing LLVM itself This splits some content out of the libFuzzer docs and adds a fair amount of detail about the fuzzers in LLVM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315544 91177308-0d34-0410-b5e6-96231b3b80d8 Justin Bogner 2 years ago
3 changed file(s) with 232 addition(s) and 65 deletion(s).
0 ================================
1 Fuzzing LLVM libraries and tools
2 ================================
3
4 .. contents::
5 :local:
6 :depth: 2
7
8 Introduction
9 ============
10
11 The LLVM tree includes a number of fuzzers for various components. These are
12 built on top of :doc:`LibFuzzer <LibFuzzer>`.
13
14
15 Available Fuzzers
16 =================
17
18 clang-fuzzer
19 ------------
20
21 A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
22 bugs this fuzzer has reported are `on bugzilla `__
23 and `on OSS Fuzz's tracker
24 `__.
25
26 clang-proto-fuzzer
27 ------------------
28
29 A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
30 class that describes a subset of the C++ language.
31
32 This fuzzer accepts clang command line options after ``-ignore_remaining_args=1``.
33 For example, the following command will fuzz clang with a higher optimization
34 level:
35
36 .. code-block:: shell
37
38 % bin/clang-proto-fuzzer -ignore_remaining_args=1 -O3
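
The way a marker such as ``-ignore_remaining_args=1`` can divide a command line
is sketched below in C++. This is only an illustration: ``splitAtIgnoreMarker``
is a hypothetical helper, not the code the in-tree fuzzers use, and it assumes
that everything before the marker is meant for libFuzzer and everything after
it for the tool being fuzzed.

.. code-block:: c++

   #include <string>
   #include <utility>
   #include <vector>

   // Hypothetical helper (not part of LLVM): split a command line into
   // {arguments before the marker, arguments after the marker}.
   // The marker itself is dropped from both lists in this sketch.
   std::pair<std::vector<std::string>, std::vector<std::string>>
   splitAtIgnoreMarker(const std::vector<std::string> &Args) {
     const std::string Marker = "-ignore_remaining_args=1";
     std::vector<std::string> FuzzerArgs, ToolArgs;
     bool SeenMarker = false;
     for (const std::string &Arg : Args) {
       if (SeenMarker)
         ToolArgs.push_back(Arg);   // e.g. -O3, passed on to the tool
       else if (Arg == Marker)
         SeenMarker = true;         // everything after this goes to the tool
       else
         FuzzerArgs.push_back(Arg); // e.g. -max_len=64, kept for the fuzzer
     }
     return {FuzzerArgs, ToolArgs};
   }

With the command shown above, the second list would contain only ``-O3``.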
39
40 clang-format-fuzzer
41 -------------------
42
43 A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
44 bugs this fuzzer has reported are `on bugzilla `__
45 and `on OSS Fuzz's tracker
46
47
48 .. _clang-format: https://clang.llvm.org/docs/ClangFormat.html
49
50 llvm-as-fuzzer
51 --------------
52
53 A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly `.
54 Some of the bugs this fuzzer has reported are `on bugzilla
55 `__
56
57 llvm-dwarfdump-fuzzer
58 ---------------------
59
60 A |generic fuzzer| that interprets inputs as object files and runs
61 :doc:`llvm-dwarfdump ` on them. Some of the bugs
62 this fuzzer has reported are `on OSS Fuzz's tracker
63
64
65 llvm-isel-fuzzer
66 ----------------
67
68 A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.
69
70 This fuzzer accepts flags after ``-ignore_remaining_args=1``. The flags match
71 those of :doc:`llc ` and the triple is required. For example,
72 the following command would fuzz AArch64 with :doc:`GlobalISel`:
73
74 .. code-block:: shell
75
76 % bin/llvm-isel-fuzzer -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0
77
78 llvm-mc-assemble-fuzzer
79 -----------------------
80
81 A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
82 target specific assembly.
83
84 Note that this fuzzer has an unusual command line interface which is not fully
85 compatible with all of libFuzzer's features. Fuzzer arguments must be passed
86 after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
87 example, to fuzz the AArch64 assembler you might use the following command:
88
89 .. code-block:: console
90
91 llvm-mc-assemble-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4
92
93 This scheme will likely change in the future.
94
95 llvm-mc-disassemble-fuzzer
96 --------------------------
97
98 A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
99 as assembled binary data.
100
101 Note that this fuzzer has an unusual command line interface which is not fully
102 compatible with all of libFuzzer's features. See the notes above about
103 ``llvm-mc-assemble-fuzzer`` for details.
104
105
106 .. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
107 .. |protobuf fuzzer|
108 replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
109 .. |LLVM IR fuzzer|
110 replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`
111
112
113 Mutators and Input Generators
114 =============================
115
116 The inputs for a fuzz target are generated via random mutations of a
117 :ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
118 mutations that a fuzzer in LLVM might want.
119
120 .. _fuzzing-llvm-generic:
121
122 Generic Random Fuzzing
123 ----------------------
124
125 The most basic form of input mutation is to use the built-in mutators of
126 LibFuzzer. These simply treat the input corpus as a bag of bits and make random
127 mutations. This type of fuzzer is good at stressing the surface layers of a
128 program, which makes it well suited to testing things like lexers, parsers, or
129 binary protocols.
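
As a minimal sketch of what such a fuzz target looks like, the C++ below feeds
the raw bytes to a toy checker. ``LLVMFuzzerTestOneInput`` is the entry point
libFuzzer calls; ``parensBalanced`` is a hypothetical stand-in for the lexer or
parser under test and is not part of LLVM.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>

   // Toy "parser" standing in for real code under test (hypothetical).
   // Returns true if every '(' in the input has a matching ')'.
   static bool parensBalanced(const uint8_t *Data, size_t Size) {
     long Depth = 0;
     for (size_t I = 0; I < Size; ++I) {
       if (Data[I] == '(')
         ++Depth;
       else if (Data[I] == ')' && --Depth < 0)
         return false; // closing paren with nothing open
     }
     return Depth == 0;
   }

   // Entry point that libFuzzer calls once per mutated input. The bytes are
   // passed through untouched: the target sees them as an opaque buffer.
   extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
     parensBalanced(Data, Size);
     return 0;
   }

Because the mutator knows nothing about the input format, most inputs it
produces are rejected early, which is why this style mainly exercises the
outer layers of a program.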
130
131 Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
132 `clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
133 `llvm-mc-assemble-fuzzer`_, and `llvm-mc-disassemble-fuzzer`_.
134
135 .. _fuzzing-llvm-protobuf:
136
137 Structured Fuzzing using ``libprotobuf-mutator``
138 ------------------------------------------------
139
140 We can use libprotobuf-mutator_ in order to perform structured fuzzing and
141 stress deeper layers of programs. This works by defining a protobuf class that
142 translates arbitrary data into structurally interesting input. Specifically, we
143 use this to work with a subset of the C++ language and perform mutations that
144 produce valid C++ programs in order to exercise parts of clang that are more
145 interesting than parser error handling.
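
The idea of turning a structured description into valid source can be sketched
without protobuf at all. In the C++ below, the plain structs ``AssignStmt`` and
``Function`` are hypothetical stand-ins for a protobuf message, and
``functionToCxx`` is an illustrative converter; none of these names come from
clang-proto-fuzzer or libprotobuf-mutator.

.. code-block:: c++

   #include <string>
   #include <vector>

   // Hypothetical stand-ins for a protobuf message describing a tiny language
   // subset: a function whose body is a list of assignments "a = <value>;".
   struct AssignStmt {
     int Value;
   };
   struct Function {
     std::vector<AssignStmt> Body;
   };

   // Convert the structured description into C++ source text. Whatever values
   // a mutator puts into Function, the output is syntactically valid.
   std::string functionToCxx(const Function &F) {
     std::string Src = "void foo(int a) {\n";
     for (const AssignStmt &S : F.Body)
       Src += "  a = " + std::to_string(S.Value) + ";\n";
     Src += "}\n";
     return Src;
   }

A mutator that only edits the fields of ``Function`` can never produce a syntax
error, so every generated input gets past the parser.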
146
147 To build this kind of fuzzer you need `protobuf`_ and its dependencies
148 installed, and you need to specify some extra flags when configuring the build
149 with :doc:`CMake `. For example, `clang-proto-fuzzer`_ can be enabled by
150 adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
151 :ref:`building-fuzzers`.
152
153 The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
154 `clang-proto-fuzzer`_.
155
156 .. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
157 .. _protobuf: https://github.com/google/protobuf
158
159 .. _fuzzing-llvm-ir:
160
161 Structured Fuzzing of LLVM IR
162 -----------------------------
163
164 We also use a more direct form of structured fuzzing for fuzzers that take
165 :doc:`LLVM IR ` as input. This is achieved through the ``FuzzMutate``
166 library, which was `discussed at EuroLLVM 2017`_.
167
168 The ``FuzzMutate`` library is used to structurally fuzz backends in
169 `llvm-isel-fuzzer`_.
170
171 .. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg
172
173
174 Building and Running
175 ====================
176
177 .. _building-fuzzers:
178
179 Configuring LLVM to Build Fuzzers
180 ---------------------------------
181
182 Fuzzers will be built and linked to libFuzzer by default as long as you build
183 LLVM with sanitizer coverage enabled. You would typically also enable at least
184 one sanitizer to find bugs faster, so the most common way
185 to build the fuzzers is by adding the following two flags to your CMake
186 invocation: ``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.
187
188 .. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
189 with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
190 to avoid building the sanitizers themselves with sanitizers enabled.
191
192 Continuously Running and Finding Bugs
193 -------------------------------------
194
195 There used to be a public buildbot running LLVM fuzzers continuously, and while
196 this did find issues, it didn't have a very good way to report problems in an
197 actionable way. Because of this, we're moving towards using `OSS Fuzz`_ more
198 instead.
199
200 https://github.com/google/oss-fuzz/blob/master/projects/llvm/project.yaml
201 https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm
202
203 .. _OSS Fuzz: https://github.com/google/oss-fuzz
204
205
206 Utilities for Writing Fuzzers
207 =============================
208
209 There are some utilities available for writing fuzzers in LLVM.
210
211 Some helpers for handling the command line interface are available in
212 ``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
213 line options in a consistent way and to implement standalone main functions so
214 your fuzzer can be built and tested when not built against libFuzzer.
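
To illustrate what a standalone driver does, the C++ sketch below reads each
file named on the command line and passes its bytes to a fuzz target once.
``runFilesOnce`` and the ``FuzzTarget`` alias are hypothetical names for this
example; this is not the implementation in ``FuzzerCLI.h``.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <fstream>
   #include <iterator>
   #include <string>
   #include <vector>

   // Same signature as LLVMFuzzerTestOneInput.
   using FuzzTarget = int (*)(const uint8_t *, size_t);

   // Hypothetical helper: run the target once on the contents of each file.
   // Returns the number of files that could be read and executed.
   size_t runFilesOnce(FuzzTarget Target,
                       const std::vector<std::string> &Paths) {
     size_t Ran = 0;
     for (const std::string &Path : Paths) {
       std::ifstream In(Path, std::ios::binary);
       if (!In)
         continue; // skip unreadable files in this sketch
       std::vector<uint8_t> Buf((std::istreambuf_iterator<char>(In)),
                                std::istreambuf_iterator<char>());
       Target(Buf.data(), Buf.size());
       ++Ran;
     }
     return Ran;
   }

A ``main`` function built on a helper like this lets a crashing input be
replayed under a debugger without linking against libFuzzer.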
215
216 There is also some handling of the CMake config for fuzzers, where you should
217 use the ``add_llvm_fuzzer`` function to set up fuzzer targets. This function works
218 similarly to functions such as ``add_llvm_tool``, but it takes care of linking
219 to LibFuzzer when appropriate and can be passed the ``DUMMY_MAIN`` argument to
220 enable standalone testing.
4141 ``./third_party/llvm-build/Release+Asserts/bin/clang``)
4242
4343 The libFuzzer code resides in the LLVM repository, and requires a recent Clang
44 compiler to build (and is used to `fuzz various parts of LLVM itself`_).
45 However the fuzzer itself does not (and should not) depend on any part of LLVM
46 infrastructure and can be used for other projects without requiring the rest
47 of LLVM.
44 compiler to build (and is used to :doc:`fuzz various parts of LLVM itself
45 <FuzzingLLVM>`). However the fuzzer itself does not (and should not) depend on
46 any part of LLVM infrastructure and can be used for other projects without
47 requiring the rest of LLVM.
4848
4949
5050 Getting Started
136136
137137 clang -fsanitize-coverage=trace-pc-guard -fsanitize=address your_lib.cc fuzz_target.cc libFuzzer.a -o my_fuzzer
138138
139 .. _libfuzzer-corpus:
140
139141 Corpus
140142 ------
141143
626628 ninja check-fuzzer
627629
628630
629 Fuzzing components of LLVM
630 ==========================
631 .. contents::
632 :local:
633 :depth: 1
634
635 To build any of the LLVM fuzz targets use the build instructions above.
636
637 clang-format-fuzzer
638 -------------------
639 The inputs are random pieces of C++-like text.
640
641 .. code-block:: console
642
643 ninja clang-format-fuzzer
644 mkdir CORPUS_DIR
645 ./bin/clang-format-fuzzer CORPUS_DIR
646
647 Optionally build other kinds of binaries (ASan+Debug, MSan, UBSan, etc).
648
649 Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23052
650
651 clang-fuzzer
652 ------------
653
654 The behavior is very similar to ``clang-format-fuzzer``.
655
656 Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23057
657
658 llvm-as-fuzzer
659 --------------
660
661 Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=24639
662
663 llvm-mc-fuzzer
664 --------------
665
666 This tool fuzzes the MC layer. Currently it is only able to fuzz the
667 disassembler but it is hoped that assembly and round-trip verification will be
668 added in the future.
669
670 When run in disassembly mode, the inputs are opcodes to be disassembled. The
671 fuzzer will consume as many instructions as possible and will stop when it
672 finds an invalid instruction or runs out of data.
673
674 Please note that the command line interface differs slightly from that of other
675 fuzzers. The fuzzer arguments should follow ``--fuzzer-args`` and should have
676 a single dash, while other arguments control the operation mode and target in a
677 similar manner to ``llvm-mc`` and should have two dashes. For example:
678
679 .. code-block:: console
680
681 llvm-mc-fuzzer --triple=aarch64-linux-gnu --disassemble --fuzzer-args -max_len=4 -jobs=10
682
683 Buildbot
684 --------
685
686 A buildbot continuously runs the above fuzzers for LLVM components, with results
687 shown at http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer .
688
689631 FAQ
690632 =========================
691633
807749 .. _`value profile`: #value-profile
808750 .. _`caller-callee pairs`: http://clang.llvm.org/docs/SanitizerCoverage.html#caller-callee-coverage
809751 .. _BoringSSL: https://boringssl.googlesource.com/boringssl/
810 .. _`fuzz various parts of LLVM itself`: `Fuzzing components of LLVM`_
752
182182 ProgrammersManual
183183 Extensions
184184 LibFuzzer
185 FuzzingLLVM
185186 ScudoHardenedAllocator
186187 OptBisect
187188
226227
227228 :doc:`LibFuzzer`
228229 A library for writing in-process guided fuzzers.
230
231 :doc:`FuzzingLLVM`
232 Information on writing and using Fuzzers to find bugs in LLVM.
229233
230234 :doc:`ScudoHardenedAllocator`
231235 A library that implements a security-hardened `malloc()`.