llvm.org GIT mirror llvm / 01c176b
Add some tips on benchmarking. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303769 91177308-0d34-0410-b5e6-96231b3b80d8 Rafael Espindola 2 years ago
2 changed file(s) with 88 addition(s) and 0 deletion(s).

==================================
Benchmarking tips
==================================


Introduction
============

When benchmarking a patch, we want to reduce every source of noise as
much as possible. How to do that depends heavily on the OS.

Note that low noise is necessary but not sufficient: it does not
rule out measurement bias. See
https://www.cis.upenn.edu/~cis501/papers/producing-wrong-data.pdf for
an example.

General
=======

* Use a high resolution timer, e.g. ``perf`` on Linux.

* Run the benchmark multiple times so that you can recognize noise.

* Disable as many processes and services as possible on the target system.

* Disable frequency scaling, turbo boost and address space
  randomization (see the OS specific sections).

* Link statically if the OS supports it. That avoids any variation that
  might be introduced by loading dynamic libraries. This can be done
  by passing ``-DLLVM_BUILD_STATIC=ON`` to cmake.

* Try to avoid storage. On some systems you can use tmpfs. Putting the
  program, inputs and outputs on tmpfs avoids touching a real storage
  system, which can be highly variable.

  To mount it (on Linux and FreeBSD at least), with ``<XX>`` being the
  size in gigabytes::

    mount -t tmpfs -o size=<XX>g none dir_to_mount
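
To illustrate running a benchmark several times to recognize noise, here
is a minimal sketch that is not part of the original tips. It assumes the
timings were already collected, one number per line, by whatever timer you
use. ``summarize_times`` is a hypothetical helper name. It prints the
sample count, the mean, and the population standard deviation relative to
the mean.

```shell
#!/bin/sh
# Hypothetical helper: summarize repeated timings read from stdin
# (one number per line) to judge how noisy the measurements are.
summarize_times() {
    awk '
        { n++; sum += $1; sumsq += $1 * $1 }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            mean = sum / n
            var = sumsq / n - mean * mean      # population variance
            if (var < 0) var = 0               # guard against rounding error
            rel = (mean != 0) ? 100 * sqrt(var) / mean : 0
            printf "n=%d mean=%.3f rel_stddev=%.2f%%\n", n, mean, rel
        }'
}
```

For example, ``summarize_times < timings.txt``, where ``timings.txt`` is a
hypothetical file with one measurement per line. A large relative standard
deviation suggests that the environment is still too noisy to compare
patches.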

Linux
=====

* Disable address space randomization::

    echo 0 > /proc/sys/kernel/randomize_va_space

* Set scaling_governor to performance::

    for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    do
      echo performance > "$i"
    done

* Use https://github.com/lpechacek/cpuset to reserve CPUs for just the
  program you are benchmarking. If using perf, leave at least 2 cores
  so that perf runs in one and your program in another::

    cset shield -c N1,N2 -k on

  This will move all threads out of N1 and N2. The ``-k on`` means
  that even kernel threads are moved out.

* Disable the SMT pair of the CPUs you will use for the benchmark. The
  pair of CPU N can be found in
  ``/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`` and
  disabled, where X is the sibling's number, with::

    echo 0 > /sys/devices/system/cpu/cpuX/online

* Run the program with::

    cset shield --exec -- perf stat -r 10 <cmd>

  This will run the command after ``--`` in the isolated CPUs. The
  particular perf command runs ``<cmd>`` 10 times and reports
  statistics.

With these in place you can expect perf variations of less than 0.1%.
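
The SMT sibling lookup described above can be sketched as a small shell
function. This is an illustration under assumptions: ``smt_siblings`` is a
hypothetical helper name, its optional second argument (an alternative
sysfs CPU directory) exists only to make the function testable, and the
file is assumed to hold a comma separated list of CPU numbers or ranges
such as ``0,4`` or ``2-3``.

```shell
#!/bin/sh
# Hypothetical helper: smt_siblings CPU [SYSFS_CPU_DIR]
# Prints the SMT siblings of CPU, one per line, excluding CPU itself.
# SYSFS_CPU_DIR defaults to /sys/devices/system/cpu.
smt_siblings() {
    cpu=$1
    dir=${2:-/sys/devices/system/cpu}
    # Split the list on commas, then expand any "lo-hi" ranges.
    tr ',' '\n' < "$dir/cpu$cpu/topology/thread_siblings_list" |
    awk -F- -v self="$cpu" '
        NF == 0 { next }
        {
            lo = $1
            hi = (NF > 1) ? $2 : $1
            for (c = lo + 0; c <= hi + 0; c++)
                if (c != self + 0) print c
        }'
}

# Possible use, as root (left commented out because it changes system state):
# for s in $(smt_siblings 2); do
#     echo 0 > "/sys/devices/system/cpu/cpu$s/online"
# done
```

Taking the directory as a parameter keeps the parsing separate from the
privileged write, so the lookup can be checked on its own before any CPU
is taken offline.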

Linux Intel
-----------

* Disable turbo mode::

    echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
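
Before a run it can help to confirm that the Linux settings above are
actually in effect. The following is a minimal sketch, not an official
tool. ``check_bench_settings`` is a hypothetical name, and its optional
argument (a directory prefix prepended to the ``/proc`` and ``/sys``
paths) is an assumption added only so the function can be exercised
against a fake tree. The turbo check is skipped when the
``intel_pstate`` file does not exist.

```shell
#!/bin/sh
# Hypothetical helper: check_bench_settings [ROOT]
# Returns 0 if ASLR is off, every readable governor is "performance",
# and (when intel_pstate is present) turbo is disabled; 1 otherwise.
check_bench_settings() {
    root=${1:-}
    status=0

    aslr=$(cat "$root/proc/sys/kernel/randomize_va_space" 2>/dev/null)
    if [ "$aslr" != "0" ]; then
        echo "ASLR is enabled (randomize_va_space=${aslr:-unreadable})"
        status=1
    fi

    for g in "$root"/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
        [ -r "$g" ] || continue          # also skips an unmatched glob
        if [ "$(cat "$g")" != "performance" ]; then
            echo "governor is not performance: $g"
            status=1
        fi
    done

    turbo="$root/sys/devices/system/cpu/intel_pstate/no_turbo"
    if [ -r "$turbo" ] && [ "$(cat "$turbo")" != "1" ]; then
        echo "turbo is enabled (no_turbo is not 1)"
        status=1
    fi

    return "$status"
}
```

Calling ``check_bench_settings`` with no argument inspects the running
system. It only reads files, so it does not need root.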

CodeOfConduct
CompileCudaWithLLVM
ReportingGuide
Benchmarking

:doc:`GettingStarted`
   Discusses how to get up and running quickly with the LLVM infrastructure.