llvm.org GIT mirror llvm / a9c15c1
[TableGen][SubtargetEmitter] Add the ability for processor models to describe dependency breaking instructions.

This patch adds the ability for processor models to describe dependency breaking instructions.

Different processors may specify a different set of dependency-breaking instructions. That means we cannot assume that all processors of the same target use the same rules to classify dependency breaking instructions.

The main goal of this patch is to provide the means to describe dependency breaking instructions directly via tablegen, and to have the following TargetSubtargetInfo hooks overridden by tablegen'd XXXGenSubtargetInfo classes (here, XXX is a Target name).

```
virtual bool isZeroIdiom(const MachineInstr *MI, APInt &Mask) const {
  return false;
}

virtual bool isDependencyBreaking(const MachineInstr *MI, APInt &Mask) const {
  return isZeroIdiom(MI, Mask);
}
```

An instruction MI is a dependency-breaking instruction if a call to method isDependencyBreaking(MI) on the STI (TargetSubtargetInfo object) evaluates to true. Similarly, an instruction MI is a special case of zero-idiom dependency breaking instruction if a call to STI.isZeroIdiom(MI) returns true.

The extra APInt is used by targets that want to select which machine operands have their dependency broken (see comments in code).

Note that by default, subtargets don't know about the existence of dependency-breaking instructions. In the absence of external information, those method calls always return false.

A new tablegen class named STIPredicate has been added by this patch to let processor models classify instructions that have properties in common. The idea is that an MCInstPredicate definition can be used to "generate" an instruction equivalence class, where instructions of the same class all have a property in common.

STIPredicate definitions are essentially a collection of instruction equivalence classes. Also, different processor models can specify a different variant of the same STIPredicate with different rules (i.e. predicates) to classify instructions. Tablegen backends (in this particular case, the SubtargetEmitter) will be able to process STIPredicate definitions, and automatically generate functions in XXXGenSubtargetInfo.

This patch introduces two special kinds of STIPredicate classes named IsZeroIdiomFunction and IsDepBreakingFunction in tablegen. It also adds a definition for those in the BtVer2 scheduling model only.

This patch supersedes the one committed at r338372 (phabricator review: D49310).

The main advantages are:
- We can describe subtarget predicates via tablegen using STIPredicates.
- We can describe zero-idioms / dep-breaking instructions directly via tablegen in the scheduling models.

In the future, the STIPredicates framework can be used for solving other problems. Examples of future developments are:
- Teach how to identify optimizable register-register moves.
- Teach how to identify slow LEA instructions (each subtarget defining its own concept of "slow" LEA).
- Teach how to identify instructions that have undocumented false dependencies on the output registers on some processors only.

It is also (in my opinion) an elegant way to expose knowledge to both external tools like llvm-mca, and codegen passes. For example, machine schedulers in LLVM could reuse that information when internally constructing the data dependency graph for a code region.

This new design feature is also an "opt-in" feature. Processor models don't have to use the new STIPredicates. It has all been designed to be as unintrusive as possible.

Differential Revision: https://reviews.llvm.org/D52174

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342555 91177308-0d34-0410-b5e6-96231b3b80d8

Andrea Di Biagio 1 year, 9 months ago
13 changed file(s) with 1164 addition(s) and 92 deletion(s).
1313 #ifndef LLVM_CODEGEN_TARGETSUBTARGETINFO_H
1414 #define LLVM_CODEGEN_TARGETSUBTARGETINFO_H
1515
16 #include "llvm/ADT/APInt.h"
1617 #include "llvm/ADT/ArrayRef.h"
1718 #include "llvm/ADT/SmallVector.h"
1819 #include "llvm/ADT/StringRef.h"
143144 return 0;
144145 }
145146
147 /// Returns true if \param MI is a dependency breaking zero-idiom instruction
148 /// for the subtarget.
149 ///
150 /// This function also sets bits in \param Mask related to input operands that
151 /// are not in a data dependency relationship. There is one bit for each
152 /// machine operand; implicit operands follow explicit operands in the bit
153 /// representation used for \param Mask. An empty \param Mask (i.e. a mask
154 /// with all bits cleared) means: data dependencies are "broken" for all the
155 /// explicit input machine operands of \param MI.
156 virtual bool isZeroIdiom(const MachineInstr *MI, APInt &Mask) const {
157 return false;
158 }
159
160 /// Returns true if \param MI is a dependency breaking instruction for the
161 /// subtarget.
162 ///
163 /// Similar in behavior to `isZeroIdiom`. However, it knows how to identify
164 /// all dependency breaking instructions (i.e. not just zero-idioms).
165 ///
166 /// As for `isZeroIdiom`, this method returns a mask of "broken" dependencies.
167 /// (See method `isZeroIdiom` for a detailed description of \param Mask).
168 virtual bool isDependencyBreaking(const MachineInstr *MI, APInt &Mask) const {
169 return isZeroIdiom(MI, Mask);
170 }
171
146172 /// True if the subtarget should run MachineScheduler after aggressive
147173 /// coalescing.
148174 ///
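The delegation between the two hooks above can be illustrated with a minimal standalone sketch. It does not use LLVM types: `ToyInstr`, `ToySubtargetInfo`, `ToyCpuSubtargetInfo`, the opcode constants, and the use of a `uint64_t` in place of `APInt` are all hypothetical stand-ins chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for MachineInstr: an opcode and two source registers.
struct ToyInstr {
  unsigned Opcode;
  unsigned SrcReg1;
  unsigned SrcReg2;
};

// Hypothetical opcodes, not real target opcodes.
constexpr unsigned ToyXor = 1;
constexpr unsigned ToyAdd = 2;

// Mirrors the shape of the two hooks: by default nothing is a zero-idiom,
// and isDependencyBreaking falls back to isZeroIdiom.
struct ToySubtargetInfo {
  virtual ~ToySubtargetInfo() = default;
  virtual bool isZeroIdiom(const ToyInstr &, uint64_t &) const {
    return false;
  }
  virtual bool isDependencyBreaking(const ToyInstr &MI, uint64_t &Mask) const {
    return isZeroIdiom(MI, Mask);
  }
};

// A hypothetical processor on which XOR of a register with itself is a
// zero-idiom. Only isZeroIdiom is overridden; isDependencyBreaking picks
// the new behavior up through the default delegation.
struct ToyCpuSubtargetInfo : ToySubtargetInfo {
  bool isZeroIdiom(const ToyInstr &MI, uint64_t &Mask) const override {
    if (MI.Opcode != ToyXor || MI.SrcReg1 != MI.SrcReg2)
      return false;
    Mask = 0; // Empty mask: all explicit inputs are independent.
    return true;
  }
};
```

In this sketch a base-class object answers `false` for every instruction, while the derived object reports `xor r, r` as both a zero-idiom and a dependency breaking instruction.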
8787 const MCInst &Inst,
8888 APInt &Writes) const;
8989
90 /// Returns true if \param Inst is a dependency breaking instruction for the
90 /// Returns true if \param MI is a dependency breaking zero-idiom for the
9191 /// given subtarget.
92 ///
93 /// \param Mask is used to identify input operands that have their dependency
94 /// broken. Each bit of the mask is associated with a specific input operand.
95 /// Bits associated with explicit input operands are laid out first in the
96 /// mask; implicit operands come after explicit operands.
97 ///
98 /// Dependencies are broken only for operands that have their corresponding bit
99 /// set. Operands that have their bit cleared, or that don't have a
100 /// corresponding bit in the mask don't have their dependency broken.
101 /// Note that \param Mask may not be big enough to describe all operands.
102 /// The assumption for operands that don't have a corresponding bit in the
103 /// mask is that those are still data dependent.
104 ///
105 /// The only exception to the rule is for when \param Mask has all zeroes.
106 /// A zero mask means: dependencies are broken for all explicit register
107 /// operands.
108 virtual bool isZeroIdiom(const MCInst &MI, APInt &Mask,
109 unsigned CPUID) const {
110 return false;
111 }
112
113 /// Returns true if \param MI is a dependency breaking instruction for the
114 /// subtarget associated with \param CPUID.
92115 ///
93116 /// The value computed by a dependency breaking instruction is not dependent
94117 /// on the inputs. An example of dependency breaking instruction on X86 is
95118 /// `XOR %eax, %eax`.
96 /// TODO: In future, we could implement an alternative approach where this
97 /// method returns `true` if the input instruction is not dependent on
98 /// some/all of its input operands. An APInt mask could then be used to
99 /// identify independent operands.
100 virtual bool isDependencyBreaking(const MCSubtargetInfo &STI,
101 const MCInst &Inst) const;
119 ///
120 /// If \param MI is a dependency breaking instruction for subtarget \param
121 /// CPUID, then \param Mask can be inspected to identify independent operands.
122 ///
123 /// Essentially, each bit of the mask corresponds to an input operand.
124 /// Explicit operands are laid out first in the mask; implicit operands follow
125 /// explicit operands. Bits are set for operands that are independent.
126 ///
127 /// Note that the number of bits in Mask may not be equivalent to the sum of
128 /// explicit and implicit operands in \param MI. Operands that don't have a
129 /// corresponding bit in Mask are assumed "not independent".
130 ///
131 /// The only exception is for when \param Mask is all zeroes. That means:
132 /// explicit input operands of \param MI are independent.
133 virtual bool isDependencyBreaking(const MCInst &MI, APInt &Mask,
134 unsigned CPUID) const {
135 return isZeroIdiom(MI, Mask, CPUID);
136 }
102137
103138 /// Given a branch instruction try to get the address the branch
104139 /// targets. Return true on success, and the address in Target.
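The mask rules documented above (bit set means independent, missing bit means dependent, all-zero mask means every explicit operand is independent) can be sketched as a small standalone helper. This is only an illustration under stated assumptions: `isOperandIndependent` is a hypothetical name, and `std::vector<bool>` stands in for `APInt`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the input operand at position UseIndex has its dependency
// broken, following the rules in the doc comment:
//  - an all-zero (or empty) mask means every explicit operand is independent;
//  - otherwise an operand is independent only if its bit exists and is set.
bool isOperandIndependent(const std::vector<bool> &Mask, std::size_t UseIndex,
                          bool IsImplicit) {
  bool AllZero = true;
  for (bool Bit : Mask) {
    if (Bit) {
      AllZero = false;
      break;
    }
  }

  if (AllZero)
    return !IsImplicit;

  // Operands without a corresponding bit are conservatively dependent.
  if (UseIndex >= Mask.size())
    return false;

  return Mask[UseIndex];
}
```

With a mask of `{0, 1}`, for example, operand 1 is independent, operand 0 is not, and any operand at index 2 or higher is treated as dependent because the mask has no bit for it.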
6767
6868 // Forward declarations.
6969 class Instruction;
70 class SchedMachineModel;
7071
7172 // A generic machine instruction predicate.
7273 class MCInstPredicate;
229230 string MCInstFnName = MCInstFn;
230231 string MachineInstrFnName = MachineInstrFn;
231232 }
233
234 // Used to classify machine instructions based on a machine instruction
235 // predicate.
236 //
237 // Let IC be an InstructionEquivalenceClass definition, and MI a machine
238 // instruction. We say that MI belongs to the equivalence class described by IC
239 // if and only if the following two conditions are met:
240 // a) MI's opcode is in the `opcodes` set, and
241 // b) `Predicate` evaluates to true when applied to MI.
242 //
243 // Instances of this class can be used by processor scheduling models to
244 // describe instructions that have a property in common. For example,
245 // InstructionEquivalenceClass definitions can be used to identify the set of
246 // dependency breaking instructions for a processor model.
247 //
248 // An (optional) list of operand indices can be used to further describe
249 // properties that apply to instruction operands. For example, it can be used to
250 // identify register uses of a dependency breaking instruction that are not in
251 // a RAW dependency.
252 class InstructionEquivalenceClass<list<Instruction> opcodes,
253 MCInstPredicate pred,
254 list<int> operands = []> {
255 list<Instruction> Opcodes = opcodes;
256 MCInstPredicate Predicate = pred;
257 list<int> OperandIndices = operands;
258 }
259
260 // Used by processor models to describe dependency breaking instructions.
261 //
262 // This is mainly an alias for InstructionEquivalenceClass. Input operand
263 // `BrokenDeps` identifies the set of "broken dependencies". There is one bit
264 // per each implicit and explicit input operand. An empty set of broken
265 // dependencies means: "explicit input register operands are independent."
266 class DepBreakingClass<list<Instruction> opcodes, MCInstPredicate pred,
267 list<int> BrokenDeps = []>
268 : InstructionEquivalenceClass<opcodes, pred, BrokenDeps>;
269
270 // A function descriptor used to describe the signature of a predicate method
271 // which will be expanded by the STIPredicateExpander into a tablegen'd
272 // XXXGenSubtargetInfo class member definition (here, XXX is a target name).
273 //
274 // It describes the signature of a TargetSubtarget hook, as well as a few extra
275 // properties. Examples of extra properties are:
276 // - The default return value for the auto-generated function hook.
277 // - A list of subtarget hooks (Delegates) that are called from this function.
278 //
279 class STIPredicateDecl<string name, MCInstPredicate default = FalsePred,
280 bit overrides = 1, bit expandForMC = 1,
281 bit updatesOpcodeMask = 0,
282 list<STIPredicateDecl> delegates = []> {
283 string Name = name;
284
285 MCInstPredicate DefaultReturnValue = default;
286
287 // True if this method is declared as virtual in class TargetSubtargetInfo.
288 bit OverridesBaseClassMember = overrides;
289
290 // True if we need an equivalent predicate function in the MC layer.
291 bit ExpandForMC = expandForMC;
292
293 // True if the autogenerated method has an extra in/out APInt param used as a
294 // mask of operands.
295 bit UpdatesOpcodeMask = updatesOpcodeMask;
296
297 // A list of STIPredicates used by this definition to delegate part of the
298 // computation. For example, STIPredicateFunction `isDependencyBreaking()`
299 // delegates to `isZeroIdiom()` part of its computation.
300 list<STIPredicateDecl> Delegates = delegates;
301 }
302
303 // A predicate function definition member of class `XXXGenSubtargetInfo`.
304 //
305 // If `Declaration.ExpandForMC` is true, then SubtargetEmitter
306 // will also expand another definition of this method that accepts a MCInst.
307 class STIPredicate<STIPredicateDecl declaration,
308 list<InstructionEquivalenceClass> classes> {
309 STIPredicateDecl Declaration = declaration;
310 list<InstructionEquivalenceClass> Classes = classes;
311 SchedMachineModel SchedModel = ?;
312 }
313
314 // Convenience classes and definitions used by processor scheduling models to
315 // describe dependency breaking instructions.
316 let UpdatesOpcodeMask = 1 in {
317
318 def IsZeroIdiomDecl : STIPredicateDecl<"isZeroIdiom">;
319
320 let Delegates = [IsZeroIdiomDecl] in
321 def IsDepBreakingDecl : STIPredicateDecl<"isDependencyBreaking">;
322
323 } // UpdatesOpcodeMask
324
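325 class IsZeroIdiomFunction<list<DepBreakingClass> classes>
326 : STIPredicate<IsZeroIdiomDecl, classes>;
327
328 class IsDepBreakingFunction<list<DepBreakingClass> classes>
329 : STIPredicate<IsDepBreakingDecl, classes>;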
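The membership rule stated for InstructionEquivalenceClass (opcode in the set, and predicate true for the instruction) can be modeled in plain C++ as follows. This is a rough sketch, not what the SubtargetEmitter generates: `ToyInst`, `ToyEquivalenceClass`, and `checkSameReg` are hypothetical names, and a `std::function` stands in for an MCInstPredicate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a machine instruction: an opcode plus the
// register number of each operand.
struct ToyInst {
  unsigned Opcode;
  std::vector<unsigned> Regs;
};

// Models the three fields of the tablegen class: an opcode set, a predicate,
// and an optional list of operand indices.
struct ToyEquivalenceClass {
  std::vector<unsigned> Opcodes;
  std::function<bool(const ToyInst &)> Predicate;
  std::vector<int> OperandIndices;

  // MI belongs to the class iff (a) its opcode is in the set, and
  // (b) the predicate evaluates to true on MI.
  bool contains(const ToyInst &MI) const {
    bool OpcodeMatches =
        std::find(Opcodes.begin(), Opcodes.end(), MI.Opcode) != Opcodes.end();
    return OpcodeMatches && Predicate(MI);
  }
};

// Toy predicate: true when operands A and B name the same register.
bool checkSameReg(const ToyInst &MI, std::size_t A, std::size_t B) {
  return A < MI.Regs.size() && B < MI.Regs.size() && MI.Regs[A] == MI.Regs[B];
}
```

A class built with opcodes `{10, 11}` and a predicate that compares operands 1 and 2 would then accept opcode 10 with registers `{5, 7, 7}`, and reject both a mismatched register pair and an opcode outside the set.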
2323 return false;
2424 }
2525
26 bool MCInstrAnalysis::isDependencyBreaking(const MCSubtargetInfo &STI,
27 const MCInst &Inst) const {
28 return false;
29 }
30
3126 bool MCInstrAnalysis::evaluateBranch(const MCInst &Inst, uint64_t Addr,
3227 uint64_t Size, uint64_t &Target) const {
3328 if (Inst.getNumOperands() == 0 ||
379379 public:
380380 X86MCInstrAnalysis(const MCInstrInfo *MCII) : MCInstrAnalysis(MCII) {}
381381
382 bool isDependencyBreaking(const MCSubtargetInfo &STI,
383 const MCInst &Inst) const override;
382 #define GET_STIPREDICATE_DECLS_FOR_MC_ANALYSIS
383 #include "X86GenSubtargetInfo.inc"
384
384385 bool clearsSuperRegisters(const MCRegisterInfo &MRI, const MCInst &Inst,
385386 APInt &Mask) const override;
386387 std::vector<std::pair<uint64_t, uint64_t>>
389390 const Triple &TargetTriple) const override;
390391 };
391392
392 bool X86MCInstrAnalysis::isDependencyBreaking(const MCSubtargetInfo &STI,
393 const MCInst &Inst) const {
394 if (STI.getCPU() == "btver2") {
395 // Reference: Agner Fog's microarchitecture.pdf - Section 20 "AMD Bobcat and
396 // Jaguar pipeline", subsection 8 "Dependency-breaking instructions".
397 switch (Inst.getOpcode()) {
398 default:
399 return false;
400 case X86::SUB32rr:
401 case X86::SUB64rr:
402 case X86::SBB32rr:
403 case X86::SBB64rr:
404 case X86::XOR32rr:
405 case X86::XOR64rr:
406 case X86::XORPSrr:
407 case X86::XORPDrr:
408 case X86::VXORPSrr:
409 case X86::VXORPDrr:
410 case X86::ANDNPSrr:
411 case X86::VANDNPSrr:
412 case X86::ANDNPDrr:
413 case X86::VANDNPDrr:
414 case X86::PXORrr:
415 case X86::VPXORrr:
416 case X86::PANDNrr:
417 case X86::VPANDNrr:
418 case X86::PSUBBrr:
419 case X86::PSUBWrr:
420 case X86::PSUBDrr:
421 case X86::PSUBQrr:
422 case X86::VPSUBBrr:
423 case X86::VPSUBWrr:
424 case X86::VPSUBDrr:
425 case X86::VPSUBQrr:
426 case X86::PCMPEQBrr:
427 case X86::PCMPEQWrr:
428 case X86::PCMPEQDrr:
429 case X86::PCMPEQQrr:
430 case X86::VPCMPEQBrr:
431 case X86::VPCMPEQWrr:
432 case X86::VPCMPEQDrr:
433 case X86::VPCMPEQQrr:
434 case X86::PCMPGTBrr:
435 case X86::PCMPGTWrr:
436 case X86::PCMPGTDrr:
437 case X86::PCMPGTQrr:
438 case X86::VPCMPGTBrr:
439 case X86::VPCMPGTWrr:
440 case X86::VPCMPGTDrr:
441 case X86::VPCMPGTQrr:
442 case X86::MMX_PXORirr:
443 case X86::MMX_PANDNirr:
444 case X86::MMX_PSUBBirr:
445 case X86::MMX_PSUBDirr:
446 case X86::MMX_PSUBQirr:
447 case X86::MMX_PSUBWirr:
448 case X86::MMX_PCMPGTBirr:
449 case X86::MMX_PCMPGTDirr:
450 case X86::MMX_PCMPGTWirr:
451 case X86::MMX_PCMPEQBirr:
452 case X86::MMX_PCMPEQDirr:
453 case X86::MMX_PCMPEQWirr:
454 return Inst.getOperand(1).getReg() == Inst.getOperand(2).getReg();
455 case X86::CMP32rr:
456 case X86::CMP64rr:
457 return Inst.getOperand(0).getReg() == Inst.getOperand(1).getReg();
458 }
459 }
460
461 return false;
462 }
393 #define GET_STIPREDICATE_DEFS_FOR_MC_ANALYSIS
394 #include "X86GenSubtargetInfo.inc"
463395
464396 bool X86MCInstrAnalysis::clearsSuperRegisters(const MCRegisterInfo &MRI,
465397 const MCInst &Inst,
686686
687687 def : InstRW<[JSlowLEA16r], (instrs LEA16r)>;
688688
689 ///////////////////////////////////////////////////////////////////////////////
690 // Dependency breaking instructions.
691 ///////////////////////////////////////////////////////////////////////////////
692
693 def : IsZeroIdiomFunction<[
694 // GPR Zero-idioms.
695 DepBreakingClass<[ SUB32rr, SUB64rr, XOR32rr, XOR64rr ], ZeroIdiomPredicate>,
696
697 // MMX Zero-idioms.
698 DepBreakingClass<[
699 MMX_PXORirr, MMX_PANDNirr, MMX_PSUBBirr,
700 MMX_PSUBDirr, MMX_PSUBQirr, MMX_PSUBWirr,
701 MMX_PCMPGTBirr, MMX_PCMPGTDirr, MMX_PCMPGTWirr
702 ], ZeroIdiomPredicate>,
703
704 // SSE Zero-idioms.
705 DepBreakingClass<[
706 // fp variants.
707 XORPSrr, XORPDrr, ANDNPSrr, ANDNPDrr,
708
709 // int variants.
710 PXORrr, PANDNrr,
711 PSUBBrr, PSUBWrr, PSUBDrr, PSUBQrr,
712 PCMPGTBrr, PCMPGTDrr, PCMPGTQrr, PCMPGTWrr
713 ], ZeroIdiomPredicate>,
714
715 // AVX Zero-idioms.
716 DepBreakingClass<[
717 // xmm fp variants.
718 VXORPSrr, VXORPDrr, VANDNPSrr, VANDNPDrr,
719
720 // xmm int variants.
721 VPXORrr, VPANDNrr,
722 VPSUBBrr, VPSUBWrr, VPSUBDrr, VPSUBQrr,
723 VPCMPGTBrr, VPCMPGTWrr, VPCMPGTDrr, VPCMPGTQrr,
724
725 // ymm variants.
726 VXORPSYrr, VXORPDYrr, VANDNPSYrr, VANDNPDYrr
727 ], ZeroIdiomPredicate>
728 ]>;
729
730 def : IsDepBreakingFunction<[
731 // GPR
732 DepBreakingClass<[ SBB32rr, SBB64rr ], ZeroIdiomPredicate>,
733 DepBreakingClass<[ CMP32rr, CMP64rr ], CheckSameRegOperand<0, 1> >,
734
735 // MMX
736 DepBreakingClass<[
737 MMX_PCMPEQBirr, MMX_PCMPEQDirr, MMX_PCMPEQWirr
738 ], ZeroIdiomPredicate>,
739
740 // SSE
741 DepBreakingClass<[
742 PCMPEQBrr, PCMPEQWrr, PCMPEQDrr, PCMPEQQrr
743 ], ZeroIdiomPredicate>,
744
745 // AVX
746 DepBreakingClass<[
747 VPCMPEQBrr, VPCMPEQWrr, VPCMPEQDrr, VPCMPEQQrr
748 ], ZeroIdiomPredicate>
749 ]>;
750
689751 } // SchedModel
0 # NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
1 # RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=btver2 -timeline -timeline-max-iterations=3 < %s | FileCheck %s
2
3 # TODO: Fix the processor resource usage for zero-idiom YMM XOR instructions.
4 # Those vector XOR instructions should only consume 1cy of JFPU1 (instead
5 # of 2cy).
6
7 # LLVM-MCA-BEGIN ZERO-IDIOM-1
8
9 vaddps %ymm0, %ymm0, %ymm1
10 vxorps %ymm1, %ymm1, %ymm1
11 vblendps $2, %ymm1, %ymm2, %ymm3
12
13 # LLVM-MCA-END
14
15 # LLVM-MCA-BEGIN ZERO-IDIOM-2
16
17 vaddpd %ymm0, %ymm0, %ymm1
18 vxorpd %ymm1, %ymm1, %ymm1
19 vblendpd $2, %ymm1, %ymm2, %ymm3
20
21 # LLVM-MCA-END
22
23 # LLVM-MCA-BEGIN ZERO-IDIOM-3
24 vaddps %xmm0, %xmm1, %xmm2
25 vandnps %xmm2, %xmm2, %xmm3
26 # LLVM-MCA-END
27
28 # LLVM-MCA-BEGIN ZERO-IDIOM-4
29 vaddps %xmm0, %xmm1, %xmm2
30 vandnps %xmm2, %xmm2, %xmm3
31 # LLVM-MCA-END
32
33 # CHECK: [0] Code Region - ZERO-IDIOM-1
34
35 # CHECK: Iterations: 100
36 # CHECK-NEXT: Instructions: 300
37 # CHECK-NEXT: Total Cycles: 306
38 # CHECK-NEXT: Total uOps: 600
39
40 # CHECK: Dispatch Width: 2
41 # CHECK-NEXT: uOps Per Cycle: 1.96
42 # CHECK-NEXT: IPC: 0.98
43 # CHECK-NEXT: Block RThroughput: 3.0
44
45 # CHECK: Instruction Info:
46 # CHECK-NEXT: [1]: #uOps
47 # CHECK-NEXT: [2]: Latency
48 # CHECK-NEXT: [3]: RThroughput
49 # CHECK-NEXT: [4]: MayLoad
50 # CHECK-NEXT: [5]: MayStore
51 # CHECK-NEXT: [6]: HasSideEffects (U)
52
53 # CHECK: [1] [2] [3] [4] [5] [6] Instructions:
54 # CHECK-NEXT: 2 3 2.00 vaddps %ymm0, %ymm0, %ymm1
55 # CHECK-NEXT: 2 1 1.00 vxorps %ymm1, %ymm1, %ymm1
56 # CHECK-NEXT: 2 1 1.00 vblendps $2, %ymm1, %ymm2, %ymm3
57
58 # CHECK: Resources:
59 # CHECK-NEXT: [0] - JALU0
60 # CHECK-NEXT: [1] - JALU1
61 # CHECK-NEXT: [2] - JDiv
62 # CHECK-NEXT: [3] - JFPA
63 # CHECK-NEXT: [4] - JFPM
64 # CHECK-NEXT: [5] - JFPU0
65 # CHECK-NEXT: [6] - JFPU1
66 # CHECK-NEXT: [7] - JLAGU
67 # CHECK-NEXT: [8] - JMul
68 # CHECK-NEXT: [9] - JSAGU
69 # CHECK-NEXT: [10] - JSTC
70 # CHECK-NEXT: [11] - JVALU0
71 # CHECK-NEXT: [12] - JVALU1
72 # CHECK-NEXT: [13] - JVIMUL
73
74 # CHECK: Resource pressure per iteration:
75 # CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]
76 # CHECK-NEXT: - - - 3.00 3.00 3.00 3.00 - - - - - - -
77
78 # CHECK: Resource pressure by instruction:
79 # CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] Instructions:
80 # CHECK-NEXT: - - - 2.00 - 2.00 - - - - - - - - vaddps %ymm0, %ymm0, %ymm1
81 # CHECK-NEXT: - - - - 2.00 - 2.00 - - - - - - - vxorps %ymm1, %ymm1, %ymm1
82 # CHECK-NEXT: - - - 1.00 1.00 1.00 1.00 - - - - - - - vblendps $2, %ymm1, %ymm2, %ymm3
83
84 # CHECK: Timeline view:
85 # CHECK-NEXT: 012
86 # CHECK-NEXT: Index 0123456789
87
88 # CHECK: [0,0] DeeeER . . vaddps %ymm0, %ymm0, %ymm1
89 # CHECK-NEXT: [0,1] .DeE-R . . vxorps %ymm1, %ymm1, %ymm1
90 # CHECK-NEXT: [0,2] . DeE-R . . vblendps $2, %ymm1, %ymm2, %ymm3
91 # CHECK-NEXT: [1,0] . D=eeeER. . vaddps %ymm0, %ymm0, %ymm1
92 # CHECK-NEXT: [1,1] . DeE--R. . vxorps %ymm1, %ymm1, %ymm1
93 # CHECK-NEXT: [1,2] . D=eE-R . vblendps $2, %ymm1, %ymm2, %ymm3
94 # CHECK-NEXT: [2,0] . .DeeeER. vaddps %ymm0, %ymm0, %ymm1
95 # CHECK-NEXT: [2,1] . . D=eER. vxorps %ymm1, %ymm1, %ymm1
96 # CHECK-NEXT: [2,2] . . D=eER vblendps $2, %ymm1, %ymm2, %ymm3
97
98 # CHECK: Average Wait times (based on the timeline view):
99 # CHECK-NEXT: [0]: Executions
100 # CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
101 # CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
102 # CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
103
104 # CHECK: [0] [1] [2] [3]
105 # CHECK-NEXT: 0. 3 1.3 1.3 0.0 vaddps %ymm0, %ymm0, %ymm1
106 # CHECK-NEXT: 1. 3 1.3 1.3 1.0 vxorps %ymm1, %ymm1, %ymm1
107 # CHECK-NEXT: 2. 3 1.7 0.3 0.7 vblendps $2, %ymm1, %ymm2, %ymm3
108
109 # CHECK: [1] Code Region - ZERO-IDIOM-2
110
111 # CHECK: Iterations: 100
112 # CHECK-NEXT: Instructions: 300
113 # CHECK-NEXT: Total Cycles: 306
114 # CHECK-NEXT: Total uOps: 600
115
116 # CHECK: Dispatch Width: 2
117 # CHECK-NEXT: uOps Per Cycle: 1.96
118 # CHECK-NEXT: IPC: 0.98
119 # CHECK-NEXT: Block RThroughput: 3.0
120
121 # CHECK: Instruction Info:
122 # CHECK-NEXT: [1]: #uOps
123 # CHECK-NEXT: [2]: Latency
124 # CHECK-NEXT: [3]: RThroughput
125 # CHECK-NEXT: [4]: MayLoad
126 # CHECK-NEXT: [5]: MayStore
127 # CHECK-NEXT: [6]: HasSideEffects (U)
128
129 # CHECK: [1] [2] [3] [4] [5] [6] Instructions:
130 # CHECK-NEXT: 2 3 2.00 vaddpd %ymm0, %ymm0, %ymm1
131 # CHECK-NEXT: 2 1 1.00 vxorpd %ymm1, %ymm1, %ymm1
132 # CHECK-NEXT: 2 1 1.00 vblendpd $2, %ymm1, %ymm2, %ymm3
133
134 # CHECK: Resources:
135 # CHECK-NEXT: [0] - JALU0
136 # CHECK-NEXT: [1] - JALU1
137 # CHECK-NEXT: [2] - JDiv
138 # CHECK-NEXT: [3] - JFPA
139 # CHECK-NEXT: [4] - JFPM
140 # CHECK-NEXT: [5] - JFPU0
141 # CHECK-NEXT: [6] - JFPU1
142 # CHECK-NEXT: [7] - JLAGU
143 # CHECK-NEXT: [8] - JMul
144 # CHECK-NEXT: [9] - JSAGU
145 # CHECK-NEXT: [10] - JSTC
146 # CHECK-NEXT: [11] - JVALU0
147 # CHECK-NEXT: [12] - JVALU1
148 # CHECK-NEXT: [13] - JVIMUL
149
150 # CHECK: Resource pressure per iteration:
151 # CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]
152 # CHECK-NEXT: - - - 3.00 3.00 3.00 3.00 - - - - - - -
153
154 # CHECK: Resource pressure by instruction:
155 # CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] Instructions:
156 # CHECK-NEXT: - - - 2.00 - 2.00 - - - - - - - - vaddpd %ymm0, %ymm0, %ymm1
157 # CHECK-NEXT: - - - - 2.00 - 2.00 - - - - - - - vxorpd %ymm1, %ymm1, %ymm1
158 # CHECK-NEXT: - - - 1.00 1.00 1.00 1.00 - - - - - - - vblendpd $2, %ymm1, %ymm2, %ymm3
159
160 # CHECK: Timeline view:
161 # CHECK-NEXT: 012
162 # CHECK-NEXT: Index 0123456789
163
164 # CHECK: [0,0] DeeeER . . vaddpd %ymm0, %ymm0, %ymm1
165 # CHECK-NEXT: [0,1] .DeE-R . . vxorpd %ymm1, %ymm1, %ymm1
166 # CHECK-NEXT: [0,2] . DeE-R . . vblendpd $2, %ymm1, %ymm2, %ymm3
167 # CHECK-NEXT: [1,0] . D=eeeER. . vaddpd %ymm0, %ymm0, %ymm1
168 # CHECK-NEXT: [1,1] . DeE--R. . vxorpd %ymm1, %ymm1, %ymm1
169 # CHECK-NEXT: [1,2] . D=eE-R . vblendpd $2, %ymm1, %ymm2, %ymm3
170 # CHECK-NEXT: [2,0] . .DeeeER. vaddpd %ymm0, %ymm0, %ymm1
171 # CHECK-NEXT: [2,1] . . D=eER. vxorpd %ymm1, %ymm1, %ymm1
172 # CHECK-NEXT: [2,2] . . D=eER vblendpd $2, %ymm1, %ymm2, %ymm3
173
174 # CHECK: Average Wait times (based on the timeline view):
175 # CHECK-NEXT: [0]: Executions
176 # CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
177 # CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
178 # CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
179
180 # CHECK: [0] [1] [2] [3]
181 # CHECK-NEXT: 0. 3 1.3 1.3 0.0 vaddpd %ymm0, %ymm0, %ymm1
182 # CHECK-NEXT: 1. 3 1.3 1.3 1.0 vxorpd %ymm1, %ymm1, %ymm1
183 # CHECK-NEXT: 2. 3 1.7 0.3 0.7 vblendpd $2, %ymm1, %ymm2, %ymm3
184
185 # CHECK: [2] Code Region - ZERO-IDIOM-3
186
187 # CHECK: Iterations: 100
188 # CHECK-NEXT: Instructions: 200
189 # CHECK-NEXT: Total Cycles: 105
190 # CHECK-NEXT: Total uOps: 200
191
192 # CHECK: Dispatch Width: 2
193 # CHECK-NEXT: uOps Per Cycle: 1.90
194 # CHECK-NEXT: IPC: 1.90
195 # CHECK-NEXT: Block RThroughput: 1.0
196
197 # CHECK: Instruction Info:
198 # CHECK-NEXT: [1]: #uOps
199 # CHECK-NEXT: [2]: Latency
200 # CHECK-NEXT: [3]: RThroughput
201 # CHECK-NEXT: [4]: MayLoad
202 # CHECK-NEXT: [5]: MayStore
203 # CHECK-NEXT: [6]: HasSideEffects (U)
204
205 # CHECK: [1] [2] [3] [4] [5] [6] Instructions:
206 # CHECK-NEXT: 1 3 1.00 vaddps %xmm0, %xmm1, %xmm2
207 # CHECK-NEXT: 1 0 0.50 vandnps %xmm2, %xmm2, %xmm3
208
209 # CHECK: Resources:
210 # CHECK-NEXT: [0] - JALU0
211 # CHECK-NEXT: [1] - JALU1
212 # CHECK-NEXT: [2] - JDiv
213 # CHECK-NEXT: [3] - JFPA
214 # CHECK-NEXT: [4] - JFPM
215 # CHECK-NEXT: [5] - JFPU0
216 # CHECK-NEXT: [6] - JFPU1
217 # CHECK-NEXT: [7] - JLAGU
218 # CHECK-NEXT: [8] - JMul
219 # CHECK-NEXT: [9] - JSAGU
220 # CHECK-NEXT: [10] - JSTC
221 # CHECK-NEXT: [11] - JVALU0
222 # CHECK-NEXT: [12] - JVALU1
223 # CHECK-NEXT: [13] - JVIMUL
224
225 # CHECK: Resource pressure per iteration:
226 # CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]
227 # CHECK-NEXT: - - - 1.00 - 1.00 - - - - - - - -
228
229 # CHECK: Resource pressure by instruction:
230 # CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] Instructions:
231 # CHECK-NEXT: - - - 1.00 - 1.00 - - - - - - - - vaddps %xmm0, %xmm1, %xmm2
232 # CHECK-NEXT: - - - - - - - - - - - - - - vandnps %xmm2, %xmm2, %xmm3
233
234 # CHECK: Timeline view:
235 # CHECK-NEXT: Index 01234567
236
237 # CHECK: [0,0] DeeeER . vaddps %xmm0, %xmm1, %xmm2
238 # CHECK-NEXT: [0,1] D----R . vandnps %xmm2, %xmm2, %xmm3
239 # CHECK-NEXT: [1,0] .DeeeER. vaddps %xmm0, %xmm1, %xmm2
240 # CHECK-NEXT: [1,1] .D----R. vandnps %xmm2, %xmm2, %xmm3
241 # CHECK-NEXT: [2,0] . DeeeER vaddps %xmm0, %xmm1, %xmm2
242 # CHECK-NEXT: [2,1] . D----R vandnps %xmm2, %xmm2, %xmm3
243
244 # CHECK: Average Wait times (based on the timeline view):
245 # CHECK-NEXT: [0]: Executions
246 # CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
247 # CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
248 # CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
249
250 # CHECK: [0] [1] [2] [3]
251 # CHECK-NEXT: 0. 3 1.0 1.0 0.0 vaddps %xmm0, %xmm1, %xmm2
252 # CHECK-NEXT: 1. 3 0.0 0.0 4.0 vandnps %xmm2, %xmm2, %xmm3
253
254 # CHECK: [3] Code Region - ZERO-IDIOM-4
255
256 # CHECK: Iterations: 100
257 # CHECK-NEXT: Instructions: 200
258 # CHECK-NEXT: Total Cycles: 105
259 # CHECK-NEXT: Total uOps: 200
260
261 # CHECK: Dispatch Width: 2
262 # CHECK-NEXT: uOps Per Cycle: 1.90
263 # CHECK-NEXT: IPC: 1.90
264 # CHECK-NEXT: Block RThroughput: 1.0
265
266 # CHECK: Instruction Info:
267 # CHECK-NEXT: [1]: #uOps
268 # CHECK-NEXT: [2]: Latency
269 # CHECK-NEXT: [3]: RThroughput
270 # CHECK-NEXT: [4]: MayLoad
271 # CHECK-NEXT: [5]: MayStore
272 # CHECK-NEXT: [6]: HasSideEffects (U)
273
274 # CHECK: [1] [2] [3] [4] [5] [6] Instructions:
275 # CHECK-NEXT: 1 3 1.00 vaddps %xmm0, %xmm1, %xmm2
276 # CHECK-NEXT: 1 0 0.50 vandnps %xmm2, %xmm2, %xmm3
277
278 # CHECK: Resources:
279 # CHECK-NEXT: [0] - JALU0
280 # CHECK-NEXT: [1] - JALU1
281 # CHECK-NEXT: [2] - JDiv
282 # CHECK-NEXT: [3] - JFPA
283 # CHECK-NEXT: [4] - JFPM
284 # CHECK-NEXT: [5] - JFPU0
285 # CHECK-NEXT: [6] - JFPU1
286 # CHECK-NEXT: [7] - JLAGU
287 # CHECK-NEXT: [8] - JMul
288 # CHECK-NEXT: [9] - JSAGU
289 # CHECK-NEXT: [10] - JSTC
290 # CHECK-NEXT: [11] - JVALU0
291 # CHECK-NEXT: [12] - JVALU1
292 # CHECK-NEXT: [13] - JVIMUL
293
294 # CHECK: Resource pressure per iteration:
295 # CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]
296 # CHECK-NEXT: - - - 1.00 - 1.00 - - - - - - - -
297
298 # CHECK: Resource pressure by instruction:
299 # CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] Instructions:
300 # CHECK-NEXT: - - - 1.00 - 1.00 - - - - - - - - vaddps %xmm0, %xmm1, %xmm2
301 # CHECK-NEXT: - - - - - - - - - - - - - - vandnps %xmm2, %xmm2, %xmm3
302
303 # CHECK: Timeline view:
304 # CHECK-NEXT: Index 01234567
305
306 # CHECK: [0,0] DeeeER . vaddps %xmm0, %xmm1, %xmm2
307 # CHECK-NEXT: [0,1] D----R . vandnps %xmm2, %xmm2, %xmm3
308 # CHECK-NEXT: [1,0] .DeeeER. vaddps %xmm0, %xmm1, %xmm2
309 # CHECK-NEXT: [1,1] .D----R. vandnps %xmm2, %xmm2, %xmm3
310 # CHECK-NEXT: [2,0] . DeeeER vaddps %xmm0, %xmm1, %xmm2
311 # CHECK-NEXT: [2,1] . D----R vandnps %xmm2, %xmm2, %xmm3
312
313 # CHECK: Average Wait times (based on the timeline view):
314 # CHECK-NEXT: [0]: Executions
315 # CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
316 # CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
317 # CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
318
319 # CHECK: [0] [1] [2] [3]
320 # CHECK-NEXT: 0. 3 1.0 1.0 0.0 vaddps %xmm0, %xmm1, %xmm2
321 # CHECK-NEXT: 1. 3 0.0 0.0 4.0 vandnps %xmm2, %xmm2, %xmm3
423423 std::unique_ptr<Instruction> NewIS = llvm::make_unique<Instruction>(D);
424424
425425 // Check if this is a dependency breaking instruction.
426 bool IsDepBreaking = MCIA.isDependencyBreaking(STI, MCI);
427 // FIXME: this is a temporary hack to identify zero-idioms.
428 bool IsZeroIdiom = D.isZeroLatency() && IsDepBreaking;
426 APInt Mask;
427
428 unsigned ProcID = STI.getSchedModel().getProcessorID();
429 bool IsZeroIdiom = MCIA.isZeroIdiom(MCI, Mask, ProcID);
430 bool IsDepBreaking =
431 IsZeroIdiom || MCIA.isDependencyBreaking(MCI, Mask, ProcID);
429432
430433 // Initialize Reads first.
431434 for (const ReadDescriptor &RD : D.Reads) {
450453 assert(RegID > 0 && "Invalid register ID found!");
451454 auto RS = llvm::make_unique<ReadState>(RD, RegID);
452455
453 if (IsDepBreaking && !RD.isImplicitRead())
454 RS->setIndependentFromDef();
456 if (IsDepBreaking) {
457 // A mask of all zeroes means: explicit input operands are
458 // independent.
459 if (Mask.isNullValue()) {
460 if (!RD.isImplicitRead())
461 RS->setIndependentFromDef();
462 } else {
463 // Check if this register operand is independent according to `Mask`.
464 // Note that Mask may not have enough bits to describe all explicit and
465 // implicit input operands. If this register operand doesn't have a
466 // corresponding bit in Mask, then conservatively assume that it is
467 // dependent.
468 if (Mask.getBitWidth() > RD.UseIndex) {
469 // Okay. This mask describes register use `RD.UseIndex`.
470 if (Mask[RD.UseIndex])
471 RS->setIndependentFromDef();
472 }
473 }
474 }
455475 NewIS->getUses().emplace_back(std::move(RS));
456476 }
457477
224224 // Check MCInstPredicate definitions.
225225 checkMCInstPredicates();
226226
227 // Check STIPredicate definitions.
228 checkSTIPredicates();
229
230 // Find STIPredicate definitions for each processor model, and construct
231 // STIPredicateFunction objects.
232 collectSTIPredicates();
233
227234 checkCompleteness();
235 }
236
237 void CodeGenSchedModels::checkSTIPredicates() const {
238 DenseMap<StringRef, const Record *> Declarations;
239
240 // There cannot be multiple declarations with the same name.
241 const RecVec Decls = Records.getAllDerivedDefinitions("STIPredicateDecl");
242 for (const Record *R : Decls) {
243 StringRef Name = R->getValueAsString("Name");
244 const auto It = Declarations.find(Name);
245 if (It == Declarations.end()) {
246 Declarations[Name] = R;
247 continue;
248 }
249
250 PrintError(R->getLoc(), "STIPredicate " + Name + " multiply declared.");
251 PrintNote(It->second->getLoc(), "Previous declaration was here.");
252 PrintFatalError(R->getLoc(), "Invalid STIPredicateDecl found.");
253 }
254
255 // Disallow InstructionEquivalenceClasses with an empty instruction list.
256 const RecVec Defs =
257 Records.getAllDerivedDefinitions("InstructionEquivalenceClass");
258 for (const Record *R : Defs) {
259 RecVec Opcodes = R->getValueAsListOfDefs("Opcodes");
260 if (Opcodes.empty()) {
261 PrintFatalError(R->getLoc(), "Invalid InstructionEquivalenceClass "
262 "defined with an empty opcode list.");
263 }
264 }
265 }
266
267 // Used by function `processSTIPredicate` to construct a mask of machine
268 // instruction operands.
269 static APInt constructOperandMask(ArrayRef<int64_t> Indices) {
270 APInt OperandMask;
271 if (Indices.empty())
272 return OperandMask;
273
274 int64_t MaxIndex = *std::max_element(Indices.begin(), Indices.end());
275 assert(MaxIndex >= 0 && "Invalid negative indices in input!");
276 OperandMask = OperandMask.zext(MaxIndex + 1);
277 for (const int64_t Index : Indices) {
278 assert(Index >= 0 && "Invalid negative indices!");
279 OperandMask.setBit(Index);
280 }
281
282 return OperandMask;
283 }
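As an illustration of what `constructOperandMask` computes, here is a minimal sketch that builds the same kind of mask without LLVM types. It assumes `std::vector<bool>` as a stand-in for `APInt`; the name `buildOperandMask` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds a mask wide enough for the largest index and sets one bit per
// operand index. An empty index list yields an empty mask.
std::vector<bool> buildOperandMask(const std::vector<int64_t> &Indices) {
  std::vector<bool> Mask;
  if (Indices.empty())
    return Mask;

  int64_t MaxIndex = *std::max_element(Indices.begin(), Indices.end());
  assert(MaxIndex >= 0 && "negative operand index");
  Mask.resize(static_cast<std::size_t>(MaxIndex) + 1, false);
  for (int64_t Index : Indices) {
    assert(Index >= 0 && "negative operand index");
    Mask[static_cast<std::size_t>(Index)] = true;
  }
  return Mask;
}
```

For indices `{0, 2}` the sketch produces a three-bit mask with bits 0 and 2 set.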
284
285 static void
286 processSTIPredicate(STIPredicateFunction &Fn,
287 const DenseMap<Record *, unsigned> &ProcModelMap) {
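288 DenseMap<const Record *, unsigned> Opcode2Index;
289 using OpcodeMapPair = std::pair<const Record *, OpcodeInfo>;
290 std::vector<OpcodeMapPair> OpcodeMappings;
291 std::vector<std::pair<APInt, APInt>> OpcodeMasks;
292
293 DenseMap<const Record *, unsigned> Predicate2Index;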
294 unsigned NumUniquePredicates = 0;
295
296 // Number unique predicates and opcodes used by InstructionEquivalenceClass
297 // definitions. Each unique opcode will be associated with an OpcodeInfo
298 // object.
299 for (const Record *Def : Fn.getDefinitions()) {
300 RecVec Classes = Def->getValueAsListOfDefs("Classes");
301 for (const Record *EC : Classes) {
302 const Record *Pred = EC->getValueAsDef("Predicate");
303 if (Predicate2Index.find(Pred) == Predicate2Index.end())
304 Predicate2Index[Pred] = NumUniquePredicates++;
305
306 RecVec Opcodes = EC->getValueAsListOfDefs("Opcodes");
307 for (const Record *Opcode : Opcodes) {
308 if (Opcode2Index.find(Opcode) == Opcode2Index.end()) {
309 Opcode2Index[Opcode] = OpcodeMappings.size();
310 OpcodeMappings.emplace_back(Opcode, OpcodeInfo());
311 }
312 }
313 }
314 }
315
316 // Initialize vector `OpcodeMasks` with default values. We want to keep track
317 // of which processors "use" which opcodes. We also want to be able to
318 // identify predicates that are used by different processors for a same
319 // opcode.
320 // This information is used later on by this algorithm to sort OpcodeMapping
321 // elements based on their processor and predicate sets.
322 OpcodeMasks.resize(OpcodeMappings.size());
323 APInt DefaultProcMask(ProcModelMap.size(), 0);
324 APInt DefaultPredMask(NumUniquePredicates, 0);
325 for (std::pair<APInt, APInt> &MaskPair : OpcodeMasks)
326 MaskPair = std::make_pair(DefaultProcMask, DefaultPredMask);
327
328 // Construct a OpcodeInfo object for every unique opcode declared by an
329 // InstructionEquivalenceClass definition.
330 for (const Record *Def : Fn.getDefinitions()) {
331 RecVec Classes = Def->getValueAsListOfDefs("Classes");
332 const Record *SchedModel = Def->getValueAsDef("SchedModel");
333 unsigned ProcIndex = ProcModelMap.find(SchedModel)->second;
334 APInt ProcMask(ProcModelMap.size(), 0);
335 ProcMask.setBit(ProcIndex);
336
337 for (const Record *EC : Classes) {
338 RecVec Opcodes = EC->getValueAsListOfDefs("Opcodes");
339
340 std::vector<int64_t> OpIndices =
341 EC->getValueAsListOfInts("OperandIndices");
342 APInt OperandMask = constructOperandMask(OpIndices);
343
344 const Record *Pred = EC->getValueAsDef("Predicate");
345 APInt PredMask(NumUniquePredicates, 0);
346 PredMask.setBit(Predicate2Index[Pred]);
347
348 for (const Record *Opcode : Opcodes) {
349 unsigned OpcodeIdx = Opcode2Index[Opcode];
350 if (OpcodeMasks[OpcodeIdx].first[ProcIndex]) {
351 std::string Message =
352 "Opcode " + Opcode->getName().str() +
353 " used by multiple InstructionEquivalenceClass definitions.";
354 PrintFatalError(EC->getLoc(), Message);
355 }
356 OpcodeMasks[OpcodeIdx].first |= ProcMask;
357 OpcodeMasks[OpcodeIdx].second |= PredMask;
358 OpcodeInfo &OI = OpcodeMappings[OpcodeIdx].second;
359
360 OI.addPredicateForProcModel(ProcMask, OperandMask, Pred);
361 }
362 }
363 }
364
365 // Sort OpcodeMappings elements based on their CPU and predicate masks.
366 // As a last resort, order elements by opcode identifier.
367 llvm::sort(OpcodeMappings.begin(), OpcodeMappings.end(),
368 [&](const OpcodeMapPair &Lhs, const OpcodeMapPair &Rhs) {
369 unsigned LhsIdx = Opcode2Index[Lhs.first];
370 unsigned RhsIdx = Opcode2Index[Rhs.first];
371 std::pair<APInt, APInt> &LhsMasks = OpcodeMasks[LhsIdx];
372 std::pair<APInt, APInt> &RhsMasks = OpcodeMasks[RhsIdx];
373
374 if (LhsMasks.first != RhsMasks.first) {
375 if (LhsMasks.first.countPopulation() <
376 RhsMasks.first.countPopulation())
377 return true;
378 return LhsMasks.first.countLeadingZeros() >
379 RhsMasks.first.countLeadingZeros();
380 }
381
382 if (LhsMasks.second != RhsMasks.second) {
383 if (LhsMasks.second.countPopulation() <
384 RhsMasks.second.countPopulation())
385 return true;
386 return LhsMasks.second.countLeadingZeros() >
387 RhsMasks.second.countLeadingZeros();
388 }
389
390 return LhsIdx < RhsIdx;
391 });
392
393 // Now construct opcode groups. Groups are used by the SubtargetEmitter when
394 // expanding the body of a STIPredicate function. In particular, each opcode
395 // group is expanded into a sequence of labels in a switch statement.
396 // It identifies opcodes for which different processors define same predicates
397 // and same opcode masks.
398 for (OpcodeMapPair &Info : OpcodeMappings)
399 Fn.addOpcode(Info.first, std::move(Info.second));
400 }
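The sort in `processSTIPredicate` exists to place opcodes with identical processor and predicate masks next to each other. Sorting algorithms require a strict weak ordering, and the lambda in the diff falls through to the leading-zero comparison even when the left mask has the larger population count, so masks such as 0b011 and 0b100 each compare less than the other. The sketch below shows one way to express a similar ordering as a lexicographic key, which is a strict weak ordering by construction. It is not the comparator from the patch and may order some masks differently; it assumes 64-bit integers in place of `APInt`, and `OpcodeEntry`, `lessByMasks`, and `sortEntries` are hypothetical names.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical flattened view of one opcode and its two masks.
struct OpcodeEntry {
  uint64_t ProcMask;  // processors that use the opcode
  uint64_t PredMask;  // predicates associated with the opcode
  unsigned OpcodeIdx; // last-resort tie breaker
};

static unsigned popCount(uint64_t V) {
  return static_cast<unsigned>(std::bitset<64>(V).count());
}

// Orders by (popcount, value) of each mask, then by opcode index.
bool lessByMasks(const OpcodeEntry &L, const OpcodeEntry &R) {
  auto Key = [](const OpcodeEntry &E) {
    return std::make_tuple(popCount(E.ProcMask), E.ProcMask,
                           popCount(E.PredMask), E.PredMask, E.OpcodeIdx);
  };
  return Key(L) < Key(R);
}

void sortEntries(std::vector<OpcodeEntry> &Entries) {
  std::sort(Entries.begin(), Entries.end(), lessByMasks);
}
```

Entries with equal masks share the first four key components, so they remain adjacent after sorting and differ only by opcode index.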
401
402 void CodeGenSchedModels::collectSTIPredicates() {
403 // Map STIPredicateDecl records to elements of vector
404 // CodeGenSchedModels::STIPredicates.
405 DenseMap<const Record *, unsigned> Decl2Index;
406
407 RecVec RV = Records.getAllDerivedDefinitions("STIPredicate");
408 for (const Record *R : RV) {
409 const Record *Decl = R->getValueAsDef("Declaration");
410
411 const auto It = Decl2Index.find(Decl);
412 if (It == Decl2Index.end()) {
413 Decl2Index[Decl] = STIPredicates.size();
414 STIPredicateFunction Predicate(Decl);
415 Predicate.addDefinition(R);
416 STIPredicates.emplace_back(std::move(Predicate));
417 continue;
418 }
419
420 STIPredicateFunction &PreviousDef = STIPredicates[It->second];
421 PreviousDef.addDefinition(R);
422 }
423
424 for (STIPredicateFunction &Fn : STIPredicates)
425 processSTIPredicate(Fn, ProcModelMap);
426 }
427
428 void OpcodeInfo::addPredicateForProcModel(const llvm::APInt &CpuMask,
429 const llvm::APInt &OperandMask,
430 const Record *Predicate) {
431 auto It = llvm::find_if(
432 Predicates, [&OperandMask, &Predicate](const PredicateInfo &P) {
433 return P.Predicate == Predicate && P.OperandMask == OperandMask;
434 });
435 if (It == Predicates.end()) {
436 Predicates.emplace_back(CpuMask, OperandMask, Predicate);
437 return;
438 }
439 It->ProcModelMask |= CpuMask;
228440 }
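`OpcodeInfo::addPredicateForProcModel` merges processors that share the same predicate and operand mask into a single entry. A minimal sketch of that merge step follows. It assumes plain 64-bit masks and a string in place of the `const Record *` predicate; `PredicateEntry` and `addPredicate` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct PredicateEntry {
  uint64_t ProcModelMask; // one bit per processor model
  uint64_t OperandMask;   // one bit per operand
  std::string Predicate;  // stands in for the predicate record
};

// Adds a (processor, operand mask, predicate) triple. If an entry with the
// same predicate and operand mask exists, only its processor mask grows.
void addPredicate(std::vector<PredicateEntry> &Entries, uint64_t CpuMask,
                  uint64_t OperandMask, const std::string &Predicate) {
  auto It = std::find_if(Entries.begin(), Entries.end(),
                         [&](const PredicateEntry &P) {
                           return P.Predicate == Predicate &&
                                  P.OperandMask == OperandMask;
                         });
  if (It == Entries.end()) {
    Entries.push_back({CpuMask, OperandMask, Predicate});
    return;
  }
  It->ProcModelMask |= CpuMask;
}
```

Two processors that agree on both the predicate and the operand mask end up sharing one entry; a processor with a different operand mask gets its own.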
229441
230442 void CodeGenSchedModels::checkMCInstPredicates() const {
1414 #ifndef LLVM_UTILS_TABLEGEN_CODEGENSCHEDULE_H
1515 #define LLVM_UTILS_TABLEGEN_CODEGENSCHEDULE_H
1616
17 #include "llvm/ADT/APInt.h"
1718 #include "llvm/ADT/DenseMap.h"
1819 #include "llvm/ADT/StringMap.h"
1920 #include "llvm/Support/ErrorHandling.h"
269270 #endif
270271 };
271272
273 /// Used to correlate instructions to MCInstPredicates specified by
274 /// InstructionEquivalenceClass tablegen definitions.
275 ///
276 /// Example: an XOR of a register with itself is a known zero-idiom for most
277 /// X86 processors.
278 ///
279 /// Each processor can use a (potentially different) InstructionEquivalenceClass
280 /// definition to classify zero-idioms. That means, XORrr is likely to appear
281 /// in more than one equivalence class (where each class definition is
282 /// contributed by a different processor).
283 ///
284 /// There is no guarantee that the same MCInstPredicate will be used to describe
285 /// equivalence classes that identify XORrr as a zero-idiom.
286 ///
287 /// To be more specific, the requirements for being a zero-idiom XORrr may be
288 /// different for different processors.
289 ///
290 /// Class PredicateInfo identifies a subset of processors that specify the same
291 /// requirements (i.e. same MCInstPredicate and OperandMask) for an instruction
292 /// opcode.
293 ///
294 /// Back to the example. Field `ProcModelMask` will have one bit set for every
295 /// processor model that sees XORrr as a zero-idiom, and that specifies the same
296 /// set of constraints.
297 ///
298 /// By construction, there can be multiple instances of PredicateInfo associated
299 /// with a same instruction opcode. For example, different processors may define
300 /// different constraints on the same opcode.
301 ///
302 /// Field OperandMask can be used as an extra constraint.
303 /// It may be used to describe conditions that apply only to a subset of the
304 /// operands of a machine instruction, and the operands subset may not be the
305 /// same for all processor models.
306 struct PredicateInfo {
307 llvm::APInt ProcModelMask; // A set of processor model indices.
308 llvm::APInt OperandMask; // An operand mask.
309 const Record *Predicate; // MCInstPredicate definition.
310 PredicateInfo(llvm::APInt CpuMask, llvm::APInt Operands, const Record *Pred)
311 : ProcModelMask(CpuMask), OperandMask(Operands), Predicate(Pred) {}
312
313 bool operator==(const PredicateInfo &Other) const {
314 return ProcModelMask == Other.ProcModelMask &&
315 OperandMask == Other.OperandMask && Predicate == Other.Predicate;
316 }
317 };
318
319 /// A collection of PredicateInfo objects.
320 ///
321 /// There is at least one OpcodeInfo object for every opcode specified by a
322 /// STIPredicate definition.
323 class OpcodeInfo {
324 llvm::SmallVector Predicates;
325
326 OpcodeInfo(const OpcodeInfo &Other) = delete;
327 OpcodeInfo &operator=(const OpcodeInfo &Other) = delete;
328
329 public:
330 OpcodeInfo() = default;
331 OpcodeInfo &operator=(OpcodeInfo &&Other) = default;
332 OpcodeInfo(OpcodeInfo &&Other) = default;
333
334 ArrayRef<PredicateInfo> getPredicates() const { return Predicates; }
335
336 void addPredicateForProcModel(const llvm::APInt &CpuMask,
337 const llvm::APInt &OperandMask,
338 const Record *Predicate);
339 };
340
341 /// Used to group together tablegen instruction definitions that are subject
342 /// to a same set of constraints (identified by an instance of OpcodeInfo).
343 class OpcodeGroup {
344 OpcodeInfo Info;
345 std::vector<const Record *> Opcodes;
346
347 OpcodeGroup(const OpcodeGroup &Other) = delete;
348 OpcodeGroup &operator=(const OpcodeGroup &Other) = delete;
349
350 public:
351 OpcodeGroup(OpcodeInfo &&OpInfo) : Info(std::move(OpInfo)) {}
352 OpcodeGroup(OpcodeGroup &&Other) = default;
353
354 void addOpcode(const Record *Opcode) {
355 assert(std::find(Opcodes.begin(), Opcodes.end(), Opcode) == Opcodes.end() &&
356 "Opcode already in set!");
357 Opcodes.push_back(Opcode);
358 }
359
360 ArrayRef<const Record *> getOpcodes() const { return Opcodes; }
361 const OpcodeInfo &getOpcodeInfo() const { return Info; }
362 };
363
364 /// An STIPredicateFunction descriptor used by tablegen backends to
365 /// auto-generate the body of a predicate function as a member of tablegen'd
366 /// class XXXGenSubtargetInfo.
367 class STIPredicateFunction {
368 const Record *FunctionDeclaration;
369
370 std::vector<const Record *> Definitions;
371 std::vector<OpcodeGroup> Groups;
372
373 STIPredicateFunction(const STIPredicateFunction &Other) = delete;
374 STIPredicateFunction &operator=(const STIPredicateFunction &Other) = delete;
375
376 public:
377 STIPredicateFunction(const Record *Rec) : FunctionDeclaration(Rec) {}
378 STIPredicateFunction(STIPredicateFunction &&Other) = default;
379
380 bool isCompatibleWith(const STIPredicateFunction &Other) const {
381 return FunctionDeclaration == Other.FunctionDeclaration;
382 }
383
384 void addDefinition(const Record *Def) { Definitions.push_back(Def); }
385 void addOpcode(const Record *OpcodeRec, OpcodeInfo &&Info) {
386 if (Groups.empty() ||
387 Groups.back().getOpcodeInfo().getPredicates() != Info.getPredicates())
388 Groups.emplace_back(std::move(Info));
389 Groups.back().addOpcode(OpcodeRec);
390 }
391
392 StringRef getName() const {
393 return FunctionDeclaration->getValueAsString("Name");
394 }
395 const Record *getDefaultReturnPredicate() const {
396 return FunctionDeclaration->getValueAsDef("DefaultReturnValue");
397 }
398
399 const Record *getDeclaration() const { return FunctionDeclaration; }
400 ArrayRef<const Record *> getDefinitions() const { return Definitions; }
401 ArrayRef<OpcodeGroup> getGroups() const { return Groups; }
402 };
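`STIPredicateFunction::addOpcode` only compares a new opcode against the last group, which is why the opcodes are sorted before being added. The sketch below illustrates that "merge with the previous group if the constraints match" behaviour. It assumes a string key in place of the `OpcodeInfo` predicate list; `Group` and `addOpcodeToGroups` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Group {
  std::string Constraints;          // stands in for OpcodeInfo's predicates
  std::vector<std::string> Opcodes; // opcodes sharing those constraints
};

// Appends `Opcode` to the last group when the constraints match; otherwise
// starts a new group. Non-adjacent matches are not merged.
void addOpcodeToGroups(std::vector<Group> &Groups, const std::string &Opcode,
                       const std::string &Constraints) {
  if (Groups.empty() || Groups.back().Constraints != Constraints)
    Groups.push_back({Constraints, {}});
  Groups.back().Opcodes.push_back(Opcode);
}
```

If the input is not sorted by constraints, the same constraint set can appear in several groups, and each group becomes its own run of `case` labels in the generated switch.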
403
272404 /// Top level container for machine model data.
273405 class CodeGenSchedModels {
274406 RecordKeeper &Records;
301433 // combination of it's itinerary class, SchedRW list, and InstRW records.
302434 using InstClassMapTy = DenseMap;
303435 InstClassMapTy InstrClassMap;
436
437 std::vector<STIPredicateFunction> STIPredicates;
304438
305439 public:
306440 CodeGenSchedModels(RecordKeeper& RK, const CodeGenTarget &TGT);
429563 Record *findProcResUnits(Record *ProcResKind, const CodeGenProcModel &PM,
430564 ArrayRef Loc) const;
431565
566 ArrayRef<STIPredicateFunction> getSTIPredicates() const {
567 return STIPredicates;
568 }
432569 private:
433570 void collectProcModels();
434571
466603
467604 void checkMCInstPredicates() const;
468605
606 void checkSTIPredicates() const;
607
608 void collectSTIPredicates();
609
469610 void checkCompleteness();
470611
471612 void inferFromRW(ArrayRef OperWrites, ArrayRef OperReads,
1111 //===----------------------------------------------------------------------===//
1212
1313 #include "PredicateExpander.h"
14 #include "CodeGenSchedule.h" // Definition of STIPredicateFunction.
1415
1516 namespace llvm {
1617
312313 llvm_unreachable("No known rules to expand this MCInstPredicate");
313314 }
314315
316 void STIPredicateExpander::expandHeader(raw_ostream &OS,
317 const STIPredicateFunction &Fn) {
318 const Record *Rec = Fn.getDeclaration();
319 StringRef FunctionName = Rec->getValueAsString("Name");
320
321 OS.indent(getIndentLevel() * 2);
322 OS << "bool ";
323 if (shouldExpandDefinition())
324 OS << getClassPrefix() << "::";
325 OS << FunctionName << "(";
326 if (shouldExpandForMC())
327 OS << "const MCInst " << (isByRef() ? "&" : "*") << "MI";
328 else
329 OS << "const MachineInstr " << (isByRef() ? "&" : "*") << "MI";
330 if (Rec->getValueAsBit("UpdatesOpcodeMask"))
331 OS << ", APInt &Mask";
332 OS << (shouldExpandForMC() ? ", unsigned ProcessorID) const " : ") const ");
333 if (shouldExpandDefinition()) {
334 OS << "{\n";
335 return;
336 }
337
338 if (Rec->getValueAsBit("OverridesBaseClassMember"))
339 OS << "override";
340 OS << ";\n";
341 }
342
343 void STIPredicateExpander::expandPrologue(raw_ostream &OS,
344 const STIPredicateFunction &Fn) {
345 RecVec Delegates = Fn.getDeclaration()->getValueAsListOfDefs("Delegates");
346 bool UpdatesOpcodeMask =
347 Fn.getDeclaration()->getValueAsBit("UpdatesOpcodeMask");
348
349 increaseIndentLevel();
350 unsigned IndentLevel = getIndentLevel();
351 for (const Record *Delegate : Delegates) {
352 OS.indent(IndentLevel * 2);
353 OS << "if (" << Delegate->getValueAsString("Name") << "(MI";
354 if (UpdatesOpcodeMask)
355 OS << ", Mask";
356 if (shouldExpandForMC())
357 OS << ", ProcessorID";
358 OS << "))\n";
359 OS.indent((1 + IndentLevel) * 2);
360 OS << "return true;\n\n";
361 }
362
363 if (shouldExpandForMC())
364 return;
365
366 OS.indent(IndentLevel * 2);
367 OS << "unsigned ProcessorID = getSchedModel().getProcessorID();\n";
368 }
369
370 void STIPredicateExpander::expandOpcodeGroup(raw_ostream &OS, const OpcodeGroup &Group,
371 bool ShouldUpdateOpcodeMask) {
372 const OpcodeInfo &OI = Group.getOpcodeInfo();
373 for (const PredicateInfo &PI : OI.getPredicates()) {
374 const APInt &ProcModelMask = PI.ProcModelMask;
375 bool FirstProcID = true;
376 for (unsigned I = 0, E = ProcModelMask.getActiveBits(); I < E; ++I) {
377 if (!ProcModelMask[I])
378 continue;
379
380 if (FirstProcID) {
381 OS.indent(getIndentLevel() * 2);
382 OS << "if (ProcessorID == " << I;
383 } else {
384 OS << " || ProcessorID == " << I;
385 }
386 FirstProcID = false;
387 }
388
389 OS << ") {\n";
390
391 increaseIndentLevel();
392 OS.indent(getIndentLevel() * 2);
393 if (ShouldUpdateOpcodeMask) {
394 if (PI.OperandMask.isNullValue())
395 OS << "Mask.clearAllBits();\n";
396 else
397 OS << "Mask = " << PI.OperandMask << ";\n";
398 OS.indent(getIndentLevel() * 2);
399 }
400 OS << "return ";
401 expandPredicate(OS, PI.Predicate);
402 OS << ";\n";
403 decreaseIndentLevel();
404 OS.indent(getIndentLevel() * 2);
405 OS << "}\n";
406 }
407 }
408
409 void STIPredicateExpander::expandBody(raw_ostream &OS,
410 const STIPredicateFunction &Fn) {
411 bool UpdatesOpcodeMask =
412 Fn.getDeclaration()->getValueAsBit("UpdatesOpcodeMask");
413
414 unsigned IndentLevel = getIndentLevel();
415 OS.indent(IndentLevel * 2);
416 OS << "switch(MI" << (isByRef() ? "." : "->") << "getOpcode()) {\n";
417 OS.indent(IndentLevel * 2);
418 OS << "default:\n";
419 OS.indent(IndentLevel * 2);
420 OS << " break;";
421
422 for (const OpcodeGroup &Group : Fn.getGroups()) {
423 for (const Record *Opcode : Group.getOpcodes()) {
424 OS << '\n';
425 OS.indent(IndentLevel * 2);
426 OS << "case " << getTargetName() << "::" << Opcode->getName() << ":";
427 }
428
429 OS << '\n';
430 increaseIndentLevel();
431 expandOpcodeGroup(OS, Group, UpdatesOpcodeMask);
432
433 OS.indent(getIndentLevel() * 2);
434 OS << "break;\n";
435 decreaseIndentLevel();
436 }
437
438 OS.indent(IndentLevel * 2);
439 OS << "}\n";
440 }
441
442 void STIPredicateExpander::expandEpilogue(raw_ostream &OS,
443 const STIPredicateFunction &Fn) {
444 OS << '\n';
445 OS.indent(getIndentLevel() * 2);
446 OS << "return ";
447 expandPredicate(OS, Fn.getDefaultReturnPredicate());
448 OS << ";\n";
449
450 decreaseIndentLevel();
451 OS.indent(getIndentLevel() * 2);
452 StringRef FunctionName = Fn.getDeclaration()->getValueAsString("Name");
453 OS << "} // " << ClassPrefix << "::" << FunctionName << "\n\n";
454 }
455
456 void STIPredicateExpander::expandSTIPredicate(raw_ostream &OS,
457 const STIPredicateFunction &Fn) {
458 const Record *Rec = Fn.getDeclaration();
459 if (shouldExpandForMC() && !Rec->getValueAsBit("ExpandForMC"))
460 return;
461
462 expandHeader(OS, Fn);
463 if (shouldExpandDefinition()) {
464 expandPrologue(OS, Fn);
465 expandBody(OS, Fn);
466 expandEpilogue(OS, Fn);
467 }
468 }
469
315470 } // namespace llvm
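To make the header, prologue, body, and epilogue steps concrete, the sketch below mimics the shape of a function that `STIPredicateExpander` would emit for the MC variant. It is a hand-written mock, not tablegen output: the opcode enumerators, the `Inst` type, the processor ID `4`, and the register-equality predicate are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical opcodes and instruction type, for illustration only.
enum Opcode : unsigned { ADD32rr, XOR32rr, SUB32rr };
struct Inst {
  unsigned Opc;
  unsigned Reg1, Reg2;
  unsigned getOpcode() const { return Opc; }
};

// Header: bool <Name>(const <Inst> &MI, <Mask>, unsigned ProcessorID).
bool isZeroIdiomSketch(const Inst &MI, uint64_t &Mask, unsigned ProcessorID) {
  // Body: one run of case labels per opcode group.
  switch (MI.getOpcode()) {
  default:
    break;
  case XOR32rr:
  case SUB32rr:
    // One `if` per PredicateInfo, listing the processors that share it.
    if (ProcessorID == 4) {
      Mask = 0; // plays the role of Mask.clearAllBits()
      return MI.Reg1 == MI.Reg2;
    }
    break;
  }

  // Epilogue: the declaration's default return value.
  return false;
}
```

The non-MC variant differs in that it takes no `ProcessorID` parameter and instead reads it from the scheduling model in the prologue.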
4242 bool shouldNegate() const { return NegatePredicate; }
4343 bool shouldExpandForMC() const { return ExpandForMC; }
4444 unsigned getIndentLevel() const { return IndentLevel; }
45 StringRef getTargetName() const { return TargetName; }
4546
4647 void setByRef(bool Value) { EmitCallsByRef = Value; }
4748 void flipNegatePredicate() { NegatePredicate = !NegatePredicate; }
4849 void setNegatePredicate(bool Value) { NegatePredicate = Value; }
4950 void setExpandForMC(bool Value) { ExpandForMC = Value; }
51 void setIndentLevel(unsigned Level) { IndentLevel = Level; }
5052 void increaseIndentLevel() { ++IndentLevel; }
5153 void decreaseIndentLevel() { --IndentLevel; }
52 void setIndentLevel(unsigned Level) { IndentLevel = Level; }
5354
5455 using RecVec = std::vector<Record *>;
5556 void expandTrue(raw_ostream &OS);
8081 void expandStatement(raw_ostream &OS, const Record *Rec);
8182 };
8283
84 // Forward declarations.
85 class STIPredicateFunction;
86 class OpcodeGroup;
87
88 class STIPredicateExpander : public PredicateExpander {
89 StringRef ClassPrefix;
90 bool ExpandDefinition;
91
92 STIPredicateExpander(const PredicateExpander &) = delete;
93 STIPredicateExpander &operator=(const PredicateExpander &) = delete;
94
95 void expandHeader(raw_ostream &OS, const STIPredicateFunction &Fn);
96 void expandPrologue(raw_ostream &OS, const STIPredicateFunction &Fn);
97 void expandOpcodeGroup(raw_ostream &OS, const OpcodeGroup &Group,
98 bool ShouldUpdateOpcodeMask);
99 void expandBody(raw_ostream &OS, const STIPredicateFunction &Fn);
100 void expandEpilogue(raw_ostream &OS, const STIPredicateFunction &Fn);
101
102 public:
103 STIPredicateExpander(StringRef Target)
104 : PredicateExpander(Target), ClassPrefix(), ExpandDefinition(false) {}
105
106 bool shouldExpandDefinition() const { return ExpandDefinition; }
107 StringRef getClassPrefix() const { return ClassPrefix; }
108 void setClassPrefix(StringRef S) { ClassPrefix = S; }
109 void setExpandDefinition(bool Value) { ExpandDefinition = Value; }
110
111 void expandSTIPredicate(raw_ostream &OS, const STIPredicateFunction &Fn);
112 };
113
83114 } // namespace llvm
84115
85116 #endif
115115 void emitSchedModelHelpersImpl(raw_ostream &OS,
116116 bool OnlyExpandMCInstPredicates = false);
117117 void emitGenMCSubtargetInfo(raw_ostream &OS);
118 void EmitMCInstrAnalysisPredicateFunctions(raw_ostream &OS);
118119
119120 void EmitSchedModel(raw_ostream &OS);
120121 void EmitHwModeCheck(const std::string &ClassName, raw_ostream &OS);
16711672 << " unsigned CPUID) const {\n"
16721673 << " return " << Target << "_MC"
16731674 << "::resolveVariantSchedClassImpl(SchedClass, MI, CPUID);\n"
1674 << "} // " << ClassName << "::resolveVariantSchedClass\n";
1675 << "} // " << ClassName << "::resolveVariantSchedClass\n\n";
1676
1677 STIPredicateExpander PE(Target);
1678 PE.setClassPrefix(ClassName);
1679 PE.setExpandDefinition(true);
1680 PE.setByRef(false);
1681 PE.setIndentLevel(0);
1682
1683 for (const STIPredicateFunction &Fn : SchedModels.getSTIPredicates())
1684 PE.expandSTIPredicate(OS, Fn);
16751685 }
16761686
16771687 void SubtargetEmitter::EmitHwModeCheck(const std::string &ClassName,
17631773 << "::resolveVariantSchedClassImpl(SchedClass, MI, CPUID); \n";
17641774 OS << " }\n";
17651775 OS << "};\n";
1776 }
1777
1778 void SubtargetEmitter::EmitMCInstrAnalysisPredicateFunctions(raw_ostream &OS) {
1779 OS << "\n#ifdef GET_STIPREDICATE_DECLS_FOR_MC_ANALYSIS\n";
1780 OS << "#undef GET_STIPREDICATE_DECLS_FOR_MC_ANALYSIS\n\n";
1781
1782 STIPredicateExpander PE(Target);
1783 PE.setExpandForMC(true);
1784 PE.setByRef(true);
1785 for (const STIPredicateFunction &Fn : SchedModels.getSTIPredicates())
1786 PE.expandSTIPredicate(OS, Fn);
1787
1788 OS << "#endif // GET_STIPREDICATE_DECLS_FOR_MC_ANALYSIS\n\n";
1789
1790 OS << "\n#ifdef GET_STIPREDICATE_DEFS_FOR_MC_ANALYSIS\n";
1791 OS << "#undef GET_STIPREDICATE_DEFS_FOR_MC_ANALYSIS\n\n";
1792
1793 std::string ClassPrefix = Target + "MCInstrAnalysis";
1794 PE.setExpandDefinition(true);
1795 PE.setClassPrefix(ClassPrefix);
1796 PE.setIndentLevel(0);
1797 for (const STIPredicateFunction &Fn : SchedModels.getSTIPredicates())
1798 PE.expandSTIPredicate(OS, Fn);
1799
1800 OS << "#endif // GET_STIPREDICATE_DEFS_FOR_MC_ANALYSIS\n\n";
17661801 }
17671802
17681803 //
18621897 << " const;\n";
18631898 if (TGT.getHwModes().getNumModeIds() > 1)
18641899 OS << " unsigned getHwMode() const override;\n";
1900
1901 STIPredicateExpander PE(Target);
1902 PE.setByRef(false);
1903 for (const STIPredicateFunction &Fn : SchedModels.getSTIPredicates())
1904 PE.expandSTIPredicate(OS, Fn);
1905
18651906 OS << "};\n"
18661907 << "} // end namespace llvm\n\n";
18671908
19191960 OS << "} // end namespace llvm\n\n";
19201961
19211962 OS << "#endif // GET_SUBTARGETINFO_CTOR\n\n";
1963
1964 EmitMCInstrAnalysisPredicateFunctions(OS);
19221965 }
19231966
19241967 namespace llvm {