llvm / ef25bf0

[SystemZ] Add more future work items to the README

Based on an analysis by Ulrich Weigand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181882 91177308-0d34-0410-b5e6-96231b3b80d8

Richard Sandiford, 7 years ago

1 changed file(s) with 91 addition(s) and 7 deletion(s).
2828
2929 --
3030
31 The tuning of the choice between Load Address (LA) and addition in
31 The tuning of the choice between LOAD ADDRESS (LA) and addition in
3232 SystemZISelDAGToDAG.cpp is suspect. It should be tweaked based on
3333 performance measurements.
3434
3535 --
3636
37 We don't support tail calls at present.
38
39 --
40
41 We don't support prefetching yet.
42
43 --
44
3745 There is no scheduling support.
3846
3947 --
4048
41 We don't use the Branch on Count or Branch on Index families of instruction.
49 We don't use the BRANCH ON COUNT or BRANCH ON INDEX families of instruction.
50
51 --
52
53 We might want to use BRANCH ON CONDITION for conditional indirect calls
54 and conditional returns.
55
56 --
57
58 We don't use the combined COMPARE AND BRANCH instructions. Using them
59 would require a change to the way we handle out-of-range branches.
60 At the moment, we start with 32-bit forms like BRCL and shorten them
61 to forms like BRC where possible, but COMPARE AND BRANCH does not have
62 a 32-bit form.
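
The "start long, then shorten" scheme described above can be sketched roughly as follows. This is an illustration only, not the code in the SystemZ backend: the names BranchForm, fitsInSignedHalfwords and chooseBranchForm are hypothetical, and the sketch assumes that both branch forms encode their target as a signed count of halfwords (16 bits for the short form, 32 bits for the long form).

```cpp
#include <cstdint>

// Hypothetical names for illustration; not taken from the LLVM sources.
enum class BranchForm { Short16, Long32 };

// Assumption: the branch target is encoded as a signed number of
// halfwords (2-byte units) relative to the branch instruction.
constexpr bool fitsInSignedHalfwords(int64_t byteOffset, unsigned bits) {
  if (byteOffset % 2 != 0)
    return false; // targets must be halfword-aligned
  const int64_t halfwords = byteOffset / 2;
  const int64_t limit = int64_t{1} << (bits - 1);
  return halfwords >= -limit && halfwords < limit;
}

// Shorten to the 16-bit form when the displacement fits; otherwise
// keep the 32-bit form that every branch starts out with.
constexpr BranchForm chooseBranchForm(int64_t byteOffset) {
  return fitsInSignedHalfwords(byteOffset, 16) ? BranchForm::Short16
                                               : BranchForm::Long32;
}
```

An instruction that exists only in a short form has no Long32 fallback in this scheme, which is presumably why using COMPARE AND BRANCH would need out-of-range handling to work differently.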
63
64 --
65
66 We should probably model just CC, not the PSW as a whole. Strictly
67 speaking, every instruction changes the PSW since the PSW contains the
68 current instruction address.
4269
4370 --
4471
5380
5481 --
5582
56 We don't optimize string and block memory operations.
83 We don't use the LOAD AND TEST or TEST DATA CLASS instructions.
84
85 --
86
87 We could use the generic floating-point forms of LOAD COMPLEMENT,
88 LOAD NEGATIVE and LOAD POSITIVE in cases where we don't need the
89 condition codes. For example, we could use LCDFR instead of LCDBR.
90
91 --
92
93 We don't optimize block memory operations.
94
95 It's definitely worth using things like MVC, CLC, NC, XC and OC with
96 constant lengths. MVCIN may be worthwhile too.
97
98 We should probably implement things like memcpy using MVC with EXECUTE.
99 Likewise memcmp and CLC. MVCLE and CLCLE could be useful too.
100
101 --
102
103 We don't optimize string operations.
104
105 MVST, CLST, SRST and CUSE could be useful here. Some of the TRANSLATE
106 family might be too, although they are probably more difficult to exploit.
57107
58108 --
59109
62112
63113 --
64114
115 ADD LOGICAL WITH SIGNED IMMEDIATE could be useful when we need to
116 produce a carry. SUBTRACT LOGICAL IMMEDIATE could be useful when we
117 need to produce a borrow. (Note that there are no memory forms of
118 ADD LOGICAL WITH CARRY and SUBTRACT LOGICAL WITH BORROW, so the high
119 part of 128-bit memory operations would probably need to be done
120 via a register.)
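
As a source-level illustration of where a carry or borrow is needed, here is a minimal sketch of 128-bit addition and subtraction built from 64-bit halves. The U128 type and the add128/sub128 helpers are hypothetical names for this example; how a backend maps the low and high parts onto particular instructions is not shown.

```cpp
#include <cstdint>

// Hypothetical 128-bit value split into two 64-bit halves.
struct U128 {
  uint64_t hi;
  uint64_t lo;
};

// The low halves are added first; unsigned wraparound of that sum is
// the carry that must be fed into the addition of the high halves.
constexpr U128 add128(U128 a, U128 b) {
  const uint64_t lo = a.lo + b.lo;
  const uint64_t carry = lo < a.lo ? 1 : 0;
  return U128{a.hi + b.hi + carry, lo};
}

// The low halves are subtracted first; a borrow occurs when the
// subtrahend's low half is larger than the minuend's low half.
constexpr U128 sub128(U128 a, U128 b) {
  const uint64_t borrow = a.lo < b.lo ? 1 : 0;
  return U128{a.hi - b.hi - borrow, a.lo - b.lo};
}
```

The low-part operation is the one that only needs to produce a carry or borrow; the high-part operation is the one that consumes it.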
121
122 --
123
124 We don't use the halfword forms of LOAD REVERSED and STORE REVERSED
125 (LRVH and STRVH).
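
For reference, the kind of source pattern that a halfword byte-reversing load or store could serve looks like the sketch below. The helper names byteSwap16 and loadLE16 are hypothetical; the second one reads a little-endian halfword, which on a big-endian target amounts to a byte-reversed load.

```cpp
#include <cstdint>

// Swap the two bytes of a 16-bit value.
constexpr uint16_t byteSwap16(uint16_t v) {
  return static_cast<uint16_t>((v << 8) | (v >> 8));
}

// Read a 16-bit little-endian value from memory, independent of the
// host byte order.
inline uint16_t loadLE16(const unsigned char *p) {
  return static_cast<uint16_t>(p[0] | (p[1] << 8));
}
```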
126
127 --
128
129 We could take advantage of the various ... UNDER MASK instructions,
130 such as ICM and STCM.
131
132 --
133
134 We could make more use of the ROTATE AND ... SELECTED BITS instructions.
135 At the moment we only use RISBG, and only then for subword atomic operations.
136
137 --
138
65139 DAGCombiner can detect integer absolute, but there's not yet an associated
66 ISD opcode. We could add one and implement it using Load Positive.
67 Negated absolutes could use Load Negative.
140 ISD opcode. We could add one and implement it using LOAD POSITIVE.
141 Negated absolutes could use LOAD NEGATIVE.
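
One common branch-free way of writing integer absolute value, and its negated form, is sketched below. This is only an example of the sort of expression involved: the text does not say exactly which pattern DAGCombiner matches, and the function names are hypothetical. The arithmetic is done on unsigned values to avoid signed overflow for the most negative input, and the sketch assumes that right-shifting a negative value is an arithmetic shift (guaranteed from C++20, implementation-defined before that).

```cpp
#include <cstdint>

// sign is all ones when x is negative and zero otherwise.
// (x + sign) ^ sign then yields |x| without a branch.
constexpr int32_t absViaShift(int32_t x) {
  const uint32_t sign = static_cast<uint32_t>(x >> 31);
  return static_cast<int32_t>((static_cast<uint32_t>(x) + sign) ^ sign);
}

// Negated absolute value: sign - (x ^ sign) yields -|x|.
constexpr int32_t negAbsViaShift(int32_t x) {
  const uint32_t sign = static_cast<uint32_t>(x >> 31);
  return static_cast<int32_t>(sign - (static_cast<uint32_t>(x) ^ sign));
}
```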
68142
69143 --
70144
141215 --
142216
143217 Atomic loads and stores use the default compare-and-swap based implementation.
144 This is probably much too conservative in practice, and the overhead is
145 especially bad for 8- and 16-bit accesses.
218 This is much too conservative in practice, since the architecture guarantees
219 that 1-, 2-, 4- and 8-byte loads and stores to aligned addresses are
220 inherently atomic.
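
The source-level operations in question look like the sketch below: plain atomic loads and stores on naturally aligned 1-, 2-, 4- and 8-byte objects. Given the guarantee stated above, these should not need a compare-and-swap loop. The helper names publish and observe are hypothetical, and the example uses relaxed ordering only; what stronger orderings would additionally require is not covered here.

```cpp
#include <atomic>
#include <cstdint>

// Atomic store to a naturally aligned 8-byte object.
inline void publish(std::atomic<uint64_t> &slot, uint64_t value) {
  slot.store(value, std::memory_order_relaxed);
}

// Atomic load from a naturally aligned 8-byte object.
inline uint64_t observe(const std::atomic<uint64_t> &slot) {
  return slot.load(std::memory_order_relaxed);
}

// The same applies to narrower accesses, such as a 2-byte flag.
inline void setFlag(std::atomic<uint16_t> &flag, uint16_t value) {
  flag.store(value, std::memory_order_relaxed);
}
```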
221
222 --
223
224 If needed, we can support 16-byte atomics using LPQ, STPQ and CSDG.
225
226 --
227
228 We might want to model all access registers and use them to spill
229 32-bit values.