llvm.org GIT mirror llvm / 26b8aab
tblgen, docs: Add initial syntax reference. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171685 91177308-0d34-0410-b5e6-96231b3b80d8 Sean Silva 6 years ago
2 changed file(s) with 376 addition(s) and 0 deletion(s).
===========================
TableGen Language Reference
===========================

.. sectionauthor:: Sean Silva

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvmdev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure if this document is really what you are looking for, please
read :doc:`/TableGenFundamentals` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; . = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.

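As a rough illustration of the three forms (the record and field names are
made up for this sketch), each field below is initialized with a single
:token:`TokInteger`::

   def NumericLiteralExamples {
     int Dec = 42;        // DecimalInteger
     int Neg = -42;       // the "-" is part of the token, not an operator
     int Hex = 0x2A;      // HexInteger
     int Bin = 0b101010;  // BinInteger
   }
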
TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a number. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.

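For example, going only by the productions above, the hypothetical record
name below should lex as a single :token:`TokIdentifier` even though it
starts with a digit::

   // "2addr_example" is an illustrative name, not taken from LLVM.
   def 2addr_example;
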
TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"

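A minimal sketch of both literal forms (the record name, field names, and
the contents of the literals are illustrative only)::

   def StringLiteralExamples {
     string Mnemonic = "add";                        // TokString
     code Predicate = [{ return N->hasOneUse(); }];  // TokCodeFragment
   }
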
TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a
wide variety of meanings::

   !eq     !if      !head    !tail      !con
   !shl    !sra     !srl
   !cast   !empty   !subst   !foreach   !strconcat

Syntax
======

TableGen has an ``include`` mechanism. It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.

.. productionlist::
   IncludeDirective: "include" `TokString`

TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.

A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.

.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"

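The following sketch, which uses hypothetical names throughout, shows a
parametrized class definition next to the restricted forward declaration
described above::

   // Declares Operand without defining it: no template arguments, no base
   // classes, and an empty body.
   class Operand;

   // Defines Insn with two template arguments; the second has a default.
   class Insn<int opcode, string mnemonic = "nop"> {
     int Opcode = opcode;
     string Mnemonic = mnemonic;
   }

   // The non-empty body makes this the (single) definition of Operand.
   class Operand {
     int Width = 32;
   }
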
Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

It assigns the value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.

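A short sketch with one field for each kind of type except ``dag``; ``Reg``,
``R0`` and the field names are assumptions made for the example::

   class Reg;
   def R0 : Reg;

   def TypeExamples {
     string S = "text";
     code C = [{ return 0; }];
     bit B = 1;
     int I = 7;
     bits<4> Nibble = { 1, 0, 1, 0 };
     list<int> L = [1, 2, 3];
     Reg R = R0;    // Reg has to be declared or defined before this point
   }
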
Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger` tokens, with values ``1`` and ``-5``,
instead of "1", "-", and "5".
The :token:`RangeList` can be thought of as specifying a "list slice" in
some contexts.


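To make the three suffix forms concrete, here is a small sketch (all names
are hypothetical)::

   class HasX {
     int X = 5;
   }
   def D : HasX;

   def SuffixExamples {
     bits<8> Byte = 0b10110001;
     bits<4> High = Byte{7-4};     // "{...}": bits 7 down to 4 of Byte
     list<int> L = [10, 20, 30, 40];
     list<int> Mid = L[1-2];       // "[...]": elements 1 through 2 of L
     int FromD = D.X;              // ".": field X of the record D
   }

Note that ``7-4`` and ``1-2`` are written without spaces here, so each is
lexed as two consecutive integers, which is the last :token:`RangePiece`
form.
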
:token:`SimpleValue` has a number of forms:


.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

     def Bar : SomeClass {
       int X = 5;
     }

     def Foo {
       SomeClass Baz = Bar;
     }

* value local to a ``def``, such as the use of ``Bar`` in::

     def Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg of a ``class``, such as the use of ``Bar`` in::

     class Foo<int Bar> {
       int Baz = Bar;
     }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo<int Bar> {
       int Baz = Bar;
     }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated like in C/C++. The value
is the concatenation of the strings.

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.

.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).

.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.

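As a sketch with made-up names, the field ``P`` below is initialized with an
anonymous record derived from ``Pair``::

   class Pair<int a, int b> {
     int A = a;
     int B = b;
   }

   def UsesAnonymous {
     Pair P = Pair<1, 2>;
   }
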
.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

A list initializer. The optional :token:`Type` can be used to indicate a
specific element type, otherwise the element type will be deduced from the
given values.

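For instance (the field names are illustrative), the first list below has
its element type deduced, while the second states it explicitly::

   def ListExamples {
     list<int> Deduced = [1, 2, 3];
     list<string> Explicit = ["a", "b"]<string>;
   }
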
.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` `DagArgList` ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`]

The initial :token:`DagArg` is called the "operator" of the dag.

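A minimal sketch, assuming ``ops``, ``R0`` and ``R1`` are records defined
only for this example: ``ops`` is the operator, and each remaining argument
carries an optional :token:`TokVarName`::

   def ops;
   def R0;
   def R1;

   def DagExample {
     dag Operands = (ops R0:$dst, R1:$src);
   }
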
.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"

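Two bang operators from the list given under Lexical Analysis, sketched with
illustrative record and field names::

   def BangExamples {
     string Name = !strconcat("ADD", "rr");
     int Shifted = !shl(1, 4);
   }
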
Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [`BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `DefmID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`DefmID` is only valid in the
:token:`BaseClassList` of a ``defm``.
The :token:`DefmID` should be the name of a ``multiclass``.
In a ``class`` or ``def``, a non-empty :token:`BaseClassList` is introduced
by a ``:``, as in ``def Bar : SomeClass``.

.. put this somewhere else

It is after parsing the base class list that the "let stack" is applied.

.. productionlist::
   Body: ";" | "{" `BodyList` "}"
   BodyList: `BodyItem`*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field.

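A small sketch of the ``let`` form inside a body (``Insn`` and its field are
hypothetical)::

   class Insn {
     bit IsBranch = 0;
   }

   def JMP : Insn {
     let IsBranch = 1;   // overrides the value inherited from Insn
   }
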
``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being IDParseMode
   == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassList` ";"

Note that in the :token:`BaseClassList`, all of the ``multiclass``\es must
precede any ``class``\es that appear.

``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
          :| "foreach" `Declaration` "in" `Object`

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.

Top-Level ``let``
-----------------

.. productionlist::
   Let: "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.

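For example, in the following sketch (hypothetical names again) a single
top-level ``let`` overrides ``IsBranch`` in both records::

   class Insn {
     bit IsBranch = 0;
   }

   let IsBranch = 1 in {
     def JMP : Insn;
     def JNE : Insn;
   }
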
``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassDef`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
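
A common pattern, sketched here with hypothetical names, pairs a
``multiclass`` with a ``defm`` that instantiates every ``def`` in its body::

   class Insn<string mnemonic> {
     string Mnemonic = mnemonic;
   }

   multiclass RegImm<string base> {
     def rr : Insn<!strconcat(base, "_rr")>;
     def ri : Insn<!strconcat(base, "_ri")>;
   }

   // Instantiates both defs; the resulting records are named ADDrr and ADDri.
   defm ADD : RegImm<"add">;
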

   WritingAnLLVMBackend
   GarbageCollection
   WritingAnLLVMPass
   TableGen/LangRef

* :doc:`WritingAnLLVMPass`