llvm.org GIT mirror llvm / e9d67e4
fix various typos
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306262 91177308-0d34-0410-b5e6-96231b3b80d8
Sylvestre Ledru, 2 years ago
3 changed file(s) with 12 addition(s) and 12 deletion(s).
586586 The code object metadata is specified by the ``NT_AMD_AMDHSA_METADATA`` note
587587 record (see :ref:`amdgpu-note-records`).
588588
589 The metadata is specified as a YAML formated string (see [YAML]_ and
589 The metadata is specified as a YAML formatted string (see [YAML]_ and
590590 :doc:`YamlIO`).
591591
592592 The metadata is represented as a single YAML document comprised of the mapping
10301030 appropriate section according to if it has initialized data or is readonly.
10311031
10321032 If the symbol is external then its section is ``STN_UNDEF`` and the loader
1033 will resolve relocations using the defintion provided by another code object
1033 will resolve relocations using the definition provided by another code object
10341034 or explicitly defined by the runtime.
10351035
10361036 All global symbols, whether defined in the compilation unit or external, are
1037 accessed by the machine code indirectly throught a GOT table entry. This
1037 accessed by the machine code indirectly through a GOT table entry. This
10381038 allows them to be preemptable. The GOT table is only supported when the target
10391039 triple OS is ``amdhsa`` (see :ref:`amdgpu-target-triples`).
10401040
11591159 Define DWARF register enumeration.
11601160
11611161 If want to present a wavefront state then should expose vector registers as
1162 64 wide (rather than per work-item view that LLVM uses). Either as seperate
1162 64 wide (rather than per work-item view that LLVM uses). Either as separate
11631163 registers, or a 64x4 byte single register. In either case use a new LANE op
11641164 (akin to XDREF) to select the current lane usage in a location
11651165 expression. This would also allow scalar register spilling to vector register
16521652 ``COMPUTE_PGM_RSRC2.USER_SGPR``.
16531653 6 1 bit enable_trap_handler Set to 1 if code contains a
16541654 TRAP instruction which
1655 requires a trap hander to
1655 requires a trap handler to
16561656 be enabled.
16571657
16581658 CP sets
21452145 .. TODO
21462146 Update when implementation complete.
21472147
2148 Support more relaxed OpenCL memory model to be controled by environment
2148 Support more relaxed OpenCL memory model to be controlled by environment
21492149 component of target triple.
21502150
21512151 The AMDGPU backend supports the memory synchronization scopes specified in
22002200 can be reordered relative to each other, which can result in reordering the
22012201 visibility of vector memory operations with respect to LDS operations of other
22022202 wavefronts in the same work-group. A ``s_waitcnt lgkmcnt(0)`` is required to
2203 ensure synchonization between LDS operations and vector memory operations
2203 ensure synchronization between LDS operations and vector memory operations
22042204 between waves of a work-group, but not between operations performed by the
22052205 same wavefront.
22062206 * The vector memory operations are performed as wavefront wide operations and
22252225 scalar memory operations performed by waves executing in different work-groups
22262226 (which may be executing on different CUs) of an agent can be reordered
22272227 relative to each other. A ``s_waitcnt vmcnt(0)`` is required to ensure
2228 synchonization between vector memory operations of different CUs. It ensures a
2228 synchronization between vector memory operations of different CUs. It ensures a
22292229 previous vector memory operation has completed before executing a subsequent
22302230 vector memory or LDS operation and so can be used to meet the requirements of
22312231 acquire and release.
22672267 constant address space data may change between kernel dispatch executions. See
22682268 :ref:`amdgpu-amdhsa-memory-spaces`.
22692269
2270 The one exeception is if scalar writes are used to spill SGPR registers. In this
2270 The one execption is if scalar writes are used to spill SGPR registers. In this
22712271 case the AMDGPU backend ensures the memory location used to spill is never
22722272 accessed by vector memory operations at the same time. If scalar writes are used
22732273 then a ``s_dcache_wb`` is inserted before the ``s_endpgm`` and before a function
33093309 be moved before the acquire.
33103310 - If a fence then same as load atomic, plus no preceding
33113311 associated fence-paired-atomic can be moved after the fence.
3312 release - If a store atomic/atomicrmw then no preceeding load/load
3312 release - If a store atomic/atomicrmw then no preceding load/load
33133313 atomic/store/ store atomic/atomicrmw/fence instruction can
33143314 be moved after the release.
33153315 - If a fence then same as store atomic, plus no following
2626 VPlan-based vectorization involves three major steps, taking a "scenario-based
2727 approach" to vectorization planning:
2828
29 1. Legal Step: check if a loop can be legally vectorized; encode contraints and
29 1. Legal Step: check if a loop can be legally vectorized; encode constraints and
3030 artifacts if so.
3131 2. Plan Step:
3232
149149 | xray_logfile_base | ``const char*`` | ``xray-log.`` | Filename base for the |
150150 | | | | XRay logfile. |
151151 +-------------------+-----------------+---------------+------------------------+
152 | xray_fdr_log | ``bool`` | ``false`` | Wheter to install the |
152 | xray_fdr_log | ``bool`` | ``false`` | Whether to install the |
153153 | | | | Flight Data Recorder |
154154 | | | | (FDR) mode. |
155155 +-------------------+-----------------+---------------+------------------------+