llvm.org GIT mirror llvm / 8ff7590
Add links to SLD from the LangRef.html doc
Clean up the SLD document a LOT
Fill in a lot of details in the SLD document
update the formats for the object descriptors

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10698 91177308-0d34-0410-b5e6-96231b3b80d8

Chris Lattner 15 years ago
2 changed file(s) with 518 addition(s) and 222 deletion(s).
9494
  • 'llvm.va_copy' Intrinsic
  • 9595
    9696
    97
  • Debugger intrinsics
  • 9798
    9899
    99100
    15861587

    See the variable argument processing

    15871588 section.

    15881589
    1590
    15891591
    15901592
    15911593
    1594
    15921595
    15931596

    LLVM supports the notion of an "intrinsic function". These

    15941597 functions have well known names and semantics, and are required to
    16081611 lowering pass to eliminate the intrinsic or all backends must support
    16091612 the intrinsic function.

    16101613
    1614
    16111615
    1612
    1613 Handling Intrinsics >
    1616
    >
    1617 Variable Argument Handling Intrinsics
    1618
    1619
    16141620
    16151621

    Variable argument support is defined in LLVM with the

    16161622 href="#i_vanext">vanext instruction and these three
    16301636 href="#i_va_end">llvm.va_end(sbyte* %aq)

    ; Stop processing of arguments.
    call void %
    16311637 href="#i_va_end">llvm.va_end(sbyte* %ap2)
    ret int %tmp
    }
    16321638
    1633
    1634
    1635 Intrinsic
    1639
    1640
    1641
    1642 'llvm.va_start' Intrinsic
    1643
    1644
    1645
    16361646
    16371647
    Syntax:
    16381648
      call va_list ()* %llvm.va_start()
    16491659

    Note that this intrinsic function is only legal to be called from

    16501660 within the body of a variable argument function.

    16511661
    1652
    1653
    1654 Intrinsic
    1662
    1663
    1664
    1665 'llvm.va_end' Intrinsic
    1666
    1667
    16551668
    16561669
    Syntax:
    16571670
      call void (va_list)* %llvm.va_end(va_list <arglist>)
    16681681 href="#i_va_copy">llvm.va_copy must be matched exactly
    16691682 with calls to llvm.va_end.

    16701683
    1671
    1672
    1673 Intrinsic
    1684
    1685
    1686
    1687 'llvm.va_copy' Intrinsic
    1688
    1689
    16741690
    16751691
    Syntax:
    16761692
      call va_list (va_list)* %llvm.va_copy(va_list <destarglist>)
    16861702 href="i_va_start">llvm.va_start intrinsic may be arbitrarily
    16871703 complex and require memory allocation, for example.
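
The pairing rule for these intrinsics has a close analog in the C library's `<cstdarg>` macros. The following is a minimal C++ sketch, not taken from the LLVM sources; `sumTwice` is a hypothetical helper invented for illustration. It starts one argument list, copies it once, and ends each list exactly once:

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical helper: sums `Count` int arguments twice, once through the
// original va_list and once through a copy of it.
int sumTwice(int Count, ...) {
  va_list Args;
  va_list Copy;
  va_start(Args, Count);  // C-level analog of llvm.va_start
  va_copy(Copy, Args);    // C-level analog of llvm.va_copy

  int Total = 0;
  for (int I = 0; I < Count; ++I)
    Total += va_arg(Args, int);
  for (int I = 0; I < Count; ++I)
    Total += va_arg(Copy, int);

  va_end(Copy);  // the copy gets its own va_end
  va_end(Args);  // the original list gets its own va_end
  return Total;
}
```

The copy is made with `va_copy` rather than plain assignment for the reason given above: the underlying list may carry state that a simple assignment would not duplicate.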

    16881704
    1705
    1706
    1707
    1708
    1709 Debugger Intrinsics
    1710
    1711
    1712
    1713

    1714 The LLVM debugger intrinsics (which all start with the llvm.dbg. prefix),
    1715 are described in the
    1716 href="SourceLevelDebugging.html#format_common_intrinsics">LLVM Source Level
    1717 Debugging document.
    1718

    1719
    1720
    1721
    16891722
    16901723
    16911724
    1010
    1111
    1212
    13 width=247 height=369 align=right>
    13 alt="A leafy and green bug eater"
    14 width=247 height=369 align=right>
    1415
    1516
  • Introduction
  • 1617
    2829
    2930
  • Architecture of the LLVM debugger
  • 3031
    32
  • The Debugger and InferiorProcess classes
  • 33
  • The RuntimeInfo, ProgramInfo, and SourceLanguage classes
  • 34
  • The llvm-db tool
  • 3135
  • Short-term TODO list
  • 3236
    3337
    34
  • Debugging information implementation
  • 38
  • Debugging information format
  • 3539
    36
  • Anchors for global objects
  • 37
  • Representing stopping points in the source program
  • 38
  • Object lifetimes and scoping
  • 39
  • Object descriptor formats
  • 40
  • Anchors for global objects
  • 41
  • Representing stopping points in the source program
  • 42
  • Object lifetimes and scoping
  • 43
  • Object descriptor formats
  • 4044
    41
  • Representation of source files
  • 42
  • Representation of global objects
  • 43
  • Representation of local variables
  • 45
  • Representation of source files
  • 46
  • Representation of program objects
  • 47
  • Program object contexts
  • 4448
    45
  • Other intrinsic functions
  • 49
  • Debugger intrinsic functions
  • 50
  • Values for debugger tags
  • 4651
    47
  • C/C++ front-end specific debug information
  • 52
  • C/C++ front-end specific debug information
  • 4853
    49
  • Object descriptor formats
  • 54
  • Program Scope Entries
  • 55
    56
  • Compilation unit entries
  • 57
  • Module, namespace, and importing entries
  • 58
    59
  • Data objects (program variables)
  • 5060
    5161
    5262
    5767
    5868
    5969

    This document is the central repository for all information pertaining to

    60 debug information in LLVM. It describes how to use the
    61 href="CommandGuide/llvm-db.html">llvm-db tool, which provides a
    62 powerful source-level debugger to users of LLVM-based
    63 compilers. When compiling a program in debug mode, the front-end in use adds
    64 LLVM debugging information to the program in the form of normal
    65 href="LangRef.html">LLVM program objects as well as a small set of LLVM
    66 href="#implementation">intrinsic functions, which specify the mapping of the
    67 program in LLVM form to the program in the source language.
    68

    70 debug information in LLVM. It describes the user
    71 interface for the llvm-db
    72 tool, which provides a powerful source-level debugger
    73 to users of LLVM-based compilers. It then describes the
    74 href="#architecture">various components that make up the debugger and the
    75 libraries which future clients may use. Finally, it describes the
    76 href="#format">actual format that the LLVM debug information takes,
    77 which is useful for those interested in creating front-ends or dealing directly
    78 with the information.

    6979
    7080
    7181
    99109 the debugging information should work with any language.
    100110
    101111
  • With code generator support, it should be possible to use an LLVM compiler
  • 102 to compile a program to native machine code with standard debugging formats.
    112 to compile a program to native machine code and standard debugging formats.
    103113 This allows compatibility with traditional machine-code level debuggers, like
    104114 GDB or DBX.
    105115
    107117
    108118

    109119 The approach used by the LLVM implementation is to use a small set of
    110 href="#impl_common_intrinsics">intrinsic functions to define a mapping
    120 href="#format_common_intrinsics">intrinsic functions to define a mapping
    111121 between LLVM program objects and the source-level objects. The description of
    112122 the source-level program is maintained in LLVM global variables in an
    113 href="#impl_ccxx">implementation-defined format (the C/C++ front-end
    123 href="#ccxx_frontend">implementation-defined format (the C/C++ front-end
    114124 currently uses working draft 7 of the
    115125 href="http://www.eagercon.com/dwarf/dwarf3std.htm">Dwarf 3 standard).

    116126
    206216
    207217

    208218 After working with the debugger for a while, perhaps the nicest improvement
    209 would be to add some sort of line editor, such as GNU readline (but that is
    219 would be to add some sort of line editor, such as GNU readline (but one that is
    210220 compatible with the LLVM license).

    211221
    212222

    213223 For someone so inclined, it should be straight-forward to write different
    214224 front-ends for the LLVM debugger, as the LLVM debugging engine is cleanly
    215 seperated from the llvm-db front-end. A GUI debugger or IDE would be
    216 an interesting project.
    225 separated from the llvm-db front-end. A new LLVM GUI debugger or IDE
    226 would be nice. :)
    217227

    218228
    219229
    251261
    252262
    253263
    254

    llvm-db is the first LLVM debugger, and as such was designed to be

    255 quick to prototype and build, and simple to extend. It is missing many many
    256 features, though they should be easy to add over time (patches welcomed!).
    257 Because the (currently only) debugger backend (implemented in
    258 "lib/Debugger/UnixLocalInferiorProcess.cpp") was designed to work without any
    259 cooperation from the code generators, it suffers from the following inherent
    260 limitations:

    264

    llvm-db is designed to be modular and easy to extend. This

    265 extensibility was key to getting the debugger up-and-running quickly, because we
    266 can start with simple-but-unsophisticated implementations of various components.
    267 Because of this, it is currently missing many features, though they should be
    268 easy to add over time (patches welcomed!). The biggest inherent limitations of
    269 llvm-db are currently due to the extremely simple
    270 href="#arch_debugger">debugger backend (implemented in
    271 "lib/Debugger/UnixLocalInferiorProcess.cpp") which is designed to work without
    272 any cooperation from the code generators. Because it is so simple, it suffers
    273 from the following inherent limitations:

    261274
    262275

    263276
    264277
  • Running a program in llvm-db is a bit slower than running it with
  • 265 lli.
    278 lli (i.e., in the JIT).
    266279
    267280
  • Inspection of the target hardware is not supported. This means that you
  • 268281 cannot, for example, print the contents of X86 registers.
    280293
    281294

    282295
    283

    That said, it is still quite useful, and all of these limitations can be

    284 eliminated by integrating support for the debugger into the code generators.
    285 See the future work section for ideas of how to extend
    286 the LLVM debugger despite these limitations.

    296

    That said, the debugger is still quite useful, and all of these limitations

    297 can be eliminated by integrating support for the debugger into the code
    298 generators, and writing a new InferiorProcess
    299 subclass to use it. See the future work section for ideas
    300 of how to extend the LLVM debugger despite these limitations.

    287301
    288302
    289303
    295309
    296310
    297311
    298

    299 TODO
    300 </p>
    312 <p>TODO: this is obviously lame, when more is implemented, this can be much
    313 better.

    314
    315

    
                      
                    
    316 $ llvm-db funccall
    317 llvm-db: The LLVM source-level debugger
    318 Loading program... successfully loaded 'funccall.bc'!
    319 (llvm-db) create
    320 Starting program: funccall.bc
    321 main at funccall.c:9:2
    322 9 -> q = 0;
    323 (llvm-db) list main
    324 4 void foo() {
    325 5 int t = q;
    326 6 q = t + 1;
    327 7 }
    328 8 int main() {
    329 9 -> q = 0;
    330 10 foo();
    331 11 q = q - 1;
    332 12
    333 13 return q;
    334 (llvm-db) list
    335 14 }
    336 (llvm-db) step
    337 10 -> foo();
    338 (llvm-db) s
    339 foo at funccall.c:5:2
    340 5 -> int t = q;
    341 (llvm-db) bt
    342 #0 -> 0x85ffba0 in foo at funccall.c:5:2
    343 #1 0x85ffd98 in main at funccall.c:10:2
    344 (llvm-db) finish
    345 main at funccall.c:11:2
    346 11 -> q = q - 1;
    347 (llvm-db) s
    348 13 -> return q;
    349 (llvm-db) s
    350 The program stopped with exit code 0
    351 (llvm-db) quit
    352 $
    353

    301354
    302355
    303356
    385438
  • set listsize
  • 386439
  • show language
  • 387440
  • set language
  • 441
  • show args
  • 442
  • set args [args]
  • 388443
    389444
    390445

    TODO:

    414469
    415470
    416471
    417

    
                      
                    
    418 lib/Debugger
    419 - UnixLocalInferiorProcess.cpp
    420
    421 tools/llvm-db
    422 - SourceLanguage interfaces
    423 - ProgramInfo/RuntimeInfo
    424 - Commands
    425
    426

    427
    428 </div>
    472 <p>
    473 The LLVM debugger is built out of three distinct layers of software. These
    474 layers provide clients with different interface options depending on which pieces
    475 they want to implement themselves, and the layering also promotes code modularity and
    476 good design. The three layers are the Debugger
    477 interface, the "info" interfaces, and the
    478 llvm-db tool itself.
    479

    480
    481
    482
    483
    484 The Debugger and InferiorProcess classes
    485
    486
    487
    488

    489 The Debugger class (defined in the include/llvm/Debugger/ directory) is
    490 a low-level class which is used to maintain information about the loaded
    491 program, as well as start and stop the program running as necessary. This class
    492 does not provide any high-level analysis or control over the program, only
    493 exposing simple interfaces like load/unloadProgram,
    494 create/killProgram, step/next/finish/contProgram, and
    495 low-level methods for installing breakpoints.
    496

    497
    498

    499 The Debugger class is itself a wrapper around the lowest-level InferiorProcess
    500 class. This class is used to represent an instance of the program running under
    501 debugger control. The InferiorProcess class can be implemented in different
    502 ways for different targets and execution scenarios (e.g., remote debugging).
    503 The InferiorProcess class exposes a small and simple collection of interfaces
    504 which are useful for inspecting the current state of the program (such as
    505 collecting stack trace information, reading the memory image of the process,
    506 etc). The interfaces in this class are designed to be as low-level and simple
    507 as possible, to make it easy to create new instances of the class.
    508

    509
    510

    511 The Debugger class exposes the currently active instance of InferiorProcess
    512 through the Debugger::getRunningProcess method, which returns a
    513 const reference to the class. This means that clients of the Debugger
    514 class can only inspect the running instance of the program directly. To
    515 change the executing process in some way, they must use the interfaces exposed by
    516 the Debugger class.
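
The read-only access pattern described above can be sketched in C++ as follows. This is an illustration under stated assumptions, not the actual LLVM code: `InferiorProcessSketch`, `FakeProcess`, and `DebuggerSketch` are hypothetical names, and only `getRunningProcess` mirrors a method named in the text.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for InferiorProcess: const methods inspect the
// program, non-const methods change it.
class InferiorProcessSketch {
public:
  virtual ~InferiorProcessSketch() {}
  virtual std::vector<std::string> getStackTrace() const = 0;
  virtual void stepProgram() = 0;
};

// Toy process that only tracks a current line number in main.
class FakeProcess : public InferiorProcessSketch {
  int Line;

public:
  FakeProcess() : Line(1) {}
  std::vector<std::string> getStackTrace() const override {
    return std::vector<std::string>(1, "main:" + std::to_string(Line));
  }
  void stepProgram() override { ++Line; }
};

// Hypothetical stand-in for Debugger: it owns the process and is the only
// route to the mutating methods.
class DebuggerSketch {
  std::unique_ptr<InferiorProcessSketch> Process;

public:
  void createProgram() { Process.reset(new FakeProcess()); }

  // Clients receive a const reference, so they can inspect but not modify.
  const InferiorProcessSketch &getRunningProcess() const {
    if (!Process)
      throw std::logic_error("no program is running");
    return *Process;
  }

  // Changes to the process go through the Debugger interface.
  void stepProgram() {
    if (!Process)
      throw std::logic_error("no program is running");
    Process->stepProgram();
  }
};
```

With this shape, a call such as `D.getRunningProcess().stepProgram()` fails to compile, which is the property the const reference is meant to provide.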
    517

    518
    519
    520
    521
    522 The RuntimeInfo, ProgramInfo, and SourceLanguage classes
    523
    524
    525
    526

    527 The next-highest level of debugger abstraction is provided through the
    528 ProgramInfo, RuntimeInfo, SourceLanguage and related classes (also defined in
    529 the include/llvm/Debugger/ directory). These classes efficiently
    530 decode the debugging information and low-level interfaces exposed by
    531 InferiorProcess into a higher-level representation, suitable for analysis by the
    532 debugger.
    533

    534
    535

    536 The ProgramInfo class exposes a variety of different kinds of information about
    537 the program objects in the source-level-language. The SourceFileInfo class
    538 represents a source-file in the program (e.g. a .cpp or .h file). The
    539 SourceFileInfo class captures information such as which SourceLanguage was used
    540 to compile the file, where the debugger can get access to the actual file text
    541 (which is lazily loaded on demand), etc. The SourceFunctionInfo class
    542 represents a... FIXME: finish. The ProgramInfo class provides interfaces
    543 to lazily find and decode the information needed to create the Source*Info
    544 classes requested by the debugger.
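
Lazy loading of the source text, as described above, might look roughly like the following C++ sketch. `SourceFileSketch` and its injected loader callback are assumptions made for illustration; the real SourceFileInfo interface is not shown in this document.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in for SourceFileInfo: the file text is fetched only
// on first use and cached afterwards.
class SourceFileSketch {
  std::string Path;
  std::function<std::string(const std::string &)> Loader;
  mutable bool Loaded;
  mutable std::string Text;

public:
  SourceFileSketch(const std::string &P,
                   std::function<std::string(const std::string &)> L)
      : Path(P), Loader(L), Loaded(false) {}

  const std::string &getPath() const { return Path; }

  // Load the text on the first call; later calls reuse the cached copy.
  const std::string &getText() const {
    if (!Loaded) {
      Text = Loader(Path);
      Loaded = true;
    }
    return Text;
  }
};
```

Passing the loader in as a callback keeps the sketch independent of the file system; a real implementation would presumably read the file named by the descriptor.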
    545

    546
    547

    548 The RuntimeInfo class exposes information about the currently executed program,
    549 by decoding information from the InferiorProcess and ProgramInfo classes. It
    550 provides a StackFrame class which provides an easy-to-use interface for
    551 inspecting the current and suspended stack frames in the program.
    552

    553
    554

    555 The SourceLanguage class is an abstract interface used by the debugger to
    556 perform all source-language-specific tasks. For example, this interface is used
    557 by the ProgramInfo class to decode language-specific types and functions and by
    558 the debugger front-end (such as llvm-db) to
    559 evaluate source-language expressions typed into the debugger. This class uses
    560 the RuntimeInfo & ProgramInfo classes to get information about the current
    561 execution context and the loaded program, respectively.
    562

    563
    564
    565
    566
    567
    568 The llvm-db tool
    569
    570
    571
    572

    573 The llvm-db tool is designed to be a debugger providing an interface as
    574 href="#llvm-db">similar to GDB as reasonable, but no more so than that.
    575 Because the Debugger and
    576 href="#arch_info">info classes implement all of the heavy lifting and
    577 analysis, llvm-db (which lives in llvm/tools/llvm-db) consists
    578 mainly of of code to interact with the user and parse commands. The CLIDebugger
    579 constructor registers all of the builtin commands for the debugger, and each
    580 command is implemented as a CLIDebugger::[name]Command method.
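
One plausible way to wire up such a command table is a map from command names to member functions. The C++ sketch below is an assumption about the general shape, not a copy of the CLIDebugger source: `CLIDebuggerSketch`, `runCommand`, and `log` are hypothetical, and only the `[name]Command` naming convention and the `list`, `step`, and `s` commands come from this document.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

class CLIDebuggerSketch {
  typedef void (CLIDebuggerSketch::*Handler)(const std::string &Args);
  std::map<std::string, Handler> Commands;
  std::string Log;  // records which handlers ran, for demonstration only

  void listCommand(const std::string &Args) { Log += "list(" + Args + ")"; }
  void stepCommand(const std::string &) { Log += "step()"; }

public:
  // The constructor registers every built-in command, including aliases.
  CLIDebuggerSketch() {
    Commands["list"] = &CLIDebuggerSketch::listCommand;
    Commands["step"] = &CLIDebuggerSketch::stepCommand;
    Commands["s"] = &CLIDebuggerSketch::stepCommand;
  }

  // Splits "name args" at the first space and dispatches to the handler.
  // Returns false if the command is not registered.
  bool runCommand(const std::string &Line) {
    std::size_t Space = Line.find(' ');
    std::string Name = Line.substr(0, Space);
    std::string Args =
        Space == std::string::npos ? std::string() : Line.substr(Space + 1);
    std::map<std::string, Handler>::const_iterator It = Commands.find(Name);
    if (It == Commands.end())
      return false;
    (this->*(It->second))(Args);
    return true;
  }

  const std::string &log() const { return Log; }
};
```

Registering an alias is just a second map entry pointing at the same method, which is how `s` reaches `stepCommand` here.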
    581

    582
    583
    429584
    430585
    431586
    455610

    456611
    457612

    458 run (with args) & set args: These need to be implemented.
    459 Currently run doesn't support setting arguments as part of the command. The
    460 only tricky thing is handling quotes right and stuff.

    461
    462

    463613 UnixLocalInferiorProcess.cpp speedup: There is no reason for the debugged
    464614 process to code gen the globals corresponding to debug information. The
    465615 IntrinsicLowering object could instead change descriptors into constant expr
    467617 would also allow us to eliminate the mapping back and forth between physical
    468618 addresses that must be done.

    469619
    620

    621 Process deaths: The InferiorProcessDead exception should be extended to
    622 know "how" a process died (e.g., that it was killed by a signal). This is easy to
    623 collect in the UnixLocalInferiorProcess; we just need to represent it.

    624
    470625
    471626
    472627
    473628
    474 Debugging information implementation
    629 Debugging information format
    475630
    476631
    477632
    496651 become dead and be removed by the optimizer.

    497652
    498653

    The debugger is designed to be agnostic about the contents of most of the

    499 debugging information. It uses a source-language-specific module to decode the
    500 information that represents variables, types, functions, namespaces, etc: this
    501 allows for arbitrary source-language semantics and type-systems to be used, as
    502 long as there is a module written for the debugger to interpret the information.
    654 debugging information. It uses a source-language-specific
    655 module to decode the information that represents variables, types,
    656 functions, namespaces, etc: this allows for arbitrary source-language semantics
    657 and type-systems to be used, as long as there is a module written for the
    658 debugger to interpret the information.
    503659

    504660
    505661

    506662 To provide basic functionality, the LLVM debugger does have to make some
    507663 assumptions about the source-level language being debugged, though it keeps
    508664 these to a minimum. The only common features that the LLVM debugger assumes
    509 exist are source files,
    510 href="#impl_common_globals">global objects (aka methods, messages, global
    511 variables, etc), and local variables.
    512 These abstract objects are used by the debugger to form stack traces, show
    513 information about local variables, etc.
    665 exist are source files, and
    666 href="#format_program_objects">program objects. These abstract objects are
    667 used by the debugger to form stack traces, show information about local
    668 variables, etc.
    514669
    515670

    This section of the documentation first describes the representation aspects

    516 common to any source-language. The next section
    517 describes the data layout conventions used by the C and C++
    518 front-ends.

    519
    520
    521
    522
    523
    524 Anchors for global objects
    671 common to any source-language. The next section
    672 describes the data layout conventions used by the C and C++ front-ends.

    673
    674
    675
    676
    677
    678 Anchors for global objects
    525679
    526680
    527681
    541695

    542696
    543697

    
                      
                    
    544 %llvm.dbg.translation_units = linkonce global {} {}
    545 %llvm.dbg.globals = linkonce global {} {}
    698 %llvm.dbg.translation_units = linkonce global {} {}
    699 %llvm.dbg.globals = linkonce global {} {}
    546700

    547701
    548702

    559713
    560714
    561715
    562
    716
    563717 Representing stopping points in the source program
    564718
    565719
    573727 at every point in the program where the debugger should be able to inspect the
    574728 program (these correspond to places the debugger stops when you "step"
    575729 through it). The front-end can choose to place these as fine-grained as it
    576 would like (for example, before every subexpression was evaluated), but it is
    577 recommended to only put them after every source statement.

    730 would like (for example, before every subexpression is evaluated), but it is
    731 recommended to only put them after every source statement that includes
    732 executable code.

    578733
    579734

    580735 Using calls to this intrinsic function to demark legal points for the debugger
    584739 which they must assume to do anything (including reading or writing to any part
    585740 of reachable memory). On the other hand, it does not impact many optimizations,
    586741 such as code motion of non-trapping instructions, nor does it impact
    587 optimization of subexpressions, or any other code between the stop points.

    742 optimization of subexpressions, code duplication transformations, or basic-block
    743 reordering transformations.

    588744
    589745

    590746 An important aspect of the calls to the %llvm.dbg.stoppoint intrinsic
    591747 is that the function-local debugging information is woven together with use-def
    592748 chains. This makes it easy for the debugger to, for example, locate the 'next'
    593 stop point. For a concrete example of stop points, see
    594 href="#impl_common_lifetime">the next section.

    595
    596
    597
    598
    599
    600
    601 Object lifetimes and scoping
    749 stop point. For a concrete example of stop points, see the example in
    750 href="#format_common_lifetime">the next section.

    751
    752
    753
    754
    755
    756
    757 Object lifetimes and scoping
    602758
    603759
    604760
    641797 %X = alloca int
    642798 %Y = alloca int
    643799 %Z = alloca int
    644 %D1 = call {}* %llvm.dbg.func.start(%lldb.global* %d.foo)
    645 %D2 = call {}* %llvm.dbg.stoppoint({}* %D1, uint 2, uint 2, %lldb.compile_unit* %file)
    800 %D1 = call {}* %llvm.dbg.func.start(%lldb.global* %d.foo)
    801 %D2 = call {}* %llvm.dbg.stoppoint({}* %D1, uint 2, uint 2, %lldb.compile_unit* %file)
    646802
    647803 %D3 = call {}* %llvm.dbg.DEFINEVARIABLE({}* %D2, ...)
    648804 ;; Evaluate expression on line 2, assigning to X.
    649 %D4 = call {}* %llvm.dbg.stoppoint({}* %D3, uint 3, uint 2, %lldb.compile_unit* %file)
    805 %D4 = call {}* %llvm.dbg.stoppoint({}* %D3, uint 3, uint 2, %lldb.compile_unit* %file)
    650806
    651807 %D5 = call {}* %llvm.dbg.DEFINEVARIABLE({}* %D4, ...)
    652808 ;; Evaluate expression on line 3, assigning to Y.
    653 %D6 = call {}* %llvm.dbg.stoppoint({}* %D5, uint 5, uint 4, %lldb.compile_unit* %file)
    809 %D6 = call {}* %llvm.dbg.stoppoint({}* %D5, uint 5, uint 4, %lldb.compile_unit* %file)
    654810
    655811 %D7 = call {}* %llvm.region.start({}* %D6)
    656812 %D8 = call {}* %llvm.dbg.DEFINEVARIABLE({}* %D7, ...)
    657813 ;; Evaluate expression on line 5, assigning to Z.
    658 %D9 = call {}* %llvm.dbg.stoppoint({}* %D8, uint 6, uint 4, %lldb.compile_unit* %file)
    814 %D9 = call {}* %llvm.dbg.stoppoint({}* %D8, uint 6, uint 4, %lldb.compile_unit* %file)
    659815
    660816 ;; Code for line 6.
    661817 %D10 = call {}* %llvm.region.end({}* %D9)
    662 %D11 = call {}* %llvm.dbg.stoppoint({}* %D10, uint 8, uint 2, %lldb.compile_unit* %file)
    818 %D11 = call {}* %llvm.dbg.stoppoint({}* %D10, uint 8, uint 2, %lldb.compile_unit* %file)
    663819
    664820 ;; Code for line 8.
    665821 %D12 = call {}* %llvm.region.end({}* %D11)
    671827 This example illustrates a few important details about the LLVM debugging
    672828 information. In particular, it shows how the various intrinsics used are woven
    673829 together with def-use and use-def chains, similar to how
    674 href="#impl_common_anchors">anchors are used with globals. This allows the
    830 href="#format_common_anchors">anchors are used with globals. This allows the
    675831 debugger to analyze the relationship between statements, variable definitions,
    676832 and the code used to implement the function.

    677833
    680836 href="#icl_ex_D1">definition of the %D1 variable and one with the
    681837 definition of %D7. In the case of
    682838 %D1, the debug information indicates that the function whose
    683 href="#impl_common_globals">descriptor is specified as an argument to the
    839 href="#format_program_objects">descriptor is specified as an argument to the
    684840 intrinsic. This defines a new stack frame whose lifetime ends when the region
    685841 is ended by the %D12 call.

    686842
    687843

    688 Representing the boundaries of functions with regions allows normal LLVM
    689 interprocedural optimizations to change the boundaries of functions without
    690 having to worry about breaking mapping information between LLVM and source-level
    691 functions. In particular, the inlining optimization requires no modification to
    692 support inlining with debugging information: there is no correlation drawn
    693 between LLVM functions and their source-level counterparts.

    844 Using regions to represent the boundaries of source-level functions allows LLVM
    845 interprocedural optimizations to arbitrarily modify LLVM functions without
    846 having to worry about breaking mapping information between the LLVM code and
    847 the source-level program. In particular, the inliner requires no modification
    848 to support inlining with debugging information: there is no explicit correlation
    849 drawn between LLVM functions and their source-level counterparts (note, however,
    850 that if the inliner inlines all instances of a non-strong-linkage function into
    851 its callers, it will not be possible for the user to manually invoke the
    852 inlined function from the debugger).

    694853
    695854

    696855 Once the function has been defined, the
    697 href="#impl_common_stoppoint">stopping point corresponding to line #2 of the
    856 href="#format_common_stoppoint">stopping point corresponding to line #2 of the
    698857 function is encountered. At this point in the function, no local
    699858 variables are live. As lines 2 and 3 of the example are executed, their
    700859 variable definitions are automatically introduced into the program, without the
    707866 In contrast, the Z variable goes out of scope at a different time, on
    708867 line 7. For this reason, it is defined within the
    709868 %D7 region, which kills the availability of Z before the
    710 code for line 8 is executed. Through the use of LLVM debugger regions,
    711 arbitrary source-language scoping rules can be supported, as long as they can
    712 only be nested (ie, one scope cannot partially overlap with a part of another
    713 scope).
    869 code for line 8 is executed. In this way, regions can support arbitrary
    870 source-language scoping rules, as long as they can only be nested (i.e., one scope
    871 cannot partially overlap with a part of another scope).
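
Because regions can only nest, a consumer of this information can track live variables with a simple stack. The C++ sketch below illustrates that idea under stated assumptions: `RegionTrackerSketch` and its methods are hypothetical names, and the function-start call is treated as opening the outermost region, as in the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tracker: each open region holds the variables defined in it.
class RegionTrackerSketch {
  std::vector<std::vector<std::string> > Regions;

public:
  // Called for a function start or a region start.
  void regionStart() { Regions.push_back(std::vector<std::string>()); }

  // Adds a variable to the innermost open region; false if none is open.
  bool defineVariable(const std::string &Name) {
    if (Regions.empty())
      return false;
    Regions.back().push_back(Name);
    return true;
  }

  // Closes the innermost region, ending the lifetime of its variables.
  bool regionEnd() {
    if (Regions.empty())
      return false;
    Regions.pop_back();
    return true;
  }

  // Variables visible at the current point, outermost region first.
  std::vector<std::string> liveVariables() const {
    std::vector<std::string> Live;
    for (std::size_t I = 0; I < Regions.size(); ++I)
      Live.insert(Live.end(), Regions[I].begin(), Regions[I].end());
    return Live;
  }
};
```

Replaying the example (function start, X, Y, region start, Z, region end, region end) leaves X and Y live after the inner region closes and nothing live at the end, which matches the lifetimes described above.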
    714872

    715873
    716874

    718876 declarations, not just variable declarations. For example, the scope of a C++
    719877 using declaration is controlled with this, and the llvm-db C++ support
    720878 routines could use this to change how name lookup is performed (though this is
    721 not yet implemented).
    722

    723
    724
    725
    726
    727
    728
    729 Object descriptor formats
    730
    731
    732
    733

    734 The LLVM debugger expects the descriptors for global objects to start in a
    879 not implemented yet).
    880

    881
    882
    883
    884
    885
    886
    887 Object descriptor formats
    888
    889
    890
    891

    892 The LLVM debugger expects the descriptors for program objects to start in a
    735893 canonical format, but the descriptors can include additional information
    736 appended at the end. All LLVM debugging information is versioned, allowing
    737 backwards compatibility in the case that the core structures need to change in
    738 some way. The lowest-level descriptor are those describing
    739 href="#impl_common_source_files">the files containing the program source
    740 code, all other descriptors refer to them.
    894 appended at the end that is source-language specific. All LLVM debugging
    895 information is versioned, allowing backwards compatibility in the case that the
    896 core structures need to change in some way. Also, all debugging information
    897 objects start with a tag to indicate what type
    898 of object it is. The source-language is allowed to define its own objects, by
    899 using unreserved tag numbers.

    900
    901

    The lowest-level descriptors are those describing

    902 href="#format_common_source_files">the files containing the program source
    903 code, as most other descriptors (sometimes indirectly) refer to them.
    741904

    742905
    743906
    744907
    745908
    746909
    747 Representation of source files
    748
    749
    750
    751

    752 Source file descriptors were roughly patterned after the Dwarf "compile_unit"
    753 object. The descriptor currently is defined to have the following LLVM
    754 type:>
    910 Representation of source files>
    911
    912
    913
    914

    915 Source file descriptors are patterned after the Dwarf "compile_unit" object.
    916 The descriptor currently is defined to have at least the following LLVM
    917 type entries:

    755918
    756919

    
                      
                    
    757920 %lldb.compile_unit = type {
    921 uint, ;; Tag: LLVM_COMPILE_UNIT
    758922 ushort, ;; LLVM debug version number
    759923 ushort, ;; Dwarf language identifier
    760924 sbyte*, ;; Filename
    761925 sbyte*, ;; Working directory when compiled
    762 sbyte*, ;; Producer of the debug information
    763 {}* ;; Anchor for llvm.dbg.translation_units
    926 sbyte* ;; Producer of the debug information
    764927 }
    765928

    766929
    769932 language ID for the file (we use the Dwarf 3.0 ID numbers, such as
    770933 DW_LANG_C89, DW_LANG_C_plus_plus, DW_LANG_Cobol74,
    771934 etc), three strings describing the filename, working directory of the compiler,
    772 and an identifier string for the compiler that produced it, and the
    773 href="#impl_common_anchors">anchor for the descriptor. Here is an example
    935 and an identifier string for the compiler that produced it. Note that actual
    936 compile_unit declarations must also include an
    937 href="#format_common_anchors">anchor to llvm.dbg.translation_units,
    938 but it is not specified where the anchor is to be located. Here is an example
    774939 descriptor:
    775940

    776941
    777942

    
                      
                    
    778943 %arraytest_source_file = internal constant %lldb.compile_unit {
    944 uint 17, ; Tag value
    779945 ushort 0, ; Version #0
    780946 ushort 1, ; DW_LANG_C89
    781947 sbyte* getelementptr ([12 x sbyte]* %.str_1, long 0, long 0), ; filename
    788954 %.str_3 = internal constant [12 x sbyte] c"llvmgcc 3.4\00"
    789955

    790956
    957

    958 Note that the LLVM constant merging pass should eliminate duplicate copies of
    959 the strings that get emitted to each translation unit, such as the producer.
    960

    791961
    792962
    793963
    794964
    795965
    796966
    797 Representation of global objects
    798
    799
    800
    801

    802 The LLVM debugger needs to know what the source-language global objects, in
    803 order to build stack traces and other related activities. Because
    804 source-languages have widly varying forms of global objects, the LLVM debugger
    805 only expects the following fields in the descriptor for each global:
    967 Representation of program objects
    968
    969
    970
    971

    972 The LLVM debugger needs to know about some source-language program objects, in
    973 order to build stack traces, print information about local variables, and other
    974 related activities. The LLVM debugger differentiates between three different
    975 types of program objects: subprograms (functions, messages, methods, etc),
    976 variables (locals and globals), and others. Because source-languages have
    977 widely varying forms of these objects, the LLVM debugger expects only a few
    978 fields in the descriptor for each object:
    806979

    807980
    808981

    
                      
                    
    809 %lldb.global = type {
    810 %lldb.compile_unit*, ;; The translation unit containing the global
    811 sbyte*, ;; The global object 'name'
    812 [type]*, ;; Source-language type descriptor for global
    813 {}* ;; The anchor for llvm.dbg.globals
    982 %lldb.object = type {
    983 uint, ;; A tag
    984 any*, ;; The context for the object
    985 sbyte* ;; The object 'name'
    814986 }
    815987

    816988
    817989

    818 The first field contains a pointer to the translation unit the function is
    819 defined in. This pointer allows the debugger to find out which version of debug
    820 information the function corresponds to. The second field contains a string
    821 that the debugger can use to identify the subprogram if it does not contain
    822 explicit support for the source-language in use. This should be some sort of
    823 unmangled string that corresponds to the function somehow.
    990 The first field contains a tag for the descriptor. The second field contains
    991 either a pointer to the descriptor for the containing
    992 href="#format_common_source_files">source file, or it contains a pointer to
    993 another program object whose context pointer eventually reaches a source file.
    994 Through this context pointer, the
    995 LLVM debugger can establish the debug version number of the object.

    996
    997

    998 The third field contains a string that the debugger can use to identify the
    999 object if it does not contain explicit support for the source-language in use
    1000 (i.e., the 'unknown' source language handler uses this string). This should be
    1001 some sort of unmangled string that corresponds to the object, but it is a
    1002 quality-of-implementation issue what exactly it contains (it is legal, though
    1003 not useful, for all of these strings to be null).
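
The context chain described above can be sketched in C. This is only an illustration, not the debugger's actual code: the struct and function names (`compile_unit`, `object`, `find_compile_unit`) are hypothetical, the C field types are assumed stand-ins for the LLVM `uint`, `ushort`, and `sbyte*` types, and the sketch assumes every descriptor begins with its tag and that context chains contain no cycles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { LLVM_COMPILE_UNIT = 17 };  /* tag value of a source file descriptor */

/* Assumed C mirror of %lldb.compile_unit (names are hypothetical). */
struct compile_unit {
    uint32_t tag;            /* LLVM_COMPILE_UNIT */
    uint16_t debug_version;  /* LLVM debug version number */
    uint16_t language;       /* Dwarf language identifier */
    const char *filename;
    const char *directory;
    const char *producer;
};

/* Assumed C mirror of %lldb.object. */
struct object {
    uint32_t tag;
    const void *context;     /* a compile_unit or another object */
    const char *name;
};

/* Follow context pointers until a source file descriptor is reached.
   Returns NULL if the chain ends without reaching one. */
static const struct compile_unit *find_compile_unit(const struct object *obj) {
    assert(obj != NULL);
    const void *ctx = obj;
    while (ctx != NULL) {
        /* Every descriptor is assumed to start with a uint tag. */
        uint32_t tag = *(const uint32_t *)ctx;
        if (tag == LLVM_COMPILE_UNIT)
            return (const struct compile_unit *)ctx;
        ctx = ((const struct object *)ctx)->context;
    }
    return NULL;
}
```

With a result in hand, the `debug_version` field of the returned descriptor gives the debug version number that applies to the object.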
    8241004

    8251005
    8261006

    8271007 Note again that descriptors can be extended to include source-language-specific
    8281008 information in addition to the fields required by the LLVM debugger. See the
    829 href="#impl_ccxx_descriptors">section on the C/C++ front-end for more
    830 information.
    831

    832
    833
    834
    835
    836
    837
    838 Representation of local variables
    839
    840
    841
    842

    843

    844
    845
    846
    847
    848
    849 Other intrinsic functions
    850
    851
    852
    853

    854
    855

    1009 href="#ccxx_descriptors">section on the C/C++ front-end for more
    1010 information. Also remember that global objects (functions, selectors, global
    1011 variables, etc) must contain an anchor to
    1012 the llvm.dbg.globals variable.
    1013

    1014
    1015
    1016
    1017
    1018
    1019 Program object contexts
    1020
    1021
    1022
    1023

    
                      
                    
    1024 Allow source-language-specific contexts, used to identify namespaces, etc.
    1025 Must end up in a source file descriptor.
    1026 Debugger core ignores all unknown context objects.
    1027

    1028
    1029
    1030
    1031
    1032
    1033
    1034 Debugger intrinsic functions
    1035
    1036
    1037
    1038

    
                      
                    
    1039 Define each intrinsic, as an extension of the language reference manual.
    1040
    1041 llvm.dbg.stoppoint
    1042 llvm.dbg.region.start
    1043 llvm.dbg.region.end
    1044 llvm.dbg.function.start
    1045 llvm.dbg.declare
    1046

    1047
    1048
    1049
    1050
    1051
    1052
    1053 Values for debugger tags
    1054
    1055
    1056
    1057
    1058

    1059 These happen to be the same values as the similarly named Dwarf-3 tags; this may change
    1060 in the future.
    1061

    1062
    1063

    1064

    
                      
                    
    1065 LLVM_COMPILE_UNIT : 17
    1066 LLVM_SUBPROGRAM : 46
    1067 LLVM_VARIABLE : 52
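
As a rough illustration of the tag values listed above, the following C sketch writes them as an enumeration and checks them against the corresponding Dwarf 3 tag encodings (`DW_TAG_compile_unit`, `DW_TAG_subprogram`, `DW_TAG_variable`). The helper names `check_tags_match_dwarf` and `tag_name` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Debugger tag values, as listed in the document. */
enum llvm_debug_tag {
    LLVM_COMPILE_UNIT = 17,
    LLVM_SUBPROGRAM   = 46,
    LLVM_VARIABLE     = 52
};

/* Dwarf 3 encodings of the similarly named tags. */
enum dwarf_tag {
    DW_TAG_compile_unit = 0x11,
    DW_TAG_subprogram   = 0x2e,
    DW_TAG_variable     = 0x34
};

/* The values currently coincide; the document warns this may change. */
static void check_tags_match_dwarf(void) {
    assert(LLVM_COMPILE_UNIT == DW_TAG_compile_unit);
    assert(LLVM_SUBPROGRAM == DW_TAG_subprogram);
    assert(LLVM_VARIABLE == DW_TAG_variable);
}

/* Map a tag to a readable name; unrecognized tags are reported as such. */
static const char *tag_name(uint32_t tag) {
    switch (tag) {
    case LLVM_COMPILE_UNIT: return "compile unit";
    case LLVM_SUBPROGRAM:   return "subprogram";
    case LLVM_VARIABLE:     return "variable";
    default:                return "unknown";
    }
}
```

Because the equality with Dwarf is described as incidental, code written along these lines should compare against the `LLVM_*` names rather than the Dwarf constants.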
    1068
    1069

    8561070
    8571071
    8581072
    8591073
    8601074
    8611075
    862 C/C++ front-end specific debug information
    1076 C/C++ front-end specific debug information
    8631077
    8641078
    8651079
    8701084 href="http://www.eagercon.com/dwarf/dwarf3std.htm">Dwarf 3.0 in terms of
    8711085 information content. This allows code generators to trivially support native
    8721086 debuggers by generating standard dwarf information, and contains enough
    873 information for non-dwarf targets to translate it other as needed.

    874
    875

    876 TODO: document extensions to standard debugging objects, document how we
    877 represent source types, etc.
    878

    879
    880
    881
    882
    883
    884 Object Descriptor Formats
    885
    886
    887
    888

    889
    890

    891
    892
    1087 information for non-dwarf targets to translate it as needed.

    1088
    1089

    1090 The basic debug information required by the debugger is (intentionally) designed
    1091 to be as minimal as possible. This basic information is so minimal that it is
    1092 unlikely that any source-language could be adequately described by it.
    1093 Because of this, the debugger format was designed for extension to support
    1094 source-language-specific information. The extended descriptors are read and
    1095 interpreted by the language-specific modules in the
    1096 debugger if there is support available; otherwise they are ignored.
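
One way to picture this extension scheme is a descriptor whose leading fields are the ones the debugger requires, followed by language-specific fields that only a language module reads. The C sketch below is an assumption-laden illustration, not the actual C/C++ descriptor layout: `c_variable`, its `type_name` and `line` fields, and `core_name` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed C mirror of the minimal %lldb.object fields. */
struct object {
    uint32_t tag;
    const void *context;
    const char *name;
};

/* Hypothetical extended descriptor: required fields first,
   language-specific fields after them. */
struct c_variable {
    struct object base;     /* what the debugger core understands */
    const char *type_name;  /* hypothetical extra field */
    uint32_t line;          /* hypothetical extra field */
};

/* The core reads only the common prefix and ignores whatever follows. */
static const char *core_name(const void *descriptor) {
    assert(descriptor != NULL);
    return ((const struct object *)descriptor)->name;
}
```

A language-specific module that recognizes the descriptor could cast it to `struct c_variable` and use the extra fields, while a debugger without that module still gets the name through `core_name`.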
    1097

    1098
    1099

    1100 This section describes the extensions used to represent C and C++ programs.
    1101 Other languages could pattern themselves after this (which itself is tuned to
    1102 representing programs in the same way that Dwarf 3 does), or they could choose
    1103 to provide completely different extensions if they don't fit into the Dwarf
    1104 model. As support for debugging information gets added to the various LLVM
    1105 source-language front-ends, the information used should be documented here.
    1106

    1107
    1108
    1109
    1110
    1111
    1112 Program Scope Entries
    1113
    1114
    1115
    1116

    1117
    1118

    1119
    1120
    1121
    1122
    1123 Compilation unit entries
    1124
    1125
    1126
    1127

    1128 Translation units do not add any information over the standard
    1129 href="#format_common_source_files">source file representation already
    1130 expected by the debugger. As such, the C/C++ front-end uses descriptors of the type specified,
    1131 with a trailing anchor.
    1132

    1133
    1134
    1135
    1136
    1137 Module, namespace, and importing entries
    1138
    1139
    1140
    1141

    1142
    1143

    1144
    1145
    1146
    1147
    1148 Data objects (program variables)
    1149
    1150
    1151
    1152

    1153
    1154

    1155
    8931156
    8941157
    8951158