llvm.org GIT mirror llvm / 9b2a184
Continue the exposition git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13819 91177308-0d34-0410-b5e6-96231b3b80d8 Chris Lattner 15 years ago
1 changed file(s) with 141 addition(s) and 33 deletion(s).
2020
  • Interfaces for user programs
  • 2121
    2222
  • Identifying GC roots on the stack: llvm.gcroot
  • 23
  • GC descriptor format for heap objects
  • 2423
  • Allocating memory from the GC
  • 2524
  • Reading and writing references to the heap
  • 2625
  • Explicit invocation of the garbage collector
  • 3029
  • Implementing a garbage collector
  • 3130
    3231
  • Implementing llvm_gc_read and llvm_gc_write
  • 33
  • Tracing the GC roots from the program stack
  • 34
  • GC implementations available
  • 32
  • Callback functions used to implement the garbage collector
  • 3533
    34
    35
  • GC implementations available
  • 36
    37
  • SemiSpace - A simple copying garbage collector
  • 3638
    3739
    3840
    209211
    210 GC descriptor format for heap objects
    211
    212
    213
    214
    215

    216 TODO: Either from root meta data, or from object headers. Front-end can provide a
    217 call-back to get descriptor from object without meta-data.
    218

    219
    220
    221
    222
    223
    224212 Allocating memory from the GC
    225213
    226214
    281269
    282270
    283271
    284 void %llvm_gc_initialize()
    272 void %llvm_gc_initialize(unsigned %InitialHeapSize)
    285273
    286274
    287275

    288276 The llvm_gc_initialize function should be called once before any other
    289277 garbage collection functions are called. This gives the garbage collector the
    290 chance to initialize itself and allocate the heap spaces.
    278 chance to initialize itself and allocate the heap spaces. The initial heap size
    279 to allocate should be specified as an argument.
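
    A minimal C sketch of what an implementation of the new signature might
    look like, assuming a collector that keeps a single malloc'd block. The
    variables HeapStart, HeapCur and HeapEnd are hypothetical names for
    illustration, not part of the documented interface.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical collector state; the names are illustrative only. */
static char *HeapStart, *HeapCur, *HeapEnd;

/* Called once, before any other garbage collection function. */
void llvm_gc_initialize(unsigned InitialHeapSize) {
  HeapStart = malloc(InitialHeapSize);
  assert(HeapStart != NULL && "could not allocate the initial heap");
  HeapCur = HeapStart;                   /* next free byte */
  HeapEnd = HeapStart + InitialHeapSize; /* one past the last usable byte */
}
```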
    291280

    292281
    293282
    322311
    323312
    324313

    325 Implementing a garbage collector for LLVM is fairly straightforward. The
    314 Implementing a garbage collector for LLVM is fairly straightforward. The LLVM
    315 garbage collectors are provided in a form that makes them easy to link into the
    316 language-specific runtime that a language front-end would use. They require
    317 functionality from the language-specific runtime to get information about
    318 where pointers are located in heap objects.
    319

    320
    321

    The

    326322 implementation must include the
    327323 llvm_gc_allocate and
    328324 llvm_gc_collect functions, and it must implement
    362358
    363359
    364360
    365 Tracing the GC roots from the program stack
    361 Callback functions used to implement the garbage collector
    362
    363
    364 Garbage collector implementations make use of call-back functions that are
    365 implemented by other parts of the LLVM system.
    366
    367
    368
    369 Tracing GC pointers from the program stack
    366370
    367371
    368372
    379383

    380384
    381385
    382
    383
    384 <div class="doc_subsection">
    386 <!--_________________________________________________________________________-->
    387
    388 Tracing GC pointers from static roots
    389
    390
    391
    392 TODO
    393
    394
    395
    396
    397
    398 Tracing GC pointers from heap objects
    399
    400
    401
    402

    403 The three most common ways to keep track of where pointers live in heap objects
    404 are (listed in order of space overhead required):

    405
    406
    407
  • In languages with polymorphic objects, pointers from an object header are
    408 usually used to identify the GC pointers in the heap object. This is common for
    409 object-oriented languages like Self, Smalltalk, Java, or C#.
    410
    411
  • If heap objects are not polymorphic, often the "shape" of the heap can be
    412 determined from the roots of the heap or from some other meta-data [
    413 Appel89, Goldberg91,
    414 Tolmach94]. In this case, the garbage collector can
    415 propagate the information around from meta data stored with the roots. This
    416 often eliminates the need to have a header on objects in the heap. This is
    417 common in the ML family.
    418
    419
  • If all heap objects have pointers in the same locations, or pointers can be
    420 distinguished just by looking at them (e.g., the low order bit is clear), no
    421 book-keeping is needed at all. This is common for Lisp-like languages.
    422
    423
    424
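
    As an illustration of the first style in the list above, the following C
    sketch shows one way an object header could point at a per-type descriptor
    that lists where the GC pointers live. All names here (TypeDescriptor,
    ObjectHeader, trace_object, Node, count_slot) are hypothetical and are not
    part of the LLVM interfaces described in this document.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-type descriptor: byte offsets of the GC pointers. */
typedef struct {
  size_t NumPointers;
  const size_t *PointerOffsets;
} TypeDescriptor;

/* Hypothetical header placed at the start of every heap object. */
typedef struct { const TypeDescriptor *Desc; } ObjectHeader;

/* Call Visit on every pointer slot of Obj, as listed by its descriptor. */
static void trace_object(ObjectHeader *Obj, void (*Visit)(void **Slot)) {
  assert(Obj && Obj->Desc);
  for (size_t i = 0; i < Obj->Desc->NumPointers; ++i)
    Visit((void **)((char *)Obj + Obj->Desc->PointerOffsets[i]));
}

/* Example object type with two GC pointers and one non-pointer field. */
typedef struct { ObjectHeader H; void *Left; long Value; void *Right; } Node;

static const size_t NodeOffsets[] = { offsetof(Node, Left),
                                      offsetof(Node, Right) };
static const TypeDescriptor NodeDesc = { 2, NodeOffsets };

/* Example visitor that only counts the slots it is shown. */
static size_t VisitedCount;
static void count_slot(void **Slot) { (void)Slot; ++VisitedCount; }
```

    A collector built this way would pass a visitor that marks or copies the
    referenced object instead of counting slots.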

    The LLVM garbage collectors are capable of supporting all of these styles of

    425 language, including ones that mix various implementations. To do this, it
    426 allows the source-language to associate meta-data with the
    427 stack roots, and the heap tracing routines can propagate the
    428 information. In addition, LLVM allows the front-end to extract GC information
    429 in any form from a specific object pointer (this supports situations #1 and
    430 #3).
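
    For situation #3, a front-end might distinguish pointers by inspecting the
    word itself. The C sketch below assumes one particular tagging scheme
    (pointers have a clear low-order bit, small integers are stored shifted
    left with the low bit set). The scheme and the helper names are assumptions
    made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed scheme: a word is a heap pointer when its low-order bit is clear. */
static int is_gc_pointer(uintptr_t Word) { return (Word & 1u) == 0; }

/* Store a small integer as (n << 1) | 1 so it never looks like a pointer. */
static uintptr_t tag_int(intptr_t N) { return ((uintptr_t)N << 1) | 1u; }

/* Recover the integer from a tagged word. */
static intptr_t untag_int(uintptr_t Word) {
  assert(!is_gc_pointer(Word));
  return (intptr_t)Word >> 1;
}
```

    With such a scheme the heap tracing code needs no descriptor at all: it
    can test every word of an object with is_gc_pointer.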
    431

    432
    433

    Making this efficient

    434
    435
    436
    437
    438
    439
    440
    441
    442
    385443 GC implementations available
    386444
    445
    387446
    388447
    389448
    390449

    391450 To make this more concrete, the currently implemented LLVM garbage collectors
    392 all live in the llvm/runtime/GC directory in the LLVM source-base.
    393

    394
    395

    396 TODO: Brief overview of each.
    397

    398
    399
    400
    451 all live in the llvm/runtime/GC/* directories in the LLVM source-base.
    452 If you are interested in implementing an algorithm, there are many interesting
    453 possibilities (mark/sweep, a generational collector, a reference counting
    454 collector, etc.), or you could choose to improve one of the existing algorithms.
    455

    456
    457
    458
    459
    460
    461 SemiSpace - A simple copying garbage collector
    462
    463
    464
    465

    466 SemiSpace is a very simple copying collector. When it starts up, it allocates
    467 two blocks of memory for the heap. It uses a simple bump-pointer allocator to
    468 allocate memory from the first block until it runs out of space. When it runs
    469 out of space, it traces through all of the roots of the program, copying
    470 reachable blocks to the other half of the memory space.
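
    The behaviour described above can be sketched in C as follows. This is
    not the SemiSpace source: the Obj header, the explicit root array, and the
    semi_* function names are assumptions, and objects are assumed to contain
    no pointers, so only the roots themselves are traced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header: total object size and a forwarding pointer. */
typedef struct Obj { size_t Size; struct Obj *Forward; } Obj;

static char *FromSpace, *ToSpace, *AllocPtr;
static size_t SpaceSize;

/* Allocate the two blocks of memory that make up the heap. */
static void semi_init(size_t Size) {
  SpaceSize = Size;
  FromSpace = malloc(Size);
  ToSpace = malloc(Size);
  assert(FromSpace && ToSpace);
  AllocPtr = FromSpace;
}

/* Copy O to *Dest unless it was already copied; return its new address. */
static Obj *copy_object(Obj *O, char **Dest) {
  if (O == NULL) return NULL;
  if (O->Forward) return O->Forward;
  Obj *New = (Obj *)*Dest;
  memcpy(New, O, O->Size);
  *Dest += O->Size;
  O->Forward = New;
  return New;
}

/* Copy every rooted object to the other half, then swap the halves. */
static void semi_collect(Obj **Roots, size_t NumRoots) {
  char *Dest = ToSpace;
  for (size_t i = 0; i < NumRoots; ++i)
    Roots[i] = copy_object(Roots[i], &Dest);
  char *Tmp = FromSpace; FromSpace = ToSpace; ToSpace = Tmp;
  AllocPtr = Dest;
}

/* Bump-pointer allocation; collect once if the current half is full. */
static Obj *semi_alloc(size_t Payload, Obj **Roots, size_t NumRoots) {
  size_t Total = (sizeof(Obj) + Payload + 15) & ~(size_t)15;
  if ((size_t)(FromSpace + SpaceSize - AllocPtr) < Total) {
    semi_collect(Roots, NumRoots);
    if ((size_t)(FromSpace + SpaceSize - AllocPtr) < Total)
      return NULL; /* still no room after collecting */
  }
  Obj *O = (Obj *)AllocPtr;
  AllocPtr += Total;
  O->Size = Total;
  O->Forward = NULL;
  return O;
}
```

    A real collector would also scan the copied objects for pointers and copy
    what they reference, and would find its roots through the stack tracing
    callbacks rather than an explicit array.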
    471

    472
    473
    474
    475
    476
    477 Possible Improvements
    478
    479
    480
    481
    482

    483 If a collection cycle happens and the heap is not compacted very much (say less
    484 than 25% of the allocated memory was freed), the memory regions should be
    485 doubled in size.
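
    The suggested heuristic could be expressed as a small C helper like the one
    below. The function name and its parameters are hypothetical; the 25%
    threshold is the example figure from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Return the region size to use after a collection cycle. UsedBefore and
   UsedAfter are the bytes in use before and after the collection. */
static size_t next_space_size(size_t SpaceSize, size_t UsedBefore,
                              size_t UsedAfter) {
  assert(UsedAfter <= UsedBefore);
  size_t Freed = UsedBefore - UsedAfter;
  /* Freed < 25% of UsedBefore, written without division. */
  if (Freed * 4 < UsedBefore)
    return SpaceSize * 2;
  return SpaceSize;
}
```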

    486
    487
    488
    489
    490
    491 References
    492
    493
    494
    495
    496
    497

    [Appel89] Runtime Tags Aren't Necessary. Andrew

    498 W. Appel. Lisp and Symbolic Computation 19(7):703-705, July 1989.

    499
    500

    [Goldberg91] Tag-free garbage collection for

    501 strongly typed programming languages. Benjamin Goldberg. ACM SIGPLAN
    502 PLDI'91.

    503
    504

    [Tolmach94] Tag-free garbage collection using

    505 explicit type parameters. Andrew Tolmach. Proceedings of the 1994 ACM
    506 conference on LISP and functional programming.

    507
    508
    401509
    402510
    403511