llvm.org GIT mirror llvm / cf16656
[WinEH] Add documentation motivating the new EH instructions

This adds documentation on how to use the new EH instructions added in r243766.

Reviewers: majnemer, reames

Differential Revision: http://reviews.llvm.org/D11565

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244267 91177308-0d34-0410-b5e6-96231b3b80d8

Reid Kleckner 4 years ago
1 changed file(s) with 205 addition(s) and 150 deletion(s).
6666 Windows Runtime Exception Handling
6767 -----------------------------------
6868
69 Windows runtime based exception handling uses the same basic IR structure as
70 Itanium ABI based exception handling, but it relies on the personality
71 functions provided by the native Windows runtime library, ``__CxxFrameHandler3``
72 for C++ exceptions: ``__C_specific_handler`` for 64-bit SEH or
73 ``_frame_handler3/4`` for 32-bit SEH. This results in a very different
74 execution model and requires some minor modifications to the initial IR
75 representation and a significant restructuring just before code generation.
76
77 General information about the Windows x64 exception handling mechanism can be
78 found at `MSDN Exception Handling (x64)
79 `_.
69 LLVM supports handling exceptions produced by the Windows runtime, but it
70 requires a very different intermediate representation. It is not based on the
71 ":ref:`landingpad `" instruction like the other two models, and is
72 described later in this document under :ref:`wineh`.
8073
8174 Overview
8275 --------
320313 the `resume instruction `_ if none of the conditions
321314 match.
322315
323 C++ Exception Handling using the Windows Runtime
324 =================================================
325
326 (Note: Windows C++ exception handling support is a work in progress and is
327 not yet fully implemented. The text below describes how it will work
328 when completed.)
329
330 The Windows runtime function for C++ exception handling uses a multi-phase
331 approach. When an exception occurs it searches the current callstack for a
332 frame that has a handler for the exception. If a handler is found, it then
333 calls the cleanup handler for each frame above the handler which has a
334 cleanup handler before calling the catch handler. These calls are all made
335 from a stack context different from the original frame in which the handler
336 is defined. Therefore, it is necessary to outline these handlers from their
337 original context before code generation.
338
339 Catch handlers are called with a pointer to the handler itself as the first
340 argument and a pointer to the parent function's stack frame as the second
341 argument. The catch handler uses the `llvm.localrecover
342 `_ to get a
343 pointer to a frame allocation block that is created in the parent frame using
344 the `llvm.localescape
345 `_ intrinsic.
346 The ``WinEHPrepare`` pass will have created a structure definition for the
347 contents of this block. The first two members of the structure will always be
348 (1) a 32-bit integer that the runtime uses to track the exception state of the
349 parent frame for the purposes of handling chained exceptions and (2) a pointer
350 to the object associated with the exception (roughly, the parameter of the
351 catch clause). These two members will be followed by any frame variables from
352 the parent function which must be accessed in any of the functions unwind or
353 catch handlers. The catch handler returns the address at which execution
354 should continue.
355
356 Cleanup handlers perform any cleanup necessary as the frame goes out of scope,
357 such as calling object destructors. The runtime handles the actual unwinding
358 of the stack. If an exception occurs in a cleanup handler the runtime manages
359 termination of the process. Cleanup handlers are called with the same arguments
360 as catch handlers (a pointer to the handler and a pointer to the parent stack
361 frame) and use the same mechanism described above to access frame variables
362 in the parent function. Cleanup handlers do not return a value.
363
364 The IR generated for Windows runtime based C++ exception handling is initially
365 very similar to the ``landingpad`` mechanism described above. Calls to
366 libc++abi functions (such as ``__cxa_begin_catch``/``__cxa_end_catch`` and
367 ``__cxa_throw_exception``) are replaced with calls to intrinsics or Windows
368 runtime functions (such as ``llvm.eh.begincatch``/``llvm.eh.endcatch`` and
369 ``__CxxThrowException``).
370
371 During the WinEHPrepare pass, the handler functions are outlined into handler
372 functions and the original landing pad code is replaced with a call to the
373 ``llvm.eh.actions`` intrinsic that describes the order in which handlers will
374 be processed from the logical location of the landing pad and an indirect
375 branch to the return value of the ``llvm.eh.actions`` intrinsic. The
376 ``llvm.eh.actions`` intrinsic is defined as returning the address at which
377 execution will continue. This is a temporary construct which will be removed
378 before code generation, but it allows for the accurate tracking of control
379 flow until then.
380
381 A typical landing pad will look like this after outlining:
382
383 .. code-block:: llvm
384
385 lpad:
386 %vals = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*)
387 cleanup
388 catch i8* bitcast (i8** @_ZTIi to i8*)
389 catch i8* bitcast (i8** @_ZTIf to i8*)
390 %recover = call i8* (...)* @llvm.eh.actions(
391 i32 3, i8* bitcast (i8** @_ZTIi to i8*), i8* (i8*, i8*)* @_Z4testb.catch.1)
392 i32 2, i8* null, void (i8*, i8*)* @_Z4testb.cleanup.1)
393 i32 1, i8* bitcast (i8** @_ZTIf to i8*), i8* (i8*, i8*)* @_Z4testb.catch.0)
394 i32 0, i8* null, void (i8*, i8*)* @_Z4testb.cleanup.0)
395 indirectbr i8* %recover, [label %try.cont1, label %try.cont2]
396
397 In this example, the landing pad represents an exception handling context with
398 two catch handlers and a cleanup handler that have been outlined. If an
399 exception is thrown with a type that matches ``_ZTIi``, the ``_Z4testb.catch.1``
400 handler will be called and no clean-up is needed. If an exception is thrown
401 with a type that matches ``_ZTIf``, first the ``_Z4testb.cleanup.1`` handler
402 will be called to perform unwind-related cleanup, then the ``_Z4testb.catch.0``
403 handler will be called. If an exception is thrown which does not match either
404 of these types and the exception is handled by another frame further up the
405 call stack, first the ``_Z4testb.cleanup.1`` handler will be called, then the
406 ``_Z4testb.cleanup.0`` handler (which corresponds to a different scope) will be
407 called, and exception handling will continue at the next frame in the call
408 stack. One of the catch handlers will return the address of
409 ``%try.cont1`` in the parent function and the other will return the address of
410 ``%try.cont2``, meaning that execution continues at one of those blocks after
411 an exception is caught.
412
413
414316 Exception Handling Intrinsics
415317 =============================
416318
496398 When used in the native Windows C++ exception handling implementation, this
497399 intrinsic serves as a placeholder to delimit code before a catch handler is
498400 outlined. After the handler is outlined, this intrinsic is simply removed.
499
500 .. _llvm.eh.actions:
501
502 ``llvm.eh.actions``
503 ----------------------
504
505 .. code-block:: llvm
506
507 void @llvm.eh.actions()
508
509 This intrinsic represents the list of actions to take when an exception is
510 thrown. It is typically used by Windows exception handling schemes where cleanup
511 outlining is required by the runtime. The arguments are a sequence of ``i32``
512 sentinels indicating the action type followed by some pre-determined number of
513 arguments required to implement that action.
514
515 A code of ``i32 0`` indicates a cleanup action, which expects one additional
516 argument. The argument is a pointer to a function that implements the cleanup
517 action.
518
519 A code of ``i32 1`` indicates a catch action, which expects three additional
520 arguments. Different EH schemes give different meanings to the three arguments,
521 but the first argument indicates whether the catch should fire, the second is
522 the localescape index of the exception object, and the third is the code to run
523 to catch the exception.
524
525 For Windows C++ exception handling, the first argument for a catch handler is a
526 pointer to the RTTI type descriptor for the object to catch. The second
527 argument is an index into the argument list of the ``llvm.localescape`` call in
528 the main function. The exception object will be copied into the provided stack
529 object. If the exception object is not required, this argument should be -1.
530 The third argument is a pointer to a function implementing the catch. This
531 function returns the address of the basic block where execution should resume
532 after handling the exception.
533
534 For Windows SEH, the first argument is a pointer to the filter function, which
535 indicates if the exception should be caught or not. The second argument is
536 typically negative one. The third argument is the address of a basic block
537 where the exception will be handled. In other words, catch handlers are not
538 outlined in SEH. After running cleanups, execution immediately resumes at this
539 PC.
540
541 In order to preserve the structure of the CFG, a call to '``llvm.eh.actions``'
542 must be followed by an ':ref:`indirectbr `' instruction that
543 jumps to the result of the intrinsic call.
544401
545402
546403 SJLJ Intrinsics
627484 exception handling frame that defines information common to all functions in the
628485 unit.
629486
487 The format of this call frame information (CFI) is often platform-dependent,
488 however. ARM, for example, defines their own format. Apple has their own compact
489 unwind info format. On Windows, another format is used for all architectures
490 since 32-bit x86. LLVM will emit whatever information is required by the
491 target.
492
630493 Exception Tables
631494 ----------------
632495
633496 An exception table contains information about what actions to take when an
634 exception is thrown in a particular part of a function's code. There is one
635 exception table per function, except leaf functions and functions that have
636 calls only to non-throwing functions. They do not need an exception table.
497 exception is thrown in a particular part of a function's code. This is typically
498 referred to as the language-specific data area (LSDA). The format of the LSDA
499 table is specific to the personality function, but the majority of personalities
500 out there use a variation of the tables consumed by ``__gxx_personality_v0``.
501 There is one exception table per function, except leaf functions and functions
502 that have calls only to non-throwing functions. They do not need an exception
503 table.
504
505 .. _wineh:
506
507 Exception Handling using the Windows Runtime
508 =================================================
509
510 (Note: Windows C++ exception handling support is a work in progress and is not
511 yet fully implemented. The text below describes how it will work when
512 completed.)
513
514 Background on Windows exceptions
515 ---------------------------------
516
517 Interacting with exceptions on Windows is significantly more complicated than on
518 Itanium C++ ABI platforms. The fundamental difference between the two models is
519 that Itanium EH is designed around the idea of "successive unwinding," while
520 Windows EH is not.
521
522 Under Itanium, throwing an exception typically involves allocating thread local
523 memory to hold the exception, and calling into the EH runtime. The runtime
524 identifies frames with appropriate exception handling actions, and successively
525 resets the register context of the current thread to the most recently active
526 frame with actions to run. In LLVM, execution resumes at a ``landingpad``
527 instruction, which produces register values provided by the runtime. If a
528 function is only cleaning up allocated resources, the function is responsible
529 for calling ``_Unwind_Resume`` to transition to the next most recently active
530 frame after it is finished cleaning up. Eventually, the frame responsible for
531 handling the exception calls ``__cxa_end_catch`` to destroy the exception,
532 release its memory, and resume normal control flow.
533
534 The Windows EH model does not use these successive register context resets.
535 Instead, the active exception is typically described by a frame on the stack.
536 In the case of C++ exceptions, the exception object is allocated in stack memory
537 and its address is passed to ``__CxxThrowException``. General purpose structured
538 exceptions (SEH) are more analogous to Linux signals, and they are dispatched by
539 userspace DLLs provided with Windows. Each frame on the stack has an assigned EH
540 personality routine, which decides what actions to take to handle the exception.
541 There are a few major personalities for C and C++ code: the C++ personality
542 (``__CxxFrameHandler3``) and the SEH personalities (``_except_handler3``,
543 ``_except_handler4``, and ``__C_specific_handler``). All of them implement
544 cleanups by calling back into a "funclet" contained in the parent function.
545
546 Funclets, in this context, are regions of the parent function that can be called
547 as though they were a function pointer with a very special calling convention.
548 The frame pointer of the parent frame is passed into the funclet either using
549 the standard EBP register or as the first parameter register, depending on the
550 architecture. The funclet implements the EH action by accessing local variables
551 in memory through the frame pointer, and returning some appropriate value,
552 continuing the EH process. No variables live in to or out of the funclet can be
553 allocated in registers.
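
As a rough illustration of the funclet contract described above, the following portable C++ sketch models a cleanup funclet as an ordinary function that receives only a pointer to the parent's frame. It is not how LLVM or the Windows runtime implement funclets: ``ParentFrame``, ``cleanup_funclet``, and ``run_cleanup_demo`` are hypothetical names, and a real funclet uses a special calling convention rather than a normal call.

```cpp
#include <cassert>

// Hypothetical stand-in for the parent function's stack frame. Locals that a
// funclet needs must live in memory, so they are modeled as struct fields.
struct ParentFrame {
    int open_handles;  // a local variable of the parent function
    bool cleaned_up;   // set by the cleanup funclet
};

// Models a cleanup funclet: it receives only the parent's frame pointer and
// reaches every local through it, never through values held in registers.
void cleanup_funclet(void *frame_pointer) {
    ParentFrame *frame = static_cast<ParentFrame *>(frame_pointer);
    frame->open_handles = 0;
    frame->cleaned_up = true;
}

// Models the runtime calling back into the parent function's funclet.
// Returns the number of handles still open after cleanup, or -1 on failure.
int run_cleanup_demo() {
    ParentFrame frame = {3, false};
    cleanup_funclet(&frame);
    return frame.cleaned_up ? frame.open_handles : -1;
}
```

The point of the sketch is the data flow: everything the funclet reads or writes goes through memory addressed from the frame pointer, which is why such values cannot be kept in registers across the funclet boundary.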
554
555 The C++ personality also uses funclets to contain the code for catch blocks
556 (i.e. all user code between the braces in ``catch (Type obj) { ... }``). The
557 runtime must use funclets for catch bodies because the C++ exception object is
558 allocated in a child stack frame of the function handling the exception. If the
559 runtime rewound the stack back to the frame of the catch, the memory holding the
560 exception would be overwritten quickly by subsequent function calls. The use of
561 funclets also allows ``__CxxFrameHandler3`` to implement rethrow without
562 resorting to TLS. Instead, the runtime throws a special exception, and then uses
563 SEH (``__try / __except``) to resume execution with new information in the child
564 frame.
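
The language-level behavior that the runtime has to preserve here can be seen in standard C++, independent of any platform. The sketch below (``rethrow_demo`` is a hypothetical name) modifies the exception object inside an inner handler and rethrows it with ``throw;``. Because a rethrow reactivates the existing exception object instead of copying it, the object must stay alive for the whole catch body.

```cpp
#include <cassert>

// Throws an int, modifies it in an inner handler, and rethrows with `throw;`.
// Returns the value observed by the outer handler.
int rethrow_demo() {
    try {
        try {
            throw 1;
        } catch (int &e) {
            e = 42;  // modifies the exception object itself
            throw;   // rethrows the same object, not a copy
        }
    } catch (int &e) {
        return e;
    }
    return -1;  // not reached
}
```

On an implementation where the exception object lives in the throwing frame's stack memory, this only works if that memory is not reclaimed while the inner catch body is still running.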
565
566 In other words, the successive unwinding approach is incompatible with Visual
567 C++ exceptions and general purpose Windows exception handling. Because the C++
568 exception object lives in stack memory, LLVM cannot provide a custom personality
569 function that uses landingpads. Similarly, SEH does not provide any mechanism
570 to rethrow an exception or continue unwinding. Therefore, LLVM must use the IR
571 constructs described later in this document to implement compatible exception
572 handling.
573
574 SEH filter expressions
575 -----------------------
576
577 The SEH personality functions also use funclets to implement filter expressions,
578 which allow executing arbitrary user code to decide which exceptions to catch.
579 Filter expressions should not be confused with the ``filter`` clause of the LLVM
580 ``landingpad`` instruction. Typically filter expressions are used to determine
581 if the exception came from a particular DLL or code region, or if code faulted
582 while accessing a particular memory address range. LLVM does not currently have
583 IR to represent filter expressions because it is difficult to represent their
584 control dependencies. Filter expressions run during the first phase of EH,
585 before cleanups run, making it very difficult to build a faithful control flow
586 graph. For now, the new EH instructions cannot represent SEH filter
587 expressions, and frontends must outline them ahead of time. Local variables of
588 the parent function can be escaped and accessed using the ``llvm.localescape``
589 and ``llvm.localrecover`` intrinsics.
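
The ordering problem described above can be made concrete with a small model in portable C++. This is only a sketch of two-phase dispatch under simplifying assumptions, not the Windows runtime: ``Frame``, ``dispatch``, and ``dispatch_demo`` are hypothetical names, filters are reduced to a precomputed boolean, and an unhandled exception simply stops after the first phase.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one stack frame's EH actions; not a Windows API.
struct Frame {
    std::string name;
    bool has_filter;      // frame has an SEH-style filter expression
    bool filter_accepts;  // result the filter would produce
    bool has_cleanup;     // frame has a cleanup action
};

// Frames are ordered innermost first. Phase one runs filters until one
// accepts; phase two runs the cleanups of every frame below the handler.
std::vector<std::string> dispatch(const std::vector<Frame> &frames) {
    std::vector<std::string> log;
    std::size_t handler = frames.size();
    for (std::size_t i = 0; i < frames.size(); ++i) {
        if (!frames[i].has_filter) continue;
        log.push_back("filter:" + frames[i].name);
        if (frames[i].filter_accepts) { handler = i; break; }
    }
    if (handler == frames.size()) return log;  // unhandled in this model
    for (std::size_t i = 0; i < handler; ++i)
        if (frames[i].has_cleanup) log.push_back("cleanup:" + frames[i].name);
    log.push_back("handler:" + frames[handler].name);
    return log;
}

// Three frames: a cleanup-only frame, a frame whose filter rejects the
// exception, and an outer frame whose filter accepts it.
std::vector<std::string> dispatch_demo() {
    return dispatch({{"inner", false, false, true},
                     {"middle", true, false, true},
                     {"outer", true, true, false}});
}
```

In the log produced by ``dispatch_demo``, both filters appear before either cleanup. That is the property that makes filter code hard to place in a control flow graph: it executes while all inner frames are still intact.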
590
591 New exception handling instructions
592 ------------------------------------
593
594 The primary design goal of the new EH instructions is to support funclet
595 generation while preserving information about the CFG so that SSA formation
596 still works. As a secondary goal, they are designed to be generic across MSVC
597 and Itanium C++ exceptions. They make very few assumptions about the data
598 required by the personality, so long as it uses the familiar core EH actions:
599 catch, cleanup, and terminate. However, the new instructions are hard to modify
600 without knowing details of the EH personality. While they can be used to
601 represent Itanium EH, the landingpad model is strictly better for optimization
602 purposes.
603
604 The following new instructions are considered "exception handling pads", in that
605 they must be the first non-phi instruction of a basic block that may be the
606 unwind destination of an invoke: ``catchpad``, ``cleanuppad``, and
607 ``terminatepad``. As with landingpads, when entering a try scope, if the
608 frontend encounters a call site that may throw an exception, it should emit an
609 invoke that unwinds to a ``catchpad`` block. Similarly, inside the scope of a
610 C++ object with a destructor, invokes should unwind to a ``cleanuppad``. The
611 ``terminatepad`` instruction exists to represent ``noexcept`` and throw
612 specifications with one combined instruction. All potentially throwing calls in
613 a ``noexcept`` function should transitively unwind to a ``terminatepad`` block. Throw
614 specifications are not implemented by MSVC, and are not yet supported.
615
616 Each of these new EH pad instructions has a label operand that indicates which
617 action should be considered after this action. The ``catchpad`` and
618 ``terminatepad`` instructions are terminators, and this label is considered to
619 be an unwind destination analogous to the unwind destination of an invoke. The
620 ``cleanuppad`` instruction is different from the other two in that it is not a
621 terminator, and this label operand is not an edge in the CFG. The code inside a
622 cleanuppad runs before transferring control to the next action, so the
623 ``cleanupret`` instruction is the instruction that unwinds to the next EH pad.
624 All of these "unwind edges" may refer to a basic block that contains an EH pad
625 instruction, or they may simply unwind to the caller. Unwinding to the caller
626 has roughly the same semantics as the ``resume`` instruction in the
627 ``landingpad`` model. When inlining through an invoke, instructions that unwind
628 to the caller are hooked up to unwind to the unwind destination of the call
629 site.
630
631 Putting things together, here is a hypothetical lowering of some C++ that uses
632 all of the new IR instructions:
633
634 .. code-block:: c++
635
636 struct Cleanup {
637 Cleanup();
638 ~Cleanup();
639 int m;
640 };
641 void may_throw();
642 int f() noexcept {
643 try {
644 Cleanup obj;
645 may_throw();
646 } catch (int e) {
647 return e;
648 }
649 return 0;
650 }
651
652 .. code-block:: llvm
653
654 define i32 @f() nounwind personality i32 (...)* @__CxxFrameHandler3 {
655 entry:
656 %obj = alloca %struct.Cleanup, align 4
657 %e = alloca i32, align 4
658 %call = invoke %struct.Cleanup* @"\01??0Cleanup@@QEAA@XZ"(%struct.Cleanup* nonnull %obj)
659 to label %invoke.cont unwind label %lpad.catch
660
661 invoke.cont: ; preds = %entry
662 invoke void @"\01?may_throw@@YAXXZ"()
663 to label %invoke.cont.2 unwind label %lpad.cleanup
664
665 invoke.cont.2: ; preds = %invoke.cont
666 call void @"\01??_DCleanup@@QEAA@XZ"(%struct.Cleanup* nonnull %obj) nounwind
667 br label %return
668
669 return: ; preds = %invoke.cont.2, %catch
670 %retval.0 = phi i32 [ 0, %invoke.cont.2 ], [ %9, %catch ]
671 ret i32 %retval.0
672
673 ; EH scope code, ordered innermost to outermost:
674
675 lpad.cleanup: ; preds = %invoke.cont
676 cleanuppad [label %lpad.catch]
677 call void @"\01??_DCleanup@@QEAA@XZ"(%struct.Cleanup* nonnull %obj) nounwind
678 cleanupret unwind label %lpad.catch
679
680 lpad.catch: ; preds = %entry, %lpad.cleanup
681 catchpad void [%rtti.TypeDescriptor2* @"\01??_R0H@8", i32 0, i32* %e]
682 to label %catch unwind label %lpad.terminate
683
684 catch: ; preds = %lpad.catch
685 %9 = load i32, i32* %e, align 4
686 catchret label %return
687
688 lpad.terminate:
689 terminatepad [void ()* @"\01?terminate@@YAXXZ"]
690 unwind to caller
691 }