| ============================== |
| LLVM Language Reference Manual |
| ============================== |
| |
| .. contents:: |
| :local: |
| :depth: 4 |
| |
| Abstract |
| ======== |
| |
| This document is a reference manual for the LLVM assembly language. LLVM |
| is a Static Single Assignment (SSA) based representation that provides |
| type safety, low-level operations, flexibility, and the capability of |
| representing 'all' high-level languages cleanly. It is the common code |
| representation used throughout all phases of the LLVM compilation |
| strategy. |
| |
| Introduction |
| ============ |
| |
| The LLVM code representation is designed to be used in three different |
| forms: as an in-memory compiler IR, as an on-disk bitcode representation |
| (suitable for fast loading by a Just-In-Time compiler), and as a human |
| readable assembly language representation. This allows LLVM to provide a |
| powerful intermediate representation for efficient compiler |
| transformations and analysis, while providing a natural means to debug |
| and visualize the transformations. The three different forms of LLVM are |
| all equivalent. This document describes the human readable |
| representation and notation. |
| |
| The LLVM representation aims to be light-weight and low-level while |
| being expressive, typed, and extensible at the same time. It aims to be |
| a "universal IR" of sorts, by being at a low enough level that |
| high-level ideas may be cleanly mapped to it (similar to how |
| microprocessors are "universal IRs", allowing many source languages to |
| be mapped to them). By providing type information, LLVM can be used as |
| the target of optimizations: for example, through pointer analysis, it |
| can be proven that a C automatic variable is never accessed outside of |
| the current function, allowing it to be promoted to a simple SSA value |
| instead of a memory location. |
| |
| .. _wellformed: |
| |
| Well-Formedness |
| --------------- |
| |
| It is important to note that this document describes 'well formed' LLVM |
| assembly language. There is a difference between what the parser accepts |
| and what is considered 'well formed'. For example, the following |
| instruction is syntactically okay, but not well formed: |
| |
| .. code-block:: llvm |
| |
| %x = add i32 1, %x |
| |
| because the definition of ``%x`` does not dominate all of its uses. The |
| LLVM infrastructure provides a verification pass that may be used to |
| verify that an LLVM module is well formed. This pass is automatically |
| run by the parser after parsing input assembly and by the optimizer |
| before it outputs bitcode. The violations pointed out by the verifier |
| pass indicate bugs in transformation passes or input to the parser. |
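| |
| For contrast, here is a minimal sketch of a well-formed way to express a |
| value that depends on its own previous value. The function name |
| ``@count_to`` and the value names are hypothetical; the point is that the |
| ``phi`` node defines ``%x`` before the ``add`` uses it, so the definition |
| dominates the use: |
| |
| .. code-block:: llvm |
| |
| define i32 @count_to(i32 %n) { |
| entry: |
| br label %loop |
| |
| loop: |
| ; %x takes 0 on entry and %x.next on every later iteration. |
| %x = phi i32 [ 0, %entry ], [ %x.next, %loop ] |
| %x.next = add i32 %x, 1 |
| %done = icmp eq i32 %x.next, %n |
| br i1 %done, label %exit, label %loop |
| |
| exit: |
| ret i32 %x.next |
| } |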
| |
| .. _identifiers: |
| |
| Identifiers |
| =========== |
| |
| LLVM identifiers come in two basic types: global and local. Global |
| identifiers (functions, global variables) begin with the ``'@'`` |
| character. Local identifiers (register names, types) begin with the |
| ``'%'`` character. Additionally, there are three different formats for |
| identifiers, for different purposes: |
| |
| #. Named values are represented as a string of characters with their |
| prefix. For example, ``%foo``, ``@DivisionByZero``, |
| ``%a.really.long.identifier``. The actual regular expression used is |
| '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other |
| characters in their names can be surrounded with quotes. Special |
| characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII |
| code for the character in hexadecimal. In this way, any character can |
| be used in a named value, even quotes themselves. The ``"\01"`` prefix |
| can be used on global values to suppress mangling. |
| #. Unnamed values are represented as an unsigned numeric value with |
| their prefix. For example, ``%12``, ``@2``, ``%44``. |
| #. Constants, which are described in the section Constants_ below. |
| |
| LLVM requires that values start with a prefix for two reasons: Compilers |
| don't need to worry about name clashes with reserved words, and the set |
| of reserved words may be expanded in the future without penalty. |
| Additionally, unnamed identifiers allow a compiler to quickly come up |
| with a temporary variable without having to avoid symbol table |
| conflicts. |
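| |
| The identifier formats above can be sketched with a few global |
| definitions. All names here are made up for illustration: |
| |
| .. code-block:: llvm |
| |
| ; Quoted name containing a character outside the plain identifier set. |
| @"total count" = global i32 0 |
| ; \22 is the hexadecimal ASCII code for a double quote. |
| @"say \22hi\22" = global i32 0 |
| ; The \01 prefix suppresses mangling of the symbol name. |
| @"\01_raw_symbol" = global i32 0 |
| ; Unnamed global: the prefix followed by an unsigned number. |
| @0 = global i32 7 |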
| |
| Reserved words in LLVM are very similar to reserved words in other |
| languages. There are keywords for different opcodes ('``add``', |
| '``bitcast``', '``ret``', etc.), for primitive type names ('``void``', |
| '``i32``', etc.), and others. These reserved words cannot conflict |
| with variable names, because none of them start with a prefix character |
| (``'%'`` or ``'@'``). |
| |
| Here is an example of LLVM code to multiply the integer variable |
| '``%X``' by 8: |
| |
| The easy way: |
| |
| .. code-block:: llvm |
| |
| %result = mul i32 %X, 8 |
| |
| After strength reduction: |
| |
| .. code-block:: llvm |
| |
| %result = shl i32 %X, 3 |
| |
| And the hard way: |
| |
| .. code-block:: llvm |
| |
| %0 = add i32 %X, %X ; yields i32:%0 |
| %1 = add i32 %0, %0 ; yields i32:%1 |
| %result = add i32 %1, %1 |
| |
| This last way of multiplying ``%X`` by 8 illustrates several important |
| lexical features of LLVM: |
| |
| #. Comments are delimited with a '``;``' and go until the end of line. |
| #. Unnamed temporaries are created when the result of a computation is |
| not assigned to a named value. |
| #. Unnamed temporaries are numbered sequentially (using a per-function |
| incrementing counter, starting with 0). Note that basic blocks and unnamed |
| function parameters are included in this numbering. For example, if the |
| entry basic block is not given a label name and all function parameters are |
| named, then it will get number 0. |
| |
| It also shows a convention that we follow in this document. When |
| demonstrating instructions, we will follow an instruction with a comment |
| that defines the type and name of the value produced. |
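| |
| As a small sketch of the numbering rule for unnamed values, consider a |
| function (hypothetically named ``@sum``) whose parameters and entry block |
| are all unnamed. The parameters take numbers 0 and 1, the entry block takes |
| number 2, and the first unnamed instruction result is therefore ``%3``: |
| |
| .. code-block:: llvm |
| |
| define i32 @sum(i32, i32) { |
| ; Parameters are %0 and %1; the unlabeled entry block is %2. |
| %3 = add i32 %0, %1 ; yields i32:%3 |
| ret i32 %3 |
| } |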
| |
| High Level Structure |
| ==================== |
| |
| Module Structure |
| ---------------- |
| |
| LLVM programs are composed of ``Module``'s, each of which is a |
| translation unit of the input programs. Each module consists of |
| functions, global variables, and symbol table entries. Modules may be |
| combined together with the LLVM linker, which merges function (and |
| global variable) definitions, resolves forward declarations, and merges |
| symbol table entries. Here is an example of the "hello world" module: |
| |
| .. code-block:: llvm |
| |
| ; Declare the string constant as a global constant. |
| @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00" |
| |
| ; External declaration of the puts function |
| declare i32 @puts(i8* nocapture) nounwind |
| |
| ; Definition of main function |
| define i32 @main() { ; i32()* |
| ; Convert [13 x i8]* to i8*... |
| %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0 |
| |
| ; Call puts function to write out the string to stdout. |
| call i32 @puts(i8* %cast210) |
| ret i32 0 |
| } |
| |
| ; Named metadata |
| !0 = !{i32 42, null, !"string"} |
| !foo = !{!0} |
| |
| This example is made up of a :ref:`global variable <globalvars>` named |
| "``.str``", an external declaration of the "``puts``" function, a |
| :ref:`function definition <functionstructure>` for "``main``" and |
| :ref:`named metadata <namedmetadatastructure>` "``foo``". |
| |
| In general, a module is made up of a list of global values (where both |
| functions and global variables are global values). Global values are |
| represented by a pointer to a memory location (in this case, a pointer |
| to an array of char, and a pointer to a function), and have one of the |
| following :ref:`linkage types <linkage>`. |
| |
| .. _linkage: |
| |
| Linkage Types |
| ------------- |
| |
| All Global Variables and Functions have one of the following types of |
| linkage: |
| |
| ``private`` |
| Global values with "``private``" linkage are only directly |
| accessible by objects in the current module. In particular, linking |
| code into a module with a private global value may cause the |
| private global to be renamed as necessary to avoid collisions. Because the |
| symbol is private to the module, all references can be updated. This |
| doesn't show up in any symbol table in the object file. |
| ``internal`` |
| Similar to private, but the value shows as a local symbol |
| (``STB_LOCAL`` in the case of ELF) in the object file. This |
| corresponds to the notion of the '``static``' keyword in C. |
| ``available_externally`` |
| Globals with "``available_externally``" linkage are never emitted into |
| the object file corresponding to the LLVM module. From the linker's |
| perspective, an ``available_externally`` global is equivalent to |
| an external declaration. They exist to allow inlining and other |
| optimizations to take place given knowledge of the definition of the |
| global, which is known to be somewhere outside the module. Globals |
| with ``available_externally`` linkage are allowed to be discarded at |
| will, and allow inlining and other optimizations. This linkage type is |
| only allowed on definitions, not declarations. |
| ``linkonce`` |
| Globals with "``linkonce``" linkage are merged with other globals of |
| the same name when linkage occurs. This can be used to implement |
| some forms of inline functions, templates, or other code which must |
| be generated in each translation unit that uses it, but where the |
| body may be overridden with a more definitive definition later. |
| Unreferenced ``linkonce`` globals are allowed to be discarded. Note |
| that ``linkonce`` linkage does not actually allow the optimizer to |
| inline the body of this function into callers because it doesn't |
| know if this definition of the function is the definitive definition |
| within the program or whether it will be overridden by a stronger |
| definition. To enable inlining and other optimizations, use |
| "``linkonce_odr``" linkage. |
| ``weak`` |
| "``weak``" linkage has the same merging semantics as ``linkonce`` |
| linkage, except that unreferenced globals with ``weak`` linkage may |
| not be discarded. This is used for globals that are declared "weak" |
| in C source code. |
| ``common`` |
| "``common``" linkage is most similar to "``weak``" linkage, but they |
| are used for tentative definitions in C, such as "``int X;``" at |
| global scope. Symbols with "``common``" linkage are merged in the |
| same way as ``weak`` symbols, and they may not be deleted if |
| unreferenced. ``common`` symbols may not have an explicit section, |
| must have a zero initializer, and may not be marked |
| ':ref:`constant <globalvars>`'. Functions and aliases may not have |
| common linkage. |
| |
| .. _linkage_appending: |
| |
| ``appending`` |
| "``appending``" linkage may only be applied to global variables of |
| pointer to array type. When two global variables with appending |
| linkage are linked together, the two global arrays are appended |
| together. This is the LLVM, typesafe, equivalent of having the |
| system linker append together "sections" with identical names when |
| .o files are linked. |
| |
| Unfortunately this doesn't correspond to any feature in .o files, so it |
| can only be used for variables like ``llvm.global_ctors``, which LLVM |
| interprets specially. |
| |
| ``extern_weak`` |
| The semantics of this linkage follow the ELF object file model: the |
| symbol is weak until linked; if it is not linked, the symbol becomes null |
| instead of being an undefined reference. |
| ``linkonce_odr``, ``weak_odr`` |
| Some languages allow differing globals to be merged, such as two |
| functions with different semantics. Other languages, such as |
| ``C++``, ensure that only equivalent globals are ever merged (the |
| "one definition rule" --- "ODR"). Such languages can use the |
| ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the |
| global will only be merged with equivalent globals. These linkage |
| types are otherwise the same as their non-``odr`` versions. |
| ``external`` |
| If none of the above identifiers are used, the global is externally |
| visible, meaning that it participates in linkage and can be used to |
| resolve external symbol references. |
| |
| It is illegal for a function *declaration* to have any linkage type |
| other than ``external`` or ``extern_weak``. |
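| |
| The following sketch shows where the linkage keyword appears in a few |
| definitions and declarations. Every symbol name except |
| ``llvm.global_ctors`` is hypothetical, and the snippet uses the same typed |
| pointer syntax as the other examples in this document: |
| |
| .. code-block:: llvm |
| |
| @counter = internal global i32 0 ; like C 'static' |
| @table = private unnamed_addr constant [2 x i32] [i32 1, i32 2] |
| @tentative = common global i32 0 ; like C 'int tentative;' |
| @optional_data = extern_weak global i32 ; null if never defined |
| |
| define linkonce_odr i32 @inline_candidate() { |
| ret i32 1 |
| } |
| |
| define weak void @overridable_hook() { |
| ret void |
| } |
| |
| define internal void @init() { |
| ret void |
| } |
| |
| ; appending linkage on one of the specially interpreted variables |
| @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] |
| [{ i32, void ()*, i8* } { i32 65535, void ()* @init, i8* null }] |
| |
| ; Declarations may only be external (the default) or extern_weak. |
| declare extern_weak void @optional_fn() |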
| |
| .. _callingconv: |
| |
| Calling Conventions |
| ------------------- |
| |
| LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and |
| :ref:`invokes <i_invoke>` can all have an optional calling convention |
| specified for the call. The calling convention of any pair of dynamic |
| caller/callee must match, or the behavior of the program is undefined. |
| The following calling conventions are supported by LLVM, and more may be |
| added in the future: |
| |
| "``ccc``" - The C calling convention |
| This calling convention (the default if no other calling convention |
| is specified) matches the target C calling conventions. This calling |
| convention supports varargs function calls and tolerates some |
| mismatch in the declared prototype and implemented declaration of |
| the function (as does normal C). |
| "``fastcc``" - The fast calling convention |
| This calling convention attempts to make calls as fast as possible |
| (e.g. by passing things in registers). This calling convention |
| allows the target to use whatever tricks it wants to produce fast |
| code for the target, without having to conform to an externally |
| specified ABI (Application Binary Interface). `Tail calls can only |
| be optimized when this, the GHC or the HiPE convention is |
| used. <CodeGenerator.html#id80>`_ This calling convention does not |
| support varargs and requires the prototype of all callees to exactly |
| match the prototype of the function definition. |
| "``coldcc``" - The cold calling convention |
| This calling convention attempts to make code in the caller as |
| efficient as possible under the assumption that the call is not |
| commonly executed. As such, these calls often preserve all registers |
| so that the call does not break any live ranges in the caller side. |
| This calling convention does not support varargs and requires the |
| prototype of all callees to exactly match the prototype of the |
| function definition. Furthermore the inliner doesn't consider such function |
| calls for inlining. |
| "``cc 10``" - GHC convention |
| This calling convention has been implemented specifically for use by |
| the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_. |
| It passes everything in registers, going to extremes to achieve this |
| by disabling callee save registers. This calling convention should |
| not be used lightly but only for specific situations such as an |
| alternative to the *register pinning* performance technique often |
| used when implementing functional programming languages. At the |
| moment only X86 supports this convention and it has the following |
| limitations: |
| |
| - On *X86-32*, it only supports up to 4 bit type parameters. No |
| floating-point types are supported. |
| - On *X86-64*, it only supports up to 10 bit type parameters and 6 |
| floating-point parameters. |
| |
| This calling convention supports `tail call |
| optimization <CodeGenerator.html#id80>`_ but requires that both the |
| caller and the callee use it. |
| "``cc 11``" - The HiPE calling convention |
| This calling convention has been implemented specifically for use by |
| the `High-Performance Erlang |
| (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the* |
| native code compiler of the `Ericsson's Open Source Erlang/OTP |
| system <http://www.erlang.org/download.shtml>`_. It uses more |
| registers for argument passing than the ordinary C calling |
| convention and defines no callee-saved registers. The calling |
| convention properly supports `tail call |
| optimization <CodeGenerator.html#id80>`_ but requires that both the |
| caller and the callee use it. It uses a *register pinning* |
| mechanism, similar to GHC's convention, for keeping frequently |
| accessed runtime components pinned to specific hardware registers. |
| At the moment only X86 supports this convention (both 32 and 64 |
| bit). |
| "``webkit_jscc``" - WebKit's JavaScript calling convention |
| This calling convention has been implemented for `WebKit FTL JIT |
| <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the |
| stack right to left (as cdecl does), and returns a value in the |
| platform's customary return register. |
| "``anyregcc``" - Dynamic calling convention for code patching |
| This is a special convention that supports patching an arbitrary code |
| sequence in place of a call site. This convention forces the call |
| arguments into registers but allows them to be dynamically |
| allocated. This can currently only be used with calls to |
| ``llvm.experimental.patchpoint`` because only this intrinsic records |
| the location of its arguments in a side table. See :doc:`StackMaps`. |
| "``preserve_mostcc``" - The `PreserveMost` calling convention |
| This calling convention attempts to make the code in the caller as |
| unintrusive as possible. This convention behaves identically to the `C` |
| calling convention on how arguments and return values are passed, but it |
| uses a different set of caller/callee-saved registers. This alleviates the |
| burden of saving and recovering a large register set before and after the |
| call in the caller. If the arguments are passed in callee-saved registers, |
| then they will be preserved by the callee across the call. This doesn't |
| apply for values returned in callee-saved registers. |
| |
| - On X86-64 the callee preserves all general purpose registers, except for |
| R11. R11 can be used as a scratch register. Floating-point registers |
| (XMMs/YMMs) are not preserved and need to be saved by the caller. |
| |
| The idea behind this convention is to support calls to runtime functions |
| that have a hot path and a cold path. The hot path is usually a small piece |
| of code that doesn't use many registers. The cold path might need to call out to |
| another function and therefore only needs to preserve the caller-saved |
| registers, which haven't already been saved by the caller. The |
| `PreserveMost` calling convention is very similar to the `cold` calling |
| convention in terms of caller/callee-saved registers, but they are used for |
| different types of function calls. `coldcc` is for function calls that are |
| rarely executed, whereas `preserve_mostcc` function calls are intended to be |
| on the hot path and definitely executed a lot. Furthermore `preserve_mostcc` |
| doesn't prevent the inliner from inlining the function call. |
| |
| This calling convention will be used by a future version of the Objective-C |
| runtime and should therefore still be considered experimental at this time. |
| Although this convention was created to optimize certain runtime calls to |
| the Objective-C runtime, it is not limited to this runtime and might be used |
| by other runtimes in the future too. The current implementation only |
| supports X86-64, but the intention is to support more architectures in the |
| future. |
| "``preserve_allcc``" - The `PreserveAll` calling convention |
| This calling convention attempts to make the code in the caller even less |
| intrusive than the `PreserveMost` calling convention. This calling |
| convention also behaves identically to the `C` calling convention on how |
| arguments and return values are passed, but it uses a different set of |
| caller/callee-saved registers. This removes the burden of saving and |
| recovering a large register set before and after the call in the caller. If |
| the arguments are passed in callee-saved registers, then they will be |
| preserved by the callee across the call. This doesn't apply for values |
| returned in callee-saved registers. |
| |
| - On X86-64 the callee preserves all general purpose registers, except for |
| R11. R11 can be used as a scratch register. Furthermore it also preserves |
| all floating-point registers (XMMs/YMMs). |
| |
| The idea behind this convention is to support calls to runtime functions |
| that don't need to call out to any other functions. |
| |
| This calling convention, like the `PreserveMost` calling convention, will be |
| used by a future version of the Objective-C runtime and should be considered |
| experimental at this time. |
| "``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions |
| Clang generates an access function to access C++-style TLS. The access |
| function generally has an entry block, an exit block and an initialization |
| block that is run the first time. The entry and exit blocks can access |
| a few TLS IR variables; each access will be lowered to a platform-specific |
| sequence. |
| |
| This calling convention aims to minimize overhead in the caller by |
| preserving as many registers as possible (all the registers that are |
| preserved on the fast path, composed of the entry and exit blocks). |
| |
| This calling convention behaves identically to the `C` calling convention on |
| how arguments and return values are passed, but it uses a different set of |
| caller/callee-saved registers. |
| |
| Given that each platform has its own lowering sequence, hence its own set |
| of preserved registers, we can't use the existing `PreserveMost`. |
| |
| - On X86-64 the callee preserves all general purpose registers, except for |
| RDI and RAX. |
| "``swiftcc``" - This calling convention is used for the Swift language. |
| - On X86-64 RCX and R8 are available for additional integer returns, and |
| XMM2 and XMM3 are available for additional FP/vector returns. |
| - On iOS platforms, we use the AAPCS-VFP calling convention. |
| "``cc <n>``" - Numbered convention |
| Any calling convention may be specified by number, allowing |
| target-specific calling conventions to be used. Target specific |
| calling conventions start at 64. |
| |
| More calling conventions can be added/defined on an as-needed basis, to |
| support Pascal conventions or any other well-known target-independent |
| convention. |
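| |
| As an illustrative sketch, the calling convention keyword is written on the |
| function definition or declaration and repeated on each call site, so that |
| caller and callee agree. All function names below are hypothetical: |
| |
| .. code-block:: llvm |
| |
| define fastcc i32 @add_fast(i32 %a, i32 %b) { |
| %sum = add i32 %a, %b |
| ret i32 %sum |
| } |
| |
| define i32 @caller(i32 %x) { |
| ; The call site names the same convention as the callee. |
| %r = call fastcc i32 @add_fast(i32 %x, i32 1) |
| ret i32 %r |
| } |
| |
| ; Other conventions use the same position in a declaration. |
| declare coldcc void @report_error(i32) |
| declare preserve_mostcc void @runtime_slow_path(i8*) |
| declare cc 10 void @ghc_style(i64) |
| |
| Omitting the keyword on either side selects ``ccc``, so a call to |
| ``@add_fast`` without ``fastcc`` would be a caller/callee mismatch and |
| therefore undefined behavior. |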
| |
| .. _visibilitystyles: |
| |
| Visibility Styles |
| ----------------- |
| |
| All Global Variables and Functions have one of the following visibility |
| styles: |
| |
| "``default``" - Default style |
| On targets that use the ELF object file format, default visibility |
| means that the declaration is visible to other modules and, in |
| shared libraries, means that the declared entity may be overridden. |
| On Darwin, default visibility means that the declaration is visible |
| to other modules. Default visibility corresponds to "external |
| linkage" in the language. |
| "``hidden``" - Hidden style |
| Two declarations of an object with hidden visibility refer to the |
| same object if they are in the same shared object. Usually, hidden |
| visibility indicates that the symbol will not be placed into the |
| dynamic symbol table, so no other module (executable or shared |
| library) can reference it directly. |
| "``protected``" - Protected style |
| On ELF, protected visibility indicates that the symbol will be |
| placed in the dynamic symbol table, but that references within the |
| defining module will bind to the local symbol. That is, the symbol |
| cannot be overridden by another module. |
| |
| A symbol with ``internal`` or ``private`` linkage must have ``default`` |
| visibility. |
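| |
| A short sketch of how a visibility style is spelled in the IR, using |
| hypothetical symbol names. The keyword follows the linkage (omitted here, |
| so it is ``external``): |
| |
| .. code-block:: llvm |
| |
| @exported_state = default global i32 0 |
| @implementation_detail = hidden global i32 0 |
| |
| define protected void @api_entry() { |
| ret void |
| } |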
| |
| .. _dllstorageclass: |
| |
| DLL Storage Classes |
| ------------------- |
| |
| All Global Variables, Functions and Aliases can have one of the following |
| DLL storage classes: |
| |
| ``dllimport`` |
| "``dllimport``" causes the compiler to reference a function or variable via |
| a global pointer to a pointer that is set up by the DLL exporting the |
| symbol. On Microsoft Windows targets, the pointer name is formed by |
| combining ``__imp_`` and the function or variable name. |
| ``dllexport`` |
| "``dllexport``" causes the compiler to provide a global pointer to a pointer |
| in a DLL, so that it can be referenced with the ``dllimport`` attribute. On |
| Microsoft Windows targets, the pointer name is formed by combining |
| ``__imp_`` and the function or variable name. Since this storage class |
| exists for defining a dll interface, the compiler, assembler and linker know |
| it is externally referenced and must refrain from deleting the symbol. |
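| |
| The two storage classes might be used as in the following sketch. The |
| symbol names are hypothetical: |
| |
| .. code-block:: llvm |
| |
| ; Referenced through an import pointer set up by the exporting DLL. |
| @imported_var = external dllimport global i32 |
| declare dllimport void @imported_fn() |
| |
| ; Exported from this module so that other modules can dllimport it. |
| define dllexport void @exported_fn() { |
| ret void |
| } |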
| |
| .. _tls_model: |
| |
| Thread Local Storage Models |
| --------------------------- |
| |
| A variable may be defined as ``thread_local``, which means that it will |
| not be shared by threads (each thread will have a separate copy of the |
| variable). Not all targets support thread-local variables. Optionally, a |
| TLS model may be specified: |
| |
| ``localdynamic`` |
| For variables that are only used within the current shared library. |
| ``initialexec`` |
| For variables in modules that will not be loaded dynamically. |
| ``localexec`` |
| For variables defined in the executable and only used within it. |
| |
| If no explicit model is given, the "general dynamic" model is used. |
| |
| The models correspond to the ELF TLS models; see `ELF Handling For |
| Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for |
| more information on the circumstances under which the different models may |
| be used. The target may choose a different TLS model if the specified |
| model is not supported, or if a better choice of model can be made. |
| |
| A model can also be specified in an alias, but then it only governs how |
| the alias is accessed. It will not have any effect in the aliasee. |
| |
| For platforms without linker support for the ELF TLS models, the ``-femulated-tls`` |
| flag can be used to generate GCC compatible emulated TLS code. |
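| |
| A brief sketch of the ``thread_local`` spellings, with hypothetical variable |
| names. Without a parenthesized model the general dynamic model applies: |
| |
| .. code-block:: llvm |
| |
| @tls_default = thread_local global i32 0 ; general dynamic |
| @tls_library = thread_local(localdynamic) global i32 0 |
| @tls_program = thread_local(localexec) global i32 0 |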
| |
| .. _runtime_preemption_model: |
| |
| Runtime Preemption Specifiers |
| ----------------------------- |
| |
| Global variables, functions and aliases may have an optional runtime preemption |
| specifier. If a preemption specifier isn't given explicitly, then a |
| symbol is assumed to be ``dso_preemptable``. |
| |
| ``dso_preemptable`` |
| Indicates that the function or variable may be replaced by a symbol from |
| outside the linkage unit at runtime. |
| |
| ``dso_local`` |
| The compiler may assume that a function or variable marked as ``dso_local`` |
| will resolve to a symbol within the same linkage unit. Direct access will |
| be generated even if the definition is not within this compilation unit. |
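| |
| As a sketch, the specifier is written after the linkage. The names below |
| are hypothetical: |
| |
| .. code-block:: llvm |
| |
| @local_data = dso_local global i32 0 |
| |
| define dso_local void @defined_here() { |
| ret void |
| } |
| |
| ; Not defined in this module, but assumed to resolve within the linkage unit. |
| declare dso_local void @defined_elsewhere() |
| |
| ; Spelling out the default explicitly. |
| define dso_preemptable void @may_be_replaced() { |
| ret void |
| } |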
| |
| .. _namedtypes: |
| |
| Structure Types |
| --------------- |
| |
| LLVM IR allows you to specify both "identified" and "literal" :ref:`structure |
| types <t_struct>`. Literal types are uniqued structurally, but identified types |
| are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used |
| to forward declare a type that is not yet available. |
| |
| An example of an identified structure specification is: |
| |
| .. code-block:: llvm |
| |
| %mytype = type { %mytype*, i32 } |
| |
| Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only |
| literal types are uniqued in recent versions of LLVM. |
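| |
| The difference between the kinds of structure type can be sketched as |
| follows. The type and symbol names are hypothetical: |
| |
| .. code-block:: llvm |
| |
| ; Opaque type: forward declares a type whose body is not yet available. |
| %handle = type opaque |
| |
| ; Identified type: has a name and is never uniqued with other types. |
| %pair = type { i32, i32 } |
| |
| ; Literal type: written inline and uniqued by structure. |
| @literal_pair = global { i32, i32 } { i32 1, i32 2 } |
| @named_pair = global %pair { i32 3, i32 4 } |
| |
| declare void @consume(%handle*) |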
| |
| .. _nointptrtype: |
| |
| Non-Integral Pointer Type |
| ------------------------- |
| |
| Note: non-integral pointer types are a work in progress, and they should be |
| considered experimental at this time. |
| |
| LLVM IR optionally allows the frontend to denote pointers in certain address |
| spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`. |
| Non-integral pointer types represent pointers that have an *unspecified* bitwise |
| representation; that is, the integral representation may be target dependent or |
| unstable (not backed by a fixed integer). |
| |
| ``inttoptr`` instructions converting integers to non-integral pointer types are |
| ill-typed, and so are ``ptrtoint`` instructions converting values of |
| non-integral pointer types to integers. Vector versions of said instructions |
| are ill-typed as well. |
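| |
| A minimal sketch, assuming address space 1 is the one marked non-integral |
| through the ``ni`` component of the datalayout string. The function name is |
| hypothetical. Pointer arithmetic through ``getelementptr`` stays available, |
| while the conversion shown in the trailing comment would be rejected: |
| |
| .. code-block:: llvm |
| |
| target datalayout = "e-ni:1" |
| |
| define i8 addrspace(1)* @advance(i8 addrspace(1)* %p) { |
| %q = getelementptr i8, i8 addrspace(1)* %p, i64 8 |
| ret i8 addrspace(1)* %q |
| } |
| |
| ; Ill-typed for a non-integral address space: |
| ; %i = ptrtoint i8 addrspace(1)* %p to i64 |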
| |
| .. _globalvars: |
| |
| Global Variables |
| ---------------- |
| |
| Global variables define regions of memory allocated at compilation time |
| instead of run-time. |
| |
| Global variable definitions must be initialized. |
| |
| Global variables in other translation units can also be declared, in which |
| case they don't have an initializer. |
| |
| Either global variable definitions or declarations may have an explicit section |
| to be placed in and may have an optional explicit alignment specified. If there |
| is a mismatch between the explicit or inferred section information for the |
| variable declaration and its definition the resulting behavior is undefined. |
| |
| A variable may be defined as a global ``constant``, which indicates that |
| the contents of the variable will **never** be modified (enabling better |
| optimization, allowing the global data to be placed in the read-only |
| section of an executable, etc). Note that variables that need runtime |
| initialization cannot be marked ``constant`` as there is a store to the |
| variable. |
| |
| LLVM explicitly allows *declarations* of global variables to be marked |
| constant, even if the final definition of the global is not. This |
| capability can be used to enable slightly better optimization of the |
| program, but requires the language definition to guarantee that |
| optimizations based on the 'constantness' are valid for the translation |
| units that do not include the definition. |
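| |
| For illustration, with hypothetical names, a constant definition carries an |
| initializer, while a declaration marked ``constant`` does not: |
| |
| .. code-block:: llvm |
| |
| ; Definition: contents are never modified. |
| @version = constant i32 3 |
| |
| ; Declaration marked constant; the definition lives in another module. |
| @limit = external constant i32 |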
| |
| As SSA values, global variables define pointer values that are in scope |
| (i.e. they dominate) all basic blocks in the program. Global variables |
| always define a pointer to their "content" type because they describe a |
| region of memory, and all memory objects in LLVM are accessed through |
| pointers. |
| |
| Global variables can be marked with ``unnamed_addr`` which indicates |
| that the address is not significant, only the content. Constants marked |
| like this can be merged with other constants if they have the same |
| initializer. Note that a constant with significant address *can* be |
| merged with an ``unnamed_addr`` constant, the result being a constant |
| whose address is significant. |
| |
| If the ``local_unnamed_addr`` attribute is given, the address is known to |
| not be significant within the module. |
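| |
| As a sketch with hypothetical names, the two string constants below have the |
| same initializer and are both ``unnamed_addr``, so they are candidates for |
| merging: |
| |
| .. code-block:: llvm |
| |
| @greeting = private unnamed_addr constant [3 x i8] c"hi\00" |
| @salute = private unnamed_addr constant [3 x i8] c"hi\00" |
| |
| ; Address not significant within this module only. |
| @module_only = local_unnamed_addr constant i32 1 |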
| |
| A global variable may be declared to reside in a target-specific |
| numbered address space. For targets that support them, address spaces |
| may affect how optimizations are performed and/or what target |
| instructions are used to access the variable. The default address space |
| is zero. The address space qualifier must precede any other attributes. |
| |
| LLVM allows an explicit section to be specified for globals. If the |
| target supports it, it will emit globals to the section specified. |
| Additionally, the global can be placed in a comdat if the target has the necessary |
| support. |
| |
| External declarations may have an explicit section specified. Section |
| information is retained in LLVM IR for targets that make use of this |
| information. Attaching section information to an external declaration is an |
| assertion that its definition is located in the specified section. If the |
| definition is located in a different section, the behavior is undefined. |
| |
| By default, global initializers are optimized by assuming that global |
| variables defined within the module are not modified from their |
| initial values before the start of the global initializer. This is |
| true even for variables potentially accessible from outside the |
| module, including those with external linkage or appearing in |
| ``@llvm.used`` or dllexported variables. This assumption may be suppressed |
| by marking the variable with ``externally_initialized``. |
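| |
| For illustration, a hypothetical variable ``@patched_at_load`` whose value may |
| be set from outside the module before the global initializers run could be |
| written as in the sketch below, so that optimizers do not assume it still |
| holds its initial value: |
| |
| .. code-block:: llvm |
| |
| ; Optimizers may not assume this still holds 0 when it is first read. |
| @patched_at_load = externally_initialized global i32 0 |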
| |
| An explicit alignment may be specified for a global, which must be a |
| power of 2. If not present, or if the alignment is set to zero, the |
| alignment of the global is set by the target to whatever it feels |
| convenient. If an explicit alignment is specified, the global is forced |
| to have exactly that alignment. Targets and optimizers are not allowed |
| to over-align the global if the global has an assigned section. In this |
| case, the extra alignment could be observable: for example, code could |
| assume that the globals are densely packed in their section and try to |
| iterate over them as an array; alignment padding would break this |
| iteration. The maximum alignment is ``1 << 29``. |
| |
| Globals can also have a :ref:`DLL storage class <dllstorageclass>`, |
| an optional :ref:`runtime preemption specifier <runtime_preemption_model>`, |
| optional :ref:`global attributes <glattrs>`, and |
| an optional list of attached :ref:`metadata <metadata>`. |
| |
| Variables and aliases can have a |
| :ref:`Thread Local Storage Model <tls_model>`. |
| |
| Syntax:: |
| |
| @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility] |
| [DLLStorageClass] [ThreadLocal] |
| [(unnamed_addr|local_unnamed_addr)] [AddrSpace] |
| [ExternallyInitialized] |
| <global | constant> <Type> [<InitializerConstant>] |
| [, section "name"] [, comdat [($name)]] |
| [, align <Alignment>] (, !name !N)* |
| |
| For example, the following defines a global in a numbered address space |
| with an initializer, section, and alignment: |
| |
| .. code-block:: llvm |
| |
| @G = addrspace(5) constant float 1.0, section "foo", align 4 |
| |
| The following example just declares a global variable: |
| |
| .. code-block:: llvm |
| |
| @G = external global i32 |
| |
| The following example defines a thread-local global with the |
| ``initialexec`` TLS model: |
| |
| .. code-block:: llvm |
| |
| @G = thread_local(initialexec) global i32 0, align 4 |
| |
| .. _functionstructure: |
| |
| Functions |
| --------- |
| |
| LLVM function definitions consist of the "``define``" keyword, an |
| optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption |
| specifier <runtime_preemption_model>`, an optional :ref:`visibility |
| style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, |
| an optional :ref:`calling convention <callingconv>`, |
| an optional ``unnamed_addr`` attribute, a return type, an optional |
| :ref:`parameter attribute <paramattrs>` for the return type, a function |
| name, a (possibly empty) argument list (each with optional :ref:`parameter |
| attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`, |
| an optional section, an optional alignment, |
| an optional :ref:`comdat <langref_comdats>`, |
| an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`, |
| an optional :ref:`prologue <prologuedata>`, |
| an optional :ref:`personality <personalityfn>`, |
| an optional list of attached :ref:`metadata <metadata>`, |
| an opening curly brace, a list of basic blocks, and a closing curly brace. |
| |
| LLVM function declarations consist of the "``declare``" keyword, an |
| optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style |
| <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an |
| optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr`` |
| or ``local_unnamed_addr`` attribute, a return type, an optional :ref:`parameter |
| attribute <paramattrs>` for the return type, a function name, a possibly |
| empty list of arguments, an optional alignment, an optional :ref:`garbage |
| collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional |
| :ref:`prologue <prologuedata>`. |
| |
| A function definition contains a list of basic blocks, forming the CFG (Control |
| Flow Graph) for the function. Each basic block may optionally start with a label |
| (giving the basic block a symbol table entry), contains a list of instructions, |
| and ends with a :ref:`terminator <terminators>` instruction (such as a branch or |
| function return). If an explicit label is not provided, a block is assigned an |
| implicit numbered label, using the next value from the same counter as used for |
| unnamed temporaries (:ref:`see above<identifiers>`). For example, if a function |
| entry block does not have an explicit label, it will be assigned label "%0", |
| then the first unnamed temporary in that block will be "%1", etc. |
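| |
| A small sketch of this numbering (the function ``@inc`` is hypothetical): the |
| argument is named, so the unlabeled entry block takes ``%0`` and the first |
| unnamed temporary is ``%1``: |
| |
| .. code-block:: llvm |
| |
| define i32 @inc(i32 %x) { |
| ; The entry block has no explicit label, so it is implicitly %0. |
| %1 = add i32 %x, 1 |
| ret i32 %1 |
| } |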
| |
| The first basic block in a function is special in two ways: it is |
| immediately executed on entrance to the function, and it is not allowed |
| to have predecessor basic blocks (i.e. there can not be any branches to |
| the entry block of a function). Because the block can have no |
| predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`. |
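| |
| One consequence, sketched below with a hypothetical function ``@count_to``, is |
| that a loop cannot branch back to the entry block; the loop header must be a |
| separate block, which is then free to contain PHI nodes: |
| |
| .. code-block:: llvm |
| |
| define i32 @count_to(i32 %n) { |
| entry: |
| ; No predecessors and no PHI nodes here. |
| br label %loop |
| |
| loop: |
| ; Two predecessors: %entry on the first iteration, %loop afterwards. |
| %i = phi i32 [ 0, %entry ], [ %next, %loop ] |
| %next = add i32 %i, 1 |
| %done = icmp eq i32 %next, %n |
| br i1 %done, label %exit, label %loop |
| |
| exit: |
| ret i32 %next |
| } |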
| |
| LLVM allows an explicit section to be specified for functions. If the |
| target supports it, it will emit functions to the section specified. |
| Additionally, the function can be placed in a COMDAT. |
| |
| An explicit alignment may be specified for a function. If not present, |
| or if the alignment is set to zero, the alignment of the function is set |
| by the target to whatever it feels convenient. If an explicit alignment |
| is specified, the function is forced to have at least that much |
| alignment. All alignments must be a power of 2. |
| |
| If the ``unnamed_addr`` attribute is given, the address is known to not |
| be significant and two identical functions can be merged. |
| |
| If the ``local_unnamed_addr`` attribute is given, the address is known to |
| not be significant within the module. |
| |
| Syntax:: |
| |
| define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass] |
| [cconv] [ret attrs] |
| <ResultType> @<FunctionName> ([argument list]) |
| [(unnamed_addr|local_unnamed_addr)] [fn Attrs] [section "name"] |
| [comdat [($name)]] [align N] [gc] [prefix Constant] |
| [prologue Constant] [personality Constant] (!name !N)* { ... } |
| |
| The argument list is a comma separated sequence of arguments where each |
| argument is of the following form: |
| |
| Syntax:: |
| |
| <type> [parameter Attrs] [name] |
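| |
| As an illustration of how these pieces fit together, the sketch below defines |
| a hypothetical function ``@add_one``; the section name ``.text.hot`` and the |
| choice of attributes are assumptions made for the example, not recommendations: |
| |
| .. code-block:: llvm |
| |
| ; In order: linkage, calling convention, return attribute, return type, |
| ; name, argument list, unnamed_addr, function attribute, section, alignment. |
| define internal fastcc zeroext i8 @add_one(i8 zeroext %x) unnamed_addr nounwind section ".text.hot" align 16 { |
| %r = add i8 %x, 1 |
| ret i8 %r |
| } |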
| |
| |
| .. _langref_aliases: |
| |
| Aliases |
| ------- |
| |
| Aliases, unlike functions or variables, don't create any new data. They |
| are just a new symbol and metadata for an existing position. |
| |
| Aliases have a name and an aliasee that is either a global value or a |
| constant expression. |
| |
| Aliases may have an optional :ref:`linkage type <linkage>`, an optional |
| :ref:`runtime preemption specifier <runtime_preemption_model>`, an optional |
| :ref:`visibility style <visibility>`, an optional :ref:`DLL storage class |
| <dllstorageclass>` and an optional :ref:`tls model <tls_model>`. |
| |
| Syntax:: |
| |
| @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee> |
| |
| The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``, |
| ``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers |
| might not correctly handle dropping a weak symbol that is aliased. |
| |
| Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as |
| the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point |
| to the same content. |
| |
| If the ``local_unnamed_addr`` attribute is given, the address is known to |
| not be significant within the module. |
| |
| Since aliases are only a second name, some restrictions apply, of which |
| some can only be checked when producing an object file: |
| |
| * The expression defining the aliasee must be computable at assembly |
| time. Since it is just a name, no relocations can be used. |
| |
| * No alias in the expression can be weak as the possibility of the |
| intermediate alias being overridden cannot be represented in an |
| object file. |
| |
| * No global value in the expression can be a declaration, since that |
| would require a relocation, which is not possible. |
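| |
| A brief sketch using the syntax above (``@counter``, ``@impl`` and the alias |
| names are hypothetical). Both aliasees are definitions in the same module, as |
| the restrictions require: |
| |
| .. code-block:: llvm |
| |
| @counter = global i32 0 |
| |
| define void @impl() { |
| ret void |
| } |
| |
| ; A second name for the variable @counter. |
| @counter_alias = alias i32, i32* @counter |
| |
| ; A weak second name for the function @impl. |
| @impl_alias = weak alias void (), void ()* @impl |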
| |
| .. _langref_ifunc: |
| |
| IFuncs |
| ------- |
| |
| IFuncs, like aliases, don't create any new data or functions. They are just a new |
| symbol that the dynamic linker resolves at runtime by calling a resolver function. |
| |
| IFuncs have a name and a resolver, which is a function called by the dynamic linker |
| that returns the address of another function associated with the name. |
| |
| An IFunc may have an optional :ref:`linkage type <linkage>` and an optional |
| :ref:`visibility style <visibility>`. |
| |
| Syntax:: |
| |
| @<Name> = [Linkage] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver> |
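| |
| As a rough sketch (all names are hypothetical, and this assumes a target whose |
| dynamic linker supports ifuncs), the resolver below returns the address of |
| ``@foo_impl``, which the dynamic linker then associates with ``@foo``: |
| |
| .. code-block:: llvm |
| |
| define i32 @foo_impl(i32 %x) { |
| ret i32 %x |
| } |
| |
| ; Called by the dynamic linker; returns the implementation to use. |
| define internal i32 (i32)* @foo_resolver() { |
| ret i32 (i32)* @foo_impl |
| } |
| |
| @foo = ifunc i32 (i32), i32 (i32)* ()* @foo_resolver |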
| |
| |
| .. _langref_comdats: |
| |
| Comdats |
| ------- |
| |
| Comdat IR provides access to COFF and ELF object file COMDAT functionality. |
| |
| Comdats have a name which represents the COMDAT key. All global objects that |
| specify this key will only end up in the final object file if the linker chooses |
| that key over some other key. Aliases are placed in the same COMDAT that their |
| aliasee computes to, if any. |
| |
| Comdats have a selection kind to provide input on how the linker should |
| choose between keys in two different object files. |
| |
| Syntax:: |
| |
| $<Name> = comdat SelectionKind |
| |
| The selection kind must be one of the following: |
| |
| ``any`` |
| The linker may choose any COMDAT key; the choice is arbitrary. |
| ``exactmatch`` |
| The linker may choose any COMDAT key but the sections must contain the |
| same data. |
| ``largest`` |
| The linker will choose the section containing the largest COMDAT key. |
| ``noduplicates`` |
| The linker requires that only one section with this COMDAT key exist. |
| ``samesize`` |
| The linker may choose any COMDAT key but the sections must contain the |
| same amount of data. |
| |
| Note that the Mach-O platform doesn't support COMDATs, and ELF and WebAssembly |
| only support ``any`` as a selection kind. |
| |
| Here is an example of a COMDAT group where a function will only be selected if |
| the COMDAT key's section is the largest: |
| |
| .. code-block:: text |
| |
| $foo = comdat largest |
| @foo = global i32 2, comdat($foo) |
| |
| define void @bar() comdat($foo) { |
| ret void |
| } |
| |
| As a syntactic sugar the ``$name`` can be omitted if the name is the same as |
| the global name: |
| |
| .. code-block:: text |
| |
| $foo = comdat any |
| @foo = global i32 2, comdat |
| |
| |
| In a COFF object file, the ``largest`` example above will create a COMDAT section |
| with selection kind ``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the |
| ``@foo`` symbol and another COMDAT section with selection kind |
| ``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT |
| section and contains the contents of the ``@bar`` symbol. |
| |
| There are some restrictions on the properties of the global object. |
| It, or an alias to it, must have the same name as the COMDAT group when |
| targeting COFF. |
| The contents and size of this object may be used during link-time to determine |
| which COMDAT groups get selected depending on the selection kind. |
| Because the name of the object must match the name of the COMDAT group, the |
| linkage of the global object must not be local; local symbols can get renamed |
| if a collision occurs in the symbol table. |
| |
| The combined use of COMDATs and section attributes may yield surprising results. |
| For example: |
| |
| .. code-block:: text |
| |
| $foo = comdat any |
| $bar = comdat any |
| @g1 = global i32 42, section "sec", comdat($foo) |
| @g2 = global i32 42, section "sec", comdat($bar) |
| |
| From the object file perspective, this requires the creation of two sections |
| with the same name. This is necessary because both globals belong to different |
| COMDAT groups and COMDATs, at the object file level, are represented by |
| sections. |
| |
| Note that certain IR constructs like global variables and functions may |
| create COMDATs in the object file in addition to any which are specified using |
| COMDAT IR. This arises when the code generator is configured to emit globals |
| in individual sections (e.g. when ``-data-sections`` or ``-function-sections`` |
| is supplied to ``llc``). |
| |
| .. _namedmetadatastructure: |
| |
| Named Metadata |
| -------------- |
| |
| Named metadata is a collection of metadata. :ref:`Metadata |
| nodes <metadata>` (but not metadata strings) are the only valid |
| operands for a named metadata. |
| |
| #. Named metadata are represented as a string of characters with the |
| metadata prefix. The rules for metadata names are the same as for |
| identifiers, but quoted names are not allowed. ``"\xx"`` type escapes |
| are still valid, which allows any character to be part of a name. |
| |
| Syntax:: |
| |
| ; Some unnamed metadata nodes, which are referenced by the named metadata. |
| !0 = !{!"zero"} |
| !1 = !{!"one"} |
| !2 = !{!"two"} |
| ; A named metadata. |
| !name = !{!0, !1, !2} |
| |
| .. _paramattrs: |
| |
| Parameter Attributes |
| -------------------- |
| |
| The return type and each parameter of a function type may have a set of |
| *parameter attributes* associated with them. Parameter attributes are |
| used to communicate additional information about the result or |
| parameters of a function. Parameter attributes are considered to be part |
| of the function, not of the function type, so functions with different |
| parameter attributes can have the same function type. |
| |
| Parameter attributes are simple keywords that follow the type specified. |
| If multiple parameter attributes are needed, they are space separated. |
| For example: |
| |
| .. code-block:: llvm |
| |
| declare i32 @printf(i8* noalias nocapture, ...) |
| declare i32 @atoi(i8 zeroext) |
| declare signext i8 @returns_signed_char() |
| |
| Note that any attributes for the function itself (``nounwind``, |
| ``readonly``) come immediately after the argument list. |
| |
| Currently, only the following parameter attributes are defined: |
| |
| ``zeroext`` |
| This indicates to the code generator that the parameter or return |
| value should be zero-extended to the extent required by the target's |
| ABI by the caller (for a parameter) or the callee (for a return value). |
| ``signext`` |
| This indicates to the code generator that the parameter or return |
| value should be sign-extended to the extent required by the target's |
| ABI (which is usually 32 bits) by the caller (for a parameter) or |
| the callee (for a return value). |
| ``inreg`` |
| This indicates that this parameter or return value should be treated |
| in a special target-dependent fashion while emitting code for |
| a function call or return (usually, by putting it in a register as |
| opposed to memory, though some targets use it to distinguish between |
| two different kinds of registers). Use of this attribute is |
| target-specific. |
| ``byval`` |
| This indicates that the pointer parameter should really be passed by |
| value to the function. The attribute implies that a hidden copy of |
| the pointee is made between the caller and the callee, so the callee |
| is unable to modify the value in the caller. This attribute is only |
| valid on LLVM pointer arguments. It is generally used to pass |
| structs and arrays by value, but is also valid on pointers to |
| scalars. The copy is considered to belong to the caller not the |
| callee (for example, ``readonly`` functions should not write to |
| ``byval`` parameters). This is not a valid attribute for return |
| values. |
| |
| The byval attribute also supports specifying an alignment with the |
| align attribute. It indicates the alignment of the stack slot to |
| form and the known alignment of the pointer specified to the call |
| site. If the alignment is not specified, then the code generator |
| makes a target-specific assumption. |
| |
| .. _attr_inalloca: |
| |
| ``inalloca`` |
| |
| The ``inalloca`` argument attribute allows the caller to take the |
| address of outgoing stack arguments. An ``inalloca`` argument must |
| be a pointer to stack memory produced by an ``alloca`` instruction. |
| The alloca, or argument allocation, must also be tagged with the |
| inalloca keyword. Only the last argument may have the ``inalloca`` |
| attribute, and that argument is guaranteed to be passed in memory. |
| |
| An argument allocation may be used by a call at most once because |
| the call may deallocate it. The ``inalloca`` attribute cannot be |
| used in conjunction with other attributes that affect argument |
| storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The |
| ``inalloca`` attribute also disables LLVM's implicit lowering of |
| large aggregate return values, which means that frontend authors |
| must lower them with ``sret`` pointers. |
| |
| When the call site is reached, the argument allocation must have |
| been the most recent stack allocation that is still live, or the |
| behavior is undefined. It is possible to allocate additional stack |
| space after an argument allocation and before its call site, but it |
| must be cleared off with :ref:`llvm.stackrestore |
| <int_stackrestore>`. |
| |
| See :doc:`InAlloca` for more information on how to use this |
| attribute. |
| |
| ``sret`` |
| This indicates that the pointer parameter specifies the address of a |
| structure that is the return value of the function in the source |
| program. This pointer must be guaranteed by the caller to be valid: |
| loads and stores to the structure may be assumed by the callee not |
| to trap and to be properly aligned. This is not a valid attribute |
| for return values. |
| |
| .. _attr_align: |
| |
| ``align <n>`` |
| This indicates that the pointer value may be assumed by the optimizer to |
| have the specified alignment. |
| |
| Note that this attribute has additional semantics when combined with the |
| ``byval`` attribute. |
| |
| .. _noalias: |
| |
| ``noalias`` |
| This indicates that objects accessed via pointer values |
| :ref:`based <pointeraliasing>` on the argument or return value are not also |
| accessed, during the execution of the function, via pointer values not |
| *based* on the argument or return value. The attribute on a return value |
| also has additional semantics described below. The caller shares the |
| responsibility with the callee for ensuring that these requirements are met. |
| For further details, please see the discussion of the NoAlias response in |
| :ref:`alias analysis <Must, May, or No>`. |
| |
| Note that this definition of ``noalias`` is intentionally similar |
| to the definition of ``restrict`` in C99 for function arguments. |
| |
| For function return values, C99's ``restrict`` is not meaningful, |
| while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias`` |
| attribute on return values are stronger than the semantics of the attribute |
| when used on function arguments. On function return values, the ``noalias`` |
| attribute indicates that the function acts like a system memory allocation |
| function, returning a pointer to allocated storage disjoint from the |
| storage for any other object accessible to the caller. |
| |
| ``nocapture`` |
| This indicates that the callee does not make any copies of the |
| pointer that outlive the callee itself. This is not a valid |
| attribute for return values. Addresses used in volatile operations |
| are considered to be captured. |
| |
| .. _nest: |
| |
| ``nest`` |
| This indicates that the pointer parameter can be excised using the |
| :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid |
| attribute for return values and can only be applied to one parameter. |
| |
| ``returned`` |
| This indicates that the function always returns the argument as its return |
| value. This is a hint to the optimizer and code generator used when |
| generating the caller, allowing value propagation, tail call optimization, |
| and omission of register saves and restores in some cases; it is not |
| checked or enforced when generating the callee. The parameter and the |
| function return type must be valid operands for the |
| :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for |
| return values and can only be applied to one parameter. |
| |
| ``nonnull`` |
| This indicates that the parameter or return pointer is not null. This |
| attribute may only be applied to pointer typed parameters. This is not |
| checked or enforced by LLVM; if the parameter or return pointer is null, |
| the behavior is undefined. |
| |
| ``dereferenceable(<n>)`` |
| This indicates that the parameter or return pointer is dereferenceable. This |
| attribute may only be applied to pointer typed parameters. A pointer that |
| is dereferenceable can be loaded from speculatively without a risk of |
| trapping. The number of bytes known to be dereferenceable must be provided |
| in parentheses. It is legal for the number of bytes to be less than the |
| size of the pointee type. The ``nonnull`` attribute does not imply |
| dereferenceability (consider a pointer to one element past the end of an |
| array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in |
| ``addrspace(0)`` (which is the default address space). |
| |
| ``dereferenceable_or_null(<n>)`` |
| This indicates that the parameter or return value isn't both |
| non-null and non-dereferenceable (up to ``<n>`` bytes) at the same |
| time. All non-null pointers tagged with |
| ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``. |
| For address space 0 ``dereferenceable_or_null(<n>)`` implies that |
| a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``, |
| and in other address spaces ``dereferenceable_or_null(<n>)`` |
| implies that a pointer is at least one of ``dereferenceable(<n>)`` |
| or ``null`` (i.e. it may be both ``null`` and |
| ``dereferenceable(<n>)``). This attribute may only be applied to |
| pointer typed parameters. |
| |
| ``swiftself`` |
| This indicates that the parameter is the self/context parameter. This is not |
| a valid attribute for return values and can only be applied to one |
| parameter. |
| |
| ``swifterror`` |
| This attribute is intended to model and optimize Swift error handling. It |
| can be applied to a parameter with pointer to pointer type or a |
| pointer-sized alloca. At the call site, the actual argument that corresponds |
| to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or |
| the ``swifterror`` parameter of the caller. A ``swifterror`` value (either |
| the parameter or the alloca) can only be loaded from and stored to, or used as |
| a ``swifterror`` argument. This is not a valid attribute for return values |
| and can only be applied to one parameter. |
| |
| These constraints allow the calling convention to optimize access to |
| ``swifterror`` variables by associating them with a specific register at |
| call boundaries rather than placing them in memory. Since this does change |
| the calling convention, a function which uses the ``swifterror`` attribute |
| on a parameter is not ABI-compatible with one which does not. |
| |
| These constraints also allow LLVM to assume that a ``swifterror`` argument |
| does not alias any other memory visible within a function and that a |
| ``swifterror`` alloca passed as an argument does not escape. |
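| |
| To show several of the parameter attributes above in context, here is a hedged |
| sketch; the type ``%struct.Pair`` and all function names are invented for the |
| example: |
| |
| .. code-block:: llvm |
| |
| %struct.Pair = type { i64, i64 } |
| |
| ; sret: the caller passes the address where the result is written. |
| ; byval: the callee receives its own copy of the pointee. |
| define void @swap_pair(%struct.Pair* noalias sret %out, %struct.Pair* byval align 8 %in) { |
| %a.addr = getelementptr inbounds %struct.Pair, %struct.Pair* %in, i32 0, i32 0 |
| %b.addr = getelementptr inbounds %struct.Pair, %struct.Pair* %in, i32 0, i32 1 |
| %a = load i64, i64* %a.addr |
| %b = load i64, i64* %b.addr |
| %out.a = getelementptr inbounds %struct.Pair, %struct.Pair* %out, i32 0, i32 0 |
| %out.b = getelementptr inbounds %struct.Pair, %struct.Pair* %out, i32 0, i32 1 |
| store i64 %b, i64* %out.a |
| store i64 %a, i64* %out.b |
| ret void |
| } |
| |
| ; nonnull and dereferenceable(4): loading 4 bytes through %p cannot trap. |
| define i32 @read_i32(i32* nocapture nonnull dereferenceable(4) %p) { |
| %v = load i32, i32* %p |
| ret i32 %v |
| } |
| |
| ; noalias on the return value: behaves like an allocation function. |
| declare noalias i8* @my_alloc(i64) |
| |
| ; returned: the function always returns its first argument. |
| declare i8* @fill(i8* returned, i8, i64) |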
| |
| .. _gc: |
| |
| Garbage Collector Strategy Names |
| -------------------------------- |
| |
| Each function may specify a garbage collector strategy name, which is simply a |
| string: |
| |
| .. code-block:: llvm |
| |
| define void @f() gc "name" { ... } |
| |
| The supported values of *name* include those :ref:`built in to LLVM |
| <builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC |
| strategy will cause the compiler to alter its output in order to support the |
| named garbage collection algorithm. Note that LLVM itself does not contain a |
| garbage collector; this functionality is restricted to generating machine code |
| which can interoperate with a collector provided externally. |
| |
| .. _prefixdata: |
| |
| Prefix Data |
| ----------- |
| |
| Prefix data is data associated with a function which the code |
| generator will emit immediately before the function's entrypoint. |
| The purpose of this feature is to allow frontends to associate |
| language-specific runtime metadata with specific functions and make it |
| available through the function pointer while still allowing the |
| function pointer to be called. |
| |
| To access the data for a given function, a program may bitcast the |
| function pointer to a pointer to the constant's type and dereference |
| index -1. This implies that the IR symbol points just past the end of |
| the prefix data. For instance, take the example of a function annotated |
| with a single ``i32``, |
| |
| .. code-block:: llvm |
| |
| define void @f() prefix i32 123 { ... } |
| |
| The prefix data can be referenced as, |
| |
| .. code-block:: llvm |
| |
| %0 = bitcast void ()* @f to i32* |
| %a = getelementptr inbounds i32, i32* %0, i32 -1 |
| %b = load i32, i32* %a |
| |
| Prefix data is laid out as if it were an initializer for a global variable |
| of the prefix data's type. The function will be placed such that the |
| beginning of the prefix data is aligned. This means that if the size |
| of the prefix data is not a multiple of the alignment size, the |
| function's entrypoint will not be aligned. If alignment of the |
| function's entrypoint is desired, padding must be added to the prefix |
| data. |
| |
| A function may have prefix data but no body. This has similar semantics |
| to the ``available_externally`` linkage in that the data may be used by the |
| optimizers but will not be emitted in the object file. |
| |
| .. _prologuedata: |
| |
| Prologue Data |
| ------------- |
| |
| The ``prologue`` attribute allows arbitrary code (encoded as bytes) to |
| be inserted prior to the function body. This can be used for enabling |
| function hot-patching and instrumentation. |
| |
| To maintain the semantics of ordinary function calls, the prologue data must |
| have a particular format. Specifically, it must begin with a sequence of |
| bytes which decode to a sequence of machine instructions, valid for the |
| module's target, which transfer control to the point immediately succeeding |
| the prologue data, without performing any other visible action. This allows |
| the inliner and other passes to reason about the semantics of the function |
| definition without needing to reason about the prologue data. Obviously this |
| makes the format of the prologue data highly target dependent. |
| |
| A trivial example of valid prologue data for the x86 architecture is ``i8 144``, |
| which encodes the ``nop`` instruction: |
| |
| .. code-block:: text |
| |
| define void @f() prologue i8 144 { ... } |
| |
| Generally prologue data can be formed by encoding a relative branch instruction |
| which skips the metadata, as in this example of valid prologue data for the |
| x86_64 architecture, where the first two bytes encode ``jmp .+10``: |
| |
| .. code-block:: text |
| |
| %0 = type <{ i8, i8, i8* }> |
| |
| define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... } |
| |
| A function may have prologue data but no body. This has similar semantics |
| to the ``available_externally`` linkage in that the data may be used by the |
| optimizers but will not be emitted in the object file. |
| |
| .. _personalityfn: |
| |
| Personality Function |
| -------------------- |
| |
| The ``personality`` attribute permits functions to specify what function |
| to use for exception handling. |
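| |
| A minimal sketch, assuming the Itanium C++ ABI personality routine |
| ``__gxx_personality_v0`` and a hypothetical callee ``@may_throw``: |
| |
| .. code-block:: llvm |
| |
| declare i32 @__gxx_personality_v0(...) |
| declare void @may_throw() |
| |
| define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) { |
| entry: |
| invoke void @may_throw() |
| to label %ok unwind label %lpad |
| |
| ok: |
| ret void |
| |
| lpad: |
| ; The landing pad is interpreted by the personality function named above. |
| %lp = landingpad { i8*, i32 } |
| cleanup |
| resume { i8*, i32 } %lp |
| } |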
| |
| .. _attrgrp: |
| |
| Attribute Groups |
| ---------------- |
| |
| Attribute groups are groups of attributes that are referenced by objects within |
| the IR. They are important for keeping ``.ll`` files readable, because a lot of |
| functions will use the same set of attributes. In the degenerate case of a |
| ``.ll`` file that corresponds to a single ``.c`` file, the single attribute |
| group will capture the important command line flags used to build that file. |
| |
| An attribute group is a module-level object. To use an attribute group, an |
| object references the attribute group's ID (e.g. ``#37``). An object may refer |
| to more than one attribute group. In that situation, the attributes from the |
| different groups are merged. |
| |
| Here is an example of attribute groups for a function that should always be |
| inlined, has a stack alignment of 4, and which shouldn't use SSE instructions: |
| |
| .. code-block:: llvm |
| |
| ; Target-independent attributes: |
| attributes #0 = { alwaysinline alignstack=4 } |
| |
| ; Target-dependent attributes: |
| attributes #1 = { "no-sse" } |
| |
| ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse". |
| define void @f() #0 #1 { ... } |
| |
| .. _fnattrs: |
| |
| Function Attributes |
| ------------------- |
| |
| Function attributes are set to communicate additional information about |
| a function. Function attributes are considered to be part of the |
| function, not of the function type, so functions with different function |
| attributes can have the same function type. |
| |
| Function attributes are simple keywords that follow the argument list. |
| If multiple attributes are needed, they are space separated. For |
| example: |
| |
| .. code-block:: llvm |
| |
| define void @f() noinline { ... } |
| define void @f() alwaysinline { ... } |
| define void @f() alwaysinline optsize { ... } |
| define void @f() optsize { ... } |
| |
| ``alignstack(<n>)`` |
| This attribute indicates that, when emitting the prologue and |
| epilogue, the backend should forcibly align the stack pointer. |
| Specify the desired alignment, which must be a power of two, in |
| parentheses. |
| ``allocsize(<EltSizeParam>[, <NumEltsParam>])`` |
| This attribute indicates that the annotated function will always return at |
| least a given number of bytes (or null). Its arguments are zero-indexed |
| parameter numbers; if one argument is provided, then it's assumed that at |
| least ``CallSite.Args[EltSizeParam]`` bytes will be available at the |
| returned pointer. If two are provided, then it's assumed that |
| ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are |
| available. The referenced parameters must be integer types. No assumptions |
| are made about the contents of the returned block of memory. |
| ``alwaysinline`` |
| This attribute indicates that the inliner should attempt to inline |
| this function into callers whenever possible, ignoring any active |
| inlining size threshold for this caller. |
| ``builtin`` |
| This indicates that the callee function at a call site should be |
| recognized as a built-in function, even though the function's declaration |
| uses the ``nobuiltin`` attribute. This is only valid at call sites for |
| direct calls to functions that are declared with the ``nobuiltin`` |
| attribute. |
| ``cold`` |
| This attribute indicates that this function is rarely called. When |
| computing edge weights, basic blocks post-dominated by a cold |
| function call are also considered to be cold; and, thus, given low |
| weight. |
| ``convergent`` |
| In some parallel execution models, there exist operations that cannot be |
| made control-dependent on any additional values. We call such operations |
| ``convergent``, and mark them with this attribute. |
| |
| The ``convergent`` attribute may appear on functions or call/invoke |
| instructions. When it appears on a function, it indicates that calls to |
| this function should not be made control-dependent on additional values. |
| For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so |
| calls to this intrinsic cannot be made control-dependent on additional |
| values. |
| |
| When it appears on a call/invoke, the ``convergent`` attribute indicates |
| that we should treat the call as though we're calling a convergent |
| function. This is particularly useful on indirect calls; without this we |
| may treat such calls as though the target is non-convergent. |
| |
| The optimizer may remove the ``convergent`` attribute on functions when it |
| can prove that the function does not execute any convergent operations. |
| Similarly, the optimizer may remove ``convergent`` on calls/invokes when it |
| can prove that the call/invoke cannot call a convergent function. |
| ``inaccessiblememonly`` |
| This attribute indicates that the function may only access memory that |
| is not accessible by the module being compiled. This is a weaker form |
| of ``readnone``. If the function reads or writes other memory, the |
| behavior is undefined. |
| ``inaccessiblemem_or_argmemonly`` |
| This attribute indicates that the function may only access memory that is |
| either not accessible by the module being compiled, or is pointed to |
| by its pointer arguments. This is a weaker form of ``argmemonly``. If the |
| function reads or writes other memory, the behavior is undefined. |
| ``inlinehint`` |
| This attribute indicates that the source code contained a hint that |
| inlining this function is desirable (such as the "inline" keyword in |
| C/C++). It is just a hint; it imposes no requirements on the |
| inliner. |
| ``jumptable`` |
| This attribute indicates that the function should be added to a |
| jump-instruction table at code-generation time, and that all address-taken |
| references to this function should be replaced with a reference to the |
| appropriate jump-instruction-table function pointer. Note that this creates |
| a new pointer for the original function, which means that code that depends |
| on function-pointer identity can break. So, any function annotated with |
| ``jumptable`` must also be ``unnamed_addr``. |
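| |
| A minimal sketch of the required pairing, using a hypothetical function |
| ``@handler``: |
| |
| .. code-block:: llvm |
| |
| ; unnamed_addr is required because jumptable changes the function's address. |
| define void @handler() unnamed_addr jumptable { |
| ret void |
| } |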
| ``minsize`` |
| This attribute suggests that optimization passes and code generator |
| passes make choices that keep the code size of this function as small |
| as possible and perform optimizations that may sacrifice runtime |
| performance in order to minimize the size of the generated code. |
| ``naked`` |
| This attribute disables prologue / epilogue emission for the |
| function. This can have very system-specific consequences. |
| ``no-jump-tables`` |
| When this attribute is set to true, the jump tables and lookup tables that |
| can be generated from a switch case lowering are disabled. |
| ``nobuiltin`` |
| This indicates that the callee function at a call site is not recognized as |
| a built-in function. LLVM will retain the original call and not replace it |
| with equivalent code based on the semantics of the built-in function, unless |
| the call site uses the ``builtin`` attribute. This is valid at call sites |
| and on function declarations and definitions. |
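| |
| The interaction with ``builtin`` can be sketched as follows. The wrapper |
| function ``@alloc16`` is a hypothetical name chosen for illustration. |
| |
| .. code-block:: llvm |
| |
| ; Calls to this declaration are not treated as built-in calls by default. |
| declare i8* @malloc(i64) nobuiltin |
| |
| define i8* @alloc16() { |
| ; This direct call opts back in to built-in recognition. |
| %p = call i8* @malloc(i64 16) builtin |
| ret i8* %p |
| } |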
| ``noduplicate`` |
| This attribute indicates that calls to the function cannot be |
| duplicated. A call to a ``noduplicate`` function may be moved |
| within its parent function, but may not be duplicated within |
| its parent function. |
| |
| A function containing a ``noduplicate`` call may still |
| be an inlining candidate, provided that the call is not |
| duplicated by inlining. That implies that the function has |
| internal linkage and only has one call site, so the original |
| call is dead after inlining. |
| ``noimplicitfloat`` |
| This attribute disables implicit floating-point instructions. |
| ``noinline`` |
| This attribute indicates that the inliner should never inline this |
| function in any situation. This attribute may not be used together |
| with the ``alwaysinline`` attribute. |
| ``nonlazybind`` |
| This attribute suppresses lazy symbol binding for the function. This |
| may make calls to the function faster, at the cost of extra program |
| startup time if the function is not called during program startup. |
| ``noredzone`` |
| This attribute indicates that the code generator should not use a |
| red zone, even if the target-specific ABI normally permits it. |
| ``noreturn`` |
| This function attribute indicates that the function never returns |
| normally. This produces undefined behavior at runtime if the |
| function ever does dynamically return. |
| ``norecurse`` |
| This function attribute indicates that the function does not call itself |
| either directly or indirectly down any possible call path. This produces |
| undefined behavior at runtime if the function ever does recurse. |
| ``nounwind`` |
| This function attribute indicates that the function never raises an |
| exception. If the function does raise an exception, its runtime |
| behavior is undefined. However, functions marked ``nounwind`` may still |
| trap or generate asynchronous exceptions. Exception handling schemes |
| that are recognized by LLVM to handle asynchronous exceptions, such |
| as SEH, will still provide their implementation-defined semantics. |
| ``"null-pointer-is-valid"`` |
| If ``"null-pointer-is-valid"`` is set to ``"true"``, then the ``null`` address |
| in address space 0 is considered to be a valid address for memory loads and |
| stores. Analyses and optimizations should not treat dereferencing a |
| pointer to ``null`` as undefined behavior in this function. |
| Note: Comparing the address of a global variable to ``null`` may still |
| evaluate to false because of a limitation in querying this attribute inside |
| constant expressions. |
| ``optforfuzzing`` |
| This attribute indicates that this function should be optimized |
| for maximum fuzzing signal. |
| ``optnone`` |
| This function attribute indicates that most optimization passes will skip |
| this function, with the exception of interprocedural optimization passes. |
| Code generation defaults to the "fast" instruction selector. |
| This attribute cannot be used together with the ``alwaysinline`` |
| attribute; this attribute is also incompatible |
| with the ``minsize`` attribute and the ``optsize`` attribute. |
| |
| This attribute requires the ``noinline`` attribute to be specified on |
| the function as well, so the function is never inlined into any caller. |
| Only functions with the ``alwaysinline`` attribute are valid |
| candidates for inlining into the body of this function. |
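| |
| For illustration, a function that satisfies these constraints might look like |
| the sketch below. The function name ``@debug_me`` is hypothetical. |
| |
| .. code-block:: llvm |
| |
| ; optnone must be accompanied by noinline, and must not be combined with |
| ; alwaysinline, minsize, or optsize. |
| define i32 @debug_me(i32 %x) noinline optnone { |
| %y = add i32 %x, 1 |
| ret i32 %y |
| } |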
| ``optsize`` |
| This attribute suggests that optimization passes and code generator |
| passes make choices that keep the code size of this function low, |
| and otherwise do optimizations specifically to reduce code size as |
| long as they do not significantly impact runtime performance. |
| ``"patchable-function"`` |
| This attribute tells the code generator that the code |
| generated for this function needs to follow certain conventions that |
| make it possible for a runtime function to patch over it later. |
| The exact effect of this attribute depends on its string value, |
| for which there currently is one legal possibility: |
| |
| * ``"prologue-short-redirect"`` - This style of patchable |
| function is intended to support patching a function prologue to |
| redirect control away from the function in a thread safe |
| manner. It guarantees that the first instruction of the |
| function will be large enough to accommodate a short jump |
| instruction, and will be sufficiently aligned to allow being |
| fully changed via an atomic compare-and-swap instruction. |
| While the first requirement can be satisfied by inserting a large |
| enough NOP, LLVM can and will try to re-purpose an existing |
| instruction (i.e. one that would have to be emitted anyway) as |
| the patchable instruction larger than a short jump. |
| |
| ``"prologue-short-redirect"`` is currently only supported on |
| x86-64. |
| |
| This attribute by itself does not imply restrictions on |
| inter-procedural optimizations. All of the semantic effects the |
| patching may have must be separately conveyed via the linkage type. |
| ``"probe-stack"`` |
| This attribute indicates that the function will trigger a guard region |
| at the end of the stack. It ensures that each access to the stack is |
| no further than the size of the guard region from a previous |
| access of the stack. It takes one required string value, the name of |
| the stack probing function that will be called. |
| |
| If a function that has a ``"probe-stack"`` attribute is inlined into |
| a function with another ``"probe-stack"`` attribute, the resulting |
| function has the ``"probe-stack"`` attribute of the caller. If a |
| function that has a ``"probe-stack"`` attribute is inlined into a |
| function that has no ``"probe-stack"`` attribute at all, the resulting |
| function has the ``"probe-stack"`` attribute of the callee. |
| ``readnone`` |
| On a function, this attribute indicates that the function computes its |
| result (or decides to unwind an exception) based strictly on its arguments, |
| without dereferencing any pointer arguments or otherwise accessing |
| any mutable state (e.g. memory, control registers, etc) visible to |
| caller functions. It does not write through any pointer arguments |
| (including ``byval`` arguments) and never changes any state visible |
| to callers. This means that, while it cannot unwind exceptions by calling |
| the ``C++`` exception throwing methods (since they write to memory), there may |
| be non-``C++`` mechanisms that throw exceptions without writing to LLVM |
| visible memory. |
| |
| On an argument, this attribute indicates that the function does not |
| dereference that pointer argument, even though it may read or write the |
| memory that the pointer points to if accessed through other pointers. |
| |
| If a readnone function reads or writes memory visible to the program, or |
| has other side-effects, the behavior is undefined. If a function reads from |
| or writes to a readnone pointer argument, the behavior is undefined. |
| ``readonly`` |
| On a function, this attribute indicates that the function does not write |
| through any pointer arguments (including ``byval`` arguments) or otherwise |
| modify any state (e.g. memory, control registers, etc) visible to |
| caller functions. It may dereference pointer arguments and read |
| state that may be set in the caller. A readonly function always |
| returns the same value (or unwinds an exception identically) when |
| called with the same set of arguments and global state. This means that, while it |
| cannot unwind exceptions by calling the ``C++`` exception throwing methods |
| (since they write to memory), there may be non-``C++`` mechanisms that throw |
| exceptions without writing to LLVM visible memory. |
| |
| On an argument, this attribute indicates that the function does not write |
| through this pointer argument, even though it may write to the memory that |
| the pointer points to. |
| |
| If a readonly function writes memory visible to the program, or |
| has other side-effects, the behavior is undefined. If a function writes to |
| a readonly pointer argument, the behavior is undefined. |
| ``"stack-probe-size"`` |
| This attribute controls the behavior of stack probes: either |
| the ``"probe-stack"`` attribute, or ABI-required stack probes, if any. |
| It defines the size of the guard region. It ensures that if the function |
| may use more stack space than the size of the guard region, a stack probing |
| sequence will be emitted. It takes one required integer value, which |
| is 4096 by default. |
| |
| If a function that has a ``"stack-probe-size"`` attribute is inlined into |
| a function with another ``"stack-probe-size"`` attribute, the resulting |
| function has the ``"stack-probe-size"`` attribute that has the lower |
| numeric value. If a function that has a ``"stack-probe-size"`` attribute is |
| inlined into a function that has no ``"stack-probe-size"`` attribute |
| at all, the resulting function has the ``"stack-probe-size"`` attribute |
| of the callee. |
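| |
| As a sketch, both stack probing attributes can be written directly on a |
| function as string attributes. The probe function name ``__my_probe`` and the |
| function ``@big_frame`` are hypothetical. |
| |
| .. code-block:: llvm |
| |
| ; The frame is larger than the 8192-byte guard region, so a probing |
| ; sequence that calls __my_probe is expected to be emitted. |
| define void @big_frame() "probe-stack"="__my_probe" "stack-probe-size"="8192" { |
| %buf = alloca [16384 x i8] |
| ret void |
| } |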
| ``"no-stack-arg-probe"`` |
| This attribute disables ABI-required stack probes, if any. |
| ``writeonly`` |
| On a function, this attribute indicates that the function may write to but |
| does not read from memory. |
| |
| On an argument, this attribute indicates that the function may write to but |
| does not read through this pointer argument (even though it may read from |
| the memory that the pointer points to). |
| |
| If a writeonly function reads memory visible to the program, or |
| has other side-effects, the behavior is undefined. If a function reads |
| from a writeonly pointer argument, the behavior is undefined. |
| ``argmemonly`` |
| This attribute indicates that the only memory accesses inside the function are |
| loads and stores from objects pointed to by its pointer-typed arguments, |
| with arbitrary offsets. Or in other words, all memory operations in the |
| function can refer to memory only using pointers based on its function |
| arguments. |
| |
| Note that ``argmemonly`` can be used together with the ``readonly`` attribute |
| in order to specify that the function reads only from its arguments. |
| |
| If an argmemonly function reads or writes memory other than the pointer |
| arguments, or has other side-effects, the behavior is undefined. |
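| |
| The combinations described above might be declared as in the following |
| sketch. The function names ``@my_copy`` and ``@my_sum`` are hypothetical. |
| |
| .. code-block:: llvm |
| |
| ; Accesses only memory reachable from its two pointer arguments. |
| declare void @my_copy(i8* writeonly, i8* readonly, i64) argmemonly |
| |
| ; Reads, but never writes, memory reachable from its pointer argument. |
| declare i32 @my_sum(i32* readonly, i64) argmemonly readonly |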
| ``returns_twice`` |
| This attribute indicates that this function can return twice. The C |
| ``setjmp`` is an example of such a function. The compiler disables |
| some optimizations (like tail calls) in the caller of these |
| functions. |
| ``safestack`` |
| This attribute indicates that |
| `SafeStack <http://clang.llvm.org/docs/SafeStack.html>`_ |
| protection is enabled for this function. |
| |
| If a function that has a ``safestack`` attribute is inlined into a |
| function that doesn't have a ``safestack`` attribute or which has an |
| ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting |
| function will have a ``safestack`` attribute. |
| ``sanitize_address`` |
| This attribute indicates that AddressSanitizer checks |
| (dynamic address safety analysis) are enabled for this function. |
| ``sanitize_memory`` |
| This attribute indicates that MemorySanitizer checks (dynamic detection |
| of accesses to uninitialized memory) are enabled for this function. |
| ``sanitize_thread`` |
| This attribute indicates that ThreadSanitizer checks |
| (dynamic thread safety analysis) are enabled for this function. |
| ``sanitize_hwaddress`` |
| This attribute indicates that HWAddressSanitizer checks |
| (dynamic address safety analysis based on tagged pointers) are enabled for |
| this function. |
| ``speculatable`` |
| This function attribute indicates that the function does not have any |
| effects besides calculating its result and does not have undefined behavior. |
| Note that ``speculatable`` is not enough to conclude that along any |
| particular execution path the number of calls to this function will not be |
| externally observable. This attribute is only valid on functions |
| and declarations, not on individual call sites. If a function is |
| incorrectly marked as speculatable and really does exhibit |
| undefined behavior, the undefined behavior may be observed even |
| if the call site is dead code. |
| |
| ``ssp`` |
| This attribute indicates that the function should emit a stack |
| smashing protector. It is in the form of a "canary" --- a random value |
| placed on the stack before the local variables that's checked upon |
| return from the function to see if it has been overwritten. A |
| heuristic is used to determine if a function needs stack protectors |
| or not. The heuristic used will enable protectors for functions with: |
| |
| - Character arrays larger than ``ssp-buffer-size`` (default 8). |
| - Aggregates containing character arrays larger than ``ssp-buffer-size``. |
| - Calls to alloca() with variable sizes or constant sizes greater than |
| ``ssp-buffer-size``. |
| |
| Variables that are identified as requiring a protector will be arranged |
| on the stack such that they are adjacent to the stack protector guard. |
| |
| If a function that has an ``ssp`` attribute is inlined into a |
| function that doesn't have an ``ssp`` attribute, then the resulting |
| function will have an ``ssp`` attribute. |
| ``sspreq`` |
| This attribute indicates that the function should *always* emit a |
| stack smashing protector. This overrides the ``ssp`` function |
| attribute. |
| |
| Variables that are identified as requiring a protector will be arranged |
| on the stack such that they are adjacent to the stack protector guard. |
| The specific layout rules are: |
| |
| #. Large arrays and structures containing large arrays |
| (``>= ssp-buffer-size``) are closest to the stack protector. |
| #. Small arrays and structures containing small arrays |
| (``< ssp-buffer-size``) are 2nd closest to the protector. |
| #. Variables that have had their address taken are 3rd closest to the |
| protector. |
| |
| If a function that has an ``sspreq`` attribute is inlined into a |
| function that doesn't have an ``sspreq`` attribute or which has an |
| ``ssp`` or ``sspstrong`` attribute, then the resulting function will have |
| an ``sspreq`` attribute. |
| ``sspstrong`` |
| This attribute indicates that the function should emit a stack smashing |
| protector. This attribute causes a strong heuristic to be used when |
| determining if a function needs stack protectors. The strong heuristic |
| will enable protectors for functions with: |
| |
| - Arrays of any size and type. |
| - Aggregates containing an array of any size and type. |
| - Calls to alloca(). |
| - Local variables that have had their address taken. |
| |
| Variables that are identified as requiring a protector will be arranged |
| on the stack such that they are adjacent to the stack protector guard. |
| The specific layout rules are: |
| |
| #. Large arrays and structures containing large arrays |
| (``>= ssp-buffer-size``) are closest to the stack protector. |
| #. Small arrays and structures containing small arrays |
| (``< ssp-buffer-size``) are 2nd closest to the protector. |
| #. Variables that have had their address taken are 3rd closest to the |
| protector. |
| |
| This overrides the ``ssp`` function attribute. |
| |
| If a function that has an ``sspstrong`` attribute is inlined into a |
| function that doesn't have an ``sspstrong`` attribute, then the |
| resulting function will have an ``sspstrong`` attribute. |
| ``strictfp`` |
| This attribute indicates that the function was called from a scope that |
| requires strict floating-point semantics. LLVM will not attempt any |
| optimizations that require assumptions about the floating-point rounding |
| mode or that might alter the state of floating-point status flags that |
| might otherwise be set or cleared by calling this function. |
| ``"thunk"`` |
| This attribute indicates that the function will delegate to some other |
| function with a tail call. The prototype of a thunk should not be used for |
| optimization purposes. The caller is expected to cast the thunk prototype to |
| match the thunk target prototype. |
| ``uwtable`` |
| This attribute indicates that the ABI being targeted requires that |
| an unwind table entry be produced for this function even if we can |
| show that no exceptions pass through it. This is normally the case for |
| the ELF x86-64 ABI, but it can be disabled for some compilation |
| units. |
| ``nocf_check`` |
| This attribute indicates that no control-flow check will be performed on |
| the attributed entity. It disables ``-fcf-protection=<>`` for a specific |
| entity to fine-tune the hardware control-flow protection mechanism. The flag |
| is target independent and currently applies to a function or function |
| pointer. |
| ``shadowcallstack`` |
| This attribute indicates that the ShadowCallStack checks are enabled for |
| the function. The instrumentation checks that the return address for the |
| function has not changed between the function prolog and epilog. It is |
| currently x86_64-specific. |
| |
| .. _glattrs: |
| |
| Global Attributes |
| ----------------- |
| |
| Attributes may be set to communicate additional information about a global variable. |
| Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable |
| are grouped into a single :ref:`attribute group <attrgrp>`. |
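| |
| A minimal sketch of a global variable that refers to an attribute group is |
| shown below. The global ``@counter`` and the ``"my-key"`` string attribute are |
| hypothetical names used only for illustration. |
| |
| .. code-block:: llvm |
| |
| ; The global refers to attribute group #0. |
| @counter = global i32 0 #0 |
| |
| attributes #0 = { "my-key"="my-value" } |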
| |
| .. _opbundles: |
| |
| Operand Bundles |
| --------------- |
| |
| Operand bundles are tagged sets of SSA values that can be associated |
| with certain LLVM instructions (currently only ``call`` s and |
| ``invoke`` s). In a way they are like metadata, but dropping them is |
| incorrect and will change program semantics. |
| |
| Syntax:: |
| |
| operand bundle set ::= '[' operand bundle (, operand bundle )* ']' |
| operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')' |
| bundle operand ::= SSA value |
| tag ::= string constant |
| |
| Operand bundles are **not** part of a function's signature, and a |
| given function may be called from multiple places with different kinds |
| of operand bundles. This reflects the fact that the operand bundles |
| are conceptually a part of the ``call`` (or ``invoke``), not the |
| callee being dispatched to. |
| |
| Operand bundles are a generic mechanism intended to support |
| runtime-introspection-like functionality for managed languages. While |
| the exact semantics of an operand bundle depend on the bundle tag, |
| there are certain limitations to how much the presence of an operand |
| bundle can influence the semantics of a program. These restrictions |
| are described as the semantics of an "unknown" operand bundle. As |
| long as the behavior of an operand bundle is describable within these |
| restrictions, LLVM does not need to have special knowledge of the |
| operand bundle to not miscompile programs containing it. |
| |
| - The bundle operands for an unknown operand bundle escape in unknown |
| ways before control is transferred to the callee or invokee. |
| - Calls and invokes with operand bundles have unknown read / write |
| effect on the heap on entry and exit (even if the call target is |
| ``readnone`` or ``readonly``), unless they're overridden with |
| callsite specific attributes. |
| - An operand bundle at a call site cannot change the implementation |
| of the called function. Inter-procedural optimizations work as |
| usual as long as they take into account the first two properties. |
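| |
| As an illustrative sketch of the syntax, the call below carries one operand |
| bundle. The tag ``"my-tag"`` and the function names are hypothetical; because |
| the tag has no special meaning to LLVM, it falls under the "unknown" operand |
| bundle rules listed above. |
| |
| .. code-block:: llvm |
| |
| declare void @callee(i32) |
| |
| define void @caller(i32 %x, i8* %p) { |
| ; %x and %p are bundle operands; only %x is an argument to @callee. |
| call void @callee(i32 %x) [ "my-tag"(i32 %x, i8* %p) ] |
| ret void |
| } |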
| |
| More specific types of operand bundles are described below. |
| |
| .. _deopt_opbundles: |
| |
| Deoptimization Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Deoptimization operand bundles are characterized by the ``"deopt"`` |
| operand bundle tag. These operand bundles represent an alternate |
| "safe" continuation for the call site they're attached to, and can be |
| used by a suitable runtime to deoptimize the compiled frame at the |
| specified call site. There can be at most one ``"deopt"`` operand |
| bundle attached to a call site. Exact details of deoptimization are |
| out of scope for the language reference, but it usually involves |
| rewriting a compiled frame into a set of interpreted frames. |
| |
| From the compiler's perspective, deoptimization operand bundles make |
| the call sites they're attached to at least ``readonly``. They read |
| through all of their pointer-typed operands (even if they're not |
| otherwise escaped) and the entire visible heap. Deoptimization |
| operand bundles do not capture their operands except during |
| deoptimization, in which case control will not be returned to the |
| compiled frame. |
| |
| The inliner knows how to inline through calls that have deoptimization |
| operand bundles. Just like inlining through a normal call site |
| involves composing the normal and exceptional continuations, inlining |
| through a call site with a deoptimization operand bundle needs to |
| appropriately compose the "safe" deoptimization continuation. The |
| inliner does this by prepending the parent's deoptimization |
| continuation to every deoptimization continuation in the inlined body. |
| E.g. inlining ``@f`` into ``@g`` in the following example |
| |
| .. code-block:: llvm |
| |
| define void @f() { |
| call void @x() ;; no deopt state |
| call void @y() [ "deopt"(i32 10) ] |
| call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ] |
| ret void |
| } |
| |
| define void @g() { |
| call void @f() [ "deopt"(i32 20) ] |
| ret void |
| } |
| |
| will result in |
| |
| .. code-block:: llvm |
| |
| define void @g() { |
| call void @x() ;; still no deopt state |
| call void @y() [ "deopt"(i32 20, i32 10) ] |
| call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ] |
| ret void |
| } |
| |
| It is the frontend's responsibility to structure or encode the |
| deoptimization state in a way that syntactically prepending the |
| caller's deoptimization state to the callee's deoptimization state is |
| semantically equivalent to composing the caller's deoptimization |
| continuation after the callee's deoptimization continuation. |
| |
| .. _ob_funclet: |
| |
| Funclet Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Funclet operand bundles are characterized by the ``"funclet"`` |
| operand bundle tag. These operand bundles indicate that a call site |
| is within a particular funclet. There can be at most one |
| ``"funclet"`` operand bundle attached to a call site and it must have |
| exactly one bundle operand. |
| |
| If any funclet EH pads have been "entered" but not "exited" (per the |
| `description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_), |
| it is undefined behavior to execute a ``call`` or ``invoke`` which: |
| |
| * does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind |
| intrinsic, or |
| * has a ``"funclet"`` bundle whose operand is not the most-recently-entered |
| not-yet-exited funclet EH pad. |
| |
| Similarly, if no funclet EH pads have been entered-but-not-yet-exited, |
| executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior. |
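| |
| The rules above can be sketched with a single cleanup funclet. The functions |
| ``@may_throw`` and ``@dtor`` are hypothetical, and ``@__CxxFrameHandler3`` is |
| used here only as an example of a funclet-based personality. |
| |
| .. code-block:: llvm |
| |
| declare void @may_throw() |
| declare void @dtor() |
| declare i32 @__CxxFrameHandler3(...) |
| |
| define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) { |
| entry: |
| ; No funclet pad has been entered here, so there is no "funclet" bundle. |
| invoke void @may_throw() |
| to label %done unwind label %cleanup |
| |
| cleanup: |
| %cp = cleanuppad within none [] |
| ; Inside the funclet, the call names the most recently entered pad. |
| call void @dtor() [ "funclet"(token %cp) ] |
| cleanupret from %cp unwind to caller |
| |
| done: |
| ret void |
| } |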
| |
| GC Transition Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| GC transition operand bundles are characterized by the |
| ``"gc-transition"`` operand bundle tag. These operand bundles mark a |
| call as a transition between a function with one GC strategy and a |
| function with a different GC strategy. If coordinating the transition |
| between GC strategies requires additional code generation at the call |
| site, these bundles may contain any values that are needed by the |
| generated code. For more details, see :ref:`GC Transitions |
| <gc_transition_args>`. |
| |
| .. _moduleasm: |
| |
| Module-Level Inline Assembly |
| ---------------------------- |
| |
| Modules may contain "module-level inline asm" blocks, which correspond |
| to the GCC "file scope inline asm" blocks. These blocks are internally |
| concatenated by LLVM and treated as a single unit, but may be separated |
| in the ``.ll`` file if desired. The syntax is very simple: |
| |
| .. code-block:: llvm |
| |
| module asm "inline asm code goes here" |
| module asm "more can go here" |
| |
| The strings can contain any character by escaping non-printable |
| characters. The escape sequence used is simply "\\xx" where "xx" is the |
| two digit hex code for the number. |
| |
| Note that the assembly string *must* be parseable by LLVM's integrated assembler |
| (unless it is disabled), even when emitting a ``.s`` file. |
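| |
| As a sketch of the escape syntax, the last line below uses ``\22``, the hex |
| code for a double quote, to embed a quoted string in the assembly text. The |
| symbol name ``my_marker`` is hypothetical, and whether these directives are |
| accepted depends on the target's assembler. |
| |
| .. code-block:: llvm |
| |
| module asm ".globl my_marker" |
| module asm "my_marker:" |
| module asm ".ascii \22hello\22" |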
| |
| .. _langref_datalayout: |
| |
| Data Layout |
| ----------- |
| |
| A module may specify a target specific data layout string that specifies |
| how data is to be laid out in memory. The syntax for the data layout is |
| simply: |
| |
| .. code-block:: llvm |
| |
| target datalayout = "layout specification" |
| |
| The *layout specification* consists of a list of specifications |
| separated by the minus sign character ('-'). Each specification starts |
| with a letter and may include other information after the letter to |
| define some aspect of the data layout. The specifications accepted are |
| as follows: |
| |
| ``E`` |
| Specifies that the target lays out data in big-endian form. That is, |
| the bits with the most significance have the lowest address |
| location. |
| ``e`` |
| Specifies that the target lays out data in little-endian form. That |
| is, the bits with the least significance have the lowest address |
| location. |
| ``S<size>`` |
| Specifies the natural alignment of the stack in bits. Alignment |
| promotion of stack variables is limited to the natural stack |
| alignment to avoid dynamic stack realignment. The stack alignment |
| must be a multiple of 8 bits. If omitted, the natural stack |
| alignment defaults to "unspecified", which does not prevent any |
| alignment promotions. |
| ``P<address space>`` |
| Specifies the address space that corresponds to program memory. |
| Harvard architectures can use this to specify what space LLVM |
| should place things such as functions into. If omitted, the |
| program memory space defaults to the default address space of 0, |
| which corresponds to a Von Neumann architecture that has code |
| and data in the same space. |
| ``A<address space>`` |
| Specifies the address space of objects created by '``alloca``'. |
| Defaults to the default address space of 0. |
| ``p[n]:<size>:<abi>:<pref>:<idx>`` |
| This specifies the *size* of a pointer and its ``<abi>`` and |
| ``<pref>``\erred alignments for address space ``n``. The fourth parameter |
| ``<idx>`` is the size of the index used for address calculation. If not |
| specified, the default index size is equal to the pointer size. All sizes |
| are in bits. The address space, ``n``, is optional, and if not specified, |
| denotes the default address space 0. The value of ``n`` must be |
| in the range [1,2^23). |
| ``i<size>:<abi>:<pref>`` |
| This specifies the alignment for an integer type of a given bit |
| ``<size>``. The value of ``<size>`` must be in the range [1,2^23). |
| ``v<size>:<abi>:<pref>`` |
| This specifies the alignment for a vector type of a given bit |
| ``<size>``. |
| ``f<size>:<abi>:<pref>`` |
| This specifies the alignment for a floating-point type of a given bit |
| ``<size>``. Only values of ``<size>`` that are supported by the target |
| will work. 32 (float) and 64 (double) are supported on all targets; 80 |
| or 128 (different flavors of long double) are also supported on some |
| targets. |
| ``a:<abi>:<pref>`` |
| This specifies the alignment for an object of aggregate type. |
| ``m:<mangling>`` |
| If present, specifies that LLVM names are mangled in the output. Symbols |
| prefixed with the mangling escape character ``\01`` are passed through |
| directly to the assembler without the escape character. The mangling style |
| options are |
| |
| * ``e``: ELF mangling: Private symbols get a ``.L`` prefix. |
| * ``m``: Mips mangling: Private symbols get a ``$`` prefix. |
| * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other |
| symbols get a ``_`` prefix. |
| * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix. |
| Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``, |
| ``__fastcall``, and ``__vectorcall`` have custom mangling that appends |
| ``@N`` where N is the number of bytes used to pass parameters. C++ symbols |
| starting with ``?`` are not mangled in any way. |
| * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C |
| symbols do not receive a ``_`` prefix. |
| ``n<size1>:<size2>:<size3>...`` |
| This specifies a set of native integer widths for the target CPU in |
| bits. For example, it might contain ``n32`` for 32-bit PowerPC, |
| ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of |
| this set are considered to support most general arithmetic operations |
| efficiently. |
| ``ni:<address space0>:<address space1>:<address space2>...`` |
| This specifies pointer types with the specified address spaces |
| as :ref:`Non-Integral Pointer Types <nointptrtype>`. The ``0`` |
| address space cannot be specified as non-integral. |
| |
| On every specification that takes an ``<abi>:<pref>``, specifying the |
| ``<pref>`` alignment is optional. If omitted, the preceding ``:`` |
| should be omitted too and ``<pref>`` will be equal to ``<abi>``. |
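| |
| As an illustration, the following layout string is the one commonly seen in |
| modules that Clang emits for x86-64 Linux; treat the exact string as an |
| assumption about that toolchain rather than something this section defines. |
| Note that ``i64:64`` and ``f80:128`` omit ``<pref>``, so the preferred |
| alignment equals the ABI alignment: |
| |
| .. code-block:: llvm |
| |
|    ; e             little endian |
|    ; m:e           ELF mangling |
|    ; i64:64        i64 has 64-bit ABI and preferred alignment |
|    ; f80:128       x86_fp80 has 128-bit ABI and preferred alignment |
|    ; n8:16:32:64   native integer widths |
|    ; S128          natural stack alignment of 128 bits |
|    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" |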
| |
| When constructing the data layout for a given target, LLVM starts with a |
| default set of specifications which are then (possibly) overridden by |
| the specifications in the ``datalayout`` keyword. The default |
| specifications are given in this list: |
| |
| - ``E`` - big endian |
| - ``p:64:64:64`` - 64-bit pointers with 64-bit alignment. |
| - ``p[n]:64:64:64`` - Other address spaces are assumed to be the |
| same as the default address space. |
| - ``S0`` - natural stack alignment is unspecified |
| - ``i1:8:8`` - i1 is 8-bit (byte) aligned |
| - ``i8:8:8`` - i8 is 8-bit (byte) aligned |
| - ``i16:16:16`` - i16 is 16-bit aligned |
| - ``i32:32:32`` - i32 is 32-bit aligned |
| - ``i64:32:64`` - i64 has ABI alignment of 32 bits but preferred |
| alignment of 64 bits |
| - ``f16:16:16`` - half is 16-bit aligned |
| - ``f32:32:32`` - float is 32-bit aligned |
| - ``f64:64:64`` - double is 64-bit aligned |
| - ``f128:128:128`` - quad is 128-bit aligned |
| - ``v64:64:64`` - 64-bit vector is 64-bit aligned |
| - ``v128:128:128`` - 128-bit vector is 128-bit aligned |
| - ``a:0:64`` - aggregates are 64-bit aligned |
| |
| When LLVM is determining the alignment for a given type, it uses the |
| following rules: |
| |
| #. If the type sought is an exact match for one of the specifications, |
| that specification is used. |
| #. If no match is found, and the type sought is an integer type, then |
| the smallest integer type that is larger than the bitwidth of the |
| sought type is used. If none of the specifications are larger than |
| the bitwidth then the largest integer type is used. For example, |
| given the default specifications above, the i7 type will use the |
| alignment of i8 (next largest) while both i65 and i256 will use the |
| alignment of i64 (largest specified). |
| #. If no match is found, and the type sought is a vector type, then the |
| largest vector type that is smaller than the sought vector type will |
| be used as a fallback. This works because ``<128 x double>`` can be |
| implemented in terms of 64 ``<2 x double>``, for example. |
| |
| The function of the data layout string may not be what you expect. |
| Notably, this is not a specification from the frontend of what alignment |
| the code generator should use. |
| |
| Instead, if specified, the target data layout is required to match what |
| the ultimate *code generator* expects. This string is used by the |
| mid-level optimizers to improve code, and this only works if it matches |
| what the ultimate code generator uses. There is no way to generate IR |
| that does not embed this target-specific detail into the IR. If you |
| don't specify the string, the default specifications will be used to |
| generate a Data Layout and the optimization phases will operate |
| accordingly and introduce target specificity into the IR with respect to |
| these default specifications. |
| |
| .. _langref_triple: |
| |
| Target Triple |
| ------------- |
| |
| A module may specify a target triple string that describes the target |
| host. The syntax for the target triple is simply: |
| |
| .. code-block:: llvm |
| |
| target triple = "x86_64-apple-macosx10.7.0" |
| |
| The *target triple* string consists of a series of identifiers delimited |
| by the minus sign character ('-'). The canonical forms are: |
| |
| :: |
| |
| ARCHITECTURE-VENDOR-OPERATING_SYSTEM |
| ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT |
| |
| This information is passed along to the backend so that it generates |
| code for the proper architecture. This can be overridden on the |
| command line with the ``-mtriple`` option. |
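| |
| For example, a triple using the four-component form might look as follows. |
| The specific value is only illustrative: |
| |
| .. code-block:: llvm |
| |
|    ; ARCHITECTURE = x86_64, VENDOR = unknown, |
|    ; OPERATING_SYSTEM = linux, ENVIRONMENT = gnu |
|    target triple = "x86_64-unknown-linux-gnu" |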
| |
| .. _pointeraliasing: |
| |
| Pointer Aliasing Rules |
| ---------------------- |
| |
| Any memory access must be done through a pointer value associated with |
| an address range of the memory access, otherwise the behavior is |
| undefined. Pointer values are associated with address ranges according |
| to the following rules: |
| |
| - A pointer value is associated with the addresses associated with any |
| value it is *based* on. |
| - An address of a global variable is associated with the address range |
| of the variable's storage. |
| - The result value of an allocation instruction is associated with the |
| address range of the allocated storage. |
| - A null pointer in the default address-space is associated with no |
| address. |
| - An integer constant other than zero or a pointer value returned from |
| a function not defined within LLVM may be associated with address |
| ranges allocated through mechanisms other than those provided by |
| LLVM. Such ranges shall not overlap with any ranges of addresses |
| allocated by mechanisms provided by LLVM. |
| |
| A pointer value is *based* on another pointer value according to the |
| following rules: |
| |
| - A pointer value formed from a scalar ``getelementptr`` operation is *based* on |
| the pointer-typed operand of the ``getelementptr``. |
| - The pointer in lane *l* of the result of a vector ``getelementptr`` operation |
| is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand |
| of the ``getelementptr``. |
| - The result value of a ``bitcast`` is *based* on the operand of the |
| ``bitcast``. |
| - A pointer value formed by an ``inttoptr`` is *based* on all pointer |
| values that contribute (directly or indirectly) to the computation of |
| the pointer's value. |
| - The "*based* on" relationship is transitive. |
| |
| Note that this definition of *"based"* is intentionally similar to the |
| definition of *"based"* in C99, though it is slightly weaker. |
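| |
| The following is a minimal sketch of how the *based* on rules apply. The |
| function and value names are hypothetical and chosen only for illustration: |
| |
| .. code-block:: llvm |
| |
|    define i8 @based_example() { |
|      ; %a is associated with the address range of the allocated storage. |
|      %a = alloca [4 x i32] |
|      ; %p is based on %a (scalar getelementptr). |
|      %p = getelementptr [4 x i32], [4 x i32]* %a, i64 0, i64 1 |
|      ; %q is based on %p (bitcast) and, transitively, on %a. |
|      %q = bitcast i32* %p to i8* |
|      ; %r is based on %q, because %q contributes to the integer %i. |
|      %i = ptrtoint i8* %q to i64 |
|      %r = inttoptr i64 %i to i8* |
|      ; Both accesses go through pointers associated with %a's storage. |
|      store i8 0, i8* %r |
|      %v = load i8, i8* %q |
|      ret i8 %v |
|    } |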
| |
| LLVM IR does not associate types with memory. The result type of a |
| ``load`` merely indicates the size and alignment of the memory from |
| which to load, as well as the interpretation of the value. The first |
| operand type of a ``store`` similarly only indicates the size and |
| alignment of the store. |
| |
| Consequently, type-based alias analysis, aka TBAA, aka |
| ``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR. |
| :ref:`Metadata <metadata>` may be used to encode additional information |
| which specialized optimization passes may use to implement type-based |
| alias analysis. |
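| |
| As a small sketch of what "no types on memory" means in practice, the |
| hypothetical function below stores a ``float`` and reads the same bytes back |
| as an ``i32``. Absent any TBAA metadata, the two accesses may alias and the |
| load observes the stored bytes: |
| |
| .. code-block:: llvm |
| |
|    define i32 @float_bits(float %f) { |
|      %slot = alloca float |
|      ; The store type only fixes the size and alignment of the write. |
|      store float %f, float* %slot |
|      ; Reinterpret the same four bytes as an integer. |
|      %as.int = bitcast float* %slot to i32* |
|      %bits = load i32, i32* %as.int |
|      ret i32 %bits |
|    } |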
| |
| .. _volatile: |
| |
| Volatile Memory Accesses |
| ------------------------ |
| |
| Certain memory accesses, such as :ref:`load <i_load>`'s, |
| :ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be |
| marked ``volatile``. The optimizers must not change the number of |
| volatile operations or change their order of execution relative to other |
| volatile operations. The optimizers *may* change the order of volatile |
| operations relative to non-volatile operations. This is not Java's |
| "volatile" and has no cross-thread synchronization behavior. |
| |
| IR-level volatile loads and stores cannot safely be optimized into |
| llvm.memcpy or llvm.memmove intrinsics even when those intrinsics are |
| flagged volatile. Likewise, the backend should never split or merge |
| target-legal volatile load/store instructions. |
| |
| .. admonition:: Rationale |
| |
| Platforms may rely on volatile loads and stores of natively supported |
| data width to be executed as a single instruction. For example, in C |
| this holds for an l-value of volatile primitive type with native |
| hardware support, but not necessarily for aggregate types. The |
| frontend upholds these expectations, which are intentionally |
| unspecified in the IR. The rules above ensure that IR transformations |
| do not violate the frontend's contract with the language. |
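| |
| For instance, in the following sketch (a hypothetical function that reads a |
| device register twice), the optimizers must keep both volatile loads and |
| keep them in this order, even though they read the same address: |
| |
| .. code-block:: llvm |
| |
|    define i32 @read_twice(i32* %reg) { |
|      ; Neither load may be removed or merged with the other. |
|      %first = load volatile i32, i32* %reg |
|      %second = load volatile i32, i32* %reg |
|      %sum = add i32 %first, %second |
|      ret i32 %sum |
|    } |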
| |
| .. _memmodel: |
| |
| Memory Model for Concurrent Operations |
| -------------------------------------- |
| |
| The LLVM IR does not define any way to start parallel threads of |
| execution or to register signal handlers. Nonetheless, there are |
| platform-specific ways to create them, and we define LLVM IR's behavior |
| in their presence. This model is inspired by the C++0x memory model. |
| |
| For a more informal introduction to this model, see the :doc:`Atomics`. |
| |
| We define a *happens-before* partial order as the least partial order |
| that |
| |
| - Is a superset of single-thread program order, and |
| - When a *synchronizes-with* ``b``, includes an edge from ``a`` to |
| ``b``. *Synchronizes-with* pairs are introduced by platform-specific |
| techniques, like pthread locks, thread creation, thread joining, |
| etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering |
| Constraints <ordering>`). |
| |
| Note that program order does not introduce *happens-before* edges |
| between a thread and signals executing inside that thread. |
| |
| Every (defined) read operation (load instructions, memcpy, atomic |
| loads/read-modify-writes, etc.) R reads a series of bytes written by |
| (defined) write operations (store instructions, atomic |
| stores/read-modify-writes, memcpy, etc.). For the purposes of this |
| section, initialized globals are considered to have a write of the |
| initializer which is atomic and happens before any other read or write |
| of the memory in question. For each byte of a read R, R\ :sub:`byte` |
| may see any write to the same byte, except: |
| |
| - If write\ :sub:`1` happens before write\ :sub:`2`, and |
| write\ :sub:`2` happens before R\ :sub:`byte`, then |
| R\ :sub:`byte` does not see write\ :sub:`1`. |
| - If R\ :sub:`byte` happens before write\ :sub:`3`, then |
| R\ :sub:`byte` does not see write\ :sub:`3`. |
| |
| Given that definition, R\ :sub:`byte` is defined as follows: |
| |
| - If R is volatile, the result is target-dependent. (Volatile is |
| supposed to give guarantees which can support ``sig_atomic_t`` in |
| C/C++, and may be used for accesses to addresses that do not behave |
| like normal memory. It does not generally provide cross-thread |
| synchronization.) |
| - Otherwise, if there is no write to the same byte that happens before |
| R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte. |
| - Otherwise, if R\ :sub:`byte` may see exactly one write, |
| R\ :sub:`byte` returns the value written by that write. |
| - Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may |
| see are atomic, it chooses one of the values written. See the :ref:`Atomic |
| Memory Ordering Constraints <ordering>` section for additional |
| constraints on how the choice is made. |
| - Otherwise R\ :sub:`byte` returns ``undef``. |
| |
| R returns the value composed of the series of bytes it read. This |
| implies that some bytes within the value may be ``undef`` **without** |
| the entire value being ``undef``. Note that this only defines the |
| semantics of the operation; it doesn't mean that targets will emit more |
| than one instruction to read the series of bytes. |
| |
| Note that in cases where none of the atomic intrinsics are used, this |
| model places only one restriction on IR transformations on top of what |
| is required for single-threaded execution: introducing a store to a byte |
| which might not otherwise be stored is not allowed in general. |
| (Specifically, in the case where another thread might write to and read |
| from an address, introducing a store can change a load that may see |
| exactly one write into a load that may see multiple writes.) |
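| |
| The restriction on introducing stores can be sketched with a hypothetical |
| function that stores only on one path: |
| |
| .. code-block:: llvm |
| |
|    define void @maybe_store(i1 %cond, i32* %p) { |
|    entry: |
|      br i1 %cond, label %do.store, label %exit |
| |
|    do.store: |
|      store i32 1, i32* %p |
|      br label %exit |
| |
|    exit: |
|      ret void |
|    } |
| |
| Rewriting this as an unconditional load of ``%p``, a ``select`` on ``%cond``, |
| and an unconditional store would write to ``%p`` even when ``%cond`` is |
| false. That introduces a store to bytes which might not otherwise be stored, |
| so the rewrite is not allowed in general. |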
| |
| .. _ordering: |
| |
| Atomic Memory Ordering Constraints |
| ---------------------------------- |
| |
| Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`, |
| :ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`, |
| :ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take |
| ordering parameters that determine which other atomic instructions on |
| the same address they *synchronize with*. These semantics are borrowed |
| from Java and C++0x, but are somewhat more colloquial. If these |
| descriptions aren't precise enough, check those specs (see spec |
| references in the :doc:`atomics guide <Atomics>`). |
| :ref:`fence <i_fence>` instructions treat these orderings somewhat |
| differently since they don't take an address. See that instruction's |
| documentation for details. |
| |
| For a simpler introduction to the ordering constraints, see the |
| :doc:`Atomics`. |
| |
| ``unordered`` |
| The set of values that can be read is governed by the happens-before |
| partial order. A value cannot be read unless some operation wrote |
| it. This is intended to provide a guarantee strong enough to model |
| Java's non-volatile shared variables. This ordering cannot be |
| specified for read-modify-write operations; it is not strong enough |
| to make them atomic in any interesting way. |
| ``monotonic`` |
| In addition to the guarantees of ``unordered``, there is a single |
| total order for modifications by ``monotonic`` operations on each |
| address. All modification orders must be compatible with the |
| happens-before order. There is no guarantee that the modification |
| orders can be combined to a global total order for the whole program |
| (and this often will not be possible). The read in an atomic |
| read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and |
| :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification |
| order immediately before the value it writes. If one atomic read |
| happens before another atomic read of the same address, the later |
| read must see the same value or a later value in the address's |
| modification order. This disallows reordering of ``monotonic`` (or |
| stronger) operations on the same address. If an address is written |
| ``monotonic``-ally by one thread, and other threads ``monotonic``-ally |
| read that address repeatedly, the other threads must eventually see |
| the write. This corresponds to the C++0x/C1x |
| ``memory_order_relaxed``. |
| ``acquire`` |
| In addition to the guarantees of ``monotonic``, a |
| *synchronizes-with* edge may be formed with a ``release`` operation. |
| This is intended to model C++'s ``memory_order_acquire``. |
| ``release`` |
| In addition to the guarantees of ``monotonic``, if this operation |
| writes a value which is subsequently read by an ``acquire`` |
| operation, it *synchronizes-with* that operation. (This isn't a |
| complete description; see the C++0x definition of a release |
| sequence.) This corresponds to the C++0x/C1x |
| ``memory_order_release``. |
| ``acq_rel`` (acquire+release) |
| Acts as both an ``acquire`` and ``release`` operation on its |
| address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``. |
| ``seq_cst`` (sequentially consistent) |
| In addition to the guarantees of ``acq_rel`` (``acquire`` for an |
| operation that only reads, ``release`` for an operation that only |
| writes), there is a global total order on all |
| sequentially-consistent operations on all addresses, which is |
| consistent with the *happens-before* partial order and with the |
| modification orders of all the affected addresses. Each |
| sequentially-consistent read sees the last preceding write to the |
| same address in this global order. This corresponds to the C++0x/C1x |
| ``memory_order_seq_cst`` and Java volatile. |
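| |
| As an informal sketch of how ``release`` and ``acquire`` pair up, consider |
| two hypothetical functions that are assumed to run in different threads. If |
| the ``acquire`` load reads the value written by the ``release`` store, the |
| store *synchronizes-with* the load, so the plain write to ``@data`` happens |
| before the plain read of ``@data``: |
| |
| .. code-block:: llvm |
| |
|    @data = global i32 0 |
|    @flag = global i32 0 |
| |
|    define void @producer() { |
|      store i32 42, i32* @data |
|      ; Publish the data with a release store. |
|      store atomic i32 1, i32* @flag release, align 4 |
|      ret void |
|    } |
| |
|    define i32 @consumer() { |
|    entry: |
|      br label %spin |
| |
|    spin: |
|      ; Wait until the acquire load observes the release store. |
|      %f = load atomic i32, i32* @flag acquire, align 4 |
|      %ready = icmp eq i32 %f, 1 |
|      br i1 %ready, label %done, label %spin |
| |
|    done: |
|      ; This non-atomic load sees exactly one write, the store of 42. |
|      %v = load i32, i32* @data |
|      ret i32 %v |
|    } |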
| |
| .. _syncscope: |
| |
| If an atomic operation is marked ``syncscope("singlethread")``, it only |
| *synchronizes with* and only participates in the seq\_cst total orderings of |
| other operations running in the same thread (for example, in signal handlers). |
| |
| If an atomic operation is marked ``syncscope("<target-scope>")``, where |
| ``<target-scope>`` is a target specific synchronization scope, then it is target |
| dependent if it *synchronizes with* and participates in the seq\_cst total |
| orderings of other operations. |
| |
| Otherwise, an atomic operation that is not marked ``syncscope("singlethread")`` |
| or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the |
| seq\_cst total orderings of other operations that are not marked |
| ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``. |
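| |
| A short sketch of the ``singlethread`` scope, using a hypothetical counter |
| that is shared only between a thread and a signal handler running in that |
| same thread: |
| |
| .. code-block:: llvm |
| |
|    define void @bump(i32* %counter) { |
|      ; Ordered only with respect to other operations in the same thread. |
|      %old = atomicrmw add i32* %counter, i32 1 syncscope("singlethread") seq_cst |
|      fence syncscope("singlethread") seq_cst |
|      ret void |
|    } |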
| |
| .. _floatenv: |
| |
| Floating-Point Environment |
| -------------------------- |
| |
| The default LLVM floating-point environment assumes that floating-point |
| instructions do not have side effects. Results assume the round-to-nearest |
| rounding mode. No floating-point exception state is maintained in this |
| environment. Therefore, there is no attempt to create or preserve invalid |
| operation (SNaN) or division-by-zero exceptions in these examples: |
| |
| .. code-block:: llvm |
| |
| %A = fdiv 0x7ff0000000000001, %X ; 64-bit SNaN hex value |
| %B = fdiv %X, 0.0 |
| Safe: |
| %A = NaN |
| %B = NaN |
| |
| The benefit of this exception-free assumption is that floating-point |
| operations may be speculated freely without any other fast-math relaxations |
| to the floating-point model. |
| |
| Code that requires different behavior than this should use the |
| :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`. |
| |
| .. _fastmath: |
| |
| Fast-Math Flags |
| --------------- |
| |
| LLVM IR floating-point operations (:ref:`fadd <i_fadd>`, |
| :ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`, |
| :ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) and :ref:`call <i_call>` |
| may use the following flags to enable otherwise unsafe |
| floating-point transformations. |
| |
| ``nnan`` |
| No NaNs - Allow optimizations to assume the arguments and result are not |
| NaN. If an argument is a NaN, or the result would be a NaN, it produces |
| a :ref:`poison value <poisonvalues>` instead. |
| |
| ``ninf`` |
| No Infs - Allow optimizations to assume the arguments and result are not |
| +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it |
| produces a :ref:`poison value <poisonvalues>` instead. |
| |
| ``nsz`` |
| No Signed Zeros - Allow optimizations to treat the sign of a zero |
| argument or result as insignificant. |
| |
| ``arcp`` |
| Allow Reciprocal - Allow optimizations to use the reciprocal of an |
| argument rather than perform division. |
| |
| ``contract`` |
| Allow floating-point contraction (e.g. fusing a multiply followed by an |
| addition into a fused multiply-and-add). |
| |
| ``afn`` |
| Approximate functions - Allow substitution of approximate calculations for |
| functions (sin, log, sqrt, etc.). See floating-point intrinsic definitions |
| for places where this can apply to LLVM's intrinsic math functions. |
| |
| ``reassoc`` |
| Allow reassociation transformations for floating-point instructions. |
| This may dramatically change results in floating-point. |
| |
| ``fast`` |
| This flag implies all of the others. |
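| |
| The flags are written between the opcode and the type. The following is a |
| sketch with a hypothetical function; which transformations actually occur |
| depends on the optimizer and the target: |
| |
| .. code-block:: llvm |
| |
|    declare float @llvm.sqrt.f32(float) |
| |
|    define float @fmf_example(float %a, float %b, float %c) { |
|      ; With contract on both, these may be fused into a multiply-and-add. |
|      %m = fmul contract float %a, %b |
|      %s = fadd contract float %m, %c |
|      ; arcp permits multiplying by a reciprocal of %b instead of dividing. |
|      %d = fdiv arcp float %s, %b |
|      ; afn permits an approximate square root. |
|      %r = call afn float @llvm.sqrt.f32(float %d) |
|      ; fast implies all of the other flags. |
|      %f = fadd fast float %r, %c |
|      ret float %f |
|    } |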
| |
| .. _uselistorder: |
| |
| Use-list Order Directives |
| ------------------------- |
| |
| Use-list directives encode the in-memory order of each use-list, allowing the |
| order to be recreated. ``<order-indexes>`` is a comma-separated list of |
| indexes that are assigned to the referenced value's uses. The referenced |
| value's use-list is immediately sorted by these indexes. |
| |
| Use-list directives may appear at function scope or global scope. They are not |
| instructions, and have no effect on the semantics of the IR. When they're at |
| function scope, they must appear after the terminator of the final basic block. |
| |
| If basic blocks have their address taken via ``blockaddress()`` expressions, |
| ``uselistorder_bb`` can be used to reorder their use-lists from outside their |
| function's scope. |
| |
| :Syntax: |
| |
| :: |
| |
| uselistorder <ty> <value>, { <order-indexes> } |
| uselistorder_bb @function, %block, { <order-indexes> } |
| |
| :Examples: |
| |
| :: |
| |
| define void @foo(i32 %arg1, i32 %arg2) { |
| entry: |
| ; ... instructions ... |
| bb: |
| ; ... instructions ... |
| |
| ; At function scope. |
| uselistorder i32 %arg1, { 1, 0, 2 } |
| uselistorder label %bb, { 1, 0 } |
| } |
| |
| ; At global scope. |
| uselistorder i32* @global, { 1, 2, 0 } |
| uselistorder i32 7, { 1, 0 } |
| uselistorder i32 (i32) @bar, { 1, 0 } |
| uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 } |
| |
| .. _source_filename: |
| |
| Source Filename |
| --------------- |
| |
| The *source filename* string is set to the original module identifier, |
| which will be the name of the compiled source file when compiling from |
| source through the clang front end, for example. It is then preserved through |
| the IR and bitcode. |
| |
| This is currently necessary to generate a consistent unique global |
| identifier for local functions used in profile data, which prepends the |
| source file name to the local function name. |
| |
| The syntax for the source file name is simply: |
| |
| .. code-block:: text |
| |
| source_filename = "/path/to/source.c" |
| |
| .. _typesystem: |
| |
| Type System |
| =========== |
| |
| The LLVM type system is one of the most important features of the |
| intermediate representation. Being typed enables a number of |
| optimizations to be performed on the intermediate representation |
| directly, without having to do extra analyses on the side before the |
| transformation. A strong type system makes it easier to read the |
| generated code and enables novel analyses and transformations that are |
| not feasible to perform on normal three address code representations. |
| |
| .. _t_void: |
| |
| Void Type |
| --------- |
| |
| :Overview: |
| |
| |
| The void type does not represent any value and has no size. |
| |
| :Syntax: |
| |
| |
| :: |
| |
| void |
| |
| |
| .. _t_function: |
| |
| Function Type |
| ------------- |
| |
| :Overview: |
| |
| |
| The function type can be thought of as a function signature. It consists of a |
| return type and a list of formal parameter types. The return type of a function |
| type is a void type or first class type --- except for :ref:`label <t_label>` |
| and :ref:`metadata <t_metadata>` types. |
| |
| :Syntax: |
| |
| :: |
| |
| <returntype> (<parameter list>) |
| |
| ...where '``<parameter list>``' is a comma-separated list of type |
| specifiers. Optionally, the parameter list may include a type ``...``, which |
| indicates that the function takes a variable number of arguments. Variable |
| argument functions can access their arguments with the :ref:`variable argument |
| handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type |
| except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`. |
| |
| :Examples: |
| |
| +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``i32 (i32)`` | function taking an ``i32``, returning an ``i32`` | |
| +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``float (i16, i32 *) *`` | :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a :ref:`pointer <t_pointer>` to ``i32``, returning ``float``. | |
| +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``i32 (i8*, ...)`` | A vararg function that takes at least one :ref:`pointer <t_pointer>` to ``i8`` (char in C), which returns an integer. This is the signature for ``printf`` in LLVM. | |
| +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``{i32, i32} (i32)`` | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values | |
| +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
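| |
| As a sketch of where these types appear in a module, the declaration below |
| has the ``i32 (i8*, ...)`` type from the table, and the hypothetical global |
| ``@callback`` holds a pointer of type ``float (i16, i32*)*``: |
| |
| .. code-block:: llvm |
| |
|    ; The function @printf has type i32 (i8*, ...). |
|    declare i32 @printf(i8*, ...) |
| |
|    ; A global holding a (null) pointer to a float (i16, i32*) function. |
|    @callback = global float (i16, i32*)* null |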
| |
| .. _t_firstclass: |
| |
| First Class Types |
| ----------------- |
| |
| The :ref:`first class <t_firstclass>` types are perhaps the most important. |
| Values of these types are the only ones which can be produced by |
| instructions. |
| |
| .. _t_single_value: |
| |
| Single Value Types |
| ^^^^^^^^^^^^^^^^^^ |
| |
| These are the types that are valid in registers from CodeGen's perspective. |
| |
| .. _t_integer: |
| |
| Integer Type |
| """""""""""" |
| |
| :Overview: |
| |
| The integer type specifies an arbitrary bit width for the desired |
| integer. Any bit width from 1 bit to 2\ :sup:`23`\ -1 (about 8 million) |
| can be specified. |
| |
| :Syntax: |
| |
| :: |
| |
| iN |
| |
| The number of bits the integer will occupy is specified by the ``N`` |
| value. |
| |
| :Examples: |
| |
| +----------------+------------------------------------------------+ |
| | ``i1`` | a single-bit integer. | |
| +----------------+------------------------------------------------+ |
| | ``i32`` | a 32-bit integer. | |
| +----------------+------------------------------------------------+ |
| | ``i1942652`` | a really big integer of over 1 million bits. | |
| +----------------+------------------------------------------------+ |
| |
| .. _t_floating: |
| |
| Floating-Point Types |
| """""""""""""""""""" |
| |
| .. list-table:: |
| :header-rows: 1 |
| |
| * - Type |
| - Description |
| |
| * - ``half`` |
| - 16-bit floating-point value |
| |
| * - ``float`` |
| - 32-bit floating-point value |
| |
| * - ``double`` |
| - 64-bit floating-point value |
| |
| * - ``fp128`` |
| - 128-bit floating-point value (112-bit mantissa) |
| |
| * - ``x86_fp80`` |
| - 80-bit floating-point value (X87) |
| |
| * - ``ppc_fp128`` |
| - 128-bit floating-point value (two 64-bits) |
| |
| The binary formats of half, float, double, and fp128 correspond to the |
| IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128 |
| respectively. |
| |
| X86_mmx Type |
| """""""""""" |
| |
| :Overview: |
| |
| The x86_mmx type represents a value held in an MMX register on an x86 |
| machine. The operations allowed on it are quite limited: parameters and |
| return values, load and store, and bitcast. User-specified MMX |
| instructions are represented as intrinsic or asm calls with arguments |
| and/or results of this type. There are no arrays, vectors or constants |
| of this type. |
| |
| :Syntax: |
| |
| :: |
| |
| x86_mmx |
| |
| |
| .. _t_pointer: |
| |
| Pointer Type |
| """""""""""" |
| |
| :Overview: |
| |
| The pointer type is used to specify memory locations. Pointers are |
| commonly used to reference objects in memory. |
| |
| Pointer types may have an optional address space attribute defining the |
| numbered address space where the pointed-to object resides. The default |
| address space is number zero. The semantics of non-zero address spaces |
| are target-specific. |
| |
| Note that LLVM does not permit pointers to void (``void*``) nor does it |
| permit pointers to labels (``label*``). Use ``i8*`` instead. |
| |
| :Syntax: |
| |
| :: |
| |
| <type> * |
| |
| :Examples: |
| |
| +-------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | ``[4 x i32]*`` | A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32`` values. | |
| +-------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | ``i32 (i32*) *`` | A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32*``, returning an ``i32``. | |
| +-------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | ``i32 addrspace(5)*`` | A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in address space #5. | |
| +-------------------------+--------------------------------------------------------------------------------------------------------------+ |
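| |
| A brief sketch of these pointer types in use. The global ``@counter`` and |
| the functions are hypothetical, and the ``@malloc`` signature shown assumes |
| a target where ``size_t`` is 64 bits wide: |
| |
| .. code-block:: llvm |
| |
|    ; An i32 that resides in address space 5. |
|    @counter = addrspace(5) global i32 0 |
| |
|    ; i8* plays the role of C's void*. |
|    declare i8* @malloc(i64) |
| |
|    define i32 @read_counter() { |
|      %v = load i32, i32 addrspace(5)* @counter |
|      ret i32 %v |
|    } |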
| |
| .. _t_vector: |
| |
| Vector Type |
| """"""""""" |
| |
| :Overview: |
| |
| A vector type is a simple derived type that represents a vector of |
| elements. Vector types are used when multiple primitive data are |
| operated on in parallel using a single instruction (SIMD). A vector type |
| requires a size (number of elements) and an underlying primitive data |
| type. Vector types are considered :ref:`first class <t_firstclass>`. |
| |
| :Syntax: |
| |
| :: |
| |
| < <# elements> x <elementtype> > |
| |
| The number of elements is a constant integer value larger than 0; |
| elementtype may be any integer, floating-point or pointer type. Vectors |
| of size zero are not allowed. |
| |
| :Examples: |
| |
| +-------------------+--------------------------------------------------+ |
| | ``<4 x i32>`` | Vector of 4 32-bit integer values. | |
| +-------------------+--------------------------------------------------+ |
| | ``<8 x float>`` | Vector of 8 32-bit floating-point values. | |
| +-------------------+--------------------------------------------------+ |
| | ``<2 x i64>`` | Vector of 2 64-bit integer values. | |
| +-------------------+--------------------------------------------------+ |
| | ``<4 x i64*>`` | Vector of 4 pointers to 64-bit integer values. | |
| +-------------------+--------------------------------------------------+ |
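| |
| For illustration, ordinary arithmetic instructions accept vector operands |
| and operate on each element. The function name below is hypothetical: |
| |
| .. code-block:: llvm |
| |
|    define <4 x i32> @vadd(<4 x i32> %a, <4 x i32> %b) { |
|      ; Adds the four pairs of i32 elements. |
|      %sum = add <4 x i32> %a, %b |
|      ret <4 x i32> %sum |
|    } |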
| |
| .. _t_label: |
| |
| Label Type |
| ^^^^^^^^^^ |
| |
| :Overview: |
| |
| The label type represents code labels. |
| |
| :Syntax: |
| |
| :: |
| |
| label |
| |
| .. _t_token: |
| |
| Token Type |
| ^^^^^^^^^^ |
| |
| :Overview: |
| |
| The token type is used when a value is associated with an instruction |
| but all uses of the value must not attempt to introspect or obscure it. |
| As such, it is not appropriate to have a :ref:`phi <i_phi>` or |
| :ref:`select <i_select>` of type token. |
| |
| :Syntax: |
| |
| :: |
| |
| token |
| |
| |
| |
| .. _t_metadata: |
| |
| Metadata Type |
| ^^^^^^^^^^^^^ |
| |
| :Overview: |
| |
| The metadata type represents embedded metadata. No derived types may be |
| created from metadata except for :ref:`function <t_function>` arguments. |
| |
| :Syntax: |
| |
| :: |
| |
| metadata |
| |
| .. _t_aggregate: |
| |
| Aggregate Types |
| ^^^^^^^^^^^^^^^ |
| |
| Aggregate Types are a subset of derived types that can contain multiple |
| member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are |
| aggregate types. :ref:`Vectors <t_vector>` are not considered to be |
| aggregate types. |
| |
| .. _t_array: |
| |
| Array Type |
| """""""""" |
| |
| :Overview: |
| |
| The array type is a very simple derived type that arranges elements |
| sequentially in memory. The array type requires a size (number of |
| elements) and an underlying data type. |
| |
| :Syntax: |
| |
| :: |
| |
| [<# elements> x <elementtype>] |
| |
| The number of elements is a constant integer value; ``elementtype`` may |
| be any type with a size. |
| |
| :Examples: |
| |
| +------------------+--------------------------------------+ |
| | ``[40 x i32]`` | Array of 40 32-bit integer values. | |
| +------------------+--------------------------------------+ |
| | ``[41 x i32]`` | Array of 41 32-bit integer values. | |
| +------------------+--------------------------------------+ |
| | ``[4 x i8]`` | Array of 4 8-bit integer values. | |
| +------------------+--------------------------------------+ |
| |
| Here are some examples of multidimensional arrays: |
| |
| +-----------------------------+----------------------------------------------------------+ |
| | ``[3 x [4 x i32]]`` | 3x4 array of 32-bit integer values. | |
| +-----------------------------+----------------------------------------------------------+ |
| | ``[12 x [10 x float]]`` | 12x10 array of single precision floating-point values. | |
| +-----------------------------+----------------------------------------------------------+ |
| | ``[2 x [3 x [4 x i16]]]`` | 2x3x4 array of 16-bit integer values. | |
| +-----------------------------+----------------------------------------------------------+ |
| |
| There is no restriction on indexing beyond the end of the array implied |
| by a static type (though there are restrictions on indexing beyond the |
| bounds of an allocated object in some cases). This means that |
| single-dimension 'variable sized array' addressing can be implemented in |
| LLVM with a zero-length array type. An implementation of 'Pascal-style |
| arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for |
| example. |
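| |
| As a minimal sketch of that idea, the function below indexes into the |
| trailing zero-length array of such a type. The names ``%pascal_array`` and |
| ``@get_element`` are hypothetical, and the caller is assumed to have |
| allocated enough storage for the elements that follow the length field. |
| |
| .. code-block:: llvm |
| |
| ; Length prefix followed by a variable number of floats. |
| %pascal_array = type { i32, [0 x float] } |
| |
| define float @get_element(%pascal_array* %arr, i64 %i) { |
| ; Field 1 is the [0 x float] member; %i may exceed the static bound of 0. |
| %p = getelementptr %pascal_array, %pascal_array* %arr, i64 0, i32 1, i64 %i |
| %v = load float, float* %p |
| ret float %v |
| } |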
| |
| .. _t_struct: |
| |
| Structure Type |
| """""""""""""" |
| |
| :Overview: |
| |
| The structure type is used to represent a collection of data members |
| together in memory. The elements of a structure may be any type that has |
| a size. |
| |
| Structures in memory are accessed using '``load``' and '``store``' by |
| getting a pointer to a field with the '``getelementptr``' instruction. |
| Structures in registers are accessed using the '``extractvalue``' and |
| '``insertvalue``' instructions. |
| |
| Structures may optionally be "packed" structures, which indicate that |
| the alignment of the struct is one byte, and that there is no padding |
| between the elements. In non-packed structs, padding between field types |
| is inserted as defined by the DataLayout string in the module, which is |
| required to match what the underlying code generator expects. |
| |
| Structures can either be "literal" or "identified". A literal structure |
| is defined inline with other types (e.g. ``{i32, i32}*``) whereas |
| identified types are always defined at the top level with a name. |
| Literal types are uniqued by their contents and can never be recursive |
| or opaque since there is no syntax for writing such a type. Identified |
| types can be recursive, can be opaque, and are never uniqued. |
| |
| :Syntax: |
| |
| :: |
| |
| %T1 = type { <type list> } ; Identified normal struct type |
| %T2 = type <{ <type list> }> ; Identified packed struct type |
| |
| :Examples: |
| |
| +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``{ i32, i32, i32 }`` | A triple of three ``i32`` values | |
| +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``{ float, i32 (i32) * }`` | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32``, returning an ``i32``. | |
| +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``<{ i8, i32 }>`` | A packed struct known to be 5 bytes in size. | |
| +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
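| |
| The following sketch pulls these points together: an identified recursive |
| type, an identified packed type, a field access in memory, and a field |
| access in a register. The names ``%node``, ``%packed``, ``@node_value`` and |
| ``@pair_second`` are hypothetical and chosen only for illustration. |
| |
| .. code-block:: llvm |
| |
| %node = type { i32, %node* } ; identified type, may refer to itself |
| %packed = type <{ i8, i32 }> ; identified packed type, no padding |
| |
| ; Structure in memory: compute the field address, then load from it. |
| define i32 @node_value(%node* %n) { |
| %p = getelementptr inbounds %node, %node* %n, i32 0, i32 0 |
| %v = load i32, i32* %p |
| ret i32 %v |
| } |
| |
| ; Structure in a register: a literal struct read with extractvalue. |
| define i32 @pair_second({ i32, i32 } %pair) { |
| %v = extractvalue { i32, i32 } %pair, 1 |
| ret i32 %v |
| } |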
| |
| .. _t_opaque: |
| |
| Opaque Structure Types |
| """""""""""""""""""""" |
| |
| :Overview: |
| |
| Opaque structure types are used to represent named structure types that |
| do not have a body specified. This corresponds (for example) to the C |
| notion of a forward declared structure. |
| |
| :Syntax: |
| |
| :: |
| |
| %X = type opaque |
| %52 = type opaque |
| |
| :Examples: |
| |
| +--------------+-------------------+ |
| | ``opaque`` | An opaque type. | |
| +--------------+-------------------+ |
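| |
| For instance, a module might pass around pointers to a type whose layout it |
| never needs to know. This is only a sketch; ``%handle``, ``@open_handle`` |
| and ``@close_handle`` are hypothetical names. |
| |
| .. code-block:: llvm |
| |
| ; The body of %handle is not specified in this module. |
| %handle = type opaque |
| |
| ; Pointers to the opaque type can still be passed and returned. |
| declare %handle* @open_handle() |
| declare void @close_handle(%handle*) |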
| |
| .. _constants: |
| |
| Constants |
| ========= |
| |
| LLVM has several different basic types of constants. This section |
| describes them all and their syntax. |
| |
| Simple Constants |
| ---------------- |
| |
| **Boolean constants** |
| The two strings '``true``' and '``false``' are both valid constants |
| of the ``i1`` type. |
| **Integer constants** |
| Standard integers (such as '4') are constants of the |
| :ref:`integer <t_integer>` type. Negative numbers may be used with |
| integer types. |
| **Floating-point constants** |
| Floating-point constants use standard decimal notation (e.g. |
| 123.421), exponential notation (e.g. 1.23421e+2), or a more precise |
| hexadecimal notation (see below). The assembler requires the exact |
| decimal value of a floating-point constant. For example, the |
| assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating |
| decimal in binary. Floating-point constants must have a |
| :ref:`floating-point <t_floating>` type. |
| **Null pointer constants** |
| The identifier '``null``' is recognized as a null pointer constant |
| and must be of :ref:`pointer type <t_pointer>`. |
| **Token constants** |
| The identifier '``none``' is recognized as an empty token constant |
| and must be of :ref:`token type <t_token>`. |
| |
| The one non-intuitive notation for constants is the hexadecimal form of |
| floating-point constants. For example, the form |
| '``double 0x432ff973cafa8000``' is equivalent to (but harder to read |
| than) '``double 4.5e+15``'. The only time hexadecimal floating-point |
| constants are required (and the only time that they are generated by the |
| disassembler) is when a floating-point constant must be emitted but it |
| cannot be represented as a decimal floating-point number in a reasonable |
| number of digits. For example, NaNs, infinities, and other special |
| values are represented in their IEEE hexadecimal format so that assembly |
| and disassembly do not cause any bits to change in the constants. |
| |
| When using the hexadecimal form, constants of types half, float, and |
| double are represented using the 16-digit form shown above (which |
| matches the IEEE 754 representation for double); half and float values |
| must, however, be exactly representable as IEEE 754 half and single |
| precision, respectively. Hexadecimal format is always used for long |
| double, and there are three forms of long double. The 80-bit format used |
| by x86 is represented as ``0xK`` followed by 20 hexadecimal digits. The |
| 128-bit format used by PowerPC (two adjacent doubles) is represented by |
| ``0xM`` followed by 32 hexadecimal digits. The IEEE 128-bit format is |
| represented by ``0xL`` followed by 32 hexadecimal digits. Long doubles |
| will only work if they match the long double format on your target. |
| The IEEE 16-bit format (half precision) is represented by ``0xH`` |
| followed by 4 hexadecimal digits. All hexadecimal formats are big-endian |
| (sign bit at the left). |
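| |
| The globals below sketch a few of these notations side by side. The global |
| names are hypothetical; the bit patterns are the standard IEEE 754 (and x86 |
| extended precision) encodings of the values named in the comments. |
| |
| .. code-block:: llvm |
| |
| @d_dec = global double 1.25 ; decimal form |
| @d_hex = global double 0x3FF4000000000000 ; the same value, 1.25 |
| @f_hex = global float 0x3FF4000000000000 ; float also uses the 16-digit form |
| @d_inf = global double 0x7FF0000000000000 ; +infinity |
| @h_one = global half 0xH3C00 ; 1.0 in the 4-digit half form |
| @x_one = global x86_fp80 0xK3FFF8000000000000000 ; 1.0, 20 digits after 0xK |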
| |
| There are no constants of type x86_mmx. |
| |
| .. _complexconstants: |
| |
| Complex Constants |
| ----------------- |
| |
| Complex constants are a (potentially recursive) combination of simple |
| constants and smaller complex constants. |
| |
| **Structure constants** |
| Structure constants are represented with notation similar to |
| structure type definitions (a comma separated list of elements, |
| surrounded by braces (``{}``)). For example: |
| "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as |
| "``@G = external global i32``". Structure constants must have |
| :ref:`structure type <t_struct>`, and the number and types of elements |
| must match those specified by the type. |
| **Array constants** |
| Array constants are represented with notation similar to array type |
| definitions (a comma separated list of elements, surrounded by |
| square brackets (``[]``)). For example: |
| "``[ i32 42, i32 11, i32 74 ]``". Array constants must have |
| :ref:`array type <t_array>`, and the number and types of elements must |
| match those specified by the type. As a special case, character array |
| constants may also be represented as a double-quoted string using the ``c`` |
| prefix. For example: "``c"Hello World\0A\00"``". |
| **Vector constants** |
| Vector constants are represented with notation similar to vector |
| type definitions (a comma separated list of elements, surrounded by |
| less-than/greater-than's (``<>``)). For example: |
| "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants |
| must have :ref:`vector type <t_vector>`, and the number and types of |
| elements must match those specified by the type. |
| **Zero initialization** |
| The string '``zeroinitializer``' can be used to zero initialize a |
| value of *any* type, including scalar and |
| :ref:`aggregate <t_aggregate>` types. This is often used to avoid |
| having to print large zero initializers (e.g. for large arrays) and |
| is always exactly equivalent to using explicit zero initializers. |
| **Metadata node** |
| A metadata node is a constant tuple without types. For example: |
| "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values, |
| for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``". |
| Unlike other typed constants that are meant to be interpreted as part of |
| the instruction stream, metadata is a place to attach additional |
| information such as debug info. |
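| |
| As an illustration, the following module fragment uses each of the typed |
| constant forms above as a global initializer. The global names other than |
| ``@G`` are hypothetical. |
| |
| .. code-block:: llvm |
| |
| @G = external global i32 |
| |
| ; Structure, array, and vector constants. |
| @s = global { i32, float, i32* } { i32 4, float 17.0, i32* @G } |
| @a = global [3 x i32] [ i32 42, i32 11, i32 74 ] |
| @v = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 > |
| |
| ; Character array: 11 characters, a newline (\0A), and a NUL (\00). |
| @str = constant [13 x i8] c"Hello World\0A\00" |
| |
| ; Equivalent to writing out 1024 explicit "i32 0" elements. |
| @z = global [1024 x i32] zeroinitializer |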
| |
| Global Variable and Function Addresses |
| -------------------------------------- |
| |
| The addresses of :ref:`global variables <globalvars>` and |
| :ref:`functions <functionstructure>` are always implicitly valid |
| (link-time) constants. These constants are explicitly referenced when |
| the :ref:`identifier for the global <identifiers>` is used and always have |
| :ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM |
| file: |
| |
| .. code-block:: llvm |
| |
| @X = global i32 17 |
| @Y = global i32 42 |
| @Z = global [2 x i32*] [ i32* @X, i32* @Y ] |
| |
| .. _undefvalues: |
| |
| Undefined Values |
| ---------------- |
| |
| The string '``undef``' can be used anywhere a constant is expected, and |
| indicates that the user of the value may receive an unspecified |
| bit-pattern. Undefined values may be of any type other than '``label``' |
| or '``void``'. |
| |
| Undefined values are useful because they indicate to the compiler that |
| the program is well defined no matter what value is used. This gives the |
| compiler more freedom to optimize. Here are some examples of |
| (potentially surprising) transformations that are valid (in pseudo IR): |
| |
| .. code-block:: llvm |
| |
| %A = add %X, undef |
| %B = sub %X, undef |
| %C = xor %X, undef |
| Safe: |
| %A = undef |
| %B = undef |
| %C = undef |
| |
| This is safe because all of the output bits are affected by the undef |
| bits. Any output bit can have a zero or one depending on the input bits. |
| |
| .. code-block:: llvm |
| |
| %A = or %X, undef |
| %B = and %X, undef |
| Safe: |
| %A = -1 |
| %B = 0 |
| Safe: |
| %A = %X ;; By choosing undef as 0 |
| %B = %X ;; By choosing undef as -1 |
| Unsafe: |
| %A = undef |
| %B = undef |
| |
| These logical operations have bits that are not always affected by the |
| input. For example, if ``%X`` has a zero bit, then the output of the |
| '``and``' operation will always be a zero for that bit, no matter what |
| the corresponding bit from the '``undef``' is. As such, it is unsafe to |
| optimize or assume that the result of the '``and``' is '``undef``'. |
| However, it is safe to assume that all bits of the '``undef``' could be |
| 0, and optimize the '``and``' to 0. Likewise, it is safe to assume that |
| all the bits of the '``undef``' operand to the '``or``' could be set, |
| allowing the '``or``' to be folded to -1. |
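| |
| Written as fully typed IR, the '``or``' and '``and``' cases look like the |
| sketch below. The function names are hypothetical, and the comments restate |
| the folds described above rather than any particular pass's output. |
| |
| .. code-block:: llvm |
| |
| define i32 @or_undef(i32 %x) { |
| ; May be folded to -1 or to %x, but not to undef. |
| %a = or i32 %x, undef |
| ret i32 %a |
| } |
| |
| define i32 @and_undef(i32 %x) { |
| ; May be folded to 0 or to %x, but not to undef. |
| %b = and i32 %x, undef |
| ret i32 %b |
| } |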
| |
| .. code-block:: llvm |
| |
| %A = select undef, %X, %Y |
| %B = select undef, 42, %Y |
| %C = select %X, %Y, undef |
| Safe: |
| %A = %X (or %Y) |
| %B = 42 (or %Y) |
| %C = %Y |
| Unsafe: |
| %A = undef |
| %B = undef |
| %C = undef |
| |
| This set of examples shows that undefined '``select``' (and conditional |
| branch) conditions can go *either way*, but they have to come from one |
| of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were |
| both known to have a clear low bit, then ``%A`` would have to have a |
| cleared low bit. However, in the ``%C`` example, the optimizer is |
| allowed to assume that the '``undef``' operand could be the same as |
| ``%Y``, allowing the whole '``select``' to be eliminated. |
| |
| .. code-block:: text |
| |
| %A = xor undef, undef |
| |
| %B = undef |
| %C = xor %B, %B |
| |
| %D = undef |
| %E = icmp slt %D, 4 |
| %F = icmp sge %D, 4 |
| |
| Safe: |
| %A = undef |
| %B = undef |
| %C = undef |
| %D = undef |
| %E = undef |
| %F = undef |
| |
| This example points out that two '``undef``' operands are not |
| necessarily the same. This can be surprising to people who assume that |
| "``X^X``" is always zero, even if ``X`` is undefined (the behavior also |
| matches C semantics). This isn't true for a number of reasons, but the |
| short answer is that an '``undef``' "variable" can arbitrarily change |
| its value over its "live range". This is true because the variable |
| doesn't actually *have a live range*. Instead, the value is logically |
| read from arbitrary registers that happen to be around when needed, so |
| the value is not necessarily consistent over time. In fact, ``%A`` and |
| ``%C`` need to have the same semantics or the core LLVM "replace all |
| uses with" concept would not hold. |
| |
| .. code-block:: llvm |
| |
| %A = sdiv undef, %X |
| %B = sdiv %X, undef |
| Safe: |
| %A = 0 |
| b: unreachable |
| |
| These examples show the crucial difference between an *undefined value* |
| and *undefined behavior*. An undefined value (like '``undef``') is |
| allowed to have an arbitrary bit-pattern. This means that the ``%A`` |
| operation can be constant folded to '``0``', because the '``undef``' |
| could be zero, and zero divided by any nonzero value is zero (a zero |
| ``%X`` would make the division undefined behavior in any case). |
| However, in the second example, we can make a more aggressive |
| assumption: because the ``undef`` is allowed to be an arbitrary value, |
| we are allowed to assume that it could be zero. Since a divide by zero |
| has *undefined behavior*, we are allowed to assume that the operation |
| does not execute at all. This allows us to delete the divide and all |
| code after it. Because the undefined operation "can't happen", the |
| optimizer can assume that it occurs in dead code. |
| |
| .. code-block:: text |
| |
| a: store undef -> %X |
| b: store %X -> undef |
| Safe: |
| a: <deleted> |
| b: unreachable |
| |
| A store *of* an undefined value can be assumed to not have any effect; |
| we can assume that the value is overwritten with bits that happen to |
| match what was already there. However, a store *to* an undefined |
| location could clobber arbitrary memory, therefore, it has undefined |
| behavior. |
| |
| .. _poisonvalues: |
| |
| Poison Values |
| ------------- |
| |
| Poison values are similar to :ref:`undef values <undefvalues>`, however |
| they also represent the fact that an instruction or constant expression |
| that cannot evoke side effects has nevertheless detected a condition |
| that results in undefined behavior. |
| |
| There is currently no way of representing a poison value in the IR; they |
| only exist when produced by operations such as :ref:`add <i_add>` with |
| the ``nsw`` flag. |
| |
| Poison value behavior is defined in terms of value *dependence*: |
| |
| - Values other than :ref:`phi <i_phi>` nodes depend on their operands. |
| - :ref:`Phi <i_phi>` nodes depend on the operand corresponding to |
| their dynamic predecessor basic block. |
| - Function arguments depend on the corresponding actual argument values |
| in the dynamic callers of their functions. |
| - :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>` |
| instructions that dynamically transfer control back to them. |
| - :ref:`Invoke <i_invoke>` instructions depend on the |
| :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing |
| call instructions that dynamically transfer control back to them. |
| - Non-volatile loads and stores depend on the most recent stores to all |
| of the referenced memory addresses, following the order in the IR |
| (including loads and stores implied by intrinsics such as |
| :ref:`@llvm.memcpy <int_memcpy>`.) |
| - An instruction with externally visible side effects depends on the |
| most recent preceding instruction with externally visible side |
| effects, following the order in the IR. (This includes :ref:`volatile |
| operations <volatile>`.) |
| - An instruction *control-depends* on a :ref:`terminator |
| instruction <terminators>` if the terminator instruction has |
| multiple successors and the instruction is always executed when |
| control transfers to one of the successors, and may not be executed |
| when control is transferred to another. |
| - Additionally, an instruction also *control-depends* on a terminator |
| instruction if the set of instructions it otherwise depends on would |
| be different if the terminator had transferred control to a different |
| successor. |
| - Dependence is transitive. |
| |
| Poison values have the same behavior as :ref:`undef values <undefvalues>`, |
| with the additional effect that any instruction that has a *dependence* |
| on a poison value has undefined behavior. |
| |
| Here are some examples: |
| |
| .. code-block:: llvm |
| |
| entry: |
| %poison = sub nuw i32 0, 1 ; Results in a poison value. |
| %still_poison = and i32 %poison, 0 ; 0, but also poison. |
| %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison |
| store i32 0, i32* %poison_yet_again ; memory at @h[0] is poisoned |
| |
| store i32 %poison, i32* @g ; Poison value stored to memory. |
| %poison2 = load i32, i32* @g ; Poison value loaded back from memory. |
| |
| store volatile i32 %poison, i32* @g ; External observation; undefined behavior. |
| |
| %narrowaddr = bitcast i32* @g to i16* |
| %wideaddr = bitcast i32* @g to i64* |
| %poison3 = load i16, i16* %narrowaddr ; Returns a poison value. |
| %poison4 = load i64, i64* %wideaddr ; Returns a poison value. |
| |
| %cmp = icmp slt i32 %poison, 0 ; Returns a poison value. |
| br i1 %cmp, label %true, label %end ; Branch to either destination. |
| |
| true: |
| store volatile i32 0, i32* @g ; This is control-dependent on %cmp, so |
| ; it has undefined behavior. |
| br label %end |
| |
| end: |
| %p = phi i32 [ 0, %entry ], [ 1, %true ] |
| ; Both edges into this PHI are |
| ; control-dependent on %cmp, so this |
| ; always results in a poison value. |
| |
| store volatile i32 0, i32* @g ; This would depend on the store in %true |
| ; if %cmp is true, or the store in %entry |
| ; otherwise, so this is undefined behavior. |
| |
| br i1 %cmp, label %second_true, label %second_end |
| ; The same branch again, but this time the |
| ; true block doesn't have side effects. |
| |
| second_true: |
| ; No side effects! |
| ret void |
| |
| second_end: |
| store volatile i32 0, i32* @g ; This time, the instruction always depends |
| ; on the store in %end. Also, it is |
| ; control-equivalent to %end, so this is |
| ; well-defined (ignoring earlier undefined |
| ; behavior in this example). |
| |
| .. _blockaddress: |
| |
| Addresses of Basic Blocks |
| ------------------------- |
| |
| ``blockaddress(@function, %block)`` |
| |
| The '``blockaddress``' constant computes the address of the specified |
| basic block in the specified function, and always has an ``i8*`` type. |
| Taking the address of the entry block is illegal. |
| |
| This value only has defined behavior when used as an operand to the |
| ':ref:`indirectbr <i_indirectbr>`' instruction, or for comparisons |
| against null. Pointer equality tests between label addresses result in |
| undefined behavior --- though, again, comparison against null is ok, and |
| no label is equal to the null pointer. This may be passed around as an |
| opaque pointer sized value as long as the bits are not inspected. This |
| allows ``ptrtoint`` and arithmetic to be performed on these values so |
| long as the original value is reconstituted before the ``indirectbr`` |
| instruction. |
| |
| Finally, some targets may provide defined semantics when using the value |
| as the operand to an inline assembly, but that is target specific. |
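| |
| A minimal sketch of the intended use is a jump table feeding an |
| ``indirectbr``. The names ``@targets`` and ``@dispatch`` are hypothetical, |
| and ``%i`` is assumed to be 0 or 1. |
| |
| .. code-block:: llvm |
| |
| ; Table of block addresses; neither entry refers to the entry block. |
| @targets = constant [2 x i8*] [ i8* blockaddress(@dispatch, %one), |
| i8* blockaddress(@dispatch, %two) ] |
| |
| define i32 @dispatch(i64 %i) { |
| entry: |
| %slot = getelementptr inbounds [2 x i8*], [2 x i8*]* @targets, i64 0, i64 %i |
| %addr = load i8*, i8** %slot |
| ; Every possible destination must be listed. |
| indirectbr i8* %addr, [ label %one, label %two ] |
| one: |
| ret i32 1 |
| two: |
| ret i32 2 |
| } |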
| |
| .. _constantexprs: |
| |
| Constant Expressions |
| -------------------- |
| |
| Constant expressions are used to allow expressions involving other |
| constants to be used as constants. Constant expressions may be of any |
| :ref:`first class <t_firstclass>` type and may involve any LLVM operation |
| that does not have side effects (e.g. load and call are not supported). |
| The following is the syntax for constant expressions: |
| |
| ``trunc (CST to TYPE)`` |
| Perform the :ref:`trunc operation <i_trunc>` on constants. |
| ``zext (CST to TYPE)`` |
| Perform the :ref:`zext operation <i_zext>` on constants. |
| ``sext (CST to TYPE)`` |
| Perform the :ref:`sext operation <i_sext>` on constants. |
| ``fptrunc (CST to TYPE)`` |
| Truncate a floating-point constant to another floating-point type. |
| The size of CST must be larger than the size of TYPE. Both types |
| must be floating-point. |
| ``fpext (CST to TYPE)`` |
| Floating-point extend a constant to another type. The size of CST |
| must be smaller than or equal to the size of TYPE. Both types must be |
| floating-point. |
| ``fptoui (CST to TYPE)`` |
| Convert a floating-point constant to the corresponding unsigned |
| integer constant. TYPE must be a scalar or vector integer type. CST |
| must be of scalar or vector floating-point type. Both CST and TYPE |
| must be scalars, or vectors of the same number of elements. If the |
| value won't fit in the integer type, the result is a |
| :ref:`poison value <poisonvalues>`. |
| ``fptosi (CST to TYPE)`` |
| Convert a floating-point constant to the corresponding signed |
| integer constant. TYPE must be a scalar or vector integer type. CST |
| must be of scalar or vector floating-point type. Both CST and TYPE |
| must be scalars, or vectors of the same number of elements. If the |
| value won't fit in the integer type, the result is a |
| :ref:`poison value <poisonvalues>`. |
| ``uitofp (CST to TYPE)`` |
| Convert an unsigned integer constant to the corresponding |
| floating-point constant. TYPE must be a scalar or vector floating-point |
| type. CST must be of scalar or vector integer type. Both CST and TYPE must |
| be scalars, or vectors of the same number of elements. |
| ``sitofp (CST to TYPE)`` |
| Convert a signed integer constant to the corresponding floating-point |
| constant. TYPE must be a scalar or vector floating-point type. |
| CST must be of scalar or vector integer type. Both CST and TYPE must |
| be scalars, or vectors of the same number of elements. |
| ``ptrtoint (CST to TYPE)`` |
| Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants. |
| ``inttoptr (CST to TYPE)`` |
| Perform the :ref:`inttoptr operation <i_inttoptr>` on constants. |
| This one is *really* dangerous! |
| ``bitcast (CST to TYPE)`` |
| Convert a constant, CST, to another TYPE. |
| The constraints of the operands are the same as those for the |
| :ref:`bitcast instruction <i_bitcast>`. |
| ``addrspacecast (CST to TYPE)`` |
| Convert a constant pointer or constant vector of pointers, CST, to another |
| TYPE in a different address space. The constraints of the operands are the |
| same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`. |
| ``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)`` |
| Perform the :ref:`getelementptr operation <i_getelementptr>` on |
| constants. As with the :ref:`getelementptr <i_getelementptr>` |
| instruction, the index list may have one or more indexes, which are |
| required to make sense for the type of "pointer to TY". |
| ``select (COND, VAL1, VAL2)`` |
| Perform the :ref:`select operation <i_select>` on constants. |
| ``icmp COND (VAL1, VAL2)`` |
| Perform the :ref:`icmp operation <i_icmp>` on constants. |
| ``fcmp COND (VAL1, VAL2)`` |
| Perform the :ref:`fcmp operation <i_fcmp>` on constants. |
| ``extractelement (VAL, IDX)`` |
| Perform the :ref:`extractelement operation <i_extractelement>` on |
| constants. |
| ``insertelement (VAL, ELT, IDX)`` |
| Perform the :ref:`insertelement operation <i_insertelement>` on |
| constants. |
| ``shufflevector (VEC1, VEC2, IDXMASK)`` |
| Perform the :ref:`shufflevector operation <i_shufflevector>` on |
| constants. |
| ``extractvalue (VAL, IDX0, IDX1, ...)`` |
| Perform the :ref:`extractvalue operation <i_extractvalue>` on |
| constants. The index list is interpreted in a similar manner as |
| indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At |
| least one index value must be specified. |
| ``insertvalue (VAL, ELT, IDX0, IDX1, ...)`` |
| Perform the :ref:`insertvalue operation <i_insertvalue>` on constants. |
| The index list is interpreted in a similar manner as indices in a |
| ':ref:`getelementptr <i_getelementptr>`' operation. At least one index |
| value must be specified. |
| ``OPCODE (LHS, RHS)`` |
| Perform the specified operation on the LHS and RHS constants. OPCODE |
| may be any of the :ref:`binary <binaryops>` or :ref:`bitwise |
| binary <bitwiseops>` operations. The constraints on operands are |
| the same as those for the corresponding instruction (e.g. no bitwise |
| operations on floating-point values are allowed). |
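| |
| A few of these forms are sketched below as global initializers. The global |
| names are hypothetical, and the ``ptrtoint`` line assumes a target with |
| 64-bit pointers. |
| |
| .. code-block:: llvm |
| |
| @arr = global [4 x i32] [ i32 1, i32 2, i32 3, i32 4 ] |
| |
| ; Address of the third element of @arr. |
| @third = global i32* getelementptr inbounds ([4 x i32], [4 x i32]* @arr, i64 0, i64 2) |
| |
| ; The same global viewed as a byte pointer and as an integer. |
| @bytes = global i8* bitcast ([4 x i32]* @arr to i8*) |
| @addr = global i64 ptrtoint ([4 x i32]* @arr to i64) |
| |
| ; A binary operation on two simple constants. |
| @sum = global i32 add (i32 40, i32 2) |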
| |
| Other Values |
| ============ |
| |
| .. _inlineasmexprs: |
| |
| Inline Assembler Expressions |
| ---------------------------- |
| |
| LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level |
| Inline Assembly <moduleasm>`) through the use of a special value. This value |
| represents the inline assembler as a template string (containing the |
| instructions to emit), a list of operand constraints (stored as a string), a |
| flag that indicates whether or not the inline asm expression has side effects, |
| and a flag indicating whether the function containing the asm needs to align its |
| stack conservatively. |
| |
| The template string supports argument substitution of the operands using "``$``" |
| followed by a number, to indicate substitution of the given register/memory |
| location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also |
| be used, where ``MODIFIER`` is a target-specific annotation for how to print the |
| operand (See :ref:`inline-asm-modifiers`). |
| |
| A literal "``$``" may be included by using "``$$``" in the template. To include |
| other special characters into the output, the usual "``\XX``" escapes may be |
| used, just as in other strings. Note that after template substitution, the |
| resulting assembly string is parsed by LLVM's integrated assembler unless it is |
| disabled -- even when emitting a ``.s`` file -- and thus must contain assembly |
| syntax known to LLVM. |
| |
| LLVM also supports a few more substitutions useful for writing inline assembly: |
| |
| - ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob. |
| This substitution is useful when declaring a local label. Many standard |
| compiler optimizations, such as inlining, may duplicate an inline asm blob. |
| Adding a blob-unique identifier ensures that the two labels will not conflict |
| during assembly. This is used to implement `GCC's %= special format |
| string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_. |
| - ``${:comment}``: Expands to the comment character of the current target's |
| assembly dialect. This is usually ``#``, but many targets use other strings, |
| such as ``;``, ``//``, or ``!``. |
| - ``${:private}``: Expands to the assembler private label prefix. Labels with |
| this prefix will not appear in the symbol table of the assembled object. |
| Typically the prefix is ``L``, but targets may use other strings. ``.L`` is |
| relatively popular. |
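| |
| As a sketch of the ``${:uid}`` substitution, the blob below declares and |
| jumps to a local label whose name stays unique even if the blob is |
| duplicated. It assumes an x86 target using AT&T syntax; the function name |
| ``@skip_nop`` and the label stem ``.Lskip`` are hypothetical. |
| |
| .. code-block:: llvm |
| |
| define void @skip_nop() { |
| ; \0A is a newline and \09 a tab inside the template string. |
| call void asm sideeffect "jmp .Lskip${:uid}\0A\09nop\0A.Lskip${:uid}:", ""() |
| ret void |
| } |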
| |
| LLVM's support for inline asm is modeled closely on the requirements of Clang's |
| GCC-compatible inline-asm support. Thus, the feature-set and the constraint and |
| modifier codes listed here are similar or identical to those in GCC's inline asm |
| support. However, to be clear, the syntax of the template and constraint strings |
| described here is *not* the same as the syntax accepted by GCC and Clang, and, |
| while most constraint letters are passed through as-is by Clang, some get |
| translated to other codes when converting from the C source to the LLVM |
| assembly. |
| |
| An example inline assembler expression is: |
| |
| .. code-block:: llvm |
| |
| i32 (i32) asm "bswap $0", "=r,r" |
| |
| Inline assembler expressions may **only** be used as the callee operand |
| of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction. |
| Thus, typically we have: |
| |
| .. code-block:: llvm |
| |
| %X = call i32 asm "bswap $0", "=r,r"(i32 %Y) |
| |
| Inline asms with side effects not visible in the constraint list must be |
| marked as having side effects. This is done through the use of the |
| '``sideeffect``' keyword, like so: |
| |
| .. code-block:: llvm |
| |
| call void asm sideeffect "eieio", ""() |
| |
| In some cases inline asms will contain code that will not work unless |
| the stack is aligned in some way, such as calls or SSE instructions on |
| x86, yet will not contain code that does that alignment within the asm. |
| The compiler should make conservative assumptions about what the asm |
| might contain and should generate its usual stack alignment code in the |
| prologue if the '``alignstack``' keyword is present: |
| |
| .. code-block:: llvm |
| |
| call void asm alignstack "eieio", ""() |
| |
| Inline asms also support using non-standard assembly dialects. The |
| assumed dialect is ATT. When the '``inteldialect``' keyword is present, |
| the inline asm is using the Intel dialect. Currently, ATT and Intel are |
| the only supported dialects. An example is: |
| |
| .. code-block:: llvm |
| |
| call void asm inteldialect "eieio", ""() |
| |
| If multiple keywords appear the '``sideeffect``' keyword must come |
| first, the '``alignstack``' keyword second and the '``inteldialect``' |
| keyword last. |
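| |
| For example, an expression that uses all three keywords would be written in |
| the following order. This is only a sketch and assumes an x86 target, where |
| ``nop`` is valid in both dialects. |
| |
| .. code-block:: llvm |
| |
| call void asm sideeffect alignstack inteldialect "nop", ""() |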
| |
| Inline Asm Constraint String |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| The constraint list is a comma-separated string, each element containing one or |
| more constraint codes. |
| |
| For each element in the constraint list an appropriate register or memory |
| operand will be chosen, and it will be made available to assembly template |
| string expansion as ``$0`` for the first constraint in the list, ``$1`` for the |
| second, etc. |
| |
| There are three different types of constraints, which are distinguished by a |
| prefix symbol in front of the constraint code: Output, Input, and Clobber. The |
| constraints must always be given in that order: outputs first, then inputs, then |
| clobbers. They cannot be intermingled. |
| |
| There are also three different categories of constraint codes: |
| |
| - Register constraint. This is either a register class, or a fixed physical |
| register. This kind of constraint will allocate a register, and if necessary, |
| bitcast the argument or result to the appropriate type. |
| - Memory constraint. This kind of constraint is for use with an instruction |
| taking a memory operand. Different constraints allow for different addressing |
| modes used by the target. |
| - Immediate value constraint. This kind of constraint is for an integer or other |
| immediate value which can be rendered directly into an instruction. The |
| various target-specific constraints allow the selection of a value in the |
| proper range for the instruction you wish to use it with. |
| |
| Output constraints |
| """""""""""""""""" |
| |
| Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This |
| indicates that the assembly will write to this operand, and the operand will |
| then be made available as a return value of the ``asm`` expression. Output |
| constraints do not consume an argument from the call instruction. (Except, see |
| below about indirect outputs). |
| |
| Normally, it is expected that no output locations are written to by the assembly |
| expression until *all* of the inputs have been read. As such, LLVM may assign |
| the same register to an output and an input. If this is not safe (e.g. if the |
| assembly contains two instructions, where the first writes to one output, and |
| the second reads an input and writes to a second output), then the "``&``" |
| modifier must be used (e.g. "``=&r``") to specify that the output is an |
| "early-clobber" output. Marking an output as "early-clobber" ensures that LLVM |
| will not use the same register for any inputs (other than an input tied to this |
| output). |
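| |
| A hedged sketch of the situation described above, assuming an x86 target and |
| hypothetical values ``%a`` and ``%b``: the template writes the output in its |
| first instruction and only reads the second input afterwards, so the output |
| is marked early-clobber: |
| |
| .. code-block:: llvm |
| |
| ; Without "&", LLVM could place $0 in the same register as $2, and the |
| ; "movl" would then overwrite $2 before the "addl" reads it. |
| %r = call i32 asm "movl $1, $0\0A\09addl $2, $0", "=&r,r,r,~{flags}"(i32 %a, i32 %b) |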
| |
| Input constraints |
| """"""""""""""""" |
| |
| Input constraints do not have a prefix -- just the constraint codes. Each input |
| constraint will consume one argument from the call instruction. It is not |
| permitted for the asm to write to any input register or memory location (unless |
| that input is tied to an output). Note also that multiple inputs may all be |
| assigned to the same register, if LLVM can determine that they necessarily all |
| contain the same value. |
| |
| Instead of providing a Constraint Code, input constraints may also "tie" |
| themselves to an output constraint, by providing an integer as the constraint |
| string. Tied inputs still consume an argument from the call instruction, and |
| take up a position in the asm template numbering as is usual -- they will simply |
| be constrained to always use the same register as the output they've been tied |
| to. For example, a constraint string of "``=r,0``" says to assign a register for |
| output, and use that register as an input as well (it being the 0'th |
| constraint). |
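| |
| For instance, assuming an x86 target and a hypothetical value ``%x``, a tied |
| input lets a single-operand instruction modify its register in place: |
| |
| .. code-block:: llvm |
| |
| ; The input %x is placed in the register allocated for output $0, which |
| ; "bswap" then updates in place. |
| %swapped = call i32 asm "bswap $0", "=r,0"(i32 %x) |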
| |
| It is permitted to tie an input to an "early-clobber" output. In that case, no |
| *other* input may share the same register as the input tied to the early-clobber |
| (even when the other input has the same value). |
| |
| You may only tie an input to an output which has a register constraint, not a |
| memory constraint. Only a single input may be tied to an output. |
| |
| There is also an "interesting" feature which deserves a bit of explanation: if a |
| register class constraint allocates a register which is too small for the value |
| type operand provided as input, the input value will be split into multiple |
| registers, and all of them passed to the inline asm. |
| |
| However, this feature is often not as useful as you might think. |
| |
| Firstly, the registers are *not* guaranteed to be consecutive. So, on those |
| architectures that have instructions which operate on multiple consecutive |
| registers, this is not an appropriate way to support them. (e.g. the 32-bit |
| SparcV8 has a 64-bit load, which instruction takes a single 32-bit register. The |
| hardware then loads into both the named register, and the next register. This |
| feature of inline asm would not be useful to support that.) |
| |
| A few of the targets provide a template string modifier allowing explicit access |
| to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and |
| ``D``). On such an architecture, you can actually access the second allocated |
| register (yet, still, not any subsequent ones). But, in that case, you're still |
| probably better off simply splitting the value into two separate operands, for |
| clarity. (e.g. see the description of the ``A`` constraint on X86, which, |
| despite existing only for use with this feature, is not really a good idea to |
| use) |
| |
| Indirect inputs and outputs |
| """"""""""""""""""""""""""" |
| |
| Indirect output or input constraints can be specified by the "``*``" modifier |
| (which goes after the "``=``" in case of an output). This indicates that the asm |
| will write to or read from the contents of an *address* provided as an input |
| argument. (Note that in this way, indirect outputs act more like an *input* than |
| an output: just like an input, they consume an argument of the call expression, |
| rather than producing a return value. An indirect output constraint is an |
| "output" only in that the asm is expected to write to the contents of the input |
| memory location, instead of just read from it). |
| |
| This is most typically used for a memory constraint, e.g. "``=*m``", to pass |
| the address of a variable as a value. |
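| |
| A small sketch of an indirect memory output, assuming an x86 target, a |
| hypothetical pointer ``%p`` and value ``%v``, and the typed-pointer syntax |
| ``i32*`` (newer LLVM releases with opaque pointers spell the argument as |
| ``ptr elementtype(i32) %p``): |
| |
| .. code-block:: llvm |
| |
| ; "=*m" consumes the pointer argument %p as operand $0; the asm stores %v |
| ; through it, so the call itself produces no return value. |
| call void asm "movl $1, $0", "=*m,r"(i32* %p, i32 %v) |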
| |
| It is also possible to use an indirect *register* constraint, but only on output |
| (e.g. "``=*r``"). This will cause LLVM to allocate a register for an output |
| value normally, and then, separately emit a store to the address provided as |
| input, after the provided inline asm. (It's not clear what value this |
| functionality provides, compared to writing the store explicitly after the asm |
| statement, and it can only produce worse code, since it bypasses many |
| optimization passes. Its use is not recommended.) |
| |
| |
| Clobber constraints |
| """"""""""""""""""" |
| |
| A clobber constraint is indicated by a "``~``" prefix. A clobber does not |
| consume an input operand, nor generate an output. Clobbers cannot use any of the |
| general constraint code letters -- they may use only explicit register |
| constraints, e.g. "``~{eax}``". The one exception is that a clobber string of |
| "``~{memory}``" indicates that the assembly writes to arbitrary undeclared |
| memory locations -- not only the memory pointed to by a declared indirect |
| output. |
| |
| Note that clobbering named registers that are also present in output |
| constraints is not legal. |
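| |
| Two illustrative uses, assuming an x86 target for the register names: an empty |
| template with a ``~{memory}`` clobber, and a template that overwrites a fixed |
| register and the flags without declaring any operands: |
| |
| .. code-block:: llvm |
| |
| ; Emits no instructions, but tells LLVM that arbitrary memory may be written. |
| call void asm sideeffect "", "~{memory}"() |
| ; EAX and the flags are overwritten by the asm; neither is an operand. |
| call void asm sideeffect "xorl %eax, %eax", "~{eax},~{flags}"() |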
| |
| |
| Constraint Codes |
| """""""""""""""" |
| After a potential prefix comes the constraint code, or codes. |
| |
| A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character |
| followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``" |
| (e.g. "``{eax}``"). |
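| |
| As a sketch of the ``{register-name}`` form, assuming an x86 target and a |
| hypothetical value ``%x``, explicit register codes can be combined with the |
| output prefix just like single-letter codes: |
| |
| .. code-block:: llvm |
| |
| ; "={eax}" pins the result to EAX and "{ecx}" pins the input to ECX. |
| %r = call i32 asm "movl $1, $0", "={eax},{ecx}"(i32 %x) |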
| |
| The one and two letter constraint codes are typically chosen to be the same as |
| GCC's constraint codes. |
| |
| A single constraint may include one or more constraint codes in it, leaving |
| it up to LLVM to choose which one to use. This is included mainly for |
| compatibility with the translation of GCC inline asm coming from clang. |
| |
| There are two ways to specify alternatives, and either or both may be used in an |
| inline asm constraint list: |
| |
| 1) Append the codes to each other, making a constraint code set. E.g. "``im``" |
| or "``{eax}m``". This means "choose any of the options in the set". The |
| choice of constraint is made independently for each constraint in the |
| constraint list. |
| |
| 2) Use "``|``" between constraint code sets, creating alternatives. Every |
| constraint in the constraint list must have the same number of alternative |
| sets. With this syntax, the same alternative in *all* of the items in the |
| constraint list will be chosen together. |
| |
| Putting those together, you might have a two operand constraint string like |
| ``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then |
| operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1 |
| may be one of ``r`` or ``m``. But, operand 0 and 1 cannot both be of type m. |
| |
| However, the use of either of the alternatives features is *NOT* recommended, as |
| LLVM is not able to make an intelligent choice about which one to use. (At the |
| point it currently needs to choose, not enough information is available to do so |
| in a smart way.) Thus, it simply tries to make a choice that's most likely to |
| compile, not one that will give optimal performance. (e.g., given "``rm``", it'll |
| always choose to use memory, not registers). And, if given multiple registers, |
| or multiple register classes, it will simply choose the first one. (In fact, it |
| doesn't currently even ensure explicitly specified physical registers are |
| unique, so specifying multiple physical registers as alternatives, like |
| ``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was |
| intended.) |
| |
| Supported Constraint Code List |
| """""""""""""""""""""""""""""" |
| |
| The constraint codes are, in general, expected to behave the same way they do in |
| GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C |
| inline asm code which was supported by GCC. A mismatch in behavior between LLVM |
| and GCC likely indicates a bug in LLVM. |
| |
| Some constraint codes are typically supported by all targets: |
| |
| - ``r``: A register in the target's general purpose register class. |
| - ``m``: A memory address operand. It is target-specific what addressing modes |
| are supported, typical examples are register, or register + register offset, |
| or register + immediate offset (of some target-specific size). |
| - ``i``: An integer constant (of target-specific width). Allows either a simple |
| immediate, or a relocatable value. |
| - ``n``: An integer constant -- *not* including relocatable values. |
| - ``s``: An integer constant, but allowing *only* relocatable values. |
| - ``X``: Allows an operand of any kind, no constraint whatsoever. Typically |
| useful to pass a label for an asm branch or call. |
| |
| .. FIXME: but that surely isn't actually okay to jump out of an asm |
| block without telling llvm about the control transfer???) |
| |
| - ``{register-name}``: Requires exactly the named physical register. |
| |
| Other constraints are target-specific: |
| |
| AArch64: |
| |
| - ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate. |
| - ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction, |
| i.e. 0 to 4095 with optional shift by 12. |
| - ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or |
| ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12. |
| - ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a |
| logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register. |
| - ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a |
| logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register. |
| - ``M``: An immediate integer for use with the ``MOV`` assembly alias on a |
| 32-bit register. This is a superset of ``K``: in addition to the bitmask |
| immediate, also allows immediate integers which can be loaded with a single |
| ``MOVZ`` or ``MOVN`` instruction. |
| - ``N``: An immediate integer for use with the ``MOV`` assembly alias on a |
| 64-bit register. This is a superset of ``L``. |
| - ``Q``: Memory address operand must be in a single register (no |
| offsets). (However, LLVM currently does this for the ``m`` constraint as |
| well.) |
| - ``r``: A 32 or 64-bit integer register (W* or X*). |
| - ``w``: A 32, 64, or 128-bit floating-point/SIMD register. |
| - ``x``: A lower 128-bit floating-point/SIMD register (``V0`` to ``V15``). |
| |
| AMDGPU: |
| |
| - ``r``: A 32 or 64-bit integer register. |
| - ``[0-9]v``: The 32-bit VGPR register, number 0-9. |
| - ``[0-9]s``: The 32-bit SGPR register, number 0-9. |
| |
| |
| All ARM modes: |
| |
| - ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address |
| operand. Treated the same as operand ``m``, at the moment. |
| |
| ARM and ARM's Thumb2 mode: |
| |
| - ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``) |
| - ``I``: An immediate integer valid for a data-processing instruction. |
| - ``J``: An immediate integer between -4095 and 4095. |
| - ``K``: An immediate integer whose bitwise inverse is valid for a |
| data-processing instruction. (Can be used with template modifier "``B``" to |
| print the inverted value). |
| - ``L``: An immediate integer whose negation is valid for a data-processing |
| instruction. (Can be used with template modifier "``n``" to print the negated |
| value). |
| - ``M``: A power of two or an integer between 0 and 32. |
| - ``N``: Invalid immediate constraint. |
| - ``O``: Invalid immediate constraint. |
| - ``r``: A general-purpose 32-bit integer register (``r0-r15``). |
| - ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same |
| as ``r``. |
| - ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode, |
| invalid. |
| - ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``, |
| ``d0-d31``, or ``q0-q15``. |
| - ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``, |
| ``d0-d7``, or ``q0-q3``. |
| - ``t``: A low floating-point/SIMD register: ``s0-s31``, ``d0-d16``, or |
| ``q0-q8``. |
| |
| ARM's Thumb1 mode: |
| |
| - ``I``: An immediate integer between 0 and 255. |
| - ``J``: An immediate integer between -255 and -1. |
| - ``K``: An immediate integer between 0 and 255, with optional left-shift by |
| some amount. |
| - ``L``: An immediate integer between -7 and 7. |
| - ``M``: An immediate integer which is a multiple of 4 between 0 and 1020. |
| - ``N``: An immediate integer between 0 and 31. |
| - ``O``: An immediate integer which is a multiple of 4 between -508 and 508. |
| - ``r``: A low 32-bit GPR register (``r0-r7``). |
| - ``l``: A low 32-bit GPR register (``r0-r7``). |
| - ``h``: A high GPR register (``r8-r15``). |
| - ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``, |
| ``d0-d31``, or ``q0-q15``. |
| - ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``, |
| ``d0-d7``, or ``q0-q3``. |
| - ``t``: A low floating-point/SIMD register: ``s0-s31``, ``d0-d16``, or |
| ``q0-q8``. |
| |
| |
| Hexagon: |
| |
| - ``o``, ``v``: A memory address operand, treated the same as constraint ``m``, |
| at the moment. |
| - ``r``: A 32 or 64-bit register. |
| |
| MSP430: |
| |
| - ``r``: An 8 or 16-bit register. |
| |
| MIPS: |
| |
| - ``I``: An immediate signed 16-bit integer. |
| - ``J``: An immediate integer zero. |
| - ``K``: An immediate unsigned 16-bit integer. |
| - ``L``: An immediate 32-bit integer, where the lower 16 bits are 0. |
| - ``N``: An immediate integer between -65535 and -1. |
| - ``O``: An immediate signed 15-bit integer. |
| - ``P``: An immediate integer between 1 and 65535. |
| - ``m``: A memory address operand. In MIPS-SE mode, allows a base address |
| register plus 16-bit immediate offset. In MIPS mode, just a base register. |
| - ``R``: A memory address operand. In MIPS-SE mode, allows a base address |
| register plus a 9-bit signed offset. In MIPS mode, the same as constraint |
| ``m``. |
| - ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or |
| ``sc`` instruction on the given subtarget (details vary). |
| - ``r``, ``d``, ``y``: A 32 or 64-bit GPR register. |
| - ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register |
| (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w`` |
| argument modifier for compatibility with GCC. |
| - ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always |
| ``25``). |
| - ``l``: The ``lo`` register, 32 or 64-bit. |
| - ``x``: Invalid. |
| |
| NVPTX: |
| |
| - ``b``: A 1-bit integer register. |
| - ``c`` or ``h``: A 16-bit integer register. |
| - ``r``: A 32-bit integer register. |
| - ``l`` or ``N``: A 64-bit integer register. |
| - ``f``: A 32-bit float register. |
| - ``d``: A 64-bit float register. |
| |
| |
| PowerPC: |
| |
| - ``I``: An immediate signed 16-bit integer. |
| - ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits. |
| - ``K``: An immediate unsigned 16-bit integer. |
| - ``L``: An immediate signed 16-bit integer, shifted left 16 bits. |
| - ``M``: An immediate integer greater than 31. |
| - ``N``: An immediate integer that is an exact power of 2. |
| - ``O``: The immediate integer constant 0. |
| - ``P``: An immediate integer constant whose negation is a signed 16-bit |
| constant. |
| - ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently |
| treated the same as ``m``. |
| - ``r``: A 32 or 64-bit integer register. |
| - ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is: |
| ``R1-R31``). |
| - ``f``: A 32 or 64-bit float register (``F0-F31``), or when QPX is enabled, a |
| 128 or 256-bit QPX register (``Q0-Q31``; aliases the ``F`` registers). |
| - ``v``: For ``4 x f32`` or ``4 x f64`` types, when QPX is enabled, a |
| 128 or 256-bit QPX register (``Q0-Q31``), otherwise a 128-bit |
| altivec vector register (``V0-V31``). |
| |
| .. FIXME: is this a bug that v accepts QPX registers? I think this |
| is supposed to only use the altivec vector registers? |
| |
| - ``y``: Condition register (``CR0-CR7``). |
| - ``wc``: An individual CR bit in a CR register. |
| - ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX |
| register set (overlapping both the floating-point and vector register files). |
| - ``ws``: A 32 or 64-bit floating-point register, from the full VSX register |
| set. |
| |
| Sparc: |
| |
| - ``I``: An immediate 13-bit signed integer. |
| - ``r``: A 32-bit integer register. |
| - ``f``: Any floating-point register on SparcV8, or a floating-point |
| register in the "low" half of the registers on SparcV9. |
| - ``e``: Any floating-point register. (Same as ``f`` on SparcV8.) |
| |
| SystemZ: |
| |
| - ``I``: An immediate unsigned 8-bit integer. |
| - ``J``: An immediate unsigned 12-bit integer. |
| - ``K``: An immediate signed 16-bit integer. |
| - ``L``: An immediate signed 20-bit integer. |
| - ``M``: An immediate integer 0x7fffffff. |
| - ``Q``: A memory address operand with a base address and a 12-bit immediate |
| unsigned displacement. |
| - ``R``: A memory address operand with a base address, a 12-bit immediate |
| unsigned displacement, and an index register. |
| - ``S``: A memory address operand with a base address and a 20-bit immediate |
| signed displacement. |
| - ``T``: A memory address operand with a base address, a 20-bit immediate |
| signed displacement, and an index register. |
| - ``r`` or ``d``: A 32, 64, or 128-bit integer register. |
| - ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an |
| address context evaluates as zero). |
| - ``h``: A 32-bit value in the high part of a 64-bit data register |
| (LLVM-specific). |
| - ``f``: A 32, 64, or 128-bit floating-point register. |
| |
| X86: |
| |
| - ``I``: An immediate integer between 0 and 31. |
| - ``J``: An immediate integer between 0 and 63. |
| - ``K``: An immediate signed 8-bit integer. |
| - ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only) |
| 0xffffffff. |
| - ``M``: An immediate integer between 0 and 3. |
| - ``N``: An immediate unsigned 8-bit integer. |
| - ``O``: An immediate integer between 0 and 127. |
| - ``e``: An immediate 32-bit signed integer. |
| - ``Z``: An immediate 32-bit unsigned integer. |
| - ``o``, ``v``: Treated the same as ``m``, at the moment. |
| - ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit |
| ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d`` |
| registers, and on X86-64, it is all of the integer registers. |
| - ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit |
| ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers. |
| - ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register. |
| - ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has |
| existed since i386, and can be accessed without the REX prefix. |
| - ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register. |
| - ``y``: A 64-bit MMX register, if MMX is enabled. |
| - ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector |
| operand in a SSE register. If AVX is also enabled, can also be a 256-bit |
| vector operand in an AVX register. If AVX-512 is also enabled, can also be a |
| 512-bit vector operand in an AVX512 register. Otherwise, an error. |
| - ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error. |
| - ``A``: Special case: allocates EAX first, then EDX, for a single operand (in |
| 32-bit mode, a 64-bit integer operand will get split into two registers). It |
| is not recommended to use this constraint, as in 64-bit mode, the 64-bit |
| operand will get allocated only to RAX -- if two 32-bit operands are needed, |
| you're better off splitting it yourself, before passing it to the asm |
| statement. |
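| |
| As a hedged sketch of a target-specific immediate constraint, the X86 ``I`` |
| code can be used for a shift count (``%x`` is a hypothetical value; the |
| constant ``3`` is chosen only for illustration): |
| |
| .. code-block:: llvm |
| |
| ; "I" accepts the constant 3 because it lies between 0 and 31; operand $2 |
| ; is rendered directly into the instruction as an immediate. |
| %shl = call i32 asm "shll $2, $0", "=r,0,I,~{flags}"(i32 %x, i32 3) |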
| |
| XCore: |
| |
| - ``r``: A 32-bit integer register. |
| |
| |
| .. _inline-asm-modifiers: |
| |
| Asm template argument modifiers |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| In the asm template string, modifiers can be used on the operand reference, like |
| "``${0:n}``". |
| |
| The modifiers are, in general, expected to behave the same way they do in |
| GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C |
| inline asm code which was supported by GCC. A mismatch in behavior between LLVM |
| and GCC likely indicates a bug in LLVM. |
| |
| Target-independent: |
| |
| - ``c``: Print an immediate integer constant unadorned, without |
| the target-specific immediate punctuation (e.g. no ``$`` prefix). |
| - ``n``: Negate and print immediate integer constant unadorned, without the |
| target-specific immediate punctuation (e.g. no ``$`` prefix). |
| - ``l``: Print as an unadorned label, without the target-specific label |
| punctuation (e.g. no ``$`` prefix). |
| |
| AArch64: |
| |
| - ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g., |
| instead of ``x30``, print ``w30``. |
| - ``x``: Print a GPR register with a ``x*`` name. (this is the default, anyhow). |
| - ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a |
| ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of |
| ``v*``. |
| |
| AMDGPU: |
| |
| - ``r``: No effect. |
| |
| ARM: |
| |
| - ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a |
| register). |
| - ``P``: No effect. |
| - ``q``: No effect. |
| - ``y``: Print a VFP single-precision register as an indexed double (e.g. print |
| as ``d4[1]`` instead of ``s9``) |
| - ``B``: Bitwise invert and print an immediate integer constant without ``#`` |
| prefix. |
| - ``L``: Print the low 16 bits of an immediate integer constant. |
| - ``M``: Print as a register set suitable for ldm/stm. Also prints *all* |
| register operands subsequent to the specified one (!), so use carefully. |
| - ``Q``: Print the low-order register of a register-pair, or the low-order |
| register of a two-register operand. |
| - ``R``: Print the high-order register of a register-pair, or the high-order |
| register of a two-register operand. |
| - ``H``: Print the second register of a register-pair. (On a big-endian system, |
| ``H`` is equivalent to ``Q``, and on a little-endian system, ``H`` is equivalent |
| to ``R``.) |
| |
| .. FIXME: H doesn't currently support printing the second register |
| of a two-register operand. |
| |
| - ``e``: Print the low doubleword register of a NEON quad register. |
| - ``f``: Print the high doubleword register of a NEON quad register. |
| - ``m``: Print the base register of a memory operand without the ``[`` and ``]`` |
| adornment. |
| |
| Hexagon: |
| |
| - ``L``: Print the second register of a two-register operand. Requires that it |
| has been allocated consecutively to the first. |
| |
| .. FIXME: why is it restricted to consecutive ones? And there's |
| nothing that ensures that happens, is there? |
| |
| - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise |
| nothing. Used to print 'addi' vs 'add' instructions. |
| |
| MSP430: |
| |
| No additional modifiers. |
| |
| MIPS: |
| |
| - ``X``: Print an immediate integer as hexadecimal. |
| - ``x``: Print the low 16 bits of an immediate integer as hexadecimal. |
| - ``d``: Print an immediate integer as decimal. |
| - ``m``: Subtract one and print an immediate integer as decimal. |
| - ``z``: Print $0 if an immediate zero, otherwise print normally. |
| - ``L``: Print the low-order register of a two-register operand, or prints the |
| address of the low-order word of a double-word memory operand. |
| |
| .. FIXME: L seems to be missing memory operand support. |
| |
| - ``M``: Print the high-order register of a two-register operand, or prints the |
| address of the high-order word of a double-word memory operand. |
| |
| .. FIXME: M seems to be missing memory operand support. |
| |
| - ``D``: Print the second register of a two-register operand, or prints the |
| second word of a double-word memory operand. (On a big-endian system, ``D`` is |
| equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to |
| ``M``.) |
| - ``w``: No effect. Provided for compatibility with GCC which requires this |
| modifier in order to print MSA registers (``W0-W31``) with the ``f`` |
| constraint. |
| |
| NVPTX: |
| |
| - ``r``: No effect. |
| |
| PowerPC: |
| |
| - ``L``: Print the second register of a two-register operand. Requires that it |
| has been allocated consecutively to the first. |
| |
| .. FIXME: why is it restricted to consecutive ones? And there's |
| nothing that ensures that happens, is there? |
| |
| - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise |
| nothing. Used to print 'addi' vs 'add' instructions. |
| - ``y``: For a memory operand, prints formatter for a two-register X-form |
| instruction. (Currently always prints ``r0,OPERAND``). |
| - ``U``: Prints 'u' if the memory operand is an update form, and nothing |
| otherwise. (NOTE: LLVM does not support update form, so this will currently |
| always print nothing) |
| - ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does |
| not support indexed form, so this will currently always print nothing) |
| |
| Sparc: |
| |
| - ``r``: No effect. |
| |
| SystemZ: |
| |
| SystemZ implements only ``n``, and does *not* support any of the other |
| target-independent modifiers. |
| |
| X86: |
| |
| - ``c``: Print an unadorned integer or symbol name. (The latter is |
| target-specific behavior for this typically target-independent modifier). |
| - ``A``: Print a register name with a '``*``' before it. |
| - ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory |
| operand. |
| - ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a |
| memory operand. |
| - ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory |
| operand. |
| - ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory |
| operand. |
| - ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are |
| available, otherwise the 32-bit register name; do nothing on a memory operand. |
| - ``n``: Negate and print an unadorned integer, or, for operands other than an |
| immediate integer (e.g. a relocatable symbol expression), print a '-' before |
| the operand. (The behavior for relocatable symbol expressions is a |
| target-specific behavior for this typically target-independent modifier) |
| - ``H``: Print a memory reference with additional offset +8. |
| - ``P``: Print a memory reference or operand for use as the argument of a call |
| instruction. (E.g. omit ``(rip)``, even though it's PC-relative.) |
| |
| XCore: |
| |
| No additional modifiers. |
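| |
| As a sketch of how a modifier is applied in practice, the X86 ``b`` modifier |
| can select the 8-bit name of a register operand (``%x`` is a hypothetical |
| value; an x86 target is assumed): |
| |
| .. code-block:: llvm |
| |
| ; "q" allocates a register with an addressable low byte; "${1:b}" prints |
| ; its 8-bit name (e.g. al), while "$0" prints the full 32-bit name. |
| %ext = call i32 asm "movzbl ${1:b}, $0", "=r,q"(i32 %x) |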
| |
| |
| Inline Asm Metadata |
| ^^^^^^^^^^^^^^^^^^^ |
| |
| The call instructions that wrap inline asm nodes may have a |
| "``!srcloc``" MDNode attached to them that contains a list of constant |
| integers. If present, the code generator will use the integer as the |
| location cookie value when reporting errors through the ``LLVMContext`` |
| error reporting mechanisms. This allows a front-end to correlate backend |
| errors that occur with inline asm back to the source code that produced |
| it. For example: |
| |
| .. code-block:: llvm |
| |
| call void asm sideeffect "something bad", ""(), !srcloc !42 |
| ... |
| !42 = !{ i32 1234567 } |
| |
| It is up to the front-end to make sense of the magic numbers it places |
| in the IR. If the MDNode contains multiple constants, the code generator |
| will use the one that corresponds to the line of the asm that the error |
| occurs on. |
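| |
| A sketch of the multi-constant case, using placeholder template text and |
| cookie values invented for this illustration, with one constant per line of |
| the asm: |
| |
| .. code-block:: llvm |
| |
| ; The template has two lines separated by "\0A"; 1001 is the cookie for |
| ; the first line and 1002 the cookie for the second. |
| call void asm sideeffect "bad first line\0Abad second line", ""(), !srcloc !7 |
| ... |
| !7 = !{ i32 1001, i32 1002 } |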
| |
| .. _metadata: |
| |
| Metadata |
| ======== |
| |
| LLVM IR allows metadata to be attached to instructions in the program |
| that can convey extra information about the code to the optimizers and |
| code generator. One example application of metadata is source-level |
| debug information. There are two metadata primitives: strings and nodes. |
| |
| Metadata does not have a type, and is not a value. If referenced from a |
| ``call`` instruction, it uses the ``metadata`` type. |
| |
| All metadata are identified in syntax by an exclamation point ('``!``'). |
| |
| .. _metadata-string: |
| |
| Metadata Nodes and Metadata Strings |
| ----------------------------------- |
| |
| A metadata string is a string surrounded by double quotes. It can |
| contain any character by escaping non-printable characters with |
| "``\xx``" where "``xx``" is the two-digit hex code. For example: |
| "``!"test\00"``". |
| |
| Metadata nodes are represented with notation similar to structure |
| constants (a comma separated list of elements, surrounded by braces and |
| preceded by an exclamation point). Metadata nodes can have any values as |
| their operands. For example: |
| |
| .. code-block:: llvm |
| |
| !{ !"test\00", i32 10} |
| |
| Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example: |
| |