llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-18 10:16:07 +00:00

Author	SHA1	Message	Date
Fangrui Song	a57d16bf80	[CodeGen] Fix -Wswitch after D116462	2022-04-19 17:33:15 -07:00
Yaxun (Sam) Liu	cac4e2fe25	[CUDA][HIP] Fix gpu.used.external Rename gpu.used.external as __clang_gpu_used_external as ptxas does not allow . in global variable name. Fixes: https://github.com/llvm/llvm-project/issues/54934 Reviewed by: Joseph Huber, Artem Belevich Differential Revision: https://reviews.llvm.org/D123946	2022-04-18 23:10:31 -04:00
Eli Friedman	4802edd1ac	Fix size of flexible array initializers, and re-enable assertions. In D123649, I got the formula for getFlexibleArrayInitChars slightly wrong: the flexible array elements can be contained in the tail padding of the struct. Fix the formula to account for that. With the fixed formula, we run into another issue: in some cases, we were emitting extra padding for flexible arrray initializers. Fix CGExprConstant so it uses a packed struct when necessary, to avoid this extra padding. Differential Revision: https://reviews.llvm.org/D123826	2022-04-15 12:09:57 -07:00
Eli Friedman	6cf0b1b3da	Comment out assertions about initializer size added in D123649. They're causing failures in LLVM test-suite. Added some regression tests that explain the issue.	2022-04-14 13:58:17 -07:00
Eli Friedman	5955a0f937	Allow flexible array initialization in C++. Flexible array initialization is a C/C++ extension implemented in many compilers to allow initializing the flexible array tail of a struct type that contains a flexible array. In clang, this is currently restricted to C. But this construct is used in the Microsoft SDK headers, so I'd like to extend it to C++. For now, this doesn't handle dynamic initialization; probably not hard to implement, but it's extra code, and I don't think it's necessary for the expected uses. And we explicitly fail out of constant evaluation. I've added some additional code to assert that initializers have the correct size, with or without flexible array init. This might catch issues unrelated to flexible array init. Differential Revision: https://reviews.llvm.org/D123649	2022-04-14 11:56:40 -07:00
Yaxun (Sam) Liu	0424b5115c	[CUDA][HIP] Fix host used external kernel in archive For -fgpu-rdc, a host function may call an external kernel which is defined in an archive of bitcode. Since this external kernel is only referenced in host function, the device bitcode does not contain reference to this external kernel, then the linker will not try to resolve this external kernel in the archive. To fix this issue, host-used external kernels and device variables are tracked. A global array containing pointers to these external kernels and variables is emitted which serves as an artificial references to the external kernels and variables used by host. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D123441	2022-04-13 10:47:16 -04:00
Daniel Kiss	b0343a38a5	Support the min of module flags when linking, use for AArch64 BTI/PAC-RET LTO objects might compiled with different `mbranch-protection` flags which will cause an error in the linker. Such a setup is allowed in the normal build with this change that is possible. Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D123493	2022-04-13 09:31:51 +02:00
Yaxun (Sam) Liu	4ea1d43509	[CUDA][HIP] Externalize kernels in anonymous name space kernels in anonymous name space needs to have unique name to avoid duplicate symbols. Fixes: https://github.com/llvm/llvm-project/issues/54560 Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D123353	2022-04-10 21:56:28 -04:00
Mitch Phillips	fa34951fbc	Reland "[MTE] Add -fsanitize=memtag* and friends." Differential Revision: https://reviews.llvm.org/D118948	2022-04-08 14:28:33 -07:00
Aaron Ballman	4aaf25b4f7	Revert "[MTE] Add -fsanitize=memtag* and friends." This reverts commit 8aa1490513f111afd407d87c3f07d26f65c8a686. Broke testing: https://lab.llvm.org/buildbot/#/builders/109/builds/36233	2022-04-08 16:15:58 -04:00
Mitch Phillips	8aa1490513	[MTE] Add -fsanitize=memtag* and friends. Currently, enablement of heap MTE on Android is specified by an ELF note, which signals to the linker to enable heap MTE. This change allows -fsanitize=memtag-heap to synthesize these notes, rather than adding them through the build system. We need to extend this feature to also signal the linker to do special work for MTE globals (in future) and MTE stack (currently implemented in the toolchain, but not implemented in the loader). Current Android uses a non-backwards-compatible ELF note, called ".note.android.memtag". Stack MTE is an ABI break anyway, so we don't mind that we won't be able to run executables with stack MTE on Android 11/12 devices. The current expectation is to support the verbiage used by Android, in that "SYNC" means MTE Synchronous mode, and "ASYNC" effectively means "fast", using the Kernel auto-upgrade feature that allows hardware-specific and core-specific configuration as to whether "ASYNC" would end up being Asynchronous, Asymmetric, or Synchronous on that particular core, whichever has a reasonable performance delta. Of course, this is platform and loader-specific. Differential Revision: https://reviews.llvm.org/D118948	2022-04-08 12:13:15 -07:00
Chuanqi Xu	74b56e02bd	[NFC] Remove unused variable in CodeGenModules This eliminates an unused-variable warning	2022-04-08 11:52:31 +08:00
Kavitha Natarajan	b1ea0191a4	[clang][DebugInfo] Support debug info for alias variable clang to emit DWARF information for global alias variable as DW_TAG_imported_declaration. This change also handles nested (recursive) imported declarations. Reviewed by: dblaikie, aprantl Differential Revision: https://reviews.llvm.org/D120989	2022-04-07 17:15:40 +05:30
Tom Honermann	5531abaf71	[clang] Corrections for target_clones multiversion functions. This change merges code for emit of target and target_clones multiversion resolver functions and, in doing so, corrects handling of target_clones functions that are declared but not defined. Previously, a use of such a target_clones function would result in an attempted emit of an ifunc that referenced an undefined resolver function. Ifunc references to undefined resolver functions are not allowed and, when the LLVM verifier is not disabled (via '-disable-llvm-verifier'), resulted in the verifier issuing a "IFunc resolver must be a definition" error and aborting the compilation. With this change, ifuncs and resolver function definitions are always emitted for used target_clones functions regardless of whether the target_clones function is defined (if the function is defined, then the ifunc and resolver are emitted regardless of whether the function is used). This change has the side effect of causing target_clones variants and resolver functions to be emitted in a different order than they were previously. This is harmless and is reflected in the updated tests. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D122958	2022-04-05 19:50:22 -04:00
Tom Honermann	40af8df6fe	[clang] NFC: Preparation for merging code to emit target and target_clones resolvers. This change modifies CodeGenModule::emitMultiVersionFunctions() in preparation for a change that will merge support for emitting target_clones resolvers into this function. This change mostly serves to isolate indentation changes from later behavior modifying changes. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D122957	2022-04-05 19:50:22 -04:00
Tom Honermann	0ace0100ae	[clang] NFC: Simplify the interface to CodeGenModule::GetOrCreateMultiVersionResolver(). Previously, GetOrCreateMultiVersionResolver() required the caller to provide a GlobalDecl along with an llvm::type and FunctionDecl. The latter two can be cheaply obtained from the first, and the llvm::type parameter is not always used, so requiring the caller to provide them was unnecessary and created the possibility that callers would pass an inconsistent set. This change simplifies the interface to only require the GlobalDecl value. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D122956	2022-04-05 19:50:22 -04:00
Tom Honermann	bed5ee3f4b	[clang] NFC: Enhance comments in CodeGen for multiversion function support. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D122955	2022-04-05 19:50:22 -04:00
Shangwu Yao	15a1769631	Emit OpenCL metadata when targeting SPIR-V This is required for converting function calls such as get_global_id() into SPIR-V builtins. Differential Revision: https://reviews.llvm.org/D123049	2022-04-05 20:58:32 +00:00
Tom Honermann	7c53fc4fe1	[clang] Emit target_clones resolver functions as COMDAT. Previously, resolver functions synthesized for target_clones multiversion functions were not emitted as COMDAT. Now fixed.	2022-04-05 15:34:35 -04:00
Nikita Popov	ff18b158ed	[CodeGen] Avoid unnecessary ConstantExpr cast With opaque pointers, this is not necessarily a ConstantExpr. And we don't need one here either, just Constant is sufficient.	2022-04-05 11:28:40 +02:00
Erich Keane	9ba8c4024b	Fix behavior of ifuncs with 'used' extern "C" static functions We expect that `extern "C"` static functions to be usable in things like inline assembly, as well as ifuncs: See the bug report here: https://github.com/llvm/llvm-project/issues/54549 However, we were diagnosing this as 'not defined', because the ifunc's attempt to look up its resolver would generate a declared IR function. Additionally, as background, the way we allow these static extern "C" functions to work in inline assembly is by making an alias with the C mangling in MOST situations to the version we emit with internal-linkage/mangling. The problem here was multi-fold: First- We generated the alias after the ifunc was checked, so the function by that name didn't exist yet. Second, the ifunc's generation caused a symbol to exist under the name of the alias already (the declared function above), which suppressed the alias generation. This patch fixes all of this by moving the checking of ifuncs/CFE aliases until AFTER we have generated the extern-C alias. Then, it does a 'fixup' around the GlobalIFunc to make sure we correct the reference. Differential Revision: https://reviews.llvm.org/D122608	2022-04-01 13:00:59 -07:00
Chris Bieneman	dfde354958	NFC. Fixing warnings from adding DXContainer Adds DXContainer to switch statements in Clang and LLDB to silence warnings.	2022-03-29 14:46:24 -05:00
James Y Knight	d614874900	[Clang] Implement __builtin_source_location. This builtin returns the address of a global instance of the `std::source_location::__impl` type, which must be defined (with an appropriate shape) before calling the builtin. It will be used to implement std::source_location in libc++ in a future change. The builtin is compatible with GCC's implementation, and libstdc++'s usage. An intentional divergence is that GCC declares the builtin's return type to be `const void` (for ease-of-implementation reasons), while Clang uses the actual type, `const std::source_location::__impl`. In order to support this new functionality, I've also added a new 'UnnamedGlobalConstantDecl'. This artificial Decl is modeled after MSGuidDecl, and is used to represent a generic concept of an lvalue constant with global scope, deduplicated by its value. It's possible that MSGuidDecl itself, or some of the other similar sorts of things in Clang might be able to be refactored onto this more-generic concept, but there's enough special-case weirdness in MSGuidDecl that I gave up attempting to share code there, at least for now. Finally, for compatibility with libstdc++'s <source_location> header, I've added a second exception to the "cannot cast from void* to T* in constant evaluation" rule. This seems a bit distasteful, but feels like the best available option. Reviewers: aaron.ballman, erichkeane Differential Revision: https://reviews.llvm.org/D120159	2022-03-28 18:29:02 -04:00
Nikita Popov	b8f0e12847	[CodeGen] Remove some uses of deprecated Address constructor Remove two stray uses in CodeGenModule and CGCUDANV.	2022-03-22 10:02:35 +01:00
Benjamin Kramer	5d2ce7663b	Use llvm::append_range instead of push_back loops where applicable. NFCI.	2022-03-18 01:25:34 +01:00
Joseph Huber	806bbc49dc	[OpenMP] Try to embed offloading objects after codegen Currently we use the `-fembed-offload-object` option to embed a binary file into the host as a named section. This is currently only used as a codegen action, meaning we only handle this option correctly when the input is a bitcode file. This patch adds the same handling to embed an offloading object after we complete code generation. This allows us to embed the object correctly if the input file is source or bitcode. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D120270	2022-03-14 20:08:24 -04:00
Erich Keane	dc152659b4	Have cpu-specific variants set 'tune-cpu' as an optimization hint Due to various implementation constraints, despite the programmer choosing a 'processor' cpu_dispatch/cpu_specific needs to use the 'feature' list of a processor to identify it. This results in the identified processor in source-code not being propogated to the optimizer, and thus, not able to be tuned for. This patch changes to use the actual cpu as written for tune-cpu so that opt can make decisions based on the cpu-as-spelled, which should better match the behavior expected by the programmer. Note that the 'valid' list of processors for x86 is in llvm/include/llvm/Support/X86TargetParser.def. At the moment, this list contains only Intel processors, but other vendors may wish to add their own entries as 'alias'es (or with different feature lists!). If this is not done, there is two potential performance issues with the patch, but I believe them to be worth it in light of the improvements to behavior and performance. 1- In the event that the user spelled "ProcessorB", but we only have the features available to test for "ProcessorA" (where A is B minus features), AND there is an optimization opportunity for "B" that negatively affects "A", the optimizer will likely choose to do so. 2- In the event that the user spelled VendorI's processor, and the feature list allows it to run on VendorA's processor of similar features, AND there is an optimization opportunity for VendorIs that negatively affects "A"s, the optimizer will likely choose to do so. This can be fixed by adding an alias to X86TargetParser.def. Differential Revision: https://reviews.llvm.org/D121410	2022-03-14 06:14:30 -07:00
Itay Bookstein	f3480390be	[clang][CodeGen] Avoid emitting ifuncs with undefined resolvers The purpose of this change is to fix the following codegen bug: ``` // main.c __attribute__((cpu_specific(generic))) int foo(void) { static int z; return &z;} int main() { return foo() = 5; } // other.c __attribute__((cpu_dispatch(generic))) int foo(void); // run: clang main.c other.c -o main; ./main ``` This will segfault prior to the change, and return the correct exit code 5 after the change. The underlying cause is that when a translation unit contains a cpu_specific function without the corresponding cpu_dispatch the generated code binds the reference to foo() against a GlobalIFunc whose resolver is undefined. This is invalid: the resolver must be defined in the same translation unit as the ifunc, but historically the LLVM bitcode verifier did not check that. The generated code then binds against the resolver rather than the ifunc, so it ends up calling the resolver rather than the resolvee. In the example above it treats its return value as an int , therefore trying to write to program text. The root issue at the representation level is that GlobalIFunc, like GlobalAlias, does not support a "declaration" state. The object which provides the correct semantics in these cases is a Function declaration, but unlike Functions, changing a declaration to a definition in the GlobalIFunc case constitutes a change of the object type, as opposed to simply emitting code into a Function. I think this limitation is unlikely to change, so I implemented the fix by returning a function declaration rather than an ifunc when encountering cpu_specific, and upgrading it to an ifunc when emitting cpu_dispatch. This uses `takeName` + `replaceAllUsesWith` in similar vein to other places where the correct IR object type cannot be known locally/up-front, like in `CodeGenModule::EmitAliasDefinition`. Previous discussion in: https://reviews.llvm.org/D112349 Signed-off-by: Itay Bookstein <ibookstein@gmail.com> Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D120266	2022-02-26 11:17:49 +02:00
Nikita Popov	5065076698	[CodeGen] Rename deprecated Address constructor To make uses of the deprecated constructor easier to spot, and to ensure that no new uses are introduced, rename it to Address::deprecated(). While doing the rename, I've filled in element types in cases where it was relatively obvious, but we're still left with 135 calls to the deprecated constructor.	2022-02-17 11:26:42 +01:00
Momchil Velikov	6398903ac8	Extend the `uwtable` attribute with unwind table kind We have the `clang -cc1` command-line option `-funwind-tables=1\|2` and the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind tables (1) or asynchronous unwind tables (2)`. However, this is encoded in LLVM IR by the presence or the absence of the `uwtable` attribute, i.e. we lose the information whether to generate want just some unwind tables or asynchronous unwind tables. Asynchronous unwind tables take more space in the runtime image, I'd estimate something like 80-90% more, as the difference is adding roughly the same number of CFI directives as for prologues, only a bit simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even more, if you consider tail duplication of epilogue blocks. Asynchronous unwind tables could also restrict code generation to having only a finite number of frame pointer adjustments (an example of not having a finite number of `SP` adjustments is on AArch64 when untagging the stack (MTE) in some cases the compiler can modify `SP` in a loop). Having the CFI precise up to an instruction generally also means one cannot bundle together CFI instructions once the prologue is done, they need to be interspersed with ordinary instructions, which means extra `DW_CFA_advance_loc` commands, further increasing the unwind tables size. That is to say, async unwind tables impose a non-negligible overhead, yet for the most common use cases (like C++ exceptions), they are not even needed. This patch extends the `uwtable` attribute with an optional value: - `uwtable` (default to `async`) - `uwtable(sync)`, synchronous unwind tables - `uwtable(async)`, asynchronous (instruction precise) unwind tables Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D114543	2022-02-14 14:35:02 +00:00
Arthur Eubanks	87dd3d350c	[clang][OpaquePtr] Remove call to getPointerElementType() in CodeGenModule::GetAddrOfGlobalTemporary()	2022-02-11 10:39:49 -08:00
Sameer Sahasrabuddhe	d8f99bb6e0	[AMDGPU] replace hostcall module flag with function attribute The module flag to indicate use of hostcall is insufficient to catch all cases where hostcall might be in use by a kernel. This is now replaced by a function attribute that gets propagated to top-level kernel functions via their respective call-graph. If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the default behaviour is to emit kernel metadata indicating that the kernel uses the hostcall buffer pointer passed as an implicit argument. The attribute may be placed explicitly by the user, or inferred by the AMDGPU attributor by examining the call-graph. The attribute is inferred only if the function is not being sanitized, and the implictarg_ptr does not result in a load of any byte in the hostcall pointer argument. Reviewed By: jdoerfert, arsenm, kpyzhov Differential Revision: https://reviews.llvm.org/D119216	2022-02-11 22:51:56 +05:30
Yaxun (Sam) Liu	1d97cb1f6e	[HIP] Emit amdgpu_code_object_version module flag code object version determines ABI, therefore should not be mixed. This patch emits amdgpu_code_object_version module flag in LLVM IR based on code object version (default 4). The amdgpu_code_object_version value is code object version times 100. LLVM IR with different amdgpu_code_object_version module flag cannot be linked. The -cc1 option -mcode-object-version=none is for ROCm device library use only, which supports multiple ABI. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D119026	2022-02-08 21:58:40 -05:00
Yaxun (Sam) Liu	171da443d5	[HIPSPV] Fix literals are mapped to Generic address space This issue is an oversight in D108621. Literals in HIP are emitted as global constant variables with default address space which maps to Generic address space for HIPSPV. In SPIR-V such variables translate to OpVariable instructions with Generic storage class which are not legal. Fix by mapping literals to CrossWorkGroup address space. The literals are not mapped to UniformConstant because the “flat” pointers in HIP may reference them and “flat” pointers are modeled as Generic pointers in SPIR-V. In SPIR-V/OpenCL UniformConstant pointers may not be casted to Generic. Patch by: Henry Linjamäki Reviewed by: Yaxun Liu Differential Revision: https://reviews.llvm.org/D118876	2022-02-05 17:26:52 -05:00
Hans Wennborg	853e0aa424	Don't dllexport reference temporaries Even if the reference itself is dllexport, the temporary should not be. In fact, we're already giving it internal linkage, so dllexporting it is not just wasteful, but will fail to link, as in the example below: $ cat /tmp/a.cc void _DllMainCRTStartup() {} const int __declspec(dllexport) &foo = 42; $ clang-cl -fuse-ld=lld /tmp/a.cc /Zl /link /dll /out:a.dll lld-link: error: <root>: undefined symbol: int const &foo::$RT1 Differential revision: https://reviews.llvm.org/D118980	2022-02-04 16:31:51 +01:00
Amilendra Kodithuwakku	1f08b08674	[clang][ARM] Emit warnings when PACBTI-M is used with unsupported architectures Branch protection in M-class is supported by - Armv8.1-M.Main - Armv8-M.Main - Armv7-M Attempting to enable this for other architectures, either by command-line (e.g -mbranch-protection=bti) or by target attribute in source code (e.g. __attribute__((target("branch-protection=..."))) ) will generate a warning. In both cases function attributes related to branch protection will not be emitted. Regardless of the warning, module level attributes related to branch protection will be emitted when it is enabled via the command-line. The following people also contributed to this patch: - Victor Campos Reviewed By: chill Differential Revision: https://reviews.llvm.org/D115501	2022-01-28 09:59:58 +00:00
Joao Moreira	82af95029e	[X86] Enable ibt-seal optimization when LTO is used in Kernel Intel's CET/IBT requires every indirect branch target to be an ENDBR instruction. Because of that, the compiler needs to correctly emit these instruction on function's prologues. Because this is a security feature, it is desirable that only actual indirect-branch-targeted functions are emitted with ENDBRs. While it is possible to identify address-taken functions through LTO, minimizing these ENDBR instructions remains a hard task for user-space binaries because exported functions may end being reachable through PLT entries, that will use an indirect branch for such. Because this cannot be determined during compilation-time, the compiler currently emits ENDBRs to every non-local-linkage function. Despite the challenge presented for user-space, the kernel landscape is different as no PLTs are used. With the intent of providing the most fit ENDBR emission for the kernel, kernel developers proposed an optimization named "ibt-seal" which replaces the ENDBRs for NOPs directly in the binary. The discussion of this feature can be seen in [1]. This diff brings the enablement of the flag -mibt-seal, which in combination with LTO enforces a different policy for ENDBR placement in when the code-model is set to "kernel". In this scenario, the compiler will only emit ENDBRs to address taken functions, ignoring non-address taken functions that are don't have local linkage. A comparison between an LTO-compiled kernel binaries without and with the -mibt-seal feature enabled shows that when -mibt-seal was used, the number of ENDBRs in the vmlinux.o binary patched by objtool decreased from 44383 to 33192, and that the number of superfluous ENDBR instructions nopped-out decreased from 11730 to 540. The 540 missed superfluous ENDBRs need to be investigated further, but hypotheses are: assembly code not being taken care of by the compiler, kernel exported symbols mechanisms creating bogus address taken situations or even these being removed due to other binary optimizations like kernel's static_calls. For now, I assume that the large drop in the number of ENDBR instructions already justifies the feature being merged. [1] - https://lkml.org/lkml/2021/11/22/591 Reviewed By: xiangzhangllvm Differential Revision: https://reviews.llvm.org/D116070	2022-01-21 10:55:34 +08:00
Yaxun (Sam) Liu	85c2bd2a0e	Prevent adding module flag amdgpu_hostcall multiple times HIP program with printf call fails to compile with -fsanitize=address option, because of appending module flag - amdgpu_hostcall twice, one for printf and one for sanitize option. This patch fixes that issue. Patch by: Praveen Velliengiri Reviewed by: Yaxun Liu, Roman Lebedev Differential Revision: https://reviews.llvm.org/D116216	2022-01-19 12:52:33 -05:00
Nikita Popov	c63a3175c2	[AttrBuilder] Remove ctor accepting AttributeList and Index Use the AttributeSet constructor instead. There's no good reason why AttrBuilder itself should exact the AttributeSet from the AttributeList. Moving this out of the AttrBuilder generally results in cleaner code.	2022-01-15 22:39:31 +01:00
Erich Keane	2bcba21c8b	[CPU-Dispatch] Make sure Dispatch names get updated if previously mangled Cases where there is a mangling of a cpu-dispatch/cpu-specific function before the function becomes 'multiversion' (such as a member function) causes the wrong name to be emitted for one of the variants/resolver, since the name is cached. Make sure we invalidate the cache in cpu-dispatch/cpu-specific modes, like we previously did for just target multiversioning.	2022-01-14 10:45:55 -08:00
Erich Keane	b699e8b11a	Add another assert to cpu-dispatch emission to help track down a tough to repro error. As mentioned yesterday, I've got a problem that I can only reproduce on Godbolt (none of the build configs on my local machine!), so this is at least somewhat usable until I figure out a cause.	2022-01-13 06:54:08 -08:00
Erich Keane	6e77ad11ff	Add an assert in cpudispatch emit to try to track down an error. I'm attempting to debug an issue that I can only get to happen on godbolt, where the cpu-dispatch resolver for an out of line member function is generated with the wrong name, causing a link failure.	2022-01-12 10:31:28 -08:00
Serge Guelton	d2cc6c2d0c	Use a sorted array instead of a map to store AttrBuilder string attributes Using and std::map<SmallString, SmallString> for target dependent attributes is inefficient: it makes its constructor slightly heavier, and involves extra allocation for each new string attribute. Storing the attribute key/value as strings implies extra allocation/copy step. Use a sorted vector instead. Given the low number of attributes generally involved, this is cheaper, as showcased by https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions Differential Revision: https://reviews.llvm.org/D116599	2022-01-10 14:49:53 +01:00
Kazu Hirata	40446663c7	[clang] Use true/false instead of 1/0 (NFC) Identified with modernize-use-bool-literals.	2022-01-09 00:19:47 -08:00
serge-sans-paille	9290ccc3c1	Introduce the AttributeMask class This class is solely used as a lightweight and clean way to build a set of attributes to be removed from an AttrBuilder. Previously AttrBuilder was used both for building and removing, which introduced odd situation like creation of Attribute with dummy value because the only relevant part was the attribute kind. Differential Revision: https://reviews.llvm.org/D116110	2022-01-04 15:37:46 +01:00
Sami Tolvanen	ec2e26eaf6	[Clang] Add __builtin_function_start Control-Flow Integrity (CFI) replaces references to address-taken functions with pointers to the CFI jump table. This is a problem for low-level code, such as operating system kernels, which may need the address of an actual function body without the jump table indirection. This change adds the __builtin_function_start() builtin, which accepts an argument that can be constant-evaluated to a function, and returns the address of the function body. Link: https://github.com/ClangBuiltLinux/linux/issues/1353 Depends on D108478 Reviewed By: pcc, rjmccall Differential Revision: https://reviews.llvm.org/D108479	2021-12-20 12:55:33 -08:00
Nikita Popov	c3b624a191	[CodeGen] Avoid deprecated ConstantAddress constructor Change all uses of the deprecated constructor to pass the element type explicitly and drop it. For cases where the correct element type was not immediately obvious to me or would require a slightly larger change I'm falling back to explicitly calling getPointerElementType() for now.	2021-12-15 10:42:41 +01:00
Peter Collingbourne	0a14674f27	CodeGen: Strip exception specifications from function types in CFI type names. With C++17 the exception specification has been made part of the function type, and therefore part of mangled type names. However, it's valid to convert function pointers with an exception specification to function pointers with the same argument and return types but without an exception specification, which means that e.g. a function of type "void () noexcept" can be called through a pointer of type "void ()". We must therefore consider the two types to be compatible for CFI purposes. We can do this by stripping the exception specification before mangling the type name, which is what this patch does. Differential Revision: https://reviews.llvm.org/D115015	2021-12-03 14:50:52 -05:00
Ties Stuij	e3b2f0226b	[clang][ARM] PACBTI-M frontend support Handle branch protection option on the commandline as well as a function attribute. One patch for both mechanisms, as they use the same underlying parsing mechanism. These are recorded in a set of LLVM IR module-level attributes like we do for AArch64 PAC/BTI (see https://reviews.llvm.org/D85649): - command-line options are "translated" to module-level LLVM IR attributes (metadata). - functions have PAC/BTI specific attributes iff the __attribute__((target("branch-protection=...))) was used in the function declaration. - command-line option -mbranch-protection to armclang targeting Arm, following this grammar: branch-protection ::= "-mbranch-protection=" <protection> protection ::= "none" \| "standard" \| "bti" [ "+" <pac-ret-clause> ] \| <pac-ret-clause> [ "+" "bti"] pac-ret-clause ::= "pac-ret" [ "+" <pac-ret-option> ] pac-ret-option ::= "leaf" ["+" "b-key"] \| "b-key" ["+" "leaf"] b-key is simply a placeholder to make it consistent with AArch64's version. In Arm, however, it triggers a warning informing that b-key is unsupported and a-key will be selected instead. - Handle _attribute_((target(("branch-protection=..."))) for AArch32 with the same grammer as the commandline options. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Momchil Velikov - Victor Campos - Ties Stuij Reviewed By: vhscampos Differential Revision: https://reviews.llvm.org/D112421	2021-12-01 10:37:16 +00:00
Erich Keane	fc53eb69c2	Reapply 'Implement target_clones multiversioning' See discussion in D51650, this change was a little aggressive in an error while doing a 'while we were here', so this removes that error condition, as it is apparently useful. This reverts commit bb4934601d731465e01e2e22c80ce2dbe687d73f.	2021-11-29 06:30:01 -08:00

1 2 3 4 5 ...

1814 Commits