llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 01:16:06 +00:00

Author	SHA1	Message	Date
Kazu Hirata	cb80b26e37	[clang] Use Set::insert_range (NFC) (#133357) We can use Set::insert_range to collapse: for (auto Elem : Range) Set.insert(Elem); down to: Set.insert_range(Range); In some cases, we can further fold that into the set declaration.	2025-03-27 20:14:25 -07:00
Craig Topper	1752d5292c	[RISCV] Make RequiredExtensions for intrinsics scalable to more than 32 extensions. NFC (#132895 ) We have more than 32 extensions in our downstream and had to change this type from uint32_t to uint64_t. To simplify our downstream and make the code more flexible, I propose to make it an array of uint32_t that we can size based on the number of extensions. I really wanted to use std::bitset, but we have to print the bits to a .inc file which can't easily be done with std::bitset.	2025-03-25 20:46:01 -07:00
Kazu Hirata	c6c394634c	[clang] Use *Set::insert_range (NFC) (#132507 ) DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently gained C++23-style insert_range. This patch replaces: Dest.insert(Src.begin(), Src.end()); with: Dest.insert_range(Src); This patch does not touch custom begin like succ_begin for now.	2025-03-22 08:06:38 -07:00
erichkeane	79079c9469	[OpenACC] Finish implementing 'routine' AST/Sema. This is the last item of the OpenACC 3.3 spec. It includes the implicit-name version of 'routine', plus significant refactorings to make the two work together. The implicit name version is represented as an attribute on the function call. This patch also implements the clauses for the implicit-name version, as well as the A.3.4 warning.	2025-03-21 08:57:54 -07:00
Kazu Hirata	69b70110b7	[TableGen] Avoid repeated hash lookups (NFC) (#132142)	2025-03-20 09:10:23 -07:00
Sarah Spall	431eaa8deb	[HLSL] make semantic matching case insensitive (#129773) Make semantic matching case insensitive; update tests to reflect semantic printed as all lower case in error messages; add new tests to show case insensitivity. Closes #128063	2025-03-10 11:19:45 -07:00
Oliver Stannard	a619a2e53a	[ARM] Fix lane ordering for AdvSIMD intrinsics on big-endian targets (#127068 ) In arm-neon.h, we insert shufflevectors around each intrinsic when the target is big-endian, to compensate for the difference between the ABI-defined memory format of vectors (with the whole vector stored as one big-endian access) and LLVM's target-independent expectations (with the lowest-numbered lane in the lowest address). However, this code was written for the AArch64 ABI, and the AArch32 ABI differs slightly: it requires that vectors are stored in memory as-if stored with VSTM, which does a series of 64-bit accesses, instead of the AArch64 VSTR, which does a single 128-bit access. This means that for AArch32 we need to reverse the lanes in each 64-bit chunk of the vector, instead of in the whole vector. Since there are only a small number of different shufflevector orderings needed, I've split them out into macros, so that this doesn't need separate conditions in each intrinsic definition.	2025-03-04 08:10:22 +00:00
serge-sans-paille	f3d4d11547	[clang][cmake] Fix support for dynamic libraries in CLANG_BOLT Simpler detection of dynamic library operands as the readelf one seems to be unreliable (works on my setup, not on buildbots). This is a follow-up to #127020	2025-03-03 18:05:18 +01:00
serge-sans-paille	9db72e55ed	[clang][cmake] Fix support for dynamic libraries in CLANG_BOLT Patch typo introduced in #127020	2025-03-03 09:21:05 +01:00
serge-sans-paille	527af302b9	Add support for dynamic libraries in CLANG_BOLT (#127020)	2025-03-02 20:21:44 +00:00
Nikolas Klauser	8dd8e5f7d6	[Clang] Add BuiltinTemplates.td to generate code for builtin templates (#123736 ) This makes it significantly easier to add new builtin templates, since you only have to modify two places instead of a dozen or so. The `BuiltinTemplates.td` could also be extended to generate documentation from it in the future.	2025-02-26 16:01:14 +01:00
Reid Kleckner	59cee030fb	Generalize creduce-clang-crash.py script to look for cvise (#128592 ) cvise reimplements creduce in Python and bundles clang-delta and other tools. In my experience, it is generally a more robust reduction tool that is better maintained. I renamed the script to make it tool-neutral, which also opens up the possibility that we teach it how to automatically transition over to llvm-reduce and opt/llc to handle LLVM backend crashes, but that is potential future work. Internally, the variable names still say "creduce". I kept using the verb "reduce" because "vise" is not a verb, but the external facing text has been updated.	2025-02-25 13:59:26 -08:00
Petr Hosek	81ed48531d	[CMake] Fix variable name (#127967) This was accidentally introduced in #126876.	2025-02-20 08:22:14 -08:00
Petr Hosek	dca7306365	[clang][perf-training] Support excluding LLVM build from PGO training (#126876) Using LLVM build itself for PGO training is convenient and a great starting point but it also has several issues: * LLVM build implicitly depends on tools other than CMake and C/C++ compiler and if those tools aren't available in PATH, the build will fail. * LLVM build also requires standard headers and libraries which may not always be available in the default location requiring an explicit sysroot. * Building a single configuration (-DCMAKE_BUILD_TYPE=Release) only exercises the -O3 pipeline and can pessimize other configurations. * Building for the host target doesn't exercise all other targets. * Since LLVMSupport is a static library, this doesn't exercise the linker (beyond what the CMake itself does). Rather than using LLVM build, ideally we would provide a more minimal, purpose built corpus. While we're working on building such a corpus, provide a CMake option that lets vendors disable the use of the LLVM build for PGO training.	2025-02-19 11:36:09 -08:00
Kazu Hirata	ba9810e803	[TableGen] Avoid repeated hash lookups (NFC) (#126464)	2025-02-10 07:49:42 -08:00
Kazu Hirata	cf5947be13	[TableGen] Avoid repeated map lookups (NFC) (#126381)	2025-02-08 11:36:35 -08:00
Mats Jun Larsen	c4c22a5377	[Clang][TableGen] Use PointerType::get(Context) in MVE TableGen emitter (NFC) (#124782 ) Follow-up to #123569 Co-authored-by: Nikita Popov <github@npopov.com>	2025-02-07 17:41:27 +00:00
Joseph Huber	cd754af55f	[Clang] Permit both `gnu` and `clang` prefixes on some attributes (#125796 ) Summary: Some attributes have gnu extensions that share names with clang attributes. If these imply the same thing, we can specially declare this to be an alternate but equivalent spelling. This patch enables this for `no_sanitize` and provides the infrastructure for more to be added if needed. Discussions welcome on whether or not we want to bind ourselves to GNU behavior, since theoretically it's possible for GNU to silently change the semantics away from our implementation, but I'm not an expert. Fixes: https://github.com/llvm/llvm-project/issues/125760	2025-02-05 08:16:00 -06:00
Chandler Carruth	2ff42bdac3	[StrTable] Add prefixes for x86 builtins. This requires adding support to the general builtins emission for producing prefixed builtin infos separately from un-prefixed which is a bit crufty. But we don't currently have any good way of having a more refined model than a single hard-coded prefix string per TableGen emission. Something more powerful and/or elegant is possible, but this is a fairly minimal first step that at least allows factoring out the builtin prefix for something like X86.	2025-02-04 18:04:58 +00:00
Chandler Carruth	212ecb9d5c	[StrTable] Teach main builtin TableGen to use direct enums, strings, and info This moves the main builtins and several targets to use nice generated string tables and info structures rather than X-macros. Even without obvious prefixes factored out, the resulting tables are significantly smaller and much cheaper to compile with out all the X-macro overhead. This leaves the X-macros in place for atomic builtins which have a wide range of uses that don't seem reasonable to fold into TableGen. As future work, these should move to their own file (whether as X-macros or just generated patterns) so the AST headers don't have to include all the data for other builtins.	2025-02-04 18:04:58 +00:00
Chandler Carruth	64ea3f5a47	[StrTable] Switch AArch64 and ARM to use directly TableGen-ed builtin tables This leverages the sharded structure of the builtins to make it easy to directly tablegen most of the AArch64 and ARM builtins while still using X-macros for a few edge cases. It also extracts common prefixes as part of that. This makes the string tables for these targets dramatically smaller. This is especially important as the SVE builtins represent (by far) the largest string table and largest builtin table across all the targets in Clang.	2025-02-04 18:04:58 +00:00
Chandler Carruth	1cb979f001	[StrTable] Switch RISCV to leverage sharded, prefixed builtins w/ TableGen This lets the TableGen-ed code be much cleaner, directly building an efficient string table without duplicates and without the repeated prefix.	2025-02-04 18:04:57 +00:00
Momchil Velikov	b6e50ed209	[AArch64] Simplify definitions of SVE/SME intrinsics which set FPMR (#123796 ) If an intrinsic has an `fpm_t` parameter, automatically set the flag `SetsFPMR` and append "_fpm" to the name.	2025-02-03 09:38:05 +00:00
Kazu Hirata	9268494f03	[TableGen] Migrate away from PointerUnion::dyn_cast (NFC) (#125158 ) Note that PointerUnion::dyn_cast has been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> Literal migration would result in dyn_cast_if_present (see the definition of PointerUnion::dyn_cast), but this patch uses dyn_cast because we expect DiagsInPedantic and GroupsInPedantic to be nonnull.	2025-01-31 07:50:18 -08:00
Sirraide	c4a019747c	[Clang] Remove ARCMigrate (#119269 ) In the discussion around #116792, @rjmccall mentioned that ARCMigrate has been obsoleted and that we could go ahead and remove it from Clang, so this patch does just that.	2025-01-30 05:32:25 +01:00
Momchil Velikov	db6fa74dfe	[AArch64] Implement FP8 Neon reinterpret intrinsics (#120476)	2025-01-28 11:06:24 +00:00
Chandler Carruth	f4de28a63c	[StrTable] Switch intrinsics to StringTable and work around MSVC (#123548) Historically, the main example of very large string tables used the `EmitCharArray` to work around MSVC limitations with string literals, but that was switched (without removing the API) in order to consolidate on a nicer emission primitive. While this large string table in `IntrinsicsImpl.inc` seems to compile correctly on MSVC without the work around in `EmitCharArray` (and that this PR adds back to the nicer emission path), other users have repeatedly hit this MSVC limitation as you can see in the discussion on PR https://github.com/llvm/llvm-project/pull/120534. This PR teaches the string offset table emission to look at the size of the table and switch to the char array emission strategy when the table becomes too large. This work around does have the downside of making compile times worse for large string tables, but that appears unavoidable until we can identify known good MSVC versions and switch to requiring them for all LLVM users. It also reduces searchability of the generated string table -- I looked at emitting a comment with each string but it is tricky because the escaping rules for an inline comment are different from those of a string literal, and there's no real way to turn the string literal into a comment. While improving the output in this way, also clean up the output to not emit an extraneous empty string at the end of the string table, and update the `StringTable` class to not look for that. It isn't actually used by anything and is wasteful. This PR also switches the `IntrinsicsImpl.inc` string tables over to the new `StringTable` runtime abstraction. I didn't want to do this until landing the MSVC workaround in case it caused even this example to start hitting the MSVC bug, but I wanted to switch here so that I could simplify the API for emitting the string table with the workaround present. With the two different emission strategies, it's important to use a very exact syntax and that seems better encapsulated in the API. Last but not least, the `SDNodeInfoEmitter` is updated, including its tests to match the new output. This PR should unblock landing https://github.com/llvm/llvm-project/pull/120534 and letting us switch all of Clang's builtins to use string tables. That PR has all the details motivating the overall effort. Follow-up patches will try to consolidate the remaining users onto the single interface, but those at least were easy to separate into follow-ups and keep this PR somewhat smaller.	2025-01-28 00:17:04 -08:00
Chandler Carruth	b968fd9502	[StrTable] Mechanically convert NVPTX builtins to use TableGen (#122873) This switches them to use the common TableGen layer, extending it to support the missing features needed by the NVPTX backend. The biggest thing was to build a TableGen system that computes the cumulative SM and PTX feature sets the same way the macros did. That's done with some string concatenation tricks in TableGen, but they worked out pretty neatly and are very comparable in complexity to the macro version. Then the actual defines were mapped over using a very hacky Python script. It was never productionized or intended to work in the future, but for posterity: https://gist.github.com/chandlerc/10bdf8fb1312e252b4a501bace184b66 Last but not least, there was a very odd "bug" in one of the converted builtins' prototype in the TableGen model: it didn't handle uses of `Z` and `U` both as qualifiers of a single type, treating `Z` as its own `int32_t` type. So my hacky Python script converted `ZUi` into two types, an `int32_t` and an `unsigned int`. This produced a very wrong prototype. But the tests caught this nicely and I fixed it manually rather than trying to improve the Python script as it occurred in exactly one place I could find. This should provide direct benefits of allowing future refactorings to more directly leverage TableGen to express builtins more structurally rather than textually. It will also make my efforts to move builtins to string tables significantly more effective for the NVPTX backend where the X-macro approach resulted in significantly less efficient string tables than other targets due to the long repeated feature strings.	2025-01-27 22:45:37 -08:00
Momchil Velikov	99bd2e3f12	[AArch64] Add Neon FP8 conversion intrinsics (#123612 ) The patch adds the following intrinsics: bfloat16x8_t vcvt1_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm) bfloat16x8_t vcvt1_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) bfloat16x8_t vcvt2_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm) bfloat16x8_t vcvt2_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) bfloat16x8_t vcvt1_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) bfloat16x8_t vcvt2_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) float16x8_t vcvt1_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm) float16x8_t vcvt1_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) float16x8_t vcvt2_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm) float16x8_t vcvt2_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) float16x8_t vcvt1_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) float16x8_t vcvt2_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) mfloat8x8_t vcvt_mf8_f32_fpm(float32x4_t vn, float32x4_t vm, fpm_t fpm) mfloat8x16_t vcvt_high_mf8_f32_fpm(mfloat8x8_t vd, float32x4_t vn, float32x4_t vm, fpm_t fpm) mfloat8x8_t vcvt_mf8_f16_fpm(float16x4_t vn, float16x4_t vm, fpm_t fpm) mfloat8x16_t vcvtq_mf8_f16_fpm(float16x8_t vn, float16x8_t vm, fpm_t fpm) Co-Authored-By: Caroline Concatto <caroline.concatto@arm.com>	2025-01-27 17:32:47 +00:00
Momchil Velikov	f95a8bde34	[AArch64] Refactor implementation of FP8 types (NFC) (#123604 ) - The FP8 scalar type (`__mfp8`) was described as a vector type - The FP8 vector types were described/assumed to have integer element type (the element type ought to be `__mfp8`) - Add support for `m` type specifier (denoting `__mfp8`) in `DecodeTypeFromStr` and create builtin function prototypes using that specifier, instead of `int8_t`	2025-01-27 14:31:41 +00:00
Momchil Velikov	87103a016f	[AArch64] Implement NEON FP8 vectors as VectorType (#123603) Reimplement Neon FP8 vector types using attribute `neon_vector_type` instead of having them as builtin types. This allows implementing FP8 Neon intrinsics without the need to add special cases for these types when using `__builtin_shufflevector` or bitcast (using C-style cast operator) between vectors, both extensively used in the generated code in `arm_neon.h`.	2025-01-27 10:41:53 +00:00
Kazu Hirata	ccc066e8d5	[TableGen] Avoid repeated map lookups (NFC) (#124448 ) This patch avoids repeated map lookups and constructions of temporary std::string instances by switching to DenseSet.	2025-01-26 11:50:10 -08:00
Tom Stellard	1a53d4baeb	[clang][cmake] Apply bolt optimizations as part of the clang target (#119896) This change removes the need to call the clang-bolt target in order to apply bolt optimizations to clang. Now running `ninja clang` will build a clang with bolt optimizations, and `ninja check-clang` and `ninja install-clang` will test and install bolt optimized clang too. The clang-bolt target has been kept for compatibility reasons, but it is now just an alias to the clang target. Also, this new design for applying the bolt optimizations to clang will be easier to generalize and use to optimize other binaries/libraries in the project. --------- Co-authored-by: Amir Ayupov <fads93@gmail.com> Co-authored-by: Petr Hosek <phosek@google.com>	2025-01-25 03:59:45 -08:00
Momchil Velikov	dac49e8ddd	[Arm] Fix generating code with UB in NeonEmitter (#121802) When generating `arm_neon.h`, NeonEmitter outputs code that violates strict aliasing rules (C23 6.5 Expressions #7, C++23 7.2.1 Value category [basic.lval] #11), for example: bfloat16_t __reint = __p0; uint32_t __reint1 = (uint32_t)(*(uint16_t *) &__reint) << 16; __ret = *(float32_t *) &__reint1; This patch fixed the offending code by replacing it with a call to `__builtin_bit_cast`.	2025-01-24 10:57:23 +00:00
Oleksandr T.	4018317407	[Clang] restrict use of attribute names reserved by the C++ standard (#106036 ) Fixes #92196 https://eel.is/c++draft/macro.names#2 > A translation unit shall not #define or #undef names lexically identical to keywords, to the identifiers listed in Table [4](https://eel.is/c++draft/lex.name#tab:lex.name.special), or to the [attribute-token](https://eel.is/c++draft/dcl.attr.grammar#nt:attribute-token)s described in [[dcl.attr]](https://eel.is/c++draft/dcl.attr), except that the names likely and unlikely may be defined as function-like macros ([[cpp.replace]](https://eel.is/c++draft/cpp.replace))[.](https://eel.is/c++draft/macro.names#2.sentence-1)	2025-01-23 21:16:59 +02:00
Ilya Biryukov	f63e8ed16e	Revert "[Modules] Delay deserialization of preferred_name attribute at r… (#122726 )" This reverts commit c3ba6f378ef80d750e2278560c6f95a300114412. We are seeing performance regressions of up to 40% on some compilations with this patch, we will investigate and reland after fixing performance issues.	2025-01-22 18:17:37 +01:00
Chandler Carruth	bc6f84a2db	[StrTable] Switch diag group names to `llvm::StringTable` (#123302 ) Previously, they used a hand-rolled Pascal-string encoding different from all the other string tables produced from TableGen. This moves them to use the newly introduced runtime abstraction, and enhances that abstraction to support iterating over the string table as used in this case. From what I can tell the Pascal-string encoding isn't critical here to avoid expensive `strlen` calls, so I think this is a simpler and more consistent model. But if folks would prefer a Pascal-string style encoding, I can instead work to switch the `StringTable` abstraction towards that. It would require some tricky tradeoffs though to make it reasonably general: either using 4 bytes instead of 1 byte to encode the size, or having a fallback to `strlen` for long strings.	2025-01-22 00:41:27 -08:00
Jonathan Thackray	d028eaaeb8	[AArch64] Update SVE untyped intrinsics to have FP8 variants (#123585 ) Update the following intrinsics to have FP8 variants: ``` c svuint8_t svdup_laneq[_u8](svuint8_t zn, uint64_t imm_idx); svuint8_t svextq[_u8](svuint8_t zdn, svuint8_t zm, uint64_t imm); svint8_t svtblq[_s8](svint8_t zn, svuint8_t zm); svint8_t svtbxq[_s8](svint8_t fallback, svint8_t zn, svuint8_t zm); svuint8_t svuzpq1[_u8](svuint8_t zn, svuint8_t zm); svuint8_t svuzpq2[_u8](svuint8_t zn, svuint8_t zm); svuint8_t svzipq1[_u8](svuint8_t zn, svuint8_t zm); svuint8_t svzipq2[_u8](svuint8_t zn, svuint8_t zm); ```	2025-01-21 13:34:57 +00:00
Viktoriia Bakalova	c3ba6f378e	[Modules] Delay deserialization of preferred_name attribute at r… (#122726 ) …ecord level. This fixes the incorrect diagnostic emitted when compiling the following snippet ``` // string_view.h template<class _CharT> class basic_string_view; typedef basic_string_view<char> string_view; template<class _CharT> class __attribute__((__preferred_name__(string_view))) basic_string_view { public: basic_string_view() { } }; inline basic_string_view<char> foo() { return basic_string_view<char>(); } // A.cppm module; #include "string_view.h" export module A; // Use.cppm module; #include "string_view.h" export module Use; import A; ``` The diagnostic is ``` string_view.h:11:5: error: 'basic_string_view<char>::basic_string_view' from module 'A.<global>' is not present in definition of 'string_view' provided earlier ``` The underlying issue is that deserialization of the `preferred_name` attribute triggers deserialization of `basic_string_view<char>`, which triggers the deserialization of the `preferred_name` attribute again (since it's attached to the `basic_string_view` template). The deserialization logic is implemented in a way that prevents it from going on a loop in a literal sense (it detects early on that it has already seen the `string_view` typedef when trying to start its deserialization for the second time), but leaves the typedef deserialization in an unfinished state. Subsequently, the `string_view` typedef from the deserialized module cannot be merged with the same typedef from `string_view.h`, resulting in the above diagnostic. This PR resolves the problem by delaying the deserialization of the `preferred_name` attribute until the deserialization of the `basic_string_view` template is completed. As a result of deferring, the deserialization of the `preferred_name` attribute doesn't need to go on a loop since the type of the `string_view` typedef is already known when it's deserialized.	2025-01-17 09:10:58 +01:00
Erich Keane	bf17016a92	Add 'enum_select' diagnostic selection to clang. (#122505) This causes us to generate an enum to go along with the select diagnostic, which allows for clearer diagnostic error emit lines. The syntax for this is: %enum_select<EnumerationName>{%OptionalEnumeratorName{Text}|{Text2}}0 Where the curly brackets around the select-text are only required if an Enumerator name is provided. The TableGen here emits this as a normal 'select' to the frontend, which permits us to reuse all of the existing 'select' infrastructure. Documentation is the same as well. --------- Co-authored-by: Aaron Ballman <aaron@aaronballman.com>	2025-01-15 12:59:08 -08:00
Chandler Carruth	6d25345465	Remove the `CustomEntry` escape hatch from builtin TableGen (#120861 ) This was an especially challenging escape hatch because it directly forced the use of a specific X-macro structure and prevented any other form of TableGen emission. The problematic feature that motivated this is a case where a builtin's prototype can't be represented in the mini-language used by TableGen. Instead of adding a complete custom entry for this, this PR just teaches the prototype handling to do the same thing the X-macros did in this case: emit an empty string and let the Clang builtin handling respond appropriately. This should produce identical results while preserving all the rest of the structured representation in the builtin TableGen code.	2025-01-14 01:10:42 -08:00
Momchil Velikov	16e45b8fac	[AArch64] Implement FP8 SVE/SME reinterpret intrinsics (#121063)	2025-01-13 18:53:07 +00:00
Nicholas Guy	21b531ead1	[clang][llvm][aarch64] Add aarch64_sme_in_streaming_mode intrinsic (#120265 ) Replacing the extant streaming mode function call with an intrinsic allows us to make further optimisations around it. For example, if it's called within a function that has a known streaming mode, we can remove the dead code, and avoid the redundant conditional branch.	2025-01-07 09:02:26 +00:00
Chandler Carruth	2529a8df53	Mechanically port bulk of x86 builtins to TableGen (#120831 ) The goal is to make incremental (if small) progress towards fully TableGen'ed builtins, and to unblock #120534 by gaining access to more powerful TableGen-based representations. The bulk `.td` file addition was generated with the help of a very rough Python script. That script made no attempt to be robust or reusable, it specifically handled only the cases in the X86 `.def` file. Four entries from the `.def` file were not handled automatically as they used `BUILTIN` rather than `TARGET_BUILTIN`. These were ported by hand to an empty-feature `TargetBuiltin` entry, which seems like a better match. For all the automatically ported entries, the results were compared by sorting and diffing the `.def` file and the generated `.inc` file. The only differences were: - Different horizontal whitespace - Additional entries that had already been ported to the `.td` file. - More systematically using `Oi` instead of `LLi` for the type `long long int` in the fully general `__builtin_ia32_...` builtins for OpenCL support. The `.def` file was only partially moved to this it seems, and the systematic migration has updated a few missed builtins.	2025-01-04 02:23:54 -08:00
SpencerAbson	db84ae3a68	[Clang][AArch64] Add signed index/offset variants of sve2p1 qword stores (#120549 ) This patch adds signed offset/index variants to the SVE2p1 quadword store intrinsics, in accordance with https://github.com/ARM-software/acle/pull/359.	2024-12-19 13:27:07 +00:00
Momchil Velikov	c2172431c7	[AArch64] Implements FP8 SVE intrinsics for dot-product (#118125) This patch adds the following intrinsics: * 8-bit floating-point dot product to single-precision. // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT4) || __ARM_FEATURE_SSVE_FP8DOT4 svfloat32_t svdot[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm); svfloat32_t svdot[_n_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, mfloat8_t zm, fpm_t fpm); * 8-bit floating-point indexed dot product to single-precision. // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT4) || __ARM_FEATURE_SSVE_FP8DOT4 svfloat32_t svdot_lane[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm0_3, fpm_t fpm); * 8-bit floating-point dot product to half-precision. // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT2) || __ARM_FEATURE_SSVE_FP8DOT2 svfloat16_t svdot[_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm); svfloat16_t svdot[_n_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, mfloat8_t zm, fpm_t fpm); * 8-bit floating-point indexed dot product to half-precision. // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT2) || __ARM_FEATURE_SSVE_FP8DOT2 svfloat16_t svdot_lane[_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm0_7, fpm_t fpm);	2024-12-13 14:06:54 +00:00
Tom Stellard	ea6e13586c	[clang][perf-training] Fix profiling with -DCLANG_BOLT=perf (#119117 ) This fixes the llvm-support build that generates the profile data, and wraps the whole `cmake --build` command with perf instead of wrapping each individual clang invocation. This limits the number of profile files generated and reduces the time spent running perf2bolt.	2024-12-12 15:50:33 -08:00
Kazu Hirata	02dd73a5d5	[clang] Migrate away from PointerUnion::{is,get} (NFC) (#119654 ) Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.	2024-12-11 21:13:13 -08:00
Qiongsi Wu	f33e236905	[clang][Modules] Fixing Build Breaks When -DLLVM_ENABLE_MODULES=ON (#119473 ) A few recent changes are causing build breaks when `-DLLVM_ENABLE_MODULES=ON` (such as 834dfd23155351c9885eddf7b9664f7697326946 and 7dfdca1961aadc75ca397818bfb9bd32f1879248). This PR makes the required updates so that clang/llvm builds when `-DLLVM_ENABLE_MODULES=ON`. rdar://140803058	2024-12-11 17:33:25 -08:00
Haojian Wu	8f434bb9b2	[clang] Fix a dangling reference in clang/utils/TableGen/ClangDiagnosticsEmitter.cpp (#119197 ) `DiagsInGroup` is a `map<llvm::StringRef, ...>`, we store a dangling string_view in the key.	2024-12-10 09:05:28 +01:00
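
Note on dac49e8ddd ([Arm] Fix generating code with UB in NeonEmitter): the commit message describes replacing pointer-cast type punning with `__builtin_bit_cast`. The following is a minimal sketch of that idea, not the code NeonEmitter actually emits. The helper names `bf16_bits_to_float` and `check_examples` are hypothetical, a plain `uint16_t` stands in for `bfloat16_t`, and the sketch assumes a compiler that provides `__builtin_bit_cast` (recent Clang and GCC do) and a 32-bit IEEE-754 `float`.

```cpp
#include <cassert>
#include <cstdint>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: widen a raw bfloat16 bit pattern to a float.
// A bfloat16 holds the upper 16 bits of an IEEE-754 binary32 value.
inline float bf16_bits_to_float(std::uint16_t bits) {
  std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
  // Reinterpret the object representation by value. Unlike
  // *(float *)&widened, this never reads a uint32_t object through a
  // float lvalue, so it does not violate strict aliasing.
  return __builtin_bit_cast(float, widened);
}

// Hypothetical usage: a few bit patterns whose float values are exact.
inline void check_examples() {
  assert(bf16_bits_to_float(0x3F80) == 1.0f);   // 0x3F800000
  assert(bf16_bits_to_float(0xC000) == -2.0f);  // 0xC0000000
  assert(bf16_bits_to_float(0x0000) == 0.0f);
}
```

Copying the bytes with `std::memcpy` would be an equally valid way to avoid the aliasing violation; the builtin is used here only because it is what the commit message names.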

2087 Commits