llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-08 10:26:08 +00:00

Author	SHA1	Message	Date
Kadir Cetinkaya	5845688e91	Reapply "[clang] Introduce diagnostics suppression mappings (#112517 )" This reverts commit 5f140ba54794fe6ca379362b133eb27780e363d7.	2024-11-13 10:35:22 +01:00
Kadir Cetinkaya	5f140ba547	Revert "[clang] Introduce diagnostics suppression mappings (#112517 )" This reverts commit 12e3ed8de8c6063b15916b3faf67c8c9cd17df1f. This reverts commit 41e3919ded78d8870f7c95e9181c7f7e29aa3cc4. There are some buildbot breakages in https://lab.llvm.org/buildbot/#/builders/18/builds/6832.	2024-11-12 18:30:42 +01:00
kadir çetinkaya	41e3919ded	[clang] Introduce diagnostics suppression mappings (#112517 ) This implements https://discourse.llvm.org/t/rfc-add-support-for-controlling-diagnostics-severities-at-file-level-granularity-through-command-line/81292. Users now can suppress warnings for certain headers by providing a mapping with globs, a sample file looks like: ``` [unused] src:* src:clang/=emit ``` This will suppress warnings from `-Wunused` group in all files that aren't under `clang/` directory. This mapping file can be passed to clang via `--warning-suppression-mappings=foo.txt`. At a high level, mapping file is stored in DiagnosticOptions and then processed with rest of the warning flags when creating a DiagnosticsEngine. This is a functor that uses SpecialCaseLists underneath to match against globs coming from the mappings file. This implies processing warning options now performs IO, relevant interfaces are updated to take in a VFS, falling back to RealFileSystem when one is not available.	2024-11-12 10:53:43 +01:00
Jan Svoboda	a2f9d1d078	[clang][serialization] Enable `ASTWriter` to work with `Preprocessor` only (#115237 ) This PR builds on top of https://github.com/llvm/llvm-project/pull/115235 and makes it possible to call `ASTWriter::WriteAST()` with `Preprocessor` only instead of full `Sema` object. So far, there are no clients that leverage the new capability - that will come in a follow-up commit.	2024-11-11 11:01:01 -08:00
Aaron Ballman	af7c58b7ea	Remove support for RenderScript (#112916 ) See https://discourse.llvm.org/t/rfc-deprecate-and-eventually-remove-renderscript-support/81284 for the RFC	2024-10-28 12:48:42 -04:00
Jan Svoboda	b1aea98cfa	[clang] Make deprecations of some `FileManager` APIs formal (#110014 ) Some `FileManager` APIs still return `{File,Directory}Entry` instead of the preferred `{File,Directory}EntryRef`. These are documented to be deprecated, but don't have the attribute that warns on their usage. This PR marks them as such with `LLVM_DEPRECATED()` and replaces their usage with the recommended counterparts. NFCI.	2024-09-25 10:36:44 -07:00
Kazu Hirata	3cd3202b78	[Frontend] Teach LoadFromASTFile to take FileName by StringRef (NFC) (#109583 ) Without this patch, several callers of LoadFromASTFile construct an instance of std::string to be passed as FileName, only to be converted back to StringRef when LoadFromASTFile calls ReadAST. This patch changes the type of FileName to StringRef and updates the callers.	2024-09-23 19:21:39 -07:00
Volodymyr Sapsai	4357175569	[re-format][Modules] Follow-up formatting to "Mention which AST file's options differ from the current TU options." (#102484 ) Fix formatting for fdf8e3e31103bc81917cdb27150877f524bb2669.	2024-08-08 11:59:47 -03:00
Volodymyr Sapsai	fdf8e3e311	[Modules][Diagnostic] Mention which AST file's options differ from the current TU options. (#101413 ) Claiming a mismatch is always in a precompiled header is wrong and misleading as a mismatch can happen in any provided AST file. Emitting a path for a file with a problem allows to disambiguate between multiple input files. Use generic term "AST file" because we don't always know a kind of the provided file (for example, see `ASTReader::readASTFileControlBlock`). rdar://65005546	2024-08-08 11:23:47 -03:00
Chuanqi Xu	8af86025af	[NFC] [Serialization] Unify how LocalDeclID can be created Now we can create a LocalDeclID directly with an integer without verifying. It may be hard to refactor if we want to change the way we serialize DeclIDs (See https://github.com/llvm/llvm-project/pull/95897). Also it is hard for us to debug if someday someone construct a LocalDeclID with an incorrect value. So in this patch, I tried to unify the way we can construct a LocalDeclID in ASTReader, where we will construct the LocalDeclID from the serialized data. Also, now we can verify the constructed LocalDeclID sooner in the new interface.	2024-06-19 15:18:01 +08:00
Vlad Serebrennikov	874f511ae7	[clang] Introduce `SemaCodeCompletion` (#92311 ) This patch continues previous efforts to split `Sema` up, this time covering code completion. Context can be found in #84184. Dropping `Code` prefix from function names in `SemaCodeCompletion` would make sense, but I think this PR has enough changes already. As usual, formatting changes are done as a separate commit. Hopefully this helps with the review.	2024-05-17 20:55:37 +04:00
Chuanqi Xu	947b062823	Reland "[Modules] No transitive source location change (#86912 )" This relands 6c31104. The patch was reverted due to incorrectly introduced alignment. And the patch was re-commited after fixing the alignment issue. Following off are the original message: This is part of "no transitive change" patch series, "no transitive source location change". I talked this with @Bigcheese in the tokyo's WG21 meeting. The idea comes from @jyknight posted on LLVM discourse. That for: ``` // A.cppm export module A; ... // B.cppm export module B; import A; ... //--- C.cppm export module C; import C; ``` Almost every time A.cppm changes, we need to recompile `B`. Due to we think the source location is significant to the semantics. But it may be good if we can avoid recompiling `C` if the change from `A` wouldn't change the BMI of B. This patch only cares source locations. So let's focus on source location's example. We can see the full example from the attached test. ``` //--- A.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- A.v1.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- B.cppm export module B; import A; export int funcB() { return funcA(); } //--- C.cppm export module C; import A; export void testD() { C<int> c; c.func(); } ``` Here the only difference between `A.cppm` and `A.v1.cppm` is that `A.v1.cppm` has an additional blank line. Then the test shows that two BMI of `B.cppm`, one specified `-fmodule-file=A=A.pcm` and the other specified `-fmodule-file=A=A.v1.pcm`, should have the bit-wise same contents. However, it is a different story for C, since C instantiates templates from A, and the instantiation records the source information from module A, which is different from `A` and `A.v1`, so it is expected that the BMI `C.pcm` and `C.v1.pcm` can and should differ. To fully understand the patch, we need to understand how we encodes source locations and how we serialize and deserialize them. For source locations, we encoded them as: ``` \| \| \| _____ base offset of an imported module \| \| \| \|_____ base offset of another imported module \| \| \| \| \| ___ 0 ``` As the diagram shows, we encode the local (unloaded) source location from 0 to higher bits. And we allocate the space for source locations from the loaded modules from high bits to 0. Then the source locations from the loaded modules will be mapped to our source location space according to the allocated offset. For example, for, ``` // a.cppm export module a; ... // b.cppm export module b; import a; ... ``` Assuming the offset of a source location (let's name the location as `S`) in a.cppm is 45 and we will record the value `45` into the BMI `a.pcm`. Then in b.cppm, when we import a, the source manager will allocate a space for module 'a' (according to the recorded number of source locations) as the base offset of module 'a' in the current source location spaces. Let's assume the allocated base offset as 90 in this example. Then when we want to get the location in the current source location space for `S`, we can get it simply by adding `45` to `90` to `135`. Finally we can get the source location for `S` in module B as `135`. And when we want to write module `b`, we would also write the source location of `S` as `135` directly in the BMI. And to clarify the location `S` comes from module `a`, we also need to record the base offset of module `a`, 90 in the BMI of `b`. Then the problem comes. Since the base offset of module 'a' is computed by the number source locations in module 'a'. In module 'b', the recorded base offset of module 'a' will change every time the number of source locations in module 'a' increase or decrease. In other words, the contents of BMI of B will change every time the number of locations in module 'a' changes. This is pretty sensitive. Almost every change will change the number of locations. So this is the problem this patch want to solve. Let's continue with the existing design to understand what's going on. Another interesting case is: ``` // c.cppm export module c; import whatever; import a; import b; ... ``` In `c.cppm`, when we import `a`, we still need to allocate a base location offset for it, let's say the value becomes to `200` somehow. Then when we reach the location `S` recorded in module `b`, we need to translate it into the current source location space. The solution is quite simple, we can get it by `135 + (200 - 90) = 245`. In another word, the offset of a source location in current module can be computed as `Recorded Offset + Base Offset of the its module file - Recorded Base Offset`. Then we're almost done about how we handle the offset of source locations in serializers. From the abstract level, what we want to do is to remove the hardcoded base offset of imported modules and remain the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of the source location. So in this patch, for each source location, we will store the local offset of the location and the module file index. For the above example, in `b.pcm`, the source location of `S` will be recorded as `135` directly. And in the new design, the source location of `S` will be recorded as `<1, 45>`. Here `1` stands for the module file index of `a` in module `b`. And `45` means the offset of `S` to the base offset of module `a`. So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. And I feel it is worthy. The recompilation problem of modules is really annoying and there are still people complaining this. But if we can make this (including stopping other changes transitively), I think this may be a killer feature for modules. And from @Bigcheese , this should be helpful for clang explicit modules too. And the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. No significant change on compilation time. The size of .pcm files becomes to 204M from 200M. I think the trade-off is pretty fair. I didn't use another slot to record the module file index. I tried to use the higher 32 bits of the existing source location encodings to store that information. This design may be safe. Since we use `unsigned` to store source locations but we use uint64_t in serialization. And generally `unsigned` is 32 bit width in most platforms. So it might not be a safe problem. Since all the bits we used to store the module file index is not used before. So the new encodings may be: ``` \|-----------------------\|-----------------------\| \| A \| B \| C \| * A: 32 bit. The index of the module file in the module manager + 1. * The +1 here is necessary since we wish 0 stands for the current module file. * B: 31 bit. The offset of the source location to the module file * containing it. * C: The macro bit. We rotate it to the lowest bit so that we can save * some space in case the index of the module file is 0. ``` (The B and C is the existing raw encoding for source locations) Another reason to reuse the same slot of the source location is to reduce the impact of the patch. Since there are a lot of places assuming we can store and get a source location from a slot. And if I tried to add another slot, a lot of codes breaks. I don't feel it is worhty. Another impact of this decision is that, the existing small optimizations for encoding source location may be invalided. The key of the optimization is that we can turn large values into small values then we can use VBR6 format to reduce the size. But if we decided to put the module file index into the higher bits, then maybe it simply doesn't work. An example may be the `SourceLocationSequence` optimization. This will only affect the size of on-disk .pcm files. I don't expect this impact the speed and memory use of compilations. And seeing my small experiments above, I feel this trade off is worthy. The mental model for handling source location offsets is not so complex and I believe we can solve it by adding module file index to each stored source location. For the practical side, since the source location is pretty sensitive, and the patch can pass all the in-tree tests and a small scale projects, I feel it should be correct. I'll continue to work on no transitive decl change and no transitive identifier change (if matters) to achieve the goal to stop the propagation of unnecessary changes. But all of this depends on this patch. Since, clearly, the source locations are the most sensitive thing. --- The release nots and documentation will be added seperately.	2024-05-06 13:35:16 +08:00
Chuanqi Xu	d333a0de68	Revert "[Modules] No transitive source location change (#86912 )" This reverts commit 6c3110464bac3600685af9650269b0b2b8669d34. Required by the post commit comments: https://github.com/llvm/llvm-project/pull/86912	2024-04-30 22:32:02 +08:00
Chuanqi Xu	6c3110464b	[Modules] No transitive source location change (#86912 ) This is part of "no transitive change" patch series, "no transitive source location change". I talked this with @Bigcheese in the tokyo's WG21 meeting. The idea comes from @jyknight posted on LLVM discourse. That for: ``` // A.cppm export module A; ... // B.cppm export module B; import A; ... //--- C.cppm export module C; import C; ``` Almost every time A.cppm changes, we need to recompile `B`. Due to we think the source location is significant to the semantics. But it may be good if we can avoid recompiling `C` if the change from `A` wouldn't change the BMI of B. # Motivation Example This patch only cares source locations. So let's focus on source location's example. We can see the full example from the attached test. ``` //--- A.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- A.v1.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- B.cppm export module B; import A; export int funcB() { return funcA(); } //--- C.cppm export module C; import A; export void testD() { C<int> c; c.func(); } ``` Here the only difference between `A.cppm` and `A.v1.cppm` is that `A.v1.cppm` has an additional blank line. Then the test shows that two BMI of `B.cppm`, one specified `-fmodule-file=A=A.pcm` and the other specified `-fmodule-file=A=A.v1.pcm`, should have the bit-wise same contents. However, it is a different story for C, since C instantiates templates from A, and the instantiation records the source information from module A, which is different from `A` and `A.v1`, so it is expected that the BMI `C.pcm` and `C.v1.pcm` can and should differ. # Internal perspective of status quo To fully understand the patch, we need to understand how we encodes source locations and how we serialize and deserialize them. For source locations, we encoded them as: ``` \| \| \| _____ base offset of an imported module \| \| \| \|_____ base offset of another imported module \| \| \| \| \| ___ 0 ``` As the diagram shows, we encode the local (unloaded) source location from 0 to higher bits. And we allocate the space for source locations from the loaded modules from high bits to 0. Then the source locations from the loaded modules will be mapped to our source location space according to the allocated offset. For example, for, ``` // a.cppm export module a; ... // b.cppm export module b; import a; ... ``` Assuming the offset of a source location (let's name the location as `S`) in a.cppm is 45 and we will record the value `45` into the BMI `a.pcm`. Then in b.cppm, when we import a, the source manager will allocate a space for module 'a' (according to the recorded number of source locations) as the base offset of module 'a' in the current source location spaces. Let's assume the allocated base offset as 90 in this example. Then when we want to get the location in the current source location space for `S`, we can get it simply by adding `45` to `90` to `135`. Finally we can get the source location for `S` in module B as `135`. And when we want to write module `b`, we would also write the source location of `S` as `135` directly in the BMI. And to clarify the location `S` comes from module `a`, we also need to record the base offset of module `a`, 90 in the BMI of `b`. Then the problem comes. Since the base offset of module 'a' is computed by the number source locations in module 'a'. In module 'b', the recorded base offset of module 'a' will change every time the number of source locations in module 'a' increase or decrease. In other words, the contents of BMI of B will change every time the number of locations in module 'a' changes. This is pretty sensitive. Almost every change will change the number of locations. So this is the problem this patch want to solve. Let's continue with the existing design to understand what's going on. Another interesting case is: ``` // c.cppm export module c; import whatever; import a; import b; ... ``` In `c.cppm`, when we import `a`, we still need to allocate a base location offset for it, let's say the value becomes to `200` somehow. Then when we reach the location `S` recorded in module `b`, we need to translate it into the current source location space. The solution is quite simple, we can get it by `135 + (200 - 90) = 245`. In another word, the offset of a source location in current module can be computed as `Recorded Offset + Base Offset of the its module file - Recorded Base Offset`. Then we're almost done about how we handle the offset of source locations in serializers. # The high level design of current patch From the abstract level, what we want to do is to remove the hardcoded base offset of imported modules and remain the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of the source location. So in this patch, for each source location, we will store the local offset of the location and the module file index. For the above example, in `b.pcm`, the source location of `S` will be recorded as `135` directly. And in the new design, the source location of `S` will be recorded as `<1, 45>`. Here `1` stands for the module file index of `a` in module `b`. And `45` means the offset of `S` to the base offset of module `a`. So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. And I feel it is worthy. The recompilation problem of modules is really annoying and there are still people complaining this. But if we can make this (including stopping other changes transitively), I think this may be a killer feature for modules. And from @Bigcheese , this should be helpful for clang explicit modules too. And the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. No significant change on compilation time. The size of .pcm files becomes to 204M from 200M. I think the trade-off is pretty fair. # Some low level details I didn't use another slot to record the module file index. I tried to use the higher 32 bits of the existing source location encodings to store that information. This design may be safe. Since we use `unsigned` to store source locations but we use uint64_t in serialization. And generally `unsigned` is 32 bit width in most platforms. So it might not be a safe problem. Since all the bits we used to store the module file index is not used before. So the new encodings may be: ``` \|-----------------------\|-----------------------\| \| A \| B \| C \| * A: 32 bit. The index of the module file in the module manager + 1. The +1 here is necessary since we wish 0 stands for the current module file. * B: 31 bit. The offset of the source location to the module file containing it. * C: The macro bit. We rotate it to the lowest bit so that we can save some space in case the index of the module file is 0. ``` (The B and C is the existing raw encoding for source locations) Another reason to reuse the same slot of the source location is to reduce the impact of the patch. Since there are a lot of places assuming we can store and get a source location from a slot. And if I tried to add another slot, a lot of codes breaks. I don't feel it is worhty. Another impact of this decision is that, the existing small optimizations for encoding source location may be invalided. The key of the optimization is that we can turn large values into small values then we can use VBR6 format to reduce the size. But if we decided to put the module file index into the higher bits, then maybe it simply doesn't work. An example may be the `SourceLocationSequence` optimization. This will only affect the size of on-disk .pcm files. I don't expect this impact the speed and memory use of compilations. And seeing my small experiments above, I feel this trade off is worthy. # Correctness The mental model for handling source location offsets is not so complex and I believe we can solve it by adding module file index to each stored source location. For the practical side, since the source location is pretty sensitive, and the patch can pass all the in-tree tests and a small scale projects, I feel it should be correct. # Future Plans I'll continue to work on no transitive decl change and no transitive identifier change (if matters) to achieve the goal to stop the propagation of unnecessary changes. But all of this depends on this patch. Since, clearly, the source locations are the most sensitive thing. --- The release nots and documentation will be added seperately.	2024-04-30 15:57:58 +08:00
Chuanqi Xu	fe47e8ff3a	[NFC] [ASTUnit] [Serialization] Transalte local decl ID to global decl ID before consuming Discovered from `d86cc73bbf`. There is a potential issue of using DeclID in ASTUnit. ASTUnit may record the declaration ID from ASTWriter. And after loading the preamble, the ASTUnit may consume the recorded declaration ID directly in ExternalASTSource. This is not good. According to the design, all local declaration ID consumed in ASTReader need to be translated by `ASTReader::getGlobaldeclID()`. This will be problematic if we changed the encodings of declaration IDs or if we make preamble to work more complexly.	2024-04-25 15:55:46 +08:00
Chuanqi Xu	d86cc73bbf	[NFC] [Serialization] Avoid using DeclID directly as much as possible This patch tries to remove all the direct use of DeclID except the real low level reading and writing. All the use of DeclID is converted to the use of LocalDeclID or GlobalDeclID. This is helpful to increase the readability and type safety.	2024-04-25 14:59:09 +08:00
Chuanqi Xu	72b58146b1	Revert "[NFC] [Serialization] Avoid using DeclID directly as much as possible" This reverts commit 42070a5c092ed420bf92ebf38229c594885e94c7. I forgot to touch lldb.	2024-04-25 14:26:07 +08:00
Chuanqi Xu	42070a5c09	[NFC] [Serialization] Avoid using DeclID directly as much as possible This patch tries to remove all the direct use of DeclID except the real low level reading and writing. All the use of DeclID is converted to the use of LocalDeclID or GlobalDeclID. This is helpful to increase the readability and type safety.	2024-04-25 14:14:05 +08:00
Chuanqi Xu	c2a98fdeb3	[NFC] Move DeclID from serialization/ASTBitCodes.h to AST/DeclID.h (#89873 ) Previously, the DeclID is defined in serialization/ASTBitCodes.h under clang::serialization namespace. However, actually the DeclID is not purely used in serialization part. The DeclID is already widely used in AST and all around the clang project via classes like `LazyPtrDecl` or calling `ExternalASTSource::getExernalDecl()`. All such uses are via the raw underlying type of `DeclID` as `uint32_t`. This is not pretty good. This patch moves the DeclID class family to a new header `AST/DeclID.h` so that the whole project can use the wrapped class `DeclID`, `GlobalDeclID` and `LocalDeclID` instead of the raw underlying type. This can improve the readability and the type safety.	2024-04-25 13:53:22 +08:00
Chuanqi Xu	2e5af56b05	[C++20] [Modules] Allow to compile a pcm with and without -fPIC seperately We can compile a module unit in 2 phase compilaton: ``` clang++ -std=c++20 a.cppm --precompile -o a.pcm clang++ -std=c++20 a.pcm -c -o a.o ``` And it is a general requirement that we need to compile a translation unit with and without -fPIC for static and shared libraries. But for C++20 modules with 2 phase compilation, it may be waste of time to compile them 2 times completely. It may be fine to generate one BMI and compile it with and without -fPIC seperately. e.g., ``` clang++ -std=c++20 a.cppm --precompile -o a.pcm clang++ -std=c++20 a.pcm -c -o a.o clang++ -std=c++20 a.pcm -c -fPIC -o a-PIC.o ``` Then we can save the time to parse a.cppm repeatedly.	2024-02-23 11:05:15 +08:00
Juergen Ributzka	21361bb860	[clang] Remove unused argument. NFC. (#73594 )	2023-11-28 09:19:39 -08:00
Chuanqi Xu	0f7aaeb324	[C++20] [Modules] Allow export from language linkage Close https://github.com/llvm/llvm-project/issues/71347 Previously I misread the concept of module purview. I thought if a declaration attached to a unnamed module, it can't be part of the module purview. But after the issue report, I recognized that module purview is more of a concept about locations instead of semantics. Concretely, the things in the language linkage after module declarations can be exported. This patch refactors `Module::isModulePurview()` and introduces some possible code cleanups.	2023-11-09 17:44:41 +08:00
Aaron Ballman	46518a14f1	Revert "Revert "Fixes and closes #53952 . Setting the ASTHasCompilerErrors member variable correctly based on the PP diagnostics. (#68127 )"" This reverts commit a6acf3fd49a20c570a390af2a3c84e10b9545b68 and relands a50e63b38b931d945f97eac882278068221eca17. The original revert was done by mistake.	2023-10-06 07:43:19 -04:00
Kazu Hirata	a6acf3fd49	Revert "Fixes and closes #53952 . Setting the ASTHasCompilerErrors member variable correctly based on the PP diagnostics. (#68127 )" This reverts commit a50e63b38b931d945f97eac882278068221eca17. With clang-14.0.6 as the host compiler, I'm getting: ld.lld: error: undefined symbol: clang::ASTWriter::WriteAST(clang::Sema&, llvm::StringRef, clang::Module*, llvm::StringRef, bool, bool) >>> referenced by ASTUnit.cpp >>> ASTUnit.cpp.o:(clang::ASTUnit::serialize(llvm::raw_ostream&)) in archive lib/libclangFrontend.a	2023-10-05 13:08:24 -07:00
Rajkumar Ananthu	a50e63b38b	Fixes and closes #53952 . Setting the ASTHasCompilerErrors member variable correctly based on the PP diagnostics. (#68127 ) The issue #53952 is reported indicating clang is giving a crashing pch file, when hasErrors is been passed incorrectly to WriteAST method. To fix the issue, the parameter has been removed and instead we're relying on the results of `hasUncompilableErrorOccured()` instead of letting the caller override it. Fixes https://github.com/llvm/llvm-project/issues/53952	2023-10-05 14:51:57 -04:00
Jan Svoboda	523c471250	Reapply "[clang] NFCI: Adopt `SourceManager::getFileEntryRefForID()`" This reapplies ddbcc10b9e26b18f6a70e23d0611b9da75ffa52f, except for a tiny part that was reverted separately: 65331da0032ab4253a4bc0ddcb2da67664bd86a9. That will be reapplied later on, since it turned out to be more involved. This commit is enabled by 5523fefb01c282c4cbcaf6314a9aaf658c6c145f and f0f548a65a215c450d956dbcedb03656449705b9, specifically the part that makes 'clang-tidy/checkers/misc/header-include-cycle.cpp' separator agnostic.	2023-09-08 19:04:01 -07:00
Chuanqi Xu	96122b5b71	[C++20] [Modules] Introduce ForceCheckCXX20ModulesInputFiles options for C++20 modules Previously, we banned the check for input files from C++20 modules since we thought the BMI from C++20 modules should be a standalone artifact. However, during the recent experiment with clangd for modules, I find it is necessary to tell whether or not a BMI is out-of-date by checking the input files especially for language servers. So this patch brings a header search option ForceCheckCXX20ModulesInputFiles to allow the tools (concretly, clangd) to check the input files from BMI.	2023-09-08 16:53:12 +08:00
Jan Svoboda	0a9611fd8d	Revert "[clang] NFCI: Adopt `SourceManager::getFileEntryRefForID()`" This reverts commit ddbcc10b9e26b18f6a70e23d0611b9da75ffa52f. The 'clang-tidy/checkers/misc/header-include-cycle.cpp' test started failing on Windows: https://lab.llvm.org/buildbot/#/builders/216/builds/26855.	2023-09-06 13:23:23 -07:00
Jan Svoboda	ddbcc10b9e	[clang] NFCI: Adopt `SourceManager::getFileEntryRefForID()` This commit replaces some calls to the deprecated `FileEntry::getName()` with `FileEntryRef::getName()` by swapping current usages of `SourceManager::getFileEntryForID()` with `SourceManager::getFileEntryRefForID()`. This lowers the number of usages of the deprecated `FileEntry::getName()` from 95 to 50.	2023-09-06 10:49:48 -07:00
Jan Svoboda	5746002ebb	[clang] NFCI: Change returned LanguageOptions pointer to reference	2023-09-05 13:23:53 -07:00
Fred Fu	79af92bb99	Reland "[clang-repl] support code completion at a REPL." Original commit message: " This patch enabled code completion for ClangREPL. The feature was built upon three existing Clang components: a list completer for LineEditor, a CompletionConsumer from SemaCodeCompletion, and the ASTUnit::codeComplete method. The first component serves as the main entry point of handling interactive inputs. Because a completion point for a compiler instance has to be unchanged once it is set, an incremental compiler instance is created for each code completion. Such a compiler instance carries over AST context source from the main interpreter compiler in order to obtain declarations or bindings from previous input in the same REPL session. The most important API codeComplete in Interpreter/CodeCompletion is a thin wrapper that calls with ASTUnit::codeComplete with necessary arguments, such as a code completion point and a ReplCompletionConsumer, which communicates completion results from SemaCodeCompletion back to the list completer for the REPL. In addition, PCC_TopLevelOrExpression and CCC_TopLevelOrExpression` top levels were added so that SemaCodeCompletion can treat top level statements like expression statements at the REPL. For example, clang-repl> int foo = 42; clang-repl> f<tab> From a parser's persective, the cursor is at a top level. If we used code completion without any changes, PCC_Namespace would be supplied to Sema::CodeCompleteOrdinaryName, and thus the completion results would not include foo. Currently, the way we use PCC_TopLevelOrExpression and CCC_TopLevelOrExpression is no different from the way we use PCC_Statement and CCC_Statement respectively. Differential revision: https://reviews.llvm.org/D154382 " The new patch also fixes clangd and several memory issues that the bots reported and upload the missing files.	2023-08-28 20:09:03 +00:00
Vassil Vassilev	752f87cd6a	Revert "Reland "[clang-repl] support code completion at a REPL."" This reverts commit 5ab25a42ba70c4b50214b0e78eaaccd30696fa09 due to forgotten files.	2023-08-28 20:05:14 +00:00
Fred Fu	5ab25a42ba	Reland "[clang-repl] support code completion at a REPL." Original commit message: " This patch enabled code completion for ClangREPL. The feature was built upon three existing Clang components: a list completer for LineEditor, a CompletionConsumer from SemaCodeCompletion, and the ASTUnit::codeComplete method. The first component serves as the main entry point of handling interactive inputs. Because a completion point for a compiler instance has to be unchanged once it is set, an incremental compiler instance is created for each code completion. Such a compiler instance carries over AST context source from the main interpreter compiler in order to obtain declarations or bindings from previous input in the same REPL session. The most important API codeComplete in Interpreter/CodeCompletion is a thin wrapper that calls with ASTUnit::codeComplete with necessary arguments, such as a code completion point and a ReplCompletionConsumer, which communicates completion results from SemaCodeCompletion back to the list completer for the REPL. In addition, PCC_TopLevelOrExpression and CCC_TopLevelOrExpression` top levels were added so that SemaCodeCompletion can treat top level statements like expression statements at the REPL. For example, clang-repl> int foo = 42; clang-repl> f<tab> From a parser's persective, the cursor is at a top level. If we used code completion without any changes, PCC_Namespace would be supplied to Sema::CodeCompleteOrdinaryName, and thus the completion results would not include foo. Currently, the way we use PCC_TopLevelOrExpression and CCC_TopLevelOrExpression is no different from the way we use PCC_Statement and CCC_Statement respectively. Differential revision: https://reviews.llvm.org/D154382 " The new patch also fixes clangd and several memory issues that the bots reported.	2023-08-28 19:59:56 +00:00
Vassil Vassilev	f94a937cb3	Revert "[clang-repl] support code completion at a REPL." This reverts commit eb0e6c3134ef6deafe0a4958e9e1a1214b3c2f14 due to failures in clangd such as https://lab.llvm.org/buildbot/#/builders/57/builds/29377	2023-08-23 14:46:15 +00:00
Fred Fu	eb0e6c3134	[clang-repl] support code completion at a REPL. This patch enabled code completion for ClangREPL. The feature was built upon three existing Clang components: a list completer for LineEditor, a CompletionConsumer from SemaCodeCompletion, and the ASTUnit::codeComplete method. The first component serves as the main entry point of handling interactive inputs. Because a completion point for a compiler instance has to be unchanged once it is set, an incremental compiler instance is created for each code completion. Such a compiler instance carries over AST context source from the main interpreter compiler in order to obtain declarations or bindings from previous input in the same REPL session. The most important API codeComplete in Interpreter/CodeCompletion is a thin wrapper that calls with ASTUnit::codeComplete with necessary arguments, such as a code completion point and a ReplCompletionConsumer, which communicates completion results from SemaCodeCompletion back to the list completer for the REPL. In addition, PCC_TopLevelOrExpression and CCC_TopLevelOrExpression` top levels were added so that SemaCodeCompletion can treat top level statements like expression statements at the REPL. For example, clang-repl> int foo = 42; clang-repl> f<tab> From a parser's persective, the cursor is at a top level. If we used code completion without any changes, PCC_Namespace would be supplied to Sema::CodeCompleteOrdinaryName, and thus the completion results would not include foo. Currently, the way we use PCC_TopLevelOrExpression and CCC_TopLevelOrExpression is no different from the way we use PCC_Statement and CCC_Statement respectively. Differential revision: https://reviews.llvm.org/D154382	2023-08-23 14:00:59 +00:00
Jan Svoboda	6a11557832	[clang][modules] Avoid storing command-line macro definitions into implicitly built PCM files With implicit modules, it's impossible to load a PCM file that was built using different command-line macro definitions. This is guaranteed by the fact that they contribute to the context hash. This means that we don't need to store those macros into PCM files for validation purposes. This patch avoids serializing them in those circumstances, since there's no other use for command-line macro definitions (besides "-module-file-info"). For a typical Apple project, this speeds up the dependency scan by 5.6% and shrinks the cache with scanning PCMs by 26%. Reviewed By: benlangmuir Differential Revision: https://reviews.llvm.org/D158136	2023-08-17 11:11:29 -07:00
Hamish Knight	a57bdc8fe6	[clang] Fix leak in LoadFromCommandLineWorkingDirectory unit test Change `ASTUnit::LoadFromCommandLine` to return a `std::unique_ptr` instead of a +1 pointer, fixing a leak in the unit test `LoadFromCommandLineWorkingDirectory`. Reviewed By: bnbarham, benlangmuir Differential Revision: https://reviews.llvm.org/D154257	2023-06-30 16:02:12 -07:00
Hamish Knight	62e4c22c95	[clang] Fix ASTUnit working directory handling Fix a couple of issues with the handling of the current working directory in ASTUnit: - Use `createPhysicalFileSystem` instead of `getRealFileSystem` to avoid affecting the process' current working directory, and set it at the top of `ASTUnit::LoadFromCommandLine` such that the driver used for argument parsing and the ASTUnit share the same VFS. This ensures that '-working-directory' correctly sets the VFS working directory in addition to the FileManager working directory. - Ensure we preserve the FileSystemOptions set on the FileManager when re-creating it (as `ASTUnit::Reparse` will clear the currently set FileManager). rdar://110697657 Reviewed By: bnbarham, benlangmuir Differential Revision: https://reviews.llvm.org/D154134	2023-06-30 09:56:42 -07:00
Haojian Wu	db15ace6ab	[clang] NFC, replace llvm::writeFileAtomically with llvm::writeToOutput API in ASTUnit.cpp writeFileAtomically is going to be deprecated, in favor of writeToOutput.	2023-06-30 10:18:49 +02:00
David Goldman	a42ce094d9	[clang][Sema] Add CodeCompletionContext::CCC_ObjCClassForwardDecl - Use this new context in Sema to limit completions to seen ObjC class names - Use this new context in clangd to disable include insertions when completing ObjC forward decls Reviewed By: kadircet Differential Revision: https://reviews.llvm.org/D150978	2023-06-27 16:25:40 -04:00
Jan Svoboda	fa5788ff8d	[clang][index] NFCI: Make `CXFile` a `FileEntryRef` This patch swaps out the `void ` behind `CXFile` from `FileEntry ` to `FileEntryRef::MapEntry *`. This allows us to remove some deprecated uses of `FileEntry::getName()`. Depends on D151854. Reviewed By: benlangmuir Differential Revision: https://reviews.llvm.org/D151938	2023-06-15 12:34:54 +02:00
Chuanqi Xu	807aa26136	[C++20] [Modules] Don't ignore -fmodule-file when we compile pcm files Close https://github.com/llvm/llvm-project/issues/62843. Previously when we compile .pcm files into .o files, the `-fmodule-file=<module-name>=<module-path>` option is ignored. This is conflicted with our consensus in https://github.com/llvm/llvm-project/issues/62707.	2023-05-23 14:22:01 +08:00
Vitaly Buka	b96d0635f0	[NFC][AST] Align one line	2023-05-18 01:19:48 -07:00
Vitaly Buka	e8cd04624e	[AST] Initialize local counter I assume it's optional and ReadAST does not have to set the counter on success. Without the patch msan complains that we pass uninitialized value into noundef parameters of setCounterValue. Reviewed By: bnbarham, barannikov88 Differential Revision: https://reviews.llvm.org/D150492	2023-05-18 01:19:11 -07:00
Chuanqi Xu	88a720d194	[NFC] [C++20] [Modules] Rename ASTContext::getNamedModuleForCodeGen to ASTContext::getCurrentNamedModule The original name "ASTContext::getNamedModuleForCodeGen" is not properly reflecting the usage of the interface. This interface can be used to judge the current module unit in both sema analysis and code generation. So the original name was not so correct.	2023-05-16 11:24:35 +08:00
Chuanqi Xu	7f37066915	Revert "[NFC] [C++20] [Modules] Refactor Sema::isModuleUnitOfCurrentTU into" This reverts commit f109b1016801e2b0dbee278f3c517057c0b1d441 as required in `f109b10168 (commitcomment-113477829)`.	2023-05-16 10:47:53 +08:00
Chuanqi Xu	f109b10168	[NFC] [C++20] [Modules] Refactor Sema::isModuleUnitOfCurrentTU into Decl::isInCurrentModuleUnit Refactor `Sema::isModuleUnitOfCurrentTU` to `Decl::isInCurrentModuleUnit` to make code simpler a little bit. Note that although this patch introduces a FIXME, this is an existing issue and this patch just tries to describe it explicitly.	2023-05-10 16:01:27 +08:00
Ben Langmuir	8fe8d69ddf	[clang][deps] Make clang-scan-deps write modules in raw format We have no use for debug info for the scanner modules, and writing raw ast files speeds up scanning ~15% in some cases. Note that the compile commands produced by the scanner will still build the obj format (if requested), and the scanner can read obj format pcms, e.g. from a PCH. rdar://108807592 Differential Revision: https://reviews.llvm.org/D149693	2023-05-03 12:07:46 -07:00
Igor Kushnir	55f7e00afc	[libclang] Add index option to store preambles in memory This commit allows libclang API users to opt into storing PCH in memory instead of temporary files. The option can be set only during CXIndex construction to avoid multithreading issues and confusion or bugs if some preambles are stored in temporary files and others - in memory. The added API works as expected in KDevelop: https://invent.kde.org/kdevelop/kdevelop/-/merge_requests/283 Differential Revision: https://reviews.llvm.org/D145974	2023-03-15 09:21:41 -04:00
Igor Kushnir	cc929590ad	[libclang] Add API to override preamble storage path TempPCHFile::create() calls llvm::sys::fs::createTemporaryFile() to create a file named preamble-.pch in a system temporary directory. This commit allows overriding the directory where these often many and large preamble-.pch files are stored. The referenced bug report requests the ability to override the temporary directory path used by libclang. However, overriding the return value of llvm::sys::path::system_temp_directory() was rejected during code review as improper and because it would negatively affect multithreading performance. Finding all places where libclang uses the temporary directory is very difficult. Therefore this commit is limited to override libclang's single known use of the temporary directory. This commit allows to override the preamble storage path only during CXIndex construction to avoid multithreading issues and ensure that all preambles are stored in the same directory. For the same multithreading and consistency reasons, this commit deprecates clang_CXIndex_setGlobalOptions() and clang_CXIndex_setInvocationEmissionPathOption() in favor of specifying these options during CXIndex construction. Adding a new CXIndex constructor function each time a new initialization argument is needed leads to either a large number of function parameters unneeded by most libclang users or to an exponential number of overloads that support different usage requirements. Therefore this commit introduces a new extensible struct CXIndexOptions and a general function clang_createIndexWithOptions(). A libclang user passes a desired preamble storage path to clang_createIndexWithOptions(), which stores it in CIndexer::PreambleStoragePath. Whenever clang_parseTranslationUnit_Impl() is called, it passes CIndexer::PreambleStoragePath to ASTUnit::LoadFromCommandLine(), which stores this argument in ASTUnit::PreambleStoragePath. Whenever ASTUnit::getMainBufferWithPrecompiledPreamble() is called, it passes ASTUnit::PreambleStoragePath to PrecompiledPreamble::Build(). PrecompiledPreamble::Build() forwards the corresponding StoragePath argument to TempPCHFile::create(). If StoragePath is not empty, TempPCHFile::create() stores the preamble-*.pch file in the directory at the specified path rather than in the system temporary directory. The analysis below proves that this passing around of the PreambleStoragePath string is sufficient to guarantee that the libclang user override is used in TempPCHFile::create(). The analysis ignores API uses in test code. TempPCHFile::create() is called only in PrecompiledPreamble::Build(). PrecompiledPreamble::Build() is called only in two places: one in clangd, which is not used by libclang, and one in ASTUnit::getMainBufferWithPrecompiledPreamble(). ASTUnit::getMainBufferWithPrecompiledPreamble() is called in 3 places: ASTUnit::LoadFromCompilerInvocation() [analyzed below]. ASTUnit::Reparse(), which in turn is called only from clang_reparseTranslationUnit_Impl(), which in turn is called only from clang_reparseTranslationUnit(). clang_reparseTranslationUnit() is never called in LLVM code, but is part of public libclang API. This function's documentation requires its translation unit argument to have been built with clang_createTranslationUnitFromSourceFile(). clang_createTranslationUnitFromSourceFile() delegates its work to clang_parseTranslationUnit(), which delegates to clang_parseTranslationUnit2(), which delegates to clang_parseTranslationUnit2FullArgv(), which delegates to clang_parseTranslationUnit_Impl(), which passes CIndexer::PreambleStoragePath to the ASTUnit it creates. ASTUnit::CodeComplete() passes AllowRebuild = false to ASTUnit::getMainBufferWithPrecompiledPreamble(), which makes it return nullptr before calling PrecompiledPreamble::Build(). Both ASTUnit::LoadFromCompilerInvocation() overloads (one of which delegates its work to another) call ASTUnit::getMainBufferWithPrecompiledPreamble() only if their argument PrecompilePreambleAfterNParses > 0. LoadFromCompilerInvocation() is called in: ASTBuilderAction::runInvocation() keeps the default parameter value of PrecompilePreambleAfterNParses = 0, meaning that the preamble file is never created from here. ASTUnit::LoadFromCommandLine(). ASTUnit::LoadFromCommandLine() is called in two places: CrossTranslationUnitContext::ASTLoader::loadFromSource() keeps the default parameter value of PrecompilePreambleAfterNParses = 0, meaning that the preamble file is never created from here. clang_parseTranslationUnit_Impl(), which passes CIndexer::PreambleStoragePath to the ASTUnit it creates. Therefore, the overridden preamble storage path is always used in TempPCHFile::create(). TempPCHFile::create() uses PreambleStoragePath in the same way as LibclangInvocationReporter() uses InvocationEmissionPath. The existing documentation for clang_CXIndex_setInvocationEmissionPathOption() does not specify ownership, encoding, separator or relative vs absolute path requirements. So the documentation for CXIndexOptions::PreambleStoragePath doesn't either. The assumptions are: no ownership transfer; UTF-8 encoding; native separators. Both relative and absolute paths are supported. The added API works as expected in KDevelop: https://invent.kde.org/kdevelop/kdevelop/-/merge_requests/283 Fixes: https://github.com/llvm/llvm-project/issues/51847 Differential Revision: https://reviews.llvm.org/D143418	2023-03-07 08:25:38 -05:00

1 2 3 4 5 ...

656 Commits