mirror of
https://github.com/llvm/llvm-project.git
synced 2025-04-19 04:47:03 +00:00

Historically, the main example of *very* large string tables used the `EmitCharArray` to work around MSVC limitations with string literals, but that was switched (without removing the API) in order to consolidate on a nicer emission primitive. While this large string table in `IntrinsicsImpl.inc` seems to compile correctly on MSVC without the work around in `EmitCharArray` (and that this PR adds back to the nicer emission path), other users have repeatedly hit this MSVC limitation as you can see in the discussion on PR https://github.com/llvm/llvm-project/pull/120534. This PR teaches the string offset table emission to look at the size of the table and switch to the char array emission strategy when the table becomes too large. This work around does have the downside of making compile times worse for large string tables, but that appears unavoidable until we can identify known good MSVC versions and switch to requiring them for all LLVM users. It also reduces searchability of the generated string table -- I looked at emitting a comment with each string but it is tricky because the escaping rules for an inline comment are different from those of of a string literal, and there's no real way to turn the string literal into a comment. While improving the output in this way, also clean up the output to not emit an extraneous empty string at the end of the string table, and update the `StringTable` class to not look for that. It isn't actually used by anything and is wasteful. This PR also switches the `IntrinsicsImpl.inc` string tables over to the new `StringTable` runtime abstraction. I didn't want to do this until landing the MSVC workaround in case it caused even this example to start hitting the MSVC bug, but I wanted to switch here so that I could simplify the API for emitting the string table with the workaround present. With the two different emission strategies, its important to use a very exact syntax and that seems better encapsulated in the API. Last but not least, the `SDNodeInfoEmitter` is updated, including its tests to match the new output. This PR should unblock landing https://github.com/llvm/llvm-project/pull/120534 and letting us switch all of Clang's builtins to use string tables. That PR has all the details motivating the overall effort. Follow-up patches will try to consolidate the remaining users onto the single interface, but those at least were easy to separate into follow-ups and keep this PR somewhat smaller.