[MLIR] Make More Specific Function Header For StringLiteral Optimization in Diagnostic (#112154)

`Diagnostic` stores notes and error messages that can help the user when
debugging. In most cases, when a `Diagnostic` receives an error message it
copies and owns the contents of the string.

However, there is one optimization: given a `const char*`, the
class assumes it is a StringLiteral, which is immutable and whose
lifetime matches that of the entire program. As a result, instead of
copying the message in these cases, the class simply stores the
underlying pointer.

This is problematic because `const char*` is not specific enough to always
imply a StringLiteral, which can lead to bugs, e.g. if the underlying
pointer is freed before the diagnostic is reported.
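
A minimal sketch of the hazard, assuming a hypothetical `PointerStoringDiagnostic` class that mimics the old `const char *` overload (it is not the MLIR class, and `renderBorrowed` is an illustrative helper). The buffer is mutated rather than freed so the example stays free of undefined behaviour:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the pre-change behaviour: every `const char *`
// is assumed to be a string literal, so only the pointer is stored.
class PointerStoringDiagnostic {
public:
  PointerStoringDiagnostic &operator<<(const char *val) {
    arguments.push_back(val); // no copy is made
    return *this;
  }

  // Concatenate all stored arguments, reading through the stored pointers.
  std::string str() const {
    std::string result;
    for (const char *arg : arguments)
      result += arg;
    return result;
  }

private:
  std::vector<const char *> arguments;
};

// Stream a pointer into a mutable buffer, then modify the buffer.
std::string renderBorrowed() {
  std::string message = "Error 1";
  PointerStoringDiagnostic diag;
  diag << message.c_str(); // not a literal, but treated like one
  assert(diag.str() == "Error 1");
  message[0] = '^'; // in-place write; the buffer itself stays alive
  return diag.str();
}
```

Here `renderBorrowed()` returns `^rror 1`: the stored pointer aliases the caller's buffer, so the later write shows up in the diagnostic text. Had the buffer been freed instead, the read would be a use-after-free.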

We solve this problem by choosing a more specific function signature,
`const char (&)[n]`. While not foolproof, this should cover a lot more cases.
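
As a rough illustration of how the narrower signature changes overload selection, here is a sketch using a hypothetical `MiniDiagnostic` class. The `std::string` overload is an assumption standing in for whatever copying path the real class provides; only the array-reference signature is taken from this change:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature of the post-change overload set.
class MiniDiagnostic {
  struct Argument {
    const char *borrowed; // non-null when only the pointer is stored
    std::string owned;    // holds the text when it was copied
  };

public:
  // Arrays of const char (the type of a string literal): store the pointer.
  template <std::size_t n>
  MiniDiagnostic &operator<<(const char (&val)[n]) {
    arguments.push_back(Argument{val, std::string()});
    return *this;
  }

  // Anything convertible to std::string, including a plain `const char *`:
  // copy and own the text.
  MiniDiagnostic &operator<<(const std::string &val) {
    arguments.push_back(Argument{nullptr, val});
    return *this;
  }

  bool isBorrowed(std::size_t i) const {
    return arguments[i].borrowed != nullptr;
  }

  std::string str() const {
    std::string result;
    for (const Argument &arg : arguments)
      result += arg.borrowed ? arg.borrowed : arg.owned.c_str();
    return result;
  }

private:
  std::vector<Argument> arguments;
};

// Stream a `const char *` and a literal, then mutate the original buffer.
std::string renderWithArrayOverload() {
  std::string message = "Error 1";
  MiniDiagnostic diag;
  diag << message.c_str() << ", literal";
  assert(!diag.isBorrowed(0)); // pointer argument was copied
  assert(diag.isBorrowed(1));  // literal was stored by pointer
  message[0] = '^';            // no longer visible through the diagnostic
  return diag.str();
}
```

`renderWithArrayOverload()` returns `Error 1, literal`, because a `const char *` cannot bind to an array reference and so falls through to the copying overload. A mutable `char[]` passed through a `const char (&)[n]` reference would still take the pointer path, which is why the signature is not foolproof.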

A potentially better alternative is to delete this special handling
of string literals altogether, but I am unsure of the implications (it does
sound safe to do, however, with a negligible impact on performance).
Andrew Luo 2024-10-15 10:38:45 -07:00 committed by GitHub
parent c9f27275c1
commit e511026bf0
3 changed files with 66 additions and 1 deletions


@@ -183,7 +183,8 @@ public:
   Diagnostic &operator<<(StringAttr val);
   /// Stream in a string literal.
-  Diagnostic &operator<<(const char *val) {
+  template <size_t n>
+  Diagnostic &operator<<(const char (&val)[n]) {
     arguments.push_back(DiagnosticArgument(val));
     return *this;
   }


@@ -4,6 +4,7 @@ add_mlir_unittest(MLIRIRTests
   AffineMapTest.cpp
   AttributeTest.cpp
   AttrTypeReplacerTest.cpp
+  Diagnostic.cpp
   DialectTest.cpp
   InterfaceTest.cpp
   IRMapping.cpp


@@ -0,0 +1,63 @@
+//===- Diagnostic.cpp - Dialect unit tests -------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+#include "mlir/IR/Diagnostics.h"
+#include "mlir/Support/TypeID.h"
+#include "gtest/gtest.h"
+using namespace mlir;
+using namespace mlir::detail;
+namespace {
+TEST(DiagnosticLifetime, TestCopiesConstCharStar) {
+  const auto *expectedMessage = "Error 1, don't mutate this";
+  // Copy expected message into a mutable container, and call the constructor.
+  std::string myStr(expectedMessage);
+  mlir::MLIRContext context;
+  Diagnostic diagnostic(mlir::UnknownLoc::get(&context),
+                        DiagnosticSeverity::Note);
+  diagnostic << myStr.c_str();
+  // Mutate underlying pointer, but ensure diagnostic still has orig. message
+  myStr[0] = '^';
+  std::string resultMessage;
+  llvm::raw_string_ostream stringStream(resultMessage);
+  diagnostic.print(stringStream);
+  ASSERT_STREQ(expectedMessage, resultMessage.c_str());
+}
+TEST(DiagnosticLifetime, TestLazyCopyStringLiteral) {
+  char charArr[21] = "Error 1, mutate this";
+  mlir::MLIRContext context;
+  Diagnostic diagnostic(mlir::UnknownLoc::get(&context),
+                        DiagnosticSeverity::Note);
+  // Diagnostic contains optimization which assumes string literals are
+  // represented by `const char[]` type. This is imperfect as we can sometimes
+  // trick the type system as seen below.
+  //
+  // Still we use this to check the diagnostic is lazily storing the pointer.
+  auto addToDiagnosticAsConst = [&diagnostic](const char(&charArr)[21]) {
+    diagnostic << charArr;
+  };
+  addToDiagnosticAsConst(charArr);
+  // Mutate the underlying pointer and ensure the string does change
+  charArr[0] = '^';
+  std::string resultMessage;
+  llvm::raw_string_ostream stringStream(resultMessage);
+  diagnostic.print(stringStream);
+  ASSERT_STREQ("^rror 1, mutate this", resultMessage.c_str());
+}
+} // namespace