mirror of https://github.com/llvm/llvm-project.git synced 2025-04-20 20:26:43 +00:00
Florian Hahn 641fbf1524
[TySan] Add initial Type Sanitizer runtime ()
This patch introduces the runtime components for type sanitizer: a
sanitizer for type-based aliasing violations.

It is based on Hal Finkel's https://reviews.llvm.org/D32197.

C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit
these given TBAA metadata added by Clang. Roughly, a pointer of given
type cannot be used to access an object of a different type (with, of
course, certain exceptions). Unfortunately, there's a lot of code in the
wild that violates these rules (e.g. for type punning), and such code
often must be built with -fno-strict-aliasing. Performance is often
sacrificed as a result. Part of the problem is the difficulty of finding
TBAA violations. Hopefully, this sanitizer will help.

For each TBAA type-access descriptor, encoded in LLVM's IR using
metadata, the corresponding instrumentation pass generates descriptor
tables. Thus, for each type (and access descriptor), we have a unique
pointer representation. Excepting anonymous-namespace types, these
tables are comdat, so the pointer values should be unique across the
program. The descriptors refer to other descriptors to form a type
aliasing tree (just like LLVM's TBAA metadata does). The instrumentation
handles the "fast path" (where the types match exactly and no
partial-overlaps are detected), and defers to the runtime to handle all
of the more-complicated cases. The runtime, of course, is also
responsible for reporting errors when those are detected.

The runtime uses essentially the same shadow memory region as tsan, and
we use 8 bytes of shadow memory, the size of the pointer to the type
descriptor, for every byte of accessed data in the program. The value 0
is used to represent an unknown type. The value -1 is used to represent
an interior byte (a byte that is part of a type, but not the first
byte). The instrumentation first checks for an exact match between the
type of the current access and the type for that address recorded in the
shadow memory. If it matches, it then checks the shadow for the
remainder of the bytes in the type to make sure that they're all -1. If
not, we call the runtime. If the exact match fails, we next check if the
value is 0 (i.e. unknown). If it is, then we check the shadow for the
remainder of the bytes in the type (to make sure they're all 0). If
they're not, we call the runtime. We then set the shadow for the access
address and set the shadow for the remaining bytes in the type to -1
(i.e. marking them as interior bytes). If the type indicated by the
shadow memory for the access address is neither an exact match nor 0, we
call the runtime.
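
As a rough illustration of the inline check described above, here is a
minimal C++ sketch. It models shadow memory as a plain vector with one
pointer-sized slot per application byte. All names (ShadowValue,
checkAccess, Check) are hypothetical and are not taken from the TySan
sources; the real instrumentation emits equivalent logic as LLVM IR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; this is not the TySan runtime's API.
using ShadowValue = std::intptr_t;     // one pointer-sized slot per app byte
constexpr ShadowValue kUnknown = 0;    // no type recorded yet
constexpr ShadowValue kInterior = -1;  // inside a typed object, not its first byte

enum class Check { FastPathOk, CallRuntime };

// Models the inline check for an access of `size` bytes at offset `addr`
// whose type descriptor has the pointer value `type`.
Check checkAccess(std::vector<ShadowValue> &shadow, std::size_t addr,
                  std::size_t size, ShadowValue type) {
  assert(size > 0 && addr + size <= shadow.size());
  const ShadowValue first = shadow[addr];
  if (first == type) {
    // Exact match: the remaining bytes must all be interior markers.
    for (std::size_t i = 1; i < size; ++i)
      if (shadow[addr + i] != kInterior)
        return Check::CallRuntime;
    return Check::FastPathOk;
  }
  if (first == kUnknown) {
    // Unknown type: the remaining bytes must be unknown as well.
    for (std::size_t i = 1; i < size; ++i)
      if (shadow[addr + i] != kUnknown)
        return Check::CallRuntime;
    // Record the type on the first byte and mark the rest as interior.
    shadow[addr] = type;
    for (std::size_t i = 1; i < size; ++i)
      shadow[addr + i] = kInterior;
    return Check::FastPathOk;
  }
  // A different known type, or an interior byte: defer to the runtime.
  return Check::CallRuntime;
}
```

In this model, a first 4-byte access to untyped memory records the type,
a second access with the same type passes the fast path, and an access
with a different type or at a misaligned offset falls through to the
runtime.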

The instrumentation pass inserts calls to the memset intrinsic that
reset the shadow memory to "unknown type" for memory updated by memset,
memcpy, and memmove, as well as for allocas/byval arguments and
lifetime.start/end. The runtime intercepts memset, memcpy, etc. to
perform the same reset for the library calls.
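
The shadow reset can be pictured with the following C++ sketch. It is
only a toy model under stated assumptions: ToyMemory, resetShadow,
toyMemset, and toyCopy are hypothetical names, and application memory
and shadow are two vectors of equal length rather than a mapped shadow
region.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical toy model; not the TySan runtime's data structures.
using ShadowValue = std::intptr_t;
constexpr ShadowValue kUnknown = 0;  // "type is unknown"

struct ToyMemory {
  std::vector<unsigned char> app;   // application bytes
  std::vector<ShadowValue> shadow;  // one shadow slot per application byte
  explicit ToyMemory(std::size_t n) : app(n, 0), shadow(n, kUnknown) {}
};

// Forget whatever type was recorded for [addr, addr + size).
void resetShadow(ToyMemory &m, std::size_t addr, std::size_t size) {
  assert(addr + size <= m.shadow.size());
  for (std::size_t i = 0; i < size; ++i)
    m.shadow[addr + i] = kUnknown;
}

// Stand-in for an intercepted memset: write the bytes, then reset shadow.
void toyMemset(ToyMemory &m, std::size_t addr, int value, std::size_t size) {
  assert(addr + size <= m.app.size());
  std::memset(m.app.data() + addr, value, size);
  resetShadow(m, addr, size);
}

// Stand-in for an intercepted memcpy/memmove: copy the bytes, then mark
// the destination range as having an unknown type.
void toyCopy(ToyMemory &m, std::size_t dst, std::size_t src,
             std::size_t size) {
  assert(dst + size <= m.app.size() && src + size <= m.app.size());
  std::memmove(m.app.data() + dst, m.app.data() + src, size);
  resetShadow(m, dst, size);
}
```

After either call, the next typed access to the destination range sees
"unknown" shadow and records its own type.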

The runtime essentially repeats these checks, but uses the full TBAA
algorithm, just as the compiler does, to determine when two types are
permitted to alias. In a situation where access overlap has occurred and
aliasing is not permitted, an error is generated.
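
For intuition only, the scalar part of the TBAA rule (two scalar types
may alias when one is an ancestor of the other in the type tree) can be
sketched in C++ as below. This is a deliberate simplification: it
ignores struct-path offsets, which the full algorithm takes into
account, and TypeNode, isAncestorOrSelf, and mayAlias are hypothetical
names rather than the runtime's descriptor layout.

```cpp
#include <cassert>

// Hypothetical descriptor: each type points at its parent in the tree.
struct TypeNode {
  const char *name;
  const TypeNode *parent;  // nullptr for the root
};

// True if `ancestor` is `node` itself or appears on its parent chain.
bool isAncestorOrSelf(const TypeNode *ancestor, const TypeNode *node) {
  for (; node != nullptr; node = node->parent)
    if (node == ancestor)
      return true;
  return false;
}

// Simplified scalar TBAA rule: aliasing is permitted only when one type
// is an ancestor of (or equal to) the other.
bool mayAlias(const TypeNode *a, const TypeNode *b) {
  assert(a != nullptr && b != nullptr);
  return isAncestorOrSelf(a, b) || isAncestorOrSelf(b, a);
}
```

With a tree in which "int" and "float" are both children of a char-like
node, mayAlias accepts char/int and int/int but rejects int/float, which
is the situation in which an error would be reported.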

As a note, this implementation does not use the compressed shadow-memory
scheme discussed previously
(http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That
scheme would not handle the struct-path (i.e. structure offset)
information that our TBAA represents. I expect we'll want to further
work on compressing the shadow-memory representation, but I think it
makes sense to do that as follow-up work.

This includes build fixes for Linux from Mingjie Xu.

Depends on  (Clang support),  (LLVM support)


PR: https://github.com/llvm/llvm-project/pull/76261
2024-12-17 18:49:50 +00:00

189 lines
7.1 KiB
CMake

# TODO: Set the install directory.
include(ExternalProject)
set(known_subdirs
  "libcxx"
  )
foreach (dir ${known_subdirs})
  if (EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/${dir}/CMakeLists.txt)
    add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/${dir})
  endif()
endforeach()
function(get_ext_project_build_command out_var target)
  if (CMAKE_GENERATOR MATCHES "Make")
    # Use special command for Makefiles to support parallelism.
    set(${out_var} "$(MAKE)" "${target}" PARENT_SCOPE)
  else()
    set(${out_var} ${CMAKE_COMMAND} --build . --target ${target}
                                    --config $<CONFIG> PARENT_SCOPE)
  endif()
endfunction()
set(COMPILER_RT_SRC_ROOT ${LLVM_MAIN_SRC_DIR}/projects/compiler-rt)
# Fallback to the external path, if the other one isn't available.
# This is the same behavior (try "internal", then check the LLVM_EXTERNAL_...
# variable) as in add_llvm_external_project
if(NOT EXISTS ${COMPILER_RT_SRC_ROOT})
  # We don't want to set it if LLVM_EXTERNAL_COMPILER_RT_SOURCE_DIR is ""
  if(LLVM_EXTERNAL_COMPILER_RT_SOURCE_DIR)
    set(COMPILER_RT_SRC_ROOT ${LLVM_EXTERNAL_COMPILER_RT_SOURCE_DIR})
  endif()
endif()
if(LLVM_BUILD_EXTERNAL_COMPILER_RT AND EXISTS ${COMPILER_RT_SRC_ROOT}/)
  # Add compiler-rt as an external project.
  set(COMPILER_RT_PREFIX ${CMAKE_BINARY_DIR}/projects/compiler-rt)
  set(STAMP_DIR ${CMAKE_CURRENT_BINARY_DIR}/compiler-rt-stamps/)
  set(BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR}/compiler-rt-bins/)
  add_custom_target(compiler-rt-clear
    COMMAND ${CMAKE_COMMAND} -E remove_directory ${BINARY_DIR}
    COMMAND ${CMAKE_COMMAND} -E remove_directory ${STAMP_DIR}
    COMMENT "Clobbering compiler-rt build and stamp directories"
    )
  # Find all variables that start with COMPILER_RT and populate a variable with
  # them.
  get_cmake_property(variableNames VARIABLES)
  foreach(variableName ${variableNames})
    if(variableName MATCHES "^COMPILER_RT")
      string(REPLACE ";" "\;" value "${${variableName}}")
      list(APPEND COMPILER_RT_PASSTHROUGH_VARIABLES
        -D${variableName}=${value})
    endif()
  endforeach()
  set(compiler_rt_configure_deps)
  if(TARGET cxx-headers)
    list(APPEND compiler_rt_configure_deps "cxx-headers")
  endif()
  if(LLVM_INCLUDE_TESTS)
    list(APPEND compiler_rt_configure_deps LLVMTestingSupport)
  endif()
  include(GetClangResourceDir)
  get_clang_resource_dir(output_resource_dir PREFIX ${LLVM_BINARY_DIR})
  get_clang_resource_dir(install_resource_dir)
  ExternalProject_Add(compiler-rt
    DEPENDS llvm-config clang ${compiler_rt_configure_deps}
    PREFIX ${COMPILER_RT_PREFIX}
    SOURCE_DIR ${COMPILER_RT_SRC_ROOT}
    STAMP_DIR ${STAMP_DIR}
    BINARY_DIR ${BINARY_DIR}
    CMAKE_ARGS ${CLANG_COMPILER_RT_CMAKE_ARGS}
               -DCMAKE_C_COMPILER=${LLVM_RUNTIME_OUTPUT_INTDIR}/clang
               -DCMAKE_CXX_COMPILER=${LLVM_RUNTIME_OUTPUT_INTDIR}/clang++
               -DCMAKE_ASM_COMPILER=${LLVM_RUNTIME_OUTPUT_INTDIR}/clang
               -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
               -DCMAKE_MAKE_PROGRAM=${CMAKE_MAKE_PROGRAM}
               -DCMAKE_C_COMPILER_LAUNCHER=${CMAKE_C_COMPILER_LAUNCHER}
               -DCMAKE_CXX_COMPILER_LAUNCHER=${CMAKE_CXX_COMPILER_LAUNCHER}
               -DLLVM_CONFIG_PATH=${LLVM_RUNTIME_OUTPUT_INTDIR}/llvm-config
               -DLLVM_LIT_ARGS=${LLVM_LIT_ARGS}
               -DCOMPILER_RT_OUTPUT_DIR=${output_resource_dir}
               -DCOMPILER_RT_EXEC_OUTPUT_DIR=${LLVM_RUNTIME_OUTPUT_INTDIR}
               -DCOMPILER_RT_INSTALL_PATH:PATH=${install_resource_dir}
               -DCOMPILER_RT_INCLUDE_TESTS=${LLVM_INCLUDE_TESTS}
               -DCMAKE_INSTALL_PREFIX=${CMAKE_INSTALL_PREFIX}
               -DLLVM_LIBDIR_SUFFIX=${LLVM_LIBDIR_SUFFIX}
               -DLLVM_RUNTIME_OUTPUT_INTDIR=${LLVM_RUNTIME_OUTPUT_INTDIR}
               -DCMAKE_OSX_DEPLOYMENT_TARGET=${CMAKE_OSX_DEPLOYMENT_TARGET}
               -DCMAKE_OSX_SYSROOT:PATH=${CMAKE_OSX_SYSROOT}
               ${COMPILER_RT_PASSTHROUGH_VARIABLES}
    INSTALL_COMMAND ""
    STEP_TARGETS configure build
    USES_TERMINAL_CONFIGURE 1
    USES_TERMINAL_BUILD 1
    USES_TERMINAL_INSTALL 1
    # Always run the build command so that incremental builds are correct.
    BUILD_ALWAYS 1
    )
  get_ext_project_build_command(run_clean_compiler_rt clean)
  ExternalProject_Add_Step(compiler-rt clean
    COMMAND ${run_clean_compiler_rt}
    COMMENT "Cleaning compiler-rt..."
    DEPENDEES configure
    DEPENDERS build
    DEPENDS clang
    WORKING_DIRECTORY ${BINARY_DIR}
    )
  install(CODE "execute_process\(COMMAND \${CMAKE_COMMAND} -DCMAKE_INSTALL_PREFIX=\${CMAKE_INSTALL_PREFIX} -P ${BINARY_DIR}/cmake_install.cmake \)"
    COMPONENT compiler-rt)
  add_llvm_install_targets(install-compiler-rt
                           DEPENDS compiler-rt
                           COMPONENT compiler-rt)
  # Add top-level targets that build specific compiler-rt runtimes.
  set(COMPILER_RT_RUNTIMES fuzzer asan builtins dfsan lsan msan profile tsan tysan ubsan ubsan-minimal)
  foreach(runtime ${COMPILER_RT_RUNTIMES})
    get_ext_project_build_command(build_runtime_cmd ${runtime})
    add_custom_target(${runtime}
      COMMAND ${build_runtime_cmd}
      DEPENDS compiler-rt-configure
      WORKING_DIRECTORY ${BINARY_DIR}
      VERBATIM USES_TERMINAL)
  endforeach()
  if(LLVM_INCLUDE_TESTS)
    # Add binaries that compiler-rt tests depend on.
    set(COMPILER_RT_TEST_DEPENDENCIES
      FileCheck count not llvm-nm llvm-objdump llvm-symbolizer llvm-jitlink lli split-file)
    # Add top-level targets for various compiler-rt test suites.
    set(COMPILER_RT_TEST_SUITES
      check-asan
      check-asan-dynamic
      check-cfi
      check-cfi-and-supported
      check-dfsan
      check-fuzzer
      check-gwp_asan
      check-hwasan
      check-lsan
      check-msan
      check-profile
      check-safestack
      check-sanitizer
      check-tsan
      check-ubsan
      check-ubsan-minimal
      )
    foreach(test_suite ${COMPILER_RT_TEST_SUITES})
      get_ext_project_build_command(run_test_suite ${test_suite})
      add_custom_target(${test_suite}
        COMMAND ${run_test_suite}
        DEPENDS compiler-rt-build ${COMPILER_RT_TEST_DEPENDENCIES}
        WORKING_DIRECTORY ${BINARY_DIR}
        VERBATIM
        USES_TERMINAL
        )
    endforeach()
    # Add special target to run all compiler-rt test suites.
    get_ext_project_build_command(run_check_compiler_rt check-all)
    add_custom_target(check-compiler-rt
      COMMAND ${run_check_compiler_rt}
      DEPENDS compiler-rt-build ${COMPILER_RT_TEST_DEPENDENCIES}
      WORKING_DIRECTORY ${BINARY_DIR}
      VERBATIM USES_TERMINAL)
    # Add special target to build all compiler-rt test dependencies.
    get_ext_project_build_command(run_check_compiler_rt compiler-rt-test-depends)
    add_custom_target(compiler-rt-test-depends
      COMMAND ${run_check_compiler_rt}
      DEPENDS compiler-rt-build ${COMPILER_RT_TEST_DEPENDENCIES}
      WORKING_DIRECTORY ${BINARY_DIR}
      VERBATIM USES_TERMINAL)
    set_property(GLOBAL APPEND PROPERTY LLVM_ALL_ADDITIONAL_TEST_DEPENDS compiler-rt-test-depends)
    set_property(GLOBAL APPEND PROPERTY LLVM_ALL_ADDITIONAL_TEST_TARGETS check-compiler-rt)
  endif()
endif()