llvm-project/compiler-rt/lib/fuzzer/FuzzerUtil.h

//===- FuzzerUtil.h - Internal header for the Fuzzer Utils ------*- C++ -* ===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
// Util functions.
//===----------------------------------------------------------------------===//

#ifndef LLVM_FUZZER_UTIL_H
#define LLVM_FUZZER_UTIL_H

#include "FuzzerBuiltins.h"
#include "FuzzerBuiltinsMsvc.h"
#include "FuzzerCommand.h"
#include "FuzzerDefs.h"

namespace fuzzer {

void PrintHexArray(const Unit &U, const char *PrintAfter = "");

void PrintHexArray(const uint8_t *Data, size_t Size,
                   const char *PrintAfter = "");

void PrintASCII(const uint8_t *Data, size_t Size, const char *PrintAfter = "");

void PrintASCII(const Unit &U, const char *PrintAfter = "");

// Changes Data to contain only ASCII (isprint+isspace) characters.
// Returns true iff Data has been changed.
bool ToASCII(uint8_t *Data, size_t Size);

bool IsASCII(const Unit &U);

bool IsASCII(const uint8_t *Data, size_t Size);

std::string Base64(const Unit &U);

void PrintPC(const char *SymbolizedFMT, const char *FallbackFMT, uintptr_t PC);

std::string DescribePC(const char *SymbolizedFMT, uintptr_t PC);

void PrintStackTrace();

void PrintMemoryProfile();

unsigned NumberOfCpuCores();

// Parses one dictionary entry.
// If successful, writes the entry to U and returns true,
// otherwise returns false.
bool ParseOneDictionaryEntry(const std::string &Str, Unit *U);

// Parses the dictionary file, fills Units, returns true iff all lines
// were parsed successfully.
bool ParseDictionaryFile(const std::string &Text, Vector<Unit> *Units);

// Platform specific functions.
void SetSignalHandler(const FuzzingOptions& Options);

void SleepSeconds(int Seconds);

unsigned long GetPid();

size_t GetPeakRSSMb();

int ExecuteCommand(const Command &Cmd);
bool ExecuteCommand(const Command &Cmd, std::string *CmdOutput);

// Fuchsia does not have popen/pclose.
FILE *OpenProcessPipe(const char *Command, const char *Mode);
int CloseProcessPipe(FILE *F);

std::string CloneArgsWithoutX(const Vector<std::string> &Args,
                              const char *X1, const char *X2);

inline std::string CloneArgsWithoutX(const Vector<std::string> &Args,
                                     const char *X) {
  return CloneArgsWithoutX(Args, X, X);
}

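// Splits S in front of the first occurrence of X. If X does not occur in S,
// returns S as the first element and an empty string as the second.
inline std::pair<std::string, std::string> SplitBefore(std::string X,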
                                                       std::string S) {
  auto Pos = S.find(X);
  if (Pos == std::string::npos)
    return std::make_pair(S, "");
  return std::make_pair(S.substr(0, Pos), S.substr(Pos));
}

void DiscardOutput(int Fd);

std::string DisassembleCmd(const std::string &FileName);

std::string SearchRegexCmd(const std::string &Regex);

uint64_t SimpleFastHash(const void *Data, size_t Size, uint64_t Initial = 0);

// Returns floor(log2(X)). X must be nonzero; the result for X == 0 is not
// meaningful.
inline size_t Log(size_t X) {
  return static_cast<size_t>((sizeof(unsigned long long) * 8) - Clzll(X) - 1);
}

inline size_t PageSize() { return 4096; }
inline uint8_t *RoundUpByPage(uint8_t *P) {
  uintptr_t X = reinterpret_cast<uintptr_t>(P);
  size_t Mask = PageSize() - 1;
  X = (X + Mask) & ~Mask;
  return reinterpret_cast<uint8_t *>(X);
}
inline uint8_t *RoundDownByPage(uint8_t *P) {
  uintptr_t X = reinterpret_cast<uintptr_t>(P);
  size_t Mask = PageSize() - 1;
  X = X & ~Mask;
  return reinterpret_cast<uint8_t *>(X);
}

#if __BYTE_ORDER == __LITTLE_ENDIAN
template <typename T> T HostToLE(T X) { return X; }
#else
template <typename T> T HostToLE(T X) { return Bswap(X); }
#endif

}  // namespace fuzzer

#endif  // LLVM_FUZZER_UTIL_H
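
The inline helpers above (SplitBefore, Log, and the page-rounding pair) can be tried outside libFuzzer with a minimal sketch like the one below. It is not part of the header. The namespace `sketch`, the constant `kPageSize`, and the function `SelfCheck` are hypothetical names introduced for illustration. Two deliberate differences from the header: Log uses a portable shift loop instead of Clzll, and the rounding helpers take integer addresses rather than `uint8_t *` so the example never forms an invalid pointer.

```cpp
// Standalone sketch: illustrative copies of three inline helpers from
// FuzzerUtil.h, placed in a hypothetical namespace so they build alone.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

namespace sketch {

// Splits S in front of the first occurrence of X. If X is absent, the whole
// string goes into .first and .second is empty.
inline std::pair<std::string, std::string> SplitBefore(const std::string &X,
                                                       const std::string &S) {
  auto Pos = S.find(X);
  if (Pos == std::string::npos)
    return std::make_pair(S, std::string());
  return std::make_pair(S.substr(0, Pos), S.substr(Pos));
}

// floor(log2(X)) for X > 0. The header computes this with Clzll; a shift
// loop is swapped in here (assumption: same result for every nonzero X).
inline std::size_t Log(std::size_t X) {
  assert(X != 0);
  std::size_t Result = 0;
  while (X >>= 1)
    ++Result;
  return Result;
}

// Same value the header hard-codes in PageSize().
constexpr std::uintptr_t kPageSize = 4096;

// Rounds an address up to the next page boundary (no-op if already aligned).
inline std::uintptr_t RoundUpByPage(std::uintptr_t X) {
  const std::uintptr_t Mask = kPageSize - 1;
  return (X + Mask) & ~Mask;
}

// Rounds an address down to the start of its page.
inline std::uintptr_t RoundDownByPage(std::uintptr_t X) {
  return X & ~(kPageSize - 1);
}

// Exercises each helper once; aborts via assert on a mismatch.
inline void SelfCheck() {
  auto Parts = SplitBefore("=", "-flag=value");
  assert(Parts.first == "-flag" && Parts.second == "=value");
  assert(SplitBefore("=", "plain").second.empty());
  assert(Log(1) == 0 && Log(5) == 2 && Log(4096) == 12);
  assert(RoundUpByPage(4097) == 8192 && RoundUpByPage(4096) == 4096);
  assert(RoundDownByPage(8191) == 4096);
}

}  // namespace sketch
```

Calling `sketch::SelfCheck()` from a test or from `main` runs the checks. The mask arithmetic in the rounding helpers relies on the page size being a power of two, which holds for the hard-coded 4096.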