Summary:
Currently, assertion-disabled Clang builds emit value names when generating LLVM IR. This is controlled by the `NDEBUG` macro, and is not easily overridable. In order to get IR output containing names from a release build of Clang, the user must manually construct the CC1 invocation w/o the `-discard-value-names` option. This is less than ideal.
For example, Godbolt uses a release build of Clang, and so when asked to emit LLVM IR the result lacks names, making it harder to read. Manually invoking CC1 on Compiler Explorer is not feasible.
This patch adds the driver options `-fdiscard-value-names` and `-fno-discard-value-names` which allow the user to override the default behavior. If neither is specified, the old behavior remains.
Reviewers: erichkeane, aaron.ballman, lebedev.ri
Reviewed By: aaron.ballman
Subscribers: bogner, cfe-commits
Differential Revision: https://reviews.llvm.org/D42887
llvm-svn: 324498
NVPTX does not have runtime support necessary for profiling to work
and even call arc collection is prohibitively expensive. Furthermore,
there's no easy way to collect the samples. NVPTX also does not
support global constructors that clang generates if sample/arc collection
is enabled.
Differential Revision: https://reviews.llvm.org/D42452
llvm-svn: 323345
As RV64 codegen has not yet been upstreamed into LLVM, we focus on RV32 driver
support (RV64 to follow).
Differential Revision: https://reviews.llvm.org/D39963
llvm-svn: 322276
Adds option /guard:cf to clang-cl and -cfguard to cc1 to emit function IDs
of functions that have their address taken into a section named .gfids$y for
compatibility with Microsoft's Control Flow Guard feature.
The original patch didn't have the lit.local.cfg file that restricts the new
test to x86, thus the new test was failing on the non-x86 bots.
Differential Revision: https://reviews.llvm.org/D40531
The reverts r322008, which was a revert of r322005.
This reverts commit a05b89f9aca70597dc79fe97bc49b50b51f525ba.
llvm-svn: 322136
Cf-protection is a target independent flag that instructs the back-end to instrument control flow mechanisms like: Branch, Return, etc.
For example in X86 this flag will be used to instrument Indirect Branch Tracking instructions.
Differential Revision: https://reviews.llvm.org/D40478
Change-Id: I5126e766c0e6b84118cae0ee8a20fe78cc373dea
llvm-svn: 322063
The new test fails on the Hexagon bot. Reverting while I investigate.
This reverts https://reviews.llvm.org/rL322005
This reverts commit b7e0026b4385180c378edc658ec91a39566f2942.
llvm-svn: 322008
Adds option /guard:cf to clang-cl and -cfguard to cc1 to emit function IDs
of functions that have their address taken into a section named .gfids$y for
compatibility with Microsoft's Control Flow Guard feature.
Differential Revision: https://reviews.llvm.org/D40531
llvm-svn: 322005
Adds the -fstack-size-section flag to enable the .stack_sizes section. The flag defaults to on for the PS4 triple.
Differential Revision: https://reviews.llvm.org/D40712
llvm-svn: 321992
The Clang option -foptimization-record-file= controls which file an
optimization record is output to. Optimization records are output if you
use the Clang option -fsave-optimization-record. If you specify the
first option without the second, you get a warning that the command line
argument was unused. Passing -foptimization-record-file= should imply
-fsave-optimization-record.
This fixes PR33670
Patch by: Dmitry Venikov <venikov@phystech.edu>
Differential revision: https://reviews.llvm.org/D39834
llvm-svn: 321090
There are 2 parts to getting the -fassociative-math command-line flag translated to LLVM FMF:
1. In the driver/frontend, we accept the flag and its 'no' inverse and deal with the
interactions with other flags like -ffast-math -fno-signed-zeros -fno-trapping-math.
This was mostly already done - we just need to translate the flag as a codegen option.
The test file is complicated because there are many potential combinations of flags here.
Note that we are matching gcc's behavior that requires 'nsz' and no-trapping-math.
2. In codegen, we map the codegen option to FMF in the IR builder. This is simple code and
corresponding test.
For the motivating example from PR27372:
float foo(float a, float x) { return ((a + x) - x); }
$ ./clang -O2 27372.c -S -o - -ffast-math -fno-associative-math -emit-llvm | egrep 'fadd|fsub'
%add = fadd nnan ninf nsz arcp contract float %0, %1
%sub = fsub nnan ninf nsz arcp contract float %add, %2
So 'reassoc' is off as expected (and so is the new 'afn' but that's a different patch).
This case now works as expected end-to-end although the underlying logic is still wrong:
$ ./clang -O2 27372.c -S -o - -ffast-math -fno-associative-math | grep xmm
addss %xmm1, %xmm0
subss %xmm1, %xmm0
We're not done because the case where 'reassoc' is set is ignored by optimizer passes. Example:
$ ./clang -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math -emit-llvm | grep fadd
%add = fadd reassoc float %0, %1
$ ./clang -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math | grep xmm
addss %xmm1, %xmm0
subss %xmm1, %xmm0
Differential Revision: https://reviews.llvm.org/D39812
llvm-svn: 320920
This adds a new command line option -mprefer-vector-width to specify a preferred vector width for the vectorizers. Valid values are 'none' and unsigned integers. The driver will check that it meets those constraints. Specific supported integers will be managed by the targets in the backend.
Clang will take the value and add it as a new function attribute during CodeGen.
This represents the alternate direction proposed by Sanjay in this RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-November/118734.html
The syntax here matches gcc, though gcc treats it as an x86 specific command line argument. gcc only allows values of 128, 256, and 512. I'm not having clang check any values.
Differential Revision: https://reviews.llvm.org/D40230
llvm-svn: 320419
This is a re-apply of r319294.
adds -fseh-exceptions and -fdwarf-exceptions flags
clang will check if the user has specified an exception model flag,
in the absense of specifying the exception model clang will then check
the driver default and append the model flag for that target to cc1
-fno-exceptions has a higher priority then specifying the model
move __SEH__ macro definitions out of Targets into InitPreprocessor
behind the -fseh-exceptions flag
move __ARM_DWARF_EH__ macrodefinitions out of verious targets and into
InitPreprocessor behind the -fdwarf-exceptions flag and arm|thumb check
remove unused USESEHExceptions from the MinGW Driver
fold USESjLjExceptions into a new GetExceptionModel function that
gives the toolchain classes more flexibility with eh models
Reviewers: rnk, mstorsjo
Differential Revision: https://reviews.llvm.org/D39673
llvm-svn: 319297
adds -fseh-exceptions and -fdwarf-exceptions flags
clang will check if the user has specified an exception model flag,
in the absense of specifying the exception model clang will then check
the driver default and append the model flag for that target to cc1
clang cc1 assumes dwarf is the default if none is passed
and -fno-exceptions has a higher priority then specifying the model
move __SEH__ macro definitions out of Targets into InitPreprocessor
behind the -fseh-exceptions flag
move __ARM_DWARF_EH__ macrodefinitions out of verious targets and into
InitPreprocessor behind the -fdwarf-exceptions flag and arm|thumb check
remove unused USESEHExceptions from the MinGW Driver
fold USESjLjExceptions into a new GetExceptionModel function that
gives the toolchain classes more flexibility with eh models
Reviewers: rnk, mstorsjo
Differential Revision: https://reviews.llvm.org/D39673
llvm-svn: 319294
The support for relax relocations is dependent on the linker and
different toolchains within the same compiler can be using different
linkers some of which may or may not support relax relocations.
Give toolchains the option to control whether they want to use relax
relocations in addition to the existing (global) build system option.
Differential Revision: https://reviews.llvm.org/D39831
llvm-svn: 318816
This is an instrumentation flag that's similar to
-finstrument-functions, but it only inserts calls on function entry, the
calls are inserted post-inlining, and they don't take any arugments.
This is intended for users who want to instrument function entry with
minimal overhead.
(-pg would be another alternative, but forces frame pointer emission and
affects link flags, so is probably best left alone to be used for
generating gcov data.)
Differential revision: https://reviews.llvm.org/D40276
llvm-svn: 318785
This was previously done in some places, but for example not for
bundling so that single object compilation with -c failed. In
addition cubin was used for all file types during unbundling which
is incorrect for assembly files that are passed to ptxas.
Tighten up the tests so that we can't regress in that area.
Differential Revision: https://reviews.llvm.org/D40250
llvm-svn: 318763
The Unified Arm Assembler Language is designed so that the majority of
assembler files can be assembled for both Arm and Thumb with the choice
made as a compilation option.
The way this is done in gcc is to pass -mthumb to the assembler with either
-Wa,-mthumb or -Xassembler -mthumb. This change adds support for these
options to clang. There is no assembler equivalent of -mno-thumb, -marm or
-mno-arm so we don't need to recognize these.
Ideally we would do all of the processing in
CollectArgsForIntegratedAssembler(). Unfortunately we need to change the
triple and at that point it is too late. Instead we look for the option
earlier in ComputeLLVMTriple().
Fixes PR34519
Differential Revision: https://reviews.llvm.org/D40127
llvm-svn: 318647
This updates -mcount to use the new attribute names (LLVM r318195), and
switches over -finstrument-functions to also use these attributes rather
than inserting instrumentation in the frontend.
It also adds a new flag, -finstrument-functions-after-inlining, which
makes the cygprofile instrumentation get inserted after inlining rather
than before.
Differential Revision: https://reviews.llvm.org/D39331
llvm-svn: 318199
Added support for regcall as default calling convention. Also added code to
exclude main when applying default calling conventions.
Patch-By: eandrews
Differential Revision: https://reviews.llvm.org/D39210
llvm-svn: 317268
AAPCS and AAPCS64 mandate that `wchar_t` with `-fno-short-wchar` is an
`unsigned int` rather than a `signed int`. Ensure that the driver does
not flip the signedness of `wchar_t` for those targets.
Add additional tests to ensure that this does not regress.
llvm-svn: 316858
GCC tries to shorten system headers in depfiles using its real path
(resolving components like ".." and following symlinks). Mimic this
feature to ensure that the Ninja build tool detects the correct
dependencies when a symlink changes directory levels, see
https://github.com/ninja-build/ninja/issues/1330
An option to disable this feature is added in case "these changed header
paths may conflict with some compilation environments", see
https://gcc.gnu.org/ml/gcc-patches/2012-09/msg00287.html
Note that the original feature request for GCC
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52974) also included paths
preprocessed output (-E) and diagnostics. That is not implemented now
since I am not sure if it breaks something else.
Differential Revision: https://reviews.llvm.org/D37954
llvm-svn: 316193
This patch has the following changes
A new flag "-mhvx-length={64B|128B}" is introduced to specify the length of the vector.
Previously we have used "-mhvx-double" for 128 Bytes. This adds the target-feature "+hvx-length{64|128}b"
The "-mhvx" flag must be provided on command line to enable HVX for Hexagon. If no -mhvx-length flag
is specified, a default length is picked from the arch mentioned in this priority order from either -mhvx=vxx
or -mcpu. For v60 and v62 the default length is 64 Byte. For unknown versions, the length is 128 Byte. The
-mhvx flag adds the target-feature "+hvxv{hvx_version}"
The 64 Byte mode is soon going to be deprecated. A warning is emitted if 64 Byte is enabled. A warning is
still emitted for the default 64 Byte as well. This warning can be suppressed with a -Wno flag.
The "-mhvx-double" and "-mno-hvx-double" flags are deprecated. A warning is emitted if the driver sees
them on commandline. "-mhvx-double" is an alias to "-mhvx-length=128B"
The compilation will error out if -mhvx-length is specified with out an -mhvx/-mhvx= flag
The macro HVX_LENGTH is defined and is set to the length of the vector.
Eg: #define HVX_LENGTH 64
The macro HVX_ARCH is defined and is set to the version of the HVX.
Eg: #define HVX_ARCH 62
Differential Revision: https://reviews.llvm.org/D38852
llvm-svn: 316102
Currently all the consecutive bitfields are wrapped as a large integer unless there is unamed zero sized bitfield in between. The patch provides an alternative manner which makes the bitfield to be accessed as separate memory location if it has legal integer width and is naturally aligned. Such separate bitfield may split the original consecutive bitfields into subgroups of consecutive bitfields, and each subgroup will be wrapped as an integer. Now This is all controlled by an option -ffine-grained-bitfield-accesses. The alternative of bitfield access manner can improve the access efficiency of those bitfields with legal width and being aligned, but may reduce the chance of load/store combining of other bitfields, so it depends on how the bitfields are defined and actually accessed to choose when to use the option. For now the option is off by default.
Differential revision: https://reviews.llvm.org/D36562
llvm-svn: 315915
Move the logic for determining the `wchar_t` type information into the
driver. Rather than passing the single bit of information of
`-fshort-wchar` indicate to the frontend the desired type of `wchar_t`
through a new `-cc1` option of `-fwchar-type` and indicate the
signedness through `-f{,no-}signed-wchar`. This replicates the current
logic which was spread throughout Basic into the
`RenderCharacterOptions`.
Most of the changes to the tests are to ensure that the frontend uses
the correct type. Add a new test set under `test/Driver/wchar_t.c` to
ensure that we calculate the proper types for the various cases.
llvm-svn: 315126
to have child entries describing the template parameters. This will
be on by default for SCE tuning.
Differential Revision: https://reviews.llvm.org/D14358
llvm-svn: 314444
We make the same decision when compiling the kernel or kexts -- we
should do this in -ffreestanding mode as well to avoid size regressions
in a potentially large set of firmware projects.
It's still possible to get uwtable information in -ffreestanding mode by
compiling with -funwind-tables (I expect this to be a rare case: I
certainly haven't seen any projects like that).
Context: -munwind-tables was enabled by default for some arm targets in
r310006.
Testing: check-clang
rdar://problem/33934446
Differential Revision: https://reviews.llvm.org/D37777
llvm-svn: 313087
This primarily impacts the Windows MSVC and Windows itanium
environments. Windows MSVC does not use `__cxa_atexit` and Itanium
follows suit. Simplify the logic for the default value calculation and
blanket the Windows environments to default to off for use of
`__cxa_atexit`.
llvm-svn: 312941
The ToolChain class validates the -mthread-model flag in the constructor which
doesn't work correctly since the thread model methods are virtual methods. The
check is moved into Clang::ConstructJob() when constructing the internal
command line.
https://reviews.llvm.org/D37496
Patch by: Ian Tessier!
llvm-svn: 312748
This code has been `#if 0`'ed out for a very long time. After speaking
with Duncan, opt to remove it even if it is something which should be
fixed. If the underlying issue is still valid, this can be restored
with proper explanation and tests.
llvm-svn: 312618
Extract the target specific option application. This is a huge switch
which was inlined into the `ConstructJob` option which adds a large
amount of code to the already large function. Extract it to simply
reduce the line count. NFC
llvm-svn: 312436
Out-of-line the logic for selecting the debug information handling.
This is still split across the new function and partially inline in the
job construction. This is needed since the split portion attempts to
record the "-cc1" arguments. This needs to be the very last item to
ensure that all the flags are recorded. NFC.
llvm-svn: 312435