292 Commits

Author SHA1 Message Date
Slava Zakharin
27bc8a1811
[flang][NFC] Split CG dialect and the passes. (#135240)
I am making a CG pass depend on `FIROpenACCSupport` in #134346.
This introduces a cyclic dependency between `FIROpenACCSupport`
and `FIRCodeGen`. This patch splits `FIRCodeGen` into
`FIRCodeGenDialect` (for FIR CG dialect definition) and `FIRCodeGen`
(for the CG passes).

Now, `FIROpenACCSupport` depends on `FIRCodeGenDialect`,
and `FIRCodeGen` depends on `FIROpenACCSupport`.
2025-04-10 16:13:04 -07:00
Valentin Clement (バレンタイン クレメン)
a862b6deae
[flang][cuda] Lower shared global to the correct NVVM address space (#131368)
Globals with the CUDA shared data attribute need to be lowered to LLVM
globals with the correct address space (3). The address space is set from
the `mlir::NVVM::NVVMMemorySpace::kSharedMemorySpace` enum from
`mlir/Dialect/LLVMIR/NVVMDialect.h`.
2025-03-14 15:28:32 -07:00
Asher Mancinelli
982527eef0
[flang] Use saturated intrinsics for floating point to integer conversions (#130686)
The saturated floating point conversion intrinsics match the semantics in the standard more closely than the fptosi/fptoui instructions.

Case 2 of 16.9.100 is

> INT (A [, KIND])
> If A is of type real, there are two cases: if |A| < 1, INT (A) has the
value 0; if |A| ≥ 1, INT (A) is the integer whose magnitude is the
largest integer that does not exceed the magnitude of A and whose sign
is the same as the sign of A.

Currently, a conversion of a floating point value into an integer type too
small to hold it is turned into poison in opt, leaving
us with garbage:

```
> cat t.f90
program main
  real(kind=16)   :: f
  integer(kind=4) :: i
  f=huge(f)
  i=f
  print *, i
end program main

# current upstream
> for i in `seq 10`; do ./a.out; done
 -862156992
 -1497393344
 -739096768
 -1649494208
 1761228608
 -1959270592
 -746244288
 -1629194432
 -231217344
 382322496
```

With the saturated fptoui/fptosi intrinsics, we get the appropriate
values

```
# mine
> flang -O2 ./t.f90 && ./a.out
 2147483647

> perl -e 'printf "%d\n", (2 ** 31) - 1'
2147483647
```

One notable difference: NaNs being converted to ints will become zero, unlike current flang (and some other compilers). Newer versions of GCC have this behavior.
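
For illustration, the saturating behavior can be modeled in plain C++. This is a minimal sketch of the semantics the LLVM LangRef documents for `llvm.fptosi.sat` (NaN becomes 0, out-of-range values clamp, in-range values truncate toward zero); it is not flang's lowering code, and the helper name `saturatingToInt32` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of flang or LLVM): models a saturating
// floating-point to signed 32-bit integer conversion.
std::int32_t saturatingToInt32(long double value) {
  constexpr std::int32_t kMin = std::numeric_limits<std::int32_t>::min();
  constexpr std::int32_t kMax = std::numeric_limits<std::int32_t>::max();
  if (std::isnan(value))
    return 0; // NaN maps to zero
  if (value <= static_cast<long double>(kMin))
    return kMin; // too small (including -inf): clamp to the minimum
  if (value >= static_cast<long double>(kMax))
    return kMax; // too large (including +inf): clamp to the maximum
  return static_cast<std::int32_t>(value); // in range: truncate toward zero
}
```

Passing the largest finite `long double` to this helper yields 2147483647, matching the `huge(f)` case in the Fortran program above.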
2025-03-12 08:14:46 -07:00
jeanPerier
1ddf18057a
[flang] introduce fir.copy to avoid load store of aggregates (#130289)
Introduce a FIR operation to do memcopy/memmove of compile time constant size types.

This is to avoid requiring derived type copies to be done with load/store,
which is badly supported in LLVM when the aggregate type is "big" (no
threshold can easily be defined here, so it is better to always avoid them for
fir.type).

This was the root cause of the regressions caused by #114002, which introduced a
load/store of fir.type<> that caused hangs/asserts to fire in LLVM on
several benchmarks.

See https://llvm.org/docs/Frontend/PerformanceTips.html#avoid-creating-values-of-aggregate-type
2025-03-11 09:31:03 +01:00
R
1dffe8f364
Reland [flang] In AllocMemOp lowering, convert types for calling malloc on 32-bit (#130386)
Previous PR: https://github.com/llvm/llvm-project/pull/129308

Changes:
* The alloc-32.fir test is now marked as requiring the X86 target.
* Drive-by fixes uncovered when fixing tests involving malloc
2025-03-11 02:01:57 +00:00
R
3121da52aa
Revert "[flang] In AllocMemOp lowering, convert types for calling malloc on 32-bit (#129308)"
This reverts commit cf1964af5a461196904b663ede04c26555fcff69.

This causes breakage on all the non-x86 buildbots as they don't have the i686
target enabled. This was missed in pre-commit CI.
2025-03-08 02:42:24 +00:00
R
cf1964af5a
[flang] In AllocMemOp lowering, convert types for calling malloc on 32-bit (#129308)
Although 32-bit targets are currently not officially supported, add a type conversion in the AllocMemOp lowering when calling the `malloc` function on 32-bit targets. This fixes a type mismatch, and this fix makes it easier to potentially support such targets in the future.

This involves making sure the `LLVMTypeConverter` has the necessary information to know the target bit width.

Co-authored-by: Valentin Clement (バレンタイン クレメン) <clementval@gmail.com>
2025-03-08 02:25:17 +00:00
jeanPerier
a8db1fb9b5
[flang] update fir.coordinate_of to carry the fields (#127231)
This patch updates fir.coordinate_of to carry the field index as
attributes instead of relying on getting it from the fir.field_index
operations defining its operands.

The rationale is that FIR currently has a few operations that require
DAGs to be preserved in order to be able to do code generation. This is
the case for fir.coordinate_of, which requires its fir.field operand
producer to be visible.
This makes IR transformations harder and more brittle, so I want to update FIR to
get rid of this.

Codegen/printer/parser of fir.coordinate_of and many tests need to be
updated after this change.
2025-02-28 09:50:05 +01:00
Slava Zakharin
0caa8f42be
Reland "[flang] Set LLVM specific attributes to fir.call's of Fortran runtime. (#128093)"
This change is inspired by a case in the facerec benchmark, where the
performance of scalar code may improve by about 6% on aarch64 due to getting
rid of redundant loads from Fortran descriptors. These descriptors correspond
to subroutine-local ALLOCATABLE, SAVE variables. The scalar loop nest
in the LocalMove subroutine contains calls to Fortran runtime IO functions,
and LLVM globals-aa analysis cannot prove that these calls do not modify
the globalized descriptors with internal linkage.

This patch sets and propagates the llvm.memory_effects attribute for fir.call
operations calling Fortran runtime functions. In particular, it tries
to set the Other memory effect to NoModRef. The Other memory effect
includes accesses to globals and captured pointers, so we cannot set
it for functions taking Fortran descriptors with one exception
for calls where the Fortran descriptor arguments are all null.
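
The rule in the previous paragraph can be sketched as a small predicate. This is a simplified model under stated assumptions, not the pass itself: `ArgInfo` and `canSetOtherNoModRef` are hypothetical names, and the real pass may take more conditions into account.

```cpp
#include <cassert>
#include <vector>

// Hypothetical summary of one call argument; not a flang data structure.
struct ArgInfo {
  bool isDescriptor; // the argument is a Fortran descriptor
  bool isNull;       // the argument is known to be null
};

// Returns true when, under the rule described above, the Other memory effect
// of a runtime call could be set to NoModRef: the call must not pass any
// non-null Fortran descriptor.
bool canSetOtherNoModRef(const std::vector<ArgInfo> &args) {
  for (const ArgInfo &arg : args)
    if (arg.isDescriptor && !arg.isNull)
      return false;
  return true;
}
```

Because the answer depends on the arguments of each individual call, a check of this shape naturally belongs on the call rather than on the callee.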

Since different calls to the same Fortran runtime function may have
different attributes, I decided to attach the attributes to the calls
rather than functions. Moreover, attaching the attributes to func.func
will require propagating these attributes to llvm.func, which is not
happening right now.

In addition to llvm.memory_effects, the new pass sets llvm.nosync
and llvm.nocallback attributes that may also help LLVM alias analysis
(e.g. see #127707). These attributes are ignored currently.
I will support them in LLVM IR dialect in a separate patch.

I also added another pass for developers to be able to print
declarations/calls of all Fortran runtime functions that are recognized
by the attributes setting pass. It should help with maintenance
of the LIT tests.
2025-02-24 14:18:17 -08:00
Slava Zakharin
69cc16fb55
Revert "[flang] Set LLVM specific attributes to fir.call's of Fortran runtime. (#128093)"
This reverts commit 36fdeb2aded08a776fcffefa73cb7667e7fc6c2d.
2025-02-24 10:52:53 -08:00
Valentin Clement (バレンタイン クレメン)
8dbc393e44
[flang][cuda][NFC] Remove shared alloc addr space (#128535)
2025-02-24 10:05:32 -08:00
Slava Zakharin
36fdeb2ade
[flang] Set LLVM specific attributes to fir.call's of Fortran runtime. (#128093)
This change is inspired by a case in the facerec benchmark, where the
performance of scalar code may improve by about 6% on aarch64 due to getting
rid of redundant loads from Fortran descriptors. These descriptors correspond
to subroutine-local ALLOCATABLE, SAVE variables. The scalar loop nest
in the LocalMove subroutine contains calls to Fortran runtime IO functions,
and LLVM globals-aa analysis cannot prove that these calls do not modify
the globalized descriptors with internal linkage.

This patch sets and propagates the llvm.memory_effects attribute for fir.call
operations calling Fortran runtime functions. In particular, it tries
to set the Other memory effect to NoModRef. The Other memory effect
includes accesses to globals and captured pointers, so we cannot set
it for functions taking Fortran descriptors with one exception
for calls where the Fortran descriptor arguments are all null.

Since different calls to the same Fortran runtime function may have
different attributes, I decided to attach the attributes to the calls
rather than functions. Moreover, attaching the attributes to func.func
will require propagating these attributes to llvm.func, which is not
happening right now.

In addition to llvm.memory_effects, the new pass sets llvm.nosync
and llvm.nocallback attributes that may also help LLVM alias analysis
(e.g. see #127707). These attributes are ignored currently.
I will support them in LLVM IR dialect in a separate patch.

I also added another pass for developers to be able to print
declarations/calls of all Fortran runtime functions that are recognized
by the attributes setting pass. It should help with maintenance
of the LIT tests.
2025-02-24 09:27:48 -08:00
Razvan Lupusoru
f27081ba6a
[FIR] Avoid generating llvm.undef for dummy scoping info (#128098)
Dummy scoping operations are generated to keep track of scopes for
the purpose of Fortran-level analyses like Alias Analysis. For codegen, the
scoping info is converted to a fir.undef during pre-codegen rewrite.
Then during declare lowering, this info is no longer used, but it is
still translated to llvm.undef. I cleaned this up so it is simply erased. The
generated LLVM IR should no longer have a stray undef, which looks off
when trying to make sense of the IR.

Co-authored-by: Razvan Lupusoru <rlupusoru@nvidia.com>
2025-02-20 18:49:23 -08:00
Valentin Clement (バレンタイン クレメン)
726c4b9f77
[flang][cuda] Lower match_all_sync functions to nvvm intrinsics (#127940)
2025-02-20 09:10:25 -08:00
jeanPerier
5836d91845
[flang] add ABI argument attributes in indirect calls (#126896)
Last piece that implements the TODO for sret and byval setting on
indirect calls.

This includes a fix to the codegen from the last patch. I thought types in
type attributes were automatically converted in dialect conversion
passes, but that is not the case. The sret and byval types need to be
converted to LLVM types in codegen (the MLIR FuncOp conversion does a
similar conversion).
2025-02-12 17:31:34 +01:00
jeanPerier
65075a863b
[flang][FIR] handle argument attributes in fir.call (#126711)
Add pretty printer/parser for fir.call argument/result attributes and
propagate them to llvm.call.

This will allow implementing the TODO about ABI relevant argument
attribute in indirect calls.
2025-02-12 09:49:52 +01:00
agozillon
4186805060
[Flang][MLIR] Extend DataLayout utilities to have basic GPU Module support (#123149)
There are now certain areas where we may have either a ModuleOp or a
GPUModuleOp. Both of these modules can have DataLayouts, and we may need
the DataLayout utilities in these areas, so I've taken the liberty of
extending them for use with both.

Those with more knowledge of how they wish the GPUModuleOp's to interact
with their parent ModuleOp's DataLayout may have further alterations
they wish to make in the future, but for the moment, it'll simply
utilise the basic data layout construction which I believe combines
parent and child datalayouts from the ModuleOp and GPUModuleOp. If there
is no GPUModuleOp DataLayout it should default to the parent ModuleOp.

It's worth noting there is some weirdness if you have two module
operations defining builtin dialect DataLayout entries: it appears the
combinatorial functionality for DataLayouts doesn't support merging
them.

This behaviour is useful for areas like:
https://github.com/llvm/llvm-project/pull/119585/files#diff-19fc4bcb38829d085e25d601d344bbd85bf7ef749ca359e348f4a7c750eae89dR1412
where we have a crossroads between the two different module operations.
2025-01-30 17:31:50 +01:00
Slava Zakharin
0b80491cd5
[flang] Support non-index shape/shift/slice for CG box operations. (#124625)
That is another problem uncovered during hlfir.reshape inlining,
where the shape bits could be any integer type.
This patch adds explicit conversions to `index` type where needed.
2025-01-28 09:38:33 -08:00
Abid Qadeer
afa4681ce4
[flang][debug] Add support for common blocks. (#112398)
This PR adds debug support for common blocks in flang. As variables that
are part of a common block don't have a special marker to recognize
them, we use the following check to find them.

%0 = fir.address_of(@a)
%1 = fir.convert %0
%2 = fir.coordinate_of %1, %c0
%3 = fir.convert %2
%4 = fircg.ext_declare %3

If the memref of a fircg.ext_declare points to a fir.coordinate_of and
that in turn points to a fir.address_of (ignoring intervening
fir.convert operations), then we assume that it is a common block variable. The
fir.address_of gives us the global symbol which is the storage for the
common block, and fir.coordinate_of provides the offset in this storage.
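
The check can be sketched on a toy representation of the def-use chain. This is only an illustration under stated assumptions: `OpKind`, `Op`, `skipConverts`, and `looksLikeCommonBlockVariable` are hypothetical names, and the real code works on MLIR operations rather than this struct.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for an operation and the producer of its
// first operand.
enum class OpKind { AddressOf, Convert, CoordinateOf, ExtDeclare, Other };

struct Op {
  OpKind kind;
  const Op *operand; // defining op of the first operand, or nullptr
};

// Walks past any chain of convert-like nodes.
const Op *skipConverts(const Op *op) {
  while (op && op->kind == OpKind::Convert)
    op = op->operand;
  return op;
}

// Matches: ext_declare -> (convert)* -> coordinate_of -> (convert)* -> address_of
bool looksLikeCommonBlockVariable(const Op &declare) {
  if (declare.kind != OpKind::ExtDeclare)
    return false;
  const Op *coord = skipConverts(declare.operand);
  if (!coord || coord->kind != OpKind::CoordinateOf)
    return false;
  const Op *base = skipConverts(coord->operand);
  return base && base->kind == OpKind::AddressOf;
}
```

Building the five-operation chain from the snippet above (`%0` through `%4`) out of these structs makes the predicate return true for the declare node.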

The debug hierarchy looks as follows:

subroutine f3
  integer :: x, y
  common /a/ x, y
end subroutine

@a_ = global { ... } { ... }, !dbg !26, !dbg !28

!23 = !DISubprogram(name: "f3"...)
!24 = !DICommonBlock(scope: !23, name: "a", ...)
!25 = !DIGlobalVariable(name: "x", scope: !24 ...)
!26 = !DIGlobalVariableExpression(var: !25, expr: !DIExpression())
!27 = !DIGlobalVariable(name: "y", scope: !24 ...)
!28 = !DIGlobalVariableExpression(var: !27, expr:
!DIExpression(DW_OP_plus_uconst, 4))

This required the following changes:

1. Instead of using DIGlobalVariableAttr in the FusedLoc of GlobalOp, we
use DIGlobalVariableExpressionAttr. This allows us to generate the
DIExpression where we have the information.

2. Previously, only one DIGlobalVariableExpressionAttr could be linked
to one global op. I recently removed this restriction in mlir. To make
use of it, we add an ArrayAttr to the FusedLoc of a GlobalOp. This
allows us to pass multiple DIGlobalVariableExpressionAttr.

3. I was depending on the name of the global for the name of the common
block. The name gets a '_' appended. I could not find a utility function
in flang to remove it, so I had to brute-force it.
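
A brute-force removal of the appended '_' could look like the following. This is a minimal sketch, not the code from the patch; the helper name `commonBlockNameFromGlobal` is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: recovers a common block name from the name of its
// global by dropping a single trailing '_' if one is present.
std::string commonBlockNameFromGlobal(const std::string &globalName) {
  if (!globalName.empty() && globalName.back() == '_')
    return globalName.substr(0, globalName.size() - 1);
  return globalName;
}
```

For the example above, the global `a_` maps back to the common block name `a`.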
2025-01-28 12:54:15 +00:00
Valentin Clement (バレンタイン クレメン)
9f83c4ed1c
[flang][cuda] Allocate descriptor in managed memory on rebox block argument (#123971)
Another case where the descriptor must be allocated with the CUF runtime
and not a simple alloca instruction.
2025-01-22 10:04:39 -08:00
Valentin Clement (バレンタイン クレメン)
c26e1a22df
[flang][cuda] Allocate descriptor in managed memory when memref is a block argument (#123829)
2025-01-21 17:20:46 -08:00
Matthias Springer
599c739905
[mlir][GPU] Add NVVM-specific cf.assert lowering (#120431)
This commit adds an NVIDIA-specific lowering of `cf.assert` to
`__assertfail`.

Note: `getUniqueFormatGlobalName`, `getOrCreateFormatStringConstant` and
`getOrDefineFunction` are moved to `GPUOpsLowering.h`, so that they can
be reused.
2025-01-06 12:00:11 +01:00
Matthias Springer
c870632ef6
[flang] Fix some memory leaks (#121050)
This commit fixes some but not all memory leaks in Flang. There are
still 91 tests that fail with ASAN.

- Use `mlir::OwningOpRef` instead of `std::unique_ptr`. The latter does
not free allocations of nested blocks.
- Pass `ModuleOp` as value instead of reference.
- Add a few missing deallocations in test cases and other places.
2024-12-25 09:42:03 +01:00
Valentin Clement (バレンタイン クレメン)
d36836de01
[flang][cuda] Create descriptor in managed memory when emboxing fir.box_addr value (#120980)
2024-12-23 09:52:59 -08:00
Kazu Hirata
392651a7ec
[flang] Migrate away from PointerUnion::{is,get} (NFC) (#120880)
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
2024-12-22 13:30:16 -08:00
Valentin Clement (バレンタイン クレメン)
e650ac1654
[flang][cuda][NFC] Fix typo in CUFAllocDescriptor (#120797)
Missing `r` in the function name.
2024-12-20 13:57:47 -08:00
Valentin Clement (バレンタイン クレメン)
81831ef3e7
[flang][cuda] Correctly allocate descriptor in managed memory when reboxing (#120795)
Reboxing might create a new in-memory descriptor. If the original one was
allocated in managed memory, allocate the new one in managed memory as
well.
2024-12-20 13:32:31 -08:00
Valentin Clement (バレンタイン クレメン)
3e13acfbf4
[flang][cuda] Make default.nonTbpDefinedIoTable compiler generated (#120686)
`default.nonTbpDefinedIoTable` is a special global defined for IO that
doesn't follow the mangling scheme and is therefore not handled correctly in
the `CompilerGeneratedNames` pass. Update how it is generated with
doGenerated so it can be handled without special-casing.

Also do not generate comdat in the gpu module, as the current code does not
handle nested modules correctly.
2024-12-20 10:37:48 -08:00
Matthias Springer
eb6c4197d5
[mlir][CF] Split cf-to-llvm from func-to-llvm (#120580)
Do not run `cf-to-llvm` as part of `func-to-llvm`. This commit fixes
https://github.com/llvm/llvm-project/issues/70982.

This commit changes how `func.func` ops are lowered to LLVM.
Previously, the signature of the entire region (i.e., entry block and
all other blocks in the `func.func` op) was converted as part of the
`func.func` lowering pattern.

Now, only the entry block is converted. The remaining block signatures
are converted together with `cf.br` and `cf.cond_br` as part of
`cf-to-llvm`. All unstructured control flow is now converted as part of
a single pass (`cf-to-llvm`). `func-to-llvm` no longer deals with
unstructured control flow.

Also add more test cases for control flow dialect ops.

Note: This PR is in preparation of #120431, which adds an additional
GPU-specific lowering for `cf.assert`. This was a problem because
`cf.assert` used to be converted as part of `func-to-llvm`.

Note for LLVM integration: If you see failures, add
`-convert-cf-to-llvm` to your pass pipeline.
2024-12-20 13:46:45 +01:00
Valentin Clement (バレンタイン クレメン)
e93d226664
[flang][cuda] Update CompilerGeneratedNames pass to work on gpu module (#120660)
- Update `CompilerGeneratedNames` so it can perform renaming in
gpu.module
- Update Codegen so it looks in the correct module for the type
descriptor.
2024-12-19 19:07:00 -08:00
Valentin Clement (バレンタイン クレメン)
4530273d7c
[flang][cuda] Allocate descriptor in managed memory when emboxing device memory (#120485)
When emboxing memory that comes from CUFMemAlloc, we need to allocate
the descriptor in managed memory as it might be passed to a kernel.
2024-12-18 18:20:45 -08:00
Peter Klausler
fc97d2e68b
[flang] Add UNSIGNED (#113504)
Implement the UNSIGNED extension type and operations under control of a
language feature flag (-funsigned).

This is nearly identical to the UNSIGNED feature that has been available
in Sun Fortran for years, and now implemented in GNU Fortran for
gfortran 15, and proposed for ISO standardization in J3/24-116.txt.

See the new documentation for details; but in short, this is C's
unsigned type, with guaranteed modular arithmetic for +, -, and *, and
the related transformational intrinsic functions SUM & al.
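
Since the commit describes the type as C's unsigned type, the guaranteed modular behavior can be shown with ordinary C++ `unsigned` arithmetic. This is a minimal sketch of the semantics only; the function names are hypothetical and it says nothing about how flang lowers UNSIGNED operations.

```cpp
#include <cassert>
#include <limits>

// Largest value of the C/C++ unsigned type on this platform (2^N - 1).
constexpr unsigned kUnsignedMax = std::numeric_limits<unsigned>::max();

// Unsigned arithmetic in C and C++ is defined to wrap modulo 2^N, which is
// the behavior described above for +, -, and *.
unsigned wrappingAdd(unsigned a, unsigned b) { return a + b; }
unsigned wrappingSub(unsigned a, unsigned b) { return a - b; }
unsigned wrappingMul(unsigned a, unsigned b) { return a * b; }
```

For example, adding 1 to `kUnsignedMax` wraps to 0, and subtracting 1 from 0 wraps to `kUnsignedMax`.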
2024-12-18 07:02:37 -08:00
Valentin Clement (バレンタイン クレメン)
5e1f87e849
[flang][cuda] Correctly allocate memory for descriptor load (#120164)
CodeGen will allocate memory for a new descriptor on descriptor loads.
CUDA Fortran local descriptors are allocated in managed memory by the
runtime. The newly allocated storage for a CUDA descriptor must also be
allocated through the runtime.
2024-12-16 19:12:05 -08:00
Michael Kruse
c91ba04328
[Flang][NFC] Split runtime headers in preparation for cross-compilation. (#112188)
Split some headers into headers for public and private declarations in
preparation for #110217. Moving the runtime-private headers in
runtime-private include directory will occur in #110298.

* Do not use `sizeof(Descriptor)` in the compiler. The size of the
descriptor is target-dependent while `sizeof(Descriptor)` is the size of
the Descriptor for the host platform which might be too small when
cross-compiling to a different platform. Another problem is that the
emitted assembly ((cross-)compiling to the same target) is not identical
between Flang instances running on different systems. Moving the declaration of
`class Descriptor` out of the included header will also reduce the
amount of #included sources.

* Do not use `sizeof(ArrayConstructorVector)` and
`alignof(ArrayConstructorVector)` in the compiler. Same reason as with
`Descriptor`.

* Compute the descriptor's extra flags without instantiating a
Descriptor. `Fortran::runtime::Descriptor` is defined in the runtime
source, but not the compiler source.

* Move `InquiryKeywordHashDecode` into runtime-private header. The
function is defined in the runtime sources and trying to call it in the
compiler would lead to a link-error.

* Move allocator-kind magic numbers into common header. They are the
only declarations out of `allocator-registry.h` in the compiler as well.
 
This does not make Flang cross-compile ready yet; the main goal is to
avoid transitive header dependencies from Flang to clang-rt. There are
more assumptions that the host platform is the same as the target platform.
2024-12-06 15:29:00 +01:00
Tom Eccles
e9fc2faf0c
[flang][CodeGen] fix bug hoisting allocas using a shared constant arg (#116251)
When hoisting the allocas with a constant integer size, the constant
integer was moved to where the alloca is hoisted to unconditionally.

By the time CodeGen runs, there have been various iterations of MLIR canonicalization
and dead code elimination. This can cause lots of unrelated bits of code
to share the same constant values. If for some reason the alloca
couldn't be hoisted all the way to the entry block of the function,
moving the constant might result in it no longer dominating some of the
remaining uses.

In theory, there should be dominance analysis to ensure the location of
the constant does dominate all uses of it. But those constants are
effectively free anyway (they aren't even separate instructions in LLVM
IR), so it is less expensive just to leave the old one where it was and
insert a new one we know for sure is immediately before the alloca.
2024-11-15 10:31:20 +00:00
Valentin Clement (バレンタイン クレメン)
e5092c3019
[flang][cuda] Support malloc and free conversion in gpu module (#116112)
2024-11-13 17:09:38 -08:00
Valentin Clement (バレンタイン クレメン)
466b58ba38
[flang] Avoid generating duplicate symbol in comdat (#114472)
In cases where a fir.global is duplicated in an inner module
(gpu.module), the conversion pattern is applied to both the module and
the gpu module versions of the global and tries to generate multiple comdats
with the same symbol name. This is what we have in the implementation of
CUDA Fortran.

Just check for the presence of the `ComdatSelectorOp` before creating a
new one.
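
The "check before creating" idea can be sketched with a toy table. This only models the pattern; `ComdatTable` and `getOrCreateSelector` are hypothetical names and do not correspond to the MLIR API used in the patch.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a comdat region that holds selectors by name.
struct ComdatTable {
  std::set<std::string> selectors;

  // Creates a selector for `symbol` only if none exists yet.
  // Returns true if a new selector was created, false if it was already there.
  bool getOrCreateSelector(const std::string &symbol) {
    return selectors.insert(symbol).second;
  }
};
```

Applying the conversion to both copies of a global then results in a single selector per symbol name: the second request finds the existing entry and creates nothing.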
2024-10-31 18:59:04 -07:00
Asher Mancinelli
0c9a02355a
[flang][fir] always use memcpy for fir.box (#113949)
@jeanPerier explained the importance of converting box loads and stores
into `memcpy`s instead of aggregate loads and stores, and I'll do my
best to explain it here.

* [(godbolt link) Example comparing opt transformations on memcpys vs
aggregate load/stores](https://godbolt.org/z/be7xM83cG)
* LLVM can more effectively reason about memcpys compared to aggregate
load/stores.
* This came up when others were discussing array descriptors for
assumed-rank arrays passed to `bind(c)` subroutines, with the
implication that the array descriptors are known to have lower bounds of
1 and that they are not pointer/allocatable types.
* [(godbolt link) Clang also uses memcpys so we should probably follow
them, assuming the clang developers are generating what they know Opt
will handle more effectively.](https://godbolt.org/z/YT4x7387W)
* This currently may not help much without the `nocapture` attribute
being propagated to function calls, but [it looks like someone may do
this soon (discourse
link)](https://discourse.llvm.org/t/applying-the-nocapture-attribute-to-reference-passed-arguments-in-fortran-subroutines/81401/23)
or I can do this in a follow-up patch.

Note on test `flang/test/Fir/embox-char.fir`: it looks like the original
test was auto-generated. I wasn't too sure which parts were especially
important to test, so I regenerated the test. If we want the updated
version to look more like the old version, I'll make those changes.
2024-10-30 09:50:27 -07:00
Scott Manley
e6a4346b5a
[flang] add getElementType() to fir::SequenceType and fir::VectorType (#112770)
getElementType() was missing from Sequence and Vector types. Did a
replace of the obvious places getEleTy() was used for these two types
and updated to use this name instead.

Co-authored-by: Scott Manley <scmanley@nvidia.com>
2024-10-18 09:29:25 +02:00
jeanPerier
2f0b4f43fc
[flang][extension] support concatenation with absent optional (#112678)
Fix #112593 by adding support in lowering to concatenation with an
absent optional _assumed length_ dummy argument because:
1. Most compilers seem to support it (most likely by accident).
2. This actually makes the compiler codegen simpler. Codegen was going
out of its way to poke the LLVM optimizer bear by producing an undef
argument for the length.

I insist on the fact that no compiler supports this with _explicit
length_ optional arguments: the executable will segfault. I would
discourage users from using that "feature" because runtime checks for
bad optional dereferences will kick in when it is used (for instance, "nagfor
-C=present" will produce an executable that aborts with an error
message. Flang does not have such a runtime check option so far).

Hence, I am not updating the Extensions.md document because this is not
something I think we should advertise.
2024-10-17 13:25:09 +02:00
Abid Qadeer
cd12ffb622
[mlir][debug] Allow multiple DIGlobalVariableExpression on globals. (#111981)
Currently, we allow only one DIGlobalVariableExpressionAttr per global.
It is especially evident in import where we pick the first from the list
and ignore the rest. In contrast, LLVM allows multiple
DIGlobalVariableExpression to be attached to the global. They are needed
for correct working of things like DICommonBlock. This PR removes this
restriction in mlir. Changes are mostly mechanical. One thing on which I
went a bit back and forth was the representation inside GlobalOp. I
would be happy to change if there are better ways to do this.

---------

Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
2024-10-13 23:36:00 +01:00
Leandro Lupori
390943f25b
[flang] Implement conversion of compatible derived types (#111165)
With some restrictions, BIND(C) derived types can be converted to
compatible BIND(C) derived types.
Semantics already support this, but ConvertOp was missing the
conversion of such types.

Fixes https://github.com/llvm/llvm-project/issues/107783
2024-10-09 10:37:46 -03:00
Matthias Springer
206fad0e21
[mlir][NFC] Mark type converter in populate... functions as const (#111250)
This commit marks the type converter in `populate...` functions as
`const`. This is useful for debugging.

Patterns already take a `const` type converter. However, some
`populate...` functions do not only add new patterns, but also add
additional type conversion rules. That makes it difficult to find the
place where a type conversion was added in the code base. With this
change, all `populate...` functions that only populate patterns now have
a `const` type converter. Programmers can then conclude from the
function signature that these functions do not register any new type
conversion rules.
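
The guarantee that the signature provides can be seen in miniature with a toy converter. This is a sketch under stated assumptions: `ToyTypeConverter` and `populateToyPatterns` are hypothetical and are not the MLIR classes or functions touched by this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a type converter: rules can only be registered
// through a non-const member function.
class ToyTypeConverter {
public:
  using Rule = std::function<std::string(const std::string &)>;

  void addConversion(Rule rule) { rules.push_back(std::move(rule)); }

  // Applies every registered rule in order; usable on a const converter.
  std::string convert(const std::string &type) const {
    std::string result = type;
    for (const Rule &rule : rules)
      result = rule(result);
    return result;
  }

  std::size_t numRules() const { return rules.size(); }

private:
  std::vector<Rule> rules;
};

// Taking the converter by const reference lets this function use existing
// rules but makes registering new ones a compile-time error.
void populateToyPatterns(const ToyTypeConverter &converter,
                         std::vector<std::string> &patterns) {
  patterns.push_back(converter.convert("index"));
  // converter.addConversion(...); // would not compile: converter is const
}
```

A reader of the `populateToyPatterns` signature can conclude, without opening the body, that the set of conversion rules is the same before and after the call.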

Also some minor cleanups around the 1:N dialect conversion
infrastructure, which did not always pass the type converter as a
`const` object internally.
2024-10-05 21:32:40 +02:00
jeanPerier
1753de2d95
[flang][FIR] remove fir.complex type and its fir.real element type (#111025)
Final patch of
https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292

Since fir.real was only still used as fir.complex element type, this
patch removes it at the same time.
2024-10-04 09:57:03 +02:00
jeanPerier
c2601f1769
[flang][NFC] remove unused fir.constc operation (#110821)
As part of [RFC to replace fir.complex usages by mlir.complex
type](https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292).

fir.constc is unused so instead of porting it, just remove it.
Complex constants are currently created with inserts in lowering
already. When using mlir complex, we may just want to start using
[complex.constant](4f6ad17adc/mlir/include/mlir/Dialect/Complex/IR/ComplexOps.td (L131C5-L131C16)).
2024-10-02 16:16:57 +02:00
Sirui Mu
fde3c16ac9
[mlir][LLVM] Add operand bundle support (#108933)
This PR adds LLVM [operand
bundle](https://llvm.org/docs/LangRef.html#operand-bundles) support to
MLIR LLVM dialect. It affects these 3 operations related to making
function calls: `llvm.call`, `llvm.invoke`, and `llvm.call_intrinsic`.

This PR adds two new parameters to each of the 3 operations. The first
parameter is a variadic operand `op_bundle_operands` that contains the
SSA values for operand bundles. The second parameter is a property
`op_bundle_tags` which holds an array of strings that represent the tags
of each operand bundle.
2024-09-26 07:59:37 +02:00
Jan Leyonberg
4290e34ebd
[flang][AMDGPU] Convert math ops to AMD GPU library calls instead of libm calls (#99517)
This patch invokes a pass when compiling for an AMDGPU target to lower
math operations to AMD GPU library calls instead of libm
calls.
2024-09-10 09:48:55 -04:00
Nikita Popov
67e19e5bb1
[flang] Set isSigned=true for negative constant (NFC)
We're providing this as a negative signed value, so set the flag.
Currently doesn't make a difference, but will assert in the future.

Split out of https://github.com/llvm/llvm-project/pull/80309.
2024-09-05 15:25:05 +02:00
Peter Klausler
9e53e77265
[flang] Fix warnings from more recent GCCs (#106567)
While experimenting with some more recent C++ features, I ran into
trouble with warnings from GCC 12.3.0 and 14.2.0. These warnings looked
legitimate, so I've tweaked the code to avoid them.
2024-09-04 10:52:51 -07:00
Slava Zakharin
cfd4c1805e
[RFC][flang] Replace special symbols in uniqued global names. (#104859)
This change addresses more "issues" like the one resolved in #71338.
Some targets (e.g. NVPTX) do not accept global names containing
`.`. In particular, the global variables created to represent
the runtime information of derived types use `.` in their names.
A derived type's descriptor object may be used in the device code,
e.g. to initialize a descriptor of a variable of this type.
Thus, the runtime type info objects may need to be compiled
for the device.

Moreover, at least the derived types' descriptor objects
may need to be registered (think of `omp declare target`)
for the host-device association so that the addendum pointer
can be properly mapped to the device for descriptors using
a derived type's descriptor as their addendum pointer.
The registration implies knowing the name of the global variable
in the device image so that proper host code can be created.
So it is better to name the globals the same way for the host
and the device.

CompilerGeneratedNamesConversion pass renames all uniqued globals
such that the special symbols (currently `.`) are replaced
with `X`. The pass is supposed to be run for the host and the device.

An option is added to FIR-to-LLVM conversion pass to indicate
whether the new pass has been run before or not. This setting
affects how the codegen computes the names of the derived types'
descriptors for FIR derived types.

fir::NameUniquer now allows `X` to be part of a name, because
the name deconstruction may be applied to the mangled names
after CompilerGeneratedNamesConversion pass.
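
The renaming itself amounts to a character substitution, which can be sketched as follows. This is a minimal illustration of the `.` to `X` replacement only; the helper name `replaceSpecialSymbols` and the sample names in the usage note are hypothetical, and the real pass operates on uniqued MLIR global symbols rather than plain strings.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: replaces every '.' in a global name with 'X' so that
// targets which reject '.' in global names can accept it.
std::string replaceSpecialSymbols(std::string name) {
  std::replace(name.begin(), name.end(), '.', 'X');
  return name;
}
```

With this sketch, a made-up name such as `_QMmE.dt.t` becomes `_QMmEXdtXt`, and a name without `.` is returned unchanged.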
2024-08-21 13:37:03 -07:00