I would like to nominate Mario Cupelli (@mariocup) and myself to join
the LLVM Security Group as representatives (vendor contacts) of [HighTec
EDV Systeme](https://github.com/hightec-rt).
Mario is the CTO of HighTec and has a strong background in compiler
safety qualification. I am a SW engineer with a focus on cybersecurity
and a goal to contribute to the LLVM Security Group.
HighTec collaborates with major silicon vendors and offers
safety-qualified C/C++ and Rust compilers based on LLVM. Our active
contributions to LLVM include work on the linker and various patches and
we are committed to further improving LLVM’s security.
Our motivation for joining the LLVM Security Group is to collaborate
with LLVM security experts, stay informed about the latest CVEs in LLVM
and meet the cybersecurity requirements of the automotive industry.
The format is: `!instances<T>([regex])`.
This operator produces a list of records whose type is `T`. If
`regex` is provided, only records whose name matches the regular
expression `regex` will be included. The format of `regex` is ERE
(Extended POSIX Regular Expressions).
This implements [the `.option exact` and `.option noexact`
proposal](https://github.com/riscv-non-isa/riscv-asm-manual/pull/122)
for RISC-V.
`.option exact` turns off:
- Compression
- Branch Relaxation
- Linker Relaxation
`.option noexact` turns these back on, and is also the default, matching
the current behaviour.
The link should refer to the section of 'phi' Instruction in the
LangRef, but it referred to the subsection of 'fcmp' Instruction.
Replace it with appropriate one.
In Ada, an array can be packed and the elements can take less space than
their natural object size. For example, for this type:
type Packed_Array is array (4 .. 8) of Boolean;
pragma pack (Packed_Array);
... each element of the array occupies a single bit, even though the
"natural" size for a Boolean in memory is a byte.
In DWARF, this is represented by putting a DW_AT_bit_stride onto the
array type itself.
This patch adds a bit stride to DICompositeType so that gnat-llvm can
emit DWARF for these sorts of arrays.
Option becomes: -instruction-tables=`<level>`
The choice of `<level>` controls number of printed information.
`<level>` may be `none` (default), `normal`, `full`.
Note: If the option is used without `<label>`, default is `normal`
(legacy).
When `<level>` is `full`, additional information are:
- `<Bypass Latency>`: Latency when a bypass is implemented between
operands
in pipelines (see SchedReadAdvance).
- `<LLVM Opcode Name>`: mnemonic plus operands identifier.
- `<Resources units>`: Used resources associated with LLVM Opcode.
- `<instruction comment>`: reports comment if any from source assembly.
Level `full` can be used to better check scheduling info when TableGen
is modified.
LLVM Opcode name help to find right instruction regexp to fix TableGen
Scheduling Info.
-instruction-tables=full option is validated on
AArch64/Neoverse/V1-sve-instructions.s
Follow up of MR #126703
---------
Co-authored-by: Julien Villette <julien.villette@sipearl.com>
This is required to support `li`, but the code is also shared with
CodeGen so the compiler will now emit instructions from Xqcili when that
extension is enabled during compilation.
Also implemented some missed verifiers in
`RISCVInstrInfo::verifyInstruction`, some of which are required for this
change.
According to the shader manual there are not buffer load lds
instructions of gfx11.
The tests for the regular `buffer_load ... lds` instructions for gfx11
are already present in AMDGPU/gfx11_asm_mubuf.s, where the compiler
fails to encode the instructions for this target.
When a developer copy/pastes a failing command line into their
shell to rerun it, they have to manually delete the "RUN: at line
N:" prefix. To make life easier for such developers, let's make it
possible to copy/paste a command without needing to modify it while
still showing the line number in the output by moving the line number
to a comment at the end of the command line.
Reviewers: jroelofs, MaskRay
Reviewed By: jroelofs, MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/132485
With a minor fix for the build failures.
Original message:
This extension adds nine instructions, eight for non-memory-mapped devices synchronization and delay instruction.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli quic_svs@quicinc.com
This extension adds nine instructions, eight for non-memory-mapped
devices synchronization and delay instruction.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
Set the default processor version to v68 when the user does not specify
one in the command line. This includes changes in the LLVM backed and
linker (lld). Since lld normally sets the version based on inputs, this
change will only affect cases when there are no inputs.
Fixes#127558
There should be no substantial policy change here.
I reordered the breaking change docs after the major change docs, since
that flow made sense to me. I smoothed out some of the incremental
development policy wording to be a bit less black and white, and talk
about "preferred" and "discouraged" approaches.
The i6400 and i6500 are high performance multi-core microprocessors from
MIPS that provide best in class power efficiency for use in
system-on-chip (SoC) applications. i6400 and i6500 implements Release 6
of the MIPS64 Instruction Set Architecture with full hardware
multithreading and hardware virtualization support.
I initially thought starting with a more narrow definition and later
expanding would make more sense. But as pointed out in review for PR
#129220, this restriction is generating additional unnecessary work.
This patch alters the intrinsic to accept patterns of any type. Future
patches will update LoopIdiomRecognize and PreISelIntrinsicLowering to
take advantage of this. The verifier will complain if an unsized type is
used. I've additionally taken the opportunity to remove a comment from
the LangRef about some bit widths potentially not being supported by the
target. I don't think this is any more true than it is for arbitrary
width loads/stores which don't carry a similar warning that I can see.
A verifier check ensures that only sized types are used for the pattern.
The CWSR trap handler needs to save and restore the VGPRs. When dynamic
VGPRs are in use, the fixed function hardware will only allocate enough
space for one VGPR block. The rest will have to be stored in scratch, at
offset 0.
This patch allocates the necessary space by:
- generating a prologue that checks at runtime if we're on a compute
queue (since CWSR only works on compute queues); for this we will have
to check the ME_ID bits of the ID_HW_ID2 register - if that is non-zero,
we can assume we're on a compute queue and initialize the SP and FP with
enough room for the dynamic VGPRs
- forcing all compute entry functions to use a FP so they can access
their locals/spills correctly (this isn't ideal but it's the quickest to
implement)
Note that at the moment we allocate enough space for the theoretical
maximum number of VGPRs that can be allocated dynamically (for blocks of
16 registers, this will be 128, of which we subtract the first 16, which
are already allocated by the fixed function hardware). Future patches
may decide to allocate less if they can prove the shader never allocates
that many blocks.
Also note that this should not affect any reported stack sizes (e.g. PAL
backend_stack_size etc).
GenericOpcodes.td states that the number of operands are variadic.
let InOperandList = (ins type1:$src0, variable_ops);
X86 supports up to 4 inputs. The example uses 512-bit aka AVX-512 to
make it look real and show the effect of the ~many operands.
Test plan: ninja docs-llvm-html
This extension adds 10 instructions that provide hints to the interface
simulation environment.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/
This patch adds assembler only support.
This represents a hardware mode supported only for wave32 compute
shaders. When enabled, we set the `.dynamic_vgpr_en` field of
`.compute_registers` to true in the PAL metadata.
This will be changed to use an attribute after downstream consumers
have been migrated.
This extension adds twelve conditional branch instructions that use an
immediate operand for the source.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
From GFX10 onwards it is possible to employ benevolent scheduling of
waves. This patch unconditionally enables, for the `amdhsa` OS, the bit
which controls that capability, as it is beneficial for algorithms that
rely on more complex concurrent coordination and it is generally
performance neutral otherwise.
The PTX target allows an alignment to be specified on both return values
and parameters to allow for more efficient vectorized stores. Currently
we represent these parameter alignments via the "alignstack" attribute,
but must fall back to metadata for the return value. This PR allows
"alignstack" on return values as well.
This patch adds a section pointing out how permissions should be done
within Github workflows. I believe all of our workflows are currently
compliant with this, but it helps to have something to point to
documenting the practice and especially the motivation.
This is meant as a preparation for PR #130988 "[AMDGPU] Implement IR
expansion for frem instruction" which implements the expansion of
another instruction in this pass. The more general name seems more
appropriate given this change and quite reasonable even without it.
…elative
The semantics of `llvm.type.checked.load.relative` seem to be a little
different from that of `llvm.load.relative`. It looks like the semantics
for `llvm.type.checked.load.relative` is `ptr + offset + *(ptr +
offset)` whereas the semantics for `llvm.load.relative` is `ptr + *(ptr
+ offset)`. That is, the offset for the former is added to the offset
address whereas the later has the offset added to the original pointer.
It really feels like the checked intrinsic was meant to match the
semantics of the non-checked intrinsic, but I think for all cases the
checked intrinsic is used (swift being the only use I know of), the
calculation just happens to be the same because swift always uses an
offset of zero. Likewise, all llvm tests for this intrinsic happen to
use an offset of zero.
Relative vtables in clang happens to be the first time where we're using
this intrinsic and using it with non-zero values. This updates the
semantics of the checked intrinsic to match the non-checked one.
Effectively this shouldn't change any codegen by any users of this since
all current users seem to use a zero offset.
This PR also updates some tests with non-zero offsets.
The Xqcili extension includes a two instructions that load large
immediates than is available with the base RISC-V ISA.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
The grammar is `!match(str, regex)` and this operator produces 1
if the `str` matches the regular expression `regex`.
The format of `regex` is ERE (Extended POSIX Regular Expressions).
* Add multi dimensional array support
* Make maximum vector size tunable
* Make ratio of VGPRs used for vector promotion tunable
* Maximum array size now based on VGPR count (32b) instead of element count