llvm-project/mlir/lib/IR/Operation.cpp


//===- Operation.cpp - Operation support code -----------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#include "mlir/IR/Operation.h"
#include "mlir/IR/BlockAndValueMapping.h"
#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/Dialect.h"
#include "mlir/IR/OpImplementation.h"
#include "mlir/IR/PatternMatch.h"
#include "mlir/IR/TypeUtilities.h"
#include "mlir/Interfaces/FoldInterfaces.h"
#include "llvm/ADT/StringExtras.h"
#include <numeric>
using namespace mlir;
//===----------------------------------------------------------------------===//
// Operation
//===----------------------------------------------------------------------===//
/// Create a new Operation from operation state.
Operation *Operation::create(const OperationState &state) {
  return create(state.location, state.name, state.types, state.operands,
                state.attributes.getDictionary(state.getContext()),
                state.successors, state.regions);
}
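
// Illustrative example (editorial addition, not part of the upstream file):
// building an operation through OperationState and Operation::create. The op
// name "test.foo" and the surrounding `loc`, `builder`, `lhs`, and `rhs` are
// hypothetical.
//
//   OperationState state(loc, "test.foo");
//   state.addOperands({lhs, rhs});            // hypothetical operand values
//   state.addTypes({builder.getI32Type()});   // a single i32 result
//   Operation *op = Operation::create(state);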
/// Create a new Operation with the specific fields.
Operation *Operation::create(Location location, OperationName name,
                             TypeRange resultTypes, ValueRange operands,
                             NamedAttrList &&attributes, BlockRange successors,
                             RegionRange regions) {
  unsigned numRegions = regions.size();
  Operation *op = create(location, name, resultTypes, operands,
                         std::move(attributes), successors, numRegions);
  for (unsigned i = 0; i < numRegions; ++i)
    if (regions[i])
      op->getRegion(i).takeBody(*regions[i]);
  return op;
}

/// Overload of create that takes an existing DictionaryAttr to avoid
/// unnecessarily uniquing a list of attributes.
Operation *Operation::create(Location location, OperationName name,
                             TypeRange resultTypes, ValueRange operands,
                             NamedAttrList &&attributes, BlockRange successors,
                             unsigned numRegions) {
  assert(llvm::all_of(resultTypes, [](Type t) { return t; }) &&
         "unexpected null result type");

  // We only need to allocate additional memory for a subset of results.
  unsigned numTrailingResults = OpResult::getNumTrailing(resultTypes.size());
  unsigned numInlineResults = OpResult::getNumInline(resultTypes.size());
  unsigned numSuccessors = successors.size();
  unsigned numOperands = operands.size();
  unsigned numResults = resultTypes.size();

  // If the operation is known to have no operands, don't allocate an operand
  // storage.
  bool needsOperandStorage =
      operands.empty() ? !name.hasTrait<OpTrait::ZeroOperands>() : true;

  // Compute the byte size for the operation and the operand storage. This
  // takes into account the size of the operation, its trailing objects, and
  // its prefixed objects.
  size_t byteSize =
      totalSizeToAlloc<detail::OperandStorage, BlockOperand, Region, OpOperand>(
          needsOperandStorage ? 1 : 0, numSuccessors, numRegions, numOperands);
  size_t prefixByteSize = llvm::alignTo(
      Operation::prefixAllocSize(numTrailingResults, numInlineResults),
      alignof(Operation));
  char *mallocMem = reinterpret_cast<char *>(malloc(byteSize + prefixByteSize));
  void *rawMem = mallocMem + prefixByteSize;

  // Populate default attributes.
  if (Optional<RegisteredOperationName> info = name.getRegisteredInfo())
    info->populateDefaultAttrs(attributes);

  // Create the new Operation.
  Operation *op = ::new (rawMem) Operation(
      location, name, numResults, numSuccessors, numRegions,
      attributes.getDictionary(location.getContext()), needsOperandStorage);

  assert((numSuccessors == 0 || op->mightHaveTrait<OpTrait::IsTerminator>()) &&
         "unexpected successors in a non-terminator operation");

  // Initialize the results.
  auto resultTypeIt = resultTypes.begin();
  for (unsigned i = 0; i < numInlineResults; ++i, ++resultTypeIt)
    new (op->getInlineOpResult(i)) detail::InlineOpResult(*resultTypeIt, i);
  for (unsigned i = 0; i < numTrailingResults; ++i, ++resultTypeIt) {
    new (op->getOutOfLineOpResult(i))
        detail::OutOfLineOpResult(*resultTypeIt, i);
  }

  // Initialize the regions.
  for (unsigned i = 0; i != numRegions; ++i)
    new (&op->getRegion(i)) Region(op);

  // Initialize the operands.
  if (needsOperandStorage) {
    new (&op->getOperandStorage()) detail::OperandStorage(
        op, op->getTrailingObjects<OpOperand>(), operands);
  }

  // Initialize the successors.
  auto blockOperands = op->getBlockOperands();
  for (unsigned i = 0; i != numSuccessors; ++i)
    new (&blockOperands[i]) BlockOperand(op, successors[i]);

  return op;
}
Operation::Operation(Location location, OperationName name, unsigned numResults,
                     unsigned numSuccessors, unsigned numRegions,
                     DictionaryAttr attributes, bool hasOperandStorage)
    : location(location), numResults(numResults), numSuccs(numSuccessors),
      numRegions(numRegions), hasOperandStorage(hasOperandStorage), name(name),
      attrs(attributes) {
  assert(attributes && "unexpected null attribute dictionary");
#ifndef NDEBUG
  if (!getDialect() && !getContext()->allowsUnregisteredDialects())
    llvm::report_fatal_error(
        name.getStringRef() +
        " created with unregistered dialect. If this is intended, please call "
        "allowUnregisteredDialects() on the MLIRContext, or use "
        "-allow-unregistered-dialect with the MLIR tool used.");
#endif
}

// Operations are deleted through the destroy() member because they are
// allocated via malloc.
Operation::~Operation() {
  assert(block == nullptr && "operation destroyed but still in a block");
#ifndef NDEBUG
  if (!use_empty()) {
    {
      InFlightDiagnostic diag =
          emitOpError("operation destroyed but still has uses");
      for (Operation *user : getUsers())
        diag.attachNote(user->getLoc()) << "- use: " << *user << "\n";
    }
    llvm::report_fatal_error("operation destroyed but still has uses");
  }
#endif

  // Explicitly run the destructors for the operands.
  if (hasOperandStorage)
    getOperandStorage().~OperandStorage();

  // Explicitly run the destructors for the successors.
  for (auto &successor : getBlockOperands())
    successor.~BlockOperand();

  // Explicitly destroy the regions.
  for (auto &region : getRegions())
    region.~Region();
}
/// Destroy this operation or one of its subclasses.
void Operation::destroy() {
  // Operations may have additional prefixed allocation, which needs to be
  // accounted for here when computing the address to free.
  char *rawMem = reinterpret_cast<char *>(this) -
                 llvm::alignTo(prefixAllocSize(), alignof(Operation));
  this->~Operation();
  free(rawMem);
}

/// Return true if this operation is a proper ancestor of the `other`
/// operation.
bool Operation::isProperAncestor(Operation *other) {
  while ((other = other->getParentOp()))
    if (this == other)
      return true;
  return false;
}
/// Replace any uses of 'from' with 'to' within this operation.
void Operation::replaceUsesOfWith(Value from, Value to) {
  if (from == to)
    return;

  for (auto &operand : getOpOperands())
    if (operand.get() == from)
      operand.set(to);
}

/// Replace the current operands of this operation with the ones provided in
/// 'operands'.
void Operation::setOperands(ValueRange operands) {
  if (LLVM_LIKELY(hasOperandStorage))
    return getOperandStorage().setOperands(this, operands);
  assert(operands.empty() && "setting operands without an operand storage");
}

/// Replace the `length` operands starting at 'start' with the ones provided
/// in 'operands'. 'operands' may contain more or fewer values than the range
/// being replaced.
void Operation::setOperands(unsigned start, unsigned length,
                            ValueRange operands) {
  assert((start + length) <= getNumOperands() &&
         "invalid operand range specified");
  if (LLVM_LIKELY(hasOperandStorage))
    return getOperandStorage().setOperands(this, start, length, operands);
  assert(operands.empty() && "setting operands without an operand storage");
}

/// Insert the given operands into the operand list at the given 'index'.
void Operation::insertOperands(unsigned index, ValueRange operands) {
  if (LLVM_LIKELY(hasOperandStorage))
    return setOperands(index, /*length=*/0, operands);
  assert(operands.empty() && "inserting operands without an operand storage");
}
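
// Illustrative example (editorial addition): using the in-place operand
// mutation API above. `op`, `oldInput`, `newInput`, and `extra` are
// hypothetical values.
//
//   op->replaceUsesOfWith(oldInput, newInput);          // swap one operand
//   op->setOperands({newInput});                        // replace all operands
//   op->insertOperands(op->getNumOperands(), {extra});  // append at the end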
//===----------------------------------------------------------------------===//
// Diagnostics
//===----------------------------------------------------------------------===//
/// Emit an error about fatal conditions with this operation, reporting up to
/// any diagnostic handlers that may be listening.
InFlightDiagnostic Operation::emitError(const Twine &message) {
  InFlightDiagnostic diag = mlir::emitError(getLoc(), message);
  if (getContext()->shouldPrintOpOnDiagnostic()) {
    diag.attachNote(getLoc())
        .append("see current operation: ")
        .appendOp(*this, OpPrintingFlags().printGenericOpForm());
  }
  return diag;
}
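
// Illustrative example (editorial addition): emitting a diagnostic with an
// attached note from a verifier-like context. `op` and `otherLoc` are
// hypothetical.
//
//   InFlightDiagnostic diag = op->emitError("expected a single result");
//   diag.attachNote(otherLoc) << "conflicting definition is here";
//   return diag; // reported when the InFlightDiagnostic goes out of scope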
/// Emit a warning about this operation, reporting up to any diagnostic
/// handlers that may be listening.
InFlightDiagnostic Operation::emitWarning(const Twine &message) {
  InFlightDiagnostic diag = mlir::emitWarning(getLoc(), message);
  if (getContext()->shouldPrintOpOnDiagnostic())
    diag.attachNote(getLoc()) << "see current operation: " << *this;
  return diag;
}

/// Emit a remark about this operation, reporting up to any diagnostic
/// handlers that may be listening.
InFlightDiagnostic Operation::emitRemark(const Twine &message) {
  InFlightDiagnostic diag = mlir::emitRemark(getLoc(), message);
  if (getContext()->shouldPrintOpOnDiagnostic())
    diag.attachNote(getLoc()) << "see current operation: " << *this;
  return diag;
}
//===----------------------------------------------------------------------===//
// Operation Ordering
//===----------------------------------------------------------------------===//
constexpr unsigned Operation::kInvalidOrderIdx;
constexpr unsigned Operation::kOrderStride;
/// Given an operation 'other' that is within the same parent block, return
/// whether the current operation is before 'other' in the operation list
/// of the parent block.
/// Note: This function has an average complexity of O(1), but worst case may
/// take O(N) where N is the number of operations within the parent block.
bool Operation::isBeforeInBlock(Operation *other) {
  assert(block && "Operations without parent blocks have no order.");
  assert(other && other->block == block &&
         "Expected other operation to have the same parent block.");
  // If the order of the block is already invalid, directly recompute the
  // order of the parent block.
  if (!block->isOpOrderValid()) {
    block->recomputeOpOrder();
  } else {
    // Update the order of either operation if necessary.
    updateOrderIfNecessary();
    other->updateOrderIfNecessary();
  }

  return orderIndex < other->orderIndex;
}
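
// Illustrative example (editorial addition): querying the relative order of
// two operations in the same block. `a` and `b` are hypothetical operations
// with the same parent block.
//
//   if (a->isBeforeInBlock(b)) {
//     // `a` appears before `b`. The first query may recompute the block's
//     // order indices; subsequent queries are O(1) on average.
//   }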
/// Update the order index of this operation if necessary, potentially
/// recomputing the order of the parent block.
void Operation::updateOrderIfNecessary() {
  assert(block && "expected valid parent");

  // If the order is valid for this operation there is nothing to do.
  if (hasValidOrder())
    return;
  Operation *blockFront = &block->front();
  Operation *blockBack = &block->back();

  // This method is expected to only be invoked on blocks with more than one
  // operation.
  assert(blockFront != blockBack && "expected more than one operation");

  // If the operation is at the end of the block.
  if (this == blockBack) {
    Operation *prevNode = getPrevNode();
    if (!prevNode->hasValidOrder())
      return block->recomputeOpOrder();

    // Add the stride to the previous operation.
    orderIndex = prevNode->orderIndex + kOrderStride;
    return;
  }

  // If this is the first operation, try to use the next operation to compute
  // the ordering.
  if (this == blockFront) {
    Operation *nextNode = getNextNode();
    if (!nextNode->hasValidOrder())
      return block->recomputeOpOrder();
    // There is no order to give this operation.
    if (nextNode->orderIndex == 0)
      return block->recomputeOpOrder();

    // If we can't use the stride, just take the middle value left. This is
    // safe because we know there is at least one valid index to assign to.
    if (nextNode->orderIndex <= kOrderStride)
      orderIndex = (nextNode->orderIndex / 2);
    else
      orderIndex = kOrderStride;
    return;
  }

  // Otherwise, this operation is between two others. Place this operation in
  // the middle of the previous and next if possible.
  Operation *prevNode = getPrevNode(), *nextNode = getNextNode();
  if (!prevNode->hasValidOrder() || !nextNode->hasValidOrder())
    return block->recomputeOpOrder();
  unsigned prevOrder = prevNode->orderIndex, nextOrder = nextNode->orderIndex;

  // Check to see if there is a valid order between the two.
  if (prevOrder + 1 == nextOrder)
    return block->recomputeOpOrder();
  orderIndex = prevOrder + ((nextOrder - prevOrder) / 2);
}
//===----------------------------------------------------------------------===//
// ilist_traits for Operation
//===----------------------------------------------------------------------===//
auto llvm::ilist_detail::SpecificNodeAccess<
    typename llvm::ilist_detail::compute_node_options<
        ::mlir::Operation>::type>::getNodePtr(pointer n) -> node_type * {
  return NodeAccess::getNodePtr<OptionsT>(n);
}

auto llvm::ilist_detail::SpecificNodeAccess<
    typename llvm::ilist_detail::compute_node_options<
        ::mlir::Operation>::type>::getNodePtr(const_pointer n)
    -> const node_type * {
  return NodeAccess::getNodePtr<OptionsT>(n);
}

auto llvm::ilist_detail::SpecificNodeAccess<
    typename llvm::ilist_detail::compute_node_options<
        ::mlir::Operation>::type>::getValuePtr(node_type *n) -> pointer {
  return NodeAccess::getValuePtr<OptionsT>(n);
}

auto llvm::ilist_detail::SpecificNodeAccess<
    typename llvm::ilist_detail::compute_node_options<
        ::mlir::Operation>::type>::getValuePtr(const node_type *n)
    -> const_pointer {
  return NodeAccess::getValuePtr<OptionsT>(n);
}

void llvm::ilist_traits<::mlir::Operation>::deleteNode(Operation *op) {
  op->destroy();
}

Block *llvm::ilist_traits<::mlir::Operation>::getContainingBlock() {
  size_t offset(size_t(&((Block *)nullptr->*Block::getSublistAccess(nullptr))));
  iplist<Operation> *anchor(static_cast<iplist<Operation> *>(this));
  return reinterpret_cast<Block *>(reinterpret_cast<char *>(anchor) - offset);
}

/// This is a trait method invoked when an operation is added to a block. We
/// keep the block pointer up to date.
void llvm::ilist_traits<::mlir::Operation>::addNodeToList(Operation *op) {
  assert(!op->getBlock() && "already in an operation block!");
  op->block = getContainingBlock();

  // Invalidate the order on the operation.
  op->orderIndex = Operation::kInvalidOrderIdx;
}

/// This is a trait method invoked when an operation is removed from a block.
/// We keep the block pointer up to date.
void llvm::ilist_traits<::mlir::Operation>::removeNodeFromList(Operation *op) {
  assert(op->block && "not already in an operation block!");
  op->block = nullptr;
}

/// This is a trait method invoked when an operation is moved from one block
/// to another. We keep the block pointer up to date.
void llvm::ilist_traits<::mlir::Operation>::transferNodesFromList(
    ilist_traits<Operation> &otherList, op_iterator first, op_iterator last) {
  Block *curParent = getContainingBlock();

  // Invalidate the ordering of the parent block.
  curParent->invalidateOpOrder();

  // If we are transferring operations within the same block, the block
  // pointer doesn't need to be updated.
  if (curParent == otherList.getContainingBlock())
    return;

  // Update the 'block' member of each operation.
  for (; first != last; ++first)
    first->block = curParent;
}
/// Remove this operation (and its descendants) from its Block and delete
/// all of them.
void Operation::erase() {
  if (auto *parent = getBlock())
    parent->getOperations().erase(this);
  else
    destroy();
}
/// Remove the operation from its parent block, but don't delete it.
void Operation::remove() {
  if (Block *parent = getBlock())
    parent->getOperations().remove(this);
}
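
// Illustrative example (editorial addition): erase() vs. remove(). `op` is a
// hypothetical operation whose results have no remaining uses; a caller would
// use one of the two, not both.
//
//   op->remove(); // (a) unlink only: the caller takes ownership and must
//                 //     re-insert `op` or call op->destroy() later.
//   op->erase();  // (b) unlink (if linked) and delete `op` and its regions.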
/// Unlink this operation from its current block and insert it right before
/// `existingOp` which may be in the same or another block in the same
/// function.
void Operation::moveBefore(Operation *existingOp) {
  moveBefore(existingOp->getBlock(), existingOp->getIterator());
}

/// Unlink this operation from its current basic block and insert it right
/// before `iterator` in the specified basic block.
void Operation::moveBefore(Block *block,
                           llvm::iplist<Operation>::iterator iterator) {
  block->getOperations().splice(iterator, getBlock()->getOperations(),
                                getIterator());
}

/// Unlink this operation from its current block and insert it right after
/// `existingOp` which may be in the same or another block in the same function.
void Operation::moveAfter(Operation *existingOp) {
  moveAfter(existingOp->getBlock(), existingOp->getIterator());
}

/// Unlink this operation from its current block and insert it right after
/// `iterator` in the specified block.
void Operation::moveAfter(Block *block,
                          llvm::iplist<Operation>::iterator iterator) {
  assert(iterator != block->end() && "cannot move after end of block");
  moveBefore(block, std::next(iterator));
}
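
// Illustrative example (editorial addition): repositioning an operation
// relative to another one. `producer` and `consumer` are hypothetical
// operations that may live in different blocks.
//
//   producer->moveBefore(consumer); // unlink and re-insert before `consumer`
//   consumer->moveAfter(producer);  // the same placement, expressed the other
//                                   // way around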
/// This drops all operand uses from this operation, which is an essential
/// step in breaking cyclic dependences between references when they are to
/// be deleted.
void Operation::dropAllReferences() {
  for (auto &op : getOpOperands())
    op.drop();

  for (auto &region : getRegions())
    region.dropAllReferences();

  for (auto &dest : getBlockOperands())
    dest.drop();
}

/// This drops all uses of any values defined by this operation or its nested
/// regions, wherever they are located.
void Operation::dropAllDefinedValueUses() {
  dropAllUses();

  for (auto &region : getRegions())
    for (auto &block : region)
      block.dropAllDefinedValueUses();
}

void Operation::setSuccessor(Block *block, unsigned index) {
  assert(index < getNumSuccessors());
  getBlockOperands()[index].set(block);
}

/// Attempt to fold this operation using the Op's registered foldHook.
LogicalResult Operation::fold(ArrayRef<Attribute> operands,
                              SmallVectorImpl<OpFoldResult> &results) {
  // If we have a registered operation definition matching this one, use it to
  // try to constant fold the operation.
  Optional<RegisteredOperationName> info = getRegisteredInfo();
  if (info && succeeded(info->foldHook(this, operands, results)))
    return success();

  // Otherwise, fall back on the dialect hook to handle it.
  Dialect *dialect = getDialect();
  if (!dialect)
    return failure();

  auto *interface = dyn_cast<DialectFoldInterface>(dialect);
  if (!interface)
    return failure();

  return interface->fold(this, operands, results);
}
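
// Illustrative example (editorial addition): invoking fold() from a driver.
// `op` is hypothetical; null Attribute entries stand for operands that are not
// known to be constant, and real constant attributes would normally come from
// a folding driver or matchPattern.
//
//   SmallVector<Attribute> constOperands(op->getNumOperands(), Attribute());
//   SmallVector<OpFoldResult> results;
//   if (succeeded(op->fold(constOperands, results))) {
//     // Each entry of `results` is either an Attribute (a constant result)
//     // or a Value to use in place of the corresponding result.
//   }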
/// Emit an error with the op name prefixed, like "'dim' op " which is
/// convenient for verifiers.
InFlightDiagnostic Operation::emitOpError(const Twine &message) {
  return emitError() << "'" << getName() << "' op " << message;
}
//===----------------------------------------------------------------------===//
// Operation Cloning
//===----------------------------------------------------------------------===//
Operation::CloneOptions::CloneOptions()
    : cloneRegionsFlag(false), cloneOperandsFlag(false) {}
Operation::CloneOptions::CloneOptions(bool cloneRegions, bool cloneOperands)
    : cloneRegionsFlag(cloneRegions), cloneOperandsFlag(cloneOperands) {}
Operation::CloneOptions Operation::CloneOptions::all() {
  return CloneOptions().cloneRegions().cloneOperands();
}
Operation::CloneOptions &Operation::CloneOptions::cloneRegions(bool enable) {
  cloneRegionsFlag = enable;
  return *this;
}
Operation::CloneOptions &Operation::CloneOptions::cloneOperands(bool enable) {
  cloneOperandsFlag = enable;
  return *this;
}
/// Create a deep copy of this operation but keep the operation regions empty.
/// Operands are remapped using `mapper` (if present), and `mapper` is updated
/// to map the original results to the results of the cloned operation.
Operation *Operation::cloneWithoutRegions(BlockAndValueMapping &mapper) {
  return clone(mapper, CloneOptions::all().cloneRegions(false));
}

Operation *Operation::cloneWithoutRegions() {
  BlockAndValueMapping mapper;
  return cloneWithoutRegions(mapper);
}

/// Create a deep copy of this operation, remapping any operands that use
/// values outside of the operation using the map that is provided (leaving
/// them alone if no entry is present). Replaces references to cloned
/// sub-operations with the corresponding operations that are copied, and adds
/// those mappings to the map.
Operation *Operation::clone(BlockAndValueMapping &mapper,
                            CloneOptions options) {
  SmallVector<Value, 8> operands;
  SmallVector<Block *, 2> successors;

  // Remap the operands.
  if (options.shouldCloneOperands()) {
    operands.reserve(getNumOperands());
    for (auto opValue : getOperands())
      operands.push_back(mapper.lookupOrDefault(opValue));
  }

  // Remap the successors.
  successors.reserve(getNumSuccessors());
  for (Block *successor : getSuccessors())
    successors.push_back(mapper.lookupOrDefault(successor));

  // Create the new operation.
  auto *newOp = create(getLoc(), getName(), getResultTypes(), operands, attrs,
                       successors, getNumRegions());

  // Clone the regions.
  if (options.shouldCloneRegions()) {
    for (unsigned i = 0; i != numRegions; ++i)
      getRegion(i).cloneInto(&newOp->getRegion(i), mapper);
  }
// Remember the mapping of any results.
for (unsigned i = 0, e = getNumResults(); i != e; ++i)
mapper.map(getResult(i), newOp->getResult(i));
return newOp;
}
Operation *Operation::clone(CloneOptions options) {
BlockAndValueMapping mapper;
return clone(mapper, options);
}
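// Illustrative usage sketch for clone(CloneOptions); the builder-style
// CloneOptions helpers shown here are assumed from the revision that
// introduced this overload:
//
//   Operation *deepCopy = op->clone();   // operands, successors, and regions
//
//   // Keep operands and successors but skip the region bodies; this is what
//   // cloneWithoutRegions() forwards to.
//   Operation *shallowCopy =
//       op->clone(Operation::CloneOptions::all().cloneRegions(false));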
//===----------------------------------------------------------------------===//
// OpState trait class.
//===----------------------------------------------------------------------===//
// The fallback for the parser is to try for a dialect operation parser.
// Otherwise, reject the custom assembly form.
ParseResult OpState::parse(OpAsmParser &parser, OperationState &result) {
if (auto parseFn = result.name.getDialect()->getParseOperationHook(
result.name.getStringRef()))
return (*parseFn)(parser, result);
return parser.emitError(parser.getNameLoc(), "has no custom assembly form");
}
// The fallback for the printer is to try for a dialect operation printer.
// Otherwise, it prints the generic form.
void OpState::print(Operation *op, OpAsmPrinter &p, StringRef defaultDialect) {
if (auto printFn = op->getDialect()->getOperationPrinter(op)) {
printOpName(op, p, defaultDialect);
printFn(op, p);
} else {
p.printGenericOp(op);
}
}
/// Print an operation name, eliding the dialect prefix when the operation is
/// in the default dialect and eliding it doesn't lead to ambiguities.
void OpState::printOpName(Operation *op, OpAsmPrinter &p,
StringRef defaultDialect) {
StringRef name = op->getName().getStringRef();
if (name.startswith((defaultDialect + ".").str()) && name.count('.') == 1)
name = name.drop_front(defaultDialect.size() + 1);
p.getStream() << name;
}
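// For illustration, with a hypothetical defaultDialect of "arith":
//   "arith.addi"    prints as "addi"  (prefix elided; exactly one '.').
//   "arith.foo.bar" keeps its prefix, since eliding it would be ambiguous.
//   "memref.load"   keeps its prefix, since it is not in the default dialect.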
/// Emit an error about fatal conditions with this operation, reporting up to
/// any diagnostic handlers that may be listening.
InFlightDiagnostic OpState::emitError(const Twine &message) {
return getOperation()->emitError(message);
}
/// Emit an error with the op name prefixed, like "'dim' op " which is
/// convenient for verifiers.
InFlightDiagnostic OpState::emitOpError(const Twine &message) {
return getOperation()->emitOpError(message);
}
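// Illustrative sketch for emitOpError: inside the verifier of a hypothetical
// op, `return emitOpError("expected ") << n << " inputs";` (with a placeholder
// count `n`) produces a diagnostic prefixed with the op name, e.g.
// "'dim' op expected 2 inputs".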
/// Emit a warning about this operation, reporting up to any diagnostic
/// handlers that may be listening.
InFlightDiagnostic OpState::emitWarning(const Twine &message) {
return getOperation()->emitWarning(message);
}
/// Emit a remark about this operation, reporting up to any diagnostic
/// handlers that may be listening.
InFlightDiagnostic OpState::emitRemark(const Twine &message) {
return getOperation()->emitRemark(message);
}
//===----------------------------------------------------------------------===//
// Op Trait implementations
//===----------------------------------------------------------------------===//
OpFoldResult OpTrait::impl::foldIdempotent(Operation *op) {
if (op->getNumOperands() == 1) {
auto *argumentOp = op->getOperand(0).getDefiningOp();
if (argumentOp && op->getName() == argumentOp->getName()) {
// Replace the outer operation's output with the inner operation's result.
return op->getOperand(0);
}
} else if (op->getOperand(0) == op->getOperand(1)) {
return op->getOperand(0);
}
return {};
}
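// Illustrative IR sketch with a hypothetical idempotent op `foo.normalize`
// (f(f(x)) == f(x)):
//
//   %1 = foo.normalize %0 : i32
//   %2 = foo.normalize %1 : i32   // folds to %1
//
// The two-operand branch covers idempotent binary ops: a hypothetical
// `foo.or %x, %x` folds to %x.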
OpFoldResult OpTrait::impl::foldInvolution(Operation *op) {
auto *argumentOp = op->getOperand(0).getDefiningOp();
if (argumentOp && op->getName() == argumentOp->getName()) {
// Replace the outer involution's output with the inner operation's input.
return argumentOp->getOperand(0);
}
return {};
}
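// Illustrative IR sketch with a hypothetical involution `foo.not`
// (f(f(x)) == x):
//
//   %1 = foo.not %0 : i1
//   %2 = foo.not %1 : i1   // folds to %0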
LogicalResult OpTrait::impl::verifyZeroOperands(Operation *op) {
if (op->getNumOperands() != 0)
return op->emitOpError() << "requires zero operands";
return success();
}
LogicalResult OpTrait::impl::verifyOneOperand(Operation *op) {
if (op->getNumOperands() != 1)
return op->emitOpError() << "requires a single operand";
return success();
}
LogicalResult OpTrait::impl::verifyNOperands(Operation *op,
unsigned numOperands) {
if (op->getNumOperands() != numOperands) {
return op->emitOpError() << "expected " << numOperands
<< " operands, but found " << op->getNumOperands();
}
return success();
}
LogicalResult OpTrait::impl::verifyAtLeastNOperands(Operation *op,
unsigned numOperands) {
if (op->getNumOperands() < numOperands)
return op->emitOpError()
<< "expected " << numOperands << " or more operands, but found "
<< op->getNumOperands();
return success();
}
/// If this is a vector type, or a tensor type, return the scalar element type
/// that it is built around, otherwise return the type unmodified.
static Type getTensorOrVectorElementType(Type type) {
if (auto vec = type.dyn_cast<VectorType>())
return vec.getElementType();
// Look through tensor<vector<...>> to find the underlying element type.
if (auto tensor = type.dyn_cast<TensorType>())
return getTensorOrVectorElementType(tensor.getElementType());
return type;
}
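// For illustration:
//   vector<4xf32>            -> f32
//   tensor<8xvector<4xi32>>  -> i32    (looks through the nested vector)
//   index                    -> index  (returned unmodified)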
LogicalResult OpTrait::impl::verifyIsIdempotent(Operation *op) {
// FIXME: Add back check for no side effects on operation.
// Currently adding it would cause the shared library build
// to fail since there would be a dependency of IR on SideEffectInterfaces
// which is cyclical.
return success();
}
LogicalResult OpTrait::impl::verifyIsInvolution(Operation *op) {
// FIXME: Add back check for no side effects on operation.
// Currently adding it would cause the shared library build
// to fail since there would be a dependency of IR on SideEffectInterfaces
// which is cyclical.
return success();
}
LogicalResult
OpTrait::impl::verifyOperandsAreSignlessIntegerLike(Operation *op) {
for (auto opType : op->getOperandTypes()) {
auto type = getTensorOrVectorElementType(opType);
if (!type.isSignlessIntOrIndex())
return op->emitOpError() << "requires an integer or index type";
}
return success();
}
LogicalResult OpTrait::impl::verifyOperandsAreFloatLike(Operation *op) {
for (auto opType : op->getOperandTypes()) {
auto type = getTensorOrVectorElementType(opType);
if (!type.isa<FloatType>())
return op->emitOpError("requires a float type");
}
return success();
}
LogicalResult OpTrait::impl::verifySameTypeOperands(Operation *op) {
// Zero or one operand always has the "same" type.
unsigned nOperands = op->getNumOperands();
if (nOperands < 2)
return success();
auto type = op->getOperand(0).getType();
for (auto opType : llvm::drop_begin(op->getOperandTypes(), 1))
if (opType != type)
return op->emitOpError() << "requires all operands to have the same type";
return success();
}
LogicalResult OpTrait::impl::verifyZeroRegions(Operation *op) {
if (op->getNumRegions() != 0)
return op->emitOpError() << "requires zero regions";
return success();
}
LogicalResult OpTrait::impl::verifyOneRegion(Operation *op) {
if (op->getNumRegions() != 1)
return op->emitOpError() << "requires one region";
return success();
}
LogicalResult OpTrait::impl::verifyNRegions(Operation *op,
unsigned numRegions) {
if (op->getNumRegions() != numRegions)
return op->emitOpError() << "expected " << numRegions << " regions";
return success();
}
LogicalResult OpTrait::impl::verifyAtLeastNRegions(Operation *op,
unsigned numRegions) {
if (op->getNumRegions() < numRegions)
return op->emitOpError() << "expected " << numRegions << " or more regions";
return success();
}
LogicalResult OpTrait::impl::verifyZeroResults(Operation *op) {
if (op->getNumResults() != 0)
return op->emitOpError() << "requires zero results";
return success();
}
LogicalResult OpTrait::impl::verifyOneResult(Operation *op) {
if (op->getNumResults() != 1)
return op->emitOpError() << "requires one result";
return success();
}
LogicalResult OpTrait::impl::verifyNResults(Operation *op,
unsigned numOperands) {
if (op->getNumResults() != numOperands)
return op->emitOpError() << "expected " << numOperands << " results";
return success();
}
LogicalResult OpTrait::impl::verifyAtLeastNResults(Operation *op,
unsigned numOperands) {
if (op->getNumResults() < numOperands)
return op->emitOpError()
<< "expected " << numOperands << " or more results";
return success();
}
LogicalResult OpTrait::impl::verifySameOperandsShape(Operation *op) {
if (failed(verifyAtLeastNOperands(op, 1)))
return failure();
if (failed(verifyCompatibleShapes(op->getOperandTypes())))
return op->emitOpError() << "requires the same shape for all operands";
return success();
}
LogicalResult OpTrait::impl::verifySameOperandsAndResultShape(Operation *op) {
if (failed(verifyAtLeastNOperands(op, 1)) ||
failed(verifyAtLeastNResults(op, 1)))
return failure();
SmallVector<Type, 8> types(op->getOperandTypes());
types.append(llvm::to_vector<4>(op->getResultTypes()));
if (failed(verifyCompatibleShapes(types)))
return op->emitOpError()
<< "requires the same shape for all operands and results";
return success();
}
LogicalResult OpTrait::impl::verifySameOperandsElementType(Operation *op) {
if (failed(verifyAtLeastNOperands(op, 1)))
return failure();
auto elementType = getElementTypeOrSelf(op->getOperand(0));
for (auto operand : llvm::drop_begin(op->getOperands(), 1)) {
if (getElementTypeOrSelf(operand) != elementType)
return op->emitOpError("requires the same element type for all operands");
}
return success();
}
LogicalResult
OpTrait::impl::verifySameOperandsAndResultElementType(Operation *op) {
if (failed(verifyAtLeastNOperands(op, 1)) ||
failed(verifyAtLeastNResults(op, 1)))
return failure();
auto elementType = getElementTypeOrSelf(op->getResult(0));
// Verify that each result's element type matches the first result's.
for (auto result : llvm::drop_begin(op->getResults(), 1)) {
if (getElementTypeOrSelf(result) != elementType)
return op->emitOpError(
"requires the same element type for all operands and results");
}
// Verify that each operand's element type matches the first result's.
for (auto operand : op->getOperands()) {
if (getElementTypeOrSelf(operand) != elementType)
return op->emitOpError(
"requires the same element type for all operands and results");
}
return success();
}
LogicalResult OpTrait::impl::verifySameOperandsAndResultType(Operation *op) {
if (failed(verifyAtLeastNOperands(op, 1)) ||
failed(verifyAtLeastNResults(op, 1)))
return failure();
auto type = op->getResult(0).getType();
auto elementType = getElementTypeOrSelf(type);
Attribute encoding = nullptr;
if (auto rankedType = dyn_cast<RankedTensorType>(type))
encoding = rankedType.getEncoding();
for (auto resultType : llvm::drop_begin(op->getResultTypes())) {
if (getElementTypeOrSelf(resultType) != elementType ||
failed(verifyCompatibleShape(resultType, type)))
return op->emitOpError()
<< "requires the same type for all operands and results";
if (encoding)
if (auto rankedType = dyn_cast<RankedTensorType>(resultType);
encoding != rankedType.getEncoding())
return op->emitOpError()
<< "requires the same encoding for all operands and results";
}
for (auto opType : op->getOperandTypes()) {
if (getElementTypeOrSelf(opType) != elementType ||
failed(verifyCompatibleShape(opType, type)))
return op->emitOpError()
<< "requires the same type for all operands and results";
if (encoding)
if (auto rankedType = dyn_cast<RankedTensorType>(opType);
encoding != rankedType.getEncoding())
return op->emitOpError()
<< "requires the same encoding for all operands and results";
}
return success();
}
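// Illustrative sketch: with a hypothetical tensor encoding attribute #enc,
// an op carrying this trait verifies when all operands and results are, say,
// tensor<4xf32, #enc>; mixing tensor<4xf32> and tensor<4xf32, #enc> trips the
// "same encoding" error above, and differing shapes or element types trip the
// "same type" error.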
LogicalResult OpTrait::impl::verifyIsTerminator(Operation *op) {
Block *block = op->getBlock();
// Verify that the operation is at the end of the respective parent block.
if (!block || &block->back() != op)
return op->emitOpError("must be the last operation in the parent block");
return success();
}
static LogicalResult verifyTerminatorSuccessors(Operation *op) {
auto *parent = op->getParentRegion();
// Verify that each successor block is in the same region as this terminator.
for (Block *succ : op->getSuccessors())
if (succ->getParent() != parent)
return op->emitError("reference to block defined in another region");
return success();
}
LogicalResult OpTrait::impl::verifyZeroSuccessors(Operation *op) {
if (op->getNumSuccessors() != 0) {
return op->emitOpError("requires 0 successors but found ")
<< op->getNumSuccessors();
}
return success();
}
LogicalResult OpTrait::impl::verifyOneSuccessor(Operation *op) {
if (op->getNumSuccessors() != 1) {
return op->emitOpError("requires 1 successor but found ")
<< op->getNumSuccessors();
}
return verifyTerminatorSuccessors(op);
}
LogicalResult OpTrait::impl::verifyNSuccessors(Operation *op,
unsigned numSuccessors) {
if (op->getNumSuccessors() != numSuccessors) {
return op->emitOpError("requires ")
<< numSuccessors << " successors but found "
<< op->getNumSuccessors();
}
return verifyTerminatorSuccessors(op);
}
LogicalResult OpTrait::impl::verifyAtLeastNSuccessors(Operation *op,
unsigned numSuccessors) {
if (op->getNumSuccessors() < numSuccessors) {
return op->emitOpError("requires at least ")
<< numSuccessors << " successors but found "
<< op->getNumSuccessors();
}
return verifyTerminatorSuccessors(op);
}
LogicalResult OpTrait::impl::verifyResultsAreBoolLike(Operation *op) {
for (auto resultType : op->getResultTypes()) {
auto elementType = getTensorOrVectorElementType(resultType);
bool isBoolType = elementType.isInteger(1);
if (!isBoolType)
return op->emitOpError() << "requires a bool result type";
}
return success();
}
LogicalResult OpTrait::impl::verifyResultsAreFloatLike(Operation *op) {
for (auto resultType : op->getResultTypes())
if (!getTensorOrVectorElementType(resultType).isa<FloatType>())
return op->emitOpError() << "requires a floating point type";
return success();
}
LogicalResult
OpTrait::impl::verifyResultsAreSignlessIntegerLike(Operation *op) {
for (auto resultType : op->getResultTypes())
if (!getTensorOrVectorElementType(resultType).isSignlessIntOrIndex())
return op->emitOpError() << "requires an integer or index type";
return success();
}
LogicalResult OpTrait::impl::verifyValueSizeAttr(Operation *op,
StringRef attrName,
StringRef valueGroupName,
size_t expectedCount) {
auto sizeAttr = op->getAttrOfType<DenseI32ArrayAttr>(attrName);
if (!sizeAttr)
return op->emitOpError("requires dense i32 array attribute '")
<< attrName << "'";
ArrayRef<int32_t> sizes = sizeAttr.asArrayRef();
if (llvm::any_of(sizes, [](int32_t element) { return element < 0; }))
return op->emitOpError("'")
<< attrName << "' attribute cannot have negative elements";
size_t totalCount =
std::accumulate(sizes.begin(), sizes.end(), 0,
[](unsigned all, int32_t one) { return all + one; });
if (totalCount != expectedCount)
return op->emitOpError()
<< valueGroupName << " count (" << expectedCount
<< ") does not match with the total size (" << totalCount
<< ") specified in attribute '" << attrName << "'";
return success();
}
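// Illustrative sketch: with an attribute value of [2, 0, 1] for the segment
// size attribute (e.g. the ODS-generated operand segment sizes attribute,
// whose exact name depends on the MLIR version), the op must have exactly
// 2 + 0 + 1 = 3 operands; negative entries or a mismatched total produce the
// errors above.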
LogicalResult OpTrait::impl::verifyOperandSizeAttr(Operation *op,
StringRef attrName) {
return verifyValueSizeAttr(op, attrName, "operand", op->getNumOperands());
}
LogicalResult OpTrait::impl::verifyResultSizeAttr(Operation *op,
StringRef attrName) {
return verifyValueSizeAttr(op, attrName, "result", op->getNumResults());
}
LogicalResult OpTrait::impl::verifyNoRegionArguments(Operation *op) {
for (Region &region : op->getRegions()) {
if (region.empty())
continue;
if (region.getNumArguments() != 0) {
if (op->getNumRegions() > 1)
return op->emitOpError("region #")
<< region.getRegionNumber() << " should have no arguments";
return op->emitOpError("region should have no arguments");
}
}
return success();
}
LogicalResult OpTrait::impl::verifyElementwise(Operation *op) {
auto isMappableType = [](Type type) {
return type.isa<VectorType, TensorType>();
};
auto resultMappableTypes = llvm::to_vector<1>(
llvm::make_filter_range(op->getResultTypes(), isMappableType));
auto operandMappableTypes = llvm::to_vector<2>(
llvm::make_filter_range(op->getOperandTypes(), isMappableType));
// If the op only has scalar operand/result types, then we have nothing to
// check.
if (resultMappableTypes.empty() && operandMappableTypes.empty())
return success();
if (!resultMappableTypes.empty() && operandMappableTypes.empty())
return op->emitOpError("if a result is non-scalar, then at least one "
"operand must be non-scalar");
assert(!operandMappableTypes.empty());
if (resultMappableTypes.empty())
return op->emitOpError("if an operand is non-scalar, then there must be at "
"least one non-scalar result");
if (resultMappableTypes.size() != op->getNumResults())
return op->emitOpError(
"if an operand is non-scalar, then all results must be non-scalar");
SmallVector<Type, 4> types = llvm::to_vector<2>(
llvm::concat<Type>(operandMappableTypes, resultMappableTypes));
TypeID expectedBaseTy = types.front().getTypeID();
if (!llvm::all_of(types,
[&](Type t) { return t.getTypeID() == expectedBaseTy; }) ||
failed(verifyCompatibleShapes(types))) {
return op->emitOpError() << "all non-scalar operands/results must have the "
"same shape and base type";
}
return success();
}
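// Illustrative sketch with a hypothetical elementwise op `foo.negate`:
//
//   %r = foo.negate %t : tensor<4xf32>   // non-scalar operand and result: OK
//   %s = foo.negate %x : f32             // all scalar: nothing to check
//
// A non-scalar operand paired with only scalar results (or the reverse), or
// mismatched shapes/container kinds, is rejected by the checks above.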
/// Check that no operation within the regions attached to the specified
/// "IsIsolatedFromAbove" operation uses a value defined outside of it.
LogicalResult OpTrait::impl::verifyIsIsolatedFromAbove(Operation *isolatedOp) {
assert(isolatedOp->hasTrait<OpTrait::IsIsolatedFromAbove>() &&
"Intended to check IsolatedFromAbove ops");
// List of regions to analyze. Each region is processed independently, with
// respect to the common `limit` region, so we can look at them in any order.
// Therefore, use a simple vector and push/pop back the current region.
SmallVector<Region *, 8> pendingRegions;
for (auto &region : isolatedOp->getRegions()) {
pendingRegions.push_back(&region);
// Traverse all operations in the region.
while (!pendingRegions.empty()) {
for (Operation &op : pendingRegions.pop_back_val()->getOps()) {
for (Value operand : op.getOperands()) {
// Check that every value used by an operation is defined in the same
// region, either as an operation result or as a block argument.
auto *operandRegion = operand.getParentRegion();
if (!operandRegion)
return op.emitError("operation's operand is unlinked");
if (!region.isAncestor(operandRegion)) {
return op.emitOpError("using value defined outside the region")
.attachNote(isolatedOp->getLoc())
<< "required by region isolation constraints";
}
}
// Schedule any regions in the operation for further checking. Don't
// recurse into other IsolatedFromAbove ops, because they will check
// themselves.
if (op.getNumRegions() &&
!op.hasTrait<OpTrait::IsIsolatedFromAbove>()) {
for (Region &subRegion : op.getRegions())
pendingRegions.push_back(&subRegion);
}
}
}
}
return success();
}
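// Illustrative IR sketch with a hypothetical IsolatedFromAbove op
// `foo.kernel`; the use of %outside inside its region is exactly what this
// verifier rejects:
//
//   %outside = arith.constant 0 : i32
//   "foo.kernel"() ({
//     "foo.use"(%outside) : (i32) -> ()   // error: using value defined
//     "foo.terminator"() : () -> ()       // outside the region
//   }) : () -> ()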
bool OpTrait::hasElementwiseMappableTraits(Operation *op) {
return op->hasTrait<Elementwise>() && op->hasTrait<Scalarizable>() &&
op->hasTrait<Vectorizable>() && op->hasTrait<Tensorizable>();
}
//===----------------------------------------------------------------------===//
// CastOpInterface
//===----------------------------------------------------------------------===//
/// Attempt to fold the given cast operation.
LogicalResult
impl::foldCastInterfaceOp(Operation *op, ArrayRef<Attribute> attrOperands,
SmallVectorImpl<OpFoldResult> &foldResults) {
OperandRange operands = op->getOperands();
if (operands.empty())
return failure();
ResultRange results = op->getResults();
// Check for the case where the input and output types match 1-1.
if (operands.getTypes() == results.getTypes()) {
foldResults.append(operands.begin(), operands.end());
return success();
}
return failure();
}
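// Illustrative sketch: a hypothetical `foo.cast %v : i32 to i32`, whose
// operand and result types already match 1-1, folds to %v; a real conversion
// such as `foo.cast %v : i32 to i64` is left alone (the fold fails).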
/// Attempt to verify the given cast operation.
LogicalResult impl::verifyCastInterfaceOp(
Operation *op, function_ref<bool(TypeRange, TypeRange)> areCastCompatible) {
auto resultTypes = op->getResultTypes();
if (resultTypes.empty())
return op->emitOpError()
<< "expected at least one result for cast operation";
auto operandTypes = op->getOperandTypes();
if (!areCastCompatible(operandTypes, resultTypes)) {
InFlightDiagnostic diag = op->emitOpError("operand type");
if (operandTypes.empty())
diag << "s []";
else if (llvm::size(operandTypes) == 1)
diag << " " << *operandTypes.begin();
else
diag << "s " << operandTypes;
return diag << " and result type" << (resultTypes.size() == 1 ? " " : "s ")
<< resultTypes << " are cast incompatible";
}
return success();
}
//===----------------------------------------------------------------------===//
// Misc. utils
//===----------------------------------------------------------------------===//
/// Insert an operation, generated by `buildTerminatorOp`, at the end of the
/// region's only block if it does not have a terminator already. If the region
/// is empty, insert a new block first. `buildTerminatorOp` should return the
/// terminator operation to insert.
void impl::ensureRegionTerminator(
Region &region, OpBuilder &builder, Location loc,
function_ref<Operation *(OpBuilder &, Location)> buildTerminatorOp) {
OpBuilder::InsertionGuard guard(builder);
if (region.empty())
builder.createBlock(&region);
Block &block = region.back();
if (!block.empty() && block.back().hasTrait<OpTrait::IsTerminator>())
return;
builder.setInsertionPointToEnd(&block);
builder.insert(buildTerminatorOp(builder, loc));
}
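// Illustrative usage sketch (with a hypothetical terminator `foo::YieldOp`):
//
//   impl::ensureRegionTerminator(
//       region, builder, loc, [](OpBuilder &b, Location loc) {
//         return b.create<foo::YieldOp>(loc).getOperation();
//       });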
/// Create a simple OpBuilder and forward to the OpBuilder version of this
/// function.
void impl::ensureRegionTerminator(
Region &region, Builder &builder, Location loc,
function_ref<Operation *(OpBuilder &, Location)> buildTerminatorOp) {
OpBuilder opBuilder(builder.getContext());
ensureRegionTerminator(region, opBuilder, loc, buildTerminatorOp);
}