TPU Dialect 
Refer to the official documentation for more details.
Reactant.MLIR.Dialects.tpu.barrier Method
barrier
Performs barrier synchronization across all SC vector subcores at the specified barrier id.
Reactant.MLIR.Dialects.tpu.broadcast_in_sublanes Method
broadcast_in_sublanes
For each sublane i, broadcasts the value at lane index lane + i along the entire sublane. If lane + i is not in [0, lane_count), the value in sublane i is undefined (it can be anything).
Reactant.MLIR.Dialects.tpu.create_subelement_mask Method
create_subelement_mask
The "half-sublanes", "quarter-sublanes", etc. (unit is determined by the type of output) of the mask are masked in the range specified by from and to.
- If from <= to, the range [from, to) is set and the rest is unset.
- If to <= from, the range [to, from) is unset and the rest is set.
All lanes are set identically.
Example
%msk = tpu.create_subelement_mask 3, 9 : vector<8x128x2xi1>
This creates a mask %msk where, for all lanes, %msk[*][lane][*] is:
[[0, 0], [0, 1], [1, 1], [1, 1], [1, 0], [0, 0], [0, 0], [0, 0]]
It is currently only supported:
- In TPU v4, for num_subelems of 1 and 2.
- In TPU v5, for num_subelems of 1, 2, and 4.
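The masking rule above can be sketched in plain Python for a single lane. This is only a reference model, not the dialect's implementation: the function name and the sublanes/num_subelems parameters are hypothetical, and it assumes subelements are numbered sublane-major (sublane 0 first), which is the ordering the example above implies. The description leaves from == to ambiguous; this sketch applies the first rule in that case.

```python
def create_subelement_mask(from_, to, sublanes=8, num_subelems=2):
    """Reference model: return one lane's mask as a sublanes x num_subelems list of 0/1."""
    mask = []
    for s in range(sublanes):
        row = []
        for e in range(num_subelems):
            # Linear subelement index within the lane (assumed sublane-major).
            idx = s * num_subelems + e
            if from_ <= to:
                # Range [from, to) is set, the rest is unset.
                bit = 1 if from_ <= idx < to else 0
            else:
                # Range [to, from) is unset, the rest is set.
                bit = 0 if to <= idx < from_ else 1
            row.append(bit)
        mask.append(row)
    return mask


mask = create_subelement_mask(3, 9)
```

With from = 3 and to = 9 this yields the same 8x2 pattern as the vector<8x128x2xi1> example; swapping the arguments yields its complement.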
Reactant.MLIR.Dialects.tpu.dynamic_gather Method
dynamic_gather
Gathers elements from source using indices.
The specified dimensions of source are collapsed together and indexed by indices.
Given a shape N0 x N1 x ..., the output[i0, i1, ...] is given by collapsed_source[j0, j1, ..., indices[i0, i1, ...] mod M] where
- collapsed_source is the result of collapsing dimensions of source into a new trailing dimension of size M.
- (j0, j1, ...) is the subsequence of (i0, i1, ...) that keeps only the positions n that are not in dimensions.
When a single dimension is specified, this is similar to np.take_along_axis.
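For the single-dimension case, the indexing rule can be sketched on 2D nested lists. This is a minimal reference model under stated assumptions, not the op itself: the function name is hypothetical, indices is assumed to have the same shape as source, and only non-negative indices are exercised.

```python
def dynamic_gather_2d(source, indices, dimension):
    """Reference model of a gather along one dimension of a 2D list of lists."""
    rows, cols = len(source), len(source[0])
    out = []
    for i0 in range(rows):
        out_row = []
        for i1 in range(cols):
            idx = indices[i0][i1]
            if dimension == 0:
                # Dimension 0 is collapsed (M = rows); the column index i1 is kept.
                out_row.append(source[idx % rows][i1])
            else:
                # Dimension 1 is collapsed (M = cols); the row index i0 is kept.
                out_row.append(source[i0][idx % cols])
        out.append(out_row)
    return out


source = [[10, 11, 12], [20, 21, 22]]
indices = [[2, 0, 1], [1, 1, 4]]
along_rows = dynamic_gather_2d(source, indices, 1)
along_cols = dynamic_gather_2d(source, indices, 0)
```

Here the index 4 in the last position wraps to 1 because of the mod M step (M = 3 along dimension 1).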
Reactant.MLIR.Dialects.tpu.iota Method
iota
Creates a vector with values that start at 0 and increase along a dimension resulting from collapsing the given dimensions together in row-major order.
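One way to read the collapsing rule is that each element's value is its row-major linear index over the listed dimensions, taken in the order given. The sketch below models that reading with nested lists; the function name is hypothetical and this is not the dialect's implementation.

```python
def tpu_iota(shape, dimensions):
    """Reference model: nested lists whose values count along the collapsed dimensions."""

    def value(idx):
        # Row-major linear index over `dimensions`, first listed dimension is major.
        v = 0
        for d in dimensions:
            v = v * shape[d] + idx[d]
        return v

    def build(prefix):
        # Recursively build one nesting level per dimension of `shape`.
        if len(prefix) == len(shape):
            return value(prefix)
        return [build(prefix + (i,)) for i in range(shape[len(prefix)])]

    return build(())


result = tpu_iota((4, 3, 2), (2, 0))
```

For shape (4, 3, 2) and dimensions (2, 0), each value is i2 * 4 + i0, which matches the values listed for the vector<4x3x2xi16> example in this entry.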
Example
tpu.iota {dimensions = array<i32: 2, 0>} : vector<4x3x2xi16>
This produces a vector with the following values:
[[[0, 4], [0, 4], [0, 4]]
 [[1, 5], [1, 5], [1, 5]]
 [[2, 6], [2, 6], [2, 6]]
 [[3, 7], [3, 7], [3, 7]]]
Reactant.MLIR.Dialects.tpu.load Method
load
Similar to vector::LoadOp but with sublane_mask and sublane_stride. Negative indices mean loading from a negative offset relative to the base address.
Reactant.MLIR.Dialects.tpu.mask_cast Method
mask_cast
Cast a mask register into a different packing.
If casting to a type with smaller packing, then values being packed together must be identical. For example, for 8x128x4xi1 -> 8x128x2xi1, input[i, j, 0] == input[i, j, 1] and input[i, j, 2] == input[i, j, 3] must hold for all i, j. Otherwise, the result is undefined.
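The precondition for casting to a smaller packing can be sketched for the packed bits at one (i, j) position. This is a hedged illustration only: the function name is hypothetical, and raising an error is a choice made for the sketch, since the op itself leaves the result undefined when the precondition fails. It also assumes the output takes the common value of each group, which the description implies but does not state.

```python
def mask_cast_narrow(bits, out_packing):
    """Reference model: cast one position's packed mask bits to a smaller packing."""
    in_packing = len(bits)
    if out_packing <= 0 or in_packing % out_packing != 0:
        raise ValueError("input packing must be a multiple of the output packing")
    group = in_packing // out_packing
    out = []
    for k in range(out_packing):
        # Adjacent input bits that end up packed into one output bit.
        chunk = bits[k * group:(k + 1) * group]
        if len(set(chunk)) != 1:
            # The real op gives an undefined result here; the sketch fails loudly.
            raise ValueError("values being packed together differ")
        out.append(chunk[0])
    return out


narrowed = mask_cast_narrow([1, 1, 0, 0], 2)
```

For the 8x128x4xi1 -> 8x128x2xi1 case, bits 0 and 1 form the first group and bits 2 and 3 form the second, mirroring the equalities stated above.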
Reactant.MLIR.Dialects.tpu.rotate Method
rotate
Rotates the given vector by the given amount in the given dimension. For example, for a 2D vector of shape (m, n), rotating dim 0 by amount shifts the row at index i to index (i + amount) % m.
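The dim 0 case can be modelled directly from the index formula. The sketch below is a plain-Python reference on a list of rows; the function name is hypothetical and only the behaviour stated above is modelled.

```python
def rotate_dim0(rows, amount):
    """Reference model: move the row at index i to index (i + amount) % m."""
    m = len(rows)
    out = [None] * m
    for i, row in enumerate(rows):
        out[(i + amount) % m] = row
    return out


rotated = rotate_dim0([[0, 0], [1, 1], [2, 2]], 1)
```

With amount = 1 and m = 3, the last row wraps around to index 0 and the others move down by one.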
Reactant.MLIR.Dialects.tpu.scan_count Method
scan_count
ScanCountOp calculates the running duplicate occurrence count of the elements in the input vector, %values. The output vector, %counts, contains the running duplicate occurrence count for the corresponding element in the input vector, where the count is performed in ascending order of element indices. For example, if the elements of %values at indices 0, 5, and 7 had duplicate values, then the elements of %counts at indices 0, 5, and 7 would be 1, 2, and 3, respectively.
A mask vector, %in_mask, specifies which of the elements in the input vector are eligible for counting. An element in %values that has its mask set to 0 will always have a count of 1 in %counts, regardless of the position in the vector, or whether there were duplicates or not.
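The counting rule can be sketched as a single pass over the input. This is a reference model under one explicit assumption that the description does not settle: masked-out elements are given a count of 1 and are also not counted toward later duplicates. The function name is hypothetical.

```python
def scan_count(values, in_mask):
    """Reference model: running duplicate count, in ascending index order."""
    seen = {}
    counts = []
    for v, m in zip(values, in_mask):
        if not m:
            # Masked-out elements always get a count of 1.
            # Assumption: they do not contribute to later occurrence counts.
            counts.append(1)
            continue
        seen[v] = seen.get(v, 0) + 1
        counts.append(seen[v])
    return counts


values = [7, 1, 2, 3, 4, 7, 5, 7]
counts = scan_count(values, [1] * 8)
```

In this input the value 7 appears at indices 0, 5, and 7, so those positions receive counts 1, 2, and 3, as in the example above.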