EnzymeXLA Dialect
Reactant.MLIR.Dialects.enzymexla.blas_symm Method
blas_symm
C := alpha*A*B + beta*C, or C := alpha*B*A + beta*C, where alpha and beta are scalars and A is a symmetric matrix.
Reactant.MLIR.Dialects.enzymexla.blas_syrk Method
blas_syrk
C_out := alpha*A*A^T + beta*C, or C_out := alpha*A^T*A + beta*C, where alpha and beta are scalars. C must be an n x n symmetric matrix.
output_uplo determines which part of C_out is populated. Accessing the values in the non-output_uplo part of the matrix is undefined behavior.
LAPACK/BLAS routines typically take a single uplo attribute, and the output uplo is implicitly assumed to match the input uplo. The burden therefore lies on the user to copy data manually if they need to access the other half of the matrix. By specifying output_uplo, we can perform transformations that analyze the entire dataflow and avoid computing or copying half of the tensor altogether. Generally, it is recommended to set this attribute to enzymexla::LapackUplo::F; our passes will automatically refine it to minimize data copies.
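The effect of output_uplo can be illustrated with a small reference model in plain Python. This is only a sketch of the arithmetic described above, not the dialect's API: the function names, the "U"/"L"/"F" strings, and the use of None to mark entries that must not be read are all assumptions made for illustration.

```python
# Reference semantics only; every name here is hypothetical.

def syrk_reference(alpha, A, beta, C, output_uplo="U"):
    """Return C_out = alpha*A*A^T + beta*C, filling only the requested triangle.

    Entries outside the requested triangle are left as None to model
    "must not be read". output_uplo="F" fills the full matrix.
    """
    n, k = len(A), len(A[0])
    out = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            wanted = (
                output_uplo == "F"
                or (output_uplo == "U" and j >= i)
                or (output_uplo == "L" and j <= i)
            )
            if wanted:
                dot = sum(A[i][p] * A[j][p] for p in range(k))
                out[i][j] = alpha * dot + beta * C[i][j]
    return out


def mirror_upper(M):
    """Copy the upper triangle into the lower one (the manual copy step)."""
    n = len(M)
    return [[M[min(i, j)][max(i, j)] for j in range(n)] for i in range(n)]


A = [[1.0, 2.0], [3.0, 4.0]]
C = [[1.0, 0.0], [0.0, 1.0]]
upper = syrk_reference(2.0, A, 1.0, C, output_uplo="U")
full = syrk_reference(2.0, A, 1.0, C, output_uplo="F")
```

In this model, `mirror_upper(upper)` reproduces `full`. That mirroring is the extra copy a consumer of the upper-only result would otherwise have to perform by hand.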
Reactant.MLIR.Dialects.enzymexla.blas_trmm Method
blas_trmm
B := alpha * op(A) x B, or B := alpha * B x op(A), where alpha is a scalar, B is an m x n matrix, A is a unit or non-unit, upper or lower triangular matrix, and op(A) is one of op(A) = A, op(A) = A^T, or op(A) = A^H.
Reactant.MLIR.Dialects.enzymexla.gpu_wrapper Method
gpu_wrapper
The optional arguments to this operation are suggestions about what block dimensions this GPU kernel should have, usually taken from the kernel launch parameters.
Reactant.MLIR.Dialects.enzymexla.lapack_gemqrt Method
lapack_gemqrt
This operation is modeled after LAPACK's *GEMQR routines.
Reactant.MLIR.Dialects.enzymexla.lapack_geqrf Method
lapack_geqrf
This operation computes the QR factorization of a matrix using Householder reflections. Mathematically, it decomposes A into the product of an orthogonal matrix Q and an upper triangular matrix R, such that A = QR.
This operation is modeled after LAPACK's *GEQRF routines, which return the result in the QR packed format.
Reactant.MLIR.Dialects.enzymexla.lapack_geqrt Method
lapack_geqrt
This operation computes the QR factorization of a matrix using Householder reflections. Mathematically, it decomposes A into the product of an orthogonal matrix Q and an upper triangular matrix R, such that A = QR.
This operation is modeled after LAPACK's *GEQRT routines, which return the result in the QR CompactWY format.
Reactant.MLIR.Dialects.enzymexla.lapack_orgqr Method
lapack_orgqr
This operation is modeled after LAPACK's *ORGQR/*UNGQR routines.
Reactant.MLIR.Dialects.enzymexla.lapack_ormqr Method
lapack_ormqr
This operation is modeled after LAPACK's *ORMQR routines.
Reactant.MLIR.Dialects.enzymexla.linalg_qr Method
linalg_qr
This operation computes the QR factorization of a matrix using Householder reflections. Mathematically, it decomposes A into the product of an orthogonal (unitary if complex) matrix Q and an upper triangular matrix R, such that A = QR.
If A has size m x n and m > n, Q is an m x n isometric matrix. If m < n, R is an m x n trapezoidal matrix.
This operation is modeled after the mathematical formulation of the QR factorization, and not after LAPACK's compact formats.
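To make the m > n case concrete, the following plain-Python sketch builds a thin QR factorization and exposes the resulting shapes. It deliberately swaps in modified Gram-Schmidt rather than Householder reflections, assumes a real matrix with full column rank, and uses hypothetical helper names. It illustrates the mathematical formulation only and says nothing about how the operation is lowered.

```python
import math


def thin_qr_gram_schmidt(A):
    """Thin QR of a real m x n matrix with m >= n and full column rank.

    Modified Gram-Schmidt is used here for brevity (not Householder).
    Returns Q (m x n, orthonormal columns) and R (n x n, upper triangular).
    """
    m, n = len(A), len(A[0])
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [A[i][j] for i in range(m)]  # column j of A
        for i, q in enumerate(q_cols):
            # Project out the component along each earlier q_i.
            R[i][j] = sum(q[t] * v[t] for t in range(m))
            v = [v[t] - R[i][j] * q[t] for t in range(m)]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / R[j][j] for x in v])
    Q = [[q_cols[j][i] for j in range(n)] for i in range(m)]
    return Q, R


def matmul(X, Y):
    """Dense matrix product of two list-of-lists matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


A = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]  # m = 3, n = 2
Q, R = thin_qr_gram_schmidt(A)
QtQ = matmul([list(col) for col in zip(*Q)], Q)  # close to the 2 x 2 identity
QR = matmul(Q, R)                                # close to A
```

Here Q is 3 x 2 and satisfies Q^T Q = I up to rounding, which is what "isometric" means for a non-square Q. A Householder-based routine can return factors whose signs differ from these, since the factorization is unique only up to such sign choices.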
Reactant.MLIR.Dialects.enzymexla.memcpy Method
memcpy
The gpu.memcpy operation copies the content of one memref to another.
The op does not execute before all async dependencies have finished executing.
If the async keyword is present, the op is executed asynchronously (i.e. it does not block until the execution has finished on the device). In that case, it returns a !gpu.async.token.
Example
%token = gpu.memcpy async [%dep] %dst, %src : memref<?xf32, 1>, memref<?xf32>
Reactant.MLIR.Dialects.enzymexla.multi_rotate Method
multi_rotate
MultiRotate operation produces multiple rotated versions of the input tensor. Given left_amount=L and right_amount=R, it produces L+R+1 results:
L results for left rotations (from L to 1)
1 result for no rotation (amount=0)
R results for right rotations (from 1 to R)
For example, with left_amount=2 and right_amount=2:
results[0] = rotate left by 2
results[1] = rotate left by 1
results[2] = no rotation (amount=0)
results[3] = rotate right by 1
results[4] = rotate right by 2
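The result ordering can be checked against a small reference model. This sketch assumes a non-empty one-dimensional input held in a Python list. The function names are illustrative and are not the bindings' API.

```python
def rotate_left(xs, amount):
    """Circular left shift: the first `amount` elements wrap to the end."""
    amount %= len(xs)
    return xs[amount:] + xs[:amount]


def multi_rotate(xs, left_amount, right_amount):
    """Return left_amount + right_amount + 1 rotations of xs.

    Order: left by L down to left by 1, then unrotated, then right by 1
    up to right by R. A right rotation by k is a left rotation by -k.
    """
    return [rotate_left(xs, k)
            for k in range(left_amount, -right_amount - 1, -1)]


results = multi_rotate(["a", "b", "c", "d", "e"], left_amount=2, right_amount=2)
# results[0] is ["c", "d", "e", "a", "b"]  (left by 2)
# results[2] is ["a", "b", "c", "d", "e"]  (no rotation)
# results[4] is ["d", "e", "a", "b", "c"]  (right by 2)
```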
Reactant.MLIR.Dialects.enzymexla.multi_slice Method
multi_slice
MultiSlice operation produces multiple slice versions of the input tensor. Given amount=N, it produces N+1 results:
Result 0 is the slice at the base start/limit indices
Result i is the slice shifted by i along the specified dimension
The slice parameters (start_indices, limit_indices, strides) define the first slice. Each subsequent result is offset by +1 along the specified dimension.
To achieve the old left/right semantics: if you previously had left_amount=L and right_amount=R, set amount=L+R and shift start_indices[dim] left by L.
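The conversion from the old left/right form can be sketched in one dimension with Python list slicing. The helper name is hypothetical, and the sketch assumes that the limit index moves together with the start index so that every slice keeps the same length. Bounds checking is not modeled, because Python slices clamp silently.

```python
def multi_slice(xs, start, limit, amount, stride=1):
    """Return amount + 1 slices; result i covers [start + i, limit + i)."""
    return [xs[start + i: limit + i: stride] for i in range(amount + 1)]


xs = list(range(10))

# New-style call: base window [3, 6), then two further windows shifted by +1.
new_style = multi_slice(xs, start=3, limit=6, amount=2)

# Old-style request: window [4, 7) with left_amount=1 and right_amount=1.
L, R, base_start, base_limit = 1, 1, 4, 7
converted = multi_slice(xs, base_start - L, base_limit - L, amount=L + R)
```

Both calls yield the windows [3, 4, 5], [4, 5, 6], and [5, 6, 7]. The unshifted window of the old form appears at index L of the converted result.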
Reactant.MLIR.Dialects.enzymexla.piecewise_select Method
piecewise_select
For coordinates that fall inside any of the boxes defined by the attribute, the result contains the element from the first operand; for all other coordinates, it contains the element from the second operand.
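A two-dimensional reference model of this selection rule might look as follows. The box encoding used here, a pair of half-open (start, limit) coordinate tuples, is purely an assumption for illustration. The attribute's actual format is not specified in this text, and the function name is hypothetical.

```python
def piecewise_select(boxes, on_true, on_false):
    """Pick on_true[i][j] where (i, j) lies in any box, else on_false[i][j].

    Each box is ((row_start, col_start), (row_limit, col_limit)) with
    half-open bounds. This encoding is assumed, not taken from the dialect.
    """
    def inside(i, j):
        return any(r0 <= i < r1 and c0 <= j < c1
                   for (r0, c0), (r1, c1) in boxes)

    rows, cols = len(on_true), len(on_true[0])
    return [[on_true[i][j] if inside(i, j) else on_false[i][j]
             for j in range(cols)] for i in range(rows)]


ones = [[1] * 4 for _ in range(3)]
zeros = [[0] * 4 for _ in range(3)]
mask = piecewise_select([((0, 0), (2, 2)), ((2, 3), (3, 4))], ones, zeros)
# mask is [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 1]]
```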
Reactant.MLIR.Dialects.enzymexla.rotate Method
rotate
Performs a left rotation (circular shift) along dimension by amount elements. Elements are shifted left, with the first amount elements wrapping around to the end.
i.e.:
rotate([a, b, c, d, e], amount=2, dimension=0) → [c, d, e, a, b]
Reactant.MLIR.Dialects.enzymexla.special_besselh Method
special_besselh
Computes the Bessel function of the third kind, also known as the Hankel function. The parameter k must be either 1 or 2, selecting between Hankel functions of the first kind (H1) and second kind (H2).
Reactant.MLIR.Dialects.enzymexla.special_jinc Method
special_jinc
Computes the jinc function, also known as the sombrero or besinc function. It is defined as J1(pi*x) / (2*x), where J1 is the Bessel function of the first kind of order 1. At x=0, the function evaluates to pi/4.
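The definition and its value at zero can be checked numerically with a short standard-library sketch. It evaluates J1 by its power series, which is adequate only for moderate arguments. This is an illustrative choice and not a claim about how the operation is implemented, and the function names are hypothetical.

```python
import math


def bessel_j1(z, terms=30):
    """Truncated power series for J1(z); suitable for moderate |z| only."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.factorial(k + 1))
        * (z / 2) ** (2 * k + 1)
        for k in range(terms)
    )


def jinc(x):
    """jinc(x) = J1(pi*x) / (2*x), with the limiting value pi/4 at x = 0."""
    if x == 0:
        return math.pi / 4
    return bessel_j1(math.pi * x) / (2 * x)
```

The value at zero follows from J1(z) ≈ z/2 for small z, so J1(pi*x) / (2*x) tends to (pi*x/2) / (2*x) = pi/4. The function is even, because both J1 and the denominator are odd in x.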