EnzymeXLA Dialect
Reactant.MLIR.Dialects.enzymexla.gpu_wrapper Method
gpu_wrapper
The optional arguments to this operation are suggestions for the block dimensions this GPU kernel should use, usually taken from the kernel launch parameters.
Reactant.MLIR.Dialects.enzymexla.memcpy Method
memcpy
The gpu.memcpy operation copies the content of one memref to another.
The op does not execute before all async dependencies have finished executing.
If the async keyword is present, the op is executed asynchronously (i.e. it does not block until the execution has finished on the device). In that case, it returns a !gpu.async.token.
Example
```mlir
%token = gpu.memcpy async [%dep] %dst, %src : memref<?xf32, 1>, memref<?xf32>
```
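For comparison, here is a minimal sketch of the same copy in its synchronous form and as one link in a chain of async dependencies. It is written in the upstream gpu dialect syntax that the description above uses; the token names %t0 and %t1 are illustrative, and the exact assembly format and operand list accepted by enzymexla.memcpy may differ.

```mlir
// Synchronous form: without the `async` keyword no token is returned,
// and the op blocks until the copy has finished on the device.
gpu.memcpy %dst, %src : memref<?xf32, 1>, memref<?xf32>

// Asynchronous chain (assumed value names): create a token, make the
// copy depend on it, then wait on the token the copy returns.
%t0 = gpu.wait async
%t1 = gpu.memcpy async [%t0] %dst, %src : memref<?xf32, 1>, memref<?xf32>
gpu.wait [%t1]
```

Passing the returned token to a later op's dependency list is what orders that op after the copy.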