DMA Cuda Implementation API
- group aml_dma_cuda
dma between devices and host.
Cuda dma is an implementation of aml dma to transfer data between devices and between host and devices.
#include <aml/dma/cuda.h>
Defines
-
AML_DMA_CUDA_REQUEST_STATUS_NONE
-
AML_DMA_CUDA_REQUEST_STATUS_PENDING
-
AML_DMA_CUDA_REQUEST_STATUS_DONE
-
AML_DMA_CUDA_DEVICE_PAIR(src, dst)
Embed a pair of devices in a void* to use as dma copy_operator argument when copying from device to device.
-
AML_DMA_CUDA_DEVICE_FROM_PAIR(pair, src, dst)
Translate a pair of device ids stored in pair (void*) back into two device id integers.
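To illustrate the idea behind the two macros above, here is a minimal sketch of packing two device ids into a pointer-sized value and unpacking them again. The names `EXAMPLE_DEVICE_PAIR`, `EXAMPLE_DEVICE_FROM_PAIR` and `example_pair_roundtrip` are hypothetical, and the 16-bit split is an assumption made for the sketch; this page does not show the encoding AML actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT the AML macros: the real encoding is not
 * documented on this page. Each id is assumed to fit in 16 bits. */
#define EXAMPLE_DEVICE_PAIR(src, dst)                                        \
    ((void *)((((uintptr_t)(src) & 0xFFFFu) << 16) |                         \
              ((uintptr_t)(dst) & 0xFFFFu)))

#define EXAMPLE_DEVICE_FROM_PAIR(pair, src, dst)                             \
    do {                                                                     \
        (src) = (int)(((uintptr_t)(pair) >> 16) & 0xFFFFu);                  \
        (dst) = (int)((uintptr_t)(pair) & 0xFFFFu);                          \
    } while (0)

/* Pack then unpack a pair; returns 1 if both ids survive the round trip. */
int example_pair_roundtrip(int src, int dst)
{
    void *pair = EXAMPLE_DEVICE_PAIR(src, dst);
    int s = -1, d = -1;

    EXAMPLE_DEVICE_FROM_PAIR(pair, s, d);
    return s == src && d == dst;
}
```

With the real macros, the packed value would be passed as the `op_arg` of a device-to-device copy and decoded inside the copy operator.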
Functions
-
int aml_dma_cuda_request_create(struct aml_dma_data *data, struct aml_dma_request **req, struct aml_layout *dest, struct aml_layout *src, aml_dma_operator op, void *op_arg)
AML dma cuda request creation operator.
- Returns:
-AML_EINVAL if data, req, *req, dest or src is NULL.
- Returns:
-AML_ENOMEM if allocation failed.
- Returns:
AML_SUCCESS on success.
-
int aml_dma_cuda_request_wait(struct aml_dma_data *dma, struct aml_dma_request **req)
AML dma cuda request wait operator.
- Returns:
-AML_EINVAL if dma, req or *req is NULL, or if the data does not come from the dma used in request creation.
- Returns:
AML_SUCCESS on success.
-
int aml_dma_cuda_barrier(struct aml_dma_data *data)
AML dma cuda barrier operator.
- Returns:
AML_SUCCESS on success.
-
int aml_dma_cuda_request_destroy(struct aml_dma_data *dma, struct aml_dma_request **req)
AML dma cuda request deletion operator.
-
int aml_dma_cuda_create(struct aml_dma **dma, const enum cudaMemcpyKind kind)
Creation of a dma engine for cuda backend.
See also
struct aml_dma_cuda_data.
- Parameters:
dma – A pointer to set to the newly allocated dma.
kind – The kind of transfer performed: host to device, device to host, device to device, or host to host.
- Returns:
-AML_EINVAL if dma can’t be set.
- Returns:
-AML_FAILURE if any cuda backend call failed.
- Returns:
-AML_ENOMEM if allocation failed.
- Returns:
AML_SUCCESS on success.
-
int aml_dma_cuda_copy_1D(struct aml_layout *dst, const struct aml_layout *src, void *arg)
Cuda DMA operator implementation. Use only with aml_dma_cuda_request_create() or the higher-level aml_dma_async_copy_custom(). This copy operator is compatible only with:
- This dma cuda implementation,
- Dense source and destination layouts of one dimension.
Make a flat copy of contiguous bytes between two layout raw pointers. The size of the byte stream is computed as the product of dimensions and element size.
- Parameters:
dst – [in] The destination layout of the copy.
src – [in] The source layout of the copy.
arg – [in] A pair of device ids obtained with AML_DMA_CUDA_DEVICE_PAIR. This argument is used only if the dma used with this operator is a cudaMemcpyDeviceToDevice kind of dma.
- Returns:
an AML error code.
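The size rule stated for aml_dma_cuda_copy_1D (product of dimensions times element size) can be sketched as follows. The helper name `example_flat_copy_bytes` and its parameters are hypothetical; it only mirrors the arithmetic described above and does not query a real `struct aml_layout`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes in a dense layout, computed as
 * (product of all dimensions) * element_size.
 * Returns 0 if the multiplication would overflow size_t. */
size_t example_flat_copy_bytes(const size_t *dims, size_t ndims,
                               size_t element_size)
{
    size_t bytes = element_size;

    for (size_t i = 0; i < ndims; i++) {
        if (dims[i] != 0 && bytes > SIZE_MAX / dims[i])
            return 0; /* overflow guard */
        bytes *= dims[i];
    }
    return bytes;
}
```

For a one-dimensional layout of 1024 elements of 4 bytes each, this gives 4096 bytes, which would be the length of the flat copy.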
-
int aml_dma_cuda_memcpy_op(struct aml_layout *dst, const struct aml_layout *src, void *arg)
Cuda DMA operator implementation. Use only with aml_dma_cuda_request_create() or the higher-level aml_dma_async_copy_custom(). This copy operator is compatible only with:
- This dma cuda implementation (device to device is not supported),
- Flat source and destination pointers.
Make a flat asynchronous copy of contiguous bytes between two raw pointers. This dma operator casts input layout pointers into void* and assumes these are contiguous sets of bytes to copy from src to dst in the linux memcpy() fashion with cudaMemcpyAsync().
- Parameters:
dst – [out] The destination (void*) of the copy cast into a struct aml_layout *.
src – [in] The source (void*) of the copy cast into a struct aml_layout *.
arg – [in] The size (size_t) of the copy cast into a void*.
- Returns:
AML_SUCCESS
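The calling convention described for aml_dma_cuda_memcpy_op (raw pointers disguised as layout pointers, and a byte count disguised as a void*) can be sketched on the host alone. Everything below is a hypothetical stand-in: `struct example_layout`, `example_memcpy_op` and `example_copy_ints` are not AML names, and plain `memcpy()` replaces `cudaMemcpyAsync()` so the sketch runs without a GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Opaque stand-in for struct aml_layout; it is never dereferenced. */
struct example_layout;

/* Hypothetical operator with the same argument shape as the one above:
 * dst and src are really raw buffers, arg is really a size_t. */
int example_memcpy_op(struct example_layout *dst,
                      const struct example_layout *src, void *arg)
{
    size_t size = (size_t)(uintptr_t)arg; /* recover the byte count */

    /* Host-only substitute for cudaMemcpyAsync(). */
    memcpy((void *)dst, (const void *)src, size);
    return 0;
}

/* Caller side: cast the buffers and the size into the operator's types. */
int example_copy_ints(int *out, const int *in, size_t count)
{
    return example_memcpy_op((struct example_layout *)out,
                             (const struct example_layout *)in,
                             (void *)(uintptr_t)(count * sizeof(*in)));
}
```

The point of the convention is that no layout object has to be built for a flat copy; the cost is that the compiler can no longer check that the arguments are what the operator expects.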
Variables
-
struct aml_dma_ops aml_dma_cuda_ops
Default dma ops used at dma creation.
-
struct aml_dma_cuda_request
- #include <cuda.h>
Cuda DMA request. Only a status flag is needed.
-
struct aml_dma_cuda_data
- #include <cuda.h>
aml_dma data structure. AML dma cuda contains a single execution stream. When waiting on a request, the whole stream is synchronized, so all the requests in it are waited on.
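The single-stream behaviour described above can be illustrated with a toy host-side model. All names and status values here are hypothetical (the numeric values of the AML_DMA_CUDA_REQUEST_STATUS_* defines are not given on this page), and no CUDA call is made; the sketch only shows that waiting on one request completes every request queued on the same stream.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status values modelled on the NONE/PENDING/DONE defines. */
enum example_status { EXAMPLE_NONE, EXAMPLE_PENDING, EXAMPLE_DONE };

struct example_request {
    enum example_status status;
};

/* Toy model of a dma that owns one execution stream. */
struct example_stream {
    struct example_request *queued[8];
    size_t count;
};

/* Queue a request on the stream and mark it pending. */
int example_submit(struct example_stream *s, struct example_request *r)
{
    if (s->count >= 8)
        return -1;
    r->status = EXAMPLE_PENDING;
    s->queued[s->count++] = r;
    return 0;
}

/* Waiting on any request synchronizes the whole stream, so every
 * queued request is marked done, not only the one passed in. */
int example_wait(struct example_stream *s, struct example_request *r)
{
    (void)r; /* the whole stream is synchronized regardless of r */
    for (size_t i = 0; i < s->count; i++)
        s->queued[i]->status = EXAMPLE_DONE;
    s->count = 0;
    return 0;
}
```

In this model, a caller that needs independent completion of two transfers would have to use two separate dma engines, since one engine has only one stream.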
-
struct aml_dma_cuda_op_arg
- #include <cuda.h>
Structure passed to the aml_dma_operator arg argument by the request created in aml_dma_cuda_request_create(). All aml_dma_operator implementations can expect to obtain a pointer to this structure as the arg argument. The pointer is valid only for the lifetime of the aml_dma_operator call.