Tilings

What is an AML tiling ?

In AML, a tiling is a representation of the decomposition of data structures. It identifies a way in which a layout can be split into layouts of smaller sizes.

As such, the main function of a tiling is to provide an index into subcomponents of a layout.

As for the layouts, both the C and Fortran indexing orders are available for the tilings, with similar names: AML_TILING_ORDER_C (AML_TILING_ORDER_ROW_MAJOR) and AML_TILING_ORDER_FORTRAN (AML_TILING_ORDER_COLUMN_MAJOR).

Creating an AML tiling

First, you need to have the right headers.

#include <aml.h>
#include <aml/layout/dense.h>
#include "aml/tiling/resize.h"

Let’s take the example of a two-dimensional matrix, with x rows and y columns. We need to have already allocated some memory space for our matrix, for instance with a AML area.

double mat[x][y];
double *mat;
struct aml_area *area = &aml_area_linux;
mat = (double *)aml_area_mmap(area, sizeof(double) * x * y, NULL);

We then need to declare and create a layout in any order. Let’s do both orders:

struct aml_layout *layout_c, *layout_f;
size_t dims_col[2] = { x, y };

aml_layout_dense_create(&layout_c, mat, AML_LAYOUT_ORDER_C, sizeof(double), 2, dims, NULL, NULL));

aml_layout_dense_create(&layout_f, mat, AML_LAYOUT_ORDER_FORTRAN, sizeof(double), 2, dims, NULL, NULL));

After that, we can finally create a tiling associated to each layout, one in AML_TILING_ORDER_C, the other in AML_TILING_ORDER_FORTRAN:

struct aml_tiling *tiling_c, *tiling_f;
size_t tile_x = x/3, tile_y = y/3;

aml_tiling_resize_create(&tiling_c, AML_TILING_ORDER_C, layout_c, 2, (size_t[]){tile_x, tile_y}));

aml_tiling_resize_create(&tiling_f, AML_TILING_ORDER_FORTRAN, layout_f, 2, (size_t[]){tile_x, tile_y}));

We have just created two tilings on our two layouts, each with two dimensions, {tile_x, tile_y} for both tilings.

Getting the order of an AML tiling

You can see the order of a tiling by using the following function from the AML Tiling API:

aml_tiling_order(tiling_c));
aml_tiling_order(tiling_f));

The order of the first tiling is AML_TILING_ORDER_C, which in AML is represented by the value 0, and the order of the second tiling is AML_TILING_ORDER_FORTRAN, and the above function would return 1.

Destroying an AML tiling

In the same way you need to free the memory allocated for an array, you need to destroy your tiling when you’re done using it.

aml_tiling_resize_destroy(&tiling_c);
aml_tiling_resize_destroy(&tiling_f);

Generic operations on an AML tiling

Several operations on an AML tiling are defined in the AML Tiling generic API. Let’s assume here that we have successful created a tiling called tiling in this part.

We can get the number of dimensions of this tiling:

size_t ndims = aml_tiling_ndims(tiling);

In the previous examples, the number of dimensions of the tilings would be 2.

Once you’ve got the number of dimensions of the tiling, you can get the size of each dimension in an array:

size_t dims[ndims];
int err = aml_tiling_dims(tiling, dims);

This function will return a non-zero integer if there is an error, and 0 if everything is fine. In our previous tilings, the dimensions returned would be {3, 3} for each tiling, because we cut our layouts into 3 in each dimension.

We can also get the number of tiles inside the tiling:

size_t ntiles = aml_tiling_ntiles(tiling);

This would have returned 9 for our previous tilings.

The dimensions of each tile in the tiling can be obtained in a array:

size_t tile_dims[ndims];
aml_tiling_tile_dims(tiling, tile_dims);

The resulting array would be {tile_x, tile_y} for both tiling_c and tiling_f.

Accessing a tile and elements of a tile

You can access any tile of the tiling by using its id in the tiling.

This will give you a tile, which is in fact a smaller layout than the one the tiling is based on.

Once you have this layout, you can access each element of the tile with its coordinates within this layout, of dimensions tile_dims.

Here is an example of going through each tile with the function aml_tiling_index_byid, then going through each dimension of the tile, and setting each element to 1.0 with the function aml_layout_deref:

size_t coords[ndims];
double *a;

for (size_t i = 0; i < ntiles; i++) {
   struct aml_layout *ltile;
   ltile = aml_tiling_index_byid(tiling, i);

   for (size_t j = 0; j < tile_dims[0]; j++) {
      for (size_t k = 0; k < tile_dims[1]; k++) {
         coords[0] = j;
         coords[1] = k;
         a = aml_layout_deref(ltile, coords);
         *a = 1.0;
      }
   }
}

Exercise

Let’s look at an example of when you could use tiles.

Let a be a matrix of doubles of size m*k. Let b be a matrix of doubles of size k*n. We want to get the matrix c of doubles of size m*n which is the result of the matrix multiplication of a and b. We assume that m, n and k can be divided by 3.

We want to use the tilings to perform a blocked matrix multiplication.

In order to do so, you will need to first allocate the memory for your matrices and initialize them, including c to all zeros.

Then you will need to create a layout for each of them, each with two dimensions, of the corresponding sizes for each matrix. You can all create them with the same order.

Then, since all the dimensions of the matrices can be divided by 3, you will need to create a tiling for each layout, with the dimensions of each tile being 3 times smaller than corresponding matrix dimensions.

After that you will be able to access each tile and do a blocked matrix multiplication on each tile.

Solution

Click Here to Show/Hide Code

/*******************************************************************************
 * Copyright 2019 UChicago Argonne, LLC.
 * (c.f. AUTHORS, LICENSE)
 *
 * This file is part of the AML project.
 * For more info, see https://github.com/anlsys/aml
 *
 * SPDX-License-Identifier: BSD-3-Clause
 ******************************************************************************/

#include <aml.h>
#include "aml/area/linux.h"
#include "aml/layout/dense.h"
#include "aml/tiling/resize.h"
#include <stdio.h>

void init_matrix(double *mat, size_t rows, size_t cols, double scal)
{
	for (size_t i = 0; i < rows; i++)
		for (size_t j = 0; j < cols; j++)
			mat[j * rows + i] = scal;
}

void print_matrix(double *mat, size_t rows, size_t cols)
{
	for (size_t i = 0; i < rows; i++) {
		for (size_t j = 0; j < cols; j++)
			fprintf(stderr, "%f ", mat[j * rows + i]);
		fprintf(stderr, "\n");
	}
	fprintf(stderr, "\n");
}

void matrix_multiplication(int m, int n, int k,
			   double *a, double *b, double *c)
{
	int i, j, l;

	for (i = 0; i < m; i++)
		for (j = 0; j < n; j++)
			for (l = 0; l < k; l++)
				c[j * m + i] += a[l * m + i] * b[j * k + l];
}

void dgemm_layout(size_t mt, size_t nt, size_t kt,
		  struct aml_layout *ltile_a,
		  struct aml_layout *ltile_b,
		  struct aml_layout *ltile_c)
{
	for (size_t ii = 0; ii < mt; ii++) {
		for (size_t jj = 0; jj < nt; jj++) {
			double *ct;

			ct = aml_layout_deref(ltile_c, (size_t[]){ii, jj});
			for (size_t ll = 0; ll < kt; ll++) {
				double *at, *bt;

				at = aml_layout_deref(ltile_a,
						      (size_t[]){ii, ll});
				bt = aml_layout_deref(ltile_b,
						      (size_t[]){ll, jj});
				*ct += *at * *bt;
			}
		}
	}
}

int dgemm_tiled(struct aml_tiling *tiling_a, struct aml_tiling *tiling_b,
		struct aml_tiling *tiling_c)
{
	size_t ndims = aml_tiling_ndims(tiling_a);
	size_t tile_a_dims[ndims];
	size_t tile_b_dims[ndims];
	size_t mt, nt, kt;
	size_t tiling_a_dims[ndims];
	size_t tiling_b_dims[ndims];
	size_t tiling_c_dims[ndims];

	aml_tiling_tile_dims(tiling_a, NULL, tile_a_dims);
	aml_tiling_tile_dims(tiling_b, NULL, tile_b_dims);
	mt = tile_a_dims[0];
	kt = tile_a_dims[1];
	nt = tile_b_dims[1];
	aml_tiling_dims(tiling_a, tiling_a_dims);
	aml_tiling_dims(tiling_b, tiling_b_dims);
	aml_tiling_dims(tiling_c, tiling_c_dims);

	// Going through the tiling tile by tile
	for (size_t i = 0; i < tiling_c_dims[0]; i++) {
		for (size_t j = 0; j < tiling_c_dims[1]; j++) {
			struct aml_layout *ltile_c;

			ltile_c = aml_tiling_index(tiling_c, (size_t[]){i, j});
			for (size_t l = 0; l < tiling_a_dims[1]; l++) {
				struct aml_layout *ltile_a, *ltile_b;

				ltile_a = aml_tiling_index(tiling_a,
						       (size_t[]){i, l});
				ltile_b = aml_tiling_index(tiling_b,
						       (size_t[]){l, j});
				dgemm_layout(mt, nt, kt, ltile_a, ltile_b,
					     ltile_c);
				aml_layout_destroy(&ltile_a);
				aml_layout_destroy(&ltile_b);
			}
			aml_layout_destroy(&ltile_c);
		}
	}
	return AML_SUCCESS;
}

int main(int argc, char **argv)
{
	if (aml_init(&argc, &argv) != AML_SUCCESS)
		return 1;

	size_t m = 6, n = 12, k = 9;
	double *a, *b, *c, *c_ref;

	// Allocationg our matrices
	a = malloc(sizeof(double) * m * k);
	b = malloc(sizeof(double) * k * n);
	c = malloc(sizeof(double) * m * n);
	c_ref = malloc(sizeof(double) * m * n);
	assert(a != NULL && b != NULL && c != NULL && c_ref != NULL);

	// Initializing our matrices, all of a is 1, all of b is 2
	init_matrix(a, m, k, 1.0);
	init_matrix(b, k, n, 2.0);
	init_matrix(c, m, n, 0.0);
	init_matrix(c_ref, m, n, 0.0);

	// Calculating our reference result
	matrix_multiplication(m, n, k, a, b, c_ref);
	print_matrix(c_ref, m, n);

	fprintf(stderr, "Creating layouts...\n");

	struct aml_layout *layout_a, *layout_b, *layout_c;
	size_t dims_a[2] = { m, k };
	size_t dims_b[2] = { k, n };
	size_t dims_c[2] = { m, n };

	assert(!aml_layout_dense_create(&layout_a, a,
					AML_LAYOUT_ORDER_C,
					sizeof(double), 2, dims_a, NULL,
					NULL));
	assert(!aml_layout_dense_create(&layout_b, b,
					AML_LAYOUT_ORDER_C,
					sizeof(double), 2, dims_b, NULL,
					NULL));
	assert(!aml_layout_dense_create(&layout_c, c,
					AML_LAYOUT_ORDER_C,
					sizeof(double), 2, dims_c, NULL,
					NULL));

	// We divide each matrix in 9 blocks
	fprintf(stderr, "Creating tilings...\n");

	struct aml_tiling *tiling_a, *tiling_b, *tiling_c;
	size_t tile_a_dims[2] = { 2, 3 };
	size_t tile_b_dims[2] = { 3, 4 };
	size_t tile_c_dims[2] = { 2, 4 };

	assert(!aml_tiling_resize_create(&tiling_a,
					 AML_TILING_ORDER_C,
					 layout_a, 2, tile_a_dims));
	assert(!aml_tiling_resize_create(&tiling_b,
					 AML_TILING_ORDER_C,
					 layout_b, 2, tile_b_dims));
	assert(!aml_tiling_resize_create(&tiling_c,
					 AML_TILING_ORDER_C,
					 layout_c, 2, tile_c_dims));

	// Do the matrix multiplication
	assert(!dgemm_tiled(tiling_a, tiling_b, tiling_c));
	print_matrix(c, m, n);

	/* Destroy everything */
	aml_tiling_resize_destroy(&tiling_a);
	aml_tiling_resize_destroy(&tiling_b);
	aml_tiling_resize_destroy(&tiling_c);
	aml_layout_destroy(&layout_a);
	aml_layout_destroy(&layout_b);
	aml_layout_destroy(&layout_c);
	free(a);
	free(b);
	free(c);
	free(c_ref);

	aml_finalize();

	return 0;
}

You can find this solution in doc/tutorials/tiling.