Heterogeneous computing requires that the data constituting a message
be typed or described somehow so that its machine representation can be
converted between computer architectures. MPI can thoroughly describe
message datatypes, from simple primitive machine types to complex
structures, strided arrays, and indexed blocks.
MPI messaging functions accept a datatype parameter, whose C typedef is
MPI_Datatype:

int MPI_Send(void *buf, int count, MPI_Datatype datatype,
    int dest, int tag, MPI_Comm comm);
Basic Datatypes
Everybody uses the primitive machine datatypes. Some C examples are
listed below (with the corresponding C datatype in parentheses):
MPI_CHAR (char)
MPI_INT (int)
MPI_FLOAT (float)
MPI_DOUBLE (double)
The count parameter in MPI_Send( ) refers to the number of elements of
the given datatype, not the total number of bytes.
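For example, ten integers are sent with a count of 10, not
10 * sizeof(int) bytes. A minimal fragment (dest and tag assumed to be
set elsewhere):

{
    int values[10];
    int dest, tag;

    /* count is 10 elements of MPI_INT, not a byte count */
    MPI_Send(values, 10, MPI_INT, dest, tag, MPI_COMM_WORLD);
}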
For messages consisting of a homogeneous, contiguous array of basic
datatypes, this is the end of the datatype discussion. For messages
that contain more than one datatype or whose elements are not stored
contiguously in memory, something more is needed.
Strided Vector
Consider a mesh application with patches of a 2D array assigned to
different processes. The internal boundary rows and columns are
transferred between north/south and east/west processes in the overall
mesh. In C, the transfer of a row in a 2D array is simple: a
contiguous vector of elements equal in number to the number of columns
in the 2D array. The elements of a single column, however, are
dispersed in memory; each element is separated from the next by the
length of one entire row.
An MPI derived datatype is a good solution for a non-contiguous data
structure. A code fragment to derive an appropriate datatype matching
this strided vector and then transmit the last column is listed below:
#include <mpi.h>
{
    float mesh[10][20];
    int dest, tag;
    MPI_Datatype newtype;
/*
 * Do this once.
 */
    MPI_Type_vector(10,     /* count: 10 blocks, one per row */
        1,                  /* blocklength: 1 element per block */
        20,                 /* stride: block starts one row apart */
        MPI_FLOAT,          /* elements are float */
        &newtype);          /* MPI derived datatype */
    MPI_Type_commit(&newtype);
/*
 * Do this for every new message.
 */
    MPI_Send(&mesh[0][19], 1, newtype,
        dest, tag, MPI_COMM_WORLD);
}
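The matching receive can reuse the same derived datatype to scatter the
incoming elements directly into place. A hedged sketch, assuming the
receiving process has built and committed an identical newtype over its
own mesh array:

{
    int src, tag;
    MPI_Status status;

    /* receive one column directly into the first column of mesh */
    MPI_Recv(&mesh[0][0], 1, newtype,
        src, tag, MPI_COMM_WORLD, &status);
}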
MPI_Type_commit( ) separates the datatypes you really want to save and
use from the intermediate ones that serve as scaffolding on the way to
a more complex datatype.
A nice feature of MPI derived datatypes is that once created, they can
be used repeatedly with no further set-up code. MPI has many other
derived datatype constructors.
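For example, MPI_Type_contiguous( ) replicates an existing datatype
into contiguous locations. A minimal sketch that describes one full row
of the earlier 10x20 mesh as a single unit (rowtype is a name chosen
here for illustration):

{
    MPI_Datatype rowtype;

    /* one mesh row: 20 contiguous floats */
    MPI_Type_contiguous(20, MPI_FLOAT, &rowtype);
    MPI_Type_commit(&rowtype);
}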
C Structure
Consider an imaging application that transfers fixed-length scan lines
of eight-bit color pixels. Coupled with the pixel array is the scan
line number, an integer. The message might be described in C as a
structure:
struct {
    int lineno;
    char pixels[1024];
} scanline;
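This structure can itself be described by a derived datatype. A hedged
sketch using the MPI-2 constructor MPI_Type_create_struct( ), with
member displacements computed by MPI_Get_address( ) (structtype is a
name chosen here for illustration):

{
    int blocklens[2] = { 1, 1024 };
    MPI_Aint displs[2], base;
    MPI_Datatype types[2] = { MPI_INT, MPI_CHAR };
    MPI_Datatype structtype;

    /* compute member displacements relative to the structure start */
    MPI_Get_address(&scanline, &base);
    MPI_Get_address(&scanline.lineno, &displs[0]);
    MPI_Get_address(scanline.pixels, &displs[1]);
    displs[0] -= base;
    displs[1] -= base;

    MPI_Type_create_struct(2, blocklens, displs, types, &structtype);
    MPI_Type_commit(&structtype);
}

A send would then transmit one element of structtype starting at
&scanline.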
In addition to a derived datatype, message packing is a useful method
for sending non-contiguous and/or heterogeneous data. A code fragment
to pack and send the above structure is listed below:
#include <mpi.h>
#include <stdlib.h>
{
    int membersize, maxsize;
    int position;
    int dest, tag;
    char *buffer;
/*
 * Do this once.
 */
    MPI_Pack_size(1,        /* one element */
        MPI_INT,            /* datatype integer */
        MPI_COMM_WORLD,     /* consistent comm. */
        &membersize);       /* max packing space req'd */
    maxsize = membersize;
    MPI_Pack_size(1024, MPI_CHAR, MPI_COMM_WORLD, &membersize);
    maxsize += membersize;
    buffer = malloc(maxsize);
/*
 * Do this for every new message.
 */
    position = 0;
    MPI_Pack(&scanline.lineno,  /* pack this element */
        1,                      /* one element */
        MPI_INT,                /* datatype int */
        buffer,                 /* packing buffer */
        maxsize,                /* buffer size */
        &position,              /* next free byte offset */
        MPI_COMM_WORLD);        /* consistent comm. */
    MPI_Pack(scanline.pixels, 1024, MPI_CHAR,
        buffer, maxsize, &position, MPI_COMM_WORLD);
    MPI_Send(buffer, position, MPI_PACKED,
        dest, tag, MPI_COMM_WORLD);
}
A buffer is allocated once, large enough to hold the packed structure.
The size must be computed with MPI_Pack_size( ) because of
implementation-dependent overhead in the message. Variable-sized
messages can be handled by allocating a buffer large enough for the
largest possible message. After each call to MPI_Pack( ), the position
parameter contains the current size of the packed message in bytes.
A code fragment to unpack the message, assuming a receive buffer has
been allocated, is listed below:
{
    int src;
    int msgsize;
    MPI_Status status;

    MPI_Recv(buffer, maxsize, MPI_PACKED,
        src, tag, MPI_COMM_WORLD, &status);

    position = 0;
    MPI_Get_count(&status, MPI_PACKED, &msgsize);

    MPI_Unpack(buffer,          /* packing buffer */
        msgsize,                /* buffer size */
        &position,              /* next element byte offset */
        &scanline.lineno,       /* unpack this element */
        1,                      /* one element */
        MPI_INT,                /* datatype int */
        MPI_COMM_WORLD);        /* consistent comm. */
    MPI_Unpack(buffer, msgsize, &position,
        scanline.pixels, 1024, MPI_CHAR, MPI_COMM_WORLD);
}
You should be able to modify the above code fragments for any
structure. It is entirely possible to alter the number of elements to
unpack based on application information unpacked earlier in the same
message.
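For example, a message might carry an element count followed by that
many elements. A hedged sketch of the receiving side (buffer, msgsize,
and position as above; MAXELEMS is a hypothetical upper bound):

{
    int nelems;
    float elems[MAXELEMS];

    /* unpack the count first, then exactly that many elements */
    MPI_Unpack(buffer, msgsize, &position,
        &nelems, 1, MPI_INT, MPI_COMM_WORLD);
    MPI_Unpack(buffer, msgsize, &position,
        elems, nelems, MPI_FLOAT, MPI_COMM_WORLD);
}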