Parallel I/O

Experimental html version of Parallel Programming in MPI, OpenMP, and PETSc by Victor Eijkhout. Download the textbook at https://theartofhpc.com/pcse

53 Parallel I/O

Parallel I/O is a tricky subject. You can try to let all processors jointly write one file, or to write a file per process and combine them later. With the standard mechanisms of your programming language there are the following considerations:

  • On clusters where the processes have individual file systems, the only way to write a single file is to let it be generated by a single processor.
  • Writing one file per process is easy to do, but

    • you need a post-processing script;
    • if the files are not on a shared file system (such as Lustre), it takes additional effort to bring them together;
    • if the files are on a shared file system, writing many files may be a burden on the metadata server.
  • On a shared file system it is possible for all processes to open the same file and set the file pointer individually. This can be difficult if the amount of data per process is not uniform.

Illustrating the last point:

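// pseek.c
// every process opens the same file; note that mode "w" truncates it on each open
FILE *pfile;
pfile = fopen("pseek.dat","w");
// move the file pointer to a location that depends on the process number
fseek(pfile,procid*sizeof(int),SEEK_CUR);
//  fseek(pfile,procid*sizeof(char),SEEK_CUR);
// text output: the number of bytes written depends on the number of digits
fprintf(pfile,"%d\n",procid);
fclose(pfile);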
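
The difficulty with non-uniform data can be sidestepped if every process writes a record of the same length, because the offset is then simply the process number times the record length. The following is a minimal sketch of that idea, not the textbook's code: the names `RECORD_LEN`, `create_shared_file`, and `write_record` are hypothetical, the record layout (nine characters plus a newline) is an assumption, and the process number is passed in as an ordinary argument.

```c
#include <stdio.h>

/* Assumed record layout: 9 characters for the number, plus a newline. */
#define RECORD_LEN 10

/* Create (or truncate) the shared file once, before any process writes. */
int create_shared_file(const char *path) {
  FILE *f = fopen(path, "wb");
  if (f == NULL) return -1;
  return fclose(f);
}

/* Write the record of one process at its own offset.
   Mode "r+b" opens for update without truncating what others wrote.
   Assumes procid has at most 9 digits, so the record has fixed length. */
int write_record(const char *path, int procid) {
  FILE *f = fopen(path, "r+b");
  if (f == NULL) return -1;
  if (fseek(f, (long)procid * RECORD_LEN, SEEK_SET) != 0) {
    fclose(f);
    return -1;
  }
  int written = fprintf(f, "%9d\n", procid);
  if (fclose(f) != 0) return -1;
  return written == RECORD_LEN ? 0 : -1;
}
```

With fixed-length records the order in which the processes write no longer matters. Whether concurrent writes through the C library behave well on a given parallel file system is a separate question that this sketch does not address.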

MPI also has its own portable I/O: MPI I/O, for which see the chapter MPI topic: File I/O.

Alternatively, one could use a library such as HDF5; see \CARPref{tut:hdf5}.

For a great discussion see [Mendez:ParallelIOpage], from which figures here are taken.

53.1 Use sequential I/O

MPI processes can do anything a regular process can, including opening a file. This is the simplest form of parallel I/O: every MPI process opens its own file. To prevent write collisions,

  • you use MPI_Comm_rank to generate a unique file name, or
  • you use a local file system, typically /tmp, that is unique per process, or at least per the group of processes on a node.
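
The first option can be sketched as follows. This is an illustration under assumptions, not a prescribed scheme: the helper names `rank_filename` and `write_own_file` and the name pattern `<base>.<rank>.dat` are hypothetical. In an MPI program the `rank` argument would come from `MPI_Comm_rank`; here it is an ordinary parameter so that the code does not depend on an MPI installation.

```c
#include <stdio.h>

/* Build "<base>.<rank>.dat", with the rank zero-padded to 5 digits.
   Returns 0 on success, -1 if the buffer is too small. */
int rank_filename(char *buf, size_t buflen, const char *base, int rank) {
  int n = snprintf(buf, buflen, "%s.%05d.dat", base, rank);
  return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}

/* Each process writes its own data to its own file,
   so no two processes ever write to the same file. */
int write_own_file(const char *base, int rank,
                   const double *data, size_t count) {
  char name[256];
  if (rank_filename(name, sizeof name, base, rank) != 0) return -1;
  FILE *f = fopen(name, "wb");
  if (f == NULL) return -1;
  size_t written = fwrite(data, sizeof(double), count, f);
  if (fclose(f) != 0) return -1;
  return written == count ? 0 : -1;
}
```

Zero-padding the rank keeps the files in rank order in a directory listing, which can simplify a later post-processing step.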

For reading it is actually possible for all processes to open the same file, but for writing this is not really feasible. Hence the unique files.

53.2 MPI I/O

In the chapter MPI topic: File I/O we discussed MPI I/O. This is a way for all processes on a communicator to open a single file, and write to it in a coordinated fashion. This has the big advantage that the end result is an ordinary Unix file.
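
Writing in a coordinated fashion requires, among other things, that each process knows where its data goes in the file. If the amount of data per process is not uniform, the offset of a process is the sum of the sizes of all lower-numbered processes, that is, an exclusive prefix sum. The sketch below computes these offsets serially from an array of sizes; the function name `byte_offsets` is hypothetical. In an actual MPI program one would presumably obtain the same number with a collective such as `MPI_Exscan` and pass it as the offset to a routine such as `MPI_File_write_at`.

```c
#include <stddef.h>

/* Exclusive prefix sum over the per-process byte counts:
   offsets[p] = nbytes[0] + ... + nbytes[p-1], so offsets[0] = 0.
   Returns the total size of the resulting file in bytes. */
size_t byte_offsets(const size_t *nbytes, int nprocs, size_t *offsets) {
  size_t running = 0;
  for (int p = 0; p < nprocs; p++) {
    offsets[p] = running;   /* where process p starts writing */
    running += nbytes[p];   /* space taken by process p */
  }
  return running;
}
```

A process that contributes no data gets the same offset as its successor, which is harmless since it writes nothing there.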

53.3 Higher level libraries

Libraries such as NetCDF or HDF5 offer advantages over MPI I/O:

  • Files can be independent of the OS and the hardware, removing worries such as whether data was stored in little-endian or big-endian byte order.
  • Files are self-documenting: they contain the metadata describing their contents.
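
To see what the first point saves you from: writing an integer with `fwrite` stores its bytes in the order of the machine that wrote the file, so a reader on a machine with the opposite byte order sees a different number. A file that is to be portable without a library has to fix the byte order explicitly. The following sketch shows the idea for a 32-bit unsigned integer; the function names are hypothetical, and this is only an illustration of the problem, not a description of how NetCDF or HDF5 store data internally.

```c
#include <stdint.h>

/* Store a 32-bit value as 4 bytes, least significant byte first,
   regardless of the byte order of the machine running this code. */
void store_u32_le(unsigned char out[4], uint32_t v) {
  out[0] = (unsigned char)(v & 0xFF);
  out[1] = (unsigned char)((v >> 8) & 0xFF);
  out[2] = (unsigned char)((v >> 16) & 0xFF);
  out[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Reassemble the value from 4 bytes stored least significant byte first. */
uint32_t load_u32_le(const unsigned char in[4]) {
  return (uint32_t)in[0]
       | ((uint32_t)in[1] << 8)
       | ((uint32_t)in[2] << 16)
       | ((uint32_t)in[3] << 24);
}
```

Doing this by hand for every data type, and documenting the layout somewhere outside the file, is the kind of bookkeeping that the higher level libraries take over.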
