Henceforth, we use the term Target Program to refer to the message passing program whose performance is to be predicted and Target Machine to refer to the machine on which the target program executes. The term Simulator refers to the program that simulates execution of the target program and Host Machine refers to the sequential or parallel architecture on which the simulator executes. The simulator contains the following primary components:
A companion paper submitted to this conference [PBB97] describes MPISIM. We restrict our attention in this paper to parallel I/O simulation. The I/O simulator has been designed to be both modular and extensible: it is relatively easy to replace individual modules at each of the preceding levels. In particular, it is straightforward to replace the disk models, to modify the caching or partitioning policies, to modify the implementation of a specific collective MPI-IO operation, and to change the model of the interconnection network.
We assume that an MPI-IO program includes local code blocks that are simulated by direct execution, MPI communication calls that are simulated by MPISIM, and MPI-IO commands. Each process of an MPI program is modeled by a single thread in the simulator; we refer to this thread as the target LP. When a target LP executes an MPI-IO call, the call is intercepted by PIO-SIM. In the case of collective I/O, MPISIM's underlying communication facility is used for synchronization and communication between target LPs. Complex user-defined datatypes allow a process to access non-contiguous pieces of data with a single MPI-IO call; when such a datatype is used, the single non-contiguous request is decomposed into multiple contiguous requests. PIO-SIM uses standard UNIX system I/O calls (e.g., read() and write()) to replicate the functionality of these operations in the simulator. The resulting I/O requests are then passed to PFS-SIM to determine the I/O execution time for the simulated subsystem.
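To illustrate the decomposition step, the following minimal sketch flattens a strided access pattern (similar in spirit to an MPI vector datatype) into contiguous (offset, length) requests. The function name flatten_vector and its byte-based parameters are hypothetical and chosen for illustration only; the sketch does not reproduce PIO-SIM's actual code.

```python
from typing import List, Tuple


def flatten_vector(base_offset: int, count: int, blocklen: int,
                   stride: int) -> List[Tuple[int, int]]:
    """Decompose a strided access into contiguous (offset, length) requests.

    Hypothetical helper, not part of PIO-SIM. All quantities are in bytes:
    `count` blocks of `blocklen` bytes, with block starts `stride` bytes apart.
    """
    if count < 0 or blocklen < 0:
        raise ValueError("count and blocklen must be non-negative")
    if count == 0 or blocklen == 0:
        return []
    if stride < blocklen:
        raise ValueError("stride smaller than blocklen: blocks would overlap")

    requests: List[Tuple[int, int]] = []
    for i in range(count):
        offset = base_offset + i * stride
        if requests and requests[-1][0] + requests[-1][1] == offset:
            # This block directly follows the previous one: extend it
            # instead of issuing a separate contiguous request.
            prev_offset, prev_len = requests[-1]
            requests[-1] = (prev_offset, prev_len + blocklen)
        else:
            requests.append((offset, blocklen))
    return requests
```

For example, flatten_vector(0, 3, 4, 10) yields three 4-byte requests at offsets 0, 10 and 20, whereas a stride equal to the block length collapses into a single contiguous request.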
The I/O subsystem can be simulated at multiple levels of detail. At the most abstract level, a simple analytical model calculates the I/O time as a function of the specified disk performance characteristics and the size of the data transfer. For more detailed analysis, PFS-SIM simulates the parallel file system's cnodes, ionodes and disks. Each I/O request from PIO-SIM is processed by the corresponding cnode LP and then distributed to ionode LPs according to the selected physical data layout. Similarly, ionode LPs send their requests to disk LPs, where I/O service times are calculated.
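The two levels of detail can be sketched as follows. The text does not give the form of the analytical model or the available data layouts, so both are assumptions here: the I/O time is taken to be average seek time plus average rotational latency plus transfer time, and the layout is taken to be round-robin striping across ionodes. The names DiskParams, analytical_io_time and stripe_request are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DiskParams:
    """Assumed disk characteristics (hypothetical parameter set)."""
    avg_seek_s: float            # average seek time, seconds
    avg_rotation_s: float        # average rotational latency, seconds
    transfer_bytes_per_s: float  # sustained transfer rate, bytes/second


def analytical_io_time(disk: DiskParams, nbytes: int) -> float:
    """Assumed model: positioning overhead plus size-proportional transfer."""
    if nbytes <= 0:
        return 0.0
    return (disk.avg_seek_s + disk.avg_rotation_s
            + nbytes / disk.transfer_bytes_per_s)


def stripe_request(offset: int, length: int, stripe_size: int,
                   num_ionodes: int) -> Dict[int, int]:
    """Split one contiguous request into bytes per ionode.

    Assumes round-robin striping: stripe k is stored on ionode
    k % num_ionodes. Returns a mapping {ionode index: bytes requested}.
    """
    if stripe_size <= 0 or num_ionodes <= 0:
        raise ValueError("stripe_size and num_ionodes must be positive")
    per_node: Dict[int, int] = {}
    pos, end = offset, offset + length
    while pos < end:
        stripe_index = pos // stripe_size
        node = stripe_index % num_ionodes
        stripe_end = (stripe_index + 1) * stripe_size
        chunk = min(end, stripe_end) - pos  # bytes that fall in this stripe
        per_node[node] = per_node.get(node, 0) + chunk
        pos += chunk
    return per_node
```

Under these assumptions, a cnode LP would call stripe_request to decide which ionode LPs receive part of a request, and each disk LP could apply analytical_io_time (or a more detailed disk model) to its share of the bytes.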