The design of a parallel I/O system involves a number of important alternatives, including the type of disks used, the caching strategies implemented, and the collective I/O supported. This paper described PIOSIM, an extensible and modular simulator for evaluating the impact of design alternatives at each of the preceding levels and the expected performance of MPI-IO programs. We presented experiments that demonstrate the usefulness of PIOSIM as a framework for studying caching strategies within a parallel file system. PIOSIM has been implemented on a parallel architecture to reduce model execution time. We also presented results on the effectiveness of parallel execution, showing speedups for the NAS BTIO benchmarks and matrix multiplication.