SESAME:
System Software Measurement and Evaluation
OBJECTIVE
The goal of this research is to identify major performance bottlenecks in supercomputer system software. This will be achieved through a coordinated effort of measurement, modeling, simulation, architectural evaluation, and experimental alteration, taking a global view of the overall systems software environment. See also the four-part SESAME graphic for an overview of our timeline, ideas, and project impact.
APPROACH
The first and major step is to develop a broad-based set of instrumentation tools with which a user can measure the behavior of existing systems. In parallel, a standard set of benchmarks will be selected from existing applications, and additional benchmarks will be developed. Measurements of numerous system parameters on multiple scalable systems architectures are in progress. Models of these systems are being created from the observed behavior. These models (both analytical and simulation) will be iteratively refined until they correctly predict system behavior. Then, and most importantly, these measurements, models, and general systems insights will be used to propose systems architecture changes and to evaluate those changes using the validated models. The data (measured and generated by simulation), the tools developed, and the models will all be placed in a well-structured repository.
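The refine-until-validated loop described above can be illustrated with a minimal Python sketch. It is not the project's tooling: the latency figures are made-up values chosen for the example (not SESAME measurements), and the function names, the two candidate models, and the 5% tolerance are all assumptions. The sketch tries progressively more detailed analytical models and stops at the first one whose predictions agree with the measurements.

```python
def max_relative_error(predict, measurements):
    """Largest relative gap between predicted and measured values."""
    return max(abs(predict(x) - y) / abs(y) for x, y in measurements)


def fit_constant(measurements):
    """Crudest candidate: predict the mean latency for every message size."""
    mean = sum(y for _, y in measurements) / len(measurements)
    return lambda x: mean


def fit_linear(measurements):
    """Least-squares fit of latency = intercept + slope * size."""
    n = len(measurements)
    mean_x = sum(x for x, _ in measurements) / n
    mean_y = sum(y for _, y in measurements) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in measurements)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in measurements)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def refine(measurements, candidates, tolerance):
    """Return the first candidate model whose error is within tolerance."""
    for name, fit in candidates:
        model = fit(measurements)
        error = max_relative_error(model, measurements)
        if error <= tolerance:
            return name, model, error
    raise RuntimeError("no candidate model met the tolerance")


# Hypothetical data: (message size in bytes, latency in microseconds).
measurements = [(0, 50.0), (1000, 80.0), (2000, 110.0), (4000, 170.0)]
candidates = [("constant", fit_constant), ("linear", fit_linear)]

chosen_name, chosen_model, error = refine(measurements, candidates, tolerance=0.05)
print(chosen_name, round(chosen_model(3000), 3))
```

For brevity the sketch checks each model against the same data it was fitted to; a real validation would compare predictions with measurements that were held back from the fit.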
A key component of the SESAME project is the Repository. This is a multilayered data structure in which all the results of our modeling, measurement, and simulation efforts will be stored, along with the tools and models that produced them. The design calls for a first level, the Data Library, in which all the data, whether measured, simulated, or imported from others, will reside. The data formats are being designed, and all data entries will conform to these formats; the goal is to have the entries be "self-labeled" in a way that specifies how the data was obtained, the configuration, the assumptions, etc., so that the data could be reproduced if necessary. The second level contains the Tools Library, which consists of simulation packages, modeling packages, languages, etc. The third level contains the Models Library, in which specific MPP models and their components reside. For example, the Models Library might contain a model of a particular interconnection network; any data that has been derived from that model would reside in the Data Library and would be pointed to from the Models Library. On top of the entire structure we are placing a Graphical User Interface; this will be an adaptation of existing GUIs. Moreover, in the long range, we will make this Repository available through the Internet using a Web-style interface.
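The three-level layout described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the actual data formats are still being designed, so every class, field, and entry name here (DataEntry, Model, Repository, "run-001", "example_simulator", "interconnect_model") is hypothetical. The sketch shows a self-labeled data entry and a Models Library item that points to data derived from it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

ALLOWED_SOURCES = ("measured", "simulated", "imported")


@dataclass
class DataEntry:
    """Self-labeled Data Library record (field names are hypothetical)."""
    entry_id: str
    source: str                    # one of ALLOWED_SOURCES
    configuration: Dict[str, str]  # how the run was set up
    assumptions: List[str]         # what was assumed when obtaining the data
    values: List[float]


@dataclass
class Model:
    """Models Library item; data_refs point into the Data Library."""
    name: str
    tool: str                      # name of a Tools Library package
    data_refs: List[str] = field(default_factory=list)


@dataclass
class Repository:
    data_library: Dict[str, DataEntry] = field(default_factory=dict)
    tools_library: Dict[str, str] = field(default_factory=dict)
    models_library: Dict[str, Model] = field(default_factory=dict)

    def add_data(self, entry: DataEntry, model_name: Optional[str] = None) -> None:
        """Store an entry and, if given, link it from the model that produced it."""
        if entry.source not in ALLOWED_SOURCES:
            raise ValueError(f"unknown data source: {entry.source!r}")
        model = self.models_library[model_name] if model_name is not None else None
        self.data_library[entry.entry_id] = entry
        if model is not None:
            model.data_refs.append(entry.entry_id)

    def data_for_model(self, model_name: str) -> List[DataEntry]:
        """Follow the pointers from a model to its derived data."""
        refs = self.models_library[model_name].data_refs
        return [self.data_library[ref] for ref in refs]


# Hypothetical usage: one tool, one interconnection-network model, one data set.
repo = Repository()
repo.tools_library["example_simulator"] = "placeholder simulation package"
repo.models_library["interconnect_model"] = Model(
    name="interconnect_model", tool="example_simulator"
)
repo.add_data(
    DataEntry(
        entry_id="run-001",
        source="simulated",
        configuration={"nodes": "16"},
        assumptions=["uniform traffic"],
        values=[1.0, 2.0, 3.0],
    ),
    model_name="interconnect_model",
)
print([entry.entry_id for entry in repo.data_for_model("interconnect_model")])
```

Keeping the data in one library and storing only references in the model mirrors the description above, in which derived data resides in the Data Library and is pointed to from the Models Library.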
DETAILS
Project start date: October 30, 1994
Funded by ARPA/CSTO under contract No. F30602-94-C-0273
Principal Investigators:
Rajive Bagrodia, Computer Science Dept, UCLA
Leonard Kleinrock, Computer Science Dept, UCLA
Gerald J. Popek, Platinum Computing
PLAN
Forthcoming
PUBLICATIONS
"An Adaptive Synchronization Method for Unpredictable Communication Patterns in Dataparallel Programs," S. Prakash and R. Bagrodia; Proceedings of the 9th International Parallel Processing Symposium - IPPS '95, Santa Barbara, CA, March 1995, pp. 838-844.
"Integrating Task and Data Parallelism," Maneesh Dhagat; UCLA Computer Science Dept. PhD Dissertation, June 1995.
"UC: A Set-Based Language for Data Parallel Programs," R. Bagrodia, M. Chandy, and M. Dhagat; Journal of Parallel and Distributed Computing, Vol. 28, August 1995, pp. 186-201.
"Parallel Simulation of Data Parallel Programs," S. Prakash and R. Bagrodia; Proceedings of the 8th Workshop on Languages and Compilers for Parallel Computing, Columbus, Ohio, August 1995.
"Integrating Task and Data Parallelism in UC," Maneesh Dhagat, Rajive Bagrodia, and Mani Chandy; Proceedings of the International Conference on Parallel Processing, Oconomowoc, Wisconsin, August 1995.
"A Performance Evaluation Methodology for Parallel Simulation Protocols," V. Jha and R. L. Bagrodia; Proceedings of the 10th Workshop on Parallel and Distributed Simulations - PADS '96, Philadelphia, PA, May 22-24, 1996, pp. 180-183.
"Parallel Simulation of a High-Speed Wormhole Routing Network," R. L. Bagrodia, Y-A. Chen, M. Gerla, B. Kwan, J. Martin, P. Palnati, S. Walton; Proceedings of the 10th Workshop on Parallel and Distributed Simulations - PADS '96, Philadelphia, PA, May 22-24, 1996, pp. 47-56.
"Perils and Pitfalls of Parallel Discrete-Event Simulation," R. Bagrodia; Proceedings of the 1996 Winter Simulation Conference - WSC '96, eds. J. Charnes and D. J. Morrice, Coronado, CA, 1996.
"Performance Prediction of Parallel Programs," Sundeep Prakash; PhD Dissertation, November 1996. [Postscript file] [HTML]
"Parallel Simulation of Parallel File Systems and I/O Programs," Rajive Bagrodia, Stephen Docy, and Andy Kahn; published in Supercomputing 97 (SC97). [Postscript file] [HTML] [HTML of 6-page Extended Abstract]
"Parallel Simulation of Parallel I/O Programs," Andy Kahn; Computer Science Master's Thesis, 1997. [Postscript file] [Digest version]
"Maisie on World Wide Web," Suresh Thakur; Computer Science Master's Report, 1997. [MS Word file]
POSTERS AND PRESENTATIONS
Poster displayed at the ARPA Joint PI meeting in Ft. Lauderdale, FL, July 1995. Available as a version without pictures (134K PostScript file) and as a detailed version (in another directory).
Quorum 1996 Presentation (December)
Slides of SC97 Presentation on Parallel Simulation of Parallel File Systems and I/O; presented by Stephen Docy
SOFTWARE
MPI-LITE Home Page
WHAT'S NEW
(in reverse chronological order)
Slides of SC97 Presentation on Parallel Simulation of Parallel File Systems and I/O; presented by Stephen Docy
"Parallel Simulation of Parallel File Systems and I/O Programs," Rajive Bagrodia, Stephen Docy, and Andy Kahn; published in Supercomputing 97 (SC97). [Postscript file] [HTML version]
"Using Parallel Simulation to Evaluate MPI Programs," Sundeep Prakash, Rajive Bagrodia, and Punit Bhargava. [Postscript file coming soon]
MPI-LITE Home Page
Quorum 1996 Presentation
MPISIM - A parallel simulator for task parallel programs using MPI (Message Passing Interface). Verified and validated on the IBM SP2 for the NAS Parallel Benchmarks. Read about it in this PhD Dissertation.
Poster displayed at the ARPA Joint PI meeting in Ft. Lauderdale, FL, July 1995. Available as a version without pictures (134K PostScript file) and as a detailed version (in another directory).
SESAME News before 1997
Message Latencies on the SP2
Comments to: sdocy@cs.ucla.edu
Updated: April 1, 1998