Figure 1 shows the simulator speedup from parallel execution for the NAS BTIO benchmarks and matrix multiplication. For matrix multiplication, the reported figures understate the actual speedup for two reasons: 1) we did not perform the actual I/O, in order to reduce simulator execution time, and 2) the matrix size was relatively small.