Next: Experiments and Results
Up: Parallel I/O Simulator
Previous: PIO-SIM
The basic structure and functionality of PFS-SIM is taken from the
Vesta parallel file system, a highly scalable, experimental file
system developed by IBM [CF96].
Many of Vesta's features have been
included in the design of PFS-SIM, most notably an
interface that allows user applications to configure the parallelism
actually used to perform I/O.
In addition to the flexibility contained within the Vesta interface, PFS-SIM
allows many of the file system's physical characteristics to be configured,
such as the cnode/ionode ratio, the number of disk drives attached to each
ionode, disk drive characteristics, and a multitude of cache setups.
While the Vesta file system implemented caching only at the ionodes,
PFS-SIM supports systems which have cache at both ionodes and cnodes,
though ionode-only caching can be simulated by setting the cnode cache size
to zero. This caching setup offers a larger variety of configurations
for study, including cooperative caching. PFS-SIM also supports a full
range of cache sizes, cache block sizes, cache associativities (direct-mapped,
fully associative, set associative) and write policies
(write-through/write-back, write-allocate/write-around).
Write-invalidation is used to maintain cache
coherency, though write-update could easily be added. Block
replacement uses the LRU algorithm.
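As an illustration of the LRU replacement and write-invalidation described above, the following is a minimal sketch of a fully associative block cache. It is not taken from PFS-SIM: the class name `LRUBlockCache` and its methods are hypothetical, and set-associative or direct-mapped organizations and the write policies are left out for brevity. A capacity of zero stores nothing, which mirrors how a cnode cache size of zero reduces the system to ionode-only caching.

```python
from collections import OrderedDict


class LRUBlockCache:
    """Fully associative block cache with LRU replacement (illustrative only)."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        # Maps block id -> data, ordered from least to most recently used.
        self.blocks = OrderedDict()

    def lookup(self, block_id):
        """Return the cached data, or None on a miss. A hit refreshes recency."""
        if block_id not in self.blocks:
            return None
        self.blocks.move_to_end(block_id)
        return self.blocks[block_id]

    def insert(self, block_id, data):
        """Cache a block; return the evicted (block_id, data) pair, or None."""
        if self.capacity == 0:
            return None  # a zero-size cache stores nothing
        self.blocks[block_id] = data
        self.blocks.move_to_end(block_id)
        if len(self.blocks) > self.capacity:
            # Evict the least recently used block.
            return self.blocks.popitem(last=False)
        return None

    def invalidate(self, block_id):
        """Drop a block if present, as write-invalidation would on a remote write."""
        self.blocks.pop(block_id, None)
```

Returning the evicted block from `insert` is a design choice of this sketch: it lets a caller hand the victim to another cache, which is the kind of movement the cooperative policies below rely on.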
The cache management policies implemented by PFS-SIM are:
- Base caching: provides local cache at both ionodes and cnodes, but does not
utilize any of the cooperative caching techniques. Reads involve checking
the local cnode cache and then the cache of the ionode(s) responsible for
managing the required data. If the data is not in either of these caches,
it is retrieved from disk. Writes are somewhat more complex and depend on
the exact write policies chosen, but operate much the same way.
Write-invalidation causes the invalidation of any remote cnode cache blocks
which contain data just written.
- Greedy forwarding: adds the retrieval of data from remote cnode
caches. Whenever an ionode receives a request for data which it is not
currently caching, it checks to see if any cnode in the system is caching
the data. If so, the request is forwarded to that cnode; otherwise the data
is read from the disk.
- Centrally coordinated caching: typically used in addition to greedy
forwarding, centrally coordinated caching attempts to improve the global
cache hit rate of the system by coordinating the contents of the cnode
caches. A specified portion of each cnode cache is collectively managed by
the ionodes, with the remaining portion still managed locally by the cnode.
Whenever a cache block is evicted from an ionode cache, it is sent to that
ionode's portion of the centrally coordinated cache. This is very similar
to physically moving some of the cnode cache to each of the ionodes. The
penalty for this is a reduced hit rate at each cnode's local cache.
- Optimized globally managed caching: For the network environment in [DWAP94],
the tradeoff of a reduced local hit rate for a higher global hit rate
resulted in peak cache performance occurring when 80% of each cnode's cache
was coordinated. The fact that retrieving data from a remote cnode in a
parallel file system is much less expensive than retrieving data from a
remote client in a network file system led us to believe that performance
would continue to improve until 100% of all cache was coordinated. This
would remove any data redundancy within the caches and eliminate the need
for expensive cache coherency protocols. In an attempt to minimize the
effect of coordination on local cache hit rates, blocks read from the disk
are first placed in the cache of the cnode which issued the read. The
ionode cache is used to house evicted cnode cache blocks instead of
the reverse.
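The read paths of the first two policies can be sketched as follows. This is a simplified illustration under stated assumptions, not PFS-SIM code: the function names `read_block` and `invalidate_remote_copies` are hypothetical, caches are plain dictionaries, there is a single ionode, a miss does not populate any cache, and the ionode finds remote copies by scanning every cnode cache rather than consulting a directory.

```python
def read_block(block_id, requester, cnode_caches, ionode_cache, disk, greedy=False):
    """Return (data, source) for a read issued by cnode `requester`.

    cnode_caches: dict mapping cnode id -> dict of block id -> data
    ionode_cache: dict of block id -> data
    disk:         dict of block id -> data (stands in for the disk drives)
    greedy:       if True, apply greedy forwarding on an ionode cache miss
    """
    # 1. Check the requesting cnode's local cache.
    local = cnode_caches[requester]
    if block_id in local:
        return local[block_id], "local cnode cache"

    # 2. Check the cache of the ionode that manages the block.
    if block_id in ionode_cache:
        return ionode_cache[block_id], "ionode cache"

    # 3. Greedy forwarding only: forward the request to a cnode holding the block.
    if greedy:
        for cnode, cache in cnode_caches.items():
            if cnode != requester and block_id in cache:
                return cache[block_id], f"remote cnode {cnode}"

    # 4. Fall back to the disk.
    return disk[block_id], "disk"


def invalidate_remote_copies(block_id, writer, cnode_caches):
    """Write-invalidation: drop the block from every cnode cache except the writer's."""
    for cnode, cache in cnode_caches.items():
        if cnode != writer:
            cache.pop(block_id, None)
```

With base caching (`greedy=False`) a block held only by another cnode is read from disk; with greedy forwarding the same request is served from that cnode's cache instead.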
Andy Kahn
Tue Jun 24 17:48:10 PDT 1997