Version 4 (modified by denton, 18 months ago)

--

High Performance User Interface

PVFS provides two different user interfaces:

  • A Linux kernel interface
  • A direct interface via pvfs_lib

The kernel interface is the easiest to use because it makes a PVFS file system appear like most other file systems and allows most programs to use it directly. This is the interface employed by most users. Unfortunately, the kernel interface offers considerably lower performance and functionality compared to the direct interface. The Linux file system infrastructure is based on the POSIX standard, which in turn was developed for local disk file systems. The PVFS system interface provides "direct" access to the PVFS server, gives the best performance, and is the most reliable. The problem with the system interface is that the only way for a user to access it directly is through MPI-IO. This is fine for MPI users, but not so convenient for others.

The OrangeFS project is developing a multi-layer user interface that allows programs to link directly to the system interface. There is a common IO layer which is used to implement the higher layers, a new MPI-IO interface designed specifically for PVFS, a POSIX-like system call layer (open, close, read, write), and a C stdio library layer (fopen, fclose, fread, fwrite). These layers are designed to work together and provide the common functions used by applications and high-level application libraries. They can be linked directly with an application, or preloaded as a shared library so that existing applications can be run without recompiling. Best of all, there are extensions to these interfaces that allow high performance applications to control their IO more directly and make better use of the underlying file system.
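
As an illustration of the kind of code the POSIX-like layer is meant to serve unchanged, here is a minimal sketch of a file copy written only with open, read, write, and close. Nothing in it is specific to OrangeFS: the function name copy_file is made up for this example, and the assumption is simply that, with the layers linked in or preloaded, these same calls would be directed to PVFS for files that live there.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy src to dst using only POSIX IO calls.
 * Returns the number of bytes copied, or -1 on error.
 * copy_file is a hypothetical helper for this example. */
long copy_file(const char *src, const char *dst)
{
    assert(src != NULL && dst != NULL);

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;

    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    char buf[65536];
    long total = 0;
    ssize_t n;

    while ((n = read(in, buf, sizeof buf)) > 0) {
        /* write may transfer fewer bytes than requested, so loop. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(in);
                close(out);
                return -1;
            }
            off += w;
        }
        total += n;
    }

    close(in);
    /* n < 0 means the last read failed; close can also report errors. */
    if (close(out) != 0 || n < 0)
        return -1;
    return total;
}
```

Because the program is expressed entirely in standard calls, it does not need to be rewritten to move from the kernel interface to a user-level one; only the way the calls are resolved changes.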

Examples of such extended features include:

  • Buffering (exists in stdio, but can be adjusted more easily)
  • User-level caching (more aggressive than buffering and more features)
  • Run-time monitoring - the ability for the program to monitor traffic on servers and adjust its behavior accordingly
  • Distribution - the ability to manage how data is distributed in the file system
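
For comparison, this is roughly how buffering is adjusted with plain C stdio today, using the standard setvbuf call. It is only a sketch of the baseline that the first item refers to, not of the OrangeFS extension itself; the function name write_records and its parameters are invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write `count` copies of `record` to `path` through a stdio stream whose
 * buffer size is chosen by the caller. Returns the number of bytes written,
 * or -1 on error. write_records is a hypothetical helper for this example. */
long write_records(const char *path, size_t buf_size,
                   const char *record, size_t count)
{
    assert(path != NULL && record != NULL);
    if (buf_size == 0)
        return -1;

    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    /* setvbuf must be called after fopen and before any other operation
     * on the stream. _IOFBF requests full buffering. */
    char *buf = malloc(buf_size);
    if (buf == NULL || setvbuf(fp, buf, _IOFBF, buf_size) != 0) {
        fclose(fp);
        free(buf);
        return -1;
    }

    size_t len = strlen(record);
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        if (fwrite(record, 1, len, fp) != len) {
            total = -1;
            break;
        }
        total += (long)len;
    }

    /* fclose flushes the stream, so the buffer must stay valid until then. */
    if (fclose(fp) != 0)
        total = -1;
    free(buf);
    return total;
}
```

A call such as write_records(path, 1 << 20, "sample\n", 1000) would ask for a 1 MiB buffer. The buffer size has to be fixed before the first read or write on the stream, which is one reason an interface that makes this easier to adjust could be useful.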

This new interface would not replace the existing kernel-level interface for all users, but would supplement it, providing a wider range of choices. Projects are also under way to improve the kernel-level interface, focusing on the experimental FUSE interface.

Developments

This code has been added to the OrangeFS source tree under src/client/usrint and currently consists of the following files:

  • stdio.c/stdio.h - implementation of stdio library functions primarily to add special features and bind to the lower-level libraries
  • posix.c/posix.h - implements wrappers for all of the POSIX IO system calls - these wrappers use a method table to select the proper implementation to call (either PVFS or not)
  • posix-pvfs.c/posix-pvfs.h - implements all of the POSIX IO system calls using a descriptor table in user space and directs all IO through PVFS
  • filetable-util.c - implements user space file descriptor and stream functions
  • iocommon.c - implements IO operations using PVFS sysint calls - supports both POSIX and MPI-IO
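
The method-table idea mentioned for posix.c can be sketched as follows. This is not the code in src/client/usrint: every name here (struct io_methods, select_methods, demo_open, DEMO_PVFS_MOUNT) is hypothetical, the two open functions are stubs that return fixed values, and choosing the table by path prefix is an assumption made to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical method table: one function pointer per wrapped call.
 * Only open is shown to keep the sketch short. */
struct io_methods {
    const char *name;
    int (*open_fn)(const char *path);
};

/* Stubs standing in for the two real implementations. A real PVFS method
 * would go through the system interface; a real fallback would call the
 * C library or kernel. The return values are arbitrary markers. */
static int pvfs_open_stub(const char *path) { (void)path; return 100; }
static int libc_open_stub(const char *path) { (void)path; return 3; }

static const struct io_methods pvfs_methods = { "pvfs", pvfs_open_stub };
static const struct io_methods libc_methods = { "libc", libc_open_stub };

/* Assumed mount point, chosen only for this example. */
#define DEMO_PVFS_MOUNT "/mnt/pvfs"

/* Pick the table for a path: PVFS if the path is the mount point or lies
 * beneath it, otherwise the fallback. */
const struct io_methods *select_methods(const char *path)
{
    assert(path != NULL);
    size_t n = strlen(DEMO_PVFS_MOUNT);
    if (strncmp(path, DEMO_PVFS_MOUNT, n) == 0 &&
        (path[n] == '/' || path[n] == '\0'))
        return &pvfs_methods;
    return &libc_methods;
}

/* The wrapper itself: look up the table, then dispatch through it. */
int demo_open(const char *path)
{
    return select_methods(path)->open_fn(path);
}
```

With this layout the wrapper stays the same no matter how many implementations exist; supporting another backend means adding a table, not changing every call site.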

In addition, there is an MPI-IO implementation, built on top of the iocommon.c routines, that can be used with either MPICH or OpenMPI. It has not yet been added to the source tree.
