Improved fsck and related tools

FSCK

  • fsck currently takes too long on a large FS
    • fsck only certain subdirs
    • parallelize fsck?
    • continuous fsck - check only (no repairs) while the FS is live, and report potential problems
  • fsck needs to handle new issues
    • replicated data
  • sysop tools for difficult FS repairs - some are available from the command line, but better tools are needed
  • Scrubbing - allow objects to become orphaned and remove them later. This is one solution to the "delete while in use" problem, and it can also be used to remove old copies. It presupposes that valid data is never left orphaned, so any orphaned object is safe to delete.
  • Lazy quota checking - add up quota data while scanning the FS - probably per-directory quotas rather than per-user quotas.
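
The scrubbing and lazy quota items could share a single scan. The following is a minimal sketch of that idea on a toy in-memory model, not OrangeFS code: the `Obj` class, the integer handles, and the `scan` function are all hypothetical names made up for illustration. One walk from the root adds up bytes per directory and marks every handle it reaches; whatever is left unmarked is an orphan candidate for the scrubber.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """Toy stand-in for a stored object (hypothetical, not an OrangeFS type)."""
    size: int = 0                                  # bytes of data; 0 for directories
    children: dict = field(default_factory=dict)   # name -> handle (directories only)
    is_dir: bool = False


def scan(store, root):
    """Walk the namespace from `root` once.

    Returns (usage, orphans): `usage` maps each directory handle to the
    total bytes stored beneath it, and `orphans` is the set of handles in
    `store` that no directory entry reaches.
    """
    usage = {}
    reachable = set()

    def visit(handle):
        reachable.add(handle)
        obj = store[handle]
        if not obj.is_dir:
            return obj.size
        total = 0
        for child in obj.children.values():
            # Skip dangling entries and handles already counted, so an
            # object linked from two places is only charged once.
            if child in store and child not in reachable:
                total += visit(child)
        usage[handle] = total
        return total

    visit(root)
    return usage, set(store) - reachable


# Example: handle 5 is not referenced by any directory.
store = {
    1: Obj(is_dir=True, children={"a": 2, "f": 3}),
    2: Obj(is_dir=True, children={"g": 4}),
    3: Obj(size=100),
    4: Obj(size=50),
    5: Obj(size=999),
}
usage, orphans = scan(store, 1)
print(usage)    # {2: 50, 1: 150}
print(orphans)  # {5}
```

On a live file system an object created after the walk began would look like an orphan here, so a real scrubber would need some guard (for example, only removing objects older than the scan start). That guard is an assumption of this sketch and is not something the notes above specify.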

Ideas:

  • Use a web-based interface to manage and schedule the processes that do fsck, scrubbing, performance data gathering, scheduled migration or replication, etc.
  • Processes can run on the management station and access the file system through the sysint and mgmt interfaces (BMI)
  • Alternatively (2.9+), a pair of mgmt requests could let a server fork one of these processes locally and let the requester retrieve output datasets from a known directory
  • These remote processes can access one or several servers (depending on configuration; how to configure this is still an open question)
  • The server should manage these child processes: keep a list of running PIDs, limit how many can start at once, and kill them when necessary.
  • Additional processes run on the management station to perform global data referencing
  • Need to manage output datasets on server - when to delete them, rotate them, etc. May require additional requests (delete, list, etc.)
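
The child process bookkeeping described above can be sketched roughly as follows. This is only an illustration in Python using the standard `subprocess` module; the `ChildManager` class, its method names, and the `max_children` limit are hypothetical and do not correspond to any existing OrangeFS server code or mgmt request.

```python
import subprocess


class ChildManager:
    """Tracks child processes started on behalf of management requests.

    Hypothetical sketch: keeps a table of running PIDs, refuses to start
    more than `max_children` at once, and can kill a child on demand.
    """

    def __init__(self, max_children=4):
        self.max_children = max_children
        self.children = {}  # pid -> subprocess.Popen

    def reap(self):
        """Drop children that have exited; return {pid: exit_code}."""
        finished = {}
        for pid, proc in list(self.children.items()):
            code = proc.poll()
            if code is not None:
                finished[pid] = code
                del self.children[pid]
        return finished

    def start(self, argv, output_path):
        """Start `argv`, sending its output to `output_path`; return the PID."""
        self.reap()
        if len(self.children) >= self.max_children:
            raise RuntimeError("too many management processes running")
        with open(output_path, "wb") as out:
            # The child inherits the file descriptor, so the parent's copy
            # can be closed as soon as Popen returns.
            proc = subprocess.Popen(argv, stdout=out, stderr=subprocess.STDOUT)
        self.children[proc.pid] = proc
        return proc.pid

    def kill(self, pid, timeout=5.0):
        """Ask a child to terminate, then force it if it does not exit."""
        proc = self.children.pop(pid)
        proc.terminate()
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
        return proc.returncode
```

In this sketch each child writes its output dataset to a path chosen by the caller, which stands in for the "known directory" idea. Listing, rotating, and deleting those files would be separate operations layered on top.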

Other Mgmt Tools

What else do we need?
