Version 7 (modified by walt, 16 months ago)

--

Protocol Definition Language

This project includes and subsumes this older one: Improved encode/decode routines

This project is to find and utilize, or develop, a protocol definition language or system to simplify the definition of the PVFS protocol. Currently the protocol definition starts with a set of C structs that define the arguments passed in with each request and back with each response, plus the overall structs that include these. Next, each struct must have encode and decode functions defined using a set of macros, and in some cases a new macro must be defined. Additional macros are created to fill in the structs and to define constants for declaring and checking the size of variable-length elements. These various parts are defined in several header files. It is easy for a change to miss one or more of these parts, leaving the protocol potentially corruptible.

Another issue is backwards compatibility when the protocol is updated. Currently a protocol major and minor number are manually maintained; the convention is that a client with an older minor number should be compatible with a server using a newer minor number. Major number mismatches are assumed to be incompatible. There is no mechanism to tie these numbers to specific protocol changes or to allow a broader range of compatibility.
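The current convention can be written down as a small check. This is only a sketch of the rule as stated above; the struct and function names are hypothetical and are not taken from the PVFS sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type for illustration only. */
struct proto_version {
    int32_t major_num;
    int32_t minor_num;
};

/*
 * Returns true when a client speaking `client` may talk to a server
 * speaking `server`: the major numbers must match, and the client's
 * minor number must not be newer than the server's.
 */
bool proto_version_compatible(struct proto_version client,
                              struct proto_version server)
{
    if (client.major_num != server.major_num)
        return false;
    return client.minor_num <= server.minor_num;
}
```

Note that nothing in this check says which fields changed between two minor numbers, which is the gap the per-field version annotations proposed below are meant to fill.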

Examples of similar systems to investigate are Google Protocol Buffers and rpcgen. While these are good examples, it is not likely they will be directly applicable to PVFS. Google Protocol Buffers are for use with C++, Java, and Python, while PVFS is written in C. Rpcgen uses XDR as an encoding, which is very similar to what is used in PVFS, but not identical.

What we would like is to develop a simple C-like macro language that can be compiled into C code and then compiled along with the client and server code. The protocol specification would largely replace the code in src/proto/pvfs2-req-proto.h and would define each request and its response with struct-like syntax, including the request number definition, the request and response arguments, array arguments (along with the valid number of items in the array), and version specifications indicating when a field was added in a specific protocol version. The compiler would automatically generate encode and decode functions for each request or response, including space management (allocation and freeing), reordering or insertion of padding to ensure proper 64-bit alignment, checks for invalid array counts and overflows, macros for filling in structs, code for handling different versions of the protocol, code for calculating the overall size of request and response buffers, etc.

An example of code in this macro language might appear as follows:

protocol {                   /* start of protocol */
   version 4.3 min 4.2;      /* this is version 4.3; the minimum acceptable version number is 4.2.  specifications below 4.2 are ignored, above 4.2 generate code for back-compatibility, above 4.3 are an error */
   
   header {                  /* these fields are in all of the requests/responses */
      int32t request_num;
   } response {
      int32t errno;   
   };

   request my_request (14) { /* text name of request and number */
      int32t a;
      int64t b;              /* these will probably be reordered, putting all 64s first, all 32's next and so on, and finally pads if needed */
      PVFS_offset c array 20;/* maximum of 20 of these */
   } response {
      int32t d;
      int32t e version 4.3;  /* this was added in version 4.3; if a server receives a v4.2 request it won't send this back */
   };

   request my_other_request (15) {
      int64t f;              /* this is assumed to be version 4.2 or before */
      int64t g version 4.0;  /* this version number is ignored because it is < 4.2 */
      PVFS_handle h;         /* this may be a struct that isn't a request, but is used elsewhere in the code.  maybe defined here or not - need a way to handle */
   };                        /* this request has no arguments in the response */

};                           /* end of protocol */
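To make the goal concrete, here is a rough sketch of the kind of C the compiler might emit for the body of my_request above. Everything in it is an assumption made for illustration: the function names, the little-endian wire layout, the placement of the array count, and the PVFS_offset typedef are not taken from the PVFS sources, and the common header fields are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MY_REQUEST_C_MAX 20       /* from "array 20" in the specification */
#define MY_REQUEST_FIXED_SIZE 16  /* b (8) + a (4) + c_count (4) */

typedef int64_t PVFS_offset;      /* assumption: stand-in for the real typedef */

/* Fields reordered so that the 64-bit members come first. */
struct my_request {
    int64_t      b;
    PVFS_offset *c;               /* array declared as pointer plus count */
    int32_t      a;
    uint32_t     c_count;
};

/* Store the low n bytes of v at p, least significant byte first. */
static void put_le(unsigned char *p, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

/* Read n bytes at p as a little-endian unsigned value. */
static uint64_t get_le(const unsigned char *p, size_t n)
{
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

size_t my_request_encoded_size(const struct my_request *r)
{
    return MY_REQUEST_FIXED_SIZE + 8 * (size_t)r->c_count;
}

/* Returns the number of bytes written, or -1 on a bad count or short buffer. */
long my_request_encode(const struct my_request *r, unsigned char *buf, size_t len)
{
    if (r->c_count > MY_REQUEST_C_MAX || len < my_request_encoded_size(r))
        return -1;
    put_le(buf, (uint64_t)r->b, 8);
    put_le(buf + 8, (uint32_t)r->a, 4);
    put_le(buf + 12, r->c_count, 4);
    for (uint32_t i = 0; i < r->c_count; i++)
        put_le(buf + MY_REQUEST_FIXED_SIZE + 8 * (size_t)i, (uint64_t)r->c[i], 8);
    return (long)my_request_encoded_size(r);
}

/* Returns 0 on success, -1 on a truncated buffer, bad count, or failed allocation. */
int my_request_decode(struct my_request *r, const unsigned char *buf, size_t len)
{
    if (len < MY_REQUEST_FIXED_SIZE)
        return -1;
    uint32_t count = (uint32_t)get_le(buf + 12, 4);
    if (count > MY_REQUEST_C_MAX || len < MY_REQUEST_FIXED_SIZE + 8 * (size_t)count)
        return -1;
    r->b = (int64_t)get_le(buf, 8);
    r->a = (int32_t)get_le(buf + 8, 4);
    r->c_count = count;
    r->c = NULL;
    if (count > 0) {
        r->c = malloc(count * sizeof *r->c);
        if (r->c == NULL)
            return -1;
    }
    for (uint32_t i = 0; i < count; i++)
        r->c[i] = (PVFS_offset)get_le(buf + MY_REQUEST_FIXED_SIZE + 8 * (size_t)i, 8);
    return 0;
}

/* Release the array allocated by my_request_decode. */
void my_request_free(struct my_request *r)
{
    free(r->c);
    r->c = NULL;
    r->c_count = 0;
}
```

The point of the sketch is that the size calculation, the count check against the declared maximum, and the allocation and freeing of the array all follow mechanically from the one declaration, so a change to the specification cannot leave one of them out of date.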

A likely tricky issue is struct types that are used in the code proper and also in the protocol. Currently we define the encode and decode functions for these using the same macros. Almost certainly we want to automate that. We can do this in this same file with an appropriate construct, or assume that they are defined in a different file (and thus the definitions must be included here), or potentially even define them in regular .h files and assume they are all passed through the macro pre-processor to generate the needed code. For example, a struct definition might look as follows:

protocol {
   struct some_struct {
      int32t a;
      int32t b array 10;
   };
};

The macro compiler can compile multiple protocol sections with C code before, after, and in between them. Struct definitions can exist in the main protocol definition or in other files. The array notation is needed for several reasons, including that the maximum size is needed and that arrays are generally declared with a pointer and a size. The protocol compiler can generate macros for assigning values to fields, especially arrays. These declarations would be used to generate both the regular struct definitions and the definitions needed for protocol processing, such as encode and decode functions.
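As an illustration, the some_struct declaration above might expand into something like the following. This is a sketch under assumptions: the macro and function names are hypothetical, and the encoded layout (a, then the item count, then the items, padded to a multiple of 8 bytes) is just one plausible choice.

```c
#include <stddef.h>
#include <stdint.h>

#define SOME_STRUCT_B_MAX 10   /* from "array 10" in the declaration */

/* Regular struct definition: the array becomes a pointer plus a count. */
struct some_struct {
    int32_t  a;
    uint32_t b_count;
    int32_t *b;
};

/*
 * Hypothetical generated fill macro for the array field. Evaluates to 0
 * after assigning the pointer and count, or to -1 (leaving the struct
 * untouched) when the count exceeds the declared maximum. Negative
 * counts are rejected because they convert to very large size_t values.
 */
#define SOME_STRUCT_FILL_B(s, ptr, n)                         \
    ((size_t)(n) > SOME_STRUCT_B_MAX                          \
         ? -1                                                 \
         : ((s)->b = (ptr), (s)->b_count = (uint32_t)(n), 0))

/* Encoded size under the assumed layout, rounded up for 64-bit alignment. */
size_t some_struct_encoded_size(const struct some_struct *s)
{
    size_t raw = 4 + 4 + 4 * (size_t)s->b_count;
    return (raw + 7) & ~(size_t)7;
}
```

A caller would write SOME_STRUCT_FILL_B(&s, values, count) and check the result, rather than assigning the pointer and count by hand and risking a count beyond the declared maximum.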
