root/branches/cu-security-branch/PVFS2-GLOBAL-TODO.txt @ 8330

Revision 8330, 6.9 KB (checked in by nlmills, 3 years ago)

revert cu-security-branch to before the attempted merge with Orange-Branch

####################################################################
# TODO list for pvfs2 project as a whole
#
#

NOTE: Some (dated) status information can be found in doc/pvfs2-status.tex

improving robustness of I/O APIs:
====================================================================
- our internal APIs should be able to handle the following cases:
  a) operations posted before initialize() should return an error
  b) operations posted after finalize() has started should return an error
  c) finalize() should gracefully terminate pending operations, although
     those operations will have undefined results
- these APIs in particular need updating in that regard:
  - dbpf-attr-cache DONE
  - trove
  - bmi
  - flow
  - job
  - request scheduler
  - device interface
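
A minimal sketch of cases a) through c), assuming a single-threaded
component. struct comp and the comp_*() functions are made-up names for
illustration, not existing PVFS2 symbols, and a real version would need
locking around the state checks:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical lifecycle states for one component (not PVFS2 names). */
enum comp_state { COMP_UNINITIALIZED, COMP_RUNNING, COMP_FINALIZING };

struct comp {
    enum comp_state state;
    int pending;            /* operations posted but not yet completed */
};

int comp_initialize(struct comp *c)
{
    if (c->state != COMP_UNINITIALIZED)
        return -EINVAL;     /* double initialize */
    c->pending = 0;
    c->state = COMP_RUNNING;
    return 0;
}

/* Cases a) and b): refuse to post outside the running window. */
int comp_post(struct comp *c)
{
    if (c->state != COMP_RUNNING)
        return -EINVAL;
    assert(c->pending >= 0);    /* internal invariant only */
    c->pending++;
    return 0;
}

/* Case c): stop accepting work first, then drop whatever is pending.
 * Returns the number of operations that were cancelled, or -EINVAL
 * if the component was not running. */
int comp_finalize(struct comp *c)
{
    int cancelled;

    if (c->state != COMP_RUNNING)
        return -EINVAL;
    c->state = COMP_FINALIZING; /* new posts now fail */
    cancelled = c->pending;
    c->pending = 0;
    c->state = COMP_UNINITIALIZED;
    return cancelled;
}
```

The point of the intermediate COMP_FINALIZING state is that a post
arriving while finalize() is still draining sees an error rather than
racing with teardown.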

server operations:
====================================================================
- not started:
  - eattrib (set/get)

- unfinished:
  - general error handling
  - performance monitoring (need more metrics)

general server functionality:
====================================================================
- attributes (permissions, etc.) on datafiles
- finishing file system semantics documentation
- don't forget to define semantics for access times

request scheduler:
====================================================================
- more generic implementation
- smarter concurrency rules

system interface functionality:
====================================================================
- not started:
  - eattrib (set/get)?

- unfinished:
  - thread safety
  - way to pass in consistency semantics (timeout values, etc.)

- define how configuration info should be passed in
  (how to do paths, fstab, url stuff, whatever)
- define how to pass in distribution and number of datafiles for
  cases in which the caller wants to override the defaults
- add nonblocking api for some functions
- clean up API (in particular fstab parsing / initialize path, and removal of
  deprecated terminology)
- make input pointer arguments to system interface be declared const
- make sure that system interface functions return an error, rather than
  asserting, if the caller tries to operate on a bogus handle (one case occurs
  in assertions following PINT_bucket_map_to_server())
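
A sketch of the "return an error instead of asserting" item. Everything
here is hypothetical: handle_t, struct handle_range and
map_handle_to_server() are stand-ins, not the PVFS2 types or the real
PINT_bucket_map_to_server() signature. It also shows the const input
pointer from the item before it:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t handle_t;      /* hypothetical handle type */

struct handle_range {
    handle_t first;
    handle_t last;              /* inclusive */
    int server_id;
};

/* Map a handle to the server that owns it.
 * Returns 0 and fills *server_id on success, -EINVAL for bad
 * arguments, -ENOENT when no configured range covers the handle.
 * *server_id is left untouched on failure. */
int map_handle_to_server(const struct handle_range *ranges, size_t count,
                         handle_t handle, int *server_id)
{
    size_t i;

    if (ranges == NULL || server_id == NULL)
        return -EINVAL;

    for (i = 0; i < count; i++) {
        /* Our own table, so an assert on its consistency is fine. */
        assert(ranges[i].first <= ranges[i].last);
        if (handle >= ranges[i].first && handle <= ranges[i].last) {
            *server_id = ranges[i].server_id;
            return 0;
        }
    }

    /* Caller-supplied bogus handle: report it, do not assert. */
    return -ENOENT;
}
```

The split is that assert() guards internal invariants only, while
anything derived from caller input goes through an error return that
still works when assertions are compiled out.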

kernel/vfs interface
====================================================================

performance tuning:
====================================================================
- instrumenting
- steal what we can from mpich2
- architecture specific locking, etc.
- thread tuning
- memory allocation cache
- do some benchmarking of thread context switches to help decide
  how trove/job/flow interfaces should interact
- figure out how to make i/o faster

request encoding:
====================================================================
- come up with a mechanism for handling requests that go beyond
  the BMI defined limit for unexpected messages (mainly an issue
  on read/write with complex datatypes, but also potentially a
  problem on setattr)

error codes:
====================================================================
- converting to new error code format (everywhere)
- documenting valid error codes from functions

I/O path:
====================================================================
- buffer cache on top of trove
- clean up buffer management in BMI to be more useful for I/O buffer
  cache, maybe push to a separate component
- optimizing small reads and writes (packing data into req/ack messages)
- native GM flowprotocol
- general optimizations (lock granularity, immediate completion, etc.)
- ability to unpost, correct use of timeouts, preposting operations
- semantics of short read and write operations
- bmi_tcp scalability and robustness
- ability to toggle synch behavior in trove
- use better buffer size in default flow protocol
- bmi shmem implementation
- many items in BMI and flow TODO files
- ability to compile out device support, or at least prevent device thread
  from spawning if not used
- ability to fail over with multiple bmi transports

correctness/performance testing
====================================================================
- a comprehensive test suite of the system interface API
- more pts tests
- profiling code paths
- eliminate memory leaks
- handle server or client failures in a reasonable way (log and exit instead
  of segfault, perhaps)

system management utilities
====================================================================
- pvfs2-fsck (serial tool done, evolve into parallel tool)
- decide what we want/need here?
  - health monitoring
  - system recovery
  - system statistics (raid stat, mem used, etc.)
  - etc.
- performance monitoring:
  - more metrics
  - more viz tools
- end user documentation
- better logging systems
- maybe make pvfs2-ping compute a cksum on the fs.conf from all
  servers and issue a warning if they don't all match?
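
A rough sketch of the fs.conf comparison idea. It assumes each
server's config text has already been fetched into memory; how
pvfs2-ping would fetch it is not covered. conf_digest() and
first_mismatch() are made-up names, and 32-bit FNV-1a is only a
stand-in for whatever checksum ends up being chosen:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a buffer (stand-in digest, see note above). */
uint32_t conf_digest(const unsigned char *buf, size_t len)
{
    uint32_t h = 2166136261u;   /* FNV offset basis */
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 16777619u;         /* FNV prime */
    }
    return h;
}

/* Compare every server's digest against server 0.
 * Returns the index of the first server that differs, or -1 if all
 * digests match (including the cases of zero or one server). */
int first_mismatch(const uint32_t *digests, size_t count)
{
    size_t i;

    for (i = 1; i < count; i++) {
        if (digests[i] != digests[0])
            return (int)i;
    }
    return -1;
}
```

A caller would fill one digest per server and print a warning naming
the server when first_mismatch() returns something other than -1.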

documentation:
====================================================================
- come up with an automated way to document the wire packet format
  - also document headers that bmi tacks on, at least for bmi_tcp
- update the coding guidelines
- document config file options
- automate faq publishing
- mechanism for exporting to html
- update all design docs!
- review

code cleanup:
====================================================================
- remove some of the stuff from the test subdir for "make dist" target
  - in particular, test/common (partial), test/io, test/proto, test/server
- put in header file wrappers to make them work with c++
- audit code to make sure that all error paths are handled when
  assertions are turned off
- maybe make a checklist for each pvfs2 component to use as we clean
  up each section of the code?  (items to check for each component
  could include stuff like symbol names, PVFS_error code usage,
  proper error handling when assertions are off, etc.)
- consistent formatting
- consistent function naming
- consistent header file inclusion
- come up with more named values like TROVE_HANDLE_NULL to use in
  other parts of the code
- try to clean up flow / I/O path some, in particular so we don't have
  to do so much mallocing to set up from client side
  - maybe do things like embed file_data struct in flow desc.
- make permission checking in prelude.sm neater, maybe assert on
  unknown op types so we don't forget to add new ones here
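
For the c++ header wrapper item, the usual idiom is an extern "C"
guard around the declarations. A minimal sketch; EXAMPLE_API_H and
example_sum() are made-up names, not PVFS2 headers or functions, and
the definition is only included so the fragment compiles on its own:

```c
/* --- would live in a header, e.g. a hypothetical example_api.h --- */
#ifndef EXAMPLE_API_H
#define EXAMPLE_API_H

#ifdef __cplusplus
extern "C" {            /* give the declarations C linkage under C++ */
#endif

int example_sum(int a, int b);

#ifdef __cplusplus
}                       /* closes extern "C" */
#endif

#endif /* EXAMPLE_API_H */

/* --- would live in the matching .c file --- */
int example_sum(int a, int b)
{
    return a + b;
}
```

A C compiler never defines __cplusplus, so it sees only the plain
declarations, while a C++ compiler including the same header gets
unmangled symbol names that link against the C objects.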

fault tolerance:
=====================================================================
- what does the API look like
- data redundancy
- failover

testing:
=====================================================================
- run common test programs and benchmarks, like:
  - flash
  - iozone
  - dbench
  - ior
  - bonnie
  - make kernel
  - mpiiotest
  - John May's tests?
  - piobench
- more pts tests
- more datatype testing
  - remember example of ub < lb

rob's random list:
=====================================================================
- do something about the weird PINT_sys_wait and PINT_mgmt_wait macros in
  client-state-machine.h