Collection Management



The NSE will grow faster and contain more useful information if application teams have some degree of ownership of the software repositories and are motivated to contribute to them. A development team such as the group that produced this NSE prototype has neither the manpower nor the expertise to develop, catalog, and maintain high-quality software and documents in all the application areas. Although funding could conceivably be found to increase the manpower of the development group, the scientists closest to the different application areas are in the best position to produce useful software and documents. On the other hand, these scientists cannot be expected to document, refine, and thoroughly debug experimental software until it is of high enough quality to be generally useful and shareable, because they neither have software development expertise nor are funded to do software development. Nor can they be expected to learn all the details of various classification schemes in order to catalog their contributions accurately. Thus, additional funding would be best spent on software development and cataloging experts who can work closely in teams with application scientists, either as regular staff or, in the case of smaller institutions or short-term projects, on a consulting basis.

Quality control will be needed, both for the software and documents themselves, and for the descriptive information and the cataloging and indexing process. Control is needed for software so that the user can have confidence in software obtained from the NSE, and for cataloging so that users can accurately evaluate the results of their searches. In its plan for the NSE, the Center for Research on Parallel Computation (CRPC) has proposed a review process for software similar to that used for articles submitted to a scholarly journal. The CRPC would provide or recruit area editors for the different software domains, such as linear algebra, numerical simulation, parallel compiler technologies, and visualization. A rating scheme for software quality would be defined, based on criteria such as stability, robustness, documentation quality, level of support, and transportability.
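As an illustration only, a rating scheme of this kind might be recorded as follows. The five criteria are those listed above; the 0 to 5 scale, the equal weighting, and the names `Review` and `summarize` are assumptions made for this sketch and are not part of the CRPC proposal.

```python
from dataclasses import dataclass
from statistics import mean

# Criteria named in the text. The 0-5 scale is an assumption of this sketch.
CRITERIA = ("stability", "robustness", "documentation",
            "support", "transportability")
MAX_SCORE = 5


@dataclass(frozen=True)
class Review:
    """One reviewer's scores for one software package (hypothetical record)."""
    reviewer: str
    scores: dict  # criterion name -> integer score in 0..MAX_SCORE

    def __post_init__(self):
        # Reject reviews that skip a criterion or add an unknown one.
        if set(self.scores) != set(CRITERIA):
            raise ValueError("a review must score exactly the agreed criteria")
        if not all(0 <= s <= MAX_SCORE for s in self.scores.values()):
            raise ValueError("scores must lie between 0 and MAX_SCORE")


def summarize(reviews):
    """Average each criterion over all reviews, then average the criteria.

    Equal weighting is an assumption; an area editor might weight
    criteria differently. Raises StatisticsError if `reviews` is empty.
    """
    per_criterion = {c: mean(r.scores[c] for r in reviews) for c in CRITERIA}
    return per_criterion, mean(per_criterion.values())
```

Keeping the per-criterion averages alongside the overall figure lets a user see, for example, that a package is stable but poorly documented, which a single number would hide.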

Although minimum submission requirements and quality control are essential, the software submission process must not place so heavy a burden on application developers that it discourages contributions.

Development of a comprehensive collection of Internet resources for the NSE will require the following:

  1. A clear statement of the scope, purpose, and goals of the NSE.
  2. Division of the overall scope into categories and sub-categories.
  3. Assignment of categories and sub-categories to individuals who are aware of current developments in Internet resources in their respective areas.
  4. Feedback from users and domain experts on the significance and value of resources.
  5. Identification of focused, specialized, and well-maintained collections of information to serve as destinations of links.
  6. Recognition for publishing in a moderated collection such as the NSE so that contributors and reviewers will find it worthwhile to put effort into improving the quality of resources.

The current representation of the NSE as an interlinked sub-web of HTML pages has a number of drawbacks, including maintenance difficulty and fixed organization and format. To overcome these difficulties, the NSE contents should be reorganized as a display-independent database, with each entry in the database assigned one or more values from a faceted classification scheme (e.g., type of resource, subject area). The user should be able to select a default, pre-generated organization and format, or to generate a new organization on the fly. Portions of the database should be distributed and replicated at various sites. Distribution is necessary so that domain experts can maintain their own portions of the database locally. Replication is necessary to achieve high availability and scalability. With distribution, the page the user sees might be constructed by querying servers at different sites for the current state of portions of the database. Location-independent naming of files (see section 6.1) will simplify the management of the distributed NSE database.
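The idea of a display-independent database with faceted classification can be sketched as follows. This is a minimal in-memory illustration, not a description of any existing NSE implementation: the entry titles, the facet names `type` and `subject`, and the functions `select` and `group_by` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical entries. Each entry carries one or more values per facet
# and no information about how or where it will be displayed.
ENTRIES = [
    {"title": "Dense solver library (hypothetical)",
     "facets": {"type": ["software"], "subject": ["linear algebra"]}},
    {"title": "Solver user guide (hypothetical)",
     "facets": {"type": ["document"], "subject": ["linear algebra"]}},
    {"title": "Flow rendering toolkit (hypothetical)",
     "facets": {"type": ["software"],
                "subject": ["visualization", "numerical simulation"]}},
]


def select(entries, **required):
    """Return the entries that carry every requested facet value."""
    return [
        e for e in entries
        if all(value in e["facets"].get(facet, [])
               for facet, value in required.items())
    ]


def group_by(entries, facet):
    """Organize entry titles under each value of the chosen facet.

    An entry with several values for the facet appears under each one.
    """
    groups = defaultdict(list)
    for e in entries:
        for value in e["facets"].get(facet, ["(unclassified)"]):
            groups[value].append(e["title"])
    return dict(groups)
```

Calling `group_by(ENTRIES, "subject")` and `group_by(ENTRIES, "type")` yields two different organizations of the same entries, which is the sense in which a new organization can be generated on the fly. In a distributed setting, `ENTRIES` would be assembled from the portions held at different sites rather than from a single list.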






Jack Dongarra
Sun Dec 18 14:22:28 EST 1994