Data models used for document digital libraries are generally not suitable for use by software repositories. Although some fields are useful in both settings (e.g., author, title, abstract), software cataloguing requires a number of additional fields. A field that appears in most major catalogues is a requirements field that lists the hardware, operating system, and other software needed to use the catalogued item. Another important field in the case of public domain software repositories such as Netlib, where software is author-supported, is a contact field giving an electronic mail address to which questions and bug reports may be sent. Still another field used by many software repositories is a certification field that states the level at which the software has been certified and possibly includes pointers to certification artifacts such as completed checklists and testing results.
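As an illustration only, the fields discussed above can be sketched as a small catalogue record. The class names, attribute names, and sample values below are assumptions made for this sketch and are not taken from Netlib or from any standard data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Certification:
    """Hypothetical certification entry: a level plus pointers to artifacts."""
    level: str
    # Pointers to artifacts such as completed checklists or testing results
    artifacts: List[str] = field(default_factory=list)


@dataclass
class SoftwareCatalogueRecord:
    """Hypothetical catalogue record for one software item."""
    # Fields shared with document digital libraries
    title: str
    author: str
    abstract: str
    # Fields specific to software cataloguing
    requirements: List[str] = field(default_factory=list)  # hardware, OS, other software
    contact: Optional[str] = None  # e-mail address for questions and bug reports
    certification: Optional[Certification] = None


# Sample values are invented for illustration
record = SoftwareCatalogueRecord(
    title="Example solver",
    author="A. Author",
    abstract="Illustrative entry only.",
    requirements=["Fortran 77 compiler", "BLAS"],
    contact="maintainer@example.org",
    certification=Certification(level="tested", artifacts=["test-results.txt"]),
)
```

Keeping the software-specific fields optional reflects the point that not every repository records them.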
The Reuse Library Interoperability Group (RIG) has developed and approved the Basic Interoperability Data Model (BIDM) as a standard data model for software repository interoperability, and the BIDM has been submitted for balloting as an IEEE standard [3]. The BIDM will be used as a lowest common denominator for interoperation between software repositories. However, because of considerable variation in the purpose, contents, and application domains of different software repositories, no single data model will be suitable for all, and important cataloguing information will be lost in exporting to the BIDM. An example where such loss occurs is the certification field. This field was not included in the BIDM because of the wide variety of certification and evaluation methods in use at different repositories. The RIG also encountered difficulty in developing controlled vocabulary lists because, again, different sets of terms were appropriate for different repositories. The approach now being taken by the RIG is to define a standard for an Extensible Uniform Data Model (EUDM), which will be a meta-model that a repository can use to describe the data model it is using. For example, using the EUDM, a repository will be able to define its certification methods and the meanings of its different certification levels. As a member of the RIG, Netlib is participating in the development and promotion of standard data models for software repositories.
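The loss of information on export to a lowest-common-denominator model, and the idea of a repository describing its own extensions, can be sketched as follows. This is a minimal sketch under stated assumptions: the set of common fields, the function name, and the certification level definitions are all hypothetical and do not reproduce the actual BIDM or EUDM, which define their own classes and attributes.

```python
from typing import Any, Dict, Tuple

# Hypothetical common subset shared by all repositories (NOT the real BIDM schema)
COMMON_FIELDS = ("title", "author", "abstract", "requirements", "contact")


def export_to_common_model(
    record: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a record into what the common model carries and what is lost."""
    exported = {k: v for k, v in record.items() if k in COMMON_FIELDS}
    dropped = {k: v for k, v in record.items() if k not in COMMON_FIELDS}
    return exported, dropped


# Hypothetical meta-model style description of a repository-specific field,
# giving the meaning of each certification level used by that repository
certification_extension = {
    "field": "certification",
    "levels": {
        "untested": "no certification performed",
        "tested": "passed the repository's test suite",
    },
}

# Sample record with invented values
local_record = {
    "title": "Example solver",
    "author": "A. Author",
    "abstract": "Illustrative entry only.",
    "contact": "maintainer@example.org",
    "certification": "tested",
}

exported, dropped = export_to_common_model(local_record)
# 'dropped' now holds the certification entry that the common model cannot carry
```

A receiving repository that also had access to a description like `certification_extension` could interpret the dropped field instead of discarding it, which is the role the text assigns to a meta-model.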