Integration with Document Digital Libraries
The Corporation for National Research Initiatives (CNRI)
is working with five major universities (CMU, Cornell, UC-Berkeley,
Stanford, and MIT) on an ARPA-sponsored project to develop
concepts for digital libraries. As part of this project,
each university is placing its Computer Science Technical
Reports on-line and providing access to the distributed CSTR
collection. Technologies developed for the CSTR project
include the Dienst distributed search system [18]
and a Handle Management Service for assigning, maintaining,
and using unique identifiers for digital library objects [16].
The basic architecture being developed by CNRI for distributed
digital libraries includes the following concepts [26]:
- A digital object which consists of a sequence of bits
plus a unique identifier known as a handle (the binding
between the handle and the sequence of bits may change over time).
- Naming authorities who are responsible for assigning
unique identifiers within their portions of the handle namespace.
- Repositories from which digital objects are available.
- Information and Reference (IR) servers that provide
reference information about collections of digital objects.
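As a rough illustration of how the first three concepts fit together, the following Python sketch models a naming authority that assigns unique handles within its own portion of the namespace, and a repository that binds each handle to a sequence of bits. The class names, the `prefix/local-name` handle syntax, and the example prefix `cs.example` are assumptions made for this sketch only; they are not taken from the CNRI design.

```python
class NamingAuthority:
    """Hypothetical naming authority owning one portion of the handle namespace."""

    def __init__(self, prefix):
        self.prefix = prefix
        self._assigned = set()

    def assign(self, local_name):
        # Assumed handle syntax for illustration: "<authority prefix>/<local name>".
        handle = f"{self.prefix}/{local_name}"
        if handle in self._assigned:
            raise ValueError(f"handle already assigned: {handle}")
        self._assigned.add(handle)
        return handle


class Repository:
    """Hypothetical repository from which digital objects are available."""

    def __init__(self):
        self._objects = {}

    def deposit(self, handle, bits):
        # The binding between a handle and its bits may change over time,
        # so depositing under an existing handle replaces the old bits.
        self._objects[handle] = bits

    def retrieve(self, handle):
        return self._objects[handle]


authority = NamingAuthority("cs.example")
repository = Repository()

handle = authority.assign("tr-95-01")
repository.deposit(handle, b"first draft")
repository.deposit(handle, b"revised report")  # same handle, new bits
```

In this sketch uniqueness is enforced only within one authority; keeping prefixes distinct across authorities is what would make handles globally unique.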
The CNRI work is closely related to the IETF Uniform Resource
Identifier (URI) Working Group's work on Uniform Resource
Names (URNs) [32] and Uniform Resource Citations [17].
CNRI's handle is the equivalent of IETF's URN, and CNRI's
IR server serves a similar function to IETF's URC server.
The Netlib and NHSE Development Group has been engaged in
a parallel effort to implement a location-independent naming
architecture [11].
We provide for two types of location-independent names:
- a Uniform Resource Name (URN), for which the contents it
refers to may change over time, e.g., the "current version of LAPACK".
- a Location Independent File Name (LIFN), for which the binding
between the name and the byte contents of the file it refers to
is fixed, once assigned. This type of name is needed for
unambiguous references when attaching critical reviews or
reporting scientific results obtained using a particular version
of a piece of software. LIFNs also permit reliable and efficient
caching and mirroring of files.
At any given time, a URN is associated with exactly one LIFN.
By looking up the LIFN associated with a URN and then retrieving
the file corresponding to that LIFN, the user is assured of retrieving
the most recent copy, even if some mirrored copies are out-of-date:
because the binding between a LIFN and its contents is fixed, a mirror
that has not yet received the new version cannot serve stale
contents under the current LIFN. Thus, we obtain consistency of replicated
copies without the overhead of a replica control protocol.
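The two-step lookup described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Netlib/NHSE implementation: the `NameServer` and `Mirror` classes, the `retrieve` function, and the `urn:example:` and `lifn:example:` name strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NameServer:
    """Hypothetical service holding the current URN -> LIFN binding."""
    urn_to_lifn: dict = field(default_factory=dict)

    def publish(self, urn, lifn):
        # Rebinding a URN points it at a new LIFN; old LIFNs stay valid.
        self.urn_to_lifn[urn] = lifn

    def resolve(self, urn):
        return self.urn_to_lifn[urn]


@dataclass
class Mirror:
    """Hypothetical mirror site storing immutable files keyed by LIFN."""
    files: dict = field(default_factory=dict)

    def fetch(self, lifn):
        return self.files.get(lifn)  # None if this mirror lacks the file


def retrieve(urn, names, mirrors):
    """Resolve the URN to its current LIFN, then fetch from any mirror holding it."""
    lifn = names.resolve(urn)
    for mirror in mirrors:
        content = mirror.fetch(lifn)
        if content is not None:
            return lifn, content
    raise LookupError(f"no mirror holds {lifn}")


names = NameServer()
stale = Mirror({"lifn:example:lapack-1": b"LAPACK release 1"})
fresh = Mirror({"lifn:example:lapack-1": b"LAPACK release 1",
                "lifn:example:lapack-2": b"LAPACK release 2"})

names.publish("urn:example:lapack-current", "lifn:example:lapack-2")
result = retrieve("urn:example:lapack-current", names, [stale, fresh])
```

Here the out-of-date mirror is consulted first but is skipped, since it holds no file under the current LIFN; no coordination between the mirrors is needed.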
We are also developing a URC server that provides support for the
following:
- Provision by the publisher of attribute-value pairs for a given
URN in the form of cryptographically signed assertions.
- Retrieval and authentication of assertions by users.
- Specification of the data model used for a particular URN.
- Choice of encryption algorithm, including none.
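To make the idea of signed attribute-value assertions concrete, here is a small Python sketch. It is an assumption-laden stand-in rather than the URC server's design: it uses a shared-key HMAC from the standard library in place of a public-key signature, and the function names, the JSON payload layout, and the algorithm labels `"hmac-sha256"` and `"none"` are invented for the example.

```python
import hashlib
import hmac
import json


def sign_assertion(urn, attributes, key, algorithm="hmac-sha256"):
    """Package attribute-value pairs for a URN, optionally with a signature."""
    # Canonical serialization so publisher and verifier sign the same bytes.
    payload = json.dumps({"urn": urn, "attributes": attributes}, sort_keys=True)
    if algorithm == "none":
        signature = ""  # publisher chose to make an unsigned assertion
    elif algorithm == "hmac-sha256":
        signature = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    else:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return {"payload": payload, "algorithm": algorithm, "signature": signature}


def verify_assertion(assertion, key):
    """Return True only if the assertion carries a valid signature."""
    if assertion["algorithm"] != "hmac-sha256":
        return False  # unsigned or unknown: cannot be authenticated
    expected = hmac.new(key, assertion["payload"].encode("utf-8"),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, assertion["signature"])


key = b"publisher-secret"  # hypothetical key material
signed = sign_assertion("urn:example:lapack-current",
                        {"title": "LAPACK", "reviewed": "yes"}, key)
unsigned = sign_assertion("urn:example:lapack-current",
                          {"title": "LAPACK"}, key, algorithm="none")

# Simulate tampering: alter the payload but keep the original signature.
tampered = dict(signed, payload=signed["payload"].replace("yes", "no"))
```

A real deployment would more plausibly use public-key signatures, so that users can authenticate assertions without sharing a secret with the publisher.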
We propose to integrate our software repository naming architecture
with CNRI's digital library architecture in the following manner:
- Both URNs and LIFNs will be expressible as handles, and URN and
LIFN lookup will be merged with the Handle Management Service.
- Our URC server will be an implementation of CNRI's IR server that
may be used for cataloguing general Web resources, including software
and data archives.
- Similar to the Dienst protocol for document repositories,
we will develop service specifications and retrieval protocols
appropriate for software and data repositories. In addition,
similar to the Digital Library Document Architecture that defines
requirements for digital document structure [33], we will define
requirements for software and data archive structures.
Jack Dongarra
Thu Feb 23 09:42:15 EST 1995