... eqn | troff -ms
.EQ
delim $$
.EN
.ND November 19, 1985
.TL
.ps 12
.in 0
Distribution of Mathematical Software Via
.br
Electronic Mail
.AU
.ps 11
.in 0
Jack J. Dongarra\|$size -1 {"" sup \(dg}$\h'.15i'
.AI
.ps 10
.in 0
Mathematics and Computer Science Division\h'.20i'
Argonne National Laboratory\h'.20i'
Argonne, Illinois 60439\h'.20i'
Electronic mail: anl-mcs!dongarra or dongarra@anl-mcs
.AU
.ps 11
.in 0
Eric Grosse\h'.20i'
.AI
.ps 10
.in 0
AT&T Bell Laboratories\h'.20i'
Murray Hill, New Jersey 07974\h'.20i'
Electronic mail: research!ehg or ehg@btl.csnet
.FS
.ps 9
.vs 11p
This draft was typeset on \*(DY. Unix is a trademark of AT&T Bell Laboratories.
.br
$size -1 {"" sup \(dg}$\|The work of this author was supported in part by the National Science Foundation under Agreement No. DCR-8419437. Any opinion, findings and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation.
.FE
.QS
.sp 2
.ps 10
.in .25i
.ll -.25i
.I Abstract
\(em A large collection of public-domain mathematical software is now available via electronic mail. Messages sent to "netlib@anl-mcs" (on the Arpanet/CSNET) or to "research!netlib" (on the Unix\(tm network) wake up a server that distributes items from the collection. For example the one-line message, "send index", gets a library catalog by return mail. We describe how to use the service and some of the issues in its implementation.
.in
.ll
.QE
.nr PS 11
.nr VS 16
.nr PD 0.5v
.SH
Introduction.
.PP
A large pool of high-quality mathematical software is in use at educational, research, and industrial institutions around the country. At present this software is available from a number of distribution agents \(em for example AT&T Bell Laboratories for the PORT library, IMSL, the National Energy Software Center (NESC), and the Numerical Algorithms Group (NAG).
All do a fine job with the distribution of large packages of mathematical software, but there is no provision for convenient distribution of small pieces of software. Currently scientists transmit such software by magnetic tapes, but contacting authors and deciphering alien tape formats wastes an intolerable amount of time.
.PP
A new system,
.I
netlib,
.R
provides quick, easy, and efficient distribution of public-domain software to the scientific computing community on an as-needed basis. It sends electronic mail over Arpanet, CSNET, Telenet, or Unix uucp.
.SH
Netlib in Use.
.PP
Imagine an engineer who needs to compute several integrals numerically. He consults the resident numeric expert, who advises trying the routine $dqag$ for some preliminary estimates and then using $gaussq$ for the production runs. The engineer types at his terminal
.I
.nf
mail research!netlib
send dqag from quadpack
send gaussq from go
.
.fi
.R
In a short time, he receives back two pieces of mail from $netlibd$. The first contains the double precision Fortran subroutine $dqag$ and all the routines from $quadpack$ that $dqag$ calls; the second contains $gaussq$ and the routines it calls.
.PP
The utility routine $d1mach$ was not included with $gaussq$, since it is probably already installed on his system; if he had wanted it, he could have changed his request to
.I
"send gaussq from go core"
.R
to include the ``core library'' of machine constants and basic linear algebra modules in the search list.
.PP
Should the engineer later decide that the routine $dqags$ would be more effective, he could send the request
.I
"send dqags but not dqag from quadpack"
.R
to get $dqags$ and any subroutines not already sent with $dqag$ .
.PP
This engineer happens to be running Unix; if instead his machine were on the Arpanet, he would use the address
.I
netlib@anl-mcs.
.R
If he needed the code in upper case, he would send his request in all caps; to get single precision, he need simply change the names of the routines or the libraries, as appropriate. Finally, he could ask for several routines together:
.I
.nf
SEND RG RS FROM DEISPACK
SEND DGECO FROM LINPACK CORE
.fi
.R
.PP
Meanwhile, the numerical expert decides she should check on the current contents of netlib. She types
.nf
.I
mail research!netlib
send index
.fi
.R
The return mail shows a library $toeplitz$ she is not familiar with, so she sends mail
.I
"send index for toeplitz"
.R
to see what is included. Curious to see a typical routine, she tries
.I
"send only cslz from toeplitz"
.R
and gets just $cslz$, not any of the routines which it calls.
.PP
More formally, requests have the following syntax:
.nf
$request_line$:
    send $names$ $exclusions sub opt$ $libraries sub opt$
    send only $names$ $libraries sub opt$
    who is $names$
$exclusions$:
    but not $names$
$libraries$:
    for $names$
    from $names$
.fi
where $names$ is a list of words, separated by blanks.
.PP
Just how quickly these requests are answered depends on the speed of the network communications involved, but five or ten minutes is typical for Arpanet. CSNET or Unix uucp may require anywhere from minutes to days to transmit a message from sender to recipient. The actual processing time is insignificant. One user wrote back enthusiastically that the system was so fast he preferred using it to hunting around on his own machine for the library software.
.SH
Material Available through Netlib.
.PP
Currently netlib offers: linear algebra routines from LINPACK [9], EISPACK [13,15], and TOEPLITZ [1]; optimization routines from MINPACK [13] and Gay [8]; the special function library FNLIB by Fullerton; code from the book by Forsythe, Malcolm, and Moler [10]; quadrature routines from QUADPACK [16]; PPPACK routines from de Boor's
.I
Practical Guide to Splines [3];
.R
the Collected Algorithms of the ACM published in the
.I
Transactions on Mathematical Software;
.R
FISHPACK routines providing finite difference approximations for elliptic boundary value problems [18]; iterative linear system solvers from ITPACK [14]; the public subset of FITPACK by Cline; routines for machine constants and error handling and other public routines from the PORT library [11] and SLATEC, the Basic Linear Algebra Subroutines and extensions [12], Golub and Welsch's GAUSSQ [10], biharmonic solvers [2], the SCPACK Schwarz-Christoffel conformal mapping program [19], the PARANOIA floating point test, the PCHIP routines for Hermite cubics by Fritsch and Carlson, the MA28 sparse matrix routine from the Harwell library, the Y12M package for sparse linear systems, Scott's LASO block Lanczos code, and miscellaneous other items. The multigrid program PLTMG by Bank and the multiple precision package by Brent are also in the collection, though they are probably too large to realistically send by mail.
.PP
The various standard linear algebra libraries are included for convenience, but the real heart of the collection lies in the recent research codes and the "golden oldies" that somehow never made it into standard libraries. Almost all of these programs are in Fortran but some are in C, such as the routine $rainbow$ by Grosse for generating uniformly spaced colors. There is also a collection of errata for numerical books, descriptions and benchmark data for various computers, test data for linear programming collected by Gay, and the ``na-list'' electronic address book maintained by Gene Golub.
.PP
We do \fInot\fP send out entire libraries. The computer center setting up a comprehensive numerical library should get magnetic tapes through the usual channels.
.PP
There is no reason to restrict the collection to mathematical software. If the habit of sharing work using software libraries of general utility becomes popular in other fields, we would be delighted to accommodate them.
.SH
The Netlib Server.
.PP
The netlib server runs under the Unix operating system (8th edition at Bell Labs and 4.1BSD at Argonne) and consists of a few shell scripts and C programs. The following discussion necessarily assumes some familiarity with Unix commands.
.PP
When mail arrives for $netlib$, it is piped through a sed editor script that strips punctuation, through a sort process to remove duplicates, and into a C program that parses the request. This program then invokes a shell script that translates the given library names into a search list and invokes the system loader with the given routine names as external symbols to be resolved. The resulting loader map is edited into a list of file names to satisfy the request. These files, along with a time stamp and disclaimer, are then mailed back to the requester. A line is added to a logfile showing the time, return address, number of characters sent, and requested routine and library names.
.PP
The programs can tolerate minor syntax deviations, since we do get requests like: "Please send me r1mach from port. Thank you." from people who don't realize they are talking to a program. Users sometimes submit a single request on the subject line of the mail message, so a "Subject:" prefix is also allowed. One user even sent
.I
"send index 4 eispack"
.R
instead of
.I
"send index for eispack",
.R
so $4$ is a synonym for
.I
for
.R
and
.I
from.
.R
(This is not such an unreasonable mistake, considering that the instructions for using netlib are often given over the phone.) However, we make no attempt to accept arbitrary English input.
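.PP
As an illustration only, the request syntax given earlier and the tolerances just described can be sketched in a few lines of Python. This is not the netlib parser, which is a sed script feeding a C program; the function name parse_request, the layout of the returned dictionary, and the list of filler words are assumptions made for this example.
.nf
```python
import re

# Assumption: filler words dropped so that polite requests still parse.
NOISE = {"please", "me", "thank", "you", "thanks"}
# The text says "4" is accepted as a synonym for "for" and "from".
LIB_WORDS = {"for", "from", "4"}


def parse_request(line):
    """Return a dict describing one request line, or None if it is not one."""
    upper = line.isupper()  # an all-caps request asks for upper-case code
    text = line.lower().strip()
    if text.startswith("subject:"):
        text = text[len("subject:"):]
    # Turn every character that is not a letter, digit or blank into a blank.
    words = [w for w in re.sub("[^a-z0-9 ]", " ", text).split()
             if w not in NOISE]
    if words[:2] == ["who", "is"]:
        return {"verb": "who is", "only": False, "names": words[2:],
                "exclude": [], "libraries": [], "upper": upper}
    if words[:1] != ["send"]:
        return None
    words = words[1:]
    only = words[:1] == ["only"]
    if only:
        words = words[1:]
    names, exclude, libraries = [], [], []
    target = names
    i = 0
    while i < len(words):
        word = words[i]
        if word in LIB_WORDS:
            target = libraries  # everything after for/from is a library
        elif (word == "but" and words[i + 1:i + 2] == ["not"]
              and target is names and not only):
            target = exclude
            i += 1  # skip the "not" as well
        else:
            target.append(word)
        i += 1
    if not names:
        return None
    return {"verb": "send", "only": only, "names": names,
            "exclude": exclude, "libraries": libraries, "upper": upper}
```
.fi
.PP
Under these assumptions, parse_request("Subject: send index 4 eispack") yields the name "index" and the library "eispack", and a line that is not a request yields None.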
.PP
One way to start up the mail processing is to have a daemon process that wakes up every few minutes and checks for a nonempty mailbox. In 8th edition Unix, thanks to Dave Presotto, if a mailbox contains
.I
Pipe to rcv.cmd,
.R
then the mail delivery software, instead of appending the incoming text to a mailbox, will pipe the text to the command $rcv.cmd$. (Similar functionality is available from the Berkeley mail alias facility.) The mailbox is owned by user-id
.I
netlibd
.R
so that the process is run as netlibd; hence the return mail will have this mnemonic name attached. The user-id is not just
.I
netlib
.R
because if the return mail command fails or if the remote user sends a reply, the message should go to the administrator, not back into the request processor. For example, mail once came back announcing that a user had gone on vacation in the few hours before the netlib response had gotten to his mailbox!
.PP
The file that describes the mapping from library names to loader search lists consists simply of lines of the form "eispack => \-leispack". Several similar lines allow for alternate spellings such as \f2eispac\f1 and \f2eispak\f1. This file is easily updated when new libraries are added to the collection.
.PP
A subtle security problem arises from the implementation: we construct commands to a shell based on text from a user. It could be catastrophic to blindly send mail to a return address of \f2kgbvax!\`rm -r *\`\f1, since the backquote characters tell the shell to first execute a command that removes all files! Therefore, the request parser checks for dangerous characters. Another potential security problem is that someone might tamper with the program text as it is en route to the user. For now, we feel that the threat is not serious enough to adopt encryption schemes, though those would be easy to add.
.PP
Even though there are standards, it is not particularly easy to extract from a request a valid return address.
There are comment brackets and anticomment brackets to be recognized and address transformations to be unwound, but we now seem to be correctly answering except when the return address contains blanks.
.PP
We do not use checksums since the network software already provides a reliable channel. We have received only one complaint, which involved noise on the link from a user's Vax to his PC; we regard that as his responsibility. If checksums were required, we would choose a scheme like that in MOSIS [15] which allows for anticipated, insignificant changes such as addition of trailing blanks on lines. To avoid problems with mail processing programs in the various networks, our request syntax avoids colons and our replies start with a blank line so that message contents are not processed as header information along the mail route. Problems occasionally arise with computers that are willing to send us mail, but will not allow us to send mail back. Delays for multihop and inter-network mail are more common, but we have no way to collect statistics on that and in any event it is out of our control.
.PP
The most difficult problem we have encountered has been length limitation; a few of the programs are more than 100 kilobytes, and that is more than the mail systems at many Arpanet sites will tolerate. Of course, the file transmission protocols can handle larger sizes, but those are too cumbersome and unstandardized for our purposes. We get around this by splitting up large items into several pieces of mail, but would prefer to see the mail systems themselves improved. We considered using Huffman coding to compress the files we send out, but that would only save about a factor of two and would require that we ship decoding programs. However, in setting up the netlib collection of test data for linear programming, David Gay did decide to adopt a program for compressing MPS format files.
.SH
Discussion.
.PP
We chose this mode of interaction via electronic mail, keeping the intelligence local to the central depository, because mail is at present the only ubiquitous data communication service. We considered putting an interactive program at remote sites, communicating by mail with the depository. That would allow a better dialogue (``Do you want that in single or double?'') but would be difficult to write in the necessary portable way.
.PP
We are not aware of any comparable software distribution service in existence, although some personal computer "public bulletin board" systems may be somewhat similar. At least one bulletin board has been confiscated because it contained a stolen telephone charge number. For this reason and to control space, we do not allow users to put their own software automatically in the collection.
.PP
The netlib service provides its users with features not previously available. There are no administrative channels to go through. Since no human processes the request, it is possible to get software in the middle of the night. The most up-to-date version is always available. Individual routines or pieces of a package can be obtained instead of a whole collection. One of the problems with receiving a large package of software is the volume of material. Often only a few routines are required from a package, yet the material is distributed as a whole collection and cannot easily be stripped off.
.PP
At present, netlib is simply a clearinghouse for contributed software and therefore subject to various disadvantages that have plagued such projects in the past: the only documents, example programs, and implementation tests are those supplied by the code author or other users. There may be multiple codes for the same task and no help in choosing which is best. We have made an effort not to stock numerous copies of machine constants, but in general we have left submitted codes untouched.
Our system differs from previous efforts mainly in having a different focus than, say, the Quantum Chemistry Exchange, and a more convenient distribution mechanism.
.PP
Several years ago there was a discussion on the Arpanet prompted by a query from Jim Pool as to whether the time was not ripe for "a portable set of documentation for interactive access by users of a collection of mathematical software." His idea was that the SLAC NAPLUG [5] be put into an expert system form. We have not yet tackled that problem in netlib, although we do pass along whatever documentation comes from the original code authors. Since the time of that discussion, local mathematical typesetting with output on terminals has become more common but most of the other objections remain. The user cannot be assumed to describe his problem exactly as the numerical analyst would; thus the program must be able to translate from the engineering to the mathematical domain. Understanding only the general nature of the user's problem is not enough; this leaves too much documentation to wade through. A certain amount of insight is required to realize that a user may not need exactly what he thinks he needs.
.IP
"Do you need the matrix inverse? Maybe you just need the solution to a linear system."
.IP
"This is a correlation matrix, and I really do want to look at the elements."
.LP
The general user will only be looking for a library routine a few times a year. He will certainly not remember more than a few commands; a sophisticated search language is infeasible. Who is going to write all the documentation in the required format? At least a modest knowledge of numerical analysis and considerable consulting experience will be necessary, but the job is tedious and unrewarding. The best interactive documentation system is a good numerical analyst interested in the users' problems.
Unfortunately, this system has its own difficulties: expensive to reproduce, inconsistent in intelligence and alertness, hard to transport, prone to use buzzwords, often unavailable, specialized, difficult to keep current. So there have been continuing efforts to build online numerical help facilities, the most successful of these being GAMS at the National Bureau of Standards, the NAG online help facilities and decision trees, and NIT at Oak Ridge. Entirely new writing styles are possible. Beyond the graph structured text popularized in "programmed learning manuals" a decade ago, specific documentation might be derived, rather than simply searching for and listing parts of a file. Instead of a single example, an online consultant could provide a complete program tailored to the problem at hand. Also, some knowledge of the previous experience of the reader might be used to modify the level of explanation and avoid needless repetition.
.PP
The main cost of running this service is for communications. If it becomes necessary, we will require uucp users to call the hosts to pick up their return mail so that such costs are distributed fairly. At an average of a few requests per day, the traffic has been small enough to impose a negligible load on the host systems. Disk costs are controlled by discarding files that the host administrators are not themselves interested in keeping. The current collection occupies 32 megabytes. Most important, the human costs for maintaining the collection are modest and consist mainly of collecting software. We do not see how we could run such a widely accessible and low overhead operation if we had to charge for the service, and are not interested in doing so. (See, however, [4] for a description of the Toolchest electronic ordering system. One problem mentioned there is that users want to see demonstrations of software before purchase.)
.PP
The coverage of netlib obviously will tend to reflect the interests of the collectors, so we would welcome "associate editors" to augment the collection. Please send mail to the authors. At present, there are just two distribution sites. Mail delays would be reduced if machines on other networks or in other countries were willing to also serve as depositories. On the other hand, it is difficult even to keep two locations in sync! The software netlib uses to reply to mail is itself available from netlib, so it would be fairly easy for someone to, say, announce a service for searching a bibliography that he has collected.
.PP
Netlib, being free, cannot replace commercial software firms. We provide no consulting, make no claims for the quality of the software distributed, and do not even guarantee the service will continue. In compensation, the quick response time and the lack of bureaucratic, legal, and financial impediments encourage researchers to send us their codes. They know that their work can quickly be available to a wide audience for testing and use. We hope netlib will promote the use of modern numerical techniques in general scientific computing.
.sp
.SH
Acknowledgements.
.PP
We wish to express our gratitude to the many authors and editors who have permitted their codes to be freely distributed and to Gene Golub for his encouragement and help in starting this project. The trick of editing a loader map is taken from the GAMS system at the National Bureau of Standards. Finally, the managements of our organizations deserve thanks for sponsoring this public service.
.SH
References.
.IP [1]
.R
O.B. Arushanian, et al,
.I
The TOEPLITZ Package Users' Guide,
.R
Argonne National Laboratory, ANL-83-16, (1983).
.sp
.IP [2]
P. Bj\o'o/'rstad, "Fast Numerical Solution of the Biharmonic Dirichlet Problem on Rectangles",
.I
SIAM J. on Numerical Analysis,
.R
20 (1983), 59-71.
.sp
.IP [3]
C. de Boor,
.I
A Practical Guide to Splines,
.R
Applied Mathematical Science, Vol.
27, Springer-Verlag, New York, 1978.
.sp
.IP [4]
Catherine A. Brooks, "Experiences with Electronic Software Distribution",
.I
USENIX Association 1985 Summer Conference Proceedings,
.R
Portland, Oregon.
.sp
.IP [5]
T. F. Chan, W. M. Coughran, Jr., E. H. Grosse, M. T. Heath, F. T. Luk, "Numerical Analysis Program Library User's Guide", SLAC Computing Services User Note 82, Stanford University, 1976.
.sp
.IP [6]
W.J. Cody, "The Construction of Numerical Subroutine Libraries",
.I
SIAM Review,
.R
16 (1974), 36-46.
.sp
.IP [7]
W.J. Cody, "Observations on the Mathematical Software Effort", to appear in
.I
Sources and Development of Mathematical Software,
.R
ed. W. Cowell, Prentice-Hall, Englewood Cliffs, N.J., 1983.
.sp
.IP [8]
J. E. Dennis, D. M. Gay, R. E. Welsch, "An Adaptive Nonlinear Least Squares Algorithm", ACM Trans. on Mathematical Software, 7 (1981) 348-368,369-383.
.sp
.IP [9]
J.J. Dongarra, J.R. Bunch, C.B. Moler, and G.W. Stewart,
.I
LINPACK Users' Guide,
.R
SIAM Publications, Philadelphia, 1979.
.sp
.IP [10]
G.E. Forsythe, M.A. Malcolm, and C.B. Moler,
.I
Computer Methods for Mathematical Computations,
.R
Prentice-Hall, Englewood Cliffs, N.J., 1977.
.sp
.IP [11]
P. A. Fox, A. D. Hall, N. L. Schryer, "The PORT Mathematical Subroutine Library", ACM Trans. on Mathematical Software, 4 (1978) 104-126, 177-188.
.sp
.IP [12]
W. Fullerton,
.I
FNLIB User's Manual,
.R
AT&T Bell Laboratories, (1981).
.sp
.IP [13]
B.S. Garbow, J.M. Boyle, J.J. Dongarra, and C.B. Moler,
.I
Matrix Eigensystem Routines - EISPACK Guide Extension,
.R
Lecture Notes in Computer Science, Vol. 51, Springer-Verlag, Berlin, 1977.
.sp
.IP [10]
G.H. Golub, J.H. Welsch, "Calculation of Gauss Quadrature Rules",
.I
Mathematics of Computation,
.R
23 (1969) 221-230.
.sp
.IP [14]
.R
D.R. Kincaid, J.R. Respess, D.M. Young, "ITPACK 2C: A Fortran Package for Solving Large Sparse Linear Systems by Adaptive Accelerated Iterative Methods",
.I
ACM Trans. Mathematical Software,
.R
8 (1982), 302-322.
.sp
.IP [12]
C. Lawson, R. Hanson, D. Kincaid, and F. Krogh, "Basic Linear Algebra Subprograms for Fortran Usage",
.I
ACM Trans. Mathematical Software,
.R
5 (1979), 308-371.
.sp
.IP [15]
G. Lewicki, D. Cohen, P. Losleben, D. Trotter, "MOSIS: Present & Future"
.I
1984 Conf. on Advanced Research in VLSI,
.R
MIT, Jan. 1984.
.sp
.IP [13]
.R
J. Mor\*'e, D. Sorensen, B. Garbow, and K. Hillstrom,
.I
The MINPACK Project,
.R
in Sources and Development of Mathematical Software, edited by W. Cowell, Prentice Hall, pp. 88-111, 1984.
.sp
.IP [16]
R. Piessens, E. deDoncker-Kapenga, C. Uberhuber, D. Kahaner,
.I
Quadpack: a Subroutine Package for Automatic Integration,
.R
Series in Computational Mathematics v.1, Springer Verlag, 1983.
.sp
.IP [17]
B.T. Smith, J.M. Boyle, J.J. Dongarra, B.S. Garbow, Y. Ikebe, V.C. Klema, and C.B. Moler,
.I
Matrix Eigensystem Routines - EISPACK Guide,
.R
Lecture Notes in Computer Science, Vol. 6, 2nd Edition, Springer-Verlag, Berlin, 1976.
.sp
.IP [18]
P.N. Swarztrauber and R.A. Sweet, "Efficient FORTRAN Subroutines for the Solution of Separable Elliptic Equations, Algorithm 541",
.I
ACM Trans. Mathematical Software,
.R
5 (1979), 352-364.
.sp
.IP [19]
L. N. Trefethen, "Numerical Computation of the Schwarz-Christoffel Transformation", SIAM J. Scientific and Statistical Computing, 1 (1980) 82-102.