Last top-level updates:
Mon Sep 9 11:00:21 2024
[but the bibliography archive files are updated often!]
This is an archive of freely distributable bibliographic data in BibTeX format; see the sections below for more information on its contents, and on how to mirror and use the data.
If you are unfamiliar with BibTeX, or bibliographic markup systems, and would like to learn more, visit this tutorial. It discusses many of the issues that are important for bibliographic work, and describes numerous software tools that can make such work easier and more productive.
You can jump from here directly to the BibNet Project download or mirror sections.
The BibNet Project was created in August 1994 by Stefano Foresti (then at the University of Utah, now at the University of California, Merced), Eric Grosse (then at Bell Labs, now at Google), and Nelson H. F. Beebe (still at the University of Utah) with the intent of collecting, and making freely available on the Web, accurate and clean publication bibliographies in BibTeX format of well-known scientists working in numerical analysis.
The third co-founder (NHFB) is the author of numerous software tools for converting Web HTML pages and other data to BibTeX format, and for joining, merging, ordering, prettyprinting, searching, sorting, and validating BibTeX data; some of those tools are described in the papers A Bibliographer's Toolbox and Bibliography prettyprinting and syntax checking. He is responsible for almost all of the subsequent evolution and maintenance of the BibNet Project.
Many of those software tools are available at these sites: https://www.math.utah.edu/pub/emacs and https://www.math.utah.edu/~beebe/software/
Note: In the tables in the rest of this document, you can replace .html with .bib in any bibliography link to get the original BibTeX file. With similar changes, you can get DVI files (.dvi), LaTeX wrappers (.ltx), PDF files (.pdf), compressed PostScript files (.ps.gz or .ps.xz), spelling dictionaries (.sok), and titleword cross-reference files (.twx).
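The extension swap described in the note above is easy to script. The following is a minimal sketch, not part of the archive's own tooling: the helper name companion_url is hypothetical, and the sample link is only an illustrative path assembled from the FTP root and directory layout described elsewhere in this document.

```shell
# companion_url URL EXT
# Replace a trailing ".html" in URL with ".EXT" (hypothetical helper).
companion_url() {
    printf '%s.%s\n' "${1%.html}" "$2"
}

# Illustrative link only; substitute any bibliography link from the tables.
url='ftp://ftp.math.utah.edu/pub/bibnet/authors/e/einstein.html'

# Extensions listed in the note above.
for ext in bib dvi ltx pdf ps.gz ps.xz sok twx; do
    companion_url "$url" "$ext"
done
```

Each line of output is a candidate URL for one companion file; whether a given format exists for a given bibliography still has to be checked against the server.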
Since its founding, the archive has been expanded to include a few subject-specific bibliographies, and extensive bibliographies of important scientists in other fields of computing, as well as numerical analysis.
Two nineteenth-century pioneers in computing, Charles Babbage and Augusta Ada King, Countess of Lovelace, are also covered in a separate bibliography, adabooks.bib, in the TeX User Group bibliography archive. All of the entries in Parts 3 to 6 of adabooks.bib are included in Parts 1 and 2 of the Babbage and Lovelace bibliographies here.
The small number of publications by, and about, the nineteenth-century French mathematician Évariste Galois (1811–1832) are covered in Part 3 of Leopold Infeld's bibliography.
The archive also covers pioneers in quantum physics and quantum chemistry, several of them winners of the Nobel Prize in Chemistry or Physics, or the US National Medal of Science.
There are bibliographies for dozens of other authors in the BibNet Project archive; consult the download section to find them.
Subject-specific bibliographies include at least these:
The much larger TeX User Group bibliography archive contains bibliographies for about 890 journals, and about 90 subject-specific bibliographies. Of those, several could easily have been incorporated as part of the BibNet Project, but have not been. Here is a partial list of them, divided into several categories:
| | |
|---|---|
| Mach operating system | GNU (GNU's Not Unix) system |
| GNU/Linux operating system | MINIX operating system |
| Oracle database system | Plan 9 distributed operating system |
| UNIX | |

| | |
|---|---|
| ANSI Standards | ECMA Standards |
| IEC Standards | ISO Standards for programming languages |
| Software standards | |
Two other large BibTeX-format bibliography archives of note are the Karlsruhe Collection of Computer Science Bibliographies (3 million entries in early 2013), and the Universität Trier DBLP Computer Science Bibliography (2.1 million entries in early 2013).
The Karlsruhe archive mirrors the Utah archives, possibly with some rearrangement into subject-specific directories.
Oak Ridge National Laboratory and Sandia National Laboratories also mirror the Utah archives, without rearrangement: see links in the mirror section of this document.
Each BibTeX bibliography has the standard file extension .bib. It is accompanied by a LaTeX file with extension .ltx that is used to typeset all of the entries in the BibTeX file to demonstrate that they are free of TeX-markup errors, and show how they might appear in a reference list in one particular bibliography style. Each bibliography file also has a spelling exception list file with extension .sok. Those three files are the only ones created by humans. The remaining files for each bibliography are created by software, and are automatically updated as new versions of the bibliography are released on the BibNet Project Web site. Their file extensions are:
In a Web browser, the .bib and .html files should be visually identical, allowing cut-and-paste operations from either, but the HTML file is enriched with hypertext links that in many cases lead to online documents. The BibTeX file is the critical file, and is needed if you wish to incorporate multiple references from a given bibliography file in your document.
The BibNet Project archive is divided into two main directory trees: authors and subjects. The first is further subdivided into subdirectories named for the first letter of the files they contain. The second is a flat directory, with no subdirectories.
Author-specific files are named with the family name first, in lowercase letters: dirac-p-a-m, shannon-claude-elwood, von-neumann-john, and so on. However, Albert Einstein's bibliography is just called einstein.
You can find top-level indexes of BibNet Project archive files in authors and subjects, and initial-letter indexes for authors in a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z.
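The naming and layout rules above combine into a predictable path for each author file. The sketch below illustrates that mapping; author_bib_path is a hypothetical helper, the paths it prints are relative to the archive root, and it assumes its argument already follows the archive's convention (family name first, lowercase, hyphen-separated).

```shell
# author_bib_path BASENAME
# Print the path, relative to the archive root, of an author bibliography.
# The subdirectory is the first letter of the file name.
author_bib_path() {
    first_letter=$(printf '%s' "$1" | cut -c1)
    printf 'authors/%s/%s.bib\n' "$first_letter" "$1"
}

# File names quoted in the text above.
for name in dirac-p-a-m shannon-claude-elwood von-neumann-john einstein; do
    author_bib_path "$name"
done
```

Note that the subdirectory follows the file name, not the person's surname as usually alphabetized, so von-neumann-john is filed under v.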
If you are willing, and have adequate disk space (about 250 MB), we strongly urge you to consider mirroring the project archive from its home site to your site, either for local use only, or for public access at your Web site. Librarians have a good acronym for that practice: LOCKSS (Lots of Copies Keep Stuff Safe).
If you succeed in creating a stable up-to-date mirror that you believe will be able to exist for a long time, please send e-mail to the maintainers with a request for it to be added to a list of BibNet Project mirrors.
One brute-force way to pull the entire archive to your system is a recursive retrieval with either of two popular Unix utilities:
    % ncftpget -R ftp://ftp.math.utah.edu/pub/bibnet/
    % wget --recursive ftp://ftp.math.utah.edu/pub/bibnet/
A better way is to exploit the fact that the master host FTP server can return entire directory trees in any of several archive formats:
    % wget ftp://ftp.math.utah.edu/pub/bibnet.jar
    % curl -o bibnet.tar.gz ftp://ftp.math.utah.edu/pub/bibnet.tar.gz
    % ncftpget ftp://ftp.math.utah.edu/pub/bibnet.tar
    % wget ftp://ftp.math.utah.edu/pub/bibnet.tar.bz2
    % wget ftp://ftp.math.utah.edu/pub/bibnet.zip
    % wget ftp://ftp.math.utah.edu/pub/bibnet.zoo
You can use those same URLs in most Web browsers, and then unpack the just-downloaded archive file in a suitable location. The unpacking normally preserves file protections and file timestamps.
The preferred way, however, is to use the rsync utility, which uses a clever algorithm on both sides of the connection to transfer only the changes between files, dramatically reducing transfer times when the two archives have similar contents.
    # Find out what collections are available to rsync:
    % rsync rsync://ftp.math.utah.edu/
    CTAN            all of ftp://ctan.tug.org/ (huge)
    bib             TeX User Group bibliography archive (large)
    bibnet          BibNet Project bibliography archive
    texlive         all of ftp://tug.org/texlive/ (huge)

    # Fetch one of them (the -a option preserves important timestamp
    # information, and the -z option turns on compression to reduce
    # network traffic; add the -v option for verbose output):
    % rsync -a -z rsync://ftp.math.utah.edu/bibnet .

    # See how long a subsequent update might take
    % time rsync -a -z rsync://ftp.math.utah.edu/bibnet .
    0.004u 0.013s 0:00.34 2.9% 0+0k 0+0io 0pf+0w
rsync can be used to populate an initial copy of a mirror.
The rsync utility should now be standard in most Unix distributions, but if your machine does not have it, you can find it at http://rsync.samba.org/. There is a separate project that wraps the command-line version in a graphical user interface for common Unix, Mac OS X, and Microsoft Windows systems: http://www.opbyte.it/grsync/. Prebuilt versions of grsync are installable from some Unix package distributions. The grsync program remembers your settings, so once you have used it to configure and run a mirror update, you can run it manually from time to time and get updates with a single click.
Once you have a copy of the archive on your system, use a regularly scheduled cron job to keep your copy up to date. We recommend at least weekly updates if your copy is for local use only, and nightly updates (our winter time is GMT/UTC - 7 hours) if your copy is a mirror on a public Web site.
There are public mirrors of the BibNet Project archives at Oak Ridge National Laboratory (Oak Ridge, TN, USA) and at Sandia National Laboratories (Albuquerque, NM, USA).
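A scheduled update might be arranged along the following lines. This is a sketch under stated assumptions, not the maintainers' own procedure: the script name, the destination directory /srv/mirror/bibnet, and the 03:15 schedule are all hypothetical, and only the rsync command itself is taken from the examples earlier in this section.

```shell
#!/bin/sh
# mirror-bibnet.sh (hypothetical name): refresh a local BibNet mirror.
#
# Example crontab entry for a nightly update at 03:15 local time
# (the time and the script path are assumptions):
#   15 3 * * * /usr/local/bin/mirror-bibnet.sh --run

MIRROR_DIR=${MIRROR_DIR:-/srv/mirror/bibnet}   # assumed local destination
RSYNC=${RSYNC:-rsync}                          # can be overridden for a dry test

update_mirror() {
    # -a preserves timestamps and permissions; -z compresses the transfer.
    mkdir -p "$MIRROR_DIR" &&
        "$RSYNC" -a -z rsync://ftp.math.utah.edu/bibnet "$MIRROR_DIR"
}

# Only contact the server when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
    update_mirror
fi
```

For a weekly rather than nightly update, the schedule fields of the crontab entry could read `15 3 * * 0` instead.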
There are several ways to search the archive files, apart from Web search engines whose own copies of the data are likely to be several weeks out of date. If you know which bibliography has the entry you want, then just visit the file in your favorite text editor and use its search commands.
The Unix grep command-line utility family is one common approach to search in multiple files:
    % grep -B 4 '^ *title *= .*Einstein.*Berlin' *.bib
Its limitation is that it is line-based: a search pattern must match within a single line, so a title that wraps across two or more lines can be missed.
The bibsearch utility provides a much faster way, and it eliminates the line-boundary constraint because each BibTeX entry is treated as a single block of text:
    % bibsearch
    > title & einstein & berlin & 2003
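Where bibsearch is not installed, the entry-at-a-time idea can be approximated with standard tools. The sketch below is not how bibsearch works internally, and it makes no claim to match its speed or query syntax. The helper entry_grep is hypothetical: it assumes that every entry begins with @ in column one, and it treats each argument as a plain, case-insensitive substring that must occur somewhere in the entry. The two sample entries are made up for the demonstration.

```shell
# entry_grep WORD...
# Read BibTeX data on standard input and print each entry that contains
# every WORD, ignoring case, even when the words sit on different lines.
entry_grep() {
    awk -v pats="$*" '
        function emit(   i, lower) {
            if (entry == "") return
            lower = tolower(entry)
            for (i = 1; i <= n; i++)
                if (index(lower, want[i]) == 0) { entry = ""; return }
            printf "%s", entry
            entry = ""
        }
        BEGIN  { n = split(tolower(pats), want, " ") }
        /^@/   { emit(); inside = 1 }     # a new entry starts here
        inside { entry = entry $0 "\n" }  # collect the lines of the entry
        END    { emit() }
    '
}

# Made-up sample data: the title of the first entry wraps across lines.
entry_grep title einstein berlin 2003 <<'EOF'
@Article{Sample:2003:EB,
  author = "A. N. Author",
  title =  "{Einstein} and the physicists of
            {Berlin}",
  year =   "2003",
}

@Book{Sample:1999:NB,
  title =  "{Niels Bohr} in {Copenhagen}",
  year =   "1999",
}
EOF
```

Only the first sample entry is printed, although no single line of it contains both Einstein and Berlin. On real data the function would be fed with something like `cat *.bib | entry_grep einstein berlin`.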
A more powerful way to search is first to convert the data to SQL (Structured Query Language) with bibtosql, and then to use the bibsql front end, or the sqlite3 program directly, and enter SQL commands for selective searching and display of specified fields, or even entire BibTeX entries:
    # create the SQLite3 database (once only)
    % bibtosql --create -database sqlite *.bib | sqlite3 bibnet.db

    # search the SQLite3 database
    % sqlite3 bibnet.db
    -- how many BibTeX entries are in the database?
    sqlite> select count(*) from bibtab;
    37554

    -- which entries are about Einstein's years in Berlin?
    sqlite> select filename, label from bibtab
            where (title like '%Einstein%Berlin%')
            order by filename, year, label;
    bohr-niels.bib|Hendry:1986:BRJ
    bohr-niels.bib|Hendry:1986:BRW
    ...
    einstein.bib|Treder:1966:ESE
    einstein.bib|Kirsten:1979:AEB
    einstein.bib|Nelkowski:1979:ESB
    ...
    einstein.bib|vanDongen:2012:MIM
    ...

    -- get the most recent entry about Einstein in Berlin
    sqlite> select entry from bibtab where (label = 'vanDongen:2012:MIM');
    @Article{vanDongen:2012:MIM,
      author = "Jeroen van Dongen",
      title = "Mistaken Identity and Mirror Images: {Albert and Carl Einstein}, {Leiden} and {Berlin}, {Relativity} and Revolution",
      ...
    }
The sqlite3 program is public-domain software. It is extremely portable, and its database files do not depend on the host operating system or the host CPU's memory byte order; once created, those files can be copied and used everywhere. Prebuilt versions are available for common desktop platforms, and even for some mobile telephones!
Because most BibTeX entries in the archives carry a timestamp field that records when the entry was created or modified, you can use that field to find recently added material:
    -- change output format to aligned column
    sqlite> .mode columns

    -- find the most recent Einstein entries
    sqlite> select label, bibtimestamp, substr(title, 1, 40) from bibtab
            where (filename = 'einstein.bib')
              and (bibtimestamp > '2013.01.01 00:00:00 AAA')
            order by bibtimestamp;
    Lanouette:1994:AS  2013.01.11 06:50:11 ???  Atomic Spies
    Walker:1997:PUD    2013.01.11 09:28:10 MST  Prompt and utter des
    Buchwald:2001:HEB  2013.01.11 12:08:33 MST  Histories of the Ele
    Walker:2004:TMI    2013.01.11 12:17:42 ???  Three Mile Island: N
    Thackray:1977:BRB  2013.01.12 11:56:22 MST  Book Review: booktit
    ...
The paper at the bibsql Web site gives numerous examples of how the data can be mined in many more ways that are simply infeasible without the added structure of SQL fields.