The BibNet Project archive

Last top-level updates: Sat Jan 19 08:02:11 2013; Mon Jan 21 10:34:09 2013; Thu Feb 7 08:37:12 2013; Wed Feb 20 08:43:32 2013; Tue Mar 19 12:10:54 2013; Thu Jul 11 14:17:47 2013; Mon Jul 22 07:51:01 2013; Wed Aug 28 11:48:09 2013; Sat Sep 7 12:37:58 2013; Fri Nov 29 09:36:34 2013; Wed Jan 29 09:24:58 2014; Fri May 30 11:30:49 2014; Wed Nov 5 10:33:30 2014 [but the bibliography archive files are updated often!]

This is an archive of freely-distributable bibliographic data in BibTeX format; see the sections below for more information on its contents, and how to mirror, and use, the data.

If you are unfamiliar with BibTeX, or bibliographic markup systems, and would like to learn more, visit this tutorial. It discusses many of the issues that are important for bibliographic work, and describes numerous software tools that can make such work easier and more productive.

You can jump from here directly to the BibNet Project download or mirror sections.

Project history

The BibNet Project was created in August 1994 by Stefano Foresti (then at the University of Utah, now at the University of California, Merced), Eric Grosse (then at Bell Labs, now at Google), and Nelson H. F. Beebe (still at the University of Utah) with the intent of collecting, and making freely available on the Web, accurate and clean publication bibliographies in BibTeX format of well-known scientists working in numerical analysis.

The third co-founder (NHFB) is the author of numerous software tools for converting Web HTML pages and other data to BibTeX format, and for joining, merging, ordering, prettyprinting, searching, sorting, and validating BibTeX data; some of those tools are described in the papers A Bibliographer's Toolbox and Bibliography prettyprinting and syntax checking.

Many of those software tools are available at these sites: http://www.math.utah.edu/pub/emacs and http://www.math.utah.edu/~beebe/software/

Project content highlights

Note: In the tables in the rest of this document, you can replace .html with .bib in any bibliography link to get the original BibTeX file. With similar changes, you can get DVI files (.dvi), LaTeX wrappers (.ltx), PDF files (.pdf), compressed PostScript files (.ps.gz or .ps.xz), spelling dictionaries (.sok), and titleword cross-reference files (.twx).
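As a small illustration of that extension swap, the following shell sketch derives companion-file names from an .html link. The function name bib_variant and the example URL are assumptions made for illustration; they are not part of the archive's own tooling.

```shell
# Derive a companion-file URL from a bibliography's .html URL by
# swapping the extension. The function name is hypothetical.
#   $1 = URL ending in .html
#   $2 = new extension, without the leading dot (bib, pdf, twx, ...)
bib_variant() {
    printf '%s.%s\n' "${1%.html}" "$2"
}

# Example (the URL is an assumed location, shown only for illustration):
bib_variant http://www.math.utah.edu/pub/bibnet/authors/e/einstein.html bib
bib_variant http://www.math.utah.edu/pub/bibnet/authors/e/einstein.html ps.gz
```

The result can be handed to any download tool, such as wget or curl.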

Since its founding, the archive has been expanded to include a few subject-specific bibliographies, and extensive bibliographies of important scientists in other fields of computing, as well as numerical analysis.

Numerical analysts
Marsha Berger, Achi Brandt, William Cody,
Richard Crandall, Jack Dongarra, Ian Duff,
George Forsythe, David Gay, Gene Golub,
Nick Higham, William Kahan, Baker Kearfott,
David Kincaid, Cornelius Lanczos, Benoît Mandelbrot,
George Marsaglia, Nick Metropolis, Cleve Moler,
Ray Moore, Beresford Parlett, John R. Rice,
Axel Ruhe, Yousef Saad, Claude Shannon,
Frank Stenger, G. W. `Pete' Stewart, Nick Trefethen,
John Tukey, Alan Turing, Henk Vandervorst,
Jim Wilkinson, Henry Wolkowicz, David Young

The archive also covers pioneers in quantum physics and quantum chemistry, several of them winners of the Nobel Prize in Chemistry or Physics, or the US National Medal of Science:

Quantum scientists
Hans Bethe, Niels Bohr, Max Born,
Paul Dirac, Freeman Dyson, Albert Einstein,
Enrico Fermi, Richard Feynman, George Gamow,
Frank E. Harris, Werner Heisenberg, Per-Olov Löwdin,
Ettore Majorana, Norman March, Robert Oppenheimer,
Wolfgang Pauli, Erwin Schrödinger, John Slater,
Leo Szilard, Edward Teller, Stan Ulam,
John von Neumann, Eugene Wigner

There are bibliographies for dozens of other authors in the BibNet Project archive; consult the download section to find them.

Subject-specific bibliographies include at least these:

Related bibliographies

The much larger TeX Users Group bibliography archive contains bibliographies for more than 500 journals, and about 90 subject-specific bibliographies. Of those, several could easily have been incorporated as part of the BibNet Project, but have not been. Here is a partial list of them, divided into several categories:

Journals on the history of science
Annals of Science
Annals of the History of Computing
Archive for History of Exact Sciences
Berichte zur Wissenschaftsgeschichte [Reports on Science History]
British Journal for the History of Science (2010–2019) (also 1962–1989, 1990–1999, and 2000–2009)
British Journal for the Philosophy of Science
Centaurus: an International Journal of the History of Science and its Cultural Aspects
Chymia: Annual Studies in the History of Chemistry
Foundations of Physics
Historical Studies in the Natural Sciences
Historical Studies in the Physical Sciences
Historical Studies in the Physical and Biological Sciences
IEEE Annals of the History of Computing
Isis (2010–2019) (also 1910–1919, 1920–1929, 1930–1939, 1940–1949, 1950–1959, 1960–1969, 1970–1979, 1980–1989, 1990–1999, and 2000–2009)
Nuncius
Perspectives on Science
Philosophy of Science
Physics in Perspective (PIP)
Science in Context
Studies in History and Philosophy of Biological and Biomedical Sciences
Studies in History and Philosophy of Science Part A
Studies in History and Philosophy of Modern Physics [Part B]
Studies in History and Philosophy of Science Part C

General Topics
ACM Turing Awards; Ada Lovelace & Charles Babbage
Benford, Heaps, & Zipf's Laws; Classic Shell Scripting
Data compression; Elementary and special functions
Electronic publishing; Fibonacci
Fonts in typography; Floating-point arithmetic
Matrix Computations; Hash algorithms
Interior point methods; Intel IA-64 architecture
LAPACK Working Notes; lcc C compiler
Literate programming; Lecture Notes in Computer Science
Lecture Notes in Computational Science and Engineering; Microprocessors
Multithreading; Numerical analysis (1990--1999)
Numerical analysis (2000--2009); Numerical analysis (2010--2019)
O'Reilly computing books; Google PageRank Algorithm
Pi (π) computations; Pseudorandom numbers
Spelling error detection and correction; Supercomputing
TeX used for books; TeX used for journals
TeX with graphics; Tree-drawing algorithms
Typography; Typesetting
Unicode; Utah mathematics books
Virtual machines; Visual Instruction Sets

Internet
Internet and networking: 1969--1999; Internet and networking: 2000--2009
Internet and networking: 2010--2019; Internet FYI documents
Internet Engineering Notes; Internet RFC documents
Internet STD Documents

Markup, programming, scripting, and symbolic algebra languages and systems
Ada; Axiom & Scratchpad
BMDP statistics software; Common Lisp
C-sharp (C#); Data Explorer
Fortran (1956--1980); Fortran (1981--1989)
Fortran (1990--date); High-Performance Fortran
Icon; Java (1995--1999)
Java (2000--2009); Java (2010--2019)
MACSYMA, Maxima, and VAXIMA; Maple
Mathematica; MuPAD
PostScript & Portable Document Format (PDF); PVM and MPI
Python; R, S, and S-Plus
Reduce; Reduce (more)
REXX and NetReXX; SAS (Statistical Analysis System)
SGML, HTML, and XML (1981--1999); SGML, HTML, and XML (2000--2009)
SGML, HTML, and XML (2010--2019); SPSS
SQL (Structured Query Language); TeX

Operating and database systems
Mach operating system; GNU (Gnu is Not Unix) system
GNU/Linux operating system; MINIX operating system
Oracle database system; Plan 9 distributed operating system
UNIX

Standards
ANSI Standards; ECMA Standards
IEC Standards; ISO Standards for programming languages
Software standards

Two other large BibTeX-format bibliography archives of note are the Karlsruhe Collection of Computer Science Bibliographies (3 million entries in early 2013), and the Universität Trier DBLP Computer Science Bibliography (2.1 million entries in early 2013).

The Karlsruhe archive mirrors the Utah archives, possibly with some rearrangement into subject-specific directories.

Archive file types

Each BibTeX bibliography has the standard file extension .bib. It is accompanied by a LaTeX file with extension .ltx that is used to typeset all of the entries in the BibTeX file to demonstrate that they are free of TeX-markup errors, and show how they might appear in a reference list in one particular bibliography style. Those two files are the only ones created by humans. The remaining files for each bibliography are created by software, and are automatically updated as new versions of the bibliography are released on the BibNet Project Web site. Their file extensions are:

In a Web browser, the .bib and .html files should be visually identical, allowing cut-and-paste operations from either, but the HTML file is enriched with hypertext links that in many cases lead to online documents. The BibTeX file is the critical file, and is needed if you wish to incorporate multiple references from a given bibliography file in your document.

Archive organization

The BibNet Project archive is divided into two main directory trees: authors and subjects. The first is further subdivided into subdirectories named by the first letter of the files they contain. The second is a flat directory, with no subdirectories.

Author-specific files are named with the family name first, in lowercase letters: dirac-p-a-m, shannon-claude-elwood, von-neumann-john, and so on. Albert Einstein's bibliography is just called einstein.
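Taken together, the directory layout and the naming rule suggest how to predict where an author's file lives. The sketch below assumes paths relative to the archive root, with each author file stored directly inside its first-letter subdirectory; the function name author_bib_path is hypothetical.

```shell
# Map an author file's base name (family name first, lowercase) to its
# assumed path below the archive root. Portable POSIX shell.
author_bib_path() {
    name=$1
    rest=${name#?}            # the name without its first character
    initial=${name%"$rest"}   # the first character alone
    printf 'authors/%s/%s.bib\n' "$initial" "$name"
}

author_bib_path einstein
author_bib_path von-neumann-john
```

Note that von-neumann-john is filed under v, because the subdirectory is chosen from the first letter of the file name rather than from the capitalized part of the family name.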

Downloading archive files

You can find top-level indexes of BibNet Project archive files in authors and subjects, and initial-letter indexes for authors in a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z.

Mirroring the archive

If you are willing, and have adequate disk space (about 250MB), we strongly urge you to consider mirroring the project archive from its home site to your site, either for local use only, or for public access at your Web site. Librarians have a good acronym for that practice: LOCKSS (Lots of Copies Keeps Stuff Safe).

If you succeed in creating a stable up-to-date mirror that you believe will be able to exist for a long time, please send e-mail to the maintainers with a request for it to be added to a list of BibNet Project mirrors.

One brute-force way to pull the entire archive to your system is a recursive retrieval with either of two popular Unix utilities:

% ncftpget -R ftp://ftp.math.utah.edu/pub/bibnet/

% wget --recursive ftp://ftp.math.utah.edu/pub/bibnet/

A better way is to exploit the fact that the master host FTP server can return entire directory trees in any of several archive formats:

% wget ftp://ftp.math.utah.edu/pub/bibnet.jar

% curl -o bibnet.tar.gz ftp://ftp.math.utah.edu/pub/bibnet.tar.gz

% ncftpget ftp://ftp.math.utah.edu/pub/bibnet.tar

% wget ftp://ftp.math.utah.edu/pub/bibnet.tar.bz2

% wget ftp://ftp.math.utah.edu/pub/bibnet.zip

% wget ftp://ftp.math.utah.edu/pub/bibnet.zoo

You can use those same URLs in most Web browsers, and then unpack the just-downloaded archive file in a suitable location. The unpacking normally preserves file protections and file timestamps.

The preferred way, however, is to use the rsync utility, which uses a clever algorithm on both sides of the connection to transfer only the changes between files, dramatically reducing transfer times when the two archives have similar contents.

# Find out what collections are available to rsync:
% rsync rsync://ftp.math.utah.edu/
CTAN            all of ftp://ctan.tug.org/ (huge)
bib             TeX User Group bibliography archive (large)
bibnet          BibNet Project bibliography archive
texlive         all of ftp://tug.org/texlive/ (huge)

# Fetch one of them (the -a option preserves important timestamp
# information, and the -z option turns on compression to reduce
# network traffic; add the -v option for verbose output):
% rsync -a -z rsync://ftp.math.utah.edu/bibnet .

# See how long a subsequent update might take
% time rsync -a -z rsync://ftp.math.utah.edu/bibnet .
0.004u 0.013s 0:00.34 2.9%      0+0k 0+0io 0pf+0w

rsync can be used both to populate an initial copy of a mirror and to keep it up to date afterwards.

The rsync utility should now be standard in most Unix distributions, but if your machine does not have it, you can find it at http://rsync.samba.org/. There is a separate project that wraps the command-line version in a graphical user interface for common Unix, Mac OS X, and Microsoft Windows systems: http://www.opbyte.it/grsync/. Prebuilt versions of grsync are installable from some Unix package distributions. The grsync program remembers your settings, so once you have used it to configure and run a mirror update, you can run it manually from time to time and get updates with a single click.

Once you have a copy of the archive on your system, use a regularly scheduled cron job to keep your copy up to date. We recommend at least weekly updates if your copy is for local use only, and nightly updates (our time: GMT/UTC - 7 hours) if your copy is a mirror on a public Web site.
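One way to set that up is to let a small shell function generate the crontab line, so that the schedule and destination are easy to review before it is installed. In this sketch the function names, the 03:15 run time, and the /srv/mirrors/bibnet destination are all assumptions; only the rsync options and source URL come from the examples above.

```shell
# Print the rsync command that refreshes a mirror, with destination
# directory $1 (hypothetical helper).
bibnet_update_command() {
    printf 'rsync -a -z rsync://ftp.math.utah.edu/bibnet %s\n' "$1"
}

# Print a crontab line that runs the update every day at minute $1 of
# hour $2, with destination directory $3 (hypothetical helper).
bibnet_crontab_line() {
    printf '%s %s * * * %s\n' "$1" "$2" "$(bibnet_update_command "$3")"
}

# Example: a nightly run at 03:15 (an arbitrary choice).
bibnet_crontab_line 15 3 /srv/mirrors/bibnet
```

The printed line could then be added with crontab -e. Cron interprets the time fields in the local time zone of the machine that runs the job, so adjust the hour if you want the update to happen during the night at the master site.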

Searching archive files

There are several ways to search the archive files, apart from Web search engines whose own copies of the data are likely to be several weeks out of date. If you know which bibliography has the entry you want, then just visit the file in your favorite text editor and use its search commands.

The Unix grep command-line utility family is one common approach to search in multiple files:

% grep -B 4 '^ *title *= .*Einstein.*Berlin' *.bib

Its limitation is that it is line based, and search strings must match a single line.

The bibsearch utility provides a much faster way, and it eliminates the line-boundary constraint because each BibTeX entry is treated as a single block of text:

% bibsearch
> title & einstein & berlin & 2003
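Where bibsearch is not installed, the entry-as-a-block idea can be approximated with the paragraph mode of the standard awk utility. The sketch below assumes that entries are separated by blank lines and contain none internally, which may not hold for every file. The name search_entries is hypothetical, and the matching is a plain case-insensitive substring test, not a reimplementation of bibsearch.

```shell
# Print every blank-line-separated block in the named files that
# contains both search words, in any order and on any line of the block.
#   $1, $2 = words to look for; remaining arguments = files to search
search_entries() {
    word1=$1
    word2=$2
    shift 2
    awk -v a="$word1" -v b="$word2" '
        # An empty RS selects paragraph mode: one record per block.
        BEGIN { RS = ""; a = tolower(a); b = tolower(b) }
        {
            text = tolower($0)
            if (index(text, a) > 0 && index(text, b) > 0)
                print $0 "\n"
        }
    ' "$@"
}

# Example use (run in a directory of .bib files):
#   search_entries einstein berlin *.bib
```

Unlike the grep example, this finds a title that is wrapped across two lines, but it searches the whole entry rather than a single field.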

A more powerful way to search is first to convert the data to SQL (Structured Query Language) with bibtosql, and then to use the bibsql front end, or the sqlite3 program directly, and enter SQL commands for selective searching and display of specified fields, or even entire BibTeX entries:

# create the SQLite3 database (once only)
% bibtosql --create -database sqlite *.bib | sqlite3 bibnet.db

# search the SQLite3 database
% sqlite3 bibnet.db

-- how many BibTeX entries are in the database?

sqlite> select count(*) from bibtab;
37554

-- which entries are about Einstein's years in Berlin?

sqlite> select filename, label from bibtab
        where (title like '%Einstein%Berlin%')
        order by filename, year, label;
bohr-niels.bib|Hendry:1986:BRJ
bohr-niels.bib|Hendry:1986:BRW
...
einstein.bib|Treder:1966:ESE
einstein.bib|Kirsten:1979:AEB
einstein.bib|Nelkowski:1979:ESB
...
einstein.bib|vanDongen:2012:MIM
...

-- get the most recent entry about Einstein in Berlin

sqlite> select entry from bibtab
     where (label = 'vanDongen:2012:MIM');

@Article{vanDongen:2012:MIM,
  author =       "Jeroen van Dongen",
  title =        "Mistaken Identity and Mirror Images: {Albert and Carl
                 Einstein}, {Leiden} and {Berlin}, {Relativity} and
                 Revolution",
...
}

The sqlite3 program is public-domain software. It is extremely portable, and its database files do not depend on the host operating system or the host CPU's memory byte order; once created, those files can be copied and used everywhere. Prebuilt versions are available for common desktop platforms, and even for some mobile telephones!

Because most BibTeX entries in the archives carry a time stamp field that records when the entry was created or modified, you can use that field to find recently-added material:


-- change output format to aligned column

sqlite> .mode columns

-- find the most recent Einstein entries

sqlite> select label, bibtimestamp, substr(title, 1, 40) from bibtab
        where (filename = 'einstein.bib')
          and (bibtimestamp > '2013.01.01 00:00:00 AAA')
        order by bibtimestamp;
Lanouette:1994:AS  2013.01.11 06:50:11 ???  Atomic Spies
Walker:1997:PUD    2013.01.11 09:28:10 MST  Prompt and utter des
Buchwald:2001:HEB  2013.01.11 12:08:33 MST  Histories of the Ele
Walker:2004:TMI    2013.01.11 12:17:42 ???  Three Mile Island: N
Thackray:1977:BRB  2013.01.12 11:56:22 MST  Book Review: booktit
...
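For repeated use, a query like the one above can be generated by a shell function and piped to sqlite3. The sketch below reuses the table and column names from the examples; the function name recent_entries_sql is hypothetical, and its arguments are inserted into the SQL text without escaping, so it is suitable only for trusted input. The comparison is a plain string comparison, which works here because the timestamps shown are written most-significant field first with zero padding.

```shell
# Print an SQL query listing entries from bibliography file $1 whose
# bibtimestamp sorts after the string $2 (hypothetical helper).
recent_entries_sql() {
    printf "select label, bibtimestamp, substr(title, 1, 40) from bibtab where (filename = '%s') and (bibtimestamp > '%s') order by bibtimestamp;\n" "$1" "$2"
}

# Example use, assuming the bibnet.db database created earlier:
#   recent_entries_sql einstein.bib '2013.01.01 00:00:00 AAA' | sqlite3 bibnet.db
```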

The paper at the bibsql Web site gives numerous examples of how the data can be mined in many more ways that are simply infeasible without the added structure of SQL fields.