**Today's Topics:**

- Changed address for Sven-Ake Gustafson
- New Address for Alfio Quarteroni
- New Address for Angelika Bunse-Gerstner
- Student Support for Parallel Circus
- Looking for a Research Position
- Journal Refereeing is Breaking Down
- 3-level block-Toeplitz solver
- 6th ACM Conference on Supercomputing
- FTPing Technical Reports
- Matrix Calculator with C Syntax
- Errata to "Numerical Recipes" by Press et all
- Special Issue of "Computing" for H.-J. Wacker
- Positions at the University of Minnesota.
- Householder Fellowship at Oak Ridge
- On the Odor of IEEE Arithmetic

-------------------------------------------------------

From: Sven-Ake Gustafson <gustafson@ifi.hsr.no>

Date: Tue, 24 Sep 1991 12:02:00 +0200

**Subject: Changed address for Sven-Ake Gustafson**

Changed address for Sven-Ake Gustafson:

I have returned from my sabbatical leave and from now

on my address will be:

HSR

Box 2557 Ullandhaug

N-4004 Stavanger

Norway

Phone: (office) +47 4874470

(home) +47 4890083

Fax: +47 4874236

email: gustafson@ifi.hsr.no

------------------------------

From: Alfio Quarteroni <ALFQUA@IPMMA1.POLIMI.IT>

Date: Thu, 26 Sep 1991 15:56 MET

**Subject: New Address for Alfio Quarteroni**

Effective September 1st, I am back at the Politecnico of Milan,

at the following address:

Alfio Quarteroni

Dipartimento di Matematica

Politecnico di Milano

Via Bonardi 9

20133 Milano - Italy

Phone: +39-2-2399-4525

Fax: +39-2-2399-4568

e-mail: alfqua%ipmma1.polimi.it@imiclvx.bitnet

------------------------------

From: Angelika Bunse-Gerstner <abg@informatik.uni-Bremen.de>

Date: Fri, 27 Sep 91 14:48:14 N

**Subject: New Address for Angelika Bunse-Gerstner **

I moved from the university of Bielefeld to the university of Bremen, where

I took a new position at the department of mathematics and computer science.

My new address, phone and e-mail are as follows:

Address:

Angelika Bunse-Gerstner

Fachbereich Mathematik und Informatik

Universitaet Bremen

Postfach 33 04 40

W-2800 Bremen

FRG

Phone: 049-421-218-4205

email: abg@informatik.uni-bremen.de

------------------------------

From: Gene Golub <golub@Cholesky.Stanford.EDU>

Date: Fri, 27 Sep 91 7:57:32 PDT

**Subject: Student Support for Parallel Circus**

We have recently been awarded a modest grant by the NSF for supporting

the travel and expenses of students attending the Parallel Circus

meetings during the next two years. The next Circus will take place in

Oak Ridge on Oct 25, 26. Those students requesting support should send

a one page proposal to Gene Golub (na.golub@na-net.ornl.gov) giving

the reasons for attending the meeting, and a budget for expenses. The

student(s) should indicate their research interests and plans.

A letter verifying that the student is in good standing should be

sent independently by a faculty adviser. This letter should give the

student's GPA. We will be pleased to consider joint proposals which

would include the expenses of several students.

The deadline is 5pm (PST) on Oct 11 and we will try to get a response

by Oct 15. Correspondence by e-mail is desirable.

Apostolos Gerasoulis Gene Golub Ahmed Sameh

Gene Golub,

Computer Science Dept.

Office: Margaret Jacks Hall, Room 306

Office Phone: 415/723-3124 Home Phone: 415/323-0105

FAX number: (415) 725-7411

------------------------------

From: Vera Talanker <TALANKER@ALCVAX.PFC.MIT.EDU>

Date: Mon, 23 Sep 91 11:17 EST

**Subject: Looking for a Research Position**

This spring I will be completing my MS degree in Applied Mathematics

at Colorado School of Mines. I am looking for a 1-2 year research position

in the field of medicine (image processing, modeling drug delivery, etc.)

in a European country. Can anyone give me some suggestions where to

start looking?

Thanks,

Vera Talanker: e-mail: on internet talanker@alcvax.pfc.mit.edu

------------------------------

From: Stephen Vavasis <vavasis@cs.cornell.edu>

Date: Mon, 23 Sep 91 16:08:59

**Subject: Journal Refereeing is Breaking Down**

It's my impression that the refereeing process in the numerical

community is breaking down. Papers are taking longer and longer to

get refereed. The reason seems to be that people are writing more

papers while agreeing to referee fewer. I reach this conclusion from

my own experience of lengthy publication delays, and also from talking

to some editors of SIAM journals.

I would like to propose a stringent solution to the problem. Because

many people in the numerical community read and publish in SIAM

journals, I would propose that SIAM adopt the following policy. SIAM

would maintain a list of people who carry their share in terms of

refereeing, and of people who repeatedly decline to referee SIAM

papers. Anyone in the latter category would not be allowed to publish

in SIAM journals.

I realize that the problem I have identified is not specific to

numerical analysis but is widespread throughout computer science and

probably other disciplines. We in NA, however, are fortunate to

have a single organization (SIAM) that could actually take action

to combat the problem.

I would welcome comments on this proposal, sent to me by email or

posted to NA digest. I have not yet discussed my proposal with people

in the SIAM publication department.

-- Steve Vavasis

vavasis@cs.cornell.edu

------------------------------

From: Khalil Kalbasi <TIMBER@kuhub.cc.ukans.edu>

Date: Thu, 19 Sep 1991 11:14 CDT

**Subject: 3-level block-Toeplitz solver**

I like to know if anyone knows of a 3-level block Toeplitz solver

with complex general blocks. I am doing an array antenna problem which

eventually needs to solve a linear complex system Ax=b. The matrix A is

very structured, i.e. it is a 3-level block Toeplitz with a complex general

blocks. The Netlib has 3-level block-circulant routines which is not

any use to me. The structure of the matrix A (a 6x6 block example) is shown

below, where each M# is a general complex block. As shown, each 2x2 block is

a general Toeplitz and the ovearl 6x6 blocks are also general Toeplitz.

| M1 M2 M4 M5 M7 M8 |

| M3 M1 M6 M4 M9 M7 |

| |

| M10 M11 M1 M2 M4 M5 |

| M12 M10 M3 M1 M6 M4 |

| |

| M13 M14 M10 M11 M1 M2 |

| M15 M13 M12 M10 M3 M1 |

I appreciate your help.

Khalil Kalbasi

Department of Electrical and Computer Engineering

The University of Kansas, Lawrence, KS

(913)864-7739

------------------------------

From: Stratis Gallopoulos <stratis@csrd.uiuc.edu>

Date: Mon, 23 Sep 91 12:28:59 CDT

**Subject: 6th ACM Conference on Supercomputing**

CALL FOR PAPERS

6th ACM International Conference on Supercomputing

Sponsored by ACM SIGARCH

July 19-23, 1992

Hyatt Crystal City Hotel, Washington D.C

General Chair

Ken Kennedy, Center for Research on Parallel Comp., Rice University

Program Chair

C. D. Polychronopoulos, CSRD, University of Illinois

The 6th ACM International Conference on Supercomputing

is soliciting papers on significant new research results

and experience in the development and use of supercomputing systems.

Topics of interest include, but are not limited to

parallel and high-performance computer architectures,

parallelizing compilers and programming environments,

operating systems and performance evaluation,

large-scale applications and algorithms,

technologies, new experimental and commercial systems.

Program Committee

F. Allen (IBM, T.J. Watson), R. Allen (Kubota Pacific),

Arvind (MIT), J. L. Baer (U. Washington), D. Bailey (NASA Ames),

J. C. Browne (U. Texas-Austin), R. Cytron (Washington U.),

D. DeGroot (Texas Inst.), R. Esser (KFA), E. Gallopoulos (CSRD),

J. R. Gurd (U. Manchester), F. Hossfield (KFA), E. Houstis (Purdue U.),

W. Jalby (INRIA-IRISA), Y. Jegou (INRIA-IRISA), C. Koelbel (Rice U.),

J. McGraw (LLNL), Y. Muraoka (Waseda U.), A. Nicolau (UC Irvine),

T. Papatheodorou (U. Patras), J. Riganati (SRC), A. J. Smith (UC Berkeley),

H. Terada (Osaka U.), A. Veidenbaum (CSRD), S. Wallach (Convex).

Local Arrangements Chair: Duncan Buell, SRC

Finance Chair: A. Veidenbaum, CSRD

Publicity Chair: E. Gallopoulos. CSRD

Paper Submissions

Authors should submit five copies of the full manuscript

to the program chairman or the appropriate regional chairman

as follows:

NORTH & SOUTH AMERICA EUROPE & AFRICA JAPAN & FAR EAST

Constantine Polychronopoulos Harry Wijshoff Toshitugu Yuba

Program Chairman Region Chairman Region Chairman

CSRD Dept. Computer Sci. Electrotechnical Lab.

University of Illinois Utrecht University 1-1-4 Umezono, Tsukuba

104 S. Wright St. Padualaan 14 Ibaraki 305

Urbana, Illinois 61801, USA 3584 CH Utrecht Japan

(cdp@csrd.uiuc.edu) The Netherlands (yuba@etl.go.jp)

(harryw@cs.ruu.nl)

DEADLINE for Submissions is December 10, 1991.

Authors will be notified about the outcome of

the paper selection by March 7, 1992.

Proposals for Tutorials

The conference also solicits proposals for half and full-day

tutorials which will be offered before and after the conference.

Proposals for tutorials on topics of wide interest to the

supercomputing and parallel processing community should be

submitted by December 10, 1991 to:

George Paul

Tutorials Chairman

IBM T.J. Watson Research Center

P.O. Box 704

Yorktown Heights, NY 10598, USA

(gp1@watson.ibm.com)

Tutorial topics of particular interest include, but are not

limited to large-scale applications, numeric and symbolic

computing, applications of supercomputing to engineering,

physical and biological sciences, programming environments

and visualization tools, architectures, compilers and languages

and operating systems.

Conference Announcement

The conference announcement is available in ASCII and

PostScript via anonymous ftp from sp2.csrd.uiuc.edu

(128.174.162.51) in pub/ics-an and pub/ics-an.ps.

For more information contact C. D. Polychronopoulos,

Program Chairman, CSRD, University of Illinois,

104 S. Wright St., Urbana, Illinois 61801-2932, USA.

Tel. (217) 333-6773. FAX (217) 244-1351. E-mail: cdp@csrd.uiuc.edu.

------------------------------

From: Pete Stewart <stewart@cs.UMD.EDU>

Date: Mon, 23 Sep 91 14:40:31 -0400

**Subject: FTPing Technical Reports**

As regular readers of this digest know, I have made my current

technical reports available by anonymous ftp. The response has been

encouraging, and several people have asked how to make their own

reports available. Hence this little note.

First some background. Ftp--short for file transfer program--is an

internet interface for moving files over the network. The user can

send or get ASCII or binary files. A particularly useful command is

"mget", which expands names (e.g., paper.*) and gets all the

resulting files. Details may be found in the on-line manual.

Ordinarily one must have a valid account on a machine in order to ftp

to it. This causes difficulties for people who want to make

programs, reports, and the like available to the general public. The

fix is the anonymous ftp account, which gives people restricted

access to files on a machine. Here for an example, is a little run

in which I listed some directories on my own machine.

thales: ftp thales.cs.umd.edu

Connected to thales.cs.umd.edu.

220 thales.cs.umd.edu FTP server (SunOS 4.1) ready.

Name (thales.cs.umd.edu:stewart): anonymous

331 Guest login ok, send ident as password.

Password:

230 Guest login ok, access restrictions apply.

ftp> cd pub

250 CWD command successful.

ftp> ls

200 PORT command successful.

150 ASCII data connection for /bin/ls (128.8.128.81,1315) (0 bytes).

jeep

matrixpert

references

reports

226 ASCII Transfer complete.

39 bytes received in 1.1 seconds (0.035 Kbytes/s)

ftp> bye

221 Goodbye.

thales:

The procedure is to identify yourself as "anonymous". You can give

any password, but it is usual to give your name or email address, so

the account manager can tell who is transfering files. By convention

pub is the root directory for public files.

The first step in creating a repository of technical reports is to get

your systems staff to set up an anonymous ftp account. For people who

are concerned about security, note that anonymous ftp is quite safe.

The user cannot get out of the ftp directory, and to my knowledge the

program has never been broken. In any event, our own security

conscious staff has no hesitation about setting up such accounts.

Once you have the account, you will probably want to create a special

directory to hold the repository (tech reports are not the only thing

you are going to want to distribute: see the above listing of my pub

directory). At this point you must decide what form you want to

distribute the reports in. For those who use TeX, there are three

options--tex files, dvi files, or postscript files. Unless you stick

to vanilla TeX or LaTeX, tex files are not the way to go, since you

will also have to distribute any auxiliary files containing your

customized macros. I went with dvi files because many places do not

have postscript printers, while most places have a utility (called

something like dvi2ps) that converts dvi files to postscript. One

important drawback is that postscript graphics must be transfered in

separate files; however, you can use the mget feature of ftp to ease

some of the pain.

The next thing is to decide on what auxiliary information you want to

supply your customers. I have organized my repository around three

files: README, Contents, and Abstracts. The README file contains

basic information about the repository. The Contents file contains

the following information for each paper

Authors

Title

Source

Disposition

Files

Here is a typical entry.

G. W. Stewart

"An Updating Algorithm for Subspace Tracking"

UMIACS-TR-90-86, CS-TR 2494, July 1990, Revised January 1991

To appear in IEEE Transactions on Acoustics, Speech and Signal

Processing

uast.dvi, uast.f (fortran program)

The Abstract file contains abstracts in addition to the above

information. Setting up these files for the first time is the most

time-consuming part of the process. They are very easy to update as

you add reports to your repository.

The last problem is to select names for your files. I construct mine

from the first letters of the significant words in the title. The

results are admitedly ugly, but it solves the problem of repeated

names. In fact, two or three letters followed by an asterisk is

usually sufficient identification for an mget. An alternative is to

use more conventional names with a date to insure uniqueness.

That's all there is to it. If you try it, please let me know.

When there is a critical mass, I will undertake to edit a weekly

entry in the na-digest announcing newly available reports--a sort

of technet digest. It would be a fitting birthday present for Gene

Golub, who has done so much to unite us electronically.

Pete Stewart

stewart@cs.umd.edu

na.pstewart@na-net.ornl.gov

------------------------------

From: Rick Perry <perry@ucis.vill.edu>

Date: 24 Sep 91 09:51:00 EDT

**Subject: Matrix Calculator with C Syntax**

I have been working on the development of a matrix calculator program,

similar to Matlab, but with the syntax of the C programming language.

This project started as a way to combine some of my C code for matrix

computations into a coherent package, but has evolved into a fairly complete

interactive C compiler, with matrices as the fundamental data type

(e.g. matrix A = [1 2; 3 4]; ++A;) and some operators from APL ( +/ */ ).

I am interested in hearing about similar development efforts or existing

products, and comments about the usefulness of such a package. I am aware

of many packages, e.g. Matlab, Matrix, S, SMP, etc. but I've never heard

of any which use C syntax. My program so far runs on Sun/Unix, VAX/VMS,

and IBM PC's, and I'll gladly make it available for anyone who's interested.

...Rick perry@vu-vlsi.vill.edu, perry@ucis.vill.edu

Dr. Rick Perry, Department of Electrical Engineering

Villanova University, Villanova, PA 19085, 215-645-4969, -4970, hm: 259-8734

------------------------------

From: Przemek Klosowski <przemek@rrdstrad.nist.gov>

Date: 17 Sep 91 18:21:32 GMT

**Subject: Errata to "Numerical Recipes" by Press et all**

Hello!

The merits (or demerits) of Numerical Recipes were discussed on the

net in a series of articles, but my feeling was that there was no hard

data on the specific problems and adopted solutions. In particular I

remember someone from JPL complaining that the "robust fit" procedure

is not robust at all, but without giving the details. I think it might

be of use to catalog specific problems with NR's algorithms (for the

sake of focus I propose that we do the C version first), so that

others might avoid the traps that possibly exist.

I would like to start such collection by posting several items, providing

varying detail, and asking people to add their own remarks/fixes. I ask

everyone to send me an e-mail copy of their followup, so that I don't miss

it (I normally read the newsgroups, but if the backlog is too large, I tend

to catch up). In the future, I will edit the contributions and make them

available to the public.

Specific problem areas that I know of are:

1) In the discussion of Levenberg-Marquard algorithm (ch. 14, p. 544)

for nonlinear minimization (mrqmin), the book gives a condition for

the end of iterations that is not satisfied if the starting point is

near the true minimum.

2) error in medfit---could the JPL person that reported it originally

fill in the details?

3) The authors "cathegorically recommend that you {\it only} use FFTs

with N a power of two". I understand that the latest wisdom goes

against this---comments/code anyone?

I have heard rumors that some of the FFT routines are broken.

4) the variable metric BFGS method (dfpmin) uses the traditional approach

of calculating and using the Hessian. This (as mentioned in the text)

is prone to roundoff errors, and that instead Gill/Murray method of

calculating H^{-1} is much preferred (more than it is implied from the

text). Anyone willing to comment on that? Anyone has a G/M version of

dfpmin?

Przemek Klosowski (przemek@ndcvx.cc.nd.edu)

Physics Department

University of Notre Dame IN 46556

------------------------------

From: Bob Mattheij <wstanw10@urc.tue.nl>

Date: Wed, 25 Sep 91 13:27:58 +0200

**Subject: Special Issue of "Computing" for H.-J. Wacker**

The journal Computing (Archives for Informatics and Numerical Computation)

will devote a special issue for papers dedicated to the unfortunate premature

death of its Editor H.-J. Wacker. As Professor Wacker was particularly active

in the promotion of the use of mathematics in industry, this special issue

should reflect these intentions and papers from the area of "Industrial Mathe-

matics" will be particularly welcome for this issue even if they would consti-

tute borderline cases with respect to the ordinary scope of Computing.

Manuscripts for this issue should be submitted (in duplicate) as soon as

possible to either

Prof. Robert Mattheij Prof. Hans J. Stetter

Technical University or Technical University (115.2)

Postbus 513 Wiedner Hauptstrasse 8-10

5600 MB Eindhoven A-1040 Vienna

The Netherlands Austria

The absolute deadline for the submission of manuscripts for this special issue

is February 15 1992.

Manuscripts must conform with the "Instructions to Authors" reprinted in each

issue of Computing. They will be subject to refereeing and the scientific

standards of the journal. Note that Computing has a category of "Short Communi-

cations" (besides "Original Papers"); manuscripts for the special issue may

also be submitted in this category.

Please, give me your reactions to the above information as soon as pos-

sible. I will be away once more between 22 Sept. and 5 Oct.; thus it would be

good to have a reply by 20 Sept.

With my best regards and my appreciation of your cooperation,

Sincerely yours,

Hans J. Stetter

------------------------------

From: Youcef Saad <saad@cs.umn.edu>

Date: Thu, 26 Sep 91 10:35:24 CDT

**Subject: Positions at the University of Minnesota.**

The Department of Computer Science at the University of Minnesota invites

applications for faculty positions beginning September 1992. Positions

are available at both the junior and senior levels and in all areas of

Computer Science with emphasis on compilers and operating systems,

performance evaluation, architecture, numerical analysis and optimization,

robotics and computer vision as well as all aspects of parallel and

distributed computing. Requirements include a Ph.D. in Computer Science or

a related discipline, excellence in research and a commitment to teaching.

The Minneapolis-St. Paul area is a major center for high technology and

the computer industry. Faculty in the Department of Computer Science have

access to outstanding computer facilities both within the department and

at the various high performance computing centers on campus, including the

Minnesota Supercomputer Institute and the Army High Performance Computer

Research Center. The supercomputers available on campus are Cray-2,

Cray X-MP, and Thinking Machines CM-2 and CM-X massively parallel processors.

Applicants should submit a vitae and names of at least three references to:

Chair, Faculty Recruiting Committee, Department of Computer Science,

4-192 EE/CSci Building, University of Minnesota, 200 Union St. S.E.,

Minneapolis, Minnesota 55455. The closing date for receipt of applications

is February 28, 1992.

The University of Minnesota is an equal opportunity employer and

welcomes applications from women and ethnic minorities.

------------------------------

From: Richard Sincovec <sincovec@sirius.EPM.ORNL.GOV>

Date: Thu, 26 Sep 91 14:46:42 -0400

**Subject: Householder Fellowship at Oak Ridge**

HOUSEHOLDER FELLOWSHIP

OAK RIDGE NATIONAL LABORATORY

The Mathematical Sciences Section of Oak Ridge National Laboratory

invites outstanding candidates to apply for the 1992 Alston S.

Householder Fellowship in Scientific Computing.

Alston S. Householder was the organizer and founding Director of the

Mathematics Division (precursor of the current Mathematical Sciences

Section) at Oak Ridge National Laboratory (ORNL). In recognition of

the seminal research contributions of Dr. Householder to the fields of

numerical analysis and scientific computing, a distinguished

postdoctoral fellowship program has been established at ORNL and named

in his honor. The Householder Fellowship is supported by the Applied

Mathematical Sciences Subprogram of the U.S. Department of Energy.

The purposes of the Householder Fellowship are to promote innovative

research in scientific computing on advanced computer architectures

and to facilitate technology transfer from the laboratory research

environment to industry and academia through advanced training of new

computational scientists. The Householder Fellowship is normally for

a term of two years. Benefits of the Fellowship include a competitive

salary, fringe benefits, travel opportunities, access to

state-of-the-art computational facilities (including both parallel

architectures and high-performance personal workstations), and

collaborative research opportunities in a very active research program

in advanced scientific computing. Competition for the appointment is

open to U.S. citizens and permanent residents. Applicants should have

completed a doctoral degree in computer science, mathematics, or

statistics within three years prior to the appointment and have a

strong background and research interest in large-scale scientific

computing.

ORNL's Mathematical Sciences Section has research programs in

Computational Mathematics, Computer Performance Characterization,

Applied Analysis, Global Climate Modeling, and Computational

Statistics. The precise research emphasis of the Householder Fellow

would necessarily depend to a great degree on the research interests

of the selected Fellow. Areas of particular interest at ORNL, and in

which applicants would be especially encouraged, include:

computational linear algebra, parallel algorithms for partial

differential equations including fluid flow through porous media,

tools for the development and analysis of parallel algorithms,

computational statistics, scientific visualization, and ``grand

challenges'' in computational science.

Applicants should send a resume, statement of research goals, and

three letters of recommendation to Carl Ludemann, PhD Employment, Oak

Ridge National Laboratory, P.O.\ Box 2008, Oak Ridge, TN 37831-6216,

marked ``ATTN: Householder Fellowship.'' The deadline for applying is

December 13, 1991. Finalists will be invited to visit ORNL in late

January 1992, and the selection committee's decision on the winning

candidate will be announced in February 1992. The fellowship will

commence in 1992.

For further information, contact Richard F. Sincovec by phone at

615-574-3125 or by electronic mail at sincovecrf@ornl.gov.

------------------------------

From: James Demmel <demmel@robalo.Berkeley.edu>

Date: Thu, 26 Sep 91 16:38:32 PDT

**Subject: On the Odor of IEEE Arithmetic**

This is in response to Joe Grcar's message from NA Digest v. 91, i. 33:

IEEE Arithmetic Stinks. There were interesting responses from John Reid

and John Nash, and I will not repeat their arguments here. I would instead

like to discuss a few of the real ways in which current floating point

computing environments remain deficient, and what efforts are being

made, or could be made to fix them. I would also like to discuss the SLI

arithmetic of Lozier, Olver and Turner, as well the arithmetic of Kulisch and

Miranker.

A thorough discussion of all these issues would fill a long journal article,

so I will necessarily be terse.

IEEE Arithmetic:

The IEEE standards committee hoped to provide a good scientific computing

environment, a complete provision of which would encompass arithmetic hardware,

languages, compilers, and OS support. The committee reasonably felt no one

would listen if they tried to dictate too many details beyond the "pure"

floating point properties, and perhaps also realized they did not have

the expertise to make wise decisions covering all these areas. Instead,

they made suggestions in the hope that language, compiler and OS support

for the various IEEE features would follow. This hope has usually

gone unmet.

SUN C, SUN Fortran, and the Apple SANE environment have gone farthest in

providing a simple programming model for the user, but they have still

not gone far enough. No language provides easy (or any) ability to

write exception handlers, for example.

Joe Grcar may respond that he does not want to even consider writing his

own exception handlers. Fine. In that case, it is a simple matter to call

a standard SUN library function to enable trapping on all the usual

exceptions. That the IEEE default is to continue computing with NANs and

infinities instead of trapping is something that people on the IEEE committee

were not unanimous about, but providing the usual unforgiving programming

model is certainly no problem. (And one could imagine providing a

different default environment, where one traps on exceptions, via a

compiler flag or some similar device).

It is clear that the time is ripe for a proposed standard set of interfaces

to features of IEEE (and other!) arithmetics not usually provided for in

conventional languages. The recent LCAS proposal of Mary Payne and

Brian Wichmann is a recent attempt, even if it had flaws (see the forthcoming

paper in the SIGNUM newsletter by W. Kahan on this). Work is continuing

on this, in the context of the NCEG (Numerical C Extensions Group, chaired

by Jim Thomas - jimt@apple.com) and the LAPACK project (W. Kahan at Berkeley).

I hope to speak about this issues in more detail at Supercomputing 91 in

November.

Joe Grcar mentions the difficulty in doing error analysis, because of

infinities and gradual underflow. One reference on error analysis with

gradual underflow is my article "Underflow and the Reliability of

Numerical Software" in the December 1984 SIAM J. Sci. Stat. Comput.

If one wishes to take underflow into account, for most codes the

algebraic work involved is essentially the same as for conventional

flush-to-zero systems: One uses the error model

floating_point_result(a op b) = (a op b)*(1 + delta) + mu

where abs(delta) is bounded by machine epsilon and mu is the underflow error.

For gradual underflow mu is bounded by the smallest subnormal number,

and for flush-to-zero, it is bounded by the underflow threshold, which is

larger by the factor 1/machine_precision. It is easy to analyze many codes

using this model, and to show that gradual underflow makes them more reliable

than flush-to-zero. Indeed, for some codes, like polynomial evaluation,

one can show that only with gradual underflow is there a "perfect" backward

error in the following sense:

The computed answer is the exact answer for a slightly different

problem, where the perturbation is in the last few bits of every

piece of data, whether it is normalized or not. In contrast, with

flush-to-zero, almost no bits in any piece of the data my be

significant if all the numbers are too close to the underflow

threshold.

Grcar is concerned about a product like

i=n

p = prod a(i)

i=1

where an intermediate product underflows (but not to zero) but subsequent a(i)

are so large that p is normalized. In this case p could have large relative

error without warning. In flush-to-zero, p would be zero, which also has a

large relative error. The concern must be over codes that test for underflow by

comparing p to zero, and recover somehow; these codes could break with IEEE.

Such codes would have to have some a(i) less than 1 and some greater than 1

(otherwise once underflow occurred the final result would still be underflowed

and detectable by comparing p against the underflow threshold instead of 0).

Thus, to be correct, they would have to have protection against overflow,

which complicates the code at least as much as gradual underflow.

And to be portable to machines like the Cray or i860 which divide a/b via

a*(1/b), they would still have to compare tiny numbers against a threshold

instead of against zero to guard against problems.

Still, someone might have a code he thinks works, and want to run it

with as few changes as possible. Here are two approaches. One way is

to use the underflow flag to test for underflow. A second way is to

have a trap handler that traps on underflow and substitutes 0, mimicing

old fashioned flush-to-zero. Both solutions presume access to these

features of IEEE arithmetic, and so if they are not provided the user

has a perfect right to complain. Complain about access to features of

IEEE arithmetic, that is, not about IEEE arithmetic itself.

Finally, the true benefit of IEEE arithmetic will not be seen until

significant applications are written using its features to great

advantage. One of the goals of LAPACK 2 is to write linear algebra

codes using IEEE which are significantly faster and/or more accurate

than their conventional counterparts. Simple examples are condition

estimation and eigenvector computation: Both involve solving potentially

very ill-conditioned triangular systems. Overflow happens occasionally,

but is generally NOT a "pathological" situation deserving an exception.

For example, it may just mean eigenvalues are close together.

Currently, LAPACK codes must scale inside the inner loop to avoid overflow,

significantly slowing them down. The alternative is to run at top speed

without scaling, test for overflow at the end, and only then run the

scaling code if needed. In other words, IEEE provides an essentially free

insurance policy against a rare but fatal problem, instead of paying the

high premium of always scaling.

SLI Arithmetic:

Frank Olver responded to Joe Grcar's complaints about IEEE by saying

that we should throw out floating point completely and use sli

arithmetic. For the uninitiated, SLI arithmetic represents floating

point numbers as follows:

s1*exp(s2*exp(exp(exp(...(exp(f))...))))

where s1 and s2 are +-1, and f is a fixed point number between 0 and 1.

The representation stores s1, s2, f and m, where m is the number of exp(.)

functions. With a 64 bit format and m limited to 0..7 (3 bits), the

largest number representable is quite unimaginably large.

I made a number of arguments against adopting this arithmetic in my paper

"On error analysis in arithmetic with varying relative precision", Proc.

8th IEEE Computer Arithmetic Symposium, 1987. Let me summarize these and

add a few not presented there.

The proponents of this arithmetic like it because it is free of over/underflow

(unless one divides by 0 or uses transcendental functions). This, in their

view, eliminates the need for infinity arithmetic or denormalized numbers.

However, the reason their system cannot over/underflow on +,-,*, or / is

that at the extremes of their representation, the representable numbers are

so far apart that the correctly rounded value of a+b or a*b is just max(a,b)

(assuming a and b are positive). In other words, rounding uncertainty is

so large that not only is there no relative precision in the usual sense,

but there may not even be any relative precision in the (usual) exponent.

Indeed, a + b = a * b = max(a,b) for positive a and b is precisely the

semantics of the infinity symbol. I am all for having an infinity symbol,

but not a profusion of them as in SLI arithmetic.

The problem with expressions like prod(a(i)) possibly losing precision because

of denormalized IEEE arithmetic recurs in SLI, but without any possibility

of warning. Since numbers gradually lose (all) their relative precision

for enormous or tiny magnitudes, one could end up with a modest sized number

without any relative accuracy, and without recourse to exception handling or

underflow flags as warning. (But see my above mentioned paper for a

possible, if too-expensive, hardware fix.)

The test of any new arithmetic lies in the existence of applications, even if

they are only on paper for lack of an implementation. The only applications

of SLI I have seen have been unconvincing small problems for which equal or

better alternatives using standard arithmetic exist. Indeed, I think that

there is a conservation of effort in writing robust code: any effort saved by

SLI in handling over/underflow is lost in dealing with the variable relative

precision.

Finally, there are better and more easily implemented schemes with similar

tradeoffs between exponent range and precision. A paper by H. Hamada of Hitachi

in the same IEEE Proceedings describes one such scheme; see my references to

Matsui and Iri for another. Hamada's scheme had most of the advantages and

fewer of the disadvantages of SLI. Hitachi considered building a machine with

Hamada's arithmetic, and to my knowledge abandoned it, reflecting their

judgment that it was too slow and had insufficient applications.

Kulisch - Miranker Arithmetic

There is a proposal circulating to incorporate Kulisch/Miranker style

arithmetic as an extension of IEEE floating point arithmetic. I'd like

to summarize this proposal briefly and argue against it.

Kulisch/Miranker propose to add the ability to compute dot products of

arbitrary vectors exactly. In other words,

(1,1e38,-1e38).(1,1e38,1e38) = ( 1*1 + 1e38*1e38 ) - 1e38*1e38 = 1

would be computed correctly instead of as 0. This would appear to

require carrying the equivalent of at least about 2*2*38+7=159 decimal digits

in IEEE single, and far more in double. This has been done in the past in

microcode and cache, as on the IBM 4361. They propose to add a single type,

DOTPRECISION, which one could only use to accumulate dot products, add and

subtract, and round (up, down or to nearest) to the other IEEE precisions.

Their motivation is to build algorithms using interval arithmetic with

standard precision for most operations and DOTPRECISION for a few

critical ones. Their algorithms, as implemented for example in the IBM

ACRITH package, claim to solve linear systems and get minimally wide

output intervals guaranteed to contain each solution component; i.e. the

interval bounds are adjacent floating point numbers. They have similar

codes for some eigenproblems, zeros of polynomials, linear programming,

and other problems.

A conference proceedings on this material may be found in

"A new approach to scientific computation", U. Kulisch and W. Miranker, eds.,

Academic Press, 1983. Some of their claims are exaggerated, as described in

"Anomalies in the IBM ACRITH Package," W. Kahan and E. Leblanc, Proc.

7th IEEE Computer Arithmetic Symposium, 1985.

Here are the main arguments against adopting Kulisch's scheme:

1) None of their algorithms that I know of depend for correctness on an

exact inner product. And their accuracy also depends only weakly on

how much extra precision you have: As the precision grows, you can

solve more problems to "full accuracy", but the number of these problems

is very small and gets smaller as the precision increases. You have

to work hard to concoct examples which need anywhere near the precision

required for exact inner products in order to be solved fully accurately.

So the benefit from all those extra bits is small.

2) Many of their iterative schemes (such as for linear equation solving)

depend on working precision Gaussian elimination being accurate

enough to get a convergent iteration going. As long as there is a little

bit of convergence, they compute residuals accurately to get lots of

accuracy in the answer. But if the working precision Gaussian elimination

is not good enough, their extra precision does not help them. You would

need to do the Gaussian elimination itself in higher precision, which they

cannot do straightforwardly with their proposed data types and instruction set.

(They appear to do higher precision Gaussian elimination in some cases in

later ACRITH releases, although this is just speculation since the details are

proprietary. The only way I can imagine doing it using their arithmetic

is quite cumbersome, in contrast to using higher precision arithmetic directly.)

3) They do not address exponent range limitations, which are at least

as problematic in some software development as accuracy limitations.

This is true in polynomial evaluation, for example.

4) There is really a statistical ("RISC vs CISC") question underlying this:

Of the following reasons for failure to get a good enough answer:

(1) not enough precision in the residuals for iterative refinement

to work

(2) not enough precision in the initial Gaussian elimination to get

convergence (however slow)

(3) exponent range limitation

which is most likely to cause failure? Kulisch and Miranker only address (1).

Quadruple precision addresses all three, but (1) to a somewhat weaker extent

than Kulisch and Miranker. I do not have statistics (I doubt anyone does) but

I feel strongly that the extra problems one can solve with quad because

it deals with (2) and (3) far outweigh the problems Kulisch can

solve but quad can't because of (1).

5) There are faster and simpler algorithms using (I believe) the same

hardware resources which they cannot use, or prefer not to use, because

they limit themselves to dot products. Again, consider polynomial evaluation:

if one could multiply a long accumulator by a single precision variable,

using a loop like:

real s,tmp

DOTPRECISION d1, d1

/* compute d2=s*d1 */

d2=0

repeat

tmp=d1

d1=d1-tmp

d2=d2+s*tmp

until d1=0

then one could just use the standard algorithm to evaluate polynomials

(Horner's rule) and it would work. Instead, they need a complicated

library routine. In other words, their proposed instruction set makes

programmers have to completely rethink things unnecessarily.

So from a software point of view, it is much easier to develop algorithms

using quad than shoehorning everything into dot products. Kahan has more

to say on this in his article cited above.

6) Trying to get tight guaranteed error bounds is a laudable goal, and

interval arithmetic is a valuable tool, but it will not be possible to do

this reliably in systems without variable precision. The reason is that if one

tries to build larger applications out of interval building blocks

(like ACRITH routines), then even if the input data is exact

(0-width intervals), the output of the first interval routine will be

intervals. These intervals are then fed in to a second routine. How wide

must the output intervals of the second routine be? It is easy to show that

in general they must be COND2 times wider than the input intervals,

where COND2 is the condition number of the problem being solved by the

second routine. If these intervals are then used as input to a third routine,

their width will be magnified by another condition number factor COND3.

If we are computing with a fixed maximum precision, eventually these

products of COND factors will make the intermediate intervals so large

as to be useless. The only way around this is to completely reformulate

the problem to avoid these condition numbers, which may be very difficult

and time-consuming (and defeats the purpose of using these libraries, which is

getting the answer faster as well as more accurately), or else to use

variable precision. Tom Hull at the University of Toronto, Siegfried Rump at

TU Hamburg, and others have proposed and sometimes built such systems.

Thus, I do not foresee wide use of interval arithmetic until variable

precision is adequately supported, and this cannot be done without a

more major architectural revision than a small change in the IEEE floating

point standard.

Jim Demmel

Computer Science Division and Mathematics Department

University of California

Berkeley, CA 94720

demmel@cs.berkeley.edu

------------------------------

End of NA Digest

**************************

-------