From nacomb@surfer.EPM.ORNL.GOV Sun Sep 29 16:01:11 1991 Return-Path: Received: from surfer.EPM.ORNL.GOV by CS.UTK.EDU with SMTP (5.61++/2.7s-UTK) id AA14148; Sun, 29 Sep 91 16:01:02 -0400 Received: by surfer.EPM.ORNL.GOV (5.61/1.34) id AA01787; Sun, 29 Sep 91 16:00:46 -0400 Date: Sun, 29 Sep 91 16:00:46 -0400 From: nacomb@surfer.EPM.ORNL.GOV (NA-NET) Message-Id: <9109292000.AA01787@surfer.EPM.ORNL.GOV> Subject: NA Digest, V. 91, # 39 Apparently-To: dongarra@cs.utk.edu Status: R NA Digest Sunday, September 29, 1991 Volume 91 : Issue 39 Today's Editor: Cleve Moler Today's Topics: Changed address for Sven-Ake Gustafson New Address for Alfio Quarteroni New Address for Angelika Bunse-Gerstner Student Support for Parallel Circus Looking for a Research Position Journal Refereeing is Breaking Down 3-level block-Toeplitz solver 6th ACM Conference on Supercomputing FTPing Technical Reports Matrix Calculator with C Syntax Errata to "Numerical Recipes" by Press et all Special Issue of "Computing" for H.-J. Wacker Positions at the University of Minnesota. Householder Fellowship at Oak Ridge On the Odor of IEEE Arithmetic Submissions for NA Digest: Mail to na.digest@na-net.ornl.gov. Information about NA-NET: Mail to na.help@na-net.ornl.gov. ------------------------------------------------------- From: Sven-Ake Gustafson Date: Tue, 24 Sep 1991 12:02:00 +0200 Subject: Changed address for Sven-Ake Gustafson Changed address for Sven-Ake Gustafson: I have returned from my sabbatical leave and from now on my address will be: HSR Box 2557 Ullandhaug N-4004 Stavanger Norway Phone: (office) +47 4874470 (home) +47 4890083 Fax: +47 4874236 email: gustafson@ifi.hsr.no ------------------------------ From: Alfio Quarteroni Date: Thu, 26 Sep 1991 15:56 MET Subject: New Address for Alfio Quarteroni Effective September 1st, I am back at the Politecnico of Milan, at the following address: Alfio Quarteroni Dipartimento di Matematica Politecnico di Milano Via Bonardi 9 20133 Milano - Italy Phone: +39-2-2399-4525 Fax: +39-2-2399-4568 e-mail: alfqua%ipmma1.polimi.it@imiclvx.bitnet ------------------------------ From: Angelika Bunse-Gerstner Date: Fri, 27 Sep 91 14:48:14 N Subject: New Address for Angelika Bunse-Gerstner I moved from the university of Bielefeld to the university of Bremen, where I took a new position at the department of mathematics and computer science. My new address, phone and e-mail are as follows: Address: Angelika Bunse-Gerstner Fachbereich Mathematik und Informatik Universitaet Bremen Postfach 33 04 40 W-2800 Bremen FRG Phone: 049-421-218-4205 email: abg@informatik.uni-bremen.de ------------------------------ From: Gene Golub Date: Fri, 27 Sep 91 7:57:32 PDT Subject: Student Support for Parallel Circus We have recently been awarded a modest grant by the NSF for supporting the travel and expenses of students attending the Parallel Circus meetings during the next two years. The next Circus will take place in Oak Ridge on Oct 25, 26. Those students requesting support should send a one page proposal to Gene Golub (na.golub@na-net.ornl.gov) giving the reasons for attending the meeting, and a budget for expenses. The student(s) should indicate their research interests and plans. A letter verifying that the student is in good standing should be sent independently by a faculty adviser. This letter should give the student's GPA. We will be pleased to consider joint proposals which would include the expenses of several students. The deadline is 5pm (PST) on Oct 11 and we will try to get a response by Oct 15. Correspondence by e-mail is desirable. Apostolos Gerasoulis Gene Golub Ahmed Sameh Gene Golub, Computer Science Dept. Office: Margaret Jacks Hall, Room 306 Office Phone: 415/723-3124 Home Phone: 415/323-0105 FAX number: (415) 725-7411 ------------------------------ From: Vera Talanker Date: Mon, 23 Sep 91 11:17 EST Subject: Looking for a Research Position This spring I will be completing my MS degree in Applied Mathematics at Colorado School of Mines. I am looking for a 1-2 year research position in the field of medicine (image processing, modeling drug delivery, etc.) in a European country. Can anyone give me some suggestions where to start looking? Thanks, Vera Talanker: e-mail: on internet talanker@alcvax.pfc.mit.edu ------------------------------ From: Stephen Vavasis Date: Mon, 23 Sep 91 16:08:59 Subject: Journal Refereeing is Breaking Down It's my impression that the refereeing process in the numerical community is breaking down. Papers are taking longer and longer to get refereed. The reason seems to be that people are writing more papers while agreeing to referee fewer. I reach this conclusion from my own experience of lengthy publication delays, and also from talking to some editors of SIAM journals. I would like to propose a stringent solution to the problem. Because many people in the numerical community read and publish in SIAM journals, I would propose that SIAM adopt the following policy. SIAM would maintain a list of people who carry their share in terms of refereeing, and of people who repeatedly decline to referee SIAM papers. Anyone in the latter category would not be allowed to publish in SIAM journals. I realize that the problem I have identified is not specific to numerical analysis but is widespread throughout computer science and probably other disciplines. We in NA, however, are fortunate to have a single organization (SIAM) that could actually take action to combat the problem. I would welcome comments on this proposal, sent to me by email or posted to NA digest. I have not yet discussed my proposal with people in the SIAM publication department. -- Steve Vavasis vavasis@cs.cornell.edu ------------------------------ From: Khalil Kalbasi Date: Thu, 19 Sep 1991 11:14 CDT Subject: 3-level block-Toeplitz solver I like to know if anyone knows of a 3-level block Toeplitz solver with complex general blocks. I am doing an array antenna problem which eventually needs to solve a linear complex system Ax=b. The matrix A is very structured, i.e. it is a 3-level block Toeplitz with a complex general blocks. The Netlib has 3-level block-circulant routines which is not any use to me. The structure of the matrix A (a 6x6 block example) is shown below, where each M# is a general complex block. As shown, each 2x2 block is a general Toeplitz and the ovearl 6x6 blocks are also general Toeplitz. | M1 M2 M4 M5 M7 M8 | | M3 M1 M6 M4 M9 M7 | | | | M10 M11 M1 M2 M4 M5 | | M12 M10 M3 M1 M6 M4 | | | | M13 M14 M10 M11 M1 M2 | | M15 M13 M12 M10 M3 M1 | I appreciate your help. Khalil Kalbasi Department of Electrical and Computer Engineering The University of Kansas, Lawrence, KS (913)864-7739 ------------------------------ From: Stratis Gallopoulos Date: Mon, 23 Sep 91 12:28:59 CDT Subject: 6th ACM Conference on Supercomputing CALL FOR PAPERS 6th ACM International Conference on Supercomputing Sponsored by ACM SIGARCH July 19-23, 1992 Hyatt Crystal City Hotel, Washington D.C General Chair Ken Kennedy, Center for Research on Parallel Comp., Rice University Program Chair C. D. Polychronopoulos, CSRD, University of Illinois The 6th ACM International Conference on Supercomputing is soliciting papers on significant new research results and experience in the development and use of supercomputing systems. Topics of interest include, but are not limited to parallel and high-performance computer architectures, parallelizing compilers and programming environments, operating systems and performance evaluation, large-scale applications and algorithms, technologies, new experimental and commercial systems. Program Committee F. Allen (IBM, T.J. Watson), R. Allen (Kubota Pacific), Arvind (MIT), J. L. Baer (U. Washington), D. Bailey (NASA Ames), J. C. Browne (U. Texas-Austin), R. Cytron (Washington U.), D. DeGroot (Texas Inst.), R. Esser (KFA), E. Gallopoulos (CSRD), J. R. Gurd (U. Manchester), F. Hossfield (KFA), E. Houstis (Purdue U.), W. Jalby (INRIA-IRISA), Y. Jegou (INRIA-IRISA), C. Koelbel (Rice U.), J. McGraw (LLNL), Y. Muraoka (Waseda U.), A. Nicolau (UC Irvine), T. Papatheodorou (U. Patras), J. Riganati (SRC), A. J. Smith (UC Berkeley), H. Terada (Osaka U.), A. Veidenbaum (CSRD), S. Wallach (Convex). Local Arrangements Chair: Duncan Buell, SRC Finance Chair: A. Veidenbaum, CSRD Publicity Chair: E. Gallopoulos. CSRD Paper Submissions Authors should submit five copies of the full manuscript to the program chairman or the appropriate regional chairman as follows: NORTH & SOUTH AMERICA EUROPE & AFRICA JAPAN & FAR EAST Constantine Polychronopoulos Harry Wijshoff Toshitugu Yuba Program Chairman Region Chairman Region Chairman CSRD Dept. Computer Sci. Electrotechnical Lab. University of Illinois Utrecht University 1-1-4 Umezono, Tsukuba 104 S. Wright St. Padualaan 14 Ibaraki 305 Urbana, Illinois 61801, USA 3584 CH Utrecht Japan (cdp@csrd.uiuc.edu) The Netherlands (yuba@etl.go.jp) (harryw@cs.ruu.nl) DEADLINE for Submissions is December 10, 1991. Authors will be notified about the outcome of the paper selection by March 7, 1992. Proposals for Tutorials The conference also solicits proposals for half and full-day tutorials which will be offered before and after the conference. Proposals for tutorials on topics of wide interest to the supercomputing and parallel processing community should be submitted by December 10, 1991 to: George Paul Tutorials Chairman IBM T.J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598, USA (gp1@watson.ibm.com) Tutorial topics of particular interest include, but are not limited to large-scale applications, numeric and symbolic computing, applications of supercomputing to engineering, physical and biological sciences, programming environments and visualization tools, architectures, compilers and languages and operating systems. Conference Announcement The conference announcement is available in ASCII and PostScript via anonymous ftp from sp2.csrd.uiuc.edu (128.174.162.51) in pub/ics-an and pub/ics-an.ps. For more information contact C. D. Polychronopoulos, Program Chairman, CSRD, University of Illinois, 104 S. Wright St., Urbana, Illinois 61801-2932, USA. Tel. (217) 333-6773. FAX (217) 244-1351. E-mail: cdp@csrd.uiuc.edu. ------------------------------ From: Pete Stewart Date: Mon, 23 Sep 91 14:40:31 -0400 Subject: FTPing Technical Reports As regular readers of this digest know, I have made my current technical reports available by anonymous ftp. The response has been encouraging, and several people have asked how to make their own reports available. Hence this little note. First some background. Ftp--short for file transfer program--is an internet interface for moving files over the network. The user can send or get ASCII or binary files. A particularly useful command is "mget", which expands names (e.g., paper.*) and gets all the resulting files. Details may be found in the on-line manual. Ordinarily one must have a valid account on a machine in order to ftp to it. This causes difficulties for people who want to make programs, reports, and the like available to the general public. The fix is the anonymous ftp account, which gives people restricted access to files on a machine. Here for an example, is a little run in which I listed some directories on my own machine. thales: ftp thales.cs.umd.edu Connected to thales.cs.umd.edu. 220 thales.cs.umd.edu FTP server (SunOS 4.1) ready. Name (thales.cs.umd.edu:stewart): anonymous 331 Guest login ok, send ident as password. Password: 230 Guest login ok, access restrictions apply. ftp> cd pub 250 CWD command successful. ftp> ls 200 PORT command successful. 150 ASCII data connection for /bin/ls (128.8.128.81,1315) (0 bytes). jeep matrixpert references reports 226 ASCII Transfer complete. 39 bytes received in 1.1 seconds (0.035 Kbytes/s) ftp> bye 221 Goodbye. thales: The procedure is to identify yourself as "anonymous". You can give any password, but it is usual to give your name or email address, so the account manager can tell who is transfering files. By convention pub is the root directory for public files. The first step in creating a repository of technical reports is to get your systems staff to set up an anonymous ftp account. For people who are concerned about security, note that anonymous ftp is quite safe. The user cannot get out of the ftp directory, and to my knowledge the program has never been broken. In any event, our own security conscious staff has no hesitation about setting up such accounts. Once you have the account, you will probably want to create a special directory to hold the repository (tech reports are not the only thing you are going to want to distribute: see the above listing of my pub directory). At this point you must decide what form you want to distribute the reports in. For those who use TeX, there are three options--tex files, dvi files, or postscript files. Unless you stick to vanilla TeX or LaTeX, tex files are not the way to go, since you will also have to distribute any auxiliary files containing your customized macros. I went with dvi files because many places do not have postscript printers, while most places have a utility (called something like dvi2ps) that converts dvi files to postscript. One important drawback is that postscript graphics must be transfered in separate files; however, you can use the mget feature of ftp to ease some of the pain. The next thing is to decide on what auxiliary information you want to supply your customers. I have organized my repository around three files: README, Contents, and Abstracts. The README file contains basic information about the repository. The Contents file contains the following information for each paper Authors Title Source Disposition Files Here is a typical entry. G. W. Stewart "An Updating Algorithm for Subspace Tracking" UMIACS-TR-90-86, CS-TR 2494, July 1990, Revised January 1991 To appear in IEEE Transactions on Acoustics, Speech and Signal Processing uast.dvi, uast.f (fortran program) The Abstract file contains abstracts in addition to the above information. Setting up these files for the first time is the most time-consuming part of the process. They are very easy to update as you add reports to your repository. The last problem is to select names for your files. I construct mine from the first letters of the significant words in the title. The results are admitedly ugly, but it solves the problem of repeated names. In fact, two or three letters followed by an asterisk is usually sufficient identification for an mget. An alternative is to use more conventional names with a date to insure uniqueness. That's all there is to it. If you try it, please let me know. When there is a critical mass, I will undertake to edit a weekly entry in the na-digest announcing newly available reports--a sort of technet digest. It would be a fitting birthday present for Gene Golub, who has done so much to unite us electronically. Pete Stewart stewart@cs.umd.edu na.pstewart@na-net.ornl.gov ------------------------------ From: Rick Perry Date: 24 Sep 91 09:51:00 EDT Subject: Matrix Calculator with C Syntax I have been working on the development of a matrix calculator program, similar to Matlab, but with the syntax of the C programming language. This project started as a way to combine some of my C code for matrix computations into a coherent package, but has evolved into a fairly complete interactive C compiler, with matrices as the fundamental data type (e.g. matrix A = [1 2; 3 4]; ++A;) and some operators from APL ( +/ */ ). I am interested in hearing about similar development efforts or existing products, and comments about the usefulness of such a package. I am aware of many packages, e.g. Matlab, Matrix, S, SMP, etc. but I've never heard of any which use C syntax. My program so far runs on Sun/Unix, VAX/VMS, and IBM PC's, and I'll gladly make it available for anyone who's interested. ...Rick perry@vu-vlsi.vill.edu, perry@ucis.vill.edu Dr. Rick Perry, Department of Electrical Engineering Villanova University, Villanova, PA 19085, 215-645-4969, -4970, hm: 259-8734 ------------------------------ From: Przemek Klosowski Date: 17 Sep 91 18:21:32 GMT Subject: Errata to "Numerical Recipes" by Press et all Hello! The merits (or demerits) of Numerical Recipes were discussed on the net in a series of articles, but my feeling was that there was no hard data on the specific problems and adopted solutions. In particular I remember someone from JPL complaining that the "robust fit" procedure is not robust at all, but without giving the details. I think it might be of use to catalog specific problems with NR's algorithms (for the sake of focus I propose that we do the C version first), so that others might avoid the traps that possibly exist. I would like to start such collection by posting several items, providing varying detail, and asking people to add their own remarks/fixes. I ask everyone to send me an e-mail copy of their followup, so that I don't miss it (I normally read the newsgroups, but if the backlog is too large, I tend to catch up). In the future, I will edit the contributions and make them available to the public. Specific problem areas that I know of are: 1) In the discussion of Levenberg-Marquard algorithm (ch. 14, p. 544) for nonlinear minimization (mrqmin), the book gives a condition for the end of iterations that is not satisfied if the starting point is near the true minimum. 2) error in medfit---could the JPL person that reported it originally fill in the details? 3) The authors "cathegorically recommend that you {\it only} use FFTs with N a power of two". I understand that the latest wisdom goes against this---comments/code anyone? I have heard rumors that some of the FFT routines are broken. 4) the variable metric BFGS method (dfpmin) uses the traditional approach of calculating and using the Hessian. This (as mentioned in the text) is prone to roundoff errors, and that instead Gill/Murray method of calculating H^{-1} is much preferred (more than it is implied from the text). Anyone willing to comment on that? Anyone has a G/M version of dfpmin? Przemek Klosowski (przemek@ndcvx.cc.nd.edu) Physics Department University of Notre Dame IN 46556 ------------------------------ From: Bob Mattheij Date: Wed, 25 Sep 91 13:27:58 +0200 Subject: Special Issue of "Computing" for H.-J. Wacker The journal Computing (Archives for Informatics and Numerical Computation) will devote a special issue for papers dedicated to the unfortunate premature death of its Editor H.-J. Wacker. As Professor Wacker was particularly active in the promotion of the use of mathematics in industry, this special issue should reflect these intentions and papers from the area of "Industrial Mathe- matics" will be particularly welcome for this issue even if they would consti- tute borderline cases with respect to the ordinary scope of Computing. Manuscripts for this issue should be submitted (in duplicate) as soon as possible to either Prof. Robert Mattheij Prof. Hans J. Stetter Technical University or Technical University (115.2) Postbus 513 Wiedner Hauptstrasse 8-10 5600 MB Eindhoven A-1040 Vienna The Netherlands Austria The absolute deadline for the submission of manuscripts for this special issue is February 15 1992. Manuscripts must conform with the "Instructions to Authors" reprinted in each issue of Computing. They will be subject to refereeing and the scientific standards of the journal. Note that Computing has a category of "Short Communi- cations" (besides "Original Papers"); manuscripts for the special issue may also be submitted in this category. Please, give me your reactions to the above information as soon as pos- sible. I will be away once more between 22 Sept. and 5 Oct.; thus it would be good to have a reply by 20 Sept. With my best regards and my appreciation of your cooperation, Sincerely yours, Hans J. Stetter ------------------------------ From: Youcef Saad Date: Thu, 26 Sep 91 10:35:24 CDT Subject: Positions at the University of Minnesota. The Department of Computer Science at the University of Minnesota invites applications for faculty positions beginning September 1992. Positions are available at both the junior and senior levels and in all areas of Computer Science with emphasis on compilers and operating systems, performance evaluation, architecture, numerical analysis and optimization, robotics and computer vision as well as all aspects of parallel and distributed computing. Requirements include a Ph.D. in Computer Science or a related discipline, excellence in research and a commitment to teaching. The Minneapolis-St. Paul area is a major center for high technology and the computer industry. Faculty in the Department of Computer Science have access to outstanding computer facilities both within the department and at the various high performance computing centers on campus, including the Minnesota Supercomputer Institute and the Army High Performance Computer Research Center. The supercomputers available on campus are Cray-2, Cray X-MP, and Thinking Machines CM-2 and CM-X massively parallel processors. Applicants should submit a vitae and names of at least three references to: Chair, Faculty Recruiting Committee, Department of Computer Science, 4-192 EE/CSci Building, University of Minnesota, 200 Union St. S.E., Minneapolis, Minnesota 55455. The closing date for receipt of applications is February 28, 1992. The University of Minnesota is an equal opportunity employer and welcomes applications from women and ethnic minorities. ------------------------------ From: Richard Sincovec Date: Thu, 26 Sep 91 14:46:42 -0400 Subject: Householder Fellowship at Oak Ridge HOUSEHOLDER FELLOWSHIP OAK RIDGE NATIONAL LABORATORY The Mathematical Sciences Section of Oak Ridge National Laboratory invites outstanding candidates to apply for the 1992 Alston S. Householder Fellowship in Scientific Computing. Alston S. Householder was the organizer and founding Director of the Mathematics Division (precursor of the current Mathematical Sciences Section) at Oak Ridge National Laboratory (ORNL). In recognition of the seminal research contributions of Dr. Householder to the fields of numerical analysis and scientific computing, a distinguished postdoctoral fellowship program has been established at ORNL and named in his honor. The Householder Fellowship is supported by the Applied Mathematical Sciences Subprogram of the U.S. Department of Energy. The purposes of the Householder Fellowship are to promote innovative research in scientific computing on advanced computer architectures and to facilitate technology transfer from the laboratory research environment to industry and academia through advanced training of new computational scientists. The Householder Fellowship is normally for a term of two years. Benefits of the Fellowship include a competitive salary, fringe benefits, travel opportunities, access to state-of-the-art computational facilities (including both parallel architectures and high-performance personal workstations), and collaborative research opportunities in a very active research program in advanced scientific computing. Competition for the appointment is open to U.S. citizens and permanent residents. Applicants should have completed a doctoral degree in computer science, mathematics, or statistics within three years prior to the appointment and have a strong background and research interest in large-scale scientific computing. ORNL's Mathematical Sciences Section has research programs in Computational Mathematics, Computer Performance Characterization, Applied Analysis, Global Climate Modeling, and Computational Statistics. The precise research emphasis of the Householder Fellow would necessarily depend to a great degree on the research interests of the selected Fellow. Areas of particular interest at ORNL, and in which applicants would be especially encouraged, include: computational linear algebra, parallel algorithms for partial differential equations including fluid flow through porous media, tools for the development and analysis of parallel algorithms, computational statistics, scientific visualization, and ``grand challenges'' in computational science. Applicants should send a resume, statement of research goals, and three letters of recommendation to Carl Ludemann, PhD Employment, Oak Ridge National Laboratory, P.O.\ Box 2008, Oak Ridge, TN 37831-6216, marked ``ATTN: Householder Fellowship.'' The deadline for applying is December 13, 1991. Finalists will be invited to visit ORNL in late January 1992, and the selection committee's decision on the winning candidate will be announced in February 1992. The fellowship will commence in 1992. For further information, contact Richard F. Sincovec by phone at 615-574-3125 or by electronic mail at sincovecrf@ornl.gov. ------------------------------ From: James Demmel Date: Thu, 26 Sep 91 16:38:32 PDT Subject: On the Odor of IEEE Arithmetic This is in response to Joe Grcar's message from NA Digest v. 91, i. 33: IEEE Arithmetic Stinks. There were interesting responses from John Reid and John Nash, and I will not repeat their arguments here. I would instead like to discuss a few of the real ways in which current floating point computing environments remain deficient, and what efforts are being made, or could be made to fix them. I would also like to discuss the SLI arithmetic of Lozier, Olver and Turner, as well the arithmetic of Kulisch and Miranker. A thorough discussion of all these issues would fill a long journal article, so I will necessarily be terse. IEEE Arithmetic: The IEEE standards committee hoped to provide a good scientific computing environment, a complete provision of which would encompass arithmetic hardware, languages, compilers, and OS support. The committee reasonably felt no one would listen if they tried to dictate too many details beyond the "pure" floating point properties, and perhaps also realized they did not have the expertise to make wise decisions covering all these areas. Instead, they made suggestions in the hope that language, compiler and OS support for the various IEEE features would follow. This hope has usually gone unmet. SUN C, SUN Fortran, and the Apple SANE environment have gone farthest in providing a simple programming model for the user, but they have still not gone far enough. No language provides easy (or any) ability to write exception handlers, for example. Joe Grcar may respond that he does not want to even consider writing his own exception handlers. Fine. In that case, it is a simple matter to call a standard SUN library function to enable trapping on all the usual exceptions. That the IEEE default is to continue computing with NANs and infinities instead of trapping is something that people on the IEEE committee were not unanimous about, but providing the usual unforgiving programming model is certainly no problem. (And one could imagine providing a different default environment, where one traps on exceptions, via a compiler flag or some similar device). It is clear that the time is ripe for a proposed standard set of interfaces to features of IEEE (and other!) arithmetics not usually provided for in conventional languages. The recent LCAS proposal of Mary Payne and Brian Wichmann is a recent attempt, even if it had flaws (see the forthcoming paper in the SIGNUM newsletter by W. Kahan on this). Work is continuing on this, in the context of the NCEG (Numerical C Extensions Group, chaired by Jim Thomas - jimt@apple.com) and the LAPACK project (W. Kahan at Berkeley). I hope to speak about this issues in more detail at Supercomputing 91 in November. Joe Grcar mentions the difficulty in doing error analysis, because of infinities and gradual underflow. One reference on error analysis with gradual underflow is my article "Underflow and the Reliability of Numerical Software" in the December 1984 SIAM J. Sci. Stat. Comput. If one wishes to take underflow into account, for most codes the algebraic work involved is essentially the same as for conventional flush-to-zero systems: One uses the error model floating_point_result(a op b) = (a op b)*(1 + delta) + mu where abs(delta) is bounded by machine epsilon and mu is the underflow error. For gradual underflow mu is bounded by the smallest subnormal number, and for flush-to-zero, it is bounded by the underflow threshold, which is larger by the factor 1/machine_precision. It is easy to analyze many codes using this model, and to show that gradual underflow makes them more reliable than flush-to-zero. Indeed, for some codes, like polynomial evaluation, one can show that only with gradual underflow is there a "perfect" backward error in the following sense: The computed answer is the exact answer for a slightly different problem, where the perturbation is in the last few bits of every piece of data, whether it is normalized or not. In contrast, with flush-to-zero, almost no bits in any piece of the data my be significant if all the numbers are too close to the underflow threshold. Grcar is concerned about a product like i=n p = prod a(i) i=1 where an intermediate product underflows (but not to zero) but subsequent a(i) are so large that p is normalized. In this case p could have large relative error without warning. In flush-to-zero, p would be zero, which also has a large relative error. The concern must be over codes that test for underflow by comparing p to zero, and recover somehow; these codes could break with IEEE. Such codes would have to have some a(i) less than 1 and some greater than 1 (otherwise once underflow occurred the final result would still be underflowed and detectable by comparing p against the underflow threshold instead of 0). Thus, to be correct, they would have to have protection against overflow, which complicates the code at least as much as gradual underflow. And to be portable to machines like the Cray or i860 which divide a/b via a*(1/b), they would still have to compare tiny numbers against a threshold instead of against zero to guard against problems. Still, someone might have a code he thinks works, and want to run it with as few changes as possible. Here are two approaches. One way is to use the underflow flag to test for underflow. A second way is to have a trap handler that traps on underflow and substitutes 0, mimicing old fashioned flush-to-zero. Both solutions presume access to these features of IEEE arithmetic, and so if they are not provided the user has a perfect right to complain. Complain about access to features of IEEE arithmetic, that is, not about IEEE arithmetic itself. Finally, the true benefit of IEEE arithmetic will not be seen until significant applications are written using its features to great advantage. One of the goals of LAPACK 2 is to write linear algebra codes using IEEE which are significantly faster and/or more accurate than their conventional counterparts. Simple examples are condition estimation and eigenvector computation: Both involve solving potentially very ill-conditioned triangular systems. Overflow happens occasionally, but is generally NOT a "pathological" situation deserving an exception. For example, it may just mean eigenvalues are close together. Currently, LAPACK codes must scale inside the inner loop to avoid overflow, significantly slowing them down. The alternative is to run at top speed without scaling, test for overflow at the end, and only then run the scaling code if needed. In other words, IEEE provides an essentially free insurance policy against a rare but fatal problem, instead of paying the high premium of always scaling. SLI Arithmetic: Frank Olver responded to Joe Grcar's complaints about IEEE by saying that we should throw out floating point completely and use sli arithmetic. For the uninitiated, SLI arithmetic represents floating point numbers as follows: s1*exp(s2*exp(exp(exp(...(exp(f))...)))) where s1 and s2 are +-1, and f is a fixed point number between 0 and 1. The representation stores s1, s2, f and m, where m is the number of exp(.) functions. With a 64 bit format and m limited to 0..7 (3 bits), the largest number representable is quite unimaginably large. I made a number of arguments against adopting this arithmetic in my paper "On error analysis in arithmetic with varying relative precision", Proc. 8th IEEE Computer Arithmetic Symposium, 1987. Let me summarize these and add a few not presented there. The proponents of this arithmetic like it because it is free of over/underflow (unless one divides by 0 or uses transcendental functions). This, in their view, eliminates the need for infinity arithmetic or denormalized numbers. However, the reason their system cannot over/underflow on +,-,*, or / is that at the extremes of their representation, the representable numbers are so far apart that the correctly rounded value of a+b or a*b is just max(a,b) (assuming a and b are positive). In other words, rounding uncertainty is so large that not only is there no relative precision in the usual sense, but there may not even be any relative precision in the (usual) exponent. Indeed, a + b = a * b = max(a,b) for positive a and b is precisely the semantics of the infinity symbol. I am all for having an infinity symbol, but not a profusion of them as in SLI arithmetic. The problem with expressions like prod(a(i)) possibly losing precision because of denormalized IEEE arithmetic recurs in SLI, but without any possibility of warning. Since numbers gradually lose (all) their relative precision for enormous or tiny magnitudes, one could end up with a modest sized number without any relative accuracy, and without recourse to exception handling or underflow flags as warning. (But see my above mentioned paper for a possible, if too-expensive, hardware fix.) The test of any new arithmetic lies in the existence of applications, even if they are only on paper for lack of an implementation. The only applications of SLI I have seen have been unconvincing small problems for which equal or better alternatives using standard arithmetic exist. Indeed, I think that there is a conservation of effort in writing robust code: any effort saved by SLI in handling over/underflow is lost in dealing with the variable relative precision. Finally, there are better and more easily implemented schemes with similar tradeoffs between exponent range and precision. A paper by H. Hamada of Hitachi in the same IEEE Proceedings describes one such scheme; see my references to Matsui and Iri for another. Hamada's scheme had most of the advantages and fewer of the disadvantages of SLI. Hitachi considered building a machine with Hamada's arithmetic, and to my knowledge abandoned it, reflecting their judgment that it was too slow and had insufficient applications. Kulisch - Miranker Arithmetic There is a proposal circulating to incorporate Kulisch/Miranker style arithmetic as an extension of IEEE floating point arithmetic. I'd like to summarize this proposal briefly and argue against it. Kulisch/Miranker propose to add the ability to compute dot products of arbitrary vectors exactly. In other words, (1,1e38,-1e38).(1,1e38,1e38) = ( 1*1 + 1e38*1e38 ) - 1e38*1e38 = 1 would be computed correctly instead of as 0. This would appear to require carrying the equivalent of at least about 2*2*38+7=159 decimal digits in IEEE single, and far more in double. This has been done in the past in microcode and cache, as on the IBM 4361. They propose to add a single type, DOTPRECISION, which one could only use to accumulate dot products, add and subtract, and round (up, down or to nearest) to the other IEEE precisions. Their motivation is to build algorithms using interval arithmetic with standard precision for most operations and DOTPRECISION for a few critical ones. Their algorithms, as implemented for example in the IBM ACRITH package, claim to solve linear systems and get minimally wide output intervals guaranteed to contain each solution component; i.e. the interval bounds are adjacent floating point numbers. They have similar codes for some eigenproblems, zeros of polynomials, linear programming, and other problems. A conference proceedings on this material may be found in "A new approach to scientific computation", U. Kulisch and W. Miranker, eds., Academic Press, 1983. Some of their claims are exaggerated, as described in "Anomalies in the IBM ACRITH Package," W. Kahan and E. Leblanc, Proc. 7th IEEE Computer Arithmetic Symposium, 1985. Here are the main arguments against adopting Kulisch's scheme: 1) None of their algorithms that I know of depend for correctness on an exact inner product. And their accuracy also depends only weakly on how much extra precision you have: As the precision grows, you can solve more problems to "full accuracy", but the number of these problems is very small and gets smaller as the precision increases. You have to work hard to concoct examples which need anywhere near the precision required for exact inner products in order to be solved fully accurately. So the benefit from all those extra bits is small. 2) Many of their iterative schemes (such as for linear equation solving) depend on working precision Gaussian elimination being accurate enough to get a convergent iteration going. As long as there is a little bit of convergence, they compute residuals accurately to get lots of accuracy in the answer. But if the working precision Gaussian elimination is not good enough, their extra precision does not help them. You would need to do the Gaussian elimination itself in higher precision, which they cannot do straightforwardly with their proposed data types and instruction set. (They appear to do higher precision Gaussian elimination in some cases in later ACRITH releases, although this is just speculation since the details are proprietary. The only way I can imagine doing it using their arithmetic is quite cumbersome, in contrast to using higher precision arithmetic directly.) 3) They do not address exponent range limitations, which are at least as problematic in some software development as accuracy limitations. This is true in polynomial evaluation, for example. 4) There is really a statistical ("RISC vs CISC") question underlying this: Of the following reasons for failure to get a good enough answer: (1) not enough precision in the residuals for iterative refinement to work (2) not enough precision in the initial Gaussian elimination to get convergence (however slow) (3) exponent range limitation which is most likely to cause failure? Kulisch and Miranker only address (1). Quadruple precision addresses all three, but (1) to a somewhat weaker extent than Kulisch and Miranker. I do not have statistics (I doubt anyone does) but I feel strongly that the extra problems one can solve with quad because it deals with (2) and (3) far outweigh the problems Kulisch can solve but quad can't because of (1). 5) There are faster and simpler algorithms using (I believe) the same hardware resources which they cannot use, or prefer not to use, because they limit themselves to dot products. Again, consider polynomial evaluation: if one could multiply a long accumulator by a single precision variable, using a loop like: real s,tmp DOTPRECISION d1, d1 /* compute d2=s*d1 */ d2=0 repeat tmp=d1 d1=d1-tmp d2=d2+s*tmp until d1=0 then one could just use the standard algorithm to evaluate polynomials (Horner's rule) and it would work. Instead, they need a complicated library routine. In other words, their proposed instruction set makes programmers have to completely rethink things unnecessarily. So from a software point of view, it is much easier to develop algorithms using quad than shoehorning everything into dot products. Kahan has more to say on this in his article cited above. 6) Trying to get tight guaranteed error bounds is a laudable goal, and interval arithmetic is a valuable tool, but it will not be possible to do this reliably in systems without variable precision. The reason is that if one tries to build larger applications out of interval building blocks (like ACRITH routines), then even if the input data is exact (0-width intervals), the output of the first interval routine will be intervals. These intervals are then fed in to a second routine. How wide must the output intervals of the second routine be? It is easy to show that in general they must be COND2 times wider than the input intervals, where COND2 is the condition number of the problem being solved by the second routine. If these intervals are then used as input to a third routine, their width will be magnified by another condition number factor COND3. If we are computing with a fixed maximum precision, eventually these products of COND factors will make the intermediate intervals so large as to be useless. The only way around this is to completely reformulate the problem to avoid these condition numbers, which may be very difficult and time-consuming (and defeats the purpose of using these libraries, which is getting the answer faster as well as more accurately), or else to use variable precision. Tom Hull at the University of Toronto, Siegfried Rump at TU Hamburg, and others have proposed and sometimes built such systems. Thus, I do not foresee wide use of interval arithmetic until variable precision is adequately supported, and this cannot be done without a more major architectural revision than a small change in the IEEE floating point standard. Jim Demmel Computer Science Division and Mathematics Department University of California Berkeley, CA 94720 demmel@cs.berkeley.edu ------------------------------ End of NA Digest ************************** -------