References

1: E. Anderson, A. Benzoni, J. J. Dongarra, S. Moulton, S. Ostrouchov, B. Tourancheau, and R. van de Geijn. LAPACK for distributed memory architectures: Progress report. In Parallel Processing for Scientific Computing, Fifth SIAM Conference. SIAM, 1991.
2: E. Anderson and J. Dongarra. Results from the initial release of LAPACK. Technical Report LAPACK working note 16, Computer Science Department, University of Tennessee, Knoxville, TN, 1989.
3: E. Anderson and J. Dongarra. Evaluating block algorithm variants in LAPACK. Technical Report LAPACK working note 19, Computer Science Department, University of Tennessee, Knoxville, TN, 1990.
4: C. C. Ashcraft. The distributed solution of linear systems using the torus wrap data mapping. Engineering Computing and Analysis Technical Report ECA-TR-147, Boeing Computer Services, 1990.
5: C. C. Ashcraft. A taxonomy of distributed dense LU factorization methods. Engineering Computing and Analysis Technical Report ECA-TR-161, Boeing Computer Services, 1991.
6: M. Barnett, D. G. Payne, and R. van de Geijn. Broadcasting on meshes with worm-hole routing. Technical report, Department of Computer Science, University of Texas at Austin, April 1993. Submitted to Supercomputing '93.
7: W. S. Brainerd, C. H. Goldberg, and J. C. Adams. Programmer's Guide to Fortran 90. McGraw-Hill, New York, 1990.
8: R. P. Brent. The LINPACK benchmark for the Fujitsu AP 1000. In Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation, pages 128-135. IEEE Computer Society Press, 1992.
9: R. P. Brent. The LINPACK benchmark on the AP 1000: Preliminary report. In Proceedings of the 2nd CAP Workshop, November 1991.
10: J. Choi, J. J. Dongarra, R. Pozo, and D. W. Walker. ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers. In Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation, pages 120-127. IEEE Computer Society Press, 1992.
11: J. Choi, J. J. Dongarra, and D. W. Walker. The design of scalable software libraries for distributed memory concurrent computers. In J. J. Dongarra and B. Tourancheau, editors, Environments and Tools for Parallel Scientific Computing. Elsevier Science Publishers, 1993.
12: E. Chu and A. George. Gaussian elimination with partial pivoting and load balancing on a multiprocessor. Parallel Computing, 5:65-74, 1987.
13: D. E. Culler, A. Dusseau, S. C. Goldstein, A. Krishnamurthy, S. Lumetta, T. von Eicken, and K. Yelick. Introduction to Split-C: Version 0.9. Technical report, Computer Science Division - EECS, University of California, Berkeley, CA 94720, February 1993.
14: J. Demmel. LAPACK: A portable linear algebra library for supercomputers. In Proceedings of the 1989 IEEE Control Systems Society Workshop on Computer-Aided Control System Design, December 1989.
15: J. J. Dongarra. Increasing the performance of mathematical software through high-level modularity. In Proc. Sixth Int. Symp. Comp. Methods in Eng. & Applied Sciences, Versailles, France, pages 239-248. North-Holland, 1984.
16: J. J. Dongarra. Workshop on the BLACS. Computer Science Dept. Technical Report CS-91-134, University of Tennessee, Knoxville, TN, May 1991. (LAPACK Working Note #34).
17: J. J. Dongarra, J. Du Croz, S. Hammarling, and I. Duff. A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software, 16(1):1-17, 1990.
18: J. J. Dongarra, J. Du Croz, S. Hammarling, and R. Hanson. An extended set of Fortran basic linear algebra subprograms. ACM Transactions on Mathematical Software, 14(1):1-17, March 1988.
19: J. J. Dongarra, I. S. Duff, D. C. Sorensen, and H. A. Van der Vorst. Solving Linear Systems on Vector and Shared Memory Computers. SIAM Publications, Philadelphia, PA, 1991.
20: J. J. Dongarra, R. Hempel, A. J. G. Hey, and D. W. Walker. A proposal for a user-level message passing interface in a distributed memory environment. Technical Report TM-12231, Oak Ridge National Laboratory, February 1993.
21: J. J. Dongarra, Peter Mayes, and Giuseppe Radicati di Brozolo. The IBM RISC System/6000 and linear algebra operations. Supercomputer, 44(VIII-4):15-30, 1991.
22: J. J. Dongarra and S. Ostrouchov. LAPACK block factorization algorithms on the Intel iPSC/860. Technical Report CS-90-115, University of Tennessee at Knoxville, Computer Science Department, October 1990.
23: J. J. Dongarra, R. Pozo, and D. W. Walker. An object oriented design for high performance linear algebra on distributed memory architectures. In Proceedings of the Object Oriented Numerics Conference, 1993.
24: J. J. Dongarra, R. van de Geijn, and D. W. Walker. A look at scalable dense linear algebra libraries. In IEEE, editor, Proceedings of the Scalable High-Performance Computing Conference, pages 372-379. IEEE Publishers, 1992.
25: J. J. Dongarra and R. A. van de Geijn. Two-dimensional basic linear algebra communication subprograms. Technical Report LAPACK working note 37, Computer Science Department, University of Tennessee, Knoxville, TN, October 1991.
26: J. J. Dongarra and R. A. van de Geijn. Reduction to condensed form for the eigenvalue problem on distributed memory architectures. Parallel Computing, 18:973-982, 1992.
27: T. H. Dunigan. Communication performance of the Intel Touchstone Delta mesh. Technical Report TM-11983, Oak Ridge National Laboratory, January 1992.
28: A. Edelman. Large dense numerical linear algebra in 1993: The parallel computing influence. International Journal of Supercomputer Applications, 1993. Accepted for publication.
29: E. W. Felten and S. W. Otto. Coherent parallel C. In G. C. Fox, editor, Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, pages 440-450. ACM Press, 1988.
30: G. C. Fox, M. A. Johnson, G. A. Lyzenga, S. W. Otto, J. K. Salmon, and D. W. Walker. Solving Problems on Concurrent Processors, volume 1. Prentice Hall, Englewood Cliffs, N.J., 1988.
31: K. Gallivan, R. Plemmons, and A. Sameh. Parallel algorithms for dense linear algebra computations. SIAM Review, 32(1):54-135, 1990.
32: A. Geist and M. Heath. Matrix factorization on a hypercube multiprocessor. In M. Heath, editor, Hypercube Multiprocessors, 1986, pages 161-180, Philadelphia, PA, 1986. Society for Industrial and Applied Mathematics.
33: A. Geist and C. Romine. LU factorization algorithms on distributed-memory multiprocessor architectures. SIAM J. Sci. Statist. Comput., 9(4):639-649, July 1988.
34: R. Harrington. Origin and development of the method of moments for field computation. IEEE Antennas and Propagation Magazine, June 1990.
35: B. Hendrickson and D. Womble. The torus-wrap mapping for dense matrix computations on massively parallel computers. Technical Report SAND92-0792, Sandia National Laboratories, April 1992.
36: J. L. Hess. Panel methods in computational fluid dynamics. Annual Review of Fluid Mechanics, 22:255-274, 1990.
37: J. L. Hess and M. O. Smith. Calculation of potential flows about arbitrary bodies. In D. Küchemann, editor, Progress in Aeronautical Sciences, Volume 8. Pergamon Press, 1967.
38: High Performance Fortran Forum. High Performance Fortran Language Specification, Version 1.0, January 1993.
39: R. W. Hockney and C. R. Jesshope. Parallel Computers. Adam Hilger Ltd., Bristol, UK, 1981.
40: C. Lawson, R. Hanson, D. Kincaid, and F. Krogh. Basic linear algebra subprograms for Fortran usage. ACM Trans. Math. Softw., 5:308-323, 1979.
41: W. Lichtenstein and S. L. Johnsson. Block-cyclic dense linear algebra. Technical Report TR-04-92, Harvard University, Center for Research in Computing Technology, January 1992.
42: M. Lin, D. Du, A. E. Klietz, and S. Saroff. Performance evaluation of the CM-5 interconnection network. Technical report, Department of Computer Science, University of Minnesota, 1992.
43: R. Ponnusamy, A. Choudhary, and G. Fox. Communication overhead on CM-5: An experimental performance evaluation. In Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation, pages 108-115. IEEE Computer Society Press, 1992.
44: Y. Saad and M. H. Schultz. Parallel direct methods for solving banded linear systems. Technical Report YALEU/DCS/RR-387, Department of Computer Science, Yale University, 1985.
45: S. R. Seidel. Broadcasting on linear arrays and meshes. Technical Report TM-12356, Oak Ridge National Laboratory, April 1993.
46: A. Skjellum and A. Leung. LU factorization of sparse, unsymmetric, Jacobian matrices on multicomputers. In D. W. Walker and Q. F. Stout, editors, Proceedings of the Fifth Distributed Memory Concurrent Computing Conference, pages 328-337. IEEE Press, 1990.
47: R. A. van de Geijn. Massively parallel LINPACK benchmark on the Intel Touchstone Delta and iPSC/860 systems. Computer Science report TR-91-28, Univ. of Texas, 1991.
48: E. F. Van de Velde. Data redistribution and concurrency. Parallel Computing, 16, December 1990.
49: J. J. H. Wang. Generalized Moment Methods in Electromagnetics. John Wiley & Sons, New York, 1991.
50: J. Wilkinson and C. Reinsch. Handbook for Automatic Computation: Volume II - Linear Algebra. Springer-Verlag, New York, 1971.

Jack Dongarra
Sun Feb 9 10:05:05 EST 1997