 
  
  
  
  
 
 
 
References
- 1
- 
M. ABOELAZE, N. CHRISOCHOIDES, AND E. HOUSTIS, The Parallelization
  of Level 2 and 3 BLAS Operations on Distributed Memory Machines, Tech. Rep.
  CSD-TR-91-007, Purdue University, West Lafayette, IN, 1991.
 
- 2
- 
R. AGARWAL, F. GUSTAVSON, AND M. ZUBAIR, Improving Performance of
  Linear Algebra Algorithms for Dense Matrices Using Algorithmic Prefetching,
  IBM J. Res. Dev., 38 (1994), pp. 265-275.
 
- 3
- 
E. ANDERSON, Z. BAI, C. BISCHOF, J. DEMMEL, J. DONGARRA, J. DU CROZ, A. GREENBAUM, S. HAMMARLING, A. MCKENNEY, S. OSTROUCHOV, AND D. SORENSEN,
  LAPACK Users' Guide, Society for Industrial and Applied Mathematics,
  Philadelphia, PA, second ed., 1995.
 
- 4
- 
E. ANDERSON, Z. BAI, C. BISCHOF, J. DEMMEL, J. DONGARRA, J. DU CROZ, A. GREENBAUM, S. HAMMARLING, A. MCKENNEY, AND D. SORENSEN, LAPACK: A
  portable linear algebra library for high-performance computers, Computer
  Science Dept. Technical Report CS-90-105, University of Tennessee,
  Knoxville, TN, May 1990.
(Also LAPACK Working Note #20).
 
- 5
- 
E. ANDERSON, Z. BAI, AND J. DONGARRA, Generalized QR factorization
  and its applications, Linear Algebra and Its Applications, 162-164 (1992),
  pp. 243-273.
(Also LAPACK Working Note #31).
 
- 6
- 
I. ANGUS, G. FOX, J. KIM, AND D. WALKER, Solving Problems on
  Concurrent Processors: Software for Concurrent Processors, vol. 2, Prentice
  Hall, Englewood Cliffs, NJ, 1990.
 
- 7
- 
ANSI/IEEE, 
  IEEE Standard for Binary Floating Point Arithmetic, New York, Std
  754-1985 ed., 1985.
 
- 8
- 
ANSI/IEEE,
  IEEE Standard for Radix Independent Floating Point Arithmetic, New
  York, Std 854-1987 ed., 1987.
 
- 9
- 
M. ARIOLI, J. W. DEMMEL, AND I. S. DUFF, Solving sparse linear
  systems with sparse backward error, SIAM J. Matrix Anal. Appl., 10 (1989),
  pp. 165-190.
 
- 10
- 
C. ASHCRAFT, The Distributed Solution of Linear Systems Using the
  Torus-wrap Data mapping, Tech. Rep. ECA-TR-147, Boeing Computer Services,
  Seattle, WA, 1990.
 
- 11
- 
Z. BAI AND J. DEMMEL, Design of a parallel nonsymmetric
  eigenroutine toolbox, Part I, in Proceedings of the Sixth SIAM
  Conference on Parallel Processing for Scientific Computing, SIAM, 1993,
  pp. 391-398.
 
- 12
- 
Z. BAI AND J. DEMMEL, Using the matrix sign function to compute
  invariant subspaces, SIAM J. Matrix Anal. Appl., x (1997), p. xxx.
to appear.
 
- 13
- 
Z. BAI, J. DEMMEL, J. DONGARRA, A. PETITET, H. ROBINSON, AND K. STANLEY,
  The spectral decomposition of nonsymmetric matrices on distributed
  memory computers, Computer Science Dept. Technical Report
  CS-95-273, University of Tennessee, Knoxville, TN, 1995.
(Also LAPACK Working Note No. 91), To appear in SIAM J. Sci.
  Stat. Comput.
 
- 14
- 
Z. BAI AND J. W. DEMMEL, Design of a parallel nonsymmetric
  eigenroutine toolbox, Part I, in Proceedings of the Sixth SIAM
  Conference on Parallel Processing for Scientific Computing, R. F.
  Sincovec et al., eds., Philadelphia, PA, 1993, Society for Industrial and
  Applied Mathematics, pp. 391-398.
Long version available as Computer Science Report CSD-92-718,
  University of California, Berkeley, 1992.
 
- 15
- 
J. BARLOW AND J. DEMMEL, Computing accurate eigensystems of scaled
  diagonally dominant matrices, SIAM J. Num. Anal., 27 (1990), pp. 762-791.
(Also LAPACK Working Note #7).
 
- 16
- 
J. BILMES, K. ASANOVIC, J. DEMMEL, D. LAM, AND C. CHIN, Optimizing
  matrix multiply using PHiPAC: A portable, high-performance, ANSI C
  coding methodology, Computer Science Dept. Technical Report
  CS-96-326, University of Tennessee, Knoxville, TN, 1996.
(Also LAPACK Working Note #111).
 
- 17
- 
R. H. BISSELING AND J. G. G. VAN DE VORST, Parallel LU
  decomposition on a transputer network, in Lecture Notes in Computer Science,
  Number 384, G. A. van Zee and J. G. G. van de Vorst, eds.,
  Springer-Verlag, 1989, pp. 61-77.
 
- 18
- 
L. S. BLACKFORD, J. CHOI, A. CLEARY, J. DEMMEL, I. DHILLON, J. J. DONGARRA, S. HAMMARLING, G. HENRY, A. PETITET, K. STANLEY, D. W. WALKER, AND R. C. WHALEY, ScaLAPACK: A portable linear algebra library for
  distributed memory computers - design issues and performance, in Proceedings
  of Supercomputing '96, Sponsored by ACM SIGARCH and IEEE Computer Society,
  1996.
(ACM Order Number: 415962, IEEE Computer Society Press Order Number:
  RS00126. http://www.supercomp.org/sc96/proceedings/).
 
- 19
- 
L. S. BLACKFORD, A. CLEARY, J. DEMMEL, I. DHILLON, J. DONGARRA, S. HAMMARLING, A. PETITET, H. REN, K. STANLEY, AND R. C. WHALEY, 
  Practical experience in the dangers of heterogeneous computing, Computer
  Science Dept. Technical Report CS-96-330, University of Tennessee,
  Knoxville, TN, July 1996.
(Also LAPACK Working Note #112), to appear in ACM Trans. Math. Softw.,
  1997.
 
- 20
- 
R. BRENT, The LINPACK Benchmark on the AP 1000, in Frontiers,
  1992, McLean, VA, 1992, pp. 128-135.
 
- 21
- 
R. BRENT AND P. STRAZDINS, Implementation of BLAS Level 3 and
  LINPACK Benchmark on the AP1000, Fujitsu Scientific and Technical Journal,
  5 (1993), pp. 61-70.
 
- 22
- 
S. BROWNE, J. DONGARRA, S. GREEN, E. GROSSE, K. MOORE, T. ROWAN, AND R. WADE, Netlib services and resources (rev. 1), Computer Science
  Dept. Technical Report CS-94-222, University of Tennessee, Knoxville,
  TN, 1994.
 
- 23
- 
S. BROWNE, J. DONGARRA, E. GROSSE, AND T. ROWAN, The netlib
  mathematical software repository, D-Lib Magazine (www.dlib.org),  (1995).
 
- 24
- 
J. CHOI, J. DEMMEL, I. DHILLON, J. DONGARRA, S. OSTROUCHOV, A. PETITET, K. STANLEY, D. WALKER, AND R. C. WHALEY, Installation guide for
  ScaLAPACK, Computer Science Dept. Technical Report CS-95-280,
  University of Tennessee, Knoxville, TN, March 1995.
(Also LAPACK Working Note #93).
 
- 25
- 
J. CHOI, J. DEMMEL, I. DHILLON, J. DONGARRA, S. OSTROUCHOV, A. PETITET, K. STANLEY, D. WALKER, AND R. C. WHALEY, ScaLAPACK: A
  portable linear algebra library for distributed memory computers - design
  issues and performance, Computer Science Dept. Technical Report
  CS-95-283, University of Tennessee, Knoxville, TN, March 1995.
(Also LAPACK Working Note #95).
 
- 26
- 
J. CHOI, J. DONGARRA, S. OSTROUCHOV, A. PETITET, D. WALKER, AND R. C. WHALEY, A proposal for a set of parallel basic linear algebra
  subprograms, Computer Science Dept. Technical Report CS-95-292,
  University of Tennessee, Knoxville, TN, May 1995.
(Also LAPACK Working Note #100).
 
- 27
- 
J. CHOI, J. DONGARRA, R. POZO, AND D. WALKER, ScaLAPACK: A
  scalable linear algebra library for distributed memory concurrent computers,
  in Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel
  Computation, McLean, Virginia, 1992, IEEE Computer Society Press,
  pp. 120-127.
(Also LAPACK Working Note #55).
 
- 28
- 
J. CHOI, J. DONGARRA, AND D. WALKER, The design of a parallel dense
  linear algebra software library: Reduction to Hessenberg, tridiagonal and
  bidiagonal form, Numerical Algorithms, 10 (1995), pp. 379-399.
(Also LAPACK Working Note #92).
 
- 29
- 
J. CHOI, J. DONGARRA, AND D. WALKER, PB-BLAS: A Set of Parallel
  Block Basic Linear Algebra Subroutines, Concurrency: Practice and
  Experience, 8 (1996), pp. 517-535.
 
- 30
- 
A. CHTCHELKANOVA, J. GUNNELS, G. MORROW, J. OVERFELT, AND R. VAN DE GEIJN, Parallel Implementation of BLAS: General Techniques for Level 3
  BLAS, Tech. Rep. TR95-49, Department of Computer Sciences, UT-Austin, 1995.
Submitted to Concurrency: Practice and Experience.
 
- 31
- 
E. CHU AND A. GEORGE, QR Factorization of a Dense Matrix on a
  Hypercube Multiprocessor, SIAM Journal on Scientific and Statistical
  Computing, 11 (1990), pp. 990-1028.
 
- 32
- 
A. CLEARY AND J. DONGARRA, Implementation in ScaLAPACK of
  divide-and-conquer algorithms for banded and tridiagonal linear systems,
  Computer Science Dept. Technical Report CS-97-358, University of
  Tennessee, Knoxville, TN, April 1997.
(Also LAPACK Working Note #125).
 
- 33
- 
M. COSNARD, Y. ROBERT, P. QUINTON, AND M. TCHUENTE, eds., Parallel
  Algorithms and Architectures, North-Holland, 1986.
 
- 34
- 
D. E. CULLER, A. ARPACI-DUSSEAU, R. ARPACI-DUSSEAU, B. CHUN, S. LUMETTA, A. MAINWARING, R. MARTIN, C. YOSHIKAWA, AND F. WONG, Parallel computing
  on the Berkeley NOW.
To appear in JSPP'97 (9th Joint Symposium on Parallel Processing),
  Kobe, Japan, 1997.
 
- 35
- 
M. DAYDE, I. DUFF, AND A. PETITET, A Parallel Block Implementation
  of Level 3 BLAS for MIMD Vector Processors, ACM Trans. Math. Softw., 20
  (1994), pp. 178-193.
 
- 36
- 
B. DE MOOR AND P. VAN DOOREN, Generalization of the singular value
  and QR decompositions, SIAM J. Matrix Anal. Appl., 13 (1992),
  pp. 993-1014.
 
- 37
- 
J. DEMMEL, Underflow and the reliability of numerical software,
  SIAM J. Sci. Stat. Comput., 5 (1984), pp. 887-919.
 
- 38
- 
J. DEMMEL, Applied Numerical
  Linear Algebra, SIAM, 1996.
to appear.
 
- 39
- 
J. DEMMEL, S. EISENSTAT, J. GILBERT, X. LI, AND J. W. H. LIU, A
  supernodal approach to sparse partial pivoting, Technical Report
  UCB//CSD-95-883, UC Berkeley Computer Science Division, September 1995.
To appear in SIAM J. Matrix Anal. Appl.
 
- 40
- 
J. DEMMEL AND K. STANLEY, The performance of finding eigenvalues and
  eigenvectors of dense symmetric matrices on distributed memory computers,
  Computer Science Dept. Technical Report CS-94-254, University of
  Tennessee, Knoxville, TN, September 1994.
(Also LAPACK Working Note #86).
 
- 41
- 
J. W. DEMMEL, J. R. GILBERT, AND X. S. LI, An asynchronous parallel
  supernodal algorithm for sparse Gaussian elimination, February 1997.
Submitted to SIAM J. Matrix Anal. Appl., special issue on Sparse and
  Structured Matrix Computations and Their Applications (Also LAPACK Working
  Note 124).
 
- 42
- 
J. W. DEMMEL AND X. LI, Faster numerical algorithms via exception
  handling, IEEE Trans. Comp., 43 (1994), pp. 983-992.
(Also LAPACK Working Note #59).
 
- 43
- 
I. S. DHILLON, Current inverse iteration software can fail,
  (1997).
Submitted for publication.
 
- 44
- 
I. S. DHILLON, A Stable  Algorithm for the Symmetric Tridiagonal Eigenproblem, PhD thesis, University
  of California, Berkeley, CA, May 1997.
 
- 45
- 
I. S. DHILLON AND B. PARLETT, Orthogonal eigenvectors without
  Gram-Schmidt,  (1997).
draft.
 
- 46
- 
J. DONGARRA AND T. DUNIGAN, Message-passing performance of various
  computers, Tech. Rep. ORNL/TM-13006, Oak Ridge National Laboratory, Oak
  Ridge, TN, 1996.
Submitted and accepted to Concurrency: Practice and Experience.
 
- 47
- 
J. DONGARRA, S. HAMMARLING, AND D. WALKER, Key Concepts for Parallel
  Out-Of-Core LU Factorization, Society for Industrial and Applied
  Mathematics, Philadelphia, PA, 1996.
(Also LAPACK Working Note #110).
 
- 48
- 
J. DONGARRA, G. HENRY, AND D. WATKINS, A distributed memory
  implementation of the nonsymmetric QR algorithm, in Proceedings of the
  Eighth SIAM Conference on Parallel Processing for Scientific Computing,
  Philadelphia, PA, 1997, Society for Industrial and Applied Mathematics.
 
- 49
- 
J. DONGARRA, C. RANDRIAMARO, L. PRYLLI, AND B. TOURANCHEAU, Array
  redistribution in ScaLAPACK using PVM, in EuroPVM users' group,
  Hermes, 1995.
 
- 50
- 
J. DONGARRA AND R. VAN DE GEIJN, Two dimensional basic linear
  algebra communication subprograms, Computer Science Dept. Technical
  Report CS-91-138, University of Tennessee, Knoxville, TN, 1991.
(Also LAPACK Working Note #37).
 
- 51
- 
J. DONGARRA, R. VAN DE GEIJN, AND D. WALKER, Scalability issues in
  the design of a library for dense linear algebra, Journal of Parallel and
  Distributed Computing, 22 (1994), pp. 523-537.
(Also LAPACK Working Note #43).
 
- 52
- 
J. DONGARRA, R. VAN DE GEIJN, AND R. C. WHALEY, Two dimensional
  basic linear algebra communication subprograms, in Environments and Tools
  for Parallel Scientific Computing, Advances in Parallel Computing,
  J. Dongarra and B. Tourancheau, eds., vol. 6, Elsevier Science Publishers
  B.V., 1993, pp. 31-40.
 
- 53
- 
J. DONGARRA AND D. WALKER, Software libraries for linear algebra
  computations on high performance computers, SIAM Review, 37 (1995),
  pp. 151-180.
 
- 54
- 
J. DONGARRA AND R. C. WHALEY, A user's guide to the BLACS v1.1,
  Computer Science Dept. Technical Report CS-95-281, University of
  Tennessee, Knoxville, TN, 1995.
(Also LAPACK Working Note #94).
 
- 55
- 
J. J. DONGARRA AND E. F. D'AZEVEDO, The design and implementation of
  the parallel out-of-core ScaLAPACK LU, QR, and Cholesky factorization
  routines, Department of Computer Science Technical Report
  CS-97-347, University of Tennessee, Knoxville, TN, 1997.
(Also LAPACK Working Note 118).
 
- 56
- 
J. J. DONGARRA, J. DU CROZ, I. S. DUFF, AND S. HAMMARLING, Algorithm
  679: A set of Level 3 Basic Linear Algebra Subprograms, ACM
  Trans. Math. Soft., 16 (1990), pp. 18-28.
 
- 57
- 
J. J. DONGARRA, J. DU CROZ, I. S. DUFF, AND S. HAMMARLING, A set of Level 3
  Basic Linear Algebra Subprograms, ACM Trans. Math. Soft., 16
  (1990), pp. 1-17.
 
- 58
- 
J. J. DONGARRA, J. DU CROZ, S. HAMMARLING, AND R. J. HANSON, 
  Algorithm 656: An extended set of FORTRAN Basic Linear Algebra
  Subroutines, ACM Trans. Math. Soft., 14 (1988), pp. 18-32.
 
- 59
- 
J. J. DONGARRA, J. DU CROZ, S. HAMMARLING, AND R. J. HANSON, An extended set of
  FORTRAN basic linear algebra subroutines, ACM Trans. Math. Soft., 14
  (1988), pp. 1-17.
 
- 60
- 
J. J. DONGARRA AND E. GROSSE, Distribution of mathematical software
  via electronic mail, Communications of the ACM, 30 (1987), pp. 403-407.
 
- 61
- 
J. J. DONGARRA, R. VAN DE GEIJN, AND D. W. WALKER, A look at
  scalable dense linear algebra libraries, in Proceedings of the Scalable
  High-Performance Computing Conference, IEEE, ed., IEEE Publishers, 1992,
  pp. 372-379.
 
- 62
- 
J. DU CROZ AND N. J. HIGHAM, Stability of methods for matrix
  inversion, IMA J. Numer. Anal., 12 (1992), pp. 1-19.
(Also LAPACK Working Note #27).
 
- 63
- 
R. FALGOUT, A. SKJELLUM, S. SMITH, AND C. STILL, The Multicomputer
  Toolbox Approach to Concurrent BLAS and LACS, in Proceedings of the
  Scalable High Performance Computing Conference SHPCC-92, IEEE Computer
  Society Press, 1992.
 
- 64
- 
MESSAGE PASSING INTERFACE FORUM, MPI: A message passing interface standard,
  International Journal of Supercomputer Applications and High Performance
  Computing, 8 (1994), pp. 3-4.
Special issue on MPI. Also available electronically at
  ftp://www.netlib.org/mpi/mpi-report.ps .
 
- 65
- 
G. FOX, M. JOHNSON, G. LYZENGA, S. OTTO, J. SALMON, AND D. WALKER, 
  Solving Problems on Concurrent Processors, Volume 1, Prentice-Hall,
  Englewood Cliffs, NJ, 1988.
 
- 66
- 
G. FOX, R. WILLIAMS, AND P. MESSINA, Parallel Computing Works!,
  Morgan Kaufmann Publishers, Inc., San Francisco, CA, 1994.
 
- 67
- 
T. L. FREEMAN AND C. PHILLIPS, Parallel Numerical Algorithms,
  Prentice-Hall, Hemel Hempstead, Hertfordshire, UK, 1992.
 
- 68
- 
A. GEIST, A. BEGUELIN, J. DONGARRA, W. JIANG, R. MANCHEK, AND V. SUNDERAM, PVM: Parallel Virtual Machine. A Users' Guide and
  Tutorial for Networked Parallel Computing, MIT Press, Cambridge, MA, 1994.
 
- 69
- 
G. GEIST AND C. ROMINE, LU factorization algorithms on distributed
  memory multiprocessor architectures, SIAM J. Sci. Stat. Comput., 9 (1988),
  pp. 639-649.
 
- 70
- 
G. GOLUB AND C. VAN LOAN, Matrix Computations, Johns Hopkins,
  Baltimore, second ed., 1989.
 
- 71
- 
G. GOLUB AND C. F. VAN LOAN, Matrix Computations, Johns Hopkins
  University Press, Baltimore, MD, third ed., 1996.
 
- 72
- 
W. W. HAGER, Condition estimators, SIAM J. Sci. Stat. Comput., 5
  (1984), pp. 311-316.
 
- 73
- 
S. HAMMARLING, The numerical solution of the general
  Gauss-Markov linear model, in Mathematics in Signal Processing,
  T. S. Durani et al., eds., Clarendon Press, Oxford, UK, 1986.
 
- 74
- 
R. HANSON, F. KROGH, AND C. LAWSON, A proposal for standard linear
  algebra subprograms, ACM SIGNUM Newsl., 8 (1973).
 
- 75
- 
P. HATCHER AND M. QUINN, Data-Parallel Programming On MIMD
  Computers, The MIT Press, Cambridge, Massachusetts, 1991.
 
- 76
- 
B. HENDRICKSON AND D. WOMBLE, The torus-wrap mapping for dense
  matrix calculations on massively parallel computers, SIAM J. Sci. Stat.
  Comput., 15 (1994), pp. 1201-1226.
 
- 77
- 
G. HENRY, Improving Data Re-Use in Eigenvalue-Related Computations,
  PhD thesis, Cornell University, Ithaca, NY, January 1994.
 
- 78
- 
G. HENRY AND R. VAN DE GEIJN, Parallelizing the QR algorithm for
  the unsymmetric algebraic eigenvalue problem: Myths and reality, SIAM J.
  Sci. Comput., 17 (1996), pp. 870-883.
(Also LAPACK Working Note 79).
 
- 79
- 
G. HENRY, D. WATKINS, AND J. DONGARRA, A parallel implementation of
  the nonsymmetric QR algorithm for distributed memory architectures,
  Computer Science Dept. Technical Report CS-97-352, University of
  Tennessee, Knoxville, TN, March 1997.
(Also LAPACK Working Note # 121).
 
- 80
- 
N. J. HIGHAM, A survey of condition number estimation for triangular
  matrices, SIAM Review, 29 (1987), pp. 575-596.
 
- 81
- 
N. J. HIGHAM, FORTRAN codes for
  estimating the one-norm of a real or complex matrix, with applications to
  condition estimation, ACM Trans. Math. Softw., 14 (1988), pp. 381-396.
 
- 82
- 
N. J. HIGHAM, Experience with a
  matrix norm estimator, SIAM J. Sci. Stat. Comput., 11 (1990),
  pp. 804-809.
 
- 83
- 
N. J. HIGHAM, Perturbation theory
  and backward error for AX-XB=C, BIT, 33 (1993), pp. 124-136.
 
- 84
- 
N. J. HIGHAM, Accuracy and
  Stability of Numerical Algorithms, Society for Industrial and Applied
  Mathematics, Philadelphia, PA, 1996.
 
- 85
- 
S. HUSS-LEDERMAN, E. JACOBSON, A. TSAO, AND G. ZHANG, Matrix
  Multiplication on the Intel Touchstone DELTA, Concurrency: Practice and
  Experience, 6 (1994), pp. 571-594.
 
- 86
- 
S. HUSS-LEDERMAN, A. TSAO, AND G. ZHANG, A parallel implementation
  of the invariant subspace decomposition algorithm for dense symmetric
  matrices, in Proceedings of the Sixth SIAM Conference on Parallel
  Processing for Scientific Computing, SIAM, 1993, pp. 367-374.
 
- 87
- 
K. HWANG, Advanced Computer Architecture: Parallelism, Scalability,
  Programmability, McGraw-Hill, 1993.
 
- 88
- 
IBM CORPORATION, IBM RS6000, 1996.
 (URL = http://www.rs6000.ibm.com/).
 
- 89
- 
INTEL CORPORATION, Intel Supercomputer Technical Publications
  Home Page, 1995.
 (URL = http://www.ssd.intel.com/pubs.html).
 
- 90
- 
B. KÅGSTRÖM, P. LING, AND C. VAN LOAN, GEMM-based level 3
  BLAS: High-performance model implementations and performance evaluation
  benchmark, Tech. Rep. UMINF 95-18, Department of Computing Science, Umeå
  University, 1995.
Submitted to ACM Trans. Math. Softw.
 
- 91
- 
C. KOELBEL, D. LOVEMAN, R. SCHREIBER, G. STEELE, AND M. ZOSEL, The
  High Performance Fortran Handbook, MIT Press, Cambridge, Massachusetts,
  1994.
 
- 92
- 
V. KUMAR, A. GRAMA, A. GUPTA, AND G. KARYPIS, Introduction to
  Parallel Computing - Design and Analysis of Algorithms, The
  Benjamin/Cummings Publishing Company, Inc., Redwood City, CA, 1994.
 
- 93
- 
C. L. LAWSON, R. J. HANSON, D. KINCAID, AND F. T. KROGH, Basic
  linear algebra subprograms for Fortran usage, ACM Trans. Math. Soft., 5
  (1979), pp. 308-323.
 
- 94
- 
R. LEHOUCQ, The computation of elementary unitary matrices,
  Computer Science Dept. Technical Report CS-94-233, University of
  Tennessee, Knoxville, TN, 1994.
(Also LAPACK Working Note 72).
 
- 95
- 
T. LEWIS AND H. EL-REWINI, Introduction to Parallel Computing,
  Prentice-Hall, Inc., Englewood Cliffs, NJ, 1992.
 
- 96
- 
X. LI, Sparse Gaussian Elimination on High Performance Computers,
  PhD thesis, Computer Science Division, Department of Electrical Engineering
  and Computer Science, University of California, Berkeley, CA, September 1996.
 
- 97
- 
W. LICHTENSTEIN AND S. L. JOHNSSON, Block-cyclic dense linear
  algebra, SIAM J. Sci. Stat. Comput., 14 (1993), pp. 1259-1288.
 
- 98
- 
A. MAINWARING AND D. E. CULLER, Active message applications
  programming interface and communication subsystem organization, Tech. Rep.
  UCB CSD-96-918, University of California at Berkeley, Berkeley, CA, October
  1996.
 
- 99
- 
P. PACHECO, Parallel Programming with MPI, Morgan Kaufmann
  Publishers, Inc., San Francisco, CA, 1997.
 
- 100
- 
C. PAIGE, Some aspects of generalized QR factorization, in
  Reliable Numerical Computations, M. Cox and S. Hammarling, eds., Clarendon
  Press, 1990.
 
- 101
- 
B. PARLETT, The Symmetric Eigenvalue Problem, Prentice-Hall,
  Englewood Cliffs, NJ, 1980.
 
- 102
- 
B. PARLETT, The construction of
  orthogonal eigenvectors for tight clusters by use of submatrices, Center for
  Pure and Applied Mathematics PAM-664, University of California, Berkeley, CA,
  January 1996.
submitted to SIMAX.
 
- 103
- 
B. PARLETT AND I. DHILLON, On Fernando's method to find the most
  redundant equation in a tridiagonal system, Linear Algebra and Its
  Applications,  (1996).
to appear.
 
- 104
- 
A. PETITET, Algorithmic Redistribution Methods for Block Cyclic
  Decompositions, PhD thesis, University of Tennessee, Knoxville, TN, 1996.
 
- 105
- 
A. A. POLLICINI, ed., Using Toolpack Software Tools, 1989.
 
- 106
- 
L. PRYLLI AND B. TOURANCHEAU, Efficient block cyclic data
  redistribution, in EUROPAR'96, vol. 1 of Lecture Notes in Computer Science,
  Springer-Verlag, 1996, pp. 155-165.
 
- 107
- 
L. PRYLLI AND B. TOURANCHEAU, Efficient block
  cyclic array redistribution, Journal of Parallel and Distributed Computing,
  (1997).
To appear.
 
- 108
- 
R. SCHREIBER AND C. F. VAN LOAN, A storage efficient WY
  representation for products of Householder transformations, SIAM J. Sci.
  Stat. Comput., 10 (1989), pp. 53-57.
 
- 109
- 
B. SMITH, W. GROPP, AND L. CURFMAN MCINNES, PETSc 2.0 users
  manual, Technical Report ANL-95/11, Argonne National Laboratory,
  Argonne, IL, 1995.
(Available by anonymous ftp from ftp.mcs.anl.gov).
 
- 110
- 
M. SNIR, S. W. OTTO, S. HUSS-LEDERMAN, D. W. WALKER, AND J. J. DONGARRA,
  MPI: The Complete Reference, MIT Press, Cambridge, MA, 1996.
 
- 111
- 
SUNSOFT, The XDR Protocol Specification. Appendix A of ``Network
  Interfaces Programmer's Guide'', SunSoft, 1993.
 
- 112
- 
E. VAN DE VELDE, Concurrent Scientific Computing, no. 16 in Texts
  in Applied Mathematics, Springer-Verlag, 1994.
 
- 113
- 
R. C. WHALEY, Basic linear algebra communication subprograms:
  Analysis and implementation across multiple parallel architectures,
  Computer Science Dept. Technical Report CS-94-234, University of
  Tennessee, Knoxville, TN, May 1994.
(Also LAPACK Working Note 73).
 
- 114
- 
J. H. WILKINSON, The Algebraic Eigenvalue Problem, Oxford
  University Press, Oxford, UK, 1965.
 
Susan Blackford 
Tue May 13 09:21:01 EDT 1997