References


1: W. A. Abu-Sufah, D. J. Kuck, and D. H. Lawrie. On the performance enhancement of paging systems through program analysis and transformations. IEEE Transactions on Computers, C-30:341-356, 1981.
2: Christian H. Bischof. Incremental condition estimation. Technical Report ANL-MCS-P15-1088, Argonne National Laboratory, 1989.
3: David Callahan, Steve Carr, and Ken Kennedy. Improving register allocation of subscripted variables. In Proceedings of the ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, Association for Computing Machinery, 1990.
4: Steve Carr and Ken Kennedy. Blocking linear algebra codes for memory hierarchies. In Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, Society for Industrial and Applied Mathematics, 1989.
5: James Demmel, Jack Dongarra, Jeremy Du Croz, Anne Greenbaum, Sven Hammarling, and Danny Sorensen. Prospectus for the development of a linear algebra library for high-performance computers. Technical Report, Argonne National Laboratory, 1987.
6: J.J. Dongarra and D.C. Sorensen. Linear algebra on high-performance computers. In U. Schendel, editor, Proceedings of Parallel Computing 85, pages 3-32, JACK: WHAT PUBLISHER?, 1986.
7: K.A. Gallivan, R.J. Plemmons, and A.H. Sameh. Parallel algorithms for dense linear algebra computations. SIAM Review, 32(1):54-135, 1990.
8: Dennis Gannon, William Jalby, and Kyle Gallivan. Strategies for cache and local memory management by global program transformation. Journal of Parallel and Distributed Computing, 5(5):587-616, 1988.
9: G. H. Golub, V. Klema, and G. W. Stewart. Rank degeneracy and least squares problems. Technical Report TR-456, Department of Computer Science, University of Maryland, 1976.
10: Gene H. Golub and Charles F. Van Loan. Matrix Computations. Johns Hopkins, Baltimore, MD, Second edition, 1989.
11: F. Irigoin and R. Triolet. Supernode partitioning. In Conference Record of the 15th Annual ACM Symposium on Principles of Programming Languages, pages 319-329, Association for Computing Machinery, 1988.
12: Ken Kennedy. Talk at the fourth SIAM conference on parallel processing for scientific computing. Chicago, Illinois, 1989.
13: Leslie Lamport. The parallel execution of do loops. Communications of the Association for Computing Machinery, 17:83-93, 1974.
14: H. Lomax and T. H. Pulliam. A three-dimensional implicit code for the ILLIAC IV. In Garry Rodrigue, editor, Computational Physics on Parallel Computers, Academic Press, New York, NY, 1982.
15: Dan I. Moldovan and Jose A. B. Fortes. Partitioning and mapping algorithms into fixed size systolic arrays. IEEE Transactions on Computers, C-35:1-12, 1986.
16: Robert Schreiber. Block algorithms for parallel machines. In Numerical Algorithms for Modern Parallel Computer Architectures, pages 197-208, Springer-Verlag, New York, NY, 1988.
17: Michael E. Wolf and Monica S. Lam. An algorithm to generate sequential and parallel code with improved data locality. Technical Report, Computer Systems Laboratory, Stanford University, 1989.
18: Michael Wolfe. Iteration space tiling for memory hierarchies. In Garry Rodrigue, editor, Parallel Processing for Scientific Computing, pages 357-361, Society for Industrial and Applied Mathematics, 1989.
19: Michael Wolfe. More iteration space tiling. In Proceedings Supercomputing '89, pages 655-664, Association for Computing Machinery, 1989.

Jack Dongarra
Tue Feb 18 15:39:11 EST 1997