URL: http://www.netlib.org/tennessee/ut-cs-91-131.ps author: Ed Anderson, Z. Bai & Jack Dongarra, title: LAPACK Working Note 31: Generalized QR Factorization and Its Applications, reference: University of Tennessee Technical Report CS-91-131, April 1991. abstract: The purpose of this paper is to reintroduce the generalized QR factorization with or without pivoting of two matrices A and B having the same number of rows. When B is square and nonsingular, the factorization implicitly gives the orthogonal factorization of B^{-1}A. Continuing the work of Paige [20] and Hammarling [12], we discuss the different forms of the factorization from the point of view of general-purpose software development. In addition, we demonstrate the applications of the GQR factorization in solving the linear equality constrained least squares problem and the generalized linear regression problem, and in estimating the conditioning of these problems. URL: http://www.netlib.org/tennessee/sc91.ps author: Adam Beguelin, Jack J. Dongarra, G.A. Geist, Robert Manchek, & V.S. Sunderam, title: Graphical Development Tools for Network-Based Concurrent Supercomputing, reference: Proceedings of Supercomputing `91, pp. 435-444, Albuquerque, New Mexico, November 1991. abstract: This paper describes an X-window based software environment called HeNCE (Heterogeneous Network Computing Environment) designed to assist scientists in developing parallel programs that run on a network of computers. HeNCE is built on top of a software package called PVM which supports process management and communication between a network of heterogeneous computers. HeNCE is based on a parallel programming paradigm where an application program can be described by a graph. Nodes of the graph represent subroutines and the arcs represent data dependencies. HeNCE is composed of integrated graphical tools for creating, compiling, executing, and analyzing HeNCE programs. 
URL: http://www.netlib.org/tennessee/ut-cs-91-136.ps author: Adam Beguelin, Jack Dongarra, Al Geist, Robert Manchek, & Vaidy Sunderam, title: A Users' Guide to PVM Parallel Virtual Machine, reference: University of Tennessee Technical Report CS-91-136, July 1991. abstract: This report is the PVM version 2.3 users' guide. It contains an overview of PVM and how it is installed and used. Example programs in C and Fortran are included. PVM stands for Parallel Virtual Machine. It is a software package that allows the utilization of a heterogeneous network of parallel and serial computers as a single computational resource. PVM consists of two parts: a daemon process that any user can install on a machine, and a user library that contains routines for initiating processes on other machines, for communicating between processes, and synchronizing processes. URL: http://www.netlib.org/tennessee/ornl-tm-11850.ps author: Jean R.S. Blair & Barry W. Peyton, title: On Finding Minimum-Diameter Clique Trees, reference: Oak Ridge National Laboratory Technical Report ORNL/TM-11850, Oak Ridge National Laboratory, Oak Ridge, Tennessee, August 1991. abstract: It is well-known that any chordal graph can be represented as a clique tree (acyclic hypergraph, join tree). Since some chordal graphs have many distinct clique tree representations, it is interesting to consider which one is most desirable under various circumstances. A clique tree of minimum diameter (or height) is sometimes a natural candidate when choosing clique trees to be processed in a parallel computing environment. This paper introduces a linear time algorithm for computing a minimum-diameter clique tree. The new algorithm is an analogue of the natural greedy algorithm for rooting an ordinary tree in order to minimize its height. It has potential application in the development of parallel algorithms for both knowledge-based systems and the solution of sparse linear systems of equations. 
URL: http://www.netlib.org/tennessee/ornl-tm-12318.ps author: Jack J. Dongarra, Thomas H. Rowan, and Reed C. Wade title: Software Distribution Using XNETLIB reference: Oak Ridge National Laboratory Technical Report ORNL/TM-12318, June 1993. abstract: Xnetlib is a new tool for software distribution. Whereas its predecessor netlib uses e-mail as the user interface to its large collection of public-domain mathematical software, Xnetlib uses an X Window interface and socket-based communication. Xnetlib makes it easy to search through a large distributed collection of software and to retrieve requested software in seconds. URL: http://www.netlib.org/tennessee/ut-cs-91-141.ps author: James Demmel, Jack Dongarra, & W. Kahan, title: LAPACK Working Note 39: On Designing Portable High Performance Numerical Libraries, reference: University of Tennessee Technical Report CS-91-141, July 1991. abstract: High quality portable numerical libraries have existed for many years. These libraries, such as LINPACK and EISPACK, were designed to be accurate, robust, efficient and portable in a Fortran environment of conventional uniprocessors, diverse floating point arithmetics, and limited input data structures. These libraries are no longer adequate on modern high performance computer architectures. We describe their inadequacies and how we are addressing them in the LAPACK project, a library of numerical linear algebra routines designed to supplant LINPACK and EISPACK. We shall now show how the new architectures lead to important changes in the goals as well as the methods of library design. URL: http://www.netlib.org/tennessee/ut-cs-89-85.ps author: Jack J. Dongarra, title: Performance of Various Computers Using Standard Linear Equations Software, reference: University of Tennessee Technical Report CS-89-85, December 1990. abstract: This report compares the performance of different computer systems in solving dense systems of linear equations. 
The comparison involves approximately a hundred computers, ranging from a CRAY-MP to scientific workstations such as the Apollo and Sun to IBM PCs. URL: http://www.netlib.org/tennessee/ut-cs-91-134.ps author: Jack Dongarra, title: LAPACK Working Note 34: Workshop on the BLACS, reference: University of Tennessee Technical Report CS-91-134, May 1991. abstract: Forty-three people met on March 28, 1991, to discuss a set of Basic Linear Algebra Communication Subprograms (BLACS). This set of routines is motivated by the needs of distributed memory computers. URL: http://www.netlib.org/tennessee/pc.v17.10.ps author: Jack Dongarra, Mark Furtney, Steve Reinhardt, & Jerry Russell, title: Parallel Loops -- A Test Suite for Parallelizing Compilers: Description and Example Results, reference: Parallel Computing 17 (1991), pp. 1247-1255. abstract: Several multiprocessor systems are now commercially available, and advances in compiler technology provide automatic conversion of programs to run on such systems. However, no accepted measure of this parallel compiler ability exists. This paper presents a test suite of subroutines and loops, called Parallel Loops, designed to (1) measure the ability of parallelizing compilers to convert code to run in parallel and (2) determine how effectively parallel hardware and software work together to achieve high performance across a range of problem sizes. In addition, we present the results of compiling this suite using two commercially available parallelizing Fortran compilers, Cray and Convex. URL: http://www.netlib.org/tennessee/ut-cs-91-146.ps author: Jack Dongarra & Bill Rosener, title: NA-NET: Numerical Analysis NET, reference: University of Tennessee Technical Report CS-91-146, September 1991. abstract: The NA-NET is a mail facility created to allow numerical analysts (na) an easy method of communicating with one another. The main advantage of the NA-NET is uniformity of addressing. 
All mail is addressed to the Internet host ``na-net.ornl.gov'' at Oak Ridge National Laboratory. Hence, members of the NA-NET do not need to remember complicated addresses or even where a member is currently located. This paper describes the software. URL: http://www.netlib.org/tennessee/ut-cs-91-137.ps author: Jack J. Dongarra & Majed Sidani, title: A Parallel Algorithm for the Non-Symmetric Eigenvalue Problem, reference: University of Tennessee Technical Report CS-91-137, July 30, 1991. abstract: This paper describes a parallel algorithm for computing the eigenvalues and eigenvectors of a non-symmetric matrix. The algorithm is based on a divide-and-conquer procedure and uses an iterative refinement technique. URL: http://www.netlib.org/tennessee/ut-cs-91-138.ps author: Jack Dongarra & Robert A. van de Geijn, title: LAPACK Working Note 37: Two Dimensional Basic Linear Algebra Communication Subprograms, reference: University of Tennessee Technical Report CS-91-138, October 28, 1991. abstract: In this paper, we describe extensions to a proposed set of linear algebra communication routines for communicating and manipulating data structures that are distributed among the memories of a distributed memory MIMD computer. In particular, recent experience shows that higher performance can be attained on such architectures when parallel dense matrix algorithms utilize a data distribution that views the computational nodes as a logical two dimensional mesh. The motivation for the BLACS continues to be to increase portability, efficiency and modularity at a high level. The audience of the BLACS are mathematical software experts and people with large scale scientific computation to perform. A systematic effort must be made to achieve a de facto standard for the BLACS. URL: http://www.netlib.org/tennessee/ut-cs-91-130.ps author: Jack Dongarra & Robert A. 
van de Geijn, title: Reduction to Condensed Form for the Eigenvalue Problem on Distributed Memory Architectures, reference: University of Tennessee Technical Report CS-91-130, April 30, 1991. abstract: In this paper, we describe a parallel implementation for the reduction of general and symmetric matrices to Hessenberg and tridiagonal form, respectively. The methods are based on LAPACK sequential codes and use a panel-wrapped mapping of matrices to nodes. Results from experiments on the Intel Touchstone Delta are given. URL: http://www.netlib.org/tennessee/icci91.ps author: Eric S. Kirsch & Jean R.S. Blair, title: Practical Parallel Algorithms for Chordal Graphs, reference: pp. 372-382 in Proceedings of the International Conference on Computing and Information (ICCI '91)-- Advances in Computing and Information, Ottawa, Canada, May 1991. abstract: Until recently, a large majority of theoretical work in parallel algorithms has ignored communication costs and other realities of parallel computing. This paper attempts to address this issue by developing parallel algorithms that not only are efficient using standard theoretical analysis techniques, but also require a minimal amount of communication. The specific parallel algorithms developed here include one to find the set of maximal cliques and one to find a perfect elimination ordering of a chordal graph. URL: http://www.netlib.org/tennessee/vector.ps author: David Levine, David Callahan, & Jack Dongarra, title: A Comparative Study of Automatic Vectorizing Compilers, reference: Parallel Computing 17 (1991), pp. 1223-1244. abstract: We compare the capabilities of several commercially available, vectorizing Fortran compilers using a test suite of Fortran loops. We present the results of compiling and executing these loops on a variety of supercomputers, mini-supercomputers, and mainframes. 
URL: http://www.netlib.org/tennessee/ut-cs-91-147.ps author: Bruce MacLennan, title: Characteristics of Connectionist Knowledge Representation, reference: University of Tennessee Technical Report CS-91-147, November 1991. abstract: Connectionism--the use of neural networks for knowledge representation and inference--has profound implications for the representation and processing of information because it provides a fundamentally new view of knowledge. However, its progress is impeded by the lack of a unifying theoretical construct corresponding to the idea of a calculus (or formal system) in traditional approaches to knowledge representation. Such a construct, called a simulacrum, is proposed here, and its basic properties are explored. We find that although exact classification is impossible, several other useful, robust kinds of classification are permitted. The representation of structured information and constituent structure are considered, and we find a basis for more flexible rule-like processing than that permitted by conventional methods. We discuss briefly logical issues such as decidability and computability and show that they require reformulation in this new context. Throughout we discuss the implications for artificial intelligence and cognitive science of this new theoretical framework. URL: http://www.netlib.org/tennessee/ut-cs-91-145.ps author: Bruce MacLennan, title: Continuous Symbol Systems: The Logic of Connectionism, reference: University of Tennessee Technical Report CS-91-145, September 1991. abstract: It has been long assumed that knowledge and thought are most naturally represented as discrete symbol systems (calculi). Thus a major contribution of connectionism is that it provides an alternative model of knowledge and cognition that avoids many of the limitations of the traditional approach. But what idea serves for connectionism the same unifying role that the idea of a calculus served for the traditional theories? 
We claim it is the idea of a continuous symbol system. This paper presents a preliminary formulation of continuous symbol systems and indicates how they may aid the understanding and development of connectionist theories. It begins with a brief phenomenological analysis of the discrete and continuous; the aim of this analysis is to directly contrast the two kinds of symbol systems and identify their distinguishing characteristics. Next, based on the phenomenological analysis and on other observations of existing continuous symbol systems and connectionist models, I sketch a mathematical characterization of these systems. Finally the paper turns to some applications of the theory and to its implications for knowledge representation and the theory of computation in a connectionist context. Specific problems addressed include decomposition of connectionist spaces, representation of recursive structures, properties of connectionist categories, and decidability in continuous formal systems. URL: http://www.netlib.org/tennessee/nipt91-panel.ps author: Bruce MacLennan, title: The Emergence of Symbolic Processes From the Subsymbolic Substrate, reference: text of invited panel presentation, International Symposium on New Information Processing Technologies `91, Tokyo, Japan, March 13-14, 1991. abstract: A central question for the success of neural network technology is the relation of symbolic processes (e.g., language and logic) to the underlying subsymbolic processes (e.g., parallel distributed implementations of pattern recognition, analogical reasoning and learning). This is not simply an issue of integrating neural networks with conventional expert system technology. Human symbolic cognition is flexible because it is not purely formal, and because it retains some of the ``softness'' of the subsymbolic processes. 
If we want our computers to be as flexible as people, then we need to understand the emergence of the discrete and symbolic from the continuous and subsymbolic. URL: http://www.netlib.org/tennessee/ut-cs-91-144.ps author: Bruce MacLennan, title: Gabor Representations of Spatiotemporal Visual Images, reference: University of Tennessee Technical Report CS-91-144, September 1991. abstract: We review Gabor's Uncertainty Principle and the limits it places on the representation of any signal. Representations in terms of Gabor elementary functions (Gaussian-modulated sinusoids), which are optimal in terms of this uncertainty principle, are compared with Fourier and wavelet representations. We also review Daugman's evidence for representations based on two-dimensional Gabor functions in mammalian visual cortex. We suggest three-dimensional Gabor elementary functions as a model for motion selectivity in complex and hypercomplex cells in visual cortex. This model also suggests a computational role for low frequency oscillations (such as the alpha rhythm) in visual cortex. URL: http://www.netlib.org/tennessee/uist91.ps author: Brad Vander Zanden, Brad A. Myers, Dario Giuse, & Pedro Szekely, title: The Importance of Pointer Variables in Constraint Models, reference: pp. 155-164 in Proceedings of UIST '91, ``ACM SIGGRAPH Symposium on User Interface Software and Technology,'' Hilton Head, South Carolina, November 11-13, 1991. abstract: Graphical tools are increasingly using constraints to specify the graphical layout and behavior of many parts of an application. However, conventional constraints directly encode the objects they reference, and thus cannot provide support for the dynamic runtime creation and manipulation of application objects. This paper discusses an extension to current constraint models that allows constraints to indirectly reference objects through pointer variables. 
Pointer variables permit programmers to create the constraint equivalent of procedures in traditional programming languages. This procedural abstraction allows constraints to model a wide array of dynamic application behavior, simplifies the implementation of structured object and demonstrational systems, and improves the storage and efficiency of highly interactive, graphical applications. It also promotes a simpler, more effective style of programming than conventional constraints. Constraints that use pointer variables are powerful enough to allow a comprehensive user interface toolkit to be built for the first time on top of a constraint system. URL: http://www.netlib.org/tennessee/hence.ieee title: HeNCE: Graphical Development Tools for Network-Based Concurrent Computing author: Adam Beguelin, Jack J. Dongarra, G.A. Geist, Robert Manchek, Keith Moore, V. S. Sunderam, and Reed Wade. abstract: Wide area computer networks have become a basic part of today's computing infrastructure. These networks connect a variety of machines, presenting an enormous computing resource. In this project we focus on developing methods and tools which allow a programmer to tap into this resource. In this talk we describe HeNCE, a tool and methodology under development that assists a programmer in developing programs to execute on a networked group of heterogeneous machines. HeNCE is implemented on top of a system called PVM (Parallel Virtual Machine). PVM is a software package that allows the utilization of a heterogeneous network of parallel and serial computers as a single computational resource. PVM provides facilities for spawning, communication, and synchronization of processes over a network of heterogeneous machines. While PVM provides the low level tools for implementing parallel programs, HeNCE provides the programmer with a higher level abstraction for specifying parallelism. URL: http://www.netlib.org/tennessee/siampvm.ps author: A. Beguelin, J. Dongarra, A. Geist, R. 
Manchek & V. Sunderam, title: Solving Computational Grand Challenges Using a Network of Heterogeneous Supercomputers reference: Proceedings of the Fifth SIAM Conference on Parallel Processing for Scientific Computing, pp. 596-601, March 25-27, 1991. abstract: This paper describes simple experiments connecting a Cray XMP, an Intel iPSC/860, and a Thinking Machines CM2 together over a high speed network to form a much larger virtual computer. It also describes our experience with running a Computational Grand Challenge on a Cray XMP and an iPSC/860 combination. The purpose of the experiments is to demonstrate the power and flexibility of the PVM (Parallel Virtual Machine) system to allow programmers to exploit a diverse collection of the most powerful computers available to solve Grand Challenge problems. URL: http://www.netlib.org/tennessee/ut-cs-89-85.ps author: Jack J. Dongarra title: Performance of Various Computers Using Standard Linear Equations Software reference: University of Tennessee Technical Report CS-89-85, January 1993. abstract: This report compares the performance of different computer systems in solving dense systems of linear equations. The comparison involves approximately a hundred computers, ranging from a CRAY-MP to scientific workstations such as the Apollo and Sun to IBM PCs. URL: http://www.netlib.org/tennessee/ut-cs-92-168.ps author: Jack J. Dongarra & H.A. Van der Vorst, title: Performance of Various Computers Using Standard Sparse Linear Equations Solving Techniques reference: University of Tennessee Technical Report CS-92-168, February 1992. abstract: The LINPACK benchmark has become popular in the past few years as a means of measuring floating-point performance on computers. The benchmark shows in a simple and direct way what performance is to be expected for a range of machines when doing dense matrix computations. We present performance results for sparse matrix computations that use an iterative approach. 
URL: http://www.netlib.org/tennessee/ut-cs-92-154.ps author: Bruce MacLennan title: $L_p$-Circular Functions reference: University of Tennessee Technical Report CS-92-154, May 1992. abstract: In this report we develop the basic properties of a set of functions analogous to the circular and hyperbolic functions, but based on $L_p$ circles. The resulting identities may simplify analysis in $L_p$ spaces in much the way that the circular functions do in Euclidean space. In any case, they are a pleasing example of mathematical generalization. URL: http://www.netlib.org/tennessee/ut-cs-92-172.ps author: Bruce J. MacLennan title: Research Issues in Flexible Computing: Two Presentations in Japan reference: University of Tennessee Technical Report CS-92-172, September 1992. abstract: This report contains the text of two presentations made in Japan in 1991, both of which deal with the Japanese ``Real World Computing Project'' (previously known as the ``New Information Processing Technology,'' and informally as the ``Sixth Generation Project''). URL: http://www.netlib.org/tennessee/ut-cs-92-174.ps author: Bruce MacLennan title: Field Computation in the Brain reference: University of Tennessee Technical Report CS-92-174, October 1992. abstract: We begin with a brief consideration of the {\it topology of knowledge}. It has traditionally been assumed that true knowledge must be represented by discrete symbol structures, but recent research in psychology, philosophy and computer science has shown the fundamental importance of {\it subsymbolic} information processing, in which knowledge is represented in terms of very large numbers--or even continua--of {\it microfeatures}. We believe that this sets the stage for a fundamentally new theory of knowledge, and we sketch a theory of continuous information representation and processing. Next we consider {\it field computation}, a kind of continuous information processing that emphasizes spatially continuous {\it fields} of information. 
This is a reasonable approximation for macroscopic areas of cortex and provides a convenient mathematical framework for studying information processing at this level. We apply it also to a linear-systems model of dendritic information processing. We consider examples from the visual cortex, including Gabor and wavelet representations, and outline field-based theories of sensorimotor intentions and of model-based deduction. URL: http://www.netlib.org/tennessee/ut-cs-92-180.ps author: Bruce MacLennan title: Information Processing in the Dendritic Net reference: University of Tennessee Technical Report CS-92-180, October 1992. abstract: The goal of this paper is a model of the dendritic net that: (1) is mathematically tractable, (2) is reasonably true to the biology, and (3) illuminates information processing in the neuropil. First I discuss some general principles of mathematical modeling in a biological context that are relevant to the use of linearity and orthogonality in our models. Next I discuss the hypothesis that the dendritic net can be viewed as a linear field computer. Then I discuss the approximations involved in analyzing it as a dynamic, lumped-parameter, linear system. Within this basically linear framework I then present: (1) the self-organization of matched filters and of associative memories; (2) the dendritic computation of Gabor and other nonorthogonal representations; and (3) the possible effects of reverse current flow in neurons. URL: http://www.netlib.org/tennessee/oopsla.ps author: Brad A. Myers, Dario A. Giuse, & Brad Vander Zanden title: Declarative Programming in a Prototype-Instance System: Object-Oriented Programming Without Writing Methods reference: Sigplan Notices, Vol.~27, No.~10, October 1992, pp.~184-200. abstract: Most programming in the Garnet system uses a declarative style that eliminates the need to write new methods. One implication is that the interface to objects is typically through their data values. 
This contrasts significantly with other object systems where writing methods is the central mechanism of programming. Four features are combined in a unique way in Garnet to make this possible: the use of a prototype-instance object system with structural inheritance, a retained-object model where most objects persist, the use of constraints to tie the objects together, and a new input model that makes writing event handlers unnecessary. The result is that code is easier to write for programmers, and also easier for tools, such as interactive, direct manipulation interface builders, to generate. URL: http://www.netlib.org/tennessee/ut-cs-92-152.ps author: Marc D. VanHeyningen & Bruce J. MacLennan, title: A Constraint Satisfaction Model for Perception of Ambiguous Stimuli reference: University of Tennessee Technical Report CS-92-152, April 1992. abstract: Constraint satisfaction networks are natural models of the interpretation of ambiguous stimuli, such as Necker cubes. Previous constraint satisfaction models have simulated the initial interpretation of a stimulus, but have not simulated the dynamics of perception, which includes the alternation of interpretations and the phenomena known as bias, adaptation and hysteresis. In this paper we show that these phenomena can be modeled by a constraint satisfaction network {\it with fatigue}, that is, a network in which unit activities decay in time. Although our model is quite simple, it nevertheless exhibits some key characteristics of the dynamics of perception. URL: http://www.netlib.org/tennessee/ut-cs-93-194.ps author: Michael Berry, Theresa Do, Gavin O'Brien, Vijay Krishna, & Sowmini Varadhan, title: SVDPACKC (Version 1.0) User's Guide, reference: University of Tennessee Technical Report CS-93-194, April 1993. abstract: SVDPACKC comprises four numerical (iterative) methods for computing the singular value decomposition (SVD) of large sparse matrices using ANSI C. 
This software package implements Lanczos and subspace iteration-based methods for determining several of the largest singular triplets (singular values and corresponding left- and right-singular vectors) for large sparse matrices. The package has been ported to a variety of machines ranging from supercomputers to workstations: CRAY Y-MP, IBM RS/6000-550, DEC 5000-100, HP 9000-750, SPARCstation 2, and Macintosh II/fx. This document {\it (i)} explains each algorithm in some detail, {\it (ii)} explains the input parameters for each program, {\it (iii)} explains how to compile/execute each program, and {\it (iv)} illustrates the performance of each method when we compute lower rank approximations to sparse {\it term-document} matrices from information retrieval applications. A user-friendly software interface to the package for UNIX-based systems and the Macintosh II/fx is also described. URL: http://www.netlib.org/tennessee/ut-cs-93-195.ps author: Brian Howard LaRose title: The Development and Implementation of a Performance Database Server reference: University of Tennessee Technical Report CS-93-195, August 1993. abstract: The process of gathering, archiving, and distributing computer benchmark data is a cumbersome task usually performed by computer users and vendors with little coordination. Most importantly, there is no publicly-available central depository of performance data for all ranges of machines: supercomputers to personal computers. We present an Internet-accessible performance database server (PDS) which can be used to extract current benchmark data and literature. As an extension to the X-Windows-based user interface (Xnetlib) to the Netlib archival system, PDS provides an on-line catalog of public-domain computer benchmarks such as the Linpack Benchmark, Perfect Benchmarks, and the Genesis benchmarks. 
PDS does not reformat or present the benchmark data in any way which conflicts with the original methodology of any particular benchmark, and is thereby devoid of any subjective interpretations of machine performance. We feel that all branches (academic and industrial) of the general computing community can use this facility to archive performance metrics and make them readily available to the public. PDS can provide a more manageable approach to the development and support of a large dynamic database of published performance metrics. URL: http://www.netlib.org/tennessee/ut-cs-93-196.ps author: Douglas J. Sept title: The Design, Implementation and Performance of a Queue Manager for PVM reference: University of Tennessee Technical Report CS-93-196, August 1993. abstract: The PVM Queue Manager (QM) application addresses some of the load balancing problems associated with the heterogeneous, multi-user, computing environments for which PVM was designed. In such environments, PVM is not only confronted with the difficulties of distributing tasks among machines of variable loads, it must also contend with machines of varying performance levels in the same virtual machine. The QM addresses both of these problems using two different load balancing techniques, one static, the other dynamic. In its simplest (static) mode, the QM will initiate PVM processes for the user on demand, taking into account information such as the peak megaflops/sec and actual load of each machine. In addition to the initiation of processes, the QM will also accept tasks to be completed by a specified PVM process type. These tasks are shipped to the QM where they are kept in a FIFO queue. Worker processes in the virtual machine send idle messages to the QM when they are ready for a task, and the QM ships a task to the process if there is one (of a type matching the process) in the queue. 
The QM also maintains a list of idle processes and chooses the {\em best} one for the task, should one arrive when several processes are idle. Since faster machines typically send more idle messages (and receive more tasks) than slower ones, this provides a level of dynamic load balancing for the system. Three applications have already been implemented using the QM within PVM: a Mandelbrot image generator, a conjugate-gradient algorithm, and a map analysis program used in landscape ecology applications. Benchmarks of elapsed wall-clock time comparing standard PVM versions with the QM-based versions demonstrate substantial performance gains for both methods of load balancing. When processing a $1000 \times 1000$ image, for example, the QM-based Mandelbrot application averaged 63.92 seconds, compared to 139.62 seconds for the standard PVM version in a heterogeneous network of five workstations (comprised of Sun4's and an IBM RS/6000). URL: http://www.netlib.org/tennessee/ut-cs-93-197.ps author: Karen Stoner Minser title: Parallel Map Analysis on the CM-5 for Landscape Ecology Models reference: University of Tennessee Technical Report CS-93-197, August 1993. abstract: In landscape ecology, computer modeling is used to assess habitat fragmentation and its ecological implications. Specifically, maps (2-D grids) of habitat clusters are analyzed to determine numbers, sizes, and geometry of clusters. Previous ecological models have relied upon sequential Fortran-77 programs which have limited the size and density of maps that can be analyzed. To efficiently analyze relatively large maps, we present parallel map analysis software implemented on the CM-5. For algorithm development, random maps of different sizes and densities were generated and analyzed. Initially, the Fortran-77 program was rewritten in C, and the sequential cluster identification algorithm was improved and implemented as a recursive or nonrecursive algorithm. 
The major focus of parallelization was on cluster geometry using C with CMMD message passing routines. Several different parallel models were implemented: host/node, hostless, and host/node with vector units (VUs). All models obtained some speed improvements when compared against several RISC-based workstations. The host/node model with VUs proved to be the most efficient and flexible with speed improvements for a $512\times 512$ map of 187, 95, and 20 over the Sun Sparc 2, HP 9000-750, and IBM RS/6000-350, respectively. When tested on an actual map produced through remote imagery and used in ecological studies this same model obtained a speed improvement of 119 over the Sun Sparc 2. URL: http://www.netlib.org/tennessee/ut-cs-93-197.ps title: HeNCE: A Users' Guide Version 1.2 author: Adam Beguelin, Jack Dongarra, G. A. Geist, Robert Manchek, Keith Moore, Reed Wade, Jim Plank, and Vaidy Sunderam reference: University of Tennessee Technical Report CS-92-157 abstract: HeNCE, Heterogeneous Network Computing Environment, is a graphical parallel programming environment. HeNCE provides an easy to use interface for creating, compiling, executing, and debugging parallel programs. HeNCE programs can be run on a single Unix workstation or over a network of heterogeneous machines, possibly including supercomputers. This report describes the installation and use of the HeNCE software. URL: http://www.netlib.org/tennessee/ut-cs-93-191.ps title: Software Distribution Using XNETLIB author: Jack Dongarra, Tom Rowan and Reed Wade reference: University of Tennessee Technical Report CS-93-191 abstract: Xnetlib is a new tool for software distribution. Whereas its predecessor netlib uses e-mail as the user interface to its large collection of public-domain mathematical software, Xnetlib uses an X-Window interface and socket-based communication. Xnetlib makes it easy to search through a large distributed collection of software and to retrieve requested software in seconds. 
URL: http://www.netlib.org/tennessee/ut-cs-93-207.ps title: Data-parallel Implementations of Map Analysis and Animal Movement for Landscape Ecology Models author: Ethel Jane Comiskey URL: http://www.netlib.org/tennessee/ut-cs-93-213.ps title: Public International Benchmarks for Parallel Computers author: assembled by Roger Hockney (chairman) and Michael Berry (secretary) reference: PARKBENCH Committee: Report-1, November 17, 1993 URL: http://www.netlib.org/tennessee/ornl-tm-11669.ps title: Fortran Subroutines for Computing the Eigenvalues and Eigenvectors of a General Matrix by Reduction to General Tridiagonal Form, author: J. Dongarra, A. Geist, and C. Romine reference: ORNL/TM-11669, 1990. (Also appeared in ACM TOMS Vol. 18, No. 4, Dec 1992, pp. 392-400.) abstract: This paper describes programs to reduce a nonsymmetric matrix to tridiagonal form, compute the eigenvalues of the tridiagonal matrix, improve the accuracy of an eigenvalue, and compute the corresponding eigenvector. The intended purpose of the software is to find a few eigenpairs of a dense nonsymmetric matrix faster and more accurately than previous methods. The performance and accuracy of the new routines are compared to two EISPACK paths: RG and HQR-INVIT. The results show that the new routines are always more accurate and also faster if less than 20% of the eigenpairs are needed. URL: http://www.netlib.org/tennessee/ut-cs-89-90.ps title: Advanced Architecture Computers, author: Jack Dongarra and Iain S. Duff, reference: University of Tennessee, CS-89-90, November 1989. abstract: We describe the characteristics of several recent computers that employ vectorization or parallelism to achieve high performance in floating-point calculations. We consider both top-of-the-range supercomputers and computers based on readily available and inexpensive basic units. In each case we discuss the architectural base, novel features, performance, and cost. 
We intend to update this report regularly, and to this end we welcome comments. URL: http://www.netlib.org/tennessee/ornl-tm-12404.ps title: Software Libraries for Linear Algebra Computation on High-Performance Computers author: Jack J. Dongarra and David W. Walker reference: Oak Ridge National Laboratory, ORNL TM-12404, August, 1993. abstract: This paper discusses the design of linear algebra libraries for high performance computers. Particular emphasis is placed on the development of scalable algorithms for MIMD distributed memory concurrent computers. A brief description of the EISPACK, LINPACK, and LAPACK libraries is given, followed by an outline of ScaLAPACK, which is a distributed memory version of LAPACK currently under development. The importance of block-partitioned algorithms in reducing the frequency of data movement between different levels of hierarchical memory is stressed. The use of such algorithms helps reduce the message startup costs on distributed memory concurrent computers. Other key ideas in our approach are the use of distributed versions of the Level 3 Basic Linear Algebra Subprograms (BLAS) as computational building blocks, and the use of Basic Linear Algebra Communication Subprograms (BLACS) as communication building blocks. Together the distributed BLAS and the BLACS can be used to construct higher-level algorithms, and hide many details of the parallelism from the application developer. The block-cyclic data distribution is described, and adopted as a good way of distributing block-partitioned matrices. Block-partitioned versions of the Cholesky and LU factorizations are presented, and optimization issues associated with the implementation of the LU factorization algorithm on distributed memory concurrent computers are discussed, together with its performance on the Intel Delta system. Finally, approaches to the design of library interfaces are reviewed. 
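The block-cyclic data distribution mentioned in the abstract above can be illustrated in one dimension. The following is a minimal sketch, not code from the report: the function names `block_cyclic_owner` and `local_to_global` and the parameters `nb` (block size) and `nprocs` (number of processes) are hypothetical, and indices are 0-based with the first block assigned to process 0.

```python
def block_cyclic_owner(g, nb, nprocs):
    """Map a 0-based global index g to (owning process, 0-based local index).

    Assumes blocks of nb consecutive entries are dealt out to processes
    0, 1, ..., nprocs-1 in round-robin order, starting at process 0.
    """
    block = g // nb                      # which global block holds g
    owner = block % nprocs               # blocks are dealt cyclically
    local = (block // nprocs) * nb + g % nb
    return owner, local


def local_to_global(p, l, nb, nprocs):
    """Inverse map: local index l on process p back to the global index."""
    local_block = l // nb
    return (local_block * nprocs + p) * nb + l % nb


if __name__ == "__main__":
    # With nb=2 and nprocs=3, global indices 0..11 are owned by
    # processes 0,0,1,1,2,2,0,0,1,1,2,2.
    for g in range(12):
        print(g, block_cyclic_owner(g, nb=2, nprocs=3))
```

A two-dimensional distribution applies the same map independently to row and column indices over a process grid.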
URL: http://www.netlib.org/tennessee/ut-cs-93-205.ps title: HeNCE: A Heterogeneous Network Computing Environment, author: Adam Beguelin, Jack Dongarra, Al Geist, Robert Manchek, and Keith Moore abstract: Network computing seeks to utilize the aggregate resources of many networked computers to solve a single problem. In so doing it is often possible to obtain supercomputer performance from an inexpensive local area network. The drawback is that network computing is complicated and error prone when done by hand, especially if the computers have different operating systems and data formats and are thus heterogeneous. HeNCE (Heterogeneous Network Computing Environment) is an integrated graphical environment for creating and running parallel programs over a heterogeneous collection of computers. It is built on a lower level package called PVM. The HeNCE philosophy of parallel programming is to have the programmer graphically specify the parallelism of a computation and to automate, as much as possible, the tasks of writing, compiling, executing, debugging, and tracing the network computation. Key to HeNCE is a graphical language based on directed graphs that describe the parallelism and data dependencies of an application. Nodes in the graphs represent conventional Fortran or C subroutines and the arcs represent data and control flow. This paper describes the present state of HeNCE, its capabilities, limitations, and areas of future research. URL: http://www.netlib.org/tennessee/ut-cs-93-186.ps title: A Proposal for a User-Level, Message-Passing Interface in a Distributed Memory Environment author: Jack J. Dongarra, Rolf Hempel, Anthony J. G. Hey, and David W. Walker abstract: This paper describes Message Passing Interface 1 (MPI1), a proposed library interface standard for supporting point-to-point message passing. 
The intended standard will be provided with Fortran 77 and C interfaces, and will form the basis of a standard high level communication environment featuring collective communication and data distribution transformations. The standard proposed here provides blocking and nonblocking message passing between pairs of processes, with message selectivity by source process and message type. Provision is made for noncontiguous messages. Context control provides a convenient means of avoiding message selectivity conflicts between different phases of an application. The ability to form and manipulate process groups permits task parallelism to be exploited, and is a useful abstraction in controlling certain types of collective communication. URL: http://www.netlib.org/tennessee/ut-cs-93-214.ps author: Message Passing Interface Forum, title: DRAFT: Document for a Standard Message-Passing Interface, reference: University of Tennessee Technical Report CS-93-214, October 1993. abstract: The Message Passing Interface Forum (MPIF), with participation from over 40 organizations, has been meeting since January 1993 to discuss and define a set of library interface standards for message passing. MPIF is not sanctioned or supported by any official standards organization. This is a draft of what will become the Final Report, Version 1.0, of the Message Passing Interface Forum. This document contains all the technical features proposed for the interface. This copy of the draft was processed by LATEX on October 27, 1993. MPIF invites comments on the technical content of MPI, as well as on the editorial presentation in the document. Comments received before January 15, 1994 will be considered in producing the final draft of Version 1.0 of the Message Passing Interface Specification. The goal of the Message Passing Interface, simply stated, is to develop a widely used standard for writing message-passing programs. 
As such the interface should establish a practical, portable, efficient, and flexible standard for message passing. URL: http://www.netlib.org/tennessee/ut-cs-93-209.ps title: Efficient Communication Operations in Reconfigurable Parallel Computers author: F. Desprez, A. Ferreira, and B. Tourancheau, abstract: Reconfiguration is largely an unexplored property in the context of parallel models of computation. However, it is a powerful concept as far as massively parallel architectures are concerned, because it overcomes the constraints due to the bisection width arising in most distributed memory machines. In this paper, we show how to use reconfiguration in order to improve communication operations that are widely used in parallel applications. We propose quasi-optimal algorithms for broadcasting, scattering, gossiping and multi-scattering. URL: http://www.netlib.org/tennessee/ut-cs-93-208.ps title: Trace2au Audio Monitoring Tools for Parallel Programs, author: Jean-Yves Peterschmitt and Bernard Tourancheau abstract: It is not easy to reach the best performance you can expect of a parallel computer. We therefore have to use monitoring programs to study the performance of parallel programs. We introduce here a way to generate sound in real-time on a workstation, with no additional hardware, and we apply it to such monitoring programs. URL: http://www.netlib.org/tennessee/ut-cs-93-204.ps title: A General Approach to the Monitoring of Distributed Memory MIMD Multicomputers author: Maurice van Riek, Bernard Tourancheau, Xavier-Francois Vigouroux, abstract: Programs for distributed memory parallel machines are generally considered to be much more complex than sequential programs. Monitoring systems that collect runtime information about a program execution often prove a valuable help in gaining insight into the behavior of a parallel program and thus can increase its performance. 
This report describes in a systematic and comprehensive way the issues involved in the monitoring of parallel programs for distributed memory systems. It aims to provide a structured general approach to the field of monitoring and a guide for further documentation. First the different approaches to parallel monitoring are presented and the problems encountered are discussed and classified. In the second part, the main existing systems are described to provide the user with a feeling for the possibilities and limitations of real tools. URL: http://www.netlib.org/tennessee/ut-cs-93-210.ps author: Frederic Desprez, Pierre Fraigniaud, and Bernard Tourancheau title: Successive Broadcasts on Hypercube, reference: University of Tennessee Technical Report CS-93-210, August 1993. abstract: Broadcasting is an information dissemination problem in which information originating at one node of a communication network must be transmitted to all the other nodes as quickly as possible. In this paper, we consider the problem in which all the nodes of a network must, by turns, broadcast a distinct message. We call this problem the successive broadcasts problem. Successive broadcasts is a communication pattern that appears in several parallel implementations of linear algebra algorithms on distributed memory multicomputers. Note that the successive broadcasts problem is different from the gossip problem in which all the nodes must perform a broadcast in any order, even simultaneously. We present an algorithm solving the successive broadcasts problem on hypercubes. We derive a lower bound on the time of any successive broadcasts algorithm that shows that our algorithm is within a factor of 2 of optimal. URL: http://www.netlib.org/tennessee/ut-cs-93-222.ps title: Netlib Services and Resources, (Rev. 1) author: S. Browne, J. Dongarra, S. Green, E. Grosse, K. Moore, T. Rowan, and R. 
Wade abstract: The Netlib repository, maintained by the University of Tennessee and Oak Ridge National Laboratory, contains freely available software, documents, and databases of interest to the numerical, scientific computing, and other communities. This report includes both the Netlib User's Guide and the Netlib System Manager's Guide, and contains information about Netlib's databases, interfaces, and system implementation. The Netlib repository's databases include the Performance Database, the Conferences Database, and the NA-NET mail forwarding and Whitepages Databases. A variety of user interfaces enable users to access the Netlib repository in the manner most convenient and compatible with their networking capabilities. These interfaces include the Netlib email interface, the Xnetlib X Windows client, the netlibget command-line TCP/IP client, anonymous FTP, anonymous RCP, and gopher. URL: http://www.netlib.org/tennessee/ut-cs-94-226.ps author: Makan Pourzandi and Bernard Tourancheau title: A Parallel Performance Study of Jacobi-like Eigenvalue Solution reference: University of Tennessee Technical Report CS-94-226, March 1994. abstract: In this report we focus on the Jacobi-like solution of the eigenproblem for a real symmetric matrix from a parallel performance point of view: we try to optimize the algorithm by working on the communication-intensive part of the code. We discuss several parallel implementations and propose an implementation which overlaps communication with computation to reach a better efficiency. We show that the overlapping implementation can lead to significant improvements. We conclude by presenting our future work. URL: http://www.netlib.org/tennessee/ut-cs-94-229.ps author: James C. Browne, Jack Dongarra, Syed I. Hyder, Keith Moore, and Peter Newton, title: Visual Programming and Parallel Computing reference: University of Tennessee Technical Report CS-94-229, April 1994. 
abstract: Visual programming arguably provides greater benefit in explicit parallel programming, particularly coarse grain MIMD programming, than in sequential programming. Explicitly parallel programs are multi-dimensional objects; the natural representations of a parallel program are annotated directed graphs: data flow graphs, control flow graphs, etc. where the nodes of the graphs are sequential computations. The execution of parallel programs is a directed graph of instances of sequential computations. A visually based (directed graph) representation of parallel programs is thus more natural than a pure text string language where multi-dimensional structures must be implicitly defined. The naturalness of the annotated directed graph representation of parallel programs enables methods for programming and debugging which are qualitatively different and arguably superior to the conventional practice based on pure text string languages. Annotation of the graphs is a critical element of a practical visual programming system; text is still the best way to represent many aspects of programs. This paper presents a model of parallel programming and a model of execution for parallel programs which are the conceptual framework for a complete visual programming environment including capture of parallel structure, compilation and behavior analysis (performance and debugging). Two visually-oriented parallel programming systems, CODE 2.0 and HeNCE, each based on a variant of the model of programming, will be used to illustrate the concepts. The benefits of visually-oriented realizations of these models for program structure capture, software component reuse, performance analysis and debugging will be explored and hopefully demonstrated by examples in these representations. It is only by actually implementing and using visual parallel programming languages that we have been able to fully evaluate their merits. 
URL: http://www.netlib.org/tennessee/ut-cs-94-230.ps author: Message Passing Interface Forum, title: MPI: A Message-Passing Interface Standard, reference: University of Tennessee Technical Report CS-94-230, April 1994. abstract: The Message Passing Interface Forum (MPIF), with participation from over 40 organizations, has been meeting since November 1992 to discuss and define a set of library standards for message passing. MPIF is not sanctioned or supported by any official standards organization. The goal of the Message Passing Interface, simply stated, is to develop a widely used standard for writing message-passing programs. As such the interface should establish a practical, portable, efficient and flexible standard for message passing. This is the final report, Version 1.0, of the Message Passing Interface Forum. This document contains all the technical features proposed for the interface. This copy of the draft was processed by LATEX on April 21, 1994. Please send comments on MPI to mpi-comments@cs.utk.edu. Your comment will be forwarded to MPIF committee members who will attempt to respond. URL: http://www.netlib.org/tennessee/ut-cs-94-232.ps author: Robert J. Manchek, title: Design and Implementation of PVM Version 3, reference: University of Tennessee Technical Report CS-94-232, May 1994. abstract: There is a growing trend toward distributed computing - writing programs that run across multiple networked computers - to speed up computation, solve larger problems or withstand machine failures. A programming model commonly used to write distributed applications is message-passing, in which a program is decomposed into distinct subprograms that communicate and synchronize with one another by explicitly sending and receiving blocks of data. PVM (Parallel Virtual Machine) is a generic message-passing system composed of a programming library and manager processes. 
It ties together separate physical machines (possibly of different types), providing communication and control between the subprograms and detection of machine failures. The resulting virtual machine appears as a single, manageable resource. PVM is portable to a wide variety of machine architectures and operating systems, including workstations, supercomputers, PCs and multiprocessors. This paper describes the design, implementation and testing of version 3.3 of PVM and surveys related works. URL: http://www.netlib.org/tennessee/vp.ps author: Bruce J. MacLennan, title: Visualizing the Possibilities, reference: Commentary, Behavioral and Brain Sciences (1993) 16:2. abstract: I am in general agreement with Johnson-Laird \& Byrne's (J-L \& B's) approach and find their experiments convincing: therefore my commentary will be limited to several suggestions for extending and refining their theory. URL: http://www.netlib.org/tennessee/ipdn.ps author: Bruce MacLennan, title: Information Processing in the Dendritic Net reference: Ch. 6 of Rethinking Neural Networks: Quantum Fields \& Biological Data, Karl H. Pribram, ed., Lawrence Erlbaum Associates, Publishers, 1993, pp.~161-197. abstract: The goal of this paper is a model of the dendritic net that: (1) is mathematically tractable, (2) is reasonably true to the biology, and (3) illuminates information processing in the neuropil. First I discuss some general principles of mathematical modeling in a biological context that are relevant to the use of linearity and orthogonality in our models. Next I discuss the hypothesis that the dendritic net can be viewed as a linear field computer. Then I discuss the approximations involved in analyzing it as a dynamic, lumped-parameter, linear system. 
Within this basically linear framework I then present: (1) the self-organization of matched filters and of associative memories; (2) the dendritic computation of Gabor and other nonorthogonal representations; and (3) the possible effects of reverse current flow in neurons. URL: http://www.netlib.org/tennessee/fcb.ps author: Bruce MacLennan title: Field Computation in the Brain, reference: Ch. 7 of Rethinking Neural Networks: Quantum Fields \& Biological Data, Karl H. Pribram, ed., Lawrence Erlbaum Associates, Publishers, 1993, pp.~199-232. abstract: We begin with a brief consideration of the topology of knowledge. It has traditionally been assumed that true knowledge must be represented by discrete symbol structures, but recent research in psychology, philosophy and computer science has shown the fundamental importance of subsymbolic information processing, in which knowledge is represented in terms of very large numbers---or even continua---of microfeatures. We believe that this sets the stage for a fundamentally new theory of knowledge, and we sketch a theory of continuous information representation and processing. Next we consider field computation, a kind of continuous information processing that emphasizes spatially continuous fields of information. This is a reasonable approximation for macroscopic areas of cortex and provides a convenient mathematical framework for studying information processing at this level. We apply it also to a linear-systems model of dendritic information processing. We consider examples from the visual cortex, including Gabor and wavelet representations, and outline field-based theories of sensorimotor intentions and of model-based deduction. URL: http://www.netlib.org/tennessee/cckr.ps author: Bruce J. MacLennan title: Characteristics of Connectionist Knowledge Representation, reference: Information Sciences 70, pp. 119-143, 1993. 
abstract: Connectionism---the use of neural networks for knowledge representation and inference---has profound implications for the representation and processing of information because it provides a fundamentally new view of knowledge. However, its progress is impeded by the lack of a unifying theoretical construct corresponding to the idea of a calculus (or formal system) in traditional approaches to knowledge representation. Such a construct, called a simulacrum, is proposed here, and its basic properties are explored. We find that although exact classification is impossible, several other useful, robust kinds of classification are permitted. The representation of structured information and constituent structure are considered, and we find a basis for more flexible rule-like processing than that permitted by conventional methods. We discuss briefly logical issues such as decidability and computability and show that they require reformulation in this new context. Throughout we discuss the implications of this new theoretical framework for artificial intelligence and cognitive science. URL: http://www.netlib.org/tennessee/kohl-contact abstract: Information on where to contact author, James Arthur Kohl. URL: http://www.netlib.org/tennessee/kohl-93-mascots.ps author: T. L. Casavant, J. A. Kohl, title: "The IMPROV Meta-Tool Design Methodology for Visualization of Parallel Programs," reference: Invited Paper, International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), January 1993. abstract: A design methodology is presented that simplifies the creation of program visualization tools while maintaining a high degree of flexibility and expressive power. The approach is based on a "circulation architecture" model that organizes the details of the user specification, and provides a formal means for indicating relationships. 
The overall user specification is divided into independent modules containing distinct, well-defined entities, and the relationships among these module entities are identified using a powerful "mapping language". This language maps conditions on entities to manipulations that modify entities, resulting in dynamic animations of program behavior. The mapping language supports arbitrary levels of abstraction providing a full range of detail, and allowing efficient view development. To demonstrate the feasibility and usefulness of this approach, a specific program visualization meta-tool design, IMPROV, is described. URL: http://www.netlib.org/tennessee/kohl-92-compsac.tar author: J. A. Kohl, T. L. Casavant, title: "A Software Engineering, Visualization Methodology for Parallel Processing Systems," reference: Proceedings of the Sixteenth Annual International Computer Software & Applications Conference (COMPSAC), Chicago, Illinois, September 1992, pp. 51-56. abstract: This paper focuses on techniques for enhancing the feasibility of using graphic visualization in analyzing the complexities of parallel software. The central drawback to applying such visual techniques is the overhead in developing analysis tools with flexible, customized views. The "PARADISE" (PARallel Animated DebuggIng and Simulation Environment) system, which has been in operation since 1989, alleviates some of this design overhead by providing an abstract, object-oriented, visual modeling environment which expedites custom visual tool development. PARADISE is a visual tool which is used to develop other visual tools, or a "meta-tool". This paper complements previous work on PARADISE by describing the philosophy behind its design, and how that philosophy leads to a methodology for constructing visual models which characterize parallel systems in general. Emphasis will be on the crucial issues in utilizing visualization for parallel software development, and how PARADISE deals with these issues. 
URL: http://www.netlib.org/tennessee/kohl-92-prop.ps author: J. A. Kohl, title: "The Construction of Meta-Tools for Program Visualization of Parallel Software," reference: Ph.D. Thesis Proposal, Written Paper Accompanying Oral Comprehensive Examination, Technical Report Number TR-ECE-920204, Department of ECE, University of Iowa, Iowa City, IA, 52242, February 1992. abstract: This proposal provides a design methodology for program visualization meta-tools for parallel software that simplifies the use of such tools while maintaining a high degree of flexibility and expressive power. The approach is based on a "meta-tool circulation architecture" model that organizes the details of the user specification, and provides a circulation of information which supports a formal means for indicating relationships among that information. The overall user specification is divided into independent modules containing distinct entities, and the relationships among these module entities are identified using a powerful "relationship mapping language". This language maps conditions on selected entities to manipulations that modify the entities, allowing the state of an entity to be controlled in terms of the state of any other entity or itself. The mapping language supports arbitrary levels of abstraction in manipulating entities, allowing a full range of possible detail. As a result, visual analyses can be specified efficiently, utilizing only the minimum level of detail necessary. To demonstrate the feasibility and usefulness of this approach, a specific program visualization meta-tool design is proposed based on the methodology. URL: http://www.netlib.org/tennessee/kohl-92-ewpc-conf.tar author: T. L. Casavant, J. A. Kohl, Y. E. Papelis, title: "Practical Use of Visualization for Parallel Systems," reference: Invited Keynote Address Text for 1992 European Workshop on Parallel Computers (EWPC), Barcelona, Spain, March 23-24, 1992. 
abstract: This paper overviews the major contributions to the field of visualization as applied to parallel computing to date. Advances have come mostly from academics, but the influence on industrial and commercial settings for the future will be dramatic. The paper emphasizes how to improve the software development process for high-performance parallel computers through the use of visualization techniques both for program creation, as well as for debugging, verification, performance tuning, and maintenance. A concrete discussion of actual tool behavior is also presented. URL: http://www.netlib.org/tennessee/kohl-92-ewpc-full.tar author: T. L. Casavant, J. A. Kohl, Y. E. Papelis, title: "Practical Use of Visualization for Parallel Systems," reference: Technical Report Number TR-ECE-920102, Department of ECE, University of Iowa, Iowa City, IA, 52242, January 1992 (full version of EWPC 92 paper). abstract: This paper overviews the major contributions to the field of visualization as applied to parallel computing to date. Advances have come mostly from academics, but the influence on industrial and commercial settings for the future will be dramatic. The paper emphasizes how to improve the software development process for high-performance parallel computers through the use of visualization techniques both for program creation, as well as for debugging, verification, performance tuning, and maintenance. A concrete discussion of actual tool behavior is also presented. URL: http://www.netlib.org/tennessee/kohl-91-ipps.tar author: J. A. Kohl, T. L. Casavant, title: "Use of PARADISE: A Meta-Tool for Visualizing Parallel Systems," reference: Proceedings of the Fifth International Parallel Processing Symposium (IPPS), Anaheim, California, May 1991, pp. 561-567. abstract: This paper addresses the problem of creating software tools for visualizing the dynamic behavior of parallel applications and systems. 
"PARADISE" (PARallel Animated DebuggIng and Simulation Environment) approaches this problem by providing a "meta-tool" environment for generating custom visual analysis tools. PARADISE is a meta-tool because it is a tool which is utilized to create other tools. This paper focuses on the user's view of the use of PARADISE for constructing tools which analyze the interaction between parallel systems and parallel applications. An example of its use, involving the PASM Parallel Processing System, is given. URL: http://www.netlib.org/tennessee/kohl-91-santafe.ps author: J. A. Kohl, T. L. Casavant, title: "Methodologies for Rapid Prototyping of Tools for Visualizing the Performance of Parallel Systems," reference: Presentation at Workshop on Parallel Computer Systems: Software Tools, Santa Fe, New Mexico, October 1991. abstract: This presentation focuses on the issues encountered in developing visualization tools for performance tuning of parallel software. This task will be analyzed from the perspective of the user and the "meta- tool" designer. The talk will emphasize these two perspectives on performance tuning, as well as another approach which utilizes a limited tool kit. Then, the current state of the PARADISE tool, a meta-tool for analyzing parallel software, will be examined, along with other visual tools, to determine the extent to which each tool satisfies the goals and guidelines of the previous discussion. Finally, directions for future work will be explored. ( Note: Presentation slides only. ) URL: http://www.netlib.org/tennessee/kohl-91-comp.ps author: J. A. Kohl, title: "Visual Techniques for Parallel Processing," reference: Written Comprehensive Examination, University of Iowa, Department of Electrical and Computer Engineering, ECETR-910726, July 1991. 
abstract: This Comprehensive Examination consists of an accumulation and analysis of research on the use of visualization in computing systems over the past decade, as well as recent efforts specifically in the area of software development for parallel processing. The goal of the examination is to determine the relationships among the references located, and their cumulative effect in directing the course of future research in the field of visualization. The examination includes a creative portion in which the various uses and approaches for visualization are to be classified via a taxonomical system. This classification will identify the central issues which differentiate the visualization environments for developing parallel software. In addition, a quantitative assessment of these environments will be constructed which presents a more concrete evaluation and categorization technique.
URL: http://www.netlib.org/tennessee/kohl-91-901011.tar author: J. A. Kohl, T. L. Casavant, title: "PARADISE: A Meta-Tool for Program Visualization in Parallel Computing Systems," reference: Technical Report Number TR-ECE-901011, Department of ECE, University of Iowa, Iowa City, IA, 52242, Revised December 1991. abstract: This paper addresses the problem of creating software tools for visualizing the dynamic behavior of parallel applications and systems. "PARADISE" (PARallel Animated DebuggIng and Simulation Environment) approaches this problem by providing a "meta-tool" environment for generating custom visual analysis tools. PARADISE is a meta-tool because it is a tool which is utilized to create other tools. The fundamental concept is the use of abstract visual models to simulate complex, concurrent behavior. This paper focuses on the goals of PARADISE, and reflects on the extent to which the prototype system, which has been in operation since 1989, meets these goals.
The prototype system is described, along with a methodology for using visual modeling to analyze parallel software and systems. Examples of its use are also given.