From wade@cs.utk.edu Tue Feb 9 10:47:32 1993
Received: from THUD.CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA26386; Tue, 9 Feb 93 10:47:32 -0500
Received: from LOCALHOST.cs.utk.edu by thud.cs.utk.edu with SMTP (5.61++/2.7c-UTK) id AA26430; Tue, 9 Feb 93 10:47:26 -0500
Message-Id: <9302091547.AA26430@thud.cs.utk.edu>
To: pbwg-comm-archive@surfer.EPM.ORNL.GOV
Cc: wade@cs.utk.edu
Subject: test
Date: Tue, 09 Feb 93 10:47:25 EST
From: Reed Wade

From owner-pbwg-comm@CS.UTK.EDU Sun Feb 14 16:56:44 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA14958; Sun, 14 Feb 93 16:56:44 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA24403; Sun, 14 Feb 93 16:56:04 -0500
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Sun, 14 Feb 1993 16:56:03 EST
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from DASHER.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA24397; Sun, 14 Feb 93 16:56:02 -0500
From: Jack Dongarra
Received: by dasher.cs.utk.edu (5.61++/2.7c-UTK) id AA01941; Sun, 14 Feb 93 16:56:01 -0500
Date: Sun, 14 Feb 93 16:56:01 -0500
Message-Id: <9302142156.AA01941@dasher.cs.utk.edu>
To: pbwg-comm@cs.utk.edu
Subject: Parallel Benchmark Working Group Meeting

For planning purposes, I would like to know how many people will be
attending the Parallel Benchmark Working Group (PBWG) meeting on
March 1st and 2nd, 1993 in Knoxville, Tennessee. Let me know whether
you will or will not be attending the meeting.

Best wishes,
Jack

From @ecs.soton.ac.uk,@diana.ecs.soton.ac.uk:C.D.Collier@ecs.southampton.ac.uk Mon Jan 11 12:30:42 1993
Return-Path: <@ecs.soton.ac.uk,@diana.ecs.soton.ac.uk:C.D.Collier@ecs.southampton.ac.uk>
Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA04153; Mon, 11 Jan 93 12:30:09 -0500
Via: uk.ac.southampton.ecs; Mon, 11 Jan 1993 17:27:19 +0000
Via: diana.ecs.soton.ac.uk; Mon, 11 Jan 93 17:20:06 GMT
From: Christine Collier
Message-Id: <6780.9301111725@diana.ecs.soton.ac.uk>
Subject: Meeting Minneapolis 18th November, 1992
To: dbailey@nas.nasa.com, iyb@lanl.gov, sbveit@ksv.com,
    carterl@watson.ibm.com, thec@newton.national-physical-lab.co.uk,
    dongarra@cs.utk.edu, dem@cxa.dl.ac.uk, j.flemming@cray.com,
    gcfe@npac.syr.edu, danielf@kgnvma.vnet.ibm.com, paulg@meiko.com,
    gent@genias.de, cmg@cray.com, harp@revmes.mod.uk, siamak@fai.com,
    hempel@gmd.de, ajgh@ecs.soton.ac.uk, rwh@uk.pac.soton.ecs,
    j1mart@kgnvmz.vnet.ibm.com, hcooke@parsys.co.uk,
    messina@caltech.edu, Bminto@cray.com, dennisp@think.com,
    schneid@csrd.uiuc.edu, simon@nas.nasa.gov, actstea@ml.ruu.cc,
    frannie@Parsytech.de
Date: Mon, 11 Jan 1993 17:25:46 +0000 (GMT)
X-Mailer: ELM [version 2.4 PL8]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 7692
Status: RO

Parallel Benchmark Working Group

Summary of 1st Meeting held in Minneapolis Convention Centre
Wednesday November 18th, 1992

1. Introduction and Welcome

The meeting was opened by Professor Roger Hockney, who welcomed all
the participants and asked Tony Hey to say a few words about the
background to the meeting.

Tony Hey outlined the history of attempts in Europe to establish
credible and useful benchmarks for the evaluation of Distributed
Memory MIMD systems. At the time the European Genesis work began in
1988 there were no suitable DM message-passing benchmarks - although
the Caltech group also undertook a study at about the same time.
Four years later the scene has moved on, and besides the Genesis
benchmarks for message-passing Fortran programs, HPF/Fortran 90
benchmarks are now appearing. The Perfect Club are also now looking at
benchmarks for DM systems and the NASA Ames 'Pencil and Paper'
benchmarks are being taken seriously by the vendors. From a situation
where there were very few suitable benchmark codes we are now
approaching a situation where there is an over-abundance of benchmarks
and inevitable duplication of effort. This is undesirable both for the
vendors and for the end users of systems. With the adoption of HPF and
the current proposal for a standard message-passing interface (MPI),
discussed yesterday at the message-passing standards workshop, there
seems to be a real window of opportunity to gather together US and
European benchmarkers to agree on a useful subset.

Roger Hockney then asked those present to introduce themselves and
make some remark about their interest in this activity. A list of
attendees is attached to this summary. There was widespread support
for such an activity, and a summary of contributions follows:

Horst Simon of NASA Ames suggested that we follow the HPF model and
form working groups in identified areas. He also voiced reservations
about any attempt to create a fully comprehensive benchmark suite.

Dennis Parkinson of TMC supported the activity but pointed out that it
was important that this did not become "yet another" set of benchmarks
for vendors to implement. He also stressed the need for HPF versions
as well as message-passing versions, and raised the question of
scalability of benchmarks.

Aard van der Steen raised the question of Japanese participation in
such an activity.

Siamak Hassanzadeh of Fujitsu supported the proposal and suggested the
inclusion of seismic benchmarking codes.

Rolf Hempel of the GMD in Sankt Augustin stressed his organization's
support for the activity and briefly described the RAPS benchmarking
initiative in Europe.
This is a consortium of users and software houses, led by ECMWF and
including ESI and AVL, supported by Convex, Cray, Fujitsu, IBM, Intel
and Meiko. RAPS is an acronym for Real Applications on Parallel
Systems. The Genesis benchmark codes are included as a public domain
subset of RAPS but the major codes would not be publicly available.

Bill Minto of Cray UK stressed that real applications were important
and that it was also necessary to address as wide a spread of
applications as possible.

Geoffrey Fox of Syracuse suggested that it was only realistic at this
stage to attempt to co-ordinate benchmarking activities. His
activities at Syracuse had lately been concerned with constructing
benchmarks for validating HPF and Fortran 90 compilers.

Trevor Chambers of NPL, UK gave his support to the activity and
announced that a major new European benchmarking project, PEPS, with
NPL involvement, had just started.

Gordon Harp of DRA, Malvern UK said that the DRA (Defence Research
Agency) were involved in a collaboration with agencies in the US,
Canada and Australia looking at benchmarks for defence applications.
He stressed the need for real applications and said that issues such
as scalability and power consumption were important from the DRA
perspective.

Francis Wray of Parsytec and Paul Garrett of Meiko both voiced concern
that this activity should not generate more rather than less work in
procurement activities.

David Schneider from CSRD, Illinois, representing the Perfect Club,
welcomed an attempt to eliminate redundant effort. He stressed that
benchmarks had multiple uses - such as education and compiler
evaluation - as well as specific application knowledge. He thought
that there might be budget problems if a very organised activity was
envisaged. Benchmarks spanning a range of architectures and the need
for public domain codes were also stressed.

Joanne Martin of the HPSSL, IBM Kingston welcomed the initiative and
stressed how important it was for vendors not to be confronted with
many sets of "standard" benchmarks.

2. Objectives

There were no objections to the draft objectives for the group. These
objectives are:

1. To establish a comprehensive set of parallel benchmarks that is
   generally accepted by both users and vendors of parallel systems.

2. To provide a focus for parallel benchmark activities and avoid
   unnecessary duplication of effort and proliferation of benchmarks.

3. To set standards for benchmarking methodology and result-reporting
   together with a control database/repository for both the
   benchmarks and the results.

3. Mode of Working

3.1 It was agreed that an HPF-like forum style of working should be
    adopted with a view to convergence to an agreed set of benchmarks
    and procedures within 12 months.

3.2 Meetings every six weeks were not thought necessary, but two
    meetings a year were thought too few to generate momentum for the
    project.

3.3 Jack Dongarra volunteered to set up a database for benchmarks and
    results at ORNL. NPL were willing to maintain a European copy of
    this database.

3.4 It was agreed that all present should send existing benchmarks to
    Jack Dongarra at ORNL.

3.5 Jack Dongarra and Aard van der Steen agreed to examine the
    available benchmarks submitted and attempt to classify them
    appropriately.

3.6 Three other working groups were identified with named individuals
    taking responsibility for each group. These were as follows:

    Methodology:         Bailey, Hockney and Schneider
    Kernel Benchmarks:   Dongarra, Hockney, van der Steen, Wray
    Compiler Benchmarks: Fox, Grassl

3.7 A number of application areas were discussed as possible working
    groups, e.g. CFD, Seismic, QCD etc., but it was thought premature
    to activate such groups at this time.

4. Future Activity

4.1 Jack Dongarra agreed to set up a mail reflector at ORNL for the
    Parallel Benchmark Working Group (PBWG) and to organize the
    relevant subgroups along the lines of the HPF forum.

4.2 Jack Dongarra also agreed to host the next meeting of the PBWG.
    Subsequent discussions after the formal close of the meeting led
    to the dates of March 1st/2nd being selected.

4.3 At the March meeting, each subgroup will produce a discussion
    document and a benchmark database classification will be proposed.
    Further discussion on procedures for the PBWG was also deferred
    until then.

From owner-pbwg-comm@CS.UTK.EDU Sun Feb 21 12:08:07 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA10794; Sun, 21 Feb 93 12:08:07 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA19152; Sun, 21 Feb 93 12:07:34 -0500
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Sun, 21 Feb 1993 12:07:32 EST
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from THUD.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA19145; Sun, 21 Feb 93 12:07:28 -0500
From: Jack Dongarra
Received: by thud.cs.utk.edu (5.61++/2.7c-UTK) id AA05996; Sun, 21 Feb 93 12:07:24 -0500
Date: Sun, 21 Feb 93 12:07:24 -0500
Message-Id: <9302211707.AA05996@thud.cs.utk.edu>
To: pbwg-comm@cs.utk.edu
Subject: change of room for the PBWG meeting

We have had a change in the meeting room for the Parallel Benchmark
Working Group. The new meeting room is in the University Center,
Room 201. The PostScript file below contains a map that may help.

Look forward to seeing you next week.

Regards, Jack %!PS-Adobe-2.0 EPSF-1.2 %%DocumentFonts: Helvetica-Bold Courier Courier-Bold Times-Bold %%Pages: 1 %%BoundingBox: 39 -113 604 767 %%EndComments /arrowHeight 10 def /arrowWidth 5 def /IdrawDict 54 dict def IdrawDict begin /reencodeISO { dup dup findfont dup length dict begin { 1 index /FID ne { def }{ pop pop } ifelse } forall /Encoding ISOLatin1Encoding def currentdict end definefont } def /ISOLatin1Encoding [ /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /space/exclam/quotedbl/numbersign/dollar/percent/ampersand/quoteright /parenleft/parenright/asterisk/plus/comma/minus/period/slash /zero/one/two/three/four/five/six/seven/eight/nine/colon/semicolon /less/equal/greater/question/at/A/B/C/D/E/F/G/H/I/J/K/L/M/N /O/P/Q/R/S/T/U/V/W/X/Y/Z/bracketleft/backslash/bracketright /asciicircum/underscore/quoteleft/a/b/c/d/e/f/g/h/i/j/k/l/m /n/o/p/q/r/s/t/u/v/w/x/y/z/braceleft/bar/braceright/asciitilde /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/dotlessi/grave/acute/circumflex/tilde/macron/breve /dotaccent/dieresis/.notdef/ring/cedilla/.notdef/hungarumlaut /ogonek/caron/space/exclamdown/cent/sterling/currency/yen/brokenbar /section/dieresis/copyright/ordfeminine/guillemotleft/logicalnot /hyphen/registered/macron/degree/plusminus/twosuperior/threesuperior /acute/mu/paragraph/periodcentered/cedilla/onesuperior/ordmasculine /guillemotright/onequarter/onehalf/threequarters/questiondown /Agrave/Aacute/Acircumflex/Atilde/Adieresis/Aring/AE/Ccedilla /Egrave/Eacute/Ecircumflex/Edieresis/Igrave/Iacute/Icircumflex /Idieresis/Eth/Ntilde/Ograve/Oacute/Ocircumflex/Otilde/Odieresis /multiply/Oslash/Ugrave/Uacute/Ucircumflex/Udieresis/Yacute 
/Thorn/germandbls/agrave/aacute/acircumflex/atilde/adieresis /aring/ae/ccedilla/egrave/eacute/ecircumflex/edieresis/igrave /iacute/icircumflex/idieresis/eth/ntilde/ograve/oacute/ocircumflex /otilde/odieresis/divide/oslash/ugrave/uacute/ucircumflex/udieresis /yacute/thorn/ydieresis ] def /Helvetica-Bold reencodeISO def /Courier reencodeISO def /Courier-Bold reencodeISO def /Times-Bold reencodeISO def /none null def /numGraphicParameters 17 def /stringLimit 65535 def /Begin { save numGraphicParameters dict begin } def /End { end restore } def /SetB { dup type /nulltype eq { pop false /brushRightArrow idef false /brushLeftArrow idef true /brushNone idef } { /brushDashOffset idef /brushDashArray idef 0 ne /brushRightArrow idef 0 ne /brushLeftArrow idef /brushWidth idef false /brushNone idef } ifelse } def /SetCFg { /fgblue idef /fggreen idef /fgred idef } def /SetCBg { /bgblue idef /bggreen idef /bgred idef } def /SetF { /printSize idef /printFont idef } def /SetP { dup type /nulltype eq { pop true /patternNone idef } { dup -1 eq { /patternGrayLevel idef /patternString idef } { /patternGrayLevel idef } ifelse false /patternNone idef } ifelse } def /BSpl { 0 begin storexyn newpath n 1 gt { 0 0 0 0 0 0 1 1 true subspline n 2 gt { 0 0 0 0 1 1 2 2 false subspline 1 1 n 3 sub { /i exch def i 1 sub dup i dup i 1 add dup i 2 add dup false subspline } for n 3 sub dup n 2 sub dup n 1 sub dup 2 copy false subspline } if n 2 sub dup n 1 sub dup 2 copy 2 copy false subspline patternNone not brushLeftArrow not brushRightArrow not and and { ifill } if brushNone not { istroke } if 0 0 1 1 leftarrow n 2 sub dup n 1 sub dup rightarrow } if end } dup 0 4 dict put def /Circ { newpath 0 360 arc patternNone not { ifill } if brushNone not { istroke } if } def /CBSpl { 0 begin dup 2 gt { storexyn newpath n 1 sub dup 0 0 1 1 2 2 true subspline 1 1 n 3 sub { /i exch def i 1 sub dup i dup i 1 add dup i 2 add dup false subspline } for n 3 sub dup n 2 sub dup n 1 sub dup 0 0 false subspline n 2 
sub dup n 1 sub dup 0 0 1 1 false subspline patternNone not { ifill } if brushNone not { istroke } if } { Poly } ifelse end } dup 0 4 dict put def /Elli { 0 begin newpath 4 2 roll translate scale 0 0 1 0 360 arc patternNone not { ifill } if brushNone not { istroke } if end } dup 0 1 dict put def /Line { 0 begin 2 storexyn newpath x 0 get y 0 get moveto x 1 get y 1 get lineto brushNone not { istroke } if 0 0 1 1 leftarrow 0 0 1 1 rightarrow end } dup 0 4 dict put def /MLine { 0 begin storexyn newpath n 1 gt { x 0 get y 0 get moveto 1 1 n 1 sub { /i exch def x i get y i get lineto } for patternNone not brushLeftArrow not brushRightArrow not and and { ifill } if brushNone not { istroke } if 0 0 1 1 leftarrow n 2 sub dup n 1 sub dup rightarrow } if end } dup 0 4 dict put def /Poly { 3 1 roll newpath moveto -1 add { lineto } repeat closepath patternNone not { ifill } if brushNone not { istroke } if } def /Rect { 0 begin /t exch def /r exch def /b exch def /l exch def newpath l b moveto l t lineto r t lineto r b lineto closepath patternNone not { ifill } if brushNone not { istroke } if end } dup 0 4 dict put def /Text { ishow } def /idef { dup where { pop pop pop } { exch def } ifelse } def /ifill { 0 begin gsave patternGrayLevel -1 ne { fgred bgred fgred sub patternGrayLevel mul add fggreen bggreen fggreen sub patternGrayLevel mul add fgblue bgblue fgblue sub patternGrayLevel mul add setrgbcolor eofill } { eoclip originalCTM setmatrix pathbbox /t exch def /r exch def /b exch def /l exch def /w r l sub ceiling cvi def /h t b sub ceiling cvi def /imageByteWidth w 8 div ceiling cvi def /imageHeight h def bgred bggreen bgblue setrgbcolor eofill fgred fggreen fgblue setrgbcolor w 0 gt h 0 gt and { l b translate w h scale w h true [w 0 0 h neg 0 h] { patternproc } imagemask } if } ifelse grestore end } dup 0 8 dict put def /istroke { gsave brushDashOffset -1 eq { [] 0 setdash 1 setgray } { brushDashArray brushDashOffset setdash fgred fggreen fgblue setrgbcolor } ifelse 
brushWidth setlinewidth originalCTM setmatrix stroke grestore } def /ishow { 0 begin gsave fgred fggreen fgblue setrgbcolor /fontDict printFont printSize scalefont dup setfont def /descender fontDict begin 0 [FontBBox] 1 get FontMatrix end transform exch pop def /vertoffset 1 printSize sub descender sub def { 0 vertoffset moveto show /vertoffset vertoffset printSize sub def } forall grestore end } dup 0 3 dict put def /patternproc { 0 begin /patternByteLength patternString length def /patternHeight patternByteLength 8 mul sqrt cvi def /patternWidth patternHeight def /patternByteWidth patternWidth 8 idiv def /imageByteMaxLength imageByteWidth imageHeight mul stringLimit patternByteWidth sub min def /imageMaxHeight imageByteMaxLength imageByteWidth idiv patternHeight idiv patternHeight mul patternHeight max def /imageHeight imageHeight imageMaxHeight sub store /imageString imageByteWidth imageMaxHeight mul patternByteWidth add string def 0 1 imageMaxHeight 1 sub { /y exch def /patternRow y patternByteWidth mul patternByteLength mod def /patternRowString patternString patternRow patternByteWidth getinterval def /imageRow y imageByteWidth mul def 0 patternByteWidth imageByteWidth 1 sub { /x exch def imageString imageRow x add patternRowString putinterval } for } for imageString end } dup 0 12 dict put def /min { dup 3 2 roll dup 4 3 roll lt { exch } if pop } def /max { dup 3 2 roll dup 4 3 roll gt { exch } if pop } def /midpoint { 0 begin /y1 exch def /x1 exch def /y0 exch def /x0 exch def x0 x1 add 2 div y0 y1 add 2 div end } dup 0 4 dict put def /thirdpoint { 0 begin /y1 exch def /x1 exch def /y0 exch def /x0 exch def x0 2 mul x1 add 3 div y0 2 mul y1 add 3 div end } dup 0 4 dict put def /subspline { 0 begin /movetoNeeded exch def y exch get /y3 exch def x exch get /x3 exch def y exch get /y2 exch def x exch get /x2 exch def y exch get /y1 exch def x exch get /x1 exch def y exch get /y0 exch def x exch get /x0 exch def x1 y1 x2 y2 thirdpoint /p1y exch def /p1x exch 
def x2 y2 x1 y1 thirdpoint /p2y exch def /p2x exch def x1 y1 x0 y0 thirdpoint p1x p1y midpoint /p0y exch def /p0x exch def x2 y2 x3 y3 thirdpoint p2x p2y midpoint /p3y exch def /p3x exch def movetoNeeded { p0x p0y moveto } if p1x p1y p2x p2y p3x p3y curveto end } dup 0 17 dict put def /storexyn { /n exch def /y n array def /x n array def n 1 sub -1 0 { /i exch def y i 3 2 roll put x i 3 2 roll put } for } def %%EndProlog %%BeginIdrawPrologue /arrowhead { 0 begin transform originalCTM itransform /taily exch def /tailx exch def transform originalCTM itransform /tipy exch def /tipx exch def /dy tipy taily sub def /dx tipx tailx sub def /angle dx 0 ne dy 0 ne or { dy dx atan } { 90 } ifelse def gsave originalCTM setmatrix tipx tipy translate angle rotate newpath arrowHeight neg arrowWidth 2 div moveto 0 0 lineto arrowHeight neg arrowWidth 2 div neg lineto patternNone not { originalCTM setmatrix /padtip arrowHeight 2 exp 0.25 arrowWidth 2 exp mul add sqrt brushWidth mul arrowWidth div def /padtail brushWidth 2 div def tipx tipy translate angle rotate padtip 0 translate arrowHeight padtip add padtail add arrowHeight div dup scale arrowheadpath ifill } if brushNone not { originalCTM setmatrix tipx tipy translate angle rotate arrowheadpath istroke } if grestore end } dup 0 9 dict put def /arrowheadpath { newpath arrowHeight neg arrowWidth 2 div moveto 0 0 lineto arrowHeight neg arrowWidth 2 div neg lineto } def /leftarrow { 0 begin y exch get /taily exch def x exch get /tailx exch def y exch get /tipy exch def x exch get /tipx exch def brushLeftArrow { tipx tipy tailx taily arrowhead } if end } dup 0 4 dict put def /rightarrow { 0 begin y exch get /tipy exch def x exch get /tipx exch def y exch get /taily exch def x exch get /tailx exch def brushRightArrow { tipx tipy tailx taily arrowhead } if end } dup 0 4 dict put def %%EndIdrawPrologue %I Idraw 10 Grid 2.84217e-39 0 %%Page: 1 1 Begin %I b u %I cfg u %I cbg u %I f u %I p u %I t [ 0.799705 0 0 0.799705 0 0 ] concat 
/originalCTM matrix currentmatrix def Begin %I Pict %I b u %I cfg u %I cbg u %I f u %I p u %I t [ 1 0 0 1 89.1002 831.2 ] concat Begin %I Elli %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.09512e-08 0.9 -0.9 1.09512e-08 584.1 0.89999 ] concat %I 79 550 18 17 Elli End Begin %I Line %I b 65535 3 0 1 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.09512e-08 0.9 -0.9 1.09512e-08 541.8 -27 ] concat %I 110 466 110 542 Line %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.09512e-08 0.9 -0.9 1.09512e-08 541.8 -27 ] concat %I 82 504 140 504 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-helvetica-bold-r-*-140-* Helvetica-Bold 14 SetF %I t [ 1.2168e-08 1 -1 1.2168e-08 35.5 66.1 ] concat %I [ (N) ] Text End End %I eop Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.23147 0.0157385 -0.0157385 -1.23147 409.218 -127.169 ] concat %I [ (Voluteer Boulevard) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 3.66661e-08 3.01333 -3.01333 3.66661e-08 57.9267 164.513 ] concat %I [ (UT Campus -- Jack Dongarra's Lab) ] Text End Begin %I Rect none SetB %I b n %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 1 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 693.5 -156 ] concat %I 17 61 177 478 Rect End Begin %I Line %I b 65535 2 1 1 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 6.9812e-09 0.698798 -0.573736 8.50294e-09 453.281 -79.7194 ] concat %I 158 545 1015 545 Line %I 1 End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 333.274 -81.9323 ] concat %I 257 403 4 4 Elli End Begin %I Line %I b 65535 2 0 1 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 
1 1 SetCBg %I p 0 SetP %I t [ 7.02094e-09 0.702776 -0.577002 8.55135e-09 455.344 -80.5588 ] concat %I 211 544 211 17 Line %I 1 End Begin %I BSpl %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 6.9812e-09 0.698798 -0.573736 8.50294e-09 455.594 -124.448 ] concat %I 5 503 545 514 535 546 529 628 529 628 529 5 BSpl %I 1 End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 333.274 76.6169 ] concat %I 257 403 4 4 Elli End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 403.382 -81.1611 ] concat %I 257 403 4 4 Elli End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.15378e-08 1.1549 -0.948215 1.40528e-08 222.192 481.065 ] concat %I [ (DOWN) (TOWN) ] Text End Begin %I Poly %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.75 SetP %I t [ 2.43189e-09 0.222447 -0.19986 2.70672e-09 310.54 254.841 ] concat %I 4 301 50 850 50 850 537 301 537 4 Poly End Begin %I BSpl %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 9 251 541 251 516 251 507 257 495 263 489 413 491 472 496 485 499 485 499 9 BSpl %I 1 End Begin %I BSpl %I b 65520 2 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.18822e-09 0.43379 -0.755115 5.27834e-09 620.126 111.397 ] concat %I 6 486 498 502 500 510 504 514 514 514 529 513 542 6 BSpl %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2378 ] concat %I 476 654 476 542 Line %I 1 End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I 
cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 445.797 -81.1611 ] concat %I 257 403 4 4 Elli End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 252.947 159.612 ] concat %I [ (Volunteer Boulevard) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 310.947 152.672 ] concat %I [ (Neyland Drive) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 198.965 155.756 ] concat %I [ (Cumberland Avenue) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 130.329 90.977 ] concat %I [ (Interstate 40) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 130.329 403.305 ] concat %I [ (Interstate 40) ] Text End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 332.433 366.124 ] concat %I 257 403 4 4 Elli End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 107 542 486 541 Line %I 1 End Begin %I Line %I b 65520 2 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 487 541 664 541 Line %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 665 541 682 540 Line %I 
1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -51.4666 ] concat %I 681 599 681 485 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.931845 2.26773e-08 -2.26773e-08 -0.931845 205.214 462.905 ] concat %I [ (Henley Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.931845 2.26773e-08 -2.26773e-08 -0.931845 193.647 318.042 ] concat %I [ (17th Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 110.137 205.729 ] concat %I [ (17th Street Exit) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 110.137 26.0435 ] concat %I [ (Airport/Alcoa Highway Exit) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-times-bold-r-*-140-* Times-Bold 14 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 171.727 80.0262 ] concat %I [ (Cumberland Avenue Exit) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 265.915 92.3651 ] concat %I [ (Neyland Drive Exit) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 110.137 509.574 ] concat %I [ (Summit Hill Exit) ] Text End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 144 661 155 636 Line %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 
628.85 -52.2379 ] concat %I 372 662 361 634 Line %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 752 660 736 634 Line %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2378 ] concat %I 186 466 160 486 Line %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2378 ] concat %I 180 576 157 542 Line %I 1 End Begin %I Line %I b 65520 0 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -5.99997 ] concat %I 475 50 328 492 Line %I 1 End Begin %I Line %I b 65520 0 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -5.99994 ] concat %I 475 483 329 589 Line %I 1 End Begin %I Line %I b 65520 0 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -6 ] concat %I 962 483 450 588 Line %I 1 End Begin %I Line %I b 65520 0 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -6 ] concat %I 450 491 474 471 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 456.136 36.7289 ] concat %I [ (To Airport) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 137.137 639.729 ] concat %I [ (To Ashville, Bristol) ] Text End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 
SetCBg %I p 0.5 SetP %I t [ -0.288762 0.966405 -0.966405 -0.288762 1070.92 66.5381 ] concat %I 743 193 51 70 Elli End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 600.094 464.292 ] concat %I 11 203 116 212 116 212 140 226 140 226 122 222 122 222 102 212 102 212 110 203 110 203 113 11 Poly End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.0258e-08 0.843029 -0.843029 1.0258e-08 619.253 516.221 ] concat %I 12 183 114 183 124 193 124 193 135 205 135 205 125 248 90 242 83 239 86 231 76 194 106 194 114 12 Poly End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.11572e-08 0.810855 -0.916932 9.86645e-09 659.418 526.458 ] concat %I 8 251 106 251 126 300 126 300 126 300 116 289 116 289 106 289 106 8 Poly End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.63639e-08 1.34483 -1.34483 1.63639e-08 1018.44 -282.452 ] concat %I 791 383 799 395 Rect End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 900.11 -9.08362 ] concat %I 12 776 339 776 352 798 352 803 356 808 350 807 348 812 342 808 336 803 341 789 343 782 333 776 333 12 Poly End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.13408e-08 0.952586 -0.932017 1.1591e-08 882.447 11.9457 ] concat %I 885 359 901 436 Rect End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 1 SetP %I t [ -0.249802 0.836016 -0.836016 -0.249802 1016.8 155.898 ] concat %I 743 193 51 70 Elli End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.75 SetP %I t [ -0.289327 
0.966236 -0.966236 -0.289327 1071.6 80.2259 ] concat %I 707 160 754 232 Rect End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 0.240969 0.462035 -0.606024 0.318423 615.648 549.235 ] concat %I 13 164 162 182 162 182 167 235 167 234 162 254 162 254 134 234 133 235 129 183 129 183 134 164 134 164 149 13 Poly End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 9.95725e-09 0.885621 -0.818318 1.07762e-08 598.215 291.204 ] concat %I 385 148 422 197 Rect End Begin %I Pict %I b u %I cfg Black 0 0 0 SetCFg %I cbg u %I f u %I p u %I t [ 1.05665 0 0 1.05665 213.224 -6.32959 ] concat Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg DkGray 0.501961 0.501961 0.501961 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 8.1286e-09 0.496511 -0.668033 6.04153e-09 312.332 583.056 ] concat %I 13 204 92 204 123 226 123 226 113 248 113 248 123 264 123 264 92 248 92 248 95 226 95 226 92 225 92 13 Poly End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg DkGray 0.501961 0.501961 0.501961 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 8.1286e-09 -0.496511 -0.668033 -6.04153e-09 312.332 845.214 ] concat %I 13 204 92 204 123 226 123 226 113 248 113 248 123 264 123 264 92 248 92 248 95 226 95 226 92 225 92 13 Poly End Begin %I Rect none SetB %I b n %I cfg DkGray 0.501961 0.501961 0.501961 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 8.1286e-09 0.496511 -0.668033 6.04153e-09 311.664 582.559 ] concat %I 258 92 274 121 Rect End End %I eop Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 532.584 731.01 ] concat %I [ (Physics) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 555.924 783.814 ] concat %I [ (Geography) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f 
*-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 565.058 787.873 ] concat %I [ (& Geology) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 0.737179 0.804195 -0.804195 0.737179 504.155 697.286 ] concat %I [ (Biology) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 366.956 772.951 ] concat %I [ (13th Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 511.152 538.704 ] concat %I [ (Voluteer Boulevard) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 0.46602 0.986397 -0.986397 0.466021 521.533 631.373 ] concat %I [ (Middle Way) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 373.044 540.56 ] concat %I [ (16th Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 363.799 672.151 ] concat %I [ (Walters) (Life) (Sciences) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 537.034 899.133 ] concat %I [ (Daughtery) (Engineering) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.60693e-08 1.32061 -1.32061 1.60693e-08 657.627 707.825 ] concat %I [ (Neyland) (Stadium) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.0536 -0.283024 0.283024 -1.0536 624.893 639.464 ] concat %I [ (Stadium Drive) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 
1.32746e-08 450.384 486.441 ] concat %I [ (Library) ] Text End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.645 -1.75617 ] concat %I 4 483 431 523 431 523 391 481 389 4 Poly End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 412.866 555.498 ] concat %I [ (University) ( Center) ] Text End Begin %I BSpl %I b 65520 2 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.438 0.0836792 ] concat %I 6 753 467 837 468 841 464 846 464 843 420 841 419 6 BSpl %I 1 End Begin %I BSpl %I b 65520 2 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.921 0.182861 ] concat %I 6 788 313 830 316 840 348 841 386 843 417 843 418 6 BSpl %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.435 0.47699 ] concat %I 887 450 839 438 Line %I 1 End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.589 0.222778 ] concat %I 9 807 460 838 460 838 405 816 404 816 394 831 394 831 379 807 378 806 379 9 Poly End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.838 0.0155029 ] concat %I 796 369 781 389 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 523.481 806.106 ] concat %I [ (South) (College) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t 
[ 1.21714e-08 1.00029 -1.00029 1.21714e-08 439.801 711.069 ] concat %I [ (Ayres Hall) ] Text End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07786e-08 0.885813 -0.885813 1.07786e-08 771.842 61.8701 ] concat %I 943 284 915 303 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 459.421 900.429 ] concat %I [ (Dabney/) (Buhler) ] Text End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.07786e-08 0.885813 -0.885813 1.07786e-08 767.413 57.441 ] concat %I 698 424 663 374 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-times-bold-r-*-140-* Times-Bold 14 SetF %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 464.596 743.589 ] concat %I [ (X) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09087 0.0128026 -0.0128027 -1.09087 503.693 612.151 ] concat %I [ (Stadium Drive) ] Text End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p < cc cc 33 33 cc cc 33 33 > -1 SetP %I t [ 1.07786e-08 0.885813 -0.885813 1.07786e-08 855.109 167.282 ] concat %I 465 390 498 427 Rect End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 6.73658e-10 0.0553633 -0.0553633 6.73658e-10 494.25 582.008 ] concat %I 11 509 1018 237 1018 237 1114 -35 1114 -35 1018 -307 1018 -307 746 -35 746 -35 378 509 378 509 380 11 Poly End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 485.903 693.513 ] concat %I [ (Psychology) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 9.87788e-09 0.811792 -0.811792 9.87788e-09 526.522 575.313 ] concat %I [ (Parking) 
(Garage) ] Text End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p < cc cc 33 33 cc cc 33 33 > -1 SetP %I t [ 1.07786e-08 0.885813 -0.885813 1.07786e-08 774.943 159.309 ] concat %I 481 283 491 300 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 368.693 657.005 ] concat %I [ (15th Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 374.042 893.422 ] concat %I [ (11th Street) ] Text End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07785e-08 0.885813 -0.885813 1.07785e-08 786.901 202.714 ] concat %I 18 476 304 494 301 501 299 527 283 545 269 569 255 590 244 611 241 648 238 672 234 686 219 705 203 741 204 767 217 776 236 774 284 776 343 773 538 18 BSpl %I 1 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07785e-08 0.885813 -0.885813 1.07785e-08 786.901 202.714 ] concat %I 376 429 375 538 Line %I 1 End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07785e-08 0.885813 -0.885813 1.07785e-08 786.901 202.714 ] concat %I 8 376 429 375 275 369 238 352 223 345 219 323 208 303 203 303 204 8 BSpl %I 1 End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07785e-08 0.885813 -0.885813 1.07785e-08 786.901 202.714 ] concat %I 4 301 50 850 50 850 537 301 537 4 Poly End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -5.99995 ] concat %I 5 814 52 872 66 910 82 947 107 961 127 5 BSpl %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f 
*-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.520591 0.958718 -0.958718 -0.520591 712.173 866.792 ] concat %I [ (Neyland Drive) ] Text End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 653.5 ] concat %I 24 459 412 422 314 224 273 224 233 224 152 305 80 345 72 386 31 418 -33 418 -122 394 -316 474 -461 652 -542 854 -566 1015 -566 1152 -501 1184 -445 1176 -203 1168 15 1031 233 789 314 547 314 432 351 428 350 24 BSpl %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 653.5 ] concat %I 916 415 915 1204 Line %I 8 End Begin %I BSpl %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 620.75 ] concat %I 5 486 195 402 153 394 72 402 -73 394 -73 5 BSpl %I 8 End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 604.375 ] concat %I 11 133 415 132 396 110 602 231 715 316 751 330 765 387 857 351 977 344 1126 358 1041 351 1190 11 BSpl %I 8 End Begin %I MLine %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 604.375 ] concat %I 3 133 410 153 -454 613 -2226 3 MLine %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 604.375 ] concat %I 663 276 137 276 Line %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 527 522.5 ] concat %I 96 105 808 79 Line %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg 
White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 527 522.5 ] concat %I 95 105 -425 89 Line %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 483.5 457 ] concat %I 99 587 3984 588 Line %I 8 End Begin %I BSpl %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 6.91136e-09 0.653301 -0.567997 7.94935e-09 450.164 -70.1652 ] concat %I 15 211 347 224 320 247 286 278 265 315 254 368 252 499 251 582 255 629 257 783 266 863 302 880 318 903 384 900 434 898 545 15 BSpl %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 1.2168e-08 1 -1 1.2168e-08 288.5 532.5 ] concat %I [ (Jack Dongarra's office in Ayres Hall Room 107) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ -0.792068 1.92757e-08 -1.92757e-08 -0.792068 428.69 73.2605 ] concat %I [ (Airport/Alcoa Highway) ] Text End Begin %I Rect %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p < 88 44 22 11 88 44 22 11 > -1 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 610 221 ] concat %I 258 413 267 424 Rect End Begin %I Rect %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p < 88 44 22 11 88 44 22 11 > -1 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 593 221 ] concat %I 258 413 267 424 Rect End Begin %I MLine %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 773 -23 ] concat %I 3 558 602 521 602 507 598 3 MLine %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 773 -23 ] concat %I 326 526 326 467 Line %I 1 End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 
SetCBg %I p 0.5 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 773 -23 ] concat %I 564 368 588 385 Rect End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 773 26.9999 ] concat %I 564 368 588 385 Rect End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 8.96587e-09 0.736842 -0.736842 8.96587e-09 708.816 120.079 ] concat %I 4 616 414 631 414 631 431 616 431 4 Poly End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.855758 0.676646 -0.676646 -0.855758 381.905 599.524 ] concat %I [ (Law Builfinh) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.855758 0.676646 -0.676646 -0.855758 383.905 547.524 ] concat %I [ (Pan-Helenic Bldg.) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.855758 0.676646 -0.676646 -0.855758 384.905 572.524 ] concat %I [ (International House.) ] Text End Begin %I MLine %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 773 -23 ] concat %I 2 559 582 507 581 2 MLine %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 186.965 539.756 ] concat %I [ (Ramada Inn) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 167.965 538.756 ] concat %I [ (Hilton) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 1.2168e-08 1 -1 1.2168e-08 514.5 56.5 ] concat %I [ (Directions from the airport to Ayres Hall:) () ( Alcoa Highway North to Cumberland Avenue) () ( Cumberland Avenue east to Stadium Drive) ( \(Stadium Dr. 
is across from 15th St.\)) () ( Park at Parking Garage and walk up hill) ( to largest building, Ayres Hall) ] Text End End %I eop
showpage
%%Trailer
end

From owner-pbwg-comm@CS.UTK.EDU Mon Feb 22 10:36:00 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA05312; Mon, 22 Feb 93 10:36:00 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA28038; Mon, 22 Feb 93 10:34:47 -0500
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 22 Feb 1993 10:34:46 EST
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from antares.mcs.anl.gov by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA28025; Mon, 22 Feb 93 10:34:42 -0500
Received: from jadoube.mcs.anl.gov by antares.mcs.anl.gov with SMTP id AA23129 (5.65c/IDA-1.4.4 for ); Mon, 22 Feb 1993 09:34:38 -0600
From: David Levine
Received: by jadoube.mcs.anl.gov (4.1/GeneV4) id AA11892; Mon, 22 Feb 93 09:34:37 CST
Date: Mon, 22 Feb 93 09:34:37 CST
Message-Id: <9302221534.AA11892@jadoube.mcs.anl.gov>
To: pbwg-comm@cs.utk.edu

send archive from pbwg

From owner-pbwg-comm@CS.UTK.EDU Mon Feb 22 15:29:10 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA15809; Mon, 22 Feb 93 15:29:10 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA15065; Mon, 22 Feb 93 15:27:47 -0500
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 22 Feb 1993 15:27:45 EST
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from adios.brl.mil by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA15057; Mon, 22 Feb 93 15:27:44 -0500
Date: Mon, 22 Feb 93 15:24:05 EST
From: TomC
To: pbwg-comm@CS.UTK.EDU
Cc: TomC , coleman@BRL.MIL, apress@BRL.MIL
Subject: Additions to pbwg maillist
Message-Id: <9302221524.aa18550@ADIOS.BRL.MIL>

Please include the following in the above mail list

  crimmins@brl.mil   Tom Crimmins    410-278-6267
  monte@brl.mil      Monte Coleman   410-278-6261
  apress@brl.mil     Tony Pressley   410-278-6509

Thanks,
Tom Crimmins

From owner-pbwg-comm@CS.UTK.EDU Mon Feb 22 17:04:11 1993
Received: from CS.UTK.EDU by
surfer.EPM.ORNL.GOV (5.61/1.34) id AA19520; Mon, 22 Feb 93 17:04:11 -0500 Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA19497; Mon, 22 Feb 93 17:03:39 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 22 Feb 1993 17:03:38 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from climate.engin.umich.edu by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA19489; Mon, 22 Feb 93 17:03:36 -0500 Received: by climate.engin.umich.edu (5.67/1.35) id AA21283; Mon, 22 Feb 93 17:05:11 -0500 Date: Mon, 22 Feb 93 17:05:11 -0500 From: Hal G Marshall Message-Id: <9302222205.AA21283@climate.engin.umich.edu> To: pbwg-comm@cs.utk.edu Cc: idaho@CS.UTK.EDU Please add me to your list for MIMD benchmarks. -Hal From owner-pbwg-comm@CS.UTK.EDU Mon Mar 1 05:14:41 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA28056; Mon, 1 Mar 93 05:14:41 -0500 Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA15881; Mon, 1 Mar 93 05:13:28 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 1 Mar 1993 05:13:27 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from Mail.Think.COM by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA15867; Mon, 1 Mar 93 05:13:24 -0500 Received: from Godot.Think.COM by mail.think.com; Mon, 1 Mar 93 05:13:22 -0500 Received: by godot.think.com (4.1/Think-1.2) id AA16751; Mon, 1 Mar 93 05:14:28 EST Message-Id: <9303011014.AA16751@godot.think.com> To: Jack Dongarra Cc: pbwg-comm@cs.utk.edu Subject: Re: Parallel Benchmark Working Group Meeting In-Reply-To: Your message of "Sun, 14 Feb 93 16:56:01 EST." 
<9302142156.AA01941@dasher.cs.utk.edu>
Date: Mon, 01 Mar 93 05:14:28 EST
From: Dennis Parkinson

I apologize; I will not be able to attend the meeting. Sorry for the late notification.

Dennis Parkinson

From owner-pbwg-comm@CS.UTK.EDU Mon Mar 1 07:39:16 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA25550; Mon, 1 Mar 93 07:39:16 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA25188; Mon, 1 Mar 93 07:38:13 -0500
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 1 Mar 1993 07:38:12 EST
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from inetg1.Arco.COM by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA25180; Mon, 1 Mar 93 07:38:11 -0500
Received: by Arco.COM (4.1/SMI-4.1) id AA15125; Mon, 1 Mar 93 06:38:08 CST
Date: Mon, 1 Mar 93 06:38:08 CST
From: ccm@Arco.COM (Chuck Mosher (214)754-6468)
Message-Id: <9303011238.AA15125@Arco.COM>
To: pbwg-comm@cs.utk.edu
Subject: Addition to the List

Please add me to the parallel benchmark working group mailing list.
Regards, Chuck Mosher ccm@arco.com From owner-pbwg-comm@CS.UTK.EDU Mon Mar 1 10:49:36 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA29573; Mon, 1 Mar 93 10:49:36 -0500 Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA05800; Mon, 1 Mar 93 10:48:55 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 1 Mar 1993 10:48:54 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from ben.uknet.ac.uk by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA05792; Mon, 1 Mar 93 10:48:45 -0500 Received: from eros.uknet.ac.uk by ben.uknet.ac.uk via UKIP with SMTP (PP) id ; Mon, 1 Mar 1993 15:48:23 +0000 Received: from parsys.co.uk by eros.uknet.ac.uk with UUCP id <9375-0@eros.uknet.ac.uk>; Mon, 1 Mar 1993 15:48:01 +0000 Message-Id: <2594.9303011242@boston.parsys.co.uk> Received: from parsys.co.uk (gmail) by boston.parsys.co.uk; Mon, 1 Mar 93 12:42:29 GMT Date: 1 Mar 1993 12:23:18 +0000 From: Heather Cooke Subject: Re: Parallel Benchmark Worki To: dongarra@cs.utk.edu, System_Manager@parsys.co.uk Cc: pbwg-comm@cs.utk.edu Mail*Link(r) SMTP RE>Parallel Benchmark Worki Received: by parsys.co.uk with SMTP;1 Mar 1993 11:49:21 +0000 Received: from uknet.UUCP by boston.parsys.co.uk; Mon, 1 Mar 93 11:45:35 GMT Received: from CS.UTK.EDU by ben.uknet.ac.uk via EUnet with SMTP (PP) id ; Mon, 1 Mar 1993 10:17:20 +0000 Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA15881; Mon, 1 Mar 93 05:13:28 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 1 Mar 1993 05:13:27 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from Mail.Think.COM by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA15867; Mon, 1 Mar 93 05:13:24 -0500 Received: from Godot.Think.COM by mail.think.com; Mon, 1 Mar 93 05:13:22 -0500 Received: by godot.think.com (4.1/Think-1.2) id AA16751; Mon, 1 Mar 93 05:14:28 EST Message-Id: <9303011014.AA16751@godot.think.com> To: Jack Dongarra Cc: pbwg-comm@cs.utk.edu Subject: Re: Parallel Benchmark Working Group Meeting In-Reply-To: Your 
message of "Sun, 14 Feb 93 16:56:01 EST." <9302142156.AA01941@dasher.cs.utk.edu>
Date: Mon, 01 Mar 93 05:14:28 EST
From: Dennis Parkinson

I apologize; I will not be able to attend the meeting. Sorry for the late notification.

Dennis Parkinson

From owner-pbwg-comm@CS.UTK.EDU Wed Mar 3 13:13:23 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA05199; Wed, 3 Mar 93 13:13:23 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA25195; Wed, 3 Mar 93 13:12:14 -0500
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 3 Mar 1993 13:12:12 EST
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from orion.oac.uci.edu by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA25186; Wed, 3 Mar 93 13:12:09 -0500
Received: from balboa.eng.uci.edu by orion.oac.uci.edu with SMTP id AA14344 (5.65c/IDA-1.4.4 for ); Wed, 3 Mar 1993 10:12:00 -0800
Received: from seal.eng.uci.edu by balboa.eng.uci.edu (4.1/SMI-4.1) id AA03526; Wed, 3 Mar 93 10:12:02 PST
From: mnyeu@balboa.eng.uci.edu (Maung Nyeu)
Message-Id: <9303031812.AA03526@balboa.eng.uci.edu>
Subject: Parallel Benchmarking Meeting
To: pbwg-comm@cs.utk.edu
Date: Wed, 3 Mar 93 10:10:46 PST
X-Mailer: ELM [version 2.3 PL11]

Hi,

I am a student at the UC Irvine dept. of comp. science. I am interested in getting the summary of, or the papers presented at, the parallel benchmarking meeting. I would appreciate it if you would send me a reply.

Thanks.
-- Maung Ting Nyeu email: mnyeu@balboa.eng.uci.edu From owner-pbwg-comm@CS.UTK.EDU Thu Mar 4 12:21:42 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA02456; Thu, 4 Mar 93 12:21:42 -0500 Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA06000; Thu, 4 Mar 93 12:20:35 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 4 Mar 1993 12:20:33 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from timbuk.cray.com by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA05984; Thu, 4 Mar 93 12:20:31 -0500 Received: from groucho (groucho.cray.com) by cray.com (4.1/CRI-MX 2.13) id AA06976; Thu, 4 Mar 93 11:20:29 CST Received: by groucho (4.1/CRI-5.13) id AA16192; Thu, 4 Mar 93 11:20:26 CST From: kkg@ferrari.cray.com (Koushik Ghosh) Message-Id: <9303041720.AA16192@groucho> Subject: Meeting on May 24, 1993 To: pbwg-comm@cs.utk.edu Date: Thu, 4 Mar 93 11:20:25 CST X-Mailer: ELM [version 2.3 PL11] There seems to be a lot of requests for a May 24, 1993 meeting. May I suggest that if we start the meeting at 8 AM instead of 9 AM, the chances of getting everything done by 5PM might go up. Then Jack and some of us might be able to catch a flight out of Knoxville that evening. -- Koushik Ghosh kkg@cray.com Benchmarking Dept. Cray Research Inc. 
1-612-683-3407 From owner-pbwg-comm@CS.UTK.EDU Thu Mar 4 13:48:21 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA05140; Thu, 4 Mar 93 13:48:21 -0500 Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA11235; Thu, 4 Mar 93 13:47:13 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 4 Mar 1993 13:47:10 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from gemini.npac.syr.EDU by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA11220; Thu, 4 Mar 93 13:47:05 -0500 Received: by gemini.npac.syr.edu id AA20470 (5.65c/IDA-1.4.4 for pbwg-comm@cs.utk.edu); Thu, 4 Mar 1993 13:46:43 -0500 Date: Thu, 4 Mar 1993 13:46:43 -0500 From: Tomasz Haupt Message-Id: <199303041846.AA20470@gemini.npac.syr.edu> To: pbwg-comm@cs.utk.edu Subject: Meeting on May 24, 1993 Cc: haupt@npac.syr.edu > There seems to be a lot of requests for a May 24, 1993 meeting. Fine with me. Tom Haupt From owner-pbwg-comm@CS.UTK.EDU Fri Mar 5 13:15:58 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA04015; Fri, 5 Mar 93 13:15:58 -0500 Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA28005; Fri, 5 Mar 93 13:14:48 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 5 Mar 1993 13:14:46 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from vnet.ibm.com by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA27996; Fri, 5 Mar 93 13:14:32 -0500 Message-Id: <9303051814.AA27996@CS.UTK.EDU> Received: from KGNVMZ by vnet.ibm.com (IBM VM SMTP V2R2) with BSMTP id 6262; Fri, 05 Mar 93 13:11:37 EST Date: Fri, 5 Mar 93 12:45:31 EST From: "Dr. Joanne L. 
Martin"
To: pbwg-comm@cs.utk.edu
Subject: send archive from pbwg

send archive from pbwg

From owner-pbwg-comm@CS.UTK.EDU Tue Mar 9 15:08:55 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA13515; Tue, 9 Mar 93 15:08:55 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA24030; Tue, 9 Mar 93 15:01:29 -0500
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 9 Mar 1993 15:01:28 EST
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from THUD.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA24023; Tue, 9 Mar 93 15:01:27 -0500
From: Jack Dongarra
Received: by thud.cs.utk.edu (5.61++/2.7c-UTK) id AA14193; Tue, 9 Mar 93 15:01:25 -0500
Date: Tue, 9 Mar 93 15:01:25 -0500
Message-Id: <9303092001.AA14193@thud.cs.utk.edu>
To: pbwg-comm@cs.utk.edu
Subject: PBWG posting

Forwarding: Mail from '' dated: Tue, 9 Mar 93 14:30:05 -0500

Here is a note I'm planning on posting to comp.parallel. Let me know if you see any problems. I will post it on Friday, 3/12.

Regards,
Jack

Dear Colleague,

We are planning to hold the Third Meeting of the Parallel Benchmark Working Group in Knoxville, Tennessee, at the University of Tennessee on May 24th, 1993.

This process formally began with a workshop held at the Supercomputing '92 meeting in November 1992. The purpose of the working group is to establish credible and useful benchmarks for the evaluation of Distributed Memory MIMD systems.

The objectives for the group are:

1. To establish a comprehensive set of parallel benchmarks that is generally accepted by both users and vendors of parallel systems.

2. To provide a focus for parallel benchmark activities and avoid unnecessary duplication of effort and proliferation of benchmarks.

3. To set standards for benchmarking methodology and result-reporting together with a control database/repository for both the benchmarks and the results.
Mode of Working:

The working group has adopted an HPF-like forum style of proceedings with a view to convergence to an agreed set of benchmarks and procedures within 10 months.

If you would like to participate and attend the meeting let me know.

Mailing Lists
=============

The following mailing lists have been set up.

  pbwg-comm@cs.utk.edu         Whole committee
  pbwg-lowlevel@cs.utk.edu     Low level subcommittee
  pbwg-compactapp@cs.utk.edu   Compact applications subcommittee
  pbwg-method@cs.utk.edu       Methodology subcommittee
  pbwg-kernel@cs.utk.edu       Kernel subcommittee

If you are on a mailing list you will receive mail as it is posted. If you want to join a mailing list send me mail (dongarra@cs.utk.edu).

All mail will be collected and can be retrieved by sending email to netlib@ornl.gov and in the mail message typing:

  send comm.archive from pbwg
  send lowlevel.archive from pbwg
  send compactapp.archive from pbwg
  send method.archive from pbwg
  send kernel.archive from pbwg
  send index from pbwg

The various subcommittees will look into the following topics:

Low-Level:
---------
  Start-up, latency, bandwidth
  Reduction (broadcast, sum, gather/scatter)
  Synchronization (e.g., SYNCH1 from Genesis)
  I/O

Kernel:
------
  Matrix operations (e.g., multiply, transpose)
  LU Decomposition
  PDE Solvers (Red/Black Relaxation)
  Multigrid
  FFT
  Conjugate Gradient

Compact Applications:
--------------------
  Particle-In-Cell codes (e.g., LPM1 from Genesis)
  QCD
  Molecular Dynamics
  CFD
  ARCO
  Financial Applications

Methodology:
------------
  Guidelines for reporting performance.

The meeting site will be the:

  University Center in room 221
  University of Tennessee

We have made arrangements with the Hilton Hotel in Knoxville.

  Hilton Hotel
  501 W. Church Street
  Knoxville, TN
  Phone: 615-523-2300

When making arrangements tell the hotel you are associated with the Parallel Benchmarking Meeting.

You can rent a car or get a cab from the airport to the hotel.
>From the hotel to the University it is a 15 minute walk.
We should plan to start at 8:30 am on May 24th and finish about 5:00 pm.

The format of the meeting is:

Monday 24th May
   8.30 - 12.00   Full group meeting
  12.00 -  1.30   Formal lunch
   1.30 -  4.00   Parallel subgroup meetings
   4.00 -  5.00   Full group meeting

Tentative agenda for full group meeting:

1. Minutes of Minneapolis meeting
2. Reports and discussion from subgroups
3. Open discussion and agreement on further actions
4. Date and venue for next meeting

Suggested subgroups - probably two in parallel

  Compact Applications     Low-Level benchmarks

and second pair:

  Kernels benchmarks       Methodology

We have set up a mail reflector for correspondence; it is called pbwg-comm@cs.utk.edu. Mail to that address will be sent to the mailing list and also collected in netlib@ornl.gov. To retrieve the collected mail, send email to netlib@ornl.gov and in the mail message type:

  send comm.archive from pbwg

If you would like to be put on the mailing list to receive the correspondence let me know.

Regards,
Jack Dongarra

----------- End Forwarded Message -----------

From owner-pbwg-comm@CS.UTK.EDU Tue Mar 9 17:22:57 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA15724; Tue, 9 Mar 93 17:22:57 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA01135; Tue, 9 Mar 93 17:21:35 -0500
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 9 Mar 1993 17:21:32 EST
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from BERRY.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA01122; Tue, 9 Mar 93 17:21:29 -0500
From: Mike Berry
Received: by berry.cs.utk.edu (5.61++/2.7c-UTK) id AA24206; Tue, 9 Mar 93 17:19:08 -0500
Date: Tue, 9 Mar 93 17:19:08 -0500
Message-Id: <9303092219.AA24206@berry.cs.utk.edu>
To: pbwg-comm@cs.utk.edu
Subject: PBWG Minutes (March 1-2, 1993)

Minutes of the 2nd Meeting of the Parallel Benchmark Working Group (PBWG)
-------------------------------------------------------------------------

Place: Room 221, University Center
       University of Tennessee
       Knoxville,
TN

Host: Jack Dongarra, ORNL/Univ. of Tennessee

Dates: March 1-2, 1993

Attendees/Affiliations:
----------------------
David Bailey, NASA
Michael Berry, Univ. of Tennessee
Ed Brocklehurst, National Physical Lab
Jack Dongarra, Univ. of Tennessee / ORNL
Koushik Ghosh, Cray Research
Tom Haupt, Syracuse Univ.
Tony Hey, Southampton Univ.
Roger Hockney, Southampton Univ.
Brian LaRose, Univ. of Tennessee
David Mackay, Intel SSD
Joanne Martin, IBM
Robert Pennington, Pittsburgh Supercomputing Center
David Walker, ORNL
Pearl Wang, George Mason Univ. / US Geological Survey

Agenda: March 1, 1993
---------------------

At 1:00 pm EDT, Roger Hockney gave opening remarks and welcomed all participants to the workshop. Each participant introduced him or herself by affiliation and interests. The minutes of the previous meeting at Supercomputing'92 (Minneapolis) were reviewed. Roger suggested that the group consider an alternative name such as "PARABEN" or "INTERBEN". Participants were asked to think about this over dinner and cocktails that evening.

Jack Dongarra mentioned that the mail reflector "pbwg-comm@cs.utk.edu" is now in place and can be used as an electronic forum for the group. Currently there are about 50 persons on the mailing list. Jack also mentioned that anyone can obtain all correspondence information concerning PBWG requests and solicitations by sending the mail message "send archive from pbwg" to netlib@ornl.gov. To inquire about the contents of the archive, simply send the message "send index from pbwg" to netlib@ornl.gov. It was noted that this meeting was posted in the "comp.parallel" newsgroup (USENET).

Roger Hockney reviewed the 4 subgroups previously defined for the effort: Methodology, Classification, Compiler-based, Kernels. Roger proposed that the participants not break into subgroups due to the somewhat lower than expected turnout for the workshop. All participants agreed and a general discussion on "Methodology" followed.
David Bailey was then asked to review the "Proposed Guidelines for Reporting Performance" which he distributed to the group. [These guidelines will be made available from the pbwg archive in netlib.]

Roger Hockney then distributed an article entitled "A framework for benchmark performance analysis" from Supercomputer 48, IX-2 (March 1992). He recommended that the group start using correct performance symbols such as "Mflop/s" for millions of floating-point operations per second rather than simply "Mflops", which is somewhat ambiguous. As discussed in the paper, temporal performance measures such as T(N,P) and R_T(N,P) = 1/T(N,P) were also suggested. Roger pointed out that the numerator term, F(N), in the performance measure R_B = F(N)/T(N,P) should be constant and clearly stated. Here, T(N,P) is the execution time, R_T is the inverse of execution time, N is the problem size, P is the number of processors, and F(N) is the total flop count. He suggested that the group steer away from measures such as speedup and efficiency. David Bailey pointed out that one must be prepared to change F(N) to match optimal implementations on different machines. Jack D. pointed out that one must be careful, though, not to allow performance rates which exceed peak performance. Roger H. suggested that the performance database might provide plots of the metrics as functions of N and P.

Along with the distribution of the paper by Roger Hockney, Tony Hey distributed copies of 2 papers on the Genesis Benchmarks: "The Genesis Distributed-Memory Benchmarks" (Tech Report, Dept. of Electronics and Computer Science, Univ. of Southampton, UK), and "The Genesis distributed-memory benchmarks. Part 1: methodology and general relativity benchmark with results for the SUPRENUM computer" (from Concurrency: Practice and Experience Vol. 5(1), pp. 1-22, February 1993). Tony indicated that the Genesis benchmarks will be sent to Jack D. for inclusion in the benchmark repository in NETLIB.
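For reference, the performance measures described in the Methodology discussion above can be written out as in the following sketch. The worked numbers are hypothetical and were not presented at the meeting: they assume a dense LU factorization with the leading-order flop count F(N) = 2N^3/3 and an invented execution time of 10 s for N = 1000 on P = 16 processors.

```latex
% Performance measures as defined in the minutes:
%   T(N,P) : execution time for problem size N on P processors
%   F(N)   : total flop count, fixed and clearly stated for each N
\[
  R_T(N,P) = \frac{1}{T(N,P)}, \qquad
  R_B(N,P) = \frac{F(N)}{T(N,P)} .
\]
% Hypothetical worked example (assumed values, not measured data):
%   N = 1000, P = 16, T(N,P) = 10 s, F(N) = 2N^3/3 (leading-order LU count).
\[
  F(1000) = \frac{2}{3}\,(1000)^3 \approx 6.67\times 10^{8}\ \mathrm{flop},
  \qquad
  R_B = \frac{6.67\times 10^{8}\ \mathrm{flop}}{10\ \mathrm{s}}
      \approx 66.7\ \mathrm{Mflop/s},
  \qquad
  R_T = \frac{1}{10\ \mathrm{s}} = 0.1\ \mathrm{s}^{-1}.
\]
```

With F(N) held constant for a given N, R_B changes only through T(N,P), so two reported rates for the same problem size can be compared directly.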
Ed Brocklehurst raised the concern that the European PEPS project methodology might differ from that of PBWG. He stressed the need for collaboration between the two efforts. He then gave a short lecture on the PEPS project, which is a 3-year funded program ($11 million) that began in November of 1992. Participants of the PEPS project include: TS-ASM (electronics firm - France), NPL, Inecs Fistemi (Italy), Simulog (France), Sosip (Germany, France, Italy, UK), and Warwick. The benchmarks supported by PEPS are to be available to the public. Other concerns of the project involve performance monitoring and characterization.

The discussion on Methodology concluded with a comment by Michael Berry that all accepted codes for the benchmark suite be self-validating.

Roger H. then asked David Walker to report on the work thus far by the Classification subgroup. David W. stressed that the classification of benchmarks be based on (i.) communication characteristics, (ii.) I/O, and (iii.) load balance and data layout. Questions on the inclusion of profiles and processor utilization data with the benchmarks were raised. David W. pointed out that benchmarks may range from inherently parallel codes (e.g., Monte-Carlo) to mesh-based codes or gather/scatter-based applications. David Bailey stressed the need for classifying I/O and to see what the current measurement of I/O is in existing benchmark suites. Joanne Martin pointed out that the ARCO benchmark suite is well-received by the seismic community and should be seriously considered by PBWG.

The discussion then turned to compiler-based benchmarks with Tom Haupt taking the lead. He distributed the document entitled "High Performance Fortran, Fortran-D Benchmark Suite" which is available via anonymous ftp from minerva.npac.syr.edu. Tom H. indicated that the Syracuse compiler benchmarks can test parsers on a collection of statements from HPF, accept intrinsic functions, and for-all loops.
He stressed that these benchmarks be able to handle hard-coded message-passing as well as HPF Fortran. One goal would be to find instances where HPF is difficult to use. Tom mentioned that these codes deal with parallel I/O, validate HPF Fortran, and generate meaningful statistics. The current Syracuse benchmark suite contains over 50 codes and includes many well-known benchmarks such as LAPACK, NAS, and Genesis benchmarks. Tom agreed to download this suite to a machine at UT during the workshop (this was done after the meeting adjourned on March 2; the files are available in netlib, and the message "send index from pbwg" returns more details). This suite consumes about 5 megabytes and comprises about 14 directories of README files, makefiles, and source files.

Brian LaRose (student of Michael B.) then gave a demonstration of the Performance Database Server (PDS) which is available in an upcoming release of XNETLIB. Beforehand, Jack D. gave a brief introduction to XNETLIB to familiarize participants with the features of the X Windows-based user interface to NETLIB. The members were pleased with the tool and made suggestions for improvements such as graphing data retrieved from the database. Brian demonstrated the various browsing, rank-ordering, and search features of the system he is designing for his Master's thesis at the Univ. of Tennessee. Michael B. indicated that participants would be able to test PDS and should send anonymous ftp address/directory information to "utpds@cs.utk.edu" so a beta release of the software could be sent to them.

At this point, Roger Hockney asked that Jack D. define/classify the publicly available benchmarks that might be considered for the PBWG benchmark suite. The classification into Kernels, Applications, and Compact Applications is given below. A (*) indicates availability as single-node benchmarks also. Additional benchmarks that were not explicitly classified are listed under "Miscellaneous".

Kernels:
-------
   Linpack 100 (*) 1000 (*)
   Livermore Loops (*)
   Los Alamos (*, some of them)
   Tom Dunigan's (ORNL)
   Genesis Global Operations (e.g., sum, gather/scatter)
   NAS Sorting

Applications:
------------
   Perfect
   Slalom
   RAPS
   PEPS

Compact Applications:
--------------------
   NAS
   Genesis
   Syracuse
   ARCO
   Euroben (*)

Miscellaneous:
------------
   flops.c
   IOZONE
   Dhrystones
   Whetstones
   Xstones

Jack D. proposed that the selected codes for the PBWG suite be implemented in Fortran-77, in HPF, and in PVM for message-passing (this will be converted to MPI, Message Passing Interface, when the standard is complete). Roger H. stressed the need to measure startup latency (alpha, beta), and metrics such as R-sub-infinity and N-sub-one-half. Other target measures discussed included: peak rates, flop/s, communication costs, and bisection (node-node) bandwidth. David Bailey stressed that the single processor benchmarks should be separated from the parallel benchmarks to avoid confusion.

At 5:00pm EDT, the workshop adjourned for a 6:30pm EDT dinner appointment at Chesapeake's, a downtown Knoxville seafood restaurant.

Agenda: March 2, 1993
---------------------

At 9:00 am EDT, Roger Hockney opened the morning session of the PBWG workshop in Room 221 of the University Center on the campus of the University of Tennessee. Prior to any technical discussion on the makeup of the PBWG benchmark suite, participants agreed to hold a "Birds of a Feather" session at the Supercomputing'93 conference in Portland, OR in November. Joanne Martin will schedule this meeting for the group - a 4:30 pm session was agreed upon by the participants. Prior to the Supercomputing'93 meeting, Jack D. agreed to collect all desirable benchmarks and install them in NETLIB for public access.

The discussion on the categorical makeup of a PBWG benchmark suite was then led by Tony Hey. The 3 categories considered were: Low-Level, Kernels, and Compact Applications. David W.
stressed that 10 would be an optimal number of codes to use. Participants generally agreed with this proposal. Potential benchmarks considered are listed by category below. Jack D. stressed that each group "justify" benchmarks by indicating exactly what machine feature(s) are to be tested and try to avoid overlap between categories.

Low-Level:
---------
   Start-up, latency, bandwidth
   Reduction (broadcast, sum, gather/scatter)
   Synchronization (e.g., SYNCH1 from Genesis)
   I/O

Kernel:
------
   Matrix operations (e.g., multiply, transpose)
   LU Decomposition
   PDE Solvers (Red/Black Relaxation)
   Multigrid
   FFT
   Conjugate Gradient

Compact Applications:
--------------------
   Particle-In-Cell codes (e.g., LPM1 from Genesis)
   QCD
   Molecular Dynamics
   CFD
   ARCO
   Financial Applications

At this point, Roger H. asked that subgroups be formed to address each of the above categories of benchmarks. Each subgroup will be asked to give a report at the Portland Meeting concerning the potential design/makeup of their category of PBWG benchmarks. Participants agreed that the subgroups be open to other experts from the performance community: vendors, academia, and users. A person may be in more than one subgroup and Jack D. suggested that a voting strategy be developed. A "1 vote per institution" rule was one candidate for consideration but nothing official was decided at the workshop. Suggestions for subgroup members not present at the workshop are included below. A (*) indicates subgroup leader.

PBWG Benchmark Suite Subgroups:
------------------------------
Low-Level Kernel Compact Applications Methodology --------- ------ -------------------- ----------- PEPS/Warwick D. Walker D. Walker (*) D. Bailey (*) IBM/D. Frye J. Dongarra J. Martin D. Schneider R. Pennington T. Hey (*) E. Brocklehurst PEPS R. Hockney (*) T. Haupt K. Ghosh J. Dongarra PEPS M. Berry E. Kushner (Intel) D. Bailey D. Bailey P. Wang D.
Barton

The following mail reflectors have been set up:

   Low Level:             pbwg-lowlevel@cs.utk.edu
   Kernel:                pbwg-kernel@cs.utk.edu
   Compact Applications:  pbwg-compactapp@cs.utk.edu
   Methodology:           pbwg-method@cs.utk.edu

Jack D. agreed to set up mail reflectors for each of the subgroups. (These mail reflectors are set up and functioning.)

All participants agreed to 2 meetings that would be open to anyone interested in attending prior to the meeting at Supercomputing'93 in Portland. Drafts of the reports from each subgroup are due by the agreed-upon May 24 meeting at the Univ. of Tennessee, which will begin at 8:30 am. The second meeting will also be held at the Univ. of Tennessee, on August 23-24. Here's the approved schedule of PBWG meetings:

   Third Meeting:   May 24, begins at 8:30am (Univ. of Tennessee)
   Fourth Meeting:  August 23-24 (Univ. of Tennessee)
   Fifth Meeting:   Week of Nov 15 (Supercomputing Portland, OR)

Joanne Martin mentioned that there was a Conference on Performance Tools on April 1-4 and Tony H. suggested that PBWG might submit an announcement to that conference about PBWG's activities.

Possible name changes for PBWG were discussed again. Some suggestions included PES (Performance or Parallel Evaluation Suite) and PARKBENCH (Parallel Kernel Benchmarks). Since no consensus was reached, the discussion was postponed to the next meeting so members can think of other names.

In preparation for the May meeting, subgroup leaders briefly mentioned some of the issues they hope to address:

   T. Hey    : talk to vendors, provide list of kernel areas, justification
   D. Walker : define areas, acquire candidate codes, address scalability, address problem sizes (e.g., 4 used in Genesis), be able to specify data layouts (use vanilla layouts)
   R. Hockney: construct list of desirable low-level benchmarks

Participants agreed that users should be able to optimize the suite by any means provided they document what they do (e.g., assembly language allowed).
When possible, benchmark contributions should include the code that was used to obtain the reported results (stored in PDS). The second day of the PBWG workshop was adjourned by Roger Hockney at 11:00am EDT, and participants had informal discussions before departing the Univ. of Tennessee campus. PBWG activities will be posted to the comp.parallel newsgroup (USENET). From owner-pbwg-comm@CS.UTK.EDU Wed Mar 24 06:54:11 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA05932; Wed, 24 Mar 93 06:54:11 -0500 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA11133; Wed, 24 Mar 93 06:53:25 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 24 Mar 1993 06:53:13 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA11125; Wed, 24 Mar 93 06:53:10 -0500 Via: uk.ac.southampton.ecs; Wed, 24 Mar 1993 11:44:52 +0000 From: R.Hockney@parallel-applications-centre.southampton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Wed, 24 Mar 93 11:37:49 GMT Date: Wed, 24 Mar 93 11:24:51 GMT Message-Id: <3278.9303241124@calvados.pac.soton.ac.uk> Apparently-To: pbwg-comm@cs.utk.edu From owner-pbwg-comm@CS.UTK.EDU Wed Mar 24 07:10:28 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA09660; Wed, 24 Mar 93 07:10:28 -0500 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA11684; Wed, 24 Mar 93 07:09:56 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 24 Mar 1993 07:09:55 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA11676; Wed, 24 Mar 93 07:09:50 -0500 Via: uk.ac.southampton.ecs; Wed, 24 Mar 1993 12:07:41 +0000 From: R.Hockney@parallel-applications-centre.southampton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Wed, 24 Mar 93 11:44:01 GMT Date: Wed, 24 Mar 93 11:50:26 GMT Message-Id: 
<3341.9303241150@calvados.pac.soton.ac.uk> Apparently-To: pbwg-comm@cs.utk.edu ~r comlet1.asc From owner-pbwg-comm@CS.UTK.EDU Wed Mar 24 07:36:42 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA11698; Wed, 24 Mar 93 07:36:42 -0500 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA12605; Wed, 24 Mar 93 07:35:41 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 24 Mar 1993 07:35:40 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA12584; Wed, 24 Mar 93 07:35:34 -0500 Via: uk.ac.southampton.ecs; Wed, 24 Mar 1993 12:32:55 +0000 Via: brewery.ecs.soton.ac.uk; Wed, 24 Mar 93 12:25:44 GMT From: Prof Roger Hockney Received: from alithea.ecs.soton.ac.uk by brewery.ecs.soton.ac.uk; Wed, 24 Mar 93 12:34:08 GMT Date: Wed, 24 Mar 93 12:34:10 GMT Message-Id: <253.9303241234@alithea.ecs.soton.ac.uk> To: pbwg-comm@cs.utk.edu Subject: PBWG Draft Report A Message from Your Chairman ---------------------------- In view of the committee's aim to agree on a draft text at the meeting on May 24, 1993, it would help if each subcommittee produces their recommendations as LATEX files to fit into a standard framework. I believe that LATEX is the most widely used text processing system in the universities, and appears to have been used for the HPF report. I have prepared the following skeleton files for the committee's report: (1) benrep1.tex - control file for main report (2) bencom1.tex - command definition file (3) benref1.bib - start of bibliography Additions to the command file and bibliography can be sent to me as files bencom2.tex, benref2.bib etc. 
The control file reads the following files which are to be provided by the leader of each subcommittee: (4) intro1.tex - Roger Hockney for whole committee (5) method1.tex - David Bailey for Methodology subcommittee (6) lowlev1.tex - Roger Hockney for the low-level subcommittee (7) kernel1.tex - Tony Hey for the kernel subcommittee (8) compac1.tex - David Walker for compact applications subcommittee (9) conclu1.tex - Roger Hockney for whole committee Files (1), (2), (3), (5), (7), and (8) are appended to this e-mail. Best wishes, Roger Hockney. % file : benrep1.tex % % ************************************************************** % STANDARD INTERNATIONAL BENCHMARKS FOR PARALLEL COMPUTERS % ************************************************************** % \input{bencom1.tex} % define new commands for benchmark report % ---------------------------------------------------------------------------- \documentstyle[]{report} % Specifies the document style. \textheight 8.25 true in \textwidth 5.625 true in \topmargin -0.13 true in \oddsidemargin 0.25 true in \evensidemargin 0.25 true in % The preamble begins here. \title{Standard International Benchmarks for Parallel Computers} % ---------------------------------------------------------------------------- \author{PBWG Committee \\ draft assembled by Roger Hockney (chairman)} \date{23 March 1993 - draft 1} % ---------------------------------------------------------------------------- \begin{document} % End of preamble and beginning of text. \sloppy \maketitle % Produces the title. 
% ---------------------------------------------------------------------------- \input{intro1.tex} % Introduction % responsibility of Roger Hockney for whole committee % ---------------------------------------------------------------------------- \input{method1.tex} % Chapter1 % responsibility of David Bailey for Methodology subcommittee % ---------------------------------------------------------------------------- \input{lowlev1.tex} % Chapter2 % responsibility of Roger Hockney for Low-level benchmarks subcommittee % ---------------------------------------------------------------------------- \input{kernel1.tex} % Chapter3 % responsibility of Tony Hey for Kernel benchmarks subcommittee % ---------------------------------------------------------------------------- \input{compac1.tex} % Chapter4 % responsibility of David Walker for Compact Applications subcommittee % ---------------------------------------------------------------------------- \input{conclu1.tex} % Conclusions % responsibility of Roger Hockney for whole committee % ---------------------------------------------------------------------------- \vspace{0.35in} {\large \bf Acknowledgments} \bibliography{benref1} \bibliographystyle{unsrt} \end{document} % End of document. 
% file : bencom1.tex % % ************************************************************** % LATEX COMMANDS FOR PARABEN REPORTS % ************************************************************** % \def\flop{\mathop{\rm flop}\nolimits} \def\pipe{\mathop{\rm pipe}\nolimits} \newcommand{\usec}{\mbox{\rm $\mu$s}} \newcommand{\where}{\mbox{\rm where}} \newcommand{\rmand}{\mbox{\rm and}} \newcommand{\Mflops}{\mbox{\rm Mflop/s}} \newcommand{\flops}{\mbox{\rm flop/s}} \newcommand{\flopB}{\mbox{\rm flop/B}} \newcommand{\tstepps}{\mbox{\rm tstep/s}} \newcommand{\MWps}{\mbox{\rm MW/s}} \newcommand{\Mwps}{\mbox{\rm Mw/s}} \newcommand{\spone}{\mbox{\ }} \newcommand{\sptwo}{\mbox{\ \ }} \newcommand{\spfour}{\mbox{\ \ \ \ }} \newcommand{\spsix}{\mbox{\ \ \ \ \ \ }} \newcommand{\speight}{\mbox{\ \ \ \ \ \ \ \ }} \newcommand{\spten}{\mbox{\ \ \ \ \ \ \ \ \ \ }} \newcommand{\rinf}{\mbox{$r_\infty$}} \newcommand{\Rinf}{\mbox{$R_\infty$}} \newcommand{\nhalf}{\mbox{$n_{\frac{1}{2}}$}} \newcommand{\Nhalf}{\mbox{$N_{\frac{1}{2}}$}} \newcommand{\phalf}{\mbox{$p_{\frac{1}{2}}$}} \newcommand{\Phalf}{\mbox{$P_{\frac{1}{2}}$}} \newcommand{\half}{\mbox{$\frac{1}{2}$}} \newcommand{\rnhalf}{\mbox{(\rinf,\nhalf)}} \newcommand{\RNhalf}{\mbox{(\Rinf,\Nhalf)}} \newcommand{\third}{\mbox{$\frac{1}{3}$}} \newcommand{\quart}{\mbox{$\frac{1}{4}$}} \newcommand{\eighth}{\mbox{$\frac{1}{8}$}} \newcommand{\nineth}{\mbox{$\frac{1}{9}$}} % ---------------------------------------------------------------------------- % file : benref1.bib % @book{HoJe88, author= "Roger W. Hockney and Christopher R. 
Jesshope", title= "Parallel Computers 2: Architecture, Programming and Algorithms", publisher= "Adam Hilger/IOP Publishing", address= "Bristol \& New York", year= "1988", note= "Distributed in the USA by the American Institute of Physics, c/o AIDC, 64 Depot Road, Colchester, VT 05445."} @book{Super, key="Super", title={Supercomputer}, publisher="ASFRA", address="Edam, Netherlands"} @book{SI75, key="Royal Society", organization="Symbols Committee of the Royal Society", title={Quantities, Units and Symbols}, publisher="The Royal Society", address="London", year=1975} @article{Berr89, author="M. Berry and D. Chen and P. Koss and D. Kuck and S. Lo and Y. Pang and L. Pointer and R. Roloff and A. Sameh and E. Clementi and S. Chin and D. Schneider and G. Fox and P. Messina and D. Walker and C. Hsiung and J. Schwarzmeier and K. Lue and S. Orszag and F. Seidl and O. Johnson and R. Goodrum and J. Martin", title="The {PERFECT} Club benchmarks: effective performance evaluation of computers", journal={Intl. J. Supercomputer Appls.}, volume=3, number=3, year=1989, pages="5-40"} @incollection{Ma88, author="F. H. McMahon", title="The {L}ivermore {F}ortran Kernels test of the numerical performance range", editor="J. L. Martin", booktitle={Performance Evaluation of Supercomputers}, publisher="Elsevier Science B.V., North-Holland", address="Amsterdam", year=1988, pages="143-186"} @article{Mess90, author="P. Messina and C. Baillie and E. Felten and P. Hipes and R. Williams and A. Alagar and A. Kamrath and R. Leary and W. Pfeiffer and J. Rogers and D. Walker", title="Benchmarking advanced architecture computers", journal={Concurrency: Practice and Experience}, volume=2, number=3, year=1990, pages="195-255"} @inproceedings{Cvet90, author="Z. Cvetanovic and E. G. Freedman and C. 
Nofsinger", title="Efficient decomposition and performance of parallel {PDE}, {FFT}, {M}onte-{C}arlo simulations, simplex and sparse solvers", booktitle={Proceedings Supercomputing90}, publisher="IEEE", address="New York", year=1990, pages="465-474"} @article{SUPR88, title="Proceedings 2nd International SUPRENUM Colloquium", author="U. Trottenberg", journal={Parallel Computing}, volume=7, number=3, year=1988} @article{Hey91, author="A. J. G. Hey", title="The {G}enesis Distributed-Memory Benchmarks", journal={Parallel Computing}, volume=17, year=1991, pages="1275-1283"} @book{F90, author="M. Metcalf and J. Reid", title={Fortran-90 Explained}, publisher="Oxford Science Publications/OUP", address="Oxford and New York", year=1990, chapter=6} @article{SPEC90, key="SPEC", title="{SPEC} Benchmarks Suite Release 1.0", journal={SPEC Newslett.}, volume=2, number=3, year=1990, pages="3-4", publisher="Systems Performance Evaluation Cooperative, Waterside Associates", address="Fremont, California"} @article{FGHS89, author="A. Friedli and W. Gentzsch and R. Hockney and A. van der Steen", title="A {E}uropean Supercomputer Benchmark Effort", journal={Supercomputer 34}, volume="VI", number=6, year=1989, pages="14-17"} @article{BRH90, author="L. Bomans and D. Roose and R. Hempel", title="The {A}rgonne/{GMD} Macros in {F}ortran for portable parallel programming and their implementation on the {I}ntel i{PSC}/2", journal={Parallel Computing}, volume=15, year=1990, pages="119-132"} @inproceedings{ShTu91, author="J. N. Shahid and R. S. Tuminaro", title="Iterative Methods for Nonsymmetric Systems on {MIMD} Machines", booktitle={Proc. Fifth SIAM Conf. Parallel Processing for Scientific Computing}, year=1991} @article{Bish90, author="N. T. Bishop and C. J. S. Clarke and R. A. d'Inverno", journal={Classical and Quantum Gravity}, volume=7, year=1990, pages="L23-L27"} @article{Isaac83, author="R. A. Isaacson and J. S. Welling and J.Winicour", journal={J. Math. 
Phys.}, volume=24, year=1983, pages="1824-1834"} @article{Stew82, author="J. M. Stewart and H. Friedrich", journal={Proc. Roy. Soc.}, volume="A384", year=1982, pages="427-454"} @article{Hoc92, author="R. W. Hockney", title="A framework for benchmark analysis", journal={Supercomputer}, volume=48, number="IX-2", year=1992, pages="9-22"} @article{Add93, author="C. Addison and J. Allwright and N. Binsted and N. Bishop and B. Carpenter and P. Dalloz and D. Gee and V. Getov and A. Hey and R. Hockney and M. Lemke and J. Merlin and M. Pinches and C. Scott and I. Wolton", title="The {G}enesis distributed-memory benchmarks. Part 1: methodology and general relativity benchmark with results for the {SUPRENUM} computer", journal={Concurrency: Practice and Experience}, volume=5, number=1, year=1993, pages="1-22"} % % ------------------------------------------------------------------------- %file: intro1.tex \chapter{Introduction}\footnotemark \footnotetext{written by Roger Hockney for whole committee} \cite{HoJe88} % % ------------------------------------------------------------------------- %file: method1.tex \chapter{Methodology} \footnote{assembled by David Bailey for Methodology subcommittee} % % ------------------------------------------------------------------------- %file: lowlev1.tex \chapter{Low-Level Benchmarks} \footnote{assembled by Roger Hockney for low-level subcommittee} % % ------------------------------------------------------------------------- %file: kernel1.tex \chapter{Kernel Benchmarks} \footnote{assembled by Tony Hey for Kernel subcommittee} % % ------------------------------------------------------------------------- %file: compac1.tex \chapter{Compact Applications} \footnote{assembled by David Walker for Compact Applications subcommittee} % % ------------------------------------------------------------------------- %file: conclu1.tex \chapter{Conclusions} \footnote{written by Roger Hockney for whole committee} % % 
------------------------------------------------------------------------- % End of skeleton paper From owner-pbwg-comm@CS.UTK.EDU Mon Mar 29 12:54:46 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA19359; Mon, 29 Mar 93 12:54:46 -0500 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA04636; Mon, 29 Mar 93 12:52:54 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 29 Mar 1993 12:52:53 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from gemini.npac.syr.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA04626; Mon, 29 Mar 93 12:52:42 -0500 Received: from localhost by gemini.npac.syr.edu with SMTP id AA11964 (5.65c/IDA-1.4.4 for pbwg-comm@cs.utk.edu); Mon, 29 Mar 1993 12:52:02 -0500 Message-Id: <199303291752.AA11964@gemini.npac.syr.edu> To: Prof Roger Hockney Cc: pbwg-comm@cs.utk.edu, haupt@npac.syr.edu Subject: Re: PBWG Draft Report In-Reply-To: Your message of "Wed, 24 Mar 93 12:34:10 GMT." <253.9303241234@alithea.ecs.soton.ac.uk> Date: Mon, 29 Mar 93 12:51:54 -0500 From: haupt@npac.syr.edu X-Mts: smtp I thought we agreed to add additional section to the draft on compiler benchmark. Syracuse volunteers to coordinate that effort, and to provide the text for the draft. Will you add this section to the draft, please? 
Tom Haupt From owner-pbwg-comm@CS.UTK.EDU Tue Apr 27 03:01:42 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA29378; Tue, 27 Apr 93 03:01:42 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA20445; Tue, 27 Apr 93 03:00:40 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 27 Apr 1993 03:00:39 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from swiba9.unibas.ch by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA20392; Tue, 27 Apr 93 03:00:32 -0400 Received: from sir.ifi.unibas.ch by swiba9.unibas.ch with SMTP (PP) id <6953-0@swiba9.unibas.ch>; Tue, 27 Apr 1993 09:00:04 +0200 Received: from charlie by sir.ifi.unibas.ch (NX5.67c/NX3.0M) id AA07530; Tue, 27 Apr 93 08:59:58 +0200 From: (Walter Kuhn) kuhn@ifi.unibas.ch Message-Id: <9304270659.AA07530@sir.ifi.unibas.ch> Received: by charlie.ifi.unibas.ch (NX5.67c/NX3.0X) id AA00297; Tue, 27 Apr 93 08:59:58 +0200 Date: Tue, 27 Apr 93 08:59:58 +0200 To: pbwg-comm@CS.UTK.edu Subject: Additions to pbwg maillist Please add me to your mail list. 
Walter From owner-pbwg-comm@CS.UTK.EDU Fri Apr 30 12:32:36 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA18590; Fri, 30 Apr 93 12:32:36 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA22997; Fri, 30 Apr 93 12:31:04 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 30 Apr 1993 12:30:51 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA22949; Fri, 30 Apr 93 12:30:47 -0400 Via: uk.ac.southampton.ecs; Fri, 30 Apr 1993 16:11:19 +0100 From: R.Hockney@parallel-applications-centre.southampton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Fri, 30 Apr 93 16:03:41 BST Date: Fri, 30 Apr 93 15:10:51 GMT Message-Id: <15929.9304301510@calvados.pac.soton.ac.uk> To: pbwg-comm@cs.utk.edu Subject: Revised Report Framework Updated Report Framework and Bibliography ----------------------------------------- I append to this note updated versions of benrep1.tex and benref1.bib, there is an added chapter on Compiler benchmarks and additional references. I have sent a contribution to Chapter 2 to pbwg-method and a draft of Chapter 3 to pbwg-lowlevel. I await drafts of the other chapters from the different subcommittee leaders: Roger Hockney ------------------------------- cut here -------------------------------- % % ************************************************************** % STANDARD INTERNATIONAL BENCHMARKS FOR PARALLEL COMPUTERS % ************************************************************** % \input{bencom1.tex} % define new commands for benchmark report % ---------------------------------------------------------------------------- \documentstyle[]{report} % Specifies the document style. \textheight 8.25 true in \textwidth 5.625 true in \topmargin -0.13 true in \oddsidemargin 0.25 true in \evensidemargin 0.25 true in % The preamble begins here. 
\title{Standard International Benchmarks for Parallel Computers} % ---------------------------------------------------------------------------- \author{PBWG Committee \\ draft assembled by Roger Hockney (chairman)} \date{19 April 1993 - draft 2} % ---------------------------------------------------------------------------- \begin{document} % End of preamble and beginning of text. \sloppy \maketitle % Produces the title. % ---------------------------------------------------------------------------- \input{intro1.tex} % Introduction % responsibility of Roger Hockney for whole committee % ---------------------------------------------------------------------------- \input{method3.tex} % Chapter1 % responsibility of David Bailey for Methodology subcommittee % ---------------------------------------------------------------------------- \input{lowlev1.tex} % Chapter2 % responsibility of Roger Hockney for Low-level benchmarks subcommittee % ---------------------------------------------------------------------------- \input{kernel1.tex} % Chapter3 % responsibility of Tony Hey for Kernel benchmarks subcommittee % ---------------------------------------------------------------------------- \input{compac1.tex} % Chapter4 % responsibility of David Walker for Compact Applications subcommittee % ---------------------------------------------------------------------------- \input{compil1.tex} % Chapter5 % responsibility of Tom Haupt for Compiler Benchmarks subcommittee % ---------------------------------------------------------------------------- \input{conclu1.tex} % Conclusions % responsibility of Roger Hockney for whole committee % ---------------------------------------------------------------------------- \vspace{0.35in} {\large \bf Acknowledgments} \bibliography{benref1} \bibliographystyle{unsrt} \end{document} % End of document. % @book{HoJe81, author= "Roger W. Hockney and Christopher R. 
Jesshope",
  title= "Parallel Computers: Architecture, Programming and Algorithms",
  publisher= "Adam Hilger",
  address= "Bristol",
  year= "1981",
}

@book{HoJe88,
  author= "Roger W. Hockney and Christopher R. Jesshope",
  title= "Parallel Computers 2: Architecture, Programming and Algorithms",
  publisher= "Adam Hilger/IOP Publishing",
  address= "Bristol \& Philadelphia",
  year= "1988",
  edition="second",
  note= "Distributed in the USA by IOP Publ. Inc., Public Ledger Bldg., Suite 1035, Independence Square, Philadelphia, PA 19106."}

@book{Super,
  key="Super",
  title={Supercomputer},
  publisher="ASFRA",
  address="Edam, Netherlands"}

@book{SI75,
  key="Royal Society",
  organization="Symbols Committee of the Royal Society",
  title={Quantities, Units and Symbols},
  publisher="The Royal Society",
  address="London",
  year=1975}

@article{Berr89,
  author="M. Berry and D. Chen and P. Koss and D. Kuck and S. Lo and Y. Pang and L. Pointer and R. Roloff and A. Sameh and E. Clementi and S. Chin and D. Schneider and G. Fox and P. Messina and D. Walker and C. Hsiung and J. Schwarzmeier and K. Lue and S. Orszag and F. Seidl and O. Johnson and R. Goodrum and J. Martin",
  title="The {PERFECT} Club benchmarks: effective performance evaluation of computers",
  journal={Intl. J. Supercomputer Appls.},
  volume=3,
  number=3,
  year=1989,
  pages="5-40"}

@incollection{Ma88,
  author="F. H. McMahon",
  title="The {L}ivermore {F}ortran {K}ernels test of the numerical performance range",
  editor="J. L. Martin",
  booktitle={Performance Evaluation of Supercomputers},
  publisher="Elsevier Science B.V., North-Holland",
  address="Amsterdam",
  year=1988,
  pages="143-186"}

@article{Mess90,
  author="P. Messina and C. Baillie and E. Felten and P. Hipes and R. Williams and A. Alagar and A. Kamrath and R. Leary and W. Pfeiffer and J. Rogers and D. Walker",
  title="Benchmarking advanced architecture computers",
  journal={Concurrency: Practice and Experience},
  volume=2,
  number=3,
  year=1990,
  pages="195-255"}

@inproceedings{Cvet90,
  author="Z.
Cvetanovic and E. G. Freedman and C. Nofsinger",
  title="Efficient decomposition and performance of parallel {PDE}, {FFT}, {M}onte-{C}arlo simulations, simplex and sparse solvers",
  booktitle={Proceedings Supercomputing90},
  publisher="IEEE",
  address="New York",
  year=1990,
  pages="465-474"}

@article{SUPR88,
  title="Proceedings 2nd International SUPRENUM Colloquium",
  author="U. Trottenberg",
  journal={Parallel Computing},
  volume=7,
  number=3,
  year=1988}

@article{Hey91,
  author="A. J. G. Hey",
  title="The {G}enesis Distributed-Memory Benchmarks",
  journal={Parallel Computing},
  volume=17,
  year=1991,
  pages="1275-1283"}

@book{F90,
  author="M. Metcalf and J. Reid",
  title={Fortran-90 Explained},
  publisher="Oxford Science Publications/OUP",
  address="Oxford and New York",
  year=1990,
  chapter=6}

@article{SPEC90,
  key="SPEC",
  title="{SPEC} Benchmarks Suite Release 1.0",
  journal={SPEC Newslett.},
  volume=2,
  number=3,
  year=1990,
  pages="3-4",
  publisher="Systems Performance Evaluation Cooperative, Waterside Associates",
  address="Fremont, California"}

@article{FGHS89,
  author="A. Friedli and W. Gentzsch and R. Hockney and A. van der Steen",
  title="A {E}uropean Supercomputer Benchmark Effort",
  journal={Supercomputer 34},
  volume="VI",
  number=6,
  year=1989,
  pages="14-17"}

@article{BRH90,
  author="L. Bomans and D. Roose and R. Hempel",
  title="The {A}rgonne/{GMD} Macros in {F}ortran for portable parallel programming and their implementation on the {I}ntel i{PSC}/2",
  journal={Parallel Computing},
  volume=15,
  year=1990,
  pages="119-132"}

@inproceedings{ShTu91,
  author="J. N. Shadid and R. S. Tuminaro",
  title="Iterative Methods for Nonsymmetric Systems on {MIMD} Machines",
  booktitle={Proc. Fifth SIAM Conf. Parallel Processing for Scientific Computing},
  year=1991}

@article{Bish90,
  author="N. T. Bishop and C. J. S. Clarke and R. A. d'Inverno",
  journal={Classical and Quantum Gravity},
  volume=7,
  year=1990,
  pages="L23-L27"}

@article{Isaac83,
  author="R. A. Isaacson and J. S. Welling and J. Winicour",
  journal={J.
Math. Phys.},
  volume=24,
  year=1983,
  pages="1824-1834"}

@article{Stew82,
  author="J. M. Stewart and H. Friedrich",
  journal={Proc. Roy. Soc.},
  volume="A384",
  year=1982,
  pages="427-454"}

@incollection{Hoc77,
  author="R. W. Hockney",
  title="Super-Computer Architecture",
  editor="F. Sumner",
  booktitle={Infotech State of the Art Conference: {F}uture {S}ystems},
  publisher="Infotech",
  address="Maidenhead",
  year=1977,
  pages="277-305"}

@article{Hoc82,
  author="R. W. Hockney",
  title="Characterization of parallel computers and algorithms",
  journal={Computer Physics Communications},
  volume=26,
  year=1982,
  pages="285-291"}

@article{Hoc83,
  author="R. W. Hockney",
  title="Characterizing Computers and Optimizing the {FACR}(l) Poisson-Solver on Parallel Unicomputers",
  journal={IEEE Trans. Comput.},
  volume="{C}-32",
  year=1983,
  pages="933-941"}

@article{Hoc87,
  author="R. W. Hockney",
  title="Parametrization of Computer Performance",
  journal={Parallel Computing},
  volume=5,
  year=1987,
  pages="97-103"}

@article{Hoc88,
  author="R. W. Hockney",
  title="Synchronization and Communication Overheads on the {LCAP} Multiple {FPS}-164 Computer System",
  journal={Parallel Computing},
  volume=9,
  year=1988,
  pages="279-290"}

@article{HoCu89,
  author="R. W. Hockney and I. J. Curington",
  title="$f_{\frac{1}{2}}$: a Parameter to Characterise Memory and Communication Bottlenecks",
  journal={Parallel Computing},
  volume=10,
  year=1989,
  pages="277-286"}

@article{Hoc91,
  author="R. W. Hockney",
  title="Performance Parameters and Benchmarking of Supercomputers",
  journal={Parallel Computing},
  volume=17,
  year=1991,
  pages="1111-1130"}

@article{Hoc92,
  author="R. W. Hockney",
  title="A framework for benchmark analysis",
  journal={Supercomputer},
  volume=48,
  number="IX-2",
  year=1992,
  pages="9-22"}

@article{HoCa92,
  author="R. W. Hockney and E. A.
Carmona",
  title="Comparison of Communications on the {I}ntel i{PSC}/860 and {T}ouchstone {D}elta",
  journal={Parallel Computing},
  volume=18,
  year=1992,
  pages="1067-1072"}

@article{Add93,
  author="C. Addison and J. Allwright and N. Binsted and N. Bishop and B. Carpenter and P. Dalloz and D. Gee and V. Getov and A. Hey and R. Hockney and M. Lemke and J. Merlin and M. Pinches and C. Scott and I. Wolton",
  title="The {G}enesis distributed-memory benchmarks. Part 1: methodology and general relativity benchmark with results for the {SUPRENUM} computer",
  journal={Concurrency: Practice and Experience},
  volume=5,
  number=1,
  year=1993,
  pages="1-22"}

@techreport{StRi93,
  author="A. J. van der Steen and P. P. M. de Rijk",
  title="Guidelines for use of the {E}uro{B}en Benchmark",
  institution="{E}uro{B}en",
  year=1993,
  month=feb,
  type="Technical Report",
  number="{TR}-3",
  address="The {E}uro{B}en Group, {U}trecht, {T}he {N}etherlands"}

From owner-pbwg-comm@CS.UTK.EDU Fri Apr 30 12:49:20 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA20526; Fri, 30 Apr 93 12:49:20 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA24215; Fri, 30 Apr 93 12:48:17 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 30 Apr 1993 12:48:16 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from vnet.ibm.com by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA24207; Fri, 30 Apr 93 12:48:15 -0400 Message-Id: <9304301648.AA24207@CS.UTK.EDU> Received: from KGNVMZ by vnet.IBM.COM (IBM VM SMTP V2R2) with BSMTP id 8031; Fri, 30 Apr 93 12:48:11 EDT Date: Fri, 30 Apr 93 12:47:34 EDT From: "Dr. Joanne L. Martin ((914) 385-9572)" To: pbwg-comm@cs.utk.edu Subject: New address

Please note that my new e-mail address is
jmartin at vnet.ibm.com

Thanks, Joanne.

From owner-pbwg-comm@CS.UTK.EDU Mon May 3 11:05:02 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA17408; Mon, 3 May 93 11:05:02 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA02905; Mon, 3 May 93 11:03:37 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 3 May 1993 11:03:35 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from swiba9.unibas.ch by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA02897; Mon, 3 May 93 11:03:28 -0400 Received: from sir.ifi.unibas.ch by swiba9.unibas.ch with SMTP (PP) id <21115-0@swiba9.unibas.ch>; Mon, 3 May 1993 17:03:12 +0200 Received: from charlie by sir.ifi.unibas.ch (NX5.67c/NX3.0M) id AA27670; Mon, 3 May 93 17:03:07 +0200 From: (Walter Kuhn) kuhn@ifi.unibas.ch Message-Id: <9305031503.AA27670@sir.ifi.unibas.ch> Received: by charlie.ifi.unibas.ch (NX5.67c/NX3.0X) id AA00509; Mon, 3 May 93 17:03:05 +0200 Date: Mon, 3 May 93 17:03:05 +0200 Received: by NeXT.Mailer (1.87.1) Received: by NeXT Mailer (1.87.1) To: pbwg-comm@cs.utk.edu Subject: send lowlevel.archive from pbwg From owner-pbwg-comm@CS.UTK.EDU Mon May 3 11:05:13 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA17427; Mon, 3 May 93 11:05:13 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA02925; Mon, 3 May 93 11:04:06 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 3 May 1993 11:04:05 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from swiba9.unibas.ch by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA02909; Mon, 3 May 93 11:03:56 -0400 Received: from sir.ifi.unibas.ch by swiba9.unibas.ch with SMTP (PP) id <21120-0@swiba9.unibas.ch>; Mon, 3 May 1993 17:03:33 +0200 Received: from charlie by sir.ifi.unibas.ch (NX5.67c/NX3.0M) id AA27674; Mon, 3 May 93 17:03:30 +0200 From: (Walter Kuhn) kuhn@ifi.unibas.ch Message-Id: <9305031503.AA27674@sir.ifi.unibas.ch> Received: by charlie.ifi.unibas.ch (NX5.67c/NX3.0X) id AA00514; Mon, 3 
May 93 17:03:30 +0200 Date: Mon, 3 May 93 17:03:30 +0200 Received: by NeXT.Mailer (1.87.1) Received: by NeXT Mailer (1.87.1) To: pbwg-comm@cs.utk.edu Subject: send method.archive from pbwg From owner-pbwg-comm@CS.UTK.EDU Mon May 3 11:05:38 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA17445; Mon, 3 May 93 11:05:38 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA02939; Mon, 3 May 93 11:04:35 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 3 May 1993 11:04:33 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from swiba9.unibas.ch by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA02928; Mon, 3 May 93 11:04:24 -0400 Received: from sir.ifi.unibas.ch by swiba9.unibas.ch with SMTP (PP) id <21132-0@swiba9.unibas.ch>; Mon, 3 May 1993 17:04:14 +0200 Received: from charlie by sir.ifi.unibas.ch (NX5.67c/NX3.0M) id AA27678; Mon, 3 May 93 17:04:10 +0200 From: (Walter Kuhn) kuhn@ifi.unibas.ch Message-Id: <9305031504.AA27678@sir.ifi.unibas.ch> Received: by charlie.ifi.unibas.ch (NX5.67c/NX3.0X) id AA00519; Mon, 3 May 93 17:04:10 +0200 Date: Mon, 3 May 93 17:04:10 +0200 Received: by NeXT.Mailer (1.87.1) Received: by NeXT Mailer (1.87.1) To: pbwg-comm@cs.utk.edu Subject: send compil.archive from pbwg From owner-pbwg-comm@CS.UTK.EDU Mon May 3 11:05:58 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA17467; Mon, 3 May 93 11:05:58 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA02984; Mon, 3 May 93 11:05:07 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 3 May 1993 11:05:05 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from swiba9.unibas.ch by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA02948; Mon, 3 May 93 11:04:53 -0400 Received: from sir.ifi.unibas.ch by swiba9.unibas.ch with SMTP (PP) id <21137-0@swiba9.unibas.ch>; Mon, 3 May 1993 17:04:34 +0200 Received: from charlie by sir.ifi.unibas.ch (NX5.67c/NX3.0M) id AA27683; Mon, 
3 May 93 17:04:30 +0200 From: (Walter Kuhn) kuhn@ifi.unibas.ch Message-Id: <9305031504.AA27683@sir.ifi.unibas.ch> Received: by charlie.ifi.unibas.ch (NX5.67c/NX3.0X) id AA00525; Mon, 3 May 93 17:04:30 +0200 Date: Mon, 3 May 93 17:04:30 +0200 Received: by NeXT.Mailer (1.87.1) Received: by NeXT Mailer (1.87.1) To: pbwg-comm@cs.utk.edu Subject: send conclu.archive from pbwg

From owner-pbwg-comm@CS.UTK.EDU Wed May 5 09:08:29 1993 Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34) id AA15181; Wed, 5 May 93 09:08:29 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA05335; Wed, 5 May 93 06:47:02 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 5 May 1993 06:46:59 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from THUD.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA05327; Wed, 5 May 93 06:46:53 -0400 From: Jack Dongarra Received: by thud.cs.utk.edu (5.61+IDA+UTK-930125/2.7c-UTK) id AA08869; Wed, 5 May 93 06:46:47 -0400 Date: Wed, 5 May 93 06:46:47 -0400 Message-Id: <9305051046.AA08869@thud.cs.utk.edu> To: pbwg-comm@cs.utk.edu Subject: PBWG Meeting May 24th

Dear Colleague,

We are planning to hold the third meeting of the Parallel Benchmark Working Group in Knoxville, Tennessee, at the University of Tennessee on May 24th, 1993. This process formally began with a workshop held at the Supercomputing '92 meeting in November 1992. The purpose of the working group is to establish credible and useful benchmarks for the evaluation of Distributed Memory MIMD systems. The objectives for the group are:

1. To establish a comprehensive set of parallel benchmarks that is generally accepted by both users and vendors of parallel systems.

2. To provide a focus for parallel benchmark activities and avoid unnecessary duplication of effort and proliferation of benchmarks.

3. To set standards for benchmarking methodology and result-reporting together with a control database/repository for both the benchmarks and the results.

Mode of Working:
The working group has adopted an HPF-like forum style of proceedings with a view to convergence to an agreed set of benchmarks and procedures within 10 months.

If you would like to participate and attend the meeting, let me know.

Mailing Lists
=============

The following mailing lists have been set up.

  pbwg-comm@cs.utk.edu          Whole committee
  pbwg-lowlevel@cs.utk.edu      Low level subcommittee
  pbwg-compactapp@cs.utk.edu    Compact applications subcommittee
  pbwg-method@cs.utk.edu        Methodology subcommittee
  pbwg-kernel@cs.utk.edu        Kernel subcommittee

If you are on a mailing list you will receive mail as it is posted. If you want to join a mailing list, send me mail (dongarra@cs.utk.edu).

All mail will be collected and can be retrieved by sending email to netlib@ornl.gov and in the mail message typing:

  send comm.archive from pbwg
  send lowlevel.archive from pbwg
  send compactapp.archive from pbwg
  send method.archive from pbwg
  send kernel.archive from pbwg
  send index from pbwg

The various subcommittees will look into the following topics:

Low-Level:
---------
  Start-up, latency, bandwidth
  Reduction (broadcast, sum, gather/scatter)
  Synchronization (e.g., SYNCH1 from Genesis)
  I/O

Kernel:
------
  Matrix operations (e.g., multiply, transpose)
  LU Decomposition
  PDE Solvers (Red/Black Relaxation)
  Multigrid
  FFT
  Conjugate Gradient

Compact Applications:
--------------------
  Particle-In-Cell codes (e.g., LPM1 from Genesis)
  QCD
  Molecular Dynamics
  CFD
  ARCO
  Financial Applications

Methodology:
------------
  Guidelines for reporting performance.

The meeting site will be:

  Science Alliance Conference Room
  South College
  University of Tennessee

(A PostScript map is included at the end of this message; South College is the building located next to Ayres Hall.)

We have made arrangements with the Hilton Hotel in Knoxville.

  Hilton Hotel
  501 W. Church Street
  Knoxville, TN
  Phone: 615-523-2300

When making arrangements, tell the hotel you are associated with the Parallel Benchmarking Meeting. The rate is $65.00/night.

You can rent a car or get a cab from the airport to the hotel. From the hotel to the University it is a 15-minute walk.

We should plan to start at 8:30 am May 24th and finish about 5:00 pm.

The format of the meeting is:

Monday 24th May
   8.30 - 12.00   Full group meeting
  12.00 -  1.30   Lunch
   1.30 -  4.00   Parallel subgroup meetings
   4.00 -  5.00   Full group meeting

Tentative agenda for full group meeting:

1. Minutes of Minneapolis meeting
2. Reports and discussion from subgroups
3. Open discussion and agreement on further actions
4. Date and venue for next meeting

Suggested subgroups - probably two in parallel:
  Compact Applications
  Low-Level benchmarks
and second pair:
  Kernels benchmarks
  Methodology

We have set up a mail reflector for correspondence; it is called pbwg-comm@cs.utk.edu. Mail to that address will be sent to the mailing list and also collected in netlib@ornl.gov. To retrieve the collected mail, send email to netlib@ornl.gov and in the mail message type:

  send comm.archive from pbwg

If you would like to be put on the mailing list to receive the correspondence, let me know.

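As a rough illustration of the retrieval step described above, the following Python sketch builds (but does not send) a request message for netlib. Only the address netlib@ornl.gov and the "send ... from pbwg" request lines come from the message; the function name build_request, the sender address user@example.org and the choice of Python are assumptions made for this example.

```python
from email.message import EmailMessage

# Address and archive names as listed in the message; everything else is illustrative.
NETLIB_ADDRESS = "netlib@ornl.gov"
ARCHIVES = ("comm", "lowlevel", "compactapp", "method", "kernel")


def build_request(sender, archives=ARCHIVES, include_index=False):
    """Build a mail message with one 'send <name>.archive from pbwg' line per archive.

    build_request is a hypothetical helper, not part of any netlib tool.
    """
    lines = [f"send {name}.archive from pbwg" for name in archives]
    if include_index:
        lines.append("send index from pbwg")
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = NETLIB_ADDRESS
    msg.set_content("\n".join(lines) + "\n")
    return msg


# Example: ask for the whole-committee archive and the index.
request = build_request("user@example.org", ["comm"], include_index=True)
print(request.get_content())
```

The message is only constructed here; handing it to a mail transport is left out on purpose, since how mail is delivered depends on the local setup.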
Regards, Jack Dongarra ---postscript map of the University of Tennessee--- %!PS-Adobe-2.0 EPSF-1.2 %%DocumentFonts: Helvetica-Bold Courier Courier-Bold Times-Bold %%Pages: 1 %%BoundingBox: 39 -113 604 767 %%EndComments /arrowHeight 10 def /arrowWidth 5 def /IdrawDict 54 dict def IdrawDict begin /reencodeISO { dup dup findfont dup length dict begin { 1 index /FID ne { def }{ pop pop } ifelse } forall /Encoding ISOLatin1Encoding def currentdict end definefont } def /ISOLatin1Encoding [ /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /space/exclam/quotedbl/numbersign/dollar/percent/ampersand/quoteright /parenleft/parenright/asterisk/plus/comma/minus/period/slash /zero/one/two/three/four/five/six/seven/eight/nine/colon/semicolon /less/equal/greater/question/at/A/B/C/D/E/F/G/H/I/J/K/L/M/N /O/P/Q/R/S/T/U/V/W/X/Y/Z/bracketleft/backslash/bracketright /asciicircum/underscore/quoteleft/a/b/c/d/e/f/g/h/i/j/k/l/m /n/o/p/q/r/s/t/u/v/w/x/y/z/braceleft/bar/braceright/asciitilde /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/dotlessi/grave/acute/circumflex/tilde/macron/breve /dotaccent/dieresis/.notdef/ring/cedilla/.notdef/hungarumlaut /ogonek/caron/space/exclamdown/cent/sterling/currency/yen/brokenbar /section/dieresis/copyright/ordfeminine/guillemotleft/logicalnot /hyphen/registered/macron/degree/plusminus/twosuperior/threesuperior /acute/mu/paragraph/periodcentered/cedilla/onesuperior/ordmasculine /guillemotright/onequarter/onehalf/threequarters/questiondown /Agrave/Aacute/Acircumflex/Atilde/Adieresis/Aring/AE/Ccedilla /Egrave/Eacute/Ecircumflex/Edieresis/Igrave/Iacute/Icircumflex /Idieresis/Eth/Ntilde/Ograve/Oacute/Ocircumflex/Otilde/Odieresis 
/multiply/Oslash/Ugrave/Uacute/Ucircumflex/Udieresis/Yacute /Thorn/germandbls/agrave/aacute/acircumflex/atilde/adieresis /aring/ae/ccedilla/egrave/eacute/ecircumflex/edieresis/igrave /iacute/icircumflex/idieresis/eth/ntilde/ograve/oacute/ocircumflex /otilde/odieresis/divide/oslash/ugrave/uacute/ucircumflex/udieresis /yacute/thorn/ydieresis ] def /Helvetica-Bold reencodeISO def /Courier reencodeISO def /Courier-Bold reencodeISO def /Times-Bold reencodeISO def /none null def /numGraphicParameters 17 def /stringLimit 65535 def /Begin { save numGraphicParameters dict begin } def /End { end restore } def /SetB { dup type /nulltype eq { pop false /brushRightArrow idef false /brushLeftArrow idef true /brushNone idef } { /brushDashOffset idef /brushDashArray idef 0 ne /brushRightArrow idef 0 ne /brushLeftArrow idef /brushWidth idef false /brushNone idef } ifelse } def /SetCFg { /fgblue idef /fggreen idef /fgred idef } def /SetCBg { /bgblue idef /bggreen idef /bgred idef } def /SetF { /printSize idef /printFont idef } def /SetP { dup type /nulltype eq { pop true /patternNone idef } { dup -1 eq { /patternGrayLevel idef /patternString idef } { /patternGrayLevel idef } ifelse false /patternNone idef } ifelse } def /BSpl { 0 begin storexyn newpath n 1 gt { 0 0 0 0 0 0 1 1 true subspline n 2 gt { 0 0 0 0 1 1 2 2 false subspline 1 1 n 3 sub { /i exch def i 1 sub dup i dup i 1 add dup i 2 add dup false subspline } for n 3 sub dup n 2 sub dup n 1 sub dup 2 copy false subspline } if n 2 sub dup n 1 sub dup 2 copy 2 copy false subspline patternNone not brushLeftArrow not brushRightArrow not and and { ifill } if brushNone not { istroke } if 0 0 1 1 leftarrow n 2 sub dup n 1 sub dup rightarrow } if end } dup 0 4 dict put def /Circ { newpath 0 360 arc patternNone not { ifill } if brushNone not { istroke } if } def /CBSpl { 0 begin dup 2 gt { storexyn newpath n 1 sub dup 0 0 1 1 2 2 true subspline 1 1 n 3 sub { /i exch def i 1 sub dup i dup i 1 add dup i 2 add dup false subspline } for n 
3 sub dup n 2 sub dup n 1 sub dup 0 0 false subspline n 2 sub dup n 1 sub dup 0 0 1 1 false subspline patternNone not { ifill } if brushNone not { istroke } if } { Poly } ifelse end } dup 0 4 dict put def /Elli { 0 begin newpath 4 2 roll translate scale 0 0 1 0 360 arc patternNone not { ifill } if brushNone not { istroke } if end } dup 0 1 dict put def /Line { 0 begin 2 storexyn newpath x 0 get y 0 get moveto x 1 get y 1 get lineto brushNone not { istroke } if 0 0 1 1 leftarrow 0 0 1 1 rightarrow end } dup 0 4 dict put def /MLine { 0 begin storexyn newpath n 1 gt { x 0 get y 0 get moveto 1 1 n 1 sub { /i exch def x i get y i get lineto } for patternNone not brushLeftArrow not brushRightArrow not and and { ifill } if brushNone not { istroke } if 0 0 1 1 leftarrow n 2 sub dup n 1 sub dup rightarrow } if end } dup 0 4 dict put def /Poly { 3 1 roll newpath moveto -1 add { lineto } repeat closepath patternNone not { ifill } if brushNone not { istroke } if } def /Rect { 0 begin /t exch def /r exch def /b exch def /l exch def newpath l b moveto l t lineto r t lineto r b lineto closepath patternNone not { ifill } if brushNone not { istroke } if end } dup 0 4 dict put def /Text { ishow } def /idef { dup where { pop pop pop } { exch def } ifelse } def /ifill { 0 begin gsave patternGrayLevel -1 ne { fgred bgred fgred sub patternGrayLevel mul add fggreen bggreen fggreen sub patternGrayLevel mul add fgblue bgblue fgblue sub patternGrayLevel mul add setrgbcolor eofill } { eoclip originalCTM setmatrix pathbbox /t exch def /r exch def /b exch def /l exch def /w r l sub ceiling cvi def /h t b sub ceiling cvi def /imageByteWidth w 8 div ceiling cvi def /imageHeight h def bgred bggreen bgblue setrgbcolor eofill fgred fggreen fgblue setrgbcolor w 0 gt h 0 gt and { l b translate w h scale w h true [w 0 0 h neg 0 h] { patternproc } imagemask } if } ifelse grestore end } dup 0 8 dict put def /istroke { gsave brushDashOffset -1 eq { [] 0 setdash 1 setgray } { brushDashArray 
brushDashOffset setdash fgred fggreen fgblue setrgbcolor } ifelse brushWidth setlinewidth originalCTM setmatrix stroke grestore } def /ishow { 0 begin gsave fgred fggreen fgblue setrgbcolor /fontDict printFont printSize scalefont dup setfont def /descender fontDict begin 0 [FontBBox] 1 get FontMatrix end transform exch pop def /vertoffset 1 printSize sub descender sub def { 0 vertoffset moveto show /vertoffset vertoffset printSize sub def } forall grestore end } dup 0 3 dict put def /patternproc { 0 begin /patternByteLength patternString length def /patternHeight patternByteLength 8 mul sqrt cvi def /patternWidth patternHeight def /patternByteWidth patternWidth 8 idiv def /imageByteMaxLength imageByteWidth imageHeight mul stringLimit patternByteWidth sub min def /imageMaxHeight imageByteMaxLength imageByteWidth idiv patternHeight idiv patternHeight mul patternHeight max def /imageHeight imageHeight imageMaxHeight sub store /imageString imageByteWidth imageMaxHeight mul patternByteWidth add string def 0 1 imageMaxHeight 1 sub { /y exch def /patternRow y patternByteWidth mul patternByteLength mod def /patternRowString patternString patternRow patternByteWidth getinterval def /imageRow y imageByteWidth mul def 0 patternByteWidth imageByteWidth 1 sub { /x exch def imageString imageRow x add patternRowString putinterval } for } for imageString end } dup 0 12 dict put def /min { dup 3 2 roll dup 4 3 roll lt { exch } if pop } def /max { dup 3 2 roll dup 4 3 roll gt { exch } if pop } def /midpoint { 0 begin /y1 exch def /x1 exch def /y0 exch def /x0 exch def x0 x1 add 2 div y0 y1 add 2 div end } dup 0 4 dict put def /thirdpoint { 0 begin /y1 exch def /x1 exch def /y0 exch def /x0 exch def x0 2 mul x1 add 3 div y0 2 mul y1 add 3 div end } dup 0 4 dict put def /subspline { 0 begin /movetoNeeded exch def y exch get /y3 exch def x exch get /x3 exch def y exch get /y2 exch def x exch get /x2 exch def y exch get /y1 exch def x exch get /x1 exch def y exch get /y0 exch def x exch 
get /x0 exch def x1 y1 x2 y2 thirdpoint /p1y exch def /p1x exch def x2 y2 x1 y1 thirdpoint /p2y exch def /p2x exch def x1 y1 x0 y0 thirdpoint p1x p1y midpoint /p0y exch def /p0x exch def x2 y2 x3 y3 thirdpoint p2x p2y midpoint /p3y exch def /p3x exch def movetoNeeded { p0x p0y moveto } if p1x p1y p2x p2y p3x p3y curveto end } dup 0 17 dict put def /storexyn { /n exch def /y n array def /x n array def n 1 sub -1 0 { /i exch def y i 3 2 roll put x i 3 2 roll put } for } def %%EndProlog %%BeginIdrawPrologue /arrowhead { 0 begin transform originalCTM itransform /taily exch def /tailx exch def transform originalCTM itransform /tipy exch def /tipx exch def /dy tipy taily sub def /dx tipx tailx sub def /angle dx 0 ne dy 0 ne or { dy dx atan } { 90 } ifelse def gsave originalCTM setmatrix tipx tipy translate angle rotate newpath arrowHeight neg arrowWidth 2 div moveto 0 0 lineto arrowHeight neg arrowWidth 2 div neg lineto patternNone not { originalCTM setmatrix /padtip arrowHeight 2 exp 0.25 arrowWidth 2 exp mul add sqrt brushWidth mul arrowWidth div def /padtail brushWidth 2 div def tipx tipy translate angle rotate padtip 0 translate arrowHeight padtip add padtail add arrowHeight div dup scale arrowheadpath ifill } if brushNone not { originalCTM setmatrix tipx tipy translate angle rotate arrowheadpath istroke } if grestore end } dup 0 9 dict put def /arrowheadpath { newpath arrowHeight neg arrowWidth 2 div moveto 0 0 lineto arrowHeight neg arrowWidth 2 div neg lineto } def /leftarrow { 0 begin y exch get /taily exch def x exch get /tailx exch def y exch get /tipy exch def x exch get /tipx exch def brushLeftArrow { tipx tipy tailx taily arrowhead } if end } dup 0 4 dict put def /rightarrow { 0 begin y exch get /tipy exch def x exch get /tipx exch def y exch get /taily exch def x exch get /tailx exch def brushRightArrow { tipx tipy tailx taily arrowhead } if end } dup 0 4 dict put def %%EndIdrawPrologue %I Idraw 10 Grid 2.84217e-39 0 %%Page: 1 1 Begin %I b u %I cfg u %I cbg 
u %I f u %I p u %I t [ 0.799705 0 0 0.799705 0 0 ] concat /originalCTM matrix currentmatrix def Begin %I Pict %I b u %I cfg u %I cbg u %I f u %I p u %I t [ 1 0 0 1 89.1002 831.2 ] concat Begin %I Elli %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.09512e-08 0.9 -0.9 1.09512e-08 584.1 0.89999 ] concat %I 79 550 18 17 Elli End Begin %I Line %I b 65535 3 0 1 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.09512e-08 0.9 -0.9 1.09512e-08 541.8 -27 ] concat %I 110 466 110 542 Line %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.09512e-08 0.9 -0.9 1.09512e-08 541.8 -27 ] concat %I 82 504 140 504 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-helvetica-bold-r-*-140-* Helvetica-Bold 14 SetF %I t [ 1.2168e-08 1 -1 1.2168e-08 35.5 66.1 ] concat %I [ (N) ] Text End End %I eop Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.23147 0.0157385 -0.0157385 -1.23147 409.218 -127.169 ] concat %I [ (Voluteer Boulevard) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 3.66661e-08 3.01333 -3.01333 3.66661e-08 57.9267 164.513 ] concat %I [ (UT Campus -- Jack Dongarra's Lab) ] Text End Begin %I Rect none SetB %I b n %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 1 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 693.5 -156 ] concat %I 17 61 177 478 Rect End Begin %I Line %I b 65535 2 1 1 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 6.9812e-09 0.698798 -0.573736 8.50294e-09 453.281 -79.7194 ] concat %I 158 545 1015 545 Line %I 1 End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 333.274 -81.9323 ] concat %I 257 403 4 4 Elli End Begin %I Line %I b 
65535 2 0 1 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 7.02094e-09 0.702776 -0.577002 8.55135e-09 455.344 -80.5588 ] concat %I 211 544 211 17 Line %I 1 End Begin %I BSpl %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 6.9812e-09 0.698798 -0.573736 8.50294e-09 455.594 -124.448 ] concat %I 5 503 545 514 535 546 529 628 529 628 529 5 BSpl %I 1 End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 333.274 76.6169 ] concat %I 257 403 4 4 Elli End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 403.382 -81.1611 ] concat %I 257 403 4 4 Elli End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.15378e-08 1.1549 -0.948215 1.40528e-08 222.192 481.065 ] concat %I [ (DOWN) (TOWN) ] Text End Begin %I Poly %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.75 SetP %I t [ 2.43189e-09 0.222447 -0.19986 2.70672e-09 310.54 254.841 ] concat %I 4 301 50 850 50 850 537 301 537 4 Poly End Begin %I BSpl %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 9 251 541 251 516 251 507 257 495 263 489 413 491 472 496 485 499 485 499 9 BSpl %I 1 End Begin %I BSpl %I b 65520 2 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.18822e-09 0.43379 -0.755115 5.27834e-09 620.126 111.397 ] concat %I 6 486 498 502 500 510 504 514 514 514 529 513 542 6 BSpl %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2378 ] concat %I 476 654 476 
542 Line %I 1 End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 445.797 -81.1611 ] concat %I 257 403 4 4 Elli End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 252.947 159.612 ] concat %I [ (Volunteer Boulevard) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 310.947 152.672 ] concat %I [ (Neyland Drive) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 198.965 155.756 ] concat %I [ (Cumberland Avenue) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 130.329 90.977 ] concat %I [ (Interstate 40) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 130.329 403.305 ] concat %I [ (Interstate 40) ] Text End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 332.433 366.124 ] concat %I 257 403 4 4 Elli End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 107 542 486 541 Line %I 1 End Begin %I Line %I b 65520 2 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 487 541 664 541 Line %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 
9.38371e-09 628.85 -52.2379 ] concat %I 665 541 682 540 Line %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -51.4666 ] concat %I 681 599 681 485 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.931845 2.26773e-08 -2.26773e-08 -0.931845 205.214 462.905 ] concat %I [ (Henley Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.931845 2.26773e-08 -2.26773e-08 -0.931845 193.647 318.042 ] concat %I [ (17th Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 110.137 205.729 ] concat %I [ (17th Street Exit) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 110.137 26.0435 ] concat %I [ (Airport/Alcoa Highway Exit) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-times-bold-r-*-140-* Times-Bold 14 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 171.727 80.0262 ] concat %I [ (Cumberland Avenue Exit) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 265.915 92.3651 ] concat %I [ (Neyland Drive Exit) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 110.137 509.574 ] concat %I [ (Summit Hill Exit) ] Text End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 144 661 155 636 Line %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none 
SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 372 662 361 634 Line %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 752 660 736 634 Line %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2378 ] concat %I 186 466 160 486 Line %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2378 ] concat %I 180 576 157 542 Line %I 1 End Begin %I Line %I b 65520 0 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -5.99997 ] concat %I 475 50 328 492 Line %I 1 End Begin %I Line %I b 65520 0 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -5.99994 ] concat %I 475 483 329 589 Line %I 1 End Begin %I Line %I b 65520 0 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -6 ] concat %I 962 483 450 588 Line %I 1 End Begin %I Line %I b 65520 0 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -6 ] concat %I 450 491 474 471 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 456.136 36.7289 ] concat %I [ (To Airport) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 137.137 639.729 ] concat %I [ (To Ashville, Bristol) ] Text End Begin %I Elli %I b 65535 2 0 
0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ -0.288762 0.966405 -0.966405 -0.288762 1070.92 66.5381 ] concat %I 743 193 51 70 Elli End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 600.094 464.292 ] concat %I 11 203 116 212 116 212 140 226 140 226 122 222 122 222 102 212 102 212 110 203 110 203 113 11 Poly End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.0258e-08 0.843029 -0.843029 1.0258e-08 619.253 516.221 ] concat %I 12 183 114 183 124 193 124 193 135 205 135 205 125 248 90 242 83 239 86 231 76 194 106 194 114 12 Poly End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.11572e-08 0.810855 -0.916932 9.86645e-09 659.418 526.458 ] concat %I 8 251 106 251 126 300 126 300 126 300 116 289 116 289 106 289 106 8 Poly End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.63639e-08 1.34483 -1.34483 1.63639e-08 1018.44 -282.452 ] concat %I 791 383 799 395 Rect End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 900.11 -9.08362 ] concat %I 12 776 339 776 352 798 352 803 356 808 350 807 348 812 342 808 336 803 341 789 343 782 333 776 333 12 Poly End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.13408e-08 0.952586 -0.932017 1.1591e-08 882.447 11.9457 ] concat %I 885 359 901 436 Rect End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 1 SetP %I t [ -0.249802 0.836016 -0.836016 -0.249802 1016.8 155.898 ] concat %I 743 193 51 70 Elli End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg 
White 1 1 1 SetCBg %I p 0.75 SetP %I t [ -0.289327 0.966236 -0.966236 -0.289327 1071.6 80.2259 ] concat %I 707 160 754 232 Rect End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 0.240969 0.462035 -0.606024 0.318423 615.648 549.235 ] concat %I 13 164 162 182 162 182 167 235 167 234 162 254 162 254 134 234 133 235 129 183 129 183 134 164 134 164 149 13 Poly End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 9.95725e-09 0.885621 -0.818318 1.07762e-08 598.215 291.204 ] concat %I 385 148 422 197 Rect End Begin %I Pict %I b u %I cfg Black 0 0 0 SetCFg %I cbg u %I f u %I p u %I t [ 1.05665 0 0 1.05665 213.224 -6.32959 ] concat Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg DkGray 0.501961 0.501961 0.501961 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 8.1286e-09 0.496511 -0.668033 6.04153e-09 312.332 583.056 ] concat %I 13 204 92 204 123 226 123 226 113 248 113 248 123 264 123 264 92 248 92 248 95 226 95 226 92 225 92 13 Poly End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg DkGray 0.501961 0.501961 0.501961 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 8.1286e-09 -0.496511 -0.668033 -6.04153e-09 312.332 845.214 ] concat %I 13 204 92 204 123 226 123 226 113 248 113 248 123 264 123 264 92 248 92 248 95 226 95 226 92 225 92 13 Poly End Begin %I Rect none SetB %I b n %I cfg DkGray 0.501961 0.501961 0.501961 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 8.1286e-09 0.496511 -0.668033 6.04153e-09 311.664 582.559 ] concat %I 258 92 274 121 Rect End End %I eop Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 532.584 731.01 ] concat %I [ (Physics) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 555.924 783.814 ] concat %I [ (Geography) ] Text 
End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 565.058 787.873 ] concat %I [ (& Geology) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 0.737179 0.804195 -0.804195 0.737179 504.155 697.286 ] concat %I [ (Biology) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 366.956 772.951 ] concat %I [ (13th Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 511.152 538.704 ] concat %I [ (Volunteer Boulevard) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 0.46602 0.986397 -0.986397 0.466021 521.533 631.373 ] concat %I [ (Middle Way) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 373.044 540.56 ] concat %I [ (16th Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 363.799 672.151 ] concat %I [ (Walters) (Life) (Sciences) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 537.034 899.133 ] concat %I [ (Daughtery) (Engineering) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.60693e-08 1.32061 -1.32061 1.60693e-08 657.627 707.825 ] concat %I [ (Neyland) (Stadium) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.0536 -0.283024 0.283024 -1.0536 624.893 639.464 ] concat %I [ (Stadium Drive) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier
8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 450.384 486.441 ] concat %I [ (Library) ] Text End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.645 -1.75617 ] concat %I 4 483 431 523 431 523 391 481 389 4 Poly End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 412.866 555.498 ] concat %I [ (University) ( Center) ] Text End Begin %I BSpl %I b 65520 2 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.438 0.0836792 ] concat %I 6 753 467 837 468 841 464 846 464 843 420 841 419 6 BSpl %I 1 End Begin %I BSpl %I b 65520 2 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.921 0.182861 ] concat %I 6 788 313 830 316 840 348 841 386 843 417 843 418 6 BSpl %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.435 0.47699 ] concat %I 887 450 839 438 Line %I 1 End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.589 0.222778 ] concat %I 9 807 460 838 460 838 405 816 404 816 394 831 394 831 379 807 378 806 379 9 Poly End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.838 0.0155029 ] concat %I 796 369 781 389 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 523.481 806.106 ] concat %I [ (South) (College) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f 
*-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 1.21714e-08 1.00029 -1.00029 1.21714e-08 439.801 711.069 ] concat %I [ (Ayres Hall) ] Text End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07786e-08 0.885813 -0.885813 1.07786e-08 771.842 61.8701 ] concat %I 943 284 915 303 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 459.421 900.429 ] concat %I [ (Dabney/) (Buhler) ] Text End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.07786e-08 0.885813 -0.885813 1.07786e-08 767.413 57.441 ] concat %I 698 424 663 374 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-times-bold-r-*-140-* Times-Bold 14 SetF %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 464.596 743.589 ] concat %I [ (X) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09087 0.0128026 -0.0128027 -1.09087 503.693 612.151 ] concat %I [ (Stadium Drive) ] Text End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p < cc cc 33 33 cc cc 33 33 > -1 SetP %I t [ 1.07786e-08 0.885813 -0.885813 1.07786e-08 855.109 167.282 ] concat %I 465 390 498 427 Rect End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 6.73658e-10 0.0553633 -0.0553633 6.73658e-10 494.25 582.008 ] concat %I 11 509 1018 237 1018 237 1114 -35 1114 -35 1018 -307 1018 -307 746 -35 746 -35 378 509 378 509 380 11 Poly End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 485.903 693.513 ] concat %I [ (Psychology) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 9.87788e-09 0.811792 -0.811792 
9.87788e-09 526.522 575.313 ] concat %I [ (Parking) (Garage) ] Text End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p < cc cc 33 33 cc cc 33 33 > -1 SetP %I t [ 1.07786e-08 0.885813 -0.885813 1.07786e-08 774.943 159.309 ] concat %I 481 283 491 300 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 368.693 657.005 ] concat %I [ (15th Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 374.042 893.422 ] concat %I [ (11th Street) ] Text End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07785e-08 0.885813 -0.885813 1.07785e-08 786.901 202.714 ] concat %I 18 476 304 494 301 501 299 527 283 545 269 569 255 590 244 611 241 648 238 672 234 686 219 705 203 741 204 767 217 776 236 774 284 776 343 773 538 18 BSpl %I 1 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07785e-08 0.885813 -0.885813 1.07785e-08 786.901 202.714 ] concat %I 376 429 375 538 Line %I 1 End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07785e-08 0.885813 -0.885813 1.07785e-08 786.901 202.714 ] concat %I 8 376 429 375 275 369 238 352 223 345 219 323 208 303 203 303 204 8 BSpl %I 1 End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07785e-08 0.885813 -0.885813 1.07785e-08 786.901 202.714 ] concat %I 4 301 50 850 50 850 537 301 537 4 Poly End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -5.99995 ] concat %I 5 814 52 872 66 910 82 947 107 961 127 5 BSpl %I 1 End Begin %I 
Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.520591 0.958718 -0.958718 -0.520591 712.173 866.792 ] concat %I [ (Neyland Drive) ] Text End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 653.5 ] concat %I 24 459 412 422 314 224 273 224 233 224 152 305 80 345 72 386 31 418 -33 418 -122 394 -316 474 -461 652 -542 854 -566 1015 -566 1152 -501 1184 -445 1176 -203 1168 15 1031 233 789 314 547 314 432 351 428 350 24 BSpl %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 653.5 ] concat %I 916 415 915 1204 Line %I 8 End Begin %I BSpl %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 620.75 ] concat %I 5 486 195 402 153 394 72 402 -73 394 -73 5 BSpl %I 8 End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 604.375 ] concat %I 11 133 415 132 396 110 602 231 715 316 751 330 765 387 857 351 977 344 1126 358 1041 351 1190 11 BSpl %I 8 End Begin %I MLine %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 604.375 ] concat %I 3 133 410 153 -454 613 -2226 3 MLine %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 604.375 ] concat %I 663 276 137 276 Line %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 527 522.5 ] concat %I 96 105 808 79 Line %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 
SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 527 522.5 ] concat %I 95 105 -425 89 Line %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 483.5 457 ] concat %I 99 587 3984 588 Line %I 8 End Begin %I BSpl %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 6.91136e-09 0.653301 -0.567997 7.94935e-09 450.164 -70.1652 ] concat %I 15 211 347 224 320 247 286 278 265 315 254 368 252 499 251 582 255 629 257 783 266 863 302 880 318 903 384 900 434 898 545 15 BSpl %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 1.2168e-08 1 -1 1.2168e-08 288.5 532.5 ] concat %I [ (Jack Dongarra's office in Ayres Hall Room 107) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ -0.792068 1.92757e-08 -1.92757e-08 -0.792068 428.69 73.2605 ] concat %I [ (Airport/Alcoa Highway) ] Text End Begin %I Rect %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p < 88 44 22 11 88 44 22 11 > -1 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 610 221 ] concat %I 258 413 267 424 Rect End Begin %I Rect %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p < 88 44 22 11 88 44 22 11 > -1 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 593 221 ] concat %I 258 413 267 424 Rect End Begin %I MLine %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 773 -23 ] concat %I 3 558 602 521 602 507 598 3 MLine %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 773 -23 ] concat %I 326 526 326 467 Line %I 1 End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 
0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 773 -23 ] concat %I 564 368 588 385 Rect End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 773 26.9999 ] concat %I 564 368 588 385 Rect End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 8.96587e-09 0.736842 -0.736842 8.96587e-09 708.816 120.079 ] concat %I 4 616 414 631 414 631 431 616 431 4 Poly End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.855758 0.676646 -0.676646 -0.855758 381.905 599.524 ] concat %I [ (Law Building) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.855758 0.676646 -0.676646 -0.855758 383.905 547.524 ] concat %I [ (Pan-Hellenic Bldg.) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.855758 0.676646 -0.676646 -0.855758 384.905 572.524 ] concat %I [ (International House.)
] Text End Begin %I MLine %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 773 -23 ] concat %I 2 559 582 507 581 2 MLine %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 186.965 539.756 ] concat %I [ (Ramada Inn) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 167.965 538.756 ] concat %I [ (Hilton) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 1.2168e-08 1 -1 1.2168e-08 514.5 56.5 ] concat %I [ (Directions from the airport to Ayres Hall:) () ( Alcoa Highway North to Cumberland Avenue) () ( Cumberland Avenue east to Stadium Drive) ( \(Stadium Dr. is across from 15th St.\)) () ( Park at Parking Garage and walk up hill) ( to largest building, Ayres Hall) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ -0.0156231 0.999878 -0.999878 -0.0156231 651.985 47.9375 ] concat %I [ (Jack Dongarra's office phone 615-974-8295) ] Text End End %I eop showpage %%Trailer end From owner-pbwg-comm@CS.UTK.EDU Fri May 14 16:21:50 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA11143; Fri, 14 May 93 16:21:50 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA13482; Fri, 14 May 93 16:21:26 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 14 May 1993 16:21:25 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from DASHER.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA13476; Fri, 14 May 93 16:21:24 -0400 From: Jack Dongarra Received: by dasher.cs.utk.edu (5.61+IDA+UTK-930125/2.7c-UTK) id AA04396; Fri, 14 May 93 16:21:23 -0400 Date: Fri, 14 May 93 16:21:23 -0400 Message-Id:
<9305142021.AA04396@dasher.cs.utk.edu> To: pbwg-comm@cs.utk.edu Subject: may meeting I would like to get a rough idea how many people will attend the May 24th Parallel Benchmark Working Group meeting in Knoxville. If you are planning to attend please send me email. Thanks, Jack From owner-pbwg-comm@CS.UTK.EDU Mon May 17 04:19:22 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA14900; Mon, 17 May 93 04:19:22 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA06401; Mon, 17 May 93 04:16:33 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 17 May 1993 04:16:32 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from Mail.Think.COM by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA06388; Mon, 17 May 93 04:16:25 -0400 Received: from Godot.Think.COM by mail.think.com; Mon, 17 May 93 04:16:22 -0400 Received: by godot.think.com (4.1/Think-1.2) id AA11666; Mon, 17 May 93 04:16:21 EDT Message-Id: <9305170816.AA11666@godot.think.com> To: Jack Dongarra Cc: pbwg-comm@cs.utk.edu Subject: Re: may meeting In-Reply-To: Your message of "Fri, 14 May 93 16:21:23 EDT." 
<9305142021.AA04396@dasher.cs.utk.edu> Date: Mon, 17 May 93 04:16:20 EDT From: Dennis Parkinson Sorry I am unable to attend the may 24 meeting From owner-pbwg-comm@CS.UTK.EDU Tue May 18 09:30:50 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA23405; Tue, 18 May 93 09:30:50 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA07104; Tue, 18 May 93 09:30:05 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 18 May 1993 09:29:58 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from ben.uknet.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA07060; Tue, 18 May 93 09:29:47 -0400 Message-Id: <9305181329.AA07060@CS.UTK.EDU> Received: from eros.uknet.ac.uk by ben.uknet.ac.uk via UKIP with SMTP (PP) id ; Tue, 18 May 1993 14:29:31 +0100 Received: from newton.npl.co.uk by eros.uknet.ac.uk via PSS with NIFTP (PP) id <3191-0@eros.uknet.ac.uk>; Tue, 18 May 1993 14:29:28 +0100 Date: Tue, 18 May 93 14:29 GMT From: Trevor Chambers To: PBWG-COMM COMMITTEE: Whole TOPIC: Boundaries between the benchmark sets CONTENT: How big a benchmark is allowed in compact applications? ACTUAL MESSAGE: Do we need to delineate more carefully the boundaries between the benchmark sets? In particular where is the boundary between 'low level' benchmarks and 'kernels' and between `kernels' and `compact applications' What is the boundary between 'compact applications' and larger pieces of code? 
Trevor Chambers pp Ed Brocklehurst From owner-pbwg-comm@CS.UTK.EDU Tue May 18 13:21:52 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA26444; Tue, 18 May 93 13:21:52 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA24080; Tue, 18 May 93 13:21:18 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 18 May 1993 13:21:17 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA24067; Tue, 18 May 93 13:21:08 -0400 Via: uk.ac.southampton.ecs; Tue, 18 May 1993 17:18:11 +0100 Via: brewery.ecs.soton.ac.uk; Tue, 18 May 93 17:10:43 BST From: Tony Hey Received: from pleasuredome.ecs.soton.ac.uk by brewery.ecs.soton.ac.uk; Tue, 18 May 93 17:19:46 BST Message-Id: <11358.9305181619@pleasuredome.ecs.soton.ac.uk> Subject: What's in a name To: pbwg-comm@cs.utk.edu Date: Tue, 18 May 1993 17:19:38 +0100 (BST) X-Mailer: ELM [version 2.4 PL0] Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Length: 577 Name of proposed benchmark suite Since the last meeting I have been thinking over the possibilities and doing some (random) market research. The conclusion I came to was that Pearl's suggestion: PARKBENCH - Parallel Kernels and Benchmark was really very good. Just like a SPEC mark for workstations, the idea of a PARKBENCH mark (or marks) for parallel systems seems a nice thing to say. Similarly, referring to the Parkbench Suite sounds OK. I think this is much better than Paraben, Interben or some such name. More discussion over cocktails in Knoxville?
Tony Hey From owner-pbwg-comm@CS.UTK.EDU Tue May 18 13:34:13 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA26493; Tue, 18 May 93 13:34:13 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA25132; Tue, 18 May 93 13:33:48 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 18 May 1993 13:33:47 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from BERRY.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA25126; Tue, 18 May 93 13:33:46 -0400 Received: from LOCALHOST by berry.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.7c-UTK) id AA00313; Tue, 18 May 93 13:31:12 -0400 Message-Id: <9305181731.AA00313@berry.cs.utk.edu> To: Tony Hey Cc: pbwg-comm@cs.utk.edu Subject: Re: What's in a name In-Reply-To: Your message of "Tue, 18 May 1993 17:19:38 BST." <11358.9305181619@pleasuredome.ecs.soton.ac.uk> Date: Tue, 18 May 1993 13:31:11 -0400 From: "Michael W. Berry" > > > Name of proposed benchmark suite > > Since the last meeting I have been thinking over the > possibilities and doing some (random) market research. > > The conclusion I came to was that Pearl's suggestion: > > PARKBENCH - Parallel Kernels and Benchmark > > was really very good. > > Just like a SPEC mark for workstations, the idea of a > PARKBENCH mark (or marks) for parallel systems seems > a nice thing to say. Similarly referring to the > Parkbench Suite sound OK. > > I think this is much better than Paraben, Interben or > somesuch name. > > More discussion over cocktails in Knoxville? I tend to agree that PARKBENCH is very nice and would be willing to vote for it. Mike --- Michael W. Berry ___-___ o==o====== . . . . . 
Ayres 114 =========== ||// Department of \ \ |//__ Computer Science #_______/ berry@cs.utk.edu University of Tennessee (615) 974-3838 [OFF] Knoxville, TN 37996-1301 (615) 974-4404 [FAX] From owner-pbwg-comm@CS.UTK.EDU Wed May 26 14:04:06 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA07523; Wed, 26 May 93 14:04:06 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA10932; Wed, 26 May 93 14:02:58 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 26 May 1993 14:02:57 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from BERRY.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA10926; Wed, 26 May 93 14:02:55 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (5.61++/2.7c-UTK) id AA11175; Wed, 26 May 93 14:02:54 -0400 Message-Id: <9305261802.AA11175@berry.cs.utk.edu> To: pbwg-comm@cs.utk.edu Subject: Minutes for Editing Date: Wed, 26 May 1993 14:02:53 -0400 From: "Michael W. Berry" Colleagues, please review the minutes below and edit where necessary. Please precede all lines that are modified/added with ">>>" so I can merge all your changes easily. Thanks, Mike B. ----------------------------------------------------------------------- Minutes of the PARKBENCH (Formerly PBWG) Workshop ------------------------------------------------- Place: Science Alliance Conference Room South College University of Tennessee Knoxville, TN Host: Jack Dongarra ORNL/Univ. of Tennessee Date: May 24, 1993 Attendees/Affiliations: ---------------------- David Bailey, NASA Michael Berry, Univ. of Tennessee Jack Dongarra, Univ. of Tennessee / ORNL Charles Grassl, Cray Research Tom Haupt, Syracuse Univ. Tony Hey, Southampton Univ. Roger Hockney, Southampton Univ. Brian LaRose, Univ.
of Tennessee David Mackay, Intel SSD Joanne Martin, IBM Robert Pennington, Pittsburgh Supercomputing Center David Walker, ORNL Patrick Worley, ORNL Ramesh Natarajan, IBM, Yorktown Heights Bodo Parady, Sun Microsystems Ed Kushner, Intel SSD Agenda: May 24, 1993 -------------------- At 8:36 am EDT, Roger Hockney gave opening remarks and welcomed all participants to the workshop. Each participant introduced himself or herself by affiliation and interests. The minutes of the previous meeting (Knoxville, March 1-2) were reviewed, and two major corrections were made: (1) Roger Hockney's name was missing from the Methodology subgroup list, and (2) a Compiler subgroup (with T. Haupt as leader) should have been added. Since the number of attendees was not that large (17), Roger H. proposed that there be no separate subgroup meetings during the day and all attendees agreed. Roger H. then asked the attendees to think about an alternative name for the group/benchmark suite, a topic first discussed at the March 1-2 meeting. The names considered include: PBWG, PARKBENCH, PARABEN, INTERBEN, SIMPLE, BIGBEN, and INTERPAR. Roger suggested that formal voting on the new name be conducted at the end of the meeting (before adjourning). The initial draft of the group's report was handed out and each chapter was then discussed in sequence. Roger H. began the discussion with Chapter 1: Methodology. David B. remarked that the notation "Mflop/s" rather than "Mflops" (which was proposed by Roger H.) is a good standard to adopt. Although the meaning of T(p), i.e., elapsed wall-clock time on p processors, was clear, several attendees pointed out problems with the interpretation of T(1), the elapsed wall-clock time on 1 processor. Roger H. pointed out that there can be many T(1)'s, which leads to confusion in speedup comparisons. D. Bailey suggested that T(1) should not have any parallel overhead. Charles G.
suggested that an efficiency measure based on Amdahl's law be used as a replacement for speedup. Ramesh N. suggested that a 2-processor baseline time be used. Tom H. pointed out that speedup is important for compiler performance measurement. Roger H. agreed that speedup may be important to report in this particular case. David B. felt that scaled speedup should be computed separately and Ramesh N. indicated that "super linear" speedup is mathematically incorrect. Charles G. proposed that speedup should specifically address caching effects and memory hierarchies. This discussion of speedup ended with David B. agreeing to rewrite Section 1.4.5 (Speedup, Efficiency, and Performance per Node) of the draft and address all the concerns mentioned above. Roger H. reminded the attendees that subgroup leaders are responsible for their respective chapters of the report (which is targeted for release at Supercomputing '93). Section 1.5 (Performance Database) was the final section of Chapter 1 which was discussed. Jack D. was opposed to the idea of providing any graphical display of the on-line benchmarks provided by the PDS (Performance Database Server) extension to Xnetlib. Roger H. indicated that such graphics would make the data more attractive to users. Michael B. indicated that future PDS development would incorporate a spreadsheet-based display of the benchmark data from which graphical utilities could evolve. Tony H. indicated an interest in designing a few prototype graphical tools for displaying benchmark data obtained from PDS. Michael B. also pointed out that PDS will provide SPEC Benchmarks in the future based on discussions with SPEC officials at a recent meeting in Huntsville, AL. Jack D. and Michael B. agreed to rewrite Section 1.5 and to indicate how to acquire/use PDS. Before moving on to Chapter 2 (Low-Level Benchmarks), David B. suggested that the draft include a motivation section which stresses benchmarking as a science rather than an art.
Parallels with other sciences could be drawn. David B. was willing to write this up for the report. Roger H. then led the discussion of Chapter 2 (Low-Level Benchmarks). He explained the difference between the two proposed timers TICK1 (clock resolution) and TICK2 (external wall-clock time). Roger also indicated that the UNIX timer "etime" is misleading in that it does not report elapsed wall-clock time (it reports CPU time instead) and that timer benchmarks are really necessary in order to understand the meaning of the reported times. David B. indicated that he has observed cases in which CPU time was greater than wall-clock time. Charles G. proposed that documentation should indicate that CPU time cannot be reported but also indicate potential hazards in wall-clock timing (hardware and network errors). David B. agreed to write a paragraph for the report which would address these concerns. Most attendees agreed that several runs of each benchmark should be made and Roger H. proposed that the minimum time be reported (rather than an average) for the low-level benchmarks. The consensus was unanimous on reporting the minimum time required but Bodo P. pointed out that operating systems will need to be somehow quantified for these times. Tony H. questioned whether or not optimizations should also be allowed for these particular benchmarks. A discussion of the Linpack benchmark (Section 2.1.3) and Livermore Loops (Section 2.1.4) was then initiated. Roger H. suggested that the Linpack (n=1000) benchmark be considered as a kernel benchmark. Charles G. supported the use of the Livermore Loops for measuring cache-based microprocessors and Roger H. supported their use for measuring the range of performance on a node (instability). Tony H. questioned why the group would include sequential benchmarks in a parallel benchmark suite. He suggested that the report could reference the serial benchmarks (Linpack, Livermore Loops, SPEC) but should not include them in the suite. Roger H.
then reviewed Section 2.1.5 which discusses the "N sub one-half" and "R sub infinity" performance measures. A routine RINF1 from the Genesis benchmarks could be used to determine these measures. Roger H. also proposed that memory-bottleneck benchmarks (POLY1, POLY2) be included (see Section 2.1.6) in the suite. Whereas vectors would fit in cache with POLY1, they would not fit in cache in POLY2. With regard to Arithmetic benchmarks (Section 2.1.7), Jack D. stressed that 64-bit arithmetic be used but Ed K. pointed out that 32-bit is commonly used in many applications (e.g., seismic codes). David B. proposed that the methodology should encourage 64-bit arithmetic but not exclude 32-bit in cases where it is explicitly required (and documented). Discussions were then curtailed for a short coffee break (10:20-10:45am). After the coffee break, discussions concerning Chapter 2 continued. Patrick W. questioned the type of communication (arbitrary or nearest-neighbor) that should be used for the COMM1, COMM2 benchmarks for measuring communication (Section 2.2). Charles G. asked how one could measure hidden latency. Patrick W. suggested that a protocol be defined and Roger H. responded with the proposal that "nonblocking send" and "locking receive" be used. He suggested that other variations could be used in optimizing basic routines. Roger H. asked if matrix transposition really measures bisection bandwidth. David W. indicated that it does, provided the matrix has only 1 data distribution. Tony H. suggested that the group think about alternative benchmarks for measuring bisection bandwidth. David W. suggested that MPI communication routines (broadcast, gather, scatter, etc.) be used. David W. will provide information on these routines. Tony H. questioned the need for the separate communication bottleneck benchmark (POLY3, Section 2.2.4), but Roger H. maintained that it is best to have it separated from POLY1 and the COMMS benchmarks. Roger H.
pointed out that the synchronization benchmark (SYNCH1) was missing from Table 2.3. Patrick W. pointed out that this particular benchmark will be extremely machine-dependent. Roger H. suggested that the basic "barrier" paradigm be used. This concluded the discussion of Chapter 2 on Low-Level Benchmarks. Roger H. then asked Tony H. to lead the discussion on Chapter 3 (Kernel Applications). Tony passed out his draft of the chapter (not included with the chapters originally handed out by Roger H.) and reviewed its contents with the attendees. For the matrix benchmarks (Section 3.2.1), Tony H. proposed that the kernel A=B*C be provided and that the group consider appropriate validation tests based on generated matrices or input datasets. It was also stressed that the matrices B and C start distributed and stay that way. Tony H. discussed the availability of a matrix diagonalization code (Intel i860) that could scale the computation per node. He will make the code available (from Daresbury Lab) for review purposes. Jack D. proposed that routines from SCALAPACK be used for the dense LU (with pivoting) benchmark and that an iterative solver for nonsymmetric linear systems be included. Michael B., Jack D., and Patrick W. agreed to work on an appropriate sparse linear system solver or eigensolver for the suite. Jack D. suggested that a Cholesky factorization routine was not necessary as long as QR factorization was included. He also stressed that the suite use state-of-the-art algorithms for each benchmark. All attendees agreed. The discussion then focused on what type of Fast Fourier Transform (FFT) benchmarks (Section 3.2.2) the suite should contain. Bodo P. suggested that they be structured like the Linpack benchmarks and questioned whether or not they should be ordered. David B. suggested that the 1-D FFT should be very large (order of 1.E+06) and need not be ordered.
As an alternative, David suggested that the benchmark really be a convolution problem to be solved any way desired. Patrick W. then questioned whether or not a power of 2 should be used, and David B. responded that it should be a power of 2. Ed K. suggested that a 2-D FFT is not needed if a 3-D FFT is provided. David W. suggested that there be forward/backward FFTs which are easy to validate. Charles G. and David B. agreed to work on the FFT benchmarks. For PDE benchmarks (Section 3.2.3), there was a general agreement to drop Jacobi and Gauss-Seidel from the list of candidate algorithms. Tony H. suggested that an SOR-based routine from the Genesis benchmarks be used. David M. indicated that he could provide a Finite Element Method (FEM) code but that it might be better to consider it as a compact application rather than a kernel. Bodo P. and Tony H. proposed that such a benchmark be for a 3-D problem. There was somewhat of a consensus that there be a single problem and multiple algorithms provided. The discussion on Chapter 3 concluded with consideration of other possible kernel benchmarks (Section 3.2.4). Patrick W. questioned whether or not the Embarrassingly Parallel (EP) benchmark (from NASA) should be a compact application. Roger H. suggested that there be an integer sort kernel and perhaps a Particle-In-Cell (PIC) kernel that might be commonly used in domain decomposition applications. Other suggested kernels (proposed by various attendees) included: operation counts, intrinsic operations, out-of-core solvers, check-pointing. Tony H. indicated that he could obtain an I/O benchmark from Daresbury Lab. David B. pointed out that timing events such as loading are more appropriate for compact applications than for kernel or low-level benchmarks. Roger H. then asked David W. to lead the discussion of the final chapter of the current report (Chapter 4, Compact Applications). David B. asked if various data layouts should be allowed. David W.
proposed that there should be both HPF and message-passing versions of the benchmarks and questioned if time should be measured from start to finish. Tony H. proposed that the QCD code from the Perfect Benchmarks be replaced with GAUGE (available in HPF and message-passing). Tony H. will acquire GAUGE for the Netlib database (currently listed as the "pbwg" library). David B. suggested that an N-body code be provided and Bodo P. questioned if the gravity benchmark (Section 4.2.3) is really a kernel rather than a compact application. Patrick W. indicated that he could acquire a shallow water code that is public-domain (parallelized NCAR code). David W. suggested that other molecular dynamics codes be sought since those in the Perfect Benchmarks have problem sizes that are too small. For a potential geophysics benchmark, Michael B. agreed to check on the use of the ARCO benchmark for the parallel suite. For other potential compact applications (Section 4.2.7), Bodo P. questioned the availability of DYNA2D (restricted distribution) and David M. agreed to check on the availability of a FEM code. Michael B. commented that good candidates would be those having multiple instances (HPF, message-passing, etc.). David W. suggested that the group hold off on investigating commercial codes till the project matures. David B. suggested that a reservoir code be included. Other suggestions included: CHARMM, AMBER, GAMESS, GAUSSIAN90 (all molecular dynamics codes). Patrick W. suggested that a signal processing application be added and Tony H. proposed that the applications focus on the "Grand Challenge" research areas. Charles G. asked how the performance of the compact applications would be verified. Tony H. indicated that the RAPS project typically generates lots of numbers. How many problems to run was another question raised. Bodo P. indicated that the SPEC folks use the geometric mean of several runs. Roger H.
pointed out that there should be several numbers reported which can illustrate performance variation on varying numbers of processors. The discussion of Chapter 4 concluded and Roger H. asked Tom H. to briefly report on the compiler subgroup activities. Tom H. suggested that the compiler benchmarks should address how compilers handle data distribution. Related issues include the use of dynamic memory, communication, and runtime libraries. He indicated that low-level compiler benchmarks (synthetic) be added and that there should be a comparison with hand-coded optimizations. Having completed formal discussions of the current report, Roger H. then called for a vote on the new name of the group. By a 10-7 margin, the attendees voted to change PBWG to PARKBENCH (PARallel Kernels and BENCHmarks). Jack D. asked if the group preferred a full-day or two half days for the next scheduled PARKBENCH meeting in Knoxville. The majority of attendees preferred the single day format, and so the next PARKBENCH meeting is scheduled to be in Knoxville on August 23. Roger H. asked that the minutes be posted (to comp.parallel Internet newsgroup and pbwg-comm@cs.utk.edu). Joanne M. indicated that she would have information on the Birds-of-a-Feather (BOF) session for PARKBENCH at Supercomputing '93 at the August meeting. Michael B. briefly reviewed the status of the SPEC/Perfect merger and passed out minutes of that meeting (Huntsville, May 1-13). Roger H. then adjourned the official (third) meeting of the PARKBENCH group at 2:50pm EDT. A demo of the PDS tool supported by UT/ORNL was given by Brian L. to a few of the attendees till approximately 3:15pm EDT.
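The timing conventions raised in the Chapter 1 and Chapter 2 discussions (T(p) as elapsed wall-clock time on p processors, reporting the minimum over several runs, and speedup relative to T(1)) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not code from the draft report: the function names are hypothetical, the Amdahl formula is the textbook form (the minutes do not say which efficiency measure Charles G. had in mind), and the choice of which T(1) to use is left to the caller, since that was exactly the point of disagreement.

```python
import time


def min_wall_time(fn, runs=5):
    """Minimum elapsed wall-clock time of fn() over several runs.

    Mirrors the proposal to report the minimum (not the average)
    for low-level benchmarks. Hypothetical helper name.
    """
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()  # wall-clock style timer, not CPU time
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(t1, tp):
    """Speedup T(1)/T(p); the caller must say which T(1) was used."""
    return t1 / tp


def efficiency(t1, tp, p):
    """Speedup divided by the number of processors p."""
    return speedup(t1, tp) / p


def amdahl_speedup(serial_fraction, p):
    """Textbook Amdahl's-law bound on speedup for p processors."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)


if __name__ == "__main__":
    t = min_wall_time(lambda: sum(range(100_000)), runs=3)
    print(f"minimum wall-clock time over 3 runs: {t:.6f} s")
    print("speedup for T(1)=8 s, T(4)=2.5 s:", speedup(8.0, 2.5))
    print("Amdahl bound, 10% serial, 4 processors:", amdahl_speedup(0.1, 4))
```

Keeping T(1) as an explicit argument, rather than measuring it inside the helper, makes it harder to compare speedups that were computed against different single-processor baselines without noticing.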
From owner-pbwg-comm@CS.UTK.EDU Fri May 28 21:01:01 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA22536; Fri, 28 May 93 21:01:01 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA20147; Fri, 28 May 93 21:00:58 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 28 May 1993 21:00:57 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from BERRY.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA20141; Fri, 28 May 93 21:00:55 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (5.61++/2.7c-UTK) id AA17854; Fri, 28 May 93 21:00:54 -0400 Message-Id: <9305290100.AA17854@berry.cs.utk.edu> To: pbwg-comm@cs.utk.edu Subject: Revised Minutes of last Meeting Date: Fri, 28 May 1993 21:00:54 -0400 From: "Michael W. Berry" Minutes of the PARKBENCH (Formerly PBWG) Workshop ------------------------------------------------- (PBWG= Parallel Benchmark Working Group) Place: Science Alliance Conference Room South College University of Tennessee Knoxville, TN Host: Jack Dongarra ORNL/Univ. of Tennessee Date: May 24, 1993 Attendees/Affiliations: ---------------------- David Bailey, NASA Michael Berry, Univ. of Tennessee Jack Dongarra, Univ. of Tennessee / ORNL Charles Grassl Cray Research Tom Haupt, Syracuse Univ. Tony Hey, Southampton Univ. Roger Hockney, Southampton Univ. Ed Kushner, Intel SSD Brian LaRose, Univ. of Tennessee David Mackay, Intel SSD Joanne Martin, IBM Ramesh Natarajan, IBM, Yorktown Heights Robert Pennington, Pittsburgh Supercomputing Center Bodo Parady, Sun Microsystems David Walker, ORNL Patrick Worley, ORNL Agenda: May 24, 1993 -------------------- At 8:36 am EDT, Roger Hockney gave opening remarks and welcomed all participants to the workshop. Each participant introduced him or herself by affiliation and interests. 
The minutes of the previous meeting (Knoxville, March 1-2) were reviewed with two major corrections made to the minutes: (1) Roger Hockney's name was missing from the Methodology subgroup list, and (2) a Compiler subgroup (with T. Haupt as leader) should have been added. Since the number of attendees was not that large (17), Roger H. proposed that there be no separate subgroup meetings during the day and all attendees agreed. Roger H. then asked the attendees to think about an alternative name for the group/benchmark suite which was first discussed at the March 1-2 meeting. The names considered include: PBWG, PARKBENCH, PARABEN, INTERBEN, SIMPLE, BIGBEN, and INTERPAR. Roger suggested that formal voting on the new name be conducted at the end of the meeting (before adjourning). The initial draft of the group's report was handed out and each chapter was then discussed in sequence. Roger H. began the discussion with Chapter 1: Methodology. David B. remarked that the notation "Mflop/s" rather than "Mflops" (which was proposed by Roger H.) is a good standard to adopt. Although the meaning of T(p), i.e., elapsed wall-clock time on p processors, was clear, several attendees pointed out problems with the interpretation of T(1), the elapsed wall-clock time on 1 processor. Roger H. pointed out that there can be many T(1)'s, which leads to confusion in speedup comparisons. D. Bailey suggested that T(1) should not have any parallel overhead. Charles G. suggested that an efficiency measure based on Amdahl's law be used as a replacement for speedup. Ramesh N. suggested that a 2-processor baseline time be used. Tom H. pointed out that speedup is important for compiler performance measurement. Roger H. agreed that speedup may be important to report in this particular case. Ramesh N. felt that scaled speedup should be computed separately and that the widespread connotation of the term "superlinear" speedup is mathematically incorrect. Charles G.
proposed that speedup should specifically address caching effects and memory hierarchies. This discussion of speedup ended with David B. agreeing to rewrite Section 1.4.5 (Speedup, Efficiency, and Performance per Node) of the draft and address all the concerns mentioned above. Roger H. reminded the attendees that subgroup leaders are responsible for their respective chapters of the report (which is targeted for release at Supercomputing '93). Section 1.5 (Performance Database) was the final section of Chapter 1 which was discussed. Jack D. was opposed to the idea of providing any graphical display of the on-line benchmarks provided by the PDS (Performance Database Server) extension to Xnetlib. Roger H. indicated that such graphics would make the data more attractive to users. Michael B. indicated that future PDS development would incorporate a spreadsheet-based display of the benchmark data from which graphical utilities could evolve. Tony H. indicated an interest in designing a few prototype graphical tools for displaying benchmark data obtained from PDS. Michael B. also pointed out that PDS will provide SPEC Benchmarks in the future based on discussions with SPEC officials at a recent meeting in Huntsville, AL. Jack D. and Michael B. agreed to rewrite Section 1.5 and to indicate how to acquire/use PDS. Before moving on to Chapter 2 (Low-Level Benchmarks), David B. suggested that the draft include a motivation section which stresses benchmarking as a science rather than an art. Parallels with other sciences could be drawn. David B. was willing to write this up for the report. Roger H. then led the discussion of Chapter 2 (Low-Level Benchmarks). He explained the difference between the two proposed timers TICK1 (clock resolution) and TICK2 (external wall-clock time).
Roger also indicated that the UNIX timer "etime" is misleading in that it does not report elapsed wall-clock time (it reports CPU time instead) and that timer benchmarks are really necessary in order to understand the meaning of the reported times. David B. indicated that he has observed cases in which CPU time was greater than wall-clock time. Charles G. proposed that documentation should indicate that CPU time cannot be reported but also indicate potential hazards in wall-clock timing (hardware and network errors). David B. agreed to write a paragraph for the report which would address these concerns. Most attendees agreed that several runs of each benchmark should be made and Roger H. proposed that the minimum time be reported (rather than an average) for the low-level benchmarks. The consensus was unanimous on reporting the minimum time required but Bodo P. pointed out that operating systems will need to be somehow quantified for these times. Tony H. questioned whether or not optimizations should also be allowed for these particular benchmarks. A discussion of the Linpack benchmark (Section 2.1.3) and Livermore Loops (Section 2.1.4) was then initiated. Roger H. suggested that the Linpack (n=1000) benchmark be considered as a kernel benchmark. Charles G. did not support the use of the Livermore Loops for measuring cache-based microprocessors, while Roger H. supported their use for measuring the range of performance on a node (instability). Tony H. questioned why the group would include sequential benchmarks in a parallel benchmark suite. He suggested that the report could reference the serial benchmarks (Linpack, Livermore Loops, SPEC) but should not include them in the suite. Roger H. then reviewed Section 2.1.5 which discusses the "N sub one-half" and "R sub infinity" performance measures. A routine RINF1 from the Genesis benchmarks could be used to determine these measures. Roger H.
also proposed that memory-bottleneck benchmarks (POLY1, POLY2) be included (see Section 2.1.6) in the suite. Whereas vectors would fit in cache with POLY1, they would not fit in cache in POLY2. With regard to Arithmetic benchmarks (Section 2.1.7), Jack D. stressed that 64-bit arithmetic be used but Ed K. pointed out that 32-bit is commonly used in many applications (e.g., seismic codes). David B. proposed that the methodology should encourage 64-bit arithmetic but not exclude 32-bit in cases where it is explicitly required (and documented). Discussions were then curtailed for a short coffee break (10:20-10:45am). After the coffee break, discussions concerning Chapter 2 continued. Patrick W. questioned the type of communication (arbitrary or nearest-neighbor) that should be used for the COMM1, COMM2 benchmarks for measuring communication (Section 2.2). Charles G. asked whether one could measure hidden latency. Patrick W. suggested that a protocol be defined and Roger H. responded with the proposal that "nonblocking send" and "locking receive" be used. He suggested that other variations could be used in optimizing basic routines. Roger H. asked if matrix transposition really measures bisection bandwidth. David W. indicated that it does, provided the matrix has only 1 data distribution. Tony H. suggested that the group think about alternative benchmarks for measuring bisection bandwidth. David W. suggested that MPI communication routines (broadcast, gather, scatter, etc.) be used. David W. will provide information on these routines. Tony H. questioned the need for the separate communication bottleneck benchmark (POLY3, Section 2.2.4), but Roger H. maintained that it is best to have it separated from POLY1 and the COMMS benchmarks. Roger H. pointed out that the synchronization benchmark (SYNCH1) was missing from Table 2.3. Patrick W. pointed out that this particular benchmark will be extremely machine-dependent. Roger H.
suggested that the basic "barrier" paradigm be used. This concluded the discussion of Chapter 2 on Low-Level Benchmarks. Roger H. then asked Tony H. to lead the discussion on Chapter 3 (Kernel Applications). Tony passed out his draft of the chapter (not included with the chapters originally handed out by Roger H.) and reviewed its contents with the attendees. For the matrix benchmarks (Section 3.2.1), Tony H. proposed that the kernel A=B*C be provided and that the group consider appropriate validation tests based on generated matrices or input datasets. It was also stressed that the matrices B and C start distributed and stay that way. Tony H. discussed the availability of a matrix diagonalization code (Intel i860) that could scale the computation per node. He will make the code available (from Daresbury Lab) for review purposes. Jack D. proposed that routines from SCALAPACK be used for the dense LU (with pivoting) benchmark and that an iterative solver for nonsymmetric linear systems be included. Michael B., Jack D., and Patrick W. agreed to work on an appropriate sparse linear system solver or eigensolver for the suite. Jack D. suggested that a Cholesky factorization routine was not necessary as long as QR factorization was included. He also stressed that the suite use state-of-the-art algorithms for each benchmark. All attendees agreed. The discussion then focused on what type of Fast Fourier Transform (FFT) benchmarks (Section 3.2.2) the suite should contain. Bodo P. suggested that they be structured like the Linpack benchmarks and questioned whether or not they should be ordered. David B. suggested that the 1-D FFT should be very large (order of 1.E+06) and need not be ordered. As an alternative, David suggested that the benchmark really be a convolution problem to be solved any way desired. Patrick W. then questioned whether or not a power of 2 should be used, and David B. responded that it should be a power of 2. Ed K.
suggested that a 2-D FFT is not needed if a 3-D FFT is provided. David W. suggested that there be forward/backward FFTs which are easy to validate. Charles G. and David B. agreed to work on the FFT benchmarks. For PDE benchmarks (Section 3.2.3), there was a general agreement to drop Jacobi and Gauss-Seidel from the list of candidate algorithms. Tony H. suggested that an SOR-based routine from the Genesis benchmarks be used. David M. indicated that he could provide a Finite Element Method (FEM) code but that it might be better to consider it as a compact application rather than a kernel. Bodo P. and Tony H. proposed that such a benchmark be for a 3-D problem. There was somewhat of a consensus that there be a single problem and multiple algorithms provided. The discussion on Chapter 3 concluded with consideration of other possible kernel benchmarks (Section 3.2.4). Patrick W. questioned whether or not the Embarrassingly Parallel (EP) benchmark (from NASA) should be a compact application. Roger H. suggested that there be an integer sort kernel and perhaps a Particle-In-Cell (PIC) kernel that might be commonly used in domain decomposition applications. Other suggested kernels (proposed by various attendees) included: operation counts, intrinsic operations, out-of-core solvers, check-pointing. Tony H. indicated that he could obtain an I/O benchmark from Daresbury Lab. David B. pointed out that timing events such as loading are more appropriate for compact applications than for kernel or low-level benchmarks. Roger H. then asked David W. to lead the discussion of the final chapter of the current report (Chapter 4, Compact Applications). David B. asked if various data layouts should be allowed. David W. proposed that there should be both HPF and message-passing versions of the benchmarks and questioned if time should be measured from start to finish. Tony H.
proposed that the QCD code from the Perfect Benchmarks be replaced with GAUGE (available in HPF and message-passing). Tony H. will acquire GAUGE for the Netlib database (currently listed as the "pbwg" library). David B. suggested that an N-body code be provided and Bodo P. questioned if the gravity benchmark (Section 4.2.3) is really a kernel rather than a compact application. Patrick W. indicated that he could acquire a shallow water code that is public-domain (parallelized NCAR code). David W. suggested that other molecular dynamics codes be sought since those in the Perfect Benchmarks have problem sizes that are too small. For a potential geophysics benchmark, Michael B. agreed to check on the use of the ARCO benchmark for the parallel suite. For other potential compact applications (Section 4.2.7), Bodo P. objected to the use of DYNA3D (originally suggested by Joanne M.), and David M. agreed to check on the availability of a FEM code. Michael B. commented that good candidates would be those having multiple instances (HPF, message-passing, etc.). David W. suggested that the group hold off on investigating commercial codes till the project matures. David B. suggested that a reservoir code be included. Other suggestions included: CHARMM, AMBER, GAMESS, GAUSSIAN90 (all molecular dynamics codes). Patrick W. suggested that a signal processing application be added and Tony H. proposed that the applications focus on the "Grand Challenge" research areas. Charles G. asked how the performance of the compact applications would be verified. Tony H. indicated that the RAPS project typically generates lots of numbers. How many problems to run was another question raised. Bodo P. indicated that the SPEC folks use the geometric mean of several runs. Roger H. pointed out that there should be several numbers reported which can illustrate performance variation on varying numbers of processors. The discussion of Chapter 4 concluded and Roger H. asked Tom H.
to briefly report on the compiler subgroup activities. Tom H. suggested that the compiler benchmarks should address how compilers handle data distribution. Related issues include the use of dynamic memory, communication, and runtime libraries. He indicated that low-level compiler benchmarks (synthetic) be added and that there should be a comparison with hand-coded optimizations. Having completed formal discussions of the current report, Roger H. then called for a vote on the new name of the group. By a 10-7 margin, the attendees voted to change PBWG to PARKBENCH (PARallel Kernels and BENCHmarks). Jack D. asked if the group preferred a full-day or two half days for the next scheduled PARKBENCH meeting in Knoxville. The majority of attendees preferred the single day format, and so the next PARKBENCH meeting is scheduled to be in Knoxville on August 23. Roger H. asked that the minutes be posted (to comp.parallel Internet newsgroup and pbwg-comm@cs.utk.edu). Joanne M. indicated that she would have information on the Birds-of-a-Feather (BOF) session for PARKBENCH at Supercomputing '93 at the August meeting. Michael B. briefly reviewed the status of the SPEC/Perfect merger and passed out minutes of that meeting (Huntsville, May 1-13). Roger H. then adjourned the official (third) meeting of the PARKBENCH group at 2:50pm EDT. A demo of the PDS tool supported by UT/ORNL was given by Brian L. to a few of the attendees till approximately 3:15pm EDT. End of Minutes for May 24, 1993 (M.
Berry) --- From owner-pbwg-comm@CS.UTK.EDU Tue Jun 1 17:58:44 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA05508; Tue, 1 Jun 93 17:58:44 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA26587; Tue, 1 Jun 93 17:58:22 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 1 Jun 1993 17:58:21 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from BERRY.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA26581; Tue, 1 Jun 93 17:58:20 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (5.61++/2.7c-UTK) id AA03189; Tue, 1 Jun 93 17:58:19 -0400 Message-Id: <9306012158.AA03189@berry.cs.utk.edu> To: pbwg-comm@cs.utk.edu Subject: Revised Minutes Date: Tue, 01 Jun 1993 17:58:18 -0400 From: "Michael W. Berry" This should be the last revision - please let me know if I have missed any other changes. -Mike -----------------------------START MINUTES---------------------------- Minutes of the 3rd PARKBENCH (Formerly PBWG) Workshop ----------------------------------------------------- (PBWG= Parallel Benchmark Working Group) Place: Science Alliance Conference Room South College University of Tennessee Knoxville, TN Host: Jack Dongarra ORNL/Univ. of Tennessee Date: May 24, 1993 Attendees/Affiliations: ---------------------- David Bailey, NASA Michael Berry, Univ. of Tennessee Jack Dongarra, Univ. of Tennessee / ORNL Charles Grassl Cray Research Tom Haupt, Syracuse Univ. Tony Hey, Southampton Univ. Roger Hockney, Southampton Univ. Ed Kushner, Intel SSD Brian LaRose, Univ. of Tennessee David Mackay, Intel SSD Joanne Martin, IBM Ramesh Natarajan, IBM, Yorktown Heights Robert Pennington, Pittsburgh Supercomputing Center Bodo Parady, Sun Microsystems David Walker, ORNL Patrick Worley, ORNL Agenda: May 24, 1993 -------------------- At 8:36 am EDT, Roger Hockney gave opening remarks and welcomed all participants to the workshop. 
Each participant introduced himself or herself, giving affiliation and interests. The minutes of the previous meeting (Knoxville, March 1-2) were reviewed with two major corrections made to the minutes: (1) Roger Hockney's name was missing from the Methodology subgroup list, and (2) a Compiler subgroup (with T. Haupt as leader) should have been added. Since the number of attendees was not that large (17), Roger H. proposed that there be no separate subgroup meetings during the day and all attendees agreed. Roger H. then asked the attendees to think about an alternative name for the group/benchmark suite, a matter first discussed at the March 1-2 meeting. The names considered include: PBWG, PARKBENCH, PARABEN, INTERBEN, SIMPLE, BIGBEN, and INTERPAR. Roger suggested that formal voting on the new name be conducted at the end of the meeting (before adjourning). The initial draft of the group's report was handed out and each chapter was then discussed in sequence. Roger H. began the discussion with Chapter 1: Methodology. David B. remarked that the notation "Mflop/s" rather than "Mflops" (which was proposed by Roger H.) is a good standard to adopt. Although the meaning of T(p), i.e., elapsed wall-clock time on p processors, was clear, several attendees pointed out problems with the interpretation of T(1), the elapsed wall-clock time on 1 processor. Roger H. pointed out that there can be many T(1)'s, which leads to confusion in speedup comparisons. D. Bailey suggested that T(1) should not have any parallel overhead. Charles G. suggested that an efficiency measure based on Amdahl's law be used as a replacement for speedup. Ramesh N. suggested that a 2-processor baseline time be used. Tom H. pointed out that speedup is important for compiler performance measurement. Roger H. agreed that speedup may be important to report in this particular case. Ramesh N. 
felt that scaled speedup should be computed separately and that the widespread connotation of the term "superlinear" speedup is mathematically incorrect. Charles G. proposed that speedup should specifically address caching effects and memory hierarchies. This discussion of speedup ended with David B. agreeing to rewrite Section 1.4.5 (Speedup, Efficiency, and Performance per Node) of the draft and address all the concerns mentioned above. Roger H. reminded the attendees that subgroup leaders are responsible for their respective chapters of the report (which is targeted for release at Supercomputing '93). Section 1.5 (Performance Database) was the final section of Chapter 1 to be discussed. Jack D. was opposed to the idea of providing any graphical display of the on-line benchmarks provided by the PDS (Performance Database Server) extension to Xnetlib. Roger H. indicated that such graphics would make the data more attractive to users. Michael B. indicated that future PDS development would incorporate a spreadsheet-based display of the benchmark data from which graphical utilities could evolve. Tony H. indicated an interest in designing a few prototype graphical tools for displaying benchmark data obtained from PDS. Michael B. also pointed out that PDS will provide SPEC Benchmarks in the future, based on discussions with SPEC officials at a recent meeting in Huntsville, AL. Jack D. and Michael B. agreed to rewrite Section 1.5 and to indicate how to acquire/use PDS. Before moving on to Chapter 2 (Low-Level Benchmarks), David B. suggested that the draft include a motivation section which stresses benchmarking as a science rather than an art. Parallels with other sciences could be drawn. David B. was willing to write this up for the report. Roger H. then led the discussion of Chapter 2 (Low-Level Benchmarks). He explained the difference between the two proposed timers TICK1 (clock resolution) and TICK2 (external wall-clock time). 
Roger also indicated that the UNIX timer "etime" is misleading in that it does not report elapsed wall-clock time (it reports CPU time instead) and that timer benchmarks are really necessary in order to understand the meaning of the reported times. David B. indicated that he has observed cases in which CPU time was greater than wall-clock time. Charles G. proposed that documentation should indicate that CPU time cannot be reported but also indicate potential hazards in wall-clock timing (hardware and network errors). David B. agreed to write a paragraph for the report which would address these concerns. Most attendees agreed that several runs of each benchmark should be made and Roger H. proposed that the minimum time be reported (rather than an average) for the low-level benchmarks. The consensus was unanimous on reporting the minimum time required but Bodo P. pointed out that operating systems will need to be somehow quantified for these times. Tony H. questioned whether or not optimizations should also be allowed for these particular benchmarks. A discussion of the Linpack benchmark (Section 2.1.3) and Livermore Loops (Section 2.1.4) was then initiated. Roger H. suggested that the Linpack (n=1000) benchmark be considered as a kernel benchmark. Charles G. did not support the use of the Livermore Loops for measuring cache-based microprocessors, while Roger H. supported their use for measuring the range of performance on a node (instability). Tony H. questioned why the group would include sequential benchmarks in a parallel benchmark suite. He suggested that the report could reference the serial benchmarks (Linpack, Livermore Loops, SPEC) but should not include them in the suite. Roger H. then reviewed Section 2.1.5 which discusses the "N sub one-half" and "R sub infinity" performance measures. A routine RINF1 from the Genesis benchmarks could be used to determine these measures. Roger H. 
also proposed that memory-bottleneck benchmarks (POLY1, POLY2) be included (see Section 2.1.6) in the suite. Whereas vectors would fit in cache with POLY1, they would not fit in cache in POLY2. With regard to Arithmetic benchmarks (Section 2.1.7), Jack D. stressed that 64-bit arithmetic be used but Ed K. pointed out that 32-bit is commonly used in many applications (e.g., seismic codes). David B. proposed that the methodology should encourage 64-bit arithmetic but not exclude 32-bit in cases where it is explicitly required (and documented). Discussions were then curtailed for a short coffee break (10:20-10:45am). After the coffee break, discussions concerning Chapter 2 continued. Patrick W. questioned the type of communication (arbitrary or nearest-neighbor) that should be used for the COMM1, COMM2 benchmarks for measuring communication (Section 2.2). Charles G. enquired whether one could measure hidden latency. Patrick W. suggested that a protocol be defined and Roger H. responded with the proposal that "nonblocking send" and "blocking receive" be used. He suggested that other variations could be used in optimizing basic routines. Roger H. asked if matrix transposition really measures bisection bandwidth. David W. indicated that it does, provided the matrix has only 1 data distribution. Tony H. suggested that the group think about alternative benchmarks for measuring bisection bandwidth. David W. suggested that MPI communication routines (broadcast, gather, scatter, etc.) be used. David W. will provide information on these routines. Tony H. questioned the need for the separate communication bottleneck benchmark (POLY3, Section 2.2.4), but Roger H. maintained that it is best to have it separated from POLY1 and the COMMS benchmarks. Roger H. pointed out that the synchronization benchmark (SYNCH1) was missing from Table 2.3. Patrick W. pointed out that this particular benchmark will be extremely machine-dependent. Roger H. 
suggested that the basic "barrier" paradigm be used. This concluded the discussion of Chapter 2 on Low-Level Benchmarks. Roger H. then asked Tony H. to lead the discussion on Chapter 3 (Kernel Applications). Tony passed out his draft of the chapter (not included with the chapters originally handed out by Roger H.) and reviewed its contents with the attendees. For the matrix benchmarks (Section 3.2.1), Tony H. proposed that the kernel A=B*C be provided and that the group consider appropriate validation tests based on generated matrices or input datasets. It was also stressed that the matrices B and C start distributed and stay that way. Tony H. discussed the availability of a matrix diagonalization code (Intel i860) that could scale the computation per node. He will make the code available (from Daresbury Lab) for review purposes. Jack D. proposed that routines from SCALAPACK be used for the dense LU (with pivoting) benchmark and that an iterative solver for nonsymmetric linear systems be included. Michael B., Jack D., and Patrick W. agreed to work on an appropriate sparse linear system solver or eigensolver for the suite. Jack D. suggested that a Cholesky factorization routine was not necessary as long as QR factorization was included. He also stressed that the suite use state-of-the-art algorithms for each benchmark. All attendees agreed. The discussion then focused on what type of Fast Fourier Transform (FFT) benchmarks (Section 3.2.2) the suite should contain. Bodo P. suggested that they be structured like the Linpack benchmarks and questioned whether or not they should be ordered. David B. suggested that the 1-D FFT should be very large (order of 1.E+06) and need not be ordered. As an alternative, David suggested that the benchmark really be a convolution problem to be solved any way desired. Patrick W. then questioned whether or not a power of 2 should be used, and David B. responded that it should be a power of 2. Ed K. 
suggested that a 2-D FFT is not needed if a 3-D FFT is provided. David W. suggested that there be forward/backward FFTs which are easy to validate. Charles G. and David B. agreed to work on the FFT benchmarks. For PDE benchmarks (Section 3.2.3), there was a general agreement to drop Jacobi and Gauss-Seidel from the list of candidate algorithms. Tony H. suggested that an SOR-based routine from the Genesis benchmarks be used. David M. indicated that he could provide a Finite Element Method (FEM) code but that it might be better to consider it as a compact application rather than a kernel. Bodo P. and Tony H. proposed that such a benchmark be for a 3-D problem. There was somewhat of a consensus that there be a single problem and multiple algorithms provided. The discussion on Chapter 3 concluded with consideration of other possible kernel benchmarks (Section 3.2.4). Patrick W. questioned whether or not the Embarrassingly Parallel (EP) benchmark (from NASA) should be a compact application. Roger H. suggested that there be an integer sort kernel and perhaps a Particle-In-Cell (PIC) kernel that might be commonly used in domain decomposition applications. Other suggested kernels (proposed by various attendees) included: operation counts, intrinsic operations, out-of-core solvers, check-pointing. Tony H. indicated that he could obtain an I/O benchmark from Daresbury Lab. David B. pointed out that timing events such as loading are more appropriate for compact applications than for kernel or low-level benchmarks. Roger H. then asked David W. to lead the discussion of the final chapter of the current report (Chapter 4, Compact Applications). David B. asked if various data layouts should be allowed. David W. proposed that there should be both HPF and message-passing versions of the benchmarks and questioned if time should be measured from start to finish. Tony H. 
proposed that the QCD code from the Perfect Benchmarks be replaced with GAUGE (available in HPF and message-passing). Tony H. will acquire GAUGE for the Netlib database (currently listed as the "pbwg" library). David B. suggested that an N-body code be provided and Bodo P. questioned if the gravity benchmark (Section 4.2.3) is really a kernel rather than a compact application. Patrick W. indicated that he could acquire a shallow water code that is public-domain (parallelized NCAR code). David W. suggested that other molecular dynamics codes be sought since those in the Perfect Benchmarks have problem sizes that are too small. For a potential geophysics benchmark, Michael B. agreed to check on the use of the ARCO benchmark for the parallel suite. For other potential compact applications (Section 4.2.7), Bodo P. objected to the use of DYNA3D (originally suggested by Joanne M.), and David M. agreed to check on the availability of a FEM code. Michael B. commented that good candidates would be those having multiple instances (HPF, message-passing, etc.). David W. suggested that the group hold off on investigating commercial codes until the project matures. David B. suggested that a reservoir code be included. Other suggestions included: CHARMM, AMBER, GAMES, GAUSSIAN90 (all molecular dynamics codes). Patrick W. suggested that a signal processing application be added and Tony H. proposed that the applications focus on the "Grand Challenge" research areas. Charles G. asked how the performance of the compact applications would be verified. Tony H. indicated that the RAPS project typically generates lots of numbers. How many problems to run was another question raised. Bodo P. indicated that the SPEC folks use the geometric mean of several runs. Roger H. pointed out that there should be sufficient numbers reported to be able to show the performance variation with the numbers of processors. 
It is particularly important to detect Amdahl saturation, and any peak in performance which is followed by a steady decrease. Between 5 and 10 points for each problem size, roughly equally spaced logarithmically, would usually be necessary to do this (e.g. 1, 2, 4, 8, 16, 32, 64 and 128, on a 128-node system). Such detailed measurements have been made without difficulty on the LPM1 benchmark, and give a clear picture of performance variation with both problem size and number of processors. The discussion of Chapter 4 concluded and Roger H. asked Tom H. to briefly report on the compiler subgroup activities. Tom H. suggested that the compiler benchmarks should address how compilers handle data distribution. Related issues include the use of dynamic memory, communication, and runtime libraries. He indicated that low-level compiler benchmarks (synthetic) should be added and that there should be a comparison with hand-coded optimizations. Having completed formal discussions of the current report, Roger H. then called for a vote on the new name of the group. By a 10-7 margin, the attendees voted to change PBWG to PARKBENCH (PARallel Kernels and BENCHmarks). Jack D. asked if the group preferred one full day or two half days for the next scheduled PARKBENCH meeting in Knoxville. The majority of attendees preferred the single-day format, and so the next PARKBENCH meeting is scheduled to be in Knoxville on August 23. Roger H. asked that the minutes be posted (to the comp.parallel Internet newsgroup and pbwg-comm@cs.utk.edu). Joanne M. indicated that she would have information on the Birds-of-a-Feather (BOF) session for PARKBENCH at Supercomputing '93 at the August meeting. Michael B. briefly reviewed the status of the SPEC/Perfect merger and passed out minutes of that meeting (Huntsville, May 1-13). Roger H. then adjourned the official (third) meeting of the PARKBENCH group at 2:50pm EDT. A demo of the PDS tool supported by UT/ORNL was given by Brian L. 
to a few of the attendees until approximately 3:15pm EDT. End of Minutes for May 24, 1993 (M. Berry) From owner-pbwg-comm@CS.UTK.EDU Wed Jun 2 14:21:29 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA11051; Wed, 2 Jun 93 14:21:29 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA15612; Wed, 2 Jun 93 14:21:24 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 2 Jun 1993 14:21:19 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA15592; Wed, 2 Jun 93 14:21:13 -0400 Via: uk.ac.southampton.ecs; Wed, 2 Jun 1993 18:00:21 +0100 From: R.Hockney@parallel-applications-centre.southampton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Wed, 2 Jun 93 17:52:29 BST Date: Wed, 2 Jun 93 16:59:55 GMT Message-Id: <18087.9306021659@calvados.pac.soton.ac.uk> To: pbwg-comm@cs.utk.edu Subject: Revised SPEEDUP section Because the definition of Speedup is of such general concern to all members, not just those in the Methodology subcommittee, I am sending this proposed amendment to all committee members: *********************** \subsection{Speedup and Efficiency} Speedup is a popular metric that has been used for many years to compare the performance of parallel computers. However, its definition is open to ambiguity and misuse because it always begs the question "Speedup over what?"; a question that is often not clearly answered in publications using this metric. Whilst preferring the use of absolute measures of performance, such as Benchmark Performance defined earlier, the PARKBENCH committee accepted that speedup would probably continue to be used, and that the best policy was to sharpen up its definition. Speedup is universally defined as \begin{equation} \frac{T_1}{T_p} \end{equation} where $T_p$ is the $p$-processor time to perform some benchmark, and $T_1$ is called the one-processor time. 
There is no doubt about the meaning of $T_p$, because this is the measured time $T(N;p)$ to perform the benchmark. There is, however, usually considerable discussion over the meaning of $T_1$: whether it is the time for the parallel code running on one processor, which probably contains unnecessary parallel overheads, or whether it is the best serial code (probably quite a different algorithm) running on one processor. The latter choice sounds much more realistic, but would require a program of research to determine what was the best serial algorithm, and the rescaling of all previously computed Speedup values every time a better serial algorithm was discovered. An additional problem is that even if we decide what $T_1$ should be, there may not be enough memory on a single node to store the whole data for a large problem suitable for using a large MPP. It may not therefore be possible to measure $T_1$ on an MPP, however we define it. The purpose of benchmarking is to compare the performance of different computers, on the basis that the best performance corresponds to the least wall-clock execution time. In order to use Speedup for this purpose, it does not matter how $T_1$ is defined, or what its value is. It only matters that the same value of $T_1$ is used to calculate all Speedup values used in the comparison. Looked at in this way, $T_1$ is just a single reference time which is defined for each benchmark, and to which all parallel execution times are compared. The answer to the question "over what?" is then "over $T_1$", and it is clear then why the same $T_1$ must be used for all comparisons of different computers on the same benchmark. If we do not use the same $T_1$ for all comparisons, then we are using different units to measure the performance on the different computers. This makes as much sense as comparing the numerical value of the maximum speeds of three cars, when one is measured in m.p.h., the second in feet per second and the third in m/s. 
The SPEC committee uses the above procedure in the definition of their SPEC ratio, which is defined as the Speedup over a reference time obtained by running the defining serial code on a VAX11/780. The problem is that periodically it seems necessary to update these reference times to a currently available computer, or to keep a VAX11/780 going in a special museum (I suppose it would be the Smithsonian or NPL) in a similar way as the standard yard or metre are carefully maintained. Nevertheless there is no doubt that keeping a constant value of $T_1$, however it is defined, for each benchmark is the only way of making Speedup an acceptable metric for measuring and comparing computer performance. Defining $T_1$ as a reference time unrelated to the parallel computer being benchmarked unfortunately has the consequence that certain properties that many people regard as essential to the idea of Speedup are lost: \begin{enumerate} \item It is no longer necessarily true that the Speedup of the parallel code on one processor is unity. It may be, but only by chance. \item It is no longer true that the maximum Speedup using $p$ processors is $p$. \item Because of the last item, Efficiency=Speedup/$p$ is no longer a meaningful measure of processor utilisation. \end{enumerate} Thus it appears that if we sharpen up the definition of Speedup to make it an acceptable metric for comparing the performance of different computers, we have to throw away the main properties which have made the concept of Speedup useful in the past. There is a choice: keep Speedup with its traditional properties, and accept that it has no place as a metric for comparing computer performance (i.e. in benchmarking), or define Speedup in a way that can be used in benchmarking, and lose the traditional properties. There is no middle way, or possible compromise. 
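[Illustration, not part of the proposed text.] The three lost properties can be checked numerically. The following is a minimal Python sketch under stated assumptions: the timings in `TIMES` and the two reference times are invented for illustration only, and the helper names `speedup` and `table` are hypothetical rather than taken from any PARKBENCH code.

```python
def speedup(t_ref: float, t_p: float) -> float:
    """Speedup of a run taking t_p seconds over a reference time t_ref."""
    if t_ref <= 0 or t_p <= 0:
        raise ValueError("times must be positive")
    return t_ref / t_p


def table(t_ref: float, times: dict) -> dict:
    """Map each processor count p to (Speedup, Speedup/p) for measured T_p."""
    result = {}
    for p, t_p in sorted(times.items()):
        s = speedup(t_ref, t_p)
        result[p] = (s, s / p)
    return result


# Hypothetical wall-clock times T_p in seconds on p = 1, 8, 64 processors.
TIMES = {1: 640.0, 8: 100.0, 64: 20.0}

# Traditional definition: T_1 is the parallel code on one node of the same machine.
self_relative = table(TIMES[1], TIMES)

# Fixed reference times unrelated to the machine (both values are assumptions).
fast_reference = table(160.0, TIMES)   # reference processor faster than one node
slow_reference = table(1000.0, TIMES)  # reference processor slower than one node

if __name__ == "__main__":
    for name, rows in [("self-relative", self_relative),
                       ("fast reference", fast_reference),
                       ("slow reference", slow_reference)]:
        for p, (s, e) in rows.items():
            print(f"{name:15s} p={p:3d}  Speedup={s:7.3f}  Speedup/p={e:6.3f}")
```

With the self-relative definition the one-processor Speedup is exactly 1 and Speedup/$p$ stays at or below 1. With the fast reference the one-processor Speedup is 0.25, and with the slow reference it exceeds 1, so Speedup on $p$ processors is no longer bounded by $p$ and Speedup/$p$ no longer reads as processor utilisation.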
If we use $T_1$ as the time for the defining serial code on a very fast single processor (currently, say, a CRAY C90), then I am sure that manufacturers would be reluctant to quote the Speedup of their MPP with hundreds of processors in the above way. If the Speedup of the 100-processor MPP over a single node of the MPP is a respectable 80, say, it is likely that the Speedup over $T_1$ would be reduced to about 10 or less, because the fast single processor is likely to be at least ten times faster than the workstation chips used in MPPs. [For all the above reasons I personally (RWH) do not believe that Speedup can be saved as a useful metric for comparing computer performance, and that it should only be kept as a convenient metric to use when optimising code on a particular multiprocessor computer in isolation. However, if the committee wishes to allow its use as a metric, the following rules should apply:] The use of absolute measures of computer performance such as Temporal (tstep/s) or Benchmark performance (Mflop/s based on a given nominal flop-count) avoids the above problems of definition. However, if Speedup is used as a metric for comparing computer performance on benchmarks, then the PARKBENCH committee requires that: \begin{enumerate} \item The value of $T_1$ in seconds that was used in the speedup calculation must be quoted along with the value of Speedup. \item The same value of $T_1$ must be used when calculating all Speedup values for a particular benchmark. \item The benchmark writer must provide, as part of the definition of the benchmark, the value of $T_1$ in seconds that is to be used. \item Only if the above rules are obeyed will the benchmark results be accepted as unambiguous and entered into the PDS database. 
\end{enumerate} From owner-pbwg-comm@CS.UTK.EDU Wed Jun 2 18:26:16 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA13296; Wed, 2 Jun 93 18:26:16 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA02392; Wed, 2 Jun 93 18:26:01 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 2 Jun 1993 18:26:00 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sp2.csrd.uiuc.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA02372; Wed, 2 Jun 93 18:25:53 -0400 Received: from sp94.csrd.uiuc.edu.csrd.uiuc.edu (sp94.csrd.uiuc.edu) by sp2.csrd.uiuc.edu with SMTP id AA25307 (5.67a/IDA-1.5); Wed, 2 Jun 1993 17:25:45 -0500 Received: by sp94.csrd.uiuc.edu.csrd.uiuc.edu (4.1/SMI-4.1) id AA03416; Wed, 2 Jun 93 17:25:43 CDT Date: Wed, 2 Jun 93 17:25:43 CDT From: schneid@csrd.uiuc.edu (David John Schneider) Message-Id: <9306022225.AA03416@sp94.csrd.uiuc.edu.csrd.uiuc.edu> To: R.Hockney@pac.soton.ac.uk Cc: pbwg-comm@cs.utk.edu, perfect.steering@csrd.uiuc.edu In-Reply-To: <18087.9306021659@calvados.pac.soton.ac.uk> (R.Hockney@pac.soton.ac.uk) Subject: Re: Revised SPEEDUP section Several weeks ago at the first meeting of the newly reconstituted SPEC-Perfect steering committee, we also got tangled up in policy questions related to speedup and other derived quantities. After the usual heated discussions, we decided to simply report only elapsed times (i.e. time-to-solution or its inverse). As usual, this conclusion was a result of a lot of discussion, much of which I expect was the same as occurred at the PBWG meeting. Since the SPEC-Perfect meeting, I have spent some time trying to understand why speedup and other issues always provoke such contentious discussions. The current debate in the PBWG has prompted me to try to write down some of these ideas to clarify my own thinking. 
The basic conclusion that I have reached is that there are two important classes of problems in basic performance evaluation methodology today, both of which can be readily addressed. Since I have been unable to attend PBWG meetings and directly participate in methodology subcommittee meetings, I'll attempt to summarize my thoughts here. The first class of problems is very basic -- we are searching for something which does not exist. For example, there is a tendency to expend considerable effort to define a single, universal figure of merit for comparing machines when, in fact, there are good reasons to think that this cannot ever be done in a unique and unbiased manner. It would be nice to be able to say that machine X is better (e.g. faster) than machine Y, with no qualifications. However, this simply isn't possible (see below). Second, there are other performance measures such as speedup which are, at best, loosely correlated with an end user's perception of delivered performance and therefore are of little real utility. I agree with all of Roger's comments in this regard. Nevertheless, these loosely defined performance measures tend to get a lot of press in both academic and popular publications because they are "easy to understand". In fact, the contentious debates on these issues indicate that this "understanding" is largely illusory and unsatisfactory when taken out of context or when this context is not provided at all. I feel that both of these classes of problems can be overcome by adopting and adhering to an axiomatic approach. The utility of an axiomatic approach in defining a set of logically consistent performance measures has been recently advocated directly by David Snelling. Others such as Roger Hockney, John Larson, and David Bailey have argued forcefully for the need of a well defined mathematical framework, and the axiomatic approach provides one alternative for constructing such a framework. 
One of the most important aspects of the axiomatic approach is that it forces one to precisely and explicitly state assumptions regarding the measurement process that have previously remained implicit or imprecise. In the axiomatic approach, computer performance evaluation becomes the study of the complicated mapping from the set of computer codes into the multidimensional space of measurable quantities which obey the prescribed axioms (additivity, positivity, etc.). A second important aspect of the axiomatic approach is that it makes it clear that one cannot construct a unique, universal, unbiased ranking scheme for computer performance simply because there is no natural ordering to the underlying space of measurements. Therefore, the search for a universal figure of merit is certain to fail. Would a well informed computer scientist attempt to develop an algorithm for determining whether or not a Turing machine will halt when presented with an arbitrary input? Would a practically minded physicist or engineer write a grant to develop a perpetual motion machine or a faster-than-light communication system? In all cases, if one accepts the usual mathematical description of the underlying problem, then all of these endeavors can be proven futile. Why then in benchmarking do we continue to pursue the goal of a single figure of merit? As emphasized by David Snelling, the axiomatic approach has been extraordinarily successful in the physical sciences. For example, one can meaningfully ascribe quantities such as energy and momentum to particles, waves, fields, and combinations thereof. More recently, in the hands of Shannon and others the field of communication theory underwent a spectacular revolution from a raw empiricism to a quantitative science in a space of only several years by employing a very simple system of axioms and a clear set of definitions. 
In this case the basic axioms were largely borrowed from the foundations of statistical physics (information content or entropy should be an extensive, non-negative function, etc.). Shannon's major contribution was to recognize that it is possible to distinguish between the precisely definable notion of "information content", and the loosely defined "meaning" of this information to the recipient of the message. I believe that the same type of precise thinking which was so successfully applied by Shannon to transform communication theory into a quantitative science also needs to be applied to performance evaluation. The list of existing performance measures which can be incorporated into an axiomatic framework is remarkably short, and Roger Hockney has enumerated essentially all of the time-based measures in his previous versions of PBWG documents. I personally think it is a large step backwards to include speedup in light of the sound arguments against it. The SPEC-Perfect group steering committee concluded that the major argument in favor of speedup and single figures of merit was their current popularity, and this reason was insufficient for us to adopt them as part of our basic measurement and reporting methodology. -- Dave Schneider University of Illinois at Urbana-Champaign Center for Supercomputing Research and Development 367 Computer and Systems Research Laboratory 1308 W. 
Main Street Urbana, IL 61801-2307 MC-264 phone : (217) 244-0055 fax : (217) 244-1351 E-mail: schneid@csrd.uiuc.edu ======================================================================== Return-Path: Errors-To: owner-pbwg-comm@CS.UTK.EDU X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 2 Jun 1993 14:21:19 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU From: R.Hockney@pac.soton.ac.uk Date: Wed, 2 Jun 93 16:59:55 GMT To: pbwg-comm@cs.utk.edu Subject: Revised SPEEDUP section Because the definition of Speedup is of such general concern to all members, not just those in the Methodology subcommittee, I am sending this proposed amendment to all committee members: *********************** \subsection{Speedup and Efficiency} Speedup is a popular metric that has been used for many years to compare the performance of parallel computers. However, its definition is open to ambiguity and misuse because it always begs the question "Speedup over what?"; a question that is often not clearly answered in publications using this metric. Whilst preferring the use of absolute measures of performance, such as Benchmark Performance defined earlier, the PARKBENCH committee accepted that speedup would probably continue to be used, and that the best policy was to sharpen up its definition. Speedup is universally defined as \begin{equation} \frac{T_1}{T_p} \end{equation} where $T_p$ is the $p$-processor time to perform some benchmark, and $T_1$ is called the one-processor time. There is no doubt about the meaning of $T_p$, because this is the measured time $T(N;p)$ to perform the benchmark. 
There is, however, usually considerable discussion over the meaning of $T_1$, [stuff deleted] From owner-pbwg-comm@CS.UTK.EDU Thu Jun 3 11:43:03 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA17970; Thu, 3 Jun 93 11:43:03 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA10097; Thu, 3 Jun 93 11:42:35 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 3 Jun 1993 11:42:34 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sp2.csrd.uiuc.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA10084; Thu, 3 Jun 93 11:42:31 -0400 Received: by sp2.csrd.uiuc.edu id AA04316 (5.67a/IDA-1.5); Thu, 3 Jun 1993 10:42:03 -0500 Date: Thu, 3 Jun 1993 10:42:03 -0500 From: "John L. Larson" Message-Id: <199306031542.AA04316@sp2.csrd.uiuc.edu> To: pbwg-comm@cs.utk.edu, perfect.steering@csrd.uiuc.edu Subject: recent papers I am sending you postscript versions of two recent papers for your information. The first one which includes performance metric definitions will appear in the August issue of the Proceedings of the IEEE. The second one describes a workload characterization study and has been submitted to Supercomputing '93. Any comments are appreciated. thanks john From owner-pbwg-comm@CS.UTK.EDU Thu Jun 3 11:44:03 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA17985; Thu, 3 Jun 93 11:44:03 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA10266; Thu, 3 Jun 93 11:44:22 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 3 Jun 1993 11:44:18 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sp2.csrd.uiuc.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA10228; Thu, 3 Jun 93 11:44:07 -0400 Received: by sp2.csrd.uiuc.edu id AA04336 (5.67a/IDA-1.5); Thu, 3 Jun 1993 10:43:34 -0500 Date: Thu, 3 Jun 1993 10:43:34 -0500 From: "John L. 
Larson" Message-Id: <199306031543.AA04336@sp2.csrd.uiuc.edu> To: pbwg-comm@cs.utk.edu, perfect.steering@csrd.uiuc.edu Subject: IEEE paper %!PS-Adobe-2.0 %%Creator: dvips 5.47 Copyright 1986-91 Radical Eye Software %%Title: /homes/bradley/ieee.dvi %%Pages: 27 1 %%BoundingBox: 0 0 612 792 %%EndComments %%BeginProcSet: texc.pro /TeXDict 200 dict def TeXDict begin /N /def load def /B{bind def}N /S /exch load def /X{S N}B /TR /translate load N /isls false N /vsize 10 N /@rigin{ isls{[0 1 -1 0 0 0]concat}if 72 Resolution div 72 VResolution div neg scale Resolution VResolution vsize neg mul TR matrix currentmatrix dup dup 4 get round 4 exch put dup dup 5 get round 5 exch put setmatrix}N /@letter{/vsize 10 N}B /@landscape{/isls true N /vsize -1 N}B /@a4{/vsize 10.6929133858 N}B /@a3{ /vsize 15.5531 N}B /@ledger{/vsize 16 N}B /@legal{/vsize 13 N}B /@manualfeed{ statusdict /manualfeed true put}B /@copies{/#copies X}B /FMat[1 0 0 -1 0 0]N /FBB[0 0 0 0]N /nn 0 N /IE 0 N /ctr 0 N /df-tail{/nn 8 dict N nn begin /FontType 3 N /FontMatrix fntrx N /FontBBox FBB N string /base X array /BitMaps X /BuildChar{CharBuilder}N /Encoding IE N end dup{/foo setfont}2 array copy cvx N load 0 nn put /ctr 0 N[}B /df{/sf 1 N /fntrx FMat N df-tail} B /dfs{div /sf X /fntrx[sf 0 0 sf neg 0 0]N df-tail}B /E{pop nn dup definefont setfont}B /ch-width{ch-data dup length 5 sub get}B /ch-height{ch-data dup length 4 sub get}B /ch-xoff{128 ch-data dup length 3 sub get sub}B /ch-yoff{ ch-data dup length 2 sub get 127 sub}B /ch-dx{ch-data dup length 1 sub get}B /ch-image{ch-data dup type /stringtype ne{ctr get /ctr ctr 1 add N}if}B /id 0 N /rw 0 N /rc 0 N /gp 0 N /cp 0 N /G 0 N /sf 0 N /CharBuilder{save 3 1 roll S dup /base get 2 index get S /BitMaps get S get /ch-data X pop /ctr 0 N ch-dx 0 ch-xoff ch-yoff ch-height sub ch-xoff ch-width add ch-yoff setcachedevice ch-width ch-height true[1 0 0 -1 -.1 ch-xoff sub ch-yoff .1 add]/id ch-image N /rw ch-width 7 add 8 idiv string N /rc 0 N /gp 0 N /cp 0 N{rc 0 
ne{rc 1 sub /rc X rw}{G}ifelse}imagemask restore}B /G{{id gp get /gp gp 1 add N dup 18 mod S 18 idiv pl S get exec}loop}B /adv{cp add /cp X}B /chg{rw cp id gp 4 index getinterval putinterval dup gp add /gp X adv}B /nd{/cp 0 N rw exit}B /lsh{rw cp 2 copy get dup 0 eq{pop 1}{dup 255 eq{pop 254}{dup dup add 255 and S 1 and or}ifelse}ifelse put 1 adv}B /rsh{rw cp 2 copy get dup 0 eq{pop 128}{dup 255 eq{pop 127}{dup 2 idiv S 128 and or}ifelse}ifelse put 1 adv}B /clr{rw cp 2 index string putinterval adv}B /set{rw cp fillstr 0 4 index getinterval putinterval adv}B /fillstr 18 string 0 1 17{2 copy 255 put pop}for N /pl[{adv 1 chg}bind{adv 1 chg nd}bind{1 add chg}bind{1 add chg nd}bind{adv lsh}bind{ adv lsh nd}bind{adv rsh}bind{adv rsh nd}bind{1 add adv}bind{/rc X nd}bind{1 add set}bind{1 add clr}bind{adv 2 chg}bind{adv 2 chg nd}bind{pop nd}bind]N /D{ /cc X dup type /stringtype ne{]}if nn /base get cc ctr put nn /BitMaps get S ctr S sf 1 ne{dup dup length 1 sub dup 2 index S get sf div put}if put /ctr ctr 1 add N}B /I{cc 1 add D}B /bop{userdict /bop-hook known{bop-hook}if /SI save N @rigin 0 0 moveto}N /eop{clear SI restore showpage userdict /eop-hook known{eop-hook}if}N /@start{userdict /start-hook known{start-hook}if /VResolution X /Resolution X 1000 div /DVImag X /IE 256 array N 0 1 255{IE S 1 string dup 0 3 index put cvn put}for}N /p /show load N /RMat[1 0 0 -1 0 0]N /BDot 260 string N /rulex 0 N /ruley 0 N /v{/ruley X /rulex X V}B /V statusdict begin /product where{pop product dup length 7 ge{0 7 getinterval (Display)eq}{pop false}ifelse}{false}ifelse end{{gsave TR -.1 -.1 TR 1 1 scale rulex ruley false RMat{BDot}imagemask grestore}}{{gsave TR -.1 -.1 TR rulex ruley scale 1 1 false RMat{BDot}imagemask grestore}}ifelse B /a{moveto}B /delta 0 N /tail{dup /delta X 0 rmoveto}B /M{S p delta add tail}B /b{S p tail} B /c{-4 M}B /d{-3 M}B /e{-2 M}B /f{-1 M}B /g{0 M}B /h{1 M}B /i{2 M}B /j{3 M}B /k{4 M}B /w{0 rmoveto}B /l{p -4 w}B /m{p -3 w}B /n{p -2 w}B /o{p -1 w}B /q{p 1 w}B 
/r{p 2 w}B /s{p 3 w}B /t{p 4 w}B /x{0 S rmoveto}B /y{3 2 roll p a}B /bos{ /SS save N}B /eos{clear SS restore}B end %%EndProcSet TeXDict begin 1000 300 300 @start /Fa 36 122 df<127012F8A3127005057C840D>46 D48 DIII<1306A2 130EA2131E132EA2134E138EA2EA010E1202A212041208A212101220A2124012C0B512F038000E 00A7EBFFE0141E7F9D17>II<137CEA0182EA0701380E0380EA0C07121C3838030090C7FC12781270A2EAF1F0EAF21CEA F406EAF807EB0380A200F013C0A51270A214801238EB07001218EA0C0E6C5AEA01F0121F7E9D17 >I<1240387FFFC01480A238400100EA8002A25B485AA25B5BA25BA213C0A212015B1203A41207 A76CC7FC121F7D9D17>III<5B497EA3497EA3EB09E0A3EB10F0A3EB2078A3497EA3497E A2EBFFFE3801000FA30002EB0780A348EB03C0120E001FEB07E039FFC03FFE1F207F9F22>65 DI<90380FC04090387030C03801C00938038005 38070003000E1301001E1300121C123C007C1440A2127800F81400A91278007C1440A2123C121C 001E1480120E6CEB0100380380026C6C5A38007038EB0FC01A217D9F21>IIII<90 380FE02090387818609038E004E03803800238070001481300001E1460A25A1520127C127800F8 1400A7EC7FFCEC03E000781301127C123CA27EA27E7E380380023900E00460903878182090380F E0001E217D9F24>I77 D<39FF803FF83907C007C0EC03803905E00100A2EA04F01378A2133CA213 1E130FA2EB0781A2EB03C1EB01E1A2EB00F1A21479143DA2141FA28080A2000E7F121F38FFE001 1D1F7E9E22>I82 D<3807E080EA0C19EA1007EA3003EA6001A212E01300A36C1300A21278127FEA3FF0EA1FFC6C7E EA03FF38001F801307EB03C0A213011280A400C01380130300E01300EAF006EACE0CEA81F81221 7D9F19>I<39FFF003FF391F8000F8000F1460000714407F6C6C138012019038F0010000005BEB F802EB7C04133CEB3E08131EEB1F10EB0FB0EB07A014C01303AB1307EB7FFE201F7F9E22>89 D97 D<120E12FE120EAA133EEBC380380F01C0EB00E0120E1470 A21478A61470A214E0120F380D01C0380CC300EA083E15207F9F19>IIII<137C13C6EA018F1203EA07061300A7EAFFF0EA0700B2 EA7FF01020809F0E>I<14E03803E330EA0E3CEA1C1C38380E00EA780FA5EA380E6C5AEA1E38EA 33E00020C7FCA21230A2EA3FFE381FFF806C13C0383001E038600070481330A4006013606C13C0 381C03803803FC00141F7F9417>I<390E1F01F039FE618618390E81C81C390F00F00EA2000E13 E0AE3AFFE7FE7FE023147F9326>109 DI114 
DI<38FF83F8381E01E0381C00C06C1380A338070100A2EA 0382A3EA01C4A213ECEA00E8A21370A3132015147F9318>118 D<38FF83F8381E01E0381C00C0 6C1380A338070100A2EA0382A3EA01C4A213ECEA00E8A21370A31320A25BA3EAF080A200F1C7FC 1262123C151D7F9318>121 D E /Fb 15 119 df<1430146014C0EB0180EB03005B130E130C5B 1338133013705B5B12015B1203A290C7FC5A1206120EA2120C121CA312181238A45AA75AB3A312 70A77EA41218121CA3120C120EA2120612077E7FA212017F12007F13701330133813187F130E7F 7FEB0180EB00C014601430146377811F>18 D<12C012607E7E7E120E7E7E6C7E7F12007F137013 3013381318131CA2130C130E13061307A27F1480A3130114C0A4EB00E0A71470B3A314E0A7EB01 C0A414801303A314005BA21306130E130C131CA213181338133013705B5B12015B48C7FC5A120E 120C5A5A5A5A14637F811F>I<14181430146014E014C0EB01801303EB07001306130E130C131C 5BA25BA25BA212015BA2485AA3120790C7FCA25A120EA2121EA3121CA2123CA412381278A81270 12F0B3A812701278A81238123CA4121CA2121EA3120EA2120F7EA27F1203A36C7EA27F1200A213 70A27FA27F130C130E13061307EB03801301EB00C014E0146014301418157C768121>32 D<12C012607E123812187E120E7E7E7F12017F6C7EA21370A27FA2133C131CA27FA3130F7FA214 801303A214C0A31301A214E0A4130014F0A814701478B3A8147014F0A814E01301A414C0A21303 A31480A213071400A25B130EA35BA2133C1338A25BA25BA2485A5B120390C7FC5A120E120C5A12 3812305A5A157C7F8121>I<1318137813F0EA01E0EA03C0EA0780EA0F005A121E123E123C127C A2127812F8B3A50D25707E25>56 D<12C012F0127C121E7EEA078013C01203EA01E013F0120013 F8A3137CB3A50E25797E25>I<12F8B3A51278127CA2123C123E121E121F7EEA0780EA03C0EA01 E0EA00F0137813180D25708025>I<137CB3A513F8A313F0120113E0EA03C012071380EA0F0012 1E127C12F012C00E25798025>I<137CB3A613F8A313F0120113E0120313C0EA07801300120E5A 5A12F012C012F012387E7E7E1380EA03C013E0120113F0120013F8A3137CB3A60E4D798025>I< 12F8B3A61278127CA2123CA2123E121E7EA2EA0780EA03C01201EA00E013781318137813E0EA01 C01203EA0780EA0F00A2121E123E123CA2127CA2127812F8B3A60D4D708025>I<12F8AE050E70 8025>I88 D116 D<12C0B3A9021B64802C>I<387FFF80B5FC00C0C7FCB3A611 1A64812C>I E /Fc 9 116 df<0007B512803800E003EC0100A3EA01C0A21440A248485A138113 
FF1381D80701C7FCA390C8FC120EA45AEAFFC019177F9616>70 D99 D101 D<130E1313133713361360A5EA07 FCEA00C0A5EA0180A6EA0300A4126612E65A1278101D7E9611>I<120313801300C7FCA6121C12 241246A25A120C5AA31231A21232A2121C09177F960C>105 D<1318133813101300A6EA01C0EA 0220EA0430A2EA08601200A313C0A4EA0180A4EA630012E312C612780D1D80960E>I<38383C1E 3844C6633847028138460301388E0703EA0C06A21406EA180C1520140C154038301804EC07801B 0E7F8D1F>109 DI115 D E /Fd 1 121 df<121FEA3080EA7040EA6060EAE0E0A21340 13001260127012307E121C1233EA7180EA61C012E013E0A31260EA70C01231EA1980EA07007EEA 018013C0120013E0124012E0A2EAC0C01241EA2180EA1F000B257D9C12>120 D E /Fe 52 123 df<903901FF81FE011F9038EFFF80903A7F80FF87C0903AFC00FE0FE03801F8 01000314FCEA07F0EE07C093C7FCA7B712F8A32707F001FCC7FCB3A33A7FFF1FFFE0A32B2A7FA9 28>11 D<903A01FF803FE0011F9038E3FFF8903A7F80FFF01E903AFE007F801FD801F849485A00 0349EC7F80D807F05BA30200EC3F00171E94C7FCA4B91280A33B07F000FE003FB3A33C7FFF0FFF E3FFF8A3352A7FA939>14 D<121C127FA2EAFF8013C0A2127FA2121C1200A2EA0180A3EA0300A2 12065A5A5A12200A157BA913>39 D45 D<121C123E127FEAFF80A3EA7F 00123E121C09097B8813>I<130E131E137EEA07FE12FFA212F81200B3AB387FFFFEA317277BA6 22>49 DII<140E141E143E147E14FEA213011303EB077E130EA2131C13381370 13E0A2EA01C0EA0380EA0700120EA25A5A5A5AB612F8A3C7EAFE00A890387FFFF8A31D277EA622 >I<000C1303380F803FEBFFFE5C5C5C5C5C49C7FC000EC8FCA6EB7FC0380FFFF8EB80FC380E00 3E000C133FC7EA1F8015C0A215E0A21218127C12FEA315C05A0078EB3F80A26CEB7F00381F01FE 380FFFF800035BC613801B277DA622>II<1238123E003FB512F0A34814E015C0158015003870000EA25C485B5C5CC7FC495A495A 1307A249C7FCA25BA25B133EA2137EA413FEA8137C13381C297CA822>I<121C123E127FEAFF80 A3EA7F00123E121CC7FCA9121C123E127FEAFF80A3EA7F00123E121C091B7B9A13>58 D65 DI<91393FF0018090 3903FFFE03010FEBFF8790393FF007DF9039FF8001FF4848C7127F4848143FD807F0141F000F15 0F48481407A2485A1603127F5B93C7FC12FFA9127FA26DEC0380123FA26C7EEE07006C7E000715 0ED803FC141E6C6C5C6C6C6C13F890393FF007E0010FB55A010391C7FC9038003FF829297CA832 
>III<91387FE003903903FFFC07011FEBFF0F90393FF00FFF9038FF80014848C7FCD8 03F8143F485A000F81484880A2485A82127F5B93C7FC12FFA84AB512F8127FA26DC71300123FA2 6C7EA26C7E12076C7EEA01FE6C6C6C5A90393FF007BF6DB5121F0103497E9039007FF0032D297C A836>71 DI76 DI80 DII<90387F80603903FFF0E04813F9380F807F381F 001F003E1307481303140112FCA214007EA26C140013C0EA7FFEEBFFE06C13FC6C7F6CEBFF806C 14C06C14E0C6FC010713F0EB007FEC0FF8140714030060130112E0A36C14F0A26C13036C14E0B4 EB07C09038E01F8000F3B5120000E05B38C01FF01D297CA826>I<007FB712C0A39039803FC03F D87E00140700781503A20070150100F016E0A2481500A5C71500B3A4017FB512E0A32B287EA730 >II 89 D<48B47E000713F0380F81F8381FC07EA280D80F801380EA0700C7FCA3EB0FFF90B5FC3807 FC3FEA0FE0EA3F8013005A12FEA4007E137F007F13DF393F839FFC380FFF0F3801FC031E1B7E9A 21>97 DIIIII<90 38FF81F00003EBE7F8390FC1FE7C381F80FC9038007C3848EB7E1048EB7F00A66C137E6C137CEB 80FC380FC1F8381FFFE0001813800038C8FCA2123C123E383FFFF86C13FF15806C14C06C14E000 1F14F0383E000748EB01F8481300A4007CEB01F0003C14E0001FEB07C0390FC01F803903FFFE00 38007FF01E287E9A22>II<1207EA0F80EA1FC0EA3FE0A3EA1FC0EA0F80EA07 00C7FCA7EAFFE0A3120FB3A3EAFFFEA30F2B7DAA14>I107 DI<3BFFC0 7F800FF0903AC1FFE03FFC903AC383F0707E3B0FC603F8C07F903ACC01F9803F01D8D9FF001380 01F05BA201E05BB03CFFFE1FFFC3FFF8A3351B7D9A3A>I<38FFC07F9038C1FFC09038C787E039 0FCE03F013D88113F0A213E0B03AFFFE3FFF80A3211B7D9A26>II<38FFE1FE9038E7FF809038FE07E039 0FF803F0496C7E496C7E818181A21680A716005DA25D4A5A01F05B6D485A9038FE0FE09038E7FF 80D9E1FCC7FC01E0C8FCA9EAFFFEA321277E9A26>I<38FFC1F0EBC7FCEBCE3E380FD87FA213F0 143E141CEBE000B0B5FCA3181B7E9A1C>114 D<3803FE30380FFFF0EA1E03EA380048137012F0 A27E6C1300EAFFE0EA7FFEEBFF806C13E06C13F0000713F8C6FCEB03FC13000060137C00E0133C 7E14387E6C137038FF01E038F7FFC000C11300161B7E9A1B>I<1370A413F0A312011203A21207 381FFFF0B5FCA23807F000AD1438A61203EBF870000113603800FFC0EB1F8015267FA51B>I<39 FFE03FF8A3000F1303B214071207140F3A03F03BFF803801FFF338003FC3211B7D9A26>I<3BFF 
FE7FFC0FFEA33B0FE007E000E03B07F003F001C0A29039F807F80300031680A23B01FC0EFC0700 A2D9FE1E5B000090381C7E0EA29039FF383F1E017F141C0278133C90393FF01FB8A216F86D486C 5AA26D486C5AA36D486C5AA22F1B7F9A32>119 D<39FFFC0FFFA33907F003C06C6C485AEA01FC 6C6C48C7FCEBFF1E6D5AEB3FF86D5A130FA2130780497E497E131EEB3C7F496C7E496C7ED801E0 7FEBC00F00036D7E3AFFF01FFF80A3211B7F9A24>I<3AFFFE03FF80A33A07F0007000A26D13F0 00035CEBFC0100015CA26C6C485AA2D97F07C7FCA2148FEB3F8E14DEEB1FDCA2EB0FF8A36D5AA2 6D5AA26D5AA2495AA2EA3807007C90C8FCEAFE0F130E131E5BEA7C78EA3FE0EA0FC021277F9A24 >I<003FB51280A29038007F00003C13FEEA3801387803FC5CEA7007495A5CC6485A133F495A91 C7FC5B3901FE038013FCEA03F81207380FF00713E0001F140048485A495A387F007FB6FCA2191B 7E9A1F>I E /Ff 15 121 df0 D<6C13026C13060060130C6C13186C13 306C13606C13C03803018038018300EA00C6136C1338A2136C13C6EA018338030180380600C048 136048133048131848130C4813064813021718789727>2 D15 D<150C153C15F0EC03C0EC0F0014 3C14F0EB07C0011FC7FC1378EA01E0EA0780001EC8FC127812E01278121EEA0780EA01E0EA0078 131FEB07C0EB00F0143C140FEC03C0EC00F0153C150C1500A8007FB512F8B612FC1E277C9F27> 20 D<487E48CAFCA21206A25A5A1270B812C0A20070CAFC12187E7EA27EA26C7E2A127C9432> 32 D<166082A282A28282EE0380B812E017C0C9EA0380EE06005E5EA25EA25E2B127D9432>I<13 C0B3B100C013C0EAF0C33838C700EA1CCEEA06D8EA03F06C5AA26C5A1340122D7DA219>35 D<12C07E12707E7E7E7E6C7E6C7E6C7E13707F7F7F7F6D7E6D7E6D7E1470808080806E7E6E7E6E 7E15708181816F13209238038060ED01C0ED00E0EE70C01638161C160E16071603163F923801FF E0923803C0602B2B7DA032>38 D<176017E0EE01C0EE0380EE0700160E5E5E5E5E4B5A4B5A4BC7 FC150E5D5D5D5D4A5A4A5A4AC8FC140E5C5C5C5C495A495A49C9FC130E6C5A6C5A5B5BEA61C0EA 63800067CAFC126E127C1278EA7F80EAFFF0EAC0782B2B7DA032>46 D<134013C0A2EA0180A3EA 0300A31206A25AA35AA35AA35AA35AA41260A37EA37EA37EA37EA27EA3EA0180A3EA00C0A21340 0A327BA413>104 D<12C0A31260A37EA37EA27EA37EA37EA3EA0180A3EA00C0A4EA0180A3EA03 00A31206A35AA35AA25AA35AA35AA30A327DA413>I<12C0B3B3AD02317AA40E>II<160116031606A2160CA21618A21630A21660A216C0A2ED0180A2 
ED0300A21506A25DA25DA25D1206001E5C122F004F5CEA87800007495AEA03C04AC7FCA23801E0 06A26C6C5AA2EB7818A26D5AA26D5AA26D5AA26D5AA26DC8FCA228327D812A>112 D120 D E /Fg 56 123 df11 D<137E3801C180EA0301380703C0120EEB018090C7FC A5B512C0EA0E01B0387F87F8151D809C17>II34 D<13401380EA0100120212065AA25AA25AA212701260A312E0AC1260A312 701230A27EA27EA27E12027EEA008013400A2A7D9E10>40 D<7E12407E7E12187EA27EA27EA213 801201A313C0AC1380A312031300A21206A25AA25A12105A5A5A0A2A7E9E10>I<126012F0A212 701210A41220A212401280040C7C830C>44 DI<126012F0A212600404 7C830C>I48 D<12035A123F12C71207B3A4EA0F80EAFFF80D1C7C9B 15>III<130CA2131C133CA2135C13DC13 9CEA011C120312021204120C1208121012301220124012C0B512C038001C00A73801FFC0121C7F 9B15>II<13F0EA030CEA0604EA0C0E EA181E1230130CEA7000A21260EAE3E0EAE430EAE818EAF00C130EEAE0061307A51260A2EA7006 EA300E130CEA1818EA0C30EA03E0101D7E9B15>I<1240387FFF801400A2EA4002485AA25B485A A25B1360134013C0A212015BA21203A41207A66CC7FC111D7E9B15>I<1306A3130FA3EB1780A3 EB23C0A3EB41E0A3EB80F0A200017FEB0078EBFFF83803007C0002133CA20006133E0004131EA2 000C131F121E39FF80FFF01C1D7F9C1F>65 DI<9038 1F8080EBE0613801801938070007000E13035A14015A00781300A2127000F01400A80070148012 78A212386CEB0100A26C13026C5B380180083800E030EB1FC0191E7E9C1E>IIII73 D77 D79 DI<3807E080EA1C19EA3005EA7003EA600112E01300A36C13007E127CEA7FC0 EA3FF8EA1FFEEA07FFC61380130FEB07C0130313011280A300C01380A238E00300EAD002EACC0C EA83F8121E7E9C17>83 D<007FB512C038700F010060130000401440A200C014201280A3000014 00B1497E3803FFFC1B1C7F9B1E>I<397FF0FFC0390FC03E0038078018EA03C0EBE01000015BEB F06000001340EB7880137D013DC7FC7F131F7F80A2EB13C0EB23E01321EB41F0EBC0F8EB807838 01007C48133C00027F0006131F001FEB3F8039FFC0FFF01C1C7F9B1F>88 D<39FFF007FC390F8001E00007EB0080EBC00100031400EBE002EA01F000005B13F8EB7808EB7C 18EB3C106D5A131F6D5A14C06D5AABEB7FF81E1C809B1F>I92 D97 D<12FC121CAA137CEA1D86EA1E03381C018014C0130014E0A614C013011480381E0300EA1906EA 10F8131D7F9C17>II<133F1307AAEA03E7EA0C17EA180F487E1270126012E0A6126012 
7012306C5AEA0C373807C7E0131D7E9C17>II<13F8EA018CEA071E12 06EA0E0C1300A6EAFFE0EA0E00B0EA7FE00F1D809C0D>II<12FC121CAA137C1387EA1D0300 1E1380121CAD38FF9FF0141D7F9C17>I<1218123CA21218C7FCA712FC121CB0EAFF80091D7F9C 0C>I<12FC121CAAEB3FC0EB0F00130C13085B5B5B13E0121DEA1E70EA1C781338133C131C7F13 0F148038FF9FE0131D7F9C16>107 D<12FC121CB3A9EAFF80091D7F9C0C>I<39FC7E07E0391C83 8838391D019018001EEBE01C001C13C0AD3AFF8FF8FF8021127F9124>IIII< EAFCE0EA1D30EA1E78A2EA1C301300ACEAFFC00D127F9110>114 DI<1204A4120CA2121C123CEAFFE0EA1C00A91310A5120CEA0E20EA03C00C1A7F9910>I<38 FC1F80EA1C03AD1307120CEA0E1B3803E3F014127F9117>I<38FF07E0383C0380381C0100A2EA 0E02A26C5AA3EA0388A213D8EA01D0A2EA00E0A3134013127F9116>I<39FF3FCFE0393C0F0380 381C07011500130B000E1382A21311000713C4A213203803A0E8A2EBC06800011370A2EB803000 0013201B127F911E>I<387F8FF0380F03801400EA0702EA0384EA01C813D8EA00F01370137813 F8139CEA010E1202EA060738040380381E07C038FF0FF81512809116>I<38FF07E0383C038038 1C0100A2EA0E02A26C5AA3EA0388A213D8EA01D0A2EA00E0A31340A25BA212F000F1C7FC12F312 66123C131A7F9116>II E /Fh 5 54 df<120C121C12EC120CAF EAFFC00A137D9211>49 D<121FEA60C01360EAF07013301260EA0070A2136013C012011380EA02 005AEA08101210EA2020EA7FE012FF0C137E9211>II<136013E0A2EA 016012021206120C120812101220126012C0EAFFFCEA0060A5EA03FC0E137F9211>II E /Fi 60 123 df12 D<91390FF01FE091393838601891396079C01C9139C07B803C0101EB330003071338D903801400 A2150EA2EB0700A35D90B712E090390E001C00A2EE01C01538A25BEE0380A21570A2EE07005BA2 03E01308EE0E10A25B1720913801C006EE03C0016091C7FC01E05B140313C092C8FCEA71C738F1 8F06EB0F0C38620618383C03E02E2D82A22B>14 D<1480EB010013025B5B5B13305B5BA2485A48 C7FCA21206A2120E120C121C1218A212381230A21270A21260A212E0A35AAD12401260A2122012 3012107E113278A414>40 D<13087F130613021303A27F1480AD1303A31400A25BA21306A2130E 130CA2131C131813381330A25BA25B485AA248C7FC120612045A5A5A5A5A113280A414>I45 D<127012F8A212F012E005057A840F>I<150815181530A2156015C0 A2EC0180A2EC03001406A25CA25C5CA25CA25C495AA249C7FCA21306A25B5BA25BA25B5BA2485A 
A248C8FC1206A25AA25A5AA25AA25AA25A1D317FA419>II<13011303A21306131E132EEA03CEEA001CA41338A4 1370A413E0A4EA01C0A4EA0380A41207EAFFFC10217AA019>II<14181438A21470A314E0A314C01301148013031400A21306A25BA25B1310EB 3180EB61C0EB438013831201EA03033802070012041208EA3FC7EA403E38800FF038000E00A25B A45BA31330152B7EA019>52 D54 D<380278023804FC041207000F1308EB0E18381E 0670381803A03830006000201340006013C000401380EA8001EA000314005B1306130EA25BA213 3C13381378A25BA3485AA312035BA26C5A172279A019>I<1207EA0F80A21300120EC7FCAB1270 12F8A25A5A09157A940F>58 D<001FB6FCA2C9FCA8B612F8A2200C7B9125>61 D<1403A25CA25CA25C142FA2144F15801487A2EB01071302A21304A21308A2131013301320EB7F FF90384007C013801403EA01005A12025AA2120C003C1307B4EB3FFC1E237DA224>65 D<027F138090390380810090380E00630138132749131F49130E485A485A48C7FC481404120E12 1E5A5D4891C7FCA35AA55A1520A25DA26C5C12704AC7FC6C130200185B001C5B00061330380381 C0D800FEC8FC212479A223>67 D<90B512F090380F003C150E81011EEB0380A2ED01C0A25B16E0 A35BA449EB03C0A44848EB0780A216005D4848130E5D153C153848485B5D4A5A0207C7FC000F13 1CB512F023227DA125>I<90B6128090380F00071501A2131EA21600A25BA2140192C7FCEB7802 A21406140EEBFFFCEBF00CA33801E008A21504EC0008485AA25DA248485B15605D1401000F1307 B65A21227DA121>I<90B6FC90380F000F1503A2131EA21502A25BA214011500EB7802A2140614 0EEBFFFCEBF00CA33801E008A391C7FC485AA4485AA4120FEAFFFC20227DA120>I<027F138090 390380810090380E00630138132749131F49130E485A485A48C7FC481404120E121E5A5D4891C7 FCA35AA4EC3FFC48EB01E0A34A5AA27E12704A5A7E0018130F001C131300060123C7FC380381C1 D800FEC8FC212479A226>I<9039FFF87FFC90390F000780A3011EEB0F00A449131EA4495BA490 B512F89038F00078A348485BA44848485AA44848485AA4000F130739FFF87FFC26227DA124>I< EBFFF8EB0F00A3131EA45BA45BA45BA4485AA4485AA4485AA4120FEAFFF815227DA113>I<9038 07FFC09038003C00A35CA45CA4495AA4495AA4495AA449C7FCA212381278EAF81EA2485AEA4038 5BEA21E0EA1F801A237CA11A>I76 DI<01FFEB0FFC90390F8001E016 80ECC0000113EB0100A2EB11E0A201211302EB20F0A39038407804A3143C01805B143E141EA239 
01001F10140FA2EC0790000214A0A2EC03E0A2485C1401A2120C001E6D5AEAFFC026227DA124> I<14FE903807838090380C00E0013813704913385B4848131C485A48C7FC48141E121E121C123C A25AA348143CA31578A25A15F0A2EC01E015C06C1303EC0780007014000078130E00385B6C5B6C 13E038070380D801FCC7FC1F2479A225>I<90B512E090380F0038151E150E011E1307A449130F A3151E5B153C157815E09038F003C09038FFFE0001F0C7FCA2485AA4485AA4485AA4120FEAFFF8 20227DA121>I<14FE903807838090380C00E0013813704913385B4848133C4848131C48C7FC48 141E121EA25AA25AA348143CA3153815785A15F0A2EC01E015C01403D8F01E13803970610700EB 810E00385B001C5B000E13E039078380403801FD00EA0001EC808014811500EB0387EB01FEA25C EB00F01F2D79A225>I<90B512C090380F0070153C151C011E130EA2150FA249131EA3153C4913 381570EC01E0EC07809038FFFC00EBF00E80EC0380D801E013C0A43903C00780A43907800F0015 01A2EC0702120F39FFF8038CC812F020237DA124>I<903801F02090380E0C4090381802C0EB30 01136001E0138013C01201A200031400A291C7FCA27FEA01F813FF6C13E06D7EEB1FF8EB03FCEB 007C143C80A30020131CA3141800601338143000705B5C38C80180D8C607C7FCEA81FC1B247DA2 1B>I<001FB512F8391E03C03800181418123038200780A200401410A2EB0F001280A200001400 131EA45BA45BA45BA4485AA41203B5FC1D2277A123>I<393FFE03FF3903C00078156015204848 1340A448C71280A4001EEB0100A4481302A4485BA400705B12F05C12705C5C123038380180D818 02C7FCEA0E0CEA03F0202377A124>I<3BFFF03FF80FF83B1F0007C003C0001E91388001801700 1602140F5E001F13175E6C13275E144702C75B1487D901075B16C001025C0381C7FC130415C213 08EC03C4131015C8132015D0134001C013E0138001005B5D120E92C8FC120C14022D2376A131> 87 D97 DI<137EEA01C138030180EA0703EA0E07121C003CC7FC12381278A35AA45B12701302 EA300CEA1830EA0FC011157B9416>I<143CEB03F8EB0038A31470A414E0A4EB01C013F9EA0185 EA0705380E0380A2121C123C383807001278A3EAF00EA31410EB1C201270133C38305C40138C38 0F078016237BA219>I<13F8EA0384EA0E02121C123C1238EA7804EAF018EAFFE0EAF000A25AA4 1302A2EA6004EA7018EA3060EA0F800F157A9416>I<143E144714CFEB018F1486EB0380A3EB07 00A5130EEBFFF0EB0E00A35BA55BA55BA55BA45B1201A2EA718012F100F3C7FC1262123C182D82 A20F>II<13F0EA 
0FE01200A3485AA4485AA448C7FC131FEB2180EBC0C0380F00E0A2120EA2381C01C0A438380380 A3EB070400701308130E1410130600E01320386003C016237DA219>I<13C0EA01E013C0A2C7FC A8121C12231243A25AA3120EA25AA35AA21340EA7080A3EA71001232121C0B217BA00F>I<13F0 EA0FE01200A3485AA4485AA448C7FCEB01E0EB0210EB0C70380E10F0A2EB2060EB4000EA1D8000 1EC7FCEA1FC0EA1C70487EA27F142038703840A3EB188012E038600F0014237DA216>107 DI<391C0F80F8392610C10C39476066063987807807A2EB0070A2000EEBE00EA44848485A A3ED38202638038013401570168015303A7007003100D83003131E23157B9428>II<137EEA01C338038180380701C0120E001C13E0123C12381278A338 F003C0A21480130700701300130E130CEA3018EA1870EA07C013157B9419>I<3801C1F0380262 183804741C3808780CEB700EA2141EEA00E0A43801C03CA3147838038070A2EBC0E0EBC1C03807 2380EB1E0090C7FCA2120EA45AA3EAFFC0171F7F9419>III<13FCEA018338020080EA0401EA0C03140090C7FC120F13F0EA07FC6C7EEA003E130F 7F1270EAF006A2EAE004EA4008EA2030EA1FC011157D9414>I<13C01201A4EA0380A4EA0700EA FFF8EA0700A2120EA45AA45AA31310EA7020A213401380EA3100121E0D1F7C9E10>I<001E1360 002313E0EA4380EB81C01283EA8701A238070380120EA3381C0700A31408EB0E101218121CEB1E 20EA0C263807C3C015157B941A>I<381E0380382307C0EA43871383EA8381EA8700A200071380 120EA3381C0100A31302A25B5BA2EA0C30EA03C012157B9416>I<001EEB60E00023EBE1F0EA43 80EB81C000831470D887011330A23907038020120EA3391C070040A31580A2EC0100130F000C13 023806138C3803E0F01C157B9420>I<3803C1E0380462103808347038103CF0EA203814601400 C65AA45BA314203861C04012F1148038E2C100EA4462EA383C14157D9416>I<001E1330002313 70EA438014E01283EA8700A2380701C0120EA3381C0380A4EB0700A35BEA0C3EEA03CEEA000EA2 5B1260EAF0381330485AEA80C0EA4380003EC7FC141F7B9418>I<3801E0203803F0603807F8C0 38041F80380801001302C65A5B5B5B5B5B48C7FC120248138038080100485AEA3F06EA61FCEA40 F8EA807013157D9414>I E /Fj 7 62 df48 D<12035AB4FC1207B1EA7FF00C157E9412 >III<13 30A2137013F012011370120212041208121812101220124012C0EAFFFEEA0070A5EA03FE0F157F 9412>II61 D E /Fk 29 120 df<127012F8A3127005057C840E>58 D<127012F812FCA212741204A41208A2 
1210A212201240060F7C840E>I<14801301A2EB0300A31306A35BA35BA35BA35BA35BA3485AA4 48C7FCA31206A35AA35AA35AA35AA35AA311317DA418>61 D<811401811403A21407140BA21413 143314231443811481130114011302A21304130C1308131090381FFFF0EB2000136013405BA248 C7FC5A12024880120C121E3AFF800FFF8021237EA225>65 D<90387FFFF8903807800FED0780ED 03C090380F0001A216E0A2011E14C01503A2ED078049EB0F00151E5DEC01F090387FFFE0903878 007881815B150E150FA24848131EA35D485A5D5DEC01C00007EB0F80B500FCC7FC23227EA125> I<017FB512C090380780031500A249C7FCA21680A2131EA2158016004948C7FCA25C5CEB7FFEEB 7806A3EBF004A25DEC0002485AA25D150C4848130815185D15700007EB03F0B65A22227EA124> 69 D<90387FFFF0903807801C150F8190390F000380A4011E1307A3ED0F005B151E5D15704948 5AD97FFFC7FC0178C8FCA25BA4485AA4485AA41207EAFFFC21227EA11F>80 D<90387FFFE0903807803C150E81D90F001380150316C0A2011EEB0780A3ED0F0049131E5D1570 EC01C0D97FFEC7FC9038780780EC01C081496C7EA44848485AA44848485A1640A2020113801207 3AFFFC00E300C8123C22237EA125>82 D<903803F01090380E0C20903818026090382001E0EB40 0001C013C05B1201A200031480A21500A27FEA01F013FE3800FFE06D7EEB1FF8EB01FCEB003C14 1C80A30020130CA3140800601318141000705B5C00C85BD8C603C7FCEA81FC1C247DA21E>I<00 1FB512FE391E01E00E001814061230382003C0A200401404A2EB07801280A20000140049C7FCA4 131EA45BA45BA45BA41201B512C01F227EA11D>I<3A3FFE01FF803A03C0003C00153015104848 5BA448C75AA4001E5CA44849C7FCA4481302A400705B12F05C12705C5C6C5B5C6C48C8FCEA0606 EA01F821237DA121>I<3BFFF03FFC03FF3B1F8007E000786C486C481360A217401780A20207EB 0100A2020B1302A202135B02235BA202435B018013E0000701815BA2D981015B018314C001825C 018401E1C7FCA2018813E2A2019013E4A201A013E801C013F0A201805B120390C75AA200025C30 237DA12E>87 D97 DI<133FEBE080380380C0EA0701EA0E03121C003CC7FCA25AA35A A400701340A23830018038380200EA1C1CEA07E012157E9415>I<137CEA0382EA0701120E121C 1238EA7802EA7004EAFFF800F0C7FCA25AA41480A238700300EA3004EA1838EA0FC011157D9417 >101 D<141EEC638014C71301ECC30014801303A449C7FCA4EBFFF8010EC7FCA65BA55BA55BA4 
136013E0A35B1270EAF18090C8FC1262123C192D7EA218>I<13E0A21201EA00C01300A9121E12 23EA4380A21283A2EA87001207120EA35AA25A132013401270A2EA3080EA3100121E0B227EA111 >105 D<14E01301A2EB00C01400A9131E1323EB43801383EA0103A338000700A4130EA45BA45B A45BA3EA70E0EAF0C0EAF1800063C7FC123E132C81A114>I<13F0EA0FE01200A3485AA4485AA4 48C7FC14F0EB0308EB0438380E08781310EB2030EB4000485A001FC7FC13C0EA1C70487EA27F14 1038703820A3EB184038E00C803860070015237DA219>II<393C07E01F3A46183061 803A47201880C03A87401D00E0EB801E141C1300000E90383801C0A4489038700380A2ED070016 044801E01308150EA2ED0610267001C01320D83000EB03C026157E942B>I<383C07C038461860 384720303887403813801300A2000E1370A44813E0A2EB01C014C1003813C2EB0382A2EB018400 701388383000F018157E941D>I<133EEBC180380380C0380700E0120E4813F0123CA25AA338F0 01E0A214C0130300701380EB07001306EA381C6C5AEA07E014157E9417>I<3803C0F03804631C EB740EEA0878EB7007A2140FEA00E0A43801C01EA3143C38038038A2EBC07014E038072180EB1E 0090C7FCA2120EA45AA3EAFFC0181F819418>I114 D<137E138138030080EA0201EA0603140090C7FC 120713F8EA03FE6C7EEA003FEB07801303127000F01300A2EAE002EA4004EA3018EA0FE011157E 9417>I<136013E0A4EA01C0A4EA0380EAFFFCEA0380A2EA0700A4120EA45AA31308EA3810A213 20121813C0EA07000E1F7F9E12>I<001EEB18180023EB383CD84380133EEC701E0083140E1506 EA87000007EBE004120EA3391C01C008A31510A2152001031340D80C0413C0390708E1003801F0 3E1F157E9423>119 D E /Fl 43 121 df12 D45 D49 DII<157815F8140114031407A2140F141F143F147F147714F7EB01E7EB03 C7EB07871407130F131E133C1378137013F0EA01E0EA03C0EA0780EA0F00A2121E5A5A5AB712F0 A3C7380FF800A9010FB512F0A3242E7EAD29>I<000C1438390FC003F890B5FC15F015E015C015 80ECFE005C14F091C7FC90C8FCA7EB0FF8EB7FFF90B512C09038F01FE09038800FF090380007F8 000E14FCC7120315FEA215FFA2121E123FEA7F8012FF13C015FE1380A2397F0007FC007C14F800 3CEB0FF06CEB1FE0390FC07FC06CB512800001EBFE0038003FE0202E7CAD29>II<1238123E003FB612C0A316804815005D5D5D5D387800010070 495A4A5A00F0495A4891C7FC141E143EC75A5CA2495AA213035C1307A2130FA2495AA3133FA513 
7FA86D5AA2010EC8FC22307BAF29>I<157CA215FEA34A7EA24A7FA24A7FA34A7F157F021F7FEC 1E3FA2023E7FEC3C1F027C7FEC780FA202F87FECF0070101804A7EA20103814A7E0107814A7EA2 49B67EA24981011EC7123FA24981161F017C810178140FA2496E7EA2000182486C80B5D8C001B5 12FEA337317DB03E>65 DI<913A03FF8001 80023FEBF00349B5EAFC0F01079038007F1FD91FF8EB0FBFD93FE0EB03FFD9FF8013004890C812 7F4848153F485A171F485A001F160F5B003F1607A2127F5B94C7FCA212FFA9127FA36DED038012 3FA2121F6D1507000F17006C7E170E6C6C151E6C6C5D6C6D5CD93FE05CD91FF8EB03E0D907FFEB 3F800101D9FFFEC7FCD9003F13F80203138031317BB03C>II II73 D77 D80 D82 D<90391FF0018090B51203000314C73907F00FFF380F800148C7127F48141F 003E140F127E150712FEA215037EA26D90C7FC13E0EA7FFCEBFFE014FE6CEBFFC06C14F0816C80 0003806C806C6C1480131F010014C01407020013E0153FA2151F126000E0140FA316C07EA26CEC 1F807EB4EC3F0001C0137E9038FC01FC00F1B55AD8E03F13E0D8C00790C7FC23317BB02E>I85 D87 D97 D99 DII<14FF 010713C0011F13E090383FC7F090387F0FF813FE120113FC0003EB07F0EC03E0EC01C091C7FCA7 B512F8A3D803FCC7FCB3A8387FFFF0A31D327EB119>I<90391FF007E09039FFFE3FF0489038FF 7FF83907F83FF1390FE00FE1EDE0F03A1FC007F0601600003F80A5001F5CA26C6C485AA23907F8 3FC090B5C7FC00065B380E1FF090C9FC121EA2121F7F90B512C06C14F815FE6C806C15804815C0 001F15E048C7127F007EEC0FF04814071503A4007EEC07E06CEC0FC0D81FC0EB3F803A0FF801FF 006CB55AC614F0011F1380252F7E9F29>III107 D I<2703F007F8EB0FF000FFD93FFFEB7FFE4A6DB5FC913BF03FC1E07F803D0FF1C01FE3803FC03C 07F3000FE6001F01F602FC14E013FE495CA2495CB3B500C1B50083B5FCA340207D9F45>I<3903 F007F800FFEB3FFF4A7F9138F03FC03A0FF1C01FE03907F3000F01F68013FE5BA25BB3B500C1B5 1280A329207D9F2E>II<3901F80FF000FFEB7FFE 01F9B512809039FFE07FC0000F9038001FE06C48EB0FF001F8EB07F816FC150316FEA2150116FF A816FE1503A216FC15076D14F86DEB0FF06DEB1FE09039FBE07FC001F9B512009038F87FFEEC1F E091C8FCABB512C0A3282E7E9F2E>I<3803F03F00FFEB7FC09038F1FFE09038F3C7F0000FEB8F F83807F70F13F613FE9038FC07F0EC03E0EC0080491300B2B512E0A31D207E9F22>114 DI<1338A51378A313F8A2120112031207121FB5 
12FEA33807F800B01407A70003130E13FC3801FE1C3800FFF8EB7FF0EB0FE0182E7EAD20>IIII< B5EBFFFCA3D807F8EB1F806C6CEB1E006C6C5B6C6C5B6E5A90387FC1E0133F90381FE3C090380F F7806DB4C7FC5C130313016D7E497F81497F9038079FF0EB0F0F90381E07F8496C7E496C7E01F0 7F48486C1380ED7FC00007143F3AFFF801FFFEA327207E9F2C>I E /Fm 84 125 df<90381F83E09038706E309038C07C78380180F8000313F03907007000A9B612C03907 007000B21478397FE3FF801D2380A21C>11 DII<90380FC07F90397031C080 9039E00B00402601801E13E00003EB3E013807003C91381C00C01600A7B712E03907001C011500 B23A7FF1FFCFFE272380A229>I34 D37 D<127012F812FCA212741204A412 08A21210A212201240060F7CA20E>39 D<132013401380EA01005A12061204120CA25AA25AA312 701260A312E0AE1260A312701230A37EA27EA2120412067E7EEA0080134013200B327CA413>I< 7E12407E7E12187E12041206A27EA2EA0180A313C01200A313E0AE13C0A312011380A3EA0300A2 1206A21204120C5A12105A5A5A0B327DA413>I<7FA538C08180EAE08338388E00EA0C98EA03E0 6C5A487EEA0C98EA388E38E08380EAC08138008000A511157DA418>I<497EB0B612FEA2390001 8000B01F227D9C26>I<127012F812FCA212741204A41208A21210A212201240060F7C840E>II<127012F8A3127005057C840E>I<14801301A2EB0300A31306A35BA35BA3 5BA35BA35BA3485AA448C7FCA31206A35AA35AA35AA35AA35AA311317DA418>II<13801203120F12F31203B3A9EA07C0EAFFFE0F217CA018>III<13021306130EA2131EA2132E134EA2138EA2EA010E1202A212 04A212081210A21220A212401280B512F838000E00A7131F3801FFF015217FA018>I<00101380 381E0700EA1FFF5B13F8EA13E00010C7FCA613F8EA130EEA1407381803801210380001C0A214E0 A4127012F0A200E013C01280EA4003148038200700EA1006EA0C1CEA03F013227EA018>I<137E EA01C138030080380601C0EA0E03121C381801800038C7FCA212781270A2EAF0F8EAF30CEAF406 7F00F81380EB01C012F014E0A51270A3003813C0A238180380001C1300EA0C06EA070CEA01F013 227EA018>I<12401260387FFFE014C0A23840008038C0010012801302A2485A5BA25B13301320 1360A313E05BA21201A41203A86C5A13237DA118>III<12 7012F8A312701200AB127012F8A3127005157C940E>I<127012F8A312701200AB127012F8A312 781208A41210A312201240A2051F7C940E>I61 D63 D<497EA3497EA3EB05E0A2EB0DF01308 
A2497E1478A2497EA3497EA3497EA290B5FC3901000780A24814C000021303A24814E01401A200 0CEB00F0A2003EEB01F839FF800FFF20237EA225>65 DI<903807E0109038381830EBE006 3901C0017039038000F048C7FC000E1470121E001C1430123CA2007C14101278A200F81400A812 781510127C123CA2001C1420121E000E14407E6C6C13803901C001003800E002EB381CEB07E01C 247DA223>IIII<90 3807F00890383C0C18EBE0023901C001B839038000F848C71278481438121E15185AA2007C1408 1278A200F81400A7EC1FFF0078EB00F81578127C123CA27EA27E7E6C6C13B86C7E3900E0031890 383C0C08903807F00020247DA226>I<39FFFC3FFF390FC003F039078001E0AE90B5FCEB8001AF 390FC003F039FFFC3FFF20227EA125>I I<3803FFF038001F007FB3A6127012F8A2130EEAF01EEA401C6C5AEA1870EA07C014237EA119> I<39FFFC03FF390FC000F86C48136015405D4AC7FC14025C5C5C5C5C5C1381EB83C0EB87E01389 EB88F01390EBA078EBC03C13808080A26E7E8114036E7EA26E7E81486C7F3AFFFC07FF8021227E A126>III<39FF8007FF3907C000F81570D805E01320EA04F0A21378137C133C7F131F7FEB0780A2 EB03C0EB01E0A2EB00F014F81478143C143E141E140FA2EC07A0EC03E0A21401A21400000E1460 121FD8FFE0132020227EA125>IIIII<3803F020380C0C60EA18023830 01E0EA70000060136012E0A21420A36C1300A21278127FEA3FF0EA1FFE6C7E0003138038003FC0 EB07E01301EB00F0A214707EA46C1360A26C13C07E38C8018038C60700EA81FC14247DA21B>I< 007FB512F839780780780060141800401408A300C0140C00801404A400001400B3A3497E0003B5 FC1E227EA123>I<39FFFC07FF390FC000F86C4813701520B3A5000314407FA2000114806C7E90 38600100EB3006EB1C08EB03F020237EA125>I<3BFFF03FFC03FE3B1F8007E000F86C486C4813 701720A26C6C6C6C1340A32703C002F01380A33B01E004780100A33A00F0083C02A39039F8183E 06903978101E04A2137C90393C200F08A390391E400790A390390F8003E0A36D486C5AA36D5C01 0213002F237FA132>87 D<397FF807FF3907E001F83903C000E06D5B00015C6C6C48C7FC6D5AEB 7802EB7C04EB3E0CEB1E08EB1F10EB0FB0EB07A014C06D7E130180497EEB0278EB047CEB0C3EEB 081EEB101F9038300F80EB200701407F9038C003E0EB8001D801007F4813004880391F8001FC3A FFE007FFC022227FA125>II<12FEA212C0B3B3A912FEA207317BA40E>91 DI< 12FEA21206B3B3A912FEA207317FA40E>I97 
D<120E12FE121E120EAB131FEB61C0EB8060380F0030000E1338143C141C141EA7141C143C1438 000F1370380C8060EB41C038083F0017237FA21B>II<14E013 0F13011300ABEA01F8EA0704EA0C02EA1C01EA38001278127012F0A7127012781238EA1801EA0C 0238070CF03801F0FE17237EA21B>II<133C 13C6EA018F1203130FEA0700A9EAFFF8EA0700B21380EA7FF8102380A20F>I<14703801F19838 071E18EA0E0E381C0700A2003C1380A4001C1300A2EA0E0EEA0F1CEA19F00010C7FCA21218A2EA 1FFE380FFFC014E0383800F0006013300040131812C0A300601330A2003813E0380E03803803FE 0015217F9518>I<120E12FE121E120EABEB1F80EB60C0EB80E0380F0070A2120EAF38FFE7FF18 237FA21B>I<121C123EA3121CC7FCA8120E12FE121E120EB1EAFFC00A227FA10E>II< 120E12FE121E120EABEB03FCEB01F014C01480EB02005B5B5B133813F8EA0F1CEA0E1E130E7F14 80EB03C0130114E0EB00F014F838FFE3FE17237FA21A>I<120E12FE121E120EB3ADEAFFE00B23 7FA20E>I<390E1FC07F3AFE60E183803A1E807201C03A0F003C00E0A2000E1338AF3AFFE3FF8F FE27157F942A>I<380E1F8038FE60C0381E80E0380F0070A2120EAF38FFE7FF18157F941B>III<3801F82038070460EA0E02EA1C01003813E0EA7800A25AA712701278EA3801121CEA0C02EA 070CEA01F0C7FCA9EB0FFE171F7E941A>III<1202A41206A3120E 121E123EEAFFF8EA0E00AB1304A6EA07081203EA01F00E1F7F9E13>I<000E137038FE07F0EA1E 00000E1370AD14F0A238060170380382783800FC7F18157F941B>I<38FFC1FE381E0078000E13 301420A26C1340A238038080A33801C100A2EA00E2A31374A21338A3131017157F941A>I<39FF 8FF8FF391E01E03C001CEBC018120EECE010A239070260201470A239038430401438A23901C818 80141CA23900F00D00140FA2EB6006A320157F9423>I<38FF83FE381F01F0380E00C06C138038 0381001383EA01C2EA00E41378A21338133C134E138EEA0187EB0380380201C0000413E0EA0C00 383E01F038FF03FE17157F941A>I<38FFC1FE381E0078000E13301420A26C1340A238038080A3 3801C100A2EA00E2A31374A21338A31310A25BA35B12F05B12F10043C7FC123C171F7F941A>I< 383FFFC038380380EA300700201300EA600EEA401C133C1338C65A5B12015B38038040EA07005A 000E13C04813805AEA7801EA7007B5FC12157F9416>III E /Fn 51 121 df12 D<13181378EA01F812FFA21201B3A7387FFF E0A213207C9F1C>49 DI<13FE3807FFC0380F07E0381E03F0123FEB81F8A3EA1F 
0314F0120014E0EB07C0EB1F803801FE007F380007C0EB01F014F8EB00FCA2003C13FE127EB4FC A314FCEA7E01007813F8381E07F0380FFFC03801FE0017207E9F1C>I<14E013011303A2130713 0F131FA21337137713E7EA01C71387EA03071207120E120C12181238127012E0B512FEA2380007 E0A7EBFFFEA217207E9F1C>I<00101320381E01E0381FFFC0148014005B13F8EA1BC00018C7FC A4EA19FCEA1FFF381E0FC0381807E01303000013F0A214F8A21238127C12FEA200FC13F0A23870 07E0003013C0381C1F80380FFF00EA03F815207D9F1C>II<1260 1278387FFFFEA214FC14F8A214F038E0006014C038C00180EB0300A2EA00065B131C1318133813 78A25BA31201A31203A76C5A17227DA11C>I57 D<1470A214F8A3497EA2497EA3EB06FF80010E7FEB0C3FA201187F141F01387FEB300FA201607F 140701E07F90B5FCA239018001FCA200038090C7FCA20006147FA23AFFE00FFFF8A225227EA12A >65 DII IIIIII76 DIIII82 D<3801FC043807FF8C381F03FC383C 007C007C133C0078131CA200F8130CA27E1400B4FC13E06CB4FC14C06C13F06C13F86C13FC0003 13FEEA003FEB03FFEB007F143FA200C0131FA36C131EA26C133C12FCB413F838C7FFE000801380 18227DA11F>I<007FB61280A2397E03F80F00781407007014030060140100E015C0A200C01400 A400001500B3A20003B512F8A222227EA127>II87 D89 D97 DI II<13FE3807FF80380F87C0381E01E0 003E13F0EA7C0014F812FCA2B5FCA200FCC7FCA3127CA2127E003E13186C1330380FC0703803FF C0C6130015167E951A>II<3803FC1E380FFF7F381F0F8F383E07CF383C03C0007C13 E0A5003C13C0EA3E07381F0F80EBFF00EA13FC0030C7FCA21238383FFF806C13F06C13F84813FC EA380048133E00F0131EA40078133C007C137C383F01F8380FFFE00001130018217E951C>II<121C 123E127FA3123E121CC7FCA7B4FCA2121FB2EAFFE0A20B247EA310>I107 DI<3AFF07F007F090391F FC1FFC3A1F303E303E01401340496C487EA201001300AE3BFFE0FFE0FFE0A22B167E9530>I<38 FF07E0EB1FF8381F307CEB403CEB803EA21300AE39FFE1FFC0A21A167E951F>I<13FE3807FFC0 380F83E0381E00F0003E13F848137CA300FC137EA7007C137CA26C13F8381F01F0380F83E03807 FFC03800FE0017167E951C>I<38FF0FE0EB3FF8381FF07CEB803E497E1580A2EC0FC0A8EC1F80 A29038803F00EBC03EEBE0FCEB3FF8EB0FC090C8FCA8EAFFE0A21A207E951F>I114 DI<487EA41203A21207A2120F123FB5FCA2EA0F80AB EB8180A5EB8300EA07C3EA03FEEA00F811207F9F16>I<38FF01FEA2381F003EAF147E14FE380F 
81BE3907FF3FC0EA01FC1A167E951F>I<39FFE01FE0A2390F800600A2EBC00E0007130CEBE01C 00031318A26C6C5AA26C6C5AA2EB7CC0A2137F6D5AA26DC7FCA2130EA21B167F951E>I<3AFFE7 FF07F8A23A1F007800C0D80F80EB0180147CA23A07C07E030014DE01E05B0003EBDF06EBE18FD8 01F15B01F3138C9038FB079C000014D8EBFE03017E13F0A2EB7C01013C5BEB380001185B25167F 9528>I<39FFE07FC0A2390F801C006C6C5A6C6C5AEBF0606C6C5A3800F980137F6DC7FC7F8049 7E1337EB63E0EBC1F03801C0F848487E3807007E000E133E39FF80FFE0A21B167F951E>I E /Fo 16 122 df<1238127C12FEA3127C1238070774861F>46 D64 D97 DI II<13 7F3801FFC0000713E04813F0381F81F8383F0078003C133C127C0078131EA2B512FEA400F0C7FC A21278A2007C131E7E381F803EEBE07C380FFFF8000313F06C13E038003F80171A7D991F>I<13 30137813FCA21378133090C7FCA6EA7FFCA4EA003CB2387FFFFCB512FEA26C13FC17267CA51F> 105 D<1303EB0780EB0FC0A2EB0780EB030090C7FCA6380FFFC0A4EA0003B3AA38200780EA700F 38F81F00EAFFFE6C5A5BEA1FE012337DA51F>I108 D<38FF87E0EB9FF0EBBFF8EBFFFC3807F83CEBE01E13C0A21380AE39FFFC7FF0A41C1A7F991F> 110 D<13FCEA03FF481380001F13E01387383E01F0387C00F800781378A248133CA76C137C0078 1378007C13F8A2383E01F0381F87E013FF000713806C1300EA00FC161A7C991F>I<38FFE07E90 38E1FF8001E713C013EF3801FF879038FE038049C7FC5B5BA35BABB512F0A41A1A7E991F>114 D<3803FC70380FFFF0123F5AEA7C03EAF801EAF000A3007C1300EA7FE0EA1FFF000713C0C613E0 EB03F0EB00F80070133C12F0A27E6C137C38FF01F8EBFFF0A200E713C038E1FE00161A7C991F> I<38FF83FEA43807801EAF143EA2EBC0FE6CB512F0A26C139F38007E1F1C1A7F991F>117 D<397FE07FE039FFF0FFF0A2397FE07FE03907000E0013800003131E141CEA01C0143C1438EA00 E0A26D5A1370A26D5AA36D5A131DA2EB0F80A2130791C7FCA3130EA35B1238EA7C3C13F8EA7FF0 5B6C5AEA0F801C277F991F>121 D E /Fp 52 122 df<127812FCA212FEA2127A1202A41204A3 12081210A21220124007127B8511>44 DI<127812FCA4127806067B85 11>I<137F3801C1C0380780F0380F0078000E1338487F003C131EA3487FA400F81480AF007814 00A46C131EA3001C131C6C5B000F13786C6C5A3801C1C0D8007FC7FC19297EA71E>48 D<13101370EA01F0120F12FE12F01200B3AD487E387FFFE0A213287BA71E>I<13FE3807FF8038 
0E07E0381803F0382001F8130048137CA200F8137E7E143EA30078137EC7FC147CA214F8A2EB01 F014E0EB03C0EB07801400130E5B5B5B13605B38018002EA0300000613045A5A0010130C383FFF FC4813F8B5FCA217287DA71E>I<137F3803FFC0380701F0380C00F80010137C121C003E137E14 3EA2121E000C137EC7127CA214785C5C495A0107C7FC13FFEB01E06D7E147880143E80A21580A2 1230127812FCA215005A00405B143E00305B6C5B380F01F03803FFC0C66CC7FC19297EA71E>I< 1460A214E01301A21303A213051309A213111321A213411381A2EA01011202A212041208A21210 12301220124012C0B61280A2390001E000A8497E90387FFF80A219287EA71E>I<00181318001F 13F0EBFFE014C014801400EA11F80010C7FCA8137E38118380381600C0001813E000101370C712 78143CA3143EA3127012F8A3143C12800040137C14787E003013F0381801E0380E07C03807FF00 EA01FC17297DA71E>I I<12201238003FB51280A215005A3860000200405BA25C485B5CC7FC5C5CA249C7FC5B13021306 A25BA2131CA35BA31378A413F8A81370192A7DA81E>I<137F3801FFC0380381F038060078487F 001C131C00187F1238A3123CA2003E5B381F8018EBC038380FF0606C6C5A6CB45A6C90C7FC3800 7FC0497E38030FF8380603FC381C01FE3838007E00307F0070130FEC07805A1403A46C14000070 5B0078130600385B001E1338380F80F03803FFE0C66CC7FC19297EA71E>I<137F3801FFC03807 C1E0380F0070001E7F001C133C003C131C48131EA200F87FA41580A41278141F7EA2001C132F6C 134F6C13CF3803810FEA007E01001300A3141EA35C121C003E5B1470003C5B381801C0381C0780 D80FFEC7FCEA03F819297EA71E>I<1418A3143CA3147EA214FF149FA29038011F80140FA29038 0207C0A3496C7EA3496C7EA201187FEB1000A2497F157C90383FFFFC497F903840003EA2497FA3 48C7EA0F80A30002EC07C01207D81F80EB0FE0D8FFF090B5FCA2282A7EA92D>65 DI<02FF13100107EBE03090381FC07890393E000C7001F8EB06F0484813 03484813014848130048481470A248C812305A123E1610127EA2007C150012FCA9127C127E1610 123EA2123F6C15206C7E16606C6C14406C6C14806C6C13016C6CEB0300013E130E90381FC03890 3807FFE001001380242B7DA92B>II70 D<02FF13100107EBE03090381FC07890393E000C7001F8EB06F04848130348481301484813 0048481470A248C812305A123E1610127EA2007C150012FCA892B5FC127C007EEC03F01501123E A2123F7E6C7EA26C7E6C7E6C6C13026C7E013EEB0C7090391FC03830903907FFE0100100EB8000 282B7DA92F>I73 
D<0003B5FCA2380007E01303B3AA1230127812FCA214C0EAF8070040138038200F00EA300EEA0C 3CEA03F0182A7DA81F>IIIII82 DI<007FB612F8A2397C007C00007015380060 151800401508A200C0150CA2481504A5C71400B3A614FE90B512FEA226297EA82B>II<3DFFFE03FFF803FFC0A23D0FE0003F80007E 006C486DC712186C7E6F6C1310A26C6C5E82A26C6C5EED13E0A26D16C0017CD933F05B1521137E 013E6E48C7FC1540A26D1502ED807CA2D90F805C913881003EA2D907C15C02C2131FA2D903E25C 02E4EB0F90A2D901F414A002F8EB07E0A201005D4A1303A202705C02601301A33A2A7FA83D>87 D97 DII<140F49B4FCA2EB001F 80AC133F3801C0CF3803802F380F001F121E001C7F123C127C1278A212F8A71278A27EA26C5B00 0E132F6CEB4F803901C18FF838007E0F1D2A7EA921>I<137E3803C380380700E0000E13F04813 70003C1378143848133CA212F8A2B512FC00F8C7FCA51278127C003C1304A26C1308000E13106C 13203801C0C038007F00161A7E991B>I<131FEB70C0EBE1E0EA01C31203EB81C0380780801400 A9EAFFFEA2EA0780B3A37FEAFFFEA2132A7FA912>III<120FEA1F80A4EA0F00C7FCA9EA0780 127FA2120F1207B3A2EAFFF8A20D297FA811>I107 DI<260781F813FC3AFF860E0307903A98070C0380D80FA0019013C00007903903D001E001 C013E0A2018013C0B13BFFFC7FFE3FFFA2301A7F9933>I<380783F838FF8C1CEB900E380FA007 0007148013C0A21380B139FFFCFFFCA21E1A7F9921>I<137F3801C1C038070070000E7F487F00 3C131EA2487FA200F81480A800781400A26C131EA26C5B000E13386C5B3801C1C0D8007FC7FC19 1A7E991E>I<380783F038FF8C1CEBB00F3907A007809038C003C0018013E0140115F0A2EC00F8 A715F01401A215E0EC03C013C0EC07809038A00E00EB983CEB87E00180C7FCAAEAFFFCA21D267F 9921>I<380787C038FF98E0EB91F0EA0FA1EA07C1EBC0E014005BB07FEAFFFEA2141A7F9917> 114 D<3807F840381C06C0EA3001EA6000A200E01340A27E6C1300127EEA7FF0EA3FFEEA0FFF00 03138038003FC01307388001E0A2EAC000A36C13C0EAF00100F8138038C40700EA83F8131A7E99 18>I<7FA41201A31203A21207120F381FFF80B5FC38078000AD1440A73803C08012013800E100 133E12257FA417>I<390780078000FF13FFA2000F130F00071307B0140FA2000313173901C027 C03900E047FCEB3F871E1A7F9921>I<39FFF00FF8A2390F8003C000071480EC01003803C002A2 13E000015BA26C6C5AA2EBF818EB7810A26D5AA2EB3E60EB1E40A26D5AA26DC7FCA313021D1A7F 
9920>I<3AFFF1FFC1FFA23A0F803E0078D9001E1330D807801420A2141F6C6C48134014271580 2601E0671380144315C03A00F081C100A201F813E390387900E2A2017D13F66D1374A2011E1378 011C1338A2010C133001081310281A7F992B>I<39FFF00FF8A2390F8003C000071480EC010038 03C002A213E000015BA26C6C5AA2EBF818EB7810A26D5AA2EB3E60EB1E40A26D5AA26DC7FCA313 02A25BA35B1270EAF810A25B485AEA6080001FC8FC1D267F9920>121 D E /Fq 25 122 df45 D<150C151EA3153FA34B7EA34B7EA39138019FE0 A202037F150FA202077FEC0607A2020C7F1503A202187F1501A24A6C7EA34A6D7EA214E04A6D7E A20101814A131FA201038191B6FCA249810106C71207A249811603A2496E7EA3496E7EA2017082 0160157FA201F082EA03F8D80FFC4A487EB500C0013FEBFFC0A33A3D7DBC41>65 DI69 D80 D85 D97 D99 DII<143F903801FFC0903803E0E090380781 F090380F83F8EB1F07133E137EEC03F090387C01E001FCC7FCAEB512FCA3D800FCC7FCB3AC487E 387FFFFCA31D3D7FBC1A>I<903907F001F890393FFE0FFC90397C1F1E3E9038F007F03A01E003 E01C2603C00113080007ECF000000F80EB8000001F80A7000F5CEBC00100075C00035C6C6C485A 6D485A26037C1FC7FC38073FFE380607F090C9FC120EA3120FA2EA07C090B512C06C14FC6C14FF 6C1580000315C03A0780003FE0001FC7EA07F0003EEC01F8003C1400127C0078157C12F8A5007C 15F8A26CEC01F06CEC03E06C6CEB07C0D803E0EB1F00D801FC13FE39003FFFF00107138027397E A52B>III108 D<2701F803F8EB03F800FFD91FFFEB1FFF91 3B3C0F803C0F80913BE007C0E007C03D07F9C003E1C003E02601FB00D9F3007F0301140101FE02 FE80A2495CA2495CB3A5486C496C497EB500F0B500F0B512F0A344267EA549>I<3901F807F800 FFEB1FFEEC781F9138E00F803A07F98007C02601FB007F150301FE805BA35BB3A5486C497EB500 F1B512E0A32B267EA530>II<3901F80FF000FFEB3FFEECF01F9039F9C007C03A03FB0003E0D801 FE6D7E496D7E8249147EA2167F821780A2161F17C0A91780163FA217005E167E5E7F4B5A6D495A 01FB495A9039F9C00FC09026F8F03FC7FCEC3FFCEC0FE091C9FCAC487EB512F0A32A377EA530> I<3903F00F8000FFEB3FE0EC70F0ECC1F83807F1833801F30313F6EC01F0EC004001FC1300A45B B3A3487EB512F8A31D267EA522>114 D<90387F81803803FFE3380F807F381E001F00381307A2 481303A200F01301A37EA200FE90C7FCEA7F8013FC383FFFC06C13F06C13FC00037F6C7FD8001F 
13801300EC1FC00040130F00C0EB07E014036C1301A47E15C06C13036C1480EC070000F7130E38 E3C03C38C0FFF8EB3FC01B287DA622>I<1318A51338A41378A213F8A2120112031207001FB5FC B6FCA2D801F8C7FCB2EC0180A91200EC030013FC137CEB3E066D5AEB0FF8EB03F019367EB421> III121 D E end %%EndProlog %%BeginSetup %%Feature: *Resolution 300 TeXDict begin %%EndSetup %%Page: 0 1 bop 133 776 a Fq(A)29 b(P)n(arallelism-Based)h(Analytic)h(Approac)n(h)c(to) 131 880 y(P)n(erformance)g(Ev)-5 b(aluation)31 b(Using)e(Application)757 984 y(Programs)302 1244 y Fp(Da)n(vid)19 b(K.)h(Bradley)379 b(John)20 b(L.)g(Larson)203 1319 y Fo(bradley@)q(csr)q(d.u)q(iu)q(c.e)q(du) 154 b(jlarson@csr)q(d.u)q(iu)q(c.e)q(du)217 1543 y Fp(Cen)n(ter)21 b(for)f(Sup)r(ercomputing)f(Researc)n(h)g(and)h(Dev)n(elopmen)n(t)295 1618 y(465)f(Computer)h(and)g(Systems)e(Researc)n(h)h(Lab)r(oratory)705 1692 y(1308)h(W.)f(Main)h(St.)638 1767 y(Urbana,)g(IL)g(61801-2307)293 1917 y(This)g(w)n(ork)g(w)n(as)g(supp)r(orted)g(in)g(part)h(b)n(y)f(the)g (National)182 1991 y(Science)e(F)-5 b(oundation)20 b(under)h(Gran)n(t)g(No.) 
26 b(NSF)19 b(ASC)i(89-02829.)767 2133 y(April)f(1,)f(1993)p eop %%Page: 1 2 bop eop %%Page: 1 3 bop 830 768 a Fn(Abstract)73 883 y Fm(In)11 b(this)g(pap)q(er)g(w)o(e)g (presen)o(t)f(a)i(brief)e(o)o(v)o(erview)f(of)i(p)q(erformance)f(ev)m (aluation)h(and)h(b)q(enc)o(hmarking.)0 943 y(W)l(e)k(demonstrate)f(that)i (traditional)f(p)q(erformance)e(measuremen)o(ts)g(recorded)h(in)h(these)g (activities)0 1003 y(are)i(really)g(a)h(direct)e(measuremen)o(t)e(of)k(the)f (parallelism)e(in)i(soft)o(w)o(are)h(and)g(hardw)o(are.)28 b(A)18 b(frame-)0 1063 y(w)o(ork)c(called)e(the)i(P)o(ath)g(to)g(P)o (erformance)e(is)h(dev)o(elop)q(ed)g(that)h(iden)o(ti\014es)e(the)h(agen)o (ts)h(and)h(activities)0 1123 y(that)j(c)o(hange)g(the)g(parallelism)e(as)i (it)g(mo)o(v)o(es)d(from)i(problem)g(to)h(solution.)27 b(W)l(e)17 b(sho)o(w)i(where)e(v)m(ar-)0 1183 y(ious)23 b(curren)o(t)e(application)h(b)q (enc)o(hmarks)f(apply)h(prob)q(es)h(on)g(the)f(P)o(ath)h(and)g(whic)o(h)e (agen)o(ts)i(are)0 1244 y(b)q(eing)c(measured.)29 b(W)l(e)19 b(recast)g(the)g(traditional)g(time-based)f(p)q(erformance)g(measuremen)o(ts) e(in)o(to)0 1304 y(parallelism-based)j(p)q(erformance)g(measuremen)n(ts)f(to) j(sho)o(w)f(that)h(understanding)g(p)q(erformance)0 1364 y(implies)12 b(understanding)j(the)g(parallelism.)j(A)c(sim)o(ulation)f(to)q(ol)i(and)h (metho)q(dology)e(are)h(describ)q(ed)0 1424 y(for)e(measuring)e(and)i (comparing)f(the)g(executed)f(parallelism)g(on)i(a)f(single)g(CRA)l(Y)g(Y-MP) g(CPU.)g(W)l(e)0 1484 y(apply)17 b(our)g(metho)q(dology)g(to)g(sev)o(eral)f (of)i(the)e(P)o(erfect)g(Benc)o(hmarks)f(to)i(quan)o(tify)f(their)g(executed) 0 1545 y(parallelism)c(on)j(this)f(mac)o(hine.)k(Our)c(results)g(suggest)i (that)e(since)g(some)f(of)i(the)f(b)q(enc)o(hmarks)f(ha)o(v)o(e)0 1605 y(a)k(similar)d(mix)h(of)i(di\013eren)o(t)e(lev)o(els)g(of)i (parallelism)d(on)j(the)f(Y-MP)l(,)g(the)g(b)q(enc)o(hmarking)f(utilit)o(y)g (of)0 1665 y(these)k(programs)g(on)g(the)g(Y-MP)g(ma)o(y)e(b)q(e)i 
(questioned)g(b)q(ecause)g(they)g(exercise)e(the)i(mac)o(hine)d(in)0 1725 y(the)g(same)f(w)o(a)o(y)l(.)931 2828 y(i)p eop %%Page: 2 4 bop 924 2828 a Fm(ii)p eop %%Page: 3 5 bop 0 203 a Fl(Con)n(ten)n(ts)0 312 y Fn(1)45 b(In)n(tro)r(duction)1465 b(1)0 421 y(2)45 b(P)n(erformance)18 b(Ev)m(aluation)f(as)i(a)g(Scien)n (ti\014c)f(Discipline)572 b(1)0 530 y(3)45 b(Preserving)17 b(and)j(Exploiting)c(P)n(arallelism)841 b(3)73 590 y Fm(3.1)50 b(Soft)o(w)o(are)17 b(P)o(arallelism)43 b Fk(:)25 b(:)f(:)h(:)f(:)g(:)h(:)f (:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f (:)g(:)h(:)94 b Fm(3)73 651 y(3.2)50 b(Hardw)o(are)17 b(P)o(arallelism)j Fk(:)25 b(:)f(:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)f (:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)94 b Fm(5)73 711 y(3.3)50 b(Realizing)15 b(P)o(arallelism)f(at)j(Run)o(time)40 b Fk(:)25 b(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f (:)h(:)f(:)g(:)h(:)94 b Fm(5)73 771 y(3.4)50 b(Benc)o(hmarking)14 b(Metho)q(dologies)j(and)g(the)f(P)o(ath)h Fk(:)24 b(:)h(:)f(:)g(:)h(:)f(:)h (:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)94 b Fm(5)73 831 y(3.5)50 b(The)17 b(P)o(arallelism)c(Matrix:)21 b(A)16 b(Measure)g(of)g(Executed)f(P)o (arallelism)40 b Fk(:)25 b(:)f(:)h(:)f(:)g(:)h(:)94 b Fm(7)73 891 y(3.6)50 b(Quan)o(tifying)16 b(Di\013erences)f(in)h(Executed)g(P)o (arallelism)32 b Fk(:)24 b(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:) 94 b Fm(8)0 1000 y Fn(4)45 b(Rederiv)m(ation)17 b(of)h(Basic)h(P)n(arallel)f (P)n(erformance)f(Metrics)511 b(9)73 1061 y Fm(4.1)50 b(P)o(erformance)21 b Fk(:)j(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f (:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)94 b Fm(9)73 1121 y(4.2)50 b(Execution)16 b(Rate)35 b Fk(:)24 b(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h (:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)69 b Fm(10)73 1181 y(4.3)50 b(P)o(eak)16 b(Execution)g(Rate)30 b Fk(:)25 
b(:)f(:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)f(:)h(:)f (:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)69 b Fm(10)73 1241 y(4.4)50 b(Sp)q(eedup)32 b Fk(:)24 b(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h (:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h (:)f(:)g(:)h(:)69 b Fm(11)73 1301 y(4.5)50 b(E\016ciency)43 b Fk(:)25 b(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h(:)f (:)h(:)f(:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)69 b Fm(12)73 1362 y(4.6)50 b(Utilization)22 b Fk(:)j(:)f(:)h(:)f(:)h(:)f(:)h(:) f(:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h (:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)69 b Fm(12)73 1422 y(4.7)50 b(Amdahl's)15 b(La)o(w)20 b Fk(:)25 b(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)f (:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f (:)g(:)h(:)69 b Fm(13)0 1531 y Fn(5)45 b(Using)19 b(P)n(arallelism)d(to)j(Ev) m(aluate)e(Application)h(Benc)n(hmarks)395 b(13)73 1591 y Fm(5.1)50 b(Generating)17 b(P)o(arallelism)c(Matrices)j(for)g(the)g(CRA)l(Y)g(Y-MP)23 b Fk(:)i(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)69 b Fm(14)73 1651 y(5.2)50 b(P)o(arallelism)14 b(Matrices)h(for)i(Selected)e(P)o(erfect)g (Benc)o(hmarks)29 b Fk(:)c(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)g(:)h(:)69 b Fm(15)73 1711 y(5.3)50 b(Di\013erences)16 b(in)g(Executed)f(P)o(arallelism) f(for)i(Selected)f(P)o(erfect)g(Benc)o(hmarks)k Fk(:)24 b(:)h(:)69 b Fm(16)73 1771 y(5.4)50 b(Prob)q(e)17 b(P)o(oin)o(ts)f(for)h(the)f(Baseline) f(P)o(erfect)g(Benc)o(hmarks)24 b Fk(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f(:)h(:)f (:)g(:)h(:)69 b Fm(18)0 1880 y Fn(6)45 b(Conclusions)19 b(and)g(F)-5 b(uture)19 b(W)-5 b(ork)1013 b(19)0 1989 y(7)45 b(Ac)n(kno)n(wledgmen)n(ts) 1306 b(19)917 2828 y Fm(iii)p eop %%Page: 4 6 bop 918 2828 a Fm(iv)p eop %%Page: 1 7 bop 0 203 a Fl(1)83 b(In)n(tro)r(duction)0 313 y Fm(Since)17 b(their)h(inception,)f(the)h(p)q(erformance)f(of)h(computer)f(systems)g(has)h 
(enjo)o(y)o(ed)f(unpreceden)o(ted)0 373 y(gro)o(wth.)k(F)l(or)14 b(decades)f(the)g(enabling)g(force)h(b)q(ehind)f(this)g(gro)o(wth)h(has)h(b)q (een)e(the)g(steady)h(and)g(rapid)0 433 y(impro)o(v)o(em)o(en)n(t)g(of)i(the) h(basic)f(circuit)f(tec)o(hnology)l(.)21 b(Computers)16 b(b)q(ecame)f(more)g (p)q(o)o(w)o(erful)h(b)q(ecause)0 493 y(the)d(basic)f(comp)q(onen)o(ts)h (used)g(in)f(their)g(construction)h(|)g(from)f(tub)q(es)h(to)g(transistors)h (to)g(successiv)o(e)0 554 y(generations)g(of)g(in)o(tegrated)e(circuit)g(tec) o(hnology)h(|)h(b)q(ecame)e(faster,)h(smaller,)f(more)g(e\016cien)o(t,)f(and) 0 614 y(less)19 b(exp)q(ensiv)o(e.)29 b(The)19 b(con)o(tributions)g(of)g(arc) o(hitectural)f(inno)o(v)m(ations,)i(although)g(n)o(umerous)e(and)0 674 y(signi\014can)o(t,)e(w)o(ere)f(dw)o(arfed.)73 734 y(F)l(or)j(w)o(ell)e (o)o(v)o(er)h(a)h(decade)f(prop)q(onen)o(ts)h(of)g(parallel)f(computer)f (systems)g(ha)o(v)o(e)h(predicted)g(that)0 794 y(circuit)f(tec)o(hnology)h(w) o(ould)g(so)q(on)i(reac)o(h)d(limits)f(imp)q(osed)h(b)o(y)h(the)g(la)o(ws)g (of)h(ph)o(ysics,)e(and)i(this)f(ap-)0 855 y(p)q(ears)k(to)g(b)q(e)f(happ)q (ening.)34 b(The)20 b(clo)q(c)o(k)g(rates)g(of)h(the)f(fastest)g(sup)q (ercomputers)g(and)h(the)f(fastest)0 915 y(micropro)q(cessors)15 b(no)o(w)h(di\013er)f(b)o(y)f(only)i(a)f(factor)h(of)g(three.)1106 897 y Fj(1)1146 915 y Fm(W)l(e)f(are)g(en)o(tering)g(a)g(new)h(era)f(in)g (com-)0 975 y(puter)j(arc)o(hitecture)f(in)i(whic)o(h)f(almost)f(all)h (signi\014can)o(t)h(p)q(erformance)e(impro)o(v)o(em)o(en)o(ts)e(will)j(o)q (ccur)0 1035 y(b)o(y)f(exploiting)g(increased)g Fi(p)n(ar)n(al)r(lelism)i Fm(rather)f(than)g(faster)g(basic)g(tec)o(hnology)l(.)26 b(Consequen)o(tly)17 b(if)0 1095 y(the)k(\014eld)f(of)h(P)o(erformance)f(Ev)m(aluation)h(is)g(to)g (pro)o(vide)f(meaningful)g(information)g(to)h(designers)0 1156 y(and)16 b(users)f(of)g(high-p)q(erformance)g(hardw)o(are,)g(new)g Fi(p)n(ar)n(al)r(lelism-b)n(ase)n(d)h Fm(measures)e(of)h(p)q(erformance)0 1216 
y(m)o(ust)f(b)q(e)h(dev)o(elop)q(ed)f(to)h(complemen)n(t)d(the)j (traditional)g Fi(time-b)n(ase)n(d)h Fm(measures)e(suc)o(h)g(as)i(execution)0 1276 y(rate.)73 1336 y(This)j(pap)q(er)h(examines)d(P)o(erformance)g(Ev)m (aluation)i(from)f(the)h(viewp)q(oin)o(t)f(that)h(ev)o(en)f(though)0 1396 y(reducing)11 b(execution)f(time)f(is)j(the)f(ultimate)e(goal,)k (parallelism)c(is)i(the)g(essen)o(tial)f(c)o(haracteristic)g(nec-)0 1457 y(essary)15 b(for)h(ac)o(hieving)e(go)q(o)q(d)j(p)q(erformance.)j (Section)15 b(2)g(examines)e(P)o(erformance)h(Ev)m(aluation)i(as)g(a)0 1517 y(Scien)o(ti\014c)d(Discipline)f(with)j(sp)q(ecial)f(emphasis)f(on)i (the)f(role)g(of)h(careful)f(design)g(and)h(measuremen)o(t)0 1577 y(in)c(the)g(P)o(erformance)f(Ev)m(aluation)i(pro)q(cess.)21 b(Section)11 b(3)g(in)o(tro)q(duces)h(the)f(P)o(ath)h(to)g(P)o(erformance)d (and)0 1637 y(sho)o(ws)19 b(that)g(the)g(k)o(ey)e(to)i(ac)o(hieving)e(high)i (p)q(erformance)f(is)g(preserving)g(parallelism)e(across)k(eac)o(h)0 1697 y(stage)g(of)f(the)g(soft)o(w)o(are)g(dev)o(elopmen)o(t)d(pro)q(cess)k (and)g(b)o(y)e(pro)o(viding)h(hardw)o(are)g(that)h(can)f(exploit)0 1758 y(this)i(parallelism.)32 b(This)21 b(section)g(also)g(in)o(tro)q(duces)g (a)g(simple)e(example)f(of)j(a)h(parallelism-based)0 1818 y(metric,)15 b(the)j(P)o(arallelism)d(Matrix.)25 b(F)l(undamen)o(tal)17 b(time-based)f(measuremen)o(ts)f(of)j(p)q(erformance)0 1878 y(are)j(recast)f(as)i(parallelism-based)d(measuremen)n(ts)f(in)j(Section)f (4,)i(demonstrating)e(that)h(under-)0 1938 y(standing)h(p)q(erformance)f (implies)e(understanding)j(the)f(parallelism.)35 b(Section)21 b(5)h(describ)q(es)f(ho)o(w)0 1998 y(parallelism-based)13 b(p)q(erformance)h (metrics)f(migh)o(t)h(b)q(e)h(applied)f(to)i(the)f(design)g(of)g(b)q(enc)o (hmark)f(sets)0 2058 y(of)19 b(application)g(programs,)g(and)h(applies)f(suc) o(h)f(metrics)f(to)i(the)g(P)o(erfect)f(Benc)o(hmarks.)27 b(Our)19 b(re-)0 2119 y(sults)h(p)q(oin)o(t)f(to)h(some)e(p)q(ossible)i(shortcomings)f 
(in)g(curren)o(t)g(metho)q(dology)g(used)h(for)g(b)q(enc)o(hmark)0 2179 y(construction.)h(Finally)l(,)14 b(w)o(e)h(presen)o(t)g(our)h (conclusions)f(and)i(plans)e(for)h(future)f(w)o(ork)h(in)f(Section)g(6.)0 2345 y Fl(2)83 b(P)n(erformance)23 b(Ev)-5 b(aluation)25 b(as)f(a)g(Scien)n (ti\014c)e(Discipline)0 2455 y Fm(In)16 b(an)o(y)g(scien)o(ti\014c)e(or)i (engineering)f(discipline,)f(progress)j(is)f(ac)o(hiev)o(ed)e(b)o(y)i(ev)m (aluating)g(the)f(state)i(of)0 2515 y(the)e(art)g(to)g(\014nd)g(w)o(a)o(ys)g (in)f(whic)o(h)g(it)g(can)h(b)q(e)g(impro)o(v)o(ed.)j(The)d(e\016ciency)e (with)h(whic)o(h)g(this)h(progres-)p 0 2559 750 2 v 56 2589 a Fh(1)75 2604 y Fg(F)m(or)g(example,)f(2)i(nanoseconds)h(for)e(the)h(Cra)o (y-3)f(v)o(ersus)i(6.7)e(nanoseconds)i(\(150)e(Mhz\))h(for)g(the)g(DEC)f (Alpha)0 2654 y(mo)q(del)d(3000.)925 2828 y Fm(1)p eop %%Page: 2 8 bop 0 195 a Fm(sion)18 b(mo)o(v)o(es)d(forw)o(ard)j(dep)q(ends)g(in)f(part)h (on)f(the)h(accuracy)f(and)h(precision)e(of)i(the)f(metho)q(ds)g(used)0 255 y(to)e(ev)m(aluate)f(curren)o(t)g(practice.)20 b(Metho)q(ds)15 b(that)g(do)g(not)g(facilitate)e(clear)h(understanding)i(of)f(exist-)0 315 y(ing)g(problems)e(and)i(p)q(oten)o(tial)g(solutions)g(w)o(aste)f(ph)o (ysical,)g(\014nancial,)g(and)h(in)o(tellectual)e(resources.)0 376 y(Ev)m(aluation,)18 b(therefore,)f(is)h(a)g(cornerstone)g(of)g(sound)h (scien)o(ti\014c)d(and)j(engineering)e(practice.)24 b(Ac-)0 436 y(curate)16 b(and)h(meaningful)e(measuremen)n(t,)e(in)j(turn,)g(is)g(a)h (prerequisite)d(of)j(go)q(o)q(d)h(ev)m(aluation.)73 496 y(In)e(order)h(to)g (cast)f(P)o(erformance)f(Ev)m(aluation)i(as)g(scien)o(ti\014c)e(discipline,)f (one)j(m)o(ust)e(\014rst)h(de\014ne)0 556 y(\\p)q(erformance")23 b(and)i(\\ev)m(aluation.")44 b(The)24 b(p)q(erformance)f(of)h(a)g(computer)e (system)h(running)h(a)0 616 y(program)15 b(can)g(b)q(e)h(measured)e(b)o(y)g (sev)o(eral)g(criteria.)20 b(Of)15 b(ma)s(jor)f(imp)q(ortance)g(is)h(the)g (elapsed)g(time)e(a)0 677 
y(computer)d(system)g(requires)g(to)i(solv)o(e)f(a) h(problem.)18 b(By)11 b(con)o(v)o(en)o(tion)f(this)h(time)f(is)h(usually)g (measured)0 737 y(b)o(y)g(its)h(in)o(v)o(erse,)e(the)h Fi(r)n(ate)h Fm(at)g(whic)o(h)f(solutions)h(are)g(obtained.)20 b(This)11 b(rate)h(is)g(called)e(the)i Fi(p)n(erformanc)n(e)0 797 y Fm(of)g(the)f (system.)19 b(Prerequisite)10 b(to)i(ev)m(aluating)f(\(i.e.)19 b(understanding)12 b(and)g(impro)o(ving\))e(p)q(erformance)0 857 y(is)21 b(the)g(abilit)o(y)f(to)h Fi(me)n(asur)n(e)g Fm(p)q(erformance.) 35 b(The)21 b(practice)f(of)h(measuring)g(the)g(p)q(erformance)f(of)0 917 y(computer)11 b(systems)h(is)h(commonly)d(called)h Fi(b)n(enchmarking)p Fm(,)k(and)e(programs)g(used)g(to)g(p)q(erform)f(these)0 978 y(measuremen)o(ts)h(are)k(called)e Fi(b)n(enchmarks)p Fm(.)794 960 y Fj(2)73 1038 y Fm(As)i(with)f(an)o(y)h(scien)o(ti\014c)e(discipline,)f (one)j(uses)g(the)f(basic)h(steps)g(of)g(the)f(Scien)o(ti\014c)f(Metho)q(d)i (to)0 1098 y(conduct)i(meaningful)e(exp)q(erimen)o(ts)f(and)j(comm)o(unic)o (ate)d(understandable)j(results.)29 b(The)18 b(\\disci-)0 1158 y(pline")d(of)g(P)o(erformance)f(Ev)m(aluation,)h(therefore,)g(consists)g(of) h(four)g(di\013eren)o(t,)e(in)o(terrelated)f(activ-)0 1218 y(ities,)i(striving)h(to)g(answ)o(er)h(basic)f(p)q(erformance-related)f (questions:)60 1322 y(1.)24 b(Design)f({)f(What)h(p)q(erformance)e(are)i(w)o (e)f(trying)g(to)h(measure)e(and)i(ho)o(w)f(do)h(w)o(e)f(measure)122 1383 y(it?)f(Activities)14 b(include)h(creating)h(prop)q(er)h(exp)q(erimen)o (ts)c(and)k(metho)q(dologies)f(for)g(testing)h(a)122 1443 y(p)q(erformance)e (h)o(yp)q(othesis,)60 1542 y(2.)24 b(Observ)m(ation)15 b({)g(What)g(happ)q (ened)h(in)e(the)h(exp)q(erimen)o(t?)j(Activities)12 b(include)i(executing)f (the)122 1602 y(exp)q(erimen)o(t)g(and)k(recording)f(measuremen)o(ts,)d(coun) o(ts,)j(traces,)g(etc.,)60 1701 y(3.)24 b(Analysis)12 b({)g(What)h(caused)g (the)f(b)q(eha)o(vior)g(observ)o(ed)g(during)h(the)f(p)q(erformance)f(exp)q (erimen)o(t?)122 
1762 y(Activities)j(include)h(understanding)i(what)g(the)f (recorded)f(n)o(um)o(b)q(ers)g(mean.)60 1861 y(4.)24 b(Syn)o(thesis)13 b({)g(Giv)o(en)g(what)h(w)o(as)g(learned)f(from)f(the)h(exp)q(erimen)o(t,)e (ho)o(w)i(can)h(the)f(p)q(erformance)122 1921 y(b)q(e)18 b(impro)o(v)o(ed?)24 b(Activities)15 b(include)h(pro)o(viding)i(feedbac)o(k)e(to)j(hardw)o(are)f (and)g(soft)o(w)o(are)g(de-)122 1981 y(signers)e(regarding)h(new)f(opp)q (ortunities)h(for)g(higher)f(p)q(erformance.)0 2085 y(These)d(four)h(steps)g (are)g(rep)q(eated)f(un)o(til)f(the)i(desired)f(understanding)h(of)g(p)q (erformance)e(is)h(ac)o(hiev)o(ed.)73 2146 y(A)19 b(basic)h(to)q(ol)g(used)g (in)g(P)o(erformance)e(Ev)m(aluation)i(is)f(the)h(b)q(enc)o(hmark.)30 b(A)19 b(b)q(enc)o(hmark)f(is)i(a)0 2206 y(w)o(ell-de\014ned)d(sp)q (eci\014cation)i(of)g(an)g(exp)q(erimen)o(t)c(including)j(1\))h(a)g(problem)e (description)h(|)h(prose)0 2266 y(or)14 b(a)g(set)g(of)g(programs,)g(2\))g(a) g(metho)q(dology)f(or)h(set)g(of)g(rules)f(for)h(conducting)g(the)g(exp)q (erimen)o(t,)d(3\))j(a)0 2326 y(set)g(of)g(desired)f(measuremen)n(ts,)e(and)k (4\))f(a)g(v)o(eri\014cation)e(test)i(indicating)f(whether)g(the)h(exp)q (erimen)o(t)0 2386 y(succeeded)e(in)h(pro)q(ducing)h(a)g(correct)f(solution)g (to)h(the)f(problem.)19 b(Benc)o(hmarking)11 b(is)i(the)g(execution)0 2447 y(of)k(the)g(b)q(enc)o(hmark)e(on)j(the)f(mac)o(hine\(s\))e(of)i(in)o (terest)f(and)h(the)g(rep)q(orting)g(of)h(the)e(recorded)h(infor-)0 2507 y(mation.)i(Benc)o(hmarking)11 b(is)i(part)g(of)h(the)f(observ)m(ation)h (activit)o(y)d(in)i(P)o(erformance)e(Ev)m(aluation.)21 b(All)0 2567 y(to)q(o)c(often,)f(ho)o(w)o(ev)o(er,)f(b)q(enc)o(hmarking)f(is)i (mistak)o(en)f(for)h(P)o(erformance)f(Ev)m(aluation.)p 0 2608 750 2 v 56 2639 a Fh(2)75 2654 y Fg(The)21 b(distinction)f(b)q(et)o(w)o(een)h (b)q(enc)o(hmarking)f(and)g(p)q(erformance)g(ev)n(aluation)f(is)i(often)f(o)o (v)o(erlo)q(ok)o(ed)g(ev)o(en)h(b)o(y)0 2704 y(practitioners.)925 2828 y Fm(2)p eop %%Page: 3 9 bop 
73 195 a Fm(It)18 b(can)h(b)q(e)f(argued)h(that)g(often)g(in)f(curren)o (t)f(practice)h(to)q(o)h(m)o(uc)o(h)e(emphasis)g(is)h(placed)g(on)h(ob-)0 255 y(serv)m(ation)c(\(step)g(2\))g(|)g(v)m(arious)g(w)o(ell-kno)o(wn)f(b)q (enc)o(hmark)f(programs)i(are)g(run)g(on)g(a)g(new)g(mac)o(hine)0 315 y(or)i(sim)o(ulated)e(on)j(a)f(prop)q(osed)h(mac)o(hine,)d(and)i(the)g (timing)e(results)i(are)g(rep)q(orted)g(in)g(v)o(oluminous)0 376 y(tables)d(and)h(graphs.)22 b(W)l(e)14 b(w)o(ould)g(lik)o(e)f(to)i(see)e (more)h(atten)o(tion)g(giv)o(en)f(to)i(the)f(analysis)g(and)h(syn)o(the-)0 436 y(sis)g(steps,)g(and)h(esp)q(ecially)d(to)i(the)g(design)g(of)g(b)q(enc)o (hmarks.)20 b(This)15 b(pap)q(er)g(addresses)h(the)f(design)g(of)0 496 y(b)q(enc)o(hmark)g(sets)h(b)o(y)g(prop)q(osing)i(a)e(parallelism-based)f (criterion)g(for)h(b)q(enc)o(hmark)f(comparison.)0 662 y Fl(3)83 b(Preserving)26 b(and)h(Exploiting)h(P)n(arallelism)73 772 y Fm(If)19 b(w)o(e)h(lo)q(ok)g(at)g(the)g(en)o(tire)e(computational)h(pro)q (cess)i(from)d(problem)h(de\014nition)g(to)h(program)0 832 y(execution,)15 b(w)o(e)h(\014nd)h(that)g(some)e(asp)q(ect)i(of)g(p)q (erformance)e(is)h(in\015uenced)g(b)o(y)g(eac)o(h)g(step.)22 b(Figure)16 b(1)0 892 y(depicts)f(what)i(w)o(e)e(call)g(the)h(P)o(ath)g(T)l (o)h(P)o(erformance.)i(The)d(left)f(side)g(of)h(the)g(\014gure)g(represen)o (ts)f(the)0 953 y(soft)o(w)o(are)h(dev)o(elopmen)o(t)e(pro)q(cess,)i(while)g (the)g(righ)o(t)g(sho)o(ws)h(the)f(hardw)o(are)h(dev)o(elopmen)o(t)c(pro)q (cess.)0 1013 y(W)l(e)i(call)f(eac)o(h)g(abstraction)i(a)f Fi(step)g Fm(in)g(the)f(P)o(ath.)22 b(The)14 b(P)o(ath)i(is)e(tra)o(v)o (ersed)g(through)i(the)f(activities)0 1073 y(of)24 b(a)f(team)f(of)h Fi(agents)i Fm(that)f(transform)e(the)h(soft)o(w)o(are)h(and)f(hardw)o(are)h (from)e(one)i(step)f(to)g(the)0 1133 y(next.)30 b(The)19 b(soft)o(w)o(are)g (and)h(hardw)o(are)g(meet)d(when)i(a)h(program)f(is)g(executed,)f(resulting,) h(from)g(a)0 1193 y(p)q(erformance)h(viewp)q(oin)o(t,)h(in)g(a)h(measure)e
(of)h(program)g(execution)f(time.)34 b(Finally)l(,)21 b(a)h(metric)c(is)0 1253 y(applied)i(to)g(the)g(execution)f(time)f(to)j(get)f(a)g(measure)f(of)i (p)q(erformance.)1388 1235 y Fj(3)1439 1253 y Fm(With)f(the)g(tenet)g(that)0 1314 y(p)q(erformance)13 b(is)g(parallelism)f(\()p Ff(x)p Fm(4\),)i(w)o(e)f (see)g(that)i(three)e(conditions)h(m)o(ust)e(b)q(e)i(met)e(to)i(ac)o(hiev)o (e)e(high)0 1374 y(p)q(erformance.)19 b(First,)13 b(the)g(parallelism)d (presen)o(t)j(in)f(the)h(ph)o(ysical)f(problem)g(m)o(ust)f(b)q(e)i(preserv)o (ed,)f(as)0 1434 y(m)o(uc)o(h)j(as)i(p)q(ossible,)g(through)g(eac)o(h)g (phase)g(in)f(the)h(soft)o(w)o(are)g(dev)o(elopmen)o(t)c(pro)q(cess.)24 b(Second,)16 b(the)0 1494 y(hardw)o(are)j(m)o(ust)d(pro)o(vide)h(su\016cien)o (t)g(parallelism)f(to)i(realize)f(the)h(desired)f(p)q(erformance.)26 b(Third,)0 1554 y(the)16 b(soft)o(w)o(are)g(and)h(hardw)o(are)g(parallelism)d (m)o(ust)h(b)q(e)h(compatible.)73 1615 y(Eac)o(h)k(soft)o(w)o(are)g(and)h (hardw)o(are)f(agen)o(t)g(optimizes)e(t)o(w)o(o)i(of)g(the)g(three)f(v)m (ariables)h(that)g(deter-)0 1675 y(mine)e(p)q(erformance)g(|)h(clo)q(c)o(k)g (rate,)h(parallelism,)e(and)i(w)o(ork)f(\()p Ff(x)p Fm(4.1\))h(|)g(to)g(meet) d(its)i(ob)s(jectiv)o(e.)0 1735 y(Soft)o(w)o(are)j(agen)o(ts)g(m)o(ust)e (optimize)f(the)i(balance)h(b)q(et)o(w)o(een)f(w)o(ork)g(and)i(parallelism.) 
35 b(Execution)0 1795 y(time)16 b(can)h(sometimes)e(b)q(e)j(decreased)f(b)o (y)g(increasing)h(the)f(amoun)o(t)g(of)h(w)o(ork,)f(if)g(the)h(resulting)f (in-)0 1855 y(crease)d(in)g(parallelism)e(is)j(su\016cien)o(tly)d(large.)21 b(Con)o(v)o(ersely)l(,)13 b(execution)g(time)f(can)j(b)q(e)g(shortened)g(b)o (y)0 1916 y(signi\014can)o(tly)h(decreasing)h(the)g(amoun)o(t)f(of)h(w)o (ork,)g(for)h(example)c(via)j(a)h(more)d(e\016cien)o(t)h(algorithm,)0 1976 y(ev)o(en)22 b(though)h(the)g(amoun)o(t)f(of)h(parallelism)d(ma)o(y)h(b) q(e)i(sligh)o(tly)f(smaller.)39 b(Similarly)-5 b(,)21 b(giv)o(en)h(the)0 2036 y(constrain)o(ts)e(sp)q(eci\014ed)f(b)o(y)g(the)g(system)f(requiremen)o (ts)f(\(cost,)j(p)q(o)o(w)o(er)f(consumption,)h(size,)f(etc.\),)0 2096 y(hardw)o(are)e(agen)o(ts)f(optimize)e(the)i(balance)g(b)q(et)o(w)o(een) g(parallelism)d(and)k(clo)q(c)o(k)f(rate.)0 2241 y Fe(3.1)70 b(Soft)n(w)n(are)23 b(P)n(arallelism)0 2333 y Fm(In)f(the)g(P)o(ath)g(T)l(o)h (P)o(erformance)d(the)i(soft)o(w)o(are)g(dev)o(elopmen)o(t)d(pro)q(cess)k (starts)g(with)f(a)g(ph)o(ysical)0 2393 y(phenomenon)13 b(and)i(ends)f(with)f (a)i(stream)d(of)j(mac)o(hine)c(instructions.)21 b(Man)o(y)13 b(ph)o(ysical)g(phenomena)0 2453 y(con)o(tain)18 b(signi\014can)o(t)g (inheren)o(t)f(parallelism;)g(extremely)e(high)k(p)q(erformance)e(can)h(b)q (e)h(obtained)g(if)p 0 2497 750 2 v 56 2528 a Fh(3)75 2543 y Fg(Often)13 b(the)g(wrong)g(metric)f(is)h(used,)g(giving)e(the)j(illusion)d (of)h(p)q(o)q(or)h(\(or)g(go)q(o)q(d\))f(p)q(erformance.)18 b(F)m(or)12 b(example,)f(the)0 2593 y(most)i(common)f(metric,)h(mega\015ops,) g(when)h(applied)g(to)g(a)g(pattern-matc)o(hing)g(program)e(will)h(giv)o(e)h (a)g(misleadingly)0 2642 y(small)f(v)n(alue,)i(implying)e(p)q(o)q(or)j(p)q (erformance.)23 b(In)15 b(this)h(case)h(a)e(di\013eren)o(t)h(metric)f(suc)o (h)i(as)f(comparisons)e(or)i(logical)0 2692 y(op)q(erations)e(w)o(ould)f(b)q (e)i(a)e(more)g(accurate)i(measure)f(of)f(\\w)o(ork")h(as)f(de\014ned)i(in)f Fd(x)p Fg(4.1.)925 2828 
y Fm(3)p eop %%Page: 4 10 bop 319 556 a Fn(SOFTW)-6 b(ARE)277 616 y(P)h(ARALLELISM)1167 556 y(HARD)n(W)f(ARE)1136 616 y(P)h(ARALLELISM)245 737 y Fm(Ph)o(ysical)15 b(Phenomenon)474 797 y Ff(#)376 857 y Fi(\(Scientist\))474 917 y Ff(#)257 977 y Fm(Mathematical)f(Mo)q(del)474 1038 y Ff(#)270 1098 y Fi(\(Numeric)n(al)j(A)o(nalyst\))474 1158 y Ff(#)360 1218 y Fm(Algorithms)508 b(System)15 b(Requiremen)o(ts)474 1278 y Ff(#)273 1339 y Fi(\(Softwar)n(e)j(Engine)n(er\))474 1399 y Ff(#)1333 1278 y(#)1231 1339 y Fi(\(A)o(r)n(chite)n(ct\))1333 1399 y Ff(#)386 1459 y Fm(Program)549 b(System)15 b(Arc)o(hitecture)474 1519 y Ff(#)275 1579 y Fi(\(Compiling)j(System\))474 1640 y Ff(#)1333 1519 y(#)1122 1579 y Fi(\(Har)n(dwar)n(e)e(Engine)n(er\))1333 1640 y Ff(#)352 1700 y Fm(Instructions)423 b(Arc)o(hitecture)14 b(Implem)o(e)o(n)o(tation)454 1760 y Ff(&)809 b(.)544 1820 y(\000)-9 b(!)91 b Fi(\(Pr)n(o)n(gr)n(am)15 b(Exe)n(cution\))94 b Ff( )-8 b(\000)925 1880 y(#)767 1940 y Fm(Execution)16 b(Time)933 2001 y Ff(#)856 2061 y Fi(\(Metric\))933 2121 y Ff(#)712 2181 y Fn(PERF)n(ORMANCE)167 2334 y(Figure)i(1)p Fm(:)j(The)c(P)o(ath)f(T)l(o)h(P) o(erformance.)j(Agen)o(ts)15 b(are)i(sho)o(wn)g(in)f(paren)o(theses.)925 2828 y(4)p eop %%Page: 5 11 bop 0 195 a Fm(this)21 b(parallelism)f(can)h(b)q(e)h(preserv)o(ed)e(at)i (program)g(execution)e(time.)35 b(In)22 b(the)f(\014rst)h(step)f(of)h(the)0 255 y(P)o(ath,)g(a)f(scien)o(tist)e(dev)o(elops)h(a)i(mathematical)17 b(mo)q(del)j(for)h(the)g(ph)o(ysical)e(phenomenon.)34 b(Next,)0 315 y(an)17 b(algorithm)e(is)g(dev)o(elop)q(ed)h(to)g(solv)o(e)f(for)i(the)e (relev)m(an)o(t)h(v)m(ariables)g(of)g(the)g(mathematical)d(mo)q(del.)0 376 y(With)i(the)g(algorithm)f(in)h(hand,)h(a)g(program)f(is)g(written.)20 b(Finally)l(,)14 b(the)h(program)g(is)g(compiled)e(and)0 436 y(link)o(ed)i(to)j(pro)q(duce)f(the)g(\014nal)g(instruction)g(stream.)22 b(Ob)o(viously)l(,)16 b(signi\014can)o(t)h(parallelism)e(can)i(b)q(e)0 496 y(lost)k(at)h(eac)o(h)e(step.)36 
Simplifying assumptions in the mathematical model may eliminate parallel aspects of the phenomenon; synchronization points and sequential orderings can be introduced by the algorithm; false dependences and any number of other pitfalls can be introduced when the program is written; and even the best compilers may fail to detect all the parallelism available in the program.

3.2 Hardware Parallelism

The hardware development process starts with a set of system requirements, including those dictated by nontechnical criteria such as market considerations. Given these criteria an architect develops a system architecture, and finally a group of engineers develops an implementation of that architecture [footnote 4]. A given architecture may be able to explicitly or implicitly support parallelism in a variety of ways, but the achievable parallelism is determined by the particular implementation of the architecture. In this way parallelism is gained or lost at each stage of the hardware development process.

Footnote 4: The architecture and architecture implementation are often confused, but they are distinct. For example, the IBM 370 line consists of different implementations of the same architecture. Similarly, the Cray 1, Cray X-MP, and Cray Y-MP can be loosely viewed as instantiations of the same architecture.

3.3 Realizing Parallelism at Runtime

Even if the software instruction stream and the hardware architecture implementation each contain large amounts of parallelism, high performance will not be realized unless the two parallelism configurations are compatible. One common mismatch is grain size; the parallel units of work in the software may be too small to be executed efficiently given the parallelism overhead present in the hardware. In other cases the "shapes" of the parallelism may not match. For example, a massively parallel architecture with a two-dimensional grid of processors may not be able to perform computations along the diagonals of matrices at a high rate.

3.4 Benchmarking Methodologies and the Path

Each agent in the Path To Performance represents a scientific discipline devoted to minimizing the loss of parallelism. For example, the Numerical Analysts develop alternative algorithms to solve a given mathematical model. Software Engineers study ways to instantiate a given algorithm as programs, etc. Each of these disciplines makes tradeoffs with the parallelism against other objective criteria such as reducing the number of operations. The field of Performance Evaluation attempts to measure parallelism (performance) at particular steps on the Path.

When parallelism is measured at two different steps, we call the highest and lowest measured steps in the Path the probe points. For example, if one uses a program kernel to measure the execution time of a section of code, then the location of the first probe point is at the Program level, and the second at the Execution Time level. If one measures the megaflops rate of the kernel, then the second probe point would be at the Performance level.

Since the purpose of any program is to provide a solution to a problem, the most important performance measure is time to solution: how quickly the problem can be solved. For this question the appropriate probe points are at the Physical Phenomenon and Execution Time levels. Almost all benchmark activities approximate this desired value by using a different (but representative) set of benchmark problems and generalizing the results. Some sets of benchmark programs, called kernel benchmarks, contain a set of program fragments whose characteristics are designed to mimic those of real programs. The Livermore Fortran Kernels [1] and Linpack benchmark [2] are examples of kernel benchmarks. The results of kernel benchmarks are usually reported in terms of megaflops. Such benchmarks measure the parallelism exploited at execution time given a specified amount of parallelism (a fixed program) at the Program level.

More recently, benchmarks composed of sets of application programs have become available. The Perfect Benchmarks [3] and SPEC Benchmark Suite [4] are two popular examples. Proponents of application benchmark sets claim that a set of kernels cannot adequately represent the performance characteristics of application programs; only full programs can adequately test the performance characteristics of a computer system. Results are usually given as execution times for each code in the set, so the probe points are at the Program and Execution Time levels. Some application benchmark sets, including the Perfect Benchmarks and the NAS Parallel Benchmarks [5], allow the user to increase performance by modifying the source code; changes from the simple insertion of compiler directives to the substitution of new algorithms are allowed as long as the correctness of the program is preserved. In such cases the first probe point is moved from the Program level to the Algorithm or Mathematical Model level, depending on the extent of the modifications.

We thus have shown how the methodologies of several popular benchmark sets can be specified in our terminology. No single methodology is the correct one.
Each benchmark methodology measures something different and the user needs to decide if the benchmark tests and measures the desired agents. Additionally, care must be taken when comparing results of different benchmarks to see that the same set of agents is being measured. A more detailed description of these and other benchmarks can be found in [6][7][8].

3.5 The Parallelism Matrix: A Measure of Executed Parallelism

Any metric that considers only software parallelism (e.g. by assuming infinite hardware resources) is merely a measure of the parallelism potential contained in the software. Correspondingly, any metric based solely on hardware parallelism describes only the potential parallelism of the hardware [footnote 5]. Of ultimate interest is the parallelism achieved when the software and hardware parallelism interact during program execution. All the efforts undertaken to preserve software parallelism and maximize hardware parallelism can be viewed as preparation for this event. We call the parallelism that results from the interaction of hardware and software parallelism the executed parallelism.

Footnote 5: The most notorious example, of course, is peak megaflops.

Currently, commonplace measures of executed parallelism are limited to post mortem averages. In some cases software techniques are used to estimate the operation count in a program or code segment, and this count is then divided by the runtime to compute the average parallelism. On some systems special hardware monitors count predefined classes of operations, and this information is used to compute post mortem averages. The hardware approach has the potential for greater accuracy but is still limited to the simple accumulation of total operation counts.

Suppose that we have defined "work" to be floating point operations, and consider an arbitrary computer system that can collectively compute $f$ floating point operations in one clock period. A natural extension to the simple post mortem average is a histogram, $W = \langle W_0, \ldots, W_f \rangle$, where $W_i$ is the number of clock periods during which $i$ floating point operations were completed simultaneously. The sum $t = \sum_{i=0}^{f} W_i$ is the number of clock periods consumed by the entire program, and the weighted sum $w = \sum_{i=0}^{f} i\,W_i$ is the total amount of work, in this case floating point operations, performed by the program. To facilitate comparisons between programs that have different execution times, we divide each entry in the histogram by $t$, the total execution time in clock periods, to produce a normalized histogram called the parallelism vector $P = \langle P_0, \ldots, P_f \rangle$, where $P_i = W_i / t$. By construction, each entry $P_i$ has a value between 0.0 and 1.0 that indicates the fraction of time during which $i$ units of work were completed in parallel.

When the definition of "work" includes operations of two or more types that need to be distinguished, the parallelism vector is extended to higher dimensions in a straightforward way. For example, suppose we define two distinct kinds of work, memory operations and floating point operations, and consider a machine that can perform up to $f$ floating point and $m$ memory operations in one clock period. In this case a two-dimensional histogram $W = W_{ij}$, $0 \le i \le f$, $0 \le j \le m$, is used, where $W_{ij}$ is the number of clock periods during which $i$ floating point operations and $j$ memory operations were completed simultaneously. As before we divide by the total number of clock periods $t = \sum_{i=0}^{f} \sum_{j=0}^{m} W_{ij}$ to obtain a normalized histogram called the parallelism matrix $P$, where each $P_{ij} = W_{ij} / t$.

In a similar way parallelism matrices of arbitrary dimension can be constructed with one dimension for each of the different kinds of work that are of interest. Some other possibilities for "work" include logical operations, integer operations, and I/O operations. Depending on how we define work we can obtain various parallelism profiles from a program's parallelism matrix. In the two-dimensional case described above, the vector of row sums of the parallelism matrix gives a profile of the memory parallelism, while the vector of column sums shows the profile of floating-point parallelism. The vector of diagonal sums represents a profile of work parallelism where floating point and memory operations are not distinguished (i.e. the unit of work is defined as a floating-point or memory operation).

3.6 Quantifying Differences in Executed Parallelism

Although similar programs will exhibit similar average parallelism on a given machine, the converse is not true; similar average parallelism does not indicate that programs are similar [12]. Parallelism vectors and matrices reveal differences in program behavior that cannot be distinguished by averages. We can compare the parallelism profiles of two programs by comparing the parallelism matrices for each program using the Frobenius matrix norm to quantify the difference. If $A$ is the two-dimensional $m \times n$ parallelism matrix for Program 1 and $B$ is the $m \times n$ parallelism matrix for Program 2, then we gauge the difference in parallelism between the two programs by

\[
\text{difference in parallelism} = \|A - B\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij} - b_{ij}|^2} \tag{1}
\]

Intuitively, the Frobenius norm represents the "distance" between two matrices, just as the Euclidean formula is used to measure the distance between two points. This method is used to compare some application benchmarks in Section 5.

Recall that the matrix elements represent the fraction of the normalized execution time, so that their sum equals 1.0. This means that the difference in parallelism may range from 0.0 for two programs with identical parallelism distributions, to $\sqrt{2}$ in the case where each matrix has only one non-zero element (with value 1.0) and the location of the non-zero element in each matrix is different.

4 Rederivation of Basic Parallel Performance Metrics

This section strives to justify the approach to benchmarking described in §5 and our emphasis on parallelism by showing that many common performance metrics, including Amdahl's Law, are directly related to the parallelism in the hardware and software.

4.1 Performance

The ultimate metric for measuring the performance of a machine executing a given program with a fixed dataset is the inverse of the wall clock time, WCT_seconds, measured in units of reciprocal seconds, 1/sec.

\[
\mathit{Performance} = \frac{1}{\mathit{WCT\_seconds}} \tag{2}
\]

If a program takes 10 elapsed seconds to run, then its performance is $\frac{1}{10}\,\frac{1}{\text{sec}}$ for that problem. Another program that solves the same problem has higher performance if its execution time is smaller. We clarify the factors that contribute to performance by multiplying equation (2) twice by a factor whose value is 1.

\[
\begin{aligned}
\mathit{Performance} &= \frac{1}{\mathit{WCT\_seconds}} \times \frac{\mathit{work}}{\mathit{work}} \times \frac{\mathit{total\_CPs}}{\mathit{total\_CPs}} \\
&= \frac{1}{\mathit{work}} \times \frac{\mathit{work}}{\mathit{total\_CPs}} \times \frac{\mathit{total\_CPs}}{\mathit{WCT\_seconds}} && (3) \\
&= \frac{1}{\mathit{work}} \times \mathit{parallelism} \times \mathit{clock\_rate} && (4)
\end{aligned}
\]

where

• work is the total number of "operations" in the program.

• operation: the definition of operation is arbitrary. An operation may be a member of the set of all operation types, only floating point operations, only integer and logical operations, only memory operations, only I/O operations or any combination of operation classes that the user may wish to define to suit his purpose. The only limitation is that the same definition must be used in both the first and second factors in equation (3).

• total_CPs is the total number of elapsed clock periods required to execute the program. It has the same value as the wall clock time, WCT_seconds, but is expressed in units of clock periods.

• parallelism is the ratio of the work performed to the elapsed time (in clock periods), and is the average number of operations produced per clock period. It can include both the concurrent execution of multiple processors, and the overlapped execution of several instructions within a single processor.

• clock_rate is the ratio of the number of elapsed clock periods to the elapsed time in seconds. It is the number of clock periods in one second.

This definition of performance is similar to that given in [9] except that we use wall clock time rather than CPU time.

4.2 Execution Rate

Although Performance is the metric of choice, another metric, Execution_Rate, is often used because it is easy to compare this metric with the Peak_Execution_Rate metric described below. Execution_Rate may be expressed in several forms.

\[
\begin{aligned}
\mathit{Execution\_Rate} &= \mathit{Performance} \times \mathit{work} \\
&= \frac{\mathit{work}}{\mathit{WCT\_seconds}} \\
&= \left( \frac{1}{\mathit{work}} \times \mathit{parallelism} \times \mathit{clock\_rate} \right) \times \mathit{work} \\
&= \mathit{parallelism} \times \mathit{clock\_rate} && (5)
\end{aligned}
\]

Execution_Rate is measured in units of operations/WCT_seconds. While the Performance measurements for each of two programs solving the same problem are always comparable, the Execution_Rates for two programs solving the same problem are only comparable if work is the same for both programs. The same relationships hold when comparing machines solving the same problem. Execution_Rate is often confused with performance.

4.3 Peak Execution Rate

We define Peak_Execution_Rate as

\[
\begin{aligned}
\mathit{Peak\_Execution\_Rate} &= \mathit{work} \times \left( \frac{1}{\mathit{work}} \times \mathit{maximal\_parallelism} \times \mathit{clock\_rate} \right) \\
&= \mathit{maximal\_parallelism} \times \mathit{clock\_rate}
\end{aligned}
\]

where maximal_parallelism is the maximum number of "operation" results that the hardware can produce in one clock period, assuming that all other resources such as functional units, registers and data paths, and of course, the operands, are always available when needed.

As an example, a CRAY C-90 with each CPU containing an add and multiply functional unit, 3 memory ports, and 2 logical functional units, each capable of producing 2 results per clock period, 16 CPUs, and a 4 ns clock period has Peak_Execution_Rates of

\[
\left.
\begin{array}{c}
4\ (+,*)\ \text{results/CP} \\
6\ \text{memory words/CP} \\
4\ \text{logical results/CP}
\end{array}
\right\}
\times 16\ \text{CPUs} \times 250\ \text{MHz} =
\left\{
\begin{array}{l}
16.0\ \text{Gflops/WCT\_second, or} \\
24.0\ \text{GWs/WCT\_second, or} \\
16.0\ \text{GLops/WCT\_second}
\end{array}
\right.
\]

where GLops/WCT_second is one billion logical operations per WCT_second. The Peak_Execution_Rate of a machine may be achievable for short periods of time, but may not be sustainable over longer periods since the assumptions made in the definition of maximal_parallelism are not always true.

4.4 Speedup

We define Speedup of a new program over an old program that both solve the same problem as the ratio of the performance of each program.

\[
\mathit{Speedup} = \frac{\mathit{Performance}(\text{new})}{\mathit{Performance}(\text{old})}
\]

From (2) it follows that

Speedup = WCT_seconds(old) / WCT
v 18 w(se)n(c)n(onds\(new\))0 1778 y Fm(whic)o(h)e(is)g(related)g(to)g(the)h(standard)g(de\014nition,)g Fk(S)s Fm(\()p Fk(p)p Fm(\))h(=)g Fk(T)7 b Fm(\(1\))p Fk(=T)g Fm(\()p Fk(p)p Fm(\),)12 b(for)h(m)o(ultipro)q(cessor)e(sp)q(eedup)0 1838 y(using)18 b Fk(p)g Fm(pro)q(cessors)h([10].)25 b(Ho)o(w)o(ev)o(er,)16 b(w)o(e)h(can)h(use)f(the)h(de\014nition)f(of)h(P)o(erformance)e(giv)o(en)g (in)i(\(3\))0 1898 y(to)f(exp)q(ose)f(the)g(con)o(tributing)g(factors)h(to)f (Sp)q(eedup)h(in)f(greater)g(detail.)303 2065 y Fi(Sp)n(e)n(e)n(dup)e Fm(=)643 1987 y Fi(1)p 545 1995 223 2 v 545 2034 a(work\(new\))783 2007 y Ff(\002)d Fi(p)n(ar)n(al)r(lelism\(new\))h Ff(\002)f Fi(clo)n(ck)p 1347 2007 15 2 v 19 w(r)n(ate\(new\))p 540 2053 1028 2 v 663 2091 a(1)p 574 2099 203 2 v 574 2138 a(work\(old\))793 2110 y Ff(\002)g Fi(p)n(ar)n(al)r(lelism\(old\))g Ff(\002)g Fi(clo)n(ck)p 1337 2110 15 2 v 19 w(r)n(ate\(old\))0 2218 y Fm(or)298 2293 y Fi(Sp)n(e)n(e)n(dup)j Fm(=)545 2260 y Fi(work\(old\))p 535 2282 223 2 v 535 2328 a(work\(new\))773 2293 y Ff(\002)828 2260 y Fi(p)n(ar)n(al)r(lelism\(new\))p 828 2282 350 2 v 838 2328 a(p)n(ar)n(al)r(lelism\(old\))1193 2293 y Ff(\002)1248 2260 y Fi(clo)n(ck)p 1352 2260 15 2 v 19 w(r)n(ate\(new\))p 1248 2282 324 2 v 1258 2328 a(clo)n(ck)p 1362 2328 15 2 v 19 w(r)n(ate\(old\))1813 2293 y Fm(\(6\))0 2405 y(Equation)21 b(\(6\))g(illustrates)f(the)g(t)o(w)o(o)g(additional)h(factors,)g(the)g(w)o (ork)f(ratio)h(and)g(the)f(parallelism)0 2466 y(ratio,)c(that)h(prev)o(en)o (t)e(accurate)h(mac)o(hine)e(comparisons)i(based)g(solely)g(on)g(the)g(clo)q (c)o(k)p 1606 2466 V 17 w(rate)g(ratio.)73 2526 y(If)21 b(w)o(e)g(are)h (comparing)e(t)o(w)o(o)i(executions)e(on)i(the)g(same)e(mac)o(hine,)g(then)i Fi(clo)n(ck)p 1596 2526 V 19 w(r)n(ate\(new\))h(=)0 2586 y(clo)n(ck)p 104 2586 V 19 w(r)n(ate\(old\))p Fm(.)496 2674 y Fi(Sp)n(e)n(e)n(dup)13 b Fm(=)742 2640 y Fi(work\(old\))p 732 2662 223 2 v 732 2708 a(work\(new\))970 2674 y Ff(\002)1025 2640 y 
Fi(p)n(ar)n(al)r(lelism\(new\))p 1025 2662 350 2 v 1035 2708 a(p)n(ar)n(al)r(lelism\(old\))1813 2674 y Fm(\(7\))913 2828 y(11)p eop %%Page: 12 18 bop 0 195 a Fm(where)16 b(the)g(\014rst)g(factor)h(is)f(the)g(recipro)q(cal)g (of)g(the)g(Redundancy)l(,)654 316 y Fi(R)n(e)n(dundancy)e Fm(=)980 282 y Fi(work\(new\))p 980 305 223 2 v 990 350 a(work\(old\))1207 316 y Fk(:)0 476 y Fe(4.5)70 b(E\016ciency)0 568 y Fm(W)l(e)19 b(de\014ne)g(the)g(E\016ciency)f(of)i(a)g(new)f(program)g(as)h(the)f(ratio)h (of)g(the)f(realized)f(Sp)q(eedup)h(to)h(the)0 629 y(maxim)o(um)12 b(p)q(ossible)k(Sp)q(eedup.)614 747 y Fi(E\016ciency)f Fm(=)992 713 y Fi(Sp)n(e)n(e)n(dup)p 893 735 363 2 v 893 781 a(maximal)p 1075 781 15 2 v 19 w(Sp)n(e)n(e)n(dup)1813 747 y Fm(\(8\))0 869 y(where,)g(using)i(\(7\),)314 992 y Fi(maximal)p 496 992 V 18 w(Sp)n(e)n(e)n(dup)d Fm(=)803 959 y Fi(work\(old\))p 748 981 314 2 v 748 1026 a(work\(max)p 958 1026 15 2 v 18 w(p)n(ar\))1077 992 y Ff(\002)1131 959 y Fi(maximal)p 1313 959 V 19 w(p)n(ar)n(al)r(lelism)p 1131 981 425 2 v 1179 1026 a(p)n(ar)n(al)r(lelism\(old\))0 1117 y Fm(and)i Fi(work\(max)p 304 1117 15 2 v 18 w(p)n(ar\))e Fm(is)g(the)h(n)o(um)o(b)q(er)e(of)i(op)q(erations)h(in)f(the)g Fi(maximal)p 1339 1117 V 18 w(p)n(ar)n(al)r(lel)24 b Fm(implem)o(e)o(n)o (tation.)73 1177 y(De\014nition)14 b(\(8\))h(is)f(related)g(to)h(the)f (standard)h(de\014nition,)f Fk(E)s Fm(\()p Fk(p)p Fm(\))h(=)e Fk(S)s Fm(\()p Fk(p)p Fm(\))p Fk(=p)p Fm(,)j(for)f(m)o(ultipro)q(cessor)0 1237 y(E\016ciency)g(using)h Fk(p)h Fm(pro)q(cessors)g([10)q(].)k(Ho)o(w)o (ev)o(er,)14 b(w)o(e)i(can)g(use)g(the)g(de\014nition)g(of)h(Sp)q(eedup)f(in) g(\(7\),)0 1297 y(the)i(constan)o(t)h(clo)q(c)o(k)p 396 1297 V 17 w(rate)f(case,)h(to)g(exp)q(ose)g(the)f(con)o(tributing)g(factors)h(to)g (E\016ciency)e(in)h(greater)0 1357 y(detail.)358 1581 y Fi(E\016ciency)43 b Fm(=)791 1495 y Fi(work\(old\))p 781 1511 223 2 v 781 1550 a(work\(new\))1020 1523 y Ff(\002)1075 1495 y Fi(p)n(ar)n(al)r(lelism\(new\)) p 
1075 1511 350 2 v 1084 1550 a(p)n(ar)n(al)r(lelism\(old\))p 693 1569 819 2 v 754 1611 a(work\(old\))p 698 1627 314 2 v 698 1666 a(work\(max)p 908 1666 15 2 v 19 w(p)n(ar\))1028 1639 y Ff(\002)1082 1614 y Fi(maximal)p 1264 1614 V 19 w(p)n(ar)n(al)r(lelism)p 1082 1627 425 2 v 1130 1666 a(p)n(ar)n(al)r(lelism\(old\))609 1783 y Fm(=)731 1749 y Fi(p)n(ar)n(al)r(lelism\(new\))p 693 1771 V 693 1817 a(maximal)p 875 1817 15 2 v 19 w(p)n(ar)n(al)r(lelism)1134 1783 y Ff(\002)1189 1749 y Fi(work\(max)p 1399 1749 V 18 w(p)n(ar\))p 1189 1771 314 2 v 1234 1817 a(work\(new\))1813 1783 y Fm(\(9\))0 1905 y(where)17 b(w)o(e)g(observ)o(e)f(the)h(recipro)q(cal)g(of)g(the)g (Redundancy)g(as)h(the)f(second)g(factor,)g(and)h(note)g(that)0 1965 y(E\016ciency)d(is)h(not)g(a)h(function)f(of)g(the)g(p)q(erformance)g (of)g(the)g(old)g(program.)0 2107 y Fe(4.6)70 b(Utili)o(zati)o(on)0 2199 y Fm(Our)13 b(de\014nition)f(of)h(the)g(Utilization)e(of)i(a)h(new)e (program)h(follo)o(ws)g(from)f(\(9\))h(and)g(from)f(the)h(standard)0 2260 y(de\014nition)21 b([10])g(for)h(m)o(ultipro)q(cessor)e(Utilization)f (using)j Fk(p)g Fm(pro)q(cessors,)h Fk(U)5 b Fm(\()p Fk(p)p Fm(\))24 b(=)e Fk(E)s Fm(\()p Fk(p)p Fm(\))15 b Ff(\002)g Fk(R)p Fm(\()p Fk(p)p Fm(\),)0 2320 y(where)h Fk(R)p Fm(\()p Fk(p)p Fm(\))h(is)f(the)g(Redundancy)l(.)131 2447 y Fi(Utilization)43 b Fm(=)475 2374 y Fb( )551 2414 y Fi(p)n(ar)n(al)r(lelism\(new\))p 513 2436 425 2 v 513 2482 a(maximal)p 695 2482 15 2 v 19 w(p)n(ar)n(al)r (lelism)954 2447 y Ff(\002)1009 2414 y Fi(work\(max)p 1219 2414 V 18 w(p)n(ar\))p 1009 2436 314 2 v 1054 2482 a(work\(new\))1327 2374 y Fb(!)1371 2447 y Ff(\002)1471 2414 y Fi(work\(new\))p 1425 2436 V 1425 2482 a(work\(max)p 1635 2482 15 2 v 19 w(p)n(ar\))396 2585 y Fm(=)518 2551 y Fi(p)n(ar)n(al)r(lelism\(new\))p 480 2573 425 2 v 480 2619 a(maximal)p 662 2619 15 2 v 19 w(p)n(ar)n(al)r(lelism)0 2704 y Fm(where)16 b(w)o(e)g(note)g(that)h(Utilization)d(is)i(not)h(a)g (function)f(of)g(the)g(Redundancy)l(.)913 2828 y(12)p 
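The definitions of Sections 4.3 through 4.6 can be checked numerically with a few lines of code. The following is a minimal sketch, not part of the original measurements: the names Run, peak_execution_rate, speedup, redundancy, efficiency and utilization are illustrative, and the old and new runs in the demo are hypothetical values chosen only to exercise the formulas. The only figures taken from the text are those of the CRAY C-90 example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One execution of a program (field names are illustrative)."""
    work: float         # number of operations executed
    parallelism: float  # average operations completed per clock period
    clock_rate: float   # clock periods per second (Hz)


def peak_execution_rate(maximal_parallelism: float, clock_rate: float) -> float:
    """Section 4.3: maximal_parallelism x clock_rate, in results per second."""
    return maximal_parallelism * clock_rate


def speedup(new: Run, old: Run) -> float:
    """Equation (6): work ratio x parallelism ratio x clock_rate ratio."""
    return ((old.work / new.work)
            * (new.parallelism / old.parallelism)
            * (new.clock_rate / old.clock_rate))


def redundancy(new: Run, old: Run) -> float:
    """Section 4.4: work(new) / work(old)."""
    return new.work / old.work


def efficiency(new: Run, work_max_par: float, maximal_parallelism: float) -> float:
    """Equation (9), which assumes the constant clock_rate case."""
    return (new.parallelism / maximal_parallelism) * (work_max_par / new.work)


def utilization(new: Run, maximal_parallelism: float) -> float:
    """Section 4.6: parallelism(new) / maximal_parallelism."""
    return new.parallelism / maximal_parallelism


if __name__ == "__main__":
    # CRAY C-90 example from the text: 4 (+,*) results per clock period
    # per CPU, 16 CPUs, 250 MHz clock (4 ns clock period).
    gflops = peak_execution_rate(4 * 16, 250e6) / 1e9
    print("Peak (+,*) rate:", gflops, "Gflops per WCT_second")

    # Hypothetical runs on the same machine (values are assumptions).
    old = Run(work=1.0e9, parallelism=1.0, clock_rate=250e6)
    new = Run(work=1.25e9, parallelism=2.5, clock_rate=250e6)
    print("Speedup:    ", speedup(new, old))
    print("Redundancy: ", redundancy(new, old))
    print("Efficiency: ", efficiency(new, work_max_par=1.5e9, maximal_parallelism=6.0))
    print("Utilization:", utilization(new, maximal_parallelism=6.0))
```

In this sketch the new run does 25% more work than the old one but completes 2.5 times as many operations per clock period, so the work ratio and the parallelism ratio of equation (7) pull in opposite directions. Utilization depends only on the parallelism ratio, which is the point made at the end of Section 4.6.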

4.7  Amdahl's Law

Amdahl's Law is a special case of the Speedup definition (7) in which work(new) = work(old),

\[
\mathit{Speedup} = \frac{\mathit{parallelism(new)}}{\mathit{parallelism(old)}}
\]

where, here, parallelism(new) is the parallelism in a theoretically perfect (without overhead) optimized, vectorized or concurrentized version of the old software.

This demonstrates the direct relationship between parallelism, performance and traditional performance-related metrics and leads us to the new approach to performance evaluation and applications benchmarking that is described in the next section.

5  Using Parallelism to Evaluate Application Benchmarks

Parallelism-based techniques can also be used to evaluate application benchmark sets. We are motivated to study the parallelism in application benchmark sets in light of a common methodology now being used for their construction. Often benchmark sets are developed by selecting programs or kernels from users in different application areas. The application areas are chosen from those having users who consume large amounts of supercomputer time [13]. Some preliminary results in other studies [14][15] suggest that while
the average performance of users in different application areas is somewhat different, the range of performance within each area is so large as to make the differences between them insignificant. It is not currently clear that selecting programs for application area coverage alone results in a benchmark set which measurably represents the workload at a particular site or supercomputer usage in general. A more rigorous analytic approach, perhaps based on the method described in §2, may be more appropriate.

The remainder of this section examines the executed parallelism of some current benchmarks on the CRAY Y-MP. It is not intended to be an exhaustive study, but rather an example of how some simple parallelism-based techniques might be used to characterize the executed parallelism of application programs and program sets on a computer system rich in instruction-level parallelism. While the tools used to study the Y-MP are machine-specific, the parallelism-based methodology is not, and we hope that other researchers will apply the methodology to other machines.

Our approach complements the work done by D. K. Chen, H. M. Su and P. C. Yew [16]. They have developed a tool, MAXPAR, that analyzes the software parallelism that is
present in Fortran source code at the job, subroutine, loop and statement level, assuming unlimited hardware resources. Our technique measures the parallelism that was actually exploited on the finite resources of the CRAY Y-MP. Measurements taken by these two techniques correspond to the placement of two probes at different locations on the Path To Performance (§3). The difference between the measurements is a gauge of the work done by the agents who acted along the path in between (e.g. compiler, hardware).

To better understand the characteristics of current workloads, in another study [14] we are also examining performance characteristics of the workload at an NSF supercomputer center over a one-year period. Combined with the work described here, these two projects may enable the design of new benchmark sets that more precisely model actual supercomputer usage. Our overall goal is to improve the techniques used in performance analysis and, in particular, benchmarking.

5.1  Generating Parallelism Matrices for the CRAY Y-MP

As an example of how parallelism-based techniques might be used to examine application programs and program sets, we have
measured the executed parallelism of some popular benchmark programs using a clock period level simulator of a single processor of the CRAY Y-MP [12]. The simulator models the reservation of the following CPU resources of the CRAY Y-MP: all vector functional units, memory ports, vector registers, scalar registers. Delays due to data-independent and data-dependent branches, instruction buffer fetches, subroutine calls, instruction issue, and some memory conflicts are also modeled. Scalar input path conflicts and some types of memory conflict delays are not modeled, but in most cases the effect of these simplifications is minimal; except for malevolent strides and gather/scatter patterns, the simulator is accurate to approximately 5%.

One of the textual outputs of the CRAY Y-MP simulator is a two-dimensional parallelism matrix of the type described in §3.5. The i,j-th entry of the parallelism matrix represents the fraction of the program elapsed time when i memory operations and j floating point operations were simultaneously completed. The Y-MP has three floating point functional units (add, multiply, and reciprocal approximation) and three memory ports (two load and one store), consequently the parallelism matrices used in this section are 4 × 4.

Table 1: Names, application areas, and simulation statistics of the five simulated Perfect Benchmarks

    Program   Application Area        % of Program Simulated   Simulation Error
    -------   ---------------------   ----------------------   ----------------
    ADM       Meteorology             98.85%                   -6.43%
    ARC2D     Aerodynamics            97.46%                   -6.71%
    BDNA      Physical Chemistry      72.65%                   -3.41%
    DYFESM    Structural Mechanics    99.75%                   -6.25%
    MDG       Physical Chemistry      90.32%                   -2.27%

Table 2: Parallelism Matrices for selected Perfect Benchmarks (rows: MOPS, memory operations completed; columns: FLOPS, floating point operations completed)

    ADM                  FLOPS
    MOPS       0      1      2      3
      3      0.00   0.00   0.00   0.00
      2      0.02   0.00   0.00   0.00
      1      0.12   0.01   0.00   0.00
      0      0.75   0.09   0.01   0.00

    ARC2D                FLOPS
    MOPS       0      1      2      3
      3      0.01   0.03   0.03   0.00
      2      0.05   0.13   0.08   0.00
      1      0.07   0.20   0.11   0.00
      0      0.07   0.13   0.07   0.02

    BDNA                 FLOPS
    MOPS       0      1      2      3
      3      0.00   0.02   0.02   0.00
      2      0.01   0.05   0.04   0.00
      1      0.05   0.12   0.07   0.00
      0      0.28   0.17   0.11   0.03

    DYFESM               FLOPS
    MOPS       0      1      2      3
      3      0.00   0.00   0.01   0.00
      2      0.03   0.02   0.02   0.00
      1      0.12   0.02   0.02   0.00
      0      0.68   0.08   0.01   0.00

    MDG                  FLOPS
    MOPS       0      1      2      3
      3      0.00   0.00   0.00   0.00
      2      0.01   0.00   0.00   0.00
      1      0.10   0.00   0.00   0.00
      0      0.77   0.12   0.00   0.00

The simulator was applied to a subset of the Perfect Benchmarks [3] to measure the executed parallelism of several application programs in the set. Five of the 13 Perfect Benchmarks were simulated. The execution time required to simulate each entire program is prohibitive, so we simulated only a subset of the calling tree of the program. However, in each case, the subtree is called from a DO loop contained in the parent subroutine of the subtree's root. This means that the subtree is called repeatedly and we need only simulate one instantiation to capture the desired measurements. Each subtree accounted
(desired)h(measuremen)o(ts.)j(Eac)o(h)d(subtree)g(accoun)o(ted)0 2042 y(for)21 b(b)q(et)o(w)o(een)e(72\045)i(and)g(99\045)g(of)f(the)h(total)f (execution)f(time)g(of)h(the)h(whole)f(program.)33 b(T)l(able)21 b(1)0 2102 y(lists)c(the)g(b)q(enc)o(hmark)f(programs)i(that)g(w)o(ere)f(sim) o(ulated)e(and)j(the)g(application)f(areas)h(that)g(these)0 2162 y(b)q(enc)o(hmark)d(programs)h(represen)o(t.)0 2306 y Fe(5.2)70 b(P)n(arallelism)19 b(Matrices)j(for)h(Selected)e(P)n(erfect)g (Benc)n(hmarks)73 2399 y Fm(T)l(able)f(2)g(sho)o(ws)g(the)g(parallelism)d (matrices)h(for)i(the)f(sim)o(ulated)e(programs.)32 b(Man)o(y)19 b(features)0 2459 y(of)h(program)g(b)q(eha)o(vior)f(can)h(b)q(e)g(observ)o (ed)f(directly)f(and)i(quan)o(ti\014ed)f(from)g(the)g(parallelism)f(ma-)0 2519 y(trix.)34 b(A)20 b(c)o(hec)o(k)f(of)i(elemen)o(t)d(\(0)p Fk(;)8 b Fm(0\))21 b(imme)o(diately)c(rev)o(eals)j(whether)g(the)g(program)h (is)f(scalar-)i(or)0 2579 y(v)o(ector-orien)o(ted.)34 b(Larger)22 b(v)m(alues)f(indicate)f(that)i(neither)e(\015oating-p)q(oin)o(t)i(nor)g (memory)c(op)q(era-)0 2640 y(tions)i(o)q(ccurred)g(during)g(most)g(of)g(the)g (elapsed)g(time,)e(suc)o(h)i(as)h(w)o(ould)f(b)q(e)g(the)g(case)g(in)f(a)i (scalar)0 2700 y(co)q(de.)29 b(W)l(e)18 b(also)i(note)e(the)h(v)o(ery)e(lo)o (w)i(fraction)g(of)g(elapsed)f(time)f(in)h(b)q(oth)i(column)d(3)i(and)g(ro)o (w)g(3,)913 2828 y(15)p eop %%Page: 16 22 bop 203 514 2 360 v 188 515 16 2 v 188 442 V 188 370 V 188 298 V 188 226 V 188 154 V 188 478 V 188 406 V 188 334 V 188 262 V 188 190 V 84 528 a Fa(0.0)84 456 y(0.2)84 384 y(0.4)84 312 y(0.6)84 240 y(0.8)84 168 y(1.0)p 203 515 322 2 v 234 538 2 25 v 483 538 V 275 529 2 16 v 317 529 V 358 529 V 400 529 V 441 529 V 483 529 V 223 582 a(0)226 b(6)321 630 y(adm)p 226 514 17 270 v 268 514 17 76 v 309 514 17 15 v 351 514 17 4 v 508 514 2 360 v 494 515 16 2 v 494 478 V 494 442 V 494 406 V 494 370 V 494 334 V 494 298 V 494 262 V 494 226 V 494 190 V 494 154 V 509 515 322 2 v 539 538 2 25 v 789 538 V 581 529 2 
16 v 623 529 V 664 529 V 706 529 V 747 529 V 789 529 V 529 582 a(0)g(6)616 630 y(arc2d)p 532 514 17 26 v 574 514 17 72 v 615 514 17 112 v 657 514 17 98 v 698 514 17 44 v 740 514 17 11 v 814 514 2 360 v 800 515 16 2 v 800 478 V 800 442 V 800 406 V 800 370 V 800 334 V 800 298 V 800 262 V 800 226 V 800 190 V 800 154 V 815 515 322 2 v 845 538 2 25 v 1094 538 V 887 529 2 16 v 928 529 V 970 529 V 1011 529 V 1053 529 V 1094 529 V 835 582 a(0)g(6)926 630 y(b)q(dna)p 838 514 17 101 v 879 514 17 83 v 921 514 17 87 v 962 514 17 58 v 1004 514 17 22 v 1045 514 17 8 v 1120 514 2 360 v 1106 515 16 2 v 1106 478 V 1106 442 V 1106 406 V 1106 370 V 1106 334 V 1106 298 V 1106 262 V 1106 226 V 1106 190 V 1106 154 V 1121 515 322 2 v 1151 538 2 25 v 1400 538 V 1192 529 2 16 v 1234 529 V 1275 529 V 1317 529 V 1358 529 V 1400 529 V 1140 582 a(0)g(6)1212 630 y(dyfesm)p 1143 514 17 245 v 1185 514 17 69 v 1226 514 17 22 v 1268 514 17 15 v 1310 514 17 8 v 1351 514 17 4 v 1426 514 2 360 v 1411 515 16 2 v 1411 478 V 1411 442 V 1411 406 V 1411 370 V 1411 334 V 1411 298 V 1411 262 V 1411 226 V 1411 190 V 1411 154 V 1426 515 322 2 v 1457 538 2 25 v 1706 538 V 1498 529 2 16 v 1540 529 V 1581 529 V 1623 529 V 1664 529 V 1706 529 V 1446 582 a(0)g(6)1544 630 y(mdg)p 1449 514 17 278 v 1491 514 17 80 v 1532 514 17 4 v 269 515 1204 2 v 497 754 2 57 v 581 737 a(0)107 b(1)h(2)f(3)h(4)g(5)f(6)p 1411 754 V 50 w(Av)o(erage)p 253 755 1370 2 v 278 795 a(ADM)p 497 812 2 57 v 134 w(0.75)48 b(0.21)h(0.04)f(0.01)h(0.00)g(0.00)f(0.00)p 1411 812 V 77 w(0.305)278 851 y(AR)o(C2D)p 497 868 V 88 w(0.07)g(0.20)h(0.31) f(0.27)h(0.12)g(0.03)f(0.00)p 1411 868 V 77 w(2.278)278 908 y(BDNA)p 497 925 V 110 w(0.28)g(0.23)h(0.24)f(0.16)h(0.06)g(0.02)f(0.00)p 1411 925 V 77 w(1.566)278 964 y(D)o(YFESM)p 497 981 V 49 w(0.68)g(0.19)h (0.06)f(0.04)h(0.02)g(0.01)f(0.00)p 1411 981 V 77 w(0.534)278 1021 y(MDG)p 497 1038 V 132 w(0.77)g(0.22)h(0.01)f(0.00)h(0.00)g(0.00)f(0.00) p 1411 1038 V 77 w(0.244)198 1172 y Fn(Figure)18 b(2)p 
Fm(:)j(T)l(otal)c (parallelism)d(v)o(ectors)h(for)i(selected)d(P)o(erfect)h(Benc)o(hmarks)0 1306 y(indicating)g(none)h(of)g(these)f(co)q(des)i(highly)e(utilize)f(all)h (3)h(arithmetic)d(units)j(or)g(all)f(3)h(memory)d(p)q(orts)0 1366 y(at)k(the)f(same)f(time.)73 1426 y(Figure)d(2)i(sho)o(ws)f(the)g (parallelism)d(v)o(ectors)i(of)h(the)g(5)g(b)q(enc)o(hmarks)e(where)i(\\w)o (ork")g(is)g(de\014ned)f(as)0 1487 y(either)k(a)h(\015oating-p)q(oin)o(t)h (or)f(memory)d(op)q(eration.)24 b(Because)16 b(in)h(this)f(case)h(w)o(e)g(do) g(not)g(distinguish)0 1547 y(b)q(et)o(w)o(een)e(the)h(three)f(\015oating)i(p) q(oin)o(t)f(functional)g(units)g(and)h(three)e(memory)e(p)q(orts)k(on)g(the)f (Y-MP)l(,)0 1607 y(the)d(parallelism)d(v)o(ectors)i(con)o(tain)h(en)o(tries)f (for)h(0)g(to)h(6)f(concurren)o(tly)e(completed)g(op)q(erations.)21 b(Note)0 1667 y(that)14 b(the)f(parallelism)e(reac)o(hes)h(a)i(lev)o(el)d(of) j(5)f(in)g(a)h(small)d(fraction)j(of)f(the)g(normalized)f(time)f(in)i(a)g (few)0 1727 y(of)i(the)g(b)q(enc)o(hmarks.)20 b(In)15 b(this)g(and)g(the)g (other)g(parallelism)e(v)o(ectors)i(describ)q(ed)f(later,)h(the)f(a)o(v)o (erage)0 1787 y(parallelism)c(is)j(the)f(w)o(eigh)o(ted)g(sum)f(of)i(the)g (normalized)d(time)h Fk(p)1178 1794 y Fc(i)1205 1787 y Fm(sp)q(en)o(t)i(at)g (eac)o(h)f(parallelism)e(lev)o(el)h Fk(i)p Fm(.)606 1924 y Fi(A)o(ver)n(age)18 b(p)n(ar)n(al)r(lelism)d Fm(=)1104 1870 y Fc(n)1084 1882 y Fb(X)1086 1973 y Fc(i)p Fj(=0)1153 1924 y Fk(i)10 b Ff(\002)h Fk(p)1254 1931 y Fc(i)0 2066 y Fm(Figure)i(3)h(sho)o (ws)h(parallelism)c(v)o(ectors)i(and)h(a)o(v)o(erage)g(parallelism)d(for)j (the)f(\014v)o(e)g(b)q(enc)o(hmarks)f(where)0 2126 y(\\w)o(ork")j(is)g (de\014ned)f(as)h(only)g(\015oating)g(p)q(oin)o(t)g(op)q(erations.)21 b(Similarly)l(,)12 b(Figure)i(4)h(sho)o(ws)g(parallelism)0 2186 y(data)i(where)f(w)o(ork)g(is)g(de\014ned)g(as)h(only)f(memory)e(op)q (erations.)0 2331 y Fe(5.3)70 b(Di\013erences)26 b(in)j(Executed)f(P)n (arallelism)d(for)30 
b(Selected)c(P)n(erfect)157 2405 y(Benc)n(hmarks)73 2498 y Fm(The)14 b(di\013erence)e(in)h(parallelism)e(in)i(the)g(5)h(P)o (erfect)e(Benc)o(hmarks)f(can)i(b)q(e)h(examined)d(visually)h(in)0 2558 y(Figures)j(2)g(through)h(4,)g(or)f(quan)o(titativ)o(ely)e(using)i(the)g (distance)g(metric)e(in)h(equation)h(\(1\).)22 b(T)l(able)15 b(3)0 2618 y(sho)o(ws)23 b(the)g(quan)o(titativ)o(e)d(di\013erence)i(in)g (parallelism)e(for)j(eac)o(h)f(pair)g(of)h(the)f(\014v)o(e)g(b)q(enc)o (hmarks.)0 2678 y(Note)g(that)h(the)f(di\013erence)f(in)h(parallelism)e(is)i (not)h(a)f(transitiv)o(e)g(relation.)39 b(W)l(e)22 b(\014rst)g(compare)913 2828 y(16)p eop %%Page: 17 23 bop 410 514 2 360 v 396 515 16 2 v 396 442 V 396 370 V 396 298 V 396 226 V 396 154 V 396 478 V 396 406 V 396 334 V 396 262 V 396 190 V 292 528 a Fa(0.0)292 456 y(0.2)292 384 y(0.4)292 312 y(0.6)292 240 y(0.8)292 168 y(1.0)p 411 515 239 2 v 441 538 2 25 v 607 538 V 483 529 2 16 v 524 529 V 566 529 V 607 529 V 431 582 a(0)143 b(4)487 630 y(adm)p 434 514 17 307 v 475 514 17 47 v 517 514 17 8 v 633 514 2 360 v 619 515 16 2 v 619 478 V 619 442 V 619 406 V 619 370 V 619 334 V 619 298 V 619 262 V 619 226 V 619 190 V 619 154 V 634 515 239 2 v 664 538 2 25 v 830 538 V 706 529 2 16 v 747 529 V 789 529 V 830 529 V 653 582 a(0)h(4)699 630 y(arc2d)p 657 514 17 105 v 698 514 17 137 v 740 514 17 94 v 781 514 17 29 v 856 514 2 360 v 841 515 16 2 v 841 478 V 841 442 V 841 406 V 841 370 V 841 334 V 841 298 V 841 262 V 841 226 V 841 190 V 841 154 V 856 515 239 2 v 887 538 2 25 v 1053 538 V 928 529 2 16 v 970 529 V 1011 529 V 1053 529 V 876 582 a(0)f(4)926 630 y(b)q(dna)p 879 514 17 127 v 921 514 17 134 v 962 514 17 87 v 1004 514 17 11 v 1078 514 2 360 v 1064 515 16 2 v 1064 478 V 1064 442 V 1064 406 V 1064 370 V 1064 334 V 1064 298 V 1064 262 V 1064 226 V 1064 190 V 1064 154 V 1079 515 239 2 v 1109 538 2 25 v 1275 538 V 1151 529 2 16 v 1192 529 V 1234 529 V 1275 529 V 1099 582 a(0)g(4)1129 630 y(dyfesm)p 1102 514 17 299 v 1143 514 17 44 v 
1185 514 17 19 v 1301 514 2 360 v 1287 515 16 2 v 1287 478 V 1287 442 V 1287 406 V 1287 370 V 1287 334 V 1287 298 V 1287 262 V 1287 226 V 1287 190 V 1287 154 V 1302 515 239 2 v 1332 538 2 25 v 1498 538 V 1374 529 2 16 v 1415 529 V 1457 529 V 1498 529 V 1322 582 a(0)g(4)1378 630 y(mdg)p 1325 514 17 321 v 1366 514 17 37 v 1408 514 17 4 v 476 515 872 2 v 682 757 2 61 v 770 739 a Fm(0)113 b(1)f(2)h(3)p 1228 757 V 50 w(Av)o(erage)p 424 759 1028 2 v 449 801 a(ADM)p 682 819 2 61 v 140 w(0.88)50 b(0.11)h(0.01)g(0.00)p 1228 819 V 81 w(0.127)449 861 y(AR)o(C2D)p 682 879 V 91 w(0.19)f(0.49)h(0.30)g(0.02)p 1228 879 V 81 w(1.149)449 921 y(BDNA)p 682 940 V 113 w(0.35)f(0.37)h(0.24)g (0.03)p 1228 940 V 81 w(0.956)449 982 y(D)o(YFESM)p 682 1000 V 49 w(0.83)f(0.12)h(0.05)g(0.00)p 1228 1000 V 81 w(0.223)449 1042 y(MDG)p 682 1060 V 139 w(0.88)f(0.12)h(0.00)g(0.00)p 1228 1060 V 81 w(0.125)109 1194 y Fn(Figure)18 b(3)p Fm(:)k(Floating)16 b(p)q(oin)o(t)g(parallelism)e(v)o(ectors)i(for)g(selected)f(P)o(erfect)g (Benc)o(hmarks)0 1328 y(BDNA)d(and)i(MDG,)f(t)o(w)o(o)g(programs)g(represen)o (ting)f(the)h(application)g(area)h(of)f(Ph)o(ysical)f(Chemistry)l(.)0 1389 y(The)17 b(relativ)o(ely)d(high)i(v)m(alue,)g(0.53,)h(of)g(the)f (di\013erence)g(in)g(parallelism)e(illustrates)i(that)h(these)f(t)o(w)o(o)0 1449 y(programs)d(ha)o(v)o(e)f(di\013eren)o(t)g(parallelism)e(pro\014les.)20 b(Our)13 b(measuremen)o(ts)d(tell)h(us)j(that)f(eac)o(h)f(program)0 1509 y(exercises)j(the)h(CRA)l(Y)g(Y-MP)g(CPU)h(with)f(a)h(di\013eren)o(t)f (mix)f(of)i(parallelism.)i(Our)e(measuremen)o(ts)0 1569 y(do)c(not)h(tell)e (us)h(is)g(ho)o(w)g(w)o(ell)f(these)g(t)o(w)o(o)h(programs)g(represen)o(t)f (the)h(parallelism)e(pro\014les)i(of)g(Ph)o(ysical)0 1629 y(Chemistry)h (application)i(programs)h(in)f(general.)73 1690 y(Next)c(compare)g(ADM,)g(a)h (program)g(represen)o(ting)f(the)g(application)h(area)g(of)h(Meteorology)l(,) e(and)0 1750 y(MDG,)f(a)g(Ph)o(ysical)g(Chemistry)e(co)q(de.)19 b(The)12 
relatively low value of the difference in parallelism, 0.04, illustrates that these two programs have very similar parallelism profiles. Although the two programs come from different application areas, each program exercises the CRAY Y-MP CPU with a very similar mix of parallelism. Our measurements suggest that the presence of ADM and MDG in the Perfect Benchmarks set may be redundant, insofar as their suitability for testing the CRAY is concerned. One might suspect that they would also be redundant for testing other machines with vector architectures similar to the CRAY, but this hypothesis needs to be investigated.

                ADM    ARC2D    BDNA    DYFESM    MDG
    ADM        0.00
    ARC2D      0.74    0.00
    BDNA       0.51    0.26     0.00
    DYFESM     0.08    0.66     0.45    0.00
    MDG        0.04    0.76     0.53    0.11      0.00

Table 3: Differences in Parallelism (Frobenius error norms)

[Figure 4: bar charts, one panel per benchmark (adm, arc2d, bdna, dyfesm, mdg); vertical axis from 0.0 to 1.0, horizontal axis from 0 to 4. The plotted values are tabulated below.]

                 0      1      2      3     Average
    ADM        0.85   0.13   0.02   0.00    0.178
    ARC2D      0.29   0.38   0.26   0.08    1.128
    BDNA       0.60   0.24   0.10   0.05    0.611
    DYFESM     0.77   0.15   0.07   0.01    0.311
    MDG        0.89   0.10   0.01   0.00    0.118

Figure 4: Memory access parallelism vectors for selected Perfect Benchmarks

The same conclusion of redundancy might
b(b)q(e)h(also)h(b)q(e)g(dra)o(wn)g(from)e(our)i (measuremen)o(t)o(,)e(0.08,)j(of)f(the)f(di\013erence)f(in)h(parallelism)0 1509 y(in)17 b(D)o(YFESM,)f(a)h(program)g(represen)o(ting)f(the)h (application)g(area)h(of)f(Structural)g(Mec)o(hanics,)e(and)0 1569 y(MDG.)0 1714 y Fe(5.4)70 b(Prob)r(e)23 b(P)n(oin)n(ts)f(for)i(the)e (Baseline)f(P)n(erfect)h(Benc)n(hmarks)0 1806 y Fm(It)17 b(is)g(imp)q(ortan)o (t)g(to)h(realize)e(that)h(these)h(results)f(are)g(for)h(the)f(\\baseline")h (v)o(ersion)e(of)i(the)f(P)o(erfect)0 1866 y(Benc)o(hmarks.)26 b(In)19 b(the)f(P)o(erfect)g(metho)q(dology)l(,)g(baseline)g(refers)g(to)h(b) q(enc)o(hmark)e(exp)q(erimen)o(ts)f(in)0 1926 y(whic)o(h)c(no)i(man)o(ual)e (mo)q(di\014cations)h(can)g(b)q(e)g(made)f(to)i(the)f(source)g(co)q(de.)20 b(In)13 b(the)g(terminology)e(of)j Ff(x)p Fm(3,)0 1986 y(this)i(corresp)q (onds)h(to)g(the)f(placemen)o(t)d(of)k(a)f(prob)q(e)h(on)f(the)g(P)o(ath)h (to)f(P)o(erformance)e(at)j(the)f(lev)o(el)e(of)0 2047 y(\\Program.")22 b(The)16 b(P)o(erfect)f(metho)q(dology)g(allo)o(ws)i(for)f(another)h(t)o(yp)q (e)e(of)h(b)q(enc)o(hmark)f(exp)q(erimen)o(t)0 2107 y(called)h(\\optimized")f (in)h(whic)o(h)g(an)o(y)h(man)o(ual)e(mo)q(di\014cations)h(to)h(the)g (program)f(are)h(allo)o(w)o(ed.)22 b(The)0 2167 y(only)15 b(restriction)f(is) h(that)h(the)e(correct)h(solution)g(to)h(the)f(same)f(problem)f(b)q(e)j (obtained.)21 b(This)15 b(other)0 2227 y(b)q(enc)o(hmark)k(exp)q(erimen)o(t)f (corresp)q(onds)23 b(to)e(a)g(placemen)o(t)e(of)i(a)h(prob)q(e)f(on)h(the)e (P)o(ath)i(to)f(P)o(erfor-)0 2287 y(mance)d(at)i(the)f(lev)o(el)e(of)j (\\Algorithm")e(\\Mathematical)g(Mo)q(del,")i(or)f(\\Ph)o(ysical)g (Phenomenon")0 2348 y(dep)q(ending)c(on)g(what)h(represen)o(tation)e(of)i (the)e(problem)g(is)g(constan)o(t)i(and)f(the)g(exten)o(t)f(of)h(the)g(man-)0 2408 y(ual)i(mo)q(di\014cations)g(to)g(the)g(program.)24 b(Ev)o(en)16 b(though)i(the)f(results)g(presen)o(ted)f(here)h(do)h(not)f(apply)0 2468 
y(to)g(this)g(other)g(exp)q(erimen)o(t,)c(it)j(w)o(ould)h(b)q(e)g(in)o (teresting)f(to)h(examine)d(the)j(parallelism)d(of)j(the)g(opti-)0 2528 y(mized)f(co)q(des,)j(as)g(it)f(w)o(ould)h(pro)o(vide)e(insigh)o(t)h(in) o(to)g(the)g(amoun)o(t)g(of)h(lost)f(parallelism)e(that)j(could)0 2588 y(ha)o(v)o(e)e(b)q(een)h(preserv)o(ed)e(when)i(other)g(agen)o(ts)h (\(e.g.)25 b(Soft)o(w)o(are)18 b(Engineer,)f(Numerical)e(Analyst,)j(or)0 2649 y(Scien)o(tist\))d(activ)o(ely)f(participate)h(to)i(preserv)o(e)e(the)h (parallelism.)913 2828 y(18)p eop %%Page: 19 25 bop 0 203 a Fl(6)83 b(Conclusions)25 b(and)j(F)-7 b(uture)27 b(W)-7 b(ork)0 313 y Fm(In)13 b(this)g(pap)q(er)g(w)o(e)g(presen)o(ted)f(a)h (brief)g(o)o(v)o(erview)e(of)i(p)q(erformance)f(ev)m(aluation)h(and)h(b)q (enc)o(hmarking.)0 373 y(W)l(e)j(demonstrated)g(that)i(traditional)e(p)q (erformance)g(measuremen)o(ts)d(in)k(these)f(activities)f(are)i(re-)0 433 y(ally)c(a)i(direct)d(measuremen)o(t)f(of)j(the)g(parallelism)e(in)h(the) h(soft)o(w)o(are)g(and)h(hardw)o(are.)21 b(A)14 b(framew)o(ork)0 493 y(called)i(the)g(P)o(ath)h(to)h(P)o(erformance)d(w)o(as)i(dev)o(elop)q (ed)f(whic)o(h)g(iden)o(ti\014es)f(the)i(agen)o(ts)g(and)h(activities)0 554 y(that)g(c)o(hange)g(the)g(parallelism)e(as)i(it)g(mo)o(v)o(es)d(from)i (problem)g(to)h(solution.)27 b(W)l(e)17 b(sho)o(w)i(where)e(v)m(ar-)0 614 y(ious)k(curren)o(t)g(application)f(b)q(enc)o(hmarks)g(apply)h(prob)q(es) h(on)f(the)g(P)o(ath,)h(and)g(whic)o(h)e(agen)o(ts)i(are)0 674 y(b)q(eing)d(measured.)29 b(W)l(e)19 b(recast)g(the)g(traditional)g (time-based)f(p)q(erformance)g(measuremen)o(ts)e(in)o(to)0 734 y(parallelism-based)j(p)q(erformance)g(measuremen)n(ts)f(to)j(sho)o(w)f (that)h(understanding)g(p)q(erformance)0 794 y(implies)11 b(understanding)j (the)g(parallelism.)k(A)13 b(mac)o(hine-indep)q(enden)o(t)e(metho)q(dology)i (is)h(describ)q(ed)0 855 y(for)f(measuring)f(and)h(comparing)f(executed)f (parallelism.)17 b(A)c(sim)o(ulation)d(to)q(ol)k(is)e(used)h(to)g(apply)f (the)0 915 
y(metho)q(dology)i(to)h(sev)o(eral)f(of)h(the)f(P)o(erfect)g(Benc) o(hmarks)e(to)j(quan)o(tify)f(their)g(executed)f(parallelism)0 975 y(on)19 b(a)f(single)g(CRA)l(Y)f(Y-MP)h(CPU.)f(Our)h(results)g(suggest)h (that)g(since)e(some)g(of)i(the)e(b)q(enc)o(hmarks)0 1035 y(ha)o(v)o(e)12 b(a)h(similar)d(mix)g(of)j(di\013eren)o(t)f(lev)o(els)e(of)j(parallelism)d (on)j(the)f(Y-MP)l(,)g(the)g(b)q(enc)o(hmarking)f(utilit)o(y)0 1095 y(of)16 b(these)g(programs)h(on)f(the)g(Y-MP)g(ma)o(y)e(b)q(e)j (questioned)e(b)q(ecause)i(these)f(programs)g(exercise)e(the)0 1156 y(mac)o(hine)g(in)i(the)g(same)f(w)o(a)o(y)l(.)73 1216 y(W)l(e)22 b(supp)q(ort)g(the)g(idea)f(that)h(p)q(erformance)f(ev)m(aluation) g(and,)i(in)f(particular,)g(applications-)0 1276 y(lev)o(el)c(b)q(enc)o (hmarking)g(should)i(b)q(ecome)e(more)h(of)h(an)g(analytic,)g(scien)o (ti\014c)e(activit)o(y)l(.)30 b(T)l(o)20 b(do)h(this,)0 1336 y(b)q(etter)h(analytic)h(tec)o(hniques)e(are)h(needed)h(to)g(examine)d(ho)o (w)j(b)q(enc)o(hmarks)f(succeed)f(or)j(fail)e(to)0 1396 y(represen)o(t)13 b(their)h(in)o(tended)f(w)o(orkload)i(and)f(ho)o(w)h(they)f(succeed)f(or)i (fail)f(to)g(test)g(the)g(c)o(haracteristics)0 1457 y(of)i(the)g(mac)o(hines) e(w)o(e)i(use.)21 b(W)l(e)16 b(prop)q(ose)h(that)f(one)h(approac)o(h)f(to)o (w)o(ard)h(these)e(ends)i(is)e(to)i(fo)q(cus)g(on)0 1517 y(parallelism)g(as)j (the)g(essen)o(tial)e(quan)o(tit)o(y)h(and)h(to)g Fi(quantify)g Fm(it.)31 b(P)o(erformance)17 b(ev)m(aluation)j(w)o(ould)0 1577 y(b)q(ecome)15 b(parallelism)f(ev)m(aluation.)73 1637 y(W)l(e)23 b(plan)g(to)g(measure)f(and)h(compare)f(the)h(parallelism)d(c)o (haracteristics)i(of)h(other)g(p)q(opular)0 1697 y(b)q(enc)o(hmark)14 b(sets)i(to)g(quan)o(tify)e(their)h(similarities)d(and)17 b(di\013erences)d (on)i(the)g(CRA)l(Y)e(Y-MP)l(.)h(These)0 1758 y(results)e(can)g(b)q(e)g (merged)f(with)h(the)g(w)o(orkload)g(c)o(haracterization)f(results)h(pro)q (duced)h(b)o(y)e(other)h(stud-)0 1818 y(ies.)24 b(The)18 b(com)o(bined)d 
(analysis)j(of)f(program)h(and)g(w)o(orkload)g(c)o(haracteristics)e(should)i (help)f(justify)0 1878 y(the)d(inclusion)g(of)h(application)g(b)q(enc)o (hmark)e(programs)i(that)g(are)f(candidates)h(in)f(the)h(construction)0 1938 y(of)k(new)g(b)q(enc)o(hmark)e(sets.)29 b(W)l(e)19 b(w)o(ould)g(also)g (lik)o(e)e(to)i(construct)g(a)h(b)q(enc)o(hmark)d(set)i(based)g(on)g(its)0 1998 y(co)o(v)o(erage)c(of)i(v)m(arious)g(distributions)f(of)g(parallelism,)d (rather)j(than,)h(for)f(example,)e(on)i(application)0 2058 y(areas.)24 b(The)17 b(to)q(ols)h(and)g(metho)q(dology)e(to)i(do)f(this)g (could)g(b)q(e)g(based)h(on)f(the)g(approac)o(h)h(describ)q(ed)0 2119 y(here,)d(and)i(w)o(ould)f(require)f(the)h(understanding)h(of)g(lo)o (w-lev)o(el)d(parallelism)g(on)j(other)f(mac)o(hines.)0 2285 y Fl(7)83 b(Ac)n(kno)n(wledgmen)n(ts)0 2395 y Fm(The)15 b(authors)i(wish)e (to)g(thank)h(the)f(editors)g(and)h(anon)o(ymous)f(referees)f(for)h(their)g (though)o(tful)g(com-)0 2455 y(men)o(ts)g(whic)o(h)g(greatly)h(impro)o(v)o (ed)e(the)i(qualit)o(y)f(of)h(this)g(pap)q(er.)913 2828 y(19)p eop %%Page: 20 26 bop 0 203 a Fl(References)24 313 y Fm([1])24 b(F.)12 b(McMahon,)h(\\The)g (Liv)o(ermore)e(Fortran)i(Kernels:)19 b(A)12 b(test)h(of)g(the)f(n)o (umerical)e(p)q(erformance)100 373 y(range,")17 b(T)l(ec)o(h.)e(Rep.)h (UCRL-53745,)i(La)o(wrence)e(Liv)o(ermore)e(Lab,)j(Dec.)e(1986.)24 475 y([2])24 b(J.)18 b(Dongarra,)i(\\P)o(erformance)d(of)i(v)m(arious)f (computers)f(using)i(standard)g(linear)f(equations,")100 535 y(T)l(ec)o(h.)d(Rep.)h(23,)g(Argonne)h(National)f(Lab,)h(1988.)24 637 y([3])24 b(M.)18 b(Berry)g(et)g(al.,)h(\\The)g(Perfect)f(Club)h(b)q(enc)o (hmarks:)25 b(E\013ectiv)o(e)17 b(p)q(erformance)h(ev)m(aluation)100 697 y(of)23 b(sup)q(ercomputers,")f Fi(International)i(Journal)f(of)g(Sup)n (er)n(c)n(omputer)f(Applic)n(ations)p Fm(,)h(v)o(ol.)e(3,)100 757 y(pp.)16 b(5{40,)i(F)l(all)d(1989.)24 859 y([4])24 b(J.)d(Uniejewski,)f (\\SPEC)i(b)q(enc)o(hmark)d(suite:)30 b(designed)21 
b(for)g(to)q(da)o(y's)g (adv)m(anced)g(systems,")100 919 y(T)l(ec)o(h.)15 b(Rep.)h(1,)g(SPEC)h (Newsletter,)d(1989.)24 1021 y([5])24 b(D.)18 b(Bailey)f(et)h(al.,)g(\\The)g (NAS)g(parallel)f(b)q(enc)o(hmarks,")h(T)l(ec)o(h.)f(Rep.)g(RNR-91-002,)j (NASA)100 1081 y(Ames,)14 b(Jan)o(uary)j(1991.)24 1183 y([6])24 b(R.)16 b(P)l(.)f(W)l(eic)o(k)o(er,)e(\\An)j(o)o(v)o(erview)e(of)i(common)e (b)q(enc)o(hmarks,")h Fi(IEEE)i(Computer)p Fm(,)e(pp.)h(65{76,)100 1243 y(1990.)24 1344 y([7])24 b(M.)13 b(Berry)l(,)f(G.)h(Cyb)q(enk)o(o,)h (and)g(J.)f(Larson,)i(\\Scien)o(ti\014c)c(b)q(enc)o(hmark)h(c)o (haracterization,")h Fi(Par-)100 1405 y(al)r(lel)20 b(Computing)p Fm(,)c(v)o(ol.)f(17,)i(pp.)f(1173{1194)q(,)i(Decem)o(b)q(er)c(1991.)24 1506 y([8])24 b(A.)f(v)m(an)h(der)f(Steen,)i(\\Rep)q(ort)f(of)g(the)f(second) h(Eurob)q(en)g(w)o(orkshop)h(on)f(b)q(enc)o(hmarking,")100 1567 y Fi(Sup)n(er)n(c)n(omputer)p Fm(,)15 b(v)o(ol.)h(8,)g(pp.)g(15{19,)h (Septem)o(b)q(er)e(1991.)24 1668 y([9])24 b(J.)e(L.)g(Hennesy)e(and)j(D.)e (A.)g(P)o(atterson,)j Fi(Computer)e(A)o(r)n(chite)n(ctur)n(e:)32 b(A)23 b(Quantitative)h(Ap-)100 1728 y(pr)n(o)n(ach)p Fm(,)15 b(p.)h(36.)22 b(San)16 b(Mateo,)g(CA:)g(Morgan)h(Kaufmann,)e(1990.)0 1830 y([10])24 b(D.)d(J.)f(Kuc)o(k,)h Fi(The)h(Structur)n(e)g(of)f(Computers) h(and)g(Computations)p Fm(,)f(v)o(ol.)f(1,)i(p.)f(100.)36 b(New)100 1890 y(Y)l(ork:)21 b(John)c(Wiley)e(and)i(Sons,)f(1978.)0 1992 y([11])24 b(Cra)o(y)c(Researc)o(h,)g(Inc.,)f Fi(UNICOS)i(Performanc)n(e)g (Utilities)h(R)n(efer)n(enc)n(e)f(Manual)g(SR-2040)g(-)100 2052 y(6.0)p Fm(,)16 b(1991.)0 2154 y([12])24 b(D.)18 b(K.)g(Bradley)g(and)g (J.)g(L.)h(Larson,)h(\\Fine-grained)e(measuremen)o(ts)d(of)k(lo)q(op)g(p)q (erformance)100 2214 y(on)24 b(the)f(CRA)l(Y)f(Y-MP,")g(in)h Fi(Pr)n(o)n(c)n(e)n(e)n(dings)g(of)g(the)h(Fifth)g(SIAM)f(Confer)n(enc)n(e)i (on)f(Par)n(al)r(lel)100 2274 y(Pr)n(o)n(c)n(essing)17 b(for)g(Scienti\014c)j (Computing)p Fm(,)c(Marc)o(h)g(25{27)i(1991.)0 2376 y([13])24 b(D.)19 
b(Bradley)l(,)f(G.)g(Cyb)q(enk)o(o,)h(H.)f(Gao,)i(J.)f(Larson,)h(F.)e (Ahmad,)g(J.)h(Golab,)g(and)h(M.)e(Strak)m(a,)100 2436 y(\\Sup)q(ercomputer)d (w)o(orkload)h(decomp)q(osition)f(and)h(analysis,")g(in)g Fi(Pr)n(o)n(c)n(e)n (e)n(dings)f(of)i(the)g(A)o(CM)100 2496 y(International)i(Confer)n(enc)n(e)f (on)g(Sup)n(er)n(c)n(omputing)p Fm(,)e(pp.)g(458{467,)i(June)f(17{21)h(1991.) 0 2598 y([14])24 b(H.)18 b(Gao)h(and)g(J.)f(L.)g(Larson,)i(\\A)e(y)o(ear's)g (pro\014le)g(of)g(sup)q(ercomputer)f(users)i(in)f(di\013eren)o(t)f(ap-)100 2658 y(plication)f(areas)h(using)f(a)h(hardw)o(are)g(p)q(erformance)e (monitor.")g(W)l(ork)h(in)g(progress,)h(1992.)913 2828 y(20)p eop %%Page: 21 27 bop 0 195 a Fm([15])24 b(J.)f(L.)g(Larson,)j(\\Collecting)c(and)i(in)o (terpreting)d(hpm)h(p)q(erformance)g(data)i(on)f(the)g(CRA)l(Y)100 255 y(Y-MP,")16 b Fi(NCSA)i(Datalink)p Fm(,)f(pp.)f(14{24,)i(No)o(v)o(em)o(b) q(er-Dece)o(m)n(b)q(er)13 b(1991.)0 357 y([16])24 b(D.)e(K.)e(Chen,)j(H.)d (M.)h(Su,)i(and)f(P)l(.)f(Y)l(ew,)h(\\The)g(impact)d(of)j(sync)o(hronization) f(and)h(gran)o(u-)100 417 y(larit)o(y)17 b(on)h(parallel)f(systems,")g(in)g Fi(Pr)n(o)n(c)n(e)n(e)n(dings)h(of)h(the)g(17th)g(International)h(Symp)n (osium)e(on)100 477 y(Computer)g(A)o(r)n(chite)n(ctur)n(e)p Fm(,)d(pp.)h(239{249,)i(June)f(1990.)913 2828 y(21)p eop %%Trailer end userdict /end-hook known{end-hook}if %%EOF From owner-pbwg-comm@CS.UTK.EDU Thu Jun 3 11:46:08 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA18007; Thu, 3 Jun 93 11:46:08 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA10460; Thu, 3 Jun 93 11:46:10 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 3 Jun 1993 11:46:06 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sp2.csrd.uiuc.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA10426; Thu, 3 Jun 93 11:45:47 -0400 Received: by sp2.csrd.uiuc.edu id AA04359 (5.67a/IDA-1.5); Thu, 3 Jun 1993 10:45:27 
-0500 Date: Thu, 3 Jun 1993 10:45:27 -0500 From: "John L. Larson" Message-Id: <199306031545.AA04359@sp2.csrd.uiuc.edu> To: pbwg-comm@cs.utk.edu, perfect.steering@csrd.uiuc.edu Subject: workload paper %!PS-Adobe-2.0 %%Creator: dvips 5.47 Copyright 1986-91 Radical Eye Software %%Title: m1.dvi %%Pages: 25 1 %%BoundingBox: 0 0 612 792 %%EndComments
D<3803F020380C0C60EA1802383001E0EA70000060136012E0A21420 A36C1300A21278127FEA3FF0EA1FFE6C7E0003138038003FC0EB07E01301EB00F0A214707EA46C 1360A26C13C07E38C8018038C60700EA81FC14247DA21B>I<007FB512F8397807807800601418 00401408A300C0140C00801404A400001400B3A3497E0003B5FC1E227EA123>I<39FFFC07FF39 0FC000F86C4813701520B3A5000314407FA2000114806C7E9038600100EB3006EB1C08EB03F020 237EA125>I<3BFFF03FFC03FE3B1F8007E000F86C486C4813701720A26C6C6C6C1340A32703C0 02F01380A33B01E004780100A33A00F0083C02A39039F8183E06903978101E04A2137C90393C20 0F08A390391E400790A390390F8003E0A36D486C5AA36D5C010213002F237FA132>87 D97 D<120E12FE121E120EAB131FEB61C0EB8060380F00 30000E1338143C141C141EA7141C143C1438000F1370380C8060EB41C038083F0017237FA21B> II<14E0130F13011300ABEA01F8EA0704EA0C02EA1C01EA3800 1278127012F0A7127012781238EA1801EA0C0238070CF03801F0FE17237EA21B>II<133C13C6EA018F1203130FEA0700A9EAFFF8EA0700B213 80EA7FF8102380A20F>I<14703801F19838071E18EA0E0E381C0700A2003C1380A4001C1300A2 EA0E0EEA0F1CEA19F00010C7FCA21218A2EA1FFE380FFFC014E0383800F0006013300040131812 C0A300601330A2003813E0380E03803803FE0015217F9518>I<120E12FE121E120EABEB1F80EB 60C0EB80E0380F0070A2120EAF38FFE7FF18237FA21B>I<121C123EA3121CC7FCA8120E12FE12 1E120EB1EAFFC00A227FA10E>II<120E12FE121E120EB3ADEAFFE00B237FA20E>108 D<390E1FC07F3AFE60E183803A1E807201C03A0F003C00E0A2000E1338AF3AFFE3FF8FFE27157F 942A>I<380E1F8038FE60C0381E80E0380F0070A2120EAF38FFE7FF18157F941B>III114 DI<1202A41206A3120E121E123EEAFFF8EA0E 00AB1304A6EA07081203EA01F00E1F7F9E13>I<000E137038FE07F0EA1E00000E1370AD14F0A2 38060170380382783800FC7F18157F941B>I<38FFC1FE381E0078000E13301420A26C1340A238 038080A33801C100A2EA00E2A31374A21338A3131017157F941A>I<38FF83FE381F01F0380E00 C06C1380380381001383EA01C2EA00E41378A21338133C134E138EEA0187EB0380380201C00004 13E0EA0C00383E01F038FF03FE17157F941A>120 D E /Fr 29 120 df12 D<127812FCA212FEA2127A1202A51204A31208A212101220 A2124007147AB112>39 D<1403A34A7EA24A7EA3EC17E01413A2EC23F01421A2EC40F8A3EC807C 
A20101137EEC003EA20102133F81A2496D7EA3496D7EA2011880011FB5FCA29039200003F01501 A249801500A249147CA348C87EA248153F825AD81F80EC3F80D8FFE0903803FFFCA22E327EB132 >65 D<91383FE001903901FFF803903807F01E90391F800307013EC712870178144F49142F4848 141F4848140F485A000F150790C8FC481503121E123E003C1501127CA30078150012F8AB127812 7C1601A2123C123E121E001F15027E6D1406000715046C6C14086C7E6C6C141001781420013E14 C090391F800380903907F00F00903801FFFC9038003FE028337CB130>67 D72 D77 D80 D82 D<90387F80203801FFE03907C07860380F001C 001EEB06E048130300381301007813001270156012F0A21520A37E1500127C127E7E13C0EA1FF8 6CB47E6C13F86C7FC613FF010F1380010013C0EC1FE01407EC03F01401140015F8A26C1478A57E 15706C14F015E07E6CEB01C000ECEB038000C7EB070038C1F01E38807FFCEB0FF01D337CB125> I85 D89 D<13FE380303C0380C00E00010 137080003C133C003E131C141EA21208C7FCA3EB0FFEEBFC1EEA03E0EA0F80EA1F00123E123C12 7C481404A3143EA21278007C135E6CEB8F08390F0307F03903FC03E01E1F7D9E21>97 D99 DIII<15F090387F03083901C1C41C380380E8390700700848EB 7800001E7FA2003E133EA6001E133CA26C5B6C13706D5A3809C1C0D8087FC7FC0018C8FCA5121C 7E380FFFF86C13FF6C1480390E000FC00018EB01E048EB00F000701470481438A500701470A26C 14E06CEB01C00007EB07003801C01C38003FE01E2F7E9F21>II<120FEA1F80A4EA0F00C7FCABEA078012FFA2120F1207B3A6EA0FC0EAFFF8A20D307EAF 12>I108 D<260780FEEB1FC03BFF83078060F0903A8C03C180783B0F9001E2003CD807A013E4DA00F47F01 C013F8A2495BB3A2486C486C133F3CFFFC1FFF83FFF0A2341F7E9E38>I<380780FE39FF830780 90388C03C0390F9001E0EA07A06E7E13C0A25BB3A2486C487E3AFFFC1FFF80A2211F7E9E25>I< EB1FC0EBF0783801C01C38070007481480001EEB03C0001C1301003C14E0A248EB00F0A300F814 F8A8007814F0007C1301003C14E0A26CEB03C0A26CEB07803907800F003801C01C3800F078EB1F C01D1F7E9E21>I<380781FC38FF860790388803C0390F9001E03907A000F001C013785B153CA2 153E151E151FA9153EA2153C157C15786D13F013A0EC01E090389803809038860F00EB81F80180 C7FCAB487EEAFFFCA2202D7E9E25>I<380783E038FF8C18EB907C120FEA07A0EBC0381400A35B B3487EEAFFFEA2161F7E9E19>114 D<3801FC10380E0330381800F048137048133012E01410A3 
A Year's Profile of Academic Supercomputer Users
Using the CRAY Hardware Performance Monitor

Hui Gao (1)
Center for Supercomputing Research and Development
465 CSRL, 1308 W. Main St.
Urbana, IL 61801
Email: hgao@csrd.uiuc.edu
Tel: (217)-244-5287
Fax: (217)-244-1351

John L. Larson (2)
Center for Supercomputing Research and Development
465 CSRL, 1308 W. Main St.
Urbana, IL 61801
Email: jlarson@csrd.uiuc.edu
Tel: (217)-244-5908
Fax: (217)-244-1351

April 8, 1993

(1) Current address: Kuck & Associates, Inc., 1906 Fox Drive, Champaign,
    IL 61820, Email: hgao@kai.com. Tel: (217)-356-2288, Fax: (217)-356-5199
(2) presenting author

Abstract

This paper describes some preliminary results about a workload
characterization study at a national supercomputer center. This study is
part of a larger project to develop an analytic methodology for benchmark
set construction. Our approach is to study a particular workload in detail
and to measure its performance characteristics. These results will give us
a performance specification that a benchmark set representing this workload
must approximate in a measurable way. We can then investigate how to
analytically select benchmarks to represent this particular workload. We
expect the methodology we develop to be applicable to workloads at other
sites. Our study reports on 292,254 user jobs over a period of 13 months
representing about one half of the wall clock time available on the
4-processor CRAY Y-MP at the National Center for Supercomputing
Applications during this time. The performance statistics we gathered in
some sense define the workload because of the large fraction of the
available time that was recorded. These statistics will allow us to
"reverse engineer" a set of benchmarks that analytically represent the
workload at this site.

1 Introduction

This paper describes some preliminary results about a workload
characterization study at a national supercomputer center. This study is
part of a larger project to develop an analytic methodology for benchmark
set construction. Our approach is to study a particular workload in detail
and to measure its performance characteristics. These results will give us
a performance specification that a benchmark set representing this workload
must approximate in a measurable way. We can then investigate how to
analytically select benchmarks to represent this particular workload. We
expect the methodology we develop to be applicable to workloads at other
sites.

We are motivated by a current lack of detailed understanding
(quantification) about how computers are used. There is a need to know more
about the use of computers than the number of hours consumed in different
application areas[20]. The limited performance information available about
how computers are used makes it difficult to construct benchmarks that are
representative of the workloads at various sites or of supercomputers in
general, if workloads are not known or difficult to understand in a
quantifiable way.

We believe that we must address some basic problems in performance
evaluation that are still as true today as they were 20 years ago[18]:
Grenander and Tsao (1971) wrote: "We believe that no real significant
advance in the evaluation of systems can be expected until some
breakthrough is made in the characterization of the workload[11]."
Ferrari (1972) states: "The lack of satisfactory workload characterization
techniques is one of the main reasons for the primitive state of this
important branch of computer engineering" (i.e., performance
evaluation)[9].

To address these problems at least three approaches to workload
characterization and benchmark construction have been proposed[10]:

  * Resource Consumption Approach. In this approach a benchmark set
    represents a workload well if it consumes resources at the same rate
    as the real workload, e.g. CPU time used, memory space, and duration
    of I/O.

  * Functional Approach. In this approach a benchmark set represents a
    workload well if it performs the same functions as the workload, e.g.
    a real workload performing payroll activities should be represented by
    payroll programs. This application area coverage approach is followed
    by many current benchmarks, e.g. Perfect[1].

  * Performance Oriented Approach. In this approach a benchmark set
    represents a workload well if it causes the system to exhibit the same
    performance characteristics as the workload, e.g. Mflops, average
    vector length. This approach was used at Los Alamos on the CRAY-1 by
    software instrumentation of time-consuming codes (without the use of a
    hardware performance monitor)[6][16].

While each approach has disadvantages, we find the Performance Oriented
Approach the most attractive. It has the advantage that it forces the
characterization and construction process to focus on the performance
variables of interest in any particular study.

Knowing how computers are used is the first step in constructing benchmark
sets that represent the workload of those computers. If one were to
construct a benchmark set to represent a workload then the selected
benchmark programs should collectively have a distribution of performance
characteristics that is similar to that of the workload. In this paper we
see, for the first time, what the values of various performance metrics
actually are for the real workload at an academic site.
We have selected the academic workload of the CRAY Y-MP at the National
Center for Supercomputing Applications (NCSA) as our initial test case.
This selection was made to take advantage of the unique opportunity
afforded by the installation of special recording software that has
collected hardware performance monitor (HPM) information on the workload
of this machine since June 1991.

Other studies have been made using the HPM (primarily on the CRAY X-MP)
for program and workload analysis in an effort to gain a more detailed
understanding of the performance characteristics of benchmarks, user
programs, and workloads. These studies include:

  * Bradley[5] - 5 codes from the Perfect Benchmark
  * Berry[2] - 9 codes from the Perfect Benchmark
  * Bradley[4] - 11 codes from large time-consuming users at NCSA
  * Delic[8] - 24 codes from the users at the Ohio Supercomputer Center
  * Berry[3] - most of the current, popular benchmark set
  * Nelson[17] - 30 hours at the Lawrence Livermore National Laboratory
  * Sato[19] - 8 days at the National Center for Atmospheric Research
  * Williams[22][23] - 7 weekends at 2 government sites
  * West[24] - 35 days at the Ontario Centre for Large Scale Computation.

Our study reports on 292,254 user jobs over a period of 13 months
representing about one-half of the wall clock time available on the
4-processor CRAY Y-MP at NCSA during this time. The performance statistics
we gathered in some sense define the workload because of the large
fraction of the available time that was recorded. These statistics will
allow us to "reverse engineer" a set of benchmarks representing the
workload at this site.

The next section begins by giving an overview of the CRAY Y-MP
architecture and its hardware performance monitor. The system software
that records HPM data is then described. The methodology section ends with
the definitions of the performance metrics and characteristics that we
used to measure the workload. The results section follows next and gives
statistics about the workload as a whole and the top 10 time-consuming
application areas. Our conclusions and plans for future work are found in
the last section.

2 Methodology

This section gives a brief overview of the CRAY Y-MP architecture and the
hardware performance monitor. The system software that records
HPM information about user jobs is also described. Finally, we define the
performance metrics derived from the HPM counters.

2.1 CRAY Y-MP architecture and the Hardware Performance Monitor

In order to better appreciate the performance data that is being collected
and to more easily interpret what it means, it is useful to review the
CRAY Y-MP architecture and the hardware performance monitor. A picture of
the basic architecture of a single CRAY Y-MP processor is given in
Figure 1. The memory ports, register sets, and functional units are
illustrated individually. A clock period on the CRAY Y-MP is 6
nanoseconds.

Machine instructions contained in the instruction buffers are decoded and
signals are generated to the rest of the CPU. These signals are issued
(i.e. the instruction issues) when, in general, the resources (registers,
functional units or ports) needed by the instruction are not reserved or
in use by some previous instruction. Once an instruction is issued, the
instruction functional unit attempts to issue the next instruction, and
does not wait for the previously issued instruction to complete its
execution. Hence, many instructions may be in execution simultaneously. If
the resources needed by an instruction are currently not available, then
the instruction functional unit "holds issue" of the instruction until the
needed resources become available. Note that while the instruction
functional unit is holding issue, the rest of the CPU is busy executing
previously issued instructions.

The CPU logic that decodes and issues instructions for execution may be
considered an instruction "functional

[Figure: block diagram, drawing not recoverable from the PostScript
source. Legible labels: Memory; Registers; Functional Units; Central
Memory; I/O (FU); Memory Ports (FUs); paths A, B, C, D; ctr-6; ctr-7;
V Registers; S/T Registers.]
1096 1677 V 822 1638 a(A/B)h(Registers)p 808 2109 290 2 v 808 1989 V 808 2109 2 122 v 1096 2109 V 829 2039 a(Instruction)829 2085 y(Bu\013ers)600 2265 y Fg(.)-6 b(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h (.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h (.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)720 789 y(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g (.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f (.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)280 b(.)-6 b(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h (.)g(.)g(.)f(.)1128 669 y(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g (.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)1128 1029 y(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.) h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)-44 b(.)1128 1028 y(.)1128 1026 y(.)1128 1024 y(.)1128 1023 y(.)1128 1021 y(.)1128 1019 y(.)1128 1018 y(.)1128 1016 y(.)1128 1014 y(.)1128 1013 y(.)1128 1011 y(.)1128 1009 y(.)1128 1008 y(.)1128 1006 y(.)1128 1004 y(.)1128 1003 y(.)1128 1001 y(.)1128 999 y(.)1128 998 y(.)1128 996 y(.)1128 994 y(.)1128 993 y(.)1128 991 y(.)1128 989 y(.)1128 988 y(.)1128 986 y(.)1128 984 y(.)1128 983 y(.)1128 981 y(.)1128 979 y(.)1128 978 y(.)1128 976 y(.)1128 974 y(.)1128 973 y(.)1128 971 y(.)1128 969 y(.)1128 968 y(.)1128 966 y(.)1128 964 y(.)1128 963 y(.)1128 961 y(.)1128 959 y(.)1128 958 y(.)1128 956 y(.)1128 954 y(.)1128 953 y(.)1128 951 y(.)1128 949 y(.)1128 948 y(.)1128 946 y(.)1128 944 y(.)1128 943 y(.)1128 941 y(.)1128 939 y(.)1128 938 y(.)1128 936 y(.)1128 934 y(.)1128 933 y(.)1128 931 y(.)1128 929 y(.)1128 928 y(.)1128 926 y(.)1128 924 y(.)1128 923 y(.)1128 921 y(.)1128 919 y(.)1128 918 y(.)1128 916 y(.)1128 914 y(.)1128 913 y(.)1128 911 y(.)1128 909 y(.)1128 908 y(.)1128 906 y(.)1128 904 y(.)1128 903 y(.)1128 901 y(.)1128 899 y(.)1128 898 y(.)1128 896 y(.)1128 894 y(.)1128 893 y(.)1128 891 y(.)1128 889 y(.)1128 888 
y(.)1128 886 y(.)1128 884 y(.)1128 883 y(.)1128 881 y(.)1128 879 y(.)1128 878 y(.)1128 876 y(.)1128 874 y(.)1128 873 y(.)1128 871 y(.)1128 869 y(.)1128 868 y(.)1128 866 y(.)1128 864 y(.)1128 863 y(.)1128 861 y(.)1128 859 y(.)1128 858 y(.)1128 856 y(.)1128 854 y(.)1128 853 y(.)1128 851 y(.)1128 849 y(.)1128 848 y(.)1128 846 y(.)1128 844 y(.)1128 843 y(.)1128 841 y(.)1128 839 y(.)1128 838 y(.)1128 836 y(.)1128 834 y(.)1128 833 y(.)1128 831 y(.)1128 829 y(.)1128 828 y(.)1128 826 y(.)1128 824 y(.)1128 823 y(.)1128 821 y(.)1128 819 y(.)1128 818 y(.)1128 816 y(.)1128 814 y(.)1128 813 y(.)1128 811 y(.)1128 809 y(.)1128 808 y(.)1128 806 y(.)1128 804 y(.)1128 803 y(.)1128 801 y(.)1128 799 y(.)1128 798 y(.)1128 796 y(.)1128 794 y(.)1128 793 y(.)1128 791 y(.)1128 789 y(.)1128 788 y(.)1128 786 y(.)1128 784 y(.)1128 783 y(.)1128 781 y(.)1128 779 y(.)1128 778 y(.)1128 776 y(.)1128 774 y(.)1128 773 y(.)1128 771 y(.)1128 769 y(.)1128 768 y(.)1128 766 y(.)1128 764 y(.)1128 763 y(.)1128 761 y(.)1128 759 y(.)1128 758 y(.)1128 756 y(.)1128 754 y(.)1128 753 y(.)1128 751 y(.)1128 749 y(.)1128 748 y(.)1128 746 y(.)1128 744 y(.)1128 743 y(.)1128 741 y(.)1128 739 y(.)1128 738 y(.)1128 736 y(.)1128 734 y(.)1128 733 y(.)1128 731 y(.)1128 729 y(.)1128 728 y(.)1128 726 y(.)1128 724 y(.)1128 723 y(.)1128 721 y(.)1128 719 y(.)1128 718 y(.)1128 716 y(.)1128 714 y(.)1128 713 y(.)1128 711 y(.)1128 709 y(.)1128 708 y(.)1128 706 y(.)1128 704 y(.)1128 703 y(.)1128 701 y(.)1128 699 y(.)1128 698 y(.)1128 696 y(.)1128 694 y(.)1128 693 y(.)1128 691 y(.)1128 689 y(.)1128 688 y(.)1128 686 y(.)1128 684 y(.)1128 683 y(.)1128 681 y(.)1128 679 y(.)1128 678 y(.)1128 676 y(.)1128 674 y(.)1128 673 y(.)1128 671 y(.)1128 669 y(.)1128 1089 y(.)-6 b(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g (.)f(.)720 1209 y(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f (.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f 
(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)280 b(.)-6 b(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h (.)g(.)g(.)f(.)1128 1377 y(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.) g(.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)-44 b(.)1128 1376 y(.)1128 1374 y(.)1128 1372 y(.)1128 1371 y(.)1128 1369 y(.)1128 1367 y(.)1128 1366 y(.)1128 1364 y(.)1128 1362 y(.)1128 1361 y(.)1128 1359 y(.)1128 1357 y(.)1128 1356 y(.)1128 1354 y(.)1128 1352 y(.)1128 1351 y(.)1128 1349 y(.)1128 1347 y(.)1128 1346 y(.)1128 1344 y(.)1128 1342 y(.)1128 1341 y(.)1128 1339 y(.)1128 1337 y(.)1128 1336 y(.)1128 1334 y(.)1128 1332 y(.)1128 1331 y(.)1128 1329 y(.)1128 1327 y(.)1128 1326 y(.)1128 1324 y(.)1128 1322 y(.)1128 1321 y(.)1128 1319 y(.)1128 1317 y(.)1128 1316 y(.)1128 1314 y(.)1128 1312 y(.)1128 1311 y(.)1128 1309 y(.)1128 1307 y(.)1128 1306 y(.)1128 1304 y(.)1128 1302 y(.)1128 1301 y(.)1128 1299 y(.)1128 1297 y(.)1128 1296 y(.)1128 1294 y(.)1128 1292 y(.)1128 1291 y(.)1128 1289 y(.)1128 1287 y(.)1128 1286 y(.)1128 1284 y(.)1128 1282 y(.)1128 1281 y(.)1128 1279 y(.)1128 1278 y(.)1128 1276 y(.)1128 1274 y(.)1128 1273 y(.)1128 1271 y(.)1128 1269 y(.)1128 1268 y(.)1128 1266 y(.)1128 1264 y(.)1128 1263 y(.)1128 1261 y(.)1128 1259 y(.)1128 1258 y(.)1128 1256 y(.)1128 1254 y(.)1128 1253 y(.)1128 1251 y(.)1128 1249 y(.)1128 1248 y(.)1128 1246 y(.)1128 1244 y(.)1128 1243 y(.)1128 1241 y(.)1128 1239 y(.)1128 1238 y(.)1128 1236 y(.)1128 1234 y(.)1128 1233 y(.)1128 1231 y(.)1128 1229 y(.)1128 1228 y(.)1128 1226 y(.)1128 1224 y(.)1128 1223 y(.)1128 1221 y(.)1128 1219 y(.)1128 1218 y(.)1128 1216 y(.)1128 1214 y(.)1128 1213 y(.)1128 1211 y(.)1128 1209 y(.)1128 1208 y(.)1128 1206 y(.)1128 1204 y(.)1128 1203 y(.)1128 1201 y(.)1128 1199 y(.)1128 1198 y(.)1128 1196 y(.)1128 1194 y(.)1128 1193 y(.)1128 1191 y(.)1128 1189 y(.)1128 1188 y(.)1128 1186 y(.)1128 1184 y(.)1128 1183 y(.)1128 1181 y(.)1128 1179 y(.)1128 1178 y(.)1128 1176 y(.)1128 1174 y(.)1128 1173 y(.)1128 
1171 y(.)1128 1169 y(.)1128 1168 y(.)1128 1166 y(.)1128 1164 y(.)1128 1163 y(.)1128 1161 y(.)1128 1159 y(.)1128 1158 y(.)1128 1156 y(.)1128 1154 y(.)1128 1153 y(.)1128 1151 y(.)1128 1149 y(.)1128 1148 y(.)1128 1146 y(.)1128 1144 y(.)1128 1143 y(.)1128 1141 y(.)1128 1139 y(.)1128 1138 y(.)1128 1136 y(.)1128 1134 y(.)1128 1133 y(.)1128 1131 y(.)1128 1129 y(.)1128 1128 y(.)1128 1126 y(.)1128 1124 y(.)1128 1123 y(.)1128 1121 y(.)1128 1119 y(.)1128 1118 y(.)1128 1116 y(.)1128 1114 y(.)1128 1113 y(.)1128 1111 y(.)1128 1109 y(.)1128 1108 y(.)1128 1106 y(.)1128 1104 y(.)1128 1103 y(.)1128 1101 y(.)1128 1099 y(.)1128 1098 y(.)1128 1096 y(.)1128 1094 y(.)1128 1093 y(.)1128 1091 y(.)1128 1089 y(.)720 1629 y(.)-6 b(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.) g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g (.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)280 b(.)-6 b(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f (.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f (.)h(.)g(.)f(.)h(.)g(.)f(.)720 2049 y(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h (.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h (.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)f(.)h (.)g(.)f(.)280 b(.)-6 b(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g (.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g (.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)1452 993 y(.)h(.)g(.)f(.)h(.)g(.)g (.)f(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)1452 1041 y(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.) 
h(.)g(.)f(.)h(.)g(.)g(.)f(.)1452 1089 y(.)h(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g (.)f(.)h(.)g(.)g(.)f(.)f(.)1476 1088 y(.)1476 1086 y(.)1476 1084 y(.)1476 1083 y(.)1476 1081 y(.)1476 1079 y(.)1476 1078 y(.)1476 1076 y(.)1476 1074 y(.)1476 1073 y(.)1476 1071 y(.)1476 1069 y(.)1476 1068 y(.)1476 1066 y(.)1476 1064 y(.)1476 1063 y(.)1476 1061 y(.)1476 1059 y(.)1476 1058 y(.)1476 1056 y(.)1476 1054 y(.)1476 1053 y(.)1476 1051 y(.)1476 1049 y(.)1476 1048 y(.)1476 1046 y(.)1476 1044 y(.)1476 1043 y(.)1476 1041 y(.)1476 1039 y(.)1476 1037 y(.)1476 1036 y(.)1476 1034 y(.)1476 1032 y(.)1476 1031 y(.)1476 1029 y(.)1476 1027 y(.)1476 1026 y(.)1476 1024 y(.)1476 1022 y(.)1476 1021 y(.)1476 1019 y(.)1476 1017 y(.)1476 1016 y(.)1476 1014 y(.)1476 1012 y(.)1476 1011 y(.)1476 1009 y(.)1476 1007 y(.)1476 1006 y(.)1476 1004 y(.)1476 1002 y(.)1476 1001 y(.)1476 999 y(.)1476 997 y(.)1476 996 y(.)1476 994 y(.)1476 992 y(.)1476 991 y(.)1476 989 y(.)1476 987 y(.)1476 986 y(.)1476 984 y(.)1476 982 y(.)1476 981 y(.)1476 979 y(.)1476 977 y(.)1476 976 y(.)1476 974 y(.)1476 972 y(.)1476 971 y(.)1476 969 y(.)1476 967 y(.)1476 965 y(.)1476 964 y(.)1476 962 y(.)1476 960 y(.)1476 959 y(.)1476 957 y(.)1476 955 y(.)1476 954 y(.)1476 952 y(.)1476 950 y(.)1476 949 y(.)1476 947 y(.)1476 945 y(.)g(.)1476 947 y(.)1477 949 y(.)1477 950 y(.)1477 952 y(.)1477 954 y(.)1478 955 y(.)1478 957 y(.)1479 958 y(.)1479 960 y(.)1479 962 y(.)1480 963 y(.)1481 965 y(.)1481 966 y(.)1482 968 y(.)1482 969 y(.)1483 971 y(.)1484 972 y(.)1485 974 y(.)1485 975 y(.)1486 977 y(.)1487 978 y(.)1476 945 y(.)1476 947 y(.)1476 949 y(.)1476 950 y(.)1475 952 y(.)1475 954 y(.)1475 955 y(.)1474 957 y(.)1474 958 y(.)1474 960 y(.)1473 962 y(.)1473 963 y(.)1472 965 y(.)1471 966 y(.)1471 968 y(.)1470 969 y(.)1469 971 y(.)1469 972 y(.)1468 974 y(.)1467 975 y(.)1466 977 y(.)1465 978 y(.)1346 917 y Fm(ctr-3,4,5)768 2049 y Fg(.)768 2048 y(.)768 2046 y(.)768 2044 y(.)768 2043 y(.)768 2041 y(.)768 2039 y(.)768 2038 y(.)768 2036 y(.)768 2034 y(.)768 
2033 y(.)768 2031 y(.)768 2029 y(.)768 2027 y(.)768 2026 y(.)768 2024 y(.)768 2022 y(.)768 2021 y(.)768 2019 y(.)768 2017 y(.)768 2016 y(.)768 2014 y(.)768 2012 y(.)768 2011 y(.)768 2009 y(.)768 2007 y(.)768 2006 y(.)768 2004 y(.)768 2002 y(.)768 2001 y(.)768 1999 y(.)768 1997 y(.)768 1995 y(.)768 1994 y(.)768 1992 y(.)768 1990 y(.)768 1989 y(.)768 1987 y(.)768 1985 y(.)768 1984 y(.)768 1982 y(.)768 1980 y(.)768 1979 y(.)768 1977 y(.)768 1975 y(.)768 1974 y(.)768 1972 y(.)768 1970 y(.)768 1969 y(.)768 1967 y(.)768 1965 y(.)768 1963 y(.)768 1962 y(.)768 1960 y(.)768 1958 y(.)768 1957 y(.)768 1955 y(.)768 1953 y(.)g(.)768 1955 y(.)769 1957 y(.)769 1958 y(.)769 1960 y(.)769 1962 y(.)770 1963 y(.)770 1965 y(.)771 1966 y(.)771 1968 y(.)771 1970 y(.)772 1971 y(.)773 1973 y(.)773 1974 y(.)774 1976 y(.)774 1977 y(.)775 1979 y(.)776 1980 y(.)777 1982 y(.)777 1983 y(.)778 1985 y(.)779 1986 y(.)768 1953 y(.)768 1955 y(.)768 1957 y(.)768 1958 y(.)767 1960 y(.)767 1962 y(.)767 1963 y(.)766 1965 y(.)766 1966 y(.)766 1968 y(.)765 1970 y(.)765 1971 y(.)764 1973 y(.)763 1974 y(.)763 1976 y(.)762 1977 y(.)761 1979 y(.)761 1980 y(.)760 1982 y(.)759 1983 y(.)758 1985 y(.)757 1986 y(.)735 1905 y Fm(ctr-2)p 1168 849 338 2 v 1168 549 V 1168 849 2 302 v 1504 849 V 1182 606 a Fl(In)o(terger)15 b(Add\(64\))1182 662 y(Logical)1182 716 y(Shift)1182 764 y(P)o(op/P)o(arit)o(y)1182 823 y(2nd/Logical)1554 718 y(V)m(ector)p 1168 1149 290 2 v 1168 969 V 1168 1149 2 182 v 1456 1149 V 1182 1013 a(FP)g(Add)1182 1058 y(FP)g(Multiply)1182 1112 y(FP)g(Recipro)q(cal)1525 1040 y(Floating)1525 1093 y(P)o(oin)o(t)p 1168 1497 338 2 v 1168 1257 V 1168 1497 2 242 v 1504 1497 V 1191 1307 a(In)o(teger)f(Add\(64\))1191 1364 y(Logical)1191 1417 y(Shift)1191 1466 y(P)o(op/P)o(arit)o(y/LZ)1557 1391 y(Scalar)p 1168 1701 338 2 v 1168 1581 V 1168 1701 2 122 v 1504 1701 V 1185 1621 a(In)o(teger)h(Add\(32\))1185 1680 y(In)o(teger)g(Mult\(32\))1516 1655 y(Address)p 1168 2169 290 2 v 1168 1989 V 1168 2169 2 182 v 1456 2169 
V 1194 2037 a(Instruction)1194 2082 y(Deco)q(de)g(and)1194 2128 y(Issue)g(Logic)1308 1989 y Fg(.)1308 1988 y(.)1308 1986 y(.)1308 1984 y(.)1308 1983 y(.)1308 1981 y(.)1308 1979 y(.)1308 1978 y(.)1308 1976 y(.)1308 1974 y(.)1308 1973 y(.)1308 1971 y(.)1308 1969 y(.)1308 1968 y(.)1308 1966 y(.)1308 1964 y(.)1308 1963 y(.)1308 1961 y(.)1308 1959 y(.)1308 1958 y(.)1308 1956 y(.)1308 1954 y(.)1308 1953 y(.)1308 1951 y(.)1308 1949 y(.)1308 1948 y(.)1308 1946 y(.)1308 1944 y(.)1308 1943 y(.)1308 1941 y(.)1308 1939 y(.)1308 1938 y(.)1308 1936 y(.)1308 1934 y(.)1308 1933 y(.)1308 1931 y(.)1308 1929 y(.)1308 1928 y(.)1308 1926 y(.)1308 1924 y(.)1308 1923 y(.)1308 1921 y(.)1308 1919 y(.)1308 1918 y(.)1308 1916 y(.)1308 1914 y(.)1308 1913 y(.)1308 1911 y(.)1308 1909 y(.)1308 1908 y(.)1308 1906 y(.)1308 1904 y(.)1308 1903 y(.)1308 1901 y(.)1308 1899 y(.)1308 1898 y(.)1308 1896 y(.)1308 1894 y(.)1308 1893 y(.)1308 1891 y(.)1308 1889 y(.)1308 1888 y(.)1308 1886 y(.)1308 1884 y(.)1308 1883 y(.)1308 1881 y(.)1308 1879 y(.)1308 1878 y(.)1308 1876 y(.)1308 1874 y(.)1308 1873 y(.)1308 1871 y(.)1308 1869 y(.)-8 b(.)1308 1871 y(.)1309 1873 y(.)1309 1874 y(.)1309 1876 y(.)1309 1878 y(.)1310 1879 y(.)1310 1881 y(.)1311 1882 y(.)1311 1884 y(.)1311 1886 y(.)1312 1887 y(.)1313 1889 y(.)1313 1890 y(.)1314 1892 y(.)1314 1893 y(.)1315 1895 y(.)1316 1896 y(.)1317 1898 y(.)1317 1899 y(.)1318 1901 y(.)1319 1902 y(.)1308 1869 y(.)1308 1871 y(.)1308 1873 y(.)1308 1874 y(.)1307 1876 y(.)1307 1878 y(.)1307 1879 y(.)1306 1881 y(.)1306 1882 y(.)1306 1884 y(.)1305 1886 y(.)1305 1887 y(.)1304 1889 y(.)1303 1890 y(.)1303 1892 y(.)1302 1893 y(.)1301 1895 y(.)1301 1896 y(.)1300 1898 y(.)1299 1899 y(.)1298 1901 y(.)1297 1902 y(.)1239 1821 y Fm(ctr-0)1477 2086 y Fl(Instruction)1284 2229 y Fg(.)1284 2228 y(.)1284 2226 y(.)1284 2224 y(.)1284 2223 y(.)1284 2221 y(.)1283 2219 y(.)1283 2218 y(.)1283 2216 y(.)1282 2215 y(.)1282 2213 y(.)1282 2211 y(.)1281 2210 y(.)1280 2208 y(.)1280 2207 y(.)1279 2205 y(.)1278 2204 
y(.)1278 2202 y(.)1277 2201 y(.)1276 2199 y(.)1275 2198 y(.)1274 2196 y(.)1273 2195 y(.)1273 2194 y(.)1272 2192 y(.)1270 2191 y(.)1269 2190 y(.)1268 2189 y(.)1267 2187 y(.)1266 2186 y(.)1265 2185 y(.)1264 2184 y(.)1262 2183 y(.)1261 2182 y(.)1260 2181 y(.)1258 2180 y(.)1257 2179 y(.)1255 2178 y(.)1254 2177 y(.)1253 2176 y(.)e(.)1250 2175 y(.)1248 2174 y(.)h(.)1245 2173 y(.)f(.)1242 2172 y(.)g(.)1239 2171 y(.)g(.)1235 2170 y(.)h(.)f(.)h(.)f(.)1227 2169 y(.)h(.)f(.)g(.)h(.)1219 2170 y(.)f(.)h(.)f(.)1212 2171 y(.)h(.)f(.)1207 2172 y(.)h(.)1204 2173 y(.)g(.)1201 2174 y(.)1200 2175 y(.)f(.)1197 2176 y(.)1195 2177 y(.)1194 2178 y(.)1192 2179 y(.)h(.)1190 2180 y(.)1188 2181 y(.)1187 2182 y(.)1186 2183 y(.)1184 2185 y(.)1183 2186 y(.)1182 2187 y(.)1181 2188 y(.)1180 2189 y(.)1179 2190 y(.)1178 2192 y(.)1177 2193 y(.)1176 2194 y(.)1175 2196 y(.)1174 2197 y(.)1173 2199 y(.)1172 2200 y(.)1171 2201 y(.)1170 2203 y(.)1170 2204 y(.)1169 2206 y(.)1168 2207 y(.)1168 2209 y(.)1167 2211 y(.)1167 2212 y(.)1166 2214 y(.)1166 2215 y(.)1166 2217 y(.)1165 2219 y(.)1165 2220 y(.)1165 2222 y(.)1165 2224 y(.)1164 2225 y(.)1164 2227 y(.)1164 2229 y(.)1164 2230 y(.)1164 2232 y(.)1164 2233 y(.)1165 2235 y(.)1165 2237 y(.)1165 2238 y(.)1165 2240 y(.)1166 2242 y(.)1166 2243 y(.)1166 2245 y(.)1167 2247 y(.)1167 2248 y(.)1168 2250 y(.)1168 2251 y(.)1169 2253 y(.)1170 2254 y(.)1170 2256 y(.)1171 2257 y(.)1172 2259 y(.)1173 2260 y(.)1174 2262 y(.)1175 2263 y(.)1176 2264 y(.)1177 2266 y(.)1178 2267 y(.)1179 2268 y(.)1180 2270 y(.)1181 2271 y(.)1182 2272 y(.)1183 2273 y(.)1184 2274 y(.)1186 2275 y(.)1187 2276 y(.)1188 2277 y(.)1190 2278 y(.)1191 2279 y(.)1192 2280 y(.)1194 2281 y(.)1195 2282 y(.)1197 2283 y(.)i(.)1200 2284 y(.)1201 2285 y(.)h(.)1204 2286 y(.)g(.)1207 2287 y(.)g(.)1211 2288 y(.)f(.)h(.)1216 2289 y(.)f(.)h(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)1235 2288 y(.)g(.)g(.)1240 2287 y(.)g(.)1243 2286 y(.)g(.)1247 2285 y(.)1248 2284 y(.)g(.)1251 2283 y(.)1253 2282 y(.)1254 2281 y(.)f(.)1257 2280 
y(.)1258 2279 y(.)1260 2278 y(.)1261 2277 y(.)1262 2276 y(.)1263 2275 y(.)1265 2274 y(.)1266 2273 y(.)1267 2271 y(.)1268 2270 y(.)1269 2269 y(.)1270 2268 y(.)1271 2266 y(.)1272 2265 y(.)1273 2264 y(.)1274 2262 y(.)1275 2261 y(.)1276 2260 y(.)1277 2258 y(.)1278 2257 y(.)1278 2255 y(.)1279 2254 y(.)1280 2252 y(.)1280 2251 y(.)1281 2249 y(.)1281 2247 y(.)1282 2246 y(.)1282 2244 y(.)1283 2243 y(.)1283 2241 y(.)1283 2239 y(.)1284 2238 y(.)1284 2236 y(.)1284 2234 y(.)1284 2233 y(.)1284 2231 y(.)1284 2229 y(.)f(.)1284 2231 y(.)1284 2233 y(.)1284 2234 y(.)1284 2236 y(.)1284 2238 y(.)1284 2239 y(.)1284 2241 y(.)1284 2243 y(.)1284 2244 y(.)1284 2246 y(.)1284 2248 y(.)1284 2249 y(.)1284 2251 y(.)1284 2253 y(.)1284 2254 y(.)1284 2256 y(.)1284 2258 y(.)1284 2259 y(.)g(.)1284 2258 y(.)1284 2256 y(.)1284 2254 y(.)1283 2253 y(.)1283 2251 y(.)1283 2249 y(.)1282 2248 y(.)1282 2246 y(.)1282 2245 y(.)1281 2243 y(.)1281 2242 y(.)1280 2240 y(.)1279 2238 y(.)1279 2237 y(.)1278 2235 y(.)1277 2234 y(.)1277 2232 y(.)1276 2231 y(.)1275 2229 y(.)1274 2228 y(.)1273 2227 y(.)1284 2259 y(.)1284 2258 y(.)1285 2256 y(.)1285 2254 y(.)1285 2253 y(.)1285 2251 y(.)1286 2249 y(.)1286 2248 y(.)1287 2246 y(.)1287 2245 y(.)1287 2243 y(.)1288 2242 y(.)1289 2240 y(.)1289 2238 y(.)1290 2237 y(.)1290 2235 y(.)1291 2234 y(.)1292 2232 y(.)1293 2231 y(.)1293 2229 y(.)1294 2228 y(.)1295 2227 y(.)1359 2241 y Fm(ctr-1)245 2379 y Fl(Figure)14 b(1:)k(CRA)m(Y)13 b(Y-MP)h(Pro)q(cessor)i(Arc)o(hitecture)g (with)e(HPM)g(Group)f(0)h(Coun)o(ters.)950 2828 y(4)p eop %%Page: 5 7 bop -90 195 a Fl(unit".)22 b(The)16 b(\\op)q(erands")g(for)f(the)h (instruction)g(functional)e(unit)i(are)f(the)i(16-bit,)d(32-bit)g(and)i (48-bit)e(mac)o(hine)g(instructions)j(of)-90 295 y(the)c(user's)g(program.)i (The)e(\\results")f(are)h(the)f(signals)g(to)f(the)i(rest)g(of)f(the)g(CPU)g (that)g(cause)h(the)g(w)o(ork)f(sp)q(eci\014ed)h(in)f(the)g(mac)o(hine)-90 394 y(instruction)i(to)f(b)q(e)h(p)q(erformed.)j(The)d(16-bit)e(instructions) 
can issue in 1 clock period, while 32-bit and 48-bit instructions can issue in 2 clock periods if no hold issue conditions exist. The execution time of an instruction varies.

There are 4 instruction buffers, each of which contains 32 words (up to 128 16-bit instruction parcels). The instruction buffers supply instructions to the instruction decode and issue logic (functional unit). When the next instruction to be executed is not in an instruction buffer, an instruction buffer fetch from memory occurs. There are 4 ports (labeled A, B, C, and D) that connect each CPU to the central memory. Ports A and B are used to load operands from memory and port C is used to store operands to memory. Port D is used to fetch program instructions and to perform system I/O. A job running on a processor may use port D on any processor in the system for its I/O. There are 3 floating point functional units that perform 64-bit floating point operations: addition, multiply and reciprocal approximation. Division is performed with a combination of reciprocal approximation and multiply operations.

Each CPU of a CRAY Y-MP system contains a set of 8 performance monitor counters. Each counter is 48 bits in length. These counters monitor and track certain hardware-related events that can be used to indicate relative performance. Thirty-two events can be monitored, and these events are divided into four groups: 0, 1, 2 and 3. At any one time only one group of 8 events is associated with the 8 physical hardware counters. The association of an event group with the hardware counters is performed by the operating system software. Performance events are monitored only while the CPU is in user mode. During each clock period in user mode, the performance counters are incremented according to the number of monitored events that occur. The performance monitor counters operate in parallel with the execution of user programs, do not delay or interfere with the events being monitored, and are part of the normal operation of the machine. Table 1 shows the performance counters that comprise monitor group 0. Besides the 8 counter values, the accumulated CPU time in clock periods is also recorded. Figure 1 illustrates the parts of the Y-MP processor architecture that are monitored by the group 0 counters. A detailed description of the hardware performance monitor
counters may be found in [21][14].

Performance  Description                                           Increment
Counter      Number of:                                            Per CP
----------------------------------------------------------------------------
0            Instructions issued. This counter is incremented      +1
             by 1 when an instruction is issued.
1            Clock periods holding issue. This counter is          +1
             incremented by 1 when an instruction is prevented
             from issuing.
2            Instruction buffer fetches (port D). This counter     +1
             is incremented by 1 when an instruction buffer
             fetch is initiated.
3            Floating-point add operations. This counter is        +1
             incremented by 1 when a result is produced from
             the floating point addition functional unit. This
             includes both scalar and vector mode execution.
4            Floating-point multiply operations. This counter      +1
             is incremented by 1 when a result is produced from
             the floating point multiply functional unit. This
             includes both scalar and vector mode execution.
5            Floating-point reciprocal operations. This counter    +1
             is incremented by 1 when a result is produced from
             the floating point reciprocal approximation
             functional unit. This includes both scalar and
             vector mode execution.
6            CPU memory references (ports A, B, C). This counter   +3 max
             is incremented by the sum of the CPU memory
             references from ports A, B, and C. This includes
             both scalar and vector mode execution.
7            I/O memory references (port D). This counter is       +1
             incremented by 1 for each I/O memory reference
             accessed through this CPU.

Table 1: CRAY Y-MP Hardware Performance Monitor Group 0.

2.2 Hardware Performance Monitor Recording Methodology

In June 1991, the National Center for Supercomputing Applications, an NSF-supported facility at the University of Illinois at Urbana-Champaign, installed, at the request of one of the authors (JL), a modification to the UNICOS 6.0 operating system on their CRAY Y-MP/4. This change caused UNICOS to write HPM data to a system file at the termination of each user process or job [15]. Jobs that were not linked after June 1, 1991 are not recorded. This includes some third-party software which was installed in binary form. A filter was employed to avoid recording small processes taking less than one second, e.g. user commands such as 'ls'. The statistics are accumulated in a system file that may be post-processed, summarized, and sorted in various ways to produce reports about activities that occurred during the recording period [13]. The UNICOS HPM collection code on the Y-MP gathers information
y(organized)h(in)f(three)j(records,)f(called) e(A,)h(B)g(and)g(C,)f(ab)q(out)h(eac)o(h)g(user)h(job:)76 2604 y Fm(A)f Fl(hpmglobal)d Fm(timestamp)i(uid)h(pid)h(acct)76 2654 y(B)h(cmdname)76 2704 y(C)g(group)f(c0)h(c1)g(c2)g(c3)g(c4)g(c5)g(c6)g (c7)g(total)950 2828 y Fl(6)p eop %%Page: 7 9 bop -90 195 a Fl(where)76 278 y Fm(timestamp)11 b Fl(is)j(the)h(date/time)d (stamp,)h(as)g(pro)o(vided)h(b)o(y)g(\\time\(\)".)76 328 y Fm(uid)e Fl(is)i(the)g(user's)h(ID)f(n)o(um)o(b)q(er,)f(as)h(pro)o(vided)f(b) o(y)h(\\getuid\(\)".)76 378 y Fm(pid)e Fl(is)i(the)g(pro)q(cess)i(n)o(um)o(b) q(er,)d(as)h(pro)o(vided)g(b)o(y)f(\\getpid\(\)".)76 428 y Fm(acct)h Fl(is)g(the)g(user's)h(accoun)o(t)g(name,)d(e.g.)18 b(\\ab)q(c".)76 477 y Fm(cmdname)13 b Fl(is)h(the)g(name)f(of)g(the)h (executed)i(\014le,)e(as)g(pro)o(vided)f(b)o(y)h(\\)p 1184 477 13 2 v 15 w Ff(ar)q(g)q(v)q Fl([0]",)e(e.g.)18 b(\\a.out".)76 527 y Fm(group)12 b Fl(is)i(the)h(enabled)f(HPM)g(group)g(n)o(um)o(b)q(er.)j (Here,)e(group)e(=)i(\\0".)76 577 y Fm(c0...c7)g Fl(are)f(the)h(eigh)o(t)e (HPM)i(coun)o(ter)f(v)n(alues)g(for)g(the)g(job.)76 627 y Fm(total)e Fl(is)i(the)h(total)e(CPU)h(time.)-28 760 y(These)g(records)f(are)g(recorded) h(con)o(tin)o(uously)d(and)h(dump)q(ed)g(p)q(erio)q(dically)f(to)h(a)g (system)g(\014le)g(whic)o(h)g(is)h(o\013-loaded)e(mon)o(thly)m(.)k(In)-90 859 y(a)10 b(few)h(of)f(the)h(mon)o(thly)d(recordings)j(some)f(data)g(w)o(as) g(lost)h(due)g(to)f(unexp)q(ected)j(mac)o(hine)c(in)o(terruptions.)17 b(F)m(rom)9 b(the)i(user's)g(accoun)o(t)-90 959 y(name)16 b(w)o(e)h(are)g (able,)g(through)g(a)g(lo)q(okup)f(table,)h(to)g(iden)o(tify)f(the)i (application)d(area)i(of)g(the)g(user's)h(job.)27 b(When)17 b(submitting)e(a)-90 1059 y(prop)q(osal)10 b(for)g(time)f(on)h(the)g(mac)o (hine,)g(the)g(user)i(is)e(requested)i(to)e(categorize)h(the)g(w)o(ork)f(to)g (b)q(e)h(p)q(erformed)f(b)o(y)g(one)g(of)g(26)f(application)-90 1158 y(areas)k(de\014ned)g(b)o(y)e(the)i(National)e(Science)i(F)m(oundation.) 
j(The)d(application)d(area)j(is)e(recorded)j(in)e(a)f(table)h(with)g(the)h (accoun)o(t)f(name)-90 1258 y(b)o(y)g(NCSA)g(as)h(part)f(of)g(the)h(allo)q (cation)d(pro)q(cedure.)20 b(Users)13 b(who)f(do)g(not)g(categorize)i(their)e (w)o(ork)g(are)h(recorded)h(in)e(the)g(application)-90 1357 y(area)i(\\Unkno)o(wn.")j(Although)d(w)o(e)g(ha)o(v)o(e)f(organized)h(the)h (recorded)g(HPM)g(information)10 b(b)o(y)k(all)f(26)g(application)f(areas,)i (for)g(space)-90 1457 y(reasons)g(w)o(e)g(rep)q(ort)g(here)g(only)e(the)i (top)f(10)g(time-consuming)d(application)i(areas)i(whic)o(h)f(together)h (consume)f(o)o(v)o(er)g(80)g(p)q(ercen)o(t)i(of)-90 1557 y(the)f(recorded)i (CPU)e(time.)-90 1731 y Fh(2.3)56 b(P)n(erformance)17 b(Metrics)-90 1857 y Fl(F)m(rom)f(the)j(8)e(HPM)i(coun)o(ters)g(recorded)h(for)d(eac)o(h)i (job,)f(w)o(e)g(ha)o(v)o(e)g(deriv)o(ed)g(16)g(p)q(erformance)g(metrics)g (and)f(c)o(haracteristics)j(to)-90 1957 y(study)m(.)25 b(The)16 b(form)o(ulas)e(for)i(these)i(quan)o(tities)e(are)g(giv)o(en)g(in)g(T)m(able) f(2.)25 b(W)m(e)15 b(brie\015y)i(describ)q(e)h(eac)o(h)e(one)h(b)q(elo)o(w.) 
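Before turning to the individual metrics, it may help to see how the eight counters could be recovered from the A, B and C records of Section 2.2. The following is only a minimal sketch. It assumes the records are stored as whitespace-separated text lines in the order shown, which the paper does not state, and the names JobRecord, parse_job and application_area are hypothetical. Falling back to "Unknown" for an account missing from the lookup table is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class JobRecord:
    timestamp: int              # date/time stamp, as provided by time()
    uid: int                    # user's ID number, as provided by getuid()
    pid: int                    # process number, as provided by getpid()
    acct: str                   # user's account name, e.g. "abc"
    cmdname: str                # name of the executed file, e.g. "a.out"
    group: int                  # enabled HPM group number (0 in this study)
    counters: Tuple[int, ...]   # the eight counter values (c0, ..., c7)
    total: float                # total CPU time, in whatever unit the file uses


def parse_job(a_line: str, b_line: str, c_line: str) -> JobRecord:
    """Combine one A, one B and one C record into a single JobRecord.

    Assumed layout (text form, not confirmed by the paper):
        A hpmglobal timestamp uid pid acct
        B cmdname
        C group c0 c1 c2 c3 c4 c5 c6 c7 total
    """
    a, b, c = a_line.split(), b_line.split(), c_line.split()
    if (a[:2] != ["A", "hpmglobal"] or len(a) != 6
            or b[:1] != ["B"] or len(b) != 2
            or c[:1] != ["C"] or len(c) != 11):
        raise ValueError("not a well-formed A/B/C record triple")
    return JobRecord(
        timestamp=int(a[2]), uid=int(a[3]), pid=int(a[4]), acct=a[5],
        cmdname=b[1],
        group=int(c[1]),
        counters=tuple(int(x) for x in c[2:10]),
        total=float(c[10]),
    )


def application_area(acct: str, area_by_account: Dict[str, str]) -> str:
    """Map an account name to its application area via a lookup table.

    Accounts absent from the table are reported as "Unknown" (an assumption).
    """
    return area_by_account.get(acct, "Unknown")
```

For example, parse_job("A hpmglobal 700000000 1234 5678 abc", "B a.out", "C 0 10 20 30 40 50 60 70 80 12.5") yields a record whose counters field holds c0 through c7 in order; the values here are made up for illustration.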
Average rates for individual jobs are computed by taking the total count of the appropriate operations divided by the job's CPU time. Average rates for the workload are computed by taking the total sum of the appropriate operations for all jobs divided by the sum of all jobs' CPU times.

    Mflops               = 1.0e-6 * (c3 + c4 + c5) / CPU_time
    ratio(Add/Mult)      = c3 / c4
    ratio((A-M)/(A+M))   = (c3 - c4) / (c3 + c4)
    %reciprocals         = 100 * c5 / (c3 + c4 + c5)
    Mmemops              = 1.0e-6 * c6 / CPU_time
    mem/flop             = c6 / (c3 + c4 + c5)
    MIPS                 = 1.0e-6 * c0 / CPU_time
    flop/inst            = (c3 + c4 + c5) / c0
    mem/inst             = c6 / c0
    CPs/inst             = CPU_time / c0
    %CP hold issue       = 100 * c1 / CPU_time
    IBFs                 = 1.0e-6 * c2 / CPU_time
    inst/IBF             = c0 / c2
    flops/IBF            = (c3 + c4 + c5) / c2
    time/job             = CPU_time / number_of_jobs
    I/O                  = 1.0e-6 * c7 / CPU_time

    where CPU_time = (c0 + c1)

Table 2: Performance Metrics derived from HPM Group 0 counters.

Mflops. This metric shows the average floating point execution rate in millions of operations per second. No adjustment is made for the difference between division and reciprocal approximation.

ratio(Add/Mult). This characteristic shows the ratio of the number of floating point additions to the number of floating point multiplies. Since the hardware has one adder and one multiplier, this measurement shows the balance of the requested work to the capabilities of the processor.

ratio((A-M)/(A+M)). This characteristic shows a different ratio to gauge the balance of floating point additions to multiplies. An advantage of this function over the Add/Mult ratio is that this one is symmetric.

%reciprocals. This characteristic shows the percent of floating point operations that were reciprocal operations. This operation is expensive on the Cray in both hardware and execution time. A division operation is implemented on the Cray by one reciprocal approximation and two or three multiply operations.

Mmemops. This metric shows the average rate of access to the shared memory for source operands and result operands in millions of references per second. The rate includes the use of memory ports A, B and C.

mem/flop. This characteristic shows the ratio of the number of CPU memory operations to the number of floating point operations.

MIPS. This metric shows the average rate of instruction issue in millions of instructions per second.

flop/inst. This characteristic shows the ratio of the number of floating point operations to the number of instructions issued.

mem/inst. This characteristic shows the ratio of the number of CPU memory references to the number of instructions issued.

CPs/inst. This metric shows the average number of (6 ns) clock periods taken to issue an instruction. The inverse of this metric is proportional to the MIPS metric.

%CP hold issue. This metric shows the percent of the CPU time when instructions were not issued.

IBFs. This metric shows the average rate at which instruction buffers were fetched in millions of fetches per second. Recall that an instruction buffer contains thirty-two 64-bit words.

inst/IBF. This metric shows the average number of instructions issued between instruction buffer fetches. This quantity is a measure of "code locality."

flops/IBF. This metric shows the average number of floating point operations between instruction buffer fetches. This quantity is a measure of "work locality," where work is defined as only floating point operations.

time/job. This characteristic shows the average CPU time per job, where CPU time includes all time when the job was executing on a processor and was not in the operating system.

I/O. This metric shows the average rate of I/O operations using port D in millions of words per second. Recall that port D in each processor is a system resource.

3 Results

In this section we present preliminary results obtained from the recorded HPM performance information at NCSA for both the overall workload and individual application areas.

The academic workload of the Cray Y-MP/4 at NCSA was selected as our initial test case to study. This selection took advantage of special recording software that has collected performance information on the workload of this machine since June 1991. We decided to study the performance data collected during the thirteen-month period from June 1991 to June 1992, inclusive. These data consist of hardware performance monitor group 0 statistics for 292,254 user jobs. About 2 CPU-years (17,801 CPU hours) of execution time for these user jobs was recorded, representing approximately 47% of the wall clock hours available on the four-processor machine during the recording period. In the discussion below, we use the word "workload" to refer to this large recorded percentage. The collection of this data incurred essentially no overhead to the system, since the operation of the HPM counters is done by the hardware and is part of the normal operation of the machine. The collection of HPM performance data on the Y-MP at NCSA continues to the present time, and it is the largest known collection of performance information for a particular site.
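The Table 2 formulas and the totals quoted above can be checked with a short sketch. This is an illustration under stated assumptions, not the authors' post-processing code. Table 2 writes every formula in terms of a single CPU_time = (c0 + c1); the sketch assumes that the rates use CPU time in seconds (clock periods multiplied by the 6 ns period mentioned under CPs/inst), while CPs/inst and %CP hold issue use clock periods directly. It also assumes the recording period covers whole calendar months from June 1, 1991 through June 30, 1992. The function and constant names are hypothetical, and no guard against zero counters is included.

```python
from datetime import date

CLOCK_PERIOD_S = 6e-9  # 6 ns clock period, as quoted in the CPs/inst description


def metrics(c, n_jobs=1):
    """Apply the Table 2 formulas to eight counter totals c = (c0, ..., c7).

    The counters may belong to one job or be sums over many jobs.
    """
    c0, c1, c2, c3, c4, c5, c6, c7 = c
    flops = c3 + c4 + c5
    cps = c0 + c1                 # Table 2: CPU_time = (c0 + c1), in clock periods
    secs = cps * CLOCK_PERIOD_S   # assumption: seconds are used for the rates
    return {
        "Mflops": 1.0e-6 * flops / secs,
        "ratio(Add/Mult)": c3 / c4,
        "ratio((A-M)/(A+M))": (c3 - c4) / (c3 + c4),
        "%reciprocals": 100 * c5 / flops,
        "Mmemops": 1.0e-6 * c6 / secs,
        "mem/flop": c6 / flops,
        "MIPS": 1.0e-6 * c0 / secs,
        "flop/inst": flops / c0,
        "mem/inst": c6 / c0,
        "CPs/inst": cps / c0,
        "%CP hold issue": 100 * c1 / cps,
        "IBFs": 1.0e-6 * c2 / secs,
        "inst/IBF": c0 / c2,
        "flops/IBF": flops / c2,
        "time/job": secs / n_jobs,
        "I/O": 1.0e-6 * c7 / secs,
    }


def workload_metrics(jobs):
    """Workload averages: sum each counter over all jobs, then apply Table 2.

    This is a ratio of sums, not an average of per-job ratios.
    """
    totals = [sum(column) for column in zip(*jobs)]
    return metrics(totals, n_jobs=len(jobs))


# Consistency check of the figures quoted in Section 3.
RECORDED_CPU_HOURS = 17801
RECORDING_DAYS = (date(1992, 7, 1) - date(1991, 6, 1)).days   # 13 whole months
AVAILABLE_CPU_HOURS = RECORDING_DAYS * 24 * 4                  # four processors
UTILIZATION = RECORDED_CPU_HOURS / AVAILABLE_CPU_HOURS
CPU_YEARS = RECORDED_CPU_HOURS / (365 * 24)
```

Under these assumptions the recording period is 396 days, or 38,016 processor-hours, so UTILIZATION comes to about 0.47 and CPU_YEARS to about 2.0, in line with the "approximately 47%" and "about 2 CPU-years" stated in the text. Summing counters before dividing, as in workload_metrics, weights each job by its CPU time, which is what the definition of workload averages in Section 2.3 calls for.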
[Figure: two panels, each with horizontal axis "Megaflops" (0 to 330). (a) The entire workload: histogram of "Percent of time" (0 to 15). (b) The top 10 time-consuming application areas: curves of "Cumulative percent of time" (0 to 100), with legend entries including PHYSC and MATER.]
703 y(.)1222 701 y(.)1222 700 y(.)1222 698 y(.)1223 696 y(.)1223 695 y(.)1224 693 y(.)1224 691 y(.)1224 690 y(.)1225 688 y(.)1225 687 y(.)1225 685 y(.)1226 683 y(.)1226 682 y(.)1226 680 y(.)1227 678 y(.)1227 677 y(.)1227 675 y(.)1228 673 y(.)1228 672 y(.)1228 670 y(.)1229 669 y(.)1229 667 y(.)1229 665 y(.)1230 664 y(.)1230 662 y(.)1230 660 y(.)1231 659 y(.)1231 657 y(.)1231 655 y(.)1232 654 y(.)1232 652 y(.)1233 651 y(.)1233 649 y(.)1233 647 y(.)1234 646 y(.)1234 644 y(.)1234 642 y(.)1235 641 y(.)1235 639 y(.)1235 637 y(.)1236 636 y(.)1236 634 y(.)1236 633 y(.)1237 631 y(.)1237 629 y(.)1237 628 y(.)1238 626 y(.)1238 624 y(.)1238 623 y(.)1239 621 y(.)1239 619 y(.)1239 618 y(.)1240 616 y(.)1240 615 y(.)1240 613 y(.)1241 611 y(.)1241 610 y(.)1242 608 y(.)1242 606 y(.)g(.)1242 605 y(.)1243 603 y(.)1243 601 y(.)1243 600 y(.)1243 598 y(.)1244 597 y(.)1244 595 y(.)1244 593 y(.)1245 592 y(.)1245 590 y(.)1245 588 y(.)1246 587 y(.)1246 585 y(.)1246 583 y(.)1247 582 y(.)1247 580 y(.)1247 579 y(.)1248 577 y(.)1248 575 y(.)1248 574 y(.)1249 572 y(.)1249 570 y(.)1249 569 y(.)1250 567 y(.)1250 565 y(.)1250 564 y(.)1251 562 y(.)1251 561 y(.)1251 559 y(.)1252 557 y(.)1252 556 y(.)1252 554 y(.)1253 552 y(.)1253 551 y(.)1253 549 y(.)1254 547 y(.)1254 546 y(.)1254 544 y(.)1255 542 y(.)1255 541 y(.)1255 539 y(.)1256 538 y(.)1256 536 y(.)1256 534 y(.)1257 533 y(.)1257 531 y(.)1257 529 y(.)1258 528 y(.)1258 526 y(.)1258 524 y(.)1258 523 y(.)1259 521 y(.)1259 520 y(.)1259 518 y(.)1260 516 y(.)1260 515 y(.)1260 513 y(.)1261 511 y(.)1261 510 y(.)1261 508 y(.)1262 506 y(.)1262 505 y(.)1262 503 y(.)1263 502 y(.)1263 500 y(.)1263 498 y(.)1264 497 y(.)1264 495 y(.)1264 493 y(.)1265 492 y(.)g(.)1266 491 y(.)1267 489 y(.)1268 488 y(.)1270 487 y(.)1271 486 y(.)1272 485 y(.)1274 484 y(.)1275 483 y(.)1276 481 y(.)1277 480 y(.)1279 479 y(.)1280 478 y(.)1281 477 y(.)1282 476 y(.)1284 475 y(.)1285 473 y(.)1286 472 y(.)1288 471 y(.)g(.)1288 470 y(.)1289 468 y(.)1290 467 y(.)1291 465 y(.)1292 464 
y(.)1292 463 y(.)1293 461 y(.)1294 460 y(.)1295 458 y(.)1296 457 y(.)1296 455 y(.)1297 454 y(.)1298 452 y(.)1299 451 y(.)1300 449 y(.)1301 448 y(.)1301 447 y(.)1302 445 y(.)1303 444 y(.)1304 442 y(.)1305 441 y(.)1305 439 y(.)1306 438 y(.)1307 436 y(.)1308 435 y(.)1309 434 y(.)1310 432 y(.)1310 431 y(.)g(.)1311 429 y(.)1311 427 y(.)1312 426 y(.)1313 424 y(.)1313 423 y(.)1314 421 y(.)1314 420 y(.)1315 418 y(.)1315 416 y(.)1316 415 y(.)1317 413 y(.)1317 412 y(.)1318 410 y(.)1318 409 y(.)1319 407 y(.)1319 405 y(.)1320 404 y(.)1321 402 y(.)1321 401 y(.)1322 399 y(.)1322 397 y(.)1323 396 y(.)1323 394 y(.)1324 393 y(.)1325 391 y(.)1325 390 y(.)1326 388 y(.)1326 386 y(.)1327 385 y(.)1327 383 y(.)1328 382 y(.)1329 380 y(.)1329 379 y(.)1330 377 y(.)1330 375 y(.)1331 374 y(.)1331 372 y(.)1332 371 y(.)1333 369 y(.)1333 368 y(.)g(.)1335 367 y(.)1336 366 y(.)1338 365 y(.)1339 364 y(.)i(.)1342 363 y(.)1344 362 y(.)1345 361 y(.)g(.)1348 360 y(.)1350 359 y(.)1351 358 y(.)g(.)1354 357 y(.)1356 356 y(.)e(.)i(.)1359 355 y(.)g(.)1363 354 y(.)f(.)1366 353 y(.)g(.)h(.)1371 352 y(.)f(.)1374 351 y(.)h(.)1377 350 y(.)g(.)e(.)1380 349 y(.)1382 348 y(.)h(.)1385 347 y(.)1386 346 y(.)h(.)1389 345 y(.)1391 344 y(.)g(.)1394 343 y(.)1396 342 y(.)1397 341 y(.)g(.)1400 340 y(.)1402 339 y(.)e(.)h(.)1405 338 y(.)1407 337 y(.)g(.)1410 336 y(.)g(.)1413 335 y(.)1415 334 y(.)g(.)1418 333 y(.)1420 332 y(.)g(.)1423 331 y(.)h(.)e(.)1426 330 y(.)i(.)1429 329 y(.)1431 328 y(.)g(.)1434 327 y(.)g(.)1438 326 y(.)f(.)1441 325 y(.)g(.)1444 324 y(.)h(.)1447 323 y(.)e(.)i(.)1451 322 y(.)f(.)1454 321 y(.)g(.)h(.)1459 320 y(.)f(.)1462 319 y(.)h(.)f(.)1467 318 y(.)h(.)1470 317 y(.)e(.)i(.)g(.)f(.)h(.)g(.)g(.)f(.)h (.)g(.)g(.)f(.)h(.)1493 316 y(.)e(.)i(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f (.)h(.)g(.)e(.)i(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)e(.)h(.)h (.)g(.)1546 315 y(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)f(.)i(.)g(.)g(.)g(.)f (.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)f(.)i(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g 
(.)g(.)f(.)h(.)e(.)i(.)g(.)1612 314 y(.)g(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g (.)e(.)i(.)f(.)h(.)g(.)1639 313 y(.)g(.)f(.)h(.)g(.)1648 312 y(.)f(.)h(.)g(.)e(.)1654 311 y(.)1656 310 y(.)1657 309 y(.)1659 308 y(.)h(.)1662 307 y(.)1663 306 y(.)1665 305 y(.)1667 304 y(.)g(.)1670 303 y(.)1671 302 y(.)1673 301 y(.)1674 300 y(.)h(.)e(.)h(.)h(.)g (.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)f(.)i(.)g(.)g(.)g(.)f(.)h(.)g(.)g (.)f(.)h(.)g(.)g(.)f(.)985 1100 y Fd(\003)1008 1041 y(\003)1031 1000 y(\003)1054 975 y(\003)1077 919 y(\003)1099 902 y(\003)1122 878 y(\003)1145 860 y(\003)1168 799 y(\003)1191 763 y(\003)1214 722 y(\003)1236 614 y(\003)1259 499 y(\003)1282 479 y(\003)1305 438 y(\003)1328 375 y(\003)1351 363 y(\003)1373 357 y(\003)1396 347 y(\003)1419 338 y(\003)1442 331 y(\003)1465 325 y(\003)1488 324 y(\003)s(\003)1533 323 y(\003)t(\003)1579 322 y(\003)t(\003)t(\003)1647 319 y(\003)1670 307 y(\003)t(\003)t(\003)1784 556 y(\003)1744 549 y Fg(.)h(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f (.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h (.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)f(.)h(.)g(.)f (.)1862 560 y Fn(CHEMT)991 1105 y Fg(.)991 1103 y(.)991 1102 y(.)992 1100 y(.)992 1098 y(.)993 1097 y(.)993 1095 y(.)993 1094 y(.)994 1092 y(.)994 1090 y(.)995 1089 y(.)995 1087 y(.)995 1085 y(.)996 1084 y(.)996 1082 y(.)996 1080 y(.)997 1079 y(.)997 1077 y(.)998 1076 y(.)998 1074 y(.)998 1072 y(.)999 1071 y(.)999 1069 y(.)999 1067 y(.)1000 1066 y(.)1000 1064 y(.)1001 1062 y(.)1001 1061 y(.)1001 1059 y(.)1002 1057 y(.)1002 1056 y(.)1003 1054 y(.)1003 1053 y(.)1003 1051 y(.)1004 1049 y(.)1004 1048 y(.)1004 1046 y(.)1005 1044 y(.)1005 1043 y(.)1006 1041 y(.)1006 1039 y(.)1006 1038 y(.)1007 1036 y(.)1007 1035 y(.)1007 1033 y(.)1008 1031 y(.)1008 1030 y(.)1009 1028 y(.)1009 1026 y(.)1009 1025 y(.)1010 1023 y(.)1010 1021 y(.)1010 1020 y(.)1011 1018 y(.)1011 1017 y(.)1012 1015 y(.)1012 1013 y(.)1012 1012 y(.)1013 1010 
y(.)1013 1008 y(.)1014 1007 y(.)f(.)1014 1005 y(.)1014 1003 y(.)1014 1002 y(.)1015 1000 y(.)1015 999 y(.)1015 997 y(.)1016 995 y(.)1016 994 y(.)1016 992 y(.)1017 990 y(.)1017 989 y(.)1017 987 y(.)1018 985 y(.)1018 984 y(.)1018 982 y(.)1019 981 y(.)1019 979 y(.)1019 977 y(.)1019 976 y(.)1020 974 y(.)1020 972 y(.)1020 971 y(.)1021 969 y(.)1021 967 y(.)1021 966 y(.)1022 964 y(.)1022 963 y(.)1022 961 y(.)1023 959 y(.)1023 958 y(.)1023 956 y(.)1024 954 y(.)1024 953 y(.)1024 951 y(.)1024 949 y(.)1025 948 y(.)1025 946 y(.)1025 945 y(.)1026 943 y(.)1026 941 y(.)1026 940 y(.)1027 938 y(.)1027 936 y(.)1027 935 y(.)1028 933 y(.)1028 931 y(.)1028 930 y(.)1029 928 y(.)1029 927 y(.)1029 925 y(.)1029 923 y(.)1030 922 y(.)1030 920 y(.)1030 918 y(.)1031 917 y(.)1031 915 y(.)1031 913 y(.)1032 912 y(.)1032 910 y(.)1032 909 y(.)1033 907 y(.)1033 905 y(.)1033 904 y(.)1034 902 y(.)1034 900 y(.)1034 899 y(.)1034 897 y(.)1035 895 y(.)1035 894 y(.)1035 892 y(.)1036 891 y(.)1036 889 y(.)1036 887 y(.)g(.)1037 886 y(.)1038 885 y(.)1039 883 y(.)1041 882 y(.)1042 881 y(.)1043 879 y(.)1044 878 y(.)1045 877 y(.)1046 876 y(.)1047 874 y(.)1048 873 y(.)1049 872 y(.)1050 870 y(.)1051 869 y(.)1052 868 y(.)1053 866 y(.)1054 865 y(.)1055 864 y(.)1056 863 y(.)1057 861 y(.)1058 860 y(.)1059 859 y(.)g(.)1060 857 y(.)1061 856 y(.)1062 855 y(.)1063 853 y(.)1064 852 y(.)1065 851 y(.)1066 849 y(.)1068 848 y(.)1069 847 y(.)1070 845 y(.)1071 844 y(.)1072 843 y(.)1073 841 y(.)1074 840 y(.)1075 839 y(.)1076 837 y(.)1077 836 y(.)1078 834 y(.)1079 833 y(.)1080 832 y(.)1081 830 y(.)1082 829 y(.)g(.)1083 828 y(.)1084 826 y(.)1084 825 y(.)1085 823 y(.)1086 821 y(.)1086 820 y(.)1087 818 y(.)1088 817 y(.)1089 815 y(.)1089 814 y(.)1090 812 y(.)1091 811 y(.)1092 809 y(.)1092 808 y(.)1093 806 y(.)1094 805 y(.)1095 803 y(.)1095 802 y(.)1096 800 y(.)1097 799 y(.)1098 797 y(.)1098 795 y(.)1099 794 y(.)1100 792 y(.)1100 791 y(.)1101 789 y(.)1102 788 y(.)1103 786 y(.)1103 785 y(.)1104 783 y(.)1105 782 y(.)g(.)1105 780 y(.)1106 
779 y(.)1106 777 y(.)1107 775 y(.)1108 774 y(.)1108 772 y(.)1109 771 y(.)1109 769 y(.)1110 768 y(.)1110 766 y(.)1111 764 y(.)1111 763 y(.)1112 761 y(.)1112 760 y(.)1113 758 y(.)1113 756 y(.)1114 755 y(.)1114 753 y(.)1115 752 y(.)1115 750 y(.)1116 749 y(.)1117 747 y(.)1117 745 y(.)1118 744 y(.)1118 742 y(.)1119 741 y(.)1119 739 y(.)1120 737 y(.)1120 736 y(.)1121 734 y(.)1121 733 y(.)1122 731 y(.)1122 730 y(.)1123 728 y(.)1123 726 y(.)1124 725 y(.)1125 723 y(.)1125 722 y(.)1126 720 y(.)1126 719 y(.)1127 717 y(.)1127 715 y(.)1128 714 y(.)g(.)1128 712 y(.)1129 711 y(.)1130 709 y(.)1130 708 y(.)1131 706 y(.)1131 704 y(.)1132 703 y(.)1133 701 y(.)1133 700 y(.)1134 698 y(.)1134 697 y(.)1135 695 y(.)1136 694 y(.)1136 692 y(.)1137 691 y(.)1138 689 y(.)1138 687 y(.)1139 686 y(.)1139 684 y(.)1140 683 y(.)1141 681 y(.)1141 680 y(.)1142 678 y(.)1143 677 y(.)1143 675 y(.)1144 673 y(.)1144 672 y(.)1145 670 y(.)1146 669 y(.)1146 667 y(.)1147 666 y(.)1147 664 y(.)1148 663 y(.)1149 661 y(.)1149 659 y(.)1150 658 y(.)1151 656 y(.)g(.)1152 655 y(.)1153 654 y(.)1155 653 y(.)1156 652 y(.)1157 651 y(.)1159 650 y(.)1160 649 y(.)1161 648 y(.)1163 646 y(.)1164 645 y(.)1165 644 y(.)1167 643 y(.)1168 642 y(.)1169 641 y(.)1171 640 y(.)1172 639 y(.)1173 638 y(.)g(.)1175 637 y(.)1177 636 y(.)h(.)1180 635 y(.)1182 634 y(.)g(.)1185 633 y(.)g(.)1188 632 y(.)1190 631 y(.)g(.)1193 630 y(.)1195 629 y(.)g(.)f(.)1198 628 y(.)1199 627 y(.)1201 626 y(.)h(.)1204 625 y(.)1205 624 y(.)1207 623 y(.)g(.)1210 622 y(.)1211 621 y(.)1213 620 y(.)1214 619 y(.)h(.)1218 618 y(.)1219 617 y(.)e(.)1219 616 y(.)1220 614 y(.)1220 612 y(.)1220 611 y(.)1220 609 y(.)1221 607 y(.)1221 606 y(.)1221 604 y(.)1221 602 y(.)1222 601 y(.)1222 599 y(.)1222 597 y(.)1222 596 y(.)1223 594 y(.)1223 592 y(.)1223 591 y(.)1223 589 y(.)1224 587 y(.)1224 586 y(.)1224 584 y(.)1224 582 y(.)1225 581 y(.)1225 579 y(.)1225 577 y(.)1226 576 y(.)1226 574 y(.)1226 572 y(.)1226 571 y(.)1227 569 y(.)1227 567 y(.)1227 566 y(.)1227 564 y(.)1228 562 
y(.)1228 561 y(.)1228 559 y(.)1228 557 y(.)1229 556 y(.)1229 554 y(.)1229 552 y(.)1229 551 y(.)1230 549 y(.)1230 548 y(.)1230 546 y(.)1230 544 y(.)1231 543 y(.)1231 541 y(.)1231 539 y(.)1231 538 y(.)1232 536 y(.)1232 534 y(.)1232 533 y(.)1233 531 y(.)1233 529 y(.)1233 528 y(.)1233 526 y(.)1234 524 y(.)1234 523 y(.)1234 521 y(.)1234 519 y(.)1235 518 y(.)1235 516 y(.)1235 514 y(.)1235 513 y(.)1236 511 y(.)1236 509 y(.)1236 508 y(.)1236 506 y(.)1237 504 y(.)1237 503 y(.)1237 501 y(.)1237 499 y(.)1238 498 y(.)1238 496 y(.)1238 494 y(.)1238 493 y(.)1239 491 y(.)1239 489 y(.)1239 488 y(.)1240 486 y(.)1240 484 y(.)1240 483 y(.)1240 481 y(.)1241 480 y(.)1241 478 y(.)1241 476 y(.)1241 475 y(.)1242 473 y(.)1242 471 y(.)g(.)1243 470 y(.)1244 468 y(.)1245 467 y(.)1246 466 y(.)1247 464 y(.)1248 463 y(.)1249 461 y(.)1250 460 y(.)1251 459 y(.)1252 457 y(.)1253 456 y(.)1254 454 y(.)1255 453 y(.)1256 452 y(.)1257 450 y(.)1258 449 y(.)1259 447 y(.)1260 446 y(.)1261 445 y(.)1262 443 y(.)1263 442 y(.)1264 440 y(.)1265 439 y(.)g(.)1266 438 y(.)1267 437 y(.)1269 436 y(.)1270 435 y(.)1271 433 y(.)1273 432 y(.)1274 431 y(.)1275 430 y(.)1277 429 y(.)1278 428 y(.)1279 427 y(.)1281 426 y(.)1282 425 y(.)1283 424 y(.)1285 423 y(.)1286 421 y(.)1288 420 y(.)g(.)h(.)1291 419 y(.)g(.)1294 418 y(.)h(.)1297 417 y(.)g(.)1301 416 y(.)f(.)1304 415 y(.)1305 414 y(.)h(.)1309 413 y(.)f(.)f(.)1311 411 y(.)1312 410 y(.)1313 409 y(.)1315 407 y(.)1316 406 y(.)1317 405 y(.)1318 403 y(.)1319 402 y(.)1320 401 y(.)1321 399 y(.)1322 398 y(.)1323 397 y(.)1324 395 y(.)1325 394 y(.)1326 393 y(.)1327 391 y(.)1328 390 y(.)1329 388 y(.)1330 387 y(.)1331 386 y(.)1332 384 y(.)1333 383 y(.)g(.)1334 381 y(.)1334 380 y(.)1335 378 y(.)1336 377 y(.)1336 375 y(.)1337 374 y(.)1337 372 y(.)1338 371 y(.)1339 369 y(.)1339 368 y(.)1340 366 y(.)1340 364 y(.)1341 363 y(.)1342 361 y(.)1342 360 y(.)1343 358 y(.)1343 357 y(.)1344 355 y(.)1345 354 y(.)1345 352 y(.)1346 351 y(.)1346 349 y(.)1347 347 y(.)1348 346 y(.)1348 344 y(.)1349 343 
y(.)1349 341 y(.)1350 340 y(.)1351 338 y(.)1351 337 y(.)1352 335 y(.)1352 333 y(.)1353 332 y(.)1354 330 y(.)1354 329 y(.)1355 327 y(.)1355 326 y(.)1356 324 y(.)g(.)1357 323 y(.)1359 322 y(.)1360 321 y(.)1361 320 y(.)1363 319 y(.)1364 318 y(.)1365 317 y(.)1367 316 y(.)1368 315 y(.)1369 314 y(.)1371 313 y(.)1372 312 y(.)1373 311 y(.)1375 310 y(.)1376 309 y(.)1377 308 y(.)1379 307 y(.)g(.)1380 306 y(.)i(.)1384 305 y(.)f(.)h(.)1389 304 y(.)f(.)h(.)1394 303 y(.)f(.)h(.)1398 302 y(.)g(.)1402 301 y(.)e(.)h(.)h(.)g(.)g(.)f(.)h(.)1414 300 y(.)g(.)f(.)h(.)g(.)g(.)g(.)e(.) h(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)f(.)i(.)g(.)g(.)f(.)h(.)g (.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)e(.)i(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f (.)h(.)g(.)e(.)i(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)e(.)i(.)f (.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)e(.)h(.)h(.)g(.)g(.)f(.)h(.)g (.)g(.)f(.)h(.)g(.)g(.)f(.)f(.)i(.)g(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g (.)f(.)f(.)i(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)e(.)i(.)g(.)f (.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)e(.)i(.)f(.)h(.)g(.)g(.)g(.)f(.)h (.)g(.)g(.)f(.)h(.)g(.)e(.)i(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g (.)e(.)h(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)f(.)i(.)g(.)g(.)g (.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)985 1112 y Fc(/)1008 1014 y(/)1031 895 y(/)1054 866 y(/)1077 837 y(/)1099 789 y(/)1122 721 y(/)1145 664 y(/)1168 645 y(/)1191 636 y(/)1214 625 y(/)1236 479 y(/)1259 446 y(/)1282 428 y(/)1305 420 y(/)1328 390 y(/)1351 332 y(/)1373 314 y(/)1396 309 y(/)1419 307 y(/)t(/)t(/)t(/)s(/)t(/)t(/)t(/)t (/)t(/)s(/)t(/)t(/)t(/)1784 598 y(/)1744 590 y Fg(.)h(.)f(.)h(.)g(.)f(.)h(.)g (.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f (.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h (.)g(.)f(.)h(.)g(.)f(.)h(.)f(.)h(.)g(.)f(.)1862 602 y Fn(ASTR)o(O)991 1057 y Fg(.)991 1056 y(.)991 1054 y(.)991 1052 y(.)992 1051 y(.)992 1049 y(.)992 1047 y(.)992 1046 
y(.)992 1044 y(.)993 1042 y(.)993 1041 y(.)993 1039 y(.)993 1037 y(.)994 1036 y(.)994 1034 y(.)994 1032 y(.)994 1031 y(.)994 1029 y(.)995 1027 y(.)995 1026 y(.)995 1024 y(.)995 1022 y(.)995 1021 y(.)996 1019 y(.)996 1017 y(.)996 1016 y(.)996 1014 y(.)997 1012 y(.)997 1011 y(.)997 1009 y(.)997 1007 y(.)997 1006 y(.)998 1004 y(.)998 1002 y(.)998 1001 y(.)998 999 y(.)998 997 y(.)999 996 y(.)999 994 y(.)999 992 y(.)999 991 y(.)1000 989 y(.)1000 987 y(.)1000 986 y(.)1000 984 y(.)1000 982 y(.)1001 981 y(.)1001 979 y(.)1001 977 y(.)1001 976 y(.)1001 974 y(.)1002 972 y(.)1002 971 y(.)1002 969 y(.)1002 967 y(.)1003 966 y(.)1003 964 y(.)1003 962 y(.)1003 961 y(.)1003 959 y(.)1004 957 y(.)1004 956 y(.)1004 954 y(.)1004 952 y(.)1004 951 y(.)1005 949 y(.)1005 947 y(.)1005 946 y(.)1005 944 y(.)1006 943 y(.)1006 941 y(.)1006 939 y(.)1006 938 y(.)1006 936 y(.)1007 934 y(.)1007 933 y(.)1007 931 y(.)1007 929 y(.)1008 928 y(.)1008 926 y(.)1008 924 y(.)1008 923 y(.)1008 921 y(.)1009 919 y(.)1009 918 y(.)1009 916 y(.)1009 914 y(.)1009 913 y(.)1010 911 y(.)1010 909 y(.)1010 908 y(.)1010 906 y(.)1011 904 y(.)1011 903 y(.)1011 901 y(.)1011 899 y(.)1011 898 y(.)1012 896 y(.)1012 894 y(.)1012 893 y(.)1012 891 y(.)1012 889 y(.)1013 888 y(.)1013 886 y(.)1013 884 y(.)1013 883 y(.)1014 881 y(.)f(.)1014 880 y(.)1015 878 y(.)1016 877 y(.)1017 875 y(.)1017 874 y(.)1018 872 y(.)1019 871 y(.)1020 869 y(.)1021 868 y(.)1021 866 y(.)1022 865 y(.)1023 863 y(.)1024 862 y(.)1025 860 y(.)1025 859 y(.)1026 857 y(.)1027 856 y(.)1028 854 y(.)1029 853 y(.)1029 851 y(.)1030 850 y(.)1031 848 y(.)1032 847 y(.)1032 845 y(.)1033 844 y(.)1034 842 y(.)1035 841 y(.)1036 839 y(.)1036 838 y(.)g(.)1037 836 y(.)1038 835 y(.)1039 833 y(.)1040 832 y(.)1040 830 y(.)1041 829 y(.)1042 827 y(.)1043 826 y(.)1043 824 y(.)1044 823 y(.)1045 821 y(.)1046 820 y(.)1047 818 y(.)1047 817 y(.)1048 815 y(.)1049 813 y(.)1050 812 y(.)1051 810 y(.)1051 809 y(.)1052 807 y(.)1053 806 y(.)1054 804 y(.)1054 803 y(.)1055 801 y(.)1056 800 y(.)1057 
798 y(.)1058 797 y(.)1058 795 y(.)1059 794 y(.)g(.)1060 792 y(.)1060 790 y(.)1060 789 y(.)1060 787 y(.)1061 786 y(.)1061 784 y(.)1061 782 y(.)1062 781 y(.)1062 779 y(.)1062 777 y(.)1062 776 y(.)1063 774 y(.)1063 772 y(.)1063 771 y(.)1064 769 y(.)1064 767 y(.)1064 766 y(.)1065 764 y(.)1065 762 y(.)1065 761 y(.)1065 759 y(.)1066 757 y(.)1066 756 y(.)1066 754 y(.)1067 753 y(.)1067 751 y(.)1067 749 y(.)1068 748 y(.)1068 746 y(.)1068 744 y(.)1068 743 y(.)1069 741 y(.)1069 739 y(.)1069 738 y(.)1070 736 y(.)1070 734 y(.)1070 733 y(.)1070 731 y(.)1071 729 y(.)1071 728 y(.)1071 726 y(.)1072 724 y(.)1072 723 y(.)1072 721 y(.)1073 720 y(.)1073 718 y(.)1073 716 y(.)1073 715 y(.)1074 713 y(.)1074 711 y(.)1074 710 y(.)1075 708 y(.)1075 706 y(.)1075 705 y(.)1076 703 y(.)1076 701 y(.)1076 700 y(.)1076 698 y(.)1077 696 y(.)1077 695 y(.)1077 693 y(.)1078 691 y(.)1078 690 y(.)1078 688 y(.)1078 687 y(.)1079 685 y(.)1079 683 y(.)1079 682 y(.)1080 680 y(.)1080 678 y(.)1080 677 y(.)1081 675 y(.)1081 673 y(.)1081 672 y(.)1081 670 y(.)1082 668 y(.)1082 667 y(.)g(.)1082 665 y(.)1082 663 y(.)1083 662 y(.)1083 660 y(.)1083 658 y(.)1083 657 y(.)1083 655 y(.)1084 653 y(.)1084 652 y(.)1084 650 y(.)1084 648 y(.)1084 647 y(.)1085 645 y(.)1085 643 y(.)1085 642 y(.)1085 640 y(.)1085 639 y(.)1085 637 y(.)1086 635 y(.)1086 634 y(.)1086 632 y(.)1086 630 y(.)1086 629 y(.)1087 627 y(.)1087 625 y(.)1087 624 y(.)1087 622 y(.)1087 620 y(.)1088 619 y(.)1088 617 y(.)1088 615 y(.)1088 614 y(.)1088 612 y(.)1089 610 y(.)1089 609 y(.)1089 607 y(.)1089 605 y(.)1089 604 y(.)1089 602 y(.)1090 600 y(.)1090 599 y(.)1090 597 y(.)1090 595 y(.)1090 594 y(.)1091 592 y(.)1091 590 y(.)1091 589 y(.)1091 587 y(.)1091 585 y(.)1092 584 y(.)1092 582 y(.)1092 580 y(.)1092 579 y(.)1092 577 y(.)1093 575 y(.)1093 574 y(.)1093 572 y(.)1093 570 y(.)1093 569 y(.)1093 567 y(.)1094 565 y(.)1094 564 y(.)1094 562 y(.)1094 561 y(.)1094 559 y(.)1095 557 y(.)1095 556 y(.)1095 554 y(.)1095 552 y(.)1095 551 y(.)1096 549 y(.)1096 547 y(.)1096 
546 y(.)1096 544 y(.)1096 542 y(.)1096 541 y(.)1097 539 y(.)1097 537 y(.)1097 536 y(.)1097 534 y(.)1097 532 y(.)1098 531 y(.)1098 529 y(.)1098 527 y(.)1098 526 y(.)1098 524 y(.)1099 522 y(.)1099 521 y(.)1099 519 y(.)1099 517 y(.)1099 516 y(.)1100 514 y(.)1100 512 y(.)1100 511 y(.)1100 509 y(.)1100 507 y(.)1100 506 y(.)1101 504 y(.)1101 502 y(.)1101 501 y(.)1101 499 y(.)1101 497 y(.)1102 496 y(.)1102 494 y(.)1102 492 y(.)1102 491 y(.)1102 489 y(.)1103 487 y(.)1103 486 y(.)1103 484 y(.)1103 483 y(.)1103 481 y(.)1104 479 y(.)1104 478 y(.)1104 476 y(.)1104 474 y(.)1104 473 y(.)1104 471 y(.)1105 469 y(.)1105 468 y(.)g(.)1105 466 y(.)1106 464 y(.)1106 463 y(.)1107 461 y(.)1107 460 y(.)1108 458 y(.)1109 456 y(.)1109 455 y(.)1110 453 y(.)1110 452 y(.)1111 450 y(.)1111 448 y(.)1112 447 y(.)1112 445 y(.)1113 444 y(.)1113 442 y(.)1114 440 y(.)1114 439 y(.)1115 437 y(.)1115 436 y(.)1116 434 y(.)1116 432 y(.)1117 431 y(.)1117 429 y(.)1118 428 y(.)1118 426 y(.)1119 424 y(.)1119 423 y(.)1120 421 y(.)1120 420 y(.)1121 418 y(.)1121 416 y(.)1122 415 y(.)1123 413 y(.)1123 412 y(.)1124 410 y(.)1124 409 y(.)1125 407 y(.)1125 405 y(.)1126 404 y(.)1126 402 y(.)1127 401 y(.)1127 399 y(.)1128 397 y(.)g(.)1128 396 y(.)1129 394 y(.)1130 393 y(.)1130 391 y(.)1131 389 y(.)1131 388 y(.)1132 386 y(.)1133 385 y(.)1133 383 y(.)1134 382 y(.)1134 380 y(.)1135 378 y(.)1136 377 y(.)1136 375 y(.)1137 374 y(.)1138 372 y(.)1138 371 y(.)1139 369 y(.)1139 367 y(.)1140 366 y(.)1141 364 y(.)1141 363 y(.)1142 361 y(.)1143 360 y(.)1143 358 y(.)1144 356 y(.)1144 355 y(.)1145 353 y(.)1146 352 y(.)1146 350 y(.)1147 348 y(.)1147 347 y(.)1148 345 y(.)1149 344 y(.)1149 342 y(.)1150 341 y(.)1151 339 y(.)g(.)h(.)h(.)1156 338 y(.)g(.)f(.)h(.)g(.)1165 337 y(.)f(.)h(.)g(.)g(.)1173 336 y(.)e(.)i(.)1176 335 y(.)1178 334 y(.)1179 333 y(.)1181 332 y(.)f(.)1184 331 y(.)1186 330 y(.)1187 329 y(.)h(.)1190 328 y(.)1192 327 y(.)1193 326 y(.)1195 325 y(.)f(.)f(.)1198 324 y(.)i(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)1214 323 
y(.)g(.)f(.)h(.)e(.)i(.)g(.)f(.) h(.)g(.)g(.)f(.)1233 322 y(.)h(.)g(.)f(.)h(.)g(.)e(.)i(.)f(.)h(.)g(.)g(.)f(.) h(.)g(.)1258 321 y(.)f(.)h(.)g(.)g(.)e(.)h(.)1268 320 y(.)1269 319 y(.)1271 318 y(.)g(.)1274 317 y(.)1275 316 y(.)h(.)1278 315 y(.)1280 314 y(.)f(.)1283 313 y(.)1284 312 y(.)h(.)1288 311 y(.)e(.)h(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)f(.)1312 310 y(.)i(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)1331 309 y(.)g(.)e(.)i(.)g(.) f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)e(.)i(.)g(.)f(.)h(.)g(.)g(.)f(.)h (.)1372 308 y(.)g(.)f(.)h(.)g(.)e(.)h(.)1382 307 y(.)h(.)f(.)1387 306 y(.)h(.)1390 305 y(.)g(.)g(.)1395 304 y(.)g(.)f(.)1400 303 y(.)h(.)e(.)h(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)g(.)e(.)h(.)h (.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)f(.)i(.)g(.)g(.)f(.)h(.)1458 302 y(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)e(.)i(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g (.)f(.)h(.)g(.)e(.)i(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)e(.)i (.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)e(.)h(.)h(.)g(.)g(.)f(.)h (.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)f(.)i(.)g(.)g(.)1569 301 y(.)f(.)h(.)g(.)g(.) 1577 300 y(.)g(.)g(.)g(.)f(.)f(.)i(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.) 
f(.)h(.)e(.)i(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)e(.)i(.)f(.)h (.)g(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)e(.)i(.)f(.)h(.)g(.)g(.)f(.)h(.)g (.)g(.)f(.)h(.)g(.)g(.)e(.)h(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f (.)f(.)i(.)g(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)985 1064 y Fd(\016)1008 888 y(\016)1031 845 y(\016)1054 801 y(\016)1077 674 y(\016)1099 475 y(\016)1122 404 y(\016)1145 346 y(\016)1168 344 y(\016)1191 332 y(\016)1214 330 y(\016)1236 329 y(\016)1259 328 y(\016)1282 318 y(\016)t(\016)1328 317 y(\016)t(\016)1373 315 y(\016)1396 310 y(\016)t(\016)t(\016)1465 309 y(\016)t(\016)s(\016)t (\016)t(\016)1579 307 y(\016)t(\016)t(\016)s(\016)t(\016)t(\016)t(\016)1784 639 y(\016)1744 632 y Fg(.)h(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g (.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f (.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h(.)g(.)f(.)h (.)f(.)h(.)g(.)f(.)1862 643 y Fn(MOLEB)991 990 y Fg(.)991 989 y(.)991 987 y(.)991 985 y(.)992 984 y(.)992 982 y(.)992 980 y(.)992 979 y(.)993 977 y(.)993 975 y(.)993 974 y(.)993 972 y(.)994 971 y(.)994 969 y(.)994 967 y(.)994 966 y(.)995 964 y(.)995 962 y(.)995 961 y(.)995 959 y(.)996 957 y(.)996 956 y(.)996 954 y(.)996 952 y(.)997 951 y(.)997 949 y(.)997 948 y(.)997 946 y(.)998 944 y(.)998 943 y(.)998 941 y(.)998 939 y(.)999 938 y(.)999 936 y(.)999 934 y(.)999 933 y(.)1000 931 y(.)1000 930 y(.)1000 928 y(.)1000 926 y(.)1001 925 y(.)1001 923 y(.)1001 921 y(.)1001 920 y(.)1002 918 y(.)1002 916 y(.)1002 915 y(.)1003 913 y(.)1003 911 y(.)1003 910 y(.)1003 908 y(.)1004 907 y(.)1004 905 y(.)1004 903 y(.)1004 902 y(.)1005 900 y(.)1005 898 y(.)1005 897 y(.)1005 895 y(.)1006 893 y(.)1006 892 y(.)1006 890 y(.)1006 888 y(.)1007 887 y(.)1007 885 y(.)1007 884 y(.)1007 882 y(.)1008 880 y(.)1008 879 y(.)1008 877 y(.)1008 875 y(.)1009 874 y(.)1009 872 y(.)1009 870 y(.)1009 869 y(.)1010 867 y(.)1010 865 y(.)1010 864 y(.)1010 862 y(.)1011 861 y(.)1011 859 y(.)1011 
857 y(.)1011 856 y(.)1012 854 y(.)1012 852 y(.)1012 851 y(.)1012 849 y(.)1013 847 y(.)1013 846 y(.)1013 844 y(.)1013 842 y(.)1014 841 y(.)f(.)1014 839 y(.)1015 838 y(.)1016 836 y(.)1017 835 y(.)1017 833 y(.)1018 832 y(.)1019 830 y(.)1020 829 y(.)1021 827 y(.)1021 826 y(.)1022 824 y(.)1023 823 y(.)1024 821 y(.)1025 820 y(.)1025 818 y(.)1026 817 y(.)1027 816 y(.)1028 814 y(.)1029 813 y(.)1029 811 y(.)1030 810 y(.)1031 808 y(.)1032 807 y(.)1032 805 y(.)1033 804 y(.)1034 802 y(.)1035 801 y(.)1036 799 y(.)1036 798 y(.)g(.)1038 796 y(.)1039 795 y(.)1040 794 y(.)1041 793 y(.)1043 792 y(.)1044 791 y(.)1045 790 y(.)1047 788 y(.)1048 787 y(.)1049 786 y(.)1050 785 y(.)1052 784 y(.)1053 783 y(.)1054 781 y(.)1055 780 y(.)1057 779 y(.)1058 778 y(.)1059 777 y(.)g(.)1060 775 y(.)1060 773 y(.)1060 772 y(.)1061 770 y(.)1061 769 y(.)1062 767 y(.)1062 765 y(.)1063 764 y(.)1063 762 y(.)1064 761 y(.)1064 759 y(.)1064 757 y(.)1065 756 y(.)1065 754 y(.)1066 752 y(.)1066 751 y(.)1067 749 y(.)1067 748 y(.)1067 746 y(.)1068 744 y(.)1068 743 y(.)1069 741 y(.)1069 739 y(.)1070 738 y(.)1070 736 y(.)1070 735 y(.)1071 733 y(.)1071 731 y(.)1072 730 y(.)1072 728 y(.)1073 727 y(.)1073 725 y(.)1073 723 y(.)1074 722 y(.)1074 720 y(.)1075 718 y(.)1075 717 y(.)1076 715 y(.)1076 714 y(.)1076 712 y(.)1077 710 y(.)1077 709 y(.)1078 707 y(.)1078 705 y(.)1079 704 y(.)1079 702 y(.)1079 701 y(.)1080 699 y(.)1080 697 y(.)1081 696 y(.)1081 694 y(.)1082 693 y(.)1082 691 y(.)g(.)1082 689 y(.)1083 688 y(.)1083 686 y(.)1084 684 y(.)1084 683 y(.)1084 681 y(.)1085 680 y(.)1085 678 y(.)1086 676 y(.)1086 675 y(.)1086 673 y(.)1087 672 y(.)1087 670 y(.)1088 668 y(.)1088 667 y(.)1088 665 y(.)1089 663 y(.)1089 662 y(.)1090 660 y(.)1090 659 y(.)1090 657 y(.)1091 655 y(.)1091 654 y(.)1092 652 y(.)1092 651 y(.)1092 649 y(.)1093 647 y(.)1093 646 y(.)1094 644 y(.)1094 642 y(.)1094 641 y(.)1095 639 y(.)1095 638 y(.)1096 636 y(.)1096 634 y(.)1096 633 y(.)1097 631 y(.)1097 630 y(.)1098 628 y(.)1098 626 y(.)1098 625 y(.)1099 623 
y(.)1099 621 y(.)1100 620 y(.)1100 618 y(.)1100 617 y(.)1101 615 y(.)1101 613 y(.)1102 612 y(.)1102 610 y(.)1102 609 y(.)1103 607 y(.)1103 605 y(.)1104 604 y(.)1104 602 y(.)1104 600 y(.)1105 599 y(.)g(.)1106 597 y(.)1106 596 y(.)1107 594 y(.)1107 593 y(.)1108 591 y(.)1109 590 y(.)1109 588 y(.)1110 586 y(.)1111 585 y(.)1111 583 y(.)1112 582 y(.)1112 580 y(.)1113 579 y(.)1114 577 y(.)1114 576 y(.)1115 574 y(.)1116 573 y(.)1116 571 y(.)1117 569 y(.)1118 568 y(.)1118 566 y(.)1119 565 y(.)1119 563 y(.)1120 562 y(.)1121 560 y(.)1121 559 y(.)1122 557 y(.)1123 556 y(.)1123 554 y(.)1124 552 y(.)1125 551 y(.)1125 549 y(.)1126 548 y(.)1126 546 y(.)1127 545 y(.)1128 543 y(.)g(.)1129 542 y(.)1130 540 y(.)1131 539 y(.)1132 538 y(.)1133 536 y(.)1134 535 y(.)1135 533 y(.)1136 532 y(.)1137 531 y(.)1138 529 y(.)1139 528 y(.)1140 526 y(.)1141 525 y(.)1142 524 y(.)1143 522 y(.)1144 521 y(.)1145 519 y(.)1146 518 y(.)1147 517 y(.)1148 515 y(.)1149 514 y(.)1150 512 y(.)1151 511 y(.)g(.)1151 509 y(.)1152 508 y(.)1153 506 y(.)1154 505 y(.)1154 504 y(.)1155 502 y(.)1156 501 y(.)1157 499 y(.)1158 498 y(.)1158 496 y(.)1159 495 y(.)1160 493 y(.)1161 492 y(.)1162 490 y(.)1162 489 y(.)1163 487 y(.)1164 486 y(.)1165 484 y(.)1165 483 y(.)1166 481 y(.)1167 480 y(.)1168 479 y(.)1169 477 y(.)1169 476 y(.)1170 474 y(.)1171 473 y(.)1172 471 y(.)1173 470 y(.)1173 468 y(.)g(.)1175 467 y(.)1176 466 y(.)i(.)1179 465 y(.)1180 464 y(.)1182 463 y(.)1183 462 y(.)1185 461 y(.)1186 460 y(.)1188 459 y(.)1189 458 y(.)1190 457 y(.)1192 456 y(.)f(.)1195 455 y(.)1196 454 y(.)f(.)1197 452 y(.)1198 451 y(.)1199 450 y(.)1201 449 y(.)1202 447 y(.)1203 446 y(.)1204 445 y(.)1205 443 y(.)1206 442 y(.)1207 441 y(.)1208 439 y(.)1209 438 y(.)1210 437 y(.)1211 436 y(.)1213 434 y(.)1214 433 y(.)1215 432 y(.)1216 430 y(.)1217 429 y(.)1218 428 y(.)1219 426 y(.)g(.)1220 425 y(.)1221 424 y(.)1222 422 y(.)1223 421 y(.)1223 419 y(.)1224 418 y(.)1225 416 y(.)1226 415 y(.)1227 414 y(.)1228 412 y(.)1229 411 y(.)1230 409 y(.)1230 408 
y(.)1231 406 y(.)1232 405 y(.)1233 403 y(.)1234 402 y(.)1235 401 y(.)1236 399 y(.)1237 398 y(.)1237 396 y(.)1238 395 y(.)1239 393 y(.)1240 392 y(.)1241 391 y(.)1242 389 y(.)g(.)1243 388 y(.)1245 387 y(.)1246 386 y(.)1247 385 y(.)1249 384 y(.)1250 383 y(.)1251 382 y(.)1253 381 y(.)1254 380 y(.)1255 379 y(.)1257 378 y(.)1258 377 y(.)1259 376 y(.)1261 375 y(.)1262 374 y(.)1263 373 y(.)1265 372 y(.)g(.)1266 371 y(.)1267 369 y(.)1268 368 y(.)1269 367 y(.)1270 365 y(.)1271 364 y(.)1272 363 y(.)1273 361 y(.)1274 360 y(.)1275 359 y(.)1276 357 y(.)1277 356 y(.)1278 354 y(.)1279 353 y(.)1280 352 y(.)1281 350 y(.)1282 349 y(.)1283 348 y(.)1284 346 y(.)1285 345 y(.)1286 344 y(.)1288 342 y(.)g(.)1289 341 y(.)1290 340 y(.)i(.)1293 339 y(.)1295 338 y(.)1296 337 y(.)1298 336 y(.)1299 335 y(.)1300 334 y(.)1302 333 y(.)1303 332 y(.)1305 331 y(.)f(.)1307 330 y(.)1309 329 y(.)1310 328 y(.)f(.)1312 327 y(.)i(.)f(.)1317 326 y(.)h(.)1320 325 y(.)g(.)f(.)1325 324 y(.)h(.)1328 323 y(.)g(.)1332 322 y(.)f(.)f(.)i(.)1337 321 y(.)f(.)h(.)g(.)1344 320 y(.)f(.)h(.)1349 319 y(.)g(.)f(.)1354 318 y(.)h(.)e(.)i(.)g(.)1361 317 y(.)g(.)g(.)g(.)f(.)1370 316 y(.)h(.)g(.)f(.)1377 315 y(.)h(.)e(.)h(.) 1382 314 y(.)h(.)1385 313 y(.)g(.)1389 312 y(.)1390 311 y(.)g(.)1394 310 y(.)f(.)1397 309 y(.)g(.)1400 308 y(.)h(.)e(.)h(.)h(.)1407 307 y(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)g(.)e(.)h(.)1428 306 y(.)h(.)g(.)f(.)1435 305 y(.)h(.)g(.)f(.)1442 304 y(.)h(.)g(.)f(.)f(.) 1449 303 y(.)i(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)e(.)i(.)g(.)f(.) h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)e(.)1495 302 y(.)i(.)f(.)h(.)g(.)g(.) f(.)h(.)g(.)g(.)f(.)h(.)g(.)e(.)i(.)f(.)h(.)g(.)1525 301 y(.)f(.)h(.)g(.)g(.) f(.)h(.)g(.)1539 300 y(.)e(.)h(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.)f(.)h(.)g(.)g(.) 
[Figure: line plot with one marked curve per category; plotting coordinates omitted. Legend labels as they appear in the source (truncated): UNKNO, INDUS, CHEMI, MECHS, MATHS.]
y(\017)f(\017)1029 864 y(\017)g(\017)g(\017)1029 863 y(\017)g(\017)1030 862 y(\017)g(\017)g(\017)1030 861 y(\017)g(\017)1030 860 y(\017)g(\017)g (\017)1031 859 y(\017)g(\017)1031 858 y(\017)g(\017)g(\017)1031 857 y(\017)g(\017)1031 856 y(\017)h(\017)f(\017)1032 855 y(\017)g(\017)1032 854 y(\017)g(\017)g(\017)1032 853 y(\017)h(\017)f(\017)1033 852 y(\017)g(\017)g(\017)1033 851 y(\017)g(\017)g(\017)1033 850 y(\017)h(\017)1034 849 y(\017)f(\017)g(\017)1034 848 y(\017)g(\017)1034 847 y(\017)h(\017)f(\017)1035 846 y(\017)g(\017)1035 845 y(\017)g(\017)g (\017)1035 844 y(\017)h(\017)1036 843 y(\017)f(\017)g(\017)1036 842 y(\017)g(\017)1036 841 y(\017)g(\017)h(\017)1037 840 y(\017)f(\017)1037 839 y(\017)g(\017)g(\017)1037 838 y(\017)g(\017)1038 837 y(\017)g(\017)g (\017)1038 836 y(\017)g(\017)1038 835 y(\017)g(\017)g(\017)1039 834 y(\017)g(\017)1039 833 y(\017)g(\017)g(\017)1039 832 y(\017)g(\017)1039 831 y(\017)h(\017)f(\017)1040 830 y(\017)g(\017)1040 829 y(\017)g(\017)g (\017)1041 828 y(\017)g(\017)1041 827 y(\017)g(\017)g(\017)1041 826 y(\017)g(\017)1041 825 y(\017)h(\017)f(\017)1042 824 y(\017)g(\017)g (\017)1042 823 y(\017)g(\017)1042 822 y(\017)h(\017)f(\017)1043 821 y(\017)g(\017)1043 820 y(\017)g(\017)g(\017)1043 819 y(\017)h(\017)1044 818 y(\017)f(\017)g(\017)1044 817 y(\017)g(\017)1044 816 y(\017)g(\017)h (\017)1045 815 y(\017)f(\017)1045 814 y(\017)g(\017)g(\017)1045 813 y(\017)g(\017)1046 812 y(\017)g(\017)g(\017)1046 811 y(\017)g(\017)1046 810 y(\017)g(\017)g(\017)1047 809 y(\017)g(\017)1047 808 y(\017)g(\017)g (\017)1047 807 y(\017)g(\017)1048 806 y(\017)g(\017)g(\017)1048 805 y(\017)g(\017)1048 804 y(\017)g(\017)g(\017)1049 803 y(\017)g(\017)1049 802 y(\017)g(\017)g(\017)1049 801 y(\017)g(\017)1049 800 y(\017)h(\017)f (\017)1050 799 y(\017)g(\017)1050 798 y(\017)g(\017)g(\017)1050 797 y(\017)h(\017)f(\017)1051 796 y(\017)g(\017)1051 795 y(\017)g(\017)g (\017)1051 794 y(\017)h(\017)1052 793 y(\017)f(\017)g(\017)1052 792 y(\017)g(\017)1052 791 
y(\017)g(\017)h(\017)1053 790 y(\017)f(\017)1053 789 y(\017)g(\017)g(\017)1053 788 y(\017)h(\017)1054 787 y(\017)f(\017)g (\017)1054 786 y(\017)g(\017)1054 785 y(\017)g(\017)h(\017)1055 784 y(\017)f(\017)1055 783 y(\017)g(\017)g(\017)1055 782 y(\017)g(\017)g (\017)1056 781 y(\017)g(\017)g(\017)1056 780 y(\017)g(\017)h(\017)1057 779 y(\017)f(\017)g(\017)1057 778 y(\017)h(\017)1058 777 y(\017)f(\017)g (\017)1058 776 y(\017)h(\017)f(\017)1059 775 y(\017)g(\017)1059 774 y(\017)h(\017)f(\017)1060 773 y(\017)g(\017)g(\017)1061 772 y(\017)g(\017)g(\017)1061 771 y(\017)g(\017)1062 770 y(\017)g(\017)g (\017)1062 769 y(\017)g(\017)h(\017)1063 768 y(\017)f(\017)1063 767 y(\017)g(\017)h(\017)1064 766 y(\017)f(\017)g(\017)1064 765 y(\017)h(\017)f(\017)1065 764 y(\017)g(\017)1065 763 y(\017)h(\017)f (\017)1066 762 y(\017)g(\017)g(\017)1067 761 y(\017)g(\017)1067 760 y(\017)g(\017)g(\017)1068 759 y(\017)g(\017)g(\017)1068 758 y(\017)g(\017)h(\017)1069 757 y(\017)f(\017)1069 756 y(\017)g(\017)h (\017)1070 755 y(\017)f(\017)g(\017)1070 754 y(\017)h(\017)f(\017)1071 753 y(\017)g(\017)1071 752 y(\017)h(\017)f(\017)1072 751 y(\017)g(\017)g (\017)1073 750 y(\017)g(\017)1073 749 y(\017)g(\017)g(\017)1074 748 y(\017)g(\017)g(\017)1074 747 y(\017)g(\017)h(\017)1075 746 y(\017)f(\017)1075 745 y(\017)g(\017)h(\017)1076 744 y(\017)f(\017)g (\017)1076 743 y(\017)h(\017)1077 742 y(\017)f(\017)g(\017)1077 741 y(\017)h(\017)f(\017)1078 740 y(\017)g(\017)g(\017)g(\017)1079 739 y(\017)g(\017)1079 738 y(\017)g(\017)g(\017)1079 737 y(\017)g(\017)1080 736 y(\017)g(\017)g(\017)1080 735 y(\017)g(\017)1080 734 y(\017)g(\017)h (\017)1081 733 y(\017)f(\017)g(\017)1081 732 y(\017)g(\017)1082 731 y(\017)g(\017)g(\017)1082 730 y(\017)g(\017)1082 729 y(\017)g(\017)h (\017)1083 728 y(\017)f(\017)1083 727 y(\017)g(\017)g(\017)1083 726 y(\017)h(\017)f(\017)1084 725 y(\017)g(\017)1084 724 y(\017)g(\017)g (\017)1085 723 y(\017)g(\017)1085 722 y(\017)g(\017)g(\017)1085 721 y(\017)h(\017)1086 720 
y(\017)f(\017)g(\017)1086 719 y(\017)g(\017)g (\017)1087 718 y(\017)g(\017)1087 717 y(\017)g(\017)g(\017)1087 716 y(\017)g(\017)1088 715 y(\017)g(\017)g(\017)1088 714 y(\017)g(\017)1088 713 y(\017)g(\017)h(\017)1089 712 y(\017)f(\017)g(\017)1089 711 y(\017)g(\017)1090 710 y(\017)g(\017)g(\017)1090 709 y(\017)g(\017)1090 708 y(\017)g(\017)h(\017)1091 707 y(\017)f(\017)g(\017)1091 706 y(\017)g(\017)1091 705 y(\017)h(\017)f(\017)1092 704 y(\017)g(\017)1092 703 y(\017)g(\017)g(\017)1093 702 y(\017)g(\017)1093 701 y(\017)g(\017)g (\017)1093 700 y(\017)h(\017)f(\017)1094 699 y(\017)g(\017)1094 698 y(\017)g(\017)g(\017)1095 697 y(\017)g(\017)1095 696 y(\017)g(\017)g (\017)1095 695 y(\017)g(\017)1096 694 y(\017)g(\017)g(\017)1096 693 y(\017)g(\017)g(\017)1096 692 y(\017)h(\017)1097 691 y(\017)f(\017)g (\017)1097 690 y(\017)g(\017)1098 689 y(\017)g(\017)g(\017)1098 688 y(\017)g(\017)1098 687 y(\017)g(\017)h(\017)1099 686 y(\017)f(\017)g (\017)1099 685 y(\017)g(\017)1099 684 y(\017)h(\017)f(\017)1100 683 y(\017)g(\017)1100 682 y(\017)g(\017)g(\017)1101 681 y(\017)g(\017)g (\017)1101 680 y(\017)g(\017)g(\017)1101 679 y(\017)h(\017)f(\017)1102 678 y(\017)g(\017)1102 677 y(\017)g(\017)h(\017)1103 676 y(\017)f(\017)1103 675 y(\017)g(\017)g(\017)1103 674 y(\017)h(\017)f(\017)1104 673 y(\017)g(\017)1104 672 y(\017)g(\017)h(\017)1105 671 y(\017)f(\017)1105 670 y(\017)g(\017)g(\017)1106 669 y(\017)g(\017)g(\017)1106 668 y(\017)g(\017)1106 667 y(\017)h(\017)f(\017)1107 666 y(\017)g(\017)1107 665 y(\017)g(\017)g(\017)1108 664 y(\017)g(\017)g(\017)1108 663 y(\017)g(\017)1108 662 y(\017)h(\017)f(\017)1109 661 y(\017)g(\017)1109 660 y(\017)g(\017)h(\017)1110 659 y(\017)f(\017)g(\017)1110 658 y(\017)g(\017)1111 657 y(\017)g(\017)g(\017)1111 656 y(\017)g(\017)1111 655 y(\017)g(\017)h(\017)1112 654 y(\017)f(\017)g(\017)1112 653 y(\017)g(\017)1113 652 y(\017)g(\017)g(\017)1113 651 y(\017)g(\017)1113 650 y(\017)h(\017)f(\017)1114 649 y(\017)g(\017)g(\017)1114 648 y(\017)h(\017)1115 647 
y(\017)f(\017)g(\017)1115 646 y(\017)g(\017)1115 645 y(\017)h(\017)f(\017)1116 644 y(\017)g(\017)1116 643 y(\017)g(\017)h (\017)1117 642 y(\017)f(\017)g(\017)1117 641 y(\017)g(\017)1118 640 y(\017)g(\017)g(\017)1118 639 y(\017)g(\017)1118 638 y(\017)g(\017)h (\017)1119 637 y(\017)f(\017)g(\017)1119 636 y(\017)g(\017)1120 635 y(\017)g(\017)g(\017)1120 634 y(\017)g(\017)1120 633 y(\017)h(\017)f (\017)1121 632 y(\017)g(\017)g(\017)1121 631 y(\017)h(\017)1122 630 y(\017)f(\017)g(\017)1122 629 y(\017)g(\017)1122 628 y(\017)h(\017)f (\017)1123 627 y(\017)g(\017)g(\017)1123 626 y(\017)h(\017)1124 625 y(\017)f(\017)g(\017)g(\017)1124 624 y(\017)h(\017)f(\017)1125 623 y(\017)h(\017)f(\017)1126 622 y(\017)g(\017)h(\017)1127 621 y(\017)f(\017)g(\017)h(\017)1128 620 y(\017)f(\017)g(\017)1129 619 y(\017)g(\017)g(\017)1129 618 y(\017)h(\017)f(\017)1130 617 y(\017)h(\017)f(\017)1131 616 y(\017)g(\017)h(\017)1132 615 y(\017)f(\017)g(\017)1133 614 y(\017)g(\017)g(\017)1133 613 y(\017)h(\017)f(\017)1134 612 y(\017)h(\017)f(\017)g(\017)1135 611 y(\017)h(\017)f(\017)1136 610 y(\017)g(\017)h(\017)1137 609 y(\017)f(\017)g(\017)1138 608 y(\017)g(\017)g(\017)1139 607 y(\017)g(\017)g(\017)1139 606 y(\017)h(\017)f(\017)1140 605 y(\017)g(\017)h(\017)1141 604 y(\017)f(\017)g(\017)h(\017)1142 603 y(\017)f(\017)h(\017)1143 602 y(\017)f(\017)g(\017)1144 601 y(\017)g(\017)g(\017)1144 600 y(\017)h(\017)f(\017)1145 599 y(\017)g(\017)h(\017)1146 598 y(\017)f(\017)g(\017)1147 597 y(\017)g(\017)g(\017)g(\017)1148 596 y(\017)g(\017)g(\017)g(\017)1149 595 y(\017)g(\017)g(\017)1149 594 y(\017)h(\017)f(\017)1150 593 y(\017)h(\017)f(\017)1151 592 y(\017)g(\017)h(\017)1152 591 y(\017)f(\017)g(\017)1153 590 y(\017)g(\017)g(\017)h(\017)1154 589 y(\017)f(\017)g(\017)1155 588 y(\017)g(\017)g(\017)1155 587 y(\017)h(\017)f(\017)1156 586 y(\017)h(\017)f(\017)1157 585 y(\017)g(\017)h(\017)1158 584 y(\017)f(\017)g(\017)h(\017)1159 583 y(\017)f(\017)h(\017)1160 582 y(\017)f(\017)g(\017)1161 581 
y(\017)g(\017)g(\017)1161 580 y(\017)h(\017)f(\017)1162 579 y(\017)h(\017)f(\017)1163 578 y(\017)g(\017)h(\017)f(\017)1164 577 y(\017)g(\017)h(\017)1165 576 y(\017)f(\017)h(\017)1166 575 y(\017)f(\017)g(\017)1167 574 y(\017)g(\017)g(\017)1167 573 y(\017)h(\017)f(\017)1168 572 y(\017)g(\017)h(\017)1169 571 y(\017)f(\017)h(\017)f(\017)g(\017)1170 570 y(\017)g(\017)h(\017)1171 569 y(\017)f(\017)h(\017)1172 568 y(\017)f(\017)g(\017)h(\017)1173 567 y(\017)f(\017)h(\017)1174 566 y(\017)f(\017)g(\017)1175 565 y(\017)g(\017)g(\017)1176 564 y(\017)g(\017)g(\017)h(\017)1177 563 y(\017)f(\017)g(\017)1178 562 y(\017)g(\017)g(\017)1179 561 y(\017)g(\017)g(\017)g(\017)1180 560 y(\017)g(\017)g(\017)1181 559 y(\017)g(\017)g(\017)1181 558 y(\017)h(\017)f(\017)1182 557 y(\017)h(\017)f(\017)g(\017)1183 556 y(\017)h(\017)f(\017)1184 555 y(\017)h(\017)f(\017)1185 554 y(\017)g(\017)h(\017)1186 553 y(\017)f(\017)h(\017)f(\017)1187 552 y(\017)h(\017)f(\017)1188 551 y(\017)g(\017)h(\017)1189 550 y(\017)f(\017)h(\017)f(\017)1190 549 y(\017)g(\017)h(\017)1191 548 y(\017)f(\017)h(\017)1192 547 y(\017)f(\017)g(\017)g(\017)1193 546 y(\017)g(\017)g(\017)1193 545 y(\017)h(\017)f(\017)g(\017)1195 544 y(\017)g(\017)g(\017)1195 543 y(\017)h(\017)f(\017)1196 542 y(\017)g(\017)h(\017)1197 541 y(\017)f(\017)g(\017)1198 540 y(\017)g(\017)g(\017)1198 539 y(\017)h(\017)f(\017)1199 538 y(\017)h(\017)f(\017)1200 537 y(\017)g(\017)h(\017)1201 536 y(\017)f(\017)g(\017)1202 535 y(\017)g(\017)g(\017)1202 534 y(\017)h(\017)f(\017)1203 533 y(\017)g(\017)h(\017)1204 532 y(\017)f(\017)g(\017)1205 531 y(\017)g(\017)g(\017)h(\017)1206 530 y(\017)f(\017)g(\017)1207 529 y(\017)g(\017)g(\017)1207 528 y(\017)h(\017)f(\017)1208 527 y(\017)g(\017)h(\017)1209 526 y(\017)f(\017)g(\017)1210 525 y(\017)g(\017)g(\017)1211 524 y(\017)g(\017)g(\017)1211 523 y(\017)h(\017)f(\017)1212 522 y(\017)g(\017)h(\017)1213 521 y(\017)f(\017)g(\017)1214 520 y(\017)g(\017)g(\017)1214 519 y(\017)h(\017)f(\017)1215 518 
y(\017)g(\017)g(\017)h(\017)1216 517 y(\017)f(\017)g(\017)1217 516 y(\017)g(\017)g(\017)1217 515 y(\017)g(\017)h(\017)1218 514 y(\017)f(\017)g(\017)1219 513 y(\017)g(\017)1219 512 y(\017)g(\017)g(\017)1220 511 y(\017)g(\017)g (\017)1220 510 y(\017)h(\017)f(\017)1221 509 y(\017)g(\017)g(\017)1222 508 y(\017)g(\017)g(\017)1222 507 y(\017)h(\017)1223 506 y(\017)f(\017)g (\017)1223 505 y(\017)h(\017)f(\017)1224 504 y(\017)g(\017)h(\017)1225 503 y(\017)f(\017)g(\017)1225 502 y(\017)h(\017)1226 501 y(\017)f(\017)g (\017)1227 500 y(\017)g(\017)g(\017)1227 499 y(\017)g(\017)h(\017)1228 498 y(\017)f(\017)g(\017)1229 497 y(\017)g(\017)g(\017)1229 496 y(\017)g(\017)1230 495 y(\017)g(\017)g(\017)1230 494 y(\017)h(\017)f (\017)1231 493 y(\017)g(\017)g(\017)1232 492 y(\017)g(\017)g(\017)1232 491 y(\017)h(\017)f(\017)1233 490 y(\017)g(\017)1233 489 y(\017)h(\017)f (\017)1234 488 y(\017)g(\017)h(\017)1235 487 y(\017)f(\017)g(\017)1235 486 y(\017)h(\017)f(\017)1236 485 y(\017)g(\017)h(\017)1237 484 y(\017)f(\017)1237 483 y(\017)g(\017)h(\017)1238 482 y(\017)f(\017)g (\017)g(\017)1239 481 y(\017)g(\017)g(\017)1239 480 y(\017)g(\017)h(\017)1240 479 y(\017)f(\017)g(\017)1241 478 y(\017)g(\017)g(\017)1241 477 y(\017)h(\017)1242 476 y(\017)f(\017)g(\017)1243 475 y(\017)g(\017)g (\017)1243 474 y(\017)g(\017)h(\017)1244 473 y(\017)f(\017)g(\017)1245 472 y(\017)g(\017)g(\017)1245 471 y(\017)h(\017)f(\017)1246 470 y(\017)g(\017)h(\017)1247 469 y(\017)f(\017)g(\017)1247 468 y(\017)h(\017)1248 467 y(\017)f(\017)g(\017)1249 466 y(\017)g(\017)g (\017)1249 465 y(\017)h(\017)f(\017)1250 464 y(\017)g(\017)h(\017)1251 463 y(\017)f(\017)g(\017)1251 462 y(\017)h(\017)f(\017)1252 461 y(\017)g(\017)h(\017)1253 460 y(\017)f(\017)1253 459 y(\017)h(\017)f (\017)1254 458 y(\017)g(\017)h(\017)1255 457 y(\017)f(\017)g(\017)1255 456 y(\017)h(\017)f(\017)1256 455 y(\017)g(\017)h(\017)1257 454 y(\017)f(\017)g(\017)1258 453 y(\017)g(\017)g(\017)1258 452 y(\017)h(\017)f(\017)1259 451 y(\017)g(\017)1259 450 
y(\017)h(\017)f (\017)1260 449 y(\017)g(\017)h(\017)1261 448 y(\017)f(\017)g(\017)h(\017)f (\017)1262 447 y(\017)g(\017)h(\017)1263 446 y(\017)f(\017)h(\017)1264 445 y(\017)f(\017)g(\017)h(\017)1265 444 y(\017)f(\017)h(\017)1266 443 y(\017)f(\017)h(\017)1267 442 y(\017)f(\017)g(\017)h(\017)1268 441 y(\017)f(\017)h(\017)1269 440 y(\017)f(\017)h(\017)1270 439 y(\017)f(\017)g(\017)h(\017)1271 438 y(\017)f(\017)h(\017)1272 437 y(\017)f(\017)g(\017)h(\017)1273 436 y(\017)f(\017)h(\017)1274 435 y(\017)f(\017)h(\017)1275 434 y(\017)f(\017)g(\017)h(\017)1276 433 y(\017)f(\017)h(\017)1277 432 y(\017)f(\017)h(\017)1278 431 y(\017)f(\017)g(\017)h(\017)1279 430 y(\017)f(\017)h(\017)1280 429 y(\017)f(\017)g(\017)1281 428 y(\017)g(\017)g(\017)h(\017)1282 427 y(\017)f(\017)h(\017)1283 426 y(\017)f(\017)g(\017)1284 425 y(\017)g(\017)g(\017)g(\017)h(\017)1285 424 y(\017)f(\017)h(\017)f(\017) 1286 423 y(\017)h(\017)f(\017)1287 422 y(\017)g(\017)h(\017)f(\017)1288 421 y(\017)h(\017)f(\017)1289 420 y(\017)h(\017)f(\017)g(\017)1291 419 y(\017)g(\017)g(\017)g(\017)1292 418 y(\017)g(\017)g(\017)1293 417 y(\017)g(\017)g(\017)h(\017)1294 416 y(\017)f(\017)h(\017)1295 415 y(\017)f(\017)g(\017)h(\017)1296 414 y(\017)f(\017)h(\017)f(\017)1297 413 y(\017)h(\017)f(\017)1298 412 y(\017)h(\017)f(\017)g(\017)1299 411 y(\017)h(\017)f(\017)1300 410 y(\017)h(\017)f(\017)g(\017)1302 409 y(\017)g(\017)g(\017)h(\017)1303 408 y(\017)f(\017)g(\017)1304 407 y(\017)g(\017)g(\017)h(\017)1305 406 y(\017)f(\017)h(\017)1306 405 y(\017)f(\017)h(\017)f(\017)g(\017)1307 404 y(\017)g(\017)h(\017)1308 403 y(\017)f(\017)g(\017)1309 402 y(\017)g(\017)g(\017)1310 401 y(\017)g(\017)g(\017)g(\017)1311 400 y(\017)g(\017)g(\017)1312 399 y(\017)g(\017)g(\017)1312 398 y(\017)h(\017)f(\017)1313 397 y(\017)g(\017)h(\017)1314 396 y(\017)f(\017)h(\017)f(\017)1315 395 y(\017)g(\017)h(\017)1316 394 y(\017)f(\017)g(\017)1317 393 y(\017)g(\017)g(\017)1318 392 y(\017)g(\017)g(\017)1318 391 y(\017)h(\017)f(\017)g(\017)1320 390 
y(\017)g(\017)g(\017)1320 389 y(\017)h(\017)f(\017)1321 388 y(\017)g(\017)h(\017)1322 387 y(\017)f(\017)h(\017)1323 386 y(\017)f(\017)g(\017)1324 385 y(\017)g(\017)g(\017)g(\017)1325 384 y(\017)g(\017)g(\017)1326 383 y(\017)g(\017)g(\017)1326 382 y(\017)h(\017)f(\017)1327 381 y(\017)g(\017)h(\017)1328 380 y(\017)f(\017)h(\017)f(\017)1329 379 y(\017)g(\017)g(\017)h(\017)1330 378 y(\017)f(\017)h(\017)1331 377 y(\017)f(\017)g(\017)1332 376 y(\017)g(\017)g(\017)h(\017)1333 375 y(\017)f(\017)g(\017)1334 374 y(\017)g(\017)g(\017)1335 373 y(\017)g(\017)g(\017)1335 372 y(\017)h(\017)f(\017)g(\017)1337 371 y(\017)g(\017)g(\017)1337 370 y(\017)h(\017)f(\017)1338 369 y(\017)h(\017)f(\017)g(\017)1339 368 y(\017)h(\017)f(\017)1340 367 y(\017)h(\017)f(\017)1341 366 y(\017)g(\017)h(\017)1342 365 y(\017)f(\017)h(\017)f(\017)1343 364 y(\017)g(\017)h(\017)1344 363 y(\017)f(\017)h(\017)1345 362 y(\017)f(\017)g(\017)1346 361 y(\017)g(\017)g(\017)h(\017)1347 360 y(\017)f(\017)g(\017)1348 359 y(\017)g(\017)g(\017)1349 358 y(\017)g(\017)g(\017)1349 357 y(\017)h(\017)f(\017)g(\017)1351 356 y(\017)g(\017)g(\017)1351 355 y(\017)h(\017)f(\017)1352 354 y(\017)g(\017)h(\017)f(\017)g(\017)h(\017) 1354 353 y(\017)f(\017)h(\017)f(\017)g(\017)1356 352 y(\017)g(\017)h(\017)f (\017)g(\017)1358 351 y(\017)g(\017)g(\017)h(\017)f(\017)1359 350 y(\017)h(\017)f(\017)h(\017)f(\017)1361 349 y(\017)h(\017)f(\017)g(\017)h (\017)1363 348 y(\017)f(\017)h(\017)f(\017)h(\017)1365 347 y(\017)f(\017)h(\017)f(\017)g(\017)1367 346 y(\017)g(\017)g(\017)h(\017)1368 345 y(\017)g(\017)f(\017)g(\017)h(\017)1370 344 y(\017)f(\017)h(\017)f(\017)g (\017)1372 343 y(\017)g(\017)h(\017)f(\017)g(\017)1374 342 y(\017)g(\017)g(\017)h(\017)f(\017)g(\017)1375 341 y(\017)h(\017)f(\017)h (\017)f(\017)1377 340 y(\017)h(\017)f(\017)g(\017)h(\017)1379 339 y(\017)f(\017)h(\017)f(\017)h(\017)1381 338 y(\017)f(\017)h(\017)f(\017)g (\017)1383 337 y(\017)g(\017)h(\017)f(\017)g(\017)1385 336 y(\017)g(\017)g(\017)h(\017)f(\017)1386 335 
y(\017)h(\017)f(\017)h(\017)f (\017)1388 334 y(\017)h(\017)f(\017)g(\017)h(\017)1390 333 y(\017)g(\017)f(\017)g(\017)h(\017)1392 332 y(\017)f(\017)h(\017)f(\017)g (\017)1394 331 y(\017)g(\017)h(\017)f(\017)g(\017)1396 330 y(\017)g(\017)g(\017)h(\017)f(\017)1398 329 y(\017)g(\017)g(\017)g(\017)h (\017)f(\017)h(\017)f(\017)1400 328 y(\017)h(\017)f(\017)h(\017)f(\017)g (\017)h(\017)f(\017)1404 327 y(\017)g(\017)g(\017)h(\017)f(\017)h(\017)f (\017)1406 326 y(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)1410 325 y(\017)g(\017)g(\017)h(\017)f(\017)h(\017)f(\017)1412 324 y(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)1416 323 y(\017)g(\017)g(\017)h(\017)f(\017)h(\017)f(\017)1418 322 y(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)g(\017)1422 321 y(\017)g(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)h(\017)f(\017)1427 320 y(\017)h(\017)f(\017)h(\017)f(\017)g (\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)1434 319 y(\017)g(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)1439 318 y(\017)h(\017)f(\017)h(\017)f(\017)g (\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)1444 317 y(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)1448 316 y(\017)g(\017)g(\017)h(\017)f(\017)h(\017)f(\017)1450 315 y(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)1454 314 y(\017)g(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)1457 313 y(\017)g(\017)h(\017)f(\017)g(\017)h(\017)f(\017)1460 312 y(\017)g(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)1463 311 y(\017)g(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)1466 310 y(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)h(\017)1476 309 y(\017)f(\017)h(\017)f(\017)h(\017)f (\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h 
(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f (\017)h(\017)f(\017)1488 308 y(\017)h(\017)f(\017)g(\017)h(\017)f(\017)g (\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f (\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)g(\017)h(\017)f(\017)h(\017)1502 307 y(\017)g(\017)f(\017)g (\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g (\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)1531 306 y(\017)g(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f (\017)g(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g (\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g (\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)g(\017)h(\017)f(\017)h(\017)f (\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g 
(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f (\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)h(\017)f(\017)g(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h (\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h (\017)f(\017)g(\017)h(\017)f(\017)g(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g (\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f (\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)g(\017)1650 305 y(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f (\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f (\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)h(\017)f(\017)1669 304 y(\017)h(\017)f(\017)h(\017)f(\017)g (\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f (\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)g(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h (\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f 
(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g (\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f (\017)h(\017)1740 885 y(\017)g(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g (\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f (\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g (\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h(\017)f (\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)h (\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f (\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g (\017)h(\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f (\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h (\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f (\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h (\017)f(\017)h(\017)f(\017)h(\017)f(\017)g(\017)h(\017)f(\017)h(\017)f(\017)g (\017)h(\017)f(\017)h(\017)1862 892 y Fn(TOT)m(AL)-90 1455 y Fl(Figure)20 b(2:)30 b(Pro\014le)21 b(of)e(p)q(ercen)o(tage)j(of)d(time)g 
(at)h(a)g(giv)o(en)f(M\015ops)i(rate)f(as)h(recorded)g(b)o(y)f(the)h(CRA)m(Y) e(Y-MP/464)h(hardw)o(are)-90 1505 y(p)q(erformance)14 b(monitor)e(at)h(NCSA)i (from)d(June)i(1991)f(to)h(June)h(1992.)-90 1637 y Fh(3.1)56 b(Ov)n(erall)17 b(W)-5 b(orkload)-90 1764 y Fl(The)12 b(\014rst)g(results)h (w)o(e)f(presen)o(t)h(are)f(for)g(the)g(o)o(v)o(erall)e(w)o(orkload.)16 b(W)m(e)c(ha)o(v)o(e)f(w)o(orkload)f(pro\014les)j(for)e(all)f(16)h(of)g(the)h (p)q(erformance)g(met-)-90 1863 y(rics)i(describ)q(ed)h(earlier,)f(and)f (also)g(for)g(pairwise)g(comparisons)g(to)g(\014nd)h(relationships)f(b)q(et)o (w)o(een)i(the)f(metrics.)k(Space)c(limitations)-90 1963 y(allo)o(w)e(us)j (to)e(sho)o(w)h(only)f(a)h(few)g(examples)f(represen)o(ting)i(a)f(v)o(ery)g (small)e(amoun)o(t)g(of)h(this)h(information.)-28 2062 y(Figure)h(2\(a\))f (sho)o(ws)h(the)g(M\015ops)g(pro\014le)g(for)f(the)h(w)o(orkload.)k(The)c(v)o (ertical)g(axis)f(giv)o(es)g(the)i(p)q(ercen)o(tage)g(of)e(time)f(sp)q(en)o (t)j(at)e(a)-90 2162 y(giv)o(en)h(M\015ops)i(rate.)24 b(The)17 b Fk(i)p Fl(th)e(bar)h(in)g(the)g(graph)g(is)g(for)f(a)h(bin)g(represen)o (ting)h(the)g(in)o(terv)n(al)e([10)9 b Fi(\003)i Fl(\()p Ff(i)g Fi(\000)g Fl(1\))p Ff(;)c Fl(10)i Fi(\003)h Ff(i)p Fl(])16 b(M\015ops.)25 b(The)-90 2262 y(\014gure)12 b(illustrates)g(that)f(a)g(v)o (ery)h(signi\014can)o(t)f(amoun)o(t)f(of)h(time)f(is)h(sp)q(en)o(t)h(at)g (M\015ops)g(rates)g(nearer)h(the)f(lo)o(w)e(end)i(of)f(the)h(p)q(erformance) -90 2361 y(range)k(than)f(at)g(the)i(high)d(end)i(of)f(the)h(range.)23 b(The)16 b(p)q(eak)g(p)q(erformance)f(of)g(a)g(single)h(Y-MP)f(pro)q(cessor)j (is)d(333)g(M\015ops.)23 b(Ab)q(out)-90 2461 y(one-third)15 b(of)f(the)h(time)e(is)i(sp)q(en)o(t)g(at)g(less)g(than)g(10)f(p)q(ercen)o(t) i(of)e(p)q(eak)h(\(33)f(M\015ops\).)21 b(But)15 b(there)h(are)f(some)f(jobs)g (that)h(execute)h(at)-90 2561 y(more)11 b(than)i(90)e(p)q(ercen)o(t)k(of)c(p) q(eak)i(\(300)f(M\015ops\).)17 b(The)c(a)o(v)o(erage)g(M\015ops)f 
Fk(p)n(er)h(pr)n(o)n(c)n(essor)f Fl(for)g(the)h(w)o(orkload)e(is)h(69.06)f (\(20.7)g(p)q(ercen)o(t)-90 2660 y(of)i(p)q(eak\))i(for)e(an)h(a)o(v)o(erage) g(4-pro)q(cessor)h(system)f(p)q(erformance)f(of)h(276.24)e(M\015ops)i(o)o(v)o (er)g(the)g(one)g(y)o(ear)h(recording)f(p)q(erio)q(d.)939 2828 y(10)p eop %%Page: 11 13 bop -28 195 a Fl(The)19 b(authors)f(conjecture)i(that)e(other)h(mac)o(hines)e (w)o(ould)g(ha)o(v)o(e)h(a)f(M\015ops)i(pro\014le)f(with)g(a)f(shap)q(e)i (similar)d(in)h(form)f(to)i(the)-90 295 y(M\015ops)e(pro\014le)g(sho)o(wn)f (here.)25 b(Individual)14 b(mac)o(hines)h(w)o(ould)f(ha)o(v)o(e)i(their)g(o)o (wn)f(high)g(p)q(ercen)o(tage)j(of)d(time)f(at)h(a)h(lo)o(w)f(rate)h(and)f(a) -90 394 y(particular)f(rate)g(of)f(decline)i(of)e(p)q(ercen)o(tage)j(for)d (higher)h(rates.)-28 494 y(T)m(able)i(3)g(sho)o(ws)h(the)g(relationship)f(of) g(M\015ops)g(to)h(Mmemops)d(for)i(the)h(jobs)f(in)g(the)h(w)o(orkload.)25 b(The)17 b(horizon)o(tal)e(axis)h(giv)o(es)-90 594 y(Mmemops)11 b(while)j(the)g(v)o(ertical)f(axis)g(giv)o(es)h(M\015ops.)k(On)c(the)g(left)f (and)h(b)q(ottom)e(b)q(orders)j(of)e(the)h(table)f(are)h(pro\014les)g(sho)o (wing)f(the)-90 693 y(p)q(ercen)o(tage)j(of)e(time)g(and)g(the)h(n)o(um)o(b)q (er)f(of)g(jobs)g(in)g(the)i(corresp)q(onding)f(ro)o(w)f(and)h(column,)d (resp)q(ectiv)o(ely)m(.)22 b(The)15 b(v)o(ery)f(b)q(ottom)g(of)-90 793 y(the)h(table)g(sho)o(ws)g(some)f(statistics)i(and)f(ho)o(w)f(they)h(w)o (ere)h(computed)f(from)e(the)i(HPM)g(coun)o(ters.)23 b(The)15 b(en)o(tries)h(in)e(the)i(table)f(are)-90 892 y(coun)o(ts)f(for)g(the)g(n)o (um)o(b)q(er)f(of)g(jobs)g(that)h(had)g(a)f(particular)g(pair)h(of)f (M\015ops)h(and)f(Mmemops)f(v)n(alues.)17 b(The)d(data)g(is)f(binned)h(b)q (efore)-90 992 y(en)o(try)h(in)o(to)e(the)h(table.)k(The)c(bins)g(ha)o(v)o(e) g(size)h(10)e(M\015ops)h(and)g(30)f(Mmemops.)-28 1092 y(A)18 b(general)f(trend)h(can)f(b)q(e)h(seen)h(from)c(the)j(table.)28 b(Note,)18 
b(ho)o(w)o(ev)o(er,)g(that)f(programs)f(in)h(the)g(60)g(Mmemops)e (column)h(span)-90 1191 y(almost)d(the)i(en)o(tire)g(M\015ops)g(range,)g(so)g (that)f(the)i(trend)f(is)g(a)f(lo)q(ose)h(one.)21 b(Also,)14 b(see)i(the)f(170)f(M\015ops)h(ro)o(w.)20 b(Not)14 b(surprisingly)m(,)g(to) -90 1291 y(obtain)h(more)g(M\015ops)h(the)h(system)e(m)o(ust)g(usually)g(pro) o(vide)h(more)f(memory)e(bandwidth.)24 b(Ho)o(w)o(ev)o(er,)16 b(this)g(need)h(not)f(alw)o(a)o(ys)e(b)q(e)-90 1391 y(the)k(case.)28 b(Sev)o(eral)17 b(jobs)f(in)h(the)g(w)o(orkload)f(receiv)o(ed)i(high)f (M\015ops)g(p)q(erformance)g(without)f(relativ)o(ely)g(high)h(Mmemops.)25 b(The)-90 1490 y(39)16 b(jobs)h(at)g(\(300,150\))e(buc)o(k)o(ed)i(the)h (trend.)27 b(These)18 b(jobs)f(had)g(a)f(ratio)g(of)h(memory)d(op)q(erations) j(to)g(\015oating)e(p)q(oin)o(t)i(op)q(erations)-90 1590 y(of)e(ab)q(out)h (2.0,)f(whic)o(h)g(is)h(the)g(same)f(ratio)g(as)h(that)g(of)f(the)i(matrix-v) o(ector)d(library)h(routine,)h(MXV.)g(The)g(104)f(jobs)h(at)f(\(300,60\))-90 1689 y(include)g(those)g(with)g(a)f(ratio)g(of)g(4.0,)g(whic)o(h)h(is)f(the)h (same)f(ratio)g(as)h(that)g(of)f(the)i(matrix-m)o(atrix)11 b(library)j(routine,)h(MXM.)g(Both)-90 1789 y(MXV)i(and)f(MXM)h(use)h(a)e (register-based)j(algorithm)14 b(to)i(ac)o(hiev)o(e)h(high)f(p)q(erformance)h (without)f(accessing)i(memory)m(.)24 b(A)16 b(single)-90 1889 y(job)e(at)g(\(0,300\))g(reminds)g(us)h(that)f(M\015ops)h(is)f(not)h(the)g (only)f(p)q(erformance)g(metric.)20 b(This)14 b(job)g(has)h(a)f(v)o(ery)h (high)f(memory)e(access)-90 1988 y(rate)17 b(and)e(ma)o(y)f(also)i(b)q(e)g(p) q(erforming)f(v)o(ector)h(logical)f(or)h(in)o(teger)g(op)q(erations)g(whic)o (h)g(are)g(not)g(coun)o(ted)h(b)o(y)f(the)g(HPM)h(Group)e(0)-90 2088 y(coun)o(ters.)-28 2188 y(T)m(able)d(4)h(is)f(similar)f(to)h(T)m(able)g (3)h(except)h(that)f(the)g(table)g(en)o(tries)h(are)f(for)f(p)q(ercen)o(t)j (of)d(w)o(orkload)g(time)f(rather)j(than)f(n)o(um)o(b)q(er)f(of)-90 2287 y(jobs.)22 b(Bins)15 
representing less than 0.01 percent are replaced by a '-'. Bins with a
percentage greater than 1 percent are boldfaced. This gives a better view of
the importance of the different bins and the lack of relationship between
number of jobs and percent time. Out of a peak memory performance of 500
Mmemops per processor, few jobs exceed 330 Mmemops (66 percent of peak). The
border row at the bottom showing percent time indicates that the
distribution profile of Mmemops, starting at 31.08 percent for [0,30]
Mmemops, has the same profile form as that of Mflops in Figure 2.

              Mflop | rate of memory reference operations in millions per second (Mmemops)
%time   jobs   rate |      0     30     60     90    120    150    180    210    240    270    300    330
11.55 115490      0 | 114139   1208     92     18     30      0      2      0      0      0     *1      0
13.55  47548     10 |  43001   4325    207     15      0      0      0      0      0      0      0      0
 8.81  25925     20 |  15075  10183    517    118     31      1      0      0      0      0      0      0
 8.56  13111     30 |   4532   7543    767    214     51      4      0      0      0      0      0      0
 5.05  10712     40 |   2509   5336   2710    103     22     32      0      0      0      0      0      0
 7.20  10885     50 |   1320   4567   4512    393     16     16     59      2      0      0      0      0
 6.64   8492     60 |    862   2680   3965    717    103     14     63     88      0      0      0      0
 3.32   6573     70 |    383   1365   3184   1362    199     50      3     26      1      0      0      0
 3.20   6682     80 |    324   1281   2628   1936    353    151      5      0      3      1      0      0
 2.90   5223     90 |    375    858   1339   1627    670    250     94      6      0      4      0      0
 3.41   5471    100 |    154    814   1017   2429    804    145     91     17      0      0      0      0
 4.39   5824    110 |    206   1239   1024   2166    772    316     52     44      5      0      0      0
 4.05   6959    120 |    151   1696    781   2993    983    232     66     48      9      0      0      0
 2.76   4349    130 |    178    815    729   1057    879    390    136    139     24      2      0      0
 2.48   3902    140 |    135    564    483   1122    849    447    105    130     50     17      0      0
 3.13   3400    150 |     75    578    621    667    785    396    168     32     20     36     22      0
 2.95   3598    160 |     51    825    338    840    856    356    168     30     15     42     77      0
 1.54   1862    170 |     91    240    135    381    417    285    125     64     25     18     44     37
 1.49   2134    180 |    485    128    146    154    763    215    101     61     27     20     32      2
 0.92   1440    190 |    494    241     93     59    235    142     51     37     52     23      7      6
 0.47   1017    200 |    187    159     58    154    174    148     32     60     15     25      5      0
 0.90    635    210 |     67     14     54     46    153    149     13     85      7     24     23      0
 0.22    343    220 |     99     40     33     31     15     12      1     91     14      1      4      2
 0.20    143    230 |      0      5     32     30      8      8      0     20     20     20      0      0
 0.04     85    240 |      0      0     19     26     10      5      1      1     12     11      0      0
 0.03     65    250 |      0      0     18     34      1      3      0      0      5      4      0      0
 0.04     58    260 |      0      0     18     26     13      0      1      0      0      0      0      0
 0.00     39    270 |      0      0     16      6     17      0      0      0      0      0      0      0
 0.00     38    280 |      0      0     34      3      0      1      0      0      0      0      0      0
 0.04    102    290 |      0      0     70     30      0      2      0      0      0      0      0      0
 0.14    143    300 |      0      0   *104      0      0    *39      0      0      0      0      0      0
 0.00      6    310 |      0      0      0      0      0     *6      0      0      0      0      0      0
               jobs | 184893  46704  25744  18757   9209   3815   1337    981    304    248    215     47
              %time |  31.08  26.95  18.90  11.79   6.91   2.63   0.68   0.55   0.18   0.15   0.16   0.02

ave Mflops    69.06 out of 333 Mflops     ave Mmemops   61.00 out of 500 Mmemops
total time    17801.6080 hours            Mmemops       1.0e-6 * ctr(6) / CPU_time
total jobs    292254                      Mflops        1.0e-6 * (ctr(3) + ctr(4) + ctr(5)) / CPU_time
ave Mmemops   1.0e-6 * TOTAL_ctr(6) / TOTAL_CPU_time
ave Mflops    1.0e-6 * TOTAL_(ctr(3) + ctr(4) + ctr(5)) / TOTAL_CPU_time

Table 3: Mflops vs. memory reference rate by number of jobs
(* marks a boldfaced entry)

              Mflop | rate of memory reference operations in millions per second (Mmemops)
%time   jobs   rate |      0     30     60     90    120    150    180    210    240    270    300    330
11.55 115490      0 | *11.11   0.39   0.04      -      -      -      -      -      -      -      -      -
13.55  47548     10 | *10.81  *2.58   0.16      -      -      -      -      -      -      -      -      -
 8.81  25925     20 |  *3.67  *4.23   0.90   0.01      -      -      -      -      -      -      -      -
 8.56  13111     30 |  *2.28  *5.00  *1.02   0.25      -      -      -      -      -      -      -      -
 5.05  10712     40 |   0.89  *2.90  *1.21   0.01      -   0.03      -      -      -      -      -      -
 7.20  10885     50 |   0.98  *1.98  *2.81  *1.31      -   0.06   0.07      -      -      -      -      -
 6.64   8492     60 |   0.57  *1.75  *2.72  *1.42   0.01   0.01   0.06   0.10      -      -      -      -
 3.32   6573     70 |   0.12  *1.10   0.91   0.93   0.21   0.02      -   0.03      -      -      -      -
 3.20   6682     80 |   0.11   0.37   0.76  *1.22   0.58   0.16      -      -      -      -      -      -
 2.90   5223     90 |   0.12   0.37  *1.15   0.56   0.37   0.24   0.08      -      -      -      -      -
 3.41   5471    100 |   0.09  *1.12   0.70  *1.06   0.28   0.07   0.04   0.05      -      -      -      -
 4.39   5824    110 |   0.18  *1.73   0.56  *1.54   0.25   0.11   0.01   0.01      -      -      -      -
 4.05   6959    120 |   0.02  *1.56  *1.26   0.83   0.24   0.09   0.04   0.01      -      -      -      -
 2.76   4349    130 |   0.03   0.47   0.70   0.67   0.65   0.17   0.04   0.03      -      -      -      -
 2.48   3902    140 |   0.05   0.55   0.25   0.62   0.81   0.16   0.01   0.01   0.01      -      -      -
 3.13   3400    150 |      -   0.45   0.79   0.50  *1.17   0.18   0.02      -      -      -      -      -
 2.95   3598    160 |      -   0.32  *1.27   0.52   0.53   0.18   0.04      -      -   0.01   0.08      -
 1.54   1862    170 |   0.01   0.03   0.30   0.15   0.34   0.56   0.07   0.01      -      -   0.07   0.02
 1.49   2134    180 |   0.01   0.01   0.59   0.06   0.43   0.23   0.03   0.11   0.01      -   0.01      -
 0.92   1440    190 |   0.01   0.02   0.20   0.04   0.35   0.14   0.01   0.05   0.08   0.01      -      -
 0.47   1017    200 |   0.01   0.02   0.13   0.04   0.15   0.04   0.02   0.03   0.03   0.01      -      -
 0.90    635    210 |      -      -   0.20   0.02   0.26   0.14   0.11   0.05      -   0.12      -      -
 0.22    343    220 |      -      -   0.08   0.01   0.06   0.01      -   0.05      -      -      -      -
 0.20    143    230 |      -      -   0.01      -   0.17      -      -   0.01      -      -      -      -
 0.04     85    240 |      -      -   0.01      -   0.01   0.01      -      -   0.01      -      -      -
 0.03     65    250 |      -      -      -      -      -   0.01      -      -   0.01      -      -      -
 0.04     58    260 |      -      -   0.01   0.01   0.03      -      -      -      -      -      -      -
 0.00     39    270 |      -      -      -      -      -      -      -      -      -      -      -      -
 0.00     38    280 |      -      -      -      -      -      -      -      -      -      -      -      -
 0.04    102    290 |      -      -   0.02   0.01      -      -      -      -      -      -      -      -
 0.14    143    300 |      -      -   0.14      -      -      -      -      -      -      -      -      -
 0.00      6    310 |      -      -      -      -      -      -      -      -      -      -      -      -
               jobs | 184893  46704  25744  18757   9209   3815   1337    981    304    248    215     47
              %time |  31.08  26.95  18.90  11.79   6.91   2.63   0.68   0.55   0.18   0.15   0.16   0.02

ave Mflops    69.06 out of 333 Mflops     ave Mmemops   61.00 out of 500 Mmemops
total time    17801.6080 hours            Mmemops       1.0e-6 * ctr(6) / CPU_time
total jobs    292254                      Mflops        1.0e-6 * (ctr(3) + ctr(4) + ctr(5)) / CPU_time
ave Mmemops   1.0e-6 * TOTAL_ctr(6) / TOTAL_CPU_time
ave Mflops    1.0e-6 * TOTAL_(ctr(3) + ctr(4) + ctr(5)) / TOTAL_CPU_time

Table 4: Mflops vs. memory reference rate by percent of workload time
(* marks a boldfaced entry)

Table 5 shows the relationship of Mflops to MIPS for the jobs in the
workload. Table 6 shows the same relationship but by percentage of workload
time. The bins in the tables have size 10 Mflops and 10 MIPS. A general
trend can be seen in the tables. Note, however, that programs in the 20 MIPS
column span the entire range of Mflops, so that the trend is a loose one.
Also see the 30 Mflops row. Not surprisingly, for a vector machine, there is
a general inverse relationship between MIPS and Mflops. Out of a 167 MIPS
peak rate, jobs executing at over 100 MIPS have a very low Mflops rate. Most
jobs are in the range of 30 to 70 MIPS, with an average MIPS rate of 41.07.
The general inverse relationship between MIPS and Mflops does not hold at
low MIPS values. Very low MIPS is not a guarantee of high Mflops. There
appears to be an optimal value for MIPS in the range of 10 to 30. The jobs
that achieve these values are those with the same characteristics as the MXM
and MXV library routines.

3.2 Individual Application Areas

Next we present results for individual application areas within the
workload. For each application area we have profiles for all 16 of the
performance metrics
described earlier, and also pairwise comparisons to find relationships
between the metrics. Space limitations allow us to show only a few examples
representing a very small amount of this information.

Table 7 describes the top 10 time-consuming application areas at NCSA from
June 1991 to June 1992. Recall that the application area for a job is that
defined by the job's user when applying for an allocation of time on the
machine, and that the area is recorded as "Unknown" when the user did not
specify an area. The table lists the application areas in decreasing use of
time, and gives a five letter abbreviation (to be used later), the number of
jobs, and the average Mflops rate per processor for each area. The top 10
user populations consume more than 80 percent of the total workload time.
Statistics for the other remaining 16 application areas are condensed into
entries on a single line. Data for the entire workload are given on the last
line. The Mflops performance for the top 10 application areas range from
38.32 (11.5 percent of peak) for (number 9) Mechanical and Structural
Systems to 129.89 (39.0 percent of peak) for (number 10) Mathematical
Sciences over the one year recording period.
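
The rate formulas given in the footers of Tables 3 through 5, the binning
used to fill those tables, and the percent-of-peak figures quoted above can
be sketched in a few lines of Python. This is only an illustration, not the
authors' analysis code: the function names, the list layout for the
counters, the assumption that CPU time is in seconds, and the choice to
label a bin by its lower edge are all made up here. Only the formulas and
the per-processor peak rates (333 Mflops, 500 Mmemops, 167 MIPS) are taken
from the paper.

```python
# Illustrative sketch only; names and data layout are assumptions.
PEAK_MFLOPS = 333.0    # per-processor peaks quoted in the table footers
PEAK_MMEMOPS = 500.0
PEAK_MIPS = 167.0


def hpm_rates(ctr, cpu_time):
    """Return (MIPS, Mflops, Mmemops) for one job.

    ctr holds HPM group 0 counter values, indexed like ctr(0)..ctr(6) in
    the table footers; cpu_time is assumed to be CPU time in seconds.
    """
    mips = 1.0e-6 * ctr[0] / cpu_time
    mflops = 1.0e-6 * (ctr[3] + ctr[4] + ctr[5]) / cpu_time
    mmemops = 1.0e-6 * ctr[6] / cpu_time
    return mips, mflops, mmemops


def bin_label(rate, width):
    """Lower edge of the bin of the given width that contains rate."""
    return int(rate // width) * width


def percent_of_peak(rate, peak):
    """Express a rate as a percentage of the peak rate."""
    return 100.0 * rate / peak


if __name__ == "__main__":
    # Hypothetical counter values for a single job, chosen for illustration.
    ctr = [2.0e9, 0, 0, 1.5e10, 1.2e10, 3.2e9, 1.55e10]
    mips, mflops, mmemops = hpm_rates(ctr, cpu_time=100.0)
    # Table 3 cell for this job: 10-Mflops rows, 30-Mmemops columns.
    print(bin_label(mflops, 10), bin_label(mmemops, 30))
    # Workload average Mflops as a percentage of the 333 Mflops peak.
    print(round(percent_of_peak(69.06, PEAK_MFLOPS), 1))
```

Each job contributes one count to the cell selected by its pair of bin
labels in Tables 3 and 5, and its share of total CPU time to the same cell
in Tables 4 and 6.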

We show the profile of Mflops for each application area in Figure 2(b). For
ease of viewing and comparison of all

             Mflops | rate of instructions issued in millions per second (MIPS)
%time   jobs   rate |      0     10     20     30     40     50     60     70     80     90    100
11.55 115490      0 |      9     59    411   4794  60892  37638   9308   1363    507    507      2
13.55  47548     10 |      0    125    100    762  13228  22909   7962   2282    146     32      2
 8.81  25925     20 |      2    151    116   1313  12111   8168   2502   1019    540      3      0
 8.56  13111     30 |     19    168    301   1515   4220   4538   1462    533    251     75     29
 5.05  10712     40 |      0    126    377   1909   3545   2569   1892    252     36      6      0
 7.20  10885     50 |      0    247    312   2562   3893   1560   1370    912     22      7      0
 6.64   8492     60 |      1    327    780   2356   2728   1158    303    839      0      0      0
 3.32   6573     70 |      4    213    969   2061   2154    789    368     15      0      0      0
 3.20   6682     80 |      7    342   1031   1785   2477    700    335      5      0      0      0
 2.90   5223     90 |      2    169    655   2048   1499    628    212     10      0      0      0
 3.41   5471    100 |      5    364   1297   1982   1542    222     59      0      0      0      0
 4.39   5824    110 |     37    726   2368   1769    773    117     34      0      0      0      0
 4.05   6959    120 |    235   1034   3339   1802    492     55      2      0      0      0      0
 2.76   4349    130 |    113    332   2316   1207    363     17      1      0      0      0      0
 2.48   3902    140 |      1    292   1908   1493    194     14      0      0      0      0      0
 3.13   3400    150 |      2    653   1654    715    369      4      3      0      0      0      0
 2.95   3598    160 |      0    752   1776    786    279      5      0      0      0      0      0
 1.54   1862    170 |      1    347   1063    358     93      0      0      0      0      0      0
 1.49   2134    180 |      0    761   1144    150     78      1      0      0      0      0      0
 0.92   1440    190 |      0    774    503    119     42      2      0      0      0      0      0
 0.47   1017    200 |      0    634    278     83     21      1      0      0      0      0      0
 0.90    635    210 |      1    415    144     42     26      6      1      0      0      0      0
 0.22    343    220 |      0    156    140     45      2      0      0      0      0      0      0
 0.20    143    230 |      0     10     86     47      0      0      0      0      0      0      0
 0.04     85    240 |      0      1     57     25      0      2      0      0      0      0      0
 0.03     65    250 |      0      1     56      8      0      0      0      0      0      0      0
 0.04     58    260 |      0      3     55      0      0      0      0      0      0      0      0
 0.00     39    270 |      0      1     34      4      0      0      0      0      0      0      0
 0.00     38    280 |      0     21     17      0      0      0      0      0      0      0      0
 0.04    102    290 |      0      6     96      0      0      0      0      0      0      0      0
 0.14    143    300 |      0     23    120      0      0      0      0      0      0      0      0
 0.00      6    310 |      0      0     *6      0      0      0      0      0      0      0      0
               jobs |    439   9233  23509  31740 111021  81103  25814   7230   1502    630     33
              %time |   0.13  10.40  12.97  21.51  25.05  19.16   6.66   2.12   0.82   1.17   0.00

ave Mflops    69.06 out of 333 Mflops     ave MIPS      41.70 out of 167 MIPS
total time    17801.6080 hours            MIPS          1.0e-6 * ctr(0) / CPU_time
total jobs    292254                      Mflops        1.0e-6 * (ctr(3) + ctr(4) + ctr(5)) / CPU_time
ave MIPS      1.0e-6 * TOTAL_ctr(0) / TOTAL_CPU_time
ave Mflops    1.0e-6 * TOTAL_(ctr(3) + ctr(4) + ctr(5)) / TOTAL_CPU_time

Table 5: Mflops vs. instruction issue rate (MIPS) by number of jobs
(* marks a boldfaced entry)

              Mflop | rate of instructions issued in millions per second (MIPS)
%time   jobs   rate |      0     10     20     30     40     50     60     70     80     90    100
11.55 115490      0 |      -   0.03   0.06   0.90  *4.21  *3.74   0.91   0.31   0.27   1.12      -
13.55  47548     10 |      -   0.16   0.02   0.58  *3.69  *6.07  *2.07   0.52   0.43   0.02      -
 8.81  25925     20 |      -   0.13   0.05   1.31  *2.94  *2.65  *1.31   0.35   0.07      -      -
 8.56  13111     30 |      -   0.07   0.36  *2.07  *1.57  *2.99  *1.25   0.16   0.05   0.03      -
 5.05  10712     40 |      -   0.02   0.61  *1.06  *1.96   0.87   0.38   0.15      -      -      -
 7.20  10885     50 |      -   0.04   0.10  *2.83  *2.18  *1.28   0.37   0.39      -      -      -
 6.64   8492     60 |      -   0.05   0.14  *3.39  *2.01   0.62   0.20   0.23      -      -      -
 3.32   6573     70 |      -   0.09   0.33  *1.51  *1.00   0.32   0.08   0.01      -      -      -
 3.20   6682     80 |      -   0.19   0.58   0.94  *1.23   0.23   0.03      -      -      -      -
 2.90   5223     90 |      -   0.14   0.66  *1.03   0.87   0.17   0.02      -      -      -      -
 3.41   5471    100 |   0.01   0.66   0.70  *1.12   0.83   0.05   0.03      -      -      -      -
 4.39   5824    110 |   0.05  *1.93   0.99  *1.04   0.33   0.06      -      -      -      -      -
 4.05   6959    120 |   0.02  *1.65  *1.26   0.99   0.10   0.03      -      -      -      -      -
 2.76   4349    130 |   0.01   0.74   0.84   0.72   0.44   0.01      -      -      -      -      -
 2.48   3902    140 |      -   0.51  *1.25   0.54   0.17   0.01      -      -      -      -      -
 3.13   3400    150 |      -  *1.59   0.84   0.28   0.40   0.01      -      -      -      -      -
 2.95   3598    160 |      -  *1.06   0.96   0.43   0.50      -      -      -      -      -      -
 1.54   1862    170 |
391 1357 V 400 1357 V 57 w(0.02)129 b(0.47)79 b(0.84)f(0.18)100 b(0.03)138 b(-)i(-)119 b(-)g(-)109 b(-)h(-)p 2029 1357 V 2038 1357 V -91 1407 V -82 1407 V -36 1392 a(1.49)p 62 1407 V 90 w(2134)p 236 1407 V 92 w(180)p 391 1407 V 400 1407 V 117 w(-)130 b(0.38)79 b(0.58)f(0.08)100 b(0.38)78 b(0.07)139 b(-)119 b(-)g(-)109 b(-)h(-)p 2029 1407 V 2038 1407 V -91 1457 V -82 1457 V -36 1442 a(0.92)p 62 1457 V 90 w(1440)p 236 1457 V 92 w(190)p 391 1457 V 400 1457 V 117 w(-)130 b(0.12)79 b(0.59)f(0.12)100 b(0.09)138 b(-)i(-)119 b(-)g(-)109 b(-)h(-)p 2029 1457 V 2038 1457 V -91 1507 V -82 1507 V -36 1492 a(0.47)p 62 1507 V 90 w(1017)p 236 1507 V 92 w(200)p 391 1507 V 400 1507 V 117 w(-)130 b(0.17)79 b(0.21)f(0.09)160 b(-)139 b(-)h(-)119 b(-)g(-)109 b(-)h(-)p 2029 1507 V 2038 1507 V -91 1556 V -82 1556 V -36 1542 a(0.90)p 62 1556 V 111 w(635)p 236 1556 V 92 w(210)p 391 1556 V 400 1556 V 117 w(-)130 b(0.17)79 b(0.58)f(0.03)100 b(0.12)138 b(-)i(-)119 b(-)g(-)109 b(-)h(-)p 2029 1556 V 2038 1556 V -91 1606 V -82 1606 V -36 1591 a(0.22)p 62 1606 V 111 w(343)p 236 1606 V 92 w(220)p 391 1606 V 400 1606 V 117 w(-)130 b(0.02)79 b(0.15)f(0.05)160 b(-)139 b(-)h(-)119 b(-)g(-)109 b(-)h(-)p 2029 1606 V 2038 1606 V -91 1656 V -82 1656 V -36 1641 a(0.20)p 62 1656 V 111 w(143)p 236 1656 V 92 w(230)p 391 1656 V 400 1656 V 117 w(-)190 b(-)80 b(0.02)e(0.18)160 b(-)139 b(-)h(-)119 b(-)g(-)109 b(-)h(-)p 2029 1656 V 2038 1656 V -91 1706 V -82 1706 V -36 1691 a(0.04)p 62 1706 V 132 w(85)p 236 1706 V 92 w(240)p 391 1706 V 400 1706 V 117 w(-)190 b(-)80 b(0.01)e(0.02)160 b(-)139 b(-)h(-)119 b(-)g(-)109 b(-)h(-)p 2029 1706 V 2038 1706 V -91 1756 V -82 1756 V -36 1741 a(0.03)p 62 1756 V 132 w(65)p 236 1756 V 92 w(250)p 391 1756 V 400 1756 V 117 w(-)190 b(-)80 b(0.02)e(0.01)160 b(-)139 b(-)h(-)119 b(-)g(-)109 b(-)h(-)p 2029 1756 V 2038 1756 V -91 1806 V -82 1806 V -36 1791 a(0.04)p 62 1806 V 132 w(58)p 236 1806 V 92 w(260)p 391 1806 V 400 1806 V 117 w(-)130 b(0.01)79 b(0.04)138 b(-)161 b(-)139 b(-)h(-)119 
b(-)g(-)109 b(-)h(-)p 2029 1806 V 2038 1806 V -91 1855 V -82 1855 V -36 1840 a(0.00)p 62 1855 V 132 w(39)p 236 1855 V 92 w(270)p 391 1855 V 400 1855 V 117 w(-)190 b(-)140 b(-)f(-)161 b(-)139 b(-)h(-)119 b(-)g(-)109 b(-)h(-)p 2029 1855 V 2038 1855 V -91 1905 V -82 1905 V -36 1890 a(0.00)p 62 1905 V 132 w(38)p 236 1905 V 92 w(280)p 391 1905 V 400 1905 V 117 w(-)190 b(-)140 b(-)f(-)161 b(-)139 b(-)h(-)119 b(-)g(-)109 b(-)h(-)p 2029 1905 V 2038 1905 V -91 1955 V -82 1955 V -36 1940 a(0.04)p 62 1955 V 111 w(102)p 236 1955 V 92 w(290)p 391 1955 V 400 1955 V 117 w(-)190 b(-)80 b(0.04)138 b(-)161 b(-)139 b(-)h(-)119 b(-)g(-)109 b(-)h(-)p 2029 1955 V 2038 1955 V -91 2005 V -82 2005 V -36 1990 a(0.14)p 62 2005 V 111 w(143)p 236 2005 V 92 w(300)p 391 2005 V 400 2005 V 117 w(-)190 b(-)80 b(0.14)138 b(-)161 b(-)139 b(-)h(-)119 b(-)g(-)109 b(-)h(-)p 2029 2005 V 2038 2005 V -91 2055 V -82 2055 V -36 2040 a(0.00)p 62 2055 V 152 w(6)p 236 2055 V 93 w(310)p 391 2055 V 400 2055 V 117 w(-)190 b(-)140 b(-)f(-)161 b(-)139 b(-)h(-)119 b(-)g(-)109 b(-)h(-)p 2029 2055 V 2038 2055 V -90 2056 2128 2 v -90 2066 V -91 2116 2 50 v -82 2116 V 139 2101 a(jobs)p 236 2116 V 224 w(439)121 b(9233)49 b(23509)f(31740)h (111021)f(81103)h(25814)f(7230)h(1502)61 b(630)81 b(33)p 2029 2116 V 2038 2116 V -90 2118 2128 2 v -91 2167 2 50 v -82 2167 V 97 2153 a(\045time)p 236 2167 V 211 w(0.13)108 b(10.40)58 b(12.97)g(21.51)78 b(25.05)58 b(19.16)78 b(6.66)58 b(2.12)g(0.82)48 b(1.17)h(0.00)p 2029 2167 V 2038 2167 V -90 2169 2128 2 v -91 2219 2 50 v -82 2219 V -12 2204 a Fm(a)o(v)o(e)15 b(M\015ops)p 236 2219 V 49 w(69.06)e Fl(out)h(of)g(333)f(M\015ops)p 727 2219 V 101 w Fm(a)o(v)o(e)j(MIPS)p 1034 2219 V 49 w(41.70)e Fl(out)f(of)h(167)f(MIPS)p 2029 2219 V 2038 2219 V -90 2221 2128 2 v -91 2270 2 50 v -82 2270 V 32 2255 a(total)g(time)p 236 2270 V 49 w(17801.6080)e(hours)p 727 2270 V 333 w(MIPS)p 1034 2270 V 50 w(1.0e-6)i Fi(\003)p Ff(ctr)q Fl(\(0\))p Ff(=C)s(P)6 b(U)p 1428 2255 13 2 v 18 w(time)p 2029 2270 2 50 v 
2038 2270 V -90 2272 2128 2 v -91 2322 2 50 v -82 2322 V 40 2307 a Fl(total)13 b(jobs)p 236 2322 V 50 w(292254)p 727 2322 V 501 w(M\015ops)p 1034 2322 V 50 w(1.0e-6)g Fi(\003)p Fl(\()p Ff(ctr)q Fl(\(3\))c(+)g Ff(ctr)q Fl(\(4\))h(+)f Ff(ctr)q Fl(\(5\)\))p Ff(=C)s(P)d(U)p 1773 2307 13 2 v 19 w(time)p 2029 2322 2 50 v 2038 2322 V -90 2324 2128 2 v -91 2373 2 50 v -82 2373 V 35 2358 a Fl(a)o(v)o(e)14 b(MIPS)p 236 2373 V 50 w(1.0e-6)f Fi(\003)p Ff(T)6 b(O)q(T)g(AL)p 557 2358 13 2 v 15 w(ctr)q Fl(\(0\))p Ff(=T)g(O)q(T)g(AL)p 851 2358 V 15 w(C)s(P)g(U)p 964 2358 V 18 w(time)p 2029 2373 2 50 v 2038 2373 V -90 2375 2128 2 v -91 2425 2 50 v -82 2425 V 18 2410 a Fl(a)o(v)o(e)14 b(M\015ops)p 236 2425 V 50 w(1.0e-6)f Fi(\003)p Ff(T)6 b(O)q(T)g(AL)p 557 2410 13 2 v 15 w Fl(\()p Ff(ctr)q Fl(\(3\))j(+)h Ff(ctr)q Fl(\(4\))f(+)g Ff(ctr)q Fl(\(5\)\))p Ff(=T)d(O)q(T)g(AL)p 1196 2410 V 15 w(C)s(P)g(U)p 1310 2410 V 19 w(time)p 2029 2425 2 50 v 2038 2425 V -90 2427 2128 2 v 241 2503 a Fl(T)m(able)13 b(6:)k(M\015ops)e(vs.)j(instruction)c(issue)h (rate)f(\(MIPS\))h(b)o(y)e(p)q(ercen)o(t)j(of)d(w)o(orkload)g(time)939 2828 y(16)p eop %%Page: 17 19 bop 1420 230 a Fl(CPU)135 b(Av)o(erage)276 280 y(ID)160 b(Application)13 b(Area)437 b(Jobs)90 b(Hours)51 b(M\015ops/CPU)p 104 296 1712 2 v 205 331 a(1)f(PHYSC)64 b(Ph)o(ysics)580 b(33,490)69 b(2,998.3)184 b(88.75)205 381 y(2)50 b(MA)m(TER)i(Materials)14 b(Researc)o(h)370 b(29,433)69 b(2,569.9)184 b(63.63)205 431 y(3)50 b(CHEMT)g(Chemical)13 b(and)g(Thermal)g(Systems)135 b(17,340)69 b(1,743.6)184 b(98.94)205 481 y(4)50 b(ASTR)o(O)61 b(Astronomical)13 b(Sciences)312 b(18,792)69 b(1,508.4)184 b(78.26)205 531 y(5)50 b(MOLEB)k(Molecular)14 b(Biosciences)341 b(3,652)69 b(1,276.1)184 b(45.01)205 580 y(6)50 b(UNKNO)g(Unkno)o(wn)542 b(43,199)69 b(1,230.7)184 b(54.48)205 630 y(7)50 b(INDUS)75 b(Industrial)14 b(P)o(artners)370 b(20,645)69 b(1,039.3)184 b(47.20)205 680 y(8)50 b(CHEMI)65 b(Chemistry)546 b(7,175)101 b(941.2)185 b(54.16)205 730 y(9)50 b(MECHS)57 
b(Mec)o(hanical)14 b(and)g(Structural)h(Systems)70 b(15,549)101 b(916.6)185 b(38.32)184 780 y(10)50 b(MA)m(THS)57 b(Mathematical)13 b(Sciences)324 b(4,590)101 b(722.8)165 b(129.89)129 829 y(11-26)49 b(OTHER)55 b(All)14 b(other)g(areas)445 b(98,389)69 b(2,854.7)184 b(54.55)p 104 846 V 483 881 a(T)m(otal)13 b(System)452 b(292,254)48 b(17,801.6)184 b(69.06)-90 970 y(T)m(able)12 b(7:)18 b(T)m(op)12 b(10)h(time-consuming)d (application)h(areas)j(at)f(NCSA)g(from)e(June)j(1991)e(to)h(June)h(1992)e (as)h(recorded)i(b)o(y)e(the)h(CRA)m(Y)-90 1020 y(Y-MP/464)f(hardw)o(are)h(p) q(erformance)g(monitor)-90 1197 y(10)e(areas,)h(w)o(e)g(presen)o(t)i(the)e (pro\014les)g(as)g(cum)o(ulativ)o(e)e(p)q(ercen)o(t)j(time)e(for)g(eac)o(h)h (area)g(at)g(a)f(giv)o(en)g(M\015ops)h(rate)h(or)e(less.)19 b(The)13 b(ra)o(w)f(job)-90 1296 y(data)j(is)h(reduced)h(to)f(bins)g(of)f (size)h(10)f(M\015ops)h(and)g(hence)h(this)f(\014gure)g(ma)o(y)e(di\013er)i (sligh)o(tly)e(from)g(statistics)j(rep)q(orted)g(in)e(other)-90 1396 y(\014gures)g(based)f(on)g(individual)e(job)h(data.)18 b(The)c(M\015ops)g(pro\014le)g(for)g(the)g(total)g(w)o(orkload)e(is)i(also)f (sho)o(wn)h(for)g(comparison.)-28 1495 y(The)h(M\015ops)f(pro\014le)g(curv)o (e)h(for)e(the)i(total)e(w)o(orkload)g(is)g(inexplicably)g(and)h(remark)n (ably)e(similar)f(to)j(the)h(PIPE)f(function[12)o(],)-90 1595 y(100*p\(x\)=100/\(1+nhalf/x\),)f(where)k(here,)g(nhalf)d(is)i(ab)q(out)f(60) g(M\015ops)h(\(69)f(M\015ops)h(for)f(un)o(binned)h(data\).)23 b(In)15 b(general,)h(most)-90 1695 y(user)g(p)q(opulations)f(ha)o(v)o(e)g(a)f (\\similar")f(M\015ops)i(pro\014le.)22 b(In)15 b(particular,)g(there)h(are)g (no)f(application)f(areas)h(that)h(execute)h(only)d(at)-90 1794 y(high)f(rates.)19 b(No)14 b(user)h(group)f(has)g(signi\014can)o(t)f (time)g(o)o(v)o(er)h(200)f(M\015ops.)18 b(Except)e(for)d(MA)m(THS,)g(all)g (user)i(p)q(opulations)e(ha)o(v)o(e)h(more)-90 1894 y(than)e(25)g(p)q(ercen)o (t)i(of)d(their)i(time)e(at)h(less)h(than)f(50)g(M\015ops.)18 
b(Eac)o(h)12 b(application)f(area)h(tends)h(to)g(follo)o(w)d(the)j(w)o (orkload)e(PIPE)h(curv)o(e,)-90 1994 y(but)i(with)g(a)f(di\013eren)o(t)i (nhalf.)-28 2093 y(Finally)m(,)d(w)o(e)j(presen)o(t)h(a)e(summary)e(of)h(the) i(16)f(p)q(erformance)g(metrics)g(for)g(the)h(top)f(10)g(application)f(areas) i(and)f(the)h(w)o(orkload)-90 2193 y(for)e(the)i(one)f(y)o(ear)f(recording)i (p)q(erio)q(d)f(in)f(T)m(able)g(8)g(and)h(T)m(able)f(9.)k(The)d(tables)g(ha)o (v)o(e)g(a)f(ro)o(w)h(for)f(eac)o(h)h(application)f(area,)g(and)g(eac)o(h)-90 2292 y(column)h(giv)o(es)h(the)h(a)o(v)o(erage)f(v)n(alue)g(for)g(a)g (di\013eren)o(t)h(p)q(erformance)f(metric.)22 b(Finer)16 b(detail)e(and)h (greater)i(kno)o(wledge)e(comes)g(from)-90 2392 y(examining)f(the)i (pro\014les)h(from)d(whic)o(h)i(these)h(a)o(v)o(erages)g(w)o(ere)g(computed.) 24 b(The)16 b(units)g(and)g(range)g(of)g(p)q(ossible)g(v)n(alues)g(for)f(eac) o(h)-90 2492 y(metric)e(are)i(giv)o(en)e(in)g(the)i(b)q(ottom)d(table)i(ro)o (ws.)k(Recall)13 b(that)h(these)i(statistics)e(are)h(p)q(er)f(pro)q(cessor.) 
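
The PIPE form quoted above for the total-workload Mflops profile can be evaluated directly. The following is a minimal sketch, not the authors' analysis code: the function name `pipe_percent` is hypothetical, and the only inputs taken from the text are the formula 100/(1+nhalf/x) and the two quoted nhalf values (about 60 Mflops for binned data, 69 Mflops for unbinned data).

```python
def pipe_percent(rate_mflops: float, nhalf: float) -> float:
    """Cumulative percent of workload time at or below rate_mflops,
    using the PIPE form 100 / (1 + nhalf / x) quoted in the text.

    The function name and argument names are illustrative assumptions.
    """
    if rate_mflops <= 0.0:
        # The PIPE form tends to 0 as the rate tends to 0; avoid dividing by zero.
        return 0.0
    return 100.0 / (1.0 + nhalf / rate_mflops)


if __name__ == "__main__":
    # nhalf values quoted in the text: ~60 Mflops (binned), 69 Mflops (unbinned).
    for nhalf in (60.0, 69.0):
        for rate in (10.0, 50.0, 100.0, 200.0, 333.0):
            pct = pipe_percent(rate, nhalf)
            print(f"nhalf={nhalf:5.1f}  rate={rate:6.1f} Mflops  cumulative time={pct:6.2f}%")
```

At x = nhalf the expression evaluates to exactly 50, so nhalf can be read as the Mflops rate at or below which the fitted curve places half of the workload time. The curve is only an approximation to the measured profile; it does not reach 100 percent at the 333 Mflops peak.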
          Mflops       A/M  (A-M)/(A+M)   %recip  mem rate  mem/flop     MIPS  flop/inst
---------------------------------------------------------------------------------------
PHYSC      88.75      0.93        -0.03     3.13     64.59      0.73    43.72       2.03
MATER      63.62      1.02         0.01     1.57     63.58      1.00    42.85       1.48
CHEMT      98.93      1.08         0.04     3.47     80.86      0.82    29.81       3.39
ASTRO      78.25      0.76        -0.13     7.64     50.51      0.65    36.16       2.16
MOLEB      45.01      0.90        -0.05     2.51     64.36      1.43    42.11       1.07
UNKNO      54.48      0.91        -0.04     3.77     48.77      0.90    46.25       1.18
INDUS      47.20      1.13         0.06     1.32     54.04      1.14    49.59       0.95
CHEMI      54.15      0.96        -0.01     3.31     67.05      1.24    46.13       1.17
MECHS      38.31      0.99         0.00     2.51     43.93      1.15    48.03       0.80
MATHS     129.89      1.04         0.02     0.56     66.64      0.51    25.86       5.12
---------------------------------------------------------------------------------------
TOTAL      69.06      0.96        -0.02     3.14     61.00      0.88    41.70       1.66
---------------------------------------------------------------------------------------
UNITS      Mil/S  unitless     unitless  percent     Mil/S  unitless    Mil/S   unitless
RANGE    [0,333]   [0,inf)       [-1,1]  [0,100]   [0,500]   [0,inf)  [0,167]     [0,64]

Table 8: Performance Metrics for the top ten time-consuming application areas
at NCSA from June 1991 to June 1992 as recorded by the CRAY Y-MP/464 hardware
performance monitor


         mem/inst  CP/inst  %CP HI  IBF rate  inst/IBF  flops/IBF  time/job      I/O
------------------------------------------------------------------------------------
PHYSC        1.48     3.40   70.59      0.32    135.43     274.90      0.09     0.12
MATER        1.48     3.44   70.93      0.30    140.83     209.09      0.08     0.13
CHEMT        2.72     5.14   80.55      0.19    156.96     520.91      0.10     0.14
ASTRO        1.41     4.09   75.57      0.36     99.36     214.99      0.08     0.16
MOLEB        1.53     3.45   71.04      0.34    122.90     131.36      0.34     0.13
UNKNO        1.06     3.10   67.77      0.35    129.31     152.29      0.03     0.12
INDUS        1.08     2.89   65.51      0.48    102.68      97.73      0.05     0.12
CHEMI        1.46     3.15   68.29      0.35    131.80     154.74      0.13     0.17
MECHS        0.92     2.89   65.41      0.62     76.75      61.22      0.05     0.14
MATHS        2.61     5.96   83.22      0.17    146.07     748.54      0.15     0.10
------------------------------------------------------------------------------------
TOTAL        1.46     3.52   71.59      0.35    118.74     196.66      0.06     0.14
------------------------------------------------------------------------------------
UNITS    unitless unitless percent     Mil/S  unitless   unitless     hours    Mil/S
RANGE      [0,64]  [0,inf) [0,100]  [0,4.75]   [0,inf)    [0,inf)   [0,inf)  [0,inf)

Table 9: Performance Metrics for the top ten time-consuming application areas
at NCSA from June 1991 to June 1992 as recorded by the CRAY Y-MP/464 hardware
performance monitor

The Mflops values have been seen earlier. The two ratios of add to multiply
show the near perfect average balance of these two operations. The average
percentage of reciprocal operations is often less than 4 percent. If 2 or 3
multiplies per reciprocal were subtracted from the number of multiplies to
account for division operations, then in some cases the average ratios of adds
to multiplies would be even more perfect. The average memory reference rate,
61 Mmemops, is about 12 percent of the peak per processor memory bandwidth.
The average memory to floating point ratio varies from 0.51 to 1.43 with a
workload average of slightly less than 1. Average MIPS are low (25 percent)
compared to peak MIPS, although for a vector machine this is to be expected.
Clock periods per instruction is an inverse multiple of MIPS.

The percentage of clock periods holding issue is high for a vector machine
since a single vector instruction may use a CPU resource for a long time and
prevent a subsequent instruction needing the same resource from issuing. The
instruction buffer fetch rate is a measure in part of the modular programming
style of the user since an IBF must usually be done when making a subroutine
call. The instructions per IBF measures the utilization of the instruction
buffers which are up to 128 instructions in length. Some user populations have
average values over 128, indicating that their codes are making use of more
than one instruction buffer (there are 4 instruction buffers in each CPU). The
average number of floating point operations per IBF indicates in part the
modularity of the user's programming style. The highest performing user groups
have high values for this metric.

The average job time appears to be low because a large number of jobs took a
short time to execute. Jobs taking less than 36 seconds accounted for 34.7
percent of the jobs but only 0.98 percent of the workload time. The average
I/O rate is not meaningful except for the workload total since memory port D
which is used for I/O operations in each processor is a system resource.

The average flop/inst and average mem/inst metrics provide an interesting
comparison of vector machines to other machines that have long instructions
capable of initiating many operations at the same time. If one looks at the
entries for the MATHS user group, one finds that the CRAY Y-MP single
processor provided these users, on the average, with the equivalent of 25.86
MIPS where each instruction initiated, on average, 5.12 floating point
operations and 2.61 memory operations for the one year recording period. This
equivalent instruction parallelism of 7.73 does not include the integer,
logical, branching, instruction fetch, implicit memory addressing and other
operations that the Y-MP processor also performed at the same time.


4  Conclusions and Future Work

We have recorded and presented preliminary results on the workload on the CRAY
Y-MP/4 at NCSA for a period of one year. These results have given us detailed
insight into how this supercomputer was used during the recording period.

Our specific results here show that while the average Mflops per processor was
69.06 (20.7 percent of peak), there is room for improvement. The distribution
of Mflops performance is weighted heavily toward the low end of the
performance range. The greatest average performance gain for this workload
will be for hardware and software optimizations targeted at this important
area of *low* performance.

We have shown a comparison of only a few of the 16 performance metrics that we
have derived from the HPM Group 0 counters. Such comparisons show the range of
realized values of the metrics and the sometimes loose relationship trends
between them. While we compared MIPS and Mmemops to Mflops, examples arose in
the workload to remind us that performance is not just Mflops.

The top 10 time-consuming application areas in the workload were examined. A
comparison of the averages of the 16 performance metrics for the various
application areas showed variations in the detailed characteristics and rates.
Some conclusions might be drawn about the different application areas from
these averages, but we are reluctant to do so without further investigation.
Additional details and greater understanding of the performance of the
workload are available in the profiles for these characteristics and rates,
which are the subject of another study.

One conclusion can be reached from the Mflops profiles. While each area had a
different average Mflops rate, the profile of Mflops for each area had similar
characteristics to the overall workload profile. Most application areas had
1) a larger percentage of time spent at low Mflops rates than at higher rates,
2) examples of very high and low performing programs, and 3) a similar form to
their distribution functions. We conclude from the Mflops profiles that, in
the big picture, each application area uses the machine in a similar way. This
does not give us confidence in the use of application area coverage alone as
an analytic criterion for benchmark set construction. Such a criterion may
select benchmark programs whose individual performance statistics are all
high, all low, or all the same. We feel that a good representation of the
workload may be obtained by selecting benchmark programs whose combined
performance statistics match the performance statistics of the workload.

Several of the HPM performance metrics taken together were able to demonstrate
the amount of instruction-level parallelism that occurs within the processor
of the CRAY Y-MP. For one application area the average (floating point and
memory) instruction parallelism was 7.73 at 25.86 MIPS and did not include
other operations that occurred. This average parallelism appears to be large
compared to what one might expect to be possible from an equivalent long
instruction word computer.

4.1  Future Work

The capability and desire to record detailed performance information on
supercomputers over long periods have many potential benefits to the users of
the machines, the organizations who provide them to the users, and performance
evaluators who study their use. Some of these benefits include new
capabilities:

  * to educate users about performance and how to write high performance
    programs
  * to identify users or populations of users who need help
  * to provide an incentive to make users' programs run more efficiently
  * to gauge how much resource is available and how to allocate it to
    responsible users
  * to conduct proper capacity planning
  * to provide funding organizations with a measure of return on investment
  * to share solutions to common problems among sites with similar machines
  * to serve as a standard of comparison for other machines, e.g. MPP, in the
    HPCC program[7].
  * to serve as a target specification for a benchmark set that analytically
    represents this workload
  * to provide vendors with feedback on how their machines are used and how to
    make better ones

Our own plans for future work include a deeper analysis of the execution
characteristics of the different application area user populations on the
machine. Investigation into the relationships between the *profiles* of the
performance metrics introduced in this paper should shed some light on the
variations of performance among the application areas. Additionally, since our
data was collected by month it would be interesting to see if performance
trends exist over time, such as a gradual increase in average Mflops as an
individual user, an application area population, or the entire user population
learns more about using a vector machine.

We plan to compare the performance characteristics of current benchmarks to
those of this academic workload to see how well they may represent the
particular workload at NCSA during this time period. We would like to develop
quantitative metrics which measure the representativeness of a benchmark set
to a given workload.

Over 300 CRAY X-MP, Y-MP and C90 machines with hardware performance monitors
are currently in operation worldwide. The recording software for the
collection of HPM performance data has been incorporated into the standard
release of the operating system software now available on all these machines.
We would look forward to collaborating with other international, national,
state, academic, industrial, and government centers in forming a database of
hardware performance monitor information. Such a database would demonstrate
how these machines are used, and would serve as a valuable resource to those
interested in performance evaluation of supercomputers.^1


5  Acknowledgements

The authors would like to thank Doru Marcusiu of NCSA and Rich Brown of Cray
Research for their help in making the NCSA HPM data available for study.


References

[1] M. Berry, et al. The Perfect Club Benchmarks: Effective Performance
    Evaluation of Supercomputers. The International Journal of Supercomputer
    Applications, 3(3), pages 5-40, Fall 1989.

[2] M. W. Berry. Scientific Workload Characterization By Loop-Based Analyses.
    University of Tennessee, CS-91-140, Knoxville, TN, August 1991.

[3] Mike Berry, George Cybenko and John Larson. Scientific Benchmarks
    Characterization.
Fk(Par)n(al)r(lel)17 b(Computing)p Fl(,)-25 1988 y(pages)d(1173{1194,)d (17\(10&11\),)h(Decem)o(b)q(er)j(1991.)-90 2071 y([4])20 b(D.)d(Bradley)m(,)h (G.)f(Cyb)q(enk)o(o,)h(H.)f(Gao,)h(J.)f(Larson,)i(F.)e(Ahmad,)g(J.)g(Golab,)g (and)h(M.)f(Strak)n(a.)29 b(Sup)q(ercomputer)19 b(W)m(orkload)-25 2121 y(Decomp)q(osition)d(and)h(Analysis.)29 b(In)18 b Fk(Pr)n(o)n(c)n(e)n(e) n(dings)g(of)g(the)h(A)o(CM)f(International)g(Confer)n(enc)n(e)h(on)g(Sup)n (er)n(c)n(omputing)p Fl(,)g(pages)-25 2171 y(458{467,)11 b(June)k(17{21)e (1991.)-90 2254 y([5])20 b(Da)o(vid)10 b(K.)i(Bradley)g(and)g(John)g(L.)f (Larson.)k(A)d(P)o(arallelism-Based)f(Analytic)g(Approac)o(h)h(to)g(P)o (erformance)g(Ev)n(aluation)e(Using)-25 2304 y(Application)i(Programs.)17 b(In)o(vited)d(pap)q(er)h(to)f(app)q(ear)g(in)f Fk(Pr)n(o)n(c)n(e)n(e)n (dings)i(of)g(the)g(IEEE)p Fl(,)f(Septem)o(b)q(er,)g(1993.)-90 2387 y([6])20 b(Ingrid)13 b(Buc)o(her)j(and)e(Joanne)g(Martin.)19 b(Metho)q(dology)13 b(for)h(Characterizing)h(a)e(Scien)o(ti\014c)i(W)m (orkload.)i Fk(L)n(os)e(A)o(lamos)g(National)-25 2436 y(L)n(ab)n(or)n(atory)p Fl(,)e(LA-UR-82-1702,)e(Los)j(Alamos,)d(NM)j(87545,)e(1982.)p -90 2517 840 2 v -44 2544 a Fo(1)-26 2555 y Fp(A)o(t)f(the)g(presen)o(t)e (time,)h(our)h(in)o(v)o(estigatio)o(ns)d(are)j(limited)e(to)i(Cra)o(y)g(mac)o (hines)e(since)h(they)g(are)h(the)f(only)g(systems)g(a)o(v)n(ailable)f(whic)o (h)h(ha)o(v)o(e)g(a)h(compre-)-90 2595 y(hensiv)o(e,)e(non-in)o(trusiv)o(e)f (hardw)o(are)i(p)q(erformance)e(monitor.)13 b(Ho)o(w)o(ev)o(er,)e(the)g(tec)o (hniques)d(w)o(e)k(use)f(are)g(p)q(ortable)e(to)i(other)f(arc)o(hitectures)e (pro)o(vided)h(that)-90 2634 y(comparible)f(lo)o(w-lev)o(el)i(measuremen)o(t) e(facilities)i(are)h(pro)o(vided.)939 2828 y Fl(22)p eop %%Page: 23 25 bop -90 195 a Fl([7])20 b(Melvyn)e(Cimen)o(t,)g(William)c(Sc)o(herlis,)20 b(Stephen)g(Gri\016n,)e(Charles)h(Bro)o(wnstein,)h(et)f(al.)31 b(Grand)18 b(Challenges)g(1993:)27 b(High)-25 245 y(P)o(erformance)13 
b(Computing)f(and)i(Comm)n(unications.)h Fk(National)g(Scienc)n(e)g(F)m (oundation)p Fl(,)g(1993.)-90 328 y([8])20 b(George)12 b(Delic.)k(P)o (erformance)c(Analysis)h(of)f(a)g(24)g(Co)q(de)h(Sample)e(on)i(Cra)o(y)f (X/Y-MP)h(Systems)f(at)h(the)g(Ohio)f(Sup)q(ercomputer)-25 378 y(Cen)o(ter.)23 b(In)15 b Fk(Pr)n(o)n(c)n(e)n(e)n(dings)i(of)f(the)g (Fifth)g(SIAM)h(Confer)n(enc)n(e)f(on)g(Par)n(al)r(lel)g(Pr)n(o)n(c)n(essing) g(for)f(Scienti\014c)i(Applic)n(ations)p Fl(,)e(SIAM,)-25 428 y(Philadelphia,)d(Dann)o(y)h(C.)g(Sorenson,)h(ed.,)g(pages)g(530{535,)d (1991.)-90 511 y([9])20 b(Domineco)10 b(F)m(errari.)15 b(W)m(orkload)c (Characterization)h(and)g(Selection)h(in)f(Computer)f(P)o(erformance)h (Measuremen)o(t.)k Fk(Computer)p Fl(,)-25 560 y(pages)e(18{24,)e(5\(4\),)h (July-August)h(1972.)-90 643 y([10])19 b(Stev)o(en)h(L.)d(Gaede.)32 b(T)m(o)q(ols)17 b(for)h(Researc)o(h)i(in)e(Computer)f(W)m(orkload)g (Characterization)h(and)h(Mo)q(deling.)30 b Fk(Exp)n(erimental)-25 693 y(Computer)14 b(Performanc)n(e)h(and)g(Evaluation)p Fl(,)f (North-Holland,)f(D.)g(F)m(errari)h(and)g(M.)f(Spadoni,)g(pages)h(235{247,)e (1981.)-90 776 y([11])19 b(U.)d(Grenander)i(and)e(R.)f(F.)h(Tsau.)26 b(Quan)o(titativ)o(e)15 b(metho)q(ds)h(for)g(ev)n(aluating)f(computer)h (system)g(p)q(erformance:)23 b(a)16 b(review)-25 826 y(and)d(prop)q(osals.)19 b(In)14 b Fk(Pr)n(o)n(c)n(e)n(e)n(dings)h(of)g(the)g(Confer)n(enc)n(e)g(on)h (Statistic)n(al)e(Metho)n(ds)i(for)e(the)h(Evaluation)h(of)f(Computer)g (Systems)-25 876 y(Performanc)n(e)p Fl(,)e(Bro)o(wn)h(Univ)o(ersit)o(y)m(,)f (No)o(v)o(em)o(b)q(er)g(1971.)-90 959 y([12])19 b(R.)c(W.)h(Ho)q(c)o(kney)g (and)g(C.)g(R.)f(Jesshop)q(e.)26 b(P)o(arallel)15 b(Computers)g(2.)24 b(Adam)15 b(Hilger,)h(Bristol)g(and)g(Philadephia,)f(pages)h(95,)-25 1009 y(1981.)-90 1092 y([13])j(John)e(L.)f(Larson.)27 b(Collecting)16 b(and)g(In)o(terpreting)i(HPM)f(P)o(erformance)g(Data)f(on)g(the)i(CRA)m(Y)e (Y-MP)m(.)26 b Fk(NCSA)17 b(Datalink)p Fl(,)-25 1142 
y(pages)d(14{24,)e(No)o (v)o(em)o(b)q(er-Decem)o(b)q(er)h(1991.)-90 1225 y([14])19 b(John)12 b(Larson)g(and)f(Bob)h(Lutz.)j(PERFTRA)o(CE)d(User's)g(Guide.)i Fk(Cr)n(ay)f(R)n(ese)n(ar)n(ch)f(internal)h(te)n(chnic)n(al)g(r)n(ep)n(ort)p Fl(,)d(a)o(v)n(ailable)g(from)-25 1274 y(the)k(authors,)g(August,)g(1985.)-90 1357 y([15])19 b(Doru)j(Marcusiu.)43 b(P)o(erformance)22 b(Data)g(No)o(w)g (Automatic)e(on)i(CRA)m(Y)g(Y-MP)m(.)42 b Fk(NCSA)22 b(Datalink)p Fl(,)i(5\(4\),)g(pages)e(5{6,)-25 1407 y(Septem)o(b)q(er-Octob)q(er)15 b(1991.)-90 1490 y([16])k(Joanne)13 b(Martin,)e(Ingrid)h(Buc)o(her)h(and)f(T) m(on)o(y)f(W)m(arno)q(c)o(k.)k(W)m(orkload)10 b(Characterization)i(for)g(V)m (ector)g(Computers:)17 b(T)m(o)q(ols)11 b(and)-25 1540 y(T)m(ec)o(hniques.)18 b Fk(L)n(os)d(A)o(lamos)g(National)g(L)n(ab)n(or)n(atory)p Fl(,)e(LA-UR-83-305,)e(Los)j(Alamos,)d(NM)j(87545,)f(1983.)-90 1623 y([17])19 b(Harry)14 b(Nelson.)19 b(Using)13 b(the)i(P)o(erformance)f (Monitors)f(on)h(the)g(X-MP/48.)k Fk(LLNL)d(T)m(entacle)p Fl(,)e(1985.)-90 1706 y([18])19 b(Saul)c(Rosen.)21 b(Lectures)c(on)e(the)h(Measuremen)o(t)f (and)g(Ev)n(aluation)e(of)i(the)h(P)o(erformance)e(of)h(Computing)e(Systems.) 21 b Fk(SIAM,)-25 1756 y(R)n(e)n(gional)15 b(Confer)n(enc)n(e)g(Series)f(in)h (Applie)n(d)g(Mathematics)p Fl(,)e(National)g(Science)i(F)m(oundation,)d (1976.)-90 1839 y([19])19 b(Dic)o(k)12 b(Sato.)j(SCD)d(Conducts)h(P)o (erformance)f(and)g(Benc)o(hmarking)f(T)m(ests.)16 b Fk(SCD)e(Computing)g (News)p Fl(,)d(pages)i(3-5,)e(9\(4\),)h(April,)-25 1889 y(1988.)-90 1972 y([20])21 b(Scien)o(ti\014c)10 b(Sup)q(ercomputing)f(Sub)q(committee.)g Fk(NSF)i(Sup)n(er)n(c)n(omputer)g(Centers)f(Study)p Fl(,)g(IEEE)g(Computer)f (So)q(ciet)o(y)m(,)h(F)m(ebruary)-25 2022 y(1992.)-90 2105 y([21])24 b(UNICOS)14 b(P)o(erformance)g(Utilities)f(Reference)j(Man)o(ual.)h Fk(Cr)n(ay)d(R)n(ese)n(ar)n(ch,)h(Inc.)p Fl(,)e(SR-2040)g(-)g(6.0,)g(1991.) 
-90 2188 y([22])19 b(Elizab)q(eth)12 b(A.)e(William)o(s.)g(Measuremen)o(t)i (of)e(Tw)o(o)g(Scien)o(ti\014c)i(W)m(orkloads)d(Using)i(the)g(CRA)m(Y)f(X-MP) h(P)o(erformance)g(Monitor.)-25 2237 y Fk(Sup)n(er)n(c)n(omputing)k(R)n(ese)n (ar)n(ch)g(Center)p Fl(,)e(SR)o(C-TR-88-020,)e(No)o(v)o(em)o(b)q(er)i(1988.) -90 2320 y([23])19 b(Elizab)q(eth)13 b(William)o(s,)d(C.)i(Thomas)f(My)o(ers) i(and)f(Reb)q(ecca)i(Kosk)o(ela.)h(The)e(Characterization)g(of)f(Tw)o(o)f (Scien)o(ti\014c)i(W)m(orkloads)-25 2370 y(Using)h(the)h(CRA)m(Y)f(X-MP)h (Hardw)o(are)g(P)o(erformance)f(Monitor.)20 b(In)15 b Fk(Pr)n(o)n(c)n(e)n(e)n (dings)g(of)h(Sup)n(er)n(c)n(omputing)g('90)p Fl(,)e(pages)h(142{152,)-25 2420 y(No)o(v)o(em)o(b)q(er)e(1990.)-90 2503 y([24])19 b(Edmond)13 b(W)m(est.)18 b(X-MP)c(System)g(Wide)f(P)o(erformance)h(Analysis.)k Fk(The)d(CLSC)f(News)p Fl(,)f(T)m(oron)o(to,)g(pages)h(15-17,)e(3\(1\),)h (April)-25 2553 y(18,)f(1989.)939 2828 y(23)p eop %%Trailer end userdict /end-hook known{end-hook}if %%EOF From owner-pbwg-comm@CS.UTK.EDU Sat Jun 5 17:04:52 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA01755; Sat, 5 Jun 93 17:04:52 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA29735; Sat, 5 Jun 93 17:04:42 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Sat, 5 Jun 1993 17:04:41 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA29718; Sat, 5 Jun 93 17:04:34 -0400 Via: uk.ac.southampton.ecs; Sat, 5 Jun 1993 22:04:28 +0100 From: R.Hockney@parallel-applications-centre.southampton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Sat, 5 Jun 93 21:56:45 BST Date: Sat, 5 Jun 93 21:04:15 GMT Message-Id: <23001.9306052104@calvados.pac.soton.ac.uk> To: pbwg-comm@cs.utk.edu Subject: Response to Scneider on Speedup Thank you for your contribution to the Speedup debate, particularly concerning what went on at SPEC. 
Personally I think it is too defeatist to fall back on execution time alone (my Temporal Performance), because this means that one cannot compare, even approximately, the performance of one benchmark with another. Also the performance of different sized instances of the same benchmark will vary widely, and is difficult to plot. Whilst I think Temporal Performance should always be reported as the primary measurement, I like the Benchmark Performance with its flop-count that is etched in stone for all time. This has the sound property (unlike Speedup as usually applied) that a greater benchmark performance always means a shorter execution time. It also means that all benchmark performances are measured in approximately the same units (namely Mflop/s), which allows approximate comparisons across benchmarks, which Temporal Performance does not. Also all problem sizes have approximately the same performance when expressed as benchmark Mflop/s, which is a great convenience for plotting and comparison.

More importantly, whilst I can understand in general terms what you and Dave Snelling are saying about an axiomatic approach, I do not understand what you actually propose. What is the axiomatic approach when applied to benchmarking? Can you be specific and give us some text to consider, and procedures to adopt? Concrete suggestions are what we need.
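
As an illustration of the two metrics contrasted above, the following Python sketch computes both for a pair of problem sizes. It assumes the definitions given elsewhere in this thread (Temporal Performance as 1/T, Benchmark Performance as a fixed nominal flop-count divided by T); the machine names, flop-counts and timings are invented for the example and are not measurements.

```python
def temporal_performance(elapsed_s):
    """Temporal Performance: solutions per second, i.e. 1/T."""
    return 1.0 / elapsed_s


def benchmark_performance(nominal_flop, elapsed_s):
    """Benchmark Performance in Mflop/s: fixed nominal flop-count over T."""
    return nominal_flop / elapsed_s / 1.0e6


# Hypothetical nominal flop-counts, fixed as part of the benchmark definition.
NOMINAL_FLOP = {"small": 2.0e9, "large": 1.6e10}

# Hypothetical elapsed wall-clock times in seconds (illustrative only).
timings = {
    ("machine_x", "small"): 10.0,
    ("machine_y", "small"): 8.0,
    ("machine_x", "large"): 80.0,
    ("machine_y", "large"): 64.0,
}

results = {
    key: (temporal_performance(t), benchmark_performance(NOMINAL_FLOP[key[1]], t))
    for key, t in timings.items()
}

for (machine, size), (r_t, r_b) in sorted(results.items()):
    print(f"{machine} {size}: R_T = {r_t:.4f} /s, R_B = {r_b:.1f} Mflop/s")
```

With these made-up numbers the Temporal Performance of each machine differs by a factor of eight between the two problem sizes, while the Benchmark Performance stays the same, which is the plotting convenience the message refers to. For a given problem size, the machine with the shorter time has the larger value under both metrics.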
Roger Hockney

From owner-pbwg-comm@CS.UTK.EDU Tue Jun 8 13:50:28 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA12635; Tue, 8 Jun 93 13:50:28 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA19313; Tue, 8 Jun 93 13:49:51 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 8 Jun 1993 13:49:50 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from elc04.icase.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA19305; Tue, 8 Jun 93 13:49:45 -0400
Received: by elc04 (5.65.1/lanleaf2.4.9) id AA07547; Tue, 8 Jun 93 13:49:38 -0400
Message-Id: <9306081749.AA07547@elc04>
Date: Tue, 8 Jun 93 13:49:38 -0400
From: Sun Xian-He
To: pbwg-comm@cs.utk.edu
Subject: Re: Revised SPEEDUP section
Cc: sun@fluke.icase.edu

Recently, Roger Hockney has sent two messages to this news group. The first is the proposed amendment of speedup, in which he emphasizes that the execution time is very important. The second is a response to readers, in which he emphasizes that the work (in flop-count) is also very important. I agree with his points. In fact, if we agree that time and work are the two important factors under consideration, we may consider using the GENERALIZED SPEEDUP [1] [2]. The generalized speedup is defined as

    Generalized Speedup = Parallel_speed/Sequential_speed,

where speed is defined as work/time. Parallel speed is defined as parallel work over parallel time. Sequential speed is defined as sequential work over sequential time. The parallel work could be the scaled work, with a fixed-time or memory-bounded constraint. The sequential work could be the unscaled work and can be measured. All the nice properties of traditional speedup, defined as sequential_time/parallel_time, will remain. When the problem size is fixed, the generalized speedup is the same as the traditional speedup.
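
The definition above can be sketched in a few lines of Python. This is only an illustration of the stated formula (speed = work/time, generalized speedup = parallel speed over sequential speed); the function names and all work and time values are hypothetical.

```python
def speed(work, time_s):
    """Speed as defined in the message: work divided by time."""
    return work / time_s


def generalized_speedup(par_work, par_time, seq_work, seq_time):
    """Parallel speed divided by sequential speed."""
    return speed(par_work, par_time) / speed(seq_work, seq_time)


def traditional_speedup(seq_time, par_time):
    """Sequential time divided by parallel time."""
    return seq_time / par_time


# Hypothetical sequential run: 1e9 flop in 100 s.
seq_work, seq_time = 1.0e9, 100.0

# Fixed problem size: same work on the parallel machine, finished in 12.5 s.
fixed = generalized_speedup(1.0e9, 12.5, seq_work, seq_time)

# Scaled problem: eight times the work, finished in 110 s.
scaled = generalized_speedup(8.0e9, 110.0, seq_work, seq_time)

print(f"fixed-size generalized speedup: {fixed:.3f}")
print(f"traditional speedup:            {traditional_speedup(seq_time, 12.5):.3f}")
print(f"scaled generalized speedup:     {scaled:.3f}")
```

In the fixed-size case the work terms cancel, so the two speedups agree, as the message states. In the scaled case the parallel time alone no longer determines the result, because the work ratio enters as well.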
Also, it has been shown [2] that when the single processor speed is independent of work (no memory influence) the generalized speedup is the same as the traditional speedup. Therefore, all the analytic results based on the traditional speedup can be applied to the generalized speedup, while the superlinear speedups, which are due to memory misses in sequential processing, cannot.

In our practice, the sequential speed is fixed for each application. It is a fraction of the asymptotic speed [3]. The sequential speed shows the single processor power of each machine. The ratio of sequential speeds can be used to compare speedups measured on different machines. The work (flop-count) is unified by using the flop-count of a practical sequential algorithm [4]. If work should be given along with the speedup, as Roger Hockney suggested, the generalized speedup does not increase measurement complexity.

With the above merits, the generalized speedup may be a metric worth considering.

Xian-He Sun

[1]
@INPROCEEDINGS{Gust90,
  author = "J.L. Gustafson",
  title = "Fixed time, Tiered Memory, and Superlinear Speedup",
  booktitle = "Proc. of the Fifth Conf. on Distributed Memory Computers",
  year = "1990",
}

[2]
@ARTICLE{SuGu91,
  AUTHOR = "Xian-He Sun and J.L. Gustafson",
  TITLE = "Toward a Better Parallel Performance Metric",
  JOURNAL = "Parallel Computing",
  VOLUME = "17",
  MONTH = "Dec.",
  YEAR = "1991",
  pages = "1093--1109",
}

[3]
@ARTICLE{Hock91,
  author = "Roger W. Hockney",
  title = "Performance Parameters and Benchmarking of Supercomputers",
  journal = "Parallel Computing",
  volume = "17",
  month = "Dec.",
  year = "1991",
}

[4]
@INPROCEEDINGS{Bail92,
  author = "David H. Bailey",
  title = "Misleading Performance in the Supercomputing Field",
  booktitle = "Proc. Supercomputing '92",
  address = " ",
  year = "1992",
  pages = "155--158",
}

======================= Xian-He Sun ============================
ICASE (Institute for Computer Applications in Science and Engineering)
Mail Stop 132C                    804-864-8018 (O)
NASA Langley Research Center      804-864-6134 (fax)
Hampton, VA 23681-0001            sun@icase.edu
====================================================================

From owner-pbwg-comm@CS.UTK.EDU Thu Jun 10 12:05:52 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA22959; Thu, 10 Jun 93 12:05:52 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA14854; Thu, 10 Jun 93 12:05:47 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 10 Jun 1993 12:05:40 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA14805; Thu, 10 Jun 93 12:05:36 -0400
Via: uk.ac.southampton.ecs; Thu, 10 Jun 1993 16:51:06 +0100
From: R.Hockney@parallel-applications-centre.southampton.ac.uk
Via: calvados.pac.soton.ac.uk (plonk); Thu, 10 Jun 93 16:43:08 BST
Date: Thu, 10 Jun 93 15:50:42 GMT
Message-Id: <3623.9306101550@calvados.pac.soton.ac.uk>
To: pbwg-comm@cs.utk.edu
Subject: Generalised Speedup

COMMENTS ON GENERALISED SPEEDUP
-------------------------------
Roger Hockney

I would like to thank Xian-He Sun for contributing to the Speedup debate, in suggesting that PARKBENCH should consider Generalised Speedup as a Performance metric. The suggestion is interesting, but I do not favour it myself for the following reasons:

(1) If generalised speedup is defined as

                   W(N;p)*T(N;1)
        GS(N;p) =  -------------
                   W(N;1)*T(N;p)

    then to make the number given for GS unambiguous, we must ask that the benchmarker give the values used for W(N;p), W(N;1) and T(N;1), so that we can work out the time it took to run the benchmark = T(N;p). This is impractical and unnecessarily complicated.

(2) In comparing a parallel implementation with a serial one, it is essential that we use the same flop-count or work. That is to say, we must not give the parallel version credit for performing unnecessary redundant operations, i.e. we require W(N;p)=W(N;1). In this case, of course, generalised speedup reduces to ordinary speedup, and my comments about the undesirability of ordinary speedup have already been expressed.

(3) In any case I do not understand what generalised speedup is supposed to be a measure of. Unless this can be satisfactorily answered, we should not use it.

(4) In general, we must first decide what the objective of parallel computing is, and then pick a metric that measures this property. Some possible objectives are to design code and build computers which solve specified problems called the benchmarks:

    (a) in the LEAST ELAPSED WALL-CLOCK TIME.

    (b) that generate the highest hardware Mflop/s, R_H(N;p), defined as the sum of all flop actually performed in all processors (measured possibly by hardware monitors), divided by the elapsed wall-clock time.

    (c) that generate the highest self-speedup on a multiprocessor, defined as SS(p)=T(1)/T(p) with T(1)=time on one processor of the same computer.

    It is obvious to me that we seek objective (a), and that objectives (b) and (c) are both irrelevant distractions (unless one wishes to apply for the Gordon Bell award!). Understanding of parallel benchmarking starts when one realises that (b) and (c) do not necessarily imply (a).

(5) Having decided that the objective is to minimise elapsed wall-clock time, we must choose only metrics that increase in value as this time decreases, and can therefore be used to rank computers in the order we want. Such metrics are:

    (d) Temporal Performance, R_T(N;p)=1/T(N;p)

    (e) Benchmark Performance, R_B(N;p)=F_B(N)/T(N;p) with the nominal flop-count F_B(N) defined as part of the benchmark and never changed. That is, we use the same F_B(N) for all computers that we compare.

    (f) Absolute-Speedup, AS(N;p)=T(N;1)/T(N;p) where T(N;1) is defined in seconds as part of the benchmark definition. It may or may not correspond to the one-processor time on an actual computer. There is a different T(N;1) for each problem size, N, but for any problem size, it is fixed for all time (probably a formula has to be given). That is, we use the same T(N;1) for all computers we compare.

    Note that neither the hardware performance R_H(N;p)=F_H(N)/T(N;p), nor the self-speedup SS(N;p)=T(N;1)/T(N;p) has the desired property, because the numerators may both change as we go from computer to computer.

    In self-speedup, T(N;1) is the time to run the benchmark on one processor of the system being tested. The number therefore changes when one goes to another computer, but we do maintain the usually assumed properties of speedup (see last note). In contrast, of course, with absolute-speedup T(N;1) is published as part of the benchmark description, and the same number is used in calculating the absolute-speedup on all computers. Absolute-Speedup can therefore be used to order computers in the way we want, BUT we have lost the usually expected properties of speedup (see last note).

(6) I agree with Dave Schneider, and have never believed that one can expect to quantify the speed of a computer with one number. In fact I am against any form of averaging, and think that the raw performance data should be examined as a function of at least two variables, namely problem size and number of processors. Hence my desire (need) to see sufficient performance points measured to be able to plot graphs (hopefully in a fairly standard format) to show this multidimensional performance surface.

(7) TO CONCLUDE: I don't like generalised speedup also because it complicates an already difficult picture.
    Although I have defined Absolute-Speedup as a possible metric with a sharpened-up definition, I do not think we should use it because it is bound to be confused with self-speedup, which I know we must not allow as a metric, because it is just plain WRONG. My position is that we should stick very much to my original draft (disallowing Speedup or Efficiency). Although we might leave in the definition of Hardware Performance, as a definition, the text should make clear that it is not an appropriate metric to use to compare the performance of benchmarks.

(8) How do you like my definitions of self-speedup and absolute-speedup to distinguish two very different metrics which seem to get confused? I am sure that many people think of self-speedup when asked what speedup means, but then think they can use it as an absolute-speedup when they come to talk about performance comparisons. Not so.

Roger Hockney

From owner-pbwg-comm@CS.UTK.EDU Thu Jun 10 13:57:02 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA23272; Thu, 10 Jun 93 13:57:02 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA23233; Thu, 10 Jun 93 13:56:56 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 10 Jun 1993 13:56:56 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from wk49.nas.nasa.gov by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA23225; Thu, 10 Jun 93 13:56:54 -0400
Received: by wk49.nas.nasa.gov (5.67-NAS-1.1/5.67-NAS-1.1(SGI)) id AA24296; Thu, 10 Jun 93 10:56:50 -0700
Date: Thu, 10 Jun 93 10:56:50 -0700
From: dbailey@nas.nasa.gov (David H. Bailey)
Message-Id: <9306101756.AA24296@wk49.nas.nasa.gov>
To: pbwg-comm@cs.utk.edu
Subject: DHB's two-bits' worth on speedup figures

Could I add my 25 cents' worth to the great speedup debate?

Although some of us may dislike speedup figures, I feel that like Mflop/s they are here to stay.
It does not seem realistic to "prohibit" researchers from using this statistic. Also, I feel that inventing a whole new statistic, "generalized speedup" (or should we say "generalised speedup" for our British colleagues?), is not a wise path to follow unless we have very persuasive reasons. It would only add more confusion to the field.

In my view, the concern frequently expressed in this forum about speedup figures being used to compare systems is misplaced. In my voluminous file of scientific papers with abusive performance practices (relax: I don't believe any of you have a paper in my file), and in my reading of conference proceedings and journal articles during the last few years, I have NEVER seen an instance where someone used speedup figures to claim that one system is faster than another. The only instance remotely close to this is when a scientist, based on two separate speedup analyses, asserts that one system has more nearly linear speedup characteristics than another. What is wrong with this, provided it is backed up with proper data, and the author makes it clear that he or she is not comparing absolute performance?

Some in this forum have suggested that the single processor timing must further be based on the best practical serial algorithm, even if other algorithms are used for the parallel timing. While this suggestion has some merit, I personally feel that it is of much higher priority for us to emphasize such strict procedures for computing Mflop/s rates, which are frequently used to compare different systems, than for speedup figures, which are not used to compare different systems (in absolute terms). However, I could be talked into making this recommendation for speedup figures also, if others agree.

Given that speedup figures, properly formulated, are a legitimate performance statistic for studying the linearity characteristics of a single parallel system or application, I believe that all we need to do is to establish some general guidelines so that their usage is honest and scientific. In the proposed guidelines that I presented in a recent paper and in my talk at Supercomputing '92, I included the following item regarding speedup figures:

    If speedup figures are presented, the single processor rate should be based on a reasonably well tuned program without multiprocessing constructs. If the problem size is scaled up with the number of processors, then the results should be clearly cited as ``scaled speedup'' figures, and details should be given explaining how the problem was scaled up in size.

In the light of the discussion in this forum, it may be useful to add a phrase to this statement to ensure that speedup figures are not abused as an absolute performance comparison statistic. Other minor adjustments might be in order. But in general I feel it would be fruitless to include lengthy and highly technical directives; otherwise we may introduce more confusion than we alleviate.

Comments?

David H. Bailey
dbailey@nas.nasa.gov

From owner-pbwg-comm@CS.UTK.EDU Thu Jun 10 14:58:52 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA23536; Thu, 10 Jun 93 14:58:52 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA27397; Thu, 10 Jun 93 14:56:30 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 10 Jun 1993 14:56:29 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from sp2.csrd.uiuc.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA27384; Thu, 10 Jun 93 14:56:25 -0400
Received: from sp94.csrd.uiuc.edu.csrd.uiuc.edu (sp94.csrd.uiuc.edu) by sp2.csrd.uiuc.edu with SMTP id AA11316 (5.67a/IDA-1.5); Thu, 10 Jun 1993 13:56:22 -0500
Received: by sp94.csrd.uiuc.edu.csrd.uiuc.edu (4.1/SMI-4.1) id AA04124; Thu, 10 Jun 93 13:56:21 CDT
Date: Thu, 10 Jun 93 13:56:21 CDT
From: schneid@csrd.uiuc.edu (David John Schneider)
Message-Id: <9306101856.AA04124@sp94.csrd.uiuc.edu.csrd.uiuc.edu>
To: R.Hockney@pac.soton.ac.uk
Cc: pbwg-comm@cs.utk.edu, schneid@csrd.uiuc.edu, perfect.steering@csrd.uiuc.edu
In-Reply-To: <3623.9306101550@calvados.pac.soton.ac.uk> (R.Hockney@pac.soton.ac.uk)
Subject: Re: Generalised Speedup

Roger,

The basic reason that I finally decided to advocate reporting only elapsed times is that, despite the fact that the R_B metric formally carries the units of Mflop/s, it does not correspond to what one would actually compute if one counted floating point operations and measured elapsed time on all machines. The problem is that the definition of F_B(N) is arbitrary. As such, R_B is not a measure of computational rate; it is simply an arbitrarily scaled time-based measure. Because the relative ordering can be switched by altering the assigned F_B values, whether the R_B of code A on machine X is greater or less than the R_B of code B on machine Y becomes a matter of definition, not measurement.
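
The point about arbitrariness can be made concrete with a small Python sketch. It assumes R_B = F_B/T as defined earlier in the thread; the two elapsed times and both sets of nominal flop-counts are invented purely for illustration.

```python
def benchmark_performance(nominal_flop, elapsed_s):
    """R_B in Mflop/s: assigned nominal flop-count over elapsed time."""
    return nominal_flop / elapsed_s / 1.0e6


# Hypothetical measured times: code A on machine X, code B on machine Y.
TIME_A_ON_X = 10.0
TIME_B_ON_Y = 20.0


def higher_r_b(f_b_a, f_b_b):
    """Report which (code, machine) pair has the larger R_B for the
    given assigned flop-counts. The measured times never change."""
    r_a = benchmark_performance(f_b_a, TIME_A_ON_X)
    r_b = benchmark_performance(f_b_b, TIME_B_ON_Y)
    return "A on X" if r_a > r_b else "B on Y"


# Two different, equally arbitrary, assignments of nominal flop-counts.
first = higher_r_b(1.0e9, 1.0e9)   # 100 Mflop/s versus 50 Mflop/s
second = higher_r_b(1.0e9, 4.0e9)  # 100 Mflop/s versus 200 Mflop/s

print(first, "|", second)
```

The measurements are identical in both calls; only the assigned F_B for code B differs, and the cross-code ordering reverses. Comparisons of the same code across machines are unaffected, since F_B then cancels.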
Therefore, I do not think that R_B is an acceptable long-term solution to the problem of needing an unbiased metric to compare the performance of different codes on different machines.

In your response to my previous note, you asked for more information regarding how to implement an axiomatically based information theory approach. I will try to sketch my current ideas in this area here. However, I would like to make it clear that I am not advocating that Perfect, PARKBENCH, or anyone else stop what they are doing to pursue this line of inquiry. Instead, I am simply trying to understand the cause of the confusion regarding metrics, and to see if I can find satisfactory long-term solutions. In the mean time, it is important to continue to muddle along with possibly incomplete or even inconsistent definitions. As far as I can tell, the R_B measure that you have advocated is the least biased of the current candidates for metrics for use in comparing the performance of different codes on different machines.

So, on with the polemic...

Most proposals to date for F_B(N), such as those used by Linpack and previously by Perfect, as well as the current PARKBENCH effort, remove this ambiguity by fiat. As such, they all suffer from the ills described above. To me, the challenge is to develop methods to derive measures of "computational work" such as F_B(N) from first principles. This is where I believe an axiomatic approach such as that used in information theory can be useful. All computer instructions ultimately reduce to a common currency, Boolean logic. The value and generality of information theory is that it is formulated in terms of the "alphabet" in which the message is presented, not the meaning assigned to it by the recipient. All bits and bit-level operations should be treated equally.

One way to remove the ambiguity would be to define F_B(N) as the minimum floating point operation count needed for each value of N.
In practice, it is known that it is possible to make tradeoffs which increase the number of integer/logical operations to reduce the number of floating point operations. The use of indirect addressing for sparse matrix problems is a classic example of this sort of tradeoff. The fact that the value of these tradeoffs, as measured by reduction in elapsed time, is related to architectural issues implies that a satisfactory definition of computational work must treat integer, logical and floating point operations on the same footing. However, some of the basic operations are clearly more difficult to implement than others, thus it is unfair to assign equal "difficulty" measures to each operation.

I think one must eventually proceed by computing the complexities of a given set of atomic operations using well defined input and output datatypes according to a well defined set of rules. If complexities of the atomic operations are known, perhaps it will be possible to find the minimum "work" rather than the minimum floating point operation count.

Proposals for determining the complexity of atomic operations were made by Hellerman and Rozwadowski in the early 1970's. Hellerman's ideas met with sharp criticism, especially by Welch. Some of Welch's criticisms are valid in that they are related to ambiguities in Hellerman's definitions. I feel that Hellerman addressed most of the valid criticisms in his response to Welch's letter. With a little(?) more work, I think that the Hellerman-Rozwadowski scheme can be made into a practical method. This scheme simply recognizes the fact that all computer operations are implemented at the lowest level by Boolean logic. At this level the distinction between the interpretations of the bits is irrelevant. A bit is a bit regardless of whether it is part of a character variable or a double precision floating point variable. The complicated atomic operations on larger datatypes can be summarized in tabular form.
The Hellerman-Rozwadowski definition of work is the information theoretic "entropy" of these tables. Both Hellerman and Rozwadowski based their table construction methods on traditional Boolean logic operations. Alternative proposals based on "reversible" elements have been put forward by Charles Bennett, Richard Feynman, and others. Choosing between these alternatives amounts to selecting the "alphabet" used to describe the Boolean operations which are used to construct higher-level operations. As such, they affect the final computed values for the "difficulty" of each operation. However, the need for a choice of "alphabets" is clear. Moreover, the choice which is actually made can be clearly and concisely stated.

Because of the widespread adoption of the IEEE floating point standard, which defines both the atomic operations and the datatypes, I suggest that this would be a reasonable place to start. In comparison, the integer and logical datatypes and associated operations are easy. Hellerman has already carried out a detailed analysis of the integer and logical operations defined by the IBM S/360 architecture. I think that one can construct architecturally neutral definitions.

I apologize for not being able to include bibliographic references in this note since I do not have the papers with me at the moment. I will try to remember to post them separately.

Dave

========================================================================
Errors-To: owner-pbwg-comm@CS.UTK.EDU
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 10 Jun 1993 12:05:40 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
From: R.Hockney@pac.soton.ac.uk
Date: Thu, 10 Jun 93 15:50:42 GMT

COMMENTS ON GENERALISED SPEEDUP
-------------------------------
Roger Hockney

I would like to thank Xian-He Sun for contributing to the Speedup debate, in suggesting that PARKBENCH should consider Generalised Speedup as a Performance metric.
The suggestion is interesting, but I do not favour it myself for the following reasons:

(1) If generalised speedup is defined as

              W(N;p)*T(N;1)
    GS(N;p) = -------------
              W(N;1)*T(N;p)

then to make the number given for GS unambiguous, we must ask that the benchmarker give the values used for W(N;p), W(N;1) and T(N;1), so that we can work out the time it took to run the benchmark = T(N;p). This is impractical and unnecessarily complicated.

(2) In comparing a parallel implementation with a serial one, it is essential that we use the same flop-count or work. That is to say, not give the parallel version credit for performing unnecessary redundant operations, i.e. we require W(N;p)=W(N;1). In this case, of course, generalised speedup reduces to ordinary speedup, and my comments about the undesirability of ordinary speedup have already been expressed.

(3) In any case I do not understand what generalised speedup is supposed to be a measure of. Unless this can be satisfactorily answered, we should not use it.

(4) In general, we must first decide what the objective of parallel computing is, and then pick a metric that measures this property. Some possible objectives are to design code and build computers which solve specified problems called the benchmarks:

(a) in the LEAST ELAPSED WALL-CLOCK TIME.

(b) that generate the highest hardware Mflop/s, R_H(N;p), defined as the sum of all flop actually performed in all processors (measured possibly by hardware monitors), divided by the elapsed wall-clock time.

(c) that generate the highest self-speedup on a multiprocessor, defined as SS(p)=T(1)/T(p) with T(1)=time on one processor of the same computer.

It is obvious to me that we seek objective (a), and that objectives (b) and (c) are both irrelevant distractions (unless one wishes to apply for the Gordon Bell award!). Understanding of parallel benchmarking starts when one realises that (b) and (c) do not necessarily imply (a).
(5) Having decided that the objective is to minimise elapsed wall-clock time, we must choose only metrics that increase in value as this time decreases, and can therefore be used to rank computers in the order we want. Such metrics are:

(d) Temporal Performance, R_T(N;p)=1/T(N;p)

(e) Benchmark Performance, R_B(N;p)=F_B(N)/T(N;p) with the nominal flop-count F_B(N) defined as part of the benchmark and never changed. That is we use the same F_B(N) for all computers that we compare.

(f) Absolute-Speedup, AS(N;p)=T(N;1)/T(N;p) where T(N;1) is defined in seconds as part of the benchmark definition. It may or may not correspond to the one-processor time on an actual computer. There is a different T(N;1) for each problem size, N, but for any problem size, it is fixed for all time (probably a formula has to be given). That is we use the same T(N;1) for all computers we compare.

Note that neither the hardware performance R_H(N;p)=F_H(N)/T(N;p), nor the self-speedup SS(N;p)=T(N;1)/T(N;p) have the desired property, because the numerators may both change as we go from computer to computer. In self-speedup, T(N;1) is the time to run the benchmark on one processor of the system being tested. The number therefore changes when one goes to another computer, but we do maintain the usually assumed properties of speedup (see last note). In contrast, of course, with absolute-speedup T(N;1) is published as part of the benchmark description, and the same number is used in calculating the absolute-speedup on all computers. Absolute-Speedup can therefore be used to order computers in the way we want, BUT we have lost the usually expected properties of speedup (see last note).

(6) I agree with Dave Schneider, and have never believed that one can expect to quantify the speed of a computer with one number. In fact I am against any form of averaging, and think that the raw performance data should be examined as a function of at least two variables, namely problem size and number of processors.
Hence my desire (need) to see sufficient performance points measured to be able to plot graphs (hopefully in a fairly standard format) to show this multidimensional performance surface.

(7) TO CONCLUDE: I don't like generalised speedup also because it complicates an already difficult picture. Although I have defined Absolute-Speedup as a possible metric with a sharpened-up definition, I do not think we should use it because it is bound to be confused with self-speedup, which I know we must not allow as a metric, because it is just plain WRONG. My position is that we should stick very much to my original draft (disallowing Speedup or Efficiency). Although we might leave in the definition of Hardware Performance, as a definition, the text should make clear that it is not an appropriate metric to use to compare the performance of benchmarks.

(8) How do you like my definitions of self-speedup and absolute-speedup to distinguish two very different metrics which seem to get confused? I am sure that many people think of self-speedup when asked what speedup means, but then think they can use it as an absolute-speedup when they come to talk about performance comparisons. Not so.
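As an illustration of points (5) and (8) above, here is a minimal Python sketch. All timings, the value used for F_B(N), the reference time T(N;1) and the machine names "X" and "Y" are hypothetical, chosen only to show the definitions at work; they are not measurements from any real system. The sketch ranks two machines by R_T, R_B and absolute-speedup (which all agree with the ordering by elapsed time) and by self-speedup (which, for these made-up numbers, does not).

```python
# All numbers below are hypothetical, chosen only to illustrate the definitions.
F_B = 2.0e9     # nominal flop-count F_B(N), fixed by the benchmark
T_REF = 100.0   # T(N;1) in seconds, fixed by the benchmark (absolute speedup)

# machine -> (measured one-processor time, measured p-processor time), seconds
timings = {
    "X": (400.0, 10.0),  # slow single node, scales well
    "Y": (50.0, 5.0),    # fast single node, scales less well
}

def temporal_performance(t_p):
    """R_T(N;p) = 1 / T(N;p)."""
    return 1.0 / t_p

def benchmark_performance(t_p):
    """R_B(N;p) = F_B(N) / T(N;p), reported in Mflop/s."""
    return F_B / t_p / 1.0e6

def absolute_speedup(t_p):
    """AS(N;p) = T(N;1) / T(N;p) with T(N;1) fixed by the benchmark."""
    return T_REF / t_p

def self_speedup(t_1, t_p):
    """SS(N;p) = T(N;1) / T(N;p) with T(N;1) measured on the same machine."""
    return t_1 / t_p

def ranking(score):
    """Machines ordered from highest to lowest score."""
    return sorted(timings, key=score, reverse=True)

fastest_first = sorted(timings, key=lambda m: timings[m][1])
rank_rt = ranking(lambda m: temporal_performance(timings[m][1]))
rank_rb = ranking(lambda m: benchmark_performance(timings[m][1]))
rank_as = ranking(lambda m: absolute_speedup(timings[m][1]))
rank_ss = ranking(lambda m: self_speedup(*timings[m]))

print("least elapsed time first:", fastest_first)
print("by R_T:", rank_rt, " by R_B:", rank_rb, " by AS:", rank_as)
print("by self-speedup:", rank_ss)
```

With these invented numbers machine Y finishes in half the time of X, yet X shows the larger self-speedup because its one-processor time is so poor.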
Roger Hockney From owner-pbwg-comm@CS.UTK.EDU Thu Jun 10 15:14:11 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA23613; Thu, 10 Jun 93 15:14:11 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA28488; Thu, 10 Jun 93 15:11:48 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 10 Jun 1993 15:11:47 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sp2.csrd.uiuc.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA28480; Thu, 10 Jun 93 15:11:46 -0400 Received: from sp94.csrd.uiuc.edu.csrd.uiuc.edu (sp94.csrd.uiuc.edu) by sp2.csrd.uiuc.edu with SMTP id AA11590 (5.67a/IDA-1.5 for ); Thu, 10 Jun 1993 14:11:47 -0500 Received: by sp94.csrd.uiuc.edu.csrd.uiuc.edu (4.1/SMI-4.1) id AA04375; Thu, 10 Jun 93 14:11:45 CDT Date: Thu, 10 Jun 93 14:11:45 CDT From: schneid@csrd.uiuc.edu (David John Schneider) Message-Id: <9306101911.AA04375@sp94.csrd.uiuc.edu.csrd.uiuc.edu> To: dbailey@nas.nasa.gov Cc: pbwg-comm@cs.utk.edu In-Reply-To: <9306101756.AA24296@wk49.nas.nasa.gov> (dbailey@nas.nasa.gov) Subject: Re: DHB's two-bits' worth on speedup figures > Date: Thu, 10 Jun 93 10:56:50 -0700 > From: dbailey@nas.nasa.gov (David H. Bailey) > > Could I add my 25 cents' worth to the great speedup debate? > > Although some of us may dislike speedup figures, I feel that like > Mflop/s they are here to stay. It does not seem realistic to > "prohibit" researchers from using this statistic. Also, I feel that > inventing a whole new statistic, "generalized speedup" (or should we > say "generalised speedup" for our British colleagues?), is not a wise > path to follow unless we have very persuasive reasons. It would only > add more confusion to the field. > > In my view, the concern frequently expressed in this forum about > speedup figures being used to compare systems is misplaced. 
In my > voluminous file of scientific papers with abusive performance > practices (relax: I don't believe any of you have a paper in my file), > and in my reading of conference proceedings and journal articles > during the last few years, I have NEVER seen an instance where someone > used speedup figures to claim that one system is faster than another. Unfortunately, I believe that a new low has been reached. It is my understanding that several million dollars has recently been spent for a new machine at an NSF National Supercomputer Center, justified largely on a comparison of speedups. It is scary to think that decision makers, not just the masses, are so easily deceived. > > [stuff removed] > > Comments? > > David H. Bailey > dbailey@nas.nasa.gov > Dave Schneider University of Illinois at Urbana-Champaign Center for Supercomputing Research and Development 367 Computer and Systems Research Laboratory 1308 W. Main Street Urbana, IL 61801-2307 MC-264 phone : (217) 244-0055 fax : (217) 244-1351 E-mail: schneid@csrd.uiuc.edu From owner-pbwg-comm@CS.UTK.EDU Thu Jun 10 15:21:11 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-UTK) id AA23654; Thu, 10 Jun 93 15:21:11 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA28992; Thu, 10 Jun 93 15:20:11 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 10 Jun 1993 15:20:10 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from wk49.nas.nasa.gov by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA28974; Thu, 10 Jun 93 15:20:08 -0400 Received: by wk49.nas.nasa.gov (5.67-NAS-1.1/5.67-NAS-1.1(SGI)) id AA24612; Thu, 10 Jun 93 12:20:05 -0700 Date: Thu, 10 Jun 93 12:20:05 -0700 From: dbailey@nas.nasa.gov (David H. Bailey) Message-Id: <9306101920.AA24612@wk49.nas.nasa.gov> To: pbwg-comm@cs.utk.edu Subject: Disheartening disclosure of speedup abuse Schneider reports: Unfortunately, I believe that a new low has been reached. 
It is my understanding that several million dollars has recently been spent for a new machine at an NSF National Supercomputer Center, justified largely on a comparison of speedups. It is scary to think that decision makers, not just the masses, are so easily deceived.

I had not heard that report. This is disheartening indeed. Then it is clear that we have to make some strong comments to ensure that speedup figures are not abused to compare systems. I still feel, however, that it is best to reform the present speedup statistic rather than to try to invent a new one.

DHB

From owner-pbwg-comm@CS.UTK.EDU Fri Jun 11 10:03:15 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA01287; Fri, 11 Jun 93 10:03:15 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA15291; Fri, 11 Jun 93 10:03:05 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 11 Jun 1993 10:03:04 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from elc04.icase.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA15283; Fri, 11 Jun 93 10:03:02 -0400
Received: by elc04 (5.65.1/lanleaf2.4.9) id AA10391; Fri, 11 Jun 93 10:03:00 -0400
Message-Id: <9306111403.AA10391@elc04>
Date: Fri, 11 Jun 93 10:03:00 -0400
From: Sun Xian-He
To: pbwg-comm@cs.utk.edu
Subject: COMMENTS ON GENERALISED SPEEDUP

I would like to thank Roger Hockney and David Bailey for their comments on the generalized speedup. I agree that there is a need to explain the definition of generalized speedup more clearly. As an effort to do so, I will answer Roger Hockney's comments one by one in the following. David Bailey's question will be answered along the way.
> COMMENTS ON GENERALISED SPEEDUP
> -------------------------------
> Roger Hockney

> (1) If generalised speedup is defined as
>               W(N;p)*T(N;1)
>     GS(N;p) = -------------
>               W(N;1)*T(N;p)
>then to make the number given for GS unambiguous, we must ask that the
>benchmarker give the values used for W(N;p), W(N;1) and T(N;1), so that
>we can work out the time it took to run the benchmark = T(N;p). This is
>impractical and unnecessarily complicated.

This is not true. In general, T(N;p) is independent of W(N;1) and T(N;1). The only complication added is the computation of the asymptotic speed. However, the asymptotic speed is important in itself (I referred to Roger Hockney's work on this point).

> (2) In comparing a parallel implementation with a serial one, it is
>essential that we use the same flop-count or work. That is to say, not
>give the parallel version credit for performing unnecessary redundant
>operations, i.e. we require W(N;p)=W(N,1). In this case, of course,
>generalised speedup reduces to ordinary speedup, and my comments about
>the undesirabilty of ordinary speedup have already been expressed.

In generalized speedup, the flop-count or work is based on a practical sequential algorithm (I referred to David Bailey's work on this point). It will not give credit for performing unnecessary redundant operations.

> (3) In any case I do not understand what generalised speedup is
>supposed to be a measure of? Unless this can be satisfactorily
>answered, we should not use it.

Generalized speedup measures the speed-up. In contrast, the traditional speedup gives the ratio of time reduction. If we ask the question in a different way, namely how the measurement of generalized speedup is related to the measurement of traditional speedup, the answer is: the generalized speedup is a REFORM of traditional speedup. The generalized speedup is a traditional speedup in which the sequential time T(N;1) is computed by using the asymptotic speed.
(With a few lines I can prove this, but I would like to leave it as a quiz.) I think most of us agree that we need a good way to compute T(N;1). Our application has achieved superlinear speedup (traditional) on KSR and I am sure we will get superlinear speedup (traditional) on Paragon also. In fact, with traditional speedup, a big class of applications will achieve superlinear speedup on machines supporting virtual memory. This is misleading. Generalized speedup is a way to avoid it.

Xian-He Sun

From owner-pbwg-comm@CS.UTK.EDU Fri Jun 11 15:43:40 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA03518; Fri, 11 Jun 93 15:43:40 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA07627; Fri, 11 Jun 93 15:43:36 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 11 Jun 1993 15:43:29 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA07618; Fri, 11 Jun 93 15:43:26 -0400
Via: uk.ac.southampton.ecs; Fri, 11 Jun 1993 20:43:01 +0100
From: R.Hockney@parallel-applications-centre.southampton.ac.uk
Via: calvados.pac.soton.ac.uk (plonk); Fri, 11 Jun 93 20:34:56 BST
Date: Fri, 11 Jun 93 19:42:31 GMT
Message-Id: <7257.9306111942@calvados.pac.soton.ac.uk>
To: pbwg-comm@cs.utk.edu
Subject: Reply to Dave Schneider on Metrics

Reply to Dave Schneider on F_B and R_B
--------------------------------------
from Roger Hockney

I agree with almost everything you said in your last note, from which I conclude:

(a) There is, as yet, no axiomatic approach that we can use, therefore we are left to consider and improve the draft metrics that I presented to the last meeting (Chapter 1).

(b) R_B(N;p) with sensibly defined F_B(N) is probably the least biased current metric (at least it is unambiguous and acceptable). I claim below that it is also the most convenient and useful.
You rightly point out that R_B is no more than a scaled time measurement. That is precisely why it is defined in the way that it is, with F_B(N) as a nominal value (or function) that is the same for all computers and is unchangeable by fiat. This is the way it has to be if R_B is to have the properties we require. Thus F_B(N) should be thought of as inscribed in stone and handed down by the god of benchmarks. R_B is not intended to be a measure of the flop-rate of the hardware, for which we have defined another metric R_H, the hardware performance (incidentally a metric in which we should not be interested). Rather R_B is an inverse-time measure scaled for our convenience in a particular way (see below), to be more useful than the straight inverse-time metric which we call Temporal Performance R_T=1/T, but find difficult to use and compare across benchmarks. However, F_B(N) is also intended to approximate to the real hardware flop-count, F_H(N) of a good implementation of the one processor sequential code, so we expect the benchmark and hardware Mflop/s to be similar for p=1 R_B(N;1) \approx R_H(N;1)=F_H(N)/T(N;1) For p>1, F_H(N) may be >> F_B(N) because F_H counts redundant operations which are common in parallel code, whereas F_B correctly does not. Redundant operations are any operations that are repeated in each processor of a parallel implementation. Thus Benchmark and Hardware Mflop/s may differ widely in a parallel code. The important point is that we should not be interested in R_H, i.e. in generating hardware Mflop/s. The aim of the programmer should be to maximise Benchmark Mflop/s, which by its careful definition corresponds to minimising elapsed wall-clock execution time, T(N;p). 
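The difference between Benchmark and Hardware Mflop/s described above can be shown with a small Python sketch. The two parallel codes, their flop counts and their timings are invented for illustration only; the function names are likewise just labels for the formulas R_B = F_B/T and R_H = F_H/T.

```python
# Hypothetical figures for two parallel implementations of one benchmark,
# run on the same machine with the same p. None of these are measurements.
F_B = 1.0e9  # nominal flop-count F_B(N), identical for every implementation

# code -> (hardware flop actually executed F_H, elapsed wall-clock time T in s)
codes = {
    "lean":      (1.0e9, 2.0),  # no redundant arithmetic
    "redundant": (3.0e9, 3.0),  # repeats work in every processor
}

def benchmark_mflops(t):
    """R_B = F_B / T in Mflop/s; the numerator never changes."""
    return F_B / t / 1.0e6

def hardware_mflops(f_h, t):
    """R_H = F_H / T in Mflop/s; counts redundant operations too."""
    return f_h / t / 1.0e6

r_b = {name: benchmark_mflops(t) for name, (f_h, t) in codes.items()}
r_h = {name: hardware_mflops(f_h, t) for name, (f_h, t) in codes.items()}

for name in codes:
    print(f"{name:10s} T={codes[name][1]:.1f}s  R_B={r_b[name]:.1f}  R_H={r_h[name]:.1f}")
```

In this made-up case the "redundant" code generates twice the hardware Mflop/s of the "lean" code, yet takes longer to run; only R_B follows the elapsed time.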
To make this distinction clearer the two performance metrics have, strictly speaking, different units, and should be written differently: Benchmark Performance in units Mflop/s(benchmark name) Hardware Performance in units Mflop/s(computer name) So that, whereas I agree that R_B is a scaled time measurement, I do not agree that it is an arbitrary scaling (like SPEC use with the VAX11/780 reference timings). The scaling is closely related to the hardware flop-count of sequential code, and therefore a measure, albeit somewhat imprecise, of the amount of work required to solve the problem using current algorithms. The property that higher R_B means lower execution time, does not depend on this relation with the hardware flop-count being close (it only depends on F_B(N) being kept the same across all computers), but our Mflop/s numbers would begin to look rather silly (arbitrary) if it was not fairly close. At some stage I agree that we should adopt a better measure of work than Mflop, that might come from the research you refer to, but for this year's report, I can see no alternative to Mflop. Further, we are used to the measure, and performance values derived from it: Mflop/s. Concerning the ordering: For a given N, if the R_B(N;p) of code A on computer X is greater than the R_B(N;p) of code B on computer Y, then code A runs in less time on X than B does on Y. That is the ordering given by time alone is the same as that given by R_B. Going to a different problem size, N2 say, is rather like going to a different benchmark, and it is possible that code B will execute on Y in less time than A does on X (i.e. the ordering may change), but this will be a real difference in behaviour of the codes which is reflected by R_B(N2;p) of B on Y becoming greater than R_B(N2;p) of A on X. Such a change of ordering in R_B is real and not a matter of definition. Everything is as it should be. 
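The ordering argument above can be checked numerically. The following Python sketch uses invented timings for three code/computer pairs at one fixed problem size N, and a handful of arbitrary positive values standing in for F_B(N); the labels and numbers are assumptions made purely for illustration. Because F_B(N) is the same for every computer, dividing it by T cannot change the order given by time alone.

```python
# Hypothetical elapsed times T(N;p) in seconds at one fixed problem size N.
times = {"A on X": 12.0, "B on Y": 15.0, "C on Z": 9.0}

def ranking_by_rb(times, f_b):
    """Order entries by R_B = F_B / T, highest first, for a shared F_B."""
    return sorted(times, key=lambda name: f_b / times[name], reverse=True)

# Ordering by time alone, least elapsed time first.
by_time = sorted(times, key=times.get)

# Arbitrary positive stand-ins for the nominal flop-count F_B(N).
candidate_flop_counts = (1.0, 2.0e9, 6.7e11)
rankings = [ranking_by_rb(times, f_b) for f_b in candidate_flop_counts]

print("by time:", by_time)
for f_b, order in zip(candidate_flop_counts, rankings):
    print(f"F_B = {f_b:g}: {order}")
```

The same check would not hold if each entry were allowed its own numerator, which is the situation with hardware Mflop/s and self-speedup.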
I cannot think of an example where changing the function F_B(N) can change the ordering of the benchmarked computers. Can you give me an example, please? Remember that F_B(N) is the same function for all computers; it is a property of the problem/benchmark, not the computer.

A misunderstanding may arise, however, if one compares the R_B(N;p) of one problem size with the R_B(N2;p) of another problem size. It is not correct to conclude that the higher R_B means the least execution time, because the numerators F_B(N) and F_B(N2) are different. Generally speaking the larger problem will generate more nominal benchmark Mflop/s(benchmark), and also real hardware Mflop/s, but because it is a bigger problem it will take much longer to run. So higher R_B, in this comparison, may mean longer execution time, not shorter. But we should not be surprised that a bigger problem takes longer to run, even if it generates more Mflop/s (real and/or nominal). To compare execution time across benchmarks or across different sizes of the same benchmark, we must compare the Temporal Performance, R_T=R_B/F_B.

Because of the last paragraph, some have questioned whether one should abandon R_B and use the Temporal Performance, R_T(N;p)=1/T(N;p), alone. Probably one should quote both, but it is really a matter of convenience. Values of R_B will tend to cluster around a relatively confined region of the (R_B vs p) performance graphs, because they do approximate to the available hardware Mflop/s of the computer. Thus a set of results for a range of sizes of a range of benchmarks for a given computer tend to cluster together, and should give a fair idea of the range of possible performance, and this range should not be too large or scattered. The tendency will be for all benchmarks and all problem sizes to cluster near one value for p=1, although there will be some scatter due to vector length effects if vector processors are used.
As p increases, the curves for different problem sizes and benchmarks will diverge, but since they are tied together at one end they can't get too far apart. Computers of different types may give recognisable patterns when performance results are displayed in this way in the (R_B vs p) plane. This may help us classify performance behaviour by describing the region of the plane occupied by the benchmark results, but we need to see a lot of data first. It is also why I believe that an interactive graphical interface should be considered as an essential part (not just a sexy add-on) of the Parkbench performance database. In sharp contrast, values of time alone, and therefore R_T, will vary much more widely over orders of magnitude between small and large problems, and from one benchmark to another. It will be much more difficult to see any pattern of performance in the (R_T vs p) plane or try to draw conclusions. In practice using R_T as the display metric will be much more difficult and less useful. Roger Hockney From owner-pbwg-comm@CS.UTK.EDU Tue Jun 15 11:38:46 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA05908; Tue, 15 Jun 93 11:38:46 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA15964; Tue, 15 Jun 93 11:37:01 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 15 Jun 1993 11:36:54 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from ben.uknet.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA15956; Tue, 15 Jun 93 11:36:37 -0400 Message-Id: <9306151536.AA15956@CS.UTK.EDU> Received: from newton.npl.co.uk by ben.uknet.ac.uk via PSS with NIFTP (PP) id ; Tue, 15 Jun 1993 16:36:27 +0100 Date: Tue, 15 Jun 93 16:36 GMT From: CBF@newton.npl.co.uk To: PBWG-COMM Subject: speedup Re: Speedup. I would like to voice my agreement with David Bailey's comment: >>Although some of us may dislike speedup figures, I feel that like >>Mflop/s they are here to stay. 
It does not seem realistic to >>"prohibit" researchers from using this statistic. Also, I feel that >>inventing a whole new statistic, "generalized speedup" (or should we >>say "generalised speedup" for our British colleagues?), is not a wise >>path to follow unless we have very persuasive reasons. It would only >>add more confusion to the field. >> >>Given that speedup figures, properly formulated, are a legitimate >>performance statistic for studying the linearity characteristics of a >>single parallel system or application, I believe that all we need to >>do is to establish some general guidelines so that its usage is honest >>and scientific. Although we cannot stop anyone using any metric to justify their comparisons, we can at least add to the PARKBENCH report a section detailing what speedup is. ie A software metric measuring how well the chosen algorithms take advantage of parallel hardware and NOT a benchmarking metric. As the great speedup debate appears to have come up against some issues of R_b that have been discussed in the Methodology sub committee, I would like to recap on those messages and a couple of further points. Here are the main points of the original message: >>With reference to 2.4.3 Benchmark Performance. If we are comparing the same >>benchmark on different machines do we need Fb(N) in the equations? >> >>For instance R1b(N;p) > R2b(N;p) where R1b is the rate of machine 1 >>and R2b is the rate of machine 2. >> >>This implies Fb(N)/T1(N;p) > Fb(N)/T(N;p) as Fb(N) does not depend on >>the machine but only on the defining serial code. >> >>This implies 1/T1(N;p) > 1/T(N;p) (or T(N;p) > T1(N;p) ). >> >>In other words we get a straight timing result anyway. >> >>If we are not comparing the same benchmark is the flop a suitable >>representation of the work that a benchmark does? Is it not possible that >>memory accessing and I/O usage could outstrip floating point operations in >>terms of importance? 
Here is Roger Hockney's reply: >> Do we need F_B(N) >>The answer is NO we don't, in which case the only metric to use is the >>Temporal Performance, which is the same as comparing executions times, >>but the result is expressed more naturally as a performance. The definition >>of Benchmark performance was made in order to keep the units of Mflop/s >>which people widely use but ensure that the highest R_B implies the least >>execution time. This can only be done if F_B(N) is kept the same for >>all implementations and treated as a nominal figure. If you want real >>Mflop/s you use R_H the hardware performance. Of course F_B is only >>appropriate for problems dominated by floating-point arithmetic, for >>other problems the Benchmark writer would be expected to define an >>appropriate measure of work (I/O references e.g.) or else simply fall >>back on Temporal performance and time alone. >> The other reason for R_B >>is to allow some (albeit approximate) comparison of all arithmetic >>benchmarks in the same units (i.e.Mflop/s). Temporal Performances >>cannot be compared across benchmarks. >> All Good points for discussion on Monday, preferable as (If this was discussed can we see the appropriate minutes?) >>suggested editing to my draft submission, with your agreement of course >>David. Are you there >> Roger Hockney My reaction is immediate concern over: >>the Benchmark writer would be expected to define an >>appropriate measure of work I do not believe this will simplify matters. Further: >> The other reason for R_B >>is to allow some (albeit approximate) comparison of all arithmetic >>benchmarks in the same units (i.e.Mflop/s). For this to be worthwhile you will need a wider definition of work, also, given the dependence of results upon problem size, you will need a non-arbitrary definition of problem size. 
We could end up with a finite element problem size given by the number of elements compared to a linear algebra problem size determined by N say when dealing with an N by N square matrix. One further point : Your phrase ``the use of a better algorithm which obtains the solution with less than Fb(N) operations will show up as higher benchmark performance.'' This implies you are allowing for algorithmic improvement. I fully agree with this approach but it does mean PARKBENCH must think about how to present benchmark definitions, and how to report algorithmic changes. In order to prevent a programming competition, we will need to make other people's implementations available. This will also discourage intensive optimisation by companies, if they're helping everybody rather than just themselves. Chris Francis National Physical Laboratory cbf@newton.npl.co.uk From owner-pbwg-comm@CS.UTK.EDU Wed Jun 16 07:12:35 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA14749; Wed, 16 Jun 93 07:12:35 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA03900; Wed, 16 Jun 93 07:11:38 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 16 Jun 1993 07:11:35 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA03892; Wed, 16 Jun 93 07:11:32 -0400 Via: uk.ac.southampton.ecs; Wed, 16 Jun 1993 12:11:30 +0100 From: R.Hockney@parallel-applications-centre.southampton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Wed, 16 Jun 93 12:03:31 BST Date: Wed, 16 Jun 93 11:11:10 GMT Message-Id: <14640.9306161111@calvados.pac.soton.ac.uk> To: pbwg-comm@cs.utk.edu Subject: More Questions to Sun on GS REPLY TO Xian-He SUN -------------------- from Roger Hockney ------------------ Thank you for your efforts to explain Generalised Speedup to me. 
I think that I am getting there slowly, but I think we start from rather different perspectives with probably different hidden assumptions. Your answers have in fact raised more questions, which I will try to keep to the minimum. So, if you will bear with me a little longer ...

(1) We can express generalised speedup as a scaled inverse-time metric, as follows:

                 TU                     W(N;p) * T(N;1)
    GS(N;p) = --------   where   TU = ---------------
               T(N;p)                      W(N;1)

Here TU defines the unit of time measurement.

(2) The key question from my view point is: Does TU change when one calculates GS for two different computers, and when one is comparing the same benchmark and the same problem size?

(3) What do you mean by asymptotic performance?

(4) How does one calculate and/or measure W(N;p), W(N;1) and T(N;1), and on what do they depend (see question (6))?

(5) Showing Dependence

If we adopt a useful convention, and use the symbols:

    A or B   to describe the computer and software being benchmarked
    X or Y   ............... benchmark being timed
    N or M   ............... problem size
    p or q   ............... number of processors
    *        to mean any value

we can show dependence with 4 parameters. Thus for the elapsed time of the benchmark, we have in general

    T(A,X:N;p)   time for p-processors on computer A with benchmark X of problem size N.

It is convenient to keep this ordering and punctuation of the parameters and show explicitly a non-dependence with the wild any-value symbol *, e.g.

(a) Temporal Performance can be written with full dependence explicitly:

    R_T(A,X:N;p) = TU(*,*:*;*)/T(A,X:N;p)

because TU=1 identically.

(b) Benchmark Performance can be written:

    R_B(A,X:N;p) = TU(*,X:N;*)/T(A,X:N;p)

because TU=F_B(N) for any benchmark X, but there is no dependence of TU on the computer or the number of processors.

(6) Can you please express the dependence of the following, using the above notation:

    W(?,?:N;p)    does the p-processor work depend on the computer and benchmark
    W(?,?:N;1)    ........ 1-processor ...........................
    T(?,?:N;1)    .................... time ......................
    TU(?,?:?;?)   ?

I hope the above questions are not too much to ask, but their answers would clarify a lot to me.

Thank you,
Roger Hockney.

From owner-pbwg-comm@CS.UTK.EDU Wed Jun 16 14:21:41 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA18266; Wed, 16 Jun 93 14:21:41 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA03396; Wed, 16 Jun 93 14:20:48 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 16 Jun 1993 14:20:46 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from elc04.icase.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA03378; Wed, 16 Jun 93 14:20:44 -0400
Received: by elc04 (5.65.1/lanleaf2.4.9) id AA14823; Wed, 16 Jun 93 14:20:58 -0400
Message-Id: <9306161820.AA14823@elc04>
Date: Wed, 16 Jun 93 14:20:58 -0400
From: Sun Xian-He
To: pbwg-comm@cs.utk.edu
Subject: Re: More Questions to Sun on GS

John Gustafson and I proposed the generalized speedup metric in 1991. In our original paper, we didn't give implementation details. With a partner, I am working on a paper on generalized speedup and scalability. Using experimental results on KSR, we will explain all the details and give a comparison between traditional and generalized speedup. Since Roger Hockney has asked me some very precise, thoughtful questions on the net, I will use some of our unpublished results to answer his questions. I would appreciate it if readers on this net would reference the source when using any of these results.

> REPLY TO Xian-He SUN
> --------------------
> from Roger Hockney
> ------------------
> (1) We can express generalised speedup as a scaled inverse-time
> metric, as follows:
>                  TU                     W(N;p) * T(N;1)
>     GS(N;p) = --------   where   TU = ---------------
>                T(N;p)                      W(N;1)
> Here TU defines the unit of time measurement.
The two equations are correct, though the notation could be improved to make the relations clearer. I would rather use W(p), W(1), T(p;p), and T(1;1) in place of W(N;p), W(N;1), T(N;p), and T(N;1) respectively. W(p) is the problem size (as a flop count) when p processors are used. W(1) is the problem size when a single processor is used. The problem size (or work) could be a function of some parameter, N, but that is not important here. With this new notation, we have

                 TU                    W(p) * T(1;1)
      GS(p) = --------   where   TU = ---------------
               T(p;p)                      W(1)

Let m = W(p)/W(1); then TU = m * T(1;1). Let T(p;1) = m * T(1;1), where T(p;1) is the single-processor execution time for solving a problem of size W(p). Then

                 TU        m * T(1;1)      T(p;1)
      GS(p) = -------- = ------------ = ---------- = TS(p)
               T(p;p)       T(p;p)        T(p;p)

TS is the traditional speedup. The above equation gives the relation between GS and TS. In general, the measured T(p;1) may not be equal to m * T(1;1). GS can be seen as a TS in which T(p;1) is given by m * T(1;1). If single-processor speed is independent of problem size, the measured T(p;1) is equal to m * T(1;1).

> (2) The key question from my view point is:
>     Does TU change when one calculates GS for two different
>     computers, and when one is comparing the same benchmark
>     and the same problem size ?

W(1)/T(1;1) is a fraction of the asymptotic speed. Asymptotic speed is machine dependent. So, TU may be different for different computers. Speed is defined as work/time.

> (3) What do you mean by asymptotic performance?

Asymptotic speed is the best achieved speed of your application on a single processor. Without virtual memory, in general, it is the speed achieved when the memory is completely filled.

> (4) How does one calculate and or measure
>     W(N;p), W(N;1) and T(N;1)
>     and on what do they depend, see question (6)?

W(1) and T(1;1) are determined by the asymptotic speed. W(p) depends on how you scale the problem size.
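The relation between GS and TS described above can be checked numerically. The following is a minimal sketch only: the function names and all numerical values are hypothetical and are not taken from the unpublished KSR results mentioned in this message. It shows that GS equals TS exactly when T(p;1) = m * T(1;1), and differs when the measured T(p;1) departs from that value.

```python
def generalized_speedup(w_p, w_1, t_11, t_pp):
    """GS(p) = TU / T(p;p), with TU = W(p) * T(1;1) / W(1)."""
    tu = w_p * t_11 / w_1
    return tu / t_pp


def traditional_speedup(t_p1, t_pp):
    """TS(p) = T(p;1) / T(p;p)."""
    return t_p1 / t_pp


# Hypothetical measurements (illustration only).
w_1, t_11 = 100.0, 10.0   # single-processor work and time
w_p, t_pp = 800.0, 12.5   # scaled work and its time on p processors
m = w_p / w_1             # work ratio m = W(p)/W(1)

gs = generalized_speedup(w_p, w_1, t_11, t_pp)
# TS when single-processor speed is independent of problem size.
ts_constant_speed = traditional_speedup(m * t_11, t_pp)
# TS when the measured T(p;1) is larger than m * T(1;1) (assumed value).
ts_measured = traditional_speedup(100.0, t_pp)

print(gs, ts_constant_speed, ts_measured)
```

With these assumed numbers the first two values coincide and the third does not, which is the distinction drawn in the text between GS and a TS based on a measured T(p;1).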
> (5) Showing Dependence
>     If we adopt a useful convention, and use the symbols:
>       A or B   to describe the computer and software being benchmarked
>       X or Y   ............... benchmark being timed
>       N or M   ............... problem size
>       p or q   ............... number of processors
>       *        to mean any value
>     We can show dependence with 4 parameters. Thus for the elapsed
>     time of the benchmark, we have in general
>       T(A,X:N;p)   time for p-processors on computer A with benchmark
>                    X of problem size N.
>     It is convenient to keep this ordering and punctuation of the
>     parameters and show explicitly a non-dependence with the wild
>     any-value symbol *
>     e.g (a) Temporal Performance can be written with full dependence
>             explicitly:
>               R_T(A,X:N;p)= TU(*,*:*;*)/T(A,X:N;p)
>             because TU=1 identically.
>         (b) Benchmark Performance can be written:
>               R_B(A,X:N;p)= TU(*,X:N;*)/T(A,X:N;p)
>             because TU=F_B(N) for any benchmark X, but there is no
>             dependence of TU on the computer or the number of
>             processors.
> (6) Can you please express the dependence of the following, using
>     the above notation:
>       W(?,?:N;p)   does the p-processor work depend on the computer
>                    and benchmark

The user has the freedom to scale the problem size. If the problem size is fixed, the p-processor work does not depend on the computer and benchmark. If the problem size is scaled up under time or memory constraints, the p-processor work depends on the computer and benchmark.

>       W(?,?:N;1)   ........ 1-processor ...........................

W(A,X:N;1)

>       T(?,?:N;1)   .................... time ......................

T(A,X:N;1)

>       TU(?,?:?;?)  ?

TU(A,X:N;q) in general. TU(A,X:*;q) if problem size is fixed.

The above is my interpretation of the generalized speedup. I regret that I didn't discuss this issue with Roger Hockney at Supercomputing'92. I would like to thank Roger for his consideration of generalized speedup.
Xian-He Sun ======================= Xian-He Sun ============================ ICASE (Institute for Computer Applications in Science and Engineering) Mail Stop 132C 804-864-8018 (O) NASA Langley Research Center 804-864-6134 (fax) Hampton, VA 23681-0001 sun@icase.edu ==================================================================== From owner-pbwg-comm@CS.UTK.EDU Wed Jun 16 17:56:37 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA19956; Wed, 16 Jun 93 17:56:37 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA16316; Wed, 16 Jun 93 17:55:10 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 16 Jun 1993 17:55:09 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from timbuk.cray.com by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA16297; Wed, 16 Jun 93 17:55:05 -0400 Received: from magnet (magnet.cray.com) by cray.com (4.1/CRI-MX 2.19) id AA26332; Wed, 16 Jun 93 16:55:26 CDT Received: by magnet (4.1/CRI-5.13) id AA00281; Wed, 16 Jun 93 16:55:24 CDT From: cmg@ferrari.cray.com (Charles Grassl) Message-Id: <9306162155.AA00281@magnet> Subject: Speed-up To: pbwg-comm@cs.utk.edu Date: Wed, 16 Jun 93 16:55:22 CDT X-Mailer: ELM [version 2.3 PL11] Use of measured speed-up ======================== Is speed-up useful for purposes other than comparing computers? Most of us believe that speed-up should not be directly used for comparing computers. For what do computers users, programmers, vendors and designers use speed-up? Speed-up is an observable. We cannot deny the fact that we can measure "speed-up" however it is defined. Speed-up by itself is not a very useful parameter. We see from our current discussions that there are many ways to "define" the observable which we call speed-up. The different ways to define and treat speed-up appear to be addressing the phenomena of efficiency and scalability. We should just say what we are trying to derive. 
Speed-up is related to both scalability and to efficiency. Scalability and efficiency are not directly measurable. Rather, they are usually derived from other measured quantities. Scalability and efficiency are at times, but not necessarily, independent. This is a big problem for us: for some experiments we are deriving efficiency from measured speed-up and for other experiments we are deriving scalability from measured speed-up.

Scalability has been demonstrated, but only for specific cases of specific applications and for specific computers. It appears that each and every application will scale differently and this scalability is linked to the underlying computer architecture. Perhaps there are so many variables associated with scaling an application and a computer that generalizing this concept is not useful.

Efficiency is relatively easy to define for a specific application and computer, though measurement or derivation is difficult. I believe that it can even be generalized. Measurement of speed-up alone does not always allow the user to derive an efficiency. Other factors involved are, for example, amount (percentage) of parallelism, level of vectorization, level of blocking or localization, relative speeds of different computer parts, computer configuration, data configuration, etc.

For simple parallel processing experiments, speed-up is one of the measurables which allows us to calculate the efficiency. But speed-up by itself does not allow us to calculate efficiency. For a fixed sized problem, Amdahl's Law regulates a maximum speed-up. And Amdahl's Law is based on TIME spent in sequential versus parallel sections. These relative times are related to non-constant speeds in different regions of a program. To compute an efficiency, we require a measured speed-up and an Amdahl's Law parameter. Also, the "speed" used in sequential regions and parallel regions must be monitored or regulated.
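One possible reading of the last paragraph, sketched here under stated assumptions: the "Amdahl's Law parameter" is taken to be the fraction of single-processor TIME spent in parallelisable sections, and the measured speed-up is compared both with the ideal value p and with the Amdahl limit. The function names and numbers are hypothetical, and this is not necessarily the definition of efficiency the author has in mind.

```python
def amdahl_speedup_limit(parallel_fraction, p):
    """Maximum fixed-size speed-up on p processors (Amdahl's Law).

    parallel_fraction is the fraction of single-processor time spent
    in sections that can be run in parallel.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / p)


def parallel_efficiency(speedup, p):
    """Conventional efficiency: speed-up divided by processor count."""
    return speedup / p


# Hypothetical experiment (illustration only).
p = 8
parallel_fraction = 0.9
measured_speedup = 4.0

limit = amdahl_speedup_limit(parallel_fraction, p)
efficiency_vs_ideal = parallel_efficiency(measured_speedup, p)
efficiency_vs_amdahl = measured_speedup / limit

print(limit, efficiency_vs_ideal, efficiency_vs_amdahl)
```

The same measured speed-up gives two quite different "efficiencies" depending on whether the reference is p or the Amdahl limit, which illustrates why the speed-up alone is not enough.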
SUMMARY Speed-up is the observable which allows the derivation of efficiency. This efficiency is not directly comparable between different computers or applications. Charles Grassl Cray Research, Inc. Eagan, Minnesota USA From owner-pbwg-comm@CS.UTK.EDU Wed Jun 16 18:20:57 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA20120; Wed, 16 Jun 93 18:20:57 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA17319; Wed, 16 Jun 93 18:19:11 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 16 Jun 1993 18:19:10 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sp2.csrd.uiuc.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA17305; Wed, 16 Jun 93 18:19:08 -0400 Received: from sp94.csrd.uiuc.edu.csrd.uiuc.edu (sp94.csrd.uiuc.edu) by sp2.csrd.uiuc.edu with SMTP id AA01538 (5.67a/IDA-1.5 for ); Wed, 16 Jun 1993 17:19:25 -0500 Received: by sp94.csrd.uiuc.edu.csrd.uiuc.edu (4.1/SMI-4.1) id AA09007; Wed, 16 Jun 93 17:19:24 CDT Date: Wed, 16 Jun 93 17:19:24 CDT From: schneid@csrd.uiuc.edu (David John Schneider) Message-Id: <9306162219.AA09007@sp94.csrd.uiuc.edu.csrd.uiuc.edu> To: R.Hockney@pac.soton.ac.uk Cc: pbwg-comm@cs.utk.edu In-Reply-To: <7257.9306111942@calvados.pac.soton.ac.uk> (R.Hockney@pac.soton.ac.uk) Subject: Re: Reply to Dave Schneider on Metrics Roger, I think that we are in agreement that the R_B measurement is the best existing way of comparing the performance of different codes on different machines. I also agree with your conclusion that the relative ordering of performance is unambiguous for a single code across multiple machine if one uses the R_B metric. This is one of the strongest arguments in favor of the R_B metric. The other issue is also important -- can R_B be used to unambiguously order the performance of different codes on different machines? 
I will try to set up an example of how R_B fails to satisfy the requirements for a unique order relation for this second question. Let X and Y be two different machines, and let A and B denote two different code/dataset combinations which run on both X and Y. Assume the following execution times:

      T(A,X) = 1      T(B,X) = 10
      T(A,Y) = 2      T(B,Y) = 2

If F_B(A)=1 and F_B(B)=10, then the average R_B for X (1.0) is less than for Y (2.75). However, if F_B(A)=3 and F_B(B)=3, then the average R_B for X (1.65) is greater than for Y (1.5). Yet it is obvious that the sum of the execution times on X is greater than the sum of the execution times on Y in both cases. This example illustrates the way in which using "reasonable" but arbitrary normalization conventions, coupled with a "reasonable" single figure-of-merit, leads to confusion.

One might ask whether or not one can measure F_B accurately enough using hardware performance monitoring equipment to eliminate this confusion. In fact, this is not always possible. For example, one can get very different operation counts for the same source code/dataset in scalar and vector modes on vector computers. As I recall, Jim Hack presented examples at the US-Japan Performance Evaluation Workshop where the hardware operation counts for individual loops on the Y-MP can increase by a factor of 5 when the code is executed in vector mode as opposed to scalar mode. Despite these large increases in operation count, the vector mode execution time can be considerably smaller.
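The arithmetic of the example above can be checked with a few lines of code. This is a small sketch using the execution times and the two F_B conventions given in the example; the helper names are hypothetical, and the average is assumed to be the arithmetic mean.

```python
# Execution times T(code, machine) from the example above.
times = {
    ("A", "X"): 1.0, ("B", "X"): 10.0,
    ("A", "Y"): 2.0, ("B", "Y"): 2.0,
}


def mean_r_b(machine, f_b):
    """Arithmetic mean over codes of R_B = F_B(code) / T(code, machine)."""
    rates = [f_b[code] / times[(code, machine)] for code in f_b]
    return sum(rates) / len(rates)


def total_time(machine):
    """Sum of the execution times of all codes on one machine."""
    return sum(t for (code, m), t in times.items() if m == machine)


conventions = {
    "F_B(A)=1, F_B(B)=10": {"A": 1.0, "B": 10.0},
    "F_B(A)=3, F_B(B)=3": {"A": 3.0, "B": 3.0},
}

for label, f_b in conventions.items():
    print(label, "mean R_B on X:", mean_r_b("X", f_b),
          "on Y:", mean_r_b("Y", f_b))

print("total time on X:", total_time("X"), "on Y:", total_time("Y"))
```

The ranking of the two machines by mean R_B differs between the two conventions, while the total execution times do not change.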
Dave ======================================================================== Errors-To: owner-pbwg-comm@CS.UTK.EDU X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 11 Jun 1993 15:43:29 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU From: R.Hockney@pac.soton.ac.uk Date: Fri, 11 Jun 93 19:42:31 GMT Reply to Dave Schneider on F_B and R_B -------------------------------------- from Roger Hockney I agree with almost everything you said in your last note, from which I conclude: (a) There is, as yet, no axiomatic approach that we can use, therefore we are left to consider and improve the draft metrics that I presented to the last meeting (Chapter 1). (b) R_B(N;p) with sensibly defined F_B(N) is probably the least biased current metric (at least it is unambiguous and acceptable). I claim below that it is also the most convenient and useful. You rightly point out that R_B is no more than a scaled time measurement. That is precisely why it is defined in the way that it is, with F_B(N) as a nominal value (or function) that is the same for all computers and is unchangeable by fiat. This is the way it has to be if R_B is to have the properties we require. Thus F_B(N) should be thought of as inscribed in stone and handed down by the god of benchmarks. R_B is not intended to be a measure of the flop-rate of the hardware, for which we have defined another metric R_H, the hardware performance (incidentally a metric in which we should not be interested). Rather R_B is an inverse-time measure scaled for our convenience in a particular way (see below), to be more useful than the straight inverse-time metric which we call Temporal Performance R_T=1/T, but find difficult to use and compare across benchmarks. 
However, F_B(N) is also intended to approximate to the real hardware flop-count, F_H(N) of a good implementation of the one processor sequential code, so we expect the benchmark and hardware Mflop/s to be similar for p=1 R_B(N;1) \approx R_H(N;1)=F_H(N)/T(N;1) For p>1, F_H(N) may be >> F_B(N) because F_H counts redundant operations which are common in parallel code, whereas F_B correctly does not. Redundant operations are any operations that are repeated in each processor of a parallel implementation. Thus Benchmark and Hardware Mflop/s may differ widely in a parallel code. The important point is that we should not be interested in R_H, i.e. in generating hardware Mflop/s. The aim of the programmer should be to maximise Benchmark Mflop/s, which by its careful definition corresponds to minimising elapsed wall-clock execution time, T(N;p). To make this distinction clearer the two performance metrics have, strictly speaking, different units, and should be written differently: Benchmark Performance in units Mflop/s(benchmark name) Hardware Performance in units Mflop/s(computer name) So that, whereas I agree that R_B is a scaled time measurement, I do not agree that it is an arbitrary scaling (like SPEC use with the VAX11/780 reference timings). The scaling is closely related to the hardware flop-count of sequential code, and therefore a measure, albeit somewhat imprecise, of the amount of work required to solve the problem using current algorithms. The property that higher R_B means lower execution time, does not depend on this relation with the hardware flop-count being close (it only depends on F_B(N) being kept the same across all computers), but our Mflop/s numbers would begin to look rather silly (arbitrary) if it was not fairly close. At some stage I agree that we should adopt a better measure of work than Mflop, that might come from the research you refer to, but for this year's report, I can see no alternative to Mflop. 
Further, we are used to the measure, and performance values derived from it: Mflop/s. Concerning the ordering: For a given N, if the R_B(N;p) of code A on computer X is greater than the R_B(N;p) of code B on computer Y, then code A runs in less time on X than B does on Y. That is the ordering given by time alone is the same as that given by R_B. Going to a different problem size, N2 say, is rather like going to a different benchmark, and it is possible that code B will execute on Y in less time than A does on X (i.e. the ordering may change), but this will be a real difference in behaviour of the codes which is reflected by R_B(N2;p) of B on Y becoming greater than R_B(N2;p) of A on X. Such a change of ordering in R_B is real and not a matter of definition. Everything is as it should be. I cannot think of an example, where changing the function F_B(N) can change the ordering of the benchmarked computers. Can you give me an example, please. Remember that F_B(N) is the same function for all computers, its a property of the problem/benchmark, not the computer. A misunderstanding may arise however if one compares the R_B(N;p) of one problem size with the R_B(N2;p) of another problem size. It is not correct to conclude that the higher R_B means the least execution time because the numerators F_B(N) and F_B(N2) are different. Generally speaking the larger problem will generate more nominal benchmark Mflop/s(benchmark), and also real hardware Mflop/s, but because it is a bigger problem will take much longer to run. So higher R_B, in this comparison, may mean longer execution time, not shorter. But we should not be surprised that a bigger problem takes longer to run, even if it generates more Mflop/s (real and/or nominal). To compare execution time across benchmarks or across different sizes of the same benchmark, we must compare the Temporal Performance, R_T=R_B/F_B. 
Because of the last paragraph, some have questioned whether one should abandon R_B and use the Temporal Performance, R_T(N;p)=1/T(N;p), alone. Probably one should quote both, but it is really a matter of convenience. Values of R_B will tend to cluster around a relatively confined region of the (R_B vs p) performance graphs, because they do approximate to the available hardware Mflop/s of the computer. Thus a set of results for a range of sizes of a range of benchmarks for a given computer tend to cluster together, and should give a fair idea of the range of possible performance, and this range should not be too large or scattered. The tendency will be for all benchmarks and all problem sizes to cluster near one value for p=1, although there will be some scatter due to vector length effects if vector processors are used. As p increases, the curves for different problem sizes and benchmarks will diverge, but since they are tied together at one end they can't get too far apart. Computers of different types may give recognisable patterns when performance results are displayed in this way in the (R_B vs p) plane. This may help us classify performance behaviour by describing the region of the plane occupied by the benchmark results, but we need to see a lot of data first. It is also why I believe that an interactive graphical interface should be considered as an essential part (not just a sexy add-on) of the Parkbench performance database. In sharp contrast, values of time alone, and therefore R_T, will vary much more widely over orders of magnitude between small and large problems, and from one benchmark to another. It will be much more difficult to see any pattern of performance in the (R_T vs p) plane or try to draw conclusions. In practice using R_T as the display metric will be much more difficult and less useful. 
Roger Hockney

From owner-pbwg-comm@CS.UTK.EDU Fri Jun 25 17:10:01 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA22429; Fri, 25 Jun 93 17:10:01 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA20862; Fri, 25 Jun 93 17:08:43 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 25 Jun 1993 17:08:41 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA20854; Fri, 25 Jun 93 17:08:38 -0400
Via: uk.ac.southampton.ecs; Fri, 25 Jun 1993 22:09:16 +0100
From: R.Hockney@parallel-applications-centre.southampton.ac.uk
Via: calvados.pac.soton.ac.uk (plonk); Fri, 25 Jun 93 22:01:04 BST
Date: Fri, 25 Jun 93 21:08:51 GMT
Message-Id: <2246.9306252108@calvados.pac.soton.ac.uk>
To: pbwg-comm@cs.utk.edu
Subject: Schneider's Benchmark example

COMMENTS ON DAVE SCHNEIDER'S BENCHMARK EXAMPLE
from Roger Hockney
--------------------------

Help, you have changed the notation on us! I am going to express your example in the previously used notation:

      Computer A or B, Code X or Y, size N or M, procs p or q

Then your example is

                  COMPUTER A                  COMPUTER B
      CODE X      T(A,X:N;p) = 1 s            T(B,X:N;p) = 2 s
      CODE Y      T(A,Y:N;p) = 10 s           T(B,Y:N;p) = 2 s

Then the Temporal Performances are:

                  COMPUTER A                  COMPUTER B
      CODE X      R_T(A,X:N;p) = 1 soln/s     R_T(B,X:N;p) = 0.5 soln/s
      CODE Y      R_T(A,Y:N;p) = 0.1 soln/s   R_T(B,Y:N;p) = 0.5 soln/s

and we can conclude that:

      "Computer A is better on code X, but Computer B is better on code Y".

Further, the above is the only conclusion that can validly be drawn from this benchmarking data. To be pedantic, it is also necessary to state that "better" in benchmarking is shorthand for "shorter wall-clock elapsed time", T(A,X:N;p). I trust we have all agreed on that, otherwise we are in deep trouble.
If the work in flop-count is defined as in your first case

      F_B(*,X:N;*) = 1 Mflop   and   F_B(*,Y:N;*) = 10 Mflop

Then the Benchmark Performances are:

                  COMPUTER A                  COMPUTER B
      CODE X      R_B(A,X:N;p) = 1 Mflop/s    R_B(B,X:N;p) = 0.5 Mflop/s
      CODE Y      R_B(A,Y:N;p) = 1 Mflop/s    R_B(B,Y:N;p) = 5.0 Mflop/s

and again we can conclude that:

      "Computer A is better on code X, but Computer B is better on code Y".

This is still correct, so where is the confusion?

If the work in flop-count is defined as in your second case

      F_B(*,X:N;*) = 3 Mflop   and   F_B(*,Y:N;*) = 3 Mflop

Then the Benchmark Performances are:

                  COMPUTER A                  COMPUTER B
      CODE X      R_B(A,X:N;p) = 3.0 Mflop/s  R_B(B,X:N;p) = 1.5 Mflop/s
      CODE Y      R_B(A,Y:N;p) = 0.3 Mflop/s  R_B(B,Y:N;p) = 1.5 Mflop/s

and again we can conclude that:

      "Computer A is better on code X, but Computer B is better on code Y".

This is still correct, so where is the confusion?

The confusion that you refer to seems to arise when you take an average (you don't say what kind). But taking averages of benchmark rates is known to have little meaning, and quite simply should not be done. If A is better on one code and B on another, it is not surprising that this result cannot usefully be averaged. What is the average of "black" and "white", or any two opposites? So I believe the problem is not with the definition of the Benchmark performance, but with the taking of a meaningless average. In my proposals to the methodology group, I deliberately did not recommend the taking of any average. In fact, I recommend the graphical display of all results.

SCALING BEHAVIOUR
-----------------

You should mention that the scaling behaviour of a benchmark result depends just as much, if not more, on the properties of the parallel benchmark code, as it does on the properties of the computer. An obvious example is Amdahl performance saturation which limits the benchmark performance, and depends on the amount of unparallelised code and the speed of a single processor.
It does not depend at all on the maximum number of processors that can be assembled (which might be regarded as the scalability of the hardware).

For each benchmark, I would expect to be able to derive a best possible scaling, from a theoretical timing analysis of the parallelised code, on the basis that communication rates were infinite and startup times were zero. This would show Amdahl saturation and be the ideal set of curves that the real computers with finite rates and overheads would be aiming at. For some embarrassingly parallel applications these curves (one for each of a set of problem sizes) would be close to the ideal speedup lines, however for other problems various forms of performance saturation would take place.

The important thing is to have a realistic timing formula showing the variation of T(A,X:N;p) with A, X, N and p. This is what we have tried to do with the Genesis Benchmarks (see Addison et al, Concurrency P&E, vol 5(1), 1-22). Only when such a formula is given, is scaling completely understood. Also the effect of increasing problem size N with p in different ways can be studied by inserting the different variations into the timing formula. In the above reference the processing node was characterised by 5 hardware parameters (message startup and stream rate, scalar arithmetic rate and vector rinf and nhalf), and the code by the corresponding 5 program parameters (number of messages, total bytes sent, number of scalar flop, number of vectorised flop, and average vector length). Given such a formula, the best possible scaling can be calculated by setting the message startup time to zero and the stream rate to infinity.

I suggest that the committee strongly recommend that such a comprehensive timing formula be produced for each benchmark, by the writer of the benchmark.

Roger Hockney

PS Dave, it is enough to send your reply to pbwg-comm, please do not copy it to rwh@pac because I then get it twice.
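A timing formula of the kind described in the message above, built from 5 hardware parameters and 5 program parameters, might be sketched as follows. This is an assumed form for illustration only: the additive combination of terms, the (rinf, nhalf) vector term, the class and field names, and all numbers are hypothetical and are not taken from the Addison et al. reference. In a real benchmark the program parameters would themselves be functions of N and p.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    """Hardware parameters of one processing node (assumed units)."""
    t_startup: float   # message startup time, s
    r_stream: float    # message stream rate, byte/s
    r_scalar: float    # scalar arithmetic rate, flop/s
    r_inf: float       # asymptotic vector rate, flop/s
    n_half: float      # vector length giving half of r_inf


@dataclass(frozen=True)
class Program:
    """Program parameters for one processor at a given N and p."""
    n_messages: float
    n_bytes: float
    scalar_flop: float
    vector_flop: float
    avg_vector_length: float


def elapsed_time(node, prog):
    """Assumed additive model: communication + scalar + vector time."""
    comm = prog.n_messages * node.t_startup + prog.n_bytes / node.r_stream
    scalar = prog.scalar_flop / node.r_scalar
    vector = (prog.vector_flop
              * (1.0 + node.n_half / prog.avg_vector_length) / node.r_inf)
    return comm + scalar + vector


def best_possible_time(node, prog):
    """Zero message startup and infinite stream rate, as in the text."""
    ideal = replace(node, t_startup=0.0, r_stream=math.inf)
    return elapsed_time(ideal, prog)


# Hypothetical node and program (illustration only).
node = Node(t_startup=1e-4, r_stream=1e6, r_scalar=1e6, r_inf=1e7, n_half=10.0)
prog = Program(n_messages=100, n_bytes=1e6, scalar_flop=1e6,
               vector_flop=1e7, avg_vector_length=90.0)

real = elapsed_time(node, prog)
ideal = best_possible_time(node, prog)
print(real, ideal)
```

The difference between the two times is the communication cost, so plotting both against p would show how far a real machine falls short of the best possible scaling for that code.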
From owner-pbwg-comm@CS.UTK.EDU Sat Jun 26 11:18:24 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA27945; Sat, 26 Jun 93 11:18:24 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA29422; Sat, 26 Jun 93 11:17:31 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Sat, 26 Jun 1993 11:17:30 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from sp2.csrd.uiuc.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA29414; Sat, 26 Jun 93 11:17:28 -0400
Received: by sp2.csrd.uiuc.edu id AA29450 (5.67a/IDA-1.5 for pbwg-comm@cs.utk.edu); Sat, 26 Jun 1993 10:18:22 -0500
Date: Sat, 26 Jun 1993 10:18:22 -0500
From: David John Schneider
Message-Id: <199306261518.AA29450@sp2.csrd.uiuc.edu>
To: pbwg-comm@cs.utk.edu
Subject: Reply to R. Hockney

Roger,

I agree that the example that I constructed is artificial in the sense that the performance of different codes on different machines is simply not comparable. The fallacy lies in trying to construct a single figure of merit, not in the definition of F_B.

In the case of computer performance, the definition and dissemination of a single figure of merit encourages the comparison of incomparable data. In my example, the single figure of merit was the arithmetic mean of the performance of the two codes. Other means and normalization conventions, such as the one used by SPEC, have the same basic problem. The basic problem is indeed very basic -- a single figure of merit defines a total ordering on the set of machines while, in reality, elapsed time measurements on a set of codes define only a partial or topological ordering on the set of machines. The point of my example was that the uncertainty inherent in an arbitrary but "reasonable" convention for defining F_B can lead to different orderings specified by a single figure of merit.
My concern is primarily related to the statements made by several people about the desirability of the PBWG reporting a single figure of merit, and only secondarily with a precise technical definition of F_B. I have been following the discussions here rather closely, and I haven't seen an explicit statement that the PBWG does *NOT* intend to publish a single figure of merit. If such a decision is made by the group, then the precise definition of F_B becomes much less important. I would encourage the PBWG to adopt the policy of refusing to indirectly endorse *ANY* single figure of merit by publishing such numbers.

Dave

From owner-pbwg-comm@CS.UTK.EDU Sun Jun 27 15:50:47 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA01671; Sun, 27 Jun 93 15:50:47 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA07321; Sun, 27 Jun 93 15:49:37 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Sun, 27 Jun 1993 15:49:36 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA07309; Sun, 27 Jun 93 15:49:33 -0400
Via: uk.ac.southampton.ecs; Sun, 27 Jun 1993 20:49:57 +0100
From: R.Hockney@parallel-applications-centre.southampton.ac.uk
Via: calvados.pac.soton.ac.uk (plonk); Sun, 27 Jun 93 20:41:54 BST
Date: Sun, 27 Jun 93 19:49:42 GMT
Message-Id: <2932.9306271949@calvados.pac.soton.ac.uk>
To: pbwg-comm@cs.utk.edu
Subject: Single Figures of Merit

QUESTION TO SCHNEIDER ON SINGLE FIGURE OF MERIT
-----------------------------------------------

I am not sure what you mean by a single figure of merit.

(1) Is reporting many values of R_B for a set of benchmarks with different sizes and a range of numbers of processors (say 30 values of R_B) reporting a single figure of merit, because one has only reported one type of metric, i.e. R_B, and not also included other metrics like R_H, R_T, S_p?
(2) Or does one only commit the crime when one averages the 30 values to obtain THE overall average R_B for the computer, by taking some average, like the SPECmark, which is the geometric average of many SPEC ratios?

(3) The Parkbench discussions have, so far, been based on my draft of chapter 1. This contained no statement on averaging to get a single figure of merit, because I do not believe that such averaging has much meaning. And no member has proposed during discussion that we should do this. Thus by implication the committee does not, as yet, propose averaging to get a final single figure for a machine. I hope that this remains the position, but it might be useful to explicitly say that we do not support such averaging.

My view as repeatedly expressed is to show the full range of benchmark results graphically, 100 results = 100 dots on the graph. In order to plot these it is useful if they are all expressed in the same way, and I suggest that R_B(N;p) is the most convenient metric to use for this.
From owner-pbwg-comm@CS.UTK.EDU Mon Jun 28 07:50:02 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA09698; Mon, 28 Jun 93 07:50:02 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA04316; Mon, 28 Jun 93 07:49:08 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 28 Jun 1993 07:49:03 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA04308; Mon, 28 Jun 93 07:49:00 -0400
Via: uk.ac.southampton.ecs; Mon, 28 Jun 1993 12:49:13 +0100
From: R.Hockney@parallel-applications-centre.southampton.ac.uk
Via: calvados.pac.soton.ac.uk (plonk); Mon, 28 Jun 93 12:40:54 BST
Date: Mon, 28 Jun 93 11:48:43 GMT
Message-Id: <3467.9306281148@calvados.pac.soton.ac.uk>
To: pbwg-comm@cs.utk.edu
Subject: Generalised Speedup

REPLY to Xian-He SUN - 2
from Roger Hockney
-------------------------

Thank you for answering my detailed questions on Generalised Speedup (GS). From these I see that TU, the unit of time used in GS, changes for different computers, and even for different numbers of processors on the same computer, i.e. the dependency is, in general, TU(A,X:N;p). Thus, I conclude the following:

(1) We *cannot* conclude that if the GS on computer B is greater than the GS on computer A on a particular benchmark (same X:N), then computer B executes the benchmark in less wallclock time than computer A. This is because the relationship is:

    if GS(B,X:N;p) > GS(A,X:N;p) then by definition

        TU(B,X:N;p)      TU(A,X:N;p)
       -------------  > -------------
        T(B,X:N;p)       T(A,X:N;p)

    or
                       TU(B,X:N;p)
        T(B,X:N;p) <  ------------- * T(A,X:N;p)
                       TU(A,X:N;p)

    and, unfortunately, the ratio of TUs is not guaranteed to be unity, because the value of TU depends on the computer. This is also true for conventional Self-Speedup.
    For example, take TUB=9, TB=4, TUA=3, TA=2. Then GSB>GSA and, although TB<3*TA, WE HAVE IN FACT TB>TA.

    However, the ratio of TUs is, of course, unity for the metrics R_T, R_B because in these cases TU does not depend on the computer or the number of processors. Therefore the R_T and R_B metrics do not suffer from the above problem.

(2) Because of (1), the ordering of computers in order of GS value is not necessarily the same as the ordering by inverse wallclock time, which is what, I believe, users and the committee expect. R_T and R_B, of course, do retain the inverse wallclock time ordering.

(3) Looked at in another way, the GS metric uses different time units for different computers. Such numbers cannot be directly compared across computers, and to do so is like comparing the numerical values of speeds of cars measured in different units, m.p.h., f.p.s, cm/s, in order to decide which is the fastest. In both cases such comparisons are not comparing like with like, and are invalid. More confusing still, the time unit changes with the number of processors, making GS values incomparable even for measurements on the same computer for different numbers of processors.

(4) There are undoubtedly other good reasons for computing and studying values of GS, for example in the understanding of so-called "super-linear" speedup. However, in view of points (1) and (3) above, I do not believe that GS is a suitable metric for the Parkbench committee to use to report benchmark results, although there is no reason why others should not compute such metrics for their own purposes from the data provided by Parkbench.
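The numerical example in point (1) can be verified directly. The short sketch below uses the values TUB=9, TB=4, TUA=3, TA=2 from the text; the function names and the common work value F_B = 6 used for the R_B comparison are hypothetical choices made only for illustration.

```python
def generalised_speedup(tu, t):
    """GS = TU / T, where TU may differ from computer to computer."""
    return tu / t


def benchmark_performance(f_b, t):
    """R_B = F_B / T, where F_B is the same for every computer."""
    return f_b / t


# Values from the example in point (1).
tu_a, t_a = 3.0, 2.0
tu_b, t_b = 9.0, 4.0

gs_a = generalised_speedup(tu_a, t_a)
gs_b = generalised_speedup(tu_b, t_b)

# Hypothetical nominal work, identical for both computers.
f_b = 6.0
r_b_a = benchmark_performance(f_b, t_a)
r_b_b = benchmark_performance(f_b, t_b)

print("GS:  A =", gs_a, " B =", gs_b)    # B has the larger GS
print("T:   A =", t_a, " B =", t_b)      # but B takes longer
print("R_B: A =", r_b_a, " B =", r_b_b)  # R_B ranks A first, as time does
```

Because F_B is common to both computers, the R_B ordering always agrees with the ordering by elapsed time, whatever positive value is chosen for F_B.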
Roger Hockney

From owner-pbwg-comm@CS.UTK.EDU Wed Jun 30 11:19:04 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA06257; Wed, 30 Jun 93 11:19:04 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA05059; Wed, 30 Jun 93 11:18:01 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 30 Jun 1993 11:18:00 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from elc04.icase.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA05051; Wed, 30 Jun 93 11:17:46 -0400
Received: by elc04 (5.65.1/lanleaf2.4.9) id AA22617; Wed, 30 Jun 93 11:18:48 -0400
Message-Id: <9306301518.AA22617@elc04>
Date: Wed, 30 Jun 93 11:18:48 -0400
From: Sun Xian-He
To: pbwg-comm@cs.utk.edu
Subject: REPLY to Generalised Speedup - 2
Cc: sun@fluke.icase.edu

Through Roger Hockney's questions, I see that some confusion still exists. The following is my answer to these questions. I see my role here as explaining the generalized speedup. There is no intention to argue that this metric is better than another. In particular, I think execution time is important and should be reported, though execution time will be heavily influenced by a programmer's programming skill. I would like to thank Roger Hockney for his respectful discussion.

---------------------------------------------------------------------------
> REPLY to Xian-He SUN - 2
> from Roger Hockney
> -------------------------
>I see that TU, the unit of time used in GS, changes for
>different computers, and even for different numbers of processors on the
>same computer, i.e. the dependency is, in general, TU(A,X:N;p). Thus, I
>conclude the following:
---------------------------------------------------------------------------

The answer is YES and No. TU changes for different computers, but whether it changes for different numbers of processors on the same computer is questionable.
Recall that

            TU                      W(p) * T(1;1)
   GS(p) = ------    where    TU = ---------------
           T(p;p)                       W(1)

If the problem size is fixed, TU is independent of number of processors. More precisely, TU_1 = T(1;1)/W(1) is independent of number of processors.

---------------------------------------------------------------------------
> (1) We *cannot* conclude that if the GS on computer B is greater than
> the GS on computer A on a particular benchmark (same X:N), then
> computer B executes the benchmark in less wallclock time than
> computer A.
---------------------------------------------------------------------------

True. The generalized speedup measures speedup, not wallclock time. However, as a good metric, it is closely related to execution time.

Let GS(A,X:N;p) = GS = TU/T, TU = (W(p) * T(1;1))/W(1) and GS(B,X:N;p) = GS' = TU'/T', TU' = (W'(p) * T'(1;1))/W'(1).

If GS > GS' then by definition

    TU     TU'
    --  >  ---
    T      T'

or

        TU
    T < --- * T'
        TU'

If we run the same problem size on both machines A and B, then W(p)=W'(p) and TU/TU' = TU_1/TU'_1. TU_1 and TU'_1 are pre-measured and are independent of number of processors. TU_1 and TU'_1 give the sequential computation power of the two machines. GS and GS' give their computation power variation with parallel processing. The usefulness of speedup is its ability to show the parallel-processing potential of an algorithm and architecture.

---------------------------------------------------------------------------
> (3) Looked at in another way, the GS metric uses different time units
> for different computers. Such numbers cannot be directly compared
> across computers, and to do so is like comparing the numerical values
> of speeds of cars measured in different units, m.p.h., f.p.s, cm/s,
> in order to decide which is the fastest. In both cases such comparisons
> are not comparing like with like, and are invalid.
More confusing
> still, the time unit changes with the number of processors, making
> GS values incomparable even for measurements on the same computer
> for different numbers of processors.
---------------------------------------------------------------------------

Based on the reasons given above, I don't agree with these conclusions. The TU_1 is measured with the same unit. It does not change with the number of processors.

Xian-He Sun

From owner-pbwg-comm@CS.UTK.EDU Sat Jul 3 16:44:19 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA26592; Sat, 3 Jul 93 16:44:19 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA10055; Sat, 3 Jul 93 16:43:02 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Sat, 3 Jul 1993 16:43:01 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA10047; Sat, 3 Jul 93 16:42:58 -0400
Via: uk.ac.southampton.ecs; Sat, 3 Jul 1993 21:44:15 +0100
From: R.Hockney@parallel-applications-centre.southampton.ac.uk
Via: calvados.pac.soton.ac.uk (plonk); Sat, 3 Jul 93 21:35:58 BST
Date: Sat, 3 Jul 93 20:43:53 GMT
Message-Id: <470.9307032043@calvados.pac.soton.ac.uk>
To: pbwg-comm@cs.utk.edu
Subject: Reply to SUN - 3

Reply to Xian-He Sun - 3
------------------------
from Roger Hockney
------------------

(1) I take your point that for the same problem size TU does not depend on the number of processors. It does, however, as you agree, depend on the computer, and this is the source of the problem. If we wish to report the actual (as opposed to potential) performance of computers, then GS is not a suitable metric for ParkBench results, because a higher GS does not necessarily imply a shorter wallclock time. Again, your reply to my point (1) shows that you agree with this; indeed, GS is not intended to measure wallclock time, it is intended to measure Speedup, which is something quite different.
(2) You misunderstand me about the units: Of course TU(A) and TU(B) are both measured in the same units, say seconds. However the numerical value of the ratio T(A)/TU(A) is the time on computer A measured in units of TU(A): it is how many TU(A)s there are in T(A). Similarly T(B)/TU(B) is the time on computer B in units of TU(B). The inverses of these two quantities are GS(A) and GS(B). So comparing GS(A) with GS(B) is like comparing two inverse time measurements in which the units used to measure time are different, because TU(A).NE.TU(B) in general. If one is interested in comparing wallclock times, one is therefore comparing the numerical values of two measurements that are expressed in different units (like the speeds of my cars in m.p.h. and cm/s). Such a comparison is invalid because one is not comparing like with like. However, of course, if one is not interested in comparing wallclock time, but in comparing Speedup (generalised or not) for its own sake, then one may compare GS(A) and GS(B).

(3) My point is that ParkBench should keep its feet firmly on the ground and only compare actual observed performance of computers using metrics that, for a given benchmark and problem size, preserve the performance ordering given by inverse wallclock time. I do not believe that we should confuse the issue by reporting such metrics as Speedup and Generalised Speedup, which do not satisfy this condition and express some potential rather than actual performance. What is the use of a high speedup, if the processors are so slow that the actual performance is poor? True, in theory one could say we can easily speed up the individual nodes to remedy the situation. But this only works if the communications are similarly speeded up and the latencies similarly reduced. This is rarely possible or done. The only safe policy for ParkBench is to stick to actual performance, and use metrics satisfying the condition above.
Thus, in my opinion, we should not use Speedup or Generalised Speedup. I am not saying that GS is not an interesting and useful concept, far from it; I am saying only that it does not measure directly actual performance, and is therefore unsuitable for expressing benchmark results which are supposed and assumed by readers to report actual performance. Roger Hockney From owner-pbwg-comm@CS.UTK.EDU Tue Jul 13 21:18:08 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA09797; Tue, 13 Jul 93 21:18:08 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA27671; Tue, 13 Jul 93 21:17:26 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 13 Jul 1993 21:17:25 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from BERRY.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA27665; Tue, 13 Jul 93 21:17:24 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (5.61++/2.7c-UTK) id AA27220; Tue, 13 Jul 93 21:17:23 -0400 Message-Id: <9307140117.AA27220@berry.cs.utk.edu> To: pbwg-comm@cs.utk.edu Subject: PDS release Date: Tue, 13 Jul 1993 21:17:23 -0400 From: "Michael W. Berry" Subject: XNetlib ver. 3.4 Released Announcing the release of XNetlib ver. 3.4 What it is - Xnetlib is a new version of netlib recently developed at the University of Tennessee and Oak Ridge National Laboratory. Unlike netlib, which uses electronic mail to process requests for software, xnetlib uses an X Window graphical user interface and a socket-based connection between the user's machine and the xnetlib server machine to process software requests. Xnetlib is available to anyone who has access to the TCP/IP Internet. Xnetlib provides access to files and a whois style database residing on the Netlib server at the Oak Ridge National Laboratory. Xnetlib also connects to two other xnetlib servers, one at Rice University, and the other at the Army Research Laboratory. 
Our intention is to release the xnetlib server code in a few months. New to this release is the Performance Database Server (described more fully below) and a conference database. Xnetlib requires the Athena widget set (Xaw); however, precompiled executables are available.

How to get it -
  By anonymous ftp from netlib2.cs.utk.edu in xnetlib/xnetlib3.4.shar.Z
  By email, send a message to netlib@ornl.gov containing the line:
    send xnetlib3.4.shar from xnetlib

Precompiled executables for various platforms are also available. For information get the index file for the xnetlib library:
  via anonymous ftp from netlib2.cs.utk.edu
    get xnetlib/index
  via email send a message to netlib@ornl.gov containing the line:
    send index from xnetlib

If you have any questions, please send mail to xnetlib@cs.utk.edu

-----------------------------------------------------------------

PDS: A Performance Database Server

The process of gathering, archiving, and distributing computer benchmark data is a cumbersome task usually performed by computer users and vendors with little coordination. Most important, there is no publicly-available central depository of performance data for all ranges of machines from personal computers to supercomputers.

This Xnetlib release contains an Internet-accessible performance database server (PDS) which can be used to extract current benchmark data and literature. The current PDS provides an on-line catalog of the following public-domain computer benchmarks: Linpack Benchmark, Parallel Linpack Benchmark, Bonnie Benchmark, FLOPS Benchmark, Peak Performance (part of Linpack Benchmark), Fhourstones and Dhrystones, Hanoi Benchmark, Heapsort Benchmark, Nsieve Benchmark, Math Benchmark, Perfect Benchmarks, and Genesis Benchmarks. Rank-ordered lists of machines per benchmark are available, as well as relevant papers and bibliographies.
A browse facility allows the user to extract a variety of machine/benchmark combinations, and a search feature permits specific queries into the performance database. PDS does not reformat or present the benchmark data in any way that conflicts with the original methodology of any particular benchmark; it is thereby devoid of any subjective interpretations of machine performance.

PDS is invoked by selecting the "Performance" button in the Xnetlib Menu Options. Questions and comments for PDS should be mailed to "utpds@cs.utk.edu".

From owner-pbwg-comm@CS.UTK.EDU Thu Jul 15 12:22:44 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA23121; Thu, 15 Jul 93 12:22:44 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA28587; Thu, 15 Jul 93 12:22:22 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 15 Jul 1993 12:22:21 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from timbuk.cray.com by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA28579; Thu, 15 Jul 93 12:22:19 -0400
Received: from magnet (magnet.cray.com) by cray.com (4.1/CRI-MX 2.19) id AA03224; Thu, 15 Jul 93 11:22:16 CDT
Received: by magnet (4.1/CRI-5.13) id AA07811; Thu, 15 Jul 93 11:22:14 CDT
From: cmg@ferrari.cray.com (Charles Grassl)
Message-Id: <9307151622.AA07811@magnet>
Subject: Parkbench name
To: pbwg-comm@cs.utk.edu
Date: Thu, 15 Jul 93 11:22:12 CDT
X-Mailer: ELM [version 2.3 PL11]

The name "Parkbench" should be acceptable for the Parallel Benchmarks Working Group. Should we, the Parkbench group, register this name?

I conducted our standard search of trademark databases, and the results of that search suggest that Parkbench would be available for use as a trademark. We need to understand, however, that these searches are not perfect, and that a conflict with the rights of another user with respect to that trademark may still occur, even if unlikely.
The reason for the inaccuracy in the database searches is that trademarks can be used in the vernacular and not registered. For example, Parkbench may already be used by another group or organization without being registered. Such a group could still have rights to the name even though it is not registered. Charles Grassl Cray Research, Inc. Eagan, Minnesota USA From owner-pbwg-comm@CS.UTK.EDU Thu Jul 15 12:43:02 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA23255; Thu, 15 Jul 93 12:43:02 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA00139; Thu, 15 Jul 93 12:42:59 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 15 Jul 1993 12:42:58 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from Sun.COM by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA00127; Thu, 15 Jul 93 12:42:55 -0400 Received: from Eng.Sun.COM (zigzag-bb.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.1) id AA05419; Thu, 15 Jul 93 09:42:53 PDT Received: from cumbria.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA10786; Thu, 15 Jul 93 09:42:54 PDT Received: by cumbria.Eng.Sun.COM (5.0/SMI-SVR4) id AA00404; Thu, 15 Jul 93 09:42:35 PDT Date: Thu, 15 Jul 93 09:42:35 PDT From: Bodo.Parady@Eng.Sun.COM (Bodo Parady - PDE Performance) Message-Id: <9307151642.AA00404@cumbria.Eng.Sun.COM> To: pbwg-comm@cs.utk.edu, cmg@ferrari.cray.com Subject: Re: Parkbench name X-Sun-Charset: US-ASCII Content-Length: 419 Has anyone posted a query on comp.arch and comp.benchmarks to see if Parkbench is being used? If not, I would be happy to post it. Bodo Parady | (415) 336-0388 SMCC, Sun Microsystems | Bodo.Parady@eng.sun.com Mail Stop MTV15-404 | Domain: bodo@cumbria.eng.sun.com 2550 Garcia Ave. 
| Alt: na.parady@na-net.ornl.gov Mountain View, CA 94043-1100 | FAX: (415) 968-4873 From owner-pbwg-comm@CS.UTK.EDU Thu Jul 15 19:50:11 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA26296; Thu, 15 Jul 93 19:50:11 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA26722; Thu, 15 Jul 93 19:49:51 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 15 Jul 1993 19:49:50 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from Sun.COM by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA26714; Thu, 15 Jul 93 19:49:47 -0400 Received: from Eng.Sun.COM (zigzag-bb.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.1) id AA07467; Thu, 15 Jul 93 16:49:24 PDT Received: from cumbria.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA27030; Thu, 15 Jul 93 16:49:29 PDT Received: by cumbria.Eng.Sun.COM (5.0/SMI-SVR4) id AA01797; Thu, 15 Jul 93 16:49:09 PDT Date: Thu, 15 Jul 93 16:49:09 PDT From: Bodo.Parady@Eng.Sun.COM (Bodo Parady - PDE Performance) Message-Id: <9307152349.AA01797@cumbria.Eng.Sun.COM> To: pbwg-comm@cs.utk.edu, cmg@ferrari.cray.com Subject: Re: Parkbench name X-Sun-Charset: US-ASCII Content-Length: 151 There is reference to PARBENCH in the literature. Whether this is too close legally is another question. 
Bodo Parady SMCC, Mountain View From owner-pbwg-comm@CS.UTK.EDU Mon Jul 26 08:38:45 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA08814; Mon, 26 Jul 93 08:38:45 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA29371; Mon, 26 Jul 93 08:38:07 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 26 Jul 1993 08:38:06 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from BERRY.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA29365; Mon, 26 Jul 93 08:38:06 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (5.61++/2.7c-UTK) id AA19163; Mon, 26 Jul 93 08:38:05 -0400 Message-Id: <9307261238.AA19163@berry.cs.utk.edu> To: WCOLLIER@suvm.BITNET Cc: pbwg-comm@cs.utk.edu In-Reply-To: Your message of "Sun, 25 Jul 1993 23:39:05 EDT." <9307260345.AA15164@CS.UTK.EDU> Date: Mon, 26 Jul 1993 08:38:03 -0400 From: "Michael W. Berry" > Dear Professor Berry, > I was talking to a friend the other day who said > that people at the Universities of Tennessee and > Southhampton were about to conduct a series of > performance evaluations, called Park Bench, on > distributed multiprocessors. I would like to know > more about these tests. > Do any of the systems to be tested assume (as > shared memory multiprocessors do) that a broadcast > signal will (appear to) be received at the same time > by all processes? Or that signals will be received > in the same order in which they were sent? If so, > then there are some programs, which I would like to > show you, which test to see whether or not a machine > obeys such standards of behavior. Such programs > constitute a test of the logical behavior of a > machine, rather than its speed. They might make > an interesting and valuable addition to the > Park Bench tests. > Bill Collier Bill, The PARKBENCH suite has not formally been assembled to date. 
We have another meeting next month and then will formally present a paper for review at the Supercomputing'93 conference in Portland. I will forward your note to "pbwg-comm@.cs.utk.edu" which is the mail reflector for the group. I believe the scheduling behaviors you are referring to will be allowed, but I cannot say for sure at the present time. Stay tuned. Regards, Mike B. --- Michael W. Berry ___-___ o==o====== . . . . . Ayres 114 =========== ||// Department of \ \ |//__ Computer Science #_______/ berry@cs.utk.edu University of Tennessee (615) 974-3838 [OFF] Knoxville, TN 37996-1301 (615) 974-4404 [FAX] From owner-pbwg-comm@CS.UTK.EDU Wed Jul 28 19:06:49 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA17503; Wed, 28 Jul 93 19:06:49 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA12710; Wed, 28 Jul 93 19:04:55 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 28 Jul 1993 19:04:54 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from osiris.usi.utah.edu by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA12702; Wed, 28 Jul 93 19:04:53 -0400 Received: by osiris.usi.utah.edu (AIX 3.2/UCB 5.64/4.03) id AA41429; Wed, 28 Jul 1993 17:04:51 -0600 Date: Wed, 28 Jul 1993 17:04:51 -0600 From: stefano@osiris.usi.utah.edu (Stefano Foresti) Message-Id: <9307282304.AA41429@osiris.usi.utah.edu> To: pbwg-comm@cs.utk.edu Subject: info Cc: stefano@osiris.usi.utah.edu I have been added to the PBWG mailing list. I have tried to access netlib looking for some document to read. Besides all the benchmark packages, I have seen few latex chapters, but I couldn't find a main latex file, or some general document of the goals and procedures of this committee. I am also interested in the list of participants, and when are the meetings. 
Thank you for any information,

Stefano Foresti
Utah Supercomputing Institute
85 SSB
University of Utah
Salt Lake City, Utah 84112, USA
tel: (801)581-3173
Fax: (801)585-5366
E-mail: stefano@osiris.usi.utah.edu

From owner-pbwg-comm@CS.UTK.EDU Sun Aug 1 06:07:47 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA03532; Sun, 1 Aug 93 06:07:47 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA08779; Sun, 1 Aug 93 06:06:09 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Sun, 1 Aug 1993 06:06:08 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA08771; Sun, 1 Aug 93 06:06:06 -0400
Via: uk.ac.southampton.ecs; Sun, 1 Aug 1993 11:04:14 +0100
Via: brewery.ecs.soton.ac.uk; Sun, 1 Aug 93 10:55:32 BST
From: Vladimir Getov
Received: from beluga.ecs.soton.ac.uk by brewery.ecs.soton.ac.uk; Sun, 1 Aug 93 11:05:53 BST
Date: Sun, 1 Aug 93 11:05:45 BST
Message-Id: <24884.9308011005@beluga.ecs.soton.ac.uk>
To: pbwg-comm@cs.utk.edu, Johnsson@think.com
Subject: PARKBENCH kernels

During the workshop on Portability and Performance for Parallel Processing in Southampton a few weeks ago we had an informal discussion with Leonard Johnsson (TMC) re: benchmark kernels. Currently the TMC library routines cover:

1) Matrix utilities
   - Dense BLAS
   - Grid sparse BLAS
   - Arbitrary sparse BLAS
2) Solvers
   - Banded systems
   - Dense systems
3) Eigenanalysis
4) FFTs

These must be similar for Intel (Ed Kushner may wish to comment?) and Cray (perhaps Charles Grassl could give some details?). An important question is: `How do they map onto our proposed PARKBENCH kernels?'

Jack said he could provide some of the matrix utilities. Are there any details yet? We have from the GENESIS suite the following solvers: SOR, CG, Multigrid. Can anyone add to the FFT debate from the last meeting? The validation is an interesting problem (e.g.
FFT + reverse FFT). Do we have suitable candidate codes for these?

Tony Hey and Vladimir Getov

From owner-pbwg-comm@CS.UTK.EDU Tue Aug 3 17:17:15 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA15555; Tue, 3 Aug 93 17:17:15 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA11340; Tue, 3 Aug 93 17:14:59 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 3 Aug 1993 17:14:56 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from THUD.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA11332; Tue, 3 Aug 93 17:14:51 -0400
From: Jack Dongarra
Received: by thud.cs.utk.edu (5.61+IDA+UTK-930125/2.7c-UTK) id AA01264; Tue, 3 Aug 93 17:14:38 -0400
Date: Tue, 3 Aug 93 17:14:38 -0400
Message-Id: <9308032114.AA01264@thud.cs.utk.edu>
To: pbwg-comm@cs.utk.edu
Subject: ParkBench meeting

The fourth meeting of ParkBench (the Parallel Benchmark Working Group) will be held in Knoxville, Tennessee at the University of Tennessee on August 23rd, 1993.

The meeting site will be the:

  Science Alliance Conference Room
  South College
  University of Tennessee

(A PostScript map is included at the end of this message; South College is the building located next to Ayres Hall.)

We have made arrangements with the Hilton Hotel in Knoxville.

  Hilton Hotel
  501 W. Church Street
  Knoxville, TN
  Phone: 615-523-2300

When making arrangements tell the hotel you are associated with the Parallel Benchmarking Meeting. The rate is $64.00/night. You can rent a car or get a cab from the airport to the hotel. From the hotel to the University it is a 20 minute walk.

We should plan to start at 9:00 am on August 23rd and finish about 5:00 pm. If you will be attending the meeting please send me email so we can better arrange for the meeting.

The format of the meeting is:

  Monday 23rd August
   9:00 - 12:00  Full group meeting
  12:00 -  1:30  Lunch
   1:30 -  5:00  Full group meeting

Tentative agenda for the meeting:

1.
Minutes of last meeting
2. Reports and discussion from subgroups
3. Open discussion and agreement on further actions
4. Date and venue for next meeting

The objectives for the group are:

1. To establish a comprehensive set of parallel benchmarks that is generally accepted by both users and vendors of parallel systems.
2. To provide a focus for parallel benchmark activities and avoid unnecessary duplication of effort and proliferation of benchmarks.
3. To set standards for benchmarking methodology and result-reporting together with a control database/repository for both the benchmarks and the results.

The following mailing lists have been set up.

  pbwg-comm@cs.utk.edu        Whole committee
  pbwg-lowlevel@cs.utk.edu    Low level subcommittee
  pbwg-compactapp@cs.utk.edu  Compact applications subcommittee
  pbwg-method@cs.utk.edu      Methodology subcommittee
  pbwg-kernel@cs.utk.edu      Kernel subcommittee

All mail is being collected and can be retrieved by sending email to netlib@ornl.gov and in the mail message typing:

  send comm.archive from pbwg
  send lowlevel.archive from pbwg
  send compactapp.archive from pbwg
  send method.archive from pbwg
  send kernel.archive from pbwg
  send index from pbwg

We have set up a mail reflector for correspondence; it is called pbwg-comm@cs.utk.edu. Mail to that address will be sent to the mailing list and also collected in netlib@ornl.gov.
To retrieve the collected mail, send email to netlib@ornl.gov and in the mail message type: send comm.archive from pbwg Jack Dongarra %!PS-Adobe-2.0 EPSF-1.2 %%DocumentFonts: Helvetica-Bold Courier Courier-Bold Times-Bold %%Pages: 1 %%BoundingBox: 39 -113 604 767 %%EndComments /arrowHeight 10 def /arrowWidth 5 def /IdrawDict 54 dict def IdrawDict begin /reencodeISO { dup dup findfont dup length dict begin { 1 index /FID ne { def }{ pop pop } ifelse } forall /Encoding ISOLatin1Encoding def currentdict end definefont } def /ISOLatin1Encoding [ /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /space/exclam/quotedbl/numbersign/dollar/percent/ampersand/quoteright /parenleft/parenright/asterisk/plus/comma/minus/period/slash /zero/one/two/three/four/five/six/seven/eight/nine/colon/semicolon /less/equal/greater/question/at/A/B/C/D/E/F/G/H/I/J/K/L/M/N /O/P/Q/R/S/T/U/V/W/X/Y/Z/bracketleft/backslash/bracketright /asciicircum/underscore/quoteleft/a/b/c/d/e/f/g/h/i/j/k/l/m /n/o/p/q/r/s/t/u/v/w/x/y/z/braceleft/bar/braceright/asciitilde /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/dotlessi/grave/acute/circumflex/tilde/macron/breve /dotaccent/dieresis/.notdef/ring/cedilla/.notdef/hungarumlaut /ogonek/caron/space/exclamdown/cent/sterling/currency/yen/brokenbar /section/dieresis/copyright/ordfeminine/guillemotleft/logicalnot /hyphen/registered/macron/degree/plusminus/twosuperior/threesuperior /acute/mu/paragraph/periodcentered/cedilla/onesuperior/ordmasculine /guillemotright/onequarter/onehalf/threequarters/questiondown /Agrave/Aacute/Acircumflex/Atilde/Adieresis/Aring/AE/Ccedilla /Egrave/Eacute/Ecircumflex/Edieresis/Igrave/Iacute/Icircumflex 
/Idieresis/Eth/Ntilde/Ograve/Oacute/Ocircumflex/Otilde/Odieresis /multiply/Oslash/Ugrave/Uacute/Ucircumflex/Udieresis/Yacute /Thorn/germandbls/agrave/aacute/acircumflex/atilde/adieresis /aring/ae/ccedilla/egrave/eacute/ecircumflex/edieresis/igrave /iacute/icircumflex/idieresis/eth/ntilde/ograve/oacute/ocircumflex /otilde/odieresis/divide/oslash/ugrave/uacute/ucircumflex/udieresis /yacute/thorn/ydieresis ] def /Helvetica-Bold reencodeISO def /Courier reencodeISO def /Courier-Bold reencodeISO def /Times-Bold reencodeISO def /none null def /numGraphicParameters 17 def /stringLimit 65535 def /Begin { save numGraphicParameters dict begin } def /End { end restore } def /SetB { dup type /nulltype eq { pop false /brushRightArrow idef false /brushLeftArrow idef true /brushNone idef } { /brushDashOffset idef /brushDashArray idef 0 ne /brushRightArrow idef 0 ne /brushLeftArrow idef /brushWidth idef false /brushNone idef } ifelse } def /SetCFg { /fgblue idef /fggreen idef /fgred idef } def /SetCBg { /bgblue idef /bggreen idef /bgred idef } def /SetF { /printSize idef /printFont idef } def /SetP { dup type /nulltype eq { pop true /patternNone idef } { dup -1 eq { /patternGrayLevel idef /patternString idef } { /patternGrayLevel idef } ifelse false /patternNone idef } ifelse } def /BSpl { 0 begin storexyn newpath n 1 gt { 0 0 0 0 0 0 1 1 true subspline n 2 gt { 0 0 0 0 1 1 2 2 false subspline 1 1 n 3 sub { /i exch def i 1 sub dup i dup i 1 add dup i 2 add dup false subspline } for n 3 sub dup n 2 sub dup n 1 sub dup 2 copy false subspline } if n 2 sub dup n 1 sub dup 2 copy 2 copy false subspline patternNone not brushLeftArrow not brushRightArrow not and and { ifill } if brushNone not { istroke } if 0 0 1 1 leftarrow n 2 sub dup n 1 sub dup rightarrow } if end } dup 0 4 dict put def /Circ { newpath 0 360 arc patternNone not { ifill } if brushNone not { istroke } if } def /CBSpl { 0 begin dup 2 gt { storexyn newpath n 1 sub dup 0 0 1 1 2 2 true subspline 1 1 n 3 sub { /i exch def 
i 1 sub dup i dup i 1 add dup i 2 add dup false subspline } for n 3 sub dup n 2 sub dup n 1 sub dup 0 0 false subspline n 2 sub dup n 1 sub dup 0 0 1 1 false subspline patternNone not { ifill } if brushNone not { istroke } if } { Poly } ifelse end } dup 0 4 dict put def /Elli { 0 begin newpath 4 2 roll translate scale 0 0 1 0 360 arc patternNone not { ifill } if brushNone not { istroke } if end } dup 0 1 dict put def /Line { 0 begin 2 storexyn newpath x 0 get y 0 get moveto x 1 get y 1 get lineto brushNone not { istroke } if 0 0 1 1 leftarrow 0 0 1 1 rightarrow end } dup 0 4 dict put def /MLine { 0 begin storexyn newpath n 1 gt { x 0 get y 0 get moveto 1 1 n 1 sub { /i exch def x i get y i get lineto } for patternNone not brushLeftArrow not brushRightArrow not and and { ifill } if brushNone not { istroke } if 0 0 1 1 leftarrow n 2 sub dup n 1 sub dup rightarrow } if end } dup 0 4 dict put def /Poly { 3 1 roll newpath moveto -1 add { lineto } repeat closepath patternNone not { ifill } if brushNone not { istroke } if } def /Rect { 0 begin /t exch def /r exch def /b exch def /l exch def newpath l b moveto l t lineto r t lineto r b lineto closepath patternNone not { ifill } if brushNone not { istroke } if end } dup 0 4 dict put def /Text { ishow } def /idef { dup where { pop pop pop } { exch def } ifelse } def /ifill { 0 begin gsave patternGrayLevel -1 ne { fgred bgred fgred sub patternGrayLevel mul add fggreen bggreen fggreen sub patternGrayLevel mul add fgblue bgblue fgblue sub patternGrayLevel mul add setrgbcolor eofill } { eoclip originalCTM setmatrix pathbbox /t exch def /r exch def /b exch def /l exch def /w r l sub ceiling cvi def /h t b sub ceiling cvi def /imageByteWidth w 8 div ceiling cvi def /imageHeight h def bgred bggreen bgblue setrgbcolor eofill fgred fggreen fgblue setrgbcolor w 0 gt h 0 gt and { l b translate w h scale w h true [w 0 0 h neg 0 h] { patternproc } imagemask } if } ifelse grestore end } dup 0 8 dict put def /istroke { gsave 
brushDashOffset -1 eq { [] 0 setdash 1 setgray } { brushDashArray brushDashOffset setdash fgred fggreen fgblue setrgbcolor } ifelse brushWidth setlinewidth originalCTM setmatrix stroke grestore } def /ishow { 0 begin gsave fgred fggreen fgblue setrgbcolor /fontDict printFont printSize scalefont dup setfont def /descender fontDict begin 0 [FontBBox] 1 get FontMatrix end transform exch pop def /vertoffset 1 printSize sub descender sub def { 0 vertoffset moveto show /vertoffset vertoffset printSize sub def } forall grestore end } dup 0 3 dict put def /patternproc { 0 begin /patternByteLength patternString length def /patternHeight patternByteLength 8 mul sqrt cvi def /patternWidth patternHeight def /patternByteWidth patternWidth 8 idiv def /imageByteMaxLength imageByteWidth imageHeight mul stringLimit patternByteWidth sub min def /imageMaxHeight imageByteMaxLength imageByteWidth idiv patternHeight idiv patternHeight mul patternHeight max def /imageHeight imageHeight imageMaxHeight sub store /imageString imageByteWidth imageMaxHeight mul patternByteWidth add string def 0 1 imageMaxHeight 1 sub { /y exch def /patternRow y patternByteWidth mul patternByteLength mod def /patternRowString patternString patternRow patternByteWidth getinterval def /imageRow y imageByteWidth mul def 0 patternByteWidth imageByteWidth 1 sub { /x exch def imageString imageRow x add patternRowString putinterval } for } for imageString end } dup 0 12 dict put def /min { dup 3 2 roll dup 4 3 roll lt { exch } if pop } def /max { dup 3 2 roll dup 4 3 roll gt { exch } if pop } def /midpoint { 0 begin /y1 exch def /x1 exch def /y0 exch def /x0 exch def x0 x1 add 2 div y0 y1 add 2 div end } dup 0 4 dict put def /thirdpoint { 0 begin /y1 exch def /x1 exch def /y0 exch def /x0 exch def x0 2 mul x1 add 3 div y0 2 mul y1 add 3 div end } dup 0 4 dict put def /subspline { 0 begin /movetoNeeded exch def y exch get /y3 exch def x exch get /x3 exch def y exch get /y2 exch def x exch get /x2 exch def y exch get 
/y1 exch def x exch get /x1 exch def y exch get /y0 exch def x exch get /x0 exch def x1 y1 x2 y2 thirdpoint /p1y exch def /p1x exch def x2 y2 x1 y1 thirdpoint /p2y exch def /p2x exch def x1 y1 x0 y0 thirdpoint p1x p1y midpoint /p0y exch def /p0x exch def x2 y2 x3 y3 thirdpoint p2x p2y midpoint /p3y exch def /p3x exch def movetoNeeded { p0x p0y moveto } if p1x p1y p2x p2y p3x p3y curveto end } dup 0 17 dict put def /storexyn { /n exch def /y n array def /x n array def n 1 sub -1 0 { /i exch def y i 3 2 roll put x i 3 2 roll put } for } def %%EndProlog %%BeginIdrawPrologue /arrowhead { 0 begin transform originalCTM itransform /taily exch def /tailx exch def transform originalCTM itransform /tipy exch def /tipx exch def /dy tipy taily sub def /dx tipx tailx sub def /angle dx 0 ne dy 0 ne or { dy dx atan } { 90 } ifelse def gsave originalCTM setmatrix tipx tipy translate angle rotate newpath arrowHeight neg arrowWidth 2 div moveto 0 0 lineto arrowHeight neg arrowWidth 2 div neg lineto patternNone not { originalCTM setmatrix /padtip arrowHeight 2 exp 0.25 arrowWidth 2 exp mul add sqrt brushWidth mul arrowWidth div def /padtail brushWidth 2 div def tipx tipy translate angle rotate padtip 0 translate arrowHeight padtip add padtail add arrowHeight div dup scale arrowheadpath ifill } if brushNone not { originalCTM setmatrix tipx tipy translate angle rotate arrowheadpath istroke } if grestore end } dup 0 9 dict put def /arrowheadpath { newpath arrowHeight neg arrowWidth 2 div moveto 0 0 lineto arrowHeight neg arrowWidth 2 div neg lineto } def /leftarrow { 0 begin y exch get /taily exch def x exch get /tailx exch def y exch get /tipy exch def x exch get /tipx exch def brushLeftArrow { tipx tipy tailx taily arrowhead } if end } dup 0 4 dict put def /rightarrow { 0 begin y exch get /tipy exch def x exch get /tipx exch def y exch get /taily exch def x exch get /tailx exch def brushRightArrow { tipx tipy tailx taily arrowhead } if end } dup 0 4 dict put def %%EndIdrawPrologue %I 
Idraw 10 Grid 2.84217e-39 0 %%Page: 1 1 Begin %I b u %I cfg u %I cbg u %I f u %I p u %I t [ 0.799705 0 0 0.799705 0 0 ] concat /originalCTM matrix currentmatrix def Begin %I Pict %I b u %I cfg u %I cbg u %I f u %I p u %I t [ 1 0 0 1 89.1002 831.2 ] concat Begin %I Elli %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.09512e-08 0.9 -0.9 1.09512e-08 584.1 0.89999 ] concat %I 79 550 18 17 Elli End Begin %I Line %I b 65535 3 0 1 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.09512e-08 0.9 -0.9 1.09512e-08 541.8 -27 ] concat %I 110 466 110 542 Line %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.09512e-08 0.9 -0.9 1.09512e-08 541.8 -27 ] concat %I 82 504 140 504 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-helvetica-bold-r-*-140-* Helvetica-Bold 14 SetF %I t [ 1.2168e-08 1 -1 1.2168e-08 35.5 66.1 ] concat %I [ (N) ] Text End End %I eop Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.23147 0.0157385 -0.0157385 -1.23147 409.218 -127.169 ] concat %I [ (Voluteer Boulevard) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 3.66661e-08 3.01333 -3.01333 3.66661e-08 57.9267 164.513 ] concat %I [ (UT Campus -- Jack Dongarra's Lab) ] Text End Begin %I Rect none SetB %I b n %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 1 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 693.5 -156 ] concat %I 17 61 177 478 Rect End Begin %I Line %I b 65535 2 1 1 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 6.9812e-09 0.698798 -0.573736 8.50294e-09 453.281 -79.7194 ] concat %I 158 545 1015 545 Line %I 1 End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 
333.274 -81.9323 ] concat %I 257 403 4 4 Elli End Begin %I Line %I b 65535 2 0 1 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 7.02094e-09 0.702776 -0.577002 8.55135e-09 455.344 -80.5588 ] concat %I 211 544 211 17 Line %I 1 End Begin %I BSpl %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 6.9812e-09 0.698798 -0.573736 8.50294e-09 455.594 -124.448 ] concat %I 5 503 545 514 535 546 529 628 529 628 529 5 BSpl %I 1 End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 333.274 76.6169 ] concat %I 257 403 4 4 Elli End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 403.382 -81.1611 ] concat %I 257 403 4 4 Elli End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.15378e-08 1.1549 -0.948215 1.40528e-08 222.192 481.065 ] concat %I [ (DOWN) (TOWN) ] Text End Begin %I Poly %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.75 SetP %I t [ 2.43189e-09 0.222447 -0.19986 2.70672e-09 310.54 254.841 ] concat %I 4 301 50 850 50 850 537 301 537 4 Poly End Begin %I BSpl %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 9 251 541 251 516 251 507 257 495 263 489 413 491 472 496 485 499 485 499 9 BSpl %I 1 End Begin %I BSpl %I b 65520 2 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.18822e-09 0.43379 -0.755115 5.27834e-09 620.126 111.397 ] concat %I 6 486 498 502 500 510 504 514 514 514 529 513 542 6 BSpl %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 9.38371e-09 
0.771182 -0.771182 9.38371e-09 628.85 -52.2378 ] concat %I 476 654 476 542 Line %I 1 End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 445.797 -81.1611 ] concat %I 257 403 4 4 Elli End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 252.947 159.612 ] concat %I [ (Volunteer Boulevard) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 310.947 152.672 ] concat %I [ (Neyland Drive) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 198.965 155.756 ] concat %I [ (Cumberland Avenue) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 130.329 90.977 ] concat %I [ (Interstate 40) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 130.329 403.305 ] concat %I [ (Interstate 40) ] Text End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0 SetP %I t [ 5.81767e-09 0.582331 -0.478114 7.08579e-09 332.433 366.124 ] concat %I 257 403 4 4 Elli End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 107 542 486 541 Line %I 1 End Begin %I Line %I b 65520 2 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 487 541 664 541 Line %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg 
White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 665 541 682 540 Line %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -51.4666 ] concat %I 681 599 681 485 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.931845 2.26773e-08 -2.26773e-08 -0.931845 205.214 462.905 ] concat %I [ (Henley Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.931845 2.26773e-08 -2.26773e-08 -0.931845 193.647 318.042 ] concat %I [ (17th Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 110.137 205.729 ] concat %I [ (17th Street Exit) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 110.137 26.0435 ] concat %I [ (Airport/Alcoa Highway Exit) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-times-bold-r-*-140-* Times-Bold 14 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 171.727 80.0262 ] concat %I [ (Cumberland Avenue Exit) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 265.915 92.3651 ] concat %I [ (Neyland Drive Exit) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 110.137 509.574 ] concat %I [ (Summit Hill Exit) ] Text End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 144 661 155 636 Line %I 1 End Begin %I Line %I b 65535 
0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 372 662 361 634 Line %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2379 ] concat %I 752 660 736 634 Line %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2378 ] concat %I 186 466 160 486 Line %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 9.38371e-09 0.771182 -0.771182 9.38371e-09 628.85 -52.2378 ] concat %I 180 576 157 542 Line %I 1 End Begin %I Line %I b 65520 0 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -5.99997 ] concat %I 475 50 328 492 Line %I 1 End Begin %I Line %I b 65520 0 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -5.99994 ] concat %I 475 483 329 589 Line %I 1 End Begin %I Line %I b 65520 0 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -6 ] concat %I 962 483 450 588 Line %I 1 End Begin %I Line %I b 65520 0 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 -6 ] concat %I 450 491 474 471 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 456.136 36.7289 ] concat %I [ (To Airport) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 9.63786e-09 0.792068 -0.792068 9.63786e-09 137.137 639.729 ] 
concat %I [ (To Ashville, Bristol) ] Text End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ -0.288762 0.966405 -0.966405 -0.288762 1070.92 66.5381 ] concat %I 743 193 51 70 Elli End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 600.094 464.292 ] concat %I 11 203 116 212 116 212 140 226 140 226 122 222 122 222 102 212 102 212 110 203 110 203 113 11 Poly End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.0258e-08 0.843029 -0.843029 1.0258e-08 619.253 516.221 ] concat %I 12 183 114 183 124 193 124 193 135 205 135 205 125 248 90 242 83 239 86 231 76 194 106 194 114 12 Poly End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.11572e-08 0.810855 -0.916932 9.86645e-09 659.418 526.458 ] concat %I 8 251 106 251 126 300 126 300 126 300 116 289 116 289 106 289 106 8 Poly End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.63639e-08 1.34483 -1.34483 1.63639e-08 1018.44 -282.452 ] concat %I 791 383 799 395 Rect End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 900.11 -9.08362 ] concat %I 12 776 339 776 352 798 352 803 356 808 350 807 348 812 342 808 336 803 341 789 343 782 333 776 333 12 Poly End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.13408e-08 0.952586 -0.932017 1.1591e-08 882.447 11.9457 ] concat %I 885 359 901 436 Rect End Begin %I Elli %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 1 SetP %I t [ -0.249802 0.836016 -0.836016 -0.249802 1016.8 155.898 ] concat %I 743 193 51 70 Elli 
End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.75 SetP %I t [ -0.289327 0.966236 -0.966236 -0.289327 1071.6 80.2259 ] concat %I 707 160 754 232 Rect End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 0.240969 0.462035 -0.606024 0.318423 615.648 549.235 ] concat %I 13 164 162 182 162 182 167 235 167 234 162 254 162 254 134 234 133 235 129 183 129 183 134 164 134 164 149 13 Poly End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 9.95725e-09 0.885621 -0.818318 1.07762e-08 598.215 291.204 ] concat %I 385 148 422 197 Rect End Begin %I Pict %I b u %I cfg Black 0 0 0 SetCFg %I cbg u %I f u %I p u %I t [ 1.05665 0 0 1.05665 213.224 -6.32959 ] concat Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg DkGray 0.501961 0.501961 0.501961 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 8.1286e-09 0.496511 -0.668033 6.04153e-09 312.332 583.056 ] concat %I 13 204 92 204 123 226 123 226 113 248 113 248 123 264 123 264 92 248 92 248 95 226 95 226 92 225 92 13 Poly End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg DkGray 0.501961 0.501961 0.501961 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 8.1286e-09 -0.496511 -0.668033 -6.04153e-09 312.332 845.214 ] concat %I 13 204 92 204 123 226 123 226 113 248 113 248 123 264 123 264 92 248 92 248 95 226 95 226 92 225 92 13 Poly End Begin %I Rect none SetB %I b n %I cfg DkGray 0.501961 0.501961 0.501961 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 8.1286e-09 0.496511 -0.668033 6.04153e-09 311.664 582.559 ] concat %I 258 92 274 121 Rect End End %I eop Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 532.584 731.01 ] concat %I [ (Physics) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 
1.09095 -1.09095 1.32746e-08 555.924 783.814 ] concat %I [ (Geography) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 565.058 787.873 ] concat %I [ (& Geology) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 0.737179 0.804195 -0.804195 0.737179 504.155 697.286 ] concat %I [ (Biology) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 366.956 772.951 ] concat %I [ (13th Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 511.152 538.704 ] concat %I [ (Voluteer Boulevard) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 0.46602 0.986397 -0.986397 0.466021 521.533 631.373 ] concat %I [ (Middle Way) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 373.044 540.56 ] concat %I [ (16th Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 363.799 672.151 ] concat %I [ (Walters) (Life) (Sciences) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 537.034 899.133 ] concat %I [ (Daughtery) (Engineering) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.60693e-08 1.32061 -1.32061 1.60693e-08 657.627 707.825 ] concat %I [ (Neyland) (Stadium) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.0536 -0.283024 0.283024 -1.0536 624.893 639.464 ] concat %I [ (Stadium Drive) ] Text End 
Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 450.384 486.441 ] concat %I [ (Library) ] Text End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.645 -1.75617 ] concat %I 4 483 431 523 431 523 391 481 389 4 Poly End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 412.866 555.498 ] concat %I [ (University) ( Center) ] Text End Begin %I BSpl %I b 65520 2 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.438 0.0836792 ] concat %I 6 753 467 837 468 841 464 846 464 843 420 841 419 6 BSpl %I 1 End Begin %I BSpl %I b 65520 2 0 0 [12 4] 17 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.921 0.182861 ] concat %I 6 788 313 830 316 840 348 841 386 843 417 843 418 6 BSpl %I 1 End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.435 0.47699 ] concat %I 887 450 839 438 Line %I 1 End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.589 0.222778 ] concat %I 9 807 460 838 460 838 405 816 404 816 394 831 394 831 379 807 378 806 379 9 Poly End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 895.838 0.0155029 ] concat %I 796 369 781 389 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 523.481 806.106 ] concat %I [ 
(South) (College) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 1.21714e-08 1.00029 -1.00029 1.21714e-08 439.801 711.069 ] concat %I [ (Ayres Hall) ] Text End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07786e-08 0.885813 -0.885813 1.07786e-08 771.842 61.8701 ] concat %I 943 284 915 303 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 459.421 900.429 ] concat %I [ (Dabney/) (Buhler) ] Text End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.07786e-08 0.885813 -0.885813 1.07786e-08 767.413 57.441 ] concat %I 698 424 663 374 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-times-bold-r-*-140-* Times-Bold 14 SetF %I t [ 1.22729e-08 1.00862 -1.00862 1.22729e-08 464.596 743.589 ] concat %I [ (X) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09087 0.0128026 -0.0128027 -1.09087 503.693 612.151 ] concat %I [ (Stadium Drive) ] Text End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p < cc cc 33 33 cc cc 33 33 > -1 SetP %I t [ 1.07786e-08 0.885813 -0.885813 1.07786e-08 855.109 167.282 ] concat %I 465 390 498 427 Rect End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 6.73658e-10 0.0553633 -0.0553633 6.73658e-10 494.25 582.008 ] concat %I 11 509 1018 237 1018 237 1114 -35 1114 -35 1018 -307 1018 -307 746 -35 746 -35 378 509 378 509 380 11 Poly End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ 1.32746e-08 1.09095 -1.09095 1.32746e-08 485.903 693.513 ] concat %I [ (Psychology) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f 
*-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 9.87788e-09 0.811792 -0.811792 9.87788e-09 526.522 575.313 ] concat %I [ (Parking) (Garage) ] Text End Begin %I Line %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p < cc cc 33 33 cc cc 33 33 > -1 SetP %I t [ 1.07786e-08 0.885813 -0.885813 1.07786e-08 774.943 159.309 ] concat %I 481 283 491 300 Line %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 368.693 657.005 ] concat %I [ (15th Street) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -1.09095 2.65492e-08 -2.65492e-08 -1.09095 374.042 893.422 ] concat %I [ (11th Street) ] Text End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07785e-08 0.885813 -0.885813 1.07785e-08 786.901 202.714 ] concat %I 18 476 304 494 301 501 299 527 283 545 269 569 255 590 244 611 241 648 238 672 234 686 219 705 203 741 204 767 217 776 236 774 284 776 343 773 538 18 BSpl %I 1 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07785e-08 0.885813 -0.885813 1.07785e-08 786.901 202.714 ] concat %I 376 429 375 538 Line %I 1 End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07785e-08 0.885813 -0.885813 1.07785e-08 786.901 202.714 ] concat %I 8 376 429 375 275 369 238 352 223 345 219 323 208 303 203 303 204 8 BSpl %I 1 End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.07785e-08 0.885813 -0.885813 1.07785e-08 786.901 202.714 ] concat %I 4 301 50 850 50 850 537 301 537 4 Poly End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 793 
-5.99995 ] concat %I 5 814 52 872 66 910 82 947 107 961 127 5 BSpl %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.520591 0.958718 -0.958718 -0.520591 712.173 866.792 ] concat %I [ (Neyland Drive) ] Text End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 653.5 ] concat %I 24 459 412 422 314 224 273 224 233 224 152 305 80 345 72 386 31 418 -33 418 -122 394 -316 474 -461 652 -542 854 -566 1015 -566 1152 -501 1184 -445 1176 -203 1168 15 1031 233 789 314 547 314 432 351 428 350 24 BSpl %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 653.5 ] concat %I 916 415 915 1204 Line %I 8 End Begin %I BSpl %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 620.75 ] concat %I 5 486 195 402 153 394 72 402 -73 394 -73 5 BSpl %I 8 End Begin %I BSpl %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 604.375 ] concat %I 11 133 415 132 396 110 602 231 715 316 751 330 765 387 857 351 977 344 1126 358 1041 351 1190 11 BSpl %I 8 End Begin %I MLine %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 604.375 ] concat %I 3 133 410 153 -454 613 -2226 3 MLine %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 461.75 604.375 ] concat %I 663 276 137 276 Line %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 527 
522.5 ] concat %I 96 105 808 79 Line %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 527 522.5 ] concat %I 95 105 -425 89 Line %I 8 End Begin %I Line %I b 65535 3 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.52099e-09 0.125 -0.125 1.52099e-09 483.5 457 ] concat %I 99 587 3984 588 Line %I 8 End Begin %I BSpl %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 6.91136e-09 0.653301 -0.567997 7.94935e-09 450.164 -70.1652 ] concat %I 15 211 347 224 320 247 286 278 265 315 254 368 252 499 251 582 255 629 257 783 266 863 302 880 318 903 384 900 434 898 545 15 BSpl %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 1.2168e-08 1 -1 1.2168e-08 288.5 532.5 ] concat %I [ (Jack Dongarra's office in Ayres Hall Room 107) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ -0.792068 1.92757e-08 -1.92757e-08 -0.792068 428.69 73.2605 ] concat %I [ (Airport/Alcoa Highway) ] Text End Begin %I Rect %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p < 88 44 22 11 88 44 22 11 > -1 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 610 221 ] concat %I 258 413 267 424 Rect End Begin %I Rect %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p < 88 44 22 11 88 44 22 11 > -1 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 593 221 ] concat %I 258 413 267 424 Rect End Begin %I MLine %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 773 -23 ] concat %I 3 558 602 521 602 507 598 3 MLine %I 1 End Begin %I Line %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 773 -23 ] concat %I 
326 526 326 467 Line %I 1 End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 773 -23 ] concat %I 564 368 588 385 Rect End Begin %I Rect %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 1.2168e-08 1 -1 1.2168e-08 773 26.9999 ] concat %I 564 368 588 385 Rect End Begin %I Poly %I b 65535 2 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg %I p 0.5 SetP %I t [ 8.96587e-09 0.736842 -0.736842 8.96587e-09 708.816 120.079 ] concat %I 4 616 414 631 414 631 431 616 431 4 Poly End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.855758 0.676646 -0.676646 -0.855758 381.905 599.524 ] concat %I [ (Law Builfinh) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.855758 0.676646 -0.676646 -0.855758 383.905 547.524 ] concat %I [ (Pan-Helenic Bldg.) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-80-* Courier 8 SetF %I t [ -0.855758 0.676646 -0.676646 -0.855758 384.905 572.524 ] concat %I [ (International House.) 
] Text End Begin %I MLine %I b 65535 0 0 0 [] 0 SetB %I cfg Black 0 0 0 SetCFg %I cbg White 1 1 1 SetCBg none SetP %I p n %I t [ 1.2168e-08 1 -1 1.2168e-08 773 -23 ] concat %I 2 559 582 507 581 2 MLine %I 1 End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 186.965 539.756 ] concat %I [ (Ramada Inn) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-medium-r-*-100-* Courier 10 SetF %I t [ 1.13387e-08 0.931845 -0.931845 1.13387e-08 167.965 538.756 ] concat %I [ (Hilton) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ 1.2168e-08 1 -1 1.2168e-08 514.5 56.5 ] concat %I [ (Directions from the airport to Ayres Hall:) () ( Alcoa Highway North to Cumberland Avenue) () ( Cumberland Avenue east to Stadium Drive) ( \(Stadium Dr. is accross from 15th St.\)) () ( Park at Parking Garage and walk up hill) ( to largest building, Ayres Hall) ] Text End Begin %I Text %I cfg Black 0 0 0 SetCFg %I f *-courier-bold-r-*-120-* Courier-Bold 12 SetF %I t [ -0.0156231 0.999878 -0.999878 -0.0156231 651.985 47.9375 ] concat %I [ (Jack Dongarra's office phone 615-974-8295) ] Text End End %I eop showpage %%Trailer end From owner-pbwg-comm@CS.UTK.EDU Fri Aug 13 07:12:41 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA25383; Fri, 13 Aug 93 07:12:41 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA15107; Fri, 13 Aug 93 07:07:05 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 13 Aug 1993 07:06:59 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA15099; Fri, 13 Aug 93 07:06:54 -0400 Via: uk.ac.southampton.ecs; Fri, 13 Aug 1993 12:06:13 +0100 From: R.Hockney@parallel-applications-centre.southampton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Fri, 13 Aug 
93 11:57:13 BST
Date: Fri, 13 Aug 93 11:05:39 GMT
Message-Id: <28817.9308131105@calvados.pac.soton.ac.uk>
To: pbwg-comm@cs.utk.edu
Subject: Second Draft PARKBENCH Report

Second draft of PARKBENCH Report
A Message from Your Chairman
Roger Hockney, 13 Aug 1993
----------------------------

In view of the committee's aim to agree on a second draft text at the
meeting on August 23, 1993, it would help if each subcommittee leader
produced a draft of their chapter as LATEX files which fit into the
standard framework that I laid out for the May 24th meeting.

The second draft will use the following files:

(1)  benrep2.tex - control file for main report (2nd draft)
(2)  bencom1.tex - command definition file (no additions received)
(3)  benref1.bib - start of bibliography

Additions to the command file and bibliography can be sent to me as
files bencom2.tex, benref2.bib etc.

The control file reads the following files, which are to be provided by
the leader of each subcommittee:

(4)  intro1.tex  - Roger Hockney for whole committee
(5)  method4.tex - David Bailey for Methodology subcommittee
(6)  lowlev2.tex - Roger Hockney for the low-level subcommittee
(7)  kernel2.tex - Tony Hey for the kernel subcommittee
(8)  compac2.tex - David Walker for compact applications subcommittee
(9)  compil2.tex - Tom Haupt for compiler benchmarks subcommittee
(10) conclu1.tex - Roger Hockney for whole committee

Provided all the above 10 files are present, it should be possible to
assemble the report with the commands:

      latex benrep2
      bibtex benrep2

repeated a few(?) times until latex stops complaining. Then printed with:

      dvips -o benrep2.ps benrep2
      lpr benrep2.ps

or the equivalent local dialects thereof. I give these instructions
because there were complaints last time that the document would not
print properly. The above recipe works on our Sun Unix system at
Southampton.
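
The assembly recipe above can be sketched as a small script. This is only an illustrative Python sketch and not part of the original message: the function names, the cap of five passes, and the use of the "Rerun to get cross-references right" warning in the .log file as the signal that latex is still "complaining" are all assumptions. The lpr step is left out because printing is site-specific.

```python
import subprocess
from pathlib import Path

# Assumption: LaTeX signals unresolved cross-references with this warning text.
RERUN_MARKER = "Rerun to get cross-references right"


def needs_rerun(log_text):
    """Return True if the LaTeX log asks for another pass."""
    return RERUN_MARKER in log_text


def assemble(job="benrep2", max_passes=5, run=subprocess.run, read_log=None):
    """Run latex, bibtex, then latex again until the log is quiet, then dvips.

    `run` and `read_log` are injectable so the control flow can be
    exercised without a TeX installation (hypothetical design choice).
    Returns the list of commands that were issued, in order.
    """
    if read_log is None:
        read_log = lambda: Path(f"{job}.log").read_text(errors="replace")

    issued = []

    def call(cmd):
        issued.append(cmd)
        run(cmd, check=True)

    call(["latex", job])       # first pass writes the .aux file
    call(["bibtex", job])      # resolve citations against the .bib file
    for _ in range(max_passes):
        call(["latex", job])   # repeat "a few(?) times"
        if not needs_rerun(read_log()):
            break
    call(["dvips", "-o", f"{job}.ps", job])
    return issued
```

Calling `assemble()` with no arguments in a directory holding the ten files would issue the same commands as the recipe, stopping the latex passes once the log no longer asks for a rerun.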
Unlike last time, I would like to assemble the second draft at Southampton to make sure everything works (there were problems with some symbols last time when printed at utk). This means that I need the following by e-mail by 19th August:

(11) Roger Hockney's revision of his methodology draft in the light of the last meeting (first draft was method3.tex). This second draft will be known as method4.tex.
(12) David Bailey's editing of method4.tex, which should include his rewrite of the section on Speedup. Or he can provide this separately.
(13) Roger Hockney's second draft of lowlevel as file lowlev2.tex
(14) Tony Hey's second draft of kernel benchmarks as file kernel2.tex
(15) David Walker's second draft of compact appls. as file compac2.tex
(16) Tom Haupt's second draft compiler benchmarks as file compil2.tex
(17) I do not plan to write intro1.tex and conclu1.tex until just prior to the November meeting at Super93, Portland.

I would be obliged if the named subcommittee leaders could send their contributions to 'pbwg-comm@cs.utk.edu' so that all may see them prior to the meeting, and I can try to assemble the whole second draft. The files sent by e-mail should contain a three-line comment header in the format:

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: kernel2.tex
%------------------------------------------------------------------------

I realise this timetable may be unrealistic for some of our busy members, but please do what you can to follow it. I shall try to send items (11) and (13) shortly after this.

As a special favour I would like Michael Berry to try to assemble the second draft also at utk, and keep in touch with me by e-mail if things go wrong; I hope that's OK. We then have a safety net.

Best wishes,
Roger Hockney.

From owner-pbwg-comm@CS.UTK.EDU Fri Aug 13 09:57:23 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib)
    id AA26766; Fri, 13 Aug 93 09:57:23 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK)
    id AA26195; Fri, 13 Aug 93 09:55:10 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 13 Aug 1993 09:55:08 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK)
    id AA26167; Fri, 13 Aug 93 09:55:03 -0400
Via: uk.ac.southampton.ecs; Fri, 13 Aug 1993 14:54:21 +0100
From: R.Hockney@parallel-applications-centre.southampton.ac.uk
Via: calvados.pac.soton.ac.uk (plonk); Fri, 13 Aug 93 14:44:57 BST
Date: Fri, 13 Aug 93 13:53:25 GMT
Message-Id: <204.9308131353@calvados.pac.soton.ac.uk>
To: pbwg-comm@cs.utk.edu
Subject: PARKBENCH LATEX FILES

BASIC LATEX FILES FOR PARKBENCH REPORT (second draft)
-----------------------------------------------------

The second draft uses the following files:

Basic control files are appended to this e-mail:
(1) benrep2.tex - control file for main report (2nd draft)
(2) bencom1.tex - command definition file (no additions received)
(3) benref1.bib - start of bibliography (Xian-He Sun's refs. added)

Individual chapters:
(4) intro1.tex - Roger Hockney: Introduction, absent from second draft
(5) method4.tex - David Bailey for Methodology subcommittee
(6) lowlev2.tex - Roger Hockney for the low-level subcommittee
(7) kernel2.tex - Tony Hey for the kernel subcommittee
(8) compac2.tex - David Walker for compact applications subcommittee
(9) compil2.tex - Tom Haupt for compiler benchmarks subcommittee
(10) conclu1.tex - Roger Hockney: Conclusions, absent from second draft

Items (5) to (9) inclusive should be sent by authors to pbwg-comm when ready.
Items (4) and (10) will not be provided in the second draft.

Dummy files are appended below for items (4) to (10) so that a skeleton report can be produced.
There follow the three basic control files:

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: benrep2.tex
%------------------------------------------------------------------------
%
% **************************************************************
% STANDARD INTERNATIONAL BENCHMARKS FOR PARALLEL COMPUTERS
% **************************************************************
%
\input{bencom1.tex} % define new commands for benchmark report
% ----------------------------------------------------------------------------
\documentstyle[]{report} % Specifies the document style.
\textheight 8.25 true in
\textwidth 5.625 true in
\topmargin -0.13 true in
\oddsidemargin 0.25 true in
\evensidemargin 0.25 true in
% The preamble begins here.
\title{Standard International Benchmarks for Parallel Computers}
% ----------------------------------------------------------------------------
\author{PARKBENCH Committee \\ draft assembled by Roger Hockney (chairman)}
\date{13 August 1993 - draft 3}
% ----------------------------------------------------------------------------
\begin{document} % End of preamble and beginning of text.
\sloppy
\maketitle % Produces the title.
% ----------------------------------------------------------------------------
\input{intro1.tex} % Introduction
% responsibility of Roger Hockney for whole committee
% ----------------------------------------------------------------------------
\input{method4.tex} % Chapter1
% responsibility of David Bailey for Methodology subcommittee
% ----------------------------------------------------------------------------
\input{lowlev2.tex} % Chapter2
% responsibility of Roger Hockney for Low-level benchmarks subcommittee
% ----------------------------------------------------------------------------
\input{kernel2.tex} % Chapter3
% responsibility of Tony Hey for Kernel benchmarks subcommittee
% ----------------------------------------------------------------------------
\input{compac2.tex} % Chapter4
% responsibility of David Walker for Compact Applications subcommittee
% ----------------------------------------------------------------------------
\input{compil2.tex} % Chapter5
% responsibility of Tom Haupt for Compiler Benchmarks subcommittee
% ----------------------------------------------------------------------------
\input{conclu1.tex} % Conclusions
% responsibility of Roger Hockney for whole committee
% ----------------------------------------------------------------------------
\vspace{0.35in}
{\large \bf Acknowledgments}
\bibliography{benref1}
\bibliographystyle{unsrt}
\end{document} % End of document.
%
%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: bencom1.tex
%------------------------------------------------------------------------
%
% **************************************************************
% LATEX COMMANDS FOR PARKBENCH REPORTS
% **************************************************************
%
\def\flop{\mathop{\rm flop}\nolimits}
\def\pipe{\mathop{\rm pipe}\nolimits}
\newcommand{\Suprenum}{\mbox{\sc SUPRENUM}}
\newcommand{\usec}{\mbox{\rm $\mu$s}}
\newcommand{\where}{\mbox{\rm where}}
\newcommand{\rmand}{\mbox{\rm and}}
\newcommand{\Mflops}{\mbox{\rm Mflop/s}}
\newcommand{\flops}{\mbox{\rm flop/s}}
\newcommand{\flopB}{\mbox{\rm flop/B}}
\newcommand{\tstepps}{\mbox{\rm tstep/s}}
\newcommand{\MWps}{\mbox{\rm MW/s}}
\newcommand{\Mwps}{\mbox{\rm Mw/s}}
\newcommand{\spone}{\mbox{\ }}
\newcommand{\sptwo}{\mbox{\ \ }}
\newcommand{\spfour}{\mbox{\ \ \ \ }}
\newcommand{\spsix}{\mbox{\ \ \ \ \ \ }}
\newcommand{\speight}{\mbox{\ \ \ \ \ \ \ \ }}
\newcommand{\spten}{\mbox{\ \ \ \ \ \ \ \ \ \ }}
\newcommand{\rinf}{\mbox{$r_\infty$}}
\newcommand{\Rinf}{\mbox{$R_\infty$}}
\newcommand{\nhalf}{\mbox{$n_{\frac{1}{2}}$}}
\newcommand{\fhalf}{\mbox{$f_{\frac{1}{2}}$}}
\newcommand{\Nhalf}{\mbox{$N_{\frac{1}{2}}$}}
\newcommand{\phalf}{\mbox{$p_{\frac{1}{2}}$}}
\newcommand{\rhat}{\mbox{$\hat{r}$}}
\newcommand{\Phalf}{\mbox{$P_{\frac{1}{2}}$}}
\newcommand{\half}{\mbox{$\frac{1}{2}$}}
\newcommand{\rnhalf}{\mbox{(\rinf,\nhalf)}}
\newcommand{\rfhalf}{\mbox{(\rhat,\fhalf)}}
\newcommand{\RNhalf}{\mbox{(\Rinf,\Nhalf)}}
\newcommand{\third}{\mbox{$\frac{1}{3}$}}
\newcommand{\quart}{\mbox{$\frac{1}{4}$}}
\newcommand{\eighth}{\mbox{$\frac{1}{8}$}}
\newcommand{\nineth}{\mbox{$\frac{1}{9}$}}
% ----------------------------------------------------------------------------

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: benref1.bib
%------------------------------------------------------------------------
@book{HoJe81,
  author = "Roger W. Hockney and Christopher R. Jesshope",
  title = "{Parallel Computers: Architecture, Programming and Algorithms}",
  publisher = "Adam Hilger",
  address = "Bristol",
  year = "1981",
}

@book{HoJe88,
  author = "Roger W. Hockney and Christopher R. Jesshope",
  title = "{Parallel Computers 2: Architecture, Programming and Algorithms}",
  publisher = "Adam Hilger/IOP Publishing",
  address = "Bristol \& Philadelphia",
  year = "1988",
  edition = "second",
  note = "Distributed in the USA by IOP Publ. Inc., Public Ledger Bldg., Suite 1035, Independence Square, Philadelphia, PA 19106."
}

@book{Super,
  key = "Super",
  title = {Supercomputer},
  publisher = "ASFRA",
  address = "Edam, Netherlands"
}

@book{SI75,
  key = "Royal Society",
  organization = "{Symbols Committee of the Royal Society}",
  title = {Quantities, Units and Symbols},
  publisher = "The Royal Society",
  address = "London",
  year = 1975
}

@article{Berr89,
  author = "M. Berry and D. Chen and P. Koss and D. Kuck and S. Lo and Y. Pang and L. Pointer and R. Roloff and A. Sameh and E. Clementi and S. Chin and D. Schneider and G. Fox and P. Messina and D. Walker and C. Hsiung and J. Schwarzmeier and K. Lue and S. Orszag and F. Seidl and O. Johnson and R. Goodrum and J. Martin",
  title = "{The PERFECT Club Benchmarks: Effective Performance Evaluation of Computers}",
  journal = {Intl. J. Supercomputer Appls.},
  volume = 3,
  number = 3,
  year = 1989,
  pages = "5-40"
}

@incollection{Ma88,
  author = "F. H. McMahon",
  title = "{The Livermore Fortran Kernels test of the Numerical Performance Range}",
  editor = "J. L. Martin",
  booktitle = {Performance Evaluation of Supercomputers},
  publisher = "Elsevier Science B.V., North-Holland",
  address = "Amsterdam",
  year = 1988,
  pages = "143-186"
}

@article{Mess90,
  author = "P. Messina and C. Baillie and E. Felten and P. Hipes and R. Williams and A. Alagar and A. Kamrath and R. Leary and W. Pfeiffer and J. Rogers and D.
    Walker",
  title = "Benchmarking advanced architecture computers",
  journal = {Concurrency: Practice and Experience},
  volume = 2,
  number = 3,
  year = 1990,
  pages = "195-255"
}

@inproceedings{Cvet90,
  author = "Z. Cvetanovic and E. G. Freedman and C. Nofsinger",
  title = "{Efficient Decomposition and Performance of Parallel PDE, FFT, Monte-Carlo Simulations, Simplex and Sparse Solvers}",
  booktitle = {Proceedings Supercomputing90},
  publisher = "IEEE",
  address = "New York",
  year = 1990,
  pages = "465-474"
}

@article{SUPR88,
  title = "Proceedings 2nd International SUPRENUM Colloquium",
  author = "U. Trottenberg",
  journal = {Parallel Computing},
  volume = 7,
  number = 3,
  year = 1988
}

@article{Hey91,
  author = "A. J. G. Hey",
  title = "{The Genesis Distributed-Memory Benchmarks}",
  journal = {Parallel Computing},
  volume = 17,
  year = 1991,
  pages = "1275-1283"
}

@book{F90,
  author = "M. Metcalf and J. Reid",
  title = {Fortran-90 Explained},
  publisher = "Oxford Science Publications/OUP",
  address = "Oxford and New York",
  year = 1990,
  chapter = 6
}

@article{SPEC90,
  key = "SPEC",
  title = "{SPEC Benchmarks Suite Release 1.0}",
  journal = {SPEC Newslett.},
  volume = 2,
  number = 3,
  year = 1990,
  pages = "3-4",
  publisher = "{Systems Performance Evaluation Cooperative, Waterside Associates}",
  address = "{Fremont, California}"
}

@article{FGHS89,
  author = "A. Friedli and W. Gentzsch and R. Hockney and A. van der Steen",
  title = "{A European Supercomputer Benchmark Effort}",
  journal = {Supercomputer 34},
  volume = "VI",
  number = 6,
  year = 1989,
  pages = "14-17"
}

@article{BRH90,
  author = "L. Bomans and D. Roose and R. Hempel",
  title = "{The Argonne/GMD Macros in Fortran for Portable Parallel Programming and their Implementation on the Intel iPSC/2}",
  journal = {Parallel Computing},
  volume = 15,
  year = 1990,
  pages = "119-132"
}

@inproceedings{ShTu91,
  author = "J. N. Shahid and R. S. Tuminaro",
  title = "{Iterative Methods for Nonsymmetric Systems on MIMD Machines}",
  booktitle = {Proc. Fifth SIAM Conf. Parallel Processing for Scientific Computing},
  year = 1991
}

@article{Bish90,
  author = "N. T. Bishop and C. J. S. Clarke and R.
    A. d'Inverno",
  journal = {Classical and Quantum Gravity},
  volume = 7,
  year = 1990,
  pages = "L23-L27"
}

@article{Isaac83,
  author = "R. A. Isaacson and J. S. Welling and J. Winicour",
  journal = {J. Math. Phys.},
  volume = 24,
  year = 1983,
  pages = "1824-1834"
}

@article{Stew82,
  author = "J. M. Stewart and H. Friedrich",
  journal = {Proc. Roy. Soc.},
  volume = "A384",
  year = 1982,
  pages = "427-454"
}

@incollection{Hoc77,
  author = "R. W. Hockney",
  title = "{Super-Computer Architecture}",
  editor = "F. Sumner",
  booktitle = {Infotech State of the Art Conference: Future Systems},
  publisher = "Infotech",
  address = "Maidenhead",
  year = 1977,
  pages = "277-305"
}

@article{Hoc82,
  author = "R. W. Hockney",
  title = "{Characterization of Parallel Computers and Algorithms}",
  journal = {Computer Physics Communications},
  volume = 26,
  year = 1982,
  pages = "285-29"
}

@article{Hoc83,
  author = "R. W. Hockney",
  title = "{Characterizing Computers and Optimizing the FACR(l) Poisson-Solver on Parallel Unicomputers}",
  journal = {IEEE Trans. Comput.},
  volume = "{C}\-32",
  year = 1983,
  pages = "933-941"
}

@article{Hoc87,
  author = "R. W. Hockney",
  title = "Parametrization of Computer Performance",
  journal = {Parallel Computing},
  volume = 5,
  year = 1987,
  pages = "97-103"
}

@article{Hoc88,
  author = "R. W. Hockney",
  title = "{Synchronization and Communication Overheads on the {LCAP} Multiple FPS-164 Computer System}",
  journal = {Parallel Computing},
  volume = 9,
  year = 1988,
  pages = "279-290"
}

@article{HoCu89,
  author = "R. W. Hockney and I. J. Curington",
  title = "{$f_{\frac{1}{2}}$: a Parameter to Characterise Memory and Communication Bottlenecks}",
  journal = {Parallel Computing},
  volume = 10,
  year = 1989,
  pages = "277-286"
}

@article{Hoc91,
  author = "R. W. Hockney",
  title = "{Performance Parameters and Benchmarking of Supercomputers}",
  journal = {Parallel Computing},
  volume = 17,
  year = 1991,
  pages = "1111-1130"
}

@article{Hoc92,
  author = "R. W. Hockney",
  title = "{A Framework for Benchmark Analysis}",
  journal = {Supercomputer},
  volume = 48,
  number = "IX-2",
  year = 1992,
  pages = "9-22"
}

@article{HoCa92,
  author = "R. W. Hockney and E. A.
    Carmona",
  title = "{Comparison of Communications on the Intel iPSC/860 and Touchstone Delta}",
  journal = {Parallel Computing},
  volume = 18,
  year = 1992,
  pages = "1067-1072"
}

@article{Add93,
  author = "C. Addison and J. Allwright and N. Binsted and N. Bishop and B. Carpenter and P. Dalloz and D. Gee and V. Getov and A. Hey and R. Hockney and M. Lemke and J. Merlin and M. Pinches and C. Scott and I. Wolton",
  title = "{The Genesis Distributed-Memory Benchmarks. Part 1: methodology and general relativity benchmark with results for the SUPRENUM computer}",
  journal = {Concurrency: Practice and Experience},
  volume = 5,
  number = 1,
  year = 1993,
  pages = "1-22"
}

@techreport{StRi93,
  author = "A. J. van der Steen and P. P. M. de Rijk",
  title = "{Guidelines for use of the EuroBen Benchmark}",
  institution = "EuroBen",
  year = 1993,
  month = feb,
  type = "Technical Report",
  number = "{TR}\-3",
  address = "{The EuroBen Group, Utrecht, The Netherlands}"
}

@inproceedings{Gust90,
  author = "J.L. Gustafson",
  title = "Fixed time, Tiered Memory, and Superlinear Speedup",
  booktitle = "Proc. of the Fifth Conf. on Distributed Memory Computers",
  year = "1990",
}

@article{SuGu91,
  author = "Xian-He Sun and J.L. Gustafson",
  title = "Toward a Better Parallel Performance Metric",
  journal = "Parallel Computing",
  volume = "17",
  month = "Dec.",
  year = "1991",
  pages = "1093--1109",
}

@inproceedings{Bail92,
  author = "David H. Bailey",
  title = "Misleading Performance in the Supercomputing Field",
  booktitle = "Proc.
    Supercomputing '92",
  address = " ",
  year = "1992",
  pages = "155--158",
}
%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), END OF THREE CONTROL FILES
%------------------------------------------------------------------------

For completeness there follow dummy files for each of the seven chapters:

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: intro1.tex
%------------------------------------------------------------------------
%file: intro1.tex
\chapter{Introduction}\footnotemark
\footnotetext{written by Roger Hockney for whole committee}

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: method4.tex
%------------------------------------------------------------------------
%file: method4.tex
\chapter{Methodology}
\footnote{assembled by David Bailey for Methodology subcommittee}

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: lowlev2.tex
%------------------------------------------------------------------------
%file: lowlev2.tex
\chapter{Low-Level Benchmarks}
\footnote{assembled by Roger Hockney for Low-Level subcommittee}

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: kernel2.tex
%------------------------------------------------------------------------
%file: kernel2.tex
\chapter{Kernel Benchmarks}
\footnote{assembled by Tony Hey for Kernel subcommittee}

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: compac2.tex
%------------------------------------------------------------------------
%file: compac2.tex
\chapter{Compact Applications}
\footnote{assembled by David Walker for Compact Applications subcommittee}

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: compil2.tex
%------------------------------------------------------------------------
%file: compil2.tex
\chapter{Compiler Benchmarks}
\footnote{assembled by Tom Haupt for Compiler Benchmarks subcommittee}

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: conclu1.tex
%------------------------------------------------------------------------
%file: conclu1.tex
\chapter{Conclusions}
\footnote{written by Roger Hockney for whole committee}

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), END OF DUMMY FILES
%------------------------------------------------------------------------

From owner-pbwg-comm@CS.UTK.EDU Fri Aug 13 18:55:06 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib)
    id AA04308; Fri, 13 Aug 93 18:55:06 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK)
    id AA28442; Fri, 13 Aug 93 18:52:42 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 13 Aug 1993 18:52:40 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK)
    id AA28400; Fri, 13 Aug 93 18:52:32 -0400
Via: uk.ac.southampton.ecs; Fri, 13 Aug 1993 23:52:23 +0100
From: R.Hockney@parallel-applications-centre.southampton.ac.uk
Via: calvados.pac.soton.ac.uk (plonk); Fri, 13 Aug 93 23:42:30 BST
Date: Fri, 13 Aug 93 22:50:57 GMT
Message-Id: <2429.9308132250@calvados.pac.soton.ac.uk>
To: pbwg-comm@cs.utk.edu
Subject: Revised Methodology Chapter

METHOD4.TEX and revised BENCOM1.TEX and BENREF1.BIB
---------------------------------------------------

Appended is my second draft of the methodology chapter (method4.tex), which requires some additions to the latex command file (bencom1.tex) and the bibliography file (benref1.bib).
Roger Hockney, 13 Aug 1993

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: method4.tex
%------------------------------------------------------------------------
%file method4.tex
%compiled by David Bailey for methodology subcommittee
%text below submitted by Roger Hockney to methodology subcommittee
\chapter{Methodology}

\section{Introduction}

The conclusions drawn from a benchmark study of computer performance depend not only on the basic timing results obtained, but also on the way these are interpreted and converted into performance figures. The choice of the performance metric may itself influence the conclusions. For example, do we want the computer that generates the most megaflop per second (or has the highest Speedup), or the computer that solves the problem in the least time? It is now well known that high values of the first metrics do not necessarily imply the second property. This confusion can be avoided by choosing a more suitable metric that reflects solution time directly, for example either the Temporal, Simulation or Benchmark performance, defined below.

This issue of the sensible choice of performance metric is becoming increasingly important with the advent of massively parallel computers, which have the potential of very high megaflop rates but much more limited potential for reducing solution time.

\section{Time Measurement}

In parallel computing we are concerned with the distribution of computational work to multiple processors that execute simultaneously, that is to say in parallel. The objective of the exercise is to reduce the elapsed wall-clock time to solve or complete a specified task or benchmark. The elapsed wall-clock time means the time that would be measured on an external clock that records the time-of-day or even Greenwich mean time (GMT), between the start and finish of the benchmark.
We are not concerned with the origin of the time measurement, since we are taking a difference, but it is important that the time measured would be the same as that given by a difference between two measurements of GMT, if it were possible to make them.

It is important to be clear about this, because many computer clocks (e.g. Sun Unix function ETIME) measure elapsed CPU-time, which is the total time that the process or job which calls it has been executing in the CPU. Such a clock does not record time (i.e. it stops ticking) when the job is swapped out of the CPU. It does not record, therefore, any wait-time which must be included if we are to assess correctly the performance of a parallel program.

Two low-level benchmarks are provided in the PARKBENCH suite to test the precision and accuracy of the clock that is to be used in the benchmarking. These should be run first, before any benchmark measurements are made. They are:
\begin{enumerate}
\item TICK1 - measures the precision of the clock by measuring the time interval between ticks of the clock. A clock is said to tick when it changes its value.
\item TICK2 - measures the accuracy of the clock by comparing a given time interval measured by an external wall-clock (the benchmarker's wrist watch is adequate) with the same interval measured by the computer clock. This tests the scale factor used to convert computer clock ticks to seconds, and immediately detects if a CPU-clock is incorrectly being used.
\end{enumerate}

The fundamental measurement made in any benchmark is the elapsed wall-clock time to complete some specified task. All other performance figures are derived from this basic timing measurement. The benchmark time, $T(N;p)$, will be a function of the problem size, $N$, and the number of processors, $p$. Here, the problem size is represented by the vector variable, $N$, which stands for a set of parameters characterising the size of the problem: e.g.
the number of mesh points in each dimension, and the number of particles in a particle-mesh simulation. Benchmark problems of different sizes can be created by multiplying all the size parameters by suitable powers of a single scale factor, thereby increasing the spatial and particle resolution in a sensible way, and reducing the size parameters to a single size factor (here called $\alpha$).

We believe that it is most important to regard execution time and performance as a function of at least the two variables $(N,p)$, which define a parameter plane. Much confusion has arisen in the past from attempts to treat performance as a function of a single variable, by taking a particular path through this plane and not stating what path is taken. Many different paths may be taken, and hence many different conclusions can be drawn. It is important, therefore, always to define the path through the performance plane, or better, as we do here, to study the shape of the two-dimensional performance hill. In some cases there may even be an optimum path up this hill.

\section{Units and Symbols}

A rational set of units and symbols is essential for any numerate science including benchmarking. The following extension of the internationally agreed SI system of physical units \cite{SI75} is made to accommodate the needs of computer benchmarking.

\medskip
New symbols and units:
\begin{enumerate}
\item flop : number of floating-point operations
\item mref : number of memory references (reads or writes)
\item barr : number of barrier operations
\item b : number of binary digits (bits)
\item B : number of bytes (groups of 8 bits)
\item sol : number of solutions or executions of benchmark
\item ${\rm w}_{32}$ : number of words (number of bits per word as subscript, here 32). Symbol is lower case (W means watt)
\end{enumerate}

Note that flop and mref are both inseparable four-letter symbols. The character case is significant in all unit symbols so that e.g.
Flop, Mref, $W_{64}$ are incorrect. Unit symbols should always be printed in roman type, to contrast with variable names, which are printed in italic. Because 's' is the SI unit for seconds, unit symbols, like the word 'sheep', do not take an 's' in the plural.

\medskip
SI provides the standard prefixes:
\begin{enumerate}
\item k : kilo meaning $10^3$
\item M : mega meaning $10^6$
\item G : giga meaning $10^9$
\item T : tera meaning $10^{12}$
\end{enumerate}

This means that we cannot use M to mean $1024^2$ (the binary mega) as is often done in describing computer memory capacity, e.g. 256 MB. We can however introduce the new prefix K for the binary kilo, and use a subscript 2 to indicate the binary versions of the larger prefixes:
\begin{enumerate}
\item K : binary kilo, meaning 1024
\item ${\rm M}_2$ : binary mega $1024^2$
\item ${\rm G}_2$ : binary giga $1024^3$
\item ${\rm T}_2$ : binary tera $1024^4$
\end{enumerate}

In most cases the difference between the mega and the binary mega (about 5\%) is probably unimportant, but it is important to be unambiguous. In this way one can continue with existing practice if the difference doesn't matter, and have an agreed method of being more exact when necessary. For example, the above memory capacity was probably intended to mean $256 {\rm M_2 B}$.

As a consequence of the above, an amount of computational work involving $4.5 \times 10^{12}$ floating-point operations is correctly written as 4.5 Tflop. Note that the unit symbol Tflop is never pluralised with an added 's', and it is therefore incorrect to write the above as 4.5 Tflops, which could be confused with a rate per second. The most frequently used unit of performance, millions of floating-point operations per second, is correctly written Mflop/s, in analogy to km/s. The slash is necessary and means 'per', because the 'p' is an integral part of the unit symbol 'flop' and cannot also be used to mean 'per'.
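
To make these conventions concrete, the following short calculation works through the two examples mentioned above. The run time of 1000 s is a hypothetical figure assumed purely for illustration. The memory capacity written with the binary prefix is
\[
  256~{\rm M_2 B} = 256 \times 1024^2~{\rm B} = 268\,435\,456~{\rm B} \approx 268.4~{\rm MB},
\]
and a computation of 4.5 Tflop that completes in the assumed 1000 s runs at
\[
  \frac{4.5 \times 10^{12}~{\rm flop}}{10^{3}~{\rm s}} = 4.5 \times 10^{9}~{\rm flop/s} = 4.5~{\rm Gflop/s}.
\]
The first line shows the size of the gap between the binary and decimal mega; the second shows why the amount of work (Tflop) and the rate (Gflop/s) need distinct unit symbols.
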

\section{Floating-Point Operation Count}

Although we discourage the use of millions of floating-point operations per second as a performance metric, it can be a useful measure if the number of floating-point operations, $F(N)$, needed to solve the benchmark problem is carefully defined.

For simple problems (e.g. matrix multiply) it is sufficient to use a theoretical value for the floating-point operation count (in this case $2n^3$ flop, for $n \times n$ matrices) obtained by inspection of the code or consideration of the arithmetic in the algorithm. For more complex problems containing data-dependent conditional statements, an empirical method may have to be used. The sequential version of the benchmark code defines the problem and the algorithm to be used to solve it. Counters can be inserted into this code or a hardware monitor used to count the number of floating-point operations. The latter is the procedure followed by the {\sc PERFECT} Club \cite{Berr89}.

In either case a decision has to be made regarding the number of flop that are to be credited for different types of floating-point operations, and we see no good reason to deviate from those chosen by McMahon \cite{Ma88} when the Mflop/s measure was originally defined. These are:
\begin{table}[h]
\centering
\begin{tabular}{ll}
add, subtract, multiply & 1 flop \\
divide, square-root & 4 flop \\
exponential, sine etc. & 8 flop \\
{\sc IF(X .REL. Y)} & 1 flop \\
\end{tabular}
\end{table}

Some members of the committee felt that these numbers, derived in the 1970s, no longer correctly reflected the situation on current computers. However, since these numbers are only used to calculate a nominal benchmark flop-count, it is not so important that they be accurate. The important thing is that they do not change, otherwise all previous flop-counts would have to be renormalised. In any case, it is not possible for a single set of ratios to be valid for all computers and library software.
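
As an illustration of how the weights in the table are applied, consider the Fortran assignment {\tt Y = SQRT(A*B + C)/D}. This statement is hypothetical and is not taken from any PARKBENCH code. Under the weights above it would be credited with
\[
  \underbrace{1}_{\mbox{\scriptsize multiply}} + \underbrace{1}_{\mbox{\scriptsize add}}
  + \underbrace{4}_{\mbox{\scriptsize square-root}} + \underbrace{4}_{\mbox{\scriptsize divide}}
  = 10~{\rm flop},
\]
so a loop executing it $n$ times would contribute a nominal $10n$ flop to the count, whatever the actual cost of the square root and divide on a particular machine.
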
I (rwh) suggest the committee stays with the above ratios until such time as they become wildly wrong and extensive research provides us with a more realistic set.

We distinguish two types of operation count. The first is the nominal benchmark floating-point operation count, $F_B(N)$, which is found in the above way from the defining Fortran77 sequential code. The other is the actual number of floating-point operations performed by the hardware when executing the distributed multi-node version, $F_H(N,p)$, which may be greater than the nominal benchmark count, due to the distributed version performing redundant arithmetic operations. Because of this, the hardware flop-count may also depend on the number of processors on which the benchmark is run, as shown in its argument list.

\section{Performance Metrics}

Given the time of execution $T(N;p)$ and the flop-count $F(N)$, several different performance measures can be defined. Each metric has its own uses, and gives different information about the computer and algorithm used in the benchmark. It is important therefore to distinguish the metrics with different names, symbols and units, and to understand clearly the difference between them. Much confusion and wasted work can arise from optimising a benchmark with respect to an inappropriate metric. The principal performance metrics are:

\subsection{Temporal Performance}

If we are interested in comparing the performance of different algorithms for the solution of the same problem, then the correct performance metric to use is the {\it Temporal Performance}, $R_T$, which is defined as the inverse of the execution time
\begin{equation}
R_T(N;p)=T^{-1}(N;p)
\label{Eqn(1)}
\end{equation}
The units of temporal performance are, in general, solutions per second (sol/s), or some more appropriate absolute unit such as timesteps per second (tstep/s).
With this metric we can be sure that the algorithm with the highest performance executes in the least time, and is therefore the best algorithm. We note that the number of flop does not appear in this definition, because the objective of algorithm design is not to perform the most arithmetic per second, but rather to solve a given problem in the least time, regardless of the amount of arithmetic involved. For this reason the temporal performance is also the metric that a computer user should employ to select the best algorithm to solve his problem, because his objective is also to solve the problem in the least time, and he does not care how much arithmetic is done to achieve this.

\subsection{Simulation Performance}

A special case of temporal performance occurs for simulation programs in which the benchmark problem is defined as the simulation of a certain period of physical time, rather than a certain number of timesteps. In this case we speak of the {\em Simulation Performance} and use units such as {\em simulated days per day} (written sim-d/d or 'd'/d) in weather forecasting, where the apostrophe is used to indicate 'simulated'; or {\em simulated picoseconds per second} (written sim-ps/s or 'ps'/s) in electronic device simulation. It is important to use simulation performance rather than tstep/s if one is comparing different simulation algorithms which may require different sizes of timestep for the same accuracy (for example an implicit scheme that can use a large timestep, compared with an explicit scheme that requires a much smaller step). In order to maintain numerical stability, explicit schemes also require the use of a smaller timestep as the spatial grid is made finer.
For such schemes the simulation performance falls off dramatically as the problem size is increased by introducing more mesh points in order to refine the spatial resolution: the doubling of the number of mesh-points in each of three dimensions can reduce the simulation performance by a factor near 16 because the timestep must also be approximately halved. Even though the larger problem will generate more Megaflop per second, in forecasting, it is the simulated days per day (i.e. the simulation performance) and not the Mflop/s, that matter to the user. As we see below, benchmark performance is also measured in terms of the amount of arithmetic performed per second or Mflop/s. However it is important to realise that it is incorrect to compare the Mflop/s achieved by two algorithms and to conclude that the algorithm with the highest Mflop/s rating is the best algorithm. This is because the two algorithms may be performing quite different amounts of arithmetic during the solution of the same problem. The temporal performance metric, $R_T$, defined above, has been introduced to overcome this problem, and provide a measure that can be used to compare different algorithms for solving the same problem. However, it should be remembered that the temporal performance only has the same meaning within the confines of a fixed problem, and no meaning can be attached to a comparison of the temporal performance on one problem with the temporal performance on another. \subsection{Benchmark Performance} In order to compare the performance of a computer on one benchmark with its performance on another, account must be taken of the different amounts of work (measured in flop) that the different problems require for their solution. 
Using the flop-count for the benchmark, $F_B(N)$, we can define the {\em Benchmark Performance} as \begin{equation} R_B(N;p)=F_B(N)/{T(N;p)} \label{Eqn(2)} \end{equation} The units of benchmark performance are Mflop/s (benchmark name), where we include the name of the benchmark in parentheses to emphasise that the performance may depend strongly on the problem being solved, and to emphasise that the values are based on the nominal benchmark flop-count. In other contexts such performance figures would probably be quoted as examples of the so-called {\em sustained} performance of a computer. We feel that the use of this term is meaningless unless the problem being solved and the degree of code optimisation is quoted, because the performance is so varied across different benchmarks and different levels of optimisation. Hence we favour the quotation of a selection of benchmark performance figures, rather than a single sustained performance, because the latter implies that the quoted performance is maintained over all problems. Note also that the flop-count $F_B(N)$ is that for the defining sequential version of the benchmark, and that the same count is used to calculate $R_B$ for the distributed-memory (DM) version of the program, even though the DM version may actually perform a different number of operations. It is usual for DM programs to perform more arithmetic than the defining sequential version, because often numbers are recomputed on the nodes in order to save communicating their values from a master processor. However such calculations are redundant (they have already been performed on the master) and it would be incorrect to credit them to the flop-count of the distributed program. 
Using the sequential flop-count in the calculation of the DM program's benchmark performance has the additional advantage that it is possible to conclude that, for a given benchmark, the implementation that has the highest benchmark performance is the best because it executes in the least time. This would not necessarily be the case if a different $F_B(N)$ were used for different implementations of the benchmark. For example, the use of a better algorithm which obtains the solution with fewer than $F_B(N)$ operations will show up as higher benchmark performance. For this reason it should cause no surprise if the benchmark performance occasionally exceeds the maximum possible hardware performance. To this extent benchmark performance Mflop/s must be understood to be nominal values, and not necessarily exactly the number of operations executed per second by the hardware, which is the subject of the next metric. The purpose of benchmark performance is to compare different implementations and algorithms on different computers for the solution of the same problem, on the basis that the best performance means the least execution time. For this to be true $F_B(N)$ must be kept the same for all implementations and algorithms.

\subsection{Hardware Performance}
If we wish to compare the observed performance with the theoretical capabilities of the computer hardware, we must compute the actual number of floating-point operations performed, $F_H(N;p)$, and from it the actual {\em Hardware Performance}
\begin{equation}
R_H(N;p)=F_H(N;p)/{T(N;p)} \label{Eqn(3)}
\end{equation}
The hardware performance also has the units Mflop/s, and will have the same value as the benchmark performance for the sequential version of the benchmark. However, the hardware performance may be higher than the benchmark performance for the distributed version, because the hardware performance gives credit for redundant arithmetic operations, whereas the benchmark performance does not.
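The relationship between the three metrics defined so far can be sketched in a few lines of Python. This is an illustration only: the function name \verb|performance_metrics| and the sample figures are hypothetical and are not PARKBENCH measurements.

```python
def performance_metrics(t_seconds, f_b_flop, f_h_flop):
    """Return (R_T, R_B, R_H) for a single benchmark run.

    t_seconds : measured execution time T(N;p) in seconds
    f_b_flop  : nominal benchmark flop-count F_B(N)
    f_h_flop  : flop actually executed by the hardware, F_H(N;p)
    """
    if t_seconds <= 0:
        raise ValueError("execution time must be positive")
    r_t = 1.0 / t_seconds            # temporal performance, sol/s
    r_b = f_b_flop / t_seconds / 1e6  # benchmark performance, Mflop/s
    r_h = f_h_flop / t_seconds / 1e6  # hardware performance, Mflop/s
    return r_t, r_b, r_h


# Hypothetical run: nominal count 2.0e9 flop, but the distributed
# version performs 2.5e9 flop because of redundant arithmetic; T = 4 s.
r_t, r_b, r_h = performance_metrics(4.0, 2.0e9, 2.5e9)
print(f"R_T = {r_t} sol/s, R_B = {r_b} Mflop/s, R_H = {r_h} Mflop/s")
```

In this invented case the hardware performance exceeds the benchmark performance only because of the redundant operations; the time to solution, and hence the temporal performance, is unaffected by how the flop are counted.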
Because the hardware performance measures the actual floating-point operations performed per second, unlike the benchmark performance, it can never exceed the theoretical peak performance of the computer. Assuming a computer with multiple CPUs, each with multiple arithmetic pipelines, and each pipeline delivering a maximum of one flop per clock period, the theoretical peak value of hardware performance is
\begin{equation}
r^*= \frac{\mbox{fl.pt.pipes/CPU}}{\mbox{clock.period}}\times \mbox{number.CPUs} \label{Eqn(4)}
\end{equation}
with units of Mflop/s if the clock period is expressed in microseconds. By comparing the measured hardware performance, $R_H(N;p)$, with the theoretical peak performance, we can assess the fraction of the available performance that is being realised by a particular implementation of the benchmark.

\subsection{Speedup, Efficiency and Performance per Node}
\begin{verbatim}
It was agreed that this subsection be redrafted by David Bailey.
The first draft text is retained until the substitute is ready
-------------------------- START OLD TEXT ---------------------------------
\end{verbatim}
We do not favour the use of any of the popular performance metrics (speedup, efficiency or performance per node), because all of these are either easily misinterpreted or obscure important effects. The speedup of a benchmark code is defined as the ratio of the $p$-processor temporal performance to the single-processor temporal performance. It is a very useful and convenient measure if we are concerned with the optimisation of a particular code in isolation, because its value can easily be compared with the maximum possible speedup, namely the number of processors being used. We can thereby assess how much of the potential hardware performance is being utilised. However, benchmarking is concerned with comparing the performance of different computers, and all three of the above metrics are unsuitable for this purpose.
Speedup compares the performance of a code with itself, and might therefore be called an introspective, or even incestuous measure. Problems can therefore arise (see below), and incorrect conclusions can be drawn, if we try to use speedup to compare different algorithms on the same computer, or the same algorithm on different computers. This is because speedup is a relative measure (it is defined as the ratio of two performances), and therefore all knowledge of the absolute performance has been lost. Benchmarking, however, is concerned with the comparison of the absolute performance of computers, and therefore the use of a relative measure like speedup is not very useful, and can be positively misleading. For example, we do not wish to conclude that a computer with a large number of slow processors and therefore high value of speedup, is faster than another with fewer processors and therefore with a lower speedup, if in fact the reverse is the case, because the processors on the second computer are so much faster. Only by adopting absolute measures of performance with physical units involving inverse time, can one avoid this type of false conclusion. Speedup is not even useful for comparing the performance of one algorithm with another on the same computer, because it is not necessarily true that the algorithm with the highest speedup executes in the least time (see, e.g. ~\cite{Cvet90}). One can only be sure that this is the case if the single-processor temporal performance of both algorithms is the same, which is most unlikely. If the single-processor performances of the two algorithms are different and we compare the speedups of the two algorithms, then we are comparing the performance of the two algorithms measured in different units. This is like comparing the speeds of two cars, one measured in m.p.h. and the other in cm/s. Such a comparison has no validity either for cars or algorithms. 
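The danger of ranking by speedup can be made concrete with a small numerical sketch in Python. The two machines and all of their timings below are invented purely for illustration; only the definitions of speedup and temporal performance are taken from the text.

```python
def speedup(t_single, t_parallel):
    """Ratio of p-processor to single-processor temporal performance."""
    return t_single / t_parallel


def temporal_performance(t_parallel):
    """Temporal performance in solutions per second (sol/s)."""
    return 1.0 / t_parallel


# Hypothetical machines: (number of processors,
#                         single-processor time in s,
#                         p-processor time in s)
machines = {
    "A": (256, 1000.0, 10.0),  # many slow processors
    "B": (8, 20.0, 4.0),       # few fast processors
}

results = {}
for name, (p, t1, tp) in machines.items():
    results[name] = (speedup(t1, tp), temporal_performance(tp))
    print(f"{name}: p={p}, speedup={results[name][0]:g}, "
          f"R_T={results[name][1]:g} sol/s")

# Ranking by the relative measure and by the absolute measure disagree.
best_by_speedup = max(results, key=lambda m: results[m][0])
best_by_time = max(results, key=lambda m: results[m][1])
print("highest speedup:", best_by_speedup)
print("least execution time:", best_by_time)
```

Machine A shows a speedup of 100 against 5 for machine B, yet B delivers the solution in 4 s rather than 10 s, which is the false conclusion that an absolute metric avoids.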
Computers and algorithms can only safely be compared in terms of their absolute performance in solving a problem. The most unambiguous measure is the temporal performance, which is the inverse of the time of execution, or the related simulation performance. The benchmark performance per node might seem to be an attractive metric because it is an absolute measure which can be related directly to the hardware performance of a node. However, it has the major defect that it hides the point at which the absolute performance begins to decrease as the number of processors increases. If we plot benchmark performance against number of processors, this point is clearly visible as a maximum; if the same data are plotted as performance per node, all we see is a very uninteresting monotonically falling line, and the important maximum has disappeared. The efficiency, which is defined as the speedup divided by the number of processors, is doubly condemned because it is a relative measure and hides the maximum.
\begin{verbatim}
--------------------------- END OLD TEXT ---------------------------------
\end{verbatim}

\section{Performance Database}
\begin{verbatim}
It was agreed that this subsection be redrafted by Jack Dongarra.
-------------------------- START OLD TEXT ---------------------------------
\end{verbatim}
The database of benchmark performance results should be based on an extension of the excellent X-window display demonstrated by Jack Dongarra at the March 1993 PBWG meeting.
\begin{verbatim}
--------------------------- END OLD TEXT ---------------------------------
---------------------- SOME PROPOSED NEW TEXT ----------------------------
\end{verbatim}
At present each benchmark measurement for a particular problem size $N$ and processor number $p$ is represented by one line in the database, with variable-length fields chosen by the benchmark writer as suitable and comprehensive to describe the conditions of the benchmark run.
The fields, separated by a marker (|), include the benchmarker's name and e-mail, computer location and date, hardware specification, compiler date and optimisation level, $N$, $p$, $T(N;p)$, $R_B(N;p)$ and other metrics as deemed appropriate by the benchmark writer. Ideally, the line for the database would be produced automatically as output by the benchmark program itself.
\begin{verbatim}
---------------------- END PROPOSED NEW TEXT ----------------------------
\end{verbatim}

\section{Interactive Graphical Interface}
The Southampton Group has agreed to provide an interactive graphical front end to the PARKBENCH database of performance results. To achieve this, the basic data held in the performance database should be values of $T(N;p)$ for at least 4 values of problem size $N$, each for sufficient $p$-values (say 5 to 10) to determine the trend of variation of performance with number of processors for constant problem size. It is important that there be enough $p$-values to see Amdahl saturation, if present, or any peak in performance followed by degradation. A graphical interface is essential to allow this multidimensional data to be viewed in any of the metrics defined above, as chosen interactively by the user. The user could also be offered (by suitable interpolation) a display of the results in various scaled metrics, in which the problem size is expanded with the number of processors. In order to encompass as wide a range of performance and number of processors as possible, a log-scale on both axes is unavoidable, and the format and scale range should be kept fixed as long as possible to enable easy comparison between graphs. A three-cycle by three-cycle log-log graph with range 1 to 1000 in both $p$ and Mflop/s would cover most needs in the immediate future. Examples of such graphs are to be found in \cite{Hoc92,Add93}.
A log/log graph is also desirable because the size and shape of the Amdahl saturation curve is the same wherever it is plotted on such a graph. That is to say, there is a universal Amdahl curve that is invariant to its position on any log/log graph. Amdahl saturation is a two-parameter description of any of the performance metrics, $R$, as a function of $p$ for fixed $N$, which can be expressed by
\begin{equation}
R = \frac{R_\infty}{(1 + \phalf/p)}
\end{equation}
where $R_\infty$ is the saturation performance approached as $p \rightarrow \infty$ and \phalf\ is the number of processors required to reach half the saturation performance. The graphical interface should allow this universal Amdahl curve to be moved around the graphical display, and be matched against the performance curves. The changing values of the two parameters \Rphalf\ should be displayed as the Amdahl curve is moved. As more experience is gained with performance analysis, that is to say the fitting of performance data to parametrised formulae, it is to be expected that the graphical interface will allow more complicated formulae to be compared with the experimental data, perhaps allowing 3 to 5 parameters in the theoretical formula. But, as yet, we do not know what form these parametrised formulae should take.

\section{Benchmarking Procedure and Code Optimisation}
Manufacturers will always feel that any benchmark not tuned specifically by themselves is an unfair test of their hardware and software. This is inevitable and from their viewpoint it is true. NASA have overcome this problem by only specifying the problems (the NAS paper-and-pencil benchmarks \cite{naspar2}) and leaving the manufacturers to write the code, but in many circumstances this would require unjustifiable effort and take too long. It is also a perfectly valid question to ask how a particular parallel computer will perform on existing parallel code, and that is the viewpoint of PARKBENCH.
The benchmarking procedure is to run the distributed PARKBENCH suite on an 'as-is' basis, making only such non-substantive changes as are required to make the code run (e.g. changing the names of header files to a local variant). The as-is run may use the highest level of automatic compiler optimisation that works, but the level used and compiler date should be noted in the appropriate section of the performance database entry. After completing the as-is run, which gives a base-line result, any form of optimisation may be applied to show the particular computer to its best advantage, up to completely rethinking the algorithm and rewriting the code. The only requirement on the benchmarker is to state what has been done. However, remember that, even if the algorithm is changed, the official flop-count, $F_B(N)$, that is used in the calculation of nominal benchmark Mflop/s, $R_B(N;p)$, does not change. In this way a better algorithm will show up with a higher $R_B$, as we would want it to, even though the hardware Mflop/s is likely to be little changed. Typical steps in optimisation might be:
\begin{enumerate}
\item explore the effect of different compiler optimisations on a single processor, and choose the best for the as-is run.
\item perform the as-is run on multiple processors, using enough values of $p$ to determine any peak in performance or saturation.
\item return to a single processor and optimise the code for vectorisation, if a vector processor is being used. This means restructuring loops to permit vectorisation.
\item replace selected loops with optimal assembly-coded library routines (e.g. BLAS where appropriate).
\item replace the whole benchmark by a tuned library routine with the same functionality.
\item replace the whole benchmark with a locally written version with the same functionality but possibly using an entirely different algorithm that is more suited to the architecture.
\end{enumerate} % ---------------------------------------------------------------------------- %------------------------------------------------------------------------ % PARKBENCH REPORT (second draft), File: bencom1.tex %------------------------------------------------------------------------ % % ************************************************************** % LATEX COMMANDS FOR PARKBENCH REPORTS % ************************************************************** % \def\flop{\mathop{\rm flop}\nolimits} \def\pipe{\mathop{\rm pipe}\nolimits} \newcommand{\Suprenum}{\mbox{\sc SUPRENUM}} \newcommand{\usec}{\mbox{\rm $\mu$s}} \newcommand{\where}{\mbox{\rm where}} \newcommand{\rmand}{\mbox{\rm and}} \newcommand{\Mflops}{\mbox{\rm Mflop/s}} \newcommand{\flops}{\mbox{\rm flop/s}} \newcommand{\flopB}{\mbox{\rm flop/B}} \newcommand{\tstepps}{\mbox{\rm tstep/s}} \newcommand{\MWps}{\mbox{\rm MW/s}} \newcommand{\Mwps}{\mbox{\rm Mw/s}} \newcommand{\spone}{\mbox{\ }} \newcommand{\sptwo}{\mbox{\ \ }} \newcommand{\spfour}{\mbox{\ \ \ \ }} \newcommand{\spsix}{\mbox{\ \ \ \ \ \ }} \newcommand{\speight}{\mbox{\ \ \ \ \ \ \ \ }} \newcommand{\spten}{\mbox{\ \ \ \ \ \ \ \ \ \ }} \newcommand{\rinf}{\mbox{$r_\infty$}} \newcommand{\Rinf}{\mbox{$R_\infty$}} \newcommand{\nhalf}{\mbox{$n_{\frac{1}{2}}$}} \newcommand{\fhalf}{\mbox{$f_{\frac{1}{2}}$}} \newcommand{\Nhalf}{\mbox{$N_{\frac{1}{2}}$}} \newcommand{\phalf}{\mbox{$p_{\frac{1}{2}}$}} \newcommand{\rhat}{\mbox{$\hat{r}$}} \newcommand{\Phalf}{\mbox{$P_{\frac{1}{2}}$}} \newcommand{\half}{\mbox{$\frac{1}{2}$}} \newcommand{\rnhalf}{\mbox{(\rinf,\nhalf)}} \newcommand{\rfhalf}{\mbox{(\rhat,\fhalf)}} \newcommand{\RNhalf}{\mbox{(\Rinf,\Nhalf)}} \newcommand{\Rphalf}{\mbox{(\Rinf,\phalf)}} \newcommand{\third}{\mbox{$\frac{1}{3}$}} \newcommand{\quart}{\mbox{$\frac{1}{4}$}} \newcommand{\eighth}{\mbox{$\frac{1}{8}$}} \newcommand{\nineth}{\mbox{$\frac{1}{9}$}} % ---------------------------------------------------------------------------- 
%------------------------------------------------------------------------ % PARKBENCH REPORT (second draft), File: benref1.bib %------------------------------------------------------------------------ % ------------------------------------------------------------------- % PARKBENCH BIBLIOGRAPHY % % Contributions from: % R.Hockney(Southampton), X.Sun(ICASE), H.Simon(NASA) % ------------------------------------------------------------------- @book{HoJe81, author= "Roger W. Hockney and Christopher R. Jesshope", title= "{Parallel Computers: Architecture, Programming and Algorithms}", publisher= "Adam Hilger", address= "Bristol", year= "1981", } @book{HoJe88, author= "Roger W. Hockney and Christopher R. Jesshope", title= "{Parallel Computers 2: Architecture, Programming and Algorithms}", publisher= "Adam Hilger/IOP Publishing", address= "Bristol \& Philadelphia", year= "1988", edition="second", note= "Distributed in the USA by IOP Publ. Inc., Public Ledger Bldg., Suite 1035, Independence Square, Philadelphia, PA 19106."} @book{Super, key="Super", title={Supercomputer}, publisher="ASFRA", address="Edam, Netherlands"} @book{SI75, key="Royal Society", organization="{Symbols Committee of the Royal Society}", title={Quantities, Units and Symbols}, publisher="The Royal Society", address="London", year=1975} @article{Berr89, author="M. Berry and D. Chen and P. Koss and D. Kuck and S. Lo and Y. Pang and L. Pointer and R. Roloff and A. Sameh and E. Clementi and S. Chin and D. Schneider and G. Fox and P. Messina and D. Walker and C. Hsiung and J. Schwarzmeier and K. Lue and S. Orszag and F. Seidl and O. Johnson and R. Goodrum and J. Martin", title="{The PERFECT Club Benchmarks: Effective Performance Evaluation of Computers}", journal={Intl. J. Supercomputer Appls.}, volume=3, number=3, year=1989, pages="5-40"} @incollection{Ma88, author="F. H. McMahon", title="{The Livermore Fortran Kernels test of the Numerical Performance Range}", editor="J. L. 
Martin", booktitle={Performance Evaluation of Supercomputers}, publisher="Elsevier Science B.V., North-Holland", address="Amsterdam", year=1988, pages="143-186"} @article{Mess90, author="P. Messina and C. Baillie and E. Felten and P. Hipes and R. Williams and A. Alagar and A. Kamrath and R. Leary and W. Pfeiffer and J. Rogers and D. Walker", title="Benchmarking advanced architecture computers", journal={Concurrency: Practice and Experience}, volume=2, number=3, year=1990, pages="195-255"} @inproceedings{Cvet90, author="Z. Cvetanovic and E. G. Freedman and C. Nofsinger", title="{Efficient Decomposition and Performance of Parallel PDE, FFT, Monte-Carlo Simulations, Simplex and Sparse Solvers}", booktitle={Proceedings Supercomputing90}, publisher="IEEE", address="New York", year=1990, pages="465-474"} @article{SUPR88, title="Proceedings 2nd International SUPRENUM Colloquium", author="U. Trottenberg", journal={Parallel Computing}, volume=7, number=3, year=1988} @article{Hey91, author="A. J. G. Hey", title="{The Genesis Distributed-Memory Benchmarks}", journal={Parallel Computing}, volume=17, year=1991, pages="1275-1283"} @book{F90, author="M. Metcalf and J. Reid", title={Fortran-90 Explained}, publisher="Oxford Science Publications/OUP", address="Oxford and New York", year=1990, chapter=6} @article{SPEC90, key="SPEC", title="{SPEC Benchmarks Suite Release 1.0}", journal={SPEC Newslett.}, volume=2, number=3, year=1990, pages="3-4", publisher="{Systems Performance Evaluation Cooperative, Waterside Associates}", address="{Fremont, California}"} @article{FGHS89, author="A. Friedli and W. Gentzsch and R. Hockney and A. van der Steen", title="{A European Supercomputer Benchmark Effort}", journal={Supercomputer 34}, volume="VI", number=6, year=1989, pages="14-17"} @article{BRH90, author="L. Bomans and D. Roose and R. 
Hempel", title="{The Argonne/GMD Macros in Fortran for Portable Parallel Programming and their Implementation on the Intel iPSC/2}", journal={Parallel Computing}, volume=15, year=1990, pages="119-132"} @inproceedings{ShTu91, author="J. N. Shahid and R. S. Tuminaro", title="{Iterative Methods for Nonsymmetric Systems on MIMD Machines}", booktitle={Proc. Fifth SIAM Conf. Parallel Processing for Scientific Computing}, year=1991} @article{Bish90, author="N. T. Bishop and C. J. S. Clarke and R. A. d'Inverno", journal={Classical and Quantum Gravity}, volume=7, year=1990, pages="L23-L27"} @article{Isaac83, author="R. A. Isaacson and J. S. Welling and J.Winicour", journal={J. Math. Phys.}, volume=24, year=1983, pages="1824-1834"} @article{Stew82, author="J. M. Stewart and H. Friedrich", journal={Proc. Roy. Soc.}, volume="A384", year=1982, pages="427-454"} @incollection{Hoc77, author="R. W. Hockney", title="{Super-Computer Architecture}", editor="F. Sumner", booktitle={Infotech State of the Art Conference: Future Systems}, publisher="Infotech", address="Maidenhead", year=1977, pages="277-305"} @article{Hoc82, author="R. W. Hockney", title="{Characterization of Parallel Computers and Algorithms}", journal={Computer Physics Communications}, volume=26, year=1982, pages="285-29"} @article{Hoc83, author="R. W. Hockney", title="{Characterizing Computers and Optimizing the FACR(l) Poisson-Solver on Parallel Unicomputers}", journal={IEEE Trans. Comput.}, volume="{C}\-32", year=1983, pages="933-941"} @article{Hoc87, author="R. W. Hockney", title="Parametrization of Computer Performance", journal={Parallel Computing}, volume=5, year=1987, pages="97-103"} @article{Hoc88, author="R. W. Hockney", title="{Synchronization and Communication Overheads on the {LCAP} Multiple FPS-164 Computer System}", journal={Parallel Computing}, volume=9, year=1988, pages="279-290"} @article{HoCu89, author="R. W. Hockney and I. J. 
Curington", title="{$f_{\frac{1}{2}}$: a Parameter to Characterise Memory and Communication Bottlenecks}", journal={Parallel Computing}, volume=10, year=1989, pages="277-286"} @article{Hoc91, author="R. W. Hockney", title="{Performance Parameters and Benchmarking of Supercomputers}", journal={Parallel Computing}, volume=17, year=1991, pages="1111-1130"} @article{Hoc92, author="R. W. Hockney", title="{A Framework for Benchmark Analysis}", journal={Supercomputer}, volume=48, number="IX-2", year=1992, pages="9-22"} @article{HoCa92, author="R. W. Hockney and E. A. Carmona", title="{Comparison of Communications on the Intel iPSC/860 and Touchstone Delta}", journal={Parallel Computing}, volume=18, year=1992, pages="1067-1072"} @article{Add93, author="C. Addison and J. Allwright and N. Binsted and N. Bishop and B. Carpenter and P. Dalloz and D. Gee and V. Getov and A. Hey and R. Hockney and M. Lemke and J. Merlin and M. Pinches and C. Scott and I. Wolton", title="{The Genesis Distributed-Memory Benchmarks. Part 1: methodology and general relativity benchmark with results for the SUPRENUM computer}", journal={Concurrency: Practice and Experience}, volume=5, number=1, year=1993, pages="1-22"} @techreport{StRi93, author="A. J. van der Steen and P. P. M. de Rijk", title="{Guidelines for use of the EuroBen Benchmark}", institution="EuroBen", year=1993, month=feb, type="Technical Report", number="{TR}\-3", address="{The EuroBen Group, Utrecht, The Netherlands}"} @INPROCEEDINGS{Gust90, author = "J.L. Gustafson", title = "Fixed time, Tiered Memory, and Superlinear Speedup", booktitle = "Proc. of the Fifth Conf. on Distributed Memory Computers", year = "1990", } @ARTICLE{SuGu91, AUTHOR = "Xian-He Sun and J.L. Gustafson", TITLE = "Toward a Better Parallel Performance Metric", JOURNAL = "Parallel Computing", VOLUME = "17", MONTH = "Dec.", YEAR = "1991", pages = "1093--1109", } @INPROCEEDINGS{Bail92, author = "David H.
Bailey", title = "Misleading Performance in the Supercomputing Field", booktitle = "Proc. Supercomputing '92", address = " ", year = "1992", pages = "155--158", } @TECHREPORT{bailey91.3, AUTHOR = "Bailey, D. H. and Frederickson, P.O.", TITLE = "Performance Results for Two of the NAS Parallel Benchmarks", INSTITUTION = "NASA Ames Research Center", ADDRESS = "Moffett Field, CA 94035", NUMBER = "RNR-91-19", MONTH = "June", YEAR = "1991"} %discusses the implementation of two of the benchmarks @TECHREPORT{naspar, AUTHOR = "Bailey, D. H. and Barton, J. and Lasinski, T. and Simon, H. (editors)", TITLE = "The {NAS} Parallel Benchmarks", INSTITUTION = "NASA Ames Research Center", ADDRESS = "Moffett Field, CA 94035", NUMBER = "RNR-91-02", MONTH = "January", YEAR = "1991"} %the original report, complete reference for the NPB @ARTICLE{naspar2, AUTHOR = "Bailey, D. and Barszcz, E. and Barton, J. and Browning, D. and Carter, R. and Dagum, L. and Fatoohi, R. and Frederickson, P. and Lasinski, T. and Schreiber, R. and Simon, H. and Venkatakrishnan, V. and Weeratunga, S.", TITLE = "The {NAS} Parallel Benchmarks", JOURNAL = "Int. J. of Supercomputer Applications", VOLUME = "5", NUMBER = "3", YEAR = "1991", PAGES = "63 - 73"} %published version of the rules @INPROCEEDINGS{naspar3, AUTHOR = "Bailey, D. and Barszcz, E. and Barton, J. and Browning, D. and Carter, R. and Dagum, L. and Fatoohi, R. and Frederickson, P. and Lasinski, T. and Schreiber, R. and Simon, H. and Venkatakrishnan, V. and Weeratunga, S.", TITLE = "The {NAS} Parallel Benchmarks - Summary and Preliminary Results", BOOKTITLE = "Proceedings of Supercomputing '91, Albuquerque, New Mexico", PUBLISHER = "IEEE Computer Society Press", ADDRESS = "Los Alamitos, California", YEAR = "1991", PAGES = "158 - 165"} %results as of 1991 @ARTICLE{naspar4, AUTHOR = "Bailey, D. H. and Barszcz, E. and Dagum, L. and Simon, H. D.", TITLE = "{NAS} Parallel Benchmark Results", JOURNAL = "IEEE J. 
Parallel and Distributed Technology", VOLUME = "1", NUMBER = "1", PAGES = "43 - 51", YEAR = "1993"} %results as of 12/92 @TECHREPORT{dagum91.3, AUTHOR = "L. Dagum", TITLE = "Parallel Integer Sorting With Medium and Fine-Scale Parallelism", INSTITUTION = "NASA Ames Research Center", ADDRESS = "Moffett Field, CA 94035", NUMBER = "RNR-91-13", MONTH = "April", YEAR = "1991", NOTE = "(Int. J. High Speed Comp., 1993, to appear)"} %detailed discussion of integer sort benchmark on Cray, CM, Intel @TECHREPORT{bars93, AUTHOR = "Barszcz, E. and Fatoohi, R. and Venkatakrishnan, V. and Weeratunga, S.", TITLE = "Solution of Regular Sparse Triangular Linear Systems on Vector and Distributed Memory Multiprocessors", INSTITUTION = "NASA Ames Research Center", ADDRESS = "Moffett Field, CA 94035", NUMBER = "RNR-93-07", MONTH = "April", YEAR = "1993"} %detailed discussion of LU benchmark on Cray, CM, Intel %------------------------------------------------------------------------ % PARKBENCH REPORT (second draft), END OF FILES %------------------------------------------------------------------------ From owner-pbwg-comm@CS.UTK.EDU Wed Aug 18 11:33:40 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA00658; Wed, 18 Aug 93 11:33:40 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA13740; Wed, 18 Aug 93 11:29:50 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 18 Aug 1993 11:29:43 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA13729; Wed, 18 Aug 93 11:29:35 -0400 Via: uk.ac.southampton.ecs; Wed, 18 Aug 1993 16:29:10 +0100 From: R.Hockney@parallel-applications-centre.southampton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Wed, 18 Aug 93 16:19:22 BST Date: Wed, 18 Aug 93 15:27:54 GMT Message-Id: <7288.9308181527@calvados.pac.soton.ac.uk> To: pbwg-comm@cs.utk.edu Subject: Parkbench Report UNITS.TEX and 
revised BENCOM1.TEX and LOWLEV2.TEX ------------------------------------------------- Appended are an update to the units section (units.tex) to use as a replacement within the methodology chapter (method4.tex). This requires further additions to the latex command file (bencom1.tex), which should be replaced with the new one here. Also supplied is the third draft of the lowlevel chapter (lowlev2.tex). Roger Hockney, 18 Aug 1993 %------------------------------------------------------------------------ % PARKBENCH REPORT (third draft), File: units.tex % Replace the whole \section{Units and Symbols} in method4.tex by % the following: %------------------------------------------------------------------------ \section{Units and Symbols} A rational set of units and symbols is essential for any numerate science including benchmarking. The following extension of the internationally agreed SI system of physical units \cite{SI75} is made to accommodate the needs of computer benchmarking. The value of a variable comprises a pure number stating the number of units which equal the value of the variable, followed by a unit symbol specifying the unit in which the variable is being measured. A new unit is required whenever a quantity of a new nature arises, such as e.g. the first appearance of vector operations, or message sends. Generally speaking a unit symbol should be as short as possible, consistent with being easily recognised and not already used. The following have been found necessary in the characterisation of computer and benchmark performance in science and engineering. No doubt more will have to be defined as benchmarking enters new areas. 
\medskip New unit symbols and their meaning: \begin{enumerate} \item \flop : floating-point operation [latex \verb1\1flop] \item \inst : instruction of any kind [latex \verb1\1inst] \item \inop : integer operation [latex \verb1\1inop] \item \vecop: vector operation [latex \verb1\1vecop] \item \msend: message send operation [latex \verb1\1msend] \item \iter : iteration of loop [latex \verb1\1iter] \item \mref : memory reference (read or write) [latex \verb1\1mref] \item \barr : barrier operation [latex \verb1\1barr] \item \bit : binary digit (bit) [latex \verb1\1bit] \item \B : byte (group of 8 bits) [latex \verb1\1B] \item \sol : solution or single execution of benchmark [latex \verb1\1sol] \item \w : computer word. Symbol is lower case (W means watt) [latex \verb1\1w] \end{enumerate} When required a subscript may be used to show the number of bits involved in the unit. For example: a 32-bit floating-point operation ${\flop}_{32}$, a 64-bit word ${\w}_{64}$; also we have ${\bit}={\w}_1$, ${\B}={\w}_8$, ${\w}_{64}= 8 {\B}$. Note that flop, mref and other multi-letter symbols are inseparable four or five-letter symbols. The character case is significant in all unit symbols so that e.g. Flop, Mref, $W_{64}$ are incorrect. Unit symbols should always be printed in roman type, to contrast with variable names which are printed in italic. To aid in the use of roman type, especially within Latex maths mode, Latex commands have been defined for each unit, these commands being a backslash followed by the unit symbol (except for '\inop' and '\bit' whose names are changed in the command to avoid a clash with already defined system commands). Such commands will print in roman type wherever they occur. Because 's' is the SI unit for seconds, unit symbols like 'sheep' do not take 's' in the plural. Thus one counts: one flop, two flop, ..., one hundred flop etc. 
This is especially important when the unit symbol is used in ordinary text as a useful abbreviation, as often, quite sensibly, it is. \medskip SI provides the standard prefixes: \begin{enumerate} \item k : kilo meaning $10^3$ \item M : mega meaning $10^6$ \item G : giga meaning $10^9$ \item T : tera meaning $10^{12}$ \end{enumerate} This means that we cannot use M to mean $1024^2$ (the binary mega) as is often done in describing computer memory capacity, e.g. 256 MB. We can however introduce the new prefix: \begin{enumerate} \item K : meaning 1024, then use a subscript 2 to indicate the binary versions \item ${\rm M}_2$ : binary mega $1024^2$ \item ${\rm G}_2$ : binary giga $1024^3$ \item ${\rm T}_2$ : binary tera $1024^4$ \end{enumerate} In most cases the difference between the mega and the binary mega (about 5\%) is probably unimportant, but it is important to be unambiguous. In this way one can continue with existing practice if the difference doesn't matter, and have an agreed method of being more exact when necessary. For example, the above memory capacity was probably intended to mean $256 {\rm M_2 B}$. As a consequence of the above, an amount of computational work involving $4.5 \times 10^{12}$ floating-point operations is correctly written as 4.5 Tflop. Note that the unit symbol Tflop is never pluralised with an added 's', and it is therefore incorrect to write the above as 4.5 Tflops which could be confused with a rate per second. The most frequently used unit of performance, millions of floating-point operations per second, is correctly written Mflop/s, in analogy to km/s. The slash is necessary and means 'per', because the 'p' is an integral part of the unit symbol 'flop' and cannot also be used to mean 'per'. 
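As an illustration of the conventions above, the following fragment sketches how the unit commands might be used in running text. It assumes the commands of bencom1.tex have been loaded, and every numerical value in it is invented for the example.

```latex
% Assumes the unit commands of bencom1.tex (\flop, \B, \w, \Mflops) are loaded.
% All numerical values are illustrative only.
A run performing $4.5 \times 10^{12}$ floating-point operations
does $4.5~{\rm T}\flop$ of work; if it takes 1000~s, the rate is
$4500~\Mflops$.

A memory of $256~{\rm M}_2\B$ holds $256 \times 1024^2 = 268\,435\,456~\B$,
that is $32~{\rm M}_2{\w}_{64}$.
```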
% ---------------------------------------------------------------------------- %------------------------------------------------------------------------ % PARKBENCH REPORT (third draft), File: bencom1.tex %------------------------------------------------------------------------ % % ************************************************************** % LATEX COMMANDS FOR PARKBENCH REPORTS % ************************************************************** % \def\pipe{\mathop{\rm pipe}\nolimits} \newcommand{\Parkbench}{\mbox{\sc PARKBENCH}} \newcommand{\Suprenum}{\mbox{\sc SUPRENUM}} \newcommand{\flop}{\mbox{\rm flop}} \newcommand{\inst}{\mbox{\rm inst}} \newcommand{\inop}{\mbox{\rm intop}} \newcommand{\vecop}{\mbox{\rm vecop}} \newcommand{\msend}{\mbox{\rm msend}} \newcommand{\iter}{\mbox{\rm iter}} \newcommand{\mref}{\mbox{\rm mref}} \newcommand{\barr}{\mbox{\rm barr}} \newcommand{\bit}{\mbox{\rm b}} \newcommand{\B}{\mbox{\rm B}} \newcommand{\sol}{\mbox{\rm sol}} \newcommand{\w}{\mbox{\rm w}} \newcommand{\usec}{\mbox{\rm $\mu$s}} \newcommand{\where}{\mbox{\rm where}} \newcommand{\rmand}{\mbox{\rm and}} \newcommand{\Mflops}{\mbox{\rm Mflop/s}} \newcommand{\flops}{\mbox{\rm flop/s}} \newcommand{\flopB}{\mbox{\rm flop/B}} \newcommand{\tstepps}{\mbox{\rm tstep/s}} \newcommand{\Mwps}{\mbox{\rm Mw/s}} \newcommand{\spone}{\mbox{\ }} \newcommand{\sptwo}{\mbox{\ \ }} \newcommand{\spfour}{\mbox{\ \ \ \ }} \newcommand{\spsix}{\mbox{\ \ \ \ \ \ }} \newcommand{\speight}{\mbox{\ \ \ \ \ \ \ \ }} \newcommand{\spten}{\mbox{\ \ \ \ \ \ \ \ \ \ }} \newcommand{\rinf}{\mbox{$r_\infty$}} \newcommand{\Rinf}{\mbox{$R_\infty$}} \newcommand{\nhalf}{\mbox{$n_{\frac{1}{2}}$}} \newcommand{\fhalf}{\mbox{$f_{\frac{1}{2}}$}} \newcommand{\Nhalf}{\mbox{$N_{\frac{1}{2}}$}} \newcommand{\phalf}{\mbox{$p_{\frac{1}{2}}$}} \newcommand{\Phalf}{\mbox{$P_{\frac{1}{2}}$}} \newcommand{\rhat}{\mbox{$\hat{r}$}} \newcommand{\half}{\mbox{$\frac{1}{2}$}} \newcommand{\rnhalf}{\mbox{(\rinf,\nhalf)}} 
\newcommand{\rfhalf}{\mbox{(\rhat,\fhalf)}} \newcommand{\RNhalf}{\mbox{(\Rinf,\Nhalf)}} \newcommand{\Rphalf}{\mbox{(\Rinf,\phalf)}} \newcommand{\third}{\mbox{$\frac{1}{3}$}} \newcommand{\quart}{\mbox{$\frac{1}{4}$}} \newcommand{\eighth}{\mbox{$\frac{1}{8}$}} \newcommand{\nineth}{\mbox{$\frac{1}{9}$}} % ---------------------------------------------------------------------------- %------------------------------------------------------------------------ % PARKBENCH REPORT (third draft), File: lowlev2.tex %------------------------------------------------------------------------ %file: lowlev2.tex \chapter{Low-Level Benchmarks} \footnote{assembled by Roger Hockney for low-level subcommittee} \section{Introduction} The first step in the assessment of the performance of a massively parallel computer system is to measure the performance of a single processing node of the multi-node system. There exist already many good and well-established benchmarks for this purpose, notably the LINPACK benchmarks and the Livermore Loops. These are not part of the \Parkbench suite of programs, but \Parkbench recommends that these be used to measure single-node performance, in addition to some specific low-level measurements of its own (see section \ref{oneproc}). There follows a brief description of existing benchmarks that are recommended for measuring single-node performance, with a discussion of their value. \subsection{Most Reported Benchmark: LINPACKD (n=100)} This well-known standard benchmark is a Fortran program for the solution of a (100x100) dense set of linear equations by Gaussian elimination. It is distributed by Dr J. J. Dongarra of the University of Tennessee. The results are quoted in Mflop/s and are regularly published and available by electronic mail. The main value of this benchmark is that results are known for more computers than any other benchmark. 
Most of the compute time is contained in vectorisable DO-loops such as the DAXPY (scalar times vector plus vector) and inner product. Therefore one expects vector computers to perform well on this benchmark. The weakness of the benchmark is that it tests only a small number of vector operations, but it does include the effect of memory access and it is solving a complete (although small) real problem. \subsection{Performance Range: The Livermore Loops} These are a set of 24 Fortran DO-loops (The Livermore Fortran Kernels, LFK) extracted from operational codes used at the Lawrence Livermore National Laboratory \cite{Ma88}. They have been used since the early seventies to assess the arithmetic performance of computers and their compilers. They are a mixture of vectorisable and non-vectorisable loops and test rather fully the computational capabilities of the hardware, and the skill of the software in compiling efficient code, and in vectorisation. The main value of the benchmark is the range of performance that it demonstrates, and in this respect it complements the limited range of loops tested in the LINPACK benchmark. The benchmark provides the individual performance of each loop, together with various averages (arithmetic, geometric, harmonic) and the quartiles of the distribution. However, it is difficult to give a clear meaning to these averages, and the value of the benchmark is more in the distribution itself. In particular, the maximum and minimum give the range of likely performance in full applications. The ratio of maximum to minimum performance has been called the {\em instability} or the {\em speciality} ~\cite{Hoc91}, and is a measure of how difficult it is to obtain good performance from the computer, and therefore how specialised it is. The minimum or worst performance obtained on these loops is of special value, because there is much truth in the saying that "the best computer to choose is that with the best worst-performance". 
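The summary statistics described above (the three averages, the extremes, and the max/min ratio called the instability) can be sketched in a few lines. This is a minimal Python illustration, not the Livermore Fortran Kernels code itself; the function name `loop_summary` and the four sample rates are hypothetical and chosen only for the example.

```python
import math


def loop_summary(rates):
    """Summarise per-loop rates (Mflop/s): extremes, three means, max/min ratio."""
    if not rates or min(rates) <= 0:
        raise ValueError("rates must be a non-empty list of positive numbers")
    n = len(rates)
    arithmetic = sum(rates) / n
    geometric = math.exp(sum(math.log(r) for r in rates) / n)
    harmonic = n / sum(1.0 / r for r in rates)
    return {
        "min": min(rates),
        "max": max(rates),
        "arithmetic": arithmetic,
        "geometric": geometric,
        "harmonic": harmonic,
        # Ratio of best to worst loop: the "instability" or "speciality".
        "instability": max(rates) / min(rates),
    }


# Hypothetical rates for four loops, not measured values.
summary = loop_summary([1.0, 2.0, 4.0, 8.0])
```

For positive rates the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean, so the three averages bracket each other but say little about the spread; the minimum and the instability carry that information.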
\section{\label{oneproc}Single-Processor Benchmarks} The single-processor low-level benchmarks provided by \Parkbench{} aim to measure performance parameters that characterise the basic architecture of the computer, and the compiler software through which it is used. For this reason, such benchmarks have also been called appropriately 'basic architectural benchmarks'. Following the methodology of EuroBen~\cite{FGHS89}, the aim is that these hardware/compiler parameters will be used in performance formulae that predict the timing and performance of the more complex kernels (see Chapter~\ref{kernel}) and compact applications (see Chapter~\ref{compact}). They are therefore a set of 'synthetic' benchmarks contrived to measure theoretical parameters that describe the severity of some overhead or potential bottleneck, or the properties of some item of hardware. Thus RINF1 characterises the basic properties of the arithmetic pipelines by measuring the parameters \rnhalf, and POLY1 and POLY2 characterise the severity of the memory bottleneck by measuring the parameters \rfhalf. The fundamental measurement in any benchmarking is the measurement of elapsed wall-clock time. Because the computer clocks on each node of a multi-node MPP are not synchronised, all benchmark time measurements must be made with a single clock on one node of the system. The benchmarks TICK1 and TICK2 have, respectively, been designed to measure the resolution and check the absolute value of this clock. These benchmarks should be run with satisfactory results before any further benchmark measurements are made. \subsection{Timer resolution: TICK1} TICK1 measures the interval between ticks of the clock being used in the benchmark measurements, that is to say the resolution of the clock. A succession of calls to the timer routine is inserted in a loop and executed many times. The differences between successive values given by the timer are then examined. 
If the changes in the clock value (or ticks) occur less frequently than the time taken to enter and leave the timer routine, then most of these differences will be zero. When a tick takes place, however, a difference equal to the tick value will be recorded, surrounded by many zero differences. This is the case with clocks of poor resolution, for example most UNIX clocks that tick typically every 10 ms. Such poor UNIX clocks can still be used for low-level benchmark measurements if the benchmark is repeated, say, 10,000 times, and the timer calls are made outside this repeat loop. With some computers, such as the CRAY series, the clock ticks every cycle of the computer, that is to say every 6 ns on the Y-MP. The resolution of the CRAY clock is therefore approximately one million times better than a UNIX clock, and that is quite a difference! If TICK1 is used on such a computer the difference between successive values of the timer is a very accurate measure of how long it takes to execute the instructions of the timer routine, and therefore is never zero. TICK1 takes the minimum of all such differences, and all it is possible to say is that the clock tick is less than or equal to this value. Typically this minimum will be several hundred clock ticks. With a clock ticking every computer cycle, we can make low-level benchmark measurements without a repeat loop. Such measurements can even be made on a busy timeshared system (where many users are contending for memory access) by taking the minimum time recorded from a sample of, say, 10,000 single execution measurements. In this case, the minimum can usually be said to apply to a case when there was no memory access delay caused by other users. TICK1 exists and forms part of the Genesis benchmarks~\cite{Hey91}. 
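The procedure just described can be sketched as follows. This is a minimal Python illustration of the idea, not the Genesis TICK1 code (which is Fortran); the function name `tick_estimate` and the choice of `time.perf_counter` as the default timer are assumptions made for the example.

```python
import time


def tick_estimate(samples=100_000, timer=time.perf_counter):
    """Return (smallest non-zero difference, fraction of zero differences).

    The first value is an upper bound on the clock tick: on a coarse clock it
    equals the tick, on a fine clock it is the cost of one timer call.
    """
    readings = [timer() for _ in range(samples)]
    diffs = [b - a for a, b in zip(readings, readings[1:])]
    nonzero = [d for d in diffs if d > 0.0]
    if not nonzero:
        raise RuntimeError("timer never advanced; increase samples")
    zero_fraction = 1.0 - len(nonzero) / len(diffs)
    return min(nonzero), zero_fraction


if __name__ == "__main__":
    tick, zero_fraction = tick_estimate()
    print("tick <= %.3e s, zero differences: %.1f%%" % (tick, 100 * zero_fraction))
```

A large fraction of zero differences indicates a coarse clock, in which case the timer calls belong outside a repeat loop; a fraction near zero indicates that the reported value is dominated by the timer call itself.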
\subsection{Timer value: TICK2} TICK2 confirms the correctness of the time values returned by the computer clock, by comparing its measurement of a given time interval with that of an external wall-clock (actually the benchmarker's wristwatch). Parallel benchmark performance can only be measured using the elapsed wall-clock time, because the objective of parallel execution is to reduce this time. Measurements made with a CPU-timer (which only records time when its job is executing in the CPU) are clearly incorrect, because the clock does not record waiting time when the job is out of the CPU. TICK2 will immediately detect the incorrect use of a CPU-time-for-this-job-only clock. An example of a timer that claims to measure elapsed time but is actually a CPU-timer, is the returned value of the popular Sun UNIX timer ETIME. TICK2 also checks that the correct multiplier is being used in the computer system software to convert clock ticks to true seconds. TICK2 exists and will form part of the next release of the Genesis benchmarks~\cite{Hey91}. \subsection{Basic Arithmetic Operations: RINF1} This benchmark takes a set of common Fortran DO-loops and analyses their time of execution in terms of the two parameters \rnhalf ~\cite{Hoc77,HoJe81,Hoc82,Hoc83,Hoc87,HoJe88}. \rinf is the asymptotic performance rate in Mflop/s which is approached as the loop (or vector) length, $n$, becomes longer. \nhalf (the half-performance length) expresses how rapidly, in terms of increasing vector length, the actual performance, $r$, approaches \rinf. It is defined as the vector length required to achieve a performance of one half of \rinf. This means that the time, $t$, for a DO-loop corresponding to $q$ vector operations (i.e. 
with $q$ floating-point operations per element per iteration) is approximated by: \begin{equation} t = q * ( n + \nhalf ) / \rinf \label{Eqn1} \end{equation} Then the performance rate is given by \begin{equation} r = \frac{q*n}{t} = \frac{\rinf}{(1+\nhalf /n)} \label{Eqn2} \end{equation} We can see from Eqn.(\ref{Eqn1}) that \nhalf is a way of measuring the importance of vector startup overhead (=\nhalf/\rinf) in terms of quantities known to the programmer (loop or vector length). In the benchmark program, the two parameters are determined by a least-squares fit of the data to the straight line defined by Eqn.(\ref{Eqn1}). A useful guide to the significance of \nhalf is to note from Eqn.(\ref{Eqn2}) that 80 percent of the asymptotic performance is achieved for vectors of length $4 \times \nhalf$. Generally speaking, \nhalf values of up to about 50 are tolerable, whereas the performance of computers with larger values of \nhalf is severely constrained by the need to keep vector lengths significantly longer than \nhalf. This requirement makes the computers difficult to program efficiently, and often leads to disappointing performance, compared to the asymptotic rate advertised by the manufacturer. RINF1 exists as part of the Hockney and Genesis benchmarks~\cite{Hey91}. An independently written version forms module MOD1AC of the EuroBen benchmarks~\cite{StRi93}. \subsection{Memory-Bottleneck Benchmarks: POLY1 and POLY2} Even if the vector lengths are long enough to overcome the vector startup overhead, the peak rate of the arithmetic pipelines may not be realised because of the delays associated with obtaining data from the cache or main memory of the computer. The POLY1 and POLY2 benchmarks quantify this dependence of computer performance on memory access bottlenecks. The computational intensity, $f$, of a DO-loop is defined as the number of floating-point operations (flop) performed per memory reference (mref) to an element of a vector variable~\cite{HoJe88}. 
The asymptotic performance, \rinf, of a computer is observed to increase as the computational intensity increases, because as this becomes larger, the effects of memory access delays become negligible compared to the time spent on arithmetic. This effect is characterised by the two parameters (\rhat,\fhalf), where \rhat~ is the peak hardware performance of the arithmetic pipeline, and \fhalf is the computational intensity required to achieve half this rate. That is to say the asymptotic performance is given by: \begin{equation} \rinf = \frac{\rhat}{(1+\fhalf/f)} \label{Eqn3} \end{equation} If memory access and arithmetic are not overlapped, then \fhalf can be shown to be the ratio of arithmetic speed (in Mflop/s) to memory access speed (in Mword/s). The parameter \fhalf, like \nhalf, measures an unwanted overhead and should be as small as possible. In order to vary $f$ and allow the peak performance to be approached, we choose a kernel loop that can be computed with maximum efficiency on any hardware. This is the evaluation of a polynomial by Horner's rule, in which case the computational intensity is the order of the polynomial, and both the multiply and add pipelines can be used in parallel. To measure \fhalf, the order of the polynomial is increased from one to ten, and the measured performance for long vectors is fitted to Eqn.(\ref{Eqn3}). The POLY1 benchmark repeats the polynomial evaluation for each order typically 1000 times for vector lengths up to 10,000, which would normally fit into the cache of a cache-based processor. Except for the first evaluation the data will therefore be found in the cache. POLY1 is therefore an {\em in-cache} test of the memory bottleneck between the arithmetic registers of the processor and its cache. POLY2, on the other hand, flushes the cache prior to each different order and then performs only one polynomial evaluation, for vector lengths from 10,000 up to 100,000, which would normally exceed the cache size. 
Data will have to be brought from off-chip memory, and POLY2 is an {\em out-of-cache} test of the memory bottleneck between off-chip memory and the arithmetic registers. The POLY1 benchmark exists as MOD1G of the EuroBen benchmarks~\cite{StRi93}. POLY2 exists as part of the Hockney benchmarks. \section{Multi-Processor Benchmarks} The \Parkbench suite of benchmark programs provides low-level benchmarks to characterise the basic communication properties of an MPP by measuring the parameters \rnhalf for communication (COMMS1, COMMS2, COMMS3). The ratio of arithmetic speed to communication speed (the hardware/compiler parameter \fhalf for communication) is measured by the POLY3 benchmark. The ability to synchronise the processors in a large MPP, in an acceptable time, is a key characteristic of such computers, and the SYNCH1 benchmark measures the number of barrier statements that can be executed per second as a function of the number of processors taking part in the barrier. \subsection{Communication Benchmarks: COMMS1 and COMMS2} The purpose of the COMMS1, or {\em Pingpong}, benchmark \cite{Hoc88,Hoc91} is to measure the basic communication properties of a message-passing MIMD computer. A message of variable length, $n$, is sent from a master node to a slave node. The slave node receives the message into a Fortran data array, and immediately returns it to the master. Half the time for this `message pingpong' is recorded as the time, $t$, to send a message of length, $n$. In the COMMS2 benchmark there is a message exchange in which two nodes simultaneously send messages to each other and return them. In this case advantage can be taken of bidirectional links, and a greater bandwidth can be obtained than is possible with COMMS1. 
In both benchmarks, the time as a function of message length is fitted by least squares using the parameters \rnhalf \cite{Hoc82,HoJe88} to the following linear timing model: \begin{equation} t = (n + \nhalf)/\rinf \label{Eqn(4.1)} \end{equation} so that the communication rate is given by \begin{equation} r = \frac {\rinf}{1+\nhalf/n} = \rinf \pipe (n/\nhalf) \label{Eqn(4.2)} \end{equation} \begin{equation} \where \spten \pipe (x) = \frac {1}{1 + 1/x} \end{equation} and the startup time is \begin{equation} t_0 = \nhalf/\rinf \label{Eqn(4.3)} \end{equation} In the above equations, \rinf is the {\em asymptotic bandwidth} of communication which is approached as the message length tends to infinity (hence the subscript), and \nhalf is the message length required to achieve half this asymptotic rate. Hence \nhalf is called the {\em half-performance message length}. The importance of the parameter \nhalf is that it provides a yardstick with which to measure message length, and thereby enables one to distinguish the two regimes of short and long messages. For long messages $(n \gg \nhalf)$, the denominator in equation \ref{Eqn(4.2)} is approximately unity and the communication rate is approximately constant at its asymptotic rate, \rinf \begin{equation} r \approx \rinf \label{Eqn(4.3.5)} \end{equation} For short messages $(n \ll \nhalf)$, the communication rate is best expressed in the algebraically equivalent form \begin{equation} r = \frac {\pi_0 n} {(1+ n/ \nhalf)} \label{Eqn(4.4)} \end{equation} \begin{equation} \where \spten \pi_0 = t_0 ^{-1} = \rinf/\nhalf \label{Eqn(4.5)} \end{equation} For short messages, the denominator in equation \ref{Eqn(4.4)} is approximately unity, so that \begin{equation} r \approx \pi_0 n = n / t_0 \label{Eqn(4.6)} \end{equation} In sharp contrast to the approximately constant rate in the long-message limit, the communication rate in the short message limit is seen to be approximately proportional to the message length. 
The constant of proportionality, $\pi_0$, is known as the {\em specific performance}, and can be expressed conveniently in units of kilobyte per second per byte (kB/s)/B or k/s. Thus, in general, we may say that \rinf characterises the long-message performance and $\pi_0$ the short-message performance. The COMMS1 benchmark computes all four of the above parameters, $(\rinf, \nhalf, t_0, \rmand \pi_0)$, because each emphasises a different aspect of performance. However only two of them are independent. In the case that there are different modes of transmission for messages shorter or longer than a certain length, the benchmark can read in this breakpoint and perform a separate least-squares fit for the two regions. An example is the Intel iPSC/860 which has a different message protocol for messages shorter than and longer than 100 byte. Because of the finite (and often large) value of $t_0$, the above is a {\em two-parameter} description of communication performance. It is therefore incorrect, and sometimes positively misleading, to quote only one of the parameters (e.g. just \rinf, as is often done) to describe the performance. The most useful pairs of parameters are \rnhalf, $(\pi_0,\nhalf)$ and $(t_0,\rinf)$, depending on whether one is concerned with long messages, short messages or a direct comparison with hardware times. Note also that, although \nhalf is defined as the message length required to obtain half the asymptotic rate \rinf, the two parameters \rnhalf are sufficient to calculate the communication rate for any message length via equation \ref{Eqn(4.2)}, or equivalently using $\pi_0$ instead of \rinf via \ref{Eqn(4.4)}. The COMMS1 and COMMS2 benchmarks exist as part of the Genesis benchmarks~\cite{Hey91}. 
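The least-squares fit to the linear timing model $t = t_0 + n/\rinf$ and the derivation of the four parameters can be sketched as below. This is a minimal Python illustration under stated assumptions, not the Genesis COMMS1 code: the function names `fit_rinf_nhalf` and `rate` are hypothetical, the fit is a single unweighted straight line with no breakpoint handling, and the data are synthetic values generated from assumed parameters (2 byte/$\mu$s and 100 byte), not measurements.

```python
def fit_rinf_nhalf(lengths, times):
    """Unweighted least-squares fit of t = t0 + n / r_inf.

    Returns r_inf, n_half, t0 and pi0 in units implied by the inputs
    (byte and microsecond give r_inf in byte/us, i.e. MB/s).
    """
    m = len(lengths)
    if m < 2 or m != len(times):
        raise ValueError("need at least two (n, t) pairs of equal count")
    mean_n = sum(lengths) / m
    mean_t = sum(times) / m
    sxx = sum((n - mean_n) ** 2 for n in lengths)
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in zip(lengths, times))
    if sxx == 0 or sxy == 0:
        raise ValueError("data do not determine a non-zero slope")
    slope = sxy / sxx                # slope of the line is 1 / r_inf
    t0 = mean_t - slope * mean_n     # intercept is the startup time
    r_inf = 1.0 / slope
    n_half = t0 * r_inf              # since t0 = n_half / r_inf
    pi0 = 1.0 / t0 if t0 > 0 else float("inf")
    return {"r_inf": r_inf, "n_half": n_half, "t0": t0, "pi0": pi0}


def rate(n, r_inf, n_half):
    """Communication rate for message length n from the two-parameter model."""
    return r_inf / (1.0 + n_half / n)


# Synthetic timings from assumed r_inf = 2 byte/us and n_half = 100 byte.
lengths = [10, 100, 1000, 10000]
times = [(n + 100) / 2.0 for n in lengths]
params = fit_rinf_nhalf(lengths, times)
```

With these synthetic data the fit recovers the assumed parameters, and `rate(100, 2.0, 100.0)` gives half the asymptotic rate, as the definition of the half-performance message length requires.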
\subsection{Total Saturation Bandwidth: COMMS3} To complement the above communication benchmarks, there is a need for a benchmark to measure the total saturation bandwidth of the complete communication system, and to see how this scales with the number of processors. A natural generalisation of the COMMS2 benchmark could be made as follows, and be called the COMMS3 benchmark: Each processor of a $p$-processor system sends a message of length $n$ to the other $(p-1)$ processors. Each processor then waits to receive the $(p-1)$ messages directed at it. The timing of this generalised 'pingping' ends when all messages have been successfully received by all processors; although the process will be repeated many times to obtain an accurate measurement, and the overall time will be divided by the number of repeats. The time for the generalised pingping is the time to send $p(p-1)$ messages of length $n$ and can be analysed in the same way as COMMS1 and COMMS2 into values of \rnhalf. The value obtained for \rinf is the required total saturation bandwidth, and we are interested in how this scales up as the number of processors $p$ increases and with it the number of available links in the system. This benchmark does not exist, but Roger Hockney will develop a trial version for the Intel iPSC, followed by PARMACS and PVM. Perhaps suitable and better benchmarks exist elsewhere. Please send in your suggestions. \subsection{Communication Bottleneck: POLY3} POLY3 assesses the severity of the communication bottleneck. It is the same as the POLY1 benchmark except that the data for the polynomial evaluation is stored on a neighbouring processor. The value of \fhalf obtained therefore measures the ratio of arithmetic to communication performance. Equation~\ref{Eqn3} shows that the computational intensity of the calculation must be significantly greater than \fhalf (say 4 times greater) if communication is not to be a bottleneck. 
In this case the computational intensity is the ratio of arithmetic performed on a processor to words transferred to/from it over communication links. In the common case that the amount of arithmetic is proportional to the volume of a region, and the data communicated is proportional to the surface of the region, the computational intensity is increased as the size of the region (or granularity of the decomposition) is increased. Then the \fhalf obtained from this benchmark is directly related to the granularity that is required to make communication time unimportant. The POLY3 benchmark does not exist, although native versions have been used on transputer systems~\cite{Hoc91}. A trial benchmark will be prepared for Intel iPSC computers by Roger Hockney, followed by PARMACS and PVM versions. \subsection{Synchronisation Benchmarks: SYNCH1} SYNCH1 measures the time to execute a barrier synchronisation statement as a function of the number of processes taking part in the barrier. The practicability of massively parallel computation with thousands or tens of thousands of processors depends on this barrier time not increasing too fast with the number of processors. The results are quoted both as a barrier time, and as the number of barrier statements executed per second (barr/s). The SYNCH1 benchmark exists as part of Genesis v2.1.1~\cite{Hey91}. \begin{verbatim} -------------------------------------------------------------------------- ------------------------ END CHAPTER-3 --------------------------------- -------------------------------------------------------------------------- -------------------------------------------------------------------------- ------------------------ START APPENDIX -------------------------------- -------------------------------------------------------------------------- According to the wishes of the Parkbench committee, the following sections may be either omitted completely in the final report, or relegated to an Appendix. 
Their purpose is to state the current status of the low-level benchmarks, and to give some examples of measurements as an aid to judging the value of the benchmarks.
--------------------------------------------------------------------------
------------------------- END APPENDIX --------------------------------
--------------------------------------------------------------------------
\end{verbatim}
\section{Appendix}
\subsection{Summary of Benchmark Status}
Table-\ref{Table3} summarises the current state of the proposed low-level benchmarks, and the properties they are intended to measure.
\begin{table}
\centering
{\small
\parbox{5in}{
\caption{\label{Table3} Status of proposed Low-Level benchmarks. Note we abbreviate performance (perf.), arithmetic (arith.), communication (comms.), operations (ops.).}}
\begin{tabular}{llccc}
\hline
Benchmark & Measures & Parameters & Exists & Author \\
 & & & & \\
\hline
SINGLE-PROCESSOR \\
TICK1 & Timer resolution & tick interval & Genesis & Hockney \\
TICK2 & Timer value & wall-clock check& Genesis & Hockney \\
RINF1 & Basic Arith. ops. & \rnhalf & Genesis & Hockney \\
POLY1 & Cache-bottleneck & \rfhalf & EuroBen & Hockney \\
POLY2 & Memory-bottleneck & \rfhalf & Hockney & Hockney \\
\hline
MULTI-PROCESSOR \\
COMMS1 & Basic Message perf. & \rnhalf & Genesis & Hockney \\
COMMS2 & Message exch. perf. & \rnhalf & Genesis & Hockney \\
COMMS3 & Saturation Bandwidth & \rnhalf & No & Hockney \\
POLY3 & Comms. Bottleneck & \rfhalf & No & Hockney \\
SYNCH1 & Barrier time & rate (\barr/s) & Hockney & Hockney \\
\hline
\end{tabular}
}
\end{table}
% ----------------------------------------------------------------------------
\subsection{Arithmetic Benchmark Results}
As an indication of the type of results given by the proposed low-level arithmetic benchmarks, Table-\ref{Table1} gives measurements made on a number of workstations, and microprocessor chips that are used as processing nodes in multiprocessor MIMD computers.
\begin{table}
\centering
{\small
\parbox{5in}{\caption{\label{Table1} Performance of some common numerical benchmarks on some common workstations and microprocessor chips used in MIMD computers. Measurements were made with the highest level of optimisation that ran, and are in Mflop/s for 64-bit precision, except where stated in parentheses. The units of \nhalf are vector length, and \fhalf are flop/mref (floating-point operations per memory reference). Results are for the best generally available compiler on the date shown. Those for the i860 are for the first Greenhills compiler which is known not to use many important i860 hardware features. Later more advanced compilers should give significantly better results. }}
\begin{tabular}{lccccccc}
\hline
 &Sun &Solbourne&Stardent&Inmos&Intel&IBM RS/ &DEC \\
Benchmark &Sparc1&System 5 &TS2025 &T800 &i860 &6000-530 &$\alpha$\\
 & & & &20MHz &40MHz& 25MHz &133MHz \\
\hline
d/m/y &18/1/90&25/1/90 &8/8/89 &15/4/89&6/8/90&14/6/90&13/1/93 \\
% m/y & 1/90 & 1/90 & 8/89 & 4/89 & 8/90 & 6/90 & 1/93 \\
\hline
Linpackd & 1.27 & 2.79 & 4.32 & 0.33 & 3.89 & 9.54 & 20.7 \\
n=100 \\
\hline
Livermore & 2.36 & 4.64 & & 0.72 & 8.76 & 31.8 & 46.6 \\
Maximum \\
\hline
Livermore & 0.45 & 0.89 & 0.45 & 0.10 & 0.47 & 1.34 & 4.47 \\
Minimum \\
\hline
RINF1(32') \\
\rinf & 1.29 & 2.50 &19.29 & 0.34 & 4.62 & 5.13 & 33.8 \\
(\nhalf) &(0.30)& (1.00) &(1.03) & (0) &(3.61) & &(12.2) \\
\hline
POLY1 \\
\rhat & 2.50 & 5.18 &42.31 & &10.59 &25.85 & 88.9\\
(\fhalf) &(0.77)& (0.60) &(0.51) & &(1.12) &(0.34)& (0.71)\\
\hline
\hline
\end{tabular}
}
\end{table}
Table-\ref{Table1} shows that the DEC $\alpha$ chip outperforms all other workstations and chips on all benchmarks by a significant margin, as befits the start of a new generation of chips. However, one cannot help being impressed by the figures. The remaining workstations and chips are compared with each other below.
Table-\ref{Table1} shows that the IBM RS/6000 chip set performs best on the LINPACKD100 benchmark, followed by the Stardent ST2025, which has a vector architecture. The i860 performs significantly worse than the IBM RS/6000. However, the benchmark performance of both machines is expected to improve as their compilers develop.

Table-\ref{Table1} gives the maximum and minimum performance observed in the 24 Livermore loops. The minimum performance can be taken as the worst scalar arithmetic performance that is likely to be found, and the maximum as the best performance that is likely to be seen on highly vectorisable loops. The computer with the best worst-case performance, which is a very good metric to examine, is the IBM RS/6000, followed by the Solbourne. The best maximum performance is seen in the RS/6000, followed by the i860.

The RINF1 benchmark gives values of the \rnhalf parameters for the kernel A=B*C (vector = vector $\times$ vector), and shows the Stardent ST2025 performing best, with the highest \rinf and a low \nhalf, followed by the IBM RS/6000.

The POLY1 benchmark shows the Stardent ST2025 with the highest peak performance, followed by the IBM RS/6000 and then the i860. Of these three, the value of \fhalf for the IBM is best, and that for the Stardent is quite low, but the value greater than one for the i860 shows that there is a severe memory bottleneck problem with this chip that will prevent it from getting close to its peak advertised performance on many problems.

\subsection{Example Results for the COMMS1 benchmark}

We report below results for the COMMS1 benchmark on the \Suprenum, Intel iPSC/860~\cite{Hoc91} and Touchstone Delta~\cite{HoCa92} computers. Table-\ref{Table4.1} gives the values obtained for the communication parameters, in the version of the benchmark using the native \Suprenum extensions to the Fortran90 language. These include SEND and RECEIVE language statements with a syntax similar to that of the Fortran READ and WRITE statements.
The asymptotic stream rate, or bandwidth, (\rinf) shows considerable variation on the Suprenum, depending on how the data to be transferred is specified in the I/O list of the SEND statement. A variable length array in Fortran90 syntax in single precision achieves 0.67 MB/s, whereas the same statement specified in double precision achieves 4.8 MB/s. This double-precision rate is about twice that observed on the iPSC/860 with its CSEND Fortran subroutine, which sends an array whose length is specified in bytes.

The principal difference between the two computers is the magnitude of the startup time, $t_0$, which is $73\mu s$ on the iPSC/860 compared with about 3ms on the Suprenum. Since the startup time, via $\pi_0$, determines the transfer rate for short messages (say $<100$B), we see that the Suprenum is 45 times slower than the iPSC/860 for short messages. On the other hand, the Suprenum has almost twice the stream rate for long messages (as seen by the value of \rinf), provided the most favourable format (i.e. double precision or 64-bit) is used in the I/O list. One may compute from these numbers that the iPSC/860 is faster at transferring messages for all message lengths less than 16,481 bytes. The longer startup time on the Suprenum results in larger values of \nhalf, showing that longer messages are needed to achieve any given fraction of the asymptotic rate.

The results for the Touchstone Delta show that this computer has the fastest short and long message performance of the three, judged respectively by the values of $\pi_0$ and \rinf. However, the improvement in short message performance over the iPSC/860 is only marginal, and the long message performance is only about one quarter of the advertised bandwidth of 25MB/s. Hardware and software improvements made since these measurements were taken should, however, have improved the results.
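
The comparisons above can be checked by hand. As a sketch only, assuming the usual two-parameter linear timing model for COMMS1 (the superscripts $S$ and $I$ for the Suprenum and the iPSC/860, and the symbol $n^{*}$ for the break-even message length, are notation introduced here for illustration), the parameters of Table-\ref{Table4.1} are related by
\begin{eqnarray*}
t(n) & = & t_0 + \frac{n}{r_\infty}, \qquad
n_{1/2} = r_\infty t_0, \qquad
\pi_0 = \frac{1}{t_0}, \\
n^{*} & = & \frac{t_0^{S} - t_0^{I}}{1/r_\infty^{I} - 1/r_\infty^{S}}
\approx \frac{2.64\,{\rm ms} - 0.200\,{\rm ms}}{(0.357 - 0.207)\,\mu{\rm s/B}}
\approx 1.6 \times 10^{4}\,{\rm B}.
\end{eqnarray*}
Here the double-precision Suprenum values and the iPSC/860 values for $N>100$ have been used. The estimate is of the same size as the break-even length quoted in the text; the exact figure depends on how the tabulated parameters were rounded.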
If we compare the new generation of production computers, the Intel Paragon XP/S and the Meiko CS-2, we find, on the dates stated, the CS-2 to have a higher communication performance than the Paragon for both short ($\pi_0$) and long messages (\rinf), and therefore for all message lengths. However, both computers are at an early stage of hardware and software development, and both have considerable development potential. The COMMS1 benchmark will continue to be used to track this competition in communication performance, and the success of both manufacturers in achieving a high performance for both short and long messages.

\begin{table}
\centering
{\small
\parbox{3.5in}{
\caption{\label{Table4.1} Values of (\rinf,\nhalf, $t_0$, $\pi_0$) for the communication of messages between two nodes of the same cluster on the Suprenum and neighbouring nodes on the Intel iPSC/860, Touchstone Delta, Intel Paragon and Meiko CS-2 computers. The Delta measurements were made on 17 Jan. 1992, and should have been improved by subsequent hardware and software changes. Paragon measurements were made at ORNL 25-28 May, 1993, and the CS-2 measurements were made at Southampton University 9 July, 1993.}}
\begin{tabular}{llcccc}
\hline
Specification & Range & \rinf & \nhalf & $t_0$ & $\pi_0$ \\
 & B* & MB/s & B & ms & k/s \\
\hline
SUPRENUM \\
sp SEND A(1:N) & & 0.67 & 2041 & 3.05 & 0.328 \\
dp SEND A(1:N) & & 4.82 & 12740 & 2.64 & 0.378 \\
\hline
INTEL iPSC/860 \\
CSEND (,A,N,,) & $N<100$ & 2.36 & 179 & 0.074 & 13.5 \\
 & $N>100$ & 2.80 & 560 & 0.200 & 5.0 \\
\hline
INTEL Delta \\
CSEND (,A,N,,) & $N<512$ & 3.48 & 213 & 0.061 & 16.3 \\
 & $N>512$ & 6.76 & 892 & 0.132 & 7.57 \\
\hline
INTEL Paragon XP/S \\
CSEND (,A,N,,) & $N<40000$ & 23.5 & 4044 & 0.172 & 5.80 \\
\hline
Meiko CS-2 \\
PARMACS & $N<40000$ & 43.0 & 3747 & 0.087 & 11.5 \\
\hline
* B - byte
\end{tabular}
}
\end{table}
% ----------------------------------------------------------------------------
%------------------------------------------------------------------------
% PARKBENCH REPORT (third draft), END OF FILES
%------------------------------------------------------------------------

From owner-pbwg-comm@CS.UTK.EDU Thu Aug 19 12:35:38 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib)
	id AA06924; Thu, 19 Aug 93 12:35:38 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK)
	id AA18814; Thu, 19 Aug 93 12:33:36 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 19 Aug 1993 12:33:34 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from THUD.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK)
	id AA18798; Thu, 19 Aug 93 12:33:33 -0400
From: Jack Dongarra
Received: by thud.cs.utk.edu (5.61+IDA+UTK-930125/2.7c-UTK)
	id AA01693; Thu, 19 Aug 93 12:33:00 -0400
Date: Thu, 19 Aug 93 12:33:00 -0400
Message-Id: <9308191633.AA01693@thud.cs.utk.edu>
To: pbwg-comm@cs.utk.edu
Subject: agenda for meeting

Enclosed is the agenda for the ParkBench meeting on Monday.
If you have not already done so, please let me know if you are planning to attend.

Regards,
Jack

PARKBENCH AGENDA: 23 AUGUST 1993

1. Minutes of last meeting (24th May 1993)

2. Reports and Discussion of subgroups: second reading of latest draft of Parkbench Report
   2.1 Introduction (All)
   2.2 Methodology (David Bailey)
   2.3 Low-Level (Roger Hockney/Tony Hey)
   2.4 Kernels (Tony Hey)
   2.5 Compact Applications (David Walker)
   2.6 Compiler Benchmarks (Tom Haupt)
   2.7 Conclusions/Recommendations (All)

3. Open Discussion/Further Actions
   3.1 Production of Parkbench Report for Supercomputing '93
   3.2 Dissemination/Adoption strategy for Parkbench suite
   3.3 Future meeting schedule
   3.4 Chairmanship

4. Date and Venue of Supercomputing '93 Meeting in Portland

5. A.O.B.

From owner-pbwg-comm@CS.UTK.EDU Thu Aug 19 15:22:31 1993
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib)
	id AA09981; Thu, 19 Aug 93 15:22:31 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK)
	id AA29834; Thu, 19 Aug 93 15:21:19 -0400
X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 19 Aug 1993 15:21:16 EDT
Errors-To: owner-pbwg-comm@CS.UTK.EDU
Received: from gemini.npac.syr.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK)
	id AA29826; Thu, 19 Aug 93 15:21:11 -0400
Received: from localhost by gemini.npac.syr.edu with SMTP
	id AA00640 (5.65c/IDA-1.4.4 for pbwg-comm@cs.utk.edu); Thu, 19 Aug 1993 15:20:53 -0400
Message-Id: <199308191920.AA00640@gemini.npac.syr.edu>
To: R.Hockney@pac.soton.ac.uk
Cc: pbwg-comm@cs.utk.edu, haupt@npac.syr.edu
Subject: compiler benchmarks
Date: Thu, 19 Aug 93 15:20:51 -0400
From: haupt@npac.syr.edu
X-Mts: smtp

COMPIL2.TEX
---------------
Appended is the first draft of chapter 6 on compiler benchmarks. Please accept it as an invitation to discussion rather than the final version.
Tom Haupt, 19 Aug 1993

%------------------------------------------------------------------------
% PARKBENCH REPORT (second draft), File: compil2.tex
%------------------------------------------------------------------------
%file compil2.tex
%compiled by Tom Haupt for compiler benchmarks subcommittee
\chapter{Compiler Benchmarks} \footnote{assembled by Tom Haupt for Compiler Benchmarks subcommittee}

\section{Objectives and Metric}

For most users, the performance of the code generated by a compiler is what actually matters. The metric for the performance evaluation is the wall--clock execution time of selected benchmark applications in, say, seconds, or the execution time normalized to a standard floating-point operation count, say, in GFLOP/s, as defined in chapter 2. A representative suite of benchmark applications is described in other parts of this document (Kernels, chapter 4, and Compact Applications, chapter 5), and we will provide HPF versions of these codes.

For HPF compiler developers and implementors, however, an additional benchmark suite may be very useful: one that can evaluate specific HPF compilation phases and the compiler runtime support. For that purpose, the best metric is the ratio of the execution times of compiler-generated and hand-coded programs, as a function of the problem size and the number of processors engaged in the computation.

The compilation process can be logically divided into several phases, and each of them influences the efficiency of the resulting code. The initial stage is the parsing of the source code, which results in an internal representation of the code. It is followed by compiler transformations, such as data distribution, loop transformations, computation distribution, communication detection, sequentialization, insertion of calls to the runtime support, and others. This we will call the HPF-specific phase of compilation. The compilation is concluded by the code generation phase.
For portable compilers that output F77+message passing code, the node compilation is obviously factored out, and the efficiency of the node compiler can be evaluated separately.

This benchmark suite addresses the HPF-specific phase only. Thus, it is well suited for performance evaluation of both translators (HPF to F77+message passing) and genuine HPF compilers. The parsing phase is an element of conventional compiler technology and is not of interest in this context. The code generation phase involves optimization techniques developed for sequential compilers (in particular, Fortran 90 compilers) as well as micro-grain parallelism or vectorization. The object codes for specific platforms may be strongly architecture dependent (e.g., they may be very different for processors with vector capabilities than for those without). Evaluation of the performance of these aspects requires techniques different from those proposed here.

It is worth noting that the HPF-specific phase strongly affects the scope for optimization of the node codes. For example, insertion of calls to the communication library may prevent the node compiler from performing many standard optimizations without expensive interprocedural analysis. Therefore, the capability to exploit opportunities for optimization at the HPF level, and to generate the output code in such a way that it can be further optimized by the node compiler, is an important element in the evaluation of HPF compilers. Nevertheless, evaluating the HPF-specific phase separately is very valuable, since hand-coded programs face the same problems. We will address these issues in future releases of the benchmark suite.

Compilers for massively parallel and distributed systems are still the object of research and laboratory testing rather than commercial products. Parallel compiler technology, as well as the methods of evaluating it, is not yet mature. The advent of the HPF standard gives an opportunity to develop systematic benchmarking techniques.
The current definition of HPF cannot be regarded as an ultimate solution for parallel computing. Its limitations are well known, and many researchers are working on extensions to HPF to address a broader class of real-life commercial and scientific applications. We expect new language features to be added to the HPF definition in future versions of HPF, and we will extend the benchmark suite accordingly. On the other hand, new parallel languages based on languages other than Fortran, notably C++, are becoming more and more popular. Since parallelism is inherent in a problem and not in its representation, we anticipate many commonalities in the parallel languages and the corresponding compiler technologies, notably a shared runtime support. Therefore, we decided to address this benchmark suite to those aspects of the compilation process that are inherent to parallel processing in general, rather than to test syntactic details of HPF.

\section{High Performance Fortran}

HPF is an extension of Fortran 90 to support the data parallel programming model, defined as single threaded, global name space, loosely synchronous parallel computation. The idea behind HPF is to provide the means to produce scalable, portable, top-performance codes for MIMD and SIMD computers with non-uniform memory access cost. Portability of HPF codes means that the efficiency of the code is preserved on different machines with a comparable number of processors.

The HPF extensions to the Fortran 90 standard fall into four categories: compiler directives, new language features, library routines and restrictions to Fortran 90. The HPF compiler directives are structured comments that suggest implementation strategies or assert facts about a program to the compiler. They may affect the efficiency of the computation performed, but they do not change the value computed by the program.
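
As an illustration of directives acting as structured comments, the fragment below is a minimal sketch written for this text; it is not one of the suite kernels, and the names DIR, P, T and X and the sizes are hypothetical. The directives ask for a block distribution of an array over four processors. In fixed source form each directive line begins with the comment prefix CHPF$, so a Fortran 90 compiler without HPF support ignores it and the program computes the same values.

\begin{verbatim}
      program DIR
C     sketch only: the names and sizes are illustrative
      real X(1000)
CHPF$ processors P(4)
CHPF$ template T(1000)
CHPF$ distribute T(block) onto P
CHPF$ align X(:) with T(:)
C     the array assignments below are unaffected by the directives
      X = 1.0
      X = 2.0 * X
      end
\end{verbatim}

Removing the four directive lines changes only how the elements of X are placed on the processors, not the values stored in X.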
The new language features are the FORALL statement and construct, as well as minor modifications and additions to the library of Fortran 90 intrinsic functions. In addition, HPF introduces new functions that may be used to express parallelism, such as new array reduction functions, array combining scatter functions, array prefix and suffix functions, array sorting functions and others. These functions are collected in a separate library, the HPF library. Since it was anticipated that not all algorithms can be easily expressed in HPF syntax, an escape mechanism, the extrinsic functions, has been introduced. The extrinsic functions may be written in languages other than HPF and may support a different computational model. Finally, HPF imposes some restrictions on the Fortran 90 definition of storage and sequence association.

The HPF approach is based on two key observations. First, the overall efficiency of the program can be increased if many operations are performed concurrently by different processors, and secondly, the efficiency of a single processor is likely to be highest if the processor performs computations on data elements stored in its local memory. Therefore, the HPF extensions provide the means for explicit expression of parallelism and data mapping. It follows that an HPF programmer expresses parallelism explicitly, and the data distribution is tuned accordingly to control the load balance and minimize communication. On the other hand, given a data distribution, an HPF compiler may be able to identify operations that can be executed concurrently, and thus generate even more efficient code.

To speed up commercial implementations of HPF compilers, an HPF subset has been defined as well. The subset comprises selected Fortran 90 features, notably array assignments and allocatable arrays, intrinsic functions, and interface blocks.
In addition, the subset excludes some HPF features, such as dynamic mappings, the pure and extrinsic function attributes, the FORALL construct, and the HPF library.

\section{Benchmark Suite}

The benchmark suite comprises several simple, synthetic applications which test several aspects of HPF compilation. The current version of the suite addresses the basic features of HPF, and it is designed to measure the performance of early implementations of the compiler. The kernels concentrate on testing the parallel implementation of explicitly parallel statements, i.e., array assignments, FORALL statements, INDEPENDENT DO loops, and intrinsic functions, with different mapping directives. Language features not included in the HPF subset are not addressed in this release of the suite. Later releases will contain more kernels that address all features of HPF, and they will also be sensitive to advanced compiler transformations.

Parallel implementation of array assignments, including FORALL statements, is a central issue for an early HPF compiler. Given a data distribution, the compiler distributes the computation over the available processors. An efficient compiler achieves an optimal load balance with minimum interprocessor communication. Kernels AA, SH, ST, TM, FL, and IR address that problem. They represent applications of different degrees of difficulty, from kernel AA, which is very regular and easy to implement, to kernel IR, which requires difficult unstructured interprocessor communication that is specified only at runtime.

Every array assignment written according to Fortran 90 syntax can be expressed as a FORALL statement. It is a matter of the programmer's preference which syntax to use. The idea behind introducing FORALL in HPF is to generalize array assignments to make expressing parallelism easier. Kernel FL provides several examples of FORALL statements that are difficult or inconvenient to write using Fortran 90 syntax.
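
The equivalence between the two notations can be seen in a small sketch. The fragment below was written for this text and is not one of the suite kernels; the program name FAEQ, the arrays A, B and D and the size N are chosen purely for illustration. It writes the same shifted update first as a Fortran 90 array assignment and then as a FORALL statement, and ends with a FORALL that has no equally direct array-section form.

\begin{verbatim}
      program FAEQ
C     sketch only: the names and sizes are illustrative
      integer N
      parameter ( N = 8 )
      real A(N), B(N), D(N,N)
      integer i
C     initialization
      B = 1.0
      A = 0.0
      D = 0.0
C     Fortran 90 array assignment
      A(2:N) = B(1:N-1)
C     the same assignment written as a FORALL statement
      FORALL ( i = 2:N ) A(i) = B(i-1)
C     setting a diagonal is direct with FORALL, but awkward
C     to express with array sections alone
      FORALL ( i = 1:N ) D(i,i) = A(i)
      end
\end{verbatim}

In both forms the right-hand side is evaluated for all index values before any element is assigned, which is what allows the compiler to distribute the iterations freely.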
Once the data and iteration spaces are distributed, the next step that strongly influences the efficiency of the resulting code is communication detection and the generation of code to execute the data movement. In general, the off-processor data elements must be gathered before execution of an array assignment, and the results must be scattered to the destination processors after the assignment is completed. In other words, some array assignments may require a preprocessing phase to determine which off-processor data elements are needed and to execute the gather operation. Similarly, they may require postprocessing (scatter). Many different techniques may be used to optimize these operations. To achieve high efficiency, it may be very important that the compiler is able to recognize structured communication patterns, such as shift, multicast, etc. Kernels AA, SH, and ST introduce different structured communication patterns, and kernel IR is an example of an array assignment that requires unstructured communication (because of indirections). A good HPF compiler is expected to handle multiple indirections efficiently as well, and appropriate test kernels will be provided in later releases of the suite.

Sometimes programmers may help the compiler to minimize the necessary interprocessor communication by a suitable data mapping, in particular by defining a relative alignment of different data objects. This may be achieved by aligning the data objects with an explicitly declared template. Kernel TM provides an example of this kind.

The above test applications essentially address compiler transformations of a single statement. Interstatement and interprocedural analysis introduces additional opportunities for aggressive optimization (loop transformations, optimization of temporary array usage, and overlapping of communication and computation, to name a few). These may be tested using the benchmark applications described in chapters 4 and 5.
Some of the issues are addressed in the EP kernel; more will be added in kernels addressing FORALL constructs.

The RD and EP kernels test the performance of codes in which the parallelism is expressed in other ways: by using intrinsic functions (RD) and the INDEPENDENT DO construct (EP). A future release of the suite will also contain examples of the use of functions from the HPF library, and examples of nested INDEPENDENT loops.

The last group of kernels, AS, IT, IM and EI, demonstrates passing distributed arrays as subprogram arguments. They represent four typical cases:
\begin{enumerate}
\item a known mapping of the actual argument is to be preserved by the dummy argument (AS).
\item the mapping of the dummy argument is to be inherited from the actual argument, thus no remapping is necessary. The mapping is known at compile time (IT).
\item the mapping of the dummy argument is to be identical to that of the actual argument, but the mapping is not known at compile time (IM).
\item a specific mapping of the dummy argument is forced, regardless of the mapping of the actual argument (EI).
\end{enumerate}

The next release of the suite will address other aspects of HPF compilation, including usage of allocatable arrays and pointers, dynamic (re)mappings, the FORALL construct, PURE functions, extrinsic functions, etc.

\section{Description of Codes}

\subsection{AA: Array Assignments}

This simple kernel is taken from the Livermore Loops benchmark suite \cite{Liv}. The sequence of Fortran 90 array assignments is such that in many implementations it requires no communication. The distribution of computation to achieve perfect load balancing is straightforward. The resulting code is expected to be as efficient as a hand-coded message passing version. Possible differences in execution time may reflect the overhead generated by the HPF compiler to set up the environment.
\begin{verbatim}
      program AA
C =======================================================
C array assignments kernel
C taken from the Livermore Loops, kernel 9
C =======================================================
      parameter ( N = 1024 )
      real PX(13,N), Q
      real DM22, DM23, DM24, DM25, DM26, DM27, DM28, C0
      integer i, k
CHPF$ processors p(16)
CHPF$ template d(1024)
CHPF$ distribute d(block)
CHPF$ align PX(*,I) with d(I)
C
      call start_timer()
C initialization
      ...
      FORALL ( i = 1:N )
     *   PX(1,i) = DM28 * PX(13,i) + DM27 * PX(12,i) +
     *             DM26 * PX(11,i) + DM25 * PX(10,i) +
     *             DM24 * PX(9,i)  + DM23 * PX(8,i)  +
     *             DM22 * PX(7,i)  + C0 * (PX(5,i) +
     *             PX(6,i)) + PX(3,i)
      call stop_timer()
C print results
      end
\end{verbatim}

\subsection{SH: Array Assignments with Shift}

This is another Livermore kernel. The array assignments require multiple collective communications. This kernel demonstrates the performance of the runtime shift function, as well as the ability of a compiler to minimize communication requests.

\begin{verbatim}
      program SH
C ==============================================================
C array assignments with shifts
C taken from the Livermore Loops, kernel 7
C ==============================================================
      parameter ( N = 1024 )
      real U(N), X(N), Y(N), Z(N), Q, R, T
      integer k
CHPF$ processors p(8)
CHPF$ template d(1024)
CHPF$ distribute d(block)
CHPF$ align (:) with d(:) :: X, Y, Z, U
      call start_timer()
C initialization
      ...
      forall (k=1:N-6)
     &   X(k)= U(k) + R*( Z(k) + R*Y(k)) +
     &         T*( U(k+3) + R*( U(k+2) + R*U(k+1)) +
     &         T*( U(k+6) + Q*( U(k+5) + Q*U(k+4))))
C
      call stop_timer()
C print results
      end
\end{verbatim}

\subsection{ST: Array Assignments with Strides}

This kernel extends the test of the compiler's ability to recognize structured communication patterns beyond shifts.
\begin{verbatim}
      program ST
C =======================================================
C array assignments kernel
C taken from the Livermore Loops, kernel 9
C =======================================================
      integer N,M,K1,K2,I
      parameter ( N = 1024 )
      parameter ( M = N/2 )
      real, dimension(N,N) :: A, B
CHPF$ processors p(4,4)
CHPF$ template d(1024,1024)
CHPF$ distribute d(block,block)
CHPF$ align WITH d :: A,B
C
      call start_timer()
C initialization
      ...
C both sides of the strided assignment have M elements
      A(1:M,K1)=B(1:N:2,K2)
      FORALL(I=2:M) A(I,K1)=A(2*I-1,K1)*B(I,1)
      call stop_timer()
C Print results
      end
\end{verbatim}

\subsection{TM: Usage of a Template}

This kernel uses an explicitly declared template to force a relative alignment of arrays that minimizes communication.

\begin{verbatim}
      PROGRAM TM
C =================================================================
C template kernel
C adapted from the Purdue Set
C =================================================================
      INTEGER N,M
      DATA N / 1023 /
      DATA M / 1023 /
      INTEGER NDIM,MDIM,ND1,MD1
      PARAMETER (NDIM=1023,MDIM=1023,ND1=NDIM+1,MD1=MDIM+1)
      REAL, DIMENSION(MDIM) :: R
      REAL, DIMENSION(NDIM) :: C
      REAL, DIMENSION(NDIM,MDIM) :: A
      REAL, DIMENSION(ND1,MD1) :: ABIG
      REAL, DIMENSION(1,1) :: ACORN
      REAL, DIMENSION(NDIM-1,MDIM-1) :: B
CHPF$ PROCESSORS P(8)
CHPF$ TEMPLATE TOM(1024)
CHPF$ DISTRIBUTE TOM(BLOCK)
CHPF$ ALIGN ABIG(i,*) with TOM(i)
CHPF$ ALIGN A(i,*) with TOM(i)
CHPF$ ALIGN C(:) with TOM(:)
CHPF$ ALIGN B(i,*) with TOM(i+2)
      call start_timer()
      forall(i=1:m) r(i)=1.0+i
      forall(i=1:n) c(i)=1.0-i
      forall(i=1:n,j=1:m) a(i,j)=i+j
      ACORN = .5
      ABIG(1:N,1:M)=A
      ABIG(1:N,M+1)=C
      ABIG(N+1,1:M)=R
      ABIG(N+1,M+1)=0.5
      FORALL(I=3:N+1,J=3:M+1) B(I,J)=A(I-2,J-2)
      call stop_timer()
C print results
      STOP
      END
\end{verbatim}

\subsection{RD: Intrinsic Reduction Functions}

This kernel is adapted from the Purdue Set \cite{Rice}. It demonstrates the performance of selected intrinsic functions.
\begin{verbatim}
      PROGRAM RD
      INTEGER NS,NT
      DATA NS / 128 /
      DATA NT / 4092 /
      INTEGER NTDIM,NSDIM
      PARAMETER (NTDIM = 4092)
      PARAMETER (NSDIM = 128)
      REAL, DIMENSION(NTDIM,NSDIM) :: SCORES
      LOGICAL, DIMENSION(NTDIM,NSDIM) :: ABOVE
      LOGICAL, DIMENSION(NTDIM) :: TEMP
      INTEGER NABOVE
      REAL AVER,AVERTOP,LOWABO
      LOGICAL GENIUS
CHPF$ PROCESSORS Q(8)
CHPF$ TEMPLATE TOM(4092)
CHPF$ DISTRIBUTE TOM(BLOCK)
CHPF$ ALIGN SCORES(i,*) with TOM(i)
CHPF$ ALIGN ABOVE(i,*) with TOM(i)
CHPF$ ALIGN TEMP(:) with TOM(:)
      call start_timer()
c     SCORES=60.0+40.0*SIN(SPREAD([1:NT],2,NS)*
c    +       SPREAD([1:NS],1,NT)*0.0006321)
      forall(i=1:nt,j=1:ns) scores(i,j)=
     +       60.0+40.0*sin(real(i))*j*0.0006321
      SSUM=SUM(SCORES)
C average over all NT*NS entries
      AVER=SSUM/(NT*NS)
      WHERE(SCORES.GT.AVER)
        ABOVE=.TRUE.
        SCORES=SCORES*1.1
      ELSEWHERE
        ABOVE=.FALSE.
      ENDWHERE
      NABOVE=COUNT(ABOVE)
      AVERTOP=SUM(SCORES,MASK=ABOVE)/NABOVE
      LOWABO=MINVAL(SCORES,MASK=ABOVE)
c     GENIUS=ANY(ALL(ABOVE,DIM=1))
      TEMP=ALL(ABOVE,DIM=1)
      GENIUS=ANY(TEMP)
      call stop_timer()
C print results
      STOP
      END
\end{verbatim}

\subsection{FL: FORALL statement}

This kernel makes use of the FORALL statement. Note that the array assignments in this program are not easily expressed in Fortran 90 array syntax.

\begin{verbatim}
      PROGRAM FL
C ===================================================
C forall statement kernel
C ==================================================
      REAL, DIMENSION(1024,1024) :: x,y
      REAL, DIMENSION(32) :: s
CHPF$ PROCESSORS P(4,4)
CHPF$ TEMPLATE T(1024,1024)
CHPF$ DISTRIBUTE T(BLOCK,BLOCK)
CHPF$ ALIGN WITH T :: X,Y
      n=1024
      m=32
      call start_timer()
      FORALL (i=1:n,j=1:n) y(i,j)=1.0/REAL(i+j-1)
      FORALL (k=1:n) x(k,1:m)=y(1:m,k)
      FORALL (i=1:m-1) s(i)=SUM(x(1:n:m,i))
      call stop_timer()
C print results
      stop
      end
\end{verbatim}

\subsection{EP: Embarrassingly Parallel}

This kernel, adapted from the NAS benchmark suite \cite{NAS}, is provided in two versions: one using Fortran 90 syntax, and the other written as a Fortran 77 DO loop with the HPF assertion directive INDEPENDENT.
For many platforms, the latter version may be much more efficient because it uses HPF's NEW scalar variables, as opposed to the explicitly declared arrays necessary in Fortran 90 syntax.

\begin{verbatim}
(HPF version of the NAS EP kernel is to be included here)
\end{verbatim}

\subsection{IR: Irregular Communication}

The code of this kernel does not provide much information for compile-time optimization (because of the indirections). The efficiency of the code depends solely on the efficiency of the runtime support, including communication scheduling.

\begin{verbatim}
      PROGRAM IR
C ============================================================
C irregular communications kernel
C ============================================================
      PARAMETER (NIT = 100)
      PARAMETER (M=32)
      PARAMETER (ME = M*(M-1), ME2 = 2*ME, M2 = M*M)
      REAL, DIMENSION(M2) :: Y
      INTEGER, DIMENSION(ME2) :: L,R,U,D,IM
CHPF$ PROCESSORS P(8)
CHPF$ TEMPLATE TP(M2)
CHPF$ TEMPLATE TE(ME2)
CHPF$ DISTRIBUTE TP(CYCLIC)
CHPF$ DISTRIBUTE TE(CYCLIC)
CHPF$ ALIGN WITH TP :: Y
CHPF$ ALIGN WITH TE :: L,R,U,D,IM
C start the timer before initialization, as in the other kernels
      CALL start_timer()
      FORALL (I=1:ME) IM(I)= (I + I/M)/M
      FORALL(I=1:M2) Y(I)= I/M*0.1 + MOD(M,I)*0.2
      FORALL (I=1:ME) L(I)=I+IM(I)
      FORALL (I=1:ME) R(I)=L(I)+1
      FORALL (I=1:ME) U(I)=1+IM(I)+(MOD(IM(I),M)-1)*M
      FORALL (I=1:ME) D(I)=1+IM(I)+MOD(IM(I),M)*M
      L(ME+1:ME2) = U(1:ME)
      R(ME+1:ME2) = D(1:ME)
      DO IT = 1, NIT
        FORALL(I=1:ME2) Y(L(I)) = 0.5 * (Y(L(I)) + Y(R(I)))
      ENDDO
      CALL stop_timer()
C print results
      END
\end{verbatim}

\subsection{AS: Assertion on the Mapping of the Actual Argument}

This kernel tests the performance of a program that calls a subprogram with a distributed array as an argument. The mappings of the actual and dummy arguments are known at compile time.
\begin{verbatim}
      PROGRAM AS
C ===========================================================
C assertions on the mapping of the actual argument
C ==========================================================
      REAL, DIMENSION(1024,1024) :: A
CHPF$ PROCESSORS P(4,4)
CHPF$ TEMPLATE T(1024,1024)
CHPF$ DISTRIBUTE T(BLOCK,BLOCK)
CHPF$ ALIGN A(:,:) WITH T(:,:)
      call start_timer()
      FORALL(I=1:1024,J=1:1024) A(I,J)=I+0.1*J
      DO I=1,1000
        CALL SUBA(A)
      ENDDO
      call stop_timer()
C print results
      END

      SUBROUTINE SUBA(X)
      REAL, DIMENSION(1024,1024) :: X
CHPF$ PROCESSORS Q(4,4)
CHPF$ DISTRIBUTE X *(BLOCK,BLOCK) ONTO *Q
      FORALL (I=2:1023,J=2:1023) X(I,J)=0.25*(X(I+1,J)+X(I-1,J)+
     &                                  X(I,J-1)+X(I,J+1))
      RETURN
      END
\end{verbatim}

\subsection{IT: Inherited Template}

In this kernel, an array section is passed as an actual argument. The INHERIT directive signals to the compiler that no remapping is to be performed.

\begin{verbatim}
      PROGRAM IT
C ===========================================================
C inherited template
C ==========================================================
      PARAMETER (N=1024)
      PARAMETER (N1 = 1, N2 = N/2, N3 = N2+1, N4 = N)
      REAL, DIMENSION(N,N) :: A
CHPF$ PROCESSORS P(4,4)
CHPF$ TEMPLATE T(N,N)
CHPF$ DISTRIBUTE T(BLOCK,BLOCK)
CHPF$ ALIGN A(:,:) WITH T(:,:)
      call start_timer()
      FORALL(I=1:N,J=1:N) A(I,J)=I+0.1*J
      DO I=1,1000
        CALL SUBA(A(N1:N2,N1:N2),N1,N2)
        CALL SUBB(A(N3:N4,N3:N4),N3,N4)
      ENDDO
      call stop_timer()
C print results
      END

      SUBROUTINE SUBA(X,N1,N2)
      REAL, DIMENSION(:,:) :: X
CHPF$ INHERIT X
CHPF$ PROCESSORS Q(4,4)
CHPF$ DISTRIBUTE X *(BLOCK,BLOCK) ONTO *Q
      FORALL (I=N1:N2,J=N1:N2) X(I,J)=0.25*(X(I+1,J)+X(I-1,J)+
     &                                X(I,J-1)+X(I,J+1))
      RETURN
      END

      SUBROUTINE SUBB(X,N1,N2)
      REAL, DIMENSION(:,:) :: X
CHPF$ INHERIT X
CHPF$ PROCESSORS Q(4,4)
CHPF$ DISTRIBUTE X *(BLOCK,BLOCK) ONTO *Q
      FORALL (I=N1:N2,J=N1:N2) X(I,J)=0.25*(X(I+1,J)-X(I-1,J)+
     &                                X(I,J-1)-X(I,J+1))
      RETURN
      END
\end{verbatim}

\subsection{IM: Inherited Mapping}

In this kernel, a subprogram inherits the mapping of the actual argument.
Thus, the mapping is known only at run time. The subroutine is called twice, with different mappings.

\begin{verbatim}
      PROGRAM IM
C ===========================================================
C inherited mapping
C ==========================================================
      PARAMETER (N=1024)
      REAL, DIMENSION(N,N) :: A,B
CHPF$ PROCESSORS P(4,4)
CHPF$ DISTRIBUTE A(BLOCK,BLOCK)
CHPF$ DISTRIBUTE B(BLOCK,*)
      call start_timer()
      FORALL(I=1:N,J=1:N) A(I,J)=I+0.1*J
      FORALL(I=1:N,J=1:N) B(I,J)=I+0.1*J
      DO I=1,1000
        CALL SUBA(A)
        CALL SUBA(B)
      ENDDO
      call stop_timer()
C print results
      END

      SUBROUTINE SUBA(X)
      REAL, DIMENSION(:,:) :: X
C local work array with the same shape as the dummy argument
      REAL, DIMENSION(SIZE(X,1),SIZE(X,2)) :: A
CHPF$ PROCESSORS Q(4,4)
CHPF$ DISTRIBUTE X * ONTO *Q
CHPF$ ALIGN A(:,:) WITH X(:,:)
      A=CSHIFT(X,1,DIM=1)
      X=0.5*(A+X)
      RETURN
      END
\end{verbatim}

\subsection{EI: Mapping of a Dummy Argument Declared in an Explicit Interface}

A specific mapping of the dummy argument is forced by explicit mapping directives.

\begin{verbatim}
(in preparation)
\end{verbatim}

\section{Example Results}

\begin{verbatim}
(efficiency of the benchmark codes generated by the Fortran 90D compiler are to be presented here)
\end{verbatim}

\section{Summary}

The synthetic compiler benchmark suite described here is an addition to the benchmark kernels and applications described in chapters 4 and 5. It is not meant as a tool to evaluate the overall performance of compiler-generated codes. It has been introduced as an aid for compiler developers and implementors, to address selected aspects of the HPF compilation process. In the current version, the suite does not comprise a comprehensive sample of HPF codes; in fact, it addresses only the HPF subset. We hope that it will contribute to the establishment of a systematic compiler benchmarking methodology. We intend to continue our effort to develop a complete, fully representative HPF benchmark suite.
% ---------------------------------------------------------------------------- % --- end of chapter on compiler benchmarks ---------------------------------- % ---------------------------------------------------------------------------- From owner-pbwg-comm@CS.UTK.EDU Fri Aug 27 11:11:54 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA09284; Fri, 27 Aug 93 11:11:54 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA09655; Fri, 27 Aug 93 11:09:40 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 27 Aug 1993 11:09:36 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA09637; Fri, 27 Aug 93 11:09:32 -0400 Via: uk.ac.southampton.ecs; Fri, 27 Aug 1993 16:09:12 +0100 From: R.Hockney@parallel-applications-centre.southampton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Fri, 27 Aug 93 15:56:58 BST Date: Fri, 27 Aug 93 15:05:39 GMT Message-Id: <21035.9308271505@calvados.pac.soton.ac.uk> To: pbwg-comm@cs.utk.edu Subject: Message from Roger Hockney A PERSONAL MESSAGE FROM YOUR CHAIRMAN ------------------------------------- I first apologise for using the Parkbench forum for what is partly a personal matter, but it does seem a good way to reach anyone who might be interested. After working in the US for about 9 years (1962 to 1970 at Stanford, NASA Langley and IBM Yorktown Heights) I have accumulated 37 US Social Security credits, which is 3 credits short of the minimum of 40 required to claim a US pension. I am therefore looking for some way of making up these missing credits, this year and/or next, by way of temporary employment of about a month (or whatever is necessary to gain 3 credits, I don't know exactly). The employment must, of course, be such that social security contributions are deducted from pay. 
It occurs to me that some members of Parkbench might be at institutions with visitors programs, who would be interested in my visiting and working with their staff, and/or lecturing on, topics such as: (1) Parallel program performance characterisation (parametrisation) leading performance OPTIMISATION. (2) Preparation of parallel programs, in particular PIC codes. (3) Running and interpretation of parallel benchmark results, leading to performance PREDICTION based on a small number of parameters. (4) Giving a course of lectures on "Parallel Computer Architecture, Algorithms and Performance Evaluation". (5) Any other related topic, of particular interest to the employer. I am quite flexible. If there is anyone out there interested to pursue this idea, or who has suggestions please communicate preferably via my e-mail: rwh@pac.soton.ac.uk or by letter to: 4 Whitewalls Close Compton NEWBURY, RG16 0QG England, UK Voice phone: +44 (635) 578679 FAX: same but speak to me first Best Regards Roger Hockney (Emeritus Professor of Computer Science, Reading University, Visiting Professor, Southampton University) From owner-pbwg-comm@CS.UTK.EDU Mon Aug 30 23:03:01 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA27002; Mon, 30 Aug 93 23:03:01 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA18766; Mon, 30 Aug 93 23:00:05 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 30 Aug 1993 23:00:04 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from BERRY.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA18745; Mon, 30 Aug 93 23:00:02 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (5.61++/2.7c-UTK) id AA03491; Mon, 30 Aug 93 23:00:01 -0400 Message-Id: <9308310300.AA03491@berry.cs.utk.edu> To: pbwg-comm@cs.utk.edu Subject: Minutes Date: Mon, 30 Aug 1993 23:00:00 -0400 From: "Michael W. 
Berry" I've enclosed a draft of the Minutes for the ParkBench Meeting on Aug. 23. Please email all corrections to berry@cs.utk.edu Thanks, Mike --- Minutes of the 4th PARKBENCH (Formerly PBWG) Workshop ----------------------------------------------------- Place: Science Alliance Conference Room South College University of Tennessee Knoxville, TN Host: Jack Dongarra ORNL/Univ. of Tennessee Date: August 23, 1993 Attendees/Affiliations: Michael Berry, Univ. of Tennessee Philip Tannenbaum, NEC/SPEC Ed Kushner, Intel David Mackay, Intel Charles Grassl, Cray Research Bodo Parady, Sun Microsystems David Bailey, NASA Jack Dongarra, Univ. of Tennessee/ORNL Tom Haupt, Univ. of Syracuse Tony Hey, Univ. of Southampton Joanne Martin, IBM David Walker, ORNL The meeting started at 9:10 EDT with Tony Hey chairing the meeting for the absent Roger Hockney. Tony asked if there were any changes to the minutes of the May Parkbench meeting. There were no changes suggested by the attendees so the minutes were accepted. Tony then led the group through the current draft of the Parkbench Report to be distributed at Supercomputing '93 in Portland. Each attendee was provided a copy of the current draft along with a handout containing modified sections written by David B. David B. suggested that his 0.1 Philosophy section be inserted before Roger's text in Chapter 1 (Introduction). David B. stressed that reproducibility be mentioned in the current draft. Phil T. pointed out that Perfect's new focus allows optimized versions to eventually become new baseline numbers. Tony H. felt that Parkbench should have some sort of relationship with SPEC/PB. Tom H. said that he would provide a few sentences for compiler benchmark development in the Introduction. Tony H. then motioned that the group move on to discuss Chapter 2 (Methodology). David B. stressed that CPU time has some value but that the pitfalls in its use should be included in Section 2.2 (Time Measurement). David B.
will merge his section 0.2 in with Roger's Section 2.2. Charles G. pointed out that one cannot really define minimum performance (worst case). The group then discussed what methodology for measuring elapsed wall-clock time should be used. Charles G. offered the following 4 options: (1) make 3 runs and take the minimum, (2) make 3 runs and take the maximum, (3) make 3 runs and take the arithmetic average, and (4) say nothing but recommend that 3 runs be made and ask for complete details on how the reported time was obtained. The group agreed that option (4) would be most appropriate for ParkBench at this time. Charles G. also suggested that TICK1/TICK2 essentially measure the accuracy of the timers. Ed K. explained a timing problem observed by Paragon users after reboots -- he indicated that there should be flexibility in reported timings so that "typical execution time" actually be stored in the Parkbench database. David B. agreed to insert a discussion of reproducibility into Roger's Section 2.8 (Benchmarking Procedure and Code Optimisation). Charles G. asserted that there should also be some emphasis on how runs were timed with respect to sensitivity. Tony H. then led the discussion on to Section 2.3 (Units and Symbols) and questioned whether or not this section should really be in an Appendix. David B. recommended that the Section be left where it is and most of the other attendees concurred. Mike B. asked the group to change "msend" to "send" on page 4 of the report following Roger H.'s suggestion. Moving on to Section 2.4 (Floating-Point Operation Count), David B. felt that the "8 flop" listed for "exponential, sine etc." was not realistic. He and Joanne M. felt a more correct assessment might be "20 flop". Phil T. pointed out that the Cray HPM was used for flop counts of the Perfect Benchmarks. Bodo P. asserted that since caching is typically used to evaluate many intrinsic functions, reported counts might be distorted. Tony H.
stressed that counts are needed for Mflop/s ratios that will ultimately be used. Jack D. indicated that he would be willing to contact J. Demmel and W. Kahan for expert advice on what flop counts should be associated with the various operations -- all attendees agreed that this should be done. David B. and Joanne M. suggested that there should be standard flop counts that are in some sense "close" to those for a real machine (e.g., Cray). Charles G. suggested that a bandwidth parameter be included -- theoretical data-access rate, for example. David B. agreed to modify Section 2.4 to include a memory parameter, and then asked whose responsibility is it to produce flop counts for the Parkbench codes? He suggested that a formula be given for the Kernel benchmarks and perhaps Cray HPM counts for the Compact-Application benchmarks. Tony H. pointed out that standard flop counts should be provided for each problem size. The discussion then turned to Section 2.5 (Performance metrics). David B. indicated that his Section 0.3 would replace Section 2.5.5 (Speedup, Efficiency, and Performance per Node) on page 9 of the draft. David B. also suggested that there be no speedup statistic kept in the Parkbench Database (PDS). Phil T. suggested that David B.'s points (4) and (5) on uniprocessor times and speedup when the problem is too large to run on a single processor/node be combined with his (3) for a general policy on speedup statistics. Charles G. stressed that efficiency is important and speedup relative to Amdahl's Law be considered. David B. suggested that the discussion of speedup be limited to avoid controversy. Jack D. and Michael B. agreed to provide a few postscript images of PDS for an Appendix to the Parkbench report. Bodo P. suggested that the adjective "obsolescent" be inserted before "SPECmarks" on the last line of page 10. SPECfp92 and SPECint92 are the current SPEC benchmarks reported. With regard to Section 2.7 (Interactive Graphical Interface), Tony H.
indicated that the Southampton Group would provide the graphical front end for the Parkbench database. Bodo P. asked that a single output format be developed for inclusion in documents (e.g., embedded postscript for LaTeX documents). Jack D. indicated that each subcommittee will be responsible for providing codes to the Parkbench suite. Phil T. questioned whether optimization of codes (Section 2.8) should be limited, e.g., use high level languages only? Joanne M. responded that there should be no restrictions. Tony H. pointed out that vendors will use a variety of tactics to optimize kernels and low-level benchmarks, and that compact applications will most likely require different forms of optimization. David B. insisted that any library routines used should be found in any vendor-supported library (i.e., could at least be purchased). Tony H. reminded the attendees that there should ultimately be 3 versions of each benchmark: Fortran90, PVM/MPI, and HPF. Phil T. questioned whether or not these versions would follow any "standards". Jack D. indicated that MPI (or HPF) would not become an ANSI-supported standard. Bodo P. suggested that "XOPEN" has a procedure for creating standards without going through ANSI-related bureaucracy. XOPEN is a nonprofit British corporation. A majority of the attendees felt that XOPEN might be an appropriate outlet for establishing standards for MPI or HPF. Bodo P. asked if PVM uses shared-memory function calls. Jack D. indicated that a new version of PVM which uses shared memory was under development. Bodo P. indicated that P4 does support shared memory. Phil T. was concerned that moving biased code to new machines would not be fair and suggested that the "as-is" run should simply be a "minimum sanity run" which comprises minimal changes just to get the code to run. Joanne M. and Bodo P.
suggested that the first sentence in the Introduction (Chapter 1) indicate that "scalable distributed-memory (message-passing)" be the target architectures. The role of shared-memory architectures was not clear to several attendees. At 11:05 EDT, the group took a 15-minute coffee break. At 11:20 EDT, the meeting resumed with Tony H. leading the discussion of Chapter 3 (Low-Level Benchmarks) on page 15 of the report. There were no comments/changes for Section 3.1 (Introduction) so the discussion focused on Section 3.2 (Single-Processor Benchmarks) early on. Charles G. indicated that the descriptions of TICK1 and TICK2 do not really discuss a timing methodology and related caveats. He agreed to rewrite Sections 3.2.1 and 3.2.2 on TICK1 and TICK2, respectively. Jack D. suggested that a catalogue of machine-specific timing routines be maintained. Bodo P. and Phil T. felt that the use of IF DEF's in TICK1 and TICK2 would be sufficient to select the appropriate machine timing intrinsic function for elapsed wall-clock times. Jack D. agreed to write a paragraph (for Sections 3.2.1 and 3.2.2) describing how machine-dependent timer calls are handled. Tony H. asked for volunteers to test POLY1 and POLY2 (Section 3.2.4) and indicated that he would provide the source code. Tony also indicated that Roger H. would develop COMMS3 and POLY3 for testing on Intel machines (Sections 3.3.2 and 3.3.3). For Section 3.4 (Appendix), Ed K. agreed to provide more recent data for Table 3.2 (page 22). Jack D. pointed out that clock speeds were missing in this table and Tony H. indicated that Roger H. will contact Ed K. about acquiring more recent data for such tables. Tony H. stressed that vendors' approval be granted before listing such data. For the Kernel Benchmarks (Chapter 4), Tony H. indicated that he had discussed the availability of parallel kernels from TMC (page 28). Ed K.
indicated that Intel had single node and parallel libraries and mentioned the existence of routines for banded linear systems, matrix multiplication, and parallel FFT. Joanne M. indicated that IBM only had serial mathematical libraries. David B. pointed out the typo "diificuly" on page 25 of the draft. With regard to Matrix benchmarks (Section 4.2.1), Jack D. indicated that the benchmarks were almost complete -- only support software (makefiles, input/output files, etc.) is needed. He specifically mentioned the availability of matrix transposition, dense matrix multiplication for multicomputers, routines for reduction to tridiagonal (symmetric) and Hessenberg (unsymmetric) form, BLACS for PVM, LU and QR factorization. Michael B. pointed out that #5 on page 26 should be "conjugate gradient method for solving linear systems" rather than "eigenvalue problem by conjugate gradient." Jack D. indicated that both Fortran-77 and PVM versions of the conjugate gradient method exist. What is needed for all the above-mentioned kernels includes: problem sizes, standard input files, makefiles, and README files. Tony H. asserted that the group needs a timescale for future HPF versions of the kernel benchmarks. Tom H. agreed to help Jack D. with this. Jack D. indicated that he would also solicit the help of Jim Demmel (Berkeley). For the Fourier Transform benchmarks (Section 4.2.2), several attendees stressed the need for a large 1-D FFT and that the 3-D FFT and PDE by 3-D FFT (#2 and #3 on page 27) be merged into "3-D FFT". Tony H. reminded the group that a validation procedure will be needed for the 1-D FFT and David B. agreed to provide a convolution problem using a large vector of integers (e.g., 1024 ints). Using linear convolution, checksums would have to agree and the maximum deviation obtained on any machine could be measured. Charles G. agreed to help David B. on this. For the PDEs (Section 4.2.3), Tony H.
indicated that #1 (Jacobi), #2 (Gauss-Seidel), and #4 (Finite Element) would be dropped from the current list. David W. was unable to acquire the FEM code discussed at the May Parkbench meeting. Tony H. agreed to provide the SOR benchmark from the Genesis Benchmarks and David B. will provide a multigrid benchmark from the NAS Parallel Benchmarks. These 2 benchmarks do not solve the same problem (SOR for Poisson's Equation and Multigrid for another application) but several versions of them (serial, MPI, HPF, etc.) are possible. Jack D. agreed to write a formal letter to NASA on behalf of the Parkbench group to request the use of the NAS Parallel Benchmarks. David B. indicated that these codes are now available for world-wide distribution. For "Other" kernel benchmarks (Section 4.2.4), Tony H. agreed to add a description of a candidate I/O benchmark (not mentioned in the current draft). Joanne M. and Tony H. stressed the need for the Embarrassingly Parallel (EP) benchmark, and Ed K. felt the Large Integer Sort benchmark should be included as well but questioned if it should be a "paper and pencil" benchmark similar to the NAS Parallel Benchmarks. He indicated that David Culler (culler@cs.berkeley.edu) and Leo(?) Dagum (dagum@nas.nasa.gov) could help supply this benchmark code. Tony H. reminded the group that Roger H. can provide a Particle-In-Cell (PIC) code as a candidate kernel benchmark also. At 12:15 pm EDT, the group broke for a 1-hour lunch. At 1:15 pm EDT, Tony H. indicated that he would contact David W. later about the status of the Compact Applications which comprise Chapter 5 (David W. had not arrived yet). Tom H. asked if naming conventions like "matmul" would be used for kernel benchmarks since they might conflict with Fortran 90 intrinsics. Tony H. indicated that separate names according to subject area would be defined. A discussion of problem sizes prompted Phil T. to ask if 1 or 2 gigabytes of memory constituted a large problem? Charles G.
pointed out that scalability is important but not always possible (often algorithms will change as problem size increases). Joanne M. suggested the group think about future problem sizes to anticipate relevant problem sizes a few years from now. Jack D. suggested that 10 gigabytes defines a large problem and that 1-2 gigabytes would be a medium-size problem. Small problem sizes would be needed for simple testing purposes. Bodo P. asserted that memory should be specified as part of the dataset name. Tony H. summarized that there should be 3 problem sizes: (1) test problem size, (2) moderate size problem, and (3) grand challenge problem. All problem specifications would be denoted in README files. At this point, a discussion of the Compiler Benchmarks (Chapter 6) was led by Tom H., who suggested that the Introduction (Section 6.1) be merged with the Introduction (Chapter 1). He suggested that a new metric for the compiler benchmarks also be mentioned in this introduction: "ratio of execution times of compiler generated to hand-coded programs as a function of the problem size and number of processors engaged." Since there were no comments for Section 6.2 (HPF), Tom H. proceeded on to Section 6.3 (Benchmark Suite) and specified that no multi-statement optimization is addressed in the current compiler benchmark set. Tony H. asked if the name of these particular benchmarks should be changed to "HPF Compiler Benchmarks." He also suggested that an Appendix discussing MPI and HPF be included in the report. Tom H. then reviewed the Description of Codes in Section 6.4. He indicated that all were existing codes and that the last 3 involve FORALL statements while the last 4 involve array distribution. Bodo P. noticed that subroutine in-lining was missing from the set. Ed K. indicated that Intel would eventually have HPF versions of all the codes listed.
He also suggested that there be HPF compiler benchmarks more suitable for the compact application benchmarks since the current set of compiler benchmarks seem more appropriate for the kernel benchmarks. Jack D. asked just how many HPF compilers would be available by the time of Supercomputing '93 in November? Tom H. felt there would be none but Ed K. said there would be at least one. Jack D. suggested that the codes (currently in Section 6.4) be removed from the report. A description of them would suffice. At 1:45 pm EDT, David W. arrived at the meeting and Tony H. asked him to report on Compact Applications (Chapter 5). David W. indicated that there were no codes submitted since the May Parkbench meeting. He also suggested that there be an application form for submitting compact application benchmarks. Such a form would specify the total number of flops required and memory requirements, and could be posted to the Usenet News Group "comp.parallel." David B. will submit the NAS Parallel Benchmarks and David W. indicated that a Shallow-Water code and a Molecular Dynamics code would be available. Tony H. indicated that he could provide a "real" QCD code and then asked about the status of the FEM code discussed at the May meeting. David M. said that particular code was not readily available. Ed K. suggested that feedback from the Scientific Community was agreed. Tony H. concurred. Joanne M. suggested that 5 compact applications would be sufficient. David M. indicated that he would check again on the use of the FEM code (in the public domain). Tony H. asked David W. to contact Chuck Mosher about the ARCO suite (Joanne M. indicated she had Mosher's electronic mail address). David B. asserted that an N-body problem might be an appropriate compact application. The current list of compact applications considered by the group includes: (1) ARCO suite, (2) NAS CFD benchmarks, (3) QCD from Southampton, (4) Shallow Water code from ORNL, and (5) Molecular Dynamics code. Tony H.
then suggested that a Summary of Actions be reviewed so that attendees agreed on the remaining work to be done. In preparation for Supercomputing '93 in Portland, the following benchmark suite should be assembled:

 PDS/Xnetlib (UT)
 TICK1, TICK2, RIF1, POLY1, POLY2 (Roger H.)
 COMMS1, COMMS2, COMMS3, POLY3, SYNC1 (Roger H.; need PVM versions)
 PUMMA, LU, QR, CG, TRDI (UT/ORNL supplies seq. and PVM; Syracuse provides HPF)
 FFT 1-D, FFT3-D (David B. - NASA supplies seq. and Intel versions)
 SOR, Multigrid (seq., MPI, HPF provided by Tony H. and David B.)
 EP, Integer Sort (seq. provided by Tom H. - Syracuse)
 I/O (provided by Tony H. - Southampton)
 HPF Compiler Benchmarks (Tom H. - Syracuse)
 QCD (Tony H. - Southampton)
 CFD/NAS(3) (David B. - NASA; seq., MP, CMF)
 Molecular Dynamics (David W. - ORNL; seq., PICL)
 ARCO Suite (2) (Jack D. and David W.; seq. and P4)
 Shallow Water (David W. - ORNL and Tom H. - Syracuse)

Each code is to have 3 input sets as described earlier. UT/Netlib will collect/archive all candidate benchmarks for the Parkbench suite. Makefiles, README files, input files, etc. should be provided. Tony H. then called for a discussion on future actions. Tony suggested that subcommittee leaders put revised sections on the net, and Michael B. agreed to assemble the final draft (with the help of Roger H.). Revised text should be mailed to berry@cs.utk.edu by October 15, 1993. Joanne M. indicated that Supercomputing '93 runs from Tuesday to Friday the week of November 17, and the current tentative evening session for Parkbench is Wednesday, November 17, 4:30 pm. Tony H. asked for opinions on future chairmanship (on the behalf of Roger H.). All attendees agreed that Roger H. was best suited to fulfill this role for the immediate future. There was strong opposition to the discussion of future chairmanship at the open meeting in Portland. For future meetings, Joanne M. and Tony H. suggested that the group explore "videoconferencing" between a US site and a European site. Bodo P.
suggested that such meetings should be coordinated with SPEC. Jack D. offered to host a joint Parkbench/SPEC meeting for early 1994. Bodo P. then reviewed the organizational structure of SPEC for the attendees. He also mentioned a new SPEC benchmark called "PAR93" which comprises a 512 MB problem size for under 60 processors with automatic parallelism, shared-memory, and compiler/system tuning allowed. After Bodo P. completed the SPEC overview, Tony H. asked the group whether Parkbench should ultimately be sponsored by SPEC. At this time, Phil T. gave a short lecture on the evolution of SPEC/Perfect. Joanne M. suggested that Parkbench should be separate from SPEC/Perfect (may be called SPEC/HPSC - High Performance Steering Committee in the near future) at least during the initial development phase of the benchmark suite. Phil T. mentioned that the current application benchmarks for SPEC/HPSC include ARCO (fdmod, fkmig, seis), LANL (false, pueblo), IBM (turb3d), and ETH/Zurich (NCMD - QCD). All of these are Fortran-77 versions with only ARCO having a message-passing version available at this time. David W. also remarked that it was very premature to discuss a Parkbench-SPEC/HPSC merger at this time. Tony H. indicated that PEPS and Euroben should also be included and asked Jack D. to formally contact them (they would like to maintain a copy of the PDS data). Some of the final discussions at the meeting concerned the default precision the codes should be run in for validated results. Charles G. insisted that the precision be specified in the code documentation. Jack D. felt that the kernel benchmarks should normally be run with 64-bit arithmetic unless otherwise specified. Some of the compact application codes (e.g., ARCO suite) may be run in 32-bit precision, however, to reflect "real" use in industrial applications. David B. also pointed out that the precision used during a run should be denoted in a code's output. The meeting promptly adjourned at 4:04 pm EDT.
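As an illustration of the four wall-clock timing options discussed for Section 2.2, here is a minimal Python sketch. It is not part of the Parkbench suite: the names time_runs and summarise are hypothetical, and it assumes elapsed time is read with the standard time.perf_counter call. It reports the minimum, maximum and arithmetic average of 3 runs and also keeps the individual run times, in the spirit of option (4), which asks for complete details on how the reported time was obtained.

```python
import time


def time_runs(work, runs=3):
    """Return the elapsed wall-clock time (seconds) of each of `runs` calls to work()."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        work()
        elapsed.append(time.perf_counter() - start)
    return elapsed


def summarise(elapsed):
    """Summarise the same set of runs in the four ways discussed."""
    return {
        "min": min(elapsed),                   # option (1): minimum of the runs
        "max": max(elapsed),                   # option (2): maximum of the runs
        "mean": sum(elapsed) / len(elapsed),   # option (3): arithmetic average
        "runs": list(elapsed),                 # option (4): report every run
    }


if __name__ == "__main__":
    # Hypothetical workload standing in for a benchmark kernel.
    times = time_runs(lambda: sum(i * i for i in range(100_000)))
    print(summarise(times))
```

Keeping the raw run times alongside any single summary figure lets a reader see how sensitive the reported number is to run-to-run variation.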
End of Minutes for August 23, 1993 (M. Berry, berry@cs.utk.edu) From owner-pbwg-comm@CS.UTK.EDU Tue Aug 31 10:17:10 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA29301; Tue, 31 Aug 93 10:17:10 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA14499; Tue, 31 Aug 93 10:14:30 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 31 Aug 1993 10:14:29 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from gemini.npac.syr.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA14490; Tue, 31 Aug 93 10:14:23 -0400 Received: from localhost by gemini.npac.syr.edu with SMTP id AA16084 (5.65c/IDA-1.4.4); Tue, 31 Aug 1993 10:14:19 -0400 Message-Id: <199308311414.AA16084@gemini.npac.syr.edu> To: "Michael W. Berry" Cc: pbwg-comm@cs.utk.edu Subject: Re: Minutes In-Reply-To: Your message of "Mon, 30 Aug 93 23:00:00 EDT." <9308310300.AA03491@berry.cs.utk.edu> Date: Tue, 31 Aug 93 10:14:18 -0400 From: haupt@npac.syr.edu X-Mts: smtp >I've enclosed a draft of the Minutes for the ParkBench Meeting on >Aug. 23. Please email all corrections to berry.cs.utk.edu >Thanks, >Mike >--- > Minutes of the 4th PARKBENCH (Formerly PBWG) Workshop > ----------------------------------------------------- > > >At 1:15 pm EDT, (...) Tom H. asked if naming conventions like "matmul" >would be used for kernel benchmarks since they might conflict with Fortran 90 >intrinsics. Tony H. indicated that separate names according to subject area >would be defined. Actually I meant something else. When preparing an HPF version of the kernel for dense matrix multiplication we have two choices: - rewrite the kernel in HPF (as we plan to do with everything else) - use call to HPF (actually Fortran 90) intrinsic function MATMUL. I favor the latter, and I got impression that Tony and Jack share my opinion. 
---- Anyway, I need a hint how to prepare the kernel Thanks, Tom From owner-pbwg-comm@CS.UTK.EDU Wed Sep 1 07:17:57 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA05152; Wed, 1 Sep 93 07:17:57 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA25134; Wed, 1 Sep 93 07:14:13 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 1 Sep 1993 07:14:10 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA25126; Wed, 1 Sep 93 07:14:07 -0400 Via: uk.ac.southampton.ecs; Wed, 1 Sep 1993 12:07:01 +0100 From: R.Hockney@parallel-applications-centre.southampton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Wed, 1 Sep 93 11:57:41 BST Date: Wed, 1 Sep 93 11:06:27 GMT Message-Id: <24494.9309011106@calvados.pac.soton.ac.uk> To: pbwg-comm@cs.utk.edu Subject: Parkbench Minutes Although not present, I have the following comments on the minutes: >correct assessment might be "20 flop". Phil T. pointed out that the Cray HPM >was used for flop counts of the Perfect Benchmarks. Bodo P. asserted that >since caching is typically used to evaluate many >intrinsic functions, reported counts might be distorted. Tony H. stressed that POINT NEEDING CLARIFICATION --------------------------- An intrinsic function is evaluated from a series or other formula with a fixed and known number of mults and adds. I do not understand how this number is changed if caching is used or not. I do agree that the number of memory references and the time of execution will change, but these are not the points at issue. The number of flop is always the same. >data-access rate, for example. David B. agreed to modify Section 2.4 to >include a memory parameter, and then asked whose responsibility is it to >produce >flop counts for the Parkbench codes? 
He suggested that a formula be given for MY OPINION EXPRESSED LATER -------------------------- I (rwh) suggest that the responsibility for deciding on the official flop-count be that of the benchmark writer, because in complex codes he is probably the only one who knows what is going on well enough to dream-up a formula that is accurate enough for the purpose but not too complex to use easily. It is often a matter of judgement what approximations to make or what terms to omit. >suggested that the first sentence in the Introduction (Chapter 1) indicate that >"scalable distributed-memory (message-passing)" be the target architectures. >The role of shared-memory architectures was not clear to several attendees. NOTE FROM THE ABSENT CHAIRMAN ----------------------------- I (rwh) think that to specify scalable distributed-memory (message-passing) as the target architecture is unnecessarily restrictive because our F90 and HPF versions are intended for use on other architectures (e.g. non-MP SIMD, and shared memory). That is why the report is entitled generally as "Benchmarks for Parallel Computers", not "Benchmarks for Message-Passing computers". Also the word scalable is bad: all parallel computers (both shared memory and distributed) are scalable up to some practical limit, the correct question is not whether they scale or not, but by how much they scale. >particular code was not readily available. Ed K. suggested that feedback fro >the Scientific Community was agreed. Tony H. concurred. Joanne M. suggested Meaning of middle sentence unclear? NOTE THE FOLLOWING MINOR TYPOS ------------------------------ > TICK1, TICK2, RIF1, POLY1, POLY2 (Roger H.) should be: TICK1, TICK2, RINF1, POLY1, POLY2 (Roger H.)
correction: * > COMMS1, COMMS2, COMMS3, POLY3, SYNC1 (Roger H.; need PVM versions) should be: COMMS1, COMMS2, COMMS3, POLY3, SYNCH1 (Roger H.; need PVM versions) correction: * Roger Hockney From owner-pbwg-comm@CS.UTK.EDU Wed Sep 1 17:08:48 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA11318; Wed, 1 Sep 93 17:08:48 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA08459; Wed, 1 Sep 93 17:06:24 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 1 Sep 1993 17:06:22 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from Sun.COM by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA08451; Wed, 1 Sep 93 17:06:19 -0400 Received: from Eng.Sun.COM (zigzag-bb.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.1) id AB23523; Wed, 1 Sep 93 14:05:32 PDT Received: from cumbria.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA15177; Wed, 1 Sep 93 14:05:52 PDT Received: by cumbria.Eng.Sun.COM (5.0/SMI-SVR4) id AA01956; Wed, 1 Sep 1993 14:04:36 +0800 Date: Wed, 1 Sep 1993 14:04:36 +0800 From: Bodo.Parady@Eng.Sun.COM (Bodo Parady - SMCC Systems Performance) Message-Id: <9309012104.AA01956@cumbria.Eng.Sun.COM> To: pbwg-comm@cs.utk.edu, R.Hockney@parallel-applications-centre.southampton.ac.uk Subject: Re: Parkbench Minutes X-Sun-Charset: US-ASCII Content-Length: 1985 > From owner-pbwg-comm@CS.UTK.EDU Wed Sep 1 13:40 PDT 1993 > X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 1 Sep 1993 07:14:10 EDT > From: R.Hockney@parallel-applications-centre.southampton.ac.uk > Date: Wed, 1 Sep 93 11:06:27 GMT > To: pbwg-comm@cs.utk.edu > Subject: Parkbench Minutes > > Although not present, I have the following comments on the minutes: > > >correct assessment might be "20 flop". Phil T. pointed out that the Cray HPM > >was used for flop counts of the Perfect Benchmarks. Bodo P. asserted that > >since caching is typically used to evaluate many > >intrinsic functions, reported counts might be distorted. Tony H. 
stressed that
>
> POINT NEEDING CLARIFICATION
> ---------------------------
> An intrinsic function is evaluated from a series or other formula
> with a fixed and known number of mults and adds. I do not understand
> how this number is changed if caching is used or not. I do agree
> that the number of memory references and the time of execution will
> change, but these are not the points at issue. The number of flop is
> always the same.

There are two ways of calculating intrinsic functions:

    a = F(x)

where F is a function of multiplies and adds, and

    a = Table(x_i; x)

where, when x = x_i in the table, the value is obtained by table lookup. For example, take the Fortran loop:

          do 1000 i = 1, m
             do 2000 j = 1, n
                A(i, j) = B(i,j) * sin (i*j*PI/20)
     2000    continue
     1000 continue

For a no-caching intrinsic function (sine/cosine), n*m*(number of ops to do sine) would be performed. For a caching intrinsic, the ops count could be approximately (40*(number of ops to do sine) + n*m (modulos + loads)), since the argument i*j*PI/20 takes only 40 distinct values modulo 2*PI.

Any benchmark that includes intrinsics could try to ensure that the intrinsic is never called with the same or similar (the sine example above) value, making this concern moot.
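As a rough illustration of the point about the loop above, the following Python sketch counts how often the sine itself would be computed with and without a lookup table. The function name `count_sine_evaluations` and the table keyed on (i*j) mod 40 are assumptions made for this example only; they do not describe how any particular vendor library caches its intrinsics.

```python
import math

def count_sine_evaluations(m, n, cached):
    """Count full sine evaluations for sin(i*j*PI/20), i = 1..m, j = 1..n.

    With cached=True a hypothetical lookup table keyed on (i*j) mod 40 is
    consulted first, because the argument repeats with period 40 in i*j.
    """
    table = {}
    evaluations = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            key = (i * j) % 40
            if cached and key in table:
                continue  # table hit: one modulo and one load, no series
            evaluations += 1  # full series evaluation of the sine
            table[key] = math.sin(key * math.pi / 20)
    return evaluations

uncached = count_sine_evaluations(100, 100, cached=False)
cached = count_sine_evaluations(100, 100, cached=True)
print(uncached, cached)
```

For m = n = 100 the uncached count is n*m, while the cached count cannot exceed 40, so a hardware flop counter and a fixed nominal count per sine call would disagree widely on such a loop.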
>...> > Roger Hockney > Bodo From owner-pbwg-comm@CS.UTK.EDU Wed Sep 1 22:24:11 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA12074; Wed, 1 Sep 93 22:24:11 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA26170; Wed, 1 Sep 93 22:22:10 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 1 Sep 1993 22:22:09 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from BERRY.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA26163; Wed, 1 Sep 93 22:22:08 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (5.61++/2.7c-UTK) id AA08158; Wed, 1 Sep 93 22:22:07 -0400 Message-Id: <9309020222.AA08158@berry.cs.utk.edu> To: pbwg-comm@cs.utk.edu Subject: Revised Minutes Date: Wed, 01 Sep 1993 22:22:06 -0400 From: "Michael W. Berry" Here's the latest version of the minutes to the Aug 23 PARKBENCH meeting in Knoxville. Please send corrections soon. I'll wait a few days before posting it to "comp.parallel". Thanks, Mike Minutes of the 4th PARKBENCH (Formerly PBWG) Workshop ----------------------------------------------------- Place: Science Alliance Conference Room South College University of Tennessee Knoxville, TN Host: Jack Dongarra ORNL/Univ. of Tennessee Date: August 23, 1993 Attendees/Affiliations: Michael Berry, Univ. of Tennessee Philip Tannenbaum, NEC/SPEC Ed Kushner, Intel David Mackay, Intel Charles Grassl, Cray Research Bodo Parady, Sun Microsystems David Bailey, NASA Jack Dongarra, Univ. of Tennessee/ORNL Tom Haupt, Univ. of Syracuse Tony Hey, Univ. of Southampton Joanne Martin, IBM David Walker, ORNL The meeting started at 9:10 EDT with Tony Hey chairing the meeting for the absent Roger Hockney. Tony asked if there were any changes to the minutes of the May Parkbench meeting. There were no changes suggested by the attendees so the minutes were accepted. 
Tony then led the group through the current draft of the Parkbench Report to be distributed at Supercomputing '93 in Portland. Each attendee was provided a copy of the current draft along with a handout containing modified sections written by David B. David B. suggested that his 0.1 Philosophy section be inserted before Roger's text in Chapter 1 (Introduction). David B. stressed that reproducibility be mentioned in the current draft. Phil T. pointed out that Perfect's new focus allows optimized versions to eventually become new baseline numbers. Tony H. felt that Parkbench should have some sort of relationship with SPEC/PB. Tom H. said that he would provide a few sentences for compiler benchmark development in the Introduction. Tony H. then motioned that the group move on to discuss Chapter 2 (Methodology). David B. stressed that CPU time has some value but that the pitfalls in its use should be included in Section 2.2 (Time Measurement). David B. will merge his section 0.2 in with Roger's Section 2.2. Charles G. pointed out that one cannot really define minimum performance (worst case). The group then discussed what methodology for measuring elapsed wall-clock time should be used. Charles G. offered the following 4 options: (1) make 3 runs and take the minimum, (2) make 3 runs and take the maximum, (3) make 3 runs and take the arithmetic average, and (4) say nothing but recommend that 3 runs be made and ask for complete details on how the reported time was obtained. The group agreed that option (4) would be most appropriate for Parkbench at this time. Charles G. also suggested that TICK1/TICK2 essentially measure the accuracy of the timers. Ed K. explained a timing problem observed by Paragon users after reboots -- he indicated that there should be flexibility in reported timings so that "typical execution time" actually be stored in the Parkbench database. David B.
agreed to insert a discussion of reproducibility into Roger's Section 2.8 (Benchmarking Procedure and Code Optimisation). Charles G. asserted that there should also be some emphasis on how runs were timed with respect to sensitivity. Tony H. then led the discussion on to Section 2.3 (Units and Symbols) and questioned whether or not this section should really be in an Appendix. David B. recommended that the Section be left where it is and most of the other attendees concurred. Mike B. asked the group to change "msend" to "send" on page 4 of the report following Roger H.'s suggestion. Moving on to Section 2.4 (Floating-Point Operation Count), David B. felt that the "8 flop" listed for "exponential, sine etc." was not realistic. He and Joanne M. felt a more correct assessment might be "20 flop". Phil T. pointed out that the Cray HPM was used for flop counts of the Perfect Benchmarks. Bodo P. asserted that since caching is typically used to evaluate many intrinsic functions, reported counts might be distorted. Tony H. stressed that counts are needed for Mflop/s ratios that will ultimately be used. Jack D. indicated that he would be willing to contact J. Demmel and W. Kahan for expert advice on what flop counts should be associated with the various operations -- all attendees agreed that this should be done. David B. and Joanne M. suggested that there should be standard flop counts that are in some sense "close" to those for a real machine (e.g., Cray). Charles G. suggested that a bandwidth parameter be included -- theoretical data-access rate, for example. David B. agreed to modify Section 2.4 to include a memory parameter, and then asked whose responsibility it is to produce flop counts for the Parkbench codes. He suggested that a formula be given for the Kernel benchmarks and perhaps Cray HPM counts for the Compact-Application benchmarks. Tony H. pointed out that standard flop counts should be provided for each problem size.
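To make the effect of the disputed intrinsic weight concrete, here is a minimal Python sketch of a nominal flop-count formula and the Mflop/s figure derived from it. The helper names, the weight of 1 per add or multiply, the operation counts, and the 0.5 s run time are all assumptions chosen for illustration; only the candidate weights of 8 and 20 for intrinsics come from the discussion above.

```python
def nominal_flop_count(adds, mults, intrinsics, intrinsic_weight):
    """Nominal flop count: 1 per add or multiply (assumed here),
    plus a fixed weight per intrinsic call (8 or 20 in the discussion)."""
    return adds + mults + intrinsic_weight * intrinsics

def mflops(flop_count, seconds):
    """Mflop/s from a nominal flop count and an elapsed wall-clock time."""
    return flop_count / seconds / 1.0e6

# Hypothetical kernel: 10**6 iterations, each with one multiply and one sine.
iterations = 10**6
elapsed = 0.5  # seconds, an invented figure for the example

rate_8 = mflops(nominal_flop_count(0, iterations, iterations, 8), elapsed)
rate_20 = mflops(nominal_flop_count(0, iterations, iterations, 20), elapsed)
print(rate_8, rate_20)
```

The same run is reported at 18 Mflop/s with a weight of 8 and at 42 Mflop/s with a weight of 20, which is why an agreed nominal count per problem size matters more than the particular value chosen.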
The discussion then turned to Section 2.5 (Performance metrics). David B. indicated that his Section 0.3 would replace Section 2.5.5 (Speedup, Efficiency, and Performance per Node) on page 9 of the draft. David B. also suggested that there be no speedup statistic kept in the Parkbench Database (PDS). Phil T. suggested that David B.'s points (4) and (5) on uniprocessor times and speedup when the problem is too large to run on a single processor/node be combined with his (3) for a general policy on speedup statistics. Charles G. stressed that efficiency is important and that speedup relative to Amdahl's Law be considered. David B. suggested that the discussion of speedup be limited to avoid controversy. Jack D. and Michael B. agreed to provide a few PostScript images of PDS for an Appendix to the Parkbench report. Bodo P. suggested that the adjective "obsolescent" be inserted before "SPECmarks" on the last line of page 10. SPECfp92 and SPECint92 are the current SPEC benchmarks reported. With regard to Section 2.7 (Interactive Graphical Interface), Tony H. indicated that the Southampton Group would provide the graphical front end for the Parkbench database. Bodo P. asked that a single output format be developed for inclusion in documents (e.g., embedded PostScript for LaTeX documents). Jack D. indicated that each subcommittee will be responsible for providing codes to the Parkbench suite. Phil T. questioned whether optimization of codes (Section 2.8) should be limited, e.g., use high-level languages only? Joanne M. responded that there should be no restrictions. Tony H. pointed out that vendors will use a variety of tactics to optimize kernels and low-level benchmarks, and that compact applications will most likely require different forms of optimization. David B. insisted that any library routines used should be found in any vendor-supported library (i.e., could at least be purchased). Tony H.
reminded the attendees that there should ultimately be 3 versions of each benchmark: Fortran 90, PVM/MPI, and HPF. Phil T. questioned whether or not these versions would follow any "standards". Jack D. indicated that MPI (or HPF) would not become an ANSI-supported standard. Bodo P. suggested that "XOPEN" has a procedure for creating standards without going through ANSI-related bureaucracy. XOPEN is a nonprofit British corporation. A majority of the attendees felt that XOPEN might be an appropriate outlet for establishing standards for MPI or HPF. Bodo P. asked if PVM uses shared-memory function calls. Jack D. indicated that a new version of PVM which uses shared memory was under development. Bodo P. indicated that P4 does support shared memory. Phil T. was concerned that moving biased code to new machines would not be fair and suggested that the "as-is" run should simply be a "minimum sanity run" which comprises minimal changes just to get the code to run. Joanne M. and Bodo P. suggested that the first sentence in the Introduction (Chapter 1) indicate that "scalable distributed-memory (message-passing)" be the target architectures. The role of shared-memory architectures was not clear to several attendees. At 11:05 EDT, the group took a 15-minute coffee break. At 11:20 EDT, the meeting resumed with Tony H. leading the discussion of Chapter 3 (Low-Level Benchmarks) on page 15 of the report. There were no comments/changes for Section 3.1 (Introduction) so the discussion focused on Section 3.2 (Single-Processor Benchmarks) early on. Charles G. indicated that the descriptions of TICK1 and TICK2 do not really discuss a timing methodology and related caveats. He agreed to rewrite Sections 3.2.1 and 3.2.2 on TICK1 and TICK2, respectively. Jack D. suggested that a catalogue of machine-specific timing routines be maintained. Bodo P. and Phil T.
felt that the use of IFDEFs in TICK1 and TICK2 would be sufficient to select the appropriate machine timing intrinsic function for elapsed wall-clock times. Jack D. agreed to write a paragraph (for Sections 3.2.1 and 3.2.2) describing how machine-dependent timer calls are handled. Tony H. asked for volunteers to test POLY1 and POLY2 (Section 3.2.4) and indicated that he would provide the source code. Tony also indicated that Roger H. would develop COMMS3 and POLY3 for testing on Intel machines (Sections 3.3.2 and 3.3.3). For Section 3.4 (Appendix), Ed K. agreed to provide more recent data for Table 3.2 (page 22). Jack D. pointed out that clock speeds were missing in this table and Tony H. indicated that Roger H. will contact Ed K. about acquiring more recent data for such tables. Tony H. stressed that vendors' approval be granted before listing such data. For the Kernel Benchmarks (Chapter 4), Tony H. indicated that he had discussed the availability of parallel kernels from TMC (page 28). Ed K. indicated that Intel had single-node and parallel libraries and mentioned the existence of routines for banded linear systems, matrix multiplication, and parallel FFT. Joanne M. indicated that IBM only had serial mathematical libraries. David B. pointed out the typo "diificuly" on page 25 of the draft. With regard to Matrix benchmarks (Section 4.2.1), Jack D. indicated that the benchmarks were almost complete -- only support software (makefiles, input/output files, etc.) is needed. He specifically mentioned the availability of matrix transposition, dense matrix multiplication for multicomputers, routines for reduction to tridiagonal (symmetric) and Hessenberg (unsymmetric) form, BLACS for PVM, LU and QR factorization. Michael B. pointed out that #5 on page 26 should be "conjugate gradient method for solving linear systems" rather than "eigenvalue problem by conjugate gradient." Jack D. indicated that both Fortran-77 and PVM versions of the conjugate gradient method exist.
What is needed for all the above-mentioned kernels includes: problem sizes, standard input files, makefiles, and README files. Tony H. asserted that the group needs a timescale for future HPF versions of the kernel benchmarks. Tom H. agreed to help Jack D. with this. Jack D. indicated that he would also solicit the help of Jim Demmel (Berkeley). For the Fourier Transform benchmarks (Section 4.2.2), several attendees stressed the need for a large 1-D FFT and that the 3-D FFT and PDE by 3-D FFT (#2 and #3 on page 27) be merged into "3-D FFT". Tony H. reminded the group that a validation procedure will be needed for the 1-D FFT and David B. agreed to provide a convolution problem using a large vector of integers (e.g., 1024 ints). Using linear convolution, checksums would have to agree and the maximum deviation obtained on any machine could be measured. Charles G. agreed to help David B. on this. For the PDEs (Section 4.2.3), Tony H. indicated that #1 (Jacobi), #2 (Gauss-Seidel), and #4 (Finite Element) would be dropped from the current list. David W. was unable to acquire the FEM code discussed at the May Parkbench meeting. Tony H. agreed to provide the SOR benchmark from the Genesis Benchmarks and David B. will provide a multigrid benchmark from the NAS Parallel Benchmarks. These 2 benchmarks do not solve the same problem (SOR for Poisson's Equation and Multigrid for another application) but several versions of them (serial, MPI, HPF, etc.) are possible. Jack D. agreed to write a formal letter to NASA on behalf of the Parkbench group to request the use of the NAS Parallel Benchmarks. David B. indicated that these codes are now available for world-wide distribution. For "Other" kernel benchmarks (Section 4.2.4), Tony H. agreed to add a description of a candidate I/O benchmark (not mentioned in the current draft). Joanne M. and Tony H. stressed the need for the Embarrassingly Parallel (EP) benchmark, and Ed K.
felt the Large Integer Sort benchmark should be included as well but questioned if it should be a "paper and pencil" benchmark similar to the NAS Parallel Benchmarks. He indicated that David Culler (culler@cs.berkeley.edu) and Leo(?) Dagum (dagum@nas.nasa.gov) could help supply this benchmark code. Tony H. reminded the group that Roger H. can provide a Particle-In-Cell (PIC) code as a candidate kernel benchmark also. At 12:15 pm EDT, the group broke for a 1-hour lunch. At 1:15 pm EDT, Tony H. indicated that he would contact David W. later about the status of the Compact Applications which comprise Chapter 5 (David W. had not arrived yet). Tom H. suggested that when preparing an HPF version of the kernel for dense matrix multiplication, there are 2 choices: (1) rewrite the kernel in HPF (as we plan to do with everything else), or (2) use a call to the HPF (actually Fortran 90) intrinsic function MATMUL. Tom H., Jack D., and Tony H. preferred option (2). A discussion of problem sizes prompted Phil T. to ask if 1 or 2 gigabytes of memory constituted a large problem. Charles G. pointed out that scalability is important but not always possible (often algorithms will change as problem size increases). Joanne M. suggested the group think about future problem sizes to anticipate relevant problem sizes a few years from now. Jack D. suggested that 10 gigabytes defines a large problem and that 1-2 gigabytes would be a medium size problem. Small problem sizes would be needed for simple testing purposes. Bodo P. asserted that memory should be specified as part of the dataset name. Tony H. summarized that there should be 3 problem sizes: (1) test problem size, (2) moderate size problem, and (3) grand challenge problem. All problem specifications would be denoted in README files. At this point, a discussion of the Compiler Benchmarks (Chapter 6) was led by Tom H., who suggested that the Introduction (Section 6.1) be merged with the Introduction (Chapter 1).
He suggested that a new metric for the compiler benchmarks also be mentioned in this introduction: "ratio of execution times of compiler generated to hand-coded programs as a function of the problem size and number of processors engaged." Since there were no comments for Section 6.2 (HPF), Tom H. proceeded on to Section 6.3 (Benchmark Suite) and specified that no multi-statement optimization is addressed in the current compiler benchmark set. Tony H. asked if the name of these particular benchmarks should be changed to "HPF Compiler Benchmarks." He also suggested that an Appendix discussing MPI and HPF be included in the report. Tom H. then reviewed the Description of Codes in Section 6.4. He indicated that all were existing codes and that the last 3 involve FORALL statements while the last 4 involve array distribution. Bodo P. noticed that subroutine in-lining was missing from the set. Ed K. indicated that Intel would eventually have HPF versions of all the codes listed. He also suggested that there be HPF compiler benchmarks more suitable for the compact application benchmarks since the current set of compiler benchmarks seems more appropriate for the kernel benchmarks. Jack D. asked just how many HPF compilers would be available by the time of Supercomputing '93 in November. Tom H. felt there would be none but Ed K. said there would be at least one. Jack D. suggested that the codes (currently in Section 6.4) be removed from the report. A description of them would suffice. At 1:45 pm EDT, David W. arrived at the meeting and Tony H. asked him to report on Compact Applications (Chapter 5). David W. indicated that there were no codes submitted since the May Parkbench meeting. He also suggested that there be an application form for submitting compact application benchmarks. Such a form would specify the total number of flops required and memory requirements, and could be posted to the Usenet News Group "comp.parallel." David B.
will submit the NAS Parallel Benchmarks and David W. indicated that a Shallow-Water code and a Molecular Dynamics code would be available. Tony H. indicated that he could provide a "real" QCD code and then asked about the status of the FEM code discussed at the May meeting. David M. said that particular code was not readily available. Joanne M. suggested that 5 compact applications would be sufficient. David M. indicated that he would check again on the use of the FEM code (in the public domain). Tony H. asked David W. to contact Chuck Mosher about the ARCO suite (Joanne M. indicated she had Mosher's electronic mail address). David B. asserted that an N-body problem might be an appropriate compact application. The current list of compact applications considered by the group includes: (1) ARCO suite, (2) NAS CFD benchmarks, (3) QCD from Southampton, (4) Shallow Water code from ORNL, and (5) Molecular Dynamics code.

Tony H. then suggested that a Summary of Actions be reviewed so attendees agreed on the remaining work to be done. In preparation for Supercomputing '93 in Portland, the following benchmark suite should be assembled:

  PDS/Xnetlib (UT)
  TICK1, TICK2, RINF1, POLY1, POLY2 (Roger H.)
  COMMS1, COMMS2, COMMS3, POLY3, SYNCH1 (Roger H.; need PVM versions)
  PUMMA, LU, QR, CG, TRDI (UT/ORNL supplies seq. and PVM; Syracuse provides HPF)
  FFT 1-D, FFT 3-D (David B. - NASA supplies seq. and Intel versions)
  SOR, Multigrid (seq., MPI, HPF provided by Tony H. and David B.)
  EP, Integer Sort (seq. provided by Tom H. - Syracuse)
  I/O (provided by Tony H. - Southampton)
  HPF Compiler Benchmarks (Tom H. - Syracuse)
  QCD (Tony H. - Southampton)
  CFD/NAS(3) (David B. - NASA; seq., MP, CMF)
  Molecular Dynamics (David W. - ORNL; seq., PICL)
  ARCO Suite (2) (Jack D. and David W.; seq. and P4)
  Shallow Water (David W. - ORNL and Tom H. - Syracuse)

Each code is to have 3 input sets as described earlier. UT/Netlib will collect/archive all candidate benchmarks for the Parkbench suite.
Makefiles, README files, input files, etc. should be provided. Tony H. then called for a discussion on future actions. Tony suggested that subcommittee leaders put revised sections on the net, and Michael B. agreed to assemble the final draft (with the help of Roger H.). Revised text should be mailed to berry@cs.utk.edu by October 15, 1993. Joanne M. indicated that Supercomputing '93 runs from Tuesday to Friday the week of November 17, and the current tentative evening session for Parkbench is Wednesday, November 17, 4:30 pm. Tony H. asked for opinions on future chairmanship (on behalf of Roger H.). All attendees agreed that Roger H. was best suited to fulfill this role for the immediate future. There was strong opposition to the discussion of future chairmanship at the open meeting in Portland. For future meetings, Joanne M. and Tony H. suggested that the group explore "videoconferencing" between a US site and a European site. Bodo P. suggested that such meetings should be coordinated with SPEC. Jack D. offered to host a joint Parkbench/SPEC meeting for early 1994. Bodo P. then reviewed the organizational structure of SPEC for the attendees. He also mentioned a new SPEC benchmark called "PAR93" which comprises a 512 MB problem size for under 60 processors with automatic parallelism, shared-memory, and compiler/system tuning allowed. After Bodo P. completed the SPEC overview, Tony H. asked the group whether Parkbench should ultimately be sponsored by SPEC. At this time, Phil T. gave a short lecture on the evolution of SPEC/Perfect. Joanne M. suggested that Parkbench should be separate from SPEC/Perfect (may be called SPEC/HPSC - High Performance Steering Committee in the near future) at least during the initial development phase of the benchmark suite. Phil T. mentioned that the current application benchmarks for SPEC/HPSC include ARCO (fdmod, fkmig, seis), LANL (false, pueblo), IBM (turb3d), and ETH/Zurich (NCMD - QCD).
All of these are Fortran-77 versions with only ARCO having a message-passing version available at this time. David W. also remarked that it was very premature to discuss a Parkbench-SPEC/HPSC merger at this time. Tony H. indicated that PEPS and Euroben should also be included and asked Jack D. to formally contact them (they would like to maintain a copy of the PDS data). Some of the final discussions at the meeting concerned the default precision the codes should be run in for validated results. Charles G. insisted that the precision be specified in the code documentation. Jack D. felt that the kernel benchmarks should normally be run with 64-bit arithmetic unless otherwise specified. Some of the compact application codes (e.g., ARCO suite) may be run in 32-bit precision, however, to reflect "real" use in industrial applications. David B. also pointed out that the precision used during a run should be denoted in a code's output. The meeting promptly adjourned at 4:04 pm EDT. End of Minutes for August 23, 1993 (M. Berry, berry@cs.utk.edu) From owner-pbwg-comm@CS.UTK.EDU Mon Sep 6 14:54:47 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA02618; Mon, 6 Sep 93 14:54:47 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA22682; Mon, 6 Sep 93 14:50:05 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 6 Sep 1993 14:50:04 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from BERRY.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA22672; Mon, 6 Sep 93 14:50:03 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (5.61++/2.7c-UTK) id AA16975; Mon, 6 Sep 93 14:50:02 -0400 Message-Id: <9309061850.AA16975@berry.cs.utk.edu> To: pbwg-comm@cs.utk.edu Subject: minutes Date: Mon, 06 Sep 1993 14:50:01 -0400 From: "Michael W. Berry" Here are the minutes of the last meeting which were posted to comp.parallel. 
-Mike Minutes of the 4th PARKBENCH (Formerly PBWG) Workshop ----------------------------------------------------- Place: Science Alliance Conference Room South College University of Tennessee Knoxville, TN Host: Jack Dongarra ORNL/Univ. of Tennessee Date: August 23, 1993 Attendees/Affiliations: Michael Berry, Univ. of Tennessee Philip Tannenbaum, NEC/SPEC Ed Kushner, Intel David Mackay, Intel Charles Grassl, Cray Research Bodo Parady, Sun Microsystems David Bailey, NASA Jack Dongarra, Univ. of Tennessee/ORNL Tom Haupt, Univ. of Syracuse Tony Hey, Univ. of Southampton Joanne Martin, IBM David Walker, ORNL The meeting started at 9:10 EDT with Tony Hey chairing the meeting for the absent Roger Hockney. Tony asked if there were any changes to the minutes of the May Parkbench meeting. There were no changes suggested by the attendees so the minutes were accepted. Tony then led the group through the current draft of the Parkbench Report to be distributed at Supercomputing '93 in Portland. Each attendee was provided a copy of the current draft along with a handout containing modified sections written by David B. David B. suggested that his 0.1 Philosophy section be inserted before Roger's text in Chapter 1 (Introduction). David B. stressed that reproducibility be mentioned in the current draft. Phil T. pointed out that Perfect's new focus allows optimized versions to eventually become new baseline numbers. Tony H. felt that Parkbench should have some sort of relationship with SPEC/PB. Tom H. said that he would provide a few sentences for compiler benchmark development in the Introduction. Tony H. then motioned that the group move on to discuss Chapter 2 (Methodology). David B. stressed that CPU time has some value but that the pitfalls in its use should be included in Section 2.2 (Time Measurement). David B. will merge his section 0.2 in with Roger's Section 2.2. Charles G. pointed out that one cannot really define minimum performance (worst case).
The group then discussed what methodology for measuring elapsed wall-clock time should be used. Charles G. offered the following 4 options: (1) make 3 runs and take the minimum, (2) make 3 runs and take the maximum, (3) make 3 runs and take the arithmetic average, and (4) say nothing but recommend that 3 runs be made and ask for complete details on how the reported time was obtained. The group agreed that option (4) would be most appropriate for Parkbench at this time. Charles G. also suggested that TICK1/TICK2 essentially measure the accuracy of the timers. Ed K. explained a timing problem observed by Paragon users after reboots -- he indicated that there should be flexibility in reported timings so that "typical execution time" actually be stored in the Parkbench database. David B. agreed to insert a discussion of reproducibility into Roger's Section 2.8 (Benchmarking Procedure and Code Optimisation). Charles G. asserted that there should also be some emphasis on how runs were timed with respect to sensitivity. Tony H. then led the discussion on to Section 2.3 (Units and Symbols) and questioned whether or not this section should really be in an Appendix. David B. recommended that the Section be left where it is and most of the other attendees concurred. Mike B. asked the group to change "msend" to "send" on page 4 of the report following Roger H.'s suggestion. Moving on to Section 2.4 (Floating-Point Operation Count), David B. felt that the "8 flop" listed for "exponential, sine etc." was not realistic. He and Joanne M. felt a more correct assessment might be "20 flop". Phil T. pointed out that the Cray HPM was used for flop counts of the Perfect Benchmarks. Bodo P. asserted that since caching is typically used to evaluate many intrinsic functions, reported counts might be distorted. Tony H. stressed that counts are needed for Mflop/s ratios that will ultimately be used. Jack D. indicated that he would be willing to contact J. Demmel and W.
Kahan for expert advice on what flop counts should be associated with the various operations -- all attendees agreed that this should be done. David B. and Joanne M. suggested that there should be standard flop counts that are in some sense "close" to those for a real machine (e.g., Cray). Charles G. suggested that a bandwidth parameter be included -- theoretical data-access rate, for example. David B. agreed to modify Section 2.4 to include a memory parameter, and then asked whose responsibility it is to produce flop counts for the Parkbench codes. He suggested that a formula be given for the Kernel benchmarks and perhaps Cray HPM counts for the Compact-Application benchmarks. Tony H. pointed out that standard flop counts should be provided for each problem size. The discussion then turned to Section 2.5 (Performance metrics). David B. indicated that his Section 0.3 would replace Section 2.5.5 (Speedup, Efficiency, and Performance per Node) on page 9 of the draft. David B. also suggested that there be no speedup statistic kept in the Parkbench Database (PDS). Phil T. suggested that David B.'s points (4) and (5) on uniprocessor times and speedup when the problem is too large to run on a single processor/node be combined with his (3) for a general policy on speedup statistics. Charles G. stressed that efficiency is important and that speedup relative to Amdahl's Law be considered. David B. suggested that the discussion of speedup be limited to avoid controversy. Jack D. and Michael B. agreed to provide a few PostScript images of PDS for an Appendix to the Parkbench report. Bodo P. suggested that the adjective "obsolescent" be inserted before "SPECmarks" on the last line of page 10. SPECfp92 and SPECint92 are the current SPEC benchmarks reported. With regard to Section 2.7 (Interactive Graphical Interface), Tony H. indicated that the Southampton Group would provide the graphical front end for the Parkbench database. Bodo P.
asked that a single output format be developed for inclusion in documents (e.g., embedded PostScript for LaTeX documents). Jack D. indicated that each subcommittee will be responsible for providing codes to the Parkbench suite. Phil T. questioned whether optimization of codes (Section 2.8) should be limited, e.g., use high-level languages only? Joanne M. responded that there should be no restrictions. Tony H. pointed out that vendors will use a variety of tactics to optimize kernels and low-level benchmarks, and that compact applications will most likely require different forms of optimization. David B. insisted that any library routines used should be found in any vendor-supported library (i.e., could at least be purchased). Tony H. reminded the attendees that there should ultimately be 3 versions of each benchmark: Fortran 90, PVM/MPI, and HPF. Phil T. questioned whether or not these versions would follow any "standards". Jack D. indicated that MPI (or HPF) would not become an ANSI-supported standard. Bodo P. suggested that "XOPEN" has a procedure for creating standards without going through ANSI-related bureaucracy. XOPEN is a nonprofit British corporation. A majority of the attendees felt that XOPEN might be an appropriate outlet for establishing standards for MPI or HPF. Bodo P. asked if PVM uses shared-memory function calls. Jack D. indicated that a new version of PVM which uses shared memory was under development. Bodo P. indicated that P4 does support shared memory. Phil T. was concerned that moving biased code to new machines would not be fair and suggested that the "as-is" run should simply be a "minimum sanity run" which comprises minimal changes just to get the code to run. Joanne M. and Bodo P. suggested that the first sentence in the Introduction (Chapter 1) indicate that "scalable distributed-memory (message-passing)" be the target architectures. The role of shared-memory architectures was not clear to several attendees.
At 11:05 EDT, the group took a 15-minute coffee break.

At 11:20 EDT, the meeting resumed with Tony H. leading the discussion of Chapter 3 (Low-Level Benchmarks) on page 15 of the report. There were no comments/changes for Section 3.1 (Introduction) so the discussion focused on Section 3.2 (Single-Processor Benchmarks) early on. Charles G. indicated that the descriptions of TICK1 and TICK2 do not really discuss a timing methodology and related caveats. He agreed to rewrite Sections 3.2.1 and 3.2.2 on TICK1 and TICK2, respectively. Jack D. suggested that a catalogue of machine-specific timing routines be maintained. Bodo P. and Phil T. felt that the use of IFDEFs in TICK1 and TICK2 would be sufficient to select the appropriate machine timing intrinsic function for elapsed wall-clock times. Jack D. agreed to write a paragraph (for Sections 3.2.1 and 3.2.2) describing how machine-dependent timer calls are handled. Tony H. asked for volunteers to test POLY1 and POLY2 (Section 3.2.4) and indicated that he would provide the source code. Tony also indicated that Roger H. would develop COMMS3 and POLY3 for testing on Intel machines (Sections 3.3.2 and 3.3.3). For Section 3.4 (Appendix), Ed K. agreed to provide more recent data for Table 3.2 (page 22). Jack D. pointed out that clock speeds were missing in this table and Tony H. indicated that Roger H. will contact Ed K. about acquiring more recent data for such tables. Tony H. stressed that vendors' approval be granted before listing such data.

For the Kernel Benchmarks (Chapter 4), Tony H. indicated that he had discussed the availability of parallel kernels from TMC (page 28). Ed K. indicated that Intel had single node and parallel libraries and mentioned the existence of routines for banded linear systems, matrix multiplication, and parallel FFT. Joanne M. indicated that IBM only had serial mathematical libraries. David B. pointed out the typo "diificuly" on page 25 of the draft.
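[Editorial illustration] The timer discussion for TICK1 and TICK2 above (a catalogue of machine-specific timing routines, selected per machine, used for elapsed wall-clock times) can be sketched roughly as follows. This is only a minimal sketch in Python, not the Parkbench code: the dictionary WALL_CLOCKS stands in for the per-machine IFDEF selection and the function measure_tick is a hypothetical name. It assumes the quantity of interest is the smallest observable increment of the chosen wall-clock timer.

```python
import time

# Hypothetical catalogue of wall-clock timers; it stands in for the
# per-machine IFDEF selection mentioned in the minutes. The names are
# illustrative and are not taken from the Parkbench sources.
WALL_CLOCKS = {
    "perf_counter": time.perf_counter,
    "monotonic": time.monotonic,
}


def measure_tick(clock, samples=100):
    """Return (smallest, mean) observed increment of `clock`, in seconds."""
    deltas = []
    for _ in range(samples):
        start = clock()
        now = clock()
        while now == start:  # spin until the reported time changes
            now = clock()
        deltas.append(now - start)
    return min(deltas), sum(deltas) / len(deltas)


if __name__ == "__main__":
    for name, clock in WALL_CLOCKS.items():
        smallest, mean = measure_tick(clock)
        print(f"{name}: smallest tick {smallest:.3e} s, mean {mean:.3e} s")
```

One caveat of the kind Charles G. asked to have documented: the smallest observed increment includes the cost of the timer call itself, so it is an upper bound on the clock resolution rather than the resolution proper.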
With regard to Matrix benchmarks (Section 4.2.1), Jack D. indicated that the benchmarks were almost complete -- only support software (makefiles, input/output files, etc.) is needed. He specifically mentioned the availability of matrix transposition, dense matrix multiplication for multicomputers, routines for reduction to tridiagonal (symmetric) and Hessenberg (unsymmetric) form, BLACS for PVM, LU and QR factorization. Michael B. pointed out that #5 on page 26 should be "conjugate gradient method for solving linear systems" rather than "eigenvalue problem by conjugate gradient." Jack D. indicated that both Fortran-77 and PVM versions of the conjugate gradient method exist. What is needed for all the above-mentioned kernels includes: problem sizes, standard input files, makefiles, and README files. Tony H. asserted that the group needs a timescale for future HPF versions of the kernel benchmarks. Tom H. agreed to help Jack D. with this. Jack D. indicated that he would also solicit the help of Jim Demmel (Berkeley).

For the Fourier Transform benchmarks (Section 4.2.2), several attendees stressed the need for a large 1-D FFT and asked that the 3-D FFT and PDE by 3-D FFT (#2 and #3 on page 27) be merged into "3-D FFT". Tony H. reminded the group that a validation procedure will be needed for the 1-D FFT and David B. agreed to provide a convolution problem using a large vector of integers (e.g., 1024 ints). Using linear convolution, checksums would have to agree and the maximum deviation obtained on any machine could be measured. Charles G. agreed to help David B. on this.

For the PDEs (Section 4.2.3), Tony H. indicated that #1 (Jacobi), #2 (Gauss-Seidel), and #4 (Finite Element) would be dropped from the current list. David W. was unable to acquire the FEM code discussed at the May Parkbench meeting. Tony H. agreed to provide the SOR benchmark from the Genesis Benchmarks and David B. will provide a multigrid benchmark from the NAS Parallel Benchmarks.
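[Editorial illustration] The 1-D FFT validation idea recorded above (convolve a vector of integers, require checksums to agree, and measure the maximum deviation) might look roughly like the sketch below. It is an assumption-laden illustration in Python, not David B.'s procedure: the recursive radix-2 FFT, the helper names (fft, direct_convolution, fft_convolution, make_vector), the test-vector generator, and the choice of "deviation" as the distance of each FFT result from the nearest integer are all hypothetical.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def direct_convolution(a, b):
    """Exact linear convolution in integer arithmetic (the reference)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def fft_convolution(a, b):
    """Linear convolution via zero-padded FFTs.

    Returns (values rounded to integers, maximum deviation from an integer).
    """
    m = len(a) + len(b) - 1
    size = 1
    while size < m:
        size *= 2
    fa = fft([complex(v) for v in a] + [0j] * (size - len(a)))
    fb = fft([complex(v) for v in b] + [0j] * (size - len(b)))
    prod = [p * q for p, q in zip(fa, fb)]
    raw = [v.real / size for v in fft(prod, inverse=True)][:m]
    rounded = [round(v) for v in raw]
    max_dev = max(abs(v - r) for v, r in zip(raw, rounded))
    return rounded, max_dev


def make_vector(n, seed):
    """Hypothetical deterministic test vector of small non-negative integers."""
    x, out = seed, []
    for _ in range(n):
        x = (x * 1103515245 + 12345) % 2**31
        out.append(x % 100)
    return out


if __name__ == "__main__":
    a, b = make_vector(1024, 1), make_vector(1024, 2)
    exact = direct_convolution(a, b)
    approx, max_dev = fft_convolution(a, b)
    print("checksums agree:", sum(exact) == sum(approx))
    print("maximum deviation from an integer:", max_dev)
```

Because the inputs are integers, the exact answer is known in integer arithmetic, so a checksum mismatch points to an incorrect transform while the maximum deviation gives a machine-dependent accuracy figure.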
The SOR and multigrid benchmarks do not solve the same problem (SOR for Poisson's Equation and Multigrid for another application) but several versions of them (serial, MPI, HPF, etc.) are possible. Jack D. agreed to write a formal letter to NASA on behalf of the Parkbench group to request the use of the NAS Parallel Benchmarks. David B. indicated that these codes are now available for world-wide distribution.

For "Other" kernel benchmarks (Section 4.2.4), Tony H. agreed to add a description of a candidate I/O benchmark (not mentioned in the current draft). Joanne M. and Tony H. stressed the need for the Embarrassingly Parallel (EP) benchmark, and Ed K. felt the Large Integer Sort benchmark should be included as well but questioned if it should be a "paper and pencil" benchmark similar to the NAS Parallel Benchmarks. He indicated that David Culler (culler@cs.berkeley.edu) and Leo(?) Dagum (dagum@nas.nasa.gov) could help supply this benchmark code. Tony H. reminded the group that Roger H. can provide a Particle-In-Cell (PIC) code as a candidate kernel benchmark also.

At 12:15 pm EDT, the group broke for a 1-hour lunch.

At 1:15 pm EDT, Tony H. indicated that he would contact David W. later about the status of the Compact Applications which comprise Chapter 5 (David W. had not arrived yet). Tom H. suggested that when preparing an HPF version of the kernel for dense matrix multiplication, there are 2 choices: (1) rewrite the kernel in HPF (as we plan to do with everything else), or (2) use a call to the HPF (actually Fortran 90) intrinsic function MATMUL. Tom H., Jack D., and Tony H. preferred option (2).

A discussion of problem sizes prompted Phil T. to ask if 1 or 2 gigabytes of memory constituted a large problem. Charles G. pointed out that scalability is important but not always possible (often algorithms will change as problem size increases). Joanne M. suggested the group think about future problem sizes to anticipate relevant problem sizes a few years from now. Jack D.
suggested that 10 gigabytes defines a large problem and that 1-2 gigabytes would be a medium size problem. Small problem sizes would be needed for simple testing purposes. Bodo P. asserted that memory should be specified as part of the dataset name. Tony H. summarized that there should be 3 problem sizes: (1) test problem size, (2) moderate size problem, and (3) grand challenge problem. All problem specifications would be denoted in README files.

At this point, a discussion of the Compiler Benchmarks (Chapter 6) was led by Tom H., who suggested that the Introduction (Section 6.1) be merged with the Introduction (Chapter 1). He suggested that a new metric for the compiler benchmarks also be mentioned in this introduction: "ratio of execution times of compiler generated to hand-coded programs as a function of the problem size and number of processors engaged." Since there were no comments for Section 6.2 (HPF), Tom H. proceeded on to Section 6.3 (Benchmark Suite) and specified that no multi-statement optimization is addressed in the current compiler benchmark set. Tony H. asked if the name of these particular benchmarks should be changed to "HPF Compiler Benchmarks." He also suggested that an Appendix discussing MPI and HPF be included in the report. Tom H. then reviewed the Description of Codes in Section 6.4. He indicated that all were existing codes and that the last 3 involve FORALL statements while the last 4 involve array distribution. Bodo P. noticed that subroutine in-lining was missing from the set. Ed K. indicated that Intel would eventually have HPF versions of all the codes listed. He also suggested that there be HPF compiler benchmarks more suitable for the compact application benchmarks since the current set of compiler benchmarks seems more appropriate for the kernel benchmarks. Jack D. asked just how many HPF compilers would be available in time for Supercomputing '93 in November. Tom H. felt there would be none but Ed K.
said there would be at least one. Jack D. suggested that the codes (currently in Section 6.4) be removed from the report. A description of them would suffice.

At 1:45 pm EDT, David W. arrived at the meeting and Tony H. asked him to report on Compact Applications (Chapter 5). David W. indicated that there were no codes submitted since the May Parkbench meeting. He also suggested that there be an application form for submitting compact application benchmarks. Such a form would specify the total number of flops required and memory requirements, and could be posted to the Usenet News Group "comp.parallel." David B. will submit the NAS Parallel Benchmarks and David W. indicated that a Shallow-Water code and a Molecular Dynamics code would be available. Tony H. indicated that he could provide a "real" QCD code and then asked about the status of the FEM code discussed at the May meeting. David M. said that particular code was not readily available. Joanne M. suggested that 5 compact applications would be sufficient. David M. indicated that he would check again on the use of the FEM code (in the public domain). Tony H. asked David W. to contact Chuck Mosher about the ARCO suite (Joanne M. indicated she had Mosher's electronic mail address). David B. asserted that an N-body problem might be an appropriate compact application. The current list of compact applications considered by the group includes: (1) ARCO suite, (2) NAS CFD benchmarks, (3) QCD from Southampton, (4) Shallow Water code from ORNL, and (5) Molecular Dynamics code.

Tony H. then suggested that a Summary of Actions be reviewed so attendees agreed on the remaining work to be done. In preparation for Supercomputing '93 in Portland, the following benchmark suite should be assembled:

  PDS/Xnetlib (UT)
  TICK1, TICK2, RINF1, POLY1, POLY2 (Roger H.)
  COMMS1, COMMS2, COMMS3, POLY3, SYNCH1 (Roger H.; need PVM versions)
  PUMMA, LU, QR, CG, TRDI (UT/ORNL supplies seq. and PVM; Syracuse provides HPF)
  FFT 1-D, FFT 3-D (David B. - NASA supplies seq. and Intel versions)
  SOR, Multigrid (seq., MPI, HPF provided by Tony H. and David B.)
  EP, Integer Sort (seq. provided by Tom H. - Syracuse)
  I/O (provided by Tony H. - Southampton)
  HPF Compiler Benchmarks (Tom H. - Syracuse)
  QCD (Tony H. - Southampton)
  CFD/NAS(3) (David B. - NASA; seq., MP, CMF)
  Molecular Dynamics (David W. - ORNL; seq., PICL)
  ARCO Suite (2) (Jack D. and David W.; seq. and PVM)
  Shallow Water (David W. - ORNL and Tom H. - Syracuse)

Each code is to have 3 input sets as described earlier. UT/Netlib will collect/archive all candidate benchmarks for the Parkbench suite. Makefiles, README files, input files, etc. should be provided.

Tony H. then called for a discussion on future actions. Tony suggested that subcommittee leaders put revised sections on the net, and Michael B. agreed to assemble the final draft (with the help of Roger H.). Revised text should be mailed to berry@cs.utk.edu by October 15, 1993. Joanne M. indicated that Supercomputing '93 runs from Tuesday to Friday the week of November 17, and the current tentative evening session for Parkbench is Wednesday, November 17, 4:30 pm. Tony H. asked for opinions on future chairmanship (on behalf of Roger H.). All attendees agreed that Roger H. was best suited to fulfill this role for the immediate future. There was strong opposition to the discussion of future chairmanship at the open meeting in Portland.

For future meetings, Joanne M. and Tony H. suggested that the group explore "videoconferencing" between a US site and a European site. Bodo P. suggested that such meetings should be coordinated with SPEC. Jack D. offered to host a joint Parkbench/SPEC meeting for early 1994. Bodo P. then reviewed the organizational structure of SPEC for the attendees. He also mentioned a new SPEC benchmark called "PAR93" which comprises a 512 MB problem size for under 60 processors with automatic parallelism, shared-memory, and compiler/system tuning allowed. After Bodo P.
completed the SPEC overview, Tony H. asked the group whether Parkbench should ultimately be sponsored by SPEC. At this time, Phil T. gave a short lecture on the evolution of SPEC/Perfect. Joanne M. suggested that Parkbench should be separate from SPEC/Perfect (may be called SPEC/HPSC - High Performance Steering Committee in near future) at least during the initial development phase of the benchmark suite. Phil T. mentioned that the current application benchmarks for SPEC/HPSC include ARCO (fdmod, fkmig, seis), LANL (false, pueblo), IBM (turb3d), and ETH/Zurich (NCMD - QCD). All of these are Fortran-77 versions with only ARCO having a message-passing version available at this time. David W. also remarked that it was very premature to discuss a Parkbench-SPEC/HPSC merger at this time. Tony H. indicated that PEPS and Euroben should also be included and asked Jack D. to formally contact them (they would like to maintain a copy of the PDS data). Some of the final discussions at the meeting concerned the default precision the codes should be run in for validated results. Charles G. insisted that the precision be specified in the code documentation. Jack D. felt that the kernel benchmarks should normally be run with 64-bit arithmetic unless otherwise specified. Some of the compact application codes (e.g., ARCO suite) may be run in 32-bit precision, however, to reflect "real" use in industrial applications. David B. also pointed out that the precision used during a run should be denoted in a code's output. The meeting promptly adjourned at 4:04 pm EDT. End of Minutes for August 23, 1993 (M. 
Berry, berry@cs.utk.edu) From owner-pbwg-comm@CS.UTK.EDU Wed Sep 22 05:50:24 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA07792; Wed, 22 Sep 93 05:50:24 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA23052; Wed, 22 Sep 93 05:47:39 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 22 Sep 1993 05:47:38 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from ben.Britain.EU.net by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK) id AA23044; Wed, 22 Sep 93 05:47:32 -0400 Message-Id: <9309220947.AA23044@CS.UTK.EDU> Received: from newton.npl.co.uk by ben.britain.eu.net via PSS with NIFTP (PP) id ; Wed, 22 Sep 1993 10:46:38 +0100 Date: Wed, 22 Sep 93 10:46 GMT From: Trevor Chambers To: PBWG-COMM 22nd September 1993 Dear Parkbench Committee Members, At NPL I am beginning work on producing a survey of new approaches and initiatives in the benchmarking of parallel architectures. By `new' I mean ideas and initiatives which have happened in the last couple of years. The purpose of this message is to ask anyone reading this if they know of any suitable material I could use, or can supply me with names or organisations to contact. I'll follow them all up if I haven't done so already. When the job is complete I would be happy to make the report available to PARKBENCH if there's any interest in my doing so. I would be grateful for anyone's help.
Best wishes, Trevor Chambers From owner-pbwg-comm@CS.UTK.EDU Thu Sep 30 07:02:15 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA01974; Thu, 30 Sep 93 07:02:15 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA13063; Thu, 30 Sep 93 06:59:56 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 30 Sep 1993 06:59:54 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from ben.Britain.EU.net by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA13055; Thu, 30 Sep 93 06:59:48 -0400 Message-Id: <9309301059.AA13055@CS.UTK.EDU> Received: from newton.npl.co.uk by ben.britain.eu.net via PSS with NIFTP (PP) id ; Thu, 30 Sep 1993 11:59:32 +0100 Date: Thu, 30 Sep 93 11:58 GMT From: Trevor Chambers To: PBWG-COMM 30th September 1993 Dear Parkbench Committee Members, I would like to thank all the members who responded to my request for information on recent initiatives and new approaches in parallel architecture benchmarking. The response has been generous and very helpful. Thank you all for taking the trouble. I am still seeking information - please keep it coming! 
Thanks again, Trevor Chambers From owner-pbwg-comm@CS.UTK.EDU Fri Oct 15 11:48:54 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA22170; Fri, 15 Oct 93 11:48:54 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA13599; Fri, 15 Oct 93 11:45:54 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 15 Oct 1993 11:45:52 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA13558; Fri, 15 Oct 93 11:45:35 -0400 Via: uk.ac.southampton; Fri, 15 Oct 1993 15:54:56 +0100 Via: brewery.ecs.soton.ac.uk; Fri, 15 Oct 93 15:35:44 BST From: Tony Hey Received: from pleasuredome.ecs.soton.ac.uk by brewery.ecs.soton.ac.uk; Fri, 15 Oct 93 15:47:38 BST Message-Id: <10077.9310151447@pleasuredome.ecs.soton.ac.uk> Subject: Intro or Conclusion for Parkbenech document To: pbwg-comm@cs.utk.edu Date: Fri, 15 Oct 1993 15:47:33 +0100 (BST) Cc: ajgh@ecs.soton.ac.uk (Tony Hey) X-Mailer: ELM [version 2.4 PL0] Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Length: 1041 Some thoughts on the Introduction/Conclusion of the Parkbench document In order for the Parkbench suite to be acceptable to vendors and to have enough data to be valuable we must ensure that no unreasonable demands are made for compliance. I suggest that, certainly as an initial strategy, we emphasize the need for data on the low-level and kernel benchmarks. The low-level parameters measured by the low-level benchmarks will in any case need to be known by each of the vendors and are non-contentious and easy to measure. The kernel benchmark suite contains codes similar to what one expects to find in most parallel libraries from vendors: it would therefore not be a major effort to produce optimized numbers for these kernels. 
I therefore suggest we encourage take-up of Parkbench by stressing the need for low-level and kernel results in the first instance. Some more tolerance should be allowed as to whether or not vendors supply numbers for all of the application areas represented in the compact application section. Tony Hey From owner-pbwg-comm@CS.UTK.EDU Fri Oct 15 11:48:56 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA22176; Fri, 15 Oct 93 11:48:56 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA13638; Fri, 15 Oct 93 11:46:11 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 15 Oct 1993 11:46:10 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA13591; Fri, 15 Oct 93 11:45:52 -0400 Via: uk.ac.southampton; Fri, 15 Oct 1993 16:17:19 +0100 Via: brewery.ecs.soton.ac.uk; Fri, 15 Oct 93 15:54:47 BST From: Tony Hey Received: from pleasuredome.ecs.soton.ac.uk by brewery.ecs.soton.ac.uk; Fri, 15 Oct 93 16:06:41 BST Message-Id: <10113.9310151506@pleasuredome.ecs.soton.ac.uk> Subject: kernel chapter for parkbench document To: pbwg-comm@cs.utk.edu Date: Fri, 15 Oct 1993 16:06:36 +0100 (BST) Cc: ajgh@ecs.soton.ac.uk (Tony Hey), mab@par.soton.ac.uk (Mark Baker), vsg@ecs.soton.ac.uk (Vladimir Getov), berry@cs.utk.edu, pbwg-kernel@cs.utk.edu X-Mailer: ELM [version 2.4 PL0] Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Length: 1104 Hi all! Vladimir Getov and I in Southampton are trying to assemble final draft of kernel chapter. This at present has some large gaps: Matrix kernels - reads "Jack et al to provide" ACTION: JACK DONGARRA FFT codes - reads "David Bailey with Charles Grassl to provide" ACTION: DAVID BAILEY AND CHARLES GRASSL I am in the process of writing up the PDE Solver section This contains two Poisson solvers - one SOR from Southampton and the NAS Multigrid solver. 
ACTION: ME Jack et al. - have you decided on standard input files, makefiles and README files? We also need the three problem sizes agreed upon to be specified for each code: (1)test problem (2)moderate size and (3) grand challenge size. ACTION: JACK DONGARRA In the other codes section there will be a write-up (taken from David Bailey's latest version which he is sending me) of both the Integer Sort and Embarrassingly Parallel benchmarks. We are trying to provide a write-up of a possible I/O benchmark. ACTION: DAVID BAILEY Input please as soon as possible! Mike Berry please note we are trying! ACTION: MIKE BERRY Tony Hey From owner-pbwg-comm@CS.UTK.EDU Tue Oct 26 22:34:26 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA03115; Tue, 26 Oct 93 22:34:26 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA17473; Tue, 26 Oct 93 21:31:42 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 26 Oct 1993 21:31:41 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from DINAH.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA17467; Tue, 26 Oct 93 21:31:40 -0400 From: Todd Letsche Received: by dinah.cs.utk.edu (5.61+IDA+UTK-930922/2.7c-UTK) id AA06265; Tue, 26 Oct 93 21:31:45 -0400 Date: Tue, 26 Oct 93 21:31:45 -0400 Message-Id: <9310270131.AA06265@dinah.cs.utk.edu> To: pbwg-comm@cs.utk.edu Subject: Submitting Parkbench benchmarks To all those providing benchmarking codes for the Parkbench initiative: An ftp site has been created to gather the benchmarks for final testing and packaging. Since Supercomputing '93 is quickly approaching, please submit them as soon as possible. You may submit the code in any archive format you wish (shar, tar, etc.), but I would prefer that the archives be compressed to help alleviate disk space problems. Please remember to change ftp from ASCII to binary before transferring the compressed files. 
I would appreciate it if you sent me e-mail after submitting the benchmarks so that I can contact you if I have any questions or if the transfer is incomplete. I will promptly remove the benchmarks from the directory, do a small amount of final testing, and package them (hopefully) in time for Supercomputing '93. Please direct all questions, concerns, complaints, and queries to . Thank you.

Todd Letsche
University of Tennessee - Knoxville
letsche@cs.utk.edu
Voice: (615) 974-5886
Fax  : (615) 974-8296

[*][*][*][*][*][*][*][*][*][*][*][*][*][*][*][*][*][*][*][*][*][*][*][*]

Information on the specifics of sending the files follows:

Machine name : cs.utk.edu
Login as     : parkbnch (notice the missing "e")
Password     : hcnbkrap (login name in reverse )

Example :
-------------------------------------------------------------------------
% ftp cs.utk.edu
Connected to cs.utk.edu.
220 cs FTP server (Version 2.0WU(11) Fri Apr 9 16:45:39 EDT 1993) ready.
Name (cs.utk.edu:letsche): parkbnch
331 Password required for parkbnch.
Password:
230-You are using cs's ftp server, a machine
230-maintained by the Computer Science Department
230-at the University of Tennessee, Knoxville.
230-
230-Be aware that your transactions may be logged.
230-If you don't care for this policy, then don't use this ftp server.
230-
230-
230 User parkbnch logged in.
ftp> binary
200 Type set to I.
ftp> send LU.shar.Z
...
-------------------------------------------------------------------------
From owner-pbwg-comm@CS.UTK.EDU Wed Oct 27 10:30:56 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA05527; Wed, 27 Oct 93 10:30:56 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA14304; Wed, 27 Oct 93 09:28:50 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 27 Oct 1993 09:28:48 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from rios2.EPM.ORNL.GOV by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA14268; Wed, 27 Oct 93 09:28:44 -0400 Received: by rios2.epm.ornl.gov (AIX 3.2/UCB 5.64/4.03) id AA18172; Wed, 27 Oct 1993 09:28:50 -0400 Message-Id: <9310271328.AA18172@rios2.epm.ornl.gov> To: dongarra@rios2.epm.ornl.gov Cc: pbwg-comm@cs.utk.edu Subject: ParkBench benchmarks Date: Wed, 27 Oct 93 09:28:49 -0500 From: David W. Walker

Jack, I've just got a message from Todd Letsche about submitting codes to ParkBench. As far as the compact applications are concerned I have asked people to only submit a submission form describing the application so that we can choose which ones we want in the suite. Only after this initial vetting will we ask people to actually send us code. The purpose of this vetting is as follows:
1. It saves people the trouble of putting together their code in a transportable form. I expect we'll receive many more submissions than we actually accept.
2. We must avoid people just sending us codes (without documentation).
3. We must make sure codes are suitable.
4. We must make sure that a minimal amount of work is required by us to test and run them.
I don't understand the necessity of having the codes before SC93. Even if we have the codes we won't be able to do anything useful with them in the next 2 weeks.
David From owner-pbwg-comm@CS.UTK.EDU Wed Oct 27 13:50:55 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA07494; Wed, 27 Oct 93 13:50:55 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA07023; Wed, 27 Oct 93 13:48:22 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 27 Oct 1993 13:48:21 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from haven.EPM.ORNL.GOV by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA07015; Wed, 27 Oct 93 13:48:20 -0400 Received: by haven.EPM.ORNL.GOV (4.1/1.34) id AA26558; Wed, 27 Oct 93 13:48:22 EDT Date: Wed, 27 Oct 93 13:48:22 EDT From: worley@haven.EPM.ORNL.GOV (Pat Worley) Message-Id: <9310271748.AA26558@haven.EPM.ORNL.GOV> To: walker@rios2.epm.ornl.gov, dongarra@rios2.epm.ornl.gov Subject: Re: ParkBench benchmarks In-Reply-To: Mail from 'David W. Walker ' dated: Wed, 27 Oct 93 09:28:49 -0500 Cc: pbwg-comm@cs.utk.edu > Jack, > I've just got a message from Todd Letsche about submitting > codes to ParkBench. As far as the compact applications are concerned > I have asked people to only submit a submission form describing the > application so that we can choose which ones we want in the suite. > Only after this initial vetting will we ask people to actually send > us code. The purpose of this vetting is as follows: > 1. It saves people the trouble of putting together their code in > a transportable form. I expect we'll receive many more submissions than > we actually accept. > 2. We must avoid people just sending us codes (without documentation) > 3. We must make sure codes are suitable. > 4. We must make sure that a minimal amount of work is required by us to test > and run them. > I don't understand the necessity of having the codes before SC93. Even if > we have the codes we won't be able to do anything useful with them in the > next 2 weeks. > > David I concur.
In any case, there is likely to be some iteration for the first few submissions to establish what is/is not acceptable in a compact application code. If we are asking people to alter already working (and useful) 10K lines of code in order to include them in the benchmark suite, we should know exactly what we need and why. Pat From owner-pbwg-comm@CS.UTK.EDU Thu Oct 28 07:51:49 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA11422; Thu, 28 Oct 93 07:51:49 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA02596; Thu, 28 Oct 93 07:49:40 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 28 Oct 1993 07:49:39 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA02533; Thu, 28 Oct 93 07:49:25 -0400 Via: uk.ac.southampton; Thu, 28 Oct 1993 11:48:35 +0000 Via: brewery.ecs.soton.ac.uk; Thu, 28 Oct 93 11:16:35 GMT From: Vladimir Getov Received: from beluga.ecs.soton.ac.uk by brewery.ecs.soton.ac.uk; Thu, 28 Oct 93 11:28:38 GMT Date: Thu, 28 Oct 93 11:28:40 GMT Message-Id: <18119.9310281128@beluga.ecs.soton.ac.uk> To: dongarra@rios2.epm.ornl.gov, walker@rios2.epm.ornl.gov Subject: Re: ParkBench benchmarks Cc: pbwg-comm@cs.utk.edu > > Jack, > I've just got a message from Todd Letsche about submitting > codes to ParkBench. As far as the compact applications are concerned > I have asked people to only submit a submission form describing the > application so that we can choose which ones we want in the suite. > Only after this initial vetting will we ask people to actually send > us code. The purpose of this vetting is as follows: > 1. It saves people the trouble of putting together their code in > a transportable form. I expect we'll receive many more submissions than > we actually accept. > 2. We must avoid people just sending us codes (without documentation) > 3. We must make sure codes are suitable. > 4.
We must make sure that a minimal amount of work is required by us to test > and run them. > I don't understand the necessity of having the codes before SC93. Even if > we have the codes we won't be able to do anything useful with them in the > next 2 weeks. > > David > I think that a similar vetting procedure should be adopted for all ParkBench codes. This would help a lot to assemble a consistent benchmark suite with regard to the sequential and parallel versions of every benchmark (PVM, MPI, HPF, etc. including versions, subsets, etc.), the style of time measurement and reporting, the input/output and Make-files, the README files, etc. Perhaps a short `Guidelines for submission' is what we need. Vladimir vsg@ecs.soton.ac.uk Dr Vladimir Getov Tel: +44 703 593368 Dept of Electronics and Computer Science Fax: +44 703 593045 University of Southampton Southampton SO9 5NH, U.K. From owner-pbwg-comm@CS.UTK.EDU Thu Oct 28 08:50:07 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA11594; Thu, 28 Oct 93 08:50:07 -0400 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA07103; Thu, 28 Oct 93 08:48:16 -0400 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Thu, 28 Oct 1993 08:48:15 EDT Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from rios2.EPM.ORNL.GOV by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA07095; Thu, 28 Oct 93 08:48:13 -0400 Received: by rios2.epm.ornl.gov (AIX 3.2/UCB 5.64/4.03) id AA17268; Thu, 28 Oct 1993 08:48:21 -0400 Message-Id: <9310281248.AA17268@rios2.epm.ornl.gov> To: pbwg-comm@cs.utk.edu Subject: Submission form for compact applications Date: Thu, 28 Oct 93 08:48:21 -0500 From: David W. Walker In case you haven't already seen it, below is the submission form for the ParkBench Compact Applications suite. David PARKBENCH COMPACT APPLICATIONS SUBMISSION FORM To submit a compact application to the ParkBench suite you must follow the following procedure: 1.
Complete the submission form below, and email it to David Walker at walker@msr.epm.ornl.gov. The data on this form will be reviewed by the ParkBench Compact Applications Subcommittee, and you will be notified if the application is to be considered further for inclusion in the ParkBench suite.

2. If the ParkBench Compact Applications Subcommittee decides to consider your application further you will be asked to submit the source code and input and output files, together with any documentation and papers about the application. Source code and input and output files should be submitted by email, or ftp, unless the files are very large, in which case a tar file on a 1/4 inch cassette tape should be sent. Wherever possible email submission is preferred for all documents in man page, LaTeX and/or PostScript format. These files, documents, and papers together constitute your application package. Your application package should be sent to:

   David Walker
   Oak Ridge National Laboratory
   Bldg. 6012/MS-6367
   P. O. Box 2008
   Oak Ridge, TN 37831-6367
   (615) 574-7401/0680 (phone/fax)
   walker@msr.epm.ornl.gov

The street address is "Bethel Valley Road" if Fedex insists on this. The subcommittee will then make a final decision on whether to include your application in the ParkBench suite.

3. If your application is approved for inclusion in the ParkBench suite you (or some authorized person from your organization) will be asked to complete and sign a form giving ParkBench authority to distribute, and modify (if necessary), your application package.
------------------------------------------------------------------------------- Name of Program : ------------------------------------------------------------------------------- Submitter's Name : Submitter's Organization: Submitter's Address : Submitter's Telephone # : Submitter's Fax # : Submitter's Email : ------------------------------------------------------------------------------- Cognizant Expert(s) : CE's Organization : CE's Address : CE's Telephone # : CE's Fax # : CE's Email : ------------------------------------------------------------------------------- Extent and timeliness with which CE is prepared to respond to questions and bug reports from ParkBench : ------------------------------------------------------------------------------- Major Application Field : Application Subfield(s) : ------------------------------------------------------------------------------- Application "pedigree" (origin, history, authors, major mods) : ------------------------------------------------------------------------------- May this code be freely distributed (if not specify restrictions) : ------------------------------------------------------------------------------- Give length in bytes of integers and floating-point numbers that should be used in this application: Integers : bytes Floats : bytes ------------------------------------------------------------------------------- Documentation describing the implementation of the application (at module level, or lower) : ------------------------------------------------------------------------------- Research papers describing sequential code and/or algorithms : ------------------------------------------------------------------------------- Research papers describing parallel code and/or algorithms : ------------------------------------------------------------------------------- Other relevant research papers: ------------------------------------------------------------------------------- Application available in the 
following languages (give message passing system used, if applicable, and machines application runs on) : ------------------------------------------------------------------------------- Total number of lines in source code: Number of lines excluding comments : Size in bytes of source code : ------------------------------------------------------------------------------- List input files (filename, number of lines, size in bytes, and if formatted) : ------------------------------------------------------------------------------- List output files (filename, number of lines, size in bytes, and if formatted) : ------------------------------------------------------------------------------- Brief, high-level description of what application does: ------------------------------------------------------------------------------- Main algorithms used: ------------------------------------------------------------------------------- Skeleton sketch of application: ------------------------------------------------------------------------------- Brief description of I/O behavior: ------------------------------------------------------------------------------- Brief description of load balance behavior : ------------------------------------------------------------------------------- Describe the data distribution (if appropriate) : ------------------------------------------------------------------------------- Give parameters of the data distribution (if appropriate) : ------------------------------------------------------------------------------- Give parameters that determine the problem size : ------------------------------------------------------------------------------- Give memory as function of problem size : ------------------------------------------------------------------------------- Give number of floating-point operations as function of problem size : ------------------------------------------------------------------------------- Give communication overhead as function of 
problem size and data distribution : ------------------------------------------------------------------------------- Give three problem sizes, small, medium, and large for which the benchmark should be run (give parameters for problem size, sizes of I/O files, memory required, and number of floating point operations) : ------------------------------------------------------------------------------- How did you determine the number of floating-point operations (hardware monitor, count by hand, etc.) : ------------------------------------------------------------------------------- Other relevant information: ------------------------------------------------------------------------------- From owner-pbwg-comm@CS.UTK.EDU Wed Nov 3 13:23:18 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA25162; Wed, 3 Nov 93 13:23:18 -0500 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA16320; Wed, 3 Nov 93 13:21:15 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 3 Nov 1993 13:21:14 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from Sun.COM by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA16312; Wed, 3 Nov 93 13:21:09 -0500 Received: from Eng.Sun.COM (zigzag.Eng.Sun.COM) by Sun.COM (4.1/SMI-4.1) id AB29806; Wed, 3 Nov 93 10:21:07 PST Received: from cumbria.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA24671; Wed, 3 Nov 93 10:20:39 PST Received: by cumbria.Eng.Sun.COM (5.0/SMI-SVR4) id AA06753; Wed, 3 Nov 1993 10:21:05 +0800 Date: Wed, 3 Nov 1993 10:21:05 +0800 From: Bodo.Parady@Eng.Sun.COM (Bodo Parady - SMCC Systems Performance) Message-Id: <9311031821.AA06753@cumbria.Eng.Sun.COM> To: pbwg-comm@cs.utk.edu Subject: PAR93 announcement X-Sun-Charset: US-ASCII Content-Length: 13947 All of the members of the PARKBENCH group are welcome to participate in discussion of the PAR93 suite at SC93. The following is an announcement of the suite. 
Many thanks to Joanne Martin for scheduling a room that would not conflict with PARKBENCH. I hope to see you all in Portland.

Bodo

PAR93
-----

Charter: PAR93 is a benchmark suite designed to measure cache-based RISC SMP system performance using well-behaved codes that parallelize automatically. While no one benchmark can fully characterize performance, it is hoped the results of a variety of realistic benchmarks can give valuable insight into expected real performance.

PAR93 and its use in the performance analysis of SMPs will be discussed at:

Supercomputing 93
Room A-107, Portland Convention Center
4:30-6:30 Tuesday, November 16, 1993
1.800.go2.sc93 (1.800.462.7293)

Papers will be delivered by Mike Humphrey of SGI, Vinod Grover of Sun, Bodo Parady of Sun, and Bruce Leasure of Kuck and Associates. This will be followed by a panel discussion which, in addition to the presenters, includes Forrest Baskett of SGI, and John Hennessy of Stanford.

Focus -- This suite is for users of small to medium (<64 cpus) symmetric multiprocessing systems who wish to compare the performance of various systems, but who do not wish to change source code to attain parallel speedups. As such, the benchmarks allow for no tuning of the code beyond the base, and no addition of compiler directives. It is assumed that code tuning through source code changes and the addition of compiler directives can attain greater speedups, but this is outside the scope of this benchmark. Distributed memory systems, vector architectures, and massively parallel systems are likewise outside the goals of this benchmark, although nothing in the suite prevents any system architecture from being used; other architectures are encouraged to run this benchmark. Vector and MP vector architectures may find that this suite offers an excellent method to evaluate the performance of their systems and are heartily encouraged to run this suite.
This suite is intended to characterize the subset of applications that compilers can parallelize automatically (compilation without the use of directives). As such this suite offers examples of parallelizable code and an indication of the speedups attainable from compilers on some applications. This benchmark offers users, system designers and compiler writers a method to compare their products on a completely level playing field, with no parallel programming paradigm favored.

PAR93 is based on codes from SPEC CFP92, Perfect Club, NASA Ames, ARCO, We gratefully acknowledge the pioneering work by David Bailey, George Cybenko, Dave Schneider, and Wolfgang Gentzsch, among others, and solicit their guidance in the continued evolution of the suite.

Q: What equipment do I need to run PAR93?

A: Here is a minimum configuration:

   320 MB Main Memory
   500 MB Swap disk
   1 processor
   An OS
   A Fortran compiler

   Here is a suggested configuration:

   500 MB Main Memory
   1 GB swap disk
   2 processors
   An OS capable of serving multiple OS's (e.g., Unix)
   An MP Fortran compiler

Q: Why is it named PAR93 and not SPECpar93?

A: It has not been approved by SPEC. PAR93 has been submitted to SPEC for further development. Before release from SPEC it will undergo a rigorous approval process to ensure it is truly portable, and provides a level playing field for comparison of certain types of systems. It could become SPECpar94.

Q: Will PAR93 indicate the performance of SPECpar94?

A: Not necessarily. During the SPEC development process, benchmarks may be added, dropped, or modified.

Q: I have strong opinions on what benchmarks ought to be added (dropped, or modified). How can I influence the process?

A: Get involved. Bring your ideas, your data and analysis, your time and energy to SPEC. Consider joining SPEC.
Write to the following address for more information:

   SPEC [Systems Performance Evaluation Corporation]
   c/o NCGA [National Computer Graphics Association]
   2722 Merrilee Drive
   Suite 200
   Fairfax, VA 22031
   USA
   Phone: +1-703-698-9600 Ext. 318
   FAX: +1-703-560-2752
   E-Mail: spec-ncga@cup.portal.com

For technical questions regarding the SPEC benchmarks (e.g., problems with execution of the benchmarks), Dianne Dean (she is the person normally handling SPEC matters at NCGA) refers the caller to an expert at a SPEC member company.

Q: What are the run rules?

A: The run rules are identical to the SPEC run rules, that is to say:

   . No code changes, except for portability
   . No compiler directives
   . No benchmark specific optimizations

All but the last are easy to enforce and understand.

Q: Why are compiler directives not allowed, since with my parallel system, the addition of one directive would improve performance of typical user codes by a great deal, and it only takes a couple of minutes to put in the directives, and everyone does it?

A: All of this may be true, but PAR93 has been constructed with the goal of measuring parallel system performance, where a system includes the cpu, cache, memory system, OS, and compiler. The goal is measurement of system performance using codes that parallelize automatically. Many codes that parallelize well have been excluded from PAR93 because in order to parallelize well, they require the intervention of the programmer. Due to the availability of many codes that present day compilers can parallelize, the run rules were restricted to those that compilers can deal with. This eliminates human factors in the comparison of various systems. Most users will find that the codes provided demonstrate excellent speedups on their SMP system. From this, a user might be able to infer a little about what can be gained by tuning. Barring compiler directives levels the playing field, since no directive format is favored and no types of directives are favored.
It also spares performance engineers the career agony of constantly tuning benchmarks. There have been other benchmarks that try to deal with user tuning like Perfect, and others that try to measure unrestricted CPU performance like the TPP linpack, but PAR93 is a benchmark that measures the compiler and OS in addition to the effects of the hardware.

Q: Is automatic parallelization mature enough to be the basis of a benchmark suite?

A: Automatic parallelization has been commercially available since 1986 on Sequent, and possibly earlier on Alliant machines (their first machine shipped with automatic parallelization), and on multi-processor Cray machines (in the late 1980's). Now there are many commercial automatic parallelizers, with David Kuck and his group at UIUC being one of the leading academic organizations researching automatic parallelization.

Q: What application areas are addressed in PAR93?

A: The following areas are addressed:

101.tomcatv  2d mesh generation. This program is the author's modernized version (1991) compared to the SPECfp92 version (1987) of the original tomcatv. It is also scaled up in size.

102.swim  The original SPEC program to compare performance of shallow water equations by Swarztrauber and Sato has been scaled up to arrays of 513x513 from 256x256.

103.su2cor  The same quantum physics program using Monte-Carlo methods from Prof. Bunk that is found in SPECfp92.

104.hydro2d  The same SPECfp92 program that solves the astrophysical Navier-Stokes equations. The problem is scaled up from 102x102 to 127x127.

105.tfft  David Bailey's NASA FFT set. The FFT length is 4M elements.

106.fdmod  Siamak Hassanzadeh's 2d finite difference scheme for 3d seismic wave propagation. This is part of the original ARCO Benchmark Suite. It was selected because this portion of seismic data processing consumes over 80% of the computer cycles.

107.mgrid  Eric Barszcz and Paul Fredrickson's multigrid solver from NASA Ames.
109.appbt  Sisira Weeratunga's block ADI solver for nonlinear PDE's from NASA Ames, using block tridiagonal solvers.

111.appsp  Sisira Weeratunga's block ADI solver for nonlinear PDE's from NASA Ames based on the solution of penta-diagonal systems of equations.

113.ora  An update of the original SPEC 048.ora to prevent constant propagation by reading input. The problem size has been updated. ora is a ray tracing program.

117.nump  Portions of Peter Montgomery's number theory program that parallelize.

Q: I want to use profiling tools and beautiful GUI's to assist in parallelization. Why does this suite not encourage their use and allow me to demonstrate them?

A: On the contrary, this suite will get the attention of parallel computer vendors, and every GUI and tool will be applied to get better compilers out. Some users like GUIs and tools, but a greater majority prefer compilers. Furthermore, there could be future evolutions of PAR93 that address the more advanced issues of parallelization.

Q: What is the reporting methodology?

A: The principle is to report both raw performance and scaling of MP systems. To the first level, one configuration must be reported, for example the run times for 8 cpus applied to the PAR93 suite could be reported. To a second level, all of the performance for 1, 2, 4, and 8 (or any sequence of cpus, with up to 5 configurations in a single report) can be reported. Scaling with the number of processors is displayed, but as with the present SPEC benchmark, the single figure of merit is the ratio to a reference system, in this case a Sun SPARCstation 10 Model 41 with 40 MHz SuperSPARC and 1MB external cache. The baseline uniprocessor values for the suite are:

   Benchmark      Seconds
   101.tomcatv     1171.5
   102.swim        1220.6
   103.su2cor       161.5
   104.hydro2d      990.5
   105.tfft        2628.4
   106.fdmod        929.2
   107.mgrid        630.6
   109.appbt       2424.4
   111.appsp       2476.2
   113.ora          832.0
   117.nump        1005.1

This takes less than 4 hours to run on a single cpu 40.33MHz SS10/41.
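As a rough illustration of the reporting methodology described above, the following Python sketch computes per-benchmark ratios against the SS10/41 baseline times from the table and combines them into a single number. Combining the ratios with a geometric mean is an assumption here, borrowed from the way SPEC92 combines its SPECratios; the announcement itself only says that the figure of merit is "the ratio to a reference system". The function names and the measured times in the example run are hypothetical.

```python
import math

# Baseline uniprocessor times in seconds (Sun SPARCstation 10 Model 41),
# copied from the table in the announcement.
BASELINE_SECONDS = {
    "101.tomcatv": 1171.5,
    "102.swim": 1220.6,
    "103.su2cor": 161.5,
    "104.hydro2d": 990.5,
    "105.tfft": 2628.4,
    "106.fdmod": 929.2,
    "107.mgrid": 630.6,
    "109.appbt": 2424.4,
    "111.appsp": 2476.2,
    "113.ora": 832.0,
    "117.nump": 1005.1,
}


def ratios(measured):
    """Return baseline_time / measured_time for every benchmark in the suite."""
    missing = set(BASELINE_SECONDS) - set(measured)
    if missing:
        raise ValueError(f"missing results for: {sorted(missing)}")
    return {name: BASELINE_SECONDS[name] / measured[name]
            for name in BASELINE_SECONDS}


def figure_of_merit(measured):
    """Geometric mean of the ratios (an assumption, following SPEC92 practice)."""
    values = list(ratios(measured).values())
    return math.exp(sum(math.log(v) for v in values) / len(values))


def total_baseline_hours():
    """Sum of the baseline run times, converted from seconds to hours."""
    return sum(BASELINE_SECONDS.values()) / 3600.0


if __name__ == "__main__":
    # Hypothetical multiprocessor run: every code takes one third of the
    # baseline time. These numbers are invented for illustration only.
    measured = {name: t / 3.0 for name, t in BASELINE_SECONDS.items()}
    print(f"figure of merit: {figure_of_merit(measured):.2f}")
    print(f"baseline total:  {total_baseline_hours():.2f} hours")
```

The listed baseline times sum to 14,470 seconds, or roughly four hours. A scaling report in the sense of the second reporting level would simply call `figure_of_merit` once per cpu count.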
Q: Some of the benchmarks are not full applications. Why not substitute full applications for these abstractions?

A: Some, like SWIM, are abstractions from full programs, but issues of availability, code size, and test cases have prevented SPEC from using full applications in all cases. These same limitations apply to the PAR93 suite. As this suite matures, and SPEC members have their input and are able to contribute more to this suite, more full applications are expected.

Q: Some codes, like tomcatv, represent 2D problems when in reality, 3D mesh generation and 3D fluid problems are more realistic. Why not use more modern programs?

A: There is an effort to port a 3d mesh generator, but this will take time given its huge size. When complete, this could add to the quality of the suite. Consumers of PAR93 realize that this is the first generation, and that future generations aim to attain much higher levels of quality.

Q: What about the duplication of similar numerical techniques such as swim and fdmod using explicit finite differences?

A: Even though swim and fdmod have similar methods, they represent different application areas, and the methods and program setup are quite different. Compare this for example to SPECfp92 where mdljsp2 and mdljdp2 are simply the same program, but one is single precision and the other is double precision. The comparison of results from the two similar programs can provide insights not otherwise available.

Q: Why not report two numbers for this suite? One obtained by autoparallelization and the other obtained by hand parallelization or the addition of directives?

A: This is similar to the reporting rules of PERFECT and some of the PARKBENCH suite. The purpose here has been stated above. Reporting results in this manner could be a possible rule for a future suite.

Q: What about adding codes that have subroutine and loop level parallelism that encourage the use of directives and having various granularities?
A: This is a good suggestion for a future suite.

Please direct responses to:

Bodo Parady bodo.parady@eng.sun.com
Michael Humphrey mikehu@sgi.com

Bodo Parady                  | (415) 336-0388
SMCC, Sun Microsystems       | Bodo.Parady@eng.sun.com
Mail Stop MTV15-404          | Domain: bodo@cumbria.eng.sun.com
2550 Garcia Ave.             | Alt: na.parady@na-net.ornl.gov
Mountain View, CA 94043-1100 | FAX: (415) 336-4636

From owner-pbwg-comm@CS.UTK.EDU Mon Nov 8 12:24:09 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (5.61+IDA+UTK-930125/2.8t-netlib) id AA02240; Mon, 8 Nov 93 12:24:09 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id RAA09576; Mon, 8 Nov 1993 17:22:35 GMT X-Resent-To: pbwg-comm@CS.UTK.EDU ; Mon, 8 Nov 1993 17:22:34 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id MAA09559; Mon, 8 Nov 1993 12:22:26 -0500 Via: uk.ac.southampton; Mon, 8 Nov 1993 17:20:08 +0000 From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Mon, 8 Nov 93 16:41:29 GMT Date: Mon, 8 Nov 93 16:50:47 GMT Message-Id: <859.9311081650@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: AGENDA for Portland

I attach my draft agenda for the Portland meeting. Please mail to pbwg-comm anything you wish to add to this, so that we have some advance warning of what you wish to discuss. Please do so immediately so that I can have the agenda ready for the meeting.

---------

To Mike : I will edit the agenda in line with responses, and try to get a revised version to you for reproduction before you leave. When do you leave Knoxville? Otherwise we can probably have some copies run off in Portland.
------------

PARKBENCH AGENDA
17 November 1993
(Birds of a Feather Session, Supercomputing93)
Portland, Oregon
----------------------------------

(1) Minutes of last meeting (23rd August 1993)

(2) Presentation of first PARKBENCH Report by subgroup leaders:
    (2.1) Chapter-1: Introduction (Roger Hockney)
    (2.2) Chapter-2: Methodology (David Bailey)
    (2.3) Chapter-3: Low-Level (Roger Hockney)
    (2.4) Chapter-4: Kernels (Tony Hey)
    (2.5) Chapter-5: Compact Applications (David Walker)
    (2.6) Chapter-6: Compiler Benchmarks (Tom Haupt)

(3) Open discussion on objectives/actions for next year, to be completed by and presented, like to-day, at Supercomputing94, November, 1994:
    (3.1) Update of PARKBENCH Report (with addition of some Compact Applications).
    (3.2) Second release of PARKBENCH benchmarks.
    (3.3) First official PARKBENCH paper report of benchmark results.
    (3.4) Demonstration of the Performance Data Base (PDS) applied to the above results of the PARKBENCH benchmarks.
    (3.5) Demonstration of the Southampton Interactive Graphical Interface to the PDS.

(4) Meetings : Discussion of format for next year.
    (4.1) Propose one live meeting at Supercomputing94, one teleconference in May 1994 (between a US and UK site, perhaps Knoxville and Southampton), with intermediate business conducted by e-mail (Hockney).
    (4.2) ......... (....).

(5) Publication of present PARKBENCH Report, and subsequent updates and results, perhaps as a series of yearly monographs. But which publisher might be interested (Hockney)?

(6) Should PARKBENCH seek some form of sponsorship from vendors (Berry)?

(7) Date and Venue of Next Meeting :

(8) A.O.B.
From owner-pbwg-comm@CS.UTK.EDU Tue Nov 9 09:53:33 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id JAA14281; Tue, 9 Nov 1993 09:53:32 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id OAA13658; Tue, 9 Nov 1993 14:53:00 GMT X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 9 Nov 1993 14:52:59 EST Errors-to: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id JAA13646; Tue, 9 Nov 1993 09:52:54 -0500 Via: uk.ac.southampton; Tue, 9 Nov 1993 13:48:02 +0000 From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Tue, 9 Nov 93 13:10:23 GMT Date: Tue, 9 Nov 93 13:19:41 GMT Message-Id: <2576.9311091319@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Portland Meeting Portland Meeting ---------------- Message to subcommittee leaders: (Hockney, Bailey, Hey, Walker, Haupt, Dongarra) The principal objective of this meeting is to present our first year's work, namely our Report and the first issue of the benchmarks, to the wider parallel computing community. As you can see from the draft agenda already sent, after giving the background and introducing the report, I plan to call on each subcommittee leader to present his chapter and describe briefly the benchmarks available. Please be prepared with appropriate overheads to give a 5 to 10 minute presentation of your chapter. Can Jack Dongarra, or an alternate, also be prepared to make a statement about the availability of the benchmark suite, and where and how to send in results (I omitted this from the first draft of the agenda)? Jack, are you planning anything by way of demonstration of PDS, after I close the formal meeting? Most of us have seen it before, but I expect there will be many people coming who have not. Suppose I add to the agenda, after Tom Haupt's presentation, an item "PDS and Benchmark availability (Dongarra)"? Looking forward to seeing you all again in Portland.
Does anyone know definitely the time on 17 November of our Birds of a Feather session? Modifications/additions to the Agenda ASAP please. Roger Hockney (chairman) From owner-pbwg-comm@CS.UTK.EDU Tue Nov 9 11:50:06 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id LAA15599; Tue, 9 Nov 1993 11:50:05 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id QAA23857; Tue, 9 Nov 1993 16:49:47 GMT X-Resent-To: pbwg-comm@CS.UTK.EDU ; Tue, 9 Nov 1993 16:49:46 EST Errors-to: owner-pbwg-comm@CS.UTK.EDU Received: from vnet.IBM.COM by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id LAA23850; Tue, 9 Nov 1993 11:49:44 -0500 Message-Id: <199311091649.LAA23850@CS.UTK.EDU> Received: from KGNVMC by vnet.IBM.COM (IBM VM SMTP V2R2) with BSMTP id 9300; Tue, 09 Nov 93 11:49:43 EST Date: Tue, 9 Nov 93 11:49:47 EST From: "Dr. Joanne L. Martin ((914) 385-9572)" To: pbwg-comm@CS.UTK.EDU Subject: BOF sessions at SC93 Attached is the full BOF schedule for SC93, including the Parkbench session on Wednesday afternoon. See you all next week. Joanne. 
Tuesday, November 16 Room A-107 Room A-108 Room A-109 10:30-12:00 Walter Rudd Doreen Cheng ANSI X3H5 Standardizing Technical Update Debugger Server Protocol 1:30-3:30 Leslie Hart Numerical Weather Prediction 3:30-5:30 Jack Worlton Commercialization of MPP 4:30-6:30 Bodo Parady PAR93 5:00-7:00 Carolyn Colin Fiber Channel Wednesday, November 17 Room A-107 Room A-108 Room A-109 2:00-4:00 Mike McGrath JNNIE 3:30-5:00 Craig Lee Virtual Machines 4:30-6:30 Jack Dongarra ParkBench 5:00-6:30 Doreen Cheng Standardizing Debugger Server Protocol 6:30-8:30 Barry Leiner Software Exchange for HPC Tuesday, November 16 Room A-107 Room A-108 Room A-109 10:30-12:00 Walter Rudd Doreen Cheng ANSI X3H5 Standardizing Technical Update Debugger Server Protocol 1:30-3:30 Leslie Hart Numerical Weather Prediction 3:30-5:30 Jack Worlton Commercialization of MPP 4:30-6:30 Bodo Parady PAR93 5:00-7:00 Carolyn Colin Fiber Channel Wednesday, November 17 Room A-107 Room A-108 Room A-109 2:00-4:00 Mike McGrath JNNIE 3:30-5:00 Craig Lee Virtual Machines 4:30-6:30 Jack Dongarra ParkBench 5:00-6:30 Doreen Cheng Standardizing Debugger Server Protocol 5:30-7:30 Barry Leiner Software Exchange for HPC From owner-pbwg-comm@CS.UTK.EDU Wed Nov 10 23:52:51 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with SMTP (8.6.4/2.8t-netlib) id XAA00987; Wed, 10 Nov 1993 23:52:50 -0500 Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA07537; Wed, 10 Nov 93 23:50:45 -0500 X-Resent-To: pbwg-comm@CS.UTK.EDU ; Wed, 10 Nov 1993 23:50:44 EST Errors-To: owner-pbwg-comm@CS.UTK.EDU Received: from BERRY.CS.UTK.EDU by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930922/2.8s-UTK) id AA07526; Wed, 10 Nov 93 23:50:42 -0500 Received: from LOCALHOST by berry.cs.utk.edu with SMTP (5.61+IDA+UTK-930922/2.7c-UTK) id AA07266; Wed, 10 Nov 93 23:50:27 -0500 Message-Id: <9311110450.AA07266@berry.cs.utk.edu> To: R.Hockney@pac.soton.ac.uk Cc: pbwg-comm@CS.UTK.EDU Subject: Re: AGENDA for Portland In-Reply-To: Your 
message of "Mon, 08 Nov 1993 16:50:47 GMT." <859.9311081650@calvados.pac.soton.ac.uk> Date: Wed, 10 Nov 1993 23:50:26 -0500 From: "Michael W. Berry" > I attach my draft agenda for the Portland meeting. Please mail to > pbwg-comm anything you wish adding to this, so that we have some > advance warning of what you wish to discuss. Please do so immediately > so that I can have the agenda ready for the meeting. > --------- > To Mike : I will edit the agenda in line with responses, and try to get > a revised version to you for reproduction before you leave. When do you > leave Knoxville? Roger, I have received no changes to the agenda and I can wait till Monday to run off about 100 or so copies. I'm sending this to pbwg-comm to solicit other changes to the agenda in Portland. See you in a few days. I'll arrive Tuesday evening. Mike From owner-pbwg-comm@CS.UTK.EDU Fri Nov 12 08:52:29 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id IAA11311; Fri, 12 Nov 1993 08:52:28 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id NAA19260; Fri, 12 Nov 1993 13:51:44 GMT X-Resent-To: pbwg-comm@CS.UTK.EDU ; Fri, 12 Nov 1993 13:51:43 EST Errors-to: owner-pbwg-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id IAA19253; Fri, 12 Nov 1993 08:51:39 -0500 Via: uk.ac.southampton; Fri, 12 Nov 1993 13:27:40 +0000 From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Fri, 12 Nov 93 12:45:44 GMT Date: Fri, 12 Nov 93 12:53:58 GMT Message-Id: <14253.9311121253@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Portland Agenda PORTLAND AGENDA --------------- Attached is the current version of the agenda. Please broadcast to pbwg-comm any changes/additions that you want. Mike Berry will then make them, although I think we have enough on the agenda already, but you may have other ideas for our schedule of meetings next year.
I leave tomorrow morning, Saturday, see you all in Portland. Best regards Roger Hockney PARKBENCH AGENDA ---------------- 17 November 1993 Room A-109, 4:30-6:30pm (Birds of a Feather Session, Supercomputing93) Portland, Oregon ---------------------------------- (1) Minutes of last meeting (23rd August 1993) (2) Presentation of first PARKBENCH Report by subgroup leaders: (2.1) Chapter-1: Introduction (Roger Hockney) (2.2) Chapter-2: Methodology (David Bailey) (2.3) Chapter-3: Low-Level (Roger Hockney) (2.4) Chapter-4: Kernels (Tony Hey) (2.5) Chapter-5: Compact Applications (David Walker) (2.6) Chapter-6: Compiler Benchmarks (Tom Haupt) (2.7) Performance Data Base (PDS) and Benchmark availability (Dongarra) (3) Open discussion on objectives/actions for next year, to be completed by and presented, like to-day, at Supercomputing94, November, 1994: (3.1) Update of PARKBENCH Report (with addition of some Compact Applications). (3.2) Second release of PARKBENCH benchmarks. (3.3) First official PARKBENCH paper report of benchmark results. (3.4) Demonstration of the Performance Data Base (PDS) applied to the above results of the PARKBENCH benchmarks. (3.5) Demonstration of the Southampton Interactive Graphical Interface to the PDS. (4) Meetings : Discussion of format for next year. (4.1) Propose one live meeting at Supercomputing94, one teleconference in May 1994 (between a US and UK site, perhaps Knoxville and Southampton), with intermediate business conducted by e-mail (Hockney). (5) Publication of present PARKBENCH Report, and subsequent updates and results, perhaps as a series of yearly monographs. But which publisher might be interested (Hockney)? (6) Should PARKBENCH seek some form of sponsorship from vendors (Dongarra)? (7) Date and Venue of Next Meeting : (8) A.O.B. 
From owner-parkbench-comm@CS.UTK.EDU Tue Nov 23 23:14:27 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id XAA08309; Tue, 23 Nov 1993 23:14:26 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id EAA06645; Wed, 24 Nov 1993 04:12:43 GMT X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 24 Nov 1993 04:12:41 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from berry.cs.utk.edu by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id XAA06639; Tue, 23 Nov 1993 23:12:40 -0500 Received: from LOCALHOST by berry.cs.utk.edu with SMTP (5.61+IDA+UTK-930922/2.7c-UTK) id AA15569; Tue, 23 Nov 93 23:10:15 -0500 Message-Id: <9311240410.AA15569@berry.cs.utk.edu> To: pbwg-comm@CS.UTK.EDU Subject: PARKBENCH Minutes Date: Tue, 23 Nov 1993 23:10:14 -0500 From: "Michael W. Berry" Please review the enclosed minutes of the PARKBENCH meeting at Supercomputing'93 and send me all corrections and omissions. Regards, Mike (berry@cs.utk.edu) PARKBENCH Meeting - Supercomputing'93 Nov. 17, 1993, Portland Convention Center (The list of attendees is appended to the end of this message). ------------------------------------------------------------------------ At 4:33 pm PST, Roger called the meeting to order. Roger asked for Chapter authors to take 5 minutes to discuss their chapter. He then listed PARKBENCH's Aims: comprehensive set, focus on parallel benchmarking, like Linpack, etc. David B. then reviewed the methodology chapter of the report - went over Section 2.1 (Philosophy) through Section 2.6 (Performance Metrics) & Section 2.7 (Performance Database). Bodo Parady mentioned a problem with security for Sun with Xnetlib. Questioned the resolution of the timers - Xian-he Sun (ICASE) was concerned about variability in floating-point operations across machines. Roger responded that all we want is a scaling factor. Roger reviewed the low-level benchmarks chapter. The Linpack-100 benchmark was noted as a single node benchmark.
Fred Gustavson (IBM) indicated that the Linpack benchmark could be improved with better kernels. Jim Stacey (Collaboratory Inc.) indicated that benchmarks like Linpack do not take into account overheads (some have inlining capabilities and others do not). Bodo Parady (Sun) suggested that the SPEC benchmarks may have an alternative. There is an error in equation (3.3), which should read "\hat{r}_{\infty}". Mohammad Zubair (IBM Research, Yorktown) suggested that a synchronization primitive should be included in COMMS1,2. Roger H. indicated that one is already included in the current PARKBENCH suite. Steve Hotovy (Cornell) questioned the specification of the nodes for COMMS1,2 (e.g., difference in rings on KSR). Ramesh Agarwal (IBM Research, Yorktown) thought COMMS3 was a very good idea. Roger H. reviewed tables of results and indicated they are somewhat out-of-date. Tony H. then reviewed the Kernels chapter and noted the range of applications that the kernels span. Some are taken from the NAS Kernels and others from the Genesis Benchmarks. Matrix benchmarks include the dense matrix multiply, matrix transpose, dense LU factorization with partial pivoting, QR decomposition, and matrix tridiagonalization. Ramesh A. (IBM) asked about the data structure for the transpose. Jack D. indicated that a block cyclic structure is used. Ian Duff (RAL, England) felt that vendors should be allowed to use their optimized library kernels. David B. asked why both transpose and matrix multiplication are included, but several attendees thought the transpose was needed. Regarding the FFT's (1-D and 3-D), David B. pointed out that the 1-D is really a convolution and that the size would be in the millions (long convolution with 2 long sequences via FFT). Fred Gustavson (IBM Research, Yorktown) asked about band matrices - why leave them out? David B. pointed out that band matrices are used in the NAS parallel benchmarks. Tony H.
then reviewed the PDE kernels (Poisson's equation via SOR and multigrid) and other kernels (embarrassingly parallel (NAS), large integer sort from David Bailey, and an I/O paper and pencil benchmark). A comment was made about NAS benchmarks not leaving the country and David B. replied that all clearances have been given. Horst Simon (NASA Ames) asked if CG from NAS would be dropped. Jack D. indicated it is in Compact Applications. David Walker then reviewed the Compact Applications chapter. He pointed out that CA's exhibit behaviors that cannot be found with kernels, and reviewed the specifications for CA's. Acceptable languages include: F77 and C plus message passing, Fortran-90, and HPF. One attendee commented that message passing can be machine-dependent. David W. indicated that the mp libraries should be portable. No assembly code is allowed in the baseline versions. Fred Gustavson (IBM) asked if library routines could be included. David W. suggested that standard intrinsics would be ok. Mohammad Z. asked about the use of smart compilers. David B. said that is fine. Fiona Sim (IBM, Kingston) asked if vendors could use their own mp libraries. It was agreed that vendors should do a portable version first as baseline. Some attendees asked whether a vendor had to supply all 3 forms of a CA. David W. indicated that at least 1 would be sufficient. Tony H. indicated that we may want optimized workstation versions to compare with. CA's submitted thus far include ARCO, NERC (3D Hydro), Nat. Envir. Research Council (NERC), Spectral transform shallow water model (ORNL), and a QCD (Lattice Gauge) code (Univ. of Edinburgh). Fred Gustavson (IBM) asked if any simplex (linear programming) code should be added to the CA suite (applications such as transportation scheduling and optimization). Bodo Parady (Sun) indicated an NYU code for the simplex method might be added. Other attendees noted that the use of complex arithmetic and ODE's was lacking. David W.
finished by providing an email address and instructions for submitting candidate CA codes. Tom H. then reviewed the compiler benchmarks. No questions were raised about these particular benchmarks. Jack D. then reviewed the history of Netlib and the inclusion of benchmarks and the results themselves. He described the Xnetlib system and PDS. He provided sample screens of Xnetlib and noted that the PARKBENCH codes and results would be included in PDS. Jack indicated which codes are currently in Netlib and those which would be appearing soon. Small, medium, and large datasets would also be included. Rudi E. (CSRD) asked if version control had been considered. Jack D. said the index file will have a version number and source files will have timestamps. The final form of how to deal with valid results has not been decided. Rudi E. (CSRD) suggested that a steering committee should approve the results. A few attendees asked how vendors will approve results (contact person). Ramesh Agarwal (IBM Research, Yorktown) asked if vendors should provide optimized code. A description of the optimizations (Diary) was deemed sufficient. Horst Simon (NASA Ames) asked about a quality control mechanism. David B. and Jack D. indicated we should be trusting but erroneous results do come in. More discussion is needed. Bodo P. (Sun) asked if the best or worst results should be reported. Ed K. indicated that getting baseline codes is Intel's only problem. Ian Duff wanted to know how vendors are reacting to PARKBENCH. Robert Numrich and Kevin Lind (CRI) thought there would be no problem. One other attendee was concerned about the lack of throughput or multi-user benchmarks. The review of the first PARKBENCH report was concluded, followed by an acceptance of the minutes from the previous meeting. Roger then went over the rest of the agenda items which included updating the report/codes, generating a new set of results, and having a 3-month update for PDS. The Southampton group will produce graphical tools.
Ramesh Agarwal (IBM Research, Yorktown) asked if we should limit the number of processors to be a power of 2. Roger H. suggested that at least 5 values for the number of processors be submitted. Phil T. (NEC) reminded attendees that 10 gigabytes was acceptable for "large" dataset size. Concerning the format of future meetings, it was agreed that travel should be reduced. Ian Duff (RAL) felt a Quarterly Report might be issued. Roger H. asked if email was sufficient and most attendees felt it wasn't since no deadlines could be enforced. Jack D. announced the upcoming March 29, 1994 "PARKBENCH/SPEC" meeting in Knoxville. Roger H. asked for a show of hands on who was going to the HPCN (high perf. computing and networks) meeting. Most not going. Upcoming meetings in March 1994 (Knoxville, TN): March 28: PARKBENCH; March 29, 30, 31: SPEC. Only 3 people in the room would be going to Knoxville for the SPEC meeting. Roger asked that an August 1994 meeting be a teleconferenced meeting. Fred Gustavson and Joanne Martin will check with IBM about providing facilities. Roger H. asked where the report should be published. He suggested that the journal "Scientific Programming" be considered since the HPF report will be published there as well. This journal by Wiley was considered quite favorably by attendees and Roger H. will pursue Wiley about the report and yearly updates. David B. suggested that the current focus be on publishing the report first and then looking at data later. Roger H. adjourned the meeting at approximately 6:30 pm PST. ------------------------------------------------------------------------ List of Attendees (Nov. 17, 1993) Neil B. MacDonald Edinburgh Parallel Computing Centre Ken Hewick Northeast Parallel Arch. Center Ramesh Agarwal IBM Research, Yorktown Fred Gustavson IBM Research, Yorktown Mohammad Zubair IBM Research, Yorktown Aad van der Steen Academic Computing Centre, Utrecht Kevin R. Lind Cray Research Inc. Robert Numrich Cray Research Inc. Tony Hey Univ.
of Southampton Daniel Frye IBM Kingston James Stacey Collaboratory Inc. Wayne Pfeiffer SDSC Luis Almeida Paravector Ken Miura Fujitsu America Bodo Parady Sun Microsystems Mark R. Smith Cornell Theory Center/IBM Wei-Hwan Chiang IBM Kingston Jack Cole USARL David Mackay Intel SSD Robert Eller Arithmetika Clive Baillie Univ. of Colorado Tom Eidson NASA LARC Mike Humphrey SGI Linton Ward IBM Fiona Sim IBM Kingston Chuck Mosher ARCO Siamak Hassanzadeh Fujitsu America Ed Kushner Intel SSD Ian Duff RAL, England Stefano Foresti Univ. of Utah Steven Hotovy Cornell Theory Center Wolfgang Gentzsch GENIAS Horst Simon NASA Ames Nahil Sobh Saudi Aramco Roger Hockney Univ. of Southampton Michael Berry Univ. of Tennessee John Larson Univ. of Illinois Rudi Eigenmann Univ. of Illinois David Walker ORNL Jack Dongarra Univ. of Tennessee/ORNL Xian-He Sun ICASE Tom Haupt Syracuse University From owner-parkbench-comm@CS.UTK.EDU Wed Nov 24 09:50:25 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id JAA11299; Wed, 24 Nov 1993 09:50:25 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id OAA20919; Wed, 24 Nov 1993 14:49:44 GMT X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 24 Nov 1993 14:49:43 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id JAA20890; Wed, 24 Nov 1993 09:49:30 -0500 Via: uk.ac.southampton; Wed, 24 Nov 1993 14:47:00 +0000 From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Wed, 24 Nov 93 14:19:22 GMT Date: Wed, 24 Nov 93 14:28:42 GMT Message-Id: <7471.9311241428@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: PARKBENCH Publication I am taking responsibility for submitting the Parkbench report to Wiley's journal "Scientific Programming" for publication. It is already accepted in principle, but I need to know of all typos or other changes that you would like made as soon as possible.
Please send e-mail to me personally at rwh@pac.soton.ac.uk. Thank you very much. I think it will make an excellent publication, and help to raise the group's profile in the community. Roger Hockney (Parkbench Chairman) From owner-parkbench-comm@CS.UTK.EDU Thu Dec 2 09:44:48 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id JAA08233; Thu, 2 Dec 1993 09:44:47 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id JAA13515; Thu, 2 Dec 1993 09:43:17 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 2 Dec 1993 09:43:15 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from berry.cs.utk.edu by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id JAA13509; Thu, 2 Dec 1993 09:43:14 -0500 Received: from LOCALHOST by berry.cs.utk.edu with SMTP (5.61+IDA+UTK-930922/2.7c-UTK) id AA08849; Thu, 2 Dec 93 09:43:32 -0500 Message-Id: <9312021443.AA08849@berry.cs.utk.edu> To: pbwg-comm@CS.UTK.EDU Subject: Minutes from Supercomputing Date: Thu, 02 Dec 1993 09:43:31 -0500 From: "Michael W. Berry" Here are the revised minutes from the PARKBENCH Meeting in Portland. They will be posted in comp.benchmarks and comp.parallel. Mike PARKBENCH Meeting - Supercomputing'93 Nov. 17, 1993, Portland Convention Center (The list of attendees is appended to the end of this message). ------------------------------------------------------------------------ At 4:33 pm PST, Roger called the meeting to order. Roger asked for Chapter authors to take 5 minutes to discuss their chapter. He then listed PARKBENCH's Aims: comprehensive set, focus on parallel benchmarking, like Linpack, etc. David B. then reviewed the methodology chapter of the report - went over Section 2.1 (Philosophy) through Section 2.6 (Performance Metrics) & Section 2.7 (Performance Database). Bodo Parady mentioned a problem with security for Sun with Xnetlib. The resolution of the timers was a concern.
Xian-he Sun (ICASE) was concerned about variability in floating-point operations across machines. Roger responded that all we want is a scaling factor. Roger reviewed the low-level benchmarks chapter. The Linpack-100 benchmark was noted as a single node benchmark. Fred Gustavson (IBM) indicated that the Linpack benchmark could be improved with better kernels. Jim Stacey (Collaboratory Inc.) indicated that benchmarks like Linpack do not take into account overheads (some have inlining capabilities and others do not). Bodo Parady (Sun) suggested that the SPEC benchmarks may have an alternative. Error in equation (3.3): should be "\hat{r}_{\infty}". Mohammad Zubair (IBM Research, Yorktown) suggested that a synchronization primitive should be included in COMMS1,2. Roger H. indicated that one is already included in the current PARKBENCH suite. Steve Hotovy (Cornell) questioned the specification of the nodes for COMMS1,2 (e.g., difference in rings on KSR). Ramesh Agarwal (IBM Research, Yorktown) thought COMMS3 was a very good idea. Roger H. reviewed tables of results and indicated they are somewhat out-of-date. Tony H. then reviewed the Kernels chapter and noted the range of applications that the kernels span. Some are taken from the NAS Kernels and others from the Genesis Benchmarks. Matrix benchmarks include the dense matrix multiply, matrix transpose, dense LU factorization with partial pivoting, QR decomposition, and matrix tridiagonalization. Ramesh A. (IBM) asked about the data structure for the transpose. Jack D. indicated that a block cyclic structure is used. Ian Duff (RAL, England) felt that vendors should be allowed to use their optimized library kernels. David B. asked why both transpose and matrix multiplication are included, but several attendees thought the transpose was needed. Regarding the FFTs (1-D and 3-D), David B. pointed out that the 1-D is really a convolution and that the size would be in the millions (long convolution with 2 long sequences via FFT).
Fred Gustavson (IBM Research, Yorktown) asked about band matrices - why leave them out? David B. pointed out that band matrices are used in the NAS parallel benchmarks. Tony H. then reviewed the PDE kernels (Poisson's equation via SOR and multigrid) and other kernels (embarrassingly parallel (NAS), large integer sort from David Bailey, and an I/O paper and pencil benchmark). A comment was made about NAS benchmarks not leaving the country and David B. replied that all clearances have been given. Horst Simon (NASA Ames) asked if CG from NAS would be dropped. Jack D. indicated it is in Compact Applications. David Walker then reviewed the Compact Applications chapter. He pointed out that CA's exhibit behaviors that cannot be found with kernels, and reviewed the specifications for CA's. Acceptable languages include: F77 and C plus message passing, Fortran-90, and HPF. One attendee commented that message passing can be machine-dependent. David W. indicated that the mp libraries should be portable. No assembly code is allowed in the baseline versions. Fred Gustavson (IBM) asked if library routines could be included. David W. suggested that standard intrinsics would be ok. Mohammad Z. asked about the use of smart compilers. David B. said that is fine. Fiona Sim (IBM, Kingston) asked if vendors could use their own mp libraries. It was agreed that vendors should do a portable version first as baseline. Some attendees asked whether a vendor had to supply all 3 forms of a CA. David W. indicated that at least 1 would be sufficient. Tony H. indicated that we may want optimized workstation versions to compare with. CA's submitted thus far include ARCO, NERC (3D Hydro), Nat. Envir. Research Council (NERC), Spectral transform shallow water model (ORNL), and a QCD (Lattice Gauge) code (Univ. of Edinburgh). Fred Gustavson (IBM) asked if any simplex (linear programming) code should be added to the CA suite (applications such as transportation scheduling and optimization).
Bodo Parady (Sun) indicated an NYU code for the simplex method might be added. Other attendees noted that the use of complex arithmetic and ODE's was lacking. David W. finished by providing an email address and instructions for submitting candidate CA codes. Tom H. then reviewed the compiler benchmarks. No questions were raised about these particular benchmarks. Jack D. then reviewed the history of Netlib and the inclusion of benchmarks and the results themselves. He described the Xnetlib system and PDS. He provided sample screens of Xnetlib and noted that the PARKBENCH codes and results would be included in PDS. Jack indicated which codes are currently in Netlib and those which would be appearing soon. Small, medium, and large datasets would also be included. Rudi E. (CSRD) asked if version control had been considered. Jack D. said the index file will have a version number and source files will have timestamps. The final form of how to deal with valid results has not been decided. Rudi E. (CSRD) suggested that a steering committee should approve the results. A few attendees asked how vendors will approve results (contact person). Ramesh Agarwal (IBM Research, Yorktown) asked if vendors should provide optimized code. A description of the optimizations (Diary) was deemed sufficient. Horst Simon (NASA Ames) asked about a quality control mechanism. David B. and Jack D. indicated we should be trusting but erroneous results do come in. More discussion is needed. Bodo P. (Sun) asked if the best or worst results should be reported. Ed K. indicated that getting baseline codes is Intel's only problem. Ian Duff wanted to know how vendors are reacting to PARKBENCH. Robert Numrich and Kevin Lind (CRI) thought there would be no problem. One other attendee was concerned about the lack of throughput or multi-user benchmarks. The review of the first PARKBENCH report was concluded, followed by an acceptance of the minutes from the previous meeting.
Roger then went over the PARKBENCH objectives for the next year, which would be reported on at Supercomputing'94, Washington DC. These were: (1) Updating of the PARKBENCH report (with some CAs). (2) Second release of benchmark codes. (3) Collection of results into the PDS database (3-monthly update). (4) Production and demonstration of Southampton Interactive Graphical front-end to PDS. Ramesh Agarwal (IBM Research, Yorktown) asked if we should limit the number of processors to be a power of 2. Roger H. suggested that between 5 and 10 values for the number of processors be submitted for each problem size, so that the variation of performance with number of processors could be adequately seen. Phil T. (NEC) reminded attendees that 10 gigabytes was acceptable for "large" dataset size. Concerning the format of future meetings, it was agreed that travel should be reduced. Ian Duff (RAL) felt a Quarterly Report might be issued. Roger H. asked if email was sufficient and most attendees felt it wasn't since no deadlines could be enforced. Jack D. announced the upcoming March 29, 1994 "PARKBENCH/SPEC" meeting in Knoxville. Roger H. asked for a show of hands on who was going to the HPCN (high perf. computing and networks) meeting in Munich in 1994. Most of the attendees indicated they were not going. It was agreed to hold the next PARKBENCH meeting the day before the SPEC meeting in March 1994 (Knoxville, TN), as follows: March 28: PARKBENCH; March 29, 30, 31: SPEC. Only 3 people in the room would be going to Knoxville for the SPEC meeting. Roger asked that an August 1994 meeting be a teleconferenced meeting. Fred Gustavson and Joanne Martin will check with IBM about providing facilities. Roger H. asked where the report should be published. He suggested that the journal "Scientific Programming" be considered since the HPF report will be published there as well. This journal by Wiley was considered quite favorably by the attendees and Roger H.
will pursue Wiley about the report and yearly updates. David B. suggested that the current focus be on publishing the report first and then looking at the data releases later. Roger H. adjourned the meeting at approximately 6:30 pm PST. ------------------------------------------------------------------------ List of Attendees (Nov. 17, 1993) Neil B. MacDonald Edinburgh Parallel Computing Centre Ken Hewick Northeast Parallel Arch. Center Ramesh Agarwal IBM Research, Yorktown Fred Gustavson IBM Research, Yorktown Mohammad Zubair IBM Research, Yorktown Aad van der Steen Academic Computing Centre, Utrecht Kevin R. Lind Cray Research Inc. Robert Numrich Cray Research Inc. Tony Hey Univ. of Southampton Daniel Frye IBM Kingston James Stacey Collaboratory Inc. Wayne Pfeiffer SDSC Luis Almeida Paravector Ken Miura Fujitsu America Bodo Parady Sun Microsystems Mark R. Smith Cornell Theory Center/IBM Wei-Hwan Chiang IBM Kingston Jack Cole USARL David Mackay Intel SSD Robert Eller Arithmetika Clive Baillie Univ. of Colorado Tom Eidson NASA LARC Mike Humphrey SGI Linton Ward IBM Fiona Sim IBM Kingston Chuck Mosher ARCO Siamak Hassanzadeh Fujitsu America Ed Kushner Intel SSD Ian Duff RAL, England Stefano Foresti Univ. of Utah Steven Hotovy Cornell Theory Center Wolfgang Gentzsch GENIAS Horst Simon NASA Ames Nahil Sobh Saudi Aramco Roger Hockney Univ. of Southampton Michael Berry Univ. of Tennessee John Larson Univ. of Illinois Rudi Eigenmann Univ. of Illinois David Walker ORNL Jack Dongarra Univ.
of Tennessee/ORNL Xian-He Sun ICASE Tom Haupt Syracuse University From owner-parkbench-comm@CS.UTK.EDU Mon Dec 6 17:19:20 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id RAA06409; Mon, 6 Dec 1993 17:19:19 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id RAA22535; Mon, 6 Dec 1993 17:18:40 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Mon, 6 Dec 1993 17:18:38 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id RAA22527; Mon, 6 Dec 1993 17:18:27 -0500 Via: uk.ac.southampton; Mon, 6 Dec 1993 16:38:09 +0000 From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Mon, 6 Dec 93 16:00:25 GMT Date: Mon, 6 Dec 93 16:09:48 GMT Message-Id: <3315.9312061609@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Report Publication SCHEDULE FOR PUBLICATION OF PARKBENCH REPORT -------------------------------------------- It seems appropriate that I, as chairman, take responsibility for assembling the final text for publication of the first PARKBENCH REPORT, and seeing it through to publication. The version made public at Supercomputing'93 has been accepted for publication in Wiley's Journal "Scientific Programming" by the editor Ron Perrott after we respond to his editorial comments. I am therefore sending by ordinary snail-mail to the committee member responsible for each chapter, his chapter with the editor's original annotations, with a photocopy of the rest of the report and his covering letter. To ensure rapid publication I shall follow the following schedule: 1 Jan 1994 - Last day for receipt by RWH of edited chapters from subcommittee leaders. Please send the text to me by e-mail but please do not include any screen dumps as e-mail. They clog the mail system and should be ftp'd . If there are no changes, simply omit them as I have them already. 
Any chapters not received by this date will be edited by RWH as best he can. My e-mail is: rwh@pac.soton.ac.uk 15 Jan 1994 - Last day for submission of revised manuscript by RWH to Ron Perrott. Submission will be in Latex form. Publication can be expected 3-4 months after this, i.e., say May 1994 (spring or summer edition, 1994). To give you an idea of what to expect, look at Scientific Programming vol 2 numbers 1 & 2, which is a special issue devoted to the HPF report. The editor is suggesting that we refer to our present Chapters as Sections instead, as has been done with the HPF report. I am against this, because we have always called them chapters, and they are substantial enough to be so named. Views please. There are some issues to be resolved about the copyright agreement that I presumably will sign on behalf of the committee. As I see it, the important thing is that the publisher agrees that the material is in the public domain and may be freely reproduced without payment by anyone, provided attribution is made to the Parkbench committee and to Scientific Programming. This seems to have been achieved by HPF (see second page of above quoted issue of the Journal). I shall try to follow this pattern and keep committee members posted. Also I wish explicitly to add that the material is publicly available electronically, from the UTK Netlib server. Whatever is agreed with the publisher will be put to the committee by e-mail before I sign anything. If you have any comments on these issues, now is the time to send them to pbwg-comm, so that I may take them into consideration.
Best regards Roger Hockney (Chairman Parkbench Committee) From owner-parkbench-comm@CS.UTK.EDU Wed Dec 15 00:25:06 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id AAA06867; Wed, 15 Dec 1993 00:25:06 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id AAA08667; Wed, 15 Dec 1993 00:25:58 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 15 Dec 1993 00:25:55 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from cs.dal.ca by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id AAA08654; Wed, 15 Dec 1993 00:25:51 -0500 Received: by cs.dal.ca id <46616>; Wed, 15 Dec 1993 01:25:32 -0400 From: Thomas Trappenberg To: pbwg-comm@CS.UTK.EDU Subject: parkbench informations Message-Id: <93Dec15.012532ast.46616@cs.dal.ca> Date: Wed, 15 Dec 1993 01:25:22 -0400 Dear Colleagues, I learned of your PARKBENCH initiative from the article in PARALLEL COMPUTING RESEARCH. I'm very interested in parallel benchmarking and do think that this is a necessary and serious scientific discipline. Where could I get a copy of your report and where are the benchmark programs available? I would be grateful for this information.
With best regards, Thomas Trappenberg From owner-parkbench-comm@CS.UTK.EDU Wed Dec 15 05:59:17 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id FAA09275; Wed, 15 Dec 1993 05:59:16 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id GAA09370; Wed, 15 Dec 1993 06:00:46 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 15 Dec 1993 06:00:45 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id GAA09353; Wed, 15 Dec 1993 06:00:41 -0500 Via: uk.ac.southampton; Wed, 15 Dec 1993 10:16:40 +0000 Via: brewery.ecs.soton.ac.uk; Wed, 15 Dec 93 10:04:12 GMT From: Vladimir Getov Received: from beluga.ecs.soton.ac.uk by brewery.ecs.soton.ac.uk; Wed, 15 Dec 93 10:16:52 GMT Date: Wed, 15 Dec 93 10:16:52 GMT Message-Id: <20588.9312151016@beluga.ecs.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: CG kernel from NPB Hi all, We propose the inclusion of the CG-kernel from the NAS Parallel Benchmarks into the Parkbench suite. In our opinion, this benchmark would improve the quality and the balance of the Parkbench kernels. Currently the Parkbench kernels include a set of benchmarks for dense matrix computations (subsection 4.2.1). The CG-kernel is typical of unstructured grid computations and employs sparse matrix-vector multiplication. Therefore it would probably fit best into subsection 4.2.4, "Others", of the Parkbench report. Are there any other suggestions, comments or proposals about the Kernels chapter?
Regards, Tony Hey and Vladimir Getov From owner-parkbench-comm@CS.UTK.EDU Wed Dec 15 08:11:19 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id IAA09782; Wed, 15 Dec 1993 08:11:19 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id IAA17005; Wed, 15 Dec 1993 08:13:01 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 15 Dec 1993 08:13:00 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from thud.cs.utk.edu by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id IAA16998; Wed, 15 Dec 1993 08:12:59 -0500 From: Jack Dongarra Received: by thud.cs.utk.edu (5.61+IDA+UTK-930922/2.7c-UTK) id AA08351; Wed, 15 Dec 93 08:12:50 -0500 Date: Wed, 15 Dec 93 08:12:50 -0500 Message-Id: <9312151312.AA08351@thud.cs.utk.edu> To: trappenb@cs.dal.ca, pbwg-comm@CS.UTK.EDU Subject: Re: parkbench informations In-Reply-To: Mail from 'Thomas Trappenberg ' dated: Wed, 15 Dec 1993 01:25:22 -0400 > Where could I get a copy of your report > and where are the benchmark programs available? You can retrieve an up-to-date copy of the ParkBench material from netlib. 1) from any machine on the internet type: rcp anon@netlib2.cs.utk.edu:parkbench/index index 2) anonymous ftp to netlib2.cs.utk.edu cd parkbench get index quit 3) sending email to netlib@ornl.gov and in the message type: send index from parkbench 4) use Xnetlib and click "library", click "parkbench", click "parkbench/index", click "download", click "Get Files Now". (Xnetlib is an X-window interface to the netlib software based on a client-server model. The software can be found in netlib.) A report describing the activity can be found in file parkbench.ps. 
Hope this helps, Regards, Jack From owner-parkbench-comm@CS.UTK.EDU Wed Dec 15 13:05:14 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id NAA13407; Wed, 15 Dec 1993 13:05:14 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id NAA10669; Wed, 15 Dec 1993 13:05:59 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 15 Dec 1993 13:05:58 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from wk45.nas.nasa.gov by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id NAA10656; Wed, 15 Dec 1993 13:05:56 -0500 Received: by wk45.nas.nasa.gov (5.67-NAS.6/NAS.3-sgi) id AA13024; Wed, 15 Dec 93 10:05:53 -0800 Date: Wed, 15 Dec 93 10:05:53 -0800 From: simon@nas.nasa.gov (Horst D. Simon) Message-Id: <9312151805.AA13024@wk45.nas.nasa.gov> To: pbwg-comm@CS.UTK.EDU Subject: CG kernel from NPB I support the proposal by Tony Hey and Vladimir Getov for the inclusion of the NAS Parallel Benchmarks CG-kernel into the Parkbench suite. They have given some good arguments for doing so. These were the same reasons why CG was included in the NAS Parallel Benchmarks. I brought this up at the meeting in Portland, and was told that CG would be in the compact applications. I checked, but it is not there. Presumably I would have to submit it. But it would not make sense to put it into the compact applications section, since it is a kernel. Including the CG benchmark would also result in having all NAS Parallel Benchmarks kernels in PARKBENCH, which would be good for consistency. Horst.
From owner-parkbench-comm@CS.UTK.EDU Tue Dec 21 10:51:27 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id KAA21879; Tue, 21 Dec 1993 10:51:26 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id KAA16349; Tue, 21 Dec 1993 10:52:27 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Tue, 21 Dec 1993 10:52:26 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from spica.npac.syr.edu by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id KAA16336; Tue, 21 Dec 1993 10:52:14 -0500 From: Received: from aldebaran.npac.syr.edu by spica.npac.syr.edu (4.1/I-1.98K) id AA02741; Tue, 21 Dec 93 10:52:01 EST Message-Id: <9312211551.AA23732@aldebaran.npac.syr.edu> Received: from localhost.syr.edu by aldebaran.npac.syr.edu (4.1/N-0.12) id AA23732; Tue, 21 Dec 93 10:51:50 EST To: R.Hockney@pac.soton.ac.uk Cc: pbwg-comm@CS.UTK.EDU, haupt@npac.syr.edu Subject: Re: Report Publication In-Reply-To: Your message of "Mon, 06 Dec 93 16:09:48 GMT." <3315.9312061609@calvados.pac.soton.ac.uk> Date: Tue, 21 Dec 93 10:51:49 EST > SCHEDULE FOR PUBLICATION OF PARKBENCH REPORT > -------------------------------------------- > >1 Jan 1994 - Last day for receipt by RWH of edited chapters from subcommittee leaders. Roger, Here is an updated version of chapter 6. I tried to take into account all comments I got from R.H.Perrott. However, I have not modified the following: - phrases like "We will ...". Please modify them to conform to the rest of the text. - notions of "node compilation" and "a node compiler". For me they are OK. Let me know if you think I should change them. - runtime (as opposed to "run time" or "run-time"). The word "runtime" is used by the Parallel Compiler Runtime Consortium, and as far as I can tell, it is widely accepted by the (parallel compiler + runtime) community. [Of course, I do not think that it is important enough to make an issue of.]
Tom
===============================================================================
% citation: hpff : High Performance Fortran Language Specification,
% High Performance Fortran Forum, Scientific Programming,
% Vol 2, Nos 1 and 2, Summer '93.
%
%------------------------------------------------------------------------
% PARKBENCH REPORT (fourth draft), File: compil4.tex
%------------------------------------------------------------------------
%file compil4.tex
%compiled by Tom Haupt for compiler benchmarks subcommittee
\chapter{HPF Compiler Benchmarks\protect \footnote{assembled by Tom Haupt for Compiler Benchmarks subcommittee}}
\section{Objectives}
For most users, the performance of codes generated by a compiler is what actually matters. This can be inferred from running the HPF versions of the PARKBENCH codes described in chapters 4 and 5. For HPF compiler developers and implementors, however, {\it an additional} benchmark suite may be very useful: a benchmark suite that can evaluate specific HPF compilation phases and the compiler runtime support. For that purpose, the relevant metric is the ratio of execution times of compiler generated to hand coded programs as a function of the problem size and number of processors engaged in the computation. The compilation process can be logically divided into several phases, and each of them influences the efficiency of the resulting code. The initial stage is parsing of the source code, which results in an internal representation of the code. It is followed by compiler transformations, like data distribution, loop transformations, computation distribution, communication detection, sequentialization, insertion of calls to a runtime support, and others. This we will call the HPF-specific phase of compilation. The compilation is concluded by the code generation phase.
For portable compilers that output Fortran 77 + message passing code, the node compilation is factored out and the efficiency of the node compiler can be evaluated separately. This benchmark suite addresses the HPF-specific phase only. Thus, it is well suited for performance evaluation of both translators (HPF to Fortran 77 + message passing) and genuine HPF compilers. The parsing phase is an element of conventional compiler technology and is not of interest in this context. The code generation phase involves optimization techniques developed for sequential compilers (in particular, Fortran 90 compilers) as well as micro-grain parallelism or vectorization. The object codes for specific platforms may be strongly architecture dependent (e.g., may be very different for processors with vector capabilities than for those without them). Evaluating the performance of these aspects requires different techniques from those proposed here. It is worth noting that the HPF-phase strongly affects the possibility of optimization of the node codes. For example, insertion of calls to the communication library may prevent the node compiler from performing many standard optimizations without expensive interprocedural analysis. Therefore, the capability of an HPF compiler to exploit opportunities for optimization at the HPF level and to generate output code in such a way that it can be further optimized by the node compiler is an important element of the evaluation of HPF compilers. Nevertheless, evaluating the HPF-phase separately is very valuable, since hand coded programs face the same problems. We will address these issues in future releases of the benchmark suite. Compilers for massively parallel and distributed systems are still the object of research and laboratory testing rather than commercial products. Parallel compiler technology, as well as the methods of evaluating it, is not mature yet.
Nevertheless, the advent of the HPF standard gives an opportunity to develop systematic benchmarking techniques. The current definition of HPF \cite{hpff} cannot be regarded as an ultimate solution for parallel computing. Its limitations are well known, and many researchers are working on extensions to HPF to address a broader class of real life, commercial and scientific applications. We expect new language features to be added to the HPF definition in future versions of HPF, and we will extend the benchmark suite accordingly. On the other hand, new parallel languages based on languages other than Fortran, notably C++, are becoming more and more popular. Since parallelism is inherent in a problem and not in its representation, we anticipate many commonalities in the parallel languages and corresponding compiler technologies, notably sharing the runtime support. Therefore, we decided to address this benchmark suite to those aspects of the compilation process that are inherent to parallel processing in general, rather than testing syntactic details of HPF. \section{Low Level HPF Compiler Benchmarks} \subsection{Overview} The benchmark suite comprises several simple, synthetic applications which test several aspects of HPF compilation. The current version of the suite addresses the basic features of HPF, and it is designed to measure the performance of early implementations of the compiler. The applications concentrate on testing the parallel implementation of explicitly parallel statements, i.e., array assignments, FORALL statements, INDEPENDENT DO loops, and intrinsic functions with different mapping directives. In addition, the low level compiler benchmarks address the problem of passing distributed arrays as arguments to subprograms. The language features not included in the HPF subset are not addressed in this release of the suite. The next releases will contain more kernels that will address all features of HPF, and they will also be sensitive to advanced compiler transformations.
The codes included in this suite are either adopted from existing benchmark suites (the NAS suite \cite{NAS}, the Livermore Loops \cite{Liv}, and the Purdue Set \cite{Rice}) or developed at Syracuse University. \subsection{FORALL statement - kernel FL} The FORALL statement provides a convenient syntax for simultaneous assignments to large groups of array elements. Such assignments lie at the heart of the data parallel computations that HPF is designed to express. The idea behind introducing FORALL in HPF is to generalize Fortran 90 array assignments to make expressing parallelism easier. Kernel FL provides several examples of FORALL statements that are difficult or inconvenient to write using Fortran 90 syntax. \subsection{Explicit template - kernel TL} Parallel implementation of array assignments, including FORALL statements, is a central issue for an early HPF compiler. Given a data distribution, the compiler distributes computation over the available processors. An efficient compiler achieves an optimal load balance with minimum interprocessor communication. Sometimes programmers may help the compiler to minimize interprocessor communication by a suitable data mapping, in particular by defining a relative alignment of different data objects. This may be achieved by aligning the data objects with an explicitly declared template. Kernel TL provides an example of this kind. \subsection{Communication detection in array assignments - kernels AA, SH, ST, and IR} Once the data and iteration space is distributed, the next step that strongly influences the efficiency of the resulting code is communication detection and the generation of code to execute data movement. In general, the off-processor data elements must be gathered before execution of an array assignment, and the results are to be scattered to destination processors after the assignment is completed.
In other words, some of the array assignments may require a preprocessing phase to determine which off-processor data elements are needed and to execute the gather operation. Similarly, they may require postprocessing (scatter). Many different techniques may be used to optimize these operations. To achieve high efficiency, it may be very important that the compiler is able to recognize structured communication patterns, like shift, multicast, etc. Kernels AA, SH, and ST introduce different structured communication patterns, and kernel IR is an example of an array assignment that requires unstructured communication (because of indirections). \subsection{INDEPENDENT assertion - kernel EP} In addition to array assignments and FORALL statements, parallelism may be expressed by using INDEPENDENT assertions. The EP kernel tests the performance of the INDEPENDENT DO construct with NEW variables. \subsection{Non-elemental intrinsic functions - kernel RD} Fortran 90 intrinsics and HPF functions offer yet another way to express parallelism. Kernel RD tests the implementation of several reduction functions. \subsection{Passing distributed arrays as subprogram arguments - kernels AS, IT, IM and EI} The last group of kernels demonstrates passing distributed arrays as subprogram arguments. These kernels represent three typical cases: \begin{enumerate} \item a known mapping of the actual argument is to be preserved by the dummy argument (AS). \item mapping of the dummy argument is to be inherited from the actual argument, thus no remapping is necessary. The mapping is known at compile time (IT). \item mapping of the dummy argument is to be identical to that of the actual argument, but the mapping is not known at compile time (IM). \end{enumerate} \section{Summary} The synthetic compiler benchmark suite described here is an addition to the benchmark kernels and applications described in chapters 4 and 5. It is not meant as a tool to evaluate the overall performance of the compiler generated codes.
It has been introduced as an aid for compiler developers and implementors to address selected aspects of the HPF compilation process. In the current version, the suite does not comprise a comprehensive sample of HPF codes. Actually, it addresses only the HPF subset. Hopefully, this way, we will contribute to the establishment of a systematic compiler benchmarking methodology. We intend to continue our effort to develop a complete, fully representative HPF benchmark suite. % ---------------------------------------------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Tue Dec 21 18:14:15 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id SAA24795; Tue, 21 Dec 1993 18:14:14 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id SAA17476; Tue, 21 Dec 1993 18:15:52 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Tue, 21 Dec 1993 18:15:50 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from intel2.CCS.ORNL.GOV by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id SAA17442; Tue, 21 Dec 1993 18:15:35 -0500 Received: by intel2.CCS.ORNL.GOV (4.1/1.34) id AA07380; Tue, 21 Dec 93 18:15:14 EST From: mackay@intel2.CCS.ORNL.GOV (David Mackay) Message-Id: <9312212315.AA07380@intel2.CCS.ORNL.GOV> Subject: Include NAS parallel CG benchmark and limit number of benchmarks To: pbwg-comm@CS.UTK.EDU Date: Tue, 21 Dec 93 18:15:11 EST X-Mailer: ELM [version 2.3 PL11] We are in favor of the proposal to include the NAS Parallel Benchmark CG-kernel in the Parkbench suite. I know in the Parkbench meetings we had discussed including a CG test. Up until now Parkbench has been concentrating on collecting parallel benchmarks. We feel that now is the time to consider the size and number of benchmarks we include in Parkbench. We suggest we should begin to limit the number of kernels now.
So we propose the inclusion of the NAS Parallel Benchmark CG-kernel be in substitution for the NAS Parallel Benchmark Is-kernel. -David Mackay Ed Kushner From owner-parkbench-comm@CS.UTK.EDU Mon Dec 27 15:03:39 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id PAA02611; Mon, 27 Dec 1993 15:03:38 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id PAA14203; Mon, 27 Dec 1993 15:01:32 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Mon, 27 Dec 1993 15:01:30 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id OAA14053; Mon, 27 Dec 1993 14:59:52 -0500 Via: uk.ac.southampton; Mon, 27 Dec 1993 19:54:51 +0000 Received: from ecs.soton.ac.uk (root@localhost) by mail.soton.ac.uk (8.6.4/2.6) with NIFTP id TAA15149; Mon, 27 Dec 1993 19:52:57 GMT Via: brewery.ecs.soton.ac.uk; Mon, 27 Dec 93 19:44:22 GMT From: Vladimir Getov Received: from beluga.ecs.soton.ac.uk by brewery.ecs.soton.ac.uk; Mon, 27 Dec 93 19:57:08 GMT Date: Mon, 27 Dec 93 19:57:11 GMT Message-Id: <28029.9312271957@beluga.ecs.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU, mackay@intel2.CCS.ORNL.GOV Subject: Re: Include NAS parallel CG benchmark and limit number of benchmarks > > We are in favor of the proposal to include the NAS Parallel Benchmark > CG-kernel in the Parkbench suite. I know in the Parkbench meetings > we had discussed including a CG test. > > Up until now Parkbench has been concentrating on collecting parallel > benchmarks. We feel that now is the time to consider the size and number of > benchmarks we include in Parkbench. We suggest we should begin to limit the > number of kernels now. So we propose the inclusion of the NAS Parallel > Benchmark CG-kernel be in substitution for the NAS Parallel Benchmark > Is-kernel. > > > -David Mackay > Ed Kushner > > Thank you for supporting the inclusion of the CG-kernel in the Parkbench suite.
There is no doubt that it is a useful test and should be there. We are not certain, however, that the IS-kernel has to be dropped instead. The current Parkbench report says that "Although sorting has traditionally been thought of as of importance primarily in non-scientific computing, this operation is increasingly important in advanced scientific applications. In particle method fluid simulations, for example, sorting is the dominant cost." Can you give a justification for your proposal? In our opinion, the decision re: IS-kernel should be taken at the next Parkbench meeting. Tony Hey and Vladimir Getov From owner-parkbench-comm@CS.UTK.EDU Thu Dec 30 03:53:18 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id DAA11601; Thu, 30 Dec 1993 03:53:17 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id DAA27846; Thu, 30 Dec 1993 03:51:45 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 30 Dec 1993 03:51:44 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id DAA27836; Thu, 30 Dec 1993 03:51:40 -0500 Via: uk.ac.southampton; Thu, 30 Dec 1993 08:50:36 +0000 Received: from ecs.soton.ac.uk (root@localhost) by mail.soton.ac.uk (8.6.4/2.6) with NIFTP id IAA00707 for pbwg-comm%cs.utk.edu@uk.ac.nsfnet-relay; Thu, 30 Dec 1993 08:48:35 GMT From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Wed, 29 Dec 93 17:42:16 GMT Date: Wed, 29 Dec 93 17:51:42 GMT Message-Id: <903.9312291751@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: PARKBENCH Report REPORT PUBLICATION ------------------ (1) RESPONSE TO EDITOR'S COMMENTS --------------------------------- A Merry Xmas and Happy New Year to you all. This is just a reminder to all subcommittee leaders (db, rwh, ajgh, dw, th) that the deadline to get your revised chapters to me is nigh (i.e. 1st Jan. 1994).
After that I shall do the best I can myself, but it may not be quite what you want. Many congratulations to Tom Haupt who has already returned his revised chapter. I have undertaken, then, to do any overall editing, and submit the revised manuscript to "Scientific Programming" by 15th January. Please send revised texts to my private e-mail, in order to avoid overloading pbwg-comm with almost, but not quite, final texts. e-mail for revised texts: rwh@pac.soton.ac.uk When I have the revised text ready for SP, I shall send it to them and also to Mike Berry at UT, so that he can replace the old CS-report version and make it available over Netlib. Other committee members will wish to know that the report is accepted for publication, after minor revision to take account of the editor's comments. The editor's comments have been sent to all subcommittee leaders for action. The editor raised the following points that you may wish to comment on: (a) He wishes to call our "Chapters", "Sections", as was done when SP published the HPF report. Unless pushed by others, and given a good reason, I will stick out to keep them called Chapters. (b) We have used "we" and sometimes "one" in the text. I shall convert all to "we", which is somehow more friendly. (c) He wishes us to try to use a consistent description of parallel machines. Do we call them multi-processors, nodes or processors, MPP, replicated computers, message-passing computers, distributed-memory computers, etc., etc. I shall do something about this next week, so input from other members now over pbwg-comm would be welcome. (d) He thought that the word "Public" could be removed from the title "Public International Benchmarks for Parallel Computers". I think we should retain the word in our title, as it emphasises the importance of our 4th aim: that the benchmarks be in the public domain. 
The Southampton group is also planning an additional appendix of "Selected Results", which will give about five log/log graphs with results from e.g. the COMMS1 low-level, and the NAS parallel benchmark kernels. These would illustrate the type of graphical front end that is envisaged. (2) COPYRIGHT ------------- Jack Dongarra has suggested that the University of Tennessee play the same role for the PARKBENCH report as Rice University has for the HPF report that appeared in vol 2 Nos 1&2 (1993) of Scientific Programming (SP). That is to say that the University of Tennessee (UT) would keep the copyright (not Wiley who publish the journal), but that the material be dedicated to the public domain by UT with the consent of John Wiley (see first page of above-mentioned issue of Scientific Programming). Reproductions can be made without payment provided attribution is made to UT and SP, and the report would be made freely available electronically by UT and possibly other sites. The above suggestion sounds very satisfactory to me, and we are lucky that Rice University has already set a precedent that we can happily follow. I propose that the PARKBENCH committee hand over to UT through Jack Dongarra the formal negotiations with John Wiley & Sons over copyright of the PARKBENCH report, to be conducted along the lines described in the last paragraph. This seems appropriate since this report has already been published by UT as a CS report, and technically UT probably holds the copyright now. If no serious objections are made over e-mail by 15th January 1994, this will become the policy of the committee, and Jack will have authority to proceed with negotiations with Wiley on behalf of the PARKBENCH committee. I will remain responsible however for supplying the modified text, which will be made available over the network. I would ask Jack to correct immediately over pbwg-comm any errors or omissions in my explanation above of the proposed copyright agreement. 
Would other committee members also make their opinions known over pbwg-comm whether they support or not the above procedure. Best Regards Roger Hockney (PARKBENCH Chairman) From owner-parkbench-comm@CS.UTK.EDU Thu Dec 30 15:46:39 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id PAA14856; Thu, 30 Dec 1993 15:46:39 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id PAA15634; Thu, 30 Dec 1993 15:46:09 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 30 Dec 1993 15:46:08 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from berry.cs.utk.edu by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id PAA15628; Thu, 30 Dec 1993 15:46:06 -0500 Received: from LOCALHOST by berry.cs.utk.edu with SMTP (5.61+IDA+UTK-930922/2.7c-UTK) id AA24305; Thu, 30 Dec 93 15:46:04 -0500 Message-Id: <9312302046.AA24305@berry.cs.utk.edu> To: R.Hockney@pac.soton.ac.uk Cc: pbwg-comm@CS.UTK.EDU Subject: Re: PARKBENCH Report In-Reply-To: Your message of "Wed, 29 Dec 1993 17:51:42 GMT." Date: Thu, 30 Dec 1993 15:46:02 -0500 From: "Michael W. Berry" > in my explanation above of the proposed copyright agreement. Would other > committee members also make their opinions known over pbwg-comm whether they > support or not the above procedure. I support the motion by Roger and Jack regarding the copyright agreement. I think this procedure will be the most trouble-free in getting future reports published. Regards, Mike > --- Michael W. Berry ___-___ o==o====== . . . . . 
Ayres 114 =========== ||// Department of \ \ |//__ Computer Science #_______/ berry@cs.utk.edu University of Tennessee (615) 974-3838 [OFF] Knoxville, TN 37996-1301 (615) 974-4404 [FAX] From owner-parkbench-comm@CS.UTK.EDU Fri Dec 31 10:32:19 1993 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id KAA16310; Fri, 31 Dec 1993 10:32:19 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id KAA25823; Fri, 31 Dec 1993 10:31:53 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 31 Dec 1993 10:31:52 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id KAA25816; Fri, 31 Dec 1993 10:31:48 -0500 Via: uk.ac.southampton; Fri, 31 Dec 1993 15:31:17 +0000 Received: from ecs.soton.ac.uk (root@localhost) by mail.soton.ac.uk (8.6.4/2.6) with NIFTP id PAA23688 for pbwg-comm%cs.utk.edu@uk.ac.nsfnet-relay; Fri, 31 Dec 1993 15:29:09 GMT From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Fri, 31 Dec 93 15:20:49 GMT Date: Fri, 31 Dec 93 15:30:15 GMT Message-Id: <1215.9312311530@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Response to Grassl RESPONSE TO CHARLES GRASSL -------------------------- >The lable multi-processors is accurate but too general. Nearly all >computers support multiple processing of some sort. I agree >Node and processors are often used incorrectly in an interchangable >fashion. A node is a network location and a processor is a CPU of some >sort. On a CRAY T3D there are two CPUs, we call them Processor >Elements (PEs), per node. The Intel Paragon has two i860 CPUs at each >node site, one for computation and one for communication. A very good point. Thanks for the clarification. I will use the distinction which is an important one to make. It will probably mean that most references to node (if they really mean processor at a node) will be replaced by processor. >MPP is a generic name. 
The US agency ARPA is trying to change the use >of the lable "MPP" to Scalable MuliProcessor (SMP). I object to the >latter label because there is not an adequate defition of "scalable". I agree, and indeed I object to both MPP and SMP, as we don't know what either Massive or Scalable is supposed to mean. Everything is scalable to some extent, and nothing is infinitely scalable. The only sensible question is by how much and in what manner does a particular design scale when more processors are added. A very complex question which is rarely properly answered, if indeed the answer is known at all. I will try to avoid these terms, but will probably have to mention them. >Message-passing computers would encompass all modern computers. Are you sure? Does the C90 pass messages, for example? It provides no message-passing constructs in its normal Fortran, and it is intended to be programmed with shared-memory language constructs which imply a "Direct Memory Access" (DMA) model of memory. Perhaps we need to be careful to distinguish hardware features (does the hardware actually send something that can be construed as a message), from the programming model (does the software provide SEND/RECV primitives). As yet I am not sure on this one?? >Distributed memory seems to be what we are benchmarking. The key >differentiator of MPP systems is their non-local memory access >patterns. The PARKBENCHmarks are actually either message-passing programs and/or HPF programs. They therefore can be used on and can test any computer whose software supports these models of programming. This statement is quite independent of the nature of the hardware which may or may not, at the hardware level, send messages or have data parallel (i.e. SIMD as opposed to MIMD) control. However, we have no benchmarks which use the shared memory or DMA model of parallel programming. Come to think of it, HPF is a special rather restricted case of DMA (there are certainly no messages). 
My view is that our benchmarks test the efficiency of certain programming models on a particular computer, and it should be clearly stated that this does not imply anything about the underlying hardware. Other views please. >I recommend using the label "distributed memory computers". I have noted this preference, but still have an open mind. Thanks for your contribution Roger Hockney From owner-parkbench-comm@CS.UTK.EDU Wed Jan 5 10:00:16 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id KAA21551; Wed, 5 Jan 1994 10:00:10 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id JAA18726; Wed, 5 Jan 1994 09:59:40 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 5 Jan 1994 09:59:39 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id JAA18719; Wed, 5 Jan 1994 09:59:36 -0500 Via: uk.ac.southampton; Wed, 5 Jan 1994 14:59:03 +0000 Received: from ecs.soton.ac.uk (root@localhost) by mail.soton.ac.uk (8.6.4/2.6) with NIFTP id OAA01035 for pbwg-comm%CS.UTK.EDU@uk.ac.nsfnet-relay; Wed, 5 Jan 1994 14:56:19 GMT Via: brewery.ecs.soton.ac.uk; Wed, 5 Jan 94 14:48:29 GMT From: Vladimir Getov Received: from beluga.ecs.soton.ac.uk by brewery.ecs.soton.ac.uk; Wed, 5 Jan 94 15:01:22 GMT Date: Wed, 5 Jan 94 15:01:24 GMT Message-Id: <3668.9401051501@beluga.ecs.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Re: Response to Grassl - node definition Node definition --------------- > > >Node and processors are often used incorrectly in an interchangable > >fashion. A node is a network location and a processor is a CPU of some > >sort. On a CRAY T3D there are two CPUs, we call them Processor > >Elements (PEs), per node. The Intel Paragon has two i860 CPUs at each > >node site, one for computation and one for communication. > I support this point, but think it still needs deeper clarification.
For instance, there are network locations in shared memory computers but we don't call them nodes. A node, therefore, is a network location specific to distributed memory computers, having its own chunk of local memory, a set of processing (computation) and communication elements, optional I/O facilities, etc. For example, one node in CM-2 consists of some local memory, 32 bit-slice processing elements, communication elements, a floating point accelerator, etc. > > A very good point. Thanks for the clarification. I will use the > distinction which is an important one to make. It will probably > mean that most references to node (if they really mean processor > at a node) will be replaced by processor. > There are also references to `processor' which really mean `node' in our report. In chapter 2, for example, we introduce a number of 2-dimensional performance metrics, one of the dimensions being the number of processors, p. In reality, however, when benchmarking distributed memory parallel computers, we use the number of nodes rather than the number of processors. I propose that the same variable (p) should be kept, clarifying that it means the number of processors for shared memory parallel computers and the number of nodes for distributed memory parallel computers. ... 
- Other stuff deleted - vsg > > Roger Hockney > Vladimir Getov From owner-parkbench-comm@CS.UTK.EDU Wed Jan 5 12:14:27 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id MAA22600; Wed, 5 Jan 1994 12:14:26 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id MAA28403; Wed, 5 Jan 1994 12:14:22 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 5 Jan 1994 12:14:21 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id MAA28395; Wed, 5 Jan 1994 12:14:18 -0500 Via: uk.ac.southampton; Wed, 5 Jan 1994 17:13:59 +0000 Received: from ecs.soton.ac.uk (root@localhost) by mail.soton.ac.uk (8.6.4/2.6) with NIFTP id RAA09224 for pbwg-comm%CS.UTK.EDU@uk.ac.nsfnet-relay; Wed, 5 Jan 1994 17:11:24 GMT Via: brewery.ecs.soton.ac.uk; Wed, 5 Jan 94 17:03:31 GMT From: Vladimir Getov Received: from beluga.ecs.soton.ac.uk by brewery.ecs.soton.ac.uk; Wed, 5 Jan 94 17:16:23 GMT Date: Wed, 5 Jan 94 17:16:27 GMT Message-Id: <3741.9401051716@beluga.ecs.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Re: Response to Grassl - scalability Scalability ----------- > > >MPP is a generic name. The US agency ARPA is trying to change the use > >of the lable "MPP" to Scalable MuliProcessor (SMP). I object to the > >latter label because there is not an adequate defition of "scalable". > > I agree, and indeed I object to both MPP and SMP, as we don't know > what either Massive or Scalable is supposed to mean. Everything is > scalable to some extent, and nothing is infinitely scalable. > The only sensible question is by how much and in what manner does > a particular design scale when more processors are added. A very > complex question which is rarely properly answered, if indeed the > answer is known at all. I will try to avoid these terms, but will > probably have to mention them. > I think we should mainly emphasize program (benchmark) scalability.
Generally speaking, increasing the problem size we also increase: (1) the memory requirements of the code; (2) the calculation requirements (the flop count) of the code; (3) the communication requirements (if any) of the code (the total number of messages and their length). I would propose that these three could be recognized by PARKBENCH as the main scalability characteristics of our benchmarks. They should have their estimated or calculated values (or a formula, if these values depend on the number of nodes/processors) for the three standard problem sizes (test problem size, moderate size problem, and grand challenge problem) in the individual README files (see the minutes of the 4th PARKBENCH meeting in August, 1993). I believe that this information would allow us to draw conclusions about the scalability of parallel computers after taking the benchmark measurements. ... - Other stuff deleted - vsg > Roger Hockney > Vladimir Getov From owner-parkbench-comm@CS.UTK.EDU Wed Jan 5 13:39:46 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id NAA23090; Wed, 5 Jan 1994 13:39:46 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id NAA04812; Wed, 5 Jan 1994 13:39:36 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 5 Jan 1994 13:39:34 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id NAA04804; Wed, 5 Jan 1994 13:39:32 -0500 Via: uk.ac.southampton; Wed, 5 Jan 1994 18:39:12 +0000 Received: from ecs.soton.ac.uk (root@localhost) by mail.soton.ac.uk (8.6.4/2.6) with NIFTP id SAA12795 for pbwg-comm%CS.UTK.EDU@uk.ac.nsfnet-relay; Wed, 5 Jan 1994 18:36:32 GMT Via: brewery.ecs.soton.ac.uk; Wed, 5 Jan 94 18:28:40 GMT From: Vladimir Getov Received: from beluga.ecs.soton.ac.uk by brewery.ecs.soton.ac.uk; Wed, 5 Jan 94 18:41:29 GMT Date: Wed, 5 Jan 94 18:41:31 GMT Message-Id: <3782.9401051841@beluga.ecs.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU, 
R.Hockney@pac.soton.ac.uk Subject: Re: Response to Grassl - what we are benchmarking What we are benchmarking ------------------------ > > >Distributed memory seems to be what we are benchmarking. The key > >differentiator of MPP systems is their non-local memory access > >patterns. > > The PARKBENCHmarks are actually either message-passing programs > and/or HPF programs. They therefore can be used on and can test > any computer whose software supports these models of programming. > This statement is quite independent of the nature of the hardware > which may or may not, at the hardware level, send messages or > have data parallel (i.e. SIMD as opposed to MIMD) control. > However, we have no benchmarks which use the shared memory or DMA > model of parallel programming. Come to think of it, HPF is a special > rather restricted case of DMA (there are certainly no messages). > My view is that our benchmarks test the efficiency of certain > prgramming models on a particular computer, and it should be clearly > stated that this does not imply anything about the underlying > hardware. Other views please. > I firmly support this position - what we see from our `high level language' point of view is the programming model. Currently, three programming models seem to be of major interest: (1) Conventional (sequential) - based usually on Fortran 77, this model assumes the implementation of automatic vectorisers and/or parallelisers during the compilation on a parallel computer; (2) Message-passing - it exists in two versions - Master/Slave and Hostless; (3) Data parallel model - HPF/Fortran 90. Ideally, the PARKBENCHmarks should have three versions for the above programming models. > > >I recommend using the label "distributed memory computers". > > I have noted this preference, but still have an open mind. > Why not use the label from the title of our report - "parallel computers"?
Virtually all possible kinds of parallel computers conform to this label - distributed memory parallel computers, shared memory parallel computers, message-passing parallel computers, etc.; even massively parallel computers sounds not too bad! > > Thanks for your contribution > > Roger Hockney > Vladimir Getov From owner-parkbench-comm@CS.UTK.EDU Tue Feb 15 13:45:20 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id NAA29141; Tue, 15 Feb 1994 13:45:19 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id NAA27372; Tue, 15 Feb 1994 13:43:56 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Tue, 15 Feb 1994 13:43:54 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id NAA27342; Tue, 15 Feb 1994 13:43:45 -0500 Via: uk.ac.southampton.relay; Tue, 15 Feb 1994 17:40:28 +0000 Received: from ecs.soton.ac.uk (root@localhost) by mail.soton.ac.uk (8.6.4/2.11) with NIFTP id RAA17867 for pbwg-comm%cs.utk.edu@uk.ac.nsfnet-relay; Tue, 15 Feb 1994 17:00:37 GMT From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Tue, 15 Feb 94 16:49:38 GMT Date: Tue, 15 Feb 94 16:47:00 GMT Message-Id: <9794.9402151647@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Publication Status Report PROGRESS REPORT ON PARKBENCH PUBLICATION ---------------------------------------- I am pleased to report, after some delays and much hard work, that the Parkbench report has been revised and was returned to the Editor of Scientific Programming (Ron Perrott) on 14th Feb. 1994. I will pass on to you the substance of his response when I receive it. Negotiations over copyright are now in the hands of Jack Dongarra, and he reports to me that Wiley will not do a deal with us comparable with the one they did with Rice over the HPF Report. This is unfortunate as I thought it would make things easy for us to follow this precedent, but it seems not to be possible. 
My view is that our non-negotiable base position should be that we retain the right to distribute electronically from UTK and Southampton (and perhaps other) servers, and that it might be enough for this copying to be regarded on the same basis as Xeroxing, and with the same restrictions. But some of you may not wish there to be any restrictions on electronic distribution. I have asked Jack to report on the current status of the negotiations. Comments, opinions and ideas to pbwg-comm please. Remember our next meeting is on 28th March in Knoxville which is the day before a 3-day SPEC meeting, also in Knoxville. Best regards, Roger Hockney (Chairman PARKBENCH Committee) From owner-parkbench-comm@CS.UTK.EDU Tue Feb 15 23:39:54 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id XAA03202; Tue, 15 Feb 1994 23:39:53 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id XAA09238; Tue, 15 Feb 1994 23:39:09 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Tue, 15 Feb 1994 23:39:08 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from berry.cs.utk.edu by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id XAA09232; Tue, 15 Feb 1994 23:39:07 -0500 Received: from LOCALHOST by berry.cs.utk.edu with SMTP (5.61+IDA+UTK-930922/2.7c-UTK) id AA22905; Tue, 15 Feb 94 23:39:07 -0500 Message-Id: <9402160439.AA22905@berry.cs.utk.edu> To: R.Hockney@pac.soton.ac.uk Cc: pbwg-comm@CS.UTK.EDU Subject: Re: Publication Status Report In-Reply-To: Your message of "Tue, 15 Feb 1994 16:47:00 GMT." <9794.9402151647@calvados.pac.soton.ac.uk> Date: Tue, 15 Feb 1994 23:38:45 -0500 From: "Michael W. Berry" > PROGRESS REPORT ON PARKBENCH PUBLICATION > ---------------------------------------- > > I am pleased to report, after some delays and much hard work, that the > Parkbench report has been revised and was returned to the Editor of > Scientific Programming (Ron Perrott) on 14th Feb. 1994.
I will pass on > to you the substance of his response when I receive it. > > Negotiations over copyright are now in the hands of Jack Dongarra, and > he reports to me that Wiley will not do a deal with us comparable with > the one they did with Rice over the HPF Report. This is unfortunate as > I thought it would make things easy for us to follow this precedent, but > it seems not to be possible. My view is that our non-negotiable base > position should be that we retain the right to distribute electronically > from UTK and Southampton (and perhaps other) servers, and that it might > be enough for this copying to be regarded on the same basis as Xeroxing, > and with the same restrictions. But some of you may not wish there to > be any restrictions on electronic distribution. I have asked Jack to > report on the current status of the negotiations. Comments, opinions > and ideas to pbwg-comm please. Roger, we can always maintain a version of the paper as a UT tech. report for public-domain downloading from Netlib. Perhaps there should be something like "PARKBENCH Working Reports", a series covering performance analysis, application descriptions, algorithm research, etc., using the suite of codes.
Regards, Mike From owner-parkbench-comm@CS.UTK.EDU Thu Mar 3 15:32:04 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id PAA08264; Thu, 3 Mar 1994 15:32:03 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id PAA25164; Thu, 3 Mar 1994 15:30:11 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 3 Mar 1994 15:30:09 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from dasher.cs.utk.edu by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id PAA25150; Thu, 3 Mar 1994 15:30:08 -0500 From: Jack Dongarra Received: by dasher.cs.utk.edu (5.61+IDA+UTK-930922/2.7c-UTK) id AA02049; Thu, 3 Mar 94 15:29:39 -0500 Date: Thu, 3 Mar 94 15:29:39 -0500 Message-Id: <9403032029.AA02049@dasher.cs.utk.edu> To: parkbench-comm@CS.UTK.EDU Subject: next parkbench meeting Dear Colleague, The Fifth Meeting of the ParkBench (Parallel Benchmark Working Group) will meet in Knoxville, Tennessee at the University of Tennessee on March 28th, 1994. The meeting site will be the Knoxville Downtown Hilton Hotel. We have made arrangements with the Hilton Hotel in Knoxville. Hilton Hotel 501 W. Church Street Knoxville, TN Phone: 615-523-2300 When making arrangements tell the hotel you are associated with the Parallel Benchmarking or Spec Meeting. The rate is $68.00/night. You can download a postscript map of the area by anonymous ftp'ing to netlib2.cs.utk.edu, cd shpcc94, get knx-downtown.ps. You can rent a car or get a cab from the airport to the hotel. We should plan to start at 9:00 am March 28th and finish about 5:00 pm. If you will be attending the meeting please send me email so we can better arrange for the meeting. The format of the meeting is: Monday 28th March 9:00 - 12.00 Full group meeting 12.00 - 1.30 Lunch 1.30 - 5.00 Full group meeting Tentative agenda for the meeting: 1. Minutes of last meeting 2. Reports and discussion from subgroups 3. Open discussion and agreement on further actions 4. 
Date and venue for next meeting The objectives for the group are: 1. To establish a comprehensive set of parallel benchmarks that is generally accepted by both users and vendors of parallel systems. 2. To provide a focus for parallel benchmark activities and avoid unnecessary duplication of effort and proliferation of benchmarks. 3. To set standards for benchmarking methodology and result-reporting together with a control database/repository for both the benchmarks and the results. The following mailing lists have been set up. parkbench-comm@cs.utk.edu Whole committee parkbench-lowlevel@cs.utk.edu Low level subcommittee parkbench-compactapp@cs.utk.edu Compact applications subcommittee parkbench-method@cs.utk.edu Methodology subcommittee parkbench-kernel@cs.utk.edu Kernel subcommittee All mail is being collected and can be retrieved by sending email to netlib@ornl.gov and in the mail message typing: send comm.archive from parkbench send lowlevel.archive from parkbench send compactapp.archive from parkbench send method.archive from parkbench send kernel.archive from parkbench send index from parkbench We have set up a mail reflector for correspondence; it is called parkbench-comm@cs.utk.edu. Mail to that address will be sent to the mailing list and also collected in netlib@ornl.gov. To retrieve the collected mail, send email to netlib@ornl.gov and in the mail message type: send comm.archive from parkbench The Spec people will be having their meeting at the Hilton after the Parkbench meeting and you are welcome to attend that meeting as well.
Jack Dongarra From owner-parkbench-comm@CS.UTK.EDU Fri Mar 11 15:18:23 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (8.6.4/2.8t-netlib) id PAA04210; Fri, 11 Mar 1994 15:18:23 -0500 Received: from localhost by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id PAA02038; Fri, 11 Mar 1994 15:16:51 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 11 Mar 1994 15:16:51 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from dasher.cs.utk.edu by CS.UTK.EDU with SMTP (8.6.4/2.8s-UTK) id PAA02032; Fri, 11 Mar 1994 15:16:44 -0500 From: Jack Dongarra Received: by dasher.cs.utk.edu (5.61+IDA+UTK-930922/2.7c-UTK) id AA13650; Fri, 11 Mar 94 15:16:17 -0500 Date: Fri, 11 Mar 94 15:16:17 -0500 Message-Id: <9403112016.AA13650@dasher.cs.utk.edu> To: R.Hockney@pac.soton.ac.uk, pbwg-comm@CS.UTK.EDU Subject: Re: PARKBENCH Report In-Reply-To: Mail from 'R.Hockney@pac.soton.ac.uk' dated: Wed, 29 Dec 93 17:51:42 GMT I believe we have the copyright issue straightened out with Wiley. The copyright agreement will allow electronic distribution of the report by us on an unlimited but non-exclusive basis. I will sign the copyright statement on behalf of the committee.
Jack From owner-parkbench-comm@CS.UTK.EDU Mon Mar 21 09:14:09 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id JAA03706; Mon, 21 Mar 1994 09:14:09 -0500 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA17384; Mon, 21 Mar 1994 09:12:57 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Mon, 21 Mar 1994 09:12:56 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from dasher.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.8s-UTK) id JAA17378; Mon, 21 Mar 1994 09:12:55 -0500 From: Jack Dongarra Received: by dasher.cs.utk.edu (cf v2.9c-UTK) id JAA16480; Mon, 21 Mar 1994 09:12:46 -0500 Date: Mon, 21 Mar 1994 09:12:46 -0500 Message-Id: <199403211412.JAA16480@dasher.cs.utk.edu> To: parkbench-comm@CS.UTK.EDU Subject: Parkbench meeting Please let me know if you are planning to attend the Parkbench meeting next Monday, March 28th. Thanks, Jack From owner-parkbench-comm@CS.UTK.EDU Fri Apr 1 18:52:00 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id SAA02659; Fri, 1 Apr 1994 18:51:59 -0500 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id SAA04266; Fri, 1 Apr 1994 18:50:50 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 1 Apr 1994 18:50:48 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from cray.com by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id SAA04259; Fri, 1 Apr 1994 18:50:46 -0500 Received: from magnet (magnet.cray.com) by cray.com (Bob mailer 1.2) id AA27845; Fri, 1 Apr 94 17:50:43 CST Received: by magnet (4.1/CRI-5.13) id AA01899; Fri, 1 Apr 94 17:50:40 CST From: cmg@ferrari.cray.com (Charles Grassl) Message-Id: <9404012350.AA01899@magnet> Subject: Bare kernels To: pbwg-comm@CS.UTK.EDU Date: Fri, 1 Apr 94 17:50:36 CST X-Mailer: ELM [version 2.3 PL11] I would like to see a "bare kernels" section which tests minimally changed constructs. These bare kernels are necessary in order to help programmers working on KERNELS, not using the kernels. 
In the discussion on Monday, 28 March, 1994, we listed the following kernels: - Matrix benchmarks - Transpose - Dense LU factorization - QR decomposition - Matrix diagonalization The kernels listed above each involve configuration specific algorithms. The performance of an algorithm is not useful for a low level programmer. We could test the PEs in a parallel system with the simple "bare kernels" listed below: - Intrinsics - SAXPY - COPY - DOT PRODUCT - Unchanged matrix-vector multiplication - Unchanged matrix-matrix multiplication An implementation of this test could be based on the MOD1AC program from the EUROBEN benchmark. This specific program is not exactly appropriate for microprocessors because it assumes a linear model of performance versus construct length a la vector processing. Comments? Discussion? Regards, Charles Grassl Cray Research, Inc. Eagan, Minnesota USA From owner-parkbench-comm@CS.UTK.EDU Tue Apr 12 22:26:46 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id WAA01265; Tue, 12 Apr 1994 22:26:45 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id WAA13147; Tue, 12 Apr 1994 22:25:52 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Tue, 12 Apr 1994 22:25:49 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from berry.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.8s-UTK) id WAA13136; Tue, 12 Apr 1994 22:25:47 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (cf v2.9c-UTK) id WAA12994; Tue, 12 Apr 1994 22:25:45 -0400 Message-Id: <199404130225.WAA12994@berry.cs.utk.edu> to: pbwg-comm@CS.UTK.EDU Subject: Minutes of last meeting Date: Tue, 12 Apr 1994 22:25:44 -0400 From: "Michael W. Berry" --------------------------------------------------------------------- Minutes of the Parkbench Meeting - Knoxville Hilton - March 28, 1994 --------------------------------------------------------------------- List of Attendees: Michael Berry (Univ.
of Tennessee, berry@cs.utk.edu) David Bailey (NASA, dbailey@nas.nasa.gov) Jack Dongarra (Univ. of Tennessee / ORNL, dongarra@cs.utk.edu) Myron Ginsberg (GM/EDS, ginsberg@gmr.com) Charles Grassl (CRI, cmg@cray.com) Ed Kushner (Intel SSD, kushner@ssd.intel.com) Tony Hey (Univ. of Southampton, ajgh@ecs.soton.ac.uk) David Mackay (Intel SSD, mackay@ssd.intel.com) Bodo Parady (Sun Microsystems, bodo.parady@eng.sun.com) Fiona Sim (IBM Kingston, fsim@vnet.ibm.com) David Walker (ORNL, walker@msr.epm.ornl.gov) At 9:20am, Jack D. opened the meeting and indicated that he would be the substitute chairman (in Roger Hockney's absence). Jack asked Mike B. to read the minutes from the Portland (SC'93) PARKBENCH meeting. There were no modifications to the minutes. It was noted that there are benchmark programs yet to be placed in Tony H. suggested that there be a formal announcement of the benchmark release. David W. indicated that the CA codes are still evolving. Jack D. read a statement from Roger about his motion to step down as chairman. The status of the SP publication was then addressed. There is hesitation from Wiley to give up copyright but they will allow Internet access in PDS/Netlib. A similar situation with MPI resulted in MPI going with MIT Press and not Wiley (joint copyright with MIT Press and UT). For PARKBENCH, Wiley holds the copyright but allows electronic distribution. David M. asked about future revision problems. Jack said that there should be no problem and that the on-line version will be the most recent. The text of the statement read by Jack D. on Roger H.'s behalf is provided below. PUBLICATION OF PARKBENCH REPORT ------------------------------- The first Parkbench Report has been accepted for publication by Wiley in "Scientific Programming" by Ron Perrott (Editor). Although there can be no guarantees, Ron states that the aim is for it to appear in 1994.
The version that has been modified for publication is available by anonymous ftp from 'ecs.soton.ac.uk' (152.78.64.201) under directory 'pub/benchmarks', file 'parkbench.ps.Z'. Log in as 'anonymous' and give your e-mail address as the password. The file is compressed PostScript, and needs a large buffer space when printing because of screen dumps. Jack then asked the subgroup leaders to give status reports. David B. indicated that nothing has happened lately with the Methodology subgroup and that the report is fully updated. Tony H. then reported on the efforts by the Kernels subgroup. CG has now been included since there was no sparse matrix code in the PARKBENCH suite. There was a proposal to replace the Integer Sort (IS) with the CG kernel but David B. felt that the IS might be kept. The proposal for replacement of IS by CG was made by David Mackay and Ed Kushner (Intel SSD). Bodo P. felt the use of IS depends on the size of the array to be sorted. A general discussion on the pros and cons of keeping the IS benchmark followed. David B. felt that a particle-in-cell (PIC) code should be included and that the IS kernel is not indicative of what is done in a PIC code. David B. suggested that a code from Los Alamos be added. ACTION ITEM: David Bailey will contact Los Alamos about getting a PIC code. The insertion of a PIC code was judged to be independent of the IS kernel decision. Jack D. and Bodo P. were in favor of leaving the IS kernel in. Attendees agreed to leave the IS discussion for the next meeting. Tony H. noted that only three kernels were actually available in Netlib: Dense LU, QR Decomposition, and Matrix Tridiagonalization. He then presented some performance results. David B. said that the latest NAS Parallel Benchmarks Report can be put in Netlib. A hardcopy of this report (RNR Technical Report RNR-94-006) was distributed to all the attendees. Tony H.
then demonstrated how the PARKBENCH data (stored in PDS) might be filtered into graphs using various templates. He showed sample performance (Mflop/s) graphs of different kernels (e.g., LU, 3D-FFT). Charles G. stressed that we have to specify why log-log graphs are used. Tony H. indicated that the graphics package would allow the user to choose their own display options. Jack asked if we want to include price-performance graphs. Bodo P. suggested that the user could supply their own price information and Jack suggested that the NAS table of prices be available. Charles G. suggested that list price might be more reasonable than the actual price. Tony H. said that his group wants to be able to express the performance of kernels in terms of low-level parameters (kernels) and that they are trying this with the following kernels: FFT, Matrix-Multiply, and SOR. The question of which plotting package to use was raised. Possible options are: Xplot, Gnuplot, and jgraph (UT-CS). David W. and Jack D. both suggested that the user should be able to select the plotting mechanism from PDS and have the plot displayed on their local workstation, but Tony H. wants the user to be able to modify plots on their local machine. Attendees felt both features were worthwhile. The current list of kernels and their versions was then discussed. 1. Matrix Benchmarks (Dense matrix Mult., transpose, dense LU, QR decomposition, Matrix tridiagonalization) Versions: F77, PVM, HPF Bodo P. asked if we want to use a Strassen multiply code. David B. asked if we want to encourage them to use Strassen's method. Charles G. felt it should be left out due to memory and accuracy concerns. 2. Fourier transforms (1DFFT, 3DFFT) Versions: F77, CMF for 3DFFT, NX2 3. PDE Kernels (SOR, MG, SoHo, NAS) Versions: HPF, PVM 4. Other (NAS: EP, CG, IS; I/O) Bodo P. stressed that disclosure be encouraged and attendees agreed that vendors are allowed to use supported libraries (available to customers). David B.
asked that a paragraph disclosing optimization techniques be included. Charles G. pointed out that vendors are liable when they put source codes out that were not really supported by the vendor. Attendees agreed that baseline runs be done first followed by optimizations in which the integrity of the "driver" is kept, i.e., guarantee that the same problem and accuracy is obtained (re: pages 17-18 of PARKBENCH report). ACTION ITEM: Jack D. will coordinate the specification of optimization procedures for the kernels and their packaging for Netlib. Jack D. pointed out that there are no official MPI results available. Tony H. said the kernels should all be available for release at the end of the month (April). PVM versions of the low-level benchmarks will also be included. The discussion of the kernels concluded and Jack D. then asked David W. to report on the efforts of the Compact Applications (CA) subgroup. David W. pointed out that he has a Fortran-M version of a PIC code that he might be able to get converted into a Fortran+MP code. Tony H. indicated that such a code might be better suited as a CA rather than a kernel. ACTION ITEM: David W. will see if his PIC code can be converted into a Fortran message-passing program for PARKBENCH. The current list of submitted CA's is given below: POLMP (Fluid Dynamics) SOLVER (QCD,CG) GAUGE(QCD,MC) (not available right now but easily obtained) SHALLOW H2O (ORNL) Versions: PICL ARCO (Seismic Migration) Versions: YAMPOR, PVM LU,SP,BT (CFD from NAS) Versions: NX2, F77, CMF, Cray Multitasking Note: HPF versions of first 3 do exist. CA's to be added are listed below: MOLECULAR DYNAMICS (ORNL) Versions: F77 PIC (LPM1 or LANL) Versions: PARMACS, PVM It appears that about 10 CA's can be collected for PARKBENCH. The group discussed whether this is a satisfactory list and pondered any redundancies. Tony H. asked that application experts review the list of codes. Myron Ginsberg stressed that some FEM codes be included but Bodo P. and Tony H.
pointed out problems with obtaining public-domain codes for applications like "crash testing". Myron G. noted that large structures or aerodynamics are the key applications used in the auto industry. ACTION ITEM: Tony H. will try to find a public-domain crash code. Charles G. asked what application areas are missing. Bodo P. has a prime factorization code that could be added. Fiona S. (IBM) pointed out that the list of CA's does not include a solid state Chemistry code. Tony H. mentioned that there is a GAMES code that might be considered. ACTION ITEM: Tony H. will ask Martin Guest about availability of GAMES-UK or AMBER benchmarks. David W. indicated that codes satisfying the constraints listed in Chapter 5 of the PARKBENCH report should be accepted. Attendees felt that codes solving the same problem be left in the suite. Myron G. asked if an on-line transaction processing code could be added. Tony H. indicated there is sensitivity by those groups (financial community) in reporting results. Mike B. indicated that the group should avoid averaging the results of the CA's (such as the geometric mean). Jack D. then suggested that the group break for lunch and restart at 1:00pm EST. After lunch, a discussion of verification procedures followed. Jack D. will produce verification procedures for kernels; David B. will provide such procedures for the NAS Parallel Benchmarks used in PARKBENCH. Charles G. pointed out past verification problems in the Perfect effort (changing arithmetic, different algorithms?). He indicated that there would be great interest in including residual information (error analysis). One could track some residuals for different numbers of processors. David B. suggested the group should reserve the right to verify submitted PARKBENCH results. No specifics on how the verification would be conducted would be written down -- this was a unanimous decision. Jack then noted that PVM to MPI conversions are in the near future.
He then asked the group to comment on the relationship with the SPEC/HPC group. Tony H. provided the details of a conversation he had with David Kuck. Kuck put SPEC/Perfect "back on track" at the January SPEC meeting. SPEC/HPSC apparently has to pay their own way. They have ARCO, NAS Parallel Codes, Weather code from Intel, Canadian weather code. Charles G. pointed out that their press release was the result of a "contentious meeting" in California. Fiona S. questioned whether there were any firm decisions from that meeting. Charles G. pointed out that vendors are tired of having to maintain yet another benchmark infrastructure. Ed K. indicated that Intel is interested in shared-memory type parallel processing with putting several CPU's on a single chip. It's not clear how Intel SSD relates to SPEC. Intel SSD provided $5K to HPSC (at management level). Ed K. on the behalf of Intel SSD (w/ backing of IBM and CRI) will invite Jack to give a talk to the SPEC/HPSC group. Fiona S. was not comfortable with the size/number of codes HPSC is wanting to consider. There appear to be conflicting attitudes in SPEC/HPSC in allowing optimized codes or strictly baseline (no touch) codes. Ed K. will contact Dave K. to see if Jack will be invited to talk to HPSC. Mike B. agreed to give a talk to SPEC/OSSC on Wed morning about the PDS/Netlib infrastructure [At that meeting, SPEC/OSSC voted 9-0 in favor of posting SPEC Newsletter data with a 3-month delay]. All attendees agreed that Roger H. should remain chair of PARKBENCH. Jack then asked if the SC'94 meeting should be a BOFS or a roundtable discussion. The group decided that the meeting would be best suited as a BOFS. April 4 is the deadline for scheduling a BOF/roundtable discussion for SC'94 in Washington DC. ACTION ITEM: Jack D. will apply for a PARKBENCH BOFS on Roger's behalf. Plans for the next meeting: first release of the benchmarks, complete set of CA's and some initial results. Video conferencing is a possibility for the next meeting.
Jack D. indicated that either CLI or PICTURE-TEL technology can be used. A third alternative is to use the Mbone/Internet, but the transmission is 2 frames per second. The group felt an experimental video conference would be interesting. Tentatively the next meeting is set for August here in Knoxville - Monday, August 29. In June, various video technologies will be tested (MBONE is marginal in expectations). NEXT MEETING: Monday, August 29 in Knoxville. Jack D. then asked Tony H. to talk about the most recent RAPS Workshop (RAPS = Real Applications on Parallel Systems). It is supported by IBM, Intel, TMC, Fujitsu, Cray, ACRI, and Convex. Members get privileged access to codes: AVL (FIRE code), ESI (PAMCRASH code), ECMWF (IFS code). The workshop was held in December, 1993. A talk on 3-D IFS was given -- HPF applied to FIRE code. Tony H. also spoke about the Europe HPCN Initiative which is funded to port commercial codes to MPP's (generic parallel versions with PVM and PARMACS not HPF). He detailed some of their work in the published "Europort" reports: Europort-1: Focus on CFD and FEM [PAMCRASH, NASTRAN]. Portable and scalable codes desired and made publicly available. Basically part of ESPRIT project. Europort-2: Focus on Computational Chemistry, Databases, Oil Reservoir, Electromagnetics, Radiotherapy, Earth Observation, Drug Design, Visualization. Oil application code is PROCLIPSE (ECLIPSE100). Comp. Chemistry codes included GAMES-UK, ADF, VAMP, GROMOS, MNDO, TURBOMOLE. Other applications/codes are Electromagnetics [TOSCA,ELEKTRA], Radiotherapy [EGS4], and Databases [ADABAS]. Tony H. listed contacts for these efforts and noted that the next RAPS Conference is in Hamburg, Germany on May 26-27: Europort-1: C. Thole, GMD, Sankt Augustin Europort-2: A. Colbrook, Smith Systems Engineering Jack D. then asked about the creation of a "comp.parallel.benchmark" or "comp.parkbench" newsgroup.
Jack indicated that an electronic call of votes (over the Internet) would be required to create such a newsgroup. Charles G. supports the creation of the newsgroup over mail reflectors. Group felt that "comp.parallel.performance" was the better choice for the name and that such a newsgroup should be unmoderated. Jack D. then suggested that there be a PARKBENCH announcement to HPCWIRE when all the codes are tested and assembled into Netlib. Mike B. then reported on the status of Mosaic version of PDS. He indicated that the current URL for PDS/Mosaic is http://netlib2.cs.utk.edu/performance/html/PDStop.html. Jack D. adjourned the meeting at approximately 3:15pm EST. From owner-parkbench-comm@CS.UTK.EDU Wed Apr 13 12:37:35 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id MAA06160; Wed, 13 Apr 1994 12:37:34 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id MAA22363; Wed, 13 Apr 1994 12:36:06 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 13 Apr 1994 12:36:05 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from Sun.COM by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id MAA22356; Wed, 13 Apr 1994 12:36:02 -0400 Received: from Eng.Sun.COM (zigzag.Eng.Sun.COM) by Sun.COM (sun-barr.Sun.COM) id AA10839; Wed, 13 Apr 94 09:35:48 PDT Received: from cumbria.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA23729; Wed, 13 Apr 94 09:34:49 PDT Received: by cumbria.Eng.Sun.COM (5.0/SMI-SVR4) id AA25499; Wed, 13 Apr 1994 09:34:55 +0800 Date: Wed, 13 Apr 1994 09:34:55 +0800 From: Bodo.Parady@Eng.Sun.COM (Bodo Parady - SMCC Architecture Performance Group) Message-Id: <9404131634.AA25499@cumbria.Eng.Sun.COM> To: pbwg-comm@CS.UTK.EDU Subject: FYI: Status of PAR9X in SPEC X-Sun-Charset: US-ASCII This is a summary of the consensus of the SPEC Open Systems Steering Committee on the status of PAR93 and its possible successor to be released as part of the SPECfp95 suite. 
Many thanks to Jack for hosting the joint PARKBENCH and SPEC meetings. The improved communications between the different benchmark efforts could be a benefit to all of us. I have ordered video conferencing equipment and look forward to using it in the future for PARKBENCH. Bodo --------- From bodo Tue Apr 5 17:07:06 1994 To: specpar@dg-rtp.dg.com > At the Knoxville meeting, the release plans for SPECpar95 were established. > It is the consensus (no objections were raised, and the following is > an abstraction of the minutes) of the SPEC members in attendance: > > That SPECpar95 be released as part of the SPECfp95 release tape. > > That SPECpar95 can be reported separately from SPECfp95, that reporting > one does not imply reporting the other. > > That a baseline and optimized result will be reported for SPECpar95 > with identical run rules for SPECfp95. > > That no compiler directives will be allowed in the run rules for > SPECpar95, optimized. > > That SPECpar95 be sized appropriately to the large uniprocessor > systems and MP systems, and that the sizing of SPECfp95 > would be appropriate to the smallest system being > reported. The technology used in SPECpar95 that allows > a single code base to be produced in the src.base > directory will enable the maintenance of this source base. > > That benchmarks that appear in both SPECpar95 and SPECfp95 would > share the same code but not be comparable in size and > run time. > > That no resolution of the SC is required since Jeff Reilly and Bodo Parady > will work together to produce this release. > > Many thanks.
> > Bodo > From owner-parkbench-comm@CS.UTK.EDU Thu Apr 14 11:48:06 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id LAA14981; Thu, 14 Apr 1994 11:48:06 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id LAA19333; Thu, 14 Apr 1994 11:48:06 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 14 Apr 1994 11:48:04 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from ua.d.umn.edu by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id LAA19321; Thu, 14 Apr 1994 11:47:59 -0400 Received: from mars.d.umn.edu by ua.d.umn.edu with SMTP id AA14968 (5.65c/IDA-1.4.4); Thu, 14 Apr 1994 10:47:48 -0500 Date: Thu, 14 Apr 1994 10:47:48 -0500 From: Clark Thomborson Message-Id: <199404141547.AA14968@ua.d.umn.edu> Received: by mars.d.umn.edu (4.1/SMI-4.1) id AA01688; Thu, 14 Apr 94 10:47:41 CDT To: dongarra@CS.UTK.EDU Subject: Parkbench Cc: pbwg-comm@CS.UTK.EDU Dear Professor Dongarra, Please add me to your distribution list for Parkbench. I'm the author of the article in the 11/93 C. ACM on porting workstation code to vector supercomputers. Among other projects, I am now trying to figure out how to extend that broad-brush analysis of the cost and performance (broadly construed) of high-performance computer systems. Broader benchmarks than single-user CPU or CPU-memory kernels are surely part of the answer I seek. Just now, I retrieved "method.archive from pbwg" to see if your methodology group is headed in the direction of my interest. It seems not...so I'll ride my hobby-horse a bit longer... Measurement under a vaguely-stated "load" is, of course, an ill-defined problem, but it has the virtue of leading one to other, even more important, questions: 1. What are the likely job mixes for any particular supercomputer? 2. Which components of the job mix are most important to accelerate? 3. 
Which components should be run as efficiently as possible, i.e., in such a way as to maximize system throughput, rather than minimizing an individual job's latency? It seems that the supercomputer benchmarking community has already answered these questions in the following way: 1. Floating-point intensive (most typically BLAS kernels). 2. BLAS kernels. 3. None. Perhaps these are indeed the "right" answers, but I'd be interested to hear a justification, especially for Answer 3. (Answers 1 & 2 define the traditional military/scientific market for supercomputing; an abrupt focus-switch to an entirely different market segment is, I believe, beyond the scope of Parkbench.) Answer 3 is, I believe, inappropriate. An architecture that is sharply tuned for performance on single-job kernels, but with a large process-switch time, will be cost-ineffective in comparison to an architecture that can support multiple jobs without much degradation in single-job performance. To put my concern in another way: it is "merely a small matter of programming" to hand-code any particular BLAS-3 kernel for ideal speed under no load on any particular system for which you know the memory and instruction latencies and throughput. But! I submit that any such code is likely to have significant performance deficits when there are memory conflicts (e.g. bank conflicts on a Cray Y) or processor-scheduling delays (e.g. in a Cray autotasked code run on a multi-user system). Surely others in the Parkbench group have noticed this nonlinear, load-dependent, performance degradation on a Cray X-MP or Y-MP. And things will surely get worse, soon, when more of us write codes using more than two layers of the memory hierarchy (as is necessary on any current MPP, or for that matter on any vector supercomputer chewing on a 10 GB data file). How large of a memory image, at each layer of the memory hierarchy, should we assume for the purposes of performance-tuning? 
Can we write our source codes and system subroutines so that their performance depends no more than is necessary on system load? Special-interest disclosure: My comments above are partially motivated by some ideas I have been developing recently, about how to write source codes for BLAS-3 subroutines that achieve near-optimal performance under a wide variety of system loads. Clark Thomborson PS: Thank you for organizing this email discussion group, and for publishing the "Parallel Computing Research" newsletter. From owner-parkbench-comm@CS.UTK.EDU Wed May 11 09:08:10 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id JAA17213; Wed, 11 May 1994 09:08:09 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA25118; Wed, 11 May 1994 09:06:35 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 11 May 1994 09:06:33 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from obelix.hrz.tu-chemnitz.de by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA25034; Wed, 11 May 1994 09:06:22 -0400 Received: from sunnyboy.informatik.tu-chemnitz.de by obelix.hrz.tu-chemnitz.de with Local SMTP (PP) id <08635-0@obelix.hrz.tu-chemnitz.de>; Wed, 11 May 1994 15:05:52 +0200 Received: from pi.informatik.tu-chemnitz.de by sunnyboy.informatik.tu-chemnitz.de (4.1/SMI-4.1) id AA19062; Wed, 11 May 94 15:05:47 +0200 Date: Wed, 11 May 94 15:05:47 +0200 From: andreas.kleber@informatik.tu-chemnitz.de (Andreas Kleber) Message-Id: <9405111305.AA19062@sunnyboy.informatik.tu-chemnitz.de> To: pbwg-comm@CS.UTK.EDU send parkbench.ps from parkbench From owner-parkbench-comm@CS.UTK.EDU Wed May 11 09:11:10 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id JAA17221; Wed, 11 May 1994 09:11:10 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA25617; Wed, 11 May 1994 09:11:17 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 11 May 1994 09:11:16 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from obelix.hrz.tu-chemnitz.de by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA25610; Wed, 11 May 1994 09:11:10 -0400 Received: from sunnyboy.informatik.tu-chemnitz.de by obelix.hrz.tu-chemnitz.de with Local SMTP (PP) id <08862-0@obelix.hrz.tu-chemnitz.de>; Wed, 11 May 1994 15:10:55 +0200 Received: from pi.informatik.tu-chemnitz.de by sunnyboy.informatik.tu-chemnitz.de (4.1/SMI-4.1) id AA19090; Wed, 11 May 94 15:10:53 +0200 Date: Wed, 11 May 94 15:10:53 +0200 From: andreas.kleber@informatik.tu-chemnitz.de (Andreas Kleber) Message-Id: <9405111310.AA19090@sunnyboy.informatik.tu-chemnitz.de> To: pbwg-comm@CS.UTK.EDU send linalg.tar.z.uu from parkbench From owner-parkbench-comm@CS.UTK.EDU Wed May 11 09:13:48 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id JAA17227; Wed, 11 May 1994 09:13:48 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA25815; Wed, 11 May 1994 09:13:55 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 11 May 1994 09:13:54 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from obelix.hrz.tu-chemnitz.de by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA25802; Wed, 11 May 1994 09:13:48 -0400 Received: from sunnyboy.informatik.tu-chemnitz.de by obelix.hrz.tu-chemnitz.de with Local SMTP (PP) id <08870-0@obelix.hrz.tu-chemnitz.de>; Wed, 11 May 1994 15:13:26 +0200 Received: from pi.informatik.tu-chemnitz.de by sunnyboy.informatik.tu-chemnitz.de (4.1/SMI-4.1) id AA19093; Wed, 11 May 94 15:13:23 +0200 Date: Wed, 11 May 94 15:13:23 +0200 From: andreas.kleber@informatik.tu-chemnitz.de (Andreas Kleber) Message-Id: <9405111313.AA19093@sunnyboy.informatik.tu-chemnitz.de> To: pbwg-comm@CS.UTK.EDU send nas.uu from parkbench From owner-parkbench-comm@CS.UTK.EDU Wed May 11 09:16:32 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id JAA17245; Wed, 11 May 1994 09:16:32 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA26241; Wed, 11 May 
1994 09:16:48 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 11 May 1994 09:16:45 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from obelix.hrz.tu-chemnitz.de by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA26203; Wed, 11 May 1994 09:16:13 -0400 Received: from sunnyboy.informatik.tu-chemnitz.de by obelix.hrz.tu-chemnitz.de with Local SMTP (PP) id <09183-0@obelix.hrz.tu-chemnitz.de>; Wed, 11 May 1994 15:15:45 +0200 Received: from pi.informatik.tu-chemnitz.de by sunnyboy.informatik.tu-chemnitz.de (4.1/SMI-4.1) id AA19096; Wed, 11 May 94 15:15:41 +0200 Date: Wed, 11 May 94 15:15:41 +0200 From: andreas.kleber@informatik.tu-chemnitz.de (Andreas Kleber) Message-Id: <9405111315.AA19096@sunnyboy.informatik.tu-chemnitz.de> To: pbwg-comm@CS.UTK.EDU send npb.rev4.3.tar.z.uu parkbench From owner-parkbench-comm@CS.UTK.EDU Wed May 11 09:17:51 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id JAA17258; Wed, 11 May 1994 09:17:50 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA26390; Wed, 11 May 1994 09:18:13 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 11 May 1994 09:18:13 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from obelix.hrz.tu-chemnitz.de by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA26382; Wed, 11 May 1994 09:18:04 -0400 Received: from sunnyboy.informatik.tu-chemnitz.de by obelix.hrz.tu-chemnitz.de with Local SMTP (PP) id <09204-0@obelix.hrz.tu-chemnitz.de>; Wed, 11 May 1994 15:17:25 +0200 Received: from pi.informatik.tu-chemnitz.de by sunnyboy.informatik.tu-chemnitz.de (4.1/SMI-4.1) id AA19121; Wed, 11 May 94 15:17:22 +0200 Date: Wed, 11 May 94 15:17:22 +0200 From: andreas.kleber@informatik.tu-chemnitz.de (Andreas Kleber) Message-Id: <9405111317.AA19121@sunnyboy.informatik.tu-chemnitz.de> To: pbwg-comm@CS.UTK.EDU send general.uu from parkbench From owner-parkbench-comm@CS.UTK.EDU Wed May 11 09:19:31 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf 
v2.8t-netlib) id JAA17289; Wed, 11 May 1994 09:19:30 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA26446; Wed, 11 May 1994 09:19:32 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 11 May 1994 09:19:31 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from obelix.hrz.tu-chemnitz.de by CS.UTK.EDU with SMTP (cf v2.8s-UTK) id JAA26424; Wed, 11 May 1994 09:19:25 -0400 Received: from sunnyboy.informatik.tu-chemnitz.de by obelix.hrz.tu-chemnitz.de with Local SMTP (PP) id <09211-0@obelix.hrz.tu-chemnitz.de>; Wed, 11 May 1994 15:19:10 +0200 Received: from pi.informatik.tu-chemnitz.de by sunnyboy.informatik.tu-chemnitz.de (4.1/SMI-4.1) id AA19124; Wed, 11 May 94 15:19:08 +0200 Date: Wed, 11 May 94 15:19:08 +0200 From: andreas.kleber@informatik.tu-chemnitz.de (Andreas Kleber) Message-Id: <9405111319.AA19124@sunnyboy.informatik.tu-chemnitz.de> To: pbwg-comm@CS.UTK.EDU send genesis.uu from parkbench From owner-parkbench-comm@CS.UTK.EDU Thu Jun 16 15:40:59 1994 Received: from CS.UTK.EDU by netlib2 with ESMTP (cf v2.8t-netlib) id PAA15560; Thu, 16 Jun 1994 15:40:59 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA00972; Thu, 16 Jun 1994 15:39:27 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 16 Jun 1994 15:39:19 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA00965; Thu, 16 Jun 1994 15:39:16 -0400 Via: uk.ac.southampton.relay; Thu, 16 Jun 1994 20:34:37 +0100 Received: from ecs.soton.ac.uk (root@localhost) by mail.soton.ac.uk (8.6.4/2.12) with NIFTP id UAA01709 for pbwg-comm%cs.utk.edu@uk.ac.nsfnet-relay; Thu, 16 Jun 1994 20:32:18 +0100 From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Thu, 16 Jun 94 20:31:14 BST Date: Thu, 16 Jun 94 20:30:30 BST Message-Id: <29769.9406161930@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Reprints PUBLICATION OF PARKBENCH REPORT ------------------------------- 
I have just finished correcting the proofs, and the Report is scheduled for publication in Scientific Programming at the end of August 1994, about the time of our 29th August Parkbench Meeting. 25 reprints come free and will be sent to me. I shall distribute them as I see fit amongst those most closely involved, but they won't go far. Extra reprints may be ordered from Wiley by IMMEDIATELY sending an order to them. The minimum order is for 100 reprints, and I estimate by extrapolating their table that the cost is US$ 830 for the 46 pages. Requests should be sent to:

Alyson Linefsky
Journals Editorial/Production
John Wiley & Sons, Inc
605 Third Avenue
New York, N.Y. 10158 USA
Phone : +1 (212) 850-6723
FAX : ........ 850-6050
e-mail: alinefsk@jwiley.com

I have not yet verified that she receives e-mail, but the telephone is correct. Ask for reprints of

Manuscript Number: SP-049
Name : Parkbench Report
Author : Hockney
Journal: Scientific Programming

and give:

Method of payment (check, bill me, credit card - give usual details)
Billing Address
Shipping Address

Arrangements for additional reprints are between yourself and Wiley; I am just giving the contact details here. Remember that the Report is always available electronically from the servers at Knoxville and Southampton in postscript and latex forms.
Best Regards Roger Hockney From owner-parkbench-comm@CS.UTK.EDU Sat Jun 18 14:55:55 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id OAA01129; Sat, 18 Jun 1994 14:55:54 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA12263; Sat, 18 Jun 1994 14:54:23 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Sat, 18 Jun 1994 14:54:22 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA12256; Sat, 18 Jun 1994 14:54:20 -0400 Via: uk.ac.southampton.relay; Sat, 18 Jun 1994 19:53:11 +0100 Received: from ecs.soton.ac.uk (root@localhost) by mail.soton.ac.uk (8.6.4/2.12) with NIFTP id TAA08728 for pbwg-comm%cs.utk.edu@uk.ac.nsfnet-relay; Sat, 18 Jun 1994 19:45:37 +0100 From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Sat, 18 Jun 94 19:44:40 BST Date: Sat, 18 Jun 94 19:43:57 BST Message-Id: <7598.9406181843@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Reprints PARKBENCH REPRINTS ------------------ Following my earlier note, it seems a pity if we only get the 25 free copies of the printed version, but that is all a funding-free individual such as myself can do. I wonder, however, if there are any corporate or institutional members who could persuade their organisations to buy some reprints with covers, overprint the covers with some note about themselves distributing these reprints as a service to the parallel community, and then give them away at their stands at Supercomputing'94. There would then be some advertising for their organisations and for PARKBENCH. Just a thought, but it would be nice if we could get some decent distribution for the printed version. Any Offers??
If so, arrangements should be made directly between the organisation and Alyson Linefsky, whose e-mail (she does receive and read it) was on my last e-mail: alinefsk@jwiley.com Action needs to be prompt, as reprints ordered much later than the proofs are returned carry extra charges. Best Wishes Roger Hockney From owner-parkbench-comm@CS.UTK.EDU Mon Jun 20 16:14:58 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id QAA17509; Mon, 20 Jun 1994 16:14:57 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id QAA24104; Mon, 20 Jun 1994 16:13:52 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Mon, 20 Jun 1994 16:13:51 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id QAA24097; Mon, 20 Jun 1994 16:13:47 -0400 Via: uk.ac.southampton.relay; Mon, 20 Jun 1994 21:13:19 +0100 Received: from ecs.soton.ac.uk (root@localhost) by mail.soton.ac.uk (8.6.4/2.12) with NIFTP id VAA20306 for pbwg-comm%cs.utk.edu@uk.ac.nsfnet-relay; Mon, 20 Jun 1994 21:05:31 +0100 From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Mon, 20 Jun 94 21:04:44 BST Date: Mon, 20 Jun 94 21:04:11 BST Message-Id: <2438.9406202004@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Reprints REPRINTS AGAIN -------------- Mike Berry has offered to put in a few hundred dollars from UTK towards some reprints. If we can have two or three more similar offers (please enclose a billing address), then we can get 100 reprints. Any more offers? How many reprints do you all think we should try to buy??
Roger From owner-parkbench-comm@CS.UTK.EDU Wed Jun 29 09:45:32 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id JAA01475; Wed, 29 Jun 1994 09:45:31 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA28185; Wed, 29 Jun 1994 09:43:59 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 29 Jun 1994 09:43:58 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from rambone.psi.net by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA28178; Wed, 29 Jun 1994 09:43:56 -0400 Received: from jwiley.com by rambone.psi.net (4.1/SMI-4.1.3-PSI) id AA24290; Wed, 29 Jun 94 09:37:21 EDT From: CWOODS@jwiley.com (Woods, Craig) Date: 29 Jun 94 09:21:40 Received: by jwiley.com (UUCP-MHS-XtcN) Wed Jun 29 09:36:59 1994 To: pbwg-comm@CS.UTK.EDU Subject: Reprint Costs-Parkbench Report Message-Id: 7075112E011DBCD1 Importance: Normal Encoding: 14 TEXT The cost of 100 reprints of the above article is $830.00 plus $90.00 for covers. 200 reprints would be $1,532.00 plus an additional $145.00 for 200 covers. Shipping charges and any applicable sales tax are not included. 
Please direct all queries concerning this matter to my attention, Craig Woods, e-mail: cwoods@jwiley.com From owner-parkbench-comm@CS.UTK.EDU Fri Jul 8 04:27:50 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id EAA03229; Fri, 8 Jul 1994 04:27:49 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id EAA16760; Fri, 8 Jul 1994 04:26:59 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 8 Jul 1994 04:26:57 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from unidhp1.uni-c.dk by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id EAA16753; Fri, 8 Jul 1994 04:26:55 -0400 Message-Id: <199407080826.EAA16753@CS.UTK.EDU> Received: by unidhp1.uni-c.dk (1.37.109.8/16.2) id AA23701; Fri, 8 Jul 1994 10:27:38 +0200 Date: Fri, 8 Jul 1994 10:27:38 +0200 From: Jack Dongarra To: parkbench-comm@CS.UTK.EDU Subject: next Parkbench Meeting As much as I would like to hold the next ParkBench meeting over the network, it doesn't look like it will be possible. I would like to suggest that we plan the next meeting on Monday August 29th in Knoxville. As before, we will use the Downtown Knoxville Hilton Hotel. More details to follow. 
Regards, Jack From owner-parkbench-comm@CS.UTK.EDU Wed Jul 13 04:23:09 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id EAA29004; Wed, 13 Jul 1994 04:23:09 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id EAA23032; Wed, 13 Jul 1994 04:20:35 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 13 Jul 1994 04:20:25 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id EAA23005; Wed, 13 Jul 1994 04:20:22 -0400 Via: uk.ac.southampton.relay; Wed, 13 Jul 1994 09:18:13 +0100 Received: from ecs.soton.ac.uk (root@localhost) by mail.soton.ac.uk (8.6.4/2.12) with NIFTP id JAA15789 for pbwg-comm%cs.utk.edu@uk.ac.nsfnet-relay; Wed, 13 Jul 1994 09:10:36 +0100 From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Wed, 13 Jul 94 09:10:26 BST Date: Wed, 13 Jul 94 09:10:25 BST Message-Id: <4916.9407130810@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Parkbench Reprints

I have confirmation from Wiley that the cost is:

                 100         200
Reprints     US$ 830    US$ 1532
Covers       US$  90    US$  145

I have offers as follows:

UTK      US$ 300   my interpretation of a "few" offer from Mike
Soton    US$ 300   from Tony

Thus we need a further offer to make a minimum purchase for the group. Please notify me within a week if your organisation can offer a further US$ 230.
I will then order 100 without covers and have the invoice split and the order split to the three addresses Roger From owner-parkbench-comm@CS.UTK.EDU Sun Aug 7 12:18:13 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id MAA05481; Sun, 7 Aug 1994 12:18:12 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA07179; Sun, 7 Aug 1994 12:17:17 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Sun, 7 Aug 1994 12:17:15 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from dasher.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id MAA07166; Sun, 7 Aug 1994 12:17:14 -0400 From: Jack Dongarra Received: by dasher.cs.utk.edu (cf v2.9c-UTK) id MAA12566; Sun, 7 Aug 1994 12:17:11 -0400 Date: Sun, 7 Aug 1994 12:17:11 -0400 Message-Id: <199408071617.MAA12566@dasher.cs.utk.edu> To: parkbench-comm@CS.UTK.EDU Subject: meeting on August 29th I would like to get a feeling for how many people will be attending the Parkbench meeting in Knoxville on August 29th. Please send me a message if you are planning to attend. Thanks, Jack From owner-parkbench-comm@CS.UTK.EDU Mon Aug 8 11:11:50 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id LAA11110; Mon, 8 Aug 1994 11:11:49 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA24013; Mon, 8 Aug 1994 11:09:53 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Mon, 8 Aug 1994 11:09:52 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from wk49.nas.nasa.gov by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id LAA24006; Mon, 8 Aug 1994 11:09:49 -0400 Received: (from dbailey@localhost) by wk49.nas.nasa.gov (8.6.8.1/NAS.5.b) id IAA09174; Mon, 8 Aug 1994 08:09:46 -0700 Date: Mon, 8 Aug 1994 08:09:46 -0700 From: dbailey@nas.nasa.gov (David H. 
Bailey) Message-Id: <199408081509.IAA09174@wk49.nas.nasa.gov> To: dongarra@CS.UTK.EDU CC: parkbench-comm@CS.UTK.EDU In-reply-to: <199408071617.MAA12566@dasher.cs.utk.edu> (message from Jack Dongarra on Sun, 7 Aug 1994 12:17:11 -0400) Subject: Re: meeting on August 29th Reply-to: dbailey@nas.nasa.gov I'm planning on attending. DHB From owner-parkbench-comm@CS.UTK.EDU Wed Aug 17 09:54:00 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id JAA16281; Wed, 17 Aug 1994 09:54:00 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA12588; Wed, 17 Aug 1994 09:52:34 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 17 Aug 1994 09:52:33 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from dasher.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id JAA12581; Wed, 17 Aug 1994 09:52:32 -0400 From: Jack Dongarra Received: by dasher.cs.utk.edu (cf v2.9c-UTK) id JAA17012; Wed, 17 Aug 1994 09:52:31 -0400 Date: Wed, 17 Aug 1994 09:52:31 -0400 Message-Id: <199408171352.JAA17012@dasher.cs.utk.edu> To: parkbench-comm@CS.UTK.EDU Subject: August Parkbench Meeting Dear Colleague, The Sixth Meeting of the ParkBench (Parallel Benchmark Working Group) will meet in Knoxville, Tennessee on August 29th, 1994. The meeting site will be the Knoxville Downtown Hilton Hotel. We have made arrangements with the Hilton Hotel in Knoxville. Hilton Hotel 501 W. Church Street Knoxville, TN Phone: 615-523-2300 When making arrangements tell the hotel you are associated with the ParkBench or University of Tennessee. The rate is $68.00/night. You can download a postscript map of the area by anonymous ftp'ing to netlib2.cs.utk.edu, cd shpcc94, get knx-downtown.ps. You can rent a car or get a cab from the airport to the hotel. We should plan to start at 9:00 am August 29th and finish about 5:00 pm. If you will be attending the meeting please send me email so we can better arrange for the meeting. 
If you are planning to arrive early on Sunday let me know, perhaps I can join you for dinner.

The format of the meeting is:

Monday 29th August
 9:00 - 12:00 Full group meeting
12:00 -  1:30 Lunch
 1:30 -  5:00 Full group meeting

Tentative agenda for the meeting:

(1) Minutes of last meeting
(2) Relation of ParkBench to SPEC
(3) Status of various sections of ParkBench
(4) Availability of the NAS benchmarks
(5) Cleanup of benchmark code
(6) Report on Southampton's Graphical Interface to Parkbench Data Base
(7) Report on Parkbench Data Base: how does one enter data, etc.
(8) Supercomputing 94 meeting
(9) Where we go from here
(10) Parkbench Chair

The objectives for the group are:

1. To establish a comprehensive set of parallel benchmarks that is generally accepted by both users and vendors of parallel systems.
2. To provide a focus for parallel benchmark activities and avoid unnecessary duplication of effort and proliferation of benchmarks.
3. To set standards for benchmarking methodology and result-reporting together with a control database/repository for both the benchmarks and the results.

The following mailing lists have been set up.

parkbench-comm@cs.utk.edu        Whole committee
parkbench-lowlevel@cs.utk.edu    Low level subcommittee
parkbench-compactapp@cs.utk.edu  Compact applications subcommittee
parkbench-method@cs.utk.edu      Methodology subcommittee
parkbench-kernel@cs.utk.edu      Kernel subcommittee

All mail is being collected and can be retrieved by sending email to netlib@ornl.gov and in the mail message typing:

send comm.archive from parkbench
send lowlevel.archive from parkbench
send compactapp.archive from parkbench
send method.archive from parkbench
send kernel.archive from parkbench
send index from parkbench

We have set up a mail reflector for correspondence; it is called parkbench-comm@cs.utk.edu. Mail to that address will be sent to the mailing list and also collected in netlib@ornl.gov.
To retrieve the collected mail, send email to netlib@ornl.gov and in the mail message type: send comm.archive from parkbench Jack Dongarra From owner-parkbench-comm@CS.UTK.EDU Fri Aug 19 17:29:12 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id RAA03679; Fri, 19 Aug 1994 17:29:12 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA12115; Fri, 19 Aug 1994 17:26:29 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 19 Aug 1994 17:26:27 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from rios2.EPM.ORNL.GOV by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id RAA12080; Fri, 19 Aug 1994 17:26:25 -0400 Received: (from walker@localhost) by rios2.EPM.ORNL.GOV (8.6.8/8.6.6) id RAA18062; Fri, 19 Aug 1994 17:26:23 -0400 From: David Walker Message-Id: <199408192126.RAA18062@rios2.EPM.ORNL.GOV> To: pbwg-comm@CS.UTK.EDU Subject: PARKBENCH mosaic page Date: Fri, 19 Aug 94 17:26:23 -0500 I've set up a PARKBENCH mosaic page and would appreciate your comments, criticisms, and additions. 
The URL is: http://www.epm.ornl.gov/~walker/parkbench/ Best Regards, David From owner-parkbench-comm@CS.UTK.EDU Mon Aug 22 09:17:20 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id JAA26166; Mon, 22 Aug 1994 09:17:19 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA23344; Mon, 22 Aug 1994 09:15:39 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Mon, 22 Aug 1994 09:15:37 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from rios2.EPM.ORNL.GOV by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id JAA23332; Mon, 22 Aug 1994 09:15:35 -0400 Received: (from walker@localhost) by rios2.EPM.ORNL.GOV (8.6.8/8.6.6) id JAA15620 for pbwg-comm@cs.utk.edu; Mon, 22 Aug 1994 09:15:34 -0400 Date: Mon, 22 Aug 1994 09:15:34 -0400 From: David Walker Message-Id: <199408221315.JAA15620@rios2.EPM.ORNL.GOV> To: pbwg-comm@CS.UTK.EDU Subject: Re: PARKBENCH mosaic page If you tried to access the PARKBENCH Mosaic page that I sent email about on Friday and failed, it was because we had a power outage over the weekend. You should be able to access the URL now.
It is http://www.epm.ornl.gov/~walker/parkbench/ Best Regards, David From owner-parkbench-comm@CS.UTK.EDU Fri Aug 26 10:02:13 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id KAA28186; Fri, 26 Aug 1994 10:02:13 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA29660; Fri, 26 Aug 1994 10:00:34 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 26 Aug 1994 10:00:32 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA29624; Fri, 26 Aug 1994 10:00:20 -0400 Via: uk.ac.southampton.relay; Fri, 26 Aug 1994 14:59:12 +0100 Received: from ecs.soton.ac.uk (root@localhost) by mail.soton.ac.uk (8.6.4/2.12) with NIFTP id OAA29335 for pbwg-comm%cs.utk.edu@uk.ac.nsfnet-relay; Fri, 26 Aug 1994 14:49:49 +0100 From: R.Hockney@pac.soton.ac.uk Via: calvados.pac.soton.ac.uk (plonk); Fri, 26 Aug 94 14:51:28 BST Date: Fri, 26 Aug 94 14:51:09 BST Message-Id: <14357.9408261351@calvados.pac.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Report Publication I am very pleased to say that the first Parkbench Report has now been published and is in the Summer Issue of Parallel Programming. The reference is: R. Hockney and M. Berry (eds.) "Public International Benchmarks for Parallel Computers, Parkbench Committee Report-1", Scientific Programming, vol 3 (2), 101-146 (1994). If anyone attending the 29th August meeting has a copy, please bring it with you. The second line above should read "Summer Issue of Scientific (not Parallel) Programming".
I look forward to seeing some of you in Knoxville next Monday Roger Hockney From owner-parkbench-comm@CS.UTK.EDU Thu Sep 1 14:53:31 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id OAA10612; Thu, 1 Sep 1994 14:53:31 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA03075; Thu, 1 Sep 1994 14:51:40 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 1 Sep 1994 14:51:38 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from berry.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id OAA03069; Thu, 1 Sep 1994 14:51:36 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (cf v2.9c-UTK) id OAA14248; Thu, 1 Sep 1994 14:51:37 -0400 Message-Id: <199409011851.OAA14248@berry.cs.utk.edu> to: pbwg-comm@CS.UTK.EDU Subject: Minutes of last meeting Date: Thu, 01 Sep 1994 14:51:36 -0400 From: "Michael W. Berry"

Please send corrections to the minutes below. Thanks, Mike

---------------------------------------------------------------------
Minutes of the Parkbench Meeting - Knoxville Hilton - August 29, 1994
---------------------------------------------------------------------

List of Attendees:

Michael Berry (Univ. of Tennessee, berry@cs.utk.edu)
David Bailey (NASA, dbailey@nas.nasa.gov)
Jack Dongarra (Univ. of Tennessee / ORNL, dongarra@cs.utk.edu)
Alistair Dunlop (Univ. of Southampton, and@ecs.soton.ac.uk)
Charles Grassl (CRI, cmg@cray.com)
Roger Hockney (Univ. of Southampton, rwh@pac.soton.ak.uk)
Todd Letsche (Univ. of Tennessee, letsche@cs.utk.edu)
Bodo Parady (Sun Microsystems, bodo.parady@eng.sun.com)
Fiona Sim (IBM Kingston, fsim@vnet.ibm.com)
David Walker (ORNL, walker@msr.epm.ornl.gov)
Pat Worley (ORNL, worleyph@ornl.gov)

At 9:10am Jack D. provided details of the NSE (National Software Exchange) on the world-wide-web which will put a framework around high performance computing (HPC) activities. The URL is http://www.netlib.org/nse/home.html. Roger H. started the formal meeting at 9:15am.
Mike B. read the minutes from the March 28 Parkbench meeting, which were accepted with minor revisions.

At 9:30am, Roger H. noted that the first Parkbench report has now appeared in the Summer 1994 (45 pages) issue of "Scientific Programming" published by Wiley. Roger noted that minor editing to the current on-line version was required, so the two versions are not quite the same. The question of reprints was brought up. Jack D. indicated that postscript and HTML versions of the paper are available and that a reference to the hardcopy could be easily added. One suggestion was to scan the cover page of the hardcopy for the HTML version.

Roger H. then asked Fiona Sim to comment on the Parkbench/SPEC-HPSC relationship. Fiona gave a brief talk discussing the current interactions between the two groups. She provided a list of the current SPEC/HPSC members and noted that an organization must pay $5K to be an Assoc. Member and $1K to be an Affiliate Member. SPEC's aim is to look at "real-world" applications for standardized cross-platform comparison on a level playing field. The focus is on symmetric multiprocessor systems, workstation clusters, distributed-memory parallel systems, and vector and vector parallel supercomputers. Fiona stressed that "end users" are to be the ultimate beneficiaries of the SPEC/HPSC efforts. Basic guidelines have been adopted and their codes must be written in Fortran 77, ANSI C and MPI or PVM. They are currently looking at codes from the following application areas: seismic data processing (ARCO), computational chemistry (GAMESS & AMBER), and computational fluid dynamics (TURB3D). The SPEC/HPSC contacts are Siamak Hassanzadeh and Fiona Sim. From a previous conference call involving Siamak H., Jack D., Fiona S., and David W. it was generally decided that SPEC/HPSC should draw on Parkbench experience and expertise.
Parkbench is clearly more research oriented than SPEC/HPSC, but Parkbench can draw on SPEC/HPSC experience (commercial user interests, vendor interests). Fiona encouraged collaboration and interaction between the two groups and suggested that a joint BOFS be held at Supercomputing '94. Another consideration voiced was to have one joint meeting per year (perhaps in Knoxville). Parkbench participation at SPEC/HPSC meetings was encouraged, and the next SPEC/HPSC meeting is scheduled in September. SPEC/HPSC plans to have 4 meetings each year. Bodo P. pointed out that SPEC is based on vendor compliance and is truly a volunteer effort. SPEC/HPSC has need for application codes based on finite elements and engineering analysis. A weather code is being considered but is not an MP (message-passing) code. Pat W. mentioned that many versions of ORNL's Shallow Water code are available (but it is really a compact application). A human genome application is another possibility. Fiona S. finished her discussion at 9:52am.

Jack D. felt that joint meetings at Supercomputing conferences would be a good idea and that a Parkbench representative would need to attend the other regularly-scheduled SPEC/HPSC meetings. Fiona S. pointed out that SPEC/HPSC only wants application codes, not kernels. Jack D. asked where the funds from SPEC annual fees actually go. Bodo P. pointed out that some funds could go to travel costs for affiliate members (university folks). Both Fiona S. and Bodo P. felt that universities would need to formally join SPEC to be able to ask for funds (travel etc.) in order to build the SPEC HPSC/Parkbench alliance. Jack D. then indicated that he would pay $1K to SPEC for Parkbench to be an affiliate member. Parkbench would be the member name. All attendees agreed that Jack D.'s offer, which would allow Parkbench to formally join SPEC/HPSC, be accepted.

ACTION ITEM: Jack D. will contact SPEC/HPSC about Parkbench membership.
At 10:08am, Roger asked for status reports from the subgroups. David B. indicated that there had been no activity from the Methodology group. However, David B. asked whether vendors should be asked to conform to certain code standards, and whether vendors would still write low-level codes for absolute performance. David W. suggested that for CA's we need sequential, MP, and HPF versions (CA = Compact Application). David B. suggested that separate reporting categories be used (pencil-paper versions allow a variety of versions to be produced). Roger H. pointed out that the hierarchical structure of Parkbench has had some recent successes with FFT's and Genesis Benchmarks. One can categorize kernels with 3 hardware parameters, for example. Charles G. pointed out that we need to get away from single point performance characterization. David B. suggested that parameterization of kernels and CA's be on-going research. Alistair D. pointed out that this can only be done with codes that are well-understood. Pat W. suggested that there might be specification problems with using low-level benchmarks to explain many versions of a parallel application code. Fiona S. suggested that it would be good to instrument/parameterize the CA's. Roger H. suggested that perhaps a working group be formed to address the parameterization issue. Alistair D. suggested that SOTON's FFT-based results be distributed (FFT code from Genesis) to the Parkbench members.

Roger H. then reported on the cleanup of the low-level kernels to make the Parkbench suite and the Genesis versions the same. Jack D. will provide the number of requests for Parkbench codes to date. It was noted that most of the codes are written in Fortran 77 and use PVM 3.2.

ACTION ITEM: Jack D. to provide data on Parkbench code requests from Netlib.

Alistair D. then reported on the Kernel subgroup effort. Jack D.
indicated that the linear algebra kernel components in PVM are now available: dense matrix multiplication, transpose, dense LU, dense QR, and matrix tridiagonalization. He also noted that results for LU, QR, and tridiagonalization on the Intel Paragon are available. Alistair D. indicated that some results for these kernels on the CRAY T3D had been collected. Bodo P. asked if Strassen's method could be used. Jack D. indicated that the current benchmark's validation procedure may fail with Strassen's method. Roger H. pointed out that benchmarks must obtain answers to the same accuracy as the standard method (used in the original benchmark) and that the flop count must not be changed either. All attendees agreed that the author of a benchmark gets to decide what the flop count should be.

Alistair D. asked about the status of the 1-D FFT. David B. will put together a parallel version soon using MP.

ACTION ITEM: David B. will construct parallel 1-D FFT using message-passing.

Bodo P. pointed out that codes should be able to exploit a variety of memory configurations. David B. said the new code will do this.

Alistair D. then asked about the status of the I/O benchmark. Nothing was reported. Pat W. asked if any CA's had any significant I/O activity. David W. indicated that check-pointing was the only significant I/O activity in the CA collection. David B. indicated that he had a benchmark (BT) which monitors the output/writing of arrays that could be used for the Parkbench suite. Charles G. also had similar benchmarks used within CRI. Roger H. then asked for a volunteer to construct a prototype I/O benchmark.

ACTION ITEM: David B. and Jack D. will provide two different types of I/O benchmarks.

Roger H. asked that a description of these benchmarks be provided to Tony H., who chairs the Kernel subgroup. David B. thought that his code might be more of a CA than a kernel.
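
As an aside on the flop-count rule recorded earlier in these minutes, the following is a minimal sketch of how such a rule might be applied when reporting a rate. It is an illustration only, not Parkbench code: the function names, the nominal count of 2*n**3 for an n x n matrix multiply, and the tolerance are all assumptions. The idea is that the reported rate divides a fixed nominal operation count by the measured time, so a faster algorithm (Strassen's, say) is credited with the time it saves but may not report a different flop count, and it must still agree with the standard method to within the stated accuracy.

```python
import time

def nominal_matmul_flops(n):
    """Operation count credited for an n x n multiply (assumed here: 2*n**3)."""
    return 2.0 * n ** 3

def standard_matmul(a, b):
    """Reference triple-loop multiply; stands in for the 'standard method'."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_abs_diff(x, y):
    """Largest element-wise difference between two square matrices."""
    return max(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))

def report(candidate, a, b, tol=1e-10):
    """Time `candidate`, validate it against the reference, return (valid, Mflop/s).

    The rate always uses the nominal count, whatever algorithm actually ran.
    """
    start = time.perf_counter()
    result = candidate(a, b)
    seconds = time.perf_counter() - start
    valid = max_abs_diff(result, standard_matmul(a, b)) <= tol
    # Guard against a zero interval on very small problems.
    mflops = nominal_matmul_flops(len(a)) / (max(seconds, 1e-9) * 1.0e6)
    return valid, mflops
```

Any alternative multiply routine with the same call signature can be passed as `candidate`; the hypothetical `tol` would in practice be whatever accuracy the benchmark author specifies.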
It was then decided that Jack D.'s I/O code would be used for the kernel section of the suite and that David B.'s would be added to the CA suite.

David W. then reported on the status of the compact applications (CA's). He indicated that there are 4 CA's now in Netlib (NCAR/ORNL shallow water, UK QCD, ARCO suite, and OCEAN code). The code characteristics of these CA's are listed below:

QCD -- PVM, Parmacs
ARCO -- PVM & NX
OCEAN -- seq, HPF, PVM, Parmacs
Shallow Water -- PVM, NX, PICL

All of the above codes were formally submitted via application form. Jack D. noted the existence of several tar files which don't seem to be part of the suite - extraneous codes will be removed from the Netlib repository. David B. tried to get a PIC code but has had no luck thus far. Roger H. is working on a 3-D PIC (Particle-in-Cell) code for the USAF. Mike B. indicated that an environmental modeling code (NOYELP) might be another possibility for a PIC code. David W. indicated that he has a Fortran-90 3D PIC code also.

For the compiler subgroup, Jack D. (speaking for Tom Haupt) indicated that nothing has changed. One concern is the availability of an HPF compiler. Charles G. indicated that CRI is not committed to HPF currently as it is not standard. IBM is working on it.

At 11:21am, Roger asked David B. about the availability of NAS benchmarks for overseas acquisition. New codes without comments on restricted access will be added to Netlib (David B.). Jack D. indicated that he has acquired a PVM version of the NAS Parallel Benchmarks and will put it in Netlib.

ACTION ITEM: David B. and Jack D. to place new versions of the NAS Parallel Benchmarks in Netlib.

David B. indicated that HPF versions of the NAS Parallel Benchmarks have been done by David Serrafini (Rice Univ). For future development, David B. indicated that a multi-zone benchmark with overlapping grids and a serious unstructured grid might be added. These would be full-scale applications rather than CA's.
New larger sizes for the benchmarks are also being considered. He also shared with the attendees the official NASA statement concerning the export of the NAS Parallel Benchmarks...

``Codes used for computer benchmarking are not considered competitively sensitive''

At 11:32am, Alistair D. questioned the cleanup of benchmark codes. He felt standardization of makefiles would be helpful and asked if a language standard should be imposed. Roger H. thought that the Fortran 77 standard was sufficient although some of the codes are not fully Fortran 77. Bodo P. pointed out that Fortran 77 plus extensions (DOD) are now considered standard Fortran 77. It was then noted that PVM 3.2 might also be a standard to use. David B. pointed out that no one wants to rewrite codes with more restrictions. Alistair D. pointed out that makefiles and timer routines do vary across the suite of codes. Mike B. and Bodo P. stressed that single file source is the easiest to maintain. Charles G. indicated that the Genesis benchmarks have too many interactive files which should be removed. Alistair D. asked if any standard documentation should be used. David W. suggested that the code submission form provide documentation. The question of how timer routines should be maintained across different platforms was raised. Mike B. suggested that a common timer template (similar to that used by the Perfect Benchmarks) be used. Charles G. pointed out that TICK1 and TICK2 should be uniformly used within all the codes.

ACTION ITEM: Jack D. and Alistair D. will define the timing routine standard for Parkbench codes.

Alistair D. then suggested that the number of files per benchmark be kept small and that machine-dependent code be isolated.

At noon, lunch was served, and the meeting moved to Jack D.'s on the UT campus for demos of Southampton's GUI and Tennessee's PDS. Alistair D. demo'ed the Southampton GUI and Todd L. demo'ed the current PDS software. Both tools are built using Mosaic.
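
As a rough illustration of the common timer template suggested earlier in these minutes, here is a minimal sketch. It is not the Perfect Benchmarks template and not the Genesis TICK1/TICK2 code; the minutes do not say what those routines measure, so the names `wall_seconds`, `estimate_tick`, and `timed` are hypothetical, and the resolution check is only one plausible thing a standard timing routine might provide. The sketch keeps the single machine-dependent clock call in one place, which is the isolation Alistair D. asked for.

```python
import time

def wall_seconds():
    """The only machine-dependent call; every benchmark would go through this."""
    return time.perf_counter()

def estimate_tick(samples=1000):
    """Estimate clock resolution as the smallest observed nonzero increment."""
    smallest = float("inf")
    for _ in range(samples):
        t0 = wall_seconds()
        t1 = wall_seconds()
        while t1 == t0:  # spin until the clock visibly advances
            t1 = wall_seconds()
        smallest = min(smallest, t1 - t0)
    return smallest

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = wall_seconds()
    result = fn(*args)
    return result, wall_seconds() - start
```

A benchmark would then compare its measured interval against `estimate_tick()` and repeat the kernel enough times that the interval is many ticks long; porting to a new platform would mean changing `wall_seconds` only.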
The demos were completed at 2:10pm. Roger H. then asked about the status of the BOFS's at the upcoming Supercomputing '94 conference in Washington, DC. Jack D. indicated that he had made a request for a Parkbench BOFS but has heard nothing back. Attendees agreed that there should be a joint BOFS with SPEC/HPSC (which should be open to the public). Since there would be a total of 3 BOFS (SPEC/HPSC, Parkbench, combined), separate evenings will be targeted.

ACTION ITEM: Fiona S. will make a request to the SC'94 Program Committee for the joint Parkbench-SPEC/HPSC BOFS.

Alistair D. suggested that a hardcopy of Parkbench performance results be distributed at SC'94 along with Mosaic instructions for the GUI/performance graphs and PDS tools.

ACTION ITEM: Alistair D. will make specifications for submitting data to the database used by the Southampton GUI and PDS.

At 2:20pm, Roger H. asked the group where Parkbench should be headed in the near future. Everyone agreed that data acquisition and analysis should consume future efforts. Roger indicated that Southampton will conduct research along the lines of "Parkbench Performance Profiles" in which, say, 10 numbers are used to characterize parallel performance. Fiona S. pointed out, however, that one has to have a different performance model for the optimized codes (i.e., data-fitting may not be appropriate). David W. suggested that more CA's be collected. Charles G. remarked that the current set is not a good random sample of application codes. Jack D. suggested that "good" codes be acquired and that the public then be solicited for what is needed.

Jack D. then asked the attendees to list potential Parkbench organizations and applications. The list follows: Ask to join: ----------------------- LANL (Margaret Simmons) LLNL (?) Applications to consider: ------------------------ PIC Structural Mechanics Molecular Dynamics (Fiona S.
has a list of public-domain MD codes) Databases N-body Solvers Device Simulation (PISCES) Astrophysics (Bodo P. may be able to check on these) Environmental Modeling EuroBen (Single processor benchmarks) PEPS (Funding hiatus) At 2:55pm, Roger H. raised the question of selecting a new chairman. Since Tony H. indicated his willingness to be chairman (by proxy), the attendees voted that Tony Hey of Southampton be the next Parkbench Chairman. Jack D. indicated that he would be willing to be Vice-Chairman and the group accepted his offer. Jack D. will also investigate the video-conferencing option and send specifications of the type of technology that might be used for Parkbench future meetings. Roger H. adjourned the meeting at 3:06pm EDT --------------------------------------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Thu Sep 1 17:13:41 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id RAA16285; Thu, 1 Sep 1994 17:13:40 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA13554; Thu, 1 Sep 1994 17:13:02 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 1 Sep 1994 17:13:01 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from Arco.COM by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA13544; Thu, 1 Sep 1994 17:12:56 -0400 Received: from xanadu.ease.Arco.COM ([130.201.40.24]) by Arco.COM (4.1/SMI-4.1) id AA25367; Thu, 1 Sep 94 16:12:21 CDT Received: from siamak by xanadu.ease.Arco.COM (5.65/Ultrix3.0-C) id AA15986; Thu, 1 Sep 1994 16:12:20 -0500 Received: by siamak (931110.SGI/930416.SGI.AUTO) for @xanadu:pbwg-comm@CS.UTK.EDU id AA05356; Thu, 1 Sep 94 16:18:45 -0500 From: "Chuck Mosher" Message-Id: <9409011618.ZM5354@siamak> Date: Thu, 1 Sep 1994 16:18:42 -0500 In-Reply-To: "Michael W. 
Berry" "Minutes of last meeting" (Sep 1, 2:51pm) References: <199409011851.OAA14248@berry.cs.utk.edu> X-Mailer: Z-Mail (3.1.0 22feb94 MediaMail) To: pbwg-comm@CS.UTK.EDU Subject: Re: Minutes of last meeting Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 On Sep 1, 2:51pm, Michael W. Berry wrote: > Subject: Minutes of last meeting > Please send corrections to the minutes below. > Thanks, Mike > Alistair D. > then asked about the status of the I/O benchmark. Nothing was reported. > Pat W. asked if any CA's had any significant I/O activity? David W. indicated > that check-pointing was the only significant I/O activity in the CA collection. > David B. indicated that he had a benchmark (BT) which monitors the output/ > writing of arrays that could be used for the Parkbench suite. Charles G. > also had similar benchmarks used within CRI. Roger H. then asked for > a volunteer to construct a prototype I/O benchmark. > > ACTION ITEM: David B. and Jack D. will provide two different types of I/O > benchmarks. The routines in the ARCO suite perform intense I/O operations, with both sequential and transposed (scattered) access pattterns. Data volumes range from 10 MB (small) to 48 GB (huge). The ARCO suite provides a set of routines for parallel I/O on regular multi-D arrays with support for files > 2GB and software striping across file systems. The routines have been tested on a wide range of hardware and software configurations. 
-Chuck Mosher From owner-parkbench-comm@CS.UTK.EDU Thu Sep 1 20:50:59 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id UAA19423; Thu, 1 Sep 1994 20:50:59 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id UAA25327; Thu, 1 Sep 1994 20:50:34 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 1 Sep 1994 20:50:30 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from berry.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id UAA25308; Thu, 1 Sep 1994 20:50:28 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (cf v2.9c-UTK) id UAA14582; Thu, 1 Sep 1994 20:50:29 -0400 Message-Id: <199409020050.UAA14582@berry.cs.utk.edu> To: "Chuck Mosher" cc: pbwg-comm@CS.UTK.EDU Subject: Re: Minutes of last meeting In-reply-to: Your message of "Thu, 01 Sep 1994 16:18:42 CDT." <9409011618.ZM5354@siamak> Date: Thu, 01 Sep 1994 20:50:28 -0400 From: "Michael W. Berry" > On Sep 1, 2:51pm, Michael W. Berry wrote: > > Subject: Minutes of last meeting > > Please send corrections to the minutes below. > > Thanks, Mike > > > Alistair D. > > then asked about the status of the I/O benchmark. Nothing was reported. > > Pat W. asked if any CA's had any significant I/O activity? David W. > indicated > > that check-pointing was the only significant I/O activity in the CA > collection. > > David B. indicated that he had a benchmark (BT) which monitors the output/ > > writing of arrays that could be used for the Parkbench suite. Charles G. > > also had similar benchmarks used within CRI. Roger H. then asked for > > a volunteer to construct a prototype I/O benchmark. > > > > ACTION ITEM: David B. and Jack D. will provide two different types of I/O > > benchmarks. > > The routines in the ARCO suite perform intense I/O operations, with both > sequential and transposed (scattered) access pattterns. Data volumes range > from 10 MB (small) to 48 GB (huge). 
The ARCO suite provides a set of routines > for parallel I/O on regular multi-D arrays with support for files > 2GB and > software striping across file systems. The routines have been tested on a > wide range of hardware and software configurations. > > -Chuck Mosher > Thanks for your comment Chuck. I'll add this information to our minutes. Mike > ----------------------------------------------------- Michael W. Berry |\---/| \\ Ayres Hall 114 (.o o.) // Department of Computer Science -=~+~=-// Knoxville, TN 37996-1301 oO( )Oo berry@cs.utk.edu (_)\ | /(_) OFF:(615) 974-3838 FAX:(615) 974-4404 OoooO URL:http://www.netlib.org/utk/people/MichaelBerry.html ------------------------------------------------------ From owner-parkbench-comm@CS.UTK.EDU Fri Sep 2 11:06:00 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id LAA29092; Fri, 2 Sep 1994 11:05:59 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA12158; Fri, 2 Sep 1994 11:05:01 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 2 Sep 1994 11:05:00 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from spica.npac.syr.edu by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA12150; Fri, 2 Sep 1994 11:04:54 -0400 From: Received: from aldebaran.npac.syr.edu by spica.npac.syr.edu (4.1/I-1.98K) id AA19451; Fri, 2 Sep 94 11:04:41 EDT Message-Id: <9409021504.AA14079@aldebaran.npac.syr.edu> Received: from localhost.syr.edu by aldebaran.npac.syr.edu (4.1/N-0.12) id AA14079; Fri, 2 Sep 94 11:04:31 EDT To: "Michael W. Berry" Cc: pbwg-comm@CS.UTK.EDU, haupt@npac.syr.edu Subject: Re: Minutes of last meeting In-Reply-To: Your message of "Thu, 01 Sep 94 14:51:36 EDT." <199409011851.OAA14248@berry.cs.utk.edu> Date: Fri, 02 Sep 94 11:04:30 -0400 > >David B. indicated that HPF versions of the NAS Parallel Benchmarks have >been done by David Serrafini (Rice Univ). > I tried finger and Rice Mosaic pages to find this person in vain. 
Could anyone help me to get in touch with David Serrafini, please? Thanks, Tom From owner-parkbench-comm@CS.UTK.EDU Fri Sep 2 11:14:19 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id LAA29235; Fri, 2 Sep 1994 11:14:19 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA12945; Fri, 2 Sep 1994 11:14:06 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 2 Sep 1994 11:14:05 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from wk49.nas.nasa.gov by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id LAA12930; Fri, 2 Sep 1994 11:14:00 -0400 Received: (from dbailey@localhost) by wk49.nas.nasa.gov (8.6.8.1/NAS.5.b) id IAA17382; Fri, 2 Sep 1994 08:12:34 -0700 Date: Fri, 2 Sep 1994 08:12:34 -0700 From: dbailey@nas.nasa.gov (David H. Bailey) Message-Id: <199409021512.IAA17382@wk49.nas.nasa.gov> To: haupt@npac.syr.edu CC: berry@CS.UTK.EDU, pbwg-comm@CS.UTK.EDU, haupt@npac.syr.edu In-reply-to: <9409021504.AA14079@aldebaran.npac.syr.edu> (haupt@npac.syr.edu) Subject: Re: Minutes of last meeting Reply-to: dbailey@nas.nasa.gov Try serafini, not serrafini DHB From owner-parkbench-comm@CS.UTK.EDU Fri Sep 2 19:03:45 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id TAA14854; Fri, 2 Sep 1994 19:03:44 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id TAA18602; Fri, 2 Sep 1994 19:02:27 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 2 Sep 1994 19:02:26 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from berry.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id TAA18596; Fri, 2 Sep 1994 19:02:24 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (cf v2.9c-UTK) id TAA15940; Fri, 2 Sep 1994 19:02:22 -0400 Message-Id: <199409022302.TAA15940@berry.cs.utk.edu> to: pbwg-comm@CS.UTK.EDU Subject: Edited minutes Date: Fri, 02 Sep 1994 19:02:21 -0400 From: "Michael W. 
Berry"

Here's the edited version of the minutes I have posted to comp.parallel and comp.benchmarks.

Regards,
Mike

---------------------------------------------------------------------
Minutes of the Parkbench Meeting - Knoxville Hilton - August 29, 1994
---------------------------------------------------------------------

List of Attendees:

Michael Berry (Univ. of Tennessee, berry@cs.utk.edu)
David Bailey (NASA, dbailey@nas.nasa.gov)
Jack Dongarra (Univ. of Tennessee / ORNL, dongarra@cs.utk.edu)
Alistair Dunlop (Univ. of Southampton, and@ecs.soton.ac.uk)
Charles Grassl (CRI, cmg@cray.com)
Roger Hockney (Univ. of Southampton, rwh@pac.soton.ak.uk)
Todd Letsche (Univ. of Tennessee, letsche@cs.utk.edu)
Bodo Parady (Sun Microsystems, bodo.parady@eng.sun.com)
Fiona Sim (IBM Kingston, fsim@vnet.ibm.com)
David Walker (ORNL, walker@msr.epm.ornl.gov)
Pat Worley (ORNL, worleyph@ornl.gov)

At 9:10am Jack D. provided details of the NSE (National Software Exchange) on the world-wide-web which will put a framework around high performance computing (HPC) activities. The NSE URL is given below.

http://netlib2.cs.utk.edu/nse/home.html

Roger H. started the formal meeting at 9:15am. Mike B. read the minutes from the March 28 Parkbench meeting which were accepted with minor revisions.

At 9:30am, Roger H. noted that the first Parkbench report has now appeared in the Summer 1994 issue of "Scientific Programming" (45 pages), published by Wiley. Roger noted that minor editing to the current on-line version was required, so the two versions are not quite the same. The question of reprints was brought up. Jack D. indicated that PostScript and HTML versions of the paper are available and that a reference to the hardcopy could be easily added. One suggestion was to scan the cover page of the hardcopy for the HTML version.

Roger H. then asked Fiona S. to comment on the Parkbench/SPEC-HPSC relationship. Fiona gave a brief talk discussing the current interactions between the two groups.
She provided a list of the current SPEC/HPSC members and noted that an organization must pay $5K to be an Assoc. Member and $1K to be an Affiliate Member. SPEC's aim is to look at "real-world" applications for standardized cross platform comparison on a level playing field. The focus is on symmetric multiprocessor systems, workstation clusters, distributed-memory parallel systems, vector and vector parallel supercomputers. Fiona stressed that "end users" are to be the ultimate beneficiaries of the SPEC/HPSC efforts. Basic guidelines have been adopted and their codes must be written in Fortran 77, ANSI C and MPI or PVM. They are currently looking at codes from the following application areas: seismic data processing (ARCO), computational chemistry (GAMESS & AMBER), and computational fluid dynamics (TURB3D). The SPEC/HPSC contacts are Siamak Hassanzadeh and Fiona Sim.

From a previous conference call involving Siamak H., Jack D., Fiona S., and David W. it was generally decided that SPEC/HPSC should draw on Parkbench experience and expertise. Parkbench is clearly more research oriented than SPEC/HPSC, but Parkbench can draw on SPEC/HPSC experience (commercial user interests, vendor interests). Fiona encouraged collaboration and interaction between the two groups and suggested that a joint BOFS be held at Supercomputing '94. Another suggestion was to have one joint meeting per year (perhaps in Knoxville). Parkbench participation at SPEC/HPSC meetings was encouraged, and the next SPEC/HPSC meeting is scheduled for September. SPEC/HPSC plans to have 4 meetings each year. Bodo P. pointed out that SPEC is based on vendor compliance and is truly a volunteer effort. SPEC/HPSC needs application codes based on finite elements and engineering analysis. A weather code is being considered but is not an MP (message-passing) code. Pat W. mentioned that many versions of ORNL's Shallow Water code are available (but it is really a compact application).
A human genome application is another possibility. Fiona S. finished her discussion at 9:52am.

Jack D. felt that joint meetings at Supercomputing conferences would be a good idea and that a Parkbench representative would need to attend the other regularly-scheduled SPEC/HPSC meetings. Fiona S. pointed out that SPEC/HPSC only wants application codes, not kernels. Jack D. asked where the funds from SPEC annual fees actually go. Bodo P. pointed out that some funds could go to travel costs for affiliate members (university folks). Both Fiona S. and Bodo P. felt that universities would need to formally join SPEC to be able to ask for funds (travel etc.) in order to build the SPEC/HPSC-Parkbench alliance. Jack D. then indicated that he would pay $1K to SPEC for Parkbench to be an affiliate member. Parkbench would be the member name. All attendees agreed that Jack D.'s offer, which would allow Parkbench to formally join SPEC/HPSC, be accepted.

ACTION ITEM: Jack D. will contact SPEC/HPSC about Parkbench membership.

At 10:08am, Roger asked for status reports from the subgroups. David B. indicated that there had been no activity from the Methodology group. However, David B. asked whether vendors should be asked to conform to certain code standards, and whether vendors would still write low-level codes for absolute performance. David W. suggested that for CA's (CA = Compact Application) we need sequential, MP, and HPF versions. David B. suggested that separate reporting categories be used (pencil-and-paper versions allow a variety of versions to be produced). Roger H. pointed out that the hierarchical structure of Parkbench has had some recent successes with FFT's and Genesis Benchmarks. One can categorize kernels with 3 hardware parameters, for example. Charles G. pointed out that we need to get away from single point performance characterization. David B. suggested that parameterization of kernels and CA's be on-going research. Alistair D.
pointed out that this can only be done with codes that are well-understood. Pat W. suggested that there might be specification problems with using low-level benchmarks to explain many versions of a parallel application code. Fiona S. suggested that it would be good to instrument/parameterize the CA's. Roger H. suggested that perhaps a working group be formed to address the parameterization issue. Alistair D. suggested that SOTON's FFT-based results be distributed (FFT code from Genesis) to the Parkbench members.

Roger H. then reported on the cleanup of the low-level kernels to make the Parkbench suite and the Genesis versions the same. Jack D. will provide the number of requests for Parkbench codes to date. It was noted that most of the codes are written in Fortran 77 and use PVM 3.2.

ACTION ITEM: Jack D. to provide data on Parkbench code requests from Netlib.

Alistair D. then reported on the Kernel subgroup effort. Jack D. indicated that the linear algebra kernel components in PVM are now available: dense matrix multiplication, transpose, dense LU, dense QR, and matrix tridiagonalization. He also noted that results for LU, QR, and tridiagonalization on the Intel Paragon are available. Alistair D. indicated that some results for these kernels on the CRAY T3D had been collected. Bodo P. asked if Strassen's method could be used. Jack D. indicated that the current benchmark's validation procedure may fail with Strassen's method. Roger H. pointed out that benchmarks must obtain answers to the same accuracy as the standard method (used in the original benchmark) and that the flop count must not be changed either. All attendees agreed that the author of a benchmark gets to decide what the flop count should be.

Alistair D. asked about the status of the 1-D FFT. David B. will put together a parallel version soon using MP.

ACTION ITEM: David B. will construct parallel 1-D FFT using message-passing.

Bodo P.
pointed out that codes should be able to exploit a variety of memory configurations. David B. said the new code will do this.

Alistair D. then asked about the status of the I/O benchmark. Nothing was reported. Pat W. asked if any CA's had any significant I/O activity. David W. indicated that check-pointing was the only significant I/O activity in the CA collection. David B. indicated that he had a benchmark (BHT) which monitors the output/writing of arrays that could be used for the Parkbench suite. Charles G. also had similar benchmarks used within CRI. Roger H. then asked for a volunteer to construct a prototype I/O benchmark.

ACTION ITEM: David B. and Jack D. will provide two different types of I/O benchmarks.

SIDE COMMENT: (via email from Chuck Mosher at ARCO on Sept. 1, 1994) "The routines in the ARCO suite perform intense I/O operations, with both sequential and transposed (scattered) access patterns. Data volumes range from 10 MB (small) to 48 GB (huge). The ARCO suite provides a set of routines for parallel I/O on regular multi-D arrays with support for files > 2GB and software striping across file systems. The routines have been tested on a wide range of hardware and software configurations."

Roger H. asked that a description of these benchmarks be provided to Tony H., who chairs the Kernel subgroup. David B. thought that his code might be more of a CA than a kernel. It was then decided that Jack D.'s I/O code would be used for the kernel section of the suite and that David B.'s would be added to the CA suite.

David W. then reported on the status of the compact applications (CA's). He indicated that there are 4 CA's now in Netlib (NCAR/ORNL shallow water, UK QCD, ARCO suite, and OCEAN code). The code characteristics of these CA's are listed below:

QCD -- PVM, Parmacs
ARCO -- PVM & NX
OCEAN -- seq, HPF, PVM, Parmacs
Shallow Water -- PVM, NX, PICL

All of the above codes were formally submitted via application form. Jack D.
noted the existence of several tar files which don't seem to be part of the suite - extraneous codes will be removed from the Netlib repository. David B. tried to get a PIC code but has had no luck thus far. Roger H. is working on a 3-D PIC (Particle-in-Cell) code for the USAF. Mike B. indicated that an environmental modeling code (NOYELP) might be another possibility for a PIC code. David W. indicated that he has a Fortran-90 3D PIC code also.

For the compiler subgroup, Jack D. (speaking for Tom Haupt) indicated that nothing has changed. One concern is the availability of an HPF compiler. Charles G. indicated that CRI is not committed to HPF currently as it is not standard. IBM is working on it.

At 11:21am, Roger asked David B. about the availability of NAS benchmarks for overseas acquisition. New codes without comments on restricted access will be added to Netlib (David B.). Jack D. indicated that he has acquired a PVM version of the NAS Parallel Benchmarks and will put it in Netlib.

ACTION ITEM: David B. and Jack D. to place new versions of the NAS Parallel Benchmarks in Netlib.

David B. indicated that HPF versions of the NAS Parallel Benchmarks have been done by David Serafini (Rice Univ). For future development, David B. indicated that a multi-zone benchmark with overlapping grids and a serious unstructured grid might be added. These would be full-scale applications rather than CA's. New larger sizes for the benchmarks are also being considered. He also shared with the attendees the official NASA statement concerning the export of the NAS Parallel Benchmarks...

``Codes used for computer benchmarking are not considered competitively sensitive''

At 11:32am, Alistair D. questioned the cleanup of benchmark codes. He felt standardization of makefiles would be helpful and asked if a language standard should be imposed. Roger H. thought that the Fortran 77 standard was sufficient although some of the codes are not fully Fortran 77. Bodo P.
pointed out that Fortran 77 plus extensions (DOD) are now considered standard Fortran 77. It was then noted that PVM 3.2 might also be a standard to use. David B. pointed out that no one wants to rewrite codes with more restrictions. Alistair D. pointed out that makefiles and timer routines do vary across the suite of codes. Mike B. and Bodo P. stressed that single file source is the easiest to maintain. Charles G. indicated that the Genesis benchmarks have too many interactive files which should be removed. Alistair D. asked if any standard documentation should be used. David W. suggested that the code submission form provide documentation. The question of how timer routines should be maintained across different platforms was raised. Mike B. suggested that a common timer template (similar to that used by the Perfect Benchmarks) be used. Charles G. pointed out that TICK1 and TICK2 should be uniformly used within all the codes.

ACTION ITEM: Jack D. and Alistair D. will define the timing routine standard for Parkbench codes.

Alistair D. then suggested that the number of files per benchmark be kept small and that machine-dependent code be isolated.

At noon, lunch was served, and the meeting moved to Jack D.'s on the UT campus for demos of Southampton's GUI and Tennessee's PDS. Alistair D. demo'ed the Southampton GUI and Todd L. demo'ed the current PDS software. Both tools are built using Mosaic. The demos were completed at 2:10pm.

Roger H. then asked about the status of the BOFS's at the upcoming Supercomputing '94 conference in Washington, DC. Jack D. indicated that he had made a request for a Parkbench BOFS but has heard nothing back. Attendees agreed that there should be a joint BOFS with SPEC/HPSC (which should be open to the public). Since there would be a total of 3 BOFS (SPEC/HPSC, Parkbench, combined), separate evenings will be targeted.

ACTION ITEM: Fiona S. will make a request to the SC'94 Program Committee for the joint Parkbench-SPEC/HPSC BOFS.

Alistair D.
suggested that a hardcopy of Parkbench performance results be distributed at SC'94 along with Mosaic instructions for the GUI/performance graphs and PDS tools.

ACTION ITEM: Alistair D. will make specifications for submitting data to the database used by the Southampton GUI and PDS.

At 2:20pm, Roger H. asked the group where Parkbench should be headed in the near future. Everyone agreed that data acquisition and analysis should consume future efforts. Roger indicated that Southampton will conduct research along the lines of "Parkbench Performance Profiles" in which, say, 10 numbers are used to characterize parallel performance. Fiona S. pointed out, however, that one has to have a different performance model for the optimized codes (i.e., data-fitting may not be appropriate). David W. suggested that more CA's be collected. Charles G. remarked that the current set is not a good random sample of application codes. Jack D. suggested that "good" codes be acquired and that the public then be solicited for what is needed.

Jack D. then asked the attendees to list potential Parkbench organizations and applications. The list follows: Ask to join: ----------------------- LANL (Margaret Simmons) LLNL (?) Applications to consider: ------------------------ PIC Structural Mechanics Molecular Dynamics (Fiona S. has a list of public-domain MD codes) Databases N-body Solvers Device Simulation (PISCES) Astrophysics (Bodo P. may be able to check on these) Environmental Modeling EuroBen (Single processor benchmarks) PEPS (Funding hiatus)

At 2:55pm, Roger H. raised the question of selecting a new chairman. Since Tony H. indicated his willingness to be chairman (by proxy), the attendees voted that Tony Hey of Southampton be the next Parkbench Chairman. Jack D. indicated that he would be willing to be Vice-Chairman and the group accepted his offer. Jack D.
will also investigate the video-conferencing option and send specifications of the type of technology that might be used for Parkbench future meetings.

Roger H. adjourned the meeting at 3:06pm EDT
---------------------------------------------------------------------

From owner-parkbench-comm@CS.UTK.EDU Wed Sep 14 16:27:55 1994
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id QAA14699; Wed, 14 Sep 1994 16:27:54 -0400
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id QAA05851; Wed, 14 Sep 1994 16:26:17 -0400
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 14 Sep 1994 16:26:15 EDT
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from dasher.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id QAA05844; Wed, 14 Sep 1994 16:26:14 -0400
From: Jack Dongarra
Received: by dasher.cs.utk.edu (cf v2.9c-UTK) id QAA01660; Wed, 14 Sep 1994 16:26:11 -0400
Date: Wed, 14 Sep 1994 16:26:11 -0400
Message-Id: <199409142026.QAA01660@dasher.cs.utk.edu>
To: parkbench-comm@CS.UTK.EDU
Subject: SC94 and ParkBench

We are set for a birds-of-a-feather session at the SC-94 meeting in DC. The meeting is scheduled for Thursday, November 17th, between 10:30-12:00, location to be determined.
Jack

From owner-parkbench-comm@CS.UTK.EDU Mon Sep 19 09:20:19 1994
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id JAA26181; Mon, 19 Sep 1994 09:20:19 -0400
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA01805; Mon, 19 Sep 1994 09:19:19 -0400
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Mon, 19 Sep 1994 09:19:17 EDT
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from rios2.EPM.ORNL.GOV by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id JAA01797; Mon, 19 Sep 1994 09:19:16 -0400
Received: (from walker@localhost) by rios2.EPM.ORNL.GOV (8.6.8/8.6.6) id JAA17682; Mon, 19 Sep 1994 09:19:14 -0400
From: David Walker
Message-Id: <199409191319.JAA17682@rios2.EPM.ORNL.GOV>
To: pbwg-comm@CS.UTK.EDU
Subject: Benchmarking Workshop
Date: Mon, 19 Sep 94 09:19:14 -0500

First Announcement: Call for Participation

Workshop on Performance Evaluation & Benchmarking of Parallel Systems

A joint workshop of various parties that are involved in the performance evaluation of high-performance computing systems will be held at the University of Warwick (Coventry, England) on December 15--16, 1994. Participants will include EuroBen, PARKBENCH, and PEPS. Topics include, but are not limited to:

TOPICS:
- Structure of benchmarks.
- Machine characterisation.
- Monitoring systems.
- Recent results of the various benchmarks.
- New benchmark initiatives and opportunities for cooperation.
- Simulation.
- Interpretation of results.
- (Re)presentation of performance results.

CONTRIBUTIONS: 40 minutes will be allowed for each presentation including discussions. However, short contributions can be included for subjects of sufficient interest. It is our intention to publish the (possibly selected) contributions either as separate proceedings or as an issue of the journal SUPERCOMPUTER. An ASCII version of the presentation sent on disk or via email is required by the 14th of October.

COST: The workshop fee is 150 Pounds Sterling.
The fee includes food during the conference, and accommodation on the night of the 15th, as well as one copy of the proceedings. Accommodation can also be made available on the 14th for early registrants at 55 pounds for bed and breakfast, and 20 pounds for dinner.

Program: The formal program will not be announced until shortly before the workshop takes place to allow for maximum flexibility.

NUMBER OF PARTICIPANTS: The maximum number of participants is 35 to enable a high degree of interaction and discussions to take place between participants.

Registration: For further information and registration details please contact:

Prof. Graham R. Nudd          or;  Aad J. van der Steen
Dept. of Computer Science          EuroBen
University of Warwick              c/o Academic Computing Centre Utrecht
Coventry CV4 7AL                   Budapestlaan 6
England                            3584 CD Utrecht
Tel +44--203--523193               The Netherlands
Fax +44--203--525714               Tel +31--30--531444
Email conf@dcs.warwick.ac.uk       Fax +31--30--531633
                                   Email actstea@cc.ruu.nl

From owner-parkbench-comm@CS.UTK.EDU Fri Oct 7 14:46:11 1994
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id OAA20753; Fri, 7 Oct 1994 14:46:10 -0400
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA06396; Fri, 7 Oct 1994 14:44:32 -0400
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 7 Oct 1994 14:44:31 EDT
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from berry.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id OAA06390; Fri, 7 Oct 1994 14:44:30 -0400
Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (cf v2.9c-UTK) id OAA02274; Fri, 7 Oct 1994 14:44:23 -0400
Message-Id: <199410071844.OAA02274@berry.cs.utk.edu>
to: pbwg-comm@CS.UTK.EDU
Subject: SPEC/HPSC meeting
Date: Fri, 07 Oct 1994 14:44:23 -0400
From: "Michael W. Berry"

Sorry this is a bit late. The compact applications subcommittee may find some of the information I collected at a recent SPEC/HPSC meeting useful.
Regards,
Mike

------------------------------------------------------------------
Notes from the SPEC/HPSC meeting on Wed. Sept. 28 at AT&T (Naperville, IL)
------------------------------------------------------------------

At 9:00am, I gave a 30-minute report of Parkbench activities. There were a few questions on the availability of codes and selected problem sizes. The URL's for PDS and the Parkbench Home Page were given to the attendees. Larry Gray from HP asked if vendors could supply reports to go along with the SPEC results -- I indicated "yes" of course and that UT could take care of any necessary HTML conversion.

At 9:30am, Ken Koch from Los Alamos discussed benchmarking efforts at LANL. He mentioned 8 application codes as well as a few kernel codes that they are working on. Most of the application codes arise from CFD and are described below. The SWEEP3D code may be of interest to the Compact Applications Subcommittee.

EULER [3-D Multi-material hydro code, large grain computation, written for CM-2 and ported to CM-5, working on a MP-version, available versions: CMF,CF77/CF90]

MCNP [3-D Monte Carlo transport, huge verification suite with 2 performance ranges, old code started in 70's, Fortran-77 but PVM version for clusters available, versions: F77/PVM,CF77/AT, F77/MPI] Note: AT = autotasked.

SWEEP3D [3-D S_N Transport, compact representation, recursive in space, local node indirection, fully instrumented, 4 datasets available, only about 1K lines, regular distributed in 2-D, fine grain computation, data parallel on CM-5, versions: CMF, CF77/AT,F77/PVM]

NIKE [Tetrahedral-mesh S_N Transport, unstructured mesh, only available in CMF]

AMR [AMR by patches, uses adaptive refinement, special array class library, deferred scheduling of operations, versions: CMF, C++/P++]

PUEBLO [Gamma-law Hydro code, originally given to Perfect Benchmarks, old code, versions: F77,CF77/AT,CMF]

Note: Regular mesh CFD codes are EULER, MCNP, and SWEEP3D.
POP [Parallel Ocean code, part of CHAMMP, restricted access, versions: F77/DPEAC/CMMD, F77/PVM/MPI/NXLIB/SHMEM, CMF/DPEAC]

SPASM [Molecular Dynamics, restricted access, Gordon Bell winner, versions: C/DPEAC/CMMD, C/SHMEM]

At 10:30am, a general discussion of SPEC policies took place. SPEC confidentiality of results was mentioned. After results are submitted to NCGA, an editor reviews them and passes them on to a review committee before they are published in the SPEC newsletter. HPSC benchmarks are only used for code selection at this time and are not considered for public knowledge. Dave Kuck proposed that a formal statement of confidentiality be drafted and Fiona Sim from IBM helped construct the statement with help from the attendees.

An issue of file/makefile structure for benchmarks came up during the meeting and it seemed clear to me that SPEC has a formal layout of how directories for each SPEC code should be designed. Parkbench and SPEC/HPSC could both benefit from this structure perhaps.

At 11:30am, Sia Hassanzadeh (Sun) began the discussion on benchmark code status. He showed performance results for the ARCO benchmark on SPARCstations and results for the NEC SX-3/44R (1-4 CPU's) were also provided. On a SPARCstation 10 (50 MHz), the small dataset took only 14 minutes while the large dataset required over 2 hours. The NEC machine appeared to scale well with the following results on the large dataset:

    1CPU - 1,104 seconds
    2CPU -   560 seconds
    4CPU -   330 seconds
    (provided by Dennis Ellis, NEC)

The codes in the ARCO suite are currently in single precision and the verification is done via checksums and graphics. The codes are now written using PVM and an interesting comparison could be made running with and without SHMEM.

After a lunch break, Fiona Sim reviewed the computational chemistry benchmark codes considered thus far.
She mentioned both GAMESS (Iowa State, Mark Gordon and Mike Schmidt) which is Ab Initio Quantum Chemistry and AMBER (UCSF) which is a molecular modeling code. SPEC/HPSC has obtained a reduced version of GAMESS (full version distributed by ISU) which has the computational flow: integrals+SCF+gradients. Fiona mentioned that this code can manage the integrals in two ways: 1) Traditional I/O in which the integrals are recomputed, or 2) "Direct I/O" -- integrals previously computed are read from disk. There is no baseline version for GAMESS. IBM has reduced the code down to 110,000 lines and installed a makefile as an alternative to the many scripts normally used. The serial version had message-passing calls so stubs are used to resolve undefined references. The parallel version has embedded message passing based on "tcgmsg". Note that a "tcgtopvm3.f" program is available to convert tcgmsg to pvm3 (written by IBM). Some of the problems with GAMESS include: dynamic memory allocation using LOC and MEMGET; the dgemm subroutine was missing from blas.f; written in double precision; size of integers could be reduced. Benchmarks for GAMESS are listed below for 1-32 processors of the IBM SP/1 (1 proc is RS/6000-370):

    15 minutes   small problem
    23 hours     large problem (1 proc)
     1 hour      large problem (32 proc) -- only needed 16 megabytes

Regarding AMBER (by P. Kollman), a scaled version has been obtained and should be ready for testing sometime this month.

Another CFD benchmark code was presented by Sharad Galvali at Fujitsu. He discussed LANS3D which is based on an LU-ADI algorithm by Obayashi et al; it is a finite differencing code which is first order implicit in time and 2nd order in space. The code comprises 3,000 lines of Fortran-77 and is well documented. It can be used for Navier-Stokes problems and a parallel version is expected in 1995. For a medium grid (128,975 grid points) of a Delta Wing the code ran in 2,700 seconds on a Fujitsu VPX 240 (4,000 iterations).
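The timings quoted in these notes (NEC SX-3/44R on the large ARCO dataset, and GAMESS on the IBM SP/1) can be turned into parallel speedup and efficiency figures. The following is only a rough Python sketch: the function and variable names are illustrative and not part of any benchmark suite, and the GAMESS inputs are the rounded "23 hours" and "1 hour" values reported above, so the derived numbers are approximate.

```python
def speedup_and_efficiency(t_one_proc, t_parallel, nprocs):
    """Return (speedup, efficiency) relative to the 1-processor time."""
    speedup = t_one_proc / t_parallel
    return speedup, speedup / nprocs


def scaling_table(timings):
    """Map processor count -> (speedup, efficiency); timings[1] is the baseline."""
    baseline = timings[1]
    return {p: speedup_and_efficiency(baseline, t, p)
            for p, t in sorted(timings.items())}


# Wall-clock times in seconds, copied from the notes above.
nec_sx3_arco_large = {1: 1104.0, 2: 560.0, 4: 330.0}
# GAMESS large problem on the IBM SP/1; inputs are rounded to whole hours.
gamess_sp1_large = {1: 23 * 3600.0, 32: 1 * 3600.0}

if __name__ == "__main__":
    for name, timings in (("NEC SX-3/44R, ARCO large", nec_sx3_arco_large),
                          ("IBM SP/1, GAMESS large", gamess_sp1_large)):
        print(name)
        for procs, (s, e) in scaling_table(timings).items():
            print(f"  {procs:2d} proc: speedup {s:6.2f}, efficiency {e:5.2f}")
```

Under these inputs the NEC efficiency falls from roughly 0.99 on 2 CPUs to roughly 0.84 on 4 CPUs, and the GAMESS run on 32 processors comes out near 0.72.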
Runs on a larger grid (853,349 grid points) are anticipated.

------------------------------------------------------------------
The meeting continued past 3:30pm but I left to catch my plane.
------------------------------------------------------------------

-----------------------------------------------------
Michael W. Berry                  |\---/|   \\
Ayres Hall 114                    (.o o.)   //
Department of Computer Science    -=~+~=-//
Knoxville, TN 37996-1301         oO(   )Oo
berry@cs.utk.edu                 (_)\ | /(_)
OFF:(615) 974-3838 FAX:(615) 974-4404  OoooO
URL:http://www.netlib.org/utk/people/MichaelBerry.html
------------------------------------------------------

From owner-parkbench-comm@CS.UTK.EDU Tue Oct 18 09:51:52 1994
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id JAA06672; Tue, 18 Oct 1994 09:51:52 -0400
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA20800; Tue, 18 Oct 1994 09:50:22 -0400
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Tue, 18 Oct 1994 09:50:20 EDT
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from thoth.mch.sni.de by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA20782; Tue, 18 Oct 1994 09:50:16 -0400
Received: from hippo.mch.sni.de by thoth.mch.sni.de with SMTP id AA29400 (5.67a/IDA-1.5 for ); Tue, 18 Oct 1994 14:45:54 +0100
Received: by hippo.mch.sni.de id AA25849 (5.65c/graf-1.0); Tue, 18 Oct 1994 14:45:30 +0100
From: Winfrid Tschiedel
Message-Id: <199410181345.AA25849@hippo.mch.sni.de>
Subject: errors in parkbench/solver_pvm and genesis-3.0/solver
To: pbwg-comm@CS.UTK.EDU, support@par.soton.ac.uk
Date: Tue, 18 Oct 1994 14:45:29 +0100 (MET)
Cc: christi@munich.sgi.com
X-Mailer: ELM [version 2.4 PL22]
Content-Type: text

Hello,

Currently I am working on a large benchmark, where parkbench / genesis is required.
I suppose there are some source errors in the solver benchmark:

parkbench/solver_pvm/pvmgrid.F
genesis-3.0/solver/pvm3/solver.F

This program contains a call to pvmfserror - I suppose it should read pvmfperror, because this is the only routine I found in the PVM library with a similar name.

genesis-3.0/solver/pvm3/solver.F and genesis-3.0/solver/pvm3/solvern.F

Between the PROGRAM statement and #include "messages.h" are two statements

      CHARACTER*80 RCSID
      DATA RCSID / ....

This is not allowed, because the IMPLICIT statement must be the first statement after the PROGRAM, SUBROUTINE, FUNCTION or BLOCK DATA statement; if these statements are missing the IMPLICIT statement must be the first statement.

I suppose there is another error in this part of the genesis-suite: (I did not check the code in the parkbench/solver-pvm)

I hope I understood the README correctly. I tried to run the program without PVM; is it correct that in this case you just have to specify -DFAKE for the preprocessor? After generating the programs, I started solvern - and I got an error because solvern still contains some PVM calls.

Thanks in advance for your help.

Take care,
Winfrid

_______________________________________________________________________________
Siemens Nixdorf Informationssysteme AG D553CC
Email: winfrid.tschiedel@mch.sni.de
Otto-Hahn-Ring 6 Tel.
: ++49-89-636-45652 81739 Muenchen Fax : ++49-89-636-45046 From owner-parkbench-comm@CS.UTK.EDU Mon Oct 24 12:23:20 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id MAA16225; Mon, 24 Oct 1994 12:23:20 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA27698; Mon, 24 Oct 1994 12:21:37 -0400 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Mon, 24 Oct 1994 12:21:36 EDT Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA27685; Mon, 24 Oct 1994 12:21:32 -0400 Via: uk.ac.southampton.beech; Mon, 24 Oct 1994 15:25:53 +0000 Received: from ecs.soton.ac.uk (root@localhost) by beech.soton.ac.uk (8.6.4/%I%) with NIFTP id PAA02173 for pbwg-comm%cs.utk.edu@uk.ac.nsfnet-relay; Mon, 24 Oct 1994 15:28:09 GMT Via: brewery.ecs.soton.ac.uk; Mon, 24 Oct 94 15:26:37 GMT From: Mark Papiani Received: from holt.ecs.soton.ac.uk by brewery.ecs.soton.ac.uk; Mon, 24 Oct 94 15:31:17 GMT Date: Mon, 24 Oct 94 15:31:17 GMT Message-Id: <24949.9410241531@holt.ecs.soton.ac.uk> To: pbwg-comm@CS.UTK.EDU Subject: Results required for Graphical Interface to ParkBench & Genesis Results GBIS Graphical Interface to Genesis and Parkbench Results available on WWW -------------------------------------------------------------------------- The Southampton HPC Centre GBIS Graphical Interface is now available on the WWW. This allows the interactive selection of a benchmark metric and selection of computers from which a performance graph will be generated and displayed. Results will be kept for the Genesis and Parkbench suites. 
The home page can be accessed via URL:-

http://hpcc.soton.ac.uk/RandD/gbis/papiani-new-gbis-top.html

To skip the Home page and go directly to the graph pages use URL:-

http://hpcc.soton.ac.uk/RandD/gbis/papiani-new-gbis.html

For faster results in transferring the graphs, select change defaults on the 'Machine List' page and then postscript format on the 'CHANGE DEFAULTS' page (this is explained in the instructions, accessible from the home page, and is considerably quicker than the default results display; gif images on WWW pages).

Currently Available Results
---------------------------

I have entered results for all the NAS codes which appear in Parkbench; taken from the NAS Parallel Benchmark Report (3-94 and update 9-94). This includes a fairly large number of machines for the following benchmarks.

3DFFT
CONJUGATE GRADIENT
EMBARRASSINGLY PARALLEL
LARGE INTEGER SORT
MULTIGRID

I have also included some results for Genesis COMMS1 (Meiko CS-2 only).

Additional Results Required for SC'94
-------------------------------------

We are hoping to produce some form of report/newsletter for SC'94 which will describe Parkbench along with information on available results. I am therefore looking for further results and would be pleased to hear from anyone who has (or knows the location of) any Parkbench or Genesis results.

Details of how to send results (ftp address and required format) are contained in the GBIS Graphical Interface Instructions, accessible from the GBIS Graphical Interface Home Page or by anonymous ftp to par.soton.ac.uk filename incoming/benchmark_results/readme

Alternatively, contact me on mp@ecs.soton.ac.uk if you can send results but in a different format.

Mark.

_______________________________________________________________________________
Mark Papiani
Department of Electronics & Computer Science
Mountbatten Building, rm.
4037 University of Southampton, S017 1BJ Southampton, ENGLAND Tel: +44 (703) 593368 (direct) Fax: +44 (703) 593045 +44 (703) 595000 (Switchboard) Email: mp@ecs.soton.ac.uk WWW URL: http://hpcc.soton.ac.uk/Staff/ECS/mp.html _______________________________________________________________________________ From owner-parkbench-comm@CS.UTK.EDU Wed Nov 2 05:23:11 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id FAA23466; Wed, 2 Nov 1994 05:23:10 -0500 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id FAA18037; Wed, 2 Nov 1994 05:22:00 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 2 Nov 1994 05:21:59 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from m1.cs.man.ac.uk by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id FAA17994; Wed, 2 Nov 1994 05:21:53 -0500 Received: from r7.cs.man.ac.uk by m1.cs.man.ac.uk (4.1/SMI-4.1:AL5l) id AA05189; Wed, 2 Nov 94 10:21:32 GMT Received: from [130.88.13.21] (asante) by r7.cs.man.ac.uk; Wed, 2 Nov 94 10:21:27 GMT Date: Wed, 2 Nov 94 10:21:27 GMT Message-Id: <9411021021.AA19264@r7.cs.man.ac.uk> X-Sender: daves@r7.cs.man.ac.uk Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" To: hint@scl.ameslab.gov From: "Dave Snelling" Subject: Notes on HINT and QUIPS Cc: parkbench-comm@CS.UTK.EDU John, Quinn, and PARKBENCH, I am making these comments (constructive I hope) openly in order to disseminate the ideas as quickly as possible. I believe that, whether right or wrong, QUIPS and HINT are likely to catch on. The problems below should therefore be rectified quickly. Problem 1: The document describing HINT and the source code claim that the list of intervals is sorted. Even with the well behaved function integrated in the HINT benchmark, the relative difference between the errors of consecutive intervals ranges from plus to *minus* ten percent by the 500 interval mark. I assume that this trend will continue. 
Possible Solution: Correct the documentation to indicate that the list is only partially sorted. Problem 2: The Net QUIPS computation is heavily weighted toward the speed at which the smallest case can be computed. For example, by treating this case specially and returning the answer almost immediately (a very simple optimization not prohibited by the rules), the Net MQUIPS of a Sun SPARC station improved from 0.741 to 1.615. Possible Solution: Start the Net QUIPS at something more reasonable, say 32 intervals rather than two as it stands now. Problem 3: By tuning the parallel version on the KSR to select the number of processors actually used (including the special case from problem 2 above), it was possible to obtain similar variations in the apparent performance of the KSR. This tuning operation basically eliminates the low QUIPS ratings associated with start-up on small numbers of intervals in parallel configurations. Further improvements are also possible by varying the ADVANCE parameter. Possible Solution: Most of these problems can be eliminated by stating exactly how many intervals are to be used at each stage of the Net QUIPS computation. Problem 4: There is no direct verification of the answers. The value of the answer is of course included in the value of the QUIPS rating, but for small numbers of intervals, where the error is large anyway, random numbers returned very quickly could artificially improve results. Possible Solution: Check the actual results. Unlike many benchmark programs HINT's results can be checked to the last bit; why not do it? Problem 5: As it stands, memory use in parallel tests increases simultaneously on all processors. The result on the KSR is that, when testing configurations smaller than the whole system, one level of the memory hierarchy is ignored. On VSM systems like the KSR, the memory of other processors can be used as "paging" space instead of disk. 
A more gradual increase in memory use, for the parallel cases, would allow the benchmark to detect this level of the hierarchy. Possible Solution: In the parallel version, increase memory use in the same way as in the sequential case. Conclusions: I hope that these comments promote further discussion. I believe the quest for a unifying benchmark is futile, but I see no reason for not trying. Good luck. Take care: Dr. David F. Snelling snelling@cs.man.ac.uk Centre for Novel Computing 44-61-275-6134 office Department of Computer Science 44-533-708636 home University of Manchester 44-61-275-6204 fax Manchester, UK M13 9PL From owner-parkbench-comm@CS.UTK.EDU Fri Nov 4 09:02:42 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id JAA25539; Fri, 4 Nov 1994 09:02:41 -0500 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA12556; Fri, 4 Nov 1994 09:01:10 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 4 Nov 1994 09:01:09 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from berry.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id JAA12549; Fri, 4 Nov 1994 09:01:06 -0500 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (cf v2.9c-UTK) id JAA17395; Fri, 4 Nov 1994 09:01:04 -0500 Message-Id: <199411041401.JAA17395@berry.cs.utk.edu> to: parkbench-comm@CS.UTK.EDU Subject: SC'94 Parkbench Meeting Date: Fri, 04 Nov 1994 09:01:04 -0500 From: "Michael W. Berry" A joint SPEC-HPSC/ParkBench meeting has been scheduled at the Supercomputing'94 conference. The meeting will take place at 5:30-7:30 on Tuesday of that week. Stay tuned for details on the location of the meeting. Mike B. 
From owner-parkbench-comm@CS.UTK.EDU Wed Nov 23 19:45:28 1994
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id TAA02701; Wed, 23 Nov 1994 19:45:28 -0500
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id TAA22591; Wed, 23 Nov 1994 19:44:10 -0500
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 23 Nov 1994 19:44:09 EST
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from berry.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id TAA22585; Wed, 23 Nov 1994 19:44:08 -0500
Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (cf v2.9c-UTK) id TAA20552; Wed, 23 Nov 1994 19:44:06 -0500
Message-Id: <199411240044.TAA20552@berry.cs.utk.edu>
to: parkbench-comm@CS.UTK.EDU
Subject: Minutes of SC'94 BOFS
Date: Wed, 23 Nov 1994 19:44:05 -0500
From: "Michael W. Berry"

Please send me all corrections to the following minutes of the joint HPSC/SPEC-PARKBENCH BOFS.

Thanks,
Mike

--------------------------------------------------------------------------
Minutes of SPEC/HPSC - PARKBENCH BOFS at Supercomputing'94 on Nov. 15, 1994 (Washington Convention Center, DC)
--------------------------------------------------------------------------
(Attendance list not available)

Roger H. started the meeting at 5:40pm. Roger handed out a glossy brochure for the PARKBENCH effort. Roger mentioned that 10 benchmarks are now available and the report has been published in "Scientific Programming". Roger had some extra copies of the reprints for those mentioned in the paper. A joint workshop of PARKBENCH/PEPS/EUROBEN has been scheduled at the Univ. of Warwick, UK for Dec 15-16, 1994. Send email to conf@dcs.warwick.ac.uk for more information or to submit papers. This meeting is organized by Prof. Nudd at Warwick.

Tony H. expressed warm thanks to Roger H. for his efforts in guiding the PARKBENCH effort.

Michael B.
read the action items from the August PARKBENCH meeting in Knoxville and it was noted that the suite still required a very long 1-D FFT - to be provided by D. Bailey.

Tony Hey (new chairman) then overviewed the current Parkbench effort. Tony noted the 2 year anniv. of the Parkbench effort. Tony reviewed the objectives and the deliverables of the Parkbench effort whose focus is on distributed memory-based benchmarks. Tony reviewed the 5 subcommittees - he indicated that perhaps we should have an "analysis" subcommittee. Tony then briefly reviewed the actual benchmarks in the PARKBENCH suite: low-level, kernels (all available including the NAS Parallel benchmarks), and compact applications (SEIS1.1,POLMP,SOLVER,PSTSWM, 3 CFD codes from NASA). It was noted that POLMP results have already been published in 3 papers. Tom Haupt's HPF compiler benchmarks were also mentioned.

Tony mentioned the PARKBENCH URL for source codes, reports, documentation. He indicated that one can now view results via PDS and GBIS. He then demo'ed the GBIS graphical interface. This GUI is for both Genesis and PARKBENCH results. Many formats for outputs are available. Several of the URL's for GUI and PDS access were listed:

PARKBENCH desc & report -- http://www.epm.ornl.gov/~walker/parkbench/
PARKBENCH doc & codes -- http://www.netlib.org/parkbench/
GBIS (Genesis Benchmark Information Service) -- http://hpcc.soton.ac.uk/RandD/gbis/papiani-new-gbis-top.html
PDS -- http://netlib2.cs.utk.edu/performance/html/PDStop.html

Tony H. noted that one can get benchmark information with links to NSE and PDS from the GBIS pages. He then demo'ed selections for the EP PARKBENCH code. He indicated that one can modify the graphs or accept the default graphs. One can produce either linear or log scales on the axes.

Regarding Future PARKBENCH work, Tony H. listed the following items.

- parallel version of 1-D FFT
- investigation of suitable codes for I/O kernel
- code uniformity (standard makefiles, etc.)
- reorganize the server repository (release methodology etc.)
- define rules for running the benchmarks
- further results are required (GBIS and for PDS inclusion)
- define a methodology for reporting bugs
- future MPI versions of all codes; benchmarks to test gather/scatter/global sum operations
- shared memory versions (put/get)
- further compact applications coverage
- start up an analysis subcommittee (for low-level kernels)

After concluding the PARKBENCH presentation, Tony H. solicited questions from the audience. Here are some of the dialogues.

Q: How many machines reported?
Rather sparse right now -- depends on what is available. Roger H. pointed out that the database must be expanded in order for an analysis group to have enough to work with. He stressed the need to understand the performance "surface".

Q: Are sequential versions available?
Yes, there are seq. and parallel versions.

Q: Has the PDE2 code essentially replaced the NAS multigrid code?
Yes, codes that are more widely used are preferred. Tony H. indicated that a formal process for including/deleting codes is needed.

Sia Hassanzadeh from Sun Microsystems then overviewed the SPEC/HPSC effort. He provided 2 handouts: 1 on SPEC and 1 on HPSC. Two committees exist - OSSC and HPSC. Two categories of affiliation were described and current membership total is about 58. There are regular members ($5K fee) and associate members ($1K). He pointed out that SPEC/HPSC is a nonprofit organization whose mission is to establish, maintain, and endorse a suite of benchmarks. He distributed a handout "HPSC: A Progress Report" to all the attendees. He then reviewed HPSC's scope (broad range of machines and applications with discipline-specific suites). The SPEC/HPSC target application areas mentioned were:

Seismic Processing (ARCO Suite 1.1)
Computational Chemistry (GAMESS)
CFD (TURBO3D)
Environmental and Climate Modeling
Structural Analysis (FEM)
Information Processing (DB, text retrieval, video-on-demand, etc.)
Multiple data sizes are planned and codes should be self-checking. Sizes can be small, medium, large and HUGE (ARCO suite uses this for example). The programming languages considered are Fortran77 or ANSI C, PVM/MPI, and the Shared-Address Space (SAS) programming model. Sia mentioned that there have been 4 SPEC/HPSC meetings and that Sia is Chair and Fiona Sim (IBM) is the Vice-Chair. They are looking into a procedure for result distribution -- perhaps using PARKBENCH tools/servers. Run rules are more restrictive for SPEC/HPSC.

Two of the questions and answers that followed Sia's presentation are listed below:

Q: What is the optimal number of codes?
A timetable is planned and looking at emerging technology.

Q: Will an application only be added to the suite if it satisfies the list of parallel versions?
SPEC/HPSC requires a serial and at least 1 parallel version such as MPI or HPF. Code authors or code managers will have to take responsibility to produce follow-up versions.

At 6:30pm, Tony asked for an open discussion on future directions for both efforts (PARKBENCH and SPEC/HPSC). Tony H. asked Jack D. for comments about the LINPACK results method. Jack is very trusting of the submissions and he sometimes verifies them on his own. David B. was asked to comment about NAS Parallel benchmark methodology. In that case, one has to include self-checking code with the benchmark. Sia H. indicated that there should be a review process (done within SPEC). One has to report a global set of metrics for each code and then provide a breakdown of certain activities like throughput, I/O, etc. Chuck Mosher (ARCO) indicated that politics of vendors gets to be a problem with reported results.

Rudy Eigenmann (Illinois) asked about how to handle optimized versions of codes. He indicated that vendors tend not to want opt. results. Tony H. indicated that PARKBENCH stresses listing of optimization and that performance is measured against a canonical mflop count. Getting the actual opt.
code is not necessary -- libraries used must be available to consumers. Sia H. pointed out that "as is" codes are difficult to manage. One has to be concerned with recording "benchmarker's abilities". Forcing compliance on the specifications of optimization is difficult. Fiona Sim stressed that customers really want their own version so opt. should be limited.

David MacKay stressed that PARKBENCH universities work with SPEC/HPSC's versions of the codes. They should be able to use the Assoc. membership SPEC/HPSC status to achieve this. Tony H. asked if NAS CFD codes would be acceptable to SPEC/HPSC. David M. indicated that vendors already had their own versions. PARKBENCH distributed version would be experimental but vendors would keep their own private versions.

Jack D. agreed to be the arbitrator of distribution of HPSC codes to PARKBENCH members. PARKBENCH can perform exploratory exercises in code optimization for SPEC and delay results on the WWW.

Tony H. then asked Tom H. to provide updates on HPF benchmarks in PARKBENCH html pages. He then asked the audience to voice their opinions on the concept of an "analysis" group. Tony then asked Aad van der Steen (Academic Computing Centre Utrecht) if he would be willing to lead such a group. Aad agreed and Tony suggested that inquiries be made into joint European and American funding. Alistair D., Fiona S., and Roger H. were willing to be on that subcommittee chaired by Aad.

Tony H. adjourned the meeting promptly at 7pm EST. The next meeting for PARKBENCH is scheduled for the spring of 1995 in Knoxville.
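
Two points from the discussion above, self-checking codes and performance measured against a canonical mflop count, can be illustrated with a short Python sketch. This is not taken from the PARKBENCH or SPEC/HPSC sources: the function names, the tolerance, and the sample numbers are all hypothetical. The idea is only that the rate is computed from a fixed, agreed operation count (not from whatever an optimized code actually executes), and that it is reported only when the result passes verification.

```python
import math


def mflops(canonical_flop_count, seconds):
    """Rate in Mflop/s from an agreed (canonical) operation count."""
    if seconds <= 0.0:
        raise ValueError("elapsed time must be positive")
    return canonical_flop_count / seconds / 1.0e6


def self_check(computed, reference, rel_tol=1.0e-8):
    """Hypothetical verification step: compare a checksum with a reference."""
    return math.isclose(computed, reference, rel_tol=rel_tol)


def report(canonical_flop_count, seconds, computed, reference):
    """Return Mflop/s if the run verifies, otherwise None (no result reported)."""
    if not self_check(computed, reference):
        return None
    return mflops(canonical_flop_count, seconds)


if __name__ == "__main__":
    # Made-up example: 2.0e9 canonical flops completed in 4.0 seconds.
    print(report(2.0e9, 4.0, computed=1.0, reference=1.0))
```

Because the operation count is fixed, an optimized version that performs fewer actual operations simply shows a higher rate for the same problem, which is what makes results from different implementations comparable.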
From owner-parkbench-comm@CS.UTK.EDU Wed Dec 7 08:16:42 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id IAA27810; Wed, 7 Dec 1994 08:16:41 -0500 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id IAA28740; Wed, 7 Dec 1994 08:15:08 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 7 Dec 1994 08:15:05 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from sentosa.sas.ntu.ac.sg by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id IAA28693; Wed, 7 Dec 1994 08:14:52 -0500 Received: from localhost (fp52bc@localhost) by sentosa.sas.ntu.ac.sg (8.6.4/8.6.4) id VAA15816 for pbwg-comm@cs.utk.edu; Wed, 7 Dec 1994 21:09:53 +0800 Date: Wed, 7 Dec 1994 21:09:53 +0800 From: Tng Chee Hiong Message-Id: <199412071309.VAA15816@sentosa.sas.ntu.ac.sg> Subject: how to uncompress .shar file Content-Type: text/plain Mime-Version: 1.0 X-Courtesy-Of: NCSA Mosaic 2.2 on Sun X-URL: http://www.netlib.org/parkbench/index.html Apparently-To: pbwg-comm@CS.UTK.EDU PARKBENCH (PARallel Kernels and BENCHmarks) There have been 4,410 accesses to this library. (Count updated 11/14/94 at 01:19:53) The PARKBENCH (PARallel Kernels and BENCHmarks) committee, originally called the Parallel Benchmark Working Group (PBWG), was founded at Supercomputing '92. The objectives of the group are the following: 1. To establish a comprehensive set of parallel benchmarks that is generally accepted by both users and vendors of parallel systems. 2. To provide a focus for parallel benchmark activities and avoid unnecessary duplication of effort and proliferation of benchmarks. 3. To set standards for benchmarking methodology and result-reporting together with a control database/repository for both the benchmarks and the results. 4. To make the benchmarks and results freely available in the public domain. 
The initial focus of the parallel benchmarks is on the new generation of scalable distributed-memory message-passing architectures for which there is a notable lack of existing benchmarks. The initial benchmark release concentrates on Fortran 77 message-passing codes using the widely available PVM message passing interface for portability. Future versions will undoubtedly adopt the proposed MPI interface, when this is fully defined and becomes generally accepted. The committee's aim, however, is to cover all parallel architectures, and this is expected to be achieved in the near-term by producing versions of the benchmark codes using Fortran 90 and High Performance Fortran (HPF) over PVM.

For more information about PARKBENCH, see the PARKBENCH paper.

The email address for the PARKBENCH committee is pbwg-comm@cs.utk.edu

file parkbench/readme
for overview of parkbench

file parkbench/parkbench.ps
# postscript file containing the current version
# of a paper describing the activity.

file parkbench/submission
# Submission form for compact applications.

file parkbench/linalg.tar.z.uu
# Parallel linear algebra benchmarks described in
# section 4.2.1 of the PARKBENCH report distributed at
# Supercomputing '93 on November 17, 1993.

file parkbench/lowlev_1.0.tar.z.uu
# Low-level benchmarks described in chapter 3 of the
# PARKBENCH report distributed at Supercomputing '93
# on November 17, 1993.

file parkbench/comm.archive
# archive: contains the correspondence of the PBWG committee
# to retrieve from netlib type: send comm.archive from parkbench
# -----------------------------------
# Last updated 2/26/93 (3439 bytes).

file parkbench/lowlevel.archive
# lowlevel.archive: contains the correspondence of the PBWG
# low-level subcommittee
# -----------------------------------

file parkbench/method.archive
# method.archive: contains the correspondence of the PBWG
# methodology subcommittee
# -----------------------------------

file parkbench/compactapp.archive
# compactapp.archive: contains the correspondence of the PBWG
# compact applications subcommittee
# -----------------------------------

file parkbench/kernel.archive
# kernel.archive: contains the correspondence of the PBWG
# kernel subcommittee
# -----------------------------------

file parkbench/guide_for_reporting.ps
# guide_for_reporting.ps: contains proposed guidelines for
# reporting performance results.
# This file is a postscript file.
# -----------------------------------
# Last updated 3/4/93 (24892 bytes).

file parkbench/npb.uu
# npb.uu: contains the sequential version of the NAS parallel benchmark
# to retrieve from netlib type: send npb.uu from parkbench
# This file is a tar, compress, uuencode file.
# -----------------------------------
# Last updated 4/7/94 (378189 bytes).

file parkbench/npb-codes.tar.z.uu
# npb-codes.tar.z.uu: contains the NASA in-house parallel
# implementations of the NAS Parallel Benchmarks for the
# Intel iPSC and the CM-2.
# This file is a tar, compress, uuencode file.
# --------------------------------------
# Last updated 3/24/94 (1563760 bytes)

file parkbench/npb-pvm.shar
# npb-pvm.shar: contains the NASA in-house parallel
# implementations of the NAS Parallel Benchmarks for PVM 3.3.
# This file is a shar file.
# --------------------------------------
# Last updated 9/30/94 (852979 bytes)

file parkbench/isuite.uu
# isuite.uu: contains the isuite parallel benchmark
# This file is a tar, compress, uuencode file.
# -----------------------------------
# Last updated 3/4/93 (34303 bytes).

file parkbench/electromagnetic.uu
# electromagnetic.uu: contains the electromagnetic parallel benchmark
# This file is a tar, compress, uuencode file.
# -----------------------------------
# Last updated 3/4/93 (150898 bytes).

file parkbench/general.uu
# general.uu: contains the general parallel benchmark
# This file is a tar, compress, uuencode file.
# -----------------------------------
# Last updated 3/4/93 (344392 bytes).

file parkbench/physics.uu
# physics.uu: contains the physics parallel benchmark
# This file is a tar, compress, uuencode file.
# -----------------------------------
# Last updated 3/4/93 (975800 bytes).

file parkbench/purdue-set.uu
# purdue-set.uu: contains the purdue-set parallel benchmark
# This file is a tar, compress, uuencode file.
# -----------------------------------
# Last updated 3/4/93 (124849 bytes).

file parkbench/intrinsics.uu
# intrinsics.uu: contains the intrinsics parallel benchmark
# This file is a tar, compress, uuencode file.
# -----------------------------------
# Last updated 3/4/93 (81621 bytes).

file parkbench/weather-climate.uu
# weather-climate.uu: contains the weather-climate parallel benchmark
# This file is a tar, compress, uuencode file.
# -----------------------------------
# Last updated 3/4/93 (194462 bytes).

file parkbench/genesis.uu
# genesis.uu: contains the genesis parallel benchmark
# This file is a tar, compress, uuencode file.
# -----------------------------------
# Last updated 3/4/93 (1095744 bytes).

file parkbench/SP.tar.Z
# -----------------------------------
# Last updated 2/14/94 (4690328 bytes).

file parkbench/cmpl-ben.z.uu
# Low Level HPF Compiler Benchmarks
# The benchmark suite comprises several simple, synthetic applications
# which test several aspects of HPF compilation. The current version
# of the suite addresses the basic features of HPF, and it is designed
# to measure performance of early implementations of the compiler.
# They concentrate on testing parallel implementation of explicitly
# parallel statements, i.e., array assignments, FORALL statements,
# INDEPENDENT DO loops, and intrinsic functions with different mapping
# directives. In addition, the low level compiler benchmarks address
# the problem of passing distributed arrays as arguments to subprograms.
#
# The language features not included in the HPF subset are not addressed
# in this release of the suite. The next releases will contain more
# kernels that will address all features of HPF, and also they will be
# sensitive to advanced compiler transformations.
#
# The codes included in this suite are either adopted from existing
# benchmark suites, NAS suite, Livermore Loops, and the Purdue Set,
# or are developed at Syracuse University.
# This file is a tar, compress, uuencode file.
# -----------------------------------
# Last updated 2/25/94 (14313 bytes)

file parkbench/solver_pvm.tar
file parkbench/solver_parmacs.tar
for This application generates quark propagators from a background gauge
, configuration and a fermionic source. This is equivalent to solving
, M psi = source
, where psi is the quark propagator and M (a function operating on psi)
, depends on the gauge fields.
, The benchmark performs a cut down version of this operation.

file parkbench/Seis1.1.tar
for ARCO Seis1.1 code
, Seis1.1 is designed to be a portable, parallel environment
, for developing, benchmarking, and sharing seismic application codes. The
, underlying model for parallel computation is message passing, using a
, system independent message passing layer. Sample implementations for PVM
, and NX are included.

file parkbench/polmpf77.tar.Z
file parkbench/polmpf77.tar.z.uu
file parkbench/polmphpf.tar.Z
file parkbench/polmphpf.tar.z.uu
file parkbench/polmppvm.tar.Z
file parkbench/polmppvm.tar.z.uu
file parkbench/polmpparmacs.tar.Z
file parkbench/polmpparmacs.tar.z.uu
for The Proudman Oceanographic Laboratory Multiprocessing Program (POLMP)
, project was created to develop numerical algorithms for shallow sea
, 3D hydrodynamic models that run efficiently on modern parallel
, computers. A code follows the wind induced flow in a closed
, rectangular basin including a number of arbitrary land areas.
, The model solves a set of hydrodynamic partial differential equations,
, subject to a set of initial conditions, using a mixed
, explicit/implicit forward time integration scheme. The explicit
, component corresponds to a horizontal finite difference scheme and
, the implicit to a functional expansion in the vertical.
, This is the sequential Fortran 77 version.
, This is the sequential HPF version.
, This is the sequential PVM version.
, This is the sequential Parmacs version.

file parkbench/pstswm.tar.Z
file parkbench/pstswm.tar.z.uu
for PSTSWM Version 2.0
, PSTSWM Version 2.0 is a message-passing benchmark code and
, parallel algorithm testbed that solves the nonlinear shallow
, water equations using the spectral transform method. The spectral
, transform algorithm of the code follows closely how CCM2, the
, NCAR Community Climate Model, handles the dynamical part of the
, primitive equations, and the parallel algorithms implemented in the
, model include those currently used in the message-passing parallel
, implementation of CCM2. PSTSWM was written by Patrick Worley of
, Oak Ridge National Laboratory and Ian Foster of Argonne National
, Laboratory, and is based partly on previous parallel algorithm
, research by John Drake, David Walker, and Patrick Worley of Oak
, Ridge National Laboratory.
Both the code development and parallel , algorithms research were funded by the DOE Computer Hardware, , Advanced Mathematics, and Model Physics (CHAMMP) program. , PSTSWM is a parallel implementation of a sequential code (STSWM 2.0) , written by James Hack and Ruediger Jakob at NCAR to solve the , shallow water equations on a sphere using the spectral transform method. file parkbench/fftpb for 1-D FFT benchmark , This is a sample code implementing the ParkBench 1-D FFT benchmark. , It actually performs a large 1-D linear convolution using , real-to-complex and complex-to-real FFTs. From owner-parkbench-comm@CS.UTK.EDU Wed Dec 7 11:53:08 1994 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id LAA01495; Wed, 7 Dec 1994 11:53:07 -0500 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA15100; Wed, 7 Dec 1994 11:52:37 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 7 Dec 1994 11:52:35 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from box by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id LAA15092; Wed, 7 Dec 1994 11:52:31 -0500 From: PEPS Conference Administrator Message-Id: <13217.199412071652@box> Subject: Workshop on Performance Evaluation To: parkbench-comm@CS.UTK.EDU Date: Wed, 7 Dec 1994 16:52:14 +0000 (GMT) Cc: conf@dcs.warwick.ac.uk (PEPS Conference Administrator) X-Mailer: ELM [version 2.4 PL20] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Workshop on Performance Evaluation and Benchmarking of Parallel Systems ================================ 15/16 December 1994 Scarman House University of Warwick A joint workshop of various parties that are involved in the performance evaluation of high-performance computing systems will be held at the University of Warwick (Coventry, England) on December 15--16, 1994. Participants will include EuroBen, PARKBENCH, and PEPS. 
===========================================================================
                           Preliminary Schedule
===========================================================================

THURSDAY 15th December
----------------------

REGISTRATION & WELCOME
---------------------------------------------------------------------------
10:30 - Project Overviews (Chair: Graham Nudd)

  Aad van der Steen, ACCU-EuroBen, The Netherlands
  "Status and recent results of the EuroBen Benchmark"

  Claude Lemoine, Thomson ASM, Sophia Antipolis, France
  "PEPS Overview"

  Tony Hey, M. Baker, V. Getov, A. Dunlop, University of Southampton, England
  "The PARKBENCH and GENESIS Benchmark Suites: Status and Future"
---------------------------------------------------------------------------
12:30 LUNCH
---------------------------------------------------------------------------
2:00 - (Chair: Aad van der Steen)

  Francois Manchon, Simulog, Sophia Antipolis, France
  "The PEPS Modelling Tools"

  Gilles Kempf, Electricite de France, Clamart, France
  "Modelling and simulation of scientific programs on a CM-5 with MODARCH"

  Yasumasa Kanada, University of Tokyo, Japan
  "Benchmarking Ultrafast Random Number Generators"

  Gerrit van der Velde, NEC Benelux, Amsterdam, The Netherlands
  "SX-4 overview" (Title to be confirmed)
---------------------------------------------------------------------------
3:30 COFFEE
---------------------------------------------------------------------------
4:00 - (Chair: Aad van der Steen)

  Roger Hockney, University of Warwick and Southampton, UK
  "Computational Similarity"

  David Snelling, University of Manchester, UK
  "Computational Similarity: A Case Study of Shallow on the KSR-1"

  Erich Strohmaier, University Mannheim, Mannheim, Germany
  "Statistical Analysis of NAS Parallel Benchmarks and LINPACK Results"

5:30 Discussion
---------------------------------------------------------------------------
7:00 DINNER
===========================================================================

FRIDAY 16th December
--------------------

9:00 - (Chair: Roger Hockney)

  Parallel Systems Group, University of Warwick, UK
  "Performance Prediction through Characterisation"

  Adolfy Hoisie, Cornell TC, Ithaca, USA
  "Algorithmic approach to performance evaluation and performance modelling"

  Willi Schonauer, Universitat Karlsruhe, Germany
  "A careful interpretation of simple kernel benchmarks yields the
  essential information about a parallel supercomputer"
---------------------------------------------------------------------------
10:30 COFFEE
---------------------------------------------------------------------------
11:00 - (Chair: Roger Hockney)

  MariaRita Nazzarelli, Intecs Systemi, PISA, Italy
  "Monitoring Methodology for Performance and Test Coverage Analysis"

  M. Louter-Nool, CWI, The Netherlands
  (Title to be announced)

  Vladimir Getov, University of Southampton, UK
  "Performance characterisation of cache memory effect"
---------------------------------------------------------------------------
12:30 LUNCH
---------------------------------------------------------------------------
2:00 - (Chair: David Snelling)

  Claude Lemoine, Thomson ASM, Sophia Antipolis, France
  "PEPS Testcases"

  Trevor Chambers, National Physical Laboratory, UK
  (Title to be announced)
---------------------------------------------------------------------------
3:00 COFFEE
---------------------------------------------------------------------------
3:30 - (Chair: David Snelling)

  Vladimir Getov, Brandes, Chapman, Hey, Pritchard, University of Southampton
  "A comparison of HPF-like systems"

  Alistair Dunlop, University of Southampton, UK
  "A Toolkit for Optimising Parallel Performance"

4:30 Discussion and Closing remarks
===========================================================================

REGISTRATION DETAILS:

The registration fee is #150 (Pounds Sterling), and attendance is limited to 35 delegates. Please include payment with your registration form. Registration cannot be confirmed without payment.

Registration at the workshop will be in Scarman House on the morning of Thursday 15th December. Accommodation is in single rooms with en suite facilities in Scarman House. Accommodation on the 15th and meals on the 15th-16th are included in the registration fee. Additional accommodation is available on the evening of Wednesday 14th at a cost of #55 for bed and breakfast, and #20 (Pounds Sterling) for dinner.

TRAVEL INFORMATION TO THE UNIVERSITY OF WARWICK:

There are frequent train services between London and Coventry (approximately at 30 minute intervals) and the travel time from London Euston station is about 75 minutes. The University is 3 miles from Coventry train station. There are usually plenty of taxis available. Birmingham International Airport has good connecting train services to Coventry (a journey of about 10 miles).

By car

From the North
  M1, M69 follow the By-pass routes marked Warwick (A46), then follow the signs to the University, which is on the outskirts of COVENTRY.

From the South
  M1, M45, A45 or M40, A46, follow the signs to the University.

From the East
  join the M1, then follow directions as for travel from the North or the South.

From the West
  M5, M42, A45, follow the signs for the University.

REGISTRATION FORM:

Name (and title):
-------------------------------------------------------------------------------
Affiliation:
-------------------------------------------------------------------------------
Position:
-------------------------------------------------------------------------------
Address:
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Telephone:                              Fax:
----------------------------------      ---------------------------------------
Email:
-------------------------------------------------------------------------------

                                                  YES/NO
Workshop Attendance:                       150    | YES |
(includes accommodation on 15th December          |     |
 & meals during workshop).                        |     |
Accommodation on Wednesday 14th:            55    |     |
Dinner on Night of 14th:                    20    |     |

Payment should be in UK Pounds Sterling by cheque or international money order, drawn on a UK bank. Please make cheques payable to 'The University of Warwick'. (Sorry, no credit cards).

For further information, or registration please contact:

Prof. Graham R. Nudd            or;     Aad J. van der Steen
(PEPS Workshop)                         EuroBen
Dept. of Computer Science               c/o Academic Computing
University of Warwick                   Centre Utrecht
Coventry CV4 7AL                        Budapestlaan 6
England                                 3584 CD Utrecht
Tel +44--203--523193                    The Netherlands
Fax +44--203--525714                    Tel +31--30--531444
Email conf@dcs.warwick.ac.uk            Fax +31--30--531633
                                        Email actstea@cc.ruu.nl

From owner-parkbench-comm@CS.UTK.EDU Wed Jan 4 19:36:26 1995
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id TAA09859; Wed, 4 Jan 1995 19:36:26 -0500
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id TAA18668; Wed, 4 Jan 1995 19:35:01 -0500
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 4 Jan 1995 19:34:59 EST
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from zazu.c3.lanl.gov by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id TAA18661; Wed, 4 Jan 1995 19:34:57 -0500
Received: (krk@localhost) by zazu.c3.lanl.gov (8.6.8/c93112801) id RAA00489 for pbwg-comm@cs.utk.edu; Wed, 4 Jan 1995 17:34:49 -0700
Date: Wed, 4 Jan 1995 17:34:49 -0700
From: Kenneth R Koch
Message-Id: <199501050034.RAA00489@zazu.c3.lanl.gov>
To: pbwg-comm@CS.UTK.EDU
Subject: Meeting?

When and where is the next planned ParkBench meeting? I just joined this effort and heard that there was a Knoxville meeting soon. Is that correct?

From owner-parkbench-comm@CS.UTK.EDU Fri Jan 6 18:55:44 1995
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.8t-netlib) id SAA15026; Fri, 6 Jan 1995 18:55:43 -0500
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id SAA01091; Fri, 6 Jan 1995 18:54:43 -0500
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 6 Jan 1995 18:54:41 EST
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from trout.nosc.mil by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id SAA01075; Fri, 6 Jan 1995 18:54:39 -0500
Received: from grey.color (grey.nosc.mil) by trout.nosc.mil (4.1/SMI-4.1) id AA16080; Fri, 6 Jan 95 15:54:36 PST
Received: by grey.color (5.0/SMI-SVR4) id AA01032; Fri, 6 Jan 1995 15:54:30 +0800
Date: Fri, 6 Jan 1995 15:54:30 +0800
From: trancv@grey.nosc.mil (Cam V. Tran)
Message-Id: <9501062354.AA01032@grey.color>
To: maintainers@netlib.org, pbwg-comm@CS.UTK.EDU
Subject: PARKBENCH paper
Cc: trancv@nosc.mil
X-Sun-Charset: US-ASCII

The PostScript paper on PARKBENCH is too big, about 4.7Kbytes. I'm interested in obtaining a hard copy but cannot print it out on our laser printer. Would you please break it down, say into 4 parts, suitable for a small office printer.

Many thanks.

Cam Tran (trancv@nosc.mil)
Naval Command, Control and Ocean Surveillance Center
R D T & E Division Code 761
San Diego, CA 92152-5000
------------------------------------------------------

From owner-parkbench-comm@CS.UTK.EDU Tue Jan 31 11:30:24 1995
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id LAA09626; Tue, 31 Jan 1995 11:30:23 -0500
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA13389; Tue, 31 Jan 1995 11:29:19 -0500
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Tue, 31 Jan 1995 11:29:18 EST
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from sun3.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA13373; Tue, 31 Jan 1995 11:29:03 -0500
Received: from bright.ecs.soton.ac.uk by sun3.nsfnet-relay.ac.uk with JANET SMTP id ; Tue, 31 Jan 1995 16:28:46 +0000
From: Mark Papiani
Received: from holt.ecs.soton.ac.uk by landlord.ecs.soton.ac.uk; Tue, 31 Jan 95 16:33:58 GMT
Date: Tue, 31 Jan 95 16:33:57 GMT
Message-Id: <8873.9501311633@holt.ecs.soton.ac.uk>
To: parkbench-comm@CS.UTK.EDU
Subject: Questions on PARKBENCH SOLVER

Questions on PARKBENCH SOLVER
-----------------------------

I received a couple of questions from Jim Janak regarding some SOLVER results I sent him. Ian Glendinning replied to Jim, see included message below. I have forwarded this to the committee since it raises a couple of questions that might need addressing at some stage.

Mark Papiani.
Department of Electronics & Computer Science
University of Southampton.

----- Begin Included Message -----

From igl Thu Jan 26 17:34:09 1995
From: Ian Glendinning
Date: Thu, 26 Jan 95 17:34:06 GMT
To: janak@watson.ibm.com
Subject: Re: qcd solver
Cc: mp
Content-Length: 1917

Jim,

Mark Papiani forwarded your message about the SOLVER benchmark to me.

> I have been looking at the note you sent to Rick Lawrence earlier,
> containing results for the qcd solver parkbench benchmark (I've also
> looked at the source code a bit).
> A couple of questions, if I may:
>
> 1. You didn't say how many nodes were used to produce the CS2 results
>    that you so kindly sent. I read the output (processor grid) as
>    4 nodes. Is this correct?

Yes. It's rather cryptically recorded in the results file of the GENESIS version of SOLVER, but that's an improvement on the version available from the netlib PARKBENCH repository, which prints that information to standard output. I struggled with the contorted I/O redirection routines for some time to get that changed! Clearly there is still room for improvement...

> 2. There are differences in precision on different machines that
>    may well affect the numerical values of the results (especially
>    the residues after a number of iterations, where they start
>    becoming very small). The source code contains a fair amount
>    of single precision, I notice, which would be handled differently
>    on different hardware. The question is, do the benchmark
>    standards for parkbench contain criteria for how closely the
>    numerical values, especially of the residues, have to match
>    those in the output files you sent?

The answer is no, there is no stipulation that the results must match, though perhaps there should be! Please note that the benchmark can be made to use single or double precision, as defined by constants in the file precision.h, although as distributed it uses single precision.

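As an illustration only: since no matching criterion is stipulated, a check of the kind hinted at above might look like the following Python sketch. The function name residuals_match and the tolerance values are hypothetical; they are not part of PARKBENCH or SOLVER, and suitable tolerances would have to be agreed by the committee.

```python
import math


def residuals_match(reference, measured, rel_tol=1e-4, abs_tol=1e-12):
    """Return True if every measured residual agrees with its reference value.

    rel_tol and abs_tol are illustrative assumptions, not values defined
    by PARKBENCH. A loose relative tolerance allows for single-precision
    runs; abs_tol keeps very small residuals from failing spuriously.
    """
    if len(reference) != len(measured):
        return False
    return all(
        math.isclose(r, m, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, m in zip(reference, measured)
    )
```

A relative test is used because the residues shrink by orders of magnitude over the iterations, so a single absolute threshold would be either too strict early on or too lax later.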
Ian -- Ian Glendinning HPC Centre, Electronics & Computer Science igl@ecs.soton.ac.uk University of Southampton, SO17 1BJ, UK Tel: +44 1703 592897 WWW URL: http://hpcc.soton.ac.uk/staff/igl.html ----- End Included Message ----- From owner-parkbench-comm@CS.UTK.EDU Thu Feb 2 07:18:47 1995 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id HAA23696; Thu, 2 Feb 1995 07:18:47 -0500 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA12293; Thu, 2 Feb 1995 07:17:51 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 2 Feb 1995 07:17:50 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from inet-gw-3.pa.dec.com by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA12286; Thu, 2 Feb 1995 07:17:33 -0500 From: Received: from ilonet.ilo.dec.com by inet-gw-3.pa.dec.com (5.65/10Aug94) id AA27085; Thu, 2 Feb 95 04:06:42 -0800 Received: by ilonet.ilo.dec.com (5.65/MS-012594); id AA16072; Thu, 2 Feb 1995 12:07:42 GMT Received: from localhost by punkbrat.ilo.dec.com; (5.65/1.1.8.2/03May94-0246PM) id AA05563; Thu, 2 Feb 1995 12:07:19 GMT Message-Id: <9502021207.AA05563@punkbrat.ilo.dec.com> X-Mailer: exmh version 1.5.3 12/28/94 To: pbwg-comm@CS.UTK.EDU Cc: gavan@ilonet.ilo.dec.com Subject: Parkbench Benchmarks Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Date: Thu, 02 Feb 95 12:07:19 +0000 X-Mts: smtp Do you have any information at this point in time of how close Parkbench is to becoming a standard benchmark. How visible are the benchmark results? Which markets view Parkbench results as significant? Are there any indications that parkbench will displace NAS as the most important cross-industry benchmark excluding LINPACK? Is the working group still active? 
regards --------------------------------------+------------------------------- Gavan Duffy | Phone: +44-353-91-754913 Technical Computing Group | DTN : 7-822-4913 Digital | Fax: +44-353-91-754444 Ballybritt, | e-mail: gavan@ilo.dec.com Galway, Ireland | VMS mail : galvia::gduffy --------------------------------------+------------------------------- From owner-parkbench-comm@CS.UTK.EDU Thu Feb 2 10:11:02 1995 Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA28591; Thu, 2 Feb 1995 10:10:58 -0500 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA25385; Thu, 2 Feb 1995 10:10:22 -0500 X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 2 Feb 1995 10:10:13 EST Errors-to: owner-parkbench-comm@CS.UTK.EDU Received: from win234.nas.nasa.gov by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id KAA25273; Thu, 2 Feb 1995 10:09:54 -0500 Received: (from dbailey@localhost) by win234.nas.nasa.gov (8.6.8.2/NAS.6) id HAA21777; Thu, 2 Feb 1995 07:07:57 -0800 Date: Thu, 2 Feb 1995 07:07:57 -0800 From: dbailey@nas.nasa.gov (David H. Bailey) Message-Id: <199502021507.HAA21777@win234.nas.nasa.gov> To: pbwg-comm@CS.UTK.EDU, gavan@ilonet.ilo.dec.com Subject: Re: Parkbench Benchmarks The question here is to what extent the vendors are accepting the ParkBench kernels. For instance, how many have run the low-level communication tests, etc.? BTW, when is ParkBench's next meeting? 
DHB

From owner-parkbench-comm@CS.UTK.EDU Thu Feb 9 06:20:00 1995
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id GAA22863; Thu, 9 Feb 1995 06:20:00 -0500
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA14637; Thu, 9 Feb 1995 06:19:24 -0500
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 9 Feb 1995 06:19:22 EST
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from dasher.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id GAA14631; Thu, 9 Feb 1995 06:19:21 -0500
From: Jack Dongarra
Received: by dasher.cs.utk.edu (cf v2.9c-UTK) id GAA20517; Thu, 9 Feb 1995 06:19:19 -0500
Date: Thu, 9 Feb 1995 06:19:19 -0500
Message-Id: <199502091119.GAA20517@dasher.cs.utk.edu>
To: parkbench-comm@CS.UTK.EDU
Subject: ParkBench Meeting

Dear Colleague,

The ParkBench (Parallel Benchmark Working Group) will meet in Knoxville, Tennessee on March 27th, 1995. The SPEC-HPSC people will be having their meeting at the Hilton after the Parkbench meeting and you are welcome to attend that meeting as well. The dates for the SPEC-HPSC meeting are March 28-29, 1995.

The meeting site will be the Knoxville Downtown Hilton Hotel. We have made arrangements with the Hilton Hotel in Knoxville.

    Hilton Hotel
    501 W. Church Street
    Knoxville, TN
    Phone: 615-523-2300

When making arrangements tell the hotel you are associated with the Parallel Benchmarking or Spec Meeting. The rate is about $68.00/night. You can download a postscript map of the area by anonymous ftp'ing to netlib2.cs.utk.edu, cd shpcc94, get knx-downtown.ps.

You can rent a car or get a cab from the airport to the hotel. We should plan to start at 9:00 am March 27th and finish about 5:00 pm. If you will be attending the meeting please send me email so we can better arrange for the meeting.

The format of the meeting is:

Monday 27th March
     9:00 - 12.00   Full group meeting
    12.00 -  1.30   Lunch
     1.30 -  5.00   Full group meeting

Tentative agenda for the meeting:

1. Minutes of last meeting
2. Reports and discussion from subgroups
3. Open discussion and agreement on further actions
4. Date and venue for next meeting

The objectives for the group are:

1. To establish a comprehensive set of parallel benchmarks that is generally accepted by both users and vendors of parallel systems.
2. To provide a focus for parallel benchmark activities and avoid unnecessary duplication of effort and proliferation of benchmarks.
3. To set standards for benchmarking methodology and result-reporting together with a control database/repository for both the benchmarks and the results.

The following mailing lists have been set up.

    parkbench-comm@cs.utk.edu          Whole committee
    parkbench-lowlevel@cs.utk.edu      Low level subcommittee
    parkbench-compactapp@cs.utk.edu    Compact applications subcommittee
    parkbench-method@cs.utk.edu        Methodology subcommittee
    parkbench-kernel@cs.utk.edu        Kernel subcommittee

All mail is being collected and can be retrieved by sending email to netlib@ornl.gov and in the mail message typing:

    send comm.archive from parkbench
    send lowlevel.archive from parkbench
    send compactapp.archive from parkbench
    send method.archive from parkbench
    send kernel.archive from parkbench
    send index from parkbench

We have set up a mail reflector for correspondence; it is called parkbench-comm@cs.utk.edu. Mail to that address will be sent to the mailing list and also collected in netlib@ornl.gov.

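As a rough illustration of the retrieval requests listed above, the following Python sketch builds (but does not send) such a mail message. The helper name build_netlib_request, the sender address and the subject line are assumptions made for the example; only the netlib address and the "send ... from parkbench" request lines come from this announcement.

```python
from email.message import EmailMessage

NETLIB_ADDRESS = "netlib@ornl.gov"  # address given in the announcement


def build_netlib_request(items, sender, library="parkbench"):
    """Build a mail message asking netlib for the given items.

    One "send <item> from <library>" line is written per item.
    The message is only constructed here; sending it is left to the caller.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = NETLIB_ADDRESS
    msg["Subject"] = "netlib request"  # assumption: any subject will do
    body = "\n".join(f"send {item} from {library}" for item in items)
    msg.set_content(body + "\n")
    return msg


# Hypothetical sender address, used purely for illustration.
request = build_netlib_request(["comm.archive", "index"], "user@example.org")
print(request.get_content())
```

Keeping construction separate from delivery means the request text can be inspected before it is handed to whatever mail transport is available locally.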
To retrieve the collected mail, send email to netlib@ornl.gov and in the mail message type:

  send comm.archive from parkbench

Jack Dongarra

From owner-parkbench-comm@CS.UTK.EDU Wed Mar 22 13:54:52 1995
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id NAA27772; Wed, 22 Mar 1995 13:54:52 -0500
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id NAA15560; Wed, 22 Mar 1995 13:52:29 -0500
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Wed, 22 Mar 1995 13:52:28 EST
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from dasher.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id NAA15554; Wed, 22 Mar 1995 13:52:27 -0500
From: Jack Dongarra
Received: by dasher.cs.utk.edu (cf v2.11c-UTK) id NAA04979; Wed, 22 Mar 1995 13:52:26 -0500
Date: Wed, 22 Mar 1995 13:52:26 -0500
Message-Id: <199503221852.NAA04979@dasher.cs.utk.edu>
To: parkbench-comm@CS.UTK.EDU
Subject: parkbench meeting

Here is a draft agenda for the ParkBench meeting on Monday.

Agenda for PARKBENCH Meeting on 27 March '95
---------------------------------------------

1) Minutes of Last Meeting
2) Status of Parkbench - see below for more detail
     Reports and discussion from subgroups
     Analysis subgroup
     Status of Codes/repository on WWW
     Communication within ParkBench
3) Funding of ParkBench Activities
4) ParkBench Reports/Publications
5) Date of next meeting

-------------------------------------------------------------------------

The ParkBench (Parallel Benchmark Working Group) will meet in Knoxville, Tennessee on March 27th, 1995. The SPEC-HPSC people will be having their meeting at the Hilton after the Parkbench meeting and you are welcome to attend that meeting as well. The dates for the SPEC-HPSC meeting are March 28-29, 1995.

The meeting site will be the Knoxville Downtown Hilton Hotel. We have made arrangements with the Hilton Hotel in Knoxville.

Hilton Hotel
501 W.
Church Street
Knoxville, TN
Phone: 615-523-2300

When making arrangements tell the hotel you are associated with the Parallel Benchmarking or Spec Meeting. The rate is about $68.00/night.

You can download a postscript map of the area by anonymous ftp'ing to netlib2.cs.utk.edu, cd shpcc94, get knx-downtown.ps.

You can rent a car or get a cab from the airport to the hotel.

We should plan to start at 9:00 am March 27th and finish about 5:00 pm.

If you will be attending the meeting please send me email so we can better arrange for the meeting.

-------------------------------------------------------------------------

2) Status of Parkbench - In more detail:-

Status of Codes/repository on WWW
  - See "Other comments on status of codes" below
  - WWW repository needs much work
    Restructure pages, page hierarchy could be for e.g.
      Rules for running Benchmarks
      Source Codes
        low-level ...
        kernels
          Matrix, ...
        CA's
        HPF
      Method for reporting bugs
      Method for reporting results
  - Page containing source codes on WWW, very confusing at present.
  - Codes present which do not belong to Parkbench
  - Difficult to work out where low-level, kernels etc are.
  - Old versions e.g. low-level should reflect GENESIS 3.0
  - Update all to latest versions
  - Overall release no. for Parkbench codes?
  - Check if documentation included

Analysis subgroup
  Alistair Dunlop has some results from analysis of a kernel

Communication within ParkBench
  Duplicate work e.g. on low-level benchmarks; perhaps people should email committee with details of intended work at outset, so they can be warned if duplicating work?
  Strategy for answering questions mailed to whole committee

Benchmark Results
  Few results received
  GBIS - Tennessee mirror - script sent but I have little documentation to send as yet.
  Regular reports? see item 4) on agenda

________________________________________________________________________

Other comments on status of codes:- from Mark Baker.
 1) Timer - common timer in all codes; use something akin to GENESIS host/node library which encapsulates and wraps system dependent routines for the timer and date stamp.
 2) Header - common header file - like GENESIS
 3) Makefile - common makefile; GENERIC makefiles that work even with the least commonly used version of make.
 4) I/O - common methodology of input data: command-line/file/#include
 5) Documentation - more, with better description of how input parameters affect memory usage etc - perhaps recommendations of PVM environmental setting should be made.
 6) Standard problem sizes (small, medium and large).
 7) PD PVM vs Vendor specific versions of PVM - The latter often lacks the functionality of the former - (pvm_spawn, pvm_joingroup)
 8) Adding RESULTS to database important.
 9) NAS paper and pencil versus PARKBENCH versions of the codes !!!
10) Rules/guidelines for running codes (baseline and optimised).
11) All codes should have test input and output to certify results.
12) Obvious bug/problem reporting mechanism
13) Someone responsible for the repository - update and stabilise existing codes.
14) Recommended subset of codes to run... (too many at moment !!)

----------------------------------------------------------------------

The single most important thing that is not explicitly mentioned is that the codes should use a common timer! In fact, they should also use something akin to the GENESIS node library, which encapsulates the system dependent routines needed for the timer, and time of day. The codes should also all print out a standard header in the results file, with information like the name of the benchmark, the date and time of the run, the system it was run on, etc., as is done by the SETDTL() routine in the GENESIS library.

Other things that it would be nice to see would be consistently named executables and result files, and a policy on makefiles.
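[Editor's illustration] The common timer and standard results header described above could be sketched roughly as follows. This is a minimal Python sketch for illustration only; it is not the GENESIS node library or its SETDTL() routine. The names wall_clock, standard_header and timed are hypothetical, and the header fields are simply the ones listed above (benchmark name, date and time of the run, system).

```python
import platform
import time
from datetime import datetime


def wall_clock():
    """Return wall-clock seconds from a monotonic source.

    Hypothetical shared timer: every benchmark would call this one
    wrapper instead of a system-specific timing routine.
    """
    return time.perf_counter()


def standard_header(benchmark, when=None, system=None):
    """Build the header lines written at the top of a results file."""
    when = when or datetime.now()
    system = system or platform.node() or "unknown"
    return [
        "Benchmark : " + benchmark,
        "Date      : " + when.strftime("%Y-%m-%d"),
        "Time      : " + when.strftime("%H:%M:%S"),
        "System    : " + system,
    ]


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = wall_clock()
    result = func(*args)
    return result, wall_clock() - start


if __name__ == "__main__":
    # Time a trivial stand-in workload and print a results-file header.
    total, seconds = timed(sum, range(1000000))
    for line in standard_header("EXAMPLE"):
        print(line)
    print("Elapsed   : %.6f s" % seconds)
```

Keeping the timer behind one wrapper means a port to a new system touches a single routine, which is the point made above about encapsulating system dependent code.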
Some people advocate single source benchmarks with no makefile, but I personally think that that is unrealistic, and it would be better if each benchmark were supplied with a makefile based on a "standard" makefile template, such as the one that comes with PVM. The PVM versions of the GENESIS codes use such a template. Even if you're running on a system that doesn't have "make", however unlikely that may be, having a makefile at least documents how to build the executable, and if it's based on a template, then peculiarities of individual benchmarks are easy to spot.

Depending how much work people are prepared to put into it, there are other things that could be made more uniform, such as the means of defining input parameters - interactively from standard input, from a steering file, or compiled into the program being three options.

Not all the benchmarks are adequately documented, and not all come with a well defined set of "standard" problem sizes - the Matrix kernels being a particular case in point.

From owner-parkbench-comm@CS.UTK.EDU Thu Mar 30 08:41:10 1995
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA07648; Thu, 30 Mar 1995 08:41:10 -0500
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id IAA05807; Thu, 30 Mar 1995 08:38:37 -0500
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 30 Mar 1995 08:38:32 EST
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from berry.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id IAA05798; Thu, 30 Mar 1995 08:38:31 -0500
Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (cf v2.11c-UTK) id IAA18659; Thu, 30 Mar 1995 08:38:30 -0500
Message-Id: <199503301338.IAA18659@berry.cs.utk.edu>
to: parkbench-comm@CS.UTK.EDU
Subject: Minutes
Date: Thu, 30 Mar 1995 08:38:29 -0500
From: "Michael W. Berry"

Here are the minutes I recorded from the most recent Parkbench meeting. Please review, correct, and e-mail your changes.
Please pay particular attention to *** ACTION ITEM ***'s.

Thanks,
Mike

-----------------------------------------------------------------
Minutes of Parkbench Meeting - Knoxville Hilton - March 27, 1995
-----------------------------------------------------------------

Attendee List:

  David Bailey     NASA                   dbailey@nas.nasa.gov
  Mark Baker       Univ. of Southampton   mab@soton.ac.uk
  Michael Berry    Univ. of Tennessee     berry@cs.utk.edu
  Jack Dongarra    Univ. of Tenn./ORNL    dongarra@cs.utk.edu
  Tony Hey         Univ. of Southampton   ajgh@ecs.soton.ac.uk
  Kenneth Koch     Los Alamos Nat'l Lab   krk@lanl.gov
  Mike Kolatis     Univ. of Tennessee     kolatis@cs.utk.edu
  David MacKay     Intel SSD              mackay@ssd.intel.com
  Fiona Sim        IBM (Kingston)         fsim@vnet.ibm.com
  David Snelling   Univ. of Manchester    snelling@cs.man.ac.uk
  David Walker     Oak Ridge Nat'l Lab    walker@msr.epm.ornl.gov

At 9:30am, Tony H. opened the meeting and asked if there were any corrections (some spelling errors). Tony indicated that there should be a formal release of the codes. The status of the WWW Repository was reviewed. The codes are not really usable in the current form. Reorganization of the web pages is needed. Jack D. agreed that the web pages simply need to be redone. There are collections that have nothing to do with the suite. Fiona S. indicated that it was very confusing when trying to download the suite. There has been some duplication of work.

Tony H. then outlined "Proposed Parkbench Activities":

1. Benchmark Consolidation

Release 1.0 of the suite is to be fully tested by June 1, 1995. Items: low-level, kernels, compact applic, HPF compiler benchmarks.

Release 1.0
----------

Low level - need common makefiles and I/O (small amount of work needed); there are some typographical errors in the makefiles; all in Fortran and can work with PVM 3.3.7 version; there are 7-8 codes in total. Genesis V. 3 ties them together (M. Kolatis had worked with older version).
Via email, Roger Hockney suggested that UT and Southampton (Ian Glendinning) should coordinate to finalize the low-level kernels. Roger H. stressed that as many measurements as possible be collected for the database and that duplication of results be constantly checked. Roger H. indicated that he is working on a "graphical profile of performance" which will be based on scatter plots of all (rinf,nhalf) values on the (nhalf,rinf)-plane. Roger H. encourages all Parkbench members to submit their low-level measurements to the Southampton graphical data base (GBIS).

Kernels -
  NAS kernels : MG,CG,EP,IS,FFT(3-D); these are vanilla versions.
  Matrix kernels : MM, TP, LU, QR, TRD; TP needs some work and the others need to be standalone and not dependent on BLACS. User will just need to provide the communication layer. Need to have "load and go" versions. A new version just installed but BLACS are not included (but will be added with tar file). PVM+BLAS needed by vendors for these codes.

Work needed for kernels: common timer, common header file, common makefiles, common I/O (input,includes), standard documentation of parameters, standard problem sizes, public domain PVM -vs- vendor-based PVM, language conformance, NAS pencil+paper -vs- PARKBENCH versions, verification of results.

Ken K. suggested that a list of PVM files needed be specified. He asked if one should look at process control calls specifically? Codes are not exactly Fortran-77. Need to point to some well-defined extension (DOD sanctioned Mil-Spec extension). Don't want Fortran-90 infiltration right now. David B. suggested that someone should check for use of "SAVE" option. A preprocessor-based implementation is not desired at this time (suggested by Tony H.)

Need to have a reporting mechanism - ask vendors for a general reporting strategy. Will have 2 sets of results for NAS kernels - vendors have their own versions and how to distinguish the results?
Optimized libraries can be used with vanilla version (compiler options also). Mark B. suggested that there are really 3 versions: true vanilla (no flags), code reordering+libraries, vendor-best approach. David B. suggested that there only be 2 levels : level 1 (apply any compiler options but applied to all routines) and level 2 (do anything to it). Jack D. wants same compiler options for entire suite - not just w/in 1 code. SPEC has a set of rules for compiler options.

*** ACTION ITEM *** Fiona S. will provide SPEC compiler option rules for review.

David B. indicated that simple checks of correctness provided in NAS. Genesis codes are not self-checking. David S. stressed that self-checking is preferred (correctness issue).

Tony H. proposed to drop PDE1 due to duplication (R/B ordering); a 1-D FFT could be included and should a pencil+paper I/O benchmark be included. David B. suggested to hold off on the FFT for now. David S. agreed.

A general discussion of how to formalize an I/O benchmark followed. David B. and Michael B. suggested that the compact applications could provide I/O benchmarking activities within them. For Release 1, it should be avoided. Jack D. mentioned an activity by Paul Messina at Caltech to address I/O benchmarking.

Jack D. proposed 30 megabytes or 1000 order matrix (for small problem). Medium size is 3 gigabytes (order of 10,000 matrix), and large problem size would be 30 gigabytes (30,000 order matrix). Fiona S. indicated IBM typically ties problem size to number of procs. David B. indicated that there is a NAS Parallel Benchmarks 2 effort in existence. Use 2 problem sizes for NAS parallel benchmarks and 3 problem sizes for matrix kernels for now.

*** ACTION ITEM *** David B. will investigate a 3rd problem size for NAS Parallel Kernels.

Compact Applications - (discussion began at 10:50am).

  NAS CFD codes (need PVM versions); David B. indicated that MPI versions are being developed.
  ARCO (1.1 version) - by Mosher (Fiona S.
indicated that "Resource 2000" effort presents problem with multiple versions with different I/O).
  Enviro. Modeling - POLMP, PSTSWM (PICL-based)
  QCD - SOLVER
  GAMESS (a public domain PVM version possible?); SPEC has the US version (2 versions one for UK and one for US); Mike Schmidt does not want it distributed by anon ftp. Martin Guest (UK) has not been contacted about distribution.

David S. suggested to drop ARCO code. Fiona S. suggested getting a newer version from Chuck Mosher. David S. also suggested to rethink PICL version of PSTSWM. Mark B. indicated that programmers have had a problem with PSTSWM. Tony H. suggested future discussions on MPI and HPF versions of these applications be made but not for first release. POLMP (by Mark Ashworth at NERC) - in good shape. David W. indicated that Pat Worley can improve PSTSWM for release.

Should QCD be used? David S. suggested it be dropped since it is too simplistic and not that interesting of an application. Tony H. suggested that it is a "theoretical" benchmark. Fiona S. suggested that it's best to start small and add codes in time.

*** ACTION ITEM *** Fiona S. will contact Martin Guest about UK-GAMESS for Parkbench release. Some chemistry kernels might be possible also.

For June,
  - CFD (BT,LU,SP)
  - Enviro. Modeling (POLMP, PSTSWM?)
  - Oil (C. Mosher's latest ARCO)
  - Chemistry (UK-GAMESS?, MD codes?)

*** ACTION ITEM *** David W. will oversee the development of first release of Compact Applic.

At 11:15am, the discussion of the HPF compiler benchmarks began. Tony H. asked Fiona S. about IBM's interest in using such benchmarks.

*** ACTION ITEM *** Jack D. will contact Tom Haupt about his interest in the inclusion of his codes in the first release and how to report the results.

Tony H. suggested that they be left out initially for the first release. A schedule is needed to develop the first release.

2. Re-organization of Parkbench WWW pages

Instructions, Source Codes, Result Database, GBIS Viewer. Tony H.
proposed an organization of the information. Jack D. has someone funded by CRPC to deal with bug reporting.

3. Funding of PARKBENCH Activities

Jack D. has $15,000 from CRPC (Mike Kolatis)
Berry, Dongarra and Sameh have a pre-proposal to NSF.
Tony H. trying to get vendor-support (Mark Baker funding).
Tony H. and van der Steen to European funding agencies.
Corporate aid from Sun, SGI, Intel, IBM, Cray, Convex, HP, Meiko, Parsytec?
Funding from NEC, Fujitsu?

4. General WORK required:

1) Rules/guidelines for running codes
   SPEC rules by Fiona and Fortran Extensions by David B. in 1 week
2) Mechanism for bug reporting
3) Update and stabilize existing codes
   April 7 deadline for contacting application experts for including CA's (Worley, Mosher, Guest). Jack D. and Fiona S. will be attending SPEC-HPSC meeting and discuss arrangement for proper release of CA's. Tony H. suggested that ARCO and GAMESS be left out.
   *** ACTION ITEM *** AVL (Fire code) may be included from RAPS. Tony H. will investigate this along with Roger H's LPM2 (PIC code).
4) Single release
5) Recommended subset of codes to run

The group had lunch from noon to 1:00pm. At 1:00pm the following "Timeline" was proposed.
--------------------------------------------------------------------------------
                       TIMELINE for PARKBENCH ACTIVITIES
--------------------------------------------------------------------------------
3/27   lowlevel, kernels, CA's
--------------------------------------------------------------------------------
4/01   UT  : ARCO, kernels, gather software, WWW pages
       ORNL: PSTSWM, WWW pages
       IBM : UK-GAMESS pending HPSC discussion
       USH : lowlevel
       SU  : HPF Benchmarks
--------------------------------------------------------------------------------
4/15   LL+Kernels Software ready for US
       Unify makefile
       DOC/Readme
--------------------------------------------------------------------------------
5/1    Actively gathering performance numbers (3 runs - S,M,L)
       Target machines: SP/2, Meiko CS2, Paragon, T3D, Convex, DEC SMP, SGI
       (SP/2 - David B., Paragon - UT, T3D - Ken K., SGI - Horst Simon?,
        Meiko CS2 - Mark B., CRI J90 - Charles Grassl)
       Install GBIS at UT, Put #'s into database; Graph BM Info Server
--------------------------------------------------------------------------------
6/1    Announce in places like HPCwire, NA-Digest, other?
--------------------------------------------------------------------------------

At 1:45pm, different attendees presented specific results using PARKBENCH codes. Tony H. showed performance of Large Integer Sort for a direct-mapped cache processor. Looked at bucket sort on NUMA machine with data size <= cache size and vice-versa. Jack D. showed a comparison of message passing on different platforms: it appeared that T3D using SHMEM was superior. David B. then showed new results for the BT NAS Parallel Benchmark code on a variety of machines. The Cray J90 performance was impressive from a cost-performance perspective but the DEC 8400 was even better. Results were presented in gflop/s.

At 2:15pm, Tony H. asked about another meeting before SC'95. David B. suggested the end of August or early September.
A video conference in July was a possibility but more than 3 sites is a problem according to Fiona S. A teleconference before the release in mid May is a possibility. Parkbench video meeting for May 12 (tentative). Jack D. will test a "bridge" between sites for this video meeting.

This Parkbench meeting was adjourned at 2:30 pm by Tony H.

From owner-parkbench-comm@CS.UTK.EDU Tue Apr 4 13:43:26 1995
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id NAA26886; Tue, 4 Apr 1995 13:43:26 -0400
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id NAA14291; Tue, 4 Apr 1995 13:43:04 -0400
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Tue, 4 Apr 1995 13:43:03 EDT
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from berry.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id NAA14285; Tue, 4 Apr 1995 13:43:02 -0400
Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (cf v2.11c-UTK) id NAA02046; Tue, 4 Apr 1995 13:43:01 -0400
Message-Id: <199504041743.NAA02046@berry.cs.utk.edu>
to: parkbench-comm@CS.UTK.EDU
Subject: Fortran conventions
Date: Tue, 04 Apr 1995 13:43:00 -0400
From: "Michael W. Berry"

From David Bailey....

------- Forwarded Message

Return-Path:
Received: from win234.nas.nasa.gov by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id MAA11093; Tue, 4 Apr 1995 12:52:16 -0400
Received: (from dbailey@localhost) by win234.nas.nasa.gov (8.6.8.2/NAS.6.1) id JAA10320; Tue, 4 Apr 1995 09:52:13 -0700
Date: Tue, 4 Apr 1995 09:52:13 -0700
From: dbailey@nas.nasa.gov (David H. Bailey)
Message-Id: <199504041652.JAA10320@win234.nas.nasa.gov>
To: berry@CS.UTK.EDU
Subject: Fortran conventions

I haven't been able to locate the reference to the "Milspec" extensions to Fortran-77. But as I recall it includes most of the following. Certainly these extensions would be expected in any reasonable Fortran-77 compiler nowadays. These appear in some of the NAS benchmarks, for example.
In any event, I recommend that we permit the following in all ParkBench codes.

1. Long identifiers, up to 31 chars long, together with the optional usage of the underscore after the first character.
2. Upper and lower case alphabetics, which are treated indistinguishably.
3. IMPLICIT NONE.
4. DO-ENDDO constructs.
5. INCLUDE statements.
6. The DOUBLE COMPLEX datatype, together with related intrinsics such as DCMPLX.
7. INTEGER*4, REAL*8, COMPLEX*16, etc. (although REAL*4 may be handled as 8-byte floating-point, etc.).

------- End of Forwarded Message

From owner-parkbench-comm@CS.UTK.EDU Tue Apr 4 14:36:32 1995
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id OAA28162; Tue, 4 Apr 1995 14:36:31 -0400
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA17537; Tue, 4 Apr 1995 14:36:39 -0400
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Tue, 4 Apr 1995 14:36:35 EDT
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from cs.rice.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id OAA17521; Tue, 4 Apr 1995 14:36:24 -0400
Received: from [128.42.5.169] by cs.rice.edu (NAA27273); Tue, 4 Apr 1995 13:36:11 -0500
Message-Id: <199504041836.NAA27273@cs.rice.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Tue, 4 Apr 1995 13:35:58 -0600
To: "Michael W. Berry" , parkbench-comm@CS.UTK.EDU
From: chk@cs.rice.edu (Chuck Koelbel)
Subject: Re: Fortran conventions

I'm working from home today, so I don't have my copy of the F90 standard handy.

>I haven't been able to locate the reference to the "Milspec"
>extensions to Fortran-77.

That's MIL-STD-1753. The bib entry from the HPF standard says:

  US Department of Defense. Military Standard, MIL-STD-1753: FORTRAN, DoD Supplement to American National Standard X3.9-1978, November 9, 1978.

I'm not sure who to order it from...

>But as I recall it includes most of the
>following. Certainly these extensions would be expected in any
>reasonable Fortran-77 compiler nowadays.
>These appear in some of the
>NAS benchmarks, for example. In any event, I recommend that we permit
>the following in all ParkBench codes.

I think all of these actually appear in the MILSPEC and Fortran 90, with the possible exception of

>7. INTEGER*4, REAL*8, COMPLEX*16, etc. (although REAL*4 may be handled
>as 8-byte floating-point, etc.).

I don't think this is in either one, although it is a VERY common compiler extension. F90 has the KIND type parameter mechanism that specifies precision more, well, precisely (in effect, you can say "here's how many decimal digits of accuracy I need", compiler picks the most efficient machine representation.) However, no F77 compilers use the F90 mechanism. REAL*8 et al are probably OK to allow in Parkbench F77 codes, but (future) F90 versions ought to use KIND.

Chuck Koelbel

From owner-parkbench-comm@CS.UTK.EDU Thu Apr 6 10:27:23 1995
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA29896; Thu, 6 Apr 1995 10:27:23 -0400
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA17519; Thu, 6 Apr 1995 10:26:01 -0400
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 6 Apr 1995 10:26:00 EDT
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from haven.EPM.ORNL.GOV by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id KAA17511; Thu, 6 Apr 1995 10:25:56 -0400
Received: (from worley@localhost) by haven.EPM.ORNL.GOV (8.6.10/8.6.10) id KAA03335; Thu, 6 Apr 1995 10:25:23 -0400
Date: Thu, 6 Apr 1995 10:25:23 -0400
From: Pat Worley
Message-Id: <199504061425.KAA03335@haven.EPM.ORNL.GOV>
To: parkbench-comm@CS.UTK.EDU
Subject: Re: Minutes
In-Reply-To: Mail from '"Michael W. Berry" ' dated: Thu, 30 Mar 1995 08:38:29 -0500
Cc: itf@mcs.anl.gov, worley@haven.EPM.ORNL.GOV

    1. Benchmark Consolidation
    Release 1.0 of the suite is to be fully tested by June 1, 1995.
    ...
    Compact Applications - (discussion began at 10:50am).
    ...
    Enviro. Modeling - POLMP, PSTSWM (PICL-based)
    ...
    David S. also suggested to rethink PICL version of PSTSWM.
    Mark B. indicated that programmers have had a problem with PSTSWM.

Could someone please remind me how CA codes are to be validated and performance data collected? Southampton seems to have taken the lead in some of this work, but I am uncertain as to how they are approaching it.

My concern is the following. PSTSWM was not originally developed to be a publicly-available benchmark code, and currently has some peculiarities or deficiencies. Some of these peculiarities are intrinsic (the large number of runtime options), while others are due to the research motivating the code development coming first (PICL dependence and lack of external documentation). I am willing to modify PSTSWM to make it more suitable, and I would hope that if people are having problems with the code, that they would contact me. The only problem I have heard of was from last summer (and that was secondhand) and was due to the lack of documentation on porting the PICL communication library. The code has been used extensively across many platforms in our research, and problems in running it are a basic concern to us.

On a related point. Ian Foster, Brian Toonen, and I recently finished an extensive benchmarking exercise using PSTSWM, measuring performance on the Paragon, SP-2, and T3D. Can or should this data be included in the performance database, or does data need to be generated by an "independent" ParkBench effort? We are also interested in "helping" run the code on the other machines mentioned in the minutes, but do not have access to many of the machines ourselves. (Please contact us.)

A final few points: PSTSWM has evolved from depending on one communication library (PICL) to running using native routines on a large number of machines. Currently, it runs as a PICL, PVM, MPI, MPL, NX, SUNMOS, VERTEX, SHMEM, or serial code, with the communication library chosen at compile time.
For our benchmarking, we use the native libraries and optimize over the large number of tuning parameters to decide on what parallel algorithm to use. The result is an estimate of how well each machine can solve the problem rather than how well it can run a particular parallel algorithm using a fixed communication library. Is this appropriate for ParkBench performance numbers? We can specify a communication protocol and a parallel algorithm that should work across all platforms, but it will be far from optimal, especially on machines like the T3D where the SHMEM libraries perform much better than PVM on this code. (The message passing semantics required by the code are very simple and the added functionality provided by tagged message-passing is not needed.)

I am sorry to have missed the latest ParkBench meeting, but illness prevented me from attending. I hope that these issues can be resolved by e-mail.

    ...
    David W. indicated that Pat Worley can improve PSTSWM for release.

PSTSWM has been a moving target over the past year, but an improved version is now available. (The basic code has not changed, but removing PICL dependence and porting to the different platforms required numerous minor structural modifications.) I have held off updating the ParkBench version until the new version was stabilized. What is the appropriate procedure for updating ParkBench codes, both now and in the future? (I expect further changes in the form of additional native ports.) There is a PSTSWM homepage under my control where I can put new versions. How much version control of the codes does ParkBench want or need?

    ...
    4. General WORK required:
    3) Update and stabilize existing codes
    April 7 deadline for contacting application experts for including CA's (Worley, Mosher, Guest).
    ...

I haven't been contacted (:-)), What sort of response do you want?
Pat Worley

From owner-parkbench-comm@CS.UTK.EDU Thu Apr 6 12:36:03 1995
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id MAA02971; Thu, 6 Apr 1995 12:36:03 -0400
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA29113; Thu, 6 Apr 1995 12:36:31 -0400
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Thu, 6 Apr 1995 12:36:29 EDT
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from sun2.nsfnet-relay.ac.uk by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA29091; Thu, 6 Apr 1995 12:36:09 -0400
Via: uk.ac.southampton.parallel-computing-support; Thu, 6 Apr 1995 17:35:47 +0100
Date: Thu, 6 Apr 95 17:31:12 BST
Message-Id: <13220.9504061631@par.soton.ac.uk>
From: Mark Baker
Subject: Re: Minutes
To: Pat Worley
In-Reply-To: Pat Worley's message of Thu, 6 Apr 1995 10:25:23 -0400
Organisation: HPC Centre, The University of Southampton, England.
Phone: +44 1703 593226
fax: +44 1703 593939
X-Mailer: Sendmail/Ream version 5.1.9
Cc: parkbench-comm@CS.UTK.EDU

Pat,

Rather than produce a rather long reply here to your email I suggest that we talk over the phone about PSTSWM and I can bring you up to date on the committee views etc. about the code and other matters.
Regards

Mark
_____________________________________________________________________________
Dr Mark Baker
HPC Centre                   Tel: +44 - 1703-593226
University of Southampton    Fax: +44 - 1703-593939
Southampton, S017 1BJ        Email: mab@par.soton.ac.uk
England, UK                  WWW URL: http://hpcc.soton.ac.uk/
_____________________________________________________________________________

From owner-parkbench-comm@CS.UTK.EDU Fri Sep 8 16:37:43 1995
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id QAA14465; Fri, 8 Sep 1995 16:37:43 -0400
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id QAA04446; Fri, 8 Sep 1995 16:35:52 -0400
X-Resent-To: parkbench-comm@CS.UTK.EDU ; Fri, 8 Sep 1995 16:35:50 EDT
Errors-to: owner-parkbench-comm@CS.UTK.EDU
Received: from franklin.seas.gwu.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id QAA04432; Fri, 8 Sep 1995 16:35:48 -0400
Received: from felix.seas.gwu.edu (abdullah@felix.seas.gwu.edu [128.164.9.3]) by franklin.seas.gwu.edu (v8) with ESMTP id QAA10026 for ; Fri, 8 Sep 1995 16:35:42 -0400
Received: (from abdullah@localhost) by felix.seas.gwu.edu (8.6.12/8.6.12) id QAA07062 for parkbench-comm@cs.utk.edu; Fri, 8 Sep 1995 16:35:38 -0400
Date: Fri, 8 Sep 1995 16:35:38 -0400
From: Abdullah Meajil
Message-Id: <199509082035.QAA07062@felix.seas.gwu.edu>
To: parkbench-comm@CS.UTK.EDU
Subject: subscribe

subscribe
:w

From owner-parkbench-comm@CS.UTK.EDU Wed Oct 11 18:25:43 1995
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id SAA14748; Wed, 11 Oct 1995 18:25:42 -0400
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id SAA05677; Wed, 11 Oct 1995 18:23:18 -0400
Received: from zazu.c3.lanl.gov by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id SAA05670; Wed, 11 Oct 1995 18:23:12 -0400
Received: (krk@localhost) by zazu.c3.lanl.gov (8.6.10/c93112801) id QAA11521 for parkbench-comm@cs.utk.edu; Wed, 11 Oct 1995 16:22:53 -0600
Date: Wed, 11 Oct 1995 16:22:53 -0600
From:
Kenneth R Koch
Message-Id: <199510112222.QAA11521@zazu.c3.lanl.gov>
To: parkbench-comm@CS.UTK.EDU
Subject: Re: t3d resutls

> Ken,
> Unfortunately, it appears that the T3D would pose great problems
> for PARKBENCH utilizing PVM3.3.8. Hopefully, when the MPI codes are
> released, we'll be able to get a much easier view. It turns out that
> NAS is producing their own vanilla MPI benchmark versions that we will
> be including within PARKBENCH--this is very fortunate.

So ParkBench is not suitable for the T3D (nor the T3E when it comes out). Will the MPI version really address the T3D porting issues we have uncovered? I don't think so unless there is a specific effort. Does anyone else other than me see that not supporting the T3D (a major MPP) is a major shortcoming of ParkBench?

> The CG and FT PVM codes are being modified by the authors as
> we speak. There was a distinctly unparallel memory requirement in these
> codes. But, they have been cleaned up considerably.

Yes, I have seen the new versions.

> I am really writing regarding another situation. Awhile back,
> I sent a request to see if you would run the codes on the LANL CM5.
> Have you had any success? I ran all the codes here on ours, but it
> only has 32 nodes--so, I could only run the small codes.
> If you have not, could you download the most recent version
> of PARKBENCH and run the LINALG kernels only (as the NAS ones are
> being fixed)? A CM5 make.def already exists with the package, and any
> questions you have can be answered by me. We'd really like to have
> the numbers to put in the database. I promise that the concerns
> you noted with the T3D do not appear anywhere else. It should be easy
> to run without modification.
>
> Thank you,
> Michael Kolatis
>

Where can I get the most recent ParkBench codes available these days?

  http://www.netlib.org/parkbench
  ftp ???

I just grabbed the http://www.netlib.org/parkbench/kernels/kernels.tar.gz file. Much nicer file tree and build system. But ...
I have a major problem with the CM5 configuration: ParkBench uses the Sparc f77 compiler rather than the TMC cmf compiler. This means NO vector units; this is equivalent to castrating our CM5! What good are ParkBench CM5 results since any real code will use the vector units? ParkBench results for a non-vector-unit CM5 and none for the T3D definitely make ParkBench less appealing.

I just scanned what is available in the Netlib GBIS ParkBench database for Cray, IBM, Intel, Meiko, & TMC. Only the NAS kernels (and compact apps) have data; the LINALG kernels have none. And all of the NAS results are from vendor implementations of the NAS kernels officially published in the 10/94 and 3/95 NAS reports. None are from ParkBench PVM implementations, and I expect vendors to continue publishing their own NAS results. So why are ParkBench PVM (or MPI) results of interest to the HPC community?

Comments anyone?

--Ken Koch, CIC-19, LANL

From owner-parkbench-comm@CS.UTK.EDU Fri Oct 13 08:50:27 1995 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA12876; Fri, 13 Oct 1995 08:50:27 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id IAA21894; Fri, 13 Oct 1995 08:48:58 -0400 Received: from haven.EPM.ORNL.GOV by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id IAA21887; Fri, 13 Oct 1995 08:48:56 -0400 Received: (from worley@localhost) by haven.EPM.ORNL.GOV (8.6.10/8.6.10) id IAA01526; Fri, 13 Oct 1995 08:48:15 -0400 Date: Fri, 13 Oct 1995 08:48:15 -0400 From: Pat Worley Message-Id: <199510131248.IAA01526@haven.EPM.ORNL.GOV> To: krk@c3serve.c3.lanl.gov, parkbench-comm@CS.UTK.EDU Subject: Re: t3d resutls In-Reply-To: Mail from 'Kenneth R Koch ' dated: Wed, 11 Oct 1995 16:22:53 -0600

> > Ken,
> >
> > Unfortunately, it appears that the T3D would pose great problems for PARKBENCH utilizing PVM3.3.8. Hopefully, when the MPI codes are released, we'll be able to get a much easier view.
> > It turns out that NAS is producing their own vanilla MPI benchmark versions that we will be including within PARKBENCH--this is very fortunate.
>
> So ParkBench is not suitable for the T3D (nor the T3E when it comes out). Will the MPI version really address the T3D porting issues we have uncovered? I don't think so unless there is a specific effort. Does anyone else other than me see that not supporting the T3D (a major MPP) is a major short coming of ParkBench?

...

What is the background for this message? What is unfair about using PVM3.3.8? I have run PSTSWM (one of the compact application codes) with both PVM and SHMEM on the T3D with no problems, and SHMEM is faster. Is this the issue?

Pat Worley

From owner-parkbench-comm@CS.UTK.EDU Fri Oct 13 10:01:58 1995 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA13435; Fri, 13 Oct 1995 10:01:58 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA02382; Fri, 13 Oct 1995 10:01:45 -0400 Received: from mordillo.npac.syr.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id KAA02374; Fri, 13 Oct 1995 10:01:42 -0400 Received: (from mab@localhost) by mordillo.npac.syr.edu (940816.SGI.8.6.9/8.6.6) id KAA10830 for parkbench-comm@CS.UTK.EDU; Fri, 13 Oct 1995 10:01:38 -0400 Date: Fri, 13 Oct 1995 10:01:38 -0400 Message-Id: <199510131401.KAA10830@mordillo.npac.syr.edu> From: Mark Baker Subject: [Kenneth R Koch: Re: t3d resutls] To: parkbench-comm@CS.UTK.EDU Organisation: NPAC, Syracuse University, New York 13244-4100, USA. Phone: +1 315 443 2083 fax: +1 315 443 1973 X-Mailer: Sendmail/Ream version 5.1.51 Fcc: +log

I agree with Pat. It would be useful if someone put this in context...

As for the comments:

Why can't the Cray T3D use PARKBENCH!? I've run the PVM codes on a T3D! and a CM-5...

Re: the CM-5, why can't f77 be changed to cmf77!?
Re: the NAS results - the agreement at the last parkbench meeting was that we produced turnkey NAS benchmarks that vendors run without changing so that we could publish NAS results and put them on GBIS. It was understood that vendors would continue to run their own versions, which they had spent considerable effort in writing. It was also understood that there would be a big difference between the results of the optimised and unoptimised runs.

BTW - Could someone at UTK update my email address, it's now mab@npac.syr.edu.

Regards

Mark

_____________________________________________________________________________
NPAC, Syracuse University        Tel: +1 315 443 2083
111 College Place                Fax: +1 315 443 1973
Syracuse                         Email: mab@npac.syr.edu
New York 13244-4100, USA         WWW URL: http://www.npac.syr.edu/
_____________________________________________________________________________

From owner-parkbench-comm@CS.UTK.EDU Fri Oct 13 10:46:19 1995 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA13736; Fri, 13 Oct 1995 10:46:19 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA07994; Fri, 13 Oct 1995 10:45:36 -0400 Received: from zazu.c3.lanl.gov by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id KAA07966; Fri, 13 Oct 1995 10:45:28 -0400 Received: (krk@localhost) by zazu.c3.lanl.gov (8.6.10/c93112801) id IAA01168; Fri, 13 Oct 1995 08:44:40 -0600 Date: Fri, 13 Oct 1995 08:44:40 -0600 From: Kenneth R Koch Message-Id: <199510131444.IAA01168@zazu.c3.lanl.gov> To: worley@haven.EPM.ORNL.GOV Subject: Re: t3d resutls Cc: parkbench-comm@CS.UTK.EDU

Pat Worley writes:

> > > Ken,
> > >
> > > Unfortunately, it appears that the T3D would pose great problems for PARKBENCH utilizing PVM3.3.8. Hopefully, when the MPI codes are released, we'll be able to get a much easier view.
> > > It turns out that NAS is producing their own vanilla MPI benchmark versions that we will be including within PARKBENCH--this is very fortunate.

(this was from Mike Kolatis, UTK)

> > So ParkBench is not suitable for the T3D (nor the T3E when it comes out). Will the MPI version really address the T3D porting issues we have uncovered? I don't think so unless there is a specific effort. Does anyone else other than me see that not supporting the T3D (a major MPP) is a major short coming of ParkBench?

(I wrote this)

> ...
>
> What is the background for this message? What is unfair about using PVM3.3.8? I have run PSTSWM (one of the compact application codes) with both PVM and SHMEM on the T3D with no problems, and SHMEM is faster. Is this the issue?
>
> Pat Worley

It turns out that most of the Parkbench codes are not portable to the T3D for 3 major reasons. It would take a concerted effort by Parkbench code contributors (or CRI) to deal with the CRI compiler and T3D specific obstacles, but you are absolutely correct in that it can be done. A clean port to a C-90 would be a good logical first step.

>From an earlier mail message to Jack Dongarra, Ken Koch writes:

> LANL has some experiences and results to report for the T3D thanks to Sowmini Varadhan's (UTK-Berry) great efforts here this summer. The CRI special Fortran-to-C linkage (especially character strings in PBLAS & BLACS), the T3D non-hosted PVM mode, the T3D 64bit word size, and the gettimeofday() system timer created problems for the ParkBench codes as distributed. We were only able to run the NAS set, and we couldn't get CG to run at all because of insufficient info about setting parameters (we even called NAS). We had tons of trouble with the ScalaPack codes, especially the BLACS PVM implementation and thus never got them working at all. The low-level benchmarks also had PVM problems for the T3D and were being rewritten anyway so we didn't run them.
> Finally, our ParkBench NAS results aren't good; some are uniformly 4 to even 20 times slower than CRI's reported NAS results, and the rest show slowdown trends, not speedups! We ran out of time before we could track down what was really happening. Bottom line is that T3D ports of ParkBench are a mess right now. And I would guess that little would change for the T3E.

So...

>From a very recent mail message to Jack Dongarra, Ken Koch writes:

> The T3D port now looks to be mostly CRI software and T3D specific problems. And CRI is clearly the problem because of their screwy non-portable language and Unix support and the T3D data types. At this point in time, I fully support ParkBench NOT supporting this machine. There were 3 major stumbling areas on the T3D:
>
> 1) CRI T3D PVM version is 3.3.5 or 3.3.7 EXCEPT that some functionality just isn't there. Also, the T3D PVM implementation has lots of additional restrictions on using PVMDATAINPLACE that are definitely not standard to PVM. So in general you just have to back off to PVMDATARAW.
>
> 2) Fortran to C linkage of character strings. The unique CRI way requires a special "character descriptor" data type and some auxiliary routines to handle Fortran character strings passed into C routines. They use this on their entire product line, so it isn't just T3D specific. This is really only a problem with the LINALG set because of its extensive use of character string args for the PBLAS and BLACS routines. Those packages as provided in ParkBench won't even work on a YMP or C-90 because of this; and I think this is a larger issue you may want to address outside of ParkBench. Ps: It also isn't just an easy matter of using CRI's provided BLACS as they have a different API! I thought BLACS was to be a standard; what gives?
>
> 3) The T3D only supports 64 bit integers and 64 bit reals under the standard Fortran 77 (cft77) environment. This has repercussions in many places, one of which is their special INTEGER8 PVM datatype. Now the new Fortran 90 (cf90) on the T3D does allow 32 bit reals if explicitly declared real*4, but it still defaults to 64 bit integers - nice huh? It may support integer*4 declarations as well, but I don't recall. Most T3D sites won't have the F90 environment. So this is a thorny area.

Parkbench needs to address the SHMEM, NXLIB, MPL, ... issues. Are they or will they be legal substitutes for PVM? What are the expected "run rules"? SHMEM is faster than PVM on the T3D, but it isn't just a drop in. For one, the synchronization issues are completely different with a one-sided communication mechanism. So you typically end up with an #ifdef code structure, and several vendors have complained that you can NOT use CPP with Fortran code. But if you are careful you can! So then you start getting a large multi-flavor support issue and Parkbench should probably address this issue soon as well.

Maybe we can decide what to do with the T3D issues at the 11/3 meeting. There is also a significant CM5 issue (no vector units!).
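
To make the "#ifdef code structure" mentioned above concrete, here is a minimal C sketch of one transfer routine with a compile-time choice of communication layer. It is an illustration, not ParkBench code: the wrapper name `pb_send_doubles` and the macros `PB_USE_PVM` and `PB_USE_SHMEM` are hypothetical, the SHMEM branch assumes the modern OpenSHMEM call names rather than the 1995 Cray T3D library, and the thread itself concerns Fortran sources rather than C. With neither macro defined, a plain local copy stands in so the sketch compiles on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(PB_USE_PVM)
#include <pvm3.h>      /* PVM 3 message passing */
#elif defined(PB_USE_SHMEM)
#include <shmem.h>     /* assumption: OpenSHMEM API, not the original T3D SHMEM */
#endif

/*
 * Hypothetical wrapper: move n doubles from src to the buffer dst on `peer`.
 * Returns 0 on success, -1 on failure.
 */
static int pb_send_doubles(double *dst, const double *src, size_t n,
                           int peer, int tag)
{
    assert(dst != NULL && src != NULL);
#if defined(PB_USE_PVM)
    /* Two-sided: the receiver must post a matching pvm_recv/pvm_upkdouble. */
    (void)dst;
    if (pvm_initsend(PvmDataRaw) < 0) return -1;   /* raw encoding, as in the thread */
    if (pvm_pkdouble((double *)src, (int)n, 1) < 0) return -1;
    return pvm_send(peer, tag) < 0 ? -1 : 0;
#elif defined(PB_USE_SHMEM)
    /* One-sided: dst must be a symmetric address; no receive call exists. */
    (void)tag;
    shmem_double_put(dst, src, n, peer);
    shmem_quiet();     /* completes this PE's puts; the target still needs
                          its own synchronization before reading dst */
    return 0;
#else
    /* Fallback for illustration only: a local copy, no communication. */
    (void)peer;
    (void)tag;
    memcpy(dst, src, n * sizeof *src);
    return 0;
#endif
}
```

The two real branches differ in more than the call names: the PVM path needs a cooperating receive, while the SHMEM path needs separate synchronization on the target, which is the "not just a drop in" point made above.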
--Ken Koch, CIC-19, LANL

From owner-parkbench-comm@CS.UTK.EDU Fri Oct 13 17:52:41 1995 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id RAA16694; Fri, 13 Oct 1995 17:52:41 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA28468; Fri, 13 Oct 1995 17:51:27 -0400 Received: from timbuk.cray.com by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id RAA28459; Fri, 13 Oct 1995 17:51:23 -0400 Received: from ferrari.cray.com (root@ferrari.cray.com [128.162.173.1]) by timbuk.cray.com (8.6.12/CRI-gate-8-2.5) with SMTP id QAA26074 for ; Fri, 13 Oct 1995 16:51:21 -0500 Received: from lotus04.cray.com by ferrari.cray.com (5.0/CRI-5.14) id AA12936; Fri, 13 Oct 1995 16:53:15 -0500 Received: by lotus04.cray.com (5.0/SMI-SVR4) id AA05298; Fri, 13 Oct 1995 16:49:13 -0500 From: cmg@ferrari.cray.com (Charles Grassl) Message-Id: <9510132149.AA05298@lotus04.cray.com> Subject: Parkbench results To: parkbench-comm@CS.UTK.EDU Date: Fri, 13 Oct 1995 16:49:12 -0500 (CDT) X-Mailer: ELM [version 2.4 PL24-CRI-b] Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit

To all;

We are getting off track with these benchmarks. The intention of Parkbench is not to test conformance or standards. Rather, we would like to measure various performance aspects of MPP systems.

If we are tacitly intending to test PVM, then we should write a genuine test for the entire PVM library or all aspects of Fortran. Does such a test suite exist? If it does not, then this would be a worthwhile task.

If we are intending to measure specific performance features, then the tests should be kept as simple and succinct as possible.

Charles Grassl
Cray Research, Inc.
From owner-parkbench-comm@CS.UTK.EDU Tue Oct 17 17:13:47 1995 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id RAA20089; Tue, 17 Oct 1995 17:13:47 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA07671; Tue, 17 Oct 1995 17:10:27 -0400 Received: from mordillo.npac.syr.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id RAA07633; Tue, 17 Oct 1995 17:10:16 -0400 Received: (from mab@localhost) by mordillo.npac.syr.edu (940816.SGI.8.6.9/8.6.6) id RAA14529 for parkbench-comm@CS.UTK.EDU; Tue, 17 Oct 1995 17:09:57 -0400 Date: Tue, 17 Oct 1995 17:09:57 -0400 Message-Id: <199510172109.RAA14529@mordillo.npac.syr.edu> From: Mark Baker Subject: Re: Parkbench results To: parkbench-comm@CS.UTK.EDU In-Reply-To: Charles Grassl's message of Fri, 13 Oct 1995 16:49:12 -0500 (CDT) Organisation: NPAC, Syracuse University, New York 13244-4100, USA. Phone: +1 315 443 2083 fax: +1 315 443 1973 X-Mailer: Sendmail/Ream version 5.1.51 Fcc: +M.A.Baker

RE: Reply to Charles Grassl email.

> We are getting off track with these benchmarks. The intention of Parkbench is not to test conformance or standards. Rather, we would like to measure various performance aspects of MPP systems.

This is very true.

> If we are tacitly intending to test PVM, then we should write a genuine test for the entire PVM library or all aspects of Fortran. Does such a test suite exist? If it does not, then this would be a worthwhile task.

I think a suite for testing PVM is maybe appropriate, but surely not under PARKBENCH!

> If we are intending to measure specific performance features, then the tests should be kept as simple and succinct as possible.

This was the original aim, but things seem to have been sidetracked recently. Hopefully, we can sort this and other issues at the next PARKBENCH meeting in Knoxville at the beginning of November.

Mark

>
> Charles Grassl
> Cray Research, Inc.
>
_____________________________________________________________________________
Dr Mark Baker
NPAC, Syracuse University        Tel: +1 315 443 2083
111 College Place                Fax: +1 315 443 1973
Syracuse                         Email: mab@npac.syr.edu
New York 13244-4100, USA         WWW URL: http://www.npac.syr.edu/
_____________________________________________________________________________

From owner-parkbench-comm@CS.UTK.EDU Thu Oct 26 09:35:21 1995 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id JAA22376; Thu, 26 Oct 1995 09:35:21 -0400 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA15803; Thu, 26 Oct 1995 09:33:44 -0400 Received: from dasher.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id JAA15790; Thu, 26 Oct 1995 09:33:37 -0400 From: Jack Dongarra Received: by dasher.cs.utk.edu (cf v2.11c-UTK) id JAA09450; Thu, 26 Oct 1995 09:33:35 -0400 Date: Thu, 26 Oct 1995 09:33:35 -0400 Message-Id: <199510261333.JAA09450@dasher.cs.utk.edu> To: parkbench-comm@CS.UTK.EDU Subject: parkbench meeting on Nov 3

The ParkBench (Parallel Benchmark Working Group) will meet in Knoxville, Tennessee on November 3rd, 1995. Please let me know if you are planning to attend. Details below.

Jack

The meeting site will be the Knoxville Downtown Hilton Hotel. We have made arrangements with the Hilton Hotel in Knoxville.

Hilton Hotel
501 W. Church Street
Knoxville, TN
Phone: 615-523-2300

When making arrangements tell the hotel you are associated with the Parallel Benchmarking meeting. The rate is about $68.00/night. You can download a postscript map of the area by looking at my homepage: http://www.netlib.org/utk/people/JackDongarra.html.

You can rent a car or get a cab from the airport to the hotel.

We should plan to start at 9:00 am November 3rd and finish about 5:00 pm. If you will be attending the meeting please send me email so we can better arrange for the meeting.
The format of the meeting is:

Friday November 3rd
 9:00 - 12.00  Full group meeting
12.00 -  1.30  Lunch
 1.30 -  5.00  Full group meeting

Tentative agenda for the meeting:

1. Minutes of last meeting
2. Reports and discussion from subgroups
3. Examine the results obtained so far
4. Electronic journal of benchmark results
5. Open discussion on the Supercomputer 95 activity
6. Date and venue for next meeting

From owner-parkbench-comm@CS.UTK.EDU Sat Nov 11 23:12:34 1995 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id XAA26878; Sat, 11 Nov 1995 23:12:34 -0500 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id XAA17376; Sat, 11 Nov 1995 23:10:51 -0500 Received: from berry.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id XAA17369; Sat, 11 Nov 1995 23:10:48 -0500 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (cf v2.11c-UTK) id XAA02398; Sat, 11 Nov 1995 23:09:34 -0500 Message-Id: <199511120409.XAA02398@berry.cs.utk.edu> to: parkbench-comm@CS.UTK.EDU Subject: Minutes from 11-3-95 Date: Sat, 11 Nov 1995 23:09:33 -0500 From: "Michael W. Berry"

-----------------------------------------------------------------
Minutes of Parkbench Meeting - Knoxville Hilton, November 3, 1995
-----------------------------------------------------------------

Attendee List:

Mark Baker         Syracuse University    mab@npac.syr.edu
Michael Berry      Univ. of Tennessee     berry@cs.utk.edu
Jack Dongarra      Univ. of Tenn./ORNL    dongarra@cs.utk.edu
Tony Hey           Univ. of Southampton   ajgh@ecs.soton.ac.uk
Edgar Kalns        IBM                    kalns@vnet.ibm.com
Kenneth Koch       Los Alamos Nat'l Lab   krk@lanl.gov
Mike Kolatis       Univ. of Tennessee     kolatis@cs.utk.edu
David MacKay       Intel SSD              mackay@ssd.intel.com
Ron Sercely        Convex                 sercely@convex.com
Fiona Sim          IBM                    fsim@vnet.ibm.com
Erich Strohmaier   Univ. of Tennessee     erich@cs.utk.edu
Chegu Vinod        Convex                 vinod@bach.convex.com
David Walker       Oak Ridge Nat'l Lab    walker@msr.epm.ornl.gov
Pat Worley         Oak Ridge Nat'l Lab    worleyph@ornl.gov

At 9:07am EST, Tony H. opened the meeting and itemized the agenda items for the day:

1. Minutes
2. Results
3. Electronic Benchmarking Journal
4. SC'95
5. Run Rules
6. Policy
7. AOB (Any Other Business)

Minutes from the March 1995 meeting were accepted with a minor correction to indicate that QCD is in fact not in the first release of PARKBENCH. The 'Action items' from the May video meeting between UTK and Southampton were then reviewed. Jack D. suggested that the most efficient version of PVM be used on SP2.

At 9:25am, Tony H. reviewed the Results & Codes. We should have a release number on the codes. Mark B. was asked to comment on low-level codes. They are available in PVM version (latest version). Results are available on most systems (SPMD). Codes have been debugged as of a month ago. Fiona S. was concerned that results may not be "acceptable" - run rules are not clear. Jack D. suggested that a file be provided for each code to specify input parameters/files and that an output file could be sent to PARKBENCH for validation. Should IBM be constrained to publish PVM-E results or can MPL results be used? Tony H. asked that Mark B. work on making sure low-level run rules are available. Jack suggested that a subcommittee be formed to create a formal set of run rules for PARKBENCH codes. Outcome could be reported electronically to resolve them and announce them at SC'95. A review of the results needs to be specified.

Ken K. asked who will be assigned to run the codes? Tony H. suggested (like Genesis) that vendors and academics can both run codes and produce results. There was a general discussion on how to report results but this was curtailed for later discussion during the day. Jack D. suggested that users can use vendor-supplied PVM not generic PVM (T3D has several versions of PVM). Tony H.
concurred with this philosophy. The issue is really how the codes are invoked, not really the performance of the measured tasks. Ron S. reiterated that vendors would be allowed to replace/modify part of code related to task creation/management. Tony H. suggested that a subcommittee should be created to resolve such issues.

Tony H. then asked Mike K. to summarize the status of the kernels:

Lowlevel:

  Single processor
    TICK1, TICK2 (interactive), RINF1, POLY1, POLY2 (non-PVM, f77)
    gettimeofday - used for timing (wall-clock time is the requirement); timer calls are isolated for replacement by other timers. Tony H. indicated that users should be allowed to replace timers and this can be noted in run rules.

  Multiple processor
    COMMS1 COMMS2 COMMS3 POLY3 SYNCH1

  Machines run on: CM5(UTK,LANL), Convex, SP2 (Ames), CS2, T3D, Paragon(ORNL), Sun4, DEC Alpha, IBM RS/6000, HP, SGI.

  ** No results on KSR machines **

Linear Algebra Kernels:

  Versions w/ PVM-BLACS, MPI-BLACS. Some PVM versions spawn tasks which cause problems on machines like CM5, T3D. T3D has problems w/ BLACS (SCALAPACK group looking into these problems). PVM 3.3.7, 3.3.8 are available. Version 3.3.10 was just released. Need to specify a minimal PVM version (3.3.6?). Tony H. requested that a recommendation be made for users. Mike K. suggested version 3.3.8 should be the base version.

NAS (based on PVM)

  C codes (ala V. Sunderam at Emory Univ. and his colleagues): EP, IS, MG (recursive memory calls)

  Ames will provide a MPI-f77 version to PARKBENCH soon - so group must decide which version(s) to support in the suite. These will be NAS Parallel Benchmarks version 1.5 (in MPI).

  f77 codes: CG (memory concerns), FT

Compact Applications: (4 CA's with PVM and MPI)

  installed -> NPB (MPI,PVM) - 3 codes (LU,SP,BT)
               ARCO (PVM) -- "future directory"
               POLMP(f77,PVM,HPF,PARMACS) -- "future directory"
  installed -> SHALLOW (f77,PVM,MPI)
               HPF (in a separate directory)

  *** dropped SOLVER, 1-DFFT, GAMESS ***

All levels of benchmarks will be available in PVM and MPI: 2 sources for each code. Jack D. suggested that we should have 2 files - 1 per version. Pat W. mentioned that "wrappers" could be used as an alternative. Ken K. pointed out that real application writers use wrappers. No final decision on wrappers was made at this time.

---------------------------------------
At 11:15am, group took a 5-minute break.
---------------------------------------

Tony H. summarized status afterwards at 11:30am:

  o low-level codes (PVM,MPI-soon)
      Results on SP,Cray,Convex,Paragon,SGI
  o Kernels (MPI,PVM soon)
      LinAlg
      NAS
  o Compact Applications (MPI,PVM soon)
      LU,SP,BT :=NAS
      SHALLOW

  *** MPI versions are the primary focus with optional PVM versions available ***

Ken K. was then asked to present his results on the T3D at LANL. He pointed out the differences between MPP PVM and NOW PVM. For example, no spawns are allowed, and pvm_parent() is useless to determine the master. Could not get LinAlg to work on T3D - complicated call tree for benchmarks. Fortran-to-C char strings on CRI machines must use special FCD type. BLACS are different on CRI machines. For NAS/EP, the LANL results with Problems A and B were consistently about 17 times slower than the NAS reported results. For NAS/MG the results do not scale well. For the small number of processors, the results were more comparable with the NAS report. NAS CG could not be run due to memory requirements - it appeared to be designed for NOW's and not MPP's with respect to memory. For NAS/FT, the LANL results were about 5 times slower. For NAS/IS, when gettimeofday was replaced with "rtclock()" the results were more comparable (remove system calls for timer).
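
The minutes above ask for wall-clock timing with the timer calls isolated so that another timer can be swapped in. A minimal C sketch of that idea follows. It is an illustration, not the ParkBench source: the names `pb_wall_seconds`, `pb_time_kernel`, `pb_example_kernel` and the macro `PB_TIMER_CLOCK_GETTIME` are hypothetical, and the alternative branch uses POSIX `clock_gettime` as a stand-in, since the signature of the Cray `rtclock()` routine named in the minutes is not given there.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#if defined(PB_TIMER_CLOCK_GETTIME)
#include <time.h>
#endif

/* Hypothetical single entry point for wall-clock time, in seconds.
 * Benchmarks call only this function, so replacing the timer means
 * editing one place. */
static double pb_wall_seconds(void)
{
#if defined(PB_TIMER_CLOCK_GETTIME)
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + 1.0e-9 * (double)ts.tv_nsec;
#else
    struct timeval tv;
    int rc = gettimeofday(&tv, NULL);   /* the default timer named in the minutes */
    assert(rc == 0);
    (void)rc;
    return (double)tv.tv_sec + 1.0e-6 * (double)tv.tv_usec;
#endif
}

/* Average wall-clock time of one call to `kernel`, over `reps` calls.
 * Repeating the kernel amortizes the cost of the timer call itself. */
static double pb_time_kernel(void (*kernel)(void), int reps)
{
    assert(kernel != NULL && reps > 0);
    double t0 = pb_wall_seconds();
    for (int i = 0; i < reps; ++i)
        kernel();
    return (pb_wall_seconds() - t0) / reps;
}

/* Tiny stand-in workload so the sketch can be exercised. */
static void pb_example_kernel(void)
{
    volatile double s = 0.0;
    for (int i = 0; i < 1000; ++i)
        s += (double)i;
}
```

Timing many repetitions between two timer reads, rather than reading the timer around every call, is one way to reduce the effect of an expensive system-call timer of the kind reported for NAS/IS above.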
Ken K. then summarized issues for refining PARKBENCH's version of NAS for running on T3D. Results on T3D were obtained by Sowmini Varadhan (PhD student from UT interning at LANL this past summer).

At 11:50am, Tony H. then asked for a general discussion on results & codes from attendees. Ron S. shared similar concerns about NOW-based design of codes. Mapping of memory per processor configuration is needed (Ken K. concurred with Convex on this). Fiona S. was very concerned about the differences between Ken's results on the NAS codes and those reported in the NAS report. Fiona S. indicated that it is difficult to assess what is a "real" indication of performance and that if the vanilla starting point was not a reasonable one, not only would it discredit the vendor's results, but also the NAS PB themselves. Fiona S. also pointed out the comparison was not made with the codes which are going into the Parkbench suite. There is a new set of MPI based codes from NASA, and no data has been collected from those codes. Tony H. stressed that the database be available at SC'95.

-----------------------------------------------------------
At 12:07pm EST, Tony H. suggested the group break for lunch.
-----------------------------------------------------------

At 1:25pm EST, the group reassembled and Tony stressed the current

*** PARKBENCH ACTION ITEMS ***

o Release 1.0 of PARKBENCH codes and results by December 3rd.

1. PVM spawning/MPI run problems (Jack D. + David W./Erich S. have agreed to address this problem - Jack D. takes lead)

2. Low-level codes: have PVM and will convert to MPI soon (Erich S. + Mike K./Mark B. will handle this conversion and isolate timers in codes).

3. Kernels: BLACS problems related to T3D and will get NAS NPB 1.5 MPI version and address size problems (e.g., NAS CG) and remove NOW-based bias; a README file to specify memory requirements per processor configuration (Jack D. + Erich S./David B. will address the size problems and README files; don't worry about the Class C problems for now)

4. Compact Applications; NAS 3 applications + SHALLOW with MPI and PVM optional versions soon (David W. + Pat W. / David B. + Erich S. will address these codes)

5. Must decide the "Run Rules" for the codes (Mike B. + Jack D. will take SPEC document, modify it and circulate with vendor input) - generally specify sizes and that specific details for the parameters are in the README files per codes. Fiona S. suggested that the problem sizes scale. Reference to the individual codes. Will put NAS (MPI) results from David Bailey; this release will have a mixture of PVM and MPI-based codes. Specifically, here's the first release:

   ------------------------------------
   | PARKBENCH RELEASE 0.9 (Nov. 10)  |
   ------------------------------------
   | Low-level - PVM versions only    |
   | Kernel    - LinAlg (MPI,PVM)     |
   |             NAS (MPI)            |
   | Compact                          |
   | Applics   - NAS (MPI)            |
   |             SHALLOW (MPI,PVM)    |
   ------------------------------------

   Note: RELEASE 1.0 at SC'95 will fill 'holes' + spawning rewrites

6. Results: Release 0.9 (codes) ... as of Nov. 10
   Run Rules
   Current Databases:

              MPI   PVM   Vendor-Opt
              ---   ----  ----------
   NAS
   SHALLOW

---------------------------------------------------------------------------
Machine assignments:

SP2    : NASA runs NPB's, Southampton runs low-level, Daresbury?
Convex : ?Kentucky, ?Convex, ?NCSA, Ken K.
SGI    : Jack D. will ask Mark Brown at SGI, Tony H. will see about Power Challenge results (moderate number of processors), Ken K. will contact Paul Woodward at Univ. of Minnesota.
Intel  : ORNL's Paragon (Pat W./David W.)
Cray   : Edinburgh's 256-processor system - Tony H. to get MPI results
---------------------------------------------------------------------------

Source of current and future results:

          SP   Convex  SGI  Intel  Cray  IBM
          --   ------  ---  -----  ----  ----
Low-Lvl   x  x  x  Tony H.
LinAlg    IBM? or NAS  Mike K.  Daresbury
NAS Ker.  NAS/MPI  NAS/MPI  Tony H/MPI
NAS App.  NAS  NAS  Tony H/MPI
SHALLOW   Pat W.  Pat W.  Pat W.

Second tier machines: Meiko CS2, Parsytec, CM-5, KSR, DEC Turbolaser (8600 server)

At 2:30pm EST, Jack D. announced that there will be a joint SC'95 BOF session on Tuesday (12/05) at 5:15 pm in Room 8 with SPEC-HPSC. Jack then passed around David Snelling's draft of a joint code policy with SPEC-HPSC. Fiona S. suggested that there should be another PARKBENCH meeting at SC'95 to discuss results and sensitive information. Myron Ginsberg's BOF session is at 6:30pm. Jack D. and Tony H. felt the second Parkbench meeting should be closed-door and can be at 6pm on Wednesday, December 6.

Tony H. then mentioned the existence of an Electronic Benchmarking Journal in Europe and US with Jack D. and Tony H. as co-editors. Associate editors are being solicited and the journal would be scheduled to begin in January '96. Suggestions for a title are welcome.

Tony H. will try to modify the current Joint Code Policy and a motion to update the Parkbench subgroups was made. Jack D. and Tony H. agreed that the PARKBENCH database can be mirrored in the UK (perhaps at Southampton).

-------------------------------------------
Tony H. adjourned the meeting at 2:55pm EST.
------------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Fri Dec 1 11:21:31 1995 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id LAA12191; Fri, 1 Dec 1995 11:21:31 -0500 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA09848; Fri, 1 Dec 1995 11:18:08 -0500 Received: from beech.soton.ac.uk by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id LAA09840; Fri, 1 Dec 1995 11:18:01 -0500 Received: from bright.ecs.soton.ac.uk (bright.ecs.soton.ac.uk [152.78.64.201]) by beech.soton.ac.uk (8.6.12/hub-8.5a) with SMTP id OAA22343 for ; Fri, 1 Dec 1995 14:41:11 GMT Received: from landlord.ecs.soton.ac.uk by bright.ecs.soton.ac.uk; Fri, 1 Dec 95 14:42:30 GMT From: Prof Roger Hockney Received: from caesar.ecs.soton.ac.uk by landlord.ecs.soton.ac.uk; Fri, 1 Dec 95 14:41:42 GMT Date: Fri, 1 Dec 95 14:41:28 GMT Message-Id: <18281.9512011441@caesar.ecs.soton.ac.uk> To: parkbench-comm@CS.UTK.EDU Subject: New book on Parkbench I hope members of the committee will forgive me using their e-mail for a little announcement, but I think it will be of interest to most of you: --------------------------- ________________________________________________________ I have recently published a book with SIAM, entitled: "The Science of Computer Benchmarking" Roger W. Hockney ISBN 0-89871-363-3 Available at the SIAM stand, Supercomputing95, San Diego ________________________________________________________ Published November 1995, it consists of 129 pages and is a softcover volume at US$ 21.25. Those of you interested in computer benchmarking and performance analysis should find the book valuable. It is a tutorial exposition of the methodology and low-level benchmarks of the Parkbench committee's report on parallel computer benchmarking, together with the dimensionless theory of scaling and the graphical presentation of results. It is suitable as a teaching text for tutorials, advanced undergraduate and MSc courses. 
The chapter headings are: Chapter-1: "Introduction" - survey of Parkbench committee and other benchmarking activities, and the usefulness of benchmarking. Chapter-2: "Methodology" - units, symbols and performance metrics with examples. Critique of Speedup. Chapter-3: "Low-level Parameters and Benchmarks" - tutorial definition of the r-infinity and n-half performance parameters, and the benchmarks to measure them. Chapter-4: "Computational Similarity and Scaling" - dimensionless theory of scaling with the principle of "Computational Similarity". Chapter-5: "Presentation of Results" - The Univ. of Tennessee's "Performance Database Server" and the Univ. of Southampton's "Graphical Benchmark Information Service". Prepayment is required and shipping charge will apply. Please contact SIAM for further ordering information: service@siam.org Or the author regarding the book itself: Roger W. Hockney (Professor Emeritus, Reading University, UK) (Visiting Professor, Southampton University,UK) e-mail: rwh@ecs.soton.ac.uk Ordinary mail: 4 Whitewalls Close, Compton, Newbury, England, UK. Telephone: +44 (1635) 578 679 (also fax after speaking). 
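As an illustration of the r-infinity and n-half parameters named in the Chapter-3 heading above, here is a minimal sketch. It is not taken from the book or from the Parkbench codes. It assumes the usual linear timing model t(n) = (n + n_half) / r_inf, where r_inf is the asymptotic rate and n_half is the length at which half of that rate is reached. The function name fit_rinf_nhalf and the sample data are hypothetical.

```python
def fit_rinf_nhalf(lengths, times):
    """Least-squares fit of t = t0 + n / r_inf to (length, time) samples.

    Returns (r_inf, n_half), where n_half = t0 * r_inf.
    The name and interface are illustrative assumptions, not Parkbench code.
    """
    m = len(lengths)
    if m < 2 or m != len(times):
        raise ValueError("need at least two matching (length, time) samples")
    mean_n = sum(lengths) / m
    mean_t = sum(times) / m
    sxx = sum((n - mean_n) ** 2 for n in lengths)
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in zip(lengths, times))
    if sxx == 0:
        raise ValueError("lengths must not all be equal")
    slope = sxy / sxx              # time per element, i.e. 1 / r_inf
    if slope <= 0:
        raise ValueError("times must grow with length for this model")
    t0 = mean_t - slope * mean_n   # startup time (intercept)
    r_inf = 1.0 / slope
    n_half = t0 * r_inf
    return r_inf, n_half


if __name__ == "__main__":
    # Synthetic timings generated from the model itself (hypothetical values).
    sample_lengths = [10, 100, 1000, 10000]
    sample_times = [(n + 50.0) / 200.0 for n in sample_lengths]
    print(fit_rinf_nhalf(sample_lengths, sample_times))
```

With measured rather than synthetic timings the fit will only be approximate, and the straight-line model may not hold over the full range of lengths.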
________________________________________________________ From owner-parkbench-comm@CS.UTK.EDU Wed Mar 20 12:58:21 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id MAA26002; Wed, 20 Mar 1996 12:58:20 -0500 Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA09632; Wed, 20 Mar 1996 12:58:44 -0500 Received: from dasher.cs.utk.edu by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id MAA09620; Wed, 20 Mar 1996 12:58:42 -0500 From: Jack Dongarra Received: by dasher.cs.utk.edu (cf v2.11c-UTK) id MAA17873; Wed, 20 Mar 1996 12:58:40 -0500 Date: Wed, 20 Mar 1996 12:58:40 -0500 Message-Id: <199603201758.MAA17873@dasher.cs.utk.edu> To: parkbench-comm@CS.UTK.EDU Subject: Parkbench meeting in April Dear Colleague, The ParkBench (Parallel Benchmark Working Group) will meet in Knoxville, Tennessee on April 26th, 1996. The meeting site will be the Knoxville Downtown Hilton Hotel. We have made arrangements with the Hilton Hotel in Knoxville. Hilton Hotel 501 W. Church Street Knoxville, TN Phone: 423-523-2300 When making arrangements, tell the hotel you are associated with the Parallel Benchmarking or Parkbench or Park. The rate is about $75.00/night. You can download a postscript map of the area by looking at my homepage: http://www.netlib.org/utk/people/JackDongarra.html. You can rent a car or get a cab from the airport to the hotel. We should plan to start at 9:00 am April 26th and finish about 5:00 pm. If you will be attending the meeting please send me email so we can better arrange for the meeting. The format of the meeting is: Friday April 26th 9:00 - 12.00 Full group meeting 12.00 - 1.30 Lunch 1.30 - 5.00 Full group meeting Tentative agenda for the meeting: 1. Minutes of last meeting 2. Reports and discussion from subgroups 3. Examine the results obtained so far 4. Electronic journal of benchmark results 5. Date and venue for next meeting The objectives for the group are: 1.
To establish a comprehensive set of parallel benchmarks that is generally accepted by both users and vendors of parallel system. 2. To provide a focus for parallel benchmark activities and avoid unnecessary duplication of effort and proliferation of benchmarks. 3. To set standards for benchmarking methodology and result-reporting together with a control database/repository for both the benchmarks and the results. The following mailing lists have been set up. parkbench-comm@cs.utk.edu Whole committee parkbench-lowlevel@cs.utk.edu Low level subcommittee parkbench-compactapp@cs.utk.edu Compact applications subcommittee parkbench-method@cs.utk.edu Methodology subcommittee parkbench-kernel@cs.utk.edu Kernel subcommittee Jack Dongarra From owner-parkbench-comm@CS.UTK.EDU Mon Apr 1 19:14:42 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id TAA19648; Mon, 1 Apr 1996 19:14:42 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id TAA23088; Mon, 1 Apr 1996 19:15:14 -0500 Received: from igate1.hac.com (igate1.HAC.COM [192.48.33.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id TAA23058; Mon, 1 Apr 1996 19:15:06 -0500 From: Received: from ises01.ES.HAC.COM ([147.16.5.2]) by igate1.hac.com (4.1/SMI-4.1) id AA12280; Mon, 1 Apr 96 16:12:03 PST Received: by ises01.ES.HAC.COM; id AA01757; Mon, 1 Apr 1996 16:14:24 -0800 Received: from cc:Mail by CCGATE.HAC.COM id AA828403436; Mon, 01 Apr 96 15:52:13 PST Date: Mon, 01 Apr 96 15:52:13 PST Encoding: 15 Text Message-Id: <9603018284.AA828403436@CCGATE.HAC.COM> To: pbwg-comm@CS.UTK.EDU Subject: Parkbench and MPI Dear Parkbench committee, According to the Parkbench home page, "future releases will adopt the proposed MPI interface." Do you know of anyone working on this? Also, has anyone written a version of Parkbench in C? I appreciate your response. 
Sincerely, Chris Reed Hughes Aircraft Company (310) 334-7134 From owner-parkbench-comm@CS.UTK.EDU Mon Apr 1 20:01:17 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id UAA19857; Mon, 1 Apr 1996 20:01:17 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id UAA26371; Mon, 1 Apr 1996 20:02:36 -0500 Received: from blueberry.cs.utk.edu (BLUEBERRY.CS.UTK.EDU [128.169.92.34]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id UAA26364; Mon, 1 Apr 1996 20:02:33 -0500 Received: by blueberry.cs.utk.edu (cf v2.11c-UTK) id BAA03207; Tue, 2 Apr 1996 01:02:13 GMT From: "Erich Strohmaier" Message-Id: <9604012002.ZM3205@blueberry.cs.utk.edu> Date: Mon, 1 Apr 1996 20:02:12 -0500 In-Reply-To: "Parkbench and MPI" (Apr 1, 3:52pm) References: <9603018284.AA828403436@CCGATE.HAC.COM> X-Face: ,v?vp%=2zU8m.23T00H*9+qjCVLwK{V3T{?1^Bua(Ud:|%?@D!~^v^hoA@Z5/*TU[RFq_n'n"}z{qhQ^Q3'Mexsxg0XW>+CbEOca91voac=P/w]>n_nS]V_ZL>XRSYWi:{MzalK9Hb^=B}Y*[x*MOX7R=*V}PI.HG~2 X-Mailer: Z-Mail (3.2.0 26oct94 MediaMail) To: Subject: Re: Parkbench and MPI Cc: pbwg-comm@CS.UTK.EDU Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Dear Chris, A beta version of a new ParkBench release is available at: http://www.cs.utk.edu/~erich/ParkBench.tar.gz or http://www.cs.utk.edu/~erich/ParkBench.tar.Z This includes MPI versions of all benchmarks. We currently do not have any plans on a version written in C. 
Best Regards Erich -- =========================================================================== Erich Strohmaier email: erich@cs.utk.edu Department of Computer Science phone: ++ 1 (423) 974 5886 104 Ayres Hall fax : ++ 1 (423) 974 8296 Knoxville TN, 37996 - USA http://www.cs.utk.edu/~erich/ From owner-parkbench-comm@CS.UTK.EDU Tue May 7 08:17:43 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA06170; Tue, 7 May 1996 08:17:42 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA06858; Tue, 7 May 1996 07:54:09 -0400 Received: from lime.cs.utk.edu (LIME.CS.UTK.EDU [128.169.92.81]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id HAA06852; Tue, 7 May 1996 07:54:06 -0400 Received: from cs.utk.edu by lime.cs.utk.edu with ESMTP (cf v2.11c-UTK) id HAA02075; Tue, 7 May 1996 07:53:11 -0400 Message-Id: <199605071153.HAA02075@lime.cs.utk.edu> to: parkbench-comm@CS.UTK.EDU Subject: Minutes of Last PARKBENCH Meeting Date: Tue, 07 May 1996 07:53:10 -0400 From: "Michael W. Berry" Here are the revised minutes of the last PARKBENCH Meeting. They will be posted to comp.parallel and comp.benchmarks today. Regards, Mike ----------------------------------------------------------------- Minutes of Parkbench Meeting - Knoxville Hilton, April 26, 1996 ----------------------------------------------------------------- Attendee List: (MB) Michael Berry Univ. of Tennessee berry@cs.utk.edu (JD) Jack Dongarra Univ. of Tenn./ORNL dongarra@cs.utk.edu (VG) Vladimir Getov Univ. of Westminister getovv@wmin.ac.uk (MG) Myron Ginsberg EDS/General Motors ginsberg@gmr.com (TH) Tony Hey Univ. of Southampton ajgh@ecs.soton.ac.uk Edgar Kalns IBM kalns@vnet.ibm.com David MacKay Intel SSD mackay@ssd.intel.com (RM) Richard C. Metzger RL/CJCB (Rome Lab) metzgerr@rl.af.mil Phil Mucci Univ. 
of Tennessee mucci@cs.utk.edu (BP) Bodo Parady Sun Microsystems bodo@eng.sun.com (SS) Subhash Saini NASA Ames saini@nas.nasa.gov (ES) Erich Strohmaier Univ. of Tennessee erich@cs.utk.edu (PW) Pat Worley Oak Ridge Nat'l Lab worleyph@ornl.gov [Events after 11:30am] Regarding the Low-Level benchmarks: ----------------------------------- SS showed that variations ranging from 18-30% on an IBM SP/2 have been recorded for the low-level codes. There is a proceedings paper (SC'94) which first noted this. Members felt the PVM daemons may be the suspects here. TH reported that strange results had been obtained for COMMS1 with respect to the MAXLEN parameter and that the MPI and PVM versions of COMMS2,3 may not measure the same operations. TH demonstrated performance variations of COMMS1,2 on the IBM SP/2 and Meiko CS2. The best results were obtained using PSEND and PRECEIVE on contiguous data. It was noted that PSEND/PRECEIVE were not available when the original COMMS1,2 codes were written. TH noted that memory access should be measured as well and that perhaps COMMS3 could be rewritten for consistency with the current COMMS1,2 codes (ES suggested this could be done for the V3.0 Parkbench Release). VG suggested that the group rethink the goals of low-level benchmarking. He proposed 3 goals of this level of benchmarking: (1) Comparisons across different parallel computers, (2) Performance estimation of parallel kernels and applications, and (3) Performance tuning of parallel applications. For single node architectures, VG suggested that scalar performance, vectorization, and memory hierarchies be measured. He pointed out that RINF1, which comprises 17 examples of code fragments/kernels, is 15 years old and was best suited for vector pipeline machines. Therefore, a new low-level benchmark was proposed for evaluation of the memory hierarchy effects.
VG also suggested that vendors be allowed to use their single node sustainable performance kernels (in-house versions) for such benchmarks. Members felt that a new benchmark of memory hierarchies would be very useful in addition to the existing low-level codes. At 12:30pm EDT the group went to lunch at the "Soup Kitchen" in downtown Knoxville. At 1:20pm EDT, the meeting resumed. MB and TH briefly reviewed the Minutes of the November 11, 1995 Parkbench Meeting (in Knoxville). JD pointed out that Cray Research should be using the BLACS provided by Parkbench to avoid the problems observed with task spawning on the Cray T3D. TH and JD indicated that they will investigate the use of SGI machines since little or no results have been obtained on those machines to date. BP announced that SPEC-HPSC will release their first benchmark suite in the Summer of '96 and will have two application codes (SEISMIC (ARCO) and CHEMISTRY). The meeting focus then switched to Compact Applications: -------------------------------------------------------- MG discussed the status and needs of benchmarking in the automotive industry. USCAR released a benchmark suite at the end of August 1995. They typically use 5-6 commercial codes for benchmarks based on older model cars. They have had discussions with Europort and seek their models as well. The particular needs of benchmarking in this area include: (1) Information on the limitation of machines, (2) Crossover points between small-medium-large machines - would like to see results posted on the WWW someday, (3) Assess if machine is general purpose or only good for a specific model, (4) Develop a canonical set of parameters for scalability analysis, (5) Develop application-dependent metrics, and (6) Focus on shared-memory paradigms (message-passing not desirable). Ford has a lot of Cray C-90's and uses them primarily for job streaming. Most auto companies do not do parallelism and depend on commercial Finite Element Model (FEM) codes. 
Typical machines benchmarked include: IBM SP/2, Convex Exemplar, Cray J9-16, and SGI PowerChallenge. Five commonly-used commercial benchmark codes (which are also in the Europort suite) are: MSC/NASTRAN, STARCD, ABACUS, LSDYNA, and PAMCRASH. MG pointed out that parallel commercial FEM codes have not evolved (a small number of processors is typically sufficient). Measurements based on turn-around-time are needed. Assessing the performance of job mixes with increasing numbers of threads is very desirable (throughput). Creating a synthetic job mix is more desirable than running one code on a single dedicated machine. Four areas MG suggested for future benchmarks are: (1) Crashworthiness (large structures, 60K degrees of freedom), (2) Other structures problems (noise vibration, heat), (3) CFD (exterior aerodynamic, internal combustion), (4) Manufacturing processes (metal forming). MG concluded his presentation with a note that a preliminary report on AUTOBENCH was available (copy left with TH/JD). SS then presented some results on HPF-based NAS Parallel Benchmarks (PB). He pointed out the heavy investment at NAS in CMF on the CM-2 and CM-5 (shut down in March 1995). He showed that, for the SP and FT benchmarks, the MPI versions were about 2 times faster than the HPF versions of NAS PB Version 2.0 on the IBM SP/2. For the BT benchmark, the MPI code was 4 times faster than the HPF counterparts. The Portland Group HPF compiler was used on the IBM SP/2 tested. He suggested that the gain in using MPI may not always be cost-effective given the man hours required to obtain such codes. With HPF, one can get a speedup factor of 2 over baseline (16 processor) versions for 40+ processors on the IBM SP/2. Having a library of HPF/MPI codes is a big win given the ease of programming with HPF and good performance of underlying MPI kernels. SS also pointed out that an HPF-based PIC (Particle-In-Cell) code had obtained a speedup of 5 on a 16 processor SGI Power Challenge XL.
SS concluded his presentation by indicating that (1) CMF-to-HPF conversion is easy, (2) Lack of parallel I/O in HPF is a definite problem, (3) Lack of parallel math and scientific libraries in HPF will complicate the CMF-to-HPF conversion (and HPF ports), and (4) HPF compilers are maturing. TH suggested that HPF versions of the Parkbench kernels and compact applications be considered for Release Version 3 of the Parkbench Suite. RM from Rome Lab then proceeded to describe the C3I Parallel Benchmark Suite. He noted that DARPA is supporting efforts in real-time parallel benchmark development (configurable computing). The C3I compact application level codes are just over 1K lines each and can be accessed on the WWW at http://www.se.rl.af.mil:8001. Applications involve SAR processing, multiple hypothesis tracking, and terrain meshing. Such codes typically scale down problems to meet constraints with smaller numbers of processors (not necessarily real-time). The real-time benchmark effort is just starting at Rome Lab with an effort mimicking the Parkbench effort (kernels and compact applications). RM proposed a future "real-time annex" to Parkbench. He noted that both functional and temporal specifications will undoubtedly result in new performance metrics to consider. JD indicated his willingness to extract 2 of these codes and integrate them into the next Parkbench release. PW of ORNL then discussed some of the performance results obtained for the shallow water compact application code on an Intel Paragon (up to 512 processors). He indicated that for up to 128 processors, the percentage of time due to communication is not dominating. For over 256 processors, the communication can consume as much as 30% for large problems and 60% of the total execution time for small problems. Scaling up problem sizes makes no sense for weather-related codes (i.e., problem sizes are fixed).
PW also mentioned an MPI protocol sensitivity study with emphasis on the collective communication performance of MPI. He compared MPI communication routines versus hand-coded Fortran with optimal NX protocols and showed that the hand-coded version was consistently better, indicating that, in particular, there is room for performance improvement in the MPI collective routines. He also mentioned that similar results hold on the IBM SP2 and on the Cray T3D. TH then listed the compact applications of current and future interest for Parkbench: 1. CFD/Aero (NAS PB) 2. Weather/Climate (Shallow Water, POLMP/IFS) 3. Seismic (ARCO) 4. Autobench (AVL/Fire) 5. C3I (Non Real-Time) 6. C3I (Real-Time) 7. Particle-In-Cell (PIC) code 8. FDMOD At 3:10pm EDT, TH mentioned that the Electronic Benchmarking Journal now has support and suggested that articles on C3I, Autobench, Parkbench, and NAS PB be submitted. It was suggested to wait for Parkbench Release 2.0 to kick off the Electronic Journal. *** Further Actions *** I. Release 2.X + Results Low-level (COMMS2,3) Put results in standard format such as that used in NAS PB. Run-rules Add NAS PB Version 2.2 results (SS will provide). Get SGI results (TH and JD may contact Tony S. at Miss. State). Mirror PDS/GBIS at both US and UK sites. II. Discussion on Release 3 (2.x) Codes at all levels. TH will provide POLMP. BP can provide FDMOD documentation (85% seismic). Roger Hockney will be asked for a PIC code. III. The next Parkbench Meeting is scheduled for late August/early September 1996 in Knoxville. TH adjourned the meeting at 3:23pm EDT.
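The 18-30% variation reported for the low-level codes earlier in these minutes can be quantified in more than one way, and the minutes do not say which definition was used. The following is a minimal sketch, assuming the variation is taken as the spread between the slowest and fastest of several repeated timings, relative to the fastest. The function name relative_spread_percent and the sample timings are hypothetical.

```python
def relative_spread_percent(times):
    """Spread of repeated timings as a percentage of the fastest run.

    Assumed definition: 100 * (max - min) / min. This is one possible
    measure of run-to-run variation, not necessarily the one used for
    the figures quoted in the minutes.
    """
    if not times:
        raise ValueError("need at least one timing")
    fastest = min(times)
    if fastest <= 0:
        raise ValueError("timings must be positive")
    return 100.0 * (max(times) - fastest) / fastest


if __name__ == "__main__":
    # Hypothetical repeated timings (seconds) of one benchmark run.
    print(relative_spread_percent([1.00, 1.10, 1.30]))
```

Taking the fastest run as the reference keeps the measure independent of how many slow outliers occur, which may matter if background daemons are the cause of the variation.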
From owner-parkbench-comm@CS.UTK.EDU Tue May 7 13:39:49 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id NAA12055; Tue, 7 May 1996 13:39:48 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id NAA06674; Tue, 7 May 1996 13:27:44 -0400 Received: from berry.cs.utk.edu (BERRY.CS.UTK.EDU [128.169.94.70]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id NAA06666; Tue, 7 May 1996 13:27:40 -0400 Received: from LOCALHOST.cs.utk.edu by berry.cs.utk.edu with SMTP (cf v2.11c-UTK) id NAA01566; Tue, 7 May 1996 13:24:52 -0400 Message-Id: <199605071724.NAA01566@berry.cs.utk.edu> to: parkbench-comm@CS.UTK.EDU Subject: Revised Minutes Date: Tue, 07 May 1996 13:24:51 -0400 From: "Michael W. Berry" A few more changes... Mike ----------------------------------------------------------------- Minutes of Parkbench Meeting - Knoxville Hilton, April 26, 1996 ----------------------------------------------------------------- Attendee List: (MB) Michael Berry Univ. of Tennessee berry@cs.utk.edu (JD) Jack Dongarra Univ. of Tenn./ORNL dongarra@cs.utk.edu (VG) Vladimir Getov Univ. of Westminister getovv@wmin.ac.uk (MG) Myron Ginsberg EDS/General Motors ginsberg@gmr.com (TH) Tony Hey Univ. of Southampton ajgh@ecs.soton.ac.uk Edgar Kalns IBM kalns@vnet.ibm.com David MacKay Intel SSD mackay@ssd.intel.com (RM) Richard C. Metzger RL/CJCB (Rome Lab) metzgerr@rl.af.mil Phil Mucci Univ. of Tennessee mucci@cs.utk.edu (BP) Bodo Parady Sun Microsystems bodo@eng.sun.com (SS) Subhash Saini NASA Ames saini@nas.nasa.gov (ES) Erich Strohmaier Univ. of Tennessee erich@cs.utk.edu (PW) Pat Worley Oak Ridge Nat'l Lab worleyph@ornl.gov [Events after 11:30am] Regarding the Low-Level benchmarks: ----------------------------------- SS showed that variations ranging from 18-30% on an IBM SP/2 have been recorded for the low-level codes. There is a proceedings paper (SC'94) which first noted this. 
Members felt the PVM daemons may be the suspects here. TH reported that strange results had been obtained for COMMS1 with respect to the MAXLEN parameter and that the MPI and PVM versions of COMMS2,3 may not measure the same operations. TH demonstrated performance variations of COMMS1,2 on the IBM SP/2 and Meiko CS2. The best results were obtained using PSEND and PRECEIVE on contiguous data. It was noted that PSEND/PRECEIVE were not available when the original COMMS1,2 codes were written. TH noted that memory access should be measured as well and that perhaps COMMS3 could be rewritten for consistency with the current COMMS1,2 codes (ES suggested this could be done for the V3.0 Parkbench Release). VG suggested that the group rethink the goals of low-level benchmarking. He proposed 3 goals of this level of benchmarking: (1) Comparisons across different parallel computers, (2) Performance estimation of parallel kernels and applications, and (3) Performance tuning of parallel applications. For single node architectures, VG suggested that scalar performance, vectorization, and memory hierarchies be measured. He pointed out that RINF1, which comprises 17 examples of code fragments/kernels, is 15 years old and was best suited for vector pipeline machines. Therefore, a new low-level benchmark was proposed for evaluation of the memory hierarchy effects. VG also suggested that vendors be allowed to use their single node sustainable performance kernels (in-house versions) for such benchmarks. Members felt that a new benchmark of memory hierarchies would be very useful in addition to the existing low-level codes. At 12:30pm EDT the group went to lunch at the "Soup Kitchen" in downtown Knoxville. At 1:20pm EDT, the meeting resumed. MB and TH briefly reviewed the Minutes of the November 11, 1995 Parkbench Meeting (in Knoxville). JD pointed out that Cray Research should be using the BLACS provided by Parkbench to avoid the problems observed with task spawning on the Cray T3D.
TH and JD indicated that they will investigate the use of SGI machines since little or no results have been obtained on those machines to date. BP announced that SPEC-HPSC will release their first benchmark suite in the Summer of '96 and will have two application codes (SEISMIC (ARCO) and CHEMISTRY). The meeting focus then switched to Compact Applications: -------------------------------------------------------- MG discussed the status and needs of benchmarking in the automotive industry. USCAR released a benchmark suite at the end of August 1995. They typically use 5-6 commercial codes for benchmarks based on older model cars. They have had discussions with Europort and seek their models as well. The particular needs of benchmarking in this area include: (1) Information on the limitation of machines, (2) Crossover points between small-medium-large machines - would like to see results posted on the WWW someday, (3) Assess if machine is general purpose or only good for a specific model, (4) Develop a canonical set of parameters for scalability analysis, (5) Develop application-dependent metrics, and (6) Focus on shared-memory paradigms (message-passing not desirable). Ford has a lot of Cray C-90's and uses them primarily for job streaming. Most auto companies do not do parallelism and depend on commercial Finite Element Model (FEM) codes. Typical machines benchmarked include: IBM SP/2, Convex Exemplar, Cray J9-16, and SGI PowerChallenge. Five commonly-used commercial benchmark codes (which are also in the Europort suite) are: MSC/NASTRAN, STARCD, ABACUS, LSDYNA, and PAMCRASH. MG pointed out that parallel commercial FEM codes have not evolved (a small number of processors is typically sufficient). Measurements based on turn-around-time are needed. Assessing the performance of job mixes with increasing numbers of threads is very desirable (throughput). Creating a synthetic job mix is more desirable than running one code on a single dedicated machine.
Four areas MG suggested for future benchmarks are: (1) Crashworthiness (large structures, 60K degrees of freedom), (2) Other structures problems (noise vibration, heat), (3) CFD (exterior aerodynamic, internal combustion), (4) Manufacturing processes (metal forming). MG concluded his presentation with a note that a preliminary report on AUTOBENCH was available (copy left with TH/JD). SS then presented some results on HPF-based NAS Parallel Benchmarks (PB). He pointed out the heavy investment at NAS in CMF on the CM-2 and CM-5 (shut down in March 1995). He showed that, for the FT benchmark, the MPI version was about 2 times faster than the HPF version of NAS PB Version 2.0 on the IBM SP/2. He also showed that, for the SP benchmark, the MPI version was about 3 times faster than the HPF version of NAS PB Version 2.0 on the IBM SP/2. For the BT benchmark, the MPI code was 4 times faster than the HPF counterparts. The Portland Group HPF compiler was used on the IBM SP/2 tested. He suggested that the gain in using MPI may not always be cost-effective given the man hours required to obtain such codes. With HPF, one can get a speedup factor of 2 over baseline (16 processor) versions for 40+ processors on the IBM SP/2. Having a library of HPF/MPI codes is a big win given the ease of programming with HPF and good performance of underlying MPI kernels. SS also pointed out that an HPF-based PIC (Particle-In-Cell) code had obtained a speedup of 5 on a 16 processor SGI Power Challenge XL. SS concluded his presentation by indicating that (1) CMF-to-HPF conversion is easy, (2) Lack of parallel I/O in HPF is a definite problem, (3) Lack of parallel math and scientific libraries in HPF will complicate the CMF-to-HPF conversion (and HPF ports), and (4) HPF compilers are maturing. TH suggested that HPF versions of the Parkbench kernels and compact applications be considered for Release Version 3 of the Parkbench Suite.
RM from Rome Lab then proceeded to describe the C3I Parallel Benchmark Suite. He noted that DARPA is supporting efforts in real-time parallel benchmark development (configurable computing). The C3I compact application level codes are just over 1K lines each and can be accessed on the WWW at http://www.se.rl.af.mil:8001. Applications involve SAR processing, multiple hypothesis tracking, and terrain meshing. Such codes typically scale down problems to meet constraints with smaller numbers of processors (not necessarily real-time). The real-time benchmark effort is just starting at Rome Lab with an effort mimicking the Parkbench effort (kernels and compact applications). RM proposed a future "real-time annex" to Parkbench. He noted that both functional and temporal specifications will undoubtedly result in new performance metrics to consider. JD indicated his willingness to extract 2 of these codes and integrate them into the next Parkbench release. PW of ORNL then discussed some of the performance results obtained for the shallow water compact application code on an Intel Paragon (up to 512 processors). He indicated that for up to 128 processors, the percentage of time due to communication is not dominating. For over 256 processors, the communication can consume as much as 30% for large problems and 60% of the total execution time for small problems. Scaling up problems sizes makes no sense for weather-related codes (i.e., problem sizes are fixed). PW also mentioned a MPI protocol sensitivity study with emphasis on the collective communication performance of MPI. He compared MPI communication routines versus hand-coded Fortran with optimal NX protocols and showed that the handcoded was consistently better, indicating that, in particular, there is room for performance improvement in the MPI collective routines. He also mentioned that similar results hold on the IBM SP2 and on the Cray T3D. 
TH then listed the compact applications of current and future interest for Parkbench: 1. CFD/Aero (NAS PB) 2. Weather/Climate (Shallow Water, POLMP/IFS) 3. Seismic (ARCO) 4. Autobench (AVL/Fire) 5. C3I (Non Real-Time) 6. C3I (Real-Time) 7. Particle-In-Cell (PIC) code 8. FDMOD At 3:10pm EDT, TH mentioned that the Electronic Benchmarking Journal now has support and suggested that articles on C3I, Autobench, Parkbench, and NAS PB be submitted. It was suggested to wait for Parkbench Release 2.0 to kick off the Electronic Journal. *** Further Actions *** I. Release 2.X + Results Low-level (COMMS2,3) Put results in standard format such as that used in NAS PB. Run-rules Add NAS PB Version 2.2 results (SS will provide). Get SGI results (TH and JD may contact Tony S. at Miss. State). Mirror PDS/GBIS at both US and UK sites. II. Discussion on Release 3 (2.x) Codes at all levels. TH will provide POLMP. BP can provide FDMOD documentation (85% seismic). Roger Hockney will be asked for a PIC code. III. The next Parkbench Meeting is scheduled for late August/early September 1996 in Knoxville. TH adjourned the meeting at 3:23pm EDT. From owner-parkbench-comm@CS.UTK.EDU Fri May 10 09:39:48 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id JAA28581; Fri, 10 May 1996 09:39:48 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA05422; Fri, 10 May 1996 09:28:10 -0400 Received: from venus.SE.RL.AF.MIL (VENUS.SE.RL.AF.MIL [128.132.44.12]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id JAA05406; Fri, 10 May 1996 09:28:03 -0400 Received: from [128.132.44.6] (SLUG.SE.RL.AF.MIL [128.132.44.6]) by venus.SE.RL.AF.MIL (8.7.4/8.7.3) with SMTP id JAA10924; Fri, 10 May 1996 09:05:17 -0400 (EDT) X-Sender: metzgerr@se.rl.af.mil Message-Id: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Date: Thu, 9 May 1996 21:00:51 -0400 To: "Michael W.
Berry" , parkbench-comm@CS.UTK.EDU From: metzgerr@rl.af.mil (Rick Metzger) Subject: Re: Revised Minutes Sorry this is late but my organization is RL/C3CB not RL/CJCB. I apologize. The nuns in school always said I had bad handwriting! Thanks, Rick At 1:24 PM 5/7/96, Michael W. Berry wrote: >A few more changes... Mike > >----------------------------------------------------------------- >Minutes of Parkbench Meeting - Knoxville Hilton, April 26, 1996 >----------------------------------------------------------------- >Attendee List: > > (MB) Michael Berry Univ. of Tennessee berry@cs.utk.edu > (JD) Jack Dongarra Univ. of Tenn./ORNL dongarra@cs.utk.edu > (VG) Vladimir Getov Univ. of Westminister getovv@wmin.ac.uk > (MG) Myron Ginsberg EDS/General Motors ginsberg@gmr.com > (TH) Tony Hey Univ. of Southampton ajgh@ecs.soton.ac.uk > Edgar Kalns IBM kalns@vnet.ibm.com > David MacKay Intel SSD mackay@ssd.intel.com > (RM) Richard C. Metzger RL/CJCB (Rome Lab) metzgerr@rl.af.mil > Phil Mucci Univ. of Tennessee mucci@cs.utk.edu > (BP) Bodo Parady Sun Microsystems bodo@eng.sun.com > (SS) Subhash Saini NASA Ames saini@nas.nasa.gov > (ES) Erich Strohmaier Univ. of Tennessee erich@cs.utk.edu > (PW) Pat Worley Oak Ridge Nat'l Lab worleyph@ornl.gov > > >[Events after 11:30am] > >Regarding the Low-Level benchmarks: >----------------------------------- >SS showed that variations ranging from 18-30% on an IBM SP/2 have >been recorded for the low-level codes. There is a proceedings >paper (SC'94) which first noted this. Members felt the PVM >daemons may be the suspects here. TH reported that strange results >had been obtained for COMMS1 with respect to the MAXLEN parameter and >that the MPI and PVM versions of COMMS2,3 may not measure the same >operations. TH demonstrated performance variations of COMMS1,2 on the >IBM SP/2 and Meiko CS2. The best results were obtained using PSEND and >PRECEIVE on contiguous data. 
It was noted that PSEND/PRECEIVE were >not available when the original COMMS1,2 codes were written. TH noted >that memory access should be measured as well and that perhaps COMMS3 >could be rewritten for consistency with the current COMMS1,2 codes (ES >suggested this could be done for the V3.0 Parkbench Release). > >VG suggested that the group rethink the goals of low-level benchmarking. >He proposed 3 goals of this level of benchmarking: > >(1) Comparisons across different parallel computers, >(2) Performance estimation of parallel kernels and applications, and >(3) Performance tuning of parallel applications. > >For single node architectures, VG suggested that scalar performance, >vectorization, and memory hierarchies be measured. He pointed out that >RINF1, which comprises 17 examples of code fragments/kernels, is 15 years >old and was best suited for vector pipeline machines. Therefore, a new >low-level benchmark was proposed for evaluation of the memory hierarchy >effects. VG also suggested that vendors be allowed to use their single >node sustainable performance kernels (in-house versions) for such >benchmarks. Members felt that a new benchmark of memory hierarchies would >be very useful in addition to the existing low-level codes. > >At 12:30pm EDT the group went to lunch at the "Soup Kitchen" in downtown >Knoxville. > >At 1:20pm EDT, the meeting resumed. MB and TH briefly reviewed the Minutes >of the November 11, 1995 Parkbench Meeting (in Knoxville). JD pointed out >that Cray Research should be using the BLACS provided by Parkbench to avoid >the problems observed with task spawning on the Cray T3D. TH and JD indicated >that they will investigate the use of SGI machines since few or no results >have been obtained on those machines to date. BP announced that SPEC-HPSC >will release their first benchmark suite in the Summer of '96 and will have >two application codes (SEISMIC (ARCO) and CHEMISTRY).
> >The meeting focus then switched to Compact Applications: >-------------------------------------------------------- > >MG discussed the status and needs of benchmarking in the automotive >industry. USCAR released a benchmark suite at the end of August 1995. >They typically use 5-6 commercial codes for benchmarks based on older >model cars. They have had discussions with Europort and seek their >models as well. The particular needs of benchmarking in this area include: > >(1) Information on the limitations of machines, >(2) Crossover points between small-medium-large machines - would like > to see results posted on the WWW someday, >(3) Assess whether a machine is general purpose or only good for a specific model, >(4) Develop a canonical set of parameters for scalability analysis, >(5) Develop application-dependent metrics, and >(6) Focus on shared-memory paradigms (message-passing not desirable). > >Ford has a lot of Cray C-90's and uses them primarily for job streaming. >Most auto companies do not use parallelism and depend on commercial >Finite Element Model (FEM) codes. Typical machines benchmarked include: >IBM SP/2, Convex Exemplar, Cray J9-16, and SGI PowerChallenge. Five >commonly-used commercial benchmark codes (which are also in the Europort >suite) are: MSC/NASTRAN, STARCD, ABAQUS, LSDYNA, and PAMCRASH. MG pointed >out that parallel commercial FEM codes have not evolved (a small number of >processors is typically sufficient). Measurements based on turn-around-time >are needed. Assessing the performance of job mixes with increasing numbers >of threads is very desirable (throughput). Creating a synthetic job mix >is more desirable than running one code on a single dedicated machine. Four >areas MG suggested for future benchmarks are: > >(1) Crashworthiness (large structures, 60K degrees of freedom), >(2) Other structural problems (noise, vibration, heat), >(3) CFD (exterior aerodynamics, internal combustion), >(4) Manufacturing processes (metal forming).
> >MG concluded his presentation with a note that a preliminary report >on AUTOBENCH was available (copy left with TH/JD). > >SS then presented some results on HPF-based NAS Parallel Benchmarks (PB). >He pointed out the heavy investment at NAS in CMF on the CM-2 and CM-5 >(shut down in March 1995). He showed that, for the FT >benchmark, the MPI version of NAS PB Version 2.0 was about 2 times faster than the >HPF version on the IBM SP/2. Also, >he showed that, for the SP benchmark, the MPI version of NAS PB Version 2.0 was about 3 >times faster than the HPF version on the IBM SP/2. >For the BT benchmark, the MPI code was 4 times faster than the HPF >counterpart. >The Portland Group HPF compiler was used on the IBM SP/2 tested. He suggested >that the gain in using MPI may not always be cost-effective given the man >hours required to obtain such codes. With HPF, one can get a speedup factor >of 2 over baseline (16 processor) versions for 40+ processors on the >IBM SP/2. Having a library of HPF/MPI codes is a big win given the >ease of programming with HPF and good performance of underlying MPI >kernels. SS also pointed out that an HPF-based PIC (Particle-In-Cell) >code had obtained a speedup of 5 on a 16 processor SGI Power Challenge >XL. SS concluded his presentation by indicating that > >(1) CMF-to-HPF conversion is easy, >(2) Lack of parallel I/O in HPF is a definite problem, >(3) Lack of parallel math and scientific libraries in HPF will > complicate the CMF-to-HPF conversion (and HPF ports), and >(4) HPF compilers are maturing. > >TH suggested that HPF versions of the Parkbench kernels and compact >applications be considered for Release Version 3 of the Parkbench Suite. > >RM from Rome Lab then proceeded to describe the C3I Parallel Benchmark >Suite. He noted that DARPA is supporting efforts in real-time parallel >benchmark development (configurable computing).
The C3I compact >application level codes are just over 1K lines each and can be accessed >on the WWW at http://www.se.rl.af.mil:8001. Applications involve SAR >processing, multiple hypothesis tracking, and terrain meshing. Such >codes typically scale down problems to meet constraints with smaller numbers >of processors (not necessarily real-time). > >The real-time benchmark effort is just starting at Rome Lab with >an effort mimicking the Parkbench effort (kernels and compact applications). >RM proposed a future "real-time annex" to Parkbench. He noted that both >functional and temporal specifications will undoubtedly result in new >performance metrics to consider. JD indicated his willingness to extract >2 of these codes and integrate them into the next Parkbench release. > >PW of ORNL then discussed some of the performance results obtained for >the shallow water compact application code on an Intel Paragon (up to >512 processors). He indicated that for up to 128 processors, the percentage >of time due to communication is not dominating. For over 256 processors, >the communication can consume as much as 30% for large problems and 60% >of the total execution time for small problems. Scaling up problems sizes >makes no sense for weather-related codes (i.e., problem sizes are fixed). >PW also mentioned a MPI protocol sensitivity study with emphasis on >the collective communication performance of MPI. He compared MPI >communication routines versus hand-coded Fortran with optimal NX protocols >and showed that the handcoded was consistently better, indicating that, in >particular, there is room for performance improvement in the MPI collective >routines. He also mentioned that similar results hold on the IBM SP2 and on >the Cray T3D. > >TH then listed the compact applications of current and future interest >for Parkbench: > >1. CFD/Aero (NAS PB) >2. Weather/Climate (Shallow Water, POLMP/IFS) >3. Seismic (ARCO) >4. Autobench (AVL/Fire) >5. C3I (Non Real-Time) >6. 
C3I (Real-Time) >7. Particle-In-Cell (PIC) code >8. FDMOD > >At 3:10pm EDT, TH mentioned that the Electronic Benchmarking Journal >now has support and suggested that articles on C3I, Autobench, Parkbench, >and NAS PB be submitted. It was suggested to wait for Parkbench Release >2.0 to kick off the Electronic Journal. > >*** Further Actions *** > >I. Release 2.X + Results > Low-level (COMMS2,3) > Put results in standard format such as that used in NAS PB. > Run-rules > Add NAS PB Version 2.2 results (SS will provide). > Get SGI results (TH and JD may contact Tony S. at Miss. State). > Mirror PDS/GBIS at both US and UK sites. > >II. Discussion on Release 3 (2.x) > Codes at all levels. > TH will provide POLMP. > BP can provide FDMOD documentation (85% seismic). > Roger Hockney will be asked for a PIC code. > >III. The next Parkbench Meeting is scheduled for late August/early September > 1996 in Knoxville. > > >TH adjourned the meeting at 3:23pm EDT. ***************************************************************************** Richard C. Metzger Mail Address: Software for High Performance Computing Richard C. Metzger Rome Laboratory - USAF Rome Laboratory/C3CB 525 Brooks Rd.
email: Rome, NY 13441-4505 metzgerr@se.rl.af.mil metzgerr@rl.af.mil voice (315)330-7652 fax (315)330-7989 From owner-parkbench-comm@CS.UTK.EDU Mon Jun 17 06:56:23 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id GAA08996; Mon, 17 Jun 1996 06:56:22 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA07519; Mon, 17 Jun 1996 06:39:20 -0400 Received: from aloisius.vcpc.univie.ac.at ([193.171.58.11]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id GAA07505; Mon, 17 Jun 1996 06:38:49 -0400 Received: (from doc@localhost) by aloisius.vcpc.univie.ac.at (8.7.5/8.7.3) id MAA17903; Mon, 17 Jun 1996 12:37:25 +0200 (MET DST) Date: Mon, 17 Jun 1996 12:37:25 +0200 (MET DST) Message-Id: <199606171037.MAA17903@aloisius.vcpc.univie.ac.at> To: mpi-core@mcs.anl.gov, mpi-io@mcs.anl.gov, mpi-io@nas.nasa.gov, parkbench-comm@CS.UTK.EDU From: course@vcpc.univie.ac.at Subject: ANNOUNCE: High Performance FORTRAN meeting, Vienna, Austria: 1-4/7/96 X-Safemail-Version: 1.2 [We apologise if this post is slightly off-topic, but it may be relevant to a number of HPC researchers...] Summer of HPF in Vienna HPF Tutorial, July 1-2, 1996, Vienna, Austria Workshop on HPF for Real Applications, July 3-4, 1996, Vienna High Performance Fortran (HPF) is a data parallel language extension to Fortran90 which provides a portable programming interface for a wide variety of target platforms. The original HPF language specification was produced by the High Performance Fortran Forum, a broad consortium of industry and academia, which met regularly throughout 1992 and early 1993. HPF compilers are now available on most commonly-used computing systems, and users are beginning to gain first hand experience with this language. The Forum has continued to meet in order to address advanced topics. To increase public knowledge of HPF, a workshop and a tutorial with a hands-on session will be held in Vienna during the first week of July. 
The workshop is organised by the VCPC as part of the ESPRIT project HPC-Standards. Participants may register for one or more of the above events using the form attached. For further information, please contact course@vcpc.univie.ac.at or http://www.vcpc.univie.ac.at/ or complete the form attached. The workshop will be held at the Austrotel hotel, Vienna. A number of rooms have been reserved there at a special rate for participants. Please refer to the VCPC on your reservation in order to qualify. Book rooms directly at: Austrotel, Felberstr. 4, A-1150 Vienna, Austria. Tel: +43-1-981110, Fax: +43-1-98111930. Tutorial The tutorial is divided into two parts. Participants may register for the first day only, or for both days. It is especially suitable for those who do not have access to an HPF compiler. Day One: HPF in Practice, Charles Koelbel, Rice University. This tutorial will introduce programmers to the most important features of HPF and illustrate how they can be used in practice for scientific computation. Further details can be found at http://www.cs.rice.edu/~chk/hpf-tutorial.html Day Two: HPF Tutorial, NA Software. Attendees will have hands on access to the NA Software HPF mapper and tools on the Meiko CS-2 at VCPC. Please note that there are a limited number of places for the `hands-on' sessions. Workshop This workshop gives an overview of the achievements of the HPF Forum, including its recent activities, and provides up-to-date information on HPF compilers. Major compiler vendors will describe their efforts and share their views on HPF. Contributions from end users include descriptions of completed and on-going code development efforts. One of the aims of this event is to enable compiler writers, potential and actual users of High Performance Fortran to come together to discuss their problems and needs. 
Compiler writers need guidance from users in order to understand how best to improve their products; application developers need to find out how to write their codes in ways that help the compiler generate fast object code. Thus we include both kinds of presentation and leave time for discussion in the program. Exhibit An exhibit room will be available to enable vendors of HPF compilers and related tools to display their products and to disseminate information during the workshop. There is limited space only. If you wish to participate, please contact Tony.Curtis@vcpc.univie.ac.at with a list of your proposed requirements. Note that we will not be able to process any requests after June 20. =============================================================================== Workshop on HPF for Real Applications Preliminary Program Welcome on Tuesday, 2nd July, 19:00 - 21:00 Wednesday, 3rd July 09:00 - 10:00 Making HPF Work: Past Success and Future Challenges Charles Koelbel, CRPC/Rice University 10:00 - 11:00 Migrating to HPF Re-engineering Tools for HPF Bernard Dion, Simulog Programming Tools for HPF: User Requirements Fritz Wollenweber, German Military Geophysical Office Tools for High Performance Program: A Survey Jean-Louis Pazat, IRISA 11:00 - 11:30 Coffee Break 11:30 - 13:00 Commercial Compilers I Thinking Machines' High Performance Fortran Harvey Richardson, Thinking Machines, Inc. 
An Overview of the IBM XLHPF Compiler Manish Gupta, IBM Watson Research Center The PREPARE HPF Compiler Martijn de Lange, ACE 13:00 - 14:30 Lunch 14:30 - 15:30 Applications I Porting of Ocean Simulation Code to HPF Tor Sorevik, Parallab HPF Porting Strategy for an Industrial CFD Code Christian Borel, MATRA 15:30 - 16:00 Coffee Break 16:00 - 17:00 Applications II HPF Port of an Irregular Application Philippe Devillers, VCPC Experience with Porting Two CFD Applications to HPF Henk Sips, University of Amsterdam 17:00 - 18:00 Free time for exhibit/demonstrations 19:00 Social Event Thursday 4th July 09:00 - 10:30 Compilers II The PGI HPF Compiler Larry Meadows, The Portland Group, Inc. The HPFPlus Compiler Toolset Mike Delves, N. A. Software APR's HPF Compiler: Status and Results John Levesque, Applied Parallel Research 10:30 - 11:00 Coffee Break 11:00 - 12:00 Benchmarking Experience with HPF Compilers at ICASE Piyush Mehrotra, ICASE Benchmarking experiences at the VCPC Guy Robinson, VCPC 12:00 - 13:30 Lunch 13:30 - 15:00 Applications III HPF+ Pam-Crash Kernels and Requirements Guy Lonsdale, NEC Europe Application of HPF to Financial Modelling Carlos Falco-Korn, LPAC 15:00 - 15:30 Coffee Break 15:30 - 17:00 Research Compilers sHPF: A Subset HPF Compilation System John Merlin, University of Southampton Optimizing HPF for Advanced Applications Siegfried Benkner, University of Vienna Run Time Support for Structured Adaptive Mesh Methods Scott Baden, University of California, San Diego 17:00 - 18:00 Panel Discussion and Closing Remarks =============================================================================== Registration form European Centre for Parallel Computing at Vienna (VCPC) Summer of HPF HPF Tutorial, July 1-2, 1996, Vienna, Austria Workshop on HPF for Real Application, July 3-4, 1996, Vienna, Austria Name: _______________________________________________________________________ Affiliation: ________________________________________________________________ 
Address: ____________________________________________________________________ ____________________________________ Telephone: ____________________________ Fax: ________________________________ E-mail: ______________________________ I wish to attend (please cross): o the HPF Tutorial on day one only (July 1) o the HPF Tutorial on both days (July 1/2) o the HPF Workshop on July 3/4 Please send me further information on: o the HPF Tutorial on day one only (July 1) o the HPF Tutorial on both days (July 1/2) o the HPF Workshop on July 3/4 o Please send me information on future workshops and tutorials o Please send me hotel information Arrival date: ___________________ Departure date: ____________________ Date: ________________________________ Signature: _________________________ Fees Payment should be enclosed if you register for the tutorial or the workshop. Please make cheques payable to VCPC. All payments must be in Austrian Schillings. Fees include refreshments and lunch on each day of the events for which you register. Day one only (July 1): Until June 20 After June 20 Academic, ESPRIT/ACTS projects: 1700 ATS 2100 ATS Industry: 2300 ATS 2800 ATS Day one and two (July 1/2): Until June 20 After June 20 Academic, ESPRIT/ACTS projects: 2300 ATS 2700 ATS Industry: 2850 ATS 3500 ATS Workshop (July 3/4): Until June 20 After June 20 Academic, ESPRIT/ACTS projects: 1800 ATS 2200 ATS Industry: 2400 ATS 2800 ATS I qualify for the o Academic, project fee o Industry fee Name of project: ____________________________________________________________ Method of payment: o Enclosed Cheque o American Express o Eurocard/Mastercard o Visa o Diners Club Total amount of payment: ____________________________________________________ Credit Card Number: _____________________________ Exp. 
date: ______________ Cardholder Name: ____________________________________________________________ Date: _________________________________ Signature: ________________________ European Centre for Parallel Computing at Vienna, (VCPC) Liechtensteinstr. 22, A-1090 Vienna, Austria Tel: +43-1-3109396-10, Fax: +43-1-3109396-13, E-mail: info@vcpc.univie.ac.at WWW: http://www.vcpc.univie.ac.at From owner-parkbench-comm@CS.UTK.EDU Thu Jun 20 11:05:02 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id LAA03220; Thu, 20 Jun 1996 11:05:02 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA19248; Thu, 20 Jun 1996 10:55:09 -0400 Received: from convex.convex.com (convex.convex.com [130.168.1.1]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id KAA19229; Thu, 20 Jun 1996 10:55:05 -0400 Received: from bach.convex.com by convex.convex.com (8.6.4.2/1.35) id JAA09068; Thu, 20 Jun 1996 09:54:30 -0500 Received: from localhost by bach.convex.com (8.6.4/1.28) id JAA25077; Thu, 20 Jun 1996 09:54:30 -0500 From: hari@bach.convex.com (Harikumar Sivaraman) Message-Id: <199606201454.JAA25077@bach.convex.com> Subject: PARKBENCH2.0 results To: parkbench-comm@CS.UTK.EDU Date: Thu, 20 Jun 96 9:54:29 CDT X-Mailer: ELM [version 2.3 PL11] Disclaimer: The contents of this mail do not represent the opinion of HP nor is HP responsible for the contents of this mail. I would like to submit results of the run of the PARKBENCH2.0 LOW_LEVEL benchmarks on an SPP-1600. I have the results in a directory called Low_Level containing a subdirectory for each Low_Level benchmark. Each subdirectory has a pair of result files and a third file listing the summary of the results. I was thinking of mailing a uuencoded version of a tar of these files. Is this okay with you'll. Hari. -- ------------------- H. Sivaraman (214) 497 - 4374 D1W47; HP; 3000 Waterview Pk. 
way Dallas, TX From owner-parkbench-comm@CS.UTK.EDU Tue Aug 6 05:00:12 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id FAA04652; Tue, 6 Aug 1996 05:00:12 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id EAA24106; Tue, 6 Aug 1996 04:46:57 -0400 Received: from internet-gw.zurich.ibm.com (internet-gw.zurich.ibm.ch [193.5.61.130]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id EAA24092; Tue, 6 Aug 1996 04:46:37 -0400 Received: from [9.4.4.223] by internet-gw.zurich.ibm.com (AIX 3.2/UCB 5.64/4.03) id AA22298 from ; Tue, 6 Aug 1996 10:45:57 +0200 Received: by fanellahorn.zurich.ibm.com (AIX 3.2/UCB 5.64/4.03) id AA43570 from ; Tue, 6 Aug 1996 10:45:51 +0200 From: "Peter Ohnacker" Message-Id: <9608061045.ZM18992@zurich.ibm.com> Date: Tue, 6 Aug 1996 10:45:50 +0200 X-Mailer: Z-Mail (3.2.0 06sep94) To: parkbench-comm@CS.UTK.EDU Subject: (Fwd) ParkBench_misc_pvm.a missing Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Dear Sirs, this is probably not the right place to ask, but I don't see any alternative. I have problems installing parkbench and tried to contact the recommended address parkbench-comments@cs.utk.edu but I haven't received any answer for more than three weeks now. Can you give me an address where I can get help? Thank you very much! Kind regards, Peter Ohnacker --- Forwarded mail from "Peter Ohnacker" Hello, I got parkbench 2.0 via ftp and want to install it on our local IBM/RS6K/AIX workstation cluster. Following the "ParkBench 2.0: Release Notes and Run Rules" I tried to compile parkbench (or any part of it) but the compiler -- xlf -- always reports that it can't find the library ParkBench_misc_pvm.a (the same holds for ParkBench_misc_mpi.a and ParkBench_misc.a). What did I forget to do? Thanks very much for any help!
Regards, Peter Ohnacker -- Peter Ohnacker Tel: +41-1-724-85-30 IBM Zurich Research Laboratory Fax: +41-1-724-09-04 Saeumerstrasse 4 Internet: ohp@zurich.ibm.com CH-8803 Rueschlikon/Switzerland EARN/BITNET: ohp@zurich.bitnet ---End of forwarded mail from "Peter Ohnacker" -- Peter Ohnacker Tel: +41-1-724-85-30 IBM Zurich Research Laboratory Fax: +41-1-724-09-04 Saeumerstrasse 4 Internet: ohp@zurich.ibm.com CH-8803 Rueschlikon/Switzerland EARN/BITNET: ohp@zurich.bitnet From owner-parkbench-comm@CS.UTK.EDU Wed Aug 28 10:22:06 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA03450; Wed, 28 Aug 1996 10:22:06 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA19375; Wed, 28 Aug 1996 10:06:49 -0400 Received: from beech.soton.ac.uk (beech.soton.ac.uk [152.78.128.78]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id KAA19345; Wed, 28 Aug 1996 10:06:34 -0400 Received: from bright.ecs.soton.ac.uk (bright.ecs.soton.ac.uk [152.78.64.201]) by beech.soton.ac.uk (8.6.12/hub-8.5a) with SMTP id PAA15128; Wed, 28 Aug 1996 15:06:02 +0100 Received: from landlord.ecs.soton.ac.uk by bright.ecs.soton.ac.uk; Wed, 28 Aug 96 15:05:56 BST From: John Merlin Received: from bacchus.ecs.soton.ac.uk by landlord.ecs.soton.ac.uk; Wed, 28 Aug 96 15:07:37 BST Message-Id: <532.9608281407@bacchus.ecs.soton.ac.uk> Subject: Minutes of Parkbench meetings To: berry@CS.UTK.EDU Date: Wed, 28 Aug 1996 15:07:13 +0100 (BST) Cc: jhm@ecs.soton.ac.uk (John Merlin), parkbench-comm@CS.UTK.EDU X-Mailer: ELM [version 2.4 PL0] Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Dear Mike, I am currently creating a WWW page for the Esprit 'HPC Standards' project (URL: http://www.ccg.ecs.soton.ac.uk/hpc-stds/index.html). The purpose of this project is to provide funding for Europeans to participate in HPF-2, MPI-2 and Parkbench meetings, and to disseminate information about these standardisation activities. 
The project Web page therefore has introductory information about the HPF-2, MPI-2 and Parkbench activities. Would it be OK if I made your minutes of the Parkbench meetings, as sent to the 'parkbench-comm' group, available from this page? I presume so, since they are publicly available anyway, but thought I should ask first. Please feel free to publicise this Web page, link it from anywhere relevant, etc. I look forward to hearing from you. John Merlin ('HPC Standards' project coordinator) P.S. Please email me directly, as I have not (yet) subscribed to the parkbench-comm list. ----------------------------------------------------------------------- John Merlin email: jhm@ecs.soton.ac.uk Dept. of Electronics and Computer Science, tel: +44 1703 593943 Building 16, fax: +44 1703 593903 University of Southampton, Southampton SO17 1BJ, U.K. From owner-parkbench-comm@cs.utk.edu Tue Sep 10 07:30:51 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id HAA00659; Tue, 10 Sep 1996 07:30:51 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id DAA15443; Tue, 10 Sep 1996 03:59:49 -0400 Received: from beech.soton.ac.uk (beech.soton.ac.uk [152.78.128.78]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id DAA15421; Tue, 10 Sep 1996 03:59:42 -0400 Received: from bright.ecs.soton.ac.uk (bright.ecs.soton.ac.uk [152.78.64.201]) by beech.soton.ac.uk (8.6.12/hub-8.5a) with SMTP id IAA22959; Tue, 10 Sep 1996 08:57:52 +0100 Received: from landlord.ecs.soton.ac.uk by bright.ecs.soton.ac.uk; Tue, 10 Sep 96 08:57:21 BST From: Vladimir Getov Received: from caesar.ecs.soton.ac.uk by landlord.ecs.soton.ac.uk; Tue, 10 Sep 96 08:59:09 BST Date: Tue, 10 Sep 96 08:58:36 BST Message-Id: <2546.9609100758@caesar.ecs.soton.ac.uk> To: parkbench-comm@cs.utk.edu, parkbench-lowlevel@cs.utk.edu, sercely@convex.convex.com Subject: Re: comms2 and comms3 bugs, mpi release Cc: wallach@convex.convex.com, romero@convex.convex.com Hi Ron,
Are you talking about the same or similar bugs as the ones reported for
the comms3 benchmark by Harikumar Sivaraman at the end of June (see the
included message below)?

-Vladimir Getov

p.s. Apologies if you receive this message more than once - I have
included parkbench-comm@CS.UTK.EDU on the "To:" line but do not know
the cross membership.

>
> HP/Convex wants to release lowlevel numbers in two weeks, but we are
> trying to figure out what to do about the bugs we have reported in
> these codes.
>
> Options are:
>   Submitting results without these tests
>   HP/Convex Re-writing the benchmarks to "do the right thing"
>   other ?
>
> I would appreciate a phone call to discuss these issues.
> --
> Ron Sercely
> 214.497.4667
>
> HP/CXTC Toolsmith
>

____________________________ included message _______________________

>From owner-parkbench-compactapp@CS.UTK.EDU Fri Jun 28 15:54:32 1996
From: hari@bach.convex.com (Harikumar Sivaraman)
Subject: Bug report on COMMS3.f in PARKBENCH2.0
To: parkbench-comments@CS.UTK.EDU, parkbench-lowlevel@CS.UTK.EDU
Date: Fri, 28 Jun 96 9:50:26 CDT
Cc: romero@bach.convex.com (Paco Romero)
X-Mailer: ELM [version 2.3 PL11]
Content-Length: 1559
X-Status:

DISCLAIMER: The contents of this mail are not an official HP position.
I do not speak for HP.

The COMMS3 benchmark in PARKBENCH2.0 is in apparent violation of the
specifications in the MPI standard. The benchmark attempts to do an
MPI_RECV into the same buffer on which it has posted an MPI_ISEND before
it does an MPI_WAIT. The relevant code fragment is as below:

COMMS3 (This code fragment applies in the case of two processors)
------
      CALL MPI_ISEND(A, IWORD, MPI_DOUBLE_PRECISION, .....
      CALL MPI_RECV(A, IWORD, MPI_DOUBLE_PRECISION, ......
      CALL MPI_WAIT(request(NSLAVE), status, ierr)

COMMS3 (Multiple processors)
------
      do i = 1, #processors
         CALL MPI_ISEND(A, IWORD, MPI_DOUBLE_PRECISION, .....
      enddo
      // The MPI_ISEND statements in the loop violate the MPI standard since the buffer "A"
      // is reused inside the loop.
      do i = 1, #processors
         CALL MPI_RECV(A, IWORD, MPI_DOUBLE_PRECISION, ......
      enddo
      do i = 1, #processors
         CALL MPI_WAIT(request(NSLAVE), status, ierr)
      enddo

Comments:
---------
The MPI standard (page 40, last but one paragraph) says "the sender
should not access any part of the send buffer after a nonblocking send
operation is called, until the send completes." Page 41, line 1 of the
MPI standard says "the functions MPI_WAIT and MPI_TEST are used to
complete a nonblocking communication". Clearly the reuse of buffer "A"
in the code fragments above is in violation of the standard.

-------
H. Sivaraman (214) 497 - 4374
HP; 3000 Waterview Pk.way
Dallas, TX - 75080

From owner-parkbench-comm@CS.UTK.EDU Mon Sep 16 15:18:45 1996
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id PAA24764; Mon, 16 Sep 1996 15:18:45 -0400
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA18388; Mon, 16 Sep 1996 14:55:24 -0400
Received: from blueberry.cs.utk.edu (BLUEBERRY.CS.UTK.EDU [128.169.92.34]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id OAA18377; Mon, 16 Sep 1996 14:55:21 -0400
Received: by blueberry.cs.utk.edu (cf v2.11c-UTK) id SAA05955; Mon, 16 Sep 1996 18:52:56 GMT
From: "Erich Strohmaier"
Message-Id: <9609161452.ZM5953@blueberry.cs.utk.edu>
Date: Mon, 16 Sep 1996 14:52:56 -0400
X-Face: ,v?vp%=2zU8m.23T00H*9+qjCVLwK{V3T{?1^Bua(Ud:|%?@D!~^v^hoA@Z5/*TU[RFq_n'n"}z{qhQ^Q3'Mexsxg0XW>+CbEOca91voac=P/w]>n_nS]V_ZL>XRSYWi:{MzalK9Hb^=B}Y*[x*MOX7R=*V}PI.HG~2
X-Mailer: Z-Mail (3.2.0 26oct94 MediaMail)
To: parkbench-comm@CS.UTK.EDU
Subject: ParKBench Release 2.1
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii

Hello,

The release 2.1 of ParKBench is available at netlib:

   http://www.netlib.org/parkbench/

It contains the following bug fixes:

- Comms2 for MPI made to be a true exchange
  benchmark using MPI_SENDRECV.
- Comms3 for MPI using wild-card and second buffer.
- Added missing mpif.f for the MPI2PVM library.
- Fixed Makefiles.
- make.local.def modifications.
- Updated conf/make.def.SP2MPI.
- LU Solver fixed through the use of a flag to the Blacs build in the Bmakes.
- Addition of the definition for mpi_group_translate_ranks in Bdef.h.
- PBLAS bug solved with new BLACS compilation.

Best Regards
Erich Strohmaier
email: erich@cs.utk.edu

From owner-parkbench-comm@CS.UTK.EDU Mon Sep 30 15:25:21 1996
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id PAA15684; Mon, 30 Sep 1996 15:25:21 -0400
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA16935; Mon, 30 Sep 1996 15:08:53 -0400
Received: from dasher.cs.utk.edu (DASHER.CS.UTK.EDU [128.169.92.51]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id PAA16918; Mon, 30 Sep 1996 15:08:49 -0400
From: Jack Dongarra
Received: by dasher.cs.utk.edu (cf v2.11c-UTK) id PAA14197; Mon, 30 Sep 1996 15:08:47 -0400
Date: Mon, 30 Sep 1996 15:08:47 -0400
Message-Id: <199609301908.PAA14197@dasher.cs.utk.edu>
To: parkbench-comm@CS.UTK.EDU
Subject: October 31st meeting

Dear Colleague,

The ParkBench (Parallel Benchmark Working Group) will meet in Knoxville,
Tennessee on October 31st, 1996.

The meeting site will be the Knoxville Downtown Hilton Hotel.
We have made arrangements with the Hilton Hotel in Knoxville.

   Hilton Hotel
   501 W. Church Street
   Knoxville, TN
   Phone: 423-523-2300

When making arrangements tell the hotel you are associated with the
Parallel Benchmarking or Parkbench or Park. The rate is about $75.00/night.
You can download a PostScript map of the area by looking at
http://www.netlib.org/utk/people/JackDongarra.html.

You can rent a car or get a cab from the airport to the hotel.

We should plan to start at 9:00 am October 31st and finish about 5:00 pm.
If you will be attending the meeting please send me email so we can better
arrange for the meeting.

The format of the meeting is:

Thursday October 31st
    9:00 - 12:00   Full group meeting
   12:00 -  1:30   Lunch
    1:30 -  5:00   Full group meeting

Tentative agenda for the meeting:

1. Minutes of last meeting
2. Reports and discussion from subgroups
3. Examine the results obtained so far
4. Electronic journal of benchmark results
5. Open discussion on the Supercomputing '96 activity
6. Date and venue for next meeting

The objectives for the group are:

1. To establish a comprehensive set of parallel benchmarks that is
   generally accepted by both users and vendors of parallel systems.
2. To provide a focus for parallel benchmark activities and avoid
   unnecessary duplication of effort and proliferation of benchmarks.
3. To set standards for benchmarking methodology and result-reporting
   together with a control database/repository for both the benchmarks
   and the results.

The following mailing lists have been set up.
parkbench-comm@cs.utk.edu Whole committee parkbench-lowlevel@cs.utk.edu Low level subcommittee parkbench-compactapp@cs.utk.edu Compact applications subcommittee parkbench-method@cs.utk.edu Methodology subcommittee parkbench-kernel@cs.utk.edu Kernel subcommittee Jack Dongarra Erich Strohmaier From owner-parkbench-comm@CS.UTK.EDU Sun Oct 6 21:26:14 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id VAA28195; Sun, 6 Oct 1996 21:26:14 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id VAA04702; Sun, 6 Oct 1996 21:18:01 -0400 Received: from nchc.gov.tw (dns.nchc.gov.tw [140.110.192.11]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id VAA04689; Sun, 6 Oct 1996 21:17:54 -0400 Message-Id: <199610070117.VAA04689@CS.UTK.EDU> Received: by nchc.gov.tw (th3.8s.nchc.gov) from circe.nchc.gov.tw id AA26350; Mon, 7 Oct 96 09:03:46 CST Received: by circe.nchc.gov.tw (station v4.0) from elc029.dt2.nchc id AA25654; Mon, 7 Oct 96 09:17:15 CST From: b00cjl00@nchc.gov.tw (Jun-Lin Chen) Subject: subcribe To: parkbench-comm@CS.UTK.EDU Date: Mon, 7 Oct 1996 09:17:14 +0800 (CST) X-Mailer: ELM [version 2.4 PL24] Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit subcribe From owner-parkbench-comm@CS.UTK.EDU Mon Oct 14 14:37:42 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id OAA06970; Mon, 14 Oct 1996 14:37:41 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA07632; Mon, 14 Oct 1996 14:24:53 -0400 Received: from blueberry.cs.utk.edu (BLUEBERRY.CS.UTK.EDU [128.169.92.34]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id OAA07485; Mon, 14 Oct 1996 14:22:53 -0400 Received: by blueberry.cs.utk.edu (cf v2.11c-UTK) id SAA13307; Mon, 14 Oct 1996 18:20:29 GMT From: "Erich Strohmaier" Message-Id: <9610141420.ZM13305@blueberry.cs.utk.edu> Date: Mon, 14 Oct 1996 14:20:27 -0400 X-Face: 
,v?vp%=2zU8m.23T00H*9+qjCVLwK{V3T{?1^Bua(Ud:|%?@D!~^v^hoA@Z5/*TU[RFq_n'n"}z{qhQ^Q3'Mexsxg0XW>+CbEOca91voac=P/w]>n_nS]V_ZL>XRSYWi:{MzalK9Hb^=B}Y*[x*MOX7R=*V}PI.HG~2 X-Mailer: Z-Mail (3.2.0 26oct94 MediaMail) To: parkbench-comm@CS.UTK.EDU, parkbench-lowlevel@CS.UTK.EDU Subject: ParkBench Workshop: Tentative Agenda Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Dear Colleague, The ParkBench (Parallel Benchmark Working Group) will meet in Knoxville, Tennessee on October 31st, 1996. The format of the meeting is: Thursday October 31st 9:00 - 12.00 Full group meeting 12.00 - 1.30 Lunch 1.30 - 5.00 Full group meeting The tentative agenda for the meeting is: 1. Minutes of last meeting Current release: 2. Status report and experience with the current release 3. Examine the results obtained Next release: 4. New HPF Low Level benchmarks 5. New shared memory Low Level benchmarks 6. New performance database design and new benchmark output format 7. Update of GBIS with new Web front-end 8. Report from other benchmark activities ParkBench: 9. Discussion of ParkBench group structure 10. ParkBench Bibliography 11. Status of ParkBench funding Other Activities: 12. Discussion of the Supercomputing'96 activities 13. "Electronic Benchmarking Journal" - status report 14. Miscellaneous 15. Date and venue for next meeting The meeting site will be the Knoxville Downtown Hilton Hotel. We have made arrangements with the Hilton Hotel in Knoxville. You can download a postscript map of the area by looking at http://www.netlib.org/utk/people/JackDongarra.html. When making arrangements tell the hotel you are associated with the Parallel Benchmarking or ParkBench or Park. The rate is about $75.00/night. Hilton Hotel 501 W. Church Street Knoxville, TN Phone: 423-523-2300 ==> Please make your reservation as soon as possible! 
Jack Dongarra Erich Strohmaier From owner-parkbench-comm@CS.UTK.EDU Mon Oct 21 16:36:15 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id QAA11445; Mon, 21 Oct 1996 16:36:14 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA20805; Mon, 21 Oct 1996 15:54:52 -0400 Received: from blueberry.cs.utk.edu (BLUEBERRY.CS.UTK.EDU [128.169.92.34]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id PAA20796; Mon, 21 Oct 1996 15:54:50 -0400 Received: by blueberry.cs.utk.edu (cf v2.11c-UTK) id TAA16003; Mon, 21 Oct 1996 19:52:28 GMT From: "Erich Strohmaier" Message-Id: <9610211552.ZM16001@blueberry.cs.utk.edu> Date: Mon, 21 Oct 1996 15:52:27 -0400 X-Face: ,v?vp%=2zU8m.23T00H*9+qjCVLwK{V3T{?1^Bua(Ud:|%?@D!~^v^hoA@Z5/*TU[RFq_n'n"}z{qhQ^Q3'Mexsxg0XW>+CbEOca91voac=P/w]>n_nS]V_ZL>XRSYWi:{MzalK9Hb^=B}Y*[x*MOX7R=*V}PI.HG~2 X-Mailer: Z-Mail (3.2.0 26oct94 MediaMail) To: parkbench-lowlevel@CS.UTK.EDU, parkbench-comm@CS.UTK.EDU Subject: ParKBench Workshop Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Dear Colleague, All of you who are planning to come to the next meeting --- http://www.netlib.org/parkbench/ --- please send email to us so we can make local arrangements. 
Thank you very much Erich Strohmaier From owner-parkbench-comm@CS.UTK.EDU Fri Nov 8 17:37:43 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id RAA16790; Fri, 8 Nov 1996 17:37:43 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA19773; Fri, 8 Nov 1996 17:12:10 -0500 Received: from berry.cs.utk.edu (BERRY.CS.UTK.EDU [128.169.94.70]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id RAA19767; Fri, 8 Nov 1996 17:12:09 -0500 Received: from cs.utk.edu by berry.cs.utk.edu with ESMTP (cf v2.11c-UTK) id RAA07377; Fri, 8 Nov 1996 17:08:27 -0500 Message-Id: <199611082208.RAA07377@berry.cs.utk.edu> to: parkbench-comm@CS.UTK.EDU Subject: Minutes of last PARKBENCH meeting Date: Fri, 08 Nov 1996 17:08:27 -0500 From: "Michael W. Berry" ----------------------------------------------------------------- Minutes of ParKBench Meeting - Knoxville Hilton, October 31, 1996 ----------------------------------------------------------------- Attendee List: (MBa) Mark Baker Univ. of Portsmouth mab@sis.port.ac.uk (MBe) Michael Berry Univ. of Tennessee berry@cs.utk.edu (JD) Jack Dongarra Univ. of Tenn./ORNL dongarra@cs.utk.edu (VG) Vladimir Getov Univ. of Westminster getovv@wmin.ac.uk Christian Halloy Univ. of Tennessee halloy@cs.utk.edu Adolfy Hoisie Cornell Theory Ctr hoisie@tc.cornell.edu Edgar Kalns IBM kalns@vnet.ibm.com Phil Mucci Univ. of Tennessee mucci@cs.utk.edu Erik Riedel GENIAS Software GmbH erik@genias.de (SS) Subhash Saini NASA Ames saini@nas.nasa.gov (RS) Ron Sercely Convex/HP sercely@convex.hp.com (ES) Erich Strohmaier Univ. of Tennessee erich@cs.utk.edu At 9:05am EST, JD opened the meeting. The participants introduced themselves and the final agenda was discussed. At 9:20am ES gave a status report of the current release of ParkBench. He reviewed the ParkBench structure and the changes in V2.1.1 relative to the previous version 2.0. 
The changes to the MPI implementations of comms2 and comms3 were discussed in detail. Further changes to comms3 were proposed by RS. The committee decided to change the MPI and PVM versions of comms3 to model the MPI_ALLTOALL communication pattern. At 10:15am ES presented the table of results available. The committee reviewed selected results and discussed in detail the changes to vector lengths and message sizes in the Low Level benchmarks needed to reflect the increasing memory and cache sizes. Two parameters, TOTMEM and TOTCACHE, will be used for this. JD suggested characterizing more precisely which measurements are performed. He also proposed adding a test for cold-cache behavior to the existing ones, which only work on warm caches. JD and ES will investigate necessary changes to the LINALG kernels to facilitate their usage. VG reported that a set of measurement results has recently been obtained with the low-level sequential benchmarks on a number of high-performance workstations, including SGI-Onyx, DEC-Alpha, and SUN-Ultra2. These results will be submitted in due course. At 11:00am JD started the discussion about additions for the next release. ES reviewed the status of the old HPF Compiler test suite. The committee agreed on generating a new set of Low Level tests which should use data exchange patterns similar to those of the Message Passing tests. Charles Koelbel will be contacted to help with this. The necessity of additional HPF tests will also be investigated. *** ACTION ITEM *** MBa will talk with Chuck K. and others (NPAC, etc.) about assembling the HPF codes. At 11:15am MBa proposed to generate a similar set of tests for shared memory programming. The committee agreed that one-sided exchange operations like put and get would be a good first candidate for this. The standardization of these constructs in MPI-2 would provide a good starting point for the actual implementation. RS indicated that he could look into thread-based SM-type benchmarks. 
There was also a discussion about BSP and Oxford lab. It was suggested that this group should encourage them to complete low-level codes which could be added to Netlib. MBa will report the details of this discussion to Oxford. At 11:30am MBe reviewed the minutes from the April 26, 1996 PARKBENCH meeting and the following corrections were made prior to acceptance of these minutes: "SS then presented some results on HPF-based NAS Parallel Benchmarks (PB). He pointed out the heavy investment at NAS in CMF on the CM-2 and CM-5 (shut down in March 1995). He showed that for the FT benchmarks the MPI versions were about 2 times faster than the HPF versions of NAS PB Version 2.0 on the IBM SP/2. He also showed that for the SP benchmark the MPI version was about 3 times faster than the HPF versions of NAS PB Version 2.0 on the IBM SP/2." RS pointed out that many of the commercial benchmark codes (MSC/NASTRAN, STARCD, ABACUS, LSDYNA, and PAMCRASH) discussed at the April PARKBENCH meeting utilize PVM and that MPI versions are now in beta test for most vendors. RS also pointed out that memory referencing patterns and communication can vary greatly for such codes across different problem sizes and datasets. VG indicated that the PIC code (via R. Hockney) called LPM that was under consideration for a future release of PARKBENCH is proprietary. The group then convened for lunch at the Soup Kitchen at noon. At 12:55pm, MBa gave an overview of GBIS (which is now over 2 years old). He compared it with the presentation of NPB results at NAS, which uses precomputed PostScript graph files. He indicated that a prototype of the new interface for GBIS has been designed which is currently using a DB2 database (demo planned for Supercomputing'96). He suggested that future work in the coupling of object databases and Java graphics might be the best approach for upcoming interfaces to PARKBENCH data. A more efficient mechanism is also needed for adding new PARKBENCH data. 
The Database Navigator Project at the Univ. of Southampton, which has updated GBIS, is available at the URL below: http://barrel.ecs.soton.ac.uk:8080/dbhome.html SS then gave a presentation of the NAS PB 1.0 (vendor optimized) results collected from machines such as: NEC SX-4, Fujitsu VPP700, Hitachi SR2201, Convex SPP2000, SGI Origin 2000, Cray T3E, and IBM's new 135 MHz-based machine. He showed that for the Class A SP benchmark on 64 processors the HPF-compiled versions (PGI, APR) were between 2 and 2.7 times slower than the MPI versions on the IBM SP/2. On the Cray T3D, the HPF-compiled versions were between 1.7 and 2.2 times slower than the optimal MPI versions. For the Class A BT benchmark on 64 processors, the HPF-compiled versions ranged between 1.2 and 1.5 times slower than the MPI version. SS, however, did point out that the vendors' optimized MPI versions sometimes required twice the memory of the HPF-compiled versions (a trade-off of memory for speed). SS indicated that the NPB 2.0 MPI-based versions are complete and that the NPB 3.0 HPF-based versions of BT, SP, FT, and EP are now available. He also stressed the need for PARKBENCH to join in the Petaflop performance modeling efforts related to applications in climate and ocean modeling, for example. SS also discussed the history of nanotechnology and the principle of "writing with atoms". He posed an important question to the group: "are current benchmarks suitable for future application-based simulations?" At 1:55pm, JD and ES asked the group to consider reducing the PARKBENCH group structure (low-level, kernel, compact applications, methodology, HPF, and analysis) down to a single group. It was unanimously agreed that a single PARKBENCH group be used with the electronic mail address parkbench-comm@cs.utk.edu as the single channel of communication. ES pointed out that the webpages may need to be changed to indicate the single group structure. 
ES then suggested that a PARKBENCH bibliography be constructed from all PARKBENCH-related publications and pointers to other benchmarking activities. The formation of a benchmarking catalogue was suggested. The group then discussed an update of the previous PARKBENCH report (edited by Hockney and Berry). It was decided that a second report should detail changes in the PARKBENCH effort and address issues in Petaflops performance modeling. MBa, MBe, ES, SS, and VG agreed to help produce the second PARKBENCH report. JD suggested that the report be ready by Supercomputing'96. JD and VG then commented on efforts for funding PARKBENCH activities. JD indicated that the UT-based proposal is still in a development phase and VG indicated that the Universities of Southampton and Westminster had submitted a joint proposal to get a 3-year grant for Optimization and Performance Modeling work. The group also made a list of future benchmarks that might be considered for PARKBENCH: I/O (MBa will investigate possible benchmarks); C, C++; Web Servers; Posix Threads. JD indicated that the PARKBENCH BOFS (Birds-Of-a-Feather-Session) at Supercomputing'96 in Pittsburgh will be on Tuesday (Nov. 19) at 1:30pm (2 hours) in the Washington Room of the Doubletree Hotel. Current PARKBENCH results and recent NPB data (presented by SS) will be shown before questions are taken from the audience. The final discussion of this meeting concerned the creation of a new electronic journal on performance evaluation and analysis which would be published on a Web Server at Southampton and mirrored on the Netlib server at UTK. The group suggested a variety of names appropriate for such a journal including: - High-Performance Evaluation and Modeling, - Performance Evaluation and Modeling for High-Performance Computing, - High-Performance Computing: Evaluation and Modeling, plus other combinations. 
*** ACTION ITEM *** JD will send out candidate names for the electronic journal to parkbench-comm@cs.utk.edu for voting by PARKBENCH members. The name decision was postponed to give all members time to propose alternatives. Suggestions for populating the Editorial Board of the proposed journal include: Joanne Martin (IBM), Phil Tannenbaum (NEC), Ken Miura, an SGI representative, and Japanese computer vendor representatives. ES suggested that an electronic mail address be created for this journal for communication of such topics. The next Parkbench Meeting is scheduled for April 1997 in Knoxville. JD adjourned the meeting at 3:15pm EST. From owner-parkbench-comm@CS.UTK.EDU Mon Nov 11 21:50:59 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id VAA16447; Mon, 11 Nov 1996 21:50:58 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id TAA27829; Mon, 11 Nov 1996 19:28:40 -0500 Received: from berry.cs.utk.edu (BERRY.CS.UTK.EDU [128.169.94.70]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id TAA27823; Mon, 11 Nov 1996 19:28:37 -0500 Received: from cs.utk.edu by berry.cs.utk.edu with ESMTP (cf v2.11c-UTK) id TAA12326; Mon, 11 Nov 1996 19:24:55 -0500 Message-Id: <199611120024.TAA12326@berry.cs.utk.edu> to: parkbench-comm@CS.UTK.EDU Subject: corrected minutes Date: Mon, 11 Nov 1996 19:24:55 -0500 From: "Michael W. Berry" There were some errors in the minutes to the last meeting that I have now corrected. Regards, Mike ----------------------------------------------------------------- Minutes of ParKBench Meeting - Knoxville Hilton, October 31, 1996 ----------------------------------------------------------------- Attendee List: (MBa) Mark Baker Univ. of Portsmouth mab@sis.port.ac.uk (MBe) Michael Berry Univ. of Tennessee berry@cs.utk.edu (JD) Jack Dongarra Univ. of Tenn./ORNL dongarra@cs.utk.edu (VG) Vladimir Getov Univ. of Westminster getovv@wmin.ac.uk Christian Halloy Univ. 
of Tennessee halloy@cs.utk.edu Adolfy Hoisie Cornell Theory Ctr hoisie@tc.cornell.edu Edgar Kalns IBM kalns@vnet.ibm.com Phil Mucci Univ. of Tennessee mucci@cs.utk.edu Erik Riedel GENIAS Software GmbH erik@genias.de (SS) Subhash Saini NASA Ames saini@nas.nasa.gov (RS) Ron Sercely Convex/HP sercely@convex.hp.com (ES) Erich Strohmaier Univ. of Tennessee erich@cs.utk.edu At 9:05am EST, JD opened the meeting. The participants introduced themselves and the final agenda was discussed. At 9:20am ES gave a status report of the current release of ParkBench. He reviewed the ParkBench structure and the changes in V2.1.1 relative to the previous version 2.0. The changes to the MPI implementations of comms2 and comms3 were discussed in detail. Further changes to comms3 were proposed by RS. The committee decided to change the MPI and PVM versions of comms3 to model the MPI_ALLTOALL communication pattern. At 10:15am ES presented the table of results available. The committee reviewed selected results and discussed in detail the changes to vector lengths and message sizes in the Low Level benchmarks needed to reflect the increasing memory and cache sizes. Two parameters, TOTMEM and TOTCACHE, will be used for this. JD suggested characterizing more precisely which measurements are performed. He also proposed adding a test for cold-cache behavior to the existing ones, which only work on warm caches. JD and ES will investigate necessary changes to the LINALG kernels to facilitate their usage. VG reported that a set of measurement results has recently been obtained with the low-level sequential benchmarks on a number of high-performance workstations, including SGI-Onyx, DEC-Alpha, and SUN-Ultra2. These results will be submitted in due course. At 11:00am JD started the discussion about additions for the next release. ES reviewed the status of the old HPF Compiler test suite. 
The committee agreed on generating a new set of Low Level tests which should use data exchange patterns similar to those of the Message Passing tests. Charles Koelbel will be contacted to help with this. The necessity of additional HPF tests will also be investigated. *** ACTION ITEM *** MBa will talk with Chuck K. and others (NPAC, etc.) about assembling the HPF codes. At 11:15am MBa proposed to generate a similar set of tests for shared memory programming. The committee agreed that one-sided exchange operations like put and get would be a good first candidate for this. The standardization of these constructs in MPI-2 would provide a good starting point for the actual implementation. RS indicated that he could look into thread-based SM-type benchmarks. There was also a discussion about BSP and Oxford lab. It was suggested that this group should encourage them to complete low-level codes which could be added to Netlib. MBa will report the details of this discussion to Oxford. At 11:30am MBe reviewed the minutes from the April 26, 1996 PARKBENCH meeting and the following corrections were made prior to acceptance of these minutes: "SS then presented some results on HPF-based NAS Parallel Benchmarks (PB). He pointed out the heavy investment at NAS in CMF on the CM-2 and CM-5 (shut down in March 1995). He showed that for the FT benchmarks the MPI versions were about 2 times faster than the HPF versions of NAS PB Version 2.0 on the IBM SP/2. He also showed that for the SP benchmark the MPI version was about 3 times faster than the HPF versions of NAS PB Version 2.0 on the IBM SP/2." RS pointed out that many of the commercial benchmark codes (MSC/NASTRAN, STARCD, ABACUS, LSDYNA, and PAMCRASH) discussed at the April PARKBENCH meeting utilize PVM and that MPI versions are now in beta test for most vendors. RS also pointed out that memory referencing patterns and communication can vary greatly for such codes across different problem sizes and datasets. 
VG indicated that the PIC code (via R. Hockney) called LPM that was under consideration for a future release of PARKBENCH is proprietary. The group then convened for lunch at the Soup Kitchen at noon. At 12:55pm, MBa gave an overview of GBIS (which is now over 2 years old). He compared it with the presentation of NPB results at NAS, which uses precomputed PostScript graph files. He indicated that a prototype of the new interface for GBIS has been designed which is currently using a DB2 database (demo planned for Supercomputing'96). He suggested that future work in the coupling of object databases and Java graphics might be the best approach for upcoming interfaces to PARKBENCH data. A more efficient mechanism is also needed for adding new PARKBENCH data. The Database Navigator Project at the Univ. of Southampton, which has updated GBIS, is available at the URL below: http://barrel.ecs.soton.ac.uk:8080/dbhome.html SS indicated that NAS PB 1.0 (vendor optimized) results collected from machines such as: NEC SX-4, Fujitsu VPP700, Hitachi SR2201, Convex SPP2000, SGI Origin 2000, Cray T3E, and IBM's new 135 MHz-based machine would be released at SC'96. He showed that for the Class A SP benchmark on 64 processors the HPF-compiled versions (PGI, APR) were between 2 and 2.7 times slower than the MPI versions on the IBM SP/2. On the Cray T3D, the HPF-compiled versions were between 1.7 and 2.2 times slower than the optimal MPI versions. For the Class A BT benchmark on 64 processors, the HPF-compiled versions ranged between 1.2 and 1.5 times slower than the MPI version. SS, however, did point out that the vendors' optimized MPI versions sometimes required twice the memory of the HPF-compiled versions (a trade-off of memory for speed). SS indicated that the NPB 2.0 MPI-based versions are complete and that the NPB 3.0 HPF-based versions of BT, SP, FT, and EP are being prepared. 
He also stressed the need for PARKBENCH to join in the Petaflop performance modeling efforts related to applications in climate and ocean modeling, for example. SS also discussed the history of nanotechnology and the principle of "writing with atoms". He posed an important question to the group: "are current benchmarks suitable for future application-based simulations?" At 1:55pm, JD and ES asked the group to consider reducing the PARKBENCH group structure (low-level, kernel, compact applications, methodology, HPF, and analysis) down to a single group. It was unanimously agreed that a single PARKBENCH group be used with the electronic mail address parkbench-comm@cs.utk.edu as the single channel of communication. ES pointed out that the webpages may need to be changed to indicate the single group structure. ES then suggested that a PARKBENCH bibliography be constructed from all PARKBENCH-related publications and pointers to other benchmarking activities. The formation of a benchmarking catalogue was suggested. The group then discussed an update of the previous PARKBENCH report (edited by Hockney and Berry). It was decided that a second report should detail changes in the PARKBENCH effort and address issues in Petaflops performance modeling. MBa, MBe, ES, SS, and VG agreed to help produce the second PARKBENCH report. JD suggested that the report be ready by Supercomputing'96. JD and VG then commented on efforts for funding PARKBENCH activities. JD indicated that the UT-based proposal is still in a development phase and VG indicated that the Universities of Southampton and Westminster had submitted a joint proposal to get a 3-year grant for Optimization and Performance Modeling work. The group also made a list of future benchmarks that might be considered for PARKBENCH: I/O (MBa will investigate possible benchmarks); C, C++; Web Servers; Posix Threads. JD indicated that the PARKBENCH BOFS (Birds-Of-a-Feather-Session) at Supercomputing'96 in Pittsburgh will be on Tuesday (Nov. 
19) at 1:30pm (2 hours) in the Washington Room of the Doubletree Hotel. Current PARKBENCH results and recent NPB data (presented by SS) will be shown before questions are taken from the audience. The final discussion of this meeting concerned the creation of a new electronic journal on performance evaluation and analysis which would be published on a Web Server at Southampton and mirrored on the Netlib server at UTK. The group suggested a variety of names appropriate for such a journal including: - High-Performance Evaluation and Modeling, - Performance Evaluation and Modeling for High-Performance Computing, - High-Performance Computing: Evaluation and Modeling, plus other combinations. *** ACTION ITEM *** JD will send out candidate names for the electronic journal to parkbench-comm@cs.utk.edu for voting by PARKBENCH members. The name decision was postponed to give all members time to propose alternatives. Suggestions for populating the Editorial Board of the proposed journal include: Joanne Martin (IBM), Phil Tannenbaum (NEC), Ken Miura, an SGI representative, and Japanese computer vendor representatives. ES suggested that an electronic mail address be created for this journal for communication of such topics. The next Parkbench Meeting is scheduled for April 1997 in Knoxville. JD adjourned the meeting at 3:15pm EST. 
From owner-parkbench-comm@CS.UTK.EDU Tue Nov 12 13:52:09 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id NAA07461; Tue, 12 Nov 1996 13:52:08 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA14754; Tue, 12 Nov 1996 11:20:28 -0500 Received: from berry.cs.utk.edu (BERRY.CS.UTK.EDU [128.169.94.70]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id LAA14742; Tue, 12 Nov 1996 11:20:26 -0500 Received: from cs.utk.edu by berry.cs.utk.edu with ESMTP (cf v2.11c-UTK) id LAA13494; Tue, 12 Nov 1996 11:16:42 -0500 Message-Id: <199611121616.LAA13494@berry.cs.utk.edu> to: parkbench-comm@CS.UTK.EDU Subject: another version of minutes Date: Tue, 12 Nov 1996 11:16:42 -0500 From: "Michael W. Berry" Yes, it's me again with a new version of the minutes. I added a more appropriate specification of the new IBM SP to the NAS PB description (provided by Edgar K.). Regards, Mike ----------------------------------------------------------------- Minutes of ParKBench Meeting - Knoxville Hilton, October 31, 1996 ----------------------------------------------------------------- Attendee List: (MBa) Mark Baker Univ. of Portsmouth mab@sis.port.ac.uk (MBe) Michael Berry Univ. of Tennessee berry@cs.utk.edu (JD) Jack Dongarra Univ. of Tenn./ORNL dongarra@cs.utk.edu (VG) Vladimir Getov Univ. of Westminster getovv@wmin.ac.uk Christian Halloy Univ. of Tennessee halloy@cs.utk.edu Adolfy Hoisie Cornell Theory Ctr hoisie@tc.cornell.edu Edgar Kalns IBM kalns@vnet.ibm.com Phil Mucci Univ. of Tennessee mucci@cs.utk.edu Erik Riedel GENIAS Software GmbH erik@genias.de (SS) Subhash Saini NASA Ames saini@nas.nasa.gov (RS) Ron Sercely Convex/HP sercely@convex.hp.com (ES) Erich Strohmaier Univ. of Tennessee erich@cs.utk.edu At 9:05am EST, JD opened the meeting. The participants introduced themselves and the final agenda was discussed. At 9:20am ES gave a status report of the current release of ParkBench. 
He reviewed the ParkBench structure and the changes in V2.1.1 relative to the previous version 2.0. The changes to the MPI implementations of comms2 and comms3 were discussed in detail. Further changes to comms3 were proposed by RS. The committee decided to change the MPI and PVM versions of comms3 to model the MPI_ALLTOALL communication pattern. At 10:15am ES presented the table of results available. The committee reviewed selected results and discussed in detail the changes to vector lengths and message sizes in the Low Level benchmarks needed to reflect the increasing memory and cache sizes. Two parameters, TOTMEM and TOTCACHE, will be used for this. JD suggested characterizing more precisely which measurements are performed. He also proposed adding a test for cold-cache behavior to the existing ones, which only work on warm caches. JD and ES will investigate necessary changes to the LINALG kernels to facilitate their usage. VG reported that a set of measurement results has recently been obtained with the low-level sequential benchmarks on a number of high-performance workstations, including SGI-Onyx, DEC-Alpha, and SUN-Ultra2. These results will be submitted in due course. At 11:00am JD started the discussion about additions for the next release. ES reviewed the status of the old HPF Compiler test suite. The committee agreed on generating a new set of Low Level tests which should use data exchange patterns similar to those of the Message Passing tests. Charles Koelbel will be contacted to help with this. The necessity of additional HPF tests will also be investigated. *** ACTION ITEM *** MBa will talk with Chuck K. and others (NPAC, etc.) about assembling the HPF codes. At 11:15am MBa proposed to generate a similar set of tests for shared memory programming. The committee agreed that one-sided exchange operations like put and get would be a good first candidate for this. The standardization of these constructs in MPI-2 would provide a good starting point for the actual implementation. 
RS indicated that he could look into thread-based SM-type benchmarks. There was also a discussion about BSP and Oxford lab. It was suggested that this group should encourage them to complete low-level codes which could be added to Netlib. MBa will report the details of this discussion to Oxford. At 11:30am MBe reviewed the minutes from the April 26, 1996 PARKBENCH meeting and the following corrections were made prior to acceptance of these minutes: "SS then presented some results on HPF-based NAS Parallel Benchmarks (PB). He pointed out the heavy investment at NAS in CMF on the CM-2 and CM-5 (shut down in March 1995). He showed that for the FT benchmarks the MPI versions were about 2 times faster than the HPF versions of NAS PB Version 2.0 on the IBM SP/2. He also showed that for the SP benchmark the MPI version was about 3 times faster than the HPF versions of NAS PB Version 2.0 on the IBM SP/2." RS pointed out that many of the commercial benchmark codes (MSC/NASTRAN, STARCD, ABACUS, LSDYNA, and PAMCRASH) discussed at the April PARKBENCH meeting utilize PVM and that MPI versions are now in beta test for most vendors. RS also pointed out that memory referencing patterns and communication can vary greatly for such codes across different problem sizes and datasets. VG indicated that the PIC code (via R. Hockney) called LPM that was under consideration for a future release of PARKBENCH is proprietary. The group then convened for lunch at the Soup Kitchen at noon. At 12:55pm, MBa gave an overview of GBIS (which is now over 2 years old). He compared it with the presentation of NPB results at NAS, which uses precomputed PostScript graph files. He indicated that a prototype of the new interface for GBIS has been designed which is currently using a DB2 database (demo planned for Supercomputing'96). He suggested that future work in the coupling of object databases and Java graphics might be the best approach for upcoming interfaces to PARKBENCH data. 
A more efficient mechanism is also needed for adding new PARKBENCH data. The Database Navigator Project at the Univ. of Southampton, which has updated GBIS, is available at the URL below: http://barrel.ecs.soton.ac.uk:8080/dbhome.html SS indicated that NAS PB 1.0 (vendor optimized) results collected from machines such as: NEC SX-4, Fujitsu VPP700, Hitachi SR2201, Convex SPP2000, SGI Origin 2000, Cray T3E, and IBM's new Power2 Super Chip (P2SC 120 MHz)-based SP would be released at SC'96. He showed that for the Class A SP benchmark on 64 processors the HPF-compiled versions (PGI, APR) were between 2 and 2.7 times slower than the MPI versions on the IBM SP/2. On the Cray T3D, the HPF-compiled versions were between 1.7 and 2.2 times slower than the optimal MPI versions. For the Class A BT benchmark on 64 processors, the HPF-compiled versions ranged between 1.2 and 1.5 times slower than the MPI version. SS, however, did point out that the vendors' optimized MPI versions sometimes required twice the memory of the HPF-compiled versions (a trade-off of memory for speed). SS indicated that the NPB 2.0 MPI-based versions are complete and that the NPB 3.0 HPF-based versions of BT, SP, FT, and EP are being prepared. He also stressed the need for PARKBENCH to join in the Petaflop performance modeling efforts related to applications in climate and ocean modeling, for example. SS also discussed the history of nanotechnology and the principle of "writing with atoms". He posed an important question to the group: "are current benchmarks suitable for future application-based simulations?" At 1:55pm, JD and ES asked the group to consider reducing the PARKBENCH group structure (low-level, kernel, compact applications, methodology, HPF, and analysis) down to a single group. It was unanimously agreed that a single PARKBENCH group be used with the electronic mail address parkbench-comm@cs.utk.edu as the single channel of communication. 
ES pointed out that the webpages may need to be changed to indicate the single group structure. ES then suggested that a PARKBENCH bibliography be constructed from all PARKBENCH-related publications and pointers to other benchmarking activities. The formation of a benchmarking catalogue was suggested. The group then discussed an update of the previous PARKBENCH report (edited by Hockney and Berry). It was decided that a second report should detail changes in the PARKBENCH effort and address issues in Petaflops performance modeling. MBa, MBe, ES, SS, and VG agreed to help produce the second PARKBENCH report. JD suggested that the report be ready by Supercomputing'96. JD and VG then commented on efforts for funding PARKBENCH activities. JD indicated that the UT-based proposal is still in a development phase and VG indicated that the Universities of Southampton and Westminster had submitted a joint proposal to get a 3-year grant for Optimization and Performance Modeling work. The group also made a list of future benchmarks that might be considered for PARKBENCH:
- I/O (MBa will investigate possible benchmarks)
- C, C++
- Web Servers
- Posix Threads
JD indicated that the PARKBENCH BOFS (Birds-Of-a-Feather-Session) at Supercomputing'96 in Pittsburgh will be on Tuesday (Nov. 19) at 1:30pm (2 hours) in the Washington Room of the Doubletree Hotel. Current PARKBENCH results and recent NPB data (the latter by SS) will be presented before questions are taken from the audience. The final discussion of this meeting concerned the creation of a new electronic journal on performance evaluation and analysis which would be published on a Web Server at Southampton and mirrored on the Netlib server at UTK. The group suggested a variety of names appropriate for such a journal including:
- High-Performance Evaluation and Modeling,
- Performance Evaluation and Modeling for High-Performance Computing,
- High-Performance Computing: Evaluation and Modeling,
plus other combinations.
*** ACTION ITEM *** JD will send out candidate names for the electronic journal to parkbench-comm@cs.utk.edu for voting by PARKBENCH members. The name decision was postponed to give all members time to propose alternatives. Suggestions for populating the Editorial Board of the proposed journal includes: Joanne Martin (IBM), Phil Tannenbaum (NEC), Ken Miura, SGI representative, and Japanese computer vendor representatives. ES suggested that an electronic mail address be created for this journal for communication of such topics. The next Parkbench Meeting is scheduled for April 1997 in Knoxville. JD adjourned the meeting at 3:15pm EST. From owner-parkbench-comm@CS.UTK.EDU Wed Nov 13 15:36:18 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id PAA25421; Wed, 13 Nov 1996 15:36:18 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA12703; Wed, 13 Nov 1996 15:04:10 -0500 Received: from blueberry.cs.utk.edu (BLUEBERRY.CS.UTK.EDU [128.169.92.34]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id PAA12697; Wed, 13 Nov 1996 15:04:08 -0500 Received: by blueberry.cs.utk.edu (cf v2.11c-UTK) id UAA02445; Wed, 13 Nov 1996 20:01:47 GMT From: "Erich Strohmaier" Message-Id: <9611131501.ZM2443@blueberry.cs.utk.edu> Date: Wed, 13 Nov 1996 15:01:46 -0500 X-Face: ,v?vp%=2zU8m.23T00H*9+qjCVLwK{V3T{?1^Bua(Ud:|%?@D!~^v^hoA@Z5/*TU[RFq_n'n"}z{qhQ^Q3'Mexsxg0XW>+CbEOca91voac=P/w]>n_nS]V_ZL>XRSYWi:{MzalK9Hb^=B}Y*[x*MOX7R=*V}PI.HG~2 X-Mailer: Z-Mail (3.2.0 26oct94 MediaMail) To: parkbench-comm@CS.UTK.EDU Subject: ParkBench BOF - Supercomputing'96 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Dear Colleague, The ParkBench (PARallel Kernels and BENCHmarks) committee has organized a BOF session at the Supercomputing'96 in Pittsburgh. Room: Doubletree Hotel, Washington Room Time: Tuesday 1:30-3:30pm We will talk about the latest release, results available and plans for the coming releases. 
The initial 'Call for Papers' for our new journal will also be available: Electronic Journal of Performance Evaluation and Modeling for Computer Systems Tentative Agenda of the BOF - Introduction, background, WWW-Server - Current Release of ParkBench - Low Level Performance Evaluation Tools - LinAlg Kernel Benchmarks - NAS Parallel Benchmarks, including latest results - Plans for the next Release - Electronic Journal of Performance Evaluation and Modeling for Computer Systems - Questions from the floor / discussion Please mark your calendar and plan to attend. Jack Dongarra Tony Hey Erich Strohmaier From owner-parkbench-comm@CS.UTK.EDU Tue Nov 19 06:59:54 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id GAA26554; Tue, 19 Nov 1996 06:59:53 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA23052; Tue, 19 Nov 1996 06:45:22 -0500 Received: from postoffice.npac.syr.edu (postoffice.npac.syr.edu [128.230.7.30]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id GAA23038; Tue, 19 Nov 1996 06:45:17 -0500 From: Received: from yosemite (pc280.sis.port.ac.uk [148.197.205.60]) by postoffice.npac.syr.edu (8.7.5/8.7.1) with SMTP id GAA24427 for ; Tue, 19 Nov 1996 06:45:05 -0500 (EST) Date: Tue, 19 Nov 96 11:24:13 Subject: IO Benchmarks To: parkbench-comm@CS.UTK.EDU X-PRIORITY: 3 (Normal) X-Mailer: Chameleon 5.0.1, TCP/IP for Windows, NetManage Inc. Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII I have started looking into IO benchmarks - as per the action from the last Parkbench committee meeting. From what I can see David Patterson (Berkeley) and Peter Chen (Michigan) - see http://www.eecs.umich.edu/~pmchen/ for a list of their work - have done a considerable amount of research in the area of IO benchmarks (see "Self-Scaling I/O Benchmarks..." paper for example).
I would like to approach these researchers to help our activities - maybe they would be interested in getting actively involved or acting on a consultative basis, i.e. giving us some good advice... I will assume that the committee members do not have a problem with this? Regarding an associated matter, it would be useful to get someone actively involved with the MPI IO work to work with the group within Parkbench looking at IO benchmarks. Are there any suggestions of who I could approach? Regards Mark ------------------------------------- DIS, University of Portsmouth, Hants, UK Tel: +44 1705 844285 E-mail: mab@npac.syr.edu Date: 11/19/96 - Time: 11:24:13 AM URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Thu Nov 21 18:38:04 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id SAA19974; Thu, 21 Nov 1996 18:38:03 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id SAA05216; Thu, 21 Nov 1996 18:27:14 -0500 Received: from cuspidor.cs.utk.edu (CUSPIDOR.CS.UTK.EDU [128.169.92.137]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id SAA05209; Thu, 21 Nov 1996 18:27:11 -0500 From: Philip Mucci Received: by cuspidor.cs.utk.edu (cf v2.11c-UTK) id SAA28593; Thu, 21 Nov 1996 18:27:09 -0500 Date: Thu, 21 Nov 1996 18:27:09 -0500 Message-Id: <199611212327.SAA28593@cuspidor.cs.utk.edu> To: parkbench-comm@CS.UTK.EDU Subject: Comms3 all to all X-Mailer: [XMailTool v3.1.2b] I have a comms3 with all to all. Send me mail if you'd like to try it out. MPICH and P4 perform horribly; apparently they don't use any sort of fan-out, but rather a linear exchange. I'd be interested in running it on some decent machines... /%*\ Philip J. Mucci | GRA in CS under Dr.
JJ Dongarra /*%\ \*%/ http://www.cs.utk.edu/~mucci PVM/Active Messages \%*/ From owner-parkbench-comm@CS.UTK.EDU Wed Nov 27 05:45:30 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id FAA08822; Wed, 27 Nov 1996 05:45:29 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id FAA26050; Wed, 27 Nov 1996 05:36:05 -0500 Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id FAA26040; Wed, 27 Nov 1996 05:35:58 -0500 From: Received: from yosemite (pc280.sis.port.ac.uk) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA23435; Wed, 27 Nov 96 10:36:52 GMT Date: Wed, 27 Nov 96 10:24:16 Subject: Set up HPF mailing group... To: parkbench-comm@CS.UTK.EDU X-Priority: 3 (Normal) X-Mailer: Chameleon 5.0.1, TCP/IP for Windows, NetManage Inc. Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII I have started discussing HPF Parkbench codes with a number of interested parties. It seems appropriate that we set up an HPF mailing group as a discussion forum. Could someone at UTK set up a majordomo email group called something like parkbench-hpf@cs.utk.edu?
Regards Mark ------------------------------------- DIS, University of Portsmouth, Hants, UK Tel: +44 1705 844285 E-mail: mab@npac.syr.edu Date: 11/27/96 - Time: 10:24:16 AM URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Mon Dec 2 10:07:38 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA29548; Mon, 2 Dec 1996 10:07:38 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA03102; Mon, 2 Dec 1996 09:54:27 -0500 Received: from convex.convex.com (convex.convex.com [130.168.1.1]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id JAA03093; Mon, 2 Dec 1996 09:54:22 -0500 Received: from brittany.rsn.hp.com by convex.convex.com (8.6.4.2/1.35) id IAA05873; Mon, 2 Dec 1996 08:53:48 -0600 Received: from localhost by brittany.rsn.hp.com with SMTP (1.38.193.4/16.2) id AA03758; Mon, 2 Dec 1996 08:54:48 -0600 Sender: sercely@convex.convex.com Message-Id: <32A2EDB8.1584@convex.com> Date: Mon, 02 Dec 1996 08:54:48 -0600 From: Ron Sercely Organization: Hewlett-Packard Convex Technology Center X-Mailer: Mozilla 2.0 (X11; I; HP-UX A.09.05 9000/710) Mime-Version: 1.0 To: parkbench-comm@CS.UTK.EDU Subject: Problems with LOWLEVEL comms benchmarks Content-Type: multipart/mixed; boundary="------------F2D48B923F" This is a multi-part message in MIME format. --------------F2D48B923F Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit -- Ron Sercely HP/CXTC Toolsmith --------------F2D48B923F Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="comms.summary" INTRODUCTION I recently submitted results to the committee, and since then CXD has released an implementation of MPI that results in substantially better performance on these benchmarks (up to a factor of 5x), but also reveals some problems with the comms benchmarks.
I am unsure how to proceed, hence this wide distribution. With the current comms1 and comms2 benchmarks, the results are such nonsense that we cannot report results, so a prompt resolution is necessary.

MY CONCLUSIONS Vendors should report startup for short messages, maximum bandwidth and the message size that produced it, and the result file, only. A plot of the result file should allow a user of the benchmark to get the relevant information from the benchmark.

DISCUSSION Briefly, these benchmarks assume that message bandwidth increases asymptotically with increasing message size. This assumption does not hold for the current HP/Convex Division (CXD) implementation. Caching effects of the CPU and the fact that we are a shared memory machine result in a maximum bandwidth at a transfer size comparable to the cache size, with decreasing bandwidth for larger messages. Our curve for comms1 looks something like this:

[ASCII plot: bandwidth vs message size, rising to a peak and then falling off for larger messages]

This causes several problems.
1. The routine that estimates message passing time, to set the loop count, returns negative transfer times, which results in the comms code using a loop count of 1, which gives terrible statistics.
2. When this is corrected, the method used to correct times for the "check" routine overhead is poor. It measures the time each time through the loop, but discards the values except for the last iteration. For large messages, CXD spends 1/3 of the loop time in this check routine, so measuring this time as precisely as possible is critical.
With these changes, CXD reliably gets results similar to the graph above (for "unmodified" source).
To put a scale to this graph, here is a sample result file:

=================================================
===                                           ===
===  GENESIS / ParkBench Parallel Benchmarks  ===
===                                           ===
===               comms1_mpi                  ===
===                                           ===
=================================================

The measurement time requested for each test case was 1.00E+00 seconds.

SHORT MESSAGES <= 576B (B=Byte)
Zero length messages were used in least squares fitting.

Case  LENGTH(B)   TIME(sec)   RINF(B/s)    N1/2(B)     %error fit
   1          0   5.180E-06    .000E+00    .000E+00     .000E+00
   2          1   6.884E-06   5.869E+05   3.040E+00     .000E+00
   3          2   6.996E-06   1.102E+06   5.999E+00    5.364E+00
   4         10   7.956E-06   4.966E+06   3.029E+01    7.576E+00
   5         30   1.171E-05   5.352E+06   3.285E+01    4.627E+00
   6        100   1.287E-05   1.493E+07   1.046E+02    1.072E+01
   7        300   1.651E-05   3.063E+07   2.348E+02    1.059E+01
   8        576   2.214E-05   3.812E+07   3.026E+02    8.044E+00

LONG MESSAGES > 576B (B=Byte)

Case  LENGTH(B)   TIME(sec)   RINF(B/s)    N1/2(B)     %error fit
   9        577   3.254E-05    .000E+00    .000E+00     .000E+00
  10       1000   3.903E-05   6.510E+07   1.541E+03     .000E+00
  11       2000   4.138E-05   1.813E+08   5.634E+03    4.293E+00
  12       5000   5.271E-05   2.443E+08   7.971E+03    3.263E+00
  13      10000   6.792E-05   2.845E+08   9.577E+03    2.700E+00
  14      20000   1.137E-04   2.509E+08   8.096E+03    2.098E+00
  15      50000   2.340E-04   2.479E+08   7.934E+03    9.549E-01
  16     100000   4.459E-04   2.423E+08   7.539E+03    5.952E-01
  17     300000   1.341E-03   2.296E+08   6.107E+03    5.905E-01
  18    1000000   4.539E-03   2.219E+08   4.384E+03    3.466E-01
  19    3000000   1.567E-02   1.933E+08  -1.136E+04    1.277E+00
  20   10000000   5.234E-02   1.910E+08  -1.521E+04    3.807E-01
  21   30000000   1.671E-01   1.802E+08  -6.731E+04    5.413E-01
  22  100000000   5.472E-01   1.825E+08  -3.331E+04    1.906E-01

------------------------
COMMS1: Message Pingpong
------------------------
Result Summmary
---------------
Short Messages, <= 576 Byte
rinf = 38.116 MByte/s, nhalf = 302.592 Byte, startup = 7.939 us
Long Messages, > 576 Byte
rinf = 182.491 MByte/s, nhalf = -33314.086 Byte, startup = -182.552 us

MY CONCLUSIONS Vendors should report startup for short messages, maximum bandwidth and the message size
that produced it, and the result file, only. Four changes are necessary to the benchmarks: 1. New ESTCOM routine that does not assume asymptotic behavior. 2. Modified COMMS to correct times based upon total time spent in CHECK. 3. Result Summary should ONLY report startup for short messages, and maximum bandwidth/message size. Possibly, instead of fitting for startup time, the measured time for 0 or 1 byte messages should be used. 4. Results file should print measured, as opposed to fit, bandwidth. I have MPI source to reflect these changes, if the committee would like it. --------------F2D48B923F-- From owner-parkbench-comm@CS.UTK.EDU Mon Dec 2 14:34:54 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id OAA03010; Mon, 2 Dec 1996 14:34:54 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA27832; Mon, 2 Dec 1996 14:14:06 -0500 Received: from donner.cs.utk.edu (DONNER.CS.UTK.EDU [128.169.94.222]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id OAA27818; Mon, 2 Dec 1996 14:14:02 -0500 From: Philip Mucci Received: by donner.cs.utk.edu (cf v2.11c-UTK) id OAA25484; Mon, 2 Dec 1996 14:13:58 -0500 Date: Mon, 2 Dec 1996 14:13:58 -0500 Message-Id: <199612021913.OAA25484@donner.cs.utk.edu> To: sercely@convex.convex.com, parkbench-comm@CS.UTK.EDU Subject: Re: Problems with LOWLEVEL comms benchmarks In-Reply-To: <32A2EDB8.1584@convex.com> X-Mailer: [XMailTool v3.1.2b] Hi Ron, If you could prepare us a set of diffs for the code versus the old code, we could take a look. Since I started on this project, I have been complaining about the poor organization of the Low Level benchmarks. In fact, more than one executive has complained/asked me why they see such poor performance out of the comms. Your patch will definitely do the trick, but in the long term, a rewrite will be necessary. In addition, I believe the data analysis section should be completely separate from the benchmarks themselves.
The benchmarks should produce data points, and fitting should be done externally to handle this kind of situation. (I'm sure something similar will arise again...) We should start over and have them rewritten in C, scrapping the output format and curve fitting code. In addition, they should be written by someone who understands end-to-end performance issues. (Remember, Low Levels are supposed to tell you what the hardware's capable of, not what application throughput you can expect.) I'd be happy to help with spec'ing how these benchmarks should be done... I'm sure the other members will have input on this, we'll see what the general consensus is. However, I'm sure all the vendors would like to start seeing numbers that parallel their own internal measurements... Send us that patch when you can... /%*\ Philip J. Mucci | GRA in CS under Dr. JJ Dongarra /*%\ \*%/ http://www.cs.utk.edu/~mucci PVM/Active Messages \%*/ From owner-parkbench-comm@CS.UTK.EDU Fri Dec 13 09:24:11 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id JAA11428; Fri, 13 Dec 1996 09:24:10 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA29640; Fri, 13 Dec 1996 09:07:40 -0500 Received: from relay-11.mail.demon.net (relay-11.mail.demon.net [194.217.242.137]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA29632; Fri, 13 Dec 1996 09:07:32 -0500 Received: from minnow.demon.co.uk ([158.152.73.63]) by relay-9.mail.demon.net id aa900478; 13 Dec 96 13:54 GMT Message-ID: Date: Fri, 13 Dec 1996 13:53:49 +0000 To: parkbench-comm@CS.UTK.EDU From: Roger Hockney Subject: Reply to Phil Mucci regarding COMMS1 MIME-Version: 1.0 X-Mailer: Turnpike Version 3.01 This is a response to the contribution of Phil Mucci on the Low-level Benchmarks: Phil, I think you are proposing to throw far too much away and perhaps even "the baby out with the bath water".
I think also there is a misunderstanding as to the purpose of the Low-level Parkbenchmarks. I shall respond to your points as they appear in your note. Your text is identified with an introductory >> in the usual way: >>Since I started on this project, I have been complaining about the >>poor organization of the Low Level benchmarks. You will have to be more specific here. COMMS1 in particular has been picked over by several respected people at Southampton, so there cannot be too much wrong with it. Remember that programming style and organisation are very subjective. One person's beautiful program is another person's abomination, particularly if they come from very different backgrounds and experiences. >>In fact, more than one executive has complained/asked me why they >>see such poor performance out of the comms. As detailed below, the LL benchmarks aim to test the performance of the computer as seen through the high level programming interface that is used. If COMMS1, e.g., gives poor results, the executive should find out what it is in the Fortran programming interface that degrades the basic hardware capability which his internal tests are probably measuring. Users of his company's computer are more likely to see the performance through a high-level language and obtain results like COMMS1, than they are to see the performance of his internal tests. >>Your patch will definitely do the trick, but in the long term, >>a rewrite will be necessary. I think a minimalist approach is better. Some changes are certainly advisable but a complete rewrite is not needed. People gradually get to know and understand benchmark code, and code stability is important to establish as far as this is possible and sensible. >>In addition, I believe the data >>analysis section should be completely separate from the benchmarks >>themselves. The benchmarks should produce data points, and fitting >>should be done externally to handle this kind of situation.
(I'm sure >>something similar will arise again...) Section 3.2 of PBR states that the " ... LL benchmarks aim to measure performance parameters that characterise the basic architecture of the computer and the compiler software through which it is used". The measurement of performance parameters requires a curve-fitting procedure, which means that the analysis section is an integral part of the benchmark and should not be separated from it. Why make two jobs out of one? Internally of course the curve-fitting procedure is separated out as a subroutine. Actually this is the only subroutine that needs to be changed in COMMS1 to solve the difficulty that Ron Sercely has had with COMMS1 (see my response to his e-mail to this group). This is where you are proposing to "throw the baby out with the bath water", because if you remove the performance parameter calculation from COMMS1 you remove a great deal of its value. Remember that the curve fitting replaces a lengthy table of numbers with just two performance parameters (rinf,nhalf) and a formula r=rinf/(1+nhalf/n). This is a tremendous simplification which can be used to produce timing relations for more complex codes in the form of mathematical formulae. Such a performance characterisation really does open up new opportunities for understanding and interpreting performance. It should not be thrown away because of trouble with the fitting program. The best solution is to correct the fitting program, in the way that I have shown. The value of the parameter nhalf is particularly revealing because it tells you how important, in terms of message length, a particular value of the startup time t0 is in degrading bandwidth performance. The factor (1+nhalf/n) in the above formula IS the degradation factor introduced by the startup time. Notice the startup time itself is not relevant, it is the product of startup time and asymptotic bandwidth that counts, and that EQUALS nhalf.
That is to say nhalf is the number of bytes that could have been transferred at the asymptotic rate during the startup time. For more on (rinf,nhalf) the best source is my new book "The Science of Computer Benchmarking" SIAM 1996 in Jack's series on Software, Environments, and Tools (the book's URL is on my home page, whose URL is in my e-mail signature below). >>We should start over and have them rewritten in C, Why C? There is no point in writing a benchmark in C and thus measuring the performance as seen through C, if you want to find out the performance of code written in Fortran. To find out the performance of code in Fortran, the benchmarks must be written in Fortran. This is not to say that a version of the LLs in C would not be welcome and useful to those who might be programming applications in C. It's just that the Parkbench Committee initially selected as its primary programming model Fortran77 + PVM and later extended this to F77 + MPI (see section 1.4 of the Parkbench Report (PBR)) >>scrapping the output format and curve fitting code. The present output format gives both the primary measured data (t vs n) and the changing performance parameter values as more data is added. This is valuable for seeing the stability of the fitted parameters. I and Ron would like to see one column added for the measured bandwidth (= n/t). This should have been in the original code. It was an omission which I would like to see corrected. I would also like to see a slight change in the presentation of the Summary Results as indicated in my reply to Ron's comments. Users can then decide whether they wish to just use the table of measured (t vs n), or whether they wish to use the parameters. They may also re-analyse the data in any way they wish. All options are still open. Since the curve fitting is sometimes difficult (as we are learning) it is surely helpful to provide, as part of the benchmark, the best shot we can for this. 
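A minimal sketch of the proposed measured-bandwidth column (= n/t), assuming the long-message (n, t) pairs printed in Ron's result file. The names LONG_ROWS, measured_bandwidth and peak_bandwidth are illustrative only and are not part of COMMS1:

```python
# (length in bytes, time in seconds) pairs copied from the LONG MESSAGES
# section of Ron Sercely's comms1_mpi result file (cases 9 to 22).
LONG_ROWS = [
    (577, 3.254e-05), (1000, 3.903e-05), (2000, 4.138e-05),
    (5000, 5.271e-05), (10000, 6.792e-05), (20000, 1.137e-04),
    (50000, 2.340e-04), (100000, 4.459e-04), (300000, 1.341e-03),
    (1000000, 4.539e-03), (3000000, 1.567e-02), (10000000, 5.234e-02),
    (30000000, 1.671e-01), (100000000, 5.472e-01),
]


def measured_bandwidth(rows):
    """Return (n, t, n/t) triples: the extra 'measured bandwidth' column."""
    return [(n, t, n / t) for n, t in rows]


def peak_bandwidth(rows):
    """Return the (n, t, n/t) triple with the highest measured bandwidth."""
    return max(measured_bandwidth(rows), key=lambda row: row[2])


if __name__ == "__main__":
    print("LENGTH(B)    TIME(sec)    MEASURED(B/s)")
    for n, t, r in measured_bandwidth(LONG_ROWS):
        print(f"{n:9d}    {t:9.3E}    {r:9.3E}")
    n_peak, _, r_peak = peak_bandwidth(LONG_ROWS)
    print(f"peak measured bandwidth {r_peak:.3E} B/s at n = {n_peak} B")
```

Printing n/t next to the fitted RINF column makes the rise and fall-off of bandwidth visible directly, which is the information Ron asks vendors to report.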
Usually for COMMS1 the parametric fit works exceptionally well. >>In addition, they should be written by someone who understands >>end-to-end performance issues. Can you explain to the uninitiated what this means, please? >>(Remember, Low Levels are supposed to tell you what the hardware's >>capable of, not what application throughput you can expect.) I think that you are quite wrong here. The Introduction of the Parkbench report clearly states in section 1.4: "Computer benchmarks are computer programs that form standard tests for the performance of a computer and the software through which it is used ..." and later in section 1.4: "A benchmark is therefore testing a software interface to a computer, and not a particular type of computer architecture. For example benchmarks using the "F77+PVM" programming model can be run on any computer providing this interface ..." Nowhere in the PBR does it say that the LL benchmarks are supposed to tell you what the hardware is capable of. From the users' perspective, I think the PBR's point of view is the more useful one. >>However, I'm sure all the vendors would like to start seeing >>numbers that parallel their own internal measurements... This is up to the vendors themselves: they need to find out what in their Fortran interface is degrading the performance, and then improve the interface. This is the kind of constructive effect that we hope the benchmarks will have on the vendors. For example, COMMS1 has drawn the vendors' attention to the importance of reducing t0 and nhalf in addition to their previous concentration only on increasing maximum bandwidth (=rinf). Actually we at Southampton who originated these LL benchmarks have NOT heard many "complaints" from vendors about the COMMS1 results. In our experience they usually agree with them. If there are such complaints, other than that from Ron, we would like to know about them, and can probably help you more than anyone else to resolve any problems.
It is anyway right that the code originators be contacted for agreement before any changes are made to these benchmarks. The code and revision history WAS part of the code contributed by the Southampton group to the Parkbenchmarks. You can tell from this who did what and when. It is important that this historical record be kept with the benchmarks. I notice that it has already disappeared from the output files where I think that it should be reinstated. Thank you Phil for raising these issues which are important ones. -- Roger Hockney. Check out my new Web page at URL http://www.minnow.demon.co.uk and link to my new book: "The Science of Computer Benchmarking" suggestions welcome. Know any fish movies or suitable links? From owner-parkbench-comm@CS.UTK.EDU Fri Dec 13 09:27:06 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id JAA11487; Fri, 13 Dec 1996 09:27:06 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA00441; Fri, 13 Dec 1996 09:16:29 -0500 Received: from relay-11.mail.demon.net (relay-11.mail.demon.net [194.217.242.137]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA00408; Fri, 13 Dec 1996 09:16:17 -0500 Received: from minnow.demon.co.uk ([158.152.73.63]) by relay-10.mail.demon.net id aa1020162; 13 Dec 96 13:54 GMT Message-ID: Date: Fri, 13 Dec 1996 13:52:03 +0000 To: parkbench-comm@CS.UTK.EDU From: Roger Hockney Subject: Reply to Ron Sercely regarding COMMS1 MIME-Version: 1.0 X-Mailer: Turnpike Version 3.01 I would like to thank Ron Sercely for bringing to our attention the problems with the COMMS1 benchmark (which will equally apply to COMMS2). I have spent about a week considering his data and isolating the cause of the parametric fitting problem. I believe that I have found a solution to the problem which means only a minor change to the existing benchmark.
The assumption made by the parametric fitting part of the COMMS1 benchmark is that the time, t, can be represented approximately as a linear function of message length, n, i.e.

    t = (n+nhalf)/rinf                 (1)

which implies that the bandwidth is given by

    r = n/t = rinf/(1+nhalf/n)         (2)

This type of variation is a very good fit to most of the data that I have seen and is even a reasonable fit to the data given by Ron both for n<=576 and for n>=577, as can be seen in the log(r) vs log(n) graph available on my Web page at URL: http://www.minnow.demon.co.uk/Pbench/emails/sercely1.htm This graph shows the measured data taken from Ron's e-mail as green circles, and two fits using the formula (2) above shown as dashed lines. These are NOT the fit lines produced by the current COMMS1, but show that the functional form (2) can fit Ron's data if rinf and nhalf are chosen sensibly. The falloff of bandwidth in the measured data for large n, mentioned by Ron, is only small (about 20%) and can be satisfactorily fitted by (2) which is roughly constant in this region. The problem is not that the above functional form is unsatisfactory, but that the current COMMS1 fitting program does not produce sensible values for the parameters (rinf,nhalf) when the range of time values becomes very large, such as e+6, and the bandwidth decreases with increasing large n, as in his data. The rest of this note is concerned with explaining how COMMS1 can be easily modified to produce sensible values of rinf and nhalf even in the case of such data. The graph shows the result of the proposed modified fitting procedure which I regard as satisfactory for all message lengths.
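Equations (1) and (2) can be exercised with a small sketch. It assumes the short-message summary values from Ron's result file (rinf = 38.116 MByte/s, nhalf = 302.592 Byte, with MByte taken as 1e6 bytes); the function names are illustrative and not taken from COMMS1:

```python
def time_model(n, rinf, nhalf):
    """Equation (1): predicted transfer time t = (n + nhalf) / rinf."""
    return (n + nhalf) / rinf


def bandwidth_model(n, rinf, nhalf):
    """Equation (2): predicted bandwidth r = rinf / (1 + nhalf/n), n > 0."""
    return rinf / (1.0 + nhalf / n)


def startup_time(rinf, nhalf):
    """Startup time t0 implied by the fit: t0 = nhalf / rinf (time at n = 0)."""
    return nhalf / rinf


if __name__ == "__main__":
    # Short-message parameters from the Result Summary (assumed 1 MByte = 1e6 B).
    rinf, nhalf = 38.116e6, 302.592
    print(f"t0 = {startup_time(rinf, nhalf) * 1e6:.3f} us")
    for n in (1, 30, 300, nhalf, 576):
        r = bandwidth_model(n, rinf, nhalf)
        print(f"n = {n:8.3f} B  r = {r:.3E} B/s  fraction of rinf = {r / rinf:.3f}")
```

At n = nhalf the model gives exactly half of rinf, which is where the name comes from, and nhalf/rinf reproduces the startup time printed in the summary.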
============================================================================ The reasons for the bad fit are as follows: Current Method -------------- The (rinf,nhalf) performance parameters are determined in the current COMMS1 by fitting the straight line (1) to the measured t vs n data in such a way as to minimise the sum of the squares of the ABSOLUTE difference (or error) between the measured and predicted value from (1). This is the normal way of fitting straight lines to measured data and is called linear regression by the statisticians or linear least-squares fitting by others. I have checked that the current COMMS1 produces the correct parameters using the above procedure by comparing the results with those produced by the Jandel SigmaPlot (SP) package that I happen to use. Both SigmaPlot and COMMS1 produce a negative nhalf=-33310 [SP value -33567 which I regard as agreement]. The curve from (2) with these parameters fits the last six data values (n>=3e+5) very well but gives silly even negative values for r for smaller values of n. But I emphasise these are the correct parameter values if one sets the condition to minimise the sum of the squares of the ABSOLUTE error, and that this procedure has worked satisfactorily with earlier versions of COMMS1 which used a maximum message length of about 40000. The problem with using the absolute error is that data values associated with the larger n-values are over emphasised in the sum being minimised with the effect that values for lower n are virtually ignored. Hence the parameters give a good fit for large n and can give a bad fit for small n. But this is what you get with standard unweighted linear least squares. Weighted Least Squares ---------------------- This is evidently a well known problem in curve fitting with data that ranges in value over many orders of magnitude, as it does NOW with COMMS1. 
The solution is to use weighted least squares (or regression) in which each value in the sum to be minimised is multiplied by a weight in such a way as to make each term of the sum of comparable magnitude and therefore of comparable importance in the fitting condition. If the weights are taken as the inverse square of the measured values t_i, then the sum of the squares of the RELATIVE (rather than absolute) error is minimised. When one thinks about it this is what we really want to do: to be able to predict t with a certain relative error over the whole range. We would be happy to know that t was fitted to within say 20%, but this would mean an absolute error that could vary by six orders of magnitude if the values of t varied by that amount, as they typically do. So we cannot achieve the desired result by demanding a roughly even absolute error as is done in the current benchmark. Using the relative error has the additional advantage that if t is known to within a certain relative error, then r is also known to the same relative error. If weighted least squares is used with Ron's data to minimise the relative error as described above, we obtain rinf=2.15e8 and nhalf=+6494 as the best fit parameters. Note that nhalf is now positive, that there are no negative values of r, and that now the fitted curve is held close to all the data values including those between n=577 and n=10000 that were previously not fitted at all. This is seen as the red dashed line (labelled fit8) on the log(r) vs log(n) graph that can be found on the web page mentioned above. Conclusion ---------- The problem of the unsatisfactory calculation of the (rinf,nhalf) performance parameters is solved by modifying the least-squares fitting procedure to minimise the sum of the squares of the relative rather than the absolute error (as the current code does). This requires only a small modification of the existing benchmark.
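The difference between the two fitting conditions can be sketched as follows. This is not the COMMS1 fitting routine: it is a minimal stand-alone illustration that fits t = a + b*n to the long-message (n, t) pairs printed in Ron's result file, once with unit weights (absolute error) and once with weights 1/t_i^2 (relative error), and then converts to rinf = 1/b and nhalf = a/b. The names LONG_ROWS and fit_rinf_nhalf are hypothetical:

```python
# (length in bytes, time in seconds) pairs copied from the LONG MESSAGES
# section of Ron Sercely's comms1_mpi result file (cases 9 to 22).
LONG_ROWS = [
    (577, 3.254e-05), (1000, 3.903e-05), (2000, 4.138e-05),
    (5000, 5.271e-05), (10000, 6.792e-05), (20000, 1.137e-04),
    (50000, 2.340e-04), (100000, 4.459e-04), (300000, 1.341e-03),
    (1000000, 4.539e-03), (3000000, 1.567e-02), (10000000, 5.234e-02),
    (30000000, 1.671e-01), (100000000, 5.472e-01),
]


def fit_rinf_nhalf(rows, relative=True):
    """Least-squares fit of t = a + b*n, returned as (rinf, nhalf).

    relative=True  : weights 1/t_i**2, minimising the squared RELATIVE error.
    relative=False : unit weights, minimising the squared ABSOLUTE error.
    """
    sw = swn = swnn = swt = swnt = 0.0
    for n, t in rows:
        w = 1.0 / (t * t) if relative else 1.0
        sw += w
        swn += w * n
        swnn += w * n * n
        swt += w * t
        swnt += w * n * t
    # Normal equations of the weighted straight-line fit.
    det = sw * swnn - swn * swn
    b = (sw * swnt - swn * swt) / det   # slope      = 1 / rinf
    a = (swt - b * swn) / sw            # intercept  = nhalf / rinf
    return 1.0 / b, a / b


if __name__ == "__main__":
    for label, relative in (("absolute error", False), ("relative error", True)):
        rinf, nhalf = fit_rinf_nhalf(LONG_ROWS, relative)
        print(f"{label}: rinf = {rinf:.3E} B/s, nhalf = {nhalf:.0f} B")
```

With the printed data the unweighted fit should give a negative nhalf, and the weighted fit should land close to the rinf=2.15e8 and nhalf=+6494 quoted above; small differences are expected because the times in the result file are rounded to four digits.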
I am making these modifications and will test them on the Cray T3D at Edinburgh, and also make the revised code available to others for evaluation. But give me a few more weeks.

Although the current COMMS1 fitting routine gives satisfactory results for message lengths less than about 40000, there is no doubt that the curve-fitting procedure within COMMS1 needs to be modified to cope with the larger range of message lengths that are now being tested. I believe that the above changes will make the routine much more robust to wide variations in time values.

*****************************************************************************
RON's OTHER POINTS
*****************************************************************************

>>With the current comms1 and comms2 benchmarks, the results are such nonsense
>>that we cannot report results, so a prompt resolution is necessary.

Actually the values of rinf reported are quite OK; it is only the nhalf values that are useless, for the reasons given above.

>> MY (Ron's) CONCLUSIONS
>>Vendors should report startup for short messages, maximum bandwidth and the
>>message size that produced it, and the result file, only.
>>A plot of the result file should allow a user of the benchmark to get the
>>relevant information from the benchmark.

If the fitting procedure is corrected as proposed above, the example of the revised fit to Ron's data shows that there is no need to throw away the parametric description, which has the following advantages:

(1) Accuracy
    --------
    The parametric values can be considered as a kind of average calculated from all the data values (i.e. their value depends on all the measured values). In the face of experimental error, their values should be intrinsically more reliable than a single spot measured value, such as the highest bandwidth observed or time for a zero length message.
    In much the same way, one would consider the average of 20 measurements of the same quantity to be more reliable than a single spot measurement.

(2) Completeness
    ------------
    The parametric values give an approximate fit to all the measured data, compared to the spot values which only give the performance at the two extremes of message length (zero and e+8).

(3) Importance of nhalf
    -------------------
    nhalf characterises, in terms of message length, the extent to which t0 degrades measured bandwidth. This is very important and should appear as part of the result summary.

(4) Data Compression
    ----------------
    Two numbers, rinf and nhalf, give through equations (1) and (2) the performance for all message lengths. Or, if different protocols are used for short and long messages, 4 numbers do the job. This contrasts with recording about 22 pairs of (t,n) values or 44 numbers. A data compression of a factor of 10.

(5) Performance understanding
    -------------------------
    The parameters together with equations (1) and (2) can be used in formulae for the performance of more complex codes. This contributes to the understanding of performance and the identification of bottlenecks.

More on the meaning and significance of rinf and nhalf can be found in my new book "The Science of Computer Benchmarking", SIAM 1996, in Jack's series "Software, Environments, and Tools" at URL: http://www.siam.org/catalog/mcc07/hockney.htm

>> MY CONCLUSIONS
>>Vendors should report startup for short messages, maximum bandwidth and the
>>message size that produced it, and the result file, only.
>>Four changes are necessary to the benchmarks:
>>1. New ESTCOM routine that does not assume asymptotic behavior.

If this routine is the one that does the least-squares fit, I outline above how I think this should be changed.

>>2. Modified COMMS to correct times based upon total time spent in CHECK.

My comments here have no connection with this point. Perhaps Ron's patch addresses this?

>>3. Result Summary should ONLY report startup for short messages, and maximum
>>bandwidth/message size. Possibly instead of fitting for startup time, the
>>measured time for 0 or 1 byte messages should be used

I agree that the Result Summary should be improved, and think we should include both the fitted parameters AND the spot values suggested above. I suggest a layout as follows:

------------------------------------------------------------------
Result Summary
--------------
Short Messages, fit to n <= 576 Byte
Bandwidth fitted to r = pi0*n/(1+n/nhalf)
  pi0 = 1.472e5 Hz, nhalf = 217.020 B
  measured t0 = 5.180e-6 s, fitted t0 = 6.793e-6 s (agreement to 31.1 %)
--------------
Long Messages, fit to n > 576 Byte
Bandwidth fitted to r = rinf/(1+nhalf/n)
  rinf = 2.154e+8 B/s, nhalf = 6494.??? B;
  measured r (at n=e+8) = 1.8275e+8 B/s, fitted rinf = 2.154e+8 B/s (agreement to 17.9 %)
-------------------------------------------------------------------

In the case that there is no break in the message transfer protocol, "fit to n <= 576" and "fit to n > 576" in the above format should both be replaced by "fit to all data". Then the two formulae give identical results: nhalf is the same for both fits and pi0=rinf/nhalf. The reason for keeping the two different functional forms even though they are identical algebraically is to emphasise the asymptotic form r = pi0*n for short messages (defined as messages with n < nhalf), and the asymptotic form r = rinf for long messages (defined as n > nhalf).

In the above revised result summary, I have included the fitting formula for convenience and completeness, and used the alternative (pi0,nhalf) parametric formula for short messages. This emphasises the linear increase of r with n for small n at a rate given by pi0 (units are (B/s) per Byte or s^(-1) = Hz). Note pi0=rinf/nhalf=1/t0(fit). Hence t0(fit)=1/pi0. Consistency of the fit can be checked by comparing t0(measured) to t0(fit) and measured r (at largest n) to fitted rinf.
The absolute % difference is also given.

This layout should satisfy everybody. Those liking performance parameters can look at the first line of results, and those who only believe measured spot values can look at the second line, which also gives a direct comparison of the two approaches.

>>4. Results file should print measured, as opposed to fit bandwidth.

I think BOTH should be included. This means keeping the existing output layout and adding a column for the measured bandwidth. The purpose of listing the fit values of rinf and nhalf for each value of n is to observe how stable these values are as n increases, and therefore how reliable they are.

--
Roger Hockney. Checkout my new Web page at URL http://www.minnow.demon.co.uk
and link to my new book: "The Science of Computer Benchmarking"
suggestions welcome. Know any fish movies or suitable links?

From owner-parkbench-comm@CS.UTK.EDU Mon Dec 16 16:51:19 1996
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id QAA21739; Mon, 16 Dec 1996 16:51:19 -0500
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id QAA04652; Mon, 16 Dec 1996 16:40:33 -0500
Received: from timbuk.cray.com (root@timbuk.cray.com [128.162.19.7]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id QAA04632; Mon, 16 Dec 1996 16:40:27 -0500
Received: from ironwood.cray.com (root@ironwood-fddi.cray.com [128.162.21.36]) by timbuk.cray.com (8.8.4/CRI-gate-8-2.11) with SMTP id PAA04685 for ; Mon, 16 Dec 1996 15:40:18 -0600 (CST)
Received: from fir407.cray.com (cmg@fir407 [128.162.173.7]) by ironwood.cray.com (8.6.12/CRI-ccm_serv-8-2.8) with ESMTP id PAA04771 for ; Mon, 16 Dec 1996 15:40:16 -0600
From: Charles Grassl
Received: by fir407.cray.com (8.6.12/btd-b3) id PAA03168; Mon, 16 Dec 1996 15:40:13 -0600
Message-Id: <199612162140.PAA03168@fir407.cray.com>
Subject: COMMS benchmarks
To: parkbench-comm@CS.UTK.EDU
Date: Mon, 16 Dec 1996 15:40:12 -0600 (CST)
X-Mailer: ELM [version 2.4 PL24-CRI-b]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

To: Parkbench Group
From: Charles Grassl
Subject: Errors in and usefulness of COMMS benchmarks
Date: 16, December 1996

The COMMS analysis and curve fitting model is incorrect for message passing systems with internal buffered libraries and for cache based CPUs. For the CRAY T3E, the COMMS results have an error of 70% and hence are not useful. (Note: I have access to accurate Tstart and Rinf values and am therefore able to calculate the actual error.)

Though the theory behind the COMMS benchmarks is elegant and useful for modeling, it is not accurate enough for testing and measurement.

The errors in the COMMS benchmarks results are not experimental. Rather, these errors are systematic because they are related to the model and modeling. For example, we can adjust the fitting range and parameters and obtain much different results for Tstart and Rinf. This "jiggling" of the model is a systematic effect, not an experimental effect.

An alternative and more accurate method of measuring Tstart and Rinf would be to use "spot" measurements, where bandwidth measurements are made at a small number of sizes. My experiments indicate that the spot measurement techniques are accurate, reproducible and reliable predictors of general usage.

I urge the Parkbench group to withdraw the COMMS benchmarks until they are fixed. They are causing great confusion due to their unreliable and inaccurate results.

I suggest that the COMMS1 and COMMS2 benchmarks be replaced with spot measurement programs, of which I have copies. These types of programs are accurate, reliable and easy to use.
Charles Grassl Cray Research Eagan, Minnesota USA From owner-parkbench-comm@CS.UTK.EDU Tue Dec 17 08:04:55 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA13544; Tue, 17 Dec 1996 08:04:54 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA18240; Tue, 17 Dec 1996 07:59:46 -0500 Received: from relay-11.mail.demon.net (relay-11.mail.demon.net [194.217.242.137]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA18233; Tue, 17 Dec 1996 07:59:40 -0500 Received: from minnow.demon.co.uk ([158.152.73.63]) by relay-10.mail.demon.net id aa1015523; 17 Dec 96 12:09 GMT Message-ID: Date: Tue, 17 Dec 1996 12:07:42 +0000 To: Charles Grassl Cc: parkbench-comm@CS.UTK.EDU From: Roger Hockney Subject: Re: COMMS benchmarks In-Reply-To: <199612162140.PAA03168@fir407.cray.com> MIME-Version: 1.0 X-Mailer: Turnpike Version 3.01 Charles Grassl writes > >To: Parkbench Group >From: Charles Grassl >Subject: Errors in and usefulness of COMMS benchmarks >Date: 16, December 1996 > > >The COMMS analysis and curve fitting model is incorrect for message >passing systems with internal buffered libraries and for cache based >CPUs. For the CRAY T3E, the COMMS results have an error of 70% and >hence are not useful. (Note: I have access to accurate Tstart and > Before we abandon attempts to model message passing behaviour for all message lengths and replace it with two spot values, I would like to be sure that the modification to the fitting procedure that I proposed on 13 Dec does not cure the problem. Charles, could you please send me the one set of results that you mention above for the T3E, so that I can try the modified fitting method on it. By this I mean the output file from the existing version of COMMS1 which gives such inaccurate results. Also, Charles, do you have any ideas as to what would be a more satisfactory timing model to use for the systems with buffered libraries and CPU caches? 
>An alternative and more accurate method of measuring Tstart and Rinf >would be to use "spot" measurements, where bandwidth measurements are >made at a small number of sizes. What spot values do you propose, and how will these be "kept up to date" as computers change? Spot-value measurements only give the message performance for very small and very large messages. They are inaccurate for all other message lengths. In order to know what large and small mean, we need to have an estimate for nhalf and to calculate the error factor (1+nhalf/n). For this we need to fit the whole range of message lengths. If we can make the fitting method work, we have a model for all message lengths. >My experiments indicates that the spot >measurement techniques are accurate, reproducible and reliable >predictors of general usage. > How in detail do you use the spot values as a reliable predictor for general usage at some intermediate message length that is therefore not a spot value? My proposal of 13 Dec was to provide BOTH revised model fitting and spot values. Although more complicated this does satisfy both points of view and provide a clearly visible check between them. Users can then take their pick. -- Roger Hockney. Checkout my new Web page at URL http://www.minnow.demon.co.uk University of and link to my new book: "The Science of Computer Benchmarking" Westminster UK suggestions welcome. Know any fish movies or suitable links? 
From owner-parkbench-comm@CS.UTK.EDU Tue Dec 17 11:38:26 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id LAA16330; Tue, 17 Dec 1996 11:38:26 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA05844; Tue, 17 Dec 1996 11:31:49 -0500 Received: from ute.usi.utah.edu (ute.usi.utah.edu [128.110.136.30]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA05832; Tue, 17 Dec 1996 11:31:44 -0500 Received: by ute.usi.utah.edu (AIX 3.2/UCB 5.64/4.03) id AA64539; Tue, 17 Dec 1996 09:31:40 -0700 Date: Tue, 17 Dec 1996 09:31:40 -0700 From: usisaf@ute.usi.utah.edu (Stefano Foresti) Message-Id: <9612171631.AA64539@ute.usi.utah.edu> To: parkbench@CS.UTK.EDU Subject: unsubscribe Cc: parkbench-comm@CS.UTK.EDU I would like to be removed from the parkbench emaillist. Thanks. stefano@osiris.usi.utah.edu From owner-parkbench-comm@CS.UTK.EDU Tue Dec 17 13:24:21 1996 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id NAA18377; Tue, 17 Dec 1996 13:24:21 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id NAA14676; Tue, 17 Dec 1996 13:15:15 -0500 Received: from ute.usi.utah.edu (ute.usi.utah.edu [128.110.136.30]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id NAA14658; Tue, 17 Dec 1996 13:15:09 -0500 Received: by ute.usi.utah.edu (AIX 3.2/UCB 5.64/4.03) id AA64257; Tue, 17 Dec 1996 11:15:04 -0700 Date: Tue, 17 Dec 1996 11:15:04 -0700 From: usisaf@ute.usi.utah.edu (Stefano Foresti) Message-Id: <9612171815.AA64257@ute.usi.utah.edu> To: parkbench-comm@CS.UTK.EDU Subject: remove from list I would like to be removed from the parkbench emaillist. Thanks. 
stefano@osiris.usi.utah.edu

From owner-parkbench-comm@CS.UTK.EDU Tue Dec 17 17:39:33 1996
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id RAA21726; Tue, 17 Dec 1996 17:39:32 -0500
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA09642; Tue, 17 Dec 1996 17:35:04 -0500
Received: from timbuk.cray.com (root@timbuk.cray.com [128.162.19.7]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id RAA09613; Tue, 17 Dec 1996 17:34:58 -0500
Received: from ironwood.cray.com (root@ironwood-fddi.cray.com [128.162.21.36]) by timbuk.cray.com (8.8.4/CRI-gate-8-2.11) with SMTP id QAA28187; Tue, 17 Dec 1996 16:34:53 -0600 (CST)
Received: from fir407.cray.com (cmg@fir407 [128.162.173.7]) by ironwood.cray.com (8.6.12/CRI-ccm_serv-8-2.8) with ESMTP id QAA27540; Tue, 17 Dec 1996 16:34:49 -0600
From: Charles Grassl
Received: by fir407.cray.com (8.6.12/btd-b3) id QAA04168; Tue, 17 Dec 1996 16:34:47 -0600
Message-Id: <199612172234.QAA04168@fir407.cray.com>
Subject: Re: COMMS benchmarks
To: roger@minnow.demon.co.uk (Roger Hockney)
Date: Tue, 17 Dec 1996 16:34:47 -0600 (CST)
Cc: cmg@cray.com, parkbench-comm@CS.UTK.EDU
In-Reply-To: from "Roger Hockney" at Dec 17, 96 12:07:42 pm
X-Mailer: ELM [version 2.4 PL24-CRI-b]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

To: Roger Hockney
From: Charles Grassl
Date: 17, December 1996

File number 1 contains output from the parkbench version of program COMMS1. Note that the times for messages shorter than 100 bytes are 12 to 18 microseconds. The least-squares fit calculates a Tstart of 41 microseconds. This calculated value does not agree with the actual measurement.

File number 2 contains my test program with direct or spot measurements.

File number 3 contains output from my test program with direct or spot measurements.
I suggest using operationally useful definitions of "short" and "long" messages as follows: SHORT MESSAGE: The shortest message which can be sent. This might be one byte, one word or one cache line. Note that zero length messages are not of interest because this length message might be trapped by the programmer or the message passing library or the OS. LONG MESSAGE: The longest message which can be sent. This is approximately the size of the memory. It might be one half the size of memory so that there is room for a message passing buffer. "Spot" values for measured sizes would be one [byte,word,cache line] for short messages and [1,1/2,1/4]*memsize for long messages. The transition region between short and long messages is complicated and is not static. It is related to cache states and to OS parameters for the message passing library. For a CRAY T3E, the MPI buffer has a default value, but can be overridden by a user set environment variable. The default setting is several Kbytes. The cache state is more dynamic and depends on the orientation and relative alignment of data and instruction segments. There are generally transitions at sizes 4, 8 and 12 Kbytes in the data cache. A linear model can be used for small messages, say less than 100 bytes, and for large messages, say more than 500,000 bytes. But, for cached and buffered library message passing, a linear model cannot be made to fit both regions combined. The linear model is useful for representing computer network performance, but it is not useful for measuring Tstart and Rinf. The direct measurements are more accurate, reliable and easier to program. Because of their simplicity and accuracy, they are more useful for low level benchmarks. I suggest that we replace the COMMS1 and COMMS2 benchmarks with versions similar to what I include below. 
Regards,
Charles Grassl
Cray Research

FILE NUMBER 1: Output from Parkbench COMMS1 program
************************************************************************

=================================================
===                                           ===
===   GENESIS / ParkBench Parallel Benchmarks ===
===                                           ===
===               comms1_mpi                  ===
===                                           ===
=================================================

The measurement time requested for each test case was 1.00E+00 seconds
No distinction was made between long and short messages.
Zero length messages were not used in least squares fitting.

Case  LENGTH(B)  TIME(sec)  RINF(B/s)  N1/2(B)    %error fit
   1          8  1.260E-05  0.000E+00  0.000E+00  0.000E+00
   2         10  1.348E-05  2.277E+06  2.070E+01  0.000E+00
   3         20  1.380E-05  1.293E+07  1.592E+02  2.168E+00
   4         30  1.590E-05  7.656E+06  8.976E+01  2.414E+00
   5         40  1.561E-05  1.032E+07  1.257E+02  3.129E+00
   6         50  1.648E-05  1.140E+07  1.407E+02  2.849E+00
   7         60  1.618E-05  1.420E+07  1.800E+02  3.580E+00
   8         70  1.773E-05  1.396E+07  1.765E+02  3.061E+00
   9         80  1.694E-05  1.632E+07  2.107E+02  3.739E+00
  10         90  1.793E-05  1.712E+07  2.224E+02  3.422E+00
  11        100  1.802E-05  1.834E+07  2.405E+02  3.430E+00
  12        110  1.889E-05  1.864E+07  2.450E+02  3.145E+00
  13        120  1.780E-05  2.104E+07  2.815E+02  3.979E+00
  14        130  1.917E-05  2.155E+07  2.893E+02  3.589E+00
  15        140  1.902E-05  2.268E+07  3.069E+02  3.651E+00
  16        150  1.941E-05  2.357E+07  3.208E+02  3.563E+00
  17        160  1.896E-05  2.529E+07  3.479E+02  3.894E+00
  18        170  2.057E-05  2.519E+07  3.463E+02  3.490E+00
  19        180  1.911E-05  2.716E+07  3.779E+02  4.129E+00
  20        190  2.125E-05  2.680E+07  3.720E+02  3.633E+00
  21        200  1.894E-05  2.929E+07  4.125E+02  4.702E+00
  22        210  2.091E-05  2.965E+07  4.184E+02  4.174E+00
  23        220  2.011E-05  3.110E+07  4.424E+02  4.457E+00
  24        230  2.136E-05  3.136E+07  4.467E+02  4.113E+00
  25        240  2.015E-05  3.305E+07  4.751E+02  4.564E+00
  26        250  2.228E-05  3.274E+07  4.697E+02  4.057E+00
  27        260  2.144E-05  3.348E+07  4.824E+02  4.196E+00
  28        270  2.212E-05  3.378E+07  4.874E+02  4.004E+00
  29        280  2.111E-05  3.510E+07  5.101E+02  4.326E+00
  30        290  2.259E-05  3.527E+07  5.129E+02  3.979E+00
  31        300  2.284E-05  3.543E+07  5.157E+02  3.874E+00
  32        400  2.256E-05  3.902E+07  5.800E+02  4.479E+00
  33        600  2.549E-05  4.647E+07  7.168E+02  4.641E+00
  34        800  2.817E-05  5.431E+07  8.625E+02  4.602E+00
  35       1000  3.253E-05  5.715E+07  9.160E+02  3.988E+00
  36       2000  4.496E-05  6.571E+07  1.081E+03  3.115E+00
  37       5000  6.135E-05  1.022E+08  1.813E+03  4.116E+00
  38      10000  8.579E-05  1.353E+08  2.503E+03  3.932E+00
  39      20000  1.294E-04  1.683E+08  3.230E+03  3.396E+00
  40      30000  1.722E-04  1.854E+08  3.632E+03  2.882E+00
  41      40000  2.161E-04  1.950E+08  3.871E+03  2.459E+00
  42      50000  2.594E-04  2.012E+08  4.041E+03  2.156E+00
  43     100000  4.534E-04  2.203E+08  4.678E+03  1.748E+00
  44     200000  7.784E-04  2.515E+08  5.978E+03  1.817E+00
  45     300000  1.110E-03  2.668E+08  6.743E+03  1.536E+00
  46     500000  1.697E-03  2.871E+08  8.080E+03  1.445E+00
  47    1000000  3.276E-03  3.022E+08  9.529E+03  9.226E-01
  48    2000000  6.373E-03  3.121E+08  1.097E+04  5.644E-01
  49    3000000  9.489E-03  3.155E+08  1.165E+04  4.020E-01
  50    5000000  1.569E-02  3.180E+08  1.248E+04  2.620E-01
  51   10000000  3.134E-02  3.191E+08  1.311E+04  1.344E-01

------------------------
COMMS1: Message Pingpong
------------------------
Result Summary
---------------
rinf = 319.084 MByte/s, nhalf = 13112.553 Byte, startup = 41.09

************************************************************************
FILE NUMBER 2: Grassl's Tstart and Rinf measurement program
************************************************************************

      PROGRAM COMMS1
      INCLUDE 'mpif.h'
      parameter ( MEMTOT = 100* 2**20 )   ! Memory size in bytes
      INTEGER I, ierr, my_rank, status(MPI_STATUS_SIZE),
     &        numprocs, ibytes
      REAL*8 A(MEMTOT/8)
      REAL TN,tstart,nhalf,rinf

      CALL MPI_INIT( ierr )
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, ierr)
      CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)

      IF ( my_rank.EQ.0 ) THEN
        WRITE(*,100) numprocs
      END IF

      tstart = 9999999.
      rinf = 0.
      do I = 0,22
        ibytes = (2**i)*8
        n = max((2**(18-1))/(2**i),10)
        call pingpong(a,ibytes,n,my_rank,tn)
        tmsec = tn*1.e6
        tstart = min(tstart,tmsec)
        rinf = max(rinf,ibytes/tmsec)
        if ( my_rank.eq.0 ) write(*,101) ibytes,n,tmsec,ibytes/tmsec
      end do

      IF ( my_rank.EQ.0 )THEN
        nhalf = tstart*rinf
        WRITE(*,990) tstart,nhalf,rinf
      ENDIF

      CALL MPI_FINALIZE(ierr)
      call exit()

 100  FORMAT(//' Message Pingpong'
     &       /' Number of processors: ',i8
     &       //' Length [Bytes] Repetitions ',
     &       ' Time [microsec.] Bandwidth [Mbyte/s]',
     &       /' ', 68('-'))
 101  FORMAT(i10,5x,i10,8x,f10.3,8x,f10.3)
 990  FORMAT(//' Result Smmmary ',/,
     &       ' --------------- ',//,
     &       ' T start: ',f10.1,' microsec.'/
     &       ' N half: ',f10.1,' bytes' /
     &       ' R inf: ',f10.1,' Mbyte/sec.'//)
      END

      subroutine pingpong(a,ilen,nrept,my_rank,tn)
      INCLUDE 'mpif.h'
      real*8 a(ilen/8), DVAL
      INTEGER I, ie, my_rank, status(MPI_STATUS_SIZE)
      REAL T0, T1, T2, TN
      !timer() = MPI_WTIME()
      timer() = 3.33e-9*rtc()

      DVAL = 2.
      A = DVAL

      T1 = timer()
      DO I = 1,NREPT
        CALL DUMMY(I)
      END DO
      T2 = timer()
      T0 = T2-T1

      CALL MPI_BARRIER( MPI_COMM_WORLD, ie )
      T1 = timer()
      DO I = 1,NREPT
        CALL DUMMY(I)
        IF ( my_rank.EQ.0 ) THEN
          CALL MPI_SEND(A,ILEN,MPI_BYTE,1,10,MPI_COMM_WORLD,ie)
          CALL MPI_RECV(A,ILEN,MPI_BYTE,1,20,MPI_COMM_WORLD,status,ie)
        ENDIF
        IF ( my_rank.EQ.1 ) THEN
          CALL MPI_RECV(A,ILEN,MPI_BYTE,0,10,MPI_COMM_WORLD,status,ie)
          CALL MPI_SEND(A,ILEN,MPI_BYTE,0,20,MPI_COMM_WORLD,ie)
        ENDIF
      END DO
      T2 = timer()
      TN = (T2 - T1 - T0 )/(NREPT*2)

      CALL CHECK(A, ILEN/8, DVAL)
      return
      end

      SUBROUTINE CHECK(WRK, LWRK, DVAL)
      INTEGER LWRK
      REAL*8 WRK(LWRK), DVAL
      INTEGER I,IERR
      IERR=0
      DO 10 I = 1, LWRK
        IF ( WRK(I).NE.DVAL ) IERR=IERR+1
 10   CONTINUE
      IF( IERR.NE.0 )THEN
        WRITE(6,*)'ERROR: DVAL= ',DVAL,' IERR= ',IERR
        STOP
      ENDIF
      RETURN
      END

      SUBROUTINE DUMMY(I)
      INTEGER I
      RETURN
      END

************************************************************************
FILE NUMBER 3: Output from Grassl's Tstart and Rinf measurement program
************************************************************************

Message Pingpong
Number of processors: 2

Length [Bytes]  Repetitions  Time [microsec.]  Bandwidth [Mbyte/s]
--------------------------------------------------------------------
             8       131072            15.153                0.528
            16        65536            15.528                1.030
            32        32768            20.972                1.526
            64        16384            20.854                3.069
           128         8192            22.510                5.686
           256         4096            25.898                9.885
           512         2048            29.601               17.297
          1024         1024            37.772               27.110
          2048          512            51.092               40.085
          4096          256            77.785               52.658
          8192          128            92.190               88.860
         16384           64           130.623              125.429
         32768           32           210.075              155.982
         65536           16           353.894              185.185
        131072           10           596.613              219.693
        262144           10          1017.277              257.692
        524288           10          1849.011              283.551
       1048576           10          3482.081              301.135
       2097152           10          6830.946              307.008
       4194304           10         13538.897              309.797
       8388608           10         26720.909              313.934
      16777216           10         53281.705              314.878
      33554432           10        106447.112              315.222

Result Summary
---------------
T start: 15.2 microsec.
N half: 4776.5 bytes
R inf: 315.2 Mbyte/sec.

From owner-parkbench-comm@CS.UTK.EDU Tue Jan 7 18:57:55 1997
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id SAA27283; Tue, 7 Jan 1997 18:57:55 -0500
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id SAA26691; Tue, 7 Jan 1997 18:54:23 -0500
Received: from workhorse.cs.utk.edu (WORKHORSE.CS.UTK.EDU [128.169.92.141]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id SAA26684; Tue, 7 Jan 1997 18:54:18 -0500
From: Philip Mucci
Received: by workhorse.cs.utk.edu (cf v2.11c-UTK) id SAA02950; Tue, 7 Jan 1997 18:54:17 -0500
Date: Tue, 7 Jan 1997 18:54:17 -0500
Message-Id: <199701072354.SAA02950@workhorse.cs.utk.edu>
To: parkbench-comm@CS.UTK.EDU
Subject: Cache Benchmark

Dear ParkBench Members,

I have recently completed work on a first version of a cache/memory benchmark. This is not intended to replace anything, but it has its roots in the issues that were raised with RINF at the last meeting. I encourage you to experiment with it and send me your comments.
Please see http://www.cs.utk.edu/~mucci/cachebench/cbench.html Hope you all had a great holiday season, Sincerely, -Philip Mucci From owner-parkbench-comm@CS.UTK.EDU Wed Jan 8 00:56:05 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id AAA29718; Wed, 8 Jan 1997 00:56:01 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id AAA20789; Wed, 8 Jan 1997 00:51:26 -0500 Received: from ute.usi.utah.edu (ute.usi.utah.edu [128.110.136.30]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id AAA20781; Wed, 8 Jan 1997 00:51:17 -0500 Received: by ute.usi.utah.edu (AIX 3.2/UCB 5.64/4.03) id AA30562; Tue, 7 Jan 1997 22:51:11 -0700 Date: Tue, 7 Jan 1997 22:51:11 -0700 From: usisaf@ute.usi.utah.edu (Stefano Foresti) Message-Id: <9701080551.AA30562@ute.usi.utah.edu> To: parkbench-comm@CS.UTK.EDU Subject: unsubscribe Please unsubscribe stefano@osiris.usi.utah.edu From owner-parkbench-comm@CS.UTK.EDU Thu Jan 9 14:37:13 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id OAA07831; Thu, 9 Jan 1997 14:37:12 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA01600; Thu, 9 Jan 1997 14:13:46 -0500 Received: from relay-7.mail.demon.net (relay-7.mail.demon.net [194.217.242.9]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA01540; Thu, 9 Jan 1997 14:13:16 -0500 Received: from minnow.demon.co.uk ([158.152.73.63]) by relay-5.mail.demon.net id aa529061; 9 Jan 97 16:48 GMT Message-ID: Date: Thu, 9 Jan 1997 16:46:54 +0000 To: parkbench-comm@CS.UTK.EDU From: Roger Hockney Subject: COMMS1 MIME-Version: 1.0 X-Mailer: Turnpike Version 3.01 To: Charles Grassl and the Parkbench Committee From: Roger Hockney Subject: COMMS1 benchmark Date: 7 January 1997 A happy New Year to you all. Thank you, Charles, for sending us the complete timing data from the Cray T3E for the present COMMS1 (PB2.1) and your Proposed New Code (PNC). 
After examining this data I am now able to reply to Charles Grassl's e-mails of 16 and 17 December regarding the COMMS1 benchmark. As Charles says, something needs to be done, so I propose for IMMEDIATE ACTION:

My conclusion first is that the problems with COMMS1 can be addressed by relatively minor changes to the existing code. These involve reporting more prominently the KEY SPOT values already measured by COMMS1, comparing them with the parametric fit values, and changing the fitting procedure. In particular:

(1) COMMS1.f
    Add output of spot bandwidth and Key-spot-values. Also add a revision history. I hope Ian Glendinning and Mark Papiani will watch what I propose here, bearing in mind its effect on automatic processing scripts which are to produce GBIS files from COMMS1 output files. Ron Sercely, can you please send me your suggested mods to this routine.

(2) LSTSQ.f
    Add hidden switch to minimise either relative or absolute error. Set switch in program to relative.

(3) ESTCOM.f
    Base estimate of parameters on two spot values taken for 1B and MAXLEN Byte. I hope Ian Glendinning and Ron Sercely will watch carefully what I do here. Ron Sercely, can you please send me your suggested mods to this routine and any others you found it necessary to change.

As the original author of COMMS1 and chairman of the low-level Parkbench subcommittee (if this still exists?), I will take on the task of submitting revised subroutines ASAP to the group for evaluation. I have set a personal deadline of 15 Feb 1997 for this. I will post proposed code revisions to the group ASAP in the hope that some members, particularly Charles and Ron, may have time to try them out, and give me some feedback. Leaving one month for evaluation, this means that revised code should be available for inclusion in a future release by 15 March 1997. How does this fit into the UTK Parkbench group's schedule for future releases?
Roger Hockney (Westminster University)

For those interested in the details, read on: You might find it easier to read from my web page which also includes various graphs. The URL is http://www.minnow.demon.co.uk/Pbench/emails/grassl4.htm

---------------------------------------------------------------
DETAILED RESPONSE TO GRASSL'S EMAILS:

16 December:

(1) PARAMETRIC FIT
------------------

>The COMMS analysis and curve fitting model is incorrect for message
>passing systems with internal buffered libraries and for cache based
>CPUs.
>For the CRAY T3E, the COMMS results have an error of 70% and
>hence are not useful.

This is true for the T3E data provided, but we need much more data before we can draw general conclusions. I find that the timing data you submitted on 17 Dec 1996 cannot be fitted accurately by a straight line (or two-parameter fit), even if the fitting procedure is modified to minimise the sum of the squares of the relative error (as I propose) rather than absolute error (as COMMS1 does now). The absolute error fitting procedure is certainly not useful and this fitting method should be abandoned. The relative error method gives, in my opinion, a marginally acceptable fit to this difficult data and confirms my opinion that the revised COMMS1 should use this method.

Figure G4_1 shows Grassl's COMMS1(PB2.1) T3E data as green dots with the red dashed line (fit12) showing the two-parameter (rinf,nhalf) fit based on a linear timing line that is produced by the existing COMMS1, which minimises the square of the absolute errors. This, as expected, fits the final values for large n well but effectively ignores those for small n. This is the curve that corresponds, as reported by COMMS1, to

    rinf=319.084 MB/s, nhalf= 13112 B, startup= 41.09 us

These figures have been confirmed to be the correct best fit values using the absolute error minimisation condition. There is no numerical error in the COMMS1 or LSTSQ routines.
The values however cause concern because the output file shows that the time to send an 8 B message is 12.6 us, which is much less than the 41 us startup time. That is to say the fit is hopeless at small message lengths. However the agreement between rinf (fitted) and the largest bandwidth measured is exact to 3 decimal places. An inspection of the graph shows that the relative error is very unevenly distributed, being much worse for short messages. I regard this fit as unacceptable and certainly not useful. So minimising absolute error is again found to lead to unsatisfactory results.

If we adopt my suggestion and minimise instead the square of the relative error, we obtain fit11 which is shown as a dashed blue line and gives

    rinf=267 MB/s, nhalf= 4816 B, startup = 18.06 us

This fit spreads the relative error evenly over the range and gives a fit that some might regard as just acceptable in giving a broad view of the performance for ALL message lengths. However the fit at small lengths gives a startup of 18 us compared to a measured t(n=8B)=12 us (50% error), and the fit at long lengths gives rinf=267 MB/s compared to a measured r(n=10^7)= 319 MB/s (19% error). The error in the fit is worst (around 100 %) for intermediate values between n= 10^3 and 10^4. Nevertheless I believe that this relative-error fit is the most sensible fit to all the data that can be produced with two parameters, and confirms my view that the revised COMMS1 code should change to minimising the relative error. If we want a better fit then a third parameter is required (more on this later).

Note that it is possible with the linear two-parameter fit to choose parameters that give an exact fit at the two ends of the range by equating the time at say 8B to the startup, and the rate at say n=10^7 to the final asymptotic value rinf. nhalf is then derived by calculation.
This end-point procedure would give

   rinf= r(n=10^7 B), nhalf=startup*rinf, startup= t(n=8 B)

This amounts to Charles's proposal and matches exactly the spot values
at the two ends and ignores all data in between. This is shown in
Fig. G4_2. Now the linear approximation over-estimates the performance
for all message lengths.

(2) SPOT VALUES
---------------
In addition to the fitted parameters discussed above it seems to be
forgotten that the existing COMMS1 produces a table of 20 to 40
(typically) spot measurements for message lengths chosen by the
benchmarker in the COMMS1 input data file. These are intended to be
chosen to give a complete picture of the communication performance for
all message lengths. These spot values computed by COMMS1 for each of
the specified message lengths are valid.

When plotted, I can see no significant difference (by which I mean
>say 20%) between the times reported by PNC and COMMS1, excepting that
the existing COMMS1 gives consistently slightly better results (i.e.
lower time values) than PNC. I can see some possible reasons for this
in the PNC code (see below). From this point of view I would expect
manufacturers to prefer the values measured by COMMS1. Because of the
similarity of the spot values all subsequent work has been conducted
just on the COMMS1 data.

(3) DEFINITIONS
---------------
>(Note: I have access to accurate Tstart and
>Rinf values and am therefore able to calculate the actual error.)

Charles, to clarify things can you give a definition of what you mean
by Tstart and Rinf. Some confusion is occurring I think between the
spot measured values and fitted performance parameters due to the use
of similar names for both. I suggest we distinguish them more clearly
in the future.

If Tstart means the measured time for the shortest message used, say
8 B in the case of the T3E measurements, then I think that this should
be reported for exactly what it is, i.e.:

   t(n=8 B) = ??.?? us

If Rinf means the measured bandwidth at the longest message used, then
again I suggest that this is reported for what it is, i.e.:

   r(n=10^7 B) = ???.? MB/s

I don't think that one should use the term Rinf for this, because rinf
is the name given to a performance parameter which is a beast of a
different nature.

The performance parameters, on the other hand, are associated with
some timing model. In the case of (rinf,nhalf) they are defined as the
best values obtained by fitting a range of measured (n,t) values to
the linear relation t=(n+nhalf)/rinf or t=t0+n/rinf. [Whence
t0=nhalf/rinf. t0 is called "startup" in the COMMS1 output, but t0 (or
t sub zero) is the better name.] Since t0 and rinf arise from a least
squares fitting procedure they should not be expected to correspond
exactly to any spot value, but if the linear approximation is good and
the data does not have too much experimental scatter, then we expect

   t0   approx= t(n=shortest)
   rinf approx= r(n=longest)

In these circumstances t(shortest) and r(longest) are good estimates
for t0 and rinf. But this is NOT always the case. If, for example, the
experimental data has a lot of scatter due to the use of a poor
computer clock or an inadequate repeat value (NREPT), and the linear
model is good, we would not want to use t(shortest) and r(longest) as
estimates for t0 and rinf and ignore all the other measured data. The
best experimental procedure is to fit a straight line to ALL the
experimental data and extract t0 and rinf from the intercept and slope
of the fitted line. This is EXACTLY what the existing COMMS1 does.

(4) THEORY BEHIND COMMS1
------------------------
>Though the theory behind the COMMS benchmarks is elegant and useful
>for modeling, it is not accurate enough for testing and measurement.

Obviously it depends on how linear the experimental timing is as a
function of message length. We need to see many more results before
drawing conclusions.
That's why I am so keen to have people submit results to the GBIS data
base, so we can see how useful the linear fit is. And also why it is
important that the GBIS plots also include the linear-fit line. For
example, the results obtained at Southampton for the Meiko CS-2 and
reported on page 59 of "The Science of Computer Benchmarking" show
very good agreement with the linear model:

   startup(linear fit)=195.2us, t(n=0)=198.3us, t(n=1B)=200.1us;
   rinf(linear fit)=9.18 MB/s, r(n=40000B=20*nhalf)=8.7 MB/s.

Also rinf and nhalf are pretty stable from about n=200 B to
n=40000 B. In this case the linear model IS accurate enough for
testing and measurement. Can others comment from their own experience,
please, on the validity/usefulness or otherwise of the linear model?

(5) SYSTEMATIC ERRORS
---------------------
>The errors in the COMMS benchmarks results are not experimental.
>Rather these errors are systematic because they are related to the
>model and modeling. For example, we can adjust the fitting range and
>parameters and obtain much different different results for Tstart and
>Rinf. The "jiggling" the model is a systematic effect, not an
>experimental effect.

I agree that the deviations from the linear model are systematic in
the case of the T3E data. Both rinf and nhalf, which would remain
roughly constant if the data obeyed the linear model, are seen to
increase monotonically. However we should not forget that the linear
model does fit some results very well, in which case it is a very
useful and concise description of the communication performance. It
should not be thrown away because it does not work well in all
circumstances.

(6) THREE-PARAMETER FIT
-----------------------
I have two quite successful 3-parameter fits, but as yet no rationale
as to why they work.
The best definition of a third parameter is a matter for consideration
by those of us who signed up for membership of a "Performance
evaluation (or some such name)" subcommittee under the chairmanship of
Aad van der Steen at a past Parkbench meeting.

(6.1) Variable Power Form

This fits the performance (or bandwidth), r, to

   r = rinf / [1 + (nhalf/n)^gamma]^(1/gamma)

and time

   t = [n^gamma + nhalf^gamma]^(1/gamma) / rinf

The three parameters are (rinf,nhalf,gamma). When gamma=1 we retrieve
the original straight line fit. This form gives the best fit that I
have so far found (SigmaPlot reports an error norm of 0.13) and is
shown as fit16 in Fig G4_4 using the parameters

   rinf=365.8 MB/s, nhalf=4467 B, gamma=0.446

The fit, which is shown as a solid red line, is within a few percent
of all measured spot values. In other words it is an excellent fit.
The only trouble is that I do not know how to interpret the functional
form or the new parameter gamma. Strictly speaking nhalf should now be
called n-sub-1/2^(1/gamma)!

(6.2) Hyperbolic Form

This fits a hyperbola to the final asymptotic line with symmetry about
the n-axis, and

   r = rinf / [(1+nhalf/n)^2 - (a/n)^2]^(1/2)

whence

   t = [(n+nhalf)^2 - a^2]^(1/2) / rinf

When a=0 the existing linear model is obtained. The following values
give the best fit with a reported norm of 0.18.

   rinf = 323 MB/s, nhalf = 43600 B, a=43340 B

This is plotted as the red line in Fig. G4_5. The fit is also very
good but marginally worse than that of the Variable Power Form.

(7) REPORTING OF COMMS1 RESULTS
-------------------------------
>An alternative and more accurate method of measuring Tstart and Rinf
>would be to use "spot" measurements, where bandwidth measurements are
>made at a small number of sizes. My experiments indicates that the spot
>measurement techniques are accurate, reproducible and reliable
>predictors of general usage.
>I urge the Parkbench group to withdraw the COMMS benchmarks until they
>are fixed. They are causing great confusion due to their unreliable
>and inaccurate results.

>I suggest that the COMMS1 and COMMS2 benchmarks be replaced with spot
>measurement programs, of which I have copies. These types of programs
>are accurate, reliable and easy to use.

But wait a moment, COMMS1 already does this:

COMMS1 starts by measuring some 20 to 40 SPOT values which are listed
in columns 2 (=n) and 3 (=t) of the output file. I agree that a new
column should be inserted between 3 and 4 giving the spot bandwidth
(r=n/t). As mentioned above the existing COMMS1 and PNC both give much
the same spot values.

COMMS1 also computes a straight-line fit which can be very useful when
the (n,t) data is roughly linear, but of course is naturally not very
useful when the data is significantly non-linear as in Grassl's T3E
results.

I think that most of the confusion mentioned above can be solved by
reporting BOTH the "key spot values" as recommended by Charles,
followed by the results of the two-parameter fit with a comparison
with the "key spot values". That is to say I see the resolution of the
problem principally in the way in which the COMMS1 results are
reported.

-----------------------------------------------------------
Date: 17, December 1996

(8) SHORT AND LONG MESSAGES
---------------------------
>I suggest using operationally useful definitions of "short" and
>"long" messages as follows:

>SHORT MESSAGE: The shortest message which can be sent. This might be
>               one byte, one word or one cache line. Note that zero
>               length messages are not of interest because this length
>               message might be trapped by the programmer or the
>               message passing library or the OS.

>LONG MESSAGE:  The longest message which can be sent. This is
>               approximately the size of the memory. It might be one
>               half the size of memory so that there is room for a
>               message passing buffer.

>"Spot" values for measured sizes would be one [byte,word,cache line]
>for short messages and [1,1/2,1/4]*memsize for long messages.

I agree that zero-length messages might give spurious results and
should not be used. After all, users are surely not really interested
in sending messages with no information!

The above selection of message lengths gives no values for message
lengths between SHORT and LONG (after all long starts at say 25MB if
memsize=10^8B). What about users with messages of say 1KB, 100KB, 1MB?
I think that a benchmark that does not provide information in this
intermediate range is inadequate.

I favour the present practice of specifying in the COMMS1 data file 20
to 40 message lengths spaced roughly equally on a logarithmic scale
from 1B to the maximum possible. This tells the WHOLE story in the
primary result listing for those who want it, after which "Key spot
values" are selected, followed by the result of the parametric fit.

(9) UNDERSTANDING WHAT'S GOING ON
---------------------------------
>The transition region between short and long messages is complicated
>and is not static. It is related to cache states and to OS
>parameters for the message passing library. For a CRAY T3E, the MPI
>buffer has a default value, but can be overridden by a user set
>environment variable. The default setting is several Kbytes. The
>cache state is more dynamic and depends on the orientation and
>relative alignment of data and instruction segments. There are
>generally transitions at sizes 4, 8 and 12 Kbytes in the data cache.

>A linear model can be used for small messages, say less than 100 bytes,
>and for large messages, say more than 500,000 bytes. But, for cached
>and buffered library message passing, a linear model cannot be made to
>fit both regions combined.

I find this very interesting because a better understanding of what is
actually going on would help a lot in trying to find a suitable third
parameter to characterise results like those from the T3E.
Can you help, Charles, with a time versus length equation that might
apply? How do "message passing systems with internal buffered
libraries" operate in detail?

The measured T3E results actually show a WORSE performance than would
be seen if a linear model applied with the given Tstart and final Rinf
(see Fig. G4_2). In this sense manufacturers might regard making their
comms system follow a linear timing model as an IDEAL goal.

(10) CPU CACHE
--------------
The presence of a CPU cache should be of no benefit to performance
if the COMMS1 benchmark operates in the way that is intended. This
is because the data to be received by the Master processor is
supposed to come from the slave. If this data is being picked up
directly from cache then the compiler is defeating the object of
the benchmark. Do you think that the presence of the repeat loop
allows this to happen? It is possible that the details of the loop
kernel need to be changed to ensure this does not happen.

(11) PROPOSED REVISED PINGPONG LOOP
-----------------------------------
I am considering changing the basic pingpong loop to:
(comments invited)

Master sends array to Slave. Slave receives it, adds a number to each
element and returns it to Master. This ensures that the array is
used by the Slave code and the data moved out of system buffers to
user space as is intended. The Master receives the modified array,
subtracts the same number from each element and checks the result.
Again the array is used by the master.

The overhead is measured by having the Master perform the actions in
sequence of the Slave then the Master with the comms omitted. By this
I mean the Master adds the number to each element (the Slave action),
then subtracts it and checks result (the Master action). This is
repeated NREPT times and timed outside the repeat loop.

It is an error in my view to have any timer calls within the repeat
loop (and COMMS1 does currently have two which need to be removed).
This is because the sole purpose of the repeat is to allow
measurements with poor clocks that (by definition) cannot resolve time
intervals during one pass of the loop. But if the loop is repeated say
10^4 times then they can measure its length. For such clocks any time
difference taken within a single pass of the loop may take place in
less than a clock tick and be incorrectly recorded as zero time or as
one clock tick. Multiplying this time by NREPT outside the loop does
not give a correct value, it only multiplies up the error.

(12) REPLACEMENT OF COMMS1
--------------------------
>The linear model is useful for representing computer network
>performance, but it is not useful for measuring Tstart and Rinf. The
>direct measurements are more accurate, reliable and easier to program.
>Because of their simplicity and accuracy, they are more useful for low
>level benchmarks. I suggest that we replace the COMMS1 and COMMS2
>benchmarks with versions similar to what I include below.

Since COMMS1 already does all the direct measurements that are wanted,
the desired result can be achieved by changing the result format of
COMMS1.

(13) COMMENTS ON GRASSL'S PROPOSED CODE
---------------------------------------
I list below the PNC in full with two corrections. They are identified
between lines of !!!!!!!!!!!!:

(1) The IF tests to identify the Master and Slave codes should go
    outside the repeat (NREPT) loop. This may partly account for the
    worse spot results reported by PNC compared with COMMS1.

(2) The CHECK routine should be inside the Master repeat loop
    otherwise the compiler/system may NOT transfer the communicated
    data from the hidden MPI buffer to the user's array A(). The check
    routine is present not only to test that the right numbers are
    there but also to be sure they are in the array A and available to
    the user's program. This is done by having the program do
    something with the array A().
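The argument in (11) for keeping timer calls outside the repeat loop can be illustrated with a simulated coarse clock. This is a sketch under stated assumptions, not COMMS1 code: the clock is modelled as advancing only in whole ticks, all times are integer nanoseconds, and the function names and the numbers in the example are hypothetical.

```python
def coarse_clock(true_ns, tick_ns):
    """Reading of a clock that only advances in whole ticks (integer ns)."""
    return (true_ns // tick_ns) * tick_ns


def per_pass_timed_outside(pass_ns, nrept, tick_ns, start_ns=0):
    """Two clock readings bracket all NREPT passes; divide afterwards."""
    t1 = coarse_clock(start_ns, tick_ns)
    t2 = coarse_clock(start_ns + nrept * pass_ns, tick_ns)
    return (t2 - t1) / nrept


def per_pass_timed_inside(pass_ns, tick_ns, start_ns=0):
    """Two clock readings bracket a single pass, as timer calls inside
    the loop would; the result is always a whole number of ticks."""
    t1 = coarse_clock(start_ns, tick_ns)
    t2 = coarse_clock(start_ns + pass_ns, tick_ns)
    return t2 - t1


if __name__ == "__main__":
    # Illustrative values only: 10 us tick, 3.7 us per pass, 10^4 repeats.
    TICK, PASS, NREPT = 10_000, 3_700, 10_000
    print(per_pass_timed_outside(PASS, NREPT, TICK))          # 3700.0
    print(per_pass_timed_inside(PASS, TICK, start_ns=0))      # 0
    print(per_pass_timed_inside(PASS, TICK, start_ns=8_000))  # 10000
```

In this model a single pass is recorded as either zero or one tick depending on where it falls relative to the tick boundary, whereas timing all NREPT passes together is off by less than one tick divided by NREPT.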
FILE NUMBER 2: Grassl's Tstart and Rinf measurement program
************************************************************************
      PROGRAM COMMS1
      INCLUDE 'mpif.h'
      parameter ( MEMTOT = 100* 2**20 ) ! Memory size in bytes
      INTEGER I, ierr, my_rank, status(MPI_STATUS_SIZE), numprocs, ibytes
      REAL*8 A(MEMTOT/8)
      REAL TN,tstart,nhalf,rinf

      CALL MPI_INIT( ierr )
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, ierr)
      CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)

      IF ( my_rank.EQ.0 ) THEN
         WRITE(*,100) numprocs
      END IF

      tstart = 9999999.
      rinf = 0.
      do I = 0,22
         ibytes = (2**i)*8
         n = max((2**(18-1))/(2**i),10)
         call pingpong(a,ibytes,n,my_rank,tn)
         tmsec = tn*1.e6
         tstart = min(tstart,tmsec)
         rinf = max(rinf,ibytes/tmsec)
         if ( my_rank.eq.0 ) write(*,101) ibytes,n,tmsec,ibytes/tmsec
      end do

      IF ( my_rank.EQ.0 )THEN
         nhalf = tstart*rinf
         WRITE(*,990) tstart,nhalf,rinf
      ENDIF

      CALL MPI_FINALIZE(ierr)
      call exit()

 100  FORMAT(//' Message Pingpong'
     &       /' Number of processors: ',i8
     &      //' Length [Bytes] Repetitions ',
     &        ' Time [microsec.] Bandwidth [Mbyte/s]',
     &       /' ', 68('-'))
 101  FORMAT(i10,5x,i10,8x,f10.3,8x,f10.3)
 990  FORMAT(//' Result Smmmary ',/,
     &        ' --------------- ',//,
     &        ' T start: ',f10.1,' microsec.'/
     &        ' N half: ',f10.1,' bytes' /
     &        ' R inf: ',f10.1,' Mbyte/sec.'//)
      END

      subroutine pingpong(a,ilen,nrept,my_rank,tn)
      INCLUDE 'mpif.h'
      real*8 a(ilen/8), DVAL
      INTEGER I, ie, my_rank, status(MPI_STATUS_SIZE)
      REAL T0, T1, T2, TN
      !timer() = MPI_WTIME()
      timer() = 3.33e-9*rtc()

      DVAL = 2.
      A = DVAL

      T1 = timer()
      DO I = 1,NREPT
         CALL DUMMY(I)
      END DO
      T2 = timer()
      T0 = T2-T1

      CALL MPI_BARRIER( MPI_COMM_WORLD, ie )
      T1 = timer()
      DO I = 1,NREPT
         CALL DUMMY(I)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  The following IF test for the Master should go outside the NREPT
  loop, as in COMMS1
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
         IF ( my_rank.EQ.0 ) THEN
            CALL MPI_SEND(A,ILEN,MPI_BYTE,1,10,MPI_COMM_WORLD,ie)
            CALL MPI_RECV(A,ILEN,MPI_BYTE,1,20,MPI_COMM_WORLD,status,ie)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  The Check routine should go here, as it does in COMMS1
            CALL CHECK(A, ILEN/8, DVAL)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
         ENDIF
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  The following IF test for the Slave should go outside the NREPT
  loop, as in COMMS1
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
         IF ( my_rank.EQ.1 ) THEN
            CALL MPI_RECV(A,ILEN,MPI_BYTE,0,10,MPI_COMM_WORLD,status,ie)
            CALL MPI_SEND(A,ILEN,MPI_BYTE,0,20,MPI_COMM_WORLD,ie)
         ENDIF
      END DO
      T2 = timer()
      TN = (T2 - T1 - T0 )/(NREPT*2)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  The Check routine should be moved inside the NREPT loop (see above),
  as it is in COMMS1
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
      CALL CHECK(A, ILEN/8, DVAL)
      return
      end

      SUBROUTINE CHECK(WRK, LWRK, DVAL)
      INTEGER LWRK
      REAL*8 WRK(LWRK), DVAL
      INTEGER I,IERR
      IERR=0
      DO 10 I = 1, LWRK
         IF ( WRK(I).NE.DVAL ) IERR=IERR+1
 10   CONTINUE
      IF( IERR.NE.0 )THEN
         WRITE(6,*)'ERROR: DVAL= ',DVAL,' IERR= ',IERR
         STOP
      ENDIF
      RETURN
      END

      SUBROUTINE DUMMY(I)
      INTEGER I
      RETURN
      END
************************************************************************
--
Roger Hockney.   Checkout my new Web page at URL http://www.minnow.demon.co.uk
University of    and link to my new book: "The Science of Computer Benchmarking"
Westminster UK   suggestions welcome. Know any fish movies or suitable links?
From owner-parkbench-comm@CS.UTK.EDU Fri Jan 10 06:38:10 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id GAA17139; Fri, 10 Jan 1997 06:38:09 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA24610; Fri, 10 Jan 1997 06:33:27 -0500 Received: from aloisius.vcpc.univie.ac.at ([193.171.58.11]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id GAA24571; Fri, 10 Jan 1997 06:33:01 -0500 Received: (from smap@localhost) by aloisius.vcpc.univie.ac.at (8.7.5/8.7.3) id MAA20798; Fri, 10 Jan 1997 12:25:52 +0100 (MET) From: Ian Glendinning X-Authentication-Warning: aloisius.vcpc.univie.ac.at: smap set sender to using -f Received: from beavis(193.171.58.38) by aloisius via smap (V1.3) id sma020796; Fri Jan 10 12:25:39 1997 Received: (from ian@localhost) by beavis.vcpc.univie.ac.at (8.7.1/8.7.1) id MAA03086; Fri, 10 Jan 1997 12:25:39 +0100 (MET) Date: Fri, 10 Jan 1997 12:25:39 +0100 (MET) Message-Id: <199701101125.MAA03086@beavis.vcpc.univie.ac.at> To: parkbench-comm@CS.UTK.EDU, roger@minnow.demon.co.uk Subject: Re: COMMS1 X-Sun-Charset: US-ASCII > From owner-parkbench-comm@cs.utk.edu Thu Jan 9 20:27:23 1997 > (3) ESTCOM.f Base estimate of parameters on two spot values > taken for 1B and MAXLEN Byte. I hope Ian Glendinning > and Ron Serceley will watch carefully what I do here. > Ron Sercely, can you please send me your suggested > mods to this routine and any others you found it > necessary to change. This sounds like a sensible change to me. Ian -- Ian Glendinning European Centre for Parallel Computing at Vienna (VCPC) ian@vcpc.univie.ac.at Liechtensteinstr. 
22, A-1090 Vienna, Austria Tel: +43 1 310 939612 WWW: http://www.vcpc.univie.ac.at/~ian/ From owner-parkbench-comm@CS.UTK.EDU Sat Jan 11 02:20:12 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id CAA28484; Sat, 11 Jan 1997 02:20:12 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id CAA28877; Sat, 11 Jan 1997 02:17:07 -0500 Received: from ute.usi.utah.edu (ute.usi.utah.edu [128.110.136.30]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id CAA28867; Sat, 11 Jan 1997 02:17:01 -0500 Received: by ute.usi.utah.edu (AIX 3.2/UCB 5.64/4.03) id AA34345; Sat, 11 Jan 1997 00:16:59 -0700 Date: Sat, 11 Jan 1997 00:16:59 -0700 From: usisaf@ute.usi.utah.edu (Stefano Foresti) Message-Id: <9701110716.AA34345@ute.usi.utah.edu> To: owner-parkbench-comm@CS.UTK.EDU, parkbench-comm@CS.UTK.EDU Subject: unsubscribe Please unsubscribe stefano@osiris.usi.utah.edu Thanks From owner-parkbench-comm@CS.UTK.EDU Wed Jan 15 09:29:34 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id JAA26405; Wed, 15 Jan 1997 09:29:33 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA09071; Wed, 15 Jan 1997 09:12:11 -0500 Received: from convex.convex.com (convex.convex.com [130.168.1.1]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id JAA09051; Wed, 15 Jan 1997 09:12:02 -0500 Received: from brittany.rsn.hp.com by convex.convex.com (8.6.4.2/1.35) id IAA14882; Wed, 15 Jan 1997 08:11:28 -0600 Received: from localhost by brittany.rsn.hp.com with SMTP (1.38.193.4/16.2) id AA04999; Wed, 15 Jan 1997 08:11:42 -0600 Sender: sercely@convex.convex.com Message-Id: <32DCE59E.5AC6@convex.com> Date: Wed, 15 Jan 1997 08:11:42 -0600 From: Ron Sercely Organization: Hewlett-Packard Convex Technology Center X-Mailer: Mozilla 2.0 (X11; I; HP-UX A.09.05 9000/710) Mime-Version: 1.0 To: Roger Hockney Cc: parkbench-comm@CS.UTK.EDU Subject: Re: COMMS1 Respond to Roger's recent 
post (long) References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit

I have sent Roger the code he asked for in his post, but have several
things to discuss with the committee as a whole. I believe his goal of
15 Feb is a good one, as it should allow us to have information for
our next meeting in Knoxville.

1. Although we have been focused on COMMS1, COMMS2 uses the identical
(except for the name of an include file :-( ) ESTCOM routine, so this
needs to be changed as well. Also I suspect that the same changes to
the fitting routines and the printing of spot values need to be made
in COMMS2.

2. Currently, because of the data output, we plot fitted values of
bandwidth. I think this should be replaced with plots of raw
bandwidth. As a user, one of the things I would want is the ability to
ask, "If I use a message length of X on machine Y, how long does it
take?" The raw data provides this number, the fitted numbers do not.

3. Looking at the fits plotted in figures G4_1 and G4_2, I find both
of them unacceptable. There is a large systematic error, resulting
from the mismatch between reality and the model. I strongly prefer NO
fit to these fits.

4. The fit for HP/Convex data is just as bad. I have data or
postscript files of our results for comms1, if anyone would like a
copy. They are also contained in a report (Frame document or HTML).
Let me know what you want and in what format.

5. The fits shown in figures G4_4 and G4_5 are acceptable, but before
we include these fits in ParkBench, I believe we must verify that
other vendors' data is equally well fit.

My problem with bad fits is the following. Some people will simply
look at the reported values of t0 and rinf. If they see a lower value
of t0 (higher value of rinf) for a particular machine, they will
assume that they will attain lower message latency for small messages
(higher measured bandwidth for large messages) IN THEIR APPLICATION,
which may be false using the fit in Figure 1.
In a more general case, IMHO, if we provide models of communication
parameters, users will use these communication models in modeling
application performance. If the model is not _very_ accurate, the user
would be much better served by simply using a table lookup (spot
values).

6. Discussion of Caching and Comms performance

>
> (10) CPU CACHE
> --------------
> The presence of a CPU cache should be of no benefit to performance
> if the COMMS1 benchmark operates in the way that is intended. This
> is because the data to be received by the Master processor is
> supposed to come from the slave. If this data is being picked up
> directly from cache then the compiler is defeating the object of
> the benchmark. Do you think that the presence of the repeat loop
> allows this to happen? It is possible that the details of the loop
> kernel need to be changed to ensure this does not happen.
>
> (11) PROPOSED REVISED PINGPONG LOOP
> -----------------------------------
> I am considering changing the basic pinpong loop to:
> (comments invited)
>
> Master sends array to Slave. Slave receives it, adds a number to each
> element and returns it to Master. This ensures that the array is
> used by the Slave code and the data moved out of system buffers to
> user space as is intended. The Master receives the modified array
> subtracts the same number from each element and checks the result.
> Again the array is used by the master.
>
> The overhead is measured by having the Master perform the actions in
> sequence of the Slave then the Master with the comms omitted. By this
> I mean the Master adds the number to each element (the Slave action),
> then subtracts it and checks result (the Master action). This is
> repeated NREPT times and timed outside the repeat loop.

It is my belief that CPU cache effects can have a STRONG impact on
comms1 performance. Briefly, blocking message receives complete as
soon as data is safely in a user's buffer, ready for access.
This does NOT necessarily mean that the data is within the receiving
process's cache. In general, the data will be in cache ONLY after
application code accesses the data.

Consider the example in proposal (11) on the Exemplar architecture.
Assume that the processes will exchange X bytes of data, the machines
have Y byte caches, and that the processes receive and send into the
same buffer.

Consider the cache states for the case X = 2*Y, after startup effects:

   Master starts timer
   Slave is waiting
   Master sends array to slave, and will not find any data in cache
      (see ** below)
   Slave does a read/modify/write on each array element, starting at
      the beginning of the buffer
   Master waits on receive
      Because the sender just sent the data, each read is always a
      cache miss
   Slave sends to Master. Because the send library starts sending from
      the beginning of the buffer, each data access will be a cache
      miss!
   Master receives data
   Slave calls receive => waits for next send from Master
   Master does read/modify/write, and will cache miss on each location
      (**), leaving the "top" half of the buffer in cache, which is
      what caused the send library routine miss, discussed above.

If the master and slave do their read/modify/write _backwards_, i.e.,
starting at the max value and working backwards, the "bottom" half of
the buffer is left in cache, so when the send library routine is
called, the library routine will find half the data encached!

The overhead method discussed above could account for this, but only
if it first, in a loop, does the add to each element in the array,
then, in a separate loop, does the subtract. If performed in a single
loop (the master did an add then a subtract), the caching would be
totally different than in the case with real communication. The
_current_ comms1 overhead estimation technique is inadequate in this
regard, and is why HP/Convex reports results (as modified code) with
the overhead checks removed.
Notice that this example also indicates that although we can design
tests to "move data out of the system buffers", we cannot create tests
to ensure that data is in cache!

Now consider the case where X < Y. With communication in place, the
Master and Slave will still miss on each cache access. BUT, the
overhead estimation routine will NOT! The overhead routine will miss
on the first read/modify/write (when adding the number), but not on
the second (when subtracting the number). As a result, the estimation
of overhead will be low by almost a factor of two, resulting in
calculating much poorer comms1 performance numbers than are actually
obtained. This case must be addressed, or we must eliminate this
"overhead" loop calculation in its entirety. I believe that
eliminating the "overhead" loop is the more appropriate solution.

Again, I believe there is an important distinction between having data
available in the address space of a receiver (mpi_recv completes), and
having the data in the cache of the user (user accesses the data). I
feel we need to ensure that we do NOT measure caching effects in the
comms routines.
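The sweep-order argument above can be checked with a toy cache model. This is a rough sketch under loud assumptions, not a model of the Exemplar: the cache is taken to be fully associative with LRU replacement, the buffer is counted in whole cache lines, and coherence traffic caused by the message arriving is ignored. The class and function names are hypothetical. It only looks at what a second sweep over the buffer (for example the send library reading it) finds in cache after a first sweep (for example the user's read/modify/write).

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative cache of `capacity` lines with LRU replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, line):
        """Touch one line; return True on a hit, False on a miss."""
        if line in self.lines:
            self.lines.move_to_end(line)      # now most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)    # evict least recently used
        self.lines[line] = None
        return False

    def sweep(self, lines):
        """Access the given lines in order; return the number of misses."""
        return sum(0 if self.access(line) else 1 for line in lines)


def second_sweep_misses(capacity, first, second):
    """Misses seen by `second` on a cold cache that first ran `first`."""
    cache = LRUCache(capacity)
    cache.sweep(first)
    return cache.sweep(second)


if __name__ == "__main__":
    Y = 64                                  # cache size in lines (illustrative)
    forward = list(range(2 * Y))            # buffer of X = 2*Y lines
    backward = forward[::-1]
    small = list(range(Y // 2))             # buffer with X < Y
    print(second_sweep_misses(Y, forward, forward))   # 128: every access misses
    print(second_sweep_misses(Y, backward, forward))  # 64: half found in cache
    print(second_sweep_misses(Y, small, small))       # 0: second pass all hits
```

The last case corresponds to the overhead loop with X < Y: the add pass misses on every line and the subtract pass hits on every line, which is why the measured overhead comes out at roughly half of what the same work costs when a message arrives between the two passes.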
-- Ron Sercely HP/CXTC Toolsmith From owner-parkbench-comm@CS.UTK.EDU Wed Jan 15 11:56:03 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id LAA28872; Wed, 15 Jan 1997 11:56:03 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA24196; Wed, 15 Jan 1997 11:44:16 -0500 Received: from osiris.usi.utah.edu (osiris.usi.utah.edu [128.110.138.150]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA24189; Wed, 15 Jan 1997 11:44:11 -0500 Received: by osiris.usi.utah.edu (AIX 3.2/UCB 5.64/4.03) id AA25451; Wed, 15 Jan 1997 09:43:51 -0700 Date: Wed, 15 Jan 1997 09:43:51 -0700 From: stefano@osiris.usi.utah.edu (Stefano Foresti) Message-Id: <9701151643.AA25451@osiris.usi.utah.edu> To: owner-parkbench-comm@CS.UTK.EDU, parkbench-comm@CS.UTK.EDU Subject: unsubscribe I have sent already 3 mails trying to unsubscribe from parkbench stefano@osiris.usi.utah.edu Please someone let me know what I have to do. Thanks, Stefano Foresti From owner-parkbench-comm@CS.UTK.EDU Wed Jan 15 18:06:20 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id SAA03606; Wed, 15 Jan 1997 18:06:20 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA29207; Wed, 15 Jan 1997 17:50:26 -0500 Received: from rudolph.cs.utk.edu (RUDOLPH.CS.UTK.EDU [128.169.92.87]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id RAA29171; Wed, 15 Jan 1997 17:50:18 -0500 From: Philip Mucci Received: by rudolph.cs.utk.edu (cf v2.11c-UTK) id RAA06526; Wed, 15 Jan 1997 17:49:51 -0500 Date: Wed, 15 Jan 1997 17:49:51 -0500 Message-Id: <199701152249.RAA06526@rudolph.cs.utk.edu> To: sercely@convex.convex.com, Roger Hockney Subject: Re: COMMS1 Respond to Roger's recent post (long) In-Reply-To: <32DCE59E.5AC6@convex.com> Cc: parkbench-comm@CS.UTK.EDU, halloy@CS.UTK.EDU X-Mailer: [XMailTool v3.1.2b] Ron, I believe an easier and more accurate solution to avoid caching effects would be 
to "slide" the send and receive buffer pointers along with the test. I've seen this technique used before...basically it ensures that capacity misses occur with every block access to comm data... Allocate send/recv space (at least cache_size*2) call this pointer x. The sender would do for message size y for (i=0;i Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id SAA27427; Fri, 17 Jan 1997 18:31:21 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id SAA27429; Fri, 17 Jan 1997 18:01:34 -0500 Received: from VNET.IBM.COM (vnet.ibm.com [199.171.26.4]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id SAA27409; Fri, 17 Jan 1997 18:01:28 -0500 Received: from MHV by VNET.IBM.COM (IBM VM SMTP V2R3) with BSMTP id 7790; Fri, 17 Jan 97 18:01:05 EST Received: by MHV (XAGENTA 4.0) id 0351; Fri, 17 Jan 1997 18:01:13 -0500 Received: by mailserv.pok.ibm.com (AIX 3.2/UCB 5.64/9309151207) id AA119398; Fri, 17 Jan 1997 18:01:10 -0500 From: (Edgar Kalns) Message-Id: <9701172301.AA119398@mailserv.pok.ibm.com> Subject: COMMS1/2 discussions To: parkbench-comm@CS.UTK.EDU Date: Fri, 17 Jan 1997 18:01:09 -0500 (EST) X-Mailer: ELM [version 2.4 PL24] Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Regarding the COMMS1/2 issues and the linear fitting of message-passing data, I concur with Charles Grassl and Ron Sercely that the most important aspect of the COMMs kernels is to report exact measurements for a wide range of message sizes (including 0-byte messages; more on this later). Of the data reported in COMMS1/2, this is the data of most utility to multi-processor/multi-computer programmers. The fitted vs. measured startup values on the IBM RS/6000 SP {Tstart vs. t(0 Byte) using Roger Hockney's nomenclature} differ by a factor of 2.8, similar to the Cray T3E's variance of a factor of 3.4. I believe these variances are far too large to be of any value. 
Roger's suggestion to move to a square of the relative error improves the situation somewhat by not having such large absolute errors; however, stated errors in the range of 50 to 100% ("small" to "medium"-sized messages) are still unacceptably large. That is not to say that the fitting method ought to be jettisoned entirely; I'm intrigued by Roger's investigations of 3-parameter fits and I encourage the committee to pursue this avenue in earnest as it appears promising. 0-byte messages: Charles asserts that "... zero length messages are not of interest because this length message might be trapped by the programmer or the message passing library or the OS." True, it might be, but if so, then there should be a non-trivial difference between the latency measured at 0 bytes and 1 byte. For the SP, a 0-byte latency measurement is the best-known method to obtain the cost of message-passing protocol processing. I believe that it can be safely retained for COMMS1/2. To summarize, tightening up on the error of the fitted values is of paramount importance for COMMS1/2 so that potential consumers of this data can use it effectively. Perhaps 3-parameter fits will solve the problems associated with the current method. Regards, Edgar ---------------------------------------------------------------- || Edgar T. 
Kalns | tel: 914-433-8494 || || Parallel System Performance | fax: 914-433-8469 || || Technical Strategy & Architecture | kalns@vnet.ibm.com || || IBM RS/6000 Division | || || = = = | || || 522 South Road, Mailstation P967 | || || Poughkeepsie, NY 12601-5400 | || ---------------------------------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Mon Jan 20 09:24:47 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id JAA25022; Mon, 20 Jan 1997 09:24:46 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA00638; Mon, 20 Jan 1997 09:12:04 -0500 Received: from convex.convex.com (convex.convex.com [130.168.1.1]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id JAA00630; Mon, 20 Jan 1997 09:11:58 -0500 Received: from brittany.rsn.hp.com by convex.convex.com (8.6.4.2/1.35) id IAA19431; Mon, 20 Jan 1997 08:11:24 -0600 Received: from localhost by brittany.rsn.hp.com with SMTP (1.38.193.4/16.2) id AA06473; Mon, 20 Jan 1997 08:11:27 -0600 Sender: sercely@convex.convex.com Message-Id: <32E37D0F.1D2E@convex.com> Date: Mon, 20 Jan 1997 08:11:27 -0600 From: Ron Sercely Organization: Hewlett-Packard Convex Technology Center X-Mailer: Mozilla 2.0 (X11; I; HP-UX A.09.05 9000/710) Mime-Version: 1.0 To: " (Edgar Kalns)" Cc: parkbench-comm@CS.UTK.EDU Subject: Re: COMMS1/2 discussions References: <9701172301.AA119398@mailserv.pok.ibm.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit I would like to say that I agree with Edgar's point that 0-length messages should be reported. I think some users will use these for synchronization of processes, and will be interested in the reported values.
-- Ron Sercely HP/CXTC Toolsmith From owner-parkbench-comm@CS.UTK.EDU Tue Jan 21 17:05:11 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id RAA03062; Tue, 21 Jan 1997 17:05:11 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id QAA23763; Tue, 21 Jan 1997 16:51:10 -0500 Received: from timbuk.cray.com (root@timbuk.cray.com [128.162.19.7]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id QAA23754; Tue, 21 Jan 1997 16:51:07 -0500 Received: from ironwood.cray.com (ironwood-fddi.cray.com [128.162.21.36]) by timbuk.cray.com (8.8.4/CRI-gate-8-2.11) with SMTP id PAA03051; Tue, 21 Jan 1997 15:50:34 -0600 (CST) Received: from magnet.cray.com (magnet [128.162.173.162]) by ironwood.cray.com (8.6.12/CRI-ccm_serv-8-2.8) with ESMTP id PAA14658; Tue, 21 Jan 1997 15:50:17 -0600 Received: from magnet by magnet.cray.com (8.8.0/btd-b3) via SMTP id VAA15799; Tue, 21 Jan 1997 21:50:30 GMT Sender: cmg@cray.com Message-ID: <32E53A24.15FB@cray.com> Date: Tue, 21 Jan 1997 15:50:28 -0600 From: Charles Grassl Organization: Cray Research X-Mailer: Mozilla 3.01SC-SGI (X11; I; IRIX 6.2 IP22) MIME-Version: 1.0 To: Ron Sercely CC: " (Edgar Kalns)" , parkbench-comm@CS.UTK.EDU Subject: Re: COMMS1/2 discussions References: <9701172301.AA119398@mailserv.pok.ibm.com> <32E37D0F.1D2E@convex.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Ron and Edgar propose retaining tests of zero-length messages. We must be careful that we make our test programs consistent, useful and extendable for more than one message passing implementation. In particular, we should leave the door open for testing one-sided message passing. The use of zero-length messages for this kind of message passing might not have an unambiguous interpretation with respect to synchronization. Here is a point for discussion: Do the various message passing definitions treat the case of zero-length messages?
That some users might use zero length messages for synchronization is not reason to test it. This practice might not be portable and "safe". Charles Grassl Cray Research From owner-parkbench-comm@CS.UTK.EDU Tue Jan 21 19:18:58 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id TAA04821; Tue, 21 Jan 1997 19:18:58 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id TAA06613; Tue, 21 Jan 1997 19:09:48 -0500 Received: from workhorse.cs.utk.edu (WORKHORSE.CS.UTK.EDU [128.169.92.141]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id TAA06600; Tue, 21 Jan 1997 19:09:41 -0500 From: Philip Mucci Received: by workhorse.cs.utk.edu (cf v2.11c-UTK) id TAA01098; Tue, 21 Jan 1997 19:09:28 -0500 Date: Tue, 21 Jan 1997 19:09:28 -0500 Message-Id: <199701220009.TAA01098@workhorse.cs.utk.edu> To: cmg@cray.com, sercely@convex.convex.com Subject: Re: COMMS1/2 discussions Cc: kalns@vnet.IBM.COM, parkbench-comm@CS.UTK.EDU It would be easy just to check for an error from Pvm and MPI in the case that a 0 length is not accepted. If it is, we report it, if not, just be quiet about it. 
-Phil Mucci From owner-parkbench-comm@CS.UTK.EDU Wed Jan 22 16:20:51 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id QAA24254; Wed, 22 Jan 1997 16:20:51 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA10161; Wed, 22 Jan 1997 15:58:38 -0500 Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA10136; Wed, 22 Jan 1997 15:58:21 -0500 From: Received: from yosemite (node10.remote.port.ac.uk) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA04297; Wed, 22 Jan 97 20:57:49 GMT Date: Wed, 22 Jan 97 20:43:00 Subject: FW: Returned mail: User unknown To: parkbench-hpf@CS.UTK.EDU, parkbench-comm@CS.UTK.EDU X-Priority: 3 (Normal) X-Mailer: Chameleon 5.0.1, TCP/IP for Windows, NetManage Inc. Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII

Dear Colleagues,

Firstly, my apologies if you receive this email twice.

An email discussion group is being set up to discuss the design and implementation of a set of HPF benchmarks - it is being carried out under the auspices of the ParKBench initiative (see http://www.netlib.org/parkbench/) led by Tony Hey at the University of Southampton and Jack Dongarra at the University of Tennessee at Knoxville.

At present the following researchers and interested parties are pre-subscribed onto the list: Mark Baker, Guy Robinson, Vladimir Getov, Ken Hawick, Bryan Carpenter, John Merlin, Chuck Koebel, David Presberg, John Prentice, Hon Yau, Subhash Saini, Ron Perrott.

If you are interested in subscribing to the list send an email to "parkbench-hpf@cs.utk.edu" with "subscribe parkbench-hpf" in the body of the email text.

Here is a further list of Majordomo commands for reference.

-- subscribe <list> [<address>] - Subscribe yourself (or <address> if specified) to the named <list>.
-- unsubscribe <list> [<address>] - Unsubscribe yourself (or <address> if specified) from the named <list>.
-- get <list> <filename> - Get a file related to <list>.
-- index <list> - Return an index of files you can "get" for <list>.
-- which [<address>] - Find out which lists you (or <address> if specified) are on.
-- who <list> - Find out who is on the named <list>.
-- info <list> - Retrieve the general introductory information for the named <list>.
-- lists - Show the lists served by this Majordomo server.
-- help - Retrieve this message.
-- end - Stop processing commands (useful if your mailer adds a signature).

Commands should be sent in the body of an email message to "Majordomo". Commands in the "Subject:" line are NOT processed. If you have any questions or problems, please contact "Majordomo-Owner".

-------------------------------------------------------------------------------------------------

The ParKBench HPF Group

Aims

The aim of the ParkBench HPF group is to discuss, design, implement, test and release a set of HPF benchmarks that satisfies the needs of applications developers and vendors. The codes that are developed should be capable of being used to compare different platforms running HPF as well as different vendor implementations of HPF.

Initial Objective

The initial objective of the Parkbench HPF group is to produce a proposal to present before the HPF Forum at "The First Annual HPF User Group Meeting" on February 24-26, 1997 in Santa Fe, New Mexico. The support and assistance of the HPFF will greatly enhance the credibility of the work undertaken by the Parkbench HPF group.

Methodology

It is anticipated that the basic design and implementation methodology used for the HPF benchmarks will closely follow that previously used by Parkbench to develop, and then release, the Message Passing (MPI/PVM) benchmark codes (http://www.netlib.org/parkbench/). The Parkbench codes are split up into three categories:

-- Low Level - These codes are used to measure low level machine performance parameters, such as inter-processor bandwidth and peak processor computational performance.
-- Kernels - These codes comprise what would be considered the core kernels of applications, such as a matrix reduction or an FFT.
-- Compact applications - These codes are complete user applications which are used to assess the performance of the parallel machines on "real" applications.

The existing ParkBench benchmark codes are predominantly used for comparing different parallel platforms, rather than implementations of MPI or PVM on a particular platform (they can be used for this purpose if needed). Within the Parkbench HPF version of the benchmarks it is likely that an important usage of the codes will be the analysis of different vendor releases of HPF on a parallel platform. This particular emphasis, coupled with the fact that many of the MP low-level codes cannot easily be mapped onto equivalent HPF versions, means that it is probable that the HPF low-level codes will consist of a mixture of codes - some being similar to the existing MP codes and others which can be used to assess important aspects that influence the efficiency of HPF implementations by a particular vendor, answering questions such as:

-- How well does the system deal with regular communications?
-- Does the system use ghost regions for the regular case?
-- How well does the system deal with irregular communications?
-- Does the system cache pre-computed communication schedules?
-- How efficient are the common array intrinsics?

Working Groups

The discussion on the HPF codes will be led by the following people:

Low-Level - Mark Baker, University of Portsmouth, UK
Kernels - Chuck Koebel, Rice University, USA (to be confirmed).
Compact Applications - Subhash Saini, NAS, NASA, USA ------------------------------------------------------------------------------- ------------------------------------- Dr Mark Baker DIS, University of Portsmouth, Hants, UK Tel: +44 1705 844285 E-mail: mab@npac.syr.edu Date: 1/22/97 - Time: 8:43:01 PM URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Sun Feb 16 11:00:00 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA11002; Sun, 16 Feb 1997 10:59:59 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA20152; Sun, 16 Feb 1997 10:53:35 -0500 Received: from relay-11.mail.demon.net (relay-11.mail.demon.net [194.217.242.137]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA20145; Sun, 16 Feb 1997 10:53:31 -0500 Received: from minnow.demon.co.uk ([158.152.73.63]) by relay-10.mail.demon.net id aa1005559; 16 Feb 97 15:33 GMT Message-ID: <5rwr5BARhyBzEw4y@minnow.demon.co.uk> Date: Sun, 16 Feb 1997 15:31:29 +0000 To: parkbench-comm@CS.UTK.EDU From: Roger Hockney Subject: Modified COMMS1 MIME-Version: 1.0 X-Mailer: Turnpike Version 3.01 To: Parkbench Committee 16 Feb 1997 REVISED COMMS1 (MPI) -------------------- I have now completed the modifications to the existing COMMS1 benchmark which have been made in response to the problems reported by Ron Sercely and Charles Grassl, and the revised benchmark is now ready for evaluation. So far, I have only modified the MPI version of COMMS1. The main changes are: (1) Giving proper prominence to the spot measured bandwidth in the output file, as requested by both Charles and Ron. (2) Improving the accuracy of the least-squares 2-parameter fit by minimising the relative rather than absolute error. (3) Introduction of a new 3-parameter variable-power fit for cases in which the existing 2-parameter fit is inadequate. 
(4) Parametric fits are now only reported if the errors are less than amounts specified as parameters in comms1.inc. (5) Including the changes to ESTCOM.f recommended by Ron. (6) Including the improved measurement of CHECK-time in COMMS1.f as recommended by Ron. Seven files are involved and it should only be necessary to replace the existing routines with the new ones to get the new code. The new routines have been put up on my Web space with the URL addresses given below (go directly to the complete address). The files may immediately download (all the *.F did for me with MSIE) or can be "saved to disk" if they are displayed in the browser. Alternatively I will email them to anyone who requests them. Everything is CASE SENSITIVE and I apologise for unwanted capitalisation in the URLs, so be careful. In order to come to a sensible decision about the usefulness and future of COMMS1/COMMS2 we need as much data as possible from the new code. I hope that some of you will have time to try it, and report the results to this email group. The old routine was used with data way out of its tested range with some unfortunate results. The new routine should be much more robust and give more sensible parametric fits. In particular the usefulness of the parametric fitting should be judged on results from this new code NOT from the old code. ------ All files have a revision date of RWH-11-Feb-1997 which is a comment on the second or third line of the code. In detail the files and modifications made are: (1) ParkBench/Low_Level/comms1/src_mpi/COMMS1.f URL http://www.minnow.demon.co.uk/Pbench/comms1/COMMS1_1.F This is a revision of the existing routine. Introduction of Ron Sercely's improved measurement of CHECK time. Improvement of output format to include printout of spot bandwidth at each point and the two KEY measured values requested by Charles Grassl. 
These are the time for the shortest message and the bandwidth for the longest message, both of which are now prominently displayed. The parametric curve fitting now only displays results if both the RMS relative error and maximum recorded relative error are less than user definable values which are set in comms1.inc. In addition the two-parameter straight-line fit (t vs n) now minimises the relative rather than absolute error. A new library routine LINERR.f is introduced which calculates directly the RMS and maximum relative error of the fit. There is a new 3-parameter fitting function that does very well with Grassl's data and is calculated with a new library routine VPOWER.f which also directly calculates the RMS and maximum relative error of the fit.

(2) ParkBench/Low_Level/comms1/src_mpi/comms1.inc
URL http://www.minnow.demon.co.uk/Pbench/comms1/COMMS1_1.INC
This is a revision of the existing include file. The following parameters are added:
ERMSLM limiting value for acceptable RMS error
EMAXLM limiting value for acceptable maximum relative error
Diagnostic printout is now available via parameter KDIAG:
KDIAG=0 for normal running without diagnostics
=1 first level diagnostics: errors at each point
=2 second level diagnostics: each point each iteration

(3) ParkBench/Low_Level/comms1/src_mpi/ESTCOM.f
URL http://www.minnow.demon.co.uk/Pbench/comms1/ESTCOM_1.F
This is a revision of the existing routine. Introduction of Ron Sercely's method of estimating the ERINF and ESTART values. This is now based on two measurements at the longest and shortest messages, and overcomes the problems met with the Convex computers using the existing code.

(4) ParkBench/lib/Low_Level/LSTSQ.f
URL http://www.minnow.demon.co.uk/Pbench/comms1/LSTSQ_1.F
This is a revision of the existing routine. Internal switch, KREL, introduced to allow choice of minimising either the absolute or relative error in time.
This is internally set to select the minimisation of relative error which is obviously the best choice. (5) ParkBench/lib/Low_Level/LINERR.f URL http://www.minnow.demon.co.uk/Pbench/comms1/LINERR_1.F This is a new library routine that calculates directly the RMS relative error of the linear 2-parameter fit function, and also finds the maximum relative error with its sign and its position. If KDIAG=1 is set in comms1.inc, a diagnostic printout compares the measured and fitted values at each point and their individual errors. (6) ParkBench/lib/Low_Level/VPOWER.f URL http://www.minnow.demon.co.uk/Pbench/comms1/VPOWER_1.F This is a new library routine that calculates by iteration the best 3-parameter fit using the new Variable-Power function. This is a generalisation of the existing two-parameter function that has more flexibility (as was desired by Charles). The variable-power function is adjusted to fit the smallest and largest message lengths exactly. It also fits the measured data at a mid-point in the range. For more details see comments in the program and output files. The comms1.inc parameter KDIAG offers two levels of diagnostic printout for use in checking the course of the iteration and viewing errors at individual points. (7) Parkbench/lib/Low_Level/CHECK.f URL http://www.minnow.demon.co.uk/Pbench/comms1/CHECK_1.F This has had a STOP statement removed to avoid the possibility of deadlock arising. A possible problem pointed out by Charles Grassl during preliminary evaluation. With best regards Roger Hockney (Westminster University, UK) -- Roger Hockney. Checkout my new Web page at URL http://www.minnow.demon.co.uk University of and link to my new book: "The Science of Computer Benchmarking" Westminster UK suggestions welcome. Know any fish movies or suitable links? 
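Editorial sketch of the relative-error fit described in items (4) and (5) above. This is not the code in LSTSQ.f or LINERR.f; it is a minimal illustration, assuming the usual two-parameter model t = t0 + n/r_inf and a weight of 1/t^2 on each point so that the least-squares sum is over relative rather than absolute errors. The function names and the sample thresholds are hypothetical; only the names ERMSLM and EMAXLM come from the message. Every measured time must be greater than zero, including the time for a 0-byte message.

```python
import math


def fit_relative(sizes, times):
    """Fit t = t0 + n / r_inf by least squares on the relative error.

    Weighting each point by 1/t**2 turns the squared absolute residual
    (fit - t)**2 into the squared relative residual ((fit - t) / t)**2.
    """
    sw = swn = swnn = swt = swnt = 0.0
    for n, t in zip(sizes, times):
        w = 1.0 / (t * t)
        sw += w
        swn += w * n
        swnn += w * n * n
        swt += w * t
        swnt += w * n * t
    # Solve the 2x2 weighted normal equations for intercept and slope.
    det = sw * swnn - swn * swn
    t0 = (swt * swnn - swn * swnt) / det
    slope = (sw * swnt - swn * swt) / det
    return t0, 1.0 / slope


def relative_errors(sizes, times, t0, r_inf):
    """Return (RMS relative error, signed worst relative error, its message size)."""
    errs = [((t0 + n / r_inf) - t) / t for n, t in zip(sizes, times)]
    rms = math.sqrt(sum(e * e for e in errs) / len(errs))
    worst = max(range(len(errs)), key=lambda i: abs(errs[i]))
    return rms, errs[worst], sizes[worst]


def fit_is_acceptable(rms, worst, ermslm, emaxlm):
    """Report the fit only if both errors are below the user-set limits."""
    return rms < ermslm and abs(worst) < emaxlm


if __name__ == "__main__":
    # Synthetic, made-up timings: 50 microseconds startup, 100 MB/s asymptote.
    sizes = [0, 8, 64, 512, 4096, 32768]
    times = [50e-6 + n / 100e6 for n in sizes]
    t0, r_inf = fit_relative(sizes, times)
    rms, worst, where = relative_errors(sizes, times, t0, r_inf)
    # Illustrative limits only; the real values are set in comms1.inc.
    print(t0, r_inf, rms, fit_is_acceptable(rms, worst, 0.1, 0.3))
```

With absolute-error minimisation the long messages dominate the sum and the fitted startup time can be far from the measured short-message time; the 1/t^2 weighting gives every message length the same say in the fit, which is the motivation given in the message above.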
From owner-parkbench-comm@CS.UTK.EDU Mon Feb 17 08:02:51 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA20634; Mon, 17 Feb 1997 08:02:51 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA07774; Mon, 17 Feb 1997 07:50:06 -0500 Received: from relay-7.mail.demon.net (relay-7.mail.demon.net [194.217.242.9]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA07755; Mon, 17 Feb 1997 07:50:02 -0500 Received: from minnow.demon.co.uk ([158.152.73.63]) by relay-5.mail.demon.net id aa502304; 17 Feb 97 10:54 GMT Message-ID: <7ab7FHAxiDCzEwSH@minnow.demon.co.uk> Date: Mon, 17 Feb 1997 10:53:37 +0000 To: parkbench-comm@CS.UTK.EDU From: Roger Hockney Subject: Revised COMMS1 output files MIME-Version: 1.0 X-Mailer: Turnpike Version 3.01 To: Parkbench Committee 17 Feb 1997 REVISED COMMS1 (MPI) -------------------- In connection with my emailing yesterday, I forgot to include specimen output files. These have now been placed at the URLs listed below. KDIAG=0 for normal running without diagnostics =1 first level diagnostics: errors at each point =2 second level diagnostics: each point each iteration The output from Ron Sercely's data which was emailed to the committee on 2 Dec 1996 http://www.minnow.demon.co.uk/Pbench/comms1/SERC1_0.RES ... KDIAG=0 http://www.minnow.demon.co.uk/Pbench/comms1/SERC1_1.RES ... KDIAG=1 http://www.minnow.demon.co.uk/Pbench/comms1/SERC1_2.RES ... KDIAG=2 The output from Charles Grassl's data which was emailed to the committee on 17 Dec 1996 http://www.minnow.demon.co.uk/Pbench/comms1/GRAS2_0.RES ... KDIAG=0 http://www.minnow.demon.co.uk/Pbench/comms1/GRAS2_1.RES ... KDIAG=1 http://www.minnow.demon.co.uk/Pbench/comms1/GRAS2_2.RES ... KDIAG=2 With best regards Roger Hockney (Westminster University, UK) -- Roger Hockney. 
Checkout my new Web page at URL http://www.minnow.demon.co.uk University of and link to my new book: "The Science of Computer Benchmarking" Westminster UK suggestions welcome. Know any fish movies or suitable links? From owner-parkbench-comm@CS.UTK.EDU Thu Mar 13 10:52:11 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA14731; Thu, 13 Mar 1997 10:52:11 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA07654; Thu, 13 Mar 1997 10:41:32 -0500 Received: from relay-11.mail.demon.net (relay-11.mail.demon.net [194.217.242.137]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA07632; Thu, 13 Mar 1997 10:41:25 -0500 Received: from minnow.demon.co.uk ([158.152.73.63]) by relay-10.mail.demon.net id aa1007353; 13 Mar 97 14:35 GMT Message-ID: Date: Thu, 13 Mar 1997 14:25:08 +0000 To: parkbench-comm@CS.UTK.EDU From: Roger Hockney Subject: Revised COMMS1 bug fixed MIME-Version: 1.0 X-Mailer: Turnpike Version 3.01 A bug has been found and corrected in the revised version of COMMS1. On some but not all compilers, the bug caused the program to crash in subroutine LINERR when NSBYTE=0 with an out-of-range index. The correction involves only the subroutine COMMS1.f and the corrected version of this subroutine (now dated RWH-12-Mar-1997) has been placed on my Web page: http://www.minnow.demon.co.uk/Pbench/comms1/COMMS1_1.F Please download this new version if you are evaluating the revised COMMS1 and use it to replace the one dated RWH-11-Feb-1997. No other routines need be changed. Please let me know immediately if you have any problems with running the new benchmark, and send me any results you obtain. At Westminster we have run it successfully on the Cray T3D at Edinburgh EPCC. I shall report the results shortly. -- Roger Hockney. 
Checkout my new Web page at URL http://www.minnow.demon.co.uk University of and link to my new book: "The Science of Computer Benchmarking" Westminster UK suggestions welcome. Know any fish movies or suitable links? From owner-parkbench-comm@CS.UTK.EDU Fri Mar 14 12:13:05 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id MAA01571; Fri, 14 Mar 1997 12:13:05 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA05435; Fri, 14 Mar 1997 12:00:45 -0500 Received: from blueberry.cs.utk.edu (BLUEBERRY.CS.UTK.EDU [128.169.92.34]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id MAA05415; Fri, 14 Mar 1997 12:00:39 -0500 Received: by blueberry.cs.utk.edu (cf v2.11c-UTK) id QAA00871; Fri, 14 Mar 1997 16:57:36 GMT From: "Erich Strohmaier" Message-Id: <9703141157.ZM869@blueberry.cs.utk.edu> Date: Fri, 14 Mar 1997 11:57:35 -0500 X-Face: ,v?vp%=2zU8m.23T00H*9+qjCVLwK{V3T{?1^Bua(Ud:|%?@D!~^v^hoA@Z5/*TU[RFq_n'n"}z{qhQ^Q3'Mexsxg0XW>+CbEOca91voac=P/w]>n_nS]V_ZL>XRSYWi:{MzalK9Hb^=B}Y*[x*MOX7R=*V}PI.HG~2 X-Mailer: Z-Mail (3.2.0 26oct94 MediaMail) To: parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU Subject: ParkBench Committee Meeting Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Dear Colleague, The ParkBench (Parallel Benchmark Working Group) will meet in Knoxville, Tennessee on May 9th, 1997. The meeting site will be the Knoxville Downtown Hilton Hotel. We have made arrangements with the Hilton Hotel in Knoxville. Hilton Hotel 501 W. Church Street Knoxville, TN Phone: 423-523-2300 When making arrangements tell the hotel you are associated with the Parallel Benchmarking or ParkBench. The rate about $79.00/night. You can download a postscript map of the area by looking at http://www.netlib.org/utk/people/JackDongarra.html. You can rent a car or get a cab from the airport to the hotel. We should plan to start at 9:00 am May 9th and finish about 5:00 pm. 
If you will be attending the meeting please send me email so we can better arrange for the meeting. The format of the meeting is: Friday May 9th 9:00 - 12.00 Full group meeting 12.00 - 1.30 Lunch 1.30 - 5.00 Full group meeting Please send us your suggestions for the agenda. The objectives for the group are: 1. To establish a comprehensive set of parallel benchmarks that is generally accepted by both users and vendors of parallel systems. 2. To provide a focus for parallel benchmark activities and avoid unnecessary duplication of effort and proliferation of benchmarks. 3. To set standards for benchmarking methodology and result-reporting together with a control database/repository for both the benchmarks and the results. The following mailing lists have been set up. parkbench-comm@cs.utk.edu Whole committee parkbench-hpf@cs.utk.edu HPF subcommittee Jack Dongarra Erich Strohmaier From owner-parkbench-comm@CS.UTK.EDU Wed Apr 2 06:47:24 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id GAA02504; Wed, 2 Apr 1997 06:47:23 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA03239; Wed, 2 Apr 1997 06:40:38 -0500 Received: from relay-11.mail.demon.net (relay-11.mail.demon.net [194.217.242.137]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA03181; Wed, 2 Apr 1997 06:40:18 -0500 Received: from minnow.demon.co.uk ([158.152.73.63]) by relay-10.mail.demon.net id aa1011690; 2 Apr 97 12:39 BST Message-ID: Date: Wed, 2 Apr 1997 12:28:50 +0100 To: Erich Strohmaier Cc: parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU From: Roger Hockney Subject: Re: ParkBench Committee Meeting In-Reply-To: <9703141157.ZM869@blueberry.cs.utk.edu> MIME-Version: 1.0 X-Mailer: Turnpike Version 3.01 Erich Strohmaier writes > >Please send us your suggestions for the agenda.
> Agenda item and report to be submitted by Roger Hockney (leader of the Low-Level subcommittee) and presented by Vladimir Getov (Westminster University, UK): Low-Level subcommittee: Report on problems with the COMMS benchmarks, and the presentation to the committee of a revised version of COMMS1 (MPI). > >3. To set standards for benchmarking methodology and result-reporting > together with a control database/repository for both the benchmarks and > the results. > What happened to the 4th objective: 4. To make the benchmarks and results freely available in the public domain I feel that this is a very important aspect of the Parkbench activity which distinguishes it in an important way from some other benchmarking initiatives. In my opinion we should not lose it. -- Roger Hockney. Checkout my new Web page at URL http://www.minnow.demon.co.uk University of and link to my new book: "The Science of Computer Benchmarking" Westminster UK suggestions welcome. Know any fish movies or suitable links? 
From owner-parkbench-comm@CS.UTK.EDU Wed Apr 23 14:46:05 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id OAA08597; Wed, 23 Apr 1997 14:46:05 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA18073; Wed, 23 Apr 1997 14:36:43 -0400 Received: from blueberry.cs.utk.edu (BLUEBERRY.CS.UTK.EDU [128.169.92.34]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id OAA18062; Wed, 23 Apr 1997 14:36:39 -0400 Received: by blueberry.cs.utk.edu (cf v2.11c-UTK) id SAA12213; Wed, 23 Apr 1997 18:36:17 GMT From: "Erich Strohmaier" Message-Id: <9704231436.ZM12211@blueberry.cs.utk.edu> Date: Wed, 23 Apr 1997 14:36:16 -0400 X-Face: ,v?vp%=2zU8m.23T00H*9+qjCVLwK{V3T{?1^Bua(Ud:|%?@D!~^v^hoA@Z5/*TU[RFq_n'n"}z{qhQ^Q3'Mexsxg0XW>+CbEOca91voac=P/w]>n_nS]V_ZL>XRSYWi:{MzalK9Hb^=B}Y*[x*MOX7R=*V}PI.HG~2 X-Mailer: Z-Mail (3.2.0 26oct94 MediaMail) To: parkbench-lowlevel@CS.UTK.EDU, parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU Subject: ParkBench Committee Meeting - tentative Agenda Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Dear Colleague, The ParkBench (Parallel Benchmark Working Group) will meet in Knoxville, Tennessee on May 9th, 1997. The meeting site will be the Knoxville Downtown Hilton Hotel. We have made arrangements with the Hilton Hotel in Knoxville. Hilton Hotel 501 W. Church Street Knoxville, TN Phone: 423-523-2300 When making arrangements tell the hotel you are associated with the 'ParkBench'. The rate about $79.00/night. You can download a postscript map of the area by looking at http://www.netlib.org/utk/people/JackDongarra.html. ---------------- The format of the meeting is: Friday May 9th, 1997. 9:00 - 12.00 Full group meeting 12.00 - 1.30 Lunch 1.30 - 5.00 Full group meeting There might be also a joint session with the SPEC/HPG group on Thursday 8th at about 3pm-5pm ---------------- Please send us your comments about the tentative agenda: 1. 
Minutes of last meeting (MBe) Changes to Current release: 2. Low Level (ES, VG, RS) comms1, comms2, comms3, poly2 3. Linear Algebra (ES) 4. Compact Applications - NPBs (SS, ES) New benchmarks: 5. HPF Low Level benchmarks (MBa) ? 6. New shared memory Low Level benchmarks (MBa) ? 7. New performance database design and new benchmark output format (MBa,VG) ? 8. Update of GBIS with new Web front-end (MBa,VG) Report from other benchmark activities 9. ASCI Benchmark Codes (RS) 10. SPEC (RE) ParkBench: 11. ParkBench Bibliography 12. ParkBench Report 2 Other Activities: 13. Discussion of the ParkBench Workshop 11/12 September, UK 14. "Electronic Benchmarking Journal" - status report - 15. Miscellaneous - 16. Date and venue for next meeting - (MBa) Mark Baker Univ. of Portsmouth (MBe) Michael Berry Univ. of Tennessee (JD) Jack Dongarra Univ. of Tenn./ORNL (RE) Rudi Eigenmann SPEC (VG) Vladimir Getov Univ. of Westminster (TH) Tony Hey Univ. of Southampton (SS) Subhash Saini NASA Ames (RS) Ron Sercely HP/CXTC (ES) Erich Strohmaier Univ.
of Tennessee Jack Dongarra Erich Strohmaier From owner-parkbench-comm@CS.UTK.EDU Wed Apr 23 19:13:50 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id TAA12069; Wed, 23 Apr 1997 19:13:49 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id TAA16846; Wed, 23 Apr 1997 19:10:14 -0400 Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id TAA16794; Wed, 23 Apr 1997 19:09:55 -0400 Received: from mordillo (node3.remote.port.ac.uk) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA29461; Thu, 24 Apr 97 00:10:42 BST Date: Wed, 23 Apr 97 23:56:13 From: Mark Baker Subject: RE: ParkBench Committee Meeting - tentative Agenda To: parkbench-lowlevel@CS.UTK.EDU, parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU, Erich Strohmaier X-Priority: 3 (Normal) X-Mailer: Chameleon 5.0.1, TCP/IP for Windows, NetManage Inc. Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; CHARSET=us-ascii Erich, Some corrections... --- On Wed, 23 Apr 1997 14:36:16 -0400 Erich Strohmaier wrote: >Please send us your comments about the tentative agenda: > > 1. Minutes of last meeting (MBe) > > Changes to Current release: > 2. Low Level (ES, VG, RS) > comms1, comms2, comms3, poly2 > 3. Linear Algebra (ES) > 4. Compact Applications - NPBs (SS, ES) > > New benchmarks: > 5. HPF Low Level benchmarks (MBa) >? 6. New shared memory Low Level benchmarks (MBa) Can you change this to report on our I/O benchmark efforts. >? 7. New performance database design and new benchmark output format (MBa,VG) >? 8. Update of GBIS with new Web front-end (MBa,VG) Tony or I will update the committe on the new back/fronts ends of GBIS + hopefully also give a demo. VG, as far as I know, is not involved in this activity. 
Regards Mark ------------------------------------- DIS, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 4/23/97 - Time: 11:56:13 PM URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Fri May 2 11:52:00 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id LAA25870; Fri, 2 May 1997 11:52:00 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA20111; Fri, 2 May 1997 11:41:13 -0400 Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA19968; Fri, 2 May 1997 11:40:24 -0400 Received: from mordillo (pc297.sis.port.ac.uk) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA24320; Fri, 2 May 97 16:41:01 BST Date: Fri, 2 May 97 16:14:17 From: Mark Baker To: parkbench-hpf@CS.UTK.EDU, parkbench-comm@CS.UTK.EDU X-Priority: 3 (Normal) X-Mailer: Chameleon 5.0.1, TCP/IP for Windows, NetManage Inc. Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII I've put the draft of the Parkbench HPF Suite document on my homepage under ParkBench or you can go directly to it via... http://www.sis.port.ac.uk/~mab/ParkBench/ This document is by no ways complete, but will give those who are interested a chance to see "where we are going". The document is based on comments and discussion sent to the parkbench-hpf@cs.utk.edu mailing list. I will talk about our email discussions at the ParkBench meeting next Friday (9th May) in Knoxville. 
Regards Mark ------------------------------------- DIS, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 5/2/97 - Time: 4:14:17 PM URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Fri May 2 15:53:02 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id PAA00358; Fri, 2 May 1997 15:53:02 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA13341; Fri, 2 May 1997 15:44:43 -0400 Received: from blueberry.cs.utk.edu (BLUEBERRY.CS.UTK.EDU [128.169.92.34]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id PAA13327; Fri, 2 May 1997 15:44:36 -0400 Received: by blueberry.cs.utk.edu (cf v2.11c-UTK) id TAA08348; Fri, 2 May 1997 19:44:04 GMT From: "Erich Strohmaier" Message-Id: <9705021544.ZM8346@blueberry.cs.utk.edu> Date: Fri, 2 May 1997 15:44:03 -0400 X-Face: ,v?vp%=2zU8m.23T00H*9+qjCVLwK{V3T{?1^Bua(Ud:|%?@D!~^v^hoA@Z5/*TU[RFq_n'n"}z{qhQ^Q3'Mexsxg0XW>+CbEOca91voac=P/w]>n_nS]V_ZL>XRSYWi:{MzalK9Hb^=B}Y*[x*MOX7R=*V}PI.HG~2 X-Mailer: Z-Mail (3.2.0 26oct94 MediaMail) To: parkbench-comm@CS.UTK.EDU Subject: ParkBench Committee Meeting Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Dear Colleague, Here is the revised agenda. Please send me ASAP a short email if you come so that we can arrange for a meeting room. ------------------- The ParkBench (Parallel Benchmark Working Group) will meet in Knoxville, Tennessee on May 9th, 1997. The meeting site will be the Knoxville Downtown Hilton Hotel. We have made arrangements with the Hilton Hotel in Knoxville. Hilton Hotel 501 W. Church Street Knoxville, TN Phone: 423-523-2300 When making arrangements tell the hotel you are associated with the 'ParkBench'. The rate about $79.00/night. You can download a postscript map of the area by looking at http://www.netlib.org/utk/people/JackDongarra.html. 
----------------

The tentative agenda for the meeting is:

 1. Minutes of last meeting (MBe)

Changes to Current release:
 2. Low Level (ES, VG, RS)
    comms1, comms2, comms3, poly2
 3. Linear Algebra (ES)
 4. Compact Applications - NPBs (SS, ES)

New benchmarks:
 5. HPF Low Level benchmarks (MBa)
 6. Java Low-Level Benchmarks (VG)
 7. New I/O benchmarks (MBa)
 8. New performance database design and new benchmark output format
    Update of GBIS with new Web front-end (MBa,TH)

Report from other benchmark activities:
 9. ASCI Benchmark Codes (AH)
10. SPEC-HPG (RE, JD)

ParkBench:
11. ParkBench Bibliography
12. ParkBench Report 2

Other Activities:
13. Discussion of the ParkBench Workshop 11/12 September, UK (TH, MBa)
14. PEMCS - "Electronic Benchmarking Journal" - status report - (TH, MBa)
15. Status of Funding proposals (JD, TH)
16. Miscellaneous -
17. Date and venue for next meeting -

(MBa) Mark Baker        Univ. of Portsmouth
(MBe) Michael Berry     Univ. of Tennessee
(JD)  Jack Dongarra     Univ. of Tenn./ORNL
(RE)  Rudi Eigenmann    SPEC
(VG)  Vladimir Getov    Univ. of Westminster
(TH)  Tony Hey          Univ. of Southampton
(AH)  Adolfy Hoisie     LANL
(SS)  Subhash Saini     NASA Ames
(RS)  Ron Sercely       HP/CXTC
(ES)  Erich Strohmaier  Univ.
of Tennessee Jack Dongarra Erich Strohmaier From owner-parkbench-comm@CS.UTK.EDU Tue May 6 14:46:45 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id OAA04480; Tue, 6 May 1997 14:46:45 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA25737; Tue, 6 May 1997 14:34:05 -0400 Received: from punt-2.mail.demon.net (relay-11.mail.demon.net [194.217.242.137]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id OAA25715; Tue, 6 May 1997 14:33:58 -0400 Received: from minnow.demon.co.uk ([158.152.73.63]) by punt-2.mail.demon.net id aa1000641; 6 May 97 19:07 BST Message-ID: Date: Tue, 6 May 1997 19:06:15 +0100 To: parkbench-comm@CS.UTK.EDU From: Roger Hockney Subject: Parkbench Meeting Documents In-Reply-To: <9705021544.ZM8346@blueberry.cs.utk.edu> MIME-Version: 1.0 X-Mailer: Turnpike Version 3.01 AGENDA ITEM: > Changes to Current release: > 2. Low Level (VG) > comms1, comms2, Two documents will be submitted to the committee on this item by Roger Hockney and Vladimir Getov (Westminster University, UK). They can be downloaded as postscript files from: "New COMMS1 Benchmark: Results and Recommendations" http://www.minow.demon.co.uk/Pbench/comms1/PBPAPER2.PS "New COMMS1 Benchmark: The Details" http://www.minow.demon.co.uk/Pbench/comms1/PBPAPER3.PS The papers will be presented by Vladimir who will bring some paper copies with him. Best wishes Roger and Vladimir -- Roger Hockney. Checkout my new Web page at URL http://www.minnow.demon.co.uk University of and link to my new book: "The Science of Computer Benchmarking" Westminster UK suggestions welcome. Know any fish movies or suitable links? 
From owner-parkbench-comm@CS.UTK.EDU Tue May 6 17:54:47 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id RAA07526; Tue, 6 May 1997 17:54:46 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA17012; Tue, 6 May 1997 17:48:50 -0400 Received: from punt-1.mail.demon.net (relay-7.mail.demon.net [194.217.242.9]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA17003; Tue, 6 May 1997 17:48:47 -0400 Received: from minnow.demon.co.uk ([158.152.73.63]) by punt-1.mail.demon.net id aa0623986; 6 May 97 21:37 BST Message-ID: Date: Tue, 6 May 1997 21:26:50 +0100 To: parkbench-comm@CS.UTK.EDU From: Roger Hockney Subject: Parkbench Meeting Documents (Correction) MIME-Version: 1.0 X-Mailer: Turnpike Version 3.01 I am resending this because there was a typo in the URLs: There are two MM in "minnow". Also if you took PBPAPER2.PS before receiving this repeat message, please take it again as I have corrected two errors in the graphs. SORRY Roger ************************ AGENDA ITEM: > Changes to Current release: > 2. Low Level (VG) > comms1, comms2, Two documents will be submitted to the committee on this item by Roger Hockney and Vladimir Getov (Westminster University, UK). They can be downloaded as postscript files from: CORRECTED URLs: "New COMMS1 Benchmark: Results and Recommendations" http://www.minnow.demon.co.uk/Pbench/comms1/PBPAPER2.PS "New COMMS1 Benchmark: The Details" http://www.minnow.demon.co.uk/Pbench/comms1/PBPAPER3.PS The papers will be presented by Vladimir who will bring some paper copies with him. Best wishes Roger and Vladimir -- -- Roger Hockney. Checkout my new Web page at URL http://www.minnow.demon.co.uk University of and link to my new book: "The Science of Computer Benchmarking" Westminster UK suggestions welcome. Know any fish movies or suitable links? 
From owner-parkbench-comm@CS.UTK.EDU Mon May 12 05:36:41 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id FAA24086; Mon, 12 May 1997 05:36:41 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id FAA10068; Mon, 12 May 1997 05:18:21 -0400 Received: from haven.EPM.ORNL.GOV (haven.epm.ornl.gov [134.167.12.69]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id FAA10051; Mon, 12 May 1997 05:18:18 -0400 Received: (from worley@localhost) by haven.EPM.ORNL.GOV (8.8.3/8.8.3) id FAA29262; Mon, 12 May 1997 05:18:16 -0400 (EDT) Date: Mon, 12 May 1997 05:18:16 -0400 (EDT) From: Pat Worley Message-Id: <199705120918.FAA29262@haven.EPM.ORNL.GOV> To: parkbench-comm@CS.UTK.EDU Subject: Gordon Conference on HPC and NII Forwarding: Mail from 'Tony Skjellum ' dated: Sat, 10 May 1997 16:32:12 -0500 (CDT) Cc: worley@haven.EPM.ORNL.GOV Just in case you haven't received information on this already, here is a blurb on the 1997 Gordon conference in high performance computing. Unlike previous years, there is not an explicit emphasis on performance evaluation in this year's stated themes, but you can't (shouldn't) discuss future architectures and their impacts without discussing how to evaluate performance, and I am hoping that some benchmarking-minded people will show up and keep the discussion honest. ---------- Begin Forwarded Message ---------- The deadline for applying to attend the 1997 Gordon conference in high performance computing is June 1. If you are interested in attending, please apply as soon as possible. The simplest way to apply is to download the application form from the web site indicated below, or to use the online registration option. If you have any problems with either of these, please contact the organizers at tony@cs.msstate.edu and worleyph@ornl.gov. 
------------------------------------------------------------------------------- The 1997 Gordon Conference on High Performance Computing and Information Infrastructure: "Practical Revolutions in HPC and NII" Chair, Anthony Skjellum, Mississippi State University, tony@cs.msstate.edu, 601-325-8435 Co-Chair, Pat Worley, Oak Ridge National Laboratory, worleyph@ornl.gov, 615-574-3128 Conference web page: http://www.erc.msstate.edu/conferences/gordon97 July 13-17, 1997 Plymouth State College Plymouth NH The now bi-annual Gordon conference series in HPC and NII commenced in 1992 and has had its second meeting in 1995. The Gordon conferences are an elite series of conferences designed to advance the state-of-the-art in covered disciplines. Speakers are assured of anonymity and referencing presentations done at Gordon conferences is prohibited by conference rules in order to promote science, rather than publication lists. Previous meetings have had good international participation, and this is always encouraged. Experts, novices, and technically interested parties from other fields interested in HPC and NII are encouraged to apply to attend. All attendees, including speakers, poster presenters, and session chairs must apply to attend. We *strongly* encourage all poster presenters to have their poster proposals in by May 13, 1997, though we will consider poster presentations up to six weeks prior to the conference. Application to attend the conference is also six weeks in advance. More information on the conference can be found at the web page listed above, including the list of speakers and poster presenters and information on applying for attendance. 
----------- End Forwarded Message ----------- From owner-parkbench-comm@CS.UTK.EDU Tue May 13 13:58:00 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id NAA20879; Tue, 13 May 1997 13:57:59 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id NAA11997; Tue, 13 May 1997 13:33:14 -0400 Received: from timbuk.cray.com (timbuk-fddi.cray.com [128.162.8.102]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id NAA11983; Tue, 13 May 1997 13:33:10 -0400 Received: from ironwood.cray.com (root@ironwood-fddi.cray.com [128.162.21.36]) by timbuk.cray.com (8.8.5/CRI-gate-news-1.3) with ESMTP id MAA20939 for ; Tue, 13 May 1997 12:33:07 -0500 (CDT) Received: from magnet.cray.com (magnet [128.162.173.162]) by ironwood.cray.com (8.8.4/CRI-ironwood-news-1.0) with ESMTP id MAA16428 for ; Tue, 13 May 1997 12:33:06 -0500 (CDT) From: Charles Grassl Received: by magnet.cray.com (8.8.0/btd-b3) id RAA20181; Tue, 13 May 1997 17:33:04 GMT Message-Id: <199705131733.RAA20181@magnet.cray.com> Subject: Parkbench directions To: parkbench-comm@CS.UTK.EDU Date: Tue, 13 May 1997 12:33:04 -0500 (CDT) X-Mailer: ELM [version 2.4 PL24-CRI-d] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit To: ParkBench Group From: Charles Grassl Date: May 13, 1997 (Long) I appreciated the meeting this past week and wish to thank Eric and Jack for hosting it. I am aware of the great effort of many individuals have contributed to developing and implementing the ParkBench suite. In spite of this, I feel that we need to evaluate and correct our course. ParkBench should not merge with or use benchmarks from the SPEC/HPG (High Performance Group) group. SGI/Cray and IBM have already withdrawn from the SPEC/HPG group and Fujitsu and NEC are no longer participating. 
The reasons for these companies and other institutions no longer participating should indicate to us (ParkBench) that something is amiss with the SPEC/HPG benchmarks and paradigm.

Several of the reasons for the supercomputer manufacturers not supporting the SPEC/HPG effort are listed below. I list these reasons so that the ParkBench group can learn from them and avoid the same problems.

- Relevance. The particular benchmark programs being used by SPEC/HPG are not relevant or appropriate for supercomputing. The programs in the current SPEC/HPG suite do not represent the leading edge software that is more typical of usage for high performance systems.

- Redundancy. The programs being developed by SPEC/HPG are not qualitatively or quantitatively different from the SPEC/OSG programs and, as such, they are viewed as redundant and expensive.

- Methodology. The methodology being used by SPEC/HPG to procure, develop and run benchmarks lacks scientific and technical basis and hence results have a vague and arbitrary interpretation.

- Programming model. Designing benchmarks for portability across systems is a convenient idea but does not reflect actual constraints or usage. More often than not, compatibility with a PREVIOUS model of computer is more important than compatibility ACROSS computers.

- Expense. Some of the large data cases for the SPEC/HPG programs will require hours or days to run with little new data or information gained by the exercise. These exercises are extremely expensive both in time and capital equipment and in logistics.

- Ergonomics. The cumbersome design of SPEC/HPG Makefiles and build procedures makes the programs difficult and expensive to test, maintain and analyze.

We in the ParkBench group must acknowledge the above items if we are to maintain interest and participation from computer vendors. I believe that reorganizing and refocusing the group could revitalize high performance computer benchmarking and re-invigorate the ParkBench group.

As the ParkBench suite now stands, there are too many programs and they are difficult to build, test and maintain. This situation impedes usage and participation. Here are a few suggestions for our future practices and directions:

- Design and write benchmark programs. Don't borrow or solicit old code. The borrowed or solicited code is never quite appropriate and usually obsolete. Our greatest asset is that we have scientists who are capable of designing experiments (benchmarks). (Build value.)

- Monitor and evaluate accuracy. Though we mention accuracy in ParkBench Report 1, we haven't applied it to the current programs. (Scientifically validate, or invalidate, our experiments.)

- Make it simple. Write and develop simple programs which do not need elaborate build procedures and which are easier to test and to maintain. (Keep It Simple, Stupid.)

- Build a better user interface. The belabored "run rules" and the interface with layers of Makefiles, includes and embedded relative file paths are unacceptable. An acceptable interface might require binary distribution and hence a desirable emphasis on designing and running rather than building and porting the benchmarks. (Make the product more attractive to more users.)

- Make the suite truly modular. The current structure makes the simplest one CPU program as difficult to build and run as the most complicated program with Makefile includes, special compilers, source file includes, special libraries, suite libraries, etc. (Make it manageable.)

- Drop the connection with SPEC/HPG and with NPB. This "grand unifying" scheme makes for redundant code. It has had the opposite effect of focusing benchmarking attention on ParkBench because it is yet another collection of benchmarks used by other organizations. (Be distinguishable and identifiable.)

- Emphasize what ParkBench is associated with: benchmarking distributed memory parallel computers.
We should write and develop benchmark programs which measure and instrument the parallel processing aspect of MPP systems. (Keep our focus.) I volunteer to develop and write a suite of message passing test programs which measure the performance and variance of message passing communication schemes. I have much experience with writing such a programs and believe that such suite would be useful for others and for the computer industry in general. I hesitate to contribute such programs to the present structure for several reasons: - The network test suite does not logically fit into the current "hierarchy" and hence might further clutter the ParkBench suite and make it further unfocused. - The current ParkBench structure is not manageable. Testing and maintenance would be extremely expensive in the current structure. - My company's effort may be interpreted as an endorsement of the current structure and model. The suite is not popular with vendors for reasons outlined above. Participation is currently discouraged. Discussion? Regards, Charles Grassl SGI/Cray Eagan, Minnesota USA From owner-parkbench-comm@CS.UTK.EDU Wed May 21 17:25:15 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id RAA27513; Wed, 21 May 1997 17:25:15 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA07579; Wed, 21 May 1997 17:18:07 -0400 Received: from rastaman.rmt.utk.edu (root@TCHM11A6.RMT.UTK.EDU [128.169.27.188]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id RAA07571; Wed, 21 May 1997 17:18:02 -0400 Received: from rastaman.rmt.utk.edu (localhost [127.0.0.1]) by rastaman.rmt.utk.edu (8.7.6/8.7.3) with SMTP id RAA01108; Wed, 21 May 1997 17:24:43 -0400 Sender: mucci@CS.UTK.EDU Message-ID: <3383681A.D98C5FB@cs.utk.edu> Date: Wed, 21 May 1997 17:24:42 -0400 From: "Philip J. 
Mucci" Organization: University of Tennessee, Knoxville X-Mailer: Mozilla 3.01 (X11; I; Linux 2.0.28 i586) MIME-Version: 1.0 To: parkbench-comm@CS.UTK.EDU CC: "PVM Developer's Mailing List" Subject: Mesg Passing Benchmarks References: <199705131733.RAA20181@magnet.cray.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Hi all, Charles Grassl in his last message to this committee volunteered to write a suite of message passing benchmarks to replace the Low Levels...Before any action on his or this committee's part, I would recommend that you all have a look at version 3 of my pvmbench package. It now does MPI as well and can easily support other message passing primitives with a few #defines. Version 3 along with some sample results can be found at http://www.cs.utk.edu/~mucci/pvmbench. Note that this has not been tested on any MPP's with UTK PVM. This benchmark will generate and graph the following: bandwidth gap time (to buffer an outgoing message) roundtrip (latency /2) barrier/sec broadcast summation reduction Other tests can easily be added...I would highly recommend before any action done that this code be examined. It is less than a year old, version 3 available on that page is in beta, i.e. it has not been released to the general public. Let me know what you think... -Phil -- /%*\ Philip J. Mucci | GRA in CS under Dr. 
JJ Dongarra /*%\ \*%/ http://www.cs.utk.edu/~mucci PVM/Active Messages \%*/ From owner-parkbench-comm@CS.UTK.EDU Fri May 23 12:03:04 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id MAA06549; Fri, 23 May 1997 12:03:03 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA15901; Fri, 23 May 1997 11:05:32 -0400 Received: from berry.cs.utk.edu (BERRY.CS.UTK.EDU [128.169.94.70]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id LAA15895; Fri, 23 May 1997 11:05:30 -0400 Received: from cs.utk.edu by berry.cs.utk.edu with ESMTP (cf v2.11c-UTK) id LAA01370; Fri, 23 May 1997 11:05:31 -0400 Message-Id: <199705231505.LAA01370@berry.cs.utk.edu> to: parkbench-comm@CS.UTK.EDU Subject: Minutes of May ParkBench Meeting Date: Fri, 23 May 1997 11:05:31 -0400 From: "Michael W. Berry" Here are the minutes from the recent ParkBench meeting in Knoxville. Best regards, Mike ----------------------------------------------------------------- Minutes of ParkBench Meeting - Knoxville Hilton, May 9, 1997 ----------------------------------------------------------------- ParkBench Attendee List: (MBa) Mark Baker Univ. of Portsmouth mab@sis.port.ac.uk (MBe) Michael Berry Univ. of Tennessee berry@cs.utk.edu Shirley Browne Univ. of Tennessee browne@cs.utk.edu (JD) Jack Dongarra Univ. of Tenn./ORNL dongarra@cs.utk.edu Jeff Durachta Army Res. Lab MSRC durachta@arl.mil (VG) Vladimir Getov Univ. of Westminister getovv@wmin.ac.uk (CG) Charles Grassl SGI/Cray cmg@cray.com (TH) Tony Hey Univ. of Southampton ajgh@ecs.soton.ac.uk (AH) Adolfy Hoisie Los Alamos Nat'l Lab hoisie@lanl.gov (CK) Charles Koelbel Rice University chk@cs.rice.edu (PM) Phil Mucci Univ. of Tennessee mucci@cs.utk.edu Erik Riedel GENIAS Software GmbH erik@genias.de (SS) Subhash Saini NASA Ames saini@nas.nasa.gov (RS) Ron Sercely HP-Convex sercely@convex.hp.com Alan Stagg CEWES stagga@wes.army.mil (ES) Erich Strohmaier Univ. 
of Tennessee erich@cs.utk.edu
(PW) Pat Worley       Oak Ridge Nat'l Lab   worleyph@ornl.gov

SPEC-HPG Visitors:

     Don Dossa        DEC                   dossa@eng.pko.dec.com
(RE) Rudi Eigenmann   Purdue University     eigenman@ecn.purdue.edu
     Greg Gaertner    DEC                   ggg@zko.dec.com
     Jean Suplick     HP                    suplick@rsn.hp.com
     Joe Throp        Kuck & Associates     throp@kai.com

At 9:05am EST, TH opened the meeting and asked that all the attendees introduce themselves. After a brief overview of the proposed agenda, MBe reviewed the minutes from the last ParkBench meeting in October of '96. The minutes were unanimously accepted and TH asked VG to present the proposed changes to the low-level benchmarks (9:20am).

VG reviewed the original COMMS1 (ping-pong or simplex communication) and the COMMS2 (duplex communication) low-level benchmarks. He discussed some of the problems with the previous versions. These included the omission of calculated bandwidth, large message length problems, and large errors in the asymptotic fit. In collaboration with RS and CG, a number of improvements have been made to these benchmarks:

  1. Measured bandwidth is provided in output.
  2. Time for shortest message is provided.
  3. Maximum measured bandwidth and the corresponding message length is now provided.
  4. The accuracy of the least-squares 2-parameter fit has been improved (sum of squares of the "relative" and not absolute error is now used).
  5. New 3-parameter variable-power fit for certain cases added.
  6. Can report parametric fits if the error is less than some user-specified tolerance.
  7. Introduce KDIAG parameter to invoke diagnostic outputs.
  8. Modifications to ESTCOM.f (as suggested by RS).

CG pointed out that it may not always be possible to interpret zero-length messages for these codes. On the Cray machines, such messages force an immediate return (i.e., no synchronization). He proposed that allowing zero-length messages be removed for the COMMS benchmarks.

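[Editorial illustration] The relative-error criterion in item 4 of the list of improvements above can be sketched in a few lines. The sketch assumes the usual two-parameter linear timing model t(n) = t0 + n/r_inf (start-up time t0, asymptotic bandwidth r_inf); the function name fit_two_param and the sample data are hypothetical and are not taken from the COMMS1 source. Minimizing the sum of squared relative errors is equivalent to a weighted linear least-squares fit with weights 1/t_i^2, which has a closed-form solution:

```python
import math

def fit_two_param(lengths, times):
    """Fit t(n) = t0 + n / r_inf by minimizing sum(((t_i - t(n_i)) / t_i)**2).

    Hypothetical helper for illustration; not the COMMS1 fitting routine.
    """
    # Dividing each residual by t_i is the same as weighting the squared
    # residual by 1 / t_i**2, so the normal equations stay linear.
    s = sn = snn = st = snt = 0.0
    for n, t in zip(lengths, times):
        w = 1.0 / (t * t)
        s += w
        sn += w * n
        snn += w * n * n
        st += w * t
        snt += w * n * t
    det = s * snn - sn * sn
    t0 = (st * snn - sn * snt) / det     # intercept: start-up time (s)
    slope = (s * snt - sn * st) / det    # seconds per byte
    return t0, 1.0 / slope               # (t0, asymptotic bandwidth in byte/s)

# Synthetic, noise-free data (assumed values): 50 us start-up, 100 MB/s asymptote.
lengths = [8, 64, 512, 4096, 32768, 262144, 2097152]
times = [50e-6 + n / 100e6 for n in lengths]
t0, r_inf = fit_two_param(lengths, times)
```

With an absolute-error criterion the longest messages dominate the fit and the start-up time is poorly determined; the relative weighting gives short and long messages comparable influence. Real measurements are noisy, so the parameters would only be recovered approximately.
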
RS showed an actual COMMS1 performance graph demonstrating the difficulty of data extrapolation (if used to get latency for zero-length message-passing). RS pointed out, however, that zero-length messages are defined within MPI, and suggested that a simple return (as in the case of Cray machines) is not standard.

VG displayed some of the observed COMMS1/2 performance obtained on the Cray T3E. The 3-parameter fit yielded a 7% relative error for messages ranging from 8 to 1.E+7 bytes. CG questioned how the breakpoints were determined. He indicated the input parameters to the program required previous knowledge of where breakpoints occur (although implementations could change constantly).

TH suggested that the parametric fitting should not be the default for these benchmarks, i.e., separate the analysis from the actual benchmarking (this concept was seconded by CG). RS suggested that the fitting routines could be placed on the WWW/Internet and the COMMS1/2 codes simply produce data. CK, however, stressed that the codes should maintain some minimal parametric fitting for clarity and consistency of output interpretations.

The minimal message length for the T3E results shown by VG was 8 and the corresponding minimal message length for a Convex CXD set of COMMS benchmarks was 1. The lack of similar ranges of messages could pose problems for comparisons. JD strongly felt that users will return to the notion of "latency" and want zero-length message overheads. Users may be primarily interested in start-up time for message-passing. RS pointed out that MPI does process zero-length messages. JD suggested that the minimal message length for the COMMS benchmarks be 8 bytes and RS proposed that the minimal message-passing time and corresp. message length be an output.

After more discussion, the following COMMS changes/outputs were unanimously agreed upon:

  1. Maximum bandwidth with corresp. message size.
  2. Minimum message-passing time with corresp. message size.
  3. Time for minimum message length (could be 0, 1, 8, or 32 bytes but must be specified).
  4. The software will be split into two programs: one to report the spot measurements and the other for the analysis.

At 10:00 am, SPEC-HPG members joined the ParkBench meeting for a joint session. CK reviewed the DoD Modernization Program. He indicated that the program is based on 3 primary components:

  1. CHSSI (Common High Performance Computing Software Support Initiative)
  2. DREN (Defense Research & Engineering Network)
  3. Shared Resource Centers (4 Major Shared Resource Centers or MSRC's and 20 Distributed Centers or DC's)

Benchmarking is part of the mission of the MSRC's, especially for system integration and the Programming Environment & Training (PET) team. CK mentioned that the resources available at the MSRC's include: 256-proc. Cray T3E, SGI Power Challenge (CEWES), 256 proc. IBM SP/2 and SGI Origin 2000 at ASC, SGI 790 at NAVO, and a collection of {SGI Origin, Cray Titan, J90} at the Army Research Lab.

The benchmarking needs of the DoD program can be categorized as either contractual or training. The contractual needs are specified as PL1 (evaluation of initial machines), PL2 (upgrade to gain 3 times the performance of PL1), and PL3 (upgrade to gain 10 times the performance of PL1). CK mentioned that the MSRC's are planning for the PL2 phase later this year with PL3 scheduled in approx. 3 years. The training needs include: the evaluation of programming paradigms, the evaluation of performance trade-offs, templates for designing new codes, and benchmarks for training examples.

The contractual benchmarks comprise 30 benchmarks (22 programs) some of which are export-controlled or proprietary (data may not be used in the public domain in some cases). The run rules specify the number of iterations for each benchmark in the suite. Each MSRC uses a different number of iterations per benchmark.
Code modifications are allowed (parallel directives and message-passing can be used but no assembler) and algorithm substitutions are permitted provided the problem does not become specialized. The only performance metric reported for these benchmarks is the elapsed time for the entire suite. Benchmarks can be upgraded to reflect current workloads of the MSRCs but they must be compared head-to-head with previous systems.

Example codes in the DoD benchmark suite include: CTH (finite volume shock simulation), X3D (explicit finite element code), OCEAN-O2 (an ocean modeling code), NIKE3D (implicit nonlinear 3D FEM), and Aggregate I/O benchmark.

Planned benchmarking activities for the DoD Modernization Program include:

  1. benchmarks for evaluating programming techniques (determine what works; develop decision trees)
  2. benchmarks for teaching (classes on "worked" examples; template modification)

This effort currently has 1 FTE and over 50 University personnel (in PET program) involved (although they are not primarily responsible for benchmarking work).

At 10:35am, TH asked AH from Los Alamos Nat'l Lab to overview their ASCI benchmark suite. He began by pointing out that these codes form the "Los Alamos set of" ASCI Benchmarks. Before presenting the list of codes, AH noted that the philosophy of this activity was to achieve "experiment ahead" capability especially with immature computing platforms. Los Alamos is also interested in developing performance modes as well as kernels.

The list of active/research codes and compact applications comprising this suite is:

Code          Language(s)  Parallelism           Description
*HEAT(RAGE)   f77, f90     MPI(f90),             Eulerian adaptive mesh refinement based on
                           MPIfSM(f77)           Riemann solvers; coupled physics-CFD;
                                                 particle & radiative transport
EULER         f90          MPI,                  Admissible fluid (for SIMD); unstructured
                           SIMD(SP vector)       mesh, explicit solution; high-speed fluids
NEUT          f77          MPI, SM, SHMEM        Monte-Carlo, particle
SWEEP3D       f90          MPI, SHMEM            Inner/outer iteration (kernel)
                                                 (compact application)
HYDRO(T)      f77          Serial                (compact application)
TBON          f77          MPI                   Material science; quantum mechanics;
                                                 polymer age simulation
*TECOLOTE     C++          MPI                   Mixed call hydro. with regular structured grid
*TELURIDE     f90          MPI                   Casting simulation; irregular structured
                                                 grid; Krylov solution methods
*DANTE        HPF          MPI

* = export controlled; SP = single processor

The codes and compact apps above vary in size from 2,000 to 35,000 lines. AK noted that LANL could provide support for future ASCI-based ParkBench codes. The ASCI benchmark suite presented might in the future include tri-lab (Livermore, Sandia, Los Alamos) contributions. The ASCI application suite can be set up with data sets leading to varying run-times. AH mentioned that Los Alamos' ASCI benchmarking efforts are focused on high performance computing, leading edge architectures, algorithms, and applications. They are particularly concentrating on developing expertise in distributed shared-memory performance evaluation and modeling. AH expressed the hope that the efforts of ParkBench will follow similar directions.

At 11:05am, SS reviewed some of the most recent NAS Parallel Benchmarks results. He began with vendor-optimized CG Class B results using row and column distribution blocking. Results for different numbers of processors of the T3D were reported along with results for the NEC SX-4, SGI Origin 2K, Convex SPP2K, Fujitsu VPP700, and IBM P2SC.
He also showed results for FT Class B and BT Class B (all machines reported performed well on this benchmark). For BT, it was pointed out that 4 of the machines (Cray T3E, DEC Alpha, IBM P2SC, and NEC SX-4) essentially are based on the same processor but achieve widely-varying results. SS also reported HPF Class A MG results on 16 processors of the IBM SP2. The HPF version (APR-HPF/Portland Group compiled) was only 3 times slower than the MPI-based (f77) implementation. This is indeed a significant result given that two years ago the HPF version was as much as 10 times slower than the comparable MPI version. An HPF version of the Class A FT benchmark on 64 processors was shown to be faster than the MPI version (1.6 times faster) when optimized libraries are used in both versions. For the Class A SP benchmark (on 64 processors of the SP/2), the APR- and PGI-compiled HPF versions were within a factor of 2 of the MPI versions. Finally, the HPF Class A BT code on 64 processors of the Cray T3D was within a factor of 0.5 of the MPI version.

At 11:35am, TH invited RE to overview current SPEC-HPG activities. The SPEC-HPG benchmarks define a suite of real-world high-performance computing applications designed for comparisons across different platforms (serial and message-passing). RE pointed out the history of the SPEC-HPG effort as a merger between the PERFECT and SPEC benchmarking activities. The current SPEC-HPG suite comprises 2 codes: SPECchem96 and SPECseis96. The SPECchem96 code evolved from the GAMESS code used in pharmaceutical and chemical industries. It comprises 109,389 lines of f77 (21% comments), 865 subroutines and functions. The wave functions are written to disk. The SPECseis96 code is derived from the ARCO benchmark suite which consists of four phases: data generation, stack data, time migration, and depth migration. This code decomposes the domain into n equal parts (for n processors) with each part processed independently.
It has over 15K lines of code made up of 230 Fortran subroutines and 199 C functions for I/O and systems utilities. SPECseis96 uses 32-bit precision, FFT's, Kirchhoff integrals, and finite differences. The very first set of SPEC-HPG benchmark results were approved on May 8, 1997 (preceding day). New benchmarks being considered are PMD (Parallel Molecular Dynamics) and MM5 (NCAR Weather Processing C code). The decision on whether or not to accept these two potential SPEC-HPG codes will be made in about 5 months. The SPEC-HPG run rules permit the use of compiler switches, source code changes, and optimized libraries (which have been disclosed to customers). Only approved algorithmic changes will be disclosed. RE gave the URL for the SPEC-HPG effort: http://www.specbench.org/hpg. He also referred to a recent article by himself and S. Hassanzadeh in "IEEE Computational Science & Engineering" and two email reflectors for SPEC-HPG communication: comments@specbench.org and info@specbench.org.

JD then gave a brief history of ParkBench and SPEC-HPG interactions and suggested that the two efforts might consider sharing results and software. The biggest difference between the two efforts is in the availability of software, as ParkBench code is freely available and SPEC-HPG software has some restrictions. A forum to publish both sets of results was discussed and it was agreed that both efforts should at least share links on their respective webpages. RE pointed out that anyone can get the SPEC-HPG CD of benchmarks without actually being a SPEC member. JD stressed that the process of running codes (for any suite) needs to be simplified so that building executables for different platforms is not problematic. Modifications for porting should be restricted to driver programs. RS indicated that he has Perl scripts that run all low_level benchmarks, including COMMS3 for 2 to N procs, and produce a summary of the results.
*** ACTION ITEM *** JD, RE, AH, and CK will discuss a potential joint effort to simplify the running of benchmark codes (contact RS also about his Perl scripts).

MBa noted that the SPEC-HPG members should be added to the ParkBench email list (parkbench-comm@cs.utk.edu). He also indicated that the European benchmarking workshop scheduled for next Fall might coordinate with the European SPEC group (scheduled for Sept. 11-12).

At 12:10pm, the attendees went to lunch (Soup Kitchen).

After lunch (1:30pm), TH asked ES and VG to coordinate changes to the COMMS benchmarks discussed above (*** ACTION ITEM ***). ES then discussed modifications to poly2 for the ParkBench V2.2 suite. The proposed changes include

1. enlarged arrays A(1000000), B(1000000)
2. removal of arrays C and D
3. avoid cache flush (use a sliding vector), i.e.,

      DO I=1,N      becomes      DO I=NMIN,NMAX
                                 ...
                                 NMIN=NMIN+N+INC

   where INC=17 by default (avoids reuse of the old cache line).

PM then discussed a program for determining parameters for memory subsystems. Characteristics of this software include the use of tight loops, independent memory references, and maximized register use. He showed graphs of memory hierarchy bandwidth (reads and writes) depicting memory size (ranging from 4Kb to 4Mb) versus Mb/sec transfer rates. Some curves illustrated the effective cache size quite well. PM pointed out that dynamically-scheduled processors pose a significant problem for this type of modeling. The program can be run with or without a calibration loop exploiting known memory transfer data. CG suggested that it would be nice to have such a program to measure latency at all levels of the hierarchy. PM's webpages for this program are: http://www.cs.utk.edu/~mucci/cachebench and http://www.cs.utk.edu/~mucci/parkbench. CK suggested that an uncalibrated version of PM's benchmark would be more useful to users (more reflective of real codes). JD pointed out that the output of the program could be tabulated bandwidths, latencies, etc.
CG felt this program would be a very useful tool. PM noted that the calibration will not be used by default. TH suggested that the ParkBench effort might want to develop a future "ParkBench Tool Set" which contains programs like this one developed by PM.

With regard to the Linalg Kernels, ES noted that although many of the routines have calls to Scalapack routines, Scalapack will not be included in future software releases. Users will have to get their own copies of the source (or binaries) for Scalapack. The size of these particular kernel benchmarks drops by a factor of one-third by removing Scalapack.

*** ACTION ITEM *** ES will report the most recent Linalg benchmark performance results at the next ParkBench meeting.

TH then asked for discussions on new benchmarks with MBa leading the discussion on HPF benchmarks. MBa indicated that a new mail reflector (parkbench-hpf@cs.utk.edu) had been set up for this purpose with himself as moderator for low-level codes (CK will moderate kernels and SS will moderate discussions on HPF compact applications). MBa noted that there is limited manpower for the HPF benchmarking activities. CK noted that he had discussed this effort at the recent HPFF meeting (and other users meetings). A draft document on the ParkBench HPF benchmarks is available at http://www.sis.port.ac.uk/~mab/ParkBench. MBa felt strongly that without manpower support this particular activity will die and that a lead site is needed.

*** ACTION ITEM *** CK and SS will investigate interest in HPF compact application development.

JD indicated that wrappers are being used to create HPF versions of the Linalg kernels. The procedure involves writing wrappers for the current Scalapack driver programs. Eventually, these programs may be completely rewritten in HPF (this will start in the summer). TH suggested that HPF kernel benchmark performance be reported at the ParkBench meeting in September (at Southampton Performance Workshop).
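
[Editorial sketch] The sliding-vector change proposed for poly2 earlier in these minutes can be illustrated with a few lines of index arithmetic. This is only a sketch, not the ParkBench code: it is written in Python for brevity, the function name sliding_windows is hypothetical, and both the relation NMAX = NMIN+N-1 and the restart at index 1 when a window would run past the end of the array are assumptions that the minutes do not state. Only the update NMIN=NMIN+N+INC, the default INC=17, and the array size 1000000 come from the minutes.

```python
def sliding_windows(n, inc=17, size=1_000_000, passes=4):
    """Yield 1-based inclusive (nmin, nmax) loop bounds for successive passes.

    Each pass starts n + inc elements after the previous one, so it touches
    array elements that the previous pass did not load into cache.
    """
    if not 0 < n <= size:
        raise ValueError("n must be between 1 and size")
    nmin = 1
    for _ in range(passes):
        nmax = nmin + n - 1      # assumption: a window holds exactly n elements
        if nmax > size:          # assumption: restart at the front of the array
            nmin, nmax = 1, n
        yield nmin, nmax
        nmin = nmin + n + inc    # update quoted in the minutes


if __name__ == "__main__":
    for nmin, nmax in sliding_windows(n=100, passes=3):
        print(nmin, nmax)
```

With n=100 the first three windows are 1-100, 118-217 and 235-334, so consecutive passes never share an element and are separated by a gap of INC elements.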
MBa went on to report on the status of I/O benchmarks. Basically, not much progress has been made on the ParkBench I/O initiative. A new I/O project between ECMWF, FECIT, and the Univ. of Southampton was launched this past February. They are looking at the I/O in the IFS code from the ECMWF (European Weather Forecasting). David Snelling is the FECIT leader who has also participated in ParkBench activities. This I/O project has 1 FTE at Southampton and 1.5 FTE at FECIT along with several personnel at ECMWF. One workshop and two technical meetings are planned for the 1-year project. The goals are: to develop instrumented I/O benchmarks and build on top of MPI-IO (test, characterize parallel systems). Their methodology is very similar to that of ParkBench. Codes in f90 and ANSI C are being considered (stubs for VAMPIR and PABLO). Regular reports to Fujitsu (sponsor of activity) are planned and a full I/O test suite is planned by February 1998.

MBa also reported on the status of the ParkBench graphical database. Currently, the performance data is kept in a relational DBMS. A frontend Java applet has been written to query the DBMS on-the-fly. A backend is also in development which will automate the extraction of new performance data and insertion into the DBMS (via an http server). By September, a more complete prototype which will allow MS access and JDBC between 2 different machines should be ready.

VG then discussed the development of Java-based low-level benchmarks. He presented a Java-to-C Interface Generator which would allow Java benchmarks to call existing C libraries on remote machines. He presented sample Java+C NAS PB results on a 16-processor IBM SP/2 (Class A IS Benchmark):

Version     1 Proc   2 Procs   4 Procs   8 Procs   16 Procs
NASA (C)     29.1      17.4       9.4       5.2        2.8
C            40.5      24.9      13.1       9.3       15.6
Java         ----     132.5      64.7      37.9       33.5

At 2:50pm, TH reported other ParkBench activities including the new PEMCS (Performance Evaluation and Modeling for Computer Systems) electronic journal.
Suggested articles/authors include:

 *1. ParkBench Report No. 2 (ES, MBe)
 *2. NAS PB
  3. SPEC-HPG
 *4. Top 500
  5. AutoBench (M. Ginsburg)
 *6. Euroben (van der Steen)
  7. RAPS
  8. Europort
 *9. Cache benchmarks
 10. ASCI benchmarks (DoD)
*11. PERFORM
 12. R. Hockney
*13. PEPS
 14. C3I/Rome Labs

Articles possible for Summer '97 are marked with *. JD suggested that articles be available in Encapsulated Postscript, PDF (Adobe), and HTML. TH noted that EU funding will provide a host computer and some administration. Possible publishers are Oxford Univ. Press and Elsevier.

At 3:10pm, ES requested more items for the ParkBench bibliography which will be available on the WWW. PW suggested that authors should be able to submit links to ParkBench-related applications. JD then briefly discussed WebBench which is a website focused on benchmarking and performance evaluation. Data is presented on platforms, applications, organizations, vendors, conferences, papers, newsgroups, FAQ's, and repositories (PDS, Top500, Linpack, etc.). The WebBench URL is http://www.netlib.org/benchweb.

MBa reminded attendees of the Fall Performance Workshop/ParkBench meeting on (Thursday and Friday) Sept. 11 and 12. This meeting will be held at the County Hotel, Southampton, UK. Invited and contributed talks will be presented.

With regard to ParkBench funding, JD indicated that the UT/ORNL/NASA Ames proposal was not selected for funding but that it could be re-submitted next year. Expected funding from Rome lab was not received. TH and VG did not succeed this past year either although some funding from Fujitsu is possible.

TH adjourned the meeting at 3:25pm EST.
From owner-parkbench-comm@CS.UTK.EDU Tue May 27 10:32:45 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA25239; Tue, 27 May 1997 10:32:45 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA05022; Tue, 27 May 1997 10:12:02 -0400 Received: from exu.inf.puc-rio.br (exu.inf.puc-rio.br [139.82.16.3]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA05013; Tue, 27 May 1997 10:11:53 -0400 Received: from obaluae (obaluae.inf.puc-rio.br) by exu.inf.puc-rio.br (4.1/SMI-4.1) id AA20170; Tue, 27 May 97 11:11:00 EST From: maira@inf.puc-rio.br (Maira Tres Medina) Received: by obaluae (SMI-8.6/client-1.3) id LAA16226; Tue, 27 May 1997 11:10:58 -0300 Date: Tue, 27 May 1997 11:10:58 -0300 Message-Id: <199705271410.LAA16226@obaluae> To: parkbench-comments@CS.UTK.EDU Subject: Benchmarks Cc: parkbench-comm@CS.UTK.EDU, maira@CS.UTK.EDU, victal@CS.UTK.EDU X-Sun-Charset: US-ASCII

Hello

I'm a graduate student at the Computer Science Department of PUC-Rio (Catholic University of Rio de Janeiro). I'm currently studying Low_Level benchmarks for measuring basic computer characteristics. I have had some problems trying to run some of the benchmarks. For example, the comms1 benchmark for PVM prints the following error messages and stops.

n05.sp1.lncc.br:/u/renata/maira/ParkBench/bin/RS6K>comms1_pvm
Number of nodes = 2
Front End System (1=yes, 0=no) = 0
Spawning done by process (1=yes, 0=no) = 1
Spawned 0 processes OK...
libpvm [t4000c]: pvm_mcast(): Bad parameter
TIDs sent...benchmark progressing...

n05.sp1.lncc.br:/u/renata/maira/ParkBench> bin/RS6K/comms1_pvm
1525-006 The OPEN request cannot be processed because STATUS=OLD was coded in the OPEN statement but the file comms1.dat does not exist. The program will continue if ERR= or IOSTAT= has been coded in the OPEN statement.
1525-099 Program is stopping because errors have occurred in an I/O request and ERR= or IOSTAT= was not coded in the I/O statement.

I would like to know how I can execute the benchmarks only for PVM. Can you help me? I have not had problems with the sequential benchmarks (tick1, tick2 ...).

Thank you very much for your attention.

Maira Tres Medina
PhD Student
Pontifical Catholic University
Rio de Janeiro, Brazil

From owner-parkbench-comm@CS.UTK.EDU Wed May 28 16:36:07 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id QAA15377; Wed, 28 May 1997 16:36:06 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id QAA16158; Wed, 28 May 1997 16:26:41 -0400 Received: from rastaman.rmt.utk.edu (root@TCHM03A16.RMT.UTK.EDU [128.169.27.60]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id QAA16150; Wed, 28 May 1997 16:26:37 -0400 Received: from rastaman.rmt.utk.edu (localhost [127.0.0.1]) by rastaman.rmt.utk.edu (8.7.6/8.7.3) with SMTP id QAA00226; Wed, 28 May 1997 16:33:33 -0400 Sender: mucci@CS.UTK.EDU Message-ID: <338C968B.124F15AA@cs.utk.edu> Date: Wed, 28 May 1997 16:33:33 -0400 From: "Philip J. Mucci" Organization: University of Tennessee, Knoxville X-Mailer: Mozilla 3.01 (X11; I; Linux 2.0.28 i586) MIME-Version: 1.0 To: Maira Tres Medina CC: parkbench-comments@CS.UTK.EDU, parkbench-comm@CS.UTK.EDU Subject: Re: Benchmarks References: <199705271410.LAA16226@obaluae> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit

Hi,

You need to make sure the dat files are in the executable directory. They should be installed in $PVM_ROOT/bin/$PVM_ARCH.

-Phil

-- /%*\ Philip J. Mucci | GRA in CS under Dr.
JJ Dongarra /*%\ \*%/ http://www.cs.utk.edu/~mucci PVM/Active Messages \%*/ From owner-parkbench-comm@CS.UTK.EDU Thu Jun 5 11:30:41 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id LAA11302; Thu, 5 Jun 1997 11:30:41 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA14227; Thu, 5 Jun 1997 10:53:09 -0400 Received: from haven.EPM.ORNL.GOV (haven.epm.ornl.gov [134.167.12.69]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id KAA14220; Thu, 5 Jun 1997 10:53:07 -0400 Received: (from worley@localhost) by haven.EPM.ORNL.GOV (8.8.3/8.8.3) id KAA06499; Thu, 5 Jun 1997 10:53:06 -0400 (EDT) Date: Thu, 5 Jun 1997 10:53:06 -0400 (EDT) From: Pat Worley Message-Id: <199706051453.KAA06499@haven.EPM.ORNL.GOV> To: parkbench-comm@CS.UTK.EDU Subject: Gordon conference deadline extended Forwarding: Mail from 'Pat Worley ' dated: Thu, 5 Jun 1997 10:48:07 -0400 (EDT) Cc: worley@haven.EPM.ORNL.GOV, tony@cs.msstate.edu (Our apologies if you receive this multiple times.) There is still room for additional attendees at the Gordon Conference on High Performance Computing, and the Gordon Research Conference administration has agreed to extend the application deadline. As a practical matter, applications need to be submitted no later than JULY 1. We will also stop accepting applications before that date if the maximum meeting size is reached, so please apply as soon as possible if you are interested in attending. The simplest way to apply is to download the application form from the web site http://www.erc.msstate.edu/conferences/gordon97 or to use the online registration option available at the same site. If you have any problems with either of these, please contact the organizers at tony@cs.msstate.edu and worleyph@ornl.gov. 
Complete information on the meeting is available from the Web site or its links, but a short summary of the meeting follows: -------------------------------------------------------------------------- The 1997 Gordon Conference on High Performance Computing and Information Infrastructure: "Practical Revolutions in HPC and NII" Chair, Anthony Skjellum, Mississippi State University, tony@cs.msstate.edu, 601-325-8435 Co-Chair, Pat Worley, Oak Ridge National Laboratory, worleyph@ornl.gov, 615-574-3128 Conference web page: http://www.erc.msstate.edu/conferences/gordon97 July 13-17, 1997 Plymouth State College Plymouth NH The now bi-annual Gordon conference series in HPC and NII commenced in 1992 and has had its second meeting in 1995. The Gordon conferences are an elite series of conferences designed to advance the state-of-the-art in covered disciplines. Speakers are assured of anonymity and referencing presentations done at Gordon conferences is prohibited by conference rules in order to promote science, rather than publication lists. Previous meetings have had good international participation, and this is always encouraged. Experts, novices, and technically interested parties from other fields interested in HPC and NII are encouraged to apply to attend. The conference consists of technical sessions in the morning and evening, with afternoons free for discussion and recreation. Each session consists of 2 or 3 one hour talks, with ample time for questions and discussion. All speakers are invited and there are no parallel sessions. All attendees are both encouraged and expected to actively participate, via discussions during the technical sessions or via poster presentations. All attendees, including speakers, poster presenters, and session chairs, must apply to attend. Poster presenters should indicate their poster proposals on their applications. 
While all posters must be approved, successful applicants should assume that their posters have been accepted unless they hear otherwise.

Meeting Themes:

  Networks: Emerging capabilities and the practical implications : New types of networking
  Real-Time Issues
  Multilevel Multicomputers
  Processors-in-Memory and Other Fine Grain Computational Architectures
  Impact of Evolving Hardware on Applications
  Impact of Software Abstractions on Performance

Confirmed Speakers:

  Ashok K. Agrawala          University of Maryland
  Kirstie Bellman            DARPA/SISTO
  James C. Browne            University of Texas at Austin
  Andrew Chien               University of Illinois, Urbana-Champaign
  Thomas H. Cormen           Dartmouth College
  Jean-Dominique Decotignie  CSEM
  David Greenberg            Sandia National Laboratories
  William Gropp              Argonne National Laboratory
  Don Heller                 Ames Laboratory
  Jeff Koller                Information Sciences Institute
  Peter Kogge                University of Notre Dame
  Chris Landauer             The Aerospace Corporation
  Olaf M. Lubeck             Los Alamos National Laboratory
  Andrew Lumsdaine           University of Notre Dame
  Lenore Mullins             SUNY, Albany
  Paul Plassmann             Argonne National Laboratory
  Lui Sha                    Carnegie Mellon University
  Paul Woodward              University of Minnesota

From owner-parkbench-comm@CS.UTK.EDU Tue Jul 1 17:06:52 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id RAA20550; Tue, 1 Jul 1997 17:06:51 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA21503; Tue, 1 Jul 1997 17:03:35 -0400 Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA21438; Tue, 1 Jul 1997 17:02:42 -0400 Received: from baker (baker.npac.syr.edu) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA10168; Tue, 1 Jul 97 22:00:22 BST Date: Tue, 1 Jul 97 20:55:49 From: Mark Baker Subject: Fall 97 Parkbench Workshop - Southampton, UK To: ejz@ecs.soton.ac.uk, parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU, William Gropp , Antoine Hyaric ,
gent@genias.de, gcf@npac.syr.edu, geerd.hoffman@ecmwf.co.uk, reed@cs.uiuc.edu, david@cs.cf.ac.uk, clemens-august.thole@gmd.de, klaus.stueben@gmd.de, "J.C.T. Pool" , Paul Messina , foster@mcs.anl.gov, idh@soton.ac.uk, rjc@soton.ac.uk, plg@pac.soton.ac.uk, Graham.Nudd@dcs.warwick.ac.uk Cc: lec@ecs.soton.ac.uk, rjr@ecs.soton.ac.uk, "MATRAVERS Prof. D R STAF" , wilsona@sis.port.ac.uk, grant , hwyau@epcc.ed.ac.uk X-Priority: 3 (Normal) X-Mailer: Chameleon 5.0.1, TCP/IP for Windows, NetManage Inc. Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII

Dear All,

This is to let you know that the Department of Electronics and Computer Science at the University of Southampton is organising a Fall 97 Parkbench Workshop on the 11th and 12th of September 1997. See http://hpc-journals.ecs.soton.ac.uk/Workshops/PEMCS/fall-97/ for further details.

The workshop will include a number of talks from researchers working in the field of performance evaluation and modelling of computer systems, a panel discussion session and a Parkbench committee meeting.

The Workshop is free to attend - workshop delegates need only cover their own travel and accommodation expenses. Attendance is limited, so places at the Workshop will be allocated on a first-come basis.

It is planned to turn the talks given at the Workshop into a series of short papers which will be put together and published as a Special Issue of the electronic journal Performance Evaluation and Modelling of Computer Systems (PEMCS).

For further information or registration details refer to the Web pages - (http://hpc-journals.ecs.soton.ac.uk/Workshops/PEMCS/fall-97/registration.html).

I would appreciate it if you would kindly pass this email on to colleagues who may be interested in the event.
Regards Mark ------------------------------------- Dr Mark Baker CSM, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 7/1/97 - Time: 8:55:49 PM URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Wed Jul 23 17:19:23 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id RAA04434; Wed, 23 Jul 1997 17:19:23 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA28191; Wed, 23 Jul 1997 17:10:39 -0400 (EDT) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id RAA28171; Wed, 23 Jul 1997 17:10:24 -0400 (EDT) Received: from baker (baker.npac.syr.edu) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA14190; Wed, 23 Jul 97 22:10:30 BST Date: Wed, 23 Jul 97 22:01:41 +0000 From: Mark Baker Subject: PEMCS Web Site To: parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. X-Face: "3@c]&iv:nfs&\mp6nN90ioxbQ-Eu:]}^MyviIL7YjwT,Cl)|TYpTQ})PP'&O=V`~)JQRWjM?H;'`q\"3mv "j@5vs)}!WC3pG9q:;rpe0\LoLQfY"1?1A.\(f=E*&QAW8WK+)*)T0[Bv=[{.-f7<6Ddv!2XaWhH X-Priority: 3 (Normal) Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Dear All, The Web site that will host the Journal of "Performance Evaluation and Modelling of Computer Systems (PEMCS)" can be found at: http://hpc-journals.ecs.soton.ac.uk/PEMCS/ The pages I have put up are at the present still in a "draft/under-construction" state. I would appreciate any comments or feedback about the pages. 
Regards Mark ------------------------------------- Dr Mark Baker DIS, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 07/23/97 - Time: 22:01:41 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Thu Jul 24 08:26:42 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA12708; Thu, 24 Jul 1997 08:26:42 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id IAA04617; Thu, 24 Jul 1997 08:21:55 -0400 (EDT) Received: from berry.cs.utk.edu (BERRY.CS.UTK.EDU [128.169.94.70]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id IAA04599; Thu, 24 Jul 1997 08:21:23 -0400 (EDT) Received: from cs.utk.edu by berry.cs.utk.edu with ESMTP (cf v2.11c-UTK) id IAA13817; Thu, 24 Jul 1997 08:21:24 -0400 Message-Id: <199707241221.IAA13817@berry.cs.utk.edu> To: Mark Baker cc: parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU Subject: Re: PEMCS Web Site In-reply-to: Your message of Wed, 23 Jul 1997 22:01:41 -0000. Date: Thu, 24 Jul 1997 08:21:24 -0400 From: "Michael W. Berry" > Dear All, > > The Web site that will host the Journal of "Performance > Evaluation and Modelling of Computer Systems (PEMCS)" can > be found at: > > http://hpc-journals.ecs.soton.ac.uk/PEMCS/ > > The pages I have put up are at the present still in a > "draft/under-construction" state. > > I would appreciate any comments or feedback about the > pages. > > Regards > > Mark > > > > ------------------------------------- > Dr Mark Baker > DIS, University of Portsmouth, Hants, UK > Tel: +44 1705 844285 Fax: +44 1705 844006 > E-mail: mab@sis.port.ac.uk > Date: 07/23/97 - Time: 22:01:41 > URL http://www.sis.port.ac.uk/~mab/ > ------------------------------------- > Mark, the webpages are well organized. You might reconsider the red text on the green background of the menu frame. It was difficult to read on my machine at home. 
Nice work! Mike ------------------------------------------------------------------- Michael W. Berry Ayres Hall 114 berry@cs.utk.edu Department of Computer Science OFF:(423) 974-3838 University of Tennessee FAX:(423) 974-4404 Knoxville, TN 37996-1301 URL:http://www.cs.utk.edu/~berry/ ------------------------------------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Fri Aug 1 12:59:29 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id MAA05831; Fri, 1 Aug 1997 12:59:27 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA01387; Fri, 1 Aug 1997 12:38:00 -0400 (EDT) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA01337; Fri, 1 Aug 1997 12:37:24 -0400 (EDT) Received: from baker (baker.npac.syr.edu) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA15842; Fri, 1 Aug 97 17:36:11 BST Date: Fri, 1 Aug 97 17:17:51 +0000 From: Mark Baker Subject: Reminder - Fall Parkbench Workshop To: parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. 
X-Face: ,<'y31|nlb,jCP5?km9\KD+>p9/e?:|$RRhY]e;#`awGHh=mrY.T??#]-*rt}l0*u`k2A7n KlqNG"u'-%cS@3|G[%=m%bSB[lfSn5n"gD4CU(j?1y?#SOkm!qw_=p%c#"6g&(+\Oy6T{4CEShal?z M)&Gd'Pb6Qc~>SPx{m[F55=]yY>cN>|/m5)T?q`OTjdQL=7-n%NT({;;$P*2[#7ZWL8baLoI_/F89, x'u`*$'<|ctKNYTSJuLV=!$QT3bN*>91V,a0Cc"_UsxwMKg\;#W2LZ$!`j?ZWp;byz~;y}2Dz6i7y% E&;gfnmI_~}+oifmWXJMHfWeezBL1("ZnFe!rnX[Q|,:IJ?iq+PePa/[3R4 X-Priority: 3 (Normal) Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII

Dear All,

This email is a reminder about the:

----------------------------------------------------------------------------------------------------
Fall ParkBench Workshop
Thursday 11th/Friday 12th September 1997
at the University of Southampton, UK
See http://hpc-journals.ecs.soton.ac.uk/Workshops/PEMCS/fall-97/
----------------------------------------------------------------------------------------------------

If you are interested in attending the Workshop you should register now and reserve accommodation, as hotel rooms in Southampton during the workshop period will be in short supply due to the "International Southampton Boat Show" which will also be taking place.

At present we have a preliminary reservation on rooms at the County Hotel where the Workshop is being held. Without concrete delegate reservations we can only hold onto these rooms for approximately another week. Thereafter, accommodation at the Hotel, or around the city, may be harder to find and reserve.

So, I encourage potential Workshop delegates to register ASAP.
Mark ------------------------------------- Dr Mark Baker University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 08/01/97 - Time: 17:17:52 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Mon Aug 11 13:13:12 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id NAA20171; Mon, 11 Aug 1997 13:13:11 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id NAA06842; Mon, 11 Aug 1997 13:02:59 -0400 (EDT) Received: from MIT.EDU (SOUTH-STATION-ANNEX.MIT.EDU [18.72.1.2]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id NAA06808; Mon, 11 Aug 1997 13:02:42 -0400 (EDT) Received: from MIT.MIT.EDU by MIT.EDU with SMTP id AA27349; Mon, 11 Aug 97 13:02:14 EDT Received: from HOCKEY.MIT.EDU by MIT.MIT.EDU (5.61/4.7) id AA01161; Mon, 11 Aug 97 13:02:12 EDT Message-Id: <9708111702.AA01161@MIT.MIT.EDU> X-Sender: mmccarth@po9.mit.edu X-Mailer: Windows Eudora Pro Version 2.1.2 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Date: Mon, 11 Aug 1997 13:02:12 -0400 To: alison.wall@rl.ac.uk, weber@scripps.edu, schauser@cs.ucsb.edu, dewombl@sandia.gov, edgorha@sandia.gov, rdskocy@sandia.gov, sales@pgroup.com, utpds@CS.UTK.EDU, parkbench-comm@CS.UTK.EDU, pancake@cs.orst.edu, johnreed@ghost.CS.ORST.EDU, levesque@apri.com, davida@cit.gu.edu.au, gddt@gup.uni-linz.ac.at, atempt@gup.uni-linz.ac.at, rileyba@ornl.gov, bac@ccs.ornl.gov From: "Michael F. McCarthy" Subject: For Sale: CM-5 PLEASE FORWARD THIS NOTE TO ANYONE THAT YOU BELIEVE MAY HAVE AN INTEREST IN PURCHASING THIS SYSTEM! __________________________________________________________________________ Case #3971 -- FOR SALE - CM5 with 128 nodes and SDA -- __________________________________________________________________________ The MIT Lab for Computer Science offers for bid sale a Thinking Machines CM-5 Connection Machine (described below). 
Bids to purchase this system are requested from all interested parties (with a minimum expected Bid of $25,000). All bids must be received at the MIT property office by 5:00 PM (EDT) on Monday, 8/Sept/97. The machine must be moved from MIT within 10 business days of acceptance of the bid. All expenses and arrangements for moving will be made by purchaser. The system consists of: 1) 128 PN CM-5 w/ Vector Units, 256 Network addresses - Part No. CM5-128V-32F 2) Scalable Disk Array with Twenty-four (24) 1.2 GB Drives - Part No. CM5-SA25F 3) Control Processor Interface - Part No. CM5-CPI 4) S-Bus to Diagnostics Network Interface - Part No. CM5-SDN 5) S-Bus Network Interface Board (5) - Part No. CM5-SNI [N.B. On July 16 1997 power was turned off. The machine can be turned back on in its present location only until Friday, 22/AUG/97 when wiring changes are planned in that machine room.] "The Institute reserves the right to reject any or all offers. MIT makes no warranty of any kind, express or implied, with respect to this equipment. This includes fitness for a particular purpose. It is the responsibility of those making an offer to determine, before making an offer, that the equipment meets any conditions required by those making that offer. Thank you." __________________________________________________________________________ Submit bids for Case #3971 before Monday, 8/Sept/97, 5:00 PM (EDT) to: ***************************************************************** * Michael F. McCarthy * Phone: (617)253-2779 * * MIT Property Office * FAX: (617)253-2444 * * E19-429 * E-Mail: mmccarth@MIT.EDU * * 77 Massachusetts Ave. * * * Cambridge, MA 02139 * * ***************************************************************** __________________________________________________________________________ SYSTEM HISTORY The Project SCOUT CM-5 is housed in M.I.T.'s Laboratory for Computer Science (L.C.S).
The machine was acquired in 1993 as part of the ARPA-sponsored project SCOUT, and used to accomplish the stated aim of the project of "fermenting collaborations between users, builders and networkers of massively parallel computers". The CM-5 computer, developed and manufactured by Thinking Machines Corporation, evolved from earlier T.M.C. computers (the CM-2 and the CM-200) with an architecture targeted toward teraflops performance for large, complex, data-intensive applications. The MIT hardware consists of a total of 128 32MHz SPARC microprocessors, each with 4 proprietary floating point arithmetic units and 32Mb of local memory attached to it. The system also includes a subsidiary 25Gb parallel file system for handling high volume parallel application I/O. The system was operated under full maintenance contract from May of 1993 until March 20 1997. On July 16 1997 power was turned off. The machine can be turned back on in its present location only until Friday, 22/AUG/97 when wiring changes are planned in that machine room. The system was used primarily for research but a description of an instructional use made of the machine can be found at http://www-erl.mit.edu/eaps/seminar/iap95/cnh/CM5Intro.html Web sites about other CM5 sites and general information include: http://www.math.uic.edu/~hanson/cmg.html http://www.acl.lanl.gov/UserInfo/cm5admin.html http://ec.msc.edu/CM5/ __________________________________________________________________________ FUTURE MAINTENANCE People submitting bids may wish to discuss future maintenance issues with a company that is a present maintainer of CM5 Equipment, Connection Machine Services.
***************************************************************** * Larry Stewart * Phone: (505) 820-1470 * * * Cell: (505) 690-7799 * * Account Executive * FAX: (505) 820-0810 * * Connection Machines Services * Home: (505) 983-9670 * * 1373 Camino Sin Salida * Pager (888) 712-4143 * * Santa Fe, NM 87501 * E-Mail: stewart@ix.netcom.com * ***************************************************************** __________________________________________________________________________ Michael F. McCarthy MIT Property Office E19-429 77 Massachusetts Ave. Cambridge, MA 02139 Ph (617)253-2779 Fax (617)253-2444 From owner-parkbench-comm@CS.UTK.EDU Mon Sep 1 05:44:50 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id FAA11838; Mon, 1 Sep 1997 05:44:50 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id FAA07176; Mon, 1 Sep 1997 05:35:14 -0400 (EDT) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id FAA07160; Mon, 1 Sep 1997 05:34:44 -0400 (EDT) Received: from mordillo (pc297.sis.port.ac.uk) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA14311; Mon, 1 Sep 97 10:33:06 BST Date: Mon, 1 Sep 97 10:19:23 +0000 From: Mark Baker Subject: Final Announcement: Fall ParkBench Workshop To: "Daniel A. Reed" , "J.C.T. Pool" , a.j.grant@mcc.ac.uk, Antoine Hyaric , Ed Zaluska , Fritz Ferstl , Hon W Yau , idh@soton.ac.uk, parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU, Paul Messina , R.Rankin@Queens-Belfast.AC.UK, rjc@soton.ac.uk, topic@mcc.ac.uk, Wolfgang Genzsch Cc: lec@ecs.soton.ac.uk X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. 
X-Face: ,<'y31|nlb,jCP5?km9\KD+>p9/e?:|$RRhY]e;#`awGHh=mrY.T??#]-*rt}l0*u`k2A7n KlqNG"u'-%cS@3|G[%=m%bSB[lfSn5n"gD4CU(j?1y?#SOkm!qw_=p%c#"6g&(+\Oy6T{4CEShal?z M)&Gd'Pb6Qc~>SPx{m[F55=]yY>cN>|/m5)T?q`OTjdQL=7-n%NT({;;$P*2[#7ZWL8baLoI_/F89, x'u`*$'<|ctKNYTSJuLV=!$QT3bN*>91V,a0Cc"_UsxwMKg\;#W2LZ$!`j?ZWp;byz~;y}2Dz6i7y% E&;gfnmI_~}+oifmWXJMHfWeezBL1("ZnFe!rnX[Q|,:IJ?iq+PePa/[3R4 X-Priority: 3 (Normal) Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Dear all, This is the FINAL ANNOUNCEMENT: If you would like to attend this workshop please let Lesley Courtney (lec@ecs.soton.ac.uk) know by Friday 5th September 1997 at the latest as we need to confirm numbers. Workshop details can be found at http://hpc-journals.ecs.soton.ac.uk/Workshops/PEMCS/fall-97/ Regards Mark ------------------------------------- Dr Mark Baker University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 09/01/97 - Time: 10:19:23 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Wed Sep 3 15:37:55 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id PAA20262; Wed, 3 Sep 1997 15:37:55 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA08273; Wed, 3 Sep 1997 15:19:14 -0400 (EDT) Received: from punt-2.mail.demon.net (punt-2b.mail.demon.net [194.217.242.6]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA08262; Wed, 3 Sep 1997 15:19:10 -0400 (EDT) Received: from minnow.demon.co.uk ([158.152.73.63]) by punt-2.mail.demon.net id aa0626941; 3 Sep 97 17:35 BST Message-ID: Date: Wed, 3 Sep 1997 16:31:07 +0100 To: parkbench-comm@CS.UTK.EDU From: Roger Hockney Subject: Prototype PICT release 1.0 MIME-Version: 1.0 X-Mailer: Turnpike Version 3.03a At their last meeting the Parkbench Committee recommended that an interactive curve fitting tool be produced for the postprocessing and parametrisation of 
Parkbench results using the latest Internet Web technology. I have produced a prototype of such a tool as a Java applet running on a Web page on the user's machine and called it PICT (Parkbench Interactive Curve-fitting Tool). This is now ready for evaluation and testing by the committee. The tool provides the following features: (1) Automatic plotting of Low-Level Parkbench output files from a URL anywhere on the Web (at present limited to New COMMS1 and Raw data, but easily extended to original COMMS1 and RINF1). This is useful for a quick comparison of raw data. (2) Automatic plotting of both the 2-parameter and 3-parameter curve-fits which are produced by the benchmarks. Good for checking the quality of the fits. (3) Allows manual rescaling of the graph range to suit the data, either by typing in the required range values or by dragging out a range box with the mouse. (4) Allows the 2-parameter and 3-parameter performance curves to be manually moved about the graph in order to fine tune the fits. The curve follows the mouse and the RMS and MAX percentage errors are shown as the curve moves. Alternatively, parameter values can be typed in and the Manual button pressed, whereupon the curve for these values will be plotted. (5) The data file being plotted can be VIEWed and a HELP button provides a description of the action of each button in a separate window. The PICT applet has been built on top of Leigh Brookshaw's 2D plotting package, the URL for which is given at the bottom of the HELP window. The features under the RESTART button are in his original code; I have just added the 2-PARA and 3-PARA features. The applet was developed using JDK1.0 beta on a PC with a 1600x1200 display and works on the PC both locally and from my Web page with appletviewer, MSIE 3.02 and Netscape 3.01. It has also been successfully run on a Solaris Sun with NS3.01, but another Sun user has reported no graphs and errors due to "wrong applet version".
So please report your experiences (both success and failure please) to me with all the details. To play with PICT turn your browser to: http://www.minnow.demon.co.uk/pict/source/pict1.html or pict1a.html. pict1.html asks for 1000x732 pixels and suits PCs best (it's about the minimum useful size). pict1a.html asks for 1020x900 pixels and was necessary for the whole applet to be visible on the Sun. For those wishing to look closer, all the source is provided and should be downloadable. Suggestions for improvement, corrections or constructive criticism are solicited. I have asked for an agenda item to be included for the Parkbench meeting on 11 Sept in Southampton so that PICT can be discussed. I look forward to seeing some of you there. -- Roger Hockney, University of Westminster UK. Checkout my new Web page at URL http://www.minnow.demon.co.uk and link to my new book: "The Science of Computer Benchmarking"; suggestions welcome. Know any fish movies or suitable links? From owner-parkbench-comm@CS.UTK.EDU Wed Sep 10 06:40:25 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id GAA21186; Wed, 10 Sep 1997 06:40:25 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA20806; Wed, 10 Sep 1997 06:31:06 -0400 (EDT) Received: from sun3.nsfnet-relay.ac.uk (sun3.nsfnet-relay.ac.uk [128.86.8.50]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id GAA20791; Wed, 10 Sep 1997 06:30:47 -0400 (EDT) Received: from bright.ecs.soton.ac.uk by sun3.nsfnet-relay.ac.uk with JANET SMTP (PP); Wed, 10 Sep 1997 11:30:44 +0100 Received: from landlord.ecs.soton.ac.uk by bright.ecs.soton.ac.uk; Wed, 10 Sep 97 11:32:57 BST From: Vladimir Getov Received: from bill.ecs.soton.ac.uk by landlord.ecs.soton.ac.uk; Wed, 10 Sep 97 11:33:16 BST Date: Wed, 10 Sep 97 11:33:13 BST Message-Id: <2458.9709101033@bill.ecs.soton.ac.uk> To: parkbench-lowlevel@CS.UTK.EDU, parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU Subject: ParkBench
Committee Meeting - tentative Agenda Dear Colleague, The ParkBench (Parallel Benchmark Working Group) will meet in Southampton, U.K. on September 11th, 1997 as part of the ParkBench Workshop. The Workshop site will be the County Hotel in Southampton. County Hotel Highfield Lane Southampton, U.K. Phone: +44-(0)1703-359955 Please send us your comments about the tentative agenda: 14:30 Finalize meeting agenda Minutes of last meeting (Erich Strohmaier) 14:45 Changes to Current release: - Low Level COMMS benchmarks (Vladimir Getov) - NAS Parallel Benchmarks (Subhash Saini) 15:15 New benchmarks: - HPF Low Level benchmarks (Mark Baker) 15:30 ParkBench Performance Analysis Tools: - ParkBench Result Templates (Vladimir Getov and Mark Papiani) - Visualization of Parallel Benchmark Results - new GBIS (Mark Papiani and Flavio Bergamaschi) - Interactive Web-page Curve-fitting of Parallel Performance Measurements (Roger Hockney) 16:15 Demonstrations: - Java Low-Level Benchmarks (Vladimir Getov) - BenchView: Java Tool for Visualization of Parallel Benchmark Results (Mark Papiani and Flavio Bergamaschi) - PICT: An Interactive Web-page Curve-fitting Tool (Roger Hockney) 16:45 Other activities: - "Electronic Benchmarking Journal" - status report (Mark Baker) Miscellaneous Date and venue for next meeting 17:00 Adjourn Tony Hey Vladimir Getov Erich Strohmaier From owner-parkbench-comm@CS.UTK.EDU Wed Sep 24 06:04:19 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id GAA23913; Wed, 24 Sep 1997 06:04:18 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id FAA23163; Wed, 24 Sep 1997 05:46:35 -0400 (EDT) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id FAA23156; Wed, 24 Sep 1997 05:46:26 -0400 (EDT) Received: from mordillo (pc297.sis.port.ac.uk) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA29780; Wed, 24 Sep 97 10:47:01 BST Date: 
Wed, 24 Sep 97 10:38:39 +0000 From: Mark Baker Subject: PC timers To: parkbench-comm@CS.UTK.EDU, parkbench-low-level@CS.UTK.EDU X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. X-Priority: 3 (Normal) Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Can someone suggest the appropriate PC-based timer function (MS Visual C++ or Digital Visual Fortran) to replace the usual gettimeofday call? Cheers Mark ------------------------------------- CSM, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 09/24/97 - Time: 10:38:39 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Thu Sep 25 10:11:01 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA20147; Thu, 25 Sep 1997 10:11:01 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA18087; Thu, 25 Sep 1997 09:24:56 -0400 (EDT) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA18080; Thu, 25 Sep 1997 09:24:53 -0400 (EDT) Received: from mordillo (pc297.sis.port.ac.uk) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA12457; Thu, 25 Sep 97 14:25:35 BST Date: Thu, 25 Sep 97 14:11:59 +0000 From: Mark Baker Subject: PC Time function To: parkbench-comm@CS.UTK.EDU X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. X-Priority: 3 (Normal) Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Thanks to all for timer info. I used the C function _ftime() in the end because it had millisec resolution. Just had to get my head around using INTERFACE in F90 to include the external C function.
I've inserted my version of the _ftime() timer below - I don't think there are any obvious errors in it :-) I also implemented the dflib F90 function CALL GETTIM(hour, min, sec, hund) - this function passed tick2 testing but only has 1/100 sec resolution.
-------------------------------------------------------
#include <sys/types.h>
#include <sys/timeb.h>   /* _ftime(), struct _timeb (Microsoft C runtime) */

/* Wall-clock time in seconds, with millisecond resolution. */
double dwalltime00(void)
{
    struct _timeb timebuf;
    _ftime( &timebuf );
    return (double) timebuf.time + (double) timebuf.millitm / 1000.0;
}

/* Same timer under the other external names a Fortran compiler may
   generate (trailing underscore, upper case). */
double dwalltime00_(void) { return dwalltime00(); }
double DWALLTIME00(void) { return dwalltime00(); }
-------------------------------------------------------
Cheers Mark ------------------------------------- CSM, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 09/25/97 - Time: 14:11:59 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Tue Oct 7 06:35:04 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id GAA26560; Tue, 7 Oct 1997 06:35:04 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA25697; Tue, 7 Oct 1997 06:10:11 -0400 (EDT) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA25668; Tue, 7 Oct 1997 06:09:40 -0400 (EDT) Received: from mordillo (pc297.sis.port.ac.uk) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA05125; Tue, 7 Oct 97 11:09:53 BST Date: Tue, 7 Oct 97 10:43:49 +0000 From: Mark Baker Subject: Workshop Papers To: "Aad J. van der Steen" , Charles Grassl , Clemens Thole , David Snelling , Erich Strohmaier , Grapham Nudd , Klaus Stueben , parkbench-comm@CS.UTK.EDU, Roger Hockney , Saini Subhash , Vladimir Getov , William Gropp X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc.
X-Priority: 3 (Normal) Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Dear All, I am now back in the office and have a small amount of time to follow up the Parkbench Workshop that took place a few weeks ago. I would firstly like to thank everyone who attended - especially all the speakers. Even though we did not attract hundreds of delegates to the workshop, I think the event was very successful - but I may be biased... So, the plans are that in the first instance I will collect the slides from all the speakers, package them up and put them on the PEMCS Web site. We also decided that we would encourage all the speakers to produce short papers on their talks and put all the workshop papers together to create a special issue of the PEMCS journal. Can the speakers therefore send me their slides (I would prefer PowerPoint or Word versions if possible). I will harass you further about short papers in the near future. Thanks in advance for your help. Regards Mark ------------------------------------- CSM, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 10/07/97 - Time: 10:43:49 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Sun Oct 12 09:55:57 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id JAA28908; Sun, 12 Oct 1997 09:55:57 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA08800; Sun, 12 Oct 1997 09:44:23 -0400 (EDT) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA08793; Sun, 12 Oct 1997 09:44:20 -0400 (EDT) Received: from mordillo (p26.nas4.is2.u-net.net) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA11347; Sun, 12 Oct 97 14:45:07 BST Date: Sun, 12 Oct 97 14:35:10 +0000 From: Mark Baker Subject: Equivalent to comms1 To: parkbench-comm@CS.UTK.EDU
X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. X-Priority: 3 (Normal) Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Can someone point me at the equivalent of comms1 written in C - either MPI or sockets (or even PVM if it's out there). Cheers Mark ------------------------------------- Dr Mark Baker CSM, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 10/12/97 - Time: 14:35:10 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Mon Oct 13 16:30:04 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id QAA17020; Mon, 13 Oct 1997 16:29:59 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id QAA24297; Mon, 13 Oct 1997 16:02:05 -0400 (EDT) Received: from dancer.cs.utk.edu (DANCER.CS.UTK.EDU [128.169.92.77]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id QAA24288; Mon, 13 Oct 1997 16:02:03 -0400 (EDT) From: Philip Mucci Received: by dancer.cs.utk.edu (cf v2.11c-UTK) id QAA02925; Mon, 13 Oct 1997 16:02:00 -0400 Date: Mon, 13 Oct 1997 16:02:00 -0400 Message-Id: <199710132002.QAA02925@dancer.cs.utk.edu> To: mab@sis.port.ac.uk, parkbench-comm@CS.UTK.EDU Subject: Re: Equivalent to comms1 In-Reply-To: X-Mailer: [XMailTool v3.1.2b] I would check out my mpbench on my web page.... It does PVM and MPI for now... > Can someone point me at the equivalent of comms1 written in > C - either MPI or sockets (or even PVM if it's out there). > > Cheers > > Mark > > > ------------------------------------- > Dr Mark Baker > CSM, University of Portsmouth, Hants, UK > Tel: +44 1705 844285 Fax: +44 1705 844006 > E-mail: mab@sis.port.ac.uk > Date: 10/12/97 - Time: 14:35:10 > URL http://www.sis.port.ac.uk/~mab/ > ------------------------------------- > /%*\ Philip J. Mucci | GRA in CS under Dr.
JJ Dongarra /*%\ \*%/ http://www.cs.utk.edu/~mucci PVM/Active Messages \%*/ From owner-parkbench-comm@CS.UTK.EDU Mon Oct 20 10:37:14 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA15359; Mon, 20 Oct 1997 10:37:14 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA07990; Mon, 20 Oct 1997 10:19:41 -0400 (EDT) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA07691; Mon, 20 Oct 1997 10:17:09 -0400 (EDT) Received: from mordillo (pc297.sis.port.ac.uk) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA16636; Mon, 20 Oct 97 15:17:33 BST Date: Mon, 20 Oct 97 15:02:39 +0000 From: Mark Baker Subject: PEMCS Short Article To: parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. X-Priority: 3 (Normal) Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Dear All, I've just put up (at last!!) the first PEMCS short article at http://hpc-journals.ecs.soton.ac.uk/PEMCS/Articles/ At the moment there is not much of a "house style" for the format of the papers and articles - this will hopefully be developed over the coming months. I expect to put the first full paper up on the Web in the next week or so. Comments, ideas and help with the journal and its Web site are most welcome. Regards Mark ------------------------------------------------------------------------------------------ COMPARING COMMUNICATION PERFORMANCE OF MPI ON THE CRAY RESEARCH T3E-600 AND IBM SP-2 by Glenn R. Luecke and James J. Coyle Iowa State University Ames, Iowa 50011-2251, USA Waqar ul Haque University of Northern British Columbia Prince George, British Columbia, Canada V2N 4Z9 Abstract This paper reports the performance of the Cray Research T3E and IBM SP-2 on a collection of communication tests that use MPI for the message passing.
These tests have been designed to evaluate the performance of communication patterns that we feel are likely to occur in scientific programs. Communication tests were performed for messages of sizes 8 Bytes (B), 1 KB, 100 KB, and 10 MB with 2, 4, 8, 16, 32 and 64 processors. Both machines provided a high level of concurrency for the nearest neighbor communication tests and moderate concurrency on the broadcast operations. On the tests used, the T3E significantly outperformed the SP-2 with most performance tests being at least three times faster than the SP-2. ------------------------------------- CSM, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 10/20/97 - Time: 15:02:42 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Sat Oct 25 08:52:33 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA12875; Sat, 25 Oct 1997 08:52:33 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id IAA05256; Sat, 25 Oct 1997 08:41:15 -0400 (EDT) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id IAA05244; Sat, 25 Oct 1997 08:41:05 -0400 (EDT) Received: from mordillo (p16.nas2.is2.u-net.net) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA01764; Sat, 25 Oct 97 13:41:26 BST Date: Sat, 25 Oct 97 13:27:24 +0000 From: Mark Baker Subject: Parkbench Workshop Talks - On line To: Chuck Koelbel , Clemens Thole , Grapham Nudd , Guy Robinson , Klaus Stueben , parkbench-comm@CS.UTK.EDU, William Gropp X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. X-Priority: 2 (High) Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Dear All, I have put the talks received so far up at... 
http://hpc-journals.ecs.soton.ac.uk/Workshops/PEMCS/fall-97/abstracts.html Could the speakers who have not yet passed their talks on to me please do so. Thanks in advance. Regards Mark ------------------------------------- Dr Mark Baker CSM, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 10/25/97 - Time: 13:27:25 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Fri Oct 31 08:22:47 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA19412; Fri, 31 Oct 1997 08:22:46 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA15140; Fri, 31 Oct 1997 07:44:09 -0500 (EST) Received: from post.mail.demon.net (post-20.mail.demon.net [194.217.242.27]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA15133; Fri, 31 Oct 1997 07:44:05 -0500 (EST) Received: from minnow.demon.co.uk ([158.152.73.63]) by post.mail.demon.net id aa2017784; 31 Oct 97 12:25 GMT Message-ID: Date: Fri, 31 Oct 1997 12:22:33 +0000 To: parkbench-comm@CS.UTK.EDU From: Roger Hockney Subject: Announcing PICT2 MIME-Version: 1.0 X-Mailer: Turnpike Version 3.03a ANNOUNCING PICT2 ++++++++++++++++ The prototype Parkbench Interactive Curve Fitting Tool (PICT1) that was demonstrated at the Southampton meeting of Parkbench in September was difficult to use on small screens because the image was too large and could not be reduced in size to suit the user's screen size. Sorry, I had developed it on my own 1600x1200 display without realising that most users considered 800x600 as large! Well, the new version PICT2 that is now on my web page allows for the full range of screen sizes: 640x480, 800x600, 1024x768, >=1600x1200, and also allows the user to customise his own display by selecting a font size and screen width and height. So the new version should be usable by all -- I hope!
Another problem at Southampton was that the display workstation was very old and too slow in MHz to do the job. I use a P133 Pentium and the graph lines move around instantly, but if you only have a 20MHz machine, for example, the response will probably be too slow to be useful for real interactive curve fitting. There is nothing I can do about this except to suggest that you use the need to use PICT as an excuse (I mean justification) to upgrade your equipment. PICT2 still relies on the use of New COMMS1 to compute the least-squares 2-para fit and the 3-point fit for the 3-para. The next step will be to put these features in PICT, but that is a fair amount of code to get right and I thought it best to solve the screen-size problem first. But remember the key point about PICT is that it allows interactive manual fitting and display that is not otherwise available. To try out PICT2 turn your browser to: http://www.minnow.demon.co.uk/pict/source/pict2a.html and follow the instructions. When you have a good PICT Frame displayed, press the HELP button for a description of the button actions. Please report problems, experiences (good and bad), suggestions to me at: roger@minnow.demon.co.uk I need feedback in order to improve the tool. Best wishes to you all Roger -- Roger Hockney, University of Westminster UK. Checkout my new Web page at URL http://www.minnow.demon.co.uk and link to my new book: "The Science of Computer Benchmarking"; suggestions welcome. Know any fish movies or suitable links?
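A note on the least-squares 2-parameter fit that the PICT2 announcement above leaves to New COMMS1: the sketch below is not taken from PICT or from the benchmark source. It assumes the usual Hockney timing model t = t0 + n/r_inf for a message of n bytes, so that n_half = t0 * r_inf, and fits it by ordinary unweighted least squares. The function name fit_two_param and its argument list are hypothetical, and the real benchmark may well weight or minimise the error differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PICT or COMMS1.
 * Assumed model: t = t0 + n / r_inf  (a straight line in n).
 * Fits the line by ordinary least squares and derives n_half = t0 * r_inf.
 * Returns 0 on success, -1 if the fit is undefined. */
int fit_two_param(const double *n, const double *t, size_t count,
                  double *t0, double *r_inf, double *n_half)
{
    assert(n != NULL && t != NULL);
    if (count < 2)
        return -1;                      /* need at least two points */

    double sn = 0.0, st = 0.0, snn = 0.0, snt = 0.0;
    for (size_t i = 0; i < count; i++) {
        sn  += n[i];
        st  += t[i];
        snn += n[i] * n[i];
        snt += n[i] * t[i];
    }

    double denom = (double)count * snn - sn * sn;
    if (denom == 0.0)
        return -1;                      /* all message lengths identical */

    double slope     = ((double)count * snt - sn * st) / denom;
    double intercept = (st - slope * sn) / (double)count;
    if (slope <= 0.0)
        return -1;                      /* no meaningful asymptotic rate */

    *t0     = intercept;                /* start-up time */
    *r_inf  = 1.0 / slope;              /* asymptotic bandwidth */
    *n_half = intercept / slope;        /* length giving half of r_inf */
    return 0;
}
```

Called with the (message length, time) pairs from a benchmark run, a routine like this would give starting values that could then be adjusted by hand in the way PICT allows.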
From owner-parkbench-comm@CS.UTK.EDU Tue Nov 11 06:21:05 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id GAA18373; Tue, 11 Nov 1997 06:21:05 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA27963; Tue, 11 Nov 1997 06:06:45 -0500 (EST) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA27930; Tue, 11 Nov 1997 06:06:15 -0500 (EST) Received: from mordillo (pc297.sis.port.ac.uk) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA23083; Tue, 11 Nov 97 11:07:22 GMT Date: Tue, 11 Nov 97 11:00:36 GMT From: Mark Baker Subject: Couple of Announcements To: parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. X-Priority: 3 (Normal) Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII A couple of announcements... Firstly, the majority of the papers presented at the Fall ParkBench Workshop on Thursday 11th/Friday 12th September 1997 at the University of Southampton are now on-line and can be found at... http://hpc-journals.ecs.soton.ac.uk/Workshops/PEMCS/fall-97/abstracts.html or from http://hpc-journals.ecs.soton.ac.uk/PEMCS/ by clicking on News in the left frame... Secondly, the first full paper for the electronic journal Performance Evaluation and Modelling of Computer Systems (PEMCS), "PERFORM - A Fast Simulator For Estimating Program Execution Time" by Alistair Dunlop and Tony Hey, Department of Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, U.K., can be found at... http://hpc-journals.ecs.soton.ac.uk/PEMCS/Papers/vol1.html See you all at the Parkbench BOF at SC'97...
Mark ------------------------------------- Dr Mark Baker CSM, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 11/11/97 - Time: 11:00:36 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Wed Nov 12 21:46:18 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id VAA14031; Wed, 12 Nov 1997 21:46:17 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id VAA06813; Wed, 12 Nov 1997 21:31:03 -0500 (EST) Received: from rudolph.cs.utk.edu (RUDOLPH.CS.UTK.EDU [128.169.92.87]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id VAA06806; Wed, 12 Nov 1997 21:31:01 -0500 (EST) Received: from localhost by rudolph.cs.utk.edu with SMTP (cf v2.11c-UTK) id VAA24812; Wed, 12 Nov 1997 21:31:01 -0500 Date: Wed, 12 Nov 1997 21:31:00 -0500 (EST) From: Erich Strohmaier To: parkbench-hpf@CS.UTK.EDU, parkbench-lowlevel@CS.UTK.EDU, parkbench-comm@CS.UTK.EDU Subject: ParkBench BOF session at the SC'97 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Dear Colleague, The ParkBench (PARallel Kernels and BENCHmarks) committee has organized a BOF session at the SC'97 in San Jose. Room: Convention Center Room C1 Time: Wednesday 5:30pm We will talk about the latest release, new results available and future plans. Tentative Agenda of the BOF - Introduction, background, WWW-Server - Current Release of ParkBench - Low Level Performance Evaluation Tools - LinAlg Kernel Benchmarks - NAS Parallel Benchmarks, including latest results - Plans for the next Release - Electronic Journal of Performance Evaluation and Modeling for Computer Systems - Questions from the floor / discussion Please mark your calendar and plan to attend. 
Jack Dongarra Tony Hey Erich Strohmaier From owner-parkbench-comm@CS.UTK.EDU Thu Nov 13 06:31:53 1997 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id GAA07105; Thu, 13 Nov 1997 06:31:52 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id FAA01880; Thu, 13 Nov 1997 05:56:05 -0500 (EST) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id FAA01835; Thu, 13 Nov 1997 05:55:18 -0500 (EST) Received: from mordillo (p19.nas2.is2.u-net.net) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA18430; Thu, 13 Nov 97 10:56:11 GMT Date: Thu, 13 Nov 97 10:48:53 GMT From: Mark Baker Subject: Fall 97 Parkbench Committee Meeting Minutes To: parkbench-comm@CS.UTK.EDU, parkbench-hpf@CS.UTK.EDU, parkbench-lowlevel@CS.UTK.EDU X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. X-Priority: 3 (Normal) References: Message-Id: Mime-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="mordillo:879418490:877:126:21579" --mordillo:879418490:877:126:21579 Content-Type: TEXT/PLAIN; charset=US-ASCII Dear All, Here are the minutes of the Parkbench committee meeting held at The County Hotel in Southampton during the Fall 97 Parkbench Workshop. For those of you with a MIME-compliant mail-reader I've attached a formatted Word 7 doc. Regards Mark ----------------------------------------------------------------------------- Parkbench Committee Meeting Held during the Fall Parkbench Workshop The County Hotel Southampton, UK 1515, 11th September 1997 Meeting Participation List: Mark Baker - Univ. of Portsmouth (mab@sis.port.ac.uk) Flavio Bergamaschi - Univ. of Southampton (fab@ecs.soton.ac.uk) Jack Dongarra - Univ. of Tenn./ORNL (dongarra@cs.utk.edu) Vladimir Getov - Univ. of Westminster (getovv@wmin.ac.uk) Charles Grassl - SGI/Cray (cmg@cray.com) William Gropp - ANL (gropp@mcs.anl.gov) Tony Hey - Univ.
of Southampton (ajgh@ecs.soton.ac.uk) Roger Hockney - Univ. of Westminister (roger@minnow.demon.co.uk) Mark Papiani - Univ of Southampton (mp@ecs.soton.ac.uk) Subhash Saini - NASA Ames (saini@nas.nasa.gov) Dave Snelling - FECIT (snelling@fecit.co.uk) Aad J. van der Steen - RUU (steen@fys.ruu.nl) Erich Strohmaier - Univ. of Tennessee (erich@cs.utk.edu) Klaus Stueben - GMD (klaus.stueben@gmd.de) Meeting Activities and Actions Tony Hey chaired the meeting. Minutes from last meeting were seven pages long and it was decided that only the actions from the last meeting would be reviewed. The actions from last meeting were reviewed - a short discussion about each took place. A discussion about interaction with SPEC-HPG was initiated. Comms Low-Level Benchmarks Vladimir Getov gave a short presentation on the current status of the Parkbench Comms benchmarks. Charles Grassl was asked to explained how his new Comms programs worked and the rationale behind it. A long discussion ensued. Action - Create a formal proposal of alternative or additions to the comms low-level benchmarks for SC'97 - Charles Grassl. Action - Members should look at the PALLAS version of the low-level benchmarks (based on Genesis/RAPS). Action - Erich Strohmaier and Vladimir Getov will discuss the efforts needed to split up Parkbench and add in the new Comms1 benchmark (with new curve fitting routine). NPB - Subhash Siani reported on the status of the NAS Parallel Benchmarks HPF - Mark Baker read Chuck Koebel's email about CEWES HPCM HPF efforts. Action - Subhash Siani will let RICE know that Gina should start of from the single NAS codes Electronic Journal - Mark Baker and Tony Hey reported on the electronic journal PEMCS and its Web site. It was agreed that this would be discussed further informally. Parkbench Report -Erich Strohmaier reported on the efforts of creating a new Parkbench report. A short discussion about this ensued. 
Action - Jack Dongarra /Tony Hey will talk to other members about the potential efforts that could be put into a Parkbench report II by SC'97. Funding Efforts Jack Dongarra's recent benchmarking proposal was turned down. Tony Hey mentioned the possibly of entering a proposal to the EU. Possibility of a joint EU / NSF bid. Mark Baker asked if SIO would be interested in being more closely involved. William Gropp reported that SIO was actually winding down and so formal association was not really an option. AOB The participants were then invited by Tony to move to the University of Southampton (bldg. 16) for the Parkbench demonstrations which included: -- Java Low-Level Benchmarks (Vladimir Getov) -- BenchView: Java Tool for Visualization of Parallel Benchmark Results (Mark Papiani and Flavio Bergamaschi) -- PICT: An Interactive Web-page Curve-fitting Tool (Roger Hockney) Jack Dongarra informed the committee of Parkbench BOF at SC'97 (Wednesday at 3.30PM). The meeting was wound up by Tony Hey at 1630. 
-----------------------------------------------------------------------------

-------------------------------------
CSM, University of Portsmouth, Hants, UK
Tel: +44 1705 844285  Fax: +44 1705 844006
E-mail: mab@sis.port.ac.uk
Date: 11/13/97 - Time: 10:48:53
URL http://www.sis.port.ac.uk/~mab/
-------------------------------------

--mordillo:879418490:877:126:21579
Content-Type: APPLICATION/msword; name="minutes-fall-97.doc"
Content-Transfer-Encoding: BASE64
Content-Description: minutes-fall-97.doc

[minutes-fall-97.doc: BASE64-encoded Word copy of the minutes above, not reproduced]

--mordillo:879418490:877:126:21579--

From owner-parkbench-comm@CS.UTK.EDU Mon Nov 17 08:32:09 1997
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA28026; Mon, 17
    Nov 1997 08:32:09 -0500
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK)
    id HAA07698; Mon, 17 Nov 1997 07:58:13 -0500 (EST)
Received: from post.mail.demon.net (post-20.mail.demon.net [194.217.242.27]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK)
    id HAA07665; Mon, 17 Nov 1997 07:57:54 -0500 (EST)
Received: from minnow.demon.co.uk ([158.152.73.63]) by post.mail.demon.net
    id aa2024828; 17 Nov 97 12:43 GMT
Message-ID: <06u4dCAfsDc0Ew8p@minnow.demon.co.uk>
Date: Mon, 17 Nov 1997 12:39:59 +0000
To: parkbench-comm@CS.UTK.EDU
From: Roger Hockney
Subject: To the PARKBENCH97 BOF
MIME-Version: 1.0
X-Mailer: Turnpike Version 3.03a

GREETINGS TO THE PARKBENCH 1997 BOF
-----------------------------------

I am not able to attend the Parkbench BOF this year but would like to make the following input:

Chairman: Please express my apologies for absence to the meeting.

Agenda Item: Low-Level Performance Evaluation tools.
--------------------------------------

The latest version of the Parkbench Interactive Curve Fitting Tool (PICT2) is on my Web page at:

http://www.minnow.demon.co.uk/pict/source/pict2a.html

I believe that this solves the problem of displaying on different sized screens. Please try it and give me feedback (I have had little so far, so I don't know how worthwhile it is!). This plots and allows manual interactive curve fitting of data anywhere on the Web in raw-data, Original COMMS1, and New COMMS1 format. However, it still relies on COMMS1 calculating the least squares 2-Para and 3-Point 3-Para fits.

Agenda Item : Plans for the next Release.
--------------------------

Just a reminder that New COMMS1, as announced in my email to the committee of 16 Feb 1997, was designed as the minimum necessary changes to the existing release to solve the problems raised at the beginning of the year. It involves new versions of 5 routines and 2 new routines. In addition, the Make files need the 2 new routines added where appropriate. We have incorporated these changes at Westminster in the existing release without trouble. I believe that these should be incorporated in the next release. In summary:

New COMMS1

In directory: http://www.minnow.demon.co.uk/Pbench/comms1/

The 5 Changed Routines:
(1) File COMMS1_1.F replaces ParkBench/Low_Level/comms1/src_mpi/COMMS1.f
(2) File COMMS1_1.INC replaces ParkBench/Low_Level/comms1/src_mpi/comms1.inc
(3) File ESTCOM_1.F replaces ParkBench/Low_Level/comms1/src_mpi/ESTCOM.f
(4) File LSTSQ_1.F replaces ParkBench/lib/Low_Level/LSTSQ.f
(5) File CHECK_1.F replaces Parkbench/lib/Low_Level/CHECK.f

The 2 New Routines:
(6) File LINERR_1.F add as ParkBench/lib/Low_Level/LINERR.f
(7) File VPOWER_1.F add as ParkBench/lib/Low_Level/VPOWER.f

HAVE A NICE MEETING, and best wishes to you all,

Roger Hockney

--
Roger Hockney.    Checkout my new Web page at URL http://www.minnow.demon.co.uk
University of     and link to my new book: "The Science of Computer Benchmarking"
Westminster UK    suggestions welcome. Know any fish movies or suitable links?
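
The file list in the message above can be read as a simple copy plan. What follows is a minimal sketch in Python, not part of the ParkBench distribution: the names `NEW_COMMS1_FILES` and `install_new_comms1` and both directory arguments are hypothetical, and it assumes the seven files have already been downloaded into one local directory. Destinations are written relative to the top of the ParkBench tree. The sketch does not edit the Make files, which the message says still need the two new routines added.

```python
import shutil
from pathlib import Path

# (downloaded file name, destination relative to the top of the ParkBench tree),
# transcribed from the list of 5 changed and 2 new routines in the message.
NEW_COMMS1_FILES = [
    ("COMMS1_1.F", "Low_Level/comms1/src_mpi/COMMS1.f"),
    ("COMMS1_1.INC", "Low_Level/comms1/src_mpi/comms1.inc"),
    ("ESTCOM_1.F", "Low_Level/comms1/src_mpi/ESTCOM.f"),
    ("LSTSQ_1.F", "lib/Low_Level/LSTSQ.f"),
    ("CHECK_1.F", "lib/Low_Level/CHECK.f"),
    ("LINERR_1.F", "lib/Low_Level/LINERR.f"),  # new routine
    ("VPOWER_1.F", "lib/Low_Level/VPOWER.f"),  # new routine
]


def install_new_comms1(download_dir, parkbench_root, dry_run=True):
    """Return the (source, target) pairs of the copy plan.

    With dry_run=True nothing is written; with dry_run=False each
    downloaded file is copied over (or added at) its target path.
    """
    planned = []
    for source_name, destination in NEW_COMMS1_FILES:
        source = Path(download_dir) / source_name
        target = Path(parkbench_root) / destination
        planned.append((source, target))
        if not dry_run:
            # Hypothetical convenience: create the directory if it is missing.
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(source, target)
    return planned
```

Calling `install_new_comms1("downloads", "ParkBench")` first, with the default dry run, lists what would be overwritten before anything in the release tree is changed.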
From owner-parkbench-comm@CS.UTK.EDU Mon Dec 1 08:38:55 1997
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib)
    id IAA05062; Mon, 1 Dec 1997 08:38:55 -0500
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK)
    id IAA20432; Mon, 1 Dec 1997 08:03:34 -0500 (EST)
Received: from hermes.lsi.usp.br (hermes.lsi.usp.br [143.107.161.220]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK)
    id IAA20425; Mon, 1 Dec 1997 08:03:30 -0500 (EST)
Received: from cali.lsi.usp.br (cali.lsi.usp.br [10.0.161.7]) by hermes.lsi.usp.br (8.8.5/8.7.3) with SMTP
    id LAA05866; Mon, 1 Dec 1997 11:03:20 -0200 (BDB)
Message-ID: <34830ABD.487C@lsi.usp.br>
Date: Mon, 01 Dec 1997 11:06:37 -0800
From: Martha Torres
Organization: LSI
X-Mailer: Mozilla 3.01Gold (Win95; I)
MIME-Version: 1.0
To: parkbench-comm@CS.UTK.EDU
CC: mxtd@lsi.usp.br
Subject: compiling ParkBench for MPICH
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Sirs
ParkBench Committee

Dear Sirs,

I am a Ph.D. student working with collective communication operations. In particular, I am interested in quantifying the influence of collective communication operations on the total execution time of several MPI programs. My platform is a cluster of 8 Dual Pentium Pro processors interconnected by 100 Mb/s Fast Ethernet. I use MPICH version 1.1 and the fort77 and cc compilers.

I have downloaded ParkBench.tar from netlib. I followed all the instructions, but some programs did not work:

1. Low_Level/poly1 poly2 rinf1 tick1 tick2
   They did not compile. The following error appears:
   ParkBench/lib/LINUX/ParkBench_misc.a: No such file or directory.
   How do I create this library?

2. Kernels/LU_solver QR TRD
   They also did not compile. The following error appears:
   ParkBench/lib/LINUX/pblas_subset.a: In function 'pberror_'
   undefined reference to 'blacs_gridinfo_'
   undefined reference to 'blacs_abort_'

3. Comp_Apps/PSTSWM and Kernels/MATMUL
   They compiled but did not run.

Thanks in advance,
Best Regards

Martha Torres
Laboratorio de Sistema Integraveis
University of Sao Paulo
Sao Paulo - S.P. Brazil

From owner-parkbench-comm@CS.UTK.EDU Wed Jan 7 16:49:19 1998
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib)
    id QAA19963; Wed, 7 Jan 1998 16:49:19 -0500
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK)
    id QAA17461; Wed, 7 Jan 1998 16:30:05 -0500 (EST)
Received: from timbuk.cray.com (timbuk-fddi.cray.com [128.162.8.102]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK)
    id QAA17452; Wed, 7 Jan 1998 16:30:02 -0500 (EST)
Received: from ironwood.cray.com (root@ironwood-fddi.cray.com [128.162.21.36]) by timbuk.cray.com (8.8.7/CRI-gate-news-1.3) with ESMTP
    id PAA16817 for ; Wed, 7 Jan 1998 15:30:03 -0600 (CST)
Received: from magnet.cray.com (magnet [128.162.173.162]) by ironwood.cray.com (8.8.4/CRI-ironwood-news-1.0) with ESMTP
    id PAA27253; Wed, 7 Jan 1998 15:30:00 -0600 (CST)
From: Charles Grassl
Received: by magnet.cray.com (8.8.0/btd-b3)
    id VAA26077; Wed, 7 Jan 1998 21:29:59 GMT
Message-Id: <199801072129.VAA26077@magnet.cray.com>
Subject: Low Level benchmarks
To: parkbench-comm@CS.UTK.EDU
Date: Wed, 7 Jan 1998 15:29:59 -0600 (CST)
X-Mailer: ELM [version 2.4 PL24-CRI-d]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

--
Charles Grassl

From owner-parkbench-comm@CS.UTK.EDU Wed Jan 7 16:56:40 1998
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib)
    id QAA19981; Wed, 7 Jan 1998 16:56:40 -0500
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK)
    id QAA17784; Wed, 7 Jan 1998 16:36:27 -0500 (EST)
Received: from timbuk.cray.com (timbuk-fddi.cray.com [128.162.8.102]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK)
    id QAA17776; Wed, 7 Jan 1998 16:36:24 -0500 (EST)
Received: from ironwood.cray.com
(root@ironwood-fddi.cray.com [128.162.21.36]) by timbuk.cray.com (8.8.7/CRI-gate-news-1.3) with ESMTP id PAA17087 for ; Wed, 7 Jan 1998 15:36:24 -0600 (CST) Received: from magnet.cray.com (magnet [128.162.173.162]) by ironwood.cray.com (8.8.4/CRI-ironwood-news-1.0) with ESMTP id PAA28449 for ; Wed, 7 Jan 1998 15:36:22 -0600 (CST) Received: from magnet by magnet.cray.com (8.8.0/btd-b3) via SMTP id VAA26107; Wed, 7 Jan 1998 21:36:21 GMT Sender: cmg@cray.com Message-ID: <34B3F553.167E@cray.com> Date: Wed, 07 Jan 1998 15:36:19 -0600 From: Charles Grassl Organization: Cray Research X-Mailer: Mozilla 3.01SC-SGI (X11; I; IRIX 6.2 IP22) MIME-Version: 1.0 To: parkbench-comm@CS.UTK.EDU Subject: Low Level benchmark errors and differences Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit To: Parkbench Low Level interests From: Charles Grassl Subject: Low Level benchmark errors and differences Date: 7 January, 1998 We should not produce or publish Parkbench Low level benchmark results with the current suite of programs because the programs are inaccurate and unreliable. I ran the Low Level programs and compared the results with the same metrics as recorded from other benchmark programs. The differences range from less than 5% (acceptable) to a factor of 6 times difference, which is unacceptable. The differences, or "errors", are summarized in the table below. The recorded differences in results from the Low Level program were arrived at by comparing the Parkbench program reported metrics with the same metrics as measured by alternative programs. Table. Differences in Low Level benchmark results for two systems. System A is an Origin 2000. System B is a CRAY T3E. 
System A System B
Rinf Startup Rinf Startup
-----------------------------------------
COMMS1 <10% 6x <5% 6x
COMMS2 2x 3x <5% <5%
COMMS3 <5% <5%
POLY1 <5% 60% 2x <5%
POLY2 <5% 60% 2x <5%
POLY3 - - 2x 80x

The Parkbench Low Level programs are occasionally requested for benchmarking computer systems, but the results are usually rejected because of their inaccuracy and unreliability. If not rejected, they cause confusion and consternation because the results do not agree with other measurements of the same variables. I emphasize that this is not a case of obtaining optimization and favorable results for a computer system. The problem is with the inaccuracy and unreliability of the results.

The Low Level programs measure and report low level parameters. Therefore their value is in accuracy and utility. The programs do not constitute definitions of the reported metrics and hence the results should correlate with other measurements of the same variables.

The Low Level programs are obsolete and need to be replaced. I have written seven simple programs, with MPI and PVM versions, and offer them as a replacement for the Low Level suite.

I strongly suggest that we delete or withdraw from distribution the current Low Level suite.
From owner-parkbench-comm@CS.UTK.EDU Thu Jan 8 05:40:28 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id FAA01529; Thu, 8 Jan 1998 05:40:28 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id FAA00442; Thu, 8 Jan 1998 05:20:21 -0500 (EST) Received: from sun1.ccrl-nece.technopark.gmd.de (sun1.ccrl-nece.technopark.gmd.de [193.175.160.67]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id FAA00380; Thu, 8 Jan 1998 05:20:13 -0500 (EST) Received: from sgi7.ccrl-nece.technopark.gmd.de (sgi7.ccrl-nece.technopark.gmd.de [193.175.160.89]) by sun1.ccrl-nece.technopark.gmd.de (8.7/3.4W296021412) with SMTP id LAA28869; Thu, 8 Jan 1998 11:20:05 +0100 (MET) Received: (from hempel@localhost) by sgi7.ccrl-nece.technopark.gmd.de (950413.SGI.8.6.12/950213.SGI.AUTOCF) id LAA24864; Thu, 8 Jan 1998 11:18:48 +0100 Date: Thu, 8 Jan 1998 11:18:48 +0100 From: hempel@ccrl-nece.technopark.gmd.de (Rolf Hempel) Message-Id: <199801081018.LAA24864@sgi7.ccrl-nece.technopark.gmd.de> To: parkbench-comm@CS.UTK.EDU Subject: Low Level benchmark errors and differences Cc: ritzdorf@ccrl-nece.technopark.gmd.de, zimmermann@ccrl-nece.technopark.gmd.de, clantwin@ess.nec.de, eckhard@ess.nec.de, lonsdale@ccrl-nece.technopark.gmd.de, tbeckers@ess.nec.de Reply-To: hempel@ccrl-nece.technopark.gmd.de To: Parkbench Low Level interests From: Rolf Hempel Subject: Low Level benchmark errors and differences, Note from Charles Grassl of January 7th Date: 8 January, 1998 Thank you, Charles, for your note on the Low Level benchmarks. It could not have come at a better time, because at NEC we just recently ran into problems with COMMS1. This code had been specified by a customer as a test case in a current procurement. When we ran COMMS1 with our current MPI library, the results for rinfinity and latency were completely wrong. 
In particular, the latency values were off by more than a factor of two, when compared with other ping-pong test programs. The following turned out to be the main reasons for the errors: 1. The performance model is completely inadequate. A linear dependency between time and message length, fitted to the measurements by least squares, is bound to fail in the presence of discontinuities caused by protocol changes. Most MPI implementations change protocols for different message lengths for an overall performance optimization. 2. To make things worse, the least square fit overweighs the data points for very long messages, because the differences "model minus measurement" are largest there in absolute terms. The fitted line, therefore, more or less ignores the short message measurements. As a result, the latencies are completely up to chance. 3. The correction for internal measurement overhead (e.g., for subroutine calls) is programmed in a sloppy way, to say the least. We discovered several subroutine calls which were not taken into account, and the overhead is measured with low precision. For our implementation, this alone introduced a latency error of about 25%. The result in our case was that, instead of the 13.5 usec latency measured by the MPICH MPPTEST routine, COMMS1 initially reported some 28 usec. My colleague Hubert Ritzdorf then made an interesting experiment: he removed some optimization from our MPI library for long messages, thus INCREASING the communication times for messages longer than 128000 bytes, and not changing anything for shorter messages. The resulting DROP in latency from 28 to under 22 usec clearly shows how ridiculous the COMMS1 benchmark is. Thus, I strongly agree with Charles in that the COMMS* benchmarks must be removed from PARKBENCH. They don't help anybody, and they only cause confusion on the side of customers and frustration on the side of benchmarkers. Let's get rid of this long-standing nuisance as quickly as possible. 
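The effect described in points 1 and 2, and the direction of the change seen in Hubert Ritzdorf's experiment, can be reproduced qualitatively with made-up numbers. The sketch below is not COMMS1 code. It assumes an ordinary (absolute-error) least squares fit of t = t_0 + n/rinf, and the message lengths, timings, protocol switch at 1000 bytes and 20% slowdown are all invented for illustration.

```python
# Sketch only: synthetic data, not COMMS1 output. Times are in microseconds.

def fit_line(lengths, times):
    """Ordinary least squares fit of t = t0 + n / rinf; returns (t0, rinf)."""
    count = len(lengths)
    mean_n = sum(lengths) / count
    mean_t = sum(times) / count
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in zip(lengths, times))
    sxx = sum((n - mean_n) ** 2 for n in lengths)
    slope = sxy / sxx
    return mean_t - slope * mean_n, 1.0 / slope


def synthetic_time(n, long_penalty=1.0):
    """Invented two-protocol timing model with a switch at 1000 bytes."""
    if n <= 1000:
        return 10.0 + n / 100.0      # short protocol: 10 us startup
    t = 50.0 + n / 200.0             # long protocol: 50 us startup
    # Optionally slow down only messages longer than 128000 bytes.
    return t * long_penalty if n > 128000 else t


lengths = [1, 10, 100, 1000, 10_000, 100_000, 1_000_000]
times_fast = [synthetic_time(n) for n in lengths]
times_slow = [synthetic_time(n, long_penalty=1.2) for n in lengths]

t0_fast, rinf_fast = fit_line(lengths, times_fast)
t0_slow, rinf_slow = fit_line(lengths, times_slow)

print(f"shortest message time : {times_fast[0]:.2f} us")
print(f"fitted t0 (original)  : {t0_fast:.2f} us")
print(f"fitted t0 (slowed)    : {t0_slow:.2f} us")
```

On this invented data the fitted t_0 is more than twice the time of the shortest message, and slowing down only the longest message lowers the fitted t_0 further, even though no short-message timing changed.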
Best regards,
Rolf Hempel
------------------------------------------------------------------------
Rolf Hempel (email: hempel@ccrl-nece.technopark.gmd.de)
Senior Research Staff Member
C&C Research Laboratories, NEC Europe Ltd.,
Rathausallee 10, 53757 Sankt Augustin, Germany
Tel.: +49 (0) 2241 - 92 52 - 95
Fax: +49 (0) 2241 - 92 52 - 99

From owner-parkbench-comm@CS.UTK.EDU Thu Jan 8 08:07:54 1998
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA02383; Thu, 8 Jan 1998 08:07:53 -0500
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA05392; Thu, 8 Jan 1998 07:50:13 -0500 (EST)
Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA05383; Thu, 8 Jan 1998 07:50:03 -0500 (EST)
Received: from mordillo (p108.nas1.is4.u-net.net) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA03072; Thu, 8 Jan 98 12:48:32 GMT
Date: Thu, 8 Jan 98 12:10:55 GMT
From: Mark Baker
Subject: Re: Low Level benchmark errors and differences
To: Charles Grassl , parkbench-comm@CS.UTK.EDU
X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc.
X-Priority: 3 (Normal)
References: <34B3F553.167E@cray.com>
Message-Id:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

I am in agreement with Charles and Rolf about the low-level codes. We've known for some time that they (the codes) are less than perfect, if not in some cases flawed.

At the SC'97 Parkbench meeting it was mooted that Parkbench should concentrate on producing, supporting, analysing and recording Low-Level codes and results. If this is the case then we should certainly ensure that we support codes that are soundly written and produce consistent and reliable results.

I certainly believe that a set of codes, akin to the low-level ones, should be part of the Parkbench suite.
Maybe this is a good time to replace the current codes with those that Charles has produced!?

As a side issue, I think we should produce C versions of whatever low-level codes we produce.

Charles, I'd be interested in your thoughts on the codes that Pallas produce - ftp://ftp.pallas.de/pub/PALLAS/PMB/PMB10.tar.gz. These are C benchmark codes that run:

PingPong - like comms1
PingPing - like comms2
Xover
Cshift
Exchange
Allreduce
Bcast
Barrier - like synch1

Obviously, I wouldn't like to comment on how well written they are or how reliable the results that they produce are. I'm relatively impressed with them. I also like the fact they try to produce results for commonly used MPI functions - cshift/exchange/etc.

I've run the codes on NT boxes and they appear to produce results close to what I would expect.

Regards
Mark

--- On Wed, 07 Jan 1998 15:36:19 -0600 Charles Grassl wrote:
> To: Parkbench Low Level interests
> From: Charles Grassl
>
> Subject: Low Level benchmark errors and differences
>
> Date: 7 January, 1998
>
>
> We should not produce or publish Parkbench Low level benchmark results
> with the current suite of programs because the programs are inaccurate
> and unreliable. I ran the Low Level programs and compared the results
> with the same metrics as recorded from other benchmark programs.
> The differences range from less than 5% (acceptable) to a factor of 6
> times difference, which is unacceptable.
>
> The differences, or "errors", are summarized in the table below.
> The recorded differences in results from the Low Level program were
> arrived at by comparing the Parkbench program reported metrics with the
> same metrics as measured by alternative programs.
>
>
> Table. Differences in Low Level benchmark results
> for two systems. System A is an Origin 2000.
> System B is a CRAY T3E.
>
> System A System B
> Rinf Startup Rinf Startup
> -----------------------------------------
> COMMS1 <10% 6x <5% 6x
> COMMS2 2x 3x <5% <5%
> COMMS3 <5% <5%
> POLY1 <5% 60% 2x <5%
> POLY2 <5% 60% 2x <5%
> POLY3 - - 2x 80x
>
>
> The Parkbench Low Level programs are occasionally requested for
> benchmarking computer systems, but the results are usually rejected
> because of their inaccuracy and unreliability. If not rejected, they
> cause confusion and consternation because the results do not agree
> with other measurements of the same variables. I emphasize that this
> is not a case of obtaining optimization and favorable results for a
> computer system. The problem is with the inaccuracy and unreliability
> of the results.
>
> The Low Level programs measure and report low level parameters.
> Therefore their value is in accuracy and utility. The programs do not
> constitute definitions of the reported metrics and hence the results
> should correlate with other measurements of the same variables.
>
> The Low Level programs are obsolete and need to be replaced. I have
> written seven simple programs, with MPI and PVM versions, and offer them
> as a replacement for the Low Level suite.
>
> I strongly suggest that we delete or withdraw from distribution the
> current Low Level suite.
>
---------------End of Original Message-----------------

-------------------------------------
CSM, University of Portsmouth, Hants, UK
Tel: +44 1705 844285 Fax: +44 1705 844006
E-mail: mab@sis.port.ac.uk
Date: 01/08/98 - Time: 12:10:55
URL http://www.sis.port.ac.uk/~mab/
-------------------------------------

From owner-parkbench-comm@CS.UTK.EDU Mon Jan 12 16:02:28 1998
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id QAA26216; Mon, 12 Jan 1998 16:02:28 -0500
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA16631; Mon, 12 Jan 1998 15:38:05 -0500 (EST)
Received: from post.mail.demon.net (post-20.mail.demon.net [194.217.242.27]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA16588; Mon, 12 Jan 1998 15:37:38 -0500 (EST)
Received: from minnow.demon.co.uk ([158.152.73.63]) by post.mail.demon.net id aa2012292; 12 Jan 98 17:34 GMT
Message-ID:
Date: Mon, 12 Jan 1998 17:33:01 +0000
To: hempel@ccrl-nece.technopark.gmd.de
Cc: parkbench-comm@CS.UTK.EDU, ritzdorf@ccrl-nece.technopark.gmd.de, zimmermann@ccrl-nece.technopark.gmd.de, clantwin@ess.nec.de, eckhard@ess.nec.de, lonsdale@ccrl-nece.technopark.gmd.de, tbeckers@ess.nec.de
From: Roger Hockney
Subject: Re: Low Level benchmark errors and differences
In-Reply-To: <199801081018.LAA24864@sgi7.ccrl-nece.technopark.gmd.de>
MIME-Version: 1.0
X-Mailer: Turnpike Version 3.03a

To: Rolf, Charles, Mark and others,
From: Roger

I too am distressed to see the original COMMS1 code (written and tested for message lengths only up to 10^4) is still being issued by Parkbench and being used well outside its range of proven validity (message lengths now typically up to 10^7 or even 10^8).

These problems were pointed out about one year ago by Charles and Ron, and as a result I worked on the code and issued to the committee a minimum set of changes to the current release that would solve many of the problems.
These involve replacing five existing routines and adding two to the existing release. The routines involved have been downloadable from my Web site since about 12 March 1997 and have been used successfully at Westminster University in our work. The New COMMS1, as I called it, was the subject of two printed reports to the May 1997 meeting of Parkbench and further results were shown at the Sept 1997 meeting. There were also extensive discussions in this email group during 1997.

Unfortunately my simple fixes were not inserted into the Parkbench release and as a result we are still getting a bad press from benchmarkers. After all the effort I put into solving this problem a year ago, I feel rather let down that my work was never used. If my changes had been incorporated into the Parkbenchmarks when they were offered, at least as an interim measure, I believe we could have avoided much of the current bad publicity.

I emphasise that the New COMMS1 was written as a minimum patch to the existing release to solve an urgent problem in the simplest way. I am not against a complete rethink of the low level benchmarks and, now that MPI has become a recognised standard, benchmarks timing the principal software primitives of MPI would seem to be the most useful. Quite possibly Charles's or Mucci's codes could be used.

However, I am still firmly convinced of the value of approximate parametric representation of all the benchmark measurements based on a simple performance model. Most of the existing low-level benchmarks were written primarily to determine such parameters and hence include both raw measurements and least squares curve fitting to obtain the parameters. I have yet to see data that cannot be satisfactorily fitted by 2 or 3 parameters, or by two sets of 2 parameters. And remember that I am talking here about fitting ALL the measured data by some simple formulae.
After the decision of the May 1997 meeting to separate the raw measurements from the parametric curve fitting, the curve fitting will eventually become part of the "Parkbench Interactive Curve Fitting Tool" (PICT). At present this applet can be used to produce a manual curve fit, but eventually I will put up on my Web site a version in which the least squares and 3-point buttons are active. But PICT as it is can now be used manually to see how good or bad the 2-para and 3-para fits are. Turn your browser to: http://www.minnow.demon.co.uk/pict/source/pict2a.html and insert your raw data. I would be very interested to see what the NEC data looks like. To answer some of Rolf's points: Rolf Hempel writes > >1. The performance model is completely inadequate. A linear dependency > between time and message length, fitted to the measurements by > least squares, is bound to fail in the presence of discontinuities > caused by protocol changes. Most MPI implementations change > protocols for different message lengths for an overall performance > optimization. > Note that the original COMMS1 that you are using allows you to insert one break point to take account of one major discontinuity. Have you tried this? In any case, to make t_0 a good measure of startup it is sensible ALWAYS to make a breakpoint at say 100 or 1000 Byte, then the short message t_0 should be a good measure of startup. The long message t_0 is then not of interest and should be ignored. In this way one is using the straight- line fit over a short range of lengths, and the resulting t_0 should be a better estimate of latency because it is derived from several measurements rather than just selecting a single measurement (e.g. the time for the shortest message) -- surely a better experimental procedure. I emphasise that this procedure can be used now with the original COMMS1 to get sensible results. 
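The breakpoint procedure described above can be illustrated with a short sketch. This is not the COMMS1 implementation or its input format: `fit_with_breakpoint` is a hypothetical helper, the timings are invented, and an ordinary least squares straight line is assumed on each side of the breakpoint.

```python
# Sketch only: synthetic data and hypothetical helper names.
# Times are in microseconds, lengths in bytes.

def fit_line(lengths, times):
    """Least squares fit of t = t0 + n / rinf; returns (t0, rinf)."""
    count = len(lengths)
    mean_n = sum(lengths) / count
    mean_t = sum(times) / count
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in zip(lengths, times))
    sxx = sum((n - mean_n) ** 2 for n in lengths)
    slope = sxy / sxx
    return mean_t - slope * mean_n, 1.0 / slope


def fit_with_breakpoint(lengths, times, breakpoint):
    """Fit short (n <= breakpoint) and long messages separately.

    Each side needs at least two distinct message lengths.
    """
    pairs = list(zip(lengths, times))
    short = [(n, t) for n, t in pairs if n <= breakpoint]
    long_ = [(n, t) for n, t in pairs if n > breakpoint]
    return fit_line(*zip(*short)), fit_line(*zip(*long_))


# Invented measurements with a protocol change above 1000 bytes.
lengths = [1, 10, 100, 1000, 10_000, 100_000, 1_000_000]
times = [10.0 + n / 100.0 if n <= 1000 else 50.0 + n / 200.0 for n in lengths]

(short_t0, short_rinf), (long_t0, long_rinf) = fit_with_breakpoint(
    lengths, times, breakpoint=1000
)
print(f"short messages: t0 = {short_t0:.2f} us, rinf = {short_rinf:.1f} B/us")
print(f"long messages : t0 = {long_t0:.2f} us, rinf = {long_rinf:.1f} B/us")
```

With the breakpoint placed at the invented protocol switch, the short-message t_0 recovers the startup time used to generate the data. A real measurement would have noise and an unknown switch point, so the agreement would be less exact.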
If there are many small discontinuities or changes of protocol then I expect your data is rather like that shown by Charles this time last year and used as an example in PICT. In this case the 3-para fit may give good results for your data as it did for Charles's.

>2. To make things worse, the least square fit overweighs the data points
> for very long messages, because the differences "model minus
> measurement" are largest there in absolute terms. The fitted line,
> therefore, more or less ignores the short message measurements.
> As a result, the latencies are completely up to chance.
>

This is absolutely true and was discovered to be the problem one year ago. My solution, used in the New COMMS1, was and is to minimise the sum of the squares of the relative (rather than absolute) error. If this is done the values for short messages are not ignored in the way described, and t_0 is held much closer to the time for the smallest message length.

Note also that the 3-parameter fit provided by New COMMS1 can be fitted exactly to the time for the shortest message, to the bandwidth for the longest message, and to the bandwidth near the mid point. This is the so-called 3-point fit, but it does require a third parameter.

Can you please email me the output file for the NEC from the original COMMS1? I can then put this data through the New COMMS1 and see what two and three parameter fits are produced. Otherwise you could update your version of Parkbenchmarks with the 7 subroutines and rerun using New COMMS1. See the instructions at the end of this email.

>28 usec. My colleague Hubert Ritzdorf then made an interesting
>experiment: he removed some optimization from our MPI library for
>long messages, thus INCREASING the communication times for messages
>longer than 128000 bytes, and not changing anything for shorter
>messages. The resulting DROP in latency from 28 to under 22 usec
>clearly shows how ridiculous the COMMS1 benchmark is.
>

Hubert's results are just what one would expect from minimising the absolute error. I suspect you would not see this effect with New COMMS1, which does not over-emphasise the long message measurements.

Please remember that the t_0 reported by COMMS1 is not a measurement of the time for any particular message length. It is the constant term in the fitted curve:

    t = t_0 + n/rinf

which is an approximation to ALL the measured data. If you want to know the time, say for the smallest message length, then that is listed in the table of lengths and times reported in the benchmark output. If you mean by latency the time for the shortest message (hopefully zero or 1 Byte) then the COMMS1 measurements of this are in this table, not in t_0.

For those who missed my two earlier emailings on using the New COMMS1, I copy my earlier email below:

Agenda Item : Plans for the next Release.
--------------------------

Just a reminder that New COMMS1, as announced in my email to the committee of 16 Feb 1997, was designed as the minimum necessary changes to the existing release to solve the problems raised at the beginning of the year. It involves new versions of 5 routines and 2 new routines. In addition, the Make files need the 2 new routines added where appropriate. We have incorporated these changes at Westminster in the existing release without trouble. I believe that these should be incorporated in the next release.
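The difference between minimising absolute and relative error for the model t = t_0 + n/rinf, as discussed earlier in this message, can be sketched as a weighted least squares fit. This is not the New COMMS1 source (LSTSQ_1.F is not reproduced here). The sketch assumes that the relative error is taken against each measured time, which corresponds to weights of 1/t^2, and the data are invented.

```python
# Sketch only: invented data; weights of 1 / t**2 are an assumed reading of
# "sum of the squares of the relative error". Times are in microseconds.

def fit_line_weighted(lengths, times, weights):
    """Weighted least squares fit of t = t0 + n / rinf; returns (t0, rinf)."""
    total = sum(weights)
    mean_n = sum(w * n for w, n in zip(weights, lengths)) / total
    mean_t = sum(w * t for w, t in zip(weights, times)) / total
    sxy = sum(w * (n - mean_n) * (t - mean_t)
              for w, n, t in zip(weights, lengths, times))
    sxx = sum(w * (n - mean_n) ** 2 for w, n in zip(weights, lengths))
    slope = sxy / sxx
    return mean_t - slope * mean_n, 1.0 / slope


# Invented measurements with a protocol change above 1000 bytes.
lengths = [1, 10, 100, 1000, 10_000, 100_000, 1_000_000]
times = [10.0 + n / 100.0 if n <= 1000 else 50.0 + n / 200.0 for n in lengths]

# Absolute error: every point has the same weight.
t0_abs, rinf_abs = fit_line_weighted(lengths, times, [1.0] * len(times))

# Relative error: minimise sum(((t0 + n / rinf - t) / t) ** 2).
t0_rel, rinf_rel = fit_line_weighted(lengths, times,
                                     [1.0 / t ** 2 for t in times])

print(f"shortest message time : {times[0]:.2f} us")
print(f"t0, absolute error    : {t0_abs:.2f} us")
print(f"t0, relative error    : {t0_rel:.2f} us")
```

On this invented data the relative-error fit keeps t_0 much closer to the shortest-message time than the absolute-error fit does. Neither fit removes the discontinuity itself, so the fitted rinf still has to be read with that in mind.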
In summary:

New COMMS1

In directory: http://www.minnow.demon.co.uk/Pbench/comms1/

The 5 Changed Routines:

(1) File COMMS1_1.F replaces the following file in the current release:
    ParkBench/Low_Level/comms1/src_mpi/COMMS1.f
(2) File COMMS1_1.INC replaces
    ParkBench/Low_Level/comms1/src_mpi/comms1.inc
(3) File ESTCOM_1.F replaces
    ParkBench/Low_Level/comms1/src_mpi/ESTCOM.f
(4) File LSTSQ_1.F replaces
    ParkBench/lib/Low_Level/LSTSQ.f
(5) File CHECK_1.F replaces
    Parkbench/lib/Low_Level/CHECK.f

The 2 New Routines:

(6) File LINERR_1.F add as
    ParkBench/lib/Low_Level/LINERR.f
(7) File VPOWER_1.F add as
    ParkBench/lib/Low_Level/VPOWER.f

Best wishes to you all
Roger
--
Roger Hockney.   Checkout my new Web page at URL http://www.minnow.demon.co.uk
University of    and link to my new book: "The Science of Computer Benchmarking"
Westminster UK   suggestions welcome. Know any fish movies or suitable links?

From owner-parkbench-comm@CS.UTK.EDU Tue Jan 13 08:38:07 1998
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA17513; Tue, 13 Jan 1998 08:38:07 -0500
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id IAA03191; Tue, 13 Jan 1998 08:20:10 -0500 (EST)
Received: from sun1.ccrl-nece.technopark.gmd.de (sun1.ccrl-nece.technopark.gmd.de [193.175.160.67]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id IAA03184; Tue, 13 Jan 1998 08:20:07 -0500 (EST)
Received: from sgi7.ccrl-nece.technopark.gmd.de (sgi7.ccrl-nece.technopark.gmd.de [193.175.160.89]) by sun1.ccrl-nece.technopark.gmd.de (8.7/3.4W296021412) with SMTP id OAA04953; Tue, 13 Jan 1998 14:19:47 +0100 (MET)
Received: (from hempel@localhost) by sgi7.ccrl-nece.technopark.gmd.de (950413.SGI.8.6.12/950213.SGI.AUTOCF) id OAA02202; Tue, 13 Jan 1998 14:18:30 +0100
Date: Tue, 13 Jan 1998 14:18:30 +0100
From: hempel@ccrl-nece.technopark.gmd.de (Rolf Hempel)
Message-Id: <199801131318.OAA02202@sgi7.ccrl-nece.technopark.gmd.de>
To: roger@minnow.demon.co.uk
Subject: COMMS1
Benchmark Cc: tbeckers@ess.nec.de, lonsdale@ccrl-nece.technopark.gmd.de, eckhard@ess.nec.de, clantwin@ess.nec.de, parkbench-comm@CS.UTK.EDU Reply-To: hempel@ccrl-nece.technopark.gmd.de Dear Roger, thank you for your note on the COMMS1 benchmark. We didn't try the NEW COMMS1 code yet with our MPI library, so I cannot comment on its accuracy. I just would like to answer some of the issues you raised in your mail. Of course we have seen that in COMMS1 you can select a transition point between a short and a long model. For this choice, however, you have to be able to change the input data. In our case (a benchmark suite used in a procurement) our customer had provided the input dataset, and we were not allowed to change it. So, the only way for us to correct the results was to tune our MPI library to make it fit to the benchmark program. I don't think that this is what you had in mind when you wrote COMMS1. You didn't comment on the inaccuracies we found in the raw measurements. We ran several ping-pong benchmarks before, as, for example, the MPPTEST routine of MPICH, and they consistently give better latencies for short messages (difference approx. 25%). As I explained in my previous mail, we found the reason to be an improper correction for measurement overheads in COMMS1. Thus, the raw data are flawed, and this cannot be resolved by any parameter fitting. This is also the reason that I hesitate to send you the raw data reported by COMMS1 on our machine. I agree with you that it would be nice to have a few parameters to characterize the performance of any given system. The values for "n1/2" and "rinfinity" have been quite successful for vector arithmetic operations. The situation is, however, much more complicated for communication operations. As an example, let's take the famous ping-pong benchmark. We already discussed the problem of discontinuities caused by protocol changes. 
If you want to do a parameter fitting, the only reasonable solution seems to me that your test program automatically detects such points and handles the different protocols separately. If you leave the selection to an input parameter, you will inevitably run into the problem I discussed above. Even if you solve this problem, there remain many others. In modern (i.e. highly optimized) MPI implementations, the performance of a ping-pong operation crucially depends on the status of the two processes involved. Is the receiving process already waiting for the message? In a ping-pong, it usually is. This can make a huge difference! Also, the performance can also depend on the global number of processes active in the application. Not only do search lists in communication progress engines become shorter if there are fewer processes, but some implementers even went as far as writing special code for the case where you just have two processes. Ping-pong codes such as COMMS1 almost always just use two communicating processes, so they measure the best case. Another effect which is too often ignored is that messages can interfere with each other (both at the hardware and software level) if they are sent at the same time between different process pairs. All those effects combined cause a substantial difference between ping-pong results and measurements in real applications. In this situation the apparent precision of performance parameters can be quite misleading. If I want to judge the quality of an MPI implementation, I don't trust in best fit parameters so much. For the ping-pong code, I just look at a graphic representation of time versus message length for short messages, and another one of bandwidth versus message length for long messages. This way I can study discontinuities and other minor effects in detail. And then, take real applications and measure the communication times there. 
Then you will often find surprising results which you have never seen in a ping-pong benchmark. Best wishes, Rolf ------------------------------------------------------------------------ Rolf Hempel (email: hempel@ccrl-nece.technopark.gmd.de) Senior Research Staff Member C&C Research Laboratories, NEC Europe Ltd., Rathausallee 10, 53757 Sankt Augustin, Germany Tel.: +49 (0) 2241 - 92 52 - 95 Fax: +49 (0) 2241 - 92 52 - 99 From owner-parkbench-comm@CS.UTK.EDU Thu Jan 15 14:17:57 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id OAA00690; Thu, 15 Jan 1998 14:17:56 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id NAA23858; Thu, 15 Jan 1998 13:55:08 -0500 (EST) Received: from timbuk.cray.com (timbuk-fddi.cray.com [128.162.8.102]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id NAA23830; Thu, 15 Jan 1998 13:54:57 -0500 (EST) Received: from ironwood.cray.com (root@ironwood-fddi.cray.com [128.162.21.36]) by timbuk.cray.com (8.8.7/CRI-gate-news-1.3) with ESMTP id LAA11159 for ; Thu, 15 Jan 1998 11:11:42 -0600 (CST) Received: from magnet.cray.com (magnet [128.162.173.162]) by ironwood.cray.com (8.8.4/CRI-ironwood-news-1.0) with ESMTP id LAA08650 for ; Thu, 15 Jan 1998 11:11:41 -0600 (CST) From: Charles Grassl Received: by magnet.cray.com (8.8.0/btd-b3) id RAA07227; Thu, 15 Jan 1998 17:11:40 GMT Message-Id: <199801151711.RAA07227@magnet.cray.com> Subject: Low Level Benchmarks To: parkbench-comm@CS.UTK.EDU Date: Thu, 15 Jan 1998 11:11:39 -0600 (CST) X-Mailer: ELM [version 2.4 PL24-CRI-d] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit To: Parkbench interests From: Charles Grassl Subject: Low Level benchmarks Date: 15 January, 1998 Mark, thank you for pointing us to the PMB benchmark. It is well written and coded, but has some discrepancies and shortcomings. 
My comments lead to suggestions and recommendations regarding low level communication benchmarks.

First, in program PMB the PingPong tests are twice as fast (in time) as the corresponding message length tests in the PingPing tests (as run on a CRAY T3E). The calculation of the time and bandwidth is incorrect by 100% (a factor of two) in one of the programs.

This error can be fixed by recording, using and reporting the actual time, amount of data sent and their ratio. That is, the time should not be divided by two in order to correct for a round trip. This recorded time is for a round trip message, and is not precisely the time for two messages.

Half the round trip message passing time, as reported in the PMB tests, is not the time for a single message and should not be reported as such. This same erroneous technique is used in the COMMS1 and COMMS2 benchmarks. (Is Parkbench responsible for propagating this incorrect methodology?)

In program PMB, the testing procedure performs a "warm up". This procedure is a poor testing methodology because it discards important data. Testing programs such as this should record all times and calculate the variance and other statistics in order to perform error analysis.

Program PMB does not measure contention or allow extraction of network contention data. Tests "Allreduce" and "Bcast" and several others stress the inter-PE communication network with multiple messages, but it is not possible to extract information about the contention from these tests. The MPI routines for Allreduce and Bcast have algorithms which change with respect to number of PEs and message lengths. Hence, without detailed information about the specific algorithms used, we cannot extract information about network performance or further characterize the inter-PE network.

Basic measurements must be separated from algorithms. Tests PingPong, PingPing, Barrier, Xover, Cshift and Exchange are low level. Tests Allreduce and Bcast are algorithms.
The algorithms Allreduce and Bcast need additional (algorithmic) information in order to be described in terms of the basic level benchmarks.

With respect to low level testing, the round trip exchange of messages, as per PingPing and PingPong in PMB or COMMS1 and COMMS2, is not characteristic of the lowest level of communication. This pattern is actually rather rare in programming practice. It is more common for tasks to send single messages and/or to receive single messages. In this scheme, messages do not make a round trip and there are not necessarily caching or other coherency effects. The single message passing is a distinctly different case from that of round trip tests.

We should be worried that the round trip testing might introduce artifacts not characteristic of actual (low level) usage. We need a better test of basic bandwidth and latency in order to measure and characterize message passing performance.

Here are suggestions and requirements, in an outline form, for a low level benchmark design:

I. Single and double (bidirectional) messages.

   A. Test single messages, not round trips.

      1. The round trip test is an algorithm and a pattern. As such it should not be used as the basic low level test of bandwidth.

      2. Use direct measurements where possible (which is nearly always). For experimental design, the simplest method is the most desirable and best.

      3. Do not perform least squares fits A PRIORI. We know that the various message passing mechanisms are not linear or analytic because different mechanisms are used for different message sizes. It is not necessarily known beforehand where this transition occurs. Some computer systems have more than two regimes and their boundaries are dynamic.

      4. Our discussion of least squares fitting is losing track of experimental design versus modeling. For example, the least squares parameter for t_0 from COMMS1 is not a better estimate of latency than actual measurements (assuming that the timer resolution is adequate).
         A "better" way to measure latency is to perform additional DIRECT measurements, repetitions or otherwise, and hence decrease the statistical error. The fitting as used in the COMMS programs SPREADS error. It does not reduce error and hence it is not a good technique for measuring such an important parameter as latency.

   B. Do not test zero length messages. Though valid, zero length messages are likely to take special paths through library routines. This special case is not particularly interesting or important.

      1. In practice, the most common and important message size is 64 bits (one word). The time for this message is the starting point for bandwidth characterization.

   D. Record all times and use statistics to characterize the message passing time. That is, do not prime or warm up caches or buffers. Timings for unprimed caches and buffers give interesting and important bounds. These timings are also the nearest to typical usage.

      1. Characterize message rates by a minimum, maximum, average and standard deviation.

   E. Test inhomogeneity of the communication network. The basic message test should be performed for all pairs of PEs.

II. Contention.

   A. Measure network contention relative to all PEs sending and/or receiving messages.

   B. Do not use high level routines where the algorithm is not known.

      1. With high level algorithms, we cannot deduce which component of the timing is attributable to the "operation count" and which is attributable to the actual system (hardware) performance.

III. Barrier.

   A. Simple test of barrier time for all numbers of processors.

Additionally, the suite should be easy to use. C and Fortran programs for direct measurements of message passing times are short and simple. These simple tests are of order 100 lines of code and, at least in Fortran 90, can be written in a portable and reliable manner.

The current Parkbench low level suite does not satisfy the above requirements.
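Requirement I.D in the outline above (record every timing without a warm-up, then report minimum, maximum, average and standard deviation) can be sketched in a few lines. This is not one of the proposed replacement programs. `placeholder_send` is a hypothetical stand-in for a single one-word message send, since a real test would call the message passing library on two PEs.

```python
# Sketch only: `placeholder_send` is a hypothetical stand-in for a real send.
import statistics
import time


def time_repeated(operation, repetitions):
    """Time each call separately and keep every sample, including the first."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Minimum, maximum, mean and sample standard deviation of the timings."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),  # needs at least two samples
    }


def placeholder_send():
    """Stands in for sending one 64-bit word; it only allocates 8 bytes."""
    bytes(8)


samples = time_repeated(placeholder_send, repetitions=100)
report = summarize(samples)
for name, value in report.items():
    print(f"{name:5s} = {value * 1e6:.3f} us")
```

Keeping the full list of samples, rather than only an average, is what allows the first (unprimed) timing to be inspected separately from the rest.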
It is inaccurate, as pointed out by previous letters, and uses questionable techniques and methodologies. It is also difficult to use, witness the proliferation of files, patches, directories, libraries and the complexity and size of the Makefiles. This Low Level suite is a burden for those who are expecting a tool to evaluate and investigate computer performance. The suite is becoming a liability for our group. As such, it should be withdrawn from distribution. I offer to write, test and submit a new set of programs which satisfy most of the above requirements. Charles Grassl SGI/Cray Research Eagan, Minnesota USA From owner-parkbench-comm@CS.UTK.EDU Fri Jan 16 09:12:18 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id JAA11774; Fri, 16 Jan 1998 09:12:18 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id IAA16130; Fri, 16 Jan 1998 08:53:07 -0500 (EST) Received: from haven.EPM.ORNL.GOV (haven.epm.ornl.gov [134.167.12.69]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id IAA16123; Fri, 16 Jan 1998 08:53:06 -0500 (EST) Received: (from worley@localhost) by haven.EPM.ORNL.GOV (8.8.3/8.8.3) id IAA01963; Fri, 16 Jan 1998 08:52:17 -0500 (EST) Date: Fri, 16 Jan 1998 08:52:17 -0500 (EST) From: Pat Worley Message-Id: <199801161352.IAA01963@haven.EPM.ORNL.GOV> To: parkbench-comm@CS.UTK.EDU Subject: Re: Low Level Benchmarks In-Reply-To: Mail from 'Charles Grassl ' dated: Thu, 15 Jan 1998 11:11:39 -0600 (CST) Cc: worley@haven.EPM.ORNL.GOV, ritzdorf@ccrl-nece.technopark.gmd.de, zimmermann@ccrl-nece.technopark.gmd.de, clantwin@ess.nec.de, eckhard@ess.nec.de, lonsdale@ccrl-nece.technopark.gmd.de, tbeckers@ess.nec.de I have not been paying close attention to the current Low Level communication suite discussions, having confidence in capabilities and resolve of the current participants, but have decided to muddy the waters with a few personal observations. 
1) I do not use the Low Level suite in my own performance-related work. I find that the interpretation of results is much easier if the experiments are designed to answer (my) specific performance questions. Producing numbers that are accurate enough and whose experiments are well-enough understood to be used to answer arbitrary performance questions is much more difficult.

2) It may be time to revisit the goals of the Low Level suite. There are two obvious extremes.

   a) Determine some (hopefully representative) metrics of point-to-point communication performance, concentrating on making the measurements fair when comparing across platforms, but not requiring that the underlying architecture parameters be derivable from these numbers, or that they agree exactly with any other group's measurements. In this situation, a two (or more) parameter model fit to the data can be useful, if only as a shorthand for the raw data, but the model should not be expected to explain the data.

   b) Characterize the low level communication performance for each platform. Charles Grassl's latest recommendation is a first step in that direction. As a personal aside, I attempted such an exercise a few years ago (on the T3D, looking at the effect of common usage patterns on performance, not just ping-pong between nearest neighbors). I quickly became swamped by the amount of data and by the number of ways of presenting it (and the work was never written up). I realize now that my problem was trying to address too many evaluation questions simultaneously. In addition to the large amount of data required, an accurate characterization is likely to require more platform-specific elements, and will continue to evolve as new machines are added, in order to be as fair to the new machines as it is to the old ones. (The two parameter models are very accurate for some of the previous generation of homogeneous message-passing platforms.)
In case my sympathies are not clear, I prefer to revisit and fix the current suite, "dumbing it down", if only in presentation, making it clear what it does and does not measure. In my own work, the point-to-point measurements are only for establishing a general performance baseline. The important measures are the performance observed in the kernel and full application codes. The baseline measurements are simply to assess the "peak achievable" communication performance.

While a full characterization is an important thing to do, I do not believe that this group has the manpower, resources, or staying power to do it right. At one time in the past, we proposed to simply be a clearinghouse for the best of the performance measurement codes. If Charles wants to write and submit such an extensive low level suite, we can consider it, but in the meantime we should address the problems in the current suite, and not claim more than is appropriate. In particular, make sure that the customer does not become concerned that the vendor-stated latency and bandwidth do not match the PARKBENCH reported values. A discrepancy does not necessarily mean that someone is lying, simply that different aspects are being measured. But we should also be sure that intermachine comparisons using PARKBENCH measurements are valid; otherwise, they serve no purpose.

Pat Worley

PS. - I may be in the fringe, but all my codes are written using variants of SWAP and SENDRECV, and most of the codes I see can be written in such a fashion (and could gain something from it). So, ping-pong and ping-ping are not irrelevant to me.

PPS. - Of course the real reason for using ping-pong is the difficulty in measuring the time for one-way messaging. I was not aware that this was a solved problem, at least at the MPI or PVM level. Perhaps system instrumentation can answer it, but I didn't know that portable measurement codes could be guaranteed to do so across the different platforms.
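The difficulty with one-way timing raised in the PPS can be illustrated with a small simulation. The sketch below is hypothetical: the function name, the latency and the clock offset are illustrative values, not measurements from this thread. It assumes two nodes whose clocks differ by an unknown constant offset, a link with the same latency in both directions, and a receiver that replies immediately.

```python
def simulate_timings(true_latency, clock_offset_b):
    """Return (naive one-way estimate, half round-trip estimate) in seconds.

    Node A's clock is the reference. Node B's clock reads A's time plus
    clock_offset_b, which neither node knows. Both inputs are hypothetical.
    """
    send_on_a = 0.0
    # Arrival is stamped with B's clock, so the unknown offset is included.
    arrive_on_b = send_on_a + true_latency + clock_offset_b
    # B replies at once; the reply's arrival is stamped with A's clock.
    return_on_a = send_on_a + 2.0 * true_latency

    one_way = arrive_on_b - send_on_a                  # mixes two clocks
    half_round_trip = (return_on_a - send_on_a) / 2.0  # uses A's clock only
    return one_way, half_round_trip


if __name__ == "__main__":
    one_way, half_rtt = simulate_timings(true_latency=10e-6, clock_offset_b=3e-6)
    print(f"naive one-way estimate: {one_way * 1e6:.1f} us")
    print(f"half round-trip:        {half_rtt * 1e6:.1f} us")
```

In this model the one-way estimate is wrong by exactly the clock offset, while the half round-trip value recovers the latency. That recovery depends on the symmetry and immediate-reply assumptions, which is the point on which the round-trip method is criticised earlier in the thread.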
From owner-parkbench-comm@CS.UTK.EDU Fri Jan 16 10:57:55 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA13381; Fri, 16 Jan 1998 10:57:55 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA20483; Fri, 16 Jan 1998 10:38:52 -0500 (EST) Received: from sun1.ccrl-nece.technopark.gmd.de (sun1.ccrl-nece.technopark.gmd.de [193.175.160.67]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id KAA20468; Fri, 16 Jan 1998 10:38:45 -0500 (EST) Received: from sgi7.ccrl-nece.technopark.gmd.de (sgi7.ccrl-nece.technopark.gmd.de [193.175.160.89]) by sun1.ccrl-nece.technopark.gmd.de (8.7/3.4W296021412) with SMTP id QAA09438; Fri, 16 Jan 1998 16:38:41 +0100 (MET) Received: (from hempel@localhost) by sgi7.ccrl-nece.technopark.gmd.de (950413.SGI.8.6.12/950213.SGI.AUTOCF) id QAA04930; Fri, 16 Jan 1998 16:37:14 +0100 Date: Fri, 16 Jan 1998 16:37:14 +0100 From: hempel@ccrl-nece.technopark.gmd.de (Rolf Hempel) Message-Id: <199801161537.QAA04930@sgi7.ccrl-nece.technopark.gmd.de> To: parkbench-comm@CS.UTK.EDU Subject: Re: Low Level Benchmarks Cc: tbeckers@ess.nec.de, lonsdale@ccrl-nece.technopark.gmd.de, eckhard@ess.nec.de, clantwin@ess.nec.de, zimmermann@ccrl-nece.technopark.gmd.de, ritzdorf@ccrl-nece.technopark.gmd.de, hempel@ccrl-nece.technopark.gmd.de Reply-To: hempel@ccrl-nece.technopark.gmd.de I would like to send some remarks to the notes by Charles Grassl and Pat Worley on the problem of low-level communication benchmarks. As Pat pointed out, the ping-pong benchmark has been invented because generally there is no global clock by which you could measure the time for a single message. Everybody knows that this is no perfect solution, and in my previous mail I already explained some aspects of why ping-pong results can differ substantially from times found in real applications. 
So, I think we will have to use ping-pong tests in the future, with the caveat that they only measure a very special case of message-passing. If Charles knows a way to measure single messages, I would like to learn about it. In most other points I agree with Charles. I'm strongly convinced that the COMMS* routines are obsolete and should be replaced with something reasonable. In particular, the current routines are far too complicated to use, and give completely meaningless results. Therefore, I think one should not even try to correct the COMMS* routines, especially as there are already better alternatives available. One example is the PMB suite of PALLAS. It is relatively easy to use, but the documentation should provide more information than the internal calling tree given in the README file. What is missing is a precise definition of the underlying measuring methodology. I strongly prefer the output of timing tables (perhaps translated in good graphical representations) over crude parametrizations like the ones in the COMMS* benchmarks. Those can only frustrate the experts and confuse all other people. As to the definition of latency, Charles is right in saying that zero byte messages are dangerous because they often use special algorithms. The straightforward solution to use 1 byte messages instead is bad because usually messages are sent as multiples of 4 or 8 bytes, and for other message lengths some overhead by additional copying or even subroutine calls may be introduced. Since the lengths of most real messages are multiples of 4 or 8 bytes, I support Charles' proposal to measure the time for an 8 byte message and call it the latency. I think the warm-up phase before the actual benchmarking is important in order not to smear out initialization overheads over some number of messages. The time for the first ping-pong (or other operation), however, should be measured and compared with the time found for the following operations. 
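The suggestion to time the first operation separately, and to compare it with the following ones, can be sketched as below. This is a minimal illustration, not code from PARKBENCH or PMB: `time_repetitions` and `summarize` are hypothetical names, and `operation` stands for whatever ping-pong or other communication call is being measured. The summary uses the minimum, maximum, average and standard deviation proposed earlier in the thread.

```python
import statistics
import time


def time_repetitions(operation, repetitions=100, timer=time.perf_counter):
    """Call `operation` repeatedly and return one elapsed time per call."""
    times = []
    for _ in range(repetitions):
        start = timer()
        operation()
        times.append(timer() - start)
    return times


def summarize(times):
    """Report the first time separately from statistics of the later ones.

    Needs at least three timings so that the standard deviation of the
    later ones is defined.
    """
    first, rest = times[0], times[1:]
    mean_rest = statistics.mean(rest)
    return {
        "first": first,
        "min": min(rest),
        "max": max(rest),
        "mean": mean_rest,
        "stdev": statistics.stdev(rest),
        "first_over_mean": first / mean_rest,
    }


if __name__ == "__main__":
    # Stand-in workload; a real test would perform one message exchange here.
    results = summarize(time_repetitions(lambda: sum(range(1000)), repetitions=50))
    for name, value in results.items():
        print(f"{name:16s} {value:.3e}")
```

Keeping every individual time, rather than one accumulated total, lets the same run serve both views: the first entry shows any initialization overhead, and the rest characterise the steady state.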
I very much welcome Charles Grassl's kind offer to write a new benchmark suite. Perhaps there are even other suites available which could also be candidates for getting adopted by PARKBENCH. This forum meanwhile is quite well-known, which gives them considerable responsibility. PARKBENCH's choice of benchmark programs influences procurements of new machines world-wide, and the availability of a good set of low level benchmarks could give PARKBENCH a good reputation. I'm afraid that the current set of routines has the opposite effect. - Rolf Hempel ------------------------------------------------------------------------ Rolf Hempel (email: hempel@ccrl-nece.technopark.gmd.de) Senior Research Staff Member C&C Research Laboratories, NEC Europe Ltd., Rathausallee 10, 53757 Sankt Augustin, Germany Tel.: +49 (0) 2241 - 92 52 - 95 Fax: +49 (0) 2241 - 92 52 - 99 From owner-parkbench-comm@CS.UTK.EDU Fri Jan 16 12:46:04 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id MAA14801; Fri, 16 Jan 1998 12:46:04 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA27007; Fri, 16 Jan 1998 12:29:03 -0500 (EST) Received: from haven.EPM.ORNL.GOV (haven.epm.ornl.gov [134.167.12.69]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id MAA27000; Fri, 16 Jan 1998 12:29:01 -0500 (EST) Received: (from worley@localhost) by haven.EPM.ORNL.GOV (8.8.3/8.8.3) id MAA02149; Fri, 16 Jan 1998 12:29:01 -0500 (EST) Date: Fri, 16 Jan 1998 12:29:01 -0500 (EST) From: Pat Worley Message-Id: <199801161729.MAA02149@haven.EPM.ORNL.GOV> To: parkbench-comm@CS.UTK.EDU Subject: Re: Low Level Benchmarks In-Reply-To: Mail from 'hempel@ccrl-nece.technopark.gmd.de (Rolf Hempel)' dated: Fri, 16 Jan 1998 16:37:14 +0100 Cc: worley@haven.EPM.ORNL.GOV In most other points I agree with Charles. I'm strongly convinced that the COMMS* routines are obsolete and should be replaced with something reasonable. I have no problem with this. 
As I indicated, I have no experience with these. What is missing is a precise definition of the underlying measuring methodology. Perhaps this is the point that I was trying to make. Not only must the codes be easy to use, but the results should be easy to interpret. Every code should have a simple description of what it is measuring, what the data can be used for (and what it shouldn't be used for), and how to use the data. PARKBENCH needs to provide guidance in what data to collect, not just carefully crafted benchmark codes. And we need to describe clearly what low level communication tests are good for. For example, I have problems with low level contention tests. Understanding hotspots is an interesting exercise, but the connection to "real" codes is more subtle. Do we stress test, look at contention for given algorithms/global operators (and which algorithms), use some standard workload characterization as the background job, ...? For any given performance question, what should be used may be clear, but it is difficult to do this a priori. A simultaneous send/receive stress test may very well be something interesting to present, but we also need to be able to explain why (because it is typical in synchronous global communication operations?). In summary, I would like to see a prioritized list of what low level information is worth collecting, and why. We can then use this to choose or generate codes to do the testing. I apologize for being lazy. This may have already been laid out in the original ParkBench document, but I never worried about the low level tests before and don't have a copy of the document in front of me. 
Pat Worley

From owner-parkbench-comm@CS.UTK.EDU Fri Jan 16 13:45:53 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id NAA15447; Fri, 16 Jan 1998 13:45:52 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id NAA29375; Fri, 16 Jan 1998 13:15:58 -0500 (EST) Received: from c3serve.c3.lanl.gov (root@c3serve-f0.c3.lanl.gov [128.165.20.100]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id NAA29368; Fri, 16 Jan 1998 13:15:55 -0500 (EST) Received: from risc.c3.lanl.gov (risc.c3.lanl.gov [128.165.21.76]) by c3serve.c3.lanl.gov (8.8.5/1995112301) with ESMTP id LAA04436 for ; Fri, 16 Jan 1998 11:16:08 -0700 (MST) Received: from localhost (hoisie@localhost) by risc.c3.lanl.gov (950413.SGI.8.6.12/c93112801) with SMTP id LAA13115 for ; Fri, 16 Jan 1998 11:14:30 -0700 Date: Fri, 16 Jan 1998 11:14:30 -0700 (MST) From: Adolfy Hoisie To: parkbench-comm@CS.UTK.EDU Subject: Low Level Benchmarks Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII

Just to amplify some of the numerous excellent points made by Pat and Charles and Rolf, the emphasis of the Parkbench group, as I see it, should be on defining the methodology for benchmarking at this level. A string of numbers says very little about machine performance in the absence of a solid, scientifically defined underlying base for the programs utilized for benchmarking.

COMMS is obsolete in methodology, coding and generation and analysis of results. As such, I used it quite some time ago only to reach the conclusions above. Instead, I always chose to write my own benchmarking programs in order to extract meaningful data for the applications I was working on.

I would like to see the debate heading towards what it is that we need to measure in a suite of general use that is applicable to machines of interest.
For example, very little or no attention is being paid to benchmarking DSM architectures, where quite a few architectural parameters become harder to define and subtler to interpret, including, but not limited to, message passing characterization on these architectures.

Adolfy

====================================================================== Adolfy Hoisie \ Los Alamos National Laboratory \Scientific Computing, CIC-19, MS B256 hoisie@lanl.gov \ Los Alamos, NM 87545 USA \ Phone: 505-667-5216 http://www.c3.lanl.gov/~hoisie/hoisie.html FAX: 505-667-1126

From owner-parkbench-comm@CS.UTK.EDU Sun Jan 18 07:38:42 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id HAA20627; Sun, 18 Jan 1998 07:38:42 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA21662; Sun, 18 Jan 1998 07:28:22 -0500 (EST) Received: from post.mail.demon.net (post-10.mail.demon.net [194.217.242.154]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA21655; Sun, 18 Jan 1998 07:28:20 -0500 (EST) Received: from minnow.demon.co.uk ([158.152.73.63]) by post.mail.demon.net id aa1002926; 18 Jan 98 12:25 GMT Message-ID: Date: Sun, 18 Jan 1998 12:24:20 +0000 To: parkbench-comm@CS.UTK.EDU From: Roger Hockney Subject: Low Level Benchmarks MIME-Version: 1.0 X-Mailer: Turnpike Version 3.03a

To: the low-level discussion group
From: Roger

I comment below on recent emailings on this topic which arrived on the 16 Jan 1998.

Pat Worley writes:

>2) It may be time to revisit the goals of the Low Level suite. There
>   are two obvious extremes.
>
>   a) Determine some (hopefully representative) metrics of point-to-point
>      communication performance, concentrating on making the
>      measurements
>      SNIP
>      In this situation, a two (or more) parameter model fit to the
>      data can be useful, if only as a shorthand for the raw data,
>      but the model should not be expected to explain the data.

This is of course what COMMS1 sets out to do.
But please, when judging this point, use the New COMMS1 revised code that DOES give much more sensible answers in difficult cases. Please do not base your opinions on results from the Original COMMS1 code that is still unfortunately being issued by Parkbench. Instructions for getting the new code were given in my email to this group on 12 Jan 1998.

>      (The two parameter models are very accurate for some of the
>      previous generation of homogeneous message-passing platforms.)

It is nice to have confirmation of this from an independent source. In addition, the 3-parameter model is available in New COMMS1 for cases where the 2-parameter model fails.

> In case my sympathies are not clear, I prefer to revisit and fix
> the current suite, "dumbing it down", if only in presentation,
> making it clear what it does and does not measure.

Again this was my objective in writing the New COMMS1 as a minimum fix to the existing Original COMMS1. However I don't think I would call this "Dumbing Down". In fact New COMMS1 is a "Smartening UP" of the benchmark because it provides a 3-parameter fit for those cases for which the 2-parameter fit fails. It also reports the Key spot values of "time for shortest message" (which Charles and Rolf want to call the Latency) and bandwidth for longest message (this could equally well be the maximum measured bandwidth). It also compares the fitted values with measured values at these key points. The fit formulae are also given in the output for completeness.

Please note that COMMS1 has always reported ALL the measured lengths and times in the output file as the basic data, and ALL spot bandwidths were printed to the screen as measured, and could be captured in a file if required. In New COMMS1 the spot bandwidths are more conveniently included in the standard output file as they should have been in the first place. Unfortunately the above additions make the new output file more complex (which I am not happy about).
An example of New COMMS1 output is attached at the end of this email.

>PPS. - Of course the real reason for using ping-pong is the difficulty
>       in measuring the time for one-way messaging. I was not aware
>       that this was a solved problem, at least at the MPI or PVM
>       level. Perhaps system instrumentation can answer it, but I
>       didn't know that portable measurement codes could be guaranteed
>       to do so across the different platforms.

Exactly so.

*******************************

Rolf Hempel writes:

>of message-passing. If Charles knows a way to measure single messages,
>I would like to learn about it.

Me too.

>In most other points I agree with Charles. I'm strongly convinced that
>the COMMS* routines are obsolete and should be replaced with something
>reasonable. In particular, the current routines are far too complicated
>to use, and give completely meaningless results. Therefore, I think one

Please base your judgement on the results from New COMMS1 which has a much more satisfactory fitting procedure (see the examples in the PICT tool mentioned below). I believe that the revised program New COMMS1 gives reasonable results and is not obsolete.

>README file. What is missing is a precise definition of the underlying
>measuring methodology.

In contrast, the methodology of the COMMS1 curve fitting is given in the Parkbench Report and in detail in my book "The Science of Computer Benchmarking", see:

http://www.siam.org/catalog/mcc07/hockney.htm

>I strongly prefer the output of timing tables (perhaps translated in
>good graphical representations) over crude parametrizations like the
>ones in the COMMS* benchmarks. Those can only frustrate the experts
>and confuse all other people.

You seem to have failed to notice that both the Original COMMS1 and the New COMMS1 report the timing table as the FIRST part of their output files.
Further, a good graphical representation is available using the database tool from Southampton and my own PICT tool (see below).

The COMMS1 fitting procedure is not crude. On the contrary it uses least-squares fitting of a performance model that is quite satisfactory for a lot of data. In minimising relative rather than absolute error, New COMMS1 spreads the error in a much more satisfactory way and allows the fitting to be used over a much longer range of message lengths. Furthermore where the 2-parameter model is unsuitable, New COMMS1 provides a 3-parameter model which fits the Cray T3E (Charles's data 17 Dec 96) very well. I don't think one can call all this crude.

To see how well the 2- and 3-parameter fits produced by New COMMS1 match recent data, check out the examples on my Parkbench Interactive Curve Fitting Tool (PICT) at:

http://www.minnow.demon.co.uk/pict/source/pict2a.html

For the most part these show that 2-parameters fit the data surprisingly well. The parameters are not meaningless and useless, but often a rather good summary of the measurements.

The 3-parameter fit is described quite fully in my talk of 11 Sep 1997. I have finally written this up with pretty pictures for the PEMCS Web Journal. Look at:

http://hpc-journals.ecs.soton.ac.uk/Workshops/PEMCS/fall-97/talks/Roger-Hockney/perfprof1.html

In truth we need to see a lot more data before judging the usefulness of parametric fitting. That is why I would like to look at your NEC results. These need not be the timings from COMMS1, but any pingpong measurements that you regard as "good".

Please do not base your opinion on the results produced by the Original COMMS1 which is presently in the Parkbench suite. This will only give satisfactory results for message lengths up to about 4*10^4. When used outside this range it may produce useless numbers.

>messages are multiples of 4 or 8 bytes, I support Charles' proposal to
>measure the time for an 8 byte message and call it the latency.
I am STRONGLY opposed to this. Latency is an ambiguous term that has different meanings to different people. If we wish to report the time for an 8-byte message we should call it what it is, no more no less, eg: t(n=8B) = 45.6 us To call this latency only leads to confusion and senseless misunderstanding and argument. **************************************************************** EXAMPLE NEW COMMS1 OUTPUT FILE: T3E Results from Grassl's 17 Dec 1996 email to Parkbench committee **************************************************************** ================================================= === === === GENESIS / ParkBench Parallel Benchmarks === === === === comms1_mpi === === === ================================================= Pingpong Benchmark: ------------------- Measures time to send a message between two nodes on a multi-processor computer (MPP or network) as a function of the message length. It also characterises the time and corresponding bandwidth by both two and three performance parameters. Original code by Roger Hockney (1986/7), modified by Ian Glendinning and Ade Miller (1993/4), and by Roger Hockney and Ron Sercely (1997). ----------------------------------------------------------------------- You are running the VERSION dated: RWH-12-Mar-1997 ----------------------------------------------------------------------- The measurement time requested for each test case was 1.00E+00 seconds. No distinction was made between long and short messages. Zero length messages were not used in least squares fitting. 
-----------------------------------------------
(1) PRIMARY MEASUREMENTS (BW=Bandwidth, B=Byte)
-----------------------------------------------------------------------
         SPOT MEASURED VALUES          |   EVOLVING TWO-PARAMETER FIT
---------------------------------------|-------------------------------
POINT  LENGTH(n)    TIME(t)  BW(r=n/t) |     rinf      nhalf    RMS rel
           B           s        B/s    |      B/s        B      error %
*SPOT1*--------------------------------|-------------------------------
    1  8.000E+00  1.260E-05  6.349E+05 |  0.000E+00  0.000E+00  0.000E+00
    2  1.000E+01  1.348E-05  7.418E+05 |  2.273E+06  2.064E+01 -1.255E-06
    3  2.000E+01  1.380E-05  1.449E+06 |  1.237E+07  1.516E+02  2.277E+00
    4  3.000E+01  1.590E-05  1.887E+06 |  7.798E+06  9.157E+01  2.762E+00
    5  4.000E+01  1.561E-05  2.562E+06 |  1.020E+07  1.237E+02  3.267E+00
    6  5.000E+01  1.648E-05  3.034E+06 |  1.115E+07  1.366E+02  3.126E+00
    7  6.000E+01  1.618E-05  3.708E+06 |  1.364E+07  1.711E+02  3.796E+00
    8  7.000E+01  1.773E-05  3.948E+06 |  1.356E+07  1.699E+02  3.552E+00
    9  8.000E+01  1.694E-05  4.723E+06 |  1.562E+07  1.992E+02  4.072E+00
   10  9.000E+01  1.793E-05  5.020E+06 |  1.634E+07  2.095E+02  3.954E+00
   11  1.000E+02  1.802E-05  5.549E+06 |  1.741E+07  2.249E+02  3.983E+00
   12  1.100E+02  1.889E-05  5.823E+06 |  1.776E+07  2.300E+02  3.841E+00
   13  1.200E+02  1.780E-05  6.742E+06 |  1.983E+07  2.607E+02  4.483E+00
   14  1.300E+02  1.917E-05  6.781E+06 |  2.034E+07  2.682E+02  4.368E+00
   15  1.400E+02  1.902E-05  7.361E+06 |  2.131E+07  2.828E+02  4.405E+00
   16  1.500E+02  1.941E-05  7.728E+06 |  2.209E+07  2.946E+02  4.389E+00
   17  1.600E+02  1.896E-05  8.439E+06 |  2.353E+07  3.167E+02  4.644E+00
   18  1.700E+02  2.057E-05  8.264E+06 |  2.362E+07  3.179E+02  4.514E+00
   19  1.800E+02  1.911E-05  9.419E+06 |  2.526E+07  3.434E+02  4.887E+00
   20  1.900E+02  2.125E-05  8.941E+06 |  2.517E+07  3.420E+02  4.765E+00
   21  2.000E+02  1.894E-05  1.056E+07 |  2.730E+07  3.754E+02  5.382E+00
   22  2.100E+02  2.091E-05  1.004E+07 |  2.767E+07  3.812E+02  5.282E+00
   23  2.200E+02  2.011E-05  1.094E+07 |  2.885E+07  3.998E+02  5.393E+00
   24  2.300E+02  2.136E-05  1.077E+07 |  2.915E+07  4.047E+02  5.296E+00
   25  2.400E+02  2.015E-05  1.191E+07 |  3.053E+07  4.268E+02  5.496E+00
   26  2.500E+02  2.228E-05  1.122E+07 |  3.047E+07  4.258E+02  5.390E+00
   27  2.600E+02  2.144E-05  1.213E+07 |  3.110E+07  4.360E+02  5.365E+00
   28  2.700E+02  2.212E-05  1.221E+07 |  3.142E+07  4.412E+02  5.290E+00
   29  2.800E+02  2.111E-05  1.326E+07 |  3.249E+07  4.588E+02  5.417E+00
   30  2.900E+02  2.259E-05  1.284E+07 |  3.272E+07  4.626E+02  5.337E+00
   31  3.000E+02  2.284E-05  1.313E+07 |  3.294E+07  4.663E+02  5.262E+00
   32  4.000E+02  2.256E-05  1.773E+07 |  3.550E+07  5.098E+02  5.818E+00
   33  6.000E+02  2.549E-05  2.354E+07 |  4.022E+07  5.921E+02  6.632E+00
   34  8.000E+02  2.817E-05  2.840E+07 |  4.567E+07  6.883E+02  7.296E+00
   35  1.000E+03  3.253E-05  3.074E+07 |  4.887E+07  7.452E+02  7.451E+00
   36  2.000E+03  4.496E-05  4.448E+07 |  5.553E+07  8.657E+02  8.013E+00
   37  5.000E+03  6.135E-05  8.150E+07 |  7.983E+07  1.312E+03  1.090E+01
   38  1.000E+04  8.579E-05  1.166E+08 |  1.070E+08  1.814E+03  1.284E+01
   39  2.000E+04  1.294E-04  1.546E+08 |  1.339E+08  2.315E+03  1.426E+01
   40  3.000E+04  1.722E-04  1.742E+08 |  1.523E+08  2.659E+03  1.493E+01
   41  4.000E+04  2.161E-04  1.851E+08 |  1.647E+08  2.890E+03  1.524E+01
   42  5.000E+04  2.594E-04  1.928E+08 |  1.735E+08  3.056E+03  1.539E+01
   43  1.000E+05  4.534E-04  2.206E+08 |  1.847E+08  3.266E+03  1.575E+01
   44  2.000E+05  7.784E-04  2.569E+08 |  1.996E+08  3.548E+03  1.648E+01
   45  3.000E+05  1.110E-03  2.703E+08 |  2.123E+08  3.787E+03  1.701E+01
   46  5.000E+05  1.697E-03  2.946E+08 |  2.256E+08  4.039E+03  1.762E+01
   47  1.000E+06  3.276E-03  3.053E+08 |  2.370E+08  4.255E+03  1.806E+01
   48  2.000E+06  6.373E-03  3.138E+08 |  2.468E+08  4.440E+03  1.839E+01
   49  3.000E+06  9.489E-03  3.162E+08 |  2.547E+08  4.590E+03  1.858E+01
   50  5.000E+06  1.569E-02  3.187E+08 |  2.612E+08  4.714E+03  1.870E+01
   51  1.000E+07  3.134E-02  3.191E+08 |  2.666E+08  4.816E+03  1.874E+01
*SPOT2*----------------------------------------------------------------

------------------------
COMMS1: Message Pingpong
------------------------
Result Summary
--------------

-------------------
(2) KEY SPOT VALUES
-------------------
                                   -----------------------
*KEY1*  Shortest n = 8.000E+00 B,  | t = 1.260E-05 s     | ******
                                   |                     | ******
*KEY2*  Longest  n = 1.000E+07 B,  | r = 3.191E+08 B/s   | ******
                                   -----------------------

-------------------------------------------------------------------------

------------------------------------------
(3) BEST TWO-PARAMETER LINEAR-(t vs n) FIT
------------------------------------------
(Minimises sum of squares of relative error at all points being fitted)

Root Mean Square (RMS) Relative Error in time = 18.74 %
Maximum Relative Error in time                = 43.61 % at POINT = 1

This is a fit to ALL points. Even though different expressions are
given for short and long messages, they are algebraically identical
and either may be used for any message length in the full range.

--------------
Short Messages
--------------
Best expressions to use if nhalf > 0 and n <= nhalf = 4.816E+03 B

Bandwidth fitted to: r = pi0*n/(1+n/nhalf)
Time fitted to:      t = t0*(1+n/nhalf)

         --------------------------------------------
*LIN1*   | pi0 = 5.536E+04 Hz, nhalf= 4.816E+03 B   | ******
         |                                          | ******
*LIN2*   | t0 = 1/pi0 = 1.807E-05 s                 | ******
         --------------------------------------------

Spot comparison at POINT = 1, n = 8.000E+00 B
t(fit) = 1.810E-05 s, t(measured) = 1.260E-05 s, relative error in time = 43.6 %

-------------
Long Messages
-------------
Best expressions to use if n > nhalf = 4.816E+03 B, or nhalf=0

Bandwidth fitted to: r = rinf/(1+nhalf/n)
Time fitted to:      t = (n+nhalf)/rinf

         -----------------------------------------------
*LIN3*   | rinf = 2.666E+08 B/s, nhalf = 4.816E+03 B   | ******
         -----------------------------------------------

Spot comparison at POINT = 51, n = 1.000E+07 B
r(fit) = 2.665E+08 B/s, r(measured) = 3.191E+08 B/s, relative error in B/W = -16.5 %

-------------------------------------------------------------------------

---------------------------------------
(4) BEST 3-PARAMETER VARIABLE-POWER FIT
---------------------------------------

Root Mean Square (RMS) Relative Error in B/W = 6.89 %
Maximum Relative Error in B/W                = -13.41 % at POINT = 39

This fit is to ALL data points

Bandwidth is fitted to: rvp = rivp/(1+(navp/n)^gamvp)^(1/gamvp)
Time is fitted to:      tvp = t0vp*(1+(n/navp)^gamvp)^(1/gamvp)
where                   t0vp = navp/rivp and navp = t0vp*rivp

When gamvp = 1.0, this form reduces to the linear-time form (3) above,
navp becomes nhalf, and rivp becomes rinf.

The three independent parameters are (t0vp is derived):

          -------------------------------------------------------------
*VPWR1*   | rivp = 3.475E+08 B/s, navp = 3.670E+03 B, gamvp = 4.190E-01 |
          |                                                             |
*VPWR2*   | t0vp = navp/rivp = 1.056E-05 s                              |
          -------------------------------------------------------------

This function is guaranteed to fit the first and last measured values
of time and bandwidth. It also fits the (interpolated) time and
bandwidth at n = navp.

--
Roger Hockney.   Checkout my new Web page at URL http://www.minnow.demon.co.uk
University of    and link to my new book: "The Science of Computer Benchmarking"
Westminster UK   suggestions welcome. Know any fish movies or suitable links?
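The fit formulae printed in the COMMS1 output above can be evaluated directly and checked against the spot comparisons it reports. The sketch below is not part of COMMS1; the Python function names are illustrative, while the formulae, parameter values and measured spot values are copied from the output file.

```python
def time_two_param(n, t0, nhalf):
    """Two-parameter linear-time model: t = t0*(1 + n/nhalf)."""
    return t0 * (1.0 + n / nhalf)


def bandwidth_two_param(n, rinf, nhalf):
    """Two-parameter bandwidth model: r = rinf/(1 + nhalf/n)."""
    return rinf / (1.0 + nhalf / n)


def time_three_param(n, rivp, navp, gamvp):
    """Variable-power model: tvp = t0vp*(1 + (n/navp)^gamvp)^(1/gamvp)."""
    t0vp = navp / rivp  # derived parameter, as stated in the output
    return t0vp * (1.0 + (n / navp) ** gamvp) ** (1.0 / gamvp)


def bandwidth_three_param(n, rivp, navp, gamvp):
    """Variable-power model: rvp = rivp/(1 + (navp/n)^gamvp)^(1/gamvp)."""
    return rivp / (1.0 + (navp / n) ** gamvp) ** (1.0 / gamvp)


def relative_error(fitted, measured):
    """Signed relative error of a fitted value against a measured one."""
    return (fitted - measured) / measured


# Fitted parameters from sections (3) and (4) of the output.
T0, NHALF, RINF = 1.807e-05, 4.816e+03, 2.666e+08
RIVP, NAVP, GAMVP = 3.475e+08, 3.670e+03, 4.190e-01

# Measured key spot values from section (2).
N_SHORT, T_SHORT = 8.0, 1.260e-05
N_LONG, R_LONG = 1.0e+07, 3.191e+08

if __name__ == "__main__":
    t2 = time_two_param(N_SHORT, T0, NHALF)
    r2 = bandwidth_two_param(N_LONG, RINF, NHALF)
    print(f"2-param t(8 B)    = {t2:.3e} s, rel. error {relative_error(t2, T_SHORT):+.1%}")
    print(f"2-param r(1e7 B)  = {r2:.3e} B/s, rel. error {relative_error(r2, R_LONG):+.1%}")
    print(f"3-param t(8 B)    = {time_three_param(N_SHORT, RIVP, NAVP, GAMVP):.3e} s")
    print(f"3-param r(1e7 B)  = {bandwidth_three_param(N_LONG, RIVP, NAVP, GAMVP):.3e} B/s")
```

With the printed parameters, the two-parameter model misses the shortest and longest points by the amounts reported in the spot comparisons, while the three-parameter form passes close to both end points, as the output states.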
From owner-parkbench-comm@CS.UTK.EDU Mon Jan 19 13:10:51 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id NAA16306; Mon, 19 Jan 1998 13:10:50 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA21116; Mon, 19 Jan 1998 12:53:17 -0500 (EST) Received: from haze.vcpc.univie.ac.at (haze.vcpc.univie.ac.at [131.130.186.138]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id MAA21105; Mon, 19 Jan 1998 12:53:14 -0500 (EST) Received: (from smap@localhost) by haze.vcpc.univie.ac.at (8.8.6/8.8.6) id SAA21164 for ; Mon, 19 Jan 1998 18:53:11 +0100 (MET) From: Ian Glendinning Received: from fidelio(131.130.186.155) by haze via smap (V2.0beta) id xma021162; Mon, 19 Jan 98 18:52:48 +0100 Received: (from ian@localhost) by fidelio.vcpc.univie.ac.at (8.7.5/8.7.3) id SAA03411 for parkbench-comm@CS.UTK.EDU; Mon, 19 Jan 1998 18:52:48 +0100 (MET) Date: Mon, 19 Jan 1998 18:52:48 +0100 (MET) Message-Id: <199801191752.SAA03411@fidelio.vcpc.univie.ac.at> To: parkbench-comm@CS.UTK.EDU Subject: Re: Low Level benchmark errors and differences X-Sun-Charset: US-ASCII

Dear parkbench-comm subscriber,

I have been following the discussions regarding the low-level ParkBench benchmarks over the last couple of weeks with interest, but so far I have been content to keep my head below the parapet, as most of the things I would have said have been said by others anyway. However, there is one thing that I would like to point out.

On Wed Jan 7 22:56:04 1998, Charles Grassl wrote:

> The Low Level programs are obsolete and need to be replaced.

I agree that the existing code could use some improvement, though most of the discussion seems to have revolved around the version in the "current release", which as Roger has pointed out several times is very old, and he has written an improved version. Have people tried that version out?
> I have > written seven simple programs, with MPI and PVM versions, and offer them > as a replacement for the Low Level suite. I have tried a version of Charles's "comms1" code that he sent me, on our CS-2 system, and found that it reported approximately half the expected asymptotic bandwidth, so this code is not without its problems either! By "expected", I mean the bandwidth reported by various versions of (the ParkBench version of) COMMS1 over the years, coded using first PARMACS, then PVM, and more recently MPI, as a message-passing library. This value corresponds closely to what one would expect for the peak performance, given the performance figures for the underlying hardware. For an explanation of what I think is happening, please read on... On Thu Jan 15 20:20:36 1998, Charles Grassl wrote: > This recorded > time is for a round trip message, and is not precisely the time for > two messages. Half the round trip message passing time, as reported in > the PMB tests, is not the time for a single message and should not be > reported and such. This same erroneous technique is used in the COMMS1 > and COMMS2 two benchmarks. (Is Parkbench is responsible for propagating > this incorrect methodology.) As Pat Worley and Rolf Hempel pointed out, the ping-pong is used because of the difficulty in measuring the time for one-way messages, and I believe that this is illustrated in this instance, as it seems that Charles's attempt to time one-way messages has caused the unexpectedly low asymptotic bandwidth measurement... 
Charles's code executes a send, and then as fast as possible executes another one, without any concern as to whether the data has left the sending processor, or has arrived at the receiving processor, and what I think is happening is that his code is queuing requests to send, before the previous messages have left the sending processor, forcing the MPI implementation to buffer them, at the cost of an extra copy operation, which would not otherwise have been necessary, thus reducing the effective bandwidth! > With respect to low level testing, the round trip exchange of messages, > as per PingPing and PingPong in PMB or COMMS1 and COMMS2, is not > characteristic of the lowest level of communication. This pattern > is actually rather rare in programming practice. It is more common > for tasks to send single messages and/or to receive single messages. It seems to me that it is not very common programming practice to send a sequence of messages to the same destination in rapid fire, without having either done some intermediate processing, or waiting to get some response back. If you were trying to code efficiently, you would doubtless merge the messages into one, and send the data all together in one message, if it was all available already, which it must have been if you were able to execute the sends so rapidly one after another! > The single message passing is a distinctly different case from that > of round trip tests. We should be worried that the round trip testing > might introduce artifacts not characteristic of actual (low level) usage. > We need a better test of basic bandwidth and latency in order to measure > and characterize message passing performance. Well, it seems that in this case, the attempt to measure the single message passing case has introduced an artifact. 
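The size of the effect of an extra buffer copy, as suggested above, can be estimated with a little arithmetic. The sketch below is not taken from any of the benchmark codes: it assumes a simple model in which every byte is first copied into a buffer and then transferred, with no overlap between the two, and the bandwidth figures in it are made up rather than measured on the CS-2. The function name effective_bandwidth is hypothetical.

```python
def effective_bandwidth(link_bw, copy_bw=None):
    """Bandwidth in B/s seen by the application.

    link_bw -- bandwidth of the transfer itself, B/s
    copy_bw -- bandwidth of an extra, non-overlapped buffer copy, B/s,
               or None if the message is sent without being copied

    Assumed model: the time per byte is the sum of the copy time and
    the transfer time, so the two bandwidths combine harmonically.
    """
    if copy_bw is None:
        return link_bw
    return 1.0 / (1.0 / link_bw + 1.0 / copy_bw)


if __name__ == "__main__":
    link = 50.0e6  # made-up figure, not a CS-2 measurement
    print("no extra copy :", effective_bandwidth(link))
    # If the copy runs at the same speed as the link, the rate halves.
    print("copy at link speed:", effective_bandwidth(link, link))
    # A much faster copy costs correspondingly less.
    print("copy 10x faster   :", effective_bandwidth(link, 10.0 * link))
```

Under this model an extra copy that proceeds at about the same speed as the transfer would roughly halve the reported bandwidth; whether that is what actually happens in a given MPI implementation would have to be checked on the machine in question.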
To an extent it depends what you are trying to measure of course, but it has always been my understanding that the COMMS1 benchmark was trying to measure the peak performance that you could reasonably expect to obtain using a portable message-passing library interface, which, for a good implementation of MPI, ought to come close to the theoretical hardware limit, which is precisely what the existing COMMS1 ping-pong code does on our system. I would therefore argue in favour of retaining the ping-pong technique for obtaining timings. Ian -- Ian Glendinning European Centre for Parallel Computing at Vienna (VCPC) ian@vcpc.univie.ac.at Liechtensteinstr. 22, A-1090 Vienna, Austria Tel: +43 1 310 939612 WWW: http://www.vcpc.univie.ac.at/~ian/ From owner-parkbench-comm@CS.UTK.EDU Tue Jan 20 08:50:06 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA06977; Tue, 20 Jan 1998 08:50:06 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id IAA01200; Tue, 20 Jan 1998 08:28:44 -0500 (EST) Received: from sun1.ccrl-nece.technopark.gmd.de (sun1.ccrl-nece.technopark.gmd.de [193.175.160.67]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id IAA01193; Tue, 20 Jan 1998 08:28:39 -0500 (EST) Received: from sgi7.ccrl-nece.technopark.gmd.de (sgi7.ccrl-nece.technopark.gmd.de [193.175.160.89]) by sun1.ccrl-nece.technopark.gmd.de (8.7/3.4W296021412) with SMTP id OAA12945; Tue, 20 Jan 1998 14:19:53 +0100 (MET) Received: (from hempel@localhost) by sgi7.ccrl-nece.technopark.gmd.de (950413.SGI.8.6.12/950213.SGI.AUTOCF) id OAA09828; Tue, 20 Jan 1998 14:19:52 +0100 Date: Tue, 20 Jan 1998 14:19:52 +0100 From: hempel@ccrl-nece.technopark.gmd.de (Rolf Hempel) Message-Id: <199801201319.OAA09828@sgi7.ccrl-nece.technopark.gmd.de> To: cmg@cray.com Subject: Re: Low Level Benchmarks Cc: hempel@ccrl-nece.technopark.gmd.de, parkbench-comm@CS.UTK.EDU Reply-To: hempel@ccrl-nece.technopark.gmd.de Dear Charles, thank you for your note, 
and for sending me your simple test program. One thing I like about the program is that it's easy to install and run; no complicated makefiles, include files and sophisticated driver software. We had the code running in five minutes. In many points I agree with Ian Glendinning who already reported about his tests with your code on the Meiko system. When we ran the test on our SX-4, however, the results were very similar to ping-pong figures. With the particular MPI version I used for my measurements, the classical ping-pong test as implemented in MPPTEST of the MPICH distribution gives about 4 usec less time in latency and about 4% higher throughput than your test program. The reason for the increase in latency as reported by your code is fully explained by the fact that you forgot to correct for the time spent in the timer routine (see below). So, we would have no problem with adopting a corrected version of your code as the basic communication test. However, I think that this is not the point. The question we have to answer is what communication pattern we want to measure with our benchmark code. In my view the ping-pong technique, with all its problems, is much closer to a typical application than your program. Of course, the situation "receiver already waiting" implemented by the ping-pong, is a special case which will not be found for all messages in an application. In this situation, the MPI implementation can use a more efficient protocol, which will lead to a best case measurement of latency and throughput. I agree with Ian that the rapid succession of messages in one direction is very untypical. Only a stupid programmer would do it this way in an application, and not aggregate the messages to a larger one. What you really measure with this benchmark is how well the MPI library can deal with this kind of congestion. As you see, our library is not affected at all by this, but, as Ian reported, the Meiko shows a much different behaviour. 
In a sense, you measure a kind of worst case scenario, as opposed to the best case one in the ping-pong. One technical detail of your program: You time every send operation separately, and then sum up the individual times. This requires a quite accurate clock. I would expect that some machines could run into trouble with this approach. Also, you don't correct for the time needed for calling the timer twice for every send/receive. On machines with highly optimized MPI libraries this is not at all negligible. On our machine two timer calls require as much time as 25% of a complete send-receive sequence! As a summary, your basic communication program does not convince me as a better alternative to ping-pong programs such as MPPTEST. The only thing I really like about it is its simplicity. Best regards, Rolf ------------------------------------------------------------------------ Rolf Hempel (email: hempel@ccrl-nece.technopark.gmd.de) Senior Research Staff Member C&C Research Laboratories, NEC Europe Ltd., Rathausallee 10, 53757 Sankt Augustin, Germany Tel.: +49 (0) 2241 - 92 52 - 95 Fax: +49 (0) 2241 - 92 52 - 99 From owner-parkbench-comm@CS.UTK.EDU Wed Jan 21 11:22:06 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id LAA27346; Wed, 21 Jan 1998 11:22:06 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA20207; Wed, 21 Jan 1998 10:55:58 -0500 (EST) Received: from sun1.ccrl-nece.technopark.gmd.de (sun1.ccrl-nece.technopark.gmd.de [193.175.160.67]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id KAA20176; Wed, 21 Jan 1998 10:55:44 -0500 (EST) Received: from sgi7.ccrl-nece.technopark.gmd.de (sgi7.ccrl-nece.technopark.gmd.de [193.175.160.89]) by sun1.ccrl-nece.technopark.gmd.de (8.7/3.4W296021412) with SMTP id QAA01123; Wed, 21 Jan 1998 16:50:13 +0100 (MET) Received: (from hempel@localhost) by sgi7.ccrl-nece.technopark.gmd.de (950413.SGI.8.6.12/950213.SGI.AUTOCF) id QAA11663; Wed, 21 Jan 
1998 16:54:00 +0100 Date: Wed, 21 Jan 1998 16:54:00 +0100 From: hempel@ccrl-nece.technopark.gmd.de (Rolf Hempel) Message-Id: <199801211554.QAA11663@sgi7.ccrl-nece.technopark.gmd.de> To: parkbench-comm@CS.UTK.EDU Subject: NEW COMMS1 benchmark Cc: eckhard@ess.nec.de, tbeckers@ess.nec.de, lonsdale@ccrl-nece.technopark.gmd.de, maciej@ccrl-nece.technopark.gmd.de, ritzdorf@ccrl-nece.technopark.gmd.de, zimmermann@ccrl-nece.technopark.gmd.de, springstubbe@gmd.de, hempel@ccrl-nece.technopark.gmd.de Reply-To: hempel@ccrl-nece.technopark.gmd.de In the recent discussion on the low-level benchmarks, Roger repeatedly asked us to base our evaluation of the COMMS1 benchmark on his new version, and not on the one which is still in the official PARKBENCH distribution. At NEC we now have repeated the tests on the NEC SX-4 machine, and I would like to make a few comments on the results. First of all, the raw data as reported by the table Primary Measurements more closely match the figures given by other ping-pong tests than the older version. The correction for overheads, however, is still problematic for the following reasons: 1. In every loop iteration, the returned message is compared with the message sent. If one is concerned with the correctness of the MPI library, this could be checked in a separate loop before the timing loop. The check inside the timing loop, done only by the sender process, delays the sender and thus makes sure that the receiver is already waiting in the receive for the next message. This aggravates the "Receiver ready" situation which I discussed in an earlier mail. 2. The authors take great care in correcting for the overhead introduced by the do loop. This is done by the loop over the dummy routine before the main loop. On the other hand, the correction for the check routine call introduces an overhead of one timer call which is NOT taken into account.
(Here I assume that the internal clock is read out at a fixed point in time during every call of DWALLTIME00().) I would argue that on most machines the loop overhead per iteration is negligible as compared to a function call. On our machine, MPI_Wtime calls a C function which in turn calls an assembly language routine. The time needed for this is about 10% of our message latency! Another problem in the measuring procedure is that the test message contains a single constant, repeated as many times as there are words in the message. Did the authors never think about the possibility of data compression in interconnect systems? I would not be surprised to see bandwidths of Terabytes/sec on some Ethernet connection between workstations. Apart from this, the raw data are much better now than they were before, and when the above points were fixed, the resulting table would be satisfactory. The interesting question is, however, how much added value we get from the parameter fitting. In my earlier note, I called the fitting procedure in the earlier COMMS1 benchmark "crude". I cannot find a more appropriate word for a model which in cases deviates from the measured values by more than 100%. So, how much improvement do we get from the revised COMMS1 version? As Roger said himself, the increase in modeling sophistication led to a more complicated output file. Results are now given for two models, the first one using two parameters, and the second one three. As could be expected, the two-parameter model does not work better than in the previous version. For our machine, latency is over-estimated by 18.9 percent, and the bandwidth at the last data point is off by 27%. Since a linear model is just too simple to be applied to modern message-passing libraries, I wonder why these results are still in the output file at all. The three-parameter fit is better than the two-parameter one. 
The major advantage is that it exactly matches the first data point in time, and the last data point in bandwidth. That is what people would look at, if there were no parameter fitting at all. So, the reported latency is the time measured for a zero-byte message, and is as good or as bad as this measurement. For our MPI library, the RMS fitting error for the whole data set is 14.04%, and the maximum relative error is 33.4%. We now can discuss the meaning of the word "crude" (and I apologize if as a non-native speaker I don't use the right word here), but I would at least call it unsatisfactory. Given those differences between model and measurements, I was not surprised to see the projected RINFINITY as being too high. The 7.65 GBytes/s are well beyond a memcpy operation in our shared memory, and measured rates never exceeded 7.1 GBytes/s. To summarize, in my opinion there is no added value given by the parameter fitting. The latency value is the first entry in the raw data table, and the asymptotic bandwidth is easy to figure out by just looking at the bandwidths as measured for very long messages. As explained above, the extrapolation by the parametrized model does not add any precision as compared with a guess based on the long-message table entries. For message lengths in between, what does a model help me if it deviates from the measurements by up to 33%? So, my conclusion would be to drop the whole parameter fitting from the PARKBENCH low-level routines. In a separate mail I will send the COMMS1 benchmark output, as produced with our MPI library, to Roger. I don't want to swamp the whole PARKBENCH forum with the detailed data. 
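For reference, error figures of the kind quoted above (an RMS relative error and a maximum relative error over the whole data set) can be computed along the following lines. This is only a sketch: the exact definitions used inside COMMS1 (sign convention, and whether the measured or the fitted value is the denominator) are assumptions here, the function names are hypothetical, and the data in the example are invented, not the SX-4 results.

```python
import math


def relative_errors(measured, fitted):
    """Signed relative errors (fitted - measured) / measured, one per point."""
    return [(f - m) / m for m, f in zip(measured, fitted)]


def rms_relative_error(measured, fitted):
    """Root-mean-square of the relative errors over all data points."""
    errs = relative_errors(measured, fitted)
    return math.sqrt(sum(e * e for e in errs) / len(errs))


def max_relative_error(measured, fitted):
    """Signed relative error of largest magnitude, and its 1-based point number."""
    errs = relative_errors(measured, fitted)
    worst = max(range(len(errs)), key=lambda i: abs(errs[i]))
    return errs[worst], worst + 1


if __name__ == "__main__":
    # Invented bandwidth values (B/s), for illustration only.
    measured = [100.0, 200.0, 400.0]
    fitted = [110.0, 170.0, 400.0]
    err, point = max_relative_error(measured, fitted)
    print("RMS relative error = %.2f %%" % (100.0 * rms_relative_error(measured, fitted)))
    print("Maximum relative error = %.2f %% at POINT = %d" % (100.0 * err, point))
```

Because a single outlying point dominates the maximum and contributes heavily to the RMS, these two numbers say little about how the fit behaves over the rest of the range.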
Best regards, Rolf ------------------------------------------------------------------------ Rolf Hempel (email: hempel@ccrl-nece.technopark.gmd.de) Senior Research Staff Member C&C Research Laboratories, NEC Europe Ltd., Rathausallee 10, 53757 Sankt Augustin, Germany Tel.: +49 (0) 2241 - 92 52 - 95 Fax: +49 (0) 2241 - 92 52 - 99 From owner-parkbench-comm@CS.UTK.EDU Fri Jan 23 12:24:12 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id MAA07290; Fri, 23 Jan 1998 12:24:11 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA06737; Fri, 23 Jan 1998 12:04:42 -0500 (EST) Received: from post.mail.demon.net (post-10.mail.demon.net [194.217.242.154]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA06686; Fri, 23 Jan 1998 12:04:23 -0500 (EST) Received: from minnow.demon.co.uk ([158.152.73.63]) by post.mail.demon.net id aa1003594; 23 Jan 98 16:49 GMT Message-ID: <1GgxMFAgVMy0EwfI@minnow.demon.co.uk> Date: Fri, 23 Jan 1998 16:29:20 +0000 To: hempel@ccrl-nece.technopark.gmd.de Cc: parkbench-comm@CS.UTK.EDU, eckhard@ess.nec.de, tbeckers@ess.nec.de, lonsdale@ccrl-nece.technopark.gmd.de, maciej@ccrl-nece.technopark.gmd.de, ritzdorf@ccrl-nece.technopark.gmd.de, zimmermann@ccrl-nece.technopark.gmd.de, springstubbe@gmd.de From: Roger Hockney Subject: Re: NEW COMMS1 benchmark In-Reply-To: <199801211554.QAA11663@sgi7.ccrl-nece.technopark.gmd.de> MIME-Version: 1.0 X-Mailer: Turnpike Version 3.03a To: The Parkbench discussion group From: Roger Hockney First the 3-parameter fit that is produced by New COMMS1 and discussed by Rolf can be found in the html version of this reply at: www.minnow.demon.co.uk/Pbench/emails/hempel1.htm Or by bringing up the PICT tool on your browser at: www.minnow.demon.co.uk/pict/source/pict2a.html Then: (1) select a suitable frame size for the PICT display (2) change the data URL at top from .../data/t3e.res to .../data/sx4.res (3) press the "GET DATA at URL" button, and the 
data should download. (4) press the 3-PARA button then the APPLY3 button, and the 3-para curve should be drawn. ************************************************************************ Rolf has especially asked me to point out that the results that he has supplied are for the SX4 using Release 7.2 MPI software, which will soon be replaced by a newer version with significantly better latency and bandwidth. This data does not therefore represent the best that can be achieved on the SX4. ************************************************************************ I now reply to specific points in Rolf Hempel's email to the group on 21 Jan 1998. >In the recent discussion on the low-level benchmarks, Roger repeatedly >asked us to base our evaluation of the COMMS1 benchmark on his new >version, and not on the one which is still in the official PARKBENCH >distribution. At NEC we now have repeated the tests on the NEC SX-4 >machine, and I would like to make a few comments on the results. > Thank you, Rolf, for taking the trouble to install New COMMS1 and sending me the results. I discuss the results below. In answer to your other points: >First of all, the raw data as reported by the table Primary Measurements >more closely match the figures given by other ping-pong tests than the >older version. The correction for overheads, however, is still The two points you raise could easily be incorporated in the code. I was reluctant to tamper with the measurement part of the COMMS1 code because it would introduce systematic differences in the measurements and make comparison with older measurements invalid. But of course this has to be done from time to time. My changes were deliberately kept to a minimum and confined largely to the parameter fitting part which was causing the main problems being reported. >Another problem in the measuring procedure is that the test message >contains a single constant, repeated as many times as there are words in >the message.
Did the authors never think about the possibility of >data compression in interconnect systems? I would not be surprised to >see bandwidths of Terabytes/sec on some Ethernet connection between >workstations. Yes I did think about this, but decided I did not know enough about compression to devise a way to prevent it. Compression algorithms are so clever now that this may be impossible to do. Anyway this is not yet a problem, so I suggest we leave it until it becomes one. Perhaps software should get benefit in its performance numbers for the use of compression but then we need something more difficult than a sequence of constants to use as a standard test. > >Apart from this, the raw data are much better now than they were before, >and when the above points were fixed, the resulting table would be >satisfactory. I would have no objection to this. >The interesting question is, however, how much added value >we get from the parameter fitting. In my earlier note, I called the The added value provided in the case of the NEC SX4 results is that the 3-parameter fit (see graph) gives a satisfactory fit to ALL the data. This reduces 112 numbers to 3 numbers and an analytic formula that can be manipulated. This is called "Performance Characterisation" and provides very useful data compression. Furthermore the parameters themselves can be interpreted as characterising various aspects of the shape and asymptotes of the performance curve. In contrast reporting just the first time and last performance value and calling them the Latency and Bandwidth only tells us about these two points. Further the choice of which message lengths to use for this type of definition is entirely arbitrary and open to much argument at both ends. However, New COMMS1 does provide this type of output in the lines marked KEY SPOT VALUES but I deliberately avoided calling them values of Latency and Bandwidth in order to avoid senseless argument. 
Some people are very interested in the parametric representations, others not. One is not obliged to use or look at the parametric representations, but they are there for those who want them. For those interested just in the Raw data those are reported first in the output file of New COMMS1. >As could be expected, the two-parameter model does not work better than >in the previous version. For our machine, latency is over-estimated >by 18.9 percent, and the bandwidth at the last data point is off by >27%. Since a linear model is just too simple to be applied to modern >message-passing libraries, I wonder why these results are still in the >output file at all. The 2-PARA results are reported just so that one can see that they are unsatisfactory, and that therefore one must lose simplicity and consider a 3-para fit. Actually there is a switch that can be set in the comms1.inc file to suppress reporting of output if the errors exceed specified values. Every time I have used this, however, I have tended to rerun with the output on, in order to see just what the 2-para gave. If the 2-para can be accepted it is much preferable to the 3-para because of its simplicity and clearer interpretation of the significance of the parameters. >as bad as this measurement. For our MPI library, the RMS fitting >error for the whole data set is 14.04%, and the maximum relative error >is 33.4%. We now can discuss the meaning of the word "crude" (and I If you look at the graph itself (see above), I think you will find the agreement much more satisfactory than is apparent from the reported errors. You also may have too high an expectation of what parametric fitting can reasonably be expected to provide, especially for data with discontinuities. In my experience agreement in RMS error rarely is better than 7% and anything up to 30% is probably still useful. A maximum error of 30% is not bad at all, and may be due to a single rogue point or an isolated discontinuity. 
Although error numbers are reported in the output, one really has to look at the graph of all data before drawing conclusions. >Given those differences >between model and measurements, I was not surprised to see the >projected RINFINITY as being too high. The 7.65 GBytes/s are well >beyond a memcpy operation in our shared memory, and measured rates never >exceeded 7.1 GBytes/s. Actually 7.65 differs from 7.1 by 8% which is very good agreement indeed. >To summarize, in my opinion there is no added value given by the >parameter fitting. The latency value is the first entry in the raw >data table, and the asymptotic bandwidth is easy to figure out by just >looking at the bandwidths as measured for very long messages. As Your definitions of Latency and Bandwidth will have to be more precise than the above. What does "by looking at the B/W for very long messages" actually mean. What are "very long messages?". "What message length should the first entry in the Raw data table be for?" ... etc. >explained above, the extrapolation by the parametrized model does not >add any precision as compared with a guess based on the long-message >table entries. Strictly-speaking it is invalid to extrapolate the fitted curve outside the range of measured values. However we will always do this, and in this case the fit predicts the known hardware limit as well as can be reasonably expected. >For message lengths in between, what does a model help >me if it deviates from the measurements by up to 33%? So, my conclusion >would be to drop the whole parameter fitting from the PARKBENCH >low-level routines. I think the graph of the results and the 3-para fit shows remarkably good and useful agreement. But this is a subjective personal opinion. What do others think? Best wishes Roger -- Roger Hockney. Checkout my new Web page at URL http://www.minnow.demon.co.uk University of and link to my new book: "The Science of Computer Benchmarking" Westminster UK suggestions welcome. 
Know any fish movies or suitable links? From owner-parkbench-comm@CS.UTK.EDU Mon Jan 26 06:39:21 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id GAA02920; Mon, 26 Jan 1998 06:39:21 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA09063; Mon, 26 Jan 1998 06:22:47 -0500 (EST) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA09055; Mon, 26 Jan 1998 06:22:36 -0500 (EST) Received: from mordillo (p112.nas1.is3.u-net.net) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA12226; Mon, 26 Jan 98 11:19:15 GMT Date: Mon, 26 Jan 98 10:14:38 GMT From: Mark Baker Subject: Re: Low Level Benchmarks To: Charles Grassl , parkbench-comm@CS.UTK.EDU Cc: solchenbach@pallas.de X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. X-Priority: 3 (Normal) References: <199801151711.RAA07227@magnet.cray.com> Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Charles, Thanks for your thoughts and experiences with the Pallas PMB codes - I will forward them to the authors... The main points in favour of the PMB codes are that they are in C and potentially produce results for a variety of MPI calls... Obviously if the results they produce are flawed... Regarding new low-level codes I would be in favour of taking up your kind offer of writing a set of codes in C/Fortran. I guess the main problem is getting a consensus with regard to the methodology and measurements that are used with these codes. Maybe we can decide that a number of actions should be undertaken... 1) It seems clear that no one is 100% happy with the current version of the low-level codes. So, this implies that they need to be replaced !? 2) If we are going to replace the codes we can go down a couple of routes; start from scratch, replace with Roger's new codes or some combination of both...
3) I would be happy to see us start from scratch and create C/Fortran codes where the methodology and design of each can be "hammered out" by discussion first and then implemented (and iterated as necessary). 4) Assuming that we want to go down this route, I suggest we make a starting point of Charles' "suggestions and requirements for the low level benchmark design" - towards the end of this email. I am happy to put these words on the web and update/change them as our discussions evolve... 5) Charles has offered his services to help write/design/test these new codes - I'm willing to offer my services in a similar fashion. I'm sure that others interested in the low-level codes could contribute something here as well. Overall, it seems clear to me that we have enough energy and manpower to produce a new set of low-level codes whose methodology and design is correct and relevant to today's systems... I look forward to your comments... Regards Mark --- On Thu, 15 Jan 1998 11:11:39 -0600 (CST) Charles Grassl wrote: > > To: Parkbench interests > From: Charles Grassl > Subject: Low Level benchmarks > > Date: 15 January, 1998 > > > Mark, thank you for pointing us to the PMB benchmark. It is well written > and coded, but has some discrepancies and shortcomings. My comments > lead to suggestions and recommendation regarding low level communication > benchmarks. > > First, in program PMB the PingPong tests are twice as fast (in time) > as the corresponding message length tests in the PingPing tests (as run > on a CRAY T3E). The calculation of the time and bandwidth is incorrect > by a factor of 100% in one of the programs. > > This error can be fixed by recording, using and reporting the actual > time, amount of data sent and their ratio. That is, the time should not > be divided by two in order to correct for a round trip. This recorded > time is for a round trip message, and is not precisely the time for > two messages.
Half the round trip message passing time, as reported in > the PMB tests, is not the time for a single message and should not be > reported and such. This same erroneous technique is used in the COMMS1 > and COMMS2 two benchmarks. (Is Parkbench is responsible for propagating > this incorrect methodology.) > > In program PMB, the testing procedure performs a "warm up". This > procedure is a poor testing methodology because is discards important > data. Testing programs such as this should record all times and calculate > the variance and other statistics in order to perform error analysis. > > Program PMB does not measure contention or allow extraction of network > contention data. Tests "Allreduce" and "Bcast" and several others > stress the inter-PE communication network with multiple messages, > but it is not possible to extract information about the contention from > these tests. The MPI routines for Allreduce and Bcast have algorithms > which change with respect to number of PEs and message lengths, Hence, > without detailed information about the specific algorithms used, we cannot > extract information about network performance or further characterize > the inter-PE network. > > Basic measurements must be separated from algorithms. Tests PingPong, > PingPing, Barrier, Xover, Cshift and Exchange are low level. Tests > Allreduce and Bcast are algorithms. The algorithms Allreduce and Bcast > need additional (algorithmic) information in order to be described in > terms of the basic level benchmarks. > > > With respect to low level testing, the round trip exchange of messages, > as per PingPing and PingPong in PMB or COMMS1 and COMMS2, is not > characteristic of the lowest level of communication. This pattern > is actually rather rare in programming practice. It is more common > for tasks to send single messages and/or to receive single messages. > In this scheme, messages do not make a round trip and there is not > necessarily caching or other coherency effects. 
> > The single message passing is a distinctly different case from that > of round trip tests. We should be worried that the round trip testing > might introduce artifacts not characteristic of actual (low level) usage. > We need a better test of basic bandwidth and latency in order to measure > and characterize message passing performance. > > > Here are suggestions and requirements, in an outline form, for a low > level benchmark design: > > > > I. Single and double (bidirectional) messages. > > A. Test single messages, not round trips. > 1. The round trip test is an algorithm and a pattern. As > such it should not be used as the basic low level test of > bandwidth. > 2. Use direct measurements where possible (which is nearly > always). For experimental design, the simplest method is > the most desirable and best. > 3. Do not perform least squares fits A PIORI. We know that > the various message passing mechanisms are not linear or > analytic because different mechanisms are used for different > message sizes. It is not necessarily known before hand > where this transition occurs. Some computer systems have > more than two regimes and their boundaries are dynamic. > 4. Our discussion of least squares fitting is loosing tract > of experimental design versus modeling. For example, the > least squares parameter for t_0 from COMMS1 is not a better > estimate of latency than actual measurements (assuming > that the timer resolution is adequate). A "better" way to > measure latency is to perform addition DIRECT measurements, > repetitions or otherwise, and hence decrease the statistical > error. The fitting as used in the COMMS programs SPREADS > error. It does not reduce error and hence it is not a > good technique for measuring such an important parameter > as latency. > > B. Do not test zero length messages. Though valid, zero length > messages are likely to take special paths through library > routines. 
This special case is not particularly interesting or > important. > 1. In practice, the most common and important message size is 64 > bits (one word). The time for this message is the starting > point for bandwidth characterization. > > D. Record all times and use statistics to characterize the message > passing time. That is, do not prime or warm up caches > or buffers. Timings for unprimed caches and buffers give > interesting and important bounds. These timings are also the > nearest to typical usage. > 1. Characterize message rates by a minimum, maximum, average > and standard deviation. > > E. Test inhomogeneity of the communication network. The basic > message test should be performed for all pairs of PEs. > > > II. Contention. > > A. Measure network contention relative to all PEs sending and/or > receiving messages. > > B. Do not use high level routines where the algorithm is not known. > 1. With high level algorithms, we cannot deduce which component > of the timing is attributable to the "operation count" > and which is attributable to the actual system (hardware) > performance. > > > III. Barrier. > > A. Simple test of barrier time for all numbers of processors. > > > > > Additionally, the suite should be easy to use. C and Fortran programs > for direct measurements of message passing times are short and simple. > These simple tests are of order 100 lines of code and, at least in > Fortran 90, can be written in a portable and reliable manner. > > The current Parkbench low level suite does not satisfy the above > requirements. It is inaccurate, as pointed out by previous letters, and > uses questionable techniques and methodologies. It is also difficult to > use, witness the proliferation of files, patches, directories, libraries > and the complexity and size of the Makefiles. > > This Low Level suite is a burden for those who are expecting a tool to > evaluate and investigate computer performance. The suite is becoming > a liability for our group. 
As such, it should be withdrawn from > distribution. > > I offer to write, test and submit a new set of programs which satisfy > most of the above requirements. > > > Charles Grassl > SGI/Cray Research > Eagan, Minnesota USA > ---------------End of Original Message----------------- ------------------------------------- CSM, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 01/26/98 - Time: 10:14:38 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Mon Jan 26 11:54:37 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id LAA07118; Mon, 26 Jan 1998 11:54:37 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA18845; Mon, 26 Jan 1998 11:21:36 -0500 (EST) Received: from timbuk.cray.com (timbuk-fddi.cray.com [128.162.8.102]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id LAA18837; Mon, 26 Jan 1998 11:21:33 -0500 (EST) Received: from ironwood.cray.com (root@ironwood-fddi.cray.com [128.162.21.36]) by timbuk.cray.com (8.8.7/CRI-gate-news-1.3) with ESMTP id KAA23428 for ; Mon, 26 Jan 1998 10:21:26 -0600 (CST) Received: from magnet.cray.com (magnet [128.162.173.162]) by ironwood.cray.com (8.8.4/CRI-ironwood-news-1.0) with ESMTP id KAA29079 for ; Mon, 26 Jan 1998 10:21:24 -0600 (CST) From: Charles Grassl Received: by magnet.cray.com (8.8.0/btd-b3) id QAA29329; Mon, 26 Jan 1998 16:21:23 GMT Message-Id: <199801261621.QAA29329@magnet.cray.com> Subject: Low Level Benchmarks To: parkbench-comm@CS.UTK.EDU Date: Mon, 26 Jan 1998 10:21:23 -0600 (CST) X-Mailer: ELM [version 2.4 PL24-CRI-d] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit To: Parkbench interests From: Charles Grassl Subject: Low Level benchmarks Date: 26, January, 1998 A short review of where we have been and decided: Last year we agreed (via email exchanges) that the Parkbench 
Low Level benchmark suite is not intended to be an -MPI- -test- suite.
There was a consensus that we intended to measure low level performance,
not algorithm design or implementation. This is why the Pallas benchmark,
though useful for testing the performance of several important MPI
functions, is not the basic low level test which we desire. (I believe
that the performance measurement of the MPI functions is a worthwhile
project for this group, but it needs to be separate from the low level
benchmarks.)

At the May 1997 Parkbench meeting in Knoxville, TN, we unanimously
decided that the measurement and analysis (fitting) portions of the
COMMS programs would be separated into two programs. This is from
Michael Berry's minutes (23 May 1997):

    After more discussion, the following COMMS changes/outputs were
    unanimously agreed upon:

    1. Maximum bandwidth with corresp. message size.
    2. Minimum message-passing time with corresp. message size.
    3. Time for minimum message length (could be 0, 1, 8, or 32 bytes
       but must be specified).
    4. The software will be split into two programs: one to report the
       spot measurements and the other for the analysis.

Some of the objections to the Parkbench Low Level codes are that they
are difficult to build, run and analyze. This is attributable to their
organization and design. Separating the analysis would greatly simplify
the programs, but the programs still need to be rewritten.

I include in this email message a simple replacement code for COMMS1.
It uses the "back and forth" methodology, reports maximum and minimum
times with corresponding sizes and does not include "analysis". It is
equivalent to the measurement portion of COMMS1, though it is much
simpler and easier to use.

I will comment on the experimental methodology used in this program.

- The reported times in standard out are actual round trip times.
  It is a poor experimental practice to modify raw measurements too
  early. We should not mix measured times with derived times.
  The practice leads to confusion and errors (witness the Pallas
  benchmark code and an earlier version of Parkbench). If we desire
  to divide the times by two (because of the round trip), then this
  should be done in an analysis portion. Otherwise we misrepresent
  round trip times as actual single trip times, which they are not.

- All times are saved and written to unit 7. The reported times in
  standard out are the first and the last measurements for each message
  size. The experimental principle is that no data should be discarded
  without analysis. We can use statistical analysis or graphics or
  fitting routines to analyze the raw output. (I favor graphics and
  statistical analysis.) If we look at the raw output, we will see
  interesting features, such as the actual "warm up" count (usually
  five or less repetitions) and the distribution of times (not
  Gaussian!).

- Each repetition is individually timed. If the timer does not have
  adequate resolution, then the times for a number of repetitions,
  from two to all, can be aggregated and used. This aggregation can
  be done in the analysis phase. (Most computers should be able to
  time and resolve single round trip messages.) This aggregation should
  not be done before adequate analysis or evidence that it needs to
  be done.

- Each message size is tested with the same number of repetitions.
  We prefer to keep this number a constant so that the experimental
  sampling error (proportional to 1/sqrt[repetitions]) is the same for
  each message size. Also, it is difficult to cleanly and simply adjust
  the repetition count relative to the message size.

I also have one replacement program for both COMMS2 and COMMS3 (note
that the COMMS2 measurement is a subset of COMMS3 measurements). More on
that later.

Charles Grassl
SGI/Cray Research
Eagan, Minnesota USA

-----------------------------------------------------------------------

      program Single
!     Compile: f90 file.f -l mpi
!     Run on at least two PEs; ranks 0 and 1 exchange the messages.

      character*40 Title
      data Title/' Single Messages --- MPI'/

      integer log2nmax,nmax,n_repetitions
      parameter (log2nmax=18,nmax=2**log2nmax,n_repetitions=50)
      integer n_starts,n_mess
      parameter (n_starts=2,n_mess=2)

      include 'mpif.h'
      integer ier,status(MPI_STATUS_SIZE)
      integer my_pe,npes
      integer log2n,n,nrep,i
      real*8 t_call,timer,tf(0:n_repetitions)
      real*8 A(0:nmax-1)
      save A

      call mpi_init( ier )
      call mpi_comm_rank(MPI_COMM_WORLD, my_pe, ier)
      call mpi_comm_size(MPI_COMM_WORLD, npes, ier)

!     radian is implicitly REAL; acos(1.0) is 0, so A is all zeros.
      radian=1
      do i=0,nmax-1
        A(i) = acos(radian)*i
      end do

!     Estimate the overhead of a single timer call.
      tf(0) = timer()
      do nrep=1,n_repetitions
        tf(nrep) = timer()
      end do
      t_call=(tf(n_repetitions)-tf(0))/n_repetitions

      if (my_pe.eq.0) then
        call table_top(Title,npes,n_starts,n_mess,n_repetitions,t_call)
      end if

!     Message lengths are 8*n bytes for n = 1, 2, 4, ..., nmax.
      do log2n=0,log2nmax
        n = 2**log2n
        call mpi_barrier( MPI_COMM_WORLD, ier )
        tf(0) = timer()
!       Each repetition is one round trip and is timed individually.
        do nrep=1,n_repetitions
          if (my_pe.eq.1) then
            call MPI_SEND(A,8*n,MPI_BYTE,0,10,MPI_COMM_WORLD,ier)
            call MPI_RECV(A,8*n,MPI_BYTE,0,20,MPI_COMM_WORLD,status,ier)
          end if
          if (my_pe.eq.0) then
            call MPI_RECV(A,8*n,MPI_BYTE,1,10,MPI_COMM_WORLD,status,ier)
            call MPI_SEND(A,8*n,MPI_BYTE,1,20,MPI_COMM_WORLD,ier)
          end if
          tf(nrep) = timer()
        end do
        if (my_pe.eq.0) then
          call table_body(8*n,n_mess,n_repetitions,tf,t_call)
        end if
      end do

      call mpi_finalize(ier)
      end

      subroutine table_top( Title,npes,
     .                      n_starts,n_mess,n_repetitions,t_call)
      integer M
      parameter (M = 1000000)
      character*40 Title
      integer npes,n_starts,n_mess,n_repetitions
      real*8 t_call

      write(6,9010) Title,npes,n_starts,n_mess,n_repetitions,t_call*M
      return

 9010 format(//a40,
     .  // ' Number of PEs: ',i8,
     .  // ' Starts: ',i8,
     .  /  ' Messages: ',i8,
     .  /  ' Repetitions: ',i8,
     .  /  ' Timer overhead: ',f8.3,' microsecond',
     .  // 8x,' First ',
     .        ' Last ',
     .  /' Length',2x,2(' Time Rate ',1x),
     .  /' [Bytes]',2x,2(' [Microsec.] [Mbyte/s]',1x),
     .  /' ',8('-'),2x,2(21('-'),2x))
      end

      subroutine table_body(n_byte,n_mess,n_repetitions,tf,t_call)
      integer M
      parameter (M = 1000000)
      integer n_byte,n_mess,n_repetitions,i
      real*8 tf(0:n_repetitions)
      real*8 t_call
      real*8 t_first,t_last

!     Round trip times of the first and last repetitions, less the
!     timer overhead. They are reported as measured, not halved.
      t_first = (tf(1)-tf(0))-t_call
      t_last  = (tf(n_repetitions)-tf(n_repetitions-1))-t_call

      write(6,9020) n_byte,t_first*M,n_mess*n_byte/(t_first*M),
     .                     t_last *M,n_mess*n_byte/(t_last *M)

!     Every individual time goes to unit 7 (unformatted) for analysis.
      write(7) n_byte,n_repetitions,n_mess
      write(7) ((tf(i)-tf(i-1))-t_call,i=1,n_repetitions)
      return

 9020 format(i8, 2x,2(f10.1,1x,f10.0,2x))
      end

      real*8 function timer()
!     Assumption: the original message does not show timer(); this
!     stand-in returns wall-clock seconds from MPI_WTIME.
      include 'mpif.h'
      timer = MPI_WTIME()
      end

From owner-parkbench-comm@CS.UTK.EDU Mon Jan 26 13:06:36 1998
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id NAA08767; Mon, 26 Jan 1998 13:06:36 -0500
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA23400; Mon, 26 Jan 1998 12:31:16 -0500 (EST)
Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA23166; Mon, 26 Jan 1998 12:30:05 -0500 (EST)
Received: from mordillo ([195.102.195.125]) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA15447; Mon, 26 Jan 98 17:31:15 GMT
Date: Mon, 26 Jan 98 17:23:08 GMT
From: Mark Baker
Subject: Fw: Re: Low Level Benchmarks
To: parkbench-comm@CS.UTK.EDU
X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc.
X-Priority: 3 (Normal)
References: <34CCB99F.2B3C3D63@cumbria.eng.sun.com>
Message-Id:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

This came direct to me... The rest of Parkbench are probably
interested in Bodo's comments.

Mark

--- On Mon, 26 Jan 1998 08:28:15 -0800 Bodo Parady - SMCC Performance Development wrote:

> The key items to find are:
>
>     Lock time (defined as time to release a lock remotely)
>         Example would be reader spinning on memory, waiting
>         for change in memory word, or receipt of interrupt.
>         This is the effective ping-pong half time.
Sadly
>         subroutine and library call overhead can render
>         this result meaningless.
>
>         Measuring one way rates is no good here since the response
>         time must be factored in. This is a two-way transfer.
>
>     Channel rate (defined as large block transfer rate).
>
>     Block size at half channel rate.
>
>     Block size at twice lock time latency.
>
>     Full curve, stepping at 1, 2, 4, 8, 16, ..., 2^n byte block sizes
>         at full issue rate. This is probably the least important
>         since it involves coalescence of transmitted data.
>
> The fear is that given the limitations of MPI/PVM, and to some degree
> of C and Fortran, accurate measures of these quantities may
> not be practical.
>
> Regards.
>
> Bodo Parady
>
> Mark Baker wrote:
>
> > Charles,
> >
> > Thanks for your thoughts and experiences with the Pallas PMB codes -
> > I will forward them to the authors... The main points in favour of
> > the PMB codes are that they are in C and potentially produce results
> > for a variety of MPI calls... Obviously if the results they produce are
> > flawed...
> >
> > Regarding new low-level codes I would be in favour of taking up your
> > kind offer of writing a set of codes in C/Fortran. I guess the main
> > problem is getting a consensus with regard to the methodology and
> > measurements that are used with these codes.
> >
> > Maybe we can decide that a number of actions should be undertaken...
> >
> > 1) It seems clear that no one is 100% happy with the current version
> > of the low-level codes. So, this implies that they need to be
> > replaced !?
> >
> > 2) If we are going to replace the codes we can go down a couple of routes:
> > start from scratch, replace with Roger's new codes or some combination of
> > both...
> >
> > 3) I would be happy to see us start from scratch and create
> > C/Fortran codes where the methodology and design of each can be
> > "hammered out" by discussion first and then implemented
> > (and iterated as necessary).
> > > > 4) Assuming that we want to go down this route, I suggest we make a starting > > point of Charles' "suggestions and requirements for the low level > > benchmark design" - towards the end of this email. I am happy to > > put these words on the web and update/change them as our dicussions > > evolve... > > > > 5) Charles has offered his services to help write/design/test these new codes - > > I'm willing to offer my services in a similar fashion. I'm sure that others > > interested in the low-level codes could contribute something here as well. > > > > Overall, it seems clear to me that we have enough energy and manpower to > > produce a new set low-level codes whose methodology and design is correct > > and relevant to todays systems... > > > > I look forward to your comments... > > > > Regards > > > > Mark > > > > --- On Thu, 15 Jan 1998 11:11:39 -0600 (CST) Charles Grassl wrote: > > > > > > To: Parkbench interests > > > From: Charles Grassl > > > Subject: Low Level benchmarks > > > > > > Date: 15 January, 1998 > > > > > > > > > Mark, thank you for pointing us to the PMB benchmark. It is well written > > > and coded, but has some discrepancies and shortcomings. My comments > > > lead to suggestions and recommendation regarding low level communication > > > benchmarks. > > > > > > First, in program PMB the PingPong tests are twice as fast (in time) > > > as the corresponding message length tests in the PingPing tests (as run > > > on a CRAY T3E). The calculation of the time and bandwidth is incorrect > > > by a factor of 100% in one of the programs. > > > > > > This error can be fixed by recording, using and reporting the actual > > > time, amount of data sent and their ratio. That is, the time should not > > > be divided by two in order to correct for a round trip. This recorded > > > time is for a round trip message, and is not precisely the time for > > > two messages. 
Half the round trip message passing time, as reported in > > > the PMB tests, is not the time for a single message and should not be > > > reported and such. This same erroneous technique is used in the COMMS1 > > > and COMMS2 two benchmarks. (Is Parkbench is responsible for propagating > > > this incorrect methodology.) > > > > > > In program PMB, the testing procedure performs a "warm up". This > > > procedure is a poor testing methodology because is discards important > > > data. Testing programs such as this should record all times and calculate > > > the variance and other statistics in order to perform error analysis. > > > > > > Program PMB does not measure contention or allow extraction of network > > > contention data. Tests "Allreduce" and "Bcast" and several others > > > stress the inter-PE communication network with multiple messages, > > > but it is not possible to extract information about the contention from > > > these tests. The MPI routines for Allreduce and Bcast have algorithms > > > which change with respect to number of PEs and message lengths, Hence, > > > without detailed information about the specific algorithms used, we cannot > > > extract information about network performance or further characterize > > > the inter-PE network. > > > > > > Basic measurements must be separated from algorithms. Tests PingPong, > > > PingPing, Barrier, Xover, Cshift and Exchange are low level. Tests > > > Allreduce and Bcast are algorithms. The algorithms Allreduce and Bcast > > > need additional (algorithmic) information in order to be described in > > > terms of the basic level benchmarks. > > > > > > > > > With respect to low level testing, the round trip exchange of messages, > > > as per PingPing and PingPong in PMB or COMMS1 and COMMS2, is not > > > characteristic of the lowest level of communication. This pattern > > > is actually rather rare in programming practice. 
It is more common > > > for tasks to send single messages and/or to receive single messages. > > > In this scheme, messages do not make a round trip and there is not > > > necessarily caching or other coherency effects. > > > > > > The single message passing is a distinctly different case from that > > > of round trip tests. We should be worried that the round trip testing > > > might introduce artifacts not characteristic of actual (low level) usage. > > > We need a better test of basic bandwidth and latency in order to measure > > > and characterize message passing performance. > > > > > > > > > Here are suggestions and requirements, in an outline form, for a low > > > level benchmark design: > > > > > > > > > > > > I. Single and double (bidirectional) messages. > > > > > > A. Test single messages, not round trips. > > > 1. The round trip test is an algorithm and a pattern. As > > > such it should not be used as the basic low level test of > > > bandwidth. > > > 2. Use direct measurements where possible (which is nearly > > > always). For experimental design, the simplest method is > > > the most desirable and best. > > > 3. Do not perform least squares fits A PIORI. We know that > > > the various message passing mechanisms are not linear or > > > analytic because different mechanisms are used for different > > > message sizes. It is not necessarily known before hand > > > where this transition occurs. Some computer systems have > > > more than two regimes and their boundaries are dynamic. > > > 4. Our discussion of least squares fitting is loosing tract > > > of experimental design versus modeling. For example, the > > > least squares parameter for t_0 from COMMS1 is not a better > > > estimate of latency than actual measurements (assuming > > > that the timer resolution is adequate). A "better" way to > > > measure latency is to perform addition DIRECT measurements, > > > repetitions or otherwise, and hence decrease the statistical > > > error. 
The fitting as used in the COMMS programs SPREADS > > > error. It does not reduce error and hence it is not a > > > good technique for measuring such an important parameter > > > as latency. > > > > > > B. Do not test zero length messages. Though valid, zero length > > > messages are likely to take special paths through library > > > routines. This special case is not particularly interesting or > > > important. > > > 1. In practice, the most common and important message size is 64 > > > bits (one word). The time for this message is the starting > > > point for bandwidth characterization. > > > > > > D. Record all times and use statistics to characterize the message > > > passing time. That is, do not prime or warm up caches > > > or buffers. Timings for unprimed caches and buffers give > > > interesting and important bounds. These timings are also the > > > nearest to typical usage. > > > 1. Characterize message rates by a minimum, maximum, average > > > and standard deviation. > > > > > > E. Test inhomogeneity of the communication network. The basic > > > message test should be performed for all pairs of PEs. > > > > > > > > > II. Contention. > > > > > > A. Measure network contention relative to all PEs sending and/or > > > receiving messages. > > > > > > B. Do not use high level routines where the algorithm is not known. > > > 1. With high level algorithms, we cannot deduce which component > > > of the timing is attributable to the "operation count" > > > and which is attributable to the actual system (hardware) > > > performance. > > > > > > > > > III. Barrier. > > > > > > A. Simple test of barrier time for all numbers of processors. > > > > > > > > > > > > > > > Additionally, the suite should be easy to use. C and Fortran programs > > > for direct measurements of message passing times are short and simple. > > > These simple tests are of order 100 lines of code and, at least in > > > Fortran 90, can be written in a portable and reliable manner. 
> > > > > > The current Parkbench low level suite does not satisfy the above > > > requirements. It is inaccurate, as pointed out by previous letters, and > > > uses questionable techniques and methodologies. It is also difficult to > > > use, witness the proliferation of files, patches, directories, libraries > > > and the complexity and size of the Makefiles. > > > > > > This Low Level suite is a burden for those who are expecting a tool to > > > evaluate and investigate computer performance. The suite is becoming > > > a liability for our group. As such, it should be withdrawn from > > > distribution. > > > > > > I offer to write, test and submit a new set of programs which satisfy > > > most of the above requirements. > > > > > > > > > Charles Grassl > > > SGI/Cray Research > > > Eagan, Minnesota USA > > > > > > > ---------------End of Original Message----------------- > > > > ------------------------------------- > > CSM, University of Portsmouth, Hants, UK > > Tel: +44 1705 844285 Fax: +44 1705 844006 > > E-mail: mab@sis.port.ac.uk > > Date: 01/26/98 - Time: 10:14:38 > > URL http://www.sis.port.ac.uk/~mab/ > > ------------------------------------- > > > > ---------------End of Original Message----------------- ------------------------------------- CSM, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 01/26/98 - Time: 17:23:08 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Mon Jan 26 14:08:38 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id OAA11289; Mon, 26 Jan 1998 14:08:37 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id NAA02837; Mon, 26 Jan 1998 13:52:54 -0500 (EST) Received: from haven.EPM.ORNL.GOV (haven.epm.ornl.gov [134.167.12.69]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id NAA02817; Mon, 26 Jan 1998 13:52:50 -0500 (EST) Received: (from 
worley@localhost) by haven.EPM.ORNL.GOV (8.8.3/8.8.3) id NAA11755; Mon, 26 Jan 1998 13:52:49 -0500 (EST) Date: Mon, 26 Jan 1998 13:52:49 -0500 (EST) From: Pat Worley Message-Id: <199801261852.NAA11755@haven.EPM.ORNL.GOV> To: parkbench-comm@CS.UTK.EDU Subject: Re: Fw: Re: Low Level Benchmarks In-Reply-To: Mail from 'Mark Baker ' dated: Mon, 26 Jan 98 17:23:08 GMT Cc: worley@haven.EPM.ORNL.GOV (From Charles Grassl) > Last year we agreed (via email exchanges) that the Parkbench Low Level > benchmark suite is not intended to be an -MPI- -test- suite. There was a > consensus that we intended to measure low level performance, not algorithm > design or implementation. (From Bodo Parady via Mark Baker) > The fear is that given the limitations of MPI/PVM, and to some degree > of C and Fortran that accurate measures of these quantities may > not be practical. > I have a problem with attempting to determine low level communication performance parameters independent of the communication library when it a) is such a difficult task (I doubt that any portable program will be "accurate enough" across all the interesting platforms.) b) does not reflect what users would see in practice (since they will be using MPI or PVM in C or Fortran). Am I missing something? The primary utility (for me) of the low level benchmarks is to help explain the performance observed in the Parkbench kernels and compact applications, or in my own codes. What level of accuracy is required for such an application? Are more accurate or detailed measurements useful or doable? Upon reflection, such low(er) level performance data would be useful to the developer of a communication library, to help evaluate its performance, but that appears to require system-specific measurements (and system-specific interpretation). Is this really something we want to attempt? 
Pat Worley From owner-parkbench-comm@CS.UTK.EDU Thu Jan 29 16:29:33 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id QAA19023; Thu, 29 Jan 1998 16:29:33 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id QAA09768; Thu, 29 Jan 1998 16:18:48 -0500 (EST) Received: from haven.EPM.ORNL.GOV (haven.epm.ornl.gov [134.167.12.69]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id QAA09756; Thu, 29 Jan 1998 16:18:45 -0500 (EST) Received: (from worley@localhost) by haven.EPM.ORNL.GOV (8.8.3/8.8.3) id QAA01325; Thu, 29 Jan 1998 16:18:43 -0500 (EST) Date: Thu, 29 Jan 1998 16:18:43 -0500 (EST) From: Pat Worley Message-Id: <199801292118.QAA01325@haven.EPM.ORNL.GOV> To: parkbench-comm@CS.UTK.EDU Subject: Re: Fw: Re: Low Level Benchmarks In-Reply-To: Mail from 'Mark Baker ' dated: Mon, 26 Jan 98 17:23:08 GMT Cc: worley@haven.EPM.ORNL.GOV In a private exchange, Charles Grassl made a comment that he may come to regret: " We need more input, such as yours, as to what are the important parameters and what accuracy is needed. " so here are some random comments. I have been organizing my own performance data over the last couple of weeks. I never paid too much attention to the detailed output of my own ping-ping and ping-pong tests because it was not the end product of the research. It has been enlightening to look at it now. The entry point is http://www.epm.ornl.gov/~worley/studies/pt2pt.html I tried a couple of different fitting techniques, but decided that fits told me nothing that I was interested in. What I have found mildly interesting is to measure statistics of the data, and try to build a performance model using those. The difference is that the interpretation and value of the statistics (maximum observed bandwidth, time to send 0 length message, etc.) are not functions of any model error. 
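
The kind of model-free summary described above can be sketched as
follows. This is only an illustration, written in Python (which nothing
in this thread uses); the function name summarize and the sample numbers
are hypothetical. It assumes each raw measurement is a (message length
in bytes, measured time in seconds) pair with a positive time, and it
reports only observed quantities, with no fitted parameters.

```python
from statistics import mean, stdev


def summarize(samples):
    """Summarize raw (n_bytes, seconds) timings without fitting a model.

    samples: list of (message length in bytes, measured time in seconds);
    times are assumed to be strictly positive.
    Returns a dict of directly observed statistics.
    """
    if not samples:
        raise ValueError("no samples")

    # Group the raw times by message length; nothing is discarded.
    by_size = {}
    for n_bytes, seconds in samples:
        by_size.setdefault(n_bytes, []).append(seconds)

    per_size = {}
    for n_bytes, times in sorted(by_size.items()):
        per_size[n_bytes] = {
            "min": min(times),
            "max": max(times),
            "mean": mean(times),
            # The sample standard deviation needs two or more observations.
            "stdev": stdev(times) if len(times) > 1 else 0.0,
        }

    # Maximum observed bandwidth (bytes/second) and the length it occurred at.
    best_size, best_time = max(samples, key=lambda s: s[0] / s[1])
    # Best observed time for the smallest message length actually measured.
    smallest = min(by_size)
    return {
        "per_size": per_size,
        "max_bandwidth": best_size / best_time,
        "max_bandwidth_size": best_size,
        "min_length": smallest,
        "min_length_time": per_size[smallest]["min"],
    }


if __name__ == "__main__":
    # Hypothetical timings, for illustration only.
    demo = [(8, 2e-5), (8, 1e-5), (1024, 4e-5), (1024, 5e-5)]
    print(summarize(demo))
```

Because every number returned is a plain statistic of the measurements,
its meaning does not depend on whether a linear (or any other) timing
model holds on a given platform.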
The problem with fitting the data is that, no matter how often I tell
myself that it is simply a compact representation of the data, I keep
wanting to assign meaning to the model parameters and use them in
interplatform comparisons.

In summary, I have changed my mind. I no longer support even simple fits
to the data unless well-defined statistical measures of the data are
also included (and emphasized).

Pat Worley

From owner-parkbench-comm@CS.UTK.EDU Mon Feb 9 05:05:12 1998
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id FAA29859; Mon, 9 Feb 1998 05:05:11 -0500
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id EAA10483; Mon, 9 Feb 1998 04:57:14 -0500 (EST)
Received: from gatekeeper.pallas.de (gatekeeper.pallas.de [194.45.33.1]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id EAA10476; Mon, 9 Feb 1998 04:57:07 -0500 (EST)
Received: from mailhost.pallas.de (gatekeeper [194.45.33.1]) by gatekeeper.pallas.de (SMI-8.6/SMI-SVR4) with SMTP id KAA18803; Mon, 9 Feb 1998 10:50:10 +0100
Received: from schubert.pallas.de by mailhost.pallas.de (SMI-8.6/SMI-SVR4) id KAA03909; Mon, 9 Feb 1998 10:50:07 +0100
Received: from localhost by schubert.pallas.de (SMI-8.6/SMI-SVR4) id KAA11268; Mon, 9 Feb 1998 10:46:57 +0100
Date: Mon, 9 Feb 1998 10:46:45 +0100 (MET)
From: Hans Plum
X-Sender: hans@schubert
Reply-To: Hans Plum
To: cmg@cray.com, mab@sis.port.ac.uk, parkbench-comm@CS.UTK.EDU
cc: snelling@fecit.co.uk
Subject: Re: Low Level Benchmarks (fwd)
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="MimeMultipartBoundary"

--MimeMultipartBoundary
Content-Type: TEXT/PLAIN; charset=US-ASCII

Hi,

I am the "PMB person" at PALLAS GmbH. I have heard about your
discussions.

First note that there is a new version PMB1.2, see

http://www.pallas.de/pages/pmb.htm

Also look at the PMB1.2_doc.ps.gz where we try to give the reasoning
for all decisions made in PMB.
We think nothing has been designed sloppily.

PMB has been developed from the point of view of an application
developer, which I am. Of course a single person's view is limited, but
for myself the information given by PMB provides a solid base for
algorithmic estimates and decisions. That is exactly what we wanted:
something EASY (and not COMPLETE) that covers maybe 80% of the
realistic situations.

-------------------------------------------------------------  ---/---
Dr Hans-Joachim Plum        phone      : +49-2232-1896-0        /   /
PALLAS GmbH                 direct line: +49-2232-1896-18      / / /
Hermuelheimer Strasse 10    fax        : +49-2232-1896-29     / / / /
D-50321 Bruehl              email      : plum@pallas.de        / / /
Germany                     URL        : http://www.pallas.de   /   /
                                                                PALLAS
-------------------------------------------------------------  ---/---

--MimeMultipartBoundary--

From owner-parkbench-comm@CS.UTK.EDU Wed Apr 22 07:43:42 1998
Return-Path:
Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id HAA03238; Wed, 22 Apr 1998 07:43:41 -0400
Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA03111; Wed, 22 Apr 1998 07:05:23 -0400 (EDT)
Received: from post.mail.demon.net (post-10.mail.demon.net [194.217.242.39]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA03104; Wed, 22 Apr 1998 07:05:21 -0400 (EDT)
Received: from minnow.demon.co.uk ([158.152.73.63]) by post.mail.demon.net id aa1028865; 22 Apr 98 11:00 GMT
Message-ID:
Date: Wed, 22 Apr 1998 11:59:51 +0100
To: parkbench-comm@CS.UTK.EDU
From: Roger Hockney
Subject: Announcing PICT2.1 - Now fully Operational
MIME-Version: 1.0
X-Mailer: Turnpike Version 3.03a

To: the Parkbench discussion group
From: Roger

ANNOUNCING PICT 2.1 (1 Mar 1998)
--------------------------------

I am pleased to announce the first fully-functional version of the
Parkbench Interactive Curve-Fitting Tool (PICT). Provision is made for
a wide range of screen sizes in pixels by allowing the user to make a
suitable choice in the opening HTML page. All buttons now work.
In particular Jack can have his least-squares fitting of the 2-parameters direct from the tool, and this can be performed over partial ranges of the data as required. The same applies to the Three-point fitting procedure to obtain the 3-parameter fits. There is also a nice "Temperature Gauge" feature that helps you minimise the error during manual fitting. The results of these fits can be assembled in a results file and annotated using the SAVE buttons. Under MSIE I find I am able to store these results in my local disk file system using SAVE as ... ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The methodology of the 2-parameter curve fitting is given in detail in my book "The Science of Computer Benchmarking", see: http://www.siam.org/catalog/mcc07/hockney.htm The 3-parameter fit was described quite fully in my talk to the 11 Sep 1997 Parkbench meeting. I have finally written this up with pretty pictures for the PEMCS Web Journal. Look at: http://hpc-journals.ecs.soton.ac.uk/Workshops/PEMCS/fall-97/ talks/Roger-Hockney/perfprof1.html To try out PICT 2.1 please first try my own Demon Web space which has a counter from which I can judge usage: http://www.minnow.demon.co.uk/pict/source/pict2a.html If this gives problems, it is also mounted on the University of Westminster server: http://perun.hscs.wmin.ac.uk/LocalInfo/pict/source/pict2a.html We expect soon to make it available on the Southampton server. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ PICT 2.1 has been tested by a small number of friends. Most problems and frustrations arise from either slowness of the server or of the users' computer. If download from Demon is slow or appears to hang, try the other server or try Demon later. Please do not conclude the applet is broken. I am confident it is not. A 10 to 20 second wait is normal when bringing up the requested graphical window/frame even on a good day. 
Once the graphical window is on your computer and the applet is running, the speed is determined by the speed of your computer. You may even disconnect from the Web at this stage and continue curve fitting with the applet with the data displayed. If you want new data, you must, of course, reconnect to the Web and use the GET DATA at URL button. Experience shows that the PICT applet will not respond satisfactorily on a computer with a clock slower than 100 MHz. This is because a lot of complex calculations must be performed as you drag the curves around the data. MSIE seems to work noticeably faster than Netscape on my Win95 PC. There is no cure for this except to use a faster computer. But again please do not think the applet is broken. Please report experiences good or bad to: roger@minnow.demon.co.uk Constructive suggestions for improvement are also welcome. -- Roger Hockney. Check out my new Web page at URL http://www.minnow.demon.co.uk From owner-parkbench-comm@CS.UTK.EDU Sun Jun 21 10:02:47 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA22167; Sun, 21 Jun 1998 10:02:47 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA06272; Sun, 21 Jun 1998 09:47:56 -0400 (EDT) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA06265; Sun, 21 Jun 1998 09:47:54 -0400 (EDT) Received: from mordillo (p4.nas1.is5.u-net.net) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA17767; Sun, 21 Jun 98 14:50:05 BST Date: Sun, 21 Jun 98 14:43:42 +0000 From: Mark Baker Subject: New PEMCS papers To: parkbench-comm@CS.UTK.EDU X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. X-Priority: 3 (Normal) Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Dear All, Two new papers have just been published by the PEMCS journal...
3.Comparing The Performance of MPI on the Cray T3E-900, The Cray Origin2000 And The IBM P2SC, by Glenn R. Luecke and James J. Coyle Iowa State University, Ames, Iowa 50011-2251, USA. 4.EuroBen Experiences with the SGI Origin 2000 and the Cray T3E, by A.J. van der Steen, Computational Physics, Utrecht University, Holland* See http://hpc-journals.ecs.soton.ac.uk/PEMCS/Papers/ Regards Mark ------------------------------------- CSM, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 06/21/98 - Time: 14:43:42 URL http://www.sis.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Fri Sep 11 12:05:18 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id MAA19578; Fri, 11 Sep 1998 12:05:18 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA20703; Fri, 11 Sep 1998 11:54:29 -0400 (EDT) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id LAA20636; Fri, 11 Sep 1998 11:53:20 -0400 (EDT) Received: from mordillo (p36.nas1.is5.u-net.net) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA11111; Fri, 11 Sep 98 16:48:17 BST Date: Fri, 11 Sep 98 14:38:08 +0000 From: Mark Baker Subject: CPE - Call for papers - Message Passing Interface-based Parallel Programming with Java To: javagrandeforum@npac.syr.edu, "'mpi-nt-users@erc.msstate.edu'" , "Dr. Kenneth A. Williams" , "Stephen L. Scott" , "Aad J. 
van der Steen" , Advanced Java , Alexander Reinefeld , Andy Grant , Anne Trefethen , Bryan Capenter , Charles Grassl , Dave Beckett , David Snelling , DIS Everyone , fagg@CS.UTK.EDU, gentzsch@genias.de, Guy Robinson , Hon W Yau , hpvm@cs.uiuc.edu, Jack Dongarra , java-for-cse@npac.syr.edu, Joao Gabriel Silva , jtap-club-clusters@mailbase.ac.uk, Ken Hawick , Mike Berry , mpijava-users@npac.syr.edu, owner-grounds@mail.software.ibm.com, parkbench-comm@CS.UTK.EDU, partners@globus.org, Paul Messina , Roland Wismueller , Steve Larkin - AVS , Terri Canzian , Tony Hey , topic@mcc.ac.uk, Vaidy Sunderam , Vladimir Getov , William Gropp X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. X-Priority: 3 (Normal) Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Dear Colleague, Firstly, I apologise for any cross-posting of this email. If this CFP is not in your field we would appreciate your forwarding it to colleagues who may be in the field. This CFP can be found at http://hpc-journals.ecs.soton.ac.uk/CPE/Special/MPI-Java/ Regards Dr Mark Baker University of Portsmouth, UK ---------------------------------------------------------------------------- Call For Papers Special Issue of Concurrency: Practice and Experience Message Passing Interface-based Parallel Programming with Java Guest Editors Anthony Skjellum (MPI Software Technology, Inc.) Mark Baker (University of Portsmouth) A special issue of Concurrency: Practice and Experience (CPE) is being planned for Fall of 1999. Papers submitted and accepted for this issue will be published by John Wiley & Sons Ltd. in the CPE Journal and in addition will be made available electronically via the WWW. Background Recently there has been a great deal of interest in the idea that Java may be a good language for scientific and engineering computation, and in particular for parallel computing.
The claims made on behalf of Java, that it is simple, efficient and platform-neutral - a natural language for network programming - make it potentially attractive to scientific programmers hoping to harness the collective computational power of networks of workstations and PCs, or even of the Internet. A basic prerequisite for parallel programming is a good communication API. Java comes with various ready-made packages for communication, notably an easy-to-use interface to BSD sockets, and the Remote Method Invocation (RMI) mechanism. Interesting as these interfaces are, it is questionable whether parallel programmers will find them especially convenient. Sockets and remote procedure calls have been around for about as long as parallel computing has been fashionable, and neither of them has been popular in that field. Both communication models are optimized for client-server programming, whereas the parallel computing world is mainly concerned with "symmetric" communication, occurring in groups of interacting peers. This symmetric model of communication is captured in the successful Message Passing Interface standard (MPI), established a few years ago. MPI directly supports the Single Program Multiple Data (SPMD) model of parallel computing, wherein a group of processes cooperate by executing identical program images on local data values. Reliable point-to-point communication is provided through a shared, group-wide communicator, instead of socket pairs. MPI allows numerous blocking, non-blocking, buffered or synchronous communication modes. It also provides a library of true collective operations (broadcast is the most trivial example). An extended standard, MPI 2, allows for dynamic process creation and access to memory in remote processes. Call For Papers Papers about the designs, experience, and results concerning the use of the Message Passing Interface (MPI) with Java are sought for a special issue of Concurrency: Practice and Experience.
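As a rough illustration of the SPMD model described in the Background above, the following toy sketch runs one program on several "ranks" that share a single group-wide communicator, with one broadcast and a few point-to-point messages. It is not the MPI API and not any Java binding: the ToyComm class, its methods, and the use of threads in place of processes are all assumptions made purely for illustration.

```python
import queue
import threading


class ToyComm:
    """Hypothetical stand-in for a shared, group-wide communicator."""

    def __init__(self, size):
        self.size = size
        # One mailbox per (destination, source) pair.
        self._boxes = [[queue.Queue() for _ in range(size)] for _ in range(size)]

    def send(self, obj, src, dest):
        self._boxes[dest][src].put(obj)

    def recv(self, src, dest):
        return self._boxes[dest][src].get()  # blocks until a message arrives

    def bcast(self, obj, rank, root=0):
        # Collective operation: every rank calls it, the root supplies the value.
        if rank == root:
            for other in range(self.size):
                if other != root:
                    self.send(obj, root, other)
            return obj
        return self.recv(root, rank)


def program(comm, rank, results):
    # The same program image runs on every rank; only the local data differ.
    scale = comm.bcast(10 if rank == 0 else None, rank, root=0)
    local = (rank + 1) * scale
    if rank == 0:
        # Rank 0 gathers the partial values with point-to-point receives.
        results[0] = local + sum(comm.recv(src, 0) for src in range(1, comm.size))
    else:
        comm.send(local, rank, 0)
        results[rank] = local


def run_spmd(size):
    comm = ToyComm(size)
    results = [None] * size
    workers = [threading.Thread(target=program, args=(comm, rank, results))
               for rank in range(size)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

Calling run_spmd(4) starts four ranks; rank 0 ends up holding the sum of all local values, while the other ranks keep their own.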
Developing a clear understanding of the opportunities, challenges, and state of the art in scalable, peer-oriented messaging with Java is of interest and value to both the distributed computing and high performance computing communities. Topics of interest for this special issue include but are not limited to:
-- Practical systems that use MPI and Java to solve real distributed high performance computing problems.
-- Designs of systems for combining MPI-type functionality with Java.
-- Approaches to APIs for object-oriented, group-oriented message passing with Java.
-- Efforts to combine MPI with CORBA in a Java environment.
-- Efforts to utilize aspects of the emerging MPI/RT standard in the Java context.
-- Efforts to do MPI interoperability (IMPI) using Java.
-- Issues and both tactical and strategic solutions concerning the MPI-1 and MPI-2 standards and their features in conjunction with Java.
-- Performance results and performance-enhancing techniques for such systems.
-- Flexible frameworks and techniques for enabling High-Performance communication in Java.
Timescales for Submission
There is a deadline of 15th December 1998 for submitted papers. Publication is currently scheduled for the third quarter of 1999.
Activity            Deadline
Call For Papers     1st September 1998
Paper Submission    15th December 1998
Papers Returned     15th March 1999
Papers Approved     1st April 1999
Publication         Q3 1999
Further details about this special issue can be found at: http://hpc-journals.ecs.soton.ac.uk/CPE/Special/MPI-Java/ ---------------------------------------------------------------------------- ------------------------------------- Dr Mark Baker CSM, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 09/11/98 - Time: 14:38:08 URL http://www.dcs.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Tue Sep 15 22:24:43 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id WAA13441; Tue, 15 Sep 1998 22:24:43 -0400 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id WAA25353; Tue, 15 Sep 1998 22:23:26 -0400 (EDT) Received: from octane11.nas.nasa.gov (octane11.nas.nasa.gov [129.99.34.116]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id WAA25343; Tue, 15 Sep 1998 22:23:08 -0400 (EDT) Received: (from saini@localhost) by octane11.nas.nasa.gov (8.8.7/NAS8.8.7) id TAA24915; Tue, 15 Sep 1998 19:17:45 -0700 (PDT) From: "Subhash Saini" Message-Id: <9809151917.ZM24910@octane11.nas.nasa.gov> Date: Tue, 15 Sep 1998 19:17:44 -0700 In-Reply-To: Mark Baker "CPE - Call for papers - Message Passing Interface-based Parallel Programming with Java" (Sep 11, 2:38pm) References: X-Mailer: Z-Mail (3.2.3 08feb96 MediaMail) To: "'mpi-nt-users@erc.msstate.edu'" , "Aad J. van der Steen" , "Dr. Kenneth A. Williams" , "Stephen L.
Scott" , Advanced Java , Alexander Reinefeld , Andy Grant , Anne Trefethen , Bryan Capenter , Charles Grassl , DIS Everyone , Dave Beckett , David Snelling , Guy Robinson , Hon W Yau , Jack Dongarra , Joao Gabriel Silva , Ken Hawick , Mark Baker , Mike Berry , Paul Messina , Roland Wismueller , Steve Larkin - AVS , Terri Canzian , Tony Hey , Vaidy Sunderam , Vladimir Getov , William Gropp , fagg@CS.UTK.EDU, gentzsch@genias.de, hpvm@cs.uiuc.edu, java-for-cse@npac.syr.edu, javagrandeforum@npac.syr.edu, jtap-club-clusters@mailbase.ac.uk, mpijava-users@npac.syr.edu, owner-grounds@mail.software.ibm.com, parkbench-comm@CS.UTK.EDU, partners@globus.org, topic@mcc.ac.uk Subject: AD _ Workshop Cc: mab@sis.port.ac.uk, saini@octane11.nas.nasa.gov Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii You are invited to attend the workshop (see below). Best regards, subhash ============================================================================== ***** REGISTER NOW ***** *** NO REGISTRATION FEE *** **** Last Day to Register is Sept. 23, 1998 **** "First NASA Workshop on Performance-Engineered Information Systems" ----------------------------------------------------------------- Sponsored by Numerical Aerospace Simulation Systems Division NASA Ames Research Center Moffett Field, California, USA September 28-29, 1998 Workshop Chairman: Dr. Subhash Saini http://science.nas.nasa.gov/Services/Training Invited Speakers: ------------------ Adve, Vikram (Rice University) Aida, K. (Tokyo Institute of Technology, JAPAN) Bagrodia, Rajive (University of California, Los Angeles) Becker, Monique (Institute Nationale des Tele. FRANCE) Berman, Francine (University of California, San Diego) Browne, James C. (University of Texas) Darema, Frederica (U.S. 
National Science Foundation-CISE) Dongarra, Jack (Oak Ridge National Laboratory) Feiereisen, Bill (NASA Ames Research Center) Fox, Geoffrey (Syracuse University) Gannon, Dennis (Indiana University) Gerasoulis, Apostolos (Rutgers University) Gunther, Neil J. (Performance Dynamics Company) Hey, Tony (University of Southampton UK) Hollingsworth, Jeff (University of Maryland) Jain, Raj (Ohio State University) Keahy, Kate (Los Alamos National Laboratory) Mackenzie, Lewis M. (University of Glasgow, Scotland UK) McCalpin, John (Silicon Graphics) Menasce, Daniel A. (George Mason University) Nudd, Graham (University of Warwick UK) Reed, Dan (University of Illinois) Saltz, Joel (University of Maryland) Simmons, Margaret (San Diego Supercomputer Center) Vernon, Mary (University of Wisconsin) Topics include: -------------- - Performance-by-design techniques for high-performance distributed information systems - Large transients in packet-switched and circuit-switched networks - Workload characterization techniques - Integrated performance measurement, analysis, and prediction - Performance measurement and modeling in IPG - Performance models for threads and distributed objects - Application emulators and simulation models - Performance prediction engineering of Information Systems including IPG - Performance characterization of scientific and engineering applications of interest to NASA, DoE, DoD, and industry - Scheduling tools for performance prediction of parallel programs - Multi-resolution simulations for large-scale I/O-intensive applications - Capacity planning for Web performance: metrics, models, and methods Contact: Marcia Redmond, redmond@nas.nasa.gov, (650) 604-4373 Registration: Advanced registration is required. Registration Fee: NONE. Registration Deadlines: Friday, September 23, 1998 There will be no onsite registration. Contact: Send registration information and direct questions to Marcia Redmond, redmond@nas.nasa.gov, (650) 604-4373. 
DESCRIPTION: The basic goal of performance modeling is to predict and understand the performance of a computer program or set of programs on a computer system. The applications of performance modeling are numerous, including evaluation of algorithms, optimization of code implementations, parallel library development, comparison of system architectures, parallel system design, and procurement of new systems. The most reliable technique for determining the performance of a program on a computer system is to run and time the program multiple times, but this can be very expensive and it rarely leads to any deep understanding of the performance issues. It also does not provide information on how performance will change under different circumstances, for example with scaling the problem or system parameters or porting to a different machine. The complexity of new parallel supercomputer systems presents a daunting challenge to the application scientists who must understand the system's behavior to achieve a reasonable fraction of the peak performance. The NAS Parallel Benchmarks (NPB) have exposed a large difference between peak and achievable performance. Such a dismal performance is not surprising, considering the complexity of these parallel distributed memory systems. At present, performance modeling, measurement, and analysis tools are inadequate for distributed/networked systems such as Information Power Grid (IPG). The purpose of performance-based engineering is to develop new methods and tools that will enable development of these information systems faster, better and cheaper. 
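The "run and time the program multiple times" approach mentioned in the description can be sketched as below. This is a minimal illustration only; the helper name time_repeated and the choice of summary statistics are assumptions, not something specified by the workshop announcement.

```python
import statistics
import time


def time_repeated(func, repeats=5):
    """Run func several times and summarise the wall-clock timings in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),                   # least disturbed run
        "median": statistics.median(samples),  # typical run
        "max": max(samples),                   # worst observed run
    }


# Example workload (hypothetical): summing a range of integers.
summary = time_repeated(lambda: sum(range(100000)), repeats=5)
```

As the description notes, such measurements say how fast one configuration ran, not how the time will change with problem size or on a different machine.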
================================================================================ Registration "First NASA Workshop on Performance-Engineered Information Systems" Send the following information to redmond@nas.nasa.gov Name _____________________________________________ Organization _____________________________________ Street Address ___________________________________ City ____________________ State __________________ Zip/Mail Code ___________ Country ________________ Phone ___________________ Fax ____________________ Email address ____________________________________ U.S. Citizen __________ Permanent Resident with Green Card ________ ******************************************************************************* Foreign National ________ (non-U.S. Citizen). Must complete the following information: Passport number ______________________ Name as it appears on passport _______________________________________ Date issued _____________ Date expires _________________ Country of citizenship____________________________ From owner-parkbench-comm@CS.UTK.EDU Sun Oct 25 12:41:55 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id MAA29754; Sun, 25 Oct 1998 12:41:54 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA29327; Sun, 25 Oct 1998 12:36:00 -0500 (EST) Received: from pan.ch.intel.com (pan.ch.intel.com [143.182.246.24]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id MAA29319; Sun, 25 Oct 1998 12:35:58 -0500 (EST) Received: from sedona.intel.com (sedona.ch.intel.com [143.182.218.21]) by pan.ch.intel.com (8.8.6/8.8.5) with ESMTP id RAA16591; Sun, 25 Oct 1998 17:35:56 GMT Received: from ccm.intel.com ([143.182.69.127]) by sedona.intel.com (8.9.1a/8.9.1a-chandler01) with ESMTP id KAA27181; Sun, 25 Oct 1998 10:35:54 -0700 (MST) Message-ID: <36336126.B26DEE2C@ccm.intel.com> Date: Sun, 25 Oct 1998 10:34:30 -0700 From: Anjaneya Chagam X-Mailer: Mozilla 4.05 [en] (Win95; I) MIME-Version: 1.0 To: 
parkbench-comm@CS.UTK.EDU, Anjaneya.Chagam@intel.com Subject: Question on parkbench source code in c X-Priority: 1 (Highest) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Hi: I am looking for parkbench benchmarking programs source code in c language to do benchmarking comparison on Chime and PVM on NT platform @ Arizona State University. Could you please let me know if the parkbench programs are ported to c, if so where can I find them? Thanks a million. Name: Anjaneya R. Chagam Email: Anjaneya.Chagam@intel.com From owner-parkbench-comm@CS.UTK.EDU Mon Oct 26 06:36:26 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id GAA11147; Mon, 26 Oct 1998 06:36:25 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA07390; Mon, 26 Oct 1998 06:27:13 -0500 (EST) Received: from osiris.sis.port.ac.uk (root@osiris.sis.port.ac.uk [148.197.100.10]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id GAA07383; Mon, 26 Oct 1998 06:27:10 -0500 (EST) Received: from mordillo (pc297.sis.port.ac.uk) by osiris.sis.port.ac.uk (4.1/SMI-4.1) id AA15115; Mon, 26 Oct 98 11:29:54 GMT Date: Mon, 26 Oct 98 11:14:06 GMT From: Mark Baker Subject: Re: Question on parkbench source code in c To: Anjaneya.Chagam@intel.com, parkbench-comm@CS.UTK.EDU X-Mailer: Chameleon ATX 6.0.1, Standards Based IntraNet Solutions, NetManage Inc. X-Priority: 3 (Normal) References: <36336126.B26DEE2C@ccm.intel.com> Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Anjaneya, The official Parkbench codes are only available in Fortran 77. I vaguely remember hearing some time back about a graduate student's attempt to "port" some of the low-level codes to C. Charles Grassl (Cray) and I did a little work on some simple C PingPong codes. You can check these out at...
http://www.sis.port.ac.uk/~mab/TOPIC/ Regards Mark --- On Sun, 25 Oct 1998 10:34:30 -0700 Anjaneya Chagam wrote: > Hi: > I am looking for parkbench benchmarking programs source code in c > language to do benchmarking comparison on Chime and PVM on NT platform @ > Arizona State University. Could you please let me know if the parkbench > programs are ported to c, if so where can I find them? > > Thanks a million. > > Name: Anjaneya R. Chagam > Email: Anjaneya.Chagam@intel.com > > ---------------End of Original Message----------------- ------------------------------------- DCS, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: mab@sis.port.ac.uk Date: 10/26/98 - Time: 11:14:07 URL: http://www.dcs.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Mon Nov 16 10:06:23 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA11375; Mon, 16 Nov 1998 10:06:23 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id JAA08949; Mon, 16 Nov 1998 09:01:33 -0500 (EST) Received: from del2.vsnl.net.in (del2.vsnl.net.in [202.54.15.30]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id JAA08936; Mon, 16 Nov 1998 09:01:28 -0500 (EST) Received: from sameer.myasa.com ([202.54.106.39]) by del2.vsnl.net.in (8.9.1a/8.9.1) with SMTP id TAA13392 for ; Mon, 16 Nov 1998 19:30:37 -0500 (GMT) From: "Kashmir Kessar Mart" To: Subject: Information Date: Mon, 16 Nov 1998 19:30:48 +0530 Message-ID: <01be1169$838dd020$276a36ca@sameer.myasa.com> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_NextPart_000_006D_01BE1197.9D460C20" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 4.71.1712.3 X-MimeOLE: Produced By Microsoft MimeOLE V4.71.1712.3 This is a multi-part message in MIME format.
------=_NextPart_000_006D_01BE1197.9D460C20 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Dear Sir, I have seen your Web Site but could not understand what your company is. Please let me know if you can provide me information regarding Walnut Kernels. Regards Azad. ------=_NextPart_000_006D_01BE1197.9D460C20 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
------=_NextPart_000_006D_01BE1197.9D460C20-- From owner-parkbench-comm@CS.UTK.EDU Fri Dec 4 15:44:53 1998 Return-Path: Received: from CS.UTK.EDU by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id PAA21941; Fri, 4 Dec 1998 15:44:53 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA21231; Fri, 4 Dec 1998 15:18:56 -0500 (EST) Received: from gimli.genias.de (qmailr@GIMLI.genias.de [192.129.37.12]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id PAA21223; Fri, 4 Dec 1998 15:18:51 -0500 (EST) From: Received: (qmail 14706 invoked by uid 233); 4 Dec 1998 20:14:46 -0000 Date: 4 Dec 1998 20:14:46 -0000 Message-ID: <19981204201446.14705.qmail@gimli.genias.de> Reply-to: majordomo@genias.de To: parkbench-comm@CS.UTK.EDU Subject: Newsletter on Distributed and Parallel Computing Dear Colleague, as already announced a few weeks ago, this is now the very first issue of our bi-monthly electronic Newsletter on Distributed and Parallel Computing, DPC NEWS. !! If you want to receive DPC NEWS regularly, please just return this !! !! e-mail to majordomo@genias.de with !! !! !! !! subscribe newsletter or subscribe newsletter !! !! end end !! !! !! !! in the first two lines of the email-body (text area). !! This newsletter is a FREE service to the DPC Distributed and Parallel Computing community. It regularly reports on new developments and results in DPC, e.g. conferences, important weblinks, new books and other relevant news in distributed and parallel computing. We also keep all the information in the special newsletter section of our webpage ( http://www.genias.de/dpcnews/ ) to provide a wealth of information for the DPC community. If you have any information which might fit into these DPC subjects, please send it to me together with the corresponding weblink, for publication in DPC News. We aim to reach a very broad community with this DPC Newsletter.
With Season's Greetings from GENIAS Wolfgang Gentzsch, CEO and President ===================================================================== DPC NEWSletter on Distributed and Parallel Computing GENIAS Software, December 1998 ------------------------------ http://www.genias.de/dpcnews/ GENIAS NEWS: EASTMAN CHEMICAL USES CODINE FOR MOLECULAR MODELING Eastman Chemical uses commercial quantum chemistry programs, like Gaussian, Jaguar, and Cerius2, to model chemical products, intermediates, catalysts, etc. The simulation jobs take between 1 hour and 6 days to complete. Queuing software is an important part of keeping the processors working at full utilization, without being overloaded. Since October, with the new CODINE release 4.2, Eastman has maintained over 95% CPU utilization on the available systems: http://www.genias.de/dpcnews/ BMW USES CODINE AND GRD FOR CRASH-SIMULATION At the BMW crash department, very complex compute-intensive PAM-CRASH simulations are performed on a cluster of 11 compute servers and more than 100 workstations, altogether over 370 CPUs. CODINE and GRD have optimized the utilization of this big cluster by distributing the load equally, dynamically and in an application oriented way, transparent to the 45 users: http://www.genias.de/dpcnews/ GRD MANAGES ACADEMIC COMPUTER CENTER http://www.genias.de/dpcnews/ QUEUING UP FOR GRD AT ARL ARMY RESEARCH LAB http://www.genias.de/dpcnews/ GENIAS ADDS DYNAMIC RESOURCE & POLICY MGMT TO LINUX http://www.genias.de/dpcnews/ GRD STOPPS FLOODING SYSTEM WITH MANY JOBS http://www.genias.de/dpcnews/ PaTENT MPI ACCELERATES MARC K7.3 FE ANALYSIS CODE http://www.genias.de/dpcnews/ + http://www.marc.com/Techniques/ CONFERENCES on DPC, Dec'98 - March'99: - Workshop on Performance Evaluation with Realistic Applications (sponsored by SPEC), San Jose, CA USA, Jan 25 1999: http://www.spec.org/news/specworkshop.html - ACPC99, 4th Int. Conf. 
on Parallel Computation, ACPC Salzburg, Austria, February 16-18 1999: http://www.coma.sbg.ac.at/acpc99/index.html - MPIDC'99, Message Passing Interface Developer's and User's Conference, Atlanta, Georgia USA, March 10-12 1999: http://www.mpidc.org - 9th SIAM Conf. on Parallel Processing for Scientific Computing, San Antonio, Texas USA, March 22-24 1999: http://www.siam.org/meetings/pp99/ - 25th Speedup Workshop, Lugano, Switzerland, March 25-26 1999: http://www.speedup.ch/Workshops/Workshop25Ann.html - CC99, 2nd German Workshop on Cluster Computing, Karlsruhe, Germany, March 25-26 1999: http://www.tu-chemnitz.de/informati/RA/CC99 More on GENIAS Webpage. http://www.genias.de/dpcnews/ NEW DPC BOOKS: - Parallel Computing Using Optical Interconnections, Keqin Li, Yi Pan, Si Qing Zheng. Kluwer Publ 1998: http://www.mcs.newpaltz.edu/~li/pcuoi.html - High-Performance Computing, Contributions to Society, T. Tabor (Ed.), 1998: http://www.tgc.com - Special Issue on Metacomputing, W. Nagel, R. Williams (Eds.), Int. J. Parallel Computing, Vol. 24, No.
12-13, Elsevier Science 1998: http://www.elsevier.nl/locate/parco More books on DPC on GENIAS Webpage: http://www.genias.de/dpcnews/ DPC WEBPAGES: - PRIMEUR: HPC electronic news magazine: http://www.hoise.com - PRIMEUR List of ESPRIT Projects: http://www.hoise.com/CECupdate/contentscecdec98.html - HPCwire, Email Newsletter: http://www.tgc.com/hpcwire.html/ - EuroTools, European HPCN Tools Working Group http://www.irisa.fr/eurotools - PTOOLS, Parallel Tools Consortium: http://www.ptools.org - TOP500: 500 fastest supercomputers: http://www.top500.org - PROSOMA: Technology fair describing hundreds of European CEC funded projects: http://www.prosoma.lu/ - Links to Linux Cluster Projects: http://www.linux-magazin.de/cluster/ More DPC WebLinks: http://www.genias.de/dpcnews/ NEWS ON HPC BENCHMARKS: - STREAM, Memory Performance Benchmark from John McCalpin: http://www.cs.virginia.edu/stream/ GENIAS JOBS: - For our CODINE/GRD Devel.Team: Software engineer with experience in GUI development under OSF/Motif, Java and Windows, distributed computing, resource mgnt systems under Unix and NT: http://www.genias.de/jobs/ CALL FOR PAPERS in upcoming Journals: - Message Passing Interface-based Parallel Programming with Java: deadline Dec. 
15 1998: http://hpc-journals.ecs.soton.ac.uk/CPE/Special/MPI-Java End of DPC Newsletter ========================================================================== From owner-parkbench-comm@CS.UTK.EDU Sun Jan 24 12:15:02 1999 Return-Path: Received: from CS.UTK.EDU (CS.UTK.EDU [128.169.94.1]) by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id MAA26189; Sun, 24 Jan 1999 12:15:01 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA08151; Sun, 24 Jan 1999 12:08:47 -0500 (EST) Received: from serv1.is4.u-net.net ([195.102.240.252]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id MAA08144; Sun, 24 Jan 1999 12:08:44 -0500 (EST) Received: from mordillo [195.102.198.114] by serv1.is4.u-net.net with smtp (Exim 1.73 #1) id 104T1E-0003IJ-00; Sun, 24 Jan 1999 17:08:17 +0000 Date: Sun, 24 Jan 1999 17:05:53 +0000 From: Mark Baker Subject: New PEMCS paper. To: parkbench-comm@CS.UTK.EDU X-Mailer: Z-Mail Pro 6.2, NetManage Inc. [ZM62_16H] X-Priority: 3 (Normal) Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; CHARSET=ISO-8859-1 A new PEMCS paper has just been accepted and published... Comparing the Scalability of the Cray T3E-600 and the Cray Origin 2000 Using SHMEM Routines, by Glenn R. Luecke, Bruno Raffin and James J. Coyle, Iowa State University, Ames, Iowa USA Check out...
http://hpc-journals.ecs.soton.ac.uk/PEMCS/Papers/ Regards Mark ------------------------------------- DCS, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: Mark.Baker@port.ac.uk Date: 01/24/1999 - Time: 17:05:53 URL: http://www.dcs.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Tue Feb 2 08:17:19 1999 Return-Path: Received: from CS.UTK.EDU (CS.UTK.EDU [128.169.94.1]) by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id IAA08459; Tue, 2 Feb 1999 08:17:19 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id HAA01393; Tue, 2 Feb 1999 07:42:18 -0500 (EST) Received: from serv1.is1.u-net.net (serv1.is1.u-net.net [195.102.240.129]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id HAA01386; Tue, 2 Feb 1999 07:42:16 -0500 (EST) Received: from [148.197.205.63] (helo=mordillo) by serv1.is1.u-net.net with smtp (Exim 2.00 #2) for parkbench-comm@cs.utk.edu id 107f7u-0005uS-00; Tue, 2 Feb 1999 12:40:22 +0000 Date: Tue, 2 Feb 1999 12:40:29 +0000 From: Mark Baker Subject: New PEMCS Paper - resend... To: parkbench-comm@CS.UTK.EDU X-Mailer: Z-Mail Pro 6.2, NetManage Inc. [ZM62_16H] X-Priority: 3 (Normal) Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; CHARSET=ISO-8859-1 Apologies for the resend - I think this email got lost when I sent it a couple of weeks back. --------------------------------------------------------------------------- A new PEMCS paper has just been accepted and published... "Comparing the Scalability of the Cray T3E-600 and the Cray Origin 2000 Using SHMEM Routines", by Glenn R. Luecke, Bruno Raffin and James J. Coyle, Iowa State University, Ames, Iowa USA Check out...
http://hpc-journals.ecs.soton.ac.uk/PEMCS/Papers/ Regards Mark ------------------------------------- DCS, University of Portsmouth, Hants, UK Tel: +44 1705 844285 Fax: +44 1705 844006 E-mail: Mark.Baker@port.ac.uk Date: 02/02/1999 - Time: 12:40:29 URL: http://www.dcs.port.ac.uk/~mab/ ------------------------------------- From owner-parkbench-comm@CS.UTK.EDU Tue Mar 2 10:35:47 1999 Return-Path: Received: from CS.UTK.EDU (CS.UTK.EDU [128.169.94.1]) by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id KAA18531; Tue, 2 Mar 1999 10:35:46 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA01804; Tue, 2 Mar 1999 10:18:56 -0500 (EST) Received: from gimli.genias.de (qmailr@GIMLI.genias.de [192.129.37.12]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id KAA01781; Tue, 2 Mar 1999 10:18:49 -0500 (EST) Received: (qmail 8905 invoked from network); 2 Mar 1999 15:19:10 -0000 Received: from fangorn.genias.de (192.129.37.74) by gimli.genias.de with SMTP; 2 Mar 1999 15:19:10 -0000 Received: (from daemon@localhost) by FANGORN.genias.de (8.8.8/8.8.8) id QAA13715; Tue, 2 Mar 1999 16:19:05 +0100 Date: Tue, 2 Mar 1999 16:19:05 +0100 Message-Id: <199903021519.QAA13715@FANGORN.genias.de> To: parkbench-comm@CS.UTK.EDU From: Majordomo@genias.de Subject: Welcome to newsletter Reply-To: Majordomo@genias.de -- Welcome to the newsletter mailing list! Please save this message for future reference. Thank you. If you ever want to remove yourself from this mailing list, send the following command in email to : unsubscribe Or you can send mail to with the following command in the body of your email message: unsubscribe newsletter or from another account, besides parkbench-comm@CS.UTK.EDU: unsubscribe newsletter parkbench-comm@CS.UTK.EDU If you ever need to get in contact with the owner of the list, (if you have trouble unsubscribing, or have questions about the list itself) send email to . 
This is the general rule for most mailing lists when you need to contact a human. Here's the general information for the list you've subscribed to, in case you don't already have it: The GENIAS Newsletter keeps you informed about new products, services and information about High Performance Computing. It serves as an addition to our printed newsletter that is distributed to our customers. To see our printed version, just visit our web-site http://www.genias.de and follow the link 'newsletter'. From owner-parkbench-comm@CS.UTK.EDU Wed Mar 3 02:34:35 1999 Return-Path: Received: from CS.UTK.EDU (CS.UTK.EDU [128.169.94.1]) by netlib2.cs.utk.edu with ESMTP (cf v2.9t-netlib) id CAA01271; Wed, 3 Mar 1999 02:34:35 -0500 Received: from localhost (root@localhost) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id CAA00679; Wed, 3 Mar 1999 02:32:28 -0500 (EST) Received: from gimli.genias.de (qmailr@GIMLI.genias.de [192.129.37.12]) by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id CAA00668; Wed, 3 Mar 1999 02:32:25 -0500 (EST) Received: (qmail 10306 invoked from network); 3 Mar 1999 07:32:58 -0000 Received: from gandalf.genias.de (192.129.37.10) by gimli.genias.de with SMTP; 3 Mar 1999 07:32:58 -0000 Received: by GANDALF.genias.de (Smail3.1.28.1 #30) id m10I69J-000B10C; Wed, 3 Mar 99 08:32 MET Message-Id: From: gentzsch@genias.de (Wolfgang Gentzsch) Subject: sorry! To: parkbench-comm@CS.UTK.EDU Date: Wed, 3 Mar 99 8:32:57 MET Cc: gent@genias.de (Wolfgang Gentzsch) Reply-To: gentzsch@genias.de X-Mailer: ELM [version 2.3 PL11] Dear colleagues, I just discovered that the parkbench-comm@CS.UTK.EDU has been collected into our mailing list for our electronic DPC Newsletter. I very much apologize for this mistake. Thank you for your understanding! Kind regards Wolfgang -- -- subscribe now to http://www.genias.de/dpcnews/ -- - - - - - - - - - - - - - - - - - - - - - - - - - - - Wolfgang Gentzsch, CEO Tel: +49 9401 9200-0 GENIAS Software GmbH & Inc Fax: +49 9401 9200-92 Erzgebirgstr. 
2 http://www.geniasoft.com D-93073 Neutraubling, Germany gentzsch@geniasoft.com - - - - - - - - - - - - - - - - - - - - - - - - - - - GENIAS Software Inc. Tel: 410 455 5580 UMBC Technology Center Fax: 410 455 5567 1450 S. Rolling Road http://www.geniasoft.com Baltimore, MD 21227, USA gentzsch@geniasoft.com = = = = = = = = = = = = = = = = = = = = = = = = = = =