# To unbundle, sh this file echo README 1>&2 cat >README <<'End of README' written by: Adam Beguelin 11/23/87 adamb@boulder.Colorado.EDU 1 - - 0AMOCO PRODUCTION COMPANY 87223ART0262 Tulsa, Oklahoma August 11, 1987 - -SUBJECT: Hypercube Implementation of SCHEDULE -Introduction +____________ 0SCHEDULE, a tool for developing portable parallel Fortran programs, was converted for use on the Ncube hypercube. The original SCHEDULE was developed for shared memory parallel processors; Sched3, the hypercube version, was developed around a master/slave model on a local memory par- allel processor. (For more information on SCHEDULE see "A Portable Envi- ronment for Developing Parallel Fortran Programs", Parallel Computing, + __________________ Volume 5, Numbers 1&2, July 1987, pp. 175-186.) The implementation of Sched3 is discussed along with the differences between Sched3 and standard SCHEDULE. Limitations of the current version of Sched3 are also discussed with suggestions for future improvements. 0Implementation +______________ 0In the shared memory version of Schedule routines are saved in a queue by storing the address of a routine and its parameters. On the hypercube the routines will be loaded in the local memory of another processor so having a pointer to the address of the routine on the host would be of little value when starting the process on a node. In Sched3 the name of the file to be loaded on the node is given instead of the address of the routine. The file to be loaded is created by compiling and linking the subroutine along with the program dmain. Dmain will act as the main program for the subroutine when it is loaded on the node. Dmain receives the parameters from the host, calls the subroutine with the proper pointers into the par- ameter buffer and sends the parameters back to the host when the node subroutine returns. The node subroutine must be named nsubr. As a con- vention each parallel subroutine is in a different file which will be com- piled and linked placing the executable in a file with a descriptive name. For instance the file dotprod.fn might contain a parallel subroutine to compute a dot product. The routine would be compiled and linked to dmain and the executable file would be called dotprod. When the routine is put in the queue the string 'dotprod' is used to specify the routine. 1 - 0 2 -Since there is no shared memory, the parameters are sent to the parallel subroutine via a message. This message is built by copying data from the pointers to the parameters stored in the queue. When the subroutine com- pletes a message returns the values of the parameters and these values are stored back into the memory of the host. This strategy results in the parameters being passed by address if no other parallel routines access the memory while the parallel routine is executing. Non-determinacy may be introduced if routines are allowed to operate in parallel on the same data. It is the programmers responsibility to control such non- determinacy by building a proper dependency graph for parallel routines. 0To copy the parameters to a message buffer Sched3 needs the length of the parameters. The shared memory version of Schedule doesn't need this information since the parameters are in shared memory and pointers can simply be passed to a parallel subroutine. Originally the code to save the pointers were written in C since Fortran cannot manipulate pointers. To allow pointers to be manipulated from Fortran three routines have been written in assembler for both the host and the node processors. 0 call ptoi(p,i) stores the address of i into the integer p i = star(p) the value pointed to by p is assigned to i call lstar(p,i) loads the value of i to the location pointed to by p 0The Schedule code that decides when to run what subroutines has not been changed except the routines GTPRB and LIBOPN. In the shared memory ver- sion each processor would call GTPRB to get the next routine to execute. In Sched3 there is a main loop executing on the host which calls GTPRB to get the next job. Previously GTPRB would block waiting for work. In Sched3 GTPRB returns a zero if there is no work to be had. LIBOPN previ- ously started processes on every processor and the processors would then ask for work. In Sched3 the host will poll for work and only load pro- cesses on a node when a routine is ready to run. If there is a job to run it is loaded onto the next available node. Nodes are simply chosen by picking the lowest numbered node that is available. Perhaps a better strategy would be to chose nodes closest to the host first. It may also be advisable to avoid node zero since it may be busy as a communication gateway to the host. When a job is ready to run a node is selected and the filename containing the routine is loaded on the chosen node. 0Processes may spawn other processes. The dependency information of a spawned process is the same as its parent. To spawn a process first NXTAG is called to obtain a unique jobtag for the process to be created. NXTAG is implemented by sending a message from the node to the host requesting a new jobtag. The host services the request by calling NXTAG (the NXTAG routine from the original shared memory code) and sending the result back to the node in a message. After a new jobtag has been obtained there is a call to SPAWN. This call will send the name of the routine and its jobtag 1 - 0 3 -to the host. The host loads the routine on the next available node and then notifies the parent where the child has been placed. The parameters are then sent out from the parent to the child. After a parent has com- pleted its processing it must make a call to WAIT. When a child completes it sends its parameters to its parent thus notifying the parent that it has completed. WAIT will receive the returned parameters from all of the children that have been spawned and notify the host that the children have completed. This strategy assures that the host receives a child's comple- tion before its parent's completion. Unfortunately this strategy does not assure that a process will complete after its grandchild. This stems from different message lag time between different nodes and the host. When a process calls WAIT all processing in the parent is blocked until all of the children have completed. This is mainly a result of the lack of mul- titasking on the nodes in the Ncube hypercube. The shared memory version of SCHEDULE will use the wait time to do more processing. The blocking wait makes it very easy to write programs which will deadlock on the hypercube. For instance if every node is loaded with a routine which spawns exactly one child and then waits for the child to complete. The child will never complete since there are no available processors. Since no child can complete no parent can complete resulting in deadlock. 0Differences in SCHEDULE and Sched3 +__________________________________ 0The implementation of Sched3 on the Ncube hypercube has resulted in sev- eral changes to the syntax and semantics of a SCHEDULE program. The most apparent is the requirement that routines that are to be run in parallel must be placed in a separate file and linked to dmain. The Ncube supports the loading of programs on the nodes not simply routines. Using a standard main allowed the running of routines on the nodes as complete programs. Dmain is also needed to parse the messages containing the par- ameters to the parallel routine. Using a makefile greatly facilitates creating Sched3 programs and their related files. 0Host programs use the routines in sched.lib while the node routines use those in sched.libn. 0A parallel routine must always have a fixed number of parameters. This is because Sched3 was written in Fortran and Fortran does not allow variable number of parameters to subroutines. The current implementation requires 4 parameters to the parallel routines. 0Commons are not really commons across the parallel routines. Shared memory is only simulated on the parameter lists to parallel routines and commons are only common to a routine and any routines it may call, not routines on other processors. 1 - 0 4 -Three of the SCHEDULE calls have slightly different parameters. A number indicating how many parameters the subroutine requires and an array of the lengths of those parameters has been added to SCHED, SPAWN and PUTQ. The external subroutine parameter to PUTQ has also changed to a character*25 containing the name of the routine. 0 CALL SCHED(nprocs, paralg, nblks, a,b,c) 0 Becomes: 0 N = 4 L(1) = 200 L(2) = 1 L(3) = 1 L(4) = 1 CALL SCHED(nprocs,paralg, N, L, nblks,a,b,c) 0 And 0 external dotprod CALL PUTQ(jobtag, dotprod, x,a,b,c) 0 Becomes: 0 character*25 DOTPROD DOTPROD = 'dotprod' CALL PUTQ(jobtag, DOTPROD, N, L, x, a, b, c) 0The WAIT call mentioned above is required and blocking in Sched3 where it is neither in SCHEDULE. 0LOCKON and LOCKOF are not implemented on the nodes. They are implemented on the host but in a trivial manner. The host has no shared memory so no test and set is needed. Implementing some sort of locks for synchronizing parallel processes may be advantageous. 0Future Improvements +___________________ 0The syntactic differences in the calls to the Sched3 could be eliminated by building a preprocessor to convert SCHEDULE calls to Sched3 calls. The preprocessor would have to be able to deduce the size of variables and pad the parameter lists if necessary. A very aggressive preprocessor could also separate the parallel routines into the proper files for Sched3. A more useful WAIT strategy should be developed to allow work to be done if children have not been completed. The current spawn scheme doesn't allow spawned children to spawn their own children. This mule syndrome should be removed. 1 - 0 5 -Currently the code can be found on the Ncube in the directory /usr/zalb06/sched This contains the source code and the makefiles as well as test programs. - -Adam Beguelin 0ALB:alb/sd End of README echo cmake 1>&2 cat >cmake <<'End of cmake' ucc -o ${1}.ltl ${1}.c lc -f 24 -e ${1} ${1}.ltl rm ${1}.ltl rm ${1}.o End of cmake echo dmain.fn 1>&2 cat >dmain.fn <<'End of dmain.fn' C This is the main program for the node subroutines. It will read C a message from the host which contains the parameters to the subroutine C then call the subroutine with those parameters. parameter (MAXBUFF=4096, PARMS=21, SCOMPLETE=98,COMPLETE=99) integer buff(MAXBUFF), nid, pid, cord, type, cflag, p(20), len integer parent, offset, stat common /hid/hid integer hid common /nwait/nwait integer nwait integer nwrite, nread C Init vars nwait = 0 stat = 1 call whoami(nid, pid, hid, cord) type = PARMS parent = -1 len = nread(buff, MAXBUFF*4, parent, type, cflag) if(len .lt. 0 ) stop stat = 2 offset = 21 do 1 i = 1,20 p(i) = offset offset = offset + buff(i) 1 continue call nsubr(buff(p(1)), buff(p(2)), buff(p(3)), buff(p(4))) c , buff(p(5)), c 1 buff(p(6)),buff(p(7)),buff(p(8)),buff(p(9)),buff(p(10)), c 2 buff(p(11)),buff(p(12)),buff(p(13)),buff(p(14)),buff(p(15)), c 3 buff(p(16)),buff(p(17)),buff(p(18)),buff(p(19)),buff(p(20))) stat = 3 type = COMPLETE if (nwrite(buff,len,parent,type,cflag) .ne. 0) stop stat = 4 stop end End of dmain.fn echo doc.fortran 1>&2 cat >doc.fortran <<'End of doc.fortran' !*********************************************************************** ! * ! A M O C O P R O D U C T I O N C O M P A N Y * ! PROPRIETARY - TO BE MAINTAINED IN CONFIDENCE * ! COPYRIGHTED 1987 Amoco Production Company * !*********************************************************************** ! PROGRAM: SCHEDULE: NUCBE IMPLEMENTATION ! +------------------------------------------------------+ * ! AUTHOR: Adam Beguelin adamb@Colorado.Edu ORIGIN DATE: 87/06/04 * ! LANGUAGE: FORTRAN 77 and Assembler DATE LAST COMPILED: 87/06/16 * End of doc.fortran echo fcs.c 1>&2 cat >fcs.c <<'End of fcs.c' #include main() { char c, sum, cs; int len, i; while((c = getchar()) != EOF) { putchar(c); /* initialize sum with the first byte in the checksum */ sum = c; /* get the record length, low byte then high byte */ len = putchar(c = getchar()); sum += c; len += putchar(c = getchar()) * 0x100; sum += c; /* fprintf(stderr, "len is %x hex\n",len); */ /* read in the chars of the record summing them and putting them back out on the output */ for (i = 1; i <= len-1; ++i) sum += putchar(getchar()); /* suck up the old checksum */ c = getchar(); /*calculate the new checksum and output it */ cs = ~sum + 1; putchar(cs); /* fprintf(stderr, "Old checksum %x New checksum %x Sum %x\n", c, cs,sum); fprintf(stderr, "Old + sum %x New + sum %x\n", c+sum, cs+sum); */ } } End of fcs.c echo fl 1>&2 cat >fl <<'End of fl' ud lstar.hx fcs < ud.out > lstar.q End of fl echo fln 1>&2 cat >fln <<'End of fln' ud lstar.hqx fcs < ud.out > lstar.qn End of fln echo fntoc 1>&2 cat >fntoc <<'End of fntoc' ftsubs.f gtprb.f slen.f slen.fn sp.fn split.fn spt.f sumsqrt.fn swrapup.f wait.fn work.f wrapup.f End of fntoc echo fp 1>&2 cat >fp <<'End of fp' ud ptoi.hx fcs < ud.out > ptoi.q End of fp echo fpn 1>&2 cat >fpn <<'End of fpn' ud ptoi.hqx fcs < ud.out > ptoi.qn End of fpn echo fs 1>&2 cat >fs <<'End of fs' ud star.hx fcs < ud.out > star.q End of fs echo fsn 1>&2 cat >fsn <<'End of fsn' ud star.hqx fcs < ud.out > star.qn End of fsn echo ftsubs.f 1>&2 cat >ftsubs.f <<'End of ftsubs.f' subroutine lockon(i) integer i i = 1 return end subroutine lockoff(i) integer i i = 0 return end subroutine chekin(jobtag) CVD$R NOCONCUR integer jobtag c*********************************************************************** c c this subroutine records problem identified by jobtag is c done to appropriate nodes. these nodes are recorded in c parmq(i,jobtag) where 6 .le. i .le. nchks+5 c checkin consists of decrementing the value in each of these c locations by 1. each of these is done in a critical section c protected by qlock(ichek) c c if the value in parmq(2,ichek) is 0 where ichek is a process c dependent upon this one then ichek is placed on the readyq c by entering the critical section protected by rtlock. the c pointer rtail to the tail of the readyq is incremented c unless task done is to be recorded. task done is placed on c the ready q and the pointer rtail left in place if nchks .eq. 0 c is found. c c see the common block description in libopn for more detail. c c*********************************************************************** parameter (mxprcs = 100,nslots = 30) integer parmq,readyq,qlock,hrlock,trlock,phead,ptail,rhead,rtail, & done,qtail,qtlock common /qdata/ parmq(nslots,mxprcs),phead,ptail, & readyq(mxprcs),rhead,rtail,qtail common /qsync/ qlock(mxprcs),hrlock,trlock,done,qtlock c c c common block description: c c a complete common block description is given in the routine libopn c c**************************************************************************** c c check to see if this process has completed. if not do not checkin c c first ask if any kids spawned by jobtag c if (parmq(4,jobtag) .ne. 0 .or. parmq(5,jobtag) .ne. 0 ) then c c either kids have been spawned or ientry has been referenced c indicating reentry is required c c c find out how many are waiting to complete c if (parmq(4,jobtag) .ne. 0) then call lockon(qlock(jobtag)) parmq(2,jobtag) = parmq(2,jobtag) - parmq(4,jobtag) call lockoff(qlock(jobtag)) endif c c reset number of kids c parmq(4,jobtag) = 0 c c update the number of times this procedure has been c entered c parmq(1,jobtag) = parmq(1,jobtag) + 1 c c return without checkin if all the kids have not c checked in to jobtag yet or if a reentry is required. c process jobtag is not done in either case. c if (parmq(2,jobtag) .ne. 0 ) return c c if ientry has been called but jobtag is not waiting c on any kids then jobtag is placed back on the readyq c if ( parmq(5,jobtag) .ne. 0) then call lockon(trlock) readyq(rtail) = jobtag rtail = rtail + 1 call lockoff(trlock) return endif endif c c the process has completed so chekin proceeds c nchks = parmq(3,jobtag) c c if this is the final process (indicated by nchks .eq. 0) then c record task done. do not advance the tail so task done sequence c is set. all subsequent gtprb queries will leave rhead .eq. rtail c with readyq(rhead) .eq. done. c if (nchks .eq. 0) then call lockon(trlock) readyq(rtail) = done call lockoff(trlock) return endif do 50 j = 6,nchks+5 mychek = parmq(j,jobtag) c c get unique access to the checkin node mychek c and checkin by decrementing the appropriate counter c mchkgo = 1 call lockon(qlock(mychek)) parmq(2,mychek) = parmq(2,mychek) - 1 mchkgo = parmq(2,mychek) call lockoff(qlock(mychek)) c c place mychek on readyq if parmq(2,mychek) is 0 c if (mchkgo .eq. 0 ) then call lockon(trlock) readyq(rtail) = mychek rtail = rtail + 1 call lockoff(trlock) endif 50 continue return c c last card of chekin c end subroutine dep(jobtag,icango,nchks,mychkn) CVD$R NOCONCUR integer jobtag,icango,nchks,mychkn(*) c************************************************************************* c c warning - this routine may only be used by driver in a static definition c of the data dependencies in the task. c c c usage c subroutine xxx() c external yyy c . c . c . c call dep(jobtag,icango,nchks,mychkn) c call putq(jobtag,yyy,) c . c . c . c c this subroutine puts data dependencies for problem on the queue. c no synchronization is necessary because each index of a column of c parmq is associated with a jobtag specified by the user and c associated with a unique schedulable process. the arguments of c putq specify a process and are placed in a column of jobq according c to the menue specified in the common block description given below. c c jobtag is an integer specifying a unique schedulable process c c c icango is a positive integer specifying how many processes must check in c to this process before it can be placed on the readyq. c c nchks is the number of processes that depend upon the completion of c this process. c c mychkn is an integer array specifying the jobtags of the processes c which depend upon completion of this process. c c************************************************************************* c parameter (mxprcs = 100,nslots = 30) integer parmq,readyq,qlock,hrlock,trlock,phead,ptail,rhead,rtail, & done,qtail,qtlock common /qdata/ parmq(nslots,mxprcs),phead,ptail, & readyq(mxprcs),rhead,rtail,qtail common /qsync/ qlock(mxprcs),hrlock,trlock,done,qtlock c c c common block description: c c for a complete common block description see the subroutine libopn c c c place process jobtag on the problem queue c no synchronization required to update qtail since c only one program work executes this code. c if ((jobtag .le. 0 .or. jobtag .gt. mxprcs) .or. & icango .lt. 0 .or. nchks .lt. 0) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' incorrect specification of dependencies ' write(6,*) ' all DEP parameters must be non-negative' write(6,*) ' input was ' write(6,*) ' jobtag ',jobtag ,'.... must be postitive ' write(6,*) ' but less than ',mxprcs write(6,*) ' icango ',icango write(6,*) ' nchks ',nchks write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED ' stop c endif qtail = qtail + 1 next = jobtag parmq(1,next) = 1 parmq(2,next) = icango parmq(3,next) = nchks parmq(4,next) = 0 c c check to see that exactly one node has ncheks set to 0 c if (nchks .eq. 0 .and. done .eq. 0) then done = -2 else if (nchks .eq. 0) done = 0 endif c c specify identifiers of processes which depend on this one c if there are too many abort c if (nchks .gt. nslots - 5) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' attempt to place too many dependencies ' write(6,*) ' on chekin list during call to dep ' write(6,*) ' with jobtag ',jobtag write(6,*) ' ' write(6,*) ' user tried to place ',nchks ,' dependencies ' write(6,*) ' the maximum number is ',nslots - 5 write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED ' stop c endif do 50 j = 1,nchks parmq(j+5,next) = mychkn(j) c if (mychkn(j) .le. 0) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' incorrect specification of dependencies ' write(6,*) ' all mychkn entries must be positive' write(6,*) ' input was ' write(6,*) ' mychkn(',j,') = ',mychkn(j) write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED ' stop endif c 50 continue c return c c last card of dep c end subroutine libopn(nproc) integer nproc c************************************************************************ c c this routine sets locks and initializes variables c and then spawns nproc generic work routines. c c nproc is a positive integer. care should be taken to c match nproc to the number of physical processors c available. c c************************************************************************ parameter (mxprcs = 100,nslots = 30,mxces = 8) integer parmq,readyq,qlock,hrlock,trlock,phead,ptail,rhead,rtail, & done,qtail,qtlock common /qdata/ parmq(nslots,mxprcs),phead,ptail, & readyq(mxprcs),rhead,rtail,qtail common /qsync/ qlock(mxprcs),hrlock,trlock,done,qtlock c integer ispace(10) c c common block description: c c common/qdata/ c c parmq is a two dimensional integer array. each column of c this array represents a schedulable process. a process is c identified by its jobtag which corresponds to a unique c column of parmq. a column of parmq has the following c entries c c parmq(1,jobtag) = nentries c a nonzero integer. if process jobtag c is on the readyq then this integer c is equal to the one plus number of times c process jobtag has been entered. c thus when work executes this process c the integer is equal to the number c of times the process has been entered. c c parmq(2,jobtag) = icango c an integer specifying the number c of processes that must check in c before this process may scheduled c (ie. be placed on the ready queue) c c parmq(3,jobtag) = nchks c an integer specifying the number c of processes that this process c must checkin to. identifiers of c these processes are recorded below. c if nchks .eq. 0 then completion of c this process signifies completion of c task. c c parmq(4,jobtag) = the number of kids spawned by this c process. if this value is zero c then this process has not spawned c any subprocesses. c c parmq(5,jobtag) = entry_flag c has the value 1 if ientry was called c has the value 0 otherwise c c parmq(6:6+nchks,jobtag) is reserved for identifiers of the c nchks c processes that must wait for completion c of this process before they can execute. c c c phead pointer to head of parmq c c ptail pointer to tail of parmq c c readyq a one dimensional integer array that holds the jobtags of those c processes that are ready to execute. c if readyq(j) .eq. done has been set then a return from subroutine c work is indicated. c c rhead is a pointer to the head of readyq c c rtail is a pointer to the tail of readyq c c common/sync/ c c qlock is an integer array of locks. there is one lock for each c column of parmq. the purpose of this lock is to ensure c unique access to a column of parmq during the checkin operation. c c hrlock is an integer lock. the purpose of this lock is to ensure c unique access to the pointer rhead to the head of the readyq. c c trlock is an integer lock. the purpose of this lock is to ensure c unique access to the pointer rtail to the tail of the readyq. c c done is a unique non positive integer set in libopn to indicate c task done. c C if (nproc .gt. mxces) then C write(6,*) '*************SCHED USER ERROR********************' C write(6,*) ' user asking for more physical processors' C write(6,*) ' than are available on this system ' C write(6,*) ' the maximum allowed is nproc = ',mxces C write(6,*) ' ' C write(6,*) ' EXECUTION TERMINATED BY SCHED ' C stop C endif c jobtag = next done = -1 c c set qlocks off c initialize readyq(*) = -1 to set done sequence c initialize reentry indicator in parmq(5,*) c do 100 j = 1,mxprcs qlock(j) = 0 readyq(j) = -1 parmq(5,j) = 0 100 continue c c set readyq locks off c hrlock = 0 trlock = 0 tqlock = 0 c c initialize queue pointers c phead = 1 ptail = 1 rhead = 1 rtail = 1 qtail = 2 next = 1 c c set lock on pointer to head of readyq so c no process may start until all process and data dependencies c have been specified by the user supplied routine driver. c C call lockon(hrlock) c c now spawn virtual processors. these generic work routines will c assume the identity of any schedulable process specified by driver. c CVD$L CNCALL C do 200 j = 1,nproc C call work(j,ispace(j)) C 200 continue return c c last card of libopn c end subroutine nxtag(mypar,jobtag) CVD$R NOCONCUR integer mypar,jobtag c************************************************************************* c c c this subroutine puts data dependencies for problem on the queue. c no synchronization c is necessary because each index of a column of parmq is associated c with a jobtag specified by the user and associated with a unique c schedulable process. the arguments of putq specify a process and are c placed in a column of jobq according to the menue specified in the c common block description given below. c c jobtag is an integer specifying a unique schedulable process c c c icango is a positive integer specifying how many processes must check in c to this process before it can be placed on the readyq. c c nchks is the number of processes that depend upon the completion of c this process. c c mchkin is an integer array specifying the jobtags of the processes c which depend upon completion of this process. c c************************************************************************* c parameter (mxprcs = 100,nslots = 30) integer parmq,readyq,qlock,hrlock,trlock,phead,ptail,rhead,rtail, & done,qtail,qtlock common /qdata/ parmq(nslots,mxprcs),phead,ptail, & readyq(mxprcs),rhead,rtail,qtail common /qsync/ qlock(mxprcs),hrlock,trlock,done,qtlock c c c common block description: c c for a complete common block description see the subroutine libopn c c c c place this process on the next slot in the problem queue c call lockon(qtlock) next = qtail qtail = qtail + 1 call lockoff(qtlock) if ( next .gt. mxprcs) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' through dynamic spawning ' write(6,*) ' the maximum allowed is ',mxprcs write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED ' stop endif c jobtag = next parmq(1,next) = 1 parmq(2,next) = 0 parmq(3,next) = 1 parmq(6,next) = mypar c c update the icango counter of the parent process c by adding 2 to parmq(2,mypar)... prevents race condition. c add 1 to the number of kids spawned by parent mypar c call lockon(qlock(mypar)) parmq(2,mypar) = parmq(2,mypar) + 2 parmq(4,mypar) = parmq(4,mypar) + 1 call lockoff(qlock(mypar)) c c set number of kids spawned by next to zero c parmq(4,next) = 0 c c c return c c last card of nxtag c end subroutine start2 c c this routine allows parallel processing to start after user supplied c driver has completed by unlocking the head of the readyq c parameter (mxprcs = 100,nslots = 30) integer parmq,readyq,qlock,hrlock,trlock,phead,ptail,rhead,rtail, & done,qtail,qtlock common /qdata/ parmq(nslots,mxprcs),phead,ptail, & readyq(mxprcs),rhead,rtail,qtail common /qsync/ qlock(mxprcs),hrlock,trlock,done,qtlock c c c for common block description see subroutine libopn. c if (done .ne. 0) then write(6,*) '*************SCHED USER ERROR********************' if (done .eq. -1 ) then write(6,*) ' no process has set nchks equal to 0 ' else write(6,*) ' more than one process has set nchks to 0 ' endif write(6,*) ' SCHEDULE will not be able to terminate job' write(6,*) ' correctly ' write(6,*) ' ' write(6,*) ' check subroutine passed to initial call to' write(6,*) ' to see that at exactly one call to DEP has ' write(6,*) ' set nchks = 0 ' write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED ' stop endif c if (rhead .eq. rtail) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' no process had an intitial icango of 0 ' write(6,*) ' SCHEDULE could not begin ' write(6,*) ' ' write(6,*) ' check subroutine passed to initial call to' write(6,*) ' to see that at least one call to DEP has ' write(6,*) ' set icango = 0 ' write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED ' stop endif c call lockoff(hrlock) c return c c last card of start2 c end subroutine place(jobtag) CVD$R NOCONCUR integer jobtag c************************************************************************* c c c this subroutine places a problem on the readyq c c jobtag is an integer specifying a unique schedulable process c c c icango is a positive integer specifying how many processes must check in c to this process before it can be placed on the readyq. c c nchks is the number of processes that depend upon the completion of c this process. c c mchkin is an integer array specifying the jobtags of the processes c which depend upon completion of this process. c c************************************************************************* c parameter (mxprcs = 100,nslots = 30) integer parmq,readyq,qlock,hrlock,trlock,phead,ptail,rhead,rtail, & done,qtail,qtlock common /qdata/ parmq(nslots,mxprcs),phead,ptail, & readyq(mxprcs),rhead,rtail,qtail common /qsync/ qlock(mxprcs),hrlock,trlock,done,qtlock c c c common block description: c c for a complete common block description see the subroutine libopn c c c c place this process on readyq if icango is 0 c when icango .eq. 0 this process does not depend on any c others. c icango = parmq(2,jobtag) if (icango .eq. 0 ) then call lockon(trlock) readyq(rtail) = jobtag rtail = rtail + 1 call lockoff(trlock) endif c c last card of place c return end integer function ientry(jobtag,nentrs) c integer jobtag c***************************************************************************** c c this routine will allow process jobtag to continue after c spawned processes have all checked in. it should only be called if c processes have been spawned by jobtag through the use of c the subroutine spawn. c c go to (1000,2000,...,N000), ientry(jobtag,N) c 1000 continue c . c . c . c do 10 j = 1,nproc c . c . (set parameters to define spawned process) c . c call nxtag(myid,jobtag) c call spawn(myid,jobtag,subname,) c 10 continue c return c 2000 continue c . c . c . c return c N000 continue c c return c end c c this subroutine returns the number of times process jobtag c has been entered. if that number is equal to the total c number nentrs of expected reentries then parmq(5,jobtag) c is set to 0 indicating no more reentries required. c c***************************************************************************** parameter (mxprcs = 100,nslots = 30) integer parmq,readyq,qlock,hrlock,trlock,phead,ptail,rhead,rtail, & done,qtail,qtlock common /qdata/ parmq(nslots,mxprcs),phead,ptail, & readyq(mxprcs),rhead,rtail,qtail common /qsync/ qlock(mxprcs),hrlock,trlock,done,qtlock c c report the entry point where process jobtag should resume c computation c if (nentrs .lt. 2) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' user call to IENTRY with number of ' write(6,*) ' labels in nentrs set less than 2 ' write(6,*) ' from process ',jobtag write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED ' stop endif ientry = parmq(1,jobtag) if (ientry .lt. nentrs) then parmq(5,jobtag) = nentrs else parmq(5,jobtag) = 0 endif c return c c last card of ientry c end logical function wait(jobtag,ienter) c integer jobtag,ienter c***************************************************************************** c c this routine will allow process jobtag to continue after c spawned processes have all checked in. it should only be called if c processes have been spawned by jobtag through the use of c the subroutine spawn. this routine must be used in conjunction with c subroutine prtspn. the required syntax is c c go to (1000,...,L000,...,N000), ientry(mytag,N) c 1000 continue c . c . c . c do 100 j = 1,nproc c . c . (set parameters to define spawned process) c . c call nxtag(mytag,jobtag) c call spawn(mytag,jobtag,subname,) c 100 continue c label = L c if (wait(jobtag,label)) return c L000 continue c . c . c . c c if this subroutine returns a value of .true. then the calling process c jobtag should issue a return. if a value of .false. is returned then c the calling process jobtag should resume execution at the c statement immediately following the reference to wait (ie. at L000 in c the example above. a return value .true. indicates that some spawned c process has not yet completed and checked in. a return value .false. c indicates all spawned processes have checked in. c c***************************************************************************** parameter (mxprcs = 100,nslots = 30) integer parmq,readyq,qlock,hrlock,trlock,phead,ptail,rhead,rtail, & done,qtail,qtlock common /qdata/ parmq(nslots,mxprcs),phead,ptail, & readyq(mxprcs),rhead,rtail,qtail common /qsync/ qlock(mxprcs),hrlock,trlock,done,qtlock c c c check the icango counter to see if all spawned processes (kids) c have checked in. c icango = 1 call lockon(qlock(jobtag)) icango = parmq(2,jobtag) - parmq(4,jobtag) call lockoff(qlock(jobtag)) c if (icango .eq. 0) then c c all kids are done ... dont wait (ie return false) c wait = .false. c c record re_entry label where computation is to c resume after wait is complete c parmq(1,jobtag) = ienter c if (ienter .gt. parmq(5,jobtag)) then write(6,*) '*************SCHED USER ERROR*****************' write(6,*) ' executing SCHEDULE function WAIT ' write(6,*) ' return label larger than the maximum ' write(6,*) ' specified by user in call to ientry ' write(6,*) ' from process jobtag ',jobtag write(6,*) ' ' write(6,*) ' the maximum reentry number is ' write(6,*) ' ', parmq(5,jobtag) write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED ' stop endif c c set last re_entry indication (parmq(5,jobtag) = 0) c if this reentry point corresponds to last one c (recorded in parmq(5,jobtag) during call to ientry) c if (ienter .eq. parmq(5,jobtag)) parmq(5,jobtag) = 0 c else c c kids are not done c wait = .true. c c a checkin will be made so set the number of c entries to return label ienter - 1 to get c correct entry point after checkin c parmq(1,jobtag) = ienter - 1 c endif c return c c last card of wait c end End of ftsubs.f echo gtprb.f 1>&2 cat >gtprb.f <<'End of gtprb.f' integer function gtprb(jobtag) c************************************************************************** c c this routine gets unique access to the head of the readyq c claims the pointer to the next schedulable process if there c is one and returns with a nonzero value when there is c a process to schedule. if task done has been recorded the value c zero is returned in gtprb. if a nonzero value is returned in c gtprb, the integer jobtag will contain the identifier of the process c that is to be scheduled. c c output parameters c c jobtag an integer containing the next process to be executed c c gtprb the return value of this integer function is: c c 0 if task done has been posted or if there is no job to post c c nonzero if a schedulable process has been claimed. c c c*************************************************************************** parameter (mxprcs = 100,nslots = 30) integer parmq,readyq,qlock,hrlock,trlock,phead,ptail,rhead,rtail, & done,qtail,qtlock common /qdata/ parmq(nslots,mxprcs),phead,ptail, & readyq(mxprcs),rhead,rtail,qtail common /qsync/ qlock(mxprcs),hrlock,trlock,done,qtlock common /nodeList/ nodeAvail, numAvail integer nodeAvail(0:64), numAvail c c c common block description: c c for a complete common block description see the routine libopn c c C Bail out if their aren't enough nodes if (numAvail .le. 0) then gtprb = 0 return endif 10 continue mhead = -1 call lockon(hrlock) c c gain access to head of readyq. if task done has not been recorded c then increment the head of the readyq. otherwise the head pointer c is left fixed so the next active process will receive task done. c if (rhead .lt. rtail) then mhead = rhead rhead = rhead + 1 endif call lockoff(hrlock) if (mhead .gt. 0) then c c there was a work unit on the readyq c jobtag = readyq(mhead) c if (jobtag .ne. done) then c c record the subroutine call identifier in gtprb and return c the process identifier in jobtag. c gtprb = parmq(1,jobtag) if (gtprb .gt. 1 .and. parmq(5,jobtag) .eq. 0) gtprb = -1 c else c c task done has been indicated. request a return from subroutine work c by returning the value 0 in gtprb. c gtprb = 0 c endif else c jobtag = readyq(rhead) if (jobtag .eq. done) then c c task done has been posted c gtprb = 0 c else c c there was not any work on the readyq c gtprb = 0 c endif endif return c c last card of gtprb c end End of gtprb.f echo lstar.hqx 1>&2 cat >lstar.hqx <<'End of lstar.hqx' 00000000 02 08 00 00 05 4C 53 54 41 52 6B 04 1D 00 00 00 <.....LSTARk.....> 00000010 19 43 46 47 2F 4E 43 55 42 45 20 46 37 37 20 56 <.CFG/NCUBE F77 V> 00000020 31 2E 35 20 20 20 20 20 20 20 9C 08 09 00 01 85 <1.5 ......> 00000030 1F 04 43 4F 44 45 2B 08 09 00 02 85 00 04 44 41 <..CODE+.......DA> 00000040 54 41 4A 08 0A 00 04 85 00 05 53 54 41 43 4B EA 00000050 1C 06 00 00 00 7B 20 7D C6 1C 04 00 00 00 7A 66 <.....{ }......zf> 00000060 1C 09 00 00 00 7B 20 83 02 83 01 37 1C 06 00 00 <.....{ ....7....> 00000070 00 75 83 03 E3 1C 08 00 00 00 7F 83 04 83 04 4F <.u.............O> 00000080 1C 0A 00 00 00 74 80 80 80 02 83 05 5C 0C 10 00 <.....t......\...> 00000090 01 00 00 00 05 4C 53 54 41 52 00 00 02 02 06 4E <.....LSTAR.....N> 000000A0 10 22 00 01 00 04 CF FC 95 00 CF 04 8F 08 C1 04 <."..............> 000000B0 41 C0 0B 04 8F 0C C1 04 41 40 04 FC CF EF 08 04 <........A@......> 000000C0 FC CF EF 08 FE 14 02 00 00 EA <..........> End of lstar.hqx echo lstar.hx 1>&2 cat >lstar.hx <<'End of lstar.hx' 00000000 80 07 00 05 4C 53 54 41 52 EE 96 28 00 00 04 43 <....LSTAR..(...C> 00000010 4F 44 45 04 44 41 54 41 05 53 54 41 43 4B 0A 4C 00000020 53 54 41 52 5F 43 4F 44 45 0A 4C 53 54 41 52 5F 00000030 44 41 54 41 77 98 07 00 60 3E 00 05 02 01 BB 98 ......> 00000040 07 00 60 00 00 06 03 01 F7 98 07 00 54 00 00 04 <..`.........T...> 00000050 04 01 04 8E 04 00 00 00 80 EE 9C 09 00 40 01 41 <.............@.A> 00000060 02 00 01 01 02 D3 8E 06 00 00 00 7B 20 7D 54 8E <...........{ }T.> 00000070 04 00 00 00 7A F4 8E 09 00 00 00 7B 20 83 03 83 <....z......{ ...> 00000080 02 C3 8E 06 00 00 00 75 83 04 70 8E 08 00 00 00 <.......u..p.....> 00000090 7F 83 05 83 05 DB 8E 0A 00 00 00 74 80 80 72 02 <...........t..r.> 000000A0 83 06 F7 90 0C 00 00 01 05 4C 53 54 41 52 00 00 <.........LSTAR..> 000000B0 07 D1 A0 11 00 01 00 00 1E B8 00 00 8E D8 55 8B <..............U.> 000000C0 EC 81 EC 00 00 D9 9C 04 00 C8 02 5D 39 A0 35 00 <...........]9.5.> 000000D0 01 0D 00 C4 5E 0C 26 8B 07 26 8B 57 02 C4 5E 08 <................> 000000E0 50 52 26 8B 07 26 8B 57 02 87 C3 8E C2 5A 58 26 <...^.&..&.W..^.&> 000000F0 89 07 26 89 57 02 8B E5 5D 1F CA 08 00 8B E5 5D <..&.W...]......]> 00000100 1F CA 08 00 6E 8A 02 00 00 74 <....n....t> End of lstar.hx echo makefile 1>&2 cat >makefile <<'End of makefile' .SUFFIXES: .f .fn .q .qn .exe .exen .f.q: f77 -d -Mcpd -Ccd $*.f .fn.qn: f77 -d -N $*.fn split: split.qn sched.libn dmain.qn ldn -b -A472K split.qn dmain.qn sched.libn \ /usr/zalb06/lib/ptr.libn \ /usr/ztlc02/lib.tlc/flibn \ /lib/f77n.lib lcn -e split -s split.sym cn.out ptr.lib: make -f pmake ptr.lib ptr.libn: make -f pnmake ptr.libn sched: sched.lib sched.libn dmain.qn ptr.lib ptr.libn sched.lib: ftsubs.q putq.q work.q wrapup.q slen.q swrapup.q sndtag.q \ sched.q gtprb.q libcr sched.lib ftsubs.q putq.q work.q wrapup.q slen.q swrapup.q \ sndtag.q sched.q gtprb.q sched.libn: spawn.qn wait.qn slen.qn nxtag.qn libcr sched.libn spawn.qn wait.qn slen.qn nxtag.qn sptest: sp spt sumsqrt sp: dmain.qn sp.qn sched.libn ldn -b -A472K sp.qn dmain.qn sched.libn \ /usr/zalb06/lib/ptr.libn \ /usr/ztlc02/lib.tlc/flibn \ /lib/f77n.lib lcn -e sp -s sp.sym cn.out sumsqrt: dmain.qn sumsqrt.qn sched.libn ldn -b -A472K sumsqrt.qn dmain.qn sched.libn \ /usr/zalb06/lib/ptr.libn \ /usr/ztlc02/lib.tlc/flibn \ /lib/f77n.lib lcn -e sumsqrt -s sumsqrt.sym cn.out spt: spt.q sched.lib ld -g -x1 spt.q \ sched.lib \ /usr/zalb06/lib/ptr.lib \ /usr/ztlc02/lib.tlc/flib /lib/f77.lib lc -e spt -s spt.sym c.out End of makefile echo nxtag.fn 1>&2 cat >nxtag.fn <<'End of nxtag.fn' C Get the jobtag from the host (who will call nxtag on the host) C Note parent is the jobtag of the parent, here it is just a dummy C to provide a consistent interface with schedule subroutine nxtag(parent, jobtag) integer parent, jobtag common /hid/hid integer hid integer len, NXTAG, cflag len = 4 NXTAG = 1999 C Request the jobtag from the host if (nwrite(parent, len, hid, NXTAG, cflag) .lt. 0) stop C Read the jobtag from the host if (nread(jobtag, len, hid, NXTAG, cflag) .lt. 0) stop return end End of nxtag.fn echo pmake 1>&2 cat >pmake <<'End of pmake' .SUFFIXES: .ftn .q fcs: fcs.c sh < cmake fcs star.q: star.hx sh < fs ptoi.q: ptoi.hx sh < fp lstar.q: lstar.hx sh < fl ptr.lib: fcs ptoi.q star.q lstar.q libcr ptr.lib ptoi.q star.q lstar.q End of pmake echo pnmake 1>&2 cat >pnmake <<'End of pnmake' .SUFFIXES: .f .qn .fn .q fcs: fcs.c sh < cmake fcs star.qn: star.hqx sh < fsn ptoi.qn: ptoi.hqx sh < fpn lstar.qn: lstar.hqx sh < fln ptr.libn: fcs lstar.qn ptoi.qn star.qn libcr ptr.libn lstar.qn ptoi.qn star.qn End of pnmake echo ptoi.hqx 1>&2 cat >ptoi.hqx <<'End of ptoi.hqx' 00000000 02 07 00 00 04 50 54 4F 49 B7 04 1D 00 00 00 19 <.....PTOI.......> 00000010 43 46 47 2F 4E 43 55 42 45 20 46 37 37 20 56 31 00000020 2E 35 20 20 20 20 20 20 20 9C 08 09 00 01 85 22 <.5 ......"> 00000030 04 43 4F 44 45 28 08 09 00 02 85 04 04 44 41 54 <.CODE(.......DAT> 00000040 41 46 08 0A 00 04 85 00 05 53 54 41 43 4B EA 1C 00000050 06 00 00 00 7B 20 7D C6 1C 07 00 00 00 77 80 83 <....{ }......w..> 00000060 01 62 1C 04 00 00 00 7A 66 1C 09 00 00 00 7B 20 <.b.....zf.....{ > 00000070 83 03 83 02 35 1C 06 00 00 00 75 83 04 E2 1C 06 <....5.....u.....> 00000080 00 00 00 7F 83 05 D7 1C 0A 00 00 00 74 80 80 80 <............t...> 00000090 01 83 06 5C 0C 0F 00 01 00 00 00 04 50 54 4F 49 <...\........PTOI> 000000A0 00 00 02 02 07 99 10 25 00 01 00 04 CF FC 95 00 <.......%........> 000000B0 CF 04 8F 08 C0 04 C0 FA 00 00 00 00 04 FA 00 00 <......@.........> 000000C0 00 00 C0 04 FC CF EF 04 04 FC CF EF 04 BC 12 07 <................> 000000D0 00 0D 20 02 13 20 02 83 14 02 00 00 EA <.. .. .......> End of ptoi.hqx echo ptoi.hx 1>&2 cat >ptoi.hx <<'End of ptoi.hx' 00000000 80 06 00 04 50 54 4F 49 3A 96 26 00 00 04 43 4F <....PTOI:.&...CO> 00000010 44 45 04 44 41 54 41 05 53 54 41 43 4B 09 50 54 00000020 4F 49 5F 43 4F 44 45 09 50 54 4F 49 5F 44 41 54 00000030 41 0F 98 07 00 60 2C 00 05 02 01 CD 98 07 00 60 00000040 04 00 06 03 01 F3 98 07 00 54 00 00 04 04 01 04 <.........T......> 00000050 8E 04 00 00 00 80 EE 9C 09 00 40 01 41 02 00 01 <..........@.A...> 00000060 01 02 D3 8E 06 00 00 00 7B 20 7D 54 8E 07 00 00 <........{ }T....> 00000070 00 77 80 83 02 EF 8E 04 00 00 00 7A F4 8E 09 00 <.w.........z....> 00000080 00 00 7B 20 83 04 83 03 C1 8E 06 00 00 00 75 83 <..{ ..........u.> 00000090 05 6F 8E 06 00 00 00 7F 83 06 64 8E 0A 00 00 00 <.o........d.....> 000000A0 74 80 80 72 01 83 07 F7 90 0B 00 00 01 04 50 54 000000B0 4F 49 00 00 08 1C 8E 07 00 00 00 77 80 83 02 EF 000000C0 8E 09 00 00 00 7B 20 83 04 83 09 BB 8E 06 00 00 <.....{ .........> 000000D0 00 75 83 0A 6A 8E 06 00 00 00 7F 83 0B 5F 8E 0A <.u..j........_..> 000000E0 00 00 00 74 80 80 72 01 83 0C F2 7E 0D 00 00 00 <...t..r....~....> 000000F0 02 05 3F 50 54 4F 49 00 00 02 F1 8E 07 00 00 00 <..?PTOI.........> 00000100 77 80 83 02 EF 7A 10 00 00 01 04 50 54 4F 49 00 00000110 00 00 00 C0 04 00 0D 64 7E 09 00 C2 80 01 01 50 <.......d~......P> 00000120 08 00 0E CF 7C 01 00 83 A0 30 00 01 00 00 1E B8 <....|....0......> 00000130 00 00 8E D8 55 8B EC 81 EC 00 00 C4 5E 08 8B C3 <....U.......^.&.> 00000140 8C C2 90 90 90 A3 00 00 89 16 02 00 8B E5 5D 1F <.&.W..........].> 00000150 CA 04 00 8B E5 5D 1F CA 04 00 16 9C 0A 00 C8 02 <.....]..........> 00000160 5D C4 18 5D C4 1C 5D BD 94 07 00 00 01 04 00 0D <]..]..].........> 00000170 00 53 8A 02 00 00 74 <.S....t> End of ptoi.hx echo putq.f 1>&2 cat >putq.f <<'End of putq.f' C********************************************************************** subroutine strtspawn(node, chan) integer node, chan C This subroutine will read a message from the node requesting C a process be spawned. It will calculate the parent, get C a jobtag and call spawn to put the job in the queue character char character*25 subname integer NSPAWN,nread, err, cflag, len, parent, jobtag integer buff(100) common /nodeList/ nodeAvail, numAvail integer nodeAvail(0:64), numAvail C Init locals NSPAWN = 128 subname = ' ' C Read the subroutine name from the node len = nread(chan, buff, 400, node, NSPAWN, cflag) if (len .lt. 0) then Print *, 'Error in reading spawned subroutine' print *, 'Node: ',node Print *, 'Error: ',err err = nclose(chan) stop endif C assign the buffer to subname len = len/4 do 10 i = 1,len-1 subname(i:i) = char(buff(i)) 10 continue C get the jobtag jobtag = buff(len) C spawn the process call spawn(node, jobtag, subname) return end C********************************************************************** integer function findNode(nodeAvail, numAvail, nprocs, jobtag) C This function will return a node from the nodeAvail list which C is available for work. integer nodeAvail(0:64), numAvail, nprocs, jobtag integer i,ans do 1 i = 0,nprocs-1 if (nodeAvail(i) .eq. 1) then ans = i C exit goto 2 endif 1 continue 2 continue C store the jobtag here so it can be recalled later nodeAvail(ans) = -1 * jobtag numAvail = numAvail - 1 findNode = ans return end C********************************************************************** subroutine putq(jobtag, subname, n,l,p1,p2,p3,p4) parameter (mxprcs = 100) c 1 p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16, c 2 p17,p18,p19,p20) integer jobtag,n,l(*) character*25 subname integer p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16, 1 p17,p18,p19,p20 C Data Structure containing the subroutine names and the paramaters C IndexSubname(jobtag) contains the name of the subroutine to be C called C IndexParms(n, 1, jobtag) is the pointer to the nth parameter to jobtag C IndexParms(n, 2, jobtag) is the length of the nth parameter to jobtag integer IndexParms(20,2,mxprcs) character*25 IndexSubname(mxprcs) common /Index/IndexSubname,IndexParms integer ptoi,i,star Copy pointers to paramater p1,...,p20 into IndexParms(1,1)...IndexParms(20,1) IndexParms(1,1,jobtag) = ptoi(p1) IndexParms(2,1,jobtag) = ptoi(p2) IndexParms(3,1,jobtag) = ptoi(p3) IndexParms(4,1,jobtag) = ptoi(p4) IndexParms(5,1,jobtag) = ptoi(p5) IndexParms(6,1,jobtag) = ptoi(p6) IndexParms(7,1,jobtag) = ptoi(p7) IndexParms(8,1,jobtag) = ptoi(p8) IndexParms(9,1,jobtag) = ptoi(p9) IndexParms(10,1,jobtag) = ptoi(p10) IndexParms(11,1,jobtag) = ptoi(p11) IndexParms(12,1,jobtag) = ptoi(p12) IndexParms(13,1,jobtag) = ptoi(p13) IndexParms(14,1,jobtag) = ptoi(p14) IndexParms(15,1,jobtag) = ptoi(p15) IndexParms(16,1,jobtag) = ptoi(p16) IndexParms(17,1,jobtag) = ptoi(p17) IndexParms(18,1,jobtag) = ptoi(p18) IndexParms(19,1,jobtag) = ptoi(p19) IndexParms(20,1,jobtag) = ptoi(p20) IndexSubname(jobtag) = subname Copy the lengths of each parameter into IndexParms(1,2)...IndexParms(20,2) do 1 i = 1,n IndexParms(i,2,jobtag) = l(i) 1 continue C Zero out the rest of the lengths in the buffer do 2 i = n+1,20 IndexParms(i,2,jobtag) = 0 2 continue call place(jobtag) return end C********************************************************************** integer function mindim(p) integer p C Find the minmum dimension of the cube containing p processors C Return -1 if the dim cannot be found in the range [0...6] integer i, dim dim = -1 do 10 i = 6,0,-1 if (2**i .ge. p) then dim = i endif 10 continue mindim = dim return end C********************************************************************** C This subroutine will allow a node to spawn another process. When C a node process wants to spawn a process it sends the request. The C subname is put in the Q and when the process is loaded the parent C node is notified and the parent will send the parameters to the C spawned node subroutine spawn(parent, jobtag, subname) integer parent, jobtag character*25 subname parameter (nslots = 30, mxprcs = 100) Common variable declarations: integer IndexParms(20,2,mxprcs) character*25 IndexSubname(mxprcs) common /Index/IndexSubname,IndexParms common /nodeList/ nodeAvail, numAvail integer nodeAvail(0:64), numAvail C Store the subname of the spawned process IndexSubname(jobtag) = subname C Store the parent's jobtag in the len feild of the first parameter C This field will only be negative if the process is a dynamic process C The parameter field isn't needed since the parameters are sent directly C from the parent node to the spawned node IndexParms(1,2,jobtag) = -1 * (parent+1) C Do this for recursive spawing if (IndexSubname(jobtag) .eq. 1 'clone ') then IndexSubname(jobtag) = IndexSubname(-nodeAvail(parent)) endif call place(jobtag) return end End of putq.f echo sched.f 1>&2 cat >sched.f <<'End of sched.f' C********************************************************************** subroutine sched(nprocs, subname, p1,p2,p3,p4) parameter (nslots = 30, mxprcs = 100) C ,p2,p3,p4,p5,p6,p7,p8,p9) C 1 p10,p11,p12,p13,p14,p15,p16,p17,p18,p19,p20) integer nprocs external subname integer p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16, 1 p17,p18,p19,p20 C Local Declarations integer i, anyNode, COMPLETE, chan, dim, myjob, NSPAWN integer anyType,err integer SCOMPLETE, type, NXTG integer parmq,readyq,qlock,hrlock,trlock,phead,ptail,rhead,rtail, 1 done,qtail,qtlock common /qdata/ parmq(nslots,mxprcs),phead,ptail, 1 readyq(mxprcs),rhead,rtail,qtail common /qsync/ qlock(mxprcs),hrlock,trlock,done,qtlock integer mindim, nopen, ntest, findNode, gtprb common /nodeList/ nodeAvail, numAvail integer nodeAvail(0:64), numAvail C Init locals COMPLETE = 99 SCOMPLETE = 98 NSPAWN = 128 NXTG = 1999 C Init nodeList do 1 i = 0,64 nodeAvail(i) = 1 1 continue C Find the minimum dimension that has the requested number of processors dim = mindim(nprocs) C If the request was invalid quit if (dim.eq.-1) then print *,'Too many processors requested' stop endif C Try to open a cube of the correct dimension 10 chan = nopen(dim) C if the desired cube size is not available try the next smaller one C until there are no nodes available if (chan .lt. 0) then dim = dim - 1 if (dim .lt. 0) then print *, 'Not enough nodes available' stop endif goto 10 endif C if the assigned cube dim is smaller than requested notify the user if (2**dim .lt. nprocs) then nprocs = 2**dim print *, 'Only ', nprocs, ' Processors available' print *, 'Using ', nprocs, ' Processors' endif C init the local number of available processors numAvail = nprocs call libopn(nprocs) C start the main user routine that sets up the dependancies call subname(p1,p2,p3,p4) c ,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16, c 1 p17,p18,p19,p20) C Here is the main loop which loads processes on the nodes and C services them. C the parallel processing is done when there are no more jobs to do C and the processors are all free. C while readyQ is not empty or the numAvail is not nprocs 20 if ((rhead .eq. rtail) .and. (numAvail .eq. nprocs)) then err = nclose(chan) print *,'closed cube',err return endif C if there is a job to do, start it myjob = gtprb(jobtag) if (myjob .ne. 0) then call work(findNode(nodeAvail,numAvail,nprocs,jobtag) 1 ,jobtag,chan) endif anyNode = -1 anyType = -1 C if a process completes then wrapit up err = ntest(chan, anyNode, anyType) if (err .lt. 0) then goto 20 endif if (anyType .eq. COMPLETE) then call wrapup(anyNode,chan) goto20 endif if (anyType .eq. SCOMPLETE) then C if a spawned process completes call swrapup(anyNode,chan) goto 20 endif if (anyType .eq. NSPAWN) then C if a spawn is requested call strtspawn(anyNode,chan) goto 20 endif if (anyType .eq. NXTG) then C if a call to nxtag is requested call sndtag(anyNode, chan) goto 20 endif goto 20 end End of sched.f echo slen.f 1>&2 cat >slen.f <<'End of slen.f' integer function slen(s) character*25 s parameter (MAXLEN = 25) integer i,ans ans = MAXLEN do 1 i = 1,MAXLEN if (s(i:i) .eq. ' ') then ans = i - 1 goto 2 endif 1 continue 2 slen = ans return end End of slen.f echo slen.fn 1>&2 cat >slen.fn <<'End of slen.fn' integer function slen(s) character*25 s parameter (MAXLEN = 25) integer i,ans ans = MAXLEN do 1 i = 1,MAXLEN if (s(i:i) .eq. ' ') then ans = i - 1 goto 2 endif 1 continue 2 slen = ans return end End of slen.fn echo sndtag.f 1>&2 cat >sndtag.f <<'End of sndtag.f' C Process the request for a jobtag, call nxtag and send the result to C the requesting node. subroutine sndtag(node, chan) integer node, chan common /nodeList/ nodeAvail, numAvail integer nodeAvail(0:64), numAvail integer jobtag, NXTG, parent, err NXTG = 1999 C This message contains no info, it is just a request. It's presence C denotes the request err = nread(chan, jobtag, 4, node, NXTG) if(err .lt. 0) then print *,'SNDTAG: Error in nread ',err if(nclose(chan) .lt. 0) print *,'err in nclose' stop endif C Find the parent's jobtag parent = -1 * nodeAvail(node) C Get the new jobtag call nxtag(parent, jobtag) C Send the result to the node err = nwrite(chan, jobtag, 4, node, NXTG) if(err .lt. 0) then print *,'SNDTAG: Error in nwrite ',err if(nclose(chan) .lt. 0) print *,'err in nclose' stop endif return end End of sndtag.f echo sp.fn 1>&2 cat >sp.fn <<'End of sp.fn' subroutine nsubr(a,b,c) integer a,b real c integer b1,b2,l(4),n, stat real c1,c2,x character*25 subname subname = 'sumsqrt' n = 4 l(1) = 1 l(2) = 1 l(3) = 1 l(4) = 0 b1 = (b-a)/2 b2 = b1 + 1 call nxtag(dummy, jobtag) call spawn(dummy,jobtag,subname, n, l, a,b1,c1,dummy) stat = 1 call nxtag(dummy, jobtag) call spawn(dummy,jobtag,subname, n, l, b2,b,c2,dummy) stat = 2 call wait(n) stat = 3 c = c1 + c2 x = c return end End of sp.fn echo spawn.fn 1>&2 cat >spawn.fn <<'End of spawn.fn' subroutine spawn(mytag, jobtag,subname, n,l,p1,p2,p3,p4) c mytag is just a dummy used for consistency integer mytag integer jobtag,n,l(*),p1,p2,p3,p4 character*25 subname, sub integer slen C Local Declarations character*1 c integer type,len,err,cflag,NSPAWN integer dest,ptoi,star,nwrite integer buff(4096),offset,i,j,PARAMS,SPDEST C Global Declarations common /hid/hid integer hid common /nwait/nwait integer nwait integer IndexParms(20,2,0:63) common /Index/IndexParms C Init vars sub = subname NSPAWN = 128 SPDEST = 131 PARAMS = 21 len = slen(subname) C load the name into the buffer do 1 i = 1,len buff(i) = ichar(subname(i:i)) 1 continue C Load the jobtag in the buffer len = len + 1 buff(len) = jobtag len = len * 4 C Send the request to the host if (nwrite(buff, len, hid, NSPAWN,cflag).lt.0) stop C Read the node from the host if(nread(dest,4,hid,SPDEST,cflag).lt.0) stop Copy pointers to paramater p1,...,p20 into IndexParms(1,1)...IndexParms(20,1) IndexParms(1,1,dest) = ptoi(p1) IndexParms(2,1,dest) = ptoi(p2) IndexParms(3,1,dest) = ptoi(p3) IndexParms(4,1,dest) = ptoi(p4) C Save the lengths of the parameters do 15 i = 1,n IndexParms(i,2,dest) = l(i) 15 continue C Zero out the rest of the lengths in the list do 20 i = n+1,20 IndexParms(i,2,dest) = 0 20 continue C The buffer must be loaded with the parameters to the function. C The buffer will contain the lengths of each parameter and the C data making up each parameter in the following format C | l1 | l2 | ... | l20 | p1 ... | p2 ... | ... | P20 ... | C There may not always be 20 parameters but there will always be C 20 lengths. This is because fortran will not allow a variable C number of parameters in a subprogram call such as the one to be C generated on the node from this message. C Calculate initial offset offset = 20 do 10 i = 1,20 C Copy the length of the parameter into the buffer buff(i) = IndexParms(i,2,dest) C Copy all of the data into the buffer to be sent out do 5 j = 1,buff(i) buff(j + offset) = 1 star(IndexParms(i,1,dest)+(j-1)*4) c 1 star(IndexParms(i,1,dest)) 5 continue C Calculate next offset offset = offset + buff(i) 10 continue len = offset*4 err = nwrite(buff, len, dest, PARAMS,cflag) if (err .lt. 0) stop C Count how many processes must be waited on nwait = nwait + 1 return end End of spawn.fn echo split.fn 1>&2 cat >split.fn <<'End of split.fn' subroutine nsubr(myid,a,nlevls,klevl) integer a(*),klevl(*),myid(*),rnode c external clone character*25 clone integer n, l(4) integer kl, nl c write(6,*) ' from split ',a(1) c kl = klevl(1) nl = nlevls if (klevl(1) .ge. nlevls) return c clone = 'clone' lnode = 2*a(1) rnode = lnode + 1 indx = lnode - a(1) + 1 a(indx) = lnode a(indx+1) = rnode mytag = myid(a(1)) c n = 4 l(1) = 1 l(2) = 256 - indx + 1 l(3) = 1 l(4) = 7 call nxtag(mytag,jobtag) myid(lnode) = jobtag call spawn(mytag,jobtag,clone,n,l,myid,a(indx),nlevls, 1 klevl(2)) c call nxtag(mytag,jobtag) myid(rnode) = jobtag l(2) = l(2) - 1 call spawn(mytag,jobtag,clone,n,l,myid,a(indx+1),nlevls, 1 klevl(2)) c call wait(myid(1)) return end End of split.fn echo spt.f 1>&2 cat >spt.f <<'End of spt.f' external paralg integer a,b, nprocs common /c/c1,c2 real c,c1,c2 nprocs = 3 print *,'using 3 procs' a = 1 b = 65000 call sched(nprocs, paralg, a,b,c) c = c1 + c2 print *,' c c1 c2',c,c1,c2 print *, 'the sum of the square roots from ',a,'to',b,' is ',c stop end subroutine paralg(a,b,c) integer a,b real c integer a1,a2,b1,b2 common /c/c1,c2 real c1,c2 integer jobtag, icango, nchks, mychkn, n, l(3) character*25 subname jobtag = 1 icango = 0 nchks = 1 mychkn = 1 call dep(jobtag, icango, nchks, mychkn) n = 3 l(1) = 1 l(2) = 1 l(3) = 1 subname = 'sp' call putq(jobtag, subname, n, l, a,b,c) return end End of spt.f echo star.hqx 1>&2 cat >star.hqx <<'End of star.hqx' 00000000 02 07 00 00 04 53 54 41 52 B9 04 1D 00 00 00 19 <.....STAR.......> 00000010 43 46 47 2F 4E 43 55 42 45 20 46 37 37 20 56 31 00000020 2E 35 20 20 20 20 20 20 20 9C 08 09 00 01 85 26 <.5 ......&> 00000030 04 43 4F 44 45 24 08 09 00 02 85 04 04 44 41 54 <.CODE$.......DAT> 00000040 41 46 08 0A 00 04 85 00 05 53 54 41 43 4B EA 1C 00000050 06 00 00 00 7B 20 7D C6 1C 07 00 00 00 77 80 83 <....{ }......w..> 00000060 01 62 1C 04 00 00 00 7A 66 1C 09 00 00 00 7B 20 <.b.....zf.....{ > 00000070 83 03 83 02 35 1C 06 00 00 00 75 83 04 E2 1C 06 <....5.....u.....> 00000080 00 00 00 7F 83 05 D7 1C 0A 00 00 00 74 80 80 80 <............t...> 00000090 01 83 06 5C 0C 0F 00 01 00 00 00 04 53 54 41 52 <...\........STAR> 000000A0 00 00 02 02 07 9B 10 29 00 01 00 04 CF FC 95 00 <.......)........> 000000B0 CF 04 8F 08 C1 04 41 C0 0B 04 40 FA 00 00 00 00 <..........@.....> 000000C0 04 FA 00 00 00 00 C0 04 FC CF EF 04 04 FC CF EF <................> 000000D0 04 E8 12 07 00 11 20 02 17 20 02 7B 14 02 00 00 <...... .. .{....> 000000E0 EA <.> End of star.hqx echo star.hx 1>&2 cat >star.hx <<'End of star.hx' 00000000 80 06 00 04 53 54 41 52 3C 96 26 00 00 04 43 4F <....STAR<.&...CO> 00000010 44 45 04 44 41 54 41 05 53 54 41 43 4B 09 53 54 00000020 41 52 5F 43 4F 44 45 09 53 54 41 52 5F 44 41 54 00000030 41 13 98 07 00 60 38 00 05 02 01 C1 98 07 00 60 00000040 04 00 06 03 01 F3 98 07 00 54 00 00 04 04 01 04 <.........T......> 00000050 8E 04 00 00 00 80 EE 9C 09 00 40 01 41 02 00 01 <..........@.A...> 00000060 01 02 D3 8E 06 00 00 00 7B 20 7D 54 8E 07 00 00 <........{ }T....> 00000070 00 77 80 83 02 EF 8E 04 00 00 00 7A F4 8E 09 00 <.w.........z....> 00000080 00 00 7B 20 83 04 83 03 C1 8E 06 00 00 00 75 83 <..{ ..........u.> 00000090 05 6F 8E 06 00 00 00 7F 83 06 64 8E 0A 00 00 00 <.o........d.....> 000000A0 74 80 80 72 01 83 07 F7 90 0B 00 00 01 04 53 54 000000B0 41 52 00 00 08 1E A0 11 00 01 00 00 1E B8 00 00 000000C0 8E D8 55 8B EC 81 EC 00 00 D9 9C 04 00 C8 02 5D <..U............]> 000000D0 39 A0 07 00 01 0D 00 C4 5E 08 66 A0 07 00 01 10 <9.........f.....> 000000E0 00 26 8B 17 63 A0 07 00 01 13 00 26 8E 5F 60 A0 <....c.........`.> 000000F0 26 00 01 16 00 02 87 DA 8B 0F 8B 57 02 8E D8 87 <&........^.&..&.> 00000100 C1 90 A3 00 00 89 16 02 00 8B E5 5D 1F CA 04 00 00000110 8B E5 5D 1F CA 04 00 9A 9C 07 00 C4 0E 5D C4 12 <..]..........]..> 00000120 5D FB 8A 02 00 00 74 <].....t> End of star.hx echo sumsqrt.fn 1>&2 cat >sumsqrt.fn <<'End of sumsqrt.fn' subroutine nsubr(a,b,c) integer a,b real c c = 0.0 do 1 i=a,b c = sqrt(1.0*i) + c 1 continue return end End of sumsqrt.fn echo swrapup.f 1>&2 cat >swrapup.f <<'End of swrapup.f' C********************************************************************** C This subroutine will wrap up the completion of a spawned C process that has been run on the node. C The parameters are sent directly to the spawning node so the host C does not recieve them. Here the process is simply checked in and C the node is freed. subroutine swrapup(node, chan) parameter (mxprcs = 100) parameter (MAXBUFF = 4096) integer node, chan Common block for the node list common /nodeList/ nodeAvail, numAvail integer nodeAvail(0:64), numAvail C Data Structure containing the subroutine names and the paramaters character*25 IndexSubname(mxprcs) integer IndexParms(20,2,mxprcs) common /Index/IndexSubname,IndexParms integer nread, star C Local data integer buff(MAXBUFF),err, SCOMPLETE,cflag integer jobtag, i,j,offset, tmp SCOMPLETE = 98 C Read the buffer from the node err = nread(chan, buff, MAXBUFF*4, node, SCOMPLETE) node = buff(1) if (err .le. 0) then print *, 'Error in reading params from node ' print *, 'Node: ', node print *, 'Error: ',err stop endif C Now the process must be checked in jobtag = -1 * nodeAvail(node) C Free up the node for the next process nodeAvail(node) = 1 numAvail = numAvail + 1 call chekin(jobtag) return end C of swrapup End of swrapup.f echo wait.fn 1>&2 cat >wait.fn <<'End of wait.fn' subroutine wait(myid) integer myid parameter (MAXBUFF = 4096) C Global Declarations common /nwait/nwait integer nwait integer IndexParms(20,2,0:63) common /Index/IndexParms integer hid common /hid/hid C Local Defs integer buff(MAXBUFF), len,nread, COMPLETE,cflag,node integer offset,i,j, SCOMPLETE COMPLETE = 99 SCOMPLETE = 98 C if there are spawned processes to retun, read their params 10 if (nwait .gt. 0) then node = -1 len = nread(buff, 4*MAXBUFF,node,COMPLETE,cflag) C Restore params to memory C calculate initial offset offset = 20 i = 1 44 if (i.gt.20) goto 2 C do 2 i = 1,20 C Copy all of the data from the buffer to the parameter j = 1 34 if (j.gt.buff(i)) goto 1 c do 1 j = 1,tmp call lstar(IndexParms(i,1,node)+(j-1)*4, 1 buff(j+offset)) j = j + 1 goto 34 1 continue C Calculate next offset offset = offset + buff(i) i = i + 1 goto 44 2 continue C Notify the host of the child's completion len = 4 if(nwrite(node, len, hid, SCOMPLETE,cflag) .lt. 0) stop C Update nwait and check for more children nwait = nwait - 1 goto 10 endif return end End of wait.fn echo work.f 1>&2 cat >work.f <<'End of work.f' C********************************************************************** subroutine work(id,jobtag,chan) parameter (MAXBUFF = 4096, mxprcs = 100) implicit logical (a-z) integer id, jobtag,chan C This subroutine will run the process jobtag on the processor number id C Data Structure containing the subroutine names and the paramaters character*25 IndexSubname(mxprcs) integer IndexParms(20,2,mxprcs) common /Index/IndexSubname,IndexParms integer i, j, star, nload, nwrite,slen C Local data integer buff(MAXBUFF),err, PARAMS, offset, off integer SPDEST, dest PARAMS = 21 SPDEST = 131 C Load the process on the node err = nload(chan, 1 IndexSubname(jobtag)(1:slen(IndexSubname(jobtag))) 2 ,id, buff, MAXBUFF*4) if (err.ne.0) then print *, 'Error in nload :',err stop endif C if IndexParms is less than zero then the process is a spawned C one and the node must be sent to the spawning process if (IndexParms(1,2,jobtag) .lt. 0) then dest = -1*(IndexParms(1,2,jobtag)+1) err = nwrite(chan, id, 4, dest, SPDEST) if (err .ne. 0) then print *, 'Error in sending nodeaddr to node :',err stop endif RETURN endif C The buffer must be loaded with the parameters to the function. C The buffer will contain the lengths of each parameter and the C data making up each parameter in the following format C | l1 | l2 | ... | l20 | p1 ... | p2 ... | ... | P20 ... | C There may not always be 20 parameters but there will always be C 20 lengths. This is because fortran will not allow a variable C number of parameters in a subprogram call such as the one to be C generated on the node from this message. C Calculate initial offset offset = 20 do 10 i = 1,20 C Copy the length of the parameter into the buffer buff(i) = IndexParms(i,2,jobtag) C Copy all of the data into the buffer to be sent out do 5 j = 1,buff(i) buff(j + offset) = star(IndexParms(i,1,jobtag)) c 1 star(IndexParms(i,1,jobtag)+(j-1)*4),j+offset 5 continue C Calculate next offset offset = offset + buff(i) 10 continue err = nwrite(chan, buff, offset*4, id, PARAMS) if (err .ne. 0) then print *, 'Error in sending parameters :',err stop endif return end End of work.f echo wrapup.f 1>&2 cat >wrapup.f <<'End of wrapup.f' C********************************************************************** C This subroutine will wrap up the completion of a function C that has been run on the node. C It will read the parameters sent back from the node C and put them back in memory. The NodeAvail list will also C be updated. subroutine wrapup(node, chan) parameter (mxprcs = 100) parameter (MAXBUFF = 4096) integer node, chan Common block for the node list common /nodeList/ nodeAvail, numAvail integer nodeAvail(0:64), numAvail C Data Structure containing the subroutine names and the paramaters character*25 IndexSubname(mxprcs) integer IndexParms(20,2,mxprcs) common /Index/IndexSubname,IndexParms integer nread, star C Local data integer buff(MAXBUFF),err, COMPLETE,cflag integer jobtag, i,j,offset, tmp COMPLETE = 99 C Read the buffer from the node err = nread(chan, buff, MAXBUFF*4, node, COMPLETE) if (err .le. 0) then print *, 'Error in reading params from node ' print *, 'Node: ', node print *, 'Error: ',err stop endif C Now the buffer must be loaded back into memory jobtag = -1 * nodeAvail(node) C Calculate initial offset offset = 20 i = 1 44 if (i.gt.20) goto 2 c do 2 i = 1,20 C Copy all of the data from the buffer to the parameter j = 1 34 if (j.gt.buff(i)) goto 1 c do 1 j = 1,tmp call lstar(IndexParms(i,1,jobtag)+(j-1)*4,buff(j+offset)) j = j + 1 goto 34 1 continue C Calculate next offset offset = offset + buff(i) i = i + 1 goto 44 2 continue C Free up the node for the next process nodeAvail(node) = 1 numAvail = numAvail + 1 call chekin(jobtag) return end C of wrapup End of wrapup.f