fortran - MPI_COMM_SPAWN causing a deadlock
I have an MPI program A that needs to spawn a different MPI program B and wait for it to finish. I then need to spawn program B and wait for it a second time.
Program A:

    if (rank .eq. 0) then
        call mpi_comm_spawn('prog_b', mpi_argv_null, size, &
                          & mpi_info_null, 0, mpi_comm_self, &
                          & child_comm, mpi_errcodes_ignore, status)
        write (*,*) 'parent 1 before'
        call mpi_barrier(child_comm, status)
        write (*,*) 'parent 1 after'

        ... change things ...

        call mpi_comm_spawn('prog_b', mpi_argv_null, size, &
                          & mpi_info_null, 0, mpi_comm_self, &
                          & child_comm, mpi_errcodes_ignore, status)
        write (*,*) 'parent 2 before'
        call mpi_barrier(child_comm, status)
        write (*,*) 'parent 2 after'
    end if
Program B:

    ... wait until finished ...

    call mpi_comm_get_parent(parent_comm, error)
    if (parent_comm .ne. mpi_comm_null) then
        write (*,*) 'before'
        call mpi_barrier(parent_comm, error)
        write (*,*) 'after'
    end if

    ... finalize ...
When I run this, the first spawning of program B works fine. On the second pass, both programs deadlock on the second barrier. I'm spawning 16 instances of program B each time.
Output:

    parent before 1
    ... output of program b ...
    before
    before
    before
    before
    before
    before
    before
    before
    before
    before
    before
    before
    before
    after
    before
    after
    before
    after
    after
    after
    after
    after
    before
    after
    after
    parent after 1
    after
    after
    after
    after
    after
    after
    ... second call to spawn ...
    parent before 2
    ... output of program b ...
    before
    before
    before
    before
    before
    before
    before
    before
    before
    before
    before
    before
    before
    before
    before
    before
As you can see, every process makes it past the barrier the first time, but the second time they all deadlock. I have tried disconnecting the parent and child comms after the first spawn call. I have also tried merging the parent and child comms and calling the barrier on the merged communicator, but nothing seems to fix the deadlock issue.
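For reference, here is a minimal sketch of what the disconnect variant described above might look like. It is not the original code and it is not claimed to resolve the deadlock. To keep it self-contained, it assumes a single executable that spawns copies of itself instead of a separate prog_b. The program name spawn_twice, the variables nchildren, self_path and pass, and the use of get_command_argument(0, ...) to find the executable are all illustrative assumptions. How the command name is resolved by MPI_Comm_spawn depends on the MPI implementation.

```fortran
! Hypothetical sketch: spawn 16 children twice, disconnecting the
! intercommunicator on both sides after each barrier.
program spawn_twice
    use mpi
    implicit none
    integer, parameter :: nchildren = 16      ! assumed, matches the 16 instances in the question
    integer :: ierr, rank, pass
    integer :: parent_comm, child_comm
    character(len=1024) :: self_path          ! assumed: path used to re-launch this executable

    call MPI_Init(ierr)
    call MPI_Comm_get_parent(parent_comm, ierr)

    if (parent_comm /= MPI_COMM_NULL) then
        ! Child role (stands in for program B).
        write (*,*) 'before'
        call MPI_Barrier(parent_comm, ierr)
        write (*,*) 'after'
        ! Collective over the intercommunicator: the parent must call it too.
        call MPI_Comm_disconnect(parent_comm, ierr)
    else
        ! Parent role (stands in for program A).
        call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
        if (rank == 0) then
            call get_command_argument(0, self_path)
            do pass = 1, 2
                call MPI_Comm_spawn(trim(self_path), MPI_ARGV_NULL, nchildren, &
                                    MPI_INFO_NULL, 0, MPI_COMM_SELF, child_comm, &
                                    MPI_ERRCODES_IGNORE, ierr)
                write (*,*) 'parent', pass, 'before'
                call MPI_Barrier(child_comm, ierr)
                write (*,*) 'parent', pass, 'after'
                ! Release the intercommunicator before spawning again.
                call MPI_Comm_disconnect(child_comm, ierr)
            end do
        end if
    end if

    call MPI_Finalize(ierr)
end program spawn_twice
```

The merge variant would instead call MPI_Intercomm_merge on both sides and run the barrier on the resulting intracommunicator. In both variants the key constraint is that the parent and every child make the same collective calls in the same order.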