Shell,并行运行四个进程

目前我被困在耗时的模拟运行效率。 目的是并行运行4个模拟,因为它是一个单线程应用程序和一个四核心系统。 我必须改变shell脚本:

./sim -r 1 & ./sim -r 2 & ./sim -r 3 & ./sim -r 4 & wait ./sim -r 5 & ./sim -r 6 & ./sim -r 7 & ./sim -r 8 & wait ... (another 112 jobs) 

这个代码有一个又一个的等待。 我也尝试将这些任务分成四个脚本并运行,结果是一个脚本已经完成,另一个脚本剩下的工作约占30%。 我无法预测模拟将花费多长时间。

任何build议有4个模拟运行在任何时候?

在Ubuntu中安装moreutils软件包,然后使用parallel工具:

 parallel -j 4 ./sim -r -- 1 2 3 4 5 6 7 8 ... 

如果你不想安装parallel工具(假设它按照指示工作,看起来很整洁),那么你可以调整这个Perl脚本(基本上,改变执行的命令),并可能减少监视:

 #!/usr/bin/env perl use strict; use warnings; use constant MAX_KIDS => 4; $| = 1; my %pids; my $kids = 0; # Number of kids for my $i (1..20) { my $pid; if (($pid = fork()) == 0) { my $tm = int(rand() * (10 - 2) + 2); print "sleep $tm\n"; # Using exec in a block on its own is the documented way to # avoid the warning: # Statement unlikely to be reached at filename.pl line NN. # (Maybe you meant system() when you said exec()?) # Yes, I know the print and exit statements should never be # reached, but, dammit, sometimes things go wrong! { exec "sleep", $tm; } print STDERR "Oops: couldn't sleep $tm!\n"; exit 1; } $pids{$pid} = 1; $kids++; my $time = time; print "PID: $pid; Kids: $kids; Time: $time\n"; if ($kids >= MAX_KIDS) { my $kid = waitpid(-1, 0); print "Kid: $kid ($?)\n"; if ($kid != -1) { delete $pids{$kid}; $kids--; } } } while ((my $kid = waitpid(-1, 0)) > 0) { my $time = time; print "Kid: $kid (Status: $?); Time: $time\n"; delete $pids{$kid}; $kids--; } # This should not do anything - and doesn't (any more!). foreach my $pid (keys %pids) { printf "Undead: $pid\n"; } 

示例输出:

 PID: 20152; Kids: 1; Time: 1383436882 PID: 20153; Kids: 2; Time: 1383436882 sleep 5 PID: 20154; Kids: 3; Time: 1383436882 sleep 7 sleep 9 PID: 20155; Kids: 4; Time: 1383436882 sleep 4 Kid: 20155 (0) PID: 20156; Kids: 4; Time: 1383436886 sleep 4 Kid: 20152 (0) PID: 20157; Kids: 4; Time: 1383436887 sleep 2 Kid: 20153 (0) PID: 20158; Kids: 4; Time: 1383436889 sleep 9 Kid: 20157 (0) PID: 20159; Kids: 4; Time: 1383436889 sleep 6 Kid: 20156 (0) PID: 20160; Kids: 4; Time: 1383436890 sleep 6 Kid: 20154 (0) PID: 20161; Kids: 4; Time: 1383436891 sleep 9 Kid: 20159 (0) PID: 20162; Kids: 4; Time: 1383436895 sleep 7 Kid: 20160 (0) PID: 20163; Kids: 4; Time: 1383436896 sleep 9 Kid: 20158 (0) PID: 20164; Kids: 4; Time: 1383436898 sleep 6 Kid: 20161 (0) PID: 20165; Kids: 4; Time: 1383436900 sleep 9 Kid: 20162 (0) PID: 20166; Kids: 4; Time: 1383436902 sleep 9 Kid: 20164 (0) PID: 20167; Kids: 4; Time: 1383436904 sleep 2 Kid: 20163 (0) PID: 20168; Kids: 4; Time: 1383436905 sleep 6 Kid: 20167 (0) PID: 20169; Kids: 4; Time: 1383436906 sleep 9 Kid: 20165 (0) PID: 20170; Kids: 4; Time: 1383436909 sleep 4 Kid: 20168 (0) PID: 20171; Kids: 4; Time: 1383436911 Kid: 20166 (0) sleep 9 Kid: 20170 (Status: 0); Time: 1383436913 Kid: 20169 (Status: 0); Time: 1383436915 Kid: 20171 (Status: 0); Time: 1383436920 
 NUMJOBS=30 NUMPOOLS=4 seq 1 "$NUMJOBS" | for p in $(seq 1 $NUMPOOLS); do while read x; do ./sim -r "$x"; done & done 

for循环创建一个后台进程池,从共享标准输入读取数据以开始模拟。 每个后台进程在模拟运行时都会“阻塞”,然后从seq命令中读取下一个工作号。

没有for循环,可能会更容易一些:

 seq 1 "$NUMJOBS" | { while read x; do ./sim -r "$x"; done & while read x; do ./sim -r "$x"; done & while read x; do ./sim -r "$x"; done & while read x; do ./sim -r "$x"; done & } 

假设sim需要花费不少的时间来运行,第一个从标准输入读取1,第二个2等。无论哪个sim首先完成, while循环将从标准输入中read 5。 下一个结束将读取6,依此类推。 一旦最后一次模拟开始,每次read都将失败,导致循环退出。