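How do I wait in bash for several subprocesses to finish, and return exit code != 0 when any subprocess ends with code != 0?

How do I wait, in a bash script, for several subprocesses spawned from that script to finish, and return exit code != 0 when any of the subprocesses ends with code != 0?

Simple script: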

    #!/bin/bash
    for i in `seq 0 9`; do
      doCalculations $i &
    done
    wait

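The above script will wait for all 10 spawned subprocesses, but it will always give exit status 0 (see help wait). How can I modify this script so that it discovers the exit statuses of the spawned subprocesses and returns exit code 1 when any of them ends with code != 0?

Is there a better solution than collecting the PIDs of the subprocesses, waiting for them in order, and summing up the exit statuses?

wait also (optionally) takes the PID of the process to wait for, and with $! you get the PID of the last command launched in the background. Modify the loop to store the PID of each spawned subprocess in an array, and then loop again, waiting on each PID.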
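A minimal sketch of that approach follows. The helper name run_all is hypothetical, and `bash -c "$cmd"` only stands in for doCalculations; neither comes from the question.

```shell
#!/usr/bin/env bash
# Sketch: start each command in the background, remember its PID in an
# array, then wait on every PID and report failure if any child failed.
# run_all is a hypothetical helper name, not something bash provides.
run_all() {
  local pids=() pid failed=0 cmd
  for cmd in "$@"; do
    bash -c "$cmd" &          # launch in the background
    pids+=("$!")              # $! is the PID of the last background command
  done
  for pid in "${pids[@]}"; do
    wait "$pid" || failed=1   # wait PID returns that child's exit status
  done
  return "$failed"
}
```

Called as `run_all 'sleep 1' 'exit 3' 'true'`, the function returns 1 because one of the three children exited with a non-zero status.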

http://jeremy.zawodny.com/blog/archives/010717.html:

    #!/bin/bash

    FAIL=0

    echo "starting"

    ./sleeper 2 0 &
    ./sleeper 2 1 &
    ./sleeper 3 0 &
    ./sleeper 2 0 &

    for job in `jobs -p`
    do
        echo $job
        wait $job || let "FAIL+=1"
    done

    echo $FAIL

    if [ "$FAIL" == "0" ]; then
        echo "YAY!"
    else
        echo "FAIL! ($FAIL)"
    fi

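If you have GNU Parallel installed you can do:

    seq 0 9 | parallel doCalculations {}

GNU Parallel will give you the exit code:

0: All jobs ran without error.
1-253: Some of the jobs failed. The exit status gives the number of failed jobs.
254: More than 253 jobs failed.
255: Other error.

Watch the intro videos to learn more: http://pi.dk/1

10-second installation:

    wget -O - pi.dk/3 | sh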

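Here is what I've come up with so far. I would like to see how to interrupt the sleep command when a child terminates, so that one would not have to tune WAITALL_DELAY to one's usage.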

    waitall() { # PID...
      ## Wait for children to exit and indicate whether all exited with 0 status.
      local errors=0
      while :; do
        debug "Processes remaining: $*"
        for pid in "$@"; do
          shift
          if kill -0 "$pid" 2>/dev/null; then
            debug "$pid is still alive."
            set -- "$@" "$pid"
          elif wait "$pid"; then
            debug "$pid exited with zero exit status."
          else
            debug "$pid exited with non-zero exit status."
            ((++errors))
          fi
        done
        (("$#" > 0)) || break
        # TODO: how to interrupt this sleep when a child terminates?
        sleep ${WAITALL_DELAY:-1}
      done
      ((errors == 0))
    }

    debug() { echo "DEBUG: $*" >&2; }

    pids=""
    for t in 3 5 4; do
      sleep "$t" &
      pids="$pids $!"
    done
    waitall $pids

Simply:

    #!/bin/bash

    pids=""

    for i in `seq 0 9`; do
       doCalculations $i &
       pids="$pids $!"
    done

    wait $pids

    ...code continued here ...

Update:

As several commenters pointed out, the above waits for all processes to complete before continuing, but it does not exit with a failure if one of them fails. It can be made to do so with the following modification, suggested by @Bryan, @SBrBrightman, and others:

    #!/bin/bash

    pids=""
    RESULT=0

    for i in `seq 0 9`; do
       doCalculations $i &
       pids="$pids $!"
    done

    for pid in $pids; do
        wait $pid || let "RESULT=1"
    done

    if [ "$RESULT" == "1" ]; then
        exit 1
    fi

    ...code continued here ...

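Here is a simple example using wait.

Run some processes:

    $ sleep 10 &
    $ sleep 10 &
    $ sleep 20 &
    $ sleep 20 &

Then wait for them with the wait command:

    $ wait < <(jobs -p)

Or just wait (without arguments).

This will wait for all jobs in the background to complete.

If the -n option is supplied, wait waits for the next job to terminate and returns its exit status.

See help wait and help jobs for the syntax.

The downside, however, is that this returns only the status of the last ID, so you need to check the status of each subprocess and store it in a variable.

Or make your calculation function create a file on failure (empty, or containing a failure log), then check whether that file exists, e.g.

    $ sleep 20 && true || tee fail &
    $ sleep 20 && false || tee fail &
    $ wait < <(jobs -p)
    $ test -f fail && echo Calculation failed.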
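For the first option above (checking each subprocess and storing its status in a variable), a small sketch could look like the following. The helper collect_statuses and the global array STATUSES are hypothetical names introduced only for illustration, and `bash -c` stands in for the real jobs.

```shell
#!/usr/bin/env bash
# Sketch: record every child's exit status instead of only the last one.
# collect_statuses is a hypothetical helper; it fills the global array
# STATUSES with one exit status per command, in launch order.
STATUSES=()
collect_statuses() {
  local pids=() i cmd
  STATUSES=()
  for cmd in "$@"; do
    bash -c "$cmd" &
    pids+=("$!")
  done
  for i in "${!pids[@]}"; do
    # Store 0 on success, otherwise the non-zero status of this child.
    wait "${pids[$i]}" && STATUSES[$i]=0 || STATUSES[$i]=$?
  done
}

collect_statuses 'exit 0' 'exit 2' 'exit 7'
echo "statuses: ${STATUSES[*]}"
```

With the three stand-in commands shown, the final line prints `statuses: 0 2 7`.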

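To parallelize this…

    for i in $(whatever_list) ; do
       do_something $i
    done

Translate it to this…

    for i in $(whatever_list) ; do echo $i ; done | ## execute in parallel...
       (
       export -f do_something ## export functions (if needed)
       export PATH ## export any variables that are required
       xargs -I{} --max-procs 0 bash -c ' ## process in batches...
          {
          echo "processing {}" ## optional
          do_something {}
          }'
       )

If an error occurs in one process, it won't interrupt the other processes, but it will result in a non-zero exit code from the sequence as a whole.
Exporting functions and variables may or may not be necessary in any particular case.
You can set --max-procs based on how much parallelism you want (0 means "all at once").
GNU Parallel offers some additional features when used in place of xargs, but it isn't always installed by default.
The for loop isn't strictly necessary in this example, since echo $i basically just regenerates the output of $(whatever_list). I just think the use of the for keyword makes it a little easier to see what is going on.
Bash string handling can be confusing. I have found that single quotes work best for wrapping non-trivial scripts.
You can easily interrupt the entire operation (using ^C or similar), unlike the more direct approach to Bash parallelism.

Here is a simplified working example…

    for i in {0..5} ; do echo $i ; done |xargs -I{} --max-procs 2 bash -c '
       {
       echo sleep {}
       sleep 2s
       }'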
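The claim that one failing process yields a non-zero exit code for the whole sequence can be checked with a small experiment. This is only a sketch: run_batch is a hypothetical helper, `sh -c 'exit {}'` stands in for do_something, and the short option -P is used as the equivalent of --max-procs. GNU xargs documents exit status 123 when any invocation exits with status 1-125; other xargs implementations may use a different non-zero value, so the sketch only tests for success or failure.

```shell
#!/usr/bin/env bash
# Sketch: xargs reports failure for the whole batch when any invocation
# exits non-zero. Each input line becomes the exit status of one job.
# run_batch is a hypothetical helper name used for illustration.
run_batch() {
  printf '%s\n' "$@" | xargs -I{} -P 0 sh -c 'exit {}'
}

if run_batch 0 0 0; then echo "all succeeded"; fi
if ! run_batch 0 1 0; then echo "at least one failed"; fi
```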

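I see lots of good examples listed here, and wanted to throw mine in as well.

    #! /bin/bash

    items="1 2 3 4 5 6"
    pids=""

    for item in $items; do
        sleep $item &
        pids+="$! "
    done

    for pid in $pids; do
        wait $pid
        status=$?   # save it: $? is overwritten by the [ ... ] test below
        if [ $status -eq 0 ]; then
            echo "SUCCESS - Job $pid exited with a status of $status"
        else
            echo "FAILED - Job $pid exited with a status of $status"
        fi
    done

I use something very similar to start and stop servers and services in parallel and check each exit status. It works great for me. Hope this helps someone out!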

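I don't believe it's possible with Bash's builtin functionality.

You can get notified when a child exits:

    #!/bin/sh
    set -o monitor        # enable script job control
    trap 'echo "child died"' CHLD

However, there is no apparent way to get the child's exit status in the signal handler.

Getting that child status is usually the job of the wait family of functions in the lower-level POSIX APIs. Unfortunately Bash's support for that is limited: you can wait for one specific subprocess (and get its exit status), or you can wait for all of them and always get a 0 result.

What appears impossible is the equivalent of waitpid(-1), which blocks until any child process returns.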
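The -n option of wait quoted elsewhere on this page comes close to waitpid(-1) in newer shells. The sketch below assumes bash 4.3 or later (where `wait -n` was added); first_failure is a hypothetical helper name and `bash -c` stands in for the real jobs. It returns as soon as any child fails, without polling.

```shell
#!/usr/bin/env bash
# Sketch (assumes bash 4.3+, where `wait -n` exists): block until any one
# background job finishes, and stop at the first failure.
# first_failure is a hypothetical helper name.
first_failure() {
  local cmd running=0 status
  for cmd in "$@"; do
    bash -c "$cmd" &
    running=$((running + 1))
  done
  while [ "$running" -gt 0 ]; do
    # wait -n returns the exit status of whichever job finished next.
    if wait -n; then status=0; else status=$?; fi
    running=$((running - 1))
    if [ "$status" -ne 0 ]; then
      return "$status"      # a child failed; the other jobs keep running
    fi
  done
  return 0
}
```

Note that `wait -n` reports a status but not which PID produced it, so this variant suits the "fail if anything failed" case rather than per-job bookkeeping.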

The following code will wait for all calculations to complete and return exit status 1 if any doCalculations fails.

    #!/bin/bash
    for i in $(seq 0 9); do
       (doCalculations $i >&2 & wait %1; echo $?) &
    done | grep -qv 0 && exit 1

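If you have bash 4.2 or later available, the following might be useful to you. It uses associative arrays to store task names and their "code", as well as task names and their pids. I have also built in a simple rate-limiting method, which may come in handy if your tasks consume a lot of CPU or I/O time and you want to limit the number of concurrent tasks.

The script launches all tasks in the first loop and consumes the results in the second one.

This is a bit overkill for simple cases, but it allows for pretty neat things. For example, one can store error messages for each task in another associative array and print them after everything has settled down.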

    #! /bin/bash

    main () {
        local -A pids=()
        local -A tasks=([task1]="echo 1"
                        [task2]="echo 2"
                        [task3]="echo 3"
                        [task4]="false"
                        [task5]="echo 5"
                        [task6]="false")
        local max_concurrent_tasks=2

        for key in "${!tasks[@]}"; do
            while [ $(jobs 2>&1 | grep -c Running) -ge "$max_concurrent_tasks" ]; do
                sleep 1 # gnu sleep allows floating point here...
            done
            ${tasks[$key]} &
            pids+=(["$key"]="$!")
        done

        errors=0
        for key in "${!tasks[@]}"; do
            pid=${pids[$key]}
            local cur_ret=0
            if [ -z "$pid" ]; then
                echo "No Job ID known for the $key process" # should never happen
                cur_ret=1
            else
                wait $pid
                cur_ret=$?
            fi
            if [ "$cur_ret" -ne 0 ]; then
                errors=$(($errors + 1))
                echo "$key (${tasks[$key]}) failed."
            fi
        done

        return $errors
    }

    main
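The rate limiting in the script above polls jobs once per second. As a variation (not the author's method), a shell with `wait -n` (bash 4.3+, an assumption here) can block until a slot frees up instead. run_limited is a hypothetical helper, and `bash -c` stands in for the task commands.

```shell
#!/usr/bin/env bash
# Sketch (assumes bash 4.3+ for `wait -n`): run commands with at most
# $1 of them in flight, and return the number of commands that failed.
# run_limited is a hypothetical helper name.
run_limited() {
  local max=$1 running=0 failures=0 cmd
  shift
  for cmd in "$@"; do
    if [ "$running" -ge "$max" ]; then
      wait -n || failures=$((failures + 1))   # block until one slot frees up
      running=$((running - 1))
    fi
    bash -c "$cmd" &
    running=$((running + 1))
  done
  while [ "$running" -gt 0 ]; do              # drain the remaining jobs
    wait -n || failures=$((failures + 1))
    running=$((running - 1))
  done
  return "$failures"
}
```

For example, `run_limited 2 'sleep 1' 'false' 'sleep 1'` keeps at most two jobs running and returns 1, the count of failed commands. Like the original, it uses the return value as a counter, so counts above 255 would wrap.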

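Here is my version. It works for multiple pids, logs a warning if execution takes too long, and stops the subprocesses if execution takes longer than a given value.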

    function WaitForTaskCompletion {
        local pids="${1}" # pids to wait for, separated by semi-colon
        local soft_max_time="${2}" # If execution takes longer than $soft_max_time seconds, will log a warning, unless $soft_max_time equals 0.
        local hard_max_time="${3}" # If execution takes longer than $hard_max_time seconds, will stop execution, unless $hard_max_time equals 0.
        local caller_name="${4}" # Who called this function
        local exit_on_error="${5:-false}" # Should the function exit program on subprocess errors

        Logger "${FUNCNAME[0]} called by [$caller_name]."

        local soft_alert=0 # Does a soft alert need to be triggered, if yes, send an alert once
        local log_ttime=0 # local time instance for comparison
        local seconds_begin=$SECONDS # Seconds since the beginning of the script
        local exec_time=0 # Seconds since the beginning of this function
        local retval=0 # return value of monitored pid process
        local errorcount=0 # Number of pids that finished with errors
        local pidCount # number of given pids

        IFS=';' read -a pidsArray <<< "$pids"
        pidCount=${#pidsArray[@]}

        while [ ${#pidsArray[@]} -gt 0 ]; do
            newPidsArray=()
            for pid in "${pidsArray[@]}"; do
                if kill -0 $pid > /dev/null 2>&1; then
                    newPidsArray+=($pid)
                else
                    wait $pid
                    result=$?
                    if [ $result -ne 0 ]; then
                        errorcount=$((errorcount+1))
                        Logger "${FUNCNAME[0]} called by [$caller_name] finished monitoring [$pid] with exitcode [$result]."
                    fi
                fi
            done

            ## Log a standby message every hour
            exec_time=$(($SECONDS - $seconds_begin))
            if [ $((($exec_time + 1) % 3600)) -eq 0 ]; then
                if [ $log_ttime -ne $exec_time ]; then
                    log_ttime=$exec_time
                    Logger "Current tasks still running with pids [${pidsArray[@]}]."
                fi
            fi

            if [ $exec_time -gt $soft_max_time ]; then
                if [ $soft_alert -eq 0 ] && [ $soft_max_time -ne 0 ]; then
                    Logger "Max soft execution time exceeded for task [$caller_name] with pids [${pidsArray[@]}]."
                    soft_alert=1
                    SendAlert # not defined in this snippet; supply your own
                fi
                if [ $exec_time -gt $hard_max_time ] && [ $hard_max_time -ne 0 ]; then
                    Logger "Max hard execution time exceeded for task [$caller_name] with pids [${pidsArray[@]}]. Stopping task execution."
                    kill -SIGTERM $pid
                    if [ $? == 0 ]; then
                        Logger "Task stopped successfully"
                    else
                        errorcount=$((errorcount+1))
                    fi
                fi
            fi

            pidsArray=("${newPidsArray[@]}")
            sleep 1
        done

        Logger "${FUNCNAME[0]} ended for [$caller_name] using [$pidCount] subprocesses with [$errorcount] errors."
        if [ $exit_on_error == true ] && [ $errorcount -gt 0 ]; then
            Logger "Stopping execution."
            exit 1337
        else
            return $errorcount
        fi
    }

    # Just a plain stupid logging function to replace with yours
    function Logger {
        local value="${1}"
        echo $value
    }

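Example: wait for all three processes to finish, log a warning if execution takes longer than 5 seconds, and stop all processes if execution takes longer than 120 seconds. Don't exit the program on failure.

    function something {
        sleep 10 &
        pids="$!"
        sleep 12 &
        pids="$pids;$!"
        sleep 9 &
        pids="$pids;$!"

        WaitForTaskCompletion $pids 5 120 ${FUNCNAME[0]} false
    }

    # Launch the function
    something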

Just store the results outside the shell, e.g. in a file.

    #!/bin/bash
    tmp=/tmp/results

    : > $tmp  #clean the file

    for i in `seq 0 9`; do
      (doCalculations $i; echo $i:$?>>$tmp)&
    done      #iterate

    wait      #wait until all ready

    sort $tmp | grep -v ':0'  #... handle as required

I've had a go at this and combined the best parts of all the other examples here. This script executes the checkpids function whenever any background process exits, and outputs the exit status without resorting to polling.

    #!/bin/bash

    set -o monitor

    sleep 2 &
    sleep 4 && exit 1 &
    sleep 6 &

    pids=`jobs -p`

    checkpids() {
        for pid in $pids; do
            if kill -0 $pid 2>/dev/null; then
                echo $pid is still alive.
            elif wait $pid; then
                echo $pid exited with zero exit status.
            else
                echo $pid exited with non-zero exit status.
            fi
        done
        echo
    }

    trap checkpids CHLD

    wait

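I've just been modifying a script to background and parallelise a process.

I did some experimenting (on Solaris, with both bash and ksh) and discovered that 'wait' outputs the exit status if it is not zero, or a list of the jobs that returned a non-zero exit when no PID argument is provided. E.g.

Bash:

    $ sleep 20 && exit 1 &
    $ sleep 10 && exit 2 &
    $ wait
    [1]-  Exit 2                  sleep 20 && exit 2
    [2]+  Exit 1                  sleep 10 && exit 1

Ksh:

    $ sleep 20 && exit 1 &
    $ sleep 10 && exit 2 &
    $ wait
    [1]+  Done(2)                  sleep 20 && exit 2
    [2]+  Done(1)                  sleep 10 && exit 1

This output is written to stderr, so a simple solution to the OP's example could be: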

    #!/bin/bash

    trap "rm -f /tmp/x.$$" EXIT

    for i in `seq 0 9`; do
      doCalculations $i &
    done

    wait 2> /tmp/x.$$
    # read the file via stdin so wc prints only the count, not the filename
    if [ `wc -l < /tmp/x.$$` -gt 0 ] ; then
      exit 1
    fi

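While this:

    wait 2> >(wc -l)

will also return a count, but without the tmp file. It might also be used this way, for example:

    wait 2> >(if [ `wc -l` -gt 0 ] ; then echo "ERROR"; fi)

But this isn't much more useful than the tmp file, IMO. I couldn't find a useful way to avoid the tmp file while also avoiding running "wait" in a subshell, which won't work at all.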

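    #!/bin/bash
    set -m
    for i in `seq 0 9`; do
      doCalculations $i &
    done
    while fg; do true; done

set -m allows you to use fg and bg in a script.
fg, in addition to bringing the last process to the foreground, has the same exit status as the process it foregrounds.
while fg will stop looping when any fg exits with a non-zero exit status.

Unfortunately, this won't handle the case where a process in the background exits with a non-zero exit status. (The loop won't terminate immediately; it will wait for the earlier processes to complete.)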

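Traps are your friend. You can trap on ERR on a lot of systems. You can trap EXIT, or trap DEBUG to run a piece of code around every command (in bash the DEBUG trap fires before each simple command).

This is in addition to all the standard signals.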

There are already a lot of answers here, but I am surprised that no one seems to have suggested using arrays. So here is what I did; it might be useful to some in the future.

    n=10 # run 10 jobs
    c=0
    PIDS=()

    while true

        my_function_or_command &
        PID=$!
        echo "Launched job as PID=$PID"
        PIDS+=($PID)

        (( c+=1 ))

        # required to prevent any exit due to error
        # caused by additional commands run which you
        # may add when modifying this example
        true

    do

        if (( c < n ))
        then
            continue
        else
            break
        fi
    done

    # collect launched jobs

    for pid in "${PIDS[@]}"
    do
        wait $pid || echo "failed job PID=$pid"
    done

This works, and should be just as good as, if not better than, @HoverHell's answer!

    #!/usr/bin/env bash

    set -m # allow for job control
    EXIT_CODE=0;  # exit code of overall script

    function foo() {
         echo "CHLD exit code is $1"
         echo "CHLD pid is $2"
         echo $(jobs -l)

         for job in `jobs -p`; do
             echo "PID => ${job}"
             # group the echo and the assignment so EXIT_CODE is set only on failure
             wait ${job} || { echo "At least one test failed with exit code => $?" ; EXIT_CODE=1; }
         done
    }

    trap 'foo $? $$' CHLD

    DIRN=$(dirname "$0");

    commands=(
        "{ echo "foo" && exit 4; }"
        "{ echo "bar" && exit 3; }"
        "{ echo "baz" && exit 5; }"
    )

    clen=`expr "${#commands[@]}" - 1` # get length of commands - 1

    for i in `seq 0 "$clen"`; do
        (echo "${commands[$i]}" | bash) &   # run the command via bash in subshell
        echo "$i ith command has been issued as a background job"
    done

    # wait for all to finish
    wait;

    echo "EXIT_CODE => $EXIT_CODE"
    exit "$EXIT_CODE"

    # end

I used this recently (thanks to Alnitak):

    #!/bin/bash
    # activate child monitoring
    set -o monitor

    # locking subprocess
    (while true; do sleep 0.001; done) &
    pid=$!

    # count, and kill when all done
    c=0
    function kill_on_count() {
        # you could kill on whatever criterion you wish for
        # I just counted to simulate bash's wait with no args
        [ $c -eq 9 ] && kill $pid
        c=$((c+1))
        echo -n '.' # async feedback (but you don't know which one)
    }
    trap "kill_on_count" CHLD

    function save_status() {
        local i=$1;
        local rc=$2;
        # do whatever, and here you know which one stopped
        # but remember, you're called from a subshell
        # so vars have their values at fork time
    }

    # care must be taken not to spawn more than one child per loop
    # eg don't use `seq 0 9` here!
    for i in {0..9}; do
        (doCalculations $i; save_status $i $?) &
    done

    # wait for locking subprocess to be killed
    wait $pid
    echo

From there one can easily extrapolate: add a trigger (touch a file, send a signal) and change the counting criterion (count the touched files, or whatever) to respond to that trigger. Or, if you just want 'any' non-zero rc, simply kill the lock from save_status.

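I needed this, but the target process was not a child of the current shell, in which case wait $PID doesn't work. I did find the following alternative instead:

    while [ -e /proc/$PID ]; do sleep 0.1 ; done

That relies on the presence of procfs, which may not be available (Mac doesn't provide it, for example). So for portability, you could use this instead:

    while ps -p $PID >/dev/null ; do sleep 0.1 ; done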
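A variant of the same polling idea with a timeout is sketched below. It swaps in `kill -0` (used by other answers on this page) for ps, and wait_for_pid is a hypothetical helper name. As with the loops above, it can tell only that the process is gone, not what its exit status was.

```shell
#!/usr/bin/env bash
# Sketch: poll a PID that is not a child of this shell until it is gone
# or a timeout (in seconds) expires. wait_for_pid is a hypothetical helper.
# kill -0 sends no signal; it only checks that the PID exists and that we
# are allowed to signal it, so it cannot report the exit status.
wait_for_pid() {
  local pid=$1 timeout=${2:-60} deadline
  deadline=$((SECONDS + timeout))
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$SECONDS" -ge "$deadline" ]; then
      return 1     # still alive when the timeout expired
    fi
    sleep 0.1
  done
  return 0         # process no longer exists (or is not ours to signal)
}
```

For instance, `wait_for_pid "$PID" 30` returns 0 once the process has disappeared and 1 if it is still running after roughly 30 seconds.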

Trapping the CHLD signal may not work, because you can lose some signals if they arrive simultaneously.

    #!/bin/bash

    trap 'rm -f $tmpfile' EXIT

    tmpfile=$(mktemp)

    doCalculations() {
        echo start job $i...
        sleep $((RANDOM % 5))
        echo ...end job $i
        exit $((RANDOM % 10))
    }

    number_of_jobs=10

    for i in $( seq 1 $number_of_jobs )
    do
        ( trap "echo job$i : exit value : \$? >> $tmpfile" EXIT; doCalculations ) &
    done

    wait

    i=0
    while read res; do
        echo "$res"
        let i++
    done < "$tmpfile"

    echo $i jobs done !!!

    set -e
    fail () {
        touch .failure
    }
    expect () {
        wait
        if [ -f .failure ]; then
            rm -f .failure
            exit 1
        fi
    }

    sleep 2 || fail &
    sleep 2 && false || fail &
    sleep 2 || fail

    expect

The set -e at the top makes your script stop on failure.

expect will return 1 if any subjob failed.

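I'm thinking maybe run doCalculations; echo "$?" >> /tmp/acc in a subshell that is sent to the background, then wait; /tmp/acc would then contain the exit statuses, one per line. I don't know about any consequences of multiple processes appending to the accumulator file, though.

Here is a trial of this suggestion:

File: doCalculations

    #!/bin/sh

    random -e 20
    sleep $?
    random -e 10

File: try

    #!/bin/sh

    rm /tmp/acc

    for i in $( seq 0 20 )
    do
            ( ./doCalculations "$i"; echo "$?" >>/tmp/acc ) &
    done

    wait

    cat /tmp/acc | fmt
    rm /tmp/acc

Output of running ./try

    5 1 9 6 8 1 2 0 9 6 5 9 6 0 0 4 9 5 5 9 8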
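One way to sidestep the open question about several processes appending to the same accumulator file is to give each job its own status file. This is only a sketch under that assumption: run_with_status_files is a hypothetical helper, and `bash -c` stands in for ./doCalculations.

```shell
#!/usr/bin/env bash
# Sketch: give every job its own status file so that no two processes
# ever write to the same file. run_with_status_files is a hypothetical
# helper; it prints "index:status" lines and fails if any status is non-zero.
run_with_status_files() {
  local dir i=0 cmd f status failed=0
  dir=$(mktemp -d) || return 2
  for cmd in "$@"; do
    ( bash -c "$cmd"; echo "$?" > "$dir/$i" ) &
    i=$((i + 1))
  done
  wait                          # every subshell has written its file by now
  for f in "$dir"/*; do
    status=$(cat "$f")
    echo "${f##*/}:$status"
    [ "$status" -eq 0 ] || failed=1
  done
  rm -rf "$dir"
  return "$failed"
}
```

For example, `run_with_status_files 'exit 0' 'exit 5'` prints `0:0` and `1:5` and returns 1.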
