Python threading with multiple bash subprocesses?

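How do I use the threading and subprocess modules to spawn parallel bash processes? When I start threads as in the first answer here: How to use threading in Python?, the bash processes run sequentially instead of in parallel.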

You don't need threads to run subprocesses in parallel:

    from subprocess import Popen

    commands = [
        'date; ls -l; sleep 1; date',
        'date; sleep 5; date',
        'date; df -h; sleep 3; date',
        'date; hostname; sleep 2; date',
        'date; uname -a; date',
    ]
    # run in parallel
    processes = [Popen(cmd, shell=True) for cmd in commands]
    # do other things here..
    # wait for completion
    for p in processes:
        p.wait()

To limit the number of concurrent commands, you could use multiprocessing.dummy.Pool, which uses threads and provides the same interface as multiprocessing.Pool, which uses processes:

    from functools import partial
    from multiprocessing.dummy import Pool
    from subprocess import call

    pool = Pool(2)  # two concurrent commands at a time
    for i, returncode in enumerate(pool.imap(partial(call, shell=True), commands)):
        if returncode != 0:
            print("%d command failed: %d" % (i, returncode))

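This answer demonstrates various techniques for limiting the number of concurrent subprocesses: it shows multiprocessing.Pool, concurrent.futures, and threading + Queue-based solutions.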
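As an illustration of the concurrent.futures variant mentioned above, here is a minimal sketch. The helper name `run_all` and the sample commands are made up for this example and do not come from the linked answer; it assumes a Unix-like shell.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_all(commands, max_workers=2):
    """Run shell commands with at most max_workers running at once.

    Returns a list of (command, returncode) pairs in input order.
    """
    def run(cmd):
        # each worker thread simply blocks on its own child process
        return cmd, subprocess.call(cmd, shell=True)

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in the same order as the input commands
        return list(executor.map(run, commands))


if __name__ == "__main__":
    for cmd, returncode in run_all(["date; sleep 1; date", "exit 3", "date"]):
        if returncode != 0:
            print("%r failed: %d" % (cmd, returncode))
```

The threads spend their time waiting on child processes, so the GIL is not a bottleneck here; the pool size only caps how many commands run at the same time.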


You could limit the number of concurrent child processes without using a thread/process pool:

    from subprocess import Popen
    from itertools import islice

    max_workers = 2  # no more than 2 concurrent processes
    processes = (Popen(cmd, shell=True) for cmd in commands)
    running_processes = list(islice(processes, max_workers))  # start new processes
    while running_processes:
        for i, process in enumerate(running_processes):
            if process.poll() is not None:  # the process has finished
                running_processes[i] = next(processes, None)  # start new process
                if running_processes[i] is None:  # no new processes
                    del running_processes[i]
                    break

On Unix, you could avoid the busy loop and instead block on os.waitpid(-1, 0) to wait for any child process to exit.
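A rough sketch of that os.waitpid(-1, 0) approach follows. It is Unix-only, the names `run_limited` and `exit_code` are hypothetical, and it assumes the program has no other child processes, because waitpid(-1, ...) reaps any child of the current process.

```python
import os
from subprocess import Popen


def exit_code(status):
    """Decode the raw status from os.waitpid() into a Popen-style code."""
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)  # negative signal number, like Popen.returncode
    return os.WEXITSTATUS(status)


def run_limited(commands, max_workers=2):
    """Run shell commands, at most max_workers at a time, without polling.

    Returns (command, exit code) pairs in completion order.
    """
    pending = iter(commands)
    running = {}  # pid -> (command, Popen object)
    finished = []

    def start_next():
        cmd = next(pending, None)
        if cmd is not None:
            p = Popen(cmd, shell=True)
            running[p.pid] = (cmd, p)

    for _ in range(max_workers):
        start_next()

    while running:
        pid, status = os.waitpid(-1, 0)  # block until any child exits
        if pid not in running:
            continue  # a child that was started elsewhere in the program
        cmd, p = running.pop(pid)
        # the child is already reaped, so record the code on the Popen object
        p.returncode = exit_code(status)
        finished.append((cmd, p.returncode))
        start_next()  # fill the freed slot
    return finished


if __name__ == "__main__":
    print(run_limited(["sleep 1; exit 2", "date", "exit 5"]))
```

Because the loop sleeps inside waitpid() until a child exits, it uses no CPU while waiting, unlike the poll() loop above.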

A simple threading example:

    import queue        # named Queue in Python 2
    import subprocess   # replaces the Python 2 commands module
    import threading
    import time

    # thread class to run a command
    class ExampleThread(threading.Thread):
        def __init__(self, cmd, queue):
            threading.Thread.__init__(self)
            self.cmd = cmd
            self.queue = queue

        def run(self):
            # execute the command, queue the result
            (status, output) = subprocess.getstatusoutput(self.cmd)
            self.queue.put((self.cmd, output, status))

    # queue where results are placed
    result_queue = queue.Queue()

    # define the commands to be run in parallel, run them
    cmds = [
        'date; ls -l; sleep 1; date',
        'date; sleep 5; date',
        'date; df -h; sleep 3; date',
        'date; hostname; sleep 2; date',
        'date; uname -a; date',
    ]
    for cmd in cmds:
        thread = ExampleThread(cmd, result_queue)
        thread.start()

    # print results as we get them
    while threading.active_count() > 1 or not result_queue.empty():
        while not result_queue.empty():
            (cmd, output, status) = result_queue.get()
            print('%s:' % cmd)
            print(output)
            print('=' * 60)
        time.sleep(1)

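Note that there are better ways to do some of this, but this is not too complicated. The example uses one thread per command. Complexity starts to creep in when you want to use a limited number of threads to handle an unknown number of commands. Those more advanced techniques don't seem too complicated once you have a grasp of threading basics, and multiprocessing gets easier once you have a handle on those techniques.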
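For the "limited number of threads, unknown number of commands" case, one common pattern is a fixed set of worker threads fed from a queue. The following is only a sketch under that assumption; `worker` and `run_with_workers` are illustrative names, not part of the example above.

```python
import queue
import subprocess
import threading


def worker(tasks, results):
    """Pull commands from tasks until the None sentinel arrives."""
    while True:
        cmd = tasks.get()
        if cmd is None:
            break
        status, output = subprocess.getstatusoutput(cmd)
        results.put((cmd, output, status))


def run_with_workers(commands, num_threads=2):
    """Run commands on num_threads threads; return (cmd, output, status) tuples."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for cmd in commands:  # commands can be any iterable, e.g. a generator
        tasks.put(cmd)
    for _ in threads:     # one sentinel per worker so that each one stops
        tasks.put(None)
    for t in threads:
        t.join()
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected


if __name__ == "__main__":
    for cmd, output, status in run_with_workers(["date", "echo hello", "exit 4"]):
        print("%s -> %d" % (cmd, status))
        print(output)
```

Results arrive in completion order, not submission order, so each tuple carries the command it belongs to.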

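It runs that way because that is how it is supposed to behave: what you want to do is not multithreading but multiprocessing. See this Stack Overflow page.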