Python multithreading: wait for all threads to finish

This has probably been asked in a similar context before, but it took me more than 20 minutes of searching to find an answer, so I'll ask anyway.

I have written one Python script (call it scriptA.py) and another script (call it scriptB.py).

In scriptB, I want to call scriptA multiple times with different arguments. Each run takes about an hour (it is a huge script that does a lot of things, don't worry about it). I want to be able to run scriptA with all the different sets of arguments simultaneously, but I need to wait until all of them are done before continuing. My code:

    import subprocess

    #setup
    do_setup()

    #run scriptA
    subprocess.call(scriptA + argumentsA)
    subprocess.call(scriptA + argumentsB)
    subprocess.call(scriptA + argumentsC)

    #finish
    do_finish()

I want to run all of the subprocess.call() invocations at the same time and then wait until they have all finished. How should I do that?

I tried to use threading, like this:

    from threading import Thread
    import subprocess

    def call_script(args):
        subprocess.call(args)

    #run scriptA
    t1 = Thread(target=call_script, args=(scriptA + argumentsA,))
    t2 = Thread(target=call_script, args=(scriptA + argumentsB,))
    t3 = Thread(target=call_script, args=(scriptA + argumentsC,))
    t1.start()
    t2.start()
    t3.start()

But I don't think this is right. How do I know that they have all finished running before I move on to my do_finish()?

You need to use the join method of the Thread object at the end of the script:

    t1 = Thread(target=call_script, args=(scriptA + argumentsA,))
    t2 = Thread(target=call_script, args=(scriptA + argumentsB,))
    t3 = Thread(target=call_script, args=(scriptA + argumentsC,))
    t1.start()
    t2.start()
    t3.start()
    t1.join()
    t2.join()
    t3.join()

So the main thread will wait until t1, t2, and t3 have finished executing.
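To see what join guarantees, here is a minimal self-contained demo of the start/join pattern. The worker function, names, and short time.sleep delays are stand-ins for the hour-long scriptA runs:

```python
import threading
import time

results = []

def worker(name, delay):
    # Simulate a long-running job (stand-in for calling scriptA)
    time.sleep(delay)
    results.append(name)

t1 = threading.Thread(target=worker, args=("A", 0.2))
t2 = threading.Thread(target=worker, args=("B", 0.1))
t1.start()
t2.start()
t1.join()  # blocks until t1 finishes
t2.join()  # blocks until t2 finishes

# Both workers are guaranteed to have run to completion by now
print(sorted(results))
```

Calling join() after both start() calls means the two jobs still run concurrently; the joins only delay the main thread.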

Put the threads in a list and then use the join method:

    threads = []

    t = Thread(...)
    threads.append(t)

    # ...repeat as often as necessary...

    # Start all threads
    for x in threads:
        x.start()

    # Wait for all of them to finish
    for x in threads:
        x.join()

I prefer to use a list comprehension based on an input list:

    inputs = [scriptA + argumentsA, scriptA + argumentsB, ...]
    threads = [Thread(target=call_script, args=(i,)) for i in inputs]
    [t.start() for t in threads]
    [t.join() for t in threads]

In Python 3, since Python 3.2 there is a new approach to achieve the same result, which I personally prefer to the traditional thread create/start/join: the concurrent.futures package: https://docs.python.org/3/library/concurrent.futures.html

Using a ThreadPoolExecutor, the code would be:

    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    def call_script(arg):
        subprocess.call(scriptA + arg)

    args = [argumentsA, argumentsB, argumentsC]

    with ThreadPoolExecutor(max_workers=2) as executor:
        for arg in args:
            executor.submit(call_script, arg)

    print('All tasks have been finished')

One of the advantages is that you can control the throughput by setting the maximum number of concurrent workers.
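A runnable sketch of the same idea, with time.sleep standing in for the real subprocess work, and as_completed used to collect each job's result as it finishes (the argument names and delay are illustrative):

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def call_script(arg):
    # Stand-in for subprocess.call(scriptA + arg)
    time.sleep(0.1)
    return arg

args = ["A", "B", "C", "D"]

# max_workers=2 means at most two jobs run concurrently;
# the remaining jobs queue up until a worker is free.
with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(call_script, a) for a in args]
    finished = [f.result() for f in as_completed(futures)]

print(sorted(finished))
```

Exiting the `with` block also implicitly waits for all submitted jobs, so the executor doubles as the "wait for everything" step.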

You could use something like the following, to which you can add any number ('n') of functions or console_scripts you want to execute in parallel, start the execution, and wait for all the jobs to finish:

    from multiprocessing import Process

    class ProcessParallel(object):
        """
        To process the functions in parallel
        """
        def __init__(self, *jobs):
            self.jobs = jobs
            self.processes = []

        def fork_processes(self):
            """
            Creates the process objects for the given function delegates
            """
            for job in self.jobs:
                proc = Process(target=job)
                self.processes.append(proc)

        def start_all(self):
            """
            Starts all the function processes together.
            """
            for proc in self.processes:
                proc.start()

        def join_all(self):
            """
            Waits until all the functions have finished executing.
            """
            for proc in self.processes:
                proc.join()


    def two_sum(a=2, b=2):
        return a + b

    def multiply(a=2, b=2):
        return a * b


    #How to run:
    if __name__ == '__main__':
        #note: two_sum, multiply can be replaced with any Python console scripts
        #which you want to run in parallel..
        procs = ProcessParallel(two_sum, multiply)
        #Add all the processes to the list
        procs.fork_processes()
        #starts process execution
        procs.start_all()
        #wait until all the processes have executed
        procs.join_all()
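Note that the return values of two_sum and multiply are discarded here, because Process does not hand results back to the parent. If you also need the results, one alternative sketch (not the code above, but the standard-library multiprocessing.Pool) waits for the jobs and collects their return values:

```python
from multiprocessing import Pool

def two_sum(a=2, b=2):
    return a + b

def multiply(a=2, b=2):
    return a * b

if __name__ == '__main__':
    with Pool(processes=2) as pool:
        r1 = pool.apply_async(two_sum)
        r2 = pool.apply_async(multiply)
        # .get() blocks until each job has finished,
        # so both results are ready after these two lines
        print(r1.get(), r2.get())
```

apply_async submits each job without blocking, and the two .get() calls provide the same "wait for everything" guarantee as join_all().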

Maybe something like:

    import threading

    for t in threading.enumerate():
        if t.daemon:
            t.join()
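The snippet above only joins daemon threads, which works if the workers were started with daemon=True. A variant sketch (my assumption, not part of the snippet above) joins everything except the current thread instead, since threading.enumerate() includes the thread you are running in and joining it would raise RuntimeError:

```python
import threading
import time

done = []

def worker(n):
    # Stand-in for a long-running job
    time.sleep(0.05)
    done.append(n)

for n in range(3):
    threading.Thread(target=worker, args=(n,)).start()

# Join every live thread except the one we are running in;
# joining the current thread would raise RuntimeError.
main_thread = threading.current_thread()
for t in threading.enumerate():
    if t is main_thread:
        continue
    t.join()

print(sorted(done))
```

This avoids having to keep your own list of Thread objects, at the cost of also joining any unrelated non-daemon threads that happen to be alive.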