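Thread local storage in Python

How do I use thread local storage in Python?

Related

What is "thread local storage" in Python, and why do I need it? – This thread appears to be more focused on when variables are shared.
Efficient way to determine whether a particular function is on the stack in Python – Alex Martelli gives a nice solution.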

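Thread local storage is useful, for instance, if you have a thread worker pool and each thread needs access to its own resource, such as a network or database connection. Note that the threading module uses the regular concept of threads (which have access to the process-global data), but because of the global interpreter lock these threads bring little benefit for CPU-bound work (they are still useful for I/O-bound work). The multiprocessing module is different: it creates a new sub-process for each worker, so any global is private to that worker, which has the same effect as thread local storage.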
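As a minimal sketch of the worker-pool case described above, the following gives each pool thread its own lazily created resource. `FakeConnection`, `get_connection`, and `work` are hypothetical names used only for illustration; a real program would open an actual network or database connection instead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()  # created once, at module level


class FakeConnection:
    """Hypothetical stand-in for a network or database connection."""

    def __init__(self):
        self.owner = threading.current_thread().name

    def query(self, item):
        return item * 2


def get_connection():
    # Lazily create one connection per thread, on first use in that thread.
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = _local.conn = FakeConnection()
    return conn


def work(item):
    conn = get_connection()
    # The connection in use was created by the thread that is using it.
    assert conn.owner == threading.current_thread().name
    return conn.query(item)


with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(work, range(6)))

print(results)  # [0, 2, 4, 6, 8, 10]
```

No locking is needed around the connection itself, because no two threads ever hold the same one.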

The threading module

Here is a simple example:

    import threading
    from threading import current_thread

    threadLocal = threading.local()

    def hi():
        initialized = getattr(threadLocal, 'initialized', None)
        if initialized is None:
            print("Nice to meet you", current_thread().name)
            threadLocal.initialized = True
        else:
            print("Welcome back", current_thread().name)

    hi(); hi()

This will print out:

    Nice to meet you MainThread
    Welcome back MainThread

One important thing that is easily overlooked: a threading.local() object only needs to be created once, not once per thread and not once per function call. The global or class level are ideal locations.

Here is why: threading.local() creates a new instance each time it is called (just like any factory or class call would), so calling threading.local() repeatedly keeps replacing the original object, which is almost certainly not what you want. When any thread accesses an existing threadLocal variable (or whatever it is called), it gets its own private view of that variable.

This won't work as intended:

    import threading
    from threading import current_thread

    def wont_work():
        threadLocal = threading.local()  # oops, this creates a new object each time!
        initialized = getattr(threadLocal, 'initialized', None)
        if initialized is None:
            print("First time for", current_thread().name)
            threadLocal.initialized = True
        else:
            print("Welcome back", current_thread().name)

    wont_work(); wont_work()

It will result in this output:

    First time for MainThread
    First time for MainThread
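To make the "private view" point concrete, here is a small sketch with one module-level threading.local() object shared by several threads. The `seen` dictionary and the `worker-N` thread names are assumptions added only to collect and label the results.

```python
import threading

thread_local = threading.local()  # one object, created once
seen = {}  # hypothetical dict, used only to collect results for display


def worker(value):
    # Each new thread starts with an empty namespace ...
    had_value_before = hasattr(thread_local, "value")
    # ... and its assignment is invisible to every other thread.
    thread_local.value = value
    seen[threading.current_thread().name] = (had_value_before, thread_local.value)


thread_local.value = "main"
threads = [
    threading.Thread(target=worker, args=(i,), name=f"worker-{i}")
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen)
print(thread_local.value)  # main
```

None of the workers sees the value set by the main thread, and the main thread's value is untouched after they finish.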

The multiprocessing module

All global variables are effectively local to each worker, since the multiprocessing module creates a new process for each worker instead of a thread.

Consider this example, where the processed counter behaves like thread local storage:

    from multiprocessing import Pool
    from random import random
    from time import sleep
    import os

    processed = 0

    def f(x):
        sleep(random())
        global processed
        processed += 1
        print("Processed by %s: %s" % (os.getpid(), processed))
        return x*x

    if __name__ == '__main__':
        pool = Pool(processes=4)
        print(pool.map(f, range(10)))

It will output something like this:

    Processed by 7636: 1
    Processed by 9144: 1
    Processed by 5252: 1
    Processed by 7636: 2
    Processed by 6248: 1
    Processed by 5252: 2
    Processed by 6248: 2
    Processed by 9144: 2
    Processed by 7636: 3
    Processed by 5252: 3
    [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

... of course, the process IDs, the counts for each process, and the order will vary from run to run.

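Thread-local storage can simply be thought of as a namespace (with values accessed via attribute notation). The difference is that each thread transparently gets its own set of attributes/values, so that one thread doesn't see the values from another thread.

Just like an ordinary object, you can create multiple threading.local instances in your code. They can be local variables, class or instance members, or global variables. Each one is a separate namespace.

Here's a simple example:

    import threading

    class Worker(threading.Thread):
        ns = threading.local()

        def run(self):
            self.ns.val = 0
            for i in range(5):
                self.ns.val += 1
                print("Thread:", self.name, "value:", self.ns.val)

    w1 = Worker()
    w2 = Worker()
    w1.start()
    w2.start()
    w1.join()
    w2.join()

Output (the interleaving of the two threads may differ between runs):

    Thread: Thread-1 value: 1
    Thread: Thread-2 value: 1
    Thread: Thread-1 value: 2
    Thread: Thread-2 value: 2
    Thread: Thread-1 value: 3
    Thread: Thread-2 value: 3
    Thread: Thread-1 value: 4
    Thread: Thread-2 value: 4
    Thread: Thread-1 value: 5
    Thread: Thread-2 value: 5

Note how each thread maintains its own counter, even though the ns attribute is a class member (and hence shared between the threads).

The same example could have used an instance variable or a local variable, but that wouldn't show much, as there would be no sharing (a dict would work just as well). There are cases where you need thread-local storage as an instance variable or a local variable, but they tend to be relatively rare (and pretty subtle).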
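For the instance-member case, one possible sketch is an object that is shared between threads but keeps a separate count for each of them. `PerThreadCounter`, `requests`, and `errors` are hypothetical names invented for this illustration.

```python
import threading


class PerThreadCounter:
    """Hypothetical helper: each instance keeps one count per thread."""

    def __init__(self):
        # Instance member: every counter object gets its own namespace.
        self._local = threading.local()

    def increment(self):
        self._local.count = getattr(self._local, "count", 0) + 1
        return self._local.count


requests = PerThreadCounter()
errors = PerThreadCounter()
results = {}


def worker():
    for _ in range(3):
        requests.increment()
    name = threading.current_thread().name
    results[name] = (requests.increment(), errors.increment())


threads = [threading.Thread(target=worker, name=f"t{i}") for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)               # {'t0': (4, 1), 't1': (4, 1)} in some key order
print(requests.increment())  # 1, the main thread has its own count
```

Both instances are shared by all threads, yet the counts never mix, either across threads or across the two counters.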

As noted in the question, Alex Martelli gives a solution here. This function lets us use a factory function to generate a default value for each thread.

    # Code originally posted by Alex Martelli
    # Modified to use standard Python variable name conventions
    import threading

    threadlocal = threading.local()

    def threadlocal_var(varname, factory, *args, **kwargs):
        v = getattr(threadlocal, varname, None)
        if v is None:
            v = factory(*args, **kwargs)
            setattr(threadlocal, varname, v)
        return v
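A short usage sketch for the helper above, repeated here so the block runs on its own. The `worker` function, the `"buffer"` variable name, and the `collected` dictionary are assumptions made for the example. One caveat that follows from the `is None` check: a factory that legitimately returns None would be called again on every access.

```python
import threading

threadlocal = threading.local()


def threadlocal_var(varname, factory, *args, **kwargs):
    # Same helper as above: create the value on first use in each thread.
    v = getattr(threadlocal, varname, None)
    if v is None:
        v = factory(*args, **kwargs)
        setattr(threadlocal, varname, v)
    return v


collected = {}  # hypothetical dict, used only to gather results


def worker(n):
    buf = threadlocal_var("buffer", list)   # first call builds a new list
    buf.append(n)
    same = threadlocal_var("buffer", list)  # second call returns that list
    same.append(n * 10)
    collected[n] = (buf is same, list(buf))


threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(collected)
```

Each thread ends up with its own two-element list, and no thread ever appends to a list that belongs to another.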

You can also write

    import threading

    mydata = threading.local()
    mydata.x = 1

mydata.x will only exist in the current thread.
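The sketch below checks that claim from a second thread, and then shows one way to give every thread a starting value by subclassing threading.local: the subclass's `__init__` runs again in each thread that first uses the object. `WithDefault`, `reader`, `default_reader`, and `outcome` are hypothetical names for this illustration.

```python
import threading

mydata = threading.local()
mydata.x = 1  # set in the main thread only

outcome = []


def reader():
    # A different thread starts with an empty namespace, so x is missing.
    try:
        outcome.append(mydata.x)
    except AttributeError:
        outcome.append("no x here")


class WithDefault(threading.local):
    # __init__ runs once in every thread that first touches the object.
    def __init__(self):
        self.x = 0


defaults = WithDefault()
defaults.x = 1  # changes only the main thread's copy


def default_reader():
    outcome.append(defaults.x)


for target in (reader, default_reader):
    t = threading.Thread(target=target)
    t.start()
    t.join()

print(outcome)     # ['no x here', 0]
print(defaults.x)  # 1
```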
