什么是“冻结字典”?

  • 冻结集是一个冻结集。
  • 一个冻结的列表可能是一个元组。
  • 冻结字典是什么? 一个不可改变的,可拆分的字典。

我想这可能是像collections.namedtuple ,但更像是一个冻结键字典(半冻结​​字典)。 不是吗?

“frozendict”应该是一个冷冻字典,它应该有keysvaluesget等等,并支持,等等。

Python没有内置的frozendicttypes。 事实certificate,这样做通常不会有用(尽pipe它可能比frozenset )。

想要这样一个types的最常见的原因是当为具有未知参数的函数记忆函数调用时。 最常见的解决scheme是存储一个dict(其中的值是可哈希的)的可tuple(sorted(kwargs.iteritems()))等价物,就像tuple(sorted(kwargs.iteritems()))

这取决于sorting不是有点疯狂。 Python不能肯定地承诺sorting会在这里产生合理的结果。 (但是不能保证太多,所以不要太多。)


你可以很容易地做出一些类似字典的包装。 它可能看起来像

 import collections class FrozenDict(collections.Mapping): """Don't forget the docstrings!!""" def __init__(self, *args, **kwargs): self._d = dict(*args, **kwargs) self._hash = None def __iter__(self): return iter(self._d) def __len__(self): return len(self._d) def __getitem__(self, key): return self._d[key] def __hash__(self): # It would have been simpler and maybe more obvious to # use hash(tuple(sorted(self._d.iteritems()))) from this discussion # so far, but this solution is O(n). I don't know what kind of # n we are going to run into, but sometimes it's hard to resist the # urge to optimize when it will gain improved algorithmic performance. if self._hash is None: self._hash = 0 for pair in self.iteritems(): self._hash ^= hash(pair) return self._hash 

它应该很好:

 >>> x = FrozenDict(a=1, b=2) >>> y = FrozenDict(a=1, b=2) >>> x is y False >>> x == y True >>> x == {'a': 1, 'b': 2} True >>> d = {x: 'foo'} >>> d[y] 'foo' 

奇怪的是,虽然我们在python中很less使用有用的frozenset ,但是仍然没有冻结映射。 这个想法在PEP 416被拒绝了。

所以Python 2的解决scheme:

 def foo(config={'a': 1}): ... 

似乎仍然有点蹩脚:

 def foo(config=None): if config is None: config = default_config = {'a': 1} ... 

在python3中你可以select这个 :

 from types import MappingProxyType default_config = {'a': 1} DEFAULTS = MappingProxyType(default_config) def foo(config=DEFAULTS): ... 

现在默认的configuration可以dynamic地更新,但是通过传递代理来保持默认的configuration是不变的。

因此, default_config更改将按预期更新DEFAULTS ,但不能写入映射代理对象本身。

诚然,这与“不可改变的,可拆分的字典”并不完全相同,但它是一个相当好的替代品,因为我们可能想要一个类似的用例。

假设字典的键和值本身是不可变的(例如string),那么:

 >>> d {'forever': 'atones', 'minks': 'cards', 'overhands': 'warranted', 'hardhearted': 'tartly', 'gradations': 'snorkeled'} >>> t = tuple((k, d[k]) for k in sorted(d.keys())) >>> hash(t) 1524953596 

这是我一直在使用的代码。 我subclassed frozenset。 这个优点如下。

  1. 这是一个真正的不变的对象。 不依赖于未来用户和开发者的良好行为。
  2. 在普通字典和冷冻字典之间来回转换很容易。 FrozenDict(orig_dict) – >冷冻字典。 字典(frozen_dict) – >正规字典。

2015年1月21日更新:我在2014年发布的原始代码使用for-loopfind匹配的密钥。 那非常慢。 现在我已经组装了一个利用了frozenset的散列特性的实现。 键值对存储在特殊容器中,其中__hash____eq__函数仅基于键。 这个代码也经过了正式的unit testing,不像我在2014年8月在这里发布的。

MIT风格的许可证。

 if 3 / 2 == 1: version = 2 elif 3 / 2 == 1.5: version = 3 def col(i): ''' For binding named attributes to spots inside subclasses of tuple.''' g = tuple.__getitem__ @property def _col(self): return g(self,i) return _col class Item(tuple): ''' Designed for storing key-value pairs inside a FrozenDict, which itself is a subclass of frozenset. The __hash__ is overloaded to return the hash of only the key. __eq__ is overloaded so that normally it only checks whether the Item's key is equal to the other object, HOWEVER, if the other object itself is an instance of Item, it checks BOTH the key and value for equality. WARNING: Do not use this class for any purpose other than to contain key value pairs inside FrozenDict!!!! The __eq__ operator is overloaded in such a way that it violates a fundamental property of mathematics. That property, which says that a == b and b == c implies a == c, does not hold for this object. Here's a demonstration: [in] >>> x = Item(('a',4)) [in] >>> y = Item(('a',5)) [in] >>> hash('a') [out] >>> 194817700 [in] >>> hash(x) [out] >>> 194817700 [in] >>> hash(y) [out] >>> 194817700 [in] >>> 'a' == x [out] >>> True [in] >>> 'a' == y [out] >>> True [in] >>> x == y [out] >>> False ''' __slots__ = () key, value = col(0), col(1) def __hash__(self): return hash(self.key) def __eq__(self, other): if isinstance(other, Item): return tuple.__eq__(self, other) return self.key == other def __ne__(self, other): return not self.__eq__(other) def __str__(self): return '%r: %r' % self def __repr__(self): return 'Item((%r, %r))' % self class FrozenDict(frozenset): ''' Behaves in most ways like a regular dictionary, except that it's immutable. It differs from other implementations because it doesn't subclass "dict". Instead it subclasses "frozenset" which guarantees immutability. FrozenDict instances are created with the same arguments used to initialize regular dictionaries, and has all the same methods. [in] >>> f = FrozenDict(x=3,y=4,z=5) [in] >>> f['x'] [out] >>> 3 [in] >>> f['a'] = 0 [out] >>> TypeError: 'FrozenDict' object does not support item assignment FrozenDict can accept un-hashable values, but FrozenDict is only hashable if its values are hashable. [in] >>> f = FrozenDict(x=3,y=4,z=5) [in] >>> hash(f) [out] >>> 646626455 [in] >>> g = FrozenDict(x=3,y=4,z=[]) [in] >>> hash(g) [out] >>> TypeError: unhashable type: 'list' FrozenDict interacts with dictionary objects as though it were a dict itself. [in] >>> original = dict(x=3,y=4,z=5) [in] >>> frozen = FrozenDict(x=3,y=4,z=5) [in] >>> original == frozen [out] >>> True FrozenDict supports bi-directional conversions with regular dictionaries. [in] >>> original = {'x': 3, 'y': 4, 'z': 5} [in] >>> FrozenDict(original) [out] >>> FrozenDict({'x': 3, 'y': 4, 'z': 5}) [in] >>> dict(FrozenDict(original)) [out] >>> {'x': 3, 'y': 4, 'z': 5} ''' __slots__ = () def __new__(cls, orig={}, **kw): if kw: d = dict(orig, **kw) items = map(Item, d.items()) else: try: items = map(Item, orig.items()) except AttributeError: items = map(Item, orig) return frozenset.__new__(cls, items) def __repr__(self): cls = self.__class__.__name__ items = frozenset.__iter__(self) _repr = ', '.join(map(str,items)) return '%s({%s})' % (cls, _repr) def __getitem__(self, key): if key not in self: raise KeyError(key) diff = self.difference item = diff(diff({key})) key, value = set(item).pop() return value def get(self, key, default=None): if key not in self: return default return self[key] def __iter__(self): items = frozenset.__iter__(self) return map(lambda i: i.key, items) def keys(self): items = frozenset.__iter__(self) return map(lambda i: i.key, items) def values(self): items = frozenset.__iter__(self) return map(lambda i: i.value, items) def items(self): items = frozenset.__iter__(self) return map(tuple, items) def copy(self): cls = self.__class__ items = frozenset.copy(self) dupl = frozenset.__new__(cls, items) return dupl @classmethod def fromkeys(cls, keys, value): d = dict.fromkeys(keys,value) return cls(d) def __hash__(self): kv = tuple.__hash__ items = frozenset.__iter__(self) return hash(frozenset(map(kv, items))) def __eq__(self, other): if not isinstance(other, FrozenDict): try: other = FrozenDict(other) except Exception: return False return frozenset.__eq__(self, other) def __ne__(self, other): return not self.__eq__(other) if version == 2: #Here are the Python2 modifications class Python2(FrozenDict): def __iter__(self): items = frozenset.__iter__(self) for i in items: yield i.key def iterkeys(self): items = frozenset.__iter__(self) for i in items: yield i.key def itervalues(self): items = frozenset.__iter__(self) for i in items: yield i.value def iteritems(self): items = frozenset.__iter__(self) for i in items: yield (i.key, i.value) def has_key(self, key): return key in self def viewkeys(self): return dict(self).viewkeys() def viewvalues(self): return dict(self).viewvalues() def viewitems(self): return dict(self).viewitems() #If this is Python2, rebuild the class #from scratch rather than use a subclass py3 = FrozenDict.__dict__ py3 = {k: py3[k] for k in py3} py2 = {} py2.update(py3) dct = Python2.__dict__ py2.update({k: dct[k] for k in dct}) FrozenDict = type('FrozenDict', (frozenset,), py2) 

每当我写这样一个函数时,我想起了frozendict:

 def do_something(blah, optional_dict_parm=None): if optional_dict_parm is None: optional_dict_parm = {} 

是的,这是我的第二个答案,但这是一个完全不同的方法。 第一个实现是纯python。 这是在Cython。 如果你知道如何使用和编译Cython模块,那么它就和普通的字典一样快。 粗略地.04到.06微秒检索一个单一的值。

这是文件“frozen_dict.pyx”

 import cython from collections import Mapping cdef class dict_wrapper: cdef object d cdef int h def __init__(self, *args, **kw): self.d = dict(*args, **kw) self.h = -1 def __len__(self): return len(self.d) def __iter__(self): return iter(self.d) def __getitem__(self, key): return self.d[key] def __hash__(self): if self.h == -1: self.h = hash(frozenset(self.d.iteritems())) return self.h class FrozenDict(dict_wrapper, Mapping): def __repr__(self): c = type(self).__name__ r = ', '.join('%r: %r' % (k,self[k]) for k in self) return '%s({%s})' % (c, r) __all__ = ['FrozenDict'] 

这里是文件“setup.py”

 from distutils.core import setup from Cython.Build import cythonize setup( ext_modules = cythonize('frozen_dict.pyx') ) 

如果您安装了Cython,请将上面的两个文件保存到同一个目录中。 移动到命令行中的该目录。

 python setup.py build_ext --inplace python setup.py install 

你应该做的。

您可以使用来自utilspie包的frozendict

 >>> from utilspie.collectionsutils import frozendict >>> my_dict = frozendict({1: 3, 4: 5}) >>> my_dict # object of `frozendict` type frozendict({1: 3, 4: 5}) # Hashable >>> {my_dict: 4} {frozendict({1: 3, 4: 5}): 4} # Immutable >>> my_dict[1] = 5 Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/Users/mquadri/workspace/utilspie/utilspie/collectionsutils/collections_utils.py", line 44, in __setitem__ self.__setitem__.__name__, type(self).__name__)) AttributeError: You can not call '__setitem__()' for 'frozendict' object 

根据文件 :

frozendict(dict_obj) :接受字典types的obj,并返回一个可sorting且不可变的字典

似乎有其他答案错过了一个几乎微不足道的解决scheme。 最微不足道的答案应该是这样的

 x = {'a': 1, 'b': 2} x.__setitem__ = None 

除了__setitem__dict的只读属性,所以我们必须覆盖它:

 class FrozenDict(dict): def __setitem__(self, key, value): raise NotImplementedError('Invalid operation') 

你可以像@ MikeGraham的回答所build议的那样,在你的类中添加一个__hash__方法。

通过让__setattr____setattribute__引发错误,尤其是后者,可以使类更加不可变。 或者,您可以在类上定义一个空的__slots__属性。 但是,这并不是绝对必要的,因为用户定义的属性不会影响==的结果:

 >>> y = {'a': 1, 'b': 2} >>> x = FrozenDict(y) >>> x is y False >>> x == y True >>> x.random = 5 >>> x == y True >>> y['c'] = 3 >>> x == y False 

另外要记住的是,如果你愿意跳过这个循环,你总是可以修改一个像这样定义的dict:

 dict.__setitem__(x, 'c', 3) 

将绕过冻结。 无论这是一个错误还是一个function,都是由您来决定的。

namedtuple的主要缺点是在使用前需要指定,所以对于一次性使用情况不太方便。

但是,有一个实际的解决方法可以用来处理这种情况。 假设您想拥有以下字典的不可变的等价物:

 MY_CONSTANT = { 'something': 123, 'something_else': 456 } 

这可以像这样模拟:

 from collections import namedtuple MY_CONSTANT = namedtuple('MyConstant', 'something something_else')(123, 456) 

甚至可以编写一个辅助函数来实现这个function:

 def freeze_dict(data): from collections import namedtuple keys = sorted(data.keys()) frozen_type = namedtuple(''.join(keys), keys) return frozen_type(**data) a = {'foo':'bar', 'x':'y'} fa = freeze_dict(data) assert a['foo'] == fa.foo 

当然,这只适用于扁平字典,但实现recursion版本不应该太困难。

另一个选项是multidict包中的MultiDictProxy类。