How can I make a Python dataclass hashable?
I'd like to add a special note on using unsafe_hash.
You can exclude a field from the generated __hash__() by setting compare=False or hash=False on it (hash defaults to None, which means it follows compare). Note that compare=False also drops the field from __eq__().
This is useful if you store nodes in a graph and want to mark them as visited without breaking their hashing (e.g. if they're in a set of unvisited nodes).
from dataclasses import dataclass, field

@dataclass(unsafe_hash=True)
class node:
    x: int
    visit_count: int = field(default=10, compare=False)  # hash inherits the compare setting, so this is valid.
    # visit_count: int = field(default=10, hash=False)  # also valid. Arguably easier to read, but can break some compare code.
    # visit_count: int = 10  # if mutated, hashing breaks ("3*" is printed).

s = set()
n = node(1)
s.add(n)
if n in s:
    print("1* n in s")
n.visit_count = 11
if n in s:
    print("2* n still in s")
else:
    print("3* n is lost to the void because hashing broke.")
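One side effect worth checking for yourself: with compare=False the field is ignored by equality as well as by hashing. Here is a minimal sketch of that, assuming a hypothetical Node class and variable names that are mine, not from the code above:

```python
from dataclasses import dataclass, field

@dataclass(unsafe_hash=True)
class Node:  # hypothetical class for illustration
    x: int
    # compare=False removes the field from both __eq__ and __hash__.
    visit_count: int = field(default=0, compare=False)

a = Node(1)
b = Node(1, visit_count=5)

same_hash = hash(a) == hash(b)  # visit_count does not feed the hash
same_value = a == b             # and it does not feed equality either

seen = {a}
a.visit_count += 1              # mutate the excluded field
still_found = a in seen         # hash is unchanged, so lookup still works
found_via_b = b in seen         # b compares equal to a, so it is found too

print(same_hash, same_value, still_found, found_via_b)
```

So two nodes that differ only in visit_count count as the same element of a set. That is probably what you want for a visited counter, but it is worth knowing before you rely on ==.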
This took me hours to figure out. A useful further reading is the Python documentation on dataclasses; see in particular the documentation for field() and for the dataclass() arguments: https://docs.python.org/3/library/dataclasses.html
From the docs:
Here are the rules governing implicit creation of a __hash__() method: [...]
If eq and frozen are both true, by default dataclass() will generate a __hash__() method for you. If eq is true and frozen is false, __hash__() will be set to None, marking it unhashable (which it is, since it is mutable). If eq is false, __hash__() will be left untouched meaning the __hash__() method of the superclass will be used (if the superclass is object, this means it will fall back to id-based hashing).
Since you set eq=True and left frozen at the default (False), your dataclass is unhashable.
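If you want to see the three cases from the quoted rules for yourself, here is a small sketch. The class names (Mutable, Frozen, IdentityHashed) are made up for illustration:

```python
from dataclasses import dataclass

@dataclass  # defaults: eq=True, frozen=False
class Mutable:
    x: int

@dataclass(frozen=True)  # eq=True and frozen=True
class Frozen:
    x: int

@dataclass(eq=False)  # __hash__ is left untouched
class IdentityHashed:
    x: int

# eq=True, frozen=False: __hash__ is set to None, so hash() raises TypeError.
mutable_hash_is_none = Mutable.__hash__ is None
try:
    hash(Mutable(1))
    mutable_hashable = True
except TypeError:
    mutable_hashable = False

# eq=True, frozen=True: a field-based __hash__ is generated.
frozen_hashes_match = hash(Frozen(1)) == hash(Frozen(1))

# eq=False: the superclass __hash__ (here object's) is used.
inherits_object_hash = IdentityHashed.__hash__ is object.__hash__

print(mutable_hash_is_none, mutable_hashable, frozen_hashes_match, inherits_object_hash)
```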
You have 3 options:
- Set frozen=True (in addition to eq=True), which will make your class immutable and hashable.
- Set unsafe_hash=True, which will create a __hash__ method but leave your class mutable, thus risking problems if an instance of your class is modified while stored in a dict or set:
    cat = Category('foo', 'bar')
    categories = {cat}
    cat.id = 'baz'
    print(cat in categories)  # False
- Manually implement a __hash__ method.
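A rough side-by-side sketch of the three options follows. The class names and the second field (name) are assumptions for illustration; only id and the Category('foo', 'bar') call appear in the snippet above. The manual __hash__ hashing on id alone is just one possible choice, and it is only safe if id is never reassigned while the object is in a set or dict.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)  # option 1: immutable and hashable
class FrozenCategory:
    id: str
    name: str  # hypothetical field name

@dataclass(unsafe_hash=True)  # option 2: hashable but still mutable
class UnsafeCategory:
    id: str
    name: str

@dataclass  # option 3: hand-written __hash__, kept as-is by dataclass()
class ManualCategory:
    id: str
    name: str

    def __hash__(self):
        # Assumption: id is treated as a stable key and never mutated.
        return hash(self.id)

# Option 1: assignment is rejected, so the hash can never go stale.
frozen_cat = FrozenCategory('foo', 'bar')
try:
    frozen_cat.id = 'baz'
    frozen_rejected_mutation = False
except FrozenInstanceError:
    frozen_rejected_mutation = True

# Option 2: mutating a hashed field loses the instance inside the set.
unsafe_cat = UnsafeCategory('foo', 'bar')
unsafe_set = {unsafe_cat}
unsafe_cat.id = 'baz'
lost_after_mutation = unsafe_cat not in unsafe_set

# Option 3: mutating a field that is not part of the hash is harmless.
manual_cat = ManualCategory('foo', 'bar')
manual_set = {manual_cat}
manual_cat.name = 'qux'
manual_still_found = manual_cat in manual_set

print(frozen_rejected_mutation, lost_after_mutation, manual_still_found)
```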