Validating input when mutating a dataclass
Consider locking down the attribute with a getter and setter instead of mutating it directly. If you then extract your validation logic into a separate method, you can validate the same way from both your setter and the __post_init__ function.
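A minimal sketch of that idea, assuming a Person dataclass with an age field; the helper names _validate_age and set_age are hypothetical and only illustrate sharing one check between construction and mutation:

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: float

    def __post_init__(self):
        # Validate the value passed to the generated __init__.
        self._validate_age(self.age)

    @staticmethod
    def _validate_age(value):
        # Hypothetical shared check used by both __post_init__ and the setter.
        if value <= 0:
            raise ValueError(f"age must be positive: {value!r}")

    def set_age(self, value):
        # Callers are expected to use this instead of assigning to .age directly.
        self._validate_age(value)
        self.age = value
```

Nothing here stops a caller from assigning to age directly, so this only works by convention.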
A simple and flexible solution can be to override the __setattr__ method:
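    from dataclasses import dataclass

    @dataclass
    class Person:
        name: str
        age: float

        def __setattr__(self, name, value):
            # Runs on every assignment, including those made by the generated __init__.
            # An explicit raise is used because assert is skipped under "python -O".
            if name == 'age' and value <= 0:
                raise ValueError(f"value of {name} must be positive: {value}")
            self.__dict__[name] = value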
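If several fields need checks, the same hook can dispatch through a lookup table. This is only a sketch: the _validators mapping and the _positive helper are assumptions introduced for illustration, not part of the dataclasses API.

```python
from dataclasses import dataclass
from typing import ClassVar


def _positive(name, value):
    # Hypothetical validator: rejects zero and negative numbers.
    if value <= 0:
        raise ValueError(f"{name} must be positive: {value!r}")


@dataclass
class Person:
    name: str
    age: float
    # ClassVar keeps this table out of the generated dataclass fields.
    _validators: ClassVar[dict] = {"age": _positive}

    def __setattr__(self, name, value):
        # Look up an optional validator for the attribute being assigned.
        check = self._validators.get(name)
        if check is not None:
            check(name, value)
        super().__setattr__(name, value)
```

Because the generated __init__ assigns with ordinary attribute assignment, the check also runs at construction time, so no __post_init__ is needed.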
Dataclasses are a mechanism that provides a default initializer accepting the attributes as parameters and a nice representation, plus some niceties like the __post_init__ hook.
Fortunately, they do not interfere with any other attribute-access mechanism in Python, so you can still create your dataclass attributes as property descriptors, or as instances of a custom descriptor class if you want. That way, any attribute access goes through your getter and setter functions automatically.
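The only drawback of using the built-in property is that you have to use it the "old way", calling property(getter, setter) explicitly in an annotated assignment rather than using the decorator syntax. Only the assignment form lets you write the annotation that dataclass needs in order to recognize the attribute as a field.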
So, "descriptors" are special objects assigned to class attributes in such a way that any access to that attribute calls the descriptor's __get__, __set__ or __delete__ method. The built-in property is a convenience for building a descriptor from one to three functions that will be called from those methods.
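To make the three hooks concrete, here is a small illustrative descriptor that records which method Python calls for each kind of access. The Traced and Demo classes are hypothetical names used only for this sketch.

```python
class Traced:
    """Illustrative descriptor that logs which hook was triggered."""

    def __set_name__(self, owner, name):
        # Called once at class creation with the attribute's name.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        instance.log.append(("get", self.name))
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        instance.log.append(("set", self.name))
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        instance.log.append(("delete", self.name))
        instance.__dict__.pop(self.name, None)


class Demo:
    x = Traced()

    def __init__(self):
        self.log = []


d = Demo()
d.x = 1      # calls Traced.__set__
_ = d.x      # calls Traced.__get__
del d.x      # calls Traced.__delete__
print(d.log)  # [('set', 'x'), ('get', 'x'), ('delete', 'x')]
```

Because Traced defines __set__, it is a data descriptor, so it takes precedence even though the value is stored under the same name in the instance's __dict__.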
So, with no custom descriptor-thing, you could do:
    from dataclasses import dataclass

    @dataclass
    class MyClass:
        def setname(self, value):
            if not isinstance(value, str):
                raise TypeError(...)
            self.__dict__["name"] = value

        def getname(self):
            return self.__dict__.get("name")

        name: str = property(getname, setname)
        # optionally, you can delete the getter and setter from the class body:
        del setname, getname
With this approach you have to write each attribute's access as two functions, but you no longer need to write a __post_init__: each attribute validates itself.
Also note that this example takes the less usual approach of storing the values under their normal names in the instance's __dict__. In most examples around the web, the practice is to use normal attribute access with the name prefixed by an underscore (_name). That leaves the extra attributes polluting dir() on your final instance, and the underscore-prefixed attributes themselves are unguarded.
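The difference between the two storage styles can be seen in a short comparison. Both classes below are hypothetical and use plain classes rather than dataclasses to keep the sketch small.

```python
class UnderscoreStyle:
    """Common pattern: the value lives in a separate _name attribute."""

    def __init__(self, name):
        self.name = name

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if not isinstance(value, str):
            raise TypeError("name must be a str")
        self._name = value


class DictStyle:
    """Pattern from the answer: the value lives under its own name in __dict__."""

    def __init__(self, name):
        self.name = name

    def _get_name(self):
        return self.__dict__.get("name")

    def _set_name(self, value):
        if not isinstance(value, str):
            raise TypeError("name must be a str")
        self.__dict__["name"] = value

    name = property(_get_name, _set_name)


u = UnderscoreStyle("Ann")
u._name = 42            # plain assignment skips the setter's type check
print(u.name)           # 42
print(sorted(vars(u)))  # ['_name']

d = DictStyle("Ann")
print(sorted(vars(d)))  # ['name']
```

In the second style there is no unguarded sibling attribute to assign to; bypassing the check requires writing to __dict__ explicitly.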
Another approach is to write your own descriptor class, and let it check the type and other properties of the attributes you want to guard. This can be as sophisticated as you want, culminating in your own framework. For a descriptor class that checks the attribute type and accepts a list of validators, you will need:
    from dataclasses import dataclass

    def positive_validator(name, value):
        if value <= 0:
            raise ValueError(f"values for {name!r} have to be positive")

    class MyAttr:
        def __init__(self, type, validators=()):
            self.type = type
            self.validators = validators

        def __set_name__(self, owner, name):
            self.name = name

        def __get__(self, instance, owner):
            # Accessed on the class itself: return the descriptor.
            if instance is None:
                return self
            return instance.__dict__[self.name]

        def __delete__(self, instance):
            del instance.__dict__[self.name]

        def __set__(self, instance, value):
            if not isinstance(value, self.type):
                raise TypeError(f"{self.name!r} values must be of type {self.type!r}")
            for validator in self.validators:
                validator(self.name, value)
            instance.__dict__[self.name] = value

    # And now:
    @dataclass
    class Person:
        name: str = MyAttr(str)
        age: float = MyAttr((int, float), [positive_validator])
That is it. Creating your own descriptor class requires a bit more knowledge of Python, but the code given above should be good for use, even in production; you are welcome to use it.
Note that you could easily add a lot of other checks and transforms for each of your attributes. The code in __set_name__ itself could be changed to introspect the __annotations__ of the owner class and pick up the types automatically, so that the type parameter would not be needed for the MyAttr class itself. But as I said before: you can make this as sophisticated as you want.
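A rough sketch of that annotation-driven variant follows. AutoAttr and Point are hypothetical names, and the sketch assumes every annotation is a plain class rather than a string or a typing construct such as Union.

```python
from dataclasses import dataclass


class AutoAttr:
    """Hypothetical descriptor that reads its expected type from the annotation."""

    def __init__(self, validators=()):
        self.validators = validators

    def __set_name__(self, owner, name):
        self.name = name
        # Assumption: the annotation is a real class usable with isinstance().
        self.type = owner.__annotations__[name]

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, self.type):
            raise TypeError(f"{self.name!r} values must be of type {self.type!r}")
        for validator in self.validators:
            validator(self.name, value)
        instance.__dict__[self.name] = value


@dataclass
class Point:
    x: int = AutoAttr()
    y: int = AutoAttr()
```

With `from __future__ import annotations` the annotations become strings, so this simple lookup would need something like typing.get_type_hints instead.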
The answer provided by @jsbueno is great, but it doesn't allow for default arguments. I expanded it to allow defaults:
    from dataclasses import dataclass
    from typing import Union

    def positive_validator(name, value):
        if value <= 0:
            raise ValueError(f"values for {name!r} have to be positive")

    class MyAttr:
        def __init__(self, typ, validators=(), default=None):
            # Accept a single type or a tuple of types; normalize to a tuple.
            if isinstance(typ, type):
                typ = (typ,)
            elif not (isinstance(typ, tuple) and all(isinstance(t, type) for t in typ)):
                raise TypeError("'typ' must be a type or a tuple of types")
            self.type = typ
            self.name = f"MyAttr_{self.type!r}"  # placeholder until __set_name__ runs
            self.validators = validators
            self.default = default
            if self.default is not None or type(None) in typ:
                self._validate(self.default)

        def __set_name__(self, owner, name):
            self.name = name

        def __get__(self, instance, owner):
            if instance is None:
                return self
            return instance.__dict__[self.name]

        def __delete__(self, instance):
            del instance.__dict__[self.name]

        def _validate(self, value):
            for validator in self.validators:
                validator(self.name, value)

        def __set__(self, instance, value):
            # dataclass passes the descriptor itself when the argument is omitted.
            if value is self:
                value = self.default
            if not isinstance(value, self.type):
                raise TypeError(f"{self.name!r} values must be of type {self.type!r}")
            # Run the validators on assignment too, not only on the default.
            self._validate(value)
            instance.__dict__[self.name] = value

    # And now:
    @dataclass
    class Person:
        name: str = MyAttr(str, [])  # required attribute, must be a str, cannot be None
        age: float = MyAttr((int, float), [positive_validator], 2)  # optional attribute, must be an int or float > 0, defaults to 2
        posessions: Union[list, type(None)] = MyAttr((list, type(None)), [])  # optional attribute in which None is the default
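The default handling relies on one detail of how dataclass treats a descriptor used as a field default: it reads the class attribute, which invokes __get__ with no instance and so yields the descriptor object itself, and that object becomes the default argument of the generated __init__. The stripped-down sketch below shows this in isolation; Probe and Box are hypothetical names and the "<omitted>" marker exists only for the demonstration.

```python
from dataclasses import dataclass, fields


class Probe:
    """Minimal descriptor that reveals when the field argument was omitted."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class-level access returns the descriptor
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        # When the caller omits the argument, the generated __init__
        # assigns the descriptor object itself.
        instance.__dict__[self.name] = "<omitted>" if value is self else value


@dataclass
class Box:
    item: str = Probe()


print(fields(Box)[0].default is Box.item)  # True
print(Box().item)      # <omitted>
print(Box("x").item)   # x
```

Comparing with `is` rather than `==` avoids calling a custom __eq__ on whatever value the caller passes in.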