How to implement type casting for a custom Python class

To answer the question, one way of doing this is by "abusing" __repr__ in combination with eval(). Let's first have a look at the __repr__ docs (emphasis mine):

Called by the repr() built-in function to compute the “official” string representation of an object. If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment). If this is not possible, a string of the form <...some useful description...> should be returned. The return value must be a string object. If a class defines __repr__() but not __str__(), then __repr__() is also used when an “informal” string representation of instances of that class is required.

This is typically used for debugging, so it is important that the representation is information-rich and unambiguous.

With this in mind, we know that it is recommended to return a string from __repr__ which can be used with eval(). This is implied by the statement that the value "should look like a valid Python expression".
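As a quick illustration of that convention, here is a minimal sketch that round-trips a few standard-library values through repr() and eval(). The variable names (samples, namespace, and so on) are only illustrative. It also shows the default repr of a plain object, which takes the <...> form and is not a valid expression.

```python
import datetime

# Values whose repr() is a valid Python expression.
samples = [123, 'text', [1, 2, 3], {'key': (1.5, None)}, datetime.date(2020, 1, 2)]

# The "appropriate environment": eval() must be able to resolve the name datetime.
namespace = {'datetime': datetime}

round_trip_ok = []
for original in samples:
    text = repr(original)
    # Acceptable here only because every string comes from the literals above.
    restored = eval(text, namespace)
    round_trip_ok.append(restored == original)
    print('%s -> %r' % (text, restored))

# Without a custom __repr__, the "<... object at 0x...>" form is used.
default_repr = repr(object())
print(default_repr)

try:
    eval(default_repr)
    default_is_evaluable = True
except SyntaxError:
    default_is_evaluable = False
print('Default repr can be evaluated: %r' % default_is_evaluable)
```

The last part shows why a class needs its own __repr__ before this trick works at all.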

Example

Here is an example which uses this. It also overrides __eq__, but only to make the print-outs more convenient. For completeness, the instance also carries a value.

The example creates a new instance. Then the instance is converted to a string using __repr__ (by calling the repr() function). Next, that string is passed to eval(), which evaluates it and returns the result. The result is a new instance of the same class and is stored in second_instance. We also print out the id() to visualise that we have indeed two different instances. Finally, we show that first_instance == second_instance is indeed True:

class MyClass:

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, MyClass) and self.value == other.value

    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, self.value)


first_instance = MyClass(123)
print('First instance: repr=%r, id=%d' % (first_instance, id(first_instance)))

stringified = repr(first_instance)
print('Stringified: %r' % stringified)

second_instance = eval(stringified)  # !!! DANGEROUS (see below) !!!
print('Second instance: repr=%r, id=%d' % (second_instance, id(second_instance)))

print('First == Second: %r' % (first_instance == second_instance))

When is it OK to do this?

This is 100% acceptable if absolutely everything going into eval() is under your control! This means:

  • The scope in which eval() is called is under your control
  • No part of the evaluated string contains data coming from outside sources. Outside sources include:
    • Database values
    • User input
    • Data read from disk
    • ... basically any I/O

Guaranteeing that no I/O will ever end up in an eval() call, at any point in the future of the project, is almost impossible. As such, I strongly recommend avoiding this in important production code, as it opens up nasty security holes.

For code not running in production, this is absolutely acceptable. For example: unit tests, personal utility scripts, etc. But the risk should always be taken into consideration.

Why is this Dangerous?

  • The code passed into eval() is executed inside the Python process calling it, with the same privileges. Example: You read a value from a DB where multiple users have access and you eval() it. In that case, another user may inject code via the database and that code will run as your user!
  • Using eval() when the values come from outside sources opens up the possibility of code-injections.
  • It is not guaranteed that repr() will return a valid Python expression. This is only a recommendation by the docs. Hence calling eval() on the output of repr() is prone to run-time errors.
  • In the example above, the scope calling eval() needs to "know" about the class MyClass (it must be imported). eval() only looks up the name. So if by pure chance that same name exists in the scope but points to another object, you will call something else unintentionally and may run into weird bugs. Granted, this is an edge case.
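To make two of these points concrete, here is a small and deliberately harmless sketch. The untrusted string and the side_effects list are hypothetical stand-ins for data coming from an outside source and for state an attacker should not be able to touch.

```python
class MyClass:

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, self.value)


# Hypothetical stand-in for state that outside data should never modify.
side_effects = []

# Imagine this string came from a database or a file instead of repr().
# It still looks like a MyClass expression, but the argument calls list.append.
untrusted = "MyClass(side_effects.append('injected code ran'))"

namespace = {'MyClass': MyClass, 'side_effects': side_effects}
instance = eval(untrusted, namespace)  # !!! DANGEROUS !!!

print('Instance: %r' % instance)
print('Side effects: %r' % side_effects)

# The name-lookup edge case: if "MyClass" points to something else in the
# evaluating scope, eval() silently calls that instead.
shadowed = eval(repr(MyClass(123)), {'MyClass': str})
print('Shadowed result: %r' % shadowed)
```

A real attack would not need the list to be in scope: builtins such as __import__ are available to eval() by default, so the injected expression could import and call anything the process is allowed to use.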

Safer Alternative

Use one of the many available serialisation options. The most popular and simplest one is to convert the object to and from JSON strings. The above example could be made safe like this:

import json


class MyClass:

    @classmethod
    def from_json(cls, document):
        # Parse the JSON document and build a new instance from its fields.
        data = json.loads(document)
        return cls(data['value'])

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, MyClass) and self.value == other.value

    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, self.value)

    def to_json(self):
        data = {
            'value': self.value
        }
        return json.dumps(data)


first_instance = MyClass(123)
print('First instance: repr=%r, id=%d' % (first_instance, id(first_instance)))

stringified = first_instance.to_json()
print('Stringified: %r' % stringified)

second_instance = MyClass.from_json(stringified)
print('Second instance: repr=%r, id=%d' % (second_instance, id(second_instance)))

print('First == Second: %r' % (first_instance == second_instance))

This is only marginally more difficult but much safer.

The same approach can be used with other serialisation methods. Popular formats are:

  • XML
  • YAML
  • ini/cfg files
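  • pickle (note that this uses bytes instead of text as serialisation medium, and that unpickling data from an untrusted source can execute arbitrary code, so it does not solve the security problem described above).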
  • MessagePack (note that this uses bytes instead of text as serialisation medium).
  • Custom Implementation
  • ...

For those who are looking for overriding conversion builtins such as int(obj), float(obj), and str(obj), see Overload int() in Python. You need to implement __int__, __float__, or __str__ on the object.
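A minimal sketch of those three methods follows. The Temperature class and its degrees attribute are hypothetical and only serve as an example.

```python
class Temperature:
    """Hypothetical example class wrapping a single numeric value."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __int__(self):
        # Called by int(obj); must return an int.
        return int(self.degrees)

    def __float__(self):
        # Called by float(obj); must return a float.
        return float(self.degrees)

    def __str__(self):
        # Called by str(obj) and print(); must return a str.
        return '%.1f degrees' % self.degrees


reading = Temperature(21.7)
as_int = int(reading)
as_float = float(reading)
as_text = str(reading)

print('int: %r, float: %r, str: %r' % (as_int, as_float, as_text))
```

Note that these methods convert an instance to a builtin type. Going the other way, from a builtin value to an instance, is simply a matter of calling the class, or a from_... method like the JSON one above.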