
Python OOP, and Why repr() and str() Matter

This article explores the various aspects of using repr() and str() with Python classes

PYTHON PROGRAMMING

Python classes need string representation to provide the user and the developer with more information than just a mess of letters. Photo by Surendran MP on Unsplash

Python classes have many faces. For instance, you can create an empty class:

class MyClass:
    pass

and it still can be of use, for instance, as a sentinel value. You can add an __init__() method:

class MyClass:
    def __init__(self, value):
        self.value = value

It still will be a very simple class, but this time, it’ll keep a particular value.

A superb power of Python classes is that they can be used as types, as shown below:

def foo(x: MyClass, n: int) -> list[MyClass]:
    return [x] * n

Remember that not implementing the __init__() method does not mean it doesn’t exist. In fact, we overrode the inherited __init__() method above rather than implementing it from scratch. This is another significant aspect of Python classes that you should know: you can override many other methods, such as __new__(), __eq__(), and __setattr__(). If you do not override these methods, some will keep their default implementation (like __init__(), __new__(), __setattr__(), and __eq__()), while others have no usable default (like __lt__() and the other ordering comparisons, or __getitem__(), __setitem__() and __len__()).
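The difference between methods with and without a usable default can be checked directly. The sketch below uses a hypothetical empty class, Plain, introduced only for illustration:

```python
class Plain:
    """An empty class that defines no special methods (hypothetical)."""


a, b = Plain(), Plain()

# __eq__() has a default implementation: it compares identity.
same_object_equal = (a == a)        # True
distinct_objects_equal = (a == b)   # False

# Ordering comparisons have no usable default, so "<" raises TypeError.
try:
    a < b
    ordering_supported = True
except TypeError:
    ordering_supported = False

# __len__() is not defined at all, so len() raises TypeError as well.
try:
    len(a)
    len_supported = True
except TypeError:
    len_supported = False

print(same_object_equal, distinct_objects_equal,
      ordering_supported, len_supported)
# True False False False
```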

A class can inherit from another class, like here:

class MyClass(dict):
    @staticmethod
    def say_hello():
        print("Hello!")

and, also as above, it can use static methods, but also class methods. You can create mixin classes and abstract base classes, singletons, and make tons of other things, sometimes very useful.
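As a rough illustration of the two kinds of methods mentioned here, the sketch below defines a hypothetical Greeter class (the class and the from_pairs() name are assumptions made for this example only):

```python
class Greeter(dict):
    """A dict subclass with a static and a class method (hypothetical)."""

    @staticmethod
    def say_hello() -> str:
        # A static method receives neither the instance nor the class.
        return "Hello!"

    @classmethod
    def from_pairs(cls, *pairs: tuple) -> "Greeter":
        # A class method receives the class, so it can build instances.
        return cls(pairs)


g = Greeter.from_pairs(("a", 1), ("b", 2))
print(Greeter.say_hello())       # Hello!
print(g["a"], type(g).__name__)  # 1 Greeter
```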

Python classes have so many faces that it would take years to discuss each of them in detail, and we’ll be doing so in future articles. In this one, we will focus on one particular aspect: the difference between and the power of the __repr__() and __str__() methods.

At first glance, you may think this is a minor topic, but it’s actually quite important. It’s easy to implement a Python class, but it takes more effort to implement a good Python class. And it is these minor details that differentiate a skillful Python developer from a regular one.

Note: To run doctests, I used Python 3.11. Don’t be surprised if other versions of Python produce slightly different results, especially in error messages. If you’d like to read more about Python doctesting, grab the following article:

Python Documentation Testing with doctest: The Easy Way

repr vs str

In theory, repr() should return an unambiguous string representation of an object, from which you should be able to recreate the object. str(), on the other hand, should return a human-readable string representation of an object.
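Some built-in and standard-library types follow this convention closely. The short sketch below is only an illustration of the idea, using a string and a datetime.date object:

```python
import datetime

text = "hello"
day = datetime.date(2024, 1, 15)

print(str(text))   # hello
print(repr(text))  # 'hello'
print(str(day))    # 2024-01-15
print(repr(day))   # datetime.date(2024, 1, 15)

# The repr() output is valid Python, so it can recreate the object.
# eval() is used only to illustrate the idea; avoid it on untrusted input.
recreated = eval(repr(day), {"datetime": datetime})
print(recreated == day)  # True
```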

So, still in theory, repr() should provide detailed information about the object it’s called on, while str() should provide a readable string explaining what the object is and what it may contain. For example, we use str() when printing an object or for logging purposes. But when we’re debugging and need more details, repr() is the way to go. As we’ll see in the next section, we usually call these functions indirectly, without even knowing it – or at least without thinking about it.

We compared the repr() and str() functions above. To customize them for a class, we need to implement the corresponding methods, __repr__() and __str__(), respectively. If a class defines a __repr__() method, it’s used to generate the string representation of objects of that class when you call repr(). The same goes for str() and __str__().

We’ll see this in action soon – first let’s see what I meant when I mentioned calling repr() and str() indirectly.

Calling repr() and str() indirectly

There’s a secret related to these two functions, and it’s good to know it. Consider the following code:

>>> class StrAndRepr:
...     def __repr__(self): return "I am __repr__"
...     def __str__(self): return "I am __str__"
>>> str_and_repr = StrAndRepr()
>>> str_and_repr
I am __repr__
>>> print(str_and_repr)
I am __str__

Focus on the last two calls. As you can see, it can make a difference whether or not you use print() to print an object in a Python session or just an object’s name.

Difference between calling print(obj) and obj in a Python session. Image by author

The image above summarizes this: print(obj) calls str(obj) while obj calls repr(obj).
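The same indirect calls happen in string formatting and inside containers. The sketch below uses a hypothetical class, Both, whose methods simply report which one was called:

```python
class Both:
    """Hypothetical class used only to show which method gets called."""

    def __repr__(self) -> str:
        return "repr-called"

    def __str__(self) -> str:
        return "str-called"


obj = Both()

print(f"{obj}")    # str-called    (f-strings use str() by default)
print(f"{obj!r}")  # repr-called   (!r asks for repr())
print("%s" % obj)  # str-called
print("%r" % obj)  # repr-called
print([obj])       # [repr-called] (containers use repr() for items)
print(str([obj]))  # [repr-called]
```

Note the last two lines: even str() on a list shows the repr() of its items, which is one more reason to keep __repr__() informative.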

repr vs str

Above, I explained the concept behind repr() and __repr__() and str() and __str__(). The former pair should provide more information than the latter.

Oftentimes, however, practice shows a different face:

>>> class MyClass: ...
>>> inst = MyClass()
>>> inst.__repr__()
'<__main__.MyClass object at 0x7f...>'
>>> inst.__str__()
'<__main__.MyClass object at 0x7f...>'
>>> inst.__repr__() == repr(inst)
True
>>> inst.__str__() == str(inst)
True

As you can see, the default implementations of both of these methods are the same:

>>> str(inst) == repr(inst)
True

So, even the default implementations of __str__() and __repr__(), used when you don’t override these two methods in a Python class, go against the rule mentioned above. In addition, developers can override either or both of these methods, and in real life, this can also mean going against this very rule.

What happens when only one of the two methods is implemented? To show this, I’ll implement the following four classes:

>>> class StrAndRepr:
...     def __repr__(self): return "I am repr of StrAndRepr"
...     def __str__(self): return "I am str of StrAndRepr"
>>> class OnlyStr:
...     def __str__(self): return "I am str of OnlyStr"
>>> class OnlyRepr:
...     def __repr__(self): return "I am repr of OnlyRepr"
>>> class NeitherStrNorRepr: ...

So, we defined four classes: one with neither __str__() nor __repr__(), two with one of them, and one with both. Let’s see what happens if we call str() and repr() on their instances:

>>> str_and_repr = StrAndRepr()
>>> str(str_and_repr)
'I am str of StrAndRepr'
>>> repr(str_and_repr)
'I am repr of StrAndRepr'

>>> only_str = OnlyStr()
>>> str(only_str)
'I am str of OnlyStr'
>>> repr(only_str)
'<__main__.OnlyStr object at 0x7f...>'

>>> only_repr = OnlyRepr()
>>> str(only_repr)
'I am repr of OnlyRepr'
>>> repr(only_repr)
'I am repr of OnlyRepr'

>>> neither_str_nor_repr = NeitherStrNorRepr()
>>> str(neither_str_nor_repr)
'<__main__.NeitherStrNorRepr object at 0x7f...>'
>>> repr(neither_str_nor_repr)
'<__main__.NeitherStrNorRepr object at 0x7f...>'

Here are the conclusions from the above doctests:

  • Implement neither __str__() nor __repr__(): For both, the default implementations will be used; they are the same, both providing the class’s name and the instance’s address.
  • Implement both __str__() and __repr__(): Usually, this is the recommended approach. It makes your code more readable and maintainable – although, at the same time, longer.
  • Implement only __str__(): Python will use it for str() but for repr() the default implementation will be used.
  • Implement only __repr__(): Python will use it for both str() and repr().
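The asymmetry in the last two points comes from the inherited defaults: object.__str__() delegates to repr(), but object.__repr__() never consults __str__(). A minimal sketch that checks this, redefining two of the classes so that it runs on its own:

```python
class OnlyRepr:
    def __repr__(self) -> str:
        return "I am repr of OnlyRepr"


class OnlyStr:
    def __str__(self) -> str:
        return "I am str of OnlyStr"


# OnlyRepr does not define __str__(), so it inherits object.__str__(),
# which delegates to repr(self).
print("__str__" in vars(OnlyRepr))          # False
print(OnlyRepr.__str__ is object.__str__)   # True
print(str(OnlyRepr()))                      # I am repr of OnlyRepr

# The reverse does not hold: object.__repr__() never calls __str__().
print(OnlyStr.__repr__ is object.__repr__)  # True
print(repr(OnlyStr()).startswith("<"))      # True
```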

So, what should I implement?

It depends. The most obvious conclusion is that if you implement a complex class, you should define both these methods. This will give you more opportunities to debug the code and use better logging.

Nevertheless, when you don’t have much time for coding as deadlines are approaching, you should implement at least one of the methods. Implementing neither means the string representation of the class will contain little useful information: just the class’s name and the instance’s address. Therefore, do so only when you’re certain the class’s name is all you need. In prototyping, for example, this is often the case.

For small classes, implementing only one of the two methods can be enough, but always make sure it’s indeed enough. Besides, how often do you have so little time that you cannot implement such a simple method as __str__() or __repr__()? I know this can happen – but I don’t think something like this happens more often than once in a while. To be honest, in my over five years of Python development, this has not happened even once.

So, I think time is seldom a concern. Space, on the other hand, can be. When your module contains a number of small classes that take several lines, implementing both __repr__() and __str__() for all of them can double the length of the module. This can make a big difference, so it’s worth considering whether both methods are needed, and if not, which one should be implemented.
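When space is the main concern, one option (an assumption of this sketch, not something the discussion above relies on) is the standard library’s dataclasses module, which generates __repr__() from the declared fields. The Size class below is hypothetical:

```python
from dataclasses import dataclass


@dataclass
class Size:
    """Hypothetical small class; @dataclass generates __init__(),
    __repr__() and __eq__() from the annotated fields."""

    width: int
    height: int


size = Size(3, 4)
print(repr(size))  # Size(width=3, height=4)
print(str(size))   # Size(width=3, height=4), since __str__() is not defined
```

Since only __repr__() is generated, str() falls back to it, so a small class gets a usable representation without any extra lines.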

Many built-in classes use the same implementation for __repr__() and __str__(), including dict and list. The same goes for many classes from well-known add-on packages, a perfect example from the data science realm being pandas.DataFrame.
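For the built-in containers, this is easy to verify in a quick sketch:

```python
items = [1, "a", None]
mapping = {"key": "value"}

print(str(items))                     # [1, 'a', None]
print(str(items) == repr(items))      # True
print(str(mapping) == repr(mapping))  # True
```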

Let’s summarize our discussion in a set of rules. To be honest, even though I’ve been using them for years, this is the first time I’ve thought to write them down. I hope you’ll find them useful in your coding practice to decide whether to implement both __repr__() and __str__(), only one of them, or none:

  • When you’re writing a prototype class and don’t plan to use its string representation at all, you can ignore both __repr__() and __str__(). For production code, however, think twice before doing so. During development, I often skip these methods unless I need to debug the code using the class’s instances.
  • When your class produces complex instances with many attributes, consider implementing both __repr__() and __str__(). Then: (i) The __str__() method should provide a simple human-readable string representation, which can be obtained using either the print() or the str() function. (ii) The __repr__() method should provide as much information as possible, including all the information required to recreate the class’s instances; this can be obtained using the repr() function or by typing the instance’s name in the interactive session.
  • If your class needs to be used in debugging, make its __repr__() method as detailed as possible, regardless of whether or not you implement __str__(). This does not mean that the output of __repr__() must be insanely long; instead, in such situations, include whatever information is necessary for debugging purposes.
  • When a class needs a human-readable string representation and at the same time you need to implement a detailed __repr__() method, implement __str__() in addition to __repr__().
  • If a class needs a human-readable string representation but you don’t need the detailed __repr__(), implement only __repr__(). This will give users a nice human-readable string representation from both methods, and they will avoid seeing the default __repr__() representation, which is usually of little value. When implementing only __repr__(), it is important to be consistent with the format of the string that is returned. This will make it easier for users to read and understand the output of both str() and repr().
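As a sketch of the last rule, here is one common way to write a single __repr__() that serves both str() and repr(). The Temperature class and its attributes are hypothetical, chosen only for illustration:

```python
class Temperature:
    """Hypothetical small class that implements only __repr__()."""

    def __init__(self, value: float, unit: str = "C") -> None:
        self.value = value
        self.unit = unit

    def __repr__(self) -> str:
        # type(self).__name__ keeps the output correct in subclasses;
        # !r applies repr() to each attribute, so strings stay quoted.
        return (
            f"{type(self).__name__}"
            f"(value={self.value!r}, unit={self.unit!r})"
        )


t = Temperature(21.5)
print(repr(t))  # Temperature(value=21.5, unit='C')
print(str(t))   # Temperature(value=21.5, unit='C')
```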

Implementing __repr__() and __str__()

Now that we know when to implement the two methods, it’s worth considering how to do it. There are only two rules that you must follow, and fortunately, both are simple.

The first one deals with the methods’ arguments, and the other one with the types of their return values. We can thus present them using the two methods’ expected signatures, that is:

def __repr__(self) -> str:
    ...

def __str__(self) -> str:
    ...

Is this all?

Basically, yes – but… I wrote that these are the expected signatures, but the truth is, you should treat them as required signatures.

To see why, you should know an interesting thing, one that I guess quite a few Python users don’t know. I, for one, wasn’t aware of it for quite some time.

This rule applies when you want the class’s __str__() to work with str() and print(), and __repr__() to work with repr() and with typing the instance’s name in an interactive session. To show this, let’s implement a class with __str__() having a non-optional argument:

>>> class StrWithParams:
...     def __str__(self, value):
...         return f"StrWithParams with value of {value}"

Will the method work?

>>> inst = StrWithParams()
>>> inst.__str__(10)
'StrWithParams with value of 10'

Hey, it does! So how come I just wrote __str__() should not take arguments?

It should not – although theoretically, it can. It can under the unrealistic condition that the only way the method is ever called is inst.__str__(10) (the value itself doesn’t matter). Above, we saw such a call, and it worked indeed. But what we’ll see now is three bitter failures:

>>> str(inst, value=10)
Traceback (most recent call last):
    ...
TypeError: 'value' is an invalid keyword argument for str()
>>> print(inst)
Traceback (most recent call last):
    ...
TypeError: StrWithParams.__str__() missing 1 required positional argument: 'value'
>>> print(inst, value=10)
Traceback (most recent call last):
    ...
TypeError: 'value' is an invalid keyword argument for print()

So, using an argument for __str__() is not a syntax error, but it’s definitely a static error:

A screenshot from Visual Studio Code. Sonarlint shows that str() should not take arguments. Image by author

It definitely is a static error, but a bigger problem is that using arguments for __str__() will most likely lead to a TypeError exception raised at runtime, as shown above.

Typing inst directly in a session calls repr(), and since we did not implement it, the default implementation is used:

>>> inst
<__main__.StrWithParams object at 0x7f...>

But as shown before, calling print(inst) failed, for the simple reason that there was no way to provide a value for the non-optional argument value.
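If a configurable representation is really needed, one possible workaround (a sketch under my own assumptions, with a hypothetical Labelled class and describe() method) is to keep __str__() parameter-free and move the parameter to a regular method:

```python
class Labelled:
    """Hypothetical class: __str__() stays parameter-free, and the
    configurable variant lives in a regular method."""

    def __init__(self, value: int) -> None:
        self.value = value

    def __str__(self) -> str:
        return self.describe()

    def describe(self, prefix: str = "Labelled") -> str:
        return f"{prefix} with value of {self.value}"


inst = Labelled(10)
print(inst)                     # Labelled with value of 10
print(str(inst))                # Labelled with value of 10
print(inst.describe("Custom"))  # Custom with value of 10
```

This way str() and print() keep working, and callers who need the extra argument call describe() explicitly.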

Now, let’s move on to the other issue, that is, returning an object of a non-string type. It seems like something to be considered a static error. Let’s consider two versions of the class definition, untyped and typed:

A screenshot from Visual Studio Code. Based on an untyped class definition, Sonarlint shows that str() should return a string. Image by author
A screenshot from Visual Studio Code. Based on a typed class definition, Mypy shows that str() should return a string. Image by author

So, returning a non-string object from a __str__() method is definitely a static error – but would it also raise a TypeError exception at runtime?

Yes, it would:

>>> class StrNotStr:
...     def __str__(self):
...         return 10
>>> inst = StrNotStr()
>>> inst.__str__()
10
>>> str(inst)
Traceback (most recent call last):
    ...
TypeError: __str__ returned non-string (type int)

The rules for __repr__() are the same:

>>> class ReprWithParams:
...     def __repr__(self, value):
...         return f"ReprWithParams with value of {value}"
>>> inst = ReprWithParams()
>>> inst.__repr__(10)
'ReprWithParams with value of 10'
>>> repr(inst, value=10)
Traceback (most recent call last):
    ...
TypeError: repr() takes no keyword arguments
>>> inst
Traceback (most recent call last):
    ...
TypeError: ReprWithParams.__repr__() missing 1 required positional argument: 'value'

>>> class ReprNotStr:
...     def __repr__(self):
...         return 10
>>> inst = ReprNotStr()
>>> inst.__repr__()
10
>>> repr(inst)
Traceback (most recent call last):
    ...
TypeError: __repr__ returned non-string (type int)

So, remember not to use parameters for __repr__() and __str__(), and remember that both should return strings. But it’s also worth remembering what will happen when you break either of these two rules.

Example of a custom class

As mentioned above, when you’re implementing a complex custom class, you should usually implement both __str__() and __repr__(), and they should be different.

What does "complex" mean in this context? It can mean different things, but in the example below, it means that the class contains some attributes that don’t need to be included in the regular string representation, but which we may want to include for debugging or logging purposes.

We will implement a popular Point class, but we will make it a little more complex:

  • Its main attributes are x and y, defining the point’s coordinates.
  • It also has an optional group attribute which defines an instance’s group membership; it can be a group like the species in the famous Iris dataset.
  • You can also add a comment to the class’s instances. It can be any comment, such as "Correct the group", "Double-check the coordinates" or "Possible mistake". Comments are not used in comparisons – just as a source of information about a particular point; we will see this in the code below.

This is the implementation of the Point class:

from typing import Optional

class Point:
    def __init__(
        self,
        x: float,
        y: float,
        group: Optional[str] = None,
        comment: Optional[str] = None) -> None:
        self.x = x
        self.y = y
        self.group = group
        self.comment = comment

    def distance(self, other: "Point") -> float:
        """Calculates the Euclidean distance between two Point instances.

        Args:
            other: Another Point instance.

        Returns:
            The distance between two Point instances, as a float.

        >>> p1 = Point(1, 2)
        >>> p2 = Point(3, 4)
        >>> p1.distance(p2)
        2.8284271247461903
        >>> p1.distance(Point(0, 0))
        2.23606797749979
        """
        dx = self.x - other.x
        dy = self.y - other.y
        return (dx**2 + dy**2)**.5

    def __str__(self) -> str:
        """String representation of self.

        >>> p1 = Point(1, 2, "c", "Needs checking")
        >>> p1
        Point(x=1, y=2, group=c)
        Comment: Needs checking
        >>> print(p1)
        Point(1, 2, c)

        When group is None, __str__() and __repr__() will
        provide different representations:
        >>> p2 = Point(1, 2, None)
        >>> p2
        Point(x=1, y=2, group=None)
        >>> print(p2)
        Point(1, 2)
        """
        if self.group is not None:
            return f"Point({self.x}, {self.y}, {self.group})"
        return f"Point({self.x}, {self.y})"

    def __repr__(self) -> str:
        msg = (
            f"Point(x={self.x}, y={self.y}, "
            f"group={self.group})"
        )
        if self.comment is not None:
            msg += (
                "\n"
                f"Comment: {self.comment}"
            )
        return msg

    def __eq__(self, other) -> bool:
        """Compare self with another object.

        Group must be provided for comparisons.
        Comment is not used.

        >>> Point(1, 2, "g") == 1
        False
        >>> Point(1, 2, "c") == Point(1, 2, "c")
        True
        >>> Point(1, 2) == Point(1, 2)
        False
        >>> Point(1, 2) == Point(1, 3, "s")
        False
        """
        if not isinstance(other, Point):
            return False
        if self.group is None:
            return False
        return (
            self.group == other.group
            and self.x == other.x
            and self.y == other.y
        )

if __name__ == "__main__":
    import doctest

    doctest.testmod()

Let’s analyze the differences between __repr__() and __str__():

The level of detail

As mentioned above, such comments are not usually necessary in the regular string representation of a class instance. Therefore, we don’t need to include them in __str__(). However, when we’re debugging, comments can be extremely helpful, especially when they provide critical information about a particular class instance.

This is why we should include comments in __repr__() but not in __str__(). Consider this example:

>>> p1 = Point(1, 2, "c", "Needs checking")
>>> p1
Point(x=1, y=2, group=c)
Comment: Needs checking
>>> print(p1)
Point(1, 2, c)

A more detailed picture

In our implementation, the two methods provide different pictures of a class instance. Compare

Point(x=1, y=2, group=c)
Comment: Needs checking

with

'Point(1, 2, c)'

In addition to providing a comment, __repr__() offers a more detailed picture than __str__() does by providing attribute names. This may not be that big a difference in this particular class, but when a class has more attributes you want to include in the string representation and their names are longer than here, the difference could be much more visible. Even here, however, __str__() does offer much more concise information about an instance than __repr__() does.

Recreating an instance from __repr__()

We mentioned this one, too. If possible, it’s a good practice to provide in __repr__() all the information required to recreate the instance. Here, __str__() is not enough for us to do so:

>>> str(p1)
'Point(1, 2, c)'
>>> p1_recreated_from_str = Point(1, 2, "c")
>>> p1
Point(x=1, y=2, group=c)
Comment: Needs checking
>>> p1_recreated_from_str
Point(x=1, y=2, group=c)

It doesn’t matter here that comments are not used to compare instances and that, for that reason, p1 == p1_recreated_from_str returns True:

>>> p1 == p1_recreated_from_str
True

This only says that from the user’s point of view these two instances are equal. From a developer’s point of view, however, they are not: p1 is not the same as p1_recreated_from_str. If we want to fully recreate p1, we need to use its __repr__() representation:

>>> p1
Point(x=1, y=2, group=c)
Comment: Needs checking
>>> p1_recreated_from_repr = Point(
...     1, 2, "c", comment="Needs checking")
>>> p1_recreated_from_repr
Point(x=1, y=2, group=c)
Comment: Needs checking
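The __repr__() of Point carries all the needed information, but a person still has to retype the call by hand. A possible variation, sketched below with a hypothetical EvalPoint class that is not part of the implementation above, returns a string that is itself valid Python, so the instance can be recreated programmatically:

```python
from typing import Optional


class EvalPoint:
    """Hypothetical variant of Point whose __repr__() is valid Python."""

    def __init__(
        self,
        x: float,
        y: float,
        group: Optional[str] = None,
        comment: Optional[str] = None,
    ) -> None:
        self.x = x
        self.y = y
        self.group = group
        self.comment = comment

    def __repr__(self) -> str:
        # !r quotes string attributes, so the output can be evaluated.
        return (
            f"EvalPoint(x={self.x!r}, y={self.y!r}, "
            f"group={self.group!r}, comment={self.comment!r})"
        )


p = EvalPoint(1, 2, "c", "Needs checking")
print(repr(p))
# EvalPoint(x=1, y=2, group='c', comment='Needs checking')

# eval() is shown only for illustration; avoid it on untrusted input.
p_copy = eval(repr(p), {"EvalPoint": EvalPoint})
print(vars(p_copy) == vars(p))  # True
```

The trade-off is a longer, single-line representation; whether that is preferable to the two-line format used in Point depends on how the output will be read.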

Conclusion

I hope that reading this article has helped you see the subtle differences between repr() and str(), and between __repr__() and __str__(). Such subtleties may not be required for intermediate Python users, but if you want to be an advanced Python user or developer, it is precisely such subtleties that you need to know and use in your daily coding.

This is just the tip of the iceberg, but I won’t leave you with just this. We have discussed such subtleties of Python before, and we will discuss them even more in future articles.


Thanks for reading. If you enjoyed this article, you may also enjoy other articles I wrote; you will see them here. And if you want to join Medium, please use my referral link below:

Join Medium with my referral link – Marcin Kozak

