PYTHON PROGRAMMING
Sometimes, Python type hinting can make things easier. True, not always – but at least in my opinion, quite often it does – provided it’s done wisely. Some disagree, but I am not going to argue with them: in my eyes, this is quite a subjective matter.
I wrote what I think about Python type hinting, how to use it to increase code readability, and how not to use it to do otherwise, in the following article:
Today, we’ll discuss what consistent-with and duck-type compatibility mean in terms of Python types.
Imagine you’re hinting the use of float, like in the function below:
from collections.abc import Sequence

def sum_of_squares(x: Sequence[float]) -> float:
    n, s = len(x), sum(x)
    return sum((x_i - s/n)**2 for x_i in x)
This is a typical statistical function, calculating the sum of squares of a variable, that is, the sum of squared deviations from its mean. It takes a sequence of floating-point numbers and returns a float.
As you see, to annotate the function, I used Sequence, a generic abstract base class available from collections.abc (before Python 3.9 you needed to use typing.Sequence). This means you can provide a list or a tuple – but you can’t provide, for instance, a generator¹.
Okay, so this is a statistical function, and it expects a sequence of floating-point numbers. That makes sense, right? But in real life, quantitative variables are quite often integers, like the number of mites per shoot, the number of items sold, the number of inhabitants, to name just a few.
Shouldn’t we thus do something with the function, to take this fact into account? We all know that dynamically, the function will work just fine for integers, and that dynamically we can easily mix integers and floating-point numbers in x. But what about type hints and static checkers?
Is it fine to pass int values to this function as it is, or should we rather make it clear that it accepts int values, too? Should we do it like below?
def sum_of_squares(x: Sequence[float | int]) -> float:
    n, s = len(x), sum(x)
    return sum((x_i - s/n)**2 for x_i in x)
This seems clear: you can use a sequence of floating-point numbers or integers, and the function returns a float. Isn’t this version better, at least from a type-hinting perspective?
To answer this question, let’s return to the previous version, without int. What do static type checkers say about that?
Not a word! Look what Pylance (in Visual Studio Code) says about it:
Nothing! Had Pylance seen a static error, we would’ve seen it underlined in red. And here is mypy’s opinion:
Why can you use int instead of float?
We’ve come to the main topic of this article. Long story short, you can use int instead of float when you hint float.
First, let’s have a look at the page of mypy’s documentation describing duck type compatibility:
Among other things, this is what we can read there:
In Python, certain types are compatible even though they aren’t subclasses of each other. For example, int objects are valid whenever float objects are expected. Mypy supports this idiom via duck type compatibility.
Ha!
Don’t worry, this doesn’t add much to what you need to know about type hints:
This is supported for a small set of built-in types:
int is duck type compatible with float and complex.
float is duck type compatible with complex.
bytearray and memoryview are duck type compatible with bytes.
So now we know. We do not have to hint int when we already hint the use of float. This will work in exactly the same way as float | int (or Union[float, int]). This means that the | int part in the hint is redundant.
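Here is a minimal sketch of what this means in practice, reusing the Sequence-based sum_of_squares() from above; the sample data are made up for illustration. The function is hinted with float only, yet it handles integers and mixed sequences alike, and a static checker such as mypy is expected to accept all three calls:

```python
from collections.abc import Sequence


def sum_of_squares(x: Sequence[float]) -> float:
    """Return the sum of squared deviations from the mean of x."""
    n, s = len(x), sum(x)
    return sum((x_i - s / n) ** 2 for x_i in x)


# Made-up sample data: floats only, integers only, and a mix of both.
floats_only = sum_of_squares([1.0, 2.0, 3.0])
ints_only = sum_of_squares([1, 2, 3])
mixed = sum_of_squares((1, 2.0, 3))

# The three results are equal, because the values are numerically the same.
print(floats_only, ints_only, mixed)
```

No | int was needed in the annotation for any of these calls.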
And just like int is duck type compatible with float, it is also duck type compatible with complex, and float is duck type compatible with complex, and both bytearray and memoryview are duck type compatible with bytes.
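The complex case can be sketched the same way. The magnitude() function below is a hypothetical helper written only for this illustration; it is hinted with complex, but an int or a float argument should be fine both dynamically and for a static checker:

```python
def magnitude(z: complex) -> float:
    """Return the absolute value of z as a float (hypothetical helper)."""
    # complex() accepts int, float and complex alike,
    # and abs() of a complex number is always a float.
    return abs(complex(z))


from_int = magnitude(3)            # int where complex is hinted
from_float = magnitude(-2.5)       # float where complex is hinted
from_complex = magnitude(3 + 4j)   # complex itself

print(from_int, from_float, from_complex)
```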
Okay, that’s mypy. Now, let’s look into my favorite Python book, one I’ve referred to quite a lot in my articles: Fluent Python, 2nd ed., by Luciano Ramalho:
To learn what’s going on in here, we should move to where Luciano explains what consistent-with means. What he writes is that we don’t have to add int to a float type hint because int is consistent-with float.
But what does consistent-with mean? (And yes, Luciano does use the hyphen in and the italics for consistent-with every time, unlike PEP 484.)
As he explains, T2 is consistent-with T1 when T2 is a subtype of T1. In other words, a subclass is consistent-with all its superclasses – with some exceptions that widen the definition of consistent-with. Based on this section of PEP 484, Luciano explains that the definition also comprises the above-cited scenarios with numbers.
And when we add the scenario with the types consistent-with bytes, we will have the following definition of consistent-with:
T2 is consistent-with T1 when:
T2 is a subtype of T1, or
T2 is duck type compatible with T1.
What we need to remember is that if one type is consistent-with another type, it’s either its subtype (subclass), or it is duck type compatible with it – which boils down to the fact that it’s enough to type hint the latter; you can simply omit the former.
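The two routes to being consistent-with can be sketched side by side. Both Celsius and double() are hypothetical names invented for this illustration: Celsius is consistent-with float because it is a subclass of it, while int is consistent-with float only through duck type compatibility:

```python
class Celsius(float):
    """A hypothetical float subclass, used only for illustration."""


def double(value: float) -> float:
    """Return value multiplied by two (hypothetical helper)."""
    return value * 2


# Route 1: a subtype. Celsius is a real subclass of float.
print(issubclass(Celsius, float))  # True

# Route 2: duck type compatibility. int is not a subclass of float...
print(issubclass(int, float))  # False

# ...but both kinds of argument match the float hint.
from_subclass = double(Celsius(21.5))
from_int = double(4)
print(from_subclass, from_int)
```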
To be honest, I made this mistake quite often – I mean, I did this redundant thing, something like below:
from collections.abc import Sequence

def sum_of_squares(x: Sequence[float | int]) -> float:
    # The "| int" part is the redundant bit discussed here.
    n, s = len(x), sum(x)
    return sum((x_i - s/n)**2 for x_i in x)
I always thought I was making the user’s life easier, thanks to clarifying that x can include both integers and floating-point numbers.
Was I? I don’t know. For sure, I was making the code verbose. A person who does not know that int is duck type compatible with float may think, why only float? On the other hand, we should not write code just to make it easy to understand for those who do not know the language. Of course, there are some limits, but I don’t think this situation crosses the line. Besides, anyone who knows Python a little bit should know that where a float is expected, an int can be used; this is rather common knowledge. Anyway, this is one of the reasons why I’m writing this article – so that my readers know that not only can an int be used dynamically instead of a float, but also that this is fine from a static checker’s point of view.
Let’s return to the sum_of_squares() function. When you know about duck type compatibility, the concise version is just as clear, but shorter and thus cleaner:
from collections.abc import Sequence

def sum_of_squares(x: Sequence[float]) -> float:
    n, s = len(x), sum(x)
    return sum((x_i - s/n)**2 for x_i in x)
So, I could say that my lack of Python knowledge made me think I was doing a favor to the users of my code – now I know that I wasn’t.
Named tuples
With collections.namedtuple and typing.NamedTuple, the situation is similar, with a small difference. Both these types are subtypes of the regular tuple type, and this is why they are consistent-with it.
That’s why the below annotation is… Well, it’s not the best:
from collections import namedtuple
from typing import NamedTuple

def join_names(names: tuple | namedtuple | NamedTuple) -> str:
    return " ".join(names)
The function itself is not the smartest among those I’ve written, but that’s not the point. The point is, if you want to accept a tuple, a namedtuple and a NamedTuple, you can do it this way:
def join_names(names: tuple) -> str:
    return " ".join(names)
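A short sketch of why the plain tuple hint is enough. FullName and TypedFullName are hypothetical classes invented for this example; classes created with either named-tuple tool are real subclasses of tuple, so their instances match the hint:

```python
from collections import namedtuple
from typing import NamedTuple

# Hypothetical named tuple classes, for illustration only.
FullName = namedtuple("FullName", ["first", "last"])


class TypedFullName(NamedTuple):
    first: str
    last: str


def join_names(names: tuple) -> str:
    """Join the items of names with single spaces."""
    return " ".join(names)


# Both classes are subclasses of the regular tuple.
print(issubclass(FullName, tuple), issubclass(TypedFullName, tuple))

plain = join_names(("John", "Smith"))
classic = join_names(FullName("John", "Smith"))
typed = join_names(TypedFullName("John", "Smith"))
print(plain, classic, typed, sep=" | ")
```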
However, if you want to accept only one of the two named tuples, you can type hint it, for example:
from collections import namedtuple

def join_names(names: namedtuple) -> str:
    return " ".join(names)
And here, only instances of collections.namedtuple and of its subclasses can be used. You could of course indicate typing.NamedTuple the same way, and then a collections.namedtuple could not be used. Remember, if T1 is consistent-with T2, it does not mean that T2 is consistent-with T1.
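The one-way nature of consistent-with can be sketched with int and float. Both functions are hypothetical helpers written for this illustration. An int is fine where float is hinted, but a float where int is hinted is something a static checker is expected to report, and in this particular case it also fails at runtime:

```python
def half(value: float) -> float:
    """Return half of value (hypothetical helper)."""
    return value / 2


def repeat(word: str, times: int) -> str:
    """Return word repeated the given number of times (hypothetical helper)."""
    return word * times


# int is consistent-with float: this call is fine.
ok = half(3)

# float is NOT consistent-with int: a type checker is expected to flag
# this call, and here Python itself refuses it as well.
try:
    repeat("ab", 2.0)
    failed = False
except TypeError as error:
    failed = True
    print(f"TypeError: {error}")
```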
Conclusion
We learned what consistent-with and duck type compatibility mean. Don’t be afraid to use this knowledge in your code. You know how to respond to the following questions: "Why only float? What if I want to use an int?"
Footnotes
¹ That sum_of_squares() defined that way does not accept a generator makes plenty of sense. To see why, analyze the function’s body, keeping in mind how generators work.
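Note that a generator has no length, so calling len(x) fails before the function can calculate anything. And even if the length were known, a generator can be iterated over only once, while the function goes through x twice: first in sum(x) and then in the final expression. Look: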
>>> sum_of_squares((i for i in (1., 2, 34)))
Traceback (most recent call last):
  ...
    n, s = len(x), sum(x)
           ^^^^^^
TypeError: object of type 'generator' has no len()
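A minimal sketch of the same failure and one possible way around it, assuming the Sequence-based definition of sum_of_squares() used in this article. Turning the generator into a tuple first gives the function a real sequence to work with:

```python
from collections.abc import Sequence


def sum_of_squares(x: Sequence[float]) -> float:
    """Return the sum of squared deviations from the mean of x."""
    n, s = len(x), sum(x)
    return sum((x_i - s / n) ** 2 for x_i in x)


# Passing a generator fails at runtime, because len() does not work on it.
# A static checker is expected to flag this call, too.
try:
    sum_of_squares(i for i in (1., 2, 34))
    message = ""
except TypeError as error:
    message = str(error)
    print(f"TypeError: {message}")

# Materializing the generator into a tuple makes the call valid.
result = sum_of_squares(tuple(i for i in (1., 2, 34)))
print(result)
```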
Pylance screams:
mypy does not like it, either:
error: Argument 1 to "sum_of_squares" has incompatible type
"Generator[float, None, None]"; expected "Sequence[Union[float, int]]"
[arg-type]
Do you see how using a static type checker can help you catch errors that otherwise would be caught at runtime?
So, kudos to type hinting? Yes – but kudos to good type hinting!
Thanks for reading.