Have you ever considered why it is considered bad practice to subclass the list
, dict
, set
, and tuple
classes? Today, I will delve into the reasons behind this practice and present alternative classes that streamline the implementation and extension of built-in-like objects.
Why it’s considered bad practice
Subclassing built-in types like dict
, set
, list
, and tuple
can introduce complications due to method delegation, initialization quirks, unexpected behavior from internal optimizations, and potential performance hits. These issues stem from the fact that built-in types are implemented in C, leading to some methods not being overridden as expected or performing sub-optimally when subclassed. Instead, Python offers more robust alternatives through composition and the abstract base classes in the collections
module, which are specifically designed for extension and customization without the pitfalls of direct subclassing.
Note
Mixin classes are generally considered bad practice, as is multiple inheretance. So this post is taking the path of least evil, as this is the tooling the standard library provides for this task.
If you are not interested in inheriting behavior, a Protocol based approach may be more suitable. These classes should only generally be used when implementing built-in-like objects where you intend to override core behaviors.
The standard library is littered with weird code due to many additions to features over the years. This particular oddity is a result of the addition of type hints.
What are the alternatives?
Python provides a number of alternatives in the collections
module. For simple dicts and lists, there are the UserDict
and UserList
classes.
But for Tuples. Sets and more complex list and dict like objects we have a number of abstract mixin classes that allow us to create our own implementations. Provided by the collections.abc
module, these classes are:
dict
=>MutableMapping
set
=>MutableSet
list
=>MutableSequence
tuple
=> does not have a direct mixin alternative, and insteadSequence
andHashable
would have to be used.
This is not an exhaustive list of the mixin classes, see a complete list here.
Implementing the Mixins
Each of these classes require you to implement a number of methods, and by doing so, Python will provide the rest of the interface based on these methods. Here are the basic interfaces, and the extra functions they provide.
MutableMapping
To achieve a dict
like interface, we would need to implement these method:
class MyMapping(MutableMapping):
def __getitem__(self, key):
...
def __setitem__(self, key, item):
...
def __delitem__(self, key):
...
def __iter__(self):
...
def __len__(self):
...
This will provide the following methods:
__contains__
`__eq__
__ne__
keys
items
values
get
popitem
clear
update
setdefault
MutableSequence
Just like the previous example, we implement the basic methods, and this will provide us with a list
like interface.
class MySequence(MutableSequance):
def __getitem__(self, index):
...
def __setitem__(self, index, item):
...
def __delitem__(self, index):
...
def __iter__(self):
...
def __len__(self):
...
def insert(self, item)
This will offer the following methods:
__contains__
__iter__
__reversed__
__iadd__
index
count
append
clear
reverse
extend
pop
remove
MutableSet
Like the previous examples, it only takes a handful of methods to provide us with a set
like interface.
class MySet(MutableSet):
def __contains__(self, item):
...
def __iter__(self):
...
def __len__(self):
...
def add(self, item):
...
def discard(item):
...
This will offer the following methods:
__le__
__lt__
__eq__
__ne__
__gt__
__ge__
__and__
__or__
__sub__
__xor__
__ior__
__iand__
__ixor__
__isub__
isdisjoint
clear
pop
remove
Hashable
& Sequence
This example differs slightly from the previous example, here we have to make use of multiple inheritance to enable the creation of the tuple
like interface. This is because Sequence
provides the collection interface, but does not provide the Hashable
interface, which is required of a tuple
like object.
class MyHashableSequance(Hashable, Sequence):
def __hash__(self):
...
def __getitem__(self, index):
...
def __iter__(self):
...
def __len__(self):
...
This will offer the following methods:
__contains__
__iter__
__reversed__
index
count
As you can see, by implementing a minimal number of methods, we can benefit from a wealth of additional functionality. If necessary, we also have the option to override those methods. These classes will align with the same interface as the pre-existing types, enabling them to be utilized interchangeably. This provides a robust and extensible alternative to extending built-in collections through inheritance.
Final thoughts
The collections.abc
module offers a streamlined way to extend Python’s data structures efficiently and in line with the language’s design principles.
The abstract base classes in collections.abc
ensure that custom collections adhere to Python’s expected interfaces and behaviours, enhancing usability and integration with other parts of the Python ecosystem. This consistency is especially beneficial in large-scale applications, where predictable data structure behaviour is essential.
Ultimately, it is more advantageous to utilize collections.abc
instead of directly subclassing built-in types like list
, dict
, set
, and tuple
. By embracing this Pythonic approach, custom collections become more robust, maintainable, and well-integrated into the Python environment. When extending a built-in data structure, consider turning to the collections.abc
module first.