Watch out for mutable defaults in function arguments

Default arguments in functions are useful. When you write a function with defaults, you do so with the understanding that other developers may not wish to alter options that are common or that change infrequently. Defaults imply a contract, both a technical one and a social one with your fellow coders: the default values will not change. If they do, you risk invalidating the assumptions that you or your fellow developers made when first calling your function. But there are parts of Python’s language design that can break that contract if you are not careful.

What is mutability?

Mutability is another way of saying that something is changeable. When an object in Python is mutable, it means you can alter its internal state, for example by adding a key-value pair to a dictionary or appending elements to a list.
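As a minimal illustration (the variable names are invented for this example), both a list and a dictionary can be changed in place, and every name that refers to the same object sees the change:

```python
# Lists and dictionaries are mutable: they can be changed in place.
shopping = ['milk']
alias = shopping             # a second name for the SAME list object
shopping.append('eggs')      # mutates the list; no new list is created

stock = {'milk': 2}
stock['eggs'] = 12           # adds a key-value pair to the same dict

print(shopping)
print(alias)                 # the alias sees the appended element too
print(stock)
```

The alias matters for the rest of this article: a default argument is, in effect, just another name for one long-lived object.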

In contrast, a tuple or a frozenset is immutable. You cannot add or remove elements after you have created it.

By the way …

Note that immutability in this case only extends to the tuple or frozenset object itself and not necessarily to the elements inside it! A tuple of lists is perfectly legitimate, for example.

Compound objects that you create yourself to hold state, like an Employee object with a salary field, are another example of mutable objects. In fact, very few things in Python are truly immutable due to the design of the language:

>>> import math
>>> math.answer = 42
>>> print(f'What is the answer to Life, the Universe, and Everything else? {math.answer}')
What is the answer to Life, the Universe, and Everything else? 42

Here I’ve imported the math module and added a constant, answer, to the module to show that with few exceptions almost nothing is truly immutable in Python.
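Tuples are one of those exceptions, subject to the caveat in the note above. The following sketch (with made-up names) shows that the tuple itself rejects changes, while a list stored inside a tuple can still be modified:

```python
point = (1, 2)
try:
    point[0] = 10                # tuples do not support item assignment
except TypeError as exc:
    print(f'TypeError: {exc}')

# The tuple is fixed, but a list stored inside it is still an ordinary list.
teams = (['Jerry'], ['Elaine'])
teams[0].append('George')        # allowed: mutates the inner list, not the tuple
print(teams)
```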

Mutable Defaults

Now that I’ve clarified what I mean by mutability, it’s worth looking at what a mutable default value is. But ask yourself: when would you ever want one? If something is a default value, we expect it to be static and unchanging. Even though the word default does not strictly mean static or immutable, most of us intuitively assume that a default value should not change.

Now consider what happens if you have a little script like this one:

# merge_customers.py
from collections import namedtuple

Customer = namedtuple('Customer', 'age name')

def merge_customers(new_customers: list, existing_customers: list = []):
    existing_customers.extend(new_customers)
    return existing_customers

kramer = Customer(name='Kramer', age=42)
george = Customer(name='George', age=37)

merged_customers = merge_customers([george])
print(merged_customers)

merged_customers = merge_customers([kramer])
print(merged_customers)

The merge_customers function simply merges two lists into one and returns it. If you give it only new_customers, you would expect it to use its default argument of existing_customers = [] to create an empty list for you and then merge the new customers into that. If you give it an optional list of existing customers, then it will of course use that list in lieu of the default value.

So what happens if I run merge_customers twice, as I do above?

Well…

$ python merge_customers.py
[Customer(age=37, name='George')]
[Customer(age=37, name='George'), Customer(age=42, name='Kramer')]

OK, so that is probably not what you expected to see. merge_customers did not reset the default value back to an empty list when I called it a second time.

The reason is that Python evaluates default values only once, when the def statement runs (here, when the file is loaded), and not each time the function is called. It assigned [] (which is a mutable object, remember?) as the default value of existing_customers, so every call to merge_customers that omits the argument uses the same list object. You can test this by printing the object’s internal ID and replacing the [] default with a call to a custom function that reports when it is called. Do that and you will see that the IDs are in fact identical between invocations:

def make_list():
    l = []
    print(f"Creating a new list. ID={id(l)}")
    return l

def merge_customers(new_customers: list, existing_customers: list = make_list()):
    print(f'ID={id(existing_customers)}')
    existing_customers.extend(new_customers)
    return existing_customers

Running the modified version yields the following answer:

Creating a new list. ID=140657205032768
ID=140657205032768
[Customer(age=37, name='George')]
ID=140657205032768
[Customer(age=37, name='George'), Customer(age=42, name='Kramer')]

As you can see, make_list was called only once, when the file was run. Intuitively, you’d expect it to be called twice, but you now know why it was not. The IDs, indeed, are also the same.

The id() function takes an object and returns an integer that uniquely identifies it for as long as the object exists. It is useful if you want to know whether two objects are one and the same.
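Another way to see the shared object, sketched here with plain strings standing in for the Customer tuples, is to compare the returned lists directly and to look at the function’s __defaults__ attribute, which is where Python stores the evaluated default values:

```python
def merge_customers(new_customers: list, existing_customers: list = []):
    existing_customers.extend(new_customers)
    return existing_customers

first = merge_customers(['George'])
second = merge_customers(['Kramer'])

# Both calls returned the very same list object.
print(first is second)
print(id(first) == id(second))

# The default list lives on the function object and has grown with each call.
print(merge_customers.__defaults__)
print(merge_customers.__defaults__[0] is first)
```

Every check prints True, and __defaults__ shows a one-element tuple holding the list with both names in it.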

So what is the fix? The fix is not to put anything mutable in the default value. Instead, default to None, check for it inside the function, and create the empty list there:

from typing import Optional

def merge_customers(new_customers: list, existing_customers: Optional[list] = None):
    if existing_customers is None:
        existing_customers = []
    existing_customers.extend(new_customers)
    return existing_customers

This is a fool-proof way of ensuring that stale data from an earlier call does not persist in the default value.
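A quick check of the None-based version, again using plain strings instead of Customer tuples to keep the sketch short, suggests that each call without existing_customers now gets its own list:

```python
from typing import Optional

def merge_customers(new_customers: list, existing_customers: Optional[list] = None):
    if existing_customers is None:
        existing_customers = []      # a fresh list is created on every call
    existing_customers.extend(new_customers)
    return existing_customers

first = merge_customers(['George'])
second = merge_customers(['Kramer'])
print(first)
print(second)
print(first is second)               # False: two separate lists

# A list passed in by the caller is still extended in place.
mine = ['Jerry']
merge_customers(['Elaine'], mine)
print(mine)
```

Note the last part: the fix removes the hidden shared default, but a list that the caller supplies explicitly is still modified, because extend works in place.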

Alternatively, of course, you could exploit this quirk and build your software around mutable defaults. But that is not recommended: it is unintuitive and it assumes that Python will never reload or re-evaluate the module. You also run the risk of others spotting the “mistake” and fixing it, causing logic errors in your code.
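For illustration only, here is what such a deliberate use might look like: a hypothetical slow_square function (not from the original code) that uses a mutable default dictionary as a cache. It is a sketch of the quirk being exploited, not a recommendation:

```python
def slow_square(n: int, _cache: dict = {}):
    # _cache is deliberately a mutable default: the same dict is reused
    # on every call, so results persist between invocations.
    if n not in _cache:
        _cache[n] = n * n
    return _cache[n]

print(slow_square(4))
print(slow_square(4))                 # answered from the shared cache
print(slow_square.__defaults__[0])    # the cache is visible on the function
```

If you genuinely need caching, the standard library’s functools.lru_cache decorator states the intent far more clearly than a hidden default.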

Watch out for Side Effects

Another pernicious issue that relates to mutability is code with side-effects.

A function with side effects is a function that alters (mutates) state that exists outside the function itself. For instance, a function create_user may talk to a database or API to create the user before returning.

For the exact same reasons as above, you should avoid default values whose evaluation has side effects:

def open_database(connection = make_connection(host='foo.example.com')):
    print(f'Connected to {connection}!')
    # ... do something with the connection ...
    connection.close()

This code suffers from the same problem as the earlier example. Using the result of a call to make_connection as a default value will most surely break: the connection is established only once, and subsequent calls fail because the connection is closed (and never re-opened) after the first call. Furthermore, the connection is made when the file is loaded, which could happen minutes or hours before it is used, at which point the connection may have timed out; indeed, the connection may fail on load, resulting in an exception that crashes your application.
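One way to avoid this is the same None pattern as before. The sketch below is an assumption-laden stand-in: FakeConnection and this make_connection are hypothetical placeholders invented for the example (the real make_connection is not shown in this article), and the function returns the connection only so the result can be inspected:

```python
from typing import Optional

class FakeConnection:
    """Hypothetical stand-in for a real database connection."""
    def __init__(self, host: str):
        self.host = host
        self.closed = False

    def close(self):
        self.closed = True

    def __repr__(self):
        return f'FakeConnection(host={self.host!r})'

def make_connection(host: str) -> FakeConnection:
    # Assumed helper: a real one would open a network connection here.
    return FakeConnection(host)

def open_database(connection: Optional[FakeConnection] = None):
    if connection is None:
        # Created at call time, so every call gets its own connection.
        connection = make_connection(host='foo.example.com')
    print(f'Connected to {connection}!')
    # ... do something with the connection ...
    connection.close()
    return connection  # returned only for the demonstration below

first = open_database()
second = open_database()
print(first is second)   # False: each call made a new connection
```

Nothing is connected when the file is loaded, and a failure to connect surfaces at the call site, where it can be handled.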

Even things that seem innocuous can cause problems:

import datetime
def print_datetime(dt = datetime.datetime.now()):
    return str(dt)

Repeat calls to print_datetime will not return the current time; they return the time at which the function was defined.

import datetime

CURRENT_DATETIME = datetime.datetime.now()

def print_datetime(dt = CURRENT_DATETIME):
    return str(dt)

So, in conclusion, the best way to avoid problems with mutable defaults is to think of default values as constant assignments. In fact, if I rewrite the code slightly, as I have above, you’d spot the problem immediately, right?
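A sketch of the stale behaviour next to a possible fix follows. The names print_datetime_stale and print_datetime_fresh are made up so that both versions can sit side by side:

```python
import datetime
from typing import Optional

def print_datetime_stale(dt=datetime.datetime.now()):
    # The default was computed once, when this def statement ran.
    return str(dt)

def print_datetime_fresh(dt: Optional[datetime.datetime] = None):
    if dt is None:
        dt = datetime.datetime.now()   # evaluated on every call
    return str(dt)

# The stale version returns the same timestamp no matter when it is called.
print(print_datetime_stale() == print_datetime_stale())

# The fresh version looks up the time at call time, and still accepts
# an explicit value when the caller provides one.
print(print_datetime_fresh())
print(print_datetime_fresh(datetime.datetime(2020, 1, 1, 12, 0)))
```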

Summary

You must avoid mutable default arguments at all costs

There are few legitimate use-cases for mutable default arguments. Most people inadvertently write mutable defaults because their intuition about how Python evaluates and runs their code is wrong. You should only consider using mutable defaults when you have exhausted all other options. Make sure you carefully document your code and explain what you are doing.

You should also avoid code with side-effects

Even simple things, like getting the current time and date, are wrong in a default value. The call is evaluated when the function is defined, typically on module load. Always assume default values are constants that are evaluated once and never again.

Immutable objects like tuples or frozensets are usually safe to use as default values

But always remember that immutability is a property of a particular object and not necessarily any objects it in turn references. A mutable list can contain immutable elements; and an immutable tuple can contain mutable lists.
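To make that last caveat concrete, here is a small sketch with a hypothetical add_tag function whose default is an immutable tuple that happens to contain lists. The tuple cannot change, yet state still leaks between calls:

```python
def add_tag(tag: str, groups: tuple = ([], [])):
    # The tuple itself cannot be modified, but its first list can still grow.
    groups[0].append(tag)
    return groups

add_tag('a')
add_tag('b')

# Both tags ended up in the single list held by the default tuple.
print(add_tag.__defaults__[0])
```

A tuple of strings or numbers as a default would not have this problem, because nothing inside it can be changed.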
