
Understanding *args and **kwargs in Python

Learn how to pass a variable number of arguments to functions in Python

Photo by Chris Ried on Unsplash

If you are a beginning Python programmer, you might come across function declarations with parameters that look like this:

def do_something(num1, num2, *args, **kwargs):

The * and ** operators above allow you to pass a variable number of arguments to the function. However, this feature often confuses Python programmers. This article aims to clarify what the * and ** operators do and how you can use them in your daily programming tasks.

Passing in a variable number of arguments using *

First, consider the following function definition:

def do_something(num1, num2, *args):
    print(num1)
    print(num2)

To call do_something(), you need to pass in:

  • two mandatory arguments, followed by
  • an optional, variable number of additional arguments

Let’s try this out and see what happens. We can first call do_something() like this:

# calling the function with two mandatory arguments
do_something(5,6)

You should now see:

5
6

You can also pass additional arguments after the first two arguments, like this:

do_something(5,6,7,8,9)

The third argument onwards is passed into the args variable in the do_something() function. The data type of args is tuple, so you can use a for-in loop within the function to extract the individual optional arguments:

def do_something(num1, num2, *args):
    print(num1)
    print(num2)
    for arg in args:
        print(arg)

And so your output will now look like this:

5
6
7
8
9
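To confirm that the extra arguments really arrive as a tuple, here is a minimal sketch. The helper name describe_args() is hypothetical and used only for illustration:

```python
def describe_args(num1, num2, *args):
    # args collects every positional argument after the first two
    return type(args).__name__, args, len(args)

print(describe_args(5, 6))           # ('tuple', (), 0)
print(describe_args(5, 6, 7, 8, 9))  # ('tuple', (7, 8, 9), 3)
```

Note that when no extra arguments are passed, args is an empty tuple rather than None.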

The following calls to the do_something() function are all valid:

do_something(5,6,7)
do_something(5,6,7,8)
do_something(5,6,7,8,9)
do_something(5,6,7,8,9,True)
do_something(5,6,7,8,9,True,3.14)

While you can pass in arguments of different types for the optional parameter, it is more common (and logical) to pass in arguments of the same type. For example, you might have a function named average() to calculate the average of a set of numbers:

def average(*args):
    return sum(args) / len(args)
print(average(1,2))            # 1.5
print(average(1,2,3))          # 2.0
print(average(1,2,3,4,5))      # 3.0
print(average(1,2,3,4,5, 6.1)) # 3.516666666666667

In this case, it makes a lot of sense to pass in arguments of the same type (numeric type).
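One caveat: because every argument to average() is optional, calling it with no arguments divides by zero. A possible variation, sketched here as an assumption rather than as part of the original example, combines one mandatory parameter with *rest so that at least one number is always required:

```python
def average(first, *rest):
    # first is mandatory, so the function can never be called with zero numbers
    numbers = (first, *rest)
    return sum(numbers) / len(numbers)

print(average(4))        # 4.0
print(average(1, 2, 3))  # 2.0
```

With this signature, calling average() with no arguments raises a TypeError at the call site instead of a ZeroDivisionError inside the function.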

It is worth mentioning that the name of the variable parameter is not constrained to "args". You can give it any name you like; only the * prefix matters.
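As a quick illustration, the sketch below uses the hypothetical names total() and numbers in place of args:

```python
def total(*numbers):
    # "numbers" behaves exactly as "args" would; it is still a tuple
    return sum(numbers)

print(total(1, 2, 3))  # 6
print(total())         # 0
```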

Unpacking arguments using *

In the previous section you saw how to pass in a variable number of arguments to a function like this:

do_something(5,6,7,8,9)

What if your optional arguments are stored in a list or tuple, like this:

values = (7,8,9)   # tuple

In this case, to pass in the values variable to the do_something() function, you can use the * operator to unpack the values so that they can be passed into the function:

do_something(5,6,*values)

The * operator works with lists as well:

values = [7,8,9]   # list
do_something(5,6,*values)
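The sketch below shows what the function actually receives with and without the * operator. The helper collect() is a hypothetical stand-in for do_something() that returns args instead of printing it:

```python
def collect(num1, num2, *args):
    # Return the extra positional arguments so they can be inspected
    return args

values = [7, 8, 9]

print(collect(5, 6, *values))         # (7, 8, 9)
print(collect(5, 6, *range(7, 10)))   # (7, 8, 9)
print(collect(*[5, 6], *values))      # (7, 8, 9)
print(collect(5, 6, values))          # ([7, 8, 9],)
```

The * operator works with any iterable, and you can unpack more than one in the same call. Without it (the last line), the whole list is passed as a single argument.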

Passing in "keyworded" arguments using **

The second form of variable arguments is slightly more confusing for the beginner. Consider the following function definition:

def do_something(num1, num2, **kwargs):

The kw in kwargs stands for keyword (this article refers to such arguments as "keyworded"). Again, you are free to name this parameter anything you like; only the ** prefix matters.

Besides passing in the first two mandatory arguments, you can now pass in any number of optional "keyworded" arguments. An example should make this clear:

do_something(5,6,x=7)
do_something(5,6,x=7,y=8)

Personally, I prefer to call x=7 and y=8 key/value pairs.

In the above example, you can pass in additional arguments by specifying the keys (x and y), and their corresponding values (7 and 8).

Within the function, the keyworded pairs are passed in to kwargs as a dictionary. And so you can extract them using the for-in loop as shown below:

def do_something(num1, num2, **kwargs):
    print(num1)
    print(num2)
    # get each key/value pair one at a time
    for k,v in kwargs.items():
        print(k,v)

Of course, you can also iterate over the keys and look up each value:

def do_something(num1, num2, **kwargs):
    print(num1)
    print(num2)
    for k in kwargs.keys():    # retrieve all the keys
        print(k, kwargs[k])    # then get the value based on the key

In either case, the output will look like this:

5
6
x 7
y 8
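To see that the keyworded pairs really arrive as a dictionary, here is a minimal sketch. The names show_options() and options are hypothetical, chosen to show that the parameter need not be called kwargs:

```python
def show_options(num1, num2, **options):
    # options is an ordinary dict mapping each keyword to its value
    return type(options).__name__, options

print(show_options(5, 6))            # ('dict', {})
print(show_options(5, 6, x=7, y=8))  # ('dict', {'x': 7, 'y': 8})
```

When no keyworded arguments are passed, the dictionary is simply empty.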

Uses of **kwargs

A common question about **kwargs is when it is actually useful. Consider the following case where you have a dataframe with four columns:

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(5,4), 
                  columns=['Summer','Winter','Autumn','Spring'])
df

The dataframe looks like this:

The Pandas dataframe for the example

Suppose you are asked to write a function, say fetch_data(), that lets users extract the rows of one or more columns whose cell values are greater than some threshold. For example, you want to extract all the rows from the Summer column with values greater than 0.5. In addition, you may also want to extract rows from the Winter column with values greater than 0.2. Your function should therefore be flexible enough to accept one or more column names along with their threshold values, like this:

fetch_data(Summer=0.5)             # retrieve only the Summer column
fetch_data(Summer=0.5, Winter=0.2) # retrieve the Summer and Winter columns

To implement this capability, your function can accept these columns and values using keyworded pairs, like this:

def fetch_data(**kwargs):
    for k,v in kwargs.items():
        print(df[[k]][df[k]>v])
        # df[[k]] first returns the specified column (k) as a 
        # dataframe, then [df[k]>v] retrieves all the rows whose 
        # cell value is more than the value specified (v)

So the following call to fetch_data():

fetch_data(Summer=0.5, Winter=0.2)

yields output like the following (your values will differ because the data is random):

     Summer
0  0.842614
2  0.767157
4  0.935648
     Winter
0  0.843960
3  0.663104
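Another common use of **kwargs, beyond the dataframe filter above, is forwarding arguments to another function without having to repeat its parameter list. The sketch below uses two hypothetical functions, describe() and shout(), purely for illustration:

```python
def describe(name, greeting="Hello", punctuation="!"):
    return f"{greeting}, {name}{punctuation}"

def shout(*args, **kwargs):
    # Pass every argument through unchanged, then post-process the result
    return describe(*args, **kwargs).upper()

print(shout("Ada"))                 # HELLO, ADA!
print(shout("Ada", greeting="Hi"))  # HI, ADA!
```

Because shout() never names the parameters of describe(), it keeps working if describe() later gains new optional parameters.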

Unpacking arguments using **

Just as you can unpack a list of values using the * operator and pass it to a function with variable arguments, you can also use the ** operator to unpack a dictionary and pass it to a function that accepts keyworded arguments, like this:

kwpairs = {'Summer':0.5, 'Winter':0.2}    # dictionary
fetch_data(**kwpairs)
# same as
fetch_data(Summer=0.5, Winter=0.2)
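Dictionary unpacking is not limited to functions that declare **kwargs. As a sketch that runs without pandas, the hypothetical functions area() and gather() below show that the dictionary keys are matched against parameter names, and that unpacked pairs can be mixed with explicit keyworded arguments:

```python
def area(width, height):
    return width * height

dims = {"width": 3, "height": 4}
print(area(**dims))  # 12

def gather(**kwargs):
    # Return the collected key/value pairs for inspection
    return kwargs

defaults = {"Summer": 0.5}
print(gather(**defaults, Winter=0.2))  # {'Summer': 0.5, 'Winter': 0.2}
```

The keys must be strings, and passing the same key twice (for example, once in the dictionary and once explicitly) raises a TypeError.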

Order of variable arguments and keyworded arguments

In a function declaration, you can have the following types of arguments:

  • mandatory arguments, and/or
  • variable arguments, and/or
  • keyworded arguments

However, if a function accepts keyworded arguments, the **kwargs parameter must always be placed last in the function declaration. The following function declarations are not valid:

def do_something(**kwargs, *args):
    ...
def do_something(*args, **kwargs, num1):
    ...
def do_something(**kwargs, num1):
    ...
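If you want to check declarations like these without crashing your script, one option is to hand the source text to the built-in compile() function and catch the SyntaxError. This is only a sketch, and the helper is_valid() is a hypothetical name:

```python
invalid_declarations = [
    "def do_something(**kwargs, *args): ...",
    "def do_something(*args, **kwargs, num1): ...",
    "def do_something(**kwargs, num1): ...",
]

def is_valid(source):
    # compile() parses the source without running it
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

for source in invalid_declarations:
    print(is_valid(source), source)  # False for each declaration
```

Python rejects all three at parse time, before any call is made.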


The following examples show some valid function declarations:

def do_something(num1, *args, **kwargs):
    ...
def do_something(num1, num2, *args):
    ...
def do_something(*args, **kwargs):
    ...
def do_something(**kwargs):
    ...
def do_something(*args, num1):
    ...
def do_something(*args, num1, **kwargs):
    ...

I will leave it as an exercise for the reader to try out the above function declarations and explore how values can be passed into them.
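As a starting point for that exercise, here is a sketch of the last declaration. A named parameter placed after *args can only be supplied by keyword, because *args absorbs every positional argument:

```python
def do_something(*args, num1, **kwargs):
    # Return all three parts so they can be inspected
    return args, num1, kwargs

print(do_something(1, 2, num1=3, x=4))  # ((1, 2), 3, {'x': 4})

try:
    do_something(1, 2, 3)   # num1 is never filled positionally
except TypeError as err:
    print("TypeError:", err)
```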

I hope you now have a better understanding of *args and **kwargs!

