
Do Not Use “+” to Join Strings in Python

A comparison of two approaches to joining strings in Python: the + operator and the join() method.

Photo by tcausley on Pixabay

When I started using Python, it felt intuitive and easy to join strings with the plus operator +, as many programming languages such as Java do.

However, I soon realised that many developers seem to prefer the .join() method over +. In this article, I’ll explain the differences between these two approaches and why you should not use +.

Beginning

Photo by Dayne Topkin on Unsplash

As a beginner, or someone who has just switched from another language that uses + to join strings, it is very easy to write code like this:

str1 = "I love "
str2 = "Python."
print(str1 + str2)

As you use Python more and more, you may notice that other people prefer to use the join() method, like this:

str1 = "I love "
str2 = "Python."
print(''.join([str1, str2]))

Honestly, when I first saw this method, I thought it was not intuitive and looked kind of ugly.

Join Multiple Strings

Photo by Tim Boote on Unsplash

Nevertheless, one day I needed to join multiple strings in a list.

strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']

Initially, I did it like this:

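strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']

def join_strs(strs):
    result = ''
    for s in strs:
        # Put a white space in front of every string, including the first one
        result += ' ' + s
    # Drop the extra white space at the beginning
    return result[1:]

join_strs(strs)  # 'Life is short, I use Python'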

In this example, I have to write a for-loop to join the strings one by one. The result also needs its leading white space trimmed, because the loop adds a white space in front of every string, including the first one. You may have other solutions, such as tracking the index in the for-loop so that the string at index 0 does not get the white space. Either way, you still need the for-loop and some special handling for the white spaces.
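The index-based alternative mentioned above could look roughly like this sketch. The function name join_strs_indexed is made up for illustration and is not part of the original code.

```python
def join_strs_indexed(strs):
    """Join strings with single white spaces, using the loop index."""
    result = ''
    for i, s in enumerate(strs):
        if i > 0:
            # Only strings after the first one get a white space in front
            result += ' '
        result += s
    return result


strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']
print(join_strs_indexed(strs))  # Life is short, I use Python
```

No trimming is needed here, but the for-loop and the special case for the first string are still there.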

After that, I recalled that I had seen the .join() method before. Maybe this was the time to use it!

def join_strs_better(strs):
    return ' '.join(strs)
join_strs_better(strs)

How easy it is! One line of code does everything. Since the .join() method is called on a string object, that string is used as the separator between every pair of neighbouring strings in the list, so you don’t need to worry about a white space at the beginning.
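To make the role of the separator concrete, here is a small sketch with a few different separators. The variable names are only for illustration.

```python
words = ['Life', 'is', 'short']

# The string that join() is called on goes between neighbouring items only,
# never before the first item or after the last one.
spaced = ' '.join(words)
dashed = '-'.join(words)
glued = ''.join(words)
single = ' '.join(['Python'])  # one item, so the separator is never used

print(spaced)  # Life is short
print(dashed)  # Life-is-short
print(glued)   # Lifeisshort
print(single)  # Python
```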

But wait, do you really think this is the only reason to use the join() method rather than +? No, please read the next section.

Logic Behind join() Method

Photo by Michael Dziedzic on Unsplash

Now, let’s compare these two methods in terms of their performance. We can use the %timeit magic command of Jupyter Notebook to evaluate them.

The measurement is based on 100k trials, so the results are reliable and the difference is clear. Using the join() method can be 4 times faster than using + to join the strings in the list.
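If you are not in a Jupyter Notebook, a similar comparison can be sketched with the standard timeit module. This is not the original measurement: the variable names are my own, and the timings depend on your machine and Python version, so no expected numbers are shown.

```python
import timeit

strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']


def join_strs(strs):
    result = ''
    for s in strs:
        result += ' ' + s
    return result[1:]


def join_strs_better(strs):
    return ' '.join(strs)


# In a notebook, the equivalent would be: %timeit join_strs(strs)
trials = 100_000
plus_time = timeit.timeit(lambda: join_strs(strs), number=trials)
join_time = timeit.timeit(lambda: join_strs_better(strs), number=trials)

print(f'+ operator: {plus_time:.3f}s for {trials} trials')
print(f'join():     {join_time:.3f}s for {trials} trials')
```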

Why?

Here is a conceptual diagram that I drew to demonstrate the approach that uses + to join the strings.

Using + operator and for-loop to join strings in a list

This shows what the for-loop and the + operator do:

  1. In each iteration, the next string is fetched from the list.
  2. The Python interpreter evaluates the expression result += ' ' + s and requests memory for the white space ' '.
  3. Then, the interpreter sees that the white space needs to be joined with a string, so it requests memory for the string s, which is "Life" in the first iteration.
  4. In every iteration, the interpreter needs to request memory twice: once for the white space and once for the string.
  5. For the 6 strings in the list, that makes 6 × 2 = 12 memory allocations in total.
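One way to get a feel for this cost is to count the work under a simplified model in which every + builds a brand-new string and copies both operands into it. That model is an assumption for illustration only (CPython can sometimes avoid a copy), and the helper count_plus_work is hypothetical.

```python
def count_plus_work(strs, sep=' '):
    """Count new strings and copied characters for `result += sep + s`.

    Simplified model (an assumption): every + creates a new string and
    copies all characters of both operands into it.
    """
    new_strings = 0
    chars_copied = 0
    result = ''
    for s in strs:
        piece = sep + s          # first +: a temporary string
        new_strings += 1
        chars_copied += len(piece)
        result = result + piece  # second +: a new, longer result
        new_strings += 1
        chars_copied += len(result)
    return new_strings, chars_copied


strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']
new_strings, chars_copied = count_plus_work(strs)
print(new_strings)   # 12
print(chars_copied)  # 122
```

Under this model, two new strings appear per iteration, and the growing result is copied again and again, although the final string is only 27 characters long.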

However, what happens with the join() method?

Using "join()" method to join strings in a list

  1. The interpreter counts how many strings are in the list. There are 6.
  2. This means that the separator string needs to be repeated 6 - 1 = 5 times.
  3. It knows that 6 + 5 = 11 memory spaces are needed in total, so all of them are requested at once and allocated upfront.
  4. It puts the strings in order and returns the result.
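The steps above can be imitated in plain Python with a pre-sized bytearray. This is only a sketch of the "measure first, allocate once" idea, not how join() is actually implemented (the real one is written in C inside the interpreter), and join_preallocated is a made-up name.

```python
def join_preallocated(sep, strs):
    """Imitate join(): compute the final size, allocate once, then copy."""
    if not strs:
        return ''
    sep_bytes = sep.encode('utf-8')
    parts = [s.encode('utf-8') for s in strs]

    # Steps 1 to 3: count the pieces and work out the total size upfront
    total = sum(len(p) for p in parts) + len(sep_bytes) * (len(parts) - 1)
    buffer = bytearray(total)  # a single allocation for the whole result

    # Step 4: copy every piece into its place, in order
    pos = 0
    for i, part in enumerate(parts):
        if i > 0:
            buffer[pos:pos + len(sep_bytes)] = sep_bytes
            pos += len(sep_bytes)
        buffer[pos:pos + len(part)] = part
        pos += len(part)
    return buffer.decode('utf-8')


strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']
print(join_preallocated(' ', strs))  # Life is short, I use Python
```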

Therefore, the main reason for the performance improvement is the difference in the number of memory allocations.

Imagine that it is already 4x faster to use the join() method to join 6 strings together. What if we are joining a very large number of strings? The difference will be much larger!
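If you want to check this on your own machine, a rough sketch like the following times both approaches for growing list sizes. The function names, the list sizes and the number of repeats are arbitrary choices of mine, and the gap you see will depend on your Python version and hardware.

```python
import timeit


def join_with_plus(strs):
    result = ''
    for s in strs:
        result += ' ' + s
    return result[1:]


def join_with_join(strs):
    return ' '.join(strs)


def compare(n, repeats=20):
    """Return (seconds with +, seconds with join) for a list of n strings."""
    strs = ['word'] * n
    t_plus = timeit.timeit(lambda: join_with_plus(strs), number=repeats)
    t_join = timeit.timeit(lambda: join_with_join(strs), number=repeats)
    return t_plus, t_join


for n in (10, 1_000, 100_000):
    t_plus, t_join = compare(n)
    print(f'{n:>7} strings: + took {t_plus:.4f}s, join() took {t_join:.4f}s')
```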

Summary

Photo by Liam Briese on Unsplash

In this short article, I have compared the + operator and the join() method for joining strings in Python. Clearly, the join() method is preferred because of its performance.

Learning a programming language usually involves a long learning curve, but Python makes it relatively short for beginners, which is absolutely great. After we have got through the door and started to use Python, we should not stop there and be satisfied with what we can already do with it. Usually, the difference between a master and a regular developer comes from knowledge of the details.

Let’s keep finding more Python tips to bring ourselves closer to being a Python master!

Read every story from Christopher Tao (and thousands of other writers on Medium)
