Strings are one of the most important data types in most programming languages, and Python is no exception. Most programming languages provide many built-in functions or methods to help programmers manipulate strings more easily, such as replacing and splitting.
Python is famous for providing a lot of features out of the box, and this is true in almost every area of the language. I often see people writing Python code for string manipulation who are actually "re-inventing the wheel". This is usually because most other languages do not provide as many useful built-in functions for the job.
In this article, I’ll collect some string-related built-in functions in Python that are potentially very useful but easy to overlook.
Case Converting

I’m not gonna show you lower() or upper(), because I believe most people know these and already use them. However, fewer people know the following.
capitalize()
What if you have got a sentence and want to make the first letter capital?
s = 'this is a sentence.'
Re-inventing the wheel:
s = s[0].upper() + s[1:]
Using the Python built-in function:
s.capitalize()
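One difference between the two approaches is worth knowing: capitalize() also lowercases the rest of the string, while the slicing version leaves it untouched. Here is a minimal sketch of that behaviour; the variable names manual, builtin and mixed are only illustrative and are not from the original example.

```python
s = 'this is a sentence.'

# Manual approach: upper-case the first character, keep the rest as-is.
manual = s[0].upper() + s[1:]

# Built-in: upper-cases the first character and lower-cases everything else.
builtin = s.capitalize()

print(manual)   # This is a sentence.
print(builtin)  # This is a sentence.

# The two differ when the rest of the string contains capitals.
mixed = 'hello World'
print(mixed[0].upper() + mixed[1:])  # Hello World
print(mixed.capitalize())            # Hello world

# The slicing version raises IndexError on an empty string; capitalize() does not.
print(repr(''.capitalize()))         # ''
```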

title()
For the same scenario, what if you want to turn a sentence into a title? In other words, the first letter of every word should be capitalised.
s = 'useful python built-in string functions but few people use'
Re-inventing the wheel:
words = s.split(' ')
words = [w[0].upper() + w[1:] for w in words]
s = ' '.join(words)
Using the Python built-in function:
s.title()

Please note that the approach without the built-in function has a further problem: it only recognises words that are separated by spaces. Therefore, the word "built-in" is converted into "Built-in", not "Built-In". To achieve the same outcome as title() does, a regex would be needed, which makes it even more complex.
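To make the difference concrete, the sketch below prints the result of both approaches. It also shows a caveat of title(): an apostrophe counts as a word boundary too. The string.capwords() call is my own addition as a possible alternative for that case, not something the example above relies on.

```python
import string

s = 'useful python built-in string functions but few people use'

# Manual approach: only spaces count as word boundaries.
manual = ' '.join(w[0].upper() + w[1:] for w in s.split(' '))
print(manual)
# Useful Python Built-in String Functions But Few People Use

# title() starts a new word after any non-letter, including the hyphen.
print(s.title())
# Useful Python Built-In String Functions But Few People Use

# The same rule applies to apostrophes, which is rarely what you want.
print("they're here".title())           # They'Re Here

# string.capwords() splits on whitespace only, so "They're" stays intact.
print(string.capwords("they're here"))  # They're Here
```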
String Filling

I’m sure that you know how to use str.format() and f-strings. However, some of the other built-in functions for string filling are not commonly seen but are useful.
center()
Suppose you are writing a command-line interface using Python and you want to display a menu to the user. To make it look good, you want to add some "=" symbols and put the word "Menu" in the centre.
Another problem is that the text you put in the output may vary in length, so the amount of padding cannot be hardcoded. Suppose the fixed length is 20.
s = 'Menu'
fixed_len = 20
Re-invent the wheel:
line = int((fixed_len - len(s)) / 2 - 1) * '=' + ' ' + s + ' '
line += (fixed_len - len(line)) * '='
Using the Python built-in function:
print(s.center(len(s) + 2).center(fixed_len, '='))

Here, we used the center() function twice. The first call adds a space on each side of the word, and the second call fills the string with "=" symbols up to fixed_len.
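The sketch below shows what the two-step centering produces. The f-string variant and the ljust()/rjust() calls are my own additions for comparison, and the names line, padded and alt are illustrative.

```python
s = 'Menu'
fixed_len = 20

# Two-step centering: pad with one space per side, then fill with "=".
line = s.center(len(s) + 2).center(fixed_len, '=')
print(line)  # ======= Menu =======

# Possible alternative: a format spec, where "=" is the fill character,
# "^" means centre, and the width is taken from fixed_len.
padded = f' {s} '
alt = f'{padded:=^{fixed_len}}'
print(alt)   # ======= Menu =======

# ljust() and rjust() take the same arguments for left/right alignment.
print(s.ljust(10, '.'))  # Menu......
print(s.rjust(10, '.'))  # ......Menu
```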
zfill()
Quite commonly, projects use data with numeric "ID" fields. The IDs might not all have the same length, so we have to pad them with leading zeros up to a fixed length.
id_list = [
'123',
'45',
'4321',
'51323'
]
fixed_len = 6
Re-invent the wheel:
[(fixed_len - len(id)) * '0' + id
for id in id_list]
Using the Python built-in function:
[id.zfill(fixed_len) for id in id_list]
Both examples use a list comprehension, but the one using the built-in function is much neater.
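For reference, here is the padded result, together with two details that the manual version does not handle: zfill() is sign-aware, and it leaves strings that are already long enough unchanged. The loop variable i and the f-string line are illustrative additions.

```python
id_list = ['123', '45', '4321', '51323']
fixed_len = 6

padded = [i.zfill(fixed_len) for i in id_list]
print(padded)  # ['000123', '000045', '004321', '051323']

# zfill() is sign-aware: the zeros go after a leading "+" or "-".
print('-42'.zfill(fixed_len))                  # -00042
# The manual version puts the zeros in front of the sign instead.
print((fixed_len - len('-42')) * '0' + '-42')  # 000-42

# Strings already at or beyond the width are returned unchanged.
print('1234567'.zfill(fixed_len))              # 1234567

# For integers, a format spec gives the same padding without str().
print(f'{45:06d}')                             # 000045
```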
Surpass Regex

Regular expressions are available in most programming languages and are very powerful for handling complicated string pattern matching. This is of course also true in Python. However, in cases that are not that complex, Python's built-in functions can achieve the same thing much more easily.
Suppose you want to decide whether a string contains only letters and numbers.
s = '1plus1'
Re-invent the wheel:
import re
bool(re.match(r'^[\dA-Za-z]+$', s))
Using the Python built-in function:
s.isalnum()

Besides isalnum(), several other functions can be used in similar scenarios but for different patterns.
- If you want to match a string with letters only, use isalpha()
- If you want to match a string with digits only, use isdigit()
- If you want to match a string made up only of decimal digits (no other digit characters such as superscripts, and no signs or decimal points), use isdecimal()
- If you want to check whether a string is a valid Python identifier (letters, digits and the underscore "_", not starting with a digit), use isidentifier()
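The checks in the list above differ in a few edge cases that are easy to miss. The sketch below illustrates some of them; the sample strings are my own, and the ASCII-only regex is only there for comparison.

```python
import re

# isalnum() is Unicode-aware, unlike an ASCII-only character class.
word = 'caf\u00e9'  # "café"
print(word.isalnum())                             # True
print(bool(re.fullmatch(r'[0-9A-Za-z]+', word)))  # False

# An empty string never passes these checks.
print(''.isalnum())              # False

# isdigit() accepts digit characters such as superscripts; isdecimal() does not.
superscript_two = '\u00b2'       # "²"
print(superscript_two.isdigit())    # True
print(superscript_two.isdecimal())  # False

# Neither accepts signs or decimal points.
print('-5'.isdigit())            # False
print('3.14'.isdecimal())        # False

# isidentifier() checks for a valid Python identifier.
print('my_var1'.isidentifier())  # True
print('1var'.isidentifier())     # False
```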
String Cleansing

Sometimes we have strings that are not clean, especially if we got them from a web scraper. Usually, there will be a lot of "new-line" or "tab" characters (\n and \t). Also, we want to remove any spaces at the front and the end. In this case, the strip() function can help.
Like trim() in languages such as Java, strip() called with no arguments does not stop at the space character: it also removes newlines and tabs from both ends of the string.
Suppose I got the string of my name, but with some of those characters.
s = ' \nChristopher Tao\t '
Before cleansing, if we use it in output, the format will be ugly.
print(f'My name is {s}. Thank you for reading.')

However, if we simply use the strip() function, everything is elegant now.
print(f'My name is {s.strip()}. Thank you for reading.')
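Because the problem characters are invisible when printed, repr() is a handy way to see what strip() actually removed. The sketch below also shows the one-sided variants and the optional argument; the sample strings other than s are my own.

```python
s = ' \nChristopher Tao\t '

# repr() makes the hidden whitespace visible.
print(repr(s))           # ' \nChristopher Tao\t '
print(repr(s.strip()))   # 'Christopher Tao'

# lstrip() and rstrip() clean only one side.
print(repr(s.lstrip()))  # 'Christopher Tao\t '
print(repr(s.rstrip()))  # ' \nChristopher Tao'

# With an argument, strip() removes any of the given characters from both ends.
print('== Menu =='.strip('= '))  # Menu

# Whitespace inside the string is never touched.
print(repr('a \n b'.strip()))    # 'a \n b'
```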

Cryptography

Did you know that Python can do simple cryptography for us? For example, we may want to substitute some letters with numbers in some text so that the plain text is not revealed.
The key is to use the maketrans() function:
s_in = 'abcdef'
s_out = '856741'
cipher = str.maketrans(s_in, s_out)
Please note that s_in and s_out must have the same length, because their characters are paired up position by position for the translation afterwards.
To encrypt a string using the cipher, we can use the translate() function.
'Christopher Tao'.translate(cipher)

Since we only defined a mapping for "abcdef", the other letters are left as plain text.
Of course, this cryptography is NOT safe at all. This kind of substitution-based cryptography can be cracked in seconds. However, can we use the functions in some other ways?
Basically, if we have a look at the cipher object, it is nothing but a plain dictionary with integer Unicode code points (the ASCII codes, for these characters) as keys and values.
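The sketch below prints the translation table and the encrypted string from the example above. The reverse table decipher is my own addition; it only round-trips cleanly here because the plain text contains none of the digits in s_out.

```python
s_in = 'abcdef'
s_out = '856741'
cipher = str.maketrans(s_in, s_out)

# The table maps the code point of each input character to that of its replacement.
print(cipher)
# {97: 56, 98: 53, 99: 54, 100: 55, 101: 52, 102: 49}
print(ord('a'), ord('8'))  # 97 56

encrypted = 'Christopher Tao'.translate(cipher)
print(encrypted)  # Christoph4r T8o

# Swapping the arguments builds the reverse table for decoding.
decipher = str.maketrans(s_out, s_in)
print(encrypted.translate(decipher))  # Christopher Tao
```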

We can use this "unsafe" cryptography feature provided by Python to do some elegant replacing work!
Advanced Replacing
I bet you know how to use the replace() function to replace a substring in a string with something else. However, when we need to make multiple substitutions, the code becomes bulky and inelegant.
For example, we want to replace a with 1, b with 2 and c with 3. If we write this using the replace() function, it will look like this:
'abc'.replace('a', '1').replace('b', '2').replace('c', '3')
However, now we have a better way to do so.
dictionary = str.maketrans('abc', '123')
'abc'.translate(dictionary)
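Besides being shorter, translate() behaves differently from chained replace() calls: each character is mapped exactly once, so replacements cannot cascade into each other. The sketch below shows this, plus two other forms of maketrans(); the names table and no_vowels and the sample strings are illustrative.

```python
# Chained replace() calls run one after another, so a swap goes wrong:
# every "a" becomes "b", and then every "b" (old and new) becomes "a".
print('ab'.replace('a', 'b').replace('b', 'a'))   # aa

# translate() maps each character exactly once, so the swap works.
print('ab'.translate(str.maketrans('ab', 'ba')))  # ba

# A dict argument allows replacements longer than one character,
# and None deletes the character.
table = str.maketrans({'&': ' and ', '!': None})
print('R&D!'.translate(table))                    # R and D

# A third positional argument lists characters to delete.
no_vowels = str.maketrans('', '', 'aeiou')
print('hello world'.translate(no_vowels))         # hll wrld
```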

Splitting And Joining

Last but not least, I want to talk about some functions that can be used to split or join strings. I know you must know split(), but did you know there is another function called splitlines()?
splitlines()
When we want to break a large chunk of text into lines, we can definitely use split(). However, we have to explicitly specify the newline character. The problem is that the newline character might differ between encodings or operating systems, which makes our program fragile.
Please look at the following example:
"123 \n 456 \r 789 \r\n abc".split('\n')

If we use splitlines(), the problem will never happen.
"123 \n 456 \r 789 \r\n abc".splitlines()

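The results of the two calls are easier to compare side by side. The sketch below prints them, and also shows the keepends argument and the difference in how a trailing newline is handled; the variable name text and the 'a\nb\n' sample are my own.

```python
text = "123 \n 456 \r 789 \r\n abc"

# split('\n') only knows about "\n", so "\r" stays inside the pieces.
print(text.split('\n'))
# ['123 ', ' 456 \r 789 \r', ' abc']

# splitlines() recognises "\n", "\r" and "\r\n" (among other line boundaries).
print(text.splitlines())
# ['123 ', ' 456 ', ' 789 ', ' abc']

# keepends=True keeps the line terminators.
print(text.splitlines(keepends=True))
# ['123 \n', ' 456 \r', ' 789 \r\n', ' abc']

# A trailing newline gives an extra empty string with split(),
# but not with splitlines().
print('a\nb\n'.split('\n'))   # ['a', 'b', '']
print('a\nb\n'.splitlines())  # ['a', 'b']
```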
Something about join() you may not know
OK. Lastly, I bet you know the join() function for sure. It can be used to join a list of strings together in a very efficient way.
The join() function can be used on anything iterable, not only lists. For example, you might know that a string is itself iterable in Python, so we can even use it on a string.
Suppose we want to add whitespace between every letter in a word; then we can use the join() function.
s = 'abc'
print(' '.join(s))

I won’t try everything here. If you’re interested, you may try the join() function with any iterable in Python, even a dictionary.
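Here is a minimal sketch of join() on a few different iterables. One thing to keep in mind is that every item must already be a string. The names d and nums are illustrative.

```python
# A string is an iterable of characters.
print(' '.join('abc'))  # a b c

# Iterating a dict yields its keys, in insertion order.
d = {'x': 1, 'y': 2}
print(','.join(d))      # x,y

# Every item must be a string, so convert other types first.
nums = [1, 2, 3]
print('-'.join(str(n) for n in nums))  # 1-2-3

# Passing non-strings raises TypeError.
try:
    ','.join(nums)
except TypeError as exc:
    print('TypeError:', exc)
```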
The only thing I want to emphasise is the set. In most cases it is not safe to use the join() method on a set, because a set is unordered, so the order of the items in the result is unpredictable.
s = {'a','b','c','d','e'}
print(','.join(s))
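If a stable order matters, one possible fix is to sort the set before joining. This is my own suggestion rather than part of the original example.

```python
s = {'a', 'b', 'c', 'd', 'e'}

# The order of this result can differ between runs, so no output is shown.
print(','.join(s))

# Sorting first makes the result deterministic.
print(','.join(sorted(s)))  # a,b,c,d,e
```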

Summary

In this article, I have introduced some very useful built-in functions for string manipulation in Python. In some scenarios they can be a lifesaver, but I rarely see people using them. I hope these tips and little tricks will help you one day.