
Single Python package to cover 99% of your Path needs

Pathlib: the library you have always dreamed of

If you like to experience Medium yourself, consider supporting me and thousands of other writers by signing up for a membership. It only costs $5 per month, it supports us, writers, greatly, and you get to access all the amazing stories on Medium.

Photo by Alice Donovan Rouse on Unsplash

Manipulating paths is a fundamental task in any production project. You may need to load files from distant servers that conform to a specific pattern, move or store different processed files and versions across your pipeline, or simply read or write files.

Every one of these steps involves manipulating paths. Often, you will find yourself repeatedly searching the internet for solutions to common problems such as:

  • how do I get all files from a folder?
  • how to check if a folder exists?
  • how to create a folder in Python?

While the answers to these questions are easy to find, the result is often inconsistent code throughout your pipeline. One part of the code may use os.path, another may use shutil, and another may use glob.

To avoid this inconsistency, I recommend using the pathlib module. It offers a consistent way of working with paths across operating systems and provides an object-oriented interface, which makes code more readable and simpler than mixing os.path, shutil, and glob.

pathlib and Path object

pathlib is a module that simplifies working with file paths in Python. It provides a Path class that represents a file or directory path, and offers a range of methods for performing various operations on paths.

One key advantage of pathlib is its object-oriented approach to working with paths. Instead of using separate functions to manipulate paths as strings, pathlib provides a single Path class with a range of methods that can be used to perform various operations on a path. This makes it easier to understand and work with paths, as all the relevant functionality is contained within a single object.
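To make the contrast concrete, here is a small sketch that extracts the same information with os.path functions and with a path object. The file name report.csv is a made-up example, and PurePosixPath is used only so the output looks the same on every operating system.

```python
import os.path
from pathlib import PurePosixPath

# Hypothetical file name, used only for illustration
raw = "origin/data/for_arli/report.csv"

# String-based approach: separate functions operating on a plain string
folder_str = os.path.dirname(raw)
ext_str = os.path.splitext(raw)[1]

# Object-oriented approach: attributes of a single path object
p = PurePosixPath(raw)
folder = p.parent
ext = p.suffix

print(folder_str, ext_str)  # origin/data/for_arli .csv
print(folder, ext)          # origin/data/for_arli .csv
```

Both approaches give the same answer; the difference is that with pathlib everything hangs off one object.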

Another advantage of pathlib is its unified interface. The same set of methods and attributes works whether your code runs on Linux, macOS, or Windows: Path picks the right flavor (PosixPath or WindowsPath) for the current system, so you do not need separate functions for each platform's separators and conventions. Note that pathlib itself only handles local filesystem paths; remote locations such as FTP/SFTP servers require other libraries.

To demonstrate the methods in the rest of this article, we need a path pointing to 'origin/data/for_arli'. Let's define it as a Path object:

from pathlib import Path
# Use a lowercase variable name so the Path class is not overwritten
path = Path('origin/data/for_arli')

You are now ready to work with the methods of Path objects, and as you will see, it is super simple.

Path existence and type

In many situations, we want to check whether a file or folder exists at a given path and, if it does not, react accordingly, for example by raising an error.


if path.exists():
    print(f"{path} exists.")
    if path.is_file():
        print(f"{path} is a file.")
    elif path.is_dir():
        print(f"{path} is a directory.")
else:
    raise FileNotFoundError(f"{path} does not exist")

This code checks whether the path 'origin/data/for_arli' exists and, if it does, whether it is a file or a directory. If the path does not exist, it raises an error saying so.

File and directory operations

Suppose now that we want to list all the files and folders in the path. We can do it with:

for f in path.iterdir():
    print(f)

It will iterate through the path and print every file or folder that is in it. You could use it in combination with the previous is_dir() and is_file() methods to list either files or directories.
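As a quick sketch of that combination, the snippet below builds a throwaway folder (the file names are invented for the example), then lists only the files, only the directories, and finally only the files matching a pattern with glob().

```python
import tempfile
from pathlib import Path

# Build a temporary folder so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "a.csv").write_text("x")
    (folder / "b.txt").write_text("y")
    (folder / "sub").mkdir()

    # Keep only files, then only directories
    files = sorted(p.name for p in folder.iterdir() if p.is_file())
    dirs = sorted(p.name for p in folder.iterdir() if p.is_dir())

    # glob() filters by pattern, replacing the separate glob module
    csv_files = sorted(p.name for p in folder.glob("*.csv"))

print(files)      # ['a.csv', 'b.txt']
print(dirs)       # ['sub']
print(csv_files)  # ['a.csv']
```

The results are sorted here because iterdir() yields entries in arbitrary order.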

Now suppose that we want to delete the folder. We can do it with:

path.rmdir()

Note that this command only works if the folder is empty; otherwise it raises an OSError.

Therefore, you first need to remove all the files from the folder:

for f in path.iterdir():
    f.unlink()

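Here we remove the files in the path with unlink(). Keep in mind that unlink() only works on files and symbolic links, so this loop assumes the folder contains no subfolders. Once the folder is empty, you can remove it with rmdir().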
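If the folder may contain subfolders, the loop has to recurse. Below is a minimal sketch of how that could look using only Path methods; remove_tree is a hypothetical helper name, not part of pathlib, and the standard library's shutil.rmtree() already does the same job.

```python
import tempfile
from pathlib import Path


def remove_tree(folder: Path) -> None:
    """Hypothetical helper: delete a folder and everything inside it."""
    for child in folder.iterdir():
        if child.is_dir() and not child.is_symlink():
            remove_tree(child)  # recurse into real subfolders
        else:
            child.unlink()      # files and symbolic links
    folder.rmdir()              # the folder is now empty


# Build a small throwaway tree to try it on
base = Path(tempfile.mkdtemp())
target = base / "for_arli"
(target / "nested").mkdir(parents=True)
(target / "nested" / "file.txt").write_text("data")
(target / "top.txt").write_text("data")

remove_tree(target)
still_there = target.exists()
print(still_there)  # False

base.rmdir()  # clean up the temporary parent folder
```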

Oh and if you want to recreate the folder:

path.mkdir()

Notice that mkdir() has very useful arguments, such as:

  • parents=True to create any missing parents of the path
  • exist_ok=True to ignore the error that is otherwise raised if the folder already exists
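Here is a short sketch of both arguments in action, run inside a temporary folder so nothing is left behind. Note that the keyword is spelled exist_ok in the pathlib API.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "origin" / "data" / "for_arli"

    # parents=True creates "origin" and "data" along the way
    nested.mkdir(parents=True)

    # exist_ok=True turns a second call into a no-op instead of an error
    nested.mkdir(parents=True, exist_ok=True)
    created = nested.is_dir()

    # Without exist_ok, an existing folder raises FileExistsError
    try:
        nested.mkdir()
        raised = False
    except FileExistsError:
        raised = True

print(created, raised)  # True True
```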

You could also rename the file or directory with the command below:

path.rename('origin/data/for_arli2')
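One detail worth knowing: a relative target passed to rename() is resolved against the current working directory, not against the folder of the original path. The sketch below avoids that trap by building the target with with_name(), which keeps the same parent folder. It runs in a temporary folder, and example.txt is just a placeholder file.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "for_arli"
    old.mkdir()
    (old / "example.txt").write_text("Hello, World!")

    # Same parent folder, new final name
    target = old.with_name("for_arli2")
    old.rename(target)

    # The old Path object still points at the old (now missing) location
    old_exists = old.exists()
    moved_file_exists = (target / "example.txt").exists()

print(old_exists, moved_file_exists)  # False True
```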

Manipulation and information

One of the most used methods of Path is surely joinpath(), which appends a string to the path (it also handles joining two Path objects):

path = Path("origin/data/for_arli")
# Join another path segment to the original path
new_path = path.joinpath("bla")
print(new_path)  # prints 'origin/data/for_arli/bla' (with backslashes on Windows)
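The / operator is a shorthand for joinpath(), and path objects also expose their components as attributes. The sketch below uses PurePosixPath so the printed output is identical on every operating system; the reports folder and summary.csv file are made-up names.

```python
from pathlib import PurePosixPath

base = PurePosixPath("origin/data/for_arli")

# The / operator joins segments just like joinpath()
file_path = base / "reports" / "summary.csv"

print(file_path)         # origin/data/for_arli/reports/summary.csv
print(file_path.name)    # summary.csv
print(file_path.stem)    # summary
print(file_path.suffix)  # .csv
print(file_path.parent)  # origin/data/for_arli/reports

# with_suffix() returns a new path; the original is unchanged
json_path = file_path.with_suffix(".json")
print(json_path.name)    # summary.json
```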

You may also want information about your files or folders, such as statistics (size, modification time, and other timestamps) or ownership (user or group). Note that owner() relies on the operating system's user database and is not available on Windows. For that, you can use:

print(path.stat()) # print statistics 
print(path.owner()) # print owner
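The object returned by stat() has many fields. As a small sketch, here is how you might pull out two commonly used ones, the size in bytes (st_size) and the last modification time (st_mtime), on a temporary file created for the example.

```python
import tempfile
from datetime import datetime
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "example.txt"
    sample.write_text("Hello, World!")

    info = sample.stat()
    size = info.st_size  # size in bytes
    # st_mtime is a Unix timestamp; convert it to a readable datetime
    modified = datetime.fromtimestamp(info.st_mtime)

print(size)      # 13
print(modified)  # the time the file was written
```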

Input/Output

It is also possible to use pathlib to either read or write files.

The open() method opens the file at the specified path and returns a file object. It works like the built-in open(): you can use the file object to read from or write to the file. Here is an example that writes a file with the write() method of the file object:

# Open a file for writing
path = Path('origin/data/for_arli/example.txt')
with path.open(mode='w') as f:
    # Write to the file
    f.write('Hello, World!')

Notice that you do not need to create example.txt manually: opening it in 'w' mode creates it. The parent folder must already exist, though.

For the read operation, the principle is the same, but with the read() method of the file object:

# Open the file written above for reading
path = Path('origin/data/for_arli/example.txt')
with path.open(mode='r') as f:
    # Read from the file
    contents = f.read()
    print(contents) # Output: Hello, World!
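For small files, Path also offers the shortcuts write_text() and read_text(), which open, write or read, and close the file in a single call. Here is a minimal sketch using a temporary folder; passing encoding explicitly is a choice made for this example, not a requirement.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "example.txt"

    # write_text() returns the number of characters written
    written = note.write_text("Hello, World!", encoding="utf-8")

    # read_text() returns the whole file as a string
    contents = note.read_text(encoding="utf-8")

print(written)   # 13
print(contents)  # Hello, World!
```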

Overall, pathlib is a useful library for Python developers: it provides an object-oriented interface for representing file paths and performing operations on them in a consistent way across operating systems. Its unified set of methods makes working with files and directories more convenient and easier to understand. Additionally, pathlib's object-oriented design allows for more readable and maintainable code than string-based path manipulation with os.path, shutil, or glob.
