Automation with PyWin32

In Automate Excel with Python, the concepts of the Excel Object Model which contain Objects, Properties, Methods and Events are shared. The tricks to access the Objects, Properties, and Methods in Excel with Python pywin32
library are also explained with examples.
Now, let us leverage the Automation of Excel report with Pivot Table, one of the most wonderful functions in Excel!
Why PyWin32?
You may curious why don’t we use pandas.DataFrame.pivot
or pandas.DataFrame.pivot_table
from pandas
library instead? It’s a built-in library that we don’t even need to install.
Well, the two pandas
functions mentioned above can create the Pivot Table easily, but if you are preparing an Excel report which will be accessed by other domain users, the hard-coded pivot table created may not favour them as they cannot modify the Pivot Table Fields.

The image above shows the Pivot Table Fields for Pivot Table in Excel, the Pivot Table shows the sales of video games in different location according to the game’s genre. With the interactive Excel Pivot Table, the domain users have the freedom to select any number of countries to shows in the Pivot Table, while the hard-coded Pivot Table created by pandas
unable to do so.
In the following example, the datasets used are PS4 Games Sales data from Kaggle. Then, the script used to create Pivot Table is referring to the Notebook created by Trenton McKinney, How to Create a Pivot Table in Excel with the Python win32com Module. In the Notebook of McKinney, he has defined the function to create the synthetic data, Pivot Table and Excel com object in Python (he also shown how to do it with Excel VBA).
He has nicely optimized the script, hence, in the example below, McKinney’s script will be used to create Pivot Table in Excel. Then, I will explain how to access the components of the Pivot Table, modify the filter of the Pivot Table and extract the filtered data with Python for further analysis.
To provide a clearer picture of what we are going to do in the following example, let me explain the input and output of the example. The input is the PS4 Games Sales in CSV format as shown in the image below.

The attributes included the games’ title, year of publication, games’ genre, publisher, and the sales of the games in North America, Europe, Japan, the rest of the world and the global sales. Based on the data, we are going to create the Pivot Table that illustrates the total sales of PS4 games in each location based on the genre. The image below shows the Pivot Table we going to create with the Python pywin32
library.

Create Pivot Table and Manipulate It with pywin32
There are five parts in the following section:
- Import Libraries
- Read and Process Datasets
- Create Pivot Table
- Access to Methods and Properties of Pivot Table
- Modify the Filter of the Pivot Table and Extract the Filtered Data
Import Libraries
import win32com.client as win32
import pandas as pd
import numpy as np
from pathlib import Path
import re
import sys
win32c = win32.constants
Read and Process Datasets
df = pd.read_csv("PS4_GamesSales.csv", encoding = 'unicode_escape', engine ='python')
# remove null values
df = df.dropna()
# write the csv file to xlsx File to create Pivot Table
df.to_excel("PS4_GamesSales.xlsx", sheet_name = 'Sales', index = False)
As there are special characters in the datasets, encoding
need to be defined when reading the CSV file. If you are interested, here’s the reference for encoding.
One thing to take note of is when we exporting the data which will be used to create Pivot Table later, we must set index = False
. If not we may face problem during creating the Pivot Table, as the function not able to judge which row is label row or header.
Create Pivot Table
The script of creating the Pivot Table is modified from the Notebook from McKinney. The script contains three functions, which are pivot_table()
, run_excel()
and main()
.
pivot_table()
function is used to allocate the Data Fields into respective Pivot Table Fields (Filters, Columns, Rows and Values).
The last part of the function is used to modify the visibility of Values Row and Column Grand Total, which are the Properties of Pivot Table Object of Excel Object Model, you may refer here for other Properties.
Next function is run_excel()
. This function is used to create Excel Object, and then create a new sheet for Pivot Table. The Data Field(s) that will be added to Pivot Table Field (Filters, Columns, Rows, and Values) will be defined here.
After the Pivot Table is created, wb.Save()
will save the Excel file. If this line is not included, the Pivot Table created will be lost. If you are running this script to create Pivot Table in the background or on a scheduled job, you may want to close the Excel file and quit the Excel object by wb.Close(True)
and excel.Quit()
respectively. In this way, you will not need to close the Excel file manually after the job is done. Alternatively, you may set excel.Visible = False
, then the Excel file will not be open from the beginning.
main()
function is the main function, which will call the run_excel()
function, and then the run_excel()
function will execute pivot_table()
function.
An error will occur if you run the function twice, as the pivot table is programmed to be created on a sheet name ‘pivot_table’, which is already created in the first run, you may change pt_name or just delete the sheet created in the first execution.

Access the Properties of Pivot Table
The Pivot Table can be studied or modified by manipulating two Methods and five Properties and of Pivot Table object (PivotTable object Reference) below:
Methods
- Clear All Filters
- Pivot Items
Properties
- Current Page (grouped under PivotField object)
- Page Range
- Row Fields
- Column Fields
- Table Range 1

The figure above shows the CurrentPage, PageRange and TableRange1 of a Pivot Table. RowFields and ColumnFields of PivotTable Field are shown in the figure below.

The Pivot Table Methods and Properties mentioned above will be used in the example to extract the data of filtered Pivot Table and save it into DataFrame.
First, create the Excel object. Although the Excel object was created in the process of creating the Pivot Table, we still need to create it again as the Excel object was created in a function and not carry forward.
f_path = Path.cwd()
f_name = 'PS4_GamesSales.xlsx'
filename = f_path / f_name
# create excel object
excel = win32.gencache.EnsureDispatch('Excel.Application')
# excel can be visible or not
excel.Visible = True # False
wb = excel.Workbooks.Open(filename)
pvtTable = wb.Sheets("pivot_table").Range("A3").PivotTable
Page Range and Current Page Properties, Clear All Filters Method
Page Range is referring the Filters field of the Pivot Table (field name and also the values) as shown in the figure below.

CurrentPage is referring to the value(s) of the filter, it’s used to set the values for ONE Filter, while Page Range return the current filter and its value(s).
The script below print the Page Range and then clear all the filters.
page_range_item = []
for i in pvtTable.PageRange:
page_range_item.append(str(i))
print(page_range_item)
pvtTable.PivotFields("Year").ClearAllFilters()
To provide a piece of clearer information, GIF below shows the outcome of running the script above. I manually set the filter in the beginning, and then print the Page Range. After that, all filters are cleared with the script, then the Page Range is print again. The GIF also shows how fantastic is pywin32
as we can see the changes made to Excel immediately.

pvtTable.PivotFields("Year").CurrentPage = "2020"
page_range_item = []
for i in pvtTable.PageRange:
page_range_item.append(str(i))
print(page_range_item)
Now, let us modify the filter by using Pivot Field Properties, Current Page.

As we have to specify the Pivot Field when modifying the Current Page, it’s only can be used to modify the value of one filter at a time.
In this example, we only filter with one value. The way to filter multiple values is shown in the last example, which also a complete example of modifying the Pivot Table filter and then extract the filtered data.
Row Fields / Column Fields
Row Fields and Column Fields will list all the field added to Row fields or Column fields in Pivot Table Fields when creating Pivot Table.
row_fields_item = []
for i in pvtTable.RowFields:
row_fields_item.append(str(i))
print(row_fields_item)
column_fields_item = []
for i in pvtTable.ColumnFields:
column_fields_item.append(str(i))
print(column_fields_item)
In our case, Row Fields is ‘Genre’ while Column Fields is ‘Values’ which is auto-generated. However, we can disable it with the script below.
wb.Sheets("pivot_table").PivotTables("example").DisplayFieldCaptions = False
The GIF below illustrates the outcome of the script above.

The "Values" and "Row Labels" are the Field Headers, not a Data Field. The script is the same as the action below.

Table Range 1
TableRange1 print the data without the Page Range, if you wish to print the Page Field as well, you may refer to TableRange2 Properties of Pivot Table.
table_data = []
for i in pvtTable.TableRange1:
#print(i)
table_data.append(str(i))
print(table_data)
Table Range 1 Property returns the result in list form.

To convert the list into a DataFrame, we need to know the actual dimension of the Pivot Table. The number of columns can be easily determined by adding the pt_fields
in the script of creating Pivot Table and 1 for the column of row items. In this example, there are five items in pt_fields
, "Total Sales in North America", "Total Sales in Europe", "Total Sales in Japan", "Total Sales in Rest of World", "Total Global Sales". So the number of columns = 5 + 1 = 6
.
The complicated part is to determine the number of rows. In this article’s example, there are three rows built from Column Fields Header (the "Values", the row of Pivot Table Field ( "Total Sales in North America", "Total Sales in Europe", "Total Sales in Japan", "Total Sales in Rest of World", "Total Global Sales") and the Column’s Grand Total. Other rows are the Items from Row Field after filtered.
I obtained the number of items in Row Field by removing the items of Column Fields ("Values") and pt_fields
, other labels like "Row Labels", "Column Label", "Grand Total" and "None" and all the number values in the Pivot Table by using RegEx.
After obtained the number of columns and rows of the Pivot Table, the list from Table Range 1 can be reshaped into DataFrame.

At this stage, the DataFrame is not able to detect the correct header which is the second row of the table. So, we have set the second row ( row index = 1) as header, then drop the first two rows. There are multiple ways to do this, you may rename the column and then drop the first two rows as well.
df.columns=df.iloc[1]
df = df.drop(index = 0)
df = df.drop(index = 1)
Modify the Filter of the Pivot Table and Extract the Filtered Data
Now, let us combine everything, modify the filter according to a list of items, then extract the filtered Pivot Table Data and save it as a DataFrame.
# Find all items in Year
year_items = []
for item in pvtTable.PivotFields("Year").PivotItems():
year = str(item)
year_items.append(year)
year_to_include = ['2013','2014','2015']
year_to_exclude = [x for x in year_items if x not in year_to_include]
Before we modify the filter, it’s important to know the exact number of items contains in the filter. In our case, the filter is "Year". This step can be done by using Pivot Table Method, Pivot Items() as shown in the script above.
The below is the complete function to modify the filter and then extract the filtered Pivot Table as DataFrame.
Let’s witness this in the following GIF.

Bonus: Create Multiple Worksheets of Different Pivot Table
Received an enquiry on how to create multiple worksheets of different Pivot Tables. We can modify the run_excel()
function to achieve that.
We only need to duplicate the part of the script where we set up and call the pivot_table()
function. Remember to modify the details according to your need. Most importantly, we need to change the variable names that hold the new worksheet’s title and the new worksheet’s object. If not, you may wonder why only one pivot table has been created in the workbook.😂 I have bold the variable name that you may need to be aware of in the code below.
# Setup second pivot table and call pivot_table
ws3_name = 'pivot_table_2'
wb.Sheets.Add().Name = ws3_name
ws3 = wb.Sheets(ws3_name)
pt_name = 'example' # must be a string
pt_rows = ['expense'] # must be a list
pt_cols = ['products'] # must be a list
pt_filters = ['date'] # must be a list
# [0]: field name [1]: pivot table column name [3]: calulation method [4]: number format
pt_fields = [['price', 'price: mean', win32c.xlAverage, '$#,##0.00']] # must be a list of lists
pivot_table(wb, ws1, ws3, ws3_name, pt_name, pt_rows, pt_cols, pt_filters, pt_fields)
Thank you for reading until here, this is a long article I know. That’s a little more only after this, which is essential to know if you want to use pywin32
for your work.
The Possible Error You May Face

I faced this error a few times during the execution of the script to create Pivot Table. Fortunately, I found a solution on Stack Overflow. The solution is simple. All we need is to delete the folder title "00020813–0000–0000-C000–000000000046x0x1x9" in the path return by the script below.
import win32com
print(win32com.__gen_path__)
As this is a repeating issue that may pop up periodically (not daily), I will suggest you run the following script to remove the folder directly when the error is shown.
import win32com
import shutil
path = win32com.__gen_path__
shutil.rmtree(path)
Side Note
If you are confused about the Objects, Methods and Properties of the Excel Object Model, or want to know how to translate the script of Excel VBA into Python, you may have a look at Automate Excel with Python.
If you are interested to create a chart or PivotChart with Python and export it as an image automatically, Automate Excel Chart with Python.
If you are interested to know how to use pywin32
to access Microsoft Outlook mails and download attachment, you may refer to this article, "Automatically Download Email Attachment with Python".
If your job used Google Sheet instead of Microsoft Excel, you may refer to this article, "Automate Google Sheet Report" for the possible automation to be made.
Stay Connected
Subscribe on YouTube
Reference
Congrats and thanks for reading to the end. Hope you enjoy this article. ☺️
