Automate SAP Report Extraction with Pyautogui

A guide to saving your time from the most tedious task in the world.

Pathairush Seeda

Published in

Towards Data Science

8 min readApr 17, 2021

Introduction

Reporting is the foundation of any business. In daily life, you somehow have to ingest the new data from a report to decide where to go next every day. The report can come in various formats such as Microsoft Excel, Web application, or exporting from an enterprise resource planning system (ERP).

I have recently got a request to build a dashboard that replicates the business number in the crafted report. The finance team manually created this report monthly. The most tedious process is to export the source files from the SAP system and manually place it in excel. After that, they have to adjust some numbers because it’s an edge case that the user-defined formula cannot calculate.

In my opinion, If we start with the source table from the SAP table with the same logic, we can derive the report without doing many manual tasks. That’s what I thought the first time.

It would be an easy task to replicate those numbers. The logic seems quite straightforward. But when I dig deep into the detail, I found out that all those numbers have been processed with many underlying SAP business rules across time. Also, the business rules are so dynamic that it’s hard for someone who doesn’t understand the whole process thoroughly to replicate it.

After trial and error for a while, I decided not to go further with the replicating process. It would take me more than a month to understand everything and convert the SAP ABAP language to another language I can work with. Sometimes, we need to let something remains as it is. Do not touch it even if you think it would be a great way of doing it in the future.

Then, what should I do?

Here we come to today's main topic. Instead of ingesting source data and calculate the number by myself, I skipped all that process and ingested the end reporting result instead. How can I do it? The answer is with a pyautogui module. You can find it in this Github.

In short, pyautogui helps you to do basic automating tasks such as move your mouse, click the button, type some words. You can do whatever you want on the desktop by referencing the pixel you want the mouse to move to. Also, It attaches the computer-vision ability to recognize the location on your screen based on the picture provided. That’s superb.

Today, I will guide you through this library and add some other useful functions to help you automate anything on your desktop. Today, I won’t write a tutorial for what apyautogui can do because you can find it in the documentation anyway. I will show you the process and where you can apply this thing.

Here is the step we will cover today.

Open the SAP GUI and log in to the server.
Input the SAP program name and parameters for extracting the report.
Execute the report and export it to a Microsoft Excel spreadsheet.
Upload it to AWS S3 for the further ETL process.

So, let’s open the SAP so that we can extract the report from it. Here is how you do it.

import subprocesssap_gui = subprocess.Popen("path/to/saplogon.exe")

Firstly, we open the program with the subprocess module. It’s pretty convenient for you to open any exe file. You store the program object with the parameter sap_gui . At the end of the script, you can terminate the SAP like this.

sap_gui.terminate()

When the SAP logon is opened. You will see the screen like this.

One thing to be careful about when using the pyautogui is that if the desktop doesn’t process correctly. I mean, let’s say you select some option on the screen, and the program takes some time to process it. If you don’t put any wait time in the script, all the later steps might be gone wrong.

To make the above example better.

import pyautogui
import subprocess
import timesap_gui = subprocess.Popen("path/to/saplogon.exe")# wait 3 seconds for the program to load
time.sleep(3)# assume that we move to input username fieldsusername_field_location = pyautogui.locateOnScreen('username_field.png')
pyautogui.moveTo(username_field_location) 
pyautogui.click()# type the username in the field that we just clicked
pyautogui.typewrite(username)# move (relative from the current cursor position) to below 50 pixel
pyautogui.moveRel(0, 50)
pyautogui.click()# type the password
pyautogui.typewrite(password)
pyautogui.press('enter')

With the following code, we show you how to log in to the SAP system by using pyautogui functions. The above snippet contains almost all of the functions I used in the ingestion script. We use both referencing movement and computer-vision movement. Now you will be here as the following figure.

At this point, you will put all the parameters you need for the report and then execute it. We can use all the functions above for achieving this task. Let me point only at the function that I didn’t mention earlier.

# Other functions that I used to assign the value to get the report# double click
pyautogui.doubleClick()# right click
pyautogui.click(button='right')# hot key with ctrl + f9 (shortcut for SAP command)
pyautogui.hotkey('ctrl', 'f9')# press tab to shift to the next input field
pyautogui.press('tab')

After clicking the export button for the SAP exporting process, it will automatically open the excel file for you. This behavior is annoying sometimes because you can’t do anything with the result file until you close the excel file.

That’s where another useful function came in. We can check that is the process (MS Excel) is running or not. After, we can use them as a trigger point to go to the next step. If the SAP system can save the excel correctly, we can close the excel result file without hesitation. We will use the following function to do that.

import osdef process_exists(process_name):    # Thanks to ewerybody for providing this useful snippet code https://stackoverflow.com/questions/7787120/check-if-a-process-is-running-or-not-on-windows-with-python    # This function will return true if the process is running.
      It's compatible only in MS Window.    try:
        call = 'TASKLIST', '/FI', 'imagename eq %s' % process_name             # use buildin check_output right away    
        output =  subprocess.check_output(call).decode()    
        
        # check in last line for process name    
        last_line = output.strip().split('\r\n')[-1]            # because Fail message could be translated    
       return last_line.lower().startswith(process_name.lower())    except Exception as e:
        print(e)def close_process(process_name):    # This function will return true if the process is running.
      It's compatible only in MS Window.    try:
        os.system('TASKKILL /F /IM {}'.format(process_name))
    except Exception as e:
        print(e)

Here we check that whether Microsoft exists. If it exists, we close them.

# Let's wait untill the program is exists and close it.retry = 0
while not is_process_exists('path/to/excel.exe'):time.sleep(10) # may be we need more time for saving
    retry += 1
    if retry > threshold:
        raise ValueError("Excel cannot be saved")close_process('path/to/excel.exe')

We will post-process the exported file to put it in the place where it should be.

local_file = path/to/export.xlsx
target_dir = path/to/{load_date}
file_name = {business_key_parameter}.xlsx
absolute_file_name = os.path.join(target_dir, file_name)if os.path.exist(target_dir):    # if the target dir doesn't exist, create one.
    os.mkdir(target_dir)if os.path.isfile(absolute_file_name):    # if there is a file in the target folder, replace with new one.
    os.remove(absolute_file_name)# rename and move export file to the target dir
os.rename(local_file, absolute_file_name)

Now our desired report has been placed in the target directory. We will loop through to all the business units to get all the reports.

The thing to be careful with `pyautogui`

One drawback of the pyautogui module is that if previous instruction's output doesn’t result as expected (webpage doesn’t load), That mistake will fail all the later steps drastically. To solve this problem, There are 2 options I can think of

To find a reliable reference of each pyautogui stage. If that reliable reference doesn’t exist, we will wait until it is loaded. Please be careful that it can lead you to an infinite loop if you use while loop without retry.
Raise the error and skip the current one. After that, you can retry it later. This assumes that the problem is occurred due to the temporary unreliable reference. We can create a function to check the number of the exported files. Then for the missing file, we re-run the script again.

After you get all the report your want, let’s upload the target file to AWS S3 for a further process.

import logging
from botocore.config import Config

# Intial S3 client
s3_config = Config(region_name = <<YOUR AWS REGION NAME>>)
s3_client = boto3.client(
    "s3", config=s3_config,
    aws_access_key_id = config["CREDENTIALS"]["AWS_ACCESS_ID"],
    aws_secret_access_key = config["CREDENTIALS"]["AWS_SECRET_KEY"]
            )
try:
    response = s3_client.upload_file(
               file_name, bucket, object_name
               )
except Exception as e:
    print(e)

Voila! We have extracted the report automatically and upload it to AWS S3 without human assistance. All the rest is to schedule this task to run it daily.

It’s time to rest

We have gone through several useful libraries today. As you can see, we can do all the tedious tasks automatically with python.

I’ve spent time writing the action code for this task around half day. But the time I can save is even more.

Thinking that you have to export it once a day with a 1 hour period of extraction and typing the same parameter day in day out. It would be the most boring task in the world. Now I have a script that takes care of all this thing.

At first, I thought automating stuff would have some cost. I saw some elegant RPA applications on the internet. If I have to spend the money to acquire those applications, it will take a lot of time through the POC and procurement processes. With pyautogui we can do everything locally within a day.

However, as I mentioned earlier, you have to take care of the edge case yourself. If it’s a simple application, I encourage you to give it a try to automate it yourself. But for the complex process, you may choose to acquire the completed elegant RPA application. It might be a better option.

Don’t let yourself unproductive with the tedious tasks. Let’s automate it!

Pathairush Seeda

if you like this article and would like to see something like this more.

Follow me on Medium
Other channels? LinkedIn, Twitter, and Facebook

Automate SAP Report Extraction with Pyautogui

A guide to saving your time from the most tedious task in the world.

Introduction

Then, what should I do?

Here is the step we will cover today.

The thing to be careful with pyautogui

It’s time to rest

Pathairush Seeda

Written by Pathairush Seeda

The thing to be careful with `pyautogui`