Why Python is Essential for Business Analysts

Nikita Khudov
Towards Data Science
5 min read · Nov 24, 2020


If somebody asked you to name the most important hard skills for business analysts, which would you pick? In management consulting, investment banking, and many other analytical jobs, the answer usually is (or at least traditionally was) Excel and PowerPoint. These tools are convenient and often efficient for data analysis and visualization, which is a solid reason they are widely accepted as the standard. However, their limitations have become increasingly clear in recent years. Data volume and complexity keep growing rapidly, which calls for more sophisticated data processing tools such as programming languages. In this article, I will focus on Python and explain, based on my work experience in management consulting, why I believe it is an essential skill for business analysts right now and may become a critical one in the future.

Photo by Christina @ wocintechchat.com on Unsplash

Why Python in particular? To be honest, almost any programming language might be suitable, and I am not going to cover this topic in detail here. But I do believe that Python is the best choice for several reasons:

  • Relatively simple to learn
    Compared to other programming languages, Python is generally considered quite easy to learn. It has a relatively simple syntax and plenty of great learning resources.
  • Huge community support
    Python is very popular, so when you run into a problem, there is a good chance that someone has run into the same problem before. You can usually find help or maybe even a ready-to-use solution.
  • Plenty of brilliant off-the-shelf analytical solutions (including libraries for ML/DL)
    Speaking of ready-to-use solutions, analysts are often used to working with them, whereas developers are comfortable designing their own. That is why the existence of numerous great Python libraries for data analytics is a major benefit.

I was never a professional programmer and always wanted to work in a business-related field. That is why I majored in Economics & Finance and knew literally nothing about programming. Still, I believed that Data Science could add value to my profile, and it was simply of great interest to me. So I enrolled in a good Master's program in the DS field, where I learned all this stuff and started applying it to the tasks I usually did in Excel. Not surprisingly, I found it efficient and quite enjoyable, so I started promoting it to my colleagues as a great analytical tool. While arguing for Python's value proposition in analytics, I came up with 3 key advantages that I will describe below.

Automation & Replicability

Nobody likes doing routine, repetitive tasks. Well, maybe somebody does, but it is definitely not a common trait among analysts, as we often prefer to delegate these tasks if possible. This is where Python comes in, because it allows us to automate many processes. Here are some examples I have encountered:

  • Repeating the same analysis for several markets, competitors, customer segments, etc.
  • Collecting some data from online sources (Web Scraping)
  • Handling errors in the text data to merge different datasets (Fuzzy matching)

The list could go on, but the point is that instead of doing the same task several times (maybe even manually each time), we develop a robust, reusable pipeline. We can use it ourselves in the future or hand it over to our colleagues (provided they know Python). Moreover, a Python script is usually transparent and readable if written properly, so it should be easy for our colleagues or supervisors to go through each step of our analysis and understand what is going on. That is a great advantage over Excel, where these intermediate steps are much harder to see.
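To make the fuzzy matching example from the list above more concrete, here is a minimal sketch using only Python's standard library (`difflib`). The company names, the `build_name_map` helper, and the 0.8 similarity cutoff are all assumptions made up for illustration; a real project would likely need a tuned cutoff and a manual review of borderline matches.

```python
from difflib import get_close_matches


def build_name_map(dirty_names, clean_names, cutoff=0.8):
    """Map each dirty name to its closest clean name, or None if nothing is close enough."""
    # Normalize case and surrounding whitespace so trivial differences are ignored.
    lookup = {name.strip().lower(): name for name in clean_names}
    mapping = {}
    for raw in dirty_names:
        candidates = get_close_matches(
            raw.strip().lower(), list(lookup), n=1, cutoff=cutoff
        )
        mapping[raw] = lookup[candidates[0]] if candidates else None
    return mapping


# Hypothetical company names, invented for illustration.
crm_names = ["Acme Corp", "Globex Inc", "Initech"]
sales_names = ["acme corp.", "Globx Inc", "Initech ", "Umbrella LLC"]

name_map = build_name_map(sales_names, crm_names)
for raw, matched in name_map.items():
    print(f"{raw!r} -> {matched!r}")
```

Once such a mapping exists, it can be reused every time a new export arrives, which is the kind of replicability described above. Names that map to `None` are the ones worth checking by hand.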

Working with Big Data

Here I am talking not only about the traditional definition of Big Data (the 3 Vs: volume, velocity, variety) but also about any data too big to fit into Excel (more than ~1 million rows). This is something I have also encountered a lot. Most large corporations (not only digitally advanced ones) have already generated enough data to call it “Big”, and they predictably want to get insights from it. So when traditional business analysts are unable to deliver, these corporations turn to data scientists for answers. I understand that some of these questions may be extremely complicated and indeed require data specialists’ involvement, but is that always the case? I believe not. Sometimes the task is as simple as opening a few 5 GB datasets, merging them, making some pivots, and building slides and charts from the results. Of course, this may sound like a problem if you only know how to use Excel, but if you are a Python user, you can easily crack this task and save your organization some money and time.

It happened many times in my practice that I simply needed to draw some charts based on large datasets such as ticket sales distributions, customer information, SKU data, and so on. It was crucial to show the full picture rather than a sample, so Big Data processing skills were necessary.
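As a rough illustration of a pivot over a file that does not fit into Excel, the sketch below sums a value per category while reading one row at a time, so memory use does not grow with file size. It uses only the standard library; the column names `segment` and `revenue`, the `revenue_by_segment` helper, and the tiny in-memory sample are assumptions for the example, not data from any real project.

```python
import csv
import io
from collections import defaultdict


def revenue_by_segment(lines, key="segment", value="revenue"):
    """Sum `value` per `key`, processing the input row by row."""
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row[key]] += float(row[value])
    return dict(totals)


# Tiny synthetic stand-in for a multi-gigabyte export (hypothetical columns).
sample = io.StringIO(
    "segment,revenue\n"
    "retail,120.5\n"
    "wholesale,300.0\n"
    "retail,79.5\n"
)

totals = revenue_by_segment(sample)
print(totals)  # {'retail': 200.0, 'wholesale': 300.0}
```

With a real export, the same function would accept an open file handle instead of the in-memory sample. In practice many analysts would reach for pandas here, whose `read_csv` accepts a `chunksize` argument for the same row-batch style of processing.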

Advanced Modeling

That’s the most exciting part for me but, to be honest, the least often used. Nevertheless, it can sometimes be quite important for success. The underlying reason is the same as for the previous argument: corporations have data and want insights. Some of these insights cannot be uncovered without machine learning, as they are too complex for human analysts to find manually. Some tasks of this type that I have worked on:

  • Price forecasting with econometric modeling
  • Market segmentation with clustering algorithms
  • Product classification with tree-based algorithms
  • Product price elasticity estimation

Could these tasks be solved in Excel using linear regression? Well, with some degree of creativity, maybe… but I would not be sure about the quality of such work. More advanced modeling in Python gives us greater flexibility and very likely better results.
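To give a flavor of the last task in the list above, price elasticity is often estimated, in its simplest textbook form, as the slope of a regression of log quantity on log price. The sketch below implements that with plain Python on synthetic data; the `loglog_elasticity` helper and the numbers are assumptions for illustration, and it is deliberately the bare-bones version rather than the more advanced modeling a real project would call for (controls, seasonality, a proper statistics library).

```python
import math


def loglog_elasticity(prices, quantities):
    """Slope of ln(quantity) on ln(price), fitted by ordinary least squares."""
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    var = sum((x - x_mean) ** 2 for x in xs)
    return cov / var


# Synthetic data generated from quantity = 1000 * price ** -1.5,
# so the true elasticity is -1.5 by construction.
prices = [8.0, 9.0, 10.0, 11.0, 12.0]
quantities = [1000 * p ** -1.5 for p in prices]

elasticity = loglog_elasticity(prices, quantities)
print(round(elasticity, 3))  # -1.5
```

An elasticity of -1.5 reads as "a 1% price increase is associated with roughly a 1.5% drop in quantity sold". Once the data lives in Python, swapping this simple fit for a richer model is a small step.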

As a closing remark, one of the things I love about Python is its open-source principles. There is a wide community of professionals who openly share their ideas and groundwork, help others, and create great projects together. This idea was confusing to me at first, as a person from the traditional corporate world, but later I realized its enormous potential. That is why I am trying to motivate people to learn Python and join this great community.

If this article gets some traction, I plan to publish a few more stories sharing (as much as I am allowed to) some useful details and pieces of code for particular examples I’ve been working on at Bain.
