The world’s leading publication for data science, AI, and ML professionals.

Useful Excel Functions for Data Science

Excel for Data Science Journey

Photo by Lukas Blazek on Unsplash
Photo by Lukas Blazek on Unsplash

A Data Scientist spends plenty of time with the data. And, When we talk about the data, excel come in handy all the time. Excel has many useful functions like addition, subtraction, multiplication, but those are useful for calculation. In the Data Science field, we need a massive amount of data to train our machine learning models. The data can come from different sources, but finally, we need it organized. And, in most of the case, we convert our data in tabular format.

Once we convert the data in the tabular format, we need to plot different graphs and charts to visualize the data features and get the relation in other data columns. Before plotting the data, we need to make many changes and perform many operations in our data. And, there comes the need for excel functions.

This article will discuss some latest and useful functions in excel that we generally use for our data science tasks.

Match Function

Match function helps in getting the index number of any item in a list. We generally use the match function with Horizontal Lookup or the Vertical Lookup. I will be discussing the vertical and horizontal lookup later in this article.

Syntax

MATCH(lookup_value, lookup_array, [match_type])

Here, match type can be 0, which is for Exact Match, -1 for less than any number, and 1 for Greater than any number.

Match Function
Match Function

Lookup Functions

Lookup functions help in creating sub-data from the main data sources. The main data source can be in the same excel sheet or the different excel sheets. The columns in the sub-data should be a part of the main data source; else, we will get a null value for those columns. Lookup functions search for the data in two ways; first is the vertical lookup, and the second is a horizontal lookup.

Vertical Lookup

There should be a common column name for the vertical lookup in both the main data source and the sub-data. The common column in the main data should be present on the extreme left-hand side of the data.

Syntax

VLOOKUP(lookup value, range containing the lookup value, the column  number in the range containing the return value, Approximate match  (TRUE) or Exact match (FALSE))

Here, the lookup value is the value of the common column value in both sub data and the main data. The range is the selected data part of the main lookup data. And the column number is the list of sub column names that should be present in the main data. These sub-columns names help in generating sub table.

The Main Data Source in Different Sheet
The Main Data Source in Different Sheet

For our example below, we will use a vertical lookup with the match function to get the column index in the main data.

=VLOOKUP($A29,lookup_table!$A:$K,MATCH(B$28,lookup_table!$A$1:$K$1,0),0)
Vertical Lookup with match
Vertical Lookup with match

Horizontal Lookup

Horizontal lookup, we generally use for getting the calculated result value for any column name. The result column names should be part of the main data. Those columns’ names should be present in the top row of the data in the main data.

Syntax

HLOOKUP(lookup_value,table_array,row_index_num,range_lookup)
Horizontal lookup
Horizontal lookup

Index & Match Function

Vertical and Horizontal works on the condition that the lookup range list should be either extreme left or at the top first place. Sometimes, we need to make the match with the list at any random position. In that case, we use Index & Match.

Syntax

INDEX(array, row number,[column number]) 
Our Main Data with Matching Column at Right
Our Main Data with Matching Column at Right
=INDEX(lookup_table_2!$A$2:$K$38,MATCH($B61,lookup_table_2!$K$2:$K$38,0),MATCH(Sheet1!C$60,lookup_table_2!$A$1:$K$1,0))

Here, we need to select our main lookup table and then define the matches for row and the columns using the match function.

Index & Match Function Example
Index & Match Function Example

X Lookup

Vertical lookup search for the match in the top to bottom direction, X Lookup provides us the functionality to shuffle the direction from top to bottom and bottom to top.

Freezing Method

The freeze method helps in freezing any column or row. It is useful when we copy our function to the entire table. We need to use the dollar "$" symbol to freeze any row or column. The shortcut key for freezing any cell is "f4" on the keyboard.

Final Points

We have discussed some main functions that we generally for our data science tasks. These are many handy excel functions when we have to deal with Data Visualization, and we have to get the sub-data from our main data sources. And, also in data science interviews, these are the much-asked topics in excel.

Pranjal Saxena – Senior Software Engineer – Capgemini | LinkedIn

Well, that’s it for now. I hope you find this article useful, keep enjoying data science.


Related Articles