Data Analytics in Fashion: Excel Series (#1)-Overview

A comprehensive blueprint on getting started with data analysis in Microsoft Excel

Kiitan Olabiyi
Towards Data Science


Photo by Timur Saglambilek on Pexels

The principles discussed in this article are applicable to other industries aside from fashion. In other words, whether you are a data enthusiast, researcher, fashion enthusiast, student, or a professional, the Excel concepts discussed are relevant and applicable to your field.

Fashion brands and retailers are looking for ways to get the most out of data analytics, mostly to improve their product offerings by personalizing them for their target audience.

According to McKinsey, fashion and luxury companies that have integrated data into their planning, merchandising, and supply chain have seen substantial results: a 10% increase in sales from data-driven decisions in stock and store optimization, and about a 15% reduction in inventory costs.

Generally, brands that have used data to build a customer-centric business have grown digital sales by 30–50%.

The bottom line: with more data being released every day, the earlier fashion brands learn to leverage data, the better their chances of survival in an era where the gap between data pacesetters and late adopters has widened.

Stay with me…

While many fashion brands are yet to harness the power of data, some are exploring employee upskilling and talent hunting, and some are even willing to train new data professionals such as data engineers, data analysts, data scientists, and cloud engineers.

In other words, whether you are a core designer, illustrator, merchandiser, or technical designer, you need to know how to interpret insights from your company’s data. That is, you need a grasp of data analytics.

That being said, this article will focus on one of the key steps of data analytics: data cleaning. Before exploring that step, let me walk you through the phases of the data analytics life cycle.

The 6 Phases of the Data Analytics Life Cycle

data analytics life-cycle
Image by Author

Approximately 2.5 quintillion bytes of data are generated every day and as technology continues to develop, so will the amount of data we generate. As data gets created, it also needs to be processed and reused. Thus data goes through a somewhat circular motion that revolves around 6 phases, with each phase having a stipulated goal, tasks, features, and relevance.

The phases above follow each other sequentially; however, the flow can move forward or backward, because it is an iterative process.

#1. Data Discovery:

Photo by Pavel Danilyuk on Pexels

Here are some of the activities that happen in the data discovery phase:

  1. The data team and major stakeholders examine business trends.
  2. They define goals and success criteria.
  3. They create business case studies, i.e., a business problem is framed and the team comes up with a strategy to deal with it.
  4. They identify the data requirements, the sources of data, and the story the data is expected to tell.
  5. They assess and evaluate the resources needed.
  6. They formulate hypotheses based on the business problem.

#2. Data Preparation:

The focus moves from business requirements to data requirements.

The data team at this point performs the tasks below:

  • Identify data sources
  • Ascertain how much data can be obtained from each source within a specified time frame.
  • Sort out data collection, processing, and cleansing.
  • This phase is likely to be performed multiple times.

#3. Model Planning:

  • The data from the previous phase is explored by the team.
  • An analytic sandbox is required for processing and storing the data. It also provides networking capabilities all through the project time frame.
  • The team learns about the variables and selects the ones that suit the business case. This is feature selection for model building.
  • Next, the techniques, models to build, and workflow are determined.

#4. Model Building:

  • Datasets are developed for testing, training, and production.
  • The models are built using different algorithms as determined in the model planning phase.
  • A trial run of the model is executed to test the effectiveness of the model based on the business problem.

#5. Communicate Results:

Photo by RODNAE Productions on Pexels

  • The team identifies the key results, measures the value relative to the business question, and produces a data story to communicate the results to the stakeholders.
  • The major stakeholders compare the results obtained with the goals and success criteria set during the data discovery phase.
  • This phase is where the success or failure of the project is determined.

#6. Operationalize:

  • Finally, the team runs a pilot project to deploy the model and test the real-time environment.
  • This approach helps to learn about the model performance and related constraints in a production environment on a small scale and make adjustments before full deployment.
  • The team prepares a full report, briefings, source code, and related documents.

However, if the result does not meet the success criteria from phase 1, the team can move backward in the data analytics life cycle to any of the previous phases and modify an input in order to get a different result.

The data analytics cycle is indeed an iterative process.

It’s been a long read but I am glad we have covered the basics of the data analytics life cycle.

QUICK QUESTION: Remember, the focus of this article is on data cleaning. Which phase of the data analytics life cycle do you think data cleaning belongs to?

Give yourself a thumbs up if you made it to this point. You are doing great!

Gif by Giphy

However, I still have one more theoretical concept to cover and then move straight to data cleaning in Excel!

It’s a brief one, I promise!

In order to better understand what we want to do in Excel, it is important to highlight some benefits of data cleaning.

Grab your coffee, and let’s do this!!!

What is Data Cleaning?

Photo by Adli Wahid on Unsplash

Data cleaning is a key step in the data preparation phase, and you are likely to spend about 80% of your time in this phase. In other words, the rest of the phases largely depend on how clean your data is. Hence, it is advisable that you perform this step to the best of your abilities.

Why data cleaning?

A major reason to clean your data is to get accurate results when you build your model. For instance, duplicate entries would alter your results when you build a model that is supposed to predict sales or whether or not a user will click on an ad. Moreover, clean data is easier to read and comprehend, and it is more presentable.
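
To make the duplicate-entry point concrete, here is a minimal sketch in Python rather than Excel. The order records and amounts are made up purely for illustration; the only point is that one repeated row changes the total you would feed into any downstream analysis.

```python
# Hypothetical sales records: (order_id, item, amount).
# Order 102 was accidentally entered twice.
sales = [
    (101, "denim jacket", 80.0),
    (102, "silk scarf", 35.0),
    (102, "silk scarf", 35.0),  # duplicate entry
    (103, "leather boots", 120.0),
]

# Total computed on the raw data, duplicate included.
raw_total = sum(amount for _, _, amount in sales)

# dict.fromkeys keeps the first occurrence of each row and preserves order.
unique_sales = list(dict.fromkeys(sales))
clean_total = sum(amount for _, _, amount in unique_sales)

print(raw_total)    # 270.0
print(clean_total)  # 235.0
```

The raw total overstates sales by the value of the repeated order, which is exactly the kind of error that cleaning is meant to catch before any model is built.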

Having said that, let’s get to the moment you have been waiting for.

Told you I was gonna make it brief!

*Winks*

Excel For Data Analysis

Image by Author

Microsoft Excel is a spreadsheet application that has been around for a while. It is not only used to store structured data but can also perform calculations, create visualizations, build models, and more. Despite the fact that it has been around for over 30 years, it still has great relevance in the industry and is thus a great tool for learning data analytics.

Why start analytics with Excel? Why not Python? SQL?

Aside from the fact that Excel is affordable and runs well on modest devices, it also serves as a strong foundation for other tools used in data analytics. In other words, it is easier to learn, and building a foundation in it will deepen your knowledge of the analytics process.

Simply put, learning more complex tools, e.g., SQL, becomes easier when you know Excel. It will give you a smoother transition.

On this note, this entire series will cover 12 techniques and formulas used for data cleaning in Excel.

Ready to dive in??

Grab some cookies and let's do this!!!

Photo by Tamas Pap on Unsplash

Two things to do before performing data cleaning:

  1. Make a copy of your dataset.
  2. Spend some time understanding your data. Is it sales data? Customer data? What features does it have, and what are their data types?

Doing the second step will guide you on what to look out for when performing data cleaning.
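
For readers who like to see the "understand your data" step spelled out, here is a rough Python sketch of the same idea. The column names and rows are hypothetical, and the `guess_type` helper is my own illustrative function, not part of any library. It simply lists each feature, a rough data type, and how many blanks it has.

```python
import csv
import io

# Hypothetical sample; in practice you would open your own exported CSV file.
sample = io.StringIO(
    "order_id,item,price,order_date\n"
    "101,Denim Jacket,80.00,2023-01-05\n"
    "102,Silk Scarf,,2023-01-06\n"
    "103,Leather Boots,120.50,2023-01-06\n"
)

def guess_type(values):
    """Return a rough type label based on a column's non-blank values."""
    filled = [v for v in values if v.strip()]
    if not filled:
        return "empty"
    try:
        for v in filled:
            float(v)  # raises ValueError for anything that is not numeric
        return "number"
    except ValueError:
        return "text"

rows = list(csv.DictReader(sample))

# Map each column name to (rough type, number of blank cells).
summary = {}
for column in rows[0].keys():
    values = [row[column] for row in rows]
    blanks = sum(1 for v in values if not v.strip())
    summary[column] = (guess_type(values), blanks)

for column, (kind, blanks) in summary.items():
    print(f"{column}: {kind}, {blanks} blank(s)")
```

A quick summary like this tells you up front which columns have blanks and which ones hold text that may need standardizing.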

Here are the 12 techniques to be discussed in this tutorial series:

  1. Resize columns
  2. Get rid of leading and trailing spaces.
  3. Remove line breaks from cells
  4. Remove Duplicates
  5. Deal with blank rows and cells
  6. Standardize sentence case
  7. Number Formatting
  8. Highlight errors
  9. Split text to column
  10. Merge two or more columns
  11. Find and replace
  12. Table formatting
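
The series itself does all of this in Excel, but if it helps to see the logic behind a few of these techniques, here is a small Python sketch covering techniques 2, 3, 4, and 6. The item names are made up, `clean_text` is a hypothetical helper, and I use title case as the standard capitalization purely as an assumption for the example.

```python
# Hypothetical product names with common data-entry problems.
raw_items = [
    "  denim jacket ",   # leading and trailing spaces
    "SILK\nscarf",       # line break inside the cell, inconsistent case
    "Denim Jacket",      # duplicate once the first entry is cleaned
    "leather   boots",   # repeated spaces
]

def clean_text(value):
    # split() with no arguments breaks on any run of whitespace, including
    # line breaks, so joining with a single space removes leading/trailing
    # spaces, line breaks, and repeated spaces in one step.
    collapsed = " ".join(value.split())
    # Standardize capitalization (title case is assumed for this example).
    return collapsed.title()

cleaned = [clean_text(item) for item in raw_items]

# Remove duplicates while keeping the original order.
deduplicated = list(dict.fromkeys(cleaned))

print(deduplicated)  # ['Denim Jacket', 'Silk Scarf', 'Leather Boots']
```

Notice that the duplicate only becomes visible after the spaces and casing are fixed, which is one reason the order of cleaning steps matters.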

Now that you understand the data cleaning concept and some Excel techniques for cleaning data, jump HERE to dive straight into a step-by-step process of using each of the 12 techniques mentioned above.

The next tutorials in this series are structured as follows:

  • Tutorial 2: Techniques 1–6
  • Tutorial 3: Techniques 7–12
  • Tutorial 4: A complete Data Cleaning Project in Microsoft Excel.

Each tutorial has screenshots, short video clips, and practice exercises to reinforce your learning.

Click HERE to begin!!!!

Hope you enjoyed reading this article as much as I enjoyed writing it.

Don’t hesitate to drop your questions and contributions in the comment section.

Connect with me on LinkedIn.

CHEERS!
