How I won the Santander Data Masters Competition

3 central soft skills I used and the hard skills I learned.

First things first

Santander released the 2020 edition of its Data Masters Program in June. What makes the program brilliant is that it brings competition and learning together to help all kinds of data science students and professionals improve themselves.

I will tell you more about this amazing program later. First, let me show you where I was at the beginning of this journey, so you have a baseline to work with.

I’m a chemical engineer with a background in mathematical modelling. Before the pandemic isolation, I was a mediocre programmer. Although I learned a lot of statistics and probability while studying chemical engineering, that knowledge had become less reliable as time went on. So, that was me. Like almost every data science enthusiast, I had a helpful background, but there were also gaps in my skill set that I needed to close.

Santander Data Masters Banner — Source: (free to use)

The Bank and The Program

The Santander Group is a global banking group, led by Banco Santander S.A., the largest bank in the euro area. It originated in Santander, Cantabria, Spain. The Data Masters program was created as part of its empowerment culture.

Santander Data Masters — Data Science Path is a learning and certification program provided by Academia Santander, the corporate university. The program consists of a mix of training and competition, where Santander University provides the content and selects the best participants.

The competition

The program has 3 evaluation phases, which are:

  • Phase 1: the candidates must take the Cultural Fit, General Knowledge and Logic Reasoning assessments. In this phase, there were about 3200 candidates and only 100 were selected.
  • Phase 2: the candidates have 33 days to study all the material selected by Santander Academy. At the end, there is a technical assessment, and a score above 50% is needed to get certified.
  • Phase 3: the top 3 participants receive the opportunity to develop a Project provided by Santander and join a mentorship session with Santander staff.

So you can imagine that, from the beginning, everyone wanted to win to receive that mentorship and the opportunity to work on a real-world project. And so did I!

Learning & Hard Skills

This part is what makes the program so special. Instead of keeping the old competition format, where a problem is set and everyone tries to come up with an overfitted solution, Data Masters focuses on how well you can learn the most important hard skills needed to be a very good data scientist!

After selecting the 100 participants, they provided structured learning material covering the most important knowledge and skills that a data scientist should have. All participants had 40 days to learn as much as they could about the following HARD SKILLS:

  • Probability and Statistics: from basics to advanced concepts;
  • SQL and NoSQL;
  • Big Data concepts and applications;
  • Programming;
  • Regression: Linear, Multiple, Ridge & Lasso, Variable Selection;
  • Classification: Logistic Regression, Decision Tree, Random Forest and Naive Bayes;
  • Clustering: K-means, Hierarchical Clustering (divisive and agglomerative), Latent Dirichlet Allocation;
  • Performance Metrics.

3 Critical Soft Skills to WIN

As you might expect, beating 3200 competitors and winning this competition demands something out of the ordinary. That is exactly what I thought, and it is what I would like to share! So let’s go.

1. Synergy

“Connecting the dots” as Steve Jobs would call it.

During the program, I was also taking part in two other scholarship competitions from Udacity: AWS Machine Learning and the Machine Learning Scholarship Program for Microsoft Azure. As you can imagine, there were thousands of things to learn in each of the competitions (Santander, AWS and Azure). My best chance was to connect the dots in the most synergistic way so I could achieve high performance in every competition. So here is why this soft skill is so powerful:

  • Learning or doing related things at the same time makes your performance skyrocket. In my case, I studied the algorithms from all three courses at the same time and could answer all the questions correctly.
  • Your learning capacity will improve and speed up! As I saw related things over and over, the repetition helped the information make its way from short-term memory to long-term memory.
  • You start to think outside the box. By facing the same subject from different angles, we begin to see new applications and ways to use what we are learning or doing.

I sat down and figured out a way to learn all the highly related subjects in the same time span. That sped up my learning so much that I could cover all the subjects and learn them properly. It worked so well that it was the most important soft skill in bringing me to the top of 3200 competitors.

2. Time Management

As I had 40 days to cover all those topics, with two other competitions going on in the same period, time management was vital. So here is how I did it:

  • Break the problem into small parts, so you can estimate more precisely how long each task will take.
  • Show your plan to others and explain to them why you scheduled things that way. I did it with my girlfriend and a friend. The feedback I received gave me valuable insights.
  • Iterate over the points above.

3. Communication

We often think that communication is just giving presentations or sharing results, but there is much more we can use to perform better when learning or solving problems.

When I got stuck at some point or couldn’t find a way to solve a problem I was facing, I tried to write it down and explain it to my friends or discuss it in forums. It helped mainly because, to communicate something, I first needed to bring some order to my thoughts. When I prepare a question to ask, I often find the answer just because I had to structure everything in an orderly way.

The Prize

Applying and improving my Synergy, Time Management and Communication skills, as well as learning so many crucial hard skills for the Data Scientist role, was already an immense reward! Besides that, I received two amazing prizes!

The first one was to solve a 3-part problem where the task was to maximize profits by building tools to understand the satisfaction level of each customer. In this project, I learned and put into practice concepts like Classification, Clustering, Net Promoter Score (NPS), Feature Selection and Bayesian Optimization. Visit my GitHub to learn more about the project and see all the code I developed to solve this case!

Second, after solving the case, I had a conversation of about two hours with Felipe Simões, Data Science and Big Data Manager at Santander Brasil, and Caio Martins, Senior Data Scientist at Santander Brasil. The mentorship was simply amazing and very fun! I got feedback about the project and important insights for my career and learning path, as well as suggestions for my next steps. Here are some pictures of us having a great time discussing data science together!

Mentorship with Felipe (on the left) and Caio (on the right).

My final word is: if you have the opportunity to take part in a competition of this kind and format, go 100% for it! The benefits are simply amazing, and you will improve both your soft and hard skills! Oh, and don’t forget to have some fun along the way!

Thanks for reading,

— Pedro

