
ML Engineering: past and current industry trends, open-source and what the future looks like

With the release of a host of new industrial-scale, open-source ML toolkits, what does the future of ML engineering look like?

Photo by fauxels from Pexels


2020 has not been a great year on almost all fronts, but one field certainly saw a huge boom: the release of open-source toolkits that make it easier to build industrial-scale applications using ML, much easier than it would have been in 2019. What is this going to impact, and how? And as a student, how should you prepare for it?

If you are interested in and following ML, it does feel like a stream of interesting new developments in the field shows up every day. 2020 in particular has seen the rise of many production-scale, open-source ML engineering toolkits and a number of new organizations dedicated to AI, founded by some really smart people, accompanied by a rise in enterprise toolkits. More recently, data-science positions have shifted from 'data-science/research' toward 'applied-ML/engineering' roles, and most companies in 2020 are busy building their own AI platforms.

In this article, I'll offer some advice, based on my experience as an ML engineer working daily with applied researchers, data scientists, product managers, and engineering managers, on how to position yourself to make the best of this new ML revolution by the time you graduate, regardless of what your interests are.


A brief history of ML positions over time and the usual talent pool

Photo by Carles Rabada on Unsplash

Historically, companies thrived on having a smart data-science group, and top tech companies would compete with each other for the best talent. Speaking from my observation of hiring trends in the Bay Area and the US in general, this talent pool usually referred to people with a proven research record, and the same is generally true for big tech companies all over the world. Looking at the types of ML positions at top companies and the talent pool they drew from, we can divide recent history into the following eras.

Pre-TensorFlow era (before 2015)

In this era, data science had already matured, and almost all of the companies leading the industry had strong data-science teams of their own; as mentioned before, their strength lay in their data scientists. From an engineering point of view, 'cloud' was the buzzword: almost all major companies had their products/services on cloud ecosystems, either their own or on AWS, with Azure and Google Cloud on the rise, and cloud service providers were hiring large numbers of software engineers. DevOps saw huge growth.

Data-scientist positions straight out of college used to be open to a select few: STEM PhDs or MS students with good, relevant publications (NIPS, KDD, etc.). In other words, companies were mostly looking for candidates with a proven track record in research to work on their ML problems. The bar was high.

Data-science teams usually comprised data scientists and software engineers working together with other engineers (dev, quality, DevOps). Data engineers typically belonged to a specific team (a platform team, a data-engineering group, etc.) serving the data-engineering needs of many different orgs; only large orgs had their own data-engineering teams. There usually weren't any specialized 'Machine Learning Engineers' (MLEs) at most companies. Mostly it was software engineers who did ML, in collaboration with data scientists, product managers, and engineers from other non-ML teams. That, in short, is how they usually functioned. At smaller companies and startups, a data scientist would generally do a lot of these things themselves (research, product management, engineering, data engineering, etc.).

Deep-learning mass adoption era (2016–2019)

By now (2016), most big companies (tech and non-tech) had started adopting ML/deep-learning frameworks at some scale to optimize something somewhere in their pipelines, accompanied by changes to their traditional infrastructure to make that possible. Many who started new ML teams did so entirely on the cloud. This happened mostly for the following reasons.

  1. TensorFlow, PyTorch, etc., and their widespread adoption across companies, driven by improvements in GPU architecture and the decreasing cost of compute. That, along with the widespread availability of free cloud-based compute power enabling deep-learning academic research, brought in some revolutionary new academic research during this age, with the resulting deep-learning work being made available for everyone to use.
  2. A boom in self-driving startups boosting research in computer vision and applied deep learning. This was, of course, led by the likes of Tesla, Uber, and hundreds of AI startups working in the space, many funded by big Auto and big Tech, sometimes in partnership. A couple dozen car companies apart from Tesla are working on a production electric car to deliver within a few years.
  3. A boom in adoption of and interest in conversational AI (whose hype seems to have died down over the years), boosting academic research as well as the adoption of conversational dialogue systems in products across the industry to interact with customers/users. This age saw the rise of a new kind of product, the AI-enabled speaker, which ran Alexa, Cortana, Siri, Google Assistant, etc. Basically every major device maker had its own AI chatbot, resulting in a revolution in the field of Natural Language Processing/Understanding. Similar booms occurred in multiple other sub-fields, all taking deep learning a step forward.
  4. Breakthrough inventions in deep learning from top research orgs and researchers gave AI researchers at top universities the status of top basketball players in tech. Many work part time at, or have moved full time to lead, the labs of big Tech, while others continue to produce tremendous academic research, mostly funded by the corporations themselves involved in AI research. This age strengthened and deepened the academia-industry ties in ML research; many ML products we use today are the results of industry-academia partnerships.
  5. The Generative Adversarial Networks (GANs) generation: a lot of breakthrough research happened during this phase, but I'd get called out if I did not mention GANs and the countless problems they have solved. GANs today are a subject of their own, with full-fledged university classes dedicated to them. GANs actually predate 2016 (they were introduced in 2014), but this was when they attained widespread popularity. Apart from GANs, all the major deep-learning techniques became really common, driven especially by advancements in computer vision (CNNs, R-CNNs, etc.), along with RNNs, LSTMs, autoencoders, and others.

…and many other events which shaped the ML landscape of today. I have certainly missed a lot of other key events…


Photo by Patrik Göthe on Unsplash

How did ML hiring change during this period?

All of the factors mentioned above brought fast change to the ML landscape and ultimately led to widespread hiring in ML. This era saw fast growth in ML positions, as well as in candidates who knew ML, mainly because of the widespread growth of 'online courseware' in the preceding years (since around 2010) and the growing availability of free course materials, each better than the last. Hence, competition for data-science positions at top companies remained high.

Data-science teams are still data scientists and software engineers working together with other engineers, but there are a lot more of them in general. Big companies that used to have one central data-science/AI team serving all divisions now have separate AI teams in each division, each optimizing a different part of the solution and hiring for its own team rather than at a company level. There was explosive growth in ML-focused startups as well, which hired for a lot of data-science positions during this time. In short, due to the surge in demand, data-science teams became more open, i.e. less restrictive.

  1. The requirement of building scalable ML products brought the need for ML people with engineering experience. Many experienced software engineers at this time actually had knowledge of ML (they came from the post-2010 era of 'open courseware' and had wanted an ML position after graduation, but had moved into an SE position a few years back). Data-science teams now include a substantial number of SEs, who advance to become data scientists with enough years of experience.
  2. This was a huge boon to the BS/MS/PhD students involved with ML in academia, because many companies started hiring interns at different levels, and many of these interns returned to work full time. Hence, on a relatively old ML/DS team at a top company, it would be usual to see a bunch of senior data scientists, a couple of experienced software engineers, product managers, fresh-out-of-college graduates, etc. The fresh graduates are most likely to be called SEs at most companies and will in most cases never be called data scientists if they do not have a PhD, but in most cases their responsibilities will not differ from someone with a PhD either. Different companies simply have different conventions for naming positions.
  3. This age also saw the rise of 'domain-expert' data scientists who focus on one domain, for example advertising. Most companies dealing with ads now invest heavily in a data-science team, so if you work in ML/DS on ads at one company, it will not be very difficult to find a similar ads data-science role at another, because those roles are common now. Domains and domain experts grew across the board: vision, chatbots, ads, recommendations, search, risk/trust, and front-end/UX ML all saw a rise in domain-specific positions.
Photo by Nick Chong on Unsplash

2020 and beyond: The rise of the ‘Machine Learning Engineer’

In ML, 2020 saw a huge boost on the engineering side in the form of free, open-source, scalable ML toolkits. With companies (both small and big) relentlessly releasing toolkits and updates, it's easier than ever to build fully functioning ML products using them, provided there is a steady supply of smart engineers who know how to put them to use.

Big tech companies have spent years and millions of dollars on infrastructure and talent to set up production ML systems that serve millions or billions of consumers today. Does the availability of these open-source ML toolkits mean their problem is solved? I'd say no, because:

  1. The problems being dealt with in different industries are inherently much more complicated, with many constraints and considerations, and often a framework may not work out in terms of engineering even if it is the best fit in terms of business. Hence, it would take a substantial investment of human resources (engineers, scientists) to set up a pipeline, even using the open-source toolkits available now, that achieves performance similar to their existing ML systems.
  2. Companies with existing legacy ML systems will certainly see an opportunity, but most of them won't simply 'jump ship', i.e. design entirely new infrastructure on a newly available ML framework. Rather, they are more likely to combine the best of multiple available toolkits to solve their problems.
  3. However, smaller startups and small data-science teams building new ML infrastructure (basically, teams without a decade-old ML platform) are more likely to design their ML infrastructure from the ground up using the new toolkits, as it is much easier to use them straight off the bat than to spend months building a platform just to do the same thing.

To give an example, an engineer who is comfortable building data ETL pipelines with Airflow, Kafka, etc., performing analysis and training/fine-tuning models in TensorFlow/PyTorch, deploying and serving models on Kubernetes, and doing backend development in Node.js or similar, so as to ship a prototype into an A/B test that translates the change into actual product performance metrics, is likely to see the highest demand for their skills. Entry-level positions will most likely require subsets of this ideal skill set.
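To make this concrete, here is a minimal sketch of the orchestration layer of such a workflow as an Apache Airflow DAG. The Airflow API calls are real, but `extract_features` and `train_model` are hypothetical placeholders for project-specific ETL and training code, and the daily schedule is just an example.

```python
# A minimal daily ETL + retraining pipeline sketched as an Airflow 2.x DAG.
# extract_features() and train_model() are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_features():
    # Pull raw events from the warehouse (e.g. via Kafka/SQL) and write feature tables.
    ...


def train_model():
    # Train or fine-tune the model in TensorFlow/PyTorch on the fresh features,
    # then publish the artifact for the serving layer (e.g. on Kubernetes) to pick up.
    ...


with DAG(
    dag_id="daily_model_refresh",
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    etl = PythonOperator(task_id="extract_features", python_callable=extract_features)
    train = PythonOperator(task_id="train_model", python_callable=train_model)

    etl >> train  # training runs only after the ETL step succeeds
```

The point is not this particular stack but the shape of the skill: being comfortable wiring data, training, and deployment steps together end to end.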

Expected skill trends

  1. 'Applying ML' emerging as the go-to industry skill: engineers with knowledge of ML, able to perform research in collaboration with experienced data scientists and to take part in most of the data engineering, modeling, and deployment, will be in demand. Basically, an 'end-to-end' combination of skills will be valued more than just research or just engineering.
  2. Fine-tuning as an engineering skill in the age of transfer learning: a rise in open-source toolkits will be accompanied by experts in those toolkits, but the best AI engineer will be someone who can use most of them with ease, fine-tuning each to figure out whether it solves the problem at hand, and then deploying it. _Transfer learning (TL) is a research problem in machine learning (ML) that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. For example, knowledge gained while learning to recognize cars could apply when trying to recognize trucks._ Most ML models available today let you fine-tune a pretrained model to optimize for specific tasks, thereby reducing the effort and time needed to build an extensive and accurate dataset. Start-ups can start right away with a model that is ready to deploy (check out ml5js, tfHub). TensorFlow has ready-to-deploy pre-trained and custom models for the web (tf.js) and for mobile devices (tfLite), and a really easy way to fine-tune common models for many common tasks in image, speech, vision, NLU, etc. (a minimal fine-tuning sketch appears after this list). Even building a self-driving AI system is not as difficult as it was a year back; Nvidia, among many others, offers self-driving AI and other solutions (Nvidia Drive AGX). This age will see a huge increase in startups and established players making use of all the solutions available, and hence the demand for engineers who can do that with ease will rise. Startups will "hunt" for people who can understand the intricate details of a research paper describing a new solution, fire up a model using the paper's code and data, transfer the learning to their own problem, deploy it to an existing CI/CD system if it works out (assuming you have a system designed to plug in models), and then measure the impact.
  3. AI-platform-engineering-specific roles: there is already an increase in the AI platform engineer role, which is basically the current "data engineer + ML engineer" role. They are called ML platform engineers, SE (ML Platform), or Software Engineer (Platform) right now; the title is ambiguous across companies.
  4. AI/ML knowledge becoming mandatory in product management and offered as a new 'track' in many more majors. Knowing AI helps, and it's best when your PMs know exactly what you're proposing and are themselves able to present new approaches, resulting in more inclusion.
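To make the fine-tuning point above concrete, here is a minimal transfer-learning sketch. It assumes TensorFlow 2.x with the tensorflow_hub package; the Hub module URL, the number of classes, and the train_ds/val_ds datasets are placeholders to be swapped for whatever your task provides.

```python
# A minimal transfer-learning sketch: reuse a pretrained image feature extractor
# from TensorFlow Hub and train only a small task-specific head on top.
import tensorflow as tf
import tensorflow_hub as hub

# Pretrained MobileNetV2 feature extractor (URL is an example Hub module).
feature_extractor = hub.KerasLayer(
    "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/5",
    input_shape=(224, 224, 3),
    trainable=False,  # freeze pretrained weights; only the new head is trained
)

# Small classification head for the task at hand (5 classes here, as an example).
model = tf.keras.Sequential([
    feature_extractor,
    tf.keras.layers.Dense(5, activation="softmax"),
])

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# train_ds / val_ds are assumed to be tf.data.Dataset objects yielding
# (image, label) batches with images resized to 224x224 and scaled to [0, 1].
# model.fit(train_ds, validation_data=val_ds, epochs=5)
```

Swapping the frozen base for another Hub module, or unfreezing a few top layers, is usually a matter of a couple of lines, which is exactly why fine-tuning reads as an engineering skill rather than a research one.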

My advice for the ML undergrad currently in university

As an undergrad interested in ML, you might be wondering how to navigate the sea of new information that shows up in your feed every day, all of which seems surprisingly relevant. How do you make the most of the new toolkits being released almost every day? Are you missing out? New roles are suddenly open to undergrads, but their requirements seem competitive. This is all too much information to process and keep track of amidst classes, assignments, and those hours spent trading memes on social media. What do you do?

Historically, Machine Learning / Data Science positions were open only to graduate-degree holders, but the next generation of people working on ML will include more undergraduates than ever, working on ML straight out of college. Of course, the barrier to entry will still be high.

With an increasing number of students taking introductory ML classes starting in their third year of college, and thanks to companies like Coursera and edX and to universities releasing their course materials, ML/DL knowledge has been democratized, and we have more students than ever before who are more interested in ML than in their actual major.

Photo by Austin Distel on Unsplash

When the whole world, regardless of field, is eyeing a handful of ML positions, you need to differentiate yourself from the million other undergrads who all claim to be ML/deep-learning/data-science experts. Of course, being in certain majors will boost your chances over others (not all fields labeled STEM have equal affinity with the skills required in data science), and with lighter classwork it's easy today to graduate with a major-and-minor combination perfectly positioned for the MLE jobs coming up in the next few years (CS+Stat, CS+Math, Stat+CS, Operations Research+CS, CS+Finance, Econ+Stat, Econ+CS, etc.). Those MLE jobs will be stepping stones to Senior ML Engineer or Data Science positions, but DS jobs will lose their appeal year over year as companies realize they can get a similar level of smart work done by experienced undergrads under the mentorship of fewer data scientists.

In an age when huge numbers of STEM PhDs graduate every year all over the world, just getting a STEM PhD simply won't be enough. PhDs entering the job market now face growing competition from ML-experienced undergrads and master's graduates who have spent 2–3 years on the job gaining actual engineering experience, despite being much younger on average than a PhD graduate.

Let me put down a few pieces of advice that I'd say are helpful for an undergrad looking to prove themselves worthy in a competitive industrial ML world. I'd suggest forming habits rather than applying straight-up remedies.

  1. Keep track of recent advancements and industry adoption. Make sure you are well informed about the relevant tools and technologies in your specific field/domain, be it ML or your actual major. It's the age of information: one quick YouTube search on how something works returns hundreds of relevant videos. Once a piece of research is out, there are thousands of people (like me) writing about it, making videos about it, discussing it online, and before long even formal classes mentioning it. There is no dearth of resources to learn something; if you lack anything, it is time. You'll just need to set aside some time to keep yourself updated on a daily basis.
  2. Fundamentals first. It may be really tempting to jump straight to building products with all the tools available, especially for undergrads new to the field. However, regardless of how dedicated you are to ML, do not for any reason ignore the fundamentals you are supposed to be learning as an engineer. Most likely, no undergrad is getting a degree specialized in ML (ML doesn't yet exist as a dedicated engineering branch); the vast majority of undergrads will be CS, followed by CE/ECE, Mechanical/Civil, and so on. Mathematics, statistics, and coding are still the fundamentals of being a good data scientist, and the basic CS concepts (algorithms, data structures, databases, operating systems, networks, etc.) are still important to being a good software engineer. You will need to be both a good data-science person and a good software engineer if you are looking to break into ML engineering.
  3. A solution- and product-based approach to engineering. A focus on grades and GPA in a formal academic world is expected, and that should be respected. However, a solution-based approach to everything is the way to go for a good engineer. By this I mean find projects that solve practical problems, no matter how small the problem is. Do not expect to stand out by doing a lot of 'boilerplate' projects, which are very common. A person with a single unique project will always be preferred over someone with a bunch of projects you can easily find on other people's resumes.
  4. Focus on impact, the right kind, not token achievements. Nobody in the ML industry will care that you were the president of your college's placement committee, however proud of it you might be, if you do not have a single ML project. You might think you held an impactful position, but a student who worked at a startup delivering actual value and gaining experience will be valued far more than a student posting five token achievements on LinkedIn. Hence, in all of your academic and non-academic projects, it's important to know how much change, how much impact, you brought about.
  5. GitHub, Docker, open source, and showcasing. GitHub is your guide to ML worthiness. First of all, Git (version control) and Docker (containerized applications) are really important skills you need to have; all I have to say is make a habit of using them, regardless of whether you are a Computer Science undergrad. If you are a CS undergrad, you certainly should have a GitHub profile you are proud of by the time you graduate.
  6. Competitions, coding, Kaggle, and free datasets. Winning or leading in a competition is always appreciated. Taking part in competitions on Kaggle also gives you an idea of the actual ML problems companies face, since the companies hosting the competitions release real data. Many competitions are for prize money or swag, but just taking part in them is a huge learning experience, and a badge of honor. There are plenty of sites where you can take part in weekly, monthly, or yearly competitions, in both software engineering and data science.
  7. Hackathons and the general quest to build something smart. Take part in as many hackathons as you can, especially if you haven't before. Hackathons are important mostly because they teach you how to work in a team, in a constrained setting with limited time, and come up with a viable product. Companies value teamwork a lot; you are much more valuable to a company if you demonstrated impact as part of a team rather than alone. Many startups started as a hackathon idea. It is also a great way to meet others interested in the field who might later be your co-founders in your quest to found the next unicorn.
  8. Research as an undergrad. As an undergrad, your exposure to actual research can come from collaborating with professors, PhD or graduate students working in the field, your internship group, or interning with a university research group. If you are interested in research, with the aim of getting a graduate degree later, it certainly helps to read research papers in the area you are planning to go into. Starting from there, with the mentorship of your professors or other researchers, publishing a paper is great. A publication isn't a necessity, but it does increase your advantage by a lot.
  9. Confidence and a grip on the field you say you are an expert in. If you are aiming to be an ML engineer, just being able to implement a TensorFlow model isn't enough. Companies will test you down to your fundamentals, and if you list ML or related topics as your expertise, you should be able to face ML questions, both conceptual and situational/case-study. If ML is your expertise, you should be able to explain how SVMs and Random Forests work, as well as which one you'd use in a situation like detecting fraudulent payments (a toy sketch contrasting the two follows this list). You should also be well versed in solving 'data-structures-and-algorithms' problems.
  10. Soft skills, communication, and presenting ideas. If you are an undergrad dedicating substantial time to ML, you might often dismiss communication as not so important. But as an ML engineer or data scientist you'll spend a substantial amount of time explaining your ideas or work to an audience. Better to make a habit of presenting and get used to it (stage fright is common) while you are still in university.
  11. Internships. An internship is the best way to enter the industry, as the barrier to entry is a bit lower, although the competition is higher. That being said, companies still hire the best candidates they can find to spend the summer with them, just with a simpler process and fewer interviews. A good internship is always well regarded in the industry when you are entering full time. A good internship doesn't necessarily mean one at a top company (kudos if you get that, though), but it has to be impactful. If you can point to what you've learnt, engineering problems you've solved or contributed to, what your responsibilities were, and any significant goal achieved on your part, that looks good on your profile. Also, return offers are great, and a significant share of people joining full time are actually internship converts.
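For the kind of conceptual comparison mentioned in point 9, here is a toy scikit-learn sketch on entirely synthetic, imbalanced data standing in for a fraud-detection task. It illustrates the SVM-vs-Random-Forest question; it is not a recommended production fraud-detection setup.

```python
# Toy comparison of an SVM and a Random Forest on a synthetic, imbalanced
# binary classification task (a stand-in for "fraudulent payment" detection).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data: ~3% positive class, mimicking how rare fraud usually is.
X, y = make_classification(
    n_samples=5000, n_features=20, weights=[0.97, 0.03], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

models = {
    "svm": SVC(class_weight="balanced", probability=True, random_state=0),
    "random_forest": RandomForestClassifier(class_weight="balanced", random_state=0),
}

for name, clf in models.items():
    clf.fit(X_train, y_train)
    scores = clf.predict_proba(X_test)[:, 1]
    print(f"{name}: ROC AUC = {roc_auc_score(y_test, scores):.3f}")
```

Being able to explain why class imbalance pushes you toward metrics like ROC AUC or precision-recall, and what each model trades off in training cost and interpretability, is exactly the kind of grip on fundamentals interviewers look for.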

I'll prepare a whole new article for undergrads on how best to prepare for an ML career later, but I believe that doing the above will certainly bring a lot of competitive advantage.


Photo by CHUTTERSNAP on Unsplash

The world is changing and so is the ML landscape, more so recently, especially in 2020. It remains to be seen how the new toolkits get adopted over time, how the industry and its hiring trends change, and what new opportunities that brings.

Change is the only constant, and that statement is even more true in the case of industrial ML. It is difficult to keep up. In such a scenario, the wisest thing is to have your own strategy for staying relevant in an ever-changing domain.

The barrier to entry is high in this very competitive field, despite an explosion in positions and educational resources. As such, being informed, focusing on value, and figuring out a niche for oneself is the best advice I could give to both professionals and students.

Thank you!

