The solutions to your Three-Star Open Source Operationalisation headaches are just a four-minute read away

In my previous articles, I talked about the similarities between operationalising open source analytics and working in a restaurant. I set out all the challenges in the process, and now I am (finally) going to talk about how to solve them. This should help you achieve open source operationalisation worthy of three stars! I suggest working on three aspects: analytics heterogeneity, treating models as corporate assets, and not focusing on technology alone.
Analytics heterogeneity
First, let’s understand analytics heterogeneity. It means diverse analytics solutions working together, which is exactly what we need to operationalise open source effectively. Diversity is a key element of innovation, both in employees and in technology. A 2020 Columbia University study by Cowgill et al. concluded that a diverse data science team helps reduce bias. This is becoming increasingly important as governments around the world bring forward regulations on ethical AI. For more information, make sure to join the SAS Webinar on 13th April.
For technology, the enterprise needs to strike a balance between choice for its users and control for governance. Choice widens the talent pool when hiring and makes any role more interesting, which of course helps to retain staff. After all, you’re not going to stay in a job where you’re forced to use R when you’ve spent years learning Python! Choice also helps to reduce vendor lock-in and technology debt, through enterprise-wide acquisition decisions and well-documented data and model pipelines. However, it’s important not to go wild with choice. A balance with control is key, and it can be achieved by targeting a consistent technology stack across the enterprise. That can only be managed by understanding user needs and matching them to technology, so that you eliminate duplicate tools and acquire technology based on actual use cases.
Models as corporate assets
Second, treating models as corporate assets changes the mindset about their use in the enterprise. It gives them the same level of importance as other assets including cash, stock inventory, machinery and patents. To truly understand and control models as assets, you must have processes to monitor drift in model performance, bias and the fairness of decisions. This helps to facilitate governance, which is crucial in combatting the challenges of operationalisation. It also ensures people remain involved in the decision-making process.
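To make this concrete, here is a minimal sketch of one common drift check, the population stability index (PSI), which compares the scores a model produces in production against the scores it produced at training time. The function, thresholds and data below are illustrative assumptions on my part, not a reference to any particular product:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI between a baseline score distribution (e.g. at training
    time) and a current one. A common rule of thumb (an assumption,
    tune per model): < 0.1 stable, 0.1-0.25 watch, > 0.25 investigate."""
    # Bin edges come from the baseline distribution.
    edges = np.percentile(expected, np.linspace(0, 100, bins + 1))
    # Keep out-of-range production scores in the outermost bins.
    actual = np.clip(actual, edges[0], edges[-1])
    expected_pct = np.histogram(expected, edges)[0] / len(expected)
    actual_pct = np.histogram(actual, edges)[0] / len(actual)
    # Guard against log(0) in empty bins.
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)
    return float(np.sum((actual_pct - expected_pct)
                        * np.log(actual_pct / expected_pct)))

rng = np.random.default_rng(0)
baseline = rng.beta(2, 5, 10_000)  # score distribution when the model shipped
today = rng.beta(2.6, 5, 10_000)   # production scores have shifted
print(f"PSI: {population_stability_index(baseline, today):.3f}")
```

When the PSI creeps above whatever threshold you have agreed, that is the trigger for the human involvement described above: review, recalibrate or retire the model.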
Treating models as assets, in turn, means that governance covers both data and models. Ownership of every contributing element will be known, helping the enterprise to comply with policies, standards and regulations. That will encourage the reuse of assets across the business through interpretability and the democratisation of analytics. In other words, more people will know what is being done with analytics and can contribute to its use, speeding up innovation.
Focusing on more than technology
Finally, don’t just focus on the technology! Throughout these articles, I have stressed that successful operationalisation requires more. The tricky part is avoiding an excess of technology, which increases operating expenditure unnecessarily. With a consistent technology stack driving acquisition, the enterprise can strike a balance between choice and control, removing tools that no use case justifies. Time not spent paying down technical debt, or sitting in technology acquisition meetings, is time freed up for the things that matter: increasing diversity, ensuring analytics heterogeneity and installing the governance that lets models be treated as corporate assets!
That said, there are some features and capabilities you will want in your technology stack. First, you need to be able to govern all models, no matter the language, which gives data scientists maximum freedom to build the best possible model. Ideally, that governance would support continuous integration, delivery and monitoring, applying automation where needed and improving integration between the different elements of the enterprise analytics platform. Second, make sure that all of your data remains accessible. There is no point adopting a brand-new data science tool that can build the most accurate, jaw-dropping models if you can’t then get at all your data. A tool that forces you to copy all the data for processing will also reduce the value a model generates.
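As a sketch of what "govern all models, no matter the language" might look like in practice, here is a hypothetical registry entry. Every class name and field below is my own assumption for illustration; the point is that the model itself can be built in R, Python or anything else, while the governance metadata stays consistent:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ModelAsset:
    """One governed entry in a hypothetical model registry. The model
    can be written in any language; governance only needs consistent
    metadata describing it."""
    name: str
    owner: str                   # the accountable person or team
    language: str                # "python", "r", ...
    training_data: str           # lineage: where the training data lives
    deployed_on: date
    psi_alert_threshold: float = 0.25          # when to flag drift
    fairness_metrics: list[str] = field(default_factory=list)

champion = ModelAsset(
    name="credit-risk-scorer-v3",
    owner="risk-analytics-team",
    language="r",
    training_data="warehouse.loans_2023q4",
    deployed_on=date(2024, 3, 1),
    fairness_metrics=["demographic_parity", "equal_opportunity"],
)
print(champion)
```

With entries like this in one place, continuous monitoring and compliance reporting can run across the whole model estate, regardless of the tools that built each model.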
Final thoughts
I hope it is now clear that collaboration, shared knowledge of the process and openness are the key elements in achieving high-quality, efficient and repeatable open source operationalisation. These are, of course, exactly the qualities a restaurant needs to achieve three stars!
To learn even more about this topic, I encourage you to look into ‘ModelOps’ or ‘MLOps’. A three-part series, also published on Medium by yours truly, is a great place to start. I also encourage you to take the self-assessment that I have developed with some colleagues. It provides customised recommendations in order of importance, which can be supported by advisory services spanning people, process and, of course, technology.
As a final thought, if you’re familiar with the film Ratatouille that I referenced in the first article, you’ll know that the plongeur (or dishwasher) who we saw knock the soup over – the incident that sent our friend Remy the rat hurtling into the washing up – is crucial in saving the famous French chef Gusteau’s restaurant from mediocrity. Without giving people the opportunity to help innovate through collaboration and knowledge of the process, who knows how many chances you are missing?
Please feel free to reach out to me on social media (Twitter or LinkedIn) if you have any questions you want to discuss in depth!