My Reflection on Data Science Projects — What Worked and What Didn’t

Learn the lessons so you can avoid the same mistakes, and invest your time more wisely.

Jack Huang
Towards Data Science

Photo by Brad Neathery on Unsplash

For the past 5 years, I’ve been working in the data science space to serve users in the supply chain area of my company. I’ve developed various dashboards and a handful of small data applications. The majority of the solutions deployed are still actively used, while a few projects looked fancy and promising at the beginning but failed to take root.

After reflecting on the nature of these projects, I would like to share the hard-learned reasons why some of them worked and some didn’t.

Let me start with what worked well.

  1. Observe a business pain point, then fill the gap
    In the supply chain business, one of the main challenges is that operational managers need real-time data to guide their operations; data that is even a few hours old is deemed useless. Every day, analysts used to pull and compile various data to create indicators that support business decisions. The tasks were repetitive, manual, and unsustainable. I knew we needed a different approach to the problem. After some study, I developed and deployed an integrated dashboard that pulls real-time data and offers richer insights. Moreover, the dashboard is reusable. It has satisfied the hunger of users who need quick, real-time indicators every day.
  2. Partner with the process owner to automate essential tasks
    There were many well-established business processes that lacked a creative way of execution. Finance officers spent days preparing bi-yearly budgets; engineers spent a few hours every week pulling various data sheets to estimate storage consumption in the warehouse. These are examples of essential, must-do activities that keep the business running. The personnel may not know there is a better way to carry out the task. On a few occasions, I partnered with the users to automate and simplify the process, which transformed the way they carry out the tasks. The outcome of the partnership was satisfactory: the automation has helped them become more efficient in their work.
  3. Organize the scattered data into a streamlined database
    On a few occasions, I observed that my peers had created very useful data sheets on their computers, which they used for a specific analysis. There was nothing wrong with maintaining the data sheets locally; what excited me was the potential of the dataset. I converted the dataset from a spreadsheet and merged it into a mainstream database. The result? More people leveraged the dataset, and more successful use cases were created. I had liberated the data from its siloed, standalone form into a streamlined and integrated database. My point is to always keep an eye on good data and convert it to a more systematic, scalable format. This inherently expands the reach to a wider audience, and puts you on the right path to create more powerful applications.
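
To make the third point a little more concrete, here is a minimal sketch of moving a local data sheet into a shared database. It is not the actual pipeline I used: it assumes the spreadsheet has been exported to CSV, uses SQLite as a stand-in for the mainstream database, and the function name `load_sheet`, the table name `storage_usage`, and the sample columns are all made up for illustration.

```python
import csv
import io
import sqlite3


def load_sheet(csv_file, conn, table="storage_usage"):
    """Copy rows from a CSV export of a spreadsheet into a SQLite table.

    `table` and the column names are hypothetical; they come from whatever
    header row the exported sheet happens to have.
    """
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames
    if not columns:
        raise ValueError("CSV export has no header row")

    # Quote identifiers so headers containing spaces still work as column names.
    def quote(name):
        return '"' + name.replace('"', '""') + '"'

    column_list = ", ".join(quote(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)

    # Columns are created without types, so every value is stored as text.
    conn.execute(f"CREATE TABLE IF NOT EXISTS {quote(table)} ({column_list})")
    rows = [tuple(row[c] for c in columns) for row in reader]
    conn.executemany(
        f"INSERT INTO {quote(table)} ({column_list}) VALUES ({placeholders})",
        rows,
    )
    conn.commit()
    return len(rows)


# Illustrative data only: an in-memory "sheet" and an in-memory database.
sample_sheet = io.StringIO("zone,pallets\nA,120\nB,85\n")
conn = sqlite3.connect(":memory:")
inserted = load_sheet(sample_sheet, conn)

# Once the data sits in a database, anyone with access can query it.
total = conn.execute(
    "SELECT SUM(CAST(pallets AS INTEGER)) FROM storage_usage"
).fetchone()[0]
print(inserted, total)
```

In a real setting, the target would be the team's shared database rather than an in-memory SQLite file, and the columns would get proper types. The idea is the same: once the sheet is a table, other people can build on it.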

On the flip side, I was not lucky all the time. I had a handful of projects that did not take off. It boils down to the following two main reasons.

  1. Fancy concept but no real business case
    There were a few projects that I imagined would be cool, and without much study, I started to develop the application. For one of them, I spent three months developing a supply chain visualization tool. While I managed to wow the stakeholders, I haven’t heard back from them since. There were also times when a business partner presented me with a very cool concept and I spent a few weeks developing the prototype, but we didn’t manage to operationalize it. In hindsight, I’ve learned the hard way that a concept based mainly on imagination, without solid grounding in study, tends to fail. Sometimes it’s a good strategy to practice strategic procrastination: spend more time listening to users and understand their pain points before rushing into development work.
  2. Over-engineering a solution
    I just said that we should always listen to users, right? Not all the time. Users may imagine a feature that is cool but not practical. Pay attention to this kind of ask. In one project, I got a long list of enhancements out of 3 days of Kaizen discussion, and I spent 3 months building all the features. The output was an integrated dashboard consisting of 14 modules, but I soon realized that the users actively used only 5 of them! It finally dawned on me that while we should always listen to users, users may be wrong! They may have “imagination” syndrome, just like you. Don’t be too generous in accepting a fire hose of requests; instead, question every feature they ask for.

Final Thoughts:
To create a successful data science project, I have often found that the most challenging part is not development, but whether we can operationalize the work so that the targeted audience starts to adopt the solution. Below are the key lessons I’ve accumulated.

  1. Identify business pain points, then fill the gap
    Sometimes, users do not know there is a better way. Talk to them more often, be in the arena, and get your hands dirty.
  2. Tap into existing cumbersome business tasks
    Partner with the owners to improve the process. Let the owners be accountable for the solution, and support them behind the scenes.
  3. Follow the data
    Organizing scattered data into a more streamlined platform is usually a quick win. The richer and more integrated the data, the better.
  4. Don’t imagine a solution
    Understand the real problem users are facing. Don’t rush to development just because it looks cool.
  5. Don’t be too generous in accepting all requests
    Probe every enhancement, get down to the root of the need, and focus on the few.

Thank you for reading. I hope you find the lessons useful and that they help you invest your time more wisely in the future.
