In previous articles on how to hire an AI consultant, and how to price an AI project, I tried to give you a sense of how things work in the AI consulting space. I also gave you a sense of the special challenges faced by many established organizations in getting their data into the cloud.
This article is about the smaller and newer companies out there. Let me preface this whole thing by saying that not all startups are the same. Everyone is a special flower that grows or wilts just so. Now, let’s generalize. There exists a strange dynamic in the startup world where artificial intelligence is really hot, and yet expectations and reality are way out of whack when it comes to pricing. An enterprise client will typically need some quick solutions to deploy Apache Spark or Hadoop or whatever, just to get at their data in a timely manner. That’s before even thinking about machine learning. For startups it is just as challenging.
Once the data is accessible, the ML system needs to be designed, built, tested, and deployed. For some models this is very straightforward, but these steps still take time. There may be a need to try a few different models. The client almost always needs GPU resources to be provisioned, even if that just means loading up an AMI in AWS on a p2 instance. That takes yet more time. Don’t forget that sometimes you want a reserved instance to lower cost. Maybe you want a platform comparison between GCP and AWS. On top of that, the system almost always needs security that goes beyond a username and password. I like using PPK/PEM key files and Google Authenticator. So in short, even small machine learning projects for startups are not so small.
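To make the reserved instance question concrete, here is a rough sketch of the break-even arithmetic. It assumes a simplified pricing model (one upfront fee plus a discounted hourly rate), and the prices are placeholders, not real AWS or GCP rates. The function name `break_even_hours` is made up for this illustration.

```python
def break_even_hours(on_demand_hourly, reserved_upfront, reserved_hourly):
    """Hours of use beyond which the reserved option becomes cheaper.

    Simplified model: on-demand costs `on_demand_hourly` per hour, while
    reserved costs `reserved_upfront` once plus `reserved_hourly` per hour.
    """
    saving_per_hour = on_demand_hourly - reserved_hourly
    if saving_per_hour <= 0:
        # The reserved rate is not cheaper per hour, so it never pays off.
        return float("inf")
    return reserved_upfront / saving_per_hour


# Placeholder prices for illustration only (not real cloud pricing).
hours = break_even_hours(
    on_demand_hourly=1.00,
    reserved_upfront=2000.0,
    reserved_hourly=0.50,
)
print(f"Reserved pays off after about {hours:.0f} hours of use")
```

If the project is only expected to use the GPU for a few hundred hours, a number like this suggests staying on-demand. Plug in the current price sheet before deciding anything.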
I started writing this article after a small one-man startup approached our company last week with a project that would take 3 months on a part-time basis, and capped the cost at $1,000 USD. I like to be very time efficient, but this is silly. Even if the project were estimated at 4 hours, I would still have to decline. Just to go through the NDA and billing process… Yeah. This doesn’t make sense.
Now, this startup was not trying to jerk me around. It’s what they can afford. What I had to advise this entrepreneur to do was to go raise money for his idea and to stay in touch. These musings on the difficulty of ML startups are not new for me.
I like the table by Kevin Dewalt on Machine Learning project cost estimates for enterprise clients. See the link below:
The costs Kevin lists are higher than many startup machine learning projects, because many startups don’t have the big data requirements for their prototype. However, it should give everyone a good sense of what it costs to keep the lights on in a machine learning project.
There are some risks, like early unpaid client demos, that are worth it for the chance to win a project. Sometimes, though, the client has super unrealistic expectations about what is possible. Time, money, quality → pick 2.

It gets even stranger. ML development is about results, not how many hours of coding your consultant spends. In a commoditized market, we would charge for ML projects like a software house charges for web development. More time means more money, and the relationship between hours in and results out is sort of linear (maybe really more like a tanh S-curve). However, in ML a given amount of effort often produces an unpredictable result. Data science is still a bit of an art, and sometimes the biggest results come from knowing where to look for the right library (low effort; high reward), rather than writing the most aggressive code (high effort; high reward).
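For readers who want to see what a tanh S-curve looks like, here is a minimal sketch of the "hours in, results out" relationship for conventional software work. The function name, the 100-hour midpoint, and the 50-hour scale are all assumptions chosen for illustration, not measurements from any project.

```python
import math


def progress_after(hours, midpoint=100.0, scale=50.0):
    """Hypothetical S-curve: fraction of the deliverable done after `hours`.

    Slow start, roughly linear middle, and a plateau at the end.
    `midpoint` and `scale` are illustrative values, not real data.
    """
    return 0.5 * (1.0 + math.tanh((hours - midpoint) / scale))


for hours in (0, 50, 100, 150, 200):
    print(f"{hours:>3} h -> {progress_after(hours):.2f} of the work done")
```

The middle of the curve is where billing by the hour feels fair, because each extra hour buys a similar amount of progress. ML projects often refuse to follow any curve this tidy.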
I want to share an experience I had with a client’s data last week, to drive home that ML development follows a winding path:
I started with the client’s small, single-table MySQL dataset of ~10,000 records for a regression and categorization model. I began with unsupervised transfer learning because of the small size. I got good results, but suspected I could do better. Next I developed a supervised learning model, which compiled, but the results were worse. Then I moved on to generative models. Results even worse than before pushed me back toward my original solution. I added some feature engineering to the original approach, and scored a small improvement in performance. Now, it seems the model is as good as it’s going to get for this phase. Here, at the end of the proof of concept project, I have spent over 80% of the effort on code that will not be used. And the code with the best performance is the stuff I wrote in the first 20% of the project. Mostly in the first 20 minutes!
What we learn from this is that the relationship between time and results is shifty and weird. Data science is more chemistry and lemon pie than civil engineering and pure math. It is a more dynamic and practical process, and less a money-in/results-out equation.
My advice on startup expectations is to collect, organize, and label your data as much as you can before starting your machine learning project. Discuss your resources with your stakeholders and don’t be shy about getting multiple quotes from different vendors. You will find a trade-off of price and quality as with any other kind of shopping. Temper your expectations, but not your excitement. It is a really awesome time to be deploying machine learning!
If you like this post, then please recommend it, share it, or give it some love (❤). I’m also happy to hear your feedback in the comments.
Happy Coding!
-Daniel, Lemay.ai, 1(855)LEMAY-AI