Opinion

The Case Against Enterprise LLMs

A sober perspective as to why boring is best, even for AI

Mathieu Lemay
Towards Data Science
6 min read · Apr 29, 2023


Over the last few weeks, we’ve had a trove of custom LLM requests from clients and partners. This excitement, although warranted, is driven by the flood of tech news, not by any fundamental corporate advantage.

LLMs, even though they are not conceptually far off from most transformer-based training pipelines, require much more complex machinery to fine-tune and operate smoothly in a corporate setting. All the ones we have tested and deployed for clients work well, but they lack the gloss of polished commercial products, which is an issue for business leads.

Don’t get me wrong: LLMs are the best thing in AI since GPU-accelerated training, but they should be a tool of last resort, not a first dip of the toe in the enterprise AI pool.

Photo by Pixabay on Pexels.com

“Software gets slower faster than hardware gets faster” is an old adage in computer science. It’s easy to bloat software: it’s usually easier to add modules to a codebase than to remove them.

When it comes to AI, the growth of models (from machine learning models to deep neural networks, to LSTMs, to pre-trained CNNs, to now transformers) is following a similar path. Although the case for using best-in-class technology makes sense, there are situations where marginal gains hit diminishing returns once the total cost of ownership exceeds any identified benefit.

In our conversations with clients, an eerily familiar pattern is re-emerging: the same execs who are pushing for ChatGPT clones today are the ones who were adamant about chatbots a few years ago. It’s an easy play to get a demonstrable win without linking it to enterprise value or having to think about its justification. Their religious “you just don’t see the value yet” refrain is a lazy trope at best, and at worst a resource sink turned corporate liability.

A lot of issues in these deployments arise not from the projects’ rationale or feasibility, but from the justification for why starting with the most complex piece of technology mankind has ever developed is a valid first option for improving your next quarter by 10%.

AI is not fairy dust; it creates value in its relationship with data, context, and inference validity. Hope and prayer are usually not valid strategies when it comes to technical debt. American football games are won with strategy and execution, not with Hail Mary passes.

My Bias

As an engineer, my concerns are about deployment feasibility, total cost of ownership, and value for money. Our clients trust us to be transparent adjudicators of new technologies and their applications within their organizations. Deployment risks need to be evaluated against business upside and ROI. Implementing a technology simply because it is à la mode usually triggers numerous alarms, and rightly so.

It’s not that we have an issue (whether ethical or economical) with LLMs — we love technology. We have a track record of giving the Rorschach test to pre-trained models, building a horoscope trading bot, and even creating an emergency party button with disco balls. Technology is cool.

However, I must insist that technology will not solve your problems if you don’t know what your problems actually are. Compounding that lack of visibility, the engineering challenges associated with LLMs mean that every typical project risk becomes an existential threat to delivery.

What Success Looks Like

In all our years of running our AI consultancy, the biggest driver of AI adoption wins has been a clear definition of success. This means:

  • The business context has been clarified with KPIs;
  • The requirements of the project have been established; and
  • The delivery of the project has established goals.

Quality management systems, a natural yardstick against which to compare your machine learning projects, require traceability between established requirements and verification/validation; newer management techniques call for Objectives and Key Results (OKR) task assignments. Expectations surrounding an AI deployment should likewise be evaluated against measurable, objective success metrics.

Costs Without Justifications

Especially with LLMs, the AI drive over the last few years has had a series of push-and-pull forces between elective projects and projects of necessity. Elective projects are fun and turn into cool stories at the proverbial water cooler; projects of necessity are the monochrome suits that get the job done in the backroom. Which one would you rather have during market uncertainty?

Good AI, just like good design, should be invisible, not the centerpiece.

So Many Options Before LLMs

Before even considering generative AI, the older families of transformer-based models and pipelines can get you equivalent business outcomes without breaking the bank.

Most use cases are knowledge bases, historical analysis, and insights generation, so let’s see what alternative approaches we can find.

Navigating your data

In the last few years, two technologies have made intelligent text search a breeze: sentence embeddings and vector databases.

Sentence (or document) embeddings have been a true step up from earlier word- and subword-embedding technologies. Awareness of word order (thanks to positional encoding) yields far better handling of nuance and complex structure. Complex sentences, even entire documents, can reliably be vectorized, clustered, and compared.

Vector databases, many of which are comfortably open-source (such as Vald and Weaviate), already include self-optimization and approximate nearest-neighbor search right out of the box.

The number of applications in the business context of this simple model is dizzying: you now have a mini-search engine that can retrieve historical RFP sentences that are the most similar to your latest proposal, or even find and organize the relevant documents needed for a contract.

The advantage of this approach is that you avoid LLM hallucinations: ranked results provide contextual value first, meaning that you don’t need to dig beyond the first few results for your answer. Either you have a direct answer in front of you or you don’t. While this is not as reassuring as the cajoling pace of a prompted response, the information is highly accurate, and even the absence of worthwhile results is an indication of the internal state of affairs within your team.

Note: you don’t even need a vector database to gain value from similarity searches. A flat file with cosine similarity is fast enough on a multi-core system to be usable in an enterprise setting. If you want to try it yourself, I recommend converting all your documents to Markdown and splitting the text at headers. Congratulations: you now have your mini-search engine.
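The flat-file approach above fits in a few dozen lines. Here is a minimal sketch in plain Python; the bag-of-words `embed` function is a toy stand-in (you would swap in a real sentence-embedding model in practice), and the document, section names, and query are made up for illustration:

```python
import math
import re
from collections import Counter

def split_markdown_sections(text: str) -> list[str]:
    """Split a Markdown document into sections at each header line."""
    pieces = re.split(r"(?m)^#{1,6}\s.*$", text)
    return [p.strip() for p in pieces if p.strip()]

def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: a bag-of-words count vector.
    A real pipeline would call a sentence-embedding model here instead."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, sections: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    """Rank sections by similarity to the query; highest score first."""
    q = embed(query)
    ranked = sorted(((cosine(q, embed(s)), s) for s in sections), reverse=True)
    return ranked[:top_k]
```

Feeding it a two-section document and querying for one section’s topic returns that section at the top with a positive score; an off-topic query scores near zero across the board, which is exactly the “either you have an answer or you don’t” behavior described above.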

Before LLMs (which is a bit of a misnomer; it seems to include everything and anything that generates text), the NLP world was abuzz with various applications and well-defined solution patterns. (Take a look at the NLP section of Papers With Code for examples.)

Building a sentence or document classifier is still a tried-and-true approach to organizing data, not least because the data cleanup process itself forces the organization to confront its half-empty databases.
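To make the classifier idea concrete, here is a minimal nearest-centroid sketch in plain Python. The bag-of-words vectors, class names, and training snippets are all hypothetical stand-ins; a production version would use TF-IDF or sentence embeddings and a proper library classifier:

```python
import re
from collections import Counter
from math import sqrt

def bow(text: str) -> Counter:
    # Toy bag-of-words vector; swap in TF-IDF or sentence embeddings in practice.
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class NearestCentroidClassifier:
    """Assign a document to the label whose summed training vector is most similar."""

    def fit(self, texts: list[str], labels: list[str]) -> "NearestCentroidClassifier":
        self.centroids: dict[str, Counter] = {}
        for text, label in zip(texts, labels):
            # Accumulate word counts per label to form one centroid per class.
            self.centroids.setdefault(label, Counter()).update(bow(text))
        return self

    def predict(self, text: str) -> str:
        v = bow(text)
        return max(self.centroids, key=lambda lbl: cosine(v, self.centroids[lbl]))
```

Trained on a handful of labeled snippets (say, billing emails versus support tickets), it routes new documents to the closest class, which is often all the “organizing data” step actually requires.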

Whatever Happened to Basic Data Hygiene?

I cannot stress this enough: data will solve your AI problems; AI will not solve your data problems.

I have written at length about the relationship between data, AI, and value creation, and about how AI not only generates insights but also helps clean up cobweb-filled digital archives. In a business context, inertia from processes and historical culture is natural. This fog of war comes from the sheer number of individual working relationships required to accomplish the vision and mission.

In these processes, however, incomplete forms and missing reports are expected findings. The energy shouldn’t go into skipping over them; it should go into properly filing them.

Image by G.C. from Pixabay

Still, LLMs Are the Future

The key takeaway is that companies new to AI adoption should, under most circumstances, walk before they run. Successful project delivery while saving money is an all-around solid strategy.

LLMs have an otherworldly ability to navigate complex ideas and cleanly summarize them in a fraction of a second, but most news articles reference the best-case scenarios, not the total amount of effort to get there. Just like social media, the reality is often deceiving. If it looks effortless, it probably wasn't.

I’m not making a case against enterprise LLMs; I’m making a case against enterprise LLMs as a first AI project.


Matt Lemay, P.Eng (matt@lemay.ai) is the co-founder of lemay.ai, an international enterprise AI consultancy, and of AuditMap.ai, an internal audit platform.