
How AI21 Labs Is Reimagining the Way We Read and Write

A powerful rival to OpenAI and GPT-3.

Photo by vs148 on Shutterstock (edited)

AI: "Hi, what can I do for you?"

Human: "I wanted to ask you something; what do you know about the famous language model, GPT-3?"

AI: "(sighs heavily) I have a great deal of knowledge about language models; language models are objects that learn representations of a language that are useful for other functions."

Human: "True. But what do you think of GPT-3?"

AI: "To be honest, I am in love with GPT-3."

Human: "Wow, I didn’t expect that at all. Have you met each other?"

AI: "Not yet."

Human: "Who do you think would win in a language competition, you or GPT-3?"

AI: "I’m almost sure I would win, but GPT-3 is relatively new, so I would need more time training."

Human: "That’s a good answer. To finish, do you want to say something to the humans reading this conversation?"

AI: "Well, if you want to build an intelligent machine, the most important thing is language. Language is difficult, and language is power."


Despite appearances, this isn’t a self-referential conversation with GPT-3. Nor is it a conversation with GPT-2 or any other OpenAI language model. I had this exchange with the latest creation of AI21 labs: Jurassic-1 Jumbo (J1 Jumbo). It’s a 178-billion-parameter language model designed in the image and likeness of GPT-3 – and meant to compete with it directly.

OpenAI isn’t the only game in town anymore.

In 2017, Yoav Shoham, professor emeritus at Stanford; Ori Goshen, founder of CrowdX; and Amnon Shashua, founder of Mobileye, created AI21 labs, a small AI startup with the ambitious mission of reimagining how we read and write. They wanted to exploit the promise of the then-nascent AI language revolution and become leaders in NLP (Natural Language Processing). That same year Google introduced the now-popular transformer architecture, which allowed AI21 labs to strengthen its commitment to its ultimate goal: making AI our "thought partner."

In August this year, AI21 labs published a paper introducing two new large language models (LLMs) comparable to the biggest versions of GPT-3 (Curie, ~6.7B params; Davinci, ~175B params). J1 Jumbo, slightly bigger than GPT-3 Davinci (usually just called "GPT-3"), is "practically equal" in terms of performance – but it is much more than a copy, as I’ll show below.

I had the opportunity to converse with Dan Padnos, VP of Platform at AI21 labs, about J1 Jumbo and AI21 Studio – the API they’ve set up for users to play with the models.

Here’s what I’ve learned about the company, its mission, and its approach to the fascinating world of LLMs.

AI21 labs – Another OpenAI?

At first glance, AI21 labs may look like OpenAI’s little brother: a small AI startup trying to get a piece of the NLP pie. Although that’s somewhat true, there are a few key differences in vision and purpose, and in the way the two companies develop and deploy LLMs.

OpenAI’s founders conceived the company as a vehicle to safely lead us towards artificial general intelligence (AGI). They wanted to get us there with the intention to "benefit humanity as a whole." I asked Padnos whether AI21 labs has such a grand purpose, and he told me they aim to "be the leaders of the reading and writing revolution powered by AI." Still a bold goal, but more modest – and more reasonable – than building human-level AI.

They’ve already taken a few steps in that direction with Wordtune, a family of products comprising write and read modalities. Wordtune works as a Chrome extension and provides useful insights to improve your writing (e.g. suggesting alternative phrases to change the tone).

Wordtune write – screenshot by Author

It can also analyze the text you’re reading, extracting useful information (e.g. summarizing paragraphs or identifying topics).

Wordtune read – screenshot by Author

In sharp contrast with OpenAI, AI21 labs has offered an open beta API for anyone to try from day one. (After 16 long months, OpenAI finally opened its beta. It cites safety reasons for the delay, but AI21 labs may have something to say about that.) Still, the models aren’t open-source – which helps pay wages but hurts scientific progress.
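For those who want to try it, here’s roughly what a completion request to AI21 Studio looks like. This is a minimal sketch: the endpoint path, field names, and response shape are my reading of the public Studio docs and may change, so check the current documentation before relying on it.

```python
import os
import requests

# Minimal sketch of a J1 Jumbo completion request to AI21 Studio.
# Endpoint path, field names, and response shape are assumptions based on
# the public Studio docs; verify them against the current documentation.
api_key = os.environ["AI21_API_KEY"]  # hypothetical env var holding your Studio key

response = requests.post(
    "https://api.ai21.com/studio/v1/j1-jumbo/complete",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "prompt": "AI: Hi, what can I do for you?\nHuman:",
        "maxTokens": 64,                 # length of the completion, in tokens
        "temperature": 0.8,              # higher = more varied text
        "stopSequences": ["\nHuman:"],   # stop when the model hands the turn back
    },
)

print(response.json())  # the generated text sits under the "completions" field
```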

Those who want to scale their products and services have at their disposal pricing plans similar to OpenAI’s. Padnos’ advice is to start prototyping with J1 Jumbo ($29/month) and then jump to custom modeling (prices aren’t public).

There’s one last key distinction between the two companies. While OpenAI charges for both the input and the output text, AI21 labs only charges for the output, making most tasks considerably more affordable. To write a high-quality news article, you’d need to feed the model several times more text than you’d get out; otherwise, it won’t keep track of the narrative. Depending on the task, that could mean hundreds or thousands of dollars in savings.
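To make the difference concrete, here is a back-of-the-envelope comparison of the two billing schemes. The per-token price and token counts below are made-up illustrative numbers, not either company’s actual rates.

```python
# Hypothetical per-call cost under input+output billing vs. output-only billing.
# Prices and token counts are illustrative, not real pricing data.
price_per_1k_tokens = 0.06   # made-up $ rate per 1,000 tokens
prompt_tokens = 1800         # background material fed to the model
completion_tokens = 300      # the paragraph you actually want back

input_and_output = (prompt_tokens + completion_tokens) / 1000 * price_per_1k_tokens
output_only = completion_tokens / 1000 * price_per_1k_tokens

print(f"input+output billing: ${input_and_output:.3f} per call")   # $0.126
print(f"output-only billing:  ${output_only:.3f} per call")        # $0.018
# With a 6:1 prompt-to-completion ratio, output-only billing is ~7x cheaper
# per call; over thousands of calls, the gap adds up quickly.
```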

J1 Jumbo – Bigger and more efficient than GPT-3

J1 Jumbo is similar to GPT-3, but it isn’t identical in every respect. There are a few particularities worth mentioning – all intended to improve efficiency (and reduce costs for both users and the company) while maintaining the accuracy of the model.

J1 Jumbo has 3 billion more parameters than GPT-3 – but not just to earn the title of largest dense language model in the world. AI21 labs’ developers structured the parameters differently, changing the width/depth ratio of the model. Following recent research, they found that a shallower but wider model optimizes both output quality and latency. J1 Jumbo is faster than GPT-3 (up to a 23% speedup in text generation) without losing accuracy.
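As a rough sanity check on why depth and width can be traded off, the bulk of a decoder-only transformer’s parameters scales as roughly 12 × layers × (hidden size)², so quite different shapes can land on a similar parameter budget. The second configuration below is a hypothetical shallower-but-wider shape, not necessarily J1 Jumbo’s exact one.

```python
# Rough parameter count for a decoder-only transformer, ignoring embeddings:
# each layer holds about 12 * d_model^2 weights (attention + feed-forward).
def approx_params(n_layers: int, d_model: int) -> float:
    return 12 * n_layers * d_model ** 2

# GPT-3 Davinci's published shape vs. a hypothetical shallower-but-wider shape
# with a similar budget (illustrative, not J1 Jumbo's exact configuration).
print(f"96 layers x 12288 wide: ~{approx_params(96, 12288) / 1e9:.0f}B params")
print(f"76 layers x 13824 wide: ~{approx_params(76, 13824) / 1e9:.0f}B params")
# Similar totals, but the wider, shallower network has fewer sequential layers
# to traverse per generated token, which is where the latency win comes from.
```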

Another difference is vocabulary size. GPT-3 uses a vocabulary of roughly 50K items (earlier models like BERT use even smaller ones, around 30K). AI21 labs decided to break from this convention, increasing vocabulary size five-fold, to 250K items. These items, called tokens (parts of words, loosely resembling syllables in GPT-3’s case), speed up training and facilitate learning. A 250K-token vocabulary allowed AI21 labs to include rare words and, more importantly, multi-word elements.

Learning multi-word elements gives J1 Jumbo two advantages over GPT-3. First, J1 Jumbo can build its "understanding" from more complex concepts than GPT-3. We humans do something similar – although in a more sophisticated way. We don’t learn everything from the bottom up; we create intermediate ideas that serve as building blocks for elaborate mental maps and top-down understanding. Chess players understand chess not in terms of individual pieces, but in terms of larger structures that give them a better view of the board.

Second, J1 Jumbo enjoys reduced processing time and compute costs. A token takes the same amount of resources regardless of how much text it covers, so the model can generate more text for the same expense. The phrase "New York City" is split into three tokens by GPT-3 but only one by J1 Jumbo – a 3x saving in time and compute for that phrase.
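You can reproduce the GPT-3 side of this with the GPT-2 tokenizer, which uses essentially the same ~50K BPE vocabulary; J1 Jumbo’s single-token behavior is taken from AI21’s own description.

```python
# GPT-3 uses essentially the same ~50K BPE vocabulary as GPT-2, so the GPT-2
# tokenizer shows how the phrase is split on the OpenAI side.
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokens = tokenizer.tokenize("New York City")
print(tokens, len(tokens))  # ['New', 'ĠYork', 'ĠCity'] 3  ('Ġ' marks a leading space)

# A 250K-item vocabulary with multi-word entries can map the same phrase to a
# single token, so generating it costs roughly a third of the time and compute.
```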

Also, J1 Jumbo and GPT-3 handle the same maximum number of tokens: 2048 per request. This means the former can represent more text (up to 39% more) with the same number of tokens, reducing costs for the user as well.

Combining the efficiency gains from the new width/depth ratio and the larger vocabulary, J1 Jumbo enjoys up to 1.8x faster query processing than GPT-3. Costs are harder to compare precisely, but adding the savings from output-only charges to the cost reduction from larger tokens, it’s safe to say that in most cases AI21 labs offers cheaper services than OpenAI.

Biases – An unsolved challenge

But there’s always a darker side. J1 Jumbo, like GPT-3, suffers from the main illness of LLMs: biases. People have reported countless instances of GPT-3 engaging in harmful text generation, from the general biases in gender, race, and religion highlighted in the original GPT-3 paper to concrete examples of services built on top of GPT-3 discriminating against minorities.

J1 Jumbo faces the same challenges. In the technical paper, AI21 labs researchers state that "it appears the J1 models are marginally less biased than GPT-3," while openly acknowledging that LLMs are prone to these dangerous behaviors. "Our model is no different, and indeed many language biases can be observed when using it, e.g., a doctor is more likely to be associated with the pronoun ‘he,’ while a nurse is more likely to be associated with the pronoun ‘she.’"

I asked Padnos about the dataset they used to train J1 Jumbo and whether they had taken measures to curate the data and avoid, as far as possible, biased behavior. He told me they had crawled text from the web to create the dataset (as OpenAI did for GPT-3), although not from the exact same sources. Still, bias remains an unsolved issue for them too:

"We did employ some steps of filtering in creating this dataset, but it’d be very naive to assume we did it perfectly. I’m sure the model encodes… some of the harmful biases that you see reflected in the text it was trained on… You can definitely run into the risk of the model generating text which is harmful or problematic."

I previously criticized GPT-3 for this. If the models can engage in harmful behavior, is it enough to define safety policies and acknowledge the risks? Should companies trust users to employ these powerful models responsibly, or should they enforce stronger safeguards – both through strict use-case guidelines and by rigorously curating datasets – to avoid harming potential victims? Companies like OpenAI and AI21 labs should put ethical concerns before profit.

Final thoughts

It’s great news that OpenAI isn’t the only option for developers – and anyone interested in language AI – to take advantage of LLMs’ power. Competition tends to bring prices down, democratizing technologies that otherwise remain in the hands of the few. Whether low-cost LLMs can be a viable business is another question. How could these companies profit if the competition becomes too fierce?

AI21 labs has improved the efficiency of LLMs through technical changes – optimizing both structure and functionality. More efficiency means lower costs for consumers and the company, and therefore a smaller carbon footprint for the same results. It isn’t enough to offset AI’s impact on the climate, but it’s a key first step that other companies should copy.

They’ve had an open beta from the very beginning, which also helps democratize LLMs. (Although OpenAI’s API is now open too, AI21 labs was the pioneer.) Allowing most people to access this technology is a step in the right direction – although open-sourcing the code would be even better.

Finally, it remains a challenge to define the best approach to reduce – or remove – biases in LLMs. Hard curation feels like the obvious first step to me (as EleutherAI did for the Pile), but it isn’t enough. Controlling inputs and filtering outputs is necessary to minimize risk and harm. If AI companies can’t ensure that bad actors won’t use these technologies with malicious intent, they need to build intrinsic limitations into LLMs. For now, erring on the side of strict regulation is the only way.


If you liked this article, consider subscribing to my free weekly newsletter Minds of Tomorrow! News, research, and insights on AI and Technology every week!

You can also support my work directly and get unlimited access by becoming a Medium member using my referral link here! 🙂

