By now, most of you have probably seen the tweets, apps, code, and text produced by OpenAI's mighty GPT-3 model. The 175-billion-parameter model was trained on a corpus roughly the size of the internet. It has already amazed people in both tech and non-tech circles with its results, and folks are drooling over the possibilities in the coming years.

What GPT models do is estimate how likely a sequence of words is to occur in the real world. For example, "I played with a football in the ground" is a sequence that's much more likely than "I played with an apple in the ground." During training, the model must learn to predict the next word using only the preceding words as context. This simple training task has resulted in a powerful and generalizable model.
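To make that concrete, here is a minimal sketch of how you could score those two sentences yourself with the Hugging Face transformers library (my assumption of tooling, not necessarily what OpenAI does internally):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_log_likelihood(text: str) -> float:
    """Average log-likelihood of a sentence under GPT-2 (higher = more plausible)."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # With labels == input_ids, the model returns the mean cross-entropy
        # of predicting each next token from the preceding ones.
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return -loss.item()

print(sentence_log_likelihood("I played with a football in the ground."))
print(sentence_log_likelihood("I played with an apple in the ground."))
```

The football sentence should come out with the higher (less negative) score.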
I am still on the waitlist for GPT-3, so I decided to use GPT-2 instead and have fun with its 355-million-parameter model.
My idea of fun was to teach philosophical texts to the machine and then seek answers to the most esoteric questions of human life and existence!
Some people really need to get a life and understand the definition of fun, I tell you.
GPT-3 is massive, and GPT-2 isn't small either, so there are two factors to consider regarding practical usage of these models:
- How accessible are they to programmers?
- How easy are they to train, and how well do they perform when deployed in production?
The second factor was the reason I was quite headstrong about training the model on my local machine rather than using Colab or Paperspace. My machine is a humble 16″ MacBook Pro without GPU support (Thanks, Nvidia, for CUDA! Not!!).
I started with a small amount of text first, training the model on the works of Aristotle and Plato. It was after the model learned Plato's Republic that I started getting coherent answers to the questions I was asking. Fine-tuning the model was quite straightforward but a little time-consuming.
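For anyone who wants to try the same thing, the fine-tuning loop looks roughly like this with Hugging Face's Trainer; the file name, model size, and hyperparameters here are illustrative, and the actual script in my repo may differ:

```python
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                          TrainingArguments, TextDataset,
                          DataCollatorForLanguageModeling)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-medium")
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")

# Plain-text corpus, e.g. Plato's Republic (hypothetical file name).
dataset = TextDataset(tokenizer=tokenizer,
                      file_path="republic.txt",
                      block_size=128)
# mlm=False => standard left-to-right language modeling, as GPT-2 expects.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(output_dir="ai-philosopher",
                         num_train_epochs=3,
                         per_device_train_batch_size=1,  # small batch: CPU-only machine
                         save_steps=500)

Trainer(model=model, args=args,
        data_collator=collator,
        train_dataset=dataset).train()
```

On a CPU-only laptop like mine, expect this to run for hours even on a modest corpus.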

I created a Streamlit app with a decent UI/UX so I could deploy it on GCP, EC2, or Heroku, but given how large the models are, the floodgates of errors opened, and I spent an entire day dealing with those problems, unsuccessfully. Later I decided to focus on training the model rather than putting this AI-Philosopher up for public use. (Github link for the repo)
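The app itself was the easy part. A rough sketch of the kind of Streamlit front end I mean is below; the model path and generation settings are placeholders, not the repo's exact code:

```python
import streamlit as st
from transformers import pipeline

@st.cache(allow_output_mutation=True)  # cache the large model across reruns
def load_generator():
    # "./ai-philosopher" is a hypothetical path to the fine-tuned checkpoint.
    return pipeline("text-generation", model="./ai-philosopher")

st.title("AI Philosopher")
question = st.text_input("Ask a question about life and existence:")
if question:
    generator = load_generator()
    answer = generator(question, max_length=100, num_return_sequences=1)
    st.write(answer[0]["generated_text"])
```

Caching the pipeline is what keeps the app usable; without it, Streamlit reloads the multi-hundred-megabyte model on every interaction.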

After consuming the works of Nietzsche and Voltaire, I believe the model's results got really weird. The same words and sentences would be repeated, and a few sentences were printed in a bolder font.
Is GPT-2 yelling at me?

That became dark very quickly! Dogs of the world, run away! Well, it was not a good question as we already know that it’s the cats who go to heaven, not dogs. Ok, let me ask one more question.

At this point, I decided to use the 'max-length' parameter to restrict the length of these balderdash answers, but it had no effect. I also trained the model for many more steps to see whether that would bring any sanity to the system.
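In case it helps anyone fighting the same repetition, here is a hedged sketch of decoding settings, using Hugging Face's generate() API, that usually curb runaway repetition; `model` and `tokenizer` are the fine-tuned objects from earlier, and the specific values are illustrative:

```python
input_ids = tokenizer("What is love?", return_tensors="pt").input_ids

output = model.generate(
    input_ids,
    max_length=80,             # hard cap on answer length
    do_sample=True,            # sample instead of greedy decoding
    top_p=0.92,                # nucleus sampling trims unlikely tokens
    repetition_penalty=1.2,    # down-weight tokens already generated
    no_repeat_ngram_size=3,    # forbid repeating any 3-gram verbatim
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Length caps alone don't fix the underlying loop; the repetition penalty and the n-gram ban are what actually break it.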

What a philosophical discussion between a man and a machine!!!
Ok, one more about the proverbial love.

Alright, yeah, love isn't an easy concept, but it was an engaging experiment. The sentences made sense a few times; at other times, they were garbage. More often than not, it felt like talking to someone with major psychological issues.
I have seen results from GPT-3, and they seem reasonable and true to the real world. The sentences I got as answers were grammatically correct, but it felt like a deranged human was shouting them. I can already picture an animated drama actor yelling them at the top of their lungs, modulating pitch and speed, and delivering a crazy performance.
Maybe everything produced by the trained model was just a sequence of characters that it learned from the books it read. It isn't Artificial General Intelligence; the machine didn't concoct new thoughts. It merely read from a large corpus, computed probabilities of what the sentences should be, and in the absence of any new sentences, it kept repeating the old ones.
My other thought, about the increased font size, is a little creepy. Of course, the machine hasn't achieved any AGI, but this could be a first step towards it. There was a lot of repetition in many sentences, and it reminded me of Gödel, Escher, Bach and how recursion and repetition are at the centre of how inanimate things give rise to the animated world around us.
Conclusion
I know it's only a small model trained on a small computer, but if it were trained on a massive machine, what would the results mean? These GPT models are going to grow in size in the near future, and their answers will become more and more coherent.
The internal quantum states of this Philosophy model are indeterministic unless we observe them (thanks, quantum mechanics, for this esoteric statement). So, does this mean that none of the statements it produces has any real meaning?
Still, inspecting the internal state of the model might offer a few insights into how a machine "thinks" when fed a large amount of data. It can mimic a human in its thoughts, and while doing so, it comes across as a jerk.
PS: My Github contains this work, but I couldn't upload the PyTorch model because of its sheer size. Those who want to improve the model can contact me, and I can share the trained model via Dropbox or GDrive.
Before saying Ciao, I asked one more question.

Wow! What a repetitive, cold-shouldered jerk!