
So what is Artificial Intelligence?
Artificial Intelligence (or AI) is a field in computer science that focuses on solving problems by applying learning techniques (and some math).
In some ways, AI as a field focuses on building programs that try to imitate the way your own brain works.
But let’s talk about learning some more because it’s important in understanding artificial intelligence.
There are so many ways that we humans, and even other animals, learn. Let’s take my dog, Buster. When Buster was a pup I wanted to teach him to roll over, but I had two main problems. The first is that Buster doesn’t know English. The second is that he doesn’t exactly know what it means to roll over. So not only can I not communicate with him the way I could communicate with a kid, but I can’t even describe the action to him.
But I have two key ways to teach Buster to roll over. The first is that Buster knows that when I tell him "Good Job!", he’s on the right track. The second is that Buster loves chicken and little doggie treats. I can use these to guide Buster in the right direction by reinforcing his actions when they’re correct. By using these techniques I can teach a dog, who knows nothing about the English language, to effectively understand certain English words.
In AI, researchers and computer scientists can do the same sort of thing using Machine Learning techniques like reinforcement learning. This learning technique works exactly like teaching Buster to roll over.
Let’s say I want to teach a computer to play chess. Here’s what I could do. I can tell it the rules and the way the pieces move but it won’t get the point of the game. So, we’ll need to teach it the same way we trained Buster. We give the computer rewards when it takes a piece and we punish the computer (by making it lose rewards) when it loses a piece. Over time, the computer will learn to maximize rewards. The consequence of maximizing these rewards will be that the computer will learn to play chess extremely well.
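If you like seeing ideas as code, here’s a tiny, made-up sketch of the reward idea in Python. It isn’t a real chess engine (the function name and values are just invented for illustration); it only shows how "treats" and punishments become plain numbers the computer can try to maximize.

```python
# A toy sketch of a reward signal, not a real chess engine.
# The learner only ever sees a number telling it how good its last move was.

def reward_for_move(captured_piece, lost_piece):
    """Return a reward: positive for taking a piece, negative for losing one."""
    reward = 0
    if captured_piece:
        reward += 1   # the "Good Job!" / doggie-treat part
    if lost_piece:
        reward -= 1   # the punishment: lose some reward
    return reward

print(reward_for_move(captured_piece=True, lost_piece=False))   # 1
print(reward_for_move(captured_piece=False, lost_piece=True))   # -1
```

Over thousands of games, a reinforcement learning algorithm nudges its behavior toward the moves that collect the most of these rewards.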
But what’s another way that we learn? Well, there are actually lots of ways to learn but one that you probably know best is repetition.
What if you wanted to learn to shoot a basketball? The best way is to just practice shooting at the hoop, and over time your body will learn to make small adjustments in how hard you throw the ball, how you push the ball off your fingertips, and many other micro-adjustments that all add up to a shot that becomes more consistent over time.
Our brains are amazing at learning in this way and in AI we try and emulate this behavior, something even babies can do seamlessly.
But computers are just really good calculators. They can do simple math really fast. But how can they learn the way we do? That’s what the field of AI is all about: abstracting the learning process so that even a dumb machine can learn.
Another Example of Making Something Smart
There are a lot of ways to make something "intelligent" or at least seem intelligent. But before we take a look at computers let’s turn our attention to us humans again. I want you to fill in the last word in the following sentence:
"I was headed out to work but I forgot my …"
Did you think keys – or maybe pants? How did you know what word to think of? There are millions of words to choose from, but your brain was smart enough to narrow down the possibilities of what I was going to say pretty substantially. So how did your brain do it?
The first thing might be your experience.
We’ve been forgetting our keys for years now and some of us have been forgetting our pants far too often. But we aren’t forgetting frogs when we leave the house.
So when I said, "I was headed out to work but I forgot my…", your brain, rather efficiently and effectively, took past experiences, and maybe even times you’ve heard a similar phrase before, and narrowed the search – all in a few milliseconds.
That’s what we want to do with computers. Narrow the search space to just a few words.
One way a computer could do this is through math. More specifically statistics and probability. There’s a pretty low probability that someone forgot their chair at home. Not only that but there’s also a pretty low probability that the next missing word in the sentence "forgot my…" would even be the word "chair".
Ever see those word suggestions at the top of your keyboard on your smartphone? The words your smartphone suggests to you are based on the probability of the next word given the last few words that you already typed!
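To make that concrete, here’s a minimal sketch (using a made-up snippet of typing history) of how you could count which words tend to follow "forgot my" and turn those counts into probabilities. Real keyboards use far more data and fancier models, but the core idea is the same.

```python
from collections import Counter

# A tiny, made-up "history" of things people typed.
text = "i forgot my keys . i forgot my keys . i forgot my phone . i forgot my password"
words = text.split()

# Count every word that appears right after the pair "forgot my".
following = Counter(
    words[i + 2]
    for i in range(len(words) - 2)
    if words[i] == "forgot" and words[i + 1] == "my"
)

# Turn the counts into probabilities for the next word.
total = sum(following.values())
for word, count in following.most_common():
    print(word, count / total)   # keys 0.5, phone 0.25, password 0.25
```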

But take a look at the example above and you’ll see how you’re still better than a computer.
The suggestions it gave me were "phone", "password" and "username".
This is actually a pretty big problem in AI generally: context. AI, so far, isn’t well designed to understand broader context, mostly because it would take much more work to do so.
Based on the suggestions Google Keyboard gave me, it’s clear that the AI it uses knew I forgot something, but it didn’t quite understand that I was leaving a physical location – because if it did, it probably wouldn’t have suggested the words password or username.
After all, how many times in your life have you left your house but forgot your username?
Google Keyboard (probably) uses the same probabilistic approach we talked about earlier, where it takes the last few words, like "forgot my", and tries to determine which words have a high probability of coming next. And looking at it that way, words like username and password make a lot of sense.
As you learn more about AI, you’ll come to see that context is the biggest shortcoming and that the way we build some of these systems is still a little flawed. These algorithms lack context, which means they’ll be extremely good in specific, narrow areas but have a hard time making more abstract decisions.
This problem is being worked on, and technologies like GPT-3 from OpenAI show promise.
But what about really learning, like a baby would learn? Can computers learn through experience and training? The answer is yes, and it’s through what we call Artificial Neural Networks.
Artificial Neural Networks
The goal of artificial neural networks is to try and simulate the brain.
Sounds hard right? Well yes but also not really.
The brain processes information by passing it from one neuron to the next, building neural pathways. These pathways determine things like a certain memory you may have or how your hand should move.
And okay, I know that already sounds a little complicated, but just bear with me: when we start to learn about neural networks in computers, it’ll make much more sense how your own brain works, which will in turn help you understand neural nets.
The general idea is that we have data to start out with and the outputs that we’re looking for.
Let’s take an example. Let’s say I wanted to create an AI that could guess what grade you would get on a test.
To do this, I’ll take your previous test scores (the output, because that’s what I’m trying to guess), but I’ll also take some more information to learn from. Something good might be the minutes you spent studying for each test and the hours you slept the day before each test.
Basically what I’m trying to do is teach a computer to learn or find the correlation between (minutes you spent studying + hours slept) and your grades. If the computer can learn a strong enough correlation it would be able to pretty accurately guess your grades.
For example, here’s some past data I just made up,
| Minutes Studied | Hours Slept | Points |
| --- | --- | --- |
| 10 | 8 | 60 |
In this example, ‘Minutes Studied’ and ‘Hours Slept’ will be our inputs into the ANN (Artificial Neural Network) because this is the data we want to use to try and guess what score (output) you’ll get on your next test. That makes ‘Points’ our output.
So let’s just jump into it and start drawing out the ANN to see what one might look like. Here we have two inputs and an output.

The input neurons here are the circles in the input layer and are just there to take your data and that’s it. They don’t do anything else.
So here I took in our two inputs from the table above: 10 and 8. Now the problem is that our network looks a little dead. The output neuron doesn’t really do anything at this point and this whole thing barely looks like a network.
So, let’s add two new neurons into the mix.

Let’s also just think back to the brain.
In the brain, a neuron sends electrical information to other neurons. But if a neuron sent electrical information to all the neurons it was connected to, your entire brain would eventually light up. It doesn’t. Only certain pathways get lit up, which means electrical information gets passed to some neurons but not to others. Let’s simulate this.

We added our two little neurons in a new layer called the hidden layer. But now we need our neurons to do something.
Let’s make them just add together whatever information gets passed to them. In the example above we just look at the second neuron in the hidden layer.
So cool, our neuron does addition, but the information just gets passed along to the output neuron. I want a little more control over what gets past this second neuron in the hidden layer and what doesn’t. After all, we don’t want every input to just go from one neuron to the next; as in the example of the brain, certain pathways need to be built.
Here’s where activation functions come in.
All an activation function does is tell the neuron, "hey this data can be passed or not be passed", depending on the value given and the type of activation function being applied.
For simplicity, let’s say I have an imaginary activation function that checks whether the number is greater than 30. I chose this number randomly, but let’s just see what happens, and this time we’ll look at both neurons in the hidden layer.

Okay, so 10 and 8 get passed to both of our hidden neurons and dutifully the neurons add the numbers up. They then apply the activation function to see whether they should send the information along.
Since 18 is less than 30, they can’t pass the value to the output and so our output gives us 0!
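Here’s that dead little network as a few lines of Python, using the same made-up "greater than 30" activation function. It’s only a sketch of the example above, not how real activation functions work.

```python
# Our imaginary activation function: only pass the value along if it beats 30.
def activation(value, threshold=30):
    return value if value > threshold else 0

minutes_studied, hours_slept = 10, 8      # the two inputs from the table

hidden = minutes_studied + hours_slept    # each hidden neuron just adds: 18
output = activation(hidden)               # 18 is not greater than 30, so nothing passes

print(output)   # 0 -- the network is stuck until we give it weights
```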
Now, we also know from the table above that the real output should be 60, not 0. So, how is our algorithm supposed to adjust? Remember, the neurons aren’t smart; they can only add and apply the activation function I gave them.
We need a tool that lets us adjust things as they get passed from one neuron to the next. That way our AI could fiddle with the network until it learns the best pathway.
This is where weights can be used. Before each number gets passed from one neuron to another, we can multiply them by our weights to signify how important we think this value is for a neuron. So let’s see it in action.

You can see how I added the weights randomly to each line (or synapse, as it would be called in the brain) on the network. And we got one neuron to activate! The second neuron in the hidden layer multiplied each of the two numbers it received by its respective weight and added them together (10×4 + 8×2 = 56). It then used the activation function to check whether the value could pass (is 56 bigger than 30?) and finally the value was sent to the output.
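And here’s that same path through the network once the weights are in place. The weights 4 and 2 are just the randomly chosen ones from the example; there’s nothing special about them.

```python
def activation(value, threshold=30):
    return value if value > threshold else 0

minutes_studied, hours_slept = 10, 8
w_minutes, w_hours = 4, 2                 # the randomly chosen weights on each synapse

# The hidden neuron multiplies each input by its weight, then adds them up.
hidden = minutes_studied * w_minutes + hours_slept * w_hours   # 10*4 + 8*2 = 56
output = activation(hidden)                                    # 56 > 30, so it gets passed along

print(output)   # 56 -- close to the real score of 60, but not quite
```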
Neural networks are designed to compare their training values with the real values from the data in the table we gave them. Each time the network runs through its layers and gets an answer, it checks that answer against the real value and then adjusts the weights, so that next time it can get even closer to the real answer.
So here the neural net will compare 56 to the real value 60. It’s close but not quite there, so it adjusts the weights again to try and get an answer closer to 60.
This readjustment of the weights is actually called backpropagation.
Think of it like when you practice math problems in elementary school. You try and solve problems, get an answer, and then check the back of the book. If you got the answer wrong you go back and try something else until you get the right answer. That’s exactly what the neural network is doing right now.
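Real backpropagation uses calculus to figure out exactly how much each weight contributed to the error, but the check-and-nudge spirit can be sketched in a few lines. This toy version skips the activation function and just keeps nudging our two weights until the guess lands near the real score of 60.

```python
minutes_studied, hours_slept, real_score = 10, 8, 60
w_minutes, w_hours = 4.0, 2.0             # start from the random weights above
learning_rate = 0.001

for step in range(1000):
    guess = minutes_studied * w_minutes + hours_slept * w_hours
    error = real_score - guess            # check the back of the book
    # Nudge each weight a little in the direction that shrinks the error.
    w_minutes += learning_rate * error * minutes_studied
    w_hours += learning_rate * error * hours_slept

print(round(minutes_studied * w_minutes + hours_slept * w_hours, 2))   # ~60.0
```

With only one example to learn from, the network can simply memorize it; with a whole table of scores, the same nudging slowly finds weights that work reasonably well for all of them.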
Now, this was a very simplistic look at neural networks. The activation function doesn’t exactly work the way I described but it’s fairly close and makes the example easier to understand.
In fact, if you understand this, you basically understand 90% of how neural networks actually work, and I’d highly recommend watching this video to really solidify your understanding. This YouTuber does a great job of making the subject easy to understand, and the animations will really help you visualize what’s happening.
Recap
- AI is a field of computer science that tries to create algorithms, frameworks, or methods that can help computers understand and learn more abstract concepts.
- AI uses a mix of probability and statistics, mathematics, and neural networks to try and create algorithms that can perform a certain task like guessing your grades.
- Context is still a hard problem for computers.
- Neural Networks use many neurons that can only perform simple calculations.
- These networks can be seen as an algorithm that taught itself a specific task by training on lots of data.
- Neural networks are trained by adjusting their weights (and biases, which were not discussed here).
Conclusion
The field of artificial intelligence can be quite daunting, but looking at it through the lens of how we as humans learn is a great way to gain an intuitive understanding of how these systems function. As the field grows, more and more people are starting to use it in their work, so even if you aren’t a data scientist or software engineer, understanding the basics of AI can help you better understand the kind of data you’re getting from these systems and what their shortcomings may be.