Train Image Recognition AI with 5 lines of code

Moses Olafenwa
Towards Data Science
11 min read · Jul 20, 2018


In this article, we will briefly introduce the field of artificial intelligence, particularly in computer vision, the challenges involved, the existing modern solutions to these challenges and how you can apply these solutions conveniently and easily without taking much time and effort.

Artificial Intelligence has for decades been a field of research in which scientists and engineers have worked intensely to unravel the mystery of getting machines and computers to perceive and understand our world well enough to act properly and serve humanity. One of the most important aspects of this research is getting computers to understand the visual information (images and videos) generated every day around us. This field of getting computers to perceive and understand visual information is known as computer vision.

During the rise of artificial intelligence research from the 1950s to the 1980s, computers were manually given instructions on how to recognize images and objects in images, and on what features to look out for. These traditional algorithms were called Expert Systems, because they required humans to take the pain of identifying features for each unique scene or object to be recognized and of representing those features in mathematical models that the computer could understand. That involves a great deal of tedious work: an object can be represented in hundreds or thousands of ways, and there are thousands (or even millions) of distinct scenes and objects. Finding accurate, optimized mathematical models for all the possible features of every object and scene is work that would last forever.

Then, in the 1990s, the concept of Machine Learning was introduced. It ushered in an era in which, instead of telling computers what to look out for when recognizing scenes and objects in images and videos, we design algorithms that make computers learn to recognize scenes and objects by themselves, just as a child learns to understand his or her environment by exploring. Machine learning opened the way for computers to learn to recognize almost any scene or object we want them to.

With the emergence of powerful computing hardware such as NVIDIA GPUs and state-of-the-art Deep Learning algorithms for image recognition, such as AlexNet in 2012 by Alex Krizhevsky et al., ResNet in 2015 by Kaiming He et al., SqueezeNet in 2016 by Forrest Iandola et al. and DenseNet in 2016 by Gao Huang et al., to mention a few, it is possible to put together a number of pictures (more like image books for computers) and define an artificial intelligence model that learns the features of scenes and objects in these pictures by itself, then uses the knowledge gained from the learning process to recognize other instances of those types of scenes or objects that it encounters afterwards.

Training an artificial intelligence model that can recognize whatever you want it to recognize in pictures traditionally involves a lot of expertise in Applied Mathematics and the use of Deep Learning libraries, not to mention the time and stress of writing the code for the algorithm and fitting that code to your images. This is where we have provided our solution.

Our team at AI Commons has developed a Python library that lets you train an artificial intelligence model to recognize any object you want in images using just 5 simple lines of Python code. The library is ImageAI, built to let students, developers and researchers at all levels of expertise build systems and applications with state-of-the-art computer vision capabilities using between 5 and 15 simple lines of code. Now, let us walk you through creating your first artificial intelligence model that can recognize whatever you want it to.

To train your artificial intelligence model, you need a collection of images called a dataset. A dataset contains hundreds to thousands of sample images of the objects you want your artificial intelligence model to recognize. But you don’t have to worry! We are not asking you to go and download thousands of pictures right now just to train your artificial intelligence model. For this tutorial, we have provided a dataset called IdenProf. IdenProf (Identifiable Professionals) is a dataset that contains 11,000 pictures of 10 different kinds of professionals whose jobs humans can recognize from their mode of dressing. The classes of professionals whose pictures are in this dataset are as below:

· Chef

· Doctor

· Engineer

· Farmer

· Firefighter

· Judge

· Mechanic

· Pilot

· Police

· Waiter

This dataset is split into 9,000 pictures (900 for each profession) to train the artificial intelligence model and 2,000 pictures (200 for each profession) to test the performance of the artificial intelligence model as it is training. IdenProf has been properly arranged and made ready for training your artificial intelligence model to recognize professionals by their mode of dressing. For reference purposes, if you are using your own image dataset, you must collect at least 500 pictures for each object or scene you want your artificial intelligence model to recognize. To train on any image dataset you collect yourself with ImageAI, you must arrange the images in folders as seen in the example below:

idenprof/train/chef/ 900 images of chefs
idenprof/train/doctor/ 900 images of doctors
idenprof/train/engineer/ 900 images of engineers
idenprof/train/farmer/ 900 images of farmers
idenprof/train/firefighter/ 900 images of firefighters
idenprof/train/judge/ 900 images of judges
idenprof/train/mechanic/ 900 images of mechanics
idenprof/train/pilot/ 900 images of pilots
idenprof/train/police/ 900 images of police
idenprof/train/waiter/ 900 images of waiters
idenprof/test/chef/ 200 images of chefs
idenprof/test/doctor/ 200 images of doctors
idenprof/test/engineer/ 200 images of engineers
idenprof/test/farmer/ 200 images of farmers
idenprof/test/firefighter/ 200 images of firefighters
idenprof/test/judge/ 200 images of judges
idenprof/test/mechanic/ 200 images of mechanics
idenprof/test/pilot/ 200 images of pilots
idenprof/test/police/ 200 images of police
idenprof/test/waiter/ 200 images of waiters
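
Before training on your own images, it can help to confirm that your folders really follow this layout. The sketch below is not part of ImageAI: the function names, the set of image file extensions and the default 500-image threshold are our own assumptions for illustration, so adjust them to your dataset.

```python
import os

# Assumed set of file extensions that count as images; extend it if needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images(dataset_dir):
    """Return {split: {class_name: image_count}} for the train and test folders."""
    counts = {}
    for split in ("train", "test"):
        split_dir = os.path.join(dataset_dir, split)
        counts[split] = {}
        if not os.path.isdir(split_dir):
            continue  # a missing split simply yields an empty mapping
        for class_name in sorted(os.listdir(split_dir)):
            class_dir = os.path.join(split_dir, class_name)
            if not os.path.isdir(class_dir):
                continue  # ignore stray files next to the class folders
            counts[split][class_name] = sum(
                1
                for name in os.listdir(class_dir)
                if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
            )
    return counts


def find_problems(counts, min_train_images=500):
    """List classes with too few training images or missing from one split."""
    problems = []
    train = counts.get("train", {})
    test = counts.get("test", {})
    for class_name, image_count in train.items():
        if image_count < min_train_images:
            problems.append(f"{class_name}: only {image_count} training images")
        if class_name not in test:
            problems.append(f"{class_name}: no test folder")
    for class_name in test:
        if class_name not in train:
            problems.append(f"{class_name}: no train folder")
    return problems


if __name__ == "__main__":
    # "idenprof" is the dataset folder used in this tutorial.
    for problem in find_problems(count_images("idenprof")):
        print(problem)
```

If the script prints nothing for a dataset that exists, every class has enough training images and appears in both the train and test folders.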

Now that you understand how to prepare your own image dataset for training artificial intelligence models, we will guide you through training an artificial intelligence model to recognize professionals using ImageAI.

· First, you must download the zip of the IdenProf dataset via this link. You can also view all the details and sample results of artificial intelligence models trained to recognize professions in the IdenProf GitHub repository, whose link is below.

https://github.com/OlafenwaMoses/IdenProf

· Because training artificial intelligence models requires high-performance computer systems, I strongly advise that you ensure the computer or laptop you want to use for this training has an NVIDIA GPU. Alternatively, you can use Google Colab for this experiment, as it offers a free NVIDIA K80 GPU for experiments.

· Then you have to install ImageAI and its dependencies.

Install Python 3.7.6 and pip

(Skip this section if you already have Python 3.7.6)

Install ImageAI and dependencies

(Skip any of the installation instructions in this section if you already have the library installed)

- Tensorflow

pip install tensorflow==2.4.0

- Others

pip install keras==2.4.3 numpy==1.19.3 pillow==7.0.0 scipy==1.4.1 h5py==2.10.0 matplotlib==3.3.2 opencv-python keras-resnet==0.2.0

Install the ImageAI library

pip install imageai --upgrade

· Create a python file with any name you want to give it, for example “FirstTraining.py”.

· Copy the zip of the IdenProf dataset into the folder where your Python file is. Then unzip it into the same folder.

· Then copy the code below into the python file (e.g FirstTraining.py).
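
A training script along the lines of the five steps explained in this tutorial could look like the sketch below. It assumes the ImageAI 2.x (TensorFlow) API, where the trainer class is `ClassificationModelTrainer` and the class count is passed as `num_objects`. The function name `train_idenprof`, the `TRAIN_SETTINGS` dictionary and the value `batch_size=32` are our own illustrative choices; 200 experiments matches the sample training output shown in this article.

```python
# FirstTraining.py: a sketch assuming the ImageAI 2.x (TensorFlow) API.

# Keyword arguments passed to trainModel(); the values are illustrative.
TRAIN_SETTINGS = {
    "num_objects": 10,             # ten professions in the IdenProf dataset
    "num_experiments": 200,        # passes (epochs) over the training images
    "enhance_data": True,          # also train on modified copies of the images
    "batch_size": 32,              # assumed value: images studied per step
    "show_network_summary": True,  # print the model structure before training
}


def train_idenprof(data_directory="idenprof"):
    # Line 1: import ImageAI's model training class. The import sits inside
    # the function so this file can be loaded without ImageAI installed.
    from imageai.Classification.Custom import ClassificationModelTrainer

    model_trainer = ClassificationModelTrainer()    # line 2: create the trainer
    model_trainer.setModelTypeAsResNet50()          # line 3: choose ResNet50
    model_trainer.setDataDirectory(data_directory)  # line 4: dataset folder
    model_trainer.trainModel(**TRAIN_SETTINGS)      # line 5: start training
```

Adding a call to `train_idenprof()` at the bottom of the file starts the training when you run it.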

That’s it! That’s all the code you need to train your artificial intelligence model. Before you run the code to start the training, let us explain the code.

In the first line, we imported ImageAI’s model training class. In the second line, we created an instance of the model training class. In the third line, we set the model type to ResNet50 (four model types are available: MobileNetV2, ResNet50, InceptionV3 and DenseNet121). In the fourth line, we set the data directory (dataset directory) to the folder you unzipped the dataset zip file into. Then in the fifth line, we called the trainModel function and specified the following values:
num_objects : This refers to the number of different types of professionals in the IdenProf dataset.
num_experiments : This is the number of times the model trainer will study all the images in the IdenProf dataset in order to achieve maximum accuracy.
enhance_data (Optional) : This tells the model trainer to create modified copies of the images in the IdenProf dataset to help improve accuracy.
batch_size : This refers to the number of images the model trainer will study at once, set after set, until it has studied all the images in the IdenProf dataset.
show_network_summary (Optional) : This shows the structure of the model type you are using to train the artificial intelligence model.

Now you can run the Python file and start the training. When the training starts, you will see results like the ones below:

=====================================

Total params: 23,608,202

Trainable params: 23,555,082

Non-trainable params: 53,120

______________________________________

Using Enhanced Data Generation

Found 4000 images belonging to 4 classes.

Found 800 images belonging to 4 classes.

JSON Mapping for the model classes saved to C:\Users\User\PycharmProjects\FirstTraining\idenprof\json\model_class.json

Number of experiments (Epochs) : 200
Epoch 1/200

1/280 [>.............................] - ETA: 52s - loss: 2.3026 - acc: 0.2500
2/280 [>.............................] - ETA: 52s - loss: 2.3026 - acc: 0.2500
3/280 [>.............................] - ETA: 52s - loss: 2.3026 - acc: 0.2500
..............................,
..............................,
..............................,
279/280 [===========================>..] - ETA: 1s - loss: 2.3097 - acc: 0.0625
Epoch 00000: saving model to C:\Users\User\PycharmProjects\FirstTraining\idenprof\models\model_ex-000_acc-0.100000.h5


280/280 [==============================] - 51s - loss: 2.3095 - acc: 0.0600 - val_loss: 2.3026 - val_acc: 0.1000

Let us explain the details shown above:

1. The statement “JSON Mapping for the model classes saved to C:\Users\User\PycharmProjects\FirstTraining\idenprof\json\model_class.json” means the model trainer has saved a JSON file for the idenprof dataset which you can use to recognize other pictures with the custom image prediction class (explanation available as you read further).

2. The line Epoch 1/200 means the network is performing the first of the targeted 200 training experiments.
3. The line 1/280 [>.............................] - ETA: 52s - loss: 2.3026 - acc: 0.2500 shows how many batches have been trained so far in the present experiment.
4. The line Epoch 00000: saving model to C:\Users\User\PycharmProjects\FirstTraining\idenprof\models\model_ex-000_acc-0.100000.h5 refers to the model saved after the present experiment. The ex-000 represents the experiment at this stage, while acc-0.100000 and val_acc: 0.1000 represent the accuracy of the model on the test images after the present experiment (the maximum value of accuracy is 1.0). This result helps you identify the best-performing model to use for custom image prediction.
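
If you are curious about what the JSON mapping file from point 1 contains, you can read it with a few lines of Python. The sketch below assumes the file maps each output index of the model (stored as a string such as "0") to a class name, which is our understanding of the layout ImageAI 2.x writes; the helper name `load_class_names` is hypothetical.

```python
import json


def load_class_names(json_path):
    """Return the class names ordered by the model's output index."""
    with open(json_path, "r") as json_file:
        # Assumed layout: {"0": "chef", "1": "doctor", ...}
        index_to_name = json.load(json_file)
    # Sort the keys numerically so that "10" comes after "9", not after "1".
    return [index_to_name[key] for key in sorted(index_to_name, key=int)]
```

For example, `load_class_names("idenprof/json/model_class.json")` would list the professions in the order the model numbers them.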

Once you are done training your artificial intelligence model, you can use the “CustomImageClassification” class to perform image prediction with the model that achieved the highest accuracy.
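
Since the accuracy is written into each saved file name, you can let Python pick the best model instead of scanning the folder by eye. The sketch below is our own helper, not an ImageAI feature. It assumes the saved files follow the naming seen in the training output (for example model_ex-000_acc-0.100000.h5), and the function name `best_checkpoint` is hypothetical.

```python
import os
import re

# Matches names such as model_ex-000_acc-0.100000.h5 (pattern taken from the
# training output shown in this article).
CHECKPOINT_PATTERN = re.compile(r"model_ex-(\d+)_acc-(\d+(?:\.\d+)?)\.h5$")


def best_checkpoint(models_dir):
    """Return (path, experiment, accuracy) of the most accurate model, or None."""
    best = None
    for name in os.listdir(models_dir):
        match = CHECKPOINT_PATTERN.match(name)
        if match is None:
            continue  # skip files that are not saved models
        experiment = int(match.group(1))
        accuracy = float(match.group(2))
        # Prefer higher accuracy; on a tie, prefer the later experiment.
        if best is None or (accuracy, experiment) > (best[2], best[1]):
            best = (os.path.join(models_dir, name), experiment, accuracy)
    return best
```

Calling `best_checkpoint("idenprof/models")` after training returns the file to copy next to your prediction script, or `None` if no saved model is found.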

In case you have not been able to train the artificial intelligence model yourself because you lack access to an NVIDIA GPU, we have provided, for the purpose of this tutorial, an artificial intelligence model we trained on the IdenProf dataset, which you can use right now to predict new images of any of the 10 professionals in the dataset. This model achieved over 79% accuracy after 61 training experiments. Click this link to download the model. If you have not performed the training yourself, also download the JSON file of the IdenProf model via this link. Then you are ready to start recognizing professionals using the trained artificial intelligence model. Just follow the instructions below.

Next, create another Python file and give it a name, for example FirstCustomImageRecognition.py. Copy the artificial intelligence model you downloaded above, or the one you trained that achieved the highest accuracy, and paste it into the folder where your new Python file (e.g. FirstCustomImageRecognition.py) is. Also copy the JSON file you downloaded, or the one generated by your training, and paste it into the same folder as your new Python file. Finally, copy one or more sample images of any professional that falls into the categories in the IdenProf dataset into the same folder.

Then copy the code below and put it into your new Python file.
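
A prediction script following the steps explained in this tutorial could look like the sketch below. It assumes the ImageAI 2.x (TensorFlow) API for `CustomImageClassification`. The function names, the default JSON file name model_class.json (the name the training step saves) and the `result_count=3` default are our own assumptions, so replace the file names with the ones in your folder.

```python
import os


def format_results(predictions, probabilities):
    """Pair each predicted class with its probability, one line per result."""
    return [
        f"{name}  :  {probability}"
        for name, probability in zip(predictions, probabilities)
    ]


def classify_professional(image_name, model_name="idenprof_061-0.7933.h5",
                          json_name="model_class.json", result_count=3):
    # Lines 1 and 2 of the walkthrough are the imports. ImageAI is imported
    # here so the helper above stays usable without ImageAI installed.
    from imageai.Classification.Custom import CustomImageClassification

    execution_path = os.getcwd()                                        # line 3
    prediction = CustomImageClassification()                            # line 4
    prediction.setModelTypeAsResNet50()                                 # line 5
    prediction.setModelPath(os.path.join(execution_path, model_name))   # line 6
    prediction.setJsonPath(os.path.join(execution_path, json_name))     # line 7
    prediction.loadModel(num_objects=10)                                # line 8

    # Run the prediction and print the most likely professions.
    predictions, probabilities = prediction.classifyImage(
        os.path.join(execution_path, image_name), result_count=result_count)
    for line in format_results(predictions, probabilities):
        print(line)
    return predictions, probabilities
```

To try it, call the function with the name of the sample image you copied into the folder, for instance `classify_professional("sample.jpg")`, where sample.jpg is a placeholder name.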

View sample image and result below.

waiter  :  99.99997615814209
chef  :  1.568847380895022e-05
judge  :  1.0255866556008186e-05

That was easy! Now let’s explain the code above that produced this prediction result.

The first and second lines of code above import ImageAI’s CustomImageClassification class, for predicting and recognizing images with trained models, and the Python os module. The third line of code creates a variable that holds the path of the folder containing your Python file (in this example, FirstCustomImageRecognition.py) and the ResNet50 model file you downloaded or trained yourself. We created an instance of the CustomImageClassification() class in the fourth line, set the model type of the prediction object to ResNet50 by calling .setModelTypeAsResNet50() in the fifth line, and set the model path of the prediction object to the path of the artificial intelligence model file (idenprof_061-0.7933.h5) we copied to the project folder in the sixth line. In the seventh line, we set the path of the JSON file we copied to the folder, and in the eighth line we loaded the model. Finally, we ran prediction on the image we copied to the folder and printed the result to the Command Line Interface.

So far, you have learnt how to use ImageAI to easily train your own artificial intelligence model that can predict any type of object or set of objects in an image.

If you would like to know everything about how image recognition works, with links to more useful and practical resources, visit the Image Recognition Guide linked below.

You can find all the details and documentation on using ImageAI for training custom artificial intelligence models, as well as the other computer vision features contained in ImageAI, on the official GitHub repository.

If you find this article helpful and enjoyed it, kindly give it a clap. Also, feel free to share it with friends and colleagues.

Do you have any questions or suggestions, or would you like to reach me? Send me an email at guymodscientist@gmail.com. I am also available on Twitter via the handle @OlafenwaMoses and on Facebook via https://www.facebook.com/moses.olafenwa


Software Engineer @BabylonHealth, Prev. @Microsoft. A self-Taught computer programmer, Deep Learning, AI Engineer.