How to Make an AI Dog Eat Your Homework with Computer Vision

A funny project in OpenCV and Python that makes use of facial feature detection

Rohan Agarwal
Towards Data Science



While it’s important to learn how to use AI tools to tackle complex challenges, I want to show you how they also allow you to be creative, have some fun, and turn anything into a reality. In this case, we’re building the classic childhood excuse for not doing your work on time: my dog ate it.

Project Scope

Basically, we want a dog on your laptop. Using your laptop's webcam, the dog will stare at your face. If you get distracted (i.e. you look away from the computer) or get sleepy from your hard work (i.e. your eyes are closed), your new pet dog will start barking to get you back on task or wake you up. If you still don't get back on task, it will eat your homework (i.e. delete the file you were supposed to be working on).

This project is:

  • A funny joke
  • A forceful way to keep yourself on task
  • A good way to learn facial feature detection, audio playback, and system file manipulation in Python and OpenCV.

How to Build Your Dog

Setup

First, open a new Python 3 file. Install opencv-python (imported as cv2), Send2Trash, and playsound using the package manager of your choice (I like pip: pip install opencv-python Send2Trash playsound).

From https://github.com/npinto/opencv/tree/master/data/haarcascades, download these two files that allow OpenCV to detect facial features, including open and closed eyes: haarcascade_eye_tree_eyeglasses.xml and haarcascade_frontalface_alt.xml. Put them in your project directory.

At the top of your Python file, make the imports of the three libraries and two files we downloaded:

import cv2
from send2trash import send2trash
from playsound import playsound
eye_cascPath = '{PATH}/haarcascade_eye_tree_eyeglasses.xml'  # eye detect model
face_cascPath = '{PATH}/haarcascade_frontalface_alt.xml'  # face detect model

Also define two global variables that we’ll use later to track the number of frames the user’s eyes have been closed and the number of warnings they have received without punishment.

frames_closed = 0 # number of frames the user's eyes are not detected as open
warnings = 0 # number of warnings the user has received without their homework being eaten

Dog Behavior

To make our Python script feel like a dog, we need it to bark! And since the facial feature detection we’ll use isn’t 100% foolproof, we need to have a little leniency for the user, so we’ll implement a warning system. Each warning, the dog will bark, and after three warnings, it will eat your homework (the function for which we’ll implement soon!). To make the dog bark, use the playsound library on any audio file you want (if you don’t have one, try this website to get a free sound easily).

Note the use of global at the start of the function. This is necessary in Python to modify a global variable inside a function rather than creating a local one of the same name. It would be easier to avoid with object-oriented techniques, but since this is a quick script, we're okay with being a bit messy; in bigger projects, this pattern gets unwieldy fast.

def warn():
    global warnings
    warnings += 1
    playsound("./BARKS_WITH_ECHO_SPg.wav")
    if warnings >= 3:
        delete_hw(hw_path)
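As an aside, here is one sketch of how that object-oriented refactor might look. Everything here is hypothetical naming on my part (HomeworkDog, bark, eat); the two callables would wrap playsound and the deletion function from this article, so the class itself has no side effects:

```python
# A sketch of the same warning logic without global variables.
# The bark() and eat() callables stand in for playsound() and delete_hw().
class HomeworkDog:
    def __init__(self, bark, eat, max_warnings=3, frames_limit=60):
        self.bark = bark                  # called on every warning
        self.eat = eat                    # called once warnings run out
        self.max_warnings = max_warnings
        self.frames_limit = frames_limit
        self.frames_closed = 0
        self.warnings = 0

    def see_frame(self, eyes_open):
        """Update state for one webcam frame."""
        self.frames_closed = 0 if eyes_open else self.frames_closed + 1
        if self.frames_closed >= self.frames_limit:
            self.frames_closed = 0
            self.warn()

    def warn(self):
        self.warnings += 1
        self.bark()
        if self.warnings >= self.max_warnings:
            self.eat()
```

Because the state lives on the instance, this version is also easy to unit-test by passing in dummy callables.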

Eating Behavior

Now we will write the code that eventually allows your dog to eat your homework.

First, we want to get the file the user is working on using Python’s built-in input function. This just asks for a string from the user in the terminal where you run the script.

hw_path = input("Whatcha working on? ")

Then define the function to delete or eat the homework using the send2trash library. I used this library since it allows the user to recover their file from the system trash if needed, but if you want to be ruthless, you can use os.remove() instead. The os library has many other simple functions for manipulating system files, so it is worth learning for future projects.
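For contrast, here is a minimal sketch of the ruthless os.remove route, demonstrated safely on a throwaway temp file rather than real homework:

```python
import os
import tempfile

# os.remove deletes permanently instead of moving the file to the system
# trash, so there is no recovering from it. Create a disposable file first.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
assert os.path.exists(path)

os.remove(path)  # gone for good; nothing lands in the trash
assert not os.path.exists(path)
```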

We also use our warn() function in case the user gave us an invalid path, which we check with try and except, catching the exception that send2trash gives when it receives an invalid path. Again note the use of global as discussed before (this is starting to show how it gets messy with multiple variables being modified!).

def delete_hw(path):
    global frames_closed
    global warnings
    frames_closed = 0
    warnings = 0
    try:
        send2trash(path)
    except Exception as e:
        print("THAT'S NOT A VALID PATH YOU LITTLE CHEAT: " + str(e))
        warn()

Now we have the main “dog eats your homework” functionality.

Giving Our Dog a Brain: Computer Vision and Facial Feature Detection

We want the dog to eat our homework after three warnings if our eyes are closed or not looking at our computer (basically, not working), so it is time to implement the main logic of our program with OpenCV.

Load the files we downloaded with the constructor provided by OpenCV. This uses the data in the file to form the classifiers for detecting faces and open eyes. These xml files contain data to identify different features, as outlined in this paper. Essentially, different areas of the image are selected with rectangles of different orientations. When put in order (“cascading”, as you might see in the function names), they form a rough decision tree that efficiently detects whether we’re looking at the feature that file is for. We’re using standard classifier files provided by OpenCV since facial feature detection is a common task, but it is possible to train your own with this algorithm if you want to modify this project!

faceCascade = cv2.CascadeClassifier(face_cascPath)
eyeCascade = cv2.CascadeClassifier(eye_cascPath)

Trigger the webcam with OpenCV's VideoCapture constructor. There are multiple kinds of arguments you can pass in, but I chose the device index of my camera for ease. Specifically, I passed in index 0, which selects the device's primary camera; passing in 1 would choose the second camera, 2 the third, and so on. If you don't know the index, the constructor also accepts other arguments, such as the path to a video file.

cap = cv2.VideoCapture(0)

Then have a while loop that runs each frame. The rough steps are as follows:

  1. While the program is running, capture the image from the webcam every frame.
  2. If there is an image, convert it to grayscale. This is necessary because the detectMultiScale function in OpenCV expects a grayscale image since color information does not really help the algorithm in any way. You can easily identify a face in a black and white movie, no? The shape matters most, which is what the xml files contain data for.
  3. Use our face model xml file to detect if there is a face in the image using the Haar Cascade algorithm as mentioned above.
  4. If there is a face, draw a rectangle around the image on our screen just so we can see it working. Show the image in a window at the end of the loop. This doesn’t add any functionality to the project, but it is a nice visual.
  5. Also use our eye model to check if there are open eyes in the face, using the same technique as detecting faces.
  6. Update frames_closed accordingly, and if there are too many closed frames in a row, then we must warn() the user.

Here is the code:

while True:
    ret, img = cap.read()
    if ret:
        frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        faces = faceCascade.detectMultiScale(
            frame,
            scaleFactor=1.1,
            minNeighbors=5,
            minSize=(30, 30),
        )
        if len(faces) > 0:
            for (x, y, w, h) in faces:
                cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
            # Crop both the color image and the grayscale frame to the first face
            (x, y, w, h) = faces[0]
            frame_tmp = img[y:y + h, x:x + w, :]
            frame = frame[y:y + h, x:x + w]
            eyes = eyeCascade.detectMultiScale(
                frame,
                scaleFactor=1.1,
                minNeighbors=5,
                minSize=(30, 30),
            )
            if len(eyes) == 0:
                print('no eyes!!!')
                frames_closed += 1
            else:
                print('eyes!!!')
                frames_closed = 0
            frame_tmp = cv2.resize(frame_tmp, (400, 400), interpolation=cv2.INTER_LINEAR)
            cv2.imshow('Face Recognition', frame_tmp)
        else:
            print('no face!!!')
            frames_closed += 1

    if frames_closed >= 60:
        frames_closed = 0
        warn()

    waitkey = cv2.waitKey(1)
    if waitkey == ord('q') or waitkey == ord('Q'):
        break

cap.release()
cv2.destroyAllWindows()

Wrap Up

An example of what you should see when running the program.

Now you should have a functioning AI dog that eats your homework if you don't focus. I hope this tutorial helped you with facial feature detection in OpenCV and Python, and that you found some humor in it (and perhaps a subjectively useful tool). More than anything, I hope this inspires you to have fun with code and let your imagination run wild; software is not just utilitarian. Even simple tools like this become powerful when combined and used well.

Facial Recognition Risks

One thing to be wary of is the risk that comes with facial detection. It can be a serious invasion of privacy; luckily, this simple script stores no data and does all processing locally. If you modify it, make sure user privacy is still maintained. Additionally, detection algorithms can perform unevenly across ethnicities, sexes, and other groups, so be aware of that.

Let me know any thoughts or questions in the comments. I’m happy to point to more of my writing on wild ideas and on art meets tech as well!


Computer scientist and artist involved in startups and philanthropy. I write about where tech meets art and offer ideas on interesting problems in the world.