Playing Chrome’s dino game by physically jumping and crouching
A creative PoseNet application that runs in your browser and tries to predict whether you’re jumping, crouching, or standing still
You all know what this game is about. This is the best "sorry, you're offline" page in the world. People have built everything from simple bots that time the dino’s jumps to reinforcement learning agents with CNN state encoders in order to beat the game.
It’s a game and we’re supposed to have fun. Today, I’ll walk you through how to write some JavaScript code to play the game by jumping around in your room.
Overcoming Tech Barriers
Setting up a small webpage with basic JavaScript support to get a webcam feed and a dino game container is trivial for seasoned developers. All you need is the latest Chrome, a <video> tag, some JavaScript snippets from Stack Overflow to load a webcam feed, and the ripped T-Rex game.
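For reference, here is a minimal sketch of the webcam part, not the exact snippet used on the page. It relies on the standard `navigator.mediaDevices.getUserMedia` browser API; the function name `attachWebcam` and the injectable `mediaDevices` parameter are assumptions made for illustration.

```javascript
// Sketch: attach the webcam stream to a <video> element.
// `mediaDevices` is a parameter only so the function can be exercised
// outside a browser; in a page, the default navigator.mediaDevices is used.
async function attachWebcam(video, mediaDevices = navigator.mediaDevices) {
  // Ask the browser for a video-only stream (this triggers the permission prompt).
  const stream = await mediaDevices.getUserMedia({ video: true, audio: false });
  video.srcObject = stream;
  // Start playback so frames become available to the pose detector.
  await video.play();
  return stream;
}

// Browser usage (assumes the page contains a <video> element):
// attachWebcam(document.querySelector('video'));
```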
Moving on to the interesting part: movement/action detection.
TensorFlow has open-sourced a lot of fine-tuned models for web or mobile usage. I decided to use the full-body (both lower and upper body) pose detector. You can import TensorFlow.js, as well as the model I am using here, by adding these 2 lines inside your <head>.
With every dependency downloaded and ready, we can start by handling the camera element’s loadeddata event.
Once the webcam is ready, load the model and start a pose prediction loop.
I’ve also included a small play/pause feature for the recording, by simply adding an on-click event to the video feed container.
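The loop and the pause toggle could look roughly like the sketch below. It assumes the page uses the TensorFlow.js PoseNet package, whose loaded model exposes `estimateSinglePose(input, config)`; the `state` object, `togglePaused`, `predictLoop`, `handlePose`, and the element names in the commented wiring are hypothetical names chosen for this example.

```javascript
// Hypothetical shared state: `paused` is flipped by clicking the video
// container, `running` prevents two loops from running at the same time.
const state = { paused: false, running: false };

function togglePaused() {
  state.paused = !state.paused;
  return state.paused;
}

// Repeatedly estimates a pose and hands it to `onPose` until paused.
// `net` is expected to expose PoseNet's estimateSinglePose(input, config).
async function predictLoop(net, video, onPose) {
  if (state.running) return;
  state.running = true;
  try {
    while (!state.paused) {
      const pose = await net.estimateSinglePose(video, { flipHorizontal: false });
      onPose(pose);
      // Yield so the browser can paint and handle clicks between frames.
      await new Promise((resolve) => setTimeout(resolve, 0));
    }
  } finally {
    state.running = false;
  }
}

// Browser wiring (assumes the PoseNet script exposes a global `posenet`,
// and that `video`, `container` and `handlePose` exist on the page):
// let net;
// video.addEventListener('loadeddata', async () => {
//   net = await posenet.load();
//   predictLoop(net, video, handlePose);
// });
// container.addEventListener('click', () => {
//   // A paused loop exits on its own; restart it when un-pausing.
//   if (!togglePaused() && net) predictLoop(net, video, handlePose);
// });
```

Keeping the model and the video element as parameters makes the loop easy to try with a stub model before wiring it to the real webcam.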
Detecting Actions
PoseNet outputs a list of predicted bone (keypoint) positions, each with a score value.
To keep things simple, our algorithm is very basic.
- Pick out the left and right hip bones
- Select the hip bone with the greatest confidence (score)
- If the score is not > 0.6 (better than a random guess), go to the next frame
- Else, perform simple thresholding on the y axis to detect actions
Here’s the first part of our algorithm so far:
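A minimal sketch of this first part follows. It assumes the pose object has the shape returned by the TensorFlow.js PoseNet package, that is a `keypoints` array whose entries carry `part`, `position` and `score`; the names `pickHip` and `MIN_SCORE` are made up for this example.

```javascript
// Confidence threshold from the steps above.
const MIN_SCORE = 0.6;

// Returns the more confident of the two hip keypoints,
// or null when neither hip is detected confidently enough.
function pickHip(pose, minScore = MIN_SCORE) {
  const hips = pose.keypoints.filter(
    (k) => k.part === 'leftHip' || k.part === 'rightHip'
  );
  if (hips.length === 0) return null;
  // Keep whichever hip has the higher score.
  const best = hips.reduce((a, b) => (b.score > a.score ? b : a));
  // Skip this frame unless the score is strictly above the threshold.
  return best.score > minScore ? best : null;
}

// Example with a hand-written pose:
// pickHip({ keypoints: [
//   { part: 'leftHip', score: 0.9, position: { x: 100, y: 240 } },
//   { part: 'rightHip', score: 0.7, position: { x: 140, y: 242 } },
// ] }).part === 'leftHip'
```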
Moving on to thresholding.
KISS.
- If the hip is positioned in the lower part (bottom 20%) of the camera feed, treat this as crouching.
- If the hip is in the middle of the feed (between 20% and 70% of the height of the camera feed), treat this as idle.
- Else, treat this as a jump
Of course, this presumes that the player is standing in the proper position in front of the camera. This is pretty easy to do. Just make sure you are far enough away that your hips are in the middle of the viewport, with enough room to crouch and jump.
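The thresholding rules can be sketched as a small pure function. This is one possible reading of the rules: it assumes the percentages are measured from the bottom edge of the feed, and that the hip's `y` value is in image coordinates (0 at the top, growing downward), so it is flipped first. The name `classifyAction` is hypothetical.

```javascript
// Classifies the hip's vertical position as 'crouch', 'idle' or 'jump'.
// hipY: hip y coordinate in pixels, with 0 at the top of the frame.
// frameHeight: height of the camera feed in pixels.
function classifyAction(hipY, frameHeight) {
  // Height of the hip above the bottom edge, as a fraction of the frame.
  const fromBottom = 1 - hipY / frameHeight;
  if (fromBottom < 0.2) return 'crouch'; // bottom 20% of the feed
  if (fromBottom <= 0.7) return 'idle';  // between 20% and 70%
  return 'jump';                         // top 30% of the feed
}

// With a 480px tall feed:
// classifyAction(450, 480) -> 'crouch'
// classifyAction(240, 480) -> 'idle'
// classifyAction(100, 480) -> 'jump'
```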
The last part, implementing toggleAction, is pretty easy. We are just going to emulate key press events. Key code 32 means spacebar and key code 40 means down arrow.
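Here is one way toggleAction could be sketched, not necessarily how the page implements it. It dispatches synthetic `keydown` events carrying the legacy `keyCode` field. Sending a matching `keyup` when the action changes is an assumption added here so that keys do not stay "held"; the `target` and `makeEvent` parameters are also hypothetical and exist only to make the function testable outside a browser.

```javascript
// Key codes from the text: 32 is the spacebar, 40 is the down arrow.
const KEY_CODES = { jump: 32, crouch: 40 };

// Last action sent to the game; 'idle' means no key is held.
let currentAction = 'idle';

function toggleAction(
  action,
  target = document,
  makeEvent = (type, init) => new KeyboardEvent(type, init)
) {
  // Nothing to do while the detected action stays the same.
  if (action === currentAction) return;

  // Release the key belonging to the previous action, if any.
  const previousKey = KEY_CODES[currentAction];
  if (previousKey !== undefined) {
    target.dispatchEvent(makeEvent('keyup', { keyCode: previousKey }));
  }

  // Press the key for the new action; 'idle' has no key.
  const nextKey = KEY_CODES[action];
  if (nextKey !== undefined) {
    target.dispatchEvent(makeEvent('keydown', { keyCode: nextKey }));
  }
  currentAction = action;
}

// Browser usage: toggleAction('jump'); toggleAction('idle');
```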
That’s it. Thanks for reading!