5 things I learned training an AI model on every NBA player
It sounded easy: just train a facial recognition model on every NBA player from the 2017 season. In practice it was really challenging. Here is what I learned; it might end up saving you a lot of time.
You need quality facial recognition. I of course recommend Facebox. It's state of the art and, in my tests, it performed better than any other commercially available facial recognition tool, including the big cloud vendors' cognitive services. Recognizing a face is one thing, but if you're going to spend a day gathering every photo of every NBA player, you don't want that time wasted training a model that won't perform well.
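To give a feel for the workflow, teaching Facebox one training photo is a single HTTP call. The sketch below assumes a Facebox container running locally on port 8080 and a `/facebox/teach` endpoint that accepts a JSON body with `url`, `name`, and `id` fields; treat the endpoint shape as an assumption and check the Machine Box docs for the exact API.

```python
import json
import urllib.request

# Assumed address of a locally running Facebox container.
FACEBOX = "http://localhost:8080"

def teach_request(image_url, name, ident, base=FACEBOX):
    """Build (but don't send) a teach request for one training photo.

    The endpoint path and JSON field names are assumptions based on
    Facebox's teach-by-URL style of API; verify against the docs.
    """
    body = json.dumps({"url": image_url, "name": name, "id": ident}).encode()
    return urllib.request.Request(
        base + "/facebox/teach",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending is then one line (requires the container to be running):
# urllib.request.urlopen(teach_request("https://example.com/player.jpg",
#                                      "LeBron James", "lebron-james-1"))
```

Looping that over a folder of named photos is the whole training step; the hard part, as the rest of this post argues, is what goes into that folder.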
Think hard about why you're doing this. Not as an existential quandary, but as a realistic assessment of your use cases. If you're training a model to detect NFL players while they're playing football, you might want to think twice about training on their faces, and think more about how their helmets obscure recognition. When we trained a model to detect US politicians, we used their official portraits as the training set. This worked because in real life, when politicians are on camera or in pictures, they often wear the same facial expressions they did when their professional photos were taken. With NBA players, by contrast, I noticed that frames from in-game footage often feature some hilarious facial expressions, so dissimilar to the players' official NBA portraits that a facial recognition technology might have a hard time identifying them.
You also run into issues with sweat, injuries, and faces pointed away from the camera. It's best to gather as much training data as possible to cover these scenarios. I used a lot of stills of each player actually playing basketball, so the model had a good representation of each player's face in different situations.
The quality of the training photos matters. This one caught me by surprise, and unfortunately doubled the time it took to gather the training set. On my first go-around, to save time and disk space, I pulled thumbnails for every player, each around 11KB. The model's accuracy was not what I wanted, and I traced the problem to the file size and fidelity of the training photos. I painstakingly went back and pulled higher-resolution photos of every player, making sure no file was smaller than 70KB. This greatly improved the model's accuracy once it was retrained on the new data set. Had I known this at the start, I could have saved countless hours of work.
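A quick pre-flight check over the training folder would have caught this up front. Here's a minimal sketch; the 70KB threshold comes from my experience above, and the function name and directory layout are just illustrative.

```python
from pathlib import Path

MIN_BYTES = 70 * 1024  # anything smaller was likely a thumbnail

def split_by_size(paths, min_bytes=MIN_BYTES):
    """Separate usable training photos from too-small thumbnails."""
    keep, redo = [], []
    for p in map(Path, paths):
        (keep if p.stat().st_size >= min_bytes else redo).append(p)
    return keep, redo

# Usage: list which players need higher-resolution photos pulled again.
# keep, redo = split_by_size(Path("players").glob("*.jpg"))
# for p in redo:
#     print(f"re-pull {p.stem}: only {p.stat().st_size} bytes")
```

Running this before training, rather than after a disappointing accuracy number, is the whole point.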
Make sure your photo has one face in it.
This probably sounds obvious, but you'd be surprised what a state-of-the-art facial recognition model can pick up. In basketball, some of the greatest shots of players are taken on the court, where other players and spectators naturally appear in frame. Even photos with a shallow depth of field that keeps only the player in focus can be a problem: out-of-focus faces may still register as faces, and the model then won't know which face you want it to learn.
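One way to screen for this is to run a face detector over each candidate photo first and keep only the photos with exactly one detection. The sketch below keeps the detector pluggable, since anything that returns a list of detections would work (a wrapper around Facebox's check call, or OpenCV's `CascadeClassifier.detectMultiScale`, say); the function name is mine, not from any particular library.

```python
def single_face_only(photos, detect_faces):
    """Keep only photos where the detector finds exactly one face.

    `detect_faces` is any callable that takes a photo path and returns
    a list of face detections. Photos with zero faces or more than one
    face are set aside rather than used for teaching.
    """
    kept, rejected = [], []
    for photo in photos:
        (kept if len(detect_faces(photo)) == 1 else rejected).append(photo)
    return kept, rejected
```

Filtering this way trades away some training photos, but every photo that survives is one the model can learn from unambiguously.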
Data gathering is the hard part. Tools like Machine Box make using machine learning really easy. That's because the truly hard part is gathering good training data. You have to think a bit like a data scientist, and in some ways become one (albeit an amateur one) when setting out to train a model. Think about your use cases, analyze the data you've chosen, and perhaps run some small tests before you spend a day pulling photos of every single NBA player off the internet.