
A textual portrait of alien spaceships

Prepare yourself for an unforeseeable future with text mining

Based on data collected by the National UFO Reporting Center over the past century, we can piece together bits of information (color, shape, movement) from more than 800,000 reported sightings of UFOs, if they exist.

What it may look like

Contrary to the usual portrayal of a silver saucer, the records point to diverse color-and-shape combinations:

color and shape
  • Red and orange are the most common colors. The objects often appear as a ball of light or emit light in these colors
  • Other frequent associations include: green fireball, silver disk, black triangle
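Associations like these can be approximated by counting which shape word follows a color word within a few tokens. The snippet below is only a minimal sketch of that idea, not the code behind the chart: the `COLORS` and `SHAPES` lexicons, the `color_shape_pairs` helper, the window size and the sample reports are all assumptions made up for illustration.

```python
from collections import Counter
import re

# Hypothetical mini-lexicons; a real analysis would need far larger lists.
COLORS = {"red", "orange", "green", "silver", "black"}
SHAPES = {"ball", "fireball", "disk", "triangle", "saucer"}


def color_shape_pairs(report, window=3):
    """Yield (color, shape) pairs where a shape word follows a color word
    within `window` tokens. Only the first shape after each color is kept."""
    tokens = re.findall(r"[a-z]+", report.lower())
    for i, tok in enumerate(tokens):
        if tok in COLORS:
            for nxt in tokens[i + 1 : i + 1 + window]:
                if nxt in SHAPES:
                    yield tok, nxt
                    break


# Made-up sentences for illustration, not actual NUFORC records.
reports = [
    "A green fireball streaked across the sky.",
    "Silent black triangle hovering over the field.",
    "We saw a silver disk, then another silver disk.",
]

counts = Counter(pair for r in reports for pair in color_shape_pairs(r))
print(counts.most_common(3))
```

A fixed window is crude: it misses descriptions where the color comes after the shape ("a triangle, completely black") and can pair unrelated words, so the counts should be read as rough tendencies.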

How it may move

By extracting the verbs, we can observe different speeds, directions and modes of movement (hover vs. shoot, appear vs. vanish, float vs. descend, glow vs. pulsate, etc.) and also view them in association with the colors.

how movement is described

Besides movement such as changing direction ("Triangle zigzagged. Another shined light on us. Others jetted to horizon"), people also reported UFOs changing color and shape ("Golden-orange bright star fell from the sky; changed into flat black floppy object, landed in tree then flew away"). This information is extracted via a combination of dependency parsing and searching for neighbouring words.
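The neighbouring-word half of that approach can be sketched in plain Python. This is a simplified illustration rather than the project's actual code: it replaces dependency parsing with a hand-written verb list (a parser would normally identify the verbs and their subjects), and the `MOVEMENT_VERBS` and `COLORS` lexicons, the `verbs_with_nearby_colors` helper and the window size are all assumptions.

```python
import re

# Hypothetical lexicons standing in for what a parser and a fuller
# color list would provide.
MOVEMENT_VERBS = {"hovered", "zigzagged", "jetted", "fell", "landed", "flew", "changed"}
COLORS = {"red", "orange", "green", "silver", "black", "golden"}


def verbs_with_nearby_colors(report, window=4):
    """For each known movement verb, collect color words found within
    `window` tokens on either side of it."""
    tokens = re.findall(r"[a-z]+", report.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok in MOVEMENT_VERBS:
            start = max(0, i - window)
            neighbours = tokens[start:i] + tokens[i + 1 : i + 1 + window]
            hits.append((tok, [w for w in neighbours if w in COLORS]))
    return hits


# The report quoted in the paragraph above.
report = (
    "Golden-orange bright star fell from the sky; changed into flat "
    "black floppy object, landed in tree then flew away"
)
for verb, colors in verbs_with_nearby_colors(report):
    print(verb, colors)
```

On the quoted report this links "fell" to golden and orange and "changed" to black, which matches the described change in color. The window search cannot tell which object a verb belongs to, which is where dependency parsing adds value.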

So far we can neither prove nor disprove whether the witnesses were under the influence of alcohol.


This is #day50 of my #100dayprojects on data science and visual storytelling. The full code is on my GitHub. Thanks for reading. Suggestions for new topics and feedback are always welcome.

