
Keeping accurate time

Some devices in the Internet of Things don't have a clock.

DOING DATA SCIENCE FROM SCRATCH TASK BY TASK

Image by the author showing the NatureBytes kit: Raspberry Pi Powered Wildlife Cameras

If you have been reading my column here on Towards Data Science, you will know that I am on a mission: I want to count the number of cars passing my house using Computer Vision and Motion Detection. This article provides a small update and raises something you need to consider when designing your own Data Science experiments and data loggers. Time is an essential dimension in Computer Vision. It is less sexy than Neural Compute Sticks, and it certainly transcends libraries such as CUDA, OpenVision, or OpenVINO. It goes right to the fundamentals of basic research.

My photo, above, shows a NatureBytes Raspberry Pi Powered Wildlife Camera deployed in the wild. In general terms, it is a motion detection device, and because it has a PIR sensor it can be deployed to count passing cars. However, the device does not have an internet connection, and hence no network time synchronization happens.

"Clock synchronization is a topic in computer science and engineering that aims to coordinate otherwise independent clocks. Even when initially set accurately, real clocks will differ after some amount of time due to clock drift, caused by clocks counting time at slightly different rates." – Wikipedia

Certainly, clock synchronization has been troublesome for me, and I have learned from experience that it is an important topic. I am also guilty of having taken it for granted at times. Let me explain why it matters, how I fixed my problem, and what I learned from my adventure.
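To put a number on the drift described in the Wikipedia quote, here is a minimal sketch of the arithmetic. The 20 parts-per-million figure is purely an assumption for illustration, not a measurement of the Raspberry Pi or of any RTC module, and `drift_seconds` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def drift_seconds(ppm: float, elapsed_seconds: float) -> float:
    """Accumulated error of a clock running `ppm` parts per million fast or slow."""
    return ppm * 1e-6 * elapsed_seconds


if __name__ == "__main__":
    # 20 ppm is an assumed, illustrative drift rate.
    for days in (1, 30, 365):
        error = drift_seconds(20.0, days * SECONDS_PER_DAY)
        print(f"{days:>3} days at 20 ppm: about {error:.1f} s of drift")
```

Even a small rate adds up over a long deployment, which is why an unsynchronized logger slowly disagrees with every other device around it.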

Linux Clock Synchronization

So how does Linux keep the correct date and time? Is keeping accurate time really that important? Frankly, yes, it is essential. Consider a Hadoop cluster with the Yarn manager. Yarn will tear down workers that it considers to have gone stale. Did you know that? I learned it the hard way. Keeping the time right across a cluster is essential; otherwise, any jobs you want to run will have no workers to process the data. Clock variances between the workers and the Yarn master have taught me many things. I am unsure whether Spark on its own suffers the same fate, but I could imagine so. Spark over Hadoop with Yarn certainly does!

Image by author - showing how Raspberry Pi manages Network Time Protocol.

Linux has a background service that runs on start-up and uses the Network Time Protocol (NTP) to get the current time and date and then adjust the system clock. You can switch it off and on with timedatectl:

sudo timedatectl set-ntp false  # disable network time synchronization
sudo timedatectl set-ntp true   # enable network time synchronization
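If you want a data logger to check for itself whether its clock has been synchronized, one option is to read the KEY=value lines printed by `timedatectl show`. The sketch below assumes a systemd-based Linux such as the Raspberry Pi's; the function names and the sample text are mine and purely illustrative.

```python
import subprocess


def parse_timedatectl_show(output: str) -> dict:
    """Turn the KEY=value lines printed by `timedatectl show` into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank or malformed lines
            fields[key.strip()] = value.strip()
    return fields


def clock_status(output: str) -> dict:
    """Summarize whether NTP is enabled and whether the clock is synchronized."""
    fields = parse_timedatectl_show(output)
    return {
        "ntp_enabled": fields.get("NTP") == "yes",
        "synchronized": fields.get("NTPSynchronized") == "yes",
    }


def read_clock_status() -> dict:
    """Run `timedatectl show` on the local machine (systemd-based Linux only)."""
    result = subprocess.run(
        ["timedatectl", "show"], capture_output=True, text=True, check=True
    )
    return clock_status(result.stdout)


if __name__ == "__main__":
    # Illustrative sample, not captured from a real device.
    sample = "Timezone=Europe/London\nNTP=yes\nNTPSynchronized=no\n"
    print(clock_status(sample))
```

A device with no network connection can have NTP enabled and still never report a synchronized clock, so a logger could use this check to flag its timestamps as untrusted.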

Now you know how Linux keeps time, and I have made the case for having your devices show the correct date and time. There is an excellent write-up here if you wish to read more about the Raspberry Pi and NTP.

Images by author. Left: my Hadoop Yarn cluster. Right: my PicoCluster native Spark cluster. Clock time is critical.

But how did this lack of timekeeping impact my work? Well, I use scripts to combine photos into videos, and the sequencing is decided by the file name, which contains the timestamp. I wrote about sequencing videos previously. As humans, we are very conditioned by time, and anything out of sequence is just confusing. Consider the list of images below. These were taken in December 2020, but the timestamps say 2018, and that really doesn’t play well with pictures from other cameras.
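Pictures that already carry the wrong date can sometimes be rescued. If you know the true capture time of just one frame, you can work out the clock offset and shift the rest before sequencing. This is only a sketch under stated assumptions: the file-name pattern `YYYY-MM-DD_HH-MM-SS.jpg`, the example dates, and the helper names are all hypothetical and are not the NatureBytes naming scheme.

```python
from datetime import datetime, timedelta

# Assumed file-name pattern for illustration only.
STAMP_FORMAT = "%Y-%m-%d_%H-%M-%S"


def parse_stamp(file_name: str) -> datetime:
    """Read the timestamp encoded in a file name such as 2018-03-13_10-00-00.jpg."""
    stem = file_name.rsplit(".", 1)[0]
    return datetime.strptime(stem, STAMP_FORMAT)


def clock_offset(recorded: datetime, actual: datetime) -> timedelta:
    """How far the camera clock was behind (positive) or ahead (negative)."""
    return actual - recorded


def corrected_names(file_names, offset: timedelta):
    """Return (corrected_time, original_name) pairs sorted by corrected time."""
    pairs = [(parse_stamp(name) + offset, name) for name in file_names]
    return sorted(pairs)


if __name__ == "__main__":
    names = ["2018-03-13_10-05-00.jpg", "2018-03-13_10-00-00.jpg"]
    # Assume we know the second picture was really taken at this moment.
    offset = clock_offset(parse_stamp(names[1]), datetime(2020, 12, 29, 17, 11, 12))
    for when, name in corrected_names(names, offset):
        print(name, "->", when.strftime(STAMP_FORMAT) + ".jpg")
```

A single offset only holds while the clock runs without interruption. A device whose clock restarts from a stale value after each power cycle would need one reference frame, and one offset, per session.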

Image by the author showing pictures from the NatureBytes device - notice the date and time.

It is also essential to understand that in regulated industries, there are standards of evidence. Presenting a photograph with the wrong timestamp would just be inadmissible and lead to governance concerns.

Adding a clock to the NatureBytes kit

In the article "Electronic Components for the Wildlife Cam Case", NatureBytes sets out a component manifest. The Real-Time Clock (RTC) is listed as optional.

Image by the author from NatureBytes article quoted

Naturally, I read anything optional as "nice to have" and therefore ignore it completely. Do you think the same way? I took delivery of my kit, did the assembly and testing, and everything was fine. Fine, that is, until I started to combine pictures from different cameras and my home security system and realized that I needed accurate timestamps.

Mini RTC Module for Raspberry Pi

Therefore, I had to order the RTC module from The PiHut.com. The PiHut used DHL couriers, and they somehow managed to navigate the COVID-19 restrictions and the Brexit chaos to get the package to me very promptly.

So I figured this would be a simple component insert with a battery recharge, and I would be back in business. Wrong! The RTC module plugs into the GPIO pins, but those pins are partially covered on the final assembled kit. I had to do a strip-down, and that was really annoying and pretty time-consuming. This holiday season, I received the gift of feeling that annoyance for myself, and of understanding why some developers get annoyed at the mere mention of a change. Some folk hate rework.

After a little bit of fiddling around, I managed to strip down the kit, insert the RTC module, put everything back together, do the testing, and then put everything back into the waterproof case. Job done!

Here is a GIF that shows a snapshot of the build and retrofit for the RTC module.

Image by the author showing the Kit build at various stages

Thankfully, the NatureBytes Raspbian OS image already contained the customizations required to get the clock to read from the RTC module. If you need those instructions, they are available on the PiHut website.

So I had an annoying teardown and rebuild, but I have now fixed my issue and can move forward. You might ask how this is relevant to Towards Data Science readers. Well, photographs contain features that are useful in Machine Learning. Those features might be the objects detected in the image, the file name, the directory, or the date and time. The illustration below shows some of the other features available; the device maker and the device model are just two. Other cameras also record GPS coordinates, and these can provide vital clues to help train a model and infer things about our world.
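To make the idea of photo properties as features concrete, here is a minimal sketch that turns a file path and a dictionary of raw EXIF values into a feature record. The tag numbers are the standard EXIF ones for Make (271), Model (272), and DateTimeOriginal (36867). Everything else is an assumption for illustration: the function name, the example path, and the maker and model strings are hypothetical, and the raw tag dictionary is assumed to come from whatever image library you use.

```python
import os
from datetime import datetime

# Standard EXIF tag numbers for a few of the properties mentioned above.
EXIF_TAGS = {271: "make", 272: "model", 36867: "taken_at"}
EXIF_TIME_FORMAT = "%Y:%m:%d %H:%M:%S"  # EXIF stores dates with colons


def photo_features(path: str, raw_exif: dict) -> dict:
    """Build a flat feature record from a file path and raw EXIF tag values."""
    features = {
        "file_name": os.path.basename(path),
        "directory": os.path.dirname(path),
    }
    for tag_id, name in EXIF_TAGS.items():
        if tag_id in raw_exif:
            features[name] = raw_exif[tag_id]
    if "taken_at" in features:
        taken = datetime.strptime(features["taken_at"], EXIF_TIME_FORMAT)
        features["taken_at"] = taken
        features["hour_of_day"] = taken.hour
        features["day_of_week"] = taken.weekday()  # Monday is 0
    return features


if __name__ == "__main__":
    # Hypothetical values, not read from a real photograph.
    exif = {271: "ExampleMaker", 272: "ExampleModel", 36867: "2020:12:29 17:11:12"}
    print(photo_features("/home/pi/Desktop/example.jpg", exif))
```

The derived `hour_of_day` and `day_of_week` columns are only as good as the camera clock, which is the whole point of this article.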

Image by the author: the properties of a single photograph.

What did I learn on this adventure?

Doing Data Science from scratch is involved, and even with the best planning, we often find that we failed to capture a data element or two critical to our analysis. In my case, taking photographs without the correct timestamp is hardly a world-ending event, but it did reduce how useful the exercise was to a human reader. Optional components are not always ‘nice to have’; sometimes they are plainly needed! You have to make this decision yourself for each project. I learned to be more careful about assuming that optional means ‘nice to have’, because that is not true in every case.

Image by author

What do you see in the image above?

YOLO thinks, with 58% confidence, that the cat is a bear!

[{'image': '/home/pi/Desktop/_2020-12-29_17-11-12.953773.jpg', 'result': ['bear: 0.5804_ -> loc: 1325 : 598 : 420 : 722'], 'timing': '[INFO] YOLO took 30.452949 seconds'}]
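A result record like the one above can be filtered by confidence before you trust a label. The sketch below parses the result string with a regular expression that I wrote to match this one example; it is not part of YOLO or of my detection script. The article does not say what the four `loc` numbers mean, so they are kept as plain integers, and the 0.6 threshold is an arbitrary illustrative choice.

```python
import re

# Pattern written against the single example string shown above (an assumption).
RESULT_PATTERN = re.compile(
    r"^\s*(?P<label>[^:]+?)\s*:\s*(?P<confidence>\d*\.?\d+)_?\s*->\s*loc:\s*"
    r"(?P<a>\d+)\s*:\s*(?P<b>\d+)\s*:\s*(?P<c>\d+)\s*:\s*(?P<d>\d+)\s*$"
)


def parse_detection(text: str):
    """Split one result string into (label, confidence, four loc integers)."""
    match = RESULT_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized detection string: {text!r}")
    loc = tuple(int(match.group(key)) for key in "abcd")
    return match.group("label"), float(match.group("confidence")), loc


def confident_labels(record: dict, threshold: float) -> list:
    """Keep only the (label, confidence) pairs at or above the threshold."""
    kept = []
    for text in record["result"]:
        label, confidence, _loc = parse_detection(text)
        if confidence >= threshold:
            kept.append((label, confidence))
    return kept


if __name__ == "__main__":
    record = {
        "image": "example.jpg",  # hypothetical file name
        "result": ["bear: 0.5804_ -> loc: 1325 : 598 : 420 : 722"],
    }
    print(confident_labels(record, 0.6))
```

With a threshold of 0.6, the 58% bear would be dropped rather than counted, which seems fair to the cat.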

Thoughts? I would love to hear from you.

