Timestamp Vs Timedelta Vs Time Period

Understanding Pandas Time Series data structures

Padhma Muniraj
Towards Data Science


Photo by Nathan Dumlao on Unsplash

DATA! Not just data science but every field in the software industry handles data. From system software to application software, handling and storing data efficiently is always a challenge. A common and effective strategy for meeting it is the careful use of data structures.

As the computer scientist Fred Brooks puts it,

The programmer’s primary weapon in the never-ending battle against slow system is to change the intramodular structure. Our first response should be to reorganize the modules’ data structures.

One can’t stress the importance of data structures enough: you can have perfect code, perfect logic, and zero errors, yet storing data in a clumsy manner can still be the downfall of the application.

Pandas was originally developed for analyzing financial time series data, and it contains a large set of tools for working with dates and times. In this article, I’ll explain the basic pandas time series data structures, the types of input they accept, and what they can do.

Some of the elementary data structures for working with time series data are:

  • Time Stamps
  • Time Deltas
  • Time Periods

1) Time Stamps:

Python provides date and time functionality in the datetime module, whose core types include:

  • date: day, month, year
  • time: hours, minutes, seconds, microseconds
  • datetime: components of both date and time

Python’s datetime holds a date and a time together; the pandas counterpart is the Timestamp object. It is interchangeable with Python’s datetime in most cases (it is a subclass of it) but is based on the more efficient numpy.datetime64 data type.

Time Stamp Illustration

A pandas Timestamp refers to a specific instant in time, with up to nanosecond precision (one thousand-millionth of a second).

It is the most basic type of time series data, associating values with specific instants in time. The Timestamp constructor is flexible: it accepts a variety of inputs such as strings, floats, and integers. Below are examples of the different types of input that it can accept.

Creating a timestamp object with a variety of inputs

In the example above, [7] and [8] each contain a single scalar value. A single value such as an integer or a float can also be passed to the Timestamp constructor, which returns the date and time that many units after the UNIX epoch (January 1, 1970). The unit parameter sets the unit (for example, unit='s' for seconds); without it, the number is read as nanoseconds. This also allows human-readable dates to be converted to and from UNIX epoch values for ease of computation. When a NaN value is passed, as in [9], it returns NaT (not a time), which is pandas’ null value for timestamp data.
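Here is a minimal sketch of the kinds of input described above. The dates and variable names are my own illustrative choices, not the ones from the original screenshot.

```python
from datetime import datetime

import pandas as pd

# From an ISO-like string
ts_from_string = pd.Timestamp("2021-03-15 14:30:00")

# From a standard-library datetime object
ts_from_datetime = pd.Timestamp(datetime(2021, 3, 15, 14, 30))

# From individual components passed as keywords
ts_from_parts = pd.Timestamp(year=2021, month=3, day=15, hour=14, minute=30)

# From a number: `unit` tells pandas how to read it (default is nanoseconds)
ts_from_seconds = pd.Timestamp(1615818600, unit="s")
ts_from_float = pd.Timestamp(1615818600.5, unit="s")

# A missing value becomes NaT, pandas' null for time data
ts_missing = pd.Timestamp(float("nan"))

print(ts_from_string, ts_from_seconds, ts_from_float, ts_missing)
```

The first four objects all describe the same instant, which is a handy way to check that an epoch value means what you think it means.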

The Timestamp constructor understands time zones. By default a Timestamp is time zone-naive, but it can be made time zone-aware by passing a time zone to the tz parameter when creating the object. A time zone-aware Timestamp internally stores a UTC value, which makes conversion between time zones simple.
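As a rough illustration, the sketch below builds a naive and an aware Timestamp and converts the aware one to two other zones. The zone names are arbitrary examples and assume the usual time zone database is available on your system.

```python
import pandas as pd

naive = pd.Timestamp("2021-03-15 14:30")
print(naive.tz)  # None: no time zone attached

# Pass `tz` to build a time zone-aware Timestamp
berlin = pd.Timestamp("2021-03-15 14:30", tz="Europe/Berlin")

# Converting changes the wall-clock reading, not the instant itself
utc = berlin.tz_convert("UTC")
tokyo = berlin.tz_convert("Asia/Tokyo")
print(berlin, utc, tokyo, sep="\n")
```

All three aware objects compare equal because they point at the same instant; only the displayed local time differs.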

Note: If your column/index/object is not time zone-aware, converting it to another time zone raises a TypeError.

Time Zone Conversion Error

You can overcome this by localizing the object first with tz_localize (which makes it time zone-aware) and then converting it with tz_convert.
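A small sketch of both the failure and the fix, assuming the naive value was recorded in UTC (that choice is mine, purely for illustration):

```python
import pandas as pd

naive = pd.Timestamp("2021-03-15 14:30")

# tz_convert needs a time zone to convert *from*, so a naive value fails
try:
    naive.tz_convert("Asia/Tokyo")
    failed = False
except TypeError as err:
    failed = True
    print("TypeError:", err)

# Localize first (declare which zone the wall-clock time is in), then convert
localized = naive.tz_localize("UTC")
converted = localized.tz_convert("Asia/Tokyo")
print(converted)
```

The same two-step pattern should apply to a DatetimeIndex, and to a datetime column through its .dt accessor.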

Apart from this, Timestamp has a large number of attributes and methods that are useful for data manipulation. The index structure for Timestamp is the DatetimeIndex, which has a datetime64 dtype, can carry frequency information, and provides time series specific methods for easy processing.
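To make the index structure concrete, here is a short sketch that builds a DatetimeIndex and uses it to index a Series. The `sales` numbers are made up for the example.

```python
import pandas as pd

# Five consecutive days starting on 1 March 2021
idx = pd.date_range(start="2021-03-01", periods=5, freq="D")
print(idx.dtype)        # a datetime64 dtype
print(idx.freqstr)      # the frequency stored on the index
print(idx.day_name())   # weekday name of each entry

# A Series indexed by timestamps supports label-based date slicing
sales = pd.Series([10, 12, 9, 15, 11], index=idx)
window = sales.loc["2021-03-02":"2021-03-04"]
print(window)

# Each element of the index is a Timestamp with its own attributes
first = idx[0]
print(first.year, first.month, first.day, first.dayofweek)
```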

2) Time Deltas:

A delta is a difference between two things or values. A Timedelta is a difference in time: it can represent an amount of time or the exact length of time between two instants, and it is based on numpy.timedelta64.

Similar to the Timestamp constructor, Timedelta accepts a variety of inputs, for example strings such as '1 days 2 hours', keyword arguments such as days=1, or a number together with a unit. An interesting thing about it is that it can take both positive and negative values.
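The sketch below shows a few ways of building the same Timedelta, plus a negative one. The values are arbitrary examples of mine.

```python
from datetime import timedelta

import pandas as pd

# From a descriptive string
td_string = pd.Timedelta("1 days 2 hours")

# From keyword arguments
td_keywords = pd.Timedelta(days=1, hours=2)

# From a number plus a unit
td_number = pd.Timedelta(26, unit="h")

# From a standard-library timedelta
td_stdlib = pd.Timedelta(timedelta(days=1, hours=2))

# Negative durations are allowed
td_negative = pd.Timedelta(hours=-3)

print(td_string, td_negative, td_negative.total_seconds())
```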

Timedeltas are part of both Python and pandas. They can be

  • added to or subtracted from each other
  • divided by each other to return a float value
  • added to a Timestamp
  • added to a datetime
  • added to a date

However, a timedelta cannot be added to a time.
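The rules in this list can be checked with a few lines. In this sketch I use pandas' Timedelta with a Timestamp and the standard-library timedelta with datetime, date, and time; the specific values are illustrative only.

```python
from datetime import date, datetime, time, timedelta

import pandas as pd

one_day = pd.Timedelta(days=1)
half_day = pd.Timedelta(hours=12)

# Timedeltas combine with each other
total = one_day + half_day   # a day and a half
ratio = one_day / half_day   # a plain number, not a Timedelta

# Adding to a pandas Timestamp gives a Timestamp
ts = pd.Timestamp("2021-03-15 14:30") + one_day

# The standard-library timedelta works with datetime and date
dt = datetime(2021, 3, 15, 14, 30) + timedelta(days=1)
d = date(2021, 3, 15) + timedelta(days=1)

# ...but a bare time of day has no date to roll over into
try:
    time(14, 30) + timedelta(hours=1)
    time_addition_failed = False
except TypeError:
    time_addition_failed = True

print(total, ratio, ts, dt, d, time_addition_failed)
```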

Similar to Timestamp, Timedelta also has a large number of attributes and methods for manipulating data. The associated index structure is the TimedeltaIndex, which has a timedelta64 dtype (the values are stored internally as 64-bit integers).
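A quick sketch of a TimedeltaIndex, built once from strings and once by subtracting two DatetimeIndex objects (sample values are my own):

```python
import pandas as pd

# Build a TimedeltaIndex from a list of strings
tdi = pd.to_timedelta(["1 days", "2 days 6 hours", "90 minutes"])
print(type(tdi).__name__, tdi.dtype)

# Component and conversion helpers work element-wise
print(tdi.days)              # whole days in each entry
print(tdi.total_seconds())   # each duration in seconds

# Subtracting two DatetimeIndex objects also yields a TimedeltaIndex
start = pd.to_datetime(["2021-03-01", "2021-03-05"])
end = pd.to_datetime(["2021-03-02", "2021-03-10"])
elapsed = end - start
print(elapsed)
```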

3) Time Periods:

A time period refers to a span of time between a start and an end timestamp; periods of a given frequency have a uniform length and do not overlap. Pandas represents one with the Period type, whose constructor takes a value such as a string or an integer and encodes a fixed-frequency interval based on numpy.datetime64.

In general, a value and a frequency are passed to the Period() constructor. The frequency, given through the freq parameter, must be one of a predefined set of strings. Consider the example below:

Time Period Example

In [1], freq='A-OCT' means that the frequency is annual, anchored to the end of October, whereas in [3], freq='M' means that the frequency is monthly. You can also specify quarterly or weekly frequencies, among others. To learn more, see the pandas documentation on frequency aliases.
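Here is a minimal sketch of Period objects with a monthly and a quarterly frequency. I have left out the annual alias because, as far as I know, its spelling differs between pandas releases; the dates are illustrative.

```python
import pandas as pd

# A monthly period covering all of March 2021
month = pd.Period("2021-03", freq="M")
print(month.start_time)   # first instant of the month
print(month.end_time)     # last instant of the month

# A quarterly period; the quarter is inferred from the date
quarter = pd.Period("2021-03-15", freq="Q")
print(quarter)

# Integer arithmetic moves by whole periods of the same frequency
next_month = month + 1
print(next_month)
```

Unlike a Timestamp, a Period stands for the whole span, which is why it exposes both a start_time and an end_time.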

Time Period Illustration

A sequence of Period objects can be generated with the period_range() function, which takes start, end, and frequency parameters.

Generate a sequence of periods
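For instance, a sketch along these lines generates monthly and quarterly sequences (the date ranges are arbitrary):

```python
import pandas as pd

# Monthly periods from January through June 2021, both ends included
months = pd.period_range(start="2021-01", end="2021-06", freq="M")
print(months)

# Alternatively, give a start and a count instead of an end
quarters = pd.period_range(start="2021-01", periods=4, freq="Q")
print(quarters)
```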

Converting between frequencies is a common task. This can be done with the asfreq method. Let’s consider an example that converts a monthly frequency to a weekly frequency:

Converting between frequencies
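A rough sketch of that conversion, using March 2021 as an arbitrary example. Because a month contains several weeks, the how argument decides whether the result is the week containing the month's first day or its last day.

```python
import pandas as pd

march = pd.Period("2021-03", freq="M")

# A month spans several weeks, so `how` picks which end to align on
first_week = march.asfreq("W", how="start")
last_week = march.asfreq("W", how="end")
print(first_week, last_week, sep="\n")

# Converting back to monthly aligns on the end of the week by default
back_to_month = first_week.asfreq("M")
print(back_to_month)
```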

Apart from this, Periods can be converted to Timestamps and vice versa. The index structure associated with Period objects is the PeriodIndex.
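The conversions in both directions can be sketched as follows (sample dates are my own):

```python
import pandas as pd

# Timestamp -> Period: which month does this instant fall in?
ts = pd.Timestamp("2021-03-15 14:30")
as_period = ts.to_period("M")

# Period -> Timestamp: defaults to the start of the period
as_timestamp = as_period.to_timestamp()

# The same conversion applies to a whole index
dates = pd.date_range("2021-01-31", periods=3, freq="D")
period_index = dates.to_period("M")
print(as_period, as_timestamp, period_index, sep="\n")
```

Note that the round trip is lossy: the time of day is gone once an instant has been collapsed into its month.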

Thanks for reading all the way down here. Let me know in the comment section if you have any concerns, feedback, or criticism. Have a good day!
