
A data scientist works with data and uses modeling techniques to detect anomalies, make financial predictions, classify medical images, and so on. This data, though, does not arrive magically at the data scientist’s computer; it comes through pipelines that interface with a variety of data sources, such as real-time financial systems, real-time sensor feeds, and medical imaging equipment. These pipelines are created by software engineers, very often using object-oriented programming (OOP) concepts. The data scientist is the one who knows best what flows through the pipelines and will therefore sit in code reviews, interface with software engineers, and go over system architectures. To be a functional member of any team, one must speak the language of the team, which in this case is OOP.
OOP knowledge can also help with the data science work itself. For example, one of the most time-consuming tasks in a data scientist’s job is data cleaning, i.e., removing data that is not useful for the task at hand. Sitting in a code review and going over the system architecture could reveal a data flow pathway where the data comes out cleaner, saving the data scientist time and effort. There has been some debate as to whether OOP should be an integral part of a data scientist’s skill set. It has been argued elsewhere that OOP reduces coding overhead and makes systems with data science tasks more robust [2].
My opinion is that a data scientist should, at a minimum, grasp the main concepts of OOP: encapsulation, inheritance, polymorphism, and object association. Python OOP is a skill that I would personally like to test at interview time. So, how do we do that? Just as a picture is worth a thousand words, a program is worth a hundred questions. If one does a web search for OOP interview questions, the class design of a hotel’s reservation management system comes up often. So here is my test: design the classes for a small hotel’s reservation management system (RMS), using encapsulation throughout and, at least once each, inheritance, polymorphism, and object association. The rest of this post presents a solution to our little exercise.
SOLUTION: Design of the Reservation Management System for a Boutique Resort in Mykonos, Greece

To make this a fun learning experience, let us design the classes for the RMS of a boutique resort that rents villas in Mykonos, Greece. The resort rents four standard villas and two VIP villas. VIP villas are larger and come with a personal yacht. All villas come with a personal assistant.
We will model the system with the following classes: villa, vipVilla, guest, reservation, and resort.
Data Encapsulation
Classes must protect the most critical asset, our data. On the other hand, they must promote transparency and synergy by exposing certain data to the outside world using get/set access functions. This OOP concept will permeate the design of all of the classes in our example, and each one of them will be designed such that it contains the data and only the data that is relevant to the class characteristics in the real world.
One of the least privileged classes is villa, in the sense that it encapsulates the least data (the name of the villa and the name of the villa’s personal assistant). It also offers informative functions about the hours that the personal assistant will be on call and the dates on which the villa will be cleaned and its keys changed. Finally, it has a function to print the label of the gift that is left in the room for each new guest.
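As a rough illustration of the description above, here is a minimal sketch of what class villa might look like. Only the class name and setPersonalAssistant() come from this post; the other method names, the on-call hours, and the label wording are assumptions made for the example, and the cleaning and key-change schedule is left out for brevity. The class name is kept in lowercase to match the post, although PEP 8 would suggest Villa.

```python
class villa:
    """Sketch of the base villa class (details are illustrative assumptions)."""

    def __init__(self, name):
        # A leading underscore marks these attributes as internal by convention.
        self._name = name
        self._personalAssistant = None

    # Get/set access functions expose the data in a controlled way.
    def getName(self):
        return self._name

    def setPersonalAssistant(self, assistantName):
        self._personalAssistant = assistantName

    def getPersonalAssistant(self):
        return self._personalAssistant

    def assistantHours(self):
        # Hypothetical on-call window; the real hours are not given in the post.
        return f"{self._personalAssistant} is on call from 08:00 to 20:00"

    def giftLabel(self, guestLastName):
        # Hypothetical wording for the welcome-gift label.
        return f"Welcome to villa {self._name}, {guestLastName} family!"


# Example usage with made-up names.
delos = villa("Delos")
delos.setPersonalAssistant("Eleni")
print(delos.assistantHours())
print(delos.giftLabel("Smith"))
```

Callers never touch _name or _personalAssistant directly; they go through the access functions, which is the encapsulation idea in its simplest form.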
Class guest encapsulates the following attributes of a guest: first and last name, number of adults, and number of children in the room. It offers an access function to the guest’s last name and a printing function for the guest object.
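A minimal sketch of class guest along these lines follows. The attribute names, the getLastName() accessor name, and the choice of __str__ as the printing function are assumptions; the post only states which data and behavior the class offers.

```python
class guest:
    """Sketch of the guest class (names and output format are assumptions)."""

    def __init__(self, firstName, lastName, numAdults, numChildren):
        self._firstName = firstName
        self._lastName = lastName
        self._numAdults = numAdults
        self._numChildren = numChildren

    def getLastName(self):
        # The only attribute exposed through an access function.
        return self._lastName

    def __str__(self):
        # Called by print(); serves as the printing function in this sketch.
        return (f"Guest: {self._firstName} {self._lastName}, "
                f"adults: {self._numAdults}, children: {self._numChildren}")


# Example usage with a made-up guest.
g = guest("Ada", "Lovelace", 2, 1)
print(g)
print(g.getLastName())
```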
Inheritance and Polymorphism
Class vipVilla offers examples of both class inheritance and polymorphism. The class inherits from class villa and redefines the base class method setPersonalAssistant(). Python resolves which method is invoked using the method resolution order (MRO): it looks for the method in vipVilla first and then in villa.
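The sketch below shows one way this inheritance and override could look. The trimmed-down villa base class is repeated so that the example runs on its own. The yacht attribute, the describe() method, the "(VIP concierge)" tag, and all names are assumptions for illustration, not the original implementation.

```python
class villa:
    """Trimmed-down base class, repeated here so the example is self-contained."""

    def __init__(self, name):
        self._name = name
        self._personalAssistant = None

    def setPersonalAssistant(self, assistantName):
        self._personalAssistant = assistantName

    def describe(self):
        return f"{self._name}: assistant {self._personalAssistant}"


class vipVilla(villa):
    """Inherits from villa and redefines setPersonalAssistant()."""

    def __init__(self, name, yachtName):
        super().__init__(name)       # reuse the base class initialization
        self._yachtName = yachtName  # VIP villas come with a personal yacht

    def setPersonalAssistant(self, assistantName):
        # Hypothetical VIP behavior: tag the assistant, then delegate to villa.
        super().setPersonalAssistant(f"{assistantName} (VIP concierge)")

    def describe(self):
        return f"{super().describe()}, yacht {self._yachtName}"


# Polymorphism: the same call works on both types, and Python picks the
# implementation from the class of each object.
villas = [villa("Delos"), vipVilla("Olympus", "Poseidon")]
for v in villas:
    v.setPersonalAssistant("Eleni")

descriptions = [v.describe() for v in villas]
print(descriptions)

# The method resolution order lists where Python looks, in order.
print([cls.__name__ for cls in vipVilla.__mro__])
```

Because vipVilla appears before villa in the MRO, the overriding method is found first; super() then continues the lookup in villa.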
Object Association
Class resort offers examples of this. The methods setGuest() and setReservation() accept objects of type guest and reservation respectively. In terms of encapsulation, class resort encapsulates the following attributes: a list with the names of the (standard) villas, a list with the names of the VIP villas, a guest list, a reservation list, and a reservation ID list.
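To make the association concrete, here is a hedged sketch of class resort. The guest and reservation classes are reduced to minimal stand-ins so that the example is self-contained. Apart from setGuest() and setReservation(), every name below, the villa names, and the check against known villa names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class guest:
    """Minimal stand-in for the guest class."""
    lastName: str


@dataclass
class reservation:
    """Minimal stand-in for the reservation class."""
    villaName: str
    reservationId: int


class resort:
    """Sketch of the resort class, which holds guest and reservation objects."""

    def __init__(self, villaNames, vipVillaNames):
        self._villaNames = list(villaNames)
        self._vipVillaNames = list(vipVillaNames)
        self._guests = []
        self._reservations = []
        self._reservationIds = []

    def setGuest(self, newGuest):
        # Association: the resort keeps a reference to a guest object.
        self._guests.append(newGuest)

    def setReservation(self, newReservation):
        # Hypothetical sanity check: the villa must belong to the resort.
        if newReservation.villaName not in self._villaNames + self._vipVillaNames:
            raise ValueError(f"Unknown villa: {newReservation.villaName}")
        self._reservations.append(newReservation)
        self._reservationIds.append(newReservation.reservationId)

    def getReservationIds(self):
        # Return a copy so callers cannot modify the internal list.
        return list(self._reservationIds)

    def getGuestLastNames(self):
        return [g.lastName for g in self._guests]


# Example usage: four standard villas and two VIP villas with made-up names.
myResort = resort(["Delos", "Naxos", "Paros", "Tinos"], ["Olympus", "Athena"])
myResort.setGuest(guest("Lovelace"))
myResort.setReservation(reservation("Olympus", 101))
print(myResort.getGuestLastNames(), myResort.getReservationIds())
```

The resort does not create or own the logic of its guests and reservations; it only stores references to objects built elsewhere, which is what distinguishes association from inheritance.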
Our final class is reservation, which encapsulates the following attributes of a reservation: the name of the reserved villa, the check-in date, the check-out date, and the reservation ID. It also offers a printing function for reservation objects.
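A minimal sketch of class reservation might look as follows. The use of datetime.date for the dates, the date validation, the accessor names, and the printed format are assumptions; the post only lists the attributes and mentions a printing function.

```python
from datetime import date


class reservation:
    """Sketch of the reservation class (names and format are assumptions)."""

    def __init__(self, villaName, checkIn, checkOut, reservationId):
        # Hypothetical validation: a stay must last at least one night.
        if checkOut <= checkIn:
            raise ValueError("check-out must be after check-in")
        self._villaName = villaName
        self._checkIn = checkIn
        self._checkOut = checkOut
        self._reservationId = reservationId

    def getReservationId(self):
        return self._reservationId

    def getVillaName(self):
        return self._villaName

    def __str__(self):
        # Called by print(); serves as the printing function in this sketch.
        return (f"Reservation {self._reservationId}: villa {self._villaName}, "
                f"{self._checkIn.isoformat()} to {self._checkOut.isoformat()}")


# Example usage with made-up values.
r = reservation("Delos", date(2024, 7, 1), date(2024, 7, 8), 101)
print(r)
```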
The entire code, along with a test module, is on GitHub: https://github.com/theomitsa/python/blob/master/mykonos2.ipynb