
A Data Scientist Should Know At Least This Much Python OOP

© s72677466/Adobe Stock

A data scientist works with data and uses modeling techniques to detect anomalies, make financial predictions, classify medical images, and so on. That data does not arrive magically at the data scientist’s computer; it comes through pipelines that interface with a variety of data sources, such as real-time financial systems, real-time sensor feeds, and medical imaging equipment. These pipelines are built by software engineers, very often using object-oriented programming (OOP) concepts. The data scientist is the one who knows best what flows through the pipelines and will therefore sit in code reviews, work with software engineers, and go over system architectures. To be a functional member of any team, one must speak the language of the team, in this case OOP.

OOP knowledge can also help with the data science work itself. For example, one of the most time-consuming tasks in a data scientist’s job is data cleaning, i.e., removing data that is not useful for the task at hand. Sitting in a code review and going over the system architecture could reveal a data flow pathway where the data comes out cleaner, which would save our data scientist time and effort. There has been some debate as to whether OOP should be an integral part of a data scientist’s skill set. Elsewhere, it is argued that OOP reduces coding overhead and provides robustness in systems with data science tasks [2].

My opinion is that a data scientist should, at a minimum, grasp the main concepts of OOP: encapsulation, inheritance, polymorphism, and object association. Python OOP is a skill that I would personally like to test at interview time. So, how do we do that? Just as a picture is worth a thousand words, a program is worth a hundred questions. If one does a web search for OOP interview questions, the class design of a hotel’s reservation management system comes up often. So here is my test: design the classes for a small hotel’s reservation management system (RMS), using encapsulation throughout and applying inheritance, polymorphism, and object association at least once each. The rest of this post presents a solution to our little exercise.

SOLUTION: Design of the Reservation Management System for a Boutique Resort in Mykonos, Greece

© 2mmedia/Adobe Stock

To make this a fun learning experience, let us design the classes for the RMS of a boutique resort that rents villas in Mykonos, Greece. The resort rents four standard villas and two VIP villas. VIP villas are larger and come with a personal yacht. All villas come with a personal assistant.

We will model the system with the following classes: villa, vipVilla, guest, reservation, resort.

Data Encapsulation

Classes must protect the most critical asset, our data. On the other hand, they must promote transparency and synergy by exposing certain data to the outside world using get/set access functions. This OOP concept will permeate the design of all of the classes in our example, and each one of them will be designed such that it contains the data and only the data that is relevant to the class characteristics in the real world.
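
As a minimal sketch of the get/set idea, consider a hypothetical class `VillaRecord` that is not part of the resort design. The attribute is kept behind a leading underscore (Python's convention for "internal use") and exposed through a property, which is the idiomatic Python counterpart of explicit get/set functions. The validation rule is an assumption made purely for illustration.

```python
class VillaRecord:
    """Hypothetical class, used only to illustrate encapsulation."""

    def __init__(self, name):
        self._name = name  # leading underscore: internal by convention

    @property
    def name(self):
        # Read access ("get")
        return self._name

    @name.setter
    def name(self, value):
        # Write access ("set"), guarded by an illustrative validation rule
        if not value:
            raise ValueError("villa name must not be empty")
        self._name = value


record = VillaRecord("Elektra")
record.name = "Artemis"   # goes through the setter
print(record.name)        # goes through the getter
```

Callers read and write `record.name` like a plain attribute, while the class keeps control over what values are accepted.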

One of the least privileged classes is villa, in the sense that it encapsulates the least data (the name of the villa and the name of the villa’s personal assistant). It also offers informative functions about the hours that the personal assistant will be on call and the dates on which the villa will be cleaned and its keys changed. Finally, it has a function that prints the label of the gift left in the room for each new guest:

class villa(object):
    def __init__(self, n, id):
        # n is the villa name; id is accepted but not stored
        self.villaName = n

    def setPersonalAssistant(self, pa):
        # Record the assistant and announce the on-call hours
        self.personalAssistant = pa
        print(f"{pa} will be on call from 8.00am to 8.00pm for villa {self.villaName}")

    def cleanAndChangeKey(self, d1, d2):
        # d1 and d2 are the two service dates
        print(f"Villa {self.villaName} will be cleaned and keys will be changed on {d1} and {d2}")

    def printGiftLabel(self, s):
        # s is the last name of the guest party
        print(f"Welcome at the {self.villaName}, {s} party!")

Class guest encapsulates the following attributes of a guest: first and last name, number of adults, and number of children in the room. It offers an access function to the guest’s last name and a printing function for the guest object.

class guest(object):
    def __init__(self, first, last, adults, children):
        # Positional order is unchanged: first name, last name, adults, children
        self.first = first
        self.last = last
        self.noofAdults = adults
        self.noofChildren = children

    def getLastName(self):
        return self.last

    def __repr__(self):
        # Used by print() and when a guest appears inside a list
        return f"Guest: ({self.first}, {self.last})"

Inheritance and Polymorphism

Class vipVilla offers examples of both class inheritance and polymorphism. As we can see below, the class inherits from class villa and redefines (overrides) the base class method setPersonalAssistant(). Python decides which method to invoke at run time by following the method resolution order (MRO) of the object’s class.

class vipVilla(villa):
    def __init__(self, nn, id):
        # Reuse the base-class initializer
        super().__init__(nn, id)

    def setPersonalAssistant(self, pa):
        # Overrides villa.setPersonalAssistant(): longer hours plus a yacht
        self.vipPersAssist = pa
        print(f"{pa} will be on call (7.00am-9.00pm) for villa {self.villaName} and arrange for a personal yacht")
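
To see the polymorphic call in action, here is a small sketch. The two classes are trimmed-down stand-ins (named `Villa` and `VipVilla` here, and returning strings instead of printing) so that the block runs on its own; the loop and the assistant's name are illustrative assumptions.

```python
class Villa:
    """Trimmed stand-in for the villa class."""

    def __init__(self, name):
        self.villaName = name

    def setPersonalAssistant(self, pa):
        self.personalAssistant = pa
        return f"{pa} on call 8.00am-8.00pm for villa {self.villaName}"


class VipVilla(Villa):
    """Trimmed stand-in for vipVilla: overrides the base-class method."""

    def setPersonalAssistant(self, pa):
        self.personalAssistant = pa
        return f"{pa} on call 7.00am-9.00pm for villa {self.villaName}, yacht included"


# The same call is made on every object; each object's class decides
# which implementation actually runs.
villas = [Villa("Elektra"), VipVilla("Zeus")]
messages = [v.setPersonalAssistant("Maria") for v in villas]
for m in messages:
    print(m)

# The method resolution order lists where Python looks, in order.
mro_names = [cls.__name__ for cls in VipVilla.__mro__]
print(mro_names)  # ['VipVilla', 'Villa', 'object']
```

The calling code never checks the type of each villa, which is the practical payoff of polymorphism.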

Object Association

Class resort offers an example of object association. Its methods setGuest() and setReservation() accept objects of type guest and reservation, respectively. In terms of encapsulation, class resort holds the following attributes: a list with the names of the (standard) villas, a list with the names of the VIP villas, a guest list, a reservation list, and a reservation ID list.

class resort(object):
    # Villa names are fixed, so they are kept as shared class-level data
    vil = ['Elektra', 'Persephone', 'Artemis', 'Kouros']
    vipVil = ['Zeus', 'Alexandrian']

    def __init__(self):
        # Mutable lists are created per instance, so two resort objects
        # never share guests or reservations by accident
        self.guestList = []
        self.reservationList = []
        self.resIDList = [0]
        print("Welcome to Myconos Hidden Cove!")

    def setGuest(self, g):
        self.guestList.append(g)

    def setReservation(self, r):
        self.reservationList.append(r)

    def getresID(self):
        # Most recently issued reservation ID
        return self.resIDList[-1]

    def updateResIDList(self):
        # Issue the next reservation ID and return it
        i = self.getresID() + 1
        self.resIDList.append(i)
        return self.resIDList[-1]

    def printLists(self):
        print(f" The guest list is: {self.guestList}")
        print(f" The reservation list is: {self.reservationList}")
        print(f" The resID list is: {self.resIDList}")
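
Association means that the resort keeps references to guest and reservation objects that exist independently of it. Here is a minimal sketch using simplified stand-in classes (`Guest` and `Resort` below are assumptions, reduced to what is needed to show the idea, and the guest's name is made up):

```python
class Guest:
    """Simplified stand-in for the guest class."""

    def __init__(self, first, last):
        self.first = first
        self.last = last

    def __repr__(self):
        return f"Guest: ({self.first}, {self.last})"


class Resort:
    """Simplified stand-in for the resort class."""

    def __init__(self):
        self.guestList = []  # guests associated with this resort

    def setGuest(self, g):
        self.guestList.append(g)  # stores a reference, not a copy


resort = Resort()
guest = Guest("Anna", "Smith")
resort.setGuest(guest)

# The resort holds the very same object, so a change made through one
# reference is visible through the other.
guest.last = "Jones"
print(resort.guestList)
print(resort.guestList[0] is guest)  # True
```

The guest object is created outside the resort and could outlive it, which is what distinguishes association from ownership of the data.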

Our final class is reservation, which encapsulates the following attributes of a reservation: the name of the reserved villa, the check-in date, the length of stay, the check-out date, and the reservation ID. It also offers access functions and a printing function for the reservation object.

import datetime

class reservation(object):
    def __init__(self, n, de, le):
        # n: villa name, de: check-in date, le: length of stay in days
        self.checkinDate = de
        self.lengthofStay = le
        self.villaName = n
        self.checkoutDate = de + datetime.timedelta(days=le)
        self.reservID = None  # assigned later via setreservID()

    def getvillaName(self):
        return self.villaName

    def getcheckinDate(self):
        return self.checkinDate

    def getcheckoutDate(self):
        return self.checkoutDate

    def setreservID(self, id):
        self.reservID = id

    def __repr__(self):
        return 'Reservation: (%s, %s, %s, %s)' % (self.checkinDate, self.lengthofStay, self.villaName, self.reservID)
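
The check-out date is obtained by date arithmetic: adding a `datetime.timedelta` to a `datetime.date` yields a new date. Here is a quick sketch with made-up values, assuming the check-in date is passed as a `datetime.date`:

```python
import datetime

checkin = datetime.date(2024, 7, 29)  # illustrative check-in date
length_of_stay = 5                    # illustrative number of nights

# Adding a timedelta handles month boundaries automatically.
checkout = checkin + datetime.timedelta(days=length_of_stay)
print(checkout)  # 2024-08-03
```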

The entire code, along with a test module, is on GitHub: https://github.com/theomitsa/python/blob/master/mykonos2.ipynb

