
Recently, I completed a small project that required me to make suggestions for optimizing an elevator configuration within a theoretical high-rise in New York City. The building is set up as follows. Entrants to the building must first swipe badges through a security system, then they can push the elevator call button and wait for an elevator to arrive. I was asked to answer 5 main questions, and to only spend a couple of hours completing the task. The main questions were:
Provide an overview of the current state of elevator wait times.
Figure out the overall average wait time.
Determine the times during the day where the wait times are longer.
Determine the floors where the elevator wait times are longer.
Provide a minimum of 2 recommendations on how to optimize the elevators’ configuration to reduce wait times.
To answer these questions, 2 datasets were provided: people.csv, which documented all of the swipes into the building's security system, and simulation.csv, which was a log of each elevator's status and operations over the few months for which we had data. There were a total of 4 elevators in the 20-story building.


Initial Insights.
The first thing I wanted to do was wrap my head around the data so I could begin thinking about how to answer the provided questions. It was during this stage that I determined there were only 4 elevators and that the building had 20 floors. I also learned that in order to properly analyze the data, I would need to convert the datetime columns in each dataset from strings to pandas datetimes.
simulation['time'] = pd.to_datetime(simulation['time'])
people['time'] = pd.to_datetime(people['time'])
There were some additional columns present in the dataset but not listed in the data description. These included date and date_daily, which both seemed to be derived from the time column, so I decided to drop them.
simulation.drop(['date','date_daily', 'weight'], axis=1, inplace=True)
In order to see who was swiping into the building, I used a seaborn countplot to visualize the total count of guests/not guests who entered the building.


Deeper Analysis.
I wanted to figure out how long it took for people in the lobby to call an elevator after swiping in. In order to do this, I needed to make one assumption about the path a person takes to the elevator: after swiping into the building security system, a person's next action is to press the call button.
To figure this out, I used pd.merge_asof, which is similar to a left join but matches on the closest key instead of an exact key. In our case we want to join each elevator call on floor 0 (the lobby) with the closest recorded swipe into the building. We want a backward search because the swipe would have occurred earlier than the elevator call for everyone entering the building. Then we can subtract the swipe time from the elevator call time to determine how long a person took to hit the call button after swiping in.
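The backward join described above can be sketched roughly as follows. This is a minimal illustration on made-up rows, not the notebook's actual code: the column names action and floor, the renamed call_time and swipe_time columns, and the sample timestamps are all assumptions.

```python
import pandas as pd

# Toy stand-ins for people.csv and simulation.csv (all values are made up).
people = pd.DataFrame({
    "time": pd.to_datetime(["2020-01-06 08:00:00", "2020-01-06 08:05:00"]),
})
simulation = pd.DataFrame({
    "time": pd.to_datetime([
        "2020-01-06 08:00:25", "2020-01-06 08:02:00", "2020-01-06 08:05:40",
    ]),
    "action": ["call", "call", "call"],  # assumed column name and value
    "floor": [0, 7, 0],                  # assumed column name; 0 is the lobby
})

# Keep only lobby calls; merge_asof requires both sides sorted on the key.
lobby_calls = (
    simulation[(simulation["action"] == "call") & (simulation["floor"] == 0)]
    .rename(columns={"time": "call_time"})
    .sort_values("call_time")
)
swipes = people.rename(columns={"time": "swipe_time"}).sort_values("swipe_time")

# direction="backward" picks the latest swipe at or before each call.
matched = pd.merge_asof(
    lobby_calls, swipes,
    left_on="call_time", right_on="swipe_time",
    direction="backward",
)

# Call time minus swipe time gives a positive delay in seconds.
matched["swipe_to_call"] = (
    matched["call_time"] - matched["swipe_time"]
).dt.total_seconds()
print(matched[["swipe_time", "call_time", "swipe_to_call"]])
```

Keeping both timestamps under distinct names makes it easy to spot-check a few matched pairs before trusting the averages.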
Now we know that, on average, it takes people 28 seconds to hit the elevator call button after swiping through security. The longest it took someone to press the call button was 59 seconds. After determining this, I decided to focus my effort on the simulation.csv file, which had the logs for each elevator and can be used to answer the main questions.
To determine how long each trip took, I searched the data for each system (call) action, then found the next time an elevator door opened on the floor where the call button was pressed. From that I could find out which elevator arrived, how long it took, and which floor the elevator was arriving from. This would be enough information to answer the main questions.
By subtracting the elevator call floor from the floor the arriving elevator came from, we can see how many floors the elevator traveled to reach the person who called it. This may provide some insight into longer-than-average wait times.
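One way to sketch this matching is a forward merge_asof grouped by floor. The example below is only an illustration under stated assumptions: the log schema (action, floor, elevator), the door_open action value, and the idea that an elevator's origin is the floor where it last opened its doors are all hypothetical and may differ from the real simulation.csv.

```python
import pandas as pd

# Toy elevator log (made-up rows; the column names are assumptions).
log = pd.DataFrame(
    [
        ("2020-01-06 08:00:00", "door_open", 0, "A"),
        ("2020-01-06 08:00:10", "call", 5, None),
        ("2020-01-06 08:00:40", "door_open", 5, "A"),
        ("2020-01-06 08:01:00", "call", 0, None),
        ("2020-01-06 08:01:20", "door_open", 0, "A"),
    ],
    columns=["time", "action", "floor", "elevator"],
)
log["time"] = pd.to_datetime(log["time"])

calls = (
    log.loc[log["action"] == "call", ["time", "floor"]]
    .rename(columns={"time": "call_time"})
    .sort_values("call_time")
)
opens = (
    log.loc[log["action"] == "door_open", ["time", "floor", "elevator"]]
    .rename(columns={"time": "open_time"})
    .sort_values("open_time")
)

# Assumption: an elevator's origin is the floor where it last opened its doors.
opens["origin_floor"] = opens.groupby("elevator")["floor"].shift()

# For each call, find the next door opening on the same floor.
trips = pd.merge_asof(
    calls, opens,
    left_on="call_time", right_on="open_time",
    by="floor", direction="forward",
)
trips["wait_time"] = (trips["open_time"] - trips["call_time"]).dt.total_seconds()
trips["total_floors_traveled"] = (trips["origin_floor"] - trips["floor"]).abs()
print(trips)
```

A call that never gets a matching door opening ends up with missing values in open_time, so those rows are easy to filter out or inspect separately.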
Now we should have enough data to analyze and answer the main questions.
Times during the day where elevator wait times are longer?

As we can see above, wait times spike at the beginning of the workday (shortly after 7:30 AM) and at the end (from 4:00 PM to around 7:00 or 8:00 PM). There are also two smaller spikes at lunchtime, between 11:00 AM and 1:00 PM.
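A time-of-day breakdown like this can be approximated by grouping wait times by the hour of the call. The snippet below is a small sketch on invented values; the trips frame and its call_time and wait_time columns are assumed names rather than the notebook's.

```python
import pandas as pd

# Toy per-call wait times in seconds (made-up values).
trips = pd.DataFrame({
    "call_time": pd.to_datetime([
        "2020-01-06 07:45:00", "2020-01-06 07:50:00",
        "2020-01-06 10:15:00", "2020-01-06 17:05:00",
    ]),
    "wait_time": [50.0, 40.0, 10.0, 60.0],
})

# Average wait per hour of day; the index is the hour (0-23).
hourly_wait = (
    trips.groupby(trips["call_time"].dt.hour)["wait_time"]
    .mean()
    .rename_axis("hour")
)
busiest_hour = hourly_wait.idxmax()
print(hourly_wait)
print("Hour with the longest average wait:", busiest_hour)
```

Grouping by a finer bucket, such as 15-minute intervals, would show the shape of each spike more clearly at the cost of noisier averages.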
On what floors are wait times longer?

The graph above plots the 1st, 2nd, and 3rd quartiles of wait times for each floor. The red line shows how long those who waited the longest for an elevator were kept waiting. We can see from this graph that during peak times, the higher you are in the building, the longer you wait for an elevator. Why? Further analysis shows that those who waited the longest were waiting for the elevator to travel more than 10 floors to get to them. This suggests that the elevators are defaulting to the lobby.
The graph below shows that for elevator users who waited longer than 60 seconds for their elevator to arrive, the elevator that did arrive was most commonly coming from the lobby (floor 0).
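The two views above (quartiles per floor, and where the elevator came from on long waits) could be computed along these lines. This is a sketch on fabricated numbers; the floor, wait_time, and origin_floor column names are assumptions.

```python
import pandas as pd

# Toy per-call results (made-up values).
trips = pd.DataFrame({
    "floor":        [0, 0, 0, 19, 19, 19],
    "wait_time":    [10.0, 20.0, 30.0, 40.0, 70.0, 100.0],
    "origin_floor": [3, 5, 2, 12, 0, 0],
})

# 1st, 2nd, and 3rd quartiles of wait time for each floor.
quartiles = (
    trips.groupby("floor")["wait_time"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
print(quartiles)

# For waits longer than 60 seconds, count which floor the elevator came from.
long_waits = trips[trips["wait_time"] > 60]
long_wait_origins = long_waits["origin_floor"].value_counts()
print(long_wait_origins)
```

If most long waits share a single origin floor, that is consistent with elevators idling there, though it does not by itself prove a default-floor setting.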

Overall average wait time?
wait_times = simulation.loc[simulation['action'] == 'call']
overall_average = wait_times.wait_time.mean()
The overall average wait time for an elevator was 25.69 seconds.
Descriptive Machine Learning.
As a sanity check and to satisfy my own curiosity, I fit a linear regression model to the data, then calculated the permutation importance of the features in order to determine what the model thought was the biggest factor contributing to elevator wait times. It turns out that total_floors_traveled had the biggest impact on the amount of time an elevator user waited.
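For readers who have not used permutation importance, here is a rough sketch of the idea with scikit-learn. It runs on synthetic data where wait time is deliberately built from the number of floors traveled, so the ranking is known in advance. The hour feature and all of the numbers are assumptions for illustration, not the features or results from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200

# Synthetic features: floors traveled drives the wait, hour is pure noise.
X = pd.DataFrame({
    "total_floors_traveled": rng.integers(0, 20, size=n),
    "hour": rng.integers(7, 20, size=n),  # hypothetical extra feature
})
y = 5 + 3 * X["total_floors_traveled"] + rng.normal(0, 2, size=n)

model = LinearRegression().fit(X, y)

# Shuffle each column in turn and measure how much the model's score drops.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
importances = pd.Series(
    result.importances_mean, index=X.columns
).sort_values(ascending=False)
print(importances)
```

A feature whose shuffling barely changes the score contributes little to the model's predictions, which is why the noise column lands near zero here.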

Recommendations.
To reduce wait times for the elevators in our fictional building, I would first suggest triggering an elevator call to the lobby when a user swipes through security, which would shave some time off the initial elevator ride to the destination floor.
My second suggestion would be to implement some variation of a Fixed Sectoring Common Sector System, which is a configuration where each elevator is given its own sector which it prioritizes. Since we have a total of 4 elevators, we would have 4 different sectors that an assigned elevator would service.
I would also suggest that at least one elevator be configured to default to the top (19th) floor, to reduce the number of waits over a minute for an elevator to arrive. These changes should reduce wait-related complaints from the building's occupants.
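To make the sectoring idea concrete, here is a small sketch of how the 20 floors might be split among the 4 elevators. Equal-sized, contiguous sectors are an assumption made for illustration; a real fixed sectoring system could size sectors by traffic instead, and the function name sector_for_floor is hypothetical.

```python
NUM_FLOORS = 20     # floors 0 (lobby) through 19
NUM_ELEVATORS = 4   # one sector per elevator


def sector_for_floor(floor: int) -> int:
    """Return the index of the elevator whose sector contains this floor."""
    if not 0 <= floor < NUM_FLOORS:
        raise ValueError(f"floor must be between 0 and {NUM_FLOORS - 1}")
    floors_per_sector = NUM_FLOORS // NUM_ELEVATORS  # 5 floors each
    return floor // floors_per_sector


# Floors each elevator would prioritize under this equal split.
sectors = {
    elevator: [f for f in range(NUM_FLOORS) if sector_for_floor(f) == elevator]
    for elevator in range(NUM_ELEVATORS)
}
for elevator, floors in sectors.items():
    print(f"Elevator {elevator}: floors {floors[0]}-{floors[-1]}")
```

Under this split the last elevator covers floors 15 through 19, which lines up with the suggestion to park one elevator near the top of the building.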
Next Steps.
This was a very interesting project to think about and try to "solve". I did very little research going into it, and would welcome any critiques of my methods and approach in the comments. After completing this iteration I found that there is a large body of work devoted to solving this very problem. One book that I find particularly interesting is Elevator Traffic Handbook: Theory and Practice by Gina Barney and Lutfi Al-Sharif. Once I have finished reading it, it would be interesting to revisit this project, apply what I've learned, and see where I could have improved my initial analysis.
Until then, happy coding!
You can find the data and notebook for this project here.