Overlooking Crime in New York City amid the Pandemic and Protests
What kind of story does the crime data tell about NYC in 2020 so far?
Introduction: What is going on with NYC?
These days, there are very conflicting accounts of the crime rate in New York City. Some politicians say violent crime is rampant in the Big Apple, while others say the city is safer than ever because of the lockdown. With the election coming up, I think this is just one more public topic being used for political gain. But I have also heard many versions of the story personally.
I have friends and family who say the city has become uninhabitable. Meanwhile, I haven’t heard anything different from my foreign friends or colleagues who are still in the city. This got me interested in going back to my old NYC crime project, now almost a year old, and doing something different this time.

The Topics
Based on the contradictory opinions mentioned above, I can think of two related questions:
- Can we find the impact of COVID-19 or the protests on the overall crime rate?
- Has NYC become more violent or dangerous than before? What types of crime stand out in 2020?
The Data
This is not an easy case, given how unprecedentedly different 2020 is from previous years. To stay clear of political views and preconceptions, I chose open data for the research. We will revisit the NYPD Complaint Data on NYC OpenData, but this time we will pick the data spanning the past five years (starting from 2015). To answer these questions, though, there is one huge caveat and one issue.
- We have to assume that the NYPD’s crime recording system didn’t malfunction during any specific event. That is, neither the pandemic nor the protests stopped the recording or changed the process in a way that could affect how the records appear in the dataset.
- This article was written at the start of November 2020, so the data for the second half of the year is missing. Because of this, we extracted only the first six months of each year for the tests.
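For readers who want to reproduce the first-half-year extraction, here is a minimal sketch of how it could look in Pandas. It is an illustration under assumptions and not necessarily the exact code behind this article: the column name `CMPLNT_FR_DT` and the `MM/DD/YYYY` date format are my assumptions about the complaint export, and `first_half_by_year` is a hypothetical helper name.

```python
import pandas as pd

def first_half_by_year(df, date_col="CMPLNT_FR_DT", start_year=2015, end_year=2020):
    """Keep only complaints dated January-June of start_year..end_year.

    date_col and the MM/DD/YYYY format are assumptions about the export;
    adjust them to match the file you actually downloaded.
    """
    # Unparseable or out-of-range dates become NaT and are dropped by the mask.
    dates = pd.to_datetime(df[date_col], format="%m/%d/%Y", errors="coerce")
    mask = dates.dt.year.between(start_year, end_year) & dates.dt.month.between(1, 6)
    out = df.loc[mask].copy()
    out["date"] = dates[mask]
    out["year"] = out["date"].dt.year
    return out

# Tiny made-up sample, only to show the filter in action.
sample = pd.DataFrame(
    {"CMPLNT_FR_DT": ["01/15/2015", "07/04/2018", "06/30/2020", "03/01/2014", "bad"]}
)
half = first_half_by_year(sample)
print(half[["date", "year"]])
```

Only the January 2015 and June 2020 rows survive: the July row is in the second half of the year, the 2014 row is before the window, and the malformed date cannot be parsed.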
p.s.1 The crime rate is well known to change with the season. If we plot the data from previous years over time, we can clearly see a nearly identical seasonal pattern each year.

p.s.2 I used Python with the Pandas and Matplotlib packages for most of the process. I will keep the code to a minimum throughout the article for simplicity, but if you are interested in the data processing, visualization, or other details, please feel free to contact me for more information.
Can we find the impact of COVID-19 or the protests on the overall crime rate?
Even before applying any statistics, we can spot a distinct difference in a simple graph. In the middle of the first half-year (around March), the 2020 crime data shows a deep valley. Later, the crime rate recovers slightly to a moderate level, as shown below.

If we put a more detailed time frame on the axis and focus on 2020, we can see that the drop coincides with some major events.

From the graph above, the executive order roughly coincides with the drop. To our surprise, the protests didn’t obviously spike the total number of crime events. (We will dig deeper and examine the crime types in a later part.)
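The comparison in this section boils down to counting complaints per calendar day and putting 2020 next to the average of the earlier years. Here is a rough sketch of that step, assuming the complaint dates are already parsed; the function names `daily_counts_by_year` and `compare_to_baseline` are hypothetical, and the sample dates are made up.

```python
import pandas as pd

def daily_counts_by_year(dates):
    """Rows are (month, day), columns are years, values are complaints per day."""
    dates = pd.to_datetime(pd.Series(dates))
    frame = pd.DataFrame(
        {"year": dates.dt.year, "month": dates.dt.month, "day": dates.dt.day}
    )
    counts = frame.groupby(["month", "day", "year"]).size()
    # Days that do not exist in a given year (e.g. Feb 29) stay NaN.
    return counts.unstack("year")

def compare_to_baseline(table, target_year=2020):
    """Put the target year next to the mean of all other years, day by day."""
    baseline = table.drop(columns=target_year).mean(axis=1)  # NaN days are skipped
    return pd.DataFrame({"baseline_mean": baseline, "target": table[target_year]})

# Made-up dates: 5 complaints on Mar 1, 2018; 3 in 2019; 2 in 2020.
toy_dates = ["2018-03-01"] * 5 + ["2019-03-01"] * 3 + ["2020-03-01"] * 2
comparison = compare_to_baseline(daily_counts_by_year(toy_dates))
print(comparison)
```

Keying on (month, day) instead of day-of-year keeps leap years (2016 and 2020) aligned with the other years. Plotting `comparison` with Matplotlib then gives a line chart of the same kind as the ones discussed above.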
Statistical analysis
So far, we can already see the difference between these two sets, but let’s set the time factor aside and consider only the overall distribution of the daily crime counts. With a statistical test (the Anderson–Darling test, which checks whether a given sample of data is drawn from a given probability distribution), we can check whether the distribution of the past 5 years and that of 2020 are the same. Under this setting (the null hypothesis is “the 2020 data are drawn from the same distribution” and the confidence level is set to 95%, i.e. a 5% significance level), we confidently reject the null hypothesis and conclude that the 2020 data were not generated from the same distribution as the 5-year data.
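As a sketch of how such a test can be run: because two samples are being compared, the example below uses the k-sample variant of the Anderson–Darling test from SciPy (`scipy.stats.anderson_ksamp`). This is my assumption about a suitable routine, not a statement of the exact call used for this article, and the daily counts are synthetic numbers invented purely for illustration.

```python
import warnings

import numpy as np
from scipy.stats import anderson_ksamp

# Synthetic daily counts, invented for illustration only (not NYPD data).
rng = np.random.default_rng(0)
baseline_daily = rng.poisson(lam=1300, size=900)  # stand-in for the 5-year sample
daily_2020 = rng.poisson(lam=1000, size=180)      # stand-in for the 2020 sample

with warnings.catch_warnings():
    # SciPy warns when the approximate p-value falls outside its tabulated range.
    warnings.simplefilter("ignore")
    result = anderson_ksamp([baseline_daily, daily_2020])

# critical_values correspond to the 25%, 10%, 5%, 2.5%, 1%, 0.5%, 0.1% levels.
critical_5pct = result.critical_values[2]
reject_null = bool(result.statistic > critical_5pct)

print(f"statistic={result.statistic:.2f}, 5% critical value={critical_5pct:.2f}")
print("reject 'same distribution' at the 5% level:", reject_null)
```

The null hypothesis is rejected when the statistic exceeds the critical value for the chosen significance level, which is the decision rule described above.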

Here are the plots (histogram and cumulative distribution function) showing visually how different the two are.


p.s. Although the number of daily crime cases is a discrete value, the sample is large enough (the 5-year histogram resembles a normal distribution) for the Anderson–Darling test, which is usually used for continuous data.
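The histogram and cumulative distribution comparison above can be prepared with a few lines of NumPy. This is a minimal sketch with hypothetical helper names (`ecdf`, `shared_histograms`) and toy inputs; the two samples share the same bin edges so the histograms are directly comparable.

```python
import numpy as np

def ecdf(values):
    """Empirical CDF: sorted values and the fraction of points <= each value."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y

def shared_histograms(a, b, bins=30):
    """Density histograms of two samples computed on common bin edges."""
    edges = np.histogram_bin_edges(np.concatenate([a, b]), bins=bins)
    dens_a, _ = np.histogram(a, bins=edges, density=True)
    dens_b, _ = np.histogram(b, bins=edges, density=True)
    return edges, dens_a, dens_b

# Toy inputs, only to show the shapes of the outputs.
x, y = ecdf([3, 1, 2])
edges, dens_a, dens_b = shared_histograms(np.array([1, 2, 3]), np.array([2, 3, 4]), bins=3)
print(x, y)
print(edges, dens_a, dens_b)
```

With Matplotlib, `plt.step(x, y, where="post")` draws the empirical CDF, and `plt.stairs(dens_a, edges)` draws a histogram from the precomputed densities.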
At this point, we have shown that the distribution of daily crime counts in 2020 is different from that of the previous 5 years. We can also speculate that COVID affected the crime rate significantly, while the protests don’t appear to have done so. To expand a little on the latter, we proceed to the next topic and look into the crime categories.
Has NYC become more violent or dangerous than before? What types of crime stand out in 2020?
Here, I am going to use the same arbitrary rule to classify the crime events (please see my previous article for more explanation). We will investigate whether any type of crime spiked in 2020 compared to the previous 5 years.
p.s. Again, to avoid possible seasonality in the crime types, we use only the first half-year of data. Also, the 5-year figure is the average over the past 5 years.
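To make the comparison concrete, here is a small sketch of how the per-type difference between 2020 and the 5-year average can be computed. It assumes each record already carries a crime type and a year; the column names `crime_type` and `year`, the helper name `change_vs_baseline`, and the toy records are all hypothetical.

```python
import pandas as pd

def change_vs_baseline(df, type_col="crime_type", year_col="year", target_year=2020):
    """Count events per type and year, then compare the target year
    with the average of all other years."""
    counts = df.groupby([type_col, year_col]).size().unstack(year_col, fill_value=0)
    baseline = counts.drop(columns=target_year).mean(axis=1)
    target = counts[target_year]
    out = pd.DataFrame({"baseline_avg": baseline, "target": target})
    out["pct_change"] = (target - baseline) / baseline * 100
    return out.sort_values("pct_change", ascending=False)

# Made-up records: type A goes from an average of 3 to 6, type B from 4 to 2.
rows = (
    [("A", 2018)] * 2 + [("A", 2019)] * 4 + [("A", 2020)] * 6
    + [("B", 2018)] * 4 + [("B", 2019)] * 4 + [("B", 2020)] * 2
)
toy = pd.DataFrame(rows, columns=["crime_type", "year"])
summary = change_vs_baseline(toy)
print(summary)
```

Sorting by the percentage change puts the types that increased the most at the top, which is the view used to pick out the categories discussed next.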

Looking at the difference between the 5-year and 2020 data, most types of crime declined in 2020. There are noticeable increases in a few types, though: Burglary, Motor vehicle theft, Arson, and Murder all rose to some degree in 2020.
Let’s plot these types over time to take a closer look.

We can summarize some takeaways from the plot above:
- Most crime types seem to be affected by the COVID-19 shutdown, as we see nearly identical drops in March. However, Motor vehicle theft, Burglary, Arson, and Gambling don’t seem to have been affected in March. It could be that these types of crime aren’t necessarily limited by the shutdown.
- In general, after the huge dips in March, most crime types rebound to some degree, but they remain lower than at the start of the year. We might speculate that crime events are correlated with the level of economic activity, especially for non-violent crimes (e.g., Social commercial related crime, Larceny, Fraud).
- In the later part of the period (from about May), we can see some big spikes in incidents. Specific violent crimes (such as Criminal mischief property, Arson, Aggravated assault, Burglary) have surpassed the regular level of the past 5 years. We could possibly attribute this to the series of protests starting May 29th. Burglary and Arson reached record highs, something that rarely happens under normal circumstances.
- The protests did spark specific types of crime, and that was disastrous for the city. That being said, most crime types are still below the 5-year average, and the specific ones that the media reported on extensively (Burglary and Arson) only went up for one week. Aggravated assault and Motor vehicle theft could be new problems, as we see some upward trends, but in my honest opinion the available data doesn’t fully support the claim that New York City has become more violent these days.
One thing worth pointing out is that the actual number of Murder cases is far smaller than the others (on the scale of hundreds), so even a slight fluctuation makes the percentage change look more intimidating. However, this is not to downplay distressing events! For crimes like murders and shootings, the perception and the causes are far more important than the figure alone. Identifying the underlying causes (e.g., street gangs, individual incidents) requires more detailed data. This limitation of the analysis stops us from getting more concrete insight into the matter.
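The small-base effect mentioned above is easy to see with a bit of arithmetic. The numbers below are hypothetical and chosen only to show the effect; they are not taken from the NYPD data.

```python
def percent_change(baseline, current):
    """Percentage change of current relative to baseline."""
    return (current - baseline) / baseline * 100

# Hypothetical counts: the same 50 extra cases on a small and a large base.
extra_cases = 50
small_base_change = percent_change(200, 200 + extra_cases)
large_base_change = percent_change(20_000, 20_000 + extra_cases)

print(f"small base: {small_base_change:.2f}%")
print(f"large base: {large_base_change:.2f}%")
```

The same absolute increase is a hundred times larger in percentage terms on the base that is a hundred times smaller, which is why a percentage chart alone can overstate changes in low-count categories.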
Conclusion: Hard to say anything at this point
It is not easy to draw insights or conclusions from this year’s complicated data, given that multiple unprecedented things happened at once and in quick succession (New York City’s last curfew was in 1945!). Still, it is interesting, and arguably the responsible thing to do, to look at the real data and try to get a true overview of the subject.
To close the study, I can only venture the conclusion that NYC has had lower crime activity because of the pandemic, even though it experienced a spike from “aggressive protests” lasting one to two weeks. It is easy to jump to the conclusion that New York City is a cesspool of crime right now, but compared to the past, the city isn’t doing too badly, at least for the first half of the year. It is crucial to keep tracking the data and see whether there is a systematic change in overall crime numbers if we want to reach any solid conclusion.
What’s next?
Once 2020 is over (hopefully, we can make it 😀), I will revisit the subject and see if there is something more interesting to investigate. An even more impactful event, the US election in November, is also on the way, and it will be another great opportunity to see how political activity can affect the crime rate in a city.
Thanks for reading this long post! Feel free to reply with any thoughts and opinions, or contact me through LinkedIn!
p.s. If you are interested in state-level COVID-19 info, be sure to visit this neat personal dashboard to track the curves.
Jeff






