Analyzing California’s Electric Vehicle Adoption Rate

Using DMV Data with Pandas and GeoPandas

Dan Wilentz
Towards Data Science


Tesla (Courtesy of Matt Weissinger on pexels.com)

California is pushing for aggressive societal change towards a net-zero emissions future, and a big piece of that puzzle is the vehicles its citizens use to go about their daily lives. In conjunction with the Inflation Reduction Act (which provides tax credits of up to $7,500 for new EV purchases and up to $4,000 for used EVs, conditional on where the vehicle is assembled and where its battery materials are sourced), California has implemented the Advanced Clean Cars II (ACC II) regulations, which require at least 35% of automaker sales to be EVs by 2026. After 2026, the requirement scales up linearly each year until 2035, when all sales must be EVs.

This analysis focuses on the Electric Vehicle (EV) Adoption Rate by Californians in the era of these new incentives. I define EV adoption rate as:

EV Adoption Rate = (total EVs purchased) / (total vehicles purchased)

In this analysis, we’ll explore whether California is on track to hit the 2026 target of 35% EV Adoption Rate using publicly available DMV registration data. Then we’ll break this down further to look at progress on a geographic level and an automaker level.

Important note: Since the 35% requirement is ultimately on vehicle sales and the DMV provides vehicle registration counts (not vehicle sales counts), this analysis approximates sales using registrations. Specifically, for each year of DMV data, I only counted cars made within 3 years of the vehicle registration date, treating those registrations as a proxy for new car purchases.
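A minimal sketch of this filter in Pandas might look like the following. The column names (`registration_year`, `model_year`, `fuel`, `vehicles`), the fuel labels, and the toy numbers are all hypothetical and are not the DMV dataset's actual schema. "Within 3 years" is read here as a model year less than 3 years older than the registration year, which is one plausible interpretation.

```python
import pandas as pd

# Hypothetical registration counts; column names and values are illustrative only.
registrations = pd.DataFrame({
    "registration_year": [2022, 2022, 2022, 2022],
    "model_year": [2022, 2020, 2019, 2015],
    "fuel": ["Battery Electric", "Gasoline", "Battery Electric", "Gasoline"],
    "vehicles": [120, 300, 40, 500],
})

# Assumed labels for the fuel types that count as EVs.
EV_FUELS = {"Battery Electric", "Hydrogen Fuel Cell", "Plug-in Hybrid"}
MAX_AGE_YEARS = 3

# Keep only vehicles whose model year is recent enough to approximate a new sale.
age = registrations["registration_year"] - registrations["model_year"]
recent = registrations[age < MAX_AGE_YEARS]

# EV Adoption Rate = (total EVs purchased) / (total vehicles purchased)
ev_count = recent.loc[recent["fuel"].isin(EV_FUELS), "vehicles"].sum()
ev_adoption_rate = ev_count / recent["vehicles"].sum()
print(f"EV adoption rate: {ev_adoption_rate:.1%}")
```

Changing `MAX_AGE_YEARS` is a quick way to check how sensitive the results are to this cutoff.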

Data

The data is publicly available on California’s Open Data Portal and California’s State Geoportal.

Geography data:

Process

  1. Ingest publicly available data and clean it using Pandas.
  2. Analyze data with Pandas. Overlay it onto maps and plot it with GeoPandas.
  3. Push code routinely to GitHub.
  4. Iterate!

Technical Approach (GeoPandas Tutorial)

Feel free to skip this section if you’re only interested in the results.

This project served as an opportunity for me to learn how to use GeoPandas, a Python library for data analysis projects with a spatial component.

The general workflow for using GeoPandas is to connect the data you want to plot (such as the number of vehicles and EVs in a zip code) with an associated geometry (such as the zip code’s geometric boundaries) within a structure called a GeoDataFrame. The GeoDataFrame is the bread and butter of GeoPandas: it is a child class of the Pandas DataFrame that also includes a geometry column.

In my case, I had vehicle counts at the zip code level, but I wanted to plot vehicle counts at the county level. I started with the necessary library imports and read in my GeoJSON files for zip code and county boundaries.

import geopandas as gpd
import matplotlib.pyplot as plt

zip_codes = gpd.read_file(zip_code_geojson_path)
counties = gpd.read_file(county_geojson_path)

GeoDataFrames can have only one “active” geometry column. Whichever column is active will be used for joins, plotting, and other spatial operations. You can use the GeoDataFrame.set_geometry() method to make a different column the active geometry. Also, when two GeoDataFrames are joined, one of the active geometry columns will be dropped (as a GeoDataFrame can only have one active geometry column).

Since I wanted to combine my zip code and county GeoDataFrames but preserve the geometry information of both, I renamed the zip code geometry column. I also made a duplicate of the counties geometry column.

# rename zip_code geom column
zip_codes.rename_geometry('zip_code_geometry', inplace=True)

# create duplicate county geometry column
counties['county_geometry'] = counties.geometry

Since some zip codes had boundaries which overlapped multiple county boundaries, and I wanted to assign a zip code only once, I took the centroid (which is the geometric center of an object) of each zip code’s boundaries, and then looked to see if that zip code centroid lay within a county’s boundaries. Effectively, I reduced each zip code’s overall shape to its center-point and then determined which county a given zip code’s center-point was within.

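To do this, I first reprojected each GeoDataFrame from EPSG:4326 (latitude/longitude, the CRS that GeoJSON files use) to EPSG:3857 (Web Mercator). CRS stands for coordinate reference system. This effectively moves our coordinates from a globe to a flat map, which is what a centroid calculation expects: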

zip_codes.to_crs(3857, inplace = True)
counties.to_crs(3857, inplace = True)

I then calculated the zip code centroids and set those centroids to the active geometry:

# Calculate zip code centroids
zip_codes['zip_code_centroid'] = zip_codes.centroid

# Set the zip code active geometry to the centroid column
zip_codes.set_geometry('zip_code_centroid', inplace=True)

Finally, I joined the two GeoDataFrames:

zip_codes_with_county = gpd.sjoin(zip_codes, counties, how='inner', predicate='intersects')
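To see why the centroid trick assigns each zip code to exactly one county, here is a small self-contained sketch on made-up shapes. The two square "counties" and three rectangular "zip codes" are invented for illustration, and no CRS is set because the coordinates are already planar.

```python
import geopandas as gpd
from shapely.geometry import box

# Two made-up counties that share a border at x = 10.
counties = gpd.GeoDataFrame(
    {"county": ["West", "East"]},
    geometry=[box(0, 0, 10, 10), box(10, 0, 20, 10)],
)

# Three made-up zip codes; "00002" straddles the county border.
zip_codes = gpd.GeoDataFrame(
    {"zip_code": ["00001", "00002", "00003"]},
    geometry=[box(1, 1, 3, 3), box(8, 4, 11, 6), box(14, 2, 18, 8)],
)

# Replace each zip code polygon with its centroid (a single point).
zip_points = zip_codes.copy()
zip_points["geometry"] = zip_codes.centroid

# A point falls in only one county here, so each zip code matches once.
joined = gpd.sjoin(zip_points, counties, how="inner", predicate="intersects")
zip_to_county = dict(zip(joined["zip_code"], joined["county"]))
print(zip_to_county)
```

Joining the original polygons instead of the centroids would match zip code "00002" to both counties, which is the double counting the centroid step avoids.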

Once I had a GeoDataFrame that included zip code name, county name, zip code geometry, and county geometry, I joined vehicle counts and EV counts by zip code onto my GeoDataFrame, and aggregated counts to the county level. This left me with a GeoDataFrame with 58 rows (for the 58 counties in California) which included county name, county geography, vehicle count, and EV count. Perfect for plotting!
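The join-and-aggregate step can be sketched with plain Pandas as below. The table and column names (`zip_to_county`, `counts`, `vehicles`, `evs`) and the numbers are hypothetical stand-ins, and the geometry columns are left out to keep the example short.

```python
import pandas as pd

# Hypothetical zip-to-county lookup (the result of the spatial join).
zip_to_county = pd.DataFrame({
    "zip_code": ["A1", "A2", "B1"],
    "county": ["County X", "County X", "County Y"],
})

# Hypothetical vehicle and EV counts per zip code.
counts = pd.DataFrame({
    "zip_code": ["A1", "A2", "B1"],
    "vehicles": [100, 300, 200],
    "evs": [10, 50, 10],
})

# Attach counts to each zip code, then sum them up per county.
county_counts = (
    zip_to_county.merge(counts, on="zip_code", how="left")
    .groupby("county", as_index=False)[["vehicles", "evs"]]
    .sum()
)

# EV adoption rate per county = EVs / all vehicles.
county_counts["ev_rate"] = county_counts["evs"] / county_counts["vehicles"]
print(county_counts)
```

Summing counts first and dividing afterwards matters: averaging zip-level rates would give small zip codes the same weight as large ones.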

Below is an example of the plotting code. In it, I also included an extra GeoDataFrame for some cities in California to serve as landmarks on my plot:

# EV Adoption Rate 2022
fig, ax = plt.subplots(figsize=(10, 10))

# Shade each county by its 2022 EV adoption rate
county_gdf.plot(ax=ax,
                column='ev_rate_2022',
                legend=True,
                legend_kwds={'shrink': 0.5},
                cmap='Greens')

# Overlay landmark cities as orange points
city_gdf.plot(ax=ax,
              color='orange',
              markersize=10)

# Label each city at its coordinates
for idx, row in city_gdf.iterrows():
    plt.annotate(text=row['city'], xy=row['coords'], horizontalalignment='center', color='black')

# Hide the axes and frame so only the map shows
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

for edge in ['right', 'bottom', 'top', 'left']:
    ax.spines[edge].set_visible(False)

ax.set_title('CA EV Adoption Rate (2022) by County', size=18, weight='bold')

This code produced the first map of California in the next section (Results).

Results

EV Adoption Rate (Overall):

Above is a graph demonstrating EV adoption rate across the entire state of California from 2018 through 2022. EVs are defined here as battery-electric vehicles, hydrogen fuel-cell vehicles, or plug-in hybrids. Conventional gasoline hybrids are not counted as EVs (as those don’t satisfy the tax credit requirements or the automaker regulations). A 2019 Toyota Prius, for example, would not count.

We can see a bump in EV adoption rate from 2021 to 2022, from about 6.6% to 9.5%. The increase seems to have come primarily from an increase in zero-emission vehicle (ZEV) purchases.

If we assume a naive linear extrapolation from 2021 to 2022 to continue onwards, then it appears we will not hit the goal of 35% EV adoption rate by 2026. However, the DMV data reflects vehicle counts at the beginning of each year, and therefore isn’t counting the period of time after the new incentive structure was rolled out (August of 2022). The blue circle in the graph above compensates for that. It demonstrates the state-wide EV adoption rate for the year of 2022, and was taken from energy.ca.gov. If we include the blue circle in the trend and extrapolate linearly, it looks like the goal of 35% by 2026 is likely to be satisfied.

That being said, assuming a linear extrapolation is an oversimplification and may not even be the correct shape of the trend. Overall, it’s hard to predict what the next 4 years will look like, but the increase from 2021 to 2022 is a promising sign, as is the extra data point from energy.ca.gov.
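As a worked example of the naive extrapolation, the sketch below extends the 2021 to 2022 change (about 6.6% to 9.5%) out to 2026. The function name is made up for illustration, and the energy.ca.gov data point is not included because its value is not stated here.

```python
def extrapolate_linear(rate_prev_year, rate_last_year, last_year, target_year):
    """Extend the line through two consecutive yearly rates out to target_year."""
    yearly_growth = rate_last_year - rate_prev_year
    return rate_last_year + yearly_growth * (target_year - last_year)


# DMV-based statewide rates quoted above: about 6.6% (2021) and 9.5% (2022).
projected_2026 = extrapolate_linear(6.6, 9.5, last_year=2022, target_year=2026)
print(f"Projected 2026 EV adoption rate: {projected_2026:.1f}%")  # about 21.1%
```

At roughly 2.9 percentage points of growth per year, the DMV-only trend reaches about 21% in 2026, which is why it falls short of the 35% target.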

EV Adoption Rate (County Level):

We can also look at EV adoption rate at the county level, to get an idea of where in the state EVs are being purchased at higher rates:

In the map above, we can see that the Bay Area has by far the highest EV adoption rates in the state. Elsewhere, EV adoption rate tends to be higher in coastal counties and along the corridor from the Bay Area to Tahoe. Specifically, the following 5 counties have the highest EV adoption rates:

One hypothesis for why the Bay Area has high EV adoption rates compared to the rest of the state is that it reflects the wealth and political leanings of the citizens who reside there. While out of scope for this analysis, we could use income data from the most recent census and how Californians voted on Proposition 30 in 2022 to explore this further.

The areas of California with the lowest EV adoption rate tend to be clustered in the northeast section of the state. The 5 counties with the lowest EV adoption rates are:

The northeast area of California is low in population. Residents there may feel hesitant to adopt EVs because these areas tend to face severe weather (and perhaps there is a sentiment amongst residents that EVs will be a functional downgrade from what they are used to in these conditions). It is also possible that there is little charging infrastructure in this part of the state. The big outlier here is Imperial County, the southeasternmost county of California. It is more populous than the other counties in this list and is a desert (as opposed to redwood-filled mountains). It may also be facing an infrastructure shortage. Although out of scope for this analysis, we could determine whether lack of infrastructure correlates with EV adoption rate by looking at EV charger location data from the US Department of Energy.

If we take each county’s 2021 and 2022 EV adoption rates and extrapolate linearly, we can come up with an estimate for which counties will hit the target by 2026 and which will not.

Linear Extrapolation — based on 2021 to 2022 countywide growth

However, this extrapolation doesn’t include EV adoption rates after the new incentive structure. If we extrapolate by assuming yearly growth is equal to the average of county-wide 2021 to 2022 growth and the statewide 2022 to 2023 growth as taken from energy.ca.gov, we can produce the projection below:

Linear Extrapolation — based on average of 2021 to 2022 countywide growth and 2022–2023 statewide growth
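The averaged-growth projection can be sketched as follows. Everything numeric here is a placeholder: the county names and rates are invented, and `STATEWIDE_GROWTH_2022_2023` is a hypothetical value standing in for the energy.ca.gov figure, which is not quoted in this article.

```python
import pandas as pd

TARGET = 35.0  # 2026 target, in percent
STATEWIDE_GROWTH_2022_2023 = 6.0  # hypothetical placeholder, in percentage points

# Invented county-level EV adoption rates, in percent.
counties = pd.DataFrame({
    "county": ["County A", "County B"],
    "ev_rate_2021": [12.0, 3.0],
    "ev_rate_2022": [16.0, 4.0],
})

# Average each county's own 2021 to 2022 growth with the statewide growth.
county_growth = counties["ev_rate_2022"] - counties["ev_rate_2021"]
yearly_growth = (county_growth + STATEWIDE_GROWTH_2022_2023) / 2

# Apply that yearly growth for the 4 years from 2022 to 2026.
counties["ev_rate_2026"] = counties["ev_rate_2022"] + yearly_growth * (2026 - 2022)
counties["hits_target"] = counties["ev_rate_2026"] >= TARGET
print(counties[["county", "ev_rate_2026", "hits_target"]])
```

Setting the statewide term equal to each county's own growth recovers the first, purely county-based projection.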

As in the EV adoption rate map we saw earlier, the counties with the higher EV adoption rates tend to be the ones I am projecting will hit the 2026 target. If we take all the counties that are not projected to hit the target, and weight how far each is projected to miss it by its population, we can determine which counties are most “important” to focus on. These are the counties with the most room for improvement, and so the biggest opportunity to push California as a whole towards the 2026 goal. The 5 highest opportunity counties are as follows:
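One way to express this weighting is to multiply each county's projected shortfall by its population, as in the sketch below. The exact weighting used for the ranking is not spelled out here, so treat the product as an assumption. The county names, projections, and populations are invented.

```python
import pandas as pd

TARGET = 35.0  # 2026 target, in percent

# Invented projections and populations for illustration.
projections = pd.DataFrame({
    "county": ["County P", "County Q", "County R"],
    "ev_rate_2026": [33.7, 20.0, 40.0],
    "population": [10_000_000, 400_000, 800_000],
})

# Keep only counties projected to miss the target.
missing = projections[projections["ev_rate_2026"] < TARGET].copy()

# Shortfall in percentage points, weighted by population.
missing["shortfall"] = TARGET - missing["ev_rate_2026"]
missing["opportunity"] = missing["shortfall"] * missing["population"]

ranked = missing.sort_values("opportunity", ascending=False)
print(ranked[["county", "shortfall", "opportunity"]])
```

In this toy data, County P misses the target by only 1.3 points but still ranks first because of its large population, which is the same effect that pushes a populous county high on the list.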

These counties mostly lie along the southeast corner of the state and in the southern Central Valley. They are both high in population and low in EV usage. If California ends up lagging behind a 35% EV adoption rate in 2026, this region could be a big reason why. Los Angeles is a particularly notable county here. I am projecting it to be at a 33.7% EV adoption rate in 2026 (almost hitting the goal of 35% but not quite), but since it is so high in population, it appears towards the top of the list of most important counties.

It is worth noting that the model used above to estimate EV adoption rates in 2026 is very simple and is just one way to think about predicting future EV adoption. Future iterations of this work could add more complexity, likely yielding more accurate results.

EV Adoption Rate (Automaker Level):

We can also look at the data on an automaker level to assess which automakers are on their way to hitting the 2026 target and which are lagging.

Important note: I noticed that the DMV data had a large percentage (approximately 28%) of EVs labeled as “Other/Unknown” when it came to their make. I’m not sure which EVs are in this group, but if they were all correctly apportioned, these results could look different.

If we look purely at who is registering the most EVs, we see the following:

2022 DMV Data

We can see that Tesla has the lion’s share with Toyota a distant second. After that there is a long tail of other EV sellers. Most automakers’ EV rates are in the low single digits, with a couple of luxury brands at a few percentage points higher. From this data, it’s clear that automakers have a lot of work to do to get to 35% EV sales by 2026.

I discussed this list with a few industry professionals, who pointed out the odd absence of specific automakers, particularly Nissan, Kia, and Hyundai. I did some digging and saw that Nissan in particular had registered many older EVs (such as EVs made in 2018 or 2019) in 2022, and these were therefore being filtered out by my rule of only looking at cars made within 3 years of the registration year. If I included one extra year, Nissan appeared in this list at #8. Honda also made it onto the list in this scenario. This adjustment did not change the results for Hyundai and Kia very much. It’s possible that those two automakers make up a significant portion of the “Other/Unknown” group mentioned above.

If we instead order our data by who sells the most vehicles overall, we see the following:

2022 DMV Data

From this graph, it’s clear that all large automakers (with the exception of Tesla) have a lot of work to do in order to hit 35% EV sales.

IRS updates regarding which vehicles will qualify for the IRA tax credit were shared with the public on April 17, 2023. Assuming no changes to this list for the foreseeable future, the tax credits will primarily benefit Chevrolet, Ford, Tesla, Jeep, and Lincoln and hurt Nissan, Volvo, Audi, Hyundai, Kia, Subaru, Toyota, and Volkswagen (although I’ve seen conflicting information on whether the Volkswagen ID.4 will qualify). This is expected, as the IRA is supposed to stimulate car manufacturing (and consequently automaker jobs) within the US and its allies, and is supposed to decrease reliance on Foreign Entities of Concern (FEOC) such as China.

We can also examine California on a county level and see which are the top EV sellers per county:

Tesla is by far the biggest seller with Toyota the second biggest. After that, it’s not clear if any EV sellers have clear regions where they are strong.

Vehicles per Capita (County Level):

Though this doesn’t have to do directly with EVs, I also looked at California’s Vehicles per Capita at the county level and examined how that number trended from 2018 to 2022. As we seek to transition to a more sustainable society, we must not only move towards EVs but also use fewer personal vehicles in general, relying on public transportation, biking, and carpooling where we can. This portion of the analysis studies how many vehicles there are per person in California, and whether that number is trending upwards or downwards regionally.

Number of vehicles per person in 2022
% Change in Vehicles Per Person from 2018 to 2022

As you can see in the first graph, the Sierra mountain counties have the highest vehicles per capita, while San Francisco County (though it may be hard to see) and the rest of the Bay Area have the lowest. The Central Valley also generally shows a low number of vehicles per capita, but there the cause is more likely lower wealth, whereas in the Bay Area it likely has to do with greater access to transportation options. The low vehicles per capita in the Central Valley could also reflect vehicles not being registered.

The second graph demonstrates how much vehicles per capita shifted in each county from 2018 to 2022. In most counties it either stayed relatively flat or increased. In the Bay Area it dropped significantly. My hypothesis is that this reflects the advent of work from home: since many people in the tech industry (which is particularly concentrated around the Bay Area) started working from home as a result of the pandemic, they were no longer commuting, had less need for their cars, and therefore sold them (though out of scope for this analysis, used car sales data could be used to explore this further).
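Both mapped metrics are simple ratios, sketched below in Pandas. The county names, vehicle counts, and populations are invented, and the column names are hypothetical.

```python
import pandas as pd

# Invented vehicle counts and populations for two counties.
counties = pd.DataFrame({
    "county": ["County M", "County N"],
    "vehicles_2018": [800_000, 500_000],
    "population_2018": [1_000_000, 1_000_000],
    "vehicles_2022": [700_000, 550_000],
    "population_2022": [1_000_000, 1_000_000],
})

# Vehicles per capita (vpc) in each year.
counties["vpc_2018"] = counties["vehicles_2018"] / counties["population_2018"]
counties["vpc_2022"] = counties["vehicles_2022"] / counties["population_2022"]

# Percent change in vehicles per capita from 2018 to 2022.
counties["vpc_pct_change"] = (
    (counties["vpc_2022"] - counties["vpc_2018"]) / counties["vpc_2018"] * 100
)
print(counties[["county", "vpc_2022", "vpc_pct_change"]])
```

Dividing by each year's own population keeps population growth or decline from being mistaken for a change in car ownership.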

Conclusion:

The EV landscape in California is sprawling and nuanced, and we have just scratched the surface here. Overall, it seems like Californians are trending in the right direction towards the 35% EV adoption rate target for 2026, while individual automakers have a lot of work to do to hit this goal themselves. It will be interesting to see how the next few years play out!

I hope you enjoyed this analysis! You can review all the code yourself on my GitHub page. If you enjoyed it, please follow me to be notified of my future work.

Note: All images unless otherwise noted are by the author.
