
Pitcher deception is something of a white whale for baseball analytics. We all know those pitchers who look weird but get good results doing it. We know there’s something about them that makes them harder to hit – but what? Deception is inherently a visual quality, so quantifying it perfectly is an impossible task. Quantifying it well enough is daunting too, and that’s what I attempt in this project.
"Deception" is defined as the act of causing someone to accept as true or valid what is false or invalid. In a pitching context, that "someone" is the batter. A pitcher’s deception is trying to convince the batter that something is coming, when what really comes is something different. Thus, deception is something that is possessed by the pitcher, but perceived through the batter. When quantifying pitcher "deception", we must therefore think through the lens of the batter but focus on attributes of the pitcher. This is a motif that we will keep coming back to throughout the methodology.
After some brainstorming, my definition for pitcher deception is a three-part equation (where w represents a weight):
*Deception = w1 Unpredictability + w2 Indistinguishability + w3 Unexpectedness*
We will calculate each of these three sub-metrics (naming conventions could definitely use some work) before rolling them up into the final overall deception metric.
The following analysis uses publicly available Statcast data from Baseball Savant, covering the 2018–2020 regular seasons. To balance the trade-off between a larger sample size and the recency of any arsenal or mechanical changes a pitcher makes from season to season, I aggregated metrics to the pitcher-season level (for example, separate deception metrics for 2018 Kershaw, 2019 Kershaw, and 2020 Kershaw).
I performed the methodology using R, and you can find the source code, along with a CSV of the full deception leaderboard, here on my GitHub page.
Unpredictability
The first sub-metric we need to calculate is unpredictability, which I define as the batter’s inability to accurately guess which pitch type the pitcher will throw in a given count.
Assumption 1: The less a pitcher’s pitch type frequencies deviate from his usual mix, regardless of count, the harder it will be for the batter to guess which pitch is coming.
Consider the fastball tendencies of Trevor Bauer and Aaron Nola, as an example:

Bauer and Nola throw four-seam fastballs at similar overall rates; however, while Bauer’s fastball usage fluctuates heavily with the count, Nola’s stays comparatively flat, making it much harder to predict when he will throw a fastball. If a batter gets to a 3–1 count against Bauer, he can be reasonably sure that the next pitch will be a four-seam fastball. Against Nola, the 3–1 count provides little to no additional information about whether a four-seam fastball is coming.
To quantify this, I calculated each pitcher’s frequency for each pitch type in each count. Subtracting the pitcher’s overall frequency for that pitch type (regardless of count) leaves a measure of deviation that indicates predictability by count. To roll this metric up to the pitcher-season level, I took a weighted sum of these deviations, with each weight being the proportion of all MLB pitches thrown in that count. This values unpredictability in an 0–0 count much more highly than unpredictability in a 3–0 count, for example, because a pitcher finds himself in an 0–0 count much more often.
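The original analysis is in R, but the count-deviation logic can be sketched in a few lines of Python. The `unpredictability` helper below and its input shapes are hypothetical, not the author’s actual code:

```python
from collections import Counter

def unpredictability(pitches, count_weights):
    """Count-weighted deviation of a pitcher's per-count pitch mix from
    his overall mix. `pitches` is a list of (count, pitch_type) tuples for
    one pitcher-season; `count_weights` maps each count to the league-wide
    share of pitches thrown in it. Illustrative sketch only.
    """
    total = len(pitches)
    overall = Counter(pt for _, pt in pitches)          # overall pitch-type usage
    by_count = {}
    for cnt, pt in pitches:
        by_count.setdefault(cnt, Counter())[pt] += 1
    score = 0.0
    for cnt, types in by_count.items():
        n = sum(types.values())
        for pt in overall:
            # deviation of this count's usage from the overall usage
            dev = abs(types[pt] / n - overall[pt] / total)
            score += count_weights.get(cnt, 0.0) * dev
    return score
```

A score near zero means the pitch mix barely changes with the count (the Nola pattern); larger scores indicate count-dependent, more predictable behavior, so the raw deviation would presumably be inverted before contributing positively to deception.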
Indistinguishability
"Indistinguishability" refers to release point tunneling. When a batter sees different pitches coming out of the same arm slot, it becomes much more difficult to preemptively guess which pitch is coming next.
Assumption 2: I decided to measure release point consistency on a consecutive-pitch basis, because the arm slot of the previous pitch is going to be the most prominent image in the batter’s mind going into the next pitch. He is unlikely to remember the pitcher’s arm slot from 50 games ago, for example, so incorporating a pitcher’s overall release point average across three seasons would be counterproductive to the thought experiment of getting inside the mind of the batter.
Thus, for each pitch I calculated the Euclidean distance between the x-z release position of the current pitch and that of the previous pitch, grouped by at-bat. This means the Euclidean distance of the first pitch of each at-bat was always null, because there was no previous pitch. The grouping strategy is meant to avoid the "Yusmeiro Petit" situation: Petit switches sides of the rubber depending on the batter’s handedness, and such a drastic change in the x-dimension release point between at-bats would unfairly throw off his release point consistency.
However, this methodology does not account for the "Rich Hill" situation: Hill sometimes changes his arm angle mid-at-bat depending on which of his two curveballs he wants to throw. These instances are heavily penalized for not showing consistent tunneling, even though one could argue that this, too, is a form of deception. To handle these outlier cases, I plotted the Euclidean distance distribution for the entire pitch population from 2018–20.

As you can see, the distribution is highly skewed. Examples of either bad data or "Rich Hill situations" are littered throughout the data set. To filter them out, so as not to over-penalize a pitcher who is deliberately changing his arm angle drastically, I excluded any pitch whose release point was more than 0.75 feet from the previous pitch’s. Based on the distribution, this only excludes examples more than three standard deviations above the mean – true outliers.
Finally, to roll this up to the pitcher-season level, I simply took the mean Euclidean distance for each pitcher as their average pitch-to-pitch tunneling consistency.
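Putting the last few steps together – consecutive-pitch distances within an at-bat, the 0.75-foot outlier cutoff, and the mean rollup – a minimal Python sketch might look like the following (hypothetical helper name and input shape; the original work is in R):

```python
import math

def tunneling_score(at_bats, cutoff=0.75):
    """Mean Euclidean distance between consecutive release points (x, z),
    computed within each at-bat so the first pitch of an at-bat has no
    comparison, and dropping jumps above `cutoff` feet (bad data or
    deliberate arm-angle changes). Illustrative sketch only.
    """
    dists = []
    for ab in at_bats:                      # ab: list of (release_x, release_z)
        for (x0, z0), (x1, z1) in zip(ab, ab[1:]):
            d = math.hypot(x1 - x0, z1 - z0)
            if d <= cutoff:                 # exclude >0.75 ft outliers
                dists.append(d)
    return sum(dists) / len(dists) if dists else float("nan")
```

Lower scores indicate tighter pitch-to-pitch release point consistency.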
Unexpectedness
Quantifying unexpectedness was perhaps the most involved of the three metrics. I define unexpectedness as the deviation between a pitch’s actual movement (horizontal and vertical) and its expected movement given the pitcher’s release point. If a pitch moves differently than what hitters have seen from similar arm slots in the past, it will be more deceptive: the release point fools hitters into expecting a movement profile that does not match reality.
This sub-metric was inspired by Eno Sarris’ prior work on the subject for Fangraphs, which can be found here.
First, I calculated the average release point coordinates and horizontal and vertical movements by pitch type for each pitcher. I multiplied the horizontal movements of all pitches thrown by a lefty by -1 to put them on the same scale as righties.
Assumption 3: I used pitcher-season averages instead of pitch-level data in order to model a hitter’s memory. Expected pitch movement comes from the expectation of the hitter, so it remains important to view this methodology through the lens of the hitter. They will tend to remember what a pitcher looked like in general rather than remembering the movement shape of each individual pitch.
After some data exploration, I noticed a weak but significant correlation between release point and pitch movement. Any model predicting movement solely from release point will not generate very accurate predictions – but neither will the human brain. If a hitter sees Ryan Thompson’s sidearm delivery, he will assume that his sinker breaks a certain amount arm-side and down. If a hitter sees Mike Fiers’ over-the-top delivery, he will assume that his four-seamer "rises". These are the generalized predictions I want the model to capture.
Assumption 4: To continue with the motif of modeling the hitter’s mindset, I decided to use a k-nearest neighbors regression model to calculate expected pitch movement from release point. The model looks at the k most similar pitchers’ release points and averages their horizontal and vertical movements to produce the predicted, "expected" movement value. Again, this mimics the thought process of the batter: when confronted with a new pitcher’s release point, he will likely fill in the blank with the historical precedent of similar release points he has seen before. I chose k = 10 – a number I assumed would strike a balance between a comprehensive look at several pitchers and not stretching the limits of a hitter’s memory. Due to the weak relationship between the input and target variables, the "optimal" k-value by lowest RMSE was k = sample size, which simply returns the mean of the population. However, the difference in RMSE between k = 10 and higher k-values was minimal, indicating very little performance lost by choosing a more reasonable k.
Naturally, for arm angles seldom seen around the league, this will lead to some inaccurate predictions – and therefore high unexpected movement – since the 10 most similar release points won’t actually be all that close. But this "bug" doubles as a feature: it captures and rewards the scarcity of a release point. A wildly unorthodox release point should count as a form of deception, since hitters are not accustomed to seeing it and are less familiar with how pitches move out of that slot.
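As a rough illustration of the k-nearest-neighbors idea (the author fit an R model; this hand-rolled Python lookup and its argument shapes are illustrative only):

```python
def knn_expected_movement(release_points, movements, query, k=10):
    """Expected (pfx_x, pfx_z) for a query release point: average the
    movement of the k nearest release points, mimicking a hitter filling
    in the blank from similar arm slots he has seen before.
    `release_points` and `movements` are parallel lists of (x, z) pairs.
    """
    nearest = sorted(
        range(len(release_points)),
        key=lambda i: (release_points[i][0] - query[0]) ** 2
                    + (release_points[i][1] - query[1]) ** 2,
    )[:k]
    n = len(nearest)
    ex = sum(movements[i][0] for i in nearest) / n   # expected horizontal break
    ez = sum(movements[i][1] for i in nearest) / n   # expected vertical break
    return ex, ez
```

For a genuinely unorthodox arm slot, the k "nearest" neighbors are still far away, so the prediction drifts toward a generic movement profile – exactly the scarcity effect described above.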
During data exploration, I noticed that each pitch type had a very different relationship between release point and movement.


I knew based on these distributions that I would have to build separate models for each pitch type to get predicted pitch movement on the correct scale. Excluding knuckleballs and screwballs (and grouping forkballs with splitters, and knuckle-curves with curveballs), I created a pfx_x model and a pfx_z model for each pitch type using only the release point x and z coordinates. I then appended the results to a data frame and, for each pitch, calculated the Euclidean distance between its actual movement and its predicted movement. I rolled this up to the pitcher-season level by taking a weighted sum of each pitcher’s pitch type deviations, weighted by the frequency with which he throws each pitch type. This became the final unexpectedness score.
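The rollup – per-pitch deviation between actual and predicted movement, weighted by pitch type usage – can be sketched as follows (hypothetical Python helper, assuming the per-pitch-type models have already produced the predictions):

```python
import math

def unexpectedness(pitch_rows):
    """`pitch_rows` is a list of (pitch_type, actual_mov, predicted_mov)
    tuples, where each movement value is a (pfx_x, pfx_z) pair. Returns the
    usage-weighted mean deviation between actual and expected movement.
    Illustrative sketch of the rollup only.
    """
    by_type = {}
    for pt, (ax, az), (px, pz) in pitch_rows:
        by_type.setdefault(pt, []).append(math.hypot(ax - px, az - pz))
    total = len(pitch_rows)
    return sum(
        (len(devs) / total) * (sum(devs) / len(devs))  # usage share * mean deviation
        for devs in by_type.values()
    )
```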
Overall Pitcher Deception
Finally, with all three sub-metrics in hand, I can calculate a pitcher-season’s overall deception. First, I used min-max scaling to get unpredictability, indistinguishability, and unexpectedness on equal scales (from 0 to 1) while retaining the shape of the distribution. I knew it would be disingenuous to pretend each of the three are equally important and simply take a sum to arrive at deception, so my final task was to decide how to weight each of the three.
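Min-max scaling is a standard transform; for completeness, a one-line Python sketch:

```python
def min_max_scale(xs):
    """Rescale a list of metric values to [0, 1] while preserving the
    shape of the distribution (standard min-max scaling)."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]
```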
I decided to run a multiple linear regression of each pitcher-season’s three metrics against a result variable that largely captures the desired effect of deception – called strikes plus whiffs (CSW%). I took the absolute value of each input’s t-value, representing its "importance" in predicting CSW%, and used these as the weights. The final deception equation became:
*Deception = 5.665 Unpredictability + 2.191 Indistinguishability + 6.876 Unexpectedness*
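With the scaled sub-metrics in hand, the final combination is simply a weighted sum of the three using the coefficients above (`deception_score` is a hypothetical helper, not the author’s R code):

```python
# Weights reported in the article (absolute t-values from the CSW% regression).
WEIGHTS = (5.665, 2.191, 6.876)

def deception_score(unpredictability, indistinguishability, unexpectedness):
    """Combine the three min-max-scaled sub-metrics into overall deception."""
    w1, w2, w3 = WEIGHTS
    return w1 * unpredictability + w2 * indistinguishability + w3 * unexpectedness
```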
Indistinguishability turned out to be much less important in predicting CSW% than the other two sub-metrics. This is not entirely surprising, as release point tunneling is only one aspect of true tunneling. In order to get a better estimate, one could map the trajectory of consecutive pitches and calculate the Euclidean distance at the point in trajectory where the hitter must decide whether or not to swing.
To this point, I am not aware of any comparable publicly available pitcher deception metric to validate against. We can, however, look at the metric’s year-over-year correlation with itself. Note that deception was further normalized from 0 to 100 for interpretability.


With relatively consistent and strong correlations year over year, it appears that deception is a repeatable skill that pitchers possess. Here are the much-anticipated top 10 leaders in deception from 2018–20:

And the top 10 leaders from the 2020 season specifically:

Interestingly, seven of the top 10 pitchers featured below-average fastball velocity in 2020. Yet of the six who qualified in xwOBA, four of these deception leaders were well above average in that regard. It appears that the deception metric helps capture high performers who don’t rely on pure stuff to dominate.
Overall, I believe this passes the eye test for the most part: Tyler Rogers has a highly unorthodox submarine delivery, and side-armers Zamora, Suter, McFarland and Pazos have visual deception in their deliveries. Meanwhile, high-spinners like Roe, Karinchak, and Maton post some of the highest "unexpected movement" scores, which vault their overall deception.
It’s an almost-impossible task to perfectly encapsulate pitcher deception, an inherently visual quality, solely with numbers. This deception metric is certainly flawed in that regard, and is certainly lacking other elements under the "deception" umbrella (such as a pitcher’s ability to "hide the ball" in his delivery), but I believe it to be an adequate first-pass at the subject.
And without further ado, please enjoy some pure, unfiltered deception in action.
References:
Asel, John. "Thinking About Deception." BaseballCloud Blog, 15 July 2020, baseballcloud.blog/2020/07/15/thinking-about-deception/.
Sarris, Eno. "An Attempt to Quantify Pitcher Deception." FanGraphs Baseball, blogs.fangraphs.com/an-attempt-to-quantify-pitcher-deception/.