Outfield Positioning Models
Summary
We run two different models for outfield positioning:
The first one knows nothing about how fielders were positioned and estimates the probability that a fly ball would be caught from the batted ball characteristics alone.
The second one estimates the probability that an average MLB OF would catch a batted ball given where he started.
The difference between the p(out) from these two models (the second model's estimate minus the first model's) is the value we assign to the OF's positioning, which can be expressed in outs and linear weight runs.
Both of these models were originally trained on data from the 2016-2018 MLB regular seasons.
Model #1
This model estimates the out probability for a batted ball without knowing anything about how the fielders were positioned.
Inputs
Where the ball landed
Hang time
How close the ball got to the wall, or how high up it hit or would have hit the wall
Model #2
This model estimates the probability that an average MLB OF would catch a batted ball given where he was positioned.
Inputs
Where the OF started at pitch release
Where the ball landed
Hang time
How close the ball got to the wall, or how high up it hit or would have hit the wall
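One way to picture how the two input sets relate is the sketch below. It is an illustration only: the class names, field names, coordinate system and units are all assumptions, and the wall input is split into two fields purely for readability. The point it shows is that Model #2 takes everything Model #1 takes, plus the OF's starting position.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BattedBallFeatures:
    """Inputs to Model #1 (positioning unknown). Field names are hypothetical."""
    landing_x_ft: float      # where the ball landed
    landing_y_ft: float
    hang_time_s: float       # hang time
    wall_distance_ft: float  # how close the ball got to the wall
    wall_height_ft: float    # how high it hit, or would have hit, the wall


@dataclass(frozen=True)
class PositionedBattedBallFeatures(BattedBallFeatures):
    """Inputs to Model #2: the Model #1 inputs plus the OF's start at pitch release."""
    start_x_ft: float
    start_y_ft: float


model_1_inputs = {f.name for f in fields(BattedBallFeatures)}
model_2_inputs = {f.name for f in fields(PositionedBattedBallFeatures)}
# The only inputs unique to Model #2 are the starting-position fields.
print(sorted(model_2_inputs - model_1_inputs))
```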
Example
Imagine a given batted ball that the first model thinks would be caught only half the time. p(positioning unknown out) = 50%.
But now we see the fielder was well positioned and, factoring that in, the second model estimates it will be caught 95% of the time. p(positioning dependent out) = 95%.
Whether the fielder catches it or not, his positioning value is 0.45 outs because that's the gain in out probability from factoring in his starting position at pitch release.
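The arithmetic in this example can be sketched in a few lines of Python. The function and argument names are hypothetical, not names from the production models. The sketch only shows the subtraction, and that the result does not depend on whether the ball was actually caught.

```python
def positioning_value_outs(p_out_unknown: float, p_out_dependent: float) -> float:
    """Positioning value in outs: Model #2's p(out) minus Model #1's p(out).

    p_out_unknown   -- p(positioning unknown out), from the first model
    p_out_dependent -- p(positioning dependent out), from the second model
    """
    for p in (p_out_unknown, p_out_dependent):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return p_out_dependent - p_out_unknown


# The example above: 50% with positioning unknown, 95% given the starting position.
value = positioning_value_outs(0.50, 0.95)
print(round(value, 2))  # 0.45
```

A poorly positioned fielder gets a negative value by the same calculation, for example when the second model's estimate is lower than the first's.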
Key Metrics
OFPosOutsPM - Outs added or lost due to outfield positioning
OFPosLWRnPM - Runs scored or prevented due to outfield positioning
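As a rough illustration of how per-ball positioning values could roll up into totals like these, the sketch below sums the out-probability gains and weights each one by a run value. This is an assumption about the aggregation, not the documented calculation: the BallInPlay type, the run_value_hit_minus_out field, the sign convention (positive means runs prevented) and the numbers are all placeholders, not real linear weights.

```python
from typing import Iterable, NamedTuple, Tuple


class BallInPlay(NamedTuple):
    p_out_unknown: float            # first model's p(out)
    p_out_dependent: float          # second model's p(out)
    run_value_hit_minus_out: float  # hypothetical run cost of a hit vs. an out


def positioning_totals(balls: Iterable[BallInPlay]) -> Tuple[float, float]:
    """Return (outs added, runs prevented) summed over a set of batted balls."""
    outs = 0.0
    runs = 0.0
    for ball in balls:
        delta = ball.p_out_dependent - ball.p_out_unknown
        outs += delta
        # Assumed conversion: each extra expected out is worth the run gap
        # between the hit and the out on that ball.
        runs += delta * ball.run_value_hit_minus_out
    return outs, runs


# Placeholder data: one well-positioned play, one poorly positioned play.
sample = [
    BallInPlay(0.50, 0.95, 1.0),
    BallInPlay(0.80, 0.60, 0.5),
]
total_outs, total_runs = positioning_totals(sample)
```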