Rewarding Off-Ball Occupation Using Shot Freeze Frames

18 min read · Jul 7, 2024

Oh, the work that goes unnoticed in a log of events. How much of that work can be recognized with a picture at a crucial point? Still blind to movements and their full effects, what if a picture is all you have?

The picture, in this case, is a StatsBomb shot freeze frame. Seemingly overlooked by many in terms of application to new measures, these freeze frames contain a combination of data type and scale that is not really available anywhere else in StatsBomb’s event data or the public sphere in general.

With the positions of off-ball teammates and opponents (within the visible area) recorded at the point of each shot, and all players crucially identified, we are offered a window into these inherently valuable attacking events. Is it just context for xG calculations, though? Or can the information, while limited, allow us to broaden the scope of the event to include those who would otherwise be deemed completely uninvolved? In doing so, can we open the door for those players to be rewarded on a new set of outcomes?

That would guide my objective here: reward attackers who played important off-ball roles in manufacturing the conditions which were conducive to a teammate attempting a shot.

Shot Prerequisites

In the early stages of the process, it became apparent that a shot itself must pass certain prerequisites for this type of analysis to be applied logically. If an attempt failed to meet any of the following conditions, no reward was given to any off-ball attacker:

Shot was not directly from a set piece
Shot was not within 15 seconds of any attacking corner kick
Shot was not within 15 seconds of a free kick or throw-in into the attacking penalty area
Shot did not follow a dribble
Shot was not an aerial duel won
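The prerequisites above can be sketched as a single filter. This is a minimal illustration, not the code behind the numbers in this piece: the ShotContext fields are hypothetical pre-computed flags rather than StatsBomb field names, and treating "within 15 seconds" as inclusive of exactly 15 is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

RESTART_WINDOW_SECONDS = 15.0  # the 15-second window from the list above


@dataclass
class ShotContext:
    """Hypothetical pre-computed flags for one shot (not StatsBomb field names)."""
    direct_from_set_piece: bool
    seconds_since_corner: Optional[float]       # None if no earlier attacking corner
    seconds_since_box_restart: Optional[float]  # free kick or throw-in into the box
    followed_dribble: bool
    aerial_won: bool


def _inside_window(seconds: Optional[float]) -> bool:
    # Assumption: exactly 15 seconds still counts as "within 15 seconds".
    return seconds is not None and 0.0 <= seconds <= RESTART_WINDOW_SECONDS


def shot_qualifies(ctx: ShotContext) -> bool:
    """True only if the shot passes every prerequisite."""
    return not (
        ctx.direct_from_set_piece
        or _inside_window(ctx.seconds_since_corner)
        or _inside_window(ctx.seconds_since_box_restart)
        or ctx.followed_dribble
        or ctx.aerial_won
    )
```

A shot that fails this check hands out no occupation value at all, so it makes sense to run it before any freeze-frame processing.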

There was no base desire for an open-play-only measure, but there is simply too much difference in function on set pieces. The penalty area is loaded freely with the ball not in play, before players make increasingly choreographed movements in an attempt to manufacture a chance. This does not fit with the outlined objective, as the onus is not truly on individuals to create these conditions (individuals certainly matter for reading and attacking the ball itself, of course).

The last two constraints are, admittedly, more flexible in my mind. If a shooting player manufactured the conditions for a shot themselves, I wanted to keep any reward limited to them. Beating an opponent with the ball or rising up above a tight contest to win on the vertical axis fits that description to me, but I recognize this as an area open to interpretation.

Off-ball occupation can certainly open the game up for a 1-on-1 or vacate space in the box that makes an aerial duel easier to attack, but I believe that would be stretching the aspect of play we want to measure beyond its true limit and value with only the freeze frames at our disposal.

Assigning “Occupation Value”

With the wide range of qualifying shots, the metric I envisioned would capture the overall threat manufactured, as well as the varying importance of each off-ball individual’s role in that chance.

For that first aspect, I took the path of least resistance and used the shot xG values right there in the data. This means that the overall reward given out for a shot is tied to the shot’s danger.

During initial brainstorming, I considered only rewarding on attempts where the shooter was fully “open” as defined by a certain distance to the closest defender. In practice, however, “openness” with regard to shooting manifests much more in subtle pockets than actual free spaces, making that nonviable as a constraint. With StatsBomb’s xG, though, the space the shooter is in and the lack of traffic in their angle to goal can be a determining factor in the extent of the reward: a bonus rather than a qualifier.

There is a bit more complexity involved in divvying up that value. When experimenting with Attributing xG allowed to individual defenders, Ricardo Tavares focused on the shooter and defender(s) who appeared to be directly marking them, weighting xG based on distance to the ball and giving it as a negative reward to players within a marking zone. Those exact confines do not fit here with the objective to look away from the ball, but I ended up employing some similar concepts.

In trying to establish “who gets what”, I first decided that some (many, maybe) should get nothing. Value needed to be directed to those who appeared to be truly impacting play the way I had outlined. If you read the shot prerequisites and started thinking, “What about a player being through on goal or [insert context]?”, many such situations were taken into consideration here.

For an off-ball attacker to be eligible for reward, they first must have been within a maximum distance behind the shooter to prevent value being assigned to players simply dallying behind the play. This distance is fluid depending on the location of the shot, calculated with the following formula:

5 yards + 1/3 * (shot height - edge of penalty area height)

*this is using StatsBomb’s 120x80 yard pitch coordinates, where 102 is the edge-of-penalty-area height

For a shot from the edge of the penalty area, attackers within five yards back can be eligible. We often see players in different lanes taking up slightly staggered positions, and this also shows some leniency for the shooter advancing a bit when setting up their attempt. As the distance to goal shrinks, though, I wanted to expand that cushion to reflect the fluctuating demands of occupation. A major part of creating close-range chances is having players arriving simultaneously to cutback-type positions or occupying a defender while the eventual shooter slips in for a ball over the top or in behind. The attempt to capture this fluctuation is illustrated here…

As tap-in range is approached, the eligible distance pushes further back towards the height of the penalty spot. Of course, this also means that as you move further back from the edge of the box and get to shooting from range, that eligible distance behind the ball shrinks.
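As a quick worked version of the formula, here is a small sketch using the 120x80 coordinates, where "height" is the x coordinate. The function and constant names are mine, and the formula is applied exactly as written: I have not assumed any floor for very long-range shots, where the result drops to zero or below.

```python
PENALTY_AREA_EDGE_X = 102.0  # edge-of-penalty-area height on the 120x80 pitch
BASE_CUSHION_YARDS = 5.0


def max_distance_behind(shot_x: float) -> float:
    """5 yards + 1/3 * (shot height - edge of penalty area height)."""
    # Note: for shots taken from x below 87 this goes negative; the formula
    # is applied as written, with no clamping assumed.
    return BASE_CUSHION_YARDS + (shot_x - PENALTY_AREA_EDGE_X) / 3.0


def within_height_window(attacker_x: float, shot_x: float) -> bool:
    """True if the attacker is ahead of, level with, or close enough behind the shooter."""
    return (shot_x - attacker_x) <= max_distance_behind(shot_x)


# A few reference points:
#   shot from x = 102 (edge of the box)  -> 5 yards of cushion
#   shot from x = 114 (six-yard line)    -> 9 yards of cushion
#   shot from x = 90  (12 yards outside) -> 1 yard of cushion
```

So a teammate standing at x = 98 is inside the window for a shot from the edge of the box (4 yards back), while one at x = 96 is not.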

As an additional requirement for eligibility, the attacker must be occupying at least one opponent. This condition was met if (1) there was an outfield defender within five yards of the attacker, and (2) the distance between that defender and the attacker was less than the distance between that defender and the shooter.

Five yards has been used elsewhere in StatsBomb data as a threshold for pressure, and I think it works well as a “marking distance” with some tolerance for the defender potentially starting to disengage in reaction to a pass or cross going into the shooter. The second piece is an attempt to avoid rewarding “redundant occupation” in such a similar zone to the shooter that it essentially had little impact on the defender.
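The two-part occupation test can be sketched as below. This is a minimal illustration under a few assumptions: locations are plain (x, y) tuples in yards, the goalkeeper has already been removed from the list of defenders, and "within five yards" is treated as inclusive.

```python
import math
from typing import Iterable, Tuple

Location = Tuple[float, float]

MARKING_DISTANCE_YARDS = 5.0


def is_occupying(attacker: Location, shooter: Location,
                 outfield_defenders: Iterable[Location]) -> bool:
    """True if at least one outfield defender is (1) within marking distance
    of the attacker and (2) closer to the attacker than to the shooter."""
    for defender in outfield_defenders:
        to_attacker = math.dist(defender, attacker)
        to_shooter = math.dist(defender, shooter)
        if to_attacker <= MARKING_DISTANCE_YARDS and to_attacker < to_shooter:
            return True
    return False
```

The second condition is what filters out the "redundant occupation" case: a defender standing between the two attackers, but nearer to the shooter, does not count for the off-ball player.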

We are still blind to the type of occupation, whether that be pinning, fully dragging a defender, etc. Thus, it is unavoidable that all occupation is viewed under the same umbrella.

Finally, there is no reward given to the player who assisted the shot (if applicable). That player’s contribution is already captured in our plethora of creative metrics, and I find it quite unfair to compare the pull or gravity of the ball and the act of holding-then-releasing to that of occupying off-ball.

Those criteria leave us with our set of eligible off-ball attackers between which the value will be distributed (there could be nobody, of course). For making these allotments, I had initially thought of simply distributing the value evenly: if there are three qualifying players, reward them each a third of the shot’s xG value. Splitting means that players are rewarded more for clearing out space themselves than if their team had systematically loaded advanced areas with numbers. However, leaving the rewards even, in addition to being boring, failed to reflect my view that not all occupation has the same weight or impact on the opposition.

The measure I needed would not only reflect how I believe this weight changes with proximity to goal, but also capture the disproportionately rising nature of that weight. Ultimately, I settled on utilizing an expected possession value (EPV) grid to weight the reward assigned to each player.

EPV measures the probability of a possession resulting in a goal, in this case determined by the zonal position of the ball. So yes, I am extrapolating a ball-location-based value metric onto off-ball player positions, but it is a great fit for the desired function. The grid I used was created by Laurie Shaw a few years back, and if you are unfamiliar with the nature of these things, you can get a sense of how the values range in the attacking half…

The way those numbers heat up sharply as you progress into higher-threat zones in front of goal was exactly what I was looking for. On each shot, the EPV of all eligible attackers is summed, and each individual’s percentage of that total determines the percentage of the shot’s xG rewarded to them as “occupation value.”
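To make the weighting concrete, here is a rough sketch of the lookup and the split. The grid layout is an assumption on my part (a list of rows across the pitch width, with columns running towards the attacking goal), and epv_at and distribute_occupation_value are illustrative names, not functions from any library.

```python
from typing import Dict, Sequence, Tuple

PITCH_LENGTH = 120.0
PITCH_WIDTH = 80.0


def epv_at(location: Tuple[float, float],
           grid: Sequence[Sequence[float]]) -> float:
    """Look up the EPV cell containing a 120x80 pitch location.

    Assumption: grid[row][col], rows spanning the width (y), columns
    spanning the length (x) towards the attacking goal.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    x, y = location
    col = min(max(int(x / PITCH_LENGTH * n_cols), 0), n_cols - 1)
    row = min(max(int(y / PITCH_WIDTH * n_rows), 0), n_rows - 1)
    return grid[row][col]


def distribute_occupation_value(shot_xg: float,
                                attacker_epv: Dict[str, float]) -> Dict[str, float]:
    """Split the shot's xG between eligible attackers in proportion to EPV."""
    total = sum(attacker_epv.values())
    if total <= 0:
        # Nobody eligible (or all zero-valued zones): nothing to hand out.
        return {name: 0.0 for name in attacker_epv}
    return {name: shot_xg * epv / total for name, epv in attacker_epv.items()}


# Two attackers in equally valued zones split the xG evenly.
even_split = distribute_occupation_value(0.112, {"Attacker A": 0.163, "Attacker B": 0.163})
# each share is roughly 0.056
```

Because the shares are proportions of the summed EPV, the rewards on any one shot always add back up to that shot's xG.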

Benjamin Larrousse applied “hypothetical” xG based on the positions of off-ball attackers in a somewhat similar way in his investigation into Improving Decisionmaking for Shots, attempting to gauge a player’s appeal as an actual option for the ball if the shooter had instead attempted to pass.

Quick Examples

Example 1:

Brazil vs. Serbia | 2022 World Cup | 55th Minute | Shooter: Neymar

Shot xG: 0.112

Qualifying Off-Ball Attackers:

Richarlison
Raphinha

Attacker EPVs (% of total):

Richarlison: 0.163 (50%)
Raphinha: 0.163 (50%)

Occupation Value Rewarded:

Richarlison: 0.056
Raphinha: 0.056

Richarlison (near post) and Raphinha (far post) combine to push the Serbia back line towards the six-yard box, allowing Neymar to sit off on the penalty spot for a square ball to feet. In this case, they end up in equally valued zones with the symmetrical nature of the EPV grid, meaning the occupation value is actually split evenly.

Example 2:

Brazil vs. Switzerland | 2022 World Cup | 83rd Minute | Shooter: Casemiro

Shot xG: 0.069

Qualifying Off-Ball Attackers:

Vinicius Junior
Gabriel Jesus
Antony

Attacker EPVs (% of total):

Vinicius Junior: 0.054 (15.4%)
Gabriel Jesus: 0.160 (45.7%)
Antony: 0.136 (38.9%)

Occupation Value Rewarded:

Vinicius Junior: 0.011
Gabriel Jesus: 0.031
Antony: 0.027

Here we can see more of the weighted reward distribution process in play. Vinicius gets some credit from his deeper position in the box, but more value is allotted to Jesus and Antony pinning defenders further back. Jesus’ occupation in a slightly more threatening central zone completes the hierarchy for value distribution.

In addition to showing the process in practice, these examples may well illustrate the potential for disagreement and discrepancy from shot to shot. The goal is to reward behavior that is generally valuable, but the range of views on the extent to which certain behavior is valuable may be vast. Ultimately, for a proxy measure adapting to our knowingly limited data, the most important thing is creating a base on which logical insights can hopefully be found.

Broadscale Application

The closest we can get to replicating recruitment or player identification conditions with StatsBomb’s public releases is to use their data for the 2015/16 season across each of the big five European leagues. We can see the top overall names that would be flagged by occupation value and occupation value per 90…

Though you will see he still fares pretty well, is there still room in this world to appreciate a Messi-test-inapplicable metric? Anyways, we see a lot of familiar penalty area presences popping up in these lists, and they hint at the expected distribution of the metric, which skews pretty heavily towards the most advanced positions…

Important to note that this is with many teams having only one center forward on the field and two or three attacking mids/wingers.

When using a metric like this in the search for talent, we of course tend to look at value accumulated relative to these position groups to find further standouts. We can view separate rankings for center forwards, attacking midfielders and wingers, and center midfielders, with the classifications based on a player’s position at the time of each shot, not their singular primary position over the course of the season…
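For anyone wanting to reproduce this kind of split, the bookkeeping might look something like the sketch below. It assumes each reward has already been tagged with the player's position group at the time of the shot; the tuple layout, the group labels, and the function names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def total_by_position_group(
    rewards: Iterable[Tuple[str, str, float]]
) -> Dict[Tuple[str, str], float]:
    """Sum occupation value per (player, position group at time of shot).

    Each reward is a hypothetical (player, position_group, value) tuple.
    """
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for player, position_group, value in rewards:
        totals[(player, position_group)] += value
    return dict(totals)


def per_90(total_value: float, minutes_played: float) -> float:
    """Scale a season total to a per-90-minutes rate."""
    if minutes_played <= 0:
        return 0.0
    return total_value / minutes_played * 90.0
```

Keying on the position at the time of each shot means a player who splits time between roles shows up in more than one ranking, with only part of their total in each.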

In these rankings, you will notice a plethora of players from Europe’s territorially dominant sides. With any metric reliant on the progression of the ball into dangerous attacking zones, there is also plenty of opportunity to look for outliers relative to their environment and opportunity. There are many such interesting cases who did not appear in any of the prior rankings…

Romelu Lukaku: 7.09 occupation value. Highest on Everton, 1.7x more than closest teammate, 23rd in overall value
Andrea Belotti: 6.80 occupation value. Highest on Torino, 1.7x more than closest teammate, 26th in overall value
Wissam Ben Yedder: 5.97 occupation value. Highest on Toulouse, 1.2x more than closest teammate (Martin Braithwaite!), 33rd in overall value
Timo Werner: 5.32 occupation value. Highest on Stuttgart, 1.5x more than closest teammate, 49th in overall value
Aritz Aduriz: 4.87 occupation value. Highest on Athletic Club, 2.1x more than closest teammate, 64th in overall value
Sadio Mané: 3.70 occupation value. Only behind Graziano Pellè on Southampton, 17th in value as an attacking midfielder or winger
Georginio Wijnaldum: 4.50 occupation value. Only behind Aleksandar Mitrović on Newcastle, 24th in value as an attacking midfielder or winger (with some value coming from center midfield as well)
Dele Alli (at a big club, but deserving of a shout-out with some of the positional splitting doing him a bit wrong): 4.41 occupation value. 3rd in Tottenham squad, 30th in value as an attacking midfielder or winger, 39th in value as a center midfielder

All of this information should paint a decent picture of the profiles who can accumulate exceptional value with this approach. On the other hand, there are some big names for whom occupation is deemed an area of lower impact relative to their overall attacking output and environment…

Lionel Messi: 4.87 occupation value. 3rd in Barcelona squad, significantly lower than Suárez and Neymar, 9th in value as an attacking midfielder or winger (good, but speaking relatively…)
Ángel Di María: 2.81 occupation value. 5th in PSG squad, behind Blaise Matuidi while playing slightly more, 66th in value as an attacking midfielder or winger
Eden Hazard: 2.62 occupation value. 2nd in Chelsea squad, but not a great deal ahead of Oscar or Willian, 57th in value as an attacking midfielder or winger

Many of the on-ball brand of genius will not pop here, though we can see the potentially underappreciated robust side of Neymar’s game in the MSN trio. This is a metric for box threats, big men who force the defense to collapse towards the hoop, crafty nuisances, and crashers.

Would it allow you to identify a new type of player beyond your npxG, receiving xT/EPV, and other box activity stats? Probably not. The potential value, though, would be adding another piece to that puzzle by turning these would-be non-outcomes into outcomes reflective of another aspect of these archetypes’ play.

Another thing I will say is this: 2015/16 is recent enough for most of us to have a solid recollection of player profiles, but there has been some major shifting in how teams approach the game. I will utilize Bayer Leverkusen’s 2023/24 Bundesliga data a bit more later, but here is how the core attacking members of Xabi Alonso’s side rank:

While the nine in Victor Boniface still leads the pack, Jeremie Frimpong’s per 90 tally would be good enough for 23rd in the 2015/16 rankings — going toe to toe with someone like Edinson Cavani. His 5.95 occupation value accumulated as a full back or wing back is 2.3x greater than the highest in 2015/16 (Diego Laxalt — one of the “almost wingers” of yesteryear).

The Dutchman offered solid direct threat himself (0.33 npxG per 90), but significantly less than almost every other name near him on that 2015/16 list. Only Erik Lamela is really comparable, and the Argentine is the big outlier from that season in this regard. To my knowledge, Frimpong’s overall width is also unmatched by anyone close.

The point is, I do not see a clear equivalent to the Leverkusen wing back in the 2015/16 data. His performance raises an interesting question about whether this metric could have greater use for flagging more unique skill sets in a game that is requiring more and more creative solutions to break down defenses. Beyond tracking general repeatability and scalability, this is what would intrigue me the most if I had a greater breadth of recent data. Frimpong may just be the Lamela outlier of this past season (a real string of words there), or he could be indicative of greater trends that might give a metric like this more interesting utility in player identification.

Synergy Through a Different Lens

The most fun “offshoot” application of occupation value came in applying it to player relationships. Given the information broadly available in event data, data-led discussions of synergy between attackers tend to focus heavily on the relationship between passer and receiver, or assister and scorer.

While not my initial goal, I realized the concept of occupation provides a different lens through which an attacking relationship can be viewed. Take the highest-valued pairings of occupying player and the teammate whose shot they helped open up…

All of a sudden, the effectiveness of a duo like Benzema and Ronaldo can be represented in a whole different light. Beyond what players create for a teammate in terms of the ball moving between them, we have a glimpse at how they manufacture opportunities for one another through their spacing and manipulating off the ball. It would make sense for this aspect of a synergistic relationship to be more of a two-way street as well, relying on a shared understanding more than specialized abilities that fit together in a sequence.
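A pairing table like this is just a regrouping of the same per-shot rewards. As a rough sketch, assuming each reward is stored as a hypothetical (occupier, shooter, value) tuple, you can total the one-way value and also fold both directions together to look at the "two-way street" idea:

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Tuple

Reward = Tuple[str, str, float]  # hypothetical (occupier, shooter, value) layout


def directed_pair_totals(rewards: Iterable[Reward]) -> List[Tuple[Tuple[str, str], float]]:
    """Occupation value per (occupier, shooter) pair, highest first."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for occupier, shooter, value in rewards:
        totals[(occupier, shooter)] += value
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def two_way_pair_totals(rewards: Iterable[Reward]) -> Dict[FrozenSet[str], float]:
    """Occupation value per duo, regardless of who occupied and who shot."""
    totals: Dict[FrozenSet[str], float] = defaultdict(float)
    for occupier, shooter, value in rewards:
        totals[frozenset((occupier, shooter))] += value
    return dict(totals)
```

Comparing the two directions for the same duo gives a quick read on how balanced the relationship is.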

Benzema “clears a path” for Ronaldo trailing him into the box for a cutback. I liked this example because Benzema quite literally provides a pancake block on one of the center backs like an NFL offensive lineman.

The direct application of this lens may be limited in an area like recruitment. Nobody is really signing two attackers from the same club, and, while the data was not there for me to try anything like this, I cannot even envision a path towards predicting fluctuations depending on new specific teammate combos. That being said, this view into a different aspect of a player’s relationships can certainly offer further indication of what is enabling them to thrive in their environment. More accurately reflecting the extent of importance of any additional part of that picture can be valuable.

Situational + Individual Application Potential

With tracking data, you hear analysts discuss the more focused gains that can be made by simply flagging and grouping game situations. The occupation value approach could provide a similar platform for spotting trends which may be aiding in chance creation, and then building more efficiently into video.

We can put the spotlight back on Jeremie Frimpong to try this out. Say you have a team objective to create “gold zone” chances inside the penalty area, and within the width of the six-yard box (in 2024, you just might). To investigate common behaviors from your attacking pieces which may be conducive to fulfilling this objective, you can start by tying occupation value to the player’s actual positioning on these desired attempts. Start with a snapshot of the activity of your dynamic right wing back…

There is some potential for general grouping — Frimpong has a flurry in the right-central area between the six-yard box and penalty spot, or you may see that as the end of a diagonal funnel of activity coming inside from the edge of the penalty area. You could take these plays altogether and investigate if you wanted, but you can easily get more specific with the data to cater to more detailed objectives.

For the sake of this walkthrough, you are particularly interested in how Frimpong can provide value as a crasher, giving you an additional body on cross or cutback situations against deep blocks. It is very straightforward to implement additional filters for shots where the last pass came from the left flank, and, using additional freeze frame information, there were 6+ defenders in the penalty area.
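Those filters might be sketched as follows. Several things here are assumptions rather than anything confirmed above: the freeze-frame entries are shaped like dictionaries with "location", "teammate", and "position" keys, low y values are taken to be the attacking team's left side, the left flank is defined as anything outside the width of the penalty area (LEFT_FLANK_MAX_Y), and the goalkeeper is not counted among the defenders.

```python
from typing import Dict, List, Sequence

PENALTY_AREA_MIN_X = 102.0
PENALTY_AREA_Y = (18.0, 62.0)  # penalty area width on the 120x80 pitch
SIX_YARD_Y = (30.0, 50.0)      # width of the six-yard box
LEFT_FLANK_MAX_Y = 18.0        # assumption: left flank = outside the box width


def in_penalty_area(location: Sequence[float]) -> bool:
    x, y = location
    return x >= PENALTY_AREA_MIN_X and PENALTY_AREA_Y[0] <= y <= PENALTY_AREA_Y[1]


def in_gold_zone(location: Sequence[float]) -> bool:
    """Inside the penalty area and within the width of the six-yard box."""
    x, y = location
    return x >= PENALTY_AREA_MIN_X and SIX_YARD_Y[0] <= y <= SIX_YARD_Y[1]


def count_defenders_in_box(freeze_frame: List[Dict]) -> int:
    """Opposing outfield players inside the penalty area (goalkeeper excluded)."""
    return sum(
        1
        for player in freeze_frame
        if not player["teammate"]
        and player["position"]["name"] != "Goalkeeper"
        and in_penalty_area(player["location"])
    )


def is_target_situation(shot_location: Sequence[float],
                        last_pass_start: Sequence[float],
                        freeze_frame: List[Dict],
                        min_defenders: int = 6) -> bool:
    """Gold-zone shot, last pass from the left flank, crowded penalty area."""
    return (
        in_gold_zone(shot_location)
        and last_pass_start[1] < LEFT_FLANK_MAX_Y
        and count_defenders_in_box(freeze_frame) >= min_defenders
    )
```

The thresholds are deliberately left as module-level constants so the definition of "flank" or "crowded" can be tightened without touching the logic.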

Frimpong’s high-valued occupations can then provide a reference for analyzing the true makeup of these plays, and you can move to observing them in action…

If you identify specific behaviors that you like, you can then think about encouraging them, building situations where they can occur more often, or using examples to guide other players. Maybe you find the gradual creeping inside beneficial, or you want to see harder horizontal bursts: whatever might allow the player to continually pull an opponent away and open a gap for a teammate, or at least force the defense to respect the back-post option and compensate.

The limitations of the data mean that you still have to rely on working back from successful outcomes. There is not really a way to say from freeze frames that, “Player X could have provided more value from this more optimal position/more optimal run.” A good deal of work is still needed with video to recognize actual interactions between different pieces, but the benefit here would be having something that the video can be tied to without tracking at your disposal.

You could theoretically employ a similar process to the above walkthrough when attempting to gauge the tendencies of a certain opponent. For the type of chances you want to manufacture or prevent being manufactured, the key off-ball occupiers can provide some meaningful context.

Additional Thoughts

At the end of the day, I feel that this approach to rewarding occupation meets the threshold for potentially useful without going much further. For a metric meant to adapt to the conditions of available data, I take this as a solid outcome.

I also feel pretty confident that valuing attacking occupation is the best off-ball avenue you can take with these shot freeze frames. While I applaud the efforts referenced in those two freeze-frame-utilizing pieces I found and would certainly encourage further work, I feel that the overarching points of focus just are not as optimal. It is way messier to try and assign blame defensively at the point of a shot compared to rewarding attackers. Working from the fact that a shot has already been created, we simply do not have the same references for what potentially advantageous defensive behavior looks like from individuals (collectively we can say we want pressure on the shot, numbers in front of goal, etc.). Similarly, in trying to theorize completely new potential decisions and actions, the utility of static information at the point of the existing event starts to wear quite thin.

In other words, attacking off-ball occupation seems to be easiest to identify, and most likely to capture behavior that is generally impactful over time.

If the data range allowed for longitudinal testing of stability and repeatability of performance, these are the main things that I would have looked into tweaking:

Stricter occupation criteria: The guidelines for occupation should be straightforward and should not utilize modeling based on tracking data that, in this context, we do not have. Within that, I will not act like I pushed the criteria to its limit in testing. Maybe there is some way (an angular aspect?) to further tighten up the identification of attackers who are truly influencing the makeup of play, rather than just having a defender hovering off of them. This will never be perfect, of course.
Single shot reward cap or scale: I like xG as the natural determinant of overall reward. It bases things in a widely understood context, and we, of course, want to reward more dangerous attacking play, which means better chances. But could it make more sense for occupation to be tied to chance quality only to a certain extent? If we think hey, the off-ball conditions created on this play were enough to create a good chance, but the exact delivery of the ball, where the shooter decided to hit it, and where the goalkeeper positioned themselves is not really up to those occupying players, could it be beneficial to flatten the importance of the xG values to some extent?
Time/distance carried limit for shooting player: I operated on the basis that if the conditions were present for a player to shoot without beating an opponent, those conditions were inherently advantageous. That said, there can be wonky situations where a cap on time or distance traveled with the ball pre-shot might help — say, Player X turns the corner and drives in for a shot, and Player Y gets rewarded for roaming in on the opposite side. Could that improve the metric overall?
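On the reward cap or scale idea specifically, here are two purely hypothetical ways the shot's xG could be flattened before it is split between occupiers. Neither was used for anything in this piece, and the cap and exponent values are placeholders to show the shape of each option, not recommendations.

```python
def capped_reward_pool(shot_xg: float, cap: float) -> float:
    """Hard cap: chance quality only counts up to a fixed ceiling."""
    return min(shot_xg, cap)


def flattened_reward_pool(shot_xg: float, exponent: float = 0.5) -> float:
    """Soft scale: an exponent below 1 compresses the gap between low-xG
    and high-xG shots while keeping their order intact."""
    if not 0.0 < exponent <= 1.0:
        raise ValueError("exponent should be in (0, 1] to flatten, not sharpen")
    return shot_xg ** exponent
```

With an exponent of 0.5, for example, a 0.25 xG chance yields a pool of 0.5 and a 0.04 xG chance yields 0.2, so the ratio between them drops from more than 6x to 2.5x. Note that the soft scale inflates every pool, so totals would no longer be on an xG scale.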