Investigating Team-Adjusted Statistics

Sam Gustafson
19 min read · Jan 2, 2021

Diving into an idea I had regarding that crucial question: How do we add more context to player statistics?

Before I get started, I’d just like to say that I don’t think anything I’ve done here is very advanced. My goal was to finally examine a concept that has interested me for a while and see if it yielded anything useful. I hope that this can inspire some form of further analysis or discussion. With that being said, let me take you through my investigation of team-adjusted statistics.

Popular Adjustment Methods

Now, I won’t say team-adjusted stats, which I will explain in just a second, are a fully original idea. However, I rarely see them used at all, and when I do, it’s almost always just for goals. In my time as a member of the online analytics sector, I have come across a few other methods for adjusting player data very often.

Per 90 Minutes

Example: A player takes 30 shots over the course of 900 minutes during the season. That equates to 3 shots per 90 minutes.

Measuring statistics per 90 minutes of playing time (or per 96 minutes, as American Soccer Analysis does) is definitely the most common method. It helps account for the fact that, more often than not, different players will not have accumulated the same amount of time on the pitch. This is a more basic conversion and it is in a different vein than the others I’ll discuss, as per 90 minute stats can still go through further adjustment.
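
As a minimal sketch of that conversion (the function name and the `match_length` parameter are made up for illustration, not taken from any particular library):

```python
def per_90(total: float, minutes_played: float, match_length: float = 90.0) -> float:
    """Scale a season total to a rate per `match_length` minutes played."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return total * match_length / minutes_played


# The example from the text: 30 shots over 900 minutes.
shots_per_90 = per_90(30, 900)

# The same tally on a 96-minute basis.
shots_per_96 = per_90(30, 900, match_length=96.0)

print(shots_per_90, shots_per_96)
```

The first call gives the 3 shots per 90 minutes from the example; the second shows how the same total looks on a 96-minute basis.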

Per x touches

Example: A player has tallied a total of 20 successful dribbles on 1,000 touches. That gives them an average of 2 successful dribbles per 100 touches.

An alternative to normalizing by minutes played is using a player’s touches. I have seen the likes of @utdarena, @DomC0801, @topimpacat, and @MishraAbhiA utilize this concept before. Whether it’s per 100 touches, per 50 touches, or just per touch, this can help identify players who maybe are doing a lot with little service. To my knowledge, this concept has only been applied to attacking metrics.
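
The same idea in code, as a small sketch (the function and parameter names are illustrative assumptions):

```python
def per_touches(total: float, touches: float, per: float = 100.0) -> float:
    """Scale a tally to a rate per `per` touches (100, 50, 1, ...)."""
    if touches <= 0:
        raise ValueError("touches must be positive")
    return total * per / touches


# The example from the text: 20 successful dribbles on 1,000 touches.
dribbles_per_100_touches = per_touches(20, 1000)

# The same tally expressed per 50 touches.
dribbles_per_50_touches = per_touches(20, 1000, per=50)

print(dribbles_per_100_touches, dribbles_per_50_touches)
```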

Possession-adjusted

A bit too complex for me to give a simple example, and there are multiple different methodologies.

Possession-adjustment on metrics, particularly defensive metrics, has become very popular. The logic behind this is that how a side possesses the ball can either inflate or limit the volume of actions a player can perform. For instance, a footballer in a heavy possession based system will be defending for less time on average, thus giving them fewer opportunities to make tackles, interceptions, etc.

@StatifiedF adjust stats in a “simpler” way, looking at averages for 30 minutes of possession. So defensively, this could be how many tackles a player makes per 30 minutes of opposition possession, while offensively it could be how many progressive passes a player completes per 30 minutes of their team’s possession. StatsBomb, on the other hand, use a more complex formula involving Euler’s number. To my knowledge, they also limit their use of possession-adjustment to defensive metrics.
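
To make the “per 30 minutes of possession” idea concrete, here is a rough sketch. It assumes possession minutes can be approximated as minutes played multiplied by a possession share; that approximation is mine, and I don’t know how @StatifiedF actually derive possession time. The StatsBomb formula is not reproduced here.

```python
def per_30_possession_minutes(total: float, minutes_played: float,
                              possession_share: float) -> float:
    """Rate per 30 minutes of possession.

    `possession_share` is the fraction of time the relevant side has the
    ball: the opposition's share for defensive actions, the player's own
    team's share for on-ball actions. Estimating possession minutes as
    minutes_played * possession_share is an assumption of this sketch.
    """
    possession_minutes = minutes_played * possession_share
    if possession_minutes <= 0:
        raise ValueError("possession minutes must be positive")
    return total * 30 / possession_minutes


# Hypothetical player: 40 tackles in 1,800 minutes for a side with 60% possession,
# so the opposition has the ball for roughly 40% of that time.
team_share = 0.60
tackles_padj = per_30_possession_minutes(40, 1800, 1 - team_share)

# Hypothetical on-ball example: 90 progressive passes in the same minutes.
prog_passes_padj = per_30_possession_minutes(90, 1800, team_share)

print(round(tackles_padj, 2), round(prog_passes_padj, 2))
```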

Why am I telling you about all this? First off, all of these other methods are related to what I did for team-adjustment in one way or another. I actually used per 90 stats directly to then perform team-adjustment, while the thinking behind the other two is very similar: try to add some context to a player’s performance based on how their team plays.

Also, this is one place where I could hopefully inspire some people out there. I have neither the time nor the resources to do a full comparison of all of these different methods. If someone thinks they can try that with basic per 90 stats, per touch stats, “basic” possession-adjusted stats, StatsBomb possession-adjusted stats, and team-adjusted stats (plus any more you can think of), that would be awesome to read.

In saying that, I don’t mean that someone should find out which one is the best. I don’t think that’s possible. Instead, it would be cool to see which do the best job at uncovering different player profiles, as well as which methods are best-applicable to different stats or different situations.

Now that I’ve bored many of you already and mentioned team-adjustment multiple times, I should probably explain how I went about it and what it showed.

Team-Adjusted Stats

Like I said, this isn’t something that I just came up with myself after hours of intense thought. You’ve probably seen the concept before on posts like, “Messi scored x percent of Barcelona goals last season.” Furthermore, I know I’ve seen @HemmenKees utilize a variant of this where he looked at a player’s performance relative to the average of players in that side. This is also a very smart way of looking at things, in my opinion, but I went a slightly different route.

Essentially, I wanted to look at the share of different tasks which each team must perform on the pitch that were allotted to each player. As I wanted to account for discrepancies in minutes played, this is the team-adjustment formula I used:

Player’s Team-Adjusted Tally = Player’s per 90 Tally/Team’s per 90 Tally

That’s it. Just converting each stat into a share of the team average. Say a player averages 1.5 passes into the penalty area per 90 minutes and their side averages 3.0. They get a team-adjusted tally (or score, rating, whatever) of 0.5.
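
Written as code, the formula is a single division. This is only a sketch; the function name is hypothetical:

```python
def team_adjusted(player_per_90: float, team_per_90: float) -> float:
    """Player's per 90 tally as a share of the team's per 90 tally."""
    if team_per_90 == 0:
        raise ValueError("team_per_90 must be non-zero")
    return player_per_90 / team_per_90


# The example from the text: 1.5 passes into the penalty area per 90
# for the player, 3.0 per 90 for the side.
tally = team_adjusted(1.5, 3.0)
print(tally)
```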

I guess the key characteristic I was really trying to highlight this way was responsibility. I wanted to find players who didn’t just perform a high volume of actions, but who were the main source of that action for their team — players who were responsible for driving play in different areas.

Before we get into the results, I will just point out a weakness in this analysis. Ideally, this would work best if you had access to the team data for only when a specific player was on the pitch. That way you’d be able to directly see stuff like, “This player provides 20% of the side’s shots on target when on the field.” Unfortunately, I can’t do this, which is why I have to use the player and team averages over the course of the season.
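
To illustrate the gap between the two approaches, here is a sketch with invented numbers. The `team_sot_while_on_pitch` figure is exactly the kind of data I don’t have, so it is purely hypothetical, and team minutes are simplified to 38 matches of 90 minutes.

```python
def per_90(total: float, minutes: float) -> float:
    return total * 90 / minutes


# Hypothetical season numbers, invented for illustration only.
player_sot, player_minutes = 12, 900      # shots on target, minutes played
team_sot, team_minutes = 152, 38 * 90     # team season total, 38 matches
team_sot_while_on_pitch = 60              # would need line-up aware event data

# What this piece uses: the ratio of season-long per 90 averages.
season_average_share = per_90(player_sot, player_minutes) / per_90(team_sot, team_minutes)

# The ideal version: the share of team output while the player is on the pitch.
on_pitch_share = player_sot / team_sot_while_on_pitch

print(round(season_average_share, 2), round(on_pitch_share, 2))
```

In this made-up case the season-average method gives 0.30 while the on-pitch share is 0.20, which is the kind of discrepancy the weakness above refers to.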

All in all, I still think there is a lot of value doing it that way, so let’s get into some results.

What does the data tell us?

I applied the concept to data for the big five European leagues in 2019/20 from Football Reference, via StatsBomb. The players I focused on for the analysis were outfield players who had played at least 1,000 minutes in that season. To start, I figured I’d run through some team-adjusted leaderboards for different metrics.

Non-Penalty xG

  1. Michail Antonio | West Ham United | 0.49
  2. Mario Balotelli | Brescia | 0.44
  3. Willian José | Real Sociedad | 0.43
  4. Adrien Hunou | Stade Rennais | 0.43
  5. Christian Benteke | Crystal Palace | 0.43

xG Assisted

  1. Adama Traoré | Metz | 0.58
  2. Florian Kainz | Köln | 0.44
  3. Dimitri Payet | Marseille | 0.40
  4. Denis Suárez | Celta Vigo | 0.38
  5. Lionel Messi | Barcelona | 0.38

Note: “Expected assists” here should be “xG assisted”, and “Scoring Responsibility” isn’t the best label, but I didn’t want to type out “Getting Into Good Goalscoring Positions Responsibility”.

Shot-Creating Dribbles

  1. Allan Saint-Maximin | Newcastle United | 0.90
  2. Gonçalo Guedes | Valencia | 0.87
  3. Michail Antonio | West Ham United | 0.73
  4. Raúl de Tomás | Espanyol | 0.66
  5. Lys Mousset | Sheffield United | 0.65

Shot-Creating Live-Ball Passes

  1. Adama Traoré | Metz | 0.28
  2. Jack Grealish | Aston Villa | 0.25
  3. Joaquín | Real Betis | 0.25
  4. Emi Buendía | Norwich City | 0.25
  5. Dejan Kulusevski | Parma | 0.25

Progressive Distance Passed

  1. Dante | Nice | 0.19
  2. Benjamin Hübner | Hoffenheim | 0.19
  3. Stefano Sabelli | Brescia | 0.19
  4. Damián Suárez | Getafe | 0.19
  5. Diego Rico | Bournemouth | 0.18

Progressive Distance Carried

  1. Allan Saint-Maximin | Newcastle United | 0.29
  2. Felipe Anderson | West Ham United | 0.27
  3. Gerard Deulofeu | Watford | 0.25
  4. Jack Grealish | Aston Villa | 0.24
  5. Wilfried Zaha | Crystal Palace | 0.24

Successful Pressures

  1. Arturo Vidal | Barcelona | 0.21
  2. Konrad Laimer | RB Leipzig | 0.21
  3. Lucas Deaux | Nîmes | 0.20
  4. Marvelous Nakamba | Aston Villa | 0.20
  5. Tom Trybull | Norwich City | 0.19

Interceptions

  1. Jerdy Schouten | Bologna | 0.32
  2. Lucas Leiva | Lazio | 0.30
  3. Benjamin André | Lille | 0.28
  4. Francesco Magnanelli | Sassuolo | 0.26
  5. Nicolas Höfler | Freiburg | 0.25

Blocked Shots

  1. Chris Smalling | Roma | 0.44
  2. Simon Kjær | AC Milan | 0.41
  3. Danilo Larangeira | Bologna | 0.38
  4. Tony Jantschke | Borussia Mönchengladbach | 0.38
  5. Unai Núñez | Athletic Bilbao | 0.37

Clearances

  1. Mats Hummels | Borussia Dortmund | 0.35
  2. Shane Duffy | Brighton | 0.34
  3. Kamil Glik | Monaco | 0.33
  4. Leandro Cabrera | Getafe | 0.33
  5. Xabier Etxeita | Getafe | 0.33

Rate statistics like pass completion rate can also be team-adjusted. In these cases, a tally of greater than 1 means that player’s rate is higher than the side’s, and less than 1 means that it is lower.
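
A quick sketch of the rate version, using made-up pass counts (the function name is hypothetical):

```python
def team_adjusted_rate(player_success: int, player_attempts: int,
                       team_success: int, team_attempts: int) -> float:
    """Player's success rate divided by the team's success rate."""
    player_rate = player_success / player_attempts
    team_rate = team_success / team_attempts
    return player_rate / team_rate


# Hypothetical numbers: the player completes 450 of 500 passes (90%),
# the side completes 7,500 of 10,000 (75%).
pass_completion_adj = team_adjusted_rate(450, 500, 7500, 10000)
print(round(pass_completion_adj, 2))
```

Here the result is above 1, meaning the player completes passes at a higher rate than the side as a whole.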

Pass Completion Rate

  1. Recio | Leganés | 1.20
  2. Jerry St. Juste | Mainz | 1.19
  3. André Hoffmann | Fortuna Düsseldorf | 1.19
  4. Eddy Gnahoré | Amiens | 1.19
  5. Kiko Olivas | Real Valladolid | 1.18

Pressure Success Rate

  1. Sidnei | Real Betis | 1.58
  2. Jeffrey Gouweleeuw | Augsburg | 1.56
  3. Maya Yoshida | Sampdoria | 1.52
  4. Tyrone Mings | Aston Villa | 1.51
  5. Stanley N’Soki | Nice | 1.51
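
For anyone wanting to build similar leaderboards, the steps (filter by minutes, convert to per 90, divide by the team’s per 90, sort) could be sketched like this. The players, teams, and numbers are placeholders, and the data structures are assumptions rather than the ones actually used for this piece.

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    name: str
    team: str
    minutes: int
    total: float  # season total for one metric, e.g. interceptions


def leaderboard(players, team_per_90, min_minutes=1000, top_n=5):
    """Rank players by team-adjusted tally for a single metric."""
    rows = []
    for p in players:
        if p.minutes < min_minutes:
            continue  # apply the minutes threshold
        player_per_90 = p.total * 90 / p.minutes
        share = player_per_90 / team_per_90[p.team]
        rows.append((p.name, p.team, round(share, 2)))
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top_n]


# Placeholder data, invented for illustration.
players = [
    PlayerSeason("Player A", "Team X", 2700, 60),
    PlayerSeason("Player B", "Team X", 1800, 30),
    PlayerSeason("Player C", "Team Y", 900, 40),   # below the minutes threshold
    PlayerSeason("Player D", "Team Y", 1350, 45),
]
team_per_90 = {"Team X": 10.0, "Team Y": 12.0}

table = leaderboard(players, team_per_90)
for rank, (name, team, share) in enumerate(table, start=1):
    print(f"{rank}. {name} | {team} | {share:.2f}")
```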

Clearly, looking through this lens brings up different batches of players. For non-penalty xG, the top names on a per 90 level were Mbappé, Gabriel Jesus, Robert Lewandowski, Sergio Agüero, and Mauro Icardi. I think it’s fair to say those guys are more adept overall at carving out chances for themselves than the likes of Michail Antonio and Benteke. However, relative to their team’s overall ability to create chances, Antonio and Benteke stand out.

So, team-adjusted metrics are more reflective of the responsibility a player takes on in their side, instead of just quality. Of course, there will be some overlap (Saint-Maximin is one of the world’s best dribblers, Messi is one of the world’s best creators, Konrad Laimer is one of the world’s best pressers), but the key elements here are role and responsibility.

This is what I expected heading in, but I was more interested in how those team-adjusted numbers could then be applied to different practices.

Player Similarity

One useful tool for player similarity is factor analysis. I’ve been using this for reducing the number of dimensions with Wyscout data, which is all basic per 90 outside of a couple possession-adjusted defensive stats. Factor analysis has brought some pretty solid results, but now it was time to apply it to team-adjusted stats.

Before I go any further, I’d just like to give a big shoutout again to @Eoin_OBrien_, whose work with factor analysis was a massive inspiration. If you’re unfamiliar with the concept, that piece I linked from Eoin is a great introduction. I’ll briefly walk you through the process here, though.

You start off with an assortment of different variables, which in this case are the individual team-adjusted metrics — team-adjusted shots, team-adjusted switches, team-adjusted attacking third pressures, etc. I ended up using a total of 44 different metrics.

Then, the factor analysis performs dimensionality reduction on all of those variables. It looks at the relationships between the metrics of each player, and attempts to explain as much of the variance of the original data set as possible by creating factors.

Each factor explains a certain portion of that variance. Also, how a player performs in certain original metrics determines how they will score in the different factors, which I will go further into when I explain each one.
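
A rough sketch of this step, assuming scikit-learn’s `FactorAnalysis` as the tool. The library choice, the standardization step, and the random placeholder data are all assumptions of this sketch, not a record of the exact setup used for this piece.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_players, n_metrics, n_factors = 200, 44, 7

# Placeholder matrix standing in for the real team-adjusted metrics
# (one row per player, one column per metric).
X = rng.random((n_players, n_metrics))
metric_names = [f"metric_{i}" for i in range(n_metrics)]

# Standardize so that no single metric dominates because of its scale.
X_std = StandardScaler().fit_transform(X)

fa = FactorAnalysis(n_components=n_factors, random_state=0)
factor_scores = fa.fit_transform(X_std)  # shape: (n_players, n_factors)
loadings = fa.components_.T              # shape: (n_metrics, n_factors)

# Metrics most strongly tied to the first factor, in either direction.
order = np.argsort(loadings[:, 0])
strongest_negative = [metric_names[i] for i in order[:3]]
strongest_positive = [metric_names[i] for i in order[::-1][:3]]
print(strongest_positive, strongest_negative)
```

With random data the factors are meaningless, of course. The point is only the shape of the workflow: a players-by-metrics matrix goes in, and a players-by-factors matrix of scores plus a table of loadings comes out.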

Before I dive into that, it’s worth noting that I decided to include all of the outfield players in the sample in the same factor analysis. Previously, I have broken them down by position group, but here, I was intrigued by some of the possibilities. Could this reveal centre midfielders who perform similar tasks to many full-backs, or centre forwards who play more like attacking midfielders?

With all that background in mind, I ended up getting seven factors. They combined to explain 78.5% of the variance in those original 44 metrics. Let’s look into each one.

Factor 1

Metrics with strongest positive correlation: Pass reception rate, defensive third touches, and medium passes attempted.

Metrics with strongest negative correlation: Non-penalty xG, attacking penalty area touches, and shots.

Highest ranking players: José Luis Palomino, Aaron Wan-Bissaka, and Cristian Romero.

Lowest ranking players: Christian Benteke, Joselu, and Habib Diallo.

What this seems to reflect: Deeper players who are active defenders in the defensive third (things like defensive third tackles, defensive third pressures, and shots blocked are also strongly correlated).

Factor 2

Metrics with strongest positive correlation: Crosses into the penalty area, passes into the penalty area, and attacking third touches.

Metrics with strongest negative correlation: Pass completion rate, shots blocked, and aerial wins.

Highest ranking players: Daniel Brosinski, Christian Günter, and Joan Sastre Vanrell.

Lowest ranking players: Aymeric Laporte, Virgil van Dijk, and Matija Nastasić.

What this seems to reflect: Wide players, mainly full backs, who get high up the pitch and play dangerous passes.

Factor 3

Metrics with strongest positive correlation: Successful pressures, middle third pressures, and middle third tackles.

Metrics with strongest negative correlation: Defensive penalty area touches, clearances, and defensive third touches.

Highest ranking players: Arturo Vidal, Konrad Laimer, and Allan.

Lowest ranking players: Tommaso Augello, Scott Dann, and Jesús Navas.

What this seems to reflect: Midfield defensive engines who are very active pressers and ball winners.

Factor 4

Metrics with strongest positive correlation: Middle third touches, short passes attempted, and passes into the final third.

Metrics with strongest negative correlation: Offsides, attacking penalty area touches, and non-penalty xG.

Highest ranking players: Maxime Lopez, Salva Sevilla, and Kevin Stöger.

Lowest ranking players: Mauro Icardi, Christian Kabasele, and José Luis Palomino.

What this seems to reflect: Tidy, metronomic players who are very involved in buildup play.

Factor 5

Metrics with strongest positive correlation: Progressive distance carried, dribbles attempted, and shot-creating dribbles.

Metrics with strongest negative correlation: Aerial wins and clearances.

Highest ranking players: Allan Saint-Maximin, Fabián Orellana, and Wilfried Zaha.

Lowest ranking players: Bas Dost, Maxi Gómez, and Troy Deeney.

What this seems to reflect: Active dribblers and carriers of the ball.

Factor 6

Metrics with strongest positive correlation: Switches, long passes attempted, and passes into the final third.

Metrics with strongest negative correlation: Short passes attempted, attacking penalty area touches, and offsides.

Highest ranking players: Jonjo Shelvey, Dimitri Payet, and Oliver Norwood.

Lowest ranking players: Juan Bernat, Jonny Castro, and Luke Shaw.

What this seems to reflect: Long, direct passers, most of whom are deep lying centre midfielders.

Factor 7

Metrics with strongest positive correlation: Aerial wins, defensive penalty area touches, and clearances.

Metrics with strongest negative correlation: Middle third pressures, pass completion rate, shot-creating live-ball passes, and dribbles attempted.

Highest ranking players: Joselu, Calum Chambers, and Christian Benteke.

Lowest ranking players: Óscar Melendo, Boubakary Soumaré, and Georginio Wijnaldum.

What this seems to reflect: Mainly strong aerial presences. Some will be more “dirty work” defenders who drop deep, and some are centre forwards who are more poachers and target men with limited creativity.

We can now investigate player similarity using Euclidean distance on the scores from these factors. Let’s run through the top matches for a few big names.

Lionel Messi

  1. Neymar | 85% similarity
  2. Papu Gómez | 83%
  3. Josip Iličić | 80%
  4. Josip Brekalo | 78%
  5. Filippo Falco | 77%
  6. Ruslan Malinovskiy | 75%
  7. Óscar Rodríguez | 75%
  8. Paulo Dybala | 74%
  9. Radja Nainggolan | 73%
  10. Zinedine Ferhat | 72%

Harry Kane

  1. Vedad Ibišević | 94%
  2. Darío Benedetto | 91%
  3. Maxi Gómez | 89%
  4. Lucas Alario | 87%
  5. Arkadiusz Milik | 86%
  6. Kevin Volland | 86%
  7. Rouwen Hennings | 86%
  8. Patrick Schick | 86%
  9. Moussa Dembélé | 85%
  10. Teemu Pukki | 85%

Kevin De Bruyne

  1. Mathias Autret | 88%
  2. Pascal Groß | 82%
  3. Lorenzo Pellegrini | 81%
  4. Bruno Fernandes | 80%
  5. Ángel Di María | 79%
  6. Pedro León | 79%
  7. James Maddison | 79%
  8. Robert Snodgrass | 78%
  9. Mathieu Dossevi | 77%
  10. Hamed Junior Traorè | 76%

Sergio Busquets

  1. Mikel Merino | 91%
  2. Marten de Roon | 86%
  3. Rodri | 85%
  4. Habib Maïga | 84%
  5. Ibrahim Sangaré | 84%
  6. Jorginho | 84%
  7. Milan Badelj | 83%
  8. Manu Trigueros | 83%
  9. Sidy Sarr | 83%
  10. Lucas Deaux | 82%

Trent Alexander-Arnold

  1. Daniel Brosinski | 83%
  2. Kieran Trippier | 79%
  3. Nacho | 77%
  4. Julien Faussurier | 77%
  5. Lucas Digne | 76%
  6. Diego Rico | 75%
  7. Aleksandr Kolarov | 75%
  8. Christopher Trimmel | 74%
  9. Philipp Max | 73%
  10. Maximilian Mittelstadt | 73%

Virgil van Dijk

  1. Gian Marco Ferrari | 83%
  2. Aymeric Laporte | 82%
  3. Matthias Ginter | 81%
  4. Aïssa Mandi | 80%
  5. Benoît Badiashile | 80%
  6. Thiago Silva | 80%
  7. Jonny Evans | 78%
  8. Kurt Zouma | 78%
  9. Joe Gomez | 78%
  10. Christophe Hérelle | 77%
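
The similarity lookups above can be sketched as follows. Euclidean distance on factor scores is the method described; the conversion from distance to a percentage (one minus the distance divided by the largest pairwise distance in the sample) is an assumption of this sketch, and the players and scores are placeholders.

```python
import math
from itertools import combinations

# Placeholder factor scores (three factors instead of seven, invented values).
factor_scores = {
    "Player A": [1.2, -0.3, 0.8],
    "Player B": [1.0, -0.1, 0.9],
    "Player C": [-1.5, 2.0, -0.4],
    "Player D": [0.2, 0.5, 0.1],
}

# Largest distance between any two players, used to scale similarity to 0-100.
max_distance = max(
    math.dist(a, b) for a, b in combinations(factor_scores.values(), 2)
)


def most_similar(target, scores, max_dist, top_n=10):
    """Return (name, similarity %) pairs, closest players first."""
    rows = []
    for name, vector in scores.items():
        if name == target:
            continue
        distance = math.dist(scores[target], vector)
        rows.append((name, round(100 * (1 - distance / max_dist), 1)))
    return sorted(rows, key=lambda row: row[1], reverse=True)[:top_n]


matches = most_similar("Player A", factor_scores, max_distance)
for rank, (name, similarity) in enumerate(matches, start=1):
    print(f"{rank}. {name} | {similarity}%")
```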

I think there is some good stuff here, but something feels different than my prior factor analyses. For Messi, the likes of Neymar and Iličić still came out on top before, but then you had more players like Philippe Coutinho, Martin Ødegaard, and Jack Grealish, and there certainly wasn’t any Filippo Falco or Zinedine Ferhat.

By shifting away from the use of basic volume stats, it seems we are more answering the question of, “Who is this team’s Lionel Messi?” as opposed to, “Which footballers are closest to Lionel Messi in their play?” if that makes sense. Once again, there will of course be some overlap, but I think the two can be utilized for different purposes.

Player Well Roundedness

Another subject I wanted to examine here was the idea of a well rounded footballer. If someone is able to take on responsibility for several different aspects of play in their team, that can be very valuable. I had a few ideas to measure this, which I’ll run through now.

The first involved using the players’ factor scores. For each of the seven factors, the scores were scaled in a 0–1 range, then the average of those seven would be their “Well-Roundedness Score”. The top names are pretty unique, but they honestly do make a lot of sense.

  1. Emi Buendía | 0.65
  2. Stefano Sabelli | 0.62
  3. Fred | 0.62
  4. Rodrigo De Paul | 0.62
  5. Trent Alexander-Arnold | 0.60

Think about it: each of those guys contributed to their side’s performance in many different ways. Buendía put up excellent creative and dribbling numbers, while also putting in a shift defensively. Someone like Fred covers vast areas defensively and is also key in ball progression. Alexander-Arnold provides immense creativity and progressive passing from the full-back position.

On the other end of the spectrum, the lowest averages all came from centre forwards.

  1. Mauro Icardi | 0.24
  2. Jamie Vardy | 0.25
  3. Borja Iglesias | 0.25
  4. Erling Haaland | 0.25
  5. Sergio Agüero | 0.27

While this may seem harsh on these footballers, it makes sense that number nines are statistically the most one dimensional. These poacher-type players are largely uninvolved in progression, usually don’t dribble or play the killer pass themselves, and often do very little defending — their job is to score goals.

It’s not a knock on those forwards — clearly there are some absolutely world class players there — I just found that interesting. The more insightful thing to do would be to compare across the same position group to answer questions like, “Who was the most well rounded centre forward?” or, “Who was Liverpool’s most well rounded centre midfielder?”. The answers to those questions are Paulo Dybala (at least on Wyscout he was listed as primarily a centre forward) and Jordan Henderson.
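
The factor-based “Well-Roundedness Score” described above (scale each factor to a 0–1 range across players, then average per player) could be sketched like this. The player labels and scores are invented, and only three factors are used to keep it short.

```python
def min_max_scale(values):
    """Scale a list of numbers to the 0-1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def well_roundedness(factor_scores):
    """Average of each player's 0-1 scaled factor scores."""
    names = list(factor_scores)
    # Transpose to one list per factor, then scale each factor across players.
    columns = zip(*(factor_scores[name] for name in names))
    scaled_columns = [min_max_scale(list(column)) for column in columns]
    return {
        name: sum(column[i] for column in scaled_columns) / len(scaled_columns)
        for i, name in enumerate(names)
    }


# Placeholder factor scores, invented for illustration.
scores = well_roundedness({
    "Specialist": [3.0, -1.0, -1.0],
    "All-rounder": [1.0, 1.0, 1.0],
    "Quiet": [-1.0, -1.0, -1.0],
})
for name, value in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name} | {value:.2f}")
```

The specialist tops one factor but sits at the bottom of the others, so the all-rounder ends up with the higher average, which mirrors why one dimensional centre forwards rank lowest.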

But, I also wanted to try something different with just a few select team-adjusted metrics. I started by breaking things down into a few different categories — ball winning, ball progression, chance creation, and scoring threat. For each category, a player’s score was calculated by first scaling each individual metric from 0–1 and then taking the average.

For ball winning, I chose to include aerial wins, successful pressures, and interceptions. The top players were Lucas Leiva, Benjamin André, Geoffrey Kondogbia, Jerdy Schouten, and Tom Trybull.

The progression category consisted of progressive passes, progressive distance passed, dribbles, and progressive distance carried. Coming out on top were Felipe Anderson, Lionel Messi, Neymar, Tommaso Augello, and Toni Villa.

Expected assists, shot-creating live-ball passes, and shot-creating dribbles are what I chose to include for creation. Allan Saint-Maximin, Lionel Messi, Adama Traoré (Metz), Gonçalo Guedes, and Raphinha were the top names.

Lastly, for scoring threat, I used non-penalty xG, shots, and attacking penalty area touches. The highest ranking players were Kylian Mbappé, Michail Antonio, Ángel Rodríguez, Christian Benteke, and Joselu.

Now, for the overall score, each of those individual categories was scaled as well, and then the average was taken. Here are the top five players:

  1. Lionel Messi | 0.68
  2. Emi Buendía | 0.65
  3. Allan Saint-Maximin | 0.64
  4. Gerard Deulofeu | 0.64
  5. Raphinha | 0.63

Obviously, this is far too biased towards well balanced attackers. I guess I still set it up to have too many attacking and in-possession metrics. In an attempt to account for this, and I really don’t know how statistically correct this is, what if we weighted the ball winning category by 3? This is how the top of the list looks then:

  1. Christian Benteke | 0.95
  2. Joselu | 0.94
  3. Emi Buendía | 0.94
  4. Lucas Leiva | 0.93
  5. Benjamin André | 0.92
  6. Sergej Milinković-Savić | 0.86
  7. Youcef Attal | 0.86
  8. Lucas Deaux | 0.86
  9. Mikel Merino | 0.86
  10. Guido Carrillo | 0.85

Now we’re seeing more of what we want to see — those all round centre midfielders like Milinković-Savić and Merino, or a full back like Attal. However, there is way too much bias towards aerial centre-forwards, as they put up attacking numbers, but then by my logic are also good defenders because of their aerial duel tallies. Removing aerial wins from the ball winning category turns that previous list into this:

  1. Emi Buendía | 1.00
  2. Gerard Deulofeu | 0.94
  3. Lucas Leiva | 0.93
  4. Marko Rog | 0.92
  5. Benjamin André | 0.92
  6. Felipe Anderson | 0.91
  7. Wilfried Zaha | 0.90
  8. Domenico Berardi | 0.90
  9. Téji Savanier | 0.88
  10. Giovani Lo Celso | 0.88

Elsewhere in the top 20 are the likes of Fred, Stefano Sabelli, De Paul, Geoffrey Kondogbia, Raphinha, Yangel Herrera, Luis Alberto, and Attal. Now, we are back to a more interesting mix of profiles — some box to box midfielders, some attackers with high defensive work rates, some “one man shows” in attack who are also important progressors.
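
The category-based score, including the extra weight on ball winning, could be sketched as below. The text doesn’t spell out how the weighted average was normalized, so dividing by the sum of the weights is an assumption here. The metric keys, players, and values are placeholders, and only two categories are used to keep the example small.

```python
def min_max_scale(values):
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def scale_columns(table):
    """Scale every key of {player: {key: value}} to 0-1 across players."""
    players = list(table)
    keys = list(next(iter(table.values())))
    scaled = {player: {} for player in players}
    for key in keys:
        column = min_max_scale([table[player][key] for player in players])
        for player, value in zip(players, column):
            scaled[player][key] = value
    return scaled


def overall_scores(metrics, categories, weights=None):
    """Scale metrics, average within categories, scale categories, then average."""
    weights = weights or {}
    scaled_metrics = scale_columns(metrics)
    category_scores = {
        player: {
            category: sum(values[m] for m in names) / len(names)
            for category, names in categories.items()
        }
        for player, values in scaled_metrics.items()
    }
    scaled_categories = scale_columns(category_scores)
    total_weight = sum(weights.get(category, 1.0) for category in categories)
    return {
        player: sum(weights.get(c, 1.0) * v for c, v in cats.items()) / total_weight
        for player, cats in scaled_categories.items()
    }


# Placeholder categories and team-adjusted values, invented for illustration.
categories = {
    "ball_winning": ["interceptions", "successful_pressures"],
    "creation": ["xg_assisted"],
}
metrics = {
    "Destroyer": {"interceptions": 0.30, "successful_pressures": 0.20, "xg_assisted": 0.05},
    "Creator": {"interceptions": 0.05, "successful_pressures": 0.05, "xg_assisted": 0.40},
    "All-rounder": {"interceptions": 0.20, "successful_pressures": 0.15, "xg_assisted": 0.25},
}

equal = overall_scores(metrics, categories)
weighted = overall_scores(metrics, categories, weights={"ball_winning": 3.0})
print(max(equal, key=equal.get), max(weighted, key=weighted.get))
```

With equal weights the all-rounder comes out on top in this toy data, while tripling the ball winning weight pushes the pure ball winner ahead, which shows how sensitive the ranking is to that choice.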

Feel free to let me know your thoughts on this concept. I definitely don’t think any of the things I proposed are a perfect solution, but I would like to build off of them.

Further Ideas Involving Team-Adjusted Stats

Another notion I hope to look into is the idea of a balanced team. I don’t have a concrete plan yet, but I just wanted to put this out there to maybe inspire someone to investigate it themselves.

How do the top clubs allocate different metrics? How much dependence on a single player is too much? There are so many questions that I feel could be explored by people much smarter than me.

Related to team balance, I don’t think I’ve mentioned a single Bayern Munich player in this whole piece…

Some other topics that could be fun are how a player’s role and responsibilities have changed over time. Or, when a player transfers to a bigger or smaller club, use team-adjusted metrics to see if their style has truly changed, or if they’re just performing to a different scale.

Maybe even within the same team, if a club has a breakout season and improves a lot, see whether the players’ roles changed or if it was more of just an overall improvement in quality.

Finally, it could be cool to see someone build a model that tries to predict a player’s position or qualify them in a certain role based on their team-adjusted numbers.

Final Thoughts

I guess the main takeaway here is that there are a number of possible ways to look at player statistics, and each serves its own purpose and highlights different footballers. Team-adjusted metrics reflect more of a player’s responsibility in their side than quality in general, and when used for player similarity, they bring forth players who fit a similar role in their teams as opposed to players who are most similar in general performance.

The main benefits I can think of would definitely be to uncover talent in a poor team, or to compare the makeup of two sides — their distribution of different tasks across different positions — possibly for scouting purposes.
