Constructing Chances: Single Game G+ Performance Thresholds in Major League Soccer
Imagine that you are working as a GM for an MLS team, clearing through an old dusty office when you come across a lamp. You see some writing on the side, and when you attempt to clear away the dirt with your sleeve, a genie emerges in a puff of smoke.
He offers you two choices. Option 1, you can sign a player that will never give you a truly awful performance, but also never give a great one either; he will happily sit in the middle 50 of player performances his entire career. Option 2, you get a talented but mercurial player. Half of his games will be some shade of brilliant — better than what many players will achieve in their entire career. The other half, though, will be terrible games where he can hardly contribute, worse than what many players will ever sink to in their entire career.
What do you choose? Do you want the consistent and predictable performances of the first player, or the greatness of the second regardless of the considerable risk? Is there even a right answer that doesn’t take into account the players already on the roster?
I posed the question (with unfortunate typos) on Twitter, and the bulk of responses seemed to favor the more consistent player.
In the discussion of the US Men’s National Team on the most recent American Soccer Analysis podcast, the idea was also strongly implied that a coach with his job on the line would prefer predictable performances over ones that could be either good or bad. This specifically related to young players who show high ceilings, but with whom the coach is still unfamiliar in the international environment. So most people, given the choice, would lean towards the solid if not spectacular player.
What is a good performance anyway?
The first challenge is defining which performances count as good, which are bad, and which are in between. We can turn to American Soccer Analysis for a little bit of help here. I’ve written about Goals Added (G+) before, and I still feel that of all the objective or semi-objective measures of player quality it’s one of the best and most transparent. It also correlates pretty well with xGF, which itself is pretty key to winning games.
1. Here’s a distribution that counts every individual player performance over the course of the 2021 MLS season. (Quick note — because I am focused on offensive production, the G+ used in the article does not take into account interrupting).
You can see that most individual performances are within a few decimal points of 0. The median is actually -0.02, slightly below the “average” MLS performer at the position measured. The middle 50% are between -0.06 and 0.02.
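To make the bands concrete, here is a minimal sketch of how single-game G+ values could be split into a bottom 25, a middle 50, and a top 25. The function names and the sample values are assumptions for illustration, not the article's actual method or real MLS data; the cut points are simply the 25th and 75th percentiles of every game in the sample.

```python
from statistics import quantiles

def band_thresholds(values):
    """Return the (25th, 75th) percentile cut points of all single-game G+ values."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3

def label_performance(value, low, high):
    """Label one single-game G+ value relative to the league-wide cut points."""
    if value < low:
        return "bottom 25"
    if value > high:
        return "top 25"
    return "middle 50"

# Hypothetical single-game G+ values, NOT real MLS data.
league_games = [-0.10, -0.06, -0.04, -0.02, -0.02, 0.00, 0.02, 0.05, 0.12]

low, high = band_thresholds(league_games)
labels = [label_performance(v, low, high) for v in league_games]

for value, label in zip(league_games, labels):
    print(f"{value:+.2f} -> {label}")
```

Games that land exactly on a cut point are counted as middle 50 here; that is a design choice of this sketch, and a different tie rule would shift a few borderline games.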
2. Another way of looking at what it means to have a “good” performance is to look at the average team. This is the average G+ for the top 6 performers on a team.
Our hypothetical consistent performer would spend his entire time in this area, sometimes contributing above average performances, and sometimes below average, but never in a huge direction either way. (These players do exist in reality, though usually not over 34 games — Wilfried Kaptoum played in 21 games this season without rising above or falling below the middle 50).
Still, about 27% of MLS performers failed to record a 75th+ percentile single game performance. Many of that group still had 25th percentile games or below, so you’d be pretty confident that player would be as good or better than a quarter of the league at a minimum. Seems like a pretty good option.
Now, our hypothetical Jekyll/Hyde performer does not really exist in MLS. Most players have some games in the middle 50 no matter what, and none of the ones that don’t played more than 5 games this season. There are some players that tend towards the tails of the spectrum, though.
Who’s the best in MLS right now?
If the genie offered the best player in MLS in terms of consistent positive impact across games, that would be a no-brainer. If that were an option, who would it be?
If you guessed Carles Gil, you’re probably right.
Among players with at least 20 games, Gil had the highest percentage in the 90th+ and 75th+ percentile bands (14 and 20, respectively). He also had only 2 games below the 25th percentile.
Here’s a look at some of the best performers in MLS in 2021 in terms of G+ above average per 96 minutes played. Notice that most players have mostly good games, but there are some erratic ones in there, like Brian Rodriguez, Daniel Salloi, or Taty Castellanos.
That’s right. Taty Castellanos: MLS Golden Boot winner, NYCFC talisman, and just about as likely to give you an awful game as a transcendent one. In fact, he had just 1 game in the middle 50 in 2021.
If the genie offered you one season with Taty or a player who had a similar G+/96 but with more middle 50 games (let’s say Jhonder Cadiz, about 60% of whose games were in the middle 50), who would you take? It’s hard to look past the Golden Boot winner and best striker in MLS. It’s also hard not to notice that of all these top players, none of them fall in the upper left quadrant.
Now again — given the choice you want a player that is consistently among the best and never has bad games. But every player signing is a risk, and if you’re going to take a risk in general it makes sense to go with a ceiling even if the consistency isn’t there.
Great Performances Make Great Teams
Being a manager is hard — and a huge amount of it is up to chance. When you have to deal with the reality that your job might not even matter, it makes sense that managers would tend to be risk averse, preferring the players that give those consistent middle 50 performances over a player that might give you something great but could just as easily have a terrible game. The middle 50 is comforting, secure, and controllable.
But here’s the thing about middle 50 performances. They don’t matter. They just don’t make wins.
I mean, in a grand sense they probably do matter, as each player on the field has a job to do. To completely discount the contributions of a single player who works hard for 90 minutes is unfair. The fact remains that if you design a team around maximizing middle 50 performances at the cost of better ones, you end up with a weak team.
So what does matter?
We know right now that G+ has a great correlation with xGF. Here’s an animation that shows how that worked in MLS in 2021.
The correlation coefficient for the team’s sum of G+ above average with xGF was 0.81, a pretty strong correlation. But how is that total sum made?
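For readers who want to check a figure like the 0.81 above, here is a minimal sketch of the Pearson correlation coefficient, assuming that is the measure used (the article does not say which). The team totals below are invented for illustration, not real MLS data, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either sequence is constant.
    return cov / sqrt(var_x * var_y)

# Hypothetical season totals per team, NOT real MLS data.
team_g_plus_above_avg = [-1.5, -0.5, 0.0, 0.8, 1.6]
team_xgf = [38.0, 44.0, 43.0, 50.0, 55.0]

r = pearson_r(team_g_plus_above_avg, team_xgf)
print(f"r = {r:.2f}")
```

A value near 1 means teams with a higher G+ sum also tend to have higher xGF; it says nothing on its own about which causes which.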
In the graph above, you can see that for teams to perform better, they get more top end performances. The total number of middle 50 performances remains fairly consistent across outcomes, reinforcing the idea that having more, or less, doesn’t really matter much. Maximize the top level games, and the team has better outcomes. And a quick note: the overall number of 75th+ percentile games has a similar correlation with team G+ and xGF as the number of 90th+ percentile games, but it increases the sample size and creates a clearer picture of who is creating value.
But what about bad games?
Let’s look back at Taty. If 56% of his games were 75th percentile or better, but 41% were 25th or worse, would it be an improvement on the back end if he were a little less good but a lot less bad? Instead of 56–3–41, what if he were 45–30–25?
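The good–middle–bad shorthand used here (for example 56–3–41) can be computed from a player's single-game values once the league cut points are known. This is a rough sketch under stated assumptions: the cut points and the ten game values are made up for illustration, and `performance_profile` is a hypothetical helper, not something from American Soccer Analysis.

```python
def performance_profile(game_values, low, high):
    """Return (good %, middle %, bad %) for one player's single-game G+ values.

    `low` and `high` are the league-wide 25th and 75th percentile cut points.
    """
    n = len(game_values)
    if n == 0:
        raise ValueError("player has no games")
    good = sum(v > high for v in game_values)  # 75th+ percentile games
    bad = sum(v < low for v in game_values)    # games below the 25th percentile
    middle = n - good - bad                    # everything in between
    return tuple(round(100 * count / n) for count in (good, middle, bad))

# Assumed cut points and a hypothetical boom-or-bust player, NOT real data.
LOW, HIGH = -0.06, 0.02
games = [0.30, 0.15, 0.08, 0.21, 0.05, -0.01, -0.12, -0.09, -0.20, -0.07]

profile = performance_profile(games, LOW, HIGH)
print("{}-{}-{}".format(*profile))
```

Because each share is rounded separately, the three numbers can occasionally add up to 99 or 101 rather than exactly 100.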
This is where it gets a little more complex. The bottom line is that bad games have a smaller negative correlation with xGF than good games have a positive correlation.
Let’s assume we start at 0–100–0 for a player. For each percentage point that goes up on each side (from 0–100–0 to 1–98–1 to 2–96–2, and so on), the net positive impact increases, before maxing out at 50–0–50.
So here’s the bottom line: the challenge for the GM is finding the player that will produce the most good performances, regardless of other outcomes. Then, if two players are similar in their ability to create good games, the player who avoids negative performances more consistently is the better choice.
In the genie problem, the right pick would probably be to take the mercurial player that promises good performances even if it comes with bad ones.
There is room for optimization, of course, and the truth is that it is impossible to know in advance what a player will bring.
The biggest issue with inconsistent players is the risk involved. If you have an attacking group of 4 players, each with a 50–50 chance of a good game or bad game, you should, in theory, be a strong team most of the time. After all, you have a 68.7% chance of getting 2 or more good performances. Even assuming a fairly normal midfield and defense, you are well on your way to winning. Still, there’s a 25% chance you end up with just one good showing from the 4, and a 6.3% chance you don’t get anything at all. That’s survivable in a regular season where points will probably even out over time, but in the playoffs it can be fatal.
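Those percentages follow from the binomial distribution, assuming each of the 4 attackers has an independent 50% chance of a good game (independence is an assumption of this sketch; teammates' performances are probably linked in practice). The helper names are hypothetical.

```python
from math import comb

def prob_exactly(k, n=4, p=0.5):
    """Probability that exactly k of n independent players have a good game."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least(k, n=4, p=0.5):
    """Probability that k or more of n independent players have a good game."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))

print(f"2 or more good games: {prob_at_least(2):.4f}")  # 0.6875
print(f"exactly 1 good game:  {prob_exactly(1):.4f}")   # 0.2500
print(f"no good games:        {prob_exactly(0):.4f}")   # 0.0625
```

Changing `p` shows how quickly the picture shifts: the same functions work for a group whose players are good more or less often than half the time.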
That’s where managers have to make choices with lineups and decide their approach. It’s not an easy job but that’s why they make the money they do.
Evaluating Players by Performances
Thinking about performances rather than just averages over many games lets us look at players in a different light. Rather than letting outliers shape the opinion of a player, it becomes easier to see impact over time, a player’s form, and much more about an individual. Here’s a look at Keaton Parks over his time in MLS:
It’s clear to see that while Parks has had more low end games in the last two years, his overall impact has been greater. That’s also true when you look at his ratios:
Even though Parks’s G+ average has slightly decreased since 2019, his consistency has remained good and his ability to produce good performances has increased.
Being able to tease out things like consistency is important — it provides valuable nuance to discussions about quality or fit. A player can have a 0.02 G+ in the same position as another, but the two players could have gone about getting those numbers in very different ways.