Player usage always seems to come into the conversation, at least for me, when analyzing Dan Girardi as a purely metrics-based player (this is not a blog for eye test v. stats test drama, sorry to disappoint). There are certainly a few factors that I believe can rattle a player's metrics, mainly the widely known:
- Zone Starts
- Team Performance
- Score Effects
- Many more - hockey is a fluid game. We all need to try and remember that hockey isn't baseball sometimes.
The biggest issue I have is when analysts and fans alike quote a player's relative Corsi For percentage (relCF%) off of War-On-Ice.com without providing any context or caveats as to why that relCF% might be as 'ugly' as it is. It is so much more than 'player X is terrible because his relCF% is -10, and player Y is amazing because his relCF% is +10'. Well, okay, but how come we're not talking about the fact that player Y starts in the offensive zone 12% more than his teammates do?
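For readers newer to the stat, here is a minimal Python sketch of how a relative Corsi For percentage can be computed. It assumes the common definition (on-ice CF% minus the team's CF% while the player is off the ice). The function names and shot-attempt counts are made up for illustration and are not taken from War-On-Ice.

```python
def cf_pct(attempts_for: int, attempts_against: int) -> float:
    """Corsi For %: the share of all shot attempts that belong to the player's team."""
    total = attempts_for + attempts_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * attempts_for / total


def rel_cf_pct(on_for: int, on_against: int, off_for: int, off_against: int) -> float:
    """Assumed definition: on-ice CF% minus the team's CF% with the player off the ice."""
    return cf_pct(on_for, on_against) - cf_pct(off_for, off_against)


# Hypothetical counts: the team controls 45% of attempts with the player on the ice
# and 55% with him off, which gives the "-10" from the example above.
example = rel_cf_pct(on_for=450, on_against=550, off_for=1100, off_against=900)
print(example)
```

Nothing in that number knows where the player's shifts started, which is exactly the context that tends to get left out.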
This post will take a deep dive into what I believe is the biggest driving factor in player usage context: zone starts.
If we take a player usage chart from War-On-Ice.com, and change the Y-axis to relCF% rather than competition, will we see a correlation between relCF% and relZS?
(2014/2015, 5v5, Forwards and D, minimum 350 TOI)
This is a convoluted mess. There is a general uptick when you increase the zone starts, but it's sloppy all around. Luckily, software exists that can tell you exact correlations.
Using R, I ran correlations between relCF% and relZSO% for each season from 2007-2008 through the present day. For each season, I ran the full dataset, and then two runs of 250 randomly selected players to make sure things remained consistent.
While always positive, the correlations returned were only barely moderate in strength, which I found quite surprising.
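The original analysis was done in R and that code isn't shown here, so the following is only a rough Python sketch of the same idea: one Pearson correlation on the full dataset, then two more on random draws of 250 players. The data below is synthetic and generated purely for illustration; it is not War-On-Ice data, and the helper names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def subsample_correlation(players, k, rng):
    """Correlation on k players drawn at random without replacement."""
    sample = rng.sample(players, k)
    return pearson([p[0] for p in sample], [p[1] for p in sample])


rng = random.Random(0)

# Synthetic stand-in for one season: (relZSO%, relCF%) pairs, NOT real player data.
players = []
for _ in range(600):
    rel_zso = rng.gauss(0, 6)
    rel_cf = 0.3 * rel_zso + rng.gauss(0, 3)
    players.append((rel_zso, rel_cf))

full_r = pearson([p[0] for p in players], [p[1] for p in players])
sample_rs = [subsample_correlation(players, 250, rng) for _ in range(2)]
print(full_r, sample_rs)
```

If the two subsample values land close to the full-season value, the relationship is probably not being driven by a handful of players.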
Looking again at the full player usage chart, you can eyeball that there is a lot of noise between -5 and +5 relZSO%, where it may be "easier" for a player to land on either a positive or a negative relative Corsi For percentage, as they are at or near average zone starts for their team.
For the 2014-2015 season, from a frequency standpoint, this hypothesis was proven true. There is just too much noise in the middle of the field, and it makes it harder to analyze the outliers in terms of relZS%.
Hypothesis confirmed. It takes a good possession player to be a negative relZSO and positive relCF% player. It takes a bad possession player to be a positive relZSO and negative relCF% player.
On the other hand, being a negative relZSO and negative relCF% player doesn't necessarily make you a bad player, and being a positive relZSO and positive relCF% player doesn't necessarily make you a good player.
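The four combinations above can be counted directly. This is a small Python sketch of a quadrant frequency count, assuming each player is simply a (name, relZSO%, relCF%) record. The players and numbers listed are hypothetical placeholders, not real 2014-2015 values.

```python
from collections import Counter


def quadrant(rel_zso: float, rel_cf: float) -> str:
    """Label a player by the sign of his relative zone starts and relative Corsi."""
    zs = "+relZSO" if rel_zso >= 0 else "-relZSO"
    cf = "+relCF%" if rel_cf >= 0 else "-relCF%"
    return f"{zs} / {cf}"


# Hypothetical players, for illustration only.
players = [
    ("Player A", -12.0, 1.5),   # tough zone starts, still drives play
    ("Player B", -9.0, -4.0),   # tough zone starts, negative possession
    ("Player C", 8.0, 3.0),     # favorable zone starts, positive possession
    ("Player D", 11.0, -2.5),   # favorable zone starts, still negative possession
    ("Player E", -7.5, -1.0),
]

counts = Counter(quadrant(zso, cf) for _, zso, cf in players)
for label, count in sorted(counts.items()):
    print(label, count)
```

The two mismatched quadrants ("-relZSO / +relCF%" and "+relZSO / -relCF%") are the ones that say the most about a player.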
This is confirmed when re-running the correlation for the full data set of the 2014-2015 season. When including the 'noise' of players from -5 to +5 relZSO, the correlation is a moderately strong 0.521. However, when filtering out the 'noise' of players from -5 to +5 relZSO, the correlation grows substantially to 0.659.
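The filtering step is easy to reproduce. This is a hedged Python sketch, not the original R code: it drops every player whose relZSO% falls between -5 and +5 and recomputes the correlation on whoever is left. The data is synthetic and the function names are hypothetical, so the printed values will not match the 0.521 and 0.659 quoted above.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def drop_middle(players, band=5.0):
    """Keep only players whose relZSO% lies outside the [-band, +band] range."""
    return [(zso, cf) for zso, cf in players if abs(zso) > band]


rng = random.Random(1)

# Synthetic (relZSO%, relCF%) pairs, NOT real 2014-2015 data.
players = []
for _ in range(600):
    zso = rng.gauss(0, 6)
    players.append((zso, 0.3 * zso + rng.gauss(0, 3)))

outliers = drop_middle(players)

r_all = pearson([p[0] for p in players], [p[1] for p in players])
r_outliers = pearson([p[0] for p in outliers], [p[1] for p in outliers])
print(len(players), r_all)
print(len(outliers), r_outliers)
```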
By the way, that 1 forward who is a positive relCF% player despite abhorrent relZSO? Kyle Chipchura with -17.66% relZSO. The next forward with a positive relCF% is Brock Nelson, -11.84% relZSO.
The only other factor on the above list that I've taken a dive into is teammates and competition.
Competition is the one that really convinces me that hockey analytics, as they stand today (at least for people outside the inner circle of the game), are in their infancy. War-On-Ice has a TOIC% statistic that measures the competition faced by each player. Among forwards who logged 300 or more minutes last season, TOIC% spans a measly 2.26 percentage points, from 15.85% (Anton Lander) to 18.11% (Jonathan Toews).
It's not any better for defensemen, who range from 16% (Keith Aulie) to 17.89% (Mark Giordano).
Is this really the best competition metric that we currently have?
You can also look at competition Corsi For/60 (the rate at which the skaters a player faces record shot attempts), but that does no better for our range than TOIC% does.
Of course, these metrics could be tiered. You could place all players on a percentile scale to get a better relative comparison among forwards and among defensemen. It all circles back to the argument already made: hockey is a fluid game. Skaters jump on and off the ice quickly, and over the course of 82 games, everyone is going to face their fair share of competition.
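As a rough illustration of the tiering idea, this Python sketch converts raw TOIC% values into percentile ranks within a position group. The two endpoint values are the forward numbers quoted above; the three in-between players are hypothetical filler, and the percentile definition (share of the group at or below a value) is just one reasonable assumption.

```python
from bisect import bisect_right


def percentile_ranks(values):
    """Percentile rank of each value: the share of the group at or below it."""
    ordered = sorted(values)
    n = len(ordered)
    return [100.0 * bisect_right(ordered, v) / n for v in values]


# Forward TOIC% values: the endpoints come from the post, the middle three are made up.
forwards = {
    "Anton Lander": 15.85,
    "Hypothetical Forward A": 16.40,
    "Hypothetical Forward B": 17.00,
    "Hypothetical Forward C": 17.50,
    "Jonathan Toews": 18.11,
}

ranks = percentile_ranks(list(forwards.values()))
for name, rank in zip(forwards, ranks):
    print(f"{name}: {rank:.0f}th percentile")
```

Defensemen would get their own call to `percentile_ranks`, so that a forward is only ever compared against other forwards. A narrow raw range gets stretched across 0 to 100 this way, which helps with comparison but does not add information the raw metric lacks.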
But what about teammates? When coaches have set lines and d-pairs, we should be able to see a bigger range.
And we do. For forwards, the range goes from 13.36% (Dennis Everberg) to 19.69% (Jason Pominville). For defensemen, we see the same pattern. The range for teammates goes from 13.01% (Zach Redmond) to 17.93% (Tobias Enstrom).
Relative stats like Corsi, Fenwick, goals for, etc. do a good job of taking team performance out of individual player analysis. Yet, when attempting to adjust player statistics (rant on that coming soon, I think), how the team performs in general needs to be looked into when adjusting a player's metrics for their situations.
Score effects are very real, especially on the team level. We all know the rule of thumb, but just in case: the team leading will generally "sit back", allowing the team that is trailing to shoot the puck more. The leading team's possession metrics decrease, while the trailing team's possession metrics increase. However, increased possession does not bring increased quality. While a trailing team will "fire at will" from anywhere on the ice and have their defensemen jump into the play, the leading team can take a more calculated approach to their attack, and will gain more scoring chances on developed odd-man rushes.
This game is just too fluid. Too much can be attributed to a bad bounce, a puck hopping over a stick, a shot going an inch too far one way or the other and hitting the post (yeah, but an inch the other way, and you'd have missed the net completely). Hockey is not baseball.
Part II on this should launch soon, where I take a long look at Dan Girardi as the prime example of a player's metrics being severely burned by his outlandish usage.