Post Mortem: Player Projection Model

Well, this was disappointing. The same model showed some pretty good success in predicting the 15-16 season, but totally botched the 16-17 season.

Run against the 15-16 season, the model's predicted goals had an r^2 of 0.3037 and predicted primary points had an r^2 of 0.3421. In this iteration, 80 forwards had their goals predicted correctly, 77 forwards had their primary points predicted correctly, and 36 players had both their goals and primary points predicted correctly.

For this season though, well, the model essentially failed. With the same weighting parameters and system, this season didn't seem to cooperate. Goals had an r^2 of 0.1982, and primary points had an r^2 of 0.2952. Of the 298 players evaluated, the model hit on just 51 players for goals, 66 for primary points, and a measly 25 for both.
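To put those numbers side by side, here is a minimal sketch of the two measurements. It assumes r^2 means the squared Pearson correlation between predicted and actual totals (the post doesn't say how it was computed), and the function name `r_squared` is made up for illustration. The hit counts are the 16-17 figures quoted above.

```python
def r_squared(predicted, actual):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: this is the r^2 being reported; the original calculation
    is not shown in the post.
    """
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov ** 2 / (var_p * var_a)

# Hit rates for 16-17, from the counts quoted in the post.
EVALUATED = 298
hits = {"goals": 51, "primary points": 66, "both": 25}
hit_rates = {name: round(100 * count / EVALUATED, 1) for name, count in hits.items()}

for name, pct in hit_rates.items():
    print(f"{name}: {pct}% of {EVALUATED} players")
```

Expressed as rates, the 16-17 hits come to roughly 17% for goals, 22% for primary points, and 8% for both.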

Issues with the model:

I think it's too lenient. The way the predictions work, if you recall, is similar to 538's CARMELO. The issue, though, is that I don't think I'm getting enough separation to make each player unique. With the current weighting system, each player is coming away with too many comparable players. Throw all of these guys into the mix, and the model ends up projecting very close to league average, even for the league's better players.

For example, when running the projections for Sidney Crosby, 87 comes away with 144 player seasons with a positive similarity score. That's too many for Sid, who, even I can admit, is in a league of his own. Now, a good sign is that no one has a similarity score over 57 (the highest is Datsyuk's age-29 season, a 97-point campaign for PD), but 144 players, even indexed by their similarity score, are going to bring Crosby's projections down, and they do: the model predicts a high of 13.7 5v5 goals for Sid this year.

To counter this, I think I'm going to need to add more constraints to how the similarity scores are calculated.
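To see why a long tail of weak comparables drags a projection toward average, here is a minimal sketch. It assumes the projection is a similarity-weighted mean of the comparable seasons' goal totals, which may not match the model's actual weighting. The comps below are made up: only the count of 144 and the top score of 57 come from the Crosby example, and `project` and `min_score` are hypothetical names.

```python
def project(comps, min_score=0.0):
    """Similarity-weighted mean of comparable seasons' goal totals.

    comps: list of (similarity_score, goals) pairs.
    min_score: only comps scoring above this cutoff are used.
    """
    kept = [(score, goals) for score, goals in comps if score > min_score]
    if not kept:
        raise ValueError("no comparable seasons above the cutoff")
    total_weight = sum(score for score, _ in kept)
    return sum(score * goals for score, goals in kept) / total_weight

# Hypothetical comps: three strong matches plus 141 weak, ordinary ones.
comps = [(57, 20), (50, 18), (45, 17)] + [(10, 8)] * 141

loose = project(comps)                 # every positive score counts
strict = project(comps, min_score=40)  # only the strong matches count

print(f"{len(comps)} comps, loose cutoff:  {loose:.2f} goals")
print(f"3 comps, strict cutoff:    {strict:.2f} goals")
```

Even at a low weight each, the 141 weak comps carry most of the total weight, so the loose projection lands near their level. A stricter cutoff is one possible constraint; adding more inputs to the similarity score itself would be another.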

But still, I think the power of this model comes from the way it lets us view a player's career trends, and not necessarily from the actual numbers it produces. That is, which players' career goal and point totals are trending up based on similar players, and which players' totals are trending down. This is especially useful for upcoming free agents.

For example, a team may find itself in the market for a UFA winger this year like Vanek or Stafford. Using these projections, they could get a baseline impression of how each player's career will progress based on the previous careers of similar players...

(The first line in the below images is still the 16-17 season).

Stafford:

Vanek:

It's no surprise that the model is more optimistic about Vanek's future point totals, as Vanek has always been the more offensive player of the two, but it is interesting that the model perceives a huge dropoff when Stafford hits age 34 that it doesn't have for Vanek. As always with signing players over 30 to long-term contracts, buyer beware.

Next Steps:

Continue to tinker. The model was only decent for the 15-16 season and was truly horrid for the 16-17 season. This probably means making the parameters stricter in order to get fewer and more accurate player projections. Which, I think, means adding more parameters.

I have plans on launching a Shiny app for the 17-18 season, but haven't gotten around to that yet.

True Shooting Percentage: How many shots does it take?

If you've ever heard me talk about hockey before, or have read this blog, or my Twitter feed, you know that I'm very skeptical of JT Miller.

Via Puckalytics.com, JT Miller is shooting 12.12% this year during 5v5 play over 66 shots. Last season, JT Miller shot 14.78% during 5v5 play over 115 shots. Among players with at least 60 shots on goal this year, Miller ranks 46th in the league. Not so egregious. Last season, though, Miller ranked 9th in the league. Really, it was the explosion of goals last year for Miller that caused me to dive into this, as 9th in the entire league does seem a bit high for JT. And it seems high because last year's mark of 14.78% is a shooting percentage that JT didn't even touch when he was playing in the AHL. From 12-13 through 14-15, JT Miller took 235 shots in the AHL and scored on 29 of them, for an all-situations shooting percentage of 12.34%.

JT Miller's all-situations shooting percentage across last season and this season? 16.81% on 226 shots on goal.

The burning question remains: is JT Miller an elite shooter? Has his game evolved to that level? Or is he getting lucky? Miller has 38 goals since 15-16, while his Corsica.Hockey expected goal total is a "mere" 23.17 goals, meaning he's outpacing his expected total by roughly 15 goals. In this sample (last two seasons), JT Miller is 13th in the league for goals scored above expected. The 12 names above him? Marchand, P. Kane, Tarasenko, Crosby, Burns, Ovechkin, Stamkos, Panarin, Hoffman, Kucherov, Weber, and Scheifele. I think even Rangers fans can admit that this is a group JT Miller likely does not belong with. Which is no shot at JT Miller the hockey player, but these are the NHL's elite goal scorers.
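The arithmetic behind those figures is easy to check. This is just a sketch using the goal and shot counts quoted above; the helper name `shooting_pct` is made up for illustration.

```python
def shooting_pct(goals, shots):
    """Shooting percentage: goals per 100 shots on goal."""
    return 100 * goals / shots

ahl_pct = shooting_pct(29, 235)   # AHL, 12-13 through 14-15, all situations
nhl_pct = shooting_pct(38, 226)   # NHL, last season plus this season, all situations
above_expected = 38 - 23.17       # goals scored minus Corsica.Hockey expected goals

print(f"AHL: {ahl_pct:.2f}%  NHL: {nhl_pct:.2f}%")
print(f"Goals above expected: {above_expected:.2f}")
```

That is a gap of about four and a half percentage points between his AHL and recent NHL shooting, and 14.83 goals above expected.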

When you couple together the names that JT exists among, and the comparison between his NHL and AHL shooting acumen, one has to wonder if we have enough data on JT to accept that this is his true shooting percentage talent. 

Thus, I've taken the following methodology to determine how many shots it takes for a player to reveal his true shooting percentage.

I'm stealing a page from baseball, as I often do, from this article, which attempted to discover true numbers across varying metrics in baseball: http://www.fangraphs.com/blogs/525600-minutes-how-do-you-measure-a-player-in-a-year/

Following this, I set a few benchmarks (500 shots on goal through 1500 shots on goal) and, for the players who met each requirement, ran split-half correlations between their odd-game and even-game shooting percentages. The author of the article above sought out a correlation of at least 0.7 to determine the feasibility of finding truth in a metric, so I'll do the same. The results are as follows:

[data via custom query in Corsica.Hockey. The sample is only forwards, and 5v5 play to eliminate potential noise from a player receiving PP or PK time one year, but not the next. Data is from 2007-2008 through the 2015-16 season]

Since we were already at a sample size of just 50 at 1100 shots, I decided to just make the jump to 1500 as a maximum test, since we can't really discern any value from 8 players there anyway.

At no point after 500 shots on goal do we see a sample that produces a correlation of 0.7 or greater between odd and even game shooting percentage.
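For anyone who wants to try the split-half test themselves, here is a minimal sketch of the procedure described above. It assumes per-game logs of (shots, goals) for each player, which is not necessarily the shape of the Corsica.Hockey query output. The function names and the toy logs are made up and are not the real data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns NaN when either side has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)

def split_half(game_logs, min_shots):
    """Correlate odd-game and even-game shooting percentage across players.

    game_logs maps player -> [(shots, goals), ...] in game order.
    Returns (correlation, number of qualifying players).
    """
    odd_pct, even_pct = [], []
    for games in game_logs.values():
        if sum(shots for shots, _ in games) < min_shots:
            continue  # player does not meet the shot benchmark
        odd, even = games[0::2], games[1::2]  # games 1, 3, 5... and 2, 4, 6...
        odd_shots = sum(shots for shots, _ in odd)
        even_shots = sum(shots for shots, _ in even)
        if odd_shots == 0 or even_shots == 0:
            continue  # cannot compute a percentage for an empty half
        odd_pct.append(sum(goals for _, goals in odd) / odd_shots)
        even_pct.append(sum(goals for _, goals in even) / even_shots)
    if len(odd_pct) < 2:
        return float("nan"), len(odd_pct)
    return pearson(odd_pct, even_pct), len(odd_pct)

# Tiny made-up logs, only to show the call shape.
toy_logs = {
    "A": [(4, 1), (4, 1)],
    "B": [(4, 2), (4, 2)],
    "C": [(5, 1), (5, 1)],
    "D": [(1, 0), (1, 1)],
}
corr, n = split_half(toy_logs, min_shots=8)
print(f"correlation {corr:.3f} on {n} players")
```

Raising `min_shots` shrinks the qualifying sample, which is the same trade-off that leaves only a handful of players at the highest benchmarks.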

To round this back to the discussion on JT Miller, if we reduce the shot requirement to 225 (JT has 226 over the past two seasons) we get a correlation of 0.517 on a sample of 496 forwards.

This sort of goes against what we already believe. It's hard to imagine that we don't know a player's true shooting percentage after they've put 1000 shots on net in the NHL, and even after running this test, I'm still not convinced that we don't. But I do think it's true that there is no magic number of shots on net at which we can definitively say that a player is an x% shooter. Well, at least not with this methodology.

Gains, Losses, and the Human Response to Risk

From Michael Lewis's new book, The Undoing Project (highly, highly recommend)

Imagine two scenarios with these choices:

Scenario 1: 
Choice 1: $500 in your pocket
Choice 2: 50% chance of $1000 and a 50% chance at $0

In this scenario, it's extremely likely that you are going to opt for choice 1. There is no risk, you get $500 and can walk away.

Scenario 2:
Choice 1: Lose $500
Choice 2: 50% chance of losing $1000 and 50% chance of losing $0

In this scenario, I bet that most people would opt for Choice 2. The risk of losing more is worth the potential chance to lose nothing.

People respond to risk very differently when it involves losses than when it involves gains. With a gain, people are ready to take the sure thing even though they could have risked it for more; with a loss, people would rather gamble and potentially lose more for the chance to lose nothing.
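What makes the flip in preference interesting is that, within each scenario, the two choices have the same expected value. A quick sketch of that arithmetic, with a made-up helper name:

```python
def expected_value(outcomes):
    """Expected value of a list of (probability, payoff) pairs."""
    return sum(prob * payoff for prob, payoff in outcomes)

# Scenario 1: gains
gain_sure = expected_value([(1.0, 500)])
gain_gamble = expected_value([(0.5, 1000), (0.5, 0)])

# Scenario 2: losses
loss_sure = expected_value([(1.0, -500)])
loss_gamble = expected_value([(0.5, -1000), (0.5, 0)])

print(gain_sure, gain_gamble)  # both choices are worth +500 on average
print(loss_sure, loss_gamble)  # both choices are worth -500 on average
```

On average the sure thing and the gamble are worth the same, so the preference for one over the other comes entirely from how people feel about risk in gains versus losses.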

How does this approach work with hockey? Well, take your pick of any coach around the NHL. It's highly likely that this coach has a player that he knows what he's going to get night in and night out. It might not be a very good player by objective standards, but again, the coach knows what he's getting. Now, consider the player who is scratched so this known commodity can play. The scratched player is an unknown. A risk of sorts. The coach doesn't know what he's going to get, so he is less inclined to play him.

Right here, we have the sure-fire $500 gain rather than the potential $1000 gain or risk of $0 gain.

If instead, these coaches flipped the script and found themselves losing $500 a night and they had a 50% chance of losing nothing in the pressbox every night, then maybe they'd be more inclined to flip the lineup and roll the dice.

Home and Away Quality of Competition

Wanted to take a quick look today and see if coaches really are consciously putting certain players on the ice in certain situations. When teams can dictate the matchups on home ice, it is expected to see coaches lean heavily on some d-men versus tough competition, while staying lenient on others.

In order to see if this was an actual phenomenon, I used the xGF quality of competition from Corsica.Hockey to check the home and away usage for d-men during 5v5 play. In order to be featured on the chart, the d-man would have had to play 50 minutes both on home and away ice.
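The filter and comparison described above can be sketched roughly as follows. The field names, the `home_away_gap` helper, and the sample rows are all made up for illustration; the only detail taken from the post is the 50-minute minimum on both home and away ice.

```python
MIN_MINUTES = 50  # minimum 5v5 minutes required both at home and away

def home_away_gap(rows):
    """Home minus away quality of competition for qualifying d-men.

    rows: list of dicts with hypothetical keys
    player, home_toi, away_toi, home_qoc, away_qoc.
    """
    gaps = {}
    for row in rows:
        if row["home_toi"] >= MIN_MINUTES and row["away_toi"] >= MIN_MINUTES:
            gaps[row["player"]] = row["home_qoc"] - row["away_qoc"]
    return gaps

# Made-up rows, not real Corsica.Hockey data.
rows = [
    {"player": "D1", "home_toi": 400, "away_toi": 380, "home_qoc": 50.5, "away_qoc": 49.0},
    {"player": "D2", "home_toi": 300, "away_toi": 40, "home_qoc": 51.0, "away_qoc": 50.0},
    {"player": "D3", "home_toi": 200, "away_toi": 210, "home_qoc": 48.25, "away_qoc": 49.75},
]

gaps = home_away_gap(rows)
for player, gap in gaps.items():
    print(f"{player}: home minus away QoC = {gap:+.2f}")
```

A positive gap means tougher competition at home than on the road, and a gap of zero means home and away competition are equal.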

Here is a link to an Imgur album of all 30 teams. (some charts will feature a running dashed line, this is a line showing where xGF Away = xGF Home)

I have since changed the labels on the plots below. I'm not sure if 'disrespect' was the right term to use for someone facing easy competition on the road - as that may very well be a sign of respect if an opposing coach keeps his toughest matchups away from you.

Some interesting ones below:

Alain Vigneault showing a lot of faith in Nick Holden this year.

Capuano trusts only two d-men with tough assignments.

Barry Trotz is a man who has set pairings with set roles

Mike Sullivan has studied Barry Trotz. Set pairings, set roles.

Duncan Keith is deployed to score on home ice

Tyler Myers is the most egregious example I could find of a player who gets top competition at home and very easy competition on the road.

McQuaid and Miller, a tale of two defensemen

Dylan McIlrath - A Tragic Misuse of Assets

It has been reported in the New York Daily News that Dylan McIlrath has been told, before the Rangers step onto the ice to begin their season, that he is not in the plans for the top-6 for the Rangers. This makes no objective sense. 

As the rumors swirl that the Rangers are exploring trading McIlrath, well, chalk that up to a moment in time where a GM sold at rock-bottom value (see: Yakupov, Nail). And it's hard to believe, really. The Rangers drafted McIlrath 6 years ago now. Six years of investment into the kid. Battling through knee injuries, and finally getting his shot last season, McIlrath, when given consistent playing time, proved that at a minimum he could take third-pairing assignments with the potential to see even more in the future.

Any team would likely be thrilled that their investment was paying off. Not the Rangers. Not Alain Vigneault.

Meanwhile, making even less sense, Dan Girardi has been practicing on a pairing with Ryan McDonagh this week, showing that the Rangers will be "kicking the can" (to borrow a phrase) once more.

While Dan Girardi has proved time and time again that he is not a legitimate number 1 pairing defenseman (or 2nd or 3rd pair for that matter currently) the Rangers are more than happy to allow Girardi to bring McDonagh down to his level, while their most promising young right-handed d-man sits in the pressbox. This is all too familiar territory for McIlrath.

We all know how bad Girardi is, but let's see what the Rangers are giving up by allowing McIlrath to sit out.

stats provided by Corsica.Hockey

First, I think it's worthwhile to look at the ranks among Rangers D last season for relative shot attempt metrics, and time on ice per game:

There are certainly some sample size issues here, but there are takeaways.

AV has clearly never trusted McIlrath, and he made that plain last season, when McIlrath averaged just 12:48 of 5v5 ice time per game. Skjei, the d-man who received the 7th-most TOI per game played, averaged 14 minutes a game.

Another item that immediately jumps out is the relative goals for percentage that ranks first on the team for Dylan at +14.15%. Not only did this number lead the Rangers, but this was 7th in the league among defensemen with 400 or more minutes played during 5v5 play.

We can trace a lot of McIlrath's struggles on the relative scoring chances for % metric by utilizing Corsica.Hockey's combos tool. When partnered with Dan Boyle, a truly atrocious pairing that was used by AV, McIlrath saw a relative scoring chances for % of -14.38%. This pairing was responsible for 12.15 scoring chances against per 60. Compare this to the pairing of Yandle-McIlrath, which allowed only 7.49 scoring chances against per 60, and to the three pairings in which Girardi skated more than 50 minutes last season (with McDonagh, Yandle, and Staal), which allowed 11.07, 11.91, and 12.74 scoring chances against per 60 respectively.

Presented for analysis are the impact stats when Girardi and McIlrath were each paired with Yandle.

Impact stats are a neat way of rolling up the per 60 relative metrics for and against into a net number. We do this with the formula: (Metric for per 60 minus metric against per 60). We use subtraction here because having a negative metric against per 60 number is a good thing. 
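As a quick illustration of that formula, here is a small sketch. The function name and the input numbers are made up and are not actual values for any pairing.

```python
def impact(rel_for_per60, rel_against_per60):
    """Net impact: relative metric for per 60 minus relative metric against per 60."""
    return rel_for_per60 - rel_against_per60

# Hypothetical pairing that adds offense and suppresses chances against.
good_pair = impact(2.0, -1.5)   # 2.0 - (-1.5) = 3.5

# Hypothetical pairing that loses offense and gives up more against.
bad_pair = impact(-1.0, 2.5)    # -1.0 - 2.5 = -3.5

print(good_pair, bad_pair)
```

Because a negative relative against number is subtracted, suppressing the opponent's chances adds to the impact score rather than taking away from it.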

Maybe the most important point here is that when paired with the Rangers' best offensive defenseman last season, McIlrath allowed Yandle to keep pushing a positive gain for the Rangers, while Girardi was a complete drag on Yandle, eliminating his offense.

We can also take a look at the ranking of all Rangers combos that played more than 50 minutes together last season during 5v5 play, and their shot attempt impact (best for small sample sizes as we are dealing with)

Yes, that is Dan Girardi's three consistent partners being pile-driven to the bottom of the chart when paired with him. And while McIlrath has two partners with negative impacts, he saw his most consistent time, and played his best hockey, with Keith Yandle. Is that a testament to Yandle? Absolutely, but we have a clear example of what having a bad partner can do to you in the Yandle and Girardi pairing.

Everyone who has watched McIlrath play also knows that he has quite a shot from the point, and the key to this skill is that McIlrath is not afraid to let shots go. Last season, McIlrath recorded 9.32 shot attempts per 60, the 2nd most on the team behind Yandle.

Finally, via Hockeyviz.com, I want to take a look at McIlrath's overview card & three-year overview.

What we see here is a player who, when given consistent playing time with the same partner (that chunk of time between games 20 and 40 with Yandle) is a player that can drive shots and goals for his team. As the season wore on, his partners were jumbled, his ice time fluctuated, and he spent a ton of development time watching instead of playing.

 

On this, we see a player who hasn't been given a real shot (doesn't even log third-pairing minutes). Despite not being known for his offense, he is still producing at the rate of a third-pairing d-man. And finally, the relative to teammate shot and goal generation. 

It's obvious that Vigneault trusts the likes of Girardi and Nick Holden more than McIlrath. Perhaps in their veteran state, they have earned that. How many games will it take for Dylan to get his shot? Will he ever get his shot on this team?

McIlrath, with his play last season, has earned the right to prove he is a steady third-pairing defenseman, and he should get that shot.