Sunday, 14 December 2014

How Long Does It Take For A Forward's Shooting To Stabilize?


If a player scores one goal on five shots, does that mean they are suddenly a 20% shooter? What if it is 10 goals on 50 shots? How about 20 goals on 100 shots? This is a classic issue of sample size in trying to separate the signal (talent) from the noise (randomness). That issue being, how big does a sample need to be before it stops being small? The question has been tackled before in other sports, see baseball here and basketball here, and my analysis here will mirror a lot of the methodology laid out in those pieces. 

Relating the problem to a player's shooting talent, how many shots does a player need to take before we can separate the talent from the randomness?

Now if you don't care about the math then please skip to the *** for the answer and analysis. 

Otherwise, let's dive in!

The most common method used for testing this problem is typically split-half reliability testing. For example, if we were wondering how stable a player's shooting percentage is after 100 shots we would label each shot from 1-100 and then randomly split these 100 shots into two random 50 shot samples. We would then compare the player's shooting percentage between these two samples. This method is fine but it can be improved upon in our case by using the Kuder-Richardson Formula 21 (KR-21).

This formula tells us the reliability of a test involving binary outcomes (two results), which suits our case nicely since a shot has only two possible results: a save or a goal. The KR-21 formula lets us perform a split-half reliability test, but instead of comparing only one type of split it effectively compares every possible combination of outcomes. For example, in a basic split-half reliability test on a total sample of 10 shots (each labelled 1 through 10), a simple method would be to compare all the even-numbered shots with the odd-numbered shots. The KR-21 formula goes further and considers every type of split (ex. evens vs. odds, 1-5 vs. 6-10, 1 2 3 9 10 vs. 4 5 6 7 8, etc.). The result of this 10 shot KR-21 test is a much better estimate of how reliable an indicator of a player's true talent level a stat will be over a 5 shot sample (10 divided by 2 = 5). 
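As a rough illustration of the formula itself (not my actual code), KR-21 needs only the number of shots per player and each player's goal total. The player data below is entirely made up:

```python
from statistics import mean, pvariance

def kr21(k, totals):
    """Kuder-Richardson Formula 21.

    k      -- number of binary items (shots) taken by each player
    totals -- each player's total score (goals) over those k shots
    """
    m = mean(totals)       # mean goal total across players
    v = pvariance(totals)  # variance of the goal totals
    return (k / (k - 1)) * (1 - (m * (k - m)) / (k * v))

# Hypothetical goal totals for ten players over k = 10 shots each
totals = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(round(kr21(10, totals), 3))  # → 0.778
```

The more the players' totals spread out relative to what coin-flipping would produce, the higher the reliability comes out.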

Our goal is to reach a reliability of 0.707, at which point the signal (skill/talent) begins to overtake the noise (randomness/luck) in our sample (0.707 x 0.707 = 50%). Below I have charted shots against their reliability to show how the reliability changes as the sample grows. The blue line shows the logarithmic curve of reliability (which fit the data points with an R-squared of 0.99626), which I used instead of simply plotting a basic curve through the raw points. I used the log curve because, as you might notice in the table below, I got a tad lazy and stopped running the numbers as frequently for bigger samples; the logarithmic curve shows the relationship just as well. The red line shows the 0.707 cut-off where talent begins to overtake the randomness. Above the red line = good, below the red line = not good.


***


I found that after about 223 shots the reliability will cross the 0.707 threshold.





Shots Reliability Signal (Talent) Noise (Luck)
25 0.169 2.8% 97.2%
50 0.317 10.0% 90.0%
75 0.410 16.8% 83.2%
100 0.493 24.3% 75.7%
125 0.560 31.4% 68.6%
150 0.604 36.5% 63.5%
175 0.656 43.0% 57.0%
200 0.677 45.9% 54.1%
212.5 0.693 48.0% 52.0%
217.5 0.704 49.5% 50.5%
222.5 0.707 50.0% 50.0%
225 0.712 50.6% 49.4%
250 0.732 53.6% 46.4%
300 0.765 58.5% 41.5%
375 0.805 64.9% 35.1%
500 0.891 79.5% 20.5%
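For anyone who wants to reproduce the curve-fitting step, here is a minimal sketch that fits the logarithmic curve to (shots, reliability) pairs from the table above and solves for the 0.707 crossing. It uses only a subset of the table's rows, so the crossing lands in the low 230s rather than exactly 223:

```python
import math

# (shots, reliability) pairs taken from the table above
data = [(25, .169), (50, .317), (75, .410), (100, .493), (125, .560),
        (150, .604), (175, .656), (200, .677), (250, .732), (300, .765),
        (375, .805), (500, .891)]

# Least-squares fit of reliability = a * ln(shots) + b
xs = [math.log(s) for s, _ in data]
ys = [r for _, r in data]
n = len(data)
xbar, ybar = sum(xs) / n, sum(ys) / n
a = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
    sum((x - xbar) ** 2 for x in xs)
b = ybar - a * xbar

# Solve a * ln(shots) + b = 0.707 for the 50/50 signal-vs-noise point
threshold = math.exp((0.707 - b) / a)
print(round(threshold))  # in the neighbourhood of 230 shots

# Signal share at any reliability r is r squared: 0.707 ** 2 ≈ 50%
```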

We now know that at 223 shots a player's shooting percentage is about 50% skill and 50% luck, which still leaves a lot of noise. We have to get to about 400 shots before we really see a player's talent begin to shine through. This once again demonstrates how easy it is to be fooled by small sample sizes. While 223 may seem like a reasonable number, it should be noted that only 40 players last season (2013-2014), or just over 6% of the entire league, recorded more than 223 shots. Alexander Ovechkin led the league with 386 total shots (along with a 13.6% shooting percentage), and even that only gives us a signal strength of about 65%. 

This isn't necessarily meant to be predictive. That is to say, just because John shot 9% over 223 shots doesn't mean we should expect John to shoot 9% over his next 223 shots. If John shoots 17% over his next 50 shots, did he suddenly become a better shooter? Probably not. However, if John shoots 12% over his next 223 shots, a case can actually be made that he may have improved his actual shooting talent. 

This all goes to show that it takes quite a bit of time for a player's shooting percentage to stabilize. Many are quick to jump to conclusions about a player's actual ability based on a single season, which, as we can see here, rarely makes sense when the vast majority of the league will have taken so few shots that separating the signal from the noise is incredibly difficult. There is definitely talent at the heart of a player's ability to score goals; it just takes some time for that talent to truly become evident.



Thursday, 27 November 2014

Corsi Against Doesn't Correlate with Save Percentage


How does a goalie's workload affect their ability to perform? This question always seems to be bouncing around, and recently it has come up again with regards to whether a goalie's workload (the amount of Corsi events they face) has a tangible impact on their save percentage.


Previous Literature 


The first analysis was done by Brodeur Is a Fraud and found little to no evidence of a correlation between the two variables. Another look was done over at Hockey-Graphs and found similar results with a different method:
For the forty active goaltenders to play at least one hundred NHL games over the past four seasons, there is no substantial relationship in them playing better -in terms of save percentage- when facing more or less shots against.
Chris Boyle, in his own study at Sportsnet, did seem to find a quite strong relationship, yet I have some serious doubts about the validity of his methodology. Essentially, by looking at the raw shot counts and save percentages posted in individual games while removing goalies who didn't play the full game, you end up with a very serious issue of survivorship bias. Why do goalies in this study who see a large number of shots against only post high save percentages? Most likely it is because if a goalie faces a large number of shots and doesn't post a high save percentage, they allow a large number of goals, get pulled from the game, and are therefore removed from the study. This removal doesn't happen for goalies who face a low number of shots while posting a low save percentage, because they can still allow only a low number of goals against, giving their coach no incentive to pull them. For example, a goalie faces 20 shots against and lets 2 in. That's a .900 save percentage, which in the big picture isn't good, but in an individual game allowing only two goals against is just fine. Therein lie my issues with this study.

Finally, we arrive at the most recent post by David Johnson at Hockey Analysis who can summarize his own methods best:
In my opinion, the proper way to answer the question of whether shot volume leads to higher save percentages is to look at how individual goalies save percentages have varied from year to year in relation to how their CA60 has varied from year to year. To do this I looked at the past 7 seasons of data and selected all goalie seasons where the goalie played at least 1500 minutes of 5v5 ice time. I then selected all goalies who have had at least 5 such seasons. There were 23 such goalies. I then took their 5-7 years worth of CA60 and save % stats and calculated a correlation between them. 
Basically, he found the individual correlations for each goaltender and then averaged them. I noticed a few issues, starting with the fact that correlation coefficients aren't additive. You first need to convert them to Fisher z-values, which are additive. This issue is minor; when I ran his test again, the results didn't change much with this adjustment.
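For the curious, the Fisher z adjustment is a one-liner in each direction: transform each r with atanh, average the z-values, then map back with tanh. A minimal sketch (the correlations here are made up, not the study's):

```python
import math

def average_correlations(rs):
    """Average correlation coefficients via the Fisher z-transform.

    Correlations aren't additive, so convert each r to z = atanh(r),
    average the z values, then map the average back with tanh.
    """
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

# Illustrative per-goalie correlations (made up)
print(round(average_correlations([0.8, 0.2]), 3))  # → 0.572, vs naive mean 0.5
```

Note the Fisher-averaged value differs slightly from the naive mean, and the gap grows as the correlations get stronger.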

The second issue I take is with the claims made based on this study, starting with the use of the word "boost" in the title. It implies not only causation, which I am not convinced of (we simply see a correlation via his methodology), but also a uniformly positive correlation, meaning that an increase in CA/60 results in an increase in SV%. Examine the data closer and you find that 8 of the 23 goalies saw the inverse effect (more shot-attempts against lowered their SV%), while another two saw essentially zero change in SV% in relation to the shot-attempts they faced. This leaves us with only 13 goalies showing a positive correlation. Hence my issue with the author making a general claim about CA/60 boosting save percentage as a uniform result that can be applied across the board to all goalies, when he is really only talking about a specific subgroup. Later in this post I will lay out my doubts about his methods and why I believe he simply found a false positive for a relationship that doesn't exist. 

My Findings


I tweeted this graph out earlier, when this question was first raised on Twitter. It is a very basic graph that took me a few minutes to put together, but you can see that the impact of a team allowing more shot-attempts against on its save percentage is essentially zero.


These next few charts look at the individual goalie level. I set different cut-offs in each graph to see if we could weed out some goalie talent, since better goalies tend to play more minutes (unless your team is located in Winnipeg), and we still aren't able to find any strong evidence (the correlation does actually increase as we narrow the sample, jumping from about 0 to 3%). 


This graph below is the same as the ones above but only using the data included in the Hockey Analysis study.



Since none of the graphs I managed to produce were able to find any correlation I decided to try my own blind recreation of the method used at Hockey Analysis. Below are two graphs very similar to the graphs first produced at Hockey Analysis that seemed to demonstrate the correlation between CA/60 and SV%. I have removed the titles of these two to add an element of surprise. Take a quick look at both before finding their titles below. 



***

***

***

Surprised? This is my basic way of suggesting that the results shown in Hockey Analysis' study could be the product of simple random variation. Pekka Rinne's chart shows how one of these samples can be pretty much out of whack on the individual level, while the Niemi vs. Howard chart shows that even when picking two variables we know for a fact should have zero correlation with each other, when dealing with such small samples, in this case only 5 seasons (or data points), it can be pretty easy to discover a relationship that doesn't actually exist.
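To put a rough number on how easy that is, here is a quick simulation of my own (a sketch, not part of either study): pairs of completely independent five-season "goalies" still produce large correlations surprisingly often:

```python
import random
import statistics

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) *
           sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

random.seed(42)
trials = 2000
big = 0
for _ in range(trials):
    # Two independent "goalies": 5 seasons of pure noise each
    a = [random.gauss(0, 1) for _ in range(5)]
    b = [random.gauss(0, 1) for _ in range(5)]
    if abs(pearson(a, b)) > 0.7:
        big += 1

print(f"{big / trials:.0%} of independent pairs show |r| > 0.7")
```

With only five data points per pair, somewhere around one in five completely unrelated pairs clears |r| > 0.7 in this simulation, which is exactly the trap the Niemi vs. Howard chart illustrates.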

The chart below shows the data on the correlations found by Hockey Analysis. I took the liberty of converting them to Fisher z-values and then taking the inverse of that, which is the real correlation he was looking for. So in actuality his correlation was higher than he first reported. To make things simpler I have starred* the important column here with the true correlation. 

Average Correlation Average Fisher Average Fisher Inverse*
0.183 0.215 0.212



The issue, as you may have seen above in the Niemi vs. Howard chart, is that it is very easy with this data set and this method to find correlations that we know for a fact shouldn't exist. Below I calculated 23 correlations and their subsequent Fisher values in my blind test. I simply put the goalies in alphabetical order and compared the CA/60 for goalie A with the SV% of goalie B. 



Correlation Fisher
-0.292 -0.301
0.098 0.098
-0.098 -0.098
0.730 0.930
0.631 0.743
-0.407 -0.432
-0.726 -0.919
0.116 0.117
0.536 0.599
0.117 0.118
0.126 0.127
-0.230 -0.234
-0.131 -0.132
0.338 0.351
0.586 0.671
0.468 0.507
-0.631 -0.744
-0.383 -0.403
-0.616 -0.718
-0.708 -0.882
-0.213 -0.217
-0.095 -0.095
Average Average Fisher
-0.784 -0.916
Fisher Inverse*
-0.724

We know from common sense and logic that the number of shot-attempts faced by Evgeni Nabokov will have no effect on Henrik Lundqvist's save percentage, but the numbers actually show a correlation (.73). This is obviously a false positive, showing a correlation that doesn't truly exist. Simply stated, correlation doesn't always imply causation. Based on what I have found here and the earlier research done on the subject, I feel confident in stating there is still little to no evidence relating the Corsi a goaltender faces and their save percentage.



You can reach me via email here: DTMAboutHeart@gmail.com or via Twitter here: @DTMAboutHeart







Tuesday, 11 November 2014

NHL Draft Pick Value Chart


Drafts have always been a mystery in the sporting world. The number of teams relying on the draft to build their rosters continues to rise in an era of delicate salary caps and bigger, stronger, faster athletes. Evaluating and projecting young athletes is far from an exact science, to say the least. Look back at the 2007 NHL Entry Draft, when the Pittsburgh Penguins selected Angelo Esposito 20th overall while the Dallas Stars were able to pick up future captain Jamie Benn with the 129th pick in the 5th round. In hindsight the mistakes seem obvious, but this is hardly the standard; as you can see in the graph below, earlier picks tend to yield much higher success rates than later selections.


Goalies, as has been well documented in the past, are slightly less predictable, to say the least...


But what is each draft slot actually worth? Nailing down the value of a draft slot in the NHL has been attempted many, many, many, many, many, many times. I decided that it was time to reevaluate the idea from a slightly different approach than most.

In order to come up with my values, I gathered every draft pick going back to 1970 (when the draft really started to resemble what it is today) and looked at each player's Point Shares during their first seven seasons in the NHL only. I fully recognize that catch-all statistics are not perfect evaluations of a player, but they are probably the best available statistics for judging large numbers of players throughout history. I chose Point Shares over GVT mainly because Point Shares cannot be negative; GVT, on the other hand, can be, which causes difficulties when comparing certain players. For example, how do you value a player who makes the NHL and records a negative GVT against a player who never played an NHL game and therefore has a GVT of zero? Should the first player be counted less even though many would argue he was probably the better hockey player? It is a tough question, but thankfully Point Shares doesn't share this issue.

Looking at only a player's first seven years rather than their full career reflects the fact that when a team drafts a player they are guaranteed at most 7 years of that player's services before he hits free agency (3 years from the rookie contract and then 4 years of RFA rights). I then fitted this data with a logarithmic curve to smooth it, showing the sharp drop-off in value over the first picks followed by a more gradual decline for the later picks.
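To give a feel for how such a chart gets used (with made-up curve parameters, not my actual fitted values), comparing trade packages is just a matter of summing slot values:

```python
import math

# Hypothetical parameters for the logarithmic value curve; the real
# fit used first-seven-season Point Shares for every pick since 1970
A, B = 25.0, 3.5

def pick_value(pick):
    """Illustrative value of a draft slot: A - B * ln(pick), floored at 0."""
    return max(A - B * math.log(pick), 0.0)

# Compare trade packages by summing slot values: 8th overall
# straight up versus 17th plus 40th
offer_one = pick_value(8)
offer_two = pick_value(17) + pick_value(40)
print(round(offer_one, 1), round(offer_two, 1))  # → 17.7 27.2
```

With these placeholder parameters the two later picks outweigh the single earlier one; whether that holds in reality depends entirely on how steeply the fitted curve actually drops.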

In the future I hope to replicate and build on Eric T.'s work found here regarding the market value of a draft pick. Whereas my values were based on draft results, Eric based his on the market rate as determined by team trades. Comparing the two methods could provide some insight into which spots in the draft might be over- or undervalued by teams relative to their actual expected value.

Below is the grid for comparing the individual value of each pick. A reminder that these values are arbitrary numbers and should only be used to compare draft slots, not any players involved in a potential trade. This is an approximation based on many years of data and in no way a hard rule of how every pick should be valued. Enjoy!




Wednesday, 5 November 2014

Normalized Career Player Stats



Who is the greatest goal-scorer of all time? What about playmaker? Hockey, like all sports, has evolved so much over the years that it is extremely hard to compare individuals across eras. With the help of Rob Vollman's database and Hockey-Reference's Normalized Data, I have compared the careers of every player over the past 47 years (since the 1967 expansion) to help shed some more light on these debates.

The Normalized Data is presented just like any player stats, except the numbers are scaled to reflect certain changes throughout the league's history. The most common adjustments account for different schedule lengths, the number of players carried on each roster, and the scoring environment of each era (ex. it was easier to score a goal in 1981 than it is in 2014).
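The era adjustment itself is simple proportional scaling. A minimal sketch, using an assumed baseline scoring environment rather than Hockey-Reference's exact constants:

```python
def era_adjusted_goals(goals, season_gpg, baseline_gpg=5.5):
    """Deflate or inflate a goal total to a common scoring environment.

    season_gpg   -- league average goals per game (both teams) that season
    baseline_gpg -- the environment to normalize to (an assumed value here)
    """
    return goals * (baseline_gpg / season_gpg)

# The early 1980s ran around 8.0 combined goals per game, so a 60-goal
# season then is worth far fewer goals in a 5.5 goals-per-game league
print(round(era_adjusted_goals(60, season_gpg=8.0)))  # → 41
```

Schedule-length and roster-size adjustments work the same way: multiply by the ratio of the baseline environment to the season's actual environment.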

You can filter and sort this table at your own discretion and pleasure, enjoy!


*Players needed at least 300 Games Played by the end of the 2013-2014 season to qualify
**Even if a player's career began before 1967 this chart will only reflect their stats since 1967

Observations

  • The data is obviously skewed towards players whose careers have yet to end. It is extremely hard to maintain high levels of play throughout your entire career which is why active players still in or near their primes will see their stats slightly inflated. 
  • Bobby Orr was amazing. He absolutely dominated the game from an offensive stand point that we will probably never see again. 
  • Sidney Crosby is the greatest player alive and one of the best ever.
  • Ovechkin is probably one of the greatest goal scorers ever to lace up the skates. It still amazes me how much garbage is thrown Ovechkin's way by people who have a seriously flawed understanding of the game of hockey or are simply trying to grab a headline. Ovechkin is one of, if not the, most lethal goal scorers of the league's past half-century, and we should all just appreciate the opportunity to bear witness. 
  • Kovalchuk's stats will forever be skewed by the fact that he played out his best years in the NHL before bolting to the KHL, which essentially ensures that his career rate stats will never suffer as he ages. He did have a great run though, while it lasted. 
  • Lemieux and Gretzky come down to the wire here. Lemieux has the better era-adjusted PTS/Game due to his big years occurring in the 90s as opposed to Gretzky who succeeded in the high flying 80s. Gretzky however, played about 500 more games which has to be considered as a positive when considering the two.
  • Jagr is ageless. He keeps on clicking at a ridiculous rate despite taking 3 years off to play in Europe only to come back and put up unheard of numbers for a player older than 40.
  • Cam Janssen just nudges out Colton Orr for worst PTS/Game of any regular forward in the last half decade. Likewise, Wade Belak takes home the title of least offensive defenceman of the modern era.

Friday, 3 October 2014

Maggie Projections 2014-2015



The start of the new NHL season is right around the corner, and with that I present the first instalment of the Maggie Projections for the 2014-2015 season. Essentially these are projections for the upcoming NHL season based on the system developed by Tom Tango for baseball about a decade ago.
It is the most basic forecasting system you can have, that uses as little intelligence as possible. So, that's the allusion to the monkey. It uses 3 years of MLB data, with the most recent data weighted heavier. It regresses towards the mean. And it has an age factor.
Tango named his system Marcel, after Marcel the monkey, on the idea that the projections are so basic a monkey could do them. To avoid any potential confusion and add a little hockey flavour, I have affectionately named these projections after Maggie the Monkey. A quick history lesson for those who may not know: Maggie the Monkey was a recurring guest on TSN during the playoffs who would spin a giant wheel to predict playoff rounds. I think it was a brilliant display of the randomness of hockey and the unpredictability of the small-sample Stanley Cup Playoffs. Her record was pretty impressive all things considered (I must remind you, it's a monkey spinning a giant wheel): she was 50% for her career and 53.33% before her tough last season.
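For anyone curious what a Marcel-style system looks like mechanically, here is a toy sketch of the three ingredients. The weights, ballast and age factor below are illustrative assumptions, not Tango's published constants or the ones I actually used:

```python
def maggie_points_per_game(ppg_by_year, gp_by_year, age,
                           league_ppg=0.5, weights=(5, 4, 3), ballast=120):
    """Toy Marcel-style projection of points per game.

    ppg_by_year / gp_by_year -- last three seasons, most recent first
    league_ppg, weights, ballast -- all placeholder assumptions
    """
    # 1. Weight recent seasons more heavily
    w_pts = sum(w * p * g for w, p, g in zip(weights, ppg_by_year, gp_by_year))
    w_gp = sum(w * g for w, g in zip(weights, gp_by_year))
    # 2. Regress toward the league mean by adding "ballast" league-average games
    rate = (w_pts + ballast * league_ppg) / (w_gp + ballast)
    # 3. Nudge for age: improve toward a peak of 27, decline afterwards
    rate *= 1 + 0.004 * (27 - age)
    return rate

proj = maggie_points_per_game([1.0, 0.9, 0.8], [80, 78, 82], age=25)
print(round(proj, 3))  # → 0.877
```

Note how the projection lands below the player's most recent 1.0 PPG: the older, weaker seasons and the regression to the mean both pull it down, while the age factor nudges it back up.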

I may look to improve on these projections at a later date by adding new stats and making some tweaks here and there. A reminder: I do not stand behind these forecasts, as this is essentially one big formula that I have taken and applied to hockey with no subjective input from me at all. (Credit to Rob Vollman's player spreadsheets for my data.)
To save people some time, please use the following format for all complaints:
<player> is clearly ranked <too high/too low> because <reason unrelated to Maggie Projection system>. <subjective ranking system> is way better than this. <unrelated player-supporting or -denigrating comment, preferably with poor spelling and/or chat-acceptable spelling>

So without further ado, here are the Maggie Projections for the 2014-2015 season.


Monday, 22 September 2014

Year to Year Repeatability of New Goalie Stats

For as long as people have been analyzing hockey, goalies in particular have been difficult to pin down. Based on what information has been available to the masses up until now, it has been extremely difficult to get an accurate read on a goalie's abilities until they see a large amount of action. Metrics for evaluating goalie performance to date have ranged from awful (wins and GAA) to informative but unpredictable (SV%). However, a new hockey statistics site called War-On-Ice has released data that further breaks down a goalie's SV% into more detailed categories based on the shot location data provided by the NHL. Their four new stats (plus one more classic stat) are as follows:
  • Low Save Percentage
  • Medium Save Percentage
  • High Save Percentage 
  • Adjusted Save Percentage
  • UnAdjusted Save Percentage (technically not new since it's just basic SV%)
While these new metrics are definitely very descriptive and informative, I really wanted to test their predictive value for goalie projections. That is, if a goalie posted a .900 SV% in a certain category last year, what should we expect them to post next year?
Save Percentage Zones via war-on-ice.com
The picture above shows the zone breakdown for each of these stats:
  • Blue = high percentage shots (SvPctHigh)
  • Red = medium percentage shots (SvPctMedium)
  • Yellow = low-percentage shots (SvPctLow)
Here is the breakdown of how these stats matched up versus one another from one season to the next. (Regression line in red)


Well, that wasn't very informative. We see very little correlation from one season to the next, regardless of whether you treat every shot as equal (Unadjusted SV%) or break each shot down by its general degree of difficulty (Low, Med, High SV%). I had high hopes for Adjusted Save% too (especially after I first ran these numbers and found some interesting results, before realizing I had completely screwed up my data, idiot).

Adjusted Save% is still quite interesting in my eyes as explained in the War-On-Ice Glossary:
AdjustedSvPct: The weighted average of Low, Medium and High Save Percentages, as weighted by the league average frequency of each shot type. Compare to statistical benchmarking -- correcting a simple random sample for known stratification issues.
And from one of the creators, A.C. Thomas:
Here's a table with the results (we want RSQ to be close to 100% and p-value to be less than 0.05):


Type Count Shots Faced AdjustedRSQ RSQ p-value
UnAdjusted 99 >1500 1.72% 0.70% 0.196
Adjusted 99 >750 0.11% 1.14% 0.294
Low 77 >500 2.02% 3.31% 0.1134
Med 80 >300 -0.14% 1.13% 0.3482
High 103 >200 7.81% 8.71% 0.00247

**The Count column is simply the number of observations for each type of Save%. Shots Faced is the arbitrary cut-off point I set. Ex. For a goalie's LowSvPct to be counted, they would have to have faced more than 500 shots from the designated Low area of the ice in back-to-back years. This happened 77 times between 2008 and 2014, or an average of ~15 goalies per pair of years. Also, these stats are even-strength (5v5) only.

The best results here were clearly for SvPctHigh, which covers shots from directly in front of the net (in the slot). While the correlation is still small, it seems better than most results we see when it comes to goalie metrics. So maybe there is a little something to a goalie's ability to defend a likely scoring chance.

Obviously, this method involves some survivorship bias, as goalies who see lots of action in back-to-back years tend to be of a higher caliber to begin with. Maybe I will revisit this later and try to weight goalies by their shots faced instead of simply ignoring those below an arbitrary threshold. Until then, it seems goalies will remain one of hockey's greatest enigmas. 

Thursday, 21 August 2014

Drafting Strategies - Reaches vs. Fallers



Yes, I am aware this isn't the timeliest of articles, but please bear with me. If you have ever watched a televised professional sports draft, you will probably have noticed that there are always players who seem to be "reaches" and players who seem to "fall" on draft day. What makes a player a "reach"? A reach would be a player like Derrick Pouliot in the 2012 NHL Entry Draft. Pouliot was ranked 17th in TSN's rankings, but the Pittsburgh Penguins stepped to the podium and selected him with the 8th pick. That's a fairly substantial jump. Opposite of that, we have the fallers. What is a faller? Look no further than Teuvo Teravainen from that very same draft. Ranked 7th heading into the draft, Teuvo had to wait until the 18th pick before he was snatched up by the Chicago Blackhawks (all TSN rankings used in this article can be found here).

What I am attempting to do here is find out whether one strategy is really more effective than the other. Should you trust your gut and draft the guy you know everyone else has ranked lower than this spot? Should you be genuinely tempted to grab that player everyone seems to be passing on? At the end of the day I expect each team to stick to their guns and draft the player highest on their own list. But maybe it wouldn't hurt for teams to look at what other people think and ask themselves, "Are we way off on this?"

Let's pretend for a second that it's the 2005 NHL Entry Draft and you're the San Jose Sharks with the 8th overall pick. You're really high on this winger from the WHL named Devin Setoguchi which is great and all except he is ranked 26th overall (in hindsight you can do a lot worse with a first round pick than Setoguchi but this example still works). But you take a second to look at some other rankings and notice TSN has this Slovenian centre by the name of Anze Kopitar ranked 5th overall, interesting. In the end the Sharks stuck to their gut and drafted Setoguchi, leaving Kopitar to be snatched by the Kings at 11th overall and well the rest is history. This exercise was to help dive into that decision making process and see if history can teach us anything:
The x-axis of this histogram ranges from reaches to fallers (left to right) and gives us a good idea of team drafting patterns. Teams aren't afraid to take slight reaches (as seen by the tallest column sitting slightly to the left of 0) but aren't as eager to pick up players who seem to be falling, as noted by the more spread-out pattern to the right of 0. If you aren't really up to speed with histograms, learn more here.
This looks at total games played by each player picked in the 1st round between 2004-2009 excluding goaltenders. Nothing really new to see here other than of course how many "sure things" never end up cracking the NHL.

Multiple R-squared:  0.0663 - 6.63%
Adjusted R-squared:  0.06057 - 6.06%
p-value: 0.0008414

So the correlation isn't the strongest here (shown by the R^2 value), which I would have guessed just from looking at the scatter plot. Interesting, though, is the p-value being significantly less than 0.05 (you can brush up on p-values here). Essentially, what we can infer from these results is that there isn't a strong correlation (shown by the low R^2); ex. just because you fall X spots doesn't mean you'll play Y NHL games. There is, however, some significant difference between players who fall and players who are reaches.

Of course using simply GP as a measure of a successful draft pick is hardly fair when you simply consider that a player drafted in 2004 will by default have a higher chance of playing more games than a player drafted in 2009. So another method I used to tackle this issue is by dividing the players into separate bins based on the difference between where they were ranked and where they were selected (explained below the table).

Bins Count Average GP % Success Avg Diff
Fallers 1-20% 33 88.64 27.27% 17.42
21-40% 33 250.73 66.67% 4.76
41-60% 33 315.30 81.82% 0.79
61-80% 33 286.82 78.79% -0.70
Reaches 81-100% 33 173.94 57.58% -2.00
Fallers > 5 46 117.91 45.45% 14.39
> 0 and < 5 41 131.83 96.97% 2.34
0 27 348.00 69.70% 0.00
> -5 and < 0 39 222.59 78.79% -2.38
Reaches < -5 12 179.00 21.21% -9.92

  • Bins - I grouped the players based on how high or low they were taken relative to their TSN Ranking
    • 1-20% bin holds the 33 biggest fallers, while the 81%-100% bin holds the 33 biggest reaches
    • The 2nd set of bins holds players based on an arbitrary cutoff point I came up with, ex.  >5 contains the 46 players who were taken at least 6 spots below their TSN Ranking
  • Count - How many players qualify for each bin
  • Average GP - Average NHL Games Played by the players in that bin
  • % Success - # of players to play at least 100 NHL Games in that bin / Total players in that bin
  • Avg Diff - Average difference between where the player was ranked and where they were drafted; + means faller while - means a reach.
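The binning arithmetic itself is straightforward. Here is a sketch using a made-up mini-sample rather than the real draft data:

```python
from statistics import mean

def bin_summary(players, threshold=100):
    """Summarize one bin of drafted players.

    players   -- list of (diff, games_played), where diff = TSN rank
                 minus draft slot (positive = faller, negative = reach)
    threshold -- games played needed to count as a "success"
    Returns (count, average GP, success rate, average diff).
    """
    count = len(players)
    avg_gp = mean(gp for _, gp in players)
    success = sum(gp >= threshold for _, gp in players) / count
    avg_diff = mean(d for d, _ in players)
    return count, round(avg_gp, 1), round(success, 3), round(avg_diff, 2)

# Hypothetical mini-sample: (rank - pick, NHL games played)
fallers = [(8, 450), (12, 30), (6, 210)]
reaches = [(-7, 120), (-9, 0), (-6, 40)]
print(bin_summary(fallers))
print(bin_summary(reaches))
```

The real table above applies the same calculation to the 1st-round picks from 2004-2009, split either into five equal percentile bins or by the fixed ±5-spot cut-offs.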
Some quick observations of the chart: the high % Success of slight fallers in the "> 0 and < 5" bin is probably due to players like Seth Jones, who was ranked 2nd but taken 4th, so while he technically fell in the draft, everyone was pretty sure a player of that caliber would become a successful NHL player. Reaches of 5 spots or more seem to have very suspect returns, with only about 1 of every 5 players turning into a successful NHL player, while the 33 biggest fallers fared only slightly better at roughly a 1 in 4 success rate.

I think, if anything, what this exercise can help show is that maybe consensus rules above all. Thinking of being bold and grabbing the dark horse no one else is even considering? Maybe there is a legitimate reason most aren't considering him. See that hot-shot prospect seemingly passed on by every other squad? Maybe there is a reason for that too. All that said, it doesn't mean you should never stick to your gut. There are guys who fell 6 spots and turned out pretty great (Kopitar, Zajac). Similarly, there are guys who would be considered reaches who turned out pretty well themselves (Karlsson, Eberle, Couture). There is no golden rule when it comes to drafting, but each piece of the puzzle can only help make the picture that much clearer. 

Reach me on Twitter @DTMAboutHeart or email me at DTMAboutHeart@gmail.com

Thursday, 17 July 2014

What in the world are the Colorado Avalanche doing?

Whether it was their enigmatic coach, their rookie superstar or their third-best record in the NHL, the Colorado Avalanche made waves this past season. Looking back on the season, however, it reads more like a cautionary tale than the prelude to good things to come. Objectively, it was pretty obvious that Colorado's success this season was largely a product of a sky-high PDO (3rd in the NHL), which will surely regress next season (for starters, Varlamov might be an above-league-average goalie, but I am willing to wager a lot of money with anyone who believes he will be a Vezina finalist again next season). Trouble continues to lurk below the surface in the form of an absolutely horrendous possession team (27th in the NHL). So while nothing is set in stone and Colorado could obviously be a better team next year... teams who have a glimpse of success despite terrible underlying numbers don't exactly have the brightest track record in their future endeavours (ex. 2013-2014 Toronto Maple Leafs & 2011-2012 Minnesota Wild).
“Those who cannot remember the past are condemned to repeat it” - George Santayana
So now that we have established that it is probably in the Avs' best interests to make some improvements this offseason, let's see what they have done...

The Offseason



  • Paul Stastny - Lost/Cost: 4 years x $7 mil (which they didn't pay in this case). Impact: lost their 2nd-leading points-per-game player (0.85 ppg) and their best possession forward, who did so while playing tougher minutes than franchise cornerstones Duchene and MacKinnon.
  • Jesse Winchester - Cost: 2 years x $900k. Impact: unnecessarily committed 2 years to a borderline depth forward.
  • Zach Redmond - Cost: 2 years x $750k. Impact: see above, except a defenceman.
  • Jarome Iginla - Cost: 3 years x $5.33 mil. Impact: you could talk me into that first year, the second is the cost of doing business, but the third might get ugly.
  • Brad Stuart - Cost: a 2016 2nd-round pick, a 2017 6th-round pick and his contract (1 year x $3.6 mil). Impact: gained a drag on possession despite strong linemates and reasonable usage, who doesn't put up points but will still make $3.6 million; lost a valuable 2nd-round pick.
  • Daniel Briere - Cost: PA Parenteau and a 2015 5th-round pick (same cap hit, though Parenteau has 2 years left vs. Briere's 1). Impact: lost their best scoring winger behind Landeskog (0.6 ppg) and third-best possession player; gained a possession non-factor who has lost his scoring touch (0.36 ppg). Side note: I guess they swapped a two-year contract for a one-year? Sure, let's go with that.

The Ryan O'Reilly Saga



So yeah, about that whole "improve" thing, not exactly off to the hottest start. However, don't fret, Avs fans, you still have a relatively young core with great pieces to build around. Wait, they're trying to force one of their young superstars into a contract that is terribly skewed in the team's favour, based on a shaky set of precedents and flawed logic? Oh good, this will totally end well.

Ryan O'Reilly is a very good NHL hockey player. I just wanted to air that out. He is a disciplined, point-producing, possession-driving dynamo. Basically, you couldn't dream up a better two-way player if he were from Winnipeg, had captained a team to 2 Stanley Cups in 3 years and had been universally proclaimed the greatest leader to walk the earth since Gandhi. But that is neither here nor there.

Quick background before this all seems like déjà vu. O'Reilly was an RFA two years ago and was actually about to sit out and continue playing in Russia before signing an offer sheet (2 years x $5 mil per year) with the Calgary Flames, which the Avs reluctantly matched. Now we are back here again. What's the issue this time?

O'Reilly's camp is probably looking for a big payday, which he undoubtedly deserves. Rumour has it that the Avs have been trying to point at deals like Jamie Benn's and say, "You see these deals that are great for the teams and in hindsight greatly underpay these players? Yeah, you should take one of those." This logic is flawed for a variety of reasons. See this example if it isn't already clear:

You walk into a store and buy a shirt on sale for $10. You then proceed to walk into a different store with no sale going on and try to tell the owner that you should only pay him $10 for his shirt that is priced at $15, because the other guy gave you a deal. Well, good for the other guy, but this is a different store and a different shirt. Just because you get a good deal somewhere doesn't mean that automatically sets the standard for the future.

Also, those deals were signed right after the players came off their Entry Level Contracts, which cannot be ignored. I am sure O'Reilly would have loved to sign for 5 years x $5 million two years ago. Colorado wouldn't offer it. So now, instead of getting him on a contract that covers 4 years at a discount due to him being an RFA, they will have to fork over some considerable money, since they only own his rights for 2 more years. That is the risk they took by not going long-term 2 years ago, as was their right to do.

So even after a good season, it is apparent that Colorado has a lot of room to improve and a very long way to go. It also can't be overstated that I went almost this entire article without even mentioning they might have the worst collective group of defencemen in the NHL. Despite all this, Colorado has managed to possibly become a worse hockey team this offseason, while at the same time pushing one of their young core pieces out the door because they don't grasp the concept of paying their players what they can reasonably request (you could argue this is the same misaligned judgement that resulted in them letting Paul Stastny walk for nothing in return). Instead they are trying to enforce some distorted notion that high-end NHL players should play for less than they are worth, because what? They are legitimate cup contenders while playing in potentially the toughest division in the NHL? Hey, all the power to you. Needless to say, I can't wait for next season, when this becomes an every-night occurrence. Or will it be in two years, when they try to sign Nathan MacKinnon for 8 years x $2.5 mil? Keep selling that pipe dream, Avs.


Reach me on Twitter @DTMAboutHeart or email me at DTMAboutHeart@gmail.com

Tuesday, 24 June 2014

PK Subban's Next Contract


All eyes are on the Montreal Canadiens and their superstar defenceman PK Subban this summer to see what will come of this round of contract negotiations. A lot has been made of what Subban's price will eventually be when all is said and done, but many agree we will be looking at a relatively large number.

Let's hop in George Michael's sports machine and travel back to 2012. Subban is coming off his rookie deal and needs to negotiate a new contract with Montreal. Based on strong speculation, it is believed that Subban and his agents thought 5 years at a $5 million cap hit was fair market value for his services. The Canadiens' astute GM, Marc Bergevin, decided that he would not be pushed around by some hot-shot RFA, so he stiff-armed the young star by forcing him into a 2 year x $2.875 million cap hit bridge deal. The Canadiens bet against their young defenceman and are now about to pay the price.

In the past two seasons the hockey world has witnessed Subban's stock as a hockey player rise astronomically. He cracked the strongest and deepest group of defencemen assembled over the past two years on Team Canada, and only couldn't crack the starting line-up due to the obscene theory that professional superstar hockey players seemingly crumble if forced to play on the side of the ice that doesn't correlate with the way they hold their stick. One of those two bridge seasons was also capped off by a Norris Trophy, marking him as the league's best defenceman that season, which pretty much speaks for itself when you consider who he joins on the list of previous winners. So let's start the process of nailing down a price tag.

Andrei Markov

Andrei Markov, PK Subban's most common defence partner, recently inked a new 3 year deal with the Montreal Canadiens worth $5.75 million per season. This deal can make us very confident that Subban should exceed that cap hit fairly comfortably, due to the fact that 1) he is much younger, 2) he makes Markov much better when they are on the ice together and 3) he is appreciably better than Markov in just about every facet of the game (no slight to Markov).

Defensive Issues

This is a slight tangent against people who swear by the "eye test", but it is very important when considering a player with a well-defined reputation. Subban's reputation is wrongfully that of a player with defensive issues, the root of which I think is summed up brilliantly in this post by Tyler Dellow:
If someone asked me what I think the biggest failing of the eyeball test is, I’d respond that it’s the emphasis on the big mistake. There are gigabytes of information contained in a hockey game. So much information that I think it’s difficult for anyone to take it in and organize it rationally. The way that our brains deal with that is by focusing on the big mistake.
What is the big mistake? The big mistake is the play that leads to a goal against. When we see a player who’s made a bunch of big mistakes in a row, we get down on him. 
Subban falls victim to this way of thinking via the created perception that he is a high-risk, high-reward player and therefore a defensive liability; this couldn't be further from the truth. He does make his share of mistakes and defensive lapses, but they are not substantially more frequent or costly than those of any other elite defenceman. Check out these clips of three high-end NHL defencemen who all made costly defensive lapses, yet seemingly never received the same depth of criticism.







Comparables

So now let's break down the statistics a little further and try to get a better look at how Subban stacks up versus other elite defencemen. Here I have compiled the top 10 highest-paid defencemen (pre-Markov signing) in the NHL, along with 2 players who are in the top 20 but who signed their deals right after their Entry Level Contracts expired (Ekman-Larsson and Myers).



Subban stacks up pretty favourably with all of his future cap comparables. Focusing first on his usage, we can see that at even-strength Subban was given about average zone-starts for what we tend to expect from a defenceman of his calibre, which lets us reasonably assume his stats aren't due to sheltered usage (far right of the graph) or extremely tough playing conditions (far left). We can also see that he occupies an absurd amount of time on Montreal's top powerplay unit, which should really come as no surprise when you take his blistering shot and offensive instincts into account. Also, you might note his minuscule usage during shorthanded situations, which I would chalk up to the wrongful perception that Subban can't play defence (see above) and newly extended Canadiens coach Michel Therrien's wacky personnel decisions (see Douglas Murray being his 2nd most used defenceman on the penalty kill).

Subban falls in the midrange for the majority of these categories, which is no slight against him; it shows that he clearly belongs in the conversation to be one of the sport's best paid. His most impressive feat, though, is probably leading this group in CorsiRel%. Subban is essentially driving the possession bus for his entire team, raising the play of his teammates and dictating the play when he is on the ice. Needless to say, it is very impressive.

Money

So once again, what should Subban be paid? For starters, I think it is safe to say that the term will be 8 years, since there is really no reason for either party to want anything else. The only other possible scenario would be if Subban wanted just 2 more years so that he could walk as a UFA after that contract, which won't happen; that is really the only conceivable way they wouldn't go the full allowed 8 years.

A general overview of NHL salary structure breaks down as follows. When a player of this caliber enters the league, they typically do so on a 3 year cap-controlled Entry Level Contract. Once that contract expires, the team still controls them as a Restricted Free Agent for the next 4 years. Since a team owns the player's rights for those 4 years, it has the leverage to negotiate a discounted rate. After this term a player becomes an Unrestricted Free Agent and can be offered a contract by any NHL team, effectively driving the price for their services up to full market value. This methodology can help explain, in part, how Drew Doughty and Dion Phaneuf can have seemingly nearly identical contracts: Doughty's includes 4 RFA years that can be negotiated below market value, while Phaneuf's entire 7 years are UFA years, which cost more as a typical rule of thumb.

Looking at Subban's situation, we see that on a potential 8 year contract, only 2 of those years could be had at a discounted RFA rate. It's tough to estimate what these 2 years of Subban's services would cost, but I will put it in the $7.5 million per year range, which is what Shea Weber's cap hit was determined to be by an independent arbitrator. The trick then becomes what to make of his 6 remaining UFA years, since the two biggest cap hits among defencemen also come on contracts with absurd lengths to offset the total cost (Weber and Suter). Combine all of this with a rising salary cap in the upcoming seasons (I have no clue what the rate of growth will be, which is why I didn't try to use it specifically in this determination) and Subban has a strong case to become the highest-paid defenceman in the game.
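To make the arithmetic concrete, here is the RFA/UFA blend as a quick sketch. The $7.5 million RFA rate is the Weber arbitration figure cited above; the per-year UFA rates are assumptions of mine chosen purely to show how the blended cap hit moves.

```python
# Blended cap hit = weighted average of discounted RFA years and
# full-market UFA years over the whole contract.

def blended_cap_hit(rfa_years, rfa_rate, ufa_years, ufa_rate):
    total = rfa_years * rfa_rate + ufa_years * ufa_rate
    return total / (rfa_years + ufa_years)

# 2 RFA years at $7.5M plus 6 UFA years at an assumed $8.5M-$9.0M
low = blended_cap_hit(2, 7.5, 6, 8.5)    # 8.25
high = blended_cap_hit(2, 7.5, 6, 9.0)   # 8.625
print(f"${low}M - ${high}M per year")
```

Note that assumed UFA rates of $8.5-9.0 million land the blended hit right in the $8.25-8.625 million range.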

Everyone will have a different idea of what Subban deserves and that's fine; it is truly difficult to nail down an exact idea of a player's worth. As of now though, my best estimation of fair market value for PK Subban would be 8 years x $8.25 - $8.625 million per year.

Monday, 26 May 2014

NHL Entry Draft - Drafting Goalies in the Late Rounds

With the NHL Entry Draft right around the corner, it's always great to speculate and ponder the upcoming prospects. I've recently stumbled upon two draft-related posts that got me interested in thinking about how we evaluate these up-and-comers (check out this awesome post that pits the Vancouver Canucks scouting department against a potato and this post that looks at the high success rate of offensive defencemen).

I am slowly growing my database of NHL Draft data (if someone reading this has their own database of prospect stats and wouldn't mind sharing, please contact me). I decided to look at goalie selections in the NHL draft, mostly because I haven't seen much done on the subject.

Below is a chart I created of goalies drafted between 1997 and 2006 (I chose this time frame in hopes of keeping the sample relatively modern while still giving the goalies ample time to crack an NHL roster).

I define success as the goalie having played at least 100 NHL games as of the end of the 2014 season. The Tiers section sorts the goalies by the round in which they were drafted:
  • Top - Rounds 1-3
  • Middle - Rounds 4-6 (pre-lockout) Rounds 4-5 (post-lockout)
  • Bottom - Rounds 7-9 (pre-lockout) Rounds 6-7 (post-lockout)
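A minimal sketch of that grouping logic, assuming the round cutoffs above and treating 2005 as the first draft in the shorter 7-round post-lockout format:

```python
# Assign a drafted goalie to a tier based on round and draft era, and
# flag success at 100+ NHL games, per the definitions above.

def tier(draft_round, draft_year):
    pre_lockout = draft_year < 2005  # assumption: 9-round drafts before 2005
    if draft_round <= 3:
        return "Top"
    if draft_round <= (6 if pre_lockout else 5):
        return "Middle"
    return "Bottom"

def is_success(nhl_games):
    return nhl_games >= 100

print(tier(4, 2003), tier(6, 2006))  # Middle Bottom
```

From there, the chart is just success counts per (tier, CHL / Not CHL) group divided by group size.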

I highlighted the most striking observation in that chart. (Sorry if the formatting is confusing).

Tier     League    Players  Successes  Busts  Success Rate
Top      Total        85        27       58      31.76%
         CHL          44        14       30      31.82%
         Not CHL      41        13       28      31.71%
Middle   Total       101        10       91       9.90%
         CHL          46         5       41      10.87%
         Not CHL      55         5       50       9.09%
Bottom   Total        95         9       86       9.47%
         CHL          34         0       34       0.00%
         Not CHL      61         9       52      14.75%
All      Total       281        46      235      16.37%

No goalie who played his draft-eligible season in the CHL (OHL, WHL, QMJHL) between 1997 and 2006 and was drafted in the later rounds has managed to play 100 NHL games.

The 2 non-CHL North American goalies were Brian Elliott (Ajax - OPJHL) and Scott Clemmensen (Des Moines).

The other 7 were all Europeans: Henrik Lundqvist, Pasi Nurminen, Cristobal Huet, Martin Gerber, Fredrik Norrena, Jaroslav Halak, Pekka Rinne.

Why is this the case? I really don't know. Maybe all of the CHL prospects are so highly scouted and scrutinized it's harder to steal a future contributor in the later rounds. If I am running a draft table however, I am definitely leaning towards taking a European goalie with those late round picks. 

If you have any theories let me know, I will be looking into more NHL draft material in the upcoming weeks.