Sunday 14 December 2014

How Long Does It Take For A Forward's Shooting To Stabilize?


If a player scores one goal on five shots, does that mean they are suddenly a 20% shooter? What if it is 10 goals on 50 shots? How about 20 goals on 100 shots? This is a classic sample size problem: trying to separate the signal (talent) from the noise (randomness). Put another way, how big does a sample need to be before it stops being small? The question has been tackled before in other sports, see baseball here and basketball here, and my analysis will mirror much of the methodology laid out in those pieces.

Relating the problem to a player's shooting talent, how many shots does a player need to take before we can separate the talent from the randomness?

Now if you don't care about the math then please skip to the *** for the answer and analysis. 

Otherwise let's dive in!

The most common method for testing this problem is split-half reliability testing. For example, if we wanted to know how stable a player's shooting percentage is after 100 shots, we would label each shot from 1 to 100, randomly split the 100 shots into two 50-shot samples, and then compare the player's shooting percentage between the two samples. This method is fine, but in our case it can be improved upon by using the Kuder-Richardson Formula 21 (KR-21).
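To make the split-half idea concrete, here is a minimal Python sketch. It is not the code behind this post: the simulated league (300 players, true talent spread evenly between 5% and 15%, 100 shots each) and the helper names `pearson` and `split_half_correlation` are assumptions made purely for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def split_half_correlation(shot_logs, rng):
    """shot_logs holds one list per player of 0/1 outcomes (1 = goal).

    Each player's shots are shuffled and cut into two equal halves; the
    shooting percentages of the two halves are then correlated across players.
    """
    first, second = [], []
    for shots in shot_logs:
        shuffled = shots[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        first.append(sum(shuffled[:half]) / half)
        second.append(sum(shuffled[half:2 * half]) / half)
    return pearson(first, second)

# Hypothetical league (assumption): 300 players, talent between 5% and 15%.
rng = random.Random(2014)
talents = [rng.uniform(0.05, 0.15) for _ in range(300)]
shot_logs = [[1 if rng.random() < p else 0 for _ in range(100)] for p in talents]

r = split_half_correlation(shot_logs, rng)
print(f"Correlation between two random 50-shot halves: {r:.3f}")
```

A single random split like this gives a different answer every time the shots are reshuffled, which is the weakness a formula that averages over all splits is meant to address.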

This formula tells us the reliability of a test involving binary outcomes (two results), which is great for our purposes since when a player takes a shot there are only two possible results: a save or a goal. The KR-21 formula lets us perform a split-half reliability test, but instead of comparing only one combination of halves it lets us compare every possible combination. For example, if we were going to perform a basic split-half reliability test on a total sample of 10 shots (labelled 1 2 3 4 5 6 7 8 9 10), a simple method would be to compare all the even-numbered shots with the odd-numbered shots. The KR-21 formula goes further and compares every possible combination (e.g. evens vs. odds, 1-5 vs. 6-10, 1 2 3 9 10 vs. 4 5 6 7 8, etc.). The result of this 10-shot KR-21 test is a much better estimate of how reliably a stat indicates a player's true talent level over a 5-shot sample (10 divided by 2 = 5).
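For reference, KR-21 needs only three inputs: the number of binary items k (here, shots per player), the mean total score M (average goals per player over those k shots), and the variance of the total scores. The sketch below is an illustration rather than the exact calculation behind this post. The function name `kr21` and the eight goal counts are made up, and it uses the population variance; some references use the sample variance instead. KR-21 also assumes every item is equally difficult, which here means treating every shot as equally likely to go in.

```python
def kr21(goal_counts, k):
    """Kuder-Richardson Formula 21.

    goal_counts: goals scored by each player over their k shots.
    k: number of shots (binary items) per player.

    r = (k / (k - 1)) * (1 - M * (k - M) / (k * variance))
    """
    n = len(goal_counts)
    mean = sum(goal_counts) / n
    variance = sum((g - mean) ** 2 for g in goal_counts) / n  # population variance
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))

# Hypothetical goal totals for eight players over 100 shots each (made-up data).
goals = [5, 8, 9, 10, 11, 12, 14, 19]
reliability = kr21(goals, k=100)
print(f"KR-21 reliability: {reliability:.3f}")
```

The more the players' goal totals differ from one another relative to what pure chance would produce, the closer the result gets to 1.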

Our goal is to reach a reliability of 0.707, at which point the signal (skill/talent) begins to overtake the noise (randomness/luck) in our sample (0.707 x 0.707 = 0.50, or 50%). Below I have charted shots against reliability to show how the reliability of a sample changes as the sample's cutoff point increases. The blue line is a logarithmic curve fitted to the reliability data points (R-squared of 0.99626), which I used instead of simply connecting the points. I used the log curve because, as you might notice in the table below, I got a tad lazy and ran the numbers less frequently for the bigger samples, and the logarithmic curve shows the relationship just as well. The red line marks the 0.707 cutoff where talent begins to overtake the randomness. Above the red line = good, below the red line = not good.
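The arithmetic behind the 0.707 target is simply squaring: the share of a sample attributed to signal is the reliability squared, and the rest is noise. A small sketch of that conversion, with function names that are only illustrative:

```python
import math

# Reliability at which signal and noise are split 50/50: sqrt(0.5), about 0.707.
THRESHOLD = math.sqrt(0.5)

def signal_share(reliability):
    """Fraction of the sample attributed to talent (reliability squared)."""
    return reliability ** 2

def noise_share(reliability):
    """Fraction of the sample attributed to luck (whatever signal leaves over)."""
    return 1.0 - signal_share(reliability)

for r in (0.5, THRESHOLD, 0.9):
    print(f"reliability {r:.3f}: signal {signal_share(r):.1%}, noise {noise_share(r):.1%}")
```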


***


I found that after about 223 shots the reliability will cross the 0.707 threshold.





Shots    Reliability    Signal (Talent)    Noise (Luck)
25       0.169          2.8%               97.2%
50       0.317          10.0%              90.0%
75       0.410          16.8%              83.2%
100      0.493          24.3%              75.7%
125      0.560          31.4%              68.6%
150      0.604          36.5%              63.5%
175      0.656          43.0%              57.0%
200      0.677          45.9%              54.1%
212.5    0.693          48.0%              52.0%
217.5    0.704          49.5%              50.5%
222.5    0.707          50.0%              50.0%
225      0.712          50.6%              49.4%
250      0.732          53.6%              46.4%
300      0.765          58.5%              41.5%
375      0.805          64.9%              35.1%
500      0.891          79.5%              20.5%
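For anyone who wants to reproduce the logarithmic curve from the table, here is a rough Python sketch. It fits reliability = a * ln(shots) + b by ordinary least squares to the sixteen rows above and solves for the shot count where the fitted curve reaches 0.707. The fitting approach and the name `fit_log_curve` are my assumptions about how such a curve can be fitted, not a record of the original spreadsheet work.

```python
import math

# (shots, reliability) pairs copied from the table above.
TABLE = [
    (25, 0.169), (50, 0.317), (75, 0.410), (100, 0.493),
    (125, 0.560), (150, 0.604), (175, 0.656), (200, 0.677),
    (212.5, 0.693), (217.5, 0.704), (222.5, 0.707), (225, 0.712),
    (250, 0.732), (300, 0.765), (375, 0.805), (500, 0.891),
]

def fit_log_curve(points):
    """Least-squares fit of reliability = a * ln(shots) + b.

    Returns (a, b, r_squared).
    """
    xs = [math.log(shots) for shots, _ in points]
    ys = [rel for _, rel in points]
    n = len(points)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    a = sxy / sxx
    b = my - a * mx
    return a, b, sxy ** 2 / (sxx * syy)

a, b, r_squared = fit_log_curve(TABLE)

# Invert the fitted curve to find where it reaches sqrt(0.5), about 0.707.
crossing = math.exp((math.sqrt(0.5) - b) / a)
print(f"a = {a:.4f}, b = {b:.4f}, R-squared = {r_squared:.4f}")
print(f"Fitted curve reaches 0.707 at about {crossing:.0f} shots")
```

The R-squared it prints can be compared with the 0.99626 quoted earlier. The crossing point should fall in the same general range as the 223-shot figure, though a small difference is expected because the fitted curve does not pass exactly through every row.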

We now know that at 223 shots a player's shooting percentage is about 50% skill and 50% luck, which is still a lot of noise. We have to get to about 400 shots before we really see a player's talent begin to shine through. This once again demonstrates how easy it is to be fooled by small sample sizes. While 223 shots may seem like a reasonable threshold, it should be noted that only 40 players last season (2013-2014), or just over 6% of the entire league, recorded more than 223 shots. Alexander Ovechkin led the league with 386 shots (along with a 13.6% shooting percentage), and even that only gives us a signal strength of about 65%.

This isn't necessarily meant to be predictive. That is to say, just because John shot 9% over 223 shots doesn't mean that we should expect John to shoot 9% over his next 223 shots. If John shoots 17% over his next 50 games, did he suddenly become a better shooter? Probably not. However, if John shoots 12% over his next 223 shots, a case can be made that he may have improved his actual shooting talent.

This all goes to show that it takes quite a bit of time for a player's shooting percentage to stabilize. Many are quick to jump to conclusions about a player's actual ability based on a single season, which, as we can see here, rarely makes sense: the vast majority of the league will have taken so few shots that separating the signal from the noise is incredibly difficult. There is definitely talent at the heart of a player's ability to score goals; it just takes some time for that talent to truly become evident.