
Lies, Damn Lies, and Statistics...

There's been a lot of discussion lately regarding a study cited in Canada's National Post which argued, among other things, that Major Penalties help NHL teams win hockey games. There was an obvious "yeah, right" reaction from much of the blogosphere, and Sisu Hockey has done a nice evisceration of this shoddy piece, but I thought I'd put my thoughts in as well. What we really have here is another sad example of the media picking up a study and running with it before giving it any kind of thorough vetting. So let's take a look inside...

Abstract: In the past few years, the National Hockey League (NHL) has struggled financially. Teams within the NHL and the league itself have been struggling to make money, and last year the NHL season did not take place because of a labor dispute and resulting lockout between the players and owners. Therefore, this makes the NHL a very appropriate target for study. As previous research on various professional sports and the NHL have shown that winning teams are going to draw more fans, determining what makes NHL teams win games is a worthy endeavor. This study does just this. By using Ordinary Least Squares regression and a data set compiled on numerous individual and team statistics for the 1999-2000 through the 2003-2004 seasons, this study determines the various factors that contribute to Team Point production and Goals Allowed in the NHL. Of special note is the finding that Major Penalties, most commonly assessed for fighting, do in fact help NHL teams win games. Hopefully by seeing what aspects of the game lead to NHL team success, the league can determine how to draw more fans in order to make more revenue.

First of all, I'd give these guys a "D" for writing. I know my high school English teacher would have scratched more than a few red lines through this part. But enough about style, let's get to the meat...

Right now the National Hockey League is in an interesting position.

What position is that? Does it involve stirrups or something? Sorry, another snipe at their inept writing. I'll try not to keep harping on this.

The ultimate goal of these new regulations is to draw more fans to NHL games in order to make more money. In the 2002-2003 season 50% of the NHL’s revenues came from ticket sales, and an additional 30% came from in-arena sales. Thus, 80% of the league’s revenue can be traced directly back to the fans. If the NHL can increase attendance throughout the league, the individual teams and the league itself should make more money.

I believe this falls into the "No Shit, Sherlock" category.

Now, as the NHL has made these adjustments to bring more fans to the games, fans also want to see their team win. As numerous studies have shown, attendance at professional sporting events is significantly impacted by a team’s success.4 Better teams that win more games are going to draw more fans, and they are going to gain more support off the playing field as well. Sports teams are profitable entities and a winning record is an important factor in determining their financial success.

More "No Shit, Sherlock", except for that last sentence. I thought the whole point here was that, by and large, NHL teams are not profitable. And it should be noted that a winning record can be important for an individual team's profitability, but not for the league as a whole. For every winner, there's a loser...


Here follows a largely irrelevant section surveying various studies across baseball, football, basketball, and hockey, none of which really get to the point about the factors that drive NHL team performance. The hockey ones are mostly about wage structure and its relation to ethnicity or violent play.


Here's where we enter the morass...

As wins and win percentage have often been employed in production models dealing with professional sports, the NHL does not base their rankings on either of these measures. In contrast, rankings in the NHL are determined by Team Points, where each team receives two points for a win and one point for a tie after regulation.

Obviously the authors don't understand that Team Points are equivalent to Win Percentage * Games Played, up to a constant factor of two when a tie counts as half a win. And another writing criticism: that first sentence is really awkward. Try starting it with "Although" rather than "As".
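To make the equivalence concrete, here is a minimal sketch using the scoring quoted above (two points for a win, one for a tie). The 40-30-12 record is hypothetical, the function names are mine, and I'm assuming win percentage counts a tie as half a win and ignoring any points for overtime losses.

```python
def team_points(wins, ties):
    """Points under the scoring quoted above: 2 per win, 1 per tie."""
    return 2 * wins + ties

def win_percentage(wins, ties, games_played):
    """Win percentage with a tie counted as half a win (an assumption)."""
    return (wins + 0.5 * ties) / games_played

# Hypothetical 82-game record: 40 wins, 30 losses, 12 ties.
wins, losses, ties = 40, 30, 12
games_played = wins + losses + ties

points = team_points(wins, ties)
pct = win_percentage(wins, ties, games_played)

# Points are just win percentage rescaled by 2 * games played.
points_from_pct = 2 * pct * games_played

print(points, round(points_from_pct, 6))
```

Since every team plays the same number of games, ranking by one measure is the same as ranking by the other.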

For some reason, they decide to focus on deriving two measurements for team performance: Team Points and Goals Allowed. Maybe it's just because those are the numbers that best worked out for them, but one would think that Goals For would be the natural complement to Goals Allowed, since the relation of Goals For and Against to Team Points has been fleshed out pretty well by numerous parties (including Hockey Analytics, for example).

As Sisu Hockey notes, there follows a list of "independent" variables which are used in the analysis, but since they aren't truly independent, the findings are fatally flawed. For instance, in the calculation for Team Points, they use both Goals For and Shooting Percentage. Since Goals For/Shots = Shooting Percentage, those two variables aren't truly independent; the authors are clearly guilty of statistical incest here.
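A quick sketch of why that matters, using five made-up teams (not real NHL data). Once you know Shots and Shooting Percentage, Goals For is fully determined, and with numbers like these Goals For and Shooting Percentage end up strongly correlated. The `pearson` helper is my own, not anything from the paper.

```python
import math

# Hypothetical season totals for five made-up teams (not real NHL data).
shots = [2300, 2400, 2250, 2500, 2350]
shooting_pct = [0.085, 0.100, 0.090, 0.095, 0.110]

# Goals For is fully determined by the other two columns.
goals_for = [s * p for s, p in zip(shots, shooting_pct)]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

r = pearson(goals_for, shooting_pct)
print(f"correlation between Goals For and Shooting Percentage: {r:.2f}")
```

When two regressors move together this tightly, the individual coefficients a regression assigns to them are hard to interpret on their own.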

As hockey is a team sport, player cooperation is an integral part of the game. Thus, some aspects of teamwork should be included in the production model. Therefore, the simple statistic of Assists is included as an independent variable in the Team Points model. As Assists help Goals happen and Goals are going to help NHL teams win games, Assists are expected to have a positive effect on Team Points.

More 3rd-grade quality writing here, accompanied by some bass-ackward logic. From a statistical perspective, Assists don't "help Goals happen." It's the other way around. You can't have Assists without Goals, and regardless, it's the Goals For figure that impacts game results, not how many assists a team tallied on those goals. Again, two commingled variables are being called "Independent" here, but aren't truly so.

A final variable that will be included in this analysis of NHL Team Production is that of Plus/Minus. This statistic is kept for each individual player and shows whether a player has contributed more to goals, or has been scored on more while he is on the ice. If a player is on the ice while his team scores a goal, a player receives a "plus." If he is on the ice while the opposing team scores, the player receives a "minus." These "plusses" and "minuses" are then tallied throughout the season to assess how well a player plays both offensively and defensively. A high, positive Plus/Minus statistic is one sign of a good player. This variable will be included only in the Team Points model, but is expected to have a positive effect in that equation.

First of all, they don't make the Power Play distinction regarding the Plus/Minus calculation, which leads to more speculation that the authors don't really know much about hockey. You could have a team that's above average at Even Strength, with lousy special teams, and their Plus/Minus will look much better than their win/loss record would indicate.
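Here is a rough illustration with invented numbers for a single hypothetical team. It relies on the fact that plus/minus ignores power-play goals but does count even-strength and shorthanded ones, and it simplifies things by working per goal rather than summing over every skater on the ice.

```python
# Hypothetical season goal totals for one made-up team (not real data).
goals_for = {"even_strength": 160, "power_play": 30, "shorthanded": 3}
goals_against = {"even_strength": 140, "power_play": 65, "shorthanded": 8}

# Plus/minus only reacts to even-strength and shorthanded goals;
# power-play goals (for or against) are left out.
PLUS_MINUS_SITUATIONS = ("even_strength", "shorthanded")

plus_minus_diff = (
    sum(goals_for[s] for s in PLUS_MINUS_SITUATIONS)
    - sum(goals_against[s] for s in PLUS_MINUS_SITUATIONS)
)

# The scoreboard, of course, counts everything.
overall_diff = sum(goals_for.values()) - sum(goals_against.values())

print("plus/minus-style differential:", plus_minus_diff)
print("overall goal differential:", overall_diff)
```

This made-up team looks comfortably positive by the plus/minus-style measure while being outscored overall, which is exactly the trap described above.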

Another example of commingled variables is using PIM and the count of Major Penalties. Majors are a component of overall PIM, so including these together only muddies the waters further.

Here follows the outcome of this mish-mash. Two conclusions were cited in the National Post article regarding Major Penalties - "For each penalty minute served, a team accrued 0.07 points and decreased their opponent's scoring by 0.24 goals."

First, I believe they misstate the numbers in the paper, which refers to the Number of Majors, not the number of penalty minutes served. The coefficient for Majors to Team Points is indeed 0.07587 in this study. What the National Post overlooks is that if you're going to take a Major, you also need to account for the 5 PIM, which has a coefficient of -0.01087, so the net result (even using the paper's shoddy analysis) would be 0.02152, much less than was cited. As for the impact on opponent scoring (0.24 Goals Against reduction per Major), I just don't see that in here. The only table included regarding Goals Against uses logarithmic data, so perhaps I'm missing something.
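For anyone who wants to check the arithmetic, here it is spelled out. The two coefficients are the ones quoted above from the paper; the variable names are mine, and I'm taking the paper's model at face value purely for the sake of the calculation.

```python
# Coefficients on Team Points as quoted above from the paper.
MAJORS_COEF = 0.07587   # per Major Penalty
PIM_COEF = -0.01087     # per penalty minute

# A Major carries 5 penalty minutes, so each Major also adds 5 to PIM.
PIM_PER_MAJOR = 5

net_points_per_major = MAJORS_COEF + PIM_PER_MAJOR * PIM_COEF

print(f"net Team Points per Major: {net_points_per_major:.5f}")
```

That works out to roughly 0.02 points per Major, a good deal less than the 0.07 that made the headline.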

By and large, the numbers match up with amazing precision and seem to point to some radical results. But that's really because the variables were hopelessly intertwined to begin with.

The real firebomb is right here:

As MAJORS were found to have a positive effect on PTS and a negative effect on GA, this implies that Major Penalties (more specifically fighting) do in fact aid a team’s success.

First of all, there is nothing in the study that suggests that fighting majors have any different effect than other major penalties. The authors are making an unwise leap here.

Even though fighting results in a penalty, it is shown to be able to jump start a team into action and elicit better play. For example, if a team is not playing well, a player might start a fight with the opposing team in order to get the momentum of the game back on his side.

Again, there is no evidence in the study to suggest this. They certainly don't look at event sequencing, to see if teams score goals after taking major penalties, or instead take those penalties once they have a safe lead.

My Conclusion
Overall, this looks like something an undergrad would put together, as an exercise in how to build a statistical study around a topic of their choice. Then the professor would rip it apart and send them back for better data to work with before drawing such assertive conclusions.

The work I do here in this blog isn't as formal as these authors are trying to be, in part because I think at this stage the interesting work lies in identifying fruitful areas for future study. Papers like this only serve to cast a disparaging light on the work of hockey statisticians in general, by publicizing absurd claims which fly in the face of both common sense and critical judgement. I'm an economist by training, but I wouldn't jump to use the economist's toolbox in this field before I truly felt confident that the source material is valid, and that the results would be useful. This study fails in both those areas.