WFTDA Playoffs 2013: By The Numbers

Was the roller derby this year better than last year? Let's look at the data.

Welcome to WRDN’s look-back at the 2013 WFTDA playoff season! Your intrepid commentator will opine on three different items in three separate posts, creating a snapshot of what the WFTDA has been doing right, what it has been doing wrong, and what it’s flat-out not doing. This is part one.

Measuring progress in the WFTDA is tricky business.

Sure, seeing growth in the number of affiliated WFTDA leagues (now 234 full member and 89 apprentice) is great. You can also look at the ballooning WFTDA international presence and be left with the impression that the future is bright for roller derby abroad.

However, ticket-buying customers that go to see WFTDA teams play may not know about the rapid expansion of the game. Frankly, they may not care. They just want to see roller derby. If the roller derby they see is not entertaining and competitive, they will not come back to see it again. This is important, because without their financial support, there is nothing to realistically sustain whatever progress there appears to be.

Which means it’s not really progress. Tricky business, ain’t it?

Casting aside the debate on the rapid growth of roller derby, let’s just take a look at the roller derby itself. If we are to try and measure progress in that area, the question is simple:

Is WFTDA flat track roller derby getting any better—on the flat track?

“Better” is a subjective term, one that can differ with opinion. Anyone who watched the playoffs, particularly the championship tournament, could have easily walked away thinking this season was better than last year. (Hell, even I thought it was better than last year, at first.) But filtering the question through one’s own emotions is not the best way of going about answering it definitively.

To gauge the true progress of the WFTDA from 2012 to 2013, we can use Rinxter stats to take an objective look at the numbers. Score differentials, penalty numbers, etc. Take the numbers from this year, compare them to the numbers from last year, and see if things are really getting better—or at least, are moving in the right direction.

To its credit, the WFTDA acted to move that way, overhauling both its playoff tournament format and the ranking system (which, coincidentally, also looks only at the numbers) that placed teams into it. By equally spreading out high-, mid-, and low-ranked playoff teams, the thinking was that non-competitive games would be done and over with early in the playoffs, building up to more close games as the brackets progressed and seeds converged in both the consolation and championship brackets.

However, the changed format makes it a bit difficult to draw a direct comparison between last year and this year. Comparing a 10-team regional from 2012 to a 10-team divisional from 2013 is of no use. The varied strength of the old regions and the uniform strength of the new divisions cannot be weighed against each other equally. Putting the 2012 and 2013 WFTDA Championship brackets next to one another would be flawed, too, since the ability level and seeding of the 12 teams that made it in were not the same.

A one-to-one comparison will not do the job, so the only thing we can do is to take the whole lump-sum of the Division 1 tournament season, Championships included, and see how it stacks up to the whole of the 2012 Big 5 playoff season. This should make sure that all competitive games from the 2012 regional tournaments, many of which were concentrated early in the brackets, can be more equally compared to the divisional format, where more competitive games are expected to occur later in the brackets and at Championships.

The primary factor of competitiveness is how close bouts are at the finish. The best gauge of that is, of course, the points difference of games. By plotting out the score differentials from all 80 tournament games in 2013 and lining them up against the score differentials from all 80 tournament games from 2012, we should get a pretty fair picture of how the overall competitive landscape in the WFTDA has changed, independent of the playoff format used.

Below is a chart laying out the final points gap of all games from last year (top) and this year (bottom). Shorter bars in the blue and green zones mean closer contests, and longer bars in the red zones mean bigger blowouts. The bars highlighted in WFTDA-pink call out the 12 games that took place at the WFTDA Championships event in that particular year.

The question: Was 2013 a “better” year for WFTDA playoff roller derby than 2012?


There were many more competitive games at Champs 2013, but it’s hard to argue that—overall—roller derby in the WFTDA was more competitive in 2013 than in 2012.

The numbers say it all: The entirety of the 2013 WFTDA playoff season was far less competitive than the 2012 season.

This year, close and competitive games (within 50 points) were down 10% and blowouts (more than 100 points) were up a mind-boggling 40%. In fact, almost half of the playoff games played in 2013 were blowouts. Half! In 2012, fewer than one-third of games ended in triple-digit point spreads. In 2013, you had a 50/50 chance of seeing a hopeless playoff contest.
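For anyone who wants to rerun the tally, the bucketing behind those percentages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 50- and 100-point cutoffs come from the paragraph above, the handling of the boundaries (a margin of exactly 50 counts as close, exactly 100 does not count as a blowout) is a guess, and the two short lists of margins are made-up placeholders, not Rinxter data.

```python
from collections import Counter

CLOSE_MAX = 50     # "within 50 points" (boundary handling is an assumption)
BLOWOUT_MIN = 100  # "more than 100 points" (boundary handling is an assumption)


def classify(margin):
    """Put one final score differential into a competitiveness bucket."""
    if margin <= CLOSE_MAX:
        return "close"
    if margin > BLOWOUT_MIN:
        return "blowout"
    return "middle"


def shares(margins):
    """Fraction of games that fall into each bucket."""
    counts = Counter(classify(m) for m in margins)
    return {bucket: counts[bucket] / len(margins)
            for bucket in ("close", "middle", "blowout")}


def relative_change(old, new):
    """Percent change of `new` relative to `old`."""
    return (new - old) / old * 100


# Hypothetical margins for two seasons, for illustration only.
season_a = [12, 45, 60, 101, 150, 30, 75, 220]
season_b = [8, 55, 130, 101, 199, 30, 160, 240]

a, b = shares(season_a), shares(season_b)
print("blowout share:", a["blowout"], "->", b["blowout"])
print("relative change in blowouts: %.1f%%"
      % relative_change(a["blowout"], b["blowout"]))
```

Note that `relative_change` measures the change relative to the earlier season's share, which is one possible reading of "up 40%"; a percentage-point change would be a plain subtraction of the two shares instead.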

This is astonishing. The WFTDA reorganized its ranking system and dumped regional play to specifically ensure—and these are the exact words of the WFTDA—there would be “more competitive play within and across the WFTDA Playoffs.”

But that did not happen. And it wasn’t even close.

To be fair, maybe it was unrealistic that this goal would be met within one year of play. Kinks within the new ranking system were inevitably going to give good teams bad draws and others easier paths that allowed them to overachieve in the tournament. This may have created mismatches in the later rounds, instead of the expected close matches.

Then again, another WFTDA pre-season goal was to ensure the “best teams” were given a chance to advance to Championships. That did indeed happen, with almost all the final games of 2013 being on the more competitive side of the chart, though you would need to turn a blind eye to that horribly awful Gotham-Ohio blowout to statistically confirm it.

The new tournament format virtually guaranteed that the WFTDA Championships would be stacked…but it was also supposed to make sure all the play-in tournaments would be more competitive across the board.

Clearly, they were not. So what happened here?

We can eliminate the possibility of teams and players getting worse from last year to this year. Players learn new skate skills, tighten up on blocking strategy, and get better scouting information on their opponents every year, so we know they are all more prepared to play.

Plus, all of the teams that made it to the playoffs were—according to the new WFTDA ranking system—the most deserving of being there. There were no good teams in strong regions left out, and there were no weak teams in weak regions that got in. These scenarios happened frequently in the old regional format, but the new division system has eliminated that. So we can throw the “bad teams bringing down the average” variable out the window.

…Or can we? Because some of the data suggests otherwise.

Since 2009, total points scored per game has been steadily increasing, with this year up 13% over last. One way to interpret this figure is that across-the-board weaker defenses are getting worse at stopping jammers, letting them score more points than before.¹

Because we know the 40 teams in the WFTDA playoffs this year were all better than the 40 teams that got to their regional tournament last year, it is impossible that any number of bottom-feeding teams involved in one-sided blowouts are driving the average up WFTDA-wide. Since the average game is a blowout, everyone is to blame.

In fact, the numbers indicate that the lack of defense is apparent in the “good” teams, too. Consider the total points scored in the last five WFTDA Championship finals, the games that showcase the absolute best the WFTDA has to offer each year. No one can deny the blocking ability of the teams that make it to the final—if they didn’t have it, they wouldn’t have got there.


Total Points Scored in WFTDA Championships Final Game

     2009: 278
     2010: 293
     2011: 237
     2012: 363
     2013: 372

Even the best blockers on the best teams playing in the showcase game of the WFTDA have been getting worse and worse at keeping points off the scoreboard. The starkly low-scoring (for the WFTDA) 2011 Gotham-Oly final of 140-97 looks like the exception that proves the rule, given the insane amount of talent that was on the track in that game.

That the insane amount of talent in those other games could not do the same might indicate that their talent is not so insane after all.
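To put the finals totals listed above on a common scale, here is a small sketch that computes the year-over-year percent change. The five totals are copied from the list; the function name `yoy_changes` is just a label chosen for this example.

```python
# Total points scored in each WFTDA Championships final, as listed above.
finals_totals = {2009: 278, 2010: 293, 2011: 237, 2012: 363, 2013: 372}


def yoy_changes(totals):
    """Percent change in total points from each year to the next."""
    years = sorted(totals)
    return {
        year: (totals[year] - totals[prev]) / totals[prev] * 100
        for prev, year in zip(years, years[1:])
    }


changes = yoy_changes(finals_totals)
for year, pct in changes.items():
    print(f"{year}: {pct:+.1f}% vs. the previous final")

# Change across the whole span, 2009 to 2013.
overall = (finals_totals[2013] - finals_totals[2009]) / finals_totals[2009] * 100
print(f"2009 to 2013: {overall:+.1f}%")
```

With these figures, the 2013 final comes out roughly 2.5% higher-scoring than the 2012 final and roughly 34% higher-scoring than the 2009 final.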

Seemingly contradicting this negative trend is a positive one. No doubt aided by the WFTDA (and roller derby in general) switching to a no-minors penalty enforcement system, overall penalties (box trips) are down 5.5% in 2013, with blocker penalties alone down a solid 12.4%. This cleaner play translated into only 27 blocker penalties per team, per 2013 playoff game.

This is a nice decrease. Fewer blocker penalties mean more blockers are on the track for longer stretches of gameplay. Fuller packs give teams stouter defensive options, which therefore makes them more effective at playing jammer defen—

Except, the numbers suggest they were actually less effective at jammer defense?

This don’t make any sense!

Only in the WFTDA can fewer blocker penalties and denser defenses lead to a decrease in blocker effectiveness and corresponding increase in total points scored.

It seems that the positive trends in cleaner blocker play and fuller packs are being completely dwarfed by some other factor, one that renders this improvement worthless.

The numbers can explain the truth behind this phenomenon, too. But these numbers are so nonsensical, we need to plunge into the realm of the imaginary.

Throw away your calculators and get ready to divide by zero: It’s time to talk about jammer penalties and power jams.

It should not be a surprise to anyone that power jams have had a detrimental effect on the WFTDA game. With forcing jammer penalties, stopped packs, power jams, and passive offense as the strategies du jour, defenses were rendered powerless to get packs moving or to stop the jammer from eating them for lunch.

Sure, the great teams can play effective power jam defense—sometimes. But the majority of teams in WFTDA are not great. And even the great teams are getting worse at doing it, as mentioned above.

Adding into this is how harshly the no-minors environment punishes common jammer penalties, like cutting. At the beginning of the year, an increase in jammer penalties, and therefore power jams, was expected for this reason. But how much they have increased is a surprise.

And not the good kind of surprise.

Comparing 2012 box trips to 2013 box trips, jammer penalties are up a staggering 38.7%. At more than 13 total jammer penalties per 2013 playoff game, this works out to four extra power jams per game on average, compared to last year. That is across all 80 games, remember.
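The "four extra power jams" figure can be roughly reproduced with a back-of-the-envelope calculation. The sketch below makes a few assumptions that are not spelled out above: it uses 13 as a lower bound for "more than 13", it applies the 38.7% increase to the per-game rate (reasonable if both seasons had 80 games), and it treats every jammer penalty as exactly one power jam.

```python
def prior_value(current, pct_increase):
    """Back out last year's figure from this year's and the percent increase."""
    return current / (1 + pct_increase / 100)


jammer_2013 = 13.0   # jammer penalties per game, lower bound of "more than 13"
pct_increase = 38.7  # stated rise in jammer penalties from 2012 to 2013
games = 80           # tournament games per season

jammer_2012 = prior_value(jammer_2013, pct_increase)
extra_per_game = jammer_2013 - jammer_2012

print(f"implied 2012 rate: {jammer_2012:.2f} per game")
print(f"extra per game in 2013: {extra_per_game:.2f}")
print(f"extra across the season: {extra_per_game * games:.0f}")
```

Under those assumptions the difference is a bit over 3.6 jammer penalties per game, which rounds to the four quoted above; since the true 2013 rate is somewhat above 13, the real gap would be slightly larger.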

This is significant by itself. But consider that more power jams create more two-minute jams and longer jams in general, meaning fewer jams per game. If jammer penalties stayed flat and jam counts decreased, that would give any one jam with a power jam in it more influence on the overall game result.

Yet not only were there fewer jams per game in 2013, there were more jammer penalties, too. This meant that a much larger percentage of game time was played with only one (or fewer) jammer on the track.

There are also the side-effects of a jammer penalty—lead jammer earned via uncontested jam starts, opposing blocker penalties via illegal stopped-pack defenses, etc. Because power jams were likely to happen once every three or four jams, and there were fewer regular jams with which to counteract their influence, jammer penalties (and penalties in general, despite the downward trend) had such a far-reaching effect on tournament games that trying to make sense out of the numbers is impossible.

In that regard, here is one example of how silly things got during the 2013 playoffs:


Read that again: The people who paid money to the WFTDA to have their (excellent!) product mentioned with the penalty box wanted the thing they paid to happen often, happen less often.

Screw the numbers. That shit is bonkers.

Bonkers, but at least this helps explain the apparent inconsistencies with the points and penalty numbers, what with them somehow being worse despite everyone being better and playing in a tournament format designed to be more competitive.

Jammer penalties being way up translates into more power jams across the WFTDA, making defenses less effective and letting jammers score points with less effort. More time with fewer than two jammers on the track means blockers are engaging each other less often, limiting the amount of player-on-opponent contact, and therefore limiting the amount of illegal player-on-opponent contact, a factor in the drop in blocker penalties.

So even though player skills are increasing and teams are subjectively becoming more competitive against each other, the fact is that the penalty-dependent style of gameplay in the WFTDA can make even top-ranked WFTDA teams look absolutely terrible, for no reason other than that this is the game the WFTDA Rules of Flat Track Roller Derby requires teams to practice to be successful.

When even the good teams are getting worse at keeping their jammers out of the box or keeping points off the scoreboard, things are definitely heading in the wrong direction.

To that end, think over this final stat: Of all the penalties committed in the 2013 WFTDA Division 1 tournament season, playoffs and championships, 19.9% of them were jammer penalties. Effectively, any one of the five players on a team had an equal chance to be penalized at any given moment, since one out of five penalties were committed by one out of the five players on the track.

Of all the numbers I took a look at, this was bar-none the most surprising to me. With blocker penalties down and jammer penalties way up, the expectation might have been that this balance would be thrown out of whack.

Instead, the balance has been restored.
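As a sanity check, the 19.9% share can be cross-referenced against two figures quoted earlier: 27 blocker penalties per team per game, and more than 13 jammer penalties per game. The sketch below assumes every penalty is either a jammer penalty or a blocker penalty (pivots counted as blockers) and that all three figures cover the same set of games; neither assumption is stated outright in this post.

```python
def implied_jammer_penalties_per_game(blocker_per_team, jammer_share):
    """Jammer penalties per game implied by the blocker rate and jammer share.

    If jammers commit a fraction s of all penalties and blockers commit B per
    game, then J / (J + B) = s, which gives J = B * s / (1 - s).
    """
    blocker_per_game = 2 * blocker_per_team  # two teams on the track
    return blocker_per_game * jammer_share / (1 - jammer_share)


implied = implied_jammer_penalties_per_game(blocker_per_team=27,
                                            jammer_share=0.199)
print(f"implied jammer penalties per game: {implied:.1f}")
```

The result is about 13.4 jammer penalties per game, which lines up with the "more than 13" figure, so the three numbers appear to be consistent with one another.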

The numbers were getting worse because the current iteration of WFTDA roller derby rules was merely regressing to the mean. That is to say, the huge-and-boring effect that jammer penalties and power jams had on games in 2013 is actually normal.

Normal for the WFTDA, of course.

History has shown that just because you balance the number of penalties across player positions, you will not automatically balance gameplay across all situations. If one kind of penalty does not have the same effect on a game as another kind of penalty, you create divide-by-zero types of gameplay paradoxes. Namely, better defenses that have no chance to play better defense.

There is no need to reiterate why the blocker-jammer penalty imbalance is a problem, but there is every need to repeat the one and only real solution to it: Change the rules so that one jammer penalty is guaranteed to have the same effect on the game as one blocker penalty would, and vice versa. Cautious optimism—very cautious optimism—says that the WFTDA may be trending toward that next year with the 2014 rules.

But the numbers for 2013 show, as they have been showing for years, that the WFTDA needs a major rules overhaul. When your best players appear to be playing worse across the board, maybe the time is right for them to start playing something else, so that their skills can be properly showcased to the world.