Friday, March 18, 2011

Research Challenge: Distribution of Win% Across Reach Differences

I saw this a couple of days ago and noticed a few of you talking about reach.


Promoted by Mike Fagan.
Here's your winner of the inaugural FightMetric Research Challenge. Congratulations to PistonHyundai. Additional congratulations go out to everyone who participated. Rami and I are both very happy with the turnout, though that's to be expected with the Bloody Elbow community.
What's next? We're looking for another question for the community to answer in April. What do you want to see? Leave your suggestions in the comments.

Ok, here goes nothing.
So far, there's been some great legwork done on the correlation between reach advantage and fight results. Here's a brief recap before I introduce my analysis:

When you consider MMA's dynamism as a sport, and facts like 1.) submission victories not necessarily resulting from ground-heavy fights and 2.) knockouts not being exclusive to stand-up wars, it's reasonable to say that these figures do indeed suggest an advantage for rangier fighters. At the very least, the results merit a closer look.
I feel that it would be very useful to examine the distribution of win percentages across the spectrum of reach differences. A direct relationship between win rate and the amount of reach "advantage" would lend more credibility to the idea of reach as a literal advantage. An inverse relationship would obviously suggest the opposite, although it would still highlight reach as a (weird) influence on fight results. An undecipherable cluster of data points would damage the reach advantage theory. Accordingly, I've prepared a series of histograms to explore the possibilities of such relationships.

How to Read These Graphs (Important)

It's no lie: these graphs are a little clunky. But I couldn't think of a better way to express the necessary data, and I promise that just a bit of explanation will make things significantly clearer:

  1. The X axis tracks the difference in inches between the fighters' reaches. "1" includes all differences greater than 0 and up to and including 1 inch, "2" includes all differences greater than 1 and up to and including 2 inches, etc.
  2. The last X value includes all reach differences greater than 8 inches because beyond that, there just weren't enough fights per inch to generate defensible percentages.
  3. The Y axis only pertains to the orange/red longer reach win% bars. The blue bars represent the total number of fights featuring each respective reach difference, and are only meant to convey the relative amounts of data that were available for calculating the percentages.
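For anyone who wants to reproduce the bucketing described in the list above, here is a minimal Python sketch. The function names (`reach_bin`, `win_pct_by_bin`) and the input format are assumptions made for illustration, not the actual method behind these graphs. It also assumes draws, no contests, and fights with identical reach have already been filtered out.

```python
import math
from collections import defaultdict

def reach_bin(diff_inches, cap=8):
    """Return the X-axis label for a positive reach difference.

    (0, 1] -> "1", (1, 2] -> "2", ..., anything above `cap` -> ">8".
    """
    if diff_inches <= 0:
        raise ValueError("reach difference must be positive")
    if diff_inches > cap:
        return f">{cap}"
    return str(math.ceil(diff_inches))

def win_pct_by_bin(fights):
    """fights: iterable of (reach_diff_inches, longer_reach_won) pairs.

    Returns {bin_label: (win_pct_for_longer_reach, number_of_fights)}.
    """
    counts = defaultdict(lambda: [0, 0])  # bin label -> [wins, total]
    for diff, longer_reach_won in fights:
        label = reach_bin(diff)
        counts[label][0] += int(longer_reach_won)
        counts[label][1] += 1
    return {label: (100.0 * wins / total, total)
            for label, (wins, total) in counts.items()}
```

The second value in each pair plays the role of the blue bars: it is only there to show how many fights stand behind each percentage.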
Let's jump right in with the graph for the whole population.

First things first: the most obvious trend is portrayed by the blue bars. To wit, as the reach difference increases, the number of fights in the data pool consistently decreases. This is a natural result of weight class stratification and probably matchmakers' preference for pairing similarly-sized fighters within those classes. 331 fights commenced with a difference of 1 inch or less, while only 20 fights represent freakish differences of more than 8 inches. The important thing to take away is that shorter blue bars indicate less reliable orange/red win% bars.
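To put a rough number on "less reliable," here is a back-of-the-envelope sketch using the normal approximation for a proportion. The 331 and 20 fight counts come from the graph. The 50% win rate is a placeholder chosen for illustration, not a measured value, and `margin_of_error` is a hypothetical helper name.

```python
import math

def margin_of_error(win_rate, n_fights, z=1.96):
    """Approximate 95% margin of error for a win rate (normal approximation)."""
    return z * math.sqrt(win_rate * (1.0 - win_rate) / n_fights)

# 0.5 is an assumed win rate used only to compare the two sample sizes.
for n in (331, 20):
    moe = margin_of_error(0.5, n)
    print(f"{n} fights: +/- {100 * moe:.1f} percentage points")
```

Under those assumptions the margin is about 5 percentage points at 331 fights and about 22 points at 20 fights. The approximation itself gets shaky at small sample sizes, so treat the second figure as a loose guide.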

Speaking of win%, a direct relationship is clearly delineated here. On the side of the histogram with the most data, win% increases steadily along with reach difference. Even in the less populated categories that happen to buck the trend, win% still stays above 50% in favor of rangier fighters. Looks like there's something to this "reach" business. Now, let's break it down further to isolate decisions from finishes.

The Decisions histogram effectively demonstrates the scattered, "undecipherable cluster" pattern I mentioned earlier, but there's still an important relationship to point out: looking back at the raw percentages, fights ending by judges' decision were the least correlated with reach advantage. Because this is the only examined sample to favor fighters with shorter reach, and because its win% distribution is so sporadic, it seems likely that over the course of a complete sanctioned MMA bout (with more time for each fighter to execute a variety of skills), reach just loses influence.
On the other hand, finishes appear to correspond to reach advantage quite favorably. Observe the consistent increase between and including the 2 and 5 inch categories, and the overall direct relationship, despite slight anomalies in underrepresented reach differences. This appears to be where reach truly matters.
One last sample division drills the point even further: (T)KOs vs. Submissions.

DEPTH LEVEL 3

The graphs support the commonly accepted notion that a reach advantage is most heavily favored during striking portions of a bout. Longer-limbed fighters boast more than 70% of the knockout victories in three of the reach difference categories (with, yet again, a direct relationship between percentage and size difference). Submission rates are less linear throughout the spectrum, but still express an overall upward trend. Purely speculating, this may be reflective of long arms and legs being as vulnerable in certain grappling positions as they are threatening in others.
I'll bring it home with one final diagram. Observe the linear regressions of all five samples:


LINEAR REGRESSIONS

These are the "best fit" or "trending" lines of all the preceding data. Lines that sit higher indicate higher win rates for the rangier fighter, and steeper lines indicate that win rate climbs faster as the reach difference grows. Predictably, (T)KO win% combines the highest Y values with the most dramatic slope. Likewise, Decision win% offers the lowest Y values, but in spite of its wildly distributed data points, it still betrays an upward trajectory over the nine categories of reach difference. The fact that all five lines possess positive slopes and most of them cross the 50% mark within the leftmost quadrant indicates not only a direct relationship between reach advantage and rate of victory, but also a supporting relationship between the amount of reach advantage and rate of victory.
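For readers curious how a "best fit" line like these can be computed, here is a minimal ordinary least-squares sketch. The `fit_line` helper and the win% values are placeholders invented for illustration; they are not the figures behind the graph, and the post does not say which tool or weighting produced the original lines.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Nine reach-difference categories (1 through 8, plus ">8" treated as 9).
categories = list(range(1, 10))
# Placeholder win% values on a perfect line, NOT real fight data.
placeholder_win_pct = [50 + 2 * x for x in categories]

slope, intercept = fit_line(categories, placeholder_win_pct)
print(f"slope: {slope:.2f} points per inch, intercept: {intercept:.2f}%")
```

A positive slope means win% rises as the reach difference grows. Note that a plain fit like this gives the 20-fight category the same say as the 331-fight category; weighting each point by its fight count is one way to account for that.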
Reach matters.

Posted via email from MMACrypt.com
