Match by “Games Played” – Analysis

One persistent suggestion from certain segments of the community is that online matchmaking would be better if matches were arranged by number of games played rather than by similar Team Value (TV). The logic is that this would better simulate scheduled league play, where teams typically play the same number of games because of the schedule, and where (supposedly) none of the problems that people perceive in matchmaking environments show up.

For our analysis of this idea, we’ll use the FUMBBL BlackBox data. The B league on FUMBBL is a matchmaking environment based exclusively on similar TVs, with no ability to refuse the matches you are assigned. Each match is assigned a “win” value of 1 for a win, 0 for a loss, and 0.5 for a draw (as per the BBRC’s win% calculation). Because the outcome can take only those three values, we can expect to see small r values in our correlations; a small r here does not by itself mean low practical significance. Luckily, we only need to compare the relative strength of TV and games played as predictors of a given match’s outcome.
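The win-value encoding described above can be sketched as follows. This is only an illustration: the function names and the example scorelines are hypothetical, and it assumes the result is decided by comparing touchdowns for and against.

```python
def win_value(td_for: int, td_against: int) -> float:
    """Return 1.0 for a win, 0.5 for a draw and 0.0 for a loss."""
    if td_for > td_against:
        return 1.0
    if td_for == td_against:
        return 0.5
    return 0.0


def win_rate(scores):
    """Mean win value over a list of (td_for, td_against) pairs."""
    values = [win_value(td_for, td_against) for td_for, td_against in scores]
    return sum(values) / len(values)


# Made-up scorelines (not FUMBBL data): a win, a draw, a loss and a win.
example_scores = [(2, 1), (1, 1), (0, 2), (3, 0)]
print(win_rate(example_scores))  # 0.625
```

Averaging these values over a team's matches gives its win%, with each draw counting as half a win.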

Test 1 – overall prediction

First, let’s look at how well relative TVs and relative team ages correlate with match outcome. This means we’re not breaking the matches down into specific TV ranges; we’re looking at all matches played within the Black Box division.

Relative TV:  r = 0.085, p < 0.01, N = 137448
Relative Age: r = 0.075, p < 0.01, N = 137448
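For readers who want to run this kind of check themselves, here is a minimal sketch of a Pearson correlation between a difference measure and the win value. It assumes r is a plain Pearson coefficient and that "relative TV" means the team's TV minus the opponent's. The function name and the toy numbers are hypothetical, not FUMBBL data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy data only: TV difference (team minus opponent) and the match win value.
tv_diff = [-150, -80, -20, 0, 30, 90, 160, 40]
win_values = [0.0, 0.5, 0.0, 1.0, 0.5, 1.0, 1.0, 0.0]
print(pearson_r(tv_diff, win_values))
```

The same function applies unchanged to the difference in games played, which is what makes the two r values directly comparable.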

If we control for the effect of each on the other (TV and team age themselves correlate for obvious reasons), both correlations shrink, but TV difference remains the stronger predictor of match outcome, by a slightly wider margin:

Relative TV:  r = 0.076, p < 0.01, N = 137448
Relative Age: r = 0.065, p < 0.01, N = 137448
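One common way to "control for" a third variable is a first-order partial correlation. The sketch below assumes that is the method used here, which the analysis does not state. It also needs the correlation between TV difference and age difference, which is not reported, so the inputs shown are placeholders rather than the BlackBox figures.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with z held constant (first-order partial).

    r_xy: correlation between predictor x and outcome y
    r_xz: correlation between predictor x and control variable z
    r_yz: correlation between outcome y and control variable z
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0:
        raise ValueError("undefined when x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denom


# Placeholder inputs only (not values from the data set):
# x = TV difference, y = win value, z = age difference.
print(partial_r(r_xy=0.5, r_xz=0.5, r_yz=0.5))
```

When the two predictors correlate positively with each other and with the outcome, as here, the partial r comes out lower than the raw r, which matches the direction of the change reported above.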

Result:  Across all TV levels, TV difference is a better predictor of match outcome than relative games played.

Test 2 – low games played (where a team has 10 or fewer games played)

Next, let’s look at how the measures predict match outcome for a team that has 10 or fewer games under its belt. We’re not going to limit the number of games the opponent has played, because we want to allow for the supposed effects of “minmaxing”: teams with low TV and many games played that are well developed but kept at a low TV to abuse new teams.
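The selection rule just described can be sketched like this. The record layout, field names and sample rows are hypothetical, and the sketch assumes each row describes one match from one team's point of view.

```python
from dataclasses import dataclass


@dataclass
class MatchRow:
    team_games: int    # games played by the team whose outcome we record
    opp_games: int     # games played by the opponent (left unrestricted)
    tv_diff: int       # team TV minus opponent TV
    win_value: float   # 1.0 win, 0.5 draw, 0.0 loss


def low_games_rows(rows, max_games=10):
    """Keep rows where the team has at most max_games games played.

    The opponent's games played are deliberately not filtered, so a
    3-game team facing a 300-game opponent stays in the sample.
    """
    return [row for row in rows if row.team_games <= max_games]


# Made-up rows for illustration only.
rows = [
    MatchRow(team_games=3, opp_games=300, tv_diff=-50, win_value=0.0),
    MatchRow(team_games=10, opp_games=12, tv_diff=20, win_value=1.0),
    MatchRow(team_games=11, opp_games=2, tv_diff=0, win_value=0.5),
    MatchRow(team_games=250, opp_games=4, tv_diff=100, win_value=1.0),
]
subset = low_games_rows(rows)
print(len(subset))  # 2
```

Filtering on only one side of the match is what lets veteran low-TV opponents remain in the comparison.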

Relative TV: r = 0.076, p < 0.01, N = 70053
Relative Age: r = 0.074, p < 0.01, N = 70053

If we control for each in the calculation for the other, the gap widens slightly:

Relative TV: r = 0.067, p < 0.01, N = 70053
Relative Age: r = 0.064, p < 0.01, N = 70053

Result:  Close, but TV difference remains the stronger predictor of match outcome, even if we allow for scenarios where, for example, one team has played 3 games and the other 300.

Conclusion

Both across the entire dataset and when looking at matches involving teams with a low number of games played, TV difference remains the stronger predictor of match outcome. Limiting our view to matches played by teams with 10 or fewer games covers both the mean and median number of games that teams in that environment play (10 and 5 respectively). The data certainly shows that the likelihood of a win tends to decrease as the TV gap between a team and its opponent grows (in the opponent’s favour), and likewise as the gap in number of games played grows (in the opponent’s favour), but TV difference appears to have more effect than the team’s age.

Given this, there does not seem to be a strong case for matching by “games played” being a superior method of matchmaking to matching by TV similarity. If anything, it would result in less equal match-ups.