15 July 2007

Mean averages - the debate continues...

Cricket Burble's John Wright raises an interesting question with his recent post on the method of calculation of batting averages. He reports a proposal by a pair of actuaries to update the current method (runs scored divided by number of dismissals) with a new algorithm that would reduce the impact of frequent not-out scores on the batting average of tail-enders. The proposal is to modify the 'times out' divisor, treating 'not out' innings as partial dismissals to an extent determined by the length of the innings (i.e. balls faced) compared to that batsman's average.

To illustrate, consider the following fictional example of an inveterate tail-ender - let us call him Mavis. Mavis's obdurate forward defensive lasts, on average, 30 balls per innings. One week, Mavis's arrival at the crease spells doom for his more accomplished batting partner, who swings belligerently and gets bowled before Mavis can even face a ball. Having faced no balls, Mavis incurs 0/30 = 0 of a dismissal, so his average would be unaffected under this new scheme just as with the old convention. The following week, Mavis's partner has learnt his lesson, and offers our chap the strike. Mavis blocks out for 15 balls before his partner is again dismissed. Mavis thus incurs 15/30 = 0.5 of a dismissal in the divisor of his batting average. Finally, our congenial fellow's moment of glory comes with a heroic unbeaten 60-ball innings to save the match against the odds. Since he faces more than his average 30 balls per innings, the statisticians would treat this as a full dismissal.
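The arithmetic of Mavis's three weeks can be sketched in a few lines of Python. This is only an illustration of the rule as described in this post, not the actuaries' own code: the function names and the (balls faced, was out) representation are invented here, and the 30-ball average is simply taken as given.

```python
def partial_dismissal(balls_faced, avg_balls):
    """Fraction of a dismissal charged to a not-out innings, capped at 1."""
    if avg_balls <= 0:
        raise ValueError("avg_balls must be positive")
    return min(1.0, balls_faced / avg_balls)


def adjusted_dismissals(innings, avg_balls):
    """Sum the 'times out' divisor over (balls_faced, was_out) pairs."""
    total = 0.0
    for balls_faced, was_out in innings:
        # A genuine dismissal always counts in full; a not-out counts pro rata.
        total += 1.0 if was_out else partial_dismissal(balls_faced, avg_balls)
    return total


# Mavis's three not-out innings: 0 balls, 15 balls, 60 balls.
mavis = [(0, False), (15, False), (60, False)]
print(adjusted_dismissals(mavis, avg_balls=30))  # 0.0 + 0.5 + 1.0 = 1.5
```

Under the current convention the same three innings would add nothing to the divisor, since Mavis was never actually out.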

I argue that this partial discrediting of not-out innings is unfair. Should Mavis's batting statistics actually suffer more from his heroic 60-ball marathon than from the occasion he didn't even get to face a ball? One of the stated aims of this new technique is to avoid tail-enders bumping up their averages with lots of not outs - but surely a no-hope rabbit isn't likely to have scored many runs before his partner is dismissed (and so won't really have greatly boosted his average with the not-out score)? Likewise, should a more capable tail-ender be punished for the fact that he has run out of partners just as he was building a score?

It is surely broadly true for any batsman at any level, whether number one or eleven, that batting becomes easier - and the danger of being dismissed less acute - the longer he stays in and the more runs he has compiled. The not-out tail-ender at the end of the innings already stands to lose the benefit of 'having his eye in' come his next visit to the crease. It seems harsh that a revised approach to batting averages should target frequent not-outers any further.

The Maini and Narayanan proposal also brings up the topic of arch-finisher Michael Bevan's impressive one-day batting average of 53.6. Harshly labelling him a 'late-order batsman', they cite his frequent not-out innings (67 of 232 matches) as the cause of an undeservedly high average. The application of their technique reduces the value to the rather more human 38.7. Is this really fair though? In the countless ODIs where an undefeated Bevan - doubtless well established and seeing the ball like the proverbial football - shepherded Australia to yet another victory, should the statistics really deem him to be as good as dismissed?

Let us know your thoughts!


Ed said...

I've given this some thought, and I think I have a problem with the proposal: if a batsman has an average number of balls faced of, say, 50, and he is not out after facing 51 balls when the innings ends, he shouldn't be considered out.

He needs to be given credit for the fact that this is one of his better innings, so perhaps a better way of working it out would be to look at the average number of balls he faced in all innings that lasted longer than his average duration as measured by number of balls.

Only if he has faced more than that number of balls should his innings be considered complete.

Understand? I think I may have confused myself with that one....!
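For what it's worth, the threshold described in this comment can be sketched as below. It is one reading of the idea, not a tested scheme: the function name and the list of balls faced are made up for illustration, and ties (innings exactly equal to the average) are treated as not longer than average.

```python
from statistics import mean


def long_innings_threshold(balls_per_innings):
    """Mean balls faced over only those innings longer than the overall mean."""
    overall = mean(balls_per_innings)
    longer = [b for b in balls_per_innings if b > overall]
    # If every innings was the same length there are no 'longer' innings,
    # so fall back to the plain average.
    return mean(longer) if longer else overall


# Invented history: the overall mean is 50 balls,
# and the innings longer than that are 80 and 90.
history = [10, 20, 50, 80, 90]
print(long_innings_threshold(history))  # 85
```

With this history, a not-out innings of 51 balls would no longer count as complete; the batsman would need to face more than 85.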

David said...

Hmm. I guess your proposal cuts some slack towards the guys who have a statistical history of batting for days once they're established. A more mathematical benchmark might be e.g. a standard deviation (a measure of the 'spread' of values) above his average.
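As a rough sketch of that benchmark, assuming the population standard deviation is the one wanted (the sample version would give a slightly higher bar), and using an invented function name and invented figures:

```python
from statistics import mean, pstdev


def one_sigma_threshold(balls_per_innings):
    """Average innings length plus one (population) standard deviation."""
    return mean(balls_per_innings) + pstdev(balls_per_innings)


# Invented history of balls faced per innings, with a mean of 50.
history = [10, 20, 50, 80, 90]
print(round(one_sigma_threshold(history), 1))
```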

But on the other hand, I argued that our (now well settled) batsman doesn't deserve to be deemed dismissed after his 'average innings length', and so I would argue even more strongly that he deserves this fate even less after reaching this later point in a long innings!

And more pragmatically, a more important factor for the batting average algorithm is probably simplicity - both of calculation and comprehension!

Em said...

The probability of getting out on any particular ball is close enough to the probability of getting out on any other particular ball as makes no difference. The number of balls you will face before getting out therefore follows a geometric distribution (the waiting-time relative of the binomial).

The probability of you staying in for 31 balls, say, is the probability of staying in for 30 multiplied by p, your probability of not getting out on any particular ball. The p for a batsman is fairly easily derived from the number of balls he had faced each time he got out in previous games. The probability of someone being out by their 30th ball is a more sensible number to add to the divisor when they make it through 30 balls not out, since it will be between zero and one and will be higher the longer they stayed in. Compare the rather artificial "add one to the divisor if they were in longer than their average" system, whereby a batsman who gets out once and winds up not out after fewer balls than his average in every subsequent innings still gets away with murder.
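A minimal sketch of this idea follows, under two assumptions that should be stated loudly. First, the chance of surviving each ball is taken to be a constant p, which strictly makes the number of balls faced geometrically distributed. Second, the number added to the divisor is read as the probability of having been out by ball n, that is 1 - p^n, since that is the quantity that grows the longer the batsman stays in. The function names and the sample history are invented.

```python
def survival_prob_per_ball(balls_when_out):
    """Estimate p, the chance of surviving one ball, from past dismissals.

    balls_when_out lists the balls faced in each innings that ended in a
    dismissal. One dismissal per innings over the total balls faced gives
    the per-ball chance of getting out; p is its complement.
    """
    total_balls = sum(balls_when_out)
    if total_balls == 0:
        raise ValueError("need at least one ball faced in dismissed innings")
    out_prob = len(balls_when_out) / total_balls
    return 1.0 - out_prob


def prob_out_by(balls_faced, p):
    """Probability of having been dismissed within balls_faced deliveries."""
    return 1.0 - p ** balls_faced


# Invented history: dismissed three times, each after facing 30 balls.
p = survival_prob_per_ball([30, 30, 30])
for n in (0, 15, 30, 60):
    print(n, round(prob_out_by(n, p), 3))
```

Unlike the capped ratio, this never reaches a full dismissal, however long the not-out innings lasts; it only creeps towards one.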

David said...

"The probability of getting out on any particular ball is close enough to the probability of getting out on any other particular ball as makes no difference" - I disagree. Batsmen are always a lot more likely to get out before they've 'got their eye in'. So it won't be binomial, as the probability of dismissal isn't the same for every delivery.

If you could come up with a vaguely realistic model for P(n) (or indeed, P(n | 'still not out after (n-1)th ball') !) - and were able to calculate all the model's parameters for every batsman - then you'd probably have a better divisor than that article proposed, as you say. But it'd all be getting a bit silly. Clearly the current system is superior in terms of simplicity, if nothing else!

And as for the not-out batsman 'getting away with murder' - I personally reckon he deserves not to be penalized (i.e. there's nothing wrong with the current system) if nobody's able to get him out!