For some time I've been studying how best to measure group size, as opposed to how it is usually measured. In the process I've come across articles which discuss alternatives to the extreme spread, and why they are better for describing group size. Even though some of these articles have been around for years, it seems that either the information isn't widely known or for various reasons people have decided not to use it. Based on some of the arguments I've seen in forums elsewhere about statistics in general, I'm guessing the information just isn't widely known -- but maybe I'm wrong. Either way, hopefully this will be of some use to someone, and I'm interested in seeing what people might have to say about the subject in general.

The reasons for measuring group size fall into two categories: one, deciding who wins a competition, and two, making predictions about the future.

If you are looking at targets for a competition then extreme spread is fine because all that matters is who shot the best target(s) in a particular contest. If the two best shooters are A and B, and A is better than B then you can expect A to win most of the time. But if A is only a little better than B then random chance will allow B to beat A sometimes. The closer they are in ability and equipment, the more often B will win. This is why someone like me might be satisfied with factory loads and people like A and B worry about hand-loading. They need to eliminate as much random chance as they can to beat the other people who are close to them in ability. The higher up you get in competitive shooting, the more you have to worry about smaller and smaller effects so you can reduce the amount of random chance. Shooter A will win on average, but in any particular contest the winner is the winner regardless of how anyone does on average.

On the other hand, if you are comparing loads, or two different barrels, or two different rifles, etc., then you don't care how they do for just one particular target -- now what matters is what you can expect on average for all future shots under effectively the same conditions. If you shoot three shots of load X and three of load Y, and load X gives you a smaller group, then so what? What you really want to know is which load corresponds to the smaller group on average, for all future shots. Now we are talking about predicting the future and although you can still use extreme spread, it is an inefficient and crude statistic for making predictions.

To see why, we first have to consider that using statistics for making predictions involves the fact that the more samples we have (or equivalently the bigger the sample) the more accurate our prediction is likely to be. If we've seen shooters A and B in two competitions and B happened to win both, we might conclude B was better and predict he would win in the future. But if we had seen them in ten competitions and A won 7 out of 10 then our larger sample leads us to the more accurate conclusion that A is better than B. With an even larger sample we could probably say how much better A is than B. (Assuming neither gets better or worse.) Also note that we are using all 10 pieces of our data, not just two.

Keeping this in mind, now let's consider what extreme spread measures. For any group, the probability of a hit is highest at the center, decreasing as you move out. Extreme spread is the distance between the two shots which land farthest apart on any particular target (target = sample), and those are generally shots that hit far from the group center. This means extreme spread is based on the least probable shots. Even if we increase the number of shots for a particular target we are still basing our measurement on the two least probable shots. The information the rest of them could give us is thrown away. If we shoot several targets and average the extreme spreads this is a little better, but we are still basing our measurement on the least probable shots and throwing away most of the information we could have used. So for a given number of shots, our predictions about the future are less accurate than if we had used a more efficient statistic.
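To make the definition concrete, here is a minimal sketch of how extreme spread could be computed from measured hole positions. It assumes the usual definition (the largest center-to-center distance between any two holes); the function name and the coordinates are made up for illustration.

```python
import math
from itertools import combinations

def extreme_spread(hits):
    """Largest center-to-center distance between any two hits."""
    if len(hits) < 2:
        raise ValueError("need at least two hits")
    # Only the single widest pair matters; every other hit is ignored.
    return max(math.dist(p, q) for p, q in combinations(hits, 2))

# Hypothetical hole coordinates (any consistent unit, e.g. inches).
group = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (2.0, 2.5)]
print(extreme_spread(group))  # 5.0
```

Note that moving the two inner holes anywhere inside the group leaves the result unchanged, which is exactly the information being thrown away.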

There are two other problems with extreme spread. First, even if two groups had the same size (that is, density of shots near the center) the extreme spread can still vary greatly between them. Again, this is because extreme spread is based on the two least likely shots in the group.

The second problem is that on average extreme spread increases with the number of shots in the group. Think about what this means. If you are testing a load and using extreme spread to measure group size, then the more shots you fire, the larger you will claim your group size to be. I've seen people get into really heated arguments (on other forums) about this, not realizing that what they are seeing is just the statistical nature of extreme spread rather than anything about the load. For a particular load you should expect a certain fraction of the shots to always hit within a certain fixed distance from the center of the group regardless of how many shots you fire. This distance should correspond to the group size.
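This growth is easy to see in a quick simulation. The sketch below is only an illustration under an assumed model: every shot is drawn from the same round bivariate normal distribution, so the "load" never changes. The names, the sigma value, and the trial count are all my own choices.

```python
import math
import random
from itertools import combinations

def extreme_spread(hits):
    """Largest distance between any two hits."""
    return max(math.dist(p, q) for p, q in combinations(hits, 2))

def mean_extreme_spread(shots, trials=2000, sigma=1.0, seed=1):
    """Average extreme spread over many simulated groups of `shots` shots.

    Assumption: horizontal and vertical errors are independent normal
    variables with the same standard deviation `sigma`.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        hits = [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
                for _ in range(shots)]
        total += extreme_spread(hits)
    return total / trials

# Same simulated load every time; only the shots per group change.
for n in (3, 5, 10, 20):
    print(n, round(mean_extreme_spread(n), 2))
```

The average extreme spread climbs as the shots per group go up, even though sigma, which is what actually describes the load in this model, is held fixed.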

A better measure of group size would be one which becomes more accurate as you increase the number of shots in the group. While you might see more low probability shots at the outer edge of the group, you will see many more high probability shots filling in the center of the group. A good measure of group size should take the increasing density of shots near the center into account, and extreme spread doesn't.

I've worked out the math and used the results to write a program to calculate the probability of seeing various extreme spreads given a fixed group size and the number of shots in the groups. Here are a few results, all for a fixed group size of 1.0 MOA:

For a 3-shot group the probability of seeing an extreme spread greater than 1.0 MOA is 57.8%; for greater than 1.5 MOA it's 12.7%.

For a 5-shot group the numbers are: greater than 1.0 MOA 94.4%; greater than 1.5 MOA 36.4%.

For a 10-shot group: greater than 1.0 MOA, more than 99.99%; greater than 1.5 MOA, 86.9%.
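The math behind these numbers isn't shown here, but the same kind of question can be approximated by brute force. The following is a rough Monte Carlo sketch, not the program mentioned above. It assumes a round bivariate normal group, and the sigma value of 0.4 is an arbitrary placeholder, not the definition of a 1.0 MOA group size used for the figures above, so the output should not be expected to match those percentages.

```python
import math
import random
from itertools import combinations

def extreme_spread(hits):
    """Largest distance between any two hits."""
    return max(math.dist(p, q) for p, q in combinations(hits, 2))

def prob_spread_exceeds(threshold, shots, sigma, trials=5000, seed=1):
    """Estimated probability that a group's extreme spread is > threshold.

    Assumption: each shot's horizontal and vertical error is an independent
    normal variable with standard deviation `sigma` (same units as threshold).
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        hits = [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
                for _ in range(shots)]
        if extreme_spread(hits) > threshold:
            exceed += 1
    return exceed / trials

SIGMA = 0.4  # hypothetical value in MOA, chosen only for illustration
for shots in (3, 5, 10):
    p10 = prob_spread_exceeds(1.0, shots, SIGMA)
    p15 = prob_spread_exceeds(1.5, shots, SIGMA)
    print(f"{shots}-shot: P(>1.0)={p10:.3f}  P(>1.5)={p15:.3f}")
```

Whatever sigma is chosen, the pattern is the same as in the figures above: for a fixed threshold, the chance of exceeding it rises with the number of shots in the group.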

So if you are comparing two loads and you shoot enough shots, then extreme spread can probably tell you if there is a big difference, but not a small one. Assume shooter A is only beating shooter B by a fraction of an MOA. Then, if shooter B could only work up a load (or bullet treatment or whatever) that on average improved his group size by that fraction he could start beating shooter A on average and win most of the time. But assuming they both already have pretty good loads, a small improvement is all B can hope for and extreme spread isn't good enough to show small differences. B might try several things, each of which would have done the trick, but give up on them because extreme spread is too coarse to show the small change in average group size.
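To get a feel for how coarse this is, here is a small simulation sketch. It assumes both loads produce round bivariate normal groups and that the improved load has a sigma 10% smaller; that 10% figure, the function names, and the trial count are all hypothetical choices for illustration.

```python
import math
import random
from itertools import combinations

def extreme_spread(hits):
    """Largest distance between any two hits."""
    return max(math.dist(p, q) for p, q in combinations(hits, 2))

def sample_group(rng, shots, sigma):
    """One simulated group from a round bivariate normal distribution."""
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(shots)]

def better_load_looks_better(shots=5, improvement=0.10, trials=4000, seed=1):
    """Fraction of head-to-head comparisons in which the truly better load
    also shows the smaller extreme spread (one group per load)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        old = extreme_spread(sample_group(rng, shots, 1.0))
        new = extreme_spread(sample_group(rng, shots, 1.0 - improvement))
        if new < old:
            wins += 1
    return wins / trials

print(better_load_looks_better())
```

A result near 0.5 means a single pair of groups is close to a coin flip; the closer the printed fraction is to that, the more likely B is to discard a change that really did help.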

So what would be better? There are several alternatives. I'll describe three in order of increasing quality.

The first is usually called the mean radius. This is the average of the distances for all hits from the center of the group. This is a large improvement over extreme spread because it takes all of the holes in a target into account. The problem is that it treats low probability and high probability hits equally.
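As a minimal sketch, mean radius can be computed like this. It assumes the group center is taken as the average of the hole coordinates; the function names and sample coordinates are hypothetical.

```python
import math

def centroid(hits):
    """Group center: the average x and average y of all hits."""
    n = len(hits)
    return (sum(x for x, _ in hits) / n, sum(y for _, y in hits) / n)

def mean_radius(hits):
    """Average distance of the hits from the group center."""
    center = centroid(hits)
    return sum(math.dist(h, center) for h in hits) / len(hits)

# Hypothetical four-shot group at the corners of a 2 x 2 square.
group = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
print(mean_radius(group))  # about 1.414
```

Unlike extreme spread, every hole contributes to the result, so moving any one of them changes the answer.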

An improvement over this is a statistic referred to as radial standard deviation (RSD). It is the standard deviation of the distances of all the holes from the center of the group. It does a better job of weighting the importance of high and low probability holes and so is a more efficient statistic. (More efficient means getting a more accurate result for the same number of shots.)
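Here is a minimal sketch of one way to compute it. Definitions vary a little between sources, so treat this as an assumption: it uses the convention in which RSD is the square root of the sum of the horizontal and vertical sample variances, which is the same as a root-mean-square distance from the group center with an n - 1 divisor. The function name and coordinates are made up.

```python
import math

def radial_sd(hits):
    """Radial standard deviation, assumed here to be
    sqrt(var_x + var_y) using sample variances (n - 1 divisor)."""
    n = len(hits)
    if n < 2:
        raise ValueError("need at least two hits")
    cx = sum(x for x, _ in hits) / n
    cy = sum(y for _, y in hits) / n
    # Squared distance of each hit from the group center.
    sum_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in hits)
    return math.sqrt(sum_sq / (n - 1))

# Hypothetical four-shot group at the corners of a 2 x 2 square.
group = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
print(radial_sd(group))  # about 1.633
```

Because the distances are squared before averaging, holes far from the center count for more than they do in the mean radius.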

The mathematically best possible measure is what is referred to in research circles as "width". For a perfectly round group the width gives a number proportional to the RSD, but in reality groups aren't perfectly round. Width takes this into account, with the result that it gives an even more accurate and efficient measure of group size. In addition, with some related statistics, it can tell you not only the size of the group, but also its shape and orientation. For someone with enough shooting experience, all of this could tell you an awful lot about a load, barrel harmonics, flinching, or anything else you are testing.
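The exact formula for width isn't given here, so the sketch below is not that statistic. It only illustrates, under my own assumptions, one standard way to pull size, shape, and orientation out of a group: take the 2 x 2 sample covariance of the hole coordinates and find its principal axes. All names and coordinates are hypothetical.

```python
import math

def group_shape(hits):
    """Return (major_sd, minor_sd, angle_deg) from the sample covariance.

    major_sd / minor_sd: standard deviations along the long and short axes.
    angle_deg: direction of the long axis, counterclockwise from horizontal.
    """
    n = len(hits)
    if n < 2:
        raise ValueError("need at least two hits")
    cx = sum(x for x, _ in hits) / n
    cy = sum(y for _, y in hits) / n
    sxx = sum((x - cx) ** 2 for x, _ in hits) / (n - 1)
    syy = sum((y - cy) ** 2 for _, y in hits) / (n - 1)
    sxy = sum((x - cx) * (y - cy) for x, y in hits) / (n - 1)
    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    mid = (sxx + syy) / 2.0
    half_diff = math.hypot((sxx - syy) / 2.0, sxy)
    major_sd = math.sqrt(mid + half_diff)
    minor_sd = math.sqrt(max(mid - half_diff, 0.0))  # guard against rounding
    angle_deg = 0.5 * math.degrees(math.atan2(2.0 * sxy, sxx - syy))
    return major_sd, minor_sd, angle_deg

# Hypothetical group strung out along a 45 degree diagonal.
print(group_shape([(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]))
```

A major axis much longer than the minor one, together with its angle, is the kind of shape and orientation information that extreme spread, mean radius, and RSD all collapse into a single number.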

There are two major trade-offs in using any of these alternatives to the extreme spread -- time and difficulty of the calculations. Extreme spread might take a few seconds to measure. With a laptop computer and software to handle the calculations, and a caliper for making measurements, it might take two or three minutes to get one of these other statistics for a five-shot group. Of course at the same time you could also have the computer give you a lot of additional information. If you trade off a little accuracy then there are also ways to transfer data directly from the target to the computer, skipping the most time-consuming part of the whole process. Given the amount of time and money people spend making small tweaks to their guns and to their loads, I think there are people who would find the results worth the trouble.

For a different perspective on the same subject (with similar conclusions) here is a link to an article on alternatives to extreme spread written more than ten years ago.