Let's investigate the range of $x_1\le x_2\le\cdots\le x_n$ given that their arithmetic mean (AM) is a small multiple $1+\delta$ of their geometric mean (GM) (with $\delta\ge 0$). In the question, $\delta\approx 0.001$ but we don't know $n$.
Since the ratio of these means does not change when the units of measurement are changed, pick a unit for which the GM is $1$. Thus, we seek to maximize $x_n$ subject to the constraints that $x_1+x_2+\cdots+x_n=n(1+\delta)$ and $x_1\cdot x_2\cdots x_n=1$.
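The maximum is attained by making $x_1=x_2=\cdots=x_{n-1}=x$, say, and $x_n=z\ge x$. Thus
$$n(1+\delta)=x_1+\cdots+x_n=(n-1)x+z$$
and
$$1=x_1\cdot x_2\cdots x_n=x^{n-1}z.$$
Substituting $z=n(1+\delta)-(n-1)x$ from the first equation into the second shows that $x$ is a root between $0$ and $1$ of
$$(1-n)x^n+n(1+\delta)x^{n-1}-1.$$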
It is easily found iteratively. Here are the graphs of the optimal $x$ and $z$ as functions of $\delta$ for $n=6, 20, 50, 150$, left to right:
[Figure: optimal $x$ (lower blue curves) and $z$ (upper red curves) plotted against $\delta$, one panel per value of $n$.]
As soon as $n$ reaches any appreciable size, even a tiny ratio of $1.001$ is consistent with one large outlying $x_n$ (the upper red curves) and a group of tightly clustered $x_i$ (the lower blue curves).
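The iterative solution can be sketched in R as follows. This is only an illustration: the function name `extreme_outlier` and the choice of `uniroot` as the root finder are assumptions made here, not something the answer specifies. The polynomial is increasing on $(0,1)$, negative at $0$ and positive at $1$, so bracketing the root on that interval is safe for $\delta>0$.

```r
# Hypothetical helper: with GM = 1, find the common value x of the n - 1
# clustered points and the outlier z = x_n that maximizes the range.
extreme_outlier <- function(n, delta) {
  # Polynomial whose root in (0, 1) is the common value x
  f <- function(x) (1 - n) * x^n + n * (1 + delta) * x^(n - 1) - 1
  x <- uniroot(f, interval = c(0, 1), tol = 1e-12)$root
  z <- n * (1 + delta) - (n - 1) * x   # from the sum constraint
  c(x = x, z = z, range = z - x)
}

# Tabulate the extreme configuration for the sample sizes in the text
ns <- c(6, 20, 50, 150)
res <- cbind(n = ns, t(sapply(ns, extreme_outlier, delta = 0.001)))
print(res)
```

Each row gives the clustered value, the outlier, and their difference; the difference grows with $n$ even though $\delta$ is held at $0.001$.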
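At the other extreme, suppose $n=2k$ is even (for simplicity). The minimum range is achieved when half the $x_i$ equal one value $x\le 1$ and the other half equal another value $z\ge 1$. The constraints then reduce to $x+z=2(1+\delta)$ and $xz=1$ (because $x^kz^k=1$), so $x$ and $z$ are the two roots of $t^2-2(1+\delta)t+1=0$. The solution (which is easily checked) is
$$x,\ z=1+\delta\mp\sqrt{\delta^2+2\delta}.$$
For tiny $\delta$, we may ignore the $\delta^2$ as an approximation, giving
$$x\approx 1+\delta-\sqrt{2\delta};\quad z\approx 1+\delta+\sqrt{2\delta}.$$
The range is approximately $2\sqrt{2\delta}=\sqrt{8\delta}$, which is about $0.09$ for $\delta=0.001$ and does not depend on $n$.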
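The two-cluster configuration can be checked numerically. With half the values at $x$ and half at $z$, the sum and product constraints reduce to $x+z=2(1+\delta)$ and $xz=1$, which the sketch below solves directly and then verifies by recomputing AM/GM on the full data vector. The function name `two_cluster` is an assumption made here for illustration.

```r
# Hypothetical helper: k values at x and k values at z, with GM = 1.
two_cluster <- function(delta, k = 1) {
  s <- sqrt(delta^2 + 2 * delta)
  x <- 1 + delta - s                   # smaller root of t^2 - 2(1 + delta)t + 1
  z <- 1 + delta + s                   # larger root; note x * z = 1
  v <- rep(c(x, z), each = k)          # the full data set of n = 2k values
  c(x = x, z = z, range = z - x,
    am_over_gm = mean(v) / exp(mean(log(v))))  # should equal 1 + delta
}

two_cluster(0.001, k = 1)    # n = 2
two_cluster(0.001, k = 75)   # n = 150: same x, z, and range
```

Comparing the two calls shows that replicating the pair changes neither the AM/GM ratio nor the range.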
In this manner we have obtained upper and lower bounds on the possible range of the data. The upper bound depends heavily on the amount of data $n$: it shows the range can be appreciable even for tiny $\delta$, which sharpens our sense of just how close to each other the data points really need to be. The lower bound shows that the range cannot be arbitrarily small, either.
Similar analyses, just as easily carried out, can inform you--quantitatively--of how tightly clustered the $x_i$ might be in terms of any other measure of spread, such as their variance or coefficient of variation.
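As one illustration of such an analysis, here is a rough second-order sketch for the coefficient of variation. It assumes every $x_i$ lies close to the mean $\mu$, an assumption that fails for the single-outlier configuration. The symbols $\mu$, $\epsilon_i$, and $\text{CV}$ (computed with divisor $n$) are introduced here for the purpose. Writing $x_i=\mu(1+\epsilon_i)$ with $\sum_i\epsilon_i=0$,

$$\log\frac{\text{AM}}{\text{GM}}=-\frac{1}{n}\sum_{i=1}^n\log(1+\epsilon_i)\approx\frac{1}{2n}\sum_{i=1}^n\epsilon_i^2=\frac{\text{CV}^2}{2}.$$

Since $\log(1+\delta)\approx\delta$ for small $\delta$, this suggests $\text{CV}\approx\sqrt{2\delta}$, or roughly $0.045$ when $\delta=0.001$, provided the data contain no far outliers.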
x <- c(-5, -5, 1, 2, 3, 10)  # data that include negative values
prod(x)^(1/length(x))        # sixth root of the product (1500)
[1] 3.383363
(while the arithmetic mean is 1)
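The computation above yields a real number only because the two negative values make the product positive; the AM-GM inequality is guaranteed only for nonnegative data. A minimal sketch of a guard is shown below. The helper `geo_mean` is hypothetical (it is not part of base R), and computing on the log scale is a design choice made here to avoid overflow in long products.

```r
# Hypothetical helper: geometric mean that refuses non-positive input,
# computed on the log scale rather than via prod().
geo_mean <- function(x) {
  if (any(x <= 0)) stop("geometric mean requires strictly positive values")
  exp(mean(log(x)))
}

x <- c(1, 2, 3, 10)
geo_mean(x) <= mean(x)                  # TRUE, as AM-GM guarantees
try(geo_mean(c(-5, -5, 1, 2, 3, 10)))   # reports an error instead of a value
```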