Okay, so rather than re-derive Saunder's equation (5), I will just state it here. Conditions 1 and 2 imply the following equality:
$$\prod_{j=1}^{m}\left(\sum_{k\neq i}h_kd_{jk}\right)=\left(\sum_{k\neq i}h_k\right)^{m-1}\left(\sum_{k\neq i}h_k\prod_{j=1}^{m}d_{jk}\right)$$
where
$$d_{jk}=P(D_j|H_k,I)\qquad h_k=P(H_k|I)$$
Now we can specialise to the case $m=2$ (two data sets) by taking $D_1^{(1)}\equiv D_1$ and relabeling $D_2^{(1)}\equiv D_2D_3\dots D_m$. Note that these two data sets still satisfy conditions 1 and 2, so the result above applies to them as well. Expanding in the case $m=2$ we get:
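$$\left(\sum_{k\neq i}h_kd_{1k}\right)\left(\sum_{l\neq i}h_ld_{2l}\right)=\left(\sum_{k\neq i}h_k\right)\left(\sum_{l\neq i}h_ld_{1l}d_{2l}\right)$$
$$\rightarrow\sum_{k\neq i}\sum_{l\neq i}h_kh_ld_{1k}d_{2l}=\sum_{k\neq i}\sum_{l\neq i}h_kh_ld_{1l}d_{2l}$$
$$\rightarrow\sum_{k\neq i}\sum_{l\neq i}h_kh_ld_{2l}(d_{1k}-d_{1l})=0\qquad(i=1,\dots,n)$$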
The term $(d_{1a}-d_{1b})$ occurs twice in the above double summation, once when $k=a$ and $l=b$, and once again when $k=b$ and $l=a$. This will occur as long as $a,b\neq i$. The coefficients of the two occurrences are $d_{2b}$ and $-d_{2a}$ (each multiplied by $h_ah_b$). Now because there are $n$ of these equations (one for each $i$), we can actually remove $i$ from these equations. To illustrate, take $i=1$: this means we have all conditions except those where $a=1,b=2$ and $b=1,a=2$. Now take $i=3$, and we can have these two conditions (note this assumes at least three hypotheses). So the equation can be rewritten as:
$$\sum_{l>k}h_kh_l(d_{2l}-d_{2k})(d_{1k}-d_{1l})=0$$
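The rearrangement from the product form to the pairwise sum can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: the priors `h` and likelihoods `d1`, `d2` are arbitrary random values, and all function names are hypothetical. For each excluded index $i$ it compares the difference of the two sides of the $m=2$ equality, the double sum over $k,l\neq i$, and the sum over pairs $l>k$ with both indices still restricted to $k,l\neq i$.

```python
import random
from itertools import combinations

def product_form_gap(h, d1, d2, i):
    """Left side minus right side of the m=2 equality, excluding hypothesis i."""
    idx = [k for k in range(len(h)) if k != i]
    lhs = sum(h[k] * d1[k] for k in idx) * sum(h[l] * d2[l] for l in idx)
    rhs = sum(h[k] for k in idx) * sum(h[l] * d1[l] * d2[l] for l in idx)
    return lhs - rhs

def double_sum(h, d1, d2, i):
    """Double sum over k, l != i of h_k h_l d_2l (d_1k - d_1l)."""
    idx = [k for k in range(len(h)) if k != i]
    return sum(h[k] * h[l] * d2[l] * (d1[k] - d1[l]) for k in idx for l in idx)

def pair_sum(h, d1, d2, i):
    """Sum over unordered pairs k < l (both != i) of the combined term."""
    idx = [k for k in range(len(h)) if k != i]
    return sum(
        h[k] * h[l] * (d2[l] - d2[k]) * (d1[k] - d1[l])
        for k, l in combinations(idx, 2)
    )

# Illustrative inputs only (assumed, not taken from the source).
rng = random.Random(0)
n = 5
raw = [rng.random() + 0.1 for _ in range(n)]
h = [x / sum(raw) for x in raw]          # positive priors summing to 1
d1 = [rng.random() for _ in range(n)]    # stand-ins for P(D_1 | H_k, I)
d2 = [rng.random() for _ in range(n)]    # stand-ins for P(D_2 | H_k, I)

for i in range(n):
    a = product_form_gap(h, d1, d2, i)
    b = double_sum(h, d1, d2, i)
    c = pair_sum(h, d1, d2, i)
    print(i, abs(a - b) < 1e-12, abs(b - c) < 1e-12)
```

The three quantities agree for generic inputs because the rearrangement is an algebraic identity; conditions 1 and 2 only enter once the sums are required to equal zero.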
Now each of the $h_i$ terms must be greater than zero, for otherwise we are dealing with $n_1<n$ hypotheses, and the answer can be reformulated in terms of $n_1$. So these can be removed from the above set of conditions:
$$\sum_{l>k}(d_{2l}-d_{2k})(d_{1k}-d_{1l})=0$$
Thus, there are $\frac{n(n-1)}{2}$ conditions that must be satisfied, and each condition implies one of two "sub-conditions": that $d_{jk}=d_{jl}$ for either $j=1$ or $j=2$ (but not necessarily both). Now we have a set of all of the unique pairs $(k,l)$ for $d_{jk}=d_{jl}$. If we were to take $n-1$ of these pairs for one of the $j$, then we would have all the numbers $1,\dots,n$ in the set, and $d_{j1}=d_{j2}=\dots=d_{j,n-1}=d_{j,n}$. This is because the first pair has 2 elements, and each additional pair brings at least one additional element to the set.*
But note that because there are $\frac{n(n-1)}{2}$ conditions, we must choose at least the smallest integer greater than or equal to $\frac{1}{2}\times\frac{n(n-1)}{2}=\frac{n(n-1)}{4}$ of them for one of $j=1$ or $j=2$. If $n>4$ then the number of terms chosen is greater than $n-1$. If $n=4$ or $n=3$ then this minimum is exactly $n-1$ terms. This implies that $d_{j1}=d_{j2}=\dots=d_{j,n-1}=d_{j,n}$. Only with two hypotheses ($n=2$) does this not occur. But from the last equation in Saunder's article this equality condition implies:
$$P(D_j|\overline{H}_i)=\frac{\sum_{k\neq i}d_{jk}h_k}{\sum_{k\neq i}h_k}=\frac{d_{ji}\sum_{k\neq i}h_k}{\sum_{k\neq i}h_k}=d_{ji}=P(D_j|H_i)$$
Thus, in the likelihood ratio we have:
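$$\frac{P(D_1^{(1)}|H_i)}{P(D_1^{(1)}|\overline{H}_i)}=\frac{P(D_1|H_i)}{P(D_1|\overline{H}_i)}=1\quad\text{OR}\quad\frac{P(D_2^{(1)}|H_i)}{P(D_2^{(1)}|\overline{H}_i)}=\frac{P(D_2D_3\dots D_m|H_i)}{P(D_2D_3\dots D_m|\overline{H}_i)}=1$$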
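The step from equal likelihoods to a unit likelihood ratio can also be checked with numbers. This is a small sketch under assumed inputs: the priors `h` are random, the common likelihood value `0.3` is arbitrary, and `prob_given_not_Hi` is a hypothetical helper that evaluates the prior-weighted average over $k\neq i$ used in the equation for $P(D_j|\overline{H}_i)$.

```python
import random

def prob_given_not_Hi(h, dj, i):
    """Prior-weighted average of d_jk over all hypotheses k != i."""
    idx = [k for k in range(len(h)) if k != i]
    return sum(dj[k] * h[k] for k in idx) / sum(h[k] for k in idx)

# Illustrative inputs only (assumed, not taken from the source).
rng = random.Random(2)
n = 4
raw = [rng.random() + 0.1 for _ in range(n)]
h = [x / sum(raw) for x in raw]   # positive priors summing to 1
dj = [0.3] * n                    # d_j1 = d_j2 = ... = d_jn

# Likelihood ratio P(D_j | H_i) / P(D_j | not H_i) for every i.
ratios = [dj[i] / prob_given_not_Hi(h, dj, i) for i in range(n)]
print(ratios)
```

When all the $d_{jk}$ are equal, the weights $h_k$ cancel, so the ratio does not depend on the priors at all.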
To complete the proof, note that if the second condition holds, the result is already proved, and only one ratio can be different from 1. If the first condition holds, then we can repeat the above analysis by relabeling $D_1^{(2)}\equiv D_2$ and $D_2^{(2)}\equiv D_3\dots D_m$. Then we would have either $D_1,D_2$ not contributing, or $D_2$ being the only contributor. We would then have a third relabeling when $D_1,D_2$ not contributing holds, and so on. Thus, only one data set can contribute to the likelihood ratio when condition 1 and condition 2 hold and there are more than two hypotheses.
*NOTE: An additional pair might bring no new terms, but this would be offset by a pair which brought 2 new terms. E.g. take $d_{j1}=d_{j2}$ as the first [+2], $d_{j1}=d_{j3}$ [+1] and $d_{j2}=d_{j3}$ [+0], but the next term must have $d_{jk}=d_{jl}$ for both $k,l\notin(1,2,3)$. This will add two terms [+2]. If $n=4$ then we don't need to choose any more, but for the "other" $j$ we must choose the 3 pairs which are not $(1,2),(2,3),(1,3)$. These are $(1,4),(2,4),(3,4)$ and thus the equality holds, because all numbers $(1,2,3,4)$ are in the set.
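The pair-counting bound used in the argument can be tabulated for small $n$. This sketch only reproduces the arithmetic (the smallest integer at least $\frac{n(n-1)}{4}$, compared with $n-1$); it does not check which particular pairs are chosen, and the function name is hypothetical.

```python
import math

def min_pairs_for_one_j(n):
    """Smallest number of the n(n-1)/2 pair conditions that must fall on one j."""
    total_pairs = n * (n - 1) // 2
    return math.ceil(total_pairs / 2)

# Map n -> (minimum pairs forced onto one j, n - 1) for a few small n.
table = {n: (min_pairs_for_one_j(n), n - 1) for n in range(3, 9)}
for n, (forced, needed) in table.items():
    print(n, forced, needed)
```

For $n=3$ and $n=4$ the two columns coincide, and from $n=5$ onwards the first column is strictly larger, matching the cases distinguished in the text.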