Expected value of the sample median given the sample mean


16

Let $Y$ denote the median and let $\bar X$ denote the mean of a random sample of size $n = 2k+1$ from a distribution that is $N(\mu, \sigma^2)$. How can I compute $E(Y \mid \bar X = \bar x)$?

Intuitively, because of the normality assumption, it makes sense to claim that $E(Y \mid \bar X = \bar x) = \bar x$, and indeed that is the correct answer. Can it be shown rigorously, though?

My initial thought was to approach this problem using the conditional normal distribution, which is generally a known result. The problem is that I do not know the expected value, and consequently the variance, of the median, so I would have to compute them using the $(k+1)$-th order statistic. That is very complicated, and I would rather not go there unless I absolutely have to.


2
I believe this is an immediate consequence of a generalization I just posted at stats.stackexchange.com/a/83887. The distribution of the residuals $x_i - \bar x$ is clearly symmetric about $0$, whence their median has a symmetric distribution, so its mean is zero. Therefore the expectation of the median itself (not just of the residuals) equals $0 + E(\bar X \mid \bar X = \bar x) = \bar x$, QED.
whuber

@whuber Excuse me, residuals?
JohnK

I defined them in my comment: they are the differences between the $x_i$ and their mean.
whuber

@whuber No, that I understood, but I am still working on understanding how your other answer relates to my question and how exactly you have used the expectation.
JohnK

2
@whuber Okay, then please correct me if I'm wrong: $E(Y \mid \bar X) = E(\bar X \mid \bar X) + E(Y - \bar X \mid \bar X)$. And now the second term is zero because the median is symmetric around $\bar x$. Therefore, the expectation reduces to $\bar x$.
JohnK

Answers:


7

Let $X$ denote the original sample and $Z$ the random vector with entries $Z_k = X_k - \bar X$. Then $Z$ is normal centered (but its entries are not independent, as can be seen from the fact that their sum is zero with full probability). As a linear functional of $X$, the vector $(Z, \bar X)$ is normal, hence the computation of its covariance matrix suffices to show that $Z$ is independent of $\bar X$.
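The covariance computation mentioned above is short enough to spell out. The following is a sketch that assumes, as in the question, that the $X_j$ are i.i.d. $N(\mu, \sigma^2)$; the index $j$ is introduced here only for the sum.

```latex
\begin{aligned}
\operatorname{Cov}(X_k, \bar X)
  &= \frac{1}{n} \sum_{j=1}^{n} \operatorname{Cov}(X_k, X_j)
   = \frac{\sigma^2}{n}, \\
\operatorname{Cov}(Z_k, \bar X)
  &= \operatorname{Cov}(X_k, \bar X) - \operatorname{Var}(\bar X)
   = \frac{\sigma^2}{n} - \frac{\sigma^2}{n}
   = 0 .
\end{aligned}
```

Since $(Z, \bar X)$ is jointly normal, zero covariance between every $Z_k$ and $\bar X$ is enough to conclude independence.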

Turning to $Y$, one sees that $Y = \bar X + T$, where $T$ is the median of $Z$. In particular, $T$ depends on $Z$ only, hence $T$ is independent of $\bar X$, and the distribution of $Z$ is symmetric, hence $T$ is centered.

Finally,

$$E(Y \mid \bar X) = \bar X + E(T \mid \bar X) = \bar X + E(T) = \bar X.$$
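A quick Monte Carlo check can make the decomposition $Y = \bar X + T$ more tangible. The sketch below is not part of the original answer: the function name, the sample size `n=7`, and the values of `mu` and `sigma` are arbitrary illustrative choices. It estimates the mean of $T$, the correlation between $T$ and $\bar X$, and the least-squares line of $Y$ on $\bar X$.

```python
import numpy as np

def simulate_median_decomposition(n=7, mu=2.0, sigma=1.5, reps=200_000, seed=0):
    """Simulate normal samples of odd size n and examine T = median - mean."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=(reps, n))
    xbar = x.mean(axis=1)          # sample means
    y = np.median(x, axis=1)       # sample medians
    t = y - xbar                   # median of the residuals x_i - xbar
    # Least-squares fit of the median on the mean: y ~ slope * xbar + intercept
    slope, intercept = np.polyfit(xbar, y, 1)
    return {
        "mean_T": t.mean(),
        "corr_T_xbar": np.corrcoef(t, xbar)[0, 1],
        "slope": slope,
        "intercept": intercept,
    }

result = simulate_median_decomposition()
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

If the argument holds, `mean_T` and `corr_T_xbar` should be close to zero and the fitted line close to the identity, up to simulation noise. Zero correlation alone does not prove independence; the simulation is only a sanity check of the proof, not a substitute for it.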

Thank you, this was asked almost a year ago and I am very glad that someone finally cleared it up.
JohnK

7

The sample median is an order statistic and has a non-normal distribution, so the joint finite-sample distribution of sample median and sample mean (which has a normal distribution) would not be bivariate normal. Resorting to approximations, asymptotically the following holds (see my answer here):

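$$\sqrt{n}\left[\begin{pmatrix} \bar X_n \\ Y_n \end{pmatrix} - \begin{pmatrix} \mu \\ v \end{pmatrix}\right] \xrightarrow{L} N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Sigma\right]$$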

with

$$\Sigma = \begin{pmatrix} \sigma^2 & E(|X - v|)\,[2f(v)]^{-1} \\ E(|X - v|)\,[2f(v)]^{-1} & [2f(v)]^{-2} \end{pmatrix}$$

where $\bar X_n$ is the sample mean and $\mu$ the population mean, $Y_n$ is the sample median and $v$ the population median, $f(\cdot)$ is the probability density of the random variables involved, and $\sigma^2$ is the variance.

So approximately for large samples, their joint distribution is bivariate normal, so we have that

$$E(Y_n \mid \bar X_n = \bar x) = v + \rho\,\frac{\sigma_v}{\sigma_{\bar X}}\,(\bar x - \mu)$$

where $\rho$ is the correlation coefficient.

Manipulating the asymptotic distribution to become the approximate large-sample joint distribution of sample mean and sample median (and not of the standardized quantities), we have

$$\rho = \frac{\frac{1}{n} E(|X - v|)\,[2f(v)]^{-1}}{\frac{1}{n}\,\sigma\,[2f(v)]^{-1}} = \frac{E(|X - v|)}{\sigma}$$

So

$$E(Y_n \mid \bar X_n = \bar x) = v + \frac{E(|X - v|)}{\sigma}\,\frac{[2f(v)]^{-1}}{\sigma}\,(\bar x - \mu)$$

We have that $2f(v) = 2/(\sigma\sqrt{2\pi})$ due to the symmetry of the normal density, so we arrive at

$$E(Y_n \mid \bar X_n = \bar x) = v + \sqrt{\frac{\pi}{2}}\; E\left(\left|\frac{X - \mu}{\sigma}\right|\right)(\bar x - \mu)$$

where we have used $v = \mu$. Now the standardized variable is a standard normal, so its absolute value has a half-normal distribution with expected value equal to $\sqrt{2/\pi}$ (since the underlying variance is unity). So

$$E(Y_n \mid \bar X_n = \bar x) = v + \sqrt{\frac{\pi}{2}}\sqrt{\frac{2}{\pi}}\,(\bar x - \mu) = v + \bar x - \mu = \bar x$$
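The constants in this derivation can be checked numerically. The sketch below simply plugs the normal density at the median and the half-normal mean into the entries of $\Sigma$; the function name `asymptotic_slope` is hypothetical, and nothing here goes beyond the formulas stated in the answer.

```python
import math

def asymptotic_slope(sigma=1.0):
    """Return (rho, slope) of the asymptotic regression of the median on the mean."""
    # Normal density evaluated at the population median v = mu
    f_v = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    inv_2f = 1.0 / (2.0 * f_v)                        # [2 f(v)]^{-1} = sigma * sqrt(pi / 2)
    mean_abs_dev = sigma * math.sqrt(2.0 / math.pi)   # E|X - mu| for a normal variable
    cov = mean_abs_dev * inv_2f                       # off-diagonal entry of Sigma
    var_mean = sigma ** 2                             # top-left entry of Sigma
    var_median = inv_2f ** 2                          # bottom-right entry of Sigma
    rho = cov / math.sqrt(var_mean * var_median)
    slope = rho * math.sqrt(var_median) / math.sqrt(var_mean)
    return rho, slope

rho, slope = asymptotic_slope(sigma=2.5)
print(rho, slope)
```

The slope multiplying $(\bar x - \mu)$ comes out as 1 for any `sigma`, because the off-diagonal entry of $\Sigma$ equals $\sigma^2$, the asymptotic variance of the mean. The correlation itself is $\sqrt{2/\pi}$.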

2
As always, nice answer +1. However, since we have no information about the sample size, the asymptotic distribution might not hold. If there is no way to obtain the exact distribution though, I suppose I'll have to make do. Thank you very much.
JohnK

6

The answer is $\bar x$.

Let $x = (x_1, x_2, \ldots, x_n)$ have a multivariate distribution $F$ for which all the marginals are symmetric about a common value $\mu$. (It does not matter whether they are independent or even are identically distributed.) Define $\bar x$ to be the arithmetic mean of the $x_i$, $\bar x = (x_1 + x_2 + \cdots + x_n)/n$, and write $x - \bar x = (x_1 - \bar x, x_2 - \bar x, \ldots, x_n - \bar x)$ for the vector of residuals. The symmetry assumption on $F$ implies the distribution of $x - \bar x$ is symmetric about $0$; that is, when $E \subset \mathbb{R}^n$ is any event,

$$\Pr_F(x - \bar x \in E) = \Pr_F(x - \bar x \in -E).$$

Applying the generalized result at stats.stackexchange.com/a/83887 shows that the median of $x - \bar x$ has a symmetric distribution about $0$. Assuming its expectation exists (which is certainly the case when the marginal distributions of the $x_i$ are Normal), that expectation has to be $0$ (because the symmetry implies it equals its own negative).

Now since subtracting the same value $\bar x$ from each of a set of values does not change their order, $Y$ (the median of the $x_i$) equals $\bar x$ plus the median of $x - \bar x$. Consequently its expectation conditional on $\bar x$ equals the expectation of the median of $x - \bar x$ conditional on $\bar x$, plus $E(\bar x \mid \bar x)$. The latter obviously is $\bar x$, whereas the former is $0$ because the unconditional expectation is $0$. Their sum is $\bar x$, QED.


Thank you for posting it as a full answer. I now understand the essence of your argument but I might ping you if something is still unclear.
JohnK

5
JohnK, I need to alert you to be cautious. A counterexample to this argument has been brought to my attention. I have encouraged its originator to post it here for further discussion, but briefly it concerns a discrete bivariate distribution with symmetric marginals but asymmetric conditional marginals. Its existence points to a flawed deduction early in my argument. I currently hope that the argument might be rescued by imposing stronger conditions on the $x_i$, but my attention is presently focused elsewhere and I might not get to think about this for a while.
whuber

4
In the meantime I would encourage you to unaccept this answer. I would ordinarily delete any answer of mine known to be incorrect, but (as you might be able to tell) I like solutions based on first principles rather than detailed calculations, so I hope this argument can be rescued. I therefore intend to leave it open for criticism and improvement (and therefore made it CW); let the votes fall as they may.
whuber

Of course, thanks for letting me know. We will discuss it further when you have time. In the meantime I will settle for the asymptotic argument proposed by @Alecos Papadopoulos.
JohnK

6

This is simpler than the above answers make it. The sample mean is a complete and sufficient statistic (when the variance is known, but our results do not depend on the variance, hence will be valid also in the situation when the variance is unknown). Then the Rao-Blackwell theorem together with the Lehmann-Scheffé theorem (see wikipedia ...) implies that the conditional expectation of the median, given the arithmetic mean, is the unique minimum variance unbiased estimator of the expectation $\mu$. But we know that this estimator is the arithmetic mean, hence the result follows.

We did also use that the median is an unbiased estimator, which follows from symmetry.
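The two ingredients used here, unbiasedness of the median and the variance reduction obtained by conditioning on the mean, can be illustrated by simulation. This is a sketch only: the function name and the parameter values (`n=9`, standard normal data) are assumptions made for illustration. If $E(Y \mid \bar X) = \bar X$, then $Y - \bar X$ is uncorrelated with $\bar X$, so the variance of the median should split into the variance of the mean plus the variance of $Y - \bar X$.

```python
import numpy as np

def compare_mean_and_median(n=9, mu=0.0, sigma=1.0, reps=200_000, seed=1):
    """Estimate the bias of the median and the variances of mean, median, and their difference."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=(reps, n))
    xbar = x.mean(axis=1)
    y = np.median(x, axis=1)
    return {
        "bias_median": y.mean() - mu,        # should be near 0 by symmetry
        "var_mean": xbar.var(),              # close to sigma^2 / n
        "var_median": y.var(),               # larger than var_mean
        "var_residual": (y - xbar).var(),    # variance of Y - Xbar
    }

stats = compare_mean_and_median()
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Up to simulation noise, `var_median` should be close to `var_mean + var_residual`, which is consistent with the mean being the Rao-Blackwellized version of the median.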


1
By symmetry $E[Y] = \mu$, indeed. Then from these two theorems we know that $E[Y \mid \bar X]$ is the unique minimum variance unbiased estimator for $\mu$, which we already know to be equal to $\bar X$. This is a brilliant answer, thank you very much. I would have marked it as the correct one, had I not done that already for another answer.
JohnK
Licensed under cc by-sa 3.0 with attribution required.