Construction of Dirichlet distribution with Gamma distribution


18

Let $X_1,\dots,X_{k+1}$ be mutually independent random variables, each having a gamma distribution with parameter $\alpha_i$, $i=1,2,\dots,k+1$. Show that $Y_i=\frac{X_i}{X_1+\cdots+X_{k+1}}$, $i=1,\dots,k$, have a joint distribution that is $\text{Dirichlet}(\alpha_1,\alpha_2,\dots,\alpha_k;\alpha_{k+1})$.

The joint PDF of $(X_1,\dots,X_{k+1})$ is $\frac{e^{-\sum_{i=1}^{k+1}x_i}\,x_1^{\alpha_1-1}\cdots x_{k+1}^{\alpha_{k+1}-1}}{\Gamma(\alpha_1)\Gamma(\alpha_2)\cdots\Gamma(\alpha_{k+1})}$. To find the joint PDF of $(Y_1,\dots,Y_{k+1})$ from it, I need the Jacobian $J\left(\frac{x_1,\dots,x_{k+1}}{y_1,\dots,y_{k+1}}\right)$, but I cannot work it out.


3
See pages 13-14 of this document.

@Procrastinator Thank you very much; your document is the best answer to my question.
Argha

2
@Procrastinator - perhaps you should post this as an answer, since the OP is happy with it, and add a couple of sentences so that you don't trip the "we want more than one-sentence answers" check?
jbowman

4
That document now is a non-answer because it's a 404.
whuber

2
Wayback machine to the rescue: pdf
mobeets

Answers:


30

Jacobians--the absolute determinants of the change of variable function--appear formidable and can be complicated. Nevertheless, they are an essential and unavoidable part of the calculation of a multivariate change of variable. It would seem there's nothing for it but to write down a $k+1$ by $k+1$ matrix of derivatives and do the calculation.

There's a better way. It's shown at the end in the "Solution" section. Because the purpose of this post is to introduce statisticians to what may be a new method for many, much of it is devoted to explaining the machinery behind the solution. This is the algebra of differential forms. (Differential forms are the things that one integrates in multiple dimensions.) A detailed, worked example is included to help make this become more familiar.


Background

Over a century ago, mathematicians developed the theory of differential algebra to work with the "higher order derivatives" that occur in multi-dimensional geometry. The determinant is a special case of the basic objects manipulated by such algebras, which typically are alternating multilinear forms. The beauty of this lies in how simple the calculations can become.

Here's all you need to know.

  1. A differential is an expression of the form "$dx_i$". It is the concatenation of "$d$" with any variable name.

  2. A one-form is a linear combination of differentials, such as $dx_1+dx_2$ or even $x_2\,dx_1-\exp(x_2)\,dx_2$. That is, the coefficients are functions of the variables.

  3. Forms can be "multiplied" using a wedge product, written $\wedge$. This product is anti-commutative (also called alternating): for any two one-forms $\omega$ and $\eta$,

    $$\omega\wedge\eta = -\eta\wedge\omega.$$

    This multiplication is linear and associative: in other words, it works in the familiar fashion. An immediate consequence is that $\omega\wedge\omega = -\omega\wedge\omega$, implying the square of any one-form is always zero. That makes multiplication extremely easy!

  4. For the purposes of manipulating the integrands that appear in probability calculations, an expression like $dx_1\,dx_2\cdots dx_{k+1}$ can be understood as $|dx_1\wedge dx_2\wedge\cdots\wedge dx_{k+1}|$.

  5. When $y=g(x_1,\ldots,x_n)$ is a function, then its differential is given by differentiation:

    $$dy = dg(x_1,\ldots,x_n) = \frac{\partial g}{\partial x_1}(x_1,\ldots,x_n)\,dx_1+\cdots+\frac{\partial g}{\partial x_n}(x_1,\ldots,x_n)\,dx_n.$$

The connection with Jacobians is this: the Jacobian of a transformation $(y_1,\ldots,y_n)=F(x_1,\ldots,x_n)=(f_1(x_1,\ldots,x_n),\ldots,f_n(x_1,\ldots,x_n))$ is, up to sign, simply the coefficient of $dx_1\wedge\cdots\wedge dx_n$ that appears in computing

$$dy_1\wedge\cdots\wedge dy_n = df_1(x_1,\ldots,x_n)\wedge\cdots\wedge df_n(x_1,\ldots,x_n)$$

after expanding each of the $df_i$ as a linear combination of the $dx_j$ in rule (5).
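
To see why that coefficient is a determinant, it may help to write out the two-variable case. The names $a, b, c, d$ are introduced here only for illustration: they stand for the partial derivatives $a=\partial f_1/\partial x_1$, $b=\partial f_1/\partial x_2$, $c=\partial f_2/\partial x_1$, $d=\partial f_2/\partial x_2$. Using only linearity and anti-commutativity,

$$\begin{aligned}
dy_1\wedge dy_2 &= (a\,dx_1+b\,dx_2)\wedge(c\,dx_1+d\,dx_2)\\
&= ac\,dx_1\wedge dx_1 + ad\,dx_1\wedge dx_2 + bc\,dx_2\wedge dx_1 + bd\,dx_2\wedge dx_2\\
&= (ad-bc)\,dx_1\wedge dx_2,
\end{aligned}$$

which is the familiar $2\times 2$ determinant of the matrix of partial derivatives.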


Example

The simplicity of this definition of a Jacobian is appealing. Not yet convinced it's worthwhile? Consider the well-known problem of converting two-dimensional integrals from Cartesian coordinates $(x,y)$ to polar coordinates $(r,\theta)$, where $(x,y)=(r\cos(\theta),r\sin(\theta))$. The following is an utterly mechanical application of the preceding rules, where "$(*)$" is used to abbreviate expressions that will obviously disappear by virtue of rule (3), which implies $dr\wedge dr=d\theta\wedge d\theta=0$.

$$\begin{aligned}
dx\,dy &= |dx\wedge dy| = |d(r\cos(\theta))\wedge d(r\sin(\theta))|\\
&= |(\cos(\theta)\,dr - r\sin(\theta)\,d\theta)\wedge(\sin(\theta)\,dr + r\cos(\theta)\,d\theta)|\\
&= |(*)\,dr\wedge dr + (*)\,d\theta\wedge d\theta - r\sin(\theta)\,d\theta\wedge\sin(\theta)\,dr + \cos(\theta)\,dr\wedge r\cos(\theta)\,d\theta|\\
&= |0 + 0 + r\sin^2(\theta)\,dr\wedge d\theta + r\cos^2(\theta)\,dr\wedge d\theta|\\
&= |r(\sin^2(\theta)+\cos^2(\theta))\,dr\wedge d\theta|\\
&= r\,dr\,d\theta.
\end{aligned}$$

The point of this is the ease with which such calculations can be performed, without messing about with matrices, determinants, or other such multi-indicial objects. You just multiply things out, remembering that wedges are anti-commutative. It's easier than what is taught in high school algebra.
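
For readers who like to check such things by machine, here is a minimal sketch of the polar-coordinate calculation using SymPy. The helper names `differential` and `wedge_2d` are made up for this illustration (they are not SymPy functions), and the sketch handles only one-forms in two variables.

```python
import sympy as sp

r, theta = sp.symbols("r theta", positive=True)

def differential(expr, variables):
    """Coefficients of the one-form d(expr) with respect to the given variables (rule 5)."""
    return [sp.diff(expr, v) for v in variables]

def wedge_2d(omega, eta):
    """Coefficient of (dv1 ^ dv2) in omega ^ eta for one-forms in two variables.

    With omega = a dv1 + b dv2 and eta = c dv1 + d dv2, the dv1^dv1 and
    dv2^dv2 terms vanish and dv2^dv1 = -dv1^dv2, leaving (a*d - b*c).
    """
    a, b = omega
    c, d = eta
    return sp.simplify(a * d - b * c)

# x = r cos(theta), y = r sin(theta)
dx = differential(r * sp.cos(theta), (r, theta))
dy = differential(r * sp.sin(theta), (r, theta))

coefficient = wedge_2d(dx, dy)  # coefficient of dr ^ dtheta in dx ^ dy
print(coefficient)
```

The printed coefficient should simplify to `r`, matching the hand calculation.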


Preliminaries

Let's see this differential algebra in action. In this problem, the PDF of the joint distribution of $(X_1,X_2,\ldots,X_{k+1})$ is the product of the individual PDFs (because the $X_i$ are assumed to be independent). In order to handle the change to the variables $Y_i$ we must be explicit about the differential elements that will be integrated. These form the term $dx_1\,dx_2\cdots dx_{k+1}$. Including the PDF gives the probability element

$$\begin{aligned}
f_\mathbf{X}(\mathbf{x},\mathbf{\alpha})\,dx_1\cdots dx_{k+1} &\propto \left(x_1^{\alpha_1-1}\exp(-x_1)\right)\cdots\left(x_{k+1}^{\alpha_{k+1}-1}\exp(-x_{k+1})\right)dx_1\cdots dx_{k+1}\\
&= x_1^{\alpha_1-1}\cdots x_{k+1}^{\alpha_{k+1}-1}\exp\left(-(x_1+\cdots+x_{k+1})\right)dx_1\cdots dx_{k+1}.
\end{aligned}$$

(The normalizing constant has been ignored; it will be recovered at the end.)

Staring at the definitions of the $Y_i$ a few seconds ought to reveal the utility of introducing the new variable

$$Z=X_1+X_2+\cdots+X_{k+1},$$

giving the relationships

$$X_i=Y_iZ.$$

This suggests making the change of variables $x_i\to y_iz$ in the probability element. The intention is to retain the first $k$ variables $y_1,\ldots,y_k$ along with $z$ and then integrate out $z$. To do so, we have to re-express all the $dx_i$ in terms of the new variables. This is the heart of the problem. It's where the differential algebra takes place. To begin with,

$$dx_i=d(y_iz)=y_i\,dz+z\,dy_i.$$

Note that since $Y_1+Y_2+\cdots+Y_{k+1}=1$, then

$$0=d(1)=d(y_1+y_2+\cdots+y_{k+1})=dy_1+dy_2+\cdots+dy_{k+1}.$$

Consider the one-form

$$\omega=dx_1+\cdots+dx_k=z(dy_1+\cdots+dy_k)+(y_1+\cdots+y_k)\,dz.$$

It appears in the differential of the last variable:

$$\begin{aligned}
dx_{k+1}&=z\,dy_{k+1}+y_{k+1}\,dz\\
&=-z(dy_1+\cdots+dy_k)+(1-y_1-\cdots-y_k)\,dz\\
&=dz-\omega.
\end{aligned}$$

The value of this lies in the observation that

$$dx_1\wedge\cdots\wedge dx_k\wedge\omega=0$$

because, when you expand this product, there is one term containing $dx_1\wedge dx_1=0$ as a factor, another containing $dx_2\wedge dx_2=0$, and so on: they all disappear. Consequently,

$$dx_1\wedge\cdots\wedge dx_k\wedge dx_{k+1}=dx_1\wedge\cdots\wedge dx_k\wedge dz-dx_1\wedge\cdots\wedge dx_k\wedge\omega=dx_1\wedge\cdots\wedge dx_k\wedge dz.$$

Whence (because all products $dz\wedge dz$ disappear),

$$dx_1\wedge\cdots\wedge dx_{k+1}=(z\,dy_1+y_1\,dz)\wedge\cdots\wedge(z\,dy_k+y_k\,dz)\wedge dz=z^k\,dy_1\wedge\cdots\wedge dy_k\wedge dz.$$

The Jacobian is simply $|z^k|=z^k$, the coefficient of the differential product on the right hand side.
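
As a cross-check against the traditional matrix approach, the following sketch computes the determinant symbolically with SymPy for one small case. The choice $k=3$ and the variable names are assumptions made only for this illustration; it verifies a single value of $k$, not the general result.

```python
import sympy as sp

k = 3  # small illustrative dimension (an arbitrary choice for this check)
z = sp.symbols("z", positive=True)
ys = sp.symbols(f"y1:{k + 1}", positive=True)  # y1, ..., yk

# Inverse transformation: x_i = y_i z for i <= k, x_{k+1} = z (1 - y_1 - ... - y_k)
xs = [y * z for y in ys] + [z * (1 - sum(ys))]

# Matrix of partial derivatives of (x_1, ..., x_{k+1}) w.r.t. (y_1, ..., y_k, z)
new_vars = list(ys) + [z]
jacobian_matrix = sp.Matrix(xs).jacobian(new_vars)

jacobian = sp.factor(jacobian_matrix.det())
print(jacobian)
```

For this case the determinant should come out as `z**3`, in agreement with the coefficient $z^k$ obtained from the wedge product.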


Solution

The transformation $(x_1,\ldots,x_k,x_{k+1})\to(y_1,\ldots,y_k,z)$ is one-to-one: its inverse is given by $x_i=y_iz$ for $1\le i\le k$ and $x_{k+1}=z(1-y_1-\cdots-y_k)$. Therefore we don't have to fuss any more about the new probability element; it simply is

$$\begin{aligned}
&(zy_1)^{\alpha_1-1}\cdots(zy_k)^{\alpha_k-1}\left(z(1-y_1-\cdots-y_k)\right)^{\alpha_{k+1}-1}\exp(-z)\,|z^k\,dy_1\wedge\cdots\wedge dy_k\wedge dz|\\
&\quad=\left(z^{\alpha_1+\cdots+\alpha_{k+1}-1}\exp(-z)\,dz\right)\left(y_1^{\alpha_1-1}\cdots y_k^{\alpha_k-1}(1-y_1-\cdots-y_k)^{\alpha_{k+1}-1}\,dy_1\cdots dy_k\right).
\end{aligned}$$

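That is manifestly a product of a Gamma$(\alpha_1+\cdots+\alpha_{k+1})$ distribution (for $Z$) and a Dirichlet$(\mathbf{\alpha})$ distribution (for $(Y_1,\ldots,Y_k)$). In fact, since the original normalizing constant was the reciprocal of the product of the $\Gamma(\alpha_i)$, and the Gamma density for $Z$ takes up a factor $1/\Gamma(\alpha_1+\cdots+\alpha_{k+1})$, we deduce immediately that the normalizing constant left for $\mathbf{Y}$ carries a compensating factor $\Gamma(\alpha_1+\cdots+\alpha_{k+1})$, enabling the PDF to be written

$$f_\mathbf{Y}(\mathbf{y},\mathbf{\alpha})=\frac{\Gamma(\alpha_1+\cdots+\alpha_{k+1})}{\Gamma(\alpha_1)\cdots\Gamma(\alpha_{k+1})}\left(y_1^{\alpha_1-1}\cdots y_k^{\alpha_k-1}(1-y_1-\cdots-y_k)^{\alpha_{k+1}-1}\right).$$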
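
A quick simulation can serve as a sanity check on this result (it is not a proof). The sketch below, using NumPy, normalizes independent Gamma draws and compares the sample means with the Dirichlet means $\alpha_i/(\alpha_1+\cdots+\alpha_{k+1})$. The shape parameters, seed, and sample size are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = np.array([2.0, 3.0, 5.0])  # illustrative shape parameters (k + 1 = 3)
n = 200_000

# Independent Gamma(alpha_i, scale=1) draws, one column per X_i.
x = rng.gamma(shape=alpha, scale=1.0, size=(n, alpha.size))
z = x.sum(axis=1)      # Z = X_1 + ... + X_{k+1}
y = x / z[:, None]     # Y_i = X_i / Z

# Dirichlet means are alpha_i / sum(alpha).
expected_mean = alpha / alpha.sum()
sample_mean = y.mean(axis=0)

# The factorization implies Z is independent of Y, so their sample
# correlation should be close to zero.
corr_z_y1 = np.corrcoef(z, y[:, 0])[0, 1]

print("sample mean  :", sample_mean)
print("expected mean:", expected_mean)
print("corr(Z, Y_1) :", corr_z_y1)
```

The near-zero correlation between $Z$ and $Y_1$ is consistent with the probability element factoring into a part in $z$ alone and a part in the $y_i$ alone, although zero correlation by itself does not establish independence.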
Licensed under cc by-sa 3.0 with attribution required.