Distribution that describes the difference between negative binomial distributed variables

18

The Skellam distribution describes the difference between two variables that have Poisson distributions. Is there a similar distribution that describes the difference between variables that follow negative binomial distributions?

My data is produced by a Poisson process but includes a fair amount of noise, so modeling the data with a negative binomial (NB) distribution works well. If I want to model the difference between two of these NB data sets, what are my options? If it helps, assume similar means and variances for the two sets.

— chrisamiller

There are a lot of easily described distributions that don't have standard names.

— Glen_b -Reinstate Monica

22

I don't know the name of this distribution, but you can derive it from the law of total probability. Suppose $X, Y$ are independent and each have a negative binomial distribution, with parameters $(r_{1}, p_{1})$ and $(r_{2}, p_{2})$, respectively. I'm using the parameterization where $X,Y$ represent the number of successes before the $r_{1}$'th and $r_{2}$'th failures, respectively. Then,

$P(X - Y = k) = E_{Y} \Big( P(X-Y = k \mid Y) \Big) = E_{Y} \Big( P(X = k+Y \mid Y) \Big) = \sum_{y=0}^{\infty} P(Y=y)P(X = k+y)$

We know that

$P(X = k + y) = {k+y+r_{1}-1 \choose k+y} (1-p_{1})^{r_{1}} p_{1}^{k+y}$

and

$P(Y = y) = {y+r_{2}-1 \choose y} (1-p_{2})^{r_{2}} p_{2}^{y}$

so

$P(X-Y=k) = \sum_{y=0}^{\infty} {y+r_{2}-1 \choose y} (1-p_{2})^{r_{2}} p_{2}^{y} \cdot {k+y+r_{1}-1 \choose k+y} (1-p_{1})^{r_{1}} p_{1}^{k+y}$

That's not pretty (yikes!). The only simplification I see right off is

$p_{1}^{k} (1-p_{1})^{r_{1}} (1-p_{2})^{r_{2}} \sum_{y=0}^{\infty} (p_{1}p_{2})^{y} {y+r_{2}-1 \choose y} {k+y+r_{1}-1 \choose k+y}$

which is still pretty ugly. I'm not sure if this is helpful but this can also be re-written as

$\frac{ p_{1}^{k} (1-p_{1})^{r_{1}} (1-p_{2})^{r_{2}} }{ (r_{1}-1)! (r_{2}-1)! } \sum_{y=0}^{\infty} (p_{1}p_{2})^{y} \frac{ (y+r_{2}-1)! (k+y+r_{1}-1)! }{y! (k+y)! }$

I'm not sure if there is a simplified expression for this sum, but it can be approximated numerically if you only need it to calculate $p$-values.
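
For what it's worth, the sum looks like a Gauss hypergeometric series: the ratio of consecutive terms is $p_{1}p_{2} \frac{(y+r_{2})(y+k+r_{1})}{(y+1)(y+k+1)}$ and the $y=0$ term is ${k+r_{1}-1 \choose k}$. So for $k \ge 0$ it can be written compactly as follows (a sketch, assuming the usual series definition of ${}_{2}F_{1}$, which converges here because $0 < p_{1}p_{2} < 1$):

$P(X-Y=k) = p_{1}^{k} (1-p_{1})^{r_{1}} (1-p_{2})^{r_{2}} {k+r_{1}-1 \choose k} \, {}_{2}F_{1}(r_{2},\, k+r_{1};\, k+1;\, p_{1}p_{2})$

For $k < 0$ the same expression applies with the roles of $X$ and $Y$ swapped, since $P(X-Y=k) = P(Y-X=-k)$. This only renames the sum; evaluating ${}_{2}F_{1}$ still needs a numerical routine.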

I verified with simulation that the above calculation is correct. Here is a crude R function to calculate this mass function, followed by a few simulations:

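  # Mass function of X - Y at k, with the infinite sum truncated at y = UB.
  # p1, p2 are success probabilities in the "successes before the r-th failure"
  # parameterization used in the formulas above.
  f = function(k,r1,r2,p1,p2,UB)  
  {

  S=0
  # factor that does not depend on y
  const = (p1^k) * ((1-p1)^r1) * ((1-p2)^r2)
  const = const/( factorial(r1-1) * factorial(r2-1) ) 

  # terms with k+y < 0 have probability zero, so for negative k start at y = -k
  # (assumes UB >= max(0,-k))
  for(y in max(0,-k):UB)
  {
     iy = ((p1*p2)^y) * factorial(y+r2-1)*factorial(k+y+r1-1)
     iy = iy/( factorial(y)*factorial(y+k) )
     S = S + iy
  }

  return(S*const)
  }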

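 ### Sims
 r1 = 6; r2 = 4; 
 p1 = .7; p2 = .53; 
 # rnbinom(n, size, prob) counts failures before the size-th success,
 # hence the 1-p1, 1-p2 passed to f() below
 X = rnbinom(1e5,r1,p1)
 Y = rnbinom(1e5,r2,p2)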
 mean( (X-Y) == 2 ) 
 [1] 0.08508
 f(2,r1,r2,1-p1,1-p2,20)
 [1] 0.08509068
 mean( (X-Y) == 1 ) 
 [1] 0.11581
 f(1,r1,r2,1-p1,1-p2,20)
 [1] 0.1162279
 mean( (X-Y) == 0 ) 
 [1] 0.13888
 f(0,r1,r2,1-p1,1-p2,20)
 [1] 0.1363209

I've found the sum converges very quickly for all of the values I tried, so setting UB higher than 10 or so is not necessary. Note that R's built-in rnbinom function parameterizes the negative binomial in terms of the number of failures before the $r$'th success, so for compatibility you need to replace all of the $p_{1}, p_{2}$'s in the above formulas with $1-p_{1}, 1-p_{2}$, as the calls to f above do.
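
If you would rather stay in R's own parameterization and avoid the factorials (which overflow for large arguments), the same sum can be built directly from dnbinom. This is only a sketch: the name dnbdiff and the default truncation UB = 200 are my own choices, not part of the function above, and it assumes $X$ and $Y$ are independent.

```r
# Hypothetical helper: P(X - Y = k) for independent
# X ~ NB(size = r1, prob = p1) and Y ~ NB(size = r2, prob = p2),
# both in R's parameterization (failures before the size-th success).
dnbdiff <- function(k, r1, p1, r2, p2, UB = 200) {
  y <- 0:UB  # truncation of the infinite sum over y
  vapply(k, function(kk) {
    # law of total probability: sum over y of P(Y = y) * P(X = kk + y);
    # dnbinom() returns 0 for negative counts, so negative k needs no special case
    sum(dnbinom(y, size = r2, prob = p2) * dnbinom(kk + y, size = r1, prob = p1))
  }, numeric(1))
}

# Same setting as the simulation above; compare with f(2, r1, r2, 1-p1, 1-p2, 20)
dnbdiff(2, r1 = 6, p1 = 0.7, r2 = 4, p2 = 0.53)
```

Because it is vectorized over k, something like sum(dnbdiff(-100:100, 6, 0.7, 4, 0.53)) gives a quick check that the mass adds up to (nearly) one.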

— Macro

Thanks. I'll need some time to digest this, but your help is much appreciated.

— chrisamiller

-2

Yes. The skewed generalized discrete Laplace distribution is the distribution of the difference of two negative binomial random variables. For details, refer to the article "Skewed generalized discrete Laplace distribution" by Seetha Lekshmi V. and Simi Sebastian, which is available online.

— simi sebastian

4

Can you provide a complete citation & a summary of the information in the paper so future readers can decide if it's something they want to pursue?

— gung - Reinstate Monica

The article mentioned by @simi-sebastian (the author?) is ijmsi.org/Papers/Volume.2.Issue.3/K0230950102.pdf. However, unless I'm mistaken, it only deals with the case where the negative binomial variables $X$ and $Y$ both have the same dispersion parameter, rather than the more general case described by the original poster.

— Constantinos