Difference of two i.i.d. lognormal random variables

Let $X_1$ and $X_2$ be two i.i.d. random variables with $\log(X_1),\log(X_2) \sim N(\mu,\sigma)$. I'd like to know the distribution of $X_1 - X_2$.

The best I can do is to take the Taylor series of both and get that the difference is the sum of the difference between two normal r.v.s and two chi-squared r.v.s, plus the difference between the remaining terms. Is there a more straightforward way to get the distribution of the difference between two i.i.d. log-normal r.v.s?

— frayedchef

Here is a relevant paper. You will find more papers by googling! papers.ssrn.com/sol3/papers.cfm?abstract_id=2064829

— kjetil b halvorsen

I've had a cursory glance at that paper and it doesn't seem to answer my question satisfactorily. It seems to be concerned with numerical approximations to the harder problem of finding the distribution of the sum/difference of correlated lognormal r.v.s. I was hoping there would be a simpler answer for the independent case.

— frayedchef

There may be a simpler answer in the independent case, but not a simple one! The lognormal case is famously hard: the moment generating function of the lognormal distribution does not exist, that is, it does not converge on any open interval containing zero. So you will not find an easy solution.

— kjetil b halvorsen

I see... So would the approach I outlined above be reasonable? That is, if $Y_i = \log(X_i)$, then $X_1 - X_2 \approx (Y_1 - Y_2) + (Y_1^2 - Y_2^2)/2 + ...$. Do we know anything about the higher-order terms, or how to bound them?

— frayedchef

To illustrate the difficulty: the lognormal mgf is defined only on $(-\infty,0]$. To approximate the distribution of the difference by saddlepoint methods we need $K(s)+K(-s)$ (K = cumulant gf), and that sum is defined at only one point, zero. So that does not seem to work. A sum or an average would be simpler!

— kjetil b halvorsen

Answers:

This is a difficult problem. I first thought about using (some approximation of) the moment generating function of the lognormal distribution. That doesn't work, as I will explain. But first some notation:

Let $\phi$ be the standard normal density and $\Phi$ the corresponding cumulative distribution function. We will analyze only the lognormal distribution $lnN(0,1)$, which has density function

$f(x)=\frac1{\sqrt{2\pi}x} e^{-\frac12 (\ln x)^2}$

and cumulative distribution function $F(x) =\Phi(\ln x)$. Suppose $X$ and $Y$ are independent random variables with the above lognormal distribution. We are interested in the distribution of $D=X-Y$, which is a symmetric distribution with mean zero.

Let $M(t) = \operatorname{E} e^{tX}$ be the moment generating function of $X$. It is defined only for $t\in (-\infty,0]$, so it is not defined on an open interval containing zero. The moment generating function of $D$ is

$M_D(t)=\operatorname{E} e^{t(X-Y)}= \operatorname{E} e^{tX} \operatorname{E} e^{-tY}= M(t)M(-t)$

which requires both $t \le 0$ and $-t \le 0$. So the moment generating function of $D$ is defined only for $t=0$, and is therefore not very useful.
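As a rough numerical illustration of why $M(t)$ exists only for $t \le 0$, the sketch below looks at the logarithm of the integrand $e^{tx} f(x)$ for large $x$. The names `log_integrand`, `pos`, `neg` and `M_neg1`, the grid of $x$ values and the choices $t = \pm 0.01$ and $t = -1$ are arbitrary and only serve the illustration.

```r
# Log of the integrand e^{t x} f(x) in M(t) = E e^{tX}, for X ~ lnN(0, 1).
log_integrand <- function(x, t) {
    t * x + dlnorm(x, meanlog = 0, sdlog = 1, log = TRUE)
}

x <- 10^(1:6)                        # illustrative grid of large x values

# t > 0: the linear term t * x eventually dominates -(ln x)^2 / 2,
# so the integrand blows up and the integral diverges.
pos <- log_integrand(x, t =  0.01)

# t < 0: the log-integrand keeps decreasing, so the integrand is integrable.
neg <- log_integrand(x, t = -0.01)

# For t <= 0 the mgf is finite and can be computed numerically, e.g. M(-1).
M_neg1 <- integrate(function(x) exp(-x) * dlnorm(x),
                    lower = 0, upper = Inf)$value

print(rbind(x = x, pos = pos, neg = neg))
print(M_neg1)
```

Since $M_D(t) = M(t)M(-t)$ always has one factor with a positive argument when $t \ne 0$, the same divergence carries over to $D$.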

That means we need a more direct approach to approximating the distribution of $D$. Suppose $t\ge 0$ and calculate

$$\begin{aligned} P(D \le t) &= P(X-Y\le t) \\ &= \int_0^\infty P(X-y\le t \mid Y=y) f(y) \; dy \\ &= \int_0^\infty P(X\le t+y) f(y) \; dy \\ &= \int_0^\infty F(t+y) f(y) \; dy \end{aligned}$$

(the case $t<0$ is solved by symmetry: we get $P(D\le t)=1-P(D\le |t|)$).

This expression can be used for numerical integration or as a basis for simulation. First a test:

> integrate(function(y) plnorm(y)*dlnorm(y), lower=0,  upper=+Inf)
0.5 with absolute error < 2.3e-06

which is clearly correct, since by symmetry $P(D\le 0)=1/2$. Let us wrap this up inside a function:

pDIFF  <-  function(t) {
    # CDF of D = X - Y for independent X, Y ~ lnN(0, 1); works on vectors t
    d  <-  t
    for (tt in seq_along(t)) {
        # P(D <= |t|) by numerical integration of F(|t| + y) f(y) over y > 0
        p  <-  integrate(function(y) plnorm(y+abs(t[tt]))*dlnorm(y),
                         lower=0.0,  upper=+Inf)$value
        # for t < 0 use the symmetry P(D <= t) = 1 - P(D <= |t|)
        d[tt]  <-  if (t[tt] >= 0.0) p else 1-p
    }
    return(d)
}

> plot(pDIFF,  from=-5,  to=5)

which gives the graph of the cumulative distribution function of $D$ on $[-5, 5]$ (figure omitted).
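The integral formula for $P(D \le t)$ can also be cross-checked by simulation. The following is a minimal sketch, not part of the original calculation: the seed, the sample size `n` and the evaluation point `t0` are arbitrary choices, and the names `d_sim`, `p_mc` and `p_int` are made up for the illustration.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
n <- 1e5                           # number of simulated pairs (arbitrary)
d_sim <- rlnorm(n) - rlnorm(n)     # draws of D = X - Y with X, Y ~ lnN(0, 1)

t0 <- 1                            # evaluation point chosen for illustration

# Monte Carlo estimate of P(D <= t0)
p_mc <- mean(d_sim <= t0)

# Same probability from the integral of F(t0 + y) f(y) over y > 0
p_int <- integrate(function(y) plnorm(y + t0) * dlnorm(y),
                   lower = 0, upper = Inf)$value

print(c(monte_carlo = p_mc, integral = p_int))

# Symmetry check for a negative argument: P(D <= -t0) = 1 - P(D <= t0)
print(c(monte_carlo = mean(d_sim <= -t0), symmetry = 1 - p_int))
```

The two estimates should agree up to Monte Carlo error, which is of order $1/\sqrt{n}$ here.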

Then we can find the density function by differentiating under the integral sign, obtaining

dDIFF  <-  function(t) {
       # density of D = X - Y; D is symmetric, so evaluate at |t|
       d  <-  t
       t  <-  abs(t)
       for (tt in seq_along(t)) {
           d[tt]  <-  integrate(function(y) dlnorm(y+t[tt])*dlnorm(y),
                                lower=0.0,  upper=+Inf)$value
       }
       return(d)
}

which we can test:

> integrate(dDIFF,  lower=-Inf,  upper=+Inf)
0.9999999 with absolute error < 1.3e-05
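Written out, differentiating $P(D\le t)=\int_0^\infty F(t+y) f(y)\,dy$ with respect to $t$ gives the density $f_D(t)=\int_0^\infty f(t+y) f(y)\,dy$ for $t\ge 0$, and $f_D(t)=f_D(|t|)$ by symmetry. One further sanity check, sketched here under arbitrary choices of the point `t0` and step `h`, is to compare this density with a central finite difference of the distribution function. The helper names `cdf_D` and `dens_D` are made up for this example and simply repeat the two integrals for a scalar $t \ge 0$.

```r
# CDF and density of D = X - Y at a single point t >= 0, X, Y ~ lnN(0, 1)
cdf_D <- function(t) {
    integrate(function(y) plnorm(y + t) * dlnorm(y),
              lower = 0, upper = Inf)$value
}
dens_D <- function(t) {
    integrate(function(y) dlnorm(y + t) * dlnorm(y),
              lower = 0, upper = Inf)$value
}

t0 <- 1       # evaluation point (arbitrary)
h  <- 0.01    # finite-difference step (arbitrary)

# Central difference approximation to the derivative of the CDF at t0
fd <- (cdf_D(t0 + h) - cdf_D(t0 - h)) / (2 * h)

print(c(finite_difference = fd, density = dens_D(t0)))
```

The two numbers should agree to within the combined finite-difference and quadrature error.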

And plotting the density we get:

plot(dDIFF,  from=-5,  to=5)

I also tried to find an analytic approximation, but so far without success; it is not an easy problem. But numerical integration as above, programmed in R, is very fast on modern hardware, so it is a good alternative that should probably be used much more.

— kjetil b halvorsen

This does not strictly answer your question, but wouldn't it be easier to look at the ratio of $X$ and $Y$? You then simply arrive at

$$\begin{aligned} \Pr\left(\frac{X}{Y} \leq t\right) &= \Pr\left(\log\left(\frac{X}{Y}\right) \leq \log(t) \right) \\ &= \Pr(\log(X) - \log(Y) \leq \log(t)), \end{aligned}$$

where $\log(X) - \log(Y) \sim \mathcal{N}(0, 2 \sigma^2)$.

Depending on your application, this may serve your needs.
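A quick way to see this in practice is a small simulation. The sketch below assumes $\sigma$ is the standard deviation of the logs and picks `sigma`, `n`, `t0` and the seed arbitrarily; none of these values come from the question. The mean $\mu$ cancels in the ratio, so it is set to zero.

```r
set.seed(1)      # arbitrary seed, for reproducibility only
sigma <- 1       # sd of log(X) and log(Y); assumed value for illustration
n <- 1e5         # number of simulated pairs (arbitrary)

x <- rlnorm(n, meanlog = 0, sdlog = sigma)
y <- rlnorm(n, meanlog = 0, sdlog = sigma)

t0 <- 2          # evaluation point (arbitrary, must be positive)

# Monte Carlo estimate of P(X / Y <= t0)
p_mc <- mean(x / y <= t0)

# Exact value: log(X) - log(Y) is normal with mean 0 and variance 2 sigma^2
p_exact <- pnorm(log(t0), mean = 0, sd = sqrt(2) * sigma)

print(c(monte_carlo = p_mc, exact = p_exact))
```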

— Vincent Traag

But aren't we looking at X-Y instead of log(X) - log(Y) ?

— Sextus Empiricus

Yes, of course. This is just in case somebody is interested in how two lognormal variables differ from each other without the comparison necessarily being a difference. That's why I also say it doesn't answer the question.

— Vincent Traag