Binomial confidence interval estimation

30

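I have used the following R code to estimate the confidence intervals of a binomial proportion, because I understand that this substitutes for a "power calculation" when designing receiver operating characteristic (ROC) studies that look at the detection of diseases in a population.

n is 150, and we believe the disease is 25% prevalent in the population. I have calculated the values for 75% sensitivity and 90% specificity (because that is what people seem to do).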

    binom.test(c(29,9), p=0.75, alternative=c("t"), conf.level=0.95)

    binom.test(c(100, 12), p=0.90, alternative=c("t"), conf.level=0.95)

I have also visited this site:

http://statpages.org/confint.html

which is a Java page that calculates binomial confidence intervals, and it gives the same answer.

Anyway, after that lengthy set-up, I want to ask why the confidence intervals are not symmetric. For example, for sensitivity the output is

   95 percent confidence interval:
   0.5975876 0.8855583 

   sample estimate probability: 0.7631579

Sorry if this is a stupid question, but everywhere I look seems to suggest that they will be symmetric, and a colleague of mine seems to think they will be too.

confidence-interval binomial

— Chris Beeley

20

They are believed to be symmetric because quite often a normal approximation is used. That approximation works well enough when p lies around 0.5. binom.test, on the other hand, reports "exact" Clopper-Pearson intervals, which are based on the F distribution (see here for the exact formulas of both approaches). If we were to implement the Clopper-Pearson interval in R, it would look something like this (see note):

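Clopper.Pearson <- function(x, n, conf.level){
    # alpha is the probability in EACH tail, e.g. 0.025 for a 95% interval
    alpha <- (1 - conf.level) / 2

    # lower limit: 0 when no successes are observed
    ll <- if (x == 0){
          0
    } else {
          QF.l <- qf(1 - alpha, 2*n - 2*x + 2, 2*x)
          x / ( x + (n-x+1)*QF.l )
    }

    # upper limit: 1 when every trial is a success
    uu <- if (x == n){
          1
    } else {
          QF.u <- qf(1 - alpha, 2*x + 2, 2*n - 2*x)
          (x+1)*QF.u / ( n - x + (x+1)*QF.u )
    }

    return(c(ll, uu))
}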

You can see, both in the link and in the implementation, that the formulas for the upper and the lower limit are completely different. The only case in which the confidence interval is symmetric is when the estimated proportion is 0.5. Using the formulas from the link, and taking into account that in this case $n = 2\times x$, it is easy to work out for yourself why that happens.
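As a quick check of the symmetry claim, the sketch below uses `binom.test` (base R) on the 29-out-of-38 sensitivity data from the question and on a hypothetical 19-out-of-38 sample, chosen here only because it gives $n = 2\times x$. The helper name `ci_arms` is an illustrative assumption, not something from the original answer.

```r
# Distance from the point estimate to each Clopper-Pearson limit.
# ci_arms is a hypothetical helper name used only for this illustration.
ci_arms <- function(x, n, conf.level = 0.95) {
    ci  <- binom.test(x, n, conf.level = conf.level)$conf.int
    est <- x / n
    c(below = est - ci[1], above = ci[2] - est)
}

arms_question <- ci_arms(29, 38)  # data from the question
arms_half     <- ci_arms(19, 38)  # hypothetical sample with n = 2 * x

print(arms_question)  # the two distances differ
print(arms_half)      # the two distances agree up to rounding error
```

The interval is only symmetric in the second case, where the estimate is exactly 0.5.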

Personally, I understood it better by looking at confidence intervals based on a logistic approach. Binomial data are generally modelled using a logit link function, defined as:

${\rm logit}(x) = \log\! \bigg( \frac{x}{1-x} \bigg)$

This link function "maps" the error term in a logistic regression to a normal distribution. As a consequence, confidence intervals in the logistic framework are symmetric around the logit values, much like in the classic linear regression framework. The logit transformation is used precisely so that the whole normality-based theory around linear regression can be applied.

After doing the inverse transformation:

${\rm logit}^{-1}(x) = \frac{e^x}{1+e^{x}}$

you get an asymmetric interval again. Now, these confidence intervals are actually biased: their coverage is not what you would expect, especially near the boundaries of the binomial distribution. Still, as an illustration, they show why it is logical for a binomial distribution to have asymmetric confidence intervals.

An example in R:

logit <- function(x){ log(x/(1-x)) }
inv.logit <- function(x){ exp(x)/(1+exp(x)) }
x <- c(0.2, 0.5, 0.8)
lx <- logit(x)
upper <- lx + 2
lower <- lx - 2

logxtab <- cbind(lx, upper, lower)
logxtab # the confidence intervals are symmetric by construction
xtab <- inv.logit(logxtab)
xtab # back transformation gives asymmetric confidence intervals
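To connect this toy example to the question's data, here is a minimal sketch of a Wald-type interval built on the logit scale for 29 successes out of 38. It assumes the usual large-sample standard error of an empirical logit, $\sqrt{1/x + 1/(n-x)}$, which the answer itself does not state; `qlogis` and `plogis` are base R's logit and inverse logit.

```r
x <- 29; n <- 38                        # sensitivity data from the question
z <- qnorm(0.975)                       # two-sided 95% normal quantile

logit_hat <- qlogis(x / n)              # log(p / (1 - p)) at the estimate
se_logit  <- sqrt(1 / x + 1 / (n - x))  # assumed large-sample standard error

ci_logit <- logit_hat + c(-1, 1) * z * se_logit  # symmetric on the logit scale
ci_prob  <- plogis(ci_logit)                     # asymmetric after back-transforming

print(ci_logit - logit_hat)  # equal distances, opposite signs
print(ci_prob - x / n)       # unequal distances on the probability scale
```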

Note: in fact, R uses the beta distribution, but this is completely equivalent and computationally a bit more efficient. The implementation in R is therefore different from what I show here, but it gives exactly the same result.
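The equivalence mentioned in the note can be checked numerically. The sketch below writes the Clopper-Pearson limits as beta quantiles with `qbeta` and compares them with the output of `binom.test`. The function name `cp_beta` is an illustrative assumption, and the code is a sketch rather than R's own source.

```r
# Clopper-Pearson limits written as beta quantiles.
# cp_beta is a hypothetical name used only for this illustration.
cp_beta <- function(x, n, conf.level = 0.95) {
    alpha <- (1 - conf.level) / 2                 # probability in each tail
    lower <- if (x == 0) 0 else qbeta(alpha, x, n - x + 1)
    upper <- if (x == n) 1 else qbeta(1 - alpha, x + 1, n - x)
    c(lower, upper)
}

ci_beta  <- cp_beta(29, 38)                          # data from the question
ci_exact <- as.numeric(binom.test(29, 38)$conf.int)  # base R reference

print(ci_beta)
print(ci_exact)
```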

— Joris Meys

2

Did you really mean to say that the logit "transforms the binomial distribution into a normal distribution"?

— whuber

@whuber: Nice catch on the formula, and nice catch on the formulation. Not quite. It makes sure that the errors in a logistic regression follow the normal distribution. Thanks for the correction.

— Joris Meys

Just a brief technical note: the "arcsine" transformation is one that converges to normality faster than the logistic transformation. Set $Y=\frac{2}{\pi}\arcsin{\sqrt{\frac{X}{N}}}$ (where $X$ is the number of "successes" and $N$ the number of trials), and you can show with the so-called "delta method" that the variance of $Y$ is approximately constant (and independent of $Y$, as it should be for a normal distribution).
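The variance-stabilising claim can be checked without simulation by summing over the exact binomial probabilities. In the sketch below, $N = 150$ is borrowed from the question, the three values of $p$ are arbitrary illustrative choices, and the target value $1/(\pi^2 N)$ is what the delta method gives for $Y$ as defined here (an assumption worth verifying for your own case).

```r
N <- 150                              # sample size taken from the question
k <- 0:N                              # every possible number of successes
y <- (2 / pi) * asin(sqrt(k / N))     # transformed value for each outcome

# Exact variance of a function of X ~ Binomial(N, p), by direct summation.
exact_var <- function(values, p) {
    w <- dbinom(k, N, p)
    m <- sum(w * values)
    sum(w * (values - m)^2)
}

p_grid  <- c(0.2, 0.5, 0.8)           # arbitrary illustrative values of p
var_y   <- sapply(p_grid, function(p) exact_var(y, p))
var_raw <- sapply(p_grid, function(p) exact_var(k / N, p))

print(var_y * pi^2 * N)   # close to 1 for every p if the delta-method value holds
print(var_raw * N)        # equals p * (1 - p), so it changes with p
```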

— probabilityislogic

The link you provided for the "exact probabilities" does not work. Do you have another one?

— S. Kolassa - Reinstate Monica

@StephanKolassa You can find the Clopper-Pearson formulas here as well: en.wikipedia.org/wiki/…

— Joris Meys

24

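To see why the interval should not be symmetric, consider a case where $p=0.9$ and the sample gives $\hat{p}=0.9$. The upper limit for $p$ cannot be greater than 1, so most of the uncertainty must fall below $\hat{p}$.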

— Rob Hyndman

9

@Joris mentioned the symmetric or "asymptotic" interval, which is most likely the one you were expecting. @Joris also mentioned the "exact" Clopper-Pearson interval and gave a reference that looks very good. There is another confidence interval for proportions that you are likely to come across (note that it is not symmetric either): the "Wilson" interval, an asymptotic interval based on inverting the score test. The endpoints of the interval solve (in $p$) the equation

$(\hat{p} - p)/\sqrt{p(1-p)/n}=\pm z_{\alpha/2}$

Anyway, you can get all three in R with the following:

library(Hmisc)
binconf(29, 38, method = "asymptotic")
binconf(29, 38, method = "exact")
binconf(29, 38, method = "wilson")

Note that the "wilson" method gives the same confidence interval as prop.test without Yates' continuity correction:

prop.test(29, 38, correct = FALSE)
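For readers who want to see where the Wilson limits come from, the sketch below solves the score equation by hand. Squaring $(\hat{p} - p)/\sqrt{p(1-p)/n}=\pm z_{\alpha/2}$ gives a quadratic in $p$ whose two roots are the limits. The function name `wilson_ci` is an illustrative assumption; the comparison uses `prop.test` from base R.

```r
# Wilson score interval as the two roots of the squared score equation.
# wilson_ci is a hypothetical name used only for this illustration.
wilson_ci <- function(x, n, conf.level = 0.95) {
    z      <- qnorm(1 - (1 - conf.level) / 2)
    p_hat  <- x / n
    centre <- (p_hat + z^2 / (2 * n)) / (1 + z^2 / n)
    half   <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
    c(centre - half, centre + half)
}

ci_wilson <- wilson_ci(29, 38)
ci_prop   <- as.numeric(prop.test(29, 38, correct = FALSE)$conf.int)

print(ci_wilson)
print(ci_prop)
```

Note that the interval is centred on a value pulled from $\hat{p}$ towards 0.5, which is why it is not symmetric around $\hat{p}$.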

See here for Laura Thompson's free S-PLUS + R manual, which accompanies Agresti's Categorical Data Analysis and discusses these issues in great detail.

1

(+1) Nice that you cite Laura's textbook and add this complement of information about Wilson's CIs.

— chl

2

Thanks. I would like to point out that the Wilson interval is discussed in the article that @Joris referenced.

9

There are symmetric confidence intervals for the Binomial distribution: asymmetry is not forced on us, despite all the reasons already mentioned. The symmetric intervals are usually considered inferior in that

1. Although they are symmetric numerically, they are not symmetric in probability: that is, their one-tailed coverages differ from each other. This is a necessary consequence of the possible asymmetry of the Binomial distribution, and it is the crux of the matter.

2. Often one endpoint has to be unrealistic (less than 0 or greater than 1), as @Rob Hyndman points out.

Having said that, I suspect that numerically symmetric CIs might have some good properties, such as tending to be shorter than the probabilistically symmetric ones in some circumstances.
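The point about one-tailed coverage can be made concrete with an exact calculation. The sketch below takes the numerically symmetric normal-approximation (Wald) interval and, for an assumed true $p = 0.75$ with $n = 38$ (values borrowed from the question purely for illustration), sums the binomial probabilities of the samples whose interval lies entirely above or entirely below the true value.

```r
n <- 38; p_true <- 0.75          # illustrative values borrowed from the question
z <- qnorm(0.975)                # two-sided 95% normal quantile

x     <- 0:n                     # every possible number of successes
p_hat <- x / n
half  <- z * sqrt(p_hat * (1 - p_hat) / n)   # same width on both sides of p_hat
lower <- p_hat - half
upper <- p_hat + half

prob <- dbinom(x, n, p_true)     # exact probability of each sample

miss_above <- sum(prob[lower > p_true])  # interval lies entirely above the true p
miss_below <- sum(prob[upper < p_true])  # interval lies entirely below the true p

print(c(miss_above = miss_above, miss_below = miss_below))
print(max(upper))                # largest upper limit over all possible samples
```

The two miss probabilities are not equal, and some of the upper limits exceed 1, which illustrates both items in the list.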

— whuber

With regard to the last sentence: then why not calculate the shortest confidence interval (which has equal density values instead of equal interval width or equal tail area to both sides)? With regard to 2.: having the same width to both sides of $\hat p = k/n$ does not imply that a (the normal) approximation must be used. I'd say that this particular interval does not exist if the limits would need to be extended outside [0, 1].

— cbeleites supports Monica

@cb I don't follow this. First, a shortest CI will not necessarily have equal densities at each end. Second, the comment about "does not exist" makes no sense to me: what does "not exist" mean?

— whuber

1

shortest CI. To calculate the shortest CI for a given coverage, I'd start at the max density and enlarge a short step to the side where density is higher. There I get most confidence coverage (for the short step that is). I enlarge the c.i. repeatedly until I have the desired area (coverage). If my steps are small (infinitesimal) the density at both sides will be (approx.) the same. Did I make a mistake in this strategy?

— cbeleites supports Monica

does not exist: e.g. 4 successes out of 5. It does make sense to ask for the 95 % c.i. However if I calculate the probability density for the true $p$ given that I observed 4 successes out of 5 trials, the tail above $\hat p = 4/5 = 0.8$ is only about 0.35. Thus instead of accepting e.g. the normal approximation saying the 95% c.i. goes up to 1.15 (which cannot be correct as the true $p$ of the binomial trial cannot exceed 1), I'd say the c.i. with equal width towards lower and higher $p$ does only exist for confidence levels $< 70 \%$.
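The numbers in this comment can be reproduced with a short sketch. It assumes a uniform prior on $p$, which the comment does not state; under that assumption the density of $p$ after 4 successes in 5 trials is Beta(5, 2). The Wald limit uses the usual normal-approximation formula.

```r
k <- 4; n <- 5
p_hat <- k / n

# Assumption: uniform prior on p, so p given the data follows Beta(k + 1, n - k + 1).
tail_above <- pbeta(p_hat, k + 1, n - k + 1, lower.tail = FALSE)

# Upper limit of the 95% normal-approximation (Wald) interval.
wald_upper <- p_hat + qnorm(0.975) * sqrt(p_hat * (1 - p_hat) / n)

print(tail_above)   # probability mass above p_hat
print(wald_upper)   # exceeds 1, which no proportion can do
```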

— cbeleites supports Monica

1

Are we talking about different things? The binomial distribution is discrete, a c.i. would be "for $p = 0.8$, in 94 % of the repetitions we observe $k \in \{3, 4, 5\}$ successes in $n = 5$ tests". But I understood that we are to estimate $p$ for already observed $n$ and $k$. E.g. $p$ given that $k = 4$ out of $n = 5$ tests were successes. So I'm talking about $Pr (p | n = 5, k = 4)$, $p \in [0, 1]$. This is not the binomial distribution $Pr (k | n, p)$ but that of proportion $p$ (I don't know its name). Please help me to understand why there is no density for this distribution?

— cbeleites supports Monica

6

Binomial distribution is just not symmetric, yet this fact emerges especially for $p$ near $0$ or $1$ and for small $n$ ; most people use it for $p\approx 0.5$ and so the confusion.

— chl

2

I know that it has been a while, but I thought that I would chime in here. Given n and p, it is simple to compute the probability of a particular number of successes directly using the binomial distribution. One can then examine the distribution to see that it is not symmetric. It will approach symmetry for large np and large n(1-p).

One can accumulate the probabilities in the tails to compute a particular CI. Given the discrete nature of the distribution, finding a particular probability in a tail (e.g., 2.5% for a 95% CI) will require interpolation between the number of successes. With this method, one can compute CIs directly without approximation (other than the required interpolation).
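Both points can be illustrated directly from the binomial probabilities. The sketch below computes the skewness of the distribution from `dbinom` for two illustrative cases, $p = 0.9$ with $n = 10$ and with $n = 1000$. The helper name `binom_skew` and the chosen values are assumptions made for the example.

```r
# Skewness of Binomial(n, p), computed from the exact probabilities.
# binom_skew is a hypothetical helper name used only for this illustration.
binom_skew <- function(n, p) {
    k  <- 0:n
    w  <- dbinom(k, n, p)                 # probability of each number of successes
    mu <- sum(w * k)                      # mean
    s  <- sqrt(sum(w * (k - mu)^2))       # standard deviation
    sum(w * (k - mu)^3) / s^3             # standardised third central moment
}

skew_small <- binom_skew(10, 0.9)     # small n with p near 1
skew_large <- binom_skew(1000, 0.9)   # same p, much larger n

print(c(skew_small = skew_small, skew_large = skew_large))
```

A skewness of zero would mean a symmetric distribution; the value moves towards zero as $n$ grows with $p$ fixed.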

— Dr. Eric

Binomial confidence interval estimation - why is it not symmetric?