To summarize (and in case the OP's hyperlinks fail in the future), we are looking at the dataset hsb2 as follows:
     id female race ses schtyp prog read write math science socst
1    70      0    4   1      1    1   57    52   41      47    57
2   121      1    4   2      1    3   68    59   53      63    61
...
199 118      1    4   2      1    1   55    62   58      58    61
200 137      1    4   3      1    2   63    65   65      53    61
which can be imported here.
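For reproducibility, one way to load it (assuming the UCLA IDRE copy of the file is still served at this address):

hsb2 <- read.table("https://stats.idre.ucla.edu/stat/data/hsb2.csv", sep = ",", header = TRUE)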
We turn the variable read into an ordered (factor) variable:
hsb2$readcat <- cut(hsb2$read, 4, ordered_result = TRUE)
(means = tapply(hsb2$write, hsb2$readcat, mean))
(28,40] (40,52] (52,64] (64,76]
42.77273 49.97849 56.56364 61.83333
Now we are all set to just run a regular ANOVA - yes, it is R, and we basically have a continuous dependent variable, write, and an explanatory variable with several levels, readcat. In R we can use lm(write ~ readcat, hsb2).
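Since readcat is an ordered factor, R will apply polynomial contrasts to it by default; a quick way to confirm this before dissecting anything:

options("contrasts")     # defaults: contr.treatment (unordered), contr.poly (ordered)
contrasts(hsb2$readcat)  # the matrix R will actually use - the same as contr.poly(4)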
1. Generating the contrast matrix:
The ordered variable readcat has four different levels, so we will have $n - 1 = 3$ contrasts.
table(hsb2$readcat)
(28,40] (40,52] (52,64] (64,76]
22 93 55 30
First, let's go for the money and take a look at the built-in R function:
contr.poly(4)
.L .Q .C
[1,] -0.6708204 0.5 -0.2236068
[2,] -0.2236068 -0.5 0.6708204
[3,] 0.2236068 -0.5 -0.6708204
[4,] 0.6708204 0.5 0.2236068
Now let's dissect what goes on under the hood:
scores = 1:4                        # 1 2 3 4 These are the four levels of the explanatory variable.
y = scores - mean(scores)           # scores - 2.5, i.e. y = [-1.5, -0.5, 0.5, 1.5]
n = 4
X <- outer(y, seq_len(n) - 1, "^")  # seq_len(n) - 1 = [0, 1, 2, 3]; raise y columnwise to powers 0..3

$$X = \begin{bmatrix} 1 & -1.5 & 2.25 & -3.375 \\ 1 & -0.5 & 0.25 & -0.125 \\ 1 & 0.5 & 0.25 & 0.125 \\ 1 & 1.5 & 2.25 & 3.375 \end{bmatrix}$$
What happened there? outer(a, b, "^") raises the elements of a to the elements of b, so that the first column results from the operations $(-1.5)^0$, $(-0.5)^0$, $0.5^0$ and $1.5^0$; the second column from $(-1.5)^1$, $(-0.5)^1$, $0.5^1$ and $1.5^1$; the third from $(-1.5)^2 = 2.25$, $(-0.5)^2 = 0.25$, $0.5^2 = 0.25$ and $1.5^2 = 2.25$; and the fourth, $(-1.5)^3 = -3.375$, $(-0.5)^3 = -0.125$, $0.5^3 = 0.125$ and $1.5^3 = 3.375$.
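A quick sanity check on two of those columns:

all.equal(X[, 3], y^2)  # TRUE: the third column is the centered scores squared
all.equal(X[, 4], y^3)  # TRUE: the fourth column is the centered scores cubed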
Next we do an orthonormal $QR$ decomposition of this matrix and take the compact representation of the resulting $Q$ (c_Q = qr(X)$qr). Some of the inner workings of the functions used in the QR factorization in R, used in this post, are further explained here.

$$c_Q = \begin{bmatrix} -2 & 0 & -2.5 & 0 \\ 0.5 & -2.236 & 0 & -4.584 \\ 0.5 & 0.447 & 2 & 0 \\ 0.5 & 0.894 & -0.9296 & -1.342 \end{bmatrix}$$
z = c_Q * (row(c_Q) == col(c_Q)) zeroes out the off-diagonal entries, keeping only the diagonal, which holds the diagonal of $R$ from the $QR$ decomposition. Then we call raw = qr.qy(qr(X), z), which is equivalent to the multiplication Q %*% z, where $Q$ is the orthogonal matrix encoded compactly in qr(X)$qr and recoverable explicitly as Q = qr.Q(qr(X)).
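To convince ourselves of that equivalence (a small check, reusing X and z from above):

Q <- qr.Q(qr(X))
all.equal(raw, Q %*% z)  # TRUE: qr.qy() multiplies by Q without forming it explicitly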
Multiplying by $Q$ does not change the orthogonality of the constituent column vectors, but given that the absolute value of the eigenvalues appears in decreasing order from top left to bottom right, the multiplication Q %*% z will tend to decrease the values in the higher-order polynomial columns:
Matrix of Eigenvalues of R
     [,1]      [,2] [,3]      [,4]
[1,]   -2  0.000000    0  0.000000
[2,]    0 -2.236068    0  0.000000
[3,]    0  0.000000    2  0.000000
[4,]    0  0.000000    0 -1.341641
Compare the values in the later column vectors (quadratic and cubic) before and after the QR factorization operations, and contrast them with the unaffected first two columns.
Before QR factorization operations (orthogonal col. vec.)
[,1] [,2] [,3] [,4]
[1,] 1 -1.5 2.25 -3.375
[2,] 1 -0.5 0.25 -0.125
[3,] 1 0.5 0.25 0.125
[4,] 1 1.5 2.25 3.375
After QR operations (equally orthogonal col. vec.)
[,1] [,2] [,3] [,4]
[1,] 1 -1.5 1 -0.295
[2,] 1 -0.5 -1 0.885
[3,] 1 0.5 -1 -0.885
[4,] 1 1.5 1 0.295
Finally we call (Z <- sweep(raw, 2L, apply(raw, 2L, function(x) sqrt(sum(x^2))), "/", check.margin = FALSE)), turning the matrix raw into a matrix of orthonormal column vectors:
Orthonormal vectors (orthonormal basis of R^4)
[,1] [,2] [,3] [,4]
[1,] 0.5 -0.6708204 0.5 -0.2236068
[2,] 0.5 -0.2236068 -0.5 0.6708204
[3,] 0.5 0.2236068 -0.5 -0.6708204
[4,] 0.5 0.6708204 0.5 0.2236068
This function simply "normalizes" the matrix by dividing ("/") each element columnwise by $\sqrt{\sum_{i} x_i^2}$, the norm of its column. So it can be decomposed into two steps: (i) apply(raw, 2, function(x) sqrt(sum(x^2))), resulting in 2 2.236 2 1.341, which are the denominators for each column in (ii), where every element in a column is divided by the corresponding value of (i).
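Spelled out explicitly, a sketch equivalent to the sweep() call above:

norms <- apply(raw, 2, function(x) sqrt(sum(x^2)))  # step (i): the column norms
Z2 <- raw %*% diag(1 / norms)                       # step (ii): divide each column by its norm
all.equal(Z, Z2, check.attributes = FALSE)          # TRUE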
At this point the column vectors form an orthonormal basis of $\mathbb{R}^4$, until we get rid of the first column, which will be the intercept, and we have reproduced the result of contr.poly(4):

$$\begin{bmatrix} -0.6708204 & 0.5 & -0.2236068 \\ -0.2236068 & -0.5 & 0.6708204 \\ 0.2236068 & -0.5 & -0.6708204 \\ 0.6708204 & 0.5 & 0.2236068 \end{bmatrix}$$
The columns of this matrix are orthonormal, as can be shown by (sum(Z[,3]^2))^(1/4) = 1 and Z[,3] %*% Z[,4] = 0, for example (incidentally the same goes for rows). And each column is the result of raising the initial scores - mean to the 1st, 2nd and 3rd power, respectively, i.e. linear, quadratic and cubic.
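The whole claim can be checked in one line, since for orthonormal columns the cross-product of $Z$ is the identity matrix (up to floating-point error):

round(crossprod(Z), 10)  # Z'Z = I, so all columns are mutually orthogonal with unit norm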
2. Which contrasts (columns) contribute significantly to explain the differences between levels in the explanatory variable?
We can just run the ANOVA and look at the summary...
summary(lm(write ~ readcat, hsb2))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 52.7870 0.6339 83.268 <2e-16 ***
readcat.L 14.2587 1.4841 9.607 <2e-16 ***
readcat.Q -0.9680 1.2679 -0.764 0.446
readcat.C -0.1554 1.0062 -0.154 0.877
... to see that there is a linear effect of readcat on write, so that the original values (in the third chunk of code at the beginning of the post) can be reproduced as:
coeff = coefficients(lm(write ~ readcat, hsb2))
C = contr.poly(4)
(recovered = c(coeff %*% c(1, C[1,]),
coeff %*% c(1, C[2,]),
coeff %*% c(1, C[3,]),
coeff %*% c(1, C[4,])))
[1] 42.77273 49.97849 56.56364 61.83333
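... or, much more simply, the same means can be read straight off the fitted model, since with a single factor the fitted values are exactly the group means (a minimal sketch; the names fit and recovered2 are introduced here):

fit <- lm(write ~ readcat, hsb2)
(recovered2 <- tapply(fitted(fit), hsb2$readcat, mean))  # the same four group means as above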
Being orthogonal contrasts, the sum of their components adds to zero, $\sum_{i=1}^{t} a_i = 0$ for constants $a_1, \cdots, a_t$, and the dot product of any two of them is zero. If we could visualize them, they would look something like this:
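A quick base-R sketch of what such a plot could look like (plot details are mine, not the original figure):

C <- contr.poly(4)
matplot(1:4, C, type = "b", pch = 16, lty = 1,
        xlab = "level", ylab = "contrast value")  # linear, quadratic and cubic contrasts
legend("bottomright", legend = c("linear", "quadratic", "cubic"),
       col = 1:3, pch = 16, lty = 1)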
The idea behind orthogonal contrasts is that the inferences we can extract (in this case, generating coefficients via a linear regression) will be the result of independent aspects of the data. This would not be the case if we simply used $X^0, X^1, \cdots, X^n$ as contrasts.
Graphically, this is much easier to understand. Compare the actual means by groups (the large black square blocks) to the predicted values, and see why a straight-line approximation with minimal contribution from the quadratic and cubic polynomials (with curves only approximated with loess) is optimal:
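A rough base-R sketch of that comparison (assuming hsb2 as above; the loess curves of the original figure are omitted):

fit <- lm(write ~ readcat, hsb2)
plot(jitter(as.integer(hsb2$readcat)), hsb2$write, col = "gray",
     xlab = "readcat (level index)", ylab = "write")          # raw observations
points(1:4, tapply(hsb2$write, hsb2$readcat, mean),
       pch = 15, cex = 2)                                     # actual group means (black squares)
lines(1:4, tapply(fitted(fit), hsb2$readcat, mean), lwd = 2)  # model predictions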
If, just for effect, the coefficients of the ANOVA had been as large for the quadratic and cubic contrasts as for the linear one, the nonsensical plot that follows would depict more clearly the polynomial plots of each "contribution":
The code is here.