Managing high autocorrelation in MCMC

I'm building a rather complex hierarchical Bayesian model for a meta-analysis using R and JAGS. Simplifying a bit, the two key levels of the model are

$$y_{ij} = \alpha_j + \epsilon_i$$

$$\alpha_j = \sum_h \gamma_{h(j)} + \epsilon_j$$

where $y_{ij}$ is the $i$th observation of the endpoint (in this case, GM vs. non-GM crop yields) in study $j$, $\alpha_j$ is the effect for study $j$, the $\gamma$s are effects for various study-level variables (the economic development status of the country where the study was done, crop species, study method, etc.) indexed by a family of functions $h$, and the $\epsilon$s are error terms. Note that the $\gamma$s are not coefficients on dummy variables. Instead, there are distinct $\gamma$ variables for different study-level values. For example, there is $\gamma_{developing}$ for developing countries and $\gamma_{developed}$ for developed countries.

I'm primarily interested in estimating the values of the $\gamma$s. This means that dropping study-level variables from the model is not a good option.

Several of the study-level variables are highly correlated with one another, and I think this is producing large autocorrelations in my MCMC chains. This diagnostic plot shows the chain trajectories (left) and the resulting autocorrelation (right):

[Figure: chain trajectories (left) and autocorrelation (right)]

As a consequence of the autocorrelation, I'm getting effective sample sizes of 60-120 from 4 chains of 10,000 samples each.
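
To make the link between autocorrelation and effective sample size concrete, here is a minimal Python sketch. It is not the estimator that coda or Stan uses (those are more refined); it assumes the simple formula ESS = N / (1 + 2 * sum of autocorrelations), truncated at the first non-positive lag. The AR(1) chains, the function names, and the coefficients 0.98 and 0.10 are illustrative assumptions, not output from the model above.

```python
import random


def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def effective_sample_size(x, max_lag=500):
    """ESS = N / (1 + 2 * sum of autocorrelations).

    The sum is truncated at the first non-positive autocorrelation,
    a simple cutoff chosen here for illustration.
    """
    n = len(x)
    rho_sum = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = autocorr(x, lag)
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


def ar1_chain(n, phi, seed=0):
    """Simulate an AR(1) series as a stand-in for one MCMC chain (assumption)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0, 1))
    return x


# A strongly autocorrelated chain versus a well-mixing one, 10,000 draws each.
sticky = ar1_chain(10000, phi=0.98)
mixing = ar1_chain(10000, phi=0.10)

ess_sticky = effective_sample_size(sticky)
ess_mixing = effective_sample_size(mixing)

print(f"ESS, phi=0.98: {ess_sticky:.0f}")
print(f"ESS, phi=0.10: {ess_mixing:.0f}")
```

For an AR(1) process the theoretical value is N(1 - phi)/(1 + phi), which for phi = 0.98 and N = 10,000 is about 101, the same order as the 60-120 reported here. Since the Monte Carlo standard error of a posterior mean is roughly the posterior standard deviation divided by the square root of the ESS, an ESS of 60-120 corresponds to an error of roughly 9-13% of the posterior standard deviation.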

I have two questions, one clearly objective and the other more subjective.

1. Other than thinning, adding more chains, and running the sampler for longer, what techniques can I use to manage this autocorrelation problem? By "manage" I mean "produce reasonably good estimates in a reasonable amount of time." In terms of computing power, I'm running these models on a MacBook Pro.
2. How serious is this degree of autocorrelation? Discussions both here and on John Kruschke's blog suggest that, if we just run the model long enough, "the clumpy autocorrelation has probably been all averaged out" (Kruschke) and so it's not really a big deal.

Here's the JAGS code for the model that produced the plot above, in case anyone is interested enough to wade through the details:

model {
for (i in 1:n) {
    # Study finding = study effect + noise
    # tau = precision (1/variance)
    # nu = normality parameter (higher = more Gaussian)
    y[i] ~ dt(alpha[study[i]], tau[study[i]], nu)
}

nu <- nu_minus_one + 1
nu_minus_one ~ dexp(1/lambda)
lambda <- 30

# Hyperparameters above study effect
for (j in 1:n_study) {
    # Study effect = sum of study-level (gamma) effects + noise
    alpha_hat[j] <- gamma_countr[countr[j]] + 
                    gamma_studytype[studytype[j]] +
                    gamma_jour[jourtype[j]] +
                    gamma_industry[industrytype[j]]
    alpha[j] ~ dnorm(alpha_hat[j], tau_alpha)
    # Study-level variance
    tau[j] <- 1/sigmasq[j]
    sigmasq[j] ~ dunif(sigmasq_hat[j], sigmasq_hat[j] + pow(sigma_bound, 2))
    sigmasq_hat[j] <- eta_countr[countr[j]] + 
                        eta_studytype[studytype[j]] + 
                        eta_jour[jourtype[j]] +
                        eta_industry[industrytype[j]]
    sigma_hat[j] <- sqrt(sigmasq_hat[j])
}
tau_alpha <- 1/pow(sigma_alpha, 2)
sigma_alpha ~ dunif(0, sigma_alpha_bound)

# Priors for country-type effects
# Developing = 1, developed = 2
for (k in 1:2) {
    gamma_countr[k] ~ dnorm(gamma_prior_exp, tau_countr[k])
    tau_countr[k] <- 1/pow(sigma_countr[k], 2)
    sigma_countr[k] ~ dunif(0, gamma_sigma_bound)
    eta_countr[k] ~ dunif(0, eta_bound)
}

# Priors for study-type effects
# Farmer survey = 1, field trial = 2
for (k in 1:2) {
    gamma_studytype[k] ~ dnorm(gamma_prior_exp, tau_studytype[k])
    tau_studytype[k] <- 1/pow(sigma_studytype[k], 2)
    sigma_studytype[k] ~ dunif(0, gamma_sigma_bound)
    eta_studytype[k] ~ dunif(0, eta_bound)
}

# Priors for journal effects
# Not journal published = 1, journal published = 2
for (k in 1:2) {
    gamma_jour[k] ~ dnorm(gamma_prior_exp, tau_jourtype[k])
    tau_jourtype[k] <- 1/pow(sigma_jourtype[k], 2)
    sigma_jourtype[k] ~ dunif(0, gamma_sigma_bound)
    eta_jour[k] ~ dunif(0, eta_bound)
}

# Priors for industry funding effects
for (k in 1:2) {
    gamma_industry[k] ~ dnorm(gamma_prior_exp, tau_industrytype[k])
    tau_industrytype[k] <- 1/pow(sigma_industrytype[k], 2)
    sigma_industrytype[k] ~ dunif(0, gamma_sigma_bound)
    eta_industry[k] ~ dunif(0, eta_bound)
}
}

— Dan Hicks

Complex multilevel models are pretty much the reason Stan exists, for the very reasons you identify.

— Sycorax says Reinstate Monica

I initially tried to build this in Stan several months ago. The studies involve different numbers of findings, which (at least at the time; I haven't checked whether things have changed) required adding another layer of complexity to the code and meant that Stan couldn't take advantage of the matrix calculations that make it so fast.

— Dan Hicks

I wasn't thinking about speed so much as the efficiency with which HMC explores the posterior. My understanding is that because HMC can cover so much more ground per iteration, successive draws are less autocorrelated.

— Sycorax says Reinstate Monica

Oh, yes, that's an interesting point. I'll put it on my list of things to try.

— Dan Hicks

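Following the suggestion from user777, it looks like the answer to my first question is "use Stan." After rewriting the model in Stan, here are the trajectories (4 chains x 5000 iterations after burn-in):

[Figure: chain trajectories]

And the autocorrelation plots:

[Figure: autocorrelation plots]

Much better! For completeness, here's the Stan code: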

data {                          // Data: Exogenously given information
// Data on totals
int n;                      // Number of distinct finding i
int n_study;                // Number of distinct studies j

// Finding-level data
vector[n] y;                // Endpoint for finding i
array[n_study] int study_n;       // # findings for study j

// Study-level data
array[n_study] int countr;        // Country type for study j
array[n_study] int studytype;     // Study type for study j
array[n_study] int jourtype;      // Was study j published in a journal?
array[n_study] int industrytype;  // Was study j funded by industry?

// Top-level constants set in R call
real sigma_alpha_bound;     // Upper bound for noise in alphas
real gamma_prior_exp;       // Prior expected value of gamma
real gamma_sigma_bound;     // Upper bound for noise in gammas
real eta_bound;             // Upper bound for etas
}

transformed data {
// Constants set here
int countr_levels;          // # levels for countr
int study_levels;           // # levels for studytype
int jour_levels;            // # levels for jourtype
int industry_levels;        // # levels for industrytype
countr_levels = 2;
study_levels = 2;
jour_levels = 2;
industry_levels = 2;
}

parameters {                    // Parameters:  Unobserved variables to be estimated
vector[n_study] alpha;      // Study-level mean
real<lower = 0, upper = sigma_alpha_bound> sigma_alpha;     // Noise in alphas

vector<lower = 0, upper = 100>[n_study] sigma;          // Study-level standard deviation

// Gammas:  contextual effects on study-level means
// Country-type effect and noise in its estimate
vector[countr_levels] gamma_countr;     
vector<lower = 0, upper = gamma_sigma_bound>[countr_levels] sigma_countr;
// Study-type effect and noise in its estimate
vector[study_levels] gamma_study;
vector<lower = 0, upper = gamma_sigma_bound>[study_levels] sigma_study;
vector[jour_levels] gamma_jour;
vector<lower = 0, upper = gamma_sigma_bound>[jour_levels] sigma_jour;
vector[industry_levels] gamma_industry;
vector<lower = 0, upper = gamma_sigma_bound>[industry_levels] sigma_industry;


// Etas:  contextual effects on study-level standard deviation
vector<lower = 0, upper = eta_bound>[countr_levels] eta_countr;
vector<lower = 0, upper = eta_bound>[study_levels] eta_study;
vector<lower = 0, upper = eta_bound>[jour_levels] eta_jour;
vector<lower = 0, upper = eta_bound>[industry_levels] eta_industry;
}

transformed parameters {
vector[n_study] alpha_hat;                  // Fitted alpha, based only on gammas
vector<lower = 0>[n_study] sigma_hat;       // Fitted sd, based only on sigmasq_hat

for (j in 1:n_study) {
    alpha_hat[j] = gamma_countr[countr[j]] + gamma_study[studytype[j]] + 
                    gamma_jour[jourtype[j]] + gamma_industry[industrytype[j]];
    // Combine the etas as standard deviations: square each one, then take the root
    sigma_hat[j] = sqrt(eta_countr[countr[j]]^2 + eta_study[studytype[j]]^2 +
                        eta_jour[jourtype[j]]^2 + eta_industry[industrytype[j]]^2);
}
}

model {
// Technique for working w/ ragged data from Stan manual, page 135
int pos;
pos = 1;
for (j in 1:n_study) {
    segment(y, pos, study_n[j]) ~ normal(alpha[j], sigma[j]);
    pos = pos + study_n[j];
}

// Study-level mean = fitted alpha + Gaussian noise
alpha ~ normal(alpha_hat, sigma_alpha);

// Study-level sd ~ gamma distribution w/ mean sigma_hat (shape / rate)
sigma ~ gamma(.1 * sigma_hat, .1);

// Priors for gammas
gamma_countr ~ normal(gamma_prior_exp, sigma_countr);
gamma_study ~ normal(gamma_prior_exp, sigma_study);
gamma_jour ~ normal(gamma_prior_exp, sigma_jour);
gamma_industry ~ normal(gamma_prior_exp, sigma_industry);
}
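
The ragged-data loop in the model block walks a single flat vector of findings using a running position and a per-study count. As a rough illustration of that indexing only (not of the sampling), here is a small Python sketch. The helper name `split_by_study` and the numbers are made up for the example; only `y` and `study_n` correspond to names in the Stan code.

```python
def split_by_study(y, study_n):
    """Split a flat list of findings into one sub-list per study.

    y       : all findings, ordered so each study's findings are contiguous
    study_n : number of findings belonging to each study, in the same order
    """
    if sum(study_n) != len(y):
        raise ValueError("study_n must sum to the number of findings")
    segments = []
    pos = 0  # Python is 0-indexed; the Stan code starts pos at 1
    for count in study_n:
        segments.append(y[pos:pos + count])  # like segment(y, pos, count) in Stan
        pos += count
    return segments


# Hypothetical data: three studies with 2, 1, and 3 findings respectively.
y = [0.12, -0.05, 0.30, 0.08, 0.21, 0.02]
study_n = [2, 1, 3]

segments = split_by_study(y, study_n)
for j, seg in enumerate(segments, start=1):
    print(f"study {j}: {seg}")
```

Each sub-list plays the role of one `segment(y, pos, study_n[j])` call, which is what lets every study have its own mean and standard deviation even though the number of findings differs across studies.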

— Dan Hicks