prior must be on theta_raw when model_type = 2

stan-dev · imadmali · Jul 10, 2017 · Jul 11, 2017 · Jul 13, 2017 · Jul 13, 2017
commit 87ec6a964643d1cf0a0680b4a78222415d6c5f7f
diff --git a/R/stan_spatial.fit.R b/R/stan_spatial.fit.R
@@ -60,6  60,16 @@ stan_spatial.fit <- function(x, y, w,
                        binomial = 4,
                        Gamma = 5)
 
   # for when consistent-family-numbers gets merged
   # family_num <- switch(family$family,
   #                      gaussian = 1,
   #                      Gamma = 2,
   #                      inv_gaussian = 3,
   #                      beta = 4,
   #                      binomial = 5,
   #                      poisson = 6,
   #                      neg_binomial_2 = 7)
 
   if (family$family %in% c("gaussian", "Gamma")) {
     is_continuous <- TRUE
     y_real <- y
@@ -392,7  402,7 @@ create_scaling_factor <- function(dat) {
   # Compute the diagonal elements of the covariance matrix subject to the 
   # constraint that the entries of the ICAR sum to zero.
   # See the function help for further details.
-  Q_inv <- inla.qinv(Q_pert, constr=list(A = matrix(1,1,dat$N),e=0))
   Q_inv <- INLA::inla.qinv(Q_pert, constr=list(A = matrix(1,1,dat$N),e=0))
 
   # Compute the geometric mean of the variances, which are on the diagonal of Q.inv
   scaling_factor <- exp(mean(log(Matrix::diag(Q_inv))))

diff --git a/exec/spatial.stan b/exec/spatial.stan
@@ -46,7  46,7 @@ data {
   int link;
   int<lower=0,upper=1> is_continuous;
   int<lower=0,upper=1> has_aux;
-  int<lower=1,upper=3> model_type;       // Besag = 0; BYM = 1; BYM = 2
   int<lower=1,upper=3> model_type;       // Besag = 1; BYM = 2; BYM2 = 3
   int<lower=0,upper=1> has_intercept;
   vector[K] xbar;
   int<lower=0> trials[family == 4 ? N : 0];
@@ -122,7  122,7 @@ transformed parameters {
   vector[N] psi;
   phi[1:(N - 1)] = phi_raw;
   phi[N] = -sum(phi_raw);
-  /*
   /* precision form
   if (model_type == 1)
     psi = phi * sqrt(inv(tau));
   else if (model_type == 2)
@@ -175,13  175,14 @@ model {
   #include "priors.stan"
   // model specific priors
   if (model_type == 2) {
     target = normal_lpdf(theta_raw | 0, 1);  // unstructured (random) effect
     // prior on overall spatial variation
     if (prior_dist_rho == 1)
-      target = normal_lpdf(rho | prior_mean_rho, prior_scale_rho);
       target = normal_lpdf(rho[1] | prior_mean_rho, prior_scale_rho);
     else if (prior_dist_rho == 2)
-      target = student_t_lpdf(rho | prior_df_rho, prior_mean_rho, prior_scale_rho);
       target = student_t_lpdf(rho[1] | prior_df_rho, prior_mean_rho, prior_scale_rho);
     else if (prior_dist_rho == 3)
-      target = exponential_lpdf(rho | prior_scale_rho);
       target = exponential_lpdf(rho[1] | prior_scale_rho);
     /* else prior_dist_tau is 0 and nothing is added */
   }
   else if (model_type == 3) {  // BYM

diff --git a/vignettes/spatial.Rmd b/vignettes/spatial.Rmd
@@ -60,7  60,7 @@ Unfortunately there is no guarantee that $\mathbf{W}$ is positive definite so $\
 
 ## Besag (ICAR) Spatial Prior
 
-The `stan_besag` function fits the data to an ICAR model. This means that the spatial effect enters the linear predictor as
 The `stan_besag` modeling function fits the data to an ICAR model. This means that the spatial effect enters the linear predictor as
 $$
 \begin{align}
 \boldsymbol{\psi} = \boldsymbol{\phi} \\
@@ -71,19  71,35 @@ $$
 
 where $\rho$ is a scalar that controls the overall spatial variation and has an appropriate hyperprior distribution. 
 
-## Variant of the BYM Spatial Prior
 ## BYM
 
-The `stan_bym` model fits the data to a variant of the BYM model where the spatial effect enters as a convolution of the structured (spatial) effect and thex unstructured (random) effect,
 The ICAR model is limited in that it only accounts for spatial variation among the spatial units. Thus, the random variation is picked up by the spatial variation which results in misleading parameter estimates and invalid inferences. The BYM model explains spatial variation as the sum of a structured (spatial) component $\boldsymbol{\phi}$ and an unstructured (random) component $\boldsymbol{\theta}$. Therefore, the spatial component takes the following form,
 $$
 \begin{align}
-\boldsymbol{\psi} &= \tau(\boldsymbol{\theta}\sqrt{1-\rho}   \boldsymbol{\phi}\sqrt{\rho}) \\
 \boldsymbol{\psi} &= \rho\phi   \tau\theta \\
 f(\boldsymbol{\phi} | \boldsymbol{\mu}) &= (2\pi)^{-n/2} |\mathbf{Q}|^{1/2}
 \exp\bigg( -\frac{1}{2}\boldsymbol{\phi}^{\top}\mathbf{Q}\boldsymbol{\phi} \bigg) \\
-f(\theta_i) &= (2\pi\sigma^2)^{-1/2}\exp{\bigg( -\frac{\theta_i^2}{2} \bigg)}
 f(\theta_i) &= (2\pi)^{-1/2}\exp{\bigg( -\frac{\theta_i^2}{2} \bigg)}
 \end{align}
 $$
 Note that the unstructured effect $\boldsymbol{\theta}$ is distributed standard normal.
 
 ## BYM2 (Variant of the BYM Spatial Prior)
 
 The `stan_bym2` modeling function fits the data to a variant of the BYM model where the spatial effect enters as a convolution of the structured (spatial) effect and thex unstructured (random) effect,
 $$
 \begin{align}
 \boldsymbol{\psi} &= \tau(\boldsymbol{\theta}\sqrt{1-\rho}   \boldsymbol{\phi}\sqrt{\rho}) \\
 f(\boldsymbol{\phi} | \boldsymbol{\mu}) &= (2\pi)^{-n/2} |\mathbf{Q}|^{1/2}
 \exp\bigg( -\frac{1}{2}\boldsymbol{\phi}^{\top}\mathbf{Q}\boldsymbol{\phi} \bigg) \\
 f(\theta_i) &= (2\pi)^{-1/2}\exp{\bigg( -\frac{\theta_i^2}{2} \bigg)}
 \end{align}
 $$
 
 As in the BYM model $\boldsymbol{\theta}$ is distributed standard normal. Here, the parameter $\rho$ is on the unit interval and is interpreted as the proportion of spatial variation that is contributed to overall variation, and $\tau$ explains the overall (convolved) variation.
 
 Priors on $\rho$ should be chosen wisely as $\rho=0$ reduces to a model that not account for spatial variation and $\rho=1$ reduces to the ICAR model, which does not account for random variation among the spatial units. 
 
 ## Posterior
 
 Combining these components yields the following posteriors. For `stan_besag` we have,
@@ -96,7  112,8 @@ f(\rho) \times
 f(\lambda)
 $$
 
-and for `stan_bym` we have,
 
 and for both `stan_bym` and `stan_bym2` we have,
 $$
 f(\alpha, \boldsymbol{\beta},\boldsymbol{\phi}, \boldsymbol{\theta}, \rho, \tau, \gamma | \mathbf{y},\mathbf{X}, \mathbf{Q}) \propto 
 \prod_{i=1}^N f(y_i | \gamma) \times 
@@ -112,7  129,7 @@ $$
 
 As an example we use spatial units defined on a lattice. Below we plot a GMRF of 100 spatial units available in the rstanarm package.
 
-```{r load-lattice, fig.align='center', fig.height=5}
 ```{r load-lattice, fig.align='center', fig.height=8}
 library(spdep)
 # Load the preconstruced lattice/GMRF
 data("lattice", package = "rstanarm")
@@ -155,11  172,11 @@ Now we can fit the model with `stan_besag`.
 
 ```{r fit-besag, results = "hide"}
 library(rstanarm)
-fit_besag <- stan_besag(y ~ 1   x   z, data = spatial_data,
 fit_besag <- stan_besag(y ~ 1   x   z, data = spatial_data, W,
                         prior_intercept = normal(0,1), prior = normal(0,1), prior_rho = normal(0,1),
                         family = binomial(link="logit"), trials = spatial_data$trials,
                         chains = CHAINS, cores = CORES, seed = SEED, iter = ITER)
-fit_besag_bad <- stan_besag(y ~ 1   x   I(x^2), data = spatial_data,
 fit_besag_bad <- stan_besag(y ~ 1   x   I(x^2), data = spatial_data, W,
                             prior_intercept = normal(0,1), prior = normal(0,1), prior_rho = normal(0,1),
                             family = binomial(link="logit"), trials = spatial_data$trials,
                             chains = CHAINS, cores = CORES, seed = SEED, iter = ITER)
@@ -168,15  185,15 @@ print(fit_besag_bad)
 ```
 
 As a two-dimensional posterior predictive check we can plot the posterior predictions on the lattice using the `posterior_predict` function.
-```{r ppcheck-2d, fig.align='center', fig.height=5}
 ```{r ppcheck-2d, fig.align='center', fig.height=8}
 grid_sim@data$y_pred <- colMeans(posterior_predict(fit_besag))
 var_range_y_pred <- seq(min(grid_sim@data$y_pred), max(grid_sim@data$y_pred)   1, by = 1)
-spplot(grid_sim, "y", at = var_range_y_pred, main = expression(y[pred]),
 spplot(grid_sim, "y_pred", at = var_range_y_pred, main = expression(y[pred]),
        col.regions = colorRampPalette(c("#ef8a62", "#f7f7f7", "#67a9cf"))(max(var_range_y_pred)   1))
 ```
 
 Alternatively we can look at the conventional one-dimensional posterior predictive check with the `pp_check` function.
-```{r ppcheck-1d, fig.align='center', fig.height=5}
 ```{r ppcheck-1d, fig.align='center', fig.height=8}
 pp_check(fit_besag)
 ```
 
@@ -196,7  213,29 @@ compare(loo_besag, loo_besag_bad)
 
 ## Smoothing the Spatial Random Walk
 
-In some cases modeling the GMRF spatial component with the precsion matrix $\mathbf{Q}$ leads to rough spatial varation. This occurs when dealing with spatial units on a lattice. Using the square of the precision matrix $\mathbf{Q}\mathbf{Q}$ allows us to smooth out the spatial variation. Below we visually illustrate this result with a random walk in one-dimension using the `stan_rw` function with the argument `smooth = TRUE`.
 In some cases modeling the GMRF spatial component with the precsion matrix $\mathbf{Q}$ leads to rough spatial varation. This occurs when dealing with spatial units on a lattice. Using the square of the precision matrix $\mathbf{Q}\mathbf{Q}$ allows us to smooth out the spatial variation across the spatial units. Below we use the `order = 2` argument to fit the model with smoothing of order 2.
 
 ```{r smoothing, results="hide"}
 fit_besag_smooth <- stan_besag(y ~ 1   x   z, data = spatial_data, W, order = 2,
                                prior_intercept = normal(0,1), prior = normal(0,1), prior_rho = normal(0,1),
                                family = binomial(link="logit"), trials = spatial_data$trials,
                                chains = CHAINS, cores = CORES, seed = SEED, iter = ITER)
 ```
 
 Now we plot the results alongside the first fit with `order = 1`.
 
 ```{r plot-smoothing, fig.align='center', fig.height=12}
 grid_sim@data$y_pred_smooth <- colMeans(posterior_predict(fit_besag_smooth))
 
 var_range_y_pred <- seq(min(grid_sim@data$y_pred_smooth), max(grid_sim@data$y_pred_smooth)   1, by = 1)
 gridExtra::grid.arrange(layout_matrix = matrix(1:2, nrow = 1),
   spplot(grid_sim, "y_pred_smooth", at = var_range_y_pred, main = expression(y[pred_smooth]),
          col.regions = colorRampPalette(c("#ef8a62", "#f7f7f7", "#67a9cf"))(max(var_range_y_pred)   1)),
   spplot(grid_sim, "y_pred", at = var_range_y_pred, main = expression(y[pred]),
          col.regions = colorRampPalette(c("#ef8a62", "#f7f7f7", "#67a9cf"))(max(var_range_y_pred)   1))
 )
 ```
 
 
 ## References