-
-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature/car #322
Open
imadmali
wants to merge
69
commits into
master
Choose a base branch
from
feature/car
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
feature/car #322
Changes from 1 commit
Commits
Show all changes
69 commits
Select commit
Hold shift click to select a range
33ea96b
adding stan_CARbym; print/summary methods work
imadmali 6feee4f
get important post-estimation methods/functions working
imadmali 6ad17ba
adding stan_bym (inla version)
imadmali c2e2212
adding stan_besag (icar)
imadmali 5b3fda8
fixing conditional in stanreg for car models
imadmali 55fb6c6
expanding prior options for CAR models
imadmali 6247da9
changes to spatial.fit file
imadmali f6f853d
Merge branch 'master' into feature/car
imadmali e3ea475
adding linkinv chunks to spatial.stan
imadmali 4c0f42d
using chunk functions for meanPPD
imadmali 2070bb4
demeaning, QR decom, and posterior methods work
imadmali fcc8d71
get prior summary working for CAR models
imadmali 34f03e6
adding documentation for CAR models
imadmali 1869c9c
adding stan_besag tests; CAR models return 'terms'
imadmali 44f7e4a
adding vb methods for CAR models; optimizing currently fails
imadmali 8746893
adding tests for stan_bym
imadmali c329a6a
use sparse matrix ops for 'scaling_factor'
imadmali 0ebd45c
adding light documentation for stan_spatial.fit
imadmali 9859ddb
omit optimizing from supported algorithms in CAR models
imadmali 7c36fef
use chunks stan code; use prior_aux
imadmali fbdeeeb
adding more likelihoods to CAR models
imadmali 11c132e
refactor stan chunks for spatial.stan
imadmali aeb898b
draft of CAR vignette; fixing binomial lpmf bug in spatial.stan
imadmali 81cf7f6
updating documentation/vignette for CAR models
imadmali b65253c
adding 2nd order spatial random walks
imadmali a039e2a
Q should be squared if 'order == 2'
imadmali 40b248b
update stan_besag tests with inla comparisons
imadmali b846069
changing stan_bym to stan_bym2 (INLA convention)
imadmali 9aafe0b
adding BYM model; changing spatial parameter names
imadmali 3582ca6
change parameterization in spatial.stan from prec to sd
imadmali 9cc2a37
convert spatial weight matrix to sparse if necessary
imadmali 87ec6a9
prior must be on theta_raw when model_type = 2
imadmali 4ea4674
remove INLA dependency for bym2 model
imadmali 41555ff
updating CAR spatial tests
imadmali 31e8d1c
update CAR vignette
imadmali c3a55d0
changing names for spatial priors
imadmali 2a9001a
updates to besag unit tests
imadmali 557179b
fixing conflicts in feature/car
imadmali ee55e48
getting spatial models to compile
imadmali 2da6c77
get posterior predict to work with CAR models (hacky)
imadmali 7ab7a9a
updating spatial docs
imadmali a641852
return error when using loo on CAR models (tentative)
imadmali 5a1173e
drop loo tests
imadmali a12bb9a
resolving merge conflicts in feature/car
imadmali 4824fd3
modified namespace
imadmali 02aa765
Merge branch 'master' into feature/car
imadmali 3f52c0c
spatial.Rmd: fix encoding
bgoodri 2876ac3
update from master
imadmali a129012
Merge branch 'master' into feature/car
imadmali b498fc0
making CAR family func consistent with other stan models
imadmali 19a809d
fixing awkward family/link pairs; neg_bin(link=identity) still broken
imadmali a564548
fix count family/link issues in spatial.stan
imadmali 8fe3ffd
cleanup loo
imadmali 1556945
revise CAR tests
imadmali 4ab4d1f
use sparseMatrix when dealing with CAR adjacency matrix
imadmali 9b5bedc
simplify spatial tests
imadmali e65b2c7
fixing unit tests
imadmali 04fc47a
Merge branch 'master' into feature/car
imadmali abd0a72
adding is_car instead of is(object, 'car')
imadmali 557958c
revising spatial model vignette
imadmali 05eed19
delete src/include
bgoodri 6efcfe4
rm src/Modules*
bgoodri 9eddcc2
dropping INLA requirement
imadmali e0dfbcc
fix family/link bounds in spatial.stan
imadmali ba88f97
Merge branch 'feature/car' of https://github.com/stan-dev/rstanarm in…
bgoodri 5b08936
refactoring in-line declarations and likelihood contributions
imadmali 8eb3c7f
NAMESPACE: update
bgoodri b3d968b
Merge branch 'feature/car' of https://github.com/stan-dev/rstanarm in…
bgoodri c809d14
Merge branch 'master' into feature/car
bgoodri File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
revising spatial model vignette
- Loading branch information
commit 557958c44c799d17f15f84ad440906b10744adb5
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -41,27 41,31 @@ The linear predictor takes the following form, | |
$$ | ||
\boldsymbol{\eta} = \alpha \mathbf{X}\boldsymbol{\beta} \boldsymbol{\psi} | ||
$$ | ||
where $\alpha$ is the intercept, $\mathbf{X}$ is an $N$-by-$K$ matrix of predictors, $\boldsymbol{\beta}$ is a $K$-dimensional vector of regression coefficients, and $\boldsymbol{\psi}$ is a $N$-dimensional vector representing the spatial effect (and $N$ denotes the number of spatial units). The construction of $\boldsymbol{\psi}$ depends on the model, which is discussed in the relevant sections below. | ||
where $\alpha$ is the intercept, $\mathbf{X}$ is an $N$-by-$K$ matrix of predictors ($N$ being the number of observations and $K$ being the number of predictors), $\boldsymbol{\beta}$ is a $K$-dimensional vector of regression coefficients, and $\boldsymbol{\psi}$ is a $N$-dimensional vector representing the spatial effect. The construction of $\boldsymbol{\psi}$ depends on the model, which is discussed in the relevant sections below. | ||
|
||
Depending on the choice of likelihood there may or may not be an additional auxiliary parameter $\gamma$ in the model (e.g. in a Gaussian likelihood this would be the variation of the data). With all these components, for some probability density/mass function $f$, we can state the general form of the likelihood as, | ||
|
||
Depending on the choice of likelihood there may or may not be an additional auxiliary parameter $\gamma$ in the model (e.g. in a Gaussian likelihood this would be the variation of the data). Thus, for some probability density/mass function $f$ we can state the general form of the likelihood as, | ||
$$ | ||
\mathcal{L}(\alpha, \boldsymbol{\beta}, \gamma | \mathbf{y}) = \prod_{i=1}^N f(y_i | \alpha, \boldsymbol{\beta}, \gamma ) | ||
$$ | ||
|
||
## GMRF Hierarchical Component | ||
|
||
CAR models require that you define the spatial component as a Gaussian Markov Random Field (GMRF). The random vector $\boldsymbol{\phi}$ is a GMRF with respect to the graph $\mathcal{G} = (\mathcal{V} = \{1,\ldots,n\},\mathcal{E})$ with mean vector $\boldsymbol{\mu}$ and precision matrix $\mathbf{W}$ if its probability density function takes the precision form of the multivariate normal distribution, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we include an image for a small graph like that? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Like you want to include a plot of the lattice? I have that in there if you build the vignette. |
||
|
||
$$ | ||
f(\boldsymbol{\phi} | \boldsymbol{\mu}) = (2\pi)^{-n/2} |\mathbf{W}|^{1/2} | ||
\exp\bigg( -\frac{1}{2}(\boldsymbol{\phi-\mu})^{\top}\mathbf{W}(\boldsymbol{\phi-\mu}) \bigg) | ||
$$ | ||
|
||
and $W_{i,j} \neq 0$ when $\{i,j\}\in\mathcal{E}$ for all $i \neq j$. Here, $\mathcal{V}$ refers to the verticies on the graph (i.e. the spatial units) and $\mathcal{E}$ refers to the edges on the graph (i.e. the spatial units that are neighbors). In other words, $\mathbf{W}$ is an $N$-by-$N$ matrix (with zeros on the diagonal) which describes spatial adjacency. | ||
|
||
Unfortunately there is no guarantee that $\mathbf{W}$ is positive definite so $\mathbf{Q} = \mbox{diag}(\mathbf{W1}) - W$ (which is guaranteed to be positive semi-definite) is used as the precision matrix. | ||
|
||
## Besag (ICAR) Spatial Prior | ||
|
||
The `stan_besag` modeling function fits the data to an ICAR model. This means that the spatial effect enters the linear predictor as | ||
|
||
$$ | ||
\begin{align} | ||
\boldsymbol{\psi} = \boldsymbol{\phi} \\ | ||
|
@@ -74,7 78,8 @@ where $\tau$ is a scalar that controls the overall spatial variation and has an | |
|
||
## BYM | ||
|
||
The ICAR model is limited in that it only accounts for spatial variation among the spatial units. Thus, the random variation is picked up by the spatial variation which results in misleading parameter estimates and invalid inferences. The `stan_bym` model explains spatial variation as the sum of a structured (spatial) component $\boldsymbol{\phi}$ and an unstructured (random) component $\boldsymbol{\theta}$. Therefore, the spatial effect takes the following form, | ||
The ICAR model is limited in that it only accounts for spatial variation among the spatial units. Thus, the random variation is picked up by the spatial variation which may result in misleading parameter estimates and invalid inferences. The `stan_bym` model explains spatial variation as the sum of a structured (spatial) component $\boldsymbol{\phi}$ and an unstructured (random) component $\boldsymbol{\theta}$. Therefore, the spatial effect takes the following form, | ||
|
||
$$ | ||
\begin{align} | ||
\boldsymbol{\psi} &= \rho\boldsymbol{\phi} \tau\boldsymbol{\theta} \\ | ||
|
@@ -83,11 88,13 @@ f(\boldsymbol{\phi} | \boldsymbol{\mu}) &= (2\pi)^{-n/2} |\mathbf{Q}|^{1/2} | |
f(\theta_i) &= (2\pi)^{-1/2}\exp{\bigg( -\frac{\theta_i^2}{2} \bigg)} | ||
\end{align} | ||
$$ | ||
|
||
Note that the unstructured effect $\boldsymbol{\theta}$ is distributed standard normal. | ||
|
||
## BYM2 (Variant of the BYM Spatial Prior) | ||
|
||
The `stan_bym2` modeling function fits the data to a variant of the BYM model where the spatial effect enters as a convolution of the structured (spatial) effect and the unstructured (random) effect, | ||
|
||
$$ | ||
\begin{align} | ||
\boldsymbol{\psi} &= \tau(\boldsymbol{\theta}\sqrt{1-\rho} \boldsymbol{\phi}\sqrt{\rho}) \\ | ||
|
@@ -97,9 104,9 @@ f(\theta_i) &= (2\pi)^{-1/2}\exp{\bigg( -\frac{\theta_i^2}{2} \bigg)} | |
\end{align} | ||
$$ | ||
|
||
As in the BYM model $\boldsymbol{\theta}$ is distributed standard normal. However, the parameter $\rho$ is on the unit interval and is interpreted as the proportion of spatial variation that is contributed to overall variation, and $\tau$ explains the overall (convolved) variation. | ||
As in the BYM model $\boldsymbol{\theta}$ is distributed standard normal. However, now the parameter $\rho$ is on the unit interval and is interpreted as the proportion of spatial variation that is contributed to overall variation, and $\tau$ explains the overall (convolved) variation. | ||
|
||
Priors on $\rho$ should be chosen wisely as $\rho=0$ reduces to a model that not account for spatial variation and $\rho=1$ reduces to the ICAR model, which does not account for random variation among the spatial units. | ||
Priors on $\rho$ should be chosen carefully as $\rho=0$ reduces to a model that not account for spatial variation and $\rho=1$ reduces to the ICAR model, which does not account for random variation among the spatial units. | ||
|
||
## Posterior | ||
|
||
|
@@ -193,19 200,28 @@ spplot(grid_sim, "y_pred", at = var_range_y_pred, main = expression(y[pred]), | |
``` | ||
|
||
Alternatively we can look at the conventional one-dimensional posterior predictive check with the `pp_check` function. | ||
|
||
```{r ppcheck-1d, fig.align='center', fig.height=8} | ||
pp_check(fit_besag) | ||
``` | ||
|
||
In order to compare the predictive performance between the models we need to use the [loo](http://mc-stan.org/loo) package. Currently, however, this is not supported for CAR spatial models. | ||
In order to compare the predictive performance between the models we need to use the [loo](http://mc-stan.org/loo) package. Looking at the results below we can confirm that the correct model outperforms the incorrect model in terms of predictive accuracy. | ||
|
||
```{r loo} | ||
library(loo) | ||
loo_correct <- loo(fit_besag) | ||
loo_incorrect <- loo(fit_besag_bad) | ||
compare(loo_correct, loo_incorrect) | ||
``` | ||
|
||
## Smoothing the Spatial Random Walk | ||
|
||
In some cases modeling the GMRF spatial component with the precision matrix $\mathbf{Q}$ leads to rough spatial varation. This often occurs when dealing with spatial units on a fine lattice. Using the square of the precision matrix $\mathbf{Q}\mathbf{Q}$ allows us to smooth out the spatial variation across the spatial units using a spatial random walk of order 2. Below we use the `order = 2` argument to fit the model with smoothing of order 2. | ||
|
||
```{r smoothing, results="hide"} | ||
fit_besag_smooth <- stan_besag(y ~ 1 x z, data = spatial_data, W, order = 2, | ||
prior_intercept = normal(0,1), prior = normal(0,1), prior_rho = normal(0,1), | ||
prior_intercept = normal(0,1), prior = normal(0,1), | ||
prior_rho = normal(0,1), | ||
family = binomial(link="logit"), trials = spatial_data$trials, | ||
chains = CHAINS, cores = CORES, seed = SEED, iter = ITER) | ||
``` | ||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is$\gamma$ actually $\sigma$ in the output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's
aux
in the stan model but I think this maps to different things depending on the likelihood (e.g. it's sigma in the gaussian case).