tikreg is a Python package that efficiently implements Tikhonov regression.
Tikhonov regression gives us a framework to estimate spatiotemporal encoding models with non-spherical multivariate normal priors. This framework is useful to model biological signals. This package was developed to analyze brain data collected using functional magnetic resonance imaging (fMRI). tikreg
can also be used to model other neuroimaging signals (e.g. 2P, ECoG, etc) and LTI signals more generally.
- Useful when building large joint models that combine multiple feature spaces
- Transfer function estimation via regularized FIR models
- Efficient implementation of ridge regression for multiple outputs
- Dual and primal solutions for ridge regression
Clone the repo from GitHub and do the usual python install from the command line
$ git clone https://github.com/gallantlab/tikreg.git
$ cd tikreg
$ sudo python setup.py install
Or with pip:
$ pip install tikreg
The following code performs 5-fold cross-validated ridge regression.
from tikreg import models
from tikreg import utils as tikutils
# Generate synthetic data
weights_true, (Xtrain, Xtest), (Ytrain, Ytest) = tikutils.generate_data(noise=1, testsize=100)
# Specify fit
options = dict(ridges=np.logspace(0,3,11), weights=True, metric='rsquared')
fit = models.cvridge(Xtrain, Ytrain, Xtest, Ytest, **options)
# Evaluate results
weights_estimate = fit['weights']
weights_corr = tikutils.columnwise_correlation(weights_true, weights_estimate)
print(weights_corr.mean()) # > 0.9
print(fit['cvresults'].shape) # (5, 1, 11, 2): (nfolds, 1, nridges, nresponses)
By default, the optimal ridge regularization parameter is found across the population of responses. We can specifiy that the optimal regularization parameter be found for each individual response (population_optimum=False
).
options = dict(ridges=np.logspace(0,3,11), performance=True, predictions=True, weights=True, metric='rsquared')
fit = models.cvridge(Xtrain, Ytrain, Xtest, Ytest, population_optimum=False, **options)
print(fit.keys())
The model performance (performance=True
) and the test set predictions (predictions=True
) are also computed in the example above.
Conveniently, tikreg.models.cvridge()
will choose an efficient method to fit the ridge regression model automatically. Whenever the number of features is greater than the number of samples (P > N), the fit will be performed using kernel ridge regression.
# P >> N
nfeatures, ntimepoints = 1000, 100
weights_true, (Xtrain, Xtest), (Ytrain, Ytest) = tikutils.generate_data(n=ntimepoints, p=nfeatures, testsize=100)
# Model is automatically fit using kernel ridge
options = dict(ridges=np.logspace(0,3,11), weights=True, metric='rsquared')
fit = models.cvridge(Xtrain, Ytrain, Xtest, Ytest, **options)
# The weights are in the feature space
weights_estimate = fit['weights']
print(weights_estimate.shape) # (nfeatures, nresponses)
https://gallantlab.github.io/tikreg/index.html
When estimating a joint encoding model that consists of two feature spaces, banded ridge regression can be used to fit the model and assign each feature space a different regularization parameter. A banded ridge regression model with two multi-dimensional feature spaces (X1 and X2) can be expressed as
where the weights for each feature space are assumed to be indpendently distributed as multivariate normal with different variance (see Figure 5 in Nunez-Elizalde, et al., 2019). That is:
Estimating this model is computational expensive, requiring cross-validating two regularization parameters for every voxel ( and )
For more information:
-
Technical description of banded ridge: View notebook or Launch Google Colab notebook
-
Banded ridge regression tutorial: View notebook or Launch Google Colab notebook
Nunez-Elizalde AO, Huth AG, and Gallant, JL (2019). Voxelwise encoding models with non-spherical multivariate normal priors. NeuroImage. https://doi.org/10.1016/j.neuroimage.2019.04.012