Basic GWR Model

For a data set of \(n\) samples and \(p\) independent variables, the basic GWR model at sample \(i\) is defined as

\[y_i = \beta_{i0} + \sum_{k=1}^p \beta_{ik}x_{ik} + \epsilon_{i}\]

where \(y_i\) is the dependent variable, \(x_{ik}\) is the \(k\)-th independent variable, \(\beta_{ik}\) is the \(k\)-th coefficient, \(\beta_{i0}\) is the intercept, \(\epsilon_i\) is the random error which \(\epsilon_i \sim N(0, \sigma^2)\) and \(\sigma\) is the standard deviation. Then \(\beta_i\) is calibrated by the following estimator

(1)\[\beta_i = \left(X' W X \right)^{-1} X' W y\]

where \(\beta_i=(\beta_{i0},\beta_{i1},\cdots,\beta_{ip})'\) is the vector of coefficients, \(X\) is the matrix of independent variables, \(y\) is the vector of the dependent variable, \(W\) is called spatial weighting matrix defined by

\[\begin{split}W = \begin{pmatrix} w_{i1} & & & \\ & w_{i2} & & \\ & & \ddots & \\ & & & w_{in} \\ \end{pmatrix}\end{split}\]

Each \(w_{ij}\) in \(W\) is calculated by a kernel function \(k\) according to the distance from sample \(i\) to sample \(j\). Larger distance, lower weights.

Kernel functions

There are some useful kernel functions:

Gaussian: \[k(d;b) = \exp\left\{- \frac{d^2}{2 b^2}\right\}\]
Exponential: \[k(d;b) = \exp\left\{- \frac{|d|}{b}\right\}\]
Bi-squared: \[\begin{split}k(d;b) = \left\{ \begin{array}{ll} \left[ 1 - \left( \frac{d}{b} \right)^2 \right]^2, & \mbox{if } d < b \\ 0, & \mbox{otherwise} \end{array} \right.\end{split}\]
Tri-cube: \[\begin{split}k(d;b) = \left\{ \begin{array}{ll} \left[ 1 - \left( \frac{d}{b} \right)^3 \right]^3, & \mbox{if } d < b \\ 0, & \mbox{otherwise} \end{array} \right.\end{split}\]
Box-car: \[\begin{split}k(d;b) = \left\{ \begin{array}{ll} 1, & \mbox{if } d < b \\ 0, & \mbox{otherwise} \end{array} \right.\end{split}\]

The parameter \(b\) is called “bandwidth”. Its value is usually automatically optimized by the golden-selection algorithm from data according to some criterions. Usually, the following criterions are supported,

Cross-validation (CV)

For given bandwidth \(b\), the CV value is defined by

\[CV(b) = \sum_{i=1}^n \left( y - x_i \hat{\beta}_{-i} \right)^2\]

where \(x_i\) is the \(i\)-th row of \(X\), and \(\hat{\beta}_{-i}\) is the coefficient vector estimated without sample \(i\). It is also calibrated according to (1) but set \(w_{ii} = 0\).

Corrected Akaike Information Criterion (AIC:sub:c)

For given bandwidth \(b\), the AIC value is defined by

\[AIC(b) = 2n \ln \hat{\sigma} + n \ln 2pi + n \left\{ \frac{n+tr(S)}{n - 2 - tr(S)} \right\}\]

where \(\hat{\sigma}\) is the estimated deviation of random error, \(S\) is called the “hat matrix” which is defined by

\[\begin{split}S = \begin{pmatrix} x_1 (X'W_1X)^{-1}X'W_1 \\ x_2 (X'W_2X)^{-1}X'W_2 \\ \vdots \\ x_n (X'W_nX)^{-1}X'W_n \end{pmatrix}\end{split}\]

and it works like

\[\hat{y} = Sy\]

Distance Metrics

Not only euclidean distance but also any kinds of distance metrics can be applied in GWR. Currently, there are two kinds of distance metrics supported.

CRS Distance

Distance as the crow flies is calculated according to the type of coordinate reference system (CRS). When the CRS is projected, for two samples at \((u_i,v_i)\) and \((u_j,v_j)\),

\[d_{ij} = \sqrt{ (u_i - u_j)^2 + (v_i - v_j)^2 }\]

When the CRS is geographic, their great circle distance is calculated.

Minkwoski Distance

This metric is only applicable when the CRS is projected. It is defined by

\[d_{ij} = \sqrt[p]{ |u_i - u_j|^p + |v_i - v_j|^p }\]

In the future, we will support to set distances by a matrix file.

Example

To calibrate a basic GWR model, use gwm::GWRBasic.

Basic Usage

#include <armadillo>
using namespace arma;

mat coords = randr(100, 2, distr_param(0, 25));
mat x = join_rows(ones(100, 1), randn(100, 2));
mat beta = join_rows(
    ones(100) * 3.0,
    1.0 + (coords.col(0) + coords.col(1)) / 12.0,
    1.0 + (36.0 - (6.0 - coords.col(0) / 2)) % (36.0 - (6.0 - coords.col(1) / 2)) / 324
);
vec y = sum(x % beta, 1);

CGwmCRSDistance distance(false);
CGwmBandwidthWeight bandwidth(25, true, CGwmBandwidthWeight::Gaussian);
CGwmSpatialWeight spatial(&bandwidth, &distance);

GWRBasic algorithm;
algorithm.setCoords(coords);
algorithm.setDependentVariable(y);
algorithm.setIndependentVariables(x);
algorithm.setSpatialWeight(spatial);
mat beta_hat = algorithm.fit();

Bandwidth Optimization

If you are not confident about the bandwidth value, you can also let the algorithm optimize it by making the following changes:

GWRBasic algorithm;
algorithm.setCoords(coords);
algorithm.setDependentVariable(y);
algorithm.setIndependentVariables(x);
algorithm.setSpatialWeight(spatial);
algorithm.setIsAutoselectBandwidth(true);
algorithm.setBandwidthSelectionCriterion(GWRBasic::BandwidthSelectionCriterionType::AIC);
mat beta_hat = algorithm.fit();

The argument passing to gwm::GWRBasic::setBandwidthSelectionCriterion() can be either value of gwm::GWRBasic::BandwidthSelectionCriterionType.

Independent Variable Optimization

If you do not want to include all independent variables into the model and just include some significant variables, you can let the algorithm optimize variables by making the following changes:

GWRBasic algorithm;
algorithm.setCoords(coords);
algorithm.setDependentVariable(y);
algorithm.setIndependentVariables(x);
algorithm.setSpatialWeight(spatial);
algorithm.setIsAutoselectIndepVars(true);
algorithm.setIndepVarSelectionThreshold(3.0);
mat beta_hat = algorithm.fit();

The argument passing to gwm::GWRBasic::setIndepVarSelectionThreshold() is the threshold of AIC change determining whether one model is significantly different from another. Generally speaking, the size of this value depends on the number of samples. Data set of larger number of samples may need a larger threshold.