Spatial Data, Analysis, and Regression - III
A mini-course
Second block
Spatial Regression
Morning:
- Motivation
- Specification
- Spatial dependence
- Spatial heterogeneity
Afternoon:
- Diagnostics
- Estimation
- Software implementation
Motivation
Explicit introduction of spatial effects in an econometric framework
Theory-driven
Space (by itself or as a proxy of something else) is relevant in many conceptual frameworks:
- Spatial externalities/spill-overs
- Spatial competition (e.g. spatial reaction functions, policy competition)
Data-driven
Even when models ignore it, the real world occurs in space, and this (sometimes) creates problems:
- Modifiable Areal Unit Problem (MAUP)
- Scale issues and boundary mismatch
Some of these violate the classical OLS assumptions
Model specification
- Spatial dependence vs. spatial heterogeneity
- Deviations from the traditional linear model:
Y = α + Xβ + ε
Spatial heterogeneity
- Account for systematic differences across space without relying on interdependences
- Typically justified by unobservables that have a clear spatial dimension
- Econometrically "simpler"
Spatial fixed effects
Y = α_r + Xβ + ε
- Level differences in the outcome Y due to location
- The regions r need to be defined ex ante
- Non-spatial estimation
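A minimal sketch of the fixed-effects specification on simulated data: the single constant is replaced by one dummy per region and the model is fit by plain OLS. The region labels, sample size, and parameter values are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n, n_regions = 300, 3
region = rng.integers(0, n_regions, size=n)   # hypothetical region label of each observation
x = rng.normal(size=n)

alpha_true = np.array([1.0, 3.0, -2.0])       # assumed region-specific intercepts (alpha_r)
beta_true = 0.5                               # assumed common slope
y = alpha_true[region] + beta_true * x + rng.normal(scale=0.1, size=n)

# One dummy column per region takes the place of the single constant.
D = np.eye(n_regions)[region]
Z = np.column_stack([D, x])

# Ordinary least squares: no spatial machinery is needed.
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
alpha_hat, beta_hat = coef[:n_regions], coef[n_regions]
print("alpha_r:", alpha_hat, "beta:", beta_hat)
```

The estimated intercepts differ by region while the slope is shared, which is the "level differences" reading of the model.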
Spatial regimes
Y = α_r + Xβ_r + ε
- Level differences plus location-specific effects of the exogenous variables on the outcome Y
- The regimes r need to be defined ex ante
- Non-spatial estimation
- Similar to running separate regressions by regime ("complete no pooling"), but allows for testing of the differences with a (spatial) Chow test
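The comparison between pooled and regime-specific fits can be sketched as follows. This uses the classical Chow F statistic, which assumes non-spatial errors; the spatial version of the test is not implemented here. The two regimes and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 400
regime = (np.arange(n) >= n // 2).astype(int)   # two hypothetical regimes, defined ex ante
x = rng.normal(size=n)

alpha_true = np.array([1.0, 2.0])               # assumed regime-specific intercepts
beta_true = np.array([0.5, 1.5])                # assumed regime-specific slopes
y = alpha_true[regime] + beta_true[regime] * x + rng.normal(scale=0.5, size=n)


def rss(Z, y):
    """Residual sum of squares of an OLS fit of y on Z."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    return float(resid @ resid)


# Restricted model: one intercept and one slope for everyone.
Z_pooled = np.column_stack([np.ones(n), x])

# Unrestricted model: intercept and slope interacted with the regime dummies.
D = np.eye(2)[regime]
Z_regimes = np.column_stack([D, D * x[:, None]])

k = Z_pooled.shape[1]                           # parameters per regime
rss_r, rss_u = rss(Z_pooled, y), rss(Z_regimes, y)

# Classical Chow statistic for two regimes: F(k, n - 2k) under the null of equal coefficients.
chow_F = ((rss_r - rss_u) / k) / (rss_u / (n - 2 * k))
print("Chow F:", chow_F)
```

A large statistic speaks against pooling the regimes; the interacted design matrix gives the same point estimates as separate regressions by regime.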
Spatial dependence
- Model interdependencies between observations channeled through space
- Also a potential way to account for the MAUP (spatial smoothing)
- Econometrically more involved because, most of the time, it violates several OLS assumptions
Exogenous spatial effects
Main equation in matrix form:
Y = Xβ + WXγ + ε
- Limited spatial extent: after one order of neighbors, their effect disappears (≈ local externalities)
- Akin to one more exogenous variable → ignoring it amounts to an omitted-variable problem (bias and loss of efficiency)
- Straightforward estimation (OLS) because they are exogenous
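As an illustration, the spatially lagged regressor WX can be built by a matrix product and added as an extra column in an OLS fit. The weights matrix below is a toy assumption (units on a ring, two neighbors each, row-standardised), and the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Hypothetical W: each unit's neighbors are the units to its left and right on a ring.
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx - 1) % n] = 0.5
W[idx, (idx + 1) % n] = 0.5

x = rng.normal(size=n)
beta_true, gamma_true = 1.0, 0.6                # assumed values for the simulation
y = beta_true * x + gamma_true * (W @ x) + rng.normal(scale=0.2, size=n)

# WX is the average of x over each unit's neighbors; it enters as one more regressor.
Z = np.column_stack([np.ones(n), x, W @ x])
const_hat, beta_hat, gamma_hat = np.linalg.lstsq(Z, y, rcond=None)[0]
print("beta:", beta_hat, "gamma:", gamma_hat)
```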
Spatial lag model (AR)
Y = ρWY + Xβ + ε
Endogenous variable is spatially lagged and included as one more explanatory variable
Captures global spatial effects (spatial multiplier):
Y = (I − ρW)^{-1} Xβ + (I − ρW)^{-1} ε
↓
(I − ρW)^{-1} Xβ = Xβ + ρWXβ + ρ^2 W^2 Xβ + …
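The series expansion of the spatial multiplier can be checked numerically. The sketch below assumes a toy row-standardised W on a six-unit ring and |ρ| < 1, so that the series converges.

```python
import numpy as np

n, rho = 6, 0.4

# Hypothetical W: six units on a ring, two neighbors each, row-standardised.
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx - 1) % n] = 0.5
W[idx, (idx + 1) % n] = 0.5

# Spatial multiplier (I - rho W)^{-1}.
multiplier = np.linalg.inv(np.eye(n) - rho * W)

# Partial sum I + rho W + rho^2 W^2 + ... up to 50 terms.
series = np.zeros((n, n))
term = np.eye(n)
for _ in range(50):
    series += term
    term = rho * W @ term

max_gap = np.abs(multiplier - series).max()
print("largest discrepancy:", max_gap)
print("smallest multiplier entry:", multiplier.min())
```

Every entry of the multiplier is positive here: a change in X at one location reaches all other locations, with a weight that shrinks as higher powers of ρW are needed to connect them.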
Spatial lag model (AR)
Two main rationales behind its adoption:
- Theory-driven: compatible with spatial interaction and reaction functions
- Data-driven: spatial filter to deal with scale problems
- Its effect is interpreted as the outcome of a simultaneous system, not as a direct causal effect → model interdependent decisions
Omitting it induces bias and inefficiency and, because WY is endogenous, estimation requires particular techniques (e.g. ML, IV)
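One way to sketch the IV route is spatial two-stage least squares, with WX and W²X as instruments for WY. That instrument set is a common choice but an assumption here, as are the toy ring-shaped W and all simulated values.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

# Hypothetical W: units on a ring, two neighbors each, row-standardised.
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx - 1) % n] = 0.5
W[idx, (idx + 1) % n] = 0.5

rho_true, beta_true = 0.5, 1.0                   # assumed values for the simulation
x = rng.normal(size=n)
eps = rng.normal(scale=0.5, size=n)

# Simulate from the reduced form Y = (I - rho W)^{-1} (X beta + eps).
y = np.linalg.solve(np.eye(n) - rho_true * W, beta_true * x + eps)

Wx = W @ x
Z = np.column_stack([W @ y, np.ones(n), x])      # regressors; WY is endogenous
H = np.column_stack([np.ones(n), x, Wx, W @ Wx]) # instruments: constant, X, WX, W^2 X

# OLS on the structural equation, for comparison only.
rho_ols = np.linalg.lstsq(Z, y, rcond=None)[0][0]

# 2SLS: project the regressors on the instruments, then solve the IV normal equations.
Z_hat = H @ np.linalg.lstsq(H, Z, rcond=None)[0]
rho_2sls, const_2sls, beta_2sls = np.linalg.solve(Z_hat.T @ Z, Z_hat.T @ y)
print("rho (OLS):", rho_ols, "rho (2SLS):", rho_2sls, "beta (2SLS):", beta_2sls)
```

The instruments come from the multiplier expansion itself: WY depends on WX, W²X, and so on, while those terms are unrelated to ε when X is exogenous.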
Spatial error model
- Spatial effects in unmodelled shocks (uncorrelated with X)
- Off-diagonal elements of the variance-covariance (VC) matrix of the errors are non-zero and follow a spatial pattern
- Efficiency problem: OLS estimates remain "on target" (unbiased), but their precision is damaged
Spatial error model
Y = Xβ + u
u = λWu + ε
- Global effects: u = (I − λW)^{-1} ε
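To see why these effects are called global, the implied error covariance (I − λW)^{-1} (I − λW')^{-1} can be computed for a small example. The sketch assumes Var(ε) = I and a toy W on an eight-unit ring.

```python
import numpy as np

n, lam = 8, 0.5

# Hypothetical W: eight units on a ring, two neighbors each, row-standardised.
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx - 1) % n] = 0.5
W[idx, (idx + 1) % n] = 0.5

# u = (I - lam W)^{-1} eps, so Var(u) = A A' with A = (I - lam W)^{-1} when Var(eps) = I.
A = np.linalg.inv(np.eye(n) - lam * W)
cov_u = A @ A.T

# Share of entries of Var(u) that are non-zero.
share_nonzero = np.mean(np.abs(cov_u) > 1e-12)
print("share of non-zero covariances:", share_nonzero)
print("covariance of unit 0 with units 0..4:", cov_u[0, :5])
```

Every pair of locations has correlated errors, however far apart they are.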
Spatial error model
Y = Xβ + u
u = λWε + ε
- Local effects: after two orders of neighbors, the effect washes away
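The local reach of this specification shows up in the implied covariance (I + λW)(I + λW)', which has no inverse in it. The sketch again assumes Var(ε) = I and a toy W on an eight-unit ring.

```python
import numpy as np

n, lam = 8, 0.5

# Hypothetical W: eight units on a ring, two neighbors each, row-standardised.
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx - 1) % n] = 0.5
W[idx, (idx + 1) % n] = 0.5

# u = (I + lam W) eps, so Var(u) = B B' with B = I + lam W when Var(eps) = I.
B = np.eye(n) + lam * W
cov_u = B @ B.T

# Covariance of unit 0 with units at ring distance 0, 1, 2, 3 and 4.
print(cov_u[0, :5])
```

Covariances are non-zero for first- and second-order neighbors only; units three or more steps apart have uncorrelated errors.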