# Spatial Weights

IPYNB

Spatial weights are mathematical structures used to represent spatial relationships. Many spatial analytics, such as spatial autocorrelation statistics and regionalization algorithms rely on spatial weights. Generally speaking, a spatial weight $w_{i,j}$ expresses the notion of a geographical relationship between locations $i$ and $j$. These relationships can be based on a number of criteria including contiguity, geospatial distance and general distances.

PySAL offers functionality for the construction, manipulation, analysis, and conversion of a wide array of spatial weights.

We begin with construction of weights from common spatial data formats.

import pysal as ps
import numpy as np


There are functions to construct weights directly from a file path.

shp_path = "../data/texas.shp"


## Weight Types

### Contiguity:

#### Queen Weights

A commonly-used type of weight is a queen contigutiy weight, which reflects adjacency relationships as a binary indicator variable denoting whether or not a polygon shares an edge or a vertex with another polygon. These weights are symmetric, in that when polygon $A$ neighbors polygon $B$, both $w{AB} = 1$ and $w{BA} = 1$.

To construct queen weights from a shapefile, use the queen_from_shapefile function:

qW = ps.queen_from_shapefile(shp_path)

qW

<pysal.weights.weights.W at 0x104142860>


All weights objects have a few traits that you can use to work with the weights object, as well as to get information about the weights object.

To get the neighbors & weights around an observation, use the observation's index on the weights object, like a dictionary:

qW[4] #neighbors & weights of the 5th observation (0-index remember)

{0: 1.0, 3: 1.0, 5: 1.0, 6: 1.0, 7: 1.0}


By default, the weights and the pandas dataframe will use the same index. So, we can view the observation and its neighbors in the dataframe by putting the observation's index and its neighbors' indexes together in one list:

self_and_neighbors = [4]
self_and_neighbors.extend(qW.neighbors[4])
print(self_and_neighbors)

[4, 0, 3, 5, 6, 7]


and grabbing those elements from the dataframe:

dataframe.loc[self_and_neighbors]

NAME STATE_NAME STATE_FIPS CNTY_FIPS FIPS STFIPS COFIPS FIPSNO SOUTH HR60 ... BLK90 GI59 GI69 GI79 GI89 FH60 FH70 FH80 FH90 geometry
4 Ochiltree Texas 48 357 48357 48 357 48357 1 0.000000 ... 0.021911 0.236998 0.352940 0.343949 0.374461 5.172414 4.0 4.758392 9.159159 <pysal.cg.shapes.Polygon object at 0x10d91efd0>
0 Lipscomb Texas 48 295 48295 48 295 48295 1 0.000000 ... 0.031817 0.286929 0.378219 0.407005 0.373005 6.724512 4.5 3.835360 6.093580 <pysal.cg.shapes.Polygon object at 0x10d91ee10>
3 Hansford Texas 48 195 48195 48 195 48195 1 0.000000 ... 0.000000 0.253527 0.357813 0.393938 0.383924 7.591786 4.7 5.542986 7.125457 <pysal.cg.shapes.Polygon object at 0x10d91ef60>
5 Roberts Texas 48 393 48393 48 393 48393 1 0.000000 ... 0.000000 0.320275 0.318656 0.398681 0.339626 5.762712 5.3 6.231454 4.885993 <pysal.cg.shapes.Polygon object at 0x10d91b080>
6 Hemphill Texas 48 211 48211 48 211 48211 1 0.000000 ... 0.188172 0.286707 0.385605 0.352996 0.346318 6.300115 6.9 6.451613 6.330366 <pysal.cg.shapes.Polygon object at 0x10d91b0f0>
7 Hutchinson Texas 48 233 48233 48 233 48233 1 1.936915 ... 2.635369 0.199287 0.322579 0.323864 0.370207 5.387316 5.1 6.222679 8.484271 <pysal.cg.shapes.Polygon object at 0x10d91b160>

6 rows × 70 columns

A full, dense matrix describing all of the pairwise relationships is constructed using the .full method, or when pysal.full is called on a weights object:

Wmatrix, ids = qW.full()
#Wmatrix, ids = ps.full(qW)

Wmatrix

array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
[ 0.,  0.,  1., ...,  0.,  0.,  0.],
[ 0.,  1.,  0., ...,  0.,  0.,  0.],
...,
[ 0.,  0.,  0., ...,  0.,  1.,  1.],
[ 0.,  0.,  0., ...,  1.,  0.,  1.],
[ 0.,  0.,  0., ...,  1.,  1.,  0.]])

n_neighbors = Wmatrix.sum(axis=1) # how many neighbors each region has

n_neighbors[4]

5.0

qW.cardinalities[4]

5


Note that this matrix is binary, in that its elements are either zero or one, since an observation is either a neighbor or it is not a neighbor.

However, many common use cases of spatial weights require that the matrix is row-standardized. This is done simply in PySAL using the .transform attribute

qW.transform = 'r'


Now, if we build a new full matrix, its rows should sum to one:

Wmatrix, ids = qW.full()

Wmatrix.sum(axis=1) #numpy axes are 0:column, 1:row, 2:facet, into higher dimensions

array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.])


Since weight matrices are typically very sparse, there is also a sparse weights matrix constructor:

qW.sparse

<254x254 sparse matrix of type '<class 'numpy.float64'>'
with 1460 stored elements in Compressed Sparse Row format>

qW.pct_nonzero #Percentage of nonzero neighbor counts

2.263004526009052


By default, PySAL assigns each observation an index according to the order in which the observation was read in. This means that, by default, all of the observations in the weights object are indexed by table order. If you have an alternative ID variable, you can pass that into the weights constructor.

For example, the texas.shp dataset has a possible alternative ID Variable, a FIPS code.

dataframe.head()

NAME STATE_NAME STATE_FIPS CNTY_FIPS FIPS STFIPS COFIPS FIPSNO SOUTH HR60 ... BLK90 GI59 GI69 GI79 GI89 FH60 FH70 FH80 FH90 geometry
0 Lipscomb Texas 48 295 48295 48 295 48295 1 0.0 ... 0.031817 0.286929 0.378219 0.407005 0.373005 6.724512 4.5 3.835360 6.093580 <pysal.cg.shapes.Polygon object at 0x10d91ee10>
1 Sherman Texas 48 421 48421 48 421 48421 1 0.0 ... 0.139958 0.288976 0.359377 0.415453 0.378041 5.665722 1.7 3.253796 3.869407 <pysal.cg.shapes.Polygon object at 0x10d91ee80>
2 Dallam Texas 48 111 48111 48 111 48111 1 0.0 ... 2.050906 0.331667 0.385996 0.370037 0.376015 7.546049 7.2 9.471366 14.231738 <pysal.cg.shapes.Polygon object at 0x10d91eef0>
3 Hansford Texas 48 195 48195 48 195 48195 1 0.0 ... 0.000000 0.253527 0.357813 0.393938 0.383924 7.591786 4.7 5.542986 7.125457 <pysal.cg.shapes.Polygon object at 0x10d91ef60>
4 Ochiltree Texas 48 357 48357 48 357 48357 1 0.0 ... 0.021911 0.236998 0.352940 0.343949 0.374461 5.172414 4.0 4.758392 9.159159 <pysal.cg.shapes.Polygon object at 0x10d91efd0>

5 rows × 70 columns

The observation we were discussing above is in the fifth row: Ochiltree county, Texas. Note that its FIPS code is 48357.

Then, instead of indexing the weights and the dataframe just based on read-order, use the FIPS code as an index:

qW = ps.queen_from_shapefile(shp_path, idVariable='FIPS')

qW[4] #fails, since no FIPS is 4.

---------------------------------------------------------------------------

KeyError                                  Traceback (most recent call last)

<ipython-input-21-1d8a3009bc1e> in <module>()
----> 1 qW[4] #fails, since no FIPS is 4.

/Users/dani/anaconda/envs/gds-scipy16/lib/python3.5/site-packages/pysal/weights/weights.py in __getitem__(self, key)
504         {1: 1.0, 4: 1.0, 101: 1.0, 85: 1.0, 5: 1.0}
505         """
--> 506         return dict(list(zip(self.neighbors[key], self.weights[key])))
507
508     def __iter__(self):

KeyError: 4


Note that a KeyError in Python usually means that some index, here 4, was not found in the collection being searched, the IDs in the queen weights object. This makes sense, since we explicitly passed an idVariable argument, and nothing has a FIPS code of 4.

Instead, if we use the observation's FIPS code:

qW['48357']

{'48195': 1.0, '48211': 1.0, '48233': 1.0, '48295': 1.0, '48393': 1.0}


We get what we need.

In addition, we have to now query the dataframe using the FIPS code to find our neighbors. But, this is relatively easy to do, since pandas will parse the query by looking into python objects, if told to.

First, let us store the neighbors of our target county:

self_and_neighbors = ['48357']
self_and_neighbors.extend(qW.neighbors['48357'])


Then, we can use this list in .query:

dataframe.query('FIPS in @self_and_neighbors')

NAME STATE_NAME STATE_FIPS CNTY_FIPS FIPS STFIPS COFIPS FIPSNO SOUTH HR60 ... BLK90 GI59 GI69 GI79 GI89 FH60 FH70 FH80 FH90 geometry
0 Lipscomb Texas 48 295 48295 48 295 48295 1 0.000000 ... 0.031817 0.286929 0.378219 0.407005 0.373005 6.724512 4.5 3.835360 6.093580 <pysal.cg.shapes.Polygon object at 0x10d91ee10>
3 Hansford Texas 48 195 48195 48 195 48195 1 0.000000 ... 0.000000 0.253527 0.357813 0.393938 0.383924 7.591786 4.7 5.542986 7.125457 <pysal.cg.shapes.Polygon object at 0x10d91ef60>
4 Ochiltree Texas 48 357 48357 48 357 48357 1 0.000000 ... 0.021911 0.236998 0.352940 0.343949 0.374461 5.172414 4.0 4.758392 9.159159 <pysal.cg.shapes.Polygon object at 0x10d91efd0>
5 Roberts Texas 48 393 48393 48 393 48393 1 0.000000 ... 0.000000 0.320275 0.318656 0.398681 0.339626 5.762712 5.3 6.231454 4.885993 <pysal.cg.shapes.Polygon object at 0x10d91b080>
6 Hemphill Texas 48 211 48211 48 211 48211 1 0.000000 ... 0.188172 0.286707 0.385605 0.352996 0.346318 6.300115 6.9 6.451613 6.330366 <pysal.cg.shapes.Polygon object at 0x10d91b0f0>
7 Hutchinson Texas 48 233 48233 48 233 48233 1 1.936915 ... 2.635369 0.199287 0.322579 0.323864 0.370207 5.387316 5.1 6.222679 8.484271 <pysal.cg.shapes.Polygon object at 0x10d91b160>

6 rows × 70 columns

Note that we have to use @ before the name in order to show that we're referring to a python object and not a column in the dataframe.

#dataframe.query('FIPS in self_and_neighbors') will fail because there is no column called 'self_and_neighbors'


Of course, we could also reindex the dataframe to use the same index as our weights:

fips_frame = dataframe.set_index(dataframe.FIPS)

NAME STATE_NAME STATE_FIPS CNTY_FIPS FIPS STFIPS COFIPS FIPSNO SOUTH HR60 ... BLK90 GI59 GI69 GI79 GI89 FH60 FH70 FH80 FH90 geometry
FIPS
48295 Lipscomb Texas 48 295 48295 48 295 48295 1 0.0 ... 0.031817 0.286929 0.378219 0.407005 0.373005 6.724512 4.5 3.835360 6.093580 <pysal.cg.shapes.Polygon object at 0x10d91ee10>
48421 Sherman Texas 48 421 48421 48 421 48421 1 0.0 ... 0.139958 0.288976 0.359377 0.415453 0.378041 5.665722 1.7 3.253796 3.869407 <pysal.cg.shapes.Polygon object at 0x10d91ee80>
48111 Dallam Texas 48 111 48111 48 111 48111 1 0.0 ... 2.050906 0.331667 0.385996 0.370037 0.376015 7.546049 7.2 9.471366 14.231738 <pysal.cg.shapes.Polygon object at 0x10d91eef0>
48195 Hansford Texas 48 195 48195 48 195 48195 1 0.0 ... 0.000000 0.253527 0.357813 0.393938 0.383924 7.591786 4.7 5.542986 7.125457 <pysal.cg.shapes.Polygon object at 0x10d91ef60>
48357 Ochiltree Texas 48 357 48357 48 357 48357 1 0.0 ... 0.021911 0.236998 0.352940 0.343949 0.374461 5.172414 4.0 4.758392 9.159159 <pysal.cg.shapes.Polygon object at 0x10d91efd0>

5 rows × 70 columns

Now that both are using the same weights, we can use the .loc indexer again:

fips_frame.loc[self_and_neighbors]

NAME STATE_NAME STATE_FIPS CNTY_FIPS FIPS STFIPS COFIPS FIPSNO SOUTH HR60 ... BLK90 GI59 GI69 GI79 GI89 FH60 FH70 FH80 FH90 geometry
FIPS
48357 Ochiltree Texas 48 357 48357 48 357 48357 1 0.000000 ... 0.021911 0.236998 0.352940 0.343949 0.374461 5.172414 4.0 4.758392 9.159159 <pysal.cg.shapes.Polygon object at 0x10d91efd0>
48295 Lipscomb Texas 48 295 48295 48 295 48295 1 0.000000 ... 0.031817 0.286929 0.378219 0.407005 0.373005 6.724512 4.5 3.835360 6.093580 <pysal.cg.shapes.Polygon object at 0x10d91ee10>
48195 Hansford Texas 48 195 48195 48 195 48195 1 0.000000 ... 0.000000 0.253527 0.357813 0.393938 0.383924 7.591786 4.7 5.542986 7.125457 <pysal.cg.shapes.Polygon object at 0x10d91ef60>
48393 Roberts Texas 48 393 48393 48 393 48393 1 0.000000 ... 0.000000 0.320275 0.318656 0.398681 0.339626 5.762712 5.3 6.231454 4.885993 <pysal.cg.shapes.Polygon object at 0x10d91b080>
48211 Hemphill Texas 48 211 48211 48 211 48211 1 0.000000 ... 0.188172 0.286707 0.385605 0.352996 0.346318 6.300115 6.9 6.451613 6.330366 <pysal.cg.shapes.Polygon object at 0x10d91b0f0>
48233 Hutchinson Texas 48 233 48233 48 233 48233 1 1.936915 ... 2.635369 0.199287 0.322579 0.323864 0.370207 5.387316 5.1 6.222679 8.484271 <pysal.cg.shapes.Polygon object at 0x10d91b160>

6 rows × 70 columns

#### Rook Weights

Rook weights are another type of contiguity weight, but consider observations as neighboring only when they share an edge. The rook neighbors of an observation may be different than its queen neighbors, depending on how the observation and its nearby polygons are configured.

We can construct this in the same way as the queen weights, using the special rook_from_shapefile function:

rW = ps.rook_from_shapefile(shp_path, idVariable='FIPS')

rW['48357']

{'48195': 1.0, '48295': 1.0, '48393': 1.0}


These weights function exactly like the Queen weights, and are only distinguished by what they consider "neighbors."

self_and_neighbors = ['48357']
self_and_neighbors.extend(rW.neighbors['48357'])
fips_frame.loc[self_and_neighbors]

NAME STATE_NAME STATE_FIPS CNTY_FIPS FIPS STFIPS COFIPS FIPSNO SOUTH HR60 ... BLK90 GI59 GI69 GI79 GI89 FH60 FH70 FH80 FH90 geometry
FIPS
48357 Ochiltree Texas 48 357 48357 48 357 48357 1 0.0 ... 0.021911 0.236998 0.352940 0.343949 0.374461 5.172414 4.0 4.758392 9.159159 <pysal.cg.shapes.Polygon object at 0x10d91efd0>
48295 Lipscomb Texas 48 295 48295 48 295 48295 1 0.0 ... 0.031817 0.286929 0.378219 0.407005 0.373005 6.724512 4.5 3.835360 6.093580 <pysal.cg.shapes.Polygon object at 0x10d91ee10>
48195 Hansford Texas 48 195 48195 48 195 48195 1 0.0 ... 0.000000 0.253527 0.357813 0.393938 0.383924 7.591786 4.7 5.542986 7.125457 <pysal.cg.shapes.Polygon object at 0x10d91ef60>
48393 Roberts Texas 48 393 48393 48 393 48393 1 0.0 ... 0.000000 0.320275 0.318656 0.398681 0.339626 5.762712 5.3 6.231454 4.885993 <pysal.cg.shapes.Polygon object at 0x10d91b080>

4 rows × 70 columns

#### Bishop Weights

In theory, a "Bishop" weighting scheme is one that arises when only polygons that share vertexes are considered to be neighboring. But, since Queen contiguigy requires either an edge or a vertex and Rook contiguity requires only shared edges, the following relationship is true:

$\mathcal{Q} = \mathcal{R} \cup \mathcal{B}$

where $\mathcal{Q}$ is the set of neighbor pairs via queen contiguity, $\mathcal{R}$ is the set of neighbor pairs via Rook contiguity, and $\mathcal{B}$ via Bishop contiguity. Thus:

$\mathcal{Q} \setminus \mathcal{R} = \mathcal{B}$

Bishop weights entail all Queen neighbor pairs that are not also Rook neighbors.

PySAL does not have a dedicated bishop weights constructor, but you can construct very easily using the w_difference function. This function is one of a family of tools to work with weights, all defined in ps.weights, that conduct these types of set operations between weight objects.

bW = ps.w_difference(qW, rW, constrained=False, silent_island_warning=True) #silence because there will be a lot of warnings

bW.histogram

[(0, 161), (1, 48), (2, 33), (3, 8), (4, 4)]


Thus, the vast majority of counties have no bishop neighbors. But, a few do. A simple way to see these observations in the dataframe is to find all elements of the dataframe that are not "islands," the term for an observation with no neighbors:

islands = bW.islands

# Using .head() to limit the number of rows printed
dataframe.query('FIPS not in @islands').head()

NAME STATE_NAME STATE_FIPS CNTY_FIPS FIPS STFIPS COFIPS FIPSNO SOUTH HR60 ... BLK90 GI59 GI69 GI79 GI89 FH60 FH70 FH80 FH90 geometry
0 Lipscomb Texas 48 295 48295 48 295 48295 1 0.0 ... 0.031817 0.286929 0.378219 0.407005 0.373005 6.724512 4.5 3.835360 6.093580 <pysal.cg.shapes.Polygon object at 0x10d91ee10>
1 Sherman Texas 48 421 48421 48 421 48421 1 0.0 ... 0.139958 0.288976 0.359377 0.415453 0.378041 5.665722 1.7 3.253796 3.869407 <pysal.cg.shapes.Polygon object at 0x10d91ee80>
2 Dallam Texas 48 111 48111 48 111 48111 1 0.0 ... 2.050906 0.331667 0.385996 0.370037 0.376015 7.546049 7.2 9.471366 14.231738 <pysal.cg.shapes.Polygon object at 0x10d91eef0>
3 Hansford Texas 48 195 48195 48 195 48195 1 0.0 ... 0.000000 0.253527 0.357813 0.393938 0.383924 7.591786 4.7 5.542986 7.125457 <pysal.cg.shapes.Polygon object at 0x10d91ef60>
4 Ochiltree Texas 48 357 48357 48 357 48357 1 0.0 ... 0.021911 0.236998 0.352940 0.343949 0.374461 5.172414 4.0 4.758392 9.159159 <pysal.cg.shapes.Polygon object at 0x10d91efd0>

5 rows × 70 columns

## Distance

There are many other kinds of weighting functions in PySAL. Another separate type use a continuous measure of distance to define neighborhoods.

radius = ps.cg.sphere.RADIUS_EARTH_MILES

3958.755865744055

#ps.min_threshold_dist_from_shapefile?

threshold = ps.min_threshold_dist_from_shapefile('../data/texas.shp',radius) # now in miles, maximum nearest neighbor distance between the n observations

threshold

60.47758554135752


### knn defined weights

knn4_bad = ps.knnW_from_shapefile('../data/texas.shp', k=4) # ignore curvature of the earth

knn4_bad.histogram

[(4, 254)]

knn4 = ps.knnW_from_shapefile('../data/texas.shp', k=4, radius=radius)

knn4.histogram

[(4, 254)]

knn4[0]

{3: 1.0, 4: 1.0, 5: 1.0, 6: 1.0}

knn4_bad[0]

{4: 1.0, 5: 1.0, 6: 1.0, 13: 1.0}


#### Kernel W

Kernel Weights are continuous distance-based weights that use kernel densities to define the neighbor relationship. Typically, they estimate a bandwidth, which is a parameter governing how far out observations should be considered neighboring. Then, using this bandwidth, they evaluate a continuous kernel function to provide a weight between 0 and 1.

Many different choices of kernel functions are supported, and bandwidths can either be fixed (constant over all units) or adaptive in function of unit density.

For example, if we want to use adaptive bandwidths for the map and weight according to a gaussian kernel:

kernelWa = ps.adaptive_kernelW_from_shapefile('../data/texas.shp', radius=radius)
kernelWa

<pysal.weights.Distance.Kernel at 0x7f8fe4cfe080>

dataframe.loc[kernelWa.neighbors[4] + [4]]

NAME STATE_NAME STATE_FIPS CNTY_FIPS FIPS STFIPS COFIPS FIPSNO SOUTH HR60 ... BLK90 GI59 GI69 GI79 GI89 FH60 FH70 FH80 FH90 geometry
4 Ochiltree Texas 48 357 48357 48 357 48357 1 0.0 ... 0.021911 0.236998 0.352940 0.343949 0.374461 5.172414 4.0 4.758392 9.159159 <pysal.cg.shapes.Polygon object at 0x7f8fe4ee1...
5 Roberts Texas 48 393 48393 48 393 48393 1 0.0 ... 0.000000 0.320275 0.318656 0.398681 0.339626 5.762712 5.3 6.231454 4.885993 <pysal.cg.shapes.Polygon object at 0x7f8fe4ee1...
3 Hansford Texas 48 195 48195 48 195 48195 1 0.0 ... 0.000000 0.253527 0.357813 0.393938 0.383924 7.591786 4.7 5.542986 7.125457 <pysal.cg.shapes.Polygon object at 0x7f8fe4ee1...
4 Ochiltree Texas 48 357 48357 48 357 48357 1 0.0 ... 0.021911 0.236998 0.352940 0.343949 0.374461 5.172414 4.0 4.758392 9.159159 <pysal.cg.shapes.Polygon object at 0x7f8fe4ee1...

4 rows × 70 columns

kernelWa.bandwidth[0:7]

array([[ 30.30546757],
[ 30.05684855],
[ 39.14876899],
[ 29.96302462],
[ 29.96302462],
[ 30.21084447],
[ 30.23619029]])

kernelWa[4]

{3: 9.99999900663795e-08, 4: 1.0, 5: 0.002299013803371608}

kernelWa[2]

{1: 9.99999900663795e-08, 2: 1.0, 8: 0.23409571720488287}


## Distance Thresholds

#ps.min_threshold_dist_from_shapefile?

# find the largest nearest neighbor distance between centroids
threshold = ps.min_threshold_dist_from_shapefile('../data/texas.shp', radius=radius) # decimal degrees

WARNING: there are 2 disconnected observations
Island ids:  [133, 181]

Wmind0.histogram

[(0, 2),
(1, 3),
(2, 5),
(3, 4),
(4, 10),
(5, 26),
(6, 16),
(7, 31),
(8, 70),
(9, 32),
(10, 29),
(11, 12),
(12, 5),
(13, 2),
(14, 5),
(15, 2)]

Wmind = ps.threshold_binaryW_from_shapefile('../data/texas.shp', radius=radius, threshold=threshold)

Wmind.histogram

[(1, 2),
(2, 3),
(3, 4),
(4, 8),
(5, 5),
(6, 20),
(7, 26),
(8, 9),
(9, 32),
(10, 31),
(11, 37),
(12, 33),
(13, 23),
(14, 6),
(15, 7),
(16, 2),
(17, 4),
(18, 2)]

centroids = np.array([list(poly.centroid) for poly in dataframe.geometry])

centroids[0:10]

array([[-100.27156111,   36.27508641],
[-101.8930971 ,   36.27325425],
[-102.59590795,   36.27354996],
[-101.35351324,   36.27230422],
[-100.81561379,   36.27317803],
[-100.81482387,   35.8405153 ],
[-100.2694824 ,   35.83996075],
[-101.35420366,   35.8408377 ],
[-102.59375964,   35.83958662],
[-101.89248229,   35.84058246]])

Wmind[0]

{3: 1, 4: 1, 5: 1, 6: 1, 13: 1}

knn4[0]

{3: 1.0, 4: 1.0, 5: 1.0, 6: 1.0}


## Visualization

%matplotlib inline
import matplotlib.pyplot as plt
from pylab import figure, scatter, show

wq = ps.queen_from_shapefile('../data/texas.shp')

wq[0]

{4: 1.0, 5: 1.0, 6: 1.0}

fig = figure(figsize=(9,9))
plt.plot(centroids[:,0], centroids[:,1],'.')
plt.ylim([25,37])
show()


wq.neighbors[0]

[4, 5, 6]

from pylab import figure, scatter, show
fig = figure(figsize=(9,9))

plt.plot(centroids[:,0], centroids[:,1],'.')
#plt.plot(s04[:,0], s04[:,1], '-')
plt.ylim([25,37])
for k,neighs in wq.neighbors.items():
#print(k,neighs)
origin = centroids[k]
for neigh in neighs:
segment = centroids[[k,neigh]]
plt.plot(segment[:,0], segment[:,1], '-')
plt.title('Queen Neighbor Graph')
show()


wr = ps.rook_from_shapefile('../data/texas.shp')

fig = figure(figsize=(9,9))

plt.plot(centroids[:,0], centroids[:,1],'.')
#plt.plot(s04[:,0], s04[:,1], '-')
plt.ylim([25,37])
for k,neighs in wr.neighbors.items():
#print(k,neighs)
origin = centroids[k]
for neigh in neighs:
segment = centroids[[k,neigh]]
plt.plot(segment[:,0], segment[:,1], '-')
plt.title('Rook Neighbor Graph')
show()


fig = figure(figsize=(9,9))
plt.plot(centroids[:,0], centroids[:,1],'.')
#plt.plot(s04[:,0], s04[:,1], '-')
plt.ylim([25,37])
for k,neighs in Wmind.neighbors.items():
origin = centroids[k]
for neigh in neighs:
segment = centroids[[k,neigh]]
plt.plot(segment[:,0], segment[:,1], '-')
plt.title('Minimum Distance Threshold Neighbor Graph')
show()


Wmind.pct_nonzero

3.8378076756153514

wr.pct_nonzero

2.0243040486080974

wq.pct_nonzero

2.263004526009052


## Exercise

1. Answer this question before writing any code: What spatial weights structure would be more dense, Texas counties based on rook contiguity or Texas counties based on knn with k=4?
2. Why?
3. Write code to see if you are correct.