Prepare LSOA/MSOA table for Liverpool
We need the following two datasets:
LSOAs originally downloaded from the CDRC data store (original link).
LSOA to MSOA crosswalk from ONS.
The LSOAs come from the IMD package published by the CDRC. The dataset is most easily downloaded from the CDRC data store (link) and, since it already comes in both tabular and spatial (shapefile) formats, it does not need merging or joining to additional geometries.
In addition, we will use the lookup between LSOAs and Middle Layer Super Output Areas (MSOAs), which can be downloaded from this link. It connects each LSOA polygon to the MSOA it belongs to. MSOAs are a coarser geographic delineation from the Office for National Statistics (ONS), within which LSOAs are nested. That is, no LSOA crosses an MSOA boundary.
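The nesting property can be checked from a lookup table: if LSOAs nest within MSOAs, every LSOA code is paired with exactly one MSOA code. The following is a minimal sketch on a made-up lookup; the `toy_cw` name and the codes in it are illustrative and do not come from the ONS file.

```python
import pandas

# Hypothetical toy lookup at Output Area level; all codes are made up
toy_cw = pandas.DataFrame(
    {
        "OA11CD": ["OA1", "OA2", "OA3", "OA4"],
        "LSOA11CD": ["L1", "L1", "L2", "L3"],
        "MSOA11CD": ["M1", "M1", "M1", "M2"],
    }
)

# Number of distinct MSOAs each LSOA is paired with
msoas_per_lsoa = toy_cw.groupby("LSOA11CD")["MSOA11CD"].nunique()

# Nesting holds when every LSOA maps to one, and only one, MSOA
is_nested = bool((msoas_per_lsoa == 1).all())
print(is_nested)
```

The same two lines could be run on the real lookup once it is loaded, replacing `toy_cw` with the table read from the ONS file.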
import pandas
import geopandas
We read the LSOAs
lsoas = geopandas.read_file("../../E08000012_IMD/shapefiles/E08000012.shp")
lsoas.info()
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 298 entries, 0 to 297
Data columns (total 13 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 LSOA11CD 298 non-null object
1 imd_rank 298 non-null int64
2 imd_score 298 non-null float64
3 income 298 non-null float64
4 employment 298 non-null float64
5 education 298 non-null float64
6 health 298 non-null float64
7 crime 298 non-null float64
8 housing 298 non-null float64
9 living_env 298 non-null float64
10 idaci 298 non-null float64
11 idaopi 298 non-null float64
12 geometry 298 non-null geometry
dtypes: float64(10), geometry(1), int64(1), object(1)
memory usage: 30.4+ KB
We also need the crosswalk between LSOA and MSOA
cw = pandas.read_csv(
    "../../E08000012_IMD/OA11_LSOA11_MSOA11_LAD11_EW_LUv2.csv",
    encoding="iso-8859-1",
)
cw.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 181408 entries, 0 to 181407
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 OA11CD 181408 non-null object
1 LSOA11CD 181408 non-null object
2 LSOA11NM 181408 non-null object
3 MSOA11CD 181408 non-null object
4 MSOA11NM 181408 non-null object
5 LAD11CD 181408 non-null object
6 LAD11NM 181408 non-null object
7 LAD11NMW 10036 non-null object
dtypes: object(8)
memory usage: 11.1+ MB
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3072: DtypeWarning: Columns (7) have mixed types.Specify dtype option on import or set low_memory=False.
interactivity=interactivity, compiler=compiler, result=result)
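The `DtypeWarning` above points at column 7 (`LAD11NMW`), which is empty for most rows. One possible way to avoid the type guessing, assuming every column in the lookup should be treated as text, is to pass `dtype=str` to `pandas.read_csv`. The sketch below uses a small in-memory CSV as a stand-in for the real file; the codes in it are made up.

```python
import io
import pandas

# Hypothetical two-row stand-in for the ONS lookup; the codes are made up
toy_csv = io.StringIO(
    "LSOA11CD,LAD11NM,LAD11NMW\n"
    "L1,City of London,\n"
    "L2,Cardiff,Caerdydd\n"
)

# dtype=str reads every column as text, so pandas has no types to infer
toy = pandas.read_csv(toy_csv, dtype=str)

# Empty cells are still read as missing values
print(toy["LAD11NMW"].isna().sum())
```

Setting `low_memory=False`, as the warning itself suggests, is the other option; it lets pandas infer each column's type from the whole file at once.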
Grab the MSOA code for each LSOA (the join with the Liverpool LSOAs below keeps only the rows we need)
msoas = (
    cw[["LSOA11CD", "MSOA11CD"]]
    .drop_duplicates(keep="last")
    .set_index("LSOA11CD")
)
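The lookup has one row per Output Area (`OA11CD`), so each LSOA/MSOA pair is repeated once for every Output Area inside that LSOA. Dropping duplicates leaves one row per LSOA, which keeps the later join from multiplying rows. A minimal sketch of that step on made-up codes (the `toy_` names are illustrative):

```python
import pandas

# Hypothetical Output Area level lookup; all codes are made up
toy_cw = pandas.DataFrame(
    {
        "OA11CD": ["OA1", "OA2", "OA3", "OA4"],
        "LSOA11CD": ["L1", "L1", "L2", "L3"],
        "MSOA11CD": ["M1", "M1", "M1", "M2"],
    }
)

# Same chain as in the notebook, applied to the toy table
toy_msoas = (
    toy_cw[["LSOA11CD", "MSOA11CD"]]
    .drop_duplicates(keep="last")
    .set_index("LSOA11CD")
)

# Four Output Area rows collapse to three LSOA rows, with a unique index
print(len(toy_cw), len(toy_msoas), toy_msoas.index.is_unique)
```

Checking `index.is_unique` on the real `msoas` table is a cheap guard before joining on it.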
Build the table
msoas.head()
LSOA11CD | MSOA11CD |
---|---|
E01000002 | E02000001 |
E01032740 | E02000001 |
E01000005 | E02000001 |
E01000009 | E02000017 |
E01000008 | E02000016 |
db = (
    lsoas.join(msoas, on="LSOA11CD")
    [["LSOA11CD", "MSOA11CD", "geometry"]]
)
db.info()
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 298 entries, 0 to 297
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 LSOA11CD 298 non-null object
1 MSOA11CD 298 non-null object
2 geometry 298 non-null geometry
dtypes: geometry(1), object(2)
memory usage: 7.1+ KB
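The output above shows 298 non-null `MSOA11CD` values, so every Liverpool LSOA found a match in the lookup. That check can also be made explicit. The sketch below is one way to do it with `pandas.DataFrame.merge` instead of `join`, using its `validate` and `indicator` parameters on made-up tables; the `toy_` names and codes are illustrative, not the notebook's data.

```python
import pandas

# Hypothetical stand-ins for the LSOA table and the deduplicated lookup
toy_lsoas = pandas.DataFrame({"LSOA11CD": ["L1", "L2"], "imd_rank": [10, 20]})
toy_msoas = pandas.DataFrame(
    {"LSOA11CD": ["L1", "L2", "L3"], "MSOA11CD": ["M1", "M1", "M2"]}
)

checked = toy_lsoas.merge(
    toy_msoas,
    on="LSOA11CD",
    how="left",              # keep every LSOA, matched or not
    validate="many_to_one",  # raise if an LSOA code repeats in the lookup
    indicator=True,          # add a `_merge` column recording the match status
)

# LSOAs that found no MSOA in the lookup
unmatched = int((checked["_merge"] == "left_only").sum())
print(unmatched)
```

A non-zero `unmatched` count would indicate LSOA codes missing from the lookup, for example because the two files use different boundary vintages.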
Write as GeoPackage
! rm -f liv_lsoas.gpkg
db.to_file("liv_lsoas.gpkg", driver="GPKG")
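The `! rm -f` line is an IPython shell escape, so it only works inside IPython or Jupyter on a system that provides `rm`. If the notebook needs to run elsewhere, a pure-Python equivalent could look like the sketch below; `remove_if_exists` is a hypothetical helper, and the demonstration uses an empty placeholder file in a temporary directory rather than a real GeoPackage.

```python
import tempfile
from pathlib import Path


def remove_if_exists(path):
    """Delete `path` if present; do nothing otherwise (like `rm -f`)."""
    path = Path(path)
    if path.exists():
        path.unlink()


# Demonstration in a temporary directory, with a placeholder file
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "liv_lsoas.gpkg"
    target.write_bytes(b"")      # stand-in for an old output file
    remove_if_exists(target)
    removed = not target.exists()
    remove_if_exists(target)     # second call finds nothing and does nothing

print(removed)
```

In the notebook, calling `remove_if_exists("liv_lsoas.gpkg")` before `db.to_file(...)` would play the same role as the shell command.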