Datasets

Census socio-demographics

There are two Census datasets used in the labs:

Table of LSOA areas in Liverpool with population counts by world region. The table is derived from the CDRC Census data pack (see below). “Lab 1 - Extra” contains a detailed explanation of how the table is constructed.

Source: available here.

Collection of socio-demographic characteristics from the 2011 Census for the city of Liverpool. A detailed description of the dataset, as well as instructions on how to download it, is available at the source link.

Source: CDRC’s Census data pack for the city of Liverpool (UK). Available at this link.

Instructions: you will need to be registered on the CDRC website, which is free and very easy. Once logged in, click on the link provided above and select “Download” on the dataset’s page.

Index of Multiple Deprivation

Scores, ranks, and components of the 2015 Index of Multiple Deprivation (IMD). A detailed description of the dataset, as well as instructions on how to download it, is available at the source link.

Source: CDRC’s English Indices of Deprivation 2015 Geodata Pack for the city of Liverpool (UK). Available at this link.

Instructions: you will need to be registered on the CDRC website, which is free and very easy. Once logged in, click on the link provided above and select “Download” on the dataset’s page.

OS Geodata Pack

This is a compilation of spatial data about the city of Liverpool produced by the Ordnance Survey, distributed as open data, and provided by the CDRC. A detailed description of the dataset, as well as instructions on how to download it, is available at the source link.

Source: CDRC’s Geodata pack for the city of Liverpool (UK). Available at this link.

Instructions: you will need to be registered on the CDRC website, which is free and very easy. Once logged in, click on the link provided above and select “Download” on the dataset’s page.

UK raster

Lab 2 includes a blurb on displaying raster data. To do that, it uses a file
from the Ordnance Survey that is available from the OS website:

https://www.ordnancesurvey.co.uk/opendatadownload/products.html

For convenience, the file is also available for download here.

John Snow’s Cholera map

This is the dataset behind Dr. John Snow’s famous 1854 map of cholera in central London.

The folder contains the street network, point data for the location of the pumps (one of which was contaminated with cholera), and a polygon file with building blocks from the Ordnance Survey (OS data © Crown copyright and database right, 2015). An explanation of the data sources is provided in the companion text file README.txt.

All the necessary data are available as a single download from the course website at the following link:

http://darribas.org/gds16/content/labs/data/john_snow.zip
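The download can also be scripted. The following is a minimal sketch using only the Python standard library; the function name `download_archive` and the destination folder are hypothetical choices, not part of the course material. It saves the archive locally and lists its members, so you can check that README.txt is there before extracting anything.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL as given in the text above
JOHN_SNOW_URL = "http://darribas.org/gds16/content/labs/data/john_snow.zip"


def download_archive(url, dest_dir):
    """Download the .zip at `url` into `dest_dir`.

    Returns a tuple (local archive path, sorted list of member names).
    The function name and return shape are illustrative assumptions.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Keep the remote file name for the local copy
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(archive))
    # List what is inside without extracting it yet
    with zipfile.ZipFile(archive) as zf:
        members = sorted(zf.namelist())
    return archive, members
```

Calling `download_archive(JOHN_SNOW_URL, "data")` would place `john_snow.zip` in a `data` folder (an assumed location) and return the names of the files it contains.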

AirBnb listing for Inner London - MSOA level

This dataset contains information on AirBnb properties in Inner London, aggregated at the MSOA level. It was prepared by Dani Arribas-Bel from the full listing of AirBnb locations for London provided by Inside AirBnb. Like the source, the dataset is released under a CC0 1.0 Universal License.

For every polygon, the following variables are provided:

Source: Inside AirBnb’s extract of AirBnb locations in London (UK).

Instructions: the data is provided as a GeoJSON file and is available for download at the following URL (right-click and “Save As” on the link):

http://darribas.org/gds16/content/labs/data/ilm_abb.geojson
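Because GeoJSON is plain JSON, a downloaded copy can be inspected without any geospatial library. The sketch below, with a hypothetical helper name `geojson_summary`, counts the polygons and collects the variable names attached to them. It assumes the file is a standard FeatureCollection saved locally. A library such as geopandas can read the same file directly if you need the geometries.

```python
import json


def geojson_summary(path):
    """Return (number of features, sorted property names) for a GeoJSON file.

    Assumes a FeatureCollection layout; the helper name is illustrative.
    """
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    features = data.get("features", [])
    names = set()
    for feat in features:
        # "properties" may be null in valid GeoJSON, hence the fallback
        names.update((feat.get("properties") or {}).keys())
    return len(features), sorted(names)
```

For example, `geojson_summary("ilm_abb.geojson")` would report how many MSOA polygons the file holds and which variables each one carries, assuming the file was saved under that name in the working directory.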

The lab also uses an additional file that contains the boundary lines of the London boroughs, which has been obtained from:

https://github.com/radoi90/housequest-data/blob/master/london_boroughs.geojson

However, some students have experienced problems with the original file. If that is the case for you, go ahead and download this version from the course website:

http://darribas.org/gds16/content/labs/data/london_boroughs.geojson

Additional files: a Jupyter notebook showing the cleaning and aggregation steps that turn the original data into the file provided here is available in .ipynb and HTML formats.

Geo-referenced tweets

This dataset, provided as a shapefile, contains a collection of locations and
time stamps relating to Twitter postings within the
municipality of Liverpool. The data was originally provided by Guy Lansley
from UCL and processed by Dani Arribas-Bel.

Every row in the dataset contains an individual tweet, and is provided with the
following information:

Source: Twitter via Guy Lansley (UCL).

Instructions: given the Twitter license that applies to the data, this
dataset cannot be redistributed publicly online. For that reason, it has been
uploaded to the VITAL page of the course. You can find it on the following
location:

VITAL ENVS3/563 –> Learning Resources –> Twitter dataset

The shapefile is provided as a compressed .zip file. Download it, extract it
to a folder you can access, and set the path in your code accordingly.
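The extraction step can be sketched in Python as follows. The helper name `extract_shapefile` and the folder names are hypothetical, and the archive's internal layout is not known here, so the code searches the extracted folder for the first `.shp` file. A shapefile is made of several companion files (such as `.shp`, `.shx` and `.dbf`) that must stay in the same folder, which is why the whole archive is unpacked together.

```python
import zipfile
from pathlib import Path


def extract_shapefile(zip_path, dest_dir):
    """Unpack `zip_path` into `dest_dir` and return the path to the .shp file.

    Illustrative helper: if several .shp files exist, the first one in
    sorted order is returned.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        # Extract everything so the companion files stay alongside the .shp
        zf.extractall(dest)
    shp_files = sorted(dest.rglob("*.shp"))
    if not shp_files:
        raise FileNotFoundError(f"no .shp file found in {zip_path}")
    return shp_files[0]
```

The returned path is the one to pass to whatever tool reads the shapefile in the lab.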