Datasets

This section covers the datasets required to run the course interactively. For archival reasons, all of the datasets listed here are mirrored in the course repository, so if you have downloaded the course you already have a local copy of them.

Madrid

Airbnb properties

Source

This dataset has been sourced from the course “Spatial Modelling for Data Scientists”. The file imported here corresponds to version v0.1.0.

This dataset contains a pre-processed set of properties advertised on the Airbnb website within the region of Madrid (Spain), together with house characteristics.

License
This dataset is licensed under a CC0 1.0 Universal Public Domain Dedication.

Airbnb neighbourhoods

Source

This dataset has been sourced directly from the Inside Airbnb website. The file was imported on February 10th 2021.

This dataset contains neighbourhood boundaries for the city of Madrid, as provided by Inside Airbnb.

License
This dataset is licensed under a CC0 1.0 Universal Public Domain Dedication.

Arturo

This dataset contains the street layout of Madrid together with habitability scores, where available, for individual street segments. The data originate from the Arturo Project, by 300,000Km/s, and the file available here is a slimmed-down version of the official street layout distributed by the project.

Creative Commons License
This dataset is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Sentinel 2 - 120m mosaic

This dataset contains four scenes for the region of Madrid (Spain) extracted from the Digital Twin Sandbox Sentinel-2 collection, by SentinelHub. The scenes correspond to the following dates in 2019, one scene per date:

  • January 1st

  • April 1st

  • July 10th

  • November 17th

Each scene includes red, green, blue and near-infrared bands.

Creative Commons License
This dataset is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Sentinel 2 - 10m GHS composite

This dataset contains a scene for the region of Madrid (Spain) extracted from the GHS Composite S2, by the European Commission.

Creative Commons License
This dataset is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Cambodia

Pollution

Surface of \(NO_2\) measurements (tropospheric column) derived from Sentinel 5.

Friction surfaces

This dataset is an extract of the following two data products, created by Weiss et al. (2020) [WNVR+20] and distributed through the Malaria Atlas Project:

  • Global friction surface enumerating land-based travel walking-only speed without access to motorized transport for a nominal year 2019 (Minutes required to travel one metre)

  • Global friction surface enumerating land-based travel speed with access to motorized transport for a nominal year 2019 (Minutes required to travel one metre)

Each is provided in a separate file.
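Because both surfaces are expressed in minutes required to travel one metre, a cell value can be converted into a more familiar speed or travel time with simple arithmetic. The sketch below illustrates that conversion only; the function names and the example value are hypothetical and are not taken from the data files.

```python
def friction_to_speed_kmh(minutes_per_metre: float) -> float:
    """Convert a friction value (minutes per metre) into a speed in km/h."""
    if minutes_per_metre <= 0:
        raise ValueError("friction must be a positive number of minutes per metre")
    metres_per_minute = 1.0 / minutes_per_metre
    # 60 minutes per hour, 1000 metres per kilometre
    return metres_per_minute * 60.0 / 1000.0


def travel_time_minutes(minutes_per_metre: float, distance_m: float) -> float:
    """Minutes needed to cross `distance_m` metres at a constant friction."""
    return minutes_per_metre * distance_m


# Hypothetical cell value, chosen for illustration only (not read from the files)
example_friction = 0.012

speed = friction_to_speed_kmh(example_friction)
time_1km = travel_time_minutes(example_friction, 1000.0)
print(f"{speed:.1f} km/h, {time_1km:.1f} minutes per km")
```

With this illustrative value, the cell corresponds to roughly 5 km/h, a typical walking pace. Higher friction values mean slower travel.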

Regional aggregates

Source

This dataset relies on boundaries provided by the World Food Programme through the Humanitarian Data Exchange. The file was accessed on February 15th 2021.

This dataset contains pollution and friction values aggregated to Level 2 (municipality) administrative boundaries for Cambodia.

Creative Commons License
This dataset is licensed under a Creative Commons Attribution 4.0 International License.

Cambodian cities

Extract from the Urban Centre Database (UCDB), version 1.2, containing the centroids of Cambodian cities.

Creative Commons License
This dataset is licensed under a Creative Commons Attribution 4.0 International License.