Datasets¶
This section covers the datasets required to run the course interactively. For archival reasons, all of those listed here have been mirrored in the repository for this course so, if you have downloaded the course, you already have a local copy of them.
Madrid¶
Airbnb properties¶
Source
This dataset has been sourced from the course “Spatial Modelling for Data Scientists”. The file imported here corresponds to version v0.1.0.
This dataset contains a pre-processed set of properties advertised on the Airbnb website within the region of Madrid (Spain), together with house characteristics.
🗃️ Data file
madrid_abb.gpkg
🤖 Code used to generate the file
[URL]
ℹ️ Further information
[URL]
This dataset is licensed under a CC0 1.0 Universal Public Domain Dedication.
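A GeoPackage (.gpkg) file is a SQLite database, so the layers it holds can be listed with Python's standard library before loading it with a geospatial package. The following is a minimal sketch rather than the course's own loading code; the helper name list_gpkg_layers is hypothetical, and the only structure it relies on is the gpkg_contents table that the GeoPackage standard requires.

```python
import os
import sqlite3


def list_gpkg_layers(path):
    """Return (table_name, data_type) pairs registered in a GeoPackage.

    `list_gpkg_layers` is a hypothetical helper written for illustration.
    """
    # sqlite3.connect() would silently create an empty database for a
    # missing path, so fail early instead.
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    con = sqlite3.connect(path)
    try:
        # gpkg_contents is the registry table mandated by the GeoPackage
        # standard; each row describes one layer stored in the file.
        rows = con.execute(
            "SELECT table_name, data_type FROM gpkg_contents "
            "ORDER BY table_name"
        ).fetchall()
    finally:
        con.close()
    return rows
```

Calling list_gpkg_layers("madrid_abb.gpkg") from the folder holding the data file (the relative path is an assumption about where the file sits) would return one pair per layer. Vector layers are recorded with the data type "features".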
Airbnb neighbourhoods¶
Source
This dataset has been directly sourced from the website Inside Airbnb. The file was imported on February 10th 2021.
This dataset contains neighbourhood boundaries for the city of Madrid, as provided by Inside Airbnb.
🗃️ Data file
neighbourhoods.geojson
ℹ️ Further information
[URL]
This dataset is licensed under a CC0 1.0 Universal Public Domain Dedication.
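GeoJSON is plain JSON, so a boundaries file can be explored without any geospatial dependency. The sketch below collects the unique values of one property across all features. The function name, the default property key "neighbourhood", and the tiny sample collection are assumptions made for illustration; check the actual property names in neighbourhoods.geojson before relying on them.

```python
import json


def neighbourhood_names(geojson_text, name_key="neighbourhood"):
    """Return the sorted unique values of `name_key` across all features.

    `name_key` defaults to an assumed property name; adjust it to match
    the file being read.
    """
    collection = json.loads(geojson_text)
    names = set()
    for feature in collection["features"]:
        # GeoJSON allows "properties" to be null, so guard against None.
        properties = feature.get("properties") or {}
        value = properties.get(name_key)
        if value is not None:
            names.add(value)
    return sorted(names)


# Made-up sample for illustration only; not taken from the real file.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"neighbourhood": "Example B"}},
        {"type": "Feature", "geometry": None,
         "properties": {"neighbourhood": "Example A"}},
        {"type": "Feature", "geometry": None, "properties": None},
    ],
})
print(neighbourhood_names(sample))  # ['Example A', 'Example B']
```

For the real file, the text can be read with open("neighbourhoods.geojson", encoding="utf-8").read(), where the relative path is again an assumption.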
Arturo¶
This dataset contains the street layout of Madrid together with habitability scores, where available, for individual street segments. The data originate from the Arturo Project, by 300,000Km/s, and the file provided here is a slimmed-down version of the official street layout distributed by the project.
🗃️ Data file
arturo_streets.gpkg
🤖 Code used to generate the file
[Page]
ℹ️ Further information
[URL]
This dataset is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Sentinel 2 - 120m mosaic¶
This dataset contains four scenes for the region of Madrid (Spain) extracted from the Digital Twin Sandbox Sentinel-2 collection, by the SentinelHub. The scenes correspond to the following dates in 2019, one scene per date:
January 1st
April 1st
July 10th
November 17th
Each scene includes red, green, blue and near-infrared bands.
This dataset is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Sentinel 2 - 10m GHS composite¶
This dataset contains a scene for the region of Madrid (Spain) extracted from the GHS Composite S2, by the European Commission.
🗃️ Data file
madrid_scene_s2_10_tc.tif
🤖 Code used to generate the file
[Page]
ℹ️ Further information
[URL]
This dataset is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Cambodia¶
Pollution¶
Raster surface of \(NO_2\) measurements (tropospheric column) derived from Sentinel 5.
🗃️ Data file
cambodia_s5_no2.tif
🤖 Code used to generate the file
[Page]
ℹ️ Further information
[URL]
Friction surfaces¶
This dataset is an extract of the following two data products by Weiss et al. (2020) [WNVR+20], distributed through the Malaria Atlas Project:
Global friction surface enumerating land-based travel walking-only speed without access to motorized transport for a nominal year 2019 (Minutes required to travel one metre)
Global friction surface enumerating land-based travel speed with access to motorized transport for a nominal year 2019 (Minutes required to travel one metre)
Each is provided in a separate file.
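Since both surfaces store minutes required to travel one metre, a cell value can be turned into a more familiar speed or into the time needed to cover a given distance. The sketch below only illustrates that unit arithmetic; the function names and the example value of 0.012 minutes per metre are assumptions chosen for illustration, not values read from the files.

```python
def friction_to_kmh(minutes_per_metre):
    """Convert a friction value (minutes per metre) to speed in km/h."""
    if minutes_per_metre <= 0:
        raise ValueError("friction must be a positive number of minutes per metre")
    metres_per_minute = 1.0 / minutes_per_metre
    # 60 minutes per hour, 1000 metres per kilometre.
    return metres_per_minute * 60.0 / 1000.0


def travel_minutes(minutes_per_metre, distance_m):
    """Minutes needed to cross `distance_m` metres at a constant friction."""
    return minutes_per_metre * distance_m


# Illustrative value only: 0.012 min/m is about 5 km/h,
# so one kilometre takes about 12 minutes.
speed = friction_to_kmh(0.012)
minutes = travel_minutes(0.012, 1000)
```

Working with friction rather than speed is convenient because the time to cross several cells is a sum of friction times distance, which is what least-cost travel time calculations accumulate.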
Regional aggregates¶
Source
This dataset relies on boundaries from the Humanitarian Data Exchange. The file is provided by the World Food Programme through the Humanitarian Data Exchange and was accessed on February 15th 2021.
Pollution and friction aggregated at Level 2 (municipality) administrative boundaries for Cambodia.
🗃️ Data file
cambodia_regional.gpkg
🤖 Code used to generate the file
[Page]
This dataset is licensed under a Creative Commons Attribution 4.0 International License.
Cambodian cities¶
Extract from the Urban Centre Database (UCDB), version 1.2, containing the centroids of Cambodian cities.
🗃️ Data file
cambodian_cities.geojson
🤖 Code used to generate the file
[Page]
ℹ️ Further information
[URL]
This dataset is licensed under a Creative Commons Attribution 4.0 International License.