Prepare Brexit dataset#
This notebook shows how the table with results from the EU referendum has been assembled for this course.
The dataset is the same as the Brexit data used in the GDS Book reyABwolf, but here we have just assembled it into a single file that can be read remotely. For more details on sources, please refer to:
import pandas, geopandas
Referendum results#
# Updated to work on Sep. 15th 2020
url = ("https://www.electoralcommission.org.uk/sites/default/"
"files/2019-07/EU-referendum-result-data.csv")
url
'https://www.electoralcommission.org.uk/sites/default/files/2019-07/EU-referendum-result-data.csv'
ref_res = pandas.read_csv(url)
ref_res.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 382 entries, 0 to 381
Data columns (total 21 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 382 non-null int64
1 Region_Code 382 non-null object
2 Region 382 non-null object
3 Area_Code 382 non-null object
4 Area 382 non-null object
5 Electorate 382 non-null int64
6 ExpectedBallots 382 non-null int64
7 VerifiedBallotPapers 382 non-null int64
8 Pct_Turnout 382 non-null float64
9 Votes_Cast 382 non-null int64
10 Valid_Votes 382 non-null int64
11 Remain 382 non-null int64
12 Leave 382 non-null int64
13 Rejected_Ballots 382 non-null int64
14 No_official_mark 382 non-null int64
15 Voting_for_both_answers 382 non-null int64
16 Writing_or_mark 382 non-null int64
17 Unmarked_or_void 382 non-null int64
18 Pct_Remain 382 non-null float64
19 Pct_Leave 382 non-null float64
20 Pct_Rejected 382 non-null float64
dtypes: float64(4), int64(13), object(4)
memory usage: 62.8+ KB
Join#
Link up both tables, keep only required columns for a simpler and slimmer table, and drop areas without values:
db = lads.join(ref_res.set_index("Area_Code"), on="lad16cd")\
[['objectid', 'lad16cd', 'lad16nm', 'Pct_Leave', 'geometry']]\
.dropna()
db.info()
<class 'geopandas.geodataframe.GeoDataFrame'>
Int64Index: 380 entries, 0 to 390
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 objectid 380 non-null int64
1 lad16cd 380 non-null object
2 lad16nm 380 non-null object
3 Pct_Leave 380 non-null float64
4 geometry 380 non-null geometry
dtypes: float64(1), geometry(1), int64(1), object(2)
memory usage: 17.8+ KB
Write out#
We save as a Geopackage:
! rm -f brexit.gpkg
db.to_file("brexit.gpkg", driver="GPKG")