Geographic Data Science - Lecture I

Introduction

Dani Arribas-Bel

Today

  • This course
  • The (geo-)data revolution
  • (Geo-)Data Science

This course

Quiz

  • Have you ever heard the terms "Big Data" and "Data Science"?
  • Have you ever written a line of computer code?
  • How would you define in one sentence "Data Science"?
  • Do you think "Geographic Data Science" is closer to GIS or Statistics?

More stats than a GIS course, more GIS than a stats course...

...but in a fun way!

Structure

11 weeks of:

  • Prep. materials: videos, podcasts, articles... approx. 1h. (highly recommended!)
  • 1h. Lecture: concepts, methods, examples
  • 2h. Computer practical: hands-on, application of concepts, Python (highly employable)
  • Further readings: how to go beyond the minimum

IMPORTANT: Week 7 has no class! [Labs are booked so I recommend you spend the lab time working on your first assignment]

Website

http://darribas.org/gds15

Philosophy

  • (Lots of) methods and techniques
    • General overview
    • Intuition
    • Very little math
  • Emphasis on the application
  • Close connection to "real world" applications
  • FUN

Assignments

  • Mark based on two assignments, due:
    1. Week 8 (50%)
    2. Week 13 (50%)
  • Coursework
    • Equivalent to 2,500 words: report with code, figures (e.g. maps), and text

The (geo-)data revolution

Exciting times to be a:

  • Geographer
  • Map fan
  • Data fan

The world is being "datafied"...

"Datafication"

Quantification of phenomena through the systematic recording of data

“taking all aspects of life and turning them into data” (Cukier & Mayer-Schoenberger)

Examples: credit transactions, public transit, tweets, Facebook likes, Spotify songs, etc.

"Datafication"

Many implications:

  • Opportunities for optimization of systems (Industrial IoT, planning systems...)
  • Window into human behaviour (this course)
  • Issues with intentionality and privacy
  • ...

Why now?

Advances in:

  • Computing power
  • Communication
  • Geospatial technology

Why now? --> Computing power

[Figures; sources linked in the original slides]

Why now? --> Communication

[Figures; sources linked in the original slides]

Why now? --> Geospatial technology

[Figures; sources linked in the original slides]

The (geo-)data revolution

The confluence of the three (computing, communication and geospatial) is creating large amounts of data.

Now, data in itself is not very valuable:

  • Data --> Information --> Knowledge --> Action

Data Science

Methods, tools and techniques to turn data into actionable knowledge

But wait, isn't statistics just that?

Not only...

Data Science

Source: Drew Conway

Data Science

Statistics is a very important part of DS...

... but not the only one:

  • Computational tools --> Programming (hence this course's tutorials!)
  • Communication skills --> "Storytelling" (hence this course's assignments)
  • Domain expertise --> Theories about why the data are the way they are (hence the rest of your degree)

Data Science

  • Not all new (standing on the shoulders of giants)
  • "The data becomes key part in the product"
  • Focus on actionability and solving particular problems

Some examples...

Amazon

Dating sites

Uber

Geo-Data Science

  • A (very) large portion of all these new data are inherently geographic or can be traced back to some location over space.
  • Spatial is special.
  • Some of the methods require an explicitly spatial treatment --> (Geo-)Data Science
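
One way to see why location needs explicit treatment: a minimal Python sketch, not taken from the course materials, comparing the ground distance covered by one degree of longitude at two latitudes. The function name `haversine_km` and the Earth radius value are assumptions made for illustration; the haversine formula itself treats the Earth as a sphere, which is only an approximation.

```python
from math import asin, cos, radians, sin, sqrt

# Mean Earth radius in km (assumed value; the Earth is not a perfect sphere)
EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# The same one-degree step in longitude, at the equator and at 60 degrees north
at_equator = haversine_km(0, 0, 1, 0)
at_60_north = haversine_km(0, 60, 1, 60)

print(round(at_equator, 1), "km at the equator")
print(round(at_60_north, 1), "km at 60 degrees north")
```

The step at 60 degrees north covers roughly half the ground distance of the step at the equator, so treating longitude and latitude as ordinary flat x/y columns would distort any distance-based method.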

Some examples...

Airbnb neighborhoods

Google Maps routing

John Snow's cholera map

Creative Commons License
Geographic Data Science'15 - Lecture 1 by Dani Arribas-Bel is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.