Geographic Data Science - Lecture I

Introduction

Dani Arribas-Bel

Today

  • This course
  • The (geo-)data revolution
  • (Geo-)Data Science

This course

Quiz

  • Can you think of a real-world context where data and statistics are being used to make a difference?
  • Have you ever heard the term "Big Data"?
  • Have you ever heard the term "Data Science"?
  • Have you ever written a line of computer code?

More stats than a GIS course, more GIS than a stats course...

...but in a fun way!

Philosophy

  • (Lots of) methods and techniques
    • General overview
    • Intuition
    • Very little math
    • Lots of ways to continue on your own
  • Emphasis on the application and use
  • Close connection to "real world" applications

Logistics - Website

http://darribas.org/gds17

Logistics - Format

11 weeks of:

  • Prep. materials: videos, podcasts, articles... approx. 1h. (highly recommended!)
  • 1h. Lecture: concepts, methods, examples
  • 2h. Computer practical: hands-on, application of concepts, Python (highly employable)
  • Further readings: how to go beyond the minimum

Logistics - Content

  • Weeks 1-3: "big picture" lectures + introduction to computational tools (learning curve)
  • Weeks 4-8: "meat" of the course (lots of concepts packed)
  • Weeks 9-11: catch up + prepare an awesome Assignment II

Code

"Even if you won't be a poet, you need to know how to write"

Python

Python

  • General-purpose programming language
  • Sweet spot between "proof-of-concept" and "production-ready"
  • Industry standard: GIS (Esri, QGIS) and Data Science (Google, Facebook, Amazon, Netflix, The New York Times, NASA...)

Self-directed learning

Prepare for the labs

  • I won't be leading/lecturing at the computer labs
  • Go over the notebooks before the lecture and the computer lab --> if the first time you see a notebook is at the lab, you won't be able to follow along
  • Bring questions, comments, feedback, (informed) rants to class/labs
  • Use the forum (link on VITAL)
  • Collaborate (it's NOT a zero-sum game!!!)

More help!!!

This course is much more about "learning to learn" and problem solving than about acquiring specific programming tricks or stats wizardry

  • Learn to ask questions (but don't expect exact answers all the time!!!)
  • Help others as much as you can (the best way to learn is to teach)
  • Search heavily on Google + Stack Overflow

Assignments

  • Mark (mostly) based on two assignments:
    • Due: Week 7 (40%), Week 12 (55%)
    • Coursework
    • Equivalent to 2,500 words: report (notebook) with code, figures (e.g. maps), and text
  • Discussion board (5%)

NOTE: recommendation letters only for great students (>70)

The (geo-)data revolution

The (geo-)data revolution

Exciting times to be a:

  • Geographer
  • Map fan
  • Data fan

The world is being "datafied"...

"Datafication"

Quantification of phenomena through the systematic recording of data, "taking all aspects of life and turning them into data" (Cukier & Mayer-Schönberger)

Examples: credit transactions, public transit, tweets, Facebook likes, Spotify songs, etc.

"Datafication"

Many implications:

  • Window into human behaviour (this course)
  • Opportunities for optimization of systems (Industrial IoT, planning systems...)
  • Issues with intentionality and privacy
  • ...

Why now?

Advances in:

  • Computing power and storage
  • Connectivity
  • Geospatial technology

The (geo-)data revolution

The confluence of the three (computing, communication and geospatial) is creating large amounts of data.

Now, data in itself is not very valuable:

  • Data --> Information --> Knowledge --> Action
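The chain above can be sketched in a few lines of Python. This is only an illustration: the tap-in records, station names and the final recommendation are all made up, and a real analysis would involve far more data and care.

```python
from collections import Counter

# Hypothetical raw DATA: one record per public-transit tap-in (station, hour of day)
taps = [
    ("Central", 8), ("Central", 8), ("Central", 9),
    ("Harbour", 8), ("Harbour", 17), ("Central", 17),
    ("Central", 8), ("Park", 12),
]

# Data --> Information: summarise the raw records into counts per station and hour
counts = Counter(taps)

# Information --> Knowledge: identify the busiest station-hour combination
(busiest_station, busiest_hour), n_taps = counts.most_common(1)[0]

# Knowledge --> Action: turn the finding into a (made-up) recommendation
action = f"Add capacity at {busiest_station} around {busiest_hour}:00 ({n_taps} tap-ins)"
print(action)
```

The raw list on its own says very little; the value only appears once it is summarised, interpreted and acted upon.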

Data Science

Methods, tools and techniques to turn data into actionable knowledge

Data Science

Source: Drew Conway

Data Science

Statistics + ...

  • Computational tools --> Programming (hence this course's tutorials!)
  • Communication skills --> "Storytelling" (hence this course's assignments)
  • Domain expertise --> Theories about why the data are the way they are (hence the rest of your degree)
Some examples...

Geo-Data Science

Geo-Data Science

  • A (very) large portion of all these new data are inherently geographic or can be traced back to some location over space.
  • Spatial is special.
  • Some of the methods require an explicitly spatial treatment --> (Geo-)Data Science
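One small, hedged illustration of why "spatial is special": longitude and latitude cannot be treated as ordinary numeric columns. The sketch below uses the haversine formula, assuming a spherical Earth with a mean radius of 6,371 km (an approximation), to show that one degree of longitude covers about half the distance at 60° north that it covers at the equator. The function name `haversine_km` is only for illustration and is not part of the course materials.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points given in degrees.

    Assumes a spherical Earth; radius_km is an approximate mean radius.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# The same one-degree step in longitude, at two different latitudes
at_equator = haversine_km(0.0, 0.0, 0.0, 1.0)
at_60_north = haversine_km(60.0, 0.0, 60.0, 1.0)

print(f"1 degree of longitude at the equator: {at_equator:.1f} km")
print(f"1 degree of longitude at 60 N:        {at_60_north:.1f} km")
```

A method that measured distance directly on the raw coordinates would treat both steps as equal, which is one reason some techniques need an explicitly spatial treatment.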

Some examples...

Creative Commons License
Geographic Data Science'17 - Lecture 1 by Dani Arribas-Bel is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.