Get started with the GDS stack (Windows)

This document shows how to get started with the tools used in the GDS'19 course. In particular, it walks you through three of the most common tasks you will need to carry out to follow each computer practical:

  1. Start the Jupyter Notebook
  2. Download notebook files with the practicals
  3. Download data files and access them within the notebook

We assume you are either on a university computer that has all the required software installed (Install University Applications --> Scientific --> Geographic Data Science Stack 2019) or, if you are on a personal computer, that you have installed all the software following the instructions (if not, go here).

Fire up the Jupyter notebook

The whole course is based on the Jupyter notebook, a computational environment that provides an interface to the Python programming language. It lets you capture Python code, its output, and additional text in a single document (an .ipynb file, or notebook from now on). You will interact with notebooks through the Jupyter Lab app. Let's see how you can open it!

University machines

Follow these steps to start the Lab app:

  • Go to the Search Windows box in the bottom-left corner of the screen and type GDS19 (ignore the GDS18 in the screenshot; you should look for GDS19):

  • Hit enter. That should open up a couple of black terminal windows, which you can ignore. After a few seconds, a third window will launch:

Note that the file browser on the left should display a different list, reflecting the contents in your M: drive.

  • Place the files downloaded from the course website within the M: drive and navigate to them to be able to open them up!

Additional: non-university machines

If you are accessing a notebook from a computer that is not owned by the University of Liverpool, there are a couple of additional steps you need to follow:

  • Launch a command prompt or terminal on your machine. On Windows, this can be done by typing "Anaconda prompt" in the start menu; on Macs, you can find Terminal.app under Utilities; on Linux, you can use any bash or shell prompt available in your OS.
  • Now you need to navigate through the computer's file system (just as you would do with the Explorer window, but through commands). Assuming your data are in a folder under the path /path/to/folder/, you can type:

cd /path/to/folder

which should change the text at the start of the prompt line to /path/to/folder. This means you have changed to the /path/to/folder directory. On Windows, this could be, for example, C:\Users\darribas\Desktop.

  • The next step is to activate the gds environment. For that, type the following command:

conda activate gds

This should add the text (gds) at the beginning of each prompt line.

  • At this point, you are ready to fire up the lab! Type:

jupyter lab

And the browser should start and open a page that looks like the one above.
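
Once the Lab is open, a quick check from a code cell can confirm that the session is running from the folder you navigated to and inside the gds environment. The following is a minimal sketch: the helper name `session_info` is hypothetical, and it assumes conda sets the `CONDA_DEFAULT_ENV` variable when an environment is activated, which may not hold for every setup.

```python
import os
import sys


def session_info():
    """Collect a few facts about the running Python session."""
    return {
        # Folder that relative paths will be resolved against
        "working_directory": os.getcwd(),
        # Python interpreter behind the notebook
        "python_executable": sys.executable,
        # Name of the active conda environment, or None if not set
        "conda_environment": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in session_info().items():
    print(f"{key}: {value}")
```

If the last line does not show gds, the notebook was probably launched without activating the environment first.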

Congrats, you are good to go!!!

Download course files

Almost every file you will need for this course can be accessed through the course website:

darribas.org/gds19

To demonstrate how to effectively access the files needed for the computer practicals, we will use the example of the first practical. Here are the steps to follow:

  • In this case, we are interested in the first part. The file you want to download is the ipynb, which contains the notebook. Right-click on the link and choose "Save link as" (on a Windows machine the menu might look slightly different from the screenshot, but the option to pick is the same).

Save it somewhere WITHIN the M: drive. This is important for two reasons: a) you have started your Jupyter session there and b) it is the only way to ensure the file stays safely backed up and protected.

  • At this point, you can go back to your Jupyter Lab session and navigate within it to the folder where you have placed the notebook file.

  • Click on the file and that should open the notebook:

At this point, you are all set to hack some Python code!

Read files into the notebook

Finally, once you have the notebook for a practical ready, you will want to download the dataset of interest and access it from within the notebook. Downloading works just like for any other file linked on the internet. Accessing it from the notebook is a bit trickier but, once you have done it once, the process is always the same.

Let us use the dataset in the first lab as an example. To download it, you can access it on this link:

http://darribas.org/gds19/content/labs/data/liv_pop.csv

NOTE: This is also explained in the data section of the course website.

You can download it in the same way as the notebook before: right-click on the link --> "Save link as...". As before, place it somewhere within your M: drive.

Now, depending on where you have saved the file, you will use one of the following three options:

A - If you have placed the dataset in the same folder as the notebook, all you need to do is use the name of the file:

In [1]:
f = 'liv_pop.csv'

B - If you have placed the dataset in a folder within the folder where the notebook is, you need to include that folder in the path. For example, let us assume that, within the folder where your notebook is, there is a subfolder called data. In this case, you will point to the file in this way:

In [2]:
f = 'data/liv_pop.csv'
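
Relative paths such as the ones in options A and B are resolved against the folder the notebook is running from. As a rough sketch (the helper name `locate` is hypothetical and not part of the course materials), you can check where a relative path points and whether the file is actually there:

```python
from pathlib import Path


def locate(f):
    """Return the full location a path points to and whether a file exists there."""
    # Relative paths are interpreted from the current working directory
    full = Path.cwd() / f
    return full, full.is_file()


full, found = locate('data/liv_pop.csv')
print(full, found)
```

If the second value printed is False, the file is not where the path says it is, and the printed location shows where Python looked for it.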

C - If you have placed the dataset in a different folder altogether, you need to obtain what is called the "full path" of the file. Let us assume you downloaded the data file into the Downloads folder. To obtain the full path, the easiest way is to open a Windows Explorer window:

  • Click on the top bar, where you would usually enter a web address if this were an internet browser. This should change it to something looking approximately like:

  • Then right-click on "Desktop" and "Copy":

  • What you have just copied is the path to the folder where the file is, and you can paste this in lieu of the options above. However, you need to add the name of the file (just as above):

In [3]:
f = "C:/Users/darribas/Downloads/liv_pop.csv"
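
A path copied from Windows Explorer uses backslashes, which Python treats as escape characters inside ordinary strings, so pasting it as-is can raise an error. One way around this, sketched here with the standard `pathlib` module and the example folder from above, is to paste the folder as a raw string (note the `r` prefix) and let `pathlib` append the file name:

```python
from pathlib import PureWindowsPath

# Folder as copied from Explorer; the r prefix keeps backslashes literal
folder = PureWindowsPath(r"C:\Users\darribas\Downloads")

# Append the file name and convert to forward slashes
f = (folder / "liv_pop.csv").as_posix()
print(f)
```

The resulting string uses forward slashes, which Python accepts on Windows as well.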

NOTE: we have had reports that "relative paths" (option B) might not work on university machines when your data is stored on the M: drive. If you experience issues in this case, please use absolute paths (option C).
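
With `f` pointing at the file, a quick way to confirm the notebook can read it is to print the first few rows. This is only a sketch using Python's built-in `csv` module; the helper `peek` is hypothetical, and the practicals themselves may read the data with a different library.

```python
import csv
from pathlib import Path


def peek(f, n=3):
    """Return the first n rows of a CSV file as lists of strings."""
    rows = []
    with open(f, newline='') as handle:
        for i, row in enumerate(csv.reader(handle)):
            if i >= n:
                break
            rows.append(row)
    return rows


f = 'liv_pop.csv'
if Path(f).is_file():
    for row in peek(f):
        print(row)
else:
    print(f"Could not find {f}; check the path options above")
```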

Whichever of the three options (A, B, C) you have chosen, you should now be ready to get on with the practical. Happy hacking!!!