{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Reading & Manipulating Tabular Data\n", "\n", "{ref}`👨‍🏫 [Sync] `" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas\n", "import numpy as np\n", "from sqlalchemy import create_engine" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To note:\n", "- `matplotlib` magic\n", "- Library import and alias" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Read in `.csv`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "````{tabbed} Local files\n", "\n", "Assuming you have [the file](https://github.com/darribas/data_science_studio/raw/master/content/data/paris_abb.csv.zip) downloaded on the path `../data/`: \n", "\n", "```python\n", "db = pandas.read_csv(\"../data/paris_abb.csv.zip\")\n", "```\n", "````\n", "\n", "````{tabbed} Online read\n", "\n", "If you're online, you can do:\n", "\n", "```python\n", "db = pandas.read_csv(\"https://github.com/darribas/data_science_studio/raw/master/content/data/paris_abb.csv.zip\")\n", "```\n", "````" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [ "remove-cell" ] }, "outputs": [], "source": [ "db = pandas.read_csv(\"../data/paris_abb.csv.zip\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Connect to `SQLite` and query" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "engine = create_engine(\"sqlite:///../data/paris_abb_mini.db\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idneighbourhood_cleansedproperty_typeroom_typeaccommodatesbathroomsbedroomsbedsbed_typePrice
03109ObservatoireApartmentEntire home/apt21.00.01.0Real Bed60.0
15396HĂ´tel-de-VilleApartmentEntire home/apt21.00.01.0Pull-out Sofa115.0
27397HĂ´tel-de-VilleApartmentEntire home/apt41.02.02.0Real Bed119.0
37964OpéraApartmentEntire home/apt21.01.01.0Real Bed130.0
49952PopincourtApartmentEntire home/apt21.01.01.0Real Bed75.0
\n", "
" ], "text/plain": [ " id neighbourhood_cleansed property_type room_type accommodates \\\n", "0 3109 Observatoire Apartment Entire home/apt 2 \n", "1 5396 Hôtel-de-Ville Apartment Entire home/apt 2 \n", "2 7397 Hôtel-de-Ville Apartment Entire home/apt 4 \n", "3 7964 Opéra Apartment Entire home/apt 2 \n", "4 9952 Popincourt Apartment Entire home/apt 2 \n", "\n", " bathrooms bedrooms beds bed_type Price \n", "0 1.0 0.0 1.0 Real Bed 60.0 \n", "1 1.0 0.0 1.0 Pull-out Sofa 115.0 \n", "2 1.0 2.0 2.0 Real Bed 119.0 \n", "3 1.0 1.0 1.0 Real Bed 130.0 \n", "4 1.0 1.0 1.0 Real Bed 75.0 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "qry = \"\"\"\n", "SELECT * \n", "FROM db \n", "LIMIT 5;\n", "\"\"\"\n", "dbs = pandas.read_sql(qry, engine)\n", "dbs" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 5137 entries, 0 to 5136\n", "Data columns (total 2 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 id 5137 non-null int64 \n", " 1 Price 5137 non-null float64\n", "dtypes: float64(1), int64(1)\n", "memory usage: 80.4 KB\n" ] } ], "source": [ "qry = \"\"\"\n", "SELECT id, Price \n", "FROM db \n", "WHERE neighbourhood_cleansed = 'Popincourt';\n", "\"\"\"\n", "dbs = pandas.read_sql(qry, engine)\n", "dbs.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Explore" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idneighbourhood_cleansedproperty_typeroom_typeaccommodatesbathroomsbedroomsbedsbed_typePrice
03109ObservatoireApartmentEntire home/apt21.00.01.0Real Bed60.0
15396HĂ´tel-de-VilleApartmentEntire home/apt21.00.01.0Pull-out Sofa115.0
27397HĂ´tel-de-VilleApartmentEntire home/apt41.02.02.0Real Bed119.0
37964OpéraApartmentEntire home/apt21.01.01.0Real Bed130.0
49952PopincourtApartmentEntire home/apt21.01.01.0Real Bed75.0
\n", "
" ], "text/plain": [ " id neighbourhood_cleansed property_type room_type accommodates \\\n", "0 3109 Observatoire Apartment Entire home/apt 2 \n", "1 5396 Hôtel-de-Ville Apartment Entire home/apt 2 \n", "2 7397 Hôtel-de-Ville Apartment Entire home/apt 4 \n", "3 7964 Opéra Apartment Entire home/apt 2 \n", "4 9952 Popincourt Apartment Entire home/apt 2 \n", "\n", " bathrooms bedrooms beds bed_type Price \n", "0 1.0 0.0 1.0 Real Bed 60.0 \n", "1 1.0 0.0 1.0 Pull-out Sofa 115.0 \n", "2 1.0 2.0 2.0 Real Bed 119.0 \n", "3 1.0 1.0 1.0 Real Bed 130.0 \n", "4 1.0 1.0 1.0 Real Bed 75.0 " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.head()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 50280 entries, 0 to 50279\n", "Data columns (total 10 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 id 50280 non-null int64 \n", " 1 neighbourhood_cleansed 50280 non-null object \n", " 2 property_type 50280 non-null object \n", " 3 room_type 50280 non-null object \n", " 4 accommodates 50280 non-null int64 \n", " 5 bathrooms 50280 non-null float64\n", " 6 bedrooms 50280 non-null float64\n", " 7 beds 50280 non-null float64\n", " 8 bed_type 50280 non-null object \n", " 9 Price 50280 non-null float64\n", "dtypes: float64(4), int64(2), object(4)\n", "memory usage: 3.8+ MB\n" ] } ], "source": [ "db.info()" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idaccommodatesbathroomsbedroomsbedsPrice
count5.028000e+0450280.00000050280.00000050280.00000050280.00000050280.000000
mean1.968671e+073.0634251.1170451.0778041.664698110.630549
std1.140675e+071.5567310.6873701.0010001.164917157.848189
min3.109000e+031.0000000.0000000.0000000.0000000.000000
25%9.340209e+062.0000001.0000001.0000001.00000060.000000
50%1.999153e+072.0000001.0000001.0000001.00000080.000000
75%2.987492e+074.0000001.0000001.0000002.000000120.000000
max3.858893e+0717.00000050.00000050.00000050.00000010000.000000
\n", "
" ], "text/plain": [ " id accommodates bathrooms bedrooms beds \\\n", "count 5.028000e+04 50280.000000 50280.000000 50280.000000 50280.000000 \n", "mean 1.968671e+07 3.063425 1.117045 1.077804 1.664698 \n", "std 1.140675e+07 1.556731 0.687370 1.001000 1.164917 \n", "min 3.109000e+03 1.000000 0.000000 0.000000 0.000000 \n", "25% 9.340209e+06 2.000000 1.000000 1.000000 1.000000 \n", "50% 1.999153e+07 2.000000 1.000000 1.000000 1.000000 \n", "75% 2.987492e+07 4.000000 1.000000 1.000000 2.000000 \n", "max 3.858893e+07 17.000000 50.000000 50.000000 50.000000 \n", "\n", " Price \n", "count 50280.000000 \n", "mean 110.630549 \n", "std 157.848189 \n", "min 0.000000 \n", "25% 60.000000 \n", "50% 80.000000 \n", "75% 120.000000 \n", "max 10000.000000 " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`DataFrame` objects can hold different types of data\n", "\n", "- \"Whole\" numbers (`int`)\n", "- Decimals (`float`)\n", "- Categorical (`pandas.Category`)\n", "- Dates (`pandas.Timestamp`)\n", "- Geo (`geopandas.GeoDataFrame`)\n", "- ..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Indices" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idneighbourhood_cleansedproperty_typeroom_typeaccommodatesbathroomsbedroomsbedsbed_typePrice
03109ObservatoireApartmentEntire home/apt21.00.01.0Real Bed60.0
15396HĂ´tel-de-VilleApartmentEntire home/apt21.00.01.0Pull-out Sofa115.0
\n", "
" ], "text/plain": [ " id neighbourhood_cleansed property_type room_type accommodates \\\n", "0 3109 Observatoire Apartment Entire home/apt 2 \n", "1 5396 HĂ´tel-de-Ville Apartment Entire home/apt 2 \n", "\n", " bathrooms bedrooms beds bed_type Price \n", "0 1.0 0.0 1.0 Real Bed 60.0 \n", "1 1.0 0.0 1.0 Pull-out Sofa 115.0 " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.head(2)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
neighbourhood_cleansedproperty_typeroom_typeaccommodatesbathroomsbedroomsbedsbed_typePrice
id
3109ObservatoireApartmentEntire home/apt21.00.01.0Real Bed60.0
5396HĂ´tel-de-VilleApartmentEntire home/apt21.00.01.0Pull-out Sofa115.0
\n", "
" ], "text/plain": [ " neighbourhood_cleansed property_type room_type accommodates \\\n", "id \n", "3109 Observatoire Apartment Entire home/apt 2 \n", "5396 Hôtel-de-Ville Apartment Entire home/apt 2 \n", "\n", " bathrooms bedrooms beds bed_type Price \n", "id \n", "3109 1.0 0.0 1.0 Real Bed 60.0 \n", "5396 1.0 0.0 1.0 Pull-out Sofa 115.0 " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.set_index(\"id\").head(2)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "dbi = db.set_index(\"id\")" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Int64Index([ 3109, 5396, 7397, 7964, 9952, 10710,\n", " 11170, 11213, 11265, 11798,\n", " ...\n", " 38485563, 38489275, 38513797, 38516223, 38516341, 38517692,\n", " 38520175, 38537044, 38546594, 38588929],\n", " dtype='int64', name='id', length=50280)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dbi.index" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['neighbourhood_cleansed', 'property_type', 'room_type', 'accommodates',\n", " 'bathrooms', 'bedrooms', 'beds', 'bed_type', 'Price'],\n", " dtype='object')" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dbi.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Slicing and Dicing Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Index-based queries" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Columns" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 60.0\n", "1 115.0\n", "2 119.0\n", "3 130.0\n", "4 75.0\n", "Name: Price, dtype: float64" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db[\"Price\"].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Generic point-querying" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "60.0" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[0, \"Price\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Full slice of one dimension" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "id 3109\n", "neighbourhood_cleansed Observatoire\n", "property_type Apartment\n", "room_type Entire home/apt\n", "accommodates 2\n", "bathrooms 1\n", "bedrooms 0\n", "beds 1\n", "bed_type Real Bed\n", "Price 60\n", "Name: 0, dtype: object" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[0, :]" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 60.0\n", "1 115.0\n", "2 119.0\n", "3 130.0\n", "4 75.0\n", " ... \n", "50275 250.0\n", "50276 40.0\n", "50277 60.0\n", "50278 65.0\n", "50279 69.0\n", "Name: Price, Length: 50280, dtype: float64" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[:, \"Price\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Range queries" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 Observatoire\n", "1 Hôtel-de-Ville\n", "2 Hôtel-de-Ville\n", "3 Opéra\n", "4 Popincourt\n", "5 Élysée\n", "Name: neighbourhood_cleansed, dtype: object" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[0:5, \"neighbourhood_cleansed\"]" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "property_type Apartment\n", "room_type Entire home/apt\n", "accommodates 4\n", "bathrooms 1\n", "bedrooms 1\n", "beds 2\n", "bed_type Real Bed\n", "Name: 5, dtype: object" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[5, \"property_type\":\"bed_type\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- List-based queries" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Priceid
060.03109
49128.032082
1965.017919
2980.021264
3949.026567
9120.011798
\n", "
" ], "text/plain": [ " Price id\n", "0 60.0 3109\n", "49 128.0 32082\n", "19 65.0 17919\n", "29 80.0 21264\n", "39 49.0 26567\n", "9 120.0 11798" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[[0, 49, 19, 29, 39, 9], [\"Price\", \"id\"]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Order-based queries" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
neighbourhood_cleansedproperty_typeroom_typeaccommodatesbathroomsbedroomsbedsbed_typePrice
id
3109ObservatoireApartmentEntire home/apt21.00.01.0Real Bed60.0
5396HĂ´tel-de-VilleApartmentEntire home/apt21.00.01.0Pull-out Sofa115.0
7397HĂ´tel-de-VilleApartmentEntire home/apt41.02.02.0Real Bed119.0
7964OpéraApartmentEntire home/apt21.01.01.0Real Bed130.0
9952PopincourtApartmentEntire home/apt21.01.01.0Real Bed75.0
\n", "
" ], "text/plain": [ " neighbourhood_cleansed property_type room_type accommodates \\\n", "id \n", "3109 Observatoire Apartment Entire home/apt 2 \n", "5396 Hôtel-de-Ville Apartment Entire home/apt 2 \n", "7397 Hôtel-de-Ville Apartment Entire home/apt 4 \n", "7964 Opéra Apartment Entire home/apt 2 \n", "9952 Popincourt Apartment Entire home/apt 2 \n", "\n", " bathrooms bedrooms beds bed_type Price \n", "id \n", "3109 1.0 0.0 1.0 Real Bed 60.0 \n", "5396 1.0 0.0 1.0 Pull-out Sofa 115.0 \n", "7397 1.0 2.0 2.0 Real Bed 119.0 \n", "7964 1.0 1.0 1.0 Real Bed 130.0 \n", "9952 1.0 1.0 1.0 Real Bed 75.0 " ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dbi.iloc[0:5, :]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "**EXERCISE**: \n", "\n", "- Slice the dataset to keep only properties with the following IDs, in that order: `38520175`, `619716`, and `37847454`\n", "- Extract the section of the dataset that includes the 50th to the 100th rows, and the `room_type` and `bedrooms` columns\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Conditional queries" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Using `loc`" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
neighbourhood_cleansedPrice
0Observatoire60.0
47Observatoire90.0
52Observatoire150.0
104Observatoire84.0
144Observatoire140.0
\n", "
" ], "text/plain": [ " neighbourhood_cleansed Price\n", "0 Observatoire 60.0\n", "47 Observatoire 90.0\n", "52 Observatoire 150.0\n", "104 Observatoire 84.0\n", "144 Observatoire 140.0" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[db[\"neighbourhood_cleansed\"] == \"Observatoire\", \n", " [\"neighbourhood_cleansed\", \"Price\"]]\\\n", " .head()" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idneighbourhood_cleansed
03109Observatoire
49952Popincourt
510710Élysée
611170Panthéon
1011848Popincourt
\n", "
" ], "text/plain": [ " id neighbourhood_cleansed\n", "0 3109 Observatoire\n", "4 9952 Popincourt\n", "5 10710 Élysée\n", "6 11170 Panthéon\n", "10 11848 Popincourt" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[db[\"Price\"] < 100, [\"id\", \"neighbourhood_cleansed\"]].head()" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idneighbourhood_cleansedproperty_typeroom_typeaccommodatesbathroomsbedroomsbedsbed_typePrice
120668876983LuxembourgBoutique hotelPrivate room1650.050.016.0Real Bed85.0
2296418766792LuxembourgBoutique hotelPrivate room1650.050.017.0Real Bed85.0
2478819819352LuxembourgBoutique hotelPrivate room1650.050.020.0Real Bed85.0
2579820433587LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
2604820691340LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
2610820747725LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
2612120768013LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
\n", "
" ], "text/plain": [ " id neighbourhood_cleansed property_type room_type \\\n", "12066 8876983 Luxembourg Boutique hotel Private room \n", "22964 18766792 Luxembourg Boutique hotel Private room \n", "24788 19819352 Luxembourg Boutique hotel Private room \n", "25798 20433587 Luxembourg Boutique hotel Private room \n", "26048 20691340 Luxembourg Boutique hotel Private room \n", "26108 20747725 Luxembourg Boutique hotel Private room \n", "26121 20768013 Luxembourg Boutique hotel Private room \n", "\n", " accommodates bathrooms bedrooms beds bed_type Price \n", "12066 16 50.0 50.0 16.0 Real Bed 85.0 \n", "22964 16 50.0 50.0 17.0 Real Bed 85.0 \n", "24788 16 50.0 50.0 20.0 Real Bed 85.0 \n", "25798 16 50.0 50.0 50.0 Real Bed 85.0 \n", "26048 16 50.0 50.0 50.0 Real Bed 85.0 \n", "26108 16 50.0 50.0 50.0 Real Bed 85.0 \n", "26121 16 50.0 50.0 50.0 Real Bed 85.0 " ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[(db[\"Price\"] < 100) & \\\n", " (db[\"bathrooms\"] >= 8),\n", " :]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Conditional filters" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 False\n", "1 False\n", "2 False\n", "3 False\n", "4 False\n", "Name: bathrooms, dtype: bool" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fltr = db[\"bathrooms\"] > 8\n", "fltr.head()" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idneighbourhood_cleansedproperty_typeroom_typeaccommodatesbathroomsbedroomsbedsbed_typePrice
120668876983LuxembourgBoutique hotelPrivate room1650.050.016.0Real Bed85.0
2296418766792LuxembourgBoutique hotelPrivate room1650.050.017.0Real Bed85.0
2478819819352LuxembourgBoutique hotelPrivate room1650.050.020.0Real Bed85.0
2579820433587LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
2604820691340LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
2610820747725LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
2612120768013LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
\n", "
" ], "text/plain": [ " id neighbourhood_cleansed property_type room_type \\\n", "12066 8876983 Luxembourg Boutique hotel Private room \n", "22964 18766792 Luxembourg Boutique hotel Private room \n", "24788 19819352 Luxembourg Boutique hotel Private room \n", "25798 20433587 Luxembourg Boutique hotel Private room \n", "26048 20691340 Luxembourg Boutique hotel Private room \n", "26108 20747725 Luxembourg Boutique hotel Private room \n", "26121 20768013 Luxembourg Boutique hotel Private room \n", "\n", " accommodates bathrooms bedrooms beds bed_type Price \n", "12066 16 50.0 50.0 16.0 Real Bed 85.0 \n", "22964 16 50.0 50.0 17.0 Real Bed 85.0 \n", "24788 16 50.0 50.0 20.0 Real Bed 85.0 \n", "25798 16 50.0 50.0 50.0 Real Bed 85.0 \n", "26048 16 50.0 50.0 50.0 Real Bed 85.0 \n", "26108 16 50.0 50.0 50.0 Real Bed 85.0 \n", "26121 16 50.0 50.0 50.0 Real Bed 85.0 " ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db[fltr]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Concatenated queries" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idneighbourhood_cleansedproperty_typeroom_typeaccommodatesbathroomsbedroomsbedsbed_typePrice
120668876983LuxembourgBoutique hotelPrivate room1650.050.016.0Real Bed85.0
2296418766792LuxembourgBoutique hotelPrivate room1650.050.017.0Real Bed85.0
2478819819352LuxembourgBoutique hotelPrivate room1650.050.020.0Real Bed85.0
2579820433587LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
2604820691340LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
2610820747725LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
2612120768013LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
\n", "
" ], "text/plain": [ " id neighbourhood_cleansed property_type room_type \\\n", "12066 8876983 Luxembourg Boutique hotel Private room \n", "22964 18766792 Luxembourg Boutique hotel Private room \n", "24788 19819352 Luxembourg Boutique hotel Private room \n", "25798 20433587 Luxembourg Boutique hotel Private room \n", "26048 20691340 Luxembourg Boutique hotel Private room \n", "26108 20747725 Luxembourg Boutique hotel Private room \n", "26121 20768013 Luxembourg Boutique hotel Private room \n", "\n", " accommodates bathrooms bedrooms beds bed_type Price \n", "12066 16 50.0 50.0 16.0 Real Bed 85.0 \n", "22964 16 50.0 50.0 17.0 Real Bed 85.0 \n", "24788 16 50.0 50.0 20.0 Real Bed 85.0 \n", "25798 16 50.0 50.0 50.0 Real Bed 85.0 \n", "26048 16 50.0 50.0 50.0 Real Bed 85.0 \n", "26108 16 50.0 50.0 50.0 Real Bed 85.0 \n", "26121 16 50.0 50.0 50.0 Real Bed 85.0 " ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[(db[\"Price\"] < 100) & \\\n", " (db[\"bathrooms\"] >= 8),\n", " :]" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idneighbourhood_cleansedproperty_typeroom_typeaccommodatesbathroomsbedroomsbedsbed_typePrice
81496088687TempleApartmentEntire home/apt21.01.01.0Real Bed8500.0
101137225849Buttes-MontmartreApartmentEntire home/apt82.02.05.0Real Bed9379.0
112868093890PassyApartmentEntire home/apt31.02.02.0Real Bed8721.0
2510619974916Buttes-MontmartreCondominiumEntire home/apt41.01.02.0Real Bed0.0
2555820219162Buttes-ChaumontApartmentEntire home/apt11.00.01.0Real Bed0.0
2567620291987PassyApartmentEntire home/apt31.00.02.0Real Bed0.0
2569720313940TempleApartmentEntire home/apt21.00.01.0Real Bed0.0
2689921422028PopincourtApartmentEntire home/apt32.02.03.0Real Bed0.0
3219025448670VaugirardApartmentEntire home/apt11.00.01.0Real Bed8576.0
3529127608896ObservatoireApartmentPrivate room11.00.01.0Real Bed9059.0
4396134380017BourseServiced apartmentHotel room82.53.04.0Real Bed5150.0
4676736019554LouvreServiced apartmentHotel room82.52.04.0Real Bed8000.0
4757436402651Batignolles-MonceauHouseEntire home/apt123.04.05.0Real Bed10000.0
\n", "
" ], "text/plain": [ " id neighbourhood_cleansed property_type room_type \\\n", "8149 6088687 Temple Apartment Entire home/apt \n", "10113 7225849 Buttes-Montmartre Apartment Entire home/apt \n", "11286 8093890 Passy Apartment Entire home/apt \n", "25106 19974916 Buttes-Montmartre Condominium Entire home/apt \n", "25558 20219162 Buttes-Chaumont Apartment Entire home/apt \n", "25676 20291987 Passy Apartment Entire home/apt \n", "25697 20313940 Temple Apartment Entire home/apt \n", "26899 21422028 Popincourt Apartment Entire home/apt \n", "32190 25448670 Vaugirard Apartment Entire home/apt \n", "35291 27608896 Observatoire Apartment Private room \n", "43961 34380017 Bourse Serviced apartment Hotel room \n", "46767 36019554 Louvre Serviced apartment Hotel room \n", "47574 36402651 Batignolles-Monceau House Entire home/apt \n", "\n", " accommodates bathrooms bedrooms beds bed_type Price \n", "8149 2 1.0 1.0 1.0 Real Bed 8500.0 \n", "10113 8 2.0 2.0 5.0 Real Bed 9379.0 \n", "11286 3 1.0 2.0 2.0 Real Bed 8721.0 \n", "25106 4 1.0 1.0 2.0 Real Bed 0.0 \n", "25558 1 1.0 0.0 1.0 Real Bed 0.0 \n", "25676 3 1.0 0.0 2.0 Real Bed 0.0 \n", "25697 2 1.0 0.0 1.0 Real Bed 0.0 \n", "26899 3 2.0 2.0 3.0 Real Bed 0.0 \n", "32190 1 1.0 0.0 1.0 Real Bed 8576.0 \n", "35291 1 1.0 0.0 1.0 Real Bed 9059.0 \n", "43961 8 2.5 3.0 4.0 Real Bed 5150.0 \n", "46767 8 2.5 2.0 4.0 Real Bed 8000.0 \n", "47574 12 3.0 4.0 5.0 Real Bed 10000.0 " ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[(db[\"Price\"] < 5) | \\\n", " (db[\"Price\"] > 5000),\n", " :]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Using `query`" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
neighbourhood_cleansedPrice
0Observatoire60.0
47Observatoire90.0
52Observatoire150.0
104Observatoire84.0
144Observatoire140.0
\n", "
" ], "text/plain": [ " neighbourhood_cleansed Price\n", "0 Observatoire 60.0\n", "47 Observatoire 90.0\n", "52 Observatoire 150.0\n", "104 Observatoire 84.0\n", "144 Observatoire 140.0" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.query(\"neighbourhood_cleansed == 'Observatoire'\")\\\n", " [[\"neighbourhood_cleansed\", \"Price\"]]\\\n", " .head()" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idneighbourhood_cleansed
03109Observatoire
49952Popincourt
510710Élysée
611170Panthéon
1011848Popincourt
\n", "
" ], "text/plain": [ " id neighbourhood_cleansed\n", "0 3109 Observatoire\n", "4 9952 Popincourt\n", "5 10710 Élysée\n", "6 11170 Panthéon\n", "10 11848 Popincourt" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.query(\"Price < 100\")\\\n", " [[\"id\", \"neighbourhood_cleansed\"]]\\\n", " .head()" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idneighbourhood_cleansedproperty_typeroom_typeaccommodatesbathroomsbedroomsbedsbed_typePrice
120668876983LuxembourgBoutique hotelPrivate room1650.050.016.0Real Bed85.0
2296418766792LuxembourgBoutique hotelPrivate room1650.050.017.0Real Bed85.0
2478819819352LuxembourgBoutique hotelPrivate room1650.050.020.0Real Bed85.0
2579820433587LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
2604820691340LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
2610820747725LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
2612120768013LuxembourgBoutique hotelPrivate room1650.050.050.0Real Bed85.0
\n", "
" ], "text/plain": [ " id neighbourhood_cleansed property_type room_type \\\n", "12066 8876983 Luxembourg Boutique hotel Private room \n", "22964 18766792 Luxembourg Boutique hotel Private room \n", "24788 19819352 Luxembourg Boutique hotel Private room \n", "25798 20433587 Luxembourg Boutique hotel Private room \n", "26048 20691340 Luxembourg Boutique hotel Private room \n", "26108 20747725 Luxembourg Boutique hotel Private room \n", "26121 20768013 Luxembourg Boutique hotel Private room \n", "\n", " accommodates bathrooms bedrooms beds bed_type Price \n", "12066 16 50.0 50.0 16.0 Real Bed 85.0 \n", "22964 16 50.0 50.0 17.0 Real Bed 85.0 \n", "24788 16 50.0 50.0 20.0 Real Bed 85.0 \n", "25798 16 50.0 50.0 50.0 Real Bed 85.0 \n", "26048 16 50.0 50.0 50.0 Real Bed 85.0 \n", "26108 16 50.0 50.0 50.0 Real Bed 85.0 \n", "26121 16 50.0 50.0 50.0 Real Bed 85.0 " ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.query(\"(Price < 100) & (bathrooms >= 8)\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Editing tables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Modifying single values" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Pull-out Sofa'" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[1, \"bed_type\"]" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "db.loc[1, \"bed_type\"] = \"Pullout Sofa\"" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Pullout Sofa'" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.loc[1, \"bed_type\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Modifying blocks of values" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(['Observatoire', 'Hôtel-de-Ville', 'Opéra', 'Popincourt', 'Élysée',\n", " 'Panthéon', 'Entrepôt', 'Buttes-Montmartre', 'Gobelins',\n", " 'Buttes-Chaumont', 'Luxembourg', 'Louvre', 'Palais-Bourbon',\n", " 'Reuilly', 'Bourse', 'Ménilmontant', 'Vaugirard',\n", " 'Batignolles-Monceau', 'Temple', 'Passy'], dtype=object)" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.neighbourhood_cleansed.unique()" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "db.loc[db[\"neighbourhood_cleansed\"] == \"Hôtel-de-Ville\",\n", " \"neighbourhood_cleansed\"] = \"City Council\"" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(['Observatoire', 'City Council', 'Opéra', 'Popincourt', 'Élysée',\n", " 'Panthéon', 'Entrepôt', 'Buttes-Montmartre', 'Gobelins',\n", " 'Buttes-Chaumont', 'Luxembourg', 'Louvre', 'Palais-Bourbon',\n", " 'Reuilly', 'Bourse', 'Ménilmontant', 'Vaugirard',\n", " 'Batignolles-Monceau', 'Temple', 'Passy'], dtype=object)" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "db.neighbourhood_cleansed.unique()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Creating new columns and rows" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "db[\"more_beds_than_accomodates\"] = db[\"beds\"] > db[\"accommodates\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "**EXERCISE**\n", "\n", "- Find how many properties have more than ten bathrooms\n", "- Can you rent an AirBnb in Paris with only one bed but three bedrooms?\n", "- In which neighbourhoods can you rent an \"Airbed\"?\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Writing Data" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['to_clipboard',\n", " 'to_csv',\n", " 'to_dict',\n", " 'to_excel',\n", " 'to_feather',\n", " 'to_gbq',\n", " 'to_hdf',\n", " 'to_html',\n", " 'to_json',\n", " 'to_latex',\n", " 'to_markdown',\n", " 'to_numpy',\n", " 'to_parquet',\n", " 'to_period',\n", " 'to_pickle',\n", " 'to_records',\n", " 'to_sql',\n", " 'to_stata',\n", " 'to_string',\n", " 'to_timestamp',\n", " 'to_xarray']" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[i for i in dir(db) if i[:3]==\"to_\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Further Resources" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similar introduction:\n", "\n", "> [http://darribas.org/gds19/content/labs/lab_01.html](http://darribas.org/gds17/content/labs/lab_01.html)\n", "\n", "More stuff on indices:\n", "\n", "> [http://pandas.pydata.org/pandas-docs/stable/indexing.html](http://pandas.pydata.org/pandas-docs/stable/indexing.html)\n", "\n", "And for the pros:\n", "\n", "> [http://pandas.pydata.org/pandas-docs/stable/advanced.html](http://pandas.pydata.org/pandas-docs/stable/advanced.html)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.6" } }, "nbformat": 4, "nbformat_minor": 4 }