{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to pandas\n", "Adapted from \"10 minutes to pandas\":\n", "https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html#min\n", "\n", "See also the cheatsheet:\n", "https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "pandas provides two core data structures:\n", "- `Series`: a 1-dimensional labelled array\n", "- `DataFrame`: a 2-dimensional labelled table\n", "\n", "A NumPy array has one dtype for the entire array, while a pandas DataFrame has one dtype per column.\n", "This means a data frame can store a different type of object in each column,\n", "e.g., integers, floats, booleans, strings, dates." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Object creation\n", "Creating a Series by passing a list of values, letting pandas create \n", "a default integer index:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 1.0\n", "1 3.0\n", "2 5.0\n", "3 NaN\n", "4 6.0\n", "5 8.0\n", "dtype: float64" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s = pd.Series([1, 3, 5, np.nan, 6, 8])\n", "s" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Creating a DataFrame by passing a NumPy array,\n", "with a datetime index and labeled columns:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "DatetimeIndex(['2019-10-01', '2019-10-02', '2019-10-03', '2019-10-04',\n", " '2019-10-05', '2019-10-06', '2019-10-07', '2019-10-08',\n", " '2019-10-09', '2019-10-10', '2019-10-11', '2019-10-12',\n", " '2019-10-13', '2019-10-14', '2019-10-15', '2019-10-16'],\n", " 
dtype='datetime64[ns]', freq='D')" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dates = pd.date_range('20191001',periods=16)\n", "dates" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B C D\n", "2019-10-01 0.734178 0.401619 0.644419 0.285785\n", "2019-10-02 0.834516 0.467954 0.909069 0.949717\n", "2019-10-03 0.739908 0.141710 0.721865 0.726645\n", "2019-10-04 0.542031 0.256183 0.829637 0.649897\n", "2019-10-05 0.472053 0.363743 0.636865 0.106799\n", "2019-10-06 0.187648 0.032999 0.174756 0.793372\n", "2019-10-07 0.341833 0.116082 0.879330 0.225108\n", "2019-10-08 0.742160 0.901911 0.584958 0.118793\n", "2019-10-09 0.425493 0.350575 0.679385 0.477699\n", "2019-10-10 0.240525 0.135637 0.996856 0.339611\n", "2019-10-11 0.733559 0.076032 0.572530 0.410940\n", "2019-10-12 0.692441 0.004852 0.189972 0.762320\n", "2019-10-13 0.930907 0.451138 0.424450 0.030775\n", "2019-10-14 0.878115 0.884588 0.837395 0.034881\n", "2019-10-15 0.448987 0.350492 0.159661 0.012150\n", "2019-10-16 0.295813 0.883204 0.463057 0.709098" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(np.random.rand(16, 4), index=dates, columns=list('ABCD'))\n", "df\n", "# ?np.random.randint\n", "# np.random.randint(0,high=10,size=(3,4))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you don't like the column names, you can pass a list of strings instead:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B Cyder D\n", "2019-10-01 0.199455 0.944134 0.306606 0.915470\n", "2019-10-02 0.801711 0.896039 0.637553 0.102936\n", "2019-10-03 0.301331 0.338066 0.809526 0.940079\n", "2019-10-04 0.248794 0.218964 0.247322 0.880936\n", "2019-10-05 0.737916 0.227465 0.510530 0.791399\n", "2019-10-06 0.864473 0.104877 0.262396 0.260462\n", "2019-10-07 0.896652 0.660303 0.241162 0.764813\n", "2019-10-08 0.713356 0.841544 0.221016 0.364391\n", "2019-10-09 0.116807 0.030281 0.506326 0.664463\n", "2019-10-10 0.201539 0.150880 0.513292 0.247200\n", "2019-10-11 0.630659 0.756928 0.930413 0.852314\n", "2019-10-12 0.395964 0.548275 0.337590 0.515698\n", "2019-10-13 0.671036 0.635733 0.137816 0.067704\n", "2019-10-14 0.764604 0.855546 0.225112 0.311566\n", "2019-10-15 0.948676 0.649615 0.630545 0.427537\n", "2019-10-16 0.289602 0.263025 0.557809 0.546518" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(np.random.rand(16, 4), index=dates, columns=['A', 'B', 'Cyder', 'D'])\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or rename only some of the columns:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B buba D\n", "2019-10-01 0.199455 0.944134 0.306606 0.915470\n", "2019-10-02 0.801711 0.896039 0.637553 0.102936\n", "2019-10-03 0.301331 0.338066 0.809526 0.940079\n", "2019-10-04 0.248794 0.218964 0.247322 0.880936\n", "2019-10-05 0.737916 0.227465 0.510530 0.791399\n", "2019-10-06 0.864473 0.104877 0.262396 0.260462\n", "2019-10-07 0.896652 0.660303 0.241162 0.764813\n", "2019-10-08 0.713356 0.841544 0.221016 0.364391\n", "2019-10-09 0.116807 0.030281 0.506326 0.664463\n", "2019-10-10 0.201539 0.150880 0.513292 0.247200\n", "2019-10-11 0.630659 0.756928 0.930413 0.852314\n", "2019-10-12 0.395964 0.548275 0.337590 0.515698\n", "2019-10-13 0.671036 0.635733 0.137816 0.067704\n", "2019-10-14 0.764604 0.855546 0.225112 0.311566\n", "2019-10-15 0.948676 0.649615 0.630545 0.427537\n", "2019-10-16 0.289602 0.263025 0.557809 0.546518" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = df.rename(columns={'Cyder': 'buba'})\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Creating a DataFrame by passing a dictionary of objects \n", "that can be converted to Series-like structures:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B C D E F\n", "0 1.0 2013-01-02 1.0 3 test foo\n", "1 1.0 2013-01-02 1.0 3 train foo\n", "2 1.0 2013-01-02 1.0 3 test foo\n", "3 1.0 2013-01-02 1.0 3 train foo" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2 = pd.DataFrame({'A': 1.,\n", "                    'B': pd.Timestamp('20130102'),\n", "                    'C': pd.Series(1, index=list(range(4)), dtype='float32'),\n", "                    'D': np.array([3] * 4, dtype='int32'),\n", "                    'E': pd.Categorical([\"test\", \"train\", \"test\", \"train\"]),\n", "                    'F': 'foo'})\n", "df2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Manipulate" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data frame can be sorted in several ways, e.g.:\n", "- by row or column labels\n", "- by the values in a selected row or column" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " B A buba D\n", "2019-10-01 0.886582 0.939175 0.533153 0.765497\n", "2019-10-02 0.607290 0.876374 0.325079 0.887634\n", "2019-10-03 0.253545 0.170384 0.452498 0.148827\n", "2019-10-04 0.783512 0.634689 0.058327 0.749859\n", "2019-10-05 0.191081 0.003723 0.619222 0.058100\n", "2019-10-06 0.480161 0.472705 0.748015 0.322100\n", "2019-10-07 0.968769 0.078044 0.838767 0.023229\n", "2019-10-08 0.397543 0.796217 0.416777 0.214413\n", "2019-10-09 0.163434 0.190408 0.106236 0.654870\n", "2019-10-10 0.895548 0.769656 0.028251 0.530253\n", "2019-10-11 0.450941 0.854055 0.533613 0.267064\n", "2019-10-12 0.699253 0.097476 0.094920 0.081038\n", "2019-10-13 0.472829 0.814525 0.839644 0.091387\n", "2019-10-14 0.019850 0.388499 0.037473 0.759586\n", "2019-10-15 0.578925 0.167729 0.133013 0.172758\n", "2019-10-16 0.163998 0.905642 0.254254 0.570922" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# df2.sort_index(axis=1,ascending=True)\n", "# df2.sort_values(by='E',ascending=False)\n", "# df.sort_values(by='2019-10-12',axis=1,ascending=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now read the documentation (or cheatsheet) and explain what happens in each of the following lines:" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [], "source": [ "# df.T\n", "# pd.melt(df)\n", "# df2.pivot(columns='E')\n", "# df.drop(columns=['A'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Access data\n", "There are many ways to access data frame entries.\n", "\n", "Let's try different ways of selecting the first column:\n", "- by column name, used as an attribute of the data frame object\n", "- by column name, used as a key in square brackets\n", "- using the `.loc` indexer (select all rows and the column labelled 'A')\n", "- using the `.iloc` indexer (select all rows and the first column by position)\n", "\n", "Uncomment each line below and check the results:" ] }, { "cell_type": "code", "execution_count": 31, 
"metadata": {}, "outputs": [], "source": [ "# df.A\n", "# df['A']\n", "# df.loc[:,'A']\n", "# df.iloc[:,0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can access a range of entries:\n", "- by specifying start:end using row labels (both endpoints are included)\n", "- by specifying start:end using integer row positions (the end is excluded)\n", "- by passing lists of positions (they don't need to be consecutive)\n", "- by passing a list of boolean values (True = include data, False = exclude data)\n", "\n", "Uncomment each line below separately to see the results:" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": [ "# df[0:3]\n", "# df['20191001':'20191003']\n", "# df.iloc[0:3, :]\n", "# df.iloc[[1,4,13],[0,2]]\n", "# df.iloc[:,[True,True,False,False]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can select columns by matching their names against a regular expression\n", "(provided you named them sensibly).\n", "\n", "Let's find all the columns starting with an uppercase letter:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B D\n", "2019-10-01 0.199455 0.944134 0.915470\n", "2019-10-02 0.801711 0.896039 0.102936\n", "2019-10-03 0.301331 0.338066 0.940079\n", "2019-10-04 0.248794 0.218964 0.880936\n", "2019-10-05 0.737916 0.227465 0.791399\n", "2019-10-06 0.864473 0.104877 0.260462\n", "2019-10-07 0.896652 0.660303 0.764813\n", "2019-10-08 0.713356 0.841544 0.364391\n", "2019-10-09 0.116807 0.030281 0.664463\n", "2019-10-10 0.201539 0.150880 0.247200\n", "2019-10-11 0.630659 0.756928 0.852314\n", "2019-10-12 0.395964 0.548275 0.515698\n", "2019-10-13 0.671036 0.635733 0.067704\n", "2019-10-14 0.764604 0.855546 0.311566\n", "2019-10-15 0.948676 0.649615 0.427537\n", "2019-10-16 0.289602 0.263025 0.546518" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.filter(regex='^[A-Z]')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Multiple entries can be overwritten simultaneously.\n", "\n", "Explain what will change after running the following lines:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B buba D\n", "2019-10-01 3 0.300000 0.300000 1.200000\n", "2019-10-02 3 0.896039 0.637553 0.102936\n", "2019-10-03 3 0.338066 0.809526 0.940079\n", "2019-10-04 3 0.218964 0.247322 0.880936\n", "2019-10-05 3 0.227465 0.510530 0.791399\n", "2019-10-06 3 0.104877 0.262396 0.260462\n", "2019-10-07 3 0.660303 0.241162 0.764813\n", "2019-10-08 3 0.841544 0.221016 0.364391\n", "2019-10-09 3 0.030281 0.506326 0.664463\n", "2019-10-10 3 0.150880 0.513292 0.247200\n", "2019-10-11 3 0.756928 0.930413 0.852314\n", "2019-10-12 3 0.548275 0.337590 0.515698\n", "2019-10-13 3 0.635733 0.137816 0.067704\n", "2019-10-14 3 0.855546 0.225112 0.311566\n", "2019-10-15 3 0.649615 0.630545 0.427537\n", "2019-10-16 3 0.263025 0.557809 0.546518" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[dates[0], 'D'] = 1.2\n", "df.loc[dates[0], 'B':'buba'] = 0.3\n", "df['A'] = 3\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Select data\n", "\n", "You can apply a comparison to many data frame entries at once:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2019-10-01 True\n", "2019-10-02 True\n", "2019-10-03 True\n", "2019-10-04 True\n", "2019-10-05 True\n", "2019-10-06 True\n", "2019-10-07 True\n", "2019-10-08 True\n", "2019-10-09 True\n", "2019-10-10 True\n", "2019-10-11 True\n", "2019-10-12 True\n", "2019-10-13 True\n", "2019-10-14 True\n", "2019-10-15 True\n", "2019-10-16 True\n", "Freq: D, Name: B, dtype: bool" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.B>0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since we can access data with arrays of boolean values, we can combine the two.\n", "\n", "Explain what happens here:" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B buba D\n", "2019-10-02 3 0.896039 0.637553 0.102936\n", "2019-10-03 3 0.338066 0.809526 0.940079\n", "2019-10-07 3 0.660303 0.241162 0.764813\n", "2019-10-08 3 0.841544 0.221016 0.364391\n", "2019-10-11 3 0.756928 0.930413 0.852314\n", "2019-10-12 3 0.548275 0.337590 0.515698\n", "2019-10-13 3 0.635733 0.137816 0.067704\n", "2019-10-14 3 0.855546 0.225112 0.311566\n", "2019-10-15 3 0.649615 0.630545 0.427537" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[df.B>0.3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or here (uncomment each line):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# df[df.B+df.D>=df.A*df.buba]\n", "# df[df>.3]\n", "# df[df>.3].sort_values(by='B',na_position='first')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If the data frame contains objects encoded as different types,\n", "you can select each type separately.\n", "\n", "For instance, let's take only categorical variables:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " E\n", "0 test\n", "1 train\n", "2 test\n", "3 train" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2.select_dtypes(include='category')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Viewing data\n", "If the data set is large, you might want to glance at just a few rows:" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": [ "# df.head()\n", "# df.tail(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or get familiar with the row and column labels:" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [], "source": [ "# df.index\n", "# df.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You might want to have a look at some summary statistics:" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B buba D\n", "count 16.0 16.000000 16.000000 16.000000\n", "mean 3.0 0.467346 0.441775 0.558626\n", "std 0.0 0.293482 0.231618 0.327352\n", "min 3.0 0.030281 0.137816 0.067704\n", "25% 3.0 0.225340 0.245782 0.298790\n", "50% 3.0 0.443170 0.421958 0.531108\n", "75% 3.0 0.684459 0.575993 0.806628\n", "max 3.0 0.896039 0.930413 1.200000" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or compute only the statistics that interest you:" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "A 3.000000\n", "B 0.467346\n", "buba 0.441775\n", "D 0.558626\n", "dtype: float64" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# df.sum()\n", "# df.count()\n", "df.mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Even if you haven't imported Matplotlib yourself, pandas offers several plotting methods\n", "(they use Matplotlib behind the scenes, so it still has to be installed):" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[Figure: scatter plot of column D against column B]
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# df.plot.box()\n", "# df.plot.hist()\n", "df.plot.scatter(x='B',y='D')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 2 }