{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introduction to pandas\n",
"Adapted from \"10 minutes to pandas\":\n",
"https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html#min\n",
"\n",
"See also the cheatsheet:\n",
"https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"pandas provides two handy data structures:\n",
"- `Series`, a 1-dimensional labelled array\n",
"- `DataFrame`, a 2-dimensional labelled array\n",
"\n",
"NumPy arrays have one dtype for the entire array, while pandas DataFrames have one dtype per column.\n",
"This means a data frame can store a different type of object in each column,\n",
"e.g., integers, reals, booleans, strings, dates."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Object creation\n",
"Creating a Series by passing a list of values, letting pandas create \n",
"a default integer index:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"0 1.0\n",
"1 3.0\n",
"2 5.0\n",
"3 NaN\n",
"4 6.0\n",
"5 8.0\n",
"dtype: float64"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s = pd.Series([1,3,5,np.nan, 6, 8])\n",
"s"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Creating a DataFrame by passing a NumPy array,\n",
"with a datetime index and labeled columns:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"DatetimeIndex(['2019-10-01', '2019-10-02', '2019-10-03', '2019-10-04',\n",
" '2019-10-05', '2019-10-06', '2019-10-07', '2019-10-08',\n",
" '2019-10-09', '2019-10-10', '2019-10-11', '2019-10-12',\n",
" '2019-10-13', '2019-10-14', '2019-10-15', '2019-10-16'],\n",
" dtype='datetime64[ns]', freq='D')"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dates = pd.date_range('20191001',periods=16)\n",
"dates"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" A B C D\n",
"2019-10-01 0.734178 0.401619 0.644419 0.285785\n",
"2019-10-02 0.834516 0.467954 0.909069 0.949717\n",
"2019-10-03 0.739908 0.141710 0.721865 0.726645\n",
"2019-10-04 0.542031 0.256183 0.829637 0.649897\n",
"2019-10-05 0.472053 0.363743 0.636865 0.106799\n",
"2019-10-06 0.187648 0.032999 0.174756 0.793372\n",
"2019-10-07 0.341833 0.116082 0.879330 0.225108\n",
"2019-10-08 0.742160 0.901911 0.584958 0.118793\n",
"2019-10-09 0.425493 0.350575 0.679385 0.477699\n",
"2019-10-10 0.240525 0.135637 0.996856 0.339611\n",
"2019-10-11 0.733559 0.076032 0.572530 0.410940\n",
"2019-10-12 0.692441 0.004852 0.189972 0.762320\n",
"2019-10-13 0.930907 0.451138 0.424450 0.030775\n",
"2019-10-14 0.878115 0.884588 0.837395 0.034881\n",
"2019-10-15 0.448987 0.350492 0.159661 0.012150\n",
"2019-10-16 0.295813 0.883204 0.463057 0.709098"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.DataFrame(np.random.rand(16,4),index=dates,columns=list('ABCD'))\n",
"df\n",
"# ?np.random.randint\n",
"# np.random.randint(0,high=10,size=(3,4))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you don't like the column names, you can pass your own list of strings:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" A B Cyder D\n",
"2019-10-01 0.199455 0.944134 0.306606 0.915470\n",
"2019-10-02 0.801711 0.896039 0.637553 0.102936\n",
"2019-10-03 0.301331 0.338066 0.809526 0.940079\n",
"2019-10-04 0.248794 0.218964 0.247322 0.880936\n",
"2019-10-05 0.737916 0.227465 0.510530 0.791399\n",
"2019-10-06 0.864473 0.104877 0.262396 0.260462\n",
"2019-10-07 0.896652 0.660303 0.241162 0.764813\n",
"2019-10-08 0.713356 0.841544 0.221016 0.364391\n",
"2019-10-09 0.116807 0.030281 0.506326 0.664463\n",
"2019-10-10 0.201539 0.150880 0.513292 0.247200\n",
"2019-10-11 0.630659 0.756928 0.930413 0.852314\n",
"2019-10-12 0.395964 0.548275 0.337590 0.515698\n",
"2019-10-13 0.671036 0.635733 0.137816 0.067704\n",
"2019-10-14 0.764604 0.855546 0.225112 0.311566\n",
"2019-10-15 0.948676 0.649615 0.630545 0.427537\n",
"2019-10-16 0.289602 0.263025 0.557809 0.546518"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.DataFrame(np.random.rand(16,4),index=dates,columns=['A','B','Cyder','D'])\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"or rename only some of the columns:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" A B buba D\n",
"2019-10-01 0.199455 0.944134 0.306606 0.915470\n",
"2019-10-02 0.801711 0.896039 0.637553 0.102936\n",
"2019-10-03 0.301331 0.338066 0.809526 0.940079\n",
"2019-10-04 0.248794 0.218964 0.247322 0.880936\n",
"2019-10-05 0.737916 0.227465 0.510530 0.791399\n",
"2019-10-06 0.864473 0.104877 0.262396 0.260462\n",
"2019-10-07 0.896652 0.660303 0.241162 0.764813\n",
"2019-10-08 0.713356 0.841544 0.221016 0.364391\n",
"2019-10-09 0.116807 0.030281 0.506326 0.664463\n",
"2019-10-10 0.201539 0.150880 0.513292 0.247200\n",
"2019-10-11 0.630659 0.756928 0.930413 0.852314\n",
"2019-10-12 0.395964 0.548275 0.337590 0.515698\n",
"2019-10-13 0.671036 0.635733 0.137816 0.067704\n",
"2019-10-14 0.764604 0.855546 0.225112 0.311566\n",
"2019-10-15 0.948676 0.649615 0.630545 0.427537\n",
"2019-10-16 0.289602 0.263025 0.557809 0.546518"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df=df.rename(columns={'Cyder':'buba'})\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Creating a DataFrame by passing a dictionary of objects \n",
"that can be converted to a series-like structure:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" A B C D E F\n",
"0 1.0 2013-01-02 1.0 3 test foo\n",
"1 1.0 2013-01-02 1.0 3 train foo\n",
"2 1.0 2013-01-02 1.0 3 test foo\n",
"3 1.0 2013-01-02 1.0 3 train foo"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df2 = pd.DataFrame({'A': 1.,\n",
"                    'B': pd.Timestamp('20130102'),\n",
"                    'C': pd.Series(1, index=list(range(4)), dtype='float32'),\n",
"                    'D': np.array([3] * 4, dtype='int32'),\n",
"                    'E': pd.Categorical([\"test\", \"train\", \"test\", \"train\"]),\n",
"                    'F': 'foo'})\n",
"df2"
]
},
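{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick way to confirm that each column keeps its own dtype is the `dtypes` attribute.\n",
"The cell below is a minimal sketch on a throwaway frame; the name `mixed` and its column names\n",
"are illustrative assumptions, not part of the tutorial data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Hypothetical frame for illustration; scalar values are repeated for every row.\n",
"mixed = pd.DataFrame({'x': 1.0,\n",
"                      'when': pd.Timestamp('20130102'),\n",
"                      'n': np.array([3] * 4, dtype='int32'),\n",
"                      'group': pd.Categorical(['test', 'train', 'test', 'train']),\n",
"                      'label': 'foo'})\n",
"mixed.dtypes  # one dtype per column"
]
},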
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Manipulate"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The data frame can be sorted in several ways, e.g.:\n",
"- by row or column labels\n",
"- by the values in a selected row or column"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
" B A buba D\n",
"2019-10-01 0.886582 0.939175 0.533153 0.765497\n",
"2019-10-02 0.607290 0.876374 0.325079 0.887634\n",
"2019-10-03 0.253545 0.170384 0.452498 0.148827\n",
"2019-10-04 0.783512 0.634689 0.058327 0.749859\n",
"2019-10-05 0.191081 0.003723 0.619222 0.058100\n",
"2019-10-06 0.480161 0.472705 0.748015 0.322100\n",
"2019-10-07 0.968769 0.078044 0.838767 0.023229\n",
"2019-10-08 0.397543 0.796217 0.416777 0.214413\n",
"2019-10-09 0.163434 0.190408 0.106236 0.654870\n",
"2019-10-10 0.895548 0.769656 0.028251 0.530253\n",
"2019-10-11 0.450941 0.854055 0.533613 0.267064\n",
"2019-10-12 0.699253 0.097476 0.094920 0.081038\n",
"2019-10-13 0.472829 0.814525 0.839644 0.091387\n",
"2019-10-14 0.019850 0.388499 0.037473 0.759586\n",
"2019-10-15 0.578925 0.167729 0.133013 0.172758\n",
"2019-10-16 0.163998 0.905642 0.254254 0.570922"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# df2.sort_index(axis=1,ascending=True)\n",
"# df2.sort_values(by='E',ascending=False)\n",
"# df.sort_values(by='2019-10-12',axis=1,ascending=False)"
]
},
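{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two sorting styles can be sketched on a small made-up frame (the name `toy` and its contents\n",
"are assumptions chosen for illustration): `sort_index` orders by the labels, while `sort_values`\n",
"orders by the contents of a chosen column or row."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical frame with deliberately unordered labels.\n",
"toy = pd.DataFrame({'b': [2, 3, 1], 'a': [9, 7, 8]}, index=['y', 'z', 'x'])\n",
"\n",
"by_row_labels = toy.sort_index()                 # rows ordered by their labels\n",
"by_col_labels = toy.sort_index(axis=1)           # columns ordered by their labels\n",
"by_col_values = toy.sort_values(by='b')          # rows ordered by the values in column 'b'\n",
"by_row_values = toy.sort_values(by='x', axis=1)  # columns ordered by the values in row 'x'\n",
"by_row_values"
]
},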
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now read the documentation (or cheatsheet) and explain what happens in each of the following lines:"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# df.T\n",
"# pd.melt(df)\n",
"# df2.pivot(columns='E')\n",
"# df.drop(columns=['A'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Access data\n",
"There are many ways to access data frame entries.\n",
"\n",
"Let's try different ways of selecting the first column:\n",
"- by attribute access (a column whose name is a valid Python identifier is exposed as an attribute of the data frame)\n",
"- by column name in square brackets\n",
"- using the `.loc` indexer (select all rows and the column labelled 'A')\n",
"- using the `.iloc` indexer (select all rows and the first column by position)\n",
"\n",
"Uncomment each line below and check the results:"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# df.A\n",
"# df['A']\n",
"# df.loc[:,'A']\n",
"# df.iloc[:,0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can access a range of entries:\n",
"- by specifying start:end using row labels\n",
"- by specifying start:end using integer row positions\n",
"- by passing lists of integer positions (they don't need to be consecutive)\n",
"- by passing a list of boolean values (True = include data, False = exclude data)\n",
"\n",
"Uncomment each line below separately to see the results:"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# df[0:3]\n",
"# df['20191001':'20191003']\n",
"# df.iloc[0:3, :]\n",
"# df.iloc[[1,4,13],[0,2]]\n",
"# df.iloc[:,[True,True,False,False]]"
]
},
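{
"cell_type": "markdown",
"metadata": {},
"source": [
"One detail that is easy to miss when slicing: a positional slice excludes its end point,\n",
"while a label slice includes it. The cell below is a small sketch on a made-up frame\n",
"(`toy` and its labels are illustrative assumptions)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical frame with string row labels.\n",
"toy = pd.DataFrame({'x': range(5)}, index=['a', 'b', 'c', 'd', 'e'])\n",
"\n",
"by_position = toy.iloc[0:2]    # positions 0 and 1: the end position is excluded\n",
"by_label = toy.loc['a':'c']    # labels 'a', 'b' and 'c': the end label is included\n",
"list(by_position.index), list(by_label.index)"
]
},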
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can look up column names using a regular expression\n",
"(provided you named them sensibly).\n",
"\n",
"Let's find all the columns starting with an uppercase letter:"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" A B D\n",
"2019-10-01 0.199455 0.944134 0.915470\n",
"2019-10-02 0.801711 0.896039 0.102936\n",
"2019-10-03 0.301331 0.338066 0.940079\n",
"2019-10-04 0.248794 0.218964 0.880936\n",
"2019-10-05 0.737916 0.227465 0.791399\n",
"2019-10-06 0.864473 0.104877 0.260462\n",
"2019-10-07 0.896652 0.660303 0.764813\n",
"2019-10-08 0.713356 0.841544 0.364391\n",
"2019-10-09 0.116807 0.030281 0.664463\n",
"2019-10-10 0.201539 0.150880 0.247200\n",
"2019-10-11 0.630659 0.756928 0.852314\n",
"2019-10-12 0.395964 0.548275 0.515698\n",
"2019-10-13 0.671036 0.635733 0.067704\n",
"2019-10-14 0.764604 0.855546 0.311566\n",
"2019-10-15 0.948676 0.649615 0.427537\n",
"2019-10-16 0.289602 0.263025 0.546518"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.filter(regex='^[A-Z]')"
]
},
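{
"cell_type": "markdown",
"metadata": {},
"source": [
"`DataFrame.filter` applies the pattern with a search, so an unanchored pattern matches anywhere\n",
"in the column name. The sketch below, on a made-up frame (`toy` and its column names are\n",
"assumptions for illustration), contrasts an unanchored pattern with one anchored at the start by `^`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical column names: one starts with, one lacks, one ends with an uppercase letter.\n",
"toy = pd.DataFrame([[1, 2, 3]], columns=['Alpha', 'beta', 'gammaX'])\n",
"\n",
"contains_upper = toy.filter(regex='[A-Z]')   # uppercase letter anywhere in the name\n",
"starts_upper = toy.filter(regex='^[A-Z]')    # uppercase letter at the start of the name\n",
"list(contains_upper.columns), list(starts_upper.columns)"
]
},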
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Multiple entries can be overwritten simultaneously.\n",
"\n",
"Explain what will change after running the following lines:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" A B buba D\n",
"2019-10-01 3 0.300000 0.300000 1.200000\n",
"2019-10-02 3 0.896039 0.637553 0.102936\n",
"2019-10-03 3 0.338066 0.809526 0.940079\n",
"2019-10-04 3 0.218964 0.247322 0.880936\n",
"2019-10-05 3 0.227465 0.510530 0.791399\n",
"2019-10-06 3 0.104877 0.262396 0.260462\n",
"2019-10-07 3 0.660303 0.241162 0.764813\n",
"2019-10-08 3 0.841544 0.221016 0.364391\n",
"2019-10-09 3 0.030281 0.506326 0.664463\n",
"2019-10-10 3 0.150880 0.513292 0.247200\n",
"2019-10-11 3 0.756928 0.930413 0.852314\n",
"2019-10-12 3 0.548275 0.337590 0.515698\n",
"2019-10-13 3 0.635733 0.137816 0.067704\n",
"2019-10-14 3 0.855546 0.225112 0.311566\n",
"2019-10-15 3 0.649615 0.630545 0.427537\n",
"2019-10-16 3 0.263025 0.557809 0.546518"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc[dates[0], 'D']=1.2\n",
"df.loc[dates[0], 'B':'buba']=0.3\n",
"df['A']=3\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Select data\n",
"\n",
"You can perform logical operations on multiple data frame entries at the same time:"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"2019-10-01 True\n",
"2019-10-02 True\n",
"2019-10-03 True\n",
"2019-10-04 True\n",
"2019-10-05 True\n",
"2019-10-06 True\n",
"2019-10-07 True\n",
"2019-10-08 True\n",
"2019-10-09 True\n",
"2019-10-10 True\n",
"2019-10-11 True\n",
"2019-10-12 True\n",
"2019-10-13 True\n",
"2019-10-14 True\n",
"2019-10-15 True\n",
"2019-10-16 True\n",
"Freq: D, Name: B, dtype: bool"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.B>0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since we can access data with arrays of logical values, then...\n",
"\n",
"Explain what happens here:"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" A B buba D\n",
"2019-10-02 3 0.896039 0.637553 0.102936\n",
"2019-10-03 3 0.338066 0.809526 0.940079\n",
"2019-10-07 3 0.660303 0.241162 0.764813\n",
"2019-10-08 3 0.841544 0.221016 0.364391\n",
"2019-10-11 3 0.756928 0.930413 0.852314\n",
"2019-10-12 3 0.548275 0.337590 0.515698\n",
"2019-10-13 3 0.635733 0.137816 0.067704\n",
"2019-10-14 3 0.855546 0.225112 0.311566\n",
"2019-10-15 3 0.649615 0.630545 0.427537"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[df.B>0.3]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"or here (uncomment each line):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# df[df.B+df.D>=df.A*df.buba]\n",
"# df[df>.3]\n",
"# df[df>.3].sort_values(by='B',na_position='first')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the data frame contains objects encoded as different types,\n",
"you can select each type separately.\n",
"\n",
"For instance, let's take only categorical variables:"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" E\n",
"0 test\n",
"1 train\n",
"2 test\n",
"3 train"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df2.select_dtypes(include='category')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Viewing data\n",
"If the data is too big you might want to have only a glimpse on a couple of instances:"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# df.head()\n",
"# df.tail(3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"or get familiar with the column and row names:"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# df.index\n",
"# df.columns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You might want to have a look at some summary statistics:"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" A B buba D\n",
"count 16.0 16.000000 16.000000 16.000000\n",
"mean 3.0 0.467346 0.441775 0.558626\n",
"std 0.0 0.293482 0.231618 0.327352\n",
"min 3.0 0.030281 0.137816 0.067704\n",
"25% 3.0 0.225340 0.245782 0.298790\n",
"50% 3.0 0.443170 0.421958 0.531108\n",
"75% 3.0 0.684459 0.575993 0.806628\n",
"max 3.0 0.896039 0.930413 1.200000"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.describe()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or only at the statistics that interest you:"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"A 3.000000\n",
"B 0.467346\n",
"buba 0.441775\n",
"D 0.558626\n",
"dtype: float64"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# df.sum()\n",
"# df.count()\n",
"df.mean()"
]
},
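{
"cell_type": "markdown",
"metadata": {},
"source": [
"By default these reductions work down each column. A minimal sketch, assuming a small made-up\n",
"frame named `toy`, shows how the `axis` argument switches to one result per row instead."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical frame for illustration.\n",
"toy = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})\n",
"\n",
"per_column = toy.mean()      # default axis=0: one value per column\n",
"per_row = toy.mean(axis=1)   # axis=1: one value per row\n",
"per_column, per_row"
]
},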
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you haven't loaded Matplotlib (and you should have!)\n",
"you still have several options to plot the data:"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAATA0lEQVR4nO3df4xd513n8fdnnIntxWlreQYJeRycVVwtUeSmaDYt6z/a0qJN+ofzhysUa7vQVaglloDYViTZZVWq8BeuEBIibLFK1QWJhoARsVBLkCCoq0JQJtvEIukGmbTgaVbKdOp2a9V2xsx3/5jrdjI/PNede+6v835Jlu655/G93zwZ++PzPOd5TqoKSVJ7TQy6AEnSYBkEktRyBoEktZxBIEktZxBIUsvdNOgCbtTU1FQdPHhw0GVI0kh57rnnvl5V0xudG7kgOHjwIHNzc4MuQ5JGSpJ/2uycQ0OS1HIGgSS1nEEgSS1nEEhSyxkEktRyBoEktZxBIEkt11gQJPl0kteS/P0m5/9DkrOdX3+T5G1N1aLBW7x4hRfOf5PFi1cGXYqkNZpcUPYZ4LeA39vk/FeAd1XVhST3AqeAdzRYjwbkyee/xsOnzzI5McHS8jInjx3m6F37B12WpI7Grgiq6gvAN65z/m+q6kLn8BlgpqlaNDiLF6/w8OmzXF5a5ttXrnJ5aZmHTp/1ykAaIsMyR/AA8PnNTiY5kWQuydzCwkIfy9J2zV+4xOTEG3/MJicmmL9waUAVSVpr4EGQ5D2sBMHDm7WpqlNVNVtVs9PTG+6ZpCE1s3c3S8vLb3hvaXmZmb27B1SRpLUGGgRJDgOfAu6rqsVB1qJm7Nuzk5PHDrNrcoJbdt7ErskJTh47zL49OwddmqSOge0+muRW4E+A/1hV/zCoOtS8o3ft58jtU8xfuMTM3t2GgDRkGguCJJ8F3g1MJZkHfgWYBKiqTwIfA/YBv50E4GpVzTZVjwZr356dBoA0pBoLgqo6vsX5nwF+pqnvlyR1Z+CTxZKkwTIIJKnlDAJJajmDQJJaziCQpJYzCCSp5QwCSWo5g0CSWs4gkKSWMwgkqeUMAklqOYNAklrOIJCkljMIJKnlDAJJajmDQJJaziCQpJYzCCSp5QwCSWo5g0CSWs4gkKSWMwgkqeUMAklqOYNAklrOIJCklmssCJJ8OslrSf5+k/NJ8ptJziU5m+RHm6pFkrS5Jq8IPgPcc53z9wKHOr9OAP+jwVokSZtoLAiq6gvAN67T5D7g92rFM8BbkvxQU/VIkjY2yDmC/cD5VcfznffWSXIiyVySuYWFhb4UJ0ltMcggyAbv1UYNq+pUVc1W1ez09HTDZUlSuwwyCOaBA6uOZ4BXB1SLJLXWIIPgDPBTnbuH3gl8q6r+7wDrkaRWuqmpD07yWeDdwFSSeeBXgEmAqvok8Dng/cA54DvAf2qqFknS5hoLgqo6vsX5An6uqe+XJHXHlcWS1HIGgSS1nEGgobV48QovnP8mixevDLoUaaw1NkcgbceTz3+Nh0+fZXJigqXlZU4eO8zRuzZcbyhpm7wi0NBZvHiFh0+f5fLSMt++cpXLS8s8dPqsVwZSQwwCDZ35C5eYnHjjj+bkxATzFy4NqCJpvBkEGjoze3eztLz8hveWlpeZ2bt7QBVJ480g0NDZt2cnJ48dZtfkBLfsvIldkxOcPHaYfXt2Dro0aSw5WayhdPSu/Ry5fYr5C5eY2bvbEJAaZBBoaO3bs9MAkPrAoSFJajmDQJJaziBoKVftSrrGOYIWctWupNW8ImgZV+1KWssgaBlX7UpayyBoGVftSlrLIGgZV+2qTbwpojtOFreQq3bVBt4U0T2DoKVctatxtvqmiMusDIU+dPosR26f8ud+Aw4NSRo73hRxYwwCSWPHmyJujEEgaex4U8SNcY5A0ljypojuGQSSxpY3RXSn0aGhJPckeTnJuSSPbHD+1i
RPJ/lSkrNJ3t9kPZKk9RoLgiQ7gMeAe4E7gONJ7ljT7L8DT1TV24H7gd9uqh5J0saavCK4GzhXVa9U1evA48B9a9oU8KbO6zcDrzZYjzbh6kup3ZqcI9gPnF91PA+8Y02bjwN/keTngR8A3tdgPdqAqy8lNXlFkA3eqzXHx4HPVNUM8H7g95OsqynJiSRzSeYWFhYaKLWd3JJaEjQbBPPAgVXHM6wf+nkAeAKgqv4W2AVMrf2gqjpVVbNVNTs9Pd1Que3j6ktJ0GwQPAscSnJbkptZmQw+s6bNPwPvBUjyI6wEgf/k7xNXX0qCBoOgqq4CDwJPAV9m5e6gF5M8muRop9lHgQ8neQH4LPChqlo7fKSGuPpSTfImhNGRUft7d3Z2tubm5gZdxlhZvHjF1ZfqKW9CGD5Jnquq2Y3OubJYrr5UT7kF9Ohx0zlJPeVNCKPHIJDUU96EMHoMAkk95U0Io8c5Akk95xbQo8UgkNQIb0IYHQ4NSVLLGQSS1HIGgSS1nEEgSS1nEEhSyxkEktRyBoEktZxBIEktZxBIUsu1Mgh8YIYkfU/rtpjwgRmS9EatuiJY/cCMb1+5yuWlZR46fdYrA0mt1qog8IEZkrReq4LAB2ZI0nqtCgIfmCFJ67VustgHZkjSG3UdBEmmAapqobly+sMHZkjS91x3aCgrPp7k68D/Af4hyUKSj/WnPElS07aaI/hF4Ajwb6tqX1XtBd4BHEnyXxqvTpLUuK2C4KeA41X1lWtvVNUrwAc75yRJI26rIJisqq+vfbMzTzC51YcnuSfJy0nOJXlkkzY/meSlJC8m+YPuypYk9cpWk8Wvf5/nSLIDeAz4CWAeeDbJmap6aVWbQ8B/BY5U1YUkP9hd2ZKkXtkqCN6W5P9t8H6AXVv83ruBc52hJJI8DtwHvLSqzYeBx6rqAkBVvdZV1ZKknrluEFTVjm189n7g/KrjeVYmmld7K0CSLwI7gI9X1Z+v/aAkJ4ATALfeeus2SpIkrdXkyuJs8F6tOb4JOAS8GzgOfCrJW9b9pqpTVTVbVbPT09M9L1SS2qzJIJgHDqw6ngFe3aDNk1W11Lkz6WVWgkGS1CdNBsGzwKEktyW5GbgfOLOmzZ8C7wFIMsXKUNErDdYkSVqjsSCoqqvAg8BTwJeBJ6rqxSSPJjnaafYUsJjkJeBp4JeqarGpmiRJ66Vq7bD9cJudna25ublBlyFJIyXJc1U1u9G5Vm1Drd7wmc/SeGndNtTaHp/5LI0frwjUNZ/5LI0ng0Bd85nP0ngyCNQ1n/ksjSeDQF3zmc/Dx4l79YKTxbohPvN5eDhxr14xCHTDfObz4K2euL/MynDdQ6fPcuT2Kf/f6IY5NCSNICfu1UsGgTSCnLhXLxkE0ghy4l695ByBNKKcuFevGATSCHPiXr3g0JAktZxBIEktZxBIUssZBJLUcgaBJLWcQSBJLWcQSFLLGQSSesItsUeXC8okbZtbYo82rwgkbYvPsh59BoGkbXFL7NFnEEjaFrfEHn2NBkGSe5K8nORckkeu0+4DSSrJbJP1SOo9t8QefY1NFifZATwG/AQwDzyb5ExVvbSm3S3ALwB/11QtkprlltijrckrgruBc1X1SlW9DjwO3LdBu18FTgKXG6xFUsP27dnJ2w68xRAYQU0GwX7g/Krj+c5735Xk7cCBqvqz631QkhNJ5pLMLSws9L5SSWqxJoMgG7xX3z2ZTAC/AXx0qw+qqlNVNVtVs9PT0z0sUZLUZBDMAwdWHc8Ar646vgW4E/jrJF8F3gmcccJYkvqrySB4FjiU5LYkNwP3A2eunayqb1XVVFUdrKqDwDPA0aqaa7AmSWrcqG230dhdQ1V1NcmDwFPADuDTVfVikkeBuao6c/1PkKTRM4rbbaSqtm41RGZnZ2tuzosGScNn8eIVjvzaX3F56XsL7HZNTvDFh3984HdTJXmuqjYcendlsST1yKhut2EQSFKPjOp2GwaBJPXIqG634fMIJK
mHRnG7DYNAknps356dIxEA1zg0JEktZxBIUsOGfYGZQ0OS1KBRWGDmFYEkNWRUnudsEEhSQ0ZlgZlBIEkNGZUFZgaBJDVkVBaYOVksSQ0ahQVmBoEkNWzYF5g5NCRJLWcQNGjYF5FIEjg01JhRWEQiSeAVQSN6sYjEqwlJ/eIVQQOuLSK5zPfuH762iKSbCSOvJiT1k1cEDdjOIpJRWZIuaXwYBA3YziKSUVmSLml8ODTUkO93EcmoLEmXND68ImjQvj07eduBt9zQQpJRWZIuaXx4RTCERmFJuqTxYRAMqWFfki5pfDQ6NJTkniQvJzmX5JENzn8kyUtJzib5yyQ/3GQ9kqT1GguCJDuAx4B7gTuA40nuWNPsS8BsVR0G/hg42VQ9kqSNNXlFcDdwrqpeqarXgceB+1Y3qKqnq+o7ncNngJkG65EkbaDJINgPnF91PN95bzMPAJ/f6ESSE0nmkswtLCz0sERJUpNBkA3eqw0bJh8EZoFPbHS+qk5V1WxVzU5PT/ewRElSk3cNzQMHVh3PAK+ubZTkfcAvA++qKvdRkKQ+a/KK4FngUJLbktwM3A+cWd0gyduB3wGOVtVrDdYiSdpEY0FQVVeBB4GngC8DT1TVi0keTXK00+wTwB7gj5I8n+TMJh8nSWpIowvKqupzwOfWvPexVa/f1+T3S5K25l5DktRyBoEktZxBIEktZxBIUssZBJLUcgaBJLWcQSBJLWcQSFLLGQSSNKQWL17hhfPfZPFis9uw+ahKSRpCTz7/NR4+fZbJiQmWlpc5eewwR++63k7+3z+vCCRpyCxevMLDp89yeWmZb1+5yuWlZR46fbaxKwODQJKGzPyFS0xOvPGv58mJCeYvXGrk+wwCSRoyM3t3s7S8/Ib3lpaXmdm7u5HvMwgkacjs27OTk8cOs2tyglt23sSuyQlOHjvMvj07G/k+J4sl3bDFi1eYv3CJmb27G/vLqe2O3rWfI7dP9aWfDQJJN6Sfd7O03b49O/sStA4NSepav+9mUX8YBJK61u+7WdQfBoGkrvX7bhb1h0EgqWv9vptF/eFksaQb0s+7WdQfBoGkG9avu1nUHw4NSVLLGQSS1HIGgSS1nEEgSS3XaBAkuSfJy0nOJXlkg/M7k/xh5/zfJTnYZD2SpPUaC4IkO4DHgHuBO4DjSe5Y0+wB4EJV3Q78BvBrTdUjSdpYk1cEdwPnquqVqnodeBy4b02b+4D/2Xn9x8B7k6TBmiRJazS5jmA/cH7V8Tzwjs3aVNXVJN8C9gFfX90oyQngROfwYpKXN/nOqbW/V/bJJuyX9eyT9capT354sxNNBsFG/7Kv76MNVXUKOLXlFyZzVTXbXXntYJ9szH5Zzz5Zry190uTQ0DxwYNXxDPDqZm2S3AS8GfhGgzVJktZoMgieBQ4luS3JzcD9wJk1bc4AP915/QHgr6pq3RWBJKk5jQ0Ndcb8HwSeAnYAn66qF5M8CsxV1Rngd4HfT3KOlSuB+7f5tVsOH7WQfbIx+2U9+2S9VvRJ/Ae4JLWbK4slqeUMAklquZEMAreuWK+LPvlIkpeSnE3yl0k2vad4XGzVJ6vafSBJJRn72wShu35J8pOdn5cXk/xBv2vsty7+/Nya5OkkX+r8GXr/IOpsTFWN1C9WJp7/EfjXwM3AC8Ada9r8Z+CTndf3A3846LqHoE/eA/yrzuuftU++2+4W4AvAM8DsoOsehn4BDgFfAvZ2jn9w0HUPQZ+cAn628/oO4KuDrruXv0bxisCtK9bbsk+q6umq+k7n8BlW1nWMs25+TgB+FTgJXO5ncQPUTb98GHisqi4AVNVrfa6x37rpkwLe1Hn9ZtaviRppoxgEG21dsX+zNlV1Fbi2dcW46qZPVnsA+HyjFQ3eln2S5O3Agar6s34WNmDd/Ky8FXhrki8meSbJPX2rbjC66ZOPAx9MMg98Dvj5/pTWH6P4zOKebV0xRrr+703yQWAWeFejFQ3edfskyQQrO95+qF
8FDYluflZuYmV46N2sXDn+ryR3VtU3G65tULrpk+PAZ6rq15P8GCvrn+6squXmy2veKF4RuHXFet30CUneB/wycLSqrvSptkHZqk9uAe4E/jrJV4F3AmdaMGHc7Z+fJ6tqqaq+ArzMSjCMq2765AHgCYCq+ltgFysb0o2FUQwCt65Yb8s+6QyD/A4rITDuY76wRZ9U1beqaqqqDlbVQVbmTY5W1dxgyu2bbv78/CkrNxeQZIqVoaJX+lplf3XTJ/8MvBcgyY+wEgQLfa2yQSMXBJ0x/2tbV3wZeKI6W1ckOdpp9rvAvs7WFR8BNr11cBx02SefAPYAf5Tk+SRrf9DHSpd90jpd9stTwGKSl4CngV+qqsXBVNy8Lvvko8CHk7wAfBb40Dj949ItJiSp5UbuikCS1FsGgSS1nEEgSS1nEEhSyxkEktRyBoG0TUn+pXNL7gtJ/neSfzfomqQb4e2j0jYluVhVezqv/z3w36pq3Lfw0BjxikDqrTcBFwZdhHQjRnHTOWnY7E7yPCvbDvwQ8OMDrke6IQ4NSdu0Zmjox4BPAXeO0xYEGm8ODUk91NmZcgqYHnQtUrcMAqmHkvwbVh59OLabtGn8OEcgbd+1OQJYecjJT1fVvwyyIOlGOEcgSS3n0JAktZxBIEktZxBIUssZBJLUcgaBJLWcQSBJLWcQSFLL/X/IxFDGYF0sZgAAAABJRU5ErkJggg==\n",
"text/plain": [
"