{ "cells": [ { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "emk6sw0OaVxc" }, "source": [ "# Using the pandas module\n", "The `csv` module used previously does nothing more than read structured data; it makes no attempt to interpret it. We noticed this when every numeric value, read as a character string, had to be converted with `int()`. \n", "The `pandas` library, by contrast, is designed specifically for data analysis: it infers column types on loading and is therefore far better suited to this kind of work." ] },
{ "cell_type": "code", "execution_count": 4, "metadata": { "colab": {}, "colab_type": "code", "collapsed": true, "id": "L9Yp-3USaVyY", "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "import pandas as pd  # import the pandas module, conventionally abbreviated \"pd\"" ] },
{ "cell_type": "code", "execution_count": 5, "metadata": { "colab": {}, "colab_type": "code", "collapsed": true, "id": "W9JdmXZiaVzo", "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "df = pd.read_csv('data/top14.csv', encoding='utf-8')" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "hLY3q0vEaVz4" }, "source": [ "The variable is conventionally named `df`, for *dataframe* (a table of data)." ] },
{ "cell_type": "code", "execution_count": 6, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 35 }, "colab_type": "code", "collapsed": true, "id": "h310TURAaVz7", "jupyter": { "outputs_hidden": true }, "outputId": "bd0f51db-0f55-43aa-b0c6-439232f3bb62" }, "outputs": [ { "data": { "text/plain": [ "pandas.core.frame.DataFrame" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(df)" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "lTTRshytaV0P" }, "source": [ "## First look at the data file" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "G7BhHDnvaV0S" }, "source": [ "What does the variable `df` contain?" ] },
{ "cell_type": "code", "execution_count": 7, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 424 }, "colab_type": "code", "collapsed": true, "id": "9l4V561kaV0W", "jupyter": { "outputs_hidden": true }, "outputId": "ad37fa3f-b8d6-419a-8a93-8cf9d52ff65c" }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Equipe Nom Poste Date de naissance Taille Poids\n", "0 Agen Anton PEIKRISHVILI Pilier 18/09/1987 183 122\n", "1 Agen Dave RYAN Pilier 21/04/1986 183 116\n", "2 Agen Giorgi TETRASHVILI Pilier 31/08/1993 177 112\n", "3 Agen Kamaliele TUFELE Pilier 11/10/1995 182 123\n", "4 Agen Malino VANAÏ Pilier 04/05/1993 183 119\n", ".. ... ... ... ... ... ...\n", "590 Toulouse Werner KOK Ailier 27/01/1993 177 78\n", "591 Toulouse Yoann HUGET Ailier 02/06/1987 190 97\n", "592 Toulouse Matthis LEBEL Arrière 25/03/1999 185 91\n", "593 Toulouse Maxime MÉDARD Arrière 16/11/1986 180 85\n", "594 Toulouse Thomas RAMOS Arrière 23/07/1995 178 86\n", "\n", "[595 rows x 6 columns]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "vHakvyjsaV0j" }, "source": [ "The data is shown in the file's original order. \n", "The `head()` method returns only the first rows of the table and `tail()` only the last ones. Both accept an integer argument giving the number of rows to return (5 by default)." ] },
{ "cell_type": "code", "execution_count": 8, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "colab_type": "code", "collapsed": true, "id": "ffxCWRv-aV0m", "jupyter": { "outputs_hidden": true }, "outputId": "09c59d55-73f2-4ddf-ed05-b2a0813cba8a" }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Equipe Nom Poste Date de naissance Taille Poids\n", "0 Agen Anton PEIKRISHVILI Pilier 18/09/1987 183 122\n", "1 Agen Dave RYAN Pilier 21/04/1986 183 116\n", "2 Agen Giorgi TETRASHVILI Pilier 31/08/1993 177 112\n", "3 Agen Kamaliele TUFELE Pilier 11/10/1995 182 123\n", "4 Agen Malino VANAÏ Pilier 04/05/1993 183 119" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "colab_type": "code", "collapsed": true, "id": "cnX7EHQxaV0w", "jupyter": { "outputs_hidden": true }, "outputId": "57d91e89-eb49-4443-9d0c-5201f242831f" }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Equipe Nom Poste Date de naissance Taille Poids\n", "590 Toulouse Werner KOK Ailier 27/01/1993 177 78\n", "591 Toulouse Yoann HUGET Ailier 02/06/1987 190 97\n", "592 Toulouse Matthis LEBEL Arrière 25/03/1999 185 91\n", "593 Toulouse Maxime MÉDARD Arrière 16/11/1986 180 85\n", "594 Toulouse Thomas RAMOS Arrière 23/07/1995 178 86" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.tail()" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 143 }, "colab_type": "code", "collapsed": true, "id": "LcibjQkVaV07", "jupyter": { "outputs_hidden": true }, "outputId": "ccff9aac-34c5-458e-c4e2-e0a3a2dd92f5" }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Equipe Nom Poste Date de naissance Taille Poids\n", "0 Agen Anton PEIKRISHVILI Pilier 18/09/1987 183 122\n", "1 Agen Dave RYAN Pilier 21/04/1986 183 116\n", "2 Agen Giorgi TETRASHVILI Pilier 31/08/1993 177 112" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head(3)" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "evMRdFilaV1G" }, "source": [ "For an overview of the table's structure (column names, number of non-null values, data types), use `df.info()`." ] },
{ "cell_type": "code", "execution_count": 11, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 213 }, "colab_type": "code", "collapsed": true, "id": "7Vbi029jaV1J", "jupyter": { "outputs_hidden": true }, "outputId": "4c5114d1-8d11-4208-f317-bdae5f96e79b" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "RangeIndex: 595 entries, 0 to 594\n", "Data columns (total 6 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 Equipe 595 non-null object\n", " 1 Nom 595 non-null object\n", " 2 Poste 595 non-null object\n", " 3 Date de naissance 595 non-null object\n", " 4 Taille 595 non-null int64 \n", " 5 Poids 595 non-null int64 \n", "dtypes: int64(2), object(4)\n", "memory usage: 28.0+ KB\n" ] } ], "source": [ "df.info()" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "N2K-Pd-MaV1Z" }, "source": [ "To access a particular player's record, use the `loc[]` indexer, which selects rows by index label. Note the square brackets: `loc` is not called like a function." ] },
{ "cell_type": "code", "execution_count": 12, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 141 }, "colab_type": "code", "collapsed": true, "id": "Ud31B7YsaV1c", "jupyter": { "outputs_hidden": true }, "outputId": "381be15b-cc3d-4355-8dee-e0f69aad2fd2" }, "outputs": [ { "data": { "text/plain": [ "Equipe Bayonne\n", "Nom Torsten VAN JAARSVELD\n", "Poste Talonneur\n", "Date de naissance 
30/06/1987\n", "Taille 175\n", "Poids 106\n", "Name: 45, dtype: object" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[45]" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "7mAB7-ofaV1z" }, "source": [ "## Extracting columns, creating plots \n", "To extract only the numeric data of the `Poids` (weight) column, simply write:" ] },
{ "cell_type": "code", "execution_count": 13, "metadata": { "colab": {}, "colab_type": "code", "collapsed": true, "id": "zd3nVgSRaV12", "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "poids = df['Poids']" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "G7B9_CdxaV2D" }, "source": [ "Note that the variable `poids` is not a list containing `[122,116,112,...]` but a `pandas`-specific type called \"Series\"." ] },
{ "cell_type": "code", "execution_count": 14, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 230 }, "colab_type": "code", "collapsed": true, "id": "yIrGAyBcaV2F", "jupyter": { "outputs_hidden": true }, "outputId": "88c26953-30fb-46a0-977f-539b99f62f34" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 122\n", "1 116\n", "2 112\n", "3 123\n", "4 119\n", " ... 
\n", "590 78\n", "591 97\n", "592 91\n", "593 85\n", "594 86\n", "Name: Poids, Length: 595, dtype: int64\n" ] } ], "source": [ "print(poids)" ] },
{ "cell_type": "code", "execution_count": 15, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 35 }, "colab_type": "code", "collapsed": true, "id": "gdNVPJkyaV2O", "jupyter": { "outputs_hidden": true }, "outputId": "59c2b9d0-c059-4f43-93b8-45eb6208f75d" }, "outputs": [ { "data": { "text/plain": [ "pandas.core.series.Series" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(poids)" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "XU5OJi-MaV2Y" }, "source": [ "It can nevertheless be used much like an ordinary list. Here `poids[0]` works because the row labels are the integers 0 to 594; to select by position whatever the labels are, use `poids.iloc[0]`." ] },
{ "cell_type": "code", "execution_count": 16, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 35 }, "colab_type": "code", "collapsed": true, "id": "vwXS6I84aV2a", "jupyter": { "outputs_hidden": true }, "outputId": "b3b7391f-68ea-49c9-de82-c9bf54917dc3" }, "outputs": [ { "data": { "text/plain": [ "122" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "poids[0]" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "Pj5eT9KFaV2i" }, "source": [ "We can see that the values are automatically treated as numbers. No conversion is needed, unlike with the `csv` module!\n", "\n", "To draw our weight-height scatter plot, the code is therefore simply:" ] },
{ "cell_type": "code", "execution_count": 17, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 265 }, "colab_type": "code", "collapsed": true, "id": "K0yhJ-15aV2l", "jupyter": { "outputs_hidden": true }, "outputId": "2f24da3a-db9b-49ab-82da-f81454ce1ca0" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Matplotlib is building the font cache; this may take a moment.\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "X = df['Poids']\n", "Y = df['Taille']\n", "\n", "plt.plot(X, Y, 'ro')  # r for red, o for a circle marker. See https://matplotlib.org/api/markers_api.html\n", "plt.show()" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "fzl7pMgMaV2t" }, "source": [ "Because the columns are read as numbers, `pandas` can summarize the data automatically, notably with the `describe()` method." ] },
{ "cell_type": "code", "execution_count": 18, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 177 }, "colab_type": "code", "collapsed": true, "id": "E9IwMet9aV2w", "jupyter": { "outputs_hidden": true }, "outputId": "be9163f0-3461-440d-ea8c-fa38cf38a16f" }, "outputs": [ { "data": { "text/plain": [ "count 595.000000\n", "mean 186.559664\n", "std 7.572615\n", "min 169.000000\n", "25% 181.000000\n", "50% 186.000000\n", "75% 192.000000\n", "max 208.000000\n", "Name: Taille, dtype: float64" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['Taille'].describe()" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "S5kGQ0peaV27" }, "source": [ "The usual statistical indicators (count, mean, standard deviation, extremes and quartiles) are thus computed automatically. \n", "Box plots can also be drawn very easily with `boxplot()`." 
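, "\n", "\n", "The box of such a plot spans the first to third quartiles, with a line at the median; these are the `25%`, `50%` and `75%` entries already reported by `describe()`. The sketch below reads them individually. It is only an illustration: the five heights are those shown by `df.head()` above, and the names `tailles`, `stats`, `q1`, `mediane`, `q3` and `ecart_interquartile` are not part of the original notebook.

```python
import pandas as pd

# Heights in cm of the five players shown by df.head() above.
tailles = pd.Series([183, 183, 177, 182, 183], name='Taille')

# describe() returns a Series indexed by indicator name.
stats = tailles.describe()

# Quartiles and median: the edges and the middle line of the box.
q1 = stats['25%']
mediane = stats['50%']
q3 = stats['75%']

# Interquartile range: the height of the box.
ecart_interquartile = q3 - q1

print(q1, mediane, q3, ecart_interquartile)
```

Since `describe()` returns a Series, any indicator can be reused in later computations instead of being read off the printed summary."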
] }, { "cell_type": "code", "execution_count": 19, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 283 }, "colab_type": "code", "collapsed": true, "id": "CIs7_4Z9aV28", "jupyter": { "outputs_hidden": true }, "outputId": "6ca6ca6c-2146-4084-98c8-84a45e01298a" }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "df.boxplot(\"Taille\")" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "UPK6FbVxaV3G" }, "source": [ "For non-numeric data, `describe()` is of limited use. It does, however, report the most frequent value (in statistics, the *mode* or *modal value*)." ] },
{ "cell_type": "code", "execution_count": 20, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 35 }, "colab_type": "code", "collapsed": true, "id": "J2VdkaI5aV3J", "jupyter": { "outputs_hidden": true }, "outputId": "fd1f3385-e9ee-4cae-a720-3ad43b81c24c" }, "outputs": [ { "data": { "text/plain": [ "'3ème ligne'" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['Poste'].describe().top" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "colab": {}, "colab_type": "code", "collapsed": true, "id": "V3e_FSFQcDwK", "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "1yGECzxmaV3V" }, "source": [ "For example, to find the most frequent date of birth among Top 14 players, simply use:" ] },
{ "cell_type": "code", "execution_count": 21, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 35 }, "colab_type": "code", "collapsed": true, "id": "cizeAeudaV3Y", "jupyter": { "outputs_hidden": true }, "outputId": "1e1e2254-ffe3-457e-e7f8-ee9f27fa758f" }, "outputs": [ { "data": { "text/plain": [ "'07/08/1990'" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['Date de naissance'].describe().top" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "tuuqyF5GaV3d" }, "source": [ "Who are the players born on that date?" 
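, "\n", "\n", "One caveat before filtering: `describe().top` reports a single value even when several values share the highest frequency, whereas `Series.mode()` returns all of them. The sketch below illustrates this on an invented mini-table; the names and dates are hypothetical and are not taken from `top14.csv`, and `joueurs`, `dates_frequentes` and `masque` are illustrative names.

```python
import pandas as pd

# Hypothetical mini-table: names and dates are invented for illustration.
joueurs = pd.DataFrame({
    'Nom': ['A', 'B', 'C', 'D', 'E'],
    'Date de naissance': ['01/01/1990', '02/02/1991', '01/01/1990',
                          '02/02/1991', '03/03/1992'],
})

# mode() returns every value that reaches the highest frequency.
dates_frequentes = joueurs['Date de naissance'].mode()

# isin() builds a boolean mask keeping the rows that match any of those dates.
masque = joueurs['Date de naissance'].isin(dates_frequentes)

print(joueurs.loc[masque, 'Nom'].tolist())
```

With real data, comparing `len(dates_frequentes)` to 1 is a quick way to check whether the mode is unique."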
] }, { "cell_type": "code", "execution_count": 22, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 88 }, "colab_type": "code", "collapsed": true, "id": "RSp3VnloaV3h", "jupyter": { "outputs_hidden": true }, "outputId": "4ba24701-e07a-4876-9eff-c70c8e7d35c6" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "157 Rory SCHOLES\n", "382 Laurent PANIS\n", "567 Alban PLACINES\n", "Name: Nom, dtype: object\n" ] } ], "source": [ "print(df['Nom'][df['Date de naissance'] == '23/04/1993'])" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "h5cEi0kcaV3m" }, "source": [ "Note that the cell above filters on `'23/04/1993'`, not on the `'07/08/1990'` returned just before. If several dates are tied for most frequent, `top` reports only one of them, and which one may differ from one run to another. \n", "Far more information is given by `value_counts()`, which lists every distinct value with its number of occurrences, most frequent first." ] },
{ "cell_type": "code", "execution_count": 23, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 728 }, "colab_type": "code", "collapsed": true, "id": "M8DhHZG-aV3n", "jupyter": { "outputs_hidden": true }, "outputId": "85768f49-f611-4551-e913-23f7a4445f4e" }, "outputs": [ { "data": { "text/plain": [ "180 52\n", "183 40\n", "188 35\n", "185 31\n", "181 31\n", "182 29\n", "190 25\n", "184 25\n", "187 25\n", "186 24\n", "193 24\n", "189 21\n", "178 20\n", "177 18\n", "198 17\n", "192 17\n", "195 16\n", "191 16\n", "196 15\n", "194 14\n", "200 12\n", "179 9\n", "202 9\n", "174 9\n", "175 9\n", "176 8\n", "197 6\n", "199 6\n", "201 5\n", "172 4\n", "203 4\n", "206 3\n", "170 3\n", "173 3\n", "171 3\n", "204 3\n", "208 2\n", "205 1\n", "169 1\n", "Name: Taille, dtype: int64" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['Taille'].value_counts()" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "RPrflhgCaV3s" }, "source": [ "## Filters and searches\n", "How can we create a *dataframe* containing only the players of UBB (Union Bordeaux Bègles)? \n", "\n", "The idea is to write, inside `df[]`, the test that performs the filtering." 
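, "\n", "\n", "To make the mechanism explicit, here is a minimal sketch on five rows copied from the table displayed at the top of this notebook. The names `extrait`, `masque`, `toulousains` and `grands_toulousains` are illustrative, and the 177 cm threshold is an arbitrary choice for the example.

```python
import pandas as pd

# Five rows copied from the table shown at the top of this notebook.
extrait = pd.DataFrame({
    'Equipe': ['Agen', 'Agen', 'Toulouse', 'Toulouse', 'Toulouse'],
    'Nom': ['Anton PEIKRISHVILI', 'Dave RYAN', 'Werner KOK',
            'Yoann HUGET', 'Thomas RAMOS'],
    'Taille': [183, 183, 177, 190, 178],
    'Poids': [122, 116, 78, 97, 86],
})

# The comparison yields a Series of booleans, one per row.
masque = extrait['Equipe'] == 'Toulouse'

# Placing the mask inside [] keeps only the rows where it is True.
toulousains = extrait[masque]

# Conditions combine with & (and) or | (or); each needs its own parentheses.
grands_toulousains = extrait[(extrait['Equipe'] == 'Toulouse')
                             & (extrait['Taille'] > 177)]

print(grands_toulousains['Nom'].tolist())
```

The filtered table keeps the original row labels, which is why the UBB rows further down are still numbered from 80."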
] }, { "cell_type": "code", "execution_count": 24, "metadata": { "colab": {}, "colab_type": "code", "collapsed": true, "id": "GQ6t5M6SaV3u", "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "UBB = df[df['Equipe'] == 'Bordeaux']" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "colab_type": "code", "collapsed": true, "id": "KUOO86j6aV30", "jupyter": { "outputs_hidden": true }, "outputId": "61ca1c80-8e32-43e6-c303-8087d5928e9e" }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Equipe Nom Poste Date de naissance Taille \\\n", "80 Bordeaux Jefferson POIROT Pilier 01/11/1992 181 \n", "81 Bordeaux Lasha TABIDZE Pilier 04/07/1997 185 \n", "82 Bordeaux Laurent DELBOULBÈS Pilier 17/11/1986 181 \n", "83 Bordeaux Lekso KAULASHVILI Pilier 27/08/1992 187 \n", "84 Bordeaux Peni RAVAI Pilier 16/06/1990 185 \n", "85 Bordeaux Thierry PAÏVA Pilier 19/11/1995 184 \n", "86 Bordeaux Vadim COBILAS Pilier 30/07/1983 180 \n", "87 Bordeaux Adrien PÉLISSIÉ Talonneur 07/08/1990 181 \n", "88 Bordeaux Clément MAYNADIER Talonneur 11/10/1988 187 \n", "89 Bordeaux Alexandre FLANQUART 2ème ligne 09/10/1989 204 \n", "90 Bordeaux Cyril CAZEAUX 2ème ligne 10/02/1995 198 \n", "91 Bordeaux Jandré MARAIS 2ème ligne 14/06/1989 198 \n", "92 Bordeaux Kane DOUGLAS 2ème ligne 01/06/1989 202 \n", "93 Bordeaux Masalosalo TUTAIA 2ème ligne 05/06/1984 204 \n", "94 Bordeaux Afa AMOSA 3ème ligne 11/10/1990 187 \n", "95 Bordeaux Alexandre ROUMAT 3ème ligne 27/06/1997 198 \n", "96 Bordeaux Béka GORGADZE 3ème ligne 08/02/1996 189 \n", "97 Bordeaux Cameron WOKI 3ème ligne 07/11/1998 196 \n", "98 Bordeaux Mahamadou DIABY 3ème ligne 15/08/1990 188 \n", "99 Bordeaux Marco TAULEIGNE 3ème ligne 30/08/1993 191 \n", "100 Bordeaux Scott HIGGINBOTHAM 3ème ligne 05/09/1986 195 \n", "101 Bordeaux Maxime LUCU Mêlée 12/01/1993 177 \n", "102 Bordeaux Yann LESGOURGUES Mêlée 17/01/1991 174 \n", "103 Bordeaux Ben BOTICA Ouverture 07/10/1989 178 \n", "104 Bordeaux Lucas MÉRET Ouverture 30/01/1995 178 \n", "105 Bordeaux Matthieu JALIBERT Ouverture 06/11/1998 180 \n", "106 Bordeaux Jean-Baptiste DUBIÉ Centre 16/07/1989 181 \n", "107 Bordeaux Rémi LAMERAT Centre 14/01/1990 184 \n", "108 Bordeaux Semi RADRADRA Centre 13/06/1992 188 \n", "109 Bordeaux Seta TAMANIVALU Centre 23/07/1992 189 \n", "110 Bordeaux Ulupano SEUTENI Centre 09/12/1993 185 \n", "111 Bordeaux Blair CONNOR Ailier 29/09/1988 183 \n", "112 Bordeaux Nicolas PLAZY Ailier 17/05/1994 188 \n", "113 Bordeaux Santiago CORDERO 
Ailier 06/12/1993 177 \n", "114 Bordeaux Geoffrey CROS Arrière 08/03/1997 185 \n", "115 Bordeaux Nans DUCUING Arrière 06/11/1991 181 \n", "116 Bordeaux Romain BUROS Arrière 31/07/1997 187 \n", "\n", " Poids \n", "80 117 \n", "81 117 \n", "82 106 \n", "83 120 \n", "84 119 \n", "85 125 \n", "86 118 \n", "87 110 \n", "88 100 \n", "89 120 \n", "90 113 \n", "91 118 \n", "92 123 \n", "93 125 \n", "94 112 \n", "95 104 \n", "96 105 \n", "97 103 \n", "98 105 \n", "99 115 \n", "100 110 \n", "101 79 \n", "102 71 \n", "103 93 \n", "104 85 \n", "105 79 \n", "106 85 \n", "107 105 \n", "108 102 \n", "109 110 \n", "110 95 \n", "111 82 \n", "112 85 \n", "113 83 \n", "114 85 \n", "115 90 \n", "116 91 " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "UBB" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "jflqx-tlaV36" }, "source": [ "### Exercise 1\n", "\n", "Create a dataframe `gros` containing the players who weigh more than 135 kg." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 300 }, "colab_type": "code", "collapsed": true, "id": "JHQooyUSWlf4", "jupyter": { "outputs_hidden": true }, "outputId": "b71ef8c1-9e2a-495a-fafb-033e3d05b95a" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| Equipe | Nom | Poste | Date de naissance | Taille | Poids
48 | Bayonne | Edwin MAKA | 2ème ligne | 25/01/1993 | 196 | 140
167 | Castres | Tapu FALATEA | Pilier | 12/12/1988 | 187 | 137
253 | La Rochelle | Uini ATONIO | Pilier | 26/03/1990 | 196 | 152
324 | Montpellier | Antoine GUILLAMON | Pilier | 04/06/1991 | 192 | 136
373 | Paris | Christopher VAOTOA | Pilier | 16/11/1996 | 185 | 138
425 | Pau | Malik HAMADACHE | Pilier | 17/10/1988 | 193 | 141
465 | Racing92 | Ali OZ | Pilier | 28/05/1995 | 193 | 140
466 | Racing92 | Ben TAMEIFUNA | Pilier | 30/08/1991 | 182 | 140
\n", "
" ], "text/plain": [ " Equipe Nom ... Taille Poids\n", "48 Bayonne Edwin MAKA ... 196 140\n", "167 Castres Tapu FALATEA ... 187 137\n", "253 La Rochelle Uini ATONIO ... 196 152\n", "324 Montpellier Antoine GUILLAMON ... 192 136\n", "373 Paris Christopher VAOTOA ... 185 138\n", "425 Pau Malik HAMADACHE ... 193 141\n", "465 Racing92 Ali OZ ... 193 140\n", "466 Racing92 Ben TAMEIFUNA ... 182 140\n", "\n", "[8 rows x 6 columns]" ] }, "execution_count": 24, "metadata": { "tags": [] }, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "Gd9gCUTkaV4U" }, "source": [ "### Exercise 2\n", "\n", "Create a dataframe `grand_gros` containing the players who are taller than 2 m or heavier than 120 kg." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 424 }, "colab_type": "code", "collapsed": true, "id": "X3UY2qC-aV4Z", "jupyter": { "outputs_hidden": true }, "outputId": "3a361346-179d-4e0e-94fd-223f5546b0f5" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| Equipe | Nom | Poste | Date de naissance | Taille | Poids
0 | Agen | Anton PEIKRISHVILI | Pilier | 18/09/1987 | 183 | 122
3 | Agen | Kamaliele TUFELE | Pilier | 11/10/1995 | 182 | 123
12 | Agen | Mickaël DE MARCO | 2ème ligne | 22/04/1989 | 195 | 134
13 | Agen | Pierce PHILLIPS | 2ème ligne | 06/10/1992 | 203 | 119
35 | Bayonne | Census JOHNSTON | Pilier | 06/05/1981 | 189 | 130
... | ... | ... | ... | ... | ... | ...
562 | Toulouse | Florian VERHAEGHE | 2ème ligne | 27/04/1997 | 202 | 108
563 | Toulouse | Iosefa TEKORI | 2ème ligne | 17/12/1983 | 198 | 127
564 | Toulouse | Richie ARNOLD | 2ème ligne | 01/07/1990 | 208 | 127
565 | Toulouse | Richie GRAY | 2ème ligne | 24/08/1989 | 206 | 125
566 | Toulouse | Rory ARNOLD | 2ème ligne | 01/07/1990 | 208 | 120
\n", "

96 rows × 6 columns

\n", "
" ], "text/plain": [ " Equipe Nom Poste Date de naissance Taille Poids\n", "0 Agen Anton PEIKRISHVILI Pilier 18/09/1987 183 122\n", "3 Agen Kamaliele TUFELE Pilier 11/10/1995 182 123\n", "12 Agen Mickaël DE MARCO 2ème ligne 22/04/1989 195 134\n", "13 Agen Pierce PHILLIPS 2ème ligne 06/10/1992 203 119\n", "35 Bayonne Census JOHNSTON Pilier 06/05/1981 189 130\n", ".. ... ... ... ... ... ...\n", "562 Toulouse Florian VERHAEGHE 2ème ligne 27/04/1997 202 108\n", "563 Toulouse Iosefa TEKORI 2ème ligne 17/12/1983 198 127\n", "564 Toulouse Richie ARNOLD 2ème ligne 01/07/1990 208 127\n", "565 Toulouse Richie GRAY 2ème ligne 24/08/1989 206 125\n", "566 Toulouse Rory ARNOLD 2ème ligne 01/07/1990 208 120\n", "\n", "[96 rows x 6 columns]" ] }, "execution_count": 8, "metadata": { "tags": [] }, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "cHgScf-caV4h" }, "source": [ "### Exercise 3\n", "\n", "Find, in a single line of code, the lightest player in the Top14." 
] }, { "cell_type": "code", "execution_count": 16, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 35 }, "colab_type": "code", "collapsed": true, "id": "-qxr-kP7Sad-", "jupyter": { "outputs_hidden": true }, "outputId": "627f91eb-8cd8-44b6-89d5-a2551752c0fc" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Dylan HAYES\n" ] } ], "source": [] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "BdC4T8JeaV4p" }, "source": [ "## Sorting data\n", "Sorting is done with the `sort_values()` method, which returns a new sorted dataframe and leaves `df` unchanged:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "colab": {}, "colab_type": "code", "collapsed": true, "id": "v7zDG9l8aV4q", "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "newdf = df.sort_values(by=['Poids'], ascending = True)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 363 }, "colab_type": "code", "collapsed": true, "id": "icXyy5ERaV4w", "jupyter": { "outputs_hidden": true }, "outputId": "bbc85c5b-7194-4ff1-9fe0-d3d7f763346e" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| Equipe | Nom | Poste | Date de naissance | Taille | Poids
491 | Racing92 | Teddy IRIBAREN | Mêlée | 25/09/1990 | 170 | 70
102 | Bordeaux | Yann LESGOURGUES | Mêlée | 17/01/1991 | 174 | 71
545 | Toulon | Gervais CORDIN | Arrière | 10/12/1998 | 172 | 73
353 | Montpellier | Benoît PAILLAUGUE | Mêlée | 17/11/1987 | 172 | 74
143 | Brive | Quentin DELORD | Mêlée | 10/02/1999 | 171 | 74
578 | Toulouse | Sébastien BÉZY | Mêlée | 22/11/1991 | 174 | 74
446 | Pau | Clovis LE BAIL | Mêlée | 29/11/1995 | 173 | 74
64 | Bayonne | Guillaume ROUET | Mêlée | 13/08/1988 | 170 | 75
364 | Montpellier | Gabriel N'GANDEBE | Ailier | 30/03/1997 | 174 | 75
283 | La Rochelle | Marc ANDREU | Ailier | 27/12/1985 | 170 | 75
\n", "
" ], "text/plain": [ " Equipe Nom Poste Date de naissance Taille Poids\n", "491 Racing92 Teddy IRIBAREN Mêlée 25/09/1990 170 70\n", "102 Bordeaux Yann LESGOURGUES Mêlée 17/01/1991 174 71\n", "545 Toulon Gervais CORDIN Arrière 10/12/1998 172 73\n", "353 Montpellier Benoît PAILLAUGUE Mêlée 17/11/1987 172 74\n", "143 Brive Quentin DELORD Mêlée 10/02/1999 171 74\n", "578 Toulouse Sébastien BÉZY Mêlée 22/11/1991 174 74\n", "446 Pau Clovis LE BAIL Mêlée 29/11/1995 173 74\n", "64 Bayonne Guillaume ROUET Mêlée 13/08/1988 170 75\n", "364 Montpellier Gabriel N'GANDEBE Ailier 30/03/1997 174 75\n", "283 La Rochelle Marc ANDREU Ailier 27/12/1985 170 75" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "newdf.head(10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": {}, "colab_type": "code", "collapsed": true, "id": "JVGM5VO5aV42", "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "Uc7hwYlPaV47" }, "source": [ "## Adding a column\n", "To sort the players by new criteria, we will add a field for each player.\n", "Take a silly example: let us build a new field `'Poids après les vacances'` (*weight after the holidays*) holding each player's weight increased by 8 kg. \n", "This is done simply with:\n" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "colab": {}, "colab_type": "code", "collapsed": true, "id": "J85gFmzDaV48", "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "df['Poids après les vacances'] = df['Poids'] + 8" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "colab_type": "code", "collapsed": true, "id": "mwiLaa19aV5B", "jupyter": { "outputs_hidden": true }, "outputId": "1902af3d-7f82-4cb3-adbb-58462c37f874" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| Equipe | Nom | Poste | Date de naissance | Taille | Poids | Poids après les vacances
0 | Agen | Anton PEIKRISHVILI | Pilier | 18/09/1987 | 183 | 122 | 130
1 | Agen | Dave RYAN | Pilier | 21/04/1986 | 183 | 116 | 124
2 | Agen | Giorgi TETRASHVILI | Pilier | 31/08/1993 | 177 | 112 | 120
3 | Agen | Kamaliele TUFELE | Pilier | 11/10/1995 | 182 | 123 | 131
4 | Agen | Malino VANAÏ | Pilier | 04/05/1993 | 183 | 119 | 127
\n", "
" ], "text/plain": [ " Equipe Nom Poste Date de naissance Taille Poids \\\n", "0 Agen Anton PEIKRISHVILI Pilier 18/09/1987 183 122 \n", "1 Agen Dave RYAN Pilier 21/04/1986 183 116 \n", "2 Agen Giorgi TETRASHVILI Pilier 31/08/1993 177 112 \n", "3 Agen Kamaliele TUFELE Pilier 11/10/1995 182 123 \n", "4 Agen Malino VANAÏ Pilier 04/05/1993 183 119 \n", "\n", " Poids après les vacances \n", "0 130 \n", "1 124 \n", "2 120 \n", "3 131 \n", "4 127 " ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "TgLXLWeYaV5G" }, "source": [ "To delete this pointless column, run:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "colab": {}, "colab_type": "code", "collapsed": true, "id": "hvhUwpp2aV5I", "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "del df['Poids après les vacances'] " ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "colab_type": "code", "collapsed": true, "id": "4BTXaPtEaV5N", "jupyter": { "outputs_hidden": true }, "outputId": "2dea73f7-2d2f-4138-cad1-631eaa54f5b3" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| Equipe | Nom | Poste | Date de naissance | Taille | Poids
0 | Agen | Anton PEIKRISHVILI | Pilier | 18/09/1987 | 183 | 122
1 | Agen | Dave RYAN | Pilier | 21/04/1986 | 183 | 116
2 | Agen | Giorgi TETRASHVILI | Pilier | 31/08/1993 | 177 | 112
3 | Agen | Kamaliele TUFELE | Pilier | 11/10/1995 | 182 | 123
4 | Agen | Malino VANAÏ | Pilier | 04/05/1993 | 183 | 119
\n", "
" ], "text/plain": [ " Equipe Nom Poste Date de naissance Taille Poids\n", "0 Agen Anton PEIKRISHVILI Pilier 18/09/1987 183 122\n", "1 Agen Dave RYAN Pilier 21/04/1986 183 116\n", "2 Agen Giorgi TETRASHVILI Pilier 31/08/1993 177 112\n", "3 Agen Kamaliele TUFELE Pilier 11/10/1995 182 123\n", "4 Agen Malino VANAÏ Pilier 04/05/1993 183 119" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "DNYbEtZZaV5R" }, "source": [ "### Exercise 4\n", "\n", "1. Create a column `IMC` containing each player's body mass index (BMI): the weight in kg divided by the square of the height in metres (the `Taille` column is in cm).\n", "2. Create a new dataframe containing all the Top14 players sorted by increasing BMI." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "colab_type": "code", "collapsed": true, "id": "0gNbLPmiMaro", "jupyter": { "outputs_hidden": true }, "outputId": "734131f2-86f8-46ac-83af-4d35309fe6f3" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| Equipe | Nom | Poste | Date de naissance | Taille | Poids | IMC
0 | Agen | Anton PEIKRISHVILI | Pilier | 18/09/1987 | 183 | 122 | 36.429872
1 | Agen | Dave RYAN | Pilier | 21/04/1986 | 183 | 116 | 34.638239
2 | Agen | Giorgi TETRASHVILI | Pilier | 31/08/1993 | 177 | 112 | 35.749625
3 | Agen | Kamaliele TUFELE | Pilier | 11/10/1995 | 182 | 123 | 37.133196
4 | Agen | Malino VANAÏ | Pilier | 04/05/1993 | 183 | 119 | 35.534056
\n", "
" ], "text/plain": [ " Equipe Nom Poste ... Taille Poids IMC\n", "0 Agen Anton PEIKRISHVILI Pilier ... 183 122 36.429872\n", "1 Agen Dave RYAN Pilier ... 183 116 34.638239\n", "2 Agen Giorgi TETRASHVILI Pilier ... 177 112 35.749625\n", "3 Agen Kamaliele TUFELE Pilier ... 182 123 37.133196\n", "4 Agen Malino VANAÏ Pilier ... 183 119 35.534056\n", "\n", "[5 rows x 7 columns]" ] }, "execution_count": 19, "metadata": { "tags": [] }, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "code", "execution_count": 20, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 424 }, "colab_type": "code", "collapsed": true, "id": "nUuXBvIWXsFD", "jupyter": { "outputs_hidden": true }, "outputId": "b782ddf9-620b-4f8e-f402-008479e3e37d" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| Equipe | Nom | Poste | Date de naissance | Taille | Poids | IMC
102 | Bordeaux | Yann LESGOURGUES | Mêlée | 17/01/1991 | 174 | 71 | 23.450918
77 | Bayonne | Aymeric LUC | Arrière | 14/10/1997 | 180 | 76 | 23.456790
66 | Bayonne | Brandon FAJARDO | Ouverture | 25/06/1994 | 181 | 77 | 23.503556
141 | Brive | David DELARUE | Mêlée | 27/10/1996 | 190 | 85 | 23.545706
200 | Castres | Martin LAVEAU | Ailier | 10/09/1996 | 182 | 78 | 23.547881
... | ... | ... | ... | ... | ... | ... | ...
376 | Paris | Paul ALO-EMILE | Pilier | 22/12/1991 | 180 | 128 | 39.506173
253 | La Rochelle | Uini ATONIO | Pilier | 26/03/1990 | 196 | 152 | 39.566847
373 | Paris | Christopher VAOTOA | Pilier | 16/11/1996 | 185 | 138 | 40.321402
511 | Toulon | Sébastien TAOFIFENUA | Pilier | 21/03/1992 | 178 | 130 | 41.030173
466 | Racing92 | Ben TAMEIFUNA | Pilier | 30/08/1991 | 182 | 140 | 42.265427
\n", "

595 rows × 7 columns

\n", "
" ], "text/plain": [ " Equipe Nom Poste ... Taille Poids IMC\n", "102 Bordeaux Yann LESGOURGUES Mêlée ... 174 71 23.450918\n", "77 Bayonne Aymeric LUC Arrière ... 180 76 23.456790\n", "66 Bayonne Brandon FAJARDO Ouverture ... 181 77 23.503556\n", "141 Brive David DELARUE Mêlée ... 190 85 23.545706\n", "200 Castres Martin LAVEAU Ailier ... 182 78 23.547881\n", ".. ... ... ... ... ... ... ...\n", "376 Paris Paul ALO-EMILE Pilier ... 180 128 39.506173\n", "253 La Rochelle Uini ATONIO Pilier ... 196 152 39.566847\n", "373 Paris Christopher VAOTOA Pilier ... 185 138 40.321402\n", "511 Toulon Sébastien TAOFIFENUA Pilier ... 178 130 41.030173\n", "466 Racing92 Ben TAMEIFUNA Pilier ... 182 140 42.265427\n", "\n", "[595 rows x 7 columns]" ] }, "execution_count": 20, "metadata": { "tags": [] }, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": {}, "colab_type": "code", "collapsed": true, "id": "W3LBRsDGYAqD", "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [] } ], "metadata": { "colab": { "collapsed_sections": [ "jflqx-tlaV36", "Gd9gCUTkaV4U", "cHgScf-caV4h", "DNYbEtZZaV5R" ], "name": "PROF_03_Pandas_eleves.ipynb", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.6" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }