|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": {}, |
| 6 | + "source": [ |
| 7 | + "# Number of merged PRs per person, aggregated by week\n", |
| 8 | + "\n", |
| 9 | + "This notebook uses the GitHub API to retrieve the merged *pull requests* (PRs)\n", |
| 10 | + "of a given repository, groups them by author and by week over the past year,\n", |
| 11 | + "then plots the result.\n", |
| 12 | + "\n", |
| 13 | + "**Dependencies:** `requests`, `pandas`, `matplotlib`.\n", |
| 14 | + "\n", |
| 15 | + "**GitHub token:** the GitHub API limits unauthenticated calls to 60 requests per hour.\n", |
| 16 | + "To raise this limit, set the `GITHUB_TOKEN` environment variable\n", |
| 17 | + "to a GitHub *Personal Access Token* (PAT):\n", |
| 18 | + "\n", |
| 19 | + "```bash\n", |
| 20 | + "export GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx\n", |
| 21 | + "```\n", |
| 22 | + "\n", |
| 23 | + "Without a token, the notebook still works but may hit the rate limit on large repositories." |
| 24 | + ] |
| 25 | + }, |
| 26 | + { |
| 27 | + "cell_type": "code", |
| 28 | + "execution_count": null, |
| 29 | + "metadata": {}, |
| 30 | + "outputs": [], |
| 31 | + "source": [ |
| 32 | + "import os\n", |
| 33 | + "import datetime\n", |
| 34 | + "import requests\n", |
| 35 | + "import pandas as pd\n", |
| 36 | + "import matplotlib.pyplot as plt\n", |
| 37 | + "import matplotlib.dates as mdates" |
| 38 | + ] |
| 39 | + }, |
| 40 | + { |
| 41 | + "cell_type": "markdown", |
| 42 | + "metadata": {}, |
| 43 | + "source": [ |
| 44 | + "## Parameters\n", |
| 45 | + "\n", |
| 46 | + "Set `OWNER` and `REPO` to point to the repository of your choice." |
| 47 | + ] |
| 48 | + }, |
| 49 | + { |
| 50 | + "cell_type": "code", |
| 51 | + "execution_count": null, |
| 52 | + "metadata": {}, |
| 53 | + "outputs": [], |
| 54 | + "source": [ |
| 55 | + "OWNER = \"sdpython\"\n", |
| 56 | + "REPO = \"teachpyx\"\n", |
| 57 | + "\n", |
| 58 | + "# GitHub authentication token (optional but recommended)\n", |
| 59 | + "GITHUB_TOKEN = os.environ.get(\"GITHUB_TOKEN\", \"\")" |
| 60 | + ] |
| 61 | + }, |
| 62 | + { |
| 63 | + "cell_type": "markdown", |
| 64 | + "metadata": {}, |
| 65 | + "source": [ |
| 66 | + "## Fetching merged PRs through the GitHub API\n", |
| 67 | + "\n", |
| 68 | + "The GitHub REST API exposes the `/repos/{owner}/{repo}/pulls` endpoint,\n", |
| 69 | + "queried here with `state=closed`. The PRs are then filtered to keep those whose `merged_at` field is set\n", |
| 70 | + "and whose merge date falls within the last 12 months.\n", |
| 71 | + "\n", |
| 72 | + "Pagination is handled with the `page` parameter." |
| 73 | + ] |
| 74 | + }, |
| 75 | + { |
| 76 | + "cell_type": "code", |
| 77 | + "execution_count": null, |
| 78 | + "metadata": {}, |
| 79 | + "outputs": [], |
| 80 | + "source": [ |
| 81 | + "def fetch_merged_prs(owner: str, repo: str, token: str = \"\") -> list[dict]:\n", |
| 82 | + "    \"\"\"Fetch all PRs merged during the past year.\n", |
| 83 | + "\n", |
| 84 | + "    :param owner: owner of the GitHub repository\n", |
| 85 | + "    :param repo: name of the GitHub repository\n", |
| 86 | + "    :param token: GitHub authentication token (optional)\n", |
| 87 | + "    :return: list of dictionaries with the fields ``author`` and ``merged_at``\n", |
| 88 | + "    \"\"\"\n", |
| 89 | + "    headers = {\"Accept\": \"application/vnd.github+json\"}\n", |
| 90 | + "    if token:\n", |
| 91 | + "        headers[\"Authorization\"] = f\"Bearer {token}\"\n", |
| 92 | + "\n", |
| 93 | + "    since = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=365)\n", |
| 94 | + "\n", |
| 95 | + "    results = []\n", |
| 96 | + "    page = 1\n", |
| 97 | + "    per_page = 100\n", |
| 98 | + "\n", |
| 99 | + "    while True:\n", |
| 100 | + "        url = (\n", |
| 101 | + "            f\"https://api.github.com/repos/{owner}/{repo}/pulls\"\n", |
| 102 | + "            f\"?state=closed&per_page={per_page}&page={page}&sort=updated&direction=desc\"\n", |
| 103 | + "        )\n", |
| 104 | + "        response = requests.get(url, headers=headers, timeout=30)\n", |
| 105 | + "        try:\n", |
| 106 | + "            response.raise_for_status()\n", |
| 107 | + "        except requests.HTTPError as exc:\n", |
| 108 | + "            status = exc.response.status_code\n", |
| 109 | + "            if status == 401:\n", |
| 110 | + "                raise RuntimeError(\n", |
| 111 | + "                    \"Authentication refused (401). Check your GITHUB_TOKEN.\"\n", |
| 112 | + "                ) from exc\n", |
| 113 | + "            if status == 403:\n", |
| 114 | + "                raise RuntimeError(\n", |
| 115 | + "                    \"Access denied (403). You may have reached the GitHub API rate limit \"\n", |
| 116 | + "                    \"(60 requests/h without a token). Set GITHUB_TOKEN.\"\n", |
| 117 | + "                ) from exc\n", |
| 118 | + "            if status == 404:\n", |
| 119 | + "                raise RuntimeError(\n", |
| 120 | + "                    f\"Repository not found (404): {owner}/{repo}. Check OWNER and REPO.\"\n", |
| 121 | + "                ) from exc\n", |
| 122 | + "            raise\n", |
| 123 | + "        prs = response.json()\n", |
| 124 | + "\n", |
| 125 | + "        if not prs:\n", |
| 126 | + "            break\n", |
| 127 | + "\n", |
| 128 | + "        stop = False\n", |
| 129 | + "        for pr in prs:\n", |
| 130 | + "            updated_dt = datetime.datetime.fromisoformat(pr[\"updated_at\"].replace(\"Z\", \"+00:00\"))\n", |
| 131 | + "            if updated_dt < since:\n", |
| 132 | + "                stop = True  # sorted by update date: no later PR can qualify\n", |
| 133 | + "                break\n", |
| 134 | + "            merged_at = pr.get(\"merged_at\")\n", |
| 135 | + "            if not merged_at:\n", |
| 136 | + "                continue\n", |
| 137 | + "            merged_dt = datetime.datetime.fromisoformat(merged_at.replace(\"Z\", \"+00:00\"))\n", |
| 138 | + "            if merged_dt >= since:\n", |
| 139 | + "                author = (pr.get(\"user\") or {}).get(\"login\", \"unknown\")\n", |
| 140 | + "                results.append({\"author\": author, \"merged_at\": merged_dt})\n", |
| 141 | + "        if stop:\n", |
| 142 | + "            break\n", |
| 143 | + "        page += 1\n", |
| 144 | + "\n", |
| 145 | + "    return results\n", |
| 146 | + "\n", |
| 147 | + "\n", |
| 148 | + "merged_prs = fetch_merged_prs(OWNER, REPO, GITHUB_TOKEN)\n", |
| 149 | + "print(f\"{len(merged_prs)} merged PR(s) found over the past year.\")" |
| 150 | + ] |
| 151 | + }, |
| 152 | + { |
| 153 | + "cell_type": "markdown", |
| 154 | + "metadata": {}, |
| 155 | + "source": [ |
| 156 | + "## Aggregation by author and by week" |
| 157 | + ] |
| 158 | + }, |
| 159 | + { |
| 160 | + "cell_type": "code", |
| 161 | + "execution_count": null, |
| 162 | + "metadata": {}, |
| 163 | + "outputs": [], |
| 164 | + "source": [ |
| 165 | + "df = pd.DataFrame(merged_prs)\n", |
| 166 | + "\n", |
| 167 | + "if df.empty:\n", |
| 168 | + "    print(\"No data to display.\")\n", |
| 169 | + "else:\n", |
| 170 | + "    # Truncate each date to the Monday of its week (the UTC timezone is dropped explicitly first)\n", |
| 171 | + "    df[\"week\"] = df[\"merged_at\"].dt.tz_localize(None).dt.to_period(\"W\").dt.start_time\n", |
| 172 | + "\n", |
| 173 | + "    weekly = (\n", |
| 174 | + "        df.groupby([\"author\", \"week\"])\n", |
| 175 | + "        .size()\n", |
| 176 | + "        .reset_index(name=\"pr_count\")\n", |
| 177 | + "    )\n", |
| 178 | + "    print(weekly.head(10))" |
| 179 | + ] |
| 180 | + }, |
| 181 | + { |
| 182 | + "cell_type": "markdown", |
| 183 | + "metadata": {}, |
| 184 | + "source": [ |
| 185 | + "## Pivot table (author × week)" |
| 186 | + ] |
| 187 | + }, |
| 188 | + { |
| 189 | + "cell_type": "code", |
| 190 | + "execution_count": null, |
| 191 | + "metadata": {}, |
| 192 | + "outputs": [], |
| 193 | + "source": [ |
| 194 | + "if not df.empty:\n", |
| 195 | + "    pivot = weekly.pivot_table(\n", |
| 196 | + "        index=\"author\", columns=\"week\", values=\"pr_count\", aggfunc=\"sum\", fill_value=0\n", |
| 197 | + "    )\n", |
| 198 | + "    # Sort authors by total number of PRs, in descending order\n", |
| 199 | + "    pivot = pivot.loc[pivot.sum(axis=1).sort_values(ascending=False).index]\n", |
| 200 | + "    display(pivot)  # a bare expression inside an if block is not rendered by Jupyter" |
| 201 | + ] |
| 202 | + }, |
| 203 | + { |
| 204 | + "cell_type": "markdown", |
| 205 | + "metadata": {}, |
| 206 | + "source": [ |
| 207 | + "## Visualization: number of merged PRs per week (stacked by author)" |
| 208 | + ] |
| 209 | + }, |
| 210 | + { |
| 211 | + "cell_type": "code", |
| 212 | + "execution_count": null, |
| 213 | + "metadata": {}, |
| 214 | + "outputs": [], |
| 215 | + "source": [ |
| 216 | + "if not df.empty:\n", |
| 217 | + "    fig, ax = plt.subplots(figsize=(14, 5))\n", |
| 218 | + "\n", |
| 219 | + "    stacked_height = None\n", |
| 220 | + "    weeks = pivot.columns  # DatetimeIndex\n", |
| 221 | + "    week_nums = mdates.date2num(weeks.to_pydatetime())\n", |
| 222 | + "\n", |
| 223 | + "    for author in pivot.index:\n", |
| 224 | + "        values = pivot.loc[author].values\n", |
| 225 | + "        if stacked_height is None:\n", |
| 226 | + "            ax.bar(week_nums, values, width=5, label=author)\n", |
| 227 | + "            stacked_height = values.copy()\n", |
| 228 | + "        else:\n", |
| 229 | + "            ax.bar(week_nums, values, width=5, bottom=stacked_height, label=author)\n", |
| 230 | + "            stacked_height += values\n", |
| 231 | + "\n", |
| 232 | + "    ax.xaxis.set_major_formatter(mdates.DateFormatter(\"%Y-%m-%d\"))\n", |
| 233 | + "    ax.xaxis.set_major_locator(mdates.WeekdayLocator(byweekday=mdates.MO, interval=4))\n", |
| 234 | + "    plt.xticks(rotation=45, ha=\"right\")\n", |
| 235 | + "    ax.set_xlabel(\"Week\")\n", |
| 236 | + "    ax.set_ylabel(\"Number of merged PRs\")\n", |
| 237 | + "    ax.set_title(f\"Merged PRs per week: {OWNER}/{REPO}\")\n", |
| 238 | + "    ax.legend(loc=\"upper left\", bbox_to_anchor=(1, 1), title=\"Author\")\n", |
| 239 | + "    plt.tight_layout()\n", |
| 240 | + "    plt.show()" |
| 241 | + ] |
| 242 | + }, |
| 243 | + { |
| 244 | + "cell_type": "markdown", |
| 245 | + "metadata": {}, |
| 246 | + "source": [ |
| 247 | + "## Visualization: heatmap (author × week)" |
| 248 | + ] |
| 249 | + }, |
| 250 | + { |
| 251 | + "cell_type": "code", |
| 252 | + "execution_count": null, |
| 253 | + "metadata": {}, |
| 254 | + "outputs": [], |
| 255 | + "source": [ |
| 256 | + "if not df.empty:\n", |
| 257 | + "    fig, ax = plt.subplots(figsize=(14, max(3, len(pivot) * 0.5)))\n", |
| 258 | + "\n", |
| 259 | + "    im = ax.imshow(pivot.values, aspect=\"auto\", cmap=\"YlOrRd\")\n", |
| 260 | + "    plt.colorbar(im, ax=ax, label=\"Number of PRs\")\n", |
| 261 | + "\n", |
| 262 | + "    ax.set_yticks(range(len(pivot.index)))\n", |
| 263 | + "    ax.set_yticklabels(pivot.index)\n", |
| 264 | + "\n", |
| 265 | + "    # Thin out the week labels so that roughly a dozen are shown\n", |
| 266 | + "    step = max(1, len(pivot.columns) // 12)\n", |
| 267 | + "    ax.set_xticks(range(0, len(pivot.columns), step))\n", |
| 268 | + "    ax.set_xticklabels(\n", |
| 269 | + "        [str(d)[:10] for d in pivot.columns[::step]], rotation=45, ha=\"right\"\n", |
| 270 | + "    )\n", |
| 271 | + "\n", |
| 272 | + "    ax.set_title(f\"Heatmap of merged PRs: {OWNER}/{REPO}\")\n", |
| 273 | + "    ax.set_xlabel(\"Week\")\n", |
| 274 | + "    ax.set_ylabel(\"Author\")\n", |
| 275 | + "    plt.tight_layout()\n", |
| 276 | + "    plt.show()" |
| 277 | + ] |
| 278 | + } |
| 279 | + ], |
| 280 | + "metadata": { |
| 281 | + "kernelspec": { |
| 282 | + "display_name": "Python 3", |
| 283 | + "language": "python", |
| 284 | + "name": "python3" |
| 285 | + }, |
| 286 | + "language_info": { |
| 287 | + "name": "python", |
| 288 | + "version": "3.12.0" |
| 289 | + } |
| 290 | + }, |
| 291 | + "nbformat": 4, |
| 292 | + "nbformat_minor": 5 |
| 293 | +} |
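
The pagination loop in `fetch_merged_prs` requests results with `sort=updated`, so the order of `merged_at` values is not guaranteed. One way to make the early stop robust to that ordering is sketched below, without any network access. The helper names `parse_ts` and `select_merged` and the sample pages are hypothetical, and the sketch assumes that a PR's `updated_at` is never earlier than its `merged_at`.

```python
import datetime


def parse_ts(value: str) -> datetime.datetime:
    """Parse a GitHub-style ISO 8601 timestamp such as 2024-05-01T12:00:00Z."""
    return datetime.datetime.fromisoformat(value.replace("Z", "+00:00"))


def select_merged(pages, since):
    """Collect merged PRs from pages sorted by update date, newest first.

    `pages` is a hypothetical iterable of lists of PR dicts that stands in
    for successive API responses.
    """
    results = []
    for prs in pages:
        for pr in prs:
            if parse_ts(pr["updated_at"]) < since:
                # Assumption: updated_at >= merged_at, so every remaining PR
                # was merged before the window and the scan can stop here.
                return results
            merged_at = pr.get("merged_at")
            if merged_at and parse_ts(merged_at) >= since:
                author = (pr.get("user") or {}).get("login", "unknown")
                results.append({"author": author, "merged_at": parse_ts(merged_at)})
    return results


since = datetime.datetime(2024, 1, 1, tzinfo=datetime.timezone.utc)
pages = [
    [
        # Merged long ago but updated recently: skipped without stopping the scan.
        {"updated_at": "2024-06-01T00:00:00Z", "merged_at": "2023-02-01T00:00:00Z",
         "user": {"login": "alice"}},
        {"updated_at": "2024-05-01T00:00:00Z", "merged_at": "2024-05-01T00:00:00Z",
         "user": {"login": "bob"}},
        # Closed without being merged.
        {"updated_at": "2024-04-01T00:00:00Z", "merged_at": None,
         "user": {"login": "carol"}},
    ],
    [
        # Last updated before the window: the scan stops here.
        {"updated_at": "2023-12-01T00:00:00Z", "merged_at": "2023-12-01T00:00:00Z",
         "user": {"login": "dave"}},
    ],
]
selected = select_merged(pages, since)
print([row["author"] for row in selected])  # ['bob']
```

Stopping on `updated_at` rather than `merged_at` means an old PR that received a recent comment is skipped but does not end the scan early.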
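
The weekly aggregation in the notebook truncates each merge date to the Monday of its week before counting by author. The same bucketing can be checked with the standard library alone, as in this small sketch. The helper names `week_start` and `weekly_counts` and the sample rows are hypothetical; the notebook itself relies on pandas for this step.

```python
import datetime
from collections import Counter


def week_start(dt: datetime.datetime) -> datetime.date:
    """Return the Monday of the week that contains dt."""
    return (dt - datetime.timedelta(days=dt.weekday())).date()


def weekly_counts(rows):
    """Count merged PRs per (author, Monday of the week)."""
    return Counter((row["author"], week_start(row["merged_at"])) for row in rows)


utc = datetime.timezone.utc
rows = [
    # Wednesday 2024-05-01 and Sunday 2024-05-05 fall in the week of Monday 2024-04-29.
    {"author": "alice", "merged_at": datetime.datetime(2024, 5, 1, 9, 0, tzinfo=utc)},
    {"author": "alice", "merged_at": datetime.datetime(2024, 5, 5, 23, 0, tzinfo=utc)},
    # Monday 2024-05-06 opens the next week.
    {"author": "alice", "merged_at": datetime.datetime(2024, 5, 6, 0, 0, tzinfo=utc)},
    {"author": "bob", "merged_at": datetime.datetime(2024, 5, 2, 12, 0, tzinfo=utc)},
]
counts = weekly_counts(rows)
for (author, monday), n in sorted(counts.items()):
    print(author, monday, n)
```

Weeks with no merged PR simply do not appear in the counter, which mirrors the pivot table in the notebook: its columns only cover weeks that contain at least one merge.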