Commit 26a2b56

Copilot and xadupre authored
Add notebook: GitHub merged PR stats by person, aggregated by week
Agent-Logs-Url: https://github.com/sdpython/teachpyx/sessions/452c41d6-d398-4a66-b857-558224a8b8a7
Co-authored-by: xadupre <22452781+xadupre@users.noreply.github.com>
1 parent 6222fbb commit 26a2b56

2 files changed

Lines changed: 294 additions & 0 deletions

Lines changed: 293 additions & 0 deletions
@@ -0,0 +1,293 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Number of merged PRs per person, aggregated by week\n",
+    "\n",
+    "This notebook uses the GitHub API to retrieve the merged *pull requests* (PRs)\n",
+    "of a given repository, groups them by author and by week over the past year,\n",
+    "and then plots the result.\n",
+    "\n",
+    "**Dependencies:** `requests`, `pandas`, `matplotlib`.\n",
+    "\n",
+    "**GitHub token:** the GitHub API limits unauthenticated calls to 60 requests per hour.\n",
+    "To raise this limit, set the `GITHUB_TOKEN` environment variable\n",
+    "to a GitHub *Personal Access Token* (PAT):\n",
+    "\n",
+    "```bash\n",
+    "export GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx\n",
+    "```\n",
+    "\n",
+    "Without a token, the notebook still works but may hit the rate limit on large repositories."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "import datetime\n",
+    "import requests\n",
+    "import pandas as pd\n",
+    "import matplotlib.pyplot as plt\n",
+    "import matplotlib.dates as mdates"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Parameters\n",
+    "\n",
+    "Edit `OWNER` and `REPO` to point to the repository of your choice."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "OWNER = \"sdpython\"\n",
+    "REPO = \"teachpyx\"\n",
+    "\n",
+    "# GitHub authentication token (optional but recommended)\n",
+    "GITHUB_TOKEN = os.environ.get(\"GITHUB_TOKEN\", \"\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Fetching merged PRs through the GitHub API\n",
+    "\n",
+    "The GitHub REST API exposes the `/repos/{owner}/{repo}/pulls` endpoint,\n",
+    "queried here with `state=closed`. Only PRs whose `merged_at` field is set\n",
+    "and whose merge date falls within the last 12 months are kept.\n",
+    "\n",
+    "Pagination is handled with the `page` parameter."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def fetch_merged_prs(owner: str, repo: str, token: str = \"\") -> list[dict]:\n",
+    "    \"\"\"Fetch all PRs merged during the past year.\n",
+    "\n",
+    "    :param owner: owner of the GitHub repository\n",
+    "    :param repo: name of the GitHub repository\n",
+    "    :param token: GitHub authentication token (optional)\n",
+    "    :return: list of dictionaries with the fields ``author`` and ``merged_at``\n",
+    "    \"\"\"\n",
+    "    headers = {\"Accept\": \"application/vnd.github+json\"}\n",
+    "    if token:\n",
+    "        headers[\"Authorization\"] = f\"Bearer {token}\"\n",
+    "\n",
+    "    since = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=365)\n",
+    "\n",
+    "    results = []\n",
+    "    page = 1\n",
+    "    per_page = 100\n",
+    "\n",
+    "    while True:\n",
+    "        url = (\n",
+    "            f\"https://api.github.com/repos/{owner}/{repo}/pulls\"\n",
+    "            f\"?state=closed&per_page={per_page}&page={page}&sort=updated&direction=desc\"\n",
+    "        )\n",
+    "        response = requests.get(url, headers=headers, timeout=30)\n",
+    "        try:\n",
+    "            response.raise_for_status()\n",
+    "        except requests.HTTPError as exc:\n",
+    "            status = exc.response.status_code\n",
+    "            if status == 401:\n",
+    "                raise RuntimeError(\n",
+    "                    \"Authentication refused (401). Check your GITHUB_TOKEN.\"\n",
+    "                ) from exc\n",
+    "            if status == 403:\n",
+    "                raise RuntimeError(\n",
+    "                    \"Access denied (403). You may have reached the GitHub API rate limit \"\n",
+    "                    \"(60 requests/h without a token). Set GITHUB_TOKEN.\"\n",
+    "                ) from exc\n",
+    "            if status == 404:\n",
+    "                raise RuntimeError(\n",
+    "                    f\"Repository not found (404): {owner}/{repo}. Check OWNER and REPO.\"\n",
+    "                ) from exc\n",
+    "            raise\n",
+    "        prs = response.json()\n",
+    "\n",
+    "        if not prs:\n",
+    "            break\n",
+    "\n",
+    "        for pr in prs:\n",
+    "            merged_at = pr.get(\"merged_at\")\n",
+    "            if not merged_at:\n",
+    "                continue\n",
+    "            merged_dt = datetime.datetime.fromisoformat(merged_at.replace(\"Z\", \"+00:00\"))\n",
+    "            if merged_dt < since:\n",
+    "                continue  # merged before the window but updated since: skip it, keep scanning\n",
+    "            author = (pr.get(\"user\") or {}).get(\"login\", \"unknown\")\n",
+    "            results.append({\"author\": author, \"merged_at\": merged_dt})\n",
+    "\n",
+    "        # Pages are sorted by update time and a merge also bumps updated_at: stop once a page ends before the window.\n",
+    "        last_updated = datetime.datetime.fromisoformat(prs[-1][\"updated_at\"].replace(\"Z\", \"+00:00\"))\n",
+    "        if last_updated < since:\n",
+    "            break\n",
+    "\n",
+    "        page += 1\n",
+    "\n",
+    "    return results\n",
+    "\n",
+    "\n",
+    "merged_prs = fetch_merged_prs(OWNER, REPO, GITHUB_TOKEN)\n",
+    "print(f\"{len(merged_prs)} merged PR(s) found over the past year.\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Aggregation by author and by week"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df = pd.DataFrame(merged_prs)\n",
+    "\n",
+    "if df.empty:\n",
+    "    print(\"No data to display.\")\n",
+    "else:\n",
+    "    # Truncate each date to the Monday of its week (time zone dropped first to avoid a pandas warning)\n",
+    "    df[\"week\"] = df[\"merged_at\"].dt.tz_convert(None).dt.to_period(\"W\").dt.start_time\n",
+    "\n",
+    "    weekly = (\n",
+    "        df.groupby([\"author\", \"week\"])\n",
+    "        .size()\n",
+    "        .reset_index(name=\"pr_count\")\n",
+    "    )\n",
+    "    print(weekly.head(10))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Pivot table (author × week)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "if not df.empty:\n",
+    "    pivot = weekly.pivot_table(\n",
+    "        index=\"author\", columns=\"week\", values=\"pr_count\", aggfunc=\"sum\", fill_value=0\n",
+    "    )\n",
+    "    # Sort authors by total number of PRs, in descending order\n",
+    "    pivot = pivot.loc[pivot.sum(axis=1).sort_values(ascending=False).index]\n",
+    "    display(pivot)  # a bare expression inside an if block is not rendered by Jupyter"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Visualization: number of merged PRs per week (stacked by author)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "if not df.empty:\n",
+    "    fig, ax = plt.subplots(figsize=(14, 5))\n",
+    "\n",
+    "    stacked_height = None\n",
+    "    weeks = pivot.columns  # DatetimeIndex\n",
+    "    week_nums = mdates.date2num(weeks.to_pydatetime())\n",
+    "\n",
+    "    for author in pivot.index:\n",
+    "        values = pivot.loc[author].values\n",
+    "        if stacked_height is None:\n",
+    "            ax.bar(week_nums, values, width=5, label=author)\n",
+    "            stacked_height = values.copy()\n",
+    "        else:\n",
+    "            ax.bar(week_nums, values, width=5, bottom=stacked_height, label=author)\n",
+    "            stacked_height += values\n",
+    "\n",
+    "    ax.xaxis.set_major_formatter(mdates.DateFormatter(\"%Y-%m-%d\"))\n",
+    "    ax.xaxis.set_major_locator(mdates.WeekdayLocator(byweekday=mdates.MO, interval=4))\n",
+    "    plt.xticks(rotation=45, ha=\"right\")\n",
+    "    ax.set_xlabel(\"Week\")\n",
+    "    ax.set_ylabel(\"Number of merged PRs\")\n",
+    "    ax.set_title(f\"Merged PRs per week: {OWNER}/{REPO}\")\n",
+    "    ax.legend(loc=\"upper left\", bbox_to_anchor=(1, 1), title=\"Author\")\n",
+    "    plt.tight_layout()\n",
+    "    plt.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Visualization: heatmap (author × week)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "if not df.empty:\n",
+    "    fig, ax = plt.subplots(figsize=(14, max(3, len(pivot) * 0.5)))\n",
+    "\n",
+    "    im = ax.imshow(pivot.values, aspect=\"auto\", cmap=\"YlOrRd\")\n",
+    "    plt.colorbar(im, ax=ax, label=\"Number of PRs\")\n",
+    "\n",
+    "    ax.set_yticks(range(len(pivot.index)))\n",
+    "    ax.set_yticklabels(pivot.index)\n",
+    "\n",
+    "    # Show about a dozen week labels (one every 4 weeks over a full year)\n",
+    "    step = max(1, len(pivot.columns) // 12)\n",
+    "    ax.set_xticks(range(0, len(pivot.columns), step))\n",
+    "    ax.set_xticklabels(\n",
+    "        [str(d)[:10] for d in pivot.columns[::step]], rotation=45, ha=\"right\"\n",
+    "    )\n",
+    "\n",
+    "    ax.set_title(f\"Heatmap of merged PRs: {OWNER}/{REPO}\")\n",
+    "    ax.set_xlabel(\"Week\")\n",
+    "    ax.set_ylabel(\"Author\")\n",
+    "    plt.tight_layout()\n",
+    "    plt.show()"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "name": "python",
+   "version": "3.12.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
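
A note on the paging loop in `fetch_merged_prs`: the request sorts closed PRs by update time, not by merge time, so a PR merged long ago but commented on recently can appear near the top of the list. The sketch below shows one way to decide when to stop paging under that ordering. It runs on made-up in-memory pages instead of real API responses; `parse_ts`, `collect_merged`, the `number` field values and the sample timestamps are all hypothetical, and it assumes a PR's `updated_at` is never earlier than its `merged_at`.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse a GitHub-style timestamp such as 2025-01-31T12:00:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def collect_merged(pages, since):
    """Walk pages sorted by updated_at (newest first); keep PRs merged since `since`.

    `pages` is a hypothetical stand-in for successive API responses.
    """
    kept = []
    for prs in pages:
        if not prs:
            break
        for pr in prs:
            merged_at = pr.get("merged_at")
            # Old merges are skipped, not treated as a stop signal.
            if merged_at and parse_ts(merged_at) >= since:
                kept.append(pr["number"])
        # Stop only once the oldest update on the page predates the window.
        if parse_ts(prs[-1]["updated_at"]) < since:
            break
    return kept


since = datetime(2025, 1, 1, tzinfo=timezone.utc)
pages = [
    [
        # Merged long ago but updated recently: must not end the scan.
        {"number": 1, "merged_at": "2023-05-01T00:00:00Z", "updated_at": "2025-06-01T00:00:00Z"},
        {"number": 2, "merged_at": "2025-03-01T00:00:00Z", "updated_at": "2025-03-02T00:00:00Z"},
    ],
    [
        {"number": 3, "merged_at": None, "updated_at": "2025-02-01T00:00:00Z"},
        {"number": 4, "merged_at": "2024-11-01T00:00:00Z", "updated_at": "2024-11-01T00:00:00Z"},
    ],
    [
        {"number": 5, "merged_at": "2024-01-01T00:00:00Z", "updated_at": "2024-01-01T00:00:00Z"},
    ],
]
print(collect_merged(pages, since))
```

With this data only PR 2 is kept, and the third page is never read because the second page already ends before the window.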

_doc/practice/years/2026/index.rst

Lines changed: 1 addition & 0 deletions
@@ -8,3 +8,4 @@
 :caption: machine learning
 
 parcoursup_2026
+github_stat_pr
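
For readers of the `github_stat_pr` notebook registered above, its aggregation step buckets each merge date into the week starting on Monday and counts PRs per author and week. Below is a minimal standard-library sketch of that idea, not the notebook's pandas code; `week_start` and the sample authors and dates are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def week_start(dt: datetime) -> datetime:
    """Return midnight on the Monday of the week containing `dt`."""
    monday = dt - timedelta(days=dt.weekday())
    return monday.replace(hour=0, minute=0, second=0, microsecond=0)


# Made-up sample data, for illustration only.
merged = [
    {"author": "alice", "merged_at": datetime(2025, 3, 3, 9, 0, tzinfo=timezone.utc)},   # a Monday
    {"author": "alice", "merged_at": datetime(2025, 3, 9, 23, 0, tzinfo=timezone.utc)},  # Sunday, same week
    {"author": "bob", "merged_at": datetime(2025, 3, 10, 8, 0, tzinfo=timezone.utc)},    # the next Monday
]

# Count PRs per (author, week) pair.
counts = Counter((pr["author"], week_start(pr["merged_at"]).date()) for pr in merged)
for (author, week), n in sorted(counts.items()):
    print(author, week, n)
```

Both of alice's PRs land in the week of 2025-03-03, while bob's falls in the week of 2025-03-10.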
