6 changes: 6 additions & 0 deletions lab-07-observability/.gitignore
@@ -0,0 +1,6 @@
__pycache__/
*.pyc
*.pyo
.pytest_cache/
.venv/
venv/
25 changes: 25 additions & 0 deletions lab-07-observability/Dockerfile
@@ -0,0 +1,25 @@
FROM python:3.12-slim

RUN apt-get update && apt-get install -y --no-install-recommends curl && rm -rf /var/lib/apt/lists/*

COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

WORKDIR /app

RUN uv pip install --system \
    "django>=5.0" \
    "djangorestframework>=3.15" \
    "django-prometheus>=2.3" \
    "opentelemetry-sdk>=1.25" \
    "opentelemetry-exporter-otlp-proto-grpc>=1.25" \
    "opentelemetry-instrumentation-django>=0.46b0" \
    "opentelemetry-instrumentation-psycopg2>=0.46b0" \
    "psycopg2-binary>=2.9"

COPY . .

EXPOSE 8002

ENV DJANGO_SETTINGS_MODULE=config.settings

CMD ["python", "manage.py", "runserver", "0.0.0.0:8002"]
163 changes: 163 additions & 0 deletions lab-07-observability/README.md
@@ -0,0 +1,163 @@
# Lab 07: Observability with Prometheus, Grafana, and OpenTelemetry

A complete monitoring stack on top of the Django API from Lab 02. It exposes HTTP metrics with `django-prometheus`, distributed traces with the OpenTelemetry SDK, and a pre-configured Grafana dashboard with panels for requests per second, latency, and error rate.

---

## Stack

| Component | Technology |
|---|---|
| **Metrics** | Prometheus · django-prometheus |
| **Traces** | OpenTelemetry SDK · OTel Collector |
| **Visualization** | Grafana |
| **Instrumented application** | Django 5 (adapted from Lab 02) |
| **Infrastructure** | Docker · Docker Compose |

---

## Concepts Demonstrated

### Metrics vs Traces
**Metrics** (Prometheus) are numeric aggregates over time: requests per second, average latency, error rate. **Traces** (OpenTelemetry) are detailed records of the path a single request takes through the system, which makes them useful for diagnosing the latency or failure of one specific request.
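
The contrast can be sketched with the standard library alone. This is an illustration, not the lab's code: the `Span` class, the `handle` function, and the sample durations are all hypothetical. The counter keeps only an aggregate, while the span list keeps one record per request.

```python
import uuid
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Span:
    """One timed step in the path of a single request (trace-style record)."""
    trace_id: str
    name: str
    duration_s: float
    attributes: dict = field(default_factory=dict)


# Metric-style state: only aggregates survive; individual requests are lost.
request_count = Counter()

# Trace-style state: one detailed record per request.
spans = []


def handle(method: str, path: str, duration_s: float) -> None:
    """Record the same request both ways (durations are made-up sample values)."""
    request_count[(method, path)] += 1
    spans.append(
        Span(uuid.uuid4().hex, f"{method} {path}", duration_s, {"http.method": method})
    )


handle("GET", "/api/routes/", 0.012)
handle("GET", "/api/routes/", 0.250)
handle("GET", "/api/stops/", 0.008)

# The metric answers "how many?"; the trace answers "which one was slow?".
print(request_count[("GET", "/api/routes/")])  # 2
slowest = max(spans, key=lambda s: s.duration_s)
print(slowest.name, slowest.duration_s)        # GET /api/routes/ 0.25
```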

### django-prometheus
A package whose middleware instruments Django automatically and whose URL configuration exposes a `/metrics` endpoint in the Prometheus text format. It records HTTP request counters and histograms without changes to the application code.
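
To make the `/metrics` payload concrete, here is a minimal sketch of reading the Prometheus text format with the standard library. The sample payload is made up (the metric name comes from the query used later in this README; the help text and values are assumptions), and the `parse` helper is hypothetical and deliberately simplified: it ignores timestamps and escaped quotes inside label values.

```python
import re

# Hypothetical excerpt of what GET /metrics might return.
SAMPLE = """\
# HELP django_http_requests_total_by_method_total Count of requests by method.
# TYPE django_http_requests_total_by_method_total counter
django_http_requests_total_by_method_total{method="GET"} 42.0
django_http_requests_total_by_method_total{method="POST"} 3.0
"""

# metric_name{label="value",...} value
LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse(text):
    """Return a list of (name, labels, value) tuples, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE metadata and blank lines carry no sample
        match = LINE.match(line)
        if match is None:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


for name, labels, value in parse(SAMPLE):
    print(name, labels, value)
```

Prometheus performs this kind of scrape on a schedule and stores each sample with a timestamp, which is what later allows functions such as `rate()` to work over a time window.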

### OpenTelemetry Collector
A component that receives traces from the SDK, processes them, and exports them to one or more backends. It acts as an intermediate agent that decouples the application from the specific observability backend.

### Grafana Dashboards as Code
The `transit-overview.json` dashboard is mounted as a volume, and Grafana loads it automatically at startup through provisioning, with no manual configuration in the UI.
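
As a rough illustration of what "as code" means here, the sketch below builds a small dashboard document with Python's `json` module. It is not the lab's actual `transit-overview.json`: the `panel` helper, the `uid`, and the panel layout are assumptions, and the field names follow the Grafana dashboard JSON model only as far as this sketch needs.

```python
import json


def panel(panel_id: int, title: str, expr: str, x: int, y: int) -> dict:
    """Build one time-series panel backed by a single PromQL query."""
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": title,
        "gridPos": {"h": 8, "w": 12, "x": x, "y": y},
        "targets": [{"expr": expr, "refId": "A"}],
    }


dashboard = {
    "uid": "transit-overview",  # hypothetical uid
    "title": "SIMOVI Transit Overview",
    "panels": [
        panel(1, "Requests per second",
              "rate(django_http_requests_total_by_method_total[1m])", 0, 0),
        panel(2, "p95 latency",
              "histogram_quantile(0.95, rate(transit_request_duration_seconds_bucket[1m]))",
              12, 0),
    ],
}

# The serialized document is what would live in version control.
dashboard_json = json.dumps(dashboard, indent=2)
print(dashboard_json)
```

Because the dashboard is a plain JSON file, changes to panels and queries can be reviewed in a pull request like any other code change.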

---

## Structure

```
lab-07-observability/
├── docker-compose.yml                      # 4 services: django, prometheus, grafana, otel-collector
├── Dockerfile                              # python:3.12-slim + uv; installs dependencies with `uv pip install`
├── manage.py
├── demo.sh                                 # Generates traffic against /api/routes/, /api/stops/ and /metrics
├── config/
│   ├── settings.py                         # INSTALLED_APPS (django_prometheus, rest_framework), middleware, REST_FRAMEWORK JSON-only
│   ├── middleware.py                       # Custom middleware: records transit_request_duration_seconds
│   ├── urls.py                             # /metrics (django-prometheus) · /api/
│   └── wsgi.py
├── lab07/
│   └── apps/
│       └── routes/
│           ├── models.py                   # Route model (code, name, origin, destination, is_active)
│           ├── serializers.py              # RouteSerializer (DRF)
│           ├── views.py                    # RouteViewSet (ModelViewSet, public read-only)
│           ├── urls.py                     # DRF router → /api/routes/
│           └── apps.py                     # AppConfig with label="lab07_routes" (required for makemigrations)
├── prometheus/
│   └── prometheus.yml                      # Scrape jobs: django (:8002/metrics) · otel-collector (:8889/metrics)
├── grafana/
│   ├── provisioning/
│   │   ├── datasources/prometheus.yml      # Auto-provisioned Prometheus datasource
│   │   └── dashboards/default.yml          # Points to the directory of JSON dashboards
│   └── dashboards/
│       └── transit-overview.json           # 4 panels: req/s · p95 latency · 5xx errors · latency per endpoint
└── otel/
    └── otel-collector-config.yml           # Receivers: OTLP gRPC/HTTP · Exporters: Prometheus + logging
```

---

## Docker Services

| Service | Port (host) | Description |
|---|---|---|
| `django` | 8002 | Django instrumented with django-prometheus + OTel |
| `prometheus` | 9090 | Prometheus (scraping + storage) |
| `grafana` | 3001 | Grafana (dashboards) |
| `otel-collector` | 4317, 4318 | OTel Collector (gRPC / HTTP) |

---

## Quick Start

```bash
# Build and start the whole stack
docker compose up -d --build

# Follow the logs of the instrumented Django service
docker compose logs -f django
```

### First run: migrations and test data

```bash
# Create the tables (the app label must be given explicitly)
docker compose exec django python manage.py makemigrations lab07_routes
docker compose exec django python manage.py migrate

# Create test routes
docker compose exec django python manage.py shell -c "
from lab07.apps.routes.models import Route
Route.objects.create(code='R01', name='Ruta Escazu', origin='San Jose', destination='Escazu')
Route.objects.create(code='R02', name='Ruta Cartago', origin='San Jose', destination='Cartago')
Route.objects.create(code='R03', name='Ruta Alajuela', origin='San Jose', destination='Alajuela')
"
```

> **Note:** `makemigrations` without arguments does not detect the app automatically; the `lab07_routes` label must be given explicitly.

### Generate traffic to populate metrics

```bash
docker compose exec django bash demo.sh
```

> `bash demo.sh` must run **inside the container** (`docker compose exec django`). Running it directly from Windows/Git Bash fails because of the WSL2 relay.

### Access

| Interface | URL | Credentials |
|---|---|---|
| Django API | http://localhost:8002/api/routes/ | none |
| Raw metrics | http://localhost:8002/metrics | none |
| Prometheus | http://localhost:9090 | none |
| Grafana | http://localhost:3001 | admin / admin |

### View metrics in Grafana

1. Open http://localhost:3001
2. Log in with `admin` / `admin`
3. Go to **Dashboards → SIMOVI Transit Overview**

### Query in Prometheus

At http://localhost:9090, type a query in the expression bar and press **Execute**:

```promql
# Requests per second by HTTP method
rate(django_http_requests_total_by_method_total[1m])

# p95 latency of transit endpoints
histogram_quantile(0.95, rate(transit_request_duration_seconds_bucket[1m]))
```
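
The two queries above rely on `rate()` and `histogram_quantile()`. The following sketch shows the arithmetic behind them in plain Python, under simplifying assumptions: `simple_rate` ignores counter resets and the window extrapolation that Prometheus applies, and the bucket bounds and counts are made-up sample data. Both function names are hypothetical.

```python
import math


def simple_rate(samples):
    """Per-second increase between the first and last (timestamp, value) sample."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper bound,
    ending with the +Inf bucket. Assumes 0 < q <= 1 and at least one finite bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for i, (bound, count) in enumerate(buckets):
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls in the +Inf bucket: report the highest finite bound.
                return buckets[i - 1][0]
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


counter_samples = [(0.0, 100.0), (30.0, 130.0), (60.0, 160.0)]
print(simple_rate(counter_samples))  # 1.0 requests per second

latency_buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
print(histogram_quantile(0.95, latency_buckets))  # 0.8125 seconds
```

Because the result is interpolated within a bucket, the accuracy of a p95 estimate depends on how closely the histogram's bucket bounds bracket the real latencies.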

---

## Available Metrics

| Metric | Type | Description |
|---|---|---|
| `django_http_requests_total` | Counter | Total requests by method, path, and status |
| `django_http_request_duration_seconds` | Histogram | HTTP request latency |
| `django_http_requests_latency_seconds` | Summary | Latency per view |
| `transit_request_duration_seconds` | Histogram | Custom latency metric per transit route |
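
The custom histogram is labeled by `method` and `endpoint`, and `config/middleware.py` normalizes the path before using it as a label. The standalone sketch below mirrors that rule to show why it matters: without it, every route id would create a separate time series. The function name `normalize` and the generated paths are illustrative.

```python
def normalize(path: str) -> str:
    """Replace purely numeric path segments with "<id>" (same rule as the middleware)."""
    parts = path.strip("/").split("/")
    return "/" + "/".join("<id>" if part.isdigit() else part for part in parts)


# A thousand distinct detail URLs collapse into a single label value.
paths = [f"/api/routes/{i}/" for i in range(1, 1001)]
labels = {normalize(p) for p in paths}

print(labels)                     # {'/api/routes/<id>'}
print(normalize("/api/routes/"))  # /api/routes
```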

---

## What This Lab Demonstrates

- **Automatic instrumentation** of Django with `django-prometheus`, without touching business logic
- **Distributed traces** with the OpenTelemetry SDK, exported to the OTel Collector
- A complete **observability pipeline**: application → OTel Collector → Prometheus → Grafana
- **Dashboards as code** with automatic Grafana provisioning from JSON
- **Custom metrics** with dedicated histograms for transit endpoint latency
Empty file.
43 changes: 43 additions & 0 deletions lab-07-observability/config/middleware.py
@@ -0,0 +1,43 @@
"""Middleware with custom metrics for transit endpoints."""

import time

from prometheus_client import Histogram

TRANSIT_REQUEST_DURATION = Histogram(
    "transit_request_duration_seconds",
    "Latency of requests to transit endpoints",
    ["method", "endpoint"],
)


class TransitMetricsMiddleware:
    """Records the latency of requests to /api/routes/ and /api/stops/."""

    TRACKED_PREFIXES = ("/api/routes", "/api/stops")

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        path = request.path
        # Untracked paths (for example /metrics) pass through unmeasured.
        if not any(path.startswith(p) for p in self.TRACKED_PREFIXES):
            return self.get_response(request)

        endpoint = self._normalize(path)
        start = time.perf_counter()
        response = self.get_response(request)
        duration = time.perf_counter() - start

        TRANSIT_REQUEST_DURATION.labels(
            method=request.method,
            endpoint=endpoint,
        ).observe(duration)

        return response

    @staticmethod
    def _normalize(path: str) -> str:
        # Replace numeric segments with "<id>" to keep label cardinality bounded.
        parts = path.strip("/").split("/")
        normalized = ["<id>" if part.isdigit() else part for part in parts]
        return "/" + "/".join(normalized)
48 changes: 48 additions & 0 deletions lab-07-observability/config/settings.py
@@ -0,0 +1,48 @@
"""Settings for the instrumented Django app of Lab 07."""

import os
from pathlib import Path

BASE_DIR = Path(__file__).resolve().parent.parent

SECRET_KEY = "lab07-observability-secret-key"

DEBUG = True

ALLOWED_HOSTS = ["*"]

INSTALLED_APPS = [
    "django_prometheus",
    "django.contrib.contenttypes",
    "django.contrib.auth",
    "rest_framework",
    "lab07.apps.routes",
]

MIDDLEWARE = [
    "django_prometheus.middleware.PrometheusBeforeMiddleware",
    "django.middleware.common.CommonMiddleware",
    "config.middleware.TransitMetricsMiddleware",
    "django_prometheus.middleware.PrometheusAfterMiddleware",
]

ROOT_URLCONF = "config.urls"

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.environ.get("POSTGRES_DB", "lab07"),
        "USER": os.environ.get("POSTGRES_USER", "lab07_user"),
        "PASSWORD": os.environ.get("POSTGRES_PASSWORD", "lab07_pass"),
        "HOST": os.environ.get("POSTGRES_HOST", "db"),
        "PORT": "5432",
    }
}

REST_FRAMEWORK = {
    "DEFAULT_RENDERER_CLASSES": ["rest_framework.renderers.JSONRenderer"],
    "DEFAULT_PAGINATION_CLASS": "rest_framework.pagination.PageNumberPagination",
    "PAGE_SIZE": 20,
}

DEFAULT_AUTO_FIELD = "django.db.models.BigAutoField"
6 changes: 6 additions & 0 deletions lab-07-observability/config/urls.py
@@ -0,0 +1,6 @@
from django.urls import include, path

urlpatterns = [
    path("", include("django_prometheus.urls")),
    path("api/", include("lab07.apps.routes.urls")),
]
7 changes: 7 additions & 0 deletions lab-07-observability/config/wsgi.py
@@ -0,0 +1,7 @@
import os

from django.core.wsgi import get_wsgi_application

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "config.settings")

application = get_wsgi_application()
38 changes: 38 additions & 0 deletions lab-07-observability/demo.sh
@@ -0,0 +1,38 @@
#!/usr/bin/env bash
# demo.sh: generates traffic against the Lab 07 Django app to populate metrics.
# Usage: bash demo.sh [duration_in_seconds]

set -euo pipefail

BASE_URL="${BASE_URL:-http://localhost:8002}"
DURATION="${1:-60}"
ENDPOINTS=(
  "/api/routes/"
  "/api/routes/1/"
  "/api/routes/2/"
  "/api/stops/"
  "/metrics"
)

echo "=================================================="
echo " SIMOVI: traffic generator"
echo " Base URL : $BASE_URL"
echo " Duration : ${DURATION}s"
echo "=================================================="

end=$((SECONDS + DURATION))
count=0

while [ "$SECONDS" -lt "$end" ]; do
  for ep in "${ENDPOINTS[@]}"; do
    url="${BASE_URL}${ep}"
    # If curl itself fails (e.g. connection refused), report ERR instead of "000ERR".
    status=$(curl -s -o /dev/null -w "%{http_code}" "$url" 2>/dev/null) || status="ERR"
    printf "[%s] %-35s → %s\n" "$(date +%H:%M:%S)" "$ep" "$status"
    count=$((count + 1))
  done
  sleep 2
done

echo "--------------------------------------------------"
echo "Requests sent: $count"
echo "Open Grafana at http://localhost:3001 to see the metrics."