silvahudson/README.md

Hudson Silva

Senior Data Engineer · AI Data Engineer
9+ years building production data platforms — lakehouse, governance, and applied GenAI/RAG.

LinkedIn: osilvahudson · Open to remote work (global) · Based in Brazil


👋 About

Senior / AI Data Engineer with 9+ years designing and shipping end-to-end data solutions, from stakeholder requirements to production lakehouse platforms. I work where most teams are short on talent: robust data engineering combined with applied GenAI / RAG in production (embeddings, retrieval, LLM-backed pipelines). My focus is data and LLM infrastructure, not ML research.

Currently consulting as a Senior / AI Data Engineer, fully remote.

🧰 Tech I work with

  • Languages & Processing: Python · PySpark · SQL
  • Lakehouse & Warehouse: Databricks · Unity Catalog · Snowflake · Delta Lake
  • Cloud: Azure (Synapse, Data Factory, ADLS) · GCP
  • Orchestration & Modeling: Apache Airflow 3 · dbt
  • Data + AI: RAG pipelines · Embeddings (sentence-transformers) · Vector DB (Qdrant) · LLM serving
  • Engineering & Governance: FastAPI · Docker · pytest · PostgreSQL/PostGIS · Data Quality · GDPR/LGPD

📈 Selected impact (production work)

  • 🚀 Scalable data platform sustaining 200% data growth
  • 💸 −30% storage cost · −50% pipeline latency
  • −40% processing time migrating Pandas → PySpark
  • +25% data accuracy · −40% integration errors via governance (GDPR/LGPD)

📌 Featured project

fastapi-trips — Mobility-data engineering pipeline: FastAPI ingestion (sync + async), PostgreSQL + PostGIS for spatial-temporal analysis, real-time status via WebSocket, Dockerized, scalable design (GIST indexing, AWS architecture sketch). A compact end-to-end example of how I build production data services.
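To give a feel for the sync + async ingestion idea, here is a minimal standard-library sketch. It is not the actual fastapi-trips code: `IngestionJob`, `JOBS`, `validate_trip`, `ingest_sync`, `ingest_async`, and `submit`, as well as the `lat`/`lon` field names, are hypothetical and chosen only for illustration. The job's `status` and `processed` fields stand in for what a WebSocket endpoint could stream to clients.

```python
import asyncio
import uuid
from dataclasses import dataclass


@dataclass
class IngestionJob:
    """Hypothetical job record; a status endpoint could expose these fields."""
    job_id: str
    total: int
    processed: int = 0
    status: str = "pending"


# In-memory stand-in for a real job store (assumption for this sketch).
JOBS: dict[str, IngestionJob] = {}


def validate_trip(trip: dict) -> bool:
    """Accept only trips whose coordinates fall inside valid lat/lon ranges."""
    return -90 <= trip["lat"] <= 90 and -180 <= trip["lon"] <= 180


def ingest_sync(trips: list[dict]) -> int:
    """Synchronous path: validate a small batch and return the accepted count."""
    return sum(1 for trip in trips if validate_trip(trip))


async def ingest_async(job: IngestionJob, trips: list[dict], batch_size: int = 2) -> int:
    """Asynchronous path: work in batches and update job status between them."""
    job.status = "running"
    accepted = 0
    for start in range(0, len(trips), batch_size):
        batch = trips[start:start + batch_size]
        accepted += ingest_sync(batch)
        job.processed += len(batch)
        # Yield to the event loop, as an awaited database write would.
        await asyncio.sleep(0)
    job.status = "done"
    return accepted


def submit(trips: list[dict]) -> IngestionJob:
    """Register a new job so its progress can be looked up by id."""
    job = IngestionJob(job_id=uuid.uuid4().hex, total=len(trips))
    JOBS[job.job_id] = job
    return job
```

In a real service the batch step would write to PostgreSQL/PostGIS rather than just count rows, and the job store would live outside process memory.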

Most of my work lives behind enterprise NDAs (Intel, Vale, BEES/AB InBev, Cyrela, Paschoalotto). For the full picture — projects, impact, and references — see my LinkedIn and reach out directly.

📫 Contact

  • LinkedIn: in/osilvahudson
  • Open to remote Senior / AI Data Engineer roles · LATAM-friendly

Pinned

  1. fastapi-trips (Public)

    Production-grade data engineering pipeline for mobility data — FastAPI + PostgreSQL/PostGIS, sync & async ingestion, real-time WebSocket status, Dockerized, designed to scale to 100M+ trips.

    Python