prathamk11/README.md

🚀 Prathamesh Kulkarni

Data Engineer • AI Engineer • MLOps Developer

💡 Designing scalable data platforms, real-time streaming pipelines, and production AI systems

Transforming raw data → intelligent systems → business impact using
Data Engineering • Machine Learning • Cloud Infrastructure

💡 Core Specializations

Real-Time Data Engineering
Kafka • Spark • Airflow • Distributed Streaming Pipelines

🤖 AI & Generative AI Systems
LLMs • RAG Architectures • NLP • Deep Learning Models

☁️ Cloud & MLOps Infrastructure
AWS • Docker • Kubernetes • MLflow • CI/CD

📊 End-to-End Data Platforms
Data Ingestion → Feature Engineering → ML Pipelines → API Deployment


🎯 Engineering Focus

  • ⚡ Building high-throughput streaming data systems
  • 🧠 Designing production-grade ML pipelines
  • 🤖 Developing LLM-powered AI applications
  • ☁️ Deploying scalable cloud-native AI infrastructure

Turning Data into Scalable Intelligent Systems

🧠 About Me

Hi, I'm Prathamesh Kulkarni, a Data Engineer and AI Developer based in Pune, India 🇮🇳.

I build production-grade data pipelines, machine learning systems, and AI applications designed to operate at scale.
My work focuses on real-time data processing, distributed systems, and deploying intelligent models into production environments.


🎓 Education

  • M.Sc. in Computer Science, Savitribai Phule Pune University (CGPA: 8.5)
  • B.E. in Computer Science, Savitribai Phule Pune University (CGPA: 8.6)

📈 Proven Business Impact

| Metric | Result | Where |
|---|---|---|
| Streaming Latency Reduced | 45% | Telphatech LLP · Kafka Architecture |
| 🤖 Manual Effort Eliminated | 40% | CaryanamIndia · PySpark + Airflow |
| 🎯 Production Model Accuracy | 88%+ | NullClass · TensorFlow + HuggingFace |
| 💬 Chatbot Intent Accuracy | +32% | Telphatech LLP · Flask + PyTorch |
| 📊 User Interactions Tracked | 10K+ | Telphatech LLP · Streamlit Dashboards |
| 🌲 Model Training Time Cut | 40% | NullClass · PySpark Pipelines |

🏗️ System Architecture Expertise

                        ┌─────────────────────────────────────────────────┐
                        │          REAL-TIME AI DATA PLATFORM              │
                        └─────────────────────────────────────────────────┘

   Data Sources          Ingestion           Processing          Serving
  ┌──────────┐         ┌─────────┐         ┌──────────┐        ┌─────────┐
  │ REST APIs│────────▶│  Kafka  │────────▶│  PySpark │───────▶│ FastAPI │
  │ Databases│         │ Streams │         │Streaming │        │  Flask  │
  │  Files   │         └─────────┘         └──────────┘        └─────────┘
  └──────────┘              │                    │                   │
                            ▼                    ▼                   ▼
                       ┌─────────┐         ┌──────────┐        ┌─────────┐
                       │ Airflow │         │  Delta   │        │ Docker  │
                       │  DAGs   │         │   Lake   │        │   K8s   │
                       └─────────┘         └──────────┘        └─────────┘
                            │                    │                   │
                            ▼                    ▼                   ▼
                       ┌─────────┐         ┌──────────┐        ┌─────────┐
                       │   dbt   │         │   ML     │        │   AWS   │
                       │Snowflake│         │  Model   │        │ EC2·S3  │
                       └─────────┘         └──────────┘        └─────────┘
                                                │
                                    ┌───────────┴───────────┐
                                    │      MLflow           │
                                    │  Experiment Tracking  │
                                    └───────────────────────┘
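
The ingestion → processing → serving flow in the diagram can be illustrated with a minimal, standard-library sketch. It is only a stand-in: plain Python generators play the roles of Kafka and PySpark Streaming, and the function names, record fields, and window size are assumptions made for illustration, not code from any repository in this profile.

```python
from collections import deque
from statistics import mean
from typing import Iterable, Iterator


def ingest(records: Iterable[dict]) -> Iterator[dict]:
    """Stand-in for the ingestion stage: pass records on one at a time,
    dropping any that lack the (hypothetical) 'sensor' and 'value' fields."""
    for record in records:
        if "sensor" in record and "value" in record:
            yield record


def process(stream: Iterator[dict], window: int = 3) -> Iterator[dict]:
    """Stand-in for the processing stage: attach a rolling mean over
    the most recent `window` values seen so far."""
    recent = deque(maxlen=window)
    for record in stream:
        recent.append(record["value"])
        yield {**record, "rolling_mean": mean(recent)}


def serve(stream: Iterator[dict]) -> list[dict]:
    """Stand-in for the serving stage: materialize results so an API
    handler could return them."""
    return list(stream)


# Made-up events; the second one is malformed and gets filtered out.
events = [
    {"sensor": "a", "value": 10.0},
    {"bad": True},
    {"sensor": "a", "value": 20.0},
    {"sensor": "a", "value": 30.0},
    {"sensor": "a", "value": 40.0},
]
results = serve(process(ingest(events)))
```

Because each stage is a generator, records flow through one at a time rather than being loaded in bulk, which is the same shape a real streaming job has even though the tooling differs.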

⚡ Full Tech Stack

🧑‍💻 Languages

Python • SQL • R • JavaScript • Bash • YAML

🤖 AI / ML & GenAI

PyTorch • TensorFlow • Scikit-learn • HuggingFace • LangChain • LlamaIndex • OpenAI • MLflow • Pinecone

📊 Big Data & Data Engineering

PySpark • Kafka • Airflow • Databricks • Snowflake • dbt • Delta Lake

☁️ Cloud & DevOps

AWS • Docker • Kubernetes • Git • MongoDB • Elasticsearch • Power BI

🛠️ Frameworks & Tools

Flask • FastAPI • React • Streamlit • Postman • Jupyter


💼 Work Experience

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  🏢  CaryanamIndia                              Oct 2025 – Jan 2026
      Software Development Intern — AI & Data Engineering | Pune
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  ✅  Architected PySpark + Airflow automation pipelines
      → Eliminated 40% manual effort across business operations
  ✅  Built NLP document intelligence pipelines on AWS S3 + Lambda
      → Enabled scalable, low-latency automated workflows
  ✅  Delivered AI-powered Power BI decision-support dashboards
      → Directly improved operational KPIs & cross-team productivity

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  🏢  Telphatech LLP                             Jan 2024 – Jul 2024
      Full-Stack Developer Intern | Pune
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  ✅  Deployed production AI chatbot (Flask + PyTorch)
      → 32% boost in intent-recognition accuracy on live traffic
  ✅  Engineered real-time Kafka streaming pipelines
      → 45% reduction in end-to-end system latency
  ✅  Built Streamlit + Tableau dashboards tracking 10K+ interactions
      → Containerized via Docker for scalable deployment

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  🏢  NullClass                                  Jan 2024 – Jun 2024
      Data Science Intern | Remote
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  ✅  Developed TensorFlow + HuggingFace emotion-detection models
      → 88%+ accuracy on production datasets
  ✅  Refactored PySpark preprocessing pipelines
      → 40% reduction in model training time
  ✅  Tracked experiments with MLflow
      → Delivered production-ready ML components
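
Several of the roles above involve Airflow pipelines, where tasks form a directed acyclic graph and a task runs only after everything upstream of it has finished. The sketch below shows that ordering idea using Python's standard `graphlib` module rather than the Airflow API; the task names and dependencies are hypothetical and not taken from any of the pipelines described.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "build_features": {"clean"},
    "train_model": {"build_features"},
    "publish_report": {"clean"},
}


def execution_order(deps: dict[str, set[str]]) -> list[str]:
    """Return one valid run order in which every task appears after
    all of its dependencies. Raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
```

More than one order can be valid here (`publish_report` may run before or after `train_model`), so a scheduler is free to run independent branches in parallel.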

🚀 Featured Projects

| Project | Stack | Highlights |
|---|---|---|
| 🌫️ Air Quality Index Prediction | PySpark · Flask · Random Forest · Heroku | End-to-end regression pipeline · Web scraping · 6+ models benchmarked · Best RMSE: 38.85 · Live on Heroku |
| 🌿 Cotton Plant Disease Detection | TensorFlow · VGG-19 · Flask · Docker | Fine-tuned transfer learning · 94.6% accuracy · Dockerized · Real-time inference API |
| 📈 Apple Stock Price Forecasting | Stacked LSTM · Tiingo API · MLflow | 100-day lookback windows · MLflow tracking · Test RMSE: 239.6 |
| 🔍 Fraud Transaction Classification | Scikit-learn · PySpark · Python | Imbalanced data handling · Cross-validation · Random Forest: 94% accuracy |
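
Two ideas from the projects above, lookback windows and RMSE, can be shown with a small standard-library sketch. It uses a made-up five-point series and a lookback of 3 instead of 100 days, scores a naive "repeat the last value" baseline rather than an LSTM, and the helper names are assumptions; it is not the code from the forecasting repository.

```python
import math


def make_windows(series: list[float], lookback: int) -> list[tuple[list[float], float]]:
    """Split a series into (window, next_value) pairs, where each window
    holds `lookback` consecutive points and the target is the point after it."""
    if lookback < 1 or lookback >= len(series):
        raise ValueError("lookback must be at least 1 and shorter than the series")
    return [
        (series[i:i + lookback], series[i + lookback])
        for i in range(len(series) - lookback)
    ]


def rmse(actual: list[float], predicted: list[float]) -> float:
    """Root mean squared error between two equal-length sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy data for illustration only.
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
pairs = make_windows(prices, lookback=3)

# Naive baseline: predict that the next value equals the last one in the window.
targets = [target for _, target in pairs]
naive_predictions = [window[-1] for window, _ in pairs]
baseline_error = rmse(targets, naive_predictions)
```

RMSE is expressed in the same units as the target, so a score only means something next to the scale of the data or a baseline like the one above.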

🎓 Certifications

AWS • Databricks • DeepLearning • Microsoft


🌐 Let's Connect & Build Something Great

LinkedIn • GitHub • Gmail


💼 Available for: Data Engineer · AI Engineer · MLOps · Data Analyst roles

📍 Based in: Pune, India  |  🌐 Open to: Remote & Hybrid roles globally

🔒 Engineering Credibility

✔️ All projects in this profile follow production-grade practices used in real data platforms.

✔️ Code includes scalable data pipelines, ML workflows, and deployment-ready architectures.

✔️ Built using industry tools such as PySpark, Kafka, Airflow, AWS, Docker, and MLflow.

✔️ Every repository contains complete code, documentation, and reproducible workflows.



"Data is the new oil — I build the refineries that turn it into intelligence."

