UIDAI Data Hackathon 2026 Submission
Problem • Solution • Features • Methodology • Visualizations • Installation • Tech Stack • Team
- Problem Statement
- Our Solution
- Key Features
- Deep Learning Edition
- Architecture
- Methodology Flow Diagram
- Key Findings
- Visualizations
- Installation
- Usage
- Dataset Structure
- Tech Stack
- Policy Recommendations
- Team
The Hidden Crisis: Children Losing Benefits Due to MBU Non-Compliance
Children enrolled in Aadhaar are mandated to update their biometrics at two critical life stages:
| Age | Update Type | Reason |
|---|---|---|
| 5 years | Mandatory Biometric Update | Fingerprints mature, facial features change |
| 15 years | Mandatory Biometric Update | Adolescent biometric changes |
If an update is missed, the Aadhaar becomes INACTIVE:
├── 🚫 School admission blocked
├── 🚫 Scholarship disbursement fails
├── 🚫 Mid-day meal authentication fails
└── 🚫 DBT (Direct Benefit Transfer) denied
Thousands of children risk losing government benefits worth crores of rupees annually because their Aadhaar was not updated on time, often due to lack of awareness or inaccessible update centers.
We built a system that identifies "Ghost Cohorts": children who were enrolled in Aadhaar but never completed their Mandatory Biometric Updates.
┌────────────────────────────────────────────────────────────┐
│                      MBU GAP ANALYZER                      │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  Layer 1: COHORT TRACKING                                  │
│  ├── Track children from enrolment → MBU age               │
│  ├── Calculate MBU Compliance Ratio per district           │
│  └── Identify "Ghost Cohorts" with high gap                │
│                                                            │
│  Layer 2: SERVICE DESERT IDENTIFICATION                    │
│  ├── K-Means clustering on district performance            │
│  ├── Identify underserved areas                            │
│  └── Priority ranking for intervention                     │
│                                                            │
│  Layer 3: SCHOLARSHIP RISK PREDICTION                      │
│  ├── Time series analysis of update trends                 │
│  ├── Forecast "Update Crunch" periods                      │
│  └── Predict children at risk before admission season      │
│                                                            │
└────────────────────────────────────────────────────────────┘
Branch: deep-learning | Notebook: MBU_Gap_Analyzer_DeepLearning.ipynb
The deep-learning branch extends the standard ML solution with four additional modules:
| # | Module | Technology | Purpose |
|---|---|---|---|
| 1 | LSTM Forecaster | PyTorch | Time-series prediction for MBU demand during school admission rush |
| 2 | Autoencoder Anomaly | PyTorch | Unsupervised detection of suspicious districts with abnormal patterns |
| 3 | SHAP Explainability | SHAP Library | Explain WHY K-Means classified districts as Service Deserts |
| 4 | Geospatial Map | Folium | Interactive India map with color-coded Service Desert markers |
Architecture: Input → LSTM (2 layers, 64 hidden) → FC → Output
LSTM (Long Short-Term Memory) captures temporal patterns in biometric update trends to predict future MBU demand.
- Training: 100 epochs with Adam optimizer + learning rate scheduling
- Output: 6-month forecast to anticipate school admission rush (June-July 2026)
- Use Case: UIDAI can pre-deploy mobile Seva Kendras to high-demand districts
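As a rough illustration of how an update series could be turned into training pairs for a sequence model such as the LSTM described above, the sketch below builds sliding windows. The function name `make_windows`, the window length, and the sample counts are assumptions made for this example; the notebook's actual preprocessing may differ.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], window: int, horizon: int = 1
                 ) -> List[Tuple[List[float], List[float]]]:
    """Split a series into (input window, target horizon) pairs."""
    if window <= 0 or horizon <= 0:
        raise ValueError("window and horizon must be positive")
    pairs = []
    # The last valid start index leaves room for both the window and the horizon.
    for start in range(len(series) - window - horizon + 1):
        x = list(series[start:start + window])
        y = list(series[start + window:start + window + horizon])
        pairs.append((x, y))
    return pairs


# Hypothetical monthly MBU counts (made-up numbers, not taken from the dataset).
monthly_updates = [120, 135, 150, 160, 210, 260, 240, 180]
pairs = make_windows(monthly_updates, window=3, horizon=1)
print(len(pairs), pairs[0])
```

Each pair holds `window` past values as input and the next `horizon` values as the target, which is the usual shape expected by a sequence forecaster.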
Encoder: 5 features → 32 → 16 → 8 (bottleneck)
Decoder: 8 → 16 → 32 → 5 features (reconstruction)
Autoencoder learns normal patterns and flags districts with high reconstruction error as anomalies.
- Detection: Districts with abnormal enrolment-to-update ratios
- Use Case: Identify data quality issues or potential fraud
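The flagging step can be sketched without the network itself. The example below assumes each district already has a reconstruction error and flags those above `mean + k * std`; that threshold rule, the function names, and the error values are assumptions for illustration, since the README does not say how the cut-off is chosen.

```python
from statistics import mean, pstdev
from typing import Dict, List


def reconstruction_error(original: List[float], reconstructed: List[float]) -> float:
    """Mean squared error between a feature vector and its reconstruction."""
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def flag_anomalies(errors: Dict[str, float], k: float = 2.0) -> List[str]:
    """Return districts whose error exceeds mean + k * std of all errors."""
    values = list(errors.values())
    # ASSUMPTION: a mean + k*std cut-off; a percentile rule would also work.
    threshold = mean(values) + k * pstdev(values)
    return sorted(d for d, e in errors.items() if e > threshold)


# Hypothetical per-district reconstruction errors (illustrative values only).
errors = {"A": 0.02, "B": 0.03, "C": 0.025, "D": 0.02, "E": 0.03, "F": 0.9}
print(flag_anomalies(errors))
```

District `F` is the only one flagged here because its error sits far above the rest.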
SHAP (SHapley Additive exPlanations) uses game theory to explain model decisions.
- Method: KernelSHAP for K-Means clustering
- Output: Feature importance showing why a district is classified as Service Desert
- Use Case: Provide transparent, auditable explanations for policy decisions
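To make the game-theory idea concrete, the sketch below computes exact Shapley values for a toy scoring function by averaging each feature's marginal contribution over every ordering. This is not KernelSHAP, which approximates these values by sampling; exact enumeration is only practical for a handful of features. The feature names, weights, and `toy_score` function are invented for illustration.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, List


def shapley_values(features: List[str],
                   value: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present: FrozenSet[str] = frozenset()
        for f in order:
            with_f = present | {f}
            # Marginal contribution of f given the features already present.
            totals[f] += value(with_f) - value(present)
            present = with_f
    return {f: t / len(orderings) for f, t in totals.items()}


# Toy additive "risk score"; the weights are hypothetical.
WEIGHTS = {"compliance_ratio": 0.6, "mbu_gap": 0.3, "update_volume": 0.1}


def toy_score(present: FrozenSet[str]) -> float:
    """Score of a coalition: sum of the weights of the features it contains."""
    return sum(WEIGHTS[f] for f in present)


phi = shapley_values(list(WEIGHTS), toy_score)
print(phi)
```

For an additive score like this one, each feature's Shapley value equals its weight, and the values sum to the full score.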
Key Result: LSTM reduces RMSE by 1.7% vs Linear Regression and 17% vs Moving Average
Training converges smoothly over 100 epochs with learning rate scheduling
LSTM predicts MBU demand surge during school admission season (June-July 2026)
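A relative RMSE reduction like the one quoted above is typically computed as follows. The helper names and the sample numbers are assumptions for this example and do not reproduce the notebook's results.

```python
from math import sqrt
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def pct_reduction(baseline_rmse: float, model_rmse: float) -> float:
    """Percentage by which the model's RMSE is lower than the baseline's."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Made-up values: the baseline misses by 10 everywhere, the model by 5.
actual = [100.0, 110.0, 120.0]
baseline_pred = [90.0, 100.0, 110.0]
model_pred = [95.0, 105.0, 115.0]

reduction = pct_reduction(rmse(actual, baseline_pred), rmse(actual, model_pred))
print(f"RMSE reduction: {reduction:.1f}%")
```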
Folium generates an interactive HTML map of India with:
| Marker Color | Status | Compliance |
|---|---|---|
| 🔴 Red | Service Desert | < 50% |
| 🟠 Orange | At Risk | 50-80% |
| 🟢 Green | Compliant | > 80% |
- Popups: District name, compliance ratio, MBU gap
- Output: analysis_outputs/service_desert_map.html
# Switch to deep-learning branch
git checkout deep-learning
# Install additional dependencies
pip install torch shap folium
# Run the notebook
jupyter notebook MBU_Gap_Analyzer_DeepLearning.ipynb

| Compliance Ratio | Risk Category | Action Required |
|---|---|---|
| ≥ 80% | 🟢 Green (Compliant) | Maintain current operations |
| 50% - 80% | 🟡 Yellow (Moderate Risk) | Awareness campaigns needed |
| < 50% | 🔴 Red (Service Desert) | Urgent intervention required |
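The thresholds in the table above translate directly into a small lookup. The function name `risk_category` and the choice to express the ratio as a fraction between 0 and 1 are assumptions for this sketch.

```python
def risk_category(compliance_ratio: float) -> str:
    """Map a compliance ratio (fraction, e.g. 0.65 for 65%) to a risk category."""
    if compliance_ratio < 0:
        raise ValueError("compliance ratio cannot be negative")
    if compliance_ratio >= 0.80:
        return "Green (Compliant)"
    if compliance_ratio >= 0.50:
        return "Yellow (Moderate Risk)"
    return "Red (Service Desert)"


for ratio in (0.92, 0.65, 0.31):
    print(f"{ratio:.0%} -> {risk_category(ratio)}")
```

Boundary values follow the table: exactly 80% counts as Green and exactly 50% as Yellow.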
Input Data                     Processing               Output
────────────────────────────────────────────────────────────────────────
                             ┌──────────────┐
[Enrolment CSV]   ──────────►│              │
                             │  Merge &     │     ┌─────────────────┐
[Biometric CSV]   ──────────►│  Aggregate   │────►│ Gap Analysis DF │
                             │  by Dist.    │     └────────┬────────┘
[Demographic CSV] ──────────►│              │              │
                             └──────────────┘              ▼
                                                  ┌─────────────────┐
                                                  │ StandardScaler  │
                                                  └────────┬────────┘
                                                           ▼
                                                  ┌─────────────────┐
                                                  │  K-Means (k=4)  │
                                                  └────────┬────────┘
                                                           ▼
                                               ┌────────────────────────┐
                                               │   Cluster Assignment   │
                                               │   • Service Desert     │
                                               │   • At Risk            │
                                               │   • Moderate           │
                                               │   • High Performer     │
                                               └────────────────────────┘
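The scaling and clustering steps in the flow above can be sketched in plain Python. The project lists scikit-learn in its tech stack, so the notebook presumably uses `StandardScaler` and `KMeans`; the dependency-free version below is a stand-in that only illustrates the two steps. The feature choice, the sample rows, the naive "first k points" initialisation, and `k=2` (instead of the README's `k=4`) are all assumptions made to keep the example small.

```python
from math import dist
from statistics import mean, pstdev
from typing import List, Sequence


def standardize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Z-score each column (zero mean, unit population variance)."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # Constant columns get a scale of 1.0 to avoid division by zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points: List[List[float]], k: int, iters: int = 50) -> List[int]:
    """Lloyd's algorithm with a naive deterministic init (the first k points)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels


# Hypothetical districts: [compliance ratio, MBU gap]; values are made up.
rows = [[0.95, 100], [0.20, 5000], [0.90, 200], [0.25, 4800]]
scaled = standardize(rows)
labels = kmeans(scaled, k=2)
print(labels)
```

Scaling first matters because the raw MBU gap is orders of magnitude larger than the compliance ratio and would otherwise dominate the distance calculation.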
Summary metrics: Districts Analyzed, Children at Risk, Service Desert Districts, Benefits at Risk.
Green (Compliant)     ████████████████████████████████████████ 930 districts (87%)
Yellow (Moderate)     ██ 32 districts (3%)
Red (Service Desert)  ██████ 147 districts (14%)
Our analysis generates 8 interactive visualizations for the hackathon submission:
| # | Chart | Purpose |
|---|---|---|
| 1 | Problem Districts by State | Identify states with most Service Desert + Moderate Risk districts |
| 2 | MBU Gap Treemap | Visualize children at risk by state (size = gap, color = severity) |
| 3 | MBU Gap by State | Children at risk of losing benefits per state |
| 4 | Optimal Cluster Selection | Elbow Method + Silhouette Score for K-Means tuning |
| 5 | Service Desert Scatter Plot | K-Means clustering with district-level risk categories |
| 6 | Daily Update Trend | Time series with 7-day moving average |
| 7 | Forecast Chart | Prophet-based 6-month prediction for school admission season |
| 8 | Financial Impact Bar | Scholarships & DBT benefits at risk (Rs. Crore) |
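The 7-day moving average used in the daily update trend chart can be computed as in the sketch below. Whether the notebook uses a trailing or a centred window is not stated, so the trailing window here is an assumption, as are the function name and the sample counts.

```python
from typing import List, Optional, Sequence


def moving_average(values: Sequence[float], window: int = 7) -> List[Optional[float]]:
    """Trailing moving average; positions without a full window are None."""
    if window <= 0:
        raise ValueError("window must be positive")
    out: List[Optional[float]] = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


# Hypothetical daily update counts (illustrative only).
daily_updates = [40, 42, 38, 45, 50, 10, 12, 44, 47, 41]
print(moving_average(daily_updates, window=7))
```

Smoothing over seven days removes the weekly rhythm (for example, low weekend volumes) so that the underlying trend is easier to read.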
All visualizations are also available as interactive HTML files in analysis_outputs/
Explore the Folium-powered geospatial visualization showing Service Desert districts across India:
- 🔴 Red markers = Service Desert districts (< 50% compliance)
- 🟡 Yellow markers = Moderate Risk districts (50-80%)
- 🟢 Green markers = Compliant districts (> 80%)
- Click markers for district-level details (MBU Gap, Compliance %)
- Python 3.10 or higher
- pip package manager
- Git
# 1. Clone the repository
git clone https://github.com/STIWARTs/UIDAI_DH_2k26.git
cd UIDAI_DH_2k26
# 2. Create virtual environment
python -m venv .venv
# 3. Activate virtual environment
# Windows (PowerShell)
.venv\Scripts\Activate.ps1
# Windows (CMD)
.venv\Scripts\activate.bat
# Linux/Mac
source .venv/bin/activate
# 4. Install all dependencies
pip install -r requirements.txt

| Category | Packages |
|---|---|
| Core Data Science | numpy, pandas, scipy |
| Machine Learning | scikit-learn, shap |
| Deep Learning | torch (PyTorch) |
| Time Series | prophet |
| Visualization | matplotlib, plotly, folium, kaleido |
| Jupyter | ipykernel, ipython, jupyter_client |
⚡ Note: The full installation may take 5-10 minutes due to PyTorch and Prophet dependencies.
pip install torch shap folium kaleido

- Open Jupyter Notebook
  jupyter notebook MBU_Gap_Analyzer.ipynb
- Execute All Cells
  - Press Shift + Enter to run cells sequentially
  - Or use Cell → Run All from menu
- View Outputs
  - Interactive charts display inline
  - CSV exports saved to analysis_outputs/ folder
  - HTML charts for PDF conversion
- Switch to deep-learning branch
  git checkout deep-learning
- Install Deep Learning dependencies
  pip install torch shap folium
- Open the Deep Learning notebook
  jupyter notebook MBU_Gap_Analyzer_DeepLearning.ipynb
- Execute All Cells (includes 4 advanced AI modules):
  - Module 1: LSTM Time-Series Forecaster
  - Module 2: Autoencoder Anomaly Detector
  - Module 3: SHAP Explainable AI
  - Module 4: Folium Geospatial Map
analysis_outputs/
├── mbu_gap_analysis_by_district.csv      # Full gap analysis
├── state_wise_compliance_summary.csv     # State-level metrics
├── service_desert_districts.csv          # Priority intervention list
├── ghost_cohort_districts.csv            # Top 20 underperforming districts
├── chart1_state_compliance.html          # Interactive visualizations
├── chart2_gap_treemap.html
├── chart3_service_desert_scatter.html
├── chart4_daily_trends.html
├── chart5_forecast.html
├── chart6_financial_impact.html
└── service_desert_map.html               # Interactive Folium map (Deep Learning Edition)
uidai_datasets/
├── api_data_aadhar_enrolment/            # ~1M records
│   ├── api_data_aadhar_enrolment_0_500000.csv
│   ├── api_data_aadhar_enrolment_500000_1000000.csv
│   └── api_data_aadhar_enrolment_1000000_1006029.csv
│
├── api_data_aadhar_biometric/            # ~1.8M records
│   ├── api_data_aadhar_biometric_0_500000.csv
│   ├── api_data_aadhar_biometric_500000_1000000.csv
│   ├── api_data_aadhar_biometric_1000000_1500000.csv
│   └── api_data_aadhar_biometric_1500000_1861108.csv
│
└── api_data_aadhar_demographic/          # ~2M records
    ├── api_data_aadhar_demographic_0_500000.csv
    ├── api_data_aadhar_demographic_500000_1000000.csv
    ├── api_data_aadhar_demographic_1000000_1500000.csv
    ├── api_data_aadhar_demographic_1500000_2000000.csv
    └── api_data_aadhar_demographic_2000000_2071700.csv
| Dataset | Columns |
|---|---|
| Enrolment | date, state, district, pincode, age_0_5, age_5_17, age_18_greater |
| Biometric | date, state, district, pincode, bio_age_5_17, bio_age_17_ |
| Demographic | date, state, district, pincode, demo_age_5_17, demo_age_17_ |
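Using the columns listed above, a per-district gap table could be built roughly as follows. The README does not give the exact formula for the MBU Compliance Ratio, so this sketch assumes it is `bio_age_5_17` divided by `age_5_17`, summed per (state, district); the function names and sample rows are hypothetical. Rows are plain dictionaries of strings, as `csv.DictReader` would produce.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple

District = Tuple[str, str]  # (state, district)


def sum_by_district(rows: Iterable[Mapping[str, str]], column: str) -> Dict[District, int]:
    """Sum one numeric column per (state, district) key."""
    totals: Dict[District, int] = defaultdict(int)
    for row in rows:
        totals[(row["state"], row["district"])] += int(row[column])
    return dict(totals)


def compliance_table(enrolment_rows: Iterable[Mapping[str, str]],
                     biometric_rows: Iterable[Mapping[str, str]]
                     ) -> Dict[District, Dict[str, float]]:
    """Per district: enrolled count, updated count, ratio, and gap."""
    enrolled = sum_by_district(enrolment_rows, "age_5_17")
    updated = sum_by_district(biometric_rows, "bio_age_5_17")
    table: Dict[District, Dict[str, float]] = {}
    for key, n_enrolled in enrolled.items():
        n_updated = updated.get(key, 0)
        # ASSUMPTION: ratio = biometric updates / enrolments for ages 5-17.
        ratio = n_updated / n_enrolled if n_enrolled else 0.0
        table[key] = {
            "enrolled": n_enrolled,
            "updated": n_updated,
            "ratio": ratio,
            "gap": max(n_enrolled - n_updated, 0),
        }
    return table


# Made-up sample rows in the shape of the Enrolment and Biometric CSVs.
enrolment = [
    {"state": "S1", "district": "D1", "age_5_17": "100"},
    {"state": "S1", "district": "D1", "age_5_17": "100"},
    {"state": "S1", "district": "D2", "age_5_17": "50"},
]
biometric = [{"state": "S1", "district": "D1", "bio_age_5_17": "80"}]

table = compliance_table(enrolment, biometric)
print(table[("S1", "D1")])
```

Districts with enrolments but no matching biometric rows get a ratio of 0, which is how a "Ghost Cohort" would show up under this assumed definition.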
Based on our analysis, we recommend the following interventions:
| # | Recommendation | Impact |
|---|---|---|
| 1 | Deploy Mobile Aadhaar Seva Kendras in Service Desert districts | Reach underserved rural/tribal areas |
| 2 | Launch SMS/IVR Reminder Campaigns for children turning 5/15 | Proactive awareness |
| 3 | Integrate MBU Status Check with school admission portals | Flag inactive Aadhaar early |
| 4 | Extend Fee Waiver beyond Oct 2025 for low-compliance states | Remove financial barrier |
| 5 | Deploy Real-time Dashboard for District Collectors | Weekly progress monitoring |
Team OMEGA
UIDAI Data Hackathon 2026
| Name | Role |
|---|---|
| Piyush Verma | Team Leader |
| Stiwart Stance Saxena | Team Member |
This project is submitted as part of the UIDAI Data Hackathon 2026. All rights reserved.
Built with ❤️ for Digital India
Making Aadhaar work for every child










