Goal: Get the NVIDIA RAG Blueprint deployed and tariff PDFs ingested in ~30 minutes.
-
EKS Cluster Running
kubectl cluster-info
If not, deploy it first:
cd infrastructure/terraform ./install.sh -
NGC API Key
- Get from: https://org.ngc.nvidia.com/setup/api-key
export NGC_API_KEY="nvapi-..."
-
kubectl Configured
aws eks update-kubeconfig --region us-west-2 --name aiq-udf-eks
cd infrastructure/helm
./deploy-rag-blueprint.shWhat's deploying:
- ✅ Milvus vector database
- ✅ RAG ingest server (PDF processing)
- ✅ RAG query server (search & retrieval)
Wait for it to finish, then verify:
./verify-rag-deployment.shExpected output:
✅ Milvus - 1/1 pods running
✅ RAG Query Server - 2/2 pods running
✅ RAG Ingest Server - 1/1 pods running
cd ../../scripts
./setup_tariff_rag_enterprise.shWhat's happening:
- Port-forwards to RAG ingest service
- Creates
us_tariffscollection - Uploads 99 tariff PDF chapters
- Runs test queries
Expected output:
✅ Success: 99
📦 Total: 99
-
Get frontend URL:
kubectl get svc -n aiq-agent aiq-agent-frontend -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' -
Open in browser
-
Enter these test queries:
- "What is the tariff for replacement batteries for a Raritan remote management card?"
- "What's the tariff of Reese's Pieces?"
- "Tariff of a replacement Roomba vacuum motherboard, used"
-
Set collection name:
us_tariffs
kubectl get pods -n rag-blueprint
kubectl describe pod <pod-name> -n rag-blueprint# Check port-forward
curl http://localhost:8082/health
# Check logs
kubectl logs -n rag-blueprint -l app=rag-ingest-server -f# Verify collection was created
kubectl logs -n rag-blueprint -l app=rag-ingest-server | grep "us_tariffs"
# Re-run ingestion
cd scripts
./setup_tariff_rag_enterprise.shUser Query
↓
AI-Q Agent Backend
↓
RAG Query Server (8081) ← Milvus Vector Store
↓ ↑
Embedding NIM (8000) |
|
RAG Ingest Server (8082)
↑
Tariff PDFs (99)
✅ Enterprise Vector Store: Milvus (production-grade)
✅ Hybrid Search: Vector + keyword (BM25) for tariff codes
✅ GPU-Accelerated: PDF processing with NVIDIA NIMs
✅ Scalable: Auto-scales with Karpenter
✅ Citation Support: Returns source documents with answers
- Add more document collections (regulations, trade agreements)
- Integrate RAG into UDR dynamic strategies
- Scale query server for production traffic
- Set up monitoring and alerts
For detailed configuration, troubleshooting, and operations:
- NVIDIA_RAG_BLUEPRINT_DEPLOYMENT.md - Complete enterprise deployment guide
- README.md - Main project documentation
- DEPLOYMENT.md - Full infrastructure deployment
Deployed in ~30 minutes! Now you have enterprise-grade RAG powered by NVIDIA blueprints. 🚀