A modern, AI-powered chatbot that allows users to upload PDF documents and have intelligent conversations about their content using Retrieval-Augmented Generation (RAG).
- PDF Document Upload: Upload and process PDF documents for analysis
- Intelligent Chat: Ask questions and get contextually relevant answers from your documents
- User Authentication: Secure authentication powered by Clerk
- Modern UI: Beautiful, responsive interface built with Radix UI components
- Real-time Streaming: Stream AI responses in real-time for better user experience
- Persistent Storage: Store your documents and chat history in a Neon PostgreSQL database
- Next.js 15.5.4 - React framework with App Router
- React 19 - UI library
- TypeScript - Type safety
- Groq AI (@ai-sdk/groq) - Fast LLM inference
- HuggingFace Inference - Text embeddings for document vectorization
- Vercel AI SDK - AI/ML utilities and streaming
- LangChain Text Splitters - Document chunking and processing
- Clerk - Complete authentication solution
- Neon Database - Serverless PostgreSQL
- Drizzle ORM - Type-safe database toolkit
- Radix UI - Unstyled, accessible component primitives
- Tailwind CSS - Utility-first CSS framework
- Lucide React - Beautiful icon set
- Sonner - Toast notifications
- React Hook Form - Form management with Zod validation
- pdf-parse - PDF text extraction
- React Flow (@xyflow/react) - Node-based visualization (optional feature)
- Recharts - Data visualization
- cmdk - Command palette interface
- Node.js 18+
- npm, yarn, pnpm, or bun
- A Clerk account (for authentication)
- A Groq API key (for LLM inference)
- A HuggingFace API token (for embeddings)
- A Neon database account
Clone the repository:

```bash
git clone https://github.com/rooneyrulz/rag-chatbot.git
cd rag-chatbot
```

Install dependencies:

```bash
npm install
# or
yarn install
# or
pnpm install
# or
bun install
```

Create a `.env.local` file in the root directory:
```env
# Clerk Authentication
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=your_clerk_publishable_key
CLERK_SECRET_KEY=your_clerk_secret_key

# Clerk URLs (optional - defaults work for most cases)
NEXT_PUBLIC_CLERK_SIGN_IN_URL=/sign-in
NEXT_PUBLIC_CLERK_SIGN_UP_URL=/sign-up
NEXT_PUBLIC_CLERK_AFTER_SIGN_IN_URL=/
NEXT_PUBLIC_CLERK_AFTER_SIGN_UP_URL=/

# Groq AI
GROQ_API_KEY=your_groq_api_key

# HuggingFace
HUGGINGFACE_API_KEY=your_huggingface_token

# Neon Database
DATABASE_URL=your_neon_database_url
```

Set up the database schema:

```bash
# Generate database migrations
npx drizzle-kit generate

# Run migrations
npx drizzle-kit push
```

Start the development server:

```bash
npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev
```

Open http://localhost:3000 in your browser.
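Because the app depends on several external services, a missing variable usually surfaces only as a confusing runtime error. A minimal sketch of a fail-fast startup check for the variables above (a hypothetical helper, not part of the repository):

```typescript
// Hypothetical startup check: verify the required environment variables
// are present before the app boots. Not part of the repository.
const REQUIRED_ENV_VARS = [
  "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY",
  "CLERK_SECRET_KEY",
  "GROQ_API_KEY",
  "HUGGINGFACE_API_KEY",
  "DATABASE_URL",
] as const;

function missingEnvVars(env: Record<string, string | undefined>): string[] {
  // Collect every required key that is absent or empty.
  return REQUIRED_ENV_VARS.filter((key) => !env[key]);
}

// Usage (e.g. at the top of a server entry point):
// const missing = missingEnvVars(process.env);
// if (missing.length > 0) {
//   throw new Error(`Missing environment variables: ${missing.join(", ")}`);
// }
```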
```
rag-chatbot/
├── app/                # Next.js app directory
│   ├── (auth)/         # Authentication routes
│   ├── (chat)/         # Chat interface routes
│   ├── api/            # API routes
│   └── layout.tsx      # Root layout
├── components/         # React components
│   ├── ui/             # Reusable UI components
│   └── ...             # Feature-specific components
├── lib/                # Utility functions and configurations
│   ├── db/             # Database schema and queries
│   ├── ai/             # AI/RAG logic
│   └── utils.ts        # Helper functions
├── public/             # Static assets
└── ...config files
```
- Document Upload: Users upload PDF documents through the interface
- Text Extraction: The system extracts text content from PDFs using pdf-parse
- Chunking: Documents are split into manageable chunks using LangChain text splitters
- Embedding: Text chunks are converted to vector embeddings using HuggingFace models
- Storage: Embeddings and metadata are stored in the Neon PostgreSQL database
- Query Processing: When a user asks a question:
  - The question is converted to an embedding
  - Similar document chunks are retrieved using vector similarity search
  - Relevant context is passed to the Groq LLM
  - The LLM generates a contextual response
- Streaming Response: Answers are streamed back to the user in real-time
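The retrieval step above comes down to comparing the question's embedding against each stored chunk embedding and keeping the closest matches. A minimal, self-contained sketch of cosine-similarity top-k retrieval (illustrative only: the app stores vectors in Postgres, and the function names here are hypothetical):

```typescript
// Illustrative top-k retrieval via cosine similarity. In the real app,
// embeddings live in Neon PostgreSQL; this sketch keeps them in memory
// to show the core math.
interface Chunk {
  text: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function retrieveTopK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  // Score every chunk, then keep the k most similar.
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(({ chunk }) => chunk);
}
```

The chunks returned here are what gets concatenated into the prompt context for the LLM.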
- `npm run dev` - Start the development server with Turbopack
- `npm run build` - Build for production with Turbopack
- `npm start` - Start the production server
- `npm run lint` - Run the Biome linter
- `npm run format` - Format code with Biome
Use Drizzle Kit to manage your database schema:
```bash
# Generate new migration
npx drizzle-kit generate

# Push changes to database
npx drizzle-kit push

# Open Drizzle Studio (database GUI)
npx drizzle-kit studio
```

Edit the AI configuration in your code to use different models:
- LLM Models: Configure Groq model selection (e.g., llama, mixtral)
- Embedding Models: Change HuggingFace embedding model
- Chunking Strategy: Adjust text splitter parameters for optimal performance
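To see what chunk-size and chunk-overlap parameters mean in practice, here is a stripped-down fixed-width chunker (an illustration of the parameters only; the app uses LangChain's text splitters, which prefer paragraph and sentence boundaries over raw character offsets):

```typescript
// Naive fixed-width chunker illustrating chunkSize / chunkOverlap.
// LangChain's RecursiveCharacterTextSplitter is smarter: it splits on
// separators (paragraphs, sentences) rather than raw offsets.
function chunkText(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap; // how far the window advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached
  }
  return chunks;
}
```

Larger chunks give the LLM more context per match but make retrieval coarser; overlap keeps sentences that straddle a boundary from being lost.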
The easiest way to deploy is using Vercel:
- Push your code to GitHub
- Import your repository in Vercel
- Add environment variables in Vercel dashboard
- Deploy
This Next.js app can be deployed to any platform that supports Node.js:
- Netlify
- Railway
- AWS
- DigitalOcean
- Self-hosted
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Vercel AI SDK for AI utilities
- Groq for fast LLM inference
- Clerk for authentication
- Neon for serverless PostgreSQL
- Radix UI for accessible components
- shadcn/ui for UI component inspiration
For questions or feedback, please open an issue on GitHub.
Made with ❤️ using Next.js and AI