RAG Chatbot

A modern, AI-powered chatbot that allows users to upload PDF documents and have intelligent conversations about their content using Retrieval-Augmented Generation (RAG).

🌟 Features

  • PDF Document Upload: Upload and process PDF documents for analysis
  • Intelligent Chat: Ask questions and get contextually relevant answers from your documents
  • User Authentication: Secure authentication powered by Clerk
  • Modern UI: Beautiful, responsive interface built with Radix UI components
  • Real-time Streaming: Stream AI responses in real-time for better user experience
  • Persistent Storage: Store your documents and chat history in Neon PostgreSQL database

πŸ› οΈ Tech Stack

Core Framework

  • Next.js 15.5.4 - React framework with App Router
  • React 19 - UI library
  • TypeScript - Type safety

AI & RAG

  • Groq AI (@ai-sdk/groq) - Fast LLM inference
  • HuggingFace Inference - Text embeddings for document vectorization
  • Vercel AI SDK - AI/ML utilities and streaming
  • LangChain Text Splitters - Document chunking and processing

Authentication

  • Clerk - Complete authentication solution

Database & ORM

  • Neon Database - Serverless PostgreSQL
  • Drizzle ORM - Type-safe database toolkit

UI Components

  • Radix UI - Unstyled, accessible component primitives
  • Tailwind CSS - Utility-first CSS framework
  • Lucide React - Beautiful icon set
  • Sonner - Toast notifications
  • React Hook Form - Form management with Zod validation

Additional Libraries

  • pdf-parse - PDF text extraction
  • React Flow (@xyflow/react) - Node-based visualization (optional feature)
  • Recharts - Data visualization
  • cmdk - Command palette interface

πŸ“‹ Prerequisites

  • Node.js 18+
  • npm, yarn, pnpm, or bun
  • A Clerk account (for authentication)
  • A Groq API key (for LLM inference)
  • A HuggingFace API token (for embeddings)
  • A Neon database account

πŸš€ Getting Started

1. Clone the repository

```bash
git clone https://github.com/rooneyrulz/rag-chatbot.git
cd rag-chatbot
```

2. Install dependencies

```bash
npm install
# or
yarn install
# or
pnpm install
# or
bun install
```

3. Set up environment variables

Create a .env.local file in the root directory:

```env
# Clerk Authentication
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=your_clerk_publishable_key
CLERK_SECRET_KEY=your_clerk_secret_key

# Clerk URLs (optional - defaults work for most cases)
NEXT_PUBLIC_CLERK_SIGN_IN_URL=/sign-in
NEXT_PUBLIC_CLERK_SIGN_UP_URL=/sign-up
NEXT_PUBLIC_CLERK_AFTER_SIGN_IN_URL=/
NEXT_PUBLIC_CLERK_AFTER_SIGN_UP_URL=/

# Groq AI
GROQ_API_KEY=your_groq_api_key

# HuggingFace
HUGGINGFACE_API_KEY=your_huggingface_token

# Neon Database
DATABASE_URL=your_neon_database_url
```

4. Set up the database

```bash
# Generate database migrations
npx drizzle-kit generate

# Run migrations
npx drizzle-kit push
```
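Drizzle Kit reads its connection and schema settings from a `drizzle.config.ts` at the project root. A minimal sketch, assuming the schema lives at `lib/db/schema.ts` and migrations go to `./drizzle` (adjust both paths to match the repo's actual layout):

```typescript
// drizzle.config.ts — hypothetical example; schema and output paths
// are assumptions, not this repo's verified configuration.
import { defineConfig } from "drizzle-kit";

export default defineConfig({
  dialect: "postgresql",
  schema: "./lib/db/schema.ts",
  out: "./drizzle",
  dbCredentials: {
    // The Neon connection string from .env.local
    url: process.env.DATABASE_URL!,
  },
});
```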

5. Run the development server

```bash
npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev
```

Open http://localhost:3000 in your browser.

πŸ—οΈ Project Structure

```
rag-chatbot/
β”œβ”€β”€ app/                   # Next.js app directory
β”‚   β”œβ”€β”€ (auth)/            # Authentication routes
β”‚   β”œβ”€β”€ (chat)/            # Chat interface routes
β”‚   β”œβ”€β”€ api/               # API routes
β”‚   └── layout.tsx         # Root layout
β”œβ”€β”€ components/            # React components
β”‚   β”œβ”€β”€ ui/                # Reusable UI components
β”‚   └── ...                # Feature-specific components
β”œβ”€β”€ lib/                   # Utility functions and configurations
β”‚   β”œβ”€β”€ db/                # Database schema and queries
β”‚   β”œβ”€β”€ ai/                # AI/RAG logic
β”‚   └── utils.ts           # Helper functions
β”œβ”€β”€ public/                # Static assets
└── ...config files
```

πŸ“– How It Works

  1. Document Upload: Users upload PDF documents through the interface
  2. Text Extraction: The system extracts text content from PDFs using pdf-parse
  3. Chunking: Documents are split into manageable chunks using LangChain text splitters
  4. Embedding: Text chunks are converted to vector embeddings using HuggingFace models
  5. Storage: Embeddings and metadata are stored in Neon PostgreSQL database
  6. Query Processing: When a user asks a question:
    • The question is converted to an embedding
    • Similar document chunks are retrieved using vector similarity search
    • Relevant context is passed to Groq LLM
    • The LLM generates a contextual response
  7. Streaming Response: Answers are streamed back to the user in real-time
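The retrieval part of step 6 boils down to ranking stored chunk embeddings by similarity to the question embedding. A minimal plain-TypeScript sketch of cosine-similarity top-k ranking — the `Chunk` shape and `topK` helper are illustrative, not this app's actual code, and a production setup would typically push this into a vector-capable database query instead:

```typescript
// Illustrative retrieval step: rank stored chunk embeddings by
// cosine similarity to the question embedding and keep the top k.

interface Chunk {
  text: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort(
      (x, y) =>
        cosineSimilarity(query, y.embedding) -
        cosineSimilarity(query, x.embedding)
    )
    .slice(0, k);
}
```

The chunks returned by `topK` are what gets concatenated into the prompt context for the LLM.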

🎨 Available Scripts

  • npm run dev - Start development server with Turbopack
  • npm run build - Build for production with Turbopack
  • npm start - Start production server
  • npm run lint - Run Biome linter
  • npm run format - Format code with Biome

πŸ”§ Configuration

Database Schema

Use Drizzle Kit to manage your database schema:

```bash
# Generate new migration
npx drizzle-kit generate

# Push changes to database
npx drizzle-kit push

# Open Drizzle Studio (database GUI)
npx drizzle-kit studio
```

Customizing AI Models

Edit the AI configuration in your code to use different models:

  • LLM Models: Configure Groq model selection (e.g., llama, mixtral)
  • Embedding Models: Change HuggingFace embedding model
  • Chunking Strategy: Adjust text splitter parameters for optimal performance
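To see how the chunk-size and chunk-overlap parameters interact, here is a deliberately simplified fixed-offset splitter in plain TypeScript. The app itself uses LangChain's text splitters, which cut on separators (paragraphs, sentences) rather than raw character offsets, but the size/overlap trade-off is the same:

```typescript
// Simplified illustration of chunkSize / chunkOverlap semantics.
// Each chunk starts (chunkSize - chunkOverlap) characters after the
// previous one, so consecutive chunks share chunkOverlap characters.
function splitText(
  text: string,
  chunkSize: number,
  chunkOverlap: number
): string[] {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

Larger chunks preserve more context per retrieval hit; more overlap reduces the chance of splitting a relevant passage across two chunks, at the cost of storing and embedding redundant text.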

🌐 Deployment

Deploy to Vercel

The easiest way to deploy is using Vercel:

  1. Push your code to GitHub
  2. Import your repository in Vercel
  3. Add environment variables in Vercel dashboard
  4. Deploy

Other Platforms

This Next.js app can be deployed to any platform that supports Node.js:

  • Netlify
  • Railway
  • AWS
  • DigitalOcean
  • Self-hosted

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

πŸ“§ Contact

For questions or feedback, please open an issue on GitHub.

Made with ❀️ using Next.js and AI
