A modern, AI-powered chatbot that allows users to upload PDF documents and have intelligent conversations about their content using Retrieval-Augmented Generation (RAG).
- PDF Document Upload: Upload and process PDF documents for analysis
- Intelligent Chat: Ask questions and get contextually relevant answers from your documents
- User Authentication: Secure authentication powered by Clerk
- Modern UI: Beautiful, responsive interface built with Radix UI components
- Real-time Streaming: Stream AI responses in real-time for better user experience
- Persistent Storage: Store your documents and chat history in a Neon PostgreSQL database
- Next.js 15.5.4 - React framework with App Router
- React 19 - UI library
- TypeScript - Type safety
- Groq AI (@ai-sdk/groq) - Fast LLM inference
- HuggingFace Inference - Text embeddings for document vectorization
- Vercel AI SDK - AI/ML utilities and streaming
- LangChain Text Splitters - Document chunking and processing
- Clerk - Complete authentication solution
- Neon Database - Serverless PostgreSQL
- Drizzle ORM - Type-safe database toolkit
- Radix UI - Unstyled, accessible component primitives
- Tailwind CSS - Utility-first CSS framework
- Lucide React - Beautiful icon set
- Sonner - Toast notifications
- React Hook Form - Form management with Zod validation
- pdf-parse - PDF text extraction
- React Flow (@xyflow/react) - Node-based visualization (optional feature)
- Recharts - Data visualization
- cmdk - Command palette interface
- Node.js 18+
- npm, yarn, pnpm, or bun
- A Clerk account (for authentication)
- A Groq API key (for LLM inference)
- A HuggingFace API token (for embeddings)
- A Neon database account
Clone the repository:

```bash
git clone https://github.com/rooneyrulz/rag-chatbot.git
cd rag-chatbot
```

Install dependencies:

```bash
npm install
# or
yarn install
# or
pnpm install
# or
bun install
```

Create a `.env.local` file in the root directory:
```env
# Clerk Authentication
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=your_clerk_publishable_key
CLERK_SECRET_KEY=your_clerk_secret_key

# Clerk URLs (optional - defaults work for most cases)
NEXT_PUBLIC_CLERK_SIGN_IN_URL=/sign-in
NEXT_PUBLIC_CLERK_SIGN_UP_URL=/sign-up
NEXT_PUBLIC_CLERK_AFTER_SIGN_IN_URL=/
NEXT_PUBLIC_CLERK_AFTER_SIGN_UP_URL=/

# Groq AI
GROQ_API_KEY=your_groq_api_key

# HuggingFace
HUGGINGFACE_API_KEY=your_huggingface_token

# Neon Database
DATABASE_URL=your_neon_database_url
```

Set up the database schema:

```bash
# Generate database migrations
npx drizzle-kit generate

# Run migrations
npx drizzle-kit push
```

Start the development server:

```bash
npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev
```

Open http://localhost:3000 in your browser.
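Because the app depends on several external services, a missing variable usually surfaces only as a confusing runtime error. A minimal sketch of a fail-fast startup check for the variables above (a hypothetical helper, not part of the repository):

```typescript
// Hypothetical startup check: verify the required environment variables
// are present before the app boots. Not part of the repository.
const REQUIRED_ENV_VARS = [
  "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY",
  "CLERK_SECRET_KEY",
  "GROQ_API_KEY",
  "HUGGINGFACE_API_KEY",
  "DATABASE_URL",
] as const;

function missingEnvVars(env: Record<string, string | undefined>): string[] {
  // Collect every required key that is absent or empty.
  return REQUIRED_ENV_VARS.filter((key) => !env[key]);
}

// Usage (e.g. at the top of a server entry point):
// const missing = missingEnvVars(process.env);
// if (missing.length > 0) {
//   throw new Error(`Missing environment variables: ${missing.join(", ")}`);
// }
```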
```
rag-chatbot/
├── app/                # Next.js app directory
│   ├── (auth)/         # Authentication routes
│   ├── (chat)/         # Chat interface routes
│   ├── api/            # API routes
│   └── layout.tsx      # Root layout
├── components/         # React components
│   ├── ui/             # Reusable UI components
│   └── ...             # Feature-specific components
├── lib/                # Utility functions and configurations
│   ├── db/             # Database schema and queries
│   ├── ai/             # AI/RAG logic
│   └── utils.ts        # Helper functions
├── public/             # Static assets
└── ...config files
```
- Document Upload: Users upload PDF documents through the interface
- Text Extraction: The system extracts text content from PDFs using pdf-parse
- Chunking: Documents are split into manageable chunks using LangChain text splitters
- Embedding: Text chunks are converted to vector embeddings using HuggingFace models
- Storage: Embeddings and metadata are stored in the Neon PostgreSQL database
- Query Processing: When a user asks a question:
  - The question is converted to an embedding
  - Similar document chunks are retrieved using vector similarity search
  - Relevant context is passed to the Groq LLM
  - The LLM generates a contextual response
- Streaming Response: Answers are streamed back to the user in real-time
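The retrieval step above comes down to comparing the question's embedding against each stored chunk embedding and keeping the closest matches. A minimal, self-contained sketch of cosine-similarity top-k retrieval (illustrative only: the app stores vectors in Postgres, and the function names here are hypothetical):

```typescript
// Illustrative top-k retrieval via cosine similarity. In the real app,
// embeddings live in Neon PostgreSQL; this sketch keeps them in memory
// to show the core math.
interface Chunk {
  text: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function retrieveTopK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  // Score every chunk, then keep the k most similar.
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(({ chunk }) => chunk);
}
```

The chunks returned here are what gets concatenated into the prompt context for the LLM.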
- `npm run dev` - Start the development server with Turbopack
- `npm run build` - Build for production with Turbopack
- `npm start` - Start the production server
- `npm run lint` - Run the Biome linter
- `npm run format` - Format code with Biome
Use Drizzle Kit to manage your database schema:
```bash
# Generate new migration
npx drizzle-kit generate

# Push changes to database
npx drizzle-kit push

# Open Drizzle Studio (database GUI)
npx drizzle-kit studio
```

Edit the AI configuration in your code to use different models:
- LLM Models: Configure Groq model selection (e.g., llama, mixtral)
- Embedding Models: Change HuggingFace embedding model
- Chunking Strategy: Adjust text splitter parameters for optimal performance
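To see what chunk-size and chunk-overlap parameters mean in practice, here is a stripped-down fixed-width chunker (an illustration of the parameters only; the app uses LangChain's text splitters, which prefer paragraph and sentence boundaries over raw character offsets):

```typescript
// Naive fixed-width chunker illustrating chunkSize / chunkOverlap.
// LangChain's RecursiveCharacterTextSplitter is smarter: it splits on
// separators (paragraphs, sentences) rather than raw offsets.
function chunkText(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap; // how far the window advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached
  }
  return chunks;
}
```

Larger chunks give the LLM more context per match but make retrieval coarser; overlap keeps sentences that straddle a boundary from being lost.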
The easiest way to deploy is using Vercel:
- Push your code to GitHub
- Import your repository in Vercel
- Add environment variables in Vercel dashboard
- Deploy
This Next.js app can be deployed to any platform that supports Node.js:
- Netlify
- Railway
- AWS
- DigitalOcean
- Self-hosted
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Vercel AI SDK for AI utilities
- Groq for fast LLM inference
- Clerk for authentication
- Neon for serverless PostgreSQL
- Radix UI for accessible components
- shadcn/ui for UI component inspiration
For questions or feedback, please open an issue on GitHub.
Made with ❤️ using Next.js and AI