AI-powered document management and intelligent search system
FileAI is a powerful open-source document management system that uses RAG (Retrieval Augmented Generation) to provide intelligent search and Q&A capabilities across your documents. Upload PDFs, Word documents, and text files, then ask questions in natural language to get AI-powered answers with source citations.
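Under the hood, the retrieval half of RAG boils down to nearest-neighbour search over embeddings: the question is embedded, compared against stored chunk embeddings, and the best-matching chunks are handed to the LLM together with the question. The toy sketch below is illustrative only (not FileAI's actual code) and uses plain cosine similarity; in FileAI this lookup is handled by Qdrant:

```typescript
// Illustrative sketch of the retrieval step in RAG (not FileAI's actual code).
type Chunk = { text: string; source: string; embedding: number[] };

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the k chunks most similar to the query embedding; their `source`
// fields are what enable answer citations.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}
```

A production system delegates this to a vector database, which indexes embeddings so the search stays fast at millions of chunks.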
| Feature | Description |
|---|---|
| Multi-Format Support | Process PDF, DOCX, DOC, XML, and TXT files with automatic text extraction |
| Semantic Search | Find documents by meaning using vector embeddings and Qdrant |
| AI-Powered Q&A | Get intelligent answers using RAG with Ollama or OpenAI |
| Flexible Storage | Store files locally, on S3, or self-hosted MinIO |
| Secure by Default | JWT authentication and role-based access control |
| Modern UI | Beautiful interface built with Next.js 14 and shadcn/ui |
| Easy Setup | One-time setup wizard for quick configuration |
| Extensible | Plugin architecture for custom file processors and storage |
- Enterprise Knowledge Base - Centralize company documents and enable instant search
- Research Assistant - Upload papers and ask questions across your library
- Legal Document Analysis - Search through contracts and legal documents
- Personal Document Manager - Organize and search your personal files with AI
| Layer | Technologies |
|---|---|
| Frontend | Next.js 14, React 19, TailwindCSS, shadcn/ui |
| Backend | Node.js, Express, tRPC |
| AI/ML | LangChain.js, Ollama, OpenAI |
| Database | MongoDB (metadata), Qdrant (vectors) |
| Storage | S3 / MinIO / Local filesystem |
| DevOps | Docker, Docker Compose, Turborepo |
```bash
# Clone the repository
git clone https://github.com/hashcott/fileai.git
cd fileai

# Run setup script
npm run setup

# Start services (MongoDB, Qdrant, MinIO)
npm run services:start

# Start development server
npm run dev
```

Open http://localhost:3000 and complete the setup wizard.
```bash
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down
```

Use the pre-built Docker images from GitHub Container Registry:
```bash
# Set required environment variables
export GITHUB_OWNER=hashcott
export FILEAI_VERSION=v1.0.0
export JWT_SECRET=your-super-secret-jwt-key

# Deploy with production compose file
docker-compose -f docker-compose.prod.yml up -d
```

Available images:

- `ghcr.io/hashcott/fileai/server:latest` - Backend API server
- `ghcr.io/hashcott/fileai/web:latest` - Frontend web application

Images are built for both linux/amd64 and linux/arm64 platforms.
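For orientation, a production override file along these lines would tie the exported variables to the images listed above. This is a sketch only; the service names and port mappings are assumptions, so refer to the repository's actual `docker-compose.prod.yml`:

```yaml
# Hypothetical sketch of a production compose file (verify against the repo).
services:
  server:
    image: ghcr.io/${GITHUB_OWNER}/fileai/server:${FILEAI_VERSION}
    environment:
      - JWT_SECRET=${JWT_SECRET}
    ports:
      - "3001:3001"
  web:
    image: ghcr.io/${GITHUB_OWNER}/fileai/web:${FILEAI_VERSION}
    ports:
      - "3000:3000"
    depends_on:
      - server
```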
```
fileai/
├── apps/
│   ├── web/              # Next.js frontend
│   │   ├── app/          # App router pages
│   │   ├── components/   # React components
│   │   └── lib/          # Utilities and stores
│   └── server/           # Node.js backend
│       ├── routers/      # tRPC routers
│       ├── services/     # Business logic
│       └── db/           # Database models
├── packages/
│   ├── shared/           # Shared types and schemas
│   └── config/           # Shared configurations
├── docker-compose.yml    # Docker services
├── turbo.json            # Turborepo config
└── package.json          # Root package
```
Backend (apps/server/.env):
```env
# Database
MONGODB_URI=mongodb://localhost:27017/fileai
JWT_SECRET=your-super-secret-key
QDRANT_URL=http://localhost:6333

# LLM Provider
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3
OLLAMA_EMBEDDING_MODEL=nomic-embed-text

# Or use OpenAI
# LLM_PROVIDER=openai
# OPENAI_API_KEY=sk-...
# OPENAI_MODEL=gpt-4-turbo-preview
```

Frontend (apps/web/.env.local):

```env
NEXT_PUBLIC_API_URL=http://localhost:3001/trpc
```

| Option | Description | Best For |
|---|---|---|
| `local` | Local filesystem | Development, small deployments |
| `s3` | Amazon S3 | Production, cloud deployments |
| `minio` | Self-hosted S3 | Self-hosted, data sovereignty |
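As a concrete illustration of the `local` option, here is a minimal filesystem adapter in the shape of the `StorageAdapter` interface from the Extending FileAI section. The interface is redeclared here so the sketch is self-contained; the real one lives in `@fileai/shared`, and this is not FileAI's actual implementation:

```typescript
// Illustrative local-filesystem adapter (interface redeclared for the sketch).
import { promises as fs } from 'node:fs';
import * as path from 'node:path';

interface StorageAdapter {
  upload(file: Buffer, filePath: string): Promise<string>;
  download(filePath: string): Promise<Buffer>;
  delete(filePath: string): Promise<void>;
}

class LocalStorageAdapter implements StorageAdapter {
  constructor(private root: string) {}

  // Resolve a stored path against the adapter's root directory.
  private resolve(filePath: string): string {
    return path.join(this.root, filePath);
  }

  async upload(file: Buffer, filePath: string): Promise<string> {
    const full = this.resolve(filePath);
    await fs.mkdir(path.dirname(full), { recursive: true });
    await fs.writeFile(full, file);
    return full;
  }

  async download(filePath: string): Promise<Buffer> {
    return fs.readFile(this.resolve(filePath));
  }

  async delete(filePath: string): Promise<void> {
    await fs.unlink(this.resolve(filePath));
  }
}
```

The `s3` and `minio` options implement the same interface against an S3-compatible API, which is why callers never need to know where files physically live.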
| Provider | Models | Notes |
|---|---|---|
| Ollama | llama3, mistral, qwen, etc. | Free, runs locally |
| OpenAI | gpt-4o, gpt-4-turbo, gpt-3.5-turbo | Paid, cloud-based |
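Switching providers is just a matter of the env vars shown earlier. Below is a hypothetical sketch of how such a config might be resolved; the function name, types, and fallback defaults are assumptions for illustration, not FileAI's actual code:

```typescript
// Hypothetical resolver mapping env vars to a typed LLM configuration.
type LLMConfig =
  | { provider: 'ollama'; baseUrl: string; model: string }
  | { provider: 'openai'; apiKey: string; model: string };

function loadLLMConfig(env: Record<string, string | undefined>): LLMConfig {
  if (env.LLM_PROVIDER === 'openai') {
    // OpenAI requires an API key; fail fast if it is missing.
    if (!env.OPENAI_API_KEY) throw new Error('OPENAI_API_KEY is required');
    return {
      provider: 'openai',
      apiKey: env.OPENAI_API_KEY,
      model: env.OPENAI_MODEL ?? 'gpt-4o',
    };
  }
  // Default to Ollama, which runs locally and needs no key.
  return {
    provider: 'ollama',
    baseUrl: env.OLLAMA_BASE_URL ?? 'http://localhost:11434',
    model: env.OLLAMA_MODEL ?? 'llama3',
  };
}
```

Modeling the config as a discriminated union keeps provider-specific fields (API key vs. base URL) from leaking into the wrong branch at compile time.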
```bash
# Install dependencies
npm install

# Run in development mode
npm run dev

# Build for production
npm run build

# Run linting
npm run lint

# Type checking
npm run type-check

# Clean build artifacts
npm run clean
```

This project uses GitHub Actions for continuous integration and deployment.

| Workflow | Trigger | Description |
|---|---|---|
| CI | Push/PR to `main`, `develop` | Runs linting, type-check, build, and Docker build tests |
| Docker Build & Release | Push tag `v*.*.*` or manual | Builds and pushes Docker images to ghcr.io |
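The triggers in the workflow table correspond to a GitHub Actions `on:` block roughly like this (a sketch; see the actual workflow files under `.github/workflows/` for the authoritative version):

```yaml
# Sketch of the release workflow's triggers (verify against the repo).
on:
  push:
    tags:
      - 'v*.*.*'        # any semantic version tag starts a release build
  workflow_dispatch:     # manual trigger from the Actions tab
    inputs:
      tag:
        description: 'Image tag'
        default: 'latest'
```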
```bash
# Create and push a new version tag
git tag v1.0.0
git push origin v1.0.0
```

This will automatically:

- Build Docker images for `server` and `web`
- Push images to GitHub Container Registry (ghcr.io)
- Create a GitHub Release with release notes

You can also trigger a Docker build manually:

1. Go to Actions → Build and Release Docker Images
2. Click Run workflow
3. Enter a custom tag (default: `latest`)
4. Click Run workflow
```typescript
// apps/server/src/services/processors/my-processor.ts
import { FileProcessor, ProcessedDocument } from '@fileai/shared';

export class MyProcessor implements FileProcessor {
  supportedTypes = ['application/x-myformat'];

  async process(file: Buffer, filename: string): Promise<ProcessedDocument> {
    const text = extractTextFromFile(file); // your text-extraction logic
    return {
      text,
      metadata: { filename, format: 'myformat' },
    };
  }
}
```

```typescript
// apps/server/src/services/storage/my-storage.ts
import { StorageAdapter } from '@fileai/shared';

export class MyStorageAdapter implements StorageAdapter {
  async upload(file: Buffer, path: string): Promise<string> {
    // Implementation
  }

  async download(path: string): Promise<Buffer> {
    // Implementation
  }

  async delete(path: string): Promise<void> {
    // Implementation
  }
}
```

We love contributions! Please read our Contributing Guide before submitting a Pull Request.
- Report Bugs - Open an issue with detailed reproduction steps
- Suggest Features - Share your ideas in Discussions
- Improve Docs - Help us make documentation better
- Submit PRs - Fix bugs or add new features
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/amazing-feature`
3. Make your changes
4. Run tests: `npm run test`
5. Commit: `git commit -m 'feat: add amazing feature'`
6. Push: `git push origin feature/amazing-feature`
7. Open a Pull Request
- Multi-format document support (PDF, DOCX, XML, TXT)
- Vector search with Qdrant
- Ollama and OpenAI integration
- Role-based access control
- OCR support with Tesseract
- Multi-language support
- Document collaboration features
- API rate limiting
- Webhook notifications
- Mobile app
See the open issues for a full list of proposed features.
- Setup Guide - Detailed installation instructions
- Project Summary - Architecture overview
- API Reference - API documentation
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
If you use FileAI, you must:
- Keep the copyright notice - Retain all copyright, patent, trademark, and attribution notices
- Include the NOTICE file - Distribute a copy of the NOTICE file
- State changes - If you modify the code, clearly indicate your modifications
- Credit FileAI - Include attribution in your application (e.g., "Powered by FileAI")
Copyright 2026 FileAI Contributors
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
- LangChain.js - AI/LLM framework
- Qdrant - Vector database
- Ollama - Local LLM runner
- shadcn/ui - UI components
- Next.js - React framework
- tRPC - Type-safe APIs
- GitHub Discussions - Ask questions, share ideas
- Discord Server - Chat with the community
Made with love by the FileAI community
Star us on GitHub if you find this project useful!
