FileAI

AI-powered document management and intelligent search system

Overview

FileAI is an open-source document management system that uses retrieval-augmented generation (RAG) to provide intelligent search and Q&A across your documents. Upload PDFs, Word documents, and text files, then ask questions in natural language to get AI-generated answers with source citations.
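To make the retrieval step of RAG concrete, here is a minimal sketch that ranks document chunks by cosine similarity to a query embedding and builds a prompt with numbered sources. The `Chunk` shape and the function names are hypothetical and chosen for illustration; FileAI itself delegates vector search to Qdrant, and its actual pipeline may differ.

```typescript
// Hypothetical shape for illustration; FileAI's real types may differ.
interface Chunk {
  source: string;      // document the chunk came from
  text: string;        // chunk content
  embedding: number[]; // vector produced by an embedding model
}

// Cosine similarity between two vectors (assumed to have equal length).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((v, i) => {
    const w = b[i] ?? 0;
    dot += v * w;
    normA += v * v;
    normB += w * w;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding, best first.
function retrieve(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort(
      (x, y) =>
        cosineSimilarity(query, y.embedding) -
        cosineSimilarity(query, x.embedding),
    )
    .slice(0, k);
}

// Build a prompt that asks the model to cite the numbered sources.
function buildPrompt(question: string, context: Chunk[]): string {
  const sources = context
    .map((c, i) => `[${i + 1}] (${c.source}) ${c.text}`)
    .join('\n');
  return (
    'Answer using only the sources below and cite them by number.\n\n' +
    `${sources}\n\nQuestion: ${question}`
  );
}
```

In a real deployment the brute-force sort would be replaced by a vector database query, and the prompt would be sent to the configured LLM provider.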

Features

Feature	Description
Multi-Format Support	Process PDF, DOCX, DOC, XML, and TXT files with automatic text extraction
Semantic Search	Find documents by meaning using vector embeddings and Qdrant
AI-Powered Q&A	Get intelligent answers using RAG with Ollama or OpenAI
Flexible Storage	Store files locally, on S3, or self-hosted MinIO
Secure by Default	JWT authentication and role-based access control
Modern UI	Beautiful interface built with Next.js 14 and shadcn/ui
Easy Setup	One-time setup wizard for quick configuration
Extensible	Plugin architecture for custom file processors and storage

Use Cases

Enterprise Knowledge Base - Centralize company documents and enable instant search
Research Assistant - Upload papers and ask questions across your library
Legal Document Analysis - Search through contracts and legal documents
Personal Document Manager - Organize and search your personal files with AI

Tech Stack

Layer	Technologies
Frontend	Next.js 14, React 19, TailwindCSS, shadcn/ui
Backend	Node.js, Express, tRPC
AI/ML	LangChain.js, Ollama, OpenAI
Database	MongoDB (metadata), Qdrant (vectors)
Storage	S3 / MinIO / Local filesystem
DevOps	Docker, Docker Compose, Turborepo

Quick Start

Prerequisites

Node.js >= 22.0.0
Docker and Docker Compose
Ollama (optional, for local AI)

Installation

# Clone the repository
git clone https://github.com/hashcott/fileai.git
cd fileai

# Run setup script
npm run setup

# Start services (MongoDB, Qdrant, MinIO)
npm run services:start

# Start development server
npm run dev

Open http://localhost:3000 and complete the setup wizard.

Docker Deployment

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down

Production Deployment with Pre-built Images

Use the pre-built Docker images from GitHub Container Registry:

# Set required environment variables
export GITHUB_OWNER=hashcott
export FILEAI_VERSION=v1.0.0
export JWT_SECRET=your-super-secret-jwt-key

# Deploy with production compose file
docker-compose -f docker-compose.prod.yml up -d

Available images:

ghcr.io/hashcott/fileai/server:latest - Backend API server
ghcr.io/hashcott/fileai/web:latest - Frontend web application

Images are built for both linux/amd64 and linux/arm64 platforms.

Project Structure

fileai/
├── apps/
│   ├── web/                    # Next.js frontend
│   │   ├── app/               # App router pages
│   │   ├── components/        # React components
│   │   └── lib/               # Utilities and stores
│   └── server/                # Node.js backend
│       ├── routers/           # tRPC routers
│       ├── services/          # Business logic
│       └── db/                # Database models
├── packages/
│   ├── shared/                # Shared types and schemas
│   └── config/                # Shared configurations
├── docker-compose.yml         # Docker services
├── turbo.json                 # Turborepo config
└── package.json               # Root package

Configuration

Environment Variables

Backend (apps/server/.env):

# Database
MONGODB_URI=mongodb://localhost:27017/fileai
JWT_SECRET=your-super-secret-key
QDRANT_URL=http://localhost:6333

# LLM Provider
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3
OLLAMA_EMBEDDING_MODEL=nomic-embed-text

# Or use OpenAI
# LLM_PROVIDER=openai
# OPENAI_API_KEY=sk-...
# OPENAI_MODEL=gpt-4-turbo-preview

Frontend (apps/web/.env.local):

NEXT_PUBLIC_API_URL=http://localhost:3001/trpc
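
One way to catch a misconfigured backend early is to validate these variables at startup. The sketch below is an assumption about how that could look, not FileAI's actual loader: only the variable names come from the examples above, while `ServerConfig`, `requireVar`, and `loadConfig` are hypothetical names. It takes the environment as a plain object so it has no dependency on Node typings.

```typescript
// Hypothetical config types; only the variable names come from the .env examples.
type Env = Record<string, string | undefined>;

type LlmConfig =
  | { provider: 'ollama'; baseUrl: string; model: string; embeddingModel: string }
  | { provider: 'openai'; apiKey: string; model: string };

interface ServerConfig {
  mongodbUri: string;
  jwtSecret: string;
  qdrantUrl: string;
  llm: LlmConfig;
}

// Read a variable and fail fast with a clear message when it is missing or empty.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Build the LLM section according to LLM_PROVIDER.
function loadLlmConfig(env: Env): LlmConfig {
  const provider = requireVar(env, 'LLM_PROVIDER');
  switch (provider) {
    case 'ollama':
      return {
        provider: 'ollama',
        baseUrl: requireVar(env, 'OLLAMA_BASE_URL'),
        model: requireVar(env, 'OLLAMA_MODEL'),
        embeddingModel: requireVar(env, 'OLLAMA_EMBEDDING_MODEL'),
      };
    case 'openai':
      return {
        provider: 'openai',
        apiKey: requireVar(env, 'OPENAI_API_KEY'),
        model: requireVar(env, 'OPENAI_MODEL'),
      };
    default:
      throw new Error(`Unsupported LLM_PROVIDER: ${provider}`);
  }
}

// Validate the whole backend configuration in one place.
function loadConfig(env: Env): ServerConfig {
  return {
    mongodbUri: requireVar(env, 'MONGODB_URI'),
    jwtSecret: requireVar(env, 'JWT_SECRET'),
    qdrantUrl: requireVar(env, 'QDRANT_URL'),
    llm: loadLlmConfig(env),
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)` before the server starts listening.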

Storage Options

Option	Description	Best For
`local`	Local filesystem	Development, small deployments
`s3`	Amazon S3	Production, cloud deployments
`minio`	Self-hosted S3	Self-hosted, data sovereignty

LLM Options

Provider	Models	Notes
Ollama	llama3, mistral, qwen, etc.	Free, runs locally
OpenAI	gpt-4o, gpt-4-turbo, gpt-3.5-turbo	Paid, cloud-based

Development

# Install dependencies
npm install

# Run in development mode
npm run dev

# Build for production
npm run build

# Run linting
npm run lint

# Type checking
npm run type-check

# Clean build artifacts
npm run clean

CI/CD

This project uses GitHub Actions for continuous integration and deployment.

Workflows

Workflow	Trigger	Description
CI	Push/PR to `main`, `develop`	Runs linting, type-check, build, and Docker build tests
Docker Build & Release	Push tag `v*.*.*` or manual	Builds and pushes Docker images to ghcr.io

Creating a Release

# Create and push a new version tag
git tag v1.0.0
git push origin v1.0.0

This will automatically:

Build Docker images for server and web
Push images to GitHub Container Registry (ghcr.io)
Create a GitHub Release with release notes

Manual Docker Build

You can also trigger a Docker build manually:

Go to Actions → Build and Release Docker Images
Click the Run workflow dropdown
Enter a custom tag (default: latest)
Click the Run workflow button to start the build

Extending FileAI

Adding a File Processor

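// apps/server/src/services/processors/my-processor.ts
import { FileProcessor, ProcessedDocument } from '@fileai/shared';

// Placeholder parser: replace with the real extraction logic for your format.
function extractTextFromFile(file: Buffer): string {
  return file.toString('utf-8');
}

export class MyProcessor implements FileProcessor {
  // MIME types this processor handles
  supportedTypes = ['application/x-myformat'];

  // Convert the raw file into plain text plus metadata
  async process(file: Buffer, filename: string): Promise<ProcessedDocument> {
    const text = extractTextFromFile(file);
    return {
      text,
      metadata: { filename, format: 'myformat' },
    };
  }
}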
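
How a processor gets selected for an uploaded file is not shown here. A plausible approach, sketched below as an assumption, is a registry keyed by MIME type. The interfaces are local stand-ins for the ones exported by `@fileai/shared` (with `Uint8Array` in place of `Buffer`, which is a subclass of it, so the example has no Node dependency), and `ProcessorRegistry` and `PlainTextProcessor` are hypothetical names.

```typescript
// Local stand-ins for the '@fileai/shared' interfaces (assumed shapes).
interface ProcessedDocument {
  text: string;
  metadata: Record<string, string>;
}

interface FileProcessor {
  supportedTypes: string[];
  process(file: Uint8Array, filename: string): Promise<ProcessedDocument>;
}

// Example processor: decodes the bytes as UTF-8 text.
class PlainTextProcessor implements FileProcessor {
  supportedTypes = ['text/plain'];

  async process(file: Uint8Array, filename: string): Promise<ProcessedDocument> {
    const text = new TextDecoder('utf-8').decode(file);
    return { text, metadata: { filename, format: 'txt' } };
  }
}

// Hypothetical registry that dispatches files to processors by MIME type.
class ProcessorRegistry {
  private byType = new Map<string, FileProcessor>();

  // Register a processor under every MIME type it declares.
  register(processor: FileProcessor): void {
    for (const type of processor.supportedTypes) {
      this.byType.set(type, processor);
    }
  }

  supports(mimeType: string): boolean {
    return this.byType.has(mimeType);
  }

  // Look up the processor for the MIME type and run it.
  async process(
    mimeType: string,
    file: Uint8Array,
    filename: string,
  ): Promise<ProcessedDocument> {
    const processor = this.byType.get(mimeType);
    if (!processor) {
      throw new Error(`No processor registered for ${mimeType}`);
    }
    return processor.process(file, filename);
  }
}
```

With this layout, a later registration for the same MIME type replaces the earlier one, which gives custom processors a simple way to override built-in behavior.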

Adding a Storage Adapter

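// apps/server/src/services/storage/my-storage.ts
import { StorageAdapter } from '@fileai/shared';

export class MyStorageAdapter implements StorageAdapter {
  // Store the file at the given path and return the string your backend uses to locate it
  async upload(file: Buffer, path: string): Promise<string> {
    throw new Error('upload not implemented');
  }
  // Fetch the stored file contents
  async download(path: string): Promise<Buffer> {
    throw new Error('download not implemented');
  }
  // Remove the stored file
  async delete(path: string): Promise<void> {
    throw new Error('delete not implemented');
  }
}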
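
As a concrete illustration of the three methods, the sketch below keeps files in a `Map`, which can be handy in unit tests. It is an assumption-laden example: the interface is a local stand-in for the one in `@fileai/shared` (using `Uint8Array`, the base class of `Buffer`), `InMemoryStorageAdapter` is a hypothetical name, and returning the path from `upload` is a guess at what the returned string represents.

```typescript
// Local stand-in for the StorageAdapter interface (assumed shape).
interface StorageAdapter {
  upload(file: Uint8Array, path: string): Promise<string>;
  download(path: string): Promise<Uint8Array>;
  delete(path: string): Promise<void>;
}

// Hypothetical adapter that keeps files in memory.
class InMemoryStorageAdapter implements StorageAdapter {
  private files = new Map<string, Uint8Array>();

  async upload(file: Uint8Array, path: string): Promise<string> {
    // Copy the bytes so later mutations by the caller do not affect stored data.
    this.files.set(path, file.slice());
    // Assumption: the returned string is the key used to retrieve the file.
    return path;
  }

  async download(path: string): Promise<Uint8Array> {
    const file = this.files.get(path);
    if (!file) {
      throw new Error(`File not found: ${path}`);
    }
    // Return a copy so callers cannot modify the stored bytes.
    return file.slice();
  }

  async delete(path: string): Promise<void> {
    // Deleting a missing path is treated as a no-op.
    this.files.delete(path);
  }
}
```

An adapter backed by S3 or MinIO would follow the same shape, with each method issuing the corresponding object-store request instead of touching the map.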

Contributing

We love contributions! Please read our Contributing Guide before submitting a Pull Request.

Ways to Contribute

Report Bugs - Open an issue with detailed reproduction steps
Suggest Features - Share your ideas in Discussions
Improve Docs - Help us make documentation better
Submit PRs - Fix bugs or add new features

Development Workflow

Fork the repository
Create a feature branch: git checkout -b feature/amazing-feature
Make your changes
Run tests: npm run test
Commit: git commit -m 'feat: add amazing feature'
Push: git push origin feature/amazing-feature
Open a Pull Request

Roadmap

See the open issues for a full list of proposed features.

Documentation

Setup Guide - Detailed installation instructions
Project Summary - Architecture overview
API Reference - API documentation

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Attribution Requirements

If you use FileAI, you must:

Keep the copyright notice - Retain all copyright, patent, trademark, and attribution notices
Include the NOTICE file - Distribute a copy of the NOTICE file
State changes - If you modify the code, clearly indicate your modifications
Credit FileAI - Include attribution in your application (e.g., "Powered by FileAI")

Copyright 2026 FileAI Contributors

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Acknowledgements

LangChain.js - AI/LLM framework
Qdrant - Vector database
Ollama - Local LLM runner
shadcn/ui - UI components
Next.js - React framework
tRPC - Type-safe APIs

Community

GitHub Discussions - Ask questions, share ideas
Discord Server - Chat with the community

Made with love by the FileAI community

Star us on GitHub if you find this project useful!
