An intelligent AI-powered lyrics generator built with Retrieval-Augmented Generation (RAG) and TensorFlow Keras. The application combines semantic search with neural language models to generate contextually relevant lyrics based on user input.
- Retrieval-Augmented Generation (RAG): Uses TF-IDF vectorization and cosine similarity to retrieve contextually relevant lyric snippets
- Advanced Language Model: TensorFlow Keras-based next-word prediction with temperature sampling for creative variation
- Interactive Web UI: Built with Streamlit for easy, real-time lyrics generation
- Multi-Source Data Support: Loads lyrics from both CSV files and MongoDB databases
- Docker Support: Production-ready Docker containerization for easy deployment
- Temperature Control: Adjust creativity level of generated lyrics (0.0 = deterministic, 1.0+ = creative)
- Sequence Padding: Intelligent padding for variable-length input sequences
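The temperature control described above can be sketched in a few lines of NumPy. This is an illustrative stand-alone function, not the code in `main.py`: it rescales a model's output probabilities by temperature and samples an index, so low values behave almost like argmax and high values flatten the distribution.

```python
import numpy as np

def sample_with_temperature(probs, temperature=0.8, rng=None):
    """Sample a token index from a probability vector, rescaled by temperature.

    Low temperature sharpens the distribution (near-deterministic);
    high temperature flattens it (more creative).
    """
    rng = rng or np.random.default_rng()
    probs = np.asarray(probs, dtype=np.float64)
    # Rescale log-probabilities by temperature, then renormalize via softmax.
    logits = np.log(probs + 1e-9) / max(temperature, 1e-6)
    scaled = np.exp(logits - logits.max())
    scaled /= scaled.sum()
    return int(rng.choice(len(scaled), p=scaled))

# Near-zero temperature collapses to the most likely word:
print(sample_with_temperature([0.1, 0.6, 0.3], temperature=0.01))  # almost always 1
```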
The project combines two key AI components:
1. Retrieval Module (RAG)
   - TF-IDF Vectorizer for text representation
   - Cosine similarity search for context retrieval
   - Returns the most semantically relevant lyric from the dataset
2. Generation Module
   - Keras Sequential model trained on lyric sequences
   - Word tokenization and padding
   - Temperature-based sampling for output diversity
   - Configurable sequence length (default: 100 tokens)
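The retrieval module can be sketched with scikit-learn as follows. This is a minimal example with three made-up lyric snippets; the actual app would load the fitted vectorizer from `models/tfidf_vectorizer.pkl` rather than fitting on the fly.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy corpus standing in for the real lyrics dataset.
lyrics = [
    "thank you next I'm so grateful for my ex",
    "no tears left to cry I'm pickin it up",
    "break free this is the part when I say I don't want ya",
]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(lyrics)  # one TF-IDF row per lyric

def retrieve(query: str) -> str:
    """Return the lyric most similar to the query under cosine similarity."""
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, matrix).ravel()
    return lyrics[scores.argmax()]

print(retrieve("I want to break free"))
```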
```
lyrics_generator/
├── main.py                    # Streamlit application (main entry point)
├── lyrics_generator.ipynb     # Jupyter notebook for model training & experimentation
├── requirements.txt           # Python dependencies (71 packages)
├── Dockerfile                 # Docker containerization
├── docker-compose.yml         # Docker Compose configuration
├── .env.example               # Environment variables template
├── ArianaGrande.csv           # Sample dataset (Ariana Grande lyrics)
├── README.md                  # This file
│
├── models/                    # Pre-trained model artifacts
│   ├── rag_lyrics_model.h5    # Trained Keras model weights
│   ├── tokenizer.pickle       # Word tokenizer for text preprocessing
│   └── tfidf_vectorizer.pkl   # Fitted TF-IDF vectorizer
│
├── myenv/                     # Virtual environment (local development)
│   ├── Scripts/               # Python executables
│   ├── Lib/                   # Installed packages
│   └── pyvenv.cfg             # Virtual env configuration
│
└── .github/                   # GitHub workflows & templates
```
| Component | Technology | Version |
|---|---|---|
| Backend | Python | 3.11+ |
| ML Framework | TensorFlow/Keras | 2.x |
| Web Framework | Streamlit | 1.28+ |
| Data Processing | Pandas | 2.3.3 |
| ML Utilities | scikit-learn | 1.3+ |
| Database Driver | PyMongo | 4.x |
| Containerization | Docker | Latest |
| Numerical Computing | NumPy, SciPy | Latest |
- Python: 3.11 or higher
- pip: Latest version
- Git: For cloning the repository
- Docker (optional): For containerized deployment
- MongoDB (optional): For database-backed lyrics storage
```
git clone https://github.com/Mayankvlog/lyrics_generator_generative_ai.git
cd lyrics_generator_generative_ai
```

Windows PowerShell:
```
python -m venv myenv
myenv\Scripts\Activate.ps1
```

macOS/Linux:
```
python3 -m venv myenv
source myenv/bin/activate
```

Install dependencies:
```
pip install -r requirements.txt
```

Local Development:
```
python -m streamlit run main.py
```

Access the app at: http://localhost:8501
```
# Build and start the service
docker-compose up -d

# View logs
docker-compose logs -f app

# Stop the service
docker-compose down
```

```
# Build the image
docker build -t lyrics-generator:latest .

# Run the container
docker run -p 8502:8502 \
  -e MONGODB_URI="your_mongodb_uri" \
  lyrics-generator:latest
```

Access the app at: http://localhost:8502
Create a `.env` file in the project root (use `.env.example` as a template):

```
MONGODB_URI=mongodb+srv://username:password@cluster.mongodb.net/?retryWrites=true&w=majority
VPS_HOST=your_host
VPS_USER=your_user
VPS_PASSWORD=your_password
```

The app supports data loading from:
1. CSV Files (Default)
   - Place CSV files in the project root
   - Ensure a `Lyric` column exists
   - Example: `ArianaGrande.csv`
2. MongoDB (Optional)
   - Configure the `MONGODB_URI` environment variable
   - Database: `food` (default)
   - Collection: `lyrics` (default)
   - Modify database/collection names in `main.py` if different
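The dual-source loading above can be sketched as a single helper: try MongoDB when `MONGODB_URI` is set, otherwise fall back to the bundled CSV. The database/collection names (`food`/`lyrics`) and the `Lyric` column come from the defaults listed above; the exact code in `main.py` may differ.

```python
import os
import pandas as pd

def load_lyrics(csv_path: str = "ArianaGrande.csv") -> list[str]:
    """Load lyric strings from MongoDB if configured, else from a CSV file."""
    uri = os.getenv("MONGODB_URI")
    if uri:
        from pymongo import MongoClient
        client = MongoClient(uri)
        docs = client["food"]["lyrics"].find({}, {"Lyric": 1})
        return [d["Lyric"] for d in docs if "Lyric" in d]
    # CSV fallback: keep only non-null values of the Lyric column.
    df = pd.read_csv(csv_path)
    return df["Lyric"].dropna().astype(str).tolist()
```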
Modify these in `main.py` if needed:

```python
max_sequence_length = 100  # Must match training sequence length
temperature = 0.8          # Adjust generation creativity (0.0-2.0)
num_words = 100            # Number of words to generate
```

The model was trained on Ariana Grande lyrics using:
- Input: Sequences of tokens (max 100 words)
- Output: Next-word predictions with softmax probabilities
- Loss Function: Categorical crossentropy
- Optimizer: Adam
- Training Data: Padded n-gram sequences built from the lyrics
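To make the input/output pairing concrete, here is a plain-Python sketch of how next-word training pairs can be built from lyric lines: every prefix of a token sequence is left-padded to a fixed length and predicts the word that follows it. The actual notebook (`lyrics_generator.ipynb`) uses the Keras `Tokenizer` and `pad_sequences`; this uses a plain dict for illustration.

```python
MAX_LEN = 100  # matches the configured max_sequence_length

def build_sequences(lines):
    # Assign each word an integer id (0 is reserved for padding).
    vocab = {}
    for line in lines:
        for word in line.lower().split():
            vocab.setdefault(word, len(vocab) + 1)

    inputs, targets = [], []
    for line in lines:
        ids = [vocab[w] for w in line.lower().split()][:MAX_LEN]
        # Every prefix predicts the word that follows it.
        for i in range(1, len(ids)):
            prefix = ids[:i]
            # Left-pad to a fixed length, as Keras pad_sequences would.
            inputs.append([0] * (MAX_LEN - len(prefix)) + prefix)
            targets.append(ids[i])
    return vocab, inputs, targets

vocab, X, y = build_sequences(["one last time I need to be the one"])
```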
1. Preprocessing
   - User input → lowercase conversion
   - Text cleaning (remove special characters)
   - Tokenization using trained tokenizer
2. Retrieval
   - TF-IDF vectorization of user input
   - Cosine similarity search against dataset
   - Return most relevant lyric snippet
3. Generation Loop
   - Use retrieved lyric as seed
   - Iteratively predict next word (100 iterations)
   - Apply temperature sampling for diversity
   - Append predictions to generate full lyric
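The stages above can be sketched end to end. The real app calls the trained Keras model and the pickled tokenizer; here a toy bigram table stands in for the model so the loop itself is runnable, and the temperature step is left out for determinism.

```python
import re

def clean(text: str) -> str:
    # Stage 1: lowercase and strip special characters.
    return re.sub(r"[^a-z0-9\s']", "", text.lower())

def generate(seed: str, predict_next, num_words: int = 10) -> str:
    # Stage 3: iteratively predict the next word and append it.
    words = clean(seed).split()
    for _ in range(num_words):
        nxt = predict_next(words)
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)

# Hypothetical stand-in for model.predict + temperature sampling:
bigrams = {"into": "you", "you": "into"}
def toy_predict(words):
    return bigrams.get(words[-1])

print(generate("Into you!", toy_predict, num_words=4))
```

In the real pipeline, `predict_next` would tokenize and pad `words` to `max_sequence_length`, run the model, and apply the temperature sampling step before mapping the predicted id back to a word.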
- Open http://localhost:8501
- Enter a prompt or theme (e.g., "love", "heartbreak", "dreams")
- Adjust generation parameters:
- Temperature: 0.5 (deterministic) to 2.0 (creative)
- Number of Words: 50-200 (length of output)
- Click "Generate Lyrics"
- View generated lyrics with retrieved context
- "love and heartbreak"
- "dancing in the moonlight"
- "wish you were here"
- "breaking free"
Key Packages:
- `streamlit` - Web UI framework
- `tensorflow` - Deep learning
- `keras` - High-level neural networks
- `pandas` - Data manipulation
- `numpy` - Numerical computing
- `scikit-learn` - ML utilities
- `pymongo` - MongoDB driver
- `h5py` - HDF5 file handling
- `joblib` - Model serialization

See `requirements.txt` for the complete list with versions.
- Model Inference: ~100-200ms per generation (GPU: ~50-100ms)
- TF-IDF Search: <10ms for dataset < 10,000 lyrics
- Memory Usage: ~500MB for model + data
- Optimization Tips:
  - Use GPU for faster inference: `CUDA_VISIBLE_DEVICES=0`
  - Cache models with `@st.cache_resource`
  - Batch process multiple generations
1. Check existing issues: GitHub Issues
2. Create a new issue, including:
   - Python version & OS
   - Error message & traceback
   - Steps to reproduce
   - Expected vs actual behavior
3. Discussions: Use GitHub Discussions for general questions
- Multi-artist support
- Custom model training interface
- Fine-tuning with user feedback
- Real-time lyrics quality scoring
- Export to music production software
- Advanced prompt engineering
- API endpoint for backend integration
- MLOps pipeline with CI/CD
- Model versioning & A/B testing
- Analytics & usage metrics
- Streamlit Documentation
- TensorFlow/Keras Guide
- Scikit-learn TF-IDF
- RAG Pattern
- Docker Best Practices
Made with ❤️ by Mayank Kumar | 2025
```
# Build the image
docker build -t lyrics-generator:latest .

# Run the container
docker run -p 8501:8501 lyrics-generator:latest
```

```
# Create .env file (copy from .env.example)
cp .env.example .env

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f app
```

- VPS IP: `167.71.235.91`
- URL: http://167.71.235.91:8501

1. SSH into your VPS:
   ```
   ssh root@167.71.235.91
   ```
2. Run the deployment script:
   ```
   curl -fsSL https://raw.githubusercontent.com/Mayankvlog/lyrics_generator_generative_ai/main/deploy-vps.sh | bash
   ```
   Or manually:
   ```
   apt-get update && apt-get install -y git curl
   curl -fsSL https://get.docker.com | sh
   git clone https://github.com/Mayankvlog/lyrics_generator_generative_ai.git
   cd lyrics_generator_generative_ai
   docker-compose up -d
   ```
3. Access the app:
   - Open your browser and go to: http://167.71.235.91:8501
The GitHub Actions workflow automatically:
- ✅ Tests code and dependencies
- ✅ Builds Docker image
- ✅ Pushes to Docker Hub
- ✅ Deploys to VPS via SSH