An intelligent all-in-one financial assistant combining spam message detection, smart budget planning, and an AI chatbot for secure, effortless money management. Built with machine learning and NLP to protect users from scams, track spending, and offer personalized financial insights.

Preethibk20/FinSecure-AI

๐Ÿ›ก๏ธ Advanced Spam Text Detector

A spam detection system that combines traditional machine learning with deep learning models to improve accuracy over either approach alone. The project implements LSTM, CNN-LSTM, and ensemble models and exposes them through an interactive web interface.

Try it live: https://finsecure-ai-uepz.onrender.com

🚀 Features

🤖 Multiple AI Models

  • Deep Learning Models: LSTM, CNN-LSTM, and Ensemble neural networks
  • Traditional ML: Naive Bayes with TF-IDF (baseline)
  • Ensemble Method: Weighted combination of all models for optimal accuracy
  • Real-time Model Switching: Compare different algorithms instantly

๐Ÿ” Advanced Analysis

  • Text Classification: Multi-model spam detection with confidence scores
  • URL Safety Analysis: Domain trust scoring and phishing detection
  • Interactive Comparison: Side-by-side model performance analysis
  • Real-time Processing: Fast inference with multiple model options
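
The URL safety analysis listed above can be pictured as a rule-based score. The following is a minimal sketch, not the project's implementation: the function names `score_url` and `analyze_urls`, the keyword list, the penalty values, and the 50-point cut-off are all assumptions chosen for illustration. Only the output keys (`domain`, `trust_score`, `classification`, `risk_factors`) follow the response format documented in this README.

```python
import re
from urllib.parse import urlparse

# Hypothetical keyword list and penalties; tune them against real data.
SUSPICIOUS_KEYWORDS = ("login", "verify", "secure", "free", "prize")
URL_PATTERN = re.compile(r"https?://[^\s]+")
IP_PATTERN = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def score_url(url: str) -> dict:
    """Return a heuristic trust report for one URL (illustrative rules only)."""
    parsed = urlparse(url)
    domain = (parsed.hostname or "").lower()
    score = 100
    risk_factors = []

    if parsed.scheme != "https":
        score -= 20
        risk_factors.append("Not using HTTPS")
    if IP_PATTERN.fullmatch(domain):
        score -= 40
        risk_factors.append("Raw IP address instead of a domain name")
    elif domain.count(".") >= 3:
        score -= 15
        risk_factors.append("Many nested subdomains")
    if any(word in url.lower() for word in SUSPICIOUS_KEYWORDS):
        score -= 30
        risk_factors.append("Contains suspicious keywords")

    score = max(score, 0)
    return {
        "domain": domain,
        "trust_score": score,
        "classification": "Trusted" if score >= 50 else "Suspicious",
        "risk_factors": risk_factors,
    }


def analyze_urls(text: str) -> list:
    """Extract every http(s) URL from a message and score each one."""
    return [score_url(url.rstrip(".,!?)")) for url in URL_PATTERN.findall(text)]
```

Calling `analyze_urls(message)` on a message without links returns an empty list, so the result can be attached to a prediction unconditionally.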

🎨 Modern Web Interface

  • Responsive Design: Works on desktop and mobile devices
  • Model Selection: Easy switching between AI algorithms
  • Visual Analytics: Charts and confidence meters
  • Dark/Light Theme: Modern UI with smooth animations

📸 Screenshots

Main Interface

┌─────────────────────────────────────────────────────────┐
│                🛡️ Advanced Spam Text Detector           │
├─────────────────────────────────────────────────────────┤
│  🤖 Select AI Model                                    │
│  ○ Ensemble (Best accuracy)     ✅ Available           │
│  ○ Deep Learning (LSTM/CNN)     ✅ Available           │
│  ○ Traditional ML (Naive Bayes) ✅ Available           │
│                                                         │
│  ┌─────────────────────────────────────────────────────┐ │
│  │ Enter text to analyze...                            │ │
│  │                                                     │ │
│  └─────────────────────────────────────────────────────┘ │
│  [🔍 Analyze] [🗑️ Clear] [⚖️ Compare All Models]        │
└─────────────────────────────────────────────────────────┘

Results Display

┌─────────────────────────────────────────────────────────┐
│  🚨 SPAM DETECTED                                       │
│  Confidence: ████████████████░░░░ 89.3%                │
│  Model: Deep Learning (LSTM/CNN)                       │
│                                                         │
│  📊 Ensemble Breakdown:                                 │
│  Traditional ML (30%): Not Spam (62.1%)               │
│  Deep Learning (70%):  Spam (89.3%)                   │
│  Final Decision:       Spam (78.5%)                   │
└─────────────────────────────────────────────────────────┘

๐Ÿ› ๏ธ Installation

Quick Start

# Clone the repository
git clone https://github.com/Preethibk20/FinSecure-AI.git
cd FinSecure-AI

# Run automated setup
python setup.py

# Start the application
python main_dl.py

Manual Installation

# Install dependencies
pip install -r requirements.txt

# Download NLTK data
python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords')"

# Train models (requires dataset)
python train_models.py --model both

# Start enhanced application
python main_dl.py

# Or start original version
python main.py

📊 Dataset

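Required: SMS Spam Collection Dataset (UCI ML Repository), saved as mail_data.csv. The dataset is not shipped with the repository and must be added before training.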
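
Before training, it can help to confirm that the file parses and to see how unbalanced the classes are. The sketch below is only an illustration: the column names `Category` and `Message` and the `ham`/`spam` label values are assumptions, since this README does not state the file's schema, and `class_balance` is a hypothetical helper rather than part of the project.

```python
import csv
import io
from collections import Counter


def class_balance(csv_file, label_column="Category"):
    """Count how many rows carry each label (the column name is an assumption)."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column].strip().lower() for row in reader)


# Tiny in-memory stand-in for mail_data.csv; the real file is far larger.
sample = io.StringIO(
    "Category,Message\n"
    "ham,See you at lunch\n"
    "spam,WIN a free prize now\n"
    "ham,Meeting moved to Monday\n"
)
counts = class_balance(sample)
spam_share = counts["spam"] / sum(counts.values())
print(dict(counts), f"spam share: {spam_share:.1%}")
```

For the real file, pass an open handle instead, for example `open('mail_data.csv', newline='', encoding='utf-8')`.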

🧠 Model Architecture

Deep Learning Models

1. LSTM Model

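from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    Embedding, Bidirectional, LSTM, Dense, BatchNormalization, Dropout
)

# Two stacked bidirectional LSTM layers followed by a dense classifier head
Sequential([
    Embedding(10000, 128, input_length=100),
    Bidirectional(LSTM(64, return_sequences=True, dropout=0.3)),
    Bidirectional(LSTM(32, dropout=0.3)),
    Dense(64, activation='relu'),
    BatchNormalization(),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])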
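
The `Embedding(10000, 128, input_length=100)` layer expects each message as exactly 100 integer word ids drawn from a vocabulary of at most 10,000 entries. The pure-Python sketch below shows that conversion in simplified form. It is a stand-in for illustration, not the project's tokenizer: `build_vocab` and `encode` are hypothetical names, unknown words are simply dropped, and padding is appended at the end, whereas Keras' `pad_sequences` pads at the front by default.

```python
from collections import Counter

MAX_FEATURES = 10000  # vocabulary size, matching Embedding(10000, ...)
MAX_LENGTH = 100      # sequence length, matching input_length=100


def build_vocab(texts, max_features=MAX_FEATURES):
    """Map the most frequent words to ids 1..max_features-1; id 0 is padding."""
    counts = Counter(word for text in texts for word in text.lower().split())
    most_common = counts.most_common(max_features - 1)
    return {word: idx for idx, (word, _) in enumerate(most_common, start=1)}


def encode(text, vocab, max_length=MAX_LENGTH):
    """Turn text into exactly max_length ids: drop unknowns, truncate, zero-pad."""
    ids = [vocab[w] for w in text.lower().split() if w in vocab][:max_length]
    return ids + [0] * (max_length - len(ids))


vocab = build_vocab(["free money now", "free lunch"])
sequence = encode("free money please", vocab)
print(sequence[:5], len(sequence))
```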

2. CNN-LSTM Hybrid

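from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, MaxPooling1D, LSTM, Dense

# Two Conv1D/MaxPooling1D stages extract local n-gram features before the LSTM
Sequential([
    Embedding(10000, 128, input_length=100),
    Conv1D(128, 5, activation='relu'),
    MaxPooling1D(5),
    Conv1D(64, 5, activation='relu'),
    MaxPooling1D(5),
    LSTM(64, dropout=0.3),
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid')
])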
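
It is worth checking how many time steps survive the two convolution and pooling stages, because the LSTM only sees what is left. The arithmetic below assumes the Keras defaults that the listing leaves implicit: stride 1 and `'valid'` padding for `Conv1D`, and a pooling stride equal to the pool size for `MaxPooling1D`. The helper names are made up for this sketch.

```python
def conv1d_length(length, kernel_size):
    """Output length of Conv1D with stride 1 and 'valid' padding."""
    return length - kernel_size + 1


def maxpool1d_length(length, pool_size):
    """Output length of MaxPooling1D with stride == pool_size and 'valid' padding."""
    return (length - pool_size) // pool_size + 1


steps = 100                          # input_length of the Embedding layer
steps = conv1d_length(steps, 5)      # Conv1D(128, 5)   -> 96
steps = maxpool1d_length(steps, 5)   # MaxPooling1D(5)  -> 19
steps = conv1d_length(steps, 5)      # Conv1D(64, 5)    -> 15
steps = maxpool1d_length(steps, 5)   # MaxPooling1D(5)  -> 3
print(steps)  # number of time steps that reach the LSTM layer
```

Under these assumptions a 100-token message is reduced to 3 time steps, so shortening `input_length` much further would leave the second pooling layer with nothing to pool.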

3. Ensemble Architecture

  • LSTM Branch: Bidirectional LSTM + GlobalMaxPooling
  • CNN Branch: Conv1D + GlobalMaxPooling
  • Fusion: Concatenate + Dense layers
  • Output: Sigmoid activation for binary classification
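
A sigmoid output is a single number between 0 and 1, read here as the probability that a message is spam. The sketch below shows one plausible way to turn that number into the fields used by the API responses in this README. The 0.5 decision threshold, the rounding, and the function name `to_result` are assumptions; the project may use different values.

```python
import math


def sigmoid(logit):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def to_result(spam_probability, threshold=0.5):
    """Convert a sigmoid output into a labelled result (threshold is assumed)."""
    is_spam = spam_probability >= threshold
    confidence = spam_probability if is_spam else 1.0 - spam_probability
    return {
        "prediction": "Spam" if is_spam else "Not Spam",
        "confidence": round(confidence * 100, 1),
        "spam_probability": round(spam_probability * 100, 1),
        "not_spam_probability": round((1.0 - spam_probability) * 100, 1),
    }


print(to_result(0.893))
print(to_result(sigmoid(-1.4)))
```

Confidence is reported for the predicted class, so a message scored at 0.2 comes back as "Not Spam" with 80% confidence rather than "Spam" with 20%.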

Training Configuration

  • Optimizer: Adam (lr=0.001)
  • Loss: Binary crossentropy
  • Metrics: Accuracy, Precision, Recall
  • Callbacks: EarlyStopping, ReduceLROnPlateau
  • Data Split: 70% train, 15% validation, 15% test
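
The 70/15/15 split listed above can be reproduced with a shuffle and two cuts. This is a minimal sketch using only the standard library; the function name `split_dataset` and the fixed seed are assumptions, and the sketch does not stratify by label, which the real training script may or may not do.

```python
import random


def split_dataset(rows, train=0.70, val=0.15, seed=42):
    """Shuffle rows and cut them into train/validation/test partitions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # whatever remains is the test set
    )


train_rows, val_rows, test_rows = split_dataset(range(1000))
print(len(train_rows), len(val_rows), len(test_rows))
```

Because spam is the minority class in SMS data, a stratified split is usually the safer choice when the dataset is small.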

🚀 Usage

Web Application

# Enhanced version with deep learning
python main_dl.py

# Original version (traditional ML only)
python main.py

# Then open http://localhost:8000 in your browser

API Usage

Analyze Text

import requests

# Single model analysis
response = requests.post('http://localhost:8000/api/analyze', 
    json={
        'text': 'URGENT: Your account has been compromised!',
        'model_type': 'ensemble'  # or 'deep_learning', 'traditional'
    }
)
result = response.json()
print(f"Prediction: {result['prediction']}")
print(f"Confidence: {result['confidence']:.1f}%")

Compare All Models

# Compare all models on same text
response = requests.get('http://localhost:8000/api/models/compare', 
    params={'text': 'Free money! Click here now!'}
)
comparison = response.json()

for model, result in comparison['model_predictions'].items():
    print(f"{model}: {result['prediction']} ({result['confidence']:.1f}%)")

Check Model Status

response = requests.get('http://localhost:8000/api/models/status')
status = response.json()
print("Available models:", list(status.keys()))

Command Line Training

# Train all models
python train_models.py --model both --epochs 30

# Train specific model
python train_models.py --model deep_learning --epochs 50
python train_models.py --model traditional

# Get help
python train_models.py --help

Python API

from deep_learning_model import SpamDetectorDL

# Load trained model
detector = SpamDetectorDL.load_model(
    'spam_detector_ensemble.h5', 
    'tokenizer_ensemble.pkl'
)

# Make predictions
result = detector.predict("Congratulations! You won $1000!")
print(f"Prediction: {result['prediction']}")
print(f"Confidence: {result['confidence']:.1f}%")
print(f"Spam Probability: {result['spam_probability']:.1f}%")

📈 Performance Metrics

Expected Results

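| Model          | Accuracy | Precision | Recall | F1-Score |
|----------------|----------|-----------|--------|----------|
| Traditional ML | ~96%     | ~95%      | ~94%   | ~94%     |
| LSTM           | ~97-98%  | ~96%      | ~95%   | ~95%     |
| CNN-LSTM       | ~97-98%  | ~96%      | ~95%   | ~95%     |
| Ensemble       | ~98-99%  | ~97%      | ~96%   | ~96%     |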
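
For reference, the four metrics in the table are all derived from the confusion matrix, with spam treated as the positive class. The sketch below computes them from raw counts. The counts in the example call are made up for illustration and are not results from this project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 with spam as the positive class."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,  # of messages flagged as spam, share that were spam
        "recall": recall,        # of actual spam messages, share that were flagged
        "f1": 2 * precision * recall / denom if denom else 0.0,
    }


# Made-up confusion-matrix counts, purely for illustration.
metrics = classification_metrics(tp=90, fp=5, fn=10, tn=895)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On an imbalanced dataset accuracy alone can look strong even when recall is weak, which is why the table reports precision, recall, and F1 alongside it.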

Inference Speed

  • Traditional ML: <10ms per prediction
  • Deep Learning: <100ms per prediction
  • Ensemble: <150ms per prediction
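
These latency figures depend on hardware, so it is worth measuring them locally. The helper below is a sketch of one way to do that: it times any callable that takes a text and reports the median in milliseconds. `measure_latency_ms` and `dummy_predict` are hypothetical names; in practice the callable would be a loaded model's `predict` method.

```python
import statistics
import time


def measure_latency_ms(predict, text, runs=50, warmup=5):
    """Return the median wall-clock time of predict(text) in milliseconds."""
    for _ in range(warmup):  # warm-up calls are excluded from the timings
        predict(text)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(text)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


# Stand-in predictor so the sketch runs without a trained model.
def dummy_predict(text):
    return {"prediction": "Spam" if "free" in text.lower() else "Not Spam"}


latency = measure_latency_ms(dummy_predict, "Free money! Click here now!")
print(f"median latency: {latency:.3f} ms")
```

The median is used instead of the mean so that a single slow call, such as the first TensorFlow invocation, does not dominate the result.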

๐Ÿ—๏ธ Project Structure

spam-text-detector/
├── 📄 README.md                   # This file
├── 📄 requirements.txt            # Python dependencies
├── 📄 setup.py                    # Automated setup script
├── 📄 train_models.py             # Model training script
├── 📄 test_spam_detection.py      # Testing script
│
├── 🤖 AI Models
│   ├── deep_learning_model.py     # Deep learning implementation
│   ├── main_dl.py                 # Enhanced FastAPI app
│   └── main.py                    # Original FastAPI app
│
├── 🎨 Frontend
│   ├── templates/
│   │   ├── index_dl.html          # Enhanced interface
│   │   └── index.html             # Original interface
│   └── static/
│       ├── css/styles.css         # Styling
│       ├── js/main_dl.js          # Enhanced JavaScript
│       └── js/main.js             # Original JavaScript
│
├── 📊 Data (you need to add)
│   └── mail_data.csv              # SMS spam dataset
│
└── 🔧 Generated (after training)
    ├── models/
    │   ├── spam_detector_lstm.h5
    │   ├── spam_detector_ensemble.h5
    │   ├── tokenizer_lstm.pkl
    │   └── text_classification.pkl
    └── outputs/
        ├── training_history_dl.png
        ├── confusion_matrix_dl.png
        └── model_comparison.png

🎯 Model Selection Guide

When to Use Each Model

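| Model          | Best For         | Pros                     | Cons                    |
|----------------|------------------|--------------------------|-------------------------|
| Ensemble       | Production use   | Highest accuracy, robust | Slower inference        |
| Deep Learning  | Complex patterns | Context understanding    | Requires more resources |
| Traditional ML | Fast deployment  | Speed, interpretability  | Lower accuracy          |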

Performance Comparison

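import requests

# Helper wrapping the POST /api/analyze endpoint described under API Usage
def analyze_text(text, model_type):
    response = requests.post('http://localhost:8000/api/analyze',
                             json={'text': text, 'model_type': model_type})
    return response.json()

# Test different models (the server must be running on localhost:8000)
models = ['traditional', 'deep_learning', 'ensemble']
test_text = "URGENT: Verify your account now!"

for model in models:
    result = analyze_text(test_text, model)
    print(f"{model:15}: {result['prediction']:8} ({result['confidence']:5.1f}%)")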

🔧 Configuration

Model Parameters

# Deep Learning Configuration
SpamDetectorDL(
    max_features=10000,    # Vocabulary size
    max_length=100,        # Sequence length  
    embedding_dim=128      # Embedding dimensions
)

# Training Parameters
epochs=30
batch_size=32
validation_split=0.2
early_stopping_patience=10

# Ensemble Weights
dl_weight = 0.7           # Deep learning model weight
traditional_weight = 0.3   # Traditional ML weight
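
With these weights, one straightforward reading of "weighted combination" is a weighted average of the two models' spam probabilities. The sketch below shows that reading. It is an assumption about how the ensemble decision could be formed, since the README does not show the project's actual combination rule, and the names `combine` and `ensemble_predict` as well as the 0.5 threshold are hypothetical.

```python
DL_WEIGHT = 0.7           # deep learning model weight (value from the configuration)
TRADITIONAL_WEIGHT = 0.3  # traditional ML weight (value from the configuration)


def combine(traditional_spam_prob, dl_spam_prob):
    """Weighted average of two spam probabilities, each in the range [0, 1]."""
    return TRADITIONAL_WEIGHT * traditional_spam_prob + DL_WEIGHT * dl_spam_prob


def ensemble_predict(traditional_spam_prob, dl_spam_prob, threshold=0.5):
    """Return (label, combined spam probability); the threshold is an assumption."""
    combined = combine(traditional_spam_prob, dl_spam_prob)
    return ("Spam" if combined >= threshold else "Not Spam", combined)


# The two models disagree: Naive Bayes leans "not spam", the network says "spam".
label, probability = ensemble_predict(0.379, 0.893)
print(label, f"{probability:.1%}")
```

Because the deep learning model carries the larger weight, it wins this disagreement. Both inputs must be spam probabilities, so a "Not Spam" prediction at 62.1% confidence enters the average as 0.379.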

Environment Variables

# Optional: turn off oneDNN custom operations (also silences the related startup notice)
export TF_ENABLE_ONEDNN_OPTS=0

# Optional: raise the TensorFlow log level (2 hides INFO and WARNING messages)
export TF_CPP_MIN_LOG_LEVEL=2

🧪 Testing

Run Comprehensive Tests

# Test all models with various samples
python test_spam_detection.py

# Expected output:
# ✅ Obvious spam detection: 80-95%
# ✅ Legitimate messages: 95-100%
# ⚠️  Subtle spam detection: 60-80%

Manual Testing

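from deep_learning_model import SpamDetectorDL

# Load a trained model first (same files as in the Python API example)
detector = SpamDetectorDL.load_model('spam_detector_ensemble.h5', 'tokenizer_ensemble.pkl')

# Test specific samples
test_samples = [
    "CONGRATULATIONS! You won $1000!",           # Should be: Spam
    "Hey, lunch tomorrow at 12pm?",              # Should be: Not Spam
    "Your account has been compromised!",        # Should be: Spam
    "Meeting moved to Monday",                   # Should be: Not Spam
]

for text in test_samples:
    result = detector.predict(text)
    print(f"'{text}' → {result['prediction']} ({result['confidence']:.1f}%)")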

🚨 Troubleshooting

Common Issues

Model Loading Errors

# Retrain models if corrupted
python train_models.py --model both

Memory Issues

# Reduce batch size in training
batch_size=16  # Instead of 32

NLTK Data Missing

import nltk
nltk.download('punkt')
nltk.download('stopwords')

Server Not Starting

# Check if port is in use (Windows)
netstat -an | findstr :8000

# Check if port is in use (Linux/macOS)
lsof -i :8000

# Use different port
uvicorn main_dl:app --port 8001
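
The port check can also be done from Python, which works the same way on Windows, Linux, and macOS. This is a small standard-library sketch, not part of the project; `port_in_use` and `first_free_port` are hypothetical helper names and the 8000-8010 range is arbitrary.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def first_free_port(start=8000, end=8010):
    """Return the first port in [start, end] with no listener, or None."""
    for port in range(start, end + 1):
        if not port_in_use(port):
            return port
    return None


print("port 8000 in use:", port_in_use(8000))
print("first free port:", first_free_port())
```

The port returned by `first_free_port()` can then be passed to `uvicorn main_dl:app --port <port>`.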

Low Accuracy

# Retrain with more epochs
python train_models.py --model deep_learning --epochs 50

# Check dataset quality
python -c "import pandas as pd; print(pd.read_csv('mail_data.csv').info())"

📚 API Reference

Endpoints

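| Method | Endpoint            | Description        | Parameters       |
|--------|---------------------|--------------------|------------------|
| GET    | /                   | Web interface      | -                |
| POST   | /api/analyze        | Analyze text       | text, model_type |
| GET    | /api/models/compare | Compare models     | text             |
| GET    | /api/models/status  | Model availability | -                |
| GET    | /api/demo-text      | Sample text        | -                |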

Response Format

{
  "prediction": "Spam",
  "confidence": 89.3,
  "spam_probability": 89.3,
  "not_spam_probability": 10.7,
  "model_used": "Deep Learning (LSTM/CNN)",
  "urls": [
    {
      "domain": "suspicious-site.com",
      "trust_score": 25,
      "classification": "Suspicious",
      "risk_factors": ["Contains suspicious keywords"]
    }
  ]
}
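
A client will typically reduce this response to a short verdict plus the domains worth warning about. The sketch below parses the sample response shown above using only the standard library. The `summarize` helper and the `min_trust=50` cut-off are assumptions for illustration; the README does not define where "Suspicious" begins.

```python
import json


def summarize(response_json, min_trust=50):
    """Build a one-line verdict and list domains scoring below min_trust."""
    data = json.loads(response_json)
    risky = [u["domain"] for u in data.get("urls", []) if u["trust_score"] < min_trust]
    line = f"{data['prediction']} ({data['confidence']:.1f}%) via {data['model_used']}"
    return line, risky


# The sample response from this section, as a JSON string.
sample_response = """
{
  "prediction": "Spam",
  "confidence": 89.3,
  "spam_probability": 89.3,
  "not_spam_probability": 10.7,
  "model_used": "Deep Learning (LSTM/CNN)",
  "urls": [
    {
      "domain": "suspicious-site.com",
      "trust_score": 25,
      "classification": "Suspicious",
      "risk_factors": ["Contains suspicious keywords"]
    }
  ]
}
"""

verdict, risky_domains = summarize(sample_response)
print(verdict)
print("Risky domains:", risky_domains)
```

Using `data.get("urls", [])` keeps the helper working for messages that contain no links, assuming the field is then empty or absent.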

๐Ÿค Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Setup

# Install development dependencies
pip install -r requirements.txt
pip install pytest black flake8

# Run tests
pytest tests/

# Format code
black *.py

# Lint code
flake8 *.py

Code Style

  • Follow PEP 8 guidelines
  • Use type hints where possible
  • Add docstrings to functions
  • Write unit tests for new features

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • Dataset: UCI ML Repository - SMS Spam Collection
  • Frameworks: TensorFlow, Keras, FastAPI, scikit-learn
  • Libraries: NLTK, pandas, numpy, matplotlib
  • UI: Chart.js, Font Awesome, modern CSS

🔮 Future Enhancements

  • Transformer Models: BERT/RoBERTa integration
  • Multi-language Support: Detect spam in different languages
  • Real-time Learning: Online learning capabilities
  • Mobile App: React Native/Flutter implementation
  • Browser Extension: Chrome/Firefox extension
  • Email Integration: Gmail/Outlook plugins
  • Advanced Analytics: Detailed reporting dashboard
  • A/B Testing: Model performance comparison tools

โญ Star this repository if you found it helpful!

๐Ÿ›ก๏ธ Happy Spam Detecting!
