An AI-powered image authenticity checker that detects whether an image is real or AI-generated, and identifies the generator type (GAN, Diffusion, or Other).
With the rise of AI image generation tools like Stable Diffusion, MidJourney, and DALL-E, it's becoming increasingly difficult to distinguish real photos from synthetic ones. ImageTrust-AI helps identify AI-generated images using deep learning — and goes further by identifying which type of AI architecture produced them.

Features:
- Real vs AI-generated classification with confidence score
- Generator type detection — identifies GAN, Diffusion, or Other architecture
- Grad-CAM visual explanation — see which regions influenced the decision
- Image metadata extraction and EXIF analysis
- REST API endpoint
- Local Streamlit UI + deployed Gradio interface

Tech stack:
- Python 3.10
- PyTorch + TorchVision (ResNet18)
- FastAPI
- Streamlit (local) / Gradio (deployed)
- Pillow, OpenCV
- scikit-learn
- pytorch-grad-cam

Binary classifier (real vs AI-generated):
- Architecture: ResNet18 (transfer learning, pretrained on ImageNet)
- Dataset: ArtiFact (30k subset - 15k real, 15k fake)
- Validation Accuracy: ~94.5%

Generator type classifier:
- Architecture: ResNet18 (transfer learning)
- Classes: Real, GAN, Diffusion, Other
- Dataset: ArtiFact (40k - 10k per class, manually curated)
- Test Accuracy: ~94.7%

Project structure:

ImageTrust-AI/
├── app/
│ ├── main.py
│ ├── streamlit_app.py
│ ├── gradio_app.py
│ └── routes/predict.py
├── src/
│ ├── data/
│ │ ├── loader.py
│ │ ├── transforms.py
│ │ └── generator_loader.py
│ ├── models/
│ │ ├── model.py
│ │ ├── train.py
│ │ ├── train_efficientnet.py
│ │ ├── train_cross_validation.py
│ │ ├── train_generator.py
│ │ └── inference.py
│ ├── services/
│ │ ├── predictor.py
│ │ ├── metadata_checker.py
│ │ └── gradcam.py
│ └── utils/
├── notebooks/
│ └── failure_analysis.ipynb
├── saved_models/
├── sample_images/
├── requirements.txt
└── README.md

Installation:

git clone https://github.com/SiemonCha/ImageTrust-AI.git
cd ImageTrust-AI
conda create -n imagetrust-ai python=3.10
conda activate imagetrust-ai
pip install -r requirements.txt

Download ArtiFact dataset from Kaggle:
kaggle datasets download -d awsaf49/artifact-dataset

Place it outside the repo. Update DATASET_ROOT in src/data/loader.py and src/data/generator_loader.py.

Start the API:
PYTHONPATH=. uvicorn app.main:app --reload

Start the UI (new terminal):
PYTHONPATH=. streamlit run app/streamlit_app.py

Or run Gradio directly:
PYTHONPATH=. python app/gradio_app.py

Binary classifier results:

| Metric | Score |
|---|---|
| Validation Accuracy | 94.5% |
| Best Val Loss | 0.151 |
| Training Epochs | 5 (early stopping) |

Generator type classifier, per-class results:

| Class | Precision | Recall | F1 |
|---|---|---|---|
| Real | 0.95 | 0.84 | 0.90 |
| GAN | 1.00 | 1.00 | 1.00 |
| Diffusion | 0.89 | 0.95 | 0.92 |
| Other | 0.95 | 1.00 | 0.97 |
| Overall | 0.95 | 0.95 | 0.95 |
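
As a sanity check on the per-class table, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded values in the table; the helper name `f1_score` is hypothetical and is not the scikit-learn function of the same name.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the per-class table.
reported = {
    "Real": (0.95, 0.84),
    "GAN": (1.00, 1.00),
    "Diffusion": (0.89, 0.95),
    "Other": (0.95, 1.00),
}

# F1 recomputed from the rounded inputs, to two decimals.
recomputed = {name: round(f1_score(p, r), 2) for name, (p, r) in reported.items()}
```

Because the inputs are already rounded to two decimals, a recomputed value can differ from the reported one in the last digit. The Real row is one such case, where the rounded inputs give 0.89 against the reported 0.90.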

Model comparison (binary classifier):

| Model | Val Acc | Val Loss | Epochs |
|---|---|---|---|
| ResNet18 | 94.5% | 0.151 | 5 |
| EfficientNet-B0 | 91.3% | 0.207 | 17 |

ResNet18 was selected as the production model: it reached higher validation accuracy (94.5% vs 91.3%) and lower validation loss (0.151 vs 0.207) in fewer epochs (5 vs 17).

Cross-dataset validation (V3):
- Seen (train): Stable Diffusion, StyleGAN2, DDPM
- Unseen (test): Glide, Latent Diffusion
- Train Accuracy: ~94% | Unseen Test Accuracy: ~57%

The drop from ~94% train accuracy to ~57% on unseen generators shows a significant generalisation gap: the model learns generator-specific patterns rather than universal AI artifacts. This is consistent with findings in published synthetic image detection research.
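
The seen/unseen protocol can be sketched as a split on the generator name. The generator names below come from the lists above, while the sample format (a dict with a `generator` key) and both helper functions are assumptions for illustration, not the repository's loader code. Real images are not handled here and would need to be divided between both sets separately.

```python
# Generator names are taken from the seen/unseen lists; everything else is hypothetical.
SEEN_GENERATORS = {"Stable Diffusion", "StyleGAN2", "DDPM"}
UNSEEN_GENERATORS = {"Glide", "Latent Diffusion"}


def split_by_generator(samples, unseen=UNSEEN_GENERATORS):
    """Hold out every sample whose generator is in `unseen`."""
    train, test = [], []
    for sample in samples:
        target = test if sample["generator"] in unseen else train
        target.append(sample)
    return train, test


def accuracy(expected, predicted):
    """Fraction of positions where the two label sequences agree."""
    if len(expected) != len(predicted):
        raise ValueError("label sequences must have the same length")
    if not expected:
        return 0.0
    return sum(e == p for e, p in zip(expected, predicted)) / len(expected)


# Illustrative records only; the paths do not refer to real files.
samples = [
    {"path": "a.png", "generator": "Stable Diffusion"},
    {"path": "b.png", "generator": "StyleGAN2"},
    {"path": "c.png", "generator": "Glide"},
    {"path": "d.png", "generator": "Latent Diffusion"},
]
train, test = split_by_generator(samples)
```

Evaluating `accuracy` separately on the training set and on the held-out set is what exposes a gap like the one reported here.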

| Image | Expected | Predicted | Confidence | Correct? |
|---|---|---|---|---|
| DALL-E generated | AI | Real | 72.57% | No |
| Heavily edited photo | Real | AI-Generated | 99.78% | Partial |
| iPhone camera photo | Real | Real | 99.96% | Yes |
| MidJourney generated | AI | Real | 99.33% | No |
| Screenshot | Real | Real | 100% | Yes |
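
Unedited real images (the iPhone photo and the screenshot) were classified correctly, while the heavily edited photo was flagged as AI-generated with 99.78% confidence. AI images from unseen generators (DALL-E, MidJourney) were misclassified as real, which is consistent with the V3 cross-dataset validation findings.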

Failure analysis (see notebooks/failure_analysis.ipynb for full analysis):
- Total test failures: 93 / ~3,750 (~2.5% error rate)
- Mean confidence on failures: 76.6%
- High confidence wrong predictions (>90%): 26
- Most common failure: real images with unusual lighting/texture classified as AI
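
The summary figures above (failure count, error rate, mean confidence on failures, high-confidence failures) can be derived from per-image prediction records. The following is a minimal sketch assuming each record is a dict with `expected`, `predicted`, and `confidence` keys. That record format and the function name are hypothetical and are not taken from the notebook.

```python
from statistics import mean


def summarize_failures(records, high_conf=0.90):
    """Summarise misclassifications in a list of prediction records."""
    failures = [r for r in records if r["predicted"] != r["expected"]]
    return {
        "total": len(records),
        "failures": len(failures),
        "error_rate": len(failures) / len(records) if records else 0.0,
        # Mean confidence is undefined when nothing failed.
        "mean_failure_confidence": (
            mean(r["confidence"] for r in failures) if failures else None
        ),
        # Wrong predictions the model was very sure about.
        "high_conf_failures": sum(r["confidence"] > high_conf for r in failures),
    }


# Illustrative records only, not data from the test set.
example = [
    {"expected": "real", "predicted": "ai", "confidence": 0.95},
    {"expected": "ai", "predicted": "real", "confidence": 0.60},
    {"expected": "real", "predicted": "real", "confidence": 0.99},
    {"expected": "ai", "predicted": "ai", "confidence": 0.88},
]
summary = summarize_failures(example)
```

With the reported counts, the error rate works out to 93 / 3,750 = 2.48%, which matches the ~2.5% quoted.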

Limitations:
- Generalisation drops on unseen AI generators (DALL-E, MidJourney)
- Missing EXIF data does not prove an image is fake
- Model confidence is not definitive proof
- Generator type classifier uses manually curated labels — may not generalise perfectly
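
To illustrate why missing EXIF data is treated as inconclusive rather than as evidence of a fake, here is a minimal sketch using Pillow. The function names and the returned labels are assumptions for illustration and do not describe `src/services/metadata_checker.py`.

```python
from PIL import ExifTags, Image


def read_exif(source) -> dict:
    """Return EXIF tags as a {name: value} dict (empty if there are none).

    `source` may be a file path or a binary file-like object.
    """
    with Image.open(source) as img:
        exif = img.getexif()
        # Map numeric tag ids to readable names where Pillow knows them.
        return {ExifTags.TAGS.get(tag_id, str(tag_id)): value
                for tag_id, value in exif.items()}


def metadata_signal(source) -> str:
    """Hypothetical labelling: metadata is only ever a weak hint."""
    tags = read_exif(source)
    if "Make" in tags or "Model" in tags:
        # Camera tags can be copied or forged, so this is not proof either.
        return "camera-tags-present"
    # Screenshots, exports, and re-encoded photos often carry no EXIF at all.
    return "inconclusive"
```

The design choice in this sketch is that neither outcome is mapped to "real" or "fake": metadata only adds context next to the model's prediction.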

Future work:
- Train on all ArtiFact generators for better generalisation
- Frequency-domain features (FFT) for generator-agnostic detection
- Docker deployment
- Adversarial robustness testing
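
One possible form of the frequency-domain feature mentioned above is a radially averaged log power spectrum. The NumPy sketch below is an assumption about how such a feature could be computed and is not part of the current codebase; the function name `radial_spectrum` and the default of 32 bins are hypothetical.

```python
import numpy as np


def radial_spectrum(gray: np.ndarray, n_bins: int = 32) -> np.ndarray:
    """Radially averaged log power spectrum of a 2D grayscale image."""
    if gray.ndim != 2:
        raise ValueError("expected a 2D grayscale array")

    # Centre the zero-frequency component, then take log power.
    spectrum = np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))
    power = np.log1p(np.abs(spectrum) ** 2)

    # Distance of every frequency bin from the centre of the spectrum.
    h, w = gray.shape
    yy, xx = np.indices((h, w))
    radius = np.hypot(yy - h // 2, xx - w // 2)

    # Assign each pixel to one of n_bins equal-width radial bands.
    bands = np.minimum((radius / radius.max() * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(bands.ravel(), weights=power.ravel(), minlength=n_bins)
    counts = np.bincount(bands.ravel(), minlength=n_bins)

    # Mean log power per band, from low to high frequency.
    return sums / np.maximum(counts, 1)
```

The resulting fixed-length vector could be fed to a small classifier or concatenated with CNN features. Whether it actually improves generalisation to unseen generators would need to be tested.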

