This project provides an object detection system built on transformer models, including DETR (DEtection TRansformer) and YOLOS (You Only Look at One Sequence). The system supports object detection and panoptic segmentation from uploaded images or image URLs. It features a Gradio web interface for interactive use and a FastAPI endpoint for programmatic access.
Try the online demo on Hugging Face Spaces: Object Detection Demo.
The application supports the following models, each tailored for specific detection or segmentation tasks:
- DETR (DEtection TRansformer):
  - facebook/detr-resnet-50: Fast and accurate object detection with a ResNet-50 backbone.
  - facebook/detr-resnet-101: Higher-accuracy object detection with a ResNet-101 backbone; slower than ResNet-50.
  - facebook/detr-resnet-50-panoptic: Panoptic segmentation with ResNet-50 (note: may have stability issues).
  - facebook/detr-resnet-101-panoptic: Panoptic segmentation with ResNet-101 (note: may have stability issues).
- YOLOS (You Only Look at One Sequence):
  - hustvl/yolos-tiny: Lightweight and fast, ideal for resource-constrained environments.
  - hustvl/yolos-base: Balances speed and accuracy for object detection.
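As a rough illustration of how the model list maps onto the two supported tasks, the sketch below keeps the six model IDs in a lookup table. The model IDs come from the list above; the `MODEL_TASKS` table, the task labels, and the `task_for_model` helper are hypothetical names chosen for this example and are not taken from app.py.

```python
# Hypothetical registry: model IDs are from the list above, while the task
# labels and the helper name are illustrative assumptions.
MODEL_TASKS = {
    "facebook/detr-resnet-50": "object-detection",
    "facebook/detr-resnet-101": "object-detection",
    "facebook/detr-resnet-50-panoptic": "panoptic-segmentation",
    "facebook/detr-resnet-101-panoptic": "panoptic-segmentation",
    "hustvl/yolos-tiny": "object-detection",
    "hustvl/yolos-base": "object-detection",
}


def task_for_model(model_name: str) -> str:
    """Return the task for a supported model ID, or raise ValueError."""
    try:
        return MODEL_TASKS[model_name]
    except KeyError:
        supported = ", ".join(sorted(MODEL_TASKS))
        raise ValueError(
            f"Unsupported model {model_name!r}; choose one of: {supported}"
        ) from None
```

A lookup like this makes it easy to reject an unknown model name early, before any weights are downloaded.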
- Image Upload: Upload images via the Gradio interface for object detection.
- URL Input: Provide image URLs for detection through the Gradio interface or API.
- Model Selection: Choose between DETR and YOLOS models for detection or panoptic segmentation.
- Object Detection: Highlights detected objects with bounding boxes and confidence scores.
- Panoptic Segmentation: Supports scene segmentation with colored masks (DETR panoptic models).
- Image Properties: Displays metadata like format, size, aspect ratio, file size, and color statistics.
- API Access: Programmatically process images via the FastAPI /detect endpoint.
- Flexible Deployment: Run locally, in Docker, or in cloud environments like Google Colab.
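The detection feature reports a confidence score for every box, and a threshold decides which detections are shown. The sketch below illustrates that filtering step under two assumptions: detections are represented as (label, score) pairs, and a score equal to the threshold is kept. The `filter_detections` name and the pair representation are hypothetical; the 0.5 default mirrors the documented default of --confidence-threshold.

```python
from typing import List, Tuple

# Assumed representation for this sketch: (label, confidence score).
Detection = Tuple[str, float]


def filter_detections(
    detections: List[Detection], threshold: float = 0.5
) -> List[Detection]:
    """Keep detections whose score is at least `threshold`."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    # Assumption: scores equal to the threshold are kept (>=, not >).
    return [(label, score) for label, score in detections if score >= threshold]
```

Raising the threshold trades recall for precision: fewer boxes are drawn, but the remaining ones are those the model is most sure about.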
Follow these steps to set up the application locally:
- Python 3.8 or higher
- pip for installing dependencies
- Git for cloning the repository
git clone https://github.com/NeerajCodz/ObjectDetection
cd ObjectDetection
Install required packages from requirements.txt:
pip install -r requirements.txt
Launch the Gradio interface:
python app.py
To enable the FastAPI server:
python app.py --enable-fastapi
- Gradio: Open the URL displayed in the console (typically http://127.0.0.1:7860).
- FastAPI: Navigate to http://localhost:8000 for the API or Swagger UI (if enabled).
Use Docker for a containerized setup.
- Docker installed on your machine. Download from Docker's official site.
Pull the pre-built image from Docker Hub:
docker pull neerajcodz/objectdetection:latest
Run the application on port 8080:
docker run -d -p 8080:80 neerajcodz/objectdetection:latest
Access the interface at http://localhost:8080.
To build the Docker image locally:
- Ensure you have a Dockerfile in the repository root (example provided in the repository).
- Build the image:
  docker build -t objectdetection:local .
- Run the container:
  docker run -d -p 8080:80 objectdetection:local
Access the interface at http://localhost:8080.
The app.py script supports the following command-line arguments:
- --gradio-port <port>: Specify the port for the Gradio UI (default: 7860).
  - Example: python app.py --gradio-port 7870
- --enable-fastapi: Enable the FastAPI server (disabled by default).
  - Example: python app.py --enable-fastapi
- --fastapi-port <port>: Specify the port for the FastAPI server (default: 8000).
  - Example: python app.py --enable-fastapi --fastapi-port 8001
- --confidence-threshold <float-value>: Confidence threshold for detection (range: 0 to 1, default: 0.5).
  - Example: python app.py --confidence-threshold 0.75
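For readers who want to see how such flags are typically declared, here is a minimal argparse sketch that reproduces the documented options and defaults. It is an illustration only, not the actual parser in app.py; the `build_parser` and `unit_interval` names are hypothetical, and rejecting out-of-range thresholds at parse time is an assumption about how the 0 to 1 range might be enforced.

```python
import argparse


def unit_interval(text: str) -> float:
    """Parse a float and require it to lie in the range 0 to 1."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError("must be between 0 and 1")
    return value


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the documented flags and defaults (sketch)."""
    parser = argparse.ArgumentParser(description="Object detection app (sketch)")
    parser.add_argument("--gradio-port", type=int, default=7860,
                        help="port for the Gradio UI")
    parser.add_argument("--enable-fastapi", action="store_true",
                        help="enable the FastAPI server (off by default)")
    parser.add_argument("--fastapi-port", type=int, default=8000,
                        help="port for the FastAPI server")
    parser.add_argument("--confidence-threshold", type=unit_interval, default=0.5,
                        help="confidence threshold for detection, 0 to 1")
    return parser
```

Calling `build_parser().parse_args(["--enable-fastapi", "--fastapi-port", "8001"])` yields a namespace with `enable_fastapi=True` and `fastapi_port=8001`, while the other options keep their defaults.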
You can combine arguments:
python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001 --confidence-threshold 0.75
Alternatively, set the GRADIO_SERVER_PORT environment variable:
export GRADIO_SERVER_PORT=7870
python app.py
Note: The FastAPI API is currently unstable and may require additional configuration for production use.
The /detect endpoint allows programmatic image processing.
Enable FastAPI when launching the script:
python app.py --enable-fastapi
Or run FastAPI separately with Uvicorn:
uvicorn objectdetection:app --host 0.0.0.0 --port 8000
Access the Swagger UI at http://localhost:8000/docs for interactive testing.
- Endpoint: POST /detect
- Parameters:
  - file: (optional) Image file (must be image/* type).
  - image_url: (optional) URL of the image.
  - model_name: (optional) Model name (e.g., facebook/detr-resnet-50, hustvl/yolos-tiny).
- Content-Type: multipart/form-data for file uploads, application/json for URL inputs.
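The URL-input form of the request can also be built from Python using only the standard library. The sketch below constructs a JSON POST for /detect following the parameters listed above; the `build_detect_request` helper and its `base_url` default are hypothetical names for this example, and the server is assumed to be running locally on port 8000.

```python
import json
import urllib.request
from typing import Optional


def build_detect_request(
    image_url: str,
    model_name: Optional[str] = None,
    base_url: str = "http://localhost:8000",  # assumed local server address
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /detect endpoint."""
    payload = {"image_url": image_url}
    if model_name is not None:
        payload["model_name"] = model_name
    return urllib.request.Request(
        url=f"{base_url}/detect",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To send the request once the server is running:
# with urllib.request.urlopen(build_detect_request("https://example.com/image.jpg")) as resp:
#     result = json.load(resp)
```

Separating request construction from sending keeps the payload easy to inspect and test without a live server.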
Using an image URL:
curl -X POST "http://localhost:8000/detect" \
  -H "Content-Type: application/json" \
  -d '{"image_url": "https://example.com/image.jpg", "model_name": "facebook/detr-resnet-50"}'
Using a file upload:
curl -X POST "http://localhost:8000/detect" \
  -F "file=@/path/to/image.jpg" \
  -F "model_name=facebook/detr-resnet-50"
The response includes a base64-encoded image with detections and detection details:
{
"image_url": "data:image/png;base64,...",
"detected_objects": ["person", "car"],
"confidence_scores": [0.95, 0.87],
"unique_objects": ["person", "car"],
"unique_confidence_scores": [0.95, 0.87]
}
- Ensure only one of file or image_url is provided.
- The API may experience instability with panoptic models; use object detection models for reliability.
- Test the API using the Swagger UI for easier debugging.
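A client will usually want to turn the response into something usable: the annotated image as raw bytes and the detections as label and score pairs. The sketch below does both for a response shaped like the example above. It assumes that detected_objects and confidence_scores are parallel lists (the example suggests this but does not state it), and the helper names `decode_data_url` and `pair_detections` are hypothetical.

```python
import base64
from typing import Any, Dict, List, Tuple


def decode_data_url(data_url: str) -> bytes:
    """Decode a base64 data URL ('data:image/png;base64,...') into raw bytes."""
    header, sep, encoded = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    return base64.b64decode(encoded)


def pair_detections(response: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Pair each detected label with its score (assumes parallel lists)."""
    labels = response["detected_objects"]
    scores = response["confidence_scores"]
    if len(labels) != len(scores):
        raise ValueError("labels and scores have different lengths")
    return list(zip(labels, scores))
```

The decoded bytes can be written straight to a file opened in binary mode, for example `open("result.png", "wb")`, to save the annotated image.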
To contribute or modify the application:
- Clone the repository:
  git clone https://github.com/NeerajCodz/ObjectDetection
  cd ObjectDetection
- Install dependencies:
  pip install -r requirements.txt
- Run the application:
  python app.py
  Or run FastAPI:
  uvicorn objectdetection:app --host 0.0.0.0 --port 8000
- Access at http://localhost:7860 (Gradio) or http://localhost:8000 (FastAPI).
Contributions are welcome! To contribute:
- Fork the repository.
- Create a feature or bugfix branch (git checkout -b feature/your-feature).
- Commit changes (git commit -m "Add your feature").
- Push to the branch (git push origin feature/your-feature).
- Open a pull request on the GitHub repository.
Please include tests and documentation for new features. Report issues via GitHub Issues.
- Port Conflicts: If port 7860 is in use, specify a different port with --gradio-port or set GRADIO_SERVER_PORT.
  - Example: python app.py --gradio-port 7870
- Colab Asyncio Error: The application uses nest_asyncio to avoid RuntimeError: asyncio.run() cannot be called from a running event loop in Colab. If you still encounter this error, ensure nest_asyncio is installed (pip install nest_asyncio).
- Panoptic Model Bugs: Avoid detr-resnet-*-panoptic models until stability issues are resolved.
- API Instability: Test with smaller images and object detection models first.
- FastAPI Not Starting: Ensure --enable-fastapi is used, and check that the specified --fastapi-port (default: 8000) is available.
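When diagnosing port conflicts, it can help to check whether a port is free before launching. The following is a minimal standard-library sketch that tries to bind a TCP socket to the port; the `is_port_free` helper is a hypothetical name, and it checks only the given host (127.0.0.1 by default), so a service bound to a different interface may not be detected.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding failed, most likely because the port is already in use.
            return False
        return True
```

For example, `is_port_free(7860)` and `is_port_free(8000)` cover the default Gradio and FastAPI ports. The result is only a snapshot, since another process can take the port between the check and the launch.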
For further assistance, open an issue on the GitHub repository.