ObjectSpace

🎯 A production-quality object detection & tracking pipeline for workspace monitoring, demonstrating real-world ML engineering with self-supervised evaluation metrics.

Demo: see assets/demo.gif

Python 3.9+ | License: MIT | CI: GitHub Actions


🚀 Why This Project?

Smart workspace monitoring enables:

  • Productivity analytics: track object interactions over time
  • Ergonomics research: monitor desk setup and posture indicators
  • Automated inventory: detect and track items on workspaces

This project demonstrates end-to-end ML pipeline engineering: from raw video to tracked objects with quality metrics, all without requiring ground truth annotations.


✨ Key Features

  • Object Detection: pre-trained Mask R-CNN with configurable confidence thresholds
  • Multi-Object Tracking: SORT algorithm with 8D Kalman filtering
  • Self-Supervised Evaluation: quality metrics without ground truth
  • Modular Architecture: clean separation of detection, tracking, I/O, and evaluation
  • Multiple Outputs: COCO JSON annotations, visualization frames, evaluation reports
  • CLI + Python API: flexible usage for scripts or integration

📊 Evaluation Results

The built-in evaluation framework measures tracking quality without ground truth:

Video             Overall  Continuity  Stability  Tracks  ID Switches
video1 (complex)     36.8        66.5       25.4      23            6
video2 (medium)      44.4        67.8       43.3      11            3
video4 (simple)      78.4        95.9      100.0       8            0
Average              53.2        76.7       56.3       -            -

Key Findings

  • ✅ 100% stability on simple scenes (≤8 concurrent tracks)
  • ⚠️ Stability degrades with scene complexity (IoU-based matching limitation)
  • 🔧 Identified bottleneck: ID association in crowded scenes → Deep SORT recommended

πŸ—οΈ Architecture

objectSpace/
├── src/objectSpace/
│   ├── detection/          # Mask R-CNN object detection
│   │   ├── base.py         # Abstract detector interface
│   │   └── mask_rcnn.py    # Mask R-CNN implementation
│   ├── tracking/           # SORT with Kalman filtering
│   │   ├── kalman.py       # Kalman filter implementation
│   │   ├── association.py  # IoU & Hungarian matching
│   │   └── sort_tracker.py # SORT algorithm
│   ├── evaluation/         # Self-supervised quality metrics
│   │   ├── metrics.py      # Metric dataclasses
│   │   ├── analyzer.py     # TrackingAnalyzer
│   │   ├── reporter.py     # Report generation
│   │   └── integration.py  # Pipeline integration
│   ├── io/                 # Video I/O and COCO export
│   │   ├── video.py        # Video reading
│   │   └── export.py       # COCO JSON export
│   ├── pipeline.py         # Main orchestration
│   └── config.py           # Typed configuration
├── tests/                  # Unit & integration tests
│   ├── evaluation/         # Evaluation module tests
│   ├── test_detection.py
│   └── test_tracking.py
├── examples/               # Demo notebooks
│   └── demo.ipynb          # Interactive demo
├── configs/                # YAML configurations
│   ├── default.yaml
│   └── tuned.yaml
└── assets/                 # Demo media
    └── demo.gif
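
For reference, the association step in tracking/association.py combines per-pair IoU with Hungarian (optimal) assignment. The sketch below illustrates that standard SORT recipe rather than the module's actual code; iou and match are names invented for this example.

import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    # Intersection-over-union of two [x1, y1, x2, y2] boxes.
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def match(detections, tracks, iou_threshold=0.3):
    # Build a cost matrix of negative IoUs, solve the assignment problem,
    # then keep only pairs whose IoU clears the threshold.
    cost = np.zeros((len(detections), len(tracks)))
    for d, det in enumerate(detections):
        for t, trk in enumerate(tracks):
            cost[d, t] = -iou(det, trk)
    det_idx, trk_idx = linear_sum_assignment(cost)
    return [(d, t) for d, t in zip(det_idx, trk_idx) if -cost[d, t] >= iou_threshold]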

🚀 Quick Start

Installation

git clone https://github.com/SAMithila/objectSpace.git
cd objectSpace
python -m venv venv
source venv/bin/activate
pip install -e ".[dev]"

Process a Video

from objectSpace import DetectionTrackingPipeline

pipeline = DetectionTrackingPipeline()
results = pipeline.process_video("video.mp4", output_dir="output/")

Process with Evaluation

# Get tracking results + quality metrics
results, evaluation = pipeline.process_video_with_evaluation("video.mp4")

print(f"Overall Score: {evaluation.overall_score:.1f}/100")
print(f"ID Switches: {evaluation.id_switches.total_switches}")

CLI Usage

# Process single video
python process_one_video.py task3.1_video1

# Evaluate existing results
python run_evaluation.py

# Compare all videos
python compare_videos.py

📈 Evaluation Framework

The evaluation module computes tracking quality without ground truth annotations:

Metrics

  • Continuity Score: track completeness (gaps, fragmentation)
  • Stability Score: ID consistency (fewer switches = better)
  • Speed Score: processing FPS vs. target
  • Overall Score: weighted combination of the three scores above
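
The published numbers are consistent with an overall weighting of roughly 0.4 × continuity + 0.4 × stability + 0.2 × speed (e.g. video1: 0.4 × 66.5 + 0.4 × 25.4 + 0.2 × 0.0 ≈ 36.8), but treat this as a back-of-the-envelope reading of the results table rather than the package's definition; the authoritative formula lives in evaluation/metrics.py.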

Usage

from objectSpace.pipeline import evaluate_annotations

# Evaluate existing tracking results
result = evaluate_annotations("output/video_annotations.json")

print(f"Fragmented tracks: {result.fragmentation.fragmented_tracks}")
print(f"ID switches: {result.id_switches.total_switches}")
print(f"Avg coverage: {result.fragmentation.avg_coverage_ratio:.1%}")

Compare Videos

python compare_videos.py

Output:

EVALUATION COMPARISON
================================================================================
Video                      Overall    Cont.    Stab.    Speed  Tracks
--------------------------------------------------------------------------------
task3.1_video1                36.8     66.5     25.4      0.0      23
task3.1_video2                44.4     67.8     43.3      0.0      11
task3.1_video4                78.4     95.9    100.0      0.0       8
--------------------------------------------------------------------------------
AVERAGE                       53.2     76.7     56.3      0.0      42

βš™οΈ Configuration

Default settings in configs/default.yaml:

Parameter                    Default  Description
detector.device              auto     CPU/CUDA selection
detector.default_confidence  0.3      Detection threshold
tracker.max_age              8        Frames to keep lost tracks
tracker.iou_threshold        0.3      Minimum IoU for matching

Tuned Configuration

Based on evaluation results, configs/tuned.yaml improves performance:

tracker:
  max_age: 15          # Handles longer occlusions
  iou_threshold: 0.2   # Fewer false ID switches
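
To inspect or tweak these values outside the pipeline's own loader (config.py), a plain PyYAML read works. The snippet below assumes the detector/tracker layout implied by the table above and by tuned.yaml:

import yaml  # pip install pyyaml

with open("configs/default.yaml") as f:
    config = yaml.safe_load(f)

# Apply the tuned tracker overrides on top of the defaults.
config["tracker"]["max_age"] = 15
config["tracker"]["iou_threshold"] = 0.2

print(config["detector"]["default_confidence"])  # 0.3 by default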

🧪 Development

# Run tests
pytest tests/ -v

# Run specific test module
pytest tests/evaluation/ -v

# Run with coverage
pytest tests/ --cov=objectSpace --cov-report=term-missing

πŸ“ Output Format

COCO JSON with Tracking

{
  "annotations": [
    {
      "id": 1,
      "image_id": 0,
      "category_id": 1,
      "bbox": [100, 100, 50, 80],
      "track_id": 0
    }
  ]
}
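
Because track_id is stored on every annotation, per-object trajectories can be rebuilt from the exported file with a few lines of standard-library Python (the path below matches the one used in the Usage example above):

import json
from collections import defaultdict

with open("output/video_annotations.json") as f:
    coco = json.load(f)

# Group bounding boxes by track_id, keyed by frame (image_id).
tracks = defaultdict(list)
for ann in coco["annotations"]:
    tracks[ann["track_id"]].append((ann["image_id"], ann["bbox"]))

for track_id, boxes in sorted(tracks.items()):
    print(f"track {track_id}: {len(boxes)} frames")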

Evaluation Reports

  • *_evaluation.json - machine-readable metrics
  • *_evaluation.md - human-readable report
  • EVALUATION_SUMMARY.md - cross-video comparison

📚 Technical Highlights

This project demonstrates:

  1. Modular Design: separate concerns for detection, tracking, evaluation
  2. Type Safety: full type hints with dataclasses
  3. Configuration Management: YAML configs with typed validation
  4. Self-Supervised ML: quality metrics without labeled data
  5. Production Patterns: logging, error handling, CLI interface
  6. CI/CD: GitHub Actions for automated testing

πŸ› οΈ Extending

Add New Detector

from objectSpace.detection import BaseDetector

class YOLODetector(BaseDetector):
    def detect(self, frame):
        # Run your model on the frame and return detections in the format defined by BaseDetector
        pass

Add Custom Metrics

from objectSpace.evaluation import TrackingAnalyzer

class CustomAnalyzer(TrackingAnalyzer):
    def compute_custom_metric(self, annotations):
        # Compute and return your metric from the tracking annotations
        pass

📄 License

MIT License - see LICENSE for details.

πŸ™ Acknowledgments