An end-to-end machine learning pipeline for detecting network security threats and phishing attempts with real-time prediction capabilities.
- Real-time Threat Detection: Machine learning models for network security threat prediction
- High-Performance API: FastAPI-based REST API for serving predictions
- Data Management: MongoDB integration for efficient data storage and retrieval
- Experiment Tracking: MLflow integration for model lifecycle management
- Version Control: DagsHub for collaborative ML workflows
- Robust Architecture: Comprehensive logging and exception handling
- Production Ready: Scalable deployment configuration
- Python 3.8+
- MongoDB database
- Git
- DagsHub account (for version control)
- Clone the repository
    git clone https://github.com/Suraj-G-Rao/Network_Security.git
    cd Network_Security
- Create and activate virtual environment
    python -m venv venv
    # Windows
    venv\Scripts\activate
    # Linux/Mac
    source venv/bin/activate
- Install dependencies
    pip install -r requirements.txt
- Set up environment variables
    # Create a .env file in the root directory
    # Add your MongoDB connection string
    MONGODB_URL_KEY=your_mongodb_connection_string
- Install the package
    pip install -e .
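As a minimal sketch of how application code might read the connection string configured above: the helper name `get_mongodb_url` is hypothetical and not taken from this repository, and the sketch assumes the `.env` values have already been loaded into the process environment (for example by a loader such as python-dotenv).

```python
import os


def get_mongodb_url(env=None):
    """Return the MongoDB connection string or raise a clear error."""
    # `env` defaults to the process environment; passing a dict eases testing.
    env = os.environ if env is None else env
    url = env.get("MONGODB_URL_KEY", "").strip()
    if not url:
        raise RuntimeError(
            "MONGODB_URL_KEY is not set; add it to .env or export it in the shell"
        )
    # MongoDB connection strings use one of these two URI schemes.
    if not url.startswith(("mongodb://", "mongodb+srv://")):
        raise ValueError("MONGODB_URL_KEY does not look like a MongoDB URI")
    return url
```

Failing early with a named variable tends to be easier to debug than a connection timeout later in the pipeline.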
Network_Security/
├── src/ # Source code
│ ├── components/ # ML pipeline components
│ ├── constant/ # Configuration constants
│ ├── entity/ # Data entities and configs
│ ├── exception/ # Custom exceptions
│ ├── logging/ # Logging utilities
│ ├── pipeline/ # ML pipelines
│ └── utils/ # Utility functions
├── app.py # FastAPI application
├── main.py # Training pipeline entry point
├── requirements.txt # Python dependencies
├── setup.py # Package configuration
├── Network_data/ # Training data
├── final_model/ # Trained models
└── templates/ # HTML templates
Run the complete training pipeline:
    python main.py
This will execute:
- Data Ingestion: Load and prepare training data
- Data Validation: Validate data quality and schema
- Data Transformation: Preprocess the data and engineer features
- Model Training: Train the network security model
- Model Evaluation: Evaluate model performance
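The five stages above run in sequence, each consuming the previous stage's output. The following is only an illustrative sketch of that chaining: `run_pipeline` and the toy stage functions are hypothetical stand-ins, not the actual components in `src/components/`.

```python
from typing import Any, Callable, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(stages: List[Stage], initial: Any = None) -> Any:
    """Run stages in order, feeding each stage's output to the next."""
    artifact = initial
    for name, stage in stages:
        try:
            artifact = stage(artifact)
        except Exception as exc:
            # Name the failing stage so logs point at the right step.
            raise RuntimeError(f"stage '{name}' failed: {exc}") from exc
    return artifact


# Hypothetical toy stages that mirror the five steps listed above.
def ingest(_):
    return [{"f1": 1.0, "label": 1}, {"f1": -1.0, "label": 0}, {"f1": None, "label": 1}]


def validate(rows):
    # Drop rows with missing feature values.
    return [r for r in rows if r["f1"] is not None]


def transform(rows):
    return {"X": [[r["f1"]] for r in rows], "y": [r["label"] for r in rows]}


def train(data):
    # Toy "model": predict 1 when the single feature is positive.
    return {**data, "model": lambda x: int(x[0] > 0)}


def evaluate(bundle):
    preds = [bundle["model"](x) for x in bundle["X"]]
    correct = sum(p == t for p, t in zip(preds, bundle["y"]))
    return {"accuracy": correct / len(preds)}


result = run_pipeline([
    ("data_ingestion", ingest),
    ("data_validation", validate),
    ("data_transformation", transform),
    ("model_training", train),
    ("model_evaluation", evaluate),
])
```

Wrapping each stage in a named error makes it clear which step failed, in keeping with the logging and exception handling the project describes.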
Start the FastAPI server:
    python app.py
Or using uvicorn directly:
    uvicorn app:app --host 0.0.0.0 --port 8000 --reload
The API will be available at http://localhost:8000
- GET /: Root endpoint - redirects to prediction interface
- GET /docs: Interactive API documentation (Swagger UI)
- POST /predict: Make predictions on network data
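For scripted clients, the POST /predict endpoint could be called from Python using only the standard library. This is a rough sketch: the helper name `build_predict_request` and the feature values are placeholders, and the `{"features": [...]}` payload shape is assumed from this README rather than verified against `app.py`.

```python
import json
import urllib.request


def build_predict_request(features, base_url="http://localhost:8000"):
    """Build (but do not send) a JSON POST request for /predict."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request([0, 1, -1])  # placeholder feature values

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect without a live server.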
Send a POST request to /predict with network data:
    curl -X POST "http://localhost:8000/predict" \
      -H "Content-Type: application/json" \
      -d '{"features": [your_network_features_here]}'
- Set up a MongoDB cluster (MongoDB Atlas recommended)
- Create a database and collection for network data
- Add the connection string to your .env file
- Create a DagsHub repository
- Configure MLflow tracking:
    import mlflow
    mlflow.set_tracking_uri("your_dagshub_mlflow_uri")
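An alternative to hard-coding the URI is to supply it through the environment variables that MLflow reads (`MLFLOW_TRACKING_URI`, `MLFLOW_TRACKING_USERNAME`, `MLFLOW_TRACKING_PASSWORD`). The sketch below is an assumption-laden illustration: the helper `configure_mlflow_env` is hypothetical, and the `https://dagshub.com/<owner>/<repo>.mlflow` pattern should be checked against the URI shown on your own DagsHub repository page.

```python
import os


def configure_mlflow_env(owner, repo, token, env=None):
    """Set the environment variables MLflow uses for remote tracking."""
    # `env` defaults to the process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    # Assumed DagsHub URI pattern; confirm it on your repository page.
    uri = f"https://dagshub.com/{owner}/{repo}.mlflow"
    env["MLFLOW_TRACKING_URI"] = uri
    env["MLFLOW_TRACKING_USERNAME"] = owner
    env["MLFLOW_TRACKING_PASSWORD"] = token  # keep tokens out of source control
    return uri
```

Call this before importing or using `mlflow` so the client picks the values up; the token itself would typically come from the `.env` file rather than from code.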
The system includes comprehensive model evaluation with:
- Accuracy metrics
- Precision and recall scores
- Confusion matrix visualization
- Feature importance analysis
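To make the first three metrics concrete, here is a dependency-free sketch of how accuracy, precision, recall, and a confusion matrix relate for a binary classifier. The project itself presumably relies on scikit-learn for this; `binary_metrics` is a hypothetical helper written only for illustration.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and a 2x2 confusion matrix."""
    counts = Counter()
    for t, p in zip(y_true, y_pred):
        if p == positive:
            counts["tp" if t == positive else "fp"] += 1
        else:
            counts["fn" if t == positive else "tn"] += 1
    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Rows are true classes, columns are predicted classes.
        "confusion_matrix": [[tn, fp], [fn, tp]],
    }


metrics = binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
```

For threat detection, recall is often the metric to watch, since a missed threat (false negative) usually costs more than a false alarm.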
- Input Validation: Comprehensive data validation pipeline
- Error Handling: Robust exception handling throughout
- Logging: Detailed logging for monitoring and debugging
- CORS Configuration: Secure cross-origin resource sharing
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Suraj G Rao
- Email: surajgrao0203@gmail.com
- GitHub: Suraj-G-Rao
- MLflow for experiment tracking
- FastAPI for high-performance API framework
- MongoDB for data storage
- Scikit-learn for machine learning algorithms
- DagsHub for version control and collaboration
For any queries or support, please reach out via:
- Email: surajgrao0203@gmail.com
- GitHub Issues: Create an issue