🩺 Breast Cancer Classification using Machine Learning

A machine learning classification project focused on predicting whether a breast tumor is malignant or benign using the Breast Cancer Wisconsin dataset from scikit-learn.

📌 Project Objective

The objective of this project is to build, evaluate, and compare multiple machine learning classification models while gaining a working understanding of:

Classification techniques
Evaluation metrics
ROC Curve & AUC analysis
Handling imbalanced data
Model comparison
Feature importance interpretation

📂 Dataset Information

Dataset: Breast Cancer Wisconsin Dataset
Source: scikit-learn built-in dataset
Total Samples: 569
Total Features: 30
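
The figures above can be checked directly against the built-in loader. This is a minimal sketch, assuming pandas is installed so that `as_frame=True` works; the variable names are illustrative and not taken from the project notebook.

```python
from sklearn.datasets import load_breast_cancer

# Load the built-in dataset as pandas objects (as_frame=True requires pandas).
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

print(X.shape)                      # (569, 30): samples x features
missing = int(X.isna().sum().sum())
print(missing)                      # 0: the dataset has no missing values
```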

🎯 Target Classes

Value	Meaning
0	Malignant
1	Benign
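
The encoding and the class balance can be confirmed from the loader's metadata. The sketch below is illustrative only. Note that because malignant is encoded as 0, scikit-learn metrics that default to `pos_label=1` treat benign as the positive class.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()

# Index in target_names corresponds to the integer label.
print(list(data.target_names))       # ['malignant', 'benign']

# Count how many samples fall into each class.
counts = np.bincount(data.target)
print(dict(zip(data.target_names, counts.tolist())))  # {'malignant': 212, 'benign': 357}
```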

⚙️ Technologies Used

Python
Pandas
NumPy
Matplotlib
Seaborn
Scikit-learn

🤖 Machine Learning Models

The following classification models were implemented:

Model	Purpose
Logistic Regression	Baseline classification
Decision Tree	Rule-based classification
Random Forest	Ensemble learning

📊 Evaluation Metrics

The models were evaluated using:

Accuracy
Precision
Recall
F1-score
Confusion Matrix
ROC Curve
AUC Score
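
As a rough illustration of how these metrics relate, the sketch below computes all of them on a small made-up set of labels and probabilities (not project data). The `summarize` helper is hypothetical, and the extra `malignant_recall` entry assumes the 0 = malignant encoding used by this dataset.

```python
from sklearn.metrics import (
    accuracy_score, confusion_matrix, f1_score,
    precision_score, recall_score, roc_auc_score,
)

def summarize(y_true, y_pred, y_prob):
    """Collect the listed metrics; y_prob is the predicted P(class 1)."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),   # positive class = 1
        "recall": recall_score(y_true, y_pred),         # positive class = 1
        "f1": f1_score(y_true, y_pred),
        "malignant_recall": recall_score(y_true, y_pred, pos_label=0),
        "auc": roc_auc_score(y_true, y_prob),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }

# Toy values for illustration only.
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_prob = [0.1, 0.4, 0.6, 0.3, 0.7, 0.8, 0.9, 0.95]
y_pred = [int(p >= 0.5) for p in y_prob]  # 0.5 decision threshold

report = summarize(y_true, y_pred, y_prob)
print(report["confusion_matrix"])  # rows = true class, columns = predicted class
print(report["accuracy"])          # 0.75
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, while AUC is computed from the probabilities and is threshold-independent.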

📈 Project Workflow

Data Loading
Exploratory Data Analysis (EDA)
Missing Value Analysis
Feature Correlation Heatmap
Train-Test Split
Feature Scaling
Logistic Regression Modeling
ROC-AUC Evaluation
Handling Imbalanced Data
Decision Tree Classification
Random Forest Classification
Model Comparison
Feature Importance Analysis
Final Insights & Conclusion
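
The core of this workflow (split, scale, fit, evaluate) might look like the sketch below. The 80/20 split, `random_state=42`, and the use of `class_weight="balanced"` for the imbalance step are assumptions for illustration; the notebook may use different settings or a different rebalancing technique.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Stratify so both splits keep the original class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training split only to avoid leaking test statistics.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# class_weight="balanced" reweights classes inversely to their frequency.
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X_train_s, y_train)

# AUC uses P(class 1); malignant recall uses pos_label=0 (0 = malignant).
auc = roc_auc_score(y_test, clf.predict_proba(X_test_s)[:, 1])
malignant_recall = recall_score(y_test, clf.predict(X_test_s), pos_label=0)
print(f"ROC-AUC: {auc:.3f}, malignant recall: {malignant_recall:.3f}")
```

Exact scores depend on the split and hyperparameters, so they will not necessarily match the figures reported in this README.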

🏆 Results Summary

Model	Accuracy
Logistic Regression	98.24%
Decision Tree	91.22%
Random Forest	95.61%

✅ Best Performing Model

Logistic Regression achieved the highest accuracy (98.24%) and offered the strongest overall balance of:

Accuracy
Stability
Interpretability
ROC-AUC performance
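
One possible way to check accuracy and stability together is k-fold cross-validation, sketched below. This is an assumption about how stability could be measured, not a description of the notebook's method; the 5 folds and the hyperparameters are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

models = {
    # Scaling lives inside the pipeline so it is refit on each training fold.
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    # Mean = typical accuracy; standard deviation = fold-to-fold variability.
    cv_scores[name] = (scores.mean(), scores.std())
    print(f"{name}: {scores.mean():.4f} +/- {scores.std():.4f}")
```

A lower standard deviation across folds suggests a model whose performance depends less on which samples land in the test split.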

🔍 Key Insights

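Recall on the malignant class is critical in medical diagnosis, because a false negative means a cancer case goes undetected.
ROC-AUC complements accuracy: it measures how well a model separates malignant from benign cases across all decision thresholds, not just at a single cutoff.
Random Forest improved on the single Decision Tree (95.61% vs. 91.22% accuracy) through ensemble learning, but did not surpass Logistic Regression (98.24%).
Feature importance analysis identified the most influential medical indicators.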
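
Feature importances of the kind mentioned above can be read from a fitted Random Forest. The sketch below is illustrative: it fits on the full dataset with assumed hyperparameters, so the ranking may differ from the notebook's.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer(as_frame=True)

forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(data.data, data.target)

# Impurity-based importances: one value per feature, summing to 1.
importances = pd.Series(forest.feature_importances_, index=data.data.columns)
top10 = importances.sort_values(ascending=False).head(10)
print(top10)
```

Impurity-based importances are split across correlated features, so `sklearn.inspection.permutation_importance` on a held-out set is a common cross-check.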

📁 Repository Structure

AI_ML_Task4_Classification_Project/
│
├── AI_ML_Task4_Classification.ipynb
├── AI_ML_Task4_Classification.pdf
└── README.md

🚀 Conclusion

This project demonstrates how machine learning classification techniques can be applied to real-world medical diagnosis problems using proper evaluation metrics and model comparison techniques.

👨‍💻 Author

Sahil Bhatti
