🩺 Breast Cancer Classification using Machine Learning

A machine learning classification project that predicts whether a breast tumor is malignant or benign, using the Breast Cancer Wisconsin (Diagnostic) dataset bundled with scikit-learn.

📌 Project Objective

The objective of this project is to build and evaluate multiple machine learning classification models while understanding:

Classification techniques
Evaluation metrics
ROC Curve & AUC analysis
Handling imbalanced data
Model comparison
Feature importance interpretation

📂 Dataset Information

Dataset: Breast Cancer Wisconsin Dataset
Source: scikit-learn built-in dataset
Total Samples: 569
Total Features: 30

🎯 Target Classes

Value	Meaning
0	Malignant
1	Benign
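The figures and label encoding above can be checked directly against the bundled dataset. The snippet below is a minimal sketch using scikit-learn's `load_breast_cancer`; the variable names are illustrative and not taken from the project notebook.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

# Load the dataset that ships with scikit-learn
data = load_breast_cancer()
X, y = data.data, data.target

print(X.shape)                      # (569, 30)
print(data.target_names.tolist())   # ['malignant', 'benign']

# Samples per class, indexed by label: 0 = malignant, 1 = benign
class_counts = np.bincount(y)
print(class_counts.tolist())        # [212, 357]
```

The classes are moderately imbalanced (212 malignant vs 357 benign), which is why the workflow includes an imbalance-handling step.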

⚙️ Technologies Used

Python
Pandas
NumPy
Matplotlib
Seaborn
Scikit-learn

🤖 Machine Learning Models

The following classification models were implemented:

Model	Purpose
Logistic Regression	Baseline classification
Decision Tree	Rule-based classification
Random Forest	Ensemble learning
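As a rough illustration of how the three models might be set up side by side, the sketch below compares them with 5-fold cross-validation. The hyperparameters (`max_iter=1000`, `n_estimators=100`, `random_state=42`) and the use of cross-validation are assumptions made for this example; the notebook may use a single train-test split and different settings.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Hyperparameters here are illustrative assumptions, not the project's settings
models = {
    # Logistic regression benefits from scaled inputs, so scaling is part of the pipeline
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

cv_accuracy = {}
for name, model in models.items():
    # Mean accuracy over 5 stratified folds
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    cv_accuracy[name] = scores.mean()
    print(f"{name}: {scores.mean():.4f} (+/- {scores.std():.4f})")
```

Putting the scaler inside the pipeline means it is refit on each training fold, so no statistics from the validation fold leak into training.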

📊 Evaluation Metrics

The models were evaluated using:

Accuracy
Precision
Recall
F1-score
Confusion Matrix
ROC Curve
AUC Score
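The metrics listed above are all available in `sklearn.metrics`. The sketch below computes them on a small set of made-up labels (invented for illustration, not project results). Because malignant is encoded as 0 in this dataset, `pos_label=0` is passed so that precision, recall and F1 describe the malignant class; by default scikit-learn treats label 1 (benign) as the positive class.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Toy values invented for illustration: 0 = malignant, 1 = benign
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 1, 0]
y_score = [0.1, 0.2, 0.6, 0.9, 0.8, 0.7, 0.65, 0.4]  # predicted P(benign)

accuracy = accuracy_score(y_true, y_pred)                            # 0.75
recall_malignant = recall_score(y_true, y_pred, pos_label=0)         # 2 of 3 malignant cases caught
precision_malignant = precision_score(y_true, y_pred, pos_label=0)   # 2 of 3 malignant predictions correct
f1_malignant = f1_score(y_true, y_pred, pos_label=0)

# Rows are true labels, columns are predicted labels, both ordered [0, 1]
cm = confusion_matrix(y_true, y_pred)                                # [[2, 1], [1, 4]]

# AUC uses scores rather than hard predictions
auc = roc_auc_score(y_true, y_score)

print(accuracy, recall_malignant, precision_malignant, f1_malignant, auc)
print(cm)
```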

📈 Project Workflow

Data Loading
Exploratory Data Analysis (EDA)
Missing Value Analysis
Feature Correlation Heatmap
Train-Test Split
Feature Scaling
Logistic Regression Modeling
ROC-AUC Evaluation
Handling Imbalanced Data
Decision Tree Classification
Random Forest Classification
Model Comparison
Feature Importance Analysis
Final Insights & Conclusion
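The core of the workflow above (split, scale, fit, evaluate) can be sketched for the Logistic Regression step as follows. The 80/20 stratified split, `random_state=42`, and `class_weight="balanced"` as the imbalance-handling technique are assumptions for this example; the notebook may use other choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Assumed split: 80% train, 20% test, stratified to preserve the class ratio
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The scaler is fit on the training data only, then applied to the test data.
# class_weight="balanced" is one possible way to handle class imbalance (assumption).
clf = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
y_prob = clf.predict_proba(X_test)[:, 1]  # probability of class 1 (benign)

accuracy = accuracy_score(y_test, y_pred)
auc = roc_auc_score(y_test, y_prob)
print(f"Accuracy: {accuracy:.4f}  ROC-AUC: {auc:.4f}")
```

Exact scores depend on the split and settings, so they will not necessarily match the figures reported in the Results Summary.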

🏆 Results Summary

Model	Accuracy
Logistic Regression	98.24%
Decision Tree	91.22%
Random Forest	95.61%

✅ Best Performing Model

Logistic Regression achieved the strongest overall balance across:

Accuracy
Stability
Interpretability
ROC-AUC performance

🔍 Key Insights

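Recall on the malignant class is critical in medical diagnosis, because a false negative means a cancer case goes undetected.
ROC-AUC summarizes performance across all decision thresholds, so it gives a fuller picture than accuracy at a single threshold, especially when classes are imbalanced.
Random Forest improved on the single Decision Tree (95.61% vs 91.22% accuracy) through ensemble learning, but did not surpass Logistic Regression.
Feature importance analysis highlighted which tumor measurements most influenced the predictions.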
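One common way to obtain feature importances is the impurity-based `feature_importances_` attribute of a fitted Random Forest. The sketch below ranks the features that way; fitting on the full dataset and the settings `n_estimators=200`, `random_state=42` are assumptions for illustration, and the resulting ranking may differ from the notebook's.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# as_frame=True returns pandas objects, which keeps the feature names attached
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

forest = RandomForestClassifier(n_estimators=200, random_state=42)
forest.fit(X, y)

# Impurity-based importances: one value per feature, summing to 1
importances = pd.Series(forest.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)

top5 = importances.head(5)
print(top5)
```

Impurity-based importances can be spread across strongly correlated features, so the ranking is best read alongside the correlation heatmap.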

📁 Repository Structure

AI_ML_Task4_Classification_Project/
│
├── AI_ML_Task4_Classification.ipynb
├── AI_ML_Task4_Classification_report.pdf
└── README.md

🚀 Conclusion

This project demonstrates how machine learning classification techniques can be applied to real-world medical diagnosis problems using proper evaluation metrics and model comparison techniques.

👨‍💻 Author

Sahil Bhatti
