This repository trains and compares machine learning models that predict a binary outcome (0 or 1) for diabetes diagnosis on a highly imbalanced dataset. The models are XGBoost, Gradient Boosting, Random Forest, and Logistic Regression. Bootstrap resampling is used to generate precision-recall curves for model comparison, an approach that is particularly useful for imbalanced data. The repository also analyzes how the classification outcomes (False Negatives, False Positives, True Positives, True Negatives) change across classification thresholds, both to identify individuals who are harder to classify and to determine optimal thresholds for diabetes detection.
anozk/Machine-learning-Models-Testing-on-Diabetes-Data
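The bootstrap precision-recall comparison described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code. It assumes scikit-learn is available, uses a synthetic imbalanced dataset in place of the diabetes data, and fits only Logistic Regression. The function name `bootstrap_pr_curves` and all of its parameters are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import train_test_split


def bootstrap_pr_curves(y_true, y_score, n_boot=100, grid_size=101, seed=0):
    """Resample the test set with replacement and interpolate each
    precision-recall curve onto a shared recall grid (hypothetical helper)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    if y_true.sum() == 0:
        raise ValueError("y_true contains no positive samples")
    rng = np.random.default_rng(seed)
    recall_grid = np.linspace(0.0, 1.0, grid_size)
    curves, ap_scores = [], []
    while len(curves) < n_boot:
        idx = rng.integers(0, len(y_true), size=len(y_true))
        if y_true[idx].sum() == 0:
            continue  # recall is undefined without positives; draw again
        precision, recall, _ = precision_recall_curve(y_true[idx], y_score[idx])
        # recall comes back in decreasing order; np.interp expects increasing x.
        # Tied recall values make this interpolation an approximation.
        curves.append(np.interp(recall_grid, recall[::-1], precision[::-1]))
        ap_scores.append(average_precision_score(y_true[idx], y_score[idx]))
    return recall_grid, np.array(curves), np.array(ap_scores)


# Synthetic stand-in for the diabetes data: roughly 5% positive class (assumption).
X, y = make_classification(n_samples=4000, n_features=10, weights=[0.95, 0.05],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]

recall_grid, curves, ap_scores = bootstrap_pr_curves(y_test, scores)

# Pointwise 95% band for precision at each recall level.
lower, upper = np.percentile(curves, [2.5, 97.5], axis=0)
print(f"mean average precision over resamples: {ap_scores.mean():.3f}")
```

Repeating the same call with each model's predicted probabilities would give one band per model. Overlapping bands suggest that a difference between two models may not be robust to sampling variation.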