luisastue/datascience-ckd


Chronic Kidney Disease Prediction

This repository contains a data science project focused on the prediction of chronic kidney disease (CKD) using clinical and laboratory data. It covers the full workflow from preprocessing and exploratory data analysis to imputation, statistical modeling, machine learning, causal inference, and evaluation.

The project is organized as a notebook-based pipeline and compares multiple predictive approaches, including decision trees, random forests, gradient boosting, logistic regression, k-nearest neighbors, support vector machines, and neural networks.
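A minimal sketch of how such a model comparison could be set up with scikit-learn. It runs on synthetic data from `make_classification`, not on the project's dataset. The `build_models` and `compare_models` helpers, the hyperparameters, and the choice of 5-fold cross-validated accuracy are assumptions for illustration and are not taken from the notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_models(seed=0):
    """Return one estimator per model family (hypothetical settings)."""
    return {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "gradient_boosting": GradientBoostingClassifier(random_state=seed),
        # Scale-sensitive models are wrapped in a pipeline with StandardScaler.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "svm": make_pipeline(StandardScaler(), SVC()),
        "neural_network": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=seed),
        ),
    }


def compare_models(X, y, cv=5):
    """Mean cross-validated accuracy for every model in build_models()."""
    return {
        name: float(cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean())
        for name, model in build_models().items()
    }


# Synthetic stand-in data; replace with the processed CKD features and labels.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
scores = compare_models(X, y)
for name, accuracy in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {accuracy:.3f}")
```

Wrapping the scaler and the estimator in one pipeline keeps the scaling step inside each cross-validation fold, so statistics from held-out rows do not leak into training.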


Project Overview

Chronic kidney disease is a major global health issue associated with increased morbidity and mortality. The project notes describe CKD as a progressive loss of kidney function, with diagnosis commonly relying on indicators such as creatinine-related measurements and urinary albumin. The notes also highlight several relevant clinical risk factors and biomarkers, including hypertension, diabetes mellitus, blood urea, serum creatinine, hemoglobin, and specific gravity.

This repository investigates how well CKD can be predicted from patient features and which variables are most informative for classification.
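One common way to ask which variables are most informative is permutation importance: shuffle one feature at a time on held-out data and measure how much the score drops. The sketch below illustrates the idea on synthetic data in which only the column named `signal` determines the label. The feature names, the random forest settings, and the use of permutation importance itself are assumptions for illustration; the notebooks may rank features differently.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: the label depends only on the first column.
rng = np.random.default_rng(0)
feature_names = ["signal", "noise_a", "noise_b", "noise_c"]  # hypothetical names
X = rng.normal(size=(400, 4))
y = (X[:, 0] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature on the held-out split and record the mean accuracy drop.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
ranking = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking:
    print(f"{name:10s} {importance:.3f}")
```

Computing the importances on the test split rather than the training split avoids rewarding features the model has merely memorized.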


Repository Structure

datascience-ckd/
├── assets/                         # Images and figures used in the project
├── data/                           # Raw data
├── processed/                      # Processed datasets
├── plots/                          # Exported visualizations
├── results/                        # Model outputs and final results
├── util/                           # Helper functions and utilities
├── 1_preprocessing.ipynb
├── 2_eda.ipynb
├── 3_imputation.ipynb
├── 4_statistical_modeling.ipynb
├── 5_learning.ipynb
├── 5.1_decision_tree.ipynb
├── 5.2_random_forests.ipynb
├── 5.2a_random_forests_without_diabetes.ipynb
├── 5.3_gradient_boosting.ipynb
├── 5.3a_gradient_boosting_without_diabetes.ipynb
├── 5.4_logistic_regression.ipynb
├── 5.4a_logistic_regression_without_diabetes.ipynb
├── 5.5_knn.ipynb
├── 5.6_svm.ipynb
├── 5.7_neural_networks.ipynb
├── 5.7a_neural_networks_without_diabetes.ipynb
├── 5.8_causal_inference.ipynb
├── 6_evaluation.ipynb
├── 6_evaluation_luisa.ipynb
├── 6_evaluation_without_diabetes.ipynb
├── metrics_dtree.csv
├── metrics_knn.csv
├── metrics_logreg.csv
├── metrics_rf.csv
├── metrics_svm.csv
├── data.json
└── research.md
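The per-model `metrics_*.csv` files in the repository root could be gathered for a side-by-side look with a few lines of standard-library Python. This is a sketch only: `collect_metrics` is a hypothetical helper, and it assumes each file starts with a header row. The actual column layout of these files is not documented here.

```python
import csv
from pathlib import Path


def collect_metrics(root="."):
    """Map each model suffix (e.g. 'rf') to the rows of metrics_<suffix>.csv.

    Assumes every file has a header row; rows are returned as dicts of strings.
    """
    tables = {}
    for path in sorted(Path(root).glob("metrics_*.csv")):
        model = path.stem[len("metrics_"):]  # 'metrics_rf' -> 'rf'
        with path.open(newline="", encoding="utf-8") as handle:
            tables[model] = list(csv.DictReader(handle))
    return tables


# Run from the repository root to list what was found.
for model, rows in collect_metrics().items():
    print(f"{model}: {len(rows)} row(s), columns: {list(rows[0]) if rows else []}")
```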
