3,298 changes: 3,298 additions & 0 deletions BioNeuralNet.ipynb

Large diffs are not rendered by default.

30 changes: 30 additions & 0 deletions CHANGELOG.md
@@ -35,3 +35,33 @@ and this project adheres to [Semantic Versioning](https://semver.org/).

## **[Unreleased]**
- Multi-Modal Integration

## [1.0] - 2025-03-06

### **Added**
- **Simplified requirements**: Requirements.txt was significantly simplified. Additionally, unnecessary imports were removed from the core package.
- **New Metrics**: New correlation, evaluation, and plotting Python files.
- **Plotting Functions**:
  - plot_variance_distribution
  - plot_variance_by_feature
  - plot_performance
  - plot_embeddings
  - plot_network
  - compare_clusters
- **Correlation Functions**:
  - omics_correlation
  - cluster_correlation
  - louvain_to_adjacency
- **Evaluation**:
  - evaluate_rf
- **New Utilities**: Added files to convert RData files (networks stored as adjacency matrices) to pandas DataFrame adjacency matrices.
- **Variance Functions**:
  - remove_variance
  - remove_fraction
  - network_remove_low_variance
  - network_remove_high_zero_fraction
  - network_filter
  - omics_data_filter

- **Updated Tutorials and Documentation**: New end-to-end Jupyter notebook example.
- **Updated Tests**: All tests have been updated and new ones have been added.
219 changes: 122 additions & 97 deletions README.md
@@ -1,4 +1,4 @@
# BioNeuralNet: A Multi-Omics Integration and GNN-Based Embedding Framework
# BioNeuralNet: Multi-Omics Integration with Graph Neural Networks

![License](https://img.shields.io/badge/license-MIT-blue.svg)
![PyPI](https://img.shields.io/pypi/v/bioneuralnet)
@@ -8,52 +8,40 @@

[![Documentation](https://img.shields.io/badge/docs-read%20the%20docs-blue.svg)](https://bioneuralnet.readthedocs.io/en/latest/)

## Welcome to [BioNeuralNet Beta 0.2](https://bioneuralnet.readthedocs.io/en/latest/index.html)
## Welcome to BioNeuralNet 1.0

![BioNeuralNet Logo](/assets/LOGO_WB.png)
![BioNeuralNet Logo](assets/LOGO_WB.png)

**Note:** This is a **beta version** of BioNeuralNet. We are actively developing new features and improving stability.
Feedback and bug reports are highly encouraged!

BioNeuralNet is a Python framework for integrating **multi-omics data** with **Graph Neural Networks (GNNs)**.
It provides tools for **graph construction, clustering, network embedding, subject representation, and disease prediction**.
BioNeuralNet is a robust Python framework for integrating multi-omics data with Graph Neural Networks (GNNs).

![BioNeuralNet Workflow](assets/BioNeuralNet.png)

---

## **Key Features**

BioNeuralNet enables **multi-omics analysis** through **five core steps**:

### 1. **Graph Construction**
- Build multi-omics networks using **SmCCNet** or custom adjacency matrices.

### 2. **Graph Clustering**
- Identify meaningful communities with **Correlated Louvain**, **Hybrid Louvain**, or **Correlated PageRank**.
## Table of Contents

### 3. **GNN Embedding**
- Generate node embeddings using models like **GCN, GAT, GraphSAGE, and GIN**.
- [1. Installation](#1-installation)
  - [1.1. Install BioNeuralNet](#11-install-bioneuralnet)
  - [1.2. Install PyTorch and PyTorch Geometric](#12-install-pytorch-and-pytorch-geometric)
- [2. BioNeuralNet Core Features](#2-bioneuralnet-core-features)
- [3. Quick Example: SmCCNet + DPMON for Disease Prediction](#3-quick-example-smccnet--dpmon-for-disease-prediction)
- [4. Documentation and Tutorials](#4-documentation-and-tutorials)
- [5. Frequently Asked Questions (FAQ)](#5-frequently-asked-questions-faq)
- [6. Acknowledgments](#6-acknowledgments)
- [7. Testing and Continuous Integration](#7-testing-and-continuous-integration)
- [8. Contributing](#8-contributing)
- [9. License](#9-license)
- [10. Contact](#10-contact)

### 4. **Subject Representation**
- Integrate GNN-based embeddings into omics data via **GraphEmbedding**.
## 1. Installation

### 5. **Disease Prediction**
- Use **DPMON**, a GNN-powered classifier, to predict disease phenotypes.
BioNeuralNet supports Python 3.10 and 3.11.

---

## **Installation**

BioNeuralNet supports **Python 3.10 and 3.11**.

### **1. Install BioNeuralNet**
### 1.1. Install BioNeuralNet
```bash
pip install bioneuralnet
```

### **2. Install PyTorch and PyTorch Geometric**
BioNeuralNet relies on PyTorch for GNN computations. Install PyTorch **separately**:
### 1.2. Install PyTorch and PyTorch Geometric
BioNeuralNet relies on PyTorch for GNN computations. Install PyTorch separately:

- **PyTorch (CPU)**:
```bash
@@ -65,26 +53,80 @@ BioNeuralNet relies on PyTorch for GNN computations. Install PyTorch **separatel
pip install torch_geometric
```

For GPU acceleration, visit:
For GPU acceleration, please refer to:
- [PyTorch Installation Guide](https://pytorch.org/get-started/locally/)
- [PyTorch Geometric Installation Guide](https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html)
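
Because PyTorch and PyTorch Geometric are installed separately, it can help to confirm that all three packages are visible to the interpreter before running anything. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and is not part of BioNeuralNet.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    # find_spec only locates the package; it does not import it.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Package names taken from the pip commands in this section.
missing = missing_packages(["bioneuralnet", "torch", "torch_geometric"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages were found.")
```

This only checks that the packages can be located; it does not verify versions or GPU support.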

---

## **Quick Example: SmCCNet + DPMON for Disease Prediction**
## 2. BioNeuralNet Core Features

For an end-to-end example of BioNeuralNet, see the [BioNeuralNet Demo](https://bioneuralnet.readthedocs.io/en/latest/BioNeuralNet.html).

**Network Embedding**:

- Given a multi-omics network as input, BioNeuralNet can generate embeddings using Graph Neural Networks (GNNs).
- Generate embeddings using methods such as **GCN**, **GAT**, **GraphSAGE**, and **GIN**.
- Outputs can be obtained as native tensors or converted to pandas DataFrames for easy analysis and visualization.
- Embeddings unlock numerous downstream applications, including disease prediction, enhanced subject representation, clustering, and more.
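
To give a feel for what a GNN embedding step does, here is a deliberately simplified sketch of one round of neighbourhood averaging on an adjacency matrix. It is an illustration only: it has no learned weights or non-linearity, the function name `propagate_once` is hypothetical, and it is not how BioNeuralNet's GCN, GAT, GraphSAGE, or GIN models are implemented.

```python
def propagate_once(adjacency, features):
    """Average each node's features with those of its neighbours (one message-passing step).

    adjacency: square list of lists with non-negative edge weights.
    features:  list of per-node feature vectors (lists of floats).
    """
    n = len(adjacency)
    dim = len(features[0])
    updated = []
    for i in range(n):
        # Add a self-loop so each node keeps part of its own signal.
        weights = [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        total = sum(weights)
        updated.append(
            [sum(weights[j] * features[j][k] for j in range(n)) / total for k in range(dim)]
        )
    return updated


# Path graph 0 - 1 - 2 with one feature per node.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[0.0], [3.0], [6.0]]
print(propagate_once(adjacency, features))
```

Real GNN layers follow the same aggregate-then-update pattern but add trainable weight matrices and activation functions.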

**Graph Clustering**:

- Identify functional modules or communities using **correlated clustering methods** (e.g., CorrelatedPageRank, CorrelatedLouvain, HybridLouvain) that integrate phenotype correlation to extract biologically relevant modules [1]_.
- Clustering methods can be applied to any network represented as an adjacency matrix, allowing flexible analysis across different domains.
- All clustering components return either raw partition dictionaries or induced subnetwork adjacency matrices (as DataFrames) for visualization.
- Use cases include feature selection, biomarker discovery, and network-based analysis.
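
The idea of integrating phenotype correlation into clustering can be illustrated by scoring a candidate cluster by how strongly its average signal tracks the phenotype. The sketch below shows only that scoring idea; the names `pearson` and `cluster_phenotype_score` are hypothetical, and the actual CorrelatedLouvain, HybridLouvain, and CorrelatedPageRank methods may combine correlation with other objectives in ways not shown here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def cluster_phenotype_score(omics, cluster, phenotype):
    """Absolute correlation between a cluster's per-subject mean and the phenotype.

    omics:     dict mapping feature name -> list of values, one per subject.
    cluster:   list of feature names belonging to the candidate cluster.
    phenotype: list of phenotype values, one per subject.
    """
    n_subjects = len(phenotype)
    summary = [sum(omics[f][s] for f in cluster) / len(cluster) for s in range(n_subjects)]
    return abs(pearson(summary, phenotype))


# Toy data: g1 and g2 follow the phenotype, g3 does not.
omics = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [5, 1, 4, 2]}
phenotype = [1, 2, 3, 4]
print(cluster_phenotype_score(omics, ["g1", "g2"], phenotype))
print(cluster_phenotype_score(omics, ["g3"], phenotype))
```

A cluster with a higher score is a more promising candidate module for the phenotype under study.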

**Downstream Tasks**:

- **Subject Representation**:

- Integrate node embeddings back into omics data to enrich subject-level profiles by weighting features with the learned embeddings.
- This embedding-enriched data can be used for downstream tasks such as disease prediction or biomarker discovery.
- The result can be returned as a DataFrame or a PyTorch tensor, fitting naturally into downstream analyses.
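
As a rough illustration of weighting omics features with learned embeddings, the sketch below collapses each feature's embedding vector to a single scalar and scales that feature's values by it. Using the mean as the reduction is an assumption made for this example, and `enrich_omics` is a hypothetical name, not BioNeuralNet's API.

```python
def enrich_omics(omics, embeddings):
    """Scale each omics feature by a scalar summary of its node embedding.

    omics:      dict mapping feature name -> list of values, one per subject.
    embeddings: dict mapping feature name -> embedding vector (list of floats).
    """
    # Assumption: reduce each embedding to one weight by taking its mean.
    weights = {name: sum(vec) / len(vec) for name, vec in embeddings.items()}
    return {name: [value * weights[name] for value in values] for name, values in omics.items()}


omics = {"g1": [1.0, 2.0], "g2": [3.0, 4.0]}
embeddings = {"g1": [0.5, 1.5], "g2": [2.0, 2.0]}
print(enrich_omics(omics, embeddings))
```

The enriched values keep the original subjects-by-features layout, so they can be passed to the same downstream models as the raw omics data.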

- **Disease Prediction for Multi-Omics Networks (DPMON)**:

- End-to-end classification pipeline for disease prediction using Graph Neural Network embeddings.
- DPMON supports hyperparameter tuning; when enabled, it searches for the best configuration for the given data.
- This approach, along with the native pandas integration across modules, ensures that BioNeuralNet can be easily incorporated into your analysis workflows.

**Metrics**:

- Several plotting functions to visualize networks, embeddings, variance distributions, cluster comparisons, and more.
- Correlation-based functions to compare clusters and omics data with the phenotype.

**Utilities**:

- **Filtering Functions**:

- Network filtering lets users apply variance or zero-fraction filtering to an omics network.
- This reduces noise and removes outliers.
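
Variance and zero-fraction filtering can be sketched in a few lines. The example below is a generic illustration of the two criteria, not the package's implementation: `filter_features`, `min_variance`, and `max_zero_fraction` are hypothetical names chosen for this sketch.

```python
from statistics import pvariance


def filter_features(omics, min_variance=0.0, max_zero_fraction=1.0):
    """Keep features whose variance is above a floor and whose share of zeros is acceptable.

    omics: dict mapping feature name -> list of values, one per subject.
    """
    kept = {}
    for name, values in omics.items():
        zero_fraction = sum(1 for v in values if v == 0) / len(values)
        if pvariance(values) > min_variance and zero_fraction <= max_zero_fraction:
            kept[name] = values
    return kept


omics = {
    "flat": [1, 1, 1, 1],      # no variance, carries no signal
    "sparse": [0, 0, 0, 5],    # mostly zeros
    "varied": [1, 2, 3, 4],    # passes both filters
}
print(list(filter_features(omics, min_variance=0.1, max_zero_fraction=0.5)))
```

Thresholds are data-dependent, so it is worth inspecting the variance distribution before choosing them.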

- **Data Conversion**:

- Convert RData files to both CSV and pandas DataFrames, easing integration with R-based workflows.

**External Tools**:

- **Graph Construction**:

- BioNeuralNet provides additional tools in the [External Tools](https://bioneuralnet.readthedocs.io/en/latest/external_tools/index.html) module.
- These tools let users generate networks with R-based packages such as WGCNA and SmCCNet.
- While optional, these tools enhance BioNeuralNet's capabilities and are recommended for comprehensive analysis.

## 3. Quick Example: SmCCNet + DPMON for Disease Prediction

```python
import pandas as pd
from bioneuralnet.datasets import DatasetLoader
from bioneuralnet.external_tools import SmCCNet
from bioneuralnet.downstream_task import DPMON
import pandas as pd

# 1) Load dataset
# 1. Load dataset
loader = DatasetLoader("example1")
omics1, omics2, phenotype, clinical = loader.load_data()

# 2) Generate adjacency matrix using SmCCNet
# 2. Generate adjacency matrix using SmCCNet
smccnet = SmCCNet(
phenotype_df=phenotype,
omics_dfs=[omics1, omics2],
@@ -94,7 +136,7 @@ smccnet = SmCCNet(
)
global_network, _ = smccnet.run()

# 3) Run Disease Prediction using DPMON
# 3. Run Disease Prediction using DPMON
dpmon = DPMON(
adjacency_matrix=global_network,
omics_list=[omics1, omics2],
@@ -106,76 +148,62 @@ dpmon_predictions = dpmon.run()
print("Disease Predictions:\n", dpmon_predictions.head())
```

### **Output**
- **Adjacency Matrix**: The constructed multi-omics network.
- **Predictions**: Disease phenotype predictions.
## 4. Documentation and Tutorials

---

## **Documentation & Tutorials**
- Full documentation: [Read the Docs](https://bioneuralnet.readthedocs.io/en/latest/)
- Full documentation: [BioNeuralNet Documentation](https://bioneuralnet.readthedocs.io/en/latest/)
- Tutorials include:
- Multi-omics graph construction
- GNN embeddings for disease prediction
- Subject representation with integrated embeddings
- Clustering using Hybrid Louvain & Correlated PageRank

---
- Clustering using Hybrid Louvain and Correlated PageRank
- API details are available in the [API Reference](https://bioneuralnet.readthedocs.io/en/latest/api.html).

## **Frequently Asked Questions (FAQ)**
## 5. Frequently Asked Questions (FAQ)

- Does BioNeuralNet support **GPU acceleration**?
- Yes, install PyTorch with CUDA support.
- **Does BioNeuralNet support GPU acceleration?**
Yes, install PyTorch with CUDA support.

- Can I use my own **adjacency matrix**?
- yes, you can provide a custom matrix instead of using SmCCNet.
- **Can I use my own omics network?**
Yes, you can provide a custom network as an adjacency matrix instead of using SmCCNet.

- What clustering methods are supported?
- **Correlated Louvain**, **Hybrid Louvain**, and **Correlated PageRank**.
- **What clustering methods are supported?**
BioNeuralNet supports Correlated Louvain, Hybrid Louvain, and Correlated PageRank.

See the full [FAQ](https://bioneuralnet.readthedocs.io/en/latest/faq.html).
For more FAQs, please visit our [FAQ page](https://bioneuralnet.readthedocs.io/en/latest/faq.html).

---

## **Acknowledgments**
## 6. Acknowledgments

BioNeuralNet integrates multiple open-source libraries. We acknowledge key dependencies:

- **PyTorch** - GNN computations and deep learning models.
- **PyTorch Geometric** - Graph-based learning for multi-omics.
- **NetworkX** - Graph data structure and algorithms.
- **Scikit-learn** - Feature selection and evaluation utilities.
- **pandas & numpy** - Core data processing tools.
- **ray[tune]** - Hyperparameter tuning for GNN models.
- **matplotlib** - Data visualization.
- **cptac** - Dataset handling for clinical proteomics.
- **python-louvain** - Community detection algorithms.
- [**PyTorch**](https://github.com/pytorch/pytorch) - GNN computations and deep learning models.
- [**PyTorch Geometric**](https://github.com/pyg-team/pytorch_geometric) - Graph-based learning for multi-omics.
- [**NetworkX**](https://github.com/networkx/networkx) - Graph data structures and algorithms.
- [**Scikit-learn**](https://github.com/scikit-learn/scikit-learn) - Feature selection and evaluation utilities.
- [**pandas**](https://github.com/pandas-dev/pandas) & [**numpy**](https://github.com/numpy/numpy) - Core data processing tools.
- [**ray[tune]**](https://github.com/ray-project/ray) - Hyperparameter tuning for GNN models.
- [**matplotlib**](https://github.com/matplotlib/matplotlib) - Data visualization.
- [**cptac**](https://github.com/PNNL-CompBio/cptac) - Dataset handling for clinical proteomics.
- [**python-louvain**](https://github.com/taynaud/python-louvain) - Community detection algorithms.

We also acknowledge R-based tools for external network construction:
- **SmCCNet** - Sparse multiple canonical correlation network.
- **WGCNA** - Weighted gene co-expression network analysis.

These tools **enhance BioNeuralNet** but are **not required** for core functionality.

---
- [**SmCCNet**](https://github.com/UCD-BDLab/BioNeuralNet/tree/main/bioneuralnet/external_tools/smccnet) - Sparse multiple canonical correlation network.
- [**WGCNA**](https://cran.r-project.org/web/packages/WGCNA/) - Weighted gene co-expression network analysis.

## **Testing & Continuous Integration**
## 7. Testing and Continuous Integration

1. **Run Tests Locally**:
- **Run Tests Locally:**
```bash
pytest --cov=bioneuralnet --cov-report=html
open htmlcov/index.html
```

2. **Continuous Integration**:
- GitHub Actions runs automated tests on each commit.
- **Continuous Integration:**
GitHub Actions runs automated tests on every commit.

---
## 8. Contributing

## **Contributing**

We welcome contributions!
To get started:
We welcome contributions! To get started:

```bash
git clone https://github.com/UCD-BDLab/BioNeuralNet.git
@@ -185,21 +213,18 @@ pre-commit install
pytest
```

### **How to Contribute**
- **Fork** the repo, create a new branch, implement your changes.
- **Add tests and documentation** for your new feature.
- **Submit a pull request** with a clear description.
### How to Contribute
- Fork the repository, create a new branch, and implement your changes.
- Add tests and documentation for any new features.
- Submit a pull request with a clear description of your changes.

For more details, check our [Contributing Guide](https://github.com/UCD-BDLab/BioNeuralNet/blob/main/CONTRIBUTING.md).
For more details, see our [Contributing Guide](https://github.com/UCD-BDLab/BioNeuralNet/blob/main/CONTRIBUTING.md).

---
## 9. License

## **License**
- **License:** [MIT License](https://github.com/UCD-BDLab/BioNeuralNet/blob/main/LICENSE)
BioNeuralNet is distributed under the [MIT License](https://github.com/UCD-BDLab/BioNeuralNet/blob/main/LICENSE).

---
## 10. Contact

## **Contact**
- **Issues & Feature Requests:** [Open an Issue](https://github.com/UCD-BDLab/BioNeuralNet/issues)
- **Issues and Feature Requests:** [Open an Issue](https://github.com/UCD-BDLab/BioNeuralNet/issues)
- **Email:** [vicente.ramos@ucdenver.edu](mailto:vicente.ramos@ucdenver.edu)
