Example 1: SmCCNet + DPMON for Disease Prediction
This tutorial illustrates how to:
Build an adjacency matrix with SmCCNet.
Predict disease phenotypes using DPMON.
Workflow:
Data Preparation: Load multi-omics, phenotype, and clinical data using DatasetLoader.
Network Construction: Use auto_pysmccnet() to create an adjacency matrix from the combined omics data.
Disease Prediction: DPMON integrates the adjacency matrix, omics data, and phenotype data to train a GNN-based classifier.
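The data preparation step implicitly assumes that every table describes the same subjects. A minimal sketch of that check is shown here, using plain pandas on small made-up tables. The helper name `align_samples` and the toy data are hypothetical and are not part of BioNeuralNet.

```python
import pandas as pd


def align_samples(*tables):
    """Restrict every table to the subjects (row labels) shared by all of them."""
    shared = tables[0].index
    for table in tables[1:]:
        shared = shared.intersection(table.index)
    if shared.empty:
        raise ValueError("the tables have no subjects in common")
    shared = shared.sort_values()
    return [table.loc[shared] for table in tables]


# Toy stand-ins for two omics tables and a phenotype table (hypothetical data).
genes = pd.DataFrame(
    {"g1": [0.1, 0.4, 0.9], "g2": [1.2, 0.7, 0.3]}, index=["s1", "s2", "s3"]
)
mirna = pd.DataFrame({"m1": [5.0, 6.1, 4.2]}, index=["s2", "s3", "s4"])
pheno = pd.DataFrame({"Y": [0, 1, 1, 0]}, index=["s1", "s2", "s3", "s4"])

genes_a, mirna_a, pheno_a = align_samples(genes, mirna, pheno)
print(list(genes_a.index))  # only the subjects present in all three tables
```

Aligning on the intersection keeps the row order identical across tables, which is what a subject-by-feature model input generally needs.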
Figure: Embedding-enhanced subject data using DPMON for improved disease prediction.
Step-by-Step Instructions:
Data Setup: Load synthetic multi-omics, phenotype, and clinical data using DatasetLoader.
Network Construction (SmCCNet): Call auto_pysmccnet() to produce an adjacency matrix from the omics data.
Disease Prediction (DPMON): Pass the adjacency matrix, omics, phenotype, and clinical data into DPMON, then call .run() to predict disease phenotypes.
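For intuition about how a GCN-style model uses an adjacency matrix, the sketch below implements a single graph-convolution step in the widely used Kipf and Welling form, ReLU(D^-1/2 (A + I) D^-1/2 H W). It is illustrative only: the function name, the toy network, and the random weights are assumptions, and it should not be read as DPMON's actual implementation.

```python
import numpy as np


def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 @ H @ W)."""
    a_hat = adjacency + np.eye(adjacency.shape[0])  # add self-loops
    degree = a_hat.sum(axis=1)                      # weighted node degrees
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalization: scale entry (i, j) by 1 / sqrt(d_i * d_j).
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weights, 0.0)


# Toy 3-node weighted network (hypothetical values).
adjacency = np.array(
    [[0.0, 0.8, 0.0],
     [0.8, 0.0, 0.3],
     [0.0, 0.3, 0.0]]
)
rng = np.random.default_rng(0)
features = rng.normal(size=(3, 4))  # 3 nodes, 4 input features each
weights = rng.normal(size=(4, 2))   # project to 2-dimensional embeddings

embeddings = gcn_layer(adjacency, features, weights)
print(embeddings.shape)
```

Each node's new embedding is a degree-normalized mix of its own features and those of its neighbors, so strongly connected features end up with similar representations.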
Below is a complete Python implementation:
import pandas as pd
from bioneuralnet.network import auto_pysmccnet
from bioneuralnet.downstream_task import DPMON
from bioneuralnet.datasets import DatasetLoader
# Step 1: Load your data or use one of the provided datasets
Example = DatasetLoader("example")
omics_genes = Example.data["X1"]
omics_proteins = Example.data["X2"]
phenotype = Example.data["Y"]
clinical = Example.data["clinical_data"]
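# Step 2: Network Construction
SEED = 42  # arbitrary value, fixed so the run is reproducible
result = auto_pysmccnet(
    X=[omics_genes, omics_proteins],  # the two omics tables loaded in Step 1
    Y=phenotype,
    DataType=["genes", "mirna"],  # one label per entry in X, in the same order
    subSampNum=1000,
    seed=SEED,
    Kfold=3,
    BetweenShrinkage=5,
    CutHeight=1 - 0.1**10,
    summarization="NetSHy",
)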
global_network = result["AdjacencyMatrix"]
subnetworks = result["Subnetworks"]
print("Adjacency matrix generated.")
# Step 3: Disease Prediction (DPMON)
dpmon = DPMON(
    adjacency_matrix=global_network,
    omics_list=[omics_genes, omics_proteins],
    phenotype_data=phenotype,
    clinical_data=clinical,
    model="GCN",
)
predictions, avg_accuracy = dpmon.run()
print("Disease phenotype predictions:\n", predictions)
Output:
Adjacency Matrix: Generated using SmCCNet.
Predictions: Phenotype predictions for each subject.
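Before handing an adjacency matrix to a downstream model, it can help to confirm that it is square, symmetric, and labeled with features that actually appear in the omics tables. The sketch below assumes the adjacency matrix is a pandas DataFrame whose row and column labels are feature names; that is an assumption about the output format, not something stated in this tutorial. `check_adjacency` is a hypothetical helper and the data are made up.

```python
import numpy as np
import pandas as pd


def check_adjacency(adjacency, omics_list):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if adjacency.shape[0] != adjacency.shape[1]:
        problems.append("matrix is not square")
        return problems  # the remaining checks need a square matrix
    if list(adjacency.index) != list(adjacency.columns):
        problems.append("row and column labels differ")
    values = adjacency.to_numpy()
    if not np.allclose(values, values.T):
        problems.append("matrix is not symmetric")
    features = {col for table in omics_list for col in table.columns}
    missing = sorted(set(adjacency.index) - features)
    if missing:
        problems.append(f"nodes missing from omics data: {missing}")
    return problems


# Toy network over three features (hypothetical values).
nodes = ["g1", "g2", "m1"]
adj = pd.DataFrame(
    [[0.0, 0.5, 0.2], [0.5, 0.0, 0.0], [0.2, 0.0, 0.0]],
    index=nodes,
    columns=nodes,
)
genes = pd.DataFrame({"g1": [0.1, 0.4], "g2": [1.2, 0.7]})
mirna = pd.DataFrame({"m1": [5.0, 6.1]})

print(check_adjacency(adj, [genes, mirna]))
```

Returning a list of messages rather than raising on the first failure makes it easy to report every mismatch in one pass.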