bioneuralnet.network.pysmccnet.pipeline

Main SmCCNet pipeline. Supports CCA for continuous phenotypes and PLS for binary phenotypes.

Functions

`auto_pysmccnet`(X, Y[, AdjustedCovar, ...])	Automated SmCCNet workflow with GPU acceleration.
`data_preprocess`(X[, covariates, is_cv, ...])	PyTorch version of data_preprocess for omics dataset preparation.
`get_abar`(ws[, feature_label])	PyTorch equivalent of get_abar performing matrix multiplication on GPU.
`get_can_cor_multi`(X, cc_coef, cc_weight, Y)	PyTorch version of get_can_cor_multi calculating canonical correlation value on GPU.
`get_can_weights_multi`(X[, Trait, Lambda, ...])	PyTorch version of get_can_weights_multi wrapper.
`get_logger`(name)	Retrieves a global logger configured to write to 'bioneuralnet.log'.
`get_omics_modules`(Abar[, cut_height])	Extract omics modules via hierarchical clustering on the similarity matrix.
`get_robust_weights_multi`(X, Trait, Lambda[, ...])	PyTorch version of get_robust_weights_multi with subsampling loop.
`get_robust_weights_multi_binary`(X, Y[, ...])	PyTorch version of get_robust_weights_multi_binary using hybrid GPU/CPU execution.
`prune_modules`(Abar, X_combined, Y, modules, ...)	Prune network modules to target size range and compute summarization scores.
`r_scale`(x)	NumPy scaling using sample standard deviation.
`r_scale_torch`(x)	PyTorch scaling using sample standard deviation.
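
As an illustration of what "scaling using sample standard deviation" usually means, the sketch below centers each column and divides by the standard deviation computed with `ddof=1`, as R's `scale()` does. It is not the library's implementation: the name `scale_columns` is hypothetical, and centering is an assumption, since the summary above only mentions the sample standard deviation.

```python
import numpy as np


def scale_columns(x: np.ndarray) -> np.ndarray:
    """Center each column and divide by its sample standard deviation (ddof=1).

    Hypothetical helper for illustration; not the r_scale source code.
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    sample_sd = x.std(axis=0, ddof=1)  # ddof=1 gives the n-1 denominator
    return centered / sample_sd


demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaled = scale_columns(demo)
```

NumPy's `std` defaults to the population form (`ddof=0`), so the explicit `ddof=1` is what distinguishes this from a plain NumPy z-score.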

Classes

tqdm(*_, **__)

Decorate an iterable object, returning an iterator which acts exactly like the original iterable, but prints a dynamically updating progress bar every time a value is requested.

bioneuralnet.network.pysmccnet.pipeline.auto_pysmccnet(X: List[DataFrame | ndarray], Y: DataFrame | ndarray, AdjustedCovar: DataFrame | None = None, preprocess: bool = False, Kfold: int = 5, subSampNum: int = 100, DataType: List[str] | None = None, BetweenShrinkage: float = 2.0, ScalingPen: List[float] = [0.1, 0.1], saving_dir: str = '/home/docs/checkouts/readthedocs.org/user_builds/bioneuralnet/checkouts/latest/docs/source', tuneLength: int = 5, tuneRangeCCA: List[float] = [0.1, 0.5], tuneRangePLS: List[float] = [0.5, 0.9], EvalMethod: str = 'accuracy', ncomp_pls: int = 3, seed: int = 123, CutHeight: float = 0.9999999999, min_size: int = 10, max_size: int = 100, summarization: str = 'NetSHy', precomputed_fold_data: dict | None = None, device: torch.device | None = 'cpu', dtype: torch.dtype = torch.float64, rename: bool = True) → dict

Automated SmCCNet workflow with GPU acceleration.

Runs the complete SmCCNet pipeline supporting both CCA (continuous phenotype) and PLS (binary phenotype) modes. The workflow includes optional preprocessing, cross-validation for penalty tuning, subsampling for stability selection, and final network construction.
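
The text here does not say how the pipeline decides between the two modes. One plausible rule, shown purely as a sketch, is to treat a phenotype whose only values are 0 and 1 as binary (PLS) and anything else as continuous (CCA). The helper name `choose_mode` is hypothetical and is not part of bioneuralnet.

```python
import numpy as np


def choose_mode(y) -> str:
    """Return 'PLS' for a 0/1 phenotype and 'CCA' otherwise.

    Hypothetical helper; the actual selection logic may differ.
    """
    values = np.unique(np.asarray(y).ravel())
    is_binary = values.size == 2 and set(values.tolist()) <= {0, 1}
    return "PLS" if is_binary else "CCA"


binary_mode = choose_mode([0, 1, 1, 0])
continuous_mode = choose_mode([2.3, 4.1, 3.7])
```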

Parameters:

X (List[pd.DataFrame | np.ndarray]) – Input data matrices (omics layers) for integration.
Y (pd.DataFrame | np.ndarray) – Phenotype vector; numeric for CCA or binary (0/1) for PLS.
AdjustedCovar (pd.DataFrame | None) – Optional covariates to regress out from X before analysis.
preprocess (bool) – If True, center and scale data; if False, use raw input.
Kfold (int) – Number of cross-validation folds for penalty parameter tuning.
subSampNum (int) – Number of subsampling iterations for stability selection.
DataType (List[str] | None) – Names for each omics layer in X; defaults to generic names if None.
BetweenShrinkage (float) – Shrinkage factor for between-omics scaling weights.
ScalingPen (List[float]) – Penalty terms used for determining scaling factors.
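saving_dir (str) – Directory in which output results are saved. The default shown in the signature is the path of the documentation build environment, so pass an explicit directory.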
tuneLength (int) – Number of candidate penalty parameters to test per omics layer.
tuneRangeCCA (List[float]) – Min and max penalty values for CCA (continuous phenotype).
tuneRangePLS (List[float]) – Min and max penalty values for PLS (binary phenotype).
EvalMethod (str) – Metric for PLS evaluation; one of ‘accuracy’, ‘auc’, ‘precision’, ‘recall’, or ‘f1’.
ncomp_pls (int) – Number of latent components to use for PLS models.
CutHeight (float) – Height threshold for hierarchical tree cutting in module extraction.
min_size (int) – Minimum number of nodes to retain a network module.
max_size (int) – Maximum module size; larger modules are pruned down.
summarization (str) – Network summarization method. Currently only ‘NetSHy’ is supported.
seed (int) – Random seed for reproducibility.
precomputed_fold_data (dict | None) – Precomputed CV folds to bypass internal fold generation.
device (torch.device | None) – PyTorch device; defaults to 'cpu'. If None, a GPU is selected automatically when one is available.
dtype (torch.dtype) – PyTorch data type for computations.
rename (bool) – If True, prefix datatype to column names; if False, use original column names.
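
To make `tuneLength` and the tuning ranges concrete, the sketch below builds a candidate grid under two assumptions that are not confirmed here: candidates are evenly spaced between the range's minimum and maximum, and every combination across omics layers is evaluated. The function `penalty_grid` is hypothetical.

```python
import itertools

import numpy as np


def penalty_grid(tune_range, tune_length, n_layers):
    """Evenly spaced candidates per layer, combined across all layers.

    Hypothetical helper; the pipeline's own grid construction may differ.
    """
    candidates = np.linspace(tune_range[0], tune_range[1], tune_length)
    return [tuple(float(c) for c in combo)
            for combo in itertools.product(candidates, repeat=n_layers)]


# Defaults from the signature: tuneRangeCCA=[0.1, 0.5], tuneLength=5, two layers.
grid = penalty_grid([0.1, 0.5], 5, 2)
```

Under these assumptions the grid grows as `tuneLength ** n_layers`, so the number of cross-validation fits rises quickly with each additional omics layer.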

Returns:

Dictionary containing results for ‘CCA’ or ‘PLS’ including adjacency matrices, processed data, and CV results.

Return type:

dict
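
A possible way to call the function, using only keyword names from the signature above, is sketched below. The data are synthetic, the variable names are made up for illustration, and the small `Kfold`, `subSampNum`, and `tuneLength` values are chosen only to keep a trial run short. No output is shown because the contents of the returned dictionary are not specified beyond the description above.

```python
import tempfile

import numpy as np
import pandas as pd

rng = np.random.default_rng(123)
n_samples = 60

# Two synthetic omics layers that share the same samples (rows).
genes = pd.DataFrame(rng.normal(size=(n_samples, 30)),
                     columns=[f"gene_{i}" for i in range(30)])
mirna = pd.DataFrame(rng.normal(size=(n_samples, 20)),
                     columns=[f"mir_{i}" for i in range(20)])

# Continuous phenotype, so the CCA mode would be the relevant one.
phenotype = pd.DataFrame(
    {"phenotype": genes["gene_0"] + rng.normal(scale=0.5, size=n_samples)}
)

call_kwargs = dict(
    X=[genes, mirna],
    Y=phenotype,
    DataType=["genes", "mirna"],
    preprocess=True,
    Kfold=3,
    subSampNum=10,
    tuneLength=3,
    saving_dir=tempfile.mkdtemp(),  # explicit directory instead of the default
    seed=123,
    device="cpu",
)


def run_example():
    """Import lazily so the data setup above runs without bioneuralnet installed."""
    from bioneuralnet.network.pysmccnet.pipeline import auto_pysmccnet
    return auto_pysmccnet(**call_kwargs)
```

Calling `run_example()` in an environment with bioneuralnet installed would run the pipeline on this toy data and return the result dictionary.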