bioneuralnet.network.pysmccnet.pipeline

Main SmCCNet pipeline. Supports CCA for continuous phenotypes and PLS for binary phenotypes.

Functions

auto_pysmccnet(X, Y[, AdjustedCovar, ...])

Automated SmCCNet workflow with GPU acceleration.

data_preprocess(X[, covariates, is_cv, ...])

PyTorch version of data_preprocess for omics dataset preparation.

get_abar(ws[, feature_label])

PyTorch equivalent of get_abar performing matrix multiplication on GPU.
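As a rough illustration of what a matrix-multiplication form of this step can look like, the sketch below follows the scheme used by the original R SmCCNet `getAbar`: sum the outer products of the absolute canonical weights across subsamples, zero the diagonal, and scale by the maximum entry. It uses NumPy rather than PyTorch, `abar_sketch` is a hypothetical name, and the actual `get_abar` implementation may differ in its details.

```python
import numpy as np

def abar_sketch(ws):
    """Hypothetical helper: similarity matrix from a (features x subsamples) weight matrix."""
    w = np.abs(np.asarray(ws, dtype=float))
    abar = w @ w.T               # equals the sum over subsamples of |w| outer |w|
    np.fill_diagonal(abar, 0.0)  # drop self-similarity
    peak = abar.max()
    return abar / peak if peak > 0 else abar

# Three features, two subsampled weight vectors (toy values).
ws = np.array([[0.5, 0.0],
               [-0.5, 1.0],
               [0.0, -1.0]])
abar = abar_sketch(ws)
```

The result is symmetric with entries in [0, 1], which is the form a downstream clustering step on a similarity matrix would expect.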

get_can_cor_multi(X, cc_coef, cc_weight, Y)

PyTorch version of get_can_cor_multi calculating canonical correlation value on GPU.

get_can_weights_multi(X[, Trait, Lambda, ...])

PyTorch version of get_can_weights_multi wrapper.

get_logger(name)

Retrieves a global logger configured to write to 'bioneuralnet.log'.

get_omics_modules(Abar[, cut_height])

Extract omics modules via hierarchical clustering on the similarity matrix.

get_robust_weights_multi(X, Trait, Lambda[, ...])

PyTorch version of get_robust_weights_multi with subsampling loop.

get_robust_weights_multi_binary(X, Y[, ...])

PyTorch version of get_robust_weights_multi_binary using hybrid GPU/CPU execution.

prune_modules(Abar, X_combined, Y, modules, ...)

Prune network modules to target size range and compute summarization scores.

r_scale(x)

NumPy scaling using sample standard deviation.
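"Sample standard deviation" means the divisor is n - 1 rather than n, which in NumPy corresponds to `ddof=1`. A minimal sketch of that idea follows, assuming column-wise centering in the style of R's `scale()`; `scale_sample_sd` is a hypothetical name and not the library's `r_scale`, whose handling of centering and edge cases is not documented here.

```python
import numpy as np

def scale_sample_sd(x):
    """Hypothetical helper: center each column, then divide by its sample SD (ddof=1)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    return centered / x.std(axis=0, ddof=1)

x = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])
scaled = scale_sample_sd(x)
```

Note that NumPy's default `std` uses `ddof=0`, so omitting the argument gives slightly larger scaled values on small samples.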

r_scale_torch(x)

PyTorch scaling using sample standard deviation.

Classes

tqdm(*_, **__)

Decorate an iterable object, returning an iterator which acts exactly like the original iterable, but prints a dynamically updating progressbar every time a value is requested.

bioneuralnet.network.pysmccnet.pipeline.auto_pysmccnet(X: List[DataFrame | ndarray], Y: DataFrame | ndarray, AdjustedCovar: DataFrame | None = None, preprocess: bool = False, Kfold: int = 5, subSampNum: int = 100, DataType: List[str] | None = None, BetweenShrinkage: float = 2.0, ScalingPen: List[float] = [0.1, 0.1], saving_dir: str = '/home/docs/checkouts/readthedocs.org/user_builds/bioneuralnet/checkouts/latest/docs/source', tuneLength: int = 5, tuneRangeCCA: List[float] = [0.1, 0.5], tuneRangePLS: List[float] = [0.5, 0.9], EvalMethod: str = 'accuracy', ncomp_pls: int = 3, seed: int = 123, CutHeight: float = 0.9999999999, min_size: int = 10, max_size: int = 100, summarization: str = 'NetSHy', precomputed_fold_data: dict | None = None, device: torch.device | None = 'cpu', dtype: torch.dtype = torch.float64, rename: bool = True) → dict

Automated SmCCNet workflow with GPU acceleration.

Runs the complete SmCCNet pipeline supporting both CCA (continuous phenotype) and PLS (binary phenotype) modes. The workflow includes optional preprocessing, cross-validation for penalty tuning, subsampling for stability selection, and final network construction.
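The cross-validation and subsampling stages mentioned above both come down to drawing reproducible index sets. The sketch below shows one generic way to do that with the standard library. The helper names and the `fraction` argument are hypothetical, and whether the pipeline subsamples features or samples, and in what proportion, is not stated in this reference.

```python
import random

def kfold_indices(n_samples, k, seed):
    """Hypothetical helper: shuffle sample indices and deal them into k folds."""
    rng = random.Random(seed)
    idx = list(range(n_samples))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def subsample_indices(n_items, fraction, n_rounds, seed):
    """Hypothetical helper: draw n_rounds random subsets without replacement."""
    rng = random.Random(seed)
    size = max(1, round(n_items * fraction))
    return [sorted(rng.sample(range(n_items), size)) for _ in range(n_rounds)]

folds = kfold_indices(n_samples=20, k=5, seed=123)
subsamples = subsample_indices(n_items=50, fraction=0.7, n_rounds=3, seed=123)
```

Seeding a local `random.Random` instance keeps the draws reproducible without touching global random state, which mirrors the role of the `seed` parameter.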

Parameters:
  • X (List[pd.DataFrame | np.ndarray]) – Input data matrices (omics layers) for integration.

  • Y (pd.DataFrame | np.ndarray) – Phenotype vector; numeric for CCA or binary (0/1) for PLS.

  • AdjustedCovar (pd.DataFrame | None) – Optional covariates to regress out from X before analysis.

  • preprocess (bool) – If True, center and scale data; if False, use raw input.

  • Kfold (int) – Number of cross-validation folds for penalty parameter tuning.

  • subSampNum (int) – Number of subsampling iterations for stability selection.

  • DataType (List[str] | None) – Names for each omics layer in X; defaults to generic names if None.

  • BetweenShrinkage (float) – Shrinkage factor for between-omics scaling weights.

  • ScalingPen (List[float]) – Penalty terms used for determining scaling factors.

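  • saving_dir (str) – Directory path for saving output results. The default shown in the signature above was captured when the documentation was built and will differ on other machines.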

  • tuneLength (int) – Number of candidate penalty parameters to test per omics layer.

  • tuneRangeCCA (List[float]) – Min and max penalty values for CCA (continuous phenotype).

  • tuneRangePLS (List[float]) – Min and max penalty values for PLS (binary phenotype).

  • EvalMethod (str) – Metric for PLS evaluation; one of ‘accuracy’, ‘auc’, ‘precision’, ‘recall’, or ‘f1’.

  • ncomp_pls (int) – Number of latent components to use for PLS models.

  • CutHeight (float) – Height threshold for hierarchical tree cutting in module extraction.

  • min_size (int) – Minimum number of nodes a network module must contain to be retained.

  • max_size (int) – Maximum module size; larger modules are pruned down.

  • summarization (str) – Network summarization method. Currently only ‘NetSHy’ is supported.

  • seed (int) – Random seed for reproducibility.

  • precomputed_fold_data (dict | None) – Precomputed CV folds to bypass internal fold generation.

  • device (torch.device | None) – PyTorch device; defaults to 'cpu'. If None, a GPU is selected automatically when available.

  • dtype (torch.dtype) – PyTorch data type for computations.

  • rename (bool) – If True, prefix datatype to column names; if False, use original column names.
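To make the interaction between `tuneLength`, `tuneRangeCCA`/`tuneRangePLS`, and the number of omics layers concrete, the sketch below builds a candidate penalty grid. It assumes evenly spaced candidates between the minimum and maximum and a full Cartesian product across layers; both are assumptions for illustration, and the helper names are hypothetical rather than part of the library.

```python
from itertools import product

def penalty_candidates(tune_range, tune_length):
    """Hypothetical helper: evenly spaced penalties from min to max, inclusive."""
    lo, hi = tune_range
    if tune_length == 1:
        return [lo]
    step = (hi - lo) / (tune_length - 1)
    # Rounding only tidies floating-point noise such as 0.30000000000000004.
    return [round(lo + i * step, 10) for i in range(tune_length)]

def penalty_grid(tune_range, tune_length, n_layers):
    """Hypothetical helper: every combination of candidates, one value per omics layer."""
    candidates = penalty_candidates(tune_range, tune_length)
    return list(product(candidates, repeat=n_layers))

candidates = penalty_candidates([0.1, 0.5], 5)  # default CCA range and tuneLength
grid = penalty_grid([0.1, 0.5], 5, n_layers=2)
```

Under these assumptions the grid grows as `tuneLength ** n_layers`, so with cross-validation on top the tuning cost rises quickly as layers are added.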

Returns:

Dictionary containing results for ‘CCA’ or ‘PLS’ including adjacency matrices, processed data, and CV results.

Return type:

dict
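A call on synthetic data might look like the sketch below. The argument names come from the signature above; the toy data, the layer names, the 1-D shape of `Y`, and the wrapper functions are assumptions made for illustration. The import is placed inside the wrapper so the data-generation part runs even where bioneuralnet is not installed.

```python
import numpy as np

def make_toy_inputs(n_samples=40, seed=123):
    """Hypothetical helper: two random omics layers and a continuous phenotype."""
    rng = np.random.default_rng(seed)
    X = [rng.normal(size=(n_samples, 30)), rng.normal(size=(n_samples, 20))]
    Y = rng.normal(size=n_samples)  # continuous phenotype, so the CCA mode applies
    return X, Y

def run_auto_pysmccnet(X, Y, out_dir):
    """Hypothetical wrapper around the documented entry point."""
    # Imported lazily so the rest of this sketch runs without bioneuralnet.
    from bioneuralnet.network.pysmccnet.pipeline import auto_pysmccnet
    return auto_pysmccnet(
        X,
        Y,
        DataType=["omics1", "omics2"],  # illustrative layer names
        preprocess=True,
        Kfold=5,
        subSampNum=100,
        saving_dir=out_dir,             # pass an explicit output directory
        seed=123,
        device=None,                    # per the docs, None auto-selects a GPU if available
    )

X, Y = make_toy_inputs()
# result = run_auto_pysmccnet(X, Y, out_dir="smccnet_output")
```

The exact keys of the returned dictionary are not listed in this reference, so inspecting `result.keys()` after a run is the safest way to find the adjacency matrices and CV results.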