Network Construction & Analysis

The network module provides tools for constructing multi-omics networks from raw tabular data and analyzing their topology.

from bioneuralnet.network import (
    similarity_network,
    correlation_network,
    threshold_network,
    gaussian_knn_network,
    NetworkAnalyzer,
    network_search,
    auto_pysmccnet,
)

All construction functions accept a pandas.DataFrame of shape (n_samples, n_features) and return a weighted adjacency matrix of shape (n_features, n_features). All methods are GPU-accelerated via PyTorch when CUDA is available.

Network Construction

  • bioneuralnet.network.similarity_network(): Builds a k-NN similarity graph using cosine similarity or a Gaussian kernel on Euclidean distances.

    from bioneuralnet.network import similarity_network
    net = similarity_network(X, k=15, metric="cosine", mutual=False, normalize=True)
    

    Parameters: k (neighbors per node), metric ("cosine" or "euclidean"), mutual (restrict to mutual kNN edges), per_node (per-node vs global cutoff), self_loops, normalize (row-normalize).

  • bioneuralnet.network.correlation_network(): Builds a correlation-based graph using Pearson or Spearman correlation mapped to [0,1] via (C + 1) / 2 (signed) or |C| (unsigned).

    from bioneuralnet.network import correlation_network
    net = correlation_network(X, k=15, method="spearman", signed=True, normalize=True)
    

    Parameters: k, method ("pearson" or "spearman"), signed, mutual, per_node, threshold (overrides k in global cutoff mode), self_loops, normalize.

  • bioneuralnet.network.threshold_network(): Builds a soft-threshold co-expression graph by raising absolute Pearson correlations to power b, then applying a kNN mask.

    from bioneuralnet.network import threshold_network
    net = threshold_network(X, b=6.3, k=22, mutual=False, normalize=True)
    

    Parameters: b (soft-threshold exponent), k, mutual, self_loops, normalize.

  • bioneuralnet.network.gaussian_knn_network(): Builds a Gaussian RBF k-NN graph from pairwise Euclidean distances. When sigma=None, a median squared distance heuristic is used. Self-loops are enabled by default.

    from bioneuralnet.network import gaussian_knn_network
    net = gaussian_knn_network(X, k=15, sigma=None, mutual=False, normalize=True)
    

    Parameters: k, sigma (RBF bandwidth; None for median heuristic), mutual, self_loops, normalize.

Phenotype-Driven Construction

  • bioneuralnet.network.auto_pysmccnet(): Constructs a phenotype-specific multi-omics network using Sparse Multiple Canonical Correlation Analysis (SmCCNet 2.0). Automatically selects CCA mode for continuous phenotypes and PLS mode for binary phenotypes. R and the SmCCNet CRAN package are required.

    from bioneuralnet.network import auto_pysmccnet
    
    result = auto_pysmccnet(
        X=[omics1, omics2],
        Y=phenotype,
        DataType=["genes", "mirna"],
        subSampNum=1000,
        Kfold=3,
        BetweenShrinkage=5,
        CutHeight=1 - 0.1**10,
        summarization="NetSHy",
    )
    global_network = result["AdjacencyMatrix"]
    subnetworks    = result["Subnetworks"]
    

    Returns a dict with keys: "AdjacencyMatrix", "Data", "CVResult", "Subnetworks".

    Key parameters: X (list of omics DataFrames), Y (phenotype), DataType (layer names), subSampNum (subsampling iterations), Kfold, BetweenShrinkage (between-omics scaling shrinkage), tuneRangeCCA, tuneRangePLS, EvalMethod, CutHeight, min_size, max_size, summarization.

    For full parameter reference see the SmCCNet documentation.

Network Quality Assessment

  • bioneuralnet.network.NetworkAnalyzer: GPU-accelerated network topology analysis class. Auto-detects omics types from feature name prefixes (e.g., genes_Gene_7 resolves to genes). Pass source_omics to override with original DataFrames.

    from bioneuralnet.network import NetworkAnalyzer
    
    analyzer = NetworkAnalyzer(adjacency_matrix=net)
    analyzer.edge_weight_analysis()
    analyzer.basic_statistics(threshold=0.1)
    analyzer.hub_analysis(threshold=0.1, top_n=10)
    analyzer.cross_omics_analysis(threshold=0.1)
    _ = analyzer.find_strongest_edges(top_n=10)
    

    Methods:

    • edge_weight_analysis(): Returns an array of all non-zero edge weights and logs distribution statistics and percentiles.

    • basic_statistics(threshold): Returns a dict with node count, edge count, density, average/max/min degree, and isolated node count.

    • hub_analysis(threshold, top_n): Returns a DataFrame of the top-N nodes by degree with omics type assignments.

    • cross_omics_analysis(threshold): Returns a dict of within-layer and between-layer edge counts and density for each omics pair. Requires dtype=torch.long for index tensors.

    • find_strongest_edges(top_n): Returns a DataFrame of the top-N highest-weight feature pairs with omics assignments.

    • degree_distribution(threshold): Returns a DataFrame of degree, count, and percentage across all nodes.

    • clustering_coefficient_gpu(threshold, sample_size): Computes local clustering coefficients using GPU matrix operations; samples up to 5,000 nodes on large graphs.

    • connected_components(threshold): Returns component count, node label assignments, and component size distribution via scipy sparse BFS.