bioneuralnet.network.tools

Functions

`connected_components`(csgraph[, directed, ...])	Analyze the connected components of a sparse graph.
`correlation_network`(X[, k, method, signed, ...])	Build a correlation-based graph from feature vectors with optional kNN sparsification.
`cross_val_score`(estimator, X[, y, groups, ...])	Evaluate a score by cross-validation.
`gaussian_knn_network`(X[, k, sigma, mutual, ...])	Build a Gaussian (RBF) kNN similarity graph from feature vectors.
`get_logger`(name)	Retrieves a global logger configured to write to 'bioneuralnet.log'.
`network_search`(omics_data, y_labels[, ...])	Search over graph-construction hyperparameters using a structural proxy.
`similarity_network`(X[, k, metric, mutual, ...])	Build a k-nearest neighbors similarity graph from feature vectors.
`threshold_network`(X[, b, k, mutual, ...])	Build a soft-thresholded kNN co-expression graph, similar to WGCNA-style networks.

Classes

`NetworkAnalyzer`(adjacency_matrix[, ...])	Performs GPU-accelerated network analysis.
`ParameterGrid`(param_grid)	Grid of parameters with a discrete number of values for each.
`RidgeClassifier`([alpha, fit_intercept, ...])	Classifier using Ridge regression.
`StandardScaler`(*[, copy, with_mean, with_std])	Standardize features by removing the mean and scaling to unit variance.
`StratifiedKFold`([n_splits, shuffle, ...])	Class-wise stratified K-Fold cross-validator.
`csr_matrix`(arg1[, shape, dtype, copy, maxprint])	Compressed Sparse Row matrix.

class bioneuralnet.network.tools.NetworkAnalyzer(adjacency_matrix: DataFrame, source_omics: list | None = None, device: str = 'cuda')

Bases: object

Performs GPU-accelerated network analysis.

This class leverages PyTorch tensors to speed up graph statistics, clustering computations, and edge analysis for large-scale omics networks.

Parameters:

adjacency_matrix (pd.DataFrame) – The input weighted adjacency matrix representing network connections.
source_omics (list) – Optional list of the original DataFrames used to build the network; when given, it is used to assign an omics type to each feature.
device (str) – The target computing device, defaulting to ‘cuda’ if available.

basic_statistics(threshold: float = 0.5) → Dict[str, float | int | ndarray]

Computes fundamental graph metrics including density, degree statistics, and node isolation counts.

This provides a high-level overview of the network topology and connectivity at a specific threshold.

Parameters: threshold (float) – The threshold used to binarize the network before analysis.
Returns: A dictionary containing node count, edge count, density, average/max/min degree, and isolated node count.
Return type: dict
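The metrics listed above can be illustrated with a minimal, standard-library sketch. It is not the library's GPU implementation: the function name `basic_stats`, the dictionary keys, the strict `>` comparison, and the choice to ignore the diagonal are all assumptions made for illustration.

```python
def basic_stats(weights, threshold=0.5):
    """Summarize an undirected graph obtained by thresholding `weights`."""
    n = len(weights)
    # Binarize; the diagonal (self-loops) is ignored, an assumption of this sketch.
    adj = [
        [1 if i != j and weights[i][j] > threshold else 0 for j in range(n)]
        for i in range(n)
    ]
    degrees = [sum(row) for row in adj]
    num_edges = sum(degrees) // 2          # each undirected edge is counted twice
    max_edges = n * (n - 1) // 2           # edges in a complete graph on n nodes
    return {
        "num_nodes": n,
        "num_edges": num_edges,
        "density": num_edges / max_edges if max_edges else 0.0,
        "avg_degree": sum(degrees) / n if n else 0.0,
        "max_degree": max(degrees, default=0),
        "min_degree": min(degrees, default=0),
        "isolated_nodes": sum(1 for d in degrees if d == 0),
    }


# Toy symmetric weight matrix: only edges 0-1 and 1-2 exceed 0.5.
weights = [
    [0.0, 0.9, 0.3, 0.1],
    [0.9, 0.0, 0.7, 0.0],
    [0.3, 0.7, 0.0, 0.2],
    [0.1, 0.0, 0.2, 0.0],
]
stats = basic_stats(weights, threshold=0.5)
print(stats)
```

In this toy graph node 3 has no edge above the threshold, so it is counted as isolated.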

clustering_coefficient_gpu(threshold: float = 0.5, sample_size: int | None = None) → Dict[str, float | ndarray]

Computes the local clustering coefficient for nodes using GPU-optimized matrix operations.

This measures the degree to which nodes tend to cluster together, using random sampling for efficiency on large graphs.

Parameters:

threshold (float) – The threshold used to define valid edges.
sample_size (Optional[int]) – The number of nodes to sample for calculation to save memory on massive graphs.

Returns:

Statistics including average, max, and min clustering coefficients, plus raw values and sample indices.

Return type:

dict
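For reference, the standard local clustering coefficient of a node with k neighbors is 2T / (k(k - 1)), where T is the number of edges among those neighbors. The sketch below computes it in plain Python with optional node sampling. The function names, the dictionary keys, the `seed` argument, and the convention of returning 0 for nodes with fewer than two neighbors are assumptions; the library's GPU code may differ.

```python
import random


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = [j for j, a in enumerate(adj[node]) if a and j != node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention assumed here for degree-0 and degree-1 nodes
    links = sum(
        1
        for x in range(k)
        for y in range(x + 1, k)
        if adj[neighbors[x]][neighbors[y]]
    )
    return 2.0 * links / (k * (k - 1))


def sampled_clustering(adj, sample_size=None, seed=0):
    """Clustering statistics over all nodes or a random sample of them."""
    nodes = list(range(len(adj)))
    if sample_size is not None and sample_size < len(nodes):
        nodes = sorted(random.Random(seed).sample(nodes, sample_size))
    values = [local_clustering(adj, i) for i in nodes]
    if not values:
        return {"avg": 0.0, "max": 0.0, "min": 0.0, "values": [], "nodes": []}
    return {
        "avg": sum(values) / len(values),
        "max": max(values),
        "min": min(values),
        "values": values,
        "nodes": nodes,
    }


# Triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
full = sampled_clustering(adj)
partial = sampled_clustering(adj, sample_size=2)
print(full["values"], partial["nodes"])
```

Sampling trades accuracy of the average for lower cost: only the sampled nodes' neighborhoods are inspected.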

connected_components(threshold: float = 0.5) → Dict[str, int | ndarray | List[int]]

Identifies the connected components of the thresholded network, that is, subgraphs with no edges between them, using Breadth-First Search logic.

This computation is performed on the CPU using scipy due to the sequential nature of traversal algorithms.

Parameters: threshold (float) – The threshold used to define connectivity.
Returns: Contains the count of components, label assignments for each node, and a size distribution list.
Return type: dict
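The traversal idea can be sketched with a plain breadth-first search. According to the description above, the method itself delegates to scipy; the pure-Python stand-in below only illustrates the logic, and its function name and dictionary keys are hypothetical.

```python
from collections import deque


def connected_components_bfs(adj):
    """Label each node of an undirected graph with its component index."""
    n = len(adj)
    labels = [-1] * n            # -1 marks a node that has not been visited
    n_components = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = n_components
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and labels[v] == -1:
                    labels[v] = n_components
                    queue.append(v)
        n_components += 1
    sizes = [labels.count(c) for c in range(n_components)]
    return {"n_components": n_components, "labels": labels, "sizes": sizes}


# Edges 0-1 and 2-3; node 4 has no edges.
adj = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
components = connected_components_bfs(adj)
print(components)
```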

cross_omics_analysis(threshold: float = 0.5) → Dict[tuple, Dict]

Quantifies the connectivity density between different omics layers (e.g., RNA vs Protein).

This reveals whether the network structure is driven by within-omics correlations or cross-omics interactions.

Parameters: threshold (float) – The threshold used to count valid edges between features.
Returns: A nested dictionary mapping omics pairs to their edge counts and density statistics.
Return type: dict
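One way to picture the per-pair statistics is to count, for every pair of omics layers, how many feature pairs exist and how many of them are connected. The sketch below assumes a binary adjacency matrix and a per-feature omics label list; the names `cross_omics_density` and `omics_of`, and the keys `edges`, `possible`, and `density`, are hypothetical and need not match the library's output.

```python
def cross_omics_density(adj, omics_of):
    """Edge count and density for every (omics, omics) pair of layers."""
    n = len(adj)
    result = {}
    for i in range(n):
        for j in range(i + 1, n):
            # Sort so that ("rna", "protein") and ("protein", "rna") share a key.
            pair = tuple(sorted((omics_of[i], omics_of[j])))
            entry = result.setdefault(pair, {"edges": 0, "possible": 0})
            entry["possible"] += 1
            if adj[i][j]:
                entry["edges"] += 1
    for entry in result.values():
        entry["density"] = entry["edges"] / entry["possible"]
    return result


omics_of = ["rna", "rna", "protein", "protein"]
# Edges: 0-1 (rna/rna), 0-2 and 1-3 (rna/protein).
adj = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
layers = cross_omics_density(adj, omics_of)
print(layers)
```

Comparing the within-layer densities with the cross-layer density shows which kind of relationship dominates the graph.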

degree_distribution(threshold: float = 0.5) → DataFrame

Calculates the frequency distribution of node degrees across the entire network.

This helps identify whether the degree distribution resembles a scale-free power law or that of a random graph.

Parameters: threshold (float) – The threshold used to binarize the network.
Returns: A DataFrame with columns for degree, count, and percentage of total nodes.
Return type: pd.DataFrame
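The degree, count, and percentage columns can be reproduced on a toy graph with the standard library alone. This is a sketch that returns a list of row dictionaries rather than a DataFrame; the function name and the exact column names are assumptions.

```python
from collections import Counter


def degree_distribution(adj):
    """Rows of (degree, count, percentage) for a binary adjacency matrix."""
    n = len(adj)
    degrees = [
        sum(1 for j, a in enumerate(row) if a and j != i)
        for i, row in enumerate(adj)
    ]
    counts = Counter(degrees)
    return [
        {"degree": d, "count": c, "percentage": 100.0 * c / n}
        for d, c in sorted(counts.items())
    ]


# Star graph: node 0 is connected to nodes 1, 2 and 3.
adj = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
rows = degree_distribution(adj)
print(rows)
```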

edge_weight_analysis() → ndarray | None

Analyzes the statistical distribution of edge weights across the entire network.

This is useful for determining appropriate threshold values and understanding signal strength distribution.

Parameters: None.
Returns: An array of all non-zero edge weights, or None if no edges exist.
Return type: Optional[np.ndarray]
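As an illustration of how the returned weights might inform a threshold choice, the sketch below collects non-zero weights from a toy matrix and summarizes them. It assumes a symmetric matrix and reads only the upper triangle so that each edge appears once; whether the library does the same is not stated here, and the function name is hypothetical.

```python
import statistics


def nonzero_edge_weights(weights):
    """Non-zero weights from the upper triangle, or None if there are none."""
    n = len(weights)
    values = [
        weights[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if weights[i][j] != 0
    ]
    return values or None


weights = [
    [0.0, 0.9, 0.3, 0.1],
    [0.9, 0.0, 0.7, 0.0],
    [0.3, 0.7, 0.0, 0.2],
    [0.1, 0.0, 0.2, 0.0],
]
edge_weights = nonzero_edge_weights(weights)
median_weight = statistics.median(edge_weights)
empty = nonzero_edge_weights([[0.0, 0.0], [0.0, 0.0]])
print(edge_weights, median_weight, empty)
```

A quantile of the weights, such as the median, is one possible starting point for the `threshold` argument of the other methods.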

find_strongest_edges(top_n: int = 50) → DataFrame

Retrieves the strongest edges in the network sorted by weight magnitude.

This isolates the most significant pairwise interactions between features.

Parameters: top_n (int) – The number of top weighted edges to return.
Returns: A DataFrame detailing the top interactions, including feature names and weights.
Return type: pd.DataFrame
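Sorting by weight magnitude means that a strong negative weight outranks a weaker positive one. A minimal sketch of that ranking, assuming a symmetric matrix and returning tuples instead of a DataFrame (the function name and the feature names are hypothetical):

```python
def strongest_edges(weights, names, top_n=50):
    """Top-n edges as (feature_a, feature_b, weight), ranked by |weight|."""
    n = len(weights)
    edges = [
        (names[i], names[j], weights[i][j])
        for i in range(n)
        for j in range(i + 1, n)     # upper triangle: each edge listed once
        if weights[i][j] != 0
    ]
    edges.sort(key=lambda edge: abs(edge[2]), reverse=True)
    return edges[:top_n]


names = ["g1", "g2", "p1"]
weights = [
    [0.0, 0.4, -0.9],
    [0.4, 0.0, 0.6],
    [-0.9, 0.6, 0.0],
]
top_edges = strongest_edges(weights, names, top_n=2)
print(top_edges)
```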

hub_analysis(threshold: float = 0.5, top_n: int = 10) → DataFrame

Identifies and ranks the most highly connected ‘hub’ nodes in the network.

This is critical for finding central regulatory features or bottlenecks in the omics network.

Parameters:

threshold (float) – The threshold used to define network edges.
top_n (int) – The number of top degree nodes to retrieve.

Returns:

A table of the top N nodes including their rank, feature name, omics type, and degree.

Return type:

pd.DataFrame
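Hub ranking amounts to sorting nodes by degree and keeping the first N. The sketch below works on a binary adjacency matrix and returns a list of row dictionaries; the function name, the keys, and the tie-breaking rule (original feature order) are assumptions rather than documented behavior.

```python
def top_hubs(adj, names, omics_of, top_n=10):
    """Rank nodes by degree and return the top_n as row dictionaries."""
    degrees = [
        sum(1 for j, a in enumerate(row) if a and j != i)
        for i, row in enumerate(adj)
    ]
    # sorted() is stable, so ties keep their original feature order.
    order = sorted(range(len(adj)), key=lambda i: degrees[i], reverse=True)
    return [
        {"rank": rank, "feature": names[i], "omics": omics_of[i], "degree": degrees[i]}
        for rank, i in enumerate(order[:top_n], start=1)
    ]


names = ["g1", "g2", "p1", "p2"]
omics_of = ["rna", "rna", "protein", "protein"]
# p1 touches every other node; g1 and g2 are also linked to each other.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
hubs = top_hubs(adj, names, omics_of, top_n=2)
print(hubs)
```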

threshold_network(threshold: float) → torch.Tensor

Generates a binary adjacency matrix by applying a hard threshold to the connection weights.

This converts continuous edge weights into a binary structure suitable for standard graph topology metrics.

Parameters: threshold (float) – The cutoff value above which an edge is considered to exist.
Returns: A binary tensor where 1 indicates an edge and 0 indicates no edge.
Return type: torch.Tensor
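Hard thresholding can be shown on a small matrix using nested lists in place of a tensor. The sketch assumes a strict comparison (`weight > threshold`), following the wording "above which"; whether the library compares raw or absolute weights is not stated, and the function name `binarize` is hypothetical.

```python
def binarize(weights, threshold):
    """Return a 0/1 matrix with 1 wherever the weight exceeds the threshold."""
    return [[1 if w > threshold else 0 for w in row] for row in weights]


weights = [
    [0.0, 0.8, 0.2],
    [0.8, 0.0, 0.5],
    [0.2, 0.5, 0.0],
]
binary = binarize(weights, threshold=0.5)
print(binary)  # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
```

Note that the weight 0.5 is dropped under the strict comparison, because it is not above the cutoff.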

bioneuralnet.network.tools.network_search(omics_data: DataFrame, y_labels, methods: list = ['correlation', 'threshold', 'similarity', 'gaussian'], seed: int = 1883, verbose: bool = True, trials: int | None = None, centrality_mode: str = 'eigenvector', topology_weight: float = 0.15, scoring: str = 'f1_macro') → tuple[DataFrame, dict, DataFrame]

Search over graph-construction hyperparameters using a structural proxy.

Each candidate configuration builds a graph, scores it with a fast centrality-weighted Ridge classifier proxy, and blends that score with a topological quality term (average clustering coefficient) to favour well-connected, informative graphs.

Parameters:

omics_data – Feature matrix of shape (n_samples, n_features).
y_labels – Target labels for stratified CV evaluation.
methods – Graph-construction methods to search over.
seed – Random seed for reproducibility.
verbose – Log per-configuration progress.
trials – Optional cap on evaluated configurations (random subset).
centrality_mode – Centrality used for feature weighting in the proxy; one of "eigenvector" or "degree".
topology_weight – Blending factor in [0, 1] that controls how much the topological quality term contributes to the final score. 0 ignores topology; 1 ignores the proxy score.
scoring – Scoring metric used for the proxy classifier; defaults to "f1_macro".

Returns:

A 3-tuple of (best_graph, best_params, results_df).

Raises:

RuntimeError – If every configuration fails.
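The role of `topology_weight` can be pictured as a convex combination of the two terms. The exact formula is not given above, so the sketch assumes `(1 - w) * proxy + w * clustering`, which matches the stated endpoints (0 ignores topology, 1 ignores the proxy score). The function name, the candidate records, and their numbers are made up for illustration.

```python
def blended_score(proxy_score, clustering, topology_weight=0.15):
    """Convex blend of a proxy score and a topology term (assumed formula)."""
    if not 0.0 <= topology_weight <= 1.0:
        raise ValueError("topology_weight must lie in [0, 1]")
    return (1.0 - topology_weight) * proxy_score + topology_weight * clustering


# Hypothetical results for two candidate graph configurations.
candidates = [
    {"method": "correlation", "proxy": 0.80, "clustering": 0.10},
    {"method": "gaussian", "proxy": 0.78, "clustering": 0.40},
]


def pick_best(candidates, topology_weight):
    """Return the candidate with the highest blended score."""
    return max(
        candidates,
        key=lambda c: blended_score(c["proxy"], c["clustering"], topology_weight),
    )


best_default = pick_best(candidates, topology_weight=0.15)
best_proxy_only = pick_best(candidates, topology_weight=0.0)
print(best_default["method"], best_proxy_only["method"])
```

With the default weight the better-clustered graph wins despite a slightly lower proxy score; with a weight of 0 the ranking falls back to the proxy score alone.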