bioneuralnet.network.tools

Functions

connected_components(csgraph[, directed, ...])

Analyze the connected components of a sparse graph.

correlation_network(X[, k, method, signed, ...])

Build a correlation-based graph from feature vectors with optional kNN sparsification.

cross_val_score(estimator, X[, y, groups, ...])

Evaluate a score by cross-validation.

gaussian_knn_network(X[, k, sigma, mutual, ...])

Build a Gaussian (RBF) kNN similarity graph from feature vectors.

get_logger(name)

Retrieves a global logger configured to write to 'bioneuralnet.log'.

network_search(omics_data, y_labels[, ...])

Search over graph-construction hyperparameters using a structural proxy.

similarity_network(X[, k, metric, mutual, ...])

Build a k-nearest neighbors similarity graph from feature vectors.

threshold_network(X[, b, k, mutual, ...])

Build a soft-thresholded kNN co-expression graph, similar to WGCNA-style networks.

Classes

NetworkAnalyzer(adjacency_matrix[, ...])

Performs GPU-accelerated network analysis.

ParameterGrid(param_grid)

Grid of parameters with a discrete number of values for each.

RidgeClassifier([alpha, fit_intercept, ...])

Classifier using Ridge regression.

StandardScaler(*[, copy, with_mean, with_std])

Standardize features by removing the mean and scaling to unit variance.

StratifiedKFold([n_splits, shuffle, ...])

Class-wise stratified K-Fold cross-validator.

csr_matrix(arg1[, shape, dtype, copy, maxprint])

Compressed Sparse Row matrix.

class bioneuralnet.network.tools.NetworkAnalyzer(adjacency_matrix: DataFrame, source_omics: list | None = None, device: str = 'cuda')

Bases: object

Performs GPU-accelerated network analysis.

This class leverages PyTorch tensors to speed up graph statistics, clustering computations, and edge analysis for large-scale omics networks.

Parameters:
  • adjacency_matrix (pd.DataFrame) – The input weighted adjacency matrix representing network connections.

  • source_omics (list) – Optional list of the original DataFrames used to build the network. When provided, it is used to assign an omics type to each node.

  • device (str) – The target computing device, defaulting to ‘cuda’ if available.

basic_statistics(threshold: float = 0.5) -> Dict[str, float | int | ndarray]

Computes fundamental graph metrics including density, degree statistics, and node isolation counts.

This provides a high-level overview of the network topology and connectivity at a specific threshold.

Parameters:

threshold (float) – The threshold used to binarize the network before analysis.

Returns:

A dictionary containing node count, edge count, density, average/max/min degree, and isolated node count.

Return type:

dict
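The metrics listed above can be sketched with plain NumPy. This is only an illustration of the definitions, not the library's GPU implementation: the function name `basic_stats_sketch`, the dictionary keys, the strict `>` comparison, and the exclusion of self-loops are all assumptions.

```python
import numpy as np

def basic_stats_sketch(weights, threshold=0.5):
    """Illustrative only; not the library's implementation."""
    w = np.asarray(weights, dtype=float)
    adj = w > threshold            # assumption: strict ">" on raw weights
    np.fill_diagonal(adj, False)   # assumption: self-loops are ignored
    degree = adj.sum(axis=1)
    n = adj.shape[0]
    n_edges = int(adj.sum()) // 2  # a symmetric matrix stores each edge twice
    possible = n * (n - 1) // 2    # number of node pairs in an undirected graph
    return {
        "num_nodes": n,
        "num_edges": n_edges,
        "density": n_edges / possible if possible else 0.0,
        "avg_degree": float(degree.mean()),
        "max_degree": int(degree.max()),
        "min_degree": int(degree.min()),
        "isolated_nodes": int((degree == 0).sum()),
    }

# Small symmetric example: a strong triangle, one weak edge, one unconnected node.
W = np.array([
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.9, 0.0, 0.7, 0.2, 0.0],
    [0.8, 0.7, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
])
stats = basic_stats_sketch(W, threshold=0.5)
print(stats)
```

At a threshold of 0.5 the weak edge (weight 0.2) is dropped, so two of the five nodes end up isolated.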

clustering_coefficient_gpu(threshold: float = 0.5, sample_size: int | None = None) -> Dict[str, float | ndarray]

Computes the local clustering coefficient for nodes using GPU-optimized matrix operations.

This measures the degree to which nodes tend to cluster together, using random sampling for efficiency on large graphs.

Parameters:
  • threshold (float) – The threshold used to define valid edges.

  • sample_size (Optional[int]) – The number of nodes to sample for calculation to save memory on massive graphs.

Returns:

Statistics including average, max, and min clustering coefficients, plus raw values and sample indices.

Return type:

dict
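One common matrix formulation of the local clustering coefficient for an undirected binary graph is C_i = (A^3)_ii / (k_i (k_i - 1)), where k_i is the degree of node i. The sketch below applies it to an optional random sample of nodes using NumPy on the CPU. Whether the library uses this exact formulation is an assumption, and `local_clustering_sketch` and its `seed` argument are hypothetical names.

```python
import numpy as np

def local_clustering_sketch(adj, sample_size=None, seed=0):
    """Illustrative only: C_i = (A^3)_ii / (k_i * (k_i - 1))."""
    a = np.asarray(adj, dtype=float)
    n = a.shape[0]
    if sample_size is None or sample_size >= n:
        idx = np.arange(n)
    else:
        rng = np.random.default_rng(seed)
        idx = rng.choice(n, size=sample_size, replace=False)
    rows = a[idx]
    # For a symmetric A, sum_k (A[i] @ A)_k * A[i, k] equals (A^3)_ii,
    # so only the sampled rows need the matrix product.
    closed_walks = ((rows @ a) * rows).sum(axis=1)
    k = rows.sum(axis=1)
    denom = k * (k - 1)
    # Nodes with degree < 2 cannot form triangles; report 0 for them.
    coeff = np.divide(closed_walks, denom,
                      out=np.zeros_like(closed_walks), where=denom > 0)
    return idx, coeff

# Triangle (0, 1, 2) with a pendant node 3 attached to node 1.
adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
])
idx, coeff = local_clustering_sketch(adj)
print(coeff)
```

Sampling only changes which rows are evaluated; each sampled node's coefficient is still computed against the full graph.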

connected_components(threshold: float = 0.5) -> Dict[str, int | ndarray | List[int]]

Identifies the connected components of the thresholded network (subgraphs with no edges between them) using breadth-first-search logic.

This computation is performed on the CPU using scipy due to the sequential nature of traversal algorithms.

Parameters:

threshold (float) – The threshold used to define connectivity.

Returns:

A dictionary containing the number of components, the component label assigned to each node, and a list of component sizes.

Return type:

dict
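As an illustration of the traversal described above, here is a minimal breadth-first labelling in pure Python and NumPy. It is a stand-in for the scipy routine the method relies on, not a copy of it; `components_sketch`, the strict `>` comparison, and the undirected (symmetric) input are assumptions.

```python
from collections import deque
import numpy as np

def components_sketch(weights, threshold=0.5):
    """Illustrative BFS labelling of an undirected thresholded graph."""
    adj = np.asarray(weights, dtype=float) > threshold
    n = adj.shape[0]
    labels = np.full(n, -1, dtype=int)   # -1 marks "not visited yet"
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in np.flatnonzero(adj[node]):
                if labels[neighbour] == -1:
                    labels[neighbour] = current
                    queue.append(neighbour)
        current += 1
    sizes = np.bincount(labels).tolist()  # size of each component, by label
    return current, labels, sizes

W = np.array([
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.9, 0.0, 0.7, 0.2, 0.0],
    [0.8, 0.7, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
])
n_components, labels, sizes = components_sketch(W, threshold=0.5)
print(n_components, labels, sizes)
```

Isolated nodes each form their own single-node component, which is why the size distribution is useful alongside the raw component count.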

cross_omics_analysis(threshold: float = 0.5) -> Dict[tuple, Dict]

Quantifies the connectivity density between different omics layers (e.g., RNA vs Protein).

This reveals whether the network structure is driven by within-omics correlations or cross-omics interactions.

Parameters:

threshold (float) – The threshold used to count valid edges between features.

Returns:

A nested dictionary mapping omics pairs to their edge counts and density statistics.

Return type:

dict
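To make the idea of per-pair density concrete, the sketch below counts thresholded edges inside and between omics layers and divides by the number of possible node pairs. The `layer_of` argument (one layer label per node), the key names `"edges"` and `"density"`, and the density definition are assumptions for illustration; the library derives omics types from `source_omics` in a way not documented here.

```python
from itertools import combinations_with_replacement
import numpy as np

def cross_layer_density_sketch(weights, layer_of, threshold=0.5):
    """Illustrative only: edge counts and densities per pair of layers."""
    adj = np.asarray(weights, dtype=float) > threshold
    layer_of = np.asarray(layer_of)
    layers = sorted(set(layer_of.tolist()))
    out = {}
    for a, b in combinations_with_replacement(layers, 2):
        rows = np.flatnonzero(layer_of == a)
        cols = np.flatnonzero(layer_of == b)
        block = adj[np.ix_(rows, cols)]
        if a == b:
            # Within a layer: count each undirected edge once, skip the diagonal.
            edges = int(np.triu(block, k=1).sum())
            possible = len(rows) * (len(rows) - 1) // 2
        else:
            edges = int(block.sum())
            possible = len(rows) * len(cols)
        out[(a, b)] = {
            "edges": edges,
            "density": edges / possible if possible else 0.0,
        }
    return out

W = np.array([
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.9, 0.0, 0.7, 0.2, 0.0],
    [0.8, 0.7, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
])
layer_of = ["rna", "rna", "protein", "protein", "protein"]
result = cross_layer_density_sketch(W, layer_of, threshold=0.5)
print(result)
```

Comparing the within-layer entries with the cross-layer entry shows whether edges concentrate inside a single omics type.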

degree_distribution(threshold: float = 0.5) -> DataFrame

Calculates the frequency distribution of node degrees across the entire network.

This helps assess whether the degree distribution is heavy-tailed, as in a scale-free (power-law) network, or concentrated around a typical value, as in a random graph.

Parameters:

threshold (float) – The threshold used to binarize the network.

Returns:

A DataFrame with columns for degree, count, and percentage of total nodes.

Return type:

pd.DataFrame
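The three columns described above can be sketched with NumPy as parallel arrays. The function name `degree_distribution_sketch`, the strict `>` comparison, and the exclusion of self-loops are assumptions; the method itself returns a DataFrame rather than a tuple of arrays.

```python
import numpy as np

def degree_distribution_sketch(weights, threshold=0.5):
    """Illustrative only: (degree, count, percentage) as parallel arrays."""
    adj = np.asarray(weights, dtype=float) > threshold
    np.fill_diagonal(adj, False)           # assumption: self-loops are ignored
    degree = adj.sum(axis=1)
    values, counts = np.unique(degree, return_counts=True)
    percentage = 100.0 * counts / degree.size
    return values, counts, percentage

W = np.array([
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.9, 0.0, 0.7, 0.2, 0.0],
    [0.8, 0.7, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
])
values, counts, percentage = degree_distribution_sketch(W, threshold=0.5)
for d, c, p in zip(values, counts, percentage):
    print(f"degree={d} count={c} percentage={p:.1f}")
```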

edge_weight_analysis() -> ndarray | None

Analyzes the statistical distribution of edge weights across the entire network.

This is useful for determining appropriate threshold values and understanding signal strength distribution.

Parameters:

None.

Returns:

An array of all non-zero edge weights, or None if no edges exist.

Return type:

Optional[np.ndarray]
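A rough sketch of how the returned weights might inform a threshold choice: collect the non-zero weights and inspect their quantiles. Counting each undirected edge once (upper triangle only) is an assumption, as is the helper name `nonzero_edge_weights_sketch`.

```python
import numpy as np

def nonzero_edge_weights_sketch(weights):
    """Illustrative only: non-zero upper-triangle weights, or None if empty."""
    w = np.asarray(weights, dtype=float)
    vals = w[np.triu_indices_from(w, k=1)]   # each undirected edge once
    vals = vals[vals != 0]
    return vals if vals.size else None

W = np.array([
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.9, 0.0, 0.7, 0.2, 0.0],
    [0.8, 0.7, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
])
vals = nonzero_edge_weights_sketch(W)
if vals is not None:
    # Quantiles give candidate cutoffs; e.g. the median keeps the stronger half.
    median_weight = float(np.quantile(vals, 0.5))
    print(vals, median_weight)
```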

find_strongest_edges(top_n: int = 50) -> DataFrame

Retrieves the strongest edges in the network sorted by weight magnitude.

This isolates the most significant pairwise interactions between features.

Parameters:

top_n (int) – The number of top weighted edges to return.

Returns:

A DataFrame detailing the top interactions, including feature names and weights.

Return type:

pd.DataFrame
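Ranking by weight magnitude means a strong negative weight outranks a weaker positive one. The sketch below illustrates that ordering with NumPy; the helper name `strongest_edges_sketch`, the `names` argument, and the tuple output are hypothetical, and the method itself returns a DataFrame.

```python
import numpy as np

def strongest_edges_sketch(weights, names, top_n=50):
    """Illustrative only: top edges by absolute weight, each edge once."""
    w = np.asarray(weights, dtype=float)
    i, j = np.triu_indices_from(w, k=1)
    vals = w[i, j]
    keep = vals != 0
    i, j, vals = i[keep], j[keep], vals[keep]
    # Sort by descending magnitude; a stable sort keeps ties in index order.
    order = np.argsort(-np.abs(vals), kind="stable")[:top_n]
    return [(names[i[k]], names[j[k]], float(vals[k])) for k in order]

S = np.array([
    [0.00, -0.95, 0.80, 0.00],
    [-0.95, 0.00, 0.70, 0.20],
    [0.80, 0.70, 0.00, 0.00],
    [0.00, 0.20, 0.00, 0.00],
])
names = ["g0", "g1", "g2", "g3"]
top_edges = strongest_edges_sketch(S, names, top_n=2)
print(top_edges)
```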

hub_analysis(threshold: float = 0.5, top_n: int = 10) -> DataFrame

Identifies and ranks the most highly connected ‘hub’ nodes in the network.

This is critical for finding central regulatory features or bottlenecks in the omics network.

Parameters:
  • threshold (float) – The threshold used to define network edges.

  • top_n (int) – The number of top degree nodes to retrieve.

Returns:

A table of the top N nodes including their rank, feature name, omics type, and degree.

Return type:

pd.DataFrame
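Hub ranking by degree can be sketched as follows. This omits the omics-type column and returns plain tuples instead of a DataFrame; `top_hubs_sketch`, the `names` argument, and the tie-breaking rule (index order) are assumptions.

```python
import numpy as np

def top_hubs_sketch(weights, names, threshold=0.5, top_n=10):
    """Illustrative only: (rank, feature name, degree) for the top nodes."""
    adj = np.asarray(weights, dtype=float) > threshold
    np.fill_diagonal(adj, False)          # assumption: self-loops are ignored
    degree = adj.sum(axis=1)
    # Descending degree; the stable sort breaks ties by node index.
    order = np.argsort(-degree, kind="stable")[:top_n]
    return [(rank + 1, names[i], int(degree[i])) for rank, i in enumerate(order)]

H = np.array([
    [0.0, 0.9, 0.8, 0.6],
    [0.9, 0.0, 0.7, 0.0],
    [0.8, 0.7, 0.0, 0.0],
    [0.6, 0.0, 0.0, 0.0],
])
names = ["f0", "f1", "f2", "f3"]
hubs = top_hubs_sketch(H, names, threshold=0.5, top_n=2)
print(hubs)
```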

threshold_network(threshold: float) -> torch.Tensor

Generates a binary adjacency matrix by applying a hard threshold to the connection weights.

This converts continuous edge weights into a binary structure suitable for standard graph topology metrics.

Parameters:

threshold (float) – The cutoff value above which an edge is considered to exist.

Returns:

A binary tensor where 1 indicates an edge and 0 indicates no edge.

Return type:

torch.Tensor
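The effect of the cutoff is easiest to see by counting the edges that survive at several thresholds. This NumPy sketch stands in for the tensor operation; the strict `>` on raw (not absolute) weights and the name `binarize_sketch` are assumptions.

```python
import numpy as np

def binarize_sketch(weights, threshold):
    """Illustrative only: 1.0 where weight > threshold, else 0.0."""
    w = np.asarray(weights, dtype=float)
    return (w > threshold).astype(np.float32)

W = np.array([
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.9, 0.0, 0.7, 0.2, 0.0],
    [0.8, 0.7, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
])
# A symmetric matrix stores each undirected edge twice, hence the // 2.
edge_counts = {t: int(binarize_sketch(W, t).sum()) // 2 for t in (0.1, 0.5, 0.85)}
print(edge_counts)
```

Raising the threshold monotonically removes edges, so every threshold-dependent method above sees a sparser graph at higher cutoffs.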

bioneuralnet.network.tools.network_search(omics_data, y_labels[, ...])

Search over graph-construction hyperparameters using a structural proxy.

Each candidate configuration builds a graph, scores it with a fast centrality-weighted Ridge classifier proxy, and blends that score with a topological quality term (average clustering coefficient) to favour well-connected, informative graphs.

Parameters:
  • omics_data – Feature matrix of shape (n_samples, n_features).

  • y_labels – Target labels for stratified CV evaluation.

  • methods – Graph-construction methods to search over.

  • seed – Random seed for reproducibility.

  • verbose – Log per-configuration progress.

  • trials – Optional cap on evaluated configurations (random subset).

  • centrality_mode – Centrality used for feature weighting in the proxy; one of "eigenvector" or "degree".

  • topology_weight – Blending factor in [0, 1] that controls how much the topological quality term contributes to the final score. 0 ignores topology; 1 ignores the proxy F1.

Returns:

A 3-tuple of (best_graph, best_params, results_df).

Raises:

RuntimeError – If every configuration fails.
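The description of topology_weight fixes only the two endpoints (0 ignores topology, 1 ignores the proxy F1). A linear blend is the simplest form consistent with that, and the sketch below assumes it; the actual formula used by network_search may differ, and `blended_score_sketch` is a hypothetical name.

```python
def blended_score_sketch(proxy_f1, avg_clustering, topology_weight):
    """Assumed linear blend of the proxy F1 and the topology term."""
    if not 0.0 <= topology_weight <= 1.0:
        raise ValueError("topology_weight must lie in [0, 1]")
    return (1.0 - topology_weight) * proxy_f1 + topology_weight * avg_clustering

# The same candidate scored under three weightings.
only_proxy = blended_score_sketch(0.8, 0.4, 0.0)
only_topology = blended_score_sketch(0.8, 0.4, 1.0)
mixed = blended_score_sketch(0.8, 0.4, 0.25)
print(only_proxy, only_topology, mixed)
```

Under this assumed form, two configurations with equal proxy F1 are separated by their average clustering coefficient whenever topology_weight is above 0.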