bioneuralnet.clustering.hybrid_louvain

Hybrid Louvain-PageRank - Significant Subgraph Detection.

This module implements an iterative refinement algorithm that alternates between community detection (Louvain) and local ranking (PageRank) to identify phenotype-correlated subgraphs.

References

Abdel-Hafiz et al. (2022), “Significant Subgraph Detection in Multi-omics Networks for Disease Pathway Identification,” Frontiers in Big Data.

Algorithm:

The process alternates between global community detection and local refinement until convergence:

Iteration 1 (Global Scope):
  1. Run Correlated Louvain on the full graph to optimize Hybrid Modularity.

  2. Select the community with the highest phenotype correlation \(|\rho|\).

  3. Assign seed weights to these nodes based on their marginal contribution to \(\rho\).

  4. Execute Correlated PageRank on the full graph.

  5. Use a sweep cut to produce the initial refined subgraph.

Iteration 2+ (Local Scope):
  1. Restrict the graph strictly to the output of the previous PageRank.

  2. Run Correlated Louvain on this reduced subgraph.

  3. Repeat the refinement steps (community selection, seed weighting, PageRank, sweep cut) until the subgraph size converges or a singleton is produced.

Output:

The subgraph that achieved the highest \(|\rho|\) across all iterations.
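The control flow described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `hybrid_refine` and its three callables (`detect_community`, `refine`, `score`) are hypothetical stand-ins for Correlated Louvain, Correlated PageRank with sweep cut, and the phenotype correlation, and the 0-based iteration index is an assumption.

```python
def hybrid_refine(nodes, detect_community, refine, score, max_iter=10, min_nodes=3):
    """Alternate community detection and local refinement on a shrinking node set.

    detect_community(nodes) -> seed nodes (most phenotype-correlated community)
    refine(nodes, seeds)    -> refined nodes (e.g. PageRank followed by a sweep cut)
    score(nodes)            -> phenotype correlation rho of the node set
    All three callables are hypothetical placeholders.
    """
    current = list(nodes)
    best_nodes, best_rho, best_iter = [], 0.0, -1
    history = []
    for it in range(max_iter):
        if len(current) < min_nodes:      # graph shrank below the minimum size
            break
        seeds = detect_community(current)
        refined = refine(current, seeds)
        if not refined:                   # nothing left to score
            break
        rho = score(refined)
        history.append({"iteration": it, "nodes": refined, "rho": rho})
        # Track the subgraph with the highest |rho| seen so far.
        if best_iter < 0 or abs(rho) > abs(best_rho):
            best_nodes, best_rho, best_iter = refined, rho, it
        # Stop on a singleton or when the size no longer changes.
        if len(refined) <= 1 or len(refined) == len(current):
            break
        current = refined                 # the next iteration works on the reduced graph
    return best_nodes, best_rho, best_iter, history


# Toy stand-ins: each node carries a fixed score.
node_score = {n: n / 10 for n in range(10)}

def toy_detect(nodes):
    # Keep the top-scoring half of the current nodes.
    ranked = sorted(nodes, key=node_score.get, reverse=True)
    return ranked[: max(1, len(nodes) // 2)]

def toy_refine(nodes, seeds):
    # Identity refinement: return the seeds unchanged.
    return sorted(seeds)

def toy_score(nodes):
    return sum(node_score[n] for n in nodes) / len(nodes)

best_nodes, best_rho, best_iter, history = hybrid_refine(
    range(10), toy_detect, toy_refine, toy_score
)
```

With these toy callables the node set shrinks from 10 to 5 to 2 nodes, and the loop stops once the set falls below `min_nodes`.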

Notes

Hybrid Modularity (Correlated Louvain): balances internal topological connectivity (modularity \(Q\)) with phenotype correlation \(\rho\), weighted by \(k_L\):

\[Q_{hybrid} = k_L Q + (1 - k_L) \rho\]
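As a worked example of this formula, the sketch below computes standard Newman modularity for a small graph and combines it with a correlation value. The helper names `modularity` and `hybrid_modularity` are illustrative, and \(\rho\) is passed in as a plain number because this page does not specify how the library computes it.

```python
from collections import defaultdict

def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected weighted graph.

    edges: iterable of (u, v, weight); communities: list of disjoint node sets.
    """
    node_to_comm = {n: i for i, comm in enumerate(communities) for n in comm}
    total_weight = 0.0               # m: sum of all edge weights
    internal = defaultdict(float)    # edge weight inside each community
    degree_sum = defaultdict(float)  # sum of weighted degrees per community
    for u, v, w in edges:
        total_weight += w
        degree_sum[node_to_comm[u]] += w
        degree_sum[node_to_comm[v]] += w
        if node_to_comm[u] == node_to_comm[v]:
            internal[node_to_comm[u]] += w
    return sum(
        internal[c] / total_weight - (degree_sum[c] / (2 * total_weight)) ** 2
        for c in range(len(communities))
    )

def hybrid_modularity(q, rho, k_L=0.8):
    """Q_hybrid = k_L * Q + (1 - k_L) * rho."""
    return k_L * q + (1 - k_L) * rho

# Two triangles joined by a single bridge edge (2, 3).
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
communities = [{0, 1, 2}, {3, 4, 5}]

q = modularity(edges, communities)        # 5/14, about 0.357
q_hybrid = hybrid_modularity(q, rho=0.5)  # rho = 0.5 is an arbitrary example value
```

With \(k_L = 0.8\), most of the objective still comes from topology; lowering \(k_L\) shifts weight toward the phenotype correlation.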

Hybrid Conductance (Correlated PageRank): balances the ratio of external cut to internal volume (conductance \(\Phi\)) with phenotype correlation \(\rho\), weighted by \(k_P\):

\[\Phi_{hybrid} = k_P \Phi + (1 - k_P) \rho\]
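The following sketch evaluates this formula on a toy graph. It takes conductance as cut weight divided by the volume of the subset, following the description above; many references divide by the smaller of the two side volumes instead. The function names are illustrative, and the sign convention the library applies to \(\rho\) during the sweep cut is not stated on this page, so \(\rho\) is simply passed in as a number.

```python
def conductance(edges, subset):
    """Cut weight leaving `subset` divided by the volume of `subset`.

    edges: iterable of (u, v, weight) for an undirected graph.
    """
    subset = set(subset)
    cut = 0.0     # weight of edges with exactly one endpoint in subset
    volume = 0.0  # sum of weighted degrees of nodes in subset
    for u, v, w in edges:
        u_in, v_in = u in subset, v in subset
        if u_in != v_in:
            cut += w
        volume += w * (u_in + v_in)
    if volume == 0:
        raise ValueError("subset has zero volume")
    return cut / volume

def hybrid_conductance(phi, rho, k_P=0.7):
    """Phi_hybrid = k_P * Phi + (1 - k_P) * rho."""
    return k_P * phi + (1 - k_P) * rho

# Two triangles joined by a single bridge edge (2, 3).
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]

phi = conductance(edges, {0, 1, 2})            # cut = 1, volume = 7
phi_hybrid = hybrid_conductance(phi, rho=0.5)  # rho = 0.5 is an arbitrary example value
```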

Seed Weighting: the teleportation probability \(\alpha_i\) of each seed node is scaled by that node's marginal contribution to the correlation:

\[\alpha_i = \frac{\rho_i}{\max(\rho_{seeds})} \cdot \alpha_{max}\]

where \(\rho_i = \rho(S) - \rho(S \setminus \{i\})\) is the change in correlation when node \(i\) is removed from the seed set \(S\).
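A small sketch of the seed-weighting rule follows. It assumes that \(\max(\rho_{seeds})\) means the largest marginal contribution among the seeds, and it uses a hypothetical additive `toy_rho` in place of a real phenotype correlation. The page does not say how non-positive contributions are treated, so the sketch raises an error when no seed contributes positively and otherwise applies the formula as written.

```python
def seed_weights(seeds, rho, alpha_max=0.05):
    """alpha_i = rho_i / max(rho_seeds) * alpha_max, with rho_i = rho(S) - rho(S \\ {i}).

    rho: callable mapping a list of nodes to a correlation value (hypothetical).
    """
    seeds = list(seeds)
    full = rho(seeds)
    # Marginal contribution: drop in rho when node i is left out of the seed set.
    contrib = {i: full - rho([s for s in seeds if s != i]) for i in seeds}
    top = max(contrib.values())
    if top <= 0:
        raise ValueError("no seed has a positive marginal contribution")
    return {i: contrib[i] / top * alpha_max for i in seeds}

# Toy correlation: each node adds a fixed amount, so rho_i equals its score.
node_score = {"a": 0.4, "b": 0.2, "c": 0.1}

def toy_rho(nodes):
    return sum(node_score[n] for n in nodes)

alphas = seed_weights(["a", "b", "c"], toy_rho, alpha_max=0.05)
```

The strongest seed receives the full `alpha_max`, and the others are scaled down in proportion to their contribution.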

Functions

get_logger(name)

Retrieves a global logger configured to write to 'bioneuralnet.log'.

Classes

CorrelatedLouvain(G, B, Y[, k_L, weight, ...])

Correlated Louvain community detection.

CorrelatedPageRank(graph, omics_data, ...[, ...])

Correlated PageRank clustering on a multi-omics network.

HybridLouvain(G, B, Y[, k_L, teleport_prob, ...])

Hybrid Louvain-PageRank for significant subgraph detection.

class bioneuralnet.clustering.hybrid_louvain.HybridLouvain(G: Graph | DataFrame, B: DataFrame, Y: DataFrame | Series, k_L: float = 0.8, teleport_prob: float = 0.05, k_P: float = 0.7, max_iter: int = 10, min_nodes: int = 3, weight: str = 'weight', seed: int | None = None)

Bases: object

Hybrid Louvain-PageRank for significant subgraph detection.

Iteratively refines a multi-omics network by alternating:

  1. Correlated Louvain to find the most phenotype-associated community

  2. Correlated PageRank to refine that community via sweep cut

The graph shrinks with each iteration. The subgraph with the highest \(|\rho|\) across all iterations is tracked and returned.

Parameters:
  • G (Union[nx.Graph, pd.DataFrame]) – Weighted undirected graph or adjacency matrix DataFrame.

  • B (pd.DataFrame) – Omics data (n_samples x n_features).

  • Y (Union[pd.DataFrame, pd.Series]) – Phenotype vector.

  • k_L (float) – Weight on modularity in Correlated Louvain; the remaining 1 - k_L weights phenotype correlation.

  • teleport_prob (float) – Teleportation probability for PageRank (alpha).

  • k_P (float) – Weight on conductance in the Correlated PageRank sweep cut; the remaining 1 - k_P weights phenotype correlation.

  • max_iter (int) – Maximum Hybrid iterations.

  • min_nodes (int) – Stop if graph shrinks below this size.

  • weight (str) – Edge attribute name for weights.

  • seed (Optional[int]) – Random seed.

property best_subgraph: Tuple[List[Any], float, int]

Retrieves the nodes and performance metrics of the best subgraph found.

Returns:

(nodes, rho, iteration_index).

Return type:

Tuple[List[Any], float, int]

property iterations: List[Dict[str, Any]]

Provides access to per-iteration details from the most recent run.

Returns:

A list of result dictionaries for each iteration.

Return type:

List[Dict[str, Any]]

run(as_dfs: bool = False) → Dict[str, Any] | List[DataFrame]

Execute the Hybrid Louvain-PageRank algorithm.

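Returns:

A dictionary with the following keys (per the signature, a list of DataFrames can be returned instead, presumably when as_dfs is True):

  • best_nodes: nodes of the subgraph with the highest |rho|

  • best_correlation: rho of that subgraph (float)

  • best_iteration: index of the iteration that produced it (int)

  • iterations: full per-iteration metadata

  • all_subgraphs: {iteration_index: [nodes]}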

Return type:

Dict[str, Any] | List[DataFrame]
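A usage sketch based on the constructor signature and return keys documented above. The synthetic data, the helper name `run_hybrid_louvain_demo`, and the assumption that graph node names must match the columns of `B` are illustrative and not confirmed by this page. The imports sit inside the function so that it can be defined even where `bioneuralnet`, `networkx`, `numpy`, and `pandas` are not installed; calling it requires all four.

```python
def run_hybrid_louvain_demo(seed=0):
    """Build a small synthetic network and run HybridLouvain on it."""
    import networkx as nx
    import numpy as np
    import pandas as pd
    from bioneuralnet.clustering.hybrid_louvain import HybridLouvain

    rng = np.random.default_rng(seed)
    features = [f"f{i}" for i in range(12)]

    # Omics matrix B: 40 samples x 12 features (synthetic).
    B = pd.DataFrame(rng.normal(size=(40, len(features))), columns=features)
    # Phenotype Y driven by the first two features plus noise (synthetic).
    Y = pd.Series(B["f0"] + B["f1"] + rng.normal(scale=0.5, size=40), name="phenotype")

    # Weighted undirected graph: absolute feature-feature correlation as edge weight.
    corr = B.corr().abs()
    G = nx.Graph()
    G.add_nodes_from(features)
    for i, u in enumerate(features):
        for v in features[i + 1:]:
            G.add_edge(u, v, weight=float(corr.loc[u, v]))

    model = HybridLouvain(
        G, B, Y,
        k_L=0.8, teleport_prob=0.05, k_P=0.7,
        max_iter=10, min_nodes=3, weight="weight", seed=seed,
    )
    result = model.run()  # default as_dfs=False: dictionary output
    return result["best_nodes"], result["best_correlation"], result["best_iteration"]
```

After a run, the `best_subgraph` and `iterations` properties documented above expose the same information on the model object.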