bioneuralnet.clustering.correlated_louvain
Correlated Louvain Community Detection.
This module extends the standard Louvain algorithm by incorporating an absolute phenotype-correlation objective into the modularity maximization process.
References
Abdel-Hafiz et al. (2022), “Significant Subgraph Detection in Multi-omics Networks for Disease Pathway Identification,” Frontiers in Big Data.
Notes
Hybrid Modularity Objective
The algorithm optimizes connectivity and phenotype correlation simultaneously using the following weighted objective function:
\[Q^{*} = k_L Q + (1 - k_L) \rho\]
- Where:
\(Q\): Standard modularity (internal connectivity).
\(\rho\): Absolute Pearson correlation of the community’s first principal component (PC1) with phenotype \(Y\).
\(k_L\): User-defined weight on modularity (Suggested: 0.2).
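The definition of \(\rho\) above can be sketched in a few lines of NumPy. This is a minimal illustration, not the library's implementation: the function name `community_rho` is hypothetical, and the choice to center (but not rescale) features before extracting PC1 is an assumption.

```python
import numpy as np


def community_rho(B_sub: np.ndarray, y: np.ndarray) -> float:
    """|Pearson r| between a community's PC1 scores and the phenotype.

    B_sub: (n_samples, n_community_features) omics sub-matrix.
    y:     (n_samples,) phenotype vector aligned with the rows of B_sub.
    """
    # Center each feature so the SVD yields principal components.
    # (Assumption: no per-feature rescaling is applied.)
    X = B_sub - B_sub.mean(axis=0)
    # PC1 score per sample = first left singular vector * first singular value.
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    pc1 = U[:, 0] * S[0]
    # A constant PC1 or a constant phenotype carries no correlation signal.
    if np.std(pc1) == 0 or np.std(y) == 0:
        return 0.0
    # The sign of PC1 is arbitrary, which is one reason to take the absolute value.
    return float(abs(np.corrcoef(pc1, y)[0, 1]))
```

Because the result is an absolute value, flipping the sign of PC1 (which SVD implementations are free to do) does not change the score.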
- Algorithm:
The hierarchical loop and Phase 2 (network aggregation) remain identical to the standard Louvain method. The modification occurs exclusively in Phase 1 (Local Optimization).
When evaluating the movement of node \(v\) from community \(D\) to community \(C\), the gain is calculated as:
\[\begin{split}\Delta_{hybrid} = k_L \Delta Q + (1 - k_L) \Delta \rho\end{split}\]
The correlation gain \(\Delta \rho\) is defined as the change in total correlation across affected communities:
\[\begin{split}\Delta \rho = [|\rho(D \setminus \{v\})| + |\rho(C \cup \{v\})|] - [|\rho(D)| + |\rho(C)|]\end{split}\]
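The two gain formulas reduce to simple arithmetic once the four correlation values and the modularity gain are known. The sketch below works through one move with made-up numbers; the function names and all input values are illustrative assumptions, not part of the library.

```python
def correlation_gain(rho_D: float, rho_C: float,
                     rho_D_minus_v: float, rho_C_plus_v: float) -> float:
    """Change in total |rho| when node v leaves community D and joins C."""
    after = abs(rho_D_minus_v) + abs(rho_C_plus_v)
    before = abs(rho_D) + abs(rho_C)
    return after - before


def hybrid_gain(delta_Q: float, delta_rho: float, k_L: float = 0.2) -> float:
    """Weighted gain: k_L * delta_Q + (1 - k_L) * delta_rho."""
    return k_L * delta_Q + (1.0 - k_L) * delta_rho


# Illustrative values only: moving v slightly weakens D but strengthens C.
delta_rho = correlation_gain(rho_D=0.30, rho_C=0.40,
                             rho_D_minus_v=0.28, rho_C_plus_v=0.55)
gain = hybrid_gain(delta_Q=0.01, delta_rho=delta_rho, k_L=0.2)
print(f"delta_rho={delta_rho:.3f}, hybrid gain={gain:.3f}")
```

With the suggested \(k_L = 0.2\), the correlation term carries 80% of the weight, so a move with a small modularity gain can still be accepted when it improves phenotype correlation.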
Functions
Retrieves a global logger configured to write to 'bioneuralnet.log'.
Classes
Correlated Louvain community detection.
Standard Louvain community detection.
- class bioneuralnet.clustering.correlated_louvain.CorrelatedLouvain(G: Graph, B: DataFrame, Y: Series | DataFrame, k_L: float = 0.2, weight: str = 'weight', max_passes: int = 50, min_delta: float = 1e-06, seed: int | None = None)
Bases: Louvain
Correlated Louvain community detection.
Inherits from Louvain.
- Parameters:
G (nx.Graph) – The input graph for community detection.
B (pd.DataFrame) – Omics data (n_samples x n_features). Column names must match nodes.
Y (Union[pd.Series, pd.DataFrame]) – Phenotype vector aligned with rows of B.
k_L (float) – Weight on modularity in combined objective (Eq. 9).
weight (str) – Edge attribute name for weights.
max_passes (int) – Maximum number of passes for Phase 1 optimization.
min_delta (float) – Convergence tolerance for objective gain.
seed (Optional[int]) – Random seed for reproducibility.
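A usage sketch for the constructor follows. The toy data, the helper names `make_toy_inputs` and `detect_communities`, and the correlation-weighted graph are all assumptions made for illustration; only the constructor signature, the `run()` method, the `communities` property, and `get_combined_quality()` come from this page. The import is guarded so the snippet still runs where bioneuralnet is not installed.

```python
import networkx as nx
import numpy as np
import pandas as pd


def make_toy_inputs(n_samples: int = 40, n_features: int = 6, seed: int = 0):
    """Build a small (G, B, Y) triple whose node names match B's columns."""
    rng = np.random.default_rng(seed)
    names = [f"g{i}" for i in range(n_features)]
    B = pd.DataFrame(rng.normal(size=(n_samples, n_features)), columns=names)
    # Phenotype driven by the first two features plus noise (toy assumption).
    Y = pd.Series(B["g0"] + B["g1"] + 0.1 * rng.normal(size=n_samples),
                  name="phenotype")
    # Fully connected graph weighted by absolute feature correlation.
    corr = B.corr().abs()
    G = nx.Graph()
    G.add_nodes_from(names)
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            G.add_edge(u, v, weight=float(corr.loc[u, v]))
    return G, B, Y


def detect_communities(G, B, Y, k_L: float = 0.2, seed: int = 42):
    """Run CorrelatedLouvain if bioneuralnet is importable, else return None."""
    try:
        from bioneuralnet.clustering.correlated_louvain import CorrelatedLouvain
    except ImportError:
        return None
    model = CorrelatedLouvain(G, B, Y, k_L=k_L, weight="weight", seed=seed)
    model.run()
    return model


G, B, Y = make_toy_inputs()
model = detect_communities(G, B, Y)
if model is not None:
    print(model.communities)
    print(model.get_combined_quality())
```

Passing a fixed `seed` keeps repeated runs comparable, which matters when tuning `k_L`.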
- property communities: Dict[int, List[Any]]
Retrieves the communities grouped by community ID.
Convenient for iterating over sets of nodes belonging to the same community.
- Returns:
A dictionary mapping community IDs to lists of nodes.
- Return type:
Dict[int, List[Any]]
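The shape of the returned dictionary can be illustrated with a small grouping helper. This is a sketch of the data structure only: `group_by_community` and the node-to-community `partition` mapping are hypothetical and do not describe how the property is computed internally.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_community(partition: Dict[Any, int]) -> Dict[int, List[Any]]:
    """Invert a node -> community ID map into community ID -> list of nodes."""
    groups: Dict[int, List[Any]] = defaultdict(list)
    for node, community_id in partition.items():
        groups[community_id].append(node)
    return dict(groups)


# Hypothetical partition of five nodes into two communities.
partition = {"g0": 0, "g1": 0, "g2": 1, "g3": 1, "g4": 1}
communities = group_by_community(partition)
for community_id, nodes in communities.items():
    print(community_id, nodes)
```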
- get_combined_quality() → float
Access the calculated combined quality score.
- Returns:
The Q* score.
- Return type:
float
- get_top_communities(n: int = 1) → List[Tuple[int, float, List[Any]]]
Retrieve the top communities based on absolute correlation.
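Ranking by absolute correlation can be sketched as a sort over per-community scores. The helper `top_communities`, its inputs, and the reading of each returned tuple as (community ID, absolute correlation, nodes) are assumptions inferred from the return type; the page does not spell out the tuple fields.

```python
from typing import Any, Dict, List, Tuple


def top_communities(communities: Dict[int, List[Any]],
                    rho_by_community: Dict[int, float],
                    n: int = 1) -> List[Tuple[int, float, List[Any]]]:
    """Return the n communities with the largest absolute correlation."""
    ranked = sorted(
        ((cid, abs(rho_by_community[cid]), nodes)
         for cid, nodes in communities.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:n]


# Hypothetical communities and correlation scores.
communities = {0: ["g0", "g1"], 1: ["g2", "g3"], 2: ["g4"]}
rho_by_community = {0: 0.82, 1: -0.35, 2: 0.10}
best = top_communities(communities, rho_by_community, n=2)
print(best)
```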
- property history: List[Dict[str, Any]]
Retrieves the history of the algorithm’s execution levels.
Provides insight into the convergence process and reduction of graph size.
- Returns:
A list of dictionaries containing stats for each level.
- Return type:
List[Dict[str, Any]]
- property modularity: float
Retrieves the final modularity score of the computed partition.
Requires that the run() method has been executed previously.
- Returns:
The modularity score.
- Return type: