bioneuralnet.external_tools

External Tools Module

This module provides utility functions for interoperability between Python and R. It runs external R scripts that extract and convert RData objects (such as cross-validation folds and network matrices), then loads the results into standard Python data structures such as pandas DataFrames and NumPy arrays.

Available Functions:

  • extract_and_load_folds: Runs the R extraction script, then loads the resulting folds.

  • load_r_export_folds: Directly loads a previously extracted R directory structure.

  • rdata_to_df: Converts an object stored in an .RData file to a pandas DataFrame.

Functions

extract_and_load_folds(output_path[, ...])

Extracts .Rdata fold files into CSVs using an R script, then loads them.

load_r_export_folds(base_path, num_omics[, k])

Loads the specific SmCCNet directory structure exported from R.

rdata_to_df(rdata_file, csv_file[, Object])

Converts an RData file to a pandas DataFrame.

bioneuralnet.external_tools.extract_and_load_folds(output_path: str, num_omics: int = 3, k: int = 5) → dict

Extracts .Rdata fold files into CSVs using an R script, then loads them.

This function acts as a wrapper to execute the external ‘extract_CVfold.R’ script, which parses ‘CVFold.Rdata’ and ‘globalNetwork.Rdata’ into a standard directory structure of CSVs. Once the R script completes successfully, it loads the data into memory using load_r_export_folds.

Parameters:
  • output_path (str) – The target directory containing the source .Rdata files.

  • num_omics (int) – The number of omics data blocks to process. Defaults to 3.

  • k (int) – The number of cross-validation folds. Defaults to 5.

Returns:

A dictionary containing the parsed cross-validation fold data.

Return type:

dict

Raises:
  • EnvironmentError – If ‘Rscript’ is not found in the system path.

  • FileNotFoundError – If the required ‘extract_CVfold.R’ script is missing.

  • RuntimeError – If the R script execution fails and returns a non-zero exit code.
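The three error conditions above follow a common pattern for wrapping an external script. The sketch below illustrates that pattern only; it is not the library's implementation. The helper name run_r_extraction, its rscript parameter, and the command-line argument order are all hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def run_r_extraction(script_path: Path, output_path: Path, rscript: str = "Rscript") -> None:
    """Hypothetical helper mirroring the documented checks (not the library's code)."""
    # 1. The Rscript executable must be discoverable on the system path.
    rscript_exe = shutil.which(rscript)
    if rscript_exe is None:
        raise EnvironmentError(f"'{rscript}' was not found on the system path.")

    # 2. The R script itself must exist.
    if not script_path.is_file():
        raise FileNotFoundError(f"R script not found: {script_path}")

    # 3. Run the script; a non-zero exit code becomes a RuntimeError.
    #    Passing output_path as the only argument is an assumption.
    result = subprocess.run(
        [rscript_exe, str(script_path), str(output_path)],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"R script failed with exit code {result.returncode}:\n{result.stderr}"
        )
```

Checking for the executable and the script before launching the subprocess lets the caller tell a missing R installation apart from a failure inside the R code.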

bioneuralnet.external_tools.load_r_export_folds(base_path: str, num_omics: int, k: int = 5) → dict

Loads the specific SmCCNet directory structure exported from R.

This function iterates through the cross-validation fold directories (fold_1, fold_2, etc.) and loads the associated omics CSV files and phenotype data into NumPy arrays.

Parameters:
  • base_path (str) – The base directory containing the ‘fold_N’ subdirectories.

  • num_omics (int) – The number of omics data blocks to load per fold.

  • k (int) – The number of cross-validation folds to load. Defaults to 5.

Returns:

A dictionary whose keys are fold names (e.g., ‘fold_1’) and whose values are dictionaries containing ‘X_train’ (list of NumPy arrays, one per omics block), ‘X_test’ (list of NumPy arrays, one per omics block), ‘Y_train’ (NumPy array), and ‘Y_test’ (NumPy array).

Return type:

dict

Raises:

FileNotFoundError – If a required fold directory or CSV file cannot be found.
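To make the returned structure concrete, here is a minimal, self-contained sketch of a loader that produces a dictionary of the documented shape. It is an illustration rather than the library's code: the CSV file names (X_train_1.csv, Y_train.csv, and so on) and the header-free numeric CSV layout are assumptions, since the source names only the fold_N directories.

```python
import tempfile
from pathlib import Path

import numpy as np


def _read_csv(path: Path) -> np.ndarray:
    # Raise the documented error type when a file is missing.
    if not path.is_file():
        raise FileNotFoundError(f"Missing CSV file: {path}")
    return np.loadtxt(path, delimiter=",", ndmin=2)


def load_folds_sketch(base_path: str, num_omics: int, k: int = 5) -> dict:
    """Illustrative loader; the CSV file names are assumptions."""
    folds = {}
    for i in range(1, k + 1):
        fold_dir = Path(base_path) / f"fold_{i}"
        if not fold_dir.is_dir():
            raise FileNotFoundError(f"Missing fold directory: {fold_dir}")
        omics = range(1, num_omics + 1)
        folds[f"fold_{i}"] = {
            "X_train": [_read_csv(fold_dir / f"X_train_{j}.csv") for j in omics],
            "X_test": [_read_csv(fold_dir / f"X_test_{j}.csv") for j in omics],
            "Y_train": _read_csv(fold_dir / "Y_train.csv").ravel(),
            "Y_test": _read_csv(fold_dir / "Y_test.csv").ravel(),
        }
    return folds


# Build a tiny synthetic export (2 folds, 2 omics blocks) and load it back.
rng = np.random.default_rng(0)
with tempfile.TemporaryDirectory() as tmp:
    for i in (1, 2):
        fold_dir = Path(tmp) / f"fold_{i}"
        fold_dir.mkdir()
        for j in (1, 2):
            np.savetxt(fold_dir / f"X_train_{j}.csv", rng.normal(size=(4, 3)), delimiter=",")
            np.savetxt(fold_dir / f"X_test_{j}.csv", rng.normal(size=(2, 3)), delimiter=",")
        np.savetxt(fold_dir / "Y_train.csv", rng.normal(size=4), delimiter=",")
        np.savetxt(fold_dir / "Y_test.csv", rng.normal(size=2), delimiter=",")
    folds = load_folds_sketch(tmp, num_omics=2, k=2)

print(sorted(folds))
print(folds["fold_1"]["X_train"][0].shape)
```

Each fold holds one array per omics block, so folds["fold_1"]["X_train"][0] is the training matrix of the first block in the first fold.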

bioneuralnet.external_tools.rdata_to_df(rdata_file: Path, csv_file: Path, Object=None) → DataFrame

Converts an RData file to a pandas DataFrame.

This function executes an external R script to load the .RData file, identify a suitable matrix, data frame, or graph object (e.g., igraph), and export it to CSV.
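The last step of this workflow, reading the exported CSV back into pandas, might look like the sketch below. The CSV content is a made-up stand-in for a small network matrix, and the layout (row names in the first column, as R's write.csv produces by default) is an assumption the source does not confirm.

```python
import io

import pandas as pd

# Stand-in for the CSV the R script would write: a 3x3 adjacency matrix
# with node names as both the header row and the first column (assumed layout).
exported_csv = io.StringIO(
    ",g1,g2,g3\n"
    "g1,0.0,0.5,0.1\n"
    "g2,0.5,0.0,0.3\n"
    "g3,0.1,0.3,0.0\n"
)

# index_col=0 turns the first column (the R row names) into the DataFrame index.
network_df = pd.read_csv(exported_csv, index_col=0)
print(network_df)
```

Using the row names as the index keeps node labels aligned on both axes, which is convenient when the extracted object is a network adjacency matrix.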

Parameters:
  • rdata_file (Path) – Path to the input .RData file.

  • csv_file (Path) – Path where the temporary CSV file should be written.

  • Object (str | None) – Optional name of the specific object to extract; if None, the script attempts to auto-detect the first suitable object.

Returns:

The converted data loaded into a pandas DataFrame.

Return type:

pd.DataFrame
