bioneuralnet.external_tools.extract_CVfold

Functions

extract_and_load_folds(output_path[, ...])

Extracts .Rdata fold files into CSVs using an R script, then loads them.

load_r_export_folds(base_path, num_omics[, k])

Loads the specific SmCCNet directory structure exported from R.

Classes

Path(*args, **kwargs)

PurePath subclass that can make system calls.

bioneuralnet.external_tools.extract_CVfold.extract_and_load_folds(output_path: str, num_omics: int = 3, k: int = 5) -> dict

Extracts .Rdata fold files into CSVs using an R script, then loads them.

This function wraps the external ‘extract_CVfold.R’ script, which converts ‘CVFold.Rdata’ and ‘globalNetwork.Rdata’ into a standard directory structure of CSV files. If the R script completes successfully, the function then loads the exported data into memory with load_r_export_folds.

Parameters:
  • output_path (str) – The target directory containing the source .Rdata files.

  • num_omics (int) – The number of omics data blocks to process. Defaults to 3.

  • k (int) – The number of cross-validation folds. Defaults to 5.

Returns:

A dictionary containing the parsed cross-validation fold data, as loaded by load_r_export_folds.

Return type:

dict

Raises:
  • EnvironmentError – If ‘Rscript’ is not found in the system path.

  • FileNotFoundError – If the required ‘extract_CVfold.R’ script is missing.

  • RuntimeError – If the R script execution fails and returns a non-zero exit code.
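A caller may want to check the environment first and then handle the three documented exceptions separately. The sketch below is one hedged way to do that, not the library's own logic. `preflight` and `load_folds_or_report` are hypothetical helper names, and the assumption that ‘CVFold.Rdata’ and ‘globalNetwork.Rdata’ sit directly inside output_path is inferred from the parameter description rather than confirmed. In Python 3, `EnvironmentError` is an alias of `OSError` and `FileNotFoundError` is a subclass of it, so the `FileNotFoundError` clause has to come first.

```python
import shutil
from pathlib import Path

# Assumption: both source files sit directly inside output_path.
EXPECTED_RDATA = ("CVFold.Rdata", "globalNetwork.Rdata")


def preflight(output_path):
    """Return a list of human-readable problems found before calling the wrapper."""
    problems = []
    if shutil.which("Rscript") is None:
        problems.append("Rscript is not on the system path")
    for name in EXPECTED_RDATA:
        if not (Path(output_path) / name).is_file():
            problems.append(f"missing {name} in {output_path}")
    return problems


def load_folds_or_report(output_path, num_omics=3, k=5):
    """Call extract_and_load_folds and map each documented error to a message."""
    # Imported lazily so the pre-flight helper works without the package installed.
    from bioneuralnet.external_tools.extract_CVfold import extract_and_load_folds

    try:
        return extract_and_load_folds(output_path, num_omics=num_omics, k=k)
    except FileNotFoundError as exc:
        # Must precede EnvironmentError: FileNotFoundError is a subclass of OSError.
        print(f"R script or exported file not found: {exc}")
    except EnvironmentError as exc:
        print(f"Rscript is unavailable: {exc}")
    except RuntimeError as exc:
        print(f"R script exited with a non-zero code: {exc}")
    return None
```

Running `preflight(output_path)` before `load_folds_or_report(output_path)` surfaces a missing Rscript binary or missing input files without starting the R process.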

bioneuralnet.external_tools.extract_CVfold.load_r_export_folds(base_path: str, num_omics: int, k: int = 5) -> dict

Loads the specific SmCCNet directory structure exported from R.

This function iterates through the cross-validation fold directories (fold_1, fold_2, etc.) and loads the associated omics CSV files and phenotype data into NumPy arrays.
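Because the loader expects one ‘fold_N’ subdirectory per fold, it can help to check the layout before loading. The following is a minimal sketch, assuming folds are numbered from 1 to k as the ‘fold_1’, ‘fold_2’ naming suggests. `missing_fold_dirs` is a hypothetical helper and not part of bioneuralnet, and it does not inspect the CSV files inside each fold because their names are not documented here.

```python
from pathlib import Path


def missing_fold_dirs(base_path, k=5):
    """Return the expected fold directory names that are absent under base_path."""
    base = Path(base_path)
    # Assumption: folds are numbered 1..k, matching 'fold_1', 'fold_2', ...
    expected = [f"fold_{i}" for i in range(1, k + 1)]
    return [name for name in expected if not (base / name).is_dir()]
```

An empty list means every expected fold directory exists; a non-empty list names the directories that would likely trigger the documented FileNotFoundError.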

Parameters:
  • base_path (str) – The base directory containing the ‘fold_N’ subdirectories.

  • num_omics (int) – The number of omics data blocks to load per fold.

  • k (int) – The number of cross-validation folds to load. Defaults to 5.

Returns:

A dictionary where keys are fold names (e.g., ‘fold_1’) and values are dictionaries containing ‘X_train’ (list of numpy arrays), ‘X_test’ (list of numpy arrays), ‘Y_train’ (numpy array), and ‘Y_test’ (numpy array).

Return type:

dict

Raises:
  • FileNotFoundError – If a required fold directory or CSV file cannot be found.
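The returned dictionary can be checked against the documented layout before it is passed to downstream code. The sketch below is illustrative only: `check_fold_structure` is a hypothetical helper, and the row-count comparison assumes that rows are samples, which the description above does not state. It relies only on `len()`, so it should work on NumPy arrays as well as nested lists.

```python
REQUIRED_KEYS = ("X_train", "X_test", "Y_train", "Y_test")


def check_fold_structure(folds, num_omics, k=5):
    """Return a list of mismatches between `folds` and the documented layout."""
    problems = []
    for i in range(1, k + 1):
        name = f"fold_{i}"
        fold = folds.get(name)
        if fold is None:
            problems.append(f"{name}: missing")
            continue
        absent = [key for key in REQUIRED_KEYS if key not in fold]
        if absent:
            problems.append(f"{name}: missing keys {absent}")
            continue
        for split in ("train", "test"):
            blocks, y = fold[f"X_{split}"], fold[f"Y_{split}"]
            if len(blocks) != num_omics:
                problems.append(
                    f"{name}: X_{split} has {len(blocks)} blocks, expected {num_omics}"
                )
            # Assumption: rows are samples, so each block should have len(y) rows.
            for j, block in enumerate(blocks):
                if len(block) != len(y):
                    problems.append(
                        f"{name}: X_{split}[{j}] has {len(block)} rows, "
                        f"Y_{split} has {len(y)}"
                    )
    return problems
```

Calling `check_fold_structure(folds, num_omics=3, k=5)` on the loader's result returns an empty list when every fold matches the expected shape.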