Package 'BioMoR' reference manual

Title:	Bioinformatics Modeling with Recursion and Autoencoder-Based Ensemble
Description:	Provides tools for bioinformatics modeling using recursive transformer-inspired architectures, autoencoders, random forests, XGBoost, and stacked ensemble models. Includes utilities for cross-validation, calibration, benchmarking, and threshold optimization in predictive modeling workflows.
Authors:	MD. Arshad [aut, cre]
Maintainer:	MD. Arshad <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.0
Built:	2026-05-14 08:59:44 UTC
Source:	https://github.com/sulkysubject37/biomor

Benchmark a trained model

Description

Evaluates a trained caret model on test data, returning Accuracy, F1 score, and ROC-AUC. If only one class is present in the test set, ROC-AUC is returned as NA.

Usage

biomor_benchmark(model, test_data, outcome_col)

Arguments

model

A trained caret model

test_data

Data frame containing the predictors and the outcome column

outcome_col

Name of outcome column

Value

A named list of metrics
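
To make the three metrics concrete, here is a minimal base-R sketch of how Accuracy, F1, and ROC-AUC can be computed from labels and predicted probabilities. It is not BioMoR's internal code: the helper name `benchmark_sketch`, the 0.5 cutoff, and the list element names are assumptions made for illustration. ROC-AUC uses the rank-sum identity and is NA when only one class is present, matching the behaviour described above.

```r
# Illustrative only: `benchmark_sketch` is a hypothetical helper,
# not a function exported by BioMoR.
benchmark_sketch <- function(y_true, y_prob, positive = "Active",
                             threshold = 0.5) {
  is_pos   <- as.character(y_true) == positive
  pred_pos <- y_prob >= threshold   # assumed 0.5 cutoff

  tp <- sum(pred_pos & is_pos)
  fp <- sum(pred_pos & !is_pos)
  fn <- sum(!pred_pos & is_pos)

  accuracy <- mean(pred_pos == is_pos)
  denom <- 2 * tp + fp + fn
  f1 <- if (denom == 0) NA_real_ else 2 * tp / denom

  # ROC-AUC via the rank-sum (Mann-Whitney) identity.
  # It is undefined when the test set contains a single class.
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  auc <- if (n_pos == 0 || n_neg == 0) {
    NA_real_
  } else {
    (sum(rank(y_prob)[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
  }

  list(Accuracy = accuracy, F1 = f1, ROC_AUC = auc)
}

y_true <- factor(c("Active", "Inactive", "Active", "Inactive"))
y_prob <- c(0.9, 0.2, 0.6, 0.4)
res <- benchmark_sketch(y_true, y_prob)

# A test set with one class only: ROC-AUC comes back as NA.
one_class <- benchmark_sketch(factor(rep("Active", 3)), c(0.8, 0.7, 0.9))
```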

Run full BioMoR pipeline

Description

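Runs the full BioMoR pipeline on a labelled descriptor dataset and returns the trained models together with their benchmark reports.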

Usage

biomor_run_pipeline(data, feature_cols = NULL, epochs = 50)

Arguments

data

Data frame containing a Label column and descriptor columns

feature_cols

Optional set of feature columns (default NULL)

epochs

Number of autoencoder training epochs (default 50)

Value

A list of trained models and benchmark reports

Compute Brier Score

Description

The Brier score is the mean squared error between predicted probabilities and the true binary outcome (0/1). Lower is better.

Usage

brier_score(y_true, y_prob, positive = "Active")

Arguments

y_true

True factor labels.

y_prob

Predicted probabilities for the positive class.

positive

Name of the positive class (default "Active").

Value

Numeric Brier score.
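
The definition above can be written in a few lines of base R. This is a sketch of the formula only, not the package source, and `brier_sketch` is a hypothetical name. The labels are first mapped to 0/1 using the positive class, then the squared differences are averaged.

```r
# Illustrative only: `brier_sketch` is a hypothetical helper.
brier_sketch <- function(y_true, y_prob, positive = "Active") {
  # 1 for the positive class, 0 otherwise
  y01 <- as.numeric(as.character(y_true) == positive)
  mean((y_prob - y01)^2)
}

y_true <- factor(c("Active", "Inactive", "Active", "Inactive"))
y_prob <- c(0.9, 0.2, 0.6, 0.4)

bs <- brier_sketch(y_true, y_prob)
# (0.01 + 0.04 + 0.16 + 0.16) / 4 = 0.0925
```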

Calibrate model probabilities

Description

Calibrates the predicted probabilities of a trained model using Platt scaling or isotonic regression.

Usage

calibrate_model(model, test_data, method = "platt")

Arguments

model

A trained caret or xgboost model

test_data

Data frame of test data

method

Calibration method: "platt" (default) or "isotonic"

Value

Calibrated probabilities
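
For readers unfamiliar with the two methods, the following base-R sketch shows what Platt scaling and isotonic calibration typically look like. It is not how calibrate_model is implemented, which this manual does not describe. The data are simulated and every variable name is an assumption. In practice the calibrator is usually fitted on held-out data rather than on the training set.

```r
set.seed(1)
n <- 200
raw_prob <- runif(n)
# Simulated outcomes: the true event rate is raw_prob^2,
# so the raw scores are systematically too high.
y <- rbinom(n, 1, raw_prob^2)

# Platt scaling: logistic regression of the outcome on the raw score.
platt_fit  <- glm(y ~ raw_prob, family = binomial())
platt_prob <- predict(platt_fit,
                      newdata = data.frame(raw_prob = raw_prob),
                      type = "response")

# Isotonic calibration: a non-decreasing step function fitted
# to the (score, outcome) pairs.
iso_fit  <- isoreg(raw_prob, y)
iso_fun  <- as.stepfun(iso_fit)
iso_prob <- iso_fun(raw_prob)
```

Platt scaling assumes a sigmoid-shaped distortion and needs few data points. Isotonic regression is more flexible but can overfit on small calibration sets.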

Compute optimal threshold for maximum F1 score

Description

Sweeps thresholds between 0 and 1 to find the one that maximizes F1.

Usage

compute_f1_threshold(y_true, y_prob, positive = "Active")

Arguments

y_true

True factor labels.

y_prob

Predicted probabilities for the positive class.

positive

Name of the positive class (default "Active").

Value

A list with elements:

threshold: Best probability cutoff.
best_f1: Maximum F1 score achieved.
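
The sweep can be sketched in base R as follows. This is an illustration under assumptions, not the package source: the helper name `f1_threshold_sketch`, the 0.01 grid step, and the tie-breaking rule (first maximum wins) are all hypothetical. The returned element names follow the Value section above.

```r
# Illustrative only: `f1_threshold_sketch` is a hypothetical helper.
f1_threshold_sketch <- function(y_true, y_prob, positive = "Active",
                                grid = seq(0.01, 0.99, by = 0.01)) {
  is_pos <- as.character(y_true) == positive

  # F1 at a single cutoff; defined as 0 when there are no true positives.
  f1_at <- function(t) {
    pred_pos <- y_prob >= t
    tp <- sum(pred_pos & is_pos)
    fp <- sum(pred_pos & !is_pos)
    fn <- sum(!pred_pos & is_pos)
    if (tp == 0) 0 else 2 * tp / (2 * tp + fp + fn)
  }

  f1   <- vapply(grid, f1_at, numeric(1))
  best <- which.max(f1)   # first maximum on ties
  list(threshold = grid[best], best_f1 = f1[best])
}

y_true <- factor(c("Active", "Active", "Inactive",
                   "Active", "Inactive", "Inactive"))
y_prob <- c(0.95, 0.80, 0.70, 0.55, 0.30, 0.10)

out <- f1_threshold_sketch(y_true, y_prob)
# best_f1 is 6/7: any cutoff above 0.30 and up to 0.55 keeps all three
# actives and admits one inactive.
```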

Get caret cross-validation control

Description

Creates a caret::trainControl object for cross-validation, configured for two-class problems, ROC-based performance, and optional sampling strategies such as SMOTE or ROSE.

Usage

get_cv_control(cv = 5, sampling = NULL)

Arguments

cv

Number of folds (default 5).

sampling

Sampling method (e.g., "smote", "rose", or NULL).

Value

A caret::trainControl object.
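
A control object matching this description could be built directly with caret roughly as shown below. The exact settings used inside get_cv_control are not documented here, so the argument values are assumptions. The call is guarded so the snippet is skipped if caret is not installed.

```r
# Assumed equivalent of get_cv_control(cv = 5, sampling = NULL);
# the real function may set additional options.
if (requireNamespace("caret", quietly = TRUE)) {
  ctrl <- caret::trainControl(
    method          = "cv",                    # k-fold cross-validation
    number          = 5,                       # number of folds
    classProbs      = TRUE,                    # needed for ROC-based metrics
    summaryFunction = caret::twoClassSummary,  # reports ROC, Sens, Spec
    sampling        = NULL                     # or e.g. "smote" / "rose"
  )
}
```

With twoClassSummary, models are usually trained with metric = "ROC" in caret::train, and the outcome levels must be valid R names.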

Get Embeddings from Autoencoder (stub)

Description

Placeholder for extracting embeddings from a trained autoencoder.

Usage

get_embeddings(ae_obj, data, feature_cols = NULL)

Arguments

ae_obj

Autoencoder object

data

Input data

feature_cols

Columns to use as features

Value

Matrix of embeddings (currently NULL since this is a stub)

Prepare dataset for modeling

Description

Prepares a data frame for modeling by converting the outcome column to a factor.

Usage

prepare_model_data(df, outcome_col = "Label")

Arguments

df

A data.frame

outcome_col

Name of the outcome column

Value

A processed data.frame with factor outcome
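
As a rough illustration of this kind of preparation step, the base-R sketch below converts the outcome column to a factor. The helper name `prepare_sketch` is hypothetical, and the make.names step is an assumption added because caret requires syntactically valid class levels when class probabilities are requested. The package function may do more or less than this.

```r
# Illustrative only: `prepare_sketch` is a hypothetical helper.
prepare_sketch <- function(df, outcome_col = "Label") {
  stopifnot(is.data.frame(df), outcome_col %in% names(df))
  df[[outcome_col]] <- factor(df[[outcome_col]])
  # Assumption: make the levels valid R names for caret's classProbs.
  levels(df[[outcome_col]]) <- make.names(levels(df[[outcome_col]]))
  df
}

raw <- data.frame(
  MolWt = c(180.2, 250.1, 312.4),          # made-up descriptor values
  Label = c("Active", "Inactive", "Active"),
  stringsAsFactors = FALSE
)
clean <- prepare_sketch(raw)
```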

Train Autoencoder (stub)

Description

Placeholder for future autoencoder integration in BioMoR.

Usage

train_autoencoder(
  data,
  feature_cols = NULL,
  epochs = 10,
  batch_size = 32,
  lr = 0.001
)

Arguments

data

Input data (matrix or data frame)

feature_cols

Columns to use as features

epochs

Number of training epochs

batch_size

Mini-batch size

lr

Learning rate

Value

A placeholder list with class "autoencoder"

Train BioMoR Autoencoder

Description

Trains the BioMoR autoencoder on the selected numeric feature columns.

Usage

train_biomor(data, feature_cols, epochs = 100, batch_size = 50, lr = 0.001)

Arguments

data

Data frame with numeric feature columns and a Label column

feature_cols

Character vector of feature columns

epochs

Number of training epochs

batch_size

Batch size

lr

Learning rate

Value

A list containing the model, the dataset, and the embeddings

Train a Random Forest model with caret

Description

Trains a Random Forest model through caret on the supplied data frame, using the given trainControl object.

Usage

train_rf(df, outcome_col = "Label", ctrl)

Arguments

df

A data.frame containing predictors and outcome

outcome_col

Name of the outcome column (binary factor)

ctrl

A caret::trainControl object

Value

A caret train object

Train an XGBoost model with caret

Description

Trains an XGBoost model through caret on the supplied data frame, using the given trainControl object.

Usage

train_xgb_caret(df, outcome_col = "Label", ctrl)

Arguments

df

A data.frame containing predictors and outcome

outcome_col

Name of the outcome column (binary factor)

ctrl

A caret::trainControl object

Value

A caret train object

Package 'BioMoR'

Help Index

Benchmark a trained model
Run full BioMoR pipeline
Compute Brier Score
Calibrate model probabilities
Compute optimal threshold for maximum F1 score
Get caret cross-validation control
Get Embeddings from Autoencoder (stub)
Prepare dataset for modeling
Train Autoencoder (stub)
Train BioMoR Autoencoder
Train a Random Forest model with caret
Train an XGBoost model with caret