| Type: | Package | 
| Title: | Bioinformatics Modeling with Recursion and Autoencoder-Based Ensemble | 
| Version: | 0.1.0 | 
| Description: | Tools for bioinformatics modeling using recursive transformer-inspired architectures, autoencoders, random forests, XGBoost, and stacked ensemble models. Includes utilities for cross-validation, calibration, benchmarking, and threshold optimization in predictive modeling workflows. The methodology builds on ensemble learning (Breiman 2001 <doi:10.1023/A:1010933404324>), gradient boosting (Chen and Guestrin 2016 <doi:10.1145/2939672.2939785>), autoencoders (Hinton and Salakhutdinov 2006 <doi:10.1126/science.1127647>), and recursive transformer efficiency approaches such as Mixture-of-Recursions (Bae et al. 2025 <doi:10.48550/arXiv.2507.10524>). | 
| License: | MIT + file LICENSE | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.3.3 | 
| Depends: | R (≥ 4.2.0) | 
| Imports: | caret, recipes, themis, xgboost, magrittr, dplyr, pROC | 
| Suggests: | randomForest, testthat (≥ 3.0.0), PRROC, ggplot2, purrr, tibble, yardstick, knitr, rmarkdown | 
| VignetteBuilder: | knitr | 
| Config/testthat/edition: | 3 | 
| NeedsCompilation: | no | 
| Packaged: | 2025-09-27 09:30:29 UTC; apple | 
| Author: | MD. Arshad [aut, cre] | 
| Maintainer: | MD. Arshad <arshad10867c@gmail.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-10-03 13:50:02 UTC | 
BioMoR: Bioinformatics Modeling with Recursion, Autoencoders, and Stacked Models
Description
The BioMoR package provides a modeling framework for bioinformatics tasks, combining recursive deep learning architectures (transformer-inspired), autoencoders for feature compression, and stacked models (RF, XGBoost, meta-learners).
Details
Main features:
- Data preparation utilities with recipe-based preprocessing and SMOTE-ready CV. 
- Base learners: Random Forest and XGBoost (caret interface). 
- Meta-models: stacked learners with recursive refinements. 
- Evaluation: ROC, PR, F1 tuning, balanced accuracy, Brier score, calibration. 
Authors
Maintainer: MD. Arshad <arshad10867c@gmail.com>
Benchmark a trained model
Description
Evaluates a trained caret model on test data, returning Accuracy, F1 score, and ROC-AUC. If only one class is present in the test set, ROC-AUC is returned as NA.
Usage
biomor_benchmark(model, test_data, outcome_col)
Arguments
| model | A trained caret model | 
| test_data | Dataframe containing predictors and outcome | 
| outcome_col | Name of outcome column | 
Value
A named list of metrics
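To make the three metrics concrete, here is a minimal base-R sketch that computes them from true labels and positive-class probabilities. It is not the package's implementation: `benchmark_sketch`, the 0.5 cutoff, and the element names of the returned list are all assumptions made for illustration. ROC-AUC is computed through the rank-sum identity and falls back to NA when only one class is present, mirroring the behavior described above.

```r
# Hypothetical helper (not part of BioMoR): Accuracy, F1 and ROC-AUC
# from true labels and predicted positive-class probabilities.
benchmark_sketch <- function(y_true, y_prob, positive = "Active", threshold = 0.5) {
  is_pos   <- as.character(y_true) == positive
  pred_pos <- y_prob >= threshold          # assumed 0.5 cutoff

  tp <- sum(pred_pos & is_pos)
  fp <- sum(pred_pos & !is_pos)
  fn <- sum(!pred_pos & is_pos)

  accuracy <- mean(pred_pos == is_pos)
  f1 <- if ((2 * tp + fp + fn) == 0) NA_real_ else 2 * tp / (2 * tp + fp + fn)

  # ROC-AUC via the rank-sum (Mann-Whitney) identity; undefined with one class.
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  auc <- if (n_pos == 0 || n_neg == 0) {
    NA_real_
  } else {
    r <- rank(y_prob)                      # average ranks handle ties
    (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
  }

  list(Accuracy = accuracy, F1 = f1, ROC_AUC = auc)
}

y_true <- factor(c("Active", "Active", "Inactive", "Inactive", "Active"))
y_prob <- c(0.9, 0.6, 0.4, 0.2, 0.3)
res <- benchmark_sketch(y_true, y_prob)
```

In this toy example one active compound falls below the cutoff, so accuracy and F1 are both 0.8, while five of the six active/inactive pairs are ranked correctly.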
Run full BioMoR pipeline
Description
Runs the full BioMoR pipeline on a labelled dataset and returns the trained models together with their benchmark reports.
Usage
biomor_run_pipeline(data, feature_cols = NULL, epochs = 50)
Arguments
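| data | Data frame containing a Label column and descriptor (feature) columns | 
| feature_cols | Optional set of feature columns to use (default NULL) | 
| epochs | Number of autoencoder training epochs (default 50) | 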
Value
A list of trained models and benchmark reports
Compute Brier Score
Description
The Brier score is the mean squared error between predicted probabilities and the true binary outcome (0/1). Lower is better.
Usage
brier_score(y_true, y_prob, positive = "Active")
Arguments
| y_true | True factor labels. | 
| y_prob | Predicted probabilities for the positive class. | 
| positive | Name of the positive class (default "Active"). | 
Value
Numeric Brier score.
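The definition above can be written in a few lines of base R. This is a sketch of the formula only, under the assumption that labels equal to `positive` are coded as 1 and all others as 0; `brier_sketch` is a hypothetical name, not the package function.

```r
# Hypothetical re-implementation of the Brier score definition.
brier_sketch <- function(y_true, y_prob, positive = "Active") {
  y <- as.numeric(as.character(y_true) == positive)  # 1 = positive, 0 = otherwise
  mean((y_prob - y)^2)                               # mean squared error
}

y_true <- factor(c("Active", "Inactive", "Active", "Inactive"))
y_prob <- c(0.9, 0.2, 0.6, 0.4)
bs <- brier_sketch(y_true, y_prob)
```

Here the squared errors are 0.01, 0.04, 0.16, and 0.16, giving a score of 0.0925. A constant prediction of 0.5 would score 0.25.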
Calibrate model probabilities
Description
Calibrates the predicted class probabilities of a trained model on test data, using either Platt scaling or isotonic regression.
Usage
calibrate_model(model, test_data, method = "platt")
Arguments
| model | caret or xgboost model | 
| test_data | test dataframe | 
| method | "platt" or "isotonic" | 
Value
Calibrated probabilities
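For readers unfamiliar with the two methods, the sketch below shows what Platt scaling and isotonic regression do to a vector of raw scores, using only base R (`glm` and `isoreg`). It works on simulated data and does not call `calibrate_model`. Fitting the logistic model directly on the raw score is a simplifying assumption, and the package's internals may differ.

```r
set.seed(1)
n     <- 200
score <- runif(n)               # simulated raw model scores
y     <- rbinom(n, 1, score^2)  # miscalibrated by construction: true probability is score^2

# Platt scaling: logistic regression of the observed outcome on the raw score.
platt_fit  <- glm(y ~ score, family = binomial())
platt_prob <- predict(platt_fit, newdata = data.frame(score = score),
                      type = "response")

# Isotonic regression: monotone non-decreasing step function of the score.
iso_fit  <- isoreg(score, y)
iso_fun  <- as.stepfun(iso_fit)
iso_prob <- iso_fun(score)
```

Platt scaling imposes a sigmoid shape and needs little data. Isotonic regression is more flexible but can overfit small calibration sets.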
Compute optimal threshold for maximum F1 score
Description
Sweeps thresholds between 0 and 1 to find the one that maximizes F1.
Usage
compute_f1_threshold(y_true, y_prob, positive = "Active")
Arguments
| y_true | True factor labels. | 
| y_prob | Predicted probabilities for the positive class. | 
| positive | Name of the positive class (default "Active"). | 
Value
A list with elements:
- threshold: Best probability cutoff. 
- best_f1: Maximum F1 score achieved. 
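The threshold sweep can be illustrated with a short base-R sketch. The grid of cutoffs (0.01 to 0.99 in steps of 0.01), the tie-breaking rule, and the name `f1_threshold_sketch` are assumptions. The documentation does not state which grid the package uses.

```r
# Hypothetical sweep: evaluate F1 at each cutoff and keep the best one.
f1_threshold_sketch <- function(y_true, y_prob, positive = "Active",
                                thresholds = seq(0.01, 0.99, by = 0.01)) {
  is_pos <- as.character(y_true) == positive
  f1 <- vapply(thresholds, function(t) {
    pred_pos <- y_prob >= t
    tp <- sum(pred_pos & is_pos)
    fp <- sum(pred_pos & !is_pos)
    fn <- sum(!pred_pos & is_pos)
    if (tp == 0) 0 else 2 * tp / (2 * tp + fp + fn)  # F1 set to 0 when no true positives
  }, numeric(1))
  best <- which.max(f1)                              # first maximum on ties
  list(threshold = thresholds[best], best_f1 = f1[best])
}

y_true <- factor(c("Active", "Active", "Inactive", "Inactive", "Active"))
y_prob <- c(0.9, 0.6, 0.4, 0.2, 0.3)
out <- f1_threshold_sketch(y_true, y_prob)
```

In this example, lowering the cutoff from 0.5 to just above 0.2 recovers the third active compound at the cost of one false positive, which raises F1 from 0.8 to 6/7.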
Get caret cross-validation control
Description
Creates a caret::trainControl object for cross-validation, configured for two-class problems, ROC-based performance, and optional sampling strategies such as SMOTE or ROSE.
Usage
get_cv_control(cv = 5, sampling = NULL)
Arguments
| cv | Number of folds (default 5). | 
| sampling | Sampling method (e.g., "smote", "rose", or NULL). | 
Value
A caret::trainControl object.
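A control object of the kind described can be built directly with caret. The sketch below is an assumption about which `trainControl` arguments correspond to "two-class, ROC-based" and is not taken from the package source. It is guarded so that it does nothing when caret is not installed.

```r
# Sketch only: the exact arguments used by get_cv_control() are assumed.
if (requireNamespace("caret", quietly = TRUE)) {
  ctrl <- caret::trainControl(
    method          = "cv",                    # k-fold cross-validation
    number          = 5,                       # number of folds
    classProbs      = TRUE,                    # needed for ROC-based summaries
    summaryFunction = caret::twoClassSummary,  # reports ROC, Sens, Spec
    sampling        = NULL                     # or e.g. "smote" / "rose"
  )
}
```

When a sampling method is set, caret applies it inside each resampling fold rather than to the full dataset, so the held-out fold is not resampled.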
Get Embeddings from Autoencoder (stub)
Description
Placeholder for extracting embeddings from a trained autoencoder.
Usage
get_embeddings(ae_obj, data, feature_cols = NULL)
Arguments
| ae_obj | Autoencoder object | 
| data | Input data | 
| feature_cols | Columns to use as features | 
Value
Matrix of embeddings (currently NULL since this is a stub)
Prepare dataset for modeling
Description
Prepares a data frame for modeling by converting the outcome column to a factor.
Usage
prepare_model_data(df, outcome_col = "Label")
Arguments
| df | A data.frame | 
| outcome_col | Name of the outcome column | 
Value
A processed data.frame with factor outcome
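As a rough illustration of this kind of preparation, the base-R sketch below drops rows with a missing outcome and converts the outcome to a factor whose levels are valid R names (caret requires this when class probabilities are requested). Both steps and the name `prepare_sketch` are assumptions. The documentation only guarantees that the result has a factor outcome.

```r
# Hypothetical preparation step; not the package's implementation.
prepare_sketch <- function(df, outcome_col = "Label") {
  stopifnot(is.data.frame(df), outcome_col %in% names(df))
  df <- df[!is.na(df[[outcome_col]]), , drop = FALSE]  # assumed: drop missing outcomes
  df[[outcome_col]] <- factor(make.names(as.character(df[[outcome_col]])))
  df
}

raw <- data.frame(
  x1    = c(0.1, 0.5, 0.9, 0.3),
  Label = c("Active", "Inactive", NA, "Active"),
  stringsAsFactors = FALSE
)
clean <- prepare_sketch(raw)
```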
Train Autoencoder (stub)
Description
Placeholder for future autoencoder integration in BioMoR.
Usage
train_autoencoder(
  data,
  feature_cols = NULL,
  epochs = 10,
  batch_size = 32,
  lr = 0.001
)
Arguments
| data | Input data (matrix or data frame) | 
| feature_cols | Columns to use as features | 
| epochs | Number of training epochs | 
| batch_size | Mini-batch size | 
| lr | Learning rate | 
Value
A placeholder list with class "autoencoder"
Train BioMoR Autoencoder
Description
Trains the BioMoR autoencoder on the selected numeric feature columns.
Usage
train_biomor(data, feature_cols, epochs = 100, batch_size = 50, lr = 0.001)
Arguments
| data | Dataframe with numeric features + Label | 
| feature_cols | Character vector of feature columns | 
| epochs | Number of training epochs | 
| batch_size | Batch size | 
| lr | Learning rate | 
Value
A list with elements model, dataset, and embeddings
Train a Random Forest model with caret
Description
Trains a Random Forest classifier through the caret interface, using the supplied trainControl object for resampling.
Usage
train_rf(df, outcome_col = "Label", ctrl)
Arguments
| df | A data.frame containing predictors and outcome | 
| outcome_col | Name of the outcome column (binary factor) | 
| ctrl | A caret::trainControl object | 
Value
A caret train object
Train an XGBoost model with caret
Description
Trains an XGBoost classifier through the caret interface, using the supplied trainControl object for resampling.
Usage
train_xgb_caret(df, outcome_col = "Label", ctrl)
Arguments
| df | A data.frame containing predictors and outcome | 
| outcome_col | Name of the outcome column (binary factor) | 
| ctrl | A caret::trainControl object | 
Value
A caret train object