requireNamespace() checks
with bare library(<suggests_pkg>) calls; the bare calls would
hard-error if the suggested package was not installed, defeating
the defensive checks elsewhere in the same vignette. The
tidymodels-interop chunk now carries
eval = requireNamespace("recipes", quietly = TRUE) && requireNamespace("yardstick", quietly = TRUE)
in its chunk header, so the chunk is skipped (rather than erroring)
during vignette build when those Suggests packages are absent. The
parallel-setup chunk gains a brief comment documenting that
future is a Suggests dependency. A regression test
(test-vignette-suggests.R) walks the vignette and asserts that
every chunk-level library(<suggests_pkg>) call is inside an
appropriately gated chunk.
predict_guard() is now also accessible through the standard
stats::predict() generic via a registered S3 method
predict.GuardFit(). Calling predict(fit, newdata) on a GuardFit
object dispatches to predict.GuardFit() and yields output that is
bit-identical to the legacy predict_guard(fit, newdata).
predict_guard() is preserved as a thin backward-compatible alias,
so existing code continues to work without modification.
methods(class = "GuardFit") now returns print, summary, and
predict, restoring the standard R idiom for transformer objects.
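The dispatch-plus-alias arrangement described above can be sketched in base R. This is a minimal illustration only: ToyGuardFit, toy_guard_fit(), and toy_predict_guard() are hypothetical stand-ins, not bioLeak's classes or functions, and the centring step is an assumed placeholder for whatever the real transformer does.

```r
# Hypothetical stand-in for a fitted transformer: it stores the
# training-set column means so new data can be centred with them.
toy_guard_fit <- function(x) {
  structure(list(center = colMeans(x)), class = "ToyGuardFit")
}

# S3 method: predict(fit, newdata) dispatches here for ToyGuardFit objects.
predict.ToyGuardFit <- function(object, newdata, ...) {
  sweep(as.matrix(newdata), 2, object$center, "-")
}

# Legacy entry point kept as a thin alias that forwards to the generic,
# so both call styles share one implementation.
toy_predict_guard <- function(fit, newdata) {
  predict(fit, newdata)
}

train <- matrix(c(1, 2, 3, 10, 20, 30), ncol = 2)
new   <- matrix(c(4, 5, 40, 50), ncol = 2)
fit   <- toy_guard_fit(train)

via_generic <- predict(fit, new)
via_alias   <- toy_predict_guard(fit, new)
```

Because the alias only forwards to the generic, the two results are identical by construction. Inside a package the method would additionally be registered in NAMESPACE with S3method(predict, ToyGuardFit).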
Added show() / print() methods to the public result classes that
previously only had summary():
LeakFit: new show() (S4), a brief auto-print giving task,
outcome, learners, fold count, and fold-status one-liner.
LeakAudit: new show() (S4), a brief auto-print giving task,
outcome, permutation-gap statistics, and component row counts
(batch association, target leakage, duplicates).
LeakTune: new print() (S3), a brief auto-print giving outer-fold
success rate, tuning-grid size, selection rule, and refit status.
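A brief auto-print method of this kind might look as follows for an S4 and an S3 class. ToyFit and ToyTune are hypothetical classes with invented fields, used only to show the mechanism; the actual bioLeak output is not reproduced here.

```r
library(methods)

# Hypothetical S4 class standing in for a result object.
setClass("ToyFit", representation(task = "character", n_folds = "integer"))

# show() is the method that auto-printing calls for S4 objects.
setMethod("show", "ToyFit", function(object) {
  cat("<ToyFit> task:", object@task, "| folds:", object@n_folds, "\n")
  cat("Use summary(<obj>) for the full report.\n")
  invisible(NULL)
})

# Hypothetical S3 class: auto-printing calls print() instead.
print.ToyTune <- function(x, ...) {
  cat("<ToyTune> grid size:", x$grid_size, "\n")
  cat("Use summary(<obj>) for the full report.\n")
  invisible(x)
}

fit  <- new("ToyFit", task = "binomial", n_folds = 5L)
tune <- structure(list(grid_size = 12L), class = "ToyTune")

# Capture the printed lines so they can be inspected programmatically.
out_fit  <- capture.output(show(fit))
out_tune <- capture.output(print(tune))
```

The S3 method returns its argument invisibly, which is the usual convention for print() methods and keeps piping and assignment working.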
Each method ends with a one-line hint pointing to summary(<obj>)
for the full diagnostic report. methods(class = ...) now returns
show/print alongside summary for all three classes.
.guard_fit() is renamed to guard_fit() and .guard_ensure_levels()
is renamed to guard_ensure_levels(). Leading-dot prefixes on
exported functions are unconventional and were causing these
helpers to appear awkwardly in help(package = "bioLeak"). Behavior,
arguments, and return values are unchanged; only the names move from
the dot-prefixed form to ordinary names. Internal callers
(fit_resample(), impute_guarded(), predict_guard()'s
documentation, and the package vignette) are updated to use the new
names.
New accessor functions for LeakFit, LeakAudit, and
LeakDeltaLSI objects avoid reaching into S4 internals via @.
The new accessors are purely additive; slot definitions are unchanged
and existing code that uses @ continues to work.
LeakFit: fit_metrics().
LeakAudit: audit_perm_gap(), audit_batch_assoc(),
audit_target_assoc(), audit_duplicates(), audit_info().
LeakDeltaLSI: dlsi_metric(), dlsi_robust(), dlsi_ci(),
dlsi_p_value(), dlsi_tier(), dlsi_R_eff(), dlsi_repeats().
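A validating accessor of this shape can be sketched in a few lines. ToyAudit and toy_audit_perm_gap() are hypothetical names chosen for illustration; the slot layout and error wording are assumptions, not the package's actual definitions.

```r
library(methods)

# Hypothetical S4 class with a single data-frame slot.
setClass("ToyAudit", representation(perm_gap = "data.frame"))

# Accessor: check the class first, then return the slot unchanged.
toy_audit_perm_gap <- function(x) {
  if (!is(x, "ToyAudit")) {
    stop("toy_audit_perm_gap() expects a ToyAudit object, not <",
         class(x)[1], ">.", call. = FALSE)
  }
  x@perm_gap
}

audit <- new("ToyAudit", perm_gap = data.frame(gap = 0.12))

via_accessor <- toy_audit_perm_gap(audit)  # preferred access path
via_slot     <- audit@perm_gap             # direct slot access still works

# Calling the accessor on the wrong object yields a readable error message.
wrong <- tryCatch(toy_audit_perm_gap(1:3),
                  error = function(e) conditionMessage(e))
```

Since the accessor returns the slot as-is, it stays purely additive: code that reads the slot directly sees the same value.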
Each accessor performs an is(x, "<Class>") validation and emits an
informative error when called on the wrong object.
delta_lsi(): inference tier strings renamed to accurately reflect what each
tier provides. "C_point_only" → "C_signflip" (the sign-flip p-value is
available at this tier, not just point estimates); "B_ci_only" →
"B_signflip_ci" (both the sign-flip p-value and BCa CI are available).
Code that compares result@tier against the old string literals must be
updated.
delta_lsi() gains a block_size argument and makes exchangeability
actionable for "blocked_time" inputs. When exchangeability = "blocked_time",
the sign-flip test now uses a block procedure that flips contiguous blocks of
repeats together, preserving serial autocorrelation under the null.
block_size is auto-estimated from the AR(1) of the repeat-level deltas when
NULL (default) and capped at floor(R/3) to guarantee at least three
independent blocks. The @info slot gains block_size_used and n_blocks
fields. If the block structure yields fewer than five independent blocks,
@p_value is set to NA and a warning is issued.
delta_lsi() now emits an explicit warning when exchangeability is
"by_group" or "within_batch", informing users that those modes are stored
but inference still uses the iid sign-flip procedure. Previously these values
were accepted silently without affecting computation.
fit_resample(): compact + combined mode now correctly excludes
constraint-axis violations from training sets. Previously the compact
fallback used setdiff(all, test), ignoring multi-axis constraints declared
via make_split_plan(constraints = ...). The same fix is applied in the
as_rsample() conversion path for consistency.
delta_lsi(): R_eff and the inference tier are now recomputed after
repeat-level intersection, so that dropped all-NA repeats correctly reduce the
effective sample size and select the appropriate tier.
fit_resample(): fold error messages are now correctly captured when running
in parallel via future.apply. Previously <<- mutations inside worker
processes were silently lost; errors are now attached as result attributes and
extracted after the parallel map.
tune_resample(): fold-ID columns (id, id2, .notes) no longer leak
into hyperparameter aggregation in the internal select_config() helper.
summary.LeakFit() now returns object@metric_summary invisibly, matching
the documented return value (previously returned the object itself).
bioLeak-intro) referencing a shadowed data frame for sample
count; now reads from fit_safe@splits@info$coldata.
audit_leakage() roxygen documenting a duplicates column named
in_train_test; the actual column name is cross_fold.
make_split_plan(): time-series mode now warns and skips folds with fewer
than 3 test samples instead of producing degenerate folds.
fit_resample(): added bounds checking for repeat_id in compact fold
resolution to produce a clear error instead of a cryptic index failure.
show() and summary() for LeakDeltaLSI now label the sign-flip p-value
as testing mean(Δr) (delta_metric), not delta_lsi, making the
estimator–inference pairing explicit.
summary() prints a diagnostic note when the sign-flip p-value and BCa CI
lead to qualitatively different conclusions (one significant, one spanning
zero), which can occur when outlier repeats pull the arithmetic mean away from
the Huber estimate.
summary() prints the block size and number of blocks used when
exchangeability = "blocked_time".
constraints in make_split_plan(),
generalizing beyond two-axis combined CV while preserving train/test exclusion
across all declared axes.
compact = TRUE split storage (fold assignments) for large datasets to
reduce split object memory footprint.
check_split_overlap() for explicit overlap-invariant validation across
fold/group axes.
cv_ci() (with Nadeau-Bengio correction) and integrated CI columns into
fit_resample() and tune_resample() metric summaries (*_ci_lo, *_ci_hi).
guard_to_recipe() to map guarded preprocessing configurations to
recipes pipelines with explicit fallback/warning behavior.
benchmark_leakage_suite() for reproducible modality-by-mechanism
benchmark grids and detection-rate summaries.
audit_leakage() diagnostics with mechanism taxonomy fields
(mechanism_class, taxonomy, mechanism_summary) and richer risk
attribution outputs.
p_value_adj, flag_fdr) with selectable
multiple-testing correction (target_p_adjust, target_alpha).
feature_space (raw/rank) and duplicate_scope
(train_test/all) controls for duplicate diagnostics.
perm_mode handling for
rsample-derived splits and safer perm_refit = "auto" behavior.
split_cols = "auto", mode/perm-mode propagation, stricter
compatibility checks).
tune_resample(): final refit now aggregates
hyperparameters across outer folds (median/majority) instead of selecting a
single best outer fold.
tune_resample() using inner-fold
predictions (tune_threshold, threshold_grid, threshold_metric).
fold_status) and elapsed timing in
both fitting and tuning paths for better failure-mode observability.
bioLeak.strict,
bioLeak.validation_mode) with structured condition classes for safer recipe
and workflow guardrails.
.bio_capture_provenance) and attached provenance
metadata to LeakFit, LeakAudit, and LeakTune.
summary.LeakAudit() output with explicit Mechanism Risk Assessment
reporting.
fit_resample() to avoid fold-time failures
when recipes reference split metadata columns (for example subject).
simulate_leakage_suite() default B, auto refit cap handling).
paper/ with refreshed large-scale
simulation outputs and case-study artifacts.
tune_resample(): nested
cross-validation using tidymodels tune/dials with leakage-aware outer
splits.
fit_resample() now accepts rsample
rset/rsplit objects as splits, recipes::recipe for preprocessing,
workflows::workflow as learner, and yardstick::metric_set for metrics.
as_rsample() converts LeakSplits to an rsample rset.
learner argument in
fit_resample().
calibration_summary() and plot_calibration()
for probability calibration checks; confounder_sensitivity() and
plot_confounder_sensitivity() for sensitivity analysis.
simulate_leakage_suite() for generating controlled
leakage scenarios and benchmarking audit sensitivity.
audit_report(): renders a self-contained HTML
summary of all audit results for sharing and review.
audit_leakage_by_learner() to audit each
learner in a multi-model fit separately.
audit_leakage()
for supported tasks, complementing the existing univariate scan.
perm_refit = TRUE or "auto") in
audit_leakage() for a more powerful permutation gap test when refit data
are available.
fit_resample() for imbalanced classification
tasks.
plot_fold_balance(), plot_overlap_checks(),
plot_perm_distribution(), plot_time_acf().
LeakSplits, LeakFit, LeakAudit) now include setValidity
checks for slot consistency.
summary() methods for LeakFit, LeakAudit, and LeakTune improved with
clearer console output and edge-case handling.
impute_guarded() gains enhanced diagnostics and RNG safety.
.guard_fit() and .guard_ensure_levels() made more robust with better error
messages.
permute_labels) gains verbose mode, digest-based
caching, and improved stratification safety.
audit_leakage() handles NA metrics gracefully and enriches trail metadata.
make_split_plan() improved stratification logic and reproducible seeding.
audit_report() now renders from a temporary copy of the Rmd template to
avoid write failures on read-only file systems (e.g. during R CMD check).
bioLeak-intro) rewritten with guided workflow and
leaky-vs-correct comparisons.
fit_resample() result aggregation when folds fail during
preprocessing.
missForest preprocessing dropping rows.
glmnet folds receiving non-numeric design matrices.
make_split_plan() for leakage-aware splitting
(subject-grouped, batch-blocked, study leave-out, time-ordered);
fit_resample() for cross-validated fitting with built-in guarded
preprocessing (train-only imputation, normalisation, filtering, feature
selection).
audit_leakage() with label-permutation gap test,
batch/study association tests, univariate target leakage scan, and
near-duplicate detection.
impute_guarded(), predict_guard(),
.guard_fit(), .guard_ensure_levels().
LeakSplits, LeakFit, LeakAudit.
glm, glmnet, ranger, xgboost (via
custom_learners).
SummarizedExperiment input support.