## New features (`tl_read()` family)

* `tl_read()` — dispatcher function that auto-detects the format from the file extension, URL pattern, or connection string and routes to the appropriate reader.
* All readers return a `tidylearn_data` object, a tibble subclass carrying source, format, and timestamp metadata, displayed by a custom `print.tidylearn_data()` method.
* `tl_read_csv()` / `tl_read_tsv()` — via readr, with a base R fallback.
* `tl_read_excel()` — `.xls`, `.xlsx`, and `.xlsm` files via readxl.
* `tl_read_parquet()` — via nanoparquet.
* `tl_read_json()` — tabular JSON via jsonlite.
* `tl_read_rds()` / `tl_read_rdata()` — native R formats via base R.
* `tl_read_db()` — query any live DBI connection.
* `tl_read_sqlite()` — auto-connect to SQLite files via RSQLite.
* `tl_read_postgres()` — connection string or named parameters via RPostgres.
* `tl_read_mysql()` — connection string or named parameters via RMariaDB.
* `tl_read_bigquery()` — Google BigQuery via bigrquery.
* `tl_read_s3()` — download and read from S3 URIs via paws.storage.
* `tl_read_github()` — download raw files from GitHub repositories.
* `tl_read_kaggle()` — download datasets via the Kaggle CLI.
* `tl_read()` accepts a character vector of paths — each file is read and the results are row-bound with a `source_file` column.
* `tl_read_dir()` — scan a directory for data files, with optional format, pattern, and recursive filtering.
* `tl_read_zip()` — extract and read from zip archives, with optional file selection.
* `tl_check_packages()` now covers the optional packages used by `tl_read()` in the workflow.
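A minimal usage sketch of the dispatcher described above; the file paths are hypothetical, and only behavior stated in this changelog is assumed:

```r
library(tidylearn)

# Format is auto-detected from the extension and routed to the right reader.
sales <- tl_read("data/sales_2024.csv")

# A character vector of paths: each file is read and row-bound,
# with a source_file column recording where each row came from.
quarters <- tl_read(c("data/q1.parquet", "data/q2.parquet"))

# The result is a tibble subclass; printing shows source, format,
# and timestamp metadata via print.tidylearn_data().
print(sales)
```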
in the workflowtl_transfer_learning() hanging indefinitely when
used with PCA pre-training. The .obs_id row-identifier
column from PCA output was being included in the supervised formula,
creating a massive dummy-variable matrix. The column is now stripped
before both training and prediction.tl_run_pipeline() failing with “attempt to select
less than one element” when all cross-validation metrics were NA. Root
cause: scale() returned matrix columns instead of vectors,
causing downstream metric computation to produce NaN. Added
as.vector() wrapper and hardened the best-model selection
to handle all-NA metric values gracefully.tl_auto_ml() time budget enforcement. The
budget now controls which models are attempted: budgets under 30s skip
slow C-level models (forest, SVM, XGBoost) entirely, and
cross-validation is skipped when remaining time is tight. Baseline model
order changed to fast-first (tree, logistic/linear, then forest). See
?tl_auto_ml for full details on budget tiers.tl_interaction_effects() crashing with “unused
argument (se.fit)” because tidylearn’s predict() method
does not support se.fit. Now uses
stats::predict() on the raw model object for confidence
intervals. Also fixed an invalid formula in the internal slope
calculation.tl_plot_interaction() expecting
fit/lwr/upr columns from
predict() output. Now correctly handles tidylearn’s
.pred tibble format.tl_plot_intervals() calling non-existent
tl_prediction_intervals() function. Now computes confidence
and prediction intervals directly via
stats::predict(..., interval = "confidence") and
stats::predict(..., interval = "prediction").tl_plot_svm_boundary() erroring with “at least
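For reference, the base R interval calls that the `tl_plot_intervals()` fix relies on behave like this on a plain `lm` fit (standard stats usage, not tidylearn-specific):

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
new_cars <- data.frame(wt = 3.0, hp = 120)

# Confidence interval: uncertainty in the mean response at new_cars.
predict(fit, new_cars, interval = "confidence")

# Prediction interval: uncertainty for a single new observation (wider).
predict(fit, new_cars, interval = "prediction")
```

Both calls return a matrix with `fit`, `lwr`, and `upr` columns, which is the shape `tl_plot_interaction()` previously expected from tidylearn's own `predict()` method.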
* Fixed `tl_plot_svm_boundary()` erroring with “at least two predictor variables required” when using `response ~ .` formulas. The function now resolves predictors from the data column names instead of `all.vars()`, which does not expand `.`. Also switched from `geom_contour_filled` (which failed on discrete class predictions) to `geom_raster`.
* Fixed `tl_plot_svm_tuning()` passing NULL entries in the `ranges` list to `e1071::tune()`, which caused “NA/NaN/Inf in foreign function call” errors. Tuning ranges are now built conditionally based on the kernel type.
* Fixed `tl_plot_xgboost_shap_summary()` failing with “arguments imply differing number of rows” when `n_samples` differed from `nrow(data)`. Sampling is now performed before SHAP computation so that feature values and SHAP values always have the same number of rows.
* Fixed `tl_check_assumptions()` crashing with “list object cannot be coerced to logical” when some assumption checks returned NULL (e.g., when optional test packages were not installed).
* Corrected the default `gamma` calculation to use the predictor count only (`1 / (ncol(data) - 1)`) instead of including the response column.
* Added the missing `@return` tag to `print.tidylearn_data()`.

## Internal changes

* Replaced the deprecated `size` parameter with `linewidth` in all `geom_line()` calls across the visualization, classification, PCA, DBSCAN, and validation plotting functions.
* Cleaned up `tl_default_param_grid()`, `tl_tune_grid()`, `tl_tune_random()`, `tl_plot_tuning_results()`, and input validation.
* Replaced `1:n` patterns with `seq_len()` / `seq_along()`.
* Added a lintr configuration enforcing `%>%` pipe consistency.
## New features (`tl_table()` family)

* `tl_table()` — dispatcher function that mirrors `plot()` but produces formatted gt tables instead of ggplot2 visualisations.
* `tl_table_metrics()` — styled evaluation metrics table from `tl_evaluate()`.
* `tl_table_coefficients()` — model coefficients with p-values (lm/glm) or sorted by magnitude (glmnet), with conditional highlighting.
* `tl_table_confusion()` — confusion matrix with correct predictions highlighted on the diagonal.
* `tl_table_importance()` — ranked feature importance with a colour gradient.
* `tl_table_variance()` — PCA variance explained with the cumulative % coloured.
* `tl_table_loadings()` — PCA loadings with a diverging red–blue colour scale.
* `tl_table_clusters()` — cluster sizes and mean feature values for kmeans, pam, clara, dbscan, and hclust models.
* `tl_table_comparison()` — side-by-side multi-model comparison table.
* All tables share a consistent gt theme via the internal `tl_gt_theme()` helper.
* gt is a suggested dependency — the functions error with an install message if gt is not available.
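A sketch of the dispatcher pattern described above, mirroring how `plot()` is used. The `tl_model()` argument names and order shown here are illustrative assumptions, not documented signatures:

```r
library(tidylearn)

# Illustrative only: tl_model()'s argument names are assumed here,
# and iris / "forest" are arbitrary example choices.
model <- tl_model(iris, Species ~ ., type = "forest")

tl_table(model)            # dispatches on the model, like plot(),
                           # but renders a formatted gt table
tl_table_confusion(model)  # confusion matrix, diagonal highlighted
```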
## Bug fixes

* Fixed `tl_fit_dbscan()` returning a non-existent `core_points` field instead of `summary` from the underlying `tidy_dbscan()` result.
* Fixed `plot()` failing on supervised models with “could not find function ‘tl_plot_model’” by implementing the missing `tl_plot_model()` and `tl_plot_unsupervised()` internal dispatchers (#1).
* Fixed `tl_plot_actual_predicted()`, `tl_plot_residuals()`, and `tl_plot_confusion()` failing due to accessing a non-existent `$prediction` column on `predict()` output (the correct column is `$.pred`).
* Fixed the same `$prediction` column mismatch in the `tl_dashboard()` predictions table.

## Initial release

* `tl_model()` - Single function to fit 20+ machine learning models.
* Underlying fitted model objects are accessible via `$fit` for package-specific functionality.
* `tl_split()` - Train/test splitting with stratification support.
* `tl_prepare_data()` - Data preprocessing (scaling, imputation, encoding).
* `tl_evaluate()` - Model evaluation with multiple metrics.
* `tl_auto_ml()` - Automated machine learning.
* `tl_tune()` - Hyperparameter tuning with grid and random search.

tidylearn wraps established R packages including: stats, glmnet, randomForest, xgboost, gbm, e1071, nnet, rpart, cluster, dbscan, MASS, and smacof.
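The core workflow implied by the functions above can be sketched as follows. The argument names (`prop`) and the `$train` / `$test` components of the split are assumptions for illustration, not documented signatures:

```r
library(tidylearn)

# Split, fit, evaluate: the core tidylearn loop.
# (prop and split$train / split$test are assumed names.)
split <- tl_split(iris, prop = 0.8)
model <- tl_model(split$train, Species ~ ., type = "forest")
tl_evaluate(model, split$test)  # multiple metrics at once

# Drop to the wrapped randomForest object when
# package-specific functionality is needed.
model$fit
```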