vimp 2.3.6
Major changes
- Update the way that PPV, NPV, sensitivity, and specificity are
calculated so that they return meaningful answers if all predictions are
the same (for example, if using the mean outcome value to predict)
Minor changes
- Add tests for PPV, NPV, sensitivity, specificity
vimp 2.3.5
Major changes
None
Minor changes
- Update reference pages to pass new CRAN checks for links and
items
vimp 2.3.4
Major changes
None
Minor changes
vimp 2.3.3
Major changes
- Add clustered bootstrap and associated unit tests
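As a rough illustration of what a clustered bootstrap does in general, the base-R sketch below resamples whole clusters with replacement and keeps every observation from each drawn cluster. It is not the package's implementation; the helper name cluster_boot_sample and the toy data are hypothetical.

```r
set.seed(2024)
# Toy data (hypothetical): 5 clusters with 4 observations each
dat <- data.frame(cluster = rep(1:5, each = 4), y = rnorm(20))

# Hypothetical helper: draw cluster IDs with replacement, then keep
# all rows belonging to each drawn cluster (a cluster may appear twice)
cluster_boot_sample <- function(data, cluster_col) {
  ids <- unique(data[[cluster_col]])
  drawn <- sample(ids, size = length(ids), replace = TRUE)
  do.call(rbind, lapply(drawn, function(id) {
    data[data[[cluster_col]] == id, , drop = FALSE]
  }))
}

boot <- cluster_boot_sample(dat, "cluster")
```

Because clusters are resampled as units, within-cluster dependence is preserved in each bootstrap replicate.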
Minor changes
- Update software author list
- Fix roxygen2 CRAN bug for package documentation
vimp 2.3.2
Major changes
- Fixed bugs introduced in 2.3.1 for
final_point_estimate = "average"
vimp 2.3.1
Major changes
- In cases where sample-splitting is used (which is required for valid
inference under the null hypothesis of zero variable importance), there
is now the option to report a point estimate that is based on the entire
dataset, rather than only the split on which inference (confidence
intervals and p-values) is performed. The point estimator (using either
the single split, the full dataset, or the average of the two
split-specific point estimates) is valid regardless of whether the null
holds or not. If this option is chosen, there may be a discrepancy
between the point estimate and the interval estimate; this is likely to
occur only in small-sample (or small effective sample-size, for binary
outcomes) settings.
Minor changes
- For predictiveness measures that lie in [0, 1] by definition
(accuracy, ANOVA, R-squared, deviance, AUC), the default is now to
compute confidence intervals on the logit scale, which guarantees that
the interval will also lie in [0, 1]. Note that this means the interval
will not be centered at the point estimate; however, it retains the
desired level of coverage.
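To see why a logit-scale interval stays inside [0, 1] and is not centered at the point estimate, consider the following minimal sketch. It assumes a Wald-type interval with a delta-method standard error; the helper name logit_ci and the numbers are hypothetical, and this is not necessarily how the package computes its intervals.

```r
# Hypothetical helper: Wald-type CI on the logit scale, back-transformed
logit_ci <- function(est, se, level = 0.95) {
  z <- stats::qnorm(1 - (1 - level) / 2)
  # Delta method: d/dp logit(p) = 1 / (p * (1 - p))
  se_logit <- se / (est * (1 - est))
  stats::plogis(stats::qlogis(est) + c(-1, 1) * z * se_logit)
}

# Illustrative values (assumed): estimate near the boundary
ci <- logit_ci(est = 0.95, se = 0.04)
# Untransformed Wald interval for comparison; its upper limit exceeds 1
naive <- 0.95 + c(-1, 1) * stats::qnorm(0.975) * 0.04
```

The back-transformed limits always fall strictly between 0 and 1, at the cost of asymmetry around the estimate.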
vimp 2.3.0
Major changes
- Predictiveness measures now have their own S3 class,
which makes internal code cleaner and facilitates simpler addition of
new predictiveness measures.
- In this version, the default return value of
extract_sampled_split_predictions is a vector, not a list.
This facilitates proper use in the new version of the package.
Minor changes
- You can now specify truncate = FALSE in vimp_ci
vimp 2.2.11
Major changes
- You can now compute variable importance using the average value
under the optimal treatment rule. This includes functions
measure_avg_value (computes the average value and efficient
influence function) and updates to vim, cv_vim, and sp_vim.
Minor changes
vimp 2.2.10
Major changes
Minor changes
- Specify method and family for weighted EIF
estimation within outer functions (vim, cv_vim, sp_vim) rather than
the measure* functions. This allows compatibility for binary
outcomes.
- Added a vignette for coarsened-data settings.
vimp 2.2.9
Major changes
Minor changes
- Allow for unequal numbers of cross-fitting folds between full and
reduced predictiveness
vimp 2.2.8
Major changes
Minor changes
- Return objects in sp_vim that are necessary to compute
the test statistics
vimp 2.2.7
Major changes
Minor changes
- Allow parallel argument to be specified for calls to
CV.SuperLearner but not for calls to SuperLearner
vimp 2.2.6
Major changes
Minor changes
- Allow different types of bootstrap interval (e.g., percentile) to be
computed
- More precise documentation for Z in coarsened-data
settings; allow case-insensitive specification of covariate
names/positions when creating Z
- V defaults to 5 if no cross-fitting folds are specified
externally
- More precise documentation for cross_fitted_f1 and cross_fitted_f2 in cv_vim
- Allow non-list cross_fitted_f1 and cross_fitted_f2 in cv_vim
vimp 2.2.5
Major changes
Minor changes
- Update how cv_vim handles an odd number of outer folds
being passed with pre-computed regression function estimates. Now, you
can use an odd number of folds (e.g., 5) to estimate the full and
reduced regression functions and still obtain cross-validated variable
importance estimates.
vimp 2.2.4
Major changes
Minor changes
- Allow for odd number of folds in cross-fit and sampled-split VIM
estimation
- Add vrc01 data as an exported object
- Change dataset for vignettes to vrc01 data
vimp 2.2.3
Major changes
- Updated computation of standard errors. Some of the changes in
v2.2.0 (namely, that the efficient influence function can be estimated
on the entire dataset regardless of whether or not sample-splitting was
requested) do not match with the form of the standard error estimator
that we use. In this update, we ensure that independent data are used to
estimate both the predictiveness and the efficient influence
function; however, the nuisance functions may still be estimated on a
larger portion of the data than in versions prior to v2.2.0 when
cross-fitting is used.
Minor changes
- Added explicit-value tests for point estimates throughout
testthat/
- Harmonized vignettes with new SE computation
- Allow C to not be specified in make_folds
vimp 2.2.2
Major changes
None
Minor changes
- Increased tolerance for AUC vs CV-AUC
vimp 2.2.1
Major changes
- Updated the internals of measure_auc to hew more
closely to ROCR and cvAUC, using computational
tricks to speed up weighted AUC and EIF computation.
Minor changes
vimp 2.2.0
Major changes
- Added argument cross_fitted_se to cv_vim and sp_vim; this logical
option allows the standard error
to be estimated using cross-fitting. This can improve performance in
cases where flexible algorithms are used to estimate the full and
reduced regressions.
- Added bootstrap-based standard error estimates as an option to both
vim and cv_vim; currently, this option is only
available for non-sampled-split calls (i.e., with sample_splitting = FALSE)
- Updated sample-splitting behavior to match more closely with
theoretical results (and improve power!): namely, that since estimation
of the nuisance regression functions (i.e., the regression of outcome on
all covariates and outcome on the reduced set of covariates) can be
treated as fixed in making inference, sample-splitting is only necessary
for evaluating predictiveness. Thus, the final regression functions from
a call to vim are based on the entire dataset, while the
full and reduced predictiveness (predictiveness_full and
predictiveness_reduced, along with the corresponding
confidence intervals) are evaluated using separate portions of the data
for the full and reduced regressions.
- Added argument sample_splitting to vim, cv_vim, and sp_vim; if FALSE,
sample-splitting is not used to estimate predictiveness. Note that we
recommend using the default, TRUE, in all cases, since
inference using sample_splitting = FALSE will be invalid
for variables with truly null variable importance.
- Updated cross-fitting (also referred to as cross-validation)
behavior within sample_splitting = TRUE to match more
closely with theoretical results (and improve power!). In this case, we
first split the data into \(2K\) cross-fitting folds, and split these folds
equally into two sample-splitting folds. For the nuisance regression using
all covariates, for each \(k \in \{1, \ldots, K\}\) we set aside the data in
sample-splitting fold 1 and cross-fitting fold \(k\) [this comprises
\(1 / (2K)\) of the data]. We train using the remaining observations
[comprising \((2K-1)/(2K)\) of the data] not in this testing fold, and we
test on the originally withheld data. We repeat for the nuisance regression
using the reduced set of covariates, but withhold data in sample-splitting
fold 2. This update affects both cv_vim and sp_vim. If
sample_splitting = FALSE, then we use standard cross-fitting.
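The fold layout described in the last item can be sketched in base R as follows. This is only an illustration under stated assumptions: the assignment of odd-numbered cross-fitting folds to sample-splitting fold 1 is an arbitrary choice made here, and none of the variable names come from the package.

```r
set.seed(1)
n <- 120  # illustrative sample size (assumed)
K <- 3    # number of cross-fitting folds per sample-splitting fold

# Assign each observation to one of 2K cross-fitting folds of equal size
cf_fold <- sample(rep(seq_len(2 * K), length.out = n))

# Split the 2K folds equally into two sample-splitting folds
# (assumption for this sketch: odd folds -> 1, even folds -> 2)
ss_fold <- ifelse(cf_fold %% 2 == 1, 1L, 2L)

# Full regression: test on each cross-fitting fold in sample-splitting
# fold 1, train on every observation outside that testing fold
full_test_folds <- sort(unique(cf_fold[ss_fold == 1]))
split_sizes <- t(sapply(full_test_folds, function(k) {
  c(fold = k, n_test = sum(cf_fold == k), n_train = sum(cf_fold != k))
}))
```

With n = 120 and K = 3, each testing fold holds 1/(2K) of the data and each training set holds the remaining (2K-1)/(2K). The reduced regression would repeat the loop over the folds in sample-splitting fold 2.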
Minor changes
- Use >= in computing the numerator of AUC with
inverse probability weights
- Update roxygen2 documentation for wrappers
(vimp_*) to inherit parameters and details from cv_vim (reduces
potential for documentation mismatches)
vimp 2.1.10
Major changes
None
Minor changes
- Automatically determine the family if it isn’t
specified; use stats::binomial() if there are only two
unique outcome values, otherwise use stats::gaussian()
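The rule above can be written in a few lines of base R. The helper name guess_family is hypothetical and this sketch assumes the outcome has no missing values; the package's internal logic may differ in its details.

```r
# Hypothetical helper mirroring the rule described above:
# two unique outcome values -> binomial, otherwise gaussian
guess_family <- function(y) {
  if (length(unique(y)) == 2) stats::binomial() else stats::gaussian()
}

fam_binary <- guess_family(c(0, 1, 1, 0))
fam_continuous <- guess_family(c(0.3, 1.7, 2.2, 5.1))
```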
vimp 2.1.9
Major changes
None
Minor changes
- Update sensitivity and specificity to use weak inequalities rather
than strict inequalities (better aligns with cvAUC)
- Add a test of CV-AUC estimation against cvAUC
- Borrow information across folds for empirically estimated quantities
(e.g., the outcome variance or probability of a certain class);
asymptotically equivalent to the prior procedure, but could result in
small-sample differences
- Use fold-specific EIFs for cross-validated SE estimation (again,
asymptotically equivalent to the prior procedure, but could result in
small-sample differences)
vimp 2.1.8
Major changes
None
Minor changes
- Allow the user to specify either an augmented inverse probability of
coarsening (AIPW, the default) estimator in coarsened-at-random
settings, or specify an IPW estimator, using new argument
ipc_est_type (available in vim, cv_vim, and sp_vim; also
corresponding wrapper
functions for each VIM and corresponding internal estimation
functions)
vimp 2.1.7
Major changes
None
Minor changes
- Updated internals so that stratified estimation can be performed in
outer regression functions for binary outcomes, but that in the case of
two-phase samples the stratification won’t be used in any internal
regressions with continuous outcomes
- Updated internals to allow stratification on both the outcome and
observed status, so that there are sufficient cases per fold for both
the phase 1 and phase 2 regressions (only used with two-phase
samples)
vimp 2.1.6
Major changes
None
Minor changes
- Updated links to DOIs and package vignettes throughout
- Updated all tests in testthat/ to use glm rather than xgboost (increases speed)
- Updated all examples to use glm rather than xgboost or ranger (increases speed, even
though the regression is now misspecified for the truth)
- Removed forcats from vignette
vimp 2.1.5
Major changes
None
Minor changes
- Fixed a bug where if the number of rows in the different folds (for
cross-fitting or sample-splitting) differed, the matrix of fold-specific
EIFs had the wrong number of rows
- Changes to internals of measure_accuracy and measure_auc for project-wide consistency
- Update all tests in testthat/ to not explicitly load xgboost
vimp 2.1.4
Major changes
None
Minor changes
- Fixed a bug where if the number of rows in the different folds (for
cross-fitting or sample-splitting) differed, the EIF had the wrong
number of rows
vimp 2.1.3
Major changes
None
Minor changes
- Compute logit transforms using stats::qlogis and stats::plogis rather than bespoke functions
vimp 2.1.2
Major changes
None
Minor changes
- Bugfix from 2.1.1.1: compute the correction correctly
vimp 2.1.1.1
Major changes
None
Minor changes
- Allow confidence interval (CI) and inverse probability of coarsening
corrections on different scales (e.g., log) to ensure that estimates and
CIs lie in the parameter space
vimp 2.1.1
Major changes
- Compute one-step estimators of variable importance if inverse
probability of censoring weights are entered. You input the weights,
indicator of coarsening, and observed variables, and vimp will handle the rest.
Minor changes
- Created new vignettes “Types of VIMs” and “Using precomputed
regression function estimates in vimp”
- Updated main vignette to only use run_regression = TRUE for simplicity
- Added argument verbose to sp_vim; if TRUE, messages are printed
throughout fitting that display
progress and verbose is passed to SuperLearner
- Change names of internal functions from
cv_predictiveness_point_est and predictiveness_point_est to
est_predictiveness_cv and est_predictiveness,
respectively
- Removed functions cv_predictiveness_update, cv_vimp_point_est,
cv_vimp_update, predictiveness_update, vimp_point_est, vimp_update;
this functionality is now in est_predictiveness_cv and
est_predictiveness (for the *update* functions) or directly in vim or
cv_vim (for the *vimp* functions)
- Removed functions predictiveness_se and predictiveness_ci
(functionality is now in vimp_se and vimp_ci, respectively)
- Changed weights argument to ipc_weights,
clarifying that these weights are meant to be used as inverse
probability of coarsening (e.g., censoring) weights
vimp 2.1.0
Major changes
- Added functions sp_vim, sample_subsets,
spvim_ics, spvim_se; these allow computation
of Shapley Population Variable Importance (SPVIM)
Minor changes
None
vimp 2.0.2
Major changes
- Removed functions sp_vim and helper functions run_sl,
sample_subsets, spvim_ics, spvim_se; these will be added in a
future release
- Removed function cv_vim_nodonsker, since cv_vim supersedes this function
Minor changes
- Modify examples to pass all CRAN checks
vimp 2.0.1
Major changes
- Added new function sp_vim and helper functions run_sl,
sample_subsets, spvim_ics, spvim_se; these functions allow
computation of the Shapley Population Variable Importance Measure
(SPVIM)
- Both cv_vim and vim now use an outer layer
of sample splitting for hypothesis testing
- Added new functions vimp_auc, vimp_accuracy, vimp_deviance, vimp_rsquared
- vimp_regression is now deprecated; use vimp_anova instead
- added new function vim; each variable importance
function is now a wrapper function around vim with the type argument filled in
- cv_vim_nodonsker is now deprecated; use cv_vim instead
- each variable importance function now returns a p-value based on the
(possibly conservative) hypothesis test against the null of zero
importance (with the exception of vimp_anova)
- each variable importance function now returns the estimates of the
individual risks (with the exception of vimp_anova)
- added new functions to compute measures of predictiveness (and
cross-validated measures of predictiveness), along with their influence
functions
Minor changes
- Return tibbles in cv_vim, vim, merge_vim, and average_vim
vimp 1.1.6
Major changes
None
Minor changes
- Changed tests to handle gam package update by switching
library to SL.xgboost, SL.step, and SL.mean
- Added small unit tests for internal functions
vimp 1.1.5
Major changes
None
Minor changes
- Attempt to handle gam package update in unit tests
vimp 1.1.4
Major changes
None
Minor changes
- cv_vim and cv_vim_nodonsker now return the
cross-validation folds used within the function
vimp 1.1.3
Major changes
None
Minor changes
- users may now only specify a family for the top-level
SuperLearner if run_regression = TRUE; in all cases, the
second-stage SuperLearner uses a gaussian family
- if the SuperLearner chooses SL.mean as the best-fitting
algorithm, the second-stage regression is now run using the original
outcome, rather than the first-stage fitted values
vimp 1.1.2
Major changes
- added function cv_vim_nodonsker, which computes the
cross-validated naive estimator and the update on the same, single,
validation fold. This does not allow for relaxation of the Donsker class
conditions.
Minor changes
None
vimp 1.1.1
Major changes
- added function two_validation_set_cv, which sets up
folds for V-fold cross-validation with two validation sets per fold
- changed the functionality of cv_vim: now, the
cross-validated naive estimator is computed on a first validation set,
while the update for the corrected estimator is computed using the
second validation set (both created from two_validation_set_cv); this allows for relaxation of the
Donsker class conditions necessary for asymptotic convergence of the
corrected estimator, while making sure that the initial CV naive
estimator is not biased high (due to a higher R^2 on the training
data)
Minor changes
None
vimp 1.1.0
Major changes
None
Minor changes
- changed the functionality of cv_vim: now, the
cross-validated naive estimator is computed on the training data for
each fold, while the update for the corrected cross-validated estimator
is computed using the test data; this allows for relaxation of the
Donsker class conditions necessary for asymptotic convergence of the
corrected estimator
vimp 1.0.0
Major changes
- removed function vim, replaced with
individual-parameter functions
- added function vimp_regression to match Python
package
- cv_vim now can compute regression estimators
- renamed all internal functions; these are now vimp_ci, vimp_se, vimp_update, onestep_based_estimator
- edited vignette
- added unit tests
vimp 0.0.3
Major changes
None
Minor changes
Bugfixes etc.