msSPChelpR 0.9.1
New Features
- new function histgroup_iarc() to create a variable for
groups of malignant neoplasms considered to be histologically
‘different’ for the purpose of defining multiple tumors, ICD-O-3 (see
#100)
- some functions gain a new quiet argument to suppress rlang::warn() and rlang::inform() messages.
You can use this when you have checked your results for correctness and
want to reduce message output, but keep the progress bars.
- asir(): add World Standard Population 2000-2025, selected
with option std_pop = "WHO2000", as described here:
https://seer.cancer.gov/stdpopulations/world.who.html
- sir_byfutime() gains new argument expect_missing_refstrata_df. You can define another
dataframe that contains strata expected to be missing from refrates_df
(because they are not explicitly coded with incidence = 0). This can be
helpful if refrates_df has many strata and strata with 0 incidence have
been removed to save storage space. Internally, the rows of
expect_missing_refstrata_df are appended to refrates_df. This
reduces the number of lines reported in attribute problems_missing_ref_strata. Default setting is expect_missing_refstrata_df = NULL.
- sample data set data("us_second_cancer") gains new
variable t_hist on histology, i.e. ICD-O-3 code for tumor
morphology (4 digits)
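The effect of expect_missing_refstrata_df described above can be sketched in base R. This is only an illustration of the "append rows to refrates_df" idea, not the package's internal code; all column names and values below are assumptions made up for the example.

```r
# Reference rates where zero-incidence strata were dropped to save space.
# Column names and numbers are illustrative assumptions only.
refrates_df <- data.frame(
  t_site = c("C50", "C50"),
  sex = c("Female", "Female"),
  age = c("50 - 54", "55 - 59"),
  incidence_cases = c(120, 150),
  population_pyar = c(100000, 90000)
)
refrates_df$incidence_crude_rate <-
  refrates_df$incidence_cases / refrates_df$population_pyar * 100000

# Strata that are known to be absent because their incidence is 0
expect_missing_refstrata_df <- data.frame(
  t_site = "C50",
  sex = "Male",
  age = "50 - 54",
  incidence_cases = 0,
  population_pyar = 95000,
  incidence_crude_rate = 0
)

# Appending them means a later lookup finds a stratum with rate 0
# instead of reporting it as missing.
refrates_full <- rbind(refrates_df, expect_missing_refstrata_df)
nrow(refrates_full)
```

In an actual call you would pass the second dataframe via the expect_missing_refstrata_df argument rather than binding the rows yourself.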
Breaking Changes
- no breaking changes in this version
Bug fixes
- make calc_refrates() more robust for missing race_var (Closes #89)
- fix bug in calc_refrates() using calc_totals == TRUE (Closes #90)
- fix bug in calc_refrates() using numeric versions of fill_sites (Closes #92)
- fix bug in asir() that throws an error for a variable that is not
needed (Closes #95)
Internal
- replace progress bars by cli
- deprecate verb.() syntax from tidytable (Closes
#94)
msSPChelpR 0.9.0
New Features
- new function calc_refrates() to calculate age-, sex-,
region-, year-specific reference rates from a long-format dataframe of
cancer cases: incident cases are counted and then matched with a
reference population. The resulting reference rates dataframe can
be used directly with the sir_byfutime() function.
- functions gain new default dattype = NULL and thus are
more flexible to take other source data types (Closes #73)
Breaking Changes
- functions asir, calc_futime*, calc_refrates, ir_crosstab_byfutime, pat_status*, renumber_time_id*, and sir_byfutime now by default are set to dattype = NULL. If you relied on the automatic variable naming
feature, you need to add dattype = "seer" or dattype = "zfkd" to your function call.
- fix typo in attribute names: attributes are now correctly named
problems_missing_count_strata and problems_missing_fu_strata (Closes #80)
Bug fixes
- sir_byfutime():
  - attributes with notes and problems are now correctly saved to
    results_df
Internal
- deprecated functions from tidytable package have been
replaced (Closes #71 and #74)
msSPChelpR 0.8.7
New Features
- new function sir_ratio() and related sir_ratio_lci() and sir_ratio_uci() to
calculate the ratio of two SIRs/SMRs, i.e. a relative risk, and confidence
limits for this ratio.
- tidytable variant of reshape_long function,
i.e. reshape_long_tt() ⇒ the _tt variants usually have
smaller memory use than tidyverse and data.table variants. Execution
time is usually much faster than tidyverse and comparable to or a little
slower than the data.table variant.
- summarize_sir_results():
  - add ability to summarize by a different site_var than the one used in
    sir_byfutime()
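The ratio of two SIRs mentioned for sir_ratio() can be illustrated with a small base R sketch. The point estimate follows directly from the definition SIR = observed / expected. The confidence limits below use a simple log-scale normal approximation, which is an assumption chosen for illustration and not necessarily the method implemented in sir_ratio_lci() and sir_ratio_uci(). The function names with the _sketch suffix are hypothetical.

```r
# Ratio of two SIRs: (o1 / e1) / (o2 / e2)
# o1, o2 = observed counts; e1, e2 = expected counts
sir_ratio_sketch <- function(o1, o2, e1, e2) {
  (o1 / e1) / (o2 / e2)
}

# Approximate confidence limits on the log scale.
# Assumption for illustration: SE(log ratio) ~ sqrt(1/o1 + 1/o2).
sir_ratio_ci_sketch <- function(o1, o2, e1, e2, alpha = 0.05) {
  ratio <- sir_ratio_sketch(o1, o2, e1, e2)
  se_log <- sqrt(1 / o1 + 1 / o2)
  z <- stats::qnorm(1 - alpha / 2)
  c(lci = ratio * exp(-z * se_log), uci = ratio * exp(z * se_log))
}

# Group 1: 40 observed vs 20 expected (SIR = 2)
# Group 2: 20 observed vs 20 expected (SIR = 1)
ratio <- sir_ratio_sketch(o1 = 40, o2 = 20, e1 = 20, e2 = 20)
ci <- sir_ratio_ci_sketch(o1 = 40, o2 = 20, e1 = 20, e2 = 20)
ratio
ci
```

For real analyses, prefer the package functions, since their interval method may differ from this approximation.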
 
Bug fixes
- summarize_sir_results():
  - PYARs are now correctly calculated when using
    summarize_site == TRUE. Previously the results incorrectly
    counted each site multiple times. (Closes #62)
- pat_status():
  - update default values for dattype = "zfkd"
Internal
- add R-CMD-Check to github actions
msSPChelpR 0.8.6
New Features
- new sample data set for standard populations ⇒
data("standard_population")
- new sample data set for us population ⇒
data("population_us") (Closes #58)
Bug fixes
- sir_byfutime(): change output of integer columns to
numeric to fix bug in summarize_sir_results() (Closes
#59)
Other changes
- add examples to function documentation (Closes #56)
- remove “R” from package title (Closes #57)
- update package description (Closes #54)
- update introduction vignette
vignette("introduction")
msSPChelpR 0.8.5 - 2020-09-28
New Features
- tidytable variants of functions,
i.e. reshape_wide_tt(), renumber_time_id_tt(), pat_status_tt(), vital_status_tt(), calc_futime_tt() ⇒ the _tt variants usually have smaller
memory use than tidyverse and data.table variants. Execution time is
usually much faster than tidyverse and comparable to or a little slower
than the data.table variant.
- sir_byfutime():
  - is much faster using tidytable package
  - gained the option race_var to optionally stratify SIR
    calculations by race.
- summarize_sir_results():
  - new function that increases functionality in summarizing results
    from sir_byfutime() function
  - new option to define custom site_var_name
- new package website https://marianschmidt.github.io/msSPChelpR
- new sample datasets included in the package to demonstrate examples
(#36)
Breaking Changes
- sir_byfutime():
  - options add_total_row and add_total_fu are
    replaced by calc_total_row and calc_total_fu.
    These are logical parameters now. The positioning of total rows and
    columns is now handled entirely by the summarize_sir_results() function. There, total rows can
    be set to top and bottom and total columns to left and right.
  - option expcount_src including related parameters stdpop_df, refpop_df, std_pop, truncate_std_pop and pyar_var have been
    removed. Function sir_byfutime() now only calculates
    expected counts based on reference rates, not within the cohort of the
    dataset. To calculate expected counts based on the cohort, a new function create_refrates will be added in the future. (#41)
  - option collapse_ci has been removed and added to summarize_sir_results() instead.
  - option name for tumor site variable changed from
    icdcat_var to site_var
  - option name for age/age group variable changed from
    agegroup_var to age_var
  - in total the parameters expcount_src, futime_src, stdpop_df, refpop_df, std_pop, truncate_std_pop, pyar_var, icdcat_var, collapse_ci have been removed to simplify the function ⇒ make sure you remove these
    arguments from your sir_byfutime() function calls.
- sir():
  - is superseded by the use of sir_byfutime(). To migrate
    your former sir() calls, you can simply use sir_byfutime(, futime_breaks = "none"), which will yield the
    same results.
- summarize_sir_results():
  - option name for tumor site variable changed from
    summarize_icdcat to summarize_site
- reshape_long_tidyr():
  - option var_selection is deprecated. Please select
    variables before running the reshape_long_* functions.
- asir():
  - option name for age/age group variable changed from
    agegroup_var to age_var
  - option name for tumor site variable changed from
    icdcat_var to site_var
- pat_status(), pat_status_tt(), vital_status(), and vital_status_tt():
  - default variable labels are now capitalized.
  - This might break code that relied on using the labels coming out of
    these functions in later filter or mutate functions.
- ir_crosstab_byfutime():
  - option futime_breaks now uses breaks in years instead
    of months as previously.
  - default futime_var is now follow-up time in years
- now requires dplyr version 1.0.0
- now requires tidytable package
- the default option name for tumor site variable changed from
icdcat_var to site_var. This requires a manual
update of function calls of sir_byfutime() and asir(), if the option is specified.
- the default variable name for tumor site in all functions has been
changed from t_icdcat to t_site. So the
reference data frames used will need to have a t_site column.
- the data.table variants of functions
(renumber_time_id_dt(), pat_status_dt(), reshape_long_dt(), reshape_wide_dt(), vital_status_dt()) have been removed for simplicity, please
use tidytable variants, i.e. reshape_wide_tt(), renumber_time_id_tt(), pat_status_tt(), vital_status_tt(), calc_futime_tt(), instead.
They will give the same data.table output and same performance.
Bug Fixes
- implement new reliable routine to split df when
reshape_wide() with option chunks is used.
Closes #1.
- Sorting of columns in wide datasets by
reshape_wide_tidyr() and reshape_wide_tt() is
now preserved. Closes #31.
- ensure sorting in renumber_time_id() and make sure that new_time_id_var is returned as integer.
- fix bug in pat_status_*(., check = TRUE) option
- improve internal tests in sir_byfutime() so that PYARs
do not get lost before running summary function
- sir_byfutime() now also gives correct results if the range
of futime_breaks is not 0-Inf but smaller
msSPChelpR 0.8.4 - 2020-05-21
New Features
- add timevar_max option to renumber_time_id() function;
use sorting by date of diagnosis instead of old time_id_var
- various improvements to reshape_wide_tidyr() function
- various improvements to reshape_wide_dt() function,
which is much faster now and uses data.table::dcast instead
of stats::reshape
- various improvements to pat_status() and pat_status_dt() functions
- option summarize_icdcat in summarize_sir_results() is
now functional
- update vignette vignette("introduction")
Bug Fixes
- fix incomplete check for required variables in
pat_status() and pat_status_dt() functions
- fix error in check for required variables in
renumber_time_id() that broke functions
- fix bug in check for end of FU time in pat_status() and calc_futime()
- implement new tidyselect routine using
tidyselect::all_of in summarize_sir_results()
msSPChelpR 0.8.3
New Features
- new faster version of reshape_long based on data.table
- start new vignette on workflow from filtered long dataset to
follow-up times vignette("patstatus_futime")
Bug Fixes
- implement new tidyselect routine using
tidyselect::all_of for vector-based variable selection
- implement correct referencing in vital_status_dt and pat_status_dt
- add exports from data.table
- update documentation for sir and sir_byfutime functions
- make reshape_long function work
msSPChelpR 0.8.2
msSPChelpR 0.8.1
New Features
- new faster version of vital_status function using data.table
- new faster version of pat_status function using data.table
msSPChelpR 0.8.0
New Features
- new faster version of reshape_wide_dt function based on data.table
and without problematic slices done by reshape_wide
- new faster version of renumber_time_id function based on
data.table
msSPChelpR 0.7.4
New Features
- new function renumber_time_id
msSPChelpR 0.7.3
Bug Fixes
- add check to revert status_var to numeric in case it was created
with option as_labelled_factor
- fix label bug in life_var_new
msSPChelpR 0.7.2
- add option as_labelled_factor to vital_status function
- fix newly introduced error in vital_status function
msSPChelpR 0.7.1
- fix error in vital_status function by replacing
sjlabelled::get_label function
msSPChelpR 0.7.0
- fix error in pat_status and vital_status functions due to change in
sjlabelled package
msSPChelpR 0.6.10
- rebuild description file and manual
msSPChelpR 0.6.9
- remove nest_legacy functions and use new tidyr syntax, close
#19
msSPChelpR 0.6.8
- make summarize_sir_results function work without break
variables
msSPChelpR 0.6.7
- for function sir_byfutime ⇒ make option add_total_row work, even if option ybreak_vars = "none"
msSPChelpR 0.6.6
- make use of time_id_var and case_id_var coherent across reshape
functions
msSPChelpR 0.6.5
msSPChelpR 0.6.4
- Added a NEWS.md file to track changes to the
package.
msSPChelpR 0.6.3
- add option futime_breaks = "none" to sir_byfutime function
major changes in msSPChelpR 0.6.0
- includes a new function to calculate crude (absolute) incidence
rates and tabulate them by any number of grouping variables; the
result can be used as a Table 1 for publications ⇒ the function is called
msSPChelpR::ir_crosstab
- includes a new function to calculate SIRs (standardized incidence
ratios) by any strata you choose (unlimited ybreak_vars; one
xbreak_var) with customizable breaks for follow-up times
(default is: to 6 months, .5-1 year, 1-5 years, 5-10 years, >10
years) ⇒ attention: it only makes sense to stratify results (ybreak_vars
or xbreak_var) by variables measured at baseline, not by variables
that depend on the occurrence of an SPC ⇒ function
msSPChelpR::sir_byfutime ⇒ depending on the number of stratification
variables you use, this function may produce a very long results
data.frame, so please use it together with the new function
msSPChelpR::summarize_sir_results
- includes a new function to summarize results dataframes from SIR
calculations
- New reshape functions that are faster and are using less memory
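The default follow-up time breaks listed for msSPChelpR::sir_byfutime in these notes (to 6 months, .5-1 year, 1-5 years, 5-10 years, >10 years) can be pictured with base R's cut(). This is only a sketch of the binning idea. It assumes follow-up time is measured in years and that intervals are closed on the left, neither of which is stated in the notes, and the package may bin differently.

```r
# Example follow-up times in years (made-up values)
futime_years <- c(0.2, 0.75, 3, 7.5, 12)

# Break points corresponding to the default categories named in the notes
futime_breaks <- c(0, 0.5, 1, 5, 10, Inf)

# Assumption: left-closed intervals, e.g. [0.5, 1)
futime_cat <- cut(
  futime_years,
  breaks = futime_breaks,
  labels = c("to 6 months", ".5-1 year", "1-5 years",
             "5-10 years", ">10 years"),
  right = FALSE
)

table(futime_cat)
```

Each stratum of ybreak_vars and xbreak_var is then evaluated once per follow-up category, which is why the results data.frame grows quickly with the number of stratification variables.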