In this example we’re going to summarise the characteristics of individuals with an ankle sprain, ankle fracture, forearm fracture, or a hip fracture using the Eunomia synthetic database.
We’ll begin by creating our condition study cohorts with the
generateConceptCohortSet function from
CDMConnector.
library(omock)
library(CDMConnector)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
library(CodelistGenerator)
library(PatientProfiles)
library(CohortCharacteristics)
cdm <- mockCdmFromDataset(datasetName = "GiBleed", source = "duckdb")
cdm <- generateConceptCohortSet(
cdm = cdm,
name = "injuries",
conceptSet = list(
"ankle_sprain" = 81151,
"ankle_fracture" = 4059173,
"forearm_fracture" = 4278672,
"hip_fracture" = 4230399
),
end = "event_end_date",
limit = "all"
)
settings(cdm$injuries)
#> # A tibble: 4 × 6
#> cohort_definition_id cohort_name limit prior_observation future_observation
#> <int> <chr> <chr> <dbl> <dbl>
#> 1 1 ankle_sprain all 0 0
#> 2 2 ankle_fracture all 0 0
#> 3 3 forearm_fract… all 0 0
#> 4 4 hip_fracture all 0 0
#> # ℹ 1 more variable: end <chr>
cohortCount(cdm$injuries)
#> # A tibble: 4 × 3
#> cohort_definition_id number_records number_subjects
#> <int> <int> <int>
#> 1 1 1915 1357
#> 2 2 464 427
#> 3 3 569 510
#> 4 4 138 132Now we’ve created our cohorts, we can obtain a summary of the characteristics in the patients included in these cohorts. We’ll create two different age group in below example: under 50 and 50+.
chars <- cdm$injuries |>
summariseCharacteristics(ageGroup = list(c(0, 49), c(50, Inf)))
chars |>
glimpse()
#> Rows: 220
#> Columns: 13
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ cdm_name <chr> "GiBleed", "GiBleed", "GiBleed", "GiBleed", "GiBleed"…
#> $ group_name <chr> "cohort_name", "cohort_name", "cohort_name", "cohort_…
#> $ group_level <chr> "ankle_sprain", "ankle_sprain", "ankle_sprain", "ankl…
#> $ strata_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ strata_level <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ variable_name <chr> "Number records", "Number subjects", "Cohort start da…
#> $ variable_level <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ estimate_name <chr> "count", "count", "min", "q25", "median", "q75", "max…
#> $ estimate_type <chr> "integer", "integer", "date", "date", "date", "date",…
#> $ estimate_value <chr> "1915", "1357", "1912-02-25", "1968-06-15", "1982-11-…
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…Now we have generated the results, we can create a nice table in gt
format to display the results using tableCharacteristics
function. By default it returns summary statistics for, number records,
number subjects, age, cohort_start_date, cohort_end_date,
prior_observations, future_observations and days in cohort. Days in
cohort is defined as the number of days between the cohort start and end
dates, inclusive of both boundaries (i.e., cohort_end_date –
cohort_start_date + 1). This represents the total duration an individual
in the cohort. In contrast, future observation time is defined as the
time from the cohort start date to the end of the individual’s
observation period (observation_period_end_date – cohort_start_date).
Prior observation works similar to future observation but look at
observations before cohort_start_date, hence defined as
(cohort_start_date - observation_period_start_date). The difference in
definitions arises because days in cohort captures a duration within the
cohort, while future observation time measures a remaining time window
for potential follow-up beyond cohort entry.
|
CDM name
|
||||||
|---|---|---|---|---|---|---|
|
GiBleed
|
||||||
| Variable name | Variable level | Estimate name |
Cohort name
|
|||
| ankle_sprain | ankle_fracture | forearm_fracture | hip_fracture | |||
| Number records | – | N | 1,915 | 464 | 569 | 138 |
| Number subjects | – | N | 1,357 | 427 | 510 | 132 |
| Cohort start date | – | Median [Q25 - Q75] | 1982-11-09 [1968-06-15 - 1999-04-13] | 1981-01-15 [1965-03-11 - 1997-08-03] | 1981-07-24 [1967-03-05 - 2000-12-16] | 1996-09-17 [1977-09-20 - 2010-06-22] |
| Range | 1912-02-25 to 2019-05-30 | 1911-09-07 to 2019-06-23 | 1917-08-16 to 2019-06-26 | 1927-12-14 to 2019-05-08 | ||
| Cohort end date | – | Median [Q25 - Q75] | 1982-12-10 [1968-07-06 - 1999-05-09] | 1981-02-28 [1965-04-11 - 1997-10-12] | 1981-08-23 [1967-04-10 - 2001-02-27] | 1996-11-16 [1977-12-04 - 2010-07-22] |
| Range | 1912-03-10 to 2019-05-30 | 1911-12-06 to 2019-06-24 | 1917-11-14 to 2019-06-26 | 1928-03-13 to 2019-06-07 | ||
| Age | – | Median [Q25 - Q75] | 21 [9 - 41] | 16 [9 - 43] | 17 [9 - 46] | 40 [13 - 66] |
| Mean (SD) | 26.63 (21.03) | 27.38 (24.70) | 28.69 (25.97) | 40.06 (28.82) | ||
| Range | 0 to 105 | 0 to 107 | 0 to 106 | 1 to 108 | ||
| Age group | 0 to 49 | N (%) | 1,587 (82.87%) | 367 (79.09%) | 440 (77.33%) | 87 (63.04%) |
| 50 or above | N (%) | 328 (17.13%) | 97 (20.91%) | 129 (22.67%) | 51 (36.96%) | |
| Sex | Female | N (%) | 954 (49.82%) | 238 (51.29%) | 286 (50.26%) | 74 (53.62%) |
| Male | N (%) | 961 (50.18%) | 226 (48.71%) | 283 (49.74%) | 64 (46.38%) | |
| Prior observation | – | Median [Q25 - Q75] | 7,833 [3,628 - 15,147] | 6,030 [3,360 - 16,032] | 6,289 [3,390 - 16,847] | 14,522 [4,801 - 24,401] |
| Mean (SD) | 9,918.17 (7,672.74) | 10,196.57 (9,011.31) | 10,670.43 (9,480.30) | 14,821.73 (10,521.89) | ||
| Range | 299 to 38,429 | 299 to 39,430 | 299 to 38,943 | 390 to 39,792 | ||
| Future observation | – | Median [Q25 - Q75] | 12,868 [6,860 - 18,078] | 13,748 [6,878 - 19,331] | 13,165 [5,988 - 18,548] | 7,798 [2,874 - 14,913] |
| Mean (SD) | 12,865.11 (7,543.50) | 13,470.92 (8,215.96) | 12,913.27 (7,929.17) | 9,167.33 (7,160.81) | ||
| Range | 0 to 38,403 | 1 to 39,051 | 0 to 36,654 | 0 to 29,045 | ||
| Days in cohort | – | Median [Q25 - Q75] | 22 [15 - 29] | 61 [31 - 91] | 61 [31 - 91] | 61 [31 - 91] |
| Mean (SD) | 25.02 (8.00) | 61.65 (25.38) | 62.16 (25.32) | 59.26 (24.79) | ||
| Range | 1 to 37 | 2 to 92 | 1 to 91 | 1 to 91 | ||
| Days to next record | – | Median [Q25 - Q75] | 6,082 [2,640 - 10,540] | 5,100 [2,430 - 11,474] | 5,760 [1,922 - 15,606] | 7,430 [2,355 - 15,966] |
| Mean (SD) | 7,457.56 (5,937.15) | 7,784.46 (7,686.21) | 9,217.83 (8,826.17) | 9,037.50 (8,138.51) | ||
| Range | 300 to 29,145 | 330 to 28,365 | 360 to 34,482 | 690 to 19,200 | ||
We can also use the plotCharacteristics function to
display the results in a plot. The plotCharacteristics
function can only take in one variable. So you will need to filter the
results to the variable you want to create a plot for beforehand.
chars |>
filter(variable_name == "Age") |>
plotCharacteristics(
plotType = "boxplot",
colour = "cohort_name",
facet = c("cdm_name")
)We can also generate summaries that are stratified by some variable of interest. In this example we added an age group variable to our cohort table and then created the stratification for age group in our results.
chars <- cdm$injuries |>
addAge(ageGroup = list(
c(0, 49),
c(50, Inf)
)) |>
summariseCharacteristics(strata = list("age_group"))Again we used the tableCharacteristics function to
display the results in gt table format.
|
CDM name
|
||||||
|---|---|---|---|---|---|---|
|
GiBleed
|
||||||
| Variable name | Variable level | Estimate name |
Cohort name
|
|||
| ankle_sprain | ankle_fracture | forearm_fracture | hip_fracture | |||
| overall | ||||||
| Number records | – | N | 1,915 | 464 | 569 | 138 |
| Number subjects | – | N | 1,357 | 427 | 510 | 132 |
| Cohort start date | – | Median [Q25 - Q75] | 1982-11-09 [1968-06-15 - 1999-04-13] | 1981-01-15 [1965-03-11 - 1997-08-03] | 1981-07-24 [1967-03-05 - 2000-12-16] | 1996-09-17 [1977-09-20 - 2010-06-22] |
| Range | 1912-02-25 to 2019-05-30 | 1911-09-07 to 2019-06-23 | 1917-08-16 to 2019-06-26 | 1927-12-14 to 2019-05-08 | ||
| Cohort end date | – | Median [Q25 - Q75] | 1982-12-10 [1968-07-06 - 1999-05-09] | 1981-02-28 [1965-04-11 - 1997-10-12] | 1981-08-23 [1967-04-10 - 2001-02-27] | 1996-11-16 [1977-12-04 - 2010-07-22] |
| Range | 1912-03-10 to 2019-05-30 | 1911-12-06 to 2019-06-24 | 1917-11-14 to 2019-06-26 | 1928-03-13 to 2019-06-07 | ||
| Age | – | Median [Q25 - Q75] | 21 [9 - 41] | 16 [9 - 43] | 17 [9 - 46] | 40 [13 - 66] |
| Mean (SD) | 26.63 (21.03) | 27.38 (24.70) | 28.69 (25.97) | 40.06 (28.82) | ||
| Range | 0 to 105 | 0 to 107 | 0 to 106 | 1 to 108 | ||
| Sex | Female | N (%) | 954 (49.82%) | 238 (51.29%) | 286 (50.26%) | 74 (53.62%) |
| Male | N (%) | 961 (50.18%) | 226 (48.71%) | 283 (49.74%) | 64 (46.38%) | |
| Prior observation | – | Median [Q25 - Q75] | 7,833 [3,628 - 15,147] | 6,030 [3,360 - 16,032] | 6,289 [3,390 - 16,847] | 14,522 [4,801 - 24,401] |
| Mean (SD) | 9,918.17 (7,672.74) | 10,196.57 (9,011.31) | 10,670.43 (9,480.30) | 14,821.73 (10,521.89) | ||
| Range | 299 to 38,429 | 299 to 39,430 | 299 to 38,943 | 390 to 39,792 | ||
| Future observation | – | Median [Q25 - Q75] | 12,868 [6,860 - 18,078] | 13,748 [6,878 - 19,331] | 13,165 [5,988 - 18,548] | 7,798 [2,874 - 14,913] |
| Mean (SD) | 12,865.11 (7,543.50) | 13,470.92 (8,215.96) | 12,913.27 (7,929.17) | 9,167.33 (7,160.81) | ||
| Range | 0 to 38,403 | 1 to 39,051 | 0 to 36,654 | 0 to 29,045 | ||
| Days in cohort | – | Median [Q25 - Q75] | 22 [15 - 29] | 61 [31 - 91] | 61 [31 - 91] | 61 [31 - 91] |
| Mean (SD) | 25.02 (8.00) | 61.65 (25.38) | 62.16 (25.32) | 59.26 (24.79) | ||
| Range | 1 to 37 | 2 to 92 | 1 to 91 | 1 to 91 | ||
| Days to next record | – | Median [Q25 - Q75] | 6,082 [2,640 - 10,540] | 5,100 [2,430 - 11,474] | 5,760 [1,922 - 15,606] | 7,430 [2,355 - 15,966] |
| Mean (SD) | 7,457.56 (5,937.15) | 7,784.46 (7,686.21) | 9,217.83 (8,826.17) | 9,037.50 (8,138.51) | ||
| Range | 300 to 29,145 | 330 to 28,365 | 360 to 34,482 | 690 to 19,200 | ||
| 0 to 49 | ||||||
| Number records | – | N | 1,587 | 367 | 440 | 87 |
| Number subjects | – | N | 1,211 | 341 | 411 | 86 |
| Cohort start date | – | Median [Q25 - Q75] | 1978-07-08 [1965-08-07 - 1992-05-07] | 1974-08-26 [1960-08-21 - 1988-07-30] | 1974-12-23 [1964-05-04 - 1988-03-09] | 1983-05-29 [1973-07-30 - 1997-03-20] |
| Range | 1912-02-25 to 2019-05-06 | 1911-09-07 to 2018-10-12 | 1917-08-16 to 2019-06-26 | 1927-12-14 to 2019-01-09 | ||
| Cohort end date | – | Median [Q25 - Q75] | 1978-08-05 [1965-09-01 - 1992-05-28] | 1974-10-25 [1960-10-20 - 1988-10-09] | 1975-02-06 [1964-06-11 - 1988-05-07] | 1983-08-27 [1973-08-29 - 1997-05-19] |
| Range | 1912-03-10 to 2019-05-06 | 1911-12-06 to 2018-11-11 | 1917-11-14 to 2019-06-26 | 1928-03-13 to 2019-04-09 | ||
| Age | – | Median [Q25 - Q75] | 16 [7 - 31] | 13 [7 - 25] | 13 [7 - 23] | 15 [9 - 34] |
| Mean (SD) | 19.32 (13.95) | 16.49 (12.90) | 16.48 (12.87) | 21.15 (15.27) | ||
| Range | 0 to 49 | 0 to 49 | 0 to 49 | 1 to 49 | ||
| Sex | Female | N (%) | 791 (49.84%) | 190 (51.77%) | 213 (48.41%) | 41 (47.13%) |
| Male | N (%) | 796 (50.16%) | 177 (48.23%) | 227 (51.59%) | 46 (52.87%) | |
| Prior observation | – | Median [Q25 - Q75] | 5,970 [2,910 - 11,512] | 4,941 [2,640 - 9,266] | 4,814 [2,662 - 8,680] | 5,838 [3,510 - 12,728] |
| Mean (SD) | 7,249.25 (5,084.37) | 6,221.68 (4,697.60) | 6,212.80 (4,686.12) | 7,920.29 (5,584.42) | ||
| Range | 299 to 18,243 | 299 to 18,105 | 299 to 18,158 | 390 to 18,086 | ||
| Future observation | – | Median [Q25 - Q75] | 14,582 [9,510 - 19,018] | 15,936 [10,900 - 20,859] | 15,833 [11,020 - 19,580] | 12,667 [7,957 - 16,282] |
| Mean (SD) | 14,564.63 (6,955.73) | 15,980.16 (7,193.49) | 15,495.41 (6,973.47) | 12,656.62 (6,557.62) | ||
| Range | 0 to 38,403 | 30 to 39,051 | 0 to 36,654 | 162 to 29,045 | ||
| Days in cohort | – | Median [Q25 - Q75] | 22 [15 - 29] | 61 [31 - 91] | 61 [31 - 91] | 61 [31 - 91] |
| Mean (SD) | 25.06 (7.88) | 61.01 (25.37) | 63.18 (25.35) | 63.41 (23.87) | ||
| Range | 1 to 37 | 31 to 91 | 1 to 91 | 31 to 91 | ||
| Days to next record | – | Median [Q25 - Q75] | 6,225 [2,758 - 10,942] | 5,100 [2,430 - 12,824] | 8,702 [3,098 - 16,603] | 17,465 [9,078 - 18,332] |
| Mean (SD) | 7,618.24 (6,033.81) | 8,245.36 (7,985.71) | 10,818.35 (9,256.74) | 12,451.67 (10,222.78) | ||
| Range | 300 to 29,145 | 630 to 28,365 | 660 to 34,482 | 690 to 19,200 | ||
| 50 or above | ||||||
| Number records | – | N | 328 | 97 | 129 | 51 |
| Number subjects | – | N | 292 | 93 | 116 | 48 |
| Cohort start date | – | Median [Q25 - Q75] | 2008-10-08 [1997-01-11 - 2014-03-06] | 2009-07-25 [1999-01-22 - 2015-04-07] | 2008-12-20 [2000-10-17 - 2014-09-23] | 2010-09-19 [2005-05-10 - 2016-01-10] |
| Range | 1961-02-11 to 2019-05-30 | 1970-06-04 to 2019-06-23 | 1961-07-16 to 2019-06-12 | 1982-01-17 to 2019-05-08 | ||
| Cohort end date | – | Median [Q25 - Q75] | 2008-10-30 [1997-02-13 - 2014-03-25] | 2009-09-23 [1999-04-22 - 2015-06-03] | 2009-01-19 [2000-12-09 - 2014-12-22] | 2010-10-19 [2005-06-24 - 2016-03-26] |
| Range | 1961-02-25 to 2019-05-30 | 1970-07-04 to 2019-06-24 | 1961-08-15 to 2019-06-13 | 1982-04-17 to 2019-06-07 | ||
| Age | – | Median [Q25 - Q75] | 59 [53 - 67] | 68 [60 - 75] | 69 [61 - 78] | 71 [62 - 82] |
| Mean (SD) | 62.00 (11.40) | 68.59 (11.77) | 70.33 (12.90) | 72.31 (13.84) | ||
| Range | 50 to 105 | 50 to 107 | 50 to 106 | 51 to 108 | ||
| Sex | Female | N (%) | 163 (49.70%) | 48 (49.48%) | 73 (56.59%) | 33 (64.71%) |
| Male | N (%) | 165 (50.30%) | 49 (50.52%) | 56 (43.41%) | 18 (35.29%) | |
| Prior observation | – | Median [Q25 - Q75] | 21,747 [19,421 - 24,795] | 25,114 [22,188 - 27,715] | 25,445 [22,496 - 28,815] | 25,964 [22,994 - 30,277] |
| Mean (SD) | 22,831.56 (4,167.50) | 25,235.61 (4,310.11) | 25,874.71 (4,714.82) | 26,594.78 (5,045.12) | ||
| Range | 18,264 to 38,429 | 18,354 to 39,430 | 18,379 to 38,943 | 18,899 to 39,792 | ||
| Future observation | – | Median [Q25 - Q75] | 3,494 [1,722 - 6,684] | 2,909 [1,173 - 5,608] | 3,335 [1,316 - 5,988] | 2,808 [914 - 4,672] |
| Mean (SD) | 4,642.15 (4,070.72) | 3,977.22 (3,624.08) | 4,105.97 (3,334.07) | 3,215.02 (3,035.15) | ||
| Range | 0 to 19,780 | 1 to 17,814 | 1 to 16,492 | 0 to 13,595 | ||
| Days in cohort | – | Median [Q25 - Q75] | 22 [15 - 29] | 61 [31 - 91] | 61 [31 - 91] | 61 [31 - 61] |
| Mean (SD) | 24.82 (8.58) | 64.10 (25.37) | 58.69 (25.01) | 52.18 (24.95) | ||
| Range | 1 to 36 | 2 to 92 | 2 to 91 | 1 to 91 | ||
| Days to next record | – | Median [Q25 - Q75] | 4,130 [2,205 - 8,048] | 4,545 [2,985 - 5,542] | 2,624 [990 - 5,451] | 3,390 [2,700 - 7,430] |
| Mean (SD) | 5,127.69 (3,614.35) | 3,982.00 (2,662.27) | 3,554.46 (3,260.29) | 5,623.33 (5,110.16) | ||
| Range | 630 to 13,260 | 330 to 6,508 | 360 to 9,660 | 2,010 to 11,470 | ||
Then plotted age stratified prior observation time.
chars |>
filter(variable_name == "Prior observation") |>
plotCharacteristics(
plotType = "boxplot",
colour = "cohort_name",
facet = c("age_group")
) +
coord_flip()We explored whether patients had any exposure to a list of selected medications (acetaminophen, morphine, warfarin)
medsCs <- getDrugIngredientCodes(
cdm = cdm,
name = c("acetaminophen", "morphine", "warfarin")
)
cdm <- generateConceptCohortSet(
cdm = cdm,
name = "meds",
conceptSet = medsCs,
end = "event_end_date",
limit = "all",
overwrite = TRUE
)We can use the intersects arguement inside the function
to get this information.
chars <- cdm$injuries |>
summariseCharacteristics(cohortIntersectFlag = list(
"Medications prior to index date" = list(
targetCohortTable = "meds",
window = c(-Inf, -1)
),
"Medications on index date" = list(
targetCohortTable = "meds",
window = c(0, 0)
)
))To view the summary table
|
CDM name
|
||||||
|---|---|---|---|---|---|---|
|
GiBleed
|
||||||
| Variable name | Variable level | Estimate name |
Cohort name
|
|||
| ankle_sprain | ankle_fracture | forearm_fracture | hip_fracture | |||
| Number records | – | N | 1,915 | 464 | 569 | 138 |
| Number subjects | – | N | 1,357 | 427 | 510 | 132 |
| Cohort start date | – | Median [Q25 - Q75] | 1982-11-09 [1968-06-15 - 1999-04-13] | 1981-01-15 [1965-03-11 - 1997-08-03] | 1981-07-24 [1967-03-05 - 2000-12-16] | 1996-09-17 [1977-09-20 - 2010-06-22] |
| Range | 1912-02-25 to 2019-05-30 | 1911-09-07 to 2019-06-23 | 1917-08-16 to 2019-06-26 | 1927-12-14 to 2019-05-08 | ||
| Cohort end date | – | Median [Q25 - Q75] | 1982-12-10 [1968-07-06 - 1999-05-09] | 1981-02-28 [1965-04-11 - 1997-10-12] | 1981-08-23 [1967-04-10 - 2001-02-27] | 1996-11-16 [1977-12-04 - 2010-07-22] |
| Range | 1912-03-10 to 2019-05-30 | 1911-12-06 to 2019-06-24 | 1917-11-14 to 2019-06-26 | 1928-03-13 to 2019-06-07 | ||
| Age | – | Median [Q25 - Q75] | 21 [9 - 41] | 16 [9 - 43] | 17 [9 - 46] | 40 [13 - 66] |
| Mean (SD) | 26.63 (21.03) | 27.38 (24.70) | 28.69 (25.97) | 40.06 (28.82) | ||
| Range | 0 to 105 | 0 to 107 | 0 to 106 | 1 to 108 | ||
| Sex | Female | N (%) | 954 (49.82%) | 238 (51.29%) | 286 (50.26%) | 74 (53.62%) |
| Male | N (%) | 961 (50.18%) | 226 (48.71%) | 283 (49.74%) | 64 (46.38%) | |
| Prior observation | – | Median [Q25 - Q75] | 7,833 [3,628 - 15,147] | 6,030 [3,360 - 16,032] | 6,289 [3,390 - 16,847] | 14,522 [4,801 - 24,401] |
| Mean (SD) | 9,918.17 (7,672.74) | 10,196.57 (9,011.31) | 10,670.43 (9,480.30) | 14,821.73 (10,521.89) | ||
| Range | 299 to 38,429 | 299 to 39,430 | 299 to 38,943 | 390 to 39,792 | ||
| Future observation | – | Median [Q25 - Q75] | 12,868 [6,860 - 18,078] | 13,748 [6,878 - 19,331] | 13,165 [5,988 - 18,548] | 7,798 [2,874 - 14,913] |
| Mean (SD) | 12,865.11 (7,543.50) | 13,470.92 (8,215.96) | 12,913.27 (7,929.17) | 9,167.33 (7,160.81) | ||
| Range | 0 to 38,403 | 1 to 39,051 | 0 to 36,654 | 0 to 29,045 | ||
| Days in cohort | – | Median [Q25 - Q75] | 22 [15 - 29] | 61 [31 - 91] | 61 [31 - 91] | 61 [31 - 91] |
| Mean (SD) | 25.02 (8.00) | 61.65 (25.38) | 62.16 (25.32) | 59.26 (24.79) | ||
| Range | 1 to 37 | 2 to 92 | 1 to 91 | 1 to 91 | ||
| Days to next record | – | Median [Q25 - Q75] | 6,082 [2,640 - 10,540] | 5,100 [2,430 - 11,474] | 5,760 [1,922 - 15,606] | 7,430 [2,355 - 15,966] |
| Mean (SD) | 7,457.56 (5,937.15) | 7,784.46 (7,686.21) | 9,217.83 (8,826.17) | 9,037.50 (8,138.51) | ||
| Range | 300 to 29,145 | 330 to 28,365 | 360 to 34,482 | 690 to 19,200 | ||
| Medications prior to index date | 11289 warfarin | N (%) | 12 (0.63%) | 8 (1.72%) | 11 (1.93%) | 4 (2.90%) |
| 7052 morphine | N (%) | 15 (0.78%) | 1 (0.22%) | 2 (0.35%) | 2 (1.45%) | |
| 161 acetaminophen | N (%) | 1,530 (79.90%) | 357 (76.94%) | 447 (78.56%) | 119 (86.23%) | |
| Medications on index date | 161 acetaminophen | N (%) | 773 (40.37%) | 240 (51.72%) | 264 (46.40%) | 90 (65.22%) |
| 11289 warfarin | N (%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | |
| 7052 morphine | N (%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | |
To visualise the exposure of these drugs in a bar plot.
plot_data <- chars |>
filter(
variable_name == "Medications prior to index date",
estimate_name == "percentage"
)
plot_data |>
plotCharacteristics(
plotType = "barplot",
colour = "variable_level",
facet = c("cdm_name", "cohort_name")
) +
scale_x_discrete(limits = rev(sort(unique(plot_data$variable_level)))) +
coord_flip() +
ggtitle("Medication use prior to index date")Instead of creating cohorts, we could have directly used our concept sets for medications when characterising our study cohorts.
chars <- cdm$injuries |>
summariseCharacteristics(conceptIntersectFlag = list(
"Medications prior to index date" = list(
conceptSet = medsCs,
window = c(-Inf, -1)
),
"Medications on index date" = list(
conceptSet = medsCs,
window = c(0, 0)
)
))Although, like here, concept sets can lead to the same result as using cohorts it is important to note this will not always be the case. This is because the creation of cohorts will have involved the collapsing of overlapping records as well as imposing certain requirements, such as only including records that were observed during an an ongoing observation period. Meanwhile, when working with concept sets we will instead be working directly with record-level data.
|
CDM name
|
||||||
|---|---|---|---|---|---|---|
|
GiBleed
|
||||||
| Variable name | Variable level | Estimate name |
Cohort name
|
|||
| ankle_sprain | ankle_fracture | forearm_fracture | hip_fracture | |||
| Number records | – | N | 1,915 | 464 | 569 | 138 |
| Number subjects | – | N | 1,357 | 427 | 510 | 132 |
| Cohort start date | – | Median [Q25 - Q75] | 1982-11-09 [1968-06-15 - 1999-04-13] | 1981-01-15 [1965-03-11 - 1997-08-03] | 1981-07-24 [1967-03-05 - 2000-12-16] | 1996-09-17 [1977-09-20 - 2010-06-22] |
| Range | 1912-02-25 to 2019-05-30 | 1911-09-07 to 2019-06-23 | 1917-08-16 to 2019-06-26 | 1927-12-14 to 2019-05-08 | ||
| Cohort end date | – | Median [Q25 - Q75] | 1982-12-10 [1968-07-06 - 1999-05-09] | 1981-02-28 [1965-04-11 - 1997-10-12] | 1981-08-23 [1967-04-10 - 2001-02-27] | 1996-11-16 [1977-12-04 - 2010-07-22] |
| Range | 1912-03-10 to 2019-05-30 | 1911-12-06 to 2019-06-24 | 1917-11-14 to 2019-06-26 | 1928-03-13 to 2019-06-07 | ||
| Age | – | Median [Q25 - Q75] | 21 [9 - 41] | 16 [9 - 43] | 17 [9 - 46] | 40 [13 - 66] |
| Mean (SD) | 26.63 (21.03) | 27.38 (24.70) | 28.69 (25.97) | 40.06 (28.82) | ||
| Range | 0 to 105 | 0 to 107 | 0 to 106 | 1 to 108 | ||
| Sex | Female | N (%) | 954 (49.82%) | 238 (51.29%) | 286 (50.26%) | 74 (53.62%) |
| Male | N (%) | 961 (50.18%) | 226 (48.71%) | 283 (49.74%) | 64 (46.38%) | |
| Prior observation | – | Median [Q25 - Q75] | 7,833 [3,628 - 15,147] | 6,030 [3,360 - 16,032] | 6,289 [3,390 - 16,847] | 14,522 [4,801 - 24,401] |
| Mean (SD) | 9,918.17 (7,672.74) | 10,196.57 (9,011.31) | 10,670.43 (9,480.30) | 14,821.73 (10,521.89) | ||
| Range | 299 to 38,429 | 299 to 39,430 | 299 to 38,943 | 390 to 39,792 | ||
| Future observation | – | Median [Q25 - Q75] | 12,868 [6,860 - 18,078] | 13,748 [6,878 - 19,331] | 13,165 [5,988 - 18,548] | 7,798 [2,874 - 14,913] |
| Mean (SD) | 12,865.11 (7,543.50) | 13,470.92 (8,215.96) | 12,913.27 (7,929.17) | 9,167.33 (7,160.81) | ||
| Range | 0 to 38,403 | 1 to 39,051 | 0 to 36,654 | 0 to 29,045 | ||
| Days in cohort | – | Median [Q25 - Q75] | 22 [15 - 29] | 61 [31 - 91] | 61 [31 - 91] | 61 [31 - 91] |
| Mean (SD) | 25.02 (8.00) | 61.65 (25.38) | 62.16 (25.32) | 59.26 (24.79) | ||
| Range | 1 to 37 | 2 to 92 | 1 to 91 | 1 to 91 | ||
| Days to next record | – | Median [Q25 - Q75] | 6,082 [2,640 - 10,540] | 5,100 [2,430 - 11,474] | 5,760 [1,922 - 15,606] | 7,430 [2,355 - 15,966] |
| Mean (SD) | 7,457.56 (5,937.15) | 7,784.46 (7,686.21) | 9,217.83 (8,826.17) | 9,037.50 (8,138.51) | ||
| Range | 300 to 29,145 | 330 to 28,365 | 360 to 34,482 | 690 to 19,200 | ||
| Medications prior to index date | 161 acetaminophen | N (%) | 1,530 (79.90%) | 357 (76.94%) | 447 (78.56%) | 119 (86.23%) |
| 7052 morphine | N (%) | 15 (0.78%) | 1 (0.22%) | 2 (0.35%) | 2 (1.45%) | |
| 11289 warfarin | N (%) | 12 (0.63%) | 8 (1.72%) | 11 (1.93%) | 4 (2.90%) | |
| Medications on index date | 161 acetaminophen | N (%) | 773 (40.37%) | 240 (51.72%) | 264 (46.40%) | 90 (65.22%) |
| 11289 warfarin | N (%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | |
| 7052 morphine | N (%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | |
More generally, we can also include summaries of the patients’ presence in other clinical tables of the OMOP CDM. For example, here we add a count of visit occurrences
chars <- cdm$injuries |>
summariseCharacteristics(
tableIntersectCount = list(
"Visits in the year prior" = list(
tableName = "visit_occurrence",
window = c(-365, -1)
)
),
tableIntersectFlag = list(
"Any drug exposure in the year prior" = list(
tableName = "drug_exposure",
window = c(-365, -1)
),
"Any procedure in the year prior" = list(
tableName = "procedure_occurrence",
window = c(-365, -1)
)
)
)|
CDM name
|
||||||
|---|---|---|---|---|---|---|
|
GiBleed
|
||||||
| Variable name | Variable level | Estimate name |
Cohort name
|
|||
| ankle_sprain | ankle_fracture | forearm_fracture | hip_fracture | |||
| Number records | – | N | 1,915 | 464 | 569 | 138 |
| Number subjects | – | N | 1,357 | 427 | 510 | 132 |
| Cohort start date | – | Median [Q25 - Q75] | 1982-11-09 [1968-06-15 - 1999-04-13] | 1981-01-15 [1965-03-11 - 1997-08-03] | 1981-07-24 [1967-03-05 - 2000-12-16] | 1996-09-17 [1977-09-20 - 2010-06-22] |
| Range | 1912-02-25 to 2019-05-30 | 1911-09-07 to 2019-06-23 | 1917-08-16 to 2019-06-26 | 1927-12-14 to 2019-05-08 | ||
| Cohort end date | – | Median [Q25 - Q75] | 1982-12-10 [1968-07-06 - 1999-05-09] | 1981-02-28 [1965-04-11 - 1997-10-12] | 1981-08-23 [1967-04-10 - 2001-02-27] | 1996-11-16 [1977-12-04 - 2010-07-22] |
| Range | 1912-03-10 to 2019-05-30 | 1911-12-06 to 2019-06-24 | 1917-11-14 to 2019-06-26 | 1928-03-13 to 2019-06-07 | ||
| Age | – | Median [Q25 - Q75] | 21 [9 - 41] | 16 [9 - 43] | 17 [9 - 46] | 40 [13 - 66] |
| Mean (SD) | 26.63 (21.03) | 27.38 (24.70) | 28.69 (25.97) | 40.06 (28.82) | ||
| Range | 0 to 105 | 0 to 107 | 0 to 106 | 1 to 108 | ||
| Sex | Female | N (%) | 954 (49.82%) | 238 (51.29%) | 286 (50.26%) | 74 (53.62%) |
| Male | N (%) | 961 (50.18%) | 226 (48.71%) | 283 (49.74%) | 64 (46.38%) | |
| Prior observation | – | Median [Q25 - Q75] | 7,833 [3,628 - 15,147] | 6,030 [3,360 - 16,032] | 6,289 [3,390 - 16,847] | 14,522 [4,801 - 24,401] |
| Mean (SD) | 9,918.17 (7,672.74) | 10,196.57 (9,011.31) | 10,670.43 (9,480.30) | 14,821.73 (10,521.89) | ||
| Range | 299 to 38,429 | 299 to 39,430 | 299 to 38,943 | 390 to 39,792 | ||
| Future observation | – | Median [Q25 - Q75] | 12,868 [6,860 - 18,078] | 13,748 [6,878 - 19,331] | 13,165 [5,988 - 18,548] | 7,798 [2,874 - 14,913] |
| Mean (SD) | 12,865.11 (7,543.50) | 13,470.92 (8,215.96) | 12,913.27 (7,929.17) | 9,167.33 (7,160.81) | ||
| Range | 0 to 38,403 | 1 to 39,051 | 0 to 36,654 | 0 to 29,045 | ||
| Days in cohort | – | Median [Q25 - Q75] | 22 [15 - 29] | 61 [31 - 91] | 61 [31 - 91] | 61 [31 - 91] |
| Mean (SD) | 25.02 (8.00) | 61.65 (25.38) | 62.16 (25.32) | 59.26 (24.79) | ||
| Range | 1 to 37 | 2 to 92 | 1 to 91 | 1 to 91 | ||
| Days to next record | – | Median [Q25 - Q75] | 6,082 [2,640 - 10,540] | 5,100 [2,430 - 11,474] | 5,760 [1,922 - 15,606] | 7,430 [2,355 - 15,966] |
| Mean (SD) | 7,457.56 (5,937.15) | 7,784.46 (7,686.21) | 9,217.83 (8,826.17) | 9,037.50 (8,138.51) | ||
| Range | 300 to 29,145 | 330 to 28,365 | 360 to 34,482 | 690 to 19,200 | ||
| Any drug exposure in the year prior | – | N (%) | 597 (31.17%) | 149 (32.11%) | 171 (30.05%) | 41 (29.71%) |
| Any procedure in the year prior | – | N (%) | 123 (6.42%) | 26 (5.60%) | 37 (6.50%) | 15 (10.87%) |
| Visits in the year prior | – | Median [Q25 - Q75] | 0.00 [0.00 - 0.00] | 0.00 [0.00 - 0.00] | 0.00 [0.00 - 0.00] | 0.00 [0.00 - 0.00] |
| Mean (SD) | 0.00 (0.06) | 0.00 (0.00) | 0.00 (0.00) | 0.00 (0.00) | ||
| Range | 0.00 to 1.00 | 0.00 to 0.00 | 0.00 to 0.00 | 0.00 to 0.00 | ||