This vignette is largely based on the PharmaSUG 2023 conference paper.
When using a maraca plot in a regulatory setting, for example to visualize clinical study results for regulatory submissions, there will be strict validation demands. For example, the results might need to be double programmed for validation purposes.
In order to facilitate the validation of the graphic output, the
maraca package includes the function validate_maraca_plot()
that allows the user to extract important metrics from the plot itself.
This allows to programmatically compare the results of a plot produced
using the maraca package with other programmatic approaches. The paper
linked above walks through a validation example where the double
programming was done in SAS.
To use the validation functionality, we need to first create a maraca object.
library(maraca)
data(hce_scenario_a)
maraca_dat <- maraca(
data = hce_scenario_a,
step_outcomes = c("Outcome I", "Outcome II", "Outcome III", "Outcome IV"),
last_outcome = "Continuous outcome",
fixed_followup_days = 3 * 365,
column_names = c(outcome = "GROUP", arm = "TRTP", value = "AVAL0"),
arm_levels = c(active = "Active", control = "Control"),
compute_win_odds = TRUE
)We then create a maraca plot and save the actual plot as an object.
# Save plot as its own object
maraca_plot <- plot(maraca_dat)
# The plot has its own class called "maracaPlot"
class(maraca_plot)
## [1] "maracaPlot" "ggplot2::ggplot" "ggplot" "ggplot2::gg"
## [5] "S7_object" "gg"
# Display plot
maraca_plot
Now we can validate the plot using the
validate_maraca_plot() function.
validation_list <- validate_maraca_plot(maraca_plot)
## Joining with `by = join_by(group)`
## Joining with `by = join_by(group)`
# Display which metrics are included
str(validation_list)
## List of 9
## $ plot_type : chr "default"
## $ proportions : Named num [1:5] 13.8 18.1 12.7 9.8 45.6
## ..- attr(*, "names")= chr [1:5] "Outcome I" "Outcome II" "Outcome III" "Outcome IV" ...
## $ tte_data :'data.frame': 544 obs. of 3 variables:
## ..$ x : num [1:544] 0.125 0.159 0.463 0.591 0.621 ...
## ..$ y : num [1:544] 0.2 0.4 0.6 0.2 0.4 0.6 0.8 0.8 1 1.2 ...
## ..$ group: Factor w/ 2 levels "Active","Control": 2 2 2 1 1 1 2 1 2 2 ...
## $ binary_step_data: NULL
## $ binary_last_data: NULL
## $ scatter_data : NULL
## $ boxstat_data :'data.frame': 2 obs. of 9 variables:
## ..$ group : Factor w/ 2 levels "Active","Control": 1 2
## ..$ whisker_lower: num [1:2] 63.8 57.8
## ..$ hinge_lower : num [1:2] 74.9 71
## ..$ median : num [1:2] 79.7 75.9
## ..$ hinge_upper : num [1:2] 83.9 80.7
## ..$ whisker_upper: num [1:2] 95.5 94.5
## ..$ outliers :List of 2
## .. ..$ : num 60.8
## .. ..$ : num [1:5] 54.4 96.4 97.2 99.7 100
## ..$ x_lowest : num [1:2] 60.8 54.4
## ..$ x_highest : num [1:2] 95.5 100
## $ violin_data :'data.frame': 2060 obs. of 7 variables:
## ..$ group : Factor w/ 2 levels "Active","Control": 1 1 1 1 1 1 1 1 1 1 ...
## ..$ x : num [1:2060] 60.8 60.8 60.9 61 61 ...
## ..$ y : num [1:2060] 44 44 44 44 44 44 44 44 44 44 ...
## ..$ density : num [1:2060] 1.44 1.48 1.51 1.55 1.6 ...
## ..$ density_scaled: num [1:2060] 0.0197 0.0202 0.0208 0.0213 0.0219 ...
## ..$ violinwidth : num [1:2060] 44.2 44.2 44.2 44.2 44.2 ...
## ..$ width : num [1:2060] 18.7 18.7 18.7 18.7 18.7 ...
## $ wo_stats : Named num [1:4] 1.64 1.42 1.91 9.35e-12
## ..- attr(*, "names")= chr [1:4] "winodds" "lowerCI" "upperCI" "p_value"Running the validate_maraca_plot() function on a maraca
plot object returns a list with the following items:
plot_type: depending on which
density_plot_type was selected for the plot either
GeomPoint, GeomViolin and/or
GeomBoxplotproportions: the proportions of the HCE componentstte_data: time-to-event data if part of the step
outcomes has type tte, otherwise NULLbinary_step_data: binary data if part of the step
outcomes has type binary, otherwise NULLbinary_step_data: if last endpoint was binary then
contains the data for the minimum, maximum and middle point x values
displayed in the ellipsis, otherwise NULLscatter_data: if last endpoint was continuous and plot
was created with density_plot_type = "scatter" then
contains dataset that was plotted in scatter plot, otherwise
NULLboxstat_data: if last endpoint was continuous and if
plot was created with density_plot_type = "box" or
density_plot_type = "default" then contains the boxplot
statistics, otherwise NULLviolin_data: if last endpoint was continuous and if
plot was created with density_plot_type = "violin" or
density_plot_type = "default" then contains the violin
distribution data, otherwise NULLwo_stats: if maraca object was created with
compute_win_odds = TRUE then contains the win odds
statistics, otherwise NULLThese can then be converted to a convenient format for validation, such as as individual data.frames.
library(dplyr)
library(tidyr)
validation_list$proportions %>%
as.data.frame() %>%
rename("proportion" = ".")
## proportion
## Outcome I 13.8
## Outcome II 18.1
## Outcome III 12.7
## Outcome IV 9.8
## Continuous outcome 45.6
head(validation_list$tte_data)
## x y group
## geom_step.3 0.1251495 0.2 Control
## geom_step.4 0.1593700 0.4 Control
## geom_step.5 0.4625571 0.6 Control
## geom_step.6 0.5905571 0.2 Active
## geom_step.7 0.6209704 0.4 Active
## geom_step.8 0.6478802 0.6 Active
validation_list$boxstat_data %>%
unnest_wider(outliers, names_sep = "") %>%
pivot_longer(., cols = -group, names_to = "stat_name", values_to = "values") %>%
filter(!is.na(values)) %>%
as.data.frame()
## group stat_name values
## 1 Active whisker_lower 63.82323
## 2 Active hinge_lower 74.87904
## 3 Active median 79.68861
## 4 Active hinge_upper 83.87907
## 5 Active whisker_upper 95.53322
## 6 Active outliers1 60.77490
## 7 Active x_lowest 60.77490
## 8 Active x_highest 95.53322
## 9 Control whisker_lower 57.76833
## 10 Control hinge_lower 70.99354
## 11 Control median 75.90131
## 12 Control hinge_upper 80.73315
## 13 Control whisker_upper 94.45711
## 14 Control outliers1 54.40000
## 15 Control outliers2 96.39603
## 16 Control outliers3 97.24586
## 17 Control outliers4 99.68534
## 18 Control outliers5 100.00000
## 19 Control x_lowest 54.40000
## 20 Control x_highest 100.00000
head(validation_list$violin_data)
## group x y density density_scaled violinwidth width
## 1 Active 60.77490 44 1.438990 0.01972645 44.18464 18.72
## 2 Active 60.84292 44 1.476413 0.02023947 44.18944 18.72
## 3 Active 60.91094 44 1.514659 0.02076376 44.19435 18.72
## 4 Active 60.97896 44 1.554340 0.02130773 44.19944 18.72
## 5 Active 61.04698 44 1.595229 0.02186826 44.20469 18.72
## 6 Active 61.11500 44 1.637199 0.02244361 44.21007 18.72
validation_list$wo_stats
## winodds lowerCI upperCI p_value
## 1.643265e+00 1.416117e+00 1.906848e+00 9.354073e-12