Effects of settlement patterns, choices of cluster size and buffer widths, and the extent of spillover between arms on the outcomes of CRTs do not lend themselves to mathematical analysis. Simulations of trials are used to explore the effects of these variables on trial power and on the robustness of statistical methodologies.
Trials can be simulated using the simulateCRT function,
which augments a trial data frame (created externally) or
object of class CRTsp (created by package functions) with
simulated outcome data. The input object must be given location
information and both cluster and arm assignments (see Use Case 1) (or the package can generate these
if the objective is purely simulation.
Information about the underlying spatial pattern of disease is used
in the form of the intra-cluster correlation of the outcome, which is
input to the simulation as variable ICC_inp, and of the
propensity. The former takes a single value for the chosen
design. The latter takes a positive real value for each location. In the
case of malaria, propensity can be thought of as measuring
exposure to infectious mosquitoes.
ICC_inp and propensity may either be
estimated from other datasets or supplied by the user. The behaviour of
the function depends on which variables are supplied, and the value of
generateBaseline, as follows:
| Data supplied by the user | Function behaviour | 
|---|---|
| propensitysupplied by user | Baseline data are created by sampling around propensity | 
| Baseline data are supplied by user and propensityis not supplied | propensityis created from the baseline
data | 
| Neither baseline data nor propensityare
supplied | propensityis generated using normal
kernels, with the bandwidth adjusted to achieve the input value of theICC_inp(after the further smoothing stage to simulate
spillover (see below)) | 
The effect of intervention is simulated as a fixed percentage
reduction in the propensity. Contamination or spillover
between trial arms is then modelled as a additional smoothing process
applied to the intervention-adjusted propensity via a
further bivariate normal kernel. In the case of mosquito borne disease
this is proposed as an approximation to the effect of mosquito movement.
The degree of spillover is specified either as a spillover interval with
the spillover_interval parameter, or as sd,
the bandwidth of the corresponding normal kernel. If both are provided
then it is the value of spillover_interval that is
used.
library(CRTspat)
set.seed(1234)
example_locations <- readdata('example_site.csv')
example_locations$base_denom <- 1
library(dplyr)
example_randomized <- CRTsp(example_locations) %>%
  aggregateCRT(auxiliaries = c("RDT_test_result", "base_denom")) %>%
  specify_clusters(h = 50, algorithm = 'NN') %>%
  randomizeCRT(matchedPair = FALSE)
summary(example_randomized)## ===============================CLUSTER RANDOMISED TRIAL ===========================
## 
## Summary of coordinates
## ----------------------
##                Min.   : 1st Qu.: Median : Mean   : 3rd Qu.: Max.   :
##       x        -3.20    -1.40    -0.30    -0.07     1.26     5.16   
##       y        -5.08    -2.84     0.19     0.05     2.49     6.16   
## nearestDiscord -3.47    -0.49     0.04     0.14     0.74     3.30   
## 
## Total area (within  0.2 km of a location) :  27.6 sq.km
## Total area (convex hull) :  48.2 sq.km
## 
## Locations and Clusters
## ----------------------                                          -                        
## Coordinate system                      (x, y)                        
## Locations:                                                      1181                        
## Available clusters (across both arms)                           24                        
##   Per cluster mean number of points                             49.2                        
##   Per cluster s.d. number of points                             3.9                        
## Cluster randomization:                      Independently randomized                        
## No power calculations to report          -                        
## 
## Other variables in dataset
## --------------------------          RDT_test_result  base_denomplotCRT(example_randomized, map = TRUE, legend.position = c(0.8, 0.8))
  example2a <- simulateCRT(example_randomized,
           effect = 0.8,
           outcome0 = 0.5,
           generateBaseline = FALSE,
           baselineNumerator = "RDT_test_result",
           baselineDenominator = "base_denom",
           ICC_inp = 0.05, spillover_interval = 0.8)
summary(example2a)## ===============================CLUSTER RANDOMISED TRIAL ===========================
## 
## Summary of coordinates
## ----------------------
##                Min.   : 1st Qu.: Median : Mean   : 3rd Qu.: Max.   :
##       x        -3.20    -1.40    -0.30    -0.07     1.26     5.16   
##       y        -5.08    -2.84     0.19     0.05     2.49     6.16   
## nearestDiscord -3.47    -0.49     0.04     0.14     0.74     3.30   
## 
## Total area (within  0.2 km of a location) :  27.6 sq.km
## Total area (convex hull) :  48.2 sq.km
## 
## Locations and Clusters
## ----------------------                                          -                        
## Coordinate system                      (x, y)                        
## Locations:                                                      1181                        
## Available clusters (across both arms)                           24                        
##   Per cluster mean number of points                             49.2                        
##   Per cluster s.d. number of points                             3.9                        
## Cluster randomization:                      Independently randomized                        
## No power calculations to report          -                        
## 
## Other variables in dataset
## --------------------------          RDT_test_result  base_denom  denom  propensity  numlibrary(Matrix)
examplemesh100 <- readdata("examplemesh100.rds")
example2aanalysis <- CRTanalysis(trial=example2a, method = 'T')
summary(example2aanalysis)## 
## =====================CLUSTER RANDOMISED TRIAL ANALYSIS =================
## Analysis method:  T 
## Link function:  logit 
## Model formula:  arm + (1 | cluster) 
## No modelling of spillover 
## Estimates:       Control:  0.386  (95% CL:  0.301 0.478 )
##             Intervention:  0.191  (95% CL:  0.131 0.269 )
##                 Efficacy:  0.506  (95% CL:  0.351 0.784 )
## Coefficient of variation:  49.3 %  (95% CL:  36.5 77.5 )
##  
## P-value (2-sided):  0.001224732plotCRT(example2aanalysis)
example2aINLA <- CRTanalysis(trial=example2a,
                 method = 'INLA', link='logit', cfunc = 'Z',
                 clusterEffects = FALSE, spatialEffects = TRUE,
                 requireMesh = TRUE, inla_mesh = examplemesh100)
plotCRT(example2aINLA, map = TRUE, fill = 'prediction',
  showClusterBoundaries = TRUE, legend.position = c(0.8, 0.8))
 
 Fig 2.1 Map of
allocations of clusters to arms
 
 Fig 2.2 Plot of data by
distance to other arm
 
 Fig 2.3 Smoothed outcome
from geostatistical model
set.seed(1234)
# Simulate a site with 2000 locations
new_site <- CRTsp(geoscale = 2, locations=2000, kappa=3, mu=40)
# propensity surface generated as an arbitrary linear function of x the co-ordinate
new_site$trial$propensity <- 0.5*new_site$trial$x - min(new_site$trial$x)+1
library(dplyr)
example2b<- CRTsp(new_site) %>%
   specify_clusters(h = 40, algorithm = 'NN') %>%
   randomizeCRT(matchedPair = FALSE) %>%
   simulateCRT(effect = 0.8,
               outcome0 = 0.5,
               generateBaseline = TRUE,
               ICC_inp = 0.05,
               spillover_interval = 0.5)## 
## =====================    SIMULATION OF CLUSTER RANDOMISED TRIAL    =================## *** computed distance to nearest measurements in discordant arm ***## Estimating the smoothing required to achieve the target ICC of 0.05## 
tol: 0.005 loss = 0.0034482803603185 ## 
bandwidth: 0.0117436284570214  ICC = 0.108722060252673 loss = 0.0034482803603185 ## ===============================CLUSTER RANDOMISED TRIAL ===========================
## 
## Summary of coordinates
## ----------------------
##                Min.   : 1st Qu.: Median : Mean   : 3rd Qu.: Max.   :
##       x        -8.73    -5.16    -1.03     0.00     5.17    11.26   
##       y        -9.55    -4.42    -0.58     0.00     4.56    10.45   
## nearestDiscord -4.20    -0.93     0.00     0.03     0.96     3.94   
## 
## Total area (within  0.2 km of a location) :  181 sq.km
## Total area (convex hull) :  396 sq.km
## 
## Locations and Clusters
## ----------------------                                          -                        
## Coordinate system                      (x, y)                        
## Locations:                                                      2000                        
## Available clusters (across both arms)                           50                        
##   Per cluster mean number of points                             40                        
##   Per cluster s.d. number of points                             0                        
## Cluster randomization:                      Independently randomized                        
## No power calculations to report          -                        
## 
## Other variables in dataset
## --------------------------          denom  propensity  num  base_denom  base_num##     (Intercept) armintervention 
##      -0.6322523      -0.4504233## No non-linear parameter.  No fixed effects of distance -## 
## =====================CLUSTER RANDOMISED TRIAL ANALYSIS =================
## Analysis method:  GEE 
## Link function:  logit 
## Model formula:  arm 
## No modelling of spillover 
## Estimates:       Control:  0.347  (95% CL:  0.292 0.406 )
##             Intervention:  0.253  (95% CL:  0.216 0.294 )
##                 Efficacy:  0.271  (95% CL:  0.0608 0.433 )
## Coefficient of variation:  50.4 %  (95% CL:  40.5 67.3 )
## Intracluster correlation (ICC)  :  0.109  (95% CL:  NA NA )
## 
 
 Fig 2.4 Map of clusters
in simulated trial
set.seed(1234)
# use co-ordinates, cluster and arm assignments, and baseline data from `example_simulated`
example2c<- CRTsp(geoscale = 2, locations=2000, kappa=3, mu=40) %>%
   specify_clusters(h = 40, algorithm = 'NN') %>%
   randomizeCRT(matchedPair = FALSE) %>%
   simulateCRT(effect = 0.8,
       outcome0 = 0.5,
       generateBaseline = TRUE,
       baselineNumerator = 'base_num',
       baselineDenominator = 'base_denom',
       ICC_inp = 0.08,
       spillover_interval = 0.2)## 
## =====================    SIMULATION OF CLUSTER RANDOMISED TRIAL    =================## *** computed distance to nearest measurements in discordant arm ***## Estimating the smoothing required to achieve the target ICC of 0.08## 
tol: 0.005 loss = 0.00125339394677766 ## 
bandwidth: 0.0246321272161409  ICC = 0.115403304178814 loss = 0.00125339394677766 ##     (Intercept) armintervention 
##      -0.6677537      -0.3730238## No non-linear parameter.  No fixed effects of distance -## 
## =====================CLUSTER RANDOMISED TRIAL ANALYSIS =================
## Analysis method:  GEE 
## Link function:  logit 
## Model formula:  arm 
## No modelling of spillover 
## Estimates:       Control:  0.339  (95% CL:  0.281 0.402 )
##             Intervention:  0.261  (95% CL:  0.208 0.322 )
##                 Efficacy:  0.23  (95% CL:  -0.0152 0.419 )
## Coefficient of variation:  51.9 %  (95% CL:  41.6 69.6 )
## Intracluster correlation (ICC)  :  0.115  (95% CL:  NA NA )
##