| Type: | Package | 
| Title: | Estimate IV-Optimal Individualized Treatment Rules | 
| Version: | 0.1.0 | 
| Author: | Bo Zhang | 
| Maintainer: | Bo Zhang <bozhan@wharton.upenn.edu> | 
| Description: | A method that estimates an IV-optimal individualized treatment rule. An individualized treatment rule is said to be IV-optimal if it minimizes the maximum risk with respect to the putative IV and the set of IV identification assumptions. Please refer to <doi:10.48550/arXiv.2002.02579> for more details on the methodology and the theory underpinning the method. Function IV_PILE() uses functions in the package 'locClass'. Package 'locClass' can be accessed and installed from the 'R-Forge' repository via the following link: https://r-forge.r-project.org/projects/locclass/. Alternatively, one can install the package by entering the following in R: 'install.packages("locClass", repos="http://R-Forge.R-project.org")'. | 
| License: | GPL-3 | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| RoxygenNote: | 7.1.0 | 
| Depends: | R (≥ 2.10) | 
| Suggests: | locClass | 
| Imports: | stats, nnet, randomForest, dplyr, rlang | 
| NeedsCompilation: | no | 
| Packaged: | 2020-09-03 19:15:08 UTC; ASUS | 
| Repository: | CRAN | 
| Date/Publication: | 2020-09-11 08:40:03 UTC | 
Estimate an IV-optimal individualized treatment rule
Description
IV_PILE estimates an IV-optimal individualized treatment
rule given a dataset with estimated partial identification intervals
for each instance.
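To make the min-max idea concrete, the sketch below works through the pointwise decision when the treatment effect of an instance is only known to lie in an interval [L, U]. It assumes, for illustration only, that the loss of a wrong recommendation equals the magnitude of the effect. The helper minimax_rule is hypothetical and not part of the package; IV_PILE itself fits a weighted SVM rather than applying this rule instance by instance.

```r
# Hypothetical helper (not part of the package): pointwise min-max decision
# for an effect that is only known to lie in [L, U].
# Assumption: a wrong recommendation costs the absolute size of the effect.
minimax_rule <- function(L, U) {
  stopifnot(length(L) == length(U), all(L <= U))
  # Worst case when recommending treatment (+1): the effect may be as low
  # as L, so the loss is -L if L < 0 and 0 otherwise.
  worst_if_treat <- pmax(-L, 0)
  # Worst case when recommending control (-1): the effect may be as high
  # as U, so the loss is U if U > 0 and 0 otherwise.
  worst_if_control <- pmax(U, 0)
  # Pick the recommendation with the smaller worst-case loss.
  ifelse(worst_if_treat <= worst_if_control, 1, -1)
}

# Toy intervals, made up for illustration.
toy <- data.frame(L = c(0.10, -0.40, -0.10, -0.30),
                  U = c(0.50, -0.05,  0.30,  0.20))
toy$rule <- minimax_rule(toy$L, toy$U)
print(toy)
```

When the interval excludes zero the recommendation follows its sign; when it covers zero the recommendation follows the endpoint that is larger in absolute value.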
Usage
IV_PILE(dt, kernel = "linear", C = 1, sig = 1/(ncol(dt) - 5))
Arguments
| dt | A dataframe whose first column is a binary IV 'Z', followed by q columns of observed covariates, a binary treatment indicator 'A', a binary outcome 'Y', lower endpoint of the partial identification interval 'L', and upper endpoint of the partial identification interval 'U'. The dataset has q+5 columns in total. | 
| kernel | The kernel used in the weighted SVM algorithm. The user may choose between 'linear' (linear kernel) and 'radial' (Gaussian RBF kernel). | 
| C | Cost of violating the constraint. This is the parameter C in the Lagrange formulation. | 
| sig | Sigma in the Gaussian RBF kernel. The default is 1/q, where q is the number of covariates. This parameter is not relevant for the linear kernel. | 
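The column layout expected in dt and the default value of sig can be illustrated with a small made-up data frame. This is only a sketch: the covariate names X1 and X2, the constant interval endpoints, and the layout_ok check are assumptions for illustration and do not come from the package.

```r
set.seed(1)
n <- 6

# Toy input with q = 2 covariates, laid out as Z, covariates, A, Y, L, U.
toy_dt <- data.frame(
  Z  = rbinom(n, 1, 0.5),   # binary IV (first column)
  X1 = rnorm(n),            # covariate 1 (hypothetical name)
  X2 = rnorm(n),            # covariate 2 (hypothetical name)
  A  = rbinom(n, 1, 0.5),   # binary treatment indicator
  Y  = rbinom(n, 1, 0.5),   # binary outcome
  L  = rep(-0.2, n),        # lower endpoint (made-up value)
  U  = rep(0.3, n)          # upper endpoint (made-up value)
)

# The five non-covariate columns are Z, A, Y, L and U, so q = ncol(dt) - 5.
q <- ncol(toy_dt) - 5

# Same expression as the default argument in the Usage section.
default_sig <- 1 / (ncol(toy_dt) - 5)

# Hypothetical sanity check of the layout before fitting.
layout_ok <- all(toy_dt[[1]] %in% c(0, 1)) &&
  identical(tail(names(toy_dt), 4), c("A", "Y", "L", "U")) &&
  all(toy_dt$L <= toy_dt$U)
```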
Value
An object of the type wsvm, inheriting from svm.
Examples
## Not run: 
# It is necessary to install the package locClass in order
# to run the following code.
attach(dt_Rouse)
# Construct an IV from the differential distance to two-year versus
# four-year colleges. Z = 1 if the subject lives no farther from
# a 4-year college than from a 2-year college.
Z = (dist4yr <= dist2yr) + 0
# Treatment A = 1 if the subject attends a 4-year college and 0
# otherwise.
A = 1 - twoyr
# Outcome Y = 1 if the subject obtained a bachelor's degree
Y = (educ86 >= 16) + 0
# Prepare the dataset
dt = data.frame(Z, female, black, hispanic, bytest, dadsome,
     dadcoll, momsome, momcoll, fincome, fincmiss, A, Y)
# Estimate the Balke-Pearl bound by estimating each constituent
# conditional probability p(Y = y, A = a | Z, X) with a multinomial
# regression.
dt_with_BP_bound_multinom = estimate_BP_bound(dt, method = 'multinom')
# Estimate the IV-optimal individualized treatment rule using a
# linear kernel, under the putative IV and the Balke-Pearl bound.
iv_itr_BP_linear = IV_PILE(dt_with_BP_bound_multinom, kernel = 'linear')
## End(Not run)
Rouse (1995) dataset
Description
The variables in the dataset are as follows:
- educ86: Years of education completed as of 1986.
- twoyr: Indicator for attending a two-year college immediately after high school.
- female: Gender: 1 if female and 0 otherwise.
- black: Race: 1 if African American and 0 otherwise.
- hispanic: Race: 1 if Hispanic and 0 otherwise.
- bytest: Test score.
- dadsome: Dad's education: some college.
- dadcoll: Dad's education: college.
- momsome: Mom's education: some college.
- momcoll: Mom's education: college.
- fincome: Family income.
- fincmiss: Missingness indicator for family income.
- tuition2: Average state two-year college tuition.
- tuition4: Average state four-year college tuition.
- dist2yr: Distance to the nearest two-year college.
- dist4yr: Distance to the nearest four-year college.
Usage
data(dt_Rouse)
Format
A data frame with 4437 rows and 16 columns.
Source
Rouse (1995).
Estimate the Balke-Pearl bound for each instance in a dataset
Description
estimate_BP_bound estimates the Balke-Pearl bound for
each instance in the input dataset with a binary IV, observed
covariates, a binary treatment indicator, and a binary outcome.
Usage
estimate_BP_bound(dt, method = "rf", nodesize = 5)
Arguments
| dt | A dataframe whose first column is a binary IV 'Z', followed by q columns of observed covariates, followed by a binary treatment indicator 'A', and finally followed by a binary outcome 'Y'. The dataset has q+3 columns in total. | 
| method | A character string indicating the method used to estimate each constituent conditional probability of the Balke-Pearl bound. Users can choose to fit a multinomial regression by setting method = 'multinom', or a random forest by setting method = 'rf'. | 
| nodesize | Node size to be used in a random forest algorithm if method is set to 'rf'. The default value is set to 5. | 
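As a rough illustration of what estimating each constituent conditional probability p(Y = y, A = a | Z, X) with a multinomial regression can look like, the sketch below fits nnet::multinom to a four-level label that encodes the pair (Y, A) and then predicts at Z = 0 and Z = 1. The simulated data, the single covariate X1, and the model formula are assumptions made for this example; the model specification used inside estimate_BP_bound may differ.

```r
library(nnet)

set.seed(2)
n <- 500

# Simulated toy data (made up for illustration).
toy <- data.frame(Z = rbinom(n, 1, 0.5), X1 = rnorm(n))
toy$A <- rbinom(n, 1, plogis(-0.5 + 1.5 * toy$Z + 0.5 * toy$X1))
toy$Y <- rbinom(n, 1, plogis(-0.3 + 0.8 * toy$A + 0.4 * toy$X1))

# Encode the joint outcome (Y, A) as one factor with levels "ya".
toy$label <- factor(paste0(toy$Y, toy$A),
                    levels = c("00", "01", "10", "11"))

# Multinomial regression of the joint label on the IV and the covariate.
fit <- multinom(label ~ Z + X1, data = toy, trace = FALSE)

# Predicted p(Y = y, A = a | Z = z, X) for every instance, at z = 0 and z = 1.
p_z0 <- predict(fit, newdata = transform(toy, Z = 0), type = "probs")
p_z1 <- predict(fit, newdata = transform(toy, Z = 1), type = "probs")

head(round(p_z1, 3))
```

Each row of p_z0 and p_z1 holds the four estimated probabilities for one instance; these are the quantities that a bound formula takes as input.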
Value
The original dataframe with two additional columns: L and U. L indicates the Balke-Pearl lower bound and U is the Balke-Pearl upper bound.
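For readers who want to see how L and U can arise from the constituent probabilities, the sketch below codes the bounds on the average treatment effect in the form commonly attributed to Balke and Pearl, for a single set of probabilities p(Y = y, A = a | Z = z). The helper bp_bound, the naming scheme pya.z, and the toy numbers are assumptions for illustration; this is not the package's implementation, which works with probabilities that are additionally conditional on the covariates of each instance.

```r
# Hypothetical helper (not part of the package).
# p: named numeric vector with entries pya.z = P(Y = y, A = a | Z = z).
bp_bound <- function(p) {
  with(as.list(p), {
    lower <- max(
      p11.1 + p00.0 - 1,
      p11.0 + p00.1 - 1,
      p11.0 - p11.1 - p10.1 - p01.0 - p10.0,
      p11.1 - p11.0 - p10.0 - p01.1 - p10.1,
      -p01.1 - p10.1,
      -p01.0 - p10.0,
      p00.1 - p01.1 - p10.1 - p01.0 - p00.0,
      p00.0 - p01.0 - p10.0 - p01.1 - p00.1
    )
    upper <- min(
      1 - p01.1 - p10.0,
      1 - p01.0 - p10.1,
      -p01.0 + p01.1 + p00.1 + p11.0 + p00.0,
      -p01.1 + p11.1 + p00.1 + p01.0 + p00.0,
      p11.1 + p00.1,
      p11.0 + p00.0,
      -p10.1 + p11.1 + p00.1 + p11.0 + p10.0,
      -p10.0 + p11.0 + p00.0 + p11.1 + p10.1
    )
    c(L = lower, U = upper)
  })
}

# Perfect compliance (A = Z): both endpoints should equal
# P(Y = 1 | Z = 1) - P(Y = 1 | Z = 0) = 0.7 - 0.4.
full_compliance <- c(p00.0 = 0.6, p10.0 = 0.4, p01.0 = 0,   p11.0 = 0,
                     p00.1 = 0,   p10.1 = 0,   p01.1 = 0.3, p11.1 = 0.7)
b_full <- bp_bound(full_compliance)

# Imperfect compliance (made-up probabilities): the interval has positive width.
partial <- c(p00.0 = 0.45, p10.0 = 0.25, p01.0 = 0.10, p11.0 = 0.20,
             p00.1 = 0.15, p10.1 = 0.10, p01.1 = 0.25, p11.1 = 0.50)
b_partial <- bp_bound(partial)
```

The perfect-compliance case is a useful sanity check: with no noncompliance, the interval collapses to the usual difference in outcome means between the two IV arms.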
Examples
attach(dt_Rouse)
# Construct an IV from the differential distance to two-year versus
# four-year colleges. Z = 1 if the subject lives no farther from
# a 4-year college than from a 2-year college.
Z = (dist4yr <= dist2yr) + 0
# Treatment A = 1 if the subject attends a 4-year college and 0
# otherwise.
A = 1 - twoyr
# Outcome Y = 1 if the subject obtained a bachelor's degree
Y = (educ86 >= 16) + 0
# Prepare the dataset
dt = data.frame(Z, female, black, hispanic, bytest, dadsome,
     dadcoll, momsome, momcoll, fincome, fincmiss, A, Y)
# Calculate the Balke-Pearl bound by estimating each constituent
# conditional probability p(Y = y, A = a | Z, X) with a random
# forest.
dt_with_BP_bound_rf = estimate_BP_bound(dt, method = 'rf', nodesize = 5)
# Calculate the Balke-Pearl bound by estimating each constituent
# conditional probability p(Y = y, A = a | Z, X) with a multinomial
# regression.
dt_with_BP_bound_multinom = estimate_BP_bound(dt, method = 'multinom')
Estimate the partial identification bound as in Siddique (2013, JASA) for each instance in a dataset
Description
estimate_Sid_bound estimates the partial identification bound
for each instance in the input dataset with a binary IV, observed
covariates, a binary treatment indicator, and a binary outcome according
to Siddique (2013, JASA).
Usage
estimate_Sid_bound(dt, method = "rf", nodesize = 5)
Arguments
| dt | A dataframe whose first column is a binary IV 'Z', followed by q columns of observed covariates, followed by a binary treatment indicator 'A', and finally followed by a binary outcome 'Y'. The dataset has q+3 columns in total. | 
| method | A character string indicating the method used to estimate each constituent conditional probability of the partial identification bound. Users can choose to fit a multinomial regression by setting method = 'multinom', or a random forest by setting method = 'rf'. | 
| nodesize | Node size to be used in a random forest algorithm if method is set to 'rf'. The default value is set to 5. | 
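The random forest option can be pictured in a similar way. The sketch below, which runs only if the randomForest package is installed, classifies a four-level label encoding (Y, A) and reads off class probabilities at Z = 1, with nodesize passed through as in the Usage section. The simulated data, the covariate X1, and the label encoding are assumptions for illustration, not the package's internal code.

```r
set.seed(3)
n <- 300

# Simulated toy data (made up for illustration).
toy <- data.frame(Z = rbinom(n, 1, 0.5), X1 = rnorm(n))
A <- rbinom(n, 1, plogis(-0.5 + 1.5 * toy$Z + 0.5 * toy$X1))
Y <- rbinom(n, 1, plogis(-0.3 + 0.8 * A + 0.4 * toy$X1))

# Joint outcome (Y, A) as one factor with levels "ya".
label <- factor(paste0(Y, A), levels = c("00", "01", "10", "11"))

if (requireNamespace("randomForest", quietly = TRUE)) {
  # Classification forest; nodesize is the minimum size of terminal nodes.
  rf_fit <- randomForest::randomForest(x = toy, y = label, nodesize = 5)

  # Estimated p(Y = y, A = a | Z = 1, X) for every instance.
  newdata_z1 <- transform(toy, Z = 1)
  p_z1_rf <- predict(rf_fit, newdata = newdata_z1, type = "prob")
  print(head(p_z1_rf))
}
```

A larger nodesize gives shallower trees and smoother probability estimates, at the cost of less adaptivity to the covariates.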
Value
The original dataframe with two additional columns: L and U. L indicates the lower bound and U the upper bound, as in Siddique (2013, JASA).
Examples
attach(dt_Rouse)
# Construct an IV from the differential distance to two-year versus
# four-year colleges. Z = 1 if the subject lives no farther from
# a 4-year college than from a 2-year college.
Z = (dist4yr <= dist2yr) + 0
# Treatment A = 1 if the subject attends a 4-year college and 0
# otherwise.
A = 1 - twoyr
# Outcome Y = 1 if the subject obtained a bachelor's degree
Y = (educ86 >= 16) + 0
# Prepare the dataset
dt = data.frame(Z, female, black, hispanic, bytest, dadsome,
     dadcoll, momsome, momcoll, fincome, fincmiss, A, Y)
# Calculate the Siddique bound by estimating each constituent
# conditional probability p(Y = y, A = a | Z, X) with a random
# forest.
dt_with_Sid_bound_rf = estimate_Sid_bound(dt, method = 'rf', nodesize = 5)
# Calculate the Siddique bound by estimating each constituent
# conditional probability p(Y = y, A = a | Z, X) with a multinomial
# regression.
dt_with_Sid_bound_multinom = estimate_Sid_bound(dt, method = 'multinom')