| Type: | Package | 
| Title: | Datasets and Functions for the Class "Modelling and Data Analysis for Pharmaceutical Sciences" | 
| Version: | 0.0.5 | 
| Description: | Provides datasets and functions for the class "Modelling and Data Analysis for Pharmaceutical Sciences". The datasets can be used to present various methods of data analysis and statistical modeling. Functions for data visualization are also implemented. | 
| License: | AGPL-3 | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| RoxygenNote: | 7.3.1 | 
| NeedsCompilation: | no | 
| Packaged: | 2025-05-02 13:00:09 UTC; lionel | 
| Author: | Lionel Voirol [aut, cre], Stéphane Guerrier [aut], Yuming Zhang [aut], Luca Insolia [aut] | 
| Maintainer: | Lionel Voirol <lionelvoirol@hotmail.com> | 
| Depends: | R (≥ 3.5.0) | 
| Repository: | CRAN | 
| Date/Publication: | 2025-05-02 13:20:02 UTC | 
Breast Cancer
Description
This dataset consists of several clinical features observed or measured for 116 participants in a study of breast cancer.
Usage
BreastCancer
Format
- Age
 Age in years
- BMI
 Body mass index in kg/m^2
- Glucose
 Glucose in mg/dL
- Insulin
 Insulin in µU/mL
- HOMA
 Homeostasis model assessment
- Classification
 Presence of breast cancer (0 if no cancer, 1 if with cancer)
Source
https://bmccancer.biomedcentral.com/articles/10.1186/s12885-017-3877-1
References
Patricio, Miguel, et al. "Using Resistin, glucose, age and BMI to predict the presence of breast cancer", BMC Cancer, (2018).
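The HOMA column can be related to the Glucose and Insulin columns. The sketch below assumes the conventional HOMA-IR approximation for glucose in mg/dL (glucose times insulin divided by 405); this page does not state how HOMA was computed for the dataset, so the formula, the helper name homa_ir and the input values are illustrative assumptions.

```r
# Hypothetical fasting measurements (not taken from the dataset)
glucose <- c(70, 92, 110)     # mg/dL
insulin <- c(2.7, 3.1, 12.0)  # microU/mL

# Conventional HOMA-IR approximation when glucose is expressed in mg/dL.
# homa_ir is a hypothetical helper, not a function of the package.
homa_ir <- function(glucose_mg_dl, insulin_uU_ml) {
  glucose_mg_dl * insulin_uU_ml / 405
}

homa <- homa_ir(glucose, insulin)
round(homa, 3)
```

Higher values indicate stronger insulin resistance under this approximation.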
HP13Cbicarbonate
Description
Data from an experiment on rats that compares the HP13C bicarbonate signal intensities, normalized to the total sum of metabolites, and the corresponding initial reaction rate as a function of the injected dose of HP1-13C pyruvate. Two groups of rats were compared: fed and overnight-fasted. Dataset from Can et al. 2022.
Usage
HP13Cbicarbonate
Format
- signal
 HP13C bicarbonate signal intensities normalized to the total sum of metabolites
- dose
 initial reaction rate as a function of the injected dose of HP13C pyruvate
- group
 fed and overnight-fasted
Source
https://www.nature.com/articles/s42003-021-02978-2
Peruvian Blood Pressure
Description
This dataset consists of variables possibly relating to blood pressures of 39 Peruvians who have moved from rural high-altitude areas to urban lower-altitude areas.
Usage
PeruvianBP
Format
- Age
 Age in years
- Years
 Years in urban area
- Weight
 Weight in kg
- Height
 Height in mm
- Chin
 Chin skinfold
- Forearm
 Forearm skinfold
- Calf
 Calf skinfold
- Pulse
 Resting pulse rate
- Systol
 Systolic blood pressure
boxplot_w_points
Description
boxplot_w_points
Usage
boxplot_w_points(
  ...,
  col_points = "#9033FF3F",
  col_boxplot = "#d2d2d2",
  horizontal = FALSE,
  main = "",
  names = NULL,
  las = 0,
  xlab = "",
  ylab = "",
  seed = 123,
  jitter_param = 0.25
)
Arguments
... | 
 data vectors to be visualized.  | 
col_points | 
 color of the points to be added to the boxplot.  | 
col_boxplot | 
 color of the boxplot.  | 
horizontal | 
 logical indicating if the boxplots should be horizontal; default FALSE means vertical boxes.  | 
main | 
 string indicating the title of the plot.  | 
names | 
 vector of strings indicating the group labels printed under each boxplot.  | 
las | 
 a numeric value indicating the orientation of the tick mark labels and any other text added to a plot after its initialization. The options are as follows: always parallel to the axis (the default, 0), always horizontal (1), always perpendicular to the axis (2), and always vertical (3).  | 
xlab | 
 a string indicating the x label.  | 
ylab | 
 a string indicating the y label.  | 
seed | 
 an integer specifying a seed for the random jitter of the boxplot points.  | 
jitter_param | 
 a double specifying the amount of jittering applied on points.  | 
Value
No return value. Plots a boxplot.
Examples
x <- rnorm(20, mean = 5)
y <- rnorm(20, mean = 10)
z <- rnorm(20, mean = 15)
boxplot_w_points(x, main = "test")
boxplot_w_points(x, y, names = c("x", "y"), las = 1, main = "Data")
boxplot_w_points(x, y, z, names = c("x", "y", "z"), horizontal = TRUE, las = 1, main = "Data")
boxplot_w_points(x, y, z, names = c("x", "y", "z"), horizontal = FALSE, las = 1, main = "Data")
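The general idea of a boxplot with overlaid jittered points can be sketched in base R. This is a minimal illustration and not the package's implementation; it only reuses the default colors, seed and jitter amount listed under Usage, and how the package applies them internally is an assumption.

```r
set.seed(123)  # fixes both the simulated data and the jitter
groups <- list(x = rnorm(20, mean = 5), y = rnorm(20, mean = 10))

# Draw the boxes first, hiding outliers so they are not drawn twice
bp <- boxplot(groups, col = "#d2d2d2", outline = FALSE, las = 1,
              main = "Data", ylim = range(unlist(groups)))

# Overlay every observation with horizontal jitter around its box position
for (i in seq_along(groups)) {
  points(jitter(rep(i, length(groups[[i]])), amount = 0.25),
         groups[[i]], pch = 16, col = "#9033FF3F")
}
```

Setting the seed before plotting makes the jittered positions reproducible between runs.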
Bronchitis
Description
Data collected in a study to assess the effects of smoking and pollution on being diagnosed with bronchitis. This dataset is based on 212 subjects.
Usage
bronchitis
Format
- bron
 Presence of bronchitis (0 for no and 1 for yes)
- cigs
 Average daily number of smoked cigarettes
- poll
 Pollution index
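Because bron is binary, one natural way to study the effects of cigs and poll is a logistic regression. The sketch below uses simulated data with the same column names; the sample values, the coefficients used for simulation and the choice of model are assumptions made for illustration, not results from the dataset.

```r
set.seed(42)
n <- 212

# Simulated stand-ins for the documented columns (not the real data)
cigs <- rpois(n, lambda = 5)
poll <- runif(n, min = 50, max = 70)
bron <- rbinom(n, size = 1, prob = plogis(-8 + 0.2 * cigs + 0.1 * poll))

# Logistic regression of bronchitis status on smoking and pollution
fit <- glm(bron ~ cigs + poll, family = binomial)

# Exponentiated coefficients are odds ratios per unit increase
exp(coef(fit))
```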
Centenarian Blood Pressure
Description
This dataset consists of variables that are potentially related to blood pressure measurements. It contains one group of patients aged between 52 and 89 years who live in urban areas, and another group of 50 centenarian women aged between 101 and 121 who live on the island of Okinawa, which is known for its high number of centenarians. The dataset lists the following variables:
Usage
centenarian
Format
- Age
 Age in years
- Chin
 Chin skinfold in cm
- Forearm
 Forearm skinfold in cm
- Calf
 Calf skinfold in cm
- Pulse
 Resting pulse rate
- BMI
 The Body Mass Index (BMI) of the participant
- Centenarian
 A dummy variable indicating if the participant is Centenarian
- Cystol
 Systolic blood pressure
codex
Description
This dataset is based on an observational study conducted at Geneva University Hospitals to assess the impact of weight on the pharmacokinetics of dexamethasone in normal-weight versus obese patients hospitalized for COVID-19.
Usage
codex
Format
- id
 ID of the patient
- gender
 Gender (0 for men and 1 for women)
- age
 Age
- bmi
 Body mass index
- weight
 Weight in kg
- number_doses
 Number of doses of the dexamethasone (DEX) drug
- tmax
 The time it takes for the drug to reach the maximum concentration (i.e. Cmax) after its administration in hours (h)
- cmax
 The maximum concentration reached in the blood after the drug has been administered (ng/m)
- t1_2
 Half-life: the time required for the drug concentration within the body to decrease by one-half during elimination, in hours (h)
- auc
 Area under the curve: the integral (from 0 to 8 hours) of the curve that describes the drug concentration in the blood as a function of time after administration (ng.h/m)
- length_hospital
 Number of days the patient was hospitalized
- length_intermed
 Number of days the patient was hospitalized in the intermediate and intensive care unit
- crp
 crp
- comor_e
 Presence of comorbidity type e
- comor_p
 Presence of comorbidity type p
- comor_v
 Presence of comorbidity type v
- comor_c
 Presence of comorbidity type c
- comor_r
 Presence of comorbidity type r
- obese
 Indicator variable based on whether the subject is obese (i.e. with BMI > 30), 0 for no and 1 for yes.
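The pharmacokinetic summaries tmax, cmax and auc, as well as the obese indicator, can be illustrated on a single concentration-time profile. The sketch below uses the linear trapezoidal rule for the area under the curve; this page does not say how auc was computed for the dataset, so the method, the sampling times and the concentrations are assumptions.

```r
# Hypothetical sampling times (h) and concentrations; not from the codex data
time_h <- c(0, 0.5, 1, 2, 4, 6, 8)
conc   <- c(0, 40, 65, 80, 55, 30, 15)

# Cmax and Tmax read directly from the observed profile
cmax <- max(conc)
tmax <- time_h[which.max(conc)]

# Linear trapezoidal rule for the area under the curve from 0 to 8 h
auc_0_8 <- sum(diff(time_h) * (head(conc, -1) + tail(conc, -1)) / 2)

# Obesity indicator as described in the Format list (BMI > 30)
bmi   <- c(24.3, 31.8, 29.9)
obese <- as.integer(bmi > 30)
```

With these made-up values the profile peaks at 2 hours, and only the second hypothetical patient is flagged as obese.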
Biomarkers in pigs fed with various diets
Description
This dataset contains measured biomarkers in pigs fed with various diets.
Usage
cortisol
Format
A data frame with 61 rows and 9 variables:
- id
 the id of the pig
- group
 the diet fed to the pig (chipped diet or non-chipped diet)
- gender
 the gender of the pig
- cortisol
 urine cortisol in pg/ml
- acth
 serum acth in pg/ml
- crh
 serum crh in pg/ml
- testosterone
 testosterone in ng/ml
- lh
 LH in ng/ml
- caloric
 daily caloric intake in kcal
Intensive care admission of COVID-19 patients in Belgium
Description
Data from Parisi, et al., (2021) which studies the applicability of predictive models for intensive care admission of COVID-19 patients in a secondary care hospital in Belgium. This study is based on data of patients admitted to an emergency department with a positive RT-PCR SARS-CoV-2 test.
Usage
covid
Format
A data frame with 64 rows and 5 variables:
- icu
 admission to an Intensive Care Unit (0 for no, 1 for yes)
- sex
 sex (men, women)
- age
 age in years
- ldh
 lactate dehydrogenase in U/L
- spo2
 oxygen saturation in percentage
Source
https://jeccm.amegroups.org/article/view/6927/html
References
Parisi, Nicolas, et al. "Non applicability of validated predictive models for intensive care admission and death of COVID-19 patients in a secondary care hospital in Belgium.", Journal of Emergency and Critical Care Medicine, (2021).
COVID-19 Spatial
Description
Data from the COVID-19 Data Hub joined with spatial features for Switzerland.
Usage
data_covid_switzerland_spatial
Format
- admin
 Country
- iso_alpha_3
 3-letter code of the country according to the standard ISO 3166-1 Alpha-3
- date
 Date
- confirmed
 Cumulative number of confirmed cases
- population
 Total population
- tests
 Cumulative number of tests
- diff_confirmed
 Daily number of confirmed cases
- diff_test
 Daily number of tests
- confirmed_per_pop
 Number of daily confirmed cases divided by the country's population
- confirmed_per_pop_ma
 Moving Average applied to confirmed_per_pop with a window of 7 days
- geometry
 'sf' geometry list of country
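The derived columns in the list above follow from the cumulative series. The sketch below shows one way to obtain them in base R; the input values are made up, and the alignment of the 7-day window (trailing here) is an assumption, since this page only states the window length.

```r
# Hypothetical cumulative confirmed cases over 10 days and a made-up population
confirmed  <- c(10, 12, 15, 21, 30, 42, 50, 61, 75, 90)
population <- 1e6

# Daily counts are first differences of the cumulative series
# (the first day has no predecessor, so it is set to NA here)
diff_confirmed <- c(NA, diff(confirmed))

# Daily confirmed cases relative to the population
confirmed_per_pop <- diff_confirmed / population

# Trailing 7-day moving average; NA until seven daily values are available
confirmed_per_pop_ma <- as.numeric(
  stats::filter(confirmed_per_pop, rep(1 / 7, 7), sides = 1)
)
```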
Source
Diabetes study in Bangladesh
Description
This dataset contains reports of diabetes symptoms from 520 individuals, encompassing symptoms potentially associated with the condition. It was compiled through a questionnaire aimed at recently diagnosed diabetics or individuals displaying one or more symptoms. Data collection took place via direct questionnaire at Sylhet Diabetes Hospital in Bangladesh.
Usage
diabetes
Format
- age
 Age of the patient in years
- gender
 Gender of the patient (Male, Female)
- polyuria
 Presence of polyuria (excessive urination) (Yes, No)
- polydipsia
 Presence of polydipsia (excessive thirst) (Yes, No)
- sudden_weight_loss
 Presence of sudden weight loss (Yes, No)
- weakness
 Presence of weakness (Yes, No)
- polyphagia
 Presence of polyphagia (excessive hunger) (Yes, No)
- genital_thrush
 Presence of genital thrush (Yes, No)
- visual_blurring
 Presence of visual blurring (Yes, No)
- itching
 Presence of itching (Yes, No)
- irritability
 Presence of irritability (Yes, No)
- delayed_healing
 Presence of delayed healing (Yes, No)
- partial_paresis
 Presence of partial paresis (Yes, No)
- muscle_stiffness
 Presence of muscle stiffness (Yes, No)
- alopecia
 Presence of alopecia (Yes, No)
- obesity
 Presence of obesity (Yes, No)
- class
 Diagnosis class (1 if presence of diabetes, 0 otherwise)
Source
https://link.springer.com/chapter/10.1007/978-981-13-8798-2_12
References
Islam, M. M. F., et al. "Likelihood prediction of diabetes at early stage using data mining techniques", Computer vision and machine intelligence in medical image analysis, (2020).
Diet
Description
Diet
Usage
diet
Format
- id
 ID
- gender
 Gender (male or female)
- age
 Age in years
- height
 Height in m
- diet.type
 Type of diet (A, B or C)
- initial.weight
 Initial weight in kg
- final.weight
 Final weight in kg
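Two quantities commonly derived from these columns are the weight change and the body mass index. The sketch below builds a small data frame with the documented column names; the rows and the added columns weight.loss and initial.bmi are hypothetical and not part of the dataset.

```r
# Hypothetical rows mimicking the documented columns (not actual data)
diet_demo <- data.frame(
  height         = c(1.70, 1.82),   # m
  diet.type      = c("A", "B"),
  initial.weight = c(80, 95),       # kg
  final.weight   = c(76, 93.5)      # kg
)

# Weight lost over the diet, in kg (positive values mean weight was lost)
diet_demo$weight.loss <- diet_demo$initial.weight - diet_demo$final.weight

# BMI in kg/m^2, using height in metres as documented
diet_demo$initial.bmi <- diet_demo$initial.weight / diet_demo$height^2
```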
Forced Expiratory Volume
Description
This dataset is based on a study conducted in suburban Boston in the late 1970s to investigate the relationship between forced expiratory volume and smoking behavior in 654 youths between the ages of 3 and 19.
Usage
fev
Format
- fev
 forced expiratory volume or FEV, which measures the amount of air a person can exhale during a forced breath.
- age
 age in years
- sex
 gender of the person (0 for males and 1 for females)
- height
 height in cm
- smoke
 smoking behavior (0 for non-smokers and 1 for smokers)
hist_compare_to_normal
Description
hist_compare_to_normal
Usage
hist_compare_to_normal(
  x,
  col = "lightgray",
  main = "",
  xlab = "",
  ylab = "",
  lwd_line = 1.5,
  col_line1 = "#ff160e",
  col_line2 = "#335bff",
  add_legend = TRUE,
  legend_position = "topleft",
  delta = 0.2,
  ...
)
Arguments
x | 
 data vector to be visualized.  | 
col | 
 color of the histogram.  | 
main | 
 string indicating the title of the plot.  | 
xlab | 
 a string indicating the x label.  | 
ylab | 
 a string indicating the y label.  | 
lwd_line | 
 width of density lines.  | 
col_line1 | 
 color of the density line based on classical maximum likelihood estimation.  | 
col_line2 | 
 color of the density line based on robust estimation.  | 
add_legend | 
 a Boolean indicating whether the estimated parameters of the Normal distribution should be plotted.  | 
legend_position | 
 a string specifying the position of the legend.  | 
delta | 
 graphic parameter to determine the shrinkage of the axis.  | 
... | 
 Extra graphical arguments.  | 
Value
No return value. Plots a histogram.
Examples
n <- 1000
x <- rnorm(n = n)
hist_compare_to_normal(x)
x2 <- rexp(n, rate = 25)
hist_compare_to_normal(x2, legend_position = "topright")
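The comparison drawn by this kind of plot can be sketched in base R: a density-scaled histogram with two fitted Normal curves, one from classical estimates and one from robust estimates. This is not the package's implementation. The page does not say which robust estimator is used, so the median and the scaled MAD below are assumptions.

```r
set.seed(1)
x <- c(rnorm(200), 8, 9, 10)  # simulated data with three gross outliers

# Classical estimates: sample mean and the MLE of the standard deviation
mu_mle <- mean(x)
sd_mle <- sqrt(mean((x - mu_mle)^2))

# Robust estimates (an assumption): median and scaled MAD
mu_rob <- median(x)
sd_rob <- mad(x)

# Histogram on the density scale with both fitted Normal densities
hist(x, freq = FALSE, col = "lightgray", main = "", xlab = "", ylab = "")
curve(dnorm(x, mu_mle, sd_mle), add = TRUE, col = "#ff160e", lwd = 1.5)
curve(dnorm(x, mu_rob, sd_rob), add = TRUE, col = "#335bff", lwd = 1.5)
legend("topleft", legend = c("MLE", "Robust"),
       col = c("#ff160e", "#335bff"), lwd = 1.5, bty = "n")
```

The outliers inflate the classical standard deviation, so the two curves differ visibly on contaminated data.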
Kuwait Blood Pressure
Description
This dataset contains a collection of variables believed to be potentially associated with the blood pressure measurements of 213 individuals from Kuwait. The dataset lists the following variables:
Usage
kuwait_bp
Format
- age
 Age in years
- weight
 Weight in kg
- height
 Height in mm
- chin
 Chin skinfold in cm
- forearm
 Forearm skinfold in cm
- calf
 Calf skinfold in cm
- pulse
 Resting pulse rate
- left_handed
 Whether or not the participant is left-handed
- bmi
 The Body Mass Index (BMI) of the participant
- systol
 Systolic blood pressure
Customer attendance of a pharmacy in Geneva
Description
This dataset contains the number of clients in a pharmacy for each hour over two years.
Usage
pharmacy
Format
A data frame with 17520 rows and 4 variables:
- date
 the date
- hours
 the hour of the day
- weekday
 the week day
- attendance
 the recorded number of clients
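The row count is consistent with one record per hour over two non-leap years, since 2 x 365 x 24 = 17520. The sketch below rebuilds such an hourly grid; the coding of hours as 0 to 23 and the day index are assumptions, as this page does not give the start date or the hour convention.

```r
hours_per_day <- 24
days <- 2 * 365  # two non-leap years (an assumption)

# Expected number of hourly records
n_rows <- days * hours_per_day

# Hypothetical hourly grid: one row per hour of each day
grid <- expand.grid(hours = 0:23, day = seq_len(days))
nrow(grid)
```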
Reading
Description
This dataset is based on a study of the effectiveness of directed reading activities for elementary school students (6-12 years old).
Usage
reading
Format
- id
 Student id
- score
 Degree of Reading Power (DRP) test score
- age
 Age of the students
- group
 Binary variable indicating whether a student participated in the directed reading activities (Treatment if the student participated, Control otherwise)
Snoring
Description
This dataset is based on a study on the physical and behavioral characteristics of snorers.
Usage
snoring
Format
- sex
 gender of the person (0 for males and 1 for females)
- age
 age in years
- height
 height in cm
- weight
 weight in kg
- smoke
 smoking behavior (0 for non-smokers and 1 for smokers)
- alcohol
 number of glasses drunk per day (in red wine equivalent)
- snore
 snoring diagnosis (0 for not snoring, 1 for snoring)
Students
Description
Students
Usage
students
Format
- day
 day
- case
 case