Modified Detecting Deviating Cells (MDDC) algorithm for adverse event signal identification with boxplot method for cutoff selection.
Source: R/mddc_boxplot.R
mddc_boxplot.Rd
Modified Detecting Deviating Cells (MDDC) algorithm for adverse event signal identification. The boxplot method is used for cutoff selection in step 2 of the algorithm.
Usage
mddc_boxplot(
  contin_table,
  col_specific_cutoff = TRUE,
  separate = TRUE,
  if_col_cor = FALSE,
  cor_lim = 0.8,
  coef = 1.5,
  num_cores = 2
)
Arguments
- contin_table
A data matrix of an \(I\) x \(J\) contingency table with row (adverse event) and column (drug or vaccine) names. Please first check the input contingency table using the function check_and_fix_contin_table().
- col_specific_cutoff
Logical. In the second step of the algorithm, whether to apply the boxplot method to the standardized Pearson residuals within each drug or vaccine column, or to those of the entire table. Default is TRUE, that is, within each drug or vaccine column (column-specific cutoff). FALSE indicates applying the boxplot method to the residuals of the entire table.
- separate
Logical. In the second step of the algorithm, whether to separate the standardized Pearson residuals of the zero cells from those of the non-zero cells and apply the boxplot method to each group separately, or to apply it to all residuals together. Default is TRUE.
- if_col_cor
Logical. In the third step of the algorithm, whether to use column (drug or vaccine) correlation or row (adverse event) correlation. Default is FALSE, that is, using the adverse event correlation. TRUE indicates using drug or vaccine correlation.
- cor_lim
A numeric value in the interval (0, 1). The correlation threshold used in the third step to select “connected” adverse events. Default is 0.8.
- coef
A numeric value or a list of numeric values. If a single numeric value is provided, it will be applied uniformly across all columns of the contingency table. If a list is provided, its length must match the number of columns in the contingency table, and each value will be used as the coefficient for the corresponding column. Default is 1.5.
- num_cores
Number of cores used to parallelize the MDDC Boxplot algorithm. Default is 2.
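The interaction between col_specific_cutoff and coef can be illustrated with a small base R sketch. It assumes the conventional boxplot upper fence, Q3 + coef x IQR, applied to a made-up matrix of residuals; the helper upper_fence() and the data are hypothetical, and the package's internal computation (for example its quantile type or its handling of zero cells under separate = TRUE) may differ.

```r
# Hypothetical helper (not part of the package): conventional boxplot upper fence.
upper_fence <- function(x, coef = 1.5) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  q[2] + coef * (q[2] - q[1])
}

# Made-up residuals for 6 adverse events (rows) and 2 drugs (columns).
resid_mat <- matrix(
  c(-1.2, -0.5, 0.1, 0.4, 0.9, 6.0,   # DrugA: one clearly large residual
    -0.8, -0.3, 0.0, 0.2, 0.7, 1.1),  # DrugB: no extreme residual
  nrow = 6,
  dimnames = list(paste0("AE", 1:6), c("DrugA", "DrugB"))
)

# One coefficient per column, mirroring the "list of values" form of coef.
coef_by_col <- c(1.5, 1.5)

# Column-specific cutoffs (analogous to col_specific_cutoff = TRUE).
cutoffs <- vapply(
  seq_len(ncol(resid_mat)),
  function(j) upper_fence(resid_mat[, j], coef = coef_by_col[j]),
  numeric(1)
)
flag_col <- sweep(resid_mat, 2, cutoffs, FUN = ">") * 1

# A single cutoff for the whole table (analogous to col_specific_cutoff = FALSE).
cutoff_all <- upper_fence(as.vector(resid_mat), coef = 1.5)
flag_all <- (resid_mat > cutoff_all) * 1
```

A smaller coefficient lowers the fence and flags more cells, which is why a per-column coef can be useful when columns differ in how heavy-tailed their residuals are.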
Value
A list with the following components:
- boxplot_signal
returns the signals identified in the second step. 1 indicates a signal, 0 indicates a non-signal.
- corr_signal_pval
returns the p values for each cell in the contingency table in the fifth step, when the \(r_{ij}\) values are mapped back to the standard normal distribution.
- corr_signal_adj_pval
returns the Benjamini-Hochberg adjusted p values for each cell in the fifth step. The user can decide whether to use corr_signal_pval or corr_signal_adj_pval, and what threshold for p values should be used (for example, 0.05). Please see the example below.
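To show how the choice between raw and adjusted p values can change the set of signals, here is a minimal base R sketch using stats::p.adjust() with method = "BH". The p values are made up, and the sketch assumes the adjustment is taken over all cells of the table at once; the source does not state how the package groups cells for the adjustment.

```r
# Made-up p values for 3 adverse events (rows) and 2 drugs (columns).
pval <- matrix(
  c(0.001, 0.20, 0.04,
    0.03,  0.50, 0.008),
  nrow = 3,
  dimnames = list(paste0("AE", 1:3), c("DrugA", "DrugB"))
)

# Benjamini-Hochberg adjustment over all cells (assumption: whole-table adjustment).
# p.adjust() returns a plain vector, so the matrix shape is restored afterwards.
adj_pval <- matrix(
  stats::p.adjust(as.vector(pval), method = "BH"),
  nrow = nrow(pval),
  dimnames = dimnames(pval)
)

# Signals at the 0.05 threshold, with and without adjustment.
signal_raw <- (pval < 0.05) * 1
signal_adj <- (adj_pval < 0.05) * 1
```

Because adjusted p values are never smaller than the raw ones, thresholding the adjusted values yields the same or fewer signals at a given threshold.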
See also
find_optimal_coef for finding an optimal value of coef.
Examples
# using statin49 data set as an example
data(statin49)
# apply the mddc_boxplot
boxplot_res <- mddc_boxplot(statin49)
# signals identified in step 2 using boxplot method
signal_step2 <- boxplot_res$boxplot_signal
# signals identified in step 5 by considering AE correlations
# In this example, cells with p values less than 0.05 are
# identified as signals
signal_step5 <- (boxplot_res$corr_signal_pval < 0.05) * 1