
Modified Detecting Deviating Cells (MDDC) algorithm for adverse event signal identification. A Monte Carlo (MC) method is used to select the cutoff in the second step of the algorithm.

Usage

mddc_mc(
  contin_table,
  quantile = 0.95,
  rep = 10000,
  exclude_same_drug_class = TRUE,
  col_specific_cutoff = TRUE,
  separate = TRUE,
  if_col_cor = FALSE,
  cor_lim = 0.8,
  num_cores = 2,
  seed = NULL
)

Arguments

contin_table

A data matrix of an \(I\) x \(J\) contingency table with row (adverse event) and column (drug or vaccine) names. Check the input contingency table with check_and_fix_contin_table() before calling this function.

quantile

The quantile of the null distribution, obtained via the MC method in the second step of the algorithm, that is used as the threshold for identifying cells with high standardized Pearson residuals. Default is 0.95.
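To make the role of quantile concrete, the following is a minimal base R sketch of a Monte Carlo cutoff. It assumes the null tables are drawn from a multinomial distribution under row-column independence and that the cutoff is the quantile of the maximum standardized Pearson residual per simulated table; the package's actual simulation scheme may differ. The toy table and the names std_resid, n_rep and null_max are hypothetical and not part of the package.

```r
set.seed(1)

# Toy 3 x 2 contingency table (hypothetical counts, not taken from statin49)
tab <- matrix(c(20, 15, 30, 25, 10, 40), nrow = 3,
              dimnames = list(c("AE1", "AE2", "AE3"), c("DrugA", "DrugB")))

# Standardized Pearson residuals of a contingency table
std_resid <- function(x) {
  n  <- sum(x)
  pr <- rowSums(x) / n
  pc <- colSums(x) / n
  e  <- outer(pr, pc) * n                      # expected counts under independence
  (x - e) / sqrt(e * outer(1 - pr, 1 - pc))
}

# Cell probabilities under independence, estimated from the observed margins
n      <- sum(tab)
p_null <- outer(rowSums(tab) / n, colSums(tab) / n)

# Simulate null tables and record the largest residual in each (assumed scheme)
n_rep <- 1000
null_max <- replicate(n_rep, {
  sim <- matrix(rmultinom(1, n, as.vector(p_null)), nrow = nrow(tab))
  max(std_resid(sim))
})

# The cutoff is the chosen quantile of the simulated null distribution
cutoff <- quantile(null_max, 0.95, na.rm = TRUE)

# Cells whose observed residual exceeds the cutoff would be flagged
obs     <- std_resid(tab)
flagged <- (obs > cutoff) * 1
```

A larger quantile gives a stricter cutoff and therefore fewer flagged cells.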

rep

The number of Monte Carlo replications used by the MC method in the second step. Default is 10000.

exclude_same_drug_class

Logical. In the second step, Fisher's exact test is applied to cells with a count less than six, which requires constructing a 2 by 2 contingency table. Whether to exclude other drugs or vaccines in the same class as the drug or vaccine of interest when constructing that table. Default is TRUE.
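The 2 by 2 construction described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal code: the toy counts, the helper two_by_two, the choice of DrugB as a same-class drug, and the one-sided alternative are all hypothetical.

```r
# Toy table (hypothetical counts); assume DrugA and DrugB belong to the same class
tab <- matrix(c( 3, 40, 25,
                12, 35, 30,
                 8, 50, 45,
                20, 60, 55), nrow = 3,
              dimnames = list(c("AE1", "AE2", "Other"),
                              c("DrugA", "DrugB", "DrugC", "DrugD")))

# Hypothetical helper: collapse the table to 2 x 2 for one (AE, drug) cell,
# optionally leaving out the drugs listed in `exclude` from the comparator group
two_by_two <- function(tab, ae, drug, exclude = character(0)) {
  other_drugs <- setdiff(colnames(tab), c(drug, exclude))
  other_aes   <- setdiff(rownames(tab), ae)
  matrix(c(tab[ae, drug],        sum(tab[other_aes, drug]),
           sum(tab[ae, other_drugs]), sum(tab[other_aes, other_drugs])),
         nrow = 2,
         dimnames = list(c(ae, "other AEs"), c(drug, "other drugs")))
}

# Cell (AE1, DrugA) has a count of 3, so it would go to Fisher's exact test
t_excl <- two_by_two(tab, "AE1", "DrugA", exclude = "DrugB")  # same class excluded
t_all  <- two_by_two(tab, "AE1", "DrugA")                     # all other drugs kept

# One-sided test for over-reporting (the alternative is an assumption here)
p_excl <- fisher.test(t_excl, alternative = "greater")$p.value
p_all  <- fisher.test(t_all,  alternative = "greater")$p.value
```

Excluding same-class drugs keeps the comparator group from being dominated by drugs that are likely to share the same adverse event profile.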

col_specific_cutoff

Logical. In the second step of the algorithm, whether to apply the MC method to the standardized Pearson residuals within each drug or vaccine column or across the entire table. Default is TRUE, which applies it within each drug or vaccine column (column-specific cutoff). FALSE applies the MC method to the residuals of the entire table.

separate

Logical. In the second step of the algorithm, whether to apply the MC method separately to the standardized Pearson residuals of the zero cells and of the non-zero cells (TRUE), or to all residuals together (FALSE). Default is TRUE.

if_col_cor

Logical. In the third step of the algorithm, whether to use column (drug or vaccine) correlation or row (adverse event) correlation. Default is FALSE, which uses the adverse event correlation. TRUE uses the drug or vaccine correlation.

cor_lim

A numeric value in the interval (0, 1). The correlation threshold used in the third step to select “connected” adverse events. Default is 0.8.
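As a rough illustration of how if_col_cor and cor_lim interact, the sketch below computes row and column correlations on a simulated residual matrix and selects the adverse events "connected" to a given one. It assumes that connectedness is judged on the absolute Pearson correlation; the residual matrix and all variable names are hypothetical.

```r
set.seed(42)

# Hypothetical residual matrix: 4 adverse events (rows) x 6 drugs (columns)
resid <- matrix(rnorm(4 * 6), nrow = 4,
                dimnames = list(paste0("AE", 1:4), paste0("Drug", 1:6)))

# Make AE2 nearly a copy of AE1 so the pair is strongly correlated
resid["AE2", ] <- resid["AE1", ] + rnorm(6, sd = 0.05)

cor_lim <- 0.8

# if_col_cor = FALSE: correlations between adverse events (rows)
row_cor <- cor(t(resid))
diag(row_cor) <- NA                 # an AE is not counted as connected to itself

# if_col_cor = TRUE: correlations between drugs or vaccines (columns)
col_cor <- cor(resid)

# Adverse events connected to AE1 (absolute correlation is an assumption)
connected_to_AE1 <- names(which(abs(row_cor["AE1", ]) >= cor_lim))
```

Raising cor_lim shrinks the set of connected adverse events, so fewer neighbours contribute to the later steps.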

num_cores

Number of cores used to parallelize the MDDC MC algorithm. Default is 2.

seed

An optional integer to set the seed for reproducibility. If NULL, no seed is set.

Value

A list with the following components:

  • mc_pval returns the p values for each cell in the second step. For cells with a count greater than five, the p values are obtained via MC method. For cells with a count less than or equal to five, the p values are obtained via Fisher's exact tests.

  • mc_signal returns the signals identified in the second step by the MC method among cells with a count greater than five. 1 indicates a signal, 0 a non-signal.

  • fisher_signal returns the signals identified in the second step by Fisher's exact tests among cells with a count less than or equal to five. 1 indicates a signal, 0 a non-signal.

  • corr_signal_pval returns the p values for each cell in the contingency table in the fifth step, when the \(r_{ij}\) values are mapped back to the standard normal distribution.

  • corr_signal_adj_pval returns the Benjamini-Hochberg adjusted p values for each cell in the fifth step. The user decides whether to use corr_signal_pval or corr_signal_adj_pval, and which p value threshold to apply (for example, 0.05). Please see the example below.
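The choice between raw and adjusted p values can be illustrated with base R's p.adjust(). The sketch below uses a hypothetical p value matrix standing in for corr_signal_pval and assumes the adjustment is taken across all cells of the table; it is not the package's internal code.

```r
# Hypothetical p values for a 3 x 2 table (stand-in for corr_signal_pval)
pval <- matrix(c(0.001, 0.20, 0.03, 0.04, 0.60, 0.012), nrow = 3,
               dimnames = list(c("AE1", "AE2", "AE3"), c("DrugA", "DrugB")))

# Benjamini-Hochberg adjustment across all cells (assumed scope),
# reshaped back into the layout of the original table
adj_pval <- matrix(p.adjust(pval, method = "BH"), nrow = nrow(pval),
                   dimnames = dimnames(pval))

# Threshold both versions at 0.05: 1 marks a signal, 0 a non-signal
signal_raw <- (pval < 0.05) * 1
signal_adj <- (adj_pval < 0.05) * 1

sum(signal_raw)  # 4 cells pass on the raw p values
sum(signal_adj)  # 2 cells pass after adjustment
```

The adjusted p values control the false discovery rate across cells, so they flag fewer signals than the raw p values at the same threshold.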

Examples

# using statin49 data set as an example
data(statin49)

# apply mddc_mc
mc_res <- mddc_mc(statin49)

# signals identified in step 2 using MC method
signal_step2 <- mc_res$mc_signal

# signals identified in step 5 by considering AE correlations
# In this example, cells with p values less than 0.05 are
# identified as signals
signal_step5 <- (mc_res$corr_signal_pval < 0.05) * 1