This function performs a grid search to determine the optimal adaptive boxplot coefficient `coef` for each column of a contingency table, such that the estimated false discovery rate (FDR) does not exceed the target level.

Usage

find_optimal_coef(
  contin_table,
  n_sim = 1000,
  target_fdr = 0.05,
  grid = 0.1,
  col_specific_cutoff = TRUE,
  exclude_small_count = TRUE
)

Arguments

contin_table

A matrix representing the \(I \times J\) contingency table.

n_sim

An integer specifying the number of simulated tables under the assumption of independence between rows and columns. Default is 1000.

target_fdr

A numeric value specifying the desired level of false discovery rate (FDR). Default is 0.05.

grid

A numeric value giving the step size of the grid search. Candidate coefficients are obtained by adding multiples of this step to the default value of coef = 1.5 suggested by Tukey. Default is 0.1.

col_specific_cutoff

Logical. If TRUE, a separate coefficient is returned for each column; if FALSE, a single coefficient is returned for the entire dataset. Default is TRUE.

exclude_small_count

A logical indicating whether to exclude cells with counts smaller than or equal to five when computing boxplot statistics. Default is TRUE.
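
The arguments above can be read as the knobs of a simple search loop. The following base R code is only a sketch of that idea and not the package's implementation. It makes several assumptions that the text above does not confirm: null tables are drawn with `r2dtable()` (fixed row and column totals), cells are scored by Pearson residuals, only the upper fence is used, and the FDR under independence is estimated as the share of simulated tables with at least one flagged cell in a column. The names `sketch_find_coef`, `pearson_resid`, `fence_stats`, `max_coef` and the toy table are hypothetical, and the `exclude_small_count` handling is omitted.

```r
set.seed(1)

# Toy 20 x 3 contingency table with made-up counts (illustration only).
toy <- matrix(rpois(60, lambda = 30), nrow = 20, ncol = 3)

# Pearson residuals of a table under row/column independence.
pearson_resid <- function(tab) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  (tab - expected) / sqrt(expected)
}

# Third quartile, interquartile range, and maximum of a numeric vector.
fence_stats <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  c(q[2], q[2] - q[1], max(x))
}

# Hypothetical sketch: for each column, walk up the grid 1.5, 1.5 + grid, ...
# and keep the first coef whose simulated FDR is at or below target_fdr.
sketch_find_coef <- function(tab, n_sim = 200, target_fdr = 0.05,
                             grid = 0.1, max_coef = 10) {
  # Simulated tables under independence (assumption: margins held fixed).
  null_resid <- lapply(r2dtable(n_sim, rowSums(tab), colSums(tab)),
                       pearson_resid)
  candidates <- seq(1.5, max_coef, by = grid)
  coef_out <- rep(NA_real_, ncol(tab))
  fdr_out <- rep(NA_real_, ncol(tab))
  for (j in seq_len(ncol(tab))) {
    # 3 x n_sim matrix: rows are Q3, IQR, and maximum residual per null table.
    s <- vapply(null_resid, function(r) fence_stats(r[, j]), numeric(3))
    for (coef in candidates) {
      # Under independence every flagged cell is a false discovery, so the
      # FDR equals the probability of flagging at least one cell.
      fdr <- mean(s[3, ] > s[1, ] + coef * s[2, ])
      if (fdr <= target_fdr) {
        coef_out[j] <- coef
        fdr_out[j] <- fdr
        break
      }
    }
  }
  list(coef = coef_out, FDR = fdr_out)
}

res <- sketch_find_coef(toy)
res$coef
res$FDR
```

In this sketch a smaller `grid` gives a finer search at the cost of more candidates, and a larger `n_sim` gives a less noisy FDR estimate. A column is left as NA if no candidate up to `max_coef` reaches the target.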

Value

A list with the following components:

coef: A numeric vector containing the optimal coefficient `coef` for each column of the input contingency table.

FDR: A numeric vector with the corresponding false discovery rate (FDR) for each column.

Examples

# \donttest{
# This example uses the statin49 data
data(statin49)
find_optimal_coef(statin49)
#> $coef
#> [1] 2.6 3.4 2.7 2.8 2.6 2.6 1.9
#> 
#> $FDR
#> [1] 0.043 0.050 0.048 0.049 0.042 0.049 0.045
#> 
# }
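
To make the role of the returned coefficients more concrete, the sketch below rebuilds the result printed above as a plain list and shows how a larger `coef` widens the boxplot fences. It uses `boxplot.stats()` from base R for illustration. Whether the package relies on that function internally is an assumption, and the vector `x` is made-up data.

```r
# Values copied from the example output above.
res <- list(
  coef = c(2.6, 3.4, 2.7, 2.8, 2.6, 2.6, 1.9),
  FDR  = c(0.043, 0.050, 0.048, 0.049, 0.042, 0.049, 0.045)
)

# Every column meets the default target_fdr of 0.05.
meets_target <- all(res$FDR <= 0.05)

# Made-up scores for one column (illustration only).
x <- c(-1.2, -0.5, 0, 0.3, 0.8, 1.1, 3.5)

# Tukey's default coef = 1.5 flags the largest value as an outlier ...
out_default <- boxplot.stats(x, coef = 1.5)$out

# ... while the first column's coefficient (2.6) widens the fences
# enough that nothing is flagged.
out_adapted <- boxplot.stats(x, coef = res$coef[1])$out
```

Here `out_default` contains 3.5 and `out_adapted` is empty, which is the intended effect of raising `coef`: fewer cells are flagged, so fewer false discoveries occur under independence.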