Batch Testing Methods¶

Batch testing methods extend classical multiple testing procedures to the online setting, where hypotheses arrive in batches over time and must be tested sequentially while maintaining overall FDR control across all batches.

Overview¶

The batching framework, developed by Zrnic et al. (2020), addresses scenarios where:

- Tests naturally arrive in groups (batches)
- Each batch can be processed using classical procedures
- Overall FDR control is required across all batches
- Adaptive alpha allocation improves power over time

Available Methods¶

Benjamini-Hochberg Batch Testing¶

`online_fdr.p_values.BatchBH` ¶

Bases: AbstractBatchingTest

Benjamini-Hochberg procedure for online batch FDR control.

BatchBH extends the classical Benjamini-Hochberg (BH) procedure to the online batching setting, where hypotheses arrive in batches over time and must be tested sequentially while maintaining overall FDR control across all batches.

This implements Algorithm 1 from "The Power of Batching in Multiple Hypothesis Testing" by Zrnic, Jiang, Ramdas, and Jordan (2020). The key innovation is the calculation of adaptive alpha levels that account for the interdependence between batches while preserving the BH optimality within each batch.

The algorithm maintains FDR control by:

1. Allocating alpha budget using a gamma sequence
2. Adjusting for dependencies between batches via β_t correction
3. Computing R^+ (maximum possible rejections) for power optimization
4. Applying standard BH procedure within each batch
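
The within-batch BH step can be illustrated with a minimal, self-contained sketch. The helper name `bh_step_up` is hypothetical; it only stands in for the library's internal `bh` function, which according to the source listing returns a rejection count and a threshold. The example runs at the full level 0.05 for readability, whereas BatchBH first scales the level through the gamma sequence.

```python
from typing import Sequence


def bh_step_up(p_vals: Sequence[float], alpha: float) -> tuple[int, float]:
    """Return (number of rejections, rejection threshold) for BH at level alpha."""
    n = len(p_vals)
    num_reject = 0
    # Find the largest rank k whose ordered p-value satisfies p_(k) <= k * alpha / n.
    for k, p in enumerate(sorted(p_vals), start=1):
        if p <= k * alpha / n:
            num_reject = k
    # Every p-value at or below this threshold is rejected.
    threshold = num_reject * alpha / n
    return num_reject, threshold


batch = [0.001, 0.02, 0.15, 0.8]
num_reject, threshold = bh_step_up(batch, alpha=0.05)
decisions = [p <= threshold for p in batch]
print(num_reject, decisions)
```

With these four p-values the two smallest fall below the step-up threshold of 0.025, so two hypotheses are rejected.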

Parameters:

Name	Type	Description	Default
`alpha`	`float`	Target FDR level (e.g., 0.05 for 5% FDR). Must be in (0, 1).	required

Attributes:

Name	Type	Description
`alpha0`		Original target FDR level.
`num_test`	`int`	Number of batches tested so far.
`seq`		Gamma sequence for alpha allocation across batches.
`r_s`	`list[int]`	Number of rejections in each batch.
`r_s_plus`	`list[int]`	Maximum possible rejections for each batch (R^+ values).
`alpha_s`	`list[float]`	Alpha level used for each batch.
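
The role of `seq` can be sketched as follows. The constants `gamma_exp=1.6` and `c=0.4374901658` are taken from the constructor in the source listing; the power-law form `c / j**gamma_exp` is an assumption about what `calc_gamma` computes and is not confirmed by this page.

```python
def calc_gamma(j: int, gamma_exp: float = 1.6, c: float = 0.4374901658) -> float:
    """Hypothetical gamma sequence: gamma_j = c / j**gamma_exp (assumed form)."""
    return c / j**gamma_exp


alpha = 0.05
# Level available to the very first batch: alpha * gamma_1.
first_batch_level = alpha * calc_gamma(1)
# Under the assumed form, the weights sum to just under 1 over many batches.
partial_sum = sum(calc_gamma(j) for j in range(1, 100_001))
print(first_batch_level, partial_sum)
```

If the assumed form holds, the constant `c` appears to normalise the sequence so that the total alpha budget spent across all batches never exceeds the target level.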

Examples:

>>> # Basic batch testing
>>> bh = BatchBH(alpha=0.05)
>>> batch1 = [0.001, 0.02, 0.15, 0.8]
>>> decisions1 = bh.test_batch(batch1)
>>> print(f"Batch 1 discoveries: {sum(decisions1)}")

>>> # Sequential batches with adaptive alpha
>>> batch2 = [0.03, 0.9, 0.006, 0.4]
>>> decisions2 = bh.test_batch(batch2)  # Alpha adjusted based on batch1
>>> print(f"Batch 2 discoveries: {sum(decisions2)}")

>>> # Multiple batches
>>> batches = [[0.001, 0.8], [0.02, 0.3], [0.005, 0.9]]
>>> all_decisions = []
>>> for i, batch in enumerate(batches):
...     decisions = bh.test_batch(batch)
...     all_decisions.append(decisions)
...     print(f"Batch {i+1}: {sum(decisions)} discoveries")

References

Zrnic, T., D. Jiang, A. Ramdas, and M. I. Jordan (2020). "The Power of Batching in Multiple Hypothesis Testing." Proceedings of the 37th International Conference on Machine Learning (ICML), PMLR, 119:11504-11515.

Benjamini, Y., and Y. Hochberg (1995). "Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing." Journal of the Royal Statistical Society: Series B, 57(1):289-300.

Source code in online_fdr/p_values/batching/bh.py

class BatchBH(AbstractBatchingTest):
    """Benjamini-Hochberg procedure for online batch FDR control.

    BatchBH extends the classical Benjamini-Hochberg (BH) procedure to the online
    batching setting, where hypotheses arrive in batches over time and must be
    tested sequentially while maintaining overall FDR control across all batches.

    This implements Algorithm 1 from "The Power of Batching in Multiple Hypothesis
    Testing" by Zrnic, Jiang, Ramdas, and Jordan (2020). The key innovation is
    the calculation of adaptive alpha levels that account for the interdependence
    between batches while preserving the BH optimality within each batch.

    The algorithm maintains FDR control by:
    1. Allocating alpha budget using a gamma sequence
    2. Adjusting for dependencies between batches via β_t correction
    3. Computing R^+ (maximum possible rejections) for power optimization
    4. Applying standard BH procedure within each batch

    Args:
        alpha: Target FDR level (e.g., 0.05 for 5% FDR). Must be in (0, 1).

    Attributes:
        alpha0: Original target FDR level.
        num_test: Number of batches tested so far.
        seq: Gamma sequence for alpha allocation across batches.
        r_s: Number of rejections in each batch.
        r_s_plus: Maximum possible rejections for each batch (R^+ values).
        alpha_s: Alpha level used for each batch.

    Examples:
        >>> # Basic batch testing
        >>> bh = BatchBH(alpha=0.05)
        >>> batch1 = [0.001, 0.02, 0.15, 0.8]
        >>> decisions1 = bh.test_batch(batch1)
        >>> print(f"Batch 1 discoveries: {sum(decisions1)}")

        >>> # Sequential batches with adaptive alpha
        >>> batch2 = [0.03, 0.9, 0.006, 0.4]
        >>> decisions2 = bh.test_batch(batch2)  # Alpha adjusted based on batch1
        >>> print(f"Batch 2 discoveries: {sum(decisions2)}")

        >>> # Multiple batches
        >>> batches = [[0.001, 0.8], [0.02, 0.3], [0.005, 0.9]]
        >>> all_decisions = []
        >>> for i, batch in enumerate(batches):
        ...     decisions = bh.test_batch(batch)
        ...     all_decisions.append(decisions)
        ...     print(f"Batch {i+1}: {sum(decisions)} discoveries")

    References:
        Zrnic, T., D. Jiang, A. Ramdas, and M. I. Jordan (2020). "The Power of
        Batching in Multiple Hypothesis Testing." Proceedings of the 37th
        International Conference on Machine Learning (ICML), PMLR, 119:11504-11515.

        Benjamini, Y., and Y. Hochberg (1995). "Controlling the False Discovery Rate:
        A Practical and Powerful Approach to Multiple Testing." Journal of the Royal
        Statistical Society: Series B, 57(1):289-300.
    """

    def __init__(self, alpha: float):
        """Initialize BatchBH with FDR control level alpha.

        Args:
            alpha: Target FDR control level. Must be in (0, 1).

        Raises:
            ValueError: If alpha is not in (0, 1).
        """
        super().__init__(alpha)
        self.alpha0 = alpha
        self.seq = DefaultSaffronGammaSequence(gamma_exp=1.6, c=0.4374901658)
        self.r_s_plus: list[int] = []  # R^+ values for each batch
        self.r_s: list[int] = []  # R values (number of rejections) for each batch
        self.alpha_s: list[float] = []  # Alpha values used for each batch

    def test_batch(self, p_vals: Sequence[float]) -> list[bool]:
        """Test a batch of p-values using the BatchBH procedure.

        Args:
            p_vals: List of p-values for the current batch

        Returns:
            List of boolean values indicating which hypotheses are rejected
        """
        p_vals_local = list(p_vals)
        n_batch = len(p_vals_local)
        if n_batch == 0:
            return []
        validity.check_p_vals_batch(p_vals_local)
        t = self.num_batches  # Current batch index (0-based)

        if t == 0:
            # First batch: α₁ = γ₁α
            alpha_t = self.alpha0 * self.seq.calc_gamma(j=1)
        else:
            # Calculate β_t
            beta_t = 0.0
            total_rejections_except_s = sum(self.r_s)  # Total rejections so far

            for s in range(t):
                # For each previous batch s, calculate its contribution to β_t
                # Denominator is R^+_s + sum of all other rejections up to t-1
                rejections_from_other_batches = total_rejections_except_s - self.r_s[s]
                denominator = self.r_s_plus[s] + rejections_from_other_batches
                if denominator > 0:
                    beta_t += self.alpha_s[s] * self.r_s_plus[s] / denominator

            # Calculate α_t = (Σ_{s≤t} γ_s α - β_t) × (n_t + Σ_{s<t} R_s) / n_t
            gamma_sum = sum(self.seq.calc_gamma(j=i + 1) for i in range(t + 1))
            numerator = gamma_sum * self.alpha0 - beta_t
            total_prev_rejections = sum(self.r_s)
            alpha_t = numerator * (n_batch + total_prev_rejections) / n_batch

            # Ensure alpha_t is non-negative
            alpha_t = max(0, alpha_t)

        # Run BH on current batch
        num_reject, threshold = bh(p_vals_local, alpha_t)

        # Calculate R^+_t (maximum rejections if one p-value is set to 0)
        r_plus = num_reject  # Start with current rejections
        adjusted = list(p_vals_local)
        for i in range(len(adjusted)):
            # Temporarily set p-value to 0
            original_p = adjusted[i]
            adjusted[i] = 0.0
            temp_reject, _ = bh(adjusted, alpha_t)
            r_plus = max(r_plus, temp_reject)
            # Restore original p-value
            adjusted[i] = original_p

        # Store results
        self.r_s.append(num_reject)
        self.r_s_plus.append(r_plus)
        self.alpha_s.append(alpha_t)
        self._set_test_level(alpha_t, rejection_threshold=threshold)
        self._advance_batch(n_batch)

        # Return rejection decisions
        return [p_val <= threshold for p_val in p_vals_local]
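
The alpha update inside `test_batch` can be isolated as a pure function to make the arithmetic easier to follow. This is a sketch mirroring the listing above, not part of the library: the name `next_alpha` and the round-number gamma weights in the example are hypothetical.

```python
def next_alpha(alpha0, gammas, alpha_s, r_s, r_s_plus, n_batch):
    """alpha_t = (alpha0 * sum_{s<=t} gamma_s - beta_t) * (n_t + sum_{s<t} R_s) / n_t."""
    total = sum(r_s)  # rejections made in all previous batches
    beta_t = 0.0
    for a_s, r, r_plus in zip(alpha_s, r_s, r_s_plus):
        # Denominator: R^+_s plus the rejections made in every other batch.
        denom = r_plus + (total - r)
        if denom > 0:
            beta_t += a_s * r_plus / denom
    # Budget released so far (including the current batch) minus what was spent.
    budget = alpha0 * sum(gammas[: len(r_s) + 1]) - beta_t
    return max(0.0, budget * (n_batch + total) / n_batch)


# One earlier batch: tested at level 0.02, 2 rejections, R^+ = 3.
alpha_2 = next_alpha(
    alpha0=0.05,
    gammas=[0.4, 0.2],  # hypothetical weights, chosen for easy arithmetic
    alpha_s=[0.02],
    r_s=[2],
    r_s_plus=[3],
    n_batch=4,
)
print(alpha_2)
```

In this example β_t is 0.02, the remaining budget is 0.05 × 0.6 − 0.02 = 0.01, and the two earlier rejections inflate it by (4 + 2) / 4, giving a level of 0.015 for the second batch.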

Functions¶

`test_batch(p_vals)` ¶

Test a batch of p-values using the BatchBH procedure.

Parameters:

Name	Type	Description	Default
`p_vals`	`Sequence[float]`	List of p-values for the current batch	required

Returns:

Type	Description
`list[bool]`	List of boolean values indicating which hypotheses are rejected

Source code in online_fdr/p_values/batching/bh.py

def test_batch(self, p_vals: Sequence[float]) -> list[bool]:
    """Test a batch of p-values using the BatchBH procedure.

    Args:
        p_vals: List of p-values for the current batch

    Returns:
        List of boolean values indicating which hypotheses are rejected
    """
    p_vals_local = list(p_vals)
    n_batch = len(p_vals_local)
    if n_batch == 0:
        return []
    validity.check_p_vals_batch(p_vals_local)
    t = self.num_batches  # Current batch index (0-based)

    if t == 0:
        # First batch: α₁ = γ₁α
        alpha_t = self.alpha0 * self.seq.calc_gamma(j=1)
    else:
        # Calculate β_t
        beta_t = 0.0
        total_rejections_except_s = sum(self.r_s)  # Total rejections so far

        for s in range(t):
            # For each previous batch s, calculate its contribution to β_t
            # Denominator is R^+_s + sum of all other rejections up to t-1
            rejections_from_other_batches = total_rejections_except_s - self.r_s[s]
            denominator = self.r_s_plus[s] + rejections_from_other_batches
            if denominator > 0:
                beta_t += self.alpha_s[s] * self.r_s_plus[s] / denominator

        # Calculate α_t = (Σ_{s≤t} γ_s α - β_t) × (n_t + Σ_{s<t} R_s) / n_t
        gamma_sum = sum(self.seq.calc_gamma(j=i + 1) for i in range(t + 1))
        numerator = gamma_sum * self.alpha0 - beta_t
        total_prev_rejections = sum(self.r_s)
        alpha_t = numerator * (n_batch + total_prev_rejections) / n_batch

        # Ensure alpha_t is non-negative
        alpha_t = max(0, alpha_t)

    # Run BH on current batch
    num_reject, threshold = bh(p_vals_local, alpha_t)

    # Calculate R^+_t (maximum rejections if one p-value is set to 0)
    r_plus = num_reject  # Start with current rejections
    adjusted = list(p_vals_local)
    for i in range(len(adjusted)):
        # Temporarily set p-value to 0
        original_p = adjusted[i]
        adjusted[i] = 0.0
        temp_reject, _ = bh(adjusted, alpha_t)
        r_plus = max(r_plus, temp_reject)
        # Restore original p-value
        adjusted[i] = original_p

    # Store results
    self.r_s.append(num_reject)
    self.r_s_plus.append(r_plus)
    self.alpha_s.append(alpha_t)
    self._set_test_level(alpha_t, rejection_threshold=threshold)
    self._advance_batch(n_batch)

    # Return rejection decisions
    return [p_val <= threshold for p_val in p_vals_local]
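
The R^+ step is the least obvious part of the method, so a standalone sketch may help. It follows the same idea as the listing (set one p-value to 0 at a time and keep the largest BH rejection count), but the helper names `bh_count` and `r_plus` are hypothetical and the BH routine is a simplified stand-in for the library's `bh`.

```python
def bh_count(p_vals, alpha):
    """Number of BH rejections: largest k with p_(k) <= k * alpha / n."""
    n = len(p_vals)
    ranks = (k for k, p in enumerate(sorted(p_vals), start=1) if p <= k * alpha / n)
    return max(ranks, default=0)


def r_plus(p_vals, alpha):
    """Largest rejection count obtainable by setting a single p-value to 0."""
    best = bh_count(p_vals, alpha)
    for i in range(len(p_vals)):
        trial = list(p_vals)
        trial[i] = 0.0  # replace one coordinate, leave the rest untouched
        best = max(best, bh_count(trial, alpha))
    return best


batch = [0.02, 0.03, 0.9, 0.4]
print(bh_count(batch, 0.05), r_plus(batch, 0.05))
```

Here BH rejects nothing on the original batch, yet zeroing one of the large p-values lets the step-up rule reach three rejections. This gap between R and R^+ is what the β_t correction accounts for.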

Storey-BH Batch Testing¶

`online_fdr.p_values.BatchStoreyBH` ¶

Bases: AbstractBatchingTest

Storey-BH batch procedure for online batch-level FDR control.

Source code in online_fdr/p_values/batching/storey_bh.py

class BatchStoreyBH(AbstractBatchingTest):
    """Storey-BH batch procedure for online batch-level FDR control."""

    def __init__(self, alpha: float, lambda_: float):
        super().__init__(alpha)
        self.alpha0: float = alpha
        self.lambda_: float = lambda_

        if not 0 < lambda_ < 1:
            raise ValueError("lambda_ must be between 0 and 1.")

        self.seq = DefaultSaffronGammaSequence(gamma_exp=1.6, c=0.4374901658)
        self.k_s: list[int] = []
        self.pi0_estimates: list[float] = []
        self.r_s_plus: list[int] = []
        self.r_sums: list[int] = []
        self.alpha_s: list[float] = []

    def test_batch(self, p_vals: Sequence[float]) -> list[bool]:
        p_vals_local = list(p_vals)
        n_batch = len(p_vals_local)
        if n_batch == 0:
            return []
        validity.check_p_vals_batch(p_vals_local)

        batch_number = self.num_batches + 1
        if batch_number == 1:
            alpha_batch = self.alpha0 * self.seq.calc_gamma(j=1)
        else:
            gamma_sum = self.alpha0 * sum(
                self.seq.calc_gamma(i) for i in range(1, batch_number + 1)
            )
            total_rejections = sum(self.r_sums)
            penalty = 0.0
            for idx in range(batch_number - 1):
                denom = self.r_s_plus[idx] + (total_rejections - self.r_sums[idx])
                if denom > 0:
                    penalty += (
                        self.k_s[idx] * self.alpha_s[idx] * (self.r_s_plus[idx] / denom)
                    )
            alpha_batch = (gamma_sum - penalty) * (
                (n_batch + total_rejections) / n_batch
            )

        batch_decisions, pi0_batch = self._storey_batch_decisions(
            p_vals_local, alpha_batch
        )
        num_reject = sum(batch_decisions)

        self.k_s.append(int(max(p_vals_local) > self.lambda_))
        self.pi0_estimates.append(pi0_batch)
        self.r_sums.append(num_reject)
        self.alpha_s.append(alpha_batch)
        r_plus = self._calculate_r_plus(p_vals_local, alpha_batch)
        self.r_s_plus.append(r_plus)

        threshold = max(
            (
                p_val
                for p_val, rejected in zip(p_vals_local, batch_decisions)
                if rejected
            ),
            default=0.0,
        )
        self._set_test_level(alpha_batch, rejection_threshold=threshold)
        self._advance_batch(n_batch)
        return batch_decisions

    def _calculate_r_plus(
        self, p_vals: list[float], alpha_batch: float | None = None
    ) -> int:
        """Compute R+ via one-coordinate replacement p_i <- 0 on same batch size."""
        if not p_vals:
            return 0
        if alpha_batch is None:
            if self.last_test_level is None:
                raise AssertionError(
                    "BatchStoreyBH alpha threshold was not initialized."
                )
            alpha_batch = float(self.last_test_level)

        r_plus = 0
        n = len(p_vals)
        for i in range(n):
            pseudo_pvals = p_vals[:i] + p_vals[i + 1 :] + [0.0]
            pseudo_rejections, _ = self._storey_batch_decisions(
                pseudo_pvals, alpha_batch
            )
            r_plus = max(r_plus, sum(pseudo_rejections))
        return r_plus

    def _storey_batch_decisions(
        self, p_vals: list[float], alpha_batch: float
    ) -> tuple[list[bool], float]:
        """Replicate onlineFDR's BatchStBH inner Storey-BH rejection rule."""
        n_batch = len(p_vals)
        if n_batch == 0:
            return [], 0.0

        num_above_lambda = sum(1 for p in p_vals if p > self.lambda_)
        pi0_batch = (num_above_lambda + 1.0) / ((1.0 - self.lambda_) * n_batch)

        order_desc = sorted(range(n_batch), key=p_vals.__getitem__, reverse=True)
        inv_order = [0] * n_batch
        for rank, idx in enumerate(order_desc):
            inv_order[idx] = rank

        adjusted_desc: list[float] = []
        running_min = 1.0
        for rank, idx in enumerate(order_desc):
            j_val = n_batch - rank
            candidate = (n_batch / j_val) * pi0_batch * p_vals[idx]
            running_min = min(running_min, candidate)
            adjusted_desc.append(min(1.0, running_min))

        return (
            [adjusted_desc[inv_order[i]] <= alpha_batch for i in range(n_batch)],
            pi0_batch,
        )
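
The Storey adjustment in `_storey_batch_decisions` can be summarised with a smaller sketch. It uses the same null-proportion estimate as the listing, (1 + #{p > λ}) / ((1 − λ) n), but counts rejections directly instead of building adjusted p-values; the two should agree for levels below 1. The helper names are hypothetical.

```python
def storey_pi0(p_vals, lambda_):
    """Storey estimate of the null proportion within one batch."""
    n = len(p_vals)
    return (1 + sum(p > lambda_ for p in p_vals)) / ((1 - lambda_) * n)


def step_up_count(p_vals, alpha, pi0=1.0):
    """Largest k with pi0 * n * p_(k) / k <= alpha (plain BH when pi0 == 1)."""
    n = len(p_vals)
    ranks = (
        k for k, p in enumerate(sorted(p_vals), start=1) if pi0 * n * p / k <= alpha
    )
    return max(ranks, default=0)


batch = [0.001, 0.005, 0.01, 0.03, 0.04, 0.2, 0.4, 0.9]
pi0 = storey_pi0(batch, lambda_=0.5)
plain = step_up_count(batch, alpha=0.05)
adaptive = step_up_count(batch, alpha=0.05, pi0=pi0)
print(pi0, plain, adaptive)
```

Only one p-value exceeds λ = 0.5, so the estimated null proportion is 0.5 and the adaptive rule rejects five hypotheses where plain BH rejects three.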

Benjamini-Yekutieli Batch Testing¶

`online_fdr.p_values.BatchBY` ¶

Bases: AbstractBatchingTest

Benjamini-Yekutieli extension for online batch testing.

BatchBY extends the online batching framework to use the Benjamini-Yekutieli (BY) procedure within each batch. This is a conservative extension for settings where the within-batch independence assumption may be violated, such as spatial statistics, time series analysis, or genomics with linkage disequilibrium.

The BY procedure is a modification of the Benjamini-Hochberg (BH) procedure that uses harmonic weights for arbitrary dependence in a fixed batch. While this comes at the cost of reduced power compared to BH, it provides a more conservative within-batch correction.

The algorithm follows the online batching framework, allocating alpha budget across batches using a gamma sequence and adjusting for inter-batch accounting through the beta_t correction mechanism. It is not a direct onlineFDR parity method or a separately published BatchBY author implementation.
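
The harmonic correction can be sketched in a few lines. This is a generic BY step-up written for illustration under the usual definition (BH thresholds divided by the harmonic number of the batch size); the name `by_step_up` is hypothetical and the library's own `by` helper may differ in detail.

```python
def by_step_up(p_vals, alpha):
    """Return (number of rejections, threshold) for BY at level alpha."""
    n = len(p_vals)
    # Harmonic number H_n = 1 + 1/2 + ... + 1/n shrinks every BH threshold.
    harmonic = sum(1.0 / i for i in range(1, n + 1))
    num_reject = 0
    for k, p in enumerate(sorted(p_vals), start=1):
        if p <= k * alpha / (n * harmonic):
            num_reject = k
    return num_reject, num_reject * alpha / (n * harmonic)


batch = [0.001, 0.02, 0.15, 0.8]
num_reject, threshold = by_step_up(batch, alpha=0.05)
print(num_reject, threshold)
```

For this batch BY rejects only the smallest p-value (threshold 0.006), whereas plain BH at the same level would also reject 0.02. That is the power cost described above.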

Parameters:

Name	Type	Description	Default
`alpha`	`float`	Target FDR level (e.g., 0.05 for 5% FDR). Must be in (0, 1).	required

Attributes:

Name	Type	Description
`alpha0`	`float`	Original target FDR level.
`num_test`	`int`	Number of batches tested so far.
`seq`		Gamma sequence for alpha allocation across batches.
`r_s_plus`	`list[int]`	Maximum possible rejections for each batch.
`r_s`	`list[int]`	Rejection indicators for each batch.
`r_total`	`int`	Total number of rejections across all batches.
`r_sums`	`list[int]`	Cumulative rejection counts for dependency tracking.
`alpha_s`	`list[float]`	Alpha levels used for each batch.

Examples:

>>> # Basic usage for dependent p-values
>>> by_test = BatchBY(alpha=0.05)
>>> # Test correlated p-values (e.g., from spatial data)
>>> batch1 = [0.001, 0.002, 0.15, 0.8]  # May be dependent
>>> decisions1 = by_test.test_batch(batch1)
>>> print(f"BY discoveries in batch 1: {sum(decisions1)}")

>>> # Sequential dependent batches
>>> batch2 = [0.03, 0.04, 0.006, 0.4]   # Also potentially dependent
>>> decisions2 = by_test.test_batch(batch2)
>>> print(f"BY discoveries in batch 2: {sum(decisions2)}")

>>> # Comparing with standard BH under dependence
>>> from online_fdr.p_values.batching import BatchBH
>>> bh_test = BatchBH(alpha=0.05)
>>> by_test = BatchBY(alpha=0.05)
>>> # BY is more conservative than BH under within-batch dependence

Notes

The BY procedure is particularly recommended when:

- P-values exhibit positive dependence
- Spatial or temporal correlation is present
- A conservative BY-style within-batch correction is required
- The exact dependence structure is unknown

Trade-off: Enhanced robustness comes at the cost of reduced power compared to the standard Benjamini-Hochberg procedure.

References

Benjamini, Y., and D. Yekutieli (2001). "The control of the false discovery rate in multiple testing under dependency." Annals of Statistics, 29(4):1165-1188.

Zrnic, T., D. Jiang, A. Ramdas, and M. I. Jordan (2020). "The Power of Batching in Multiple Hypothesis Testing." Proceedings of the 37th International Conference on Machine Learning (ICML), PMLR, 119:11504-11515.

Source code in online_fdr/p_values/batching/by.py

class BatchBY(AbstractBatchingTest):
    """Benjamini-Yekutieli extension for online batch testing.

    BatchBY extends the online batching framework to use the Benjamini-Yekutieli (BY)
    procedure within each batch. This is a conservative extension for settings where
    the within-batch independence assumption may be violated, such as spatial
    statistics, time series analysis, or genomics with linkage disequilibrium.

    The BY procedure is a modification of the Benjamini-Hochberg (BH) procedure
    that uses harmonic weights for arbitrary dependence in a fixed batch. While
    this comes at the cost of reduced power compared to BH, it provides a more
    conservative within-batch correction.

    The algorithm follows the online batching framework, allocating alpha budget
    across batches using a gamma sequence and adjusting for inter-batch accounting
    through the beta_t correction mechanism. It is not a direct onlineFDR parity
    method or a separately published BatchBY author implementation.

    Args:
        alpha: Target FDR level (e.g., 0.05 for 5% FDR). Must be in (0, 1).

    Attributes:
        alpha0: Original target FDR level.
        num_test: Number of batches tested so far.
        seq: Gamma sequence for alpha allocation across batches.
        r_s_plus: Maximum possible rejections for each batch.
        r_s: Rejection indicators for each batch.
        r_total: Total number of rejections across all batches.
        r_sums: Cumulative rejection counts for dependency tracking.
        alpha_s: Alpha levels used for each batch.

    Examples:
        >>> # Basic usage for dependent p-values
        >>> by_test = BatchBY(alpha=0.05)
        >>> # Test correlated p-values (e.g., from spatial data)
        >>> batch1 = [0.001, 0.002, 0.15, 0.8]  # May be dependent
        >>> decisions1 = by_test.test_batch(batch1)
        >>> print(f"BY discoveries in batch 1: {sum(decisions1)}")

        >>> # Sequential dependent batches
        >>> batch2 = [0.03, 0.04, 0.006, 0.4]   # Also potentially dependent
        >>> decisions2 = by_test.test_batch(batch2)
        >>> print(f"BY discoveries in batch 2: {sum(decisions2)}")

        >>> # Comparing with standard BH under dependence
        >>> from online_fdr.p_values.batching import BatchBH
        >>> bh_test = BatchBH(alpha=0.05)
        >>> by_test = BatchBY(alpha=0.05)
        >>> # BY is more conservative than BH under within-batch dependence

    Notes:
        The BY procedure is particularly recommended when:
        - P-values exhibit positive dependence
        - Spatial or temporal correlation is present
        - A conservative BY-style within-batch correction is required
        - The exact dependence structure is unknown

        Trade-off: Enhanced robustness comes at the cost of reduced power
        compared to the standard Benjamini-Hochberg procedure.

    References:
        Benjamini, Y., and D. Yekutieli (2001). "The control of the false discovery
        rate in multiple testing under dependency." Annals of Statistics, 29(4):1165-1188.

        Zrnic, T., D. Jiang, A. Ramdas, and M. I. Jordan (2020). "The Power of
        Batching in Multiple Hypothesis Testing." Proceedings of the 37th
        International Conference on Machine Learning (ICML), PMLR, 119:11504-11515.
    """

    def __init__(self, alpha: float):
        """Initialize BatchBY with FDR control level.

        Args:
            alpha: Target FDR control level. Must be in (0, 1).

        Raises:
            ValueError: If alpha is not in (0, 1).
        """
        super().__init__(alpha)
        self.alpha0: float = alpha

        self.seq = DefaultSaffronGammaSequence(gamma_exp=1.6, c=0.4374901658)
        self.r_s_plus: list[int] = []
        self.r_s: list[int] = []
        self.r_total: int = 0
        self.r_sums: list[int] = [0]
        self.alpha_s: list[float] = []

    def test_batch(self, p_vals: Sequence[float]) -> list[bool]:
        """Test a batch of p-values using the Benjamini-Yekutieli procedure.

        The BY procedure uses harmonic weights in the rejection threshold calculation.
        This method adapts the static BY procedure to the online batching framework
        as a conservative extension.

        The algorithm:
        1. Calculates adaptive alpha level for the current batch
        2. Applies the BY procedure with harmonic correction
        3. Updates statistics for future batch calculations
        4. Computes R+ for inter-batch accounting

        Args:
            p_vals: List of p-values for the current batch.

        Returns:
            List of boolean values indicating which hypotheses are rejected.

        Examples:
            >>> by_test = BatchBY(alpha=0.05)
            >>> # Test potentially dependent p-values
            >>> decisions = by_test.test_batch([0.001, 0.002, 0.15, 0.8])
            >>> print(f"Rejections with BY: {sum(decisions)}")

        Note:
            The BY procedure is more conservative than BH within a batch.
        """
        p_vals_local = list(p_vals)
        n_batch = len(p_vals_local)
        if n_batch == 0:
            return []
        validity.check_p_vals_batch(p_vals_local)
        batch_number = self.num_batches + 1
        if batch_number == 1:
            alpha_t = (
                self.alpha0  # fmt: skip
                * self.seq.calc_gamma(j=1)
            )
        else:
            alpha_t = (
                sum(self.seq.calc_gamma(i) for i in range(1, batch_number + 1))
                * self.alpha0  # fmt: skip
            )
            alpha_t -= sum(
                [
                    self.alpha_s[i]
                    * self.r_s_plus[i]
                    / (self.r_s_plus[i] + self.r_sums[i + 1])
                    for i in range(0, batch_number - 1)
                ]
            )
            alpha_t *= (n_batch + self.r_total) / n_batch

        num_reject, threshold = by(p_vals_local, alpha_t)

        self.r_sums.append(self.r_total)
        self.r_sums[1:batch_number] = \
            [x + num_reject for x in self.r_sums[1:batch_number]]  # fmt: skip
        self.r_total += num_reject
        self.alpha_s.append(alpha_t)

        r_plus = 0
        adjusted = list(p_vals_local)
        for i, p_val in enumerate(adjusted):
            adjusted[i] = 0.0
            r_plus = max(r_plus, by(adjusted, alpha_t)[0])
            adjusted[i] = p_val
        self.r_s_plus.append(r_plus)

        self._set_test_level(alpha_t, rejection_threshold=threshold)
        self._advance_batch(n_batch)
        return [p_val <= threshold for p_val in p_vals_local]

Functions¶

`test_batch(p_vals)` ¶

Test a batch of p-values using the Benjamini-Yekutieli procedure.

The BY procedure uses harmonic weights in the rejection threshold calculation. This method adapts the static BY procedure to the online batching framework as a conservative extension.

The algorithm:

1. Calculates adaptive alpha level for the current batch
2. Applies the BY procedure with harmonic correction
3. Updates statistics for future batch calculations
4. Computes R+ for inter-batch accounting

Parameters:

Name	Type	Description	Default
`p_vals`	`Sequence[float]`	List of p-values for the current batch.	required

Returns:

Type	Description
`list[bool]`	List of boolean values indicating which hypotheses are rejected.

Examples:

>>> by_test = BatchBY(alpha=0.05)
>>> # Test potentially dependent p-values
>>> decisions = by_test.test_batch([0.001, 0.002, 0.15, 0.8])
>>> print(f"Rejections with BY: {sum(decisions)}")

Note

The BY procedure is more conservative than BH within a batch.

Source code in online_fdr/p_values/batching/by.py

def test_batch(self, p_vals: Sequence[float]) -> list[bool]:
    """Test a batch of p-values using the Benjamini-Yekutieli procedure.

    The BY procedure uses harmonic weights in the rejection threshold calculation.
    This method adapts the static BY procedure to the online batching framework
    as a conservative extension.

    The algorithm:
    1. Calculates adaptive alpha level for the current batch
    2. Applies the BY procedure with harmonic correction
    3. Updates statistics for future batch calculations
    4. Computes R+ for inter-batch accounting

    Args:
        p_vals: List of p-values for the current batch.

    Returns:
        List of boolean values indicating which hypotheses are rejected.

    Examples:
        >>> by_test = BatchBY(alpha=0.05)
        >>> # Test potentially dependent p-values
        >>> decisions = by_test.test_batch([0.001, 0.002, 0.15, 0.8])
        >>> print(f"Rejections with BY: {sum(decisions)}")

    Note:
        The BY procedure is more conservative than BH within a batch.
    """
    p_vals_local = list(p_vals)
    n_batch = len(p_vals_local)
    if n_batch == 0:
        return []
    validity.check_p_vals_batch(p_vals_local)
    batch_number = self.num_batches + 1
    if batch_number == 1:
        alpha_t = (
            self.alpha0  # fmt: skip
            * self.seq.calc_gamma(j=1)
        )
    else:
        alpha_t = (
            sum(self.seq.calc_gamma(i) for i in range(1, batch_number + 1))
            * self.alpha0  # fmt: skip
        )
        alpha_t -= sum(
            [
                self.alpha_s[i]
                * self.r_s_plus[i]
                / (self.r_s_plus[i] + self.r_sums[i + 1])
                for i in range(0, batch_number - 1)
            ]
        )
        alpha_t *= (n_batch + self.r_total) / n_batch

    num_reject, threshold = by(p_vals_local, alpha_t)

    self.r_sums.append(self.r_total)
    self.r_sums[1:batch_number] = \
        [x + num_reject for x in self.r_sums[1:batch_number]]  # fmt: skip
    self.r_total += num_reject
    self.alpha_s.append(alpha_t)

    r_plus = 0
    adjusted = list(p_vals_local)
    for i, p_val in enumerate(adjusted):
        adjusted[i] = 0.0
        r_plus = max(r_plus, by(adjusted, alpha_t)[0])
        adjusted[i] = p_val
    self.r_s_plus.append(r_plus)

    self._set_test_level(alpha_t, rejection_threshold=threshold)
    self._advance_batch(n_batch)
    return [p_val <= threshold for p_val in p_vals_local]
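
The `r_sums` bookkeeping in this method is compact and easy to misread. The sketch below replays the same two update lines on made-up rejection counts to show that, after each batch, `r_sums[i + 1]` holds the rejections made in every batch other than batch `i`. The function name `update_r_sums` and the counts are hypothetical.

```python
def update_r_sums(r_sums, r_total, num_reject, batch_number):
    """Replay BatchBY's in-place update of r_sums; return the new running total."""
    # The new batch starts with everything rejected before it.
    r_sums.append(r_total)
    # Every earlier batch gains the rejections of the new batch.
    r_sums[1:batch_number] = [x + num_reject for x in r_sums[1:batch_number]]
    return r_total + num_reject


r_sums, r_total = [0], 0
rejections = [2, 0, 3]  # made-up per-batch rejection counts
for batch_number, r in enumerate(rejections, start=1):
    r_total = update_r_sums(r_sums, r_total, r, batch_number)

# Direct computation: total rejections minus the batch's own count.
others = [r_total - r for r in rejections]
print(r_sums, others)
```

The incremental list matches the direct computation, so the denominator `r_s_plus[i] + r_sums[i + 1]` plays the same role as `R^+_s` plus the rejections from other batches in BatchBH.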

PRDS Batch Testing¶

`online_fdr.p_values.BatchPRDS` ¶

Bases: AbstractBatchingTest

Batch FDR control under Positive Regression Dependency on a Subset (PRDS).

BatchPRDS provides FDR control when p-values within each batch satisfy the PRDS condition - a form of positive dependence that is less restrictive than independence but more structured than arbitrary dependence. This makes it suitable for applications where there is positive correlation between test statistics, such as in genomics or neuroimaging.

The algorithm extends the classical Benjamini-Hochberg procedure to the online batching setting under PRDS conditions. It allocates alpha budget across batches using a gamma sequence and adjusts the significance level based on the number of previous discoveries and current batch size.

PRDS (Positive Regression Dependency on a Subset) means that for any subset of true null hypotheses, the joint distribution of corresponding p-values is stochastically smaller when conditioned on smaller values of other p-values. This includes many practically relevant dependence structures.
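
One plausible form of the level adjustment described above is sketched here. The formula α_t = α γ_t (n_t + Σ_{s<t} R_s) / n_t is an assumption based on the description (gamma allocation, previous discoveries, current batch size); the class's actual implementation is not shown in this section, and the name `prds_level` and the gamma weight are hypothetical.

```python
def prds_level(alpha, gamma_t, n_batch, prev_rejections):
    """Assumed PRDS batch level: alpha * gamma_t * (n_t + sum_{s<t} R_s) / n_t."""
    return alpha * gamma_t * (n_batch + prev_rejections) / n_batch


# Second batch of size 4 after two earlier discoveries, with a made-up gamma weight.
level = prds_level(alpha=0.05, gamma_t=0.2, n_batch=4, prev_rejections=2)
print(level)
```

Under this assumed form, earlier discoveries raise the level for later batches, while larger batches dilute that bonus.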

Parameters:

Name	Type	Description	Default
`alpha`	`float`	Target FDR level (e.g., 0.05 for 5% FDR). Must be in (0, 1).	required

Attributes:

Name	Type	Description
`alpha0`		Original target FDR level.
`seq`		Gamma sequence for alpha allocation across batches.
`num_test`	`int`	Number of batches tested so far.
`r_total`	`int`	Total number of rejections across all batches.
`alpha_s`	`list[float]`	Alpha levels used for each batch (stored for testing).

Examples:

>>> # Basic usage under PRDS conditions
>>> prds_test = BatchPRDS(alpha=0.05)
>>> # Test batch with positive correlation (e.g., genetic data)
>>> batch1 = [0.001, 0.005, 0.15, 0.8]  # Positively correlated
>>> decisions1 = prds_test.test_batch(batch1)
>>> print(f"PRDS discoveries: {sum(decisions1)}")

>>> # Subsequent batch - alpha adjusted for previous discoveries
>>> batch2 = [0.02, 0.03, 0.4, 0.9]
>>> decisions2 = prds_test.test_batch(batch2)
>>> print(f"Total discoveries: {prds_test.r_total}")

>>> # PRDS vs standard BH under positive dependence
>>> # PRDS maintains FDR control while BH may be conservative

Notes

PRDS conditions are satisfied in many practical scenarios:
- Genomic association studies with linkage disequilibrium
- Neuroimaging with spatial smoothing
- Financial time series with positive correlation
- Social network analysis with homophily

The algorithm controls FDR at level alpha under PRDS while retaining more power than conservative corrections such as BY.

References

Zrnic, T., A. Ramdas, and M. I. Jordan (2018). "Asynchronous Online Testing of Multiple Hypotheses." arXiv preprint arXiv:1812.05068.

Benjamini, Y., and D. Yekutieli (2001). "The control of the false discovery rate in multiple testing under dependency." Annals of Statistics, 29(4):1165-1188.

Benjamini, Y., and Y. Hochberg (2000). "On the Adaptive Control of the False Discovery Rate in Multiple Testing With Independent Statistics." Journal of Educational and Behavioral Statistics, 25(1):60-83.

Source code in online_fdr/p_values/batching/prds.py

class BatchPRDS(AbstractBatchingTest):
    """Batch FDR control under Positive Regression Dependency on a Subset (PRDS).

    BatchPRDS provides FDR control when p-values within each batch satisfy the
    PRDS condition - a form of positive dependence that is less restrictive than
    independence but more structured than arbitrary dependence. This makes it
    suitable for applications where there is positive correlation between test
    statistics, such as in genomics or neuroimaging.

    The algorithm extends the classical Benjamini-Hochberg procedure to the online
    batching setting under PRDS conditions. It allocates alpha budget across batches
    using a gamma sequence and adjusts the significance level based on the number
    of previous discoveries and current batch size.

    PRDS (Positive Regression Dependency on a Subset) means that, for each true
    null hypothesis, the p-values as a whole tend to be larger when that null
    p-value is larger: the conditional probability of any increasing event is
    nondecreasing in the null p-value. Independent p-values satisfy the
    condition, as do many positively dependent test statistics encountered in
    practice.

    Args:
        alpha: Target FDR level (e.g., 0.05 for 5% FDR). Must be in (0, 1).

    Attributes:
        alpha0: Original target FDR level.
        seq: Gamma sequence for alpha allocation across batches.
        num_test: Number of batches tested so far.
        r_total: Total number of rejections across all batches.
        alpha_s: Alpha levels used for each batch (stored for testing).

    Examples:
        >>> # Basic usage under PRDS conditions
        >>> prds_test = BatchPRDS(alpha=0.05)
        >>> # Test batch with positive correlation (e.g., genetic data)
        >>> batch1 = [0.001, 0.005, 0.15, 0.8]  # Positively correlated
        >>> decisions1 = prds_test.test_batch(batch1)
        >>> print(f"PRDS discoveries: {sum(decisions1)}")

        >>> # Subsequent batch - alpha adjusted for previous discoveries
        >>> batch2 = [0.02, 0.03, 0.4, 0.9]
        >>> decisions2 = prds_test.test_batch(batch2)
        >>> print(f"Total discoveries: {prds_test.r_total}")

        >>> # PRDS vs standard BH under positive dependence
        >>> # PRDS maintains FDR control while BH may be conservative

    Notes:
        PRDS conditions are satisfied in many practical scenarios:
        - Genomic association studies with linkage disequilibrium
        - Neuroimaging with spatial smoothing
        - Financial time series with positive correlation
        - Social network analysis with homophily

        The algorithm controls FDR at level alpha under PRDS while retaining
        more power than conservative corrections such as BY.

    References:
        Zrnic, T., A. Ramdas, and M. I. Jordan (2018). "Asynchronous Online
        Testing of Multiple Hypotheses." arXiv preprint arXiv:1812.05068.

        Benjamini, Y., and D. Yekutieli (2001). "The control of the false discovery
        rate in multiple testing under dependency." Annals of Statistics, 29(4):1165-1188.

        Benjamini, Y., and Y. Hochberg (2000). "On the Adaptive Control of the
        False Discovery Rate in Multiple Testing With Independent Statistics."
        Journal of Educational and Behavioral Statistics, 25(1):60-83.
    """

    def __init__(self, alpha: float):
        """Initialize BatchPRDS with FDR control level.

        Args:
            alpha: Target FDR control level. Must be in (0, 1).

        Raises:
            ValueError: If alpha is not in (0, 1).
        """
        super().__init__(alpha)
        self.alpha0 = alpha

        self.seq = DefaultSaffronGammaSequence(gamma_exp=1.6, c=0.4374901658)
        self.r_total: int = 0

        self.alpha_s: list[float] = []  # only for test

    def test_batch(self, p_vals: Sequence[float]) -> list[bool]:
        """Test a batch of p-values under PRDS conditions.

        The algorithm calculates an adaptive significance level based on the
        gamma sequence, current batch size, and total previous discoveries.
        It then applies the standard Benjamini-Hochberg procedure with this
        adapted alpha level.

        The alpha calculation incorporates:
        - Gamma sequence value for the current batch number
        - Batch size normalization
        - Adjustment for accumulated discoveries

        Args:
            p_vals: List of p-values for the current batch. Must satisfy
                   PRDS conditions within the batch.

        Returns:
            List of boolean values indicating which hypotheses are rejected.

        Examples:
            >>> prds_test = BatchPRDS(alpha=0.05)
            >>> # Test positively dependent p-values
            >>> decisions = prds_test.test_batch([0.001, 0.002, 0.15, 0.8])
            >>> print(f"PRDS rejections: {sum(decisions)}")

        Note:
            This method assumes that p-values within the batch satisfy PRDS
            conditions. If this assumption is violated, FDR control may not
            be maintained.
        """
        p_vals_local = list(p_vals)
        batch_size = len(p_vals_local)
        if batch_size == 0:
            return []
        validity.check_p_vals_batch(p_vals_local)
        batch_number = self.num_batches + 1
        alpha_t = (
            self.alpha0
            * self.seq.calc_gamma(batch_number)
            / batch_size
            * (batch_size + self.r_total)
        )
        self.alpha_s.append(alpha_t)
        num_reject, threshold = bh(p_vals_local, alpha_t)

        self.r_total += num_reject

        self._set_test_level(alpha_t, rejection_threshold=threshold)
        self._advance_batch(batch_size)
        return [p_val <= threshold for p_val in p_vals_local]

Functions¶

`test_batch(p_vals)` ¶

Test a batch of p-values under PRDS conditions.

The algorithm calculates an adaptive significance level based on the gamma sequence, current batch size, and total previous discoveries. It then applies the standard Benjamini-Hochberg procedure with this adapted alpha level.

The alpha calculation incorporates:
- Gamma sequence value for the current batch number
- Batch size normalization
- Adjustment for accumulated discoveries

Parameters:

Name	Type	Description	Default
`p_vals`	`Sequence[float]`	List of p-values for the current batch. Must satisfy PRDS conditions within the batch.	required

Returns:

Type	Description
`list[bool]`	List of boolean values indicating which hypotheses are rejected.

Examples:

>>> prds_test = BatchPRDS(alpha=0.05)
>>> # Test positively dependent p-values
>>> decisions = prds_test.test_batch([0.001, 0.002, 0.15, 0.8])
>>> print(f"PRDS rejections: {sum(decisions)}")

Note

This method assumes that p-values within the batch satisfy PRDS conditions. If this assumption is violated, FDR control may not be maintained.
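The alpha calculation described above can be traced by hand with a small self-contained sketch. It does not import the library: `gamma` assumes the gamma sequence has the form `c / t**gamma_exp` with the constants passed in `__init__`, and `bh_step_up` is a hypothetical stand-in for the library's `bh` helper. Only the expression inside `prds_alpha` is taken from the source.

```python
from typing import Sequence

C = 0.4374901658   # constant passed to the gamma sequence in __init__
GAMMA_EXP = 1.6    # decay exponent passed to the gamma sequence in __init__


def gamma(t: int) -> float:
    """Assumed form of the gamma sequence: c / t**gamma_exp (hypothetical)."""
    return C / t**GAMMA_EXP


def bh_step_up(p_vals: Sequence[float], alpha: float) -> tuple[int, float]:
    """Plain BH step-up: return (number of rejections, rejection threshold)."""
    n = len(p_vals)
    num_reject = 0
    for k, p in enumerate(sorted(p_vals), start=1):
        if p <= k * alpha / n:
            num_reject = k  # keep the largest rank that passes
    return num_reject, num_reject * alpha / n


def prds_alpha(alpha0: float, t: int, batch_size: int, r_total: int) -> float:
    """Test level for batch t, mirroring the expression in test_batch."""
    return alpha0 * gamma(t) / batch_size * (batch_size + r_total)


alpha0, r_total = 0.05, 0
results = []
batches = [[0.001, 0.005, 0.15, 0.8], [0.02, 0.03, 0.4, 0.9]]
for t, batch in enumerate(batches, start=1):
    alpha_t = prds_alpha(alpha0, t, len(batch), r_total)
    num_reject, threshold = bh_step_up(batch, alpha_t)
    r_total += num_reject
    results.append((alpha_t, num_reject))
    print(f"batch {t}: alpha_t={alpha_t:.5f}, rejections={num_reject}")
```

Under these assumptions the first batch is tested at roughly 0.0219 and yields two rejections. Those two discoveries raise the multiplier for the second batch to (4 + 2) / 4, but the smaller gamma weight still leaves a level near 0.0108, at which nothing in the second batch is rejected.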

Source code in online_fdr/p_values/batching/prds.py

def test_batch(self, p_vals: Sequence[float]) -> list[bool]:
    """Test a batch of p-values under PRDS conditions.

    The algorithm calculates an adaptive significance level based on the
    gamma sequence, current batch size, and total previous discoveries.
    It then applies the standard Benjamini-Hochberg procedure with this
    adapted alpha level.

    The alpha calculation incorporates:
    - Gamma sequence value for the current batch number
    - Batch size normalization
    - Adjustment for accumulated discoveries

    Args:
        p_vals: List of p-values for the current batch. Must satisfy
               PRDS conditions within the batch.

    Returns:
        List of boolean values indicating which hypotheses are rejected.

    Examples:
        >>> prds_test = BatchPRDS(alpha=0.05)
        >>> # Test positively dependent p-values
        >>> decisions = prds_test.test_batch([0.001, 0.002, 0.15, 0.8])
        >>> print(f"PRDS rejections: {sum(decisions)}")

    Note:
        This method assumes that p-values within the batch satisfy PRDS
        conditions. If this assumption is violated, FDR control may not
        be maintained.
    """
    p_vals_local = list(p_vals)
    batch_size = len(p_vals_local)
    if batch_size == 0:
        return []
    validity.check_p_vals_batch(p_vals_local)
    batch_number = self.num_batches + 1
    alpha_t = (
        self.alpha0
        * self.seq.calc_gamma(batch_number)
        / batch_size
        * (batch_size + self.r_total)
    )
    self.alpha_s.append(alpha_t)
    num_reject, threshold = bh(p_vals_local, alpha_t)

    self.r_total += num_reject

    self._set_test_level(alpha_t, rejection_threshold=threshold)
    self._advance_batch(batch_size)
    return [p_val <= threshold for p_val in p_vals_local]

TOAD Decision Deadlines¶

`online_fdr.p_values.Toad` ¶

Bases: StatefulMethodMixin

Thresholds based on active discoveries for decision-deadline online FDR.

Source code in online_fdr/p_values/batching/toad.py

class Toad(StatefulMethodMixin):
    """Thresholds based on active discoveries for decision-deadline online FDR."""

    error_rate = "FDR"

    def __init__(self, alpha: float = 0.05):
        validity.check_alpha(alpha)
        self.target_level = float(alpha)
        self._records: list[_ToadRecord] = []
        self._records_by_id: dict[Hashable, _ToadRecord] = {}
        self._next_auto_id = 1
        self._current_rejections: set[int] = set()
        self._current_stage = 0
        self._current_decisions: dict[Hashable, bool] = {}
        self._final_decisions: dict[Hashable, bool] = {}

    def _snapshot_state(self) -> dict[str, Any]:
        state = dict(self.__dict__)
        state["_records"] = [dict(record.__dict__) for record in self._records]
        state.pop("_records_by_id", None)
        return state

    @classmethod
    def _restore_snapshot_state(cls, state: dict[str, Any]) -> dict[str, Any]:
        records = [_ToadRecord(**record) for record in state.get("_records", [])]
        state["_records"] = records
        state["_records_by_id"] = {record.test_id: record for record in records}
        return state

    @property
    def records(self) -> tuple[ToadRecord, ...]:
        return tuple(record.to_snapshot() for record in self._records)

    @property
    def current_stage(self) -> int:
        return self._current_stage

    @property
    def current_decisions(self) -> dict[Hashable, bool]:
        return dict(self._current_decisions)

    @property
    def final_decisions(self) -> dict[Hashable, bool]:
        return dict(self._final_decisions)

    @property
    def num_hypotheses(self) -> int:
        return len(self._records)

    def add_test(
        self,
        p_val: float,
        deadline: int,
        test_id: Hashable | None = None,
        weight: float | None = None,
    ) -> bool:
        validity.check_p_val(p_val)
        if test_id is None:
            test_id = self._next_auto_id
            self._next_auto_id += 1
        if test_id in self._records_by_id:
            raise ValueError(f"test_id {test_id!r} has already been added.")

        stage = len(self._records) + 1
        if deadline < stage:
            raise ValueError("deadline must be at least the test's arrival stage.")
        if weight is None:
            weight = _default_stream_weight(stage)
        if weight <= 0:
            raise ValueError("weight must be positive.")
        if sum(record.weight for record in self._records) + weight > 1 + 1e-10:
            raise ValueError("streaming TOAD weights must sum to at most 1.")

        record = _ToadRecord(
            test_id=test_id,
            p_val=float(p_val),
            deadline=deadline,
            weight=float(weight),
            stage=stage,
        )
        self._records.append(record)
        self._records_by_id[test_id] = record
        self._current_stage = stage
        self._recompute_at(stage)
        return self._current_decisions[test_id]

    def advance_to(self, stage: int) -> dict[Hashable, bool]:
        if stage < self.current_stage:
            raise ValueError("stage cannot move backwards.")
        self._current_stage = stage
        self._recompute_at(stage)
        finalized: dict[Hashable, bool] = {}
        for record in self._records:
            if not record.finalized and record.deadline < stage:
                record.finalized = True
                decision = self._current_decisions[record.test_id]
                self._final_decisions[record.test_id] = decision
                finalized[record.test_id] = decision
        return finalized

    @staticmethod
    def run_finite(
        p_values: Sequence[float],
        deadlines: Sequence[int],
        alpha: float = 0.05,
        weights: Sequence[float] | None = None,
    ) -> list[bool]:
        return run_finite(p_values, deadlines, alpha=alpha, weights=weights)

    def _recompute_at(self, stage: int) -> None:
        ratios = [record.p_val / record.weight for record in self._records]
        candidates = [
            idx
            for idx, record in enumerate(self._records)
            if record.stage <= stage and record.deadline >= stage
        ]
        self._current_rejections = _toad_step(
            ratios, self.target_level, self._current_rejections, candidates
        )
        for idx, record in enumerate(self._records):
            record.rejected = idx in self._current_rejections
            self._current_decisions[record.test_id] = record.rejected

Key Concepts¶

Alpha Allocation¶

The batching framework uses a gamma sequence $\{\gamma_t\}$ to allocate alpha budget across batches:
- Batch 1: $\alpha_1 = \gamma_1 \alpha$
- Batch t: $\alpha_t$ calculated using inter-batch dependency corrections
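As a rough illustration of how a gamma sequence spreads the budget, the sketch below assumes a polynomial-decay form `c / t**1.6` with the constants that `BatchPRDS.__init__` passes to `DefaultSaffronGammaSequence`. The actual sequence class may be defined differently, so treat the numbers as indicative only.

```python
ALPHA = 0.05
C, GAMMA_EXP = 0.4374901658, 1.6  # constants taken from BatchPRDS.__init__


def gamma(t: int) -> float:
    """Assumed polynomial-decay form of the gamma sequence (hypothetical)."""
    return C / t**GAMMA_EXP


# Base allocation gamma_t * alpha for the first five batches.
base_levels = [ALPHA * gamma(t) for t in range(1, 6)]
# Partial sum of the weights: under this assumed form it approaches 1 from below.
partial_sum = sum(gamma(t) for t in range(1, 10_001))

print([round(a, 5) for a in base_levels])
print(f"sum of first 10,000 gamma weights: {partial_sum:.4f}")
```

Because the weights sum to about 1 under this assumption, the base allocations add up to at most roughly the target level, with most of the budget going to the earliest batches.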

R⁺ Calculation¶

For each batch, the algorithm computes $R^+$ (maximum possible rejections if one p-value were 0):
- Used to determine optimal alpha allocation for future batches
- Balances current discoveries with future testing power
- Key innovation enabling adaptive power allocation
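The definition of $R^+$ above translates directly into a brute-force computation: set each p-value to 0 in turn, rerun BH, and keep the largest rejection count. The sketch below is a minimal illustration with hypothetical helper names. It is not the library's implementation, which may use a faster method.

```python
from typing import Sequence


def bh_rejections(p_vals: Sequence[float], alpha: float) -> int:
    """Number of rejections made by the BH step-up procedure at level alpha."""
    n = len(p_vals)
    num_reject = 0
    for k, p in enumerate(sorted(p_vals), start=1):
        if p <= k * alpha / n:
            num_reject = k  # keep the largest rank that passes
    return num_reject


def r_plus(p_vals: Sequence[float], alpha: float) -> int:
    """Largest BH rejection count obtainable by setting a single p-value to 0."""
    p = list(p_vals)
    best = 0
    for i in range(len(p)):
        modified = p[:i] + [0.0] + p[i + 1:]
        best = max(best, bh_rejections(modified, alpha))
    return best


batch = [0.001, 0.02, 0.15, 0.8]
print(bh_rejections(batch, 0.05), r_plus(batch, 0.05))
```

For this batch BH rejects two hypotheses at level 0.05, and zeroing either of the two larger p-values lifts the count to three, so $R^+ = 3$. The loop costs O(n² log n) per batch, which is fine for illustration but wasteful for large batches.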

β_t Correction¶

The inter-batch dependency correction $\beta_t$ accounts for:
- Previous batch results affecting current alpha allocation
- Preventing "double spending" of alpha across batches
- Maintaining valid FDR control despite dependencies

Batch Size Considerations¶

Batch Size	Recommended Method	Gamma Sequence	Notes
< 10	BatchBH	Polynomial decay	Small batch penalty
10-100	BatchBH/BatchStoreyBH	Polynomial decay	Good balance
> 100	BatchStoreyBH	Half sequence	π₀ estimation effective
Variable	BatchBH	Adaptive	Handles size variation

Method Comparison¶

Power Under Different Conditions¶

Method	Independent	PRDS	Arbitrary Dependence	π₀ < 1
BatchBH	Excellent	Good	May not control	Good
BatchStoreyBH	Excellent	Good	May not control	Excellent
BatchBY	Good	Good	Conservative extension	Fair
BatchPRDS	Excellent	Excellent	May not control	Good

Computational Complexity¶

Method	Per-batch Time	Per-batch Space	Cumulative Storage
BatchBH	O(n log n)	O(n)	O(T)
BatchStoreyBH	O(n log n)	O(n)	O(T)
BatchBY	O(n log n)	O(n)	O(T)
BatchPRDS	O(n log n)	O(n)	O(1)

where n = batch size, T = number of batches.

Usage Guidelines¶

Method Selection¶

BatchBH: Default choice for most applications
BatchStoreyBH: When π₀ < 1 and batches are reasonably large
BatchBY: When a conservative BY-style within-batch correction is desired
BatchPRDS: When positive dependence structure is known
TOAD: When each test has a fixed decision deadline

Parameter Tuning¶

Alpha: Set based on desired FDR level (typically 0.05 or 0.1)
Gamma sequence: Use defaults unless specific decay patterns needed
λ (Storey): 0.5 is standard; higher values are more conservative

Practical Implementation¶

from online_fdr.p_values import BatchBH

# Initialize batch testing procedure
batch_test = BatchBH(alpha=0.05)

# Process batches sequentially  
batches = [
    [0.001, 0.02, 0.15, 0.8],      # Batch 1
    [0.03, 0.9, 0.006, 0.4],       # Batch 2  
    [0.12, 0.005, 0.7, 0.25]       # Batch 3
]

all_decisions = []
for i, batch in enumerate(batches, 1):
    decisions = batch_test.test_batch(batch)
    discoveries = sum(decisions)
    all_decisions.extend(decisions)
    print(f"Batch {i}: {discoveries}/{len(batch)} discoveries")

total_discoveries = sum(all_decisions)
print(f"Total: {total_discoveries} discoveries with FDR ≤ 0.05")

Advanced Topics¶

Asynchronous Batching¶

When batches complete out of order:
- Maintain batch ordering for FDR calculations
- Buffer results until dependencies resolved
- Use timestamps or sequence numbers
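One way to follow the points above is a small reorder buffer keyed by sequence number. The class below is a hypothetical helper and not part of the library: it holds batches that arrive early and passes them to a `test_batch` callable strictly in order. A stand-in tester is used so the sketch runs on its own.

```python
from typing import Callable, Sequence


class OrderedBatchBuffer:
    """Hypothetical helper: release batches to a tester strictly in sequence order."""

    def __init__(self, test_batch: Callable[[Sequence[float]], list[bool]]):
        self._test_batch = test_batch
        self._pending: dict[int, Sequence[float]] = {}
        self._next_seq = 1
        self.results: dict[int, list[bool]] = {}

    def submit(self, seq: int, p_vals: Sequence[float]) -> list[int]:
        """Store a batch, then test every batch that is now next in line."""
        if seq < self._next_seq or seq in self._pending:
            raise ValueError(f"batch {seq} was already submitted")
        self._pending[seq] = p_vals
        released = []
        while self._next_seq in self._pending:
            batch = self._pending.pop(self._next_seq)
            self.results[self._next_seq] = self._test_batch(batch)
            released.append(self._next_seq)
            self._next_seq += 1
        return released


# Stand-in tester so the sketch runs without the library; in practice the
# callable would be a bound method such as BatchBH(alpha=0.05).test_batch.
order_seen = []


def fake_test_batch(p_vals):
    order_seen.append(list(p_vals))
    return [p <= 0.01 for p in p_vals]


buf = OrderedBatchBuffer(fake_test_batch)
first = buf.submit(2, [0.5, 0.004])   # batch 1 is missing, so nothing is released
second = buf.submit(1, [0.001, 0.3])  # batch 1 arrives, so 1 and then 2 are tested
print(first, second)
```

The buffer grows with the number of outstanding batches, so a long-delayed batch holds up everything behind it. Whether that is acceptable depends on the application.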

Variable Batch Sizes¶

The framework naturally handles:
- Different batch sizes across time
- Empty batches (no effect on FDR control)
- Very large batches (may need memory management)

Online-to-Batch Adaptation¶

Converting online methods to batch setting:
- Group individual tests into batches
- Apply batch framework with chosen internal procedure
- May improve power over pure online methods

Applications¶

Genomics¶

GWAS studies: SNPs tested in chromosomal batches
RNA-seq: Genes tested by biological pathway
Meta-analysis: Studies combined in batches

A/B Testing¶

Feature releases: Tests grouped by release cycle
Market segments: Tests batched by user demographic
Time periods: Daily/weekly testing batches

Clinical Trials¶

Interim analyses: Endpoints tested in groups
Safety monitoring: Adverse events by system
Biomarker discovery: Markers tested by assay batch

Implementation Notes¶

Memory Management¶

Store only essential statistics between batches
Use efficient R⁺ calculation algorithms
Consider streaming for very large batch sequences

Numerical Stability¶

Handle very small p-values carefully
Avoid numerical overflow in cumulative calculations
Use log-space computations when appropriate

Validation¶

Verify FDR control through simulation
Test edge cases (empty batches, extreme p-values)
Benchmark against known implementations
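A simulation check of the kind listed above might look like the following sketch. It re-implements the `BatchPRDS` level formula from the source listing rather than importing the library, assumes the gamma sequence has the form `c / t**1.6`, and uses independent one-sided z-test p-values (a special case of PRDS). The batch count, batch size, null proportion, and effect size are arbitrary illustrative choices.

```python
import math
import random


def gamma(t: int) -> float:
    """Assumed gamma sequence c / t**1.6 (constants from BatchPRDS.__init__)."""
    return 0.4374901658 / t**1.6


def bh_reject(p_vals, alpha):
    """BH step-up at level alpha; returns a list of rejection decisions."""
    n = len(p_vals)
    k_max = 0
    for k, p in enumerate(sorted(p_vals), start=1):
        if p <= k * alpha / n:
            k_max = k
    threshold = k_max * alpha / n
    return [k_max > 0 and p <= threshold for p in p_vals]


def one_run(rng, alpha0=0.05, n_batches=5, batch_size=20, pi0=0.5, mu=3.0):
    """Simulate one stream of batches; return its false discovery proportion."""
    r_total = false_rej = 0
    for t in range(1, n_batches + 1):
        is_null = [rng.random() < pi0 for _ in range(batch_size)]
        # One-sided z-tests: N(0, 1) under the null, N(mu, 1) otherwise.
        z = [rng.gauss(0.0 if null else mu, 1.0) for null in is_null]
        p_vals = [0.5 * math.erfc(zi / math.sqrt(2)) for zi in z]
        alpha_t = alpha0 * gamma(t) / batch_size * (batch_size + r_total)
        decisions = bh_reject(p_vals, alpha_t)
        r_total += sum(decisions)
        false_rej += sum(d and null for d, null in zip(decisions, is_null))
    return false_rej / max(r_total, 1)


rng = random.Random(0)
fdps = [one_run(rng) for _ in range(500)]
empirical_fdr = sum(fdps) / len(fdps)
print(f"empirical FDR over {len(fdps)} runs: {empirical_fdr:.4f}")
```

The average false discovery proportion should land below the 0.05 target. A fuller validation would repeat this across null proportions, effect sizes, and dependence structures, and would call the library classes directly.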

References¶

Zrnic, T., D. Jiang, A. Ramdas, and M. I. Jordan (2020). "The Power of Batching in Multiple Hypothesis Testing." Proceedings of the 37th International Conference on Machine Learning (ICML), PMLR, 119:11504-11515.

Benjamini, Y., and Y. Hochberg (1995). "Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing." Journal of the Royal Statistical Society: Series B, 57(1):289-300.

Storey, J. D. (2002). "A direct approach to false discovery rates." Journal of the Royal Statistical Society: Series B, 64(3):479-498.
