Utility Functions¶

The utils module provides essential supporting functions for online FDR control algorithms, including static procedures, gamma sequences, validation functions, and data generation utilities.

Static Procedures¶

Core implementations of classical multiple testing procedures:

`online_fdr.utils.static.bh(p_vals, alpha)` ¶

Static Benjamini-Hochberg (BH) procedure for FDR control.

The Benjamini-Hochberg procedure is the fundamental method for controlling the False Discovery Rate in multiple testing. It provides FDR control at level α under independence of p-values or positive dependence (PRDS).

The algorithm sorts p-values and finds the largest index i such that P(i) ≤ (i/n) × α, where P(i) is the i-th smallest p-value and n is the total number of tests.

Parameters:

Name	Type	Description	Default
`p_vals`	`[float]`	List of p-values to test. Should be in [0,1].	required
`alpha`	`float`	Target FDR level. Must be in (0,1).	required

Returns:

Type	Description
`(int, float)`	Tuple of (number_of_rejections, rejection_threshold).
`(int, float)`	number_of_rejections: Number of hypotheses rejected
`(int, float)`	rejection_threshold: The p-value threshold used for rejection

Examples:

>>> # Basic usage
>>> p_values = [0.01, 0.02, 0.5, 0.8]
>>> num_rej, threshold = bh(p_values, alpha=0.05)
>>> print(f"Rejected {num_rej} hypotheses at threshold {threshold:.4f}")

>>> # Check which p-values are rejected
>>> rejected = [p <= threshold for p in p_values]
>>> print(f"Rejected p-values: {[p for p, r in zip(p_values, rejected) if r]}")

References

Benjamini, Y., and Y. Hochberg (1995). "Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing." Journal of the Royal Statistical Society: Series B, 57(1):289-300.

Source code in online_fdr/utils/static.py

def bh(p_vals: [float], alpha: float) -> (int, float):
    """Static Benjamini-Hochberg (BH) procedure for FDR control.

    The Benjamini-Hochberg procedure is the fundamental method for controlling
    the False Discovery Rate in multiple testing. It provides FDR control at
    level α under independence of p-values or positive dependence (PRDS).

    The algorithm sorts p-values and finds the largest index i such that
    P(i) ≤ (i/n) × α, where P(i) is the i-th smallest p-value and n is the
    total number of tests.

    Args:
        p_vals: List of p-values to test. Should be in [0,1].
        alpha: Target FDR level. Must be in (0,1).

    Returns:
        Tuple of (number_of_rejections, rejection_threshold).
        - number_of_rejections: Number of hypotheses rejected
        - rejection_threshold: The p-value threshold used for rejection

    Examples:
        >>> # Basic usage
        >>> p_values = [0.01, 0.02, 0.5, 0.8]
        >>> num_rej, threshold = bh(p_values, alpha=0.05)
        >>> print(f"Rejected {num_rej} hypotheses at threshold {threshold:.4f}")

        >>> # Check which p-values are rejected
        >>> rejected = [p <= threshold for p in p_values]
        >>> print(f"Rejected p-values: {[p for p, r in zip(p_values, rejected) if r]}")

    References:
        Benjamini, Y., and Y. Hochberg (1995). "Controlling the False Discovery Rate:
        A Practical and Powerful Approach to Multiple Testing." Journal of the Royal
        Statistical Society: Series B, 57(1):289-300.
    """
    n = len(p_vals)
    sorted_p_vals = sorted(p_vals)

    def condition(i):
        return sorted_p_vals[i] <= alpha * (i + 1) / n

    left, right = 0, n
    while left < right:
        mid = (left + right) // 2
        if condition(mid):
            left = mid + 1
        else:
            right = mid

    return left, alpha * left / n if left else 0

`online_fdr.utils.static.storey_bh(p_vals, alpha, lambda_)` ¶

Static Storey-Benjamini-Hochberg procedure with π₀ estimation.

The Storey-BH procedure extends the classical Benjamini-Hochberg method by estimating π₀ (the proportion of true null hypotheses) and incorporating this estimate into the rejection threshold. This typically provides higher power when π₀ < 1, which is common in genomic and other high-dimensional applications.

The algorithm estimates π₀ using Storey's method: π̂₀ = min(1, (1 + #{p > λ}) / (n × (1 - λ)))

Then applies the BH procedure with adjusted significance level α/π̂₀.

Parameters:

Name	Type	Description	Default
`p_vals`	`[float]`	List of p-values to test. Should be in [0,1].	required
`alpha`	`float`	Target FDR level. Must be in (0,1).	required
`lambda_`	`float`	Threshold parameter for π₀ estimation. Must be in [0,1). Common choice: λ = 0.5. Higher values give more conservative π₀ estimates.	required

Returns:

Type	Description
`(int, float)`	Tuple of (number_of_rejections, rejection_threshold).
`(int, float)`	number_of_rejections: Number of hypotheses rejected
`(int, float)`	rejection_threshold: The p-value threshold used for rejection

Raises:

Type	Description
`ValueError`	If lambda_ >= 1.0.

Examples:

>>> # Standard usage with λ = 0.5
>>> p_values = [0.001, 0.02, 0.5, 0.8, 0.9]
>>> num_rej, threshold = storey_bh(p_values, alpha=0.05, lambda_=0.5)
>>> print(f"Storey-BH rejected {num_rej} at threshold {threshold:.4f}")

>>> # Compare with standard BH
>>> num_rej_bh, threshold_bh = bh(p_values, alpha=0.05)
>>> print(f"Standard BH rejected {num_rej_bh} at threshold {threshold_bh:.4f}")
>>> # Storey-BH typically has higher power when π₀ < 1

References

Storey, J. D. (2002). "A direct approach to false discovery rates." Journal of the Royal Statistical Society: Series B, 64(3):479-498.

Storey, J. D. (2003). "The positive false discovery rate: a Bayesian interpretation and the q-value." Annals of Statistics, 31(6):2013-2035.

Source code in online_fdr/utils/static.py

def storey_bh(p_vals: [float], alpha: float, lambda_: float) -> (int, float):
    """Static Storey-Benjamini-Hochberg procedure with π₀ estimation.

    The Storey-BH procedure extends the classical Benjamini-Hochberg method by
    estimating π₀ (the proportion of true null hypotheses) and incorporating
    this estimate into the rejection threshold. This typically provides higher
    power when π₀ < 1, which is common in genomic and other high-dimensional
    applications.

    The algorithm estimates π₀ using Storey's method:
    π̂₀ = min(1, (1 + #{p > λ}) / (n × (1 - λ)))

    Then applies the BH procedure with adjusted significance level α/π̂₀.

    Args:
        p_vals: List of p-values to test. Should be in [0,1].
        alpha: Target FDR level. Must be in (0,1).
        lambda_: Threshold parameter for π₀ estimation. Must be in [0,1).
                Common choice: λ = 0.5. Higher values give more conservative
                π₀ estimates.

    Returns:
        Tuple of (number_of_rejections, rejection_threshold).
        - number_of_rejections: Number of hypotheses rejected
        - rejection_threshold: The p-value threshold used for rejection

    Raises:
        ValueError: If lambda_ >= 1.0.

    Examples:
        >>> # Standard usage with λ = 0.5
        >>> p_values = [0.001, 0.02, 0.5, 0.8, 0.9]
        >>> num_rej, threshold = storey_bh(p_values, alpha=0.05, lambda_=0.5)
        >>> print(f"Storey-BH rejected {num_rej} at threshold {threshold:.4f}")

        >>> # Compare with standard BH
        >>> num_rej_bh, threshold_bh = bh(p_values, alpha=0.05)
        >>> print(f"Standard BH rejected {num_rej_bh} at threshold {threshold_bh:.4f}")
        >>> # Storey-BH typically has higher power when π₀ < 1

    References:
        Storey, J. D. (2002). "A direct approach to false discovery rates."
        Journal of the Royal Statistical Society: Series B, 64(3):479-498.

        Storey, J. D. (2003). "The positive false discovery rate: a Bayesian
        interpretation and the q-value." Annals of Statistics, 31(6):2013-2035.
    """
    if not p_vals:
        return 0, 0.0

    n = len(p_vals)

    # Estimate π₀ using Storey's method
    # π₀ = (1 + #{p_i > λ}) / (n(1-λ))
    if lambda_ >= 1.0:
        raise ValueError("lambda_ must be less than 1.0")

    num_above_lambda = sum(1 for p in p_vals if p > lambda_)
    pi0 = min(1.0, (1 + num_above_lambda) / (n * (1 - lambda_)))

    # Apply BH procedure with adjusted alpha
    sorted_p_vals = sorted(p_vals)

    # Find the largest i such that P(i) ≤ (i/n) * (alpha/π₀)
    num_reject = 0
    threshold = 0.0

    for i in range(n):
        if sorted_p_vals[i] <= (i + 1) * alpha / (n * pi0):
            num_reject = i + 1
            threshold = sorted_p_vals[i]
        else:
            break

    return num_reject, threshold

`online_fdr.utils.static.by(p_vals, alpha)` ¶

Static Benjamini-Yekutieli (BY) procedure for FDR control under dependence.

The Benjamini-Yekutieli procedure extends the Benjamini-Hochberg method to provide FDR control even under arbitrary dependence among p-values. It uses harmonic weights to maintain FDR control at the cost of reduced power compared to the standard BH procedure.

The algorithm finds the largest index i such that: P(i) ≤ (i/n) × α / H_n

where H_n = Σ(1/j) for j=1 to n is the n-th harmonic number.

Parameters:

Name	Type	Description	Default
`p_vals`	`[float]`	List of p-values to test. Should be in [0,1].	required
`alpha`	`float`	Target FDR level. Must be in (0,1).	required

Returns:

Type	Description
`(int, float)`	Tuple of (number_of_rejections, rejection_threshold).
`(int, float)`	number_of_rejections: Number of hypotheses rejected
`(int, float)`	rejection_threshold: The p-value threshold used for rejection

Examples:

>>> # Usage under arbitrary dependence
>>> p_values = [0.01, 0.02, 0.5, 0.8]  # May be dependent
>>> num_rej, threshold = by(p_values, alpha=0.05)
>>> print(f"BY rejected {num_rej} at threshold {threshold:.4f}")

>>> # Compare with BH (more conservative under dependence)
>>> num_rej_bh, threshold_bh = bh(p_values, alpha=0.05)
>>> print(f"BH rejected {num_rej_bh} (may not control FDR under dependence)")

Notes

The BY procedure is recommended when: - P-values may be arbitrarily dependent - Conservative FDR control is required - The exact dependence structure is unknown

It maintains valid FDR control under any dependence structure but typically has lower power than BH under independence.

References

Benjamini, Y., and D. Yekutieli (2001). "The control of the false discovery rate in multiple testing under dependency." Annals of Statistics, 29(4):1165-1188.

Source code in online_fdr/utils/static.py

def by(p_vals: [float], alpha: float) -> (int, float):
    """Static Benjamini-Yekutieli (BY) procedure for FDR control under dependence.

    The Benjamini-Yekutieli procedure extends the Benjamini-Hochberg method to
    provide FDR control even under arbitrary dependence among p-values. It uses
    harmonic weights to maintain FDR control at the cost of reduced power
    compared to the standard BH procedure.

    The algorithm finds the largest index i such that:
    P(i) ≤ (i/n) × α / H_n

    where H_n = Σ(1/j) for j=1 to n is the n-th harmonic number.

    Args:
        p_vals: List of p-values to test. Should be in [0,1].
        alpha: Target FDR level. Must be in (0,1).

    Returns:
        Tuple of (number_of_rejections, rejection_threshold).
        - number_of_rejections: Number of hypotheses rejected
        - rejection_threshold: The p-value threshold used for rejection

    Examples:
        >>> # Usage under arbitrary dependence
        >>> p_values = [0.01, 0.02, 0.5, 0.8]  # May be dependent
        >>> num_rej, threshold = by(p_values, alpha=0.05)
        >>> print(f"BY rejected {num_rej} at threshold {threshold:.4f}")

        >>> # Compare with BH (more conservative under dependence)
        >>> num_rej_bh, threshold_bh = bh(p_values, alpha=0.05)
        >>> print(f"BH rejected {num_rej_bh} (may not control FDR under dependence)")

    Notes:
        The BY procedure is recommended when:
        - P-values may be arbitrarily dependent
        - Conservative FDR control is required
        - The exact dependence structure is unknown

        It maintains valid FDR control under any dependence structure but
        typically has lower power than BH under independence.

    References:
        Benjamini, Y., and D. Yekutieli (2001). "The control of the false discovery
        rate in multiple testing under dependency." Annals of Statistics, 29(4):1165-1188.
    """
    n = len(p_vals)
    sorted_p_vals = sorted(p_vals)
    harmonic_sum = sum(1 / (i + 1) for i in range(n))

    def condition(i):
        return sorted_p_vals[i] <= alpha * (i + 1) / (n * harmonic_sum)

    left, right = 0, n
    while left < right:
        mid = (left + right) // 2
        if condition(mid):
            left = mid + 1
        else:
            right = mid

    return left, alpha * left / (n * harmonic_sum) if left else 0

Gamma Sequences¶

Specialized sequences for alpha allocation in online procedures:

`online_fdr.utils.sequence.DefaultSaffronGammaSequence` ¶

Bases: AbstractGammaSequence

Default gamma sequence for SAFFRON (Spectral Approach for FDR with Online N-tests).

SAFFRON uses a power-law gamma sequence that provides excellent performance for adaptive online FDR control. The sequence can operate in two modes: with or without a normalization constant c.

The sequence is defined as: - With c: γⱼ = c / j^(gamma_exp)
- Without c: γⱼ = j^(gamma_exp)

Parameters:

Name	Type	Description	Default
`gamma_exp`		Exponent controlling the decay rate. Recommended: 1.6.	required
`c`		Normalization constant. If None, uses pure power law.	required

Examples:

>>> # Standard SAFFRON sequence with recommended parameters
>>> gamma_seq = DefaultSaffronGammaSequence(gamma_exp=1.6, c=0.4374901658)
>>> gamma_values = [gamma_seq.calc_gamma(j) for j in range(1, 6)]
>>> print(f"SAFFRON γ values: {gamma_values}")

References

Ramdas, A., T. Zrnic, M. J. Wainwright, and M. I. Jordan (2018). "SAFFRON: an adaptive algorithm for online control of the false discovery rate." Proceedings of the 35^th International Conference on Machine Learning (ICML), PMLR, 80:4286-4294.

Source code in online_fdr/utils/sequence.py

class DefaultSaffronGammaSequence(AbstractGammaSequence):
    """Default gamma sequence for SAFFRON (Spectral Approach for FDR with Online N-tests).

    SAFFRON uses a power-law gamma sequence that provides excellent performance
    for adaptive online FDR control. The sequence can operate in two modes:
    with or without a normalization constant c.

    The sequence is defined as:
    - With c: γⱼ = c / j^(gamma_exp)  
    - Without c: γⱼ = j^(gamma_exp)

    Args:
        gamma_exp: Exponent controlling the decay rate. Recommended: 1.6.
        c: Normalization constant. If None, uses pure power law.

    Examples:
        >>> # Standard SAFFRON sequence with recommended parameters
        >>> gamma_seq = DefaultSaffronGammaSequence(gamma_exp=1.6, c=0.4374901658)
        >>> gamma_values = [gamma_seq.calc_gamma(j) for j in range(1, 6)]
        >>> print(f"SAFFRON γ values: {gamma_values}")

    References:
        Ramdas, A., T. Zrnic, M. J. Wainwright, and M. I. Jordan (2018).
        "SAFFRON: an adaptive algorithm for online control of the false discovery rate."
        Proceedings of the 35th International Conference on Machine Learning (ICML),
        PMLR, 80:4286-4294.
    """

    def __init__(self, gamma_exp, c):
        """Initialize the SAFFRON gamma sequence.

        Args:
            gamma_exp: Decay exponent. Recommended value: 1.6.
            c: Normalization constant. Recommended value: 0.4374901658.
               If None, uses pure power law without normalization.
        """
        super().__init__(gamma_exp=gamma_exp, c=c)

    def calc_gamma(self, j: int, *args):
        """Calculate gamma value for position j.

        Args:
            j: Position index (1-based).
            *args: Additional arguments (unused).

        Returns:
            Gamma value for position j.
        """
        return j**self.gamma_exp if self.c is None \
            else self.c / j**self.gamma_exp  # fmt: skip

Functions¶

`calc_gamma(j, *args)` ¶

Calculate gamma value for position j.

Parameters:

Name	Type	Description	Default
`j`	`int`	Position index (1-based).	required
`*args`		Additional arguments (unused).	`()`

Returns:

Type	Description
	Gamma value for position j.

Source code in online_fdr/utils/sequence.py

def calc_gamma(self, j: int, *args):
    """Calculate gamma value for position j.

    Args:
        j: Position index (1-based).
        *args: Additional arguments (unused).

    Returns:
        Gamma value for position j.
    """
    return j**self.gamma_exp if self.c is None \
        else self.c / j**self.gamma_exp  # fmt: skip

`online_fdr.utils.sequence.DefaultLordGammaSequence` ¶

Bases: AbstractGammaSequence

Default gamma sequence for LORD (Levels based On Recent Discovery) procedures.

This gamma sequence is optimized for LORD algorithms and provides the balance between early power and long-term FDR control. Unlike LOND sequences, LORD sequences don't require the alpha parameter since they work with alpha wealth dynamics.

The sequence is defined as: γⱼ = c × log(max(j,2)) / (j × exp(√log(j)))

Parameters:

Name	Type	Description	Default
`c`	`float`	Normalization constant controlling sequence scale.	required

Examples:

>>> # Standard LORD gamma sequence  
>>> gamma_seq = DefaultLordGammaSequence(c=0.07720838)
>>> gamma_values = [gamma_seq.calc_gamma(j) for j in range(1, 6)]
>>> print(f"First 5 gamma values: {gamma_values}")

References

Javanmard, A., and A. Montanari (2018). "Online rules for control of false discovery rate and false discovery exceedance." Annals of Statistics, 46(2):526-554.

Source code in online_fdr/utils/sequence.py

class DefaultLordGammaSequence(AbstractGammaSequence):
    """Default gamma sequence for LORD (Levels based On Recent Discovery) procedures.

    This gamma sequence is optimized for LORD algorithms and provides the balance
    between early power and long-term FDR control. Unlike LOND sequences, LORD
    sequences don't require the alpha parameter since they work with alpha wealth
    dynamics.

    The sequence is defined as:
    γⱼ = c × log(max(j,2)) / (j × exp(√log(j)))

    Args:
        c: Normalization constant controlling sequence scale.

    Examples:
        >>> # Standard LORD gamma sequence  
        >>> gamma_seq = DefaultLordGammaSequence(c=0.07720838)
        >>> gamma_values = [gamma_seq.calc_gamma(j) for j in range(1, 6)]
        >>> print(f"First 5 gamma values: {gamma_values}")

    References:
        Javanmard, A., and A. Montanari (2018). "Online rules for control of false
        discovery rate and false discovery exceedance." Annals of Statistics,
        46(2):526-554.
    """

    def __init__(self, c: float):
        """Initialize the LORD gamma sequence.

        Args:
            c: Normalization constant. Recommended value: c ≈ 0.07720838.
        """
        super().__init__(c)

    def calc_gamma(self, j: int, **kwargs):
        """Calculate the gamma value for position j in the sequence.

        Args:
            j: Position index (1-based).
            **kwargs: Additional arguments (unused for LORD sequences).

        Returns:
            The gamma value for position j.
        """
        return (
            self.c  # fmt: skip
            * math.log(max(j, 2))
            / (j * math.exp(math.sqrt(math.log(j))))
        )

Functions¶

`calc_gamma(j, **kwargs)` ¶

Calculate the gamma value for position j in the sequence.

Parameters:

Name	Type	Description	Default
`j`	`int`	Position index (1-based).	required
`**kwargs`		Additional arguments (unused for LORD sequences).	`{}`

Returns:

Type	Description
	The gamma value for position j.

Source code in online_fdr/utils/sequence.py

def calc_gamma(self, j: int, **kwargs):
    """Calculate the gamma value for position j in the sequence.

    Args:
        j: Position index (1-based).
        **kwargs: Additional arguments (unused for LORD sequences).

    Returns:
        The gamma value for position j.
    """
    return (
        self.c  # fmt: skip
        * math.log(max(j, 2))
        / (j * math.exp(math.sqrt(math.log(j))))
    )

`online_fdr.utils.sequence.DefaultLondGammaSequence` ¶

Bases: AbstractGammaSequence

Default gamma sequence for LOND (Levels based On Number of Discoveries).

This gamma sequence is specifically designed for LOND procedures and provides the recommended decay rate for optimal power-stability trade-offs. The sequence incorporates both logarithmic and exponential decay components to balance early power with long-term FDR control.

The sequence is defined as: γⱼ = c × α × log(max(j,2)) / (j × exp(√log(j)))

Parameters:

Name	Type	Description	Default
`c`	`float`	Normalization constant controlling the overall scale of the sequence.	required

Attributes:

Name	Type	Description
`c`	`float \| None`	The normalization constant.

Examples:

>>> # Standard LOND gamma sequence
>>> gamma_seq = DefaultLondGammaSequence(c=0.07720838)
>>> gamma_1 = gamma_seq.calc_gamma(1, alpha=0.05)
>>> gamma_10 = gamma_seq.calc_gamma(10, alpha=0.05)
>>> print(f"γ₁ = {gamma_1:.6f}, γ₁₀ = {gamma_10:.6f}")

References

Javanmard, A., and A. Montanari (2018). "Online rules for control of false discovery rate and false discovery exceedance." Annals of Statistics, 46(2):526-554.

Source code in online_fdr/utils/sequence.py

class DefaultLondGammaSequence(AbstractGammaSequence):
    """Default gamma sequence for LOND (Levels based On Number of Discoveries).

    This gamma sequence is specifically designed for LOND procedures and provides
    the recommended decay rate for optimal power-stability trade-offs. The sequence
    incorporates both logarithmic and exponential decay components to balance
    early power with long-term FDR control.

    The sequence is defined as:
    γⱼ = c × α × log(max(j,2)) / (j × exp(√log(j)))

    Args:
        c: Normalization constant controlling the overall scale of the sequence.

    Attributes:
        c: The normalization constant.

    Examples:
        >>> # Standard LOND gamma sequence
        >>> gamma_seq = DefaultLondGammaSequence(c=0.07720838)
        >>> gamma_1 = gamma_seq.calc_gamma(1, alpha=0.05)
        >>> gamma_10 = gamma_seq.calc_gamma(10, alpha=0.05)
        >>> print(f"γ₁ = {gamma_1:.6f}, γ₁₀ = {gamma_10:.6f}")

    References:
        Javanmard, A., and A. Montanari (2018). "Online rules for control of false
        discovery rate and false discovery exceedance." Annals of Statistics,
        46(2):526-554.
    """

    def __init__(self, c: float):
        """Initialize the LOND gamma sequence.

        Args:
            c: Normalization constant. Recommended value: c ≈ 0.07720838.
        """
        super().__init__(c)

    def calc_gamma(self, j: int, **kwargs):
        """Calculate the gamma value for the j-th test.

        Args:
            j: Test index (1-based). 
            **kwargs: Must contain 'alpha' - the target FDR level.

        Returns:
            The gamma value for test j.

        Examples:
            >>> seq = DefaultLondGammaSequence(c=0.07720838)
            >>> gamma_5 = seq.calc_gamma(5, alpha=0.05)
        """
        alpha = kwargs.get("alpha")
        return (
            self.c
            * alpha
            * (math.log(max(j, 2)) / (j * math.exp(math.sqrt(math.log(j)))))
        )

Functions¶

`calc_gamma(j, **kwargs)` ¶

Calculate the gamma value for the j-th test.

Parameters:

Name	Type	Description	Default
`j`	`int`	Test index (1-based).	required
`**kwargs`		Must contain 'alpha' - the target FDR level.	`{}`

Returns:

Type	Description
	The gamma value for test j.

Examples:

>>> seq = DefaultLondGammaSequence(c=0.07720838)
>>> gamma_5 = seq.calc_gamma(5, alpha=0.05)

Source code in online_fdr/utils/sequence.py

def calc_gamma(self, j: int, **kwargs):
    """Calculate the gamma value for the j-th test.

    Args:
        j: Test index (1-based). 
        **kwargs: Must contain 'alpha' - the target FDR level.

    Returns:
        The gamma value for test j.

    Examples:
        >>> seq = DefaultLondGammaSequence(c=0.07720838)
        >>> gamma_5 = seq.calc_gamma(5, alpha=0.05)
    """
    alpha = kwargs.get("alpha")
    return (
        self.c
        * alpha
        * (math.log(max(j, 2)) / (j * math.exp(math.sqrt(math.log(j)))))
    )

Validation Functions¶

Input validation and error checking:

`online_fdr.utils.validity.check_p_val(p_val)` ¶

Validate that a p-value is in the valid range [0, 1].

Parameters:

Name	Type	Description	Default
`p_val`	`float`	The p-value to validate.	required

Raises:

Type	Description
`ValueError`	If p_val is not in [0, 1].

Examples:

>>> check_p_val(0.05)  # Valid - no exception
>>> check_p_val(1.5)   # Raises ValueError
Traceback (most recent call last):
ValueError: Given p-value must be between [0,1].

Source code in online_fdr/utils/validity.py

def check_p_val(p_val: float) -> None:
    """Validate that a p-value is in the valid range [0, 1].

    Args:
        p_val: The p-value to validate.

    Raises:
        ValueError: If p_val is not in [0, 1].

    Examples:
        >>> check_p_val(0.05)  # Valid - no exception
        >>> check_p_val(1.5)   # Raises ValueError
        Traceback (most recent call last):
        ValueError: Given p-value must be between [0,1].
    """
    if not 0 <= p_val <= 1:
        raise ValueError(
            """
            Given p-value must be between [0,1].
            """
        )

`online_fdr.utils.validity.check_alpha(p_val)` ¶

Validate that an alpha value is in the valid range (0, 1).

Parameters:

Name	Type	Description	Default
`p_val`	`float`	The alpha/significance level to validate.	required

Raises:

Type	Description
`ValueError`	If p_val is not in (0, 1).

Examples:

>>> check_alpha(0.05)  # Valid - no exception
>>> check_alpha(0.0)   # Raises ValueError (not > 0)
>>> check_alpha(1.0)   # Raises ValueError (not < 1)

Source code in online_fdr/utils/validity.py

def check_alpha(p_val: float) -> None:
    """Validate that an alpha value is in the valid range (0, 1).

    Args:
        p_val: The alpha/significance level to validate.

    Raises:
        ValueError: If p_val is not in (0, 1).

    Examples:
        >>> check_alpha(0.05)  # Valid - no exception
        >>> check_alpha(0.0)   # Raises ValueError (not > 0)
        >>> check_alpha(1.0)   # Raises ValueError (not < 1)
    """
    if not 0 < p_val < 1:
        raise ValueError(
            """
            Given alpha value must be between (0,1).
            """
        )

`online_fdr.utils.validity.check_initial_wealth(initial_wealth, alpha)` ¶

Validate that initial wealth is properly bounded relative to alpha.

In alpha-wealth procedures (LORD, SAFFRON, etc.), the initial wealth must be positive but less than the target FDR level to ensure proper wealth dynamics and FDR control.

Parameters:

Name	Type	Description	Default
`initial_wealth`	`float`	The initial alpha wealth.	required
`alpha`	`float`	The target FDR level.	required

Raises:

Type	Description
`ValueError`	If initial_wealth is not in (0, alpha).

Examples:

>>> check_initial_wealth(0.025, 0.05)  # Valid - no exception
>>> check_initial_wealth(0.1, 0.05)    # Raises ValueError (> alpha)
>>> check_initial_wealth(0.0, 0.05)    # Raises ValueError (not > 0)

Source code in online_fdr/utils/validity.py

def check_initial_wealth(initial_wealth: float, alpha: float) -> None:
    """Validate that initial wealth is properly bounded relative to alpha.

    In alpha-wealth procedures (LORD, SAFFRON, etc.), the initial wealth
    must be positive but less than the target FDR level to ensure proper
    wealth dynamics and FDR control.

    Args:
        initial_wealth: The initial alpha wealth.
        alpha: The target FDR level.

    Raises:
        ValueError: If initial_wealth is not in (0, alpha).

    Examples:
        >>> check_initial_wealth(0.025, 0.05)  # Valid - no exception
        >>> check_initial_wealth(0.1, 0.05)    # Raises ValueError (> alpha)
        >>> check_initial_wealth(0.0, 0.05)    # Raises ValueError (not > 0)
    """
    if not 0 < initial_wealth < alpha:
        raise ValueError(
            """
            The initial wealth should be between (0, alpha).
            """
        )

`online_fdr.utils.validity.check_candidate_threshold(lambda_)` ¶

Validate that a candidate threshold λ is in the valid range (0, 1).

The candidate threshold λ is used in SAFFRON and ADDIS algorithms to determine which p-values are considered "candidates" for rejection. It must be strictly between 0 and 1.

Parameters:

Name	Type	Description	Default
`lambda_`	`float`	The candidate threshold to validate.	required

Raises:

Type	Description
`ValueError`	If lambda_ is not in (0, 1).

Examples:

>>> check_candidate_threshold(0.5)  # Valid - no exception  
>>> check_candidate_threshold(0.0)  # Raises ValueError
>>> check_candidate_threshold(1.0)  # Raises ValueError

Source code in online_fdr/utils/validity.py

def check_candidate_threshold(lambda_: float) -> None:
    """Validate that a candidate threshold λ is in the valid range (0, 1).

    The candidate threshold λ is used in SAFFRON and ADDIS algorithms to
    determine which p-values are considered "candidates" for rejection.
    It must be strictly between 0 and 1.

    Args:
        lambda_: The candidate threshold to validate.

    Raises:
        ValueError: If lambda_ is not in (0, 1).

    Examples:
        >>> check_candidate_threshold(0.5)  # Valid - no exception  
        >>> check_candidate_threshold(0.0)  # Raises ValueError
        >>> check_candidate_threshold(1.0)  # Raises ValueError
    """
    if not 0 < lambda_ < 1:
        raise ValueError(
            """
            The candidate threshold (lambda) should be between (0, 1).
            """
        )

`online_fdr.utils.validity.check_wealth(wealth)` ¶

Validate that alpha wealth is positive (not depleted).

In alpha-wealth procedures, when wealth reaches zero or below, no more tests can be performed while maintaining FDR control. This function checks for wealth depletion.

Parameters:

Name	Type	Description	Default
`wealth`	`float`	Current alpha wealth level.	required

Raises:

Type	Description
`ValueError`	If wealth <= 0.

Examples:

>>> check_wealth(0.01)  # Valid - no exception
>>> check_wealth(0.0)   # Raises ValueError (wealth depleted)
>>> check_wealth(-0.1)  # Raises ValueError (negative wealth)

Source code in online_fdr/utils/validity.py

def check_wealth(wealth: float) -> None:
    """Validate that alpha wealth is positive (not depleted).

    In alpha-wealth procedures, when wealth reaches zero or below,
    no more tests can be performed while maintaining FDR control.
    This function checks for wealth depletion.

    Args:
        wealth: Current alpha wealth level.

    Raises:
        ValueError: If wealth <= 0.

    Examples:
        >>> check_wealth(0.01)  # Valid - no exception
        >>> check_wealth(0.0)   # Raises ValueError (wealth depleted)
        >>> check_wealth(-0.1)  # Raises ValueError (negative wealth)
    """
    if wealth <= 0:
        raise ValueError(
            """
            Alpha wealth depleted. Test execution stopped.
            """
        )

`online_fdr.utils.validity.check_decay_factor(decay_factor)` ¶

Validate that a decay factor is in the valid range (0, 1).

Decay factors are used in various algorithms to control the rate at which parameters decrease over time. They must be strictly between 0 and 1 for proper convergence behavior.

Parameters:

Name	Type	Description	Default
`decay_factor`	`float`	The decay factor to validate.	required

Raises:

Type	Description
`ValueError`	If decay_factor is not in (0, 1).

Examples:

>>> check_decay_factor(0.9)   # Valid - no exception
>>> check_decay_factor(0.0)   # Raises ValueError (not > 0)
>>> check_decay_factor(1.0)   # Raises ValueError (not < 1)
>>> check_decay_factor(1.5)   # Raises ValueError (not < 1)

Source code in online_fdr/utils/validity.py

def check_decay_factor(decay_factor: float) -> None:
    """Validate that a decay factor is in the valid range (0, 1).

    Decay factors are used in various algorithms to control the rate
    at which parameters decrease over time. They must be strictly
    between 0 and 1 for proper convergence behavior.

    Args:
        decay_factor: The decay factor to validate.

    Raises:
        ValueError: If decay_factor is not in (0, 1).

    Examples:
        >>> check_decay_factor(0.9)   # Valid - no exception
        >>> check_decay_factor(0.0)   # Raises ValueError (not > 0)
        >>> check_decay_factor(1.0)   # Raises ValueError (not < 1)
        >>> check_decay_factor(1.5)   # Raises ValueError (not < 1)
    """
    if not 0 < decay_factor < 1:
        raise ValueError(
            """
            Decay factor must be between (0,1).
            """
        )

Overview¶

Static Procedures¶

The static module contains implementations of classical FDR control procedures that form the building blocks for online and batch methods:

Benjamini-Hochberg (BH): The foundational FDR procedure
Storey-BH: Enhanced power through π₀ estimation
Benjamini-Yekutieli (BY): FDR control under arbitrary dependence

These functions are used internally by batch testing methods and can also be used directly for offline multiple testing.

Gamma Sequences¶

Gamma sequences determine how alpha budget is allocated over time in online procedures. Different algorithms require different sequence properties:

SAFFRON sequences: Power-law decay optimized for adaptive procedures
LORD sequences: Logarithmic decay for wealth-based procedures
LOND sequences: Complex decay balancing early/late power

Validation Framework¶

The validation module ensures input correctness and provides informative error messages:

Range validation: P-values in [0,1], alpha in (0,1)
Relationship validation: Initial wealth < alpha
State validation: Wealth depletion detection

Usage Examples¶

Static Procedures¶

from online_fdr.utils.static import bh, storey_bh, by

# Sample p-values  
p_values = [0.001, 0.02, 0.15, 0.8, 0.9]
alpha = 0.05

# Standard Benjamini-Hochberg
num_rej_bh, threshold_bh = bh(p_values, alpha)
print(f"BH: {num_rej_bh} rejections at threshold {threshold_bh:.4f}")

# Storey-BH with π₀ estimation
num_rej_storey, threshold_storey = storey_bh(p_values, alpha, lambda_=0.5)
print(f"Storey-BH: {num_rej_storey} rejections at threshold {threshold_storey:.4f}")

# Benjamini-Yekutieli (under dependence)
num_rej_by, threshold_by = by(p_values, alpha)  
print(f"BY: {num_rej_by} rejections at threshold {threshold_by:.4f}")

Gamma Sequences¶

from online_fdr.utils.sequence import DefaultSaffronGammaSequence

# Create SAFFRON gamma sequence
gamma_seq = DefaultSaffronGammaSequence(gamma_exp=1.6, c=0.4374901658)

# Generate first 10 gamma values
gamma_values = [gamma_seq.calc_gamma(j) for j in range(1, 11)]
print("SAFFRON gamma sequence:", gamma_values)

# Show decay behavior
import matplotlib.pyplot as plt
j_values = range(1, 101)
gamma_values = [gamma_seq.calc_gamma(j) for j in j_values]
plt.loglog(j_values, gamma_values)
plt.xlabel("Position j")
plt.ylabel("Gamma value")
plt.title("SAFFRON Gamma Sequence Decay")

Validation Functions¶

from online_fdr.utils.validity import check_p_val, check_alpha, check_initial_wealth

# Validate inputs
try:
    check_p_val(0.05)           # Valid p-value
    check_alpha(0.05)           # Valid alpha  
    check_initial_wealth(0.025, 0.05)  # Valid wealth < alpha
    print("All validations passed")
except ValueError as e:
    print(f"Validation error: {e}")

# Example of validation failure
try:
    check_p_val(1.5)            # Invalid p-value > 1
except ValueError as e:
    print(f"P-value validation failed: {e}")

Algorithm Support¶

How Utilities Support Online Algorithms¶

Static procedures provide the core multiple testing logic
Gamma sequences control alpha allocation strategies
Validation functions ensure algorithmic correctness
Data utilities support testing and evaluation

Customization and Extension¶

The utility framework supports customization:

Custom gamma sequences: Inherit from AbstractGammaSequence
Custom spending functions: Implement AbstractSpendFunc
Custom static procedures: Follow the standard interface
Additional validation: Extend existing validation functions

Best Practices¶

Static Procedure Usage¶

Choose appropriate procedure based on dependence assumptions
Handle edge cases (empty input, all p-values = 1)
Validate inputs before calling procedures
Understand threshold interpretation for downstream usage

Gamma Sequence Selection¶

Use algorithm-specific defaults when available
Consider decay rate for your application timeline
Validate convergence properties for custom sequences
Test empirical performance versus theoretical optimal

Validation Integration¶

Validate early and often in computational pipelines
Provide informative error messages to users
Handle validation gracefully in production code
Document validation requirements for API users

Performance Considerations¶

Computational Complexity¶

Static procedures: O(n log n) due to sorting
Gamma sequences: O(1) per value computation
Validation: O(1) per check

Memory Usage¶

Static procedures: O(n) for input storage
Gamma sequences: O(1) stateless computation
Validation: Minimal memory footprint

Optimization Tips¶

Pre-sort p-values when calling static procedures repeatedly
Cache gamma values for frequently accessed positions
Batch validation when checking many values
Use numpy arrays for vectorized operations when possible

References¶

Static Procedures: - Benjamini, Y., and Y. Hochberg (1995). "Controlling the False Discovery Rate" - Storey, J. D. (2002). "A direct approach to false discovery rates"
- Benjamini, Y., and D. Yekutieli (2001). "Control under dependency"

Gamma Sequences: - Ramdas, A., et al. (2018). "SAFFRON: adaptive algorithm for online FDR control" - Javanmard, A., and A. Montanari (2018). "Online rules for FDR control"

Validation Theory: - Wassmer, G., and W. Brannath (2016). "Group Sequential and Confirmatory Adaptive Designs"

Utility Functions¶

Static Procedures¶

online_fdr.utils.static.bh(p_vals, alpha) ¶

online_fdr.utils.static.storey_bh(p_vals, alpha, lambda_) ¶

online_fdr.utils.static.by(p_vals, alpha) ¶

Gamma Sequences¶

online_fdr.utils.sequence.DefaultSaffronGammaSequence ¶

Functions¶

calc_gamma(j, *args) ¶

online_fdr.utils.sequence.DefaultLordGammaSequence ¶

Functions¶

calc_gamma(j, **kwargs) ¶

online_fdr.utils.sequence.DefaultLondGammaSequence ¶

Functions¶

calc_gamma(j, **kwargs) ¶

Validation Functions¶

online_fdr.utils.validity.check_p_val(p_val) ¶

online_fdr.utils.validity.check_alpha(p_val) ¶

online_fdr.utils.validity.check_initial_wealth(initial_wealth, alpha) ¶

online_fdr.utils.validity.check_candidate_threshold(lambda_) ¶

online_fdr.utils.validity.check_wealth(wealth) ¶

online_fdr.utils.validity.check_decay_factor(decay_factor) ¶

Overview¶

Static Procedures¶

Gamma Sequences¶

Validation Framework¶

Usage Examples¶

Static Procedures¶

Gamma Sequences¶

Validation Functions¶

Algorithm Support¶

How Utilities Support Online Algorithms¶

Customization and Extension¶

Best Practices¶

Static Procedure Usage¶

Gamma Sequence Selection¶

Validation Integration¶

Performance Considerations¶

Computational Complexity¶

Memory Usage¶

Optimization Tips¶

References¶

`online_fdr.utils.static.bh(p_vals, alpha)` ¶

`online_fdr.utils.static.storey_bh(p_vals, alpha, lambda_)` ¶

`online_fdr.utils.static.by(p_vals, alpha)` ¶

`online_fdr.utils.sequence.DefaultSaffronGammaSequence` ¶

`calc_gamma(j, *args)` ¶

`online_fdr.utils.sequence.DefaultLordGammaSequence` ¶

`calc_gamma(j, **kwargs)` ¶

`online_fdr.utils.sequence.DefaultLondGammaSequence` ¶

`calc_gamma(j, **kwargs)` ¶

`online_fdr.utils.validity.check_p_val(p_val)` ¶

`online_fdr.utils.validity.check_alpha(p_val)` ¶

`online_fdr.utils.validity.check_initial_wealth(initial_wealth, alpha)` ¶

`online_fdr.utils.validity.check_candidate_threshold(lambda_)` ¶

`online_fdr.utils.validity.check_wealth(wealth)` ¶

`online_fdr.utils.validity.check_decay_factor(decay_factor)` ¶