LOND: Levels based On Number of Discoveries¶
LOND (significance Levels based On Number of Discoveries) is one of the first procedures for online false discovery rate (FDR) control, where significance levels are multiplied by the number of discoveries made so far.
Original Papers
Javanmard, A., and Montanari, A. "On online control of false discovery rate." arXiv preprint arXiv:1502.06197, 2015.
Javanmard, A., and Montanari, A. "Online rules for control of false discovery rate and false discovery exceedance." Annals of Statistics, 46(2):526-554, 2018.
Overview¶
Historical Importance¶
LOND was one of the first procedures to successfully control FDR in the online setting, introduced by Javanmard and Montanari (2015) to address the challenge of sequential hypothesis testing with FDR guarantees.
Algorithm Principle¶
LOND is conceptually simple: test levels are multiplied by the number of rejections made thus far. The more discoveries you make, the higher your future rejection thresholds become, creating a self-reinforcing discovery process.
Key Limitation¶
While LOND provably controls FDR, it has a significant drawback: unless many discoveries are made early, the adjusted significance levels quickly approach zero, leading to very low power. This motivated the development of LORD and other alpha-investing procedures.
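To make this decay concrete, here is a minimal sketch (not the library's code) of how LOND's threshold evolves. It assumes a common choice of gamma sequence, γ_t = 1/(t(t+1)), which sums to 1 and is scaled by α:

```python
def lond_thresholds(p_values, alpha=0.05):
    """Return the LOND threshold used at each step for the given p-values."""
    thresholds = []
    num_reject = 0
    for t, p in enumerate(p_values, start=1):
        gamma_t = 1.0 / (t * (t + 1))            # sums to 1 over t = 1, 2, ...
        alpha_t = alpha * gamma_t * (num_reject + 1)
        thresholds.append(alpha_t)
        if p <= alpha_t:                          # a discovery raises later thresholds
            num_reject += 1
    return thresholds

# Without early discoveries the threshold collapses with the gamma sequence:
no_hits = lond_thresholds([0.5] * 5)
# A single early discovery keeps every later threshold twice as high:
early_hit = lond_thresholds([0.001] + [0.5] * 4)
print(no_hits)
print(early_hit)
```

Reordering the same p-values so the small one comes first doubles every subsequent threshold, which is exactly the "discovery momentum" that LOND depends on.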
Class Reference¶
online_fdr.investing.lond.lond.Lond¶
Bases: AbstractSequentialTest
LOND: Levels based On Number of Discoveries for online FDR control.
LOND is one of the first procedures for online false discovery rate (FDR) control, where significance levels are adjusted based on the number of discoveries made so far. It is a relatively simple algorithm where test levels are multiplied by the number of rejections up to the current time.
While LOND provably controls the FDR, it has a significant limitation: unless many discoveries are made early, the adjusted significance levels quickly approach zero, leading to very low power. This motivated the development of LORD procedures that use "alpha investing" to maintain better power over time.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
alpha | float | Target FDR level (e.g., 0.05 for 5% FDR). Must be in (0, 1). | required |
original | bool | If True, use original LOND formulation (num_reject + 1). If False, use modified version max(num_reject, 1). Default is True. | True |
dependent | bool | If True, apply correction for arbitrary dependence using harmonic series. If False, assume independence/positive dependence. Default is False. | False |
Attributes:
Name | Type | Description |
---|---|---|
alpha0 | float | Original target FDR level. |
num_test | int | Number of hypotheses tested so far. |
num_reject | int | Number of hypotheses rejected so far. |
original | bool | Whether to use original LOND formulation. |
dependent | bool | Whether to apply dependence correction. |
Examples:
>>> # Basic usage
>>> lond = Lond(alpha=0.05)
>>> decision = lond.test_one(0.01) # Test a small p-value
>>> print(f"Rejected: {decision}")
>>> # For dependent p-values
>>> lond_dep = Lond(alpha=0.05, dependent=True)
>>> decisions = [lond_dep.test_one(p) for p in [0.001, 0.3, 0.02]]
Note
LOND is primarily of historical importance as one of the first online FDR methods. For practical applications, consider using LORD, SAFFRON, or ADDIS which typically achieve higher power.
References
Javanmard, A., and Montanari, A. (2015). "On online control of false discovery rate." arXiv preprint arXiv:1502.06197.
Javanmard, A., and Montanari, A. (2018). "Online rules for control of false discovery rate and false discovery exceedance." Annals of Statistics, 46(2):526-554.
Source code in online_fdr/investing/lond/lond.py
Functions¶
test_one(p_val)¶
Test a single p-value using the LOND procedure.
The LOND algorithm processes p-values sequentially:

1. Calculate the base significance level using the gamma sequence
2. Apply the dependence correction if enabled (harmonic series)
3. Multiply by the number of discoveries (+ 1 for the original version)
4. Reject if the p-value ≤ threshold, and update the discovery count
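The steps above can be sketched as a standalone class (illustrative only, not the library's source). The gamma sequence γ_t = 6/(π²t²) is an assumption here, chosen because it sums to 1:

```python
import math

class LondSketch:
    """Illustrative re-implementation of the LOND testing steps."""

    def __init__(self, alpha=0.05, original=True, dependent=False):
        self.alpha0 = alpha
        self.num_test = 0
        self.num_reject = 0
        self.original = original
        self.dependent = dependent

    def test_one(self, p_val):
        if not 0 <= p_val <= 1:
            raise ValueError("p_val must be in [0, 1]")
        self.num_test += 1
        t = self.num_test
        # 1. Base level from the gamma sequence.
        level = self.alpha0 * 6 / (math.pi ** 2 * t ** 2)
        # 2. Harmonic correction under arbitrary dependence.
        if self.dependent:
            level /= sum(1 / j for j in range(1, t + 1))
        # 3. Scale by the number of discoveries so far.
        factor = self.num_reject + 1 if self.original else max(self.num_reject, 1)
        level *= factor
        # 4. Reject and update the discovery count.
        reject = p_val <= level
        self.num_reject += int(reject)
        return reject

lond = LondSketch(alpha=0.05)
print([lond.test_one(p) for p in [0.001, 0.3, 0.02]])  # → [True, False, False]
```

With this gamma sequence the first threshold is 0.05 · 6/π² ≈ 0.030, so the first p-value is a discovery; the later thresholds are doubled by it but still too small for 0.3 and 0.02.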
Parameters:
Name | Type | Description | Default |
---|---|---|---|
p_val | float | P-value to test. Must be in [0, 1]. | required |
Returns:
Type | Description |
---|---|
bool | True if the null hypothesis is rejected (discovery), False otherwise. |
Raises:
Type | Description |
---|---|
ValueError | If p_val is not in [0, 1]. |
Examples:
>>> lond = Lond(alpha=0.05)
>>> lond.test_one(0.001) # First test, small p-value
True
>>> lond.test_one(0.04) # Second test, higher threshold after discovery
True
>>> lond.test_one(0.04) # Third test, gamma decay outweighs the extra discovery
False
Note
The threshold increases with each discovery, but decreases rapidly if no discoveries are made early on, leading to low power.
Usage Examples¶
Basic Usage¶
from online_fdr.investing.lond.lond import Lond
# Create LOND instance
lond = Lond(alpha=0.05)
# Test individual p-values
p_values = [0.001, 0.15, 0.03, 0.8, 0.02, 0.45, 0.006]
print("LOND Online Testing:")
discoveries = []
for i, p_value in enumerate(p_values):
decision = lond.test_one(p_value)
if decision:
discoveries.append(i + 1)
print(f"✓ Test {i+1}: p={p_value:.3f} → DISCOVERY! (total: {lond.num_reject})")
else:
print(f" Test {i+1}: p={p_value:.3f} → no rejection (threshold: {lond.alpha:.6f})")
print(f"\nTotal discoveries: {len(discoveries)}")
print(f"Discovery indices: {discoveries}")
Understanding the Discovery Momentum¶
def demonstrate_discovery_momentum():
"""Show how LOND's power depends on early discoveries."""
# Scenario 1: Early discoveries
print("Scenario 1: Early Discoveries")
print("=" * 35)
lond1 = Lond(alpha=0.1) # Higher alpha for better visibility
early_discoveries = [0.001, 0.005, 0.02, 0.8, 0.3, 0.04, 0.2]
for i, p_val in enumerate(early_discoveries, 1):
decision = lond1.test_one(p_val)
print(f"Test {i}: p={p_val:.3f} → {'REJECT' if decision else 'ACCEPT'} "
f"(threshold: {lond1.alpha:.6f}, discoveries: {lond1.num_reject})")
print(f"Final discoveries: {lond1.num_reject}\n")
# Scenario 2: No early discoveries
print("Scenario 2: No Early Discoveries")
print("=" * 37)
lond2 = Lond(alpha=0.1)
no_early = [0.8, 0.9, 0.7, 0.001, 0.005, 0.02, 0.04] # Same p-values, reordered
for i, p_val in enumerate(no_early, 1):
decision = lond2.test_one(p_val)
print(f"Test {i}: p={p_val:.3f} → {'REJECT' if decision else 'ACCEPT'} "
f"(threshold: {lond2.alpha:.6f}, discoveries: {lond2.num_reject})")
print(f"Final discoveries: {lond2.num_reject}")
print(f"\nPower difference due to ordering: {lond1.num_reject - lond2.num_reject} discoveries")
demonstrate_discovery_momentum()
Handling Dependent Data¶
def lond_with_dependence():
"""Compare LOND variants for different dependence assumptions."""
print("LOND Dependence Handling:")
print("=" * 30)
# Test p-values with some correlation structure
p_values = [0.01, 0.02, 0.015, 0.8, 0.03, 0.9, 0.025, 0.7]
# Independent version
lond_indep = Lond(alpha=0.05, dependent=False)
indep_decisions = [lond_indep.test_one(p) for p in p_values]
# Dependent version (with harmonic correction)
lond_dep = Lond(alpha=0.05, dependent=True)
dep_decisions = [lond_dep.test_one(p) for p in p_values]
print("Results:")
print(f"Independent assumption: {sum(indep_decisions)} discoveries")
print(f"Dependent correction: {sum(dep_decisions)} discoveries")
print(f"Dependent version is more conservative as expected")
# Show threshold differences
print(f"\nFinal thresholds:")
print(f"Independent: {lond_indep.alpha:.6f}")
print(f"Dependent: {lond_dep.alpha:.6f}")
lond_with_dependence()
Original vs Modified Formulation¶
def compare_lond_variants():
"""Compare original and modified LOND formulations."""
print("LOND Formulation Comparison:")
print("=" * 32)
p_values = [0.002, 0.8, 0.01, 0.9, 0.005, 0.7, 0.015]
# Original: uses (num_reject + 1) in threshold
lond_orig = Lond(alpha=0.05, original=True)
# Modified: uses max(num_reject, 1) in threshold
lond_mod = Lond(alpha=0.05, original=False)
print("Test | P-value | Original | Modified")
print("-" * 35)
for i, p_val in enumerate(p_values, 1):
orig_decision = lond_orig.test_one(p_val)
mod_decision = lond_mod.test_one(p_val)
print(f"{i:4d} | {p_val:7.3f} | {'REJECT' if orig_decision else 'ACCEPT':>8} | "
f"{'REJECT' if mod_decision else 'ACCEPT':>8}")
print(f"\nOriginal formulation: {lond_orig.num_reject} discoveries")
print(f"Modified formulation: {lond_mod.num_reject} discoveries")
compare_lond_variants()
Performance Evaluation¶
from online_fdr.utils.generation import DataGenerator, GaussianLocationModel
def evaluate_lond_performance():
"""Evaluate LOND on simulated data."""
print("LOND Performance Evaluation:")
print("=" * 32)
# Generate realistic test scenario
dgp = GaussianLocationModel(alt_mean=2.0, alt_std=1.0, one_sided=True)
generator = DataGenerator(n=100, pi0=0.9, dgp=dgp) # 90% nulls
# Create LOND instance
lond = Lond(alpha=0.05)
# Simulate testing
true_positives = 0
false_positives = 0
total_tests = 50
print(f"Testing {total_tests} hypotheses (10% alternatives expected):")
print()
for i in range(total_tests):
p_value, is_alternative = generator.sample_one()
decision = lond.test_one(p_value)
if decision:
if is_alternative:
true_positives += 1
result = "TRUE discovery ✓"
else:
false_positives += 1
result = "FALSE discovery ✗"
truth = "ALT" if is_alternative else "NULL"
print(f"Test {i+1:2d}: p={p_value:.3f} ({truth}) → REJECT ({result})")
# Calculate metrics
total_discoveries = true_positives + false_positives
empirical_fdr = false_positives / max(total_discoveries, 1)
print(f"\nPerformance Summary:")
print(f"Total discoveries: {total_discoveries}")
print(f"True positives: {true_positives}")
print(f"False positives: {false_positives}")
print(f"Empirical FDR: {empirical_fdr:.3f}")
print(f"Target FDR: {lond.alpha0}")
print(f"FDR controlled: {'✓' if empirical_fdr <= lond.alpha0 else '✗'}")
evaluate_lond_performance()
Mathematical Foundation¶
Threshold Formula¶
For test t, LOND sets the rejection threshold as:

α_t = γ_t · (R_{t−1} + 1)   (original), or α_t = γ_t · max(R_{t−1}, 1)   (modified)

where:

- γ_t is from a gamma sequence with Σ γ_t ≤ α
- R_{t−1} is the number of rejections before time t
Dependence Correction¶
For arbitrarily dependent p-values, LOND applies the correction:

α̃_t = α_t / H_t

where H_t = Σ_{j=1}^{t} 1/j is the t-th harmonic number.
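To get a feel for how conservative this correction is, the following snippet computes H_t for a few values of t; since H_t grows like ln(t) + 0.577, the corrected thresholds shrink by roughly a factor of ln(t):

```python
import math

def harmonic(t):
    """The t-th harmonic number H_t = 1 + 1/2 + ... + 1/t."""
    return sum(1.0 / j for j in range(1, t + 1))

for t in (10, 100, 1000):
    # H_t tracks ln(t) plus the Euler-Mascheroni constant (~0.577)
    print(f"t={t:4d}  H_t={harmonic(t):.3f}  ln(t)={math.log(t):.3f}")
```

By t = 1000 the correction already divides the base level by about 7.5, which is why the dependent variant is noticeably more conservative in the examples above.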
FDR Guarantee¶
Theorem (LOND FDR Control):

- For independent p-values: LOND controls FDR at level α
- For positively dependent (PRDS) p-values: LOND controls FDR at level α
- For arbitrary dependence: LOND with harmonic correction controls FDR at level α
Comparison with Other Methods¶
LOND vs LORD vs SAFFRON¶
Method | Adaptation | Power | Complexity | FDR Control |
---|---|---|---|---|
LOND | None | Low (without early discoveries) | Simple | ✓ |
LORD | Timing-based | Medium | Moderate | ✓ |
SAFFRON | Null proportion | High | Moderate | ✓ |
When to Use LOND¶
Appropriate Use Cases
- Historical studies: Understanding the evolution of online FDR methods
- Educational purposes: Learning basic online testing concepts
- Baseline comparisons: Simple benchmark for other methods
- Very sparse alternatives: When few discoveries are expected
Not Recommended For
- Practical applications: Better methods are available (SAFFRON, ADDIS)
- Unknown discovery patterns: Non-adaptive nature is limiting
- High-power requirements: Performance degrades without early discoveries
Best Practices¶
Parameter Selection¶
Alpha Selection
- Use standard values (0.05, 0.1) for comparability
- Higher α may be needed to see any discoveries with LOND
Dependence Setting

- dependent=False: for independent or positively dependent tests
- dependent=True: conservative choice for unknown dependence; note that the harmonic correction is quite conservative
Improving LOND Performance¶
- Ensure early discoveries: LOND works best when early tests are promising
- Consider pre-filtering: Remove obviously null hypotheses
- Use as baseline: Compare against more powerful methods
- Understand limitations: Expect lower power than adaptive methods
References¶
- Javanmard, A., and Montanari, A. (2015). "On online control of false discovery rate." arXiv preprint arXiv:1502.06197.
- Javanmard, A., and Montanari, A. (2018). "Online rules for control of false discovery rate and false discovery exceedance." Annals of Statistics, 46(2):526-554.
- Benjamini, Y., and Hochberg, Y. (1995). "Controlling the false discovery rate: A practical and powerful approach to multiple testing." Journal of the Royal Statistical Society: Series B, 57(1):289-300.