GAI: Generalized Alpha-Investing¶

GAI (Generalized Alpha-Investing) extends the original alpha-investing procedure of Foster and Stine (2008) for sequential control of expected false discoveries, using SAFFRON-style update rules for improved power.

Original Papers

Foster, D., and R. Stine. "α-investing: a procedure for sequential control of expected false discoveries." Journal of the Royal Statistical Society (Series B), 70(2):429-444, 2008.

Ramdas, A., T. Zrnic, M. J. Wainwright, and M. I. Jordan. "SAFFRON: an adaptive algorithm for online control of the FDR." Proceedings of the 35th International Conference on Machine Learning (ICML), 2018.

Overview¶

The Alpha-Investing Revolution¶

Alpha-investing changed how sequential hypothesis tests are budgeted: instead of testing every hypothesis at a fixed significance level, the procedure spends from a pool of alpha-wealth and earns wealth back whenever it makes a discovery. Successful discoveries therefore raise the thresholds available to later tests.

Key Innovation¶

The fundamental insight is that when you reject a null hypothesis, you gain evidence that not all hypotheses are null, justifying spending more α-wealth on future tests. This creates a virtuous cycle where discoveries beget more discoveries.

GAI Enhancement¶

This implementation combines the original alpha-investing philosophy with SAFFRON's gamma sequence and update rules, providing a more principled approach to wealth allocation while maintaining the core alpha-investing benefits.

Class Reference¶

`online_fdr.investing.alpha.alpha.Gai` ¶

Bases: AbstractSequentialTest

GAI: Generalized Alpha-Investing for online FDR control with SAFFRON updates.

Generalized Alpha-Investing (GAI) extends the original alpha-investing procedure of Foster and Stine (2008) for sequential control of expected false discoveries. This implementation uses SAFFRON-style update rules for improved power while maintaining the core alpha-investing philosophy.

Alpha-investing resembles alpha-spending but with a key difference: when a test rejects a null hypothesis, the procedure earns additional probability toward subsequent tests. This allows incorporation of domain knowledge and improved power over non-adaptive methods.

The GAI framework has become fundamental for online hypothesis testing, providing a robust, computationally efficient approach that requires no parametric assumptions about underlying null and alternative distributions.

Parameters:

Name	Type	Description	Default
`alpha`	`float`	Target FDR level (e.g., 0.05 for 5% FDR). Must be in (0, 1).	required
`wealth`	`float`	Initial alpha-wealth for purchasing rejection thresholds. Must satisfy 0 ≤ wealth ≤ alpha.	required

Attributes:

Name	Type	Description
`alpha0`	`float`	Original target FDR level.
`wealth0`	`float`	Initial wealth allocation.
`num_test`	`int`	Number of hypotheses tested so far.
`candidates`	`list[bool]`	Boolean list indicating which tests were candidates.
`reject_idx`	`list[int]`	Indices of rejected hypotheses.

Examples:

>>> # Basic usage
>>> gai = Gai(alpha=0.05, wealth=0.025)
>>> decision = gai.test_one(0.01)  # Test a small p-value
>>> print(f"Rejected: {decision}")

>>> # Sequential testing with wealth dynamics
>>> p_values = [0.001, 0.3, 0.02, 0.8, 0.005]
>>> decisions = [gai.test_one(p) for p in p_values]
>>> discoveries = sum(decisions)

References

Foster, D., and R. Stine (2008). "α-investing: a procedure for sequential control of expected false discoveries." Journal of the Royal Statistical Society (Series B), 70(2):429-444.

Ramdas, A., T. Zrnic, M. J. Wainwright, and M. I. Jordan (2018). "SAFFRON: an adaptive algorithm for online control of the FDR." Proceedings of the 35th International Conference on Machine Learning (ICML), Proceedings of Machine Learning Research, vol. 80, pp. 4286-4294, PMLR.

Source code in online_fdr/investing/alpha/alpha.py

class Gai(AbstractSequentialTest):
    """GAI: Generalized Alpha-Investing for online FDR control with SAFFRON updates.

    Generalized Alpha-Investing (GAI) extends the original alpha-investing procedure 
    of Foster and Stine (2008) for sequential control of expected false discoveries.
    This implementation uses SAFFRON-style update rules for improved power while 
    maintaining the core alpha-investing philosophy.

    Alpha-investing resembles alpha-spending but with a key difference: when a test 
    rejects a null hypothesis, the procedure earns additional probability toward 
    subsequent tests. This allows incorporation of domain knowledge and improved power 
    over non-adaptive methods.

    The GAI framework has become fundamental for online hypothesis testing, providing 
    a robust, computationally efficient approach that requires no parametric assumptions 
    about underlying null and alternative distributions.

    Args:
        alpha: Target FDR level (e.g., 0.05 for 5% FDR). Must be in (0, 1).
        wealth: Initial alpha-wealth for purchasing rejection thresholds.
                Must satisfy 0 ≤ wealth ≤ alpha.

    Attributes:
        alpha0: Original target FDR level.
        wealth0: Initial wealth allocation.
        num_test: Number of hypotheses tested so far.
        candidates: Boolean list indicating which tests were candidates.
        reject_idx: Indices of rejected hypotheses.

    Examples:
        >>> # Basic usage
        >>> gai = Gai(alpha=0.05, wealth=0.025)
        >>> decision = gai.test_one(0.01)  # Test a small p-value
        >>> print(f"Rejected: {decision}")

        >>> # Sequential testing with wealth dynamics
        >>> p_values = [0.001, 0.3, 0.02, 0.8, 0.005]
        >>> decisions = [gai.test_one(p) for p in p_values]
        >>> discoveries = sum(decisions)

    References:
        Foster, D., and R. Stine (2008). "α-investing: a procedure for sequential 
        control of expected false discoveries." Journal of the Royal Statistical 
        Society (Series B), 70(2):429-444.

        Ramdas, A., T. Zrnic, M. J. Wainwright, and M. I. Jordan (2018). 
        "SAFFRON: an adaptive algorithm for online control of the FDR." 
        Proceedings of the 35th International Conference on Machine Learning (ICML), 
        Proceedings of Machine Learning Research, vol. 80, pp. 4286-4294, PMLR.
    """

    def __init__(self, alpha: float, wealth: float):
        super().__init__(alpha)
        self.alpha0: float = alpha
        self.wealth0: float = wealth

        validity.check_initial_wealth(wealth, alpha)
        validity.check_candidate_threshold(alpha)

        self.num_test: int = 0
        self.candidates: list[bool] = []
        self.reject_idx: list[int] = []

        self.seq = DefaultSaffronGammaSequence(gamma_exp=1.6, c=0.4374901658)

    def test_one(self, p_val: float) -> bool:
        validity.check_p_val(p_val)
        self.num_test += 1
        self.alpha = self.calc_alpha_t()

        is_candidate = p_val <= self.alpha  # candidate
        self.candidates.append(is_candidate)

        is_rejected = p_val <= self.alpha  # rejection
        self.reject_idx.append(self.num_test) if is_rejected else None
        return is_rejected

    def calc_alpha_t(self):

        if self.num_test == 1:
            alpha_t = (
                self.seq.calc_gamma(1, None)  # fmt: skip
                * self.wealth0
            )
        else:
            alpha_t = self.wealth0 * self.seq.calc_gamma(
                self.num_test - sum(self.candidates), None
            )
            if len(self.reject_idx) >= 1:
                tau_1 = self.reject_idx[0]
                c_1_plus = sum(self.candidates[tau_1:])
                alpha_t += (self.alpha0 - self.wealth0) * self.seq.calc_gamma(
                    (self.num_test - tau_1 - c_1_plus), None
                )
            if len(self.reject_idx) >= 2:
                alpha_t += self.alpha0 * sum(
                    self.seq.calc_gamma(
                        (self.num_test - idx - sum(self.candidates[idx:])),
                        None,
                    )
                    for idx in self.reject_idx[1:]
                )
        return alpha_t / (1 + alpha_t)

Usage Examples¶

Basic Alpha-Investing¶

from online_fdr.investing.alpha.alpha import Gai

# Create GAI instance
gai = Gai(alpha=0.05, wealth=0.025)

# Test individual p-values
p_values = [0.001, 0.15, 0.03, 0.8, 0.02, 0.45, 0.006]

print("Generalized Alpha-Investing:")
discoveries = []

for i, p_value in enumerate(p_values):
    decision = gai.test_one(p_value)

    if decision:
        discoveries.append(i + 1)
        print(f"✓ Test {i+1}: p={p_value:.3f} → DISCOVERY!")
    else:
        print(f"  Test {i+1}: p={p_value:.3f} → no rejection")

print(f"\nTotal discoveries: {len(discoveries)}")
print(f"Discovery indices: {discoveries}")

Understanding Wealth Dynamics¶

from online_fdr.investing.alpha.alpha import Gai

def demonstrate_gai_mechanism():
    """Show how GAI thresholds move over a test sequence, next to a fixed level."""

    print("GAI vs Fixed-Level Testing:")
    print("=" * 35)

    test_sequence = [0.001, 0.8, 0.02, 0.9, 0.005, 0.7, 0.015]

    # GAI with dynamic thresholds
    gai = Gai(alpha=0.1, wealth=0.05)  # Higher values for visibility

    print("GAI (Dynamic Thresholds):")
    gai_discoveries = 0

    for i, p_val in enumerate(test_sequence, 1):
        # test_one() updates num_test, candidates and reject_idx itself,
        # and stores the threshold it just used in gai.alpha
        decision = gai.test_one(p_val)
        threshold = gai.alpha

        if decision:
            gai_discoveries += 1

        print(f"Test {i}: p={p_val:.3f}, threshold={threshold:.6f} → {'REJECT' if decision else 'ACCEPT'}")

    # Fixed-level testing for comparison (uncorrected, so it does not control FDR)
    print("\nFixed-Level (α=0.05, uncorrected):")
    fixed_discoveries = 0

    for i, p_val in enumerate(test_sequence, 1):
        decision = p_val <= 0.05
        if decision:
            fixed_discoveries += 1
        print(f"Test {i}: p={p_val:.3f}, threshold=0.050000 → {'REJECT' if decision else 'ACCEPT'}")

    print("\nComparison:")
    print(f"GAI discoveries: {gai_discoveries}")
    print(f"Fixed-level discoveries: {fixed_discoveries}")
    print(f"Difference (GAI - fixed): {gai_discoveries - fixed_discoveries}")

demonstrate_gai_mechanism()

Incorporating Prior Knowledge¶

def gai_with_prior_knowledge():
    """Demonstrate how GAI can incorporate domain knowledge."""

    print("GAI with Prior Knowledge:")
    print("=" * 30)

    # Simulate a scenario where you expect more promising tests later
    # (e.g., genomics where genes are ordered by biological relevance)

    # Early tests: mostly null
    early_tests = [0.8, 0.7, 0.9, 0.6, 0.75]

    # Later tests: mix of null and alternative  
    later_tests = [0.001, 0.02, 0.8, 0.005, 0.03, 0.9, 0.007]

    # Conservative start to preserve wealth for later promising tests
    gai = Gai(alpha=0.05, wealth=0.01)  # Lower initial wealth

    print("Early phase (expected mostly nulls):")
    early_discoveries = 0

    for i, p_val in enumerate(early_tests, 1):
        decision = gai.test_one(p_val)
        if decision:
            early_discoveries += 1
            print(f"✓ Test {i}: p={p_val:.3f} → DISCOVERY")
        else:
            print(f"  Test {i}: p={p_val:.3f} → no rejection (conservative)")

    print(f"\nTransition to promising region...")
    print("Later phase (expected more alternatives):")

    later_discoveries = 0
    for i, p_val in enumerate(later_tests, len(early_tests) + 1):
        decision = gai.test_one(p_val)
        if decision:
            later_discoveries += 1
            print(f"✓ Test {i}: p={p_val:.3f} → DISCOVERY")
        else:
            print(f"  Test {i}: p={p_val:.3f} → no rejection")

    print(f"\nResults:")
    print(f"Early discoveries: {early_discoveries}")
    print(f"Later discoveries: {later_discoveries}")
    print(f"Total discoveries: {early_discoveries + later_discoveries}")
    if later_discoveries > early_discoveries:
        print("Rejections were concentrated in the later, more promising region.")

gai_with_prior_knowledge()

Comparison with Other Alpha-Investing Methods¶

from online_fdr.investing.saffron.saffron import Saffron
from online_fdr.investing.lord.three import LordThree

def compare_alpha_investing_family():
    """Compare different alpha-investing approaches."""

    print("Alpha-Investing Family Comparison:")
    print("=" * 40)

    # Test sequence with varied difficulty
    p_values = [0.002, 0.8, 0.01, 0.9, 0.005, 0.7, 0.03, 0.6, 0.008]

    # Different alpha-investing approaches
    methods = {
        'GAI': Gai(alpha=0.05, wealth=0.025),
        'SAFFRON': Saffron(alpha=0.05, wealth=0.025, lambda_=0.5),
        'LORD 3': LordThree(alpha=0.05, wealth=0.025, reward=0.025)
    }

    results = {}

    for method_name, method in methods.items():
        decisions = [method.test_one(p) for p in p_values]
        discoveries = sum(decisions)
        discovery_indices = [i+1 for i, d in enumerate(decisions) if d]

        results[method_name] = {
            'discoveries': discoveries,
            'indices': discovery_indices
        }

        print(f"{method_name:>8}: {discoveries} discoveries at positions {discovery_indices}")

    return results

compare_alpha_investing_family()

Simulating Industrial A/B Testing¶

from online_fdr.utils.generation import DataGenerator, GaussianLocationModel

def simulate_ab_testing_with_gai():
    """Simulate GAI in an industrial A/B testing environment."""

    print("Industrial A/B Testing with GAI:")
    print("=" * 36)

    # Simulate A/B testing scenario:
    # - Many tests run simultaneously
    # - Most are null (no real effect)
    # - Some have real but small effects
    # - Occasional large effects

    dgp = GaussianLocationModel(alt_mean=1.5, alt_std=1.0, one_sided=True)
    generator = DataGenerator(n=200, pi0=0.95, dgp=dgp)  # 95% nulls (realistic)

    gai = Gai(alpha=0.05, wealth=0.025)

    true_positives = 0
    false_positives = 0
    n_alternatives = 0  # true effects among the tests actually run
    test_count = 0

    print("Running A/B tests sequentially...")

    # Simulate first 100 tests
    for i in range(100):
        p_value, is_alternative = generator.sample_one()
        decision = gai.test_one(p_value)
        test_count += 1
        n_alternatives += int(is_alternative)

        if decision:
            if is_alternative:
                true_positives += 1
                result_type = "TRUE effect ✓"
            else:
                false_positives += 1
                result_type = "FALSE alarm ✗"

            # Show significant results
            effect_type = "REAL" if is_alternative else "NULL"
            print(f"Test {i+1:3d}: p={p_value:.4f} ({effect_type}) → SIGNIFICANT ({result_type})")

    # Calculate business metrics from the tests that were actually run
    total_discoveries = true_positives + false_positives
    empirical_fdp = false_positives / max(total_discoveries, 1)
    power = true_positives / max(n_alternatives, 1)

    print(f"\nA/B Testing Campaign Results:")
    print(f"Total tests run: {test_count}")
    print(f"Significant effects found: {total_discoveries}")
    print(f"True discoveries (real effects): {true_positives}")
    print(f"False alarms: {false_positives}")
    print(f"False discovery proportion (this run): {empirical_fdp:.3f}")
    print(f"Power (true effects detected): {power:.3f}")
    print(f"Target FDR: {gai.alpha0}")
    # FDR is an expectation over runs; a single run can land above or below it
    print(f"FDP at or below target in this run: {'✓' if empirical_fdp <= gai.alpha0 else '✗'}")

    # Business interpretation
    print(f"\nBusiness Impact:")
    if true_positives > 0:
        print(f"✓ Found {true_positives} real improvements to implement")
    if false_positives > 0:
        print(f"⚠ {false_positives} false alarms would lead to shipping changes with no real effect")

    efficiency = true_positives / max(total_discoveries, 1)
    print(f"Discovery efficiency: {efficiency:.1%}")

simulate_ab_testing_with_gai()

Mathematical Foundation¶

Core Alpha-Investing Principle¶

The fundamental equation of alpha-investing is the wealth update rule:

\[W_{t+1} = W_t - \alpha_t + \text{payout} \cdot \mathbf{1}_{\text{discovery at } t}\]

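Here \(W_t\) is the alpha-wealth available before test \(t\), \(\alpha_t\) is the level spent on that test, and the payout is credited only when test \(t\) rejects, which replenishes wealth for future tests. This is a simplified form: Foster and Stine (2008) charge \(\alpha_t / (1 - \alpha_t)\) when a test does not reject.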
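The bookkeeping in the simplified update above can be sketched as follows. This is a minimal illustration, not part of `online_fdr`: the helper name `run_wealth_process`, the constant payout, and the rule of spending a fixed fraction of the current wealth are all assumptions made for the example.

```python
def run_wealth_process(p_values, initial_wealth, payout, spend_fraction=0.1):
    """Track alpha-wealth under W_{t+1} = W_t - alpha_t + payout * 1{discovery at t}.

    Hypothetical helper: each test spends a fixed fraction of the current wealth.
    """
    wealth = initial_wealth
    history = []
    for p in p_values:
        alpha_t = spend_fraction * wealth  # level bought for this test
        rejected = p <= alpha_t
        # Pay for the test, and collect the payout only on a discovery
        wealth = wealth - alpha_t + (payout if rejected else 0.0)
        history.append((alpha_t, rejected, wealth))
    return history


history = run_wealth_process(
    [0.001, 0.8, 0.004, 0.9], initial_wealth=0.05, payout=0.05
)
for t, (alpha_t, rejected, wealth) in enumerate(history, start=1):
    print(f"t={t}: alpha_t={alpha_t:.5f} rejected={rejected} wealth={wealth:.5f}")
```

Wealth shrinks after every non-rejection and jumps after every discovery, so the level available to the next test rises and falls with the discovery history.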

GAI Threshold Formula¶

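GAI uses a SAFFRON-style gamma sequence to set thresholds. Write \(W_0\) for the initial wealth, \(\tau_j\) for the time of the \(j\)-th rejection, and \(C_{j+}\) for the number of candidates observed after \(\tau_j\) and before test \(t\) (with \(C_{0+}\) counting all candidates before \(t\)). The `calc_alpha_t` method shown in the source above then computes:

\[\hat{\alpha}_t = W_0\,\gamma_{t - C_{0+}} + (\alpha - W_0)\,\gamma_{t - \tau_1 - C_{1+}} + \alpha \sum_{j \ge 2} \gamma_{t - \tau_j - C_{j+}}, \qquad \alpha_t = \frac{\hat{\alpha}_t}{1 + \hat{\alpha}_t}\]

A term involving \(\tau_j\) appears only once the \(j\)-th rejection has occurred. Each candidate lowers the corresponding gamma index by one, which raises that term, so the allocation adapts to the candidate history and the pattern of discoveries.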
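The threshold computation can be reproduced outside the library with a short standalone sketch that mirrors `calc_alpha_t` from the source listing. Two assumptions are made here: the gamma sequence is taken to be \(\gamma_j = c \cdot j^{-1.6}\) (the internals of `DefaultSaffronGammaSequence` are not shown on this page), and the names `gamma`, `gai_threshold` and `run` are illustrative only.

```python
GAMMA_EXP = 1.6
GAMMA_C = 0.4374901658  # constants passed to DefaultSaffronGammaSequence in the source


def gamma(j):
    """Assumed form of the gamma sequence: c * j**(-gamma_exp)."""
    return GAMMA_C / j ** GAMMA_EXP


def gai_threshold(t, alpha, wealth, candidates, reject_idx):
    """Threshold for test t (1-indexed) given the history of tests 1..t-1.

    candidates: list of bools, one per earlier test.
    reject_idx: 1-indexed positions of earlier rejections.
    """
    raw = wealth * gamma(t - sum(candidates))
    if reject_idx:
        tau_1 = reject_idx[0]
        raw += (alpha - wealth) * gamma(t - tau_1 - sum(candidates[tau_1:]))
    for tau in reject_idx[1:]:
        raw += alpha * gamma(t - tau - sum(candidates[tau:]))
    return raw / (1 + raw)


def run(p_values, alpha, wealth):
    """Apply the thresholds sequentially and return them with the rejections."""
    candidates, reject_idx, thresholds = [], [], []
    for t, p in enumerate(p_values, start=1):
        thr = gai_threshold(t, alpha, wealth, candidates, reject_idx)
        rejected = p <= thr
        candidates.append(rejected)  # in the source, candidate and rejection coincide
        if rejected:
            reject_idx.append(t)
        thresholds.append(thr)
    return thresholds, reject_idx


thresholds, rejections = run([0.001, 0.8, 0.02, 0.9, 0.005], alpha=0.05, wealth=0.025)
for t, thr in enumerate(thresholds, start=1):
    print(f"t={t}: threshold={thr:.6f}")
print("rejections at:", rejections)
```

In this run the rejection at the first test adds the \((\alpha - W_0)\) term, so the second threshold is larger than the first before the gamma sequence decays again.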

Theoretical Guarantees¶

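Theorem (Alpha-Investing FDR Control): Under independence of the p-values, GAI is designed to control the False Discovery Rate (FDR) at level α. The original procedure of Foster and Stine (2008) is stated for the marginal FDR (mFDR); the FDR guarantee under independence is the one established for SAFFRON-style rules by Ramdas et al. (2018).

The original proof of Foster and Stine rests on a martingale-type argument for the wealth process.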
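Since both FDR and mFDR appear on this page, it may help to see how the two are estimated from repeated runs. The sketch below assumes per-run counts of false discoveries \(V\) and total discoveries \(R\) are already available, and uses \(\text{FDR} = E[V / \max(R, 1)]\) and \(\text{mFDR}_\eta = E[V] / (E[R] + \eta)\); the function name and the example counts are made up for illustration.

```python
def fdr_and_mfdr(false_discoveries, discoveries, eta=1.0):
    """Estimate FDR and mFDR_eta from per-run counts.

    false_discoveries: V for each run.
    discoveries: R for each run (same length).
    """
    n = len(discoveries)
    # FDR averages the per-run false discovery proportion
    fdr = sum(v / max(r, 1) for v, r in zip(false_discoveries, discoveries)) / n
    # mFDR takes the ratio of the averages instead
    mean_v = sum(false_discoveries) / n
    mean_r = sum(discoveries) / n
    mfdr = mean_v / (mean_r + eta)
    return fdr, mfdr


# Hypothetical counts from four runs
fdr, mfdr = fdr_and_mfdr(false_discoveries=[0, 1, 0, 2], discoveries=[0, 1, 4, 5])
print(f"FDR estimate:  {fdr:.4f}")
print(f"mFDR estimate: {mfdr:.4f}")
```

The two quantities can differ noticeably when the number of discoveries varies a lot between runs, which is why a guarantee for one does not automatically carry over to the other.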

Historical Context and Evolution¶

Original Alpha-Investing (Foster & Stine, 2008)¶

Introduced the wealth-based paradigm
Controlled mFDR (marginal FDR) rather than FDR
Simple payout rules

Generalized Alpha-Investing¶

Extended to various payout schemes
Incorporated prior weights and penalties
Better theoretical understanding

Modern Variants (GAI)¶

SAFFRON-style update rules
Improved power characteristics
Maintains original philosophy with better performance

Best Practices¶

Parameter Selection¶

Wealth Selection Guidelines

Conservative: W₀ = α/4 (preserves wealth for later)
Moderate: W₀ = α/2 (balanced approach)
Aggressive: W₀ = α (spends wealth early)
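To see what these choices do in this implementation, the sketch below computes the first threshold and the extra weight \((\alpha - W_0)\) that the source adds after the first rejection. It assumes \(\gamma_1 = c\) with the constant used in the source listing; `first_threshold` is an illustrative helper, not a library function.

```python
GAMMA_1 = 0.4374901658  # assumed gamma_1 = c / 1**1.6, with c taken from the source


def first_threshold(wealth):
    """Threshold for the very first test: gamma_1 * W0, then x / (1 + x)."""
    raw = GAMMA_1 * wealth
    return raw / (1 + raw)


alpha = 0.05
rows = []
for label, wealth in [
    ("conservative", alpha / 4),
    ("moderate", alpha / 2),
    ("aggressive", alpha),
]:
    # (alpha - W0) is the weight of the term added after the first rejection
    rows.append((label, wealth, first_threshold(wealth), alpha - wealth))

for label, wealth, thr, bonus in rows:
    print(f"{label:>12}: W0={wealth:.4f}  first threshold={thr:.5f}  "
          f"first-rejection weight={bonus:.4f}")
```

A smaller \(W_0\) lowers the thresholds before the first rejection but increases the weight released by that rejection, while \(W_0 = \alpha\) front-loads everything and leaves no extra weight for the first discovery.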

Domain Knowledge Integration

Start conservatively if expecting null-heavy early tests
Higher initial wealth if early alternatives are expected
Consider test ordering when possible

When to Use GAI¶

Good Use Cases

Industrial A/B testing: Many tests, mostly null, need efficiency
Sequential screening: Tests arrive over time, immediate decisions needed
Prior knowledge: Can order tests by expected promise
Computational constraints: Simpler than adaptive methods

Consider Alternatives

Unknown π₀: SAFFRON adapts better to null proportion
Conservative nulls: ADDIS handles better
Batch setting: When all p-values are available at once, the Benjamini-Hochberg (BH) procedure is the standard choice
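For comparison with the batch setting mentioned above, here is a minimal sketch of the Benjamini-Hochberg step-up procedure. It is a plain-Python illustration written for this page (the function name is not from `online_fdr`), and it needs all p-values up front, which is exactly what the online setting does not allow.

```python
def benjamini_hochberg(p_values, alpha):
    """Return the 0-based indices rejected by the BH step-up procedure."""
    m = len(p_values)
    # Indices of the p-values in increasing order
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Largest rank whose p-value sits under the BH line alpha * rank / m
        if p_values[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


rejected = benjamini_hochberg(
    [0.001, 0.15, 0.03, 0.8, 0.02, 0.45, 0.006], alpha=0.05
)
print("BH rejects indices:", rejected)
```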

Common Pitfalls¶

Over-aggressive early spending: Leaves no wealth for later discoveries
Over-conservative start: Misses early opportunities
Ignoring test ordering: Random order wastes the alpha-investing advantage
Wrong wealth initialization: Mismatch with discovery expectations

References¶

Foster, D. P., and R. A. Stine (2008). "α-investing: a procedure for sequential control of expected false discoveries." Journal of the Royal Statistical Society: Series B, 70(2):429-444.
Ramdas, A., T. Zrnic, M. J. Wainwright, and M. I. Jordan (2018). "SAFFRON: an adaptive algorithm for online control of the FDR." Proceedings of the 35th International Conference on Machine Learning (ICML), PMLR, 80:4286-4294.
Aharoni, E., and S. Rosset (2014). "Generalized α-investing: definitions, optimality results and application to public databases." Journal of the Royal Statistical Society: Series B, 76(4):771-794.
Li, L., and J. G. Canner (2007). "Modified alpha-investing: a procedure for multiple testing with prior knowledge." Computational Statistics & Data Analysis, 51(7):3598-3607.