Online FDR: Online False Discovery Rate Control Algorithms
online-fdr is a comprehensive Python library for controlling False Discovery Rate (FDR) and Family-Wise Error Rate (FWER) in online multiple hypothesis testing scenarios. Unlike traditional methods that require all p-values upfront, this library provides truly online algorithms that make decisions sequentially as data arrives.
Why Online FDR Control?
In many modern applications, hypotheses arrive sequentially and decisions must be made in real-time:
- Clinical trials: Interim analyses as patient data accumulates, allowing early stopping or protocol modifications while maintaining statistical validity.
- A/B testing: Continuous experimentation in tech companies, where new variants are tested as they are developed and need immediate go/no-go decisions.
- Genomics: Sequential gene discovery studies, where new candidates are evaluated as they are identified through various screening methods.
- Finance: Real-time anomaly detection in trading systems, where suspicious patterns must be flagged as they occur.
- Product analytics: Ongoing feature testing and optimization, where changes in user behavior need rapid assessment for business decisions.
Key Features
- True Online Processing: Make immediate decisions without waiting for future data
- Rigorous Statistical Guarantees: Maintain FDR control under various dependency structures
- Unified API: Consistent interface across all methods with `test_one()` for sequential testing
- Comprehensive Method Coverage: State-of-the-art algorithms from recent literature
- Performance Optimized: Efficient implementations suitable for high-throughput applications
- Rich Documentation: Detailed mathematical explanations and practical examples
Quick Installation
Quick Start Example
from online_fdr.investing.addis.addis import Addis
from online_fdr.utils.generation import DataGenerator, GaussianLocationModel

# Initialize a data generator for demonstration
dgp = GaussianLocationModel(alt_mean=3.0, alt_std=1.0, one_sided=True)
generator = DataGenerator(n=1000, pi0=0.9, dgp=dgp)  # 10% alternatives

# Create an online FDR procedure
addis = Addis(alpha=0.05, wealth=0.025, lambda_=0.25, tau=0.5)

# Test hypotheses sequentially
discoveries = []
for i in range(100):
    p_value, label = generator.sample_one()
    is_discovery = addis.test_one(p_value)
    if is_discovery:
        discoveries.append(i)
        print(f"Discovery at test {i}: p-value = {p_value:.4f}")

print(f"Made {len(discoveries)} discoveries")
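The quick start samples a ground-truth `label` alongside each p-value but never uses it. In a simulation, that label can be used to check how many discoveries were false. The sketch below is self-contained and does not call the library. It assumes the label is truthy for a true alternative, which is an assumption about the generator's output rather than something confirmed here, and the helper name `false_discovery_proportion` is hypothetical.

```python
def false_discovery_proportion(decisions, is_alternative):
    """Fraction of discoveries made on true nulls (0.0 if there are no discoveries)."""
    # A false discovery is a rejection of a hypothesis that is actually null.
    false_discoveries = sum(
        1 for rejected, alt in zip(decisions, is_alternative) if rejected and not alt
    )
    total_discoveries = sum(1 for rejected in decisions if rejected)
    # Guard against division by zero when nothing was rejected.
    return false_discoveries / max(total_discoveries, 1)


# Hypothetical run: five tests, three discoveries, one of them on a true null.
decisions = [True, False, True, True, False]
is_alternative = [True, False, False, True, True]  # assumed label convention
fdp = false_discovery_proportion(decisions, is_alternative)
print(f"FDP = {fdp:.3f}")
```

A single run gives one realized proportion. FDR is the average of this quantity over repeated runs, so one run above the target level does not by itself indicate a violation.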
Available Methods
Sequential Testing (One-by-One)
Method Family | Methods | Best For |
---|---|---|
Alpha Investing | GAI, SAFFRON, ADDIS | High-throughput screening |
LORD | LORD3, LORD++, D-LORD, Discard, Memory Decay | Time series with trends |
LOND | LOND | Independent/weakly dependent p-values |
Alpha Spending | Bonferroni, LORD3 spending | Conservative control |
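To illustrate how a sequential method can decide one p-value at a time, here is a minimal sketch of a LOND-style rule: the test level at step t is a fixed weight times one plus the number of discoveries made so far. The class name `LondSketch` and the weight sequence 6 / (π² t²) are illustrative assumptions, not the library's implementation, which may use a different sequence and options.

```python
import math


class LondSketch:
    """Illustrative LOND-style rule; a sketch, not the library's class."""

    def __init__(self, alpha=0.05):
        self.alpha = alpha
        self.num_tests = 0
        self.num_discoveries = 0

    def _gamma(self, t):
        # Assumed weights: 6 / (pi^2 t^2) is nonnegative and sums to 1 over t = 1, 2, ...
        return 6.0 / (math.pi ** 2 * t ** 2)

    def test_one(self, p_value):
        self.num_tests += 1
        # The level scales with (discoveries so far + 1), so early
        # discoveries make later tests less stringent.
        level = self.alpha * self._gamma(self.num_tests) * (self.num_discoveries + 1)
        rejected = p_value <= level
        if rejected:
            self.num_discoveries += 1
        return rejected


lond = LondSketch(alpha=0.05)
decisions = [lond.test_one(p) for p in [0.0001, 0.3, 0.004, 0.6, 0.002]]
print(decisions)
```

Each decision depends only on the current p-value and the count of past discoveries, so no future data is needed.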
Batch Testing
Method | Description | Best For |
---|---|---|
BatchBH | Classic Benjamini-Hochberg | Independent p-values |
BatchStoreyBH | Adaptive Storey-BH procedure | Unknown null proportion |
BatchPRDS | Positive regression dependency | Positively correlated tests |
BatchBY | Benjamini-Yekutieli | Arbitrary dependence |
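For reference, the classic Benjamini-Hochberg step-up rule applied to a single batch can be sketched in a few lines of plain Python. This is a textbook version with a hypothetical function name, not the library's `BatchBH` class, whose interface and handling of multiple batches are not shown here.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where BH rejects at level alpha."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m (0 if none).
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject every hypothesis whose p-value is among the k_max smallest.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


batch = [0.001, 0.045, 0.012, 0.5, 0.025]
bh_decisions = benjamini_hochberg(batch, alpha=0.05)
print(bh_decisions)
```

The rule needs the whole batch before it can sort, which is the main contrast with the one-by-one methods above.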
Mathematical Guarantees
All implemented methods provide rigorous theoretical guarantees:
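FDR Control
For FDR control methods: \(\text{FDR} = \mathbb{E}[\text{FDP}] \leq \alpha\) under each method's stated conditions, where FDP is the proportion of discoveries that are false (taken as 0 when there are no discoveries)
FWER Control
For alpha spending methods: \(\text{FWER} = \mathbb{P}(V \geq 1) \leq \alpha\), where \(V\) is the number of false discoveries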
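The alpha spending guarantee follows from a union bound: if the per-test levels sum to at most α, the probability of at least one false rejection is at most α. Below is a minimal sketch of an online Bonferroni-style rule. The class name `AlphaSpendingSketch` and the spending sequence α · 6 / (π² t²) are illustrative assumptions, and the library's alpha spending classes may spend the budget differently.

```python
import math


class AlphaSpendingSketch:
    """Illustrative online Bonferroni-style rule; a sketch, not the library's class."""

    def __init__(self, alpha=0.05):
        self.alpha = alpha
        self.num_tests = 0
        self.spent = 0.0  # running sum of the levels used so far

    def test_one(self, p_value):
        self.num_tests += 1
        # Assumed spending sequence: the levels sum to alpha over t = 1, 2, ...
        level = self.alpha * 6.0 / (math.pi ** 2 * self.num_tests ** 2)
        self.spent += level
        return p_value <= level


spender = AlphaSpendingSketch(alpha=0.05)
spending_decisions = [spender.test_one(p) for p in [0.02, 0.2, 0.001, 0.5]]
print(spending_decisions, f"spent = {spender.spent:.4f}")
```

Because the levels never depend on past outcomes, the total budget spent stays below α no matter how many hypotheses arrive.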
Getting Started
Start with our Quick Start Guide for a hands-on introduction to the library.
Explore the Theory Section for mathematical foundations and algorithm details.
Jump to Examples for real-world use cases and method comparisons.
Check the API Reference for detailed class and method documentation.
Acknowledgements
This library is inspired by and validated against the R package onlineFDR.
Key differentiator: Our implementation provides a truly online API with `test_one()` method calls, enabling real-time sequential applications. The R package requires pre-collected data arrays.
Support
- Documentation: Comprehensive guides and API reference
- Issues: Report bugs on GitHub Issues
- Discussions: Ask questions in GitHub Discussions
- Contact: Reach out to the maintainers for collaboration opportunities
License
This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.