User Guide

Welcome to the ONAD User Guide! This comprehensive guide will help you master online anomaly detection with ONAD, from basic concepts to advanced deployment strategies.

What is Online Anomaly Detection?

Online anomaly detection is the process of identifying unusual patterns or outliers in streaming data as it arrives, without requiring the entire dataset to be available upfront. This approach is essential for:

  • Real-time systems that need immediate responses to anomalies
  • Large-scale data where storing everything is impractical
  • Evolving patterns where the definition of "normal" changes over time
  • Resource-constrained environments with limited memory and compute

ONAD Philosophy

ONAD is built around several core principles:

Streaming-First Design

Every component in ONAD is designed to process data one point at a time, maintaining constant memory usage regardless of stream length.

Composable Architecture

Models, transformers, and utilities can be combined in flexible pipelines to solve complex problems.

Production-Ready

Built-in logging, error handling, and memory management make ONAD suitable for production deployments.

Type Safety

Comprehensive type hints ensure code reliability and better IDE support.

Getting Started

1. Choose Your Model

ONAD provides several categories of anomaly detection models:
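
Three of these categories, shown via their imports (paths as listed in the Quick Reference at the end of this guide):

# Tree-based models
from onad.model.iforest import OnlineIsolationForest

# Kernel / SVM-based models
from onad.model.svm import IncrementalOneClassSVMAdaptiveKernel

# Statistical models
from onad.model.stat.multi import MovingMahalanobisDistance

# All models share the same streaming interface
model = OnlineIsolationForest()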

2. Prepare Your Data

Use ONAD's data transformation tools:
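
For example, the scalers and the projection transformer from the Quick Reference (a minimal sketch; the feature-dict input shown here is an assumption, chosen to match the point-at-a-time interface):

from onad.transform.preprocessing.scaler import StandardScaler
from onad.transform.projection.incremental_pca import IncrementalPCA

scaler = StandardScaler()
pca = IncrementalPCA(n_components=10)

# Transformers share the streaming interface: learn from a point,
# then transform that point.
point = {"x1": 0.5, "x2": 1.2}  # hypothetical data point
scaler.learn_one(point)
scaled = scaler.transform_one(point)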

3. Build Your Pipeline

Combine components for sophisticated processing:
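
As a sketch of one possible pipeline, chaining a scaler, a projection, and a model by hand — the same learn-then-transform pattern used in the workflow examples below (the component choices here are illustrative):

from onad.transform.preprocessing.scaler import StandardScaler
from onad.transform.projection.incremental_pca import IncrementalPCA
from onad.model.iforest import OnlineIsolationForest

scaler = StandardScaler()
pca = IncrementalPCA(n_components=10)
model = OnlineIsolationForest()

def process(point):
    # Each stage learns incrementally, then transforms the current point
    scaler.learn_one(point)
    scaled = scaler.transform_one(point)
    pca.learn_one(scaled)
    reduced = pca.transform_one(scaled)
    model.learn_one(reduced)
    return model.score_one(reduced)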

Common Workflows

Basic Anomaly Detection

from onad.model.iforest import OnlineIsolationForest

# Initialize model
model = OnlineIsolationForest()

# Process streaming data
# `stream`, `threshold`, and `handle_anomaly` are application-specific
# placeholders: your data source, score cutoff, and alert handler.
for data_point in stream:
    model.learn_one(data_point)
    score = model.score_one(data_point)
    if score > threshold:
        handle_anomaly(data_point, score)

Data Preprocessing Pipeline

from onad.transform.preprocessing.scaler import StandardScaler
from onad.model.svm import IncrementalOneClassSVMAdaptiveKernel

# Create pipeline components
scaler = StandardScaler()
detector = IncrementalOneClassSVMAdaptiveKernel()


# Process data through pipeline
def detect_anomaly(raw_data):
    scaler.learn_one(raw_data)
    normalized_data = scaler.transform_one(raw_data)
    detector.learn_one(normalized_data)
    return detector.score_one(normalized_data)
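
Used in a loop, the pipeline function behaves like a single model. A usage sketch, with `stream`, `threshold`, and `handle_anomaly` as application-specific placeholders, as in the basic example above:

for raw_data in stream:
    score = detect_anomaly(raw_data)
    if score > threshold:
        handle_anomaly(raw_data, score)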

Batch Processing with Streaming Interface

from onad.model.iforest import OnlineIsolationForest
from onad.stream import ParquetStreamer, Dataset

# Instantiate a model up front; any ONAD model with the
# learn_one/score_one interface works here.
model = OnlineIsolationForest()

with ParquetStreamer(Dataset.FRAUD) as streamer:
    for features, label in streamer:
        # Process each data point
        model.learn_one(features)
        score = model.score_one(features)

Quick Reference

Essential Imports

# Models
from onad.model.iforest import OnlineIsolationForest
from onad.model.svm import IncrementalOneClassSVMAdaptiveKernel
from onad.model.stat.multi import MovingMahalanobisDistance

# Transformers
from onad.transform.preprocessing.scaler import StandardScaler, MinMaxScaler
from onad.transform.projection.incremental_pca import IncrementalPCA

# Streaming
from onad.stream import ParquetStreamer, Dataset

Common Parameters

# Memory management
model = OnlineIsolationForest(window_size=1000)

# Performance tuning
model = OnlineIsolationForest(num_trees=100, max_leaf_samples=32)

# Data preprocessing
scaler = StandardScaler()
pca = IncrementalPCA(n_components=10)

Typical Workflow

  1. Initialize model and transformers
  2. Learn from each data point with learn_one()
  3. Score data points with score_one()
  4. Transform data with transform_one() (if using transformers)
  5. Monitor model state with logging and metrics
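
Put together, a minimal end-to-end sketch of this workflow (standard-library logging; `stream` is a placeholder for your data source, and the component choices are illustrative):

import logging

from onad.model.iforest import OnlineIsolationForest
from onad.transform.preprocessing.scaler import StandardScaler

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("onad.example")

# Step 1: initialize model and transformers
scaler = StandardScaler()
model = OnlineIsolationForest(window_size=1000)

for i, point in enumerate(stream):
    # Steps 2-4: learn, transform, learn, score
    scaler.learn_one(point)
    scaled = scaler.transform_one(point)
    model.learn_one(scaled)
    score = model.score_one(scaled)

    # Step 5: periodic monitoring
    if i % 1000 == 0:
        logger.info("processed %d points, latest score %.4f", i, score)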
