Streaming Rashomon Sets for Real-Time Network Intrusion Detection with Theoretical Guarantees
PI: Tyler McCormick
Sponsor: Sandia National Laboratories
Project Period:
-
Amount: $363,996.00
Abstract
Most online unsupervised anomaly detection algorithms for cyber-intrusion detection in the literature characterize historical features of normal behavior and flag as anomalous any observed event whose characteristics differ radically. Classical approaches compute empirical statistics, while others learn the underlying probability distribution of normal events, either parametrically or non-parametrically. More recent methods use quantile regression, which learns a chosen quantile of univariate data, that is, the value below which an observation falls with a specified probability. All of these methods flag as anomalies the events whose measurements exceed a computed threshold, yet they cannot account for a cyber-network's limited or highly nuanced history, which renders them infeasible in such settings.
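To make the threshold-based baseline concrete, the following is a minimal sketch of a rolling detector that flags a univariate observation when it exceeds an empirical quantile of recent history. It uses a plain empirical quantile rather than quantile regression, and the names `empirical_quantile`, `quantile_flags`, `window`, `q`, and `min_history` are assumptions introduced here for illustration only.

```python
from collections import deque


def empirical_quantile(values, q):
    """Return a nearest-rank empirical quantile of a non-empty sequence."""
    ordered = sorted(values)
    # Clamp the rank so that q close to 1.0 maps to the largest value.
    idx = min(len(ordered) - 1, max(0, int(q * len(ordered))))
    return ordered[idx]


def quantile_flags(stream, window=50, q=0.99, min_history=10):
    """Flag each observation that exceeds the q-quantile of the recent window.

    Nothing is flagged until `min_history` observations have been seen,
    which illustrates how such detectors depend on an adequate history.
    """
    history = deque(maxlen=window)
    flags = []
    for x in stream:
        if len(history) >= min_history:
            flags.append(x > empirical_quantile(history, q))
        else:
            flags.append(False)  # Too little history to set a threshold.
        history.append(x)
    return flags
```

In this sketch a burst that arrives before `min_history` observations are available is never flagged, and a single extreme value raises the threshold for as long as it remains in the window.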
A critical limitation of existing approaches is their reliance on a single model or metric, which can miss important anomalies that alternative, equally-performing models would detect. Furthermore, the lack of theoretical guarantees for streaming settings makes it difficult to understand how model performance evolves over time, particularly as network behavior shifts.
As a step toward theoretically-grounded, unsupervised, real-time intrusion detection algorithms, this project will leverage Prof. Tyler McCormick's research expertise in statistical network analysis and algorithms for cybersecurity. The project will develop a novel framework based on streaming Rashomon sets, with the following key innovations:
We will develop theoretical properties and guarantees for Rashomon sets in streaming environments, establishing how these sets of equally-performing models evolve over time and how changes in the set composition can signal anomalies. This theoretical work will provide rigorous foundations for understanding when and why certain network behaviors should be flagged as anomalous.
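For concreteness, one common way to formalize a Rashomon set, adapted here to a sliding window, is sketched below. The notation is an assumption made for illustration and is not the project's final definition: $\mathcal{F}$ is a candidate model class, $\ell$ a per-observation loss, $w$ a window length, and $\epsilon \ge 0$ a tolerance.

```latex
\[
  \hat{L}_t(f) = \frac{1}{w} \sum_{s=t-w+1}^{t} \ell(f; x_s),
  \qquad
  \mathcal{R}_t(\epsilon) = \Bigl\{ f \in \mathcal{F} :
    \hat{L}_t(f) \le \min_{f' \in \mathcal{F}} \hat{L}_t(f') + \epsilon \Bigr\}.
\]
```

Under this reading, $\mathcal{R}_t(\epsilon)$ contains every model whose recent loss is within $\epsilon$ of the best model's recent loss, so the set can grow, shrink, or change membership as $t$ advances.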
The framework will incorporate multiple network attributes, including Prof. McCormick's recently developed univariate network curvature metric that identifies network shape and captures perturbations. These attributes, along with others, will serve as features within the Rashomon set models, allowing for multi-perspective anomaly detection.
By monitoring how the Rashomon set changes over time (which models enter or leave the set, how model weights shift, and how prediction consensus varies), we can detect anomalies that might be missed by any single-model approach. The theoretical guarantees will ensure reliable performance even as network conditions evolve.
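The monitoring idea can be illustrated with a minimal sketch, under several assumptions that are not part of the proposal: the candidate models form a small finite pool, each model reports a loss at every time step, the Rashomon set is taken to be all models whose windowed mean loss is within `epsilon` of the best, and membership change is scored by the Jaccard distance between consecutive sets. The class and function names are hypothetical.

```python
from collections import deque


def rashomon_set(mean_losses, epsilon):
    """Return the names of models whose loss is within epsilon of the best."""
    best = min(mean_losses.values())
    return {name for name, loss in mean_losses.items() if loss <= best + epsilon}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


class StreamingRashomonMonitor:
    """Track a windowed Rashomon set and score how much it changes per step."""

    def __init__(self, model_names, epsilon=0.1, window=20):
        self.epsilon = epsilon
        # One sliding window of recent losses per candidate model.
        self.losses = {name: deque(maxlen=window) for name in model_names}
        self.previous = None

    def update(self, step_losses):
        """Ingest one loss per model; return (current set, change score).

        Assumes `step_losses` contains an entry for every model at every step.
        """
        for name, loss in step_losses.items():
            self.losses[name].append(loss)
        means = {name: sum(w) / len(w) for name, w in self.losses.items()}
        current = rashomon_set(means, self.epsilon)
        if self.previous is None:
            change = 0.0  # No earlier set to compare against.
        else:
            change = jaccard_distance(self.previous, current)
        self.previous = current
        return current, change
```

A caller would feed `update` one dictionary of per-model losses per time step and treat a large change score as a candidate anomaly signal. How large is "large", and how model weights and prediction consensus enter the score, are left open in this sketch.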
This work's primary goal is to develop a theoretically-grounded anomaly detection framework that maintains multiple equally-performing models to capture different aspects of normal network behavior, with rigorous guarantees about detection performance in streaming settings.
