How Device Fingerprinting Works

Most SIEMs identify devices by IP address or MAC address. Huginn goes further: it composes multiple weak signals into a strong device identity using hyperdimensional memory (HDM) vectors.

The Problem with MAC-Based Identification

A MAC address tells you the manufacturer, via the OUI prefix (its first three octets). For example, 88:66:5A:xx:xx:xx is Apple. But is it an iPhone, an iPad, a MacBook, or an Apple TV? MAC addresses can’t distinguish device types within a manufacturer, and they can be spoofed.
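To make the limitation concrete, here is a minimal sketch of an OUI lookup. The one-entry `OUI_VENDORS` table and the function names are illustrative assumptions, not Huginn's code; a real lookup would load the full IEEE registry.

```python
import string

# Hypothetical one-entry table; a real lookup would load the IEEE OUI registry.
OUI_VENDORS = {
    "88665A": "Apple",  # the prefix used as the example above
}


def oui_prefix(mac: str) -> str:
    """Return the first three octets of a MAC address as six uppercase hex digits."""
    digits = "".join(ch for ch in mac if ch in string.hexdigits).upper()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return digits[:6]


def vendor_for(mac: str) -> str:
    """Look up the manufacturer; this is all the OUI can tell us."""
    return OUI_VENDORS.get(oui_prefix(mac), "unknown")


def is_locally_administered(mac: str) -> bool:
    """True if the locally-administered bit (0x02 of the first octet) is set.

    Such addresses were not assigned by a manufacturer, so the OUI is meaningless.
    """
    return bool(int(oui_prefix(mac)[:2], 16) & 0x02)


print(vendor_for("88:66:5a:01:02:03"))               # manufacturer only, no device type
print(is_locally_administered("02:00:00:00:00:01"))  # address not tied to any vendor
```

The lookup returns the same answer for every device the manufacturer makes, which is why the OUI is treated as one weak signal among several.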

Huginn’s Approach: Signal Composition

Every device on your network emits multiple identification signals through normal operation:

| Signal | Source | What It Reveals |
|---|---|---|
| MAC OUI | DHCP/ARP | Manufacturer (Apple, Samsung, Intel) |
| DHCP Option 55 | DHCP requests | OS family – the order of requested parameters is distinctive per OS |
| DHCP Vendor Class | DHCP requests | Sometimes includes OS version (e.g., “MSFT 5.0” = Windows) |
| DHCP Hostname | DHCP requests | Often reveals device type (“iPhone-John”, “DESKTOP-ABC123”) |
| DNS Query Patterns | DNS logs | OS-specific probes (captive.apple.com, connectivitycheck.gstatic.com) |
| Connection Profile | Firewall logs | Typical destinations, ports, and protocols |
| Traffic Timing | Firewall logs | Diurnal patterns, beacon intervals (IoT devices are very regular) |

Any single signal is ambiguous. Combined, they’re highly discriminative.
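As a rough illustration, the signals for one device could be collected into a record and flattened into (signal, value) pairs before any encoding. The `DeviceSignals` class, its field names, and the sample option 55 list are assumptions made for this sketch, not Huginn's schema or a real OS signature.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DeviceSignals:
    """Signals observed for one device; any field may be missing."""
    mac_oui: Optional[str] = None
    dhcp_param_order: Optional[Tuple[int, ...]] = None  # option 55, in request order
    dhcp_vendor_class: Optional[str] = None
    dhcp_hostname: Optional[str] = None
    dns_probes: Tuple[str, ...] = ()


def to_pairs(signals: DeviceSignals) -> List[Tuple[str, str]]:
    """Flatten whichever signals are present into (signal, value) pairs."""
    pairs: List[Tuple[str, str]] = []
    if signals.mac_oui:
        pairs.append(("mac_oui", signals.mac_oui.upper()))
    if signals.dhcp_param_order:
        # Join without sorting: the request order is the distinctive part.
        codes = ",".join(str(code) for code in signals.dhcp_param_order)
        pairs.append(("dhcp_opt55", codes))
    if signals.dhcp_vendor_class:
        pairs.append(("dhcp_vendor_class", signals.dhcp_vendor_class))
    if signals.dhcp_hostname:
        pairs.append(("dhcp_hostname", signals.dhcp_hostname))
    for host in sorted(set(signals.dns_probes)):
        pairs.append(("dns_probe", host.lower()))
    return pairs


# Made-up observation; the option 55 list is illustrative only.
observed = DeviceSignals(
    mac_oui="88665a",
    dhcp_param_order=(1, 121, 3, 6),
    dhcp_hostname="iPhone-John",
    dns_probes=("captive.apple.com",),
)
for pair in to_pairs(observed):
    print(pair)
```

Missing signals are simply left out, so a device seen only in firewall logs still yields a (shorter) list of pairs.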

How It Works

Huginn uses hyperdimensional computing to fuse these signals into a single, compact device identity. Each signal is encoded as a high-dimensional binary vector, and the vectors are composed using algebraic operations that preserve the contribution of every signal while producing a fixed-size result.
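The following is a minimal sketch of one common way to do this in hyperdimensional computing, not a description of Huginn's internals. It assumes 10,000-bit vectors, XOR to bind a signal name to its value, and a bitwise majority vote to bundle the bound pairs; the function names and the SHA-256 seeding are likewise assumptions.

```python
import hashlib
import random

DIM = 10_000  # assumed dimensionality; the source does not state one


def hypervector(label: str) -> int:
    """Deterministic pseudo-random DIM-bit vector for a label, stored as an int."""
    seed = int.from_bytes(hashlib.sha256(label.encode("utf-8")).digest(), "big")
    return random.Random(seed).getrandbits(DIM)


def bind(role: int, value: int) -> int:
    """Bind a signal name to its value with XOR; the result resembles neither input."""
    return role ^ value


def bundle(vectors) -> int:
    """Bitwise majority vote; the result stays similar to every input (ties become 0)."""
    vectors = list(vectors)
    counts = [0] * DIM
    for vector in vectors:
        for i, bit in enumerate(format(vector, f"0{DIM}b")):
            if bit == "1":
                counts[i] += 1
    half = len(vectors) / 2
    return int("".join("1" if c > half else "0" for c in counts), 2)


def similarity(a: int, b: int) -> float:
    """Fraction of matching bits: about 0.5 for unrelated vectors, 1.0 for identical."""
    return 1 - bin(a ^ b).count("1") / DIM


def encode(pairs) -> int:
    """Compose (signal, value) pairs into one fixed-size fingerprint."""
    return bundle(
        bind(hypervector("role:" + role), hypervector("value:" + value))
        for role, value in pairs
    )


pairs = [
    ("mac_oui", "88665A"),
    ("dhcp_hostname", "iPhone-John"),
    ("dns_probe", "captive.apple.com"),
]
fingerprint = encode(pairs)
one_signal = bind(hypervector("role:mac_oui"), hypervector("value:88665A"))
print(round(similarity(fingerprint, one_signal), 2))  # well above the 0.5 chance level
```

The fingerprint has the same size no matter how many signals go in, and each contributing signal remains detectably closer to it than an unrelated vector would be.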

The composed vector acts as a holistic fingerprint. Two devices with similar signal profiles produce similar vectors; devices with different profiles are far apart in the vector space. Matching a new device against the known profile database is fast – sublinear in the number of profiles – so it scales without issue even on modest hardware.
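A small sketch of the matching step, assuming binary vectors compared by Hamming similarity. The profile names and vectors are random placeholders, and the search is a plain linear scan for clarity; it does not show the sublinear lookup mentioned above.

```python
import random

DIM = 10_000  # assumed dimensionality
rng = random.Random(7)


def similarity(a: int, b: int) -> float:
    """Fraction of matching bits between two DIM-bit vectors."""
    return 1 - bin(a ^ b).count("1") / DIM


def flip_bits(vector: int, count: int, rng: random.Random) -> int:
    """Flip `count` distinct bits, simulating a device whose signals differ slightly."""
    for i in rng.sample(range(DIM), count):
        vector ^= 1 << i
    return vector


def best_match(observed: int, profiles: dict):
    """Return (name, similarity) of the closest profile, or (None, 0.0) if there are none."""
    if not profiles:
        return None, 0.0
    name = max(profiles, key=lambda n: similarity(observed, profiles[n]))
    return name, similarity(observed, profiles[name])


# Placeholder profile database: random vectors standing in for composed fingerprints.
profiles = {name: rng.getrandbits(DIM) for name in ("iphone", "galaxy", "thermostat")}

# An observed device that differs from the "iphone" profile in 10% of its bits.
observed = flip_bits(profiles["iphone"], 1000, rng)
name, score = best_match(observed, profiles)
print(name, round(score, 2))
```

Even with a tenth of the bits changed, the observed vector stays far closer to its own profile than to the others, which sit near the 0.5 similarity expected of unrelated vectors.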

Confidence reflects both how closely a device matches a known profile and how many independent signals contributed to the match:

  • iPhone 15, iOS 17 – 94% confidence (many strong signals agree)
  • Samsung Galaxy, Android – 82% confidence (most signals match, one is ambiguous)
  • Unknown IoT device, similar to Nest Thermostat – 67% confidence (partial signal overlap)
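One way such a score could be formed is sketched below. The formula is purely an assumption for illustration: it rescales similarity so that chance level (0.5) maps to zero, then discounts matches backed by few signals. The `saturation` parameter is hypothetical, and the sketch is not meant to reproduce the percentages above.

```python
def confidence(similarity: float, signals_matched: int, saturation: float = 2.0) -> float:
    """Illustrative confidence in [0, 1]; not Huginn's actual formula.

    similarity      -- fraction of matching bits against the best profile (0.5 = chance)
    signals_matched -- number of independent signals that contributed
    saturation      -- hypothetical knob: larger values demand more signals
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be between 0 and 1")
    if signals_matched < 0:
        raise ValueError("signals_matched must be non-negative")
    # Match strength: 0 at chance level, 1 for an identical vector.
    strength = max(0.0, (similarity - 0.5) * 2.0)
    # Coverage: rises toward 1 as more independent signals agree.
    coverage = signals_matched / (signals_matched + saturation)
    return strength * coverage


# The same similarity is trusted more when more signals back it.
print(confidence(0.95, 2))
print(confidence(0.95, 6))
```

The design point is that a near-perfect match on a single signal should not outrank a good match supported by many signals.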

Incremental Learning

Unlike ML-based approaches that require training on labeled datasets, HDM vectors update incrementally. As new signals arrive for a device, the fingerprint is refined in place. In the course of normal observation, most devices accumulate enough signals for reliable classification.
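Incremental refinement can be sketched with per-bit tallies: each new signal vector votes on every bit, and the current fingerprint is the sign of the running total. The `RunningFingerprint` class and the 10,000-bit size are assumptions for illustration, not Huginn's data structure.

```python
import random

DIM = 10_000  # assumed dimensionality


class RunningFingerprint:
    """Keeps signed per-bit tallies so new signals can be folded in at any time."""

    def __init__(self) -> None:
        self.counts = [0] * DIM
        self.observations = 0

    def add(self, vector: int) -> None:
        """Fold one signal vector in: +1 where its bit is 1, -1 where it is 0."""
        for i, bit in enumerate(format(vector, f"0{DIM}b")):
            self.counts[i] += 1 if bit == "1" else -1
        self.observations += 1

    def vector(self) -> int:
        """Current fingerprint: 1 wherever ones outnumber zeros so far (ties become 0)."""
        return int("".join("1" if c > 0 else "0" for c in self.counts), 2)


rng = random.Random(3)
a, b, c = (rng.getrandbits(DIM) for _ in range(3))

fp = RunningFingerprint()
for signal in (a, b, c):
    fp.add(signal)   # no retraining step; each update is a single pass over the bits
print(fp.observations)
```

No earlier observations need to be stored or replayed, and a signal that keeps recurring gradually dominates the fingerprint.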

The system also learns new device profiles from your network. If you have a device that doesn’t match any known profile, Huginn creates a new profile entry. Over time, your local installation builds a device database tailored to your specific network.
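The learn-on-miss behavior can be sketched as a threshold check: if no known profile is close enough, the observed fingerprint is stored as a new local profile. The 0.75 threshold, the `local-N` naming, and the function name are assumptions for this sketch.

```python
import random

DIM = 10_000  # assumed dimensionality


def similarity(a: int, b: int) -> float:
    """Fraction of matching bits between two DIM-bit vectors."""
    return 1 - bin(a ^ b).count("1") / DIM


def match_or_learn(observed: int, profiles: dict, threshold: float = 0.75):
    """Return (profile_name, created). Adds a new profile when nothing matches.

    `threshold` is a hypothetical cut-off, not a documented Huginn setting.
    """
    best_name, best_score = None, 0.0
    for name, vector in profiles.items():
        score = similarity(observed, vector)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= threshold:
        return best_name, False
    new_name = f"local-{len(profiles) + 1}"  # hypothetical naming scheme
    profiles[new_name] = observed
    return new_name, True


rng = random.Random(11)
local_profiles = {}
device = rng.getrandbits(DIM)

print(match_or_learn(device, local_profiles))  # first sighting: a profile is created
print(match_or_learn(device, local_profiles))  # second sighting: the profile matches
```

Because unrelated vectors sit near 0.5 similarity, a threshold well above that keeps genuinely new devices from being folded into existing profiles.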

Why Not Machine Learning?

Traditional ML approaches to device fingerprinting (random forests, neural networks) achieve slightly higher accuracy on benchmark datasets. But they require:

  • Labeled training data (someone has to identify every device to build the training set)
  • A training pipeline (feature extraction, model training, validation)
  • Retraining when new device types appear
  • Significant compute (GPU for neural approaches)

HDM vectors trade a small amount of accuracy for much simpler operation:

  • No training data needed – profiles are composed from signal definitions
  • No training pipeline – vectors update incrementally from observation
  • New device types are learned automatically from network observation
  • Runs on any CPU – the math is cheap bitwise operations

For a SIEM that needs to work out of the box on a $200 mini-PC, this tradeoff is the right one.