Edge ML

Consystence runs machine learning models directly on the Nvidia Orin GPU using TensorRT for hardware-accelerated inference. Edge ML provides real-time anomaly detection without relying on cloud connectivity.

What it detects

Edge ML uses autoencoder neural networks trained on baseline operational data. The autoencoder learns the normal operating patterns of a piece of equipment and flags deviations.

| Detection | Input signals | Anomaly pattern |
| --- | --- | --- |
| Bearing wear | Motor current, vibration, temperature | Gradual current drift at constant speed |
| Impeller blockage | Discharge pressure, current, flow | Current increase with pressure/flow decrease |
| Voltage imbalance | Phase currents (3-phase) | Asymmetric current draw across phases |
| Seal failure | Bearing temperature, vibration | Temperature spike with vibration increase |
| Belt slip | Motor speed, conveyor speed, current | Speed ratio deviation from baseline |
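To make one of these patterns concrete, asymmetric current draw across phases can be quantified as the worst phase's deviation from the mean. The formula and function below are an illustrative sketch, not Consystence's detection logic (which is learned by the autoencoder rather than hand-coded):

```python
def phase_imbalance(i_a: float, i_b: float, i_c: float) -> float:
    """Percentage deviation of the worst phase current from the three-phase mean.

    A healthy 3-phase motor draws near-identical current on each phase, so
    this value stays close to zero; a sustained rise suggests imbalance.
    """
    mean = (i_a + i_b + i_c) / 3.0
    if mean == 0:
        return 0.0
    return max(abs(p - mean) for p in (i_a, i_b, i_c)) / mean * 100.0
```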

How it works

```mermaid
graph LR
    PLC[PLC Tags] --> Pre[Preprocessing]
    Pre --> AE[Autoencoder — TensorRT]
    AE --> Score[Anomaly Score]
    Score --> Health[Health Index]
    Health --> Stream[Log Stream → Site Server]
```

  1. Preprocessing — raw tag values are normalised and windowed (typically a 60-second sliding window).
  2. Autoencoder inference — the TensorRT model reconstructs the input. High reconstruction error indicates an anomaly.
  3. Anomaly score — the reconstruction error is converted to a 0–100 anomaly score.
  4. Health index — the anomaly score is smoothed over time to produce a stable health index for the equipment.
  5. Log stream — scores and health indices are streamed to the site server via gRPC for display in the HealthBadgeNode on the scene graph.
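Steps 3 and 4 can be sketched as follows. The `error_at_100` calibration constant (the reconstruction error that saturates the scale) and the exponential-smoothing factor `alpha` are illustrative assumptions; the production mapping may differ:

```python
def anomaly_score(window, reconstruction, error_at_100=1.0):
    """Map mean squared reconstruction error onto a 0-100 anomaly score.

    `error_at_100` is an assumed per-model calibration constant: any error
    at or above it clips the score to 100.
    """
    mse = sum((x - r) ** 2 for x, r in zip(window, reconstruction)) / len(window)
    return min(100.0, 100.0 * mse / error_at_100)

def update_health_index(previous, score, alpha=0.1):
    """Exponentially smooth the inverted score into a stable health index.

    100 = fully healthy, 0 = sustained severe anomaly. Smoothing keeps a
    single noisy inference from swinging the displayed index.
    """
    return (1 - alpha) * previous + alpha * (100.0 - score)
```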

Performance target

| Metric | Target |
| --- | --- |
| Inference latency | < 100 ms per equipment instance |
| Model size | < 10 MB per equipment type |
| GPU memory | < 500 MB total for all models on one Orin |

Training

Models are trained on baseline operational data — a period of known-good operation for each equipment type. Training happens in the cloud tier, not on the edge device.

Training pipeline

  1. Data collection — the edge device streams tag data to the site server during a designated baseline period (typically 2–4 weeks of normal operation).
  2. Upload — the site server uploads the baseline dataset to the cloud.
  3. Training — the cloud trains an autoencoder for the equipment type using the aggregated baseline data.
  4. Validation — the model is tested against held-out normal data and known-fault data (if available).
  5. Publish — the trained model is published as a TensorRT engine file optimised for the Orin GPU architecture.
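To make the training step concrete, here is a toy NumPy sketch of a linear autoencoder that learns the baseline correlation between three signals and then shows higher reconstruction error on a sample that breaks it. The architecture, synthetic data, and training loop are illustrative only; the production models are trained in the cloud and exported as TensorRT engines:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "baseline" data: three correlated signals (think current,
# pressure, flow) driven by one underlying load factor, plus sensor noise.
load = rng.normal(size=(500, 1))
X = np.hstack([load, 0.8 * load, -0.5 * load]) + 0.05 * rng.normal(size=(500, 3))

# Tiny linear autoencoder, 3 -> 1 -> 3, trained by gradient descent on
# mean squared reconstruction error.
W_enc = rng.normal(scale=0.1, size=(3, 1))
W_dec = rng.normal(scale=0.1, size=(1, 3))
lr = 0.05
for _ in range(500):
    Z = X @ W_enc                       # encode
    err = Z @ W_dec - X                 # decode and compare
    W_dec -= lr * Z.T @ err / len(X)    # gradient step on decoder
    W_enc -= lr * X.T @ (err @ W_dec.T) / len(X)  # gradient step on encoder

def reconstruction_error(x):
    return float(np.mean((x @ W_enc @ W_dec - x) ** 2))

normal = np.array([1.0, 0.8, -0.5])  # follows the learned correlation
fault = np.array([1.0, 0.8, 0.5])    # flow moving the wrong way: anomalous
```

The fault sample reconstructs poorly because it lies off the correlation the model learned during the baseline period, which is exactly the signal the anomaly score is built from.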

Fleet learning

Fleet learning is the process of improving models across all deployments of an equipment type.

```mermaid
sequenceDiagram
    participant A as Site A — Orin
    participant C as Cloud
    participant B as Site B — Orin

    A->>C: Anomaly detected + tag data window
    C->>C: Add to training dataset for equipment type
    C->>C: Retrain autoencoder
    C->>A: Updated model weights
    C->>B: Updated model weights
    A->>A: Hot-load new model
    B->>B: Hot-load new model
```

When a site experiences a confirmed equipment failure:

  1. The anomaly data window (tag values surrounding the event) is uploaded to the cloud.
  2. The cloud aggregates failure patterns across all sites running the same equipment type.
  3. The autoencoder is retrained on the expanded dataset — improving detection for failure modes seen at other sites.
  4. Updated model weights are pushed to all edge devices running that equipment type.
  5. The edge device hot-loads the new model without restarting the edge service.

This works because device types standardise tag schemas. A centrifugal pump has the same tags at every site, so training data is directly transferable.
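The standardised schema is what makes cross-site aggregation a simple concatenation. A minimal sketch, assuming a hypothetical `PUMP_TAGS` schema (the real tag names come from the device type manifest):

```python
# Hypothetical tag schema for illustration; real schemas are defined
# by the device type, not hard-coded.
PUMP_TAGS = ("motor_current", "discharge_pressure", "flow")

def merge_baselines(site_windows):
    """Concatenate tag-value rows from multiple sites into one dataset.

    This is only valid because every site emits the same tag schema for a
    given device type; a mismatch means the data is not transferable.
    """
    for site, rows in site_windows.items():
        for row in rows:
            if set(row) != set(PUMP_TAGS):
                raise ValueError(f"{site}: tag schema mismatch")
    return [tuple(row[t] for t in PUMP_TAGS)
            for rows in site_windows.values() for row in rows]
```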

Model deployment

Models are stored as TensorRT engine files (.engine) on the edge device:

```
/opt/consystence/models/
├── consystence.pump.centrifugal-v1.2.engine
├── consystence.conveyor.belt-v1.0.engine
└── consystence.motor.induction-v1.1.engine
```

The edge service watches this directory and hot-loads models when files are updated. No service restart is required.
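One way to implement the watch step is to periodically scan the directory for new or changed `.engine` files by modification time. This is a sketch of the idea, not the edge service's actual mechanism:

```python
import os

def scan_models(model_dir, seen):
    """Return paths of .engine files that are new or changed since last scan.

    `seen` maps filename -> last observed mtime and is updated in place.
    Each returned path is a candidate for hot-loading without a restart.
    """
    changed = []
    for name in sorted(os.listdir(model_dir)):
        if not name.endswith(".engine"):
            continue
        path = os.path.join(model_dir, name)
        mtime = os.path.getmtime(path)
        if seen.get(name) != mtime:
            seen[name] = mtime
            changed.append(path)
    return changed
```

A service loop would call this on a timer and swap in a freshly deserialised TensorRT engine for each changed path.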

Model versioning

Each model is versioned alongside the device type it belongs to. The device type manifest declares the minimum model version required:

```yaml
# manifest.yaml
ai:
  model: consystence.pump.centrifugal
  minModelVersion: 1.2
```

When an edge device receives an updated model, it validates the version against the installed device types before loading.
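That validation needs a numeric comparison of dotted versions, since a plain string comparison would rank "1.10" below "1.2". A minimal sketch (the function name is hypothetical):

```python
def meets_min_version(model_version, min_version):
    """True if a dotted version string satisfies the manifest's minimum.

    Compares component-wise as integers, so "1.10" >= "1.2" holds even
    though it fails as a string comparison.
    """
    parse = lambda v: tuple(int(p) for p in str(v).split("."))
    return parse(model_version) >= parse(min_version)
```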