
We don't trust anything. We verify everything.

Most intelligence platforms trust the LLM, trust the database, trust the network. AI hallucinations pollute their graph. A single SQL update rewrites history. A compromised auth provider opens every door. Enki is built on the opposite assumption: nothing is trusted until it's cryptographically verified. Six gates stand between a signal arriving and a fact appearing in your knowledge graph.

The discipline layer

Six cryptographic gates.

  1. Signed at the device: Pulse app or optional hardware key
  2. Governed at the vocabulary: pack-signed, fail-closed active set
  3. Extracted via the LDU: open mouth, raw candidates, never canonical
  4. Admitted via the GDU: locked jaw, 5-layer collapse, base-60 address
  5. Hash-chained at admission: tamper one row, the chain breaks
  6. Encrypted at rest: LUKS volumes + AES-256-GCM key envelopes

Gate 1

Signed at the device.

Every admin action and every signal-ingest call carries a hardware-rooted Ed25519 signature. Compromise the server remotely and you still can't enroll devices, push updates, or admit signals without approval from the Pulse app — or the optional hardware key.

What attackers can't do

  • Pair a rogue device without the device-owner pressing approve in Pulse
  • Forge an admin push by stealing a session token — challenge-response requires the live private key
  • Replay an old signed request — every signature is bound to a server-issued challenge with a timestamp

Backing it up

  • Algorithm: Ed25519, RFC 8032
  • Key storage: AES-256-GCM envelope, PBKDF2-HMAC-SHA256 with 600k iterations
  • Anti-replay: challenge_id | peer_node_id | timestamp, canonicalized per RFC 8785
  • Optional hardware: hardware key (air-gapped signing device), private key never leaves the device
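
The anti-replay binding can be sketched roughly as follows. This is an illustration under stated assumptions, not Enki's implementation: the `ChallengeStore` class, the `canonical_payload` helper, and the 30-second window are hypothetical, and `json.dumps` with sorted keys only approximates RFC 8785 for simple string and integer fields. Signature checking is passed in as a caller-supplied function because Ed25519 is not part of Python's standard library.

```python
import json
import secrets
import time
from typing import Callable, Optional


def canonical_payload(challenge_id: str, peer_node_id: str, timestamp: int) -> bytes:
    """Serialize the three bound fields deterministically (approximates RFC 8785)."""
    fields = {"challenge_id": challenge_id, "peer_node_id": peer_node_id, "timestamp": timestamp}
    return json.dumps(fields, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


class ChallengeStore:
    """Hypothetical server-side store: every challenge is single-use and short-lived."""

    def __init__(self, max_age_seconds: int = 30):
        self.max_age = max_age_seconds
        self._pending = {}  # challenge_id -> (peer_node_id, issued_at)

    def issue(self, peer_node_id: str, now: Optional[int] = None) -> bytes:
        now = int(time.time()) if now is None else now
        challenge_id = secrets.token_hex(16)
        self._pending[challenge_id] = (peer_node_id, now)
        return canonical_payload(challenge_id, peer_node_id, now)

    def redeem(self, payload: bytes, signature: bytes,
               verify: Callable[[bytes, bytes], bool], now: Optional[int] = None) -> bool:
        now = int(time.time()) if now is None else now
        fields = json.loads(payload)
        # pop() makes the challenge single-use: a replayed request finds nothing.
        pending = self._pending.pop(fields.get("challenge_id"), None)
        if pending is None:
            return False
        # The signed fields must match what the server actually issued.
        if pending != (fields.get("peer_node_id"), fields.get("timestamp")):
            return False
        if now - pending[1] > self.max_age:
            return False
        return bool(verify(payload, signature))
```

The device signs the exact bytes returned by `issue`, so a signature captured from one request is useless for any other challenge, peer, or time window.
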
Gate 2

Governed at the vocabulary.

The AI can't invent entity types. Every category, every subtype, every relationship the system will ever recognize is defined in a signed pack. The boot loader refuses to start in production unless every pack's content hash matches the signed active set. Tamper with a vocab JSON file and the system fails closed — it doesn't run.

What attackers can't do

  • Slip a new entity type into the system without going through pack ceremony
  • Modify the resolution rules without invalidating the active-set hash
  • Force the loader into a degraded fallback — production mode is fail-closed

Backing it up

  • Active set: allow-list of admitted packs with version pins + SHA-256 content hashes
  • Namespace registry: prevents semantic collisions between packs
  • Canonical hash: RFC 8785 JCS + Unicode NFC normalization before SHA-256
  • Activation ceremony: explicit human acceptance via immutable activation records
  • Phase 4.2 enforcement: direct PackLoader() usage forbidden without GDU context
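
A minimal sketch of the canonical hash and the fail-closed check, assuming each pack is a JSON object and the active set maps pack names to expected SHA-256 digests. The names `content_hash`, `load_packs`, and `PackIntegrityError` are hypothetical, the serialization approximates RFC 8785 (it is adequate for strings and integers, not for every JCS edge case), and verifying the signature on the active set itself is out of scope here.

```python
import hashlib
import json
import unicodedata


def _nfc(value):
    """Recursively apply Unicode NFC normalization to every string, keys included."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if isinstance(value, list):
        return [_nfc(item) for item in value]
    if isinstance(value, dict):
        return {_nfc(key): _nfc(item) for key, item in value.items()}
    return value


def content_hash(pack: dict) -> str:
    """SHA-256 over a canonical serialization: sorted keys, compact separators, UTF-8."""
    canonical = json.dumps(_nfc(pack), sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class PackIntegrityError(RuntimeError):
    """Raised instead of falling back to a degraded mode."""


def load_packs(packs: dict, active_set: dict) -> dict:
    """Fail closed: any missing, unlisted, or altered pack aborts the whole load."""
    if set(packs) != set(active_set):
        raise PackIntegrityError("pack list does not match the active set")
    for name, pack in packs.items():
        if content_hash(pack) != active_set[name]:
            raise PackIntegrityError(f"content hash mismatch for pack {name!r}")
    return packs
```

Because keys are sorted and strings normalized before hashing, reordering a file or re-encoding an accented character leaves the hash unchanged, while any change to the content does not.
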
Gate 3

Extracted via the LDU.

The Local Definition Universe is the open mouth. The AI extracts freely — people, organizations, relationships, facts, contradictions — into a provisional working space. Nothing here is canonical. Nothing here is in your knowledge graph yet. Vision models run the same pipeline on photos: faces, text, objects, intelligence value.

What the LDU does

  • Tier 1: deterministic keyword + form classifier (~10K forms/sec, no LLM)
  • Tier 2: LLM extraction (Gemma 4 default, Llama / Mistral / Qwen swappable)
  • Vision: LLaVA-7B + MediaPipe face detection + PaddleOCR with Ollama fallback
  • Intelligence classifier flags faces / IDs / weapons / vessels / aircraft / crime scenes
  • Redaction-region detection on OCR — flagged for compliance, never logged as content

Why open mouth, locked jaw

If the AI were allowed to write directly to the knowledge graph, every hallucination would become a permanent fact. The LDU/GDU split lets the LLM be creative in extraction while the graph stays disciplined. The LDU is full of provisional things. The GDU only admits what resolves cleanly.

Gate 4

Admitted via the GDU.

The Governed Definition Universe is the locked jaw. Every candidate entity passes through five resolution layers before it can become a node. 47 articles about the same company produce one canonical node with 47 sources attached — not 47 duplicates. Every admitted node gets a permanent base-60 address that never changes, never gets reused, and works as an audit-stable handle across the federation.

The 5-layer collapse

  1. Hard identifier match (CIK, MMSI, ICAO hex, VIN, plate)
  2. Normalized name (Unicode NFC, case, whitespace, punctuation stripped)
  3. Alias lookup against known aliases for promoted entities
  4. AI-powered similarity against the form-distribution model (base-60 cyclical LM)
  5. Spatial proximity (PostGIS distance within type-specific tolerance)
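
The ordering of the layers above can be sketched as a simple cascade. Everything here is an assumption made for illustration: the `Entity` shape, the `resolve` function, the 1 km default tolerance, and a haversine distance standing in for PostGIS. Layer 4 is left as a caller-supplied `similar` function because the form-distribution model is not described in enough detail to reproduce.

```python
import math
import string
import unicodedata
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    name: str
    identifiers: dict = field(default_factory=dict)  # e.g. {"CIK": "0001234567"}
    aliases: set = field(default_factory=set)
    location: Optional[tuple] = None                  # (lat, lon) in degrees


def normalize(name: str) -> str:
    """Layer 2 key: NFC, case-folded, ASCII punctuation stripped, whitespace collapsed."""
    text = unicodedata.normalize("NFC", name).casefold()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def haversine_km(a: tuple, b: tuple) -> float:
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def resolve(candidate: Entity, canon: list, similar=None, tolerance_km: float = 1.0):
    """Return (matching canonical entity, layer number), or (None, None) if nothing matches."""
    key = normalize(candidate.name)
    layers = [
        lambda e: any(v is not None and e.identifiers.get(k) == v
                      for k, v in candidate.identifiers.items()),
        lambda e: normalize(e.name) == key,
        lambda e: key in {normalize(alias) for alias in e.aliases},
        lambda e: similar is not None and similar(candidate, e),
        lambda e: (candidate.location is not None and e.location is not None
                   and haversine_km(candidate.location, e.location) <= tolerance_km),
    ]
    # Earlier layers win: a hard identifier match is tried against every
    # canonical entity before any name-based layer is consulted.
    for number, matches in enumerate(layers, start=1):
        for entity in canon:
            if matches(entity):
                return entity, number
    return None, None
```

A candidate that matches at any layer is attached to the existing node as another source; only a candidate that falls through all five would be a new node.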

Why base-60 addresses

60 has 12 divisors, and the address space is sexagesimal, the base the Sumerians counted in. Two base-60 digits give 60 × 60 = 3,600 possible addresses, yet 9 categories and ~87 subtypes occupy only ~90 valid positions. The remaining 97.5% of addresses are invalid by construction, which makes the address space a strong hallucination filter for any predictor trying to invent a new type.

e.g. Jeffrey Epstein → 0.0.171025 · aircraft type → 0.3.1832 · form "PBL" → 11.4.202
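
The arithmetic behind the filter is easy to check. The sketch below uses the figures quoted above (about 90 valid positions out of 3,600); the `VALID` allow-list and `is_valid_position` are hypothetical, and reading the first two components of the example addresses as a position is an assumption made purely for illustration.

```python
BASE = 60


def divisors(n: int) -> list:
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


total_addresses = BASE * BASE   # every two-digit base-60 address
valid_addresses = 90            # approximate count quoted in the text
invalid_share = 1 - valid_addresses / total_addresses

# Hypothetical allow-list with illustrative values only; a real one would be
# derived from the signed packs.
VALID = {(0, 0), (0, 3), (11, 4)}


def is_valid_position(first: int, second: int) -> bool:
    """Reject any predicted position that is not on the allow-list."""
    return (first, second) in VALID


print(len(divisors(BASE)))      # 12
print(total_addresses)          # 3600
print(f"{invalid_share:.1%}")   # 97.5%
```

A predictor that guesses a position uniformly at random would be rejected about 39 times out of 40.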

Gate 5

Hash-chained at admission.

Every admission is recorded in an append-only log with a SHA-256 link to the previous row. Tamper with any historical record and the chain breaks at that point — every subsequent hash is wrong. Evidence you can defend under courtroom-grade scrutiny.

What attackers can't do

  • Modify an admitted fact without breaking the chain at that row and all rows after
  • Insert a back-dated row — the previous-hash linkage anchors temporal order
  • Quietly delete history — gaps are detectable by sequence and hash
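
A minimal sketch of such a chain, assuming each row stores a sequence number, the previous row's hash, and a JSON-serializable record. The row layout and the names `append` and `first_break` are hypothetical; only the SHA-256 previous-hash linkage comes from the description above.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first row


def row_hash(seq: int, prev_hash: str, record: dict) -> str:
    """Hash the row's position, its link to the previous row, and its content."""
    body = json.dumps({"seq": seq, "prev": prev_hash, "record": record},
                      sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(log: list, record: dict) -> dict:
    """Append a record, linking it to the hash of the current last row."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    seq = len(log)
    row = {"seq": seq, "prev": prev_hash, "record": record,
           "hash": row_hash(seq, prev_hash, record)}
    log.append(row)
    return row


def first_break(log: list):
    """Return the index of the first row that fails verification, or None."""
    prev_hash = GENESIS
    for index, row in enumerate(log):
        if (row["seq"] != index
                or row["prev"] != prev_hash
                or row["hash"] != row_hash(row["seq"], row["prev"], row["record"])):
            return index
        prev_hash = row["hash"]
    return None
```

Editing a record changes its hash, and deleting or inserting a row breaks both the sequence and the linkage at that point. In this sketch, verification only proves internal consistency: catching truncation of the newest rows, or a wholesale recomputation of every hash, would additionally need the latest hash kept somewhere an attacker cannot rewrite.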

Deterministic AI for audit

Extraction runs at temperature 0.0, so decoding is greedy rather than sampled. With the model weights and runtime pinned, the same input gives the same output every time: re-run any extraction from a year-old log entry and get a byte-identical result. Required for regulated environments. Required for any case that ends up in court.

Gate 6

Encrypted at rest.

The database volume is LUKS-encrypted. The hardware-rooted signing key is wrapped in an AES-256-GCM envelope with PBKDF2 600,000-iteration key derivation. Container processes drop privileges and run with read-only source mounts. Pull the drive and you get ciphertext.
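
The key-derivation step can be sketched with the standard library alone. This shows only how a 256-bit wrapping key might be derived with PBKDF2-HMAC-SHA256 at 600,000 iterations; the function name, the 16-byte salt, and the example passphrase are assumptions, and the AES-256-GCM wrapping itself is omitted because it needs a third-party library.

```python
import hashlib
import os

ITERATIONS = 600_000  # iteration count quoted in the text


def derive_wrapping_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 32-byte key (the size AES-256 requires) with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt,
                               ITERATIONS, dklen=32)


salt = os.urandom(16)  # stored next to the ciphertext; the salt is not secret
key = derive_wrapping_key("correct horse battery staple", salt)
```

The high iteration count makes each passphrase guess expensive, which is what slows an offline attack on a stolen envelope.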

Hardening floor

  • SecureBoot enabled at firmware level (G1 prod)
  • LUKS volume encryption on the postgres data directory
  • Container `no-new-privileges:true` + per-service CPU/memory limits
  • All non-public ports bound to 127.0.0.1, not 0.0.0.0
  • Source-code mounts read-only — api process cannot rewrite its own code
  • Daily encrypted pg_dump backup with integrity verification on restore

Access control

  • RFC 6238 TOTP MFA with bcrypt-hashed single-use backup codes
  • Per-IP rate-limit tracking on auth attempts
  • Optional Cloudflare Access edge policy with email + service-token rules
  • Air-gap mode — node fully operational with zero external connectivity
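
For reference, RFC 6238 TOTP is small enough to sketch with the standard library. This uses HMAC-SHA-1, a 30-second step, and 6 digits, which are the RFC's defaults rather than confirmed Enki settings; `verify_totp` and its one-step drift window are hypothetical. The bcrypt-hashed backup codes are not shown.

```python
import hashlib
import hmac
import struct
import time


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP using HMAC-SHA-1 and RFC 4226 dynamic truncation."""
    moment = time.time() if at is None else at
    counter = max(0, int(moment) // step)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # low nibble of the last byte picks the 4-byte window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, at=None, step: int = 30, window: int = 1) -> bool:
    """Accept the current step plus or minus `window` steps to tolerate clock drift."""
    moment = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret, moment + drift * step, step), submitted)
        for drift in range(-window, window + 1)
    )
```

With the RFC's published test secret `12345678901234567890`, `totp(secret, at=59, digits=8)` returns `94287082`, matching the RFC 6238 test vector.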

Why most platforms can't do this.

Built-for-trust isn't a feature you can add later. It's a substrate choice. Once a system has decided the AI's output is canonical, that decision propagates through every table, every API, every report. Enki was built with the opposite choice from the first commit.

Most intelligence platforms

  • AI output written directly to the canonical DB
  • Schema mutable by anyone with SQL access
  • No append-only chain — historical edits invisible
  • Cloud-only — your data lives on someone else's hardware
  • Auth = "trust the IDP"

Enki

  • AI output enters a provisional space (LDU), not the canonical graph
  • Schema bound to signed packs — no silent mutation
  • Hash-chained admission — tampering is mathematically detectable
  • Run on your own hardware — full air-gap supported
  • Auth = device signature + optional physical key + TOTP MFA

Two ways in.

Browse the public federation for free and read what we've already pulled from the FBI, CIA, USAF, AARO, PACER, and SEC. Or get your own node and feed it your own data.