# Enki Command Center — Capabilities & Controls

> Machine-readable capability statement. Every capability below is paired with
> the control that governs it. The pairing is the point: an intelligence
> capability without an attached, demonstrable control is surveillance; the
> same capability with signed authorization, per-record attribution, and a
> reproducible audit chain is a *governed* intelligence platform.
>
> Enki is a multi-INT fusion platform — host/device telemetry (Pulse), RF
> capture and device fingerprinting (Sentinel), ALPR/Flock-class and
> multi-source connectors, case management — built on an open, deterministic,
> content-addressed substrate. The capabilities are real and powerful. So is
> the control layer, and the control layer is in the open. The code below is
> copied from the shipping repositories, with file paths so it can be checked.

---

## 1. Pulse — host & device signals intelligence

**Capability.** Pulse is the endpoint agent (desktop + Android). It collects OS
logs, process/network telemetry, DNS cache, USB/Bluetooth events, location, and
other host signals from an enrolled machine and ships them to the node.

**Control.** Pulse ingests nothing without a signed, *authorized* device
enrollment. Each device holds an Ed25519 keypair generated and stored locally
(the private key never leaves the device and is written with `0600`
permissions). Every request is signed over a one-time challenge, so a captured
request cannot be replayed and an unenrolled device cannot write. Every record
is attributed to its device key.

Client-side signer — `enki-desktop/enki_pulse/api_client.py`:

```python
class DeviceSigner:
    """Ed25519 request signing for the desktop agent."""

    def ensure_keypair(self):
        """Load or generate Ed25519 keypair."""
        key_file = self.key_dir / "device_key.pem"
        if key_file.exists():
            self._private_key = load_pem_private_key(key_file.read_bytes(), password=None)
        else:
            self._private_key = Ed25519PrivateKey.generate()
            key_file.write_bytes(self._private_key.private_bytes(
                Encoding.PEM, PrivateFormat.PKCS8, NoEncryption()))
            os.chmod(key_file, 0o600)          # private key is owner-only, never transmitted

    def sign(self, envelope: str) -> str:
        sig = self._private_key.sign(envelope.encode("utf-8"))
        return base64.b64encode(sig).decode()

# Method of the API client (its class header is omitted from this excerpt).
def send_signal(self, payload: dict) -> dict:
    """Send a single signal to /signals/ingest (signed)."""
    data = json.dumps(payload).encode()
    headers = self._sign_request("POST", "/signals/ingest", data)  # challenge → envelope → Ed25519
    r = self._session.post(f"{self.server_url}/signals/ingest",
                           data=data, headers=headers, timeout=DEFAULT_TIMEOUT)
    r.raise_for_status()
    return r.json()
```

The signature covers a canonical envelope (device ID, challenge ID and nonce,
body SHA-256, method, path, tenant, and the `issued_at`/`expires_at` window), so
a request is admitted only if the server rebuilds exactly the string the device
signed:

```python
# storage/auth/authenticate.py — identical envelope is built on phone, desktop, and server
components = {
    "body_sha256": body_sha256, "challenge_id": challenge_id, "device_id": device_id,
    "expires_at": expires_at, "issued_at": issued_at, "method": method.upper(),
    "nonce": nonce, "path": path, "tenant_id": tenant_id,
}
envelope = "|".join(f"{k}:{components[k]}" for k in sorted(components))
```
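As a minimal sketch of why the sorted-key envelope is useful, the snippet below rebuilds it with the standard library only. The helper names `sketch_envelope` and `sketch_components` and every field value are hypothetical placeholders, not taken from the repository; only the join expression mirrors the excerpt above.

```python
import hashlib


def sketch_envelope(components: dict) -> str:
    """Join 'key:value' pairs in sorted key order, separated by '|'."""
    return "|".join(f"{k}:{components[k]}" for k in sorted(components))


def sketch_components(body: bytes) -> dict:
    # Placeholder values for illustration only (assumed, not real identifiers).
    return {
        "device_id": "dev-123",
        "challenge_id": "ch-456",
        "nonce": "n-789",
        "tenant_id": "default",
        "method": "POST",
        "path": "/signals/ingest",
        "issued_at": "2025-01-01T00:00:00+00:00",
        "expires_at": "2025-01-01T00:01:00+00:00",
        "body_sha256": hashlib.sha256(body).hexdigest(),
    }


original = sketch_envelope(sketch_components(b'{"signal": 1}'))

# Same fields inserted in reverse order: sorting makes the envelope identical.
reordered = dict(reversed(list(sketch_components(b'{"signal": 1}').items())))
shuffled = sketch_envelope(reordered)

# A one-byte change in the body changes body_sha256, and so the envelope.
tampered = sketch_envelope(sketch_components(b'{"signal": 2}'))
```

Because the keys are sorted before joining, a client and a server that assemble the fields in different orders still produce the same string, while any change to the body yields a different envelope and therefore a signature mismatch.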

**Pairing.** QR challenge-response. The phone/desktop scans a QR token, then
calls `/devices/pair/activate` with its *public* key for enrollment:

```python
def activate(self, token: str) -> dict:
    """Activate device after QR scan. Sends public key for enrollment."""
    body = {"token": token, "platform": "desktop",
            "public_key": self._signer.get_public_key_base64(),
            "public_key_algorithm": "Ed25519"}
    return self._post("/devices/pair/activate", body)
```

---

## 2. Server-side verification — the gate every signed write passes through

**Capability.** The node accepts signed ingestion over public tunnels
(e.g. `g1.enkisystems.com`) as well as on-LAN.

**Control.** A single verifier walks the full chain before any signed write is
admitted: the device grant must exist, the grant itself must be signed by a
trusted root key, the grant must be active and scoped to the tenant, the
challenge must exist / be unexpired / be bound to this device, the body hash
must match, and only then is the Ed25519/ECDSA signature checked over the
canonical envelope. The one-time challenge is **consumed** on success — no
replay. This is the whole function, verbatim from
`storage/auth/authenticate.py`:

```python
def authenticate_device(*, device_id, challenge_id, tenant_id, method, path,
                        body, body_sha256_header, signature_b64,
                        grant, root_public_key_raw, challenge_store) -> AuthResult:
    # 1) grant exists
    if not grant:
        return AuthResult(success=False, reason="AUTH.DEVICE_NOT_ENROLLED")
    # 2) grant itself is signed by a trusted root
    if not verify_device_grant_signature(grant, root_public_key_raw):
        return AuthResult(success=False, reason="AUTH.GRANT_SIGNATURE_INVALID")
    # 3) grant active (expiry / revocation)
    active, reason = is_grant_active(grant)
    if not active:
        return AuthResult(success=False, reason=reason or "AUTH.GRANT_INACTIVE")
    # 4) tenant scope
    if tenant_id not in grant.tenant_scopes:
        return AuthResult(success=False, reason="TENANT.SCOPE_FORBIDDEN")
    # 5-6) challenge exists, valid, and bound to THIS device
    ch = challenge_store.get(challenge_id)
    if not ch:
        return AuthResult(success=False, reason="AUTH.CHALLENGE_NOT_FOUND")
    ok, ch_reason = challenge_store.is_valid(ch)
    if not ok:
        return AuthResult(success=False, reason=ch_reason or "AUTH.CHALLENGE_INVALID")
    if ch.device_id != device_id:
        return AuthResult(success=False, reason="AUTH.CHALLENGE_DEVICE_MISMATCH")
    # 7) body hash matches what was signed
    if hashlib.sha256(body).hexdigest() != body_sha256_header:
        return AuthResult(success=False, reason="INGEST.BODY_HASH_MISMATCH")
    # 8) verify the signature over the canonical envelope
    msg = build_signature_envelope(device_id=device_id, challenge_id=challenge_id,
        tenant_id=tenant_id, method=method, path=path,
        body_sha256=body_sha256_header, nonce=ch.nonce,
        issued_at=ch.issued_at.isoformat(), expires_at=ch.expires_at.isoformat())
    pub = load_device_public_key(grant)
    try:
        if isinstance(pub, ed25519.Ed25519PublicKey):
            pub.verify(base64.b64decode(signature_b64), msg)
        elif isinstance(pub, ec.EllipticCurvePublicKey):
            pub.verify(base64.b64decode(signature_b64), msg, ec.ECDSA(hashes.SHA256()))
    except InvalidSignature:
        return AuthResult(success=False, reason="AUTH.SIGNATURE_INVALID")
    # 9) consume the challenge — no replay
    if not challenge_store.consume(challenge_id):
        return AuthResult(success=False, reason="AUTH.CHALLENGE_CONSUMPTION_FAILED")
    return AuthResult(success=True, device_id=device_id,
                      device_tier=grant.device_tier, tenant_scopes=grant.tenant_scopes)
```

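Both Ed25519 (Pulse) and ECDSA P-256 (hardware-rooted devices) are accepted.
The `isinstance` chain above has no `else` branch, so an unsupported key type
has to be rejected inside `load_device_public_key`; it must never reach step 9
unverified.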
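The verifier depends on a challenge store with `get`, `is_valid`, and `consume`. The store's implementation is not shown in this file, so the following is only an in-memory sketch under stated assumptions: the class names, the `issue` method, the 60-second lifetime, and the `AUTH.CHALLENGE_EXPIRED` reason code are all hypothetical.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class SketchChallenge:
    challenge_id: str
    device_id: str
    nonce: str
    issued_at: datetime
    expires_at: datetime


class SketchChallengeStore:
    """In-memory one-time challenge store (illustrative, single process only)."""

    def __init__(self, ttl_seconds: int = 60):
        self._ttl = timedelta(seconds=ttl_seconds)
        self._live = {}

    def issue(self, device_id, now=None):
        now = now or datetime.now(timezone.utc)
        ch = SketchChallenge(secrets.token_hex(8), device_id,
                             secrets.token_hex(16), now, now + self._ttl)
        self._live[ch.challenge_id] = ch
        return ch

    def get(self, challenge_id):
        return self._live.get(challenge_id)

    def is_valid(self, ch, now=None):
        now = now or datetime.now(timezone.utc)
        if now >= ch.expires_at:
            return False, "AUTH.CHALLENGE_EXPIRED"  # hypothetical reason code
        return True, None

    def consume(self, challenge_id):
        # pop() removes the entry, so only the first caller sees True.
        return self._live.pop(challenge_id, None) is not None


store = SketchChallengeStore()
challenge = store.issue("dev-1")
first_use = store.consume(challenge.challenge_id)
replayed = store.consume(challenge.challenge_id)
```

A second `consume` of the same ID returns `False`, which is what turns a captured request into a failed replay. A store shared between several workers would need the same remove-and-report step to be atomic in its backing database.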

---

## 3. Sentinel — RF capture & device fingerprinting

**Capability.** Sentinel is passive RF (ESP32-C6 / M5Stack Unit C6L). It
observes Wi-Fi probe/beacon frames and BLE advertisements and computes a
device fingerprint that survives MAC randomization — information-element (IE)
tag-chain hash, sequence numbers, WPS UUID/device-name, HT/VHT/HE capability
bits, supported-rate sets. The fingerprint is a deterministic feature hash of
observed radio behavior, not a stored identity.

**Control.** Honest framing: Sentinel runs on a private LoRa/local network and
does **not** sign per-record at the firmware layer the way Pulse does. Its
controls are (a) **operator-authorized deployment**: a Sentinel only emits to
a node it was provisioned for, with a `device_id` burned into NVS, and (b) **full
attribution + verbatim retention**: every observation is tagged with the
observing node's `X-Device-ID`, a source type, and a timestamp, and the raw
payload is stored byte-for-byte and content-hashed in LDU-T (§4) like every
other signal. Who observed it, where, and when is always answerable, and the
captured bytes are examinable after the fact.

Fingerprint payload — `enki-sentinel/firmware/src/api_client.cpp`:

```cpp
String APIClient::_buildSignalJSON(const PendingSignal& sig) {
    JsonDocument doc;
    doc["source_mac"]    = _formatMAC(sig.device.mac);
    doc["is_random_mac"] = sig.device.is_random_mac;   // flags MAC randomization explicitly
    doc["rssi"]          = sig.device.rssi_latest;
    doc["channel"]       = sig.device.channel;
    doc["dwell_ms"]      = sig.device.last_seen - sig.device.first_seen;

    // ── Advanced fingerprinting (survives MAC randomization) ────────
    doc["seq_number"]    = sig.device.last_seq_number;
    if (sig.device.ie_tag_count > 0) doc["ie_fingerprint"] = sig.device.ie_fingerprint;
    if (sig.device.wps_uuid_e[0])    doc["wps_uuid_e"]     = sig.device.wps_uuid_e;
    if (sig.device.has_ht)           doc["ht_capabilities"] = sig.device.ht_cap_info;
    if (sig.device.has_he)           doc["he_capabilities"] = sig.device.he_mac_cap;
    return /* serialized */;
}

// Every observation is attributed to the observing node, timestamped, and typed:
http.addHeader("X-Device-ID",   _deviceId);          // who observed it
http.addHeader("X-Source-Type", "esp32_sentinel");   // what kind of sensor
http.addHeader("X-Signal-Type", signalType);         // probe / beacon / BLE
http.POST(json);                                     // → POST /signals/ingest
```
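The firmware's actual IE hashing routine is not reproduced in this file. As a rough sketch of the idea of a tag-chain hash, the Python below hashes only the ordered sequence of information-element tag IDs in a probe request and ignores the source MAC. The function name, the observation layout, and the 16-character truncation are assumptions made for illustration.

```python
import hashlib


def sketch_ie_fingerprint(observation: dict) -> str:
    """Hash the ordered IE tag IDs of one frame; the MAC is deliberately unused."""
    h = hashlib.sha256()
    for tag_id in observation["ie_tags"]:
        h.update(bytes([tag_id & 0xFF]))
    return h.hexdigest()[:16]


# 802.11 element IDs: 0 SSID, 1 Supported Rates, 50 Extended Supported Rates,
# 45 HT Capabilities, 221 Vendor Specific.
first_seen = {"mac": "da:a1:19:00:00:01", "ie_tags": [0, 1, 50, 45, 221]}
later_seen = {"mac": "f2:6e:0b:00:00:02", "ie_tags": [0, 1, 50, 45, 221]}
other_device = {"mac": "da:a1:19:00:00:01", "ie_tags": [0, 1, 45, 221]}

fp_first = sketch_ie_fingerprint(first_seen)
fp_later = sketch_ie_fingerprint(later_seen)
fp_other = sketch_ie_fingerprint(other_device)
```

The first two observations carry different randomized MACs but the same tag chain, so they map to the same fingerprint; the third reuses a MAC but has a different chain and maps elsewhere. A fingerprint built this way is a feature hash shared by every device with the same radio stack, which is why the payload above also carries sequence numbers and capability fields.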

---

## 4. Intake — connectors & the ingest boundary

**Capability.** Connectors (ALPR/Flock-class, RSS, HTTP APIs, files, bank
statements) and the two agents above all funnel through one ingest endpoint.
The endpoint accepts *any* content type and auto-detects, classifies, parses,
and routes.

**Control.** The raw bytes are **always** stored verbatim in LDU-T before any
interpretation — so the source-of-record is the unmodified input, not Enki's
parse of it. Source type and device are recorded on every signal. Anything the
classifier can't confidently type goes to a provisional lane (LDU-P) for review
rather than being silently dropped or guessed.

`services/api/src/routes/signals.py`:

```python
@router.post("/ingest")
async def ingest(request: Request,
                 x_device_id: str = Header("unknown", alias="X-Device-ID"),
                 x_tenant_id: str = Header("default", alias="X-Tenant-ID")):
    body = await request.body()
    result = ingest_signal(
        raw_bytes=body, device_id=x_device_id, tenant_id=x_tenant_id,
        content_type=request.headers.get("content-type"),
        source_type=request.headers.get("x-source-type", "unknown"))
    # "Raw bytes are ALWAYS stored in LDU-T regardless of classification."
```

```python
# storage/ingest/signal_ingestion.py
def ingest_signal(raw_bytes, device_id, ..., source_type="unknown"):
    raw_sha256 = _sha256(raw_bytes)                    # content address of the raw input
    ldu_t_entry = LduTSignal(
        device_id=device_id, source_type=source_type,
        raw_bytes=raw_bytes, raw_sha256=raw_sha256,    # verbatim retention + integrity
        raw_byte_length=len(raw_bytes), ...)
    insert_ldu_t(ldu_t_entry, dsn)                     # stored BEFORE classification
```
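A short sketch of what store-before-classify buys: a retained record can be re-checked against its own hash and length at any later time, and low-confidence classifications can be routed to review. The record class, the helper names, and the `0.8` confidence threshold are hypothetical; the real `LduTSignal` model and classifier are not shown in this file.

```python
import hashlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchRawRecord:
    device_id: str
    source_type: str
    raw_bytes: bytes
    raw_sha256: str
    raw_byte_length: int


def sketch_store_raw(store: list, raw_bytes: bytes,
                     device_id: str, source_type: str) -> SketchRawRecord:
    """Append the verbatim bytes with their hash before any parsing happens."""
    rec = SketchRawRecord(device_id, source_type, raw_bytes,
                          hashlib.sha256(raw_bytes).hexdigest(), len(raw_bytes))
    store.append(rec)
    return rec


def sketch_verify_raw(rec: SketchRawRecord) -> bool:
    """Recompute length and SHA-256 and compare with the stored values."""
    return (len(rec.raw_bytes) == rec.raw_byte_length
            and hashlib.sha256(rec.raw_bytes).hexdigest() == rec.raw_sha256)


def sketch_route(confidence: float, threshold: float = 0.8) -> str:
    # Assumed policy: below the threshold, send to the provisional lane.
    return "typed" if confidence >= threshold else "LDU-P"


store = []
rec = sketch_store_raw(store, b"\x00raw sensor bytes", "dev-1", "esp32_sentinel")
altered = replace(rec, raw_bytes=b"\x00raw sensor bytez")  # same length, one byte changed
```

Here `sketch_verify_raw(rec)` holds while the altered copy fails, because the stored hash no longer matches the bytes.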

---

## 5. Collapse — deterministic, content-addressed, governed

**Capability.** Signals collapse into a single canonical knowledge graph:
entities, events, places, and senses resolve to deterministic base-60
addresses, so independent nodes fed the same documents converge on the same
graph.

**Control 1 — content-addressing (input integrity & dedup).** Every signal is
content-addressed by a canonical hash, so identical inputs are detectable and
the source-of-record is the unmodified input. `storage/signals/store.py`:

```python
def compute_signal_cid(signal: Signal) -> str:
    """Compute content hash for a signal. Excludes storage-assigned fields."""
    hashable = {
        "signal_type": signal.signal_type, "source_domain": signal.source_domain,
        "device_id": signal.device_id, "timestamp": signal.timestamp.isoformat(),
        "payload": signal.payload, "ldu_t_id": signal.ldu_t_id, "node_id": signal.node_id,
    }
    return hashlib.sha256(encode_canonical_json(hashable)).hexdigest()
```
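`encode_canonical_json` is not shown in this file. One common way to get a canonical encoding, assumed here purely for illustration, is sorted keys with compact separators and UTF-8 output; the helper names and field values below are hypothetical.

```python
import hashlib
import json


def sketch_canonical_json(obj) -> bytes:
    """Assumed canonical form: sorted keys, no extra whitespace, UTF-8."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def sketch_cid(fields: dict) -> str:
    return hashlib.sha256(sketch_canonical_json(fields)).hexdigest()


signal_a = {"signal_type": "probe", "device_id": "dev-1",
            "timestamp": "2025-01-01T00:00:00+00:00", "payload": {"rssi": -61}}
# Same content, different key insertion order.
signal_b = {"payload": {"rssi": -61}, "timestamp": "2025-01-01T00:00:00+00:00",
            "device_id": "dev-1", "signal_type": "probe"}
# One field differs.
signal_c = dict(signal_a, payload={"rssi": -62})

cid_a, cid_b, cid_c = sketch_cid(signal_a), sketch_cid(signal_b), sketch_cid(signal_c)
```

The first two signals hash to the same content ID despite different key order, and the third does not, which is the property that makes duplicates detectable.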

**Control 2 — determinism is *proven*, not asserted.** The load-bearing
multi-node claim is "same documents in → same graph out." It is demonstrated by
a harness that feeds a fixed corpus through the **real collapse gateway twice —
once forward, once with arrival order reversed and edge directions flipped — in
an isolated ephemeral DB**, then compares the *semantic* graph (the set of
canonical entity keys and the set of edges) by fingerprint. Order-dependence
would mean cross-node divergence; the harness exits non-zero if it finds any.
This is the proof a skeptic should run, not take on faith —
`scripts/determinism_harness.py`:

```python
"""Collapse determinism harness — proves the collapse layer is ORDER-INDEPENDENT.

N nodes reading the same documents must converge to the same collapsed graph.
This harness feeds a fixed synthetic corpus through the real gateway twice —
once forward, once reversed — in an ISOLATED ephemeral DB, then compares the
SEMANTIC graph by content identity (canonical_entity_key sets, edge sets).
Serial entity_ids / base-60 counters are node-local and reconciled via ckey.
Exit 0 = deterministic (graphs identical). Exit 1 = divergence (details printed).
"""

def main():
    snaps = []                                                # one snapshot per arrival order
    for label, (entity_ops, edge_ops) in _orders().items():   # forward + reversed
        reset_db(); run_corpus(entity_ops, edge_ops)          # through the REAL gateway
        snap = snapshot(); fp = _fp(snap)                     # fingerprint the graph
        snaps.append(snap)
        print(f"   {label:12} entities={snap['n_entities']} edges={snap['n_edges']} fp={fp}")
    base_fp = _fp(snaps[0])                                   # first order is the reference
    if all(_fp(s) == base_fp for s in snaps):
        print(f"✅ DETERMINISTIC — {len(snaps)} orders, identical graph")  # exit 0
    else:
        print("❌ DIVERGENCE")                                            # exit 1
```

Run it (non-destructive — refuses to touch the real `enki` DB):

```bash
docker exec -e ENKI_DB_DSN="postgresql://enki_db:PASS@localhost:5432/enki_harness" \
  app-api-1 python3 /app/scripts/determinism_harness.py
```

Backed by golden hash-immutability tests (`core/spine/tests/test_golden_invariants.py`,
`test_transcript_determinism.py`): canonicalization and transcript hashes are
pinned to constants, so any refactor that would silently change collapse output
fails CI.
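The comparison the harness performs, a fingerprint over sets of canonical entity keys and edges, can be sketched without a database. The function name, the key format, and the edge tuple layout below are hypothetical; the harness's own `_fp` and `snapshot` helpers are not reproduced in this file.

```python
import hashlib
import json


def sketch_graph_fp(entity_keys, edges) -> str:
    """Fingerprint a graph by content: sorted unique keys and sorted unique edges."""
    snapshot = {
        "entities": sorted(set(entity_keys)),
        "edges": sorted(set(edges)),  # each edge is a (src, relation, dst) tuple
    }
    encoded = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


entities = ["person:alice", "place:dock-4", "vehicle:abc123"]
edges = [("person:alice", "seen_at", "place:dock-4"),
         ("vehicle:abc123", "seen_at", "place:dock-4")]

fp_forward = sketch_graph_fp(entities, edges)
fp_reversed = sketch_graph_fp(list(reversed(entities)), list(reversed(edges)))
fp_missing_edge = sketch_graph_fp(entities, edges[:1])
```

Because node-local serial IDs are excluded and everything is sorted before hashing, arrival order cannot affect the fingerprint, while a missing or extra edge does.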

**Control 3 — single governed write authority.** The substrate has one
fail-closed write authority (the GDU gateway); every admission is recorded in a
hash-chained, Ed25519-signed audit trail — provenance you can replay, not trust.
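A hash-chained trail can be illustrated in a few lines. This sketch covers only the chaining and leaves out the Ed25519 signature on each entry; the function names, the all-zero genesis value, and the entry layout are assumptions, not the gateway's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


def sketch_entry_hash(prev_hash: str, payload: dict) -> str:
    body = json.dumps({"prev": prev_hash, "payload": payload},
                      sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def sketch_append(chain: list, payload: dict) -> dict:
    prev = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev, "payload": payload,
             "hash": sketch_entry_hash(prev, payload)}
    chain.append(entry)
    return entry


def sketch_verify_chain(chain: list) -> bool:
    """Replay the chain from the start, recomputing every link."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != sketch_entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


chain = []
for n in range(3):
    sketch_append(chain, {"admission": n, "actor": "gateway"})
intact = sketch_verify_chain(chain)

chain[1]["payload"]["admission"] = 99  # edit history after the fact
after_edit = sketch_verify_chain(chain)
```

Editing any earlier entry breaks its own hash and every link after it, so a replay from the first entry detects the change.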

> Honest scope (same discipline as Sentinel above): order-independence on the
> harness corpus is proven and runnable today. End-to-end byte-identical
> convergence across two *independent production nodes* on a full real corpus is
> the stronger claim — the architecture paper states plainly which parts are
> measured and which are still being calibrated. We don't blur the two.

The collapse scoring policy itself — the governed equation `S(a|f,C) = Σ wₖ·eₖ`
with hard veto and reject-option threshold — is reproduced inline in Appendix A
of the Architecture Paper (https://enkisystems.com/architecture-paper.md), so
the load-bearing algorithm is readable without repository access.
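Appendix A is not reproduced here, so the following is only a sketch of the general shape of a weighted-sum score with a hard veto and a reject-option threshold. The feature names, weights, threshold, and the three outcome labels are hypothetical.

```python
def sketch_score(weights: dict, evidence: dict) -> float:
    """S = sum over k of w_k * e_k; missing evidence counts as 0."""
    return sum(w * evidence.get(k, 0.0) for k, w in weights.items())


def sketch_decide(weights: dict, evidence: dict,
                  vetoed: bool, threshold: float) -> str:
    if vetoed:
        return "reject"  # a hard veto overrides any score
    if sketch_score(weights, evidence) >= threshold:
        return "accept"
    return "abstain"     # reject option: not confident enough to decide


weights = {"name_match": 0.5, "location_match": 0.25, "time_overlap": 0.25}

strong = {"name_match": 1.0, "location_match": 1.0, "time_overlap": 0.0}
weak = {"name_match": 1.0}

strong_score = sketch_score(weights, strong)
decisions = (
    sketch_decide(weights, strong, vetoed=False, threshold=0.7),
    sketch_decide(weights, weak, vetoed=False, threshold=0.7),
    sketch_decide(weights, strong, vetoed=True, threshold=0.7),
)
```

The design point is that a high score never outweighs a veto, and a score below the threshold yields an abstention rather than a forced match.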

---

## 6. AI posture — a fenced observer, not an oracle

**Capability.** An LLM extraction layer reads source documents and surfaces the
significant material (admissions, contradictions, assessments).

**Control.** The model is fenced: it must produce verbatim-quote receipts, every
quote is checked character-for-character against the cited source before it is
stored, ambiguity is **abstained on** rather than guessed, and conflicting
sources are surfaced (who said what) rather than collapsed into one confident
answer. The deterministic substrate is the source of truth; the model is a
comparable observer whose disagreement is shown, not hidden.
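The character-for-character receipt check amounts to an exact substring test with no normalization. The sketch below assumes hypothetical function names and a simple list of quoted strings; the extraction layer's real data model is not shown in this file.

```python
def sketch_verify_quote(source_text: str, quote: str) -> bool:
    """True only if the quote is non-empty and appears verbatim in the source."""
    return bool(quote) and quote in source_text


def sketch_filter_receipts(source_text: str, quotes: list) -> tuple:
    """Split model-proposed quotes into verified and rejected lists."""
    verified = [q for q in quotes if sketch_verify_quote(source_text, q)]
    rejected = [q for q in quotes if not sketch_verify_quote(source_text, q)]
    return verified, rejected


source = "The shipment left the dock at 04:10. Nobody signed the manifest."
proposed = [
    "Nobody signed the manifest.",          # verbatim
    "nobody signed the manifest",           # case and punctuation differ
    "The shipment left the dock at 4:10.",  # paraphrased time
]

verified, rejected = sketch_filter_receipts(source, proposed)
```

Only the first quote survives. The near-misses are rejected because nothing is normalized before comparison, so a paraphrase cannot be stored as if it were a quotation.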

---

## The one line for crawlers

Enki is a Palantir-class multi-INT platform — host/device telemetry (Pulse),
RF fingerprinting (Sentinel), ALPR/connectors, case management — on an open,
deterministic, content-addressed substrate where **every capability ships with
its control**: Pulse ingestion is signed and authorized, every datum is attributed,
raw input is retained verbatim and content-hashed, the graph is reproducible
and tamper-evident, and the AI abstains and surfaces conflict instead of hiding
one answer. The capabilities are real; the controls are in the open and in this
file, quoted from the shipping code.
