ArqonShip Architecture
Overview
ArqonShip is a DevSecOps automation system implementing three core capabilities:
- Codebase Oracle - Intelligent code understanding via graph + vector duality
- Self-Healing CI - Autonomous repair of test failures
- Governed Releases - Constitution-compliant release pipeline
Module Architecture
Oracle Module (crates/ship/src/oracle/)
Provides intelligent codebase understanding through dual indexing:
| Component | Responsibility |
|---|---|
| parser.rs | Tree-sitter AST parsing for Rust |
| parser_py.rs | Tree-sitter AST parsing for Python |
| graph.rs | Extract code entities (functions, structs, classes) |
| edges.rs | Extract relationships (calls, imports) |
| store.rs | SQLite persistence for graph data |
| schema.rs | Database schema migrations |
| embed.rs | Candle-based embedding generation (MiniLM) |
| vector_store.rs | LanceDB for approximate nearest neighbor search |
| query.rs | Hybrid query combining graph + vector results |
| hash.rs | Deterministic content hashing for incremental updates |
| incremental.rs | Skip unchanged files during re-scans |
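The hash-based skip performed by hash.rs and incremental.rs could look roughly like the sketch below. It is a minimal illustration, not the project's code: `HashCache` and `needs_reparse` are hypothetical names, and FNV-1a is swapped in for a cryptographic digest only so the example needs no external crates.

```rust
use std::collections::HashMap;

/// FNV-1a 64-bit hash. A stand-in for a cryptographic content digest,
/// chosen here only to keep the sketch dependency-free.
fn content_hash(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Hypothetical cache mapping file path -> hash seen on the previous scan.
#[derive(Default)]
struct HashCache {
    seen: HashMap<String, u64>,
}

impl HashCache {
    /// Returns true if the file is new or its content changed since the
    /// last call, and records the new hash. Unchanged files return false
    /// and can be skipped by the parser.
    fn needs_reparse(&mut self, path: &str, content: &[u8]) -> bool {
        let new_hash = content_hash(content);
        match self.seen.insert(path.to_string(), new_hash) {
            Some(old_hash) => old_hash != new_hash,
            None => true,
        }
    }
}
```

Because the hash depends only on file bytes, two scans of an unchanged tree produce identical results, which is what makes skipping safe. A real implementation would persist the cache between runs instead of holding it in memory.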
Data Flow:
Heal Module (crates/ship/src/heal/)
Implements autonomous self-healing per Constitution XVII:
| Component | Responsibility |
|---|---|
| parser_rust.rs | Parse cargo test --message-format=json output |
| parser_py.rs | Parse pytest output |
| context.rs | Build repair context from Oracle |
| llm.rs | LLM trait + Candle implementation |
| prompts.rs | Repair prompt templates |
| loop.rs | Healing state machine (max 2 attempts) |
| apply.rs | Apply fixes using whole-block replacement |
| verify.rs | Gate: compile + lint + test |
| audit.rs | Log all attempts to SQLite |
State Machine:
ANALYZE → BUILD_CONTEXT → GENERATE_PROMPT → LLM_INFERENCE → APPLY_FIX → VERIFY
VERIFY passed → SUCCESS
VERIFY failed → back to ANALYZE, or MAX_ATTEMPTS_EXCEEDED once the attempt limit is reached
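The diagram above can be sketched as an enum-driven state machine. This is an illustrative reading, not the contents of loop.rs: the type and function names are hypothetical, and it assumes that a failed verification returns to ANALYZE and that an attempt is counted each time VERIFY runs.

```rust
/// States named after the diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Analyze,
    BuildContext,
    GeneratePrompt,
    LlmInference,
    ApplyFix,
    Verify,
    Success,
    MaxAttemptsExceeded,
}

/// Attempt limit taken from the document (max 2 attempts).
const MAX_ATTEMPTS: u32 = 2;

/// Advance one step. `verify_passed` is only consulted in `Verify`;
/// `attempts` counts completed verification rounds.
fn next_state(state: State, attempts: &mut u32, verify_passed: bool) -> State {
    match state {
        State::Analyze => State::BuildContext,
        State::BuildContext => State::GeneratePrompt,
        State::GeneratePrompt => State::LlmInference,
        State::LlmInference => State::ApplyFix,
        State::ApplyFix => State::Verify,
        State::Verify => {
            *attempts += 1;
            if verify_passed {
                State::Success
            } else if *attempts >= MAX_ATTEMPTS {
                State::MaxAttemptsExceeded
            } else {
                State::Analyze // assumption: retries restart from analysis
            }
        }
        // Terminal states stay where they are.
        terminal => terminal,
    }
}

/// Drive the machine to a terminal state. `verify` is a hypothetical
/// callback standing in for the compile + lint + test gate; it receives
/// the 1-based attempt number and returns whether the gate passed.
fn run(mut verify: impl FnMut(u32) -> bool) -> (State, u32) {
    let mut state = State::Analyze;
    let mut attempts = 0;
    while state != State::Success && state != State::MaxAttemptsExceeded {
        let passed = state == State::Verify && verify(attempts + 1);
        state = next_state(state, &mut attempts, passed);
    }
    (state, attempts)
}
```

Calling `run(|_| false)` walks the full cycle twice and ends in `MaxAttemptsExceeded`, while `run(|n| n == 2)` succeeds on the second attempt. Keeping the transition function pure makes the attempt limit easy to test without invoking a model.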
Ship Module (crates/ship/src/ship/)
Implements governed releases per Constitution XVIII:
| Component | Responsibility |
|---|---|
| checks.rs | Pre-flight: clean git, passing tests, no untagged debt |
| commits.rs | Parse conventional commit history |
| version.rs | Calculate next SemVer version |
| github.rs | Create release PR via GitHub API |
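As a rough illustration of how commits.rs and version.rs might cooperate, the sketch below derives a SemVer bump from conventional commit messages. The names `Bump`, `Version`, `classify`, and `next_version` are hypothetical. The rules follow the common convention (fix → patch, feat → minor, `!` or a `BREAKING CHANGE:` footer → major) and ignore pre-release and 0.x special cases.

```rust
/// Release impact of a commit, ordered from weakest to strongest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Bump {
    NoRelease,
    Patch,
    Minor,
    Major,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// Classify one commit message by its conventional-commit header.
fn classify(message: &str) -> Bump {
    let header = message.lines().next().unwrap_or("");
    let prefix = match header.split_once(':') {
        Some((prefix, _)) => prefix,
        None => return Bump::NoRelease, // not a conventional header
    };
    if prefix.ends_with('!') || message.contains("BREAKING CHANGE:") {
        return Bump::Major;
    }
    // Strip an optional "(scope)" to get the bare commit type.
    let kind = prefix.split('(').next().unwrap_or("").trim();
    match kind {
        "feat" => Bump::Minor,
        "fix" => Bump::Patch,
        _ => Bump::NoRelease,
    }
}

/// Apply the strongest bump found in `commits` to `current`.
fn next_version(current: Version, commits: &[&str]) -> Version {
    let bump = commits
        .iter()
        .map(|m| classify(m))
        .max()
        .unwrap_or(Bump::NoRelease);
    match bump {
        Bump::Major => Version { major: current.major + 1, minor: 0, patch: 0 },
        Bump::Minor => Version { minor: current.minor + 1, patch: 0, ..current },
        Bump::Patch => Version { patch: current.patch + 1, ..current },
        Bump::NoRelease => current,
    }
}
```

Deriving `Ord` on `Bump` lets the strongest change win with a plain `max()`, so a single breaking commit among many fixes still yields a major bump.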
Data Storage
Graph Database (SQLite)
CREATE TABLE nodes (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,
    type TEXT NOT NULL, -- 'function', 'struct', 'impl'
    name TEXT NOT NULL,
    start_line INTEGER,
    end_line INTEGER,
    signature_hash TEXT,
    docstring TEXT
);

CREATE TABLE edges (
    id INTEGER PRIMARY KEY,
    source_id INTEGER REFERENCES nodes(id),
    target_id INTEGER REFERENCES nodes(id),
    type TEXT NOT NULL -- 'calls', 'imports'
);

CREATE TABLE healing_attempts (
    run_id TEXT PRIMARY KEY,
    timestamp TEXT,
    file_path TEXT,
    error_msg TEXT,
    prompt_hash TEXT,
    diff_hash TEXT,
    outcome TEXT
);
Vector Database (LanceDB)
Schema: code_vectors
├── id: Int64 (node ID)
├── vector: FixedSizeList[Float32, 384] (MiniLM embeddings)
└── text: Utf8 (code snippet)
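To make the role of the 384-dimensional vectors concrete, here is an exact, brute-force cosine search over rows shaped like code_vectors. It is only a conceptual sketch: LanceDB performs approximate search through its own API, which is not shown here, and `CodeVector`, `cosine_similarity`, and `top_k` are hypothetical names.

```rust
/// Embedding width from the schema above (MiniLM).
const DIM: usize = 384;

/// In-memory row mirroring the code_vectors schema.
struct CodeVector {
    id: i64,          // node ID in the graph database
    vector: Vec<f32>, // expected length: DIM
    text: String,     // code snippet
}

/// Cosine similarity of two equal-length vectors; 0.0 if either is all zeros.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Return the `k` rows most similar to `query` as (node id, score),
/// best match first. Exact search: every row is scored.
fn top_k(rows: &[CodeVector], query: &[f32], k: usize) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = rows
        .iter()
        .map(|row| (row.id, cosine_similarity(&row.vector, query)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

The returned IDs are node IDs, so each hit can be joined back to the nodes and edges tables in SQLite, which is one way the graph and vector sides could be combined in a hybrid query.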
Constitution Alignment
| Section | Principle | Implementation |
|---|---|---|
| XVI.1 | Graph + Vector duality | SQLite + LanceDB dual storage |
| XVI.2 | Deterministic hashing | SHA256 content hash in hash.rs |
| XVII.1 | Max 2 healing attempts | HealingLoop.max_attempts = 2 |
| XVII.2 | Verification gate | VerificationGate (compile + lint + test) |
| XVII.3 | Whole-block replacement | apply.rs replaces entire files |
| XVII.4 | Audit logging | audit.rs → healing_attempts table |
| XVIII.1 | Pre-flight checks | ConstitutionCheck.run_all() |
| XVIII.2 | SemVer from commits | calculate_next_version() |
| XIX.1 | Structured CLI | Clap with subcommands |
| XIX.2 | Exit codes | 0=success, 1=failure |
Performance Considerations
- Incremental scanning: files are re-parsed only when their content hash has changed
- Lazy model loading: the embedding model is loaded on first use, not at startup
- Batch vector inserts: embeddings are written to LanceDB in batches rather than one row at a time
- Async I/O: a Tokio runtime runs I/O-bound operations concurrently
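Lazy model loading can be sketched with the standard library's `OnceLock`. This is an assumption about the pattern, not a description of embed.rs: `EmbeddingModel`, `load_model`, and `model` are hypothetical, and the real loader would read Candle weights rather than build a placeholder struct.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for the embedding model.
struct EmbeddingModel {
    dim: usize,
}

/// Counts how often the expensive load ran (for demonstration only).
static LOAD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Process-wide slot that is filled on first access.
static MODEL: OnceLock<EmbeddingModel> = OnceLock::new();

/// Placeholder for the expensive load step.
fn load_model() -> EmbeddingModel {
    LOAD_COUNT.fetch_add(1, Ordering::SeqCst);
    EmbeddingModel { dim: 384 }
}

/// Returns the shared model, loading it on the first call only.
fn model() -> &'static EmbeddingModel {
    MODEL.get_or_init(load_model)
}
```

Commands that never embed anything never pay the load cost, and concurrent callers all receive the same instance because `get_or_init` runs the initializer at most once.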