Gotcha Registry¶
This registry captures traps, failure modes, and migration hazards discovered while working on Arqon Maestro.
Use this page for things that are too operational or failure-oriented for the decision log, but too important to leave in chat history.
Maintenance Rule¶
Add an entry when a problem is:
- repeatable
- surprising
- likely to waste time again
- phase-relevant
- safety-relevant during migration
Do not put every bug here. This page is for patterns and traps, not generic defect tracking.
Categories¶
Configuration¶
Use for:
- path fallback surprises
- environment variable conflicts
- stale config migration issues
- old and new config sources disagreeing
Runtime Startup¶
Use for:
- Electron startup traps
- loading hangs
- silent sidecar failure
- local backend detection traps
Audio And Voice Pipeline¶
Use for:
- microphone capture surprises
- endpointing failures
- dictate-mode confusion
- chunk lifecycle traps
Process And Sidecar Identity¶
Use for:
- `pkill` hazards
- renamed process mismatch
- stale background helpers
- filename/process-name coupling
Build And Packaging¶
Use for:
- Gradle/CMake path assumptions
- copied artifact name mismatches
- installDist/runtime lookup breakage
- source-root and library-root coupling
Protocol And Plugin Integration¶
Use for:
- plugin handshake mismatches
- callback/completed contract issues
- editor state retrieval traps
- cross-process string mismatches
Namespace And Dependency Identity¶
Use for:
- package-name cascades
- JNI/header regeneration traps
- artifact coordinate mismatches
- wrapper vs hard-rename tradeoffs
External Infrastructure¶
Use for:
- endpoint ownership issues
- CDN dependencies
- marketplace identity constraints
- DNS/TLS blockers
Entries¶
GOTCHA-001: Path Rename Is Not Identity Rename¶
- Category: Build And Packaging
- Status: active
- Summary: Renaming the repo subtree from `serenade/` to `maestro/` does not solve package names, process names, config names, JNI namespaces, or dependency identity.
- Impact: High
- Where it matters:
- path migrations
- build root variables
- docs and scripts
- Avoidance:
- treat path rename as its own phase
- do not combine it with namespace or dependency migration
GOTCHA-002: Config Breakage Presents As Random Runtime Failure¶
- Category: Configuration
- Status: active
- Summary: Removing the `.serenade` fallback too early can show up as missing login state, wrong endpoint, missing scripts, or silent custom-command failure rather than a clean config error.
- Impact: High
- Where it matters:
- settings migration
- first-run behavior
- support docs
- Avoidance:
- make `.arqon/arqon.json` primary
- preserve `.serenade/serenade.json` read fallback during transition
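The read order described above can be sketched as follows. This is a minimal illustration, not the project's settings layer: it assumes both files are JSON and live under the user's home directory, and `load_settings` is a hypothetical helper name.

```python
import json
from pathlib import Path
from typing import Dict, Optional, Tuple

# File names come from this entry; locating them under the home
# directory is an assumption made for the sketch.
CANONICAL = Path(".arqon") / "arqon.json"
LEGACY = Path(".serenade") / "serenade.json"


def load_settings(home: Path) -> Tuple[Dict, Optional[Path]]:
    """Return (settings, source_path), preferring the canonical file."""
    for relative in (CANONICAL, LEGACY):
        candidate = home / relative
        if candidate.is_file():
            with candidate.open(encoding="utf-8") as handle:
                return json.load(handle), candidate
    # Neither file exists: report that explicitly so callers can raise a
    # clean config error instead of failing later in unrelated code.
    return {}, None
```

Returning the source path alongside the settings lets callers log which file was actually used, which helps when the symptoms look like random runtime failures.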
GOTCHA-003: Process-Name Mismatch Fails Silently¶
- Category: Process And Sidecar Identity
- Status: active
- Summary: If the sidecar filename, process name, and shutdown logic disagree, custom commands can stop working without an obvious top-level error.
- Impact: High
- Where it matters:
- custom command sidecar rename
- `pkill` cleanup logic
- startup/shutdown lifecycle
- Avoidance:
- stop relying on loose string matching alone
- rename producer and consumer together
- verify keepalive and reload behavior after each process rename
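One way to avoid loose string matching is to compare the executable basename exactly instead of searching the whole command line. The sketch below works on already-collected `(pid, command line)` pairs; the process name used in the usage note is hypothetical, and a sidecar launched through an interpreter would need a different field than `argv[0]`.

```python
import os
import shlex
from typing import Iterable, List, Tuple


def matching_pids(processes: Iterable[Tuple[int, str]], executable_name: str) -> List[int]:
    """Return PIDs whose executable basename equals executable_name exactly."""
    matches = []
    for pid, command_line in processes:
        argv = shlex.split(command_line)
        # Compare only argv[0]'s basename, so arguments and similarly
        # named binaries do not produce false positives.
        if argv and os.path.basename(argv[0]) == executable_name:
            matches.append(pid)
    return matches
```

Given `/opt/app/arqon-custom-commands --watch`, `vim arqon-custom-commands.log`, and `/usr/bin/arqon-custom-commands-old`, a substring match on `arqon-custom-commands` would hit all three, while this function returns only the first.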
GOTCHA-004: Namespace Renames Cascade Across Java, JNI, and Artifacts¶
- Category: Namespace And Dependency Identity
- Status: active
- Summary: `ai.serenade.*` is not a branding string. It is embedded in Java packages, imports, generated headers, native glue, and artifact identity.
- Impact: High
- Where it matters:
- tree-sitter Java binding
- parser imports
- Gradle artifact coordinates
- Avoidance:
- do not treat namespace migration like a text-edit pass
- stage wrappers first where possible
- regenerate and verify native bindings in the same subsystem patch
GOTCHA-005: External Endpoint Names Are Not Ready To Rebrand¶
- Category: External Infrastructure
- Status: active
- Summary: `stream-*.serenade.ai` and `serenadecdn.com` are still real dependencies. Renaming config or docs to Arqon-owned endpoints before the infrastructure exists will create false expectations and broken runtime behavior.
- Impact: Medium
- Where it matters:
- endpoint config
- model download scripts
- support docs
- Avoidance:
- keep inherited external endpoints until Arqon-owned replacements are live and validated
GOTCHA-006: Preferred Path Helpers Are Not Enough¶
- Category: Configuration
- Status: mitigated
- Summary: A codebase can appear Arqon-first while still behaving legacy-first if reads and writes continue to prefer existing legacy files.
- Impact: High
- Where it matters:
- settings migration
- first-run behavior
- VS Code integration
- Avoidance:
- return canonical `.arqon` file paths from the settings layer
- migrate legacy contents forward when needed
- keep legacy files as read fallback, not as the preferred write target
GOTCHA-007: Docs Can Undermine A Compatibility Layer¶
- Category: Build And Packaging
- Status: active
- Summary: Even when the code prefers Arqon names, stale runbooks that still tell users to export `SERENADE_*` or edit `~/.serenade/serenade.json` will drag the system back into legacy-first operation.
- Impact: Medium
- Where it matters:
- root runbooks
- troubleshooting guides
- training and build docs
- Avoidance:
- make Arqon names canonical in every user-facing command example
- mention legacy names only as explicit compatibility notes
GOTCHA-008: Top-Level Ignore Rules Must Move With The Subtree¶
- Category: Build And Packaging
- Status: mitigated
- Summary: Renaming the engine subtree without updating the root `.gitignore` will surface generated assets and model directories as unexpected changes, even when the rename itself is correct.
- Impact: Medium
- Where it matters:
- subtree rename
- generated model assets
- local packaging output
- Avoidance:
- update root ignore rules in the same patch as the subtree move
- verify untracked model/runtime directories are ignored before closing the phase
GOTCHA-009: Sidecar Rebrand Must Preserve User Script Surface¶
- Category: Process And Sidecar Identity
- Status: mitigated
- Summary: Renaming the sidecar entrypoint alone is not enough. Existing user automations can still depend on `global.serenade` and `~/.serenade/scripts`, so a strict cutover can break custom commands even when the process starts correctly.
- Impact: High
- Where it matters:
- custom-command sidecar migration
- user automation compatibility
- startup and reload verification
- Avoidance:
- expose `global.arqon` as canonical
- preserve `global.serenade` as a compatibility alias during transition
- load and watch both `.arqon/scripts` and `.serenade/scripts`
GOTCHA-010: Native Packaging Failures Can Be Environmental, Not Rename Regressions¶
- Category: Build And Packaging
- Status: active
- Summary: After a runtime identity migration, native packaging failures may still come from unrelated external issues such as stale CMake state, missing Marian artifacts, or protobuf compiler/header drift. Those failures should not be misattributed to process-name changes.
- Impact: High
- Where it matters:
- Phase 4 verification
- native engine packaging
- evidence review
- Avoidance:
- clean native build directories before verification
- rerun with explicit `ARQON_MAESTRO_SOURCE_ROOT` and `ARQON_MAESTRO_LIBRARY_ROOT`
- prove at least one renamed packaged artifact path independently of the failing external toolchain
GOTCHA-011: Shared .arqon Root Means Migration Must Be Per-Entry¶
- Category: Configuration
- Status: mitigated
- Summary: `~/.arqon` can already exist for other Arqon tools. That means migration cannot assume the root directory is empty or use root-directory existence as a signal that Maestro state has already been migrated.
- Impact: High
- Where it matters:
- config storage migration
- script migration
- log migration
- Avoidance:
- migrate file-by-file and directory-by-directory
- treat each canonical target independently
- only skip migration when the specific destination file or directory is already populated
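A per-entry migration along these lines might look like the sketch below. The entry pairs are illustrative assumptions, not the real migration table, and copying rather than moving is a choice made here so the legacy files remain available as a read fallback.

```python
import shutil
from pathlib import Path
from typing import List

# Hypothetical (legacy, canonical) pairs, relative to the home directory.
ENTRIES = [
    (Path(".serenade/serenade.json"), Path(".arqon/arqon.json")),
    (Path(".serenade/scripts"), Path(".arqon/scripts")),
]


def is_populated(path: Path) -> bool:
    """A destination counts as migrated only if it holds real content."""
    if path.is_dir():
        return any(path.iterdir())
    return path.is_file() and path.stat().st_size > 0


def migrate(home: Path) -> List[Path]:
    """Copy each legacy entry forward independently; return what was migrated."""
    migrated = []
    for legacy_rel, canonical_rel in ENTRIES:
        legacy, canonical = home / legacy_rel, home / canonical_rel
        # Decide per entry. The existence of ~/.arqon itself proves nothing.
        if not legacy.exists() or is_populated(canonical):
            continue
        canonical.parent.mkdir(parents=True, exist_ok=True)
        if legacy.is_dir():
            shutil.copytree(legacy, canonical, dirs_exist_ok=True)
        else:
            shutil.copy2(legacy, canonical)
        migrated.append(canonical_rel)
    return migrated
```

Running `migrate` a second time returns an empty list, because each destination is then populated and is skipped on its own merits.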
GOTCHA-012: Canonical Storage Without Script Migration Breaks User Expectations¶
- Category: Configuration
- Status: mitigated
- Summary: Moving the config files to `.arqon` is not enough. If `custom.js` remains only under `.serenade/scripts`, UI actions like “open custom commands” point at the canonical directory while the user’s real automation still lives elsewhere.
- Impact: High
- Where it matters:
- script migration
- custom-command UX
- support/debugging
- Avoidance:
- migrate scripts with config state, not as a separate afterthought
- treat canonical script directories as part of the user-facing product surface
GOTCHA-013: Optional Native Module Warnings Can Hide Real Build Failures¶
- Category: Build And Packaging
- Status: mitigated
- Summary: The Electron main build can look unhealthy even when the real issue is only optional `ws` native module resolution. Leaving those warnings in place makes later build failures harder to interpret.
- Impact: Medium
- Where it matters:
- Electron main build
- CI logs
- runtime recovery work
- Avoidance:
- mark optional native modules explicitly in bundler configuration
- keep the active build path warning-free so new failures stand out
GOTCHA-014: Missing Root .gitmodules Leaves The Repo In A Half-Tracked State¶
- Category: Namespace And Dependency Identity
- Status: active
- Summary: After vendoring one tree-sitter dependency, the repo can still contain active gitlinks for other tree-sitter grammars. If the root `.gitmodules` file is not tracked, the repo shape is incomplete and future clone/submodule operations become fragile.
- Impact: Medium
- Where it matters:
- tree-sitter grammar dependencies
- fresh clone behavior
- repository consistency checks
- Avoidance:
- track the root `.gitmodules` file whenever gitlinks remain in the repo
- treat vendored dependencies and active submodules as separate states that both need explicit metadata
GOTCHA-015: Legacy Publishing Plugins Become Build Breakers Once Promoted To Local Dependencies¶
- Category: Namespace And Dependency Identity
- Status: mitigated
- Summary: An inherited subproject can look harmless while it is consumed as an external artifact, then fail immediately once it is included in the root build because stale publishing plugins resolve dead repositories or abandoned transitive metadata.
- Impact: High
- Where it matters:
- local dependency replacement
- tree-sitter binding promotion into the root Gradle build
- Phase 7 verification
- Avoidance:
- strip publishing-only plugins from subprojects before making them first-class build dependencies
- verify the promoted subproject independently before relying on it from `core`
GOTCHA-016: Gitlinks Break Parent Pushes If The New Commit Is Only Local¶
- Category: Namespace And Dependency Identity
- Status: mitigated
- Summary: If a parent repo points at a nested git commit that exists only in your local clone, the parent push is structurally incomplete. Anyone else syncing the parent will receive a gitlink to an unreachable object.
- Impact: High
- Where it matters:
- vendoring inherited dependencies
- nested tree-sitter repos
- hard-close publishing
- Avoidance:
- do not leave Phase 7 as a parent gitlink update alone
- vendor the dependency into the parent repo if you do not control the nested remote
- preserve the nested `.git` directory separately as rollback evidence before internalizing it
GOTCHA-017: Upstream Package Names Are Not The Same Problem As Internal Namespace Leaks¶
- Category: Namespace And Dependency Identity
- Status: active
- Summary: External package names such as `serenade-driver` can remain in lockfiles or manifests even after the internal code graph has been rebranded. Treating those two problems as identical pushes teams toward risky manifest surgery with little runtime value.
- Impact: Medium
- Where it matters:
- npm dependency manifests
- supply-chain audit review
- Phase 7 closeout decisions
- Avoidance:
- remove direct code-facing imports first
- wrap inherited upstream packages behind Arqon-named local modules when publication ownership has not changed
- document residual manifest names explicitly in the evidence pack
GOTCHA-018: Local Startup Can Hang When The Bundle Is Structurally Incomplete¶
- Category: Runtime Startup
- Status: mitigated
- Summary: If the local bundle is missing `speech-engine`, `code-engine`, or their model directories, the UI can sit in `Starting Server` while the real problem is simply that the packaged local stack is incomplete.
- Impact: High
- Where it matters:
- local endpoint startup
- bundled service launches
- support/debugging
- Avoidance:
- validate the packaged local bundle before polling service health
- fail local startup explicitly and point to the packaging command and native dependency docs
GOTCHA-019: Missing Native Dependency Roots Should Fail In Gradle, Not CMake¶
- Category: Build And Packaging
- Status: mitigated
- Summary: When the local library root is missing Boost, Protobuf, Crow, SentencePiece, Marian, or Kaldi assets, letting packaging continue into CMake produces noisy secondary errors that obscure the real blocker.
- Impact: High
- Where it matters:
- `client:installServer`
- local runtime recovery
- environment bring-up
- Avoidance:
- verify native dependency inputs before invoking CMake
- fail with a concrete missing-path list and the exact remediation command
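A preflight check of this kind could be sketched as below. The subdirectory names under the library root are assumptions for illustration (the real layout may differ), and the remediation hint is passed in by the caller because the exact command is not specified here.

```python
from pathlib import Path
from typing import List, Sequence

# Assumed subdirectory names under the library root; adjust to the real layout.
REQUIRED_DIRS = ("boost", "protobuf", "crow", "sentencepiece", "marian", "kaldi")


class MissingNativeDependencies(Exception):
    """Raised before CMake runs when dependency inputs are absent."""


def missing_dependency_paths(library_root: Path,
                             required: Sequence[str] = REQUIRED_DIRS) -> List[Path]:
    """Return every required directory that does not exist."""
    return [library_root / name for name in required
            if not (library_root / name).is_dir()]


def preflight(library_root: Path, remediation_hint: str) -> None:
    """Fail early with the full missing-path list, not the first CMake error."""
    missing = missing_dependency_paths(library_root)
    if missing:
        listing = "\n".join(f"  - {path}" for path in missing)
        raise MissingNativeDependencies(
            f"Missing native dependency inputs:\n{listing}\n{remediation_hint}"
        )
```

Collecting all missing paths before failing gives one actionable report instead of a sequence of fix-and-rerun cycles.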
GOTCHA-020: Internal Package Slugs Can Collide With Repo Names¶
- Category: Namespace And Dependency Identity
- Status: mitigated
- Summary: Renaming an internal Python package to `maestro` would collide conceptually with the repo subtree name and could create import ambiguity. The safer internal slug is `arqon_maestro`.
- Impact: Medium
- Where it matters:
- Python training/tooling package migration
- import path cleanup
- documentation examples
- Avoidance:
- use `arqon_maestro` for the internal Python package slug
- reserve `maestro` for repo/directory identity rather than Python import identity
GOTCHA-021: Legacy Tree-Sitter JNI Symbols Can Crash Core During System.load¶
- Category: Namespace And Dependency Identity
- Status: mitigated
- Summary: If `libjava-tree-sitter.so` exports `Java_ai_serenade_treesitter_*` while Java code imports `ai.arqon.maestro.treesitter.*`, local core can crash during `Parser.<clinit>()` before normal app startup.
- Impact: High
- Where it matters:
- `core/bin/build-tree-sitter.py`
- `client:installServer` packaging path
- Wave B runtime smoke
- Avoidance:
- build JNI from the in-repo `maestro/tree-sitter/java-tree-sitter/build.py`
- reject cached JNI artifacts without `Java_ai_arqon_maestro_treesitter_*` symbols
- verify JNI symbol namespace in Wave B evidence
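The symbol check can be sketched as a small parser over symbol-table text, for example the output of `nm -D --defined-only` on the shared library. The helper names are hypothetical, and rejecting any artifact that still exports the legacy prefix is an extra assumption layered on top of the rule above.

```python
from typing import List

EXPECTED_PREFIX = "Java_ai_arqon_maestro_treesitter_"
LEGACY_PREFIX = "Java_ai_serenade_treesitter_"


def exported_jni_symbols(nm_output: str) -> List[str]:
    """Extract Java_* symbol names from nm-style output (symbol is the last field)."""
    symbols = []
    for line in nm_output.splitlines():
        fields = line.split()
        if fields and fields[-1].startswith("Java_"):
            symbols.append(fields[-1])
    return symbols


def jni_namespace_ok(nm_output: str) -> bool:
    """Accept only artifacts that export the new namespace and no legacy symbols."""
    symbols = exported_jni_symbols(nm_output)
    has_expected = any(name.startswith(EXPECTED_PREFIX) for name in symbols)
    has_legacy = any(name.startswith(LEGACY_PREFIX) for name in symbols)
    return has_expected and not has_legacy
```

Keeping the parser separate from the command that produces the text makes the check easy to run against saved Wave B evidence as well as live artifacts.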
GOTCHA-022: Code-Engine Can Segfault Before main() Despite Valid Model Paths¶
- Category: Build And Packaging
- Status: mitigated
- Summary: `code-engine` can crash with SIGSEGV before entering `main()`, so normal startup logs and argument-validation paths never run. This appears as an immediate wrapper crash even when `CODE_ENGINE_MODELS` is set.
- Impact: High
- Where it matters:
- Wave B local runtime hard-close
- native ABI/static-init compatibility checks
- local service health validation on `:17203`
- Avoidance:
- treat pre-`main()` crashes as native ABI/static-init blockers, not app-logic bugs
- capture native traces outside sandbox ptrace restrictions
- keep sentencepiece tokenization on a stable boundary (`spm_encode` CLI path in `TokenIdConverter`)
- do not hard-close local runtime milestones until `:17203` health checks are green
GOTCHA-023: Failed Native Link Can Leave A Zero-Filled Engine Artifact In Local Bundle¶
- Category: Build And Packaging
- Status: active
- Summary: If `code-engine` native link fails but packaging reuses an existing path, `client/static/local/code-engine/arqon-maestro-code-engine` can appear as a large zero-filled file (`file`: `data`), causing `Exec format error` and misleading runtime debugging.
- Impact: High
- Where it matters:
- `code-engine:buildCMake` and `code-engine:distTar`
- `client:installCodeEngine`
- local runtime smoke scripts
- Avoidance:
- verify native target output exists in `maestro/code-engine/server/build/code-engine/arqon-maestro-code-engine` after CMake link
- verify installed artifact with `file` before runtime smoke
- fail the pipeline immediately on native link errors before trusting packaged local binaries
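Where the `file` utility is not convenient, a rough equivalent is to inspect the leading bytes of the installed artifact. This sketch assumes a Linux ELF target and samples only the start of the file, so it is a smoke check rather than a full validation; `classify_artifact` is a hypothetical helper.

```python
from pathlib import Path

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF binary


def classify_artifact(path: Path, sample_size: int = 4096) -> str:
    """Return 'missing', 'empty', 'elf', 'zero-filled', or 'unknown'."""
    if not path.is_file():
        return "missing"
    with path.open("rb") as handle:
        head = handle.read(sample_size)
    if not head:
        return "empty"
    if head.startswith(ELF_MAGIC):
        return "elf"
    # A failed link that left a preallocated file shows up as all-zero bytes.
    if not any(head):
        return "zero-filled"
    return "unknown"
```

A smoke script can refuse to start the local runtime unless the result is `elf`, which turns an `Exec format error` at launch into an explicit packaging failure.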
GOTCHA-024: Global ELECTRON_RUN_AS_NODE Pollutes App Runtime Evidence¶
- Category: Runtime Startup
- Status: mitigated
- Summary: Exporting `ELECTRON_RUN_AS_NODE=1` globally can make Electron app startup and packaged smoke checks run in the wrong mode, producing misleading failures or false negatives.
- Impact: High
- Where it matters:
- desktop runtime smoke checks
- packaging evidence collection
- support/debug sessions where shell environment is reused
- Avoidance:
- keep legacy `ELECTRON_RUN_AS_NODE=1` usage scoped to the single command that needs it
- run Maestro Electron commands through `maestro/scripts/with_clean_electron_env.sh`
- fail readiness checks when global contamination is detected
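The readiness check and the clean-environment idea can be sketched in a few lines. This is not the contents of `with_clean_electron_env.sh`; the function names are hypothetical, and treating any non-empty value as contamination is a conservative assumption.

```python
from typing import Dict, Mapping, Optional

FLAG = "ELECTRON_RUN_AS_NODE"


def contamination_error(env: Mapping[str, str]) -> Optional[str]:
    """Return an error message if the flag leaks in from the shell, else None."""
    value = env.get(FLAG)
    if value:
        return f"{FLAG}={value} is set globally; unset it before Electron smoke checks"
    return None


def clean_electron_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Copy the environment without the flag, leaving the caller's env untouched."""
    return {key: value for key, value in env.items() if key != FLAG}
```

A readiness script could call `contamination_error(os.environ)` and abort on a non-`None` result, while launch helpers pass `clean_electron_env(os.environ)` as the `env` argument of `subprocess.run`.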
GOTCHA-025: External Cutover Before Ownership Assignment Causes Operational Deadlock¶
- Category: External Infrastructure
- Status: active
- Summary: Migrating runtime endpoints or update hosts before assigning DNS/TLS/CDN/runtime ownership creates a high-risk state where incidents have no clear operator or rollback authority.
- Impact: High
- Where it matters:
- Wave D live endpoint migration
- desktop update pipeline cutover
- incident response and rollback
- Avoidance:
- complete Wave D1 ownership inventory before any endpoint switch
- require explicit owners for DNS, TLS, CDN, runtime service, and release operations
- enforce a no-cutover rule until readiness checklist is fully green
GOTCHA-026: Rewriting Legal/Provenance Text Directly Can Corrupt Historical Accuracy¶
- Category: External Infrastructure
- Status: mitigated
- Summary: Bulk rebrand rewrites can unintentionally alter legal or historical meaning in inherited privacy/terms/blog content.
- Impact: High
- Where it matters:
- Wave E content migration
- legal policy pages
- historical public archive pages
- Avoidance:
- classify legal/historical pages as `preserve + annotate`
- add explicit provenance notices instead of rewriting legal body text
- rewrite active product surfaces separately from historical/legal bodies
GOTCHA-027: Evidence Drift Between Code And Closeout Docs¶
- Category: Configuration
- Status: active
- Summary: Documentation can claim production defaults or completed rollout while `settings.ts` still uses conservative shadow defaults, or vice versa.
- Impact: High
- Where it matters:
- closeout packs
- release readiness reviews
- operator handoff
- Avoidance:
- verify docs against current defaults before every closeout
- include a command-backed defaults snapshot in evidence docs
GOTCHA-028: Mock Confidence Trap¶
- Category: Protocol And Plugin Integration
- Status: active
- Summary: Passing a mock bus harness can be misreported as production readiness even though it only proves protocol handling in a controlled environment.
- Impact: High
- Where it matters:
- go/no-go decisions
- rollout stage promotion
- incident preparedness
- Avoidance:
- label evidence as `mock`, `staging`, or `production-like`
- require separate production-like validation evidence before GO
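The labelling rule can be made mechanical with a small gate. The types below are illustrative only and assume evidence items are recorded with exactly one of the three labels.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class EvidenceLevel(Enum):
    MOCK = "mock"
    STAGING = "staging"
    PRODUCTION_LIKE = "production-like"


@dataclass(frozen=True)
class Evidence:
    description: str
    level: EvidenceLevel


def go_allowed(evidence: Iterable[Evidence]) -> bool:
    """GO needs at least one production-like item; mock and staging never suffice."""
    return any(item.level is EvidenceLevel.PRODUCTION_LIKE for item in evidence)
```

With this gate, a passing mock bus harness on its own yields `False`, no matter how many mock or staging items accompany it.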
GOTCHA-029: Path-Portability Trap In Documentation Links¶
- Category: Configuration
- Status: active
- Summary: Absolute `file:///` links work only on one workstation and break portability, external review, and AI handoff quality.
- Impact: Medium
- Where it matters:
- implementation plans
- architecture references
- cross-repo collaboration
- Avoidance:
- use repo-relative links in docs
- run link portability checks before closeout
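A link portability check can be as small as a regular-expression scan over each doc. The sketch below is a hypothetical helper that reports line numbers and offending URLs; it does not validate that repo-relative links resolve.

```python
import re
from typing import List, Tuple

# Matches file:/// URLs up to whitespace or a closing bracket.
FILE_URL = re.compile(r"file:///[^\s)>]+")


def find_absolute_file_links(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, url) for every absolute file:/// link in text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in FILE_URL.finditer(line):
            hits.append((number, match.group(0)))
    return hits
```

A closeout script can read each documentation file, call this function, and fail when the returned list is non-empty.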
GOTCHA-030: Stage Gate Bypass By Flag Drift¶
- Category: Runtime Startup
- Status: active
- Summary: Manual stage approval may exist in code, but conflicting defaults or stale user/system config flags can still route traffic unexpectedly.
- Impact: High
- Where it matters:
- staged cutover
- local recovery testing
- release operations
- Avoidance:
- verify all cutover flags as a single preflight checklist
- require explicit evidence for defaults and current stage before promotion
GOTCHA-031: Green Build Illusion¶
- Category: Build And Packaging
- Status: active
- Summary: A clean compile can hide runtime correctness gaps, missing integration semantics, or incomplete rollback behavior.
- Impact: High
- Where it matters:
- phase completion decisions
- hard-close packs
- quality audits
- Avoidance:
- require build + harness + rollback proof together
- block closeout when any operational gate lacks evidence
Entry Template¶
### GOTCHA-XXX: <Short Title>
- **Category**: Configuration | Runtime Startup | Audio And Voice Pipeline | Process And Sidecar Identity | Build And Packaging | Protocol And Plugin Integration | Namespace And Dependency Identity | External Infrastructure
- **Status**: active | mitigated | resolved
- **Summary**: <What goes wrong>
- **Impact**: Low | Medium | High
- **Where it matters**: <Affected phases/subsystems>
- **Avoidance**: <How to avoid repeating it>