Architecture¶
This document defines the intended setup for GitTinkerer (comment-to-code automation), including boundaries, runtime contracts, and the on-disk audit artifacts.
Scope and constraints¶
- Single repo per run
- Triggered by:
  - PR comment (GitHub App webhook on issue_comment)
  - Web UI comment (Svelte app in web/)
- Non-interactive execution (no prompts, no HITL)
- Commits are pushed back to the target branch derived from PR metadata/ref
- Review happens after execution via commits + artifacts
System context (boundaries)¶
flowchart LR
Dev[Developer]
subgraph GitHub[GitHub]
PR[Pull Request]
Comments[PR Comments]
API[GitHub API]
Repo[Git Repository]
Webhook[GitHub Webhook Delivery]
end
subgraph VPS[VPS]
subgraph GT["GitTinkerer (this repo)"]
Entry["bin/gittinkerer (bash)"]
Work[Workspace manager]
Codex[Codex runner]
Git[Git commit/push]
Reply[Reply publisher]
Art[Artifacts writer]
end
subgraph Web["Web UI (same repo)"]
Svelte["Svelte app (web/)"]
end
Service["HTTP service (service/)"]
CodexCLI["Codex CLI (installed/authenticated)"]
end
subgraph Traefik["Traefik (elsewhere)"]
Proxy[Reverse proxy]
end
Dev -->|writes comment| Comments
Dev -->|uses web UI| Proxy
Proxy -->|routes HTTP| Svelte
Comments --> Webhook
Webhook -->|HTTP POST| Service
Service -->|spawn| Entry
Entry --> Work
Work --> Repo
Entry --> Codex
Codex --> CodexCLI
Entry --> Git
Git --> Repo
Entry --> Reply
Reply --> API
Entry --> Art
Execution flows¶
Service/web/data flow (with analytics)¶
flowchart LR
subgraph Web
UI["Web UI\n(/, /runs, /runs/:id, /admin, /analytics)"]
end
subgraph Service
HTTP[HTTP API\nNode 24 service]
Metrics[Metrics & Analytics]
Sentry[Sentry SDK]
end
subgraph Data
PG[(PostgreSQL\nruns, run_metrics,\npaused_repos)]
Redis[(Redis\nrate limits & cache)]
end
UI -->|REST| HTTP
HTTP -->|read/write| PG
HTTP -->|cache rate limits\nrun status| Redis
Metrics -->|aggregations| PG
HTTP -->|errors| Sentry
Run lifecycle (API/web)¶
sequenceDiagram
participant UI as Web UI / Webhook
participant API as Service API
participant PG as Postgres
participant R as Redis
participant S as Sentry
UI->>API: POST /api/runs (payload)
API->>PG: insert run (status=running, user/installation)
API->>R: rate-limit check + status cache
API-->>UI: 202 run_id
API->>API: spawn CLI runner
API-->>S: capture errors (if spawn/migrate fails)
API->>PG: update status + metrics on completion
API->>R: cache final status
UI->>API: GET /api/runs/:id (polling)
Bash module flow (current implementation)¶
flowchart TD
A[bin/gittinkerer] -->|parse_args| B[lib/args.sh]
B -->|load_env| C[lib/env.sh]
C -->|parse_payload| D[lib/payload.sh]
D -->|init_artifacts| E[lib/artifacts.sh]
E -->|init_workspace| F[lib/workspace.sh]
F -->|build+run prompt| G[lib/codex.sh]
G -->|diff/commit/push| H[lib/git.sh]
H -->|publish reply| I{source?}
I -->|pr| J[lib/reply/github.sh]
I -->|web| K[lib/reply/web.sh]
G -->|metrics| M[lib/metrics.sh]
subgraph Error handling
X[EXIT trap] -->|stage + message| Y[write_failure_artifacts]
end
PR comment mode (happy path)¶
sequenceDiagram
autonumber
actor Dev as Developer
participant GH as GitHub PR
participant Service as VPS HTTP service
participant GT as bin/gittinkerer
participant Repo as Target Git Repo (workspace)
participant Codex as Codex CLI
participant API as GitHub API
Dev->>GH: Comment with instruction (must start with /tinker)
GH->>Service: issue_comment webhook delivery
Service->>Service: Verify signature, gate on /tinker
Service->>API: Fetch PR JSON via issue.pull_request.url
Service->>GT: Spawn bin/gittinkerer with --payload-file (stdin not supported)
GT->>GT: Create artifacts/<timestamp>/agent-run/
GT->>Repo: Clone or fetch/pull
GT->>Repo: Checkout ref derived from PR metadata/ref
GT->>GT: Build prompt (includes instruction + reporting contract)
GT->>Codex: Run Codex CLI (non-interactive)
Codex-->>GT: Changes + response text
GT->>Repo: Commit changes (nntin-bot)
GT->>Repo: Push commits
GT->>API: Post a new PR conversation comment (by PR number)
GT->>GT: Write artifacts (prompt, diff, summary, etc.)
Web comment mode (happy path)¶
sequenceDiagram
autonumber
actor Dev as Developer
participant Web as Svelte Web UI
participant VPS as VPS
participant GT as bin/gittinkerer
participant Repo as Target Git Repo (workspace)
participant Codex as Codex CLI
participant GH as GitHub (repo + API)
Dev->>Web: Submit instruction comment
Web->>GT: Invoke run (local call on VPS)
GT->>GT: Create artifacts/<timestamp>/agent-run/
GT->>Repo: Clone or fetch/pull
GT->>Repo: Checkout ref (from payload / derived context)
GT->>Codex: Run Codex CLI
Codex-->>GT: Changes + response text
GT->>Repo: Commit + push to GitHub
Repo->>GH: Push
GT->>Web: Post reply to run_id callback
GT->>GT: Write artifacts
Data persistence¶
Actual database schema (from service/src/infra/db/migrations.ts):
erDiagram
runs {
TEXT run_id PK
TEXT repo_full_name
TEXT repo_clone_url
TEXT source
TEXT status
JSONB payload
TEXT payload_path
TEXT artifacts_dir
TEXT comment_body
TEXT comment_raw_body
INT pr_number
TEXT pr_head_ref
TEXT pr_head_sha
TEXT pr_base_ref
TEXT timestamp
TIMESTAMPTZ started_at
TIMESTAMPTZ finished_at
INT exit_code
TEXT user_id
TEXT installation_id
TIMESTAMPTZ created_at
TIMESTAMPTZ updated_at
}
run_metrics {
BIGSERIAL id PK
TEXT run_id FK
TEXT metric_name
NUMERIC metric_value
TIMESTAMPTZ recorded_at
}
rate_limits {
TEXT repo_full_name PK
TIMESTAMPTZ last_reset_at
INT allowed_per_window
INT window_seconds
INT used_in_window
TIMESTAMPTZ updated_at
}
paused_repos {
TEXT repo_full_name PK
TEXT paused_reason
TIMESTAMPTZ paused_until
TIMESTAMPTZ created_at
}
paused_run_targets {
TEXT repo_full_name PK
TEXT user_id PK
TEXT paused_reason
TIMESTAMPTZ paused_until
TIMESTAMPTZ created_at
}
runs ||--o{ run_metrics : "has metrics"
runs ||--o{ paused_run_targets : "can be paused per user"
- Indexes: repo/status/started/user/installation on runs; run_id/metric_name on run_metrics; repo_full_name and user_id on paused_run_targets.
- Migrations are idempotent and tracked in schema_migrations; failures surface with a clear startup error.
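The rate_limits columns listed above (last_reset_at, allowed_per_window, window_seconds, used_in_window) are consistent with a fixed-window counter per repository. The following is a minimal sketch of that reading, not the service's actual logic: the function name `consumeRateLimit`, the camel-cased field names, and the reset rule are all assumptions made for illustration.

```typescript
// Hypothetical in-memory view of one rate_limits row (camel-cased for TypeScript).
interface RateLimitRow {
  lastResetAt: Date;
  allowedPerWindow: number;
  windowSeconds: number;
  usedInWindow: number;
}

interface RateLimitDecision {
  allowed: boolean;
  row: RateLimitRow; // the row as it should be written back
}

// Assumed fixed-window rule: reset the counter once the window has elapsed,
// then allow the run only while usedInWindow is below allowedPerWindow.
function consumeRateLimit(row: RateLimitRow, now: Date): RateLimitDecision {
  const elapsedMs = now.getTime() - row.lastResetAt.getTime();
  const current: RateLimitRow =
    elapsedMs >= row.windowSeconds * 1000
      ? { ...row, lastResetAt: now, usedInWindow: 0 }
      : row;

  if (current.usedInWindow >= current.allowedPerWindow) {
    return { allowed: false, row: current };
  }
  return {
    allowed: true,
    row: { ...current, usedInWindow: current.usedInWindow + 1 },
  };
}
```

Passing `now` explicitly keeps the rule deterministic and easy to test; a real implementation would also need to make the read-modify-write atomic in PostgreSQL or Redis.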
Runtime components (VPS)¶
- Entrypoint: bin/gittinkerer (bash)
- Workspace manager: Ensures the single target repo is present and on the right ref.
- Codex runner: Builds prompt, executes Codex CLI, captures response.
- Git manager: Creates diff, commits as nntin-bot, pushes.
- Reply publisher: Posts response back to PR comment thread or web conversation.
- Artifacts writer: Writes a complete audit bundle per run.
Domain model (service/src/domain)¶
classDiagram
class Run {
+runId: string
+repoFullName: string
+repoCloneUrl: string
+source: "pr"|"web"
+status: RunStatus
+payload: unknown
+payloadPath: string?
+artifactsDir: string?
+commentBody: string?
+commentRawBody: string?
+prNumber: number?
+prHeadRef: string?
+prHeadSha: string?
+prBaseRef: string?
+timestamp: string?
+startedAt: Date?
+finishedAt: Date?
+exitCode: number?
+userId: string?
+installationId: string?
+createdAt: Date
+updatedAt: Date
+start()
+succeed()
+fail(error)
+isTerminal()
+duration: number?
}
class RunStatus {
<<enum>>
pending
running
success
succeeded
failed
ignored
unknown
}
class RepoRef {
+fullName: string
+cloneUrl: string
+owner(): string
+name(): string
+getWorkspacePath(resolver)
}
class Instruction {
+body: string
+rawBody: string
+source: "pr"|"web"
+getPrompt()
+fromComment(raw, source)
}
Run --> RunStatus : uses
Run --> RepoRef : repo metadata
Run --> Instruction : instruction prompt
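To make the Run lifecycle in the class diagram concrete, here is a reduced sketch of the status transitions. It is not the code in service/src/domain: the class name `RunSketch`, the set of terminal statuses, the transition guards, and the `durationMs()` method (a stand-in for the diagram's `duration` property) are assumptions.

```typescript
type RunStatus =
  | "pending" | "running" | "success" | "succeeded"
  | "failed" | "ignored" | "unknown";

// Assumption: a run in one of these statuses no longer changes.
const TERMINAL_STATUSES: ReadonlyArray<RunStatus> = [
  "success", "succeeded", "failed", "ignored",
];

class RunSketch {
  status: RunStatus = "pending";
  startedAt: Date | undefined;
  finishedAt: Date | undefined;
  failureMessage: string | undefined;

  start(now: Date = new Date()): void {
    if (this.status !== "pending") {
      throw new Error("cannot start a run in status " + this.status);
    }
    this.status = "running";
    this.startedAt = now;
  }

  succeed(now: Date = new Date()): void {
    this.finish("succeeded", now);
  }

  fail(error: Error, now: Date = new Date()): void {
    this.failureMessage = error.message;
    this.finish("failed", now);
  }

  isTerminal(): boolean {
    return TERMINAL_STATUSES.indexOf(this.status) !== -1;
  }

  // Milliseconds between start and finish, or undefined while either is missing.
  durationMs(): number | undefined {
    if (!this.startedAt || !this.finishedAt) return undefined;
    return this.finishedAt.getTime() - this.startedAt.getTime();
  }

  private finish(status: RunStatus, now: Date): void {
    if (this.isTerminal()) {
      throw new Error("run already finished with status " + this.status);
    }
    this.status = status;
    this.finishedAt = now;
  }
}
```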
CLI contract: bin/gittinkerer (current)¶
- Entry: bin/gittinkerer sources modules: args, env, payload, artifacts, sentry, workspace, metrics, git, codex, reply/github, reply/web.
- Command:
  bin/gittinkerer run --payload-file <path> [--workspaces-dir <path>] [--artifacts-dir <path>] [--timestamp <value>] [--dry-run]
- Parsing: lib/args.sh enforces run + --payload-file (file must exist); bad usage exits 2.
- Environment: lib/env.sh loads .env and requires GITHUB_TOKEN (or GitHub App token) plus optional overrides for ARTIFACTS_DIR/WORKSPACES_DIR and bot identity.
- Flow: EXIT trap writes failure artifacts with failure_stage/failure_message; stages advance in order env → payload → artifacts → workspace sync → codex → git → reply.
- Dry run: skips push and reply but still writes artifacts and metrics.
- Exit codes: 0 success, 1 failure at any stage, 2 usage/payload validation.
Runner invocation (service → CLI)¶
- The HTTP service calls spawnGittinkerer with args: run --payload-file <path> and optional --workspaces-dir, --artifacts-dir, --timestamp, --dry-run.
- Payload files are created under serviceConfig.directories.artifacts via createPayloadFile; cleaned up by cleanupPayloadFile after spawn completes.
- The runner uses CLI_PATH and inherits service env; stdout/stderr are buffered, and duration is measured in ms for downstream analytics/logging.
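The argument list and exit codes described in the CLI contract can be sketched as two small pure functions. This is an illustration only: `RunnerOptions`, `buildRunnerArgs`, and `classifyExitCode` are hypothetical names, and the real spawnGittinkerer may assemble its arguments differently.

```typescript
// Hypothetical options object mirroring the documented CLI flags.
interface RunnerOptions {
  payloadFile: string;
  workspacesDir?: string;
  artifactsDir?: string;
  timestamp?: string;
  dryRun?: boolean;
}

// Builds the argv passed to bin/gittinkerer (everything after the binary path).
function buildRunnerArgs(opts: RunnerOptions): string[] {
  const args = ["run", "--payload-file", opts.payloadFile];
  if (opts.workspacesDir) args.push("--workspaces-dir", opts.workspacesDir);
  if (opts.artifactsDir) args.push("--artifacts-dir", opts.artifactsDir);
  if (opts.timestamp) args.push("--timestamp", opts.timestamp);
  if (opts.dryRun) args.push("--dry-run");
  return args;
}

type RunOutcome = "success" | "failure" | "usage_error" | "unknown";

// Maps the documented exit codes to an outcome label.
// A null code (for example, a process killed by a signal) maps to "unknown".
function classifyExitCode(code: number | null): RunOutcome {
  switch (code) {
    case 0:
      return "success";
    case 1:
      return "failure";
    case 2:
      return "usage_error";
    default:
      return "unknown";
  }
}
```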
Metrics collection flow¶
sequenceDiagram
autonumber
participant Codex as lib/codex.sh
participant Git as lib/git.sh
participant Metrics as lib/metrics.sh
participant Service as HTTP service
Codex->>Metrics: record_metric("actual_prompt_tokens"|"completion_seconds")
Git->>Metrics: record_metric("diff_loc")
Metrics->>Metrics: flush_metrics (requires RUNNER_METRICS_TOKEN & PAYLOAD_RUN_ID)
Metrics->>Service: POST /api/runs/:run_id/metrics (Bearer RUNNER_METRICS_TOKEN)
Service-->>Metrics: 2xx on success (logged, queue cleared)
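The flush step in the diagram above can be illustrated as a function that builds the HTTP request. The actual flush lives in lib/metrics.sh (bash); this TypeScript sketch only mirrors the documented endpoint, bearer token, and the rule that both RUNNER_METRICS_TOKEN and PAYLOAD_RUN_ID are required. The request body shape (`{ metrics: [...] }`), the skip-on-empty-queue rule, and all names are assumptions.

```typescript
interface MetricSample {
  name: string;
  value: number;
}

interface MetricsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Returns null when the flush must be skipped: missing run id, missing token,
// or (assumed) nothing queued.
function buildMetricsRequest(
  serviceUrl: string,
  runId: string | undefined,
  token: string | undefined,
  queue: MetricSample[],
): MetricsRequest | null {
  if (!runId || !token || queue.length === 0) return null;
  const base = serviceUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: base + "/api/runs/" + encodeURIComponent(runId) + "/metrics",
    method: "POST",
    headers: {
      Authorization: "Bearer " + token,
      "Content-Type": "application/json",
    },
    // Hypothetical body shape; the source does not specify it.
    body: JSON.stringify({ metrics: queue }),
  };
}
```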
Proposed payload schema (single repo)¶
The exact payload can evolve, but the following is the minimum shape GitTinkerer should accept.
web:
{
"run_id": "2025-12-23T12:34:56Z-<random>",
"source": "web",
"repo": {
"full_name": "OWNER/REPO",
"clone_url": "https://github.com/OWNER/REPO.git"
},
"comment_body": "Instruction text (after stripping /tinker)",
"comment_raw_body": "/tinker Instruction text (raw; artifact-only)",
"web": {
"web_conversation_id": "conv_01J...",
"user_id": "user_123"
},
"pr": {
"head_ref": "feature/branch-name"
}
}
pr:
{
"run_id": "2025-12-23T12:34:56Z-<random>",
"source": "pr",
"repo": {
"full_name": "OWNER/REPO",
"clone_url": "https://github.com/OWNER/REPO.git"
},
"comment_body": "Instruction text (after stripping /tinker)",
"comment_raw_body": "/tinker Instruction text (raw; artifact-only)",
"pr": {
"number": 123,
"comment_id": 999999,
"head_ref": "feature/branch-name",
"head_sha": "abcdef123456...",
"base_ref": "main"
},
"web": {
"web_conversation_id": "conv_01J..."
}
}
Rules:
- source is "pr" or "web".
- If source == "pr", pr MUST be present with number, comment_id, head_ref, head_sha, and base_ref.
- If source == "web", web.web_conversation_id (and web.user_id) MUST be present.
- If source == "web", pr is optional and may include only head_ref for branch selection.
- Target checkout/push ref is derived from PR metadata/ref (PR mode).
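The rules above translate fairly directly into a validation function. The sketch below follows the Rules list as written and returns a list of human-readable errors; the function name `validatePayload`, the error wording, and the choice to collect errors rather than throw are assumptions, and lib/payload.sh may validate differently.

```typescript
type JsonObject = Record<string, unknown>;

function isObject(value: unknown): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns an empty array when the payload satisfies the source-specific rules.
function validatePayload(payload: unknown): string[] {
  if (!isObject(payload)) return ["payload must be a JSON object"];

  const errors: string[] = [];
  const source = payload["source"];
  const pr = payload["pr"];
  const web = payload["web"];

  if (source !== "pr" && source !== "web") {
    return ['source must be "pr" or "web"'];
  }

  if (source === "pr") {
    if (!isObject(pr)) {
      errors.push("pr is required when source is pr");
    } else {
      for (const key of ["number", "comment_id", "head_ref", "head_sha", "base_ref"]) {
        if (pr[key] === undefined || pr[key] === null) {
          errors.push("pr." + key + " is required");
        }
      }
    }
    return errors;
  }

  // source === "web"
  if (!isObject(web)) {
    errors.push("web is required when source is web");
  } else {
    for (const key of ["web_conversation_id", "user_id"]) {
      if (typeof web[key] !== "string" || web[key] === "") {
        errors.push("web." + key + " is required");
      }
    }
  }
  if (pr !== undefined) {
    if (!isObject(pr)) {
      errors.push("pr must be an object when present");
    } else {
      const extra = Object.keys(pr).filter((k) => k !== "head_ref");
      if (extra.length > 0) {
        errors.push("pr may include only head_ref when source is web (found: " + extra.join(", ") + ")");
      }
    }
  }
  return errors;
}
```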
GitHub webhook notes (PR mode):
- Trigger source is the GitHub App issue_comment webhook.
- Only comments that begin with /tinker should trigger a run.
- /tinker supports multiline instructions; everything after the prefix is treated as the instruction body.
- comment_body should contain the stripped instruction text (no /tinker).
- comment_raw_body is optional but recommended, and is used for audit only (stored in pr_comment.txt).
- pr.comment_id is optional for this mode.
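The prefix gating described in these notes can be sketched as a small parser that produces comment_body and comment_raw_body. The helper name `parseTinkerComment` and two edge-case choices are assumptions not stated in the notes: the prefix must be followed by whitespace (so that "/tinkerbell" does not trigger), and an empty instruction does not trigger a run.

```typescript
interface ParsedInstruction {
  commentBody: string;    // instruction text with the /tinker prefix stripped
  commentRawBody: string; // original comment, kept for audit only
}

const TINKER_PREFIX = "/tinker";

// Returns null when the comment should not trigger a run.
function parseTinkerComment(raw: string): ParsedInstruction | null {
  // Only comments that begin with the prefix qualify.
  if (raw.slice(0, TINKER_PREFIX.length) !== TINKER_PREFIX) return null;

  const rest = raw.slice(TINKER_PREFIX.length);
  // Assumption: require whitespace (or end of comment) right after the prefix.
  if (rest.length > 0 && !/^\s/.test(rest)) return null;

  // Everything after the prefix, including later lines, is the instruction.
  const commentBody = rest.trim();
  // Assumption: a bare "/tinker" with no instruction is ignored.
  if (commentBody.length === 0) return null;

  return { commentBody, commentRawBody: raw };
}
```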
Authorization¶
- Owner-only restriction: Only repository owners can trigger /tinker commands.
- The service verifies that repository.owner.type === 'User' (organization-owned repos are not supported).
- The service verifies that comment.user.login === repository.owner.login.
- Unauthorized attempts are logged to console and result in a GitHub comment explaining the restriction.
- These checks occur before rate limiting or PR metadata fetching to minimize resource usage.
Reason: This prevents prompt injection attacks from untrusted collaborators and ensures only the repository owner controls AI-driven code changes.
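The two owner checks can be expressed as one pure function over the webhook event. The field paths (repository.owner.type, repository.owner.login, comment.user.login) come from the checks listed above; the function name `authorizeComment`, the result type, and the reason strings are hypothetical.

```typescript
// Only the fields the checks need; the real issue_comment event carries more.
interface CommentEvent {
  repository: { owner: { type: string; login: string } };
  comment: { user: { login: string } };
}

type AuthResult =
  | { allowed: true }
  | { allowed: false; reason: string };

function authorizeComment(event: CommentEvent): AuthResult {
  const owner = event.repository.owner;

  // Organization-owned repositories are not supported.
  if (owner.type !== "User") {
    return { allowed: false, reason: "organization-owned repositories are not supported" };
  }
  // Only the repository owner may trigger /tinker.
  if (event.comment.user.login !== owner.login) {
    return { allowed: false, reason: "only the repository owner can trigger /tinker" };
  }
  return { allowed: true };
}
```

The `reason` string is the kind of text that could be logged and posted back as the explanatory GitHub comment.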
Prompt contract (Codex)¶
Every prompt MUST include the instruction comment and the following post-change reporting requirement:
After making changes, write a short rationale explaining:
- What was changed
- Why it was changed
- Any assumptions made
Do not include internal reasoning or deliberation.
The agent response is:
- Persisted to artifacts
- Posted back to the origin channel (PR reply or web UI reply)
Artifacts contract¶
Each run writes to artifacts/<timestamp>/agent-run/ (or an overridden base). Required files:
agent-run/
├── prompt.txt
├── pr_comment.txt
├── diff.patch
├── files_changed.json
├── commit_sha.txt
└── summary.md
Success expectations:
- prompt.txt: exact prompt sent to Codex (from lib/codex.sh)
- pr_comment.txt: raw instruction text (if present in payload)
- diff.patch: patch for all changes applied (lib/git.sh)
- files_changed.json: JSON array of changed file paths
- commit_sha.txt: SHA of pushed commit (empty if no commit)
- summary.md: agent response/rationale (Codex output)
Failure expectations (explicit)¶
On ANY failure (including payload parsing, repo sync, Codex error, git push failure, or reply failure):
- The run directory artifacts/<timestamp>/agent-run/ MUST still be created.
- summary.md MUST exist and begin with a short, human-readable status section:
  - Status: failed
  - Failed stage: <stage-name> (e.g., payload, sync, codex, commit, push, reply)
  - A concise error message (no stack traces if they might leak sensitive paths/tokens)
- files_changed.json MUST exist:
  - [] if no changes were applied
- diff.patch MUST exist:
  - Empty file if no diff exists
- commit_sha.txt MUST exist:
  - Empty file if no commit was created or push did not succeed
- prompt.txt and pr_comment.txt MUST be written whenever the corresponding data was available at the time of failure.
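As an illustration of the failure contract, the sketch below renders the four files that must always exist as a map from file name to contents. The real writer is the bash function write_failure_artifacts; this TypeScript version, its name `renderFailureArtifacts`, and the choice to keep only the first line of the error message (one simple way to avoid leaking stack traces) are assumptions.

```typescript
// Keeps only the first line of an error message so stack traces are dropped.
function firstLine(message: string): string {
  const line = message.split("\n")[0];
  return (line ?? "").trim();
}

// Returns file name -> contents for the artifacts that MUST exist on failure.
function renderFailureArtifacts(
  stage: string,
  message: string,
  filesChanged: string[] = [],
  diff: string = "",
  commitSha: string = "",
): Record<string, string> {
  const summary = [
    "Status: failed",
    "Failed stage: " + stage,
    "",
    firstLine(message) || "unknown error",
    "",
  ].join("\n");

  return {
    "summary.md": summary,
    "files_changed.json": JSON.stringify(filesChanged), // "[]" when nothing changed
    "diff.patch": diff,                                  // empty when no diff exists
    "commit_sha.txt": commitSha,                         // empty when nothing was pushed
  };
}
```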
Web UI hosting (Traefik-friendly)¶
The web UI runs on the VPS and is intended to sit behind Traefik (configured elsewhere):
- Dev: npm run dev -- --host 0.0.0.0 --port 5173
- Production-like preview: npm run preview -- --host 0.0.0.0 --port 4173
Binding to 0.0.0.0 makes the server listen on all network interfaces rather than only on loopback, so Traefik can reach it from another host or container.
HTTP API (Node 24)¶
The web UI interacts with a lightweight HTTP service running on the VPS (implementation in this repository).
This service is the only interface the web UI needs:
- POST /api/runs
  - Accepts a JSON payload (single repo per run; see payload schema above)
  - Starts a non-interactive run by spawning bin/gittinkerer run --payload-file <temp>
  - Returns run_id immediately (and optional status metadata)
- GET /api/runs/:run_id
  - Returns run status and results
  - MUST include enough information for the UI to display the agent response (e.g., summary.md contents)
  - MUST include enough information for operators to locate run artifacts (e.g., the artifact path)
Configuration¶
See .env.example.
- GITHUB_TOKEN is required on the VPS and should be minted by a GitHub App with least-privilege scopes.
- The bot commit identity is:
  - nntin-bot
  - 48604375+nntin-bot@users.noreply.github.com
- The HTTP service now requires a reachable PostgreSQL instance during startup; configure DATABASE_URL or PGHOST/PGUSER/PGPASSWORD/PGDATABASE before launching.
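The PostgreSQL requirement can be checked before startup with a small resolver over the environment. This is a sketch under stated assumptions: the function name `resolveDbConfig` and the result types are hypothetical, and giving DATABASE_URL precedence over the PG* variables is a guess rather than documented behavior.

```typescript
type Env = Record<string, string | undefined>;

type DbConfig =
  | { connectionString: string }
  | { host: string; user: string; password: string; database: string };

// Throws a clear startup error when neither configuration style is complete.
function resolveDbConfig(env: Env): DbConfig {
  // Assumption: DATABASE_URL wins when both styles are present.
  const url = env["DATABASE_URL"];
  if (url) return { connectionString: url };

  const host = env["PGHOST"];
  const user = env["PGUSER"];
  const password = env["PGPASSWORD"];
  const database = env["PGDATABASE"];

  if (!host || !user || !password || !database) {
    const missing = ["PGHOST", "PGUSER", "PGPASSWORD", "PGDATABASE"].filter(
      (name) => !env[name],
    );
    throw new Error(
      "PostgreSQL is not configured: set DATABASE_URL or " + missing.join(", "),
    );
  }
  return { host, user, password, database };
}
```

In a Node service this would typically be called with `process.env` before any migration runs.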
Docker Compose¶
graph TD
%% ===== Host Machine =====
subgraph "Host"
SSH[~/.ssh<br/>Git Auth]
Codex["/usr/local/bin/codex<br/>AI CLI"]
RepoFS["Repo checkout<br/>(./ workspace)"]
end
%% ===== Docker Compose Stack =====
subgraph "Docker Compose"
subgraph "Services"
P[postgres:16-alpine<br/>DB: Runs/Metrics<br/>Host:5432]
R[redis:7-alpine<br/>Cache/Rate Limit<br/>Host:6379]
S[service Node.js<br/>Webhooks/API/Admin<br/>Host:3000<br/>Mounts: SSH,Codex,Repo]
W[web SvelteKit<br/>UI<br/>Host:5173]
end
%% User access
W -->|API Calls<br/>VITE_SERVICE_URL| S
%% Internal dependencies
S -->|Queries / Inserts| P
S -->|Cache / Limits| R
%% Host mounts
SSH -.->|/root/.ssh:ro| S
Codex -.->|/usr/local/bin/codex:ro| S
RepoFS -.->|./:/repo bind| S
%% Persistence
P -.->|pgdata| PGVol[pgdata Volume]
end
%% ===== Styling =====
style S fill:#f9f,stroke:#333,stroke-width:3px
style W fill:#bbf,stroke:#333
Redis 7.0+ is required.
Authentication¶
graph TD
%% ===== Host Machine =====
subgraph "Host"
SSH[~/.ssh<br/>Git Auth]
Codex["/usr/local/bin/codex<br/>AI CLI"]
RepoFS["Repo checkout<br/>(./ workspace)"]
end
%% ===== Docker Compose Stack =====
subgraph "Docker Compose"
subgraph "Services"
P[postgres:16-alpine<br/>DB: Runs/Metrics]
R[redis:7-alpine<br/>Cache/Rate Limit]
S[service Node.js<br/>API + Webhooks<br/>Internal:3000<br/>Mounts: SSH,Codex,Repo]
W[web SvelteKit<br/>UI<br/>Internal:5173]
end
%% Internal service communication (NO Traefik)
W -->|API Calls<br/>VITE_SERVICE_URL| S
%% Internal dependencies
S -->|Queries / Inserts| P
S -->|Cache / Limits| R
%% Host mounts
SSH -.->|/root/.ssh:ro| S
Codex -.->|/usr/local/bin/codex:ro| S
RepoFS -.->|./:/repo bind| S
%% Persistence
P -.->|pgdata| PGVol[pgdata Volume]
end
%% ===== External Systems =====
subgraph "External"
GH[GitHub<br/>Webhooks & API]
User[User Browser]
Internet[Public Internet]
end
%% ===== Edge Layer =====
subgraph "Ingress"
T[Traefik<br/>Reverse Proxy<br/>TLS Termination]
KC[Keycloak<br/>OIDC]
end
%% ===== Ingress Traffic =====
%% GitHub webhooks (no auth, signed payload)
GH -.->|Signed Webhook Events| Internet
Internet -->|HTTPS<br/>/api/github/webhook| T
T -->|Forward :3000| S
%% User access to UI (Keycloak-protected)
User -.->|HTTPS / WSS| Internet
Internet -->|HTTPS :5173| T
T -->|OIDC Auth| KC
KC -->|Token| T
T -->|Forward :5173<br/>with Keycloak Auth| W
%% ===== Styling =====
style T fill:#ffd,stroke:#333,stroke-width:3px
style KC fill:#dfd,stroke:#333
style S fill:#f9f,stroke:#333,stroke-width:3px
style W fill:#bbf,stroke:#333
style Internet fill:#eee,stroke:#666,stroke-dasharray: 5 5
Data aggregation¶
flowchart LR
subgraph Web
UI["Web UI<br/>(/, /runs, /runs/:id, /admin, /analytics)"]
end
subgraph Service
HTTP[HTTP API<br/>Node 24 service]
Metrics[Metrics & Analytics]
Sentry[Sentry SDK]
end
subgraph Data
PG[(PostgreSQL runs,<br/>run_metrics,<br/>paused_repos)]
Redis[(Redis rate<br/>limits & cache)]
end
UI -->|REST| HTTP
HTTP -->|read/write| PG
HTTP -->|cache rate limits<br/>run status| Redis
Metrics -->|aggregations| PG
HTTP -->|errors| Sentry